LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] bonding: fix arp_validate toggling in active-backup mode
@ 2019-05-10 21:57 Jarod Wilson
  2019-05-10 22:53 ` Jay Vosburgh
  2019-05-13 16:44 ` David Miller
  0 siblings, 2 replies; 7+ messages in thread
From: Jarod Wilson @ 2019-05-10 21:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jarod Wilson, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, netdev

There's currently a problem with toggling arp_validate on and off with an
active-backup bond. At the moment, you can start up a bond, like so:

modprobe bonding mode=1 arp_interval=100 arp_validate=0 arp_ip_targets=192.168.1.1
ip link set bond0 down
echo "ens4f0" > /sys/class/net/bond0/bonding/slaves
echo "ens4f1" > /sys/class/net/bond0/bonding/slaves
ip link set bond0 up
ip addr add 192.168.1.2/24 dev bond0

Pings to 192.168.1.1 work just fine. Now turn on arp_validate:

echo 1 > /sys/class/net/bond0/bonding/arp_validate

Pings to 192.168.1.1 continue to work just fine. Now when you go to turn
arp_validate off again, the link falls flat on it's face:

echo 0 > /sys/class/net/bond0/bonding/arp_validate
dmesg
...
[133191.911987] bond0: Setting arp_validate to none (0)
[133194.257793] bond0: bond_should_notify_peers: slave ens4f0
[133194.258031] bond0: link status definitely down for interface ens4f0, disabling it
[133194.259000] bond0: making interface ens4f1 the new active one
[133197.330130] bond0: link status definitely down for interface ens4f1, disabling it
[133197.331191] bond0: now running without any active interface!

The problem lies in bond_options.c, where passing in arp_validate=0
results in bond->recv_probe getting set to NULL. This flies directly in
the face of commit 3fe68df97c7f, which says we need to set recv_probe =
bond_arp_recv, even if we're not using arp_validate. Said commit fixed
this in bond_option_arp_interval_set, but missed that we can get to that
same state in bond_option_arp_validate_set as well.

One solution would be to universally set recv_probe = bond_arp_recv here
as well, but I don't think bond_option_arp_validate_set has any business
touching recv_probe at all, and that should be left to the arp_interval
code, so we can just make things much tidier here.

Fixes: 3fe68df97c7f ("bonding: always set recv_probe to bond_arp_rcv in arp monitor")
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
 drivers/net/bonding/bond_options.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index da1fc17295d9..b996967af8d9 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -1098,13 +1098,6 @@ static int bond_option_arp_validate_set(struct bonding *bond,
 {
 	netdev_dbg(bond->dev, "Setting arp_validate to %s (%llu)\n",
 		   newval->string, newval->value);
-
-	if (bond->dev->flags & IFF_UP) {
-		if (!newval->value)
-			bond->recv_probe = NULL;
-		else if (bond->params.arp_interval)
-			bond->recv_probe = bond_arp_rcv;
-	}
 	bond->params.arp_validate = newval->value;
 
 	return 0;
-- 
2.20.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: fix arp_validate toggling in active-backup mode
  2019-05-10 21:57 [PATCH] bonding: fix arp_validate toggling in active-backup mode Jarod Wilson
@ 2019-05-10 22:53 ` Jay Vosburgh
  2019-05-11  6:12   ` Jarod Wilson
  2019-05-13 16:44 ` David Miller
  1 sibling, 1 reply; 7+ messages in thread
From: Jay Vosburgh @ 2019-05-10 22:53 UTC (permalink / raw)
  To: Jarod Wilson
  Cc: linux-kernel, Veaceslav Falico, Andy Gospodarek, David S. Miller, netdev

Jarod Wilson <jarod@redhat.com> wrote:

>There's currently a problem with toggling arp_validate on and off with an
>active-backup bond. At the moment, you can start up a bond, like so:
>
>modprobe bonding mode=1 arp_interval=100 arp_validate=0 arp_ip_targets=192.168.1.1
>ip link set bond0 down
>echo "ens4f0" > /sys/class/net/bond0/bonding/slaves
>echo "ens4f1" > /sys/class/net/bond0/bonding/slaves
>ip link set bond0 up
>ip addr add 192.168.1.2/24 dev bond0
>
>Pings to 192.168.1.1 work just fine. Now turn on arp_validate:
>
>echo 1 > /sys/class/net/bond0/bonding/arp_validate
>
>Pings to 192.168.1.1 continue to work just fine. Now when you go to turn
>arp_validate off again, the link falls flat on it's face:
>
>echo 0 > /sys/class/net/bond0/bonding/arp_validate
>dmesg
>...
>[133191.911987] bond0: Setting arp_validate to none (0)
>[133194.257793] bond0: bond_should_notify_peers: slave ens4f0
>[133194.258031] bond0: link status definitely down for interface ens4f0, disabling it
>[133194.259000] bond0: making interface ens4f1 the new active one
>[133197.330130] bond0: link status definitely down for interface ens4f1, disabling it
>[133197.331191] bond0: now running without any active interface!
>
>The problem lies in bond_options.c, where passing in arp_validate=0
>results in bond->recv_probe getting set to NULL. This flies directly in
>the face of commit 3fe68df97c7f, which says we need to set recv_probe =
>bond_arp_recv, even if we're not using arp_validate. Said commit fixed
>this in bond_option_arp_interval_set, but missed that we can get to that
>same state in bond_option_arp_validate_set as well.
>
>One solution would be to universally set recv_probe = bond_arp_recv here
>as well, but I don't think bond_option_arp_validate_set has any business
>touching recv_probe at all, and that should be left to the arp_interval
>code, so we can just make things much tidier here.
>
>Fixes: 3fe68df97c7f ("bonding: always set recv_probe to bond_arp_rcv in arp monitor")

	Is the above Fixes: tag correct?  3fe68df97c7f is not the source
of the erroneous logic being removed, which was introduced by

commit 29c4948293bfc426e52a921f4259eb3676961e81
Author: sfeldma@cumulusnetworks.com <sfeldma@cumulusnetworks.com>
Date:   Thu Dec 12 14:10:38 2013 -0800

    bonding: add arp_validate netlink support

	Regardless of which Fixes: is correct, the patch itself looks
fine to me:

Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>

	-J


>CC: Jay Vosburgh <j.vosburgh@gmail.com>
>CC: Veaceslav Falico <vfalico@gmail.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>CC: "David S. Miller" <davem@davemloft.net>
>CC: netdev@vger.kernel.org
>Signed-off-by: Jarod Wilson <jarod@redhat.com>
>---
> drivers/net/bonding/bond_options.c | 7 -------
> 1 file changed, 7 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
>index da1fc17295d9..b996967af8d9 100644
>--- a/drivers/net/bonding/bond_options.c
>+++ b/drivers/net/bonding/bond_options.c
>@@ -1098,13 +1098,6 @@ static int bond_option_arp_validate_set(struct bonding *bond,
> {
> 	netdev_dbg(bond->dev, "Setting arp_validate to %s (%llu)\n",
> 		   newval->string, newval->value);
>-
>-	if (bond->dev->flags & IFF_UP) {
>-		if (!newval->value)
>-			bond->recv_probe = NULL;
>-		else if (bond->params.arp_interval)
>-			bond->recv_probe = bond_arp_rcv;
>-	}
> 	bond->params.arp_validate = newval->value;
> 
> 	return 0;
>-- 
>2.20.1
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: fix arp_validate toggling in active-backup mode
  2019-05-10 22:53 ` Jay Vosburgh
@ 2019-05-11  6:12   ` Jarod Wilson
  2019-05-13 16:43     ` Jay Vosburgh
  0 siblings, 1 reply; 7+ messages in thread
From: Jarod Wilson @ 2019-05-11  6:12 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: linux-kernel, Veaceslav Falico, Andy Gospodarek, David S. Miller, netdev

On 5/10/19 6:53 PM, Jay Vosburgh wrote:
> Jarod Wilson <jarod@redhat.com> wrote:
> 
>> There's currently a problem with toggling arp_validate on and off with an
>> active-backup bond. At the moment, you can start up a bond, like so:
>>
>> modprobe bonding mode=1 arp_interval=100 arp_validate=0 arp_ip_targets=192.168.1.1
>> ip link set bond0 down
>> echo "ens4f0" > /sys/class/net/bond0/bonding/slaves
>> echo "ens4f1" > /sys/class/net/bond0/bonding/slaves
>> ip link set bond0 up
>> ip addr add 192.168.1.2/24 dev bond0
>>
>> Pings to 192.168.1.1 work just fine. Now turn on arp_validate:
>>
>> echo 1 > /sys/class/net/bond0/bonding/arp_validate
>>
>> Pings to 192.168.1.1 continue to work just fine. Now when you go to turn
>> arp_validate off again, the link falls flat on it's face:
>>
>> echo 0 > /sys/class/net/bond0/bonding/arp_validate
>> dmesg
>> ...
>> [133191.911987] bond0: Setting arp_validate to none (0)
>> [133194.257793] bond0: bond_should_notify_peers: slave ens4f0
>> [133194.258031] bond0: link status definitely down for interface ens4f0, disabling it
>> [133194.259000] bond0: making interface ens4f1 the new active one
>> [133197.330130] bond0: link status definitely down for interface ens4f1, disabling it
>> [133197.331191] bond0: now running without any active interface!
>>
>> The problem lies in bond_options.c, where passing in arp_validate=0
>> results in bond->recv_probe getting set to NULL. This flies directly in
>> the face of commit 3fe68df97c7f, which says we need to set recv_probe =
>> bond_arp_recv, even if we're not using arp_validate. Said commit fixed
>> this in bond_option_arp_interval_set, but missed that we can get to that
>> same state in bond_option_arp_validate_set as well.
>>
>> One solution would be to universally set recv_probe = bond_arp_recv here
>> as well, but I don't think bond_option_arp_validate_set has any business
>> touching recv_probe at all, and that should be left to the arp_interval
>> code, so we can just make things much tidier here.
>>
>> Fixes: 3fe68df97c7f ("bonding: always set recv_probe to bond_arp_rcv in arp monitor")
> 
> 	Is the above Fixes: tag correct?  3fe68df97c7f is not the source
> of the erroneous logic being removed, which was introduced by
> 
> commit 29c4948293bfc426e52a921f4259eb3676961e81
> Author: sfeldma@cumulusnetworks.com <sfeldma@cumulusnetworks.com>
> Date:   Thu Dec 12 14:10:38 2013 -0800
> 
>      bonding: add arp_validate netlink support

I wasn't entirely sure that was the best choice for Fixes either, it was 
sort of more "Augments the Fix in", so I'd certainly have no objection 
to changing the Fixes tag to the earlier commit instead.

-- 
Jarod Wilson
jarod@redhat.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: fix arp_validate toggling in active-backup mode
  2019-05-11  6:12   ` Jarod Wilson
@ 2019-05-13 16:43     ` Jay Vosburgh
  2019-05-13 16:46       ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Jay Vosburgh @ 2019-05-13 16:43 UTC (permalink / raw)
  To: Jarod Wilson
  Cc: linux-kernel, Veaceslav Falico, Andy Gospodarek, David S. Miller, netdev

Jarod Wilson <jarod@redhat.com> wrote:

>On 5/10/19 6:53 PM, Jay Vosburgh wrote:
>> Jarod Wilson <jarod@redhat.com> wrote:
>>
>>> There's currently a problem with toggling arp_validate on and off with an
>>> active-backup bond. At the moment, you can start up a bond, like so:
>>>
>>> modprobe bonding mode=1 arp_interval=100 arp_validate=0 arp_ip_targets=192.168.1.1
>>> ip link set bond0 down
>>> echo "ens4f0" > /sys/class/net/bond0/bonding/slaves
>>> echo "ens4f1" > /sys/class/net/bond0/bonding/slaves
>>> ip link set bond0 up
>>> ip addr add 192.168.1.2/24 dev bond0
>>>
>>> Pings to 192.168.1.1 work just fine. Now turn on arp_validate:
>>>
>>> echo 1 > /sys/class/net/bond0/bonding/arp_validate
>>>
>>> Pings to 192.168.1.1 continue to work just fine. Now when you go to turn
>>> arp_validate off again, the link falls flat on it's face:
>>>
>>> echo 0 > /sys/class/net/bond0/bonding/arp_validate
>>> dmesg
>>> ...
>>> [133191.911987] bond0: Setting arp_validate to none (0)
>>> [133194.257793] bond0: bond_should_notify_peers: slave ens4f0
>>> [133194.258031] bond0: link status definitely down for interface ens4f0, disabling it
>>> [133194.259000] bond0: making interface ens4f1 the new active one
>>> [133197.330130] bond0: link status definitely down for interface ens4f1, disabling it
>>> [133197.331191] bond0: now running without any active interface!
>>>
>>> The problem lies in bond_options.c, where passing in arp_validate=0
>>> results in bond->recv_probe getting set to NULL. This flies directly in
>>> the face of commit 3fe68df97c7f, which says we need to set recv_probe =
>>> bond_arp_recv, even if we're not using arp_validate. Said commit fixed
>>> this in bond_option_arp_interval_set, but missed that we can get to that
>>> same state in bond_option_arp_validate_set as well.
>>>
>>> One solution would be to universally set recv_probe = bond_arp_recv here
>>> as well, but I don't think bond_option_arp_validate_set has any business
>>> touching recv_probe at all, and that should be left to the arp_interval
>>> code, so we can just make things much tidier here.
>>>
>>> Fixes: 3fe68df97c7f ("bonding: always set recv_probe to bond_arp_rcv in arp monitor")
>>
>> 	Is the above Fixes: tag correct?  3fe68df97c7f is not the source
>> of the erroneous logic being removed, which was introduced by
>>
>> commit 29c4948293bfc426e52a921f4259eb3676961e81
>> Author: sfeldma@cumulusnetworks.com <sfeldma@cumulusnetworks.com>
>> Date:   Thu Dec 12 14:10:38 2013 -0800
>>
>>      bonding: add arp_validate netlink support
>
>I wasn't entirely sure that was the best choice for Fixes either, it was
>sort of more "Augments the Fix in", so I'd certainly have no objection to
>changing the Fixes tag to the earlier commit instead.

	That would be my preference, as the 29c4948293bf commit looks to
be the change actually being fixed.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: fix arp_validate toggling in active-backup mode
  2019-05-10 21:57 [PATCH] bonding: fix arp_validate toggling in active-backup mode Jarod Wilson
  2019-05-10 22:53 ` Jay Vosburgh
@ 2019-05-13 16:44 ` David Miller
  1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2019-05-13 16:44 UTC (permalink / raw)
  To: jarod; +Cc: linux-kernel, j.vosburgh, vfalico, andy, netdev

From: Jarod Wilson <jarod@redhat.com>
Date: Fri, 10 May 2019 17:57:09 -0400

> There's currently a problem with toggling arp_validate on and off with an
> active-backup bond. At the moment, you can start up a bond, like so:
 ...
> The problem lies in bond_options.c, where passing in arp_validate=0
> results in bond->recv_probe getting set to NULL. This flies directly in
> the face of commit 3fe68df97c7f, which says we need to set recv_probe =
> bond_arp_recv, even if we're not using arp_validate. Said commit fixed
> this in bond_option_arp_interval_set, but missed that we can get to that
> same state in bond_option_arp_validate_set as well.
> 
> One solution would be to universally set recv_probe = bond_arp_recv here
> as well, but I don't think bond_option_arp_validate_set has any business
> touching recv_probe at all, and that should be left to the arp_interval
> code, so we can just make things much tidier here.
> 
> Fixes: 3fe68df97c7f ("bonding: always set recv_probe to bond_arp_rcv in arp monitor")
 ...
> Signed-off-by: Jarod Wilson <jarod@redhat.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: fix arp_validate toggling in active-backup mode
  2019-05-13 16:43     ` Jay Vosburgh
@ 2019-05-13 16:46       ` David Miller
  2019-05-13 17:10         ` Jay Vosburgh
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2019-05-13 16:46 UTC (permalink / raw)
  To: jay.vosburgh; +Cc: jarod, linux-kernel, vfalico, andy, netdev

From: Jay Vosburgh <jay.vosburgh@canonical.com>
Date: Mon, 13 May 2019 09:43:30 -0700

> 	That would be my preference, as the 29c4948293bf commit looks to
> be the change actually being fixed.

Sorry I pushed the original commit message out :-(

But isn't the Fixes: tag he choose the one where the logic actually
causes problems?  That's kind of my real criteria for Fixes: tags.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bonding: fix arp_validate toggling in active-backup mode
  2019-05-13 16:46       ` David Miller
@ 2019-05-13 17:10         ` Jay Vosburgh
  0 siblings, 0 replies; 7+ messages in thread
From: Jay Vosburgh @ 2019-05-13 17:10 UTC (permalink / raw)
  To: David Miller; +Cc: jarod, linux-kernel, vfalico, andy, netdev

David Miller <davem@davemloft.net> wrote:

>From: Jay Vosburgh <jay.vosburgh@canonical.com>
>Date: Mon, 13 May 2019 09:43:30 -0700
>
>> 	That would be my preference, as the 29c4948293bf commit looks to
>> be the change actually being fixed.
>
>Sorry I pushed the original commit message out :-(
>
>But isn't the Fixes: tag he choose the one where the logic actually
>causes problems?  That's kind of my real criteria for Fixes: tags.

	I don't think so.  It looks like the problem being fixed here
(clearing recv_probe when we shouldn't) was introduced at 29c4948293bf,
but was not the only place the same problem existed.  3fe68df97c7f fixed
the other occurrences of this problem, but missed the specific one added
by 29c4948293bf, which is now fixed by this patch.

	In any event, both of the commits in question are old enough
that it's kind of moot, as -stable will presumably get the right thing
regardless.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-05-13 17:10 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-10 21:57 [PATCH] bonding: fix arp_validate toggling in active-backup mode Jarod Wilson
2019-05-10 22:53 ` Jay Vosburgh
2019-05-11  6:12   ` Jarod Wilson
2019-05-13 16:43     ` Jay Vosburgh
2019-05-13 16:46       ` David Miller
2019-05-13 17:10         ` Jay Vosburgh
2019-05-13 16:44 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).