Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/3] Fix for KSZ DSA switch shutdown
@ 2021-09-09  9:53 Lino Sanfilippo
  2021-09-09  9:53 ` [PATCH 1/3] net: dsa: introduce function dsa_tree_shutdown() Lino Sanfilippo
                   ` (4 more replies)
  0 siblings, 5 replies; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09  9:53 UTC (permalink / raw)
  To: olteanv
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel,
	Lino Sanfilippo

This patch series fixes a system hang I got each time i tried to shutdown
or reboot a system that uses a KSZ9897 as a DSA switch with a broadcom
GENET network device as the DSA master device. At the time the system hangs
the message "unregister_netdevice: waiting for eth0 to become free. Usage
count = 2." is dumped periodically to the console.

After some investigation I found the reason to be unreleased references to
the master device which are still held by the slave devices at the time the
system is shut down (I have two slave devices in use).

While these references are supposed to be released in ksz_switch_remove()
this function never gets the chance to be called due to the system hang at
the master device deregistration which happens before ksz_switch_remove()
is called.

The fix is to make sure that the master device references are already
released when the device is unregistered. For this reason PATCH1 provides
a new function dsa_tree_shutdown() that can be called by DSA drivers to
untear the DSA switch at shutdown. PATCH2 uses this function in a new
helper function for KSZ switches to properly shutdown the KSZ switch.
PATCH 3 uses the new helper function in the KSZ9477 shutdown handler.

Theses patches have been tested on a Raspberry PI 5.10 kernel with a
KSZ9897. The patches have been adjusted to apply against net-next and are
compile tested with next-next.

Lino Sanfilippo (3):
  net: dsa: introduce function dsa_tree_shutdown()
  net: dsa: microchip: provide the function ksz_switch_shutdown()
  net: dsa: microchip: tear down DSA tree at system shutdown

 drivers/net/dsa/microchip/ksz9477.c    | 12 +++++++++++-
 drivers/net/dsa/microchip/ksz_common.c | 13 +++++++++++++
 drivers/net/dsa/microchip/ksz_common.h |  1 +
 include/net/dsa.h                      |  1 +
 net/dsa/dsa2.c                         |  8 ++++++++
 5 files changed, 34 insertions(+), 1 deletion(-)


base-commit: 626bf91a292e2035af5b9d9cce35c5c138dfe06d
-- 
2.33.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/3] net: dsa: introduce function dsa_tree_shutdown()
  2021-09-09  9:53 [PATCH 0/3] Fix for KSZ DSA switch shutdown Lino Sanfilippo
@ 2021-09-09  9:53 ` Lino Sanfilippo
  2021-09-09  9:53 ` [PATCH 2/3] net: dsa: microchip: provide the function ksz_switch_shutdown() Lino Sanfilippo
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09  9:53 UTC (permalink / raw)
  To: olteanv
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel,
	Lino Sanfilippo

Provide the function dsa_tree_shutdown() that can be called by DSA drivers
to tear down the DSA tree.
This is particularly useful for shutdown handlers to make sure that the DSA
tree is torn down (and thus all references to the master device are
released) before the master device is deregistered.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
---
 include/net/dsa.h | 1 +
 net/dsa/dsa2.c    | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index f9a17145255a..7d4094faaa3a 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -1039,6 +1039,7 @@ static inline int dsa_ndo_eth_ioctl(struct net_device *dev, struct ifreq *ifr,
 }
 #endif
 
+void dsa_tree_shutdown(struct dsa_switch_tree *dst);
 void dsa_unregister_switch(struct dsa_switch *ds);
 int dsa_register_switch(struct dsa_switch *ds);
 struct dsa_switch *dsa_switch_find(int tree_index, int sw_index);
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 1b2b25d7bd02..0588581f6531 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -1066,6 +1066,14 @@ static void dsa_tree_teardown(struct dsa_switch_tree *dst)
 	dst->setup = false;
 }
 
+void dsa_tree_shutdown(struct dsa_switch_tree *dst)
+{
+	mutex_lock(&dsa2_mutex);
+	dsa_tree_teardown(dst);
+	mutex_unlock(&dsa2_mutex);
+}
+EXPORT_SYMBOL_GPL(dsa_tree_shutdown);
+
 /* Since the dsa/tagging sysfs device attribute is per master, the assumption
  * is that all DSA switches within a tree share the same tagger, otherwise
  * they would have formed disjoint trees (different "dsa,member" values).
-- 
2.33.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 2/3] net: dsa: microchip: provide the function ksz_switch_shutdown()
  2021-09-09  9:53 [PATCH 0/3] Fix for KSZ DSA switch shutdown Lino Sanfilippo
  2021-09-09  9:53 ` [PATCH 1/3] net: dsa: introduce function dsa_tree_shutdown() Lino Sanfilippo
@ 2021-09-09  9:53 ` Lino Sanfilippo
  2021-09-09  9:53 ` [PATCH 3/3] net: dsa: microchip: tear down DSA tree at system shutdown Lino Sanfilippo
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09  9:53 UTC (permalink / raw)
  To: olteanv
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel,
	Lino Sanfilippo

Provide a function ksz_switch_shutdown() which properly shuts down the KSZ
switch by stopping the mib_read worker thread and then tearing down the DSA
tree.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
---
 drivers/net/dsa/microchip/ksz_common.c | 13 +++++++++++++
 drivers/net/dsa/microchip/ksz_common.h |  1 +
 2 files changed, 14 insertions(+)

diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 1542bfb8b5e5..aaa5c45f4823 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -446,6 +446,19 @@ int ksz_switch_register(struct ksz_device *dev,
 }
 EXPORT_SYMBOL(ksz_switch_register);
 
+void ksz_switch_shutdown(struct ksz_device *dev)
+{
+	struct dsa_switch *ds = dev->ds;
+
+	/* timer started */
+	if (dev->mib_read_interval) {
+		cancel_delayed_work_sync(&dev->mib_read);
+		dev->mib_read_interval = 0;
+	}
+	dsa_tree_shutdown(ds->dst);
+}
+EXPORT_SYMBOL(ksz_switch_shutdown);
+
 void ksz_switch_remove(struct ksz_device *dev)
 {
 	/* timer started */
diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
index 1597c63988b4..9986f6c4c1e7 100644
--- a/drivers/net/dsa/microchip/ksz_common.h
+++ b/drivers/net/dsa/microchip/ksz_common.h
@@ -143,6 +143,7 @@ struct ksz_device *ksz_switch_alloc(struct device *base, void *priv);
 int ksz_switch_register(struct ksz_device *dev,
 			const struct ksz_dev_ops *ops);
 void ksz_switch_remove(struct ksz_device *dev);
+void ksz_switch_shutdown(struct ksz_device *dev);
 
 int ksz8_switch_register(struct ksz_device *dev);
 int ksz9477_switch_register(struct ksz_device *dev);
-- 
2.33.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 3/3] net: dsa: microchip: tear down DSA tree at system shutdown
  2021-09-09  9:53 [PATCH 0/3] Fix for KSZ DSA switch shutdown Lino Sanfilippo
  2021-09-09  9:53 ` [PATCH 1/3] net: dsa: introduce function dsa_tree_shutdown() Lino Sanfilippo
  2021-09-09  9:53 ` [PATCH 2/3] net: dsa: microchip: provide the function ksz_switch_shutdown() Lino Sanfilippo
@ 2021-09-09  9:53 ` Lino Sanfilippo
  2021-09-09 10:14 ` [PATCH 0/3] Fix for KSZ DSA switch shutdown Vladimir Oltean
  2021-09-09 12:40 ` Andrew Lunn
  4 siblings, 0 replies; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09  9:53 UTC (permalink / raw)
  To: olteanv
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel,
	Lino Sanfilippo

Shutting down the system with registered KSZ9477 slave devices results in a
system hang. Additionally the message "unregister_netdevice: waiting for
ETH to become free. Usage count = X" is dumped into the kernel log (with
ETH being the DSA master device and X the number of registered slave
devices).

The reason for this issue are pending references to the DSA master device
which are still held by the slave devices at the time master device is
unregistered.

While these references are supposed to be released in ksz_switch_remove()
this function never gets the chance to be called due to the earlier system
hang at the master device deregistration.

Fix this deadlock situation by deregistering the switch (and thus releasing
the master device references) in the KSZ9477 shutdown handler. Since this
handler is called before the master device deregistration all references to
the master device are released at the time of deregistration and thus the
deadlock does not occur.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
---
 drivers/net/dsa/microchip/ksz9477.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c
index 854e25f43fa7..5db82898b737 100644
--- a/drivers/net/dsa/microchip/ksz9477.c
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -229,6 +229,16 @@ static int ksz9477_reset_switch(struct ksz_device *dev)
 	return 0;
 }
 
+static int ksz9477_shutdown(struct ksz_device *dev)
+{
+	int ret;
+
+	ret = ksz9477_reset_switch(dev);
+	ksz_switch_shutdown(dev);
+
+	return ret;
+}
+
 static void ksz9477_r_mib_cnt(struct ksz_device *dev, int port, u16 addr,
 			      u64 *cnt)
 {
@@ -1601,7 +1611,7 @@ static const struct ksz_dev_ops ksz9477_dev_ops = {
 	.r_mib_pkt = ksz9477_r_mib_pkt,
 	.freeze_mib = ksz9477_freeze_mib,
 	.port_init_cnt = ksz9477_port_init_cnt,
-	.shutdown = ksz9477_reset_switch,
+	.shutdown = ksz9477_shutdown,
 	.detect = ksz9477_switch_detect,
 	.init = ksz9477_switch_init,
 	.exit = ksz9477_switch_exit,
-- 
2.33.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09  9:53 [PATCH 0/3] Fix for KSZ DSA switch shutdown Lino Sanfilippo
                   ` (2 preceding siblings ...)
  2021-09-09  9:53 ` [PATCH 3/3] net: dsa: microchip: tear down DSA tree at system shutdown Lino Sanfilippo
@ 2021-09-09 10:14 ` Vladimir Oltean
  2021-09-09 11:08   ` Lino Sanfilippo
  2021-09-09 12:40 ` Andrew Lunn
  4 siblings, 1 reply; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-09 10:14 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

On Thu, Sep 09, 2021 at 11:53:21AM +0200, Lino Sanfilippo wrote:
> This patch series fixes a system hang I got each time i tried to shutdown
> or reboot a system that uses a KSZ9897 as a DSA switch with a broadcom
> GENET network device as the DSA master device. At the time the system hangs
> the message "unregister_netdevice: waiting for eth0 to become free. Usage
> count = 2." is dumped periodically to the console.
> 
> After some investigation I found the reason to be unreleased references to
> the master device which are still held by the slave devices at the time the
> system is shut down (I have two slave devices in use).
> 
> While these references are supposed to be released in ksz_switch_remove()
> this function never gets the chance to be called due to the system hang at
> the master device deregistration which happens before ksz_switch_remove()
> is called.
> 
> The fix is to make sure that the master device references are already
> released when the device is unregistered. For this reason PATCH1 provides
> a new function dsa_tree_shutdown() that can be called by DSA drivers to
> untear the DSA switch at shutdown. PATCH2 uses this function in a new
> helper function for KSZ switches to properly shutdown the KSZ switch.
> PATCH 3 uses the new helper function in the KSZ9477 shutdown handler.
> 
> Theses patches have been tested on a Raspberry PI 5.10 kernel with a
> KSZ9897. The patches have been adjusted to apply against net-next and are
> compile tested with next-next.

Can you try this patch

commit 07b90056cb15ff9877dca0d8f1b6583d1051f724
Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Tue Jan 12 01:09:43 2021 +0200

    net: dsa: unbind all switches from tree when DSA master unbinds

    Currently the following happens when a DSA master driver unbinds while
    there are DSA switches attached to it:

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 10:14 ` [PATCH 0/3] Fix for KSZ DSA switch shutdown Vladimir Oltean
@ 2021-09-09 11:08   ` Lino Sanfilippo
  2021-09-09 11:42     ` Vladimir Oltean
  0 siblings, 1 reply; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09 11:08 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel


Hi,

On 09.09.21 at 12:14, Vladimir Oltean wrote:
>
> Can you try this patch
>
> commit 07b90056cb15ff9877dca0d8f1b6583d1051f724
> Author: Vladimir Oltean <vladimir.oltean@nxp.com>
> Date:   Tue Jan 12 01:09:43 2021 +0200
>
>     net: dsa: unbind all switches from tree when DSA master unbinds
>
>     Currently the following happens when a DSA master driver unbinds while
>     there are DSA switches attached to it:
>

This patch is already part of the kernel which shows the described shutdown issues.

Regards,
Lino

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 11:08   ` Lino Sanfilippo
@ 2021-09-09 11:42     ` Vladimir Oltean
  2021-09-09 12:56       ` Vladimir Oltean
  0 siblings, 1 reply; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-09 11:42 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

On Thu, Sep 09, 2021 at 01:08:26PM +0200, Lino Sanfilippo wrote:
> Hi,
> 
> On 09.09.21 at 12:14, Vladimir Oltean wrote:
> >
> > Can you try this patch
> >
> > commit 07b90056cb15ff9877dca0d8f1b6583d1051f724
> > Author: Vladimir Oltean <vladimir.oltean@nxp.com>
> > Date:   Tue Jan 12 01:09:43 2021 +0200
> >
> >     net: dsa: unbind all switches from tree when DSA master unbinds
> >
> >     Currently the following happens when a DSA master driver unbinds while
> >     there are DSA switches attached to it:
> >
> 
> This patch is already part of the kernel which shows the described shutdown issues.

How can I reproduce this issue?

When I test with sysrq-o, I do see the various DSA trees getting torn
down:

[   16.731468] sysrq: HELP : loglevel(0-9) reboot(b) crash(c) show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w) dump-ftrace-buffer(z)
[   29.912535] sysrq: Power Off
[   29.917806] kvm: exiting hardware virtualization
[   29.988036] device swp0 left promiscuous mode
[   30.370424] sja1105 spi2.0: Link is Down
[   30.402790] DSA: tree 1 torn down
[   30.495096] device swp2 left promiscuous mode
[   31.011576] sja1105 spi2.1: Link is Down
[   31.032925] DSA: tree 2 torn down
[   31.074226] reboot: Power down

I feel that something is missing in your system. Is the device link
created? Is it deleted before going into effect on shutdown?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09  9:53 [PATCH 0/3] Fix for KSZ DSA switch shutdown Lino Sanfilippo
                   ` (3 preceding siblings ...)
  2021-09-09 10:14 ` [PATCH 0/3] Fix for KSZ DSA switch shutdown Vladimir Oltean
@ 2021-09-09 12:40 ` Andrew Lunn
  4 siblings, 0 replies; 38+ messages in thread
From: Andrew Lunn @ 2021-09-09 12:40 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: olteanv, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

On Thu, Sep 09, 2021 at 11:53:21AM +0200, Lino Sanfilippo wrote:
> This patch series fixes a system hang I got each time i tried to shutdown
> or reboot a system that uses a KSZ9897 as a DSA switch with a broadcom
> GENET network device as the DSA master device. At the time the system hangs
> the message "unregister_netdevice: waiting for eth0 to become free. Usage
> count = 2." is dumped periodically to the console.
> 
> After some investigation I found the reason to be unreleased references to
> the master device which are still held by the slave devices at the time the
> system is shut down (I have two slave devices in use).
> 
> While these references are supposed to be released in ksz_switch_remove()
> this function never gets the chance to be called due to the system hang at
> the master device deregistration which happens before ksz_switch_remove()
> is called.
> 
> The fix is to make sure that the master device references are already
> released when the device is unregistered. For this reason PATCH1 provides
> a new function dsa_tree_shutdown() that can be called by DSA drivers to
> untear the DSA switch at shutdown. PATCH2 uses this function in a new
> helper function for KSZ switches to properly shutdown the KSZ switch.
> PATCH 3 uses the new helper function in the KSZ9477 shutdown handler.

I agree with Vladimir here. Shutdown works without issue on mv88e6xxx,
i do it frequently. I'm sure other developers shutdown there devices
at the end of the edit/compile/test cycle. If there was a generic
problem, we would probably know about it. So it seems like there is
something specific to your system which breaks the reference
counting. We need to understand that first, then we can see how we fix
it.

> 
> Theses patches have been tested on a Raspberry PI 5.10 kernel with a
> KSZ9897. The patches have been adjusted to apply against net-next and are
> compile tested with next-next.

Is the switch on a hat? Are you using DT overlays?

   Andrew

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 11:42     ` Vladimir Oltean
@ 2021-09-09 12:56       ` Vladimir Oltean
  2021-09-09 13:19         ` Aw: " Lino Sanfilippo
  0 siblings, 1 reply; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-09 12:56 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

On Thu, Sep 09, 2021 at 02:42:48PM +0300, Vladimir Oltean wrote:
> I feel that something is missing in your system. Is the device link
> created? Is it deleted before going into effect on shutdown?

So in case my questions were confusing, you can check the presence of
the device links via sysfs.

On my board, eno2 is the top-level DSA master, there is a switch which
is PCIe PF 0000:00:00.5 which is its consumer:

ls -la /sys/class/net/eno2/device/consumer\:pci\:0000\:00\:00.5
lrwxrwxrwx    1 root     root             0 Jan  1 00:00 /sys/class/net/eno2/device/consumer:pci:0000:00:00.5 -> ../../../../../virtual/devlink/pci:0000:00:00.2--pci:0000:00:00.5

In turn, that switch is a DSA master on two ports for SPI-attached switches:

ls -la /sys/class/net/swp0/device/consumer\:spi\:spi2.*
lrwxrwxrwx    1 root     root             0 Jan  1 00:04 /sys/class/net/swp0/device/consumer:spi:spi2.0 -> ../../../../../virtual/devlink/pci:0000:00:00.5--spi:spi2.0
lrwxrwxrwx    1 root     root             0 Jan  1 00:04 /sys/class/net/swp0/device/consumer:spi:spi2.1 -> ../../../../../virtual/devlink/pci:0000:00:00.5--spi:spi2.1

Do you see similar things on your 5.10 kernel?

Please note that I don't think that particular patch with device links
was backported to v5.10, at least I don't see it when I run:

git tag --contains  07b90056cb15f

So how did it reach your tree?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Aw: Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 12:56       ` Vladimir Oltean
@ 2021-09-09 13:19         ` Lino Sanfilippo
  2021-09-09 14:29           ` Lino Sanfilippo
                             ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09 13:19 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

Hi Vladimir, Andrew,

sorry for the late response.

> Gesendet: Donnerstag, 09. September 2021 um 14:56 Uhr
> Von: "Vladimir Oltean" <olteanv@gmail.com>
> An: "Lino Sanfilippo" <LinoSanfilippo@gmx.de>
> Cc: p.rosenberger@kunbus.com, woojung.huh@microchip.com, UNGLinuxDriver@microchip.com, andrew@lunn.ch, vivien.didelot@gmail.com, f.fainelli@gmail.com, davem@davemloft.net, kuba@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
> Betreff: Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
>
> On Thu, Sep 09, 2021 at 02:42:48PM +0300, Vladimir Oltean wrote:
> > I feel that something is missing in your system. Is the device link
> > created? Is it deleted before going into effect on shutdown?
>
> So in case my questions were confusing, you can check the presence of
> the device links via sysfs.
>
> On my board, eno2 is the top-level DSA master, there is a switch which
> is PCIe PF 0000:00:00.5 which is its consumer:
>
> ls -la /sys/class/net/eno2/device/consumer\:pci\:0000\:00\:00.5
> lrwxrwxrwx    1 root     root             0 Jan  1 00:00 /sys/class/net/eno2/device/consumer:pci:0000:00:00.5 -> ../../../../../virtual/devlink/pci:0000:00:00.2--pci:0000:00:00.5
>
> In turn, that switch is a DSA master on two ports for SPI-attached switches:
>
> ls -la /sys/class/net/swp0/device/consumer\:spi\:spi2.*
> lrwxrwxrwx    1 root     root             0 Jan  1 00:04 /sys/class/net/swp0/device/consumer:spi:spi2.0 -> ../../../../../virtual/devlink/pci:0000:00:00.5--spi:spi2.0
> lrwxrwxrwx    1 root     root             0 Jan  1 00:04 /sys/class/net/swp0/device/consumer:spi:spi2.1 -> ../../../../../virtual/devlink/pci:0000:00:00.5--spi:spi2.1
>
> Do you see similar things on your 5.10 kernel?

For the master device is see

lrwxrwxrwx 1 root root 0 Sep  9 14:10 /sys/class/net/eth0/device/consumer:spi:spi3.0 -> ../../../virtual/devlink/platform:fd580000.ethernet--spi:spi3.0


>
> Please note that I don't think that particular patch with device links
> was backported to v5.10, at least I don't see it when I run:


> git tag --contains  07b90056cb15f
>
> So how did it reach your tree?
>

The kernel I use is the Raspberry Pi 5.10 kernel. The commit number in this kernel is d0b97c8cd63e37e6d4dc9fefd6381b09f6c31a67

Andrew: the switch is not on a hat, the device tree part I use is:



       spi@7e204c00 {
            cs-gpios = <0x0000000f 0x00000010 0x00000001 0x0000000f 0x00000012 0x00000001>;
            pinctrl-0 = <0x000000e5 0x000000e6>;
            pinctrl-names = "default";
            compatible = "brcm,bcm2835-spi";
            reg = <0x7e204c00 0x00000200>;
            interrupts = <0x00000000 0x00000076 0x00000004>;
            clocks = <0x00000007 0x00000014>;
            #address-cells = <0x00000001>;
            #size-cells = <0x00000000>;
            status = "okay";
            phandle = <0x000000be>;
            tpm@1 {
                phandle = <0x000000ed>;
                status = "okay";
                interrupts = <0x0000000a 0x00000008>;
                #interrupt-cells = <0x00000002>;
                interrupt-parent = <0x0000000f>;
                spi-max-frequency = <0x000f4240>;
                reg = <0x00000001>;
                pinctrl-0 = <0x000000e7>;
                pinctrl-names = "default";
                compatible = "infineon,slb9670";
            };
            ksz9897@0 {
                phandle = <0x000000ec>;
                status = "okay";
                reset-gpios = <0x000000e1 0x0000000d 0x00000001>;
                spi-cpol;
                spi-cpha;
                spi-max-frequency = <0x01f78a40>;
                reg = <0x00000000>;
                compatible = "microchip,ksz9897";
                ports {
                    #size-cells = <0x00000000>;
                    #address-cells = <0x00000001>;
                    port@2 {
                        label = "piright";
                        reg = <0x00000002>;
                    };
                    port@1 {
                        label = "pileft";
                        reg = <0x00000001>;
                    };
                    port@0 {
                        ethernet = <0x000000d7>;
                        label = "cpu";
                        reg = <0x00000000>;
                    };
                };
            };

Regards,
Lino


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 13:19         ` Aw: " Lino Sanfilippo
@ 2021-09-09 14:29           ` Lino Sanfilippo
  2021-09-09 15:17             ` Andrew Lunn
  2021-09-09 15:11           ` Andrew Lunn
  2021-09-09 15:47           ` Vladimir Oltean
  2 siblings, 1 reply; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09 14:29 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

On 09.09.21 at 15:19, Lino Sanfilippo wrote:

>>
>
> The kernel I use is the Raspberry Pi 5.10 kernel. The commit number in this kernel is d0b97c8cd63e37e6d4dc9fefd6381b09f6c31a67
>

This is not correct. The kernel I use right now is based on Gregs stable linux-5.10.y.
The commit number is correct here. Sorry for the confusion.

Regards,
Lino

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 13:19         ` Aw: " Lino Sanfilippo
  2021-09-09 14:29           ` Lino Sanfilippo
@ 2021-09-09 15:11           ` Andrew Lunn
  2021-09-09 16:46             ` Lino Sanfilippo
  2021-09-09 15:47           ` Vladimir Oltean
  2 siblings, 1 reply; 38+ messages in thread
From: Andrew Lunn @ 2021-09-09 15:11 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Vladimir Oltean, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

> Andrew: the switch is not on a hat, the device tree part I use is:

And this is not an overlay. It is all there at boot?

I was just thinking that maybe the Ethernet interface gets opened at
boot, and overlay is loaded, and the interface is opened a second
time. I don't know of anybody using DSA with overlays, so that could
of been the key difference which breaks it for you.

Your decompiled DT blob looks O.K.

    Andrew

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 14:29           ` Lino Sanfilippo
@ 2021-09-09 15:17             ` Andrew Lunn
  2021-09-09 16:41               ` Lino Sanfilippo
  0 siblings, 1 reply; 38+ messages in thread
From: Andrew Lunn @ 2021-09-09 15:17 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Vladimir Oltean, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

> This is not correct. The kernel I use right now is based on Gregs stable linux-5.10.y.
> The commit number is correct here. Sorry for the confusion.

Can you use 5.14.2?

When we understand the problem, the fixes will need to be for
net-next, which will be based on 5.15-rcX. They will then be
backported to 5.10. So you need to do some testing on a newer
kernel. Such testing will also help us figure out if it is a new
problem, or a backporting problem.

	 Andrew

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 13:19         ` Aw: " Lino Sanfilippo
  2021-09-09 14:29           ` Lino Sanfilippo
  2021-09-09 15:11           ` Andrew Lunn
@ 2021-09-09 15:47           ` Vladimir Oltean
  2021-09-09 16:00             ` Florian Fainelli
  2021-09-09 16:37             ` Lino Sanfilippo
  2 siblings, 2 replies; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-09 15:47 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

On Thu, Sep 09, 2021 at 03:19:52PM +0200, Lino Sanfilippo wrote:
> > Do you see similar things on your 5.10 kernel?
> 
> For the master device is see
> 
> lrwxrwxrwx 1 root root 0 Sep  9 14:10 /sys/class/net/eth0/device/consumer:spi:spi3.0 -> ../../../virtual/devlink/platform:fd580000.ethernet--spi:spi3.0

So this is the worst of the worst, we have a device link but it doesn't help.

Where the device link helps is here:

__device_release_driver
	while (device_links_busy(dev))
		device_links_unbind_consumers(dev);

but during dev_shutdown, device_links_unbind_consumers does not get called
(actually I am not even sure whether it should).

I've reproduced your issue by making this very simple change:

diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
index 60d94e0a07d6..ec00f34cac47 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
@@ -1372,6 +1372,7 @@ static struct pci_driver enetc_pf_driver = {
 	.id_table = enetc_pf_id_table,
 	.probe = enetc_pf_probe,
 	.remove = enetc_pf_remove,
+	.shutdown = enetc_pf_remove,
 #ifdef CONFIG_PCI_IOV
 	.sriov_configure = enetc_sriov_configure,
 #endif

on my DSA master driver. This is what the genet driver has "special".

I was led into grave error by Documentation/driver-api/device_link.rst,
which I've based my patch on, where it clearly says that device links
are supposed to help with shutdown ordering (how?!).

So the question is, why did my DSA trees get torn down on shutdown?
Basically the short answer is that my SPI controller driver does
implement .shutdown, and calls the same code path as the .remove code,
which calls spi_unregister_controller which removes all SPI children..

When I added this device link, one of the main objectives was to not
modify all DSA drivers. I was certain based on the documentation that
device links would help, now I'm not so sure anymore.

So what happens is that the DSA master attempts to unregister its net
device on .shutdown, but DSA does not implement .shutdown, so it just
sits there holding a reference (supposedly via dev_hold, but where from?!)
to the master, which makes netdev_wait_allrefs to wait and wait.

I need more time for the denial phase to pass, and to understand what
can actually be done. I will also be away from the keyboard for the next
few days, so it might take a while. Your patches obviously offer a
solution only for KSZ switches, we need something more general. If I
understand your solution, it works not by virtue of there being any
shutdown ordering guarantee at all, but simply due to the fact that
DSA's .shutdown hook gets called eventually, and the reference to the
master gets freed eventually, which unblocks the unregister_netdevice
call from the master. I don't yet understand why DSA holds a long-term
reference to the master, that's one thing I need to figure out.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 15:47           ` Vladimir Oltean
@ 2021-09-09 16:00             ` Florian Fainelli
  2021-09-10  1:32               ` Saravana Kannan
  2021-09-09 16:37             ` Lino Sanfilippo
  1 sibling, 1 reply; 38+ messages in thread
From: Florian Fainelli @ 2021-09-09 16:00 UTC (permalink / raw)
  To: Vladimir Oltean, Lino Sanfilippo, Saravana Kannan
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, davem, kuba, netdev, linux-kernel

+Saravana,

On 9/9/2021 8:47 AM, Vladimir Oltean wrote:
> On Thu, Sep 09, 2021 at 03:19:52PM +0200, Lino Sanfilippo wrote:
>>> Do you see similar things on your 5.10 kernel?
>>
>> For the master device is see
>>
>> lrwxrwxrwx 1 root root 0 Sep  9 14:10 /sys/class/net/eth0/device/consumer:spi:spi3.0 -> ../../../virtual/devlink/platform:fd580000.ethernet--spi:spi3.0
> 
> So this is the worst of the worst, we have a device link but it doesn't help.
> 
> Where the device link helps is here:
> 
> __device_release_driver
> 	while (device_links_busy(dev))
> 		device_links_unbind_consumers(dev);
> 
> but during dev_shutdown, device_links_unbind_consumers does not get called
> (actually I am not even sure whether it should).
> 
> I've reproduced your issue by making this very simple change:
> 
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
> index 60d94e0a07d6..ec00f34cac47 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
> @@ -1372,6 +1372,7 @@ static struct pci_driver enetc_pf_driver = {
>   	.id_table = enetc_pf_id_table,
>   	.probe = enetc_pf_probe,
>   	.remove = enetc_pf_remove,
> +	.shutdown = enetc_pf_remove,
>   #ifdef CONFIG_PCI_IOV
>   	.sriov_configure = enetc_sriov_configure,
>   #endif
> 
> on my DSA master driver. This is what the genet driver has "special".
> 
> I was led into grave error by Documentation/driver-api/device_link.rst,
> which I've based my patch on, where it clearly says that device links
> are supposed to help with shutdown ordering (how?!).

I was also under the impression that device links were supposed to help 
with shutdown ordering, because it does matter a lot. One thing that I 
had to work before (and seems like it came back recently) is the 
shutdown ordering between gpio_keys.c and the GPIO controller. If you 
suspend the GPIO controller first, gpio_keys.c never gets a chance to 
keep the GPIO pin configured for a wake-up interrupt, therefore no 
wake-up event happens on key presses, whoops.

> 
> So the question is, why did my DSA trees get torn down on shutdown?
> Basically the short answer is that my SPI controller driver does
> implement .shutdown, and calls the same code path as the .remove code,
> which calls spi_unregister_controller which removes all SPI children..
> 
> When I added this device link, one of the main objectives was to not
> modify all DSA drivers. I was certain based on the documentation that
> device links would help, now I'm not so sure anymore.
> 
> So what happens is that the DSA master attempts to unregister its net
> device on .shutdown, but DSA does not implement .shutdown, so it just
> sits there holding a reference (supposedly via dev_hold, but where from?!)
> to the master, which makes netdev_wait_allrefs to wait and wait.

It's not coming from of_find_net_device_by_node() that's for sure and 
with OF we don't go through the code path calling 
dsa_dev_to_net_device() which does call dev_hold() and then shortly 
thereafter the caller calls dev_put() anyway.

> 
> I need more time for the denial phase to pass, and to understand what
> can actually be done. I will also be away from the keyboard for the next
> few days, so it might take a while. Your patches obviously offer a
> solution only for KSZ switches, we need something more general. If I
> understand your solution, it works not by virtue of there being any
> shutdown ordering guarantee at all, but simply due to the fact that
> DSA's .shutdown hook gets called eventually, and the reference to the
> master gets freed eventually, which unblocks the unregister_netdevice
> call from the master. I don't yet understand why DSA holds a long-term
> reference to the master, that's one thing I need to figure out.
> 

Agreed.
-- 
Florian

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 15:47           ` Vladimir Oltean
  2021-09-09 16:00             ` Florian Fainelli
@ 2021-09-09 16:37             ` Lino Sanfilippo
  2021-09-09 16:44               ` Florian Fainelli
  1 sibling, 1 reply; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09 16:37 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

On 09.09.21 at 17:47, Vladimir Oltean wrote:
> On Thu, Sep 09, 2021 at 03:19:52PM +0200, Lino Sanfilippo wrote:
>>> Do you see similar things on your 5.10 kernel?
>>
>> For the master device is see
>>
>> lrwxrwxrwx 1 root root 0 Sep  9 14:10 /sys/class/net/eth0/device/consumer:spi:spi3.0 -> ../../../virtual/devlink/platform:fd580000.ethernet--spi:spi3.0
>
> So this is the worst of the worst, we have a device link but it doesn't help.
>
> Where the device link helps is here:
>
> __device_release_driver
> 	while (device_links_busy(dev))
> 		device_links_unbind_consumers(dev);
>
> but during dev_shutdown, device_links_unbind_consumers does not get called
> (actually I am not even sure whether it should).
>
> I've reproduced your issue by making this very simple change:
>
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
> index 60d94e0a07d6..ec00f34cac47 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
> @@ -1372,6 +1372,7 @@ static struct pci_driver enetc_pf_driver = {
>  	.id_table = enetc_pf_id_table,
>  	.probe = enetc_pf_probe,
>  	.remove = enetc_pf_remove,
> +	.shutdown = enetc_pf_remove,
>  #ifdef CONFIG_PCI_IOV
>  	.sriov_configure = enetc_sriov_configure,
>  #endif
>
> on my DSA master driver. This is what the genet driver has "special".
>

Ah, that is interesting.

> I was led into grave error by Documentation/driver-api/device_link.rst,
> which I've based my patch on, where it clearly says that device links
> are supposed to help with shutdown ordering (how?!).
>
> So the question is, why did my DSA trees get torn down on shutdown?
> Basically the short answer is that my SPI controller driver does
> implement .shutdown, and calls the same code path as the .remove code,
> which calls spi_unregister_controller which removes all SPI children..
>
> When I added this device link, one of the main objectives was to not
> modify all DSA drivers. I was certain based on the documentation that
> device links would help, now I'm not so sure anymore.
>
> So what happens is that the DSA master attempts to unregister its net
> device on .shutdown, but DSA does not implement .shutdown, so it just
> sits there holding a reference (supposedly via dev_hold, but where from?!)
> to the master, which makes netdev_wait_allrefs to wait and wait.
>

Right, that was also my conclusion.

> I need more time for the denial phase to pass, and to understand what
> can actually be done. I will also be away from the keyboard for the next
> few days, so it might take a while. Your patches obviously offer a
> solution only for KSZ switches, we need something more general. If I
> understand your solution, it works not by virtue of there being any
> shutdown ordering guarantee at all, but simply due to the fact that
> DSA's .shutdown hook gets called eventually, and the reference to the
> master gets freed eventually, which unblocks the unregister_netdevice
> call from the master.

Well actually the SPI shutdown hook gets called which then calls ksz9477_shutdown
(formerly ksz9477_reset_switch) which then shuts down the switch by
stopping the worker thread and tearing down the DSA tree (via dsa_tree_shutdown()).

While it is right that the patch series only fixes the KSZ case for now, the idea was that
other drivers could use a similar approach in by calling the new function dsa_tree_shutdown()
in their shutdown handler to make sure that all refs to the master device are released.


Regards,
Lino

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 15:17             ` Andrew Lunn
@ 2021-09-09 16:41               ` Lino Sanfilippo
  0 siblings, 0 replies; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09 16:41 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Vladimir Oltean, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

On 09.09.21 at 17:17, Andrew Lunn wrote:
>> This is not correct. The kernel I use right now is based on Gregs stable linux-5.10.y.
>> The commit number is correct here. Sorry for the confusion.
>
> Can you use 5.14.2?
>
> When we understand the problem, the fixes will need to be for
> net-next, which will be based on 5.15-rcX. They will then be
> backported to 5.10. So you need to do some testing on a newer
> kernel. Such testing will also help us figure out if it is a new
> problem, or a backporting problem.
>
> 	 Andrew
>


You are right, I will try to switch to a newer kernel to test future DSA issue like that
(I have the impression that DSA is an area in which a lot of progress is done from
one kernel version to the next).

Regards,
Lino

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 16:37             ` Lino Sanfilippo
@ 2021-09-09 16:44               ` Florian Fainelli
  2021-09-09 17:07                 ` Lino Sanfilippo
  0 siblings, 1 reply; 38+ messages in thread
From: Florian Fainelli @ 2021-09-09 16:44 UTC (permalink / raw)
  To: Lino Sanfilippo, Vladimir Oltean
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, davem, kuba, netdev, linux-kernel



On 9/9/2021 9:37 AM, Lino Sanfilippo wrote:
> On 09.09.21 at 17:47, Vladimir Oltean wrote:
>> On Thu, Sep 09, 2021 at 03:19:52PM +0200, Lino Sanfilippo wrote:
>>>> Do you see similar things on your 5.10 kernel?
>>>
>>> For the master device is see
>>>
>>> lrwxrwxrwx 1 root root 0 Sep  9 14:10 /sys/class/net/eth0/device/consumer:spi:spi3.0 -> ../../../virtual/devlink/platform:fd580000.ethernet--spi:spi3.0
>>
>> So this is the worst of the worst, we have a device link but it doesn't help.
>>
>> Where the device link helps is here:
>>
>> __device_release_driver
>> 	while (device_links_busy(dev))
>> 		device_links_unbind_consumers(dev);
>>
>> but during dev_shutdown, device_links_unbind_consumers does not get called
>> (actually I am not even sure whether it should).
>>
>> I've reproduced your issue by making this very simple change:
>>
>> diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
>> index 60d94e0a07d6..ec00f34cac47 100644
>> --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
>> +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
>> @@ -1372,6 +1372,7 @@ static struct pci_driver enetc_pf_driver = {
>>   	.id_table = enetc_pf_id_table,
>>   	.probe = enetc_pf_probe,
>>   	.remove = enetc_pf_remove,
>> +	.shutdown = enetc_pf_remove,
>>   #ifdef CONFIG_PCI_IOV
>>   	.sriov_configure = enetc_sriov_configure,
>>   #endif
>>
>> on my DSA master driver. This is what the genet driver has "special".
>>
> 
> Ah, that is interesting.
> 
>> I was led into grave error by Documentation/driver-api/device_link.rst,
>> which I've based my patch on, where it clearly says that device links
>> are supposed to help with shutdown ordering (how?!).
>>
>> So the question is, why did my DSA trees get torn down on shutdown?
>> Basically the short answer is that my SPI controller driver does
>> implement .shutdown, and calls the same code path as the .remove code,
>> which calls spi_unregister_controller which removes all SPI children..
>>
>> When I added this device link, one of the main objectives was to not
>> modify all DSA drivers. I was certain based on the documentation that
>> device links would help, now I'm not so sure anymore.
>>
>> So what happens is that the DSA master attempts to unregister its net
>> device on .shutdown, but DSA does not implement .shutdown, so it just
>> sits there holding a reference (supposedly via dev_hold, but where from?!)
>> to the master, which makes netdev_wait_allrefs to wait and wait.
>>
> 
> Right, that was also my conclusion.
> 
>> I need more time for the denial phase to pass, and to understand what
>> can actually be done. I will also be away from the keyboard for the next
>> few days, so it might take a while. Your patches obviously offer a
>> solution only for KSZ switches, we need something more general. If I
>> understand your solution, it works not by virtue of there being any
>> shutdown ordering guarantee at all, but simply due to the fact that
>> DSA's .shutdown hook gets called eventually, and the reference to the
>> master gets freed eventually, which unblocks the unregister_netdevice
>> call from the master.
> 
> Well actually the SPI shutdown hook gets called which then calls ksz9477_shutdown
> (formerly ksz9477_reset_switch) which then shuts down the switch by
> stopping the worker thread and tearing down the DSA tree (via dsa_tree_shutdown()).
> 
> While it is right that the patch series only fixes the KSZ case for now, the idea was that
> other drivers could use a similar approach in by calling the new function dsa_tree_shutdown()
> in their shutdown handler to make sure that all refs to the master device are released.

It does not scale really well to have individual drivers call 
dsa_tree_shutdown() in their respective .shutdown callback, and in a 
multi-switch configuration, I am not sure what the results would look like.

In premise, each driver ought to be able to call 
dsa_unregister_switch(), along with all of the driver specific shutdown 
and eventually, given proper device ordering the DSA tree would get 
automatically torn down, and then the DSA master's .shutdown() callback 
would be called.

FWIW, the reason why we call .shutdown() in bcmgenet is to turn off DMA 
and clocks, which matters for kexec (DMA) as well as power savings (S5 
mode).
-- 
Florian

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 15:11           ` Andrew Lunn
@ 2021-09-09 16:46             ` Lino Sanfilippo
  2021-09-09 17:55               ` Andrew Lunn
  0 siblings, 1 reply; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09 16:46 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Vladimir Oltean, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

On 09.09.21 at 17:11, Andrew Lunn wrote:
>> Andrew: the switch is not on a hat, the device tree part I use is:
>
> And this is not an overlay. It is all there at boot?
>

Well actually we DO use an overlay. The dev tree snipped I posted was an excerpt form
fdtdump. The concerning fragment looks like this in the overlay file:


        fragment@4 {
                target = <&spi6>;
                __overlay__ {
                        #address-cells = <1>;
                        #size-cells = <0>;
                        pinctrl-names = "default";
                        pinctrl-0 = <&spi6_pins>, <&spi6_cs_pins>;
                        cs-gpios = <&gpio 16 GPIO_ACTIVE_LOW>,
                                   <&gpio 18 GPIO_ACTIVE_LOW>;
                        status = "okay";

                        ksz9897: ksz9897@0 {
                                compatible = "microchip,ksz9897";
                                reg = <0>;
                                spi-max-frequency = <50000000>;
                                spi-cpha;
                                spi-cpol;
                                reset-gpios = <&expander_core 13 GPIO_ACTIVE_LOW>;
                                status = "okay";

                                ports {
                                        #address-cells = <1>;
                                        #size-cells = <0>;
                                        /* PORT 1 */
                                        port@0 {
                                                reg = <0>;
                                                label = "cpu";
                                                ethernet = <&genet>;
                                        };
                                        /* PORT 2 */
                                        port@1 {
                                                reg = <1>;
                                                label = "pileft";
                                        };
                                        /* PORT 3 */
                                        port@2 {
                                                reg = <2>;
                                                label = "piright";
                                        };
                                        /*
                                         * PORT 4-7 unused
                                         */
                                };
                        };

                        tpm: tpm@1 {
                                compatible = "infineon,slb9670";
                                pinctrl-names = "default";
                                pinctrl-0 = <&tpm_pins>;
                                reg = <1>;
                                spi-max-frequency = <1000000>;
                                interrupt-parent = <&gpio>;
                                #interrupt-cells = <2>;
                                interrupts = <10 IRQ_TYPE_LEVEL_LOW>;
                                status = "okay";
                        };
                };
        };


But probably this does not matter any more now
that Vladimir was able to reproduce the issue.

> I was just thinking that maybe the Ethernet interface gets opened at
> boot, and overlay is loaded, and the interface is opened a second
> time. I don't know of anybody using DSA with overlays, so that could
> of been the key difference which breaks it for you.
>
> Your decompiled DT blob looks O.K.
>
>     Andrew
>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 16:44               ` Florian Fainelli
@ 2021-09-09 17:07                 ` Lino Sanfilippo
  2021-09-09 22:54                   ` Vladimir Oltean
  0 siblings, 1 reply; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-09 17:07 UTC (permalink / raw)
  To: Florian Fainelli, Vladimir Oltean
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, davem, kuba, netdev, linux-kernel

On 09.09.21 at 18:44, Florian Fainelli wrote:
>
>
> On 9/9/2021 9:37 AM, Lino Sanfilippo wrote:
>> On 09.09.21 at 17:47, Vladimir Oltean wrote:
>>> On Thu, Sep 09, 2021 at 03:19:52PM +0200, Lino Sanfilippo wrote:
>>>>> Do you see similar things on your 5.10 kernel?
>>>>
>>>> For the master device is see
>>>>
>>>> lrwxrwxrwx 1 root root 0 Sep  9 14:10 /sys/class/net/eth0/device/consumer:spi:spi3.0 -> ../../../virtual/devlink/platform:fd580000.ethernet--spi:spi3.0
>>>
>>> So this is the worst of the worst, we have a device link but it doesn't help.
>>>
>>> Where the device link helps is here:
>>>
>>> __device_release_driver
>>>     while (device_links_busy(dev))
>>>         device_links_unbind_consumers(dev);
>>>
>>> but during dev_shutdown, device_links_unbind_consumers does not get called
>>> (actually I am not even sure whether it should).
>>>
>>> I've reproduced your issue by making this very simple change:
>>>
>>> diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
>>> index 60d94e0a07d6..ec00f34cac47 100644
>>> --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
>>> +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
>>> @@ -1372,6 +1372,7 @@ static struct pci_driver enetc_pf_driver = {
>>>       .id_table = enetc_pf_id_table,
>>>       .probe = enetc_pf_probe,
>>>       .remove = enetc_pf_remove,
>>> +    .shutdown = enetc_pf_remove,
>>>   #ifdef CONFIG_PCI_IOV
>>>       .sriov_configure = enetc_sriov_configure,
>>>   #endif
>>>
>>> on my DSA master driver. This is what the genet driver has "special".
>>>
>>
>> Ah, that is interesting.
>>
>>> I was led into grave error by Documentation/driver-api/device_link.rst,
>>> which I've based my patch on, where it clearly says that device links
>>> are supposed to help with shutdown ordering (how?!).
>>>
>>> So the question is, why did my DSA trees get torn down on shutdown?
>>> Basically the short answer is that my SPI controller driver does
>>> implement .shutdown, and calls the same code path as the .remove code,
>>> which calls spi_unregister_controller which removes all SPI children..
>>>
>>> When I added this device link, one of the main objectives was to not
>>> modify all DSA drivers. I was certain based on the documentation that
>>> device links would help, now I'm not so sure anymore.
>>>
>>> So what happens is that the DSA master attempts to unregister its net
>>> device on .shutdown, but DSA does not implement .shutdown, so it just
>>> sits there holding a reference (supposedly via dev_hold, but where from?!)
>>> to the master, which makes netdev_wait_allrefs to wait and wait.
>>>
>>
>> Right, that was also my conclusion.
>>
>>> I need more time for the denial phase to pass, and to understand what
>>> can actually be done. I will also be away from the keyboard for the next
>>> few days, so it might take a while. Your patches obviously offer a
>>> solution only for KSZ switches, we need something more general. If I
>>> understand your solution, it works not by virtue of there being any
>>> shutdown ordering guarantee at all, but simply due to the fact that
>>> DSA's .shutdown hook gets called eventually, and the reference to the
>>> master gets freed eventually, which unblocks the unregister_netdevice
>>> call from the master.
>>
>> Well actually the SPI shutdown hook gets called which then calls ksz9477_shutdown
>> (formerly ksz9477_reset_switch) which then shuts down the switch by
>> stopping the worker thread and tearing down the DSA tree (via dsa_tree_shutdown()).
>>
>> While it is right that the patch series only fixes the KSZ case for now, the idea was that
>> other drivers could use a similar approach in by calling the new function dsa_tree_shutdown()
>> in their shutdown handler to make sure that all refs to the master device are released.
> It does not scale really well to have individual drivers call dsa_tree_shutdown() in their respective .shutdown callback, and in a multi-switch configuration, I am not sure what the results would look like.
>
> In premise, each driver ought to be able to call dsa_unregister_switch(), along with all of the driver specific shutdown and eventually, given proper device ordering the DSA tree would get automatically torn down, and then the DSA master's .shutdown() callback would be called.
>
> FWIW, the reason why we call .shutdown() in bcmgenet is to turn off DMA and clocks, which matters for kexec (DMA) as well as power savings (S5 mode).

I agree with the scalability. Concerning the multi-switch case I dont know about the possible issues (I am quite new to working with DSA).
So lets wait for Vladimirs solution.

Regards,
Lino

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 16:46             ` Lino Sanfilippo
@ 2021-09-09 17:55               ` Andrew Lunn
  0 siblings, 0 replies; 38+ messages in thread
From: Andrew Lunn @ 2021-09-09 17:55 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Vladimir Oltean, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, f.fainelli, davem, kuba, netdev, linux-kernel

On Thu, Sep 09, 2021 at 06:46:49PM +0200, Lino Sanfilippo wrote:
> On 09.09.21 at 17:11, Andrew Lunn wrote:
> >> Andrew: the switch is not on a hat, the device tree part I use is:
> >
> > And this is not an overlay. It is all there at boot?
> >
> 
> Well actually we DO use an overlay. The dev tree snipped I posted was an excerpt form
> fdtdump. The concerning fragment looks like this in the overlay file:

Thanks for the information. Good to know somebody is using DSA like
this. The device tree description can be quite complex, especially for
some of the other switches.

> But probably this does not matter any more now that Vladimir was
> able to reproduce the issue.

Agreed.

	Andrew

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 17:07                 ` Lino Sanfilippo
@ 2021-09-09 22:54                   ` Vladimir Oltean
  2021-09-09 23:23                     ` Vladimir Oltean
  2021-09-10  2:15                     ` Florian Fainelli
  0 siblings, 2 replies; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-09 22:54 UTC (permalink / raw)
  To: Lino Sanfilippo, Saravana Kannan
  Cc: Florian Fainelli, p.rosenberger, woojung.huh, UNGLinuxDriver,
	andrew, vivien.didelot, davem, kuba, netdev, linux-kernel

On Thu, Sep 09, 2021 at 07:07:33PM +0200, Lino Sanfilippo wrote:
> > It does not scale really well to have individual drivers call
> > dsa_tree_shutdown() in their respective .shutdown callback, and in a
> > multi-switch configuration, I am not sure what the results would
> > look like.
> >
> > In premise, each driver ought to be able to call
> > dsa_unregister_switch(), along with all of the driver specific
> > shutdown and eventually, given proper device ordering the DSA tree
> > would get automatically torn down, and then the DSA master's
> > .shutdown() callback would be called.
> >
> > FWIW, the reason why we call .shutdown() in bcmgenet is to turn off
> > DMA and clocks, which matters for kexec (DMA) as well as power
> > savings (S5 mode).
>
> I agree with the scalability. Concerning the multi-switch case I dont
> know about the possible issues (I am quite new to working with DSA).
> So lets wait for Vladimirs solution.

I'm back for now and was able to spend a bit more time and understand
what is happening.

So first things first: why does DSA call dev_hold long-term on the
master, and where from?

Answer: it does so since commit 2f1e8ea726e9 ("net: dsa: link interfaces
with the DSA master to get rid of lockdep warnings"), see this call path:

dsa_slave_create
-> netdev_upper_dev_link
   -> __netdev_upper_dev_link
      -> __netdev_adjacent_dev_insert
         -> dev_hold

Ok, so since DSA holds a reference to the master interface, it is
natural that unregister_netdevice() will not finish, and it will hang
the system.

Question 2: why does bcmgenet need to unregister the net device on
shutdown?

See Florian's answer, it doesn't, strictly speaking, it just needs to
turn off the DMA and some clocks.

Question 3: can we revert commit 2f1e8ea726e9?

Answer: not so easily, we are looking at >10 commits to revert, and find
other solutions to some problems. We have built in the meantime on top
of the fact that there is an upper/lower relationship between DSA user
ports and the DSA master.

Question 4: how do other stacked interfaces deal with this?

Answer: as I said in the commit message of 2f1e8ea726e9, DSA is not
VLAN, DSA has unique challenges of its own, like a tree of struct
devices to manage, with their own lifetime. So what other drivers do is
not really relevant. Anyway, to entertain the question: VLAN watches the
NETDEV_UNREGISTER event emitted on the netdev notifier chain for its
real_dev, and effectively unregisters itself. Now this is exactly why it
is irrelevant, we can watch for NETDEV_UNREGISTER on the DSA master, but
then what? There is nothing sensible to do. Consider that in the master
unbind case (not shutdown), both the NETDEV_UNREGISTER code path will
execute, and the unbind of the DSA switch itself, due to that device
link. But let's say we delete the device link and leave only the
NETDEV_UNREGISTER code path to do something. What?
device_release_driver(ds->dev), most probably. That would effectively
force the DSA unbind path. But surprise: the DSA unbind path takes the
rtnl_mutex from quite a couple of places, and we are already under the
rtnl_lock (held by the netdev notifier chain). So, unless we schedule
the DSA device driver detach, there is an impending deadlock.
Ok, let's entertain even that: detach the DSA driver in a scheduled work
item, with the rtnl_lock not held. First off, we will trigger again the
WARN_ON solved by commit 2f1e8ea726e9 (because the unregistering of the
DSA master has "completed", but it still has an upper interface - us),
and secondly, the unregister_netdev function will have already deleted
stuff belonging to the DSA master, namely its sysfs entries. But DSA
also touches the master's sysfs, namely the "tagging" file. So NULL
pointer dereference on the master's sysfs.
So very simply put, DSA cannot unbind itself from the switch device when
the master net device unregisters. The best case scenario would be for
DSA to unbind _before_ the net device even unregisters. That was the
whole point of my attempt with the device links, to ensure shutdown
_ordering_.

Question 5: can the device core actually be patched to call
device_links_unbind_consumers() from device_shutdown()? This would
actually simplify DSA's options, and make the device links live up to
their documented expectations.

Answer: yes and no, technically it can, but it is an invasive change
which will certainly introduce regressions. See the answer to question 2
for an example. Technically .shutdown exists so that drivers can do
something lightweight to quiesce the hardware, without really caring too
much about data structure integrity (hey, the kernel is going to die
soon anyway). But some drivers, like bcmgenet, do the same thing in
.resume and .shutdown, which blurs the lines quite a lot. If the device
links were to start calling .remove at shutdown time, potentially after
.shutdown was already called, bcmgenet would effectively unregister its
net device twice. Yikes.

Question 6: How about a patch on the device core that is more lightweight?
Wouldn't it be sensible for device_shutdown() to just call ->remove if
the device's bus has no ->shutdown, and the device's driver doesn't have
a ->shutdown either?

Answer: This would sometimes work, the vast majority of DSA switch
drivers, and Ethernet controllers (in this case used as DSA masters) do
not have a .shutdown method implemented. But their bus does: PCI does,
SPI controllers do, most of the time. So it would work for limited
scenarios, but would be ineffective in the general sense.

Question 7: I said that .shutdown, as opposed to .remove, doesn't really
care so much about the integrity of data structures. So how far should
we really go to fix this issue? Should we even bother to unbind the
whole DSA tree, when the sole problem is that we are the DSA master's
upper, and that is keeping a reference on it?

Answer: Well, any solution that does unnecessary data structure teardown
only delays the reboot for nothing. Lino's patch just bluntly calls
dsa_tree_teardown() from the switch .shutdown method, and this leaks
memory, namely dst->ports. But does this really matter? Nope, so let's
extrapolate. In this case, IMO, the simplest possible solution would be
to patch bcmgenet to not unregister the net device. Then treat every
other DSA master driver in the same way as they come, one by one.
Do you need to unregister_netdevice() at shutdown? No. Then don't.
Is it nice? Probably not, but I'm not seeing alternatives.

Also, unless I'm missing something, Lino probably still sees the WARN_ON
in bcmgenet's unregister_netdevice() about eth0 getting unregistered
while having an upper interface. If not, it's by sheer luck that the DSA
switch's ->shutdown gets called before bcmgenet's ->shutdown. But for
this reason, it isn't a great solution either. If the device links can't
guarantee us some sort of shutdown ordering (what we ideally want, as
mentioned, is for the DSA switch driver to get _unbound_ (->remove)
before the DSA master gets unbound or shut down).

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 22:54                   ` Vladimir Oltean
@ 2021-09-09 23:23                     ` Vladimir Oltean
  2021-09-10  1:08                       ` Vladimir Oltean
  2021-09-10  2:15                     ` Florian Fainelli
  1 sibling, 1 reply; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-09 23:23 UTC (permalink / raw)
  To: Lino Sanfilippo, Saravana Kannan
  Cc: Florian Fainelli, p.rosenberger, woojung.huh, UNGLinuxDriver,
	andrew, vivien.didelot, davem, kuba, netdev, linux-kernel

On Fri, Sep 10, 2021 at 01:54:57AM +0300, Vladimir Oltean wrote:
> On Thu, Sep 09, 2021 at 07:07:33PM +0200, Lino Sanfilippo wrote:
> > > It does not scale really well to have individual drivers call
> > > dsa_tree_shutdown() in their respective .shutdown callback, and in a
> > > multi-switch configuration, I am not sure what the results would
> > > look like.
> > >
> > > In premise, each driver ought to be able to call
> > > dsa_unregister_switch(), along with all of the driver specific
> > > shutdown and eventually, given proper device ordering the DSA tree
> > > would get automatically torn down, and then the DSA master's
> > > .shutdown() callback would be called.
> > >
> > > FWIW, the reason why we call .shutdown() in bcmgenet is to turn off
> > > DMA and clocks, which matters for kexec (DMA) as well as power
> > > savings (S5 mode).
> >
> > I agree with the scalability. Concerning the multi-switch case I dont
> > know about the possible issues (I am quite new to working with DSA).
> > So lets wait for Vladimirs solution.
> 
> I'm back for now and was able to spend a bit more time and understand
> what is happening.
> 
> So first things first: why does DSA call dev_hold long-term on the
> master, and where from?
> 
> Answer: it does so since commit 2f1e8ea726e9 ("net: dsa: link interfaces
> with the DSA master to get rid of lockdep warnings"), see this call path:
> 
> dsa_slave_create
> -> netdev_upper_dev_link
>    -> __netdev_upper_dev_link
>       -> __netdev_adjacent_dev_insert
>          -> dev_hold
> 
> Ok, so since DSA holds a reference to the master interface, it is
> natural that unregister_netdevice() will not finish, and it will hang
> the system.
> 
> Question 2: why does bcmgenet need to unregister the net device on
> shutdown?
> 
> See Florian's answer, it doesn't, strictly speaking, it just needs to
> turn off the DMA and some clocks.
> 
> Question 3: can we revert commit 2f1e8ea726e9?
> 
> Answer: not so easily, we are looking at >10 commits to revert, and find
> other solutions to some problems. We have built in the meantime on top
> of the fact that there is an upper/lower relationship between DSA user
> ports and the DSA master.
> 
> Question 4: how do other stacked interfaces deal with this?
> 
> Answer: as I said in the commit message of 2f1e8ea726e9, DSA is not
> VLAN, DSA has unique challenges of its own, like a tree of struct
> devices to manage, with their own lifetime. So what other drivers do is
> not really relevant. Anyway, to entertain the question: VLAN watches the
> NETDEV_UNREGISTER event emitted on the netdev notifier chain for its
> real_dev, and effectively unregisters itself. Now this is exactly why it
> is irrelevant, we can watch for NETDEV_UNREGISTER on the DSA master, but
> then what? There is nothing sensible to do. Consider that in the master
> unbind case (not shutdown), both the NETDEV_UNREGISTER code path will
> execute, and the unbind of the DSA switch itself, due to that device
> link. But let's say we delete the device link and leave only the
> NETDEV_UNREGISTER code path to do something. What?
> device_release_driver(ds->dev), most probably. That would effectively
> force the DSA unbind path. But surprise: the DSA unbind path takes the
> rtnl_mutex from quite a couple of places, and we are already under the
> rtnl_lock (held by the netdev notifier chain). So, unless we schedule
> the DSA device driver detach, there is an impending deadlock.
> Ok, let's entertain even that: detach the DSA driver in a scheduled work
> item, with the rtnl_lock not held. First off, we will trigger again the
> WARN_ON solved by commit 2f1e8ea726e9 (because the unregistering of the
> DSA master has "completed", but it still has an upper interface - us),
> and secondly, the unregister_netdev function will have already deleted
> stuff belonging to the DSA master, namely its sysfs entries. But DSA
> also touches the master's sysfs, namely the "tagging" file. So NULL
> pointer dereference on the master's sysfs.
> So very simply put, DSA cannot unbind itself from the switch device when
> the master net device unregisters. The best case scenario would be for
> DSA to unbind _before_ the net device even unregisters. That was the
> whole point of my attempt with the device links, to ensure shutdown
> _ordering_.
> 
> Question 5: can the device core actually be patched to call
> device_links_unbind_consumers() from device_shutdown()? This would
> actually simplify DSA's options, and make the device links live up to
> their documented expectations.
> 
> Answer: yes and no, technically it can, but it is an invasive change
> which will certainly introduce regressions. See the answer to question 2
> for an example. Technically .shutdown exists so that drivers can do
> something lightweight to quiesce the hardware, without really caring too
> much about data structure integrity (hey, the kernel is going to die
> soon anyway). But some drivers, like bcmgenet, do the same thing in
> .resume and .shutdown, which blurs the lines quite a lot. If the device
> links were to start calling .remove at shutdown time, potentially after
> .shutdown was already called, bcmgenet would effectively unregister its
> net device twice. Yikes.
> 
> Question 6: How about a patch on the device core that is more lightweight?
> Wouldn't it be sensible for device_shutdown() to just call ->remove if
> the device's bus has no ->shutdown, and the device's driver doesn't have
> a ->shutdown either?
> 
> Answer: This would sometimes work, the vast majority of DSA switch
> drivers, and Ethernet controllers (in this case used as DSA masters) do
> not have a .shutdown method implemented. But their bus does: PCI does,
> SPI controllers do, most of the time. So it would work for limited
> scenarios, but would be ineffective in the general sense.
> 
> Question 7: I said that .shutdown, as opposed to .remove, doesn't really
> care so much about the integrity of data structures. So how far should
> we really go to fix this issue? Should we even bother to unbind the
> whole DSA tree, when the sole problem is that we are the DSA master's
> upper, and that is keeping a reference on it?
> 
> Answer: Well, any solution that does unnecessary data structure teardown
> only delays the reboot for nothing. Lino's patch just bluntly calls
> dsa_tree_teardown() from the switch .shutdown method, and this leaks
> memory, namely dst->ports. But does this really matter? Nope, so let's
> extrapolate. In this case, IMO, the simplest possible solution would be
> to patch bcmgenet to not unregister the net device. Then treat every
> other DSA master driver in the same way as they come, one by one.
> Do you need to unregister_netdevice() at shutdown? No. Then don't.
> Is it nice? Probably not, but I'm not seeing alternatives.
> 
> Also, unless I'm missing something, Lino probably still sees the WARN_ON
> in bcmgenet's unregister_netdevice() about eth0 getting unregistered
> while having an upper interface. If not, it's by sheer luck that the DSA
> switch's ->shutdown gets called before bcmgenet's ->shutdown. But for
> this reason, it isn't a great solution either. If the device links can't
> guarantee us some sort of shutdown ordering (what we ideally want, as
> mentioned, is for the DSA switch driver to get _unbound_ (->remove)
> before the DSA master gets unbound or shut down).

I forgot about this, for completeness:

Question 8: Ok, so this is an even more lightweight variant of question 6.
To patch device_shutdown here:

 		if (dev->bus && dev->bus->shutdown) {
 			if (initcall_debug)
 				dev_info(dev, "shutdown\n");
 			dev->bus->shutdown(dev);
 		} else if (dev->driver && dev->driver->shutdown) {
 			if (initcall_debug)
 				dev_info(dev, "shutdown\n");
 			dev->driver->shutdown(dev);
+		} else {
+			__device_release_driver(dev, parent);
 		}

would go towards helping DSA in general, but it wouldn't help the situation at hand,
and it would introduce regressions.

So what about patching bcmgenet (and other drivers) to implement .shutdown in the following way:

	device_release_driver(&pdev->dev);

basically this should force-unbind the driver from the device, which
would quite nicely make the device link go into action and make DSA
unbind too.

Answer: device_release_driver calls device_lock(dev), and device_shutdown
also holds that lock when it calls our ->shutdown method. So unless the
device core would be so nice so as to provide a generic shutdown method
that just unbinds the device using an unlocked version of device_release_driver,
we are back to square one with this solution. Anything that needs to patch
the device core is more or less disqualified, especially for a bug fix.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 23:23                     ` Vladimir Oltean
@ 2021-09-10  1:08                       ` Vladimir Oltean
  0 siblings, 0 replies; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-10  1:08 UTC (permalink / raw)
  To: Lino Sanfilippo, Saravana Kannan
  Cc: Florian Fainelli, p.rosenberger, woojung.huh, UNGLinuxDriver,
	andrew, vivien.didelot, davem, kuba, netdev, linux-kernel

On Fri, Sep 10, 2021 at 02:23:58AM +0300, Vladimir Oltean wrote:
> On Fri, Sep 10, 2021 at 01:54:57AM +0300, Vladimir Oltean wrote:
> > On Thu, Sep 09, 2021 at 07:07:33PM +0200, Lino Sanfilippo wrote:
> > > > It does not scale really well to have individual drivers call
> > > > dsa_tree_shutdown() in their respective .shutdown callback, and in a
> > > > multi-switch configuration, I am not sure what the results would
> > > > look like.
> > > >
> > > > In premise, each driver ought to be able to call
> > > > dsa_unregister_switch(), along with all of the driver specific
> > > > shutdown and eventually, given proper device ordering the DSA tree
> > > > would get automatically torn down, and then the DSA master's
> > > > .shutdown() callback would be called.
> > > >
> > > > FWIW, the reason why we call .shutdown() in bcmgenet is to turn off
> > > > DMA and clocks, which matters for kexec (DMA) as well as power
> > > > savings (S5 mode).
> > >
> > > I agree with the scalability. Concerning the multi-switch case I dont
> > > know about the possible issues (I am quite new to working with DSA).
> > > So lets wait for Vladimirs solution.
> > 
> > I'm back for now and was able to spend a bit more time and understand
> > what is happening.
> > 
> > So first things first: why does DSA call dev_hold long-term on the
> > master, and where from?
> > 
> > Answer: it does so since commit 2f1e8ea726e9 ("net: dsa: link interfaces
> > with the DSA master to get rid of lockdep warnings"), see this call path:
> > 
> > dsa_slave_create
> > -> netdev_upper_dev_link
> >    -> __netdev_upper_dev_link
> >       -> __netdev_adjacent_dev_insert
> >          -> dev_hold
> > 
> > Ok, so since DSA holds a reference to the master interface, it is
> > natural that unregister_netdevice() will not finish, and it will hang
> > the system.
> > 
> > Question 2: why does bcmgenet need to unregister the net device on
> > shutdown?
> > 
> > See Florian's answer, it doesn't, strictly speaking, it just needs to
> > turn off the DMA and some clocks.
> > 
> > Question 3: can we revert commit 2f1e8ea726e9?
> > 
> > Answer: not so easily, we are looking at >10 commits to revert, and find
> > other solutions to some problems. We have built in the meantime on top
> > of the fact that there is an upper/lower relationship between DSA user
> > ports and the DSA master.
> > 
> > Question 4: how do other stacked interfaces deal with this?
> > 
> > Answer: as I said in the commit message of 2f1e8ea726e9, DSA is not
> > VLAN, DSA has unique challenges of its own, like a tree of struct
> > devices to manage, with their own lifetime. So what other drivers do is
> > not really relevant. Anyway, to entertain the question: VLAN watches the
> > NETDEV_UNREGISTER event emitted on the netdev notifier chain for its
> > real_dev, and effectively unregisters itself. Now this is exactly why it
> > is irrelevant, we can watch for NETDEV_UNREGISTER on the DSA master, but
> > then what? There is nothing sensible to do. Consider that in the master
> > unbind case (not shutdown), both the NETDEV_UNREGISTER code path will
> > execute, and the unbind of the DSA switch itself, due to that device
> > link. But let's say we delete the device link and leave only the
> > NETDEV_UNREGISTER code path to do something. What?
> > device_release_driver(ds->dev), most probably. That would effectively
> > force the DSA unbind path. But surprise: the DSA unbind path takes the
> > rtnl_mutex from quite a couple of places, and we are already under the
> > rtnl_lock (held by the netdev notifier chain). So, unless we schedule
> > the DSA device driver detach, there is an impending deadlock.
> > Ok, let's entertain even that: detach the DSA driver in a scheduled work
> > item, with the rtnl_lock not held. First off, we will trigger again the
> > WARN_ON solved by commit 2f1e8ea726e9 (because the unregistering of the
> > DSA master has "completed", but it still has an upper interface - us),
> > and secondly, the unregister_netdev function will have already deleted
> > stuff belonging to the DSA master, namely its sysfs entries. But DSA
> > also touches the master's sysfs, namely the "tagging" file. So NULL
> > pointer dereference on the master's sysfs.
> > So very simply put, DSA cannot unbind itself from the switch device when
> > the master net device unregisters. The best case scenario would be for
> > DSA to unbind _before_ the net device even unregisters. That was the
> > whole point of my attempt with the device links, to ensure shutdown
> > _ordering_.
> > 
> > Question 5: can the device core actually be patched to call
> > device_links_unbind_consumers() from device_shutdown()? This would
> > actually simplify DSA's options, and make the device links live up to
> > their documented expectations.
> > 
> > Answer: yes and no, technically it can, but it is an invasive change
> > which will certainly introduce regressions. See the answer to question 2
> > for an example. Technically .shutdown exists so that drivers can do
> > something lightweight to quiesce the hardware, without really caring too
> > much about data structure integrity (hey, the kernel is going to die
> > soon anyway). But some drivers, like bcmgenet, do the same thing in
> > .resume and .shutdown, which blurs the lines quite a lot. If the device
> > links were to start calling .remove at shutdown time, potentially after
> > .shutdown was already called, bcmgenet would effectively unregister its
> > net device twice. Yikes.
> > 
> > Question 6: How about a patch on the device core that is more lightweight?
> > Wouldn't it be sensible for device_shutdown() to just call ->remove if
> > the device's bus has no ->shutdown, and the device's driver doesn't have
> > a ->shutdown either?
> > 
> > Answer: This would sometimes work, the vast majority of DSA switch
> > drivers, and Ethernet controllers (in this case used as DSA masters) do
> > not have a .shutdown method implemented. But their bus does: PCI does,
> > SPI controllers do, most of the time. So it would work for limited
> > scenarios, but would be ineffective in the general sense.
> > 
> > Question 7: I said that .shutdown, as opposed to .remove, doesn't really
> > care so much about the integrity of data structures. So how far should
> > we really go to fix this issue? Should we even bother to unbind the
> > whole DSA tree, when the sole problem is that we are the DSA master's
> > upper, and that is keeping a reference on it?
> > 
> > Answer: Well, any solution that does unnecessary data structure teardown
> > only delays the reboot for nothing. Lino's patch just bluntly calls
> > dsa_tree_teardown() from the switch .shutdown method, and this leaks
> > memory, namely dst->ports. But does this really matter? Nope, so let's
> > extrapolate. In this case, IMO, the simplest possible solution would be
> > to patch bcmgenet to not unregister the net device. Then treat every
> > other DSA master driver in the same way as they come, one by one.
> > Do you need to unregister_netdevice() at shutdown? No. Then don't.
> > Is it nice? Probably not, but I'm not seeing alternatives.
> > 
> > Also, unless I'm missing something, Lino probably still sees the WARN_ON
> > in bcmgenet's unregister_netdevice() about eth0 getting unregistered
> > while having an upper interface. If not, it's by sheer luck that the DSA
> > switch's ->shutdown gets called before bcmgenet's ->shutdown. But for
> > this reason, it isn't a great solution either. If the device links can't
> > guarantee us some sort of shutdown ordering (what we ideally want, as
> > mentioned, is for the DSA switch driver to get _unbound_ (->remove)
> > before the DSA master gets unbound or shut down).
> 
> I forgot about this, for completeness:
> 
> Question 8: Ok, so this is an even more lightweight variant of question 6.
> To patch device_shutdown here:
> 
>  		if (dev->bus && dev->bus->shutdown) {
>  			if (initcall_debug)
>  				dev_info(dev, "shutdown\n");
>  			dev->bus->shutdown(dev);
>  		} else if (dev->driver && dev->driver->shutdown) {
>  			if (initcall_debug)
>  				dev_info(dev, "shutdown\n");
>  			dev->driver->shutdown(dev);
> +		} else {
> +			__device_release_driver(dev, parent);
>  		}
> 
> would go towards helping DSA in general, but it wouldn't help the situation at hand,
> and it would introduce regressions.
> 
> So what about patching bcmgenet (and other drivers) to implement .shutdown in the following way:
> 
> 	device_release_driver(&pdev->dev);
> 
> basically this should force-unbind the driver from the device, which
> would quite nicely make the device link go into action and make DSA
> unbind too.
> 
> Answer: device_release_driver calls device_lock(dev), and device_shutdown
> also holds that lock when it calls our ->shutdown method. So unless the
> device core would be so nice so as to provide a generic shutdown method
> that just unbinds the device using an unlocked version of device_release_driver,
> we are back to square one with this solution. Anything that needs to patch
> the device core is more or less disqualified, especially for a bug fix.

Question 9: can Lino's patch set be generalized to all DSA switches,
i.e. can all DSA drivers redirect their ->shutdown method to their
->remove method?

Answer: Apart from the fact that mdio_driver_register() does not provide
for a ->shutdown() method, which can be trivially addressed, we still
have issues. The fundamental problem is that some bus device drivers
implement ->shutdown as ->remove for their own device, see dspi_shutdown()
and dspi_remove() for an example.
When that happens, the DSA switch device attached to that bus will be
(a) once unbound from its driver (by dspi_shutdown -> dspi_remove ->
spi_unregister_controller -> device_for_each_child(...unregister), and
(b) once shut down (by its own ->shutdown method).

With the "ds" structure being naturally destroyed by the first call to
the ->remove function, a second call to ->remove is impossible to
succeed without tripping some NULL pointer, from the device's ->shutdown path.
Simply said, DSA is not designed to support an unbalanced number of
calls to dsa_register_switch and dsa_unregister_switch.

In fact, if Lino's ksz9897 switch was attached to the dspi driver for a
SPI controller, even his own patches would not work and result in this
unbalance of calls that I mentioned earlier. If we're going to fix it,
let's think of something that covers all cases at least.

Simply put, it looks like we need some guidance (again) from driver core
maintainers as to what should a ->suspend method _not_ do. Specifically,
if it is okay to redirect ->suspend to ->remove. It looks like in the
case of buses doing that, and child devices doing that too, the results
are not quite sane. And maybe that would give us some clue as to what to
do about the genet driver which does the same thing.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 16:00             ` Florian Fainelli
@ 2021-09-10  1:32               ` Saravana Kannan
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan @ 2021-09-10  1:32 UTC (permalink / raw)
  To: Florian Fainelli, Rafael J. Wysocki
  Cc: Vladimir Oltean, Lino Sanfilippo, p.rosenberger, woojung.huh,
	UNGLinuxDriver, andrew, vivien.didelot, davem, kuba, netdev,
	linux-kernel

On Thu, Sep 9, 2021 at 9:00 AM Florian Fainelli <f.fainelli@gmail.com> wrote:
>
> +Saravana,
>
> On 9/9/2021 8:47 AM, Vladimir Oltean wrote:
> > On Thu, Sep 09, 2021 at 03:19:52PM +0200, Lino Sanfilippo wrote:
> >>> Do you see similar things on your 5.10 kernel?
> >>
> >> For the master device is see
> >>
> >> lrwxrwxrwx 1 root root 0 Sep  9 14:10 /sys/class/net/eth0/device/consumer:spi:spi3.0 -> ../../../virtual/devlink/platform:fd580000.ethernet--spi:spi3.0
> >
> > So this is the worst of the worst, we have a device link but it doesn't help.
> >
> > Where the device link helps is here:
> >
> > __device_release_driver
> >       while (device_links_busy(dev))
> >               device_links_unbind_consumers(dev);
> >
> > but during dev_shutdown, device_links_unbind_consumers does not get called
> > (actually I am not even sure whether it should).
> >
> > I've reproduced your issue by making this very simple change:
> >
> > diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
> > index 60d94e0a07d6..ec00f34cac47 100644
> > --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
> > +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
> > @@ -1372,6 +1372,7 @@ static struct pci_driver enetc_pf_driver = {
> >       .id_table = enetc_pf_id_table,
> >       .probe = enetc_pf_probe,
> >       .remove = enetc_pf_remove,
> > +     .shutdown = enetc_pf_remove,
> >   #ifdef CONFIG_PCI_IOV
> >       .sriov_configure = enetc_sriov_configure,
> >   #endif
> >
> > on my DSA master driver. This is what the genet driver has "special".
> >
> > I was led into grave error by Documentation/driver-api/device_link.rst,
> > which I've based my patch on, where it clearly says that device links
> > are supposed to help with shutdown ordering (how?!).
>
> I was also under the impression that device links were supposed to help
> with shutdown ordering, because it does matter a lot. One thing that I
> had to work before (and seems like it came back recently) is the
> shutdown ordering between gpio_keys.c and the GPIO controller. If you
> suspend the GPIO controller first, gpio_keys.c never gets a chance to
> keep the GPIO pin configured for a wake-up interrupt, therefore no
> wake-up event happens on key presses, whoops.

This is more of a Rafael question. Adding him. I haven't looked too
closely at device links and shutdown.

-Saravana

>
> >
> > So the question is, why did my DSA trees get torn down on shutdown?
> > Basically the short answer is that my SPI controller driver does
> > implement .shutdown, and calls the same code path as the .remove code,
> > which calls spi_unregister_controller which removes all SPI children..
> >
> > When I added this device link, one of the main objectives was to not
> > modify all DSA drivers. I was certain based on the documentation that
> > device links would help, now I'm not so sure anymore.
> >
> > So what happens is that the DSA master attempts to unregister its net
> > device on .shutdown, but DSA does not implement .shutdown, so it just
> > sits there holding a reference (supposedly via dev_hold, but where from?!)
> > to the master, which makes netdev_wait_allrefs to wait and wait.
>
> It's not coming from of_find_net_device_by_node() that's for sure and
> with OF we don't go through the code path calling
> dsa_dev_to_net_device() which does call dev_hold() and then shortly
> thereafter the caller calls dev_put() anyway.
>
> >
> > I need more time for the denial phase to pass, and to understand what
> > can actually be done. I will also be away from the keyboard for the next
> > few days, so it might take a while. Your patches obviously offer a
> > solution only for KSZ switches, we need something more general. If I
> > understand your solution, it works not by virtue of there being any
> > shutdown ordering guarantee at all, but simply due to the fact that
> > DSA's .shutdown hook gets called eventually, and the reference to the
> > master gets freed eventually, which unblocks the unregister_netdevice
> > call from the master. I don't yet understand why DSA holds a long-term
> > reference to the master, that's one thing I need to figure out.
> >
>
> Agreed.
> --
> Florian

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-09 22:54                   ` Vladimir Oltean
  2021-09-09 23:23                     ` Vladimir Oltean
@ 2021-09-10  2:15                     ` Florian Fainelli
  2021-09-10 11:51                       ` Andrew Lunn
  1 sibling, 1 reply; 38+ messages in thread
From: Florian Fainelli @ 2021-09-10  2:15 UTC (permalink / raw)
  To: Vladimir Oltean, Lino Sanfilippo, Saravana Kannan
  Cc: p.rosenberger, woojung.huh, UNGLinuxDriver, andrew,
	vivien.didelot, davem, kuba, netdev, linux-kernel



On 9/9/2021 3:54 PM, Vladimir Oltean wrote:
[snip]
> Question 6: How about a patch on the device core that is more lightweight?
> Wouldn't it be sensible for device_shutdown() to just call ->remove if
> the device's bus has no ->shutdown, and the device's driver doesn't have
> a ->shutdown either?
> 
> Answer: This would sometimes work, the vast majority of DSA switch
> drivers, and Ethernet controllers (in this case used as DSA masters) do
> not have a .shutdown method implemented. But their bus does: PCI does,
> SPI controllers do, most of the time. So it would work for limited
> scenarios, but would be ineffective in the general sense.

Having wondered about that question as well, I don't really see a 
compelling reason as to why we do not default to calling .remove() when 
.shutdown() is not implemented. In almost all of the cases the semantics 
of .remove() are superior to those required by .shutdown().

> 
> Question 7: I said that .shutdown, as opposed to .remove, doesn't really
> care so much about the integrity of data structures. So how far should
> we really go to fix this issue? Should we even bother to unbind the
> whole DSA tree, when the sole problem is that we are the DSA master's
> upper, and that is keeping a reference on it?
> 
> Answer: Well, any solution that does unnecessary data structure teardown
> only delays the reboot for nothing. Lino's patch just bluntly calls
> dsa_tree_teardown() from the switch .shutdown method, and this leaks
> memory, namely dst->ports. But does this really matter? Nope, so let's
> extrapolate. In this case, IMO, the simplest possible solution would be
> to patch bcmgenet to not unregister the net device. Then treat every
> other DSA master driver in the same way as they come, one by one.
> Do you need to unregister_netdevice() at shutdown? No. Then don't.
> Is it nice? Probably not, but I'm not seeing alternatives.

It does not really scale but we also don't have that many DSA masters to 
support, I believe I can name them all: bcmgenet, stmmac, bcmsysport, 
enetc, mv643xx_eth, cpsw, macb. If you want me to patch bcmgenet, give 
me a few days to test and make sure there is no power management 
regression, that's the primary concern I have.

> 
> Also, unless I'm missing something, Lino probably still sees the WARN_ON
> in bcmgenet's unregister_netdevice() about eth0 getting unregistered
> while having an upper interface. If not, it's by sheer luck that the DSA
> switch's ->shutdown gets called before bcmgenet's ->shutdown. But for
> this reason, it isn't a great solution either. If the device links can't
> guarantee us some sort of shutdown ordering (what we ideally want, as
> mentioned, is for the DSA switch driver to get _unbound_ (->remove)
> before the DSA master gets unbound or shut down).
> 

All of your questions are good and I don't have answers to any of them, 
however I would like you and others to reason about .shutdown() not just 
in the context of a reboot, or kexec'd kernel but also in the context of 
putting the system into ACPI S5 (via poweroff). In that case the goal is 
not only to quiesce the device, the goal is also to put it in a low 
power mode.

For bcmgenet specifically the code path that leads to a driver remove is 
well tested and is guaranteeing the network device registration, thus 
putting the PHY into suspend, shutting down DMAs, turning off clocks. 
This is a big hammer, but it gets the job done and does not introduce 
yet another code path to test, it's the same as the module removal.
-- 
Florian

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-10  2:15                     ` Florian Fainelli
@ 2021-09-10 11:51                       ` Andrew Lunn
  2021-09-10 14:58                         ` Vladimir Oltean
  0 siblings, 1 reply; 38+ messages in thread
From: Andrew Lunn @ 2021-09-10 11:51 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Vladimir Oltean, Lino Sanfilippo, Saravana Kannan, p.rosenberger,
	woojung.huh, UNGLinuxDriver, vivien.didelot, davem, kuba, netdev,
	linux-kernel

> It does not really scale but we also don't have that many DSA masters to
> support, I believe I can name them all: bcmgenet, stmmac, bcmsysport, enetc,
> mv643xx_eth, cpsw, macb.

fec, mvneta, mvpp2, i210/igb.

     Andrew

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-10 11:51                       ` Andrew Lunn
@ 2021-09-10 14:58                         ` Vladimir Oltean
  2021-09-11 11:44                           ` Vladimir Oltean
       [not found]                           ` <53f2509f-b648-b33d-1542-17a2c9d69966@gmx.de>
  0 siblings, 2 replies; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-10 14:58 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli
  Cc: Lino Sanfilippo, Saravana Kannan, Rafael J. Wysocki,
	p.rosenberger, woojung.huh, UNGLinuxDriver, vivien.didelot,
	davem, kuba, netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1092 bytes --]

On Fri, Sep 10, 2021 at 01:51:56PM +0200, Andrew Lunn wrote:
> > It does not really scale but we also don't have that many DSA masters to
> > support, I believe I can name them all: bcmgenet, stmmac, bcmsysport, enetc,
> > mv643xx_eth, cpsw, macb.
> 
> fec, mvneta, mvpp2, i210/igb.

I can probably double that list only with Freescale/NXP Ethernet
drivers, some of which are not even submitted to mainline. To name some
mainline drivers: gianfar, dpaa-eth, dpaa2-eth, dpaa2-switch, ucc_geth.
Also consider that DSA/switchdev drivers can also be DSA masters of
their own, we have boards doing that too.

Anyway, I've decided to at least try and accept the fact that DSA
masters will unregister their net_device on shutdown, and attempt to do
something sane for all DSA switches in that case.

Attached are two patches (they are fairly big so I won't paste them
inline, and I would like initial feedback before posting them to the
list).

As mentioned in those patches, the shutdown ordering guarantee is still
very important, I still have no clue what goes on there, what we need to
do, etc.

[-- Attachment #2: 0001-net-mdio-introduce-a-shutdown-method-to-mdio-device-.patch --]
[-- Type: text/x-diff, Size: 2282 bytes --]

From 6cf2bf18535cd11da5168fa574061b2064a7b40a Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Fri, 10 Sep 2021 03:13:39 +0300
Subject: [RFC PATCH net 1/2] net: mdio: introduce a shutdown method to mdio
 device drivers

MDIO devices might have interrupts and other things that might need
quiesced when we kexec into a new kernel. Things are even more creepy
when those interrupt lines are shared, and in that case it is absolutely
mandatory to disable all interrupt sources.

Moreover, MDIO devices might be DSA switches, and DSA needs its own
shutdown method to unlink from the DSA master.

So introduce a ->shutdown method in the MDIO device driver structure.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/phy/mdio_device.c | 11 +++++++++++
 include/linux/mdio.h          |  3 +++
 2 files changed, 14 insertions(+)

diff --git a/drivers/net/phy/mdio_device.c b/drivers/net/phy/mdio_device.c
index c94cb5382dc9..250742ffdfd9 100644
--- a/drivers/net/phy/mdio_device.c
+++ b/drivers/net/phy/mdio_device.c
@@ -179,6 +179,16 @@ static int mdio_remove(struct device *dev)
 	return 0;
 }
 
+static void mdio_shutdown(struct device *dev)
+{
+	struct mdio_device *mdiodev = to_mdio_device(dev);
+	struct device_driver *drv = mdiodev->dev.driver;
+	struct mdio_driver *mdiodrv = to_mdio_driver(drv);
+
+	if (mdiodrv->shutdown)
+		mdiodrv->shutdown(mdiodev);
+}
+
 /**
  * mdio_driver_register - register an mdio_driver with the MDIO layer
  * @drv: new mdio_driver to register
@@ -193,6 +203,7 @@ int mdio_driver_register(struct mdio_driver *drv)
 	mdiodrv->driver.bus = &mdio_bus_type;
 	mdiodrv->driver.probe = mdio_probe;
 	mdiodrv->driver.remove = mdio_remove;
+	mdiodrv->driver.shutdown = mdio_shutdown;
 
 	retval = driver_register(&mdiodrv->driver);
 	if (retval) {
diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index ffb787d5ebde..5e6dc38f418e 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -80,6 +80,9 @@ struct mdio_driver {
 
 	/* Clears up any memory if needed */
 	void (*remove)(struct mdio_device *mdiodev);
+
+	/* Quiesces the device on system shutdown, turns off interrupts etc */
+	void (*shutdown)(struct mdio_device *mdiodev);
 };
 
 static inline struct mdio_driver *
-- 
2.25.1


[-- Attachment #3: 0002-net-dsa-implement-a-shutdown-procedure-and-call-it-f.patch --]
[-- Type: text/x-diff, Size: 42300 bytes --]

From 5458da3f6a3db71d0bb77dec5cac6e53ca9a14c6 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Fri, 10 Sep 2021 03:13:18 +0300
Subject: [RFC PATCH net 2/2] net: dsa: implement a shutdown procedure and call
 it from all drivers

Lino reports that on his system with bcmgenet as DSA master and KSZ9897
as a switch, rebooting or shutting down never works properly.

This message can be seen in a loop, and it hangs the reboot process there:

unregister_netdevice: waiting for eth0 to become free. Usage count = 3

So why 3?

A usage count of 1 is normal for a registered network interface, and any
virtual interface which links itself as an upper of that will increment
it via dev_hold. In the case of DSA, this is the call path:

dsa_slave_create
-> netdev_upper_dev_link
   -> __netdev_upper_dev_link
      -> __netdev_adjacent_dev_insert
         -> dev_hold

So a DSA switch with 3 interfaces will result in a usage count elevated
by two, and netdev_wait_allrefs will wait until they have gone away.

Other stacked interfaces, like VLAN, watch NETDEV_UNREGISTER events and
delete themselves, but DSA cannot just vanish and go poof, at most it
can unbind itself from the switch devices, but that must happen strictly
earlier compared to when the DSA master unregisters its net_device.

It seems that it is a pretty established pattern to have a driver's
->shutdown hook redirect to its ->remove hook, so the same code is
executed regardless of whether the driver is unbound from the device, or
the system is just shutting down. As Florian puts it, it is quite a big
hammer for bcmgenet to unregister its net_device during shutdown, but
having a common code path with the driver unbind helps ensure it is well
tested.

So DSA, for better or for worse, has to live with that and engage in an
arms race of implementing the ->shutdown hook too, from all individual
drivers, and do something sane when paired with masters that unregister
their net_device there. The only sane thing to do, of course, is to
unlink from the master.

However, complications arise really quickly.

The pattern of redirecting ->shutdown to ->remove is not unique to
bcmgenet or even to net_device drivers. In fact, SPI controllers do it
too (see dspi_shutdown -> dspi_remove), and presumably, I2C controllers
and MDIO controllers do it too. Since DSA switches might be SPI devices,
I2C devices, MDIO devices, the insane implication is that for the exact
same DSA switch device, we might have both ->shutdown and ->remove
getting called.

So we need to do something with that insane environment. The pattern
I've come up with is "if this, then not that", so if either ->shutdown
or ->remove gets called, we set the device's drvdata to NULL, and in the
other hook, we check whether the drvdata is NULL and just do nothing.
This is probably not necessary for platform devices, just for devices on
buses, but I would really insist for consistency among drivers, because
when code is copy-pasted, it is not always copy-pasted from the best
sources.

So depending on whether the DSA switch's ->remove or ->shutdown will get
called first, we cannot really guarantee even for the same driver if
rebooting will result in the same code path on all platforms. But
nonetheless, we need to do something minimally reasonable on ->shutdown
too to fix the bug. Of course, the ->remove will do more (a full
teardown of the tree, with all data structures freed, and this is why
the bug was not caught for so long). The new ->shutdown method is kept
separate from dsa_unregister_switch not because we couldn't have
unregistered the switch, but simply in the interest of doing something
quick and to the point.

One thing I still don't know is whether the device core guarantees that
DSA's {->remove, ->shutdown} methods will be called before the master's.
We have a device_link between them, but experimentally that seems to
help only with the unbind path, not even clear if it does something on
shutdown (documentation says it does, but I don't see what).

If the device core doesn't guarantee that, then:
- the master net_device reference hasn't gone away, remember we have a
  dev_hold on it, but
- the WARN_ON in unregister_netdevice will still trigger, saying that
  unregister is attempted for an interface with uppers.

My limited testing has shown this to not happen, but obviously I
wouldn't rely on experimentation alone.

Fixes: 2f1e8ea726e9 ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings")
Link: https://lore.kernel.org/netdev/20210909095324.12978-1-LinoSanfilippo@gmx.de/
Reported-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/dsa/b53/b53_mdio.c             | 21 +++++++++-
 drivers/net/dsa/b53/b53_mmap.c             | 13 ++++++
 drivers/net/dsa/b53/b53_priv.h             |  5 +++
 drivers/net/dsa/b53/b53_spi.c              | 13 ++++++
 drivers/net/dsa/b53/b53_srab.c             | 21 +++++++++-
 drivers/net/dsa/bcm_sf2.c                  | 12 ++++++
 drivers/net/dsa/dsa_loop.c                 | 22 ++++++++++-
 drivers/net/dsa/hirschmann/hellcreek.c     | 16 ++++++++
 drivers/net/dsa/lan9303-core.c             |  6 +++
 drivers/net/dsa/lan9303.h                  |  1 +
 drivers/net/dsa/lan9303_i2c.c              | 24 +++++++++--
 drivers/net/dsa/lan9303_mdio.c             | 15 +++++++
 drivers/net/dsa/lantiq_gswip.c             | 18 +++++++++
 drivers/net/dsa/microchip/ksz8795_spi.c    | 11 +++++-
 drivers/net/dsa/microchip/ksz8863_smi.c    | 13 ++++++
 drivers/net/dsa/microchip/ksz9477_i2c.c    | 14 ++++++-
 drivers/net/dsa/microchip/ksz9477_spi.c    |  8 +++-
 drivers/net/dsa/mt7530.c                   | 18 +++++++++
 drivers/net/dsa/mv88e6060.c                | 18 +++++++++
 drivers/net/dsa/mv88e6xxx/chip.c           | 22 ++++++++++-
 drivers/net/dsa/ocelot/felix_vsc9959.c     | 20 +++++++++-
 drivers/net/dsa/ocelot/seville_vsc9953.c   | 20 +++++++++-
 drivers/net/dsa/qca/ar9331.c               | 18 +++++++++
 drivers/net/dsa/qca8k.c                    | 18 +++++++++
 drivers/net/dsa/realtek-smi-core.c         | 20 +++++++++-
 drivers/net/dsa/sja1105/sja1105_main.c     | 21 +++++++++-
 drivers/net/dsa/vitesse-vsc73xx-core.c     |  6 +++
 drivers/net/dsa/vitesse-vsc73xx-platform.c | 22 ++++++++++-
 drivers/net/dsa/vitesse-vsc73xx-spi.c      | 22 ++++++++++-
 drivers/net/dsa/vitesse-vsc73xx.h          |  1 +
 drivers/net/dsa/xrs700x/xrs700x.c          |  6 +++
 drivers/net/dsa/xrs700x/xrs700x.h          |  1 +
 drivers/net/dsa/xrs700x/xrs700x_i2c.c      | 18 +++++++++
 drivers/net/dsa/xrs700x/xrs700x_mdio.c     | 18 +++++++++
 include/net/dsa.h                          |  1 +
 net/dsa/dsa2.c                             | 46 ++++++++++++++++++++++
 36 files changed, 525 insertions(+), 24 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_mdio.c b/drivers/net/dsa/b53/b53_mdio.c
index a533a90e3904..a7aeb3c132c9 100644
--- a/drivers/net/dsa/b53/b53_mdio.c
+++ b/drivers/net/dsa/b53/b53_mdio.c
@@ -351,9 +351,25 @@ static int b53_mdio_probe(struct mdio_device *mdiodev)
 static void b53_mdio_remove(struct mdio_device *mdiodev)
 {
 	struct b53_device *dev = dev_get_drvdata(&mdiodev->dev);
-	struct dsa_switch *ds = dev->ds;
 
-	dsa_unregister_switch(ds);
+	if (!dev)
+		return;
+
+	b53_switch_remove(dev);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void b53_mdio_shutdown(struct mdio_device *mdiodev)
+{
+	struct b53_device *dev = dev_get_drvdata(&mdiodev->dev);
+
+	if (!dev)
+		return;
+
+	b53_switch_shutdown(dev);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }
 
 static const struct of_device_id b53_of_match[] = {
@@ -373,6 +389,7 @@ MODULE_DEVICE_TABLE(of, b53_of_match);
 static struct mdio_driver b53_mdio_driver = {
 	.probe	= b53_mdio_probe,
 	.remove	= b53_mdio_remove,
+	.shutdown = b53_mdio_shutdown,
 	.mdiodrv.driver = {
 		.name = "bcm53xx",
 		.of_match_table = b53_of_match,
diff --git a/drivers/net/dsa/b53/b53_mmap.c b/drivers/net/dsa/b53/b53_mmap.c
index 82680e083cc2..62ea45f492f3 100644
--- a/drivers/net/dsa/b53/b53_mmap.c
+++ b/drivers/net/dsa/b53/b53_mmap.c
@@ -316,9 +316,21 @@ static int b53_mmap_remove(struct platform_device *pdev)
 	if (dev)
 		b53_switch_remove(dev);
 
+	platform_set_drvdata(pdev, NULL);
+
 	return 0;
 }
 
+static int b53_mmap_shutdown(struct platform_device *pdev)
+{
+	struct b53_device *dev = platform_get_drvdata(pdev);
+
+	if (dev)
+		b53_switch_shutdown(dev);
+
+	platform_set_drvdata(pdev, NULL);
+}
+
 static const struct of_device_id b53_mmap_of_table[] = {
 	{ .compatible = "brcm,bcm3384-switch" },
 	{ .compatible = "brcm,bcm6328-switch" },
@@ -331,6 +343,7 @@ MODULE_DEVICE_TABLE(of, b53_mmap_of_table);
 static struct platform_driver b53_mmap_driver = {
 	.probe = b53_mmap_probe,
 	.remove = b53_mmap_remove,
+	.shutdown = b53_mmap_shutdown,
 	.driver = {
 		.name = "b53-switch",
 		.of_match_table = b53_mmap_of_table,
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 5d068acf7cf8..959a52d41f0a 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -228,6 +228,11 @@ static inline void b53_switch_remove(struct b53_device *dev)
 	dsa_unregister_switch(dev->ds);
 }
 
+static inline void b53_switch_shutdown(struct b53_device *dev)
+{
+	dsa_switch_shutdown(dev->ds);
+}
+
 #define b53_build_op(type_op_size, val_type)				\
 static inline int b53_##type_op_size(struct b53_device *dev, u8 page,	\
 				     u8 reg, val_type val)		\
diff --git a/drivers/net/dsa/b53/b53_spi.c b/drivers/net/dsa/b53/b53_spi.c
index ecb9f7f6b335..01e37b75471e 100644
--- a/drivers/net/dsa/b53/b53_spi.c
+++ b/drivers/net/dsa/b53/b53_spi.c
@@ -321,9 +321,21 @@ static int b53_spi_remove(struct spi_device *spi)
 	if (dev)
 		b53_switch_remove(dev);
 
+	spi_set_drvdata(spi, NULL);
+
 	return 0;
 }
 
+static void b53_spi_shutdown(struct spi_device *spi)
+{
+	struct b53_device *dev = spi_get_drvdata(spi);
+
+	if (dev)
+		b53_switch_shutdown(dev);
+
+	spi_set_drvdata(spi, NULL);
+}
+
 static const struct of_device_id b53_spi_of_match[] = {
 	{ .compatible = "brcm,bcm5325" },
 	{ .compatible = "brcm,bcm5365" },
@@ -344,6 +356,7 @@ static struct spi_driver b53_spi_driver = {
 	},
 	.probe	= b53_spi_probe,
 	.remove	= b53_spi_remove,
+	.shutdown = b53_spi_shutdown,
 };
 
 module_spi_driver(b53_spi_driver);
diff --git a/drivers/net/dsa/b53/b53_srab.c b/drivers/net/dsa/b53/b53_srab.c
index 3f4249de70c5..4591bb1c05d2 100644
--- a/drivers/net/dsa/b53/b53_srab.c
+++ b/drivers/net/dsa/b53/b53_srab.c
@@ -629,17 +629,34 @@ static int b53_srab_probe(struct platform_device *pdev)
 static int b53_srab_remove(struct platform_device *pdev)
 {
 	struct b53_device *dev = platform_get_drvdata(pdev);
-	struct b53_srab_priv *priv = dev->priv;
 
-	b53_srab_intr_set(priv, false);
+	if (!dev)
+		return 0;
+
+	b53_srab_intr_set(dev->priv, false);
 	b53_switch_remove(dev);
 
+	platform_set_drvdata(pdev, NULL);
+
 	return 0;
 }
 
+static void b53_srab_shutdown(struct platform_device *pdev)
+{
+	struct b53_device *dev = platform_get_drvdata(pdev);
+
+	if (!dev)
+		return;
+
+	b53_switch_shutdown(dev);
+
+	platform_set_drvdata(pdev, NULL);
+}
+
 static struct platform_driver b53_srab_driver = {
 	.probe = b53_srab_probe,
 	.remove = b53_srab_remove,
+	.shutdown = b53_srab_shutdown,
 	.driver = {
 		.name = "b53-srab-switch",
 		.of_match_table = b53_srab_of_match,
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 6ce9ec1283e0..520fbb56c202 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1512,6 +1512,9 @@ static int bcm_sf2_sw_remove(struct platform_device *pdev)
 {
 	struct bcm_sf2_priv *priv = platform_get_drvdata(pdev);
 
+	if (!priv)
+		return 0;
+
 	priv->wol_ports_mask = 0;
 	/* Disable interrupts */
 	bcm_sf2_intr_disable(priv);
@@ -1523,6 +1526,8 @@ static int bcm_sf2_sw_remove(struct platform_device *pdev)
 	if (priv->type == BCM7278_DEVICE_ID)
 		reset_control_assert(priv->rcdev);
 
+	platform_set_drvdata(pdev, NULL);
+
 	return 0;
 }
 
@@ -1530,6 +1535,9 @@ static void bcm_sf2_sw_shutdown(struct platform_device *pdev)
 {
 	struct bcm_sf2_priv *priv = platform_get_drvdata(pdev);
 
+	if (!priv)
+		return;
+
 	/* For a kernel about to be kexec'd we want to keep the GPHY on for a
 	 * successful MDIO bus scan to occur. If we did turn off the GPHY
 	 * before (e.g: port_disable), this will also power it back on.
@@ -1538,6 +1546,10 @@ static void bcm_sf2_sw_shutdown(struct platform_device *pdev)
 	 */
 	if (priv->hw_params.num_gphy == 1)
 		bcm_sf2_gphy_enable_set(priv->dev->ds, true);
+
+	dsa_switch_shutdown(priv->dev->ds);
+
+	platform_set_drvdata(pdev, NULL);
 }
 
 #ifdef CONFIG_PM_SLEEP
diff --git a/drivers/net/dsa/dsa_loop.c b/drivers/net/dsa/dsa_loop.c
index bfdf3324aac3..e638e3eea911 100644
--- a/drivers/net/dsa/dsa_loop.c
+++ b/drivers/net/dsa/dsa_loop.c
@@ -340,10 +340,29 @@ static int dsa_loop_drv_probe(struct mdio_device *mdiodev)
 static void dsa_loop_drv_remove(struct mdio_device *mdiodev)
 {
 	struct dsa_switch *ds = dev_get_drvdata(&mdiodev->dev);
-	struct dsa_loop_priv *ps = ds->priv;
+	struct dsa_loop_priv *ps;
+
+	if (!ds)
+		return;
+
+	ps = ds->priv;
 
 	dsa_unregister_switch(ds);
 	dev_put(ps->netdev);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void dsa_loop_drv_shutdown(struct mdio_device *mdiodev)
+{
+	struct dsa_switch *ds = dev_get_drvdata(&mdiodev->dev);
+
+	if (!ds)
+		return;
+
+	dsa_switch_shutdown(ds);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }
 
 static struct mdio_driver dsa_loop_drv = {
@@ -352,6 +371,7 @@ static struct mdio_driver dsa_loop_drv = {
 	},
 	.probe	= dsa_loop_drv_probe,
 	.remove	= dsa_loop_drv_remove,
+	.shutdown = dsa_loop_drv_shutdown,
 };
 
 #define NUM_FIXED_PHYS	(DSA_LOOP_NUM_PORTS - 2)
diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
index 542cfc4ccb08..354655f9ed00 100644
--- a/drivers/net/dsa/hirschmann/hellcreek.c
+++ b/drivers/net/dsa/hirschmann/hellcreek.c
@@ -1916,6 +1916,9 @@ static int hellcreek_remove(struct platform_device *pdev)
 {
 	struct hellcreek *hellcreek = platform_get_drvdata(pdev);
 
+	if (!hellcreek)
+		return 0;
+
 	hellcreek_hwtstamp_free(hellcreek);
 	hellcreek_ptp_free(hellcreek);
 	dsa_unregister_switch(hellcreek->ds);
@@ -1924,6 +1927,18 @@ static int hellcreek_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void hellcreek_shutdown(struct platform_device *pdev)
+{
+	struct hellcreek *hellcreek = platform_get_drvdata(pdev);
+
+	if (!hellcreek)
+		return;
+
+	dsa_switch_shutdown(hellcreek->ds);
+
+	platform_set_drvdata(pdev, NULL);
+}
+
 static const struct hellcreek_platform_data de1soc_r1_pdata = {
 	.name		 = "r4c30",
 	.num_ports	 = 4,
@@ -1946,6 +1961,7 @@ MODULE_DEVICE_TABLE(of, hellcreek_of_match);
 static struct platform_driver hellcreek_driver = {
 	.probe	= hellcreek_probe,
 	.remove = hellcreek_remove,
+	.shutdown = hellcreek_shutdown,
 	.driver = {
 		.name = "hellcreek",
 		.of_match_table = hellcreek_of_match,
diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index d7ce281570b5..89f920289ae2 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -1379,6 +1379,12 @@ int lan9303_remove(struct lan9303 *chip)
 }
 EXPORT_SYMBOL(lan9303_remove);
 
+void lan9303_shutdown(struct lan9303 *chip)
+{
+	dsa_switch_shutdown(chip->ds);
+}
+EXPORT_SYMBOL(lan9303_shutdown);
+
 MODULE_AUTHOR("Juergen Borleis <kernel@pengutronix.de>");
 MODULE_DESCRIPTION("Core driver for SMSC/Microchip LAN9303 three port ethernet switch");
 MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/dsa/lan9303.h b/drivers/net/dsa/lan9303.h
index 11f590b64701..c7f73efa50f0 100644
--- a/drivers/net/dsa/lan9303.h
+++ b/drivers/net/dsa/lan9303.h
@@ -10,3 +10,4 @@ extern const struct lan9303_phy_ops lan9303_indirect_phy_ops;
 
 int lan9303_probe(struct lan9303 *chip, struct device_node *np);
 int lan9303_remove(struct lan9303 *chip);
+void lan9303_shutdown(struct lan9303 *chip);
diff --git a/drivers/net/dsa/lan9303_i2c.c b/drivers/net/dsa/lan9303_i2c.c
index 9bffaef65a04..8ca4713310fa 100644
--- a/drivers/net/dsa/lan9303_i2c.c
+++ b/drivers/net/dsa/lan9303_i2c.c
@@ -67,13 +67,28 @@ static int lan9303_i2c_probe(struct i2c_client *client,
 
 static int lan9303_i2c_remove(struct i2c_client *client)
 {
-	struct lan9303_i2c *sw_dev;
+	struct lan9303_i2c *sw_dev = i2c_get_clientdata(client);
 
-	sw_dev = i2c_get_clientdata(client);
 	if (!sw_dev)
-		return -ENODEV;
+		return 0;
+
+	lan9303_remove(&sw_dev->chip);
+
+	i2c_set_clientdata(client, NULL);
+
+	return 0;
+}
+
+static void lan9303_i2c_shutdown(struct i2c_client *client)
+{
+	struct lan9303_i2c *sw_dev = i2c_get_clientdata(client);
+
+	if (!sw_dev)
+		return;
+
+	lan9303_shutdown(&sw_dev->chip);
 
-	return lan9303_remove(&sw_dev->chip);
+	i2c_set_clientdata(client, NULL);
 }
 
 /*-------------------------------------------------------------------------*/
@@ -97,6 +112,7 @@ static struct i2c_driver lan9303_i2c_driver = {
 	},
 	.probe = lan9303_i2c_probe,
 	.remove = lan9303_i2c_remove,
+	.shutdown = lan9303_i2c_shutdown,
 	.id_table = lan9303_i2c_id,
 };
 module_i2c_driver(lan9303_i2c_driver);
diff --git a/drivers/net/dsa/lan9303_mdio.c b/drivers/net/dsa/lan9303_mdio.c
index 9cbe80460b53..bbb7032409ba 100644
--- a/drivers/net/dsa/lan9303_mdio.c
+++ b/drivers/net/dsa/lan9303_mdio.c
@@ -138,6 +138,20 @@ static void lan9303_mdio_remove(struct mdio_device *mdiodev)
 		return;
 
 	lan9303_remove(&sw_dev->chip);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void lan9303_mdio_shutdown(struct mdio_device *mdiodev)
+{
+	struct lan9303_mdio *sw_dev = dev_get_drvdata(&mdiodev->dev);
+
+	if (!sw_dev)
+		return;
+
+	lan9303_shutdown(&sw_dev->chip);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }
 
 /*-------------------------------------------------------------------------*/
@@ -155,6 +169,7 @@ static struct mdio_driver lan9303_mdio_driver = {
 	},
 	.probe  = lan9303_mdio_probe,
 	.remove = lan9303_mdio_remove,
+	.shutdown = lan9303_mdio_shutdown,
 };
 mdio_module_driver(lan9303_mdio_driver);
 
diff --git a/drivers/net/dsa/lantiq_gswip.c b/drivers/net/dsa/lantiq_gswip.c
index 64d6dfa83122..6d8ca3613a42 100644
--- a/drivers/net/dsa/lantiq_gswip.c
+++ b/drivers/net/dsa/lantiq_gswip.c
@@ -2178,6 +2178,9 @@ static int gswip_remove(struct platform_device *pdev)
 	struct gswip_priv *priv = platform_get_drvdata(pdev);
 	int i;
 
+	if (!priv)
+		return 0;
+
 	/* disable the switch */
 	gswip_mdio_mask(priv, GSWIP_MDIO_GLOB_ENABLE, 0, GSWIP_MDIO_GLOB);
 
@@ -2191,9 +2194,23 @@ static int gswip_remove(struct platform_device *pdev)
 	for (i = 0; i < priv->num_gphy_fw; i++)
 		gswip_gphy_fw_remove(priv, &priv->gphy_fw[i]);
 
+	platform_set_drvdata(pdev, NULL);
+
 	return 0;
 }
 
+static void gswip_shutdown(struct platform_device *pdev)
+{
+	struct gswip_priv *priv = platform_get_drvdata(pdev);
+
+	if (!priv)
+		return;
+
+	dsa_switch_shutdown(priv->ds);
+
+	platform_set_drvdata(pdev, NULL);
+}
+
 static const struct gswip_hw_info gswip_xrx200 = {
 	.max_ports = 7,
 	.cpu_port = 6,
@@ -2217,6 +2234,7 @@ MODULE_DEVICE_TABLE(of, gswip_of_match);
 static struct platform_driver gswip_driver = {
 	.probe = gswip_probe,
 	.remove = gswip_remove,
+	.shutdown = gswip_shutdown,
 	.driver = {
 		.name = "gswip",
 		.of_match_table = gswip_of_match,
diff --git a/drivers/net/dsa/microchip/ksz8795_spi.c b/drivers/net/dsa/microchip/ksz8795_spi.c
index ea7550d1b634..866767b70d65 100644
--- a/drivers/net/dsa/microchip/ksz8795_spi.c
+++ b/drivers/net/dsa/microchip/ksz8795_spi.c
@@ -94,6 +94,8 @@ static int ksz8795_spi_remove(struct spi_device *spi)
 	if (dev)
 		ksz_switch_remove(dev);
 
+	spi_set_drvdata(spi, NULL);
+
 	return 0;
 }
 
@@ -101,8 +103,15 @@ static void ksz8795_spi_shutdown(struct spi_device *spi)
 {
 	struct ksz_device *dev = spi_get_drvdata(spi);
 
-	if (dev && dev->dev_ops->shutdown)
+	if (!dev)
+		return;
+
+	if (dev->dev_ops->shutdown)
 		dev->dev_ops->shutdown(dev);
+
+	dsa_switch_shutdown(dev->ds);
+
+	spi_set_drvdata(spi, NULL);
 }
 
 static const struct of_device_id ksz8795_dt_ids[] = {
diff --git a/drivers/net/dsa/microchip/ksz8863_smi.c b/drivers/net/dsa/microchip/ksz8863_smi.c
index 11293485138c..5883fa7edda2 100644
--- a/drivers/net/dsa/microchip/ksz8863_smi.c
+++ b/drivers/net/dsa/microchip/ksz8863_smi.c
@@ -191,6 +191,18 @@ static void ksz8863_smi_remove(struct mdio_device *mdiodev)
 
 	if (dev)
 		ksz_switch_remove(dev);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void ksz8863_smi_shutdown(struct mdio_device *mdiodev)
+{
+	struct ksz_device *dev = dev_get_drvdata(&mdiodev->dev);
+
+	if (dev)
+		dsa_switch_shutdown(dev->ds);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }
 
 static const struct of_device_id ksz8863_dt_ids[] = {
@@ -203,6 +215,7 @@ MODULE_DEVICE_TABLE(of, ksz8863_dt_ids);
 static struct mdio_driver ksz8863_driver = {
 	.probe	= ksz8863_smi_probe,
 	.remove	= ksz8863_smi_remove,
+	.shutdown = ksz8863_smi_shutdown,
 	.mdiodrv.driver = {
 		.name	= "ksz8863-switch",
 		.of_match_table = ksz8863_dt_ids,
diff --git a/drivers/net/dsa/microchip/ksz9477_i2c.c b/drivers/net/dsa/microchip/ksz9477_i2c.c
index 4e053a25d077..f3afb8b8c4cc 100644
--- a/drivers/net/dsa/microchip/ksz9477_i2c.c
+++ b/drivers/net/dsa/microchip/ksz9477_i2c.c
@@ -56,7 +56,10 @@ static int ksz9477_i2c_remove(struct i2c_client *i2c)
 {
 	struct ksz_device *dev = i2c_get_clientdata(i2c);
 
-	ksz_switch_remove(dev);
+	if (dev)
+		ksz_switch_remove(dev);
+
+	i2c_set_clientdata(i2c, NULL);
 
 	return 0;
 }
@@ -65,8 +68,15 @@ static void ksz9477_i2c_shutdown(struct i2c_client *i2c)
 {
 	struct ksz_device *dev = i2c_get_clientdata(i2c);
 
-	if (dev && dev->dev_ops->shutdown)
+	if (!dev)
+		return;
+
+	if (dev->dev_ops->shutdown)
 		dev->dev_ops->shutdown(dev);
+
+	dsa_switch_shutdown(dev->ds);
+
+	i2c_set_clientdata(i2c, NULL);
 }
 
 static const struct i2c_device_id ksz9477_i2c_id[] = {
diff --git a/drivers/net/dsa/microchip/ksz9477_spi.c b/drivers/net/dsa/microchip/ksz9477_spi.c
index 15bc11b3cda4..e3cb0e6c9f6f 100644
--- a/drivers/net/dsa/microchip/ksz9477_spi.c
+++ b/drivers/net/dsa/microchip/ksz9477_spi.c
@@ -72,6 +72,8 @@ static int ksz9477_spi_remove(struct spi_device *spi)
 	if (dev)
 		ksz_switch_remove(dev);
 
+	spi_set_drvdata(spi, NULL);
+
 	return 0;
 }
 
@@ -79,8 +81,10 @@ static void ksz9477_spi_shutdown(struct spi_device *spi)
 {
 	struct ksz_device *dev = spi_get_drvdata(spi);
 
-	if (dev && dev->dev_ops->shutdown)
-		dev->dev_ops->shutdown(dev);
+	if (dev)
+		dsa_switch_shutdown(dev->ds);
+
+	spi_set_drvdata(spi, NULL);
 }
 
 static const struct of_device_id ksz9477_dt_ids[] = {
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index d0cba2d1cd68..094737e5084a 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -3286,6 +3286,9 @@ mt7530_remove(struct mdio_device *mdiodev)
 	struct mt7530_priv *priv = dev_get_drvdata(&mdiodev->dev);
 	int ret = 0;
 
+	if (!priv)
+		return;
+
 	ret = regulator_disable(priv->core_pwr);
 	if (ret < 0)
 		dev_err(priv->dev,
@@ -3301,11 +3304,26 @@ mt7530_remove(struct mdio_device *mdiodev)
 
 	dsa_unregister_switch(priv->ds);
 	mutex_destroy(&priv->reg_mutex);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void mt7530_shutdown(struct mdio_device *mdiodev)
+{
+	struct mt7530_priv *priv = dev_get_drvdata(&mdiodev->dev);
+
+	if (!priv)
+		return;
+
+	dsa_switch_shutdown(priv->ds);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }
 
 static struct mdio_driver mt7530_mdio_driver = {
 	.probe  = mt7530_probe,
 	.remove = mt7530_remove,
+	.shutdown = mt7530_shutdown,
 	.mdiodrv.driver = {
 		.name = "mt7530",
 		.of_match_table = mt7530_of_match,
diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
index 24b8219fd607..a4c6eb9a52d0 100644
--- a/drivers/net/dsa/mv88e6060.c
+++ b/drivers/net/dsa/mv88e6060.c
@@ -290,7 +290,24 @@ static void mv88e6060_remove(struct mdio_device *mdiodev)
 {
 	struct dsa_switch *ds = dev_get_drvdata(&mdiodev->dev);
 
+	if (!ds)
+		return;
+
 	dsa_unregister_switch(ds);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void mv88e6060_shutdown(struct mdio_device *mdiodev)
+{
+	struct dsa_switch *ds = dev_get_drvdata(&mdiodev->dev);
+
+	if (!ds)
+		return;
+
+	dsa_switch_shutdown(ds);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }
 
 static const struct of_device_id mv88e6060_of_match[] = {
@@ -303,6 +320,7 @@ static const struct of_device_id mv88e6060_of_match[] = {
 static struct mdio_driver mv88e6060_driver = {
 	.probe	= mv88e6060_probe,
 	.remove = mv88e6060_remove,
+	.shutdown = mv88e6060_shutdown,
 	.mdiodrv.driver = {
 		.name = "mv88e6060",
 		.of_match_table = mv88e6060_of_match,
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index c45ca2473743..fb10422d2c33 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -6389,7 +6389,12 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
 static void mv88e6xxx_remove(struct mdio_device *mdiodev)
 {
 	struct dsa_switch *ds = dev_get_drvdata(&mdiodev->dev);
-	struct mv88e6xxx_chip *chip = ds->priv;
+	struct mv88e6xxx_chip *chip;
+
+	if (!ds)
+		return;
+
+	chip = ds->priv;
 
 	if (chip->info->ptp_support) {
 		mv88e6xxx_hwtstamp_free(chip);
@@ -6410,6 +6415,20 @@ static void mv88e6xxx_remove(struct mdio_device *mdiodev)
 		mv88e6xxx_g1_irq_free(chip);
 	else
 		mv88e6xxx_irq_poll_free(chip);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void mv88e6xxx_shutdown(struct mdio_device *mdiodev)
+{
+	struct dsa_switch *ds = dev_get_drvdata(&mdiodev->dev);
+
+	if (!ds)
+		return;
+
+	dsa_switch_shutdown(ds);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }
 
 static const struct of_device_id mv88e6xxx_of_match[] = {
@@ -6433,6 +6452,7 @@ MODULE_DEVICE_TABLE(of, mv88e6xxx_of_match);
 static struct mdio_driver mv88e6xxx_driver = {
 	.probe	= mv88e6xxx_probe,
 	.remove = mv88e6xxx_remove,
+	.shutdown = mv88e6xxx_shutdown,
 	.mdiodrv.driver = {
 		.name = "mv88e6085",
 		.of_match_table = mv88e6xxx_of_match,
diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
index f966a253d1c7..75823c6323ad 100644
--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
+++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
@@ -1472,9 +1472,10 @@ static int felix_pci_probe(struct pci_dev *pdev,
 
 static void felix_pci_remove(struct pci_dev *pdev)
 {
-	struct felix *felix;
+	struct felix *felix = pci_get_drvdata(pdev);
 
-	felix = pci_get_drvdata(pdev);
+	if (!felix)
+		return;
 
 	dsa_unregister_switch(felix->ds);
 
@@ -1482,6 +1483,20 @@ static void felix_pci_remove(struct pci_dev *pdev)
 	kfree(felix);
 
 	pci_disable_device(pdev);
+
+	pci_set_drvdata(pdev, NULL);
+}
+
+static void felix_pci_shutdown(struct pci_dev *pdev)
+{
+	struct felix *felix = pci_get_drvdata(pdev);
+
+	if (!felix)
+		return;
+
+	dsa_switch_shutdown(felix->ds);
+
+	pci_set_drvdata(pdev, NULL);
 }
 
 static struct pci_device_id felix_ids[] = {
@@ -1498,6 +1513,7 @@ static struct pci_driver felix_vsc9959_pci_driver = {
 	.id_table	= felix_ids,
 	.probe		= felix_pci_probe,
 	.remove		= felix_pci_remove,
+	.shutdown	= felix_pci_shutdown,
 };
 module_pci_driver(felix_vsc9959_pci_driver);
 
diff --git a/drivers/net/dsa/ocelot/seville_vsc9953.c b/drivers/net/dsa/ocelot/seville_vsc9953.c
index deae923c8b7a..de1d34a1f1e4 100644
--- a/drivers/net/dsa/ocelot/seville_vsc9953.c
+++ b/drivers/net/dsa/ocelot/seville_vsc9953.c
@@ -1245,18 +1245,33 @@ static int seville_probe(struct platform_device *pdev)
 
 static int seville_remove(struct platform_device *pdev)
 {
-	struct felix *felix;
+	struct felix *felix = platform_get_drvdata(pdev);
 
-	felix = platform_get_drvdata(pdev);
+	if (!felix)
+		return 0;
 
 	dsa_unregister_switch(felix->ds);
 
 	kfree(felix->ds);
 	kfree(felix);
 
+	platform_set_drvdata(pdev, NULL);
+
 	return 0;
 }
 
+static void seville_shutdown(struct platform_device *pdev)
+{
+	struct felix *felix = platform_get_drvdata(pdev);
+
+	if (!felix)
+		return;
+
+	dsa_switch_shutdown(felix->ds);
+
+	platform_set_drvdata(pdev, NULL);
+}
+
 static const struct of_device_id seville_of_match[] = {
 	{ .compatible = "mscc,vsc9953-switch" },
 	{ },
@@ -1266,6 +1281,7 @@ MODULE_DEVICE_TABLE(of, seville_of_match);
 static struct platform_driver seville_vsc9953_driver = {
 	.probe		= seville_probe,
 	.remove		= seville_remove,
+	.shutdown	= seville_shutdown,
 	.driver = {
 		.name		= "mscc_seville",
 		.of_match_table	= of_match_ptr(seville_of_match),
diff --git a/drivers/net/dsa/qca/ar9331.c b/drivers/net/dsa/qca/ar9331.c
index 563d8a279030..a6bfb6abc51a 100644
--- a/drivers/net/dsa/qca/ar9331.c
+++ b/drivers/net/dsa/qca/ar9331.c
@@ -1083,6 +1083,9 @@ static void ar9331_sw_remove(struct mdio_device *mdiodev)
 	struct ar9331_sw_priv *priv = dev_get_drvdata(&mdiodev->dev);
 	unsigned int i;
 
+	if (!priv)
+		return;
+
 	for (i = 0; i < ARRAY_SIZE(priv->port); i++) {
 		struct ar9331_sw_port *port = &priv->port[i];
 
@@ -1094,6 +1097,20 @@ static void ar9331_sw_remove(struct mdio_device *mdiodev)
 	dsa_unregister_switch(&priv->ds);
 
 	reset_control_assert(priv->sw_reset);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void ar9331_sw_shutdown(struct mdio_device *mdiodev)
+{
+	struct ar9331_sw_priv *priv = dev_get_drvdata(&mdiodev->dev);
+
+	if (!priv)
+		return;
+
+	dsa_switch_shutdown(&priv->ds);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }
 
 static const struct of_device_id ar9331_sw_of_match[] = {
@@ -1104,6 +1121,7 @@ static const struct of_device_id ar9331_sw_of_match[] = {
 static struct mdio_driver ar9331_sw_mdio_driver = {
 	.probe = ar9331_sw_probe,
 	.remove = ar9331_sw_remove,
+	.shutdown = ar9331_sw_shutdown,
 	.mdiodrv.driver = {
 		.name = AR9331_SW_NAME,
 		.of_match_table = ar9331_sw_of_match,
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 1f63f50f73f1..a7ae9d0ae29b 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -1866,10 +1866,27 @@ qca8k_sw_remove(struct mdio_device *mdiodev)
 	struct qca8k_priv *priv = dev_get_drvdata(&mdiodev->dev);
 	int i;
 
+	if (!priv)
+		return;
+
 	for (i = 0; i < QCA8K_NUM_PORTS; i++)
 		qca8k_port_set_status(priv, i, 0);
 
 	dsa_unregister_switch(priv->ds);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void qca8k_sw_shutdown(struct mdio_device *mdiodev)
+{
+	struct qca8k_priv *priv = dev_get_drvdata(&mdiodev->dev);
+
+	if (!priv)
+		return;
+
+	dsa_switch_shutdown(priv->ds);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }
 
 #ifdef CONFIG_PM_SLEEP
@@ -1926,6 +1943,7 @@ static const struct of_device_id qca8k_of_match[] = {
 static struct mdio_driver qca8kmdio_driver = {
 	.probe  = qca8k_sw_probe,
 	.remove = qca8k_sw_remove,
+	.shutdown = qca8k_sw_shutdown,
 	.mdiodrv.driver = {
 		.name = "qca8k",
 		.of_match_table = qca8k_of_match,
diff --git a/drivers/net/dsa/realtek-smi-core.c b/drivers/net/dsa/realtek-smi-core.c
index 8e49d4f85d48..dd2f0d6208b3 100644
--- a/drivers/net/dsa/realtek-smi-core.c
+++ b/drivers/net/dsa/realtek-smi-core.c
@@ -464,16 +464,33 @@ static int realtek_smi_probe(struct platform_device *pdev)
 
 static int realtek_smi_remove(struct platform_device *pdev)
 {
-	struct realtek_smi *smi = dev_get_drvdata(&pdev->dev);
+	struct realtek_smi *smi = platform_get_drvdata(pdev);
+
+	if (!smi)
+		return 0;
 
 	dsa_unregister_switch(smi->ds);
 	if (smi->slave_mii_bus)
 		of_node_put(smi->slave_mii_bus->dev.of_node);
 	gpiod_set_value(smi->reset, 1);
 
+	platform_set_drvdata(pdev, NULL);
+
 	return 0;
 }
 
+static void realtek_smi_shutdown(struct platform_device *pdev)
+{
+	struct realtek_smi *smi = platform_get_drvdata(pdev);
+
+	if (!smi)
+		return;
+
+	dsa_switch_shutdown(smi->ds);
+
+	platform_set_drvdata(pdev, NULL);
+}
+
 static const struct of_device_id realtek_smi_of_match[] = {
 	{
 		.compatible = "realtek,rtl8366rb",
@@ -495,6 +512,7 @@ static struct platform_driver realtek_smi_driver = {
 	},
 	.probe  = realtek_smi_probe,
 	.remove = realtek_smi_remove,
+	.shutdown = realtek_smi_shutdown,
 };
 module_platform_driver(realtek_smi_driver);
 
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 2f8cc6686c38..7c0db80eff00 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -3335,13 +3335,29 @@ static int sja1105_probe(struct spi_device *spi)
 static int sja1105_remove(struct spi_device *spi)
 {
 	struct sja1105_private *priv = spi_get_drvdata(spi);
-	struct dsa_switch *ds = priv->ds;
 
-	dsa_unregister_switch(ds);
+	if (!priv)
+		return 0;
+
+	dsa_unregister_switch(priv->ds);
+
+	spi_set_drvdata(spi, NULL);
 
 	return 0;
 }
 
+static void sja1105_shutdown(struct spi_device *spi)
+{
+	struct sja1105_private *priv = spi_get_drvdata(spi);
+
+	if (!priv)
+		return;
+
+	dsa_switch_shutdown(priv->ds);
+
+	spi_set_drvdata(spi, NULL);
+}
+
 static const struct of_device_id sja1105_dt_ids[] = {
 	{ .compatible = "nxp,sja1105e", .data = &sja1105e_info },
 	{ .compatible = "nxp,sja1105t", .data = &sja1105t_info },
@@ -3365,6 +3381,7 @@ static struct spi_driver sja1105_driver = {
 	},
 	.probe  = sja1105_probe,
 	.remove = sja1105_remove,
+	.shutdown = sja1105_shutdown,
 };
 
 module_spi_driver(sja1105_driver);
diff --git a/drivers/net/dsa/vitesse-vsc73xx-core.c b/drivers/net/dsa/vitesse-vsc73xx-core.c
index 19ce4aa0973b..a4b1447ff055 100644
--- a/drivers/net/dsa/vitesse-vsc73xx-core.c
+++ b/drivers/net/dsa/vitesse-vsc73xx-core.c
@@ -1225,6 +1225,12 @@ int vsc73xx_remove(struct vsc73xx *vsc)
 }
 EXPORT_SYMBOL(vsc73xx_remove);
 
+void vsc73xx_shutdown(struct vsc73xx *vsc)
+{
+	dsa_switch_shutdown(vsc->ds);
+}
+EXPORT_SYMBOL(vsc73xx_shutdown);
+
 MODULE_AUTHOR("Linus Walleij <linus.walleij@linaro.org>");
 MODULE_DESCRIPTION("Vitesse VSC7385/7388/7395/7398 driver");
 MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/dsa/vitesse-vsc73xx-platform.c b/drivers/net/dsa/vitesse-vsc73xx-platform.c
index 2a57f337b2a2..fe4b154a0a57 100644
--- a/drivers/net/dsa/vitesse-vsc73xx-platform.c
+++ b/drivers/net/dsa/vitesse-vsc73xx-platform.c
@@ -116,7 +116,26 @@ static int vsc73xx_platform_remove(struct platform_device *pdev)
 {
 	struct vsc73xx_platform *vsc_platform = platform_get_drvdata(pdev);
 
-	return vsc73xx_remove(&vsc_platform->vsc);
+	if (!vsc_platform)
+		return 0;
+
+	vsc73xx_remove(&vsc_platform->vsc);
+
+	platform_set_drvdata(pdev, NULL);
+
+	return 0;
+}
+
+static void vsc73xx_platform_shutdown(struct platform_device *pdev)
+{
+	struct vsc73xx_platform *vsc_platform = platform_get_drvdata(pdev);
+
+	if (!vsc_platform)
+		return;
+
+	vsc73xx_shutdown(&vsc_platform->vsc);
+
+	platform_set_drvdata(pdev, NULL);
 }
 
 static const struct vsc73xx_ops vsc73xx_platform_ops = {
@@ -144,6 +163,7 @@ MODULE_DEVICE_TABLE(of, vsc73xx_of_match);
 static struct platform_driver vsc73xx_platform_driver = {
 	.probe = vsc73xx_platform_probe,
 	.remove = vsc73xx_platform_remove,
+	.shutdown = vsc73xx_platform_shutdown,
 	.driver = {
 		.name = "vsc73xx-platform",
 		.of_match_table = vsc73xx_of_match,
diff --git a/drivers/net/dsa/vitesse-vsc73xx-spi.c b/drivers/net/dsa/vitesse-vsc73xx-spi.c
index 81eca4a5781d..645398901e05 100644
--- a/drivers/net/dsa/vitesse-vsc73xx-spi.c
+++ b/drivers/net/dsa/vitesse-vsc73xx-spi.c
@@ -163,7 +163,26 @@ static int vsc73xx_spi_remove(struct spi_device *spi)
 {
 	struct vsc73xx_spi *vsc_spi = spi_get_drvdata(spi);
 
-	return vsc73xx_remove(&vsc_spi->vsc);
+	if (!vsc_spi)
+		return 0;
+
+	vsc73xx_remove(&vsc_spi->vsc);
+
+	spi_set_drvdata(spi, NULL);
+
+	return 0;
+}
+
+static void vsc73xx_spi_shutdown(struct spi_device *spi)
+{
+	struct vsc73xx_spi *vsc_spi = spi_get_drvdata(spi);
+
+	if (!vsc_spi)
+		return;
+
+	vsc73xx_shutdown(&vsc_spi->vsc);
+
+	spi_set_drvdata(spi, NULL);
 }
 
 static const struct vsc73xx_ops vsc73xx_spi_ops = {
@@ -191,6 +210,7 @@ MODULE_DEVICE_TABLE(of, vsc73xx_of_match);
 static struct spi_driver vsc73xx_spi_driver = {
 	.probe = vsc73xx_spi_probe,
 	.remove = vsc73xx_spi_remove,
+	.shutdown = vsc73xx_spi_shutdown,
 	.driver = {
 		.name = "vsc73xx-spi",
 		.of_match_table = vsc73xx_of_match,
diff --git a/drivers/net/dsa/vitesse-vsc73xx.h b/drivers/net/dsa/vitesse-vsc73xx.h
index 7478f8d4e0a9..30b951504e65 100644
--- a/drivers/net/dsa/vitesse-vsc73xx.h
+++ b/drivers/net/dsa/vitesse-vsc73xx.h
@@ -27,3 +27,4 @@ struct vsc73xx_ops {
 int vsc73xx_is_addr_valid(u8 block, u8 subblock);
 int vsc73xx_probe(struct vsc73xx *vsc);
 int vsc73xx_remove(struct vsc73xx *vsc);
+void vsc73xx_shutdown(struct vsc73xx *vsc);
diff --git a/drivers/net/dsa/xrs700x/xrs700x.c b/drivers/net/dsa/xrs700x/xrs700x.c
index 130abb0f1438..469420941054 100644
--- a/drivers/net/dsa/xrs700x/xrs700x.c
+++ b/drivers/net/dsa/xrs700x/xrs700x.c
@@ -822,6 +822,12 @@ void xrs700x_switch_remove(struct xrs700x *priv)
 }
 EXPORT_SYMBOL(xrs700x_switch_remove);
 
+void xrs700x_switch_shutdown(struct xrs700x *priv)
+{
+	dsa_switch_shutdown(priv->ds);
+}
+EXPORT_SYMBOL(xrs700x_switch_shutdown);
+
 MODULE_AUTHOR("George McCollister <george.mccollister@gmail.com>");
 MODULE_DESCRIPTION("Arrow SpeedChips XRS700x DSA driver");
 MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/dsa/xrs700x/xrs700x.h b/drivers/net/dsa/xrs700x/xrs700x.h
index ff62cf61b091..4d58257471d2 100644
--- a/drivers/net/dsa/xrs700x/xrs700x.h
+++ b/drivers/net/dsa/xrs700x/xrs700x.h
@@ -40,3 +40,4 @@ struct xrs700x {
 struct xrs700x *xrs700x_switch_alloc(struct device *base, void *devpriv);
 int xrs700x_switch_register(struct xrs700x *priv);
 void xrs700x_switch_remove(struct xrs700x *priv);
+void xrs700x_switch_shutdown(struct xrs700x *priv);
diff --git a/drivers/net/dsa/xrs700x/xrs700x_i2c.c b/drivers/net/dsa/xrs700x/xrs700x_i2c.c
index 489d9385b4f0..6deae388a0d6 100644
--- a/drivers/net/dsa/xrs700x/xrs700x_i2c.c
+++ b/drivers/net/dsa/xrs700x/xrs700x_i2c.c
@@ -109,11 +109,28 @@ static int xrs700x_i2c_remove(struct i2c_client *i2c)
 {
 	struct xrs700x *priv = i2c_get_clientdata(i2c);
 
+	if (!priv)
+		return 0;
+
 	xrs700x_switch_remove(priv);
 
+	i2c_set_clientdata(i2c, NULL);
+
 	return 0;
 }
 
+static void xrs700x_i2c_shutdown(struct i2c_client *i2c)
+{
+	struct xrs700x *priv = i2c_get_clientdata(i2c);
+
+	if (!priv)
+		return;
+
+	xrs700x_switch_shutdown(priv);
+
+	i2c_set_clientdata(i2c, NULL);
+}
+
 static const struct i2c_device_id xrs700x_i2c_id[] = {
 	{ "xrs700x-switch", 0 },
 	{},
@@ -137,6 +154,7 @@ static struct i2c_driver xrs700x_i2c_driver = {
 	},
 	.probe	= xrs700x_i2c_probe,
 	.remove	= xrs700x_i2c_remove,
+	.shutdown = xrs700x_i2c_shutdown,
 	.id_table = xrs700x_i2c_id,
 };
 
diff --git a/drivers/net/dsa/xrs700x/xrs700x_mdio.c b/drivers/net/dsa/xrs700x/xrs700x_mdio.c
index 44f58bee04a4..d01cf1073d49 100644
--- a/drivers/net/dsa/xrs700x/xrs700x_mdio.c
+++ b/drivers/net/dsa/xrs700x/xrs700x_mdio.c
@@ -136,7 +136,24 @@ static void xrs700x_mdio_remove(struct mdio_device *mdiodev)
 {
 	struct xrs700x *priv = dev_get_drvdata(&mdiodev->dev);
 
+	if (!priv)
+		return;
+
 	xrs700x_switch_remove(priv);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
+}
+
+static void xrs700x_mdio_shutdown(struct mdio_device *mdiodev)
+{
+	struct xrs700x *priv = dev_get_drvdata(&mdiodev->dev);
+
+	if (!priv)
+		return;
+
+	xrs700x_switch_shutdown(priv);
+
+	dev_set_drvdata(&mdiodev->dev, NULL);
 }
 
 static const struct of_device_id __maybe_unused xrs700x_mdio_dt_ids[] = {
@@ -155,6 +172,7 @@ static struct mdio_driver xrs700x_mdio_driver = {
 	},
 	.probe	= xrs700x_mdio_probe,
 	.remove	= xrs700x_mdio_remove,
+	.shutdown = xrs700x_mdio_shutdown,
 };
 
 mdio_module_driver(xrs700x_mdio_driver);
diff --git a/include/net/dsa.h b/include/net/dsa.h
index f9a17145255a..2c39dbac63bd 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -1041,6 +1041,7 @@ static inline int dsa_ndo_eth_ioctl(struct net_device *dev, struct ifreq *ifr,
 
 void dsa_unregister_switch(struct dsa_switch *ds);
 int dsa_register_switch(struct dsa_switch *ds);
+void dsa_switch_shutdown(struct dsa_switch *ds);
 struct dsa_switch *dsa_switch_find(int tree_index, int sw_index);
 #ifdef CONFIG_PM_SLEEP
 int dsa_switch_suspend(struct dsa_switch *ds);
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 1b2b25d7bd02..b94d7bd62277 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -1546,3 +1546,49 @@ void dsa_unregister_switch(struct dsa_switch *ds)
 	mutex_unlock(&dsa2_mutex);
 }
 EXPORT_SYMBOL_GPL(dsa_unregister_switch);
+
+/* If the DSA master chooses to unregister its net_device on .shutdown, DSA is
+ * blocking that operation from completion, due to the dev_hold taken inside
+ * netdev_upper_dev_link. Unlink the DSA slave interfaces from being uppers of
+ * the DSA master, so that the system can reboot successfully.
+ */
+void dsa_switch_shutdown(struct dsa_switch *ds)
+{
+	struct net_device *master, *slave_dev;
+	struct dsa_port *dp;
+
+	mutex_lock(&dsa2_mutex);
+	rtnl_lock();
+	list_for_each_entry(dp, &ds->dst->ports, list) {
+		if (dp->ds != ds)
+			continue;
+
+		if (!dsa_is_user_port(ds, dp->index))
+			continue;
+
+		master = dp->cpu_dp->master;
+		slave_dev = dp->slave;
+
+		netdev_upper_dev_unlink(master, slave_dev);
+		/* Just unlinking ourselves as uppers of the master is not
+		 * sufficient. When the master net device unregisters, that will
+		 * also call dev_close, which we will catch as NETDEV_GOING_DOWN
+		 * and trigger a dev_close on our own devices (dsa_slave_close).
+		 * In turn, that will call dev_mc_unsync on the master's net
+		 * device. If the master is also a DSA switch port, this will
+		 * trigger dsa_slave_set_rx_mode which will call dev_mc_sync on
+		 * its own master. Lockdep will complain about the fact that
+		 * all cascaded masters have the same dsa_master_addr_list_lock_key,
+		 * which it normally would not do if the cascaded masters would
+		 * be in a proper upper/lower relationship, which we've just
+		 * destroyed.
+		 * To suppress the lockdep warnings, let's actually unregister
+		 * the DSA slave interfaces too, to avoid the nonsensical
+		 * multicast address list synchronization on shutdown.
+		 */
+		unregister_netdevice(slave_dev);
+	}
+	rtnl_unlock();
+	mutex_unlock(&dsa2_mutex);
+}
+EXPORT_SYMBOL_GPL(dsa_switch_shutdown);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-10 14:58                         ` Vladimir Oltean
@ 2021-09-11 11:44                           ` Vladimir Oltean
       [not found]                           ` <53f2509f-b648-b33d-1542-17a2c9d69966@gmx.de>
  1 sibling, 0 replies; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-11 11:44 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli
  Cc: Lino Sanfilippo, Saravana Kannan, Rafael J. Wysocki,
	p.rosenberger, woojung.huh, UNGLinuxDriver, vivien.didelot,
	davem, kuba, netdev, linux-kernel

On Fri, Sep 10, 2021 at 05:58:52PM +0300, Vladimir Oltean wrote:
> On Fri, Sep 10, 2021 at 01:51:56PM +0200, Andrew Lunn wrote:
> > > It does not really scale but we also don't have that many DSA masters to
> > > support, I believe I can name them all: bcmgenet, stmmac, bcmsysport, enetc,
> > > mv643xx_eth, cpsw, macb.
> > 
> > fec, mvneta, mvpp2, i210/igb.
> 
> I can probably double that list only with Freescale/NXP Ethernet
> drivers, some of which are not even submitted to mainline. To name some
> mainline drivers: gianfar, dpaa-eth, dpaa2-eth, dpaa2-switch, ucc_geth.
> Also consider that DSA/switchdev drivers can also be DSA masters of
> their own, we have boards doing that too.
> 
> Anyway, I've decided to at least try and accept the fact that DSA
> masters will unregister their net_device on shutdown, and attempt to do
> something sane for all DSA switches in that case.
> 
> Attached are two patches (they are fairly big so I won't paste them
> inline, and I would like initial feedback before posting them to the
> list).
> 
> As mentioned in those patches, the shutdown ordering guarantee is still
> very important, I still have no clue what goes on there, what we need to
> do, etc.

So to answer my own question, there is a comment above device_link_add:

 * A side effect of the link creation is re-ordering of dpm_list and the
 * devices_kset list by moving the consumer device and all devices depending
 * on it to the ends of these lists (that does not happen to devices that have
 * not been registered when this function is called).

so the fact that DSA uses device_link_add towards its master is not
exactly for nothing. device_shutdown() walks devices_kset from the back,
so this is our guarantee that DSA's shutdown happens before the master's
shutdown.

So these patches should be okay. Any other comments? If not, I will
formally submit them tomorrow towards the net tree.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
       [not found]                           ` <53f2509f-b648-b33d-1542-17a2c9d69966@gmx.de>
@ 2021-09-12 20:29                             ` Vladimir Oltean
  2021-09-13 10:32                               ` Aw: " Lino Sanfilippo
  0 siblings, 1 reply; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-12 20:29 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Andrew Lunn, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, davem, kuba, netdev, linux-kernel

On Sun, Sep 12, 2021 at 10:19:24PM +0200, Lino Sanfilippo wrote:
> 
> Hi,
> 
> On 10.09.21 at 16:58, Vladimir Oltean wrote:
> > On Fri, Sep 10, 2021 at 01:51:56PM +0200, Andrew Lunn wrote:
> >>> It does not really scale but we also don't have that many DSA masters to
> >>> support, I believe I can name them all: bcmgenet, stmmac, bcmsysport, enetc,
> >>> mv643xx_eth, cpsw, macb.
> >>
> >> fec, mvneta, mvpp2, i210/igb.
> >
> > I can probably double that list only with Freescale/NXP Ethernet
> > drivers, some of which are not even submitted to mainline. To name some
> > mainline drivers: gianfar, dpaa-eth, dpaa2-eth, dpaa2-switch, ucc_geth.
> > Also consider that DSA/switchdev drivers can also be DSA masters of
> > their own, we have boards doing that too.
> >
> > Anyway, I've decided to at least try and accept the fact that DSA
> > masters will unregister their net_device on shutdown, and attempt to do
> > something sane for all DSA switches in that case.
> >
> > Attached are two patches (they are fairly big so I won't paste them
> > inline, and I would like initial feedback before posting them to the
> > list).
> >
> > As mentioned in those patches, the shutdown ordering guarantee is still
> > very important, I still have no clue what goes on there, what we need to
> > do, etc.
> >
> 
> I tested these patches with my 5.10 kernel (based on Gregs 5.10.27 stable
> kernel) and while I do not see the message "unregister_netdevice: waiting
> for eth0 to become free. Usage count = 2." any more the shutdown/reboot hangs, too.
> After a few attempts without any error messages on the console I was able to get a
>  stack trace. Something still seems to go wrong in bcm2835_spi_shutdown() (see attachment).
> I have not had the time yet to investigate this further (or to test the patches
>  with a newer kernel).

Could you post the full kernel output? The picture you've posted is
truncated and only shows a WARN_ON in rpi_firmware_transaction and is
probably a symptom and not the issue (which is above and not shown).

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Aw: Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-12 20:29                             ` Vladimir Oltean
@ 2021-09-13 10:32                               ` Lino Sanfilippo
  2021-09-13 10:44                                 ` Vladimir Oltean
  0 siblings, 1 reply; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-13 10:32 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, davem, kuba, netdev, linux-kernel

Hi,

> Gesendet: Sonntag, 12. September 2021 um 22:29 Uhr
> Von: "Vladimir Oltean" <olteanv@gmail.com>
> An: "Lino Sanfilippo" <LinoSanfilippo@gmx.de>
> Cc: "Andrew Lunn" <andrew@lunn.ch>, "Florian Fainelli" <f.fainelli@gmail.com>, "Saravana Kannan" <saravanak@google.com>, "Rafael J. Wysocki" <rafael@kernel.org>, p.rosenberger@kunbus.com, woojung.huh@microchip.com, UNGLinuxDriver@microchip.com, vivien.didelot@gmail.com, davem@davemloft.net, kuba@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
> Betreff: Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
>
> On Sun, Sep 12, 2021 at 10:19:24PM +0200, Lino Sanfilippo wrote:
> >
> > Hi,
> >
> > On 10.09.21 at 16:58, Vladimir Oltean wrote:
> > > On Fri, Sep 10, 2021 at 01:51:56PM +0200, Andrew Lunn wrote:
> > >>> It does not really scale but we also don't have that many DSA masters to
> > >>> support, I believe I can name them all: bcmgenet, stmmac, bcmsysport, enetc,
> > >>> mv643xx_eth, cpsw, macb.
> > >>
> > >> fec, mvneta, mvpp2, i210/igb.
> > >
> > > I can probably double that list only with Freescale/NXP Ethernet
> > > drivers, some of which are not even submitted to mainline. To name some
> > > mainline drivers: gianfar, dpaa-eth, dpaa2-eth, dpaa2-switch, ucc_geth.
> > > Also consider that DSA/switchdev drivers can also be DSA masters of
> > > their own, we have boards doing that too.
> > >
> > > Anyway, I've decided to at least try and accept the fact that DSA
> > > masters will unregister their net_device on shutdown, and attempt to do
> > > something sane for all DSA switches in that case.
> > >
> > > Attached are two patches (they are fairly big so I won't paste them
> > > inline, and I would like initial feedback before posting them to the
> > > list).
> > >
> > > As mentioned in those patches, the shutdown ordering guarantee is still
> > > very important, I still have no clue what goes on there, what we need to
> > > do, etc.
> > >
> >
> > I tested these patches with my 5.10 kernel (based on Gregs 5.10.27 stable
> > kernel) and while I do not see the message "unregister_netdevice: waiting
> > for eth0 to become free. Usage count = 2." any more the shutdown/reboot hangs, too.
> > After a few attempts without any error messages on the console I was able to get a
> >  stack trace. Something still seems to go wrong in bcm2835_spi_shutdown() (see attachment).
> > I have not had the time yet to investigate this further (or to test the patches
> >  with a newer kernel).
>
> Could you post the full kernel output? The picture you've posted is
> truncated and only shows a WARN_ON in rpi_firmware_transaction and is
> probably a symptom and not the issue (which is above and not shown).
>

Unfortunately I dont see anything in the kernel log. The console output is all I get,
thats why I made the photo.

Regards,
Lino



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-13 10:32                               ` Aw: " Lino Sanfilippo
@ 2021-09-13 10:44                                 ` Vladimir Oltean
  2021-09-13 11:01                                   ` Aw: " Lino Sanfilippo
  0 siblings, 1 reply; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-13 10:44 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Andrew Lunn, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, davem, kuba, netdev, linux-kernel

On Mon, Sep 13, 2021 at 12:32:14PM +0200, Lino Sanfilippo wrote:
> Hi,
> 
> > Gesendet: Sonntag, 12. September 2021 um 22:29 Uhr
> > Von: "Vladimir Oltean" <olteanv@gmail.com>
> > An: "Lino Sanfilippo" <LinoSanfilippo@gmx.de>
> > Cc: "Andrew Lunn" <andrew@lunn.ch>, "Florian Fainelli" <f.fainelli@gmail.com>, "Saravana Kannan" <saravanak@google.com>, "Rafael J. Wysocki" <rafael@kernel.org>, p.rosenberger@kunbus.com, woojung.huh@microchip.com, UNGLinuxDriver@microchip.com, vivien.didelot@gmail.com, davem@davemloft.net, kuba@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
> > Betreff: Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
> >
> > On Sun, Sep 12, 2021 at 10:19:24PM +0200, Lino Sanfilippo wrote:
> > >
> > > Hi,
> > >
> > > On 10.09.21 at 16:58, Vladimir Oltean wrote:
> > > > On Fri, Sep 10, 2021 at 01:51:56PM +0200, Andrew Lunn wrote:
> > > >>> It does not really scale but we also don't have that many DSA masters to
> > > >>> support, I believe I can name them all: bcmgenet, stmmac, bcmsysport, enetc,
> > > >>> mv643xx_eth, cpsw, macb.
> > > >>
> > > >> fec, mvneta, mvpp2, i210/igb.
> > > >
> > > > I can probably double that list only with Freescale/NXP Ethernet
> > > > drivers, some of which are not even submitted to mainline. To name some
> > > > mainline drivers: gianfar, dpaa-eth, dpaa2-eth, dpaa2-switch, ucc_geth.
> > > > Also consider that DSA/switchdev drivers can also be DSA masters of
> > > > their own, we have boards doing that too.
> > > >
> > > > Anyway, I've decided to at least try and accept the fact that DSA
> > > > masters will unregister their net_device on shutdown, and attempt to do
> > > > something sane for all DSA switches in that case.
> > > >
> > > > Attached are two patches (they are fairly big so I won't paste them
> > > > inline, and I would like initial feedback before posting them to the
> > > > list).
> > > >
> > > > As mentioned in those patches, the shutdown ordering guarantee is still
> > > > very important, I still have no clue what goes on there, what we need to
> > > > do, etc.
> > > >
> > >
> > > I tested these patches with my 5.10 kernel (based on Gregs 5.10.27 stable
> > > kernel) and while I do not see the message "unregister_netdevice: waiting
> > > for eth0 to become free. Usage count = 2." any more the shutdown/reboot hangs, too.
> > > After a few attempts without any error messages on the console I was able to get a
> > >  stack trace. Something still seems to go wrong in bcm2835_spi_shutdown() (see attachment).
> > > I have not had the time yet to investigate this further (or to test the patches
> > >  with a newer kernel).
> >
> > Could you post the full kernel output? The picture you've posted is
> > truncated and only shows a WARN_ON in rpi_firmware_transaction and is
> > probably a symptom and not the issue (which is above and not shown).
> >
> 
> Unfortunately I dont see anything in the kernel log. The console output is all I get,
> thats why I made the photo.

To clarify, are you saying nothing above this line gets printed? Because
the part of the log you've posted in the picture is pretty much
unworkable:

[   99.375389] [<bf0dc56c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863ca0>] (platform_drv_shutdown+0x2c/0x30)

How do you access the device's serial console? Use a program with a
scrollback buffer like GNU screen or something.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Aw: Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-13 10:44                                 ` Vladimir Oltean
@ 2021-09-13 11:01                                   ` Lino Sanfilippo
  2021-09-14 18:48                                     ` Vladimir Oltean
  0 siblings, 1 reply; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-13 11:01 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, davem, kuba, netdev, linux-kernel



> Gesendet: Montag, 13. September 2021 um 12:44 Uhr
> Von: "Vladimir Oltean" <olteanv@gmail.com>
> An: "Lino Sanfilippo" <LinoSanfilippo@gmx.de>
> Cc: "Andrew Lunn" <andrew@lunn.ch>, "Florian Fainelli" <f.fainelli@gmail.com>, "Saravana Kannan" <saravanak@google.com>, "Rafael J. Wysocki" <rafael@kernel.org>, p.rosenberger@kunbus.com, woojung.huh@microchip.com, UNGLinuxDriver@microchip.com, vivien.didelot@gmail.com, davem@davemloft.net, kuba@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
> Betreff: Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
>
> On Mon, Sep 13, 2021 at 12:32:14PM +0200, Lino Sanfilippo wrote:
> > Hi,
> >
> > > Gesendet: Sonntag, 12. September 2021 um 22:29 Uhr
> > > Von: "Vladimir Oltean" <olteanv@gmail.com>
> > > An: "Lino Sanfilippo" <LinoSanfilippo@gmx.de>
> > > Cc: "Andrew Lunn" <andrew@lunn.ch>, "Florian Fainelli" <f.fainelli@gmail.com>, "Saravana Kannan" <saravanak@google.com>, "Rafael J. Wysocki" <rafael@kernel.org>, p.rosenberger@kunbus.com, woojung.huh@microchip.com, UNGLinuxDriver@microchip.com, vivien.didelot@gmail.com, davem@davemloft.net, kuba@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
> > > Betreff: Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
> > >
> > > On Sun, Sep 12, 2021 at 10:19:24PM +0200, Lino Sanfilippo wrote:
> > > >
> > > > Hi,
> > > >
> > > > On 10.09.21 at 16:58, Vladimir Oltean wrote:
> > > > > On Fri, Sep 10, 2021 at 01:51:56PM +0200, Andrew Lunn wrote:
> > > > >>> It does not really scale but we also don't have that many DSA masters to
> > > > >>> support, I believe I can name them all: bcmgenet, stmmac, bcmsysport, enetc,
> > > > >>> mv643xx_eth, cpsw, macb.
> > > > >>
> > > > >> fec, mvneta, mvpp2, i210/igb.
> > > > >
> > > > > I can probably double that list only with Freescale/NXP Ethernet
> > > > > drivers, some of which are not even submitted to mainline. To name some
> > > > > mainline drivers: gianfar, dpaa-eth, dpaa2-eth, dpaa2-switch, ucc_geth.
> > > > > Also consider that DSA/switchdev drivers can also be DSA masters of
> > > > > their own, we have boards doing that too.
> > > > >
> > > > > Anyway, I've decided to at least try and accept the fact that DSA
> > > > > masters will unregister their net_device on shutdown, and attempt to do
> > > > > something sane for all DSA switches in that case.
> > > > >
> > > > > Attached are two patches (they are fairly big so I won't paste them
> > > > > inline, and I would like initial feedback before posting them to the
> > > > > list).
> > > > >
> > > > > As mentioned in those patches, the shutdown ordering guarantee is still
> > > > > very important, I still have no clue what goes on there, what we need to
> > > > > do, etc.
> > > > >
> > > >
> > > > I tested these patches with my 5.10 kernel (based on Gregs 5.10.27 stable
> > > > kernel) and while I do not see the message "unregister_netdevice: waiting
> > > > for eth0 to become free. Usage count = 2." any more the shutdown/reboot hangs, too.
> > > > After a few attempts without any error messages on the console I was able to get a
> > > >  stack trace. Something still seems to go wrong in bcm2835_spi_shutdown() (see attachment).
> > > > I have not had the time yet to investigate this further (or to test the patches
> > > >  with a newer kernel).
> > >
> > > Could you post the full kernel output? The picture you've posted is
> > > truncated and only shows a WARN_ON in rpi_firmware_transaction and is
> > > probably a symptom and not the issue (which is above and not shown).
> > >
> >
> > Unfortunately I dont see anything in the kernel log. The console output is all I get,
> > thats why I made the photo.
>
> To clarify, are you saying nothing above this line gets printed? Because
> the part of the log you've posted in the picture is pretty much
> unworkable:
>
> [   99.375389] [<bf0dc56c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863ca0>] (platform_drv_shutdown+0x2c/0x30)
>
> How do you access the device's serial console? Use a program with a
> scrollback buffer like GNU screen or something.
>

Ah no, this is not over a serial console. This is what I see via hdmi. I do not have a working serial connection yet.
Sorry I know this trace part is not very useful, I will try to get a full dump.



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-13 11:01                                   ` Aw: " Lino Sanfilippo
@ 2021-09-14 18:48                                     ` Vladimir Oltean
  2021-09-15  5:42                                       ` Lino Sanfilippo
  0 siblings, 1 reply; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-14 18:48 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Andrew Lunn, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, davem, kuba, netdev, linux-kernel

On Mon, Sep 13, 2021 at 01:01:20PM +0200, Lino Sanfilippo wrote:
> > > > Could you post the full kernel output? The picture you've posted is
> > > > truncated and only shows a WARN_ON in rpi_firmware_transaction and is
> > > > probably a symptom and not the issue (which is above and not shown).
> > > >
> > >
> > > Unfortunately I dont see anything in the kernel log. The console output is all I get,
> > > thats why I made the photo.
> >
> > To clarify, are you saying nothing above this line gets printed? Because
> > the part of the log you've posted in the picture is pretty much
> > unworkable:
> >
> > [   99.375389] [<bf0dc56c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863ca0>] (platform_drv_shutdown+0x2c/0x30)
> >
> > How do you access the device's serial console? Use a program with a
> > scrollback buffer like GNU screen or something.
> >
> 
> Ah no, this is not over a serial console. This is what I see via hdmi. I do not have a working serial connection yet.
> Sorry I know this trace part is not very useful, I will try to get a full dump.

Lino, are you going to provide a kernel output so I could look at your new breakage?
If you could set up a pstore logger with a ramoops region, you could
dump the log after the fact. Or if HDMI is all you have, you could use
an HDMI capture card to record it. Or just record the screen you're
looking at, as long as you don't have very shaky hands, whatever...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-14 18:48                                     ` Vladimir Oltean
@ 2021-09-15  5:42                                       ` Lino Sanfilippo
  2021-09-18 19:37                                         ` Lino Sanfilippo
  0 siblings, 1 reply; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-15  5:42 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, davem, kuba, netdev, linux-kernel

Hi,

On 14.09.21 at 20:48, Vladimir Oltean wrote:
> On Mon, Sep 13, 2021 at 01:01:20PM +0200, Lino Sanfilippo wrote:
>>>>> Could you post the full kernel output? The picture you've posted is
>>>>> truncated and only shows a WARN_ON in rpi_firmware_transaction and is
>>>>> probably a symptom and not the issue (which is above and not shown).
>>>>>
>>>>
>>>> Unfortunately I dont see anything in the kernel log. The console output is all I get,
>>>> thats why I made the photo.
>>>
>>> To clarify, are you saying nothing above this line gets printed? Because
>>> the part of the log you've posted in the picture is pretty much
>>> unworkable:
>>>
>>> [   99.375389] [<bf0dc56c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863ca0>] (platform_drv_shutdown+0x2c/0x30)
>>>
>>> How do you access the device's serial console? Use a program with a
>>> scrollback buffer like GNU screen or something.
>>>
>>
>> Ah no, this is not over a serial console. This is what I see via hdmi. I do not have a working serial connection yet.
>> Sorry I know this trace part is not very useful, I will try to get a full dump.
>
> Lino, are you going to provide a kernel output so I could look at your new breakage?
> If you could set up a pstore logger with a ramoops region, you could
> dump the log after the fact. Or if HDMI is all you have, you could use
> an HDMI capture card to record it. Or just record the screen you're
> looking at, as long as you don't have very shaky hands, whatever...
>

Yes, I will try to get something useful. I have already set up a serial connection
now. I still see the shutdown stopping with your patch but I have not seen the
kernel dump any more. I will try further and provide a dump as soon as I am successful.

Regards,
Lino

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-15  5:42                                       ` Lino Sanfilippo
@ 2021-09-18 19:37                                         ` Lino Sanfilippo
  2021-09-18 22:04                                           ` Vladimir Oltean
  0 siblings, 1 reply; 38+ messages in thread
From: Lino Sanfilippo @ 2021-09-18 19:37 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, davem, kuba, netdev, linux-kernel

Hi Vladimir,

On 15.09.21 at 07:42, Lino Sanfilippo wrote:
> On 14.09.21 at 20:48, Vladimir Oltean wrote:
>> On Mon, Sep 13, 2021 at 01:01:20PM +0200, Lino Sanfilippo wrote:
>>>>>> Could you post the full kernel output? The picture you've posted is
>>>>>> truncated and only shows a WARN_ON in rpi_firmware_transaction and is
>>>>>> probably a symptom and not the issue (which is above and not shown).
>>>>>>
>>>>>
>>>>> Unfortunately I dont see anything in the kernel log. The console output is all I get,
>>>>> thats why I made the photo.
>>>>
>>>> To clarify, are you saying nothing above this line gets printed? Because
>>>> the part of the log you've posted in the picture is pretty much
>>>> unworkable:
>>>>
>>>> [   99.375389] [<bf0dc56c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863ca0>] (platform_drv_shutdown+0x2c/0x30)
>>>>
>>>> How do you access the device's serial console? Use a program with a
>>>> scrollback buffer like GNU screen or something.
>>>>
>>>
>>> Ah no, this is not over a serial console. This is what I see via hdmi. I do not have a working serial connection yet.
>>> Sorry I know this trace part is not very useful, I will try to get a full dump.
>>
>> Lino, are you going to provide a kernel output so I could look at your new breakage?
>> If you could set up a pstore logger with a ramoops region, you could
>> dump the log after the fact. Or if HDMI is all you have, you could use
>> an HDMI capture card to record it. Or just record the screen you're
>> looking at, as long as you don't have very shaky hands, whatever...
>>
>
> Yes, I will try to get something useful. I have already set up a serial connection
> now. I still see the shutdown stopping with your patch but I have not seen the
> kernel dump any more. I will try further and provide a dump as soon as I am successful.
>

Sorry for the delay. I was finally able to do some tests and get a dump via the serial console.
I tested with the latest Raspberry Pi kernel 5.10.y. Based on commit
4117cba235d24a7c4630dc38cb55cc80a04f5cf3. I applied your patches and got the following result
at shutdown:

raspberrypi login: [   58.754533] ------------[ cut here ]------------
[   58.760053] kernel BUG at drivers/net/phy/mdio_bus.c:651!
[   58.766361] Internal error: Oops - BUG: 0 [#1] SMP ARM
[   58.772376] Modules linked in: 8021q garp at24 tag_ksz tpm_tis_spi ksz9477_spi tpm_tis_core ksz9477 ksz_common tpm rts
[   58.837539] CPU: 3 PID: 1 Comm: systemd-shutdow Tainted: G         C        5.10.63-RP_PURE_510_VLADFIX+ #3
[   58.848388] Hardware name: BCM2711
[   58.852875] PC is at mdiobus_free+0x4c/0x50
[   58.858143] LR is at devm_mdiobus_free+0x1c/0x20
[   58.863853] pc : [<c08c9218>]    lr : [<c08c1898>]    psr: 80000013
[   58.871212] sp : c18fdc38  ip : c18fdc48  fp : c18fdc44
[   58.877505] r10: 00000000  r9 : c0867104  r8 : c18fdc5c
[   58.883823] r7 : 00000013  r6 : c31c8000  r5 : c3a50000  r4 : c379db80
[   58.891442] r3 : c2ab4000  r2 : 00000002  r1 : c379dbc0  r0 : c2ab4000
[   58.899037] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   58.907297] Control: 30c5383d  Table: 03ac92c0  DAC: 55555555
[   58.914139] Process systemd-shutdow (pid: 1, stack limit = 0xff8113c1)
[   58.921774] Stack: (0xc18fdc38 to 0xc18fe000)
[   58.927285] dc20:                                                       c18fdc54 c18fdc48
[   58.936601] dc40: c08c1898 c08c91d8 c18fdc94 c18fdc58 c0866dac c08c1888 c31c819c c3527180
[   58.945921] dc60: c332d200 c1405048 c32f8800 c31c8000 00000000 bf191010 00000000 c32f8800
[   58.955289] dc80: c1095f3c c1aa6454 c18fdcac c18fdc98 c086715c c0866be8 c31c8000 00000000
[   58.964644] dca0: c18fdccc c18fdcb0 c0862c7c c0867128 c1a42e30 c31c8000 c14f7cf0 00000000
[   58.974018] dcc0: c18fdcdc c18fdcd0 c0862d40 c0862b68 c18fdcfc c18fdce0 c08613dc c0862d2c
[   58.983391] dce0: c31c8000 00000a68 c08ba6cc 00000000 c18fdd44 c18fdd00 c085c710 c086130c
[   58.992778] dd00: c0331394 c0332604 60000013 c18fdd74 c3656294 c1405048 c31c8000 c31c8000
[   59.002140] dd20: 00000000 c08ba6cc c160657c c155c018 c1095f3c c1aa6454 c18fdd5c c18fdd48
[   59.011521] dd40: c08ba6a8 c085c58c 00000000 00000000 c18fdd6c c18fdd60 c08ba6e4 c08ba670
[   59.020921] dd60: c18fdd9c c18fdd70 c085bc84 c08ba6d8 c18fdd8c c3656200 c3656394 c1405048
[   59.030334] dd80: c18fdda4 c32f8800 c32f8800 00000003 c18fddbc c18fdda0 c08bab7c c085bc20
[   59.039737] dda0: c32f8b80 c32f8800 00000000 c160657c c18fdddc c18fddc0 bf182554 c08bab4c
[   59.049164] ddc0: c1aa6400 c1a6e810 c1aa6410 c160657c c18fddf4 c18fdde0 bf1825a8 bf18252c
[   59.058602] dde0: c1aa6414 c1a6e810 c18fde04 c18fddf8 c0863dec bf182598 c18fde3c c18fde08
[   59.068057] de00: c085fd9c c0863dcc c18fde3c c1095f2c c024865c 00000000 00000000 620bef00
[   59.077487] de20: c140f510 fee1dead c18fc000 00000058 c18fde4c c18fde40 c0249c84 c085fc0c
[   59.086920] de40: c18fde64 c18fde50 c0249d74 c0249c4c 01234567 00000000 c18fdf94 c18fde68
[   59.096386] de60: c024a018 c0249d64 c18fded4 c31b0c00 00000024 c18fdf58 00000005 c0441cec
[   59.105852] de80: c18fdec4 c18fde90 c0441b30 c049852c 00000000 c18fdea0 c073ad04 00000024
[   59.115330] dea0: c31b0c00 c18fdf58 c18fded4 c31b0c00 00000005 00000000 c18fdf4c c18fdec8
[   59.124821] dec0: c0441cec c0425cb0 c18fded0 c18fded4 00000000 00000005 00000000 00000024
[   59.134317] dee0: c18fdeec 00000005 c0200074 bec45250 00000004 bec45f62 00000010 bec45264
[   59.143792] df00: 00000005 bec4531c 0000000a b6d10040 00000001 c0200e70 ffffe000 c1546a80
[   59.153282] df20: 00000000 c0467268 c18fdf4c c1405048 c31b0c00 bec4528c 00000000 00000000
[   59.162787] df40: c18fdf94 c18fdf50 c0441e6c c0441c50 00000000 00000000 00000000 00000000
[   59.172269] df60: c18fdf94 c1405048 c0331394 c1405048 bec4531c 00000000 00000000 00000000
[   59.181763] df80: 00000058 c0200204 c18fdfa4 c18fdf98 c024a16c c0249f10 00000000 c18fdfa8
[   59.191250] dfa0: c0200040 c024a160 00000000 00000000 fee1dead 28121969 01234567 620bef00
[   59.200735] dfc0: 00000000 00000000 00000000 00000058 00000fff bec45be8 00000000 00476b80
[   59.210245] dfe0: 00488e3c bec45b68 004734a8 b6e4ca38 60000010 fee1dead 00000000 00000000
[   59.219759] Backtrace:
[   59.223546] [<c08c91cc>] (mdiobus_free) from [<c08c1898>] (devm_mdiobus_free+0x1c/0x20)
[   59.232909] [<c08c187c>] (devm_mdiobus_free) from [<c0866dac>] (release_nodes+0x1d0/0x220)
[   59.242551] [<c0866bdc>] (release_nodes) from [<c086715c>] (devres_release_all+0x40/0x60)
[   59.252132]  r10:c1aa6454 r9:c1095f3c r8:c32f8800 r7:00000000 r6:bf191010 r5:00000000
[   59.261338]  r4:c31c8000
[   59.265239] [<c086711c>] (devres_release_all) from [<c0862c7c>] (device_release_driver_internal+0x120/0x1c4)
[   59.276479]  r5:00000000 r4:c31c8000
[   59.281440] [<c0862b5c>] (device_release_driver_internal) from [<c0862d40>] (device_release_driver+0x20/0x24)
[   59.292802]  r7:00000000 r6:c14f7cf0 r5:c31c8000 r4:c1a42e30
[   59.299900] [<c0862d20>] (device_release_driver) from [<c08613dc>] (bus_remove_device+0xdc/0x108)
[   59.310267] [<c0861300>] (bus_remove_device) from [<c085c710>] (device_del+0x190/0x428)
[   59.319748]  r7:00000000 r6:c08ba6cc r5:00000a68 r4:c31c8000
[   59.326896] [<c085c580>] (device_del) from [<c08ba6a8>] (spi_unregister_device+0x44/0x68)
[   59.336583]  r10:c1aa6454 r9:c1095f3c r8:c155c018 r7:c160657c r6:c08ba6cc r5:00000000
[   59.345924]  r4:c31c8000
[   59.349971] [<c08ba664>] (spi_unregister_device) from [<c08ba6e4>] (__unregister+0x18/0x20)
[   59.359870]  r5:00000000 r4:00000000
[   59.364972] [<c08ba6cc>] (__unregister) from [<c085bc84>] (device_for_each_child+0x70/0xb4)
[   59.374899] [<c085bc14>] (device_for_each_child) from [<c08bab7c>] (spi_unregister_controller+0x3c/0x128)
[   59.385979]  r6:00000003 r5:c32f8800 r4:c32f8800
[   59.392086] [<c08bab40>] (spi_unregister_controller) from [<bf182554>] (bcm2835_spi_remove+0x34/0x6c [spi_bcm2835])
[   59.404000]  r7:c160657c r6:00000000 r5:c32f8800 r4:c32f8b80
[   59.411084] [<bf182520>] (bcm2835_spi_remove [spi_bcm2835]) from [<bf1825a8>] (bcm2835_spi_shutdown+0x1c/0x38 [spi_bc)
[   59.423755]  r7:c160657c r6:c1aa6410 r5:c1a6e810 r4:c1aa6400
[   59.430847] [<bf18258c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863dec>] (platform_drv_shutdown+0x2c/0x30)
[   59.442613]  r5:c1a6e810 r4:c1aa6414
[   59.447635] [<c0863dc0>] (platform_drv_shutdown) from [<c085fd9c>] (device_shutdown+0x19c/0x24c)
[   59.457932] [<c085fc00>] (device_shutdown) from [<c0249c84>] (kernel_restart_prepare+0x44/0x48)
[   59.468135]  r10:00000058 r9:c18fc000 r8:fee1dead r7:c140f510 r6:620bef00 r5:00000000
[   59.477470]  r4:00000000
[   59.481509] [<c0249c40>] (kernel_restart_prepare) from [<c0249d74>] (kernel_restart+0x1c/0x60)
[   59.491653] [<c0249d58>] (kernel_restart) from [<c024a018>] (__do_sys_reboot+0x114/0x1f8)
[   59.501359]  r5:00000000 r4:01234567
[   59.506447] [<c0249f04>] (__do_sys_reboot) from [<c024a16c>] (sys_reboot+0x18/0x1c)
[   59.515628]  r8:c0200204 r7:00000058 r6:00000000 r5:00000000 r4:00000000
[   59.523857] [<c024a154>] (sys_reboot) from [<c0200040>] (ret_fast_syscall+0x0/0x28)
[   59.533038] Exception stack(0xc18fdfa8 to 0xc18fdff0)
[   59.539607] dfa0:                   00000000 00000000 fee1dead 28121969 01234567 620bef00
[   59.549318] dfc0: 00000000 00000000 00000000 00000058 00000fff bec45be8 00000000 00476b80
[   59.559026] dfe0: 00488e3c bec45b68 004734a8 b6e4ca38
[   59.565596] Code: ebfe49f5 e89da800 ebed72a3 e89da800 (e7f001f2)
[   59.573246] ---[ end trace 7d800ce7b5664bb6 ]---
[   59.579413] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[   59.588634] Rebooting in 10 seconds..

The concerning source code line 651 is in my case:

void mdiobus_free(struct mii_bus *bus)
{
	/* For compatibility with error handling in drivers. */
	if (bus->state == MDIOBUS_ALLOCATED) {
		kfree(bus);
		return;
	}

651<	BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
	bus->state = MDIOBUS_RELEASED;

	put_device(&bus->dev);
}
EXPORT_SYMBOL(mdiobus_free);

I tested with both versions of your patchset, with the same result. I also tested
with a RP 5.14 kernel (the latest RP kernel) but I did not see the original issue
(i.e. the system hang) here for some reason.

I then tried to get the net-next kernel running on my system but without success so far. So for
now the result with the RP 5.10 is all I can offer. I hope that helps a bit nevertheless.

Regards,
Lino




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-18 19:37                                         ` Lino Sanfilippo
@ 2021-09-18 22:04                                           ` Vladimir Oltean
  2021-09-19  0:29                                             ` Vladimir Oltean
  0 siblings, 1 reply; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-18 22:04 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Andrew Lunn, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, davem, kuba, netdev, linux-kernel

On Sat, Sep 18, 2021 at 09:37:17PM +0200, Lino Sanfilippo wrote:
> Hi Vladimir,
> 
> On 15.09.21 at 07:42, Lino Sanfilippo wrote:
> > On 14.09.21 at 20:48, Vladimir Oltean wrote:
> >> On Mon, Sep 13, 2021 at 01:01:20PM +0200, Lino Sanfilippo wrote:
> >>>>>> Could you post the full kernel output? The picture you've posted is
> >>>>>> truncated and only shows a WARN_ON in rpi_firmware_transaction and is
> >>>>>> probably a symptom and not the issue (which is above and not shown).
> >>>>>>
> >>>>>
> >>>>> Unfortunately I dont see anything in the kernel log. The console output is all I get,
> >>>>> thats why I made the photo.
> >>>>
> >>>> To clarify, are you saying nothing above this line gets printed? Because
> >>>> the part of the log you've posted in the picture is pretty much
> >>>> unworkable:
> >>>>
> >>>> [   99.375389] [<bf0dc56c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863ca0>] (platform_drv_shutdown+0x2c/0x30)
> >>>>
> >>>> How do you access the device's serial console? Use a program with a
> >>>> scrollback buffer like GNU screen or something.
> >>>>
> >>>
> >>> Ah no, this is not over a serial console. This is what I see via hdmi. I do not have a working serial connection yet.
> >>> Sorry I know this trace part is not very useful, I will try to get a full dump.
> >>
> >> Lino, are you going to provide a kernel output so I could look at your new breakage?
> >> If you could set up a pstore logger with a ramoops region, you could
> >> dump the log after the fact. Or if HDMI is all you have, you could use
> >> an HDMI capture card to record it. Or just record the screen you're
> >> looking at, as long as you don't have very shaky hands, whatever...
> >>
> >
> > Yes, I will try to get something useful. I have already set up a serial connection
> > now. I still see the shutdown stopping with your patch but I have not seen the
> > kernel dump any more. I will try further and provide a dump as soon as I am successful.
> >
> 
> Sorry for the delay. I was finally able to do some tests and get a dump via the serial console.
> I tested with the latest Raspberry Pi kernel 5.10.y. Based on commit
> 4117cba235d24a7c4630dc38cb55cc80a04f5cf3. I applied your patches and got the following result
> at shutdown:
> 
> raspberrypi login: [   58.754533] ------------[ cut here ]------------
> [   58.760053] kernel BUG at drivers/net/phy/mdio_bus.c:651!
> [   58.766361] Internal error: Oops - BUG: 0 [#1] SMP ARM
> [   58.772376] Modules linked in: 8021q garp at24 tag_ksz tpm_tis_spi ksz9477_spi tpm_tis_core ksz9477 ksz_common tpm rts
> [   58.837539] CPU: 3 PID: 1 Comm: systemd-shutdow Tainted: G         C        5.10.63-RP_PURE_510_VLADFIX+ #3
> [   58.848388] Hardware name: BCM2711
> [   58.852875] PC is at mdiobus_free+0x4c/0x50
> [   58.858143] LR is at devm_mdiobus_free+0x1c/0x20
> [   58.863853] pc : [<c08c9218>]    lr : [<c08c1898>]    psr: 80000013
> [   58.871212] sp : c18fdc38  ip : c18fdc48  fp : c18fdc44
> [   58.877505] r10: 00000000  r9 : c0867104  r8 : c18fdc5c
> [   58.883823] r7 : 00000013  r6 : c31c8000  r5 : c3a50000  r4 : c379db80
> [   58.891442] r3 : c2ab4000  r2 : 00000002  r1 : c379dbc0  r0 : c2ab4000
> [   58.899037] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [   58.907297] Control: 30c5383d  Table: 03ac92c0  DAC: 55555555
> [   58.914139] Process systemd-shutdow (pid: 1, stack limit = 0xff8113c1)
> [   58.921774] Stack: (0xc18fdc38 to 0xc18fe000)
> [   58.927285] dc20:                                                       c18fdc54 c18fdc48
> [   58.936601] dc40: c08c1898 c08c91d8 c18fdc94 c18fdc58 c0866dac c08c1888 c31c819c c3527180
> [   58.945921] dc60: c332d200 c1405048 c32f8800 c31c8000 00000000 bf191010 00000000 c32f8800
> [   58.955289] dc80: c1095f3c c1aa6454 c18fdcac c18fdc98 c086715c c0866be8 c31c8000 00000000
> [   58.964644] dca0: c18fdccc c18fdcb0 c0862c7c c0867128 c1a42e30 c31c8000 c14f7cf0 00000000
> [   58.974018] dcc0: c18fdcdc c18fdcd0 c0862d40 c0862b68 c18fdcfc c18fdce0 c08613dc c0862d2c
> [   58.983391] dce0: c31c8000 00000a68 c08ba6cc 00000000 c18fdd44 c18fdd00 c085c710 c086130c
> [   58.992778] dd00: c0331394 c0332604 60000013 c18fdd74 c3656294 c1405048 c31c8000 c31c8000
> [   59.002140] dd20: 00000000 c08ba6cc c160657c c155c018 c1095f3c c1aa6454 c18fdd5c c18fdd48
> [   59.011521] dd40: c08ba6a8 c085c58c 00000000 00000000 c18fdd6c c18fdd60 c08ba6e4 c08ba670
> [   59.020921] dd60: c18fdd9c c18fdd70 c085bc84 c08ba6d8 c18fdd8c c3656200 c3656394 c1405048
> [   59.030334] dd80: c18fdda4 c32f8800 c32f8800 00000003 c18fddbc c18fdda0 c08bab7c c085bc20
> [   59.039737] dda0: c32f8b80 c32f8800 00000000 c160657c c18fdddc c18fddc0 bf182554 c08bab4c
> [   59.049164] ddc0: c1aa6400 c1a6e810 c1aa6410 c160657c c18fddf4 c18fdde0 bf1825a8 bf18252c
> [   59.058602] dde0: c1aa6414 c1a6e810 c18fde04 c18fddf8 c0863dec bf182598 c18fde3c c18fde08
> [   59.068057] de00: c085fd9c c0863dcc c18fde3c c1095f2c c024865c 00000000 00000000 620bef00
> [   59.077487] de20: c140f510 fee1dead c18fc000 00000058 c18fde4c c18fde40 c0249c84 c085fc0c
> [   59.086920] de40: c18fde64 c18fde50 c0249d74 c0249c4c 01234567 00000000 c18fdf94 c18fde68
> [   59.096386] de60: c024a018 c0249d64 c18fded4 c31b0c00 00000024 c18fdf58 00000005 c0441cec
> [   59.105852] de80: c18fdec4 c18fde90 c0441b30 c049852c 00000000 c18fdea0 c073ad04 00000024
> [   59.115330] dea0: c31b0c00 c18fdf58 c18fded4 c31b0c00 00000005 00000000 c18fdf4c c18fdec8
> [   59.124821] dec0: c0441cec c0425cb0 c18fded0 c18fded4 00000000 00000005 00000000 00000024
> [   59.134317] dee0: c18fdeec 00000005 c0200074 bec45250 00000004 bec45f62 00000010 bec45264
> [   59.143792] df00: 00000005 bec4531c 0000000a b6d10040 00000001 c0200e70 ffffe000 c1546a80
> [   59.153282] df20: 00000000 c0467268 c18fdf4c c1405048 c31b0c00 bec4528c 00000000 00000000
> [   59.162787] df40: c18fdf94 c18fdf50 c0441e6c c0441c50 00000000 00000000 00000000 00000000
> [   59.172269] df60: c18fdf94 c1405048 c0331394 c1405048 bec4531c 00000000 00000000 00000000
> [   59.181763] df80: 00000058 c0200204 c18fdfa4 c18fdf98 c024a16c c0249f10 00000000 c18fdfa8
> [   59.191250] dfa0: c0200040 c024a160 00000000 00000000 fee1dead 28121969 01234567 620bef00
> [   59.200735] dfc0: 00000000 00000000 00000000 00000058 00000fff bec45be8 00000000 00476b80
> [   59.210245] dfe0: 00488e3c bec45b68 004734a8 b6e4ca38 60000010 fee1dead 00000000 00000000
> [   59.219759] Backtrace:
> [   59.223546] [<c08c91cc>] (mdiobus_free) from [<c08c1898>] (devm_mdiobus_free+0x1c/0x20)
> [   59.232909] [<c08c187c>] (devm_mdiobus_free) from [<c0866dac>] (release_nodes+0x1d0/0x220)
> [   59.242551] [<c0866bdc>] (release_nodes) from [<c086715c>] (devres_release_all+0x40/0x60)
> [   59.252132]  r10:c1aa6454 r9:c1095f3c r8:c32f8800 r7:00000000 r6:bf191010 r5:00000000
> [   59.261338]  r4:c31c8000
> [   59.265239] [<c086711c>] (devres_release_all) from [<c0862c7c>] (device_release_driver_internal+0x120/0x1c4)
> [   59.276479]  r5:00000000 r4:c31c8000
> [   59.281440] [<c0862b5c>] (device_release_driver_internal) from [<c0862d40>] (device_release_driver+0x20/0x24)
> [   59.292802]  r7:00000000 r6:c14f7cf0 r5:c31c8000 r4:c1a42e30
> [   59.299900] [<c0862d20>] (device_release_driver) from [<c08613dc>] (bus_remove_device+0xdc/0x108)
> [   59.310267] [<c0861300>] (bus_remove_device) from [<c085c710>] (device_del+0x190/0x428)
> [   59.319748]  r7:00000000 r6:c08ba6cc r5:00000a68 r4:c31c8000
> [   59.326896] [<c085c580>] (device_del) from [<c08ba6a8>] (spi_unregister_device+0x44/0x68)
> [   59.336583]  r10:c1aa6454 r9:c1095f3c r8:c155c018 r7:c160657c r6:c08ba6cc r5:00000000
> [   59.345924]  r4:c31c8000
> [   59.349971] [<c08ba664>] (spi_unregister_device) from [<c08ba6e4>] (__unregister+0x18/0x20)
> [   59.359870]  r5:00000000 r4:00000000
> [   59.364972] [<c08ba6cc>] (__unregister) from [<c085bc84>] (device_for_each_child+0x70/0xb4)
> [   59.374899] [<c085bc14>] (device_for_each_child) from [<c08bab7c>] (spi_unregister_controller+0x3c/0x128)
> [   59.385979]  r6:00000003 r5:c32f8800 r4:c32f8800
> [   59.392086] [<c08bab40>] (spi_unregister_controller) from [<bf182554>] (bcm2835_spi_remove+0x34/0x6c [spi_bcm2835])
> [   59.404000]  r7:c160657c r6:00000000 r5:c32f8800 r4:c32f8b80
> [   59.411084] [<bf182520>] (bcm2835_spi_remove [spi_bcm2835]) from [<bf1825a8>] (bcm2835_spi_shutdown+0x1c/0x38 [spi_bc)
> [   59.423755]  r7:c160657c r6:c1aa6410 r5:c1a6e810 r4:c1aa6400
> [   59.430847] [<bf18258c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863dec>] (platform_drv_shutdown+0x2c/0x30)
> [   59.442613]  r5:c1a6e810 r4:c1aa6414
> [   59.447635] [<c0863dc0>] (platform_drv_shutdown) from [<c085fd9c>] (device_shutdown+0x19c/0x24c)
> [   59.457932] [<c085fc00>] (device_shutdown) from [<c0249c84>] (kernel_restart_prepare+0x44/0x48)
> [   59.468135]  r10:00000058 r9:c18fc000 r8:fee1dead r7:c140f510 r6:620bef00 r5:00000000
> [   59.477470]  r4:00000000
> [   59.481509] [<c0249c40>] (kernel_restart_prepare) from [<c0249d74>] (kernel_restart+0x1c/0x60)
> [   59.491653] [<c0249d58>] (kernel_restart) from [<c024a018>] (__do_sys_reboot+0x114/0x1f8)
> [   59.501359]  r5:00000000 r4:01234567
> [   59.506447] [<c0249f04>] (__do_sys_reboot) from [<c024a16c>] (sys_reboot+0x18/0x1c)
> [   59.515628]  r8:c0200204 r7:00000058 r6:00000000 r5:00000000 r4:00000000
> [   59.523857] [<c024a154>] (sys_reboot) from [<c0200040>] (ret_fast_syscall+0x0/0x28)
> [   59.533038] Exception stack(0xc18fdfa8 to 0xc18fdff0)
> [   59.539607] dfa0:                   00000000 00000000 fee1dead 28121969 01234567 620bef00
> [   59.549318] dfc0: 00000000 00000000 00000000 00000058 00000fff bec45be8 00000000 00476b80
> [   59.559026] dfe0: 00488e3c bec45b68 004734a8 b6e4ca38
> [   59.565596] Code: ebfe49f5 e89da800 ebed72a3 e89da800 (e7f001f2)
> [   59.573246] ---[ end trace 7d800ce7b5664bb6 ]---
> [   59.579413] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [   59.588634] Rebooting in 10 seconds..
> 
> The concerning source code line 651 is in my case:
> 
> void mdiobus_free(struct mii_bus *bus)
> {
> 	/* For compatibility with error handling in drivers. */
> 	if (bus->state == MDIOBUS_ALLOCATED) {
> 		kfree(bus);
> 		return;
> 	}
> 
> 651<	BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
> 	bus->state = MDIOBUS_RELEASED;
> 
> 	put_device(&bus->dev);
> }
> EXPORT_SYMBOL(mdiobus_free);
> 
> I tested with both versions of your patchset, with the same result. I also tested
> with a RP 5.14 kernel (the latest RP kernel) but I did not see the original issue
> (i.e. the system hang) here for some reason.
> 
> I then tried to get the net-next kernel running on my system but without success so far. So for
> now the result with the RP 5.10 is all I can offer. I hope that helps a bit nevertheless.

Thank you Lino, this is a very valuable report. I will send a v3 soon (not sure if today).

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown
  2021-09-18 22:04                                           ` Vladimir Oltean
@ 2021-09-19  0:29                                             ` Vladimir Oltean
  0 siblings, 0 replies; 38+ messages in thread
From: Vladimir Oltean @ 2021-09-19  0:29 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: Andrew Lunn, Florian Fainelli, Saravana Kannan,
	Rafael J. Wysocki, p.rosenberger, woojung.huh, UNGLinuxDriver,
	vivien.didelot, davem, kuba, netdev, linux-kernel

On Sun, Sep 19, 2021 at 01:04:12AM +0300, Vladimir Oltean wrote:
> On Sat, Sep 18, 2021 at 09:37:17PM +0200, Lino Sanfilippo wrote:
> > Hi Vladimir,
> > 
> > On 15.09.21 at 07:42, Lino Sanfilippo wrote:
> > > On 14.09.21 at 20:48, Vladimir Oltean wrote:
> > >> On Mon, Sep 13, 2021 at 01:01:20PM +0200, Lino Sanfilippo wrote:
> > >>>>>> Could you post the full kernel output? The picture you've posted is
> > >>>>>> truncated and only shows a WARN_ON in rpi_firmware_transaction and is
> > >>>>>> probably a symptom and not the issue (which is above and not shown).
> > >>>>>>
> > >>>>>
> > >>>>> Unfortunately I dont see anything in the kernel log. The console output is all I get,
> > >>>>> thats why I made the photo.
> > >>>>
> > >>>> To clarify, are you saying nothing above this line gets printed? Because
> > >>>> the part of the log you've posted in the picture is pretty much
> > >>>> unworkable:
> > >>>>
> > >>>> [   99.375389] [<bf0dc56c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863ca0>] (platform_drv_shutdown+0x2c/0x30)
> > >>>>
> > >>>> How do you access the device's serial console? Use a program with a
> > >>>> scrollback buffer like GNU screen or something.
> > >>>>
> > >>>
> > >>> Ah no, this is not over a serial console. This is what I see via hdmi. I do not have a working serial connection yet.
> > >>> Sorry I know this trace part is not very useful, I will try to get a full dump.
> > >>
> > >> Lino, are you going to provide a kernel output so I could look at your new breakage?
> > >> If you could set up a pstore logger with a ramoops region, you could
> > >> dump the log after the fact. Or if HDMI is all you have, you could use
> > >> an HDMI capture card to record it. Or just record the screen you're
> > >> looking at, as long as you don't have very shaky hands, whatever...
> > >>
> > >
> > > Yes, I will try to get something useful. I have already set up a serial connection
> > > now. I still see the shutdown stopping with your patch but I have not seen the
> > > kernel dump any more. I will try further and provide a dump as soon as I am successful.
> > >
> > 
> > Sorry for the delay. I was finally able to do some tests and get a dump via the serial console.
> > I tested with the latest Raspberry Pi kernel 5.10.y. Based on commit
> > 4117cba235d24a7c4630dc38cb55cc80a04f5cf3. I applied your patches and got the following result
> > at shutdown:
> > 
> > raspberrypi login: [   58.754533] ------------[ cut here ]------------
> > [   58.760053] kernel BUG at drivers/net/phy/mdio_bus.c:651!
> > [   58.766361] Internal error: Oops - BUG: 0 [#1] SMP ARM
> > [   58.772376] Modules linked in: 8021q garp at24 tag_ksz tpm_tis_spi ksz9477_spi tpm_tis_core ksz9477 ksz_common tpm rts
> > [   58.837539] CPU: 3 PID: 1 Comm: systemd-shutdow Tainted: G         C        5.10.63-RP_PURE_510_VLADFIX+ #3
> > [   58.848388] Hardware name: BCM2711
> > [   58.852875] PC is at mdiobus_free+0x4c/0x50
> > [   58.858143] LR is at devm_mdiobus_free+0x1c/0x20
> > [   58.863853] pc : [<c08c9218>]    lr : [<c08c1898>]    psr: 80000013
> > [   58.871212] sp : c18fdc38  ip : c18fdc48  fp : c18fdc44
> > [   58.877505] r10: 00000000  r9 : c0867104  r8 : c18fdc5c
> > [   58.883823] r7 : 00000013  r6 : c31c8000  r5 : c3a50000  r4 : c379db80
> > [   58.891442] r3 : c2ab4000  r2 : 00000002  r1 : c379dbc0  r0 : c2ab4000
> > [   58.899037] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> > [   58.907297] Control: 30c5383d  Table: 03ac92c0  DAC: 55555555
> > [   58.914139] Process systemd-shutdow (pid: 1, stack limit = 0xff8113c1)
> > [   58.921774] Stack: (0xc18fdc38 to 0xc18fe000)
> > [   58.927285] dc20:                                                       c18fdc54 c18fdc48
> > [   58.936601] dc40: c08c1898 c08c91d8 c18fdc94 c18fdc58 c0866dac c08c1888 c31c819c c3527180
> > [   58.945921] dc60: c332d200 c1405048 c32f8800 c31c8000 00000000 bf191010 00000000 c32f8800
> > [   58.955289] dc80: c1095f3c c1aa6454 c18fdcac c18fdc98 c086715c c0866be8 c31c8000 00000000
> > [   58.964644] dca0: c18fdccc c18fdcb0 c0862c7c c0867128 c1a42e30 c31c8000 c14f7cf0 00000000
> > [   58.974018] dcc0: c18fdcdc c18fdcd0 c0862d40 c0862b68 c18fdcfc c18fdce0 c08613dc c0862d2c
> > [   58.983391] dce0: c31c8000 00000a68 c08ba6cc 00000000 c18fdd44 c18fdd00 c085c710 c086130c
> > [   58.992778] dd00: c0331394 c0332604 60000013 c18fdd74 c3656294 c1405048 c31c8000 c31c8000
> > [   59.002140] dd20: 00000000 c08ba6cc c160657c c155c018 c1095f3c c1aa6454 c18fdd5c c18fdd48
> > [   59.011521] dd40: c08ba6a8 c085c58c 00000000 00000000 c18fdd6c c18fdd60 c08ba6e4 c08ba670
> > [   59.020921] dd60: c18fdd9c c18fdd70 c085bc84 c08ba6d8 c18fdd8c c3656200 c3656394 c1405048
> > [   59.030334] dd80: c18fdda4 c32f8800 c32f8800 00000003 c18fddbc c18fdda0 c08bab7c c085bc20
> > [   59.039737] dda0: c32f8b80 c32f8800 00000000 c160657c c18fdddc c18fddc0 bf182554 c08bab4c
> > [   59.049164] ddc0: c1aa6400 c1a6e810 c1aa6410 c160657c c18fddf4 c18fdde0 bf1825a8 bf18252c
> > [   59.058602] dde0: c1aa6414 c1a6e810 c18fde04 c18fddf8 c0863dec bf182598 c18fde3c c18fde08
> > [   59.068057] de00: c085fd9c c0863dcc c18fde3c c1095f2c c024865c 00000000 00000000 620bef00
> > [   59.077487] de20: c140f510 fee1dead c18fc000 00000058 c18fde4c c18fde40 c0249c84 c085fc0c
> > [   59.086920] de40: c18fde64 c18fde50 c0249d74 c0249c4c 01234567 00000000 c18fdf94 c18fde68
> > [   59.096386] de60: c024a018 c0249d64 c18fded4 c31b0c00 00000024 c18fdf58 00000005 c0441cec
> > [   59.105852] de80: c18fdec4 c18fde90 c0441b30 c049852c 00000000 c18fdea0 c073ad04 00000024
> > [   59.115330] dea0: c31b0c00 c18fdf58 c18fded4 c31b0c00 00000005 00000000 c18fdf4c c18fdec8
> > [   59.124821] dec0: c0441cec c0425cb0 c18fded0 c18fded4 00000000 00000005 00000000 00000024
> > [   59.134317] dee0: c18fdeec 00000005 c0200074 bec45250 00000004 bec45f62 00000010 bec45264
> > [   59.143792] df00: 00000005 bec4531c 0000000a b6d10040 00000001 c0200e70 ffffe000 c1546a80
> > [   59.153282] df20: 00000000 c0467268 c18fdf4c c1405048 c31b0c00 bec4528c 00000000 00000000
> > [   59.162787] df40: c18fdf94 c18fdf50 c0441e6c c0441c50 00000000 00000000 00000000 00000000
> > [   59.172269] df60: c18fdf94 c1405048 c0331394 c1405048 bec4531c 00000000 00000000 00000000
> > [   59.181763] df80: 00000058 c0200204 c18fdfa4 c18fdf98 c024a16c c0249f10 00000000 c18fdfa8
> > [   59.191250] dfa0: c0200040 c024a160 00000000 00000000 fee1dead 28121969 01234567 620bef00
> > [   59.200735] dfc0: 00000000 00000000 00000000 00000058 00000fff bec45be8 00000000 00476b80
> > [   59.210245] dfe0: 00488e3c bec45b68 004734a8 b6e4ca38 60000010 fee1dead 00000000 00000000
> > [   59.219759] Backtrace:
> > [   59.223546] [<c08c91cc>] (mdiobus_free) from [<c08c1898>] (devm_mdiobus_free+0x1c/0x20)
> > [   59.232909] [<c08c187c>] (devm_mdiobus_free) from [<c0866dac>] (release_nodes+0x1d0/0x220)
> > [   59.242551] [<c0866bdc>] (release_nodes) from [<c086715c>] (devres_release_all+0x40/0x60)
> > [   59.252132]  r10:c1aa6454 r9:c1095f3c r8:c32f8800 r7:00000000 r6:bf191010 r5:00000000
> > [   59.261338]  r4:c31c8000
> > [   59.265239] [<c086711c>] (devres_release_all) from [<c0862c7c>] (device_release_driver_internal+0x120/0x1c4)
> > [   59.276479]  r5:00000000 r4:c31c8000
> > [   59.281440] [<c0862b5c>] (device_release_driver_internal) from [<c0862d40>] (device_release_driver+0x20/0x24)
> > [   59.292802]  r7:00000000 r6:c14f7cf0 r5:c31c8000 r4:c1a42e30
> > [   59.299900] [<c0862d20>] (device_release_driver) from [<c08613dc>] (bus_remove_device+0xdc/0x108)
> > [   59.310267] [<c0861300>] (bus_remove_device) from [<c085c710>] (device_del+0x190/0x428)
> > [   59.319748]  r7:00000000 r6:c08ba6cc r5:00000a68 r4:c31c8000
> > [   59.326896] [<c085c580>] (device_del) from [<c08ba6a8>] (spi_unregister_device+0x44/0x68)
> > [   59.336583]  r10:c1aa6454 r9:c1095f3c r8:c155c018 r7:c160657c r6:c08ba6cc r5:00000000
> > [   59.345924]  r4:c31c8000
> > [   59.349971] [<c08ba664>] (spi_unregister_device) from [<c08ba6e4>] (__unregister+0x18/0x20)
> > [   59.359870]  r5:00000000 r4:00000000
> > [   59.364972] [<c08ba6cc>] (__unregister) from [<c085bc84>] (device_for_each_child+0x70/0xb4)
> > [   59.374899] [<c085bc14>] (device_for_each_child) from [<c08bab7c>] (spi_unregister_controller+0x3c/0x128)
> > [   59.385979]  r6:00000003 r5:c32f8800 r4:c32f8800
> > [   59.392086] [<c08bab40>] (spi_unregister_controller) from [<bf182554>] (bcm2835_spi_remove+0x34/0x6c [spi_bcm2835])
> > [   59.404000]  r7:c160657c r6:00000000 r5:c32f8800 r4:c32f8b80
> > [   59.411084] [<bf182520>] (bcm2835_spi_remove [spi_bcm2835]) from [<bf1825a8>] (bcm2835_spi_shutdown+0x1c/0x38 [spi_bc)
> > [   59.423755]  r7:c160657c r6:c1aa6410 r5:c1a6e810 r4:c1aa6400
> > [   59.430847] [<bf18258c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863dec>] (platform_drv_shutdown+0x2c/0x30)
> > [   59.442613]  r5:c1a6e810 r4:c1aa6414
> > [   59.447635] [<c0863dc0>] (platform_drv_shutdown) from [<c085fd9c>] (device_shutdown+0x19c/0x24c)
> > [   59.457932] [<c085fc00>] (device_shutdown) from [<c0249c84>] (kernel_restart_prepare+0x44/0x48)
> > [   59.468135]  r10:00000058 r9:c18fc000 r8:fee1dead r7:c140f510 r6:620bef00 r5:00000000
> > [   59.477470]  r4:00000000
> > [   59.481509] [<c0249c40>] (kernel_restart_prepare) from [<c0249d74>] (kernel_restart+0x1c/0x60)
> > [   59.491653] [<c0249d58>] (kernel_restart) from [<c024a018>] (__do_sys_reboot+0x114/0x1f8)
> > [   59.501359]  r5:00000000 r4:01234567
> > [   59.506447] [<c0249f04>] (__do_sys_reboot) from [<c024a16c>] (sys_reboot+0x18/0x1c)
> > [   59.515628]  r8:c0200204 r7:00000058 r6:00000000 r5:00000000 r4:00000000
> > [   59.523857] [<c024a154>] (sys_reboot) from [<c0200040>] (ret_fast_syscall+0x0/0x28)
> > [   59.533038] Exception stack(0xc18fdfa8 to 0xc18fdff0)
> > [   59.539607] dfa0:                   00000000 00000000 fee1dead 28121969 01234567 620bef00
> > [   59.549318] dfc0: 00000000 00000000 00000000 00000058 00000fff bec45be8 00000000 00476b80
> > [   59.559026] dfe0: 00488e3c bec45b68 004734a8 b6e4ca38
> > [   59.565596] Code: ebfe49f5 e89da800 ebed72a3 e89da800 (e7f001f2)
> > [   59.573246] ---[ end trace 7d800ce7b5664bb6 ]---
> > [   59.579413] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> > [   59.588634] Rebooting in 10 seconds..
> > 
> > The concerning source code line 651 is in my case:
> > 
> > void mdiobus_free(struct mii_bus *bus)
> > {
> > 	/* For compatibility with error handling in drivers. */
> > 	if (bus->state == MDIOBUS_ALLOCATED) {
> > 		kfree(bus);
> > 		return;
> > 	}
> > 
> > 651<	BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
> > 	bus->state = MDIOBUS_RELEASED;
> > 
> > 	put_device(&bus->dev);
> > }
> > EXPORT_SYMBOL(mdiobus_free);
> > 
> > I tested with both versions of your patchset, with the same result. I also tested
> > with a RP 5.14 kernel (the latest RP kernel) but I did not see the original issue
> > (i.e. the system hang) here for some reason.
> > 
> > I then tried to get the net-next kernel running on my system but without success so far. So for
> > now the result with the RP 5.10 is all I can offer. I hope that helps a bit nevertheless.
> 
> Thank you Lino, this is a very valuable report. I will send a v3 soon (not sure if today).

Actually, no, I will not send a v3, because the fact that devres can now
call devm_mdiobus_free without devm_mdiobus_unregister has nothing to do
with this series and touches much more than DSA, it is an issue introduced by
commit ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()").

I will deal with it separately, the basic idea being that
devm_mdiobus_alloc + plain mdiobus_register is bonkers, and Bartosz
didn't care enough to fix the existing users of that pattern before he
just went ahead to make _devm_mdiobus_free stop calling mdiobus_unregister.

In fact, this patch really got me wondering at the time, was the rtl8366
driver so broken at the time?
https://patchwork.kernel.org/project/netdevbpf/patch/20210822193145.1312668-2-alvin@pqrs.dk/
No, it stems from the exact same patch from Bartosz, devres used to
unregister MDIO buses when freeing them, now it isn't.

So while I don't disagree with Bartosz' overall idea, the execution kinda sucks.
In your case, what happens is that your driver's ->shutdown method gets
called, and this (as discussed) means that your driver's ->remove method
will be a no-op (to avoid unregistering DSA structures twice). But since
SPI bus drivers which implement their own ->shutdown as ->remove are a
thing, the fact that you don't run anything on ->remove means you won't
unregister an MDIO bus structure that was allocated under devres. Even
worse, the ksz9477_spi driver does _not_ allocate any MDIO bus structure
under devres, but the DSA core does, it's that pesky ds->slave_mii_bus
for which DSA is too helpful for its own good. Unregistering that MDIO
bus happens only from the dsa_unregister_switch call path, not from
dsa_switch_shutdown. But even if your SPI device driver does a no-op on
->remove, it doesn't mean the device_del on it won't be called. And we
can't stop the devres callbacks from running, which inevitably means
that an MDIO bus structure which is still registered will get freed.

My personal conclusion is that either you go with devres all the way
(devm_mdiobus_alloc + devm_mdiobus_register) or with no devres all the
way (mdiobus_alloc + mdiobus_register), otherwise you've just signed up
to learning things you never really wanted to learn. The pattern of
devres for the mdiobus_alloc and then no devres for the mdiobus_register
is very old, which explains why devm_mdiobus_register was only added
very recently (between kernel v5.7 and v5.8). It used to be fine in the
sense that it worked, but now it's completely broken. Some poor soul
needs to audit that pattern across the whole kernel after Bartosz'
patch, and it looks like that somebody is me...

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2021-09-19  0:30 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-09  9:53 [PATCH 0/3] Fix for KSZ DSA switch shutdown Lino Sanfilippo
2021-09-09  9:53 ` [PATCH 1/3] net: dsa: introduce function dsa_tree_shutdown() Lino Sanfilippo
2021-09-09  9:53 ` [PATCH 2/3] net: dsa: microchip: provide the function ksz_switch_shutdown() Lino Sanfilippo
2021-09-09  9:53 ` [PATCH 3/3] net: dsa: microchip: tear down DSA tree at system shutdown Lino Sanfilippo
2021-09-09 10:14 ` [PATCH 0/3] Fix for KSZ DSA switch shutdown Vladimir Oltean
2021-09-09 11:08   ` Lino Sanfilippo
2021-09-09 11:42     ` Vladimir Oltean
2021-09-09 12:56       ` Vladimir Oltean
2021-09-09 13:19         ` Aw: " Lino Sanfilippo
2021-09-09 14:29           ` Lino Sanfilippo
2021-09-09 15:17             ` Andrew Lunn
2021-09-09 16:41               ` Lino Sanfilippo
2021-09-09 15:11           ` Andrew Lunn
2021-09-09 16:46             ` Lino Sanfilippo
2021-09-09 17:55               ` Andrew Lunn
2021-09-09 15:47           ` Vladimir Oltean
2021-09-09 16:00             ` Florian Fainelli
2021-09-10  1:32               ` Saravana Kannan
2021-09-09 16:37             ` Lino Sanfilippo
2021-09-09 16:44               ` Florian Fainelli
2021-09-09 17:07                 ` Lino Sanfilippo
2021-09-09 22:54                   ` Vladimir Oltean
2021-09-09 23:23                     ` Vladimir Oltean
2021-09-10  1:08                       ` Vladimir Oltean
2021-09-10  2:15                     ` Florian Fainelli
2021-09-10 11:51                       ` Andrew Lunn
2021-09-10 14:58                         ` Vladimir Oltean
2021-09-11 11:44                           ` Vladimir Oltean
     [not found]                           ` <53f2509f-b648-b33d-1542-17a2c9d69966@gmx.de>
2021-09-12 20:29                             ` Vladimir Oltean
2021-09-13 10:32                               ` Aw: " Lino Sanfilippo
2021-09-13 10:44                                 ` Vladimir Oltean
2021-09-13 11:01                                   ` Aw: " Lino Sanfilippo
2021-09-14 18:48                                     ` Vladimir Oltean
2021-09-15  5:42                                       ` Lino Sanfilippo
2021-09-18 19:37                                         ` Lino Sanfilippo
2021-09-18 22:04                                           ` Vladimir Oltean
2021-09-19  0:29                                             ` Vladimir Oltean
2021-09-09 12:40 ` Andrew Lunn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).