LKML Archive on lore.kernel.org
* [PATCH net] net: phy: Avoid multiple suspends
@ 2020-02-20 23:34 Florian Fainelli
  2020-02-24 4:59 ` David Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Fainelli @ 2020-02-20 23:34 UTC
To: netdev
Cc: yoshihiro.shimoda.uh, Florian Fainelli, Andrew Lunn, Heiner Kallweit, Russell King, David S. Miller, Fugang Duan, open list

It is currently possible for a PHY device to be suspended as part of a
network device driver's suspend call while it is still being attached to
that net_device, either via phy_suspend() or implicitly via phy_stop().

Later on, when the MDIO bus controller get suspended, we would attempt
to suspend again the PHY because it is still attached to a network
device.

This is both a waste of time and creates an opportunity for improper
clock/power management bugs to creep in.

Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
Heiner, Andrew,

I did consider adding logic that would check for phydev->suspended in
phy_suspend() and phy_resume(), but this was really the only place where
I found it to be problematic.

 drivers/net/phy/phy_device.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 6a5056e0ae77..6131aca79823 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -247,7 +247,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 	 * MDIO bus driver and clock gated at this point.
 	 */
 	if (!netdev)
-		return !phydev->suspended;
+		goto out;
 
 	if (netdev->wol_enabled)
 		return false;
@@ -267,7 +267,8 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 	if (device_may_wakeup(&netdev->dev))
 		return false;
 
-	return true;
+out:
+	return !phydev->suspended;
 }
 
 static int mdio_bus_phy_suspend(struct device *dev)
-- 
2.17.1
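[Editor's note] The control flow that results from the patch above can be sketched as a small stand-alone C model. This is only an illustration, not the kernel function: struct model_netdev, struct model_phy and model_may_suspend() are hypothetical stand-ins, the may_wakeup field stands in for device_may_wakeup(&netdev->dev), and the checks that the diff context elides between the two hunks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of struct net_device used here. */
struct model_netdev {
	bool wol_enabled;
	bool may_wakeup;	/* stands in for device_may_wakeup(&netdev->dev) */
};

/* Hypothetical stand-in for the parts of struct phy_device used here. */
struct model_phy {
	struct model_netdev *attached_dev;	/* NULL if no net_device is attached */
	bool suspended;				/* PHY has already been suspended */
};

/*
 * Model of the patched decision: the "no net_device" path and the
 * "attached, no wakeup requested" path now end at the same check of
 * the suspended flag.
 */
static bool model_may_suspend(const struct model_phy *phydev)
{
	const struct model_netdev *netdev;

	assert(phydev != NULL);
	netdev = phydev->attached_dev;

	if (!netdev)
		goto out;

	if (netdev->wol_enabled)
		return false;

	/* Further checks of the real function are not shown in the diff. */

	if (netdev->may_wakeup)
		return false;

out:
	return !phydev->suspended;
}
```

Before the patch, the last statement for an attached device was "return true", so an already suspended PHY that was still attached would have been reported as suspendable a second time.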
* Re: [PATCH net] net: phy: Avoid multiple suspends
  2020-02-20 23:34 [PATCH net] net: phy: Avoid multiple suspends Florian Fainelli
@ 2020-02-24 4:59 ` David Miller
  2020-03-10 14:16 ` Geert Uytterhoeven
  0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2020-02-24 4:59 UTC
To: f.fainelli
Cc: netdev, yoshihiro.shimoda.uh, andrew, hkallweit1, linux, B38611, linux-kernel

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu, 20 Feb 2020 15:34:53 -0800

> It is currently possible for a PHY device to be suspended as part of a
> network device driver's suspend call while it is still being attached to
> that net_device, either via phy_suspend() or implicitly via phy_stop().
>
> Later on, when the MDIO bus controller get suspended, we would attempt
> to suspend again the PHY because it is still attached to a network
> device.
>
> This is both a waste of time and creates an opportunity for improper
> clock/power management bugs to creep in.
>
> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Applied, and queued up for -stable, thanks Florian.
* Re: [PATCH net] net: phy: Avoid multiple suspends
  2020-02-24 4:59 ` David Miller
@ 2020-03-10 14:16 ` Geert Uytterhoeven
  2020-03-10 16:46 ` Florian Fainelli
  0 siblings, 1 reply; 8+ messages in thread
From: Geert Uytterhoeven @ 2020-03-10 14:16 UTC
To: David Miller, Florian Fainelli
Cc: netdev, Yoshihiro Shimoda, Andrew Lunn, Heiner Kallweit, Russell King, B38611, Linux Kernel Mailing List, Linux-Renesas

Hi Florian, David,

On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
> From: Florian Fainelli <f.fainelli@gmail.com>
> Date: Thu, 20 Feb 2020 15:34:53 -0800
>
> > It is currently possible for a PHY device to be suspended as part of a
> > network device driver's suspend call while it is still being attached to
> > that net_device, either via phy_suspend() or implicitly via phy_stop().
> >
> > Later on, when the MDIO bus controller get suspended, we would attempt
> > to suspend again the PHY because it is still attached to a network
> > device.
> >
> > This is both a waste of time and creates an opportunity for improper
> > clock/power management bugs to creep in.
> >
> > Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
> > Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>
> Applied, and queued up for -stable, thanks Florian.

This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
After resume from s2ram, Ethernet no longer works:

    PM: suspend exit
    nfs: server aaa.bbb.ccc.ddd not responding, still trying
    ...

Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
fixes the issue.

On both boards, an SMSC LAN9220 is connected to a power-managed local
bus.

I added some debug code to check when the clock driving the local bus
is stopped and started, but I see no difference before/after. Hence I
suspect the Ethernet chip is no longer reinitialized after resume.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
* Re: [PATCH net] net: phy: Avoid multiple suspends
  2020-03-10 14:16 ` Geert Uytterhoeven
@ 2020-03-10 16:46 ` Florian Fainelli
  2020-03-10 17:34 ` Heiner Kallweit
  2020-03-11 9:17 ` Geert Uytterhoeven
  0 siblings, 2 replies; 8+ messages in thread
From: Florian Fainelli @ 2020-03-10 16:46 UTC
To: Geert Uytterhoeven, David Miller
Cc: netdev, Yoshihiro Shimoda, Andrew Lunn, Heiner Kallweit, Russell King, B38611, Linux Kernel Mailing List, Linux-Renesas

On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
> Hi Florian, David,
>
> On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
>> From: Florian Fainelli <f.fainelli@gmail.com>
>> Date: Thu, 20 Feb 2020 15:34:53 -0800
>>
>>> It is currently possible for a PHY device to be suspended as part of a
>>> network device driver's suspend call while it is still being attached to
>>> that net_device, either via phy_suspend() or implicitly via phy_stop().
>>>
>>> Later on, when the MDIO bus controller get suspended, we would attempt
>>> to suspend again the PHY because it is still attached to a network
>>> device.
>>>
>>> This is both a waste of time and creates an opportunity for improper
>>> clock/power management bugs to creep in.
>>>
>>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>>
>> Applied, and queued up for -stable, thanks Florian.
>
> This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
> After resume from s2ram, Ethernet no longer works:
>
>     PM: suspend exit
>     nfs: server aaa.bbb.ccc.ddd not responding, still trying
>     ...
>
> Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
> fixes the issue.
>
> On both boards, an SMSC LAN9220 is connected to a power-managed local
> bus.
>
> I added some debug code to check when the clock driving the local bus
> is stopped and started, but I see no difference before/after. Hence I
> suspect the Ethernet chip is no longer reinitialized after resume.

Can you provide a complete log? Do you use the Generic PHY driver or a
specialized one? Do you have a way to dump the registers at the time of
failure and see if BMCR.PDOWN is still set somehow?

Does the following help:

diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c
index 49a6a9167af4..df17190c76c0 100644
--- a/drivers/net/ethernet/smsc/smsc911x.c
+++ b/drivers/net/ethernet/smsc/smsc911x.c
@@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
 	if (netif_running(ndev)) {
 		netif_device_attach(ndev);
 		netif_start_queue(ndev);
+		phy_resume(dev->phydev);
 	}
 
 	return 0;
-- 
Florian
* Re: [PATCH net] net: phy: Avoid multiple suspends
  2020-03-10 16:46 ` Florian Fainelli
@ 2020-03-10 17:34 ` Heiner Kallweit
  2020-03-11 9:17 ` Geert Uytterhoeven
  1 sibling, 0 replies; 8+ messages in thread
From: Heiner Kallweit @ 2020-03-10 17:34 UTC
To: Florian Fainelli, Geert Uytterhoeven, David Miller
Cc: netdev, Yoshihiro Shimoda, Andrew Lunn, Russell King, B38611, Linux Kernel Mailing List, Linux-Renesas

On 10.03.2020 17:46, Florian Fainelli wrote:
> On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
>> Hi Florian, David,
>>
>> On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
>>> From: Florian Fainelli <f.fainelli@gmail.com>
>>> Date: Thu, 20 Feb 2020 15:34:53 -0800
>>>
>>>> It is currently possible for a PHY device to be suspended as part of a
>>>> network device driver's suspend call while it is still being attached to
>>>> that net_device, either via phy_suspend() or implicitly via phy_stop().
>>>>
>>>> Later on, when the MDIO bus controller get suspended, we would attempt
>>>> to suspend again the PHY because it is still attached to a network
>>>> device.
>>>>
>>>> This is both a waste of time and creates an opportunity for improper
>>>> clock/power management bugs to creep in.
>>>>
>>>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
>>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>>>
>>> Applied, and queued up for -stable, thanks Florian.
>>
>> This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
>> After resume from s2ram, Ethernet no longer works:
>>
>>     PM: suspend exit
>>     nfs: server aaa.bbb.ccc.ddd not responding, still trying
>>     ...
>>
>> Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
>> fixes the issue.
>>
>> On both boards, an SMSC LAN9220 is connected to a power-managed local
>> bus.
>>
>> I added some debug code to check when the clock driving the local bus
>> is stopped and started, but I see no difference before/after. Hence I
>> suspect the Ethernet chip is no longer reinitialized after resume.
>
> Can you provide a complete log? Do you use the Generic PHY driver or a
> specialized one? Do you have a way to dump the registers at the time of
> failure and see if BMCR.PDOWN is still set somehow?
>

Maybe reason for the misbehavior is that mdio_bus_phy_may_suspend()
is checked also in mdio_bus_phy_resume(), what's not very logical
based on the naming. The call to phy_resume() therefore may be skipped.

> Does the following help:
>
> diff --git a/drivers/net/ethernet/smsc/smsc911x.c
> b/drivers/net/ethernet/smsc/smsc911x.c
> index 49a6a9167af4..df17190c76c0 100644
> --- a/drivers/net/ethernet/smsc/smsc911x.c
> +++ b/drivers/net/ethernet/smsc/smsc911x.c
> @@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
>  	if (netif_running(ndev)) {
>  		netif_device_attach(ndev);
>  		netif_start_queue(ndev);
> +		phy_resume(dev->phydev);
>  	}
>
>  	return 0;
>
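[Editor's note] The suspicion voiced in the message above, that the same predicate gates both the MDIO bus suspend and resume callbacks and so the resume may be skipped, can be illustrated with a stand-alone C model. It is a sketch under simplifying assumptions: all model_* names are hypothetical, the predicate is reduced to the suspended-flag check that the patch made common to all paths (no Wake-on-LAN, no wakeup source), and powered_down merely stands in for BMCR.PDOWN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the PHY state relevant to the discussion. */
struct model_phy {
	bool suspended;		/* mirrors phydev->suspended */
	bool powered_down;	/* stands in for BMCR.PDOWN */
};

/* Reduced predicate: only the suspended-flag check remains. */
static bool model_may_suspend(const struct model_phy *phy)
{
	assert(phy != NULL);
	return !phy->suspended;
}

static void model_phy_suspend(struct model_phy *phy)
{
	phy->powered_down = true;
	phy->suspended = true;
}

static void model_phy_resume(struct model_phy *phy)
{
	phy->powered_down = false;
	phy->suspended = false;
}

/* Suspend callback: the predicate is true, so the PHY is suspended. */
static void model_bus_suspend(struct model_phy *phy)
{
	if (!model_may_suspend(phy))
		return;
	model_phy_suspend(phy);
}

/* Resume callback: the same predicate is now false, so resume is skipped. */
static void model_bus_resume(struct model_phy *phy)
{
	if (!model_may_suspend(phy))
		return;
	model_phy_resume(phy);
}

/* Run one suspend/resume cycle and report whether the PHY is left down. */
static bool model_phy_down_after_cycle(void)
{
	struct model_phy phy = { false, false };

	model_bus_suspend(&phy);
	model_bus_resume(&phy);
	return phy.powered_down;
}
```

In this model the predicate returns true during suspend and false during resume, because the suspend itself sets the flag that the predicate tests.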
* Re: [PATCH net] net: phy: Avoid multiple suspends
  2020-03-10 16:46 ` Florian Fainelli
  2020-03-10 17:34 ` Heiner Kallweit
@ 2020-03-11 9:17 ` Geert Uytterhoeven
  2020-03-11 21:22 ` Heiner Kallweit
  1 sibling, 1 reply; 8+ messages in thread
From: Geert Uytterhoeven @ 2020-03-11 9:17 UTC
To: Florian Fainelli
Cc: David Miller, netdev, Yoshihiro Shimoda, Andrew Lunn, Heiner Kallweit, Russell King, Linux Kernel Mailing List, Linux-Renesas

On Tue, Mar 10, 2020 at 5:47 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
>
> On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
> > Hi Florian, David,
> >
> > On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
> >> From: Florian Fainelli <f.fainelli@gmail.com>
> >> Date: Thu, 20 Feb 2020 15:34:53 -0800
> >>
> >>> It is currently possible for a PHY device to be suspended as part of a
> >>> network device driver's suspend call while it is still being attached to
> >>> that net_device, either via phy_suspend() or implicitly via phy_stop().
> >>>
> >>> Later on, when the MDIO bus controller get suspended, we would attempt
> >>> to suspend again the PHY because it is still attached to a network
> >>> device.
> >>>
> >>> This is both a waste of time and creates an opportunity for improper
> >>> clock/power management bugs to creep in.
> >>>
> >>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
> >>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> >>
> >> Applied, and queued up for -stable, thanks Florian.
> >
> > This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
> > After resume from s2ram, Ethernet no longer works:
> >
> >     PM: suspend exit
> >     nfs: server aaa.bbb.ccc.ddd not responding, still trying
> >     ...
> >
> > Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
> > fixes the issue.
> >
> > On both boards, an SMSC LAN9220 is connected to a power-managed local
> > bus.
> >
> > I added some debug code to check when the clock driving the local bus
> > is stopped and started, but I see no difference before/after. Hence I
> > suspect the Ethernet chip is no longer reinitialized after resume.
>
> Can you provide a complete log?

With some debug info:

    SDHI0 Vcc: disabling
    PM: suspend entry (deep)
    Filesystems sync: 0.002 seconds
    Freezing user space processes ... (elapsed 0.001 seconds) done.
    OOM killer disabled.
    Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
    PM: ==== a3sp/ee120000.sd: stop
    PM: ==== a3sp/ee100000.sd: stop
    smsc911x 8000000.ethernet: smsc911x_suspend:2577
    smsc911x 8000000.ethernet: smsc911x_suspend:2579 running
    smsc911x 8000000.ethernet: smsc911x_suspend:2584
    PM: ==== a3sp/ee200000.mmc: stop
    PM: ==== c4/fec10000.bus: stop
    PM: ==== a3sp/e6c40000.serial: stop
    PM: ==== c5/e61f0000.thermal: stop
    PM: ==== c4/e61c0200.interrupt-controller: stop
    PM: == a3sp: power off
    rmobile_pd_power_down: a3sp
    Disabling non-boot CPUs ...
    PM: ==== c4/e61c0200.interrupt-controller: start
    PM: ==== c5/e61f0000.thermal: start
    PM: ==== a3sp/e6c40000.serial: start
    PM: ==== c4/fec10000.bus: start
    PM: ==== a3sp/ee200000.mmc: start
    smsc911x 8000000.ethernet: smsc911x_resume:2606
    smsc911x 8000000.ethernet: smsc911x_resume:2625 running
    PM: ==== a3sp/ee100000.sd: start
    OOM killer enabled.
    Restarting tasks ... done.
    PM: ==== a3sp/ee120000.sd: start
    PM: suspend exit
    nfs: server aaa.bbb.ccc.ddd not responding, still trying
    ...

But no difference between the good and the bad case, except for the nfs
failures.

> Do you use the Generic PHY driver or a
> specialized one?

CONFIG_FIXED_PHY=y
CONFIG_SMSC_PHY=y

Just the smsc,lan9115 node, cfr. arch/arm/boot/dts/r8a73a4-ape6evm.dts

> Do you have a way to dump the registers at the time of
> failure and see if BMCR.PDOWN is still set somehow?

Added a hook into "nfs: server not responding", which prints:

    MII_BMCR = 0x1900

i.e. BMCR_PDOWN = 0x0800 is still set.

> Does the following help:
>
> diff --git a/drivers/net/ethernet/smsc/smsc911x.c
> b/drivers/net/ethernet/smsc/smsc911x.c
> index 49a6a9167af4..df17190c76c0 100644
> --- a/drivers/net/ethernet/smsc/smsc911x.c
> +++ b/drivers/net/ethernet/smsc/smsc911x.c
> @@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
>  	if (netif_running(ndev)) {
>  		netif_device_attach(ndev);
>  		netif_start_queue(ndev);
> +		phy_resume(dev->phydev);
>  	}
>

Yes i does, after s/dev->/ndev->/.
Thanks!

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
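[Editor's note] The register value reported in the message above can be decoded with a few lines of stand-alone C. The MODEL_BMCR_* macros are local copies introduced for this sketch; their values are meant to match the BMCR_ANENABLE, BMCR_PDOWN and BMCR_FULLDPLX definitions in the kernel's <linux/mii.h>, and model_bmcr_powered_down() is a hypothetical helper, not a kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of MII Basic Mode Control Register bits (see <linux/mii.h>). */
#define MODEL_BMCR_ANENABLE	0x1000	/* auto-negotiation enabled */
#define MODEL_BMCR_PDOWN	0x0800	/* PHY powered down */
#define MODEL_BMCR_FULLDPLX	0x0100	/* full duplex */

/* The reported value 0x1900 is exactly these three bits combined. */
static_assert((MODEL_BMCR_ANENABLE | MODEL_BMCR_PDOWN | MODEL_BMCR_FULLDPLX)
	      == 0x1900, "0x1900 = ANENABLE | PDOWN | FULLDPLX");

/* Return true if the power-down bit is set in a BMCR value. */
static bool model_bmcr_powered_down(uint16_t bmcr)
{
	return (bmcr & MODEL_BMCR_PDOWN) != 0;
}
```

With the power-down bit cleared, the same configuration would read back as 0x1100.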
* Re: [PATCH net] net: phy: Avoid multiple suspends
  2020-03-11 9:17 ` Geert Uytterhoeven
@ 2020-03-11 21:22 ` Heiner Kallweit
  2020-03-12 8:26 ` Geert Uytterhoeven
  0 siblings, 1 reply; 8+ messages in thread
From: Heiner Kallweit @ 2020-03-11 21:22 UTC
To: Geert Uytterhoeven, Florian Fainelli
Cc: David Miller, netdev, Yoshihiro Shimoda, Andrew Lunn, Russell King, Linux Kernel Mailing List, Linux-Renesas

On 11.03.2020 10:17, Geert Uytterhoeven wrote:
> On Tue, Mar 10, 2020 at 5:47 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>> On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
>>> Hi Florian, David,
>>>
>>> On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
>>>> From: Florian Fainelli <f.fainelli@gmail.com>
>>>> Date: Thu, 20 Feb 2020 15:34:53 -0800
>>>>
>>>>> It is currently possible for a PHY device to be suspended as part of a
>>>>> network device driver's suspend call while it is still being attached to
>>>>> that net_device, either via phy_suspend() or implicitly via phy_stop().
>>>>>
>>>>> Later on, when the MDIO bus controller get suspended, we would attempt
>>>>> to suspend again the PHY because it is still attached to a network
>>>>> device.
>>>>>
>>>>> This is both a waste of time and creates an opportunity for improper
>>>>> clock/power management bugs to creep in.
>>>>>
>>>>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
>>>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>>>>
>>>> Applied, and queued up for -stable, thanks Florian.
>>>
>>> This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
>>> After resume from s2ram, Ethernet no longer works:
>>>
>>>     PM: suspend exit
>>>     nfs: server aaa.bbb.ccc.ddd not responding, still trying
>>>     ...
>>>
>>> Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
>>> fixes the issue.
>>>
>>> On both boards, an SMSC LAN9220 is connected to a power-managed local
>>> bus.
>>>
>>> I added some debug code to check when the clock driving the local bus
>>> is stopped and started, but I see no difference before/after. Hence I
>>> suspect the Ethernet chip is no longer reinitialized after resume.
>>
>> Can you provide a complete log?
>
> With some debug info:
>
>     SDHI0 Vcc: disabling
>     PM: suspend entry (deep)
>     Filesystems sync: 0.002 seconds
>     Freezing user space processes ... (elapsed 0.001 seconds) done.
>     OOM killer disabled.
>     Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>     PM: ==== a3sp/ee120000.sd: stop
>     PM: ==== a3sp/ee100000.sd: stop
>     smsc911x 8000000.ethernet: smsc911x_suspend:2577
>     smsc911x 8000000.ethernet: smsc911x_suspend:2579 running
>     smsc911x 8000000.ethernet: smsc911x_suspend:2584
>     PM: ==== a3sp/ee200000.mmc: stop
>     PM: ==== c4/fec10000.bus: stop
>     PM: ==== a3sp/e6c40000.serial: stop
>     PM: ==== c5/e61f0000.thermal: stop
>     PM: ==== c4/e61c0200.interrupt-controller: stop
>     PM: == a3sp: power off
>     rmobile_pd_power_down: a3sp
>     Disabling non-boot CPUs ...
>     PM: ==== c4/e61c0200.interrupt-controller: start
>     PM: ==== c5/e61f0000.thermal: start
>     PM: ==== a3sp/e6c40000.serial: start
>     PM: ==== c4/fec10000.bus: start
>     PM: ==== a3sp/ee200000.mmc: start
>     smsc911x 8000000.ethernet: smsc911x_resume:2606
>     smsc911x 8000000.ethernet: smsc911x_resume:2625 running
>     PM: ==== a3sp/ee100000.sd: start
>     OOM killer enabled.
>     Restarting tasks ... done.
>     PM: ==== a3sp/ee120000.sd: start
>     PM: suspend exit
>     nfs: server aaa.bbb.ccc.ddd not responding, still trying
>     ...
>
> But no difference between the good and the bad case, except for the nfs
> failures.
>
>> Do you use the Generic PHY driver or a
>> specialized one?
>
> CONFIG_FIXED_PHY=y
> CONFIG_SMSC_PHY=y
>
> Just the smsc,lan9115 node, cfr. arch/arm/boot/dts/r8a73a4-ape6evm.dts
>
>> Do you have a way to dump the registers at the time of
>> failure and see if BMCR.PDOWN is still set somehow?
>
> Added a hook into "nfs: server not responding", which prints:
>
>     MII_BMCR = 0x1900
>
> i.e. BMCR_PDOWN = 0x0800 is still set.
>
>> Does the following help:
>>
>> diff --git a/drivers/net/ethernet/smsc/smsc911x.c
>> b/drivers/net/ethernet/smsc/smsc911x.c
>> index 49a6a9167af4..df17190c76c0 100644
>> --- a/drivers/net/ethernet/smsc/smsc911x.c
>> +++ b/drivers/net/ethernet/smsc/smsc911x.c
>> @@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
>>  	if (netif_running(ndev)) {
>>  		netif_device_attach(ndev);
>>  		netif_start_queue(ndev);
>> +		phy_resume(dev->phydev);
>>  	}
>>
>
> Yes i does, after s/dev->/ndev->/.
> Thanks!
>

This seems to be a workaround. And the same issue we may have with
other drivers too. Could you please alternatively test the following?
It tackles the issue that mdio_bus_phy_may_suspend() is used in
suspend AND resume, and both calls may return different values.

With this patch we call mdio_bus_phy_may_suspend() only when
suspending, and let the phy_device store whether it was suspended
by MDIO bus PM.

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 32a5ceddc..6d6c6a178 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -286,6 +286,8 @@ static int mdio_bus_phy_suspend(struct device *dev)
 	if (!mdio_bus_phy_may_suspend(phydev))
 		return 0;
 
+	phydev->suspended_by_mdio_bus = 1;
+
 	return phy_suspend(phydev);
 }
 
@@ -294,9 +296,11 @@ static int mdio_bus_phy_resume(struct device *dev)
 	struct phy_device *phydev = to_phy_device(dev);
 	int ret;
 
-	if (!mdio_bus_phy_may_suspend(phydev))
+	if (!phydev->suspended_by_mdio_bus)
 		goto no_resume;
 
+	phydev->suspended_by_mdio_bus = 0;
+
 	ret = phy_resume(phydev);
 	if (ret < 0)
 		return ret;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 8b299476b..118de9f5b 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -357,6 +357,7 @@ struct macsec_ops;
  * is_gigabit_capable: Set to true if PHY supports 1000Mbps
  * has_fixups: Set to true if this phy has fixups/quirks.
  * suspended: Set to true if this phy has been suspended successfully.
+ * suspended_by_mdio_bus: Set to true if this phy was suspended by MDIO bus.
  * sysfs_links: Internal boolean tracking sysfs symbolic links setup/removal.
  * loopback_enabled: Set true if this phy has been loopbacked successfully.
  * state: state of the PHY for management purposes
@@ -396,6 +397,7 @@ struct phy_device {
 	unsigned is_gigabit_capable:1;
 	unsigned has_fixups:1;
 	unsigned suspended:1;
+	unsigned suspended_by_mdio_bus:1;
 	unsigned sysfs_links:1;
 	unsigned loopback_enabled:1;
 
-- 
2.25.1
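[Editor's note] The approach of the patch above, evaluating the predicate only at suspend time and remembering the outcome in a dedicated flag, can be sketched as a stand-alone C model. As before this is only an illustration: the model_* names are hypothetical, the predicate is reduced to the suspended-flag check, and powered_down stands in for BMCR.PDOWN. Only the field name suspended_by_mdio_bus is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the PHY state, with the flag added by the patch. */
struct model_phy {
	unsigned suspended:1;			/* mirrors phydev->suspended */
	unsigned suspended_by_mdio_bus:1;	/* set only by the bus suspend path */
	unsigned powered_down:1;		/* stands in for BMCR.PDOWN */
};

/* Reduced predicate: only the suspended-flag check remains. */
static bool model_may_suspend(const struct model_phy *phy)
{
	return !phy->suspended;
}

static void model_phy_suspend(struct model_phy *phy)
{
	phy->powered_down = 1;
	phy->suspended = 1;
}

static void model_phy_resume(struct model_phy *phy)
{
	phy->powered_down = 0;
	phy->suspended = 0;
}

/* Suspend callback: record that the bus PM code did the suspend. */
static void model_bus_suspend(struct model_phy *phy)
{
	assert(phy != NULL);
	if (!model_may_suspend(phy))
		return;
	phy->suspended_by_mdio_bus = 1;
	model_phy_suspend(phy);
}

/* Resume callback: consult the recorded flag, not the predicate. */
static void model_bus_resume(struct model_phy *phy)
{
	assert(phy != NULL);
	if (!phy->suspended_by_mdio_bus)
		return;
	phy->suspended_by_mdio_bus = 0;
	model_phy_resume(phy);
}
```

In this model a PHY suspended by the bus callback is always resumed by it, while a PHY that was already suspended by someone else (for example by the MAC driver) is left alone in both directions.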
* Re: [PATCH net] net: phy: Avoid multiple suspends
  2020-03-11 21:22 ` Heiner Kallweit
@ 2020-03-12 8:26 ` Geert Uytterhoeven
  0 siblings, 0 replies; 8+ messages in thread
From: Geert Uytterhoeven @ 2020-03-12 8:26 UTC
To: Heiner Kallweit
Cc: Florian Fainelli, David Miller, netdev, Yoshihiro Shimoda, Andrew Lunn, Russell King, Linux Kernel Mailing List, Linux-Renesas

Hi Heiner,

On Wed, Mar 11, 2020 at 10:22 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
> On 11.03.2020 10:17, Geert Uytterhoeven wrote:
> > On Tue, Mar 10, 2020 at 5:47 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
> >> On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
> >>> On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
> >>>> From: Florian Fainelli <f.fainelli@gmail.com>
> >>>> Date: Thu, 20 Feb 2020 15:34:53 -0800
> >>>>
> >>>>> It is currently possible for a PHY device to be suspended as part of a
> >>>>> network device driver's suspend call while it is still being attached to
> >>>>> that net_device, either via phy_suspend() or implicitly via phy_stop().
> >>>>>
> >>>>> Later on, when the MDIO bus controller get suspended, we would attempt
> >>>>> to suspend again the PHY because it is still attached to a network
> >>>>> device.
> >>>>>
> >>>>> This is both a waste of time and creates an opportunity for improper
> >>>>> clock/power management bugs to creep in.
> >>>>>
> >>>>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
> >>>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> >>>>
> >>>> Applied, and queued up for -stable, thanks Florian.
> >>>
> >>> This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
> >>> After resume from s2ram, Ethernet no longer works:
> >>>
> >>>     PM: suspend exit
> >>>     nfs: server aaa.bbb.ccc.ddd not responding, still trying
> >>>     ...
> >>>
> >>> Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
> >>> fixes the issue.
> >>>
> >>> On both boards, an SMSC LAN9220 is connected to a power-managed local
> >>> bus.
> >>>
> >>> I added some debug code to check when the clock driving the local bus
> >>> is stopped and started, but I see no difference before/after. Hence I
> >>> suspect the Ethernet chip is no longer reinitialized after resume.
> >>
> >> Can you provide a complete log?
> >
> > With some debug info:
> >
> >     SDHI0 Vcc: disabling
> >     PM: suspend entry (deep)
> >     Filesystems sync: 0.002 seconds
> >     Freezing user space processes ... (elapsed 0.001 seconds) done.
> >     OOM killer disabled.
> >     Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> >     PM: ==== a3sp/ee120000.sd: stop
> >     PM: ==== a3sp/ee100000.sd: stop
> >     smsc911x 8000000.ethernet: smsc911x_suspend:2577
> >     smsc911x 8000000.ethernet: smsc911x_suspend:2579 running
> >     smsc911x 8000000.ethernet: smsc911x_suspend:2584
> >     PM: ==== a3sp/ee200000.mmc: stop
> >     PM: ==== c4/fec10000.bus: stop
> >     PM: ==== a3sp/e6c40000.serial: stop
> >     PM: ==== c5/e61f0000.thermal: stop
> >     PM: ==== c4/e61c0200.interrupt-controller: stop
> >     PM: == a3sp: power off
> >     rmobile_pd_power_down: a3sp
> >     Disabling non-boot CPUs ...
> >     PM: ==== c4/e61c0200.interrupt-controller: start
> >     PM: ==== c5/e61f0000.thermal: start
> >     PM: ==== a3sp/e6c40000.serial: start
> >     PM: ==== c4/fec10000.bus: start
> >     PM: ==== a3sp/ee200000.mmc: start
> >     smsc911x 8000000.ethernet: smsc911x_resume:2606
> >     smsc911x 8000000.ethernet: smsc911x_resume:2625 running
> >     PM: ==== a3sp/ee100000.sd: start
> >     OOM killer enabled.
> >     Restarting tasks ... done.
> >     PM: ==== a3sp/ee120000.sd: start
> >     PM: suspend exit
> >     nfs: server aaa.bbb.ccc.ddd not responding, still trying
> >     ...
> >
> > But no difference between the good and the bad case, except for the nfs
> > failures.
> >
> >> Do you use the Generic PHY driver or a
> >> specialized one?
> >
> > CONFIG_FIXED_PHY=y
> > CONFIG_SMSC_PHY=y
> >
> > Just the smsc,lan9115 node, cfr. arch/arm/boot/dts/r8a73a4-ape6evm.dts
> >
> >> Do you have a way to dump the registers at the time of
> >> failure and see if BMCR.PDOWN is still set somehow?
> >
> > Added a hook into "nfs: server not responding", which prints:
> >
> >     MII_BMCR = 0x1900
> >
> > i.e. BMCR_PDOWN = 0x0800 is still set.
> >
> >> Does the following help:
> >>
> >> diff --git a/drivers/net/ethernet/smsc/smsc911x.c
> >> b/drivers/net/ethernet/smsc/smsc911x.c
> >> index 49a6a9167af4..df17190c76c0 100644
> >> --- a/drivers/net/ethernet/smsc/smsc911x.c
> >> +++ b/drivers/net/ethernet/smsc/smsc911x.c
> >> @@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
> >>  	if (netif_running(ndev)) {
> >>  		netif_device_attach(ndev);
> >>  		netif_start_queue(ndev);
> >> +		phy_resume(dev->phydev);
> >>  	}
> >>
> >
> > Yes i does, after s/dev->/ndev->/.
> > Thanks!
>
> This seems to be a workaround. And the same issue we may have with

I agree.

> other drivers too. Could you please alternatively test the following?
> It tackles the issue that mdio_bus_phy_may_suspend() is used in
> suspend AND resume, and both calls may return different values.
>
> With this patch we call mdio_bus_phy_may_suspend() only when
> suspending, and let the phy_device store whether it was suspended
> by MDIO bus PM.

Thanks, your patch fixes the issue, too.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
end of thread, other threads:[~2020-03-12 8:27 UTC | newest]

Thread overview: 8+ messages
2020-02-20 23:34 [PATCH net] net: phy: Avoid multiple suspends Florian Fainelli
2020-02-24 4:59 ` David Miller
2020-03-10 14:16   ` Geert Uytterhoeven
2020-03-10 16:46     ` Florian Fainelli
2020-03-10 17:34       ` Heiner Kallweit
2020-03-11 9:17       ` Geert Uytterhoeven
2020-03-11 21:22         ` Heiner Kallweit
2020-03-12 8:26           ` Geert Uytterhoeven