LKML Archive on lore.kernel.org
* [PATCH net] net: phy: Avoid multiple suspends
From: Florian Fainelli @ 2020-02-20 23:34 UTC
To: netdev
Cc: yoshihiro.shimoda.uh, Florian Fainelli, Andrew Lunn,
Heiner Kallweit, Russell King, David S. Miller, Fugang Duan,
open list
It is currently possible for a PHY device to be suspended as part of a
network device driver's suspend call while it is still being attached to
that net_device, either via phy_suspend() or implicitly via phy_stop().
Later on, when the MDIO bus controller gets suspended, we would attempt
to suspend the PHY again because it is still attached to a network
device.
This is both a waste of time and creates an opportunity for improper
clock/power management bugs to creep in.
Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
Heiner, Andrew,
I did consider adding logic that would check for phydev->suspended in
phy_suspend() and phy_resume(), but this was really the only place where
I found it to be problematic.
drivers/net/phy/phy_device.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 6a5056e0ae77..6131aca79823 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -247,7 +247,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 	 * MDIO bus driver and clock gated at this point.
 	 */
 	if (!netdev)
-		return !phydev->suspended;
+		goto out;
 
 	if (netdev->wol_enabled)
 		return false;
@@ -267,7 +267,8 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 	if (device_may_wakeup(&netdev->dev))
 		return false;
 
-	return true;
+out:
+	return !phydev->suspended;
 }
 
 static int mdio_bus_phy_suspend(struct device *dev)
--
2.17.1
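
The double suspend described in the commit message can be sketched with a small userspace model. This is only an illustration, not kernel code: struct phy_count and every function below are hypothetical stand-ins, and the WoL/wakeup checks of mdio_bus_phy_may_suspend() are left out. It assumes, as the note under the "---" line suggests, that phy_suspend() itself does not look at phydev->suspended.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct phy_device (illustration only). */
struct phy_count {
	bool attached;      /* PHY is attached to a net_device */
	bool suspended;     /* mirrors phydev->suspended */
	int suspend_calls;  /* how often the PHY was actually suspended */
};

/* Models phy_suspend(): no guard against an already suspended PHY. */
void count_phy_suspend(struct phy_count *phy)
{
	phy->suspend_calls++;
	phy->suspended = true;
}

/* Before the patch: an attached PHY that passes the (omitted) WoL and
 * wakeup checks is always reported as suspendable. */
bool may_suspend_before(const struct phy_count *phy)
{
	if (!phy->attached)
		return !phy->suspended;
	return true;
}

/* After the patch: both paths end in !phydev->suspended. */
bool may_suspend_after(const struct phy_count *phy)
{
	return !phy->suspended;
}

/* Models mdio_bus_phy_suspend() with either version of the check. */
void bus_suspend(struct phy_count *phy, bool patched)
{
	bool ok = patched ? may_suspend_after(phy) : may_suspend_before(phy);

	if (ok)
		count_phy_suspend(phy);
}
```

In this model, a driver that has already called count_phy_suspend() on an attached PHY ends up with two suspend calls when bus_suspend() runs with patched == false, and with one when patched == true.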
* Re: [PATCH net] net: phy: Avoid multiple suspends
From: David Miller @ 2020-02-24 4:59 UTC
To: f.fainelli
Cc: netdev, yoshihiro.shimoda.uh, andrew, hkallweit1, linux, B38611,
linux-kernel
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu, 20 Feb 2020 15:34:53 -0800
> It is currently possible for a PHY device to be suspended as part of a
> network device driver's suspend call while it is still being attached to
> that net_device, either via phy_suspend() or implicitly via phy_stop().
>
> Later on, when the MDIO bus controller get suspended, we would attempt
> to suspend again the PHY because it is still attached to a network
> device.
>
> This is both a waste of time and creates an opportunity for improper
> clock/power management bugs to creep in.
>
> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Applied, and queued up for -stable, thanks Florian.
* Re: [PATCH net] net: phy: Avoid multiple suspends
From: Geert Uytterhoeven @ 2020-03-10 14:16 UTC
To: David Miller, Florian Fainelli
Cc: netdev, Yoshihiro Shimoda, Andrew Lunn, Heiner Kallweit,
Russell King, B38611, Linux Kernel Mailing List, Linux-Renesas
Hi Florian, David,
On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
> From: Florian Fainelli <f.fainelli@gmail.com>
> Date: Thu, 20 Feb 2020 15:34:53 -0800
>
> > It is currently possible for a PHY device to be suspended as part of a
> > network device driver's suspend call while it is still being attached to
> > that net_device, either via phy_suspend() or implicitly via phy_stop().
> >
> > Later on, when the MDIO bus controller get suspended, we would attempt
> > to suspend again the PHY because it is still attached to a network
> > device.
> >
> > This is both a waste of time and creates an opportunity for improper
> > clock/power management bugs to creep in.
> >
> > Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
> > Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>
> Applied, and queued up for -stable, thanks Florian.
This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
After resume from s2ram, Ethernet no longer works:
PM: suspend exit
nfs: server aaa.bbb.ccc.ddd not responding, still trying
...
Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
fixes the issue.
On both boards, an SMSC LAN9220 is connected to a power-managed local
bus.
I added some debug code to check when the clock driving the local bus
is stopped and started, but I see no difference before/after. Hence I
suspect the Ethernet chip is no longer reinitialized after resume.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* Re: [PATCH net] net: phy: Avoid multiple suspends
From: Florian Fainelli @ 2020-03-10 16:46 UTC
To: Geert Uytterhoeven, David Miller
Cc: netdev, Yoshihiro Shimoda, Andrew Lunn, Heiner Kallweit,
Russell King, B38611, Linux Kernel Mailing List, Linux-Renesas
On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
> Hi Florian, David,
>
> On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
>> From: Florian Fainelli <f.fainelli@gmail.com>
>> Date: Thu, 20 Feb 2020 15:34:53 -0800
>>
>>> It is currently possible for a PHY device to be suspended as part of a
>>> network device driver's suspend call while it is still being attached to
>>> that net_device, either via phy_suspend() or implicitly via phy_stop().
>>>
>>> Later on, when the MDIO bus controller get suspended, we would attempt
>>> to suspend again the PHY because it is still attached to a network
>>> device.
>>>
>>> This is both a waste of time and creates an opportunity for improper
>>> clock/power management bugs to creep in.
>>>
>>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>>
>> Applied, and queued up for -stable, thanks Florian.
>
> This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
> After resume from s2ram, Ethernet no longer works:
>
> PM: suspend exit
> nfs: server aaa.bbb.ccc.ddd not responding, still trying
> ...
>
> Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
> fixes the issue.
>
> On both boards, an SMSC LAN9220 is connected to a power-managed local
> bus.
>
> I added some debug code to check when the clock driving the local bus
> is stopped and started, but I see no difference before/after. Hence I
> suspect the Ethernet chip is no longer reinitialized after resume.
Can you provide a complete log? Do you use the Generic PHY driver or a
specialized one? Do you have a way to dump the registers at the time of
failure and see if BMCR.PDOWN is still set somehow?
Does the following help:
diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c
index 49a6a9167af4..df17190c76c0 100644
--- a/drivers/net/ethernet/smsc/smsc911x.c
+++ b/drivers/net/ethernet/smsc/smsc911x.c
@@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
 	if (netif_running(ndev)) {
 		netif_device_attach(ndev);
 		netif_start_queue(ndev);
+		phy_resume(dev->phydev);
 	}
 
 	return 0;
--
Florian
* Re: [PATCH net] net: phy: Avoid multiple suspends
From: Heiner Kallweit @ 2020-03-10 17:34 UTC
To: Florian Fainelli, Geert Uytterhoeven, David Miller
Cc: netdev, Yoshihiro Shimoda, Andrew Lunn, Russell King, B38611,
Linux Kernel Mailing List, Linux-Renesas
On 10.03.2020 17:46, Florian Fainelli wrote:
> On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
>> Hi Florian, David,
>>
>> On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
>>> From: Florian Fainelli <f.fainelli@gmail.com>
>>> Date: Thu, 20 Feb 2020 15:34:53 -0800
>>>
>>>> It is currently possible for a PHY device to be suspended as part of a
>>>> network device driver's suspend call while it is still being attached to
>>>> that net_device, either via phy_suspend() or implicitly via phy_stop().
>>>>
>>>> Later on, when the MDIO bus controller get suspended, we would attempt
>>>> to suspend again the PHY because it is still attached to a network
>>>> device.
>>>>
>>>> This is both a waste of time and creates an opportunity for improper
>>>> clock/power management bugs to creep in.
>>>>
>>>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
>>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>>>
>>> Applied, and queued up for -stable, thanks Florian.
>>
>> This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
>> After resume from s2ram, Ethernet no longer works:
>>
>> PM: suspend exit
>> nfs: server aaa.bbb.ccc.ddd not responding, still trying
>> ...
>>
>> Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
>> fixes the issue.
>>
>> On both boards, an SMSC LAN9220 is connected to a power-managed local
>> bus.
>>
>> I added some debug code to check when the clock driving the local bus
>> is stopped and started, but I see no difference before/after. Hence I
>> suspect the Ethernet chip is no longer reinitialized after resume.
>
> Can you provide a complete log? Do you use the Generic PHY driver or a
> specialized one? Do you have a way to dump the registers at the time of
> failure and see if BMCR.PDOWN is still set somehow?
>
Maybe the reason for the misbehavior is that mdio_bus_phy_may_suspend()
is also checked in mdio_bus_phy_resume(), which is not very logical given
the naming. The call to phy_resume() may therefore be skipped.
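
A minimal userspace sketch of that suspicion, assuming the PHY is suspended by the MDIO bus PM path itself. Nothing here is kernel code: struct phy_model and the model_* functions are hypothetical, and the check is reduced to the "!phydev->suspended" result it has after the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct phy_device (illustration only). */
struct phy_model {
	bool suspended;     /* mirrors phydev->suspended */
	bool powered_down;  /* stands in for BMCR.PDOWN in the hardware */
};

/* The patched check, with the netdev/WoL tests left out. */
bool model_may_suspend(const struct phy_model *phy)
{
	return !phy->suspended;
}

/* Models mdio_bus_phy_suspend(): suspend only if the check allows it. */
void model_bus_suspend(struct phy_model *phy)
{
	if (!model_may_suspend(phy))
		return;
	phy->suspended = true;
	phy->powered_down = true;
}

/* Models mdio_bus_phy_resume() reusing the very same check. */
void model_bus_resume(struct phy_model *phy)
{
	if (!model_may_suspend(phy))
		return; /* skipped: the PHY is left powered down */
	phy->suspended = false;
	phy->powered_down = false;
}
```

Calling model_bus_suspend() and then model_bus_resume() on a fresh PHY leaves powered_down set in this model: the suspend itself flips the answer of the check, so the resume path bails out.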
> Does the following help:
>
> diff --git a/drivers/net/ethernet/smsc/smsc911x.c
> b/drivers/net/ethernet/smsc/smsc911x.c
> index 49a6a9167af4..df17190c76c0 100644
> --- a/drivers/net/ethernet/smsc/smsc911x.c
> +++ b/drivers/net/ethernet/smsc/smsc911x.c
> @@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
> if (netif_running(ndev)) {
> netif_device_attach(ndev);
> netif_start_queue(ndev);
> + phy_resume(dev->phydev);
> }
>
> return 0;
>
* Re: [PATCH net] net: phy: Avoid multiple suspends
From: Geert Uytterhoeven @ 2020-03-11 9:17 UTC
To: Florian Fainelli
Cc: David Miller, netdev, Yoshihiro Shimoda, Andrew Lunn,
Heiner Kallweit, Russell King, Linux Kernel Mailing List,
Linux-Renesas
On Tue, Mar 10, 2020 at 5:47 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
>
> On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
> > Hi Florian, David,
> >
> > On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
> >> From: Florian Fainelli <f.fainelli@gmail.com>
> >> Date: Thu, 20 Feb 2020 15:34:53 -0800
> >>
> >>> It is currently possible for a PHY device to be suspended as part of a
> >>> network device driver's suspend call while it is still being attached to
> >>> that net_device, either via phy_suspend() or implicitly via phy_stop().
> >>>
> >>> Later on, when the MDIO bus controller get suspended, we would attempt
> >>> to suspend again the PHY because it is still attached to a network
> >>> device.
> >>>
> >>> This is both a waste of time and creates an opportunity for improper
> >>> clock/power management bugs to creep in.
> >>>
> >>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
> >>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> >>
> >> Applied, and queued up for -stable, thanks Florian.
> >
> > This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
> > After resume from s2ram, Ethernet no longer works:
> >
> > PM: suspend exit
> > nfs: server aaa.bbb.ccc.ddd not responding, still trying
> > ...
> >
> > Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
> > fixes the issue.
> >
> > On both boards, an SMSC LAN9220 is connected to a power-managed local
> > bus.
> >
> > I added some debug code to check when the clock driving the local bus
> > is stopped and started, but I see no difference before/after. Hence I
> > suspect the Ethernet chip is no longer reinitialized after resume.
>
> Can you provide a complete log?
With some debug info:
SDHI0 Vcc: disabling
PM: suspend entry (deep)
Filesystems sync: 0.002 seconds
Freezing user space processes ... (elapsed 0.001 seconds) done.
OOM killer disabled.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: ==== a3sp/ee120000.sd: stop
PM: ==== a3sp/ee100000.sd: stop
smsc911x 8000000.ethernet: smsc911x_suspend:2577
smsc911x 8000000.ethernet: smsc911x_suspend:2579 running
smsc911x 8000000.ethernet: smsc911x_suspend:2584
PM: ==== a3sp/ee200000.mmc: stop
PM: ==== c4/fec10000.bus: stop
PM: ==== a3sp/e6c40000.serial: stop
PM: ==== c5/e61f0000.thermal: stop
PM: ==== c4/e61c0200.interrupt-controller: stop
PM: == a3sp: power off
rmobile_pd_power_down: a3sp
Disabling non-boot CPUs ...
PM: ==== c4/e61c0200.interrupt-controller: start
PM: ==== c5/e61f0000.thermal: start
PM: ==== a3sp/e6c40000.serial: start
PM: ==== c4/fec10000.bus: start
PM: ==== a3sp/ee200000.mmc: start
smsc911x 8000000.ethernet: smsc911x_resume:2606
smsc911x 8000000.ethernet: smsc911x_resume:2625 running
PM: ==== a3sp/ee100000.sd: start
OOM killer enabled.
Restarting tasks ... done.
PM: ==== a3sp/ee120000.sd: start
PM: suspend exit
nfs: server aaa.bbb.ccc.ddd not responding, still trying
...
But no difference between the good and the bad case, except for the nfs
failures.
> Do you use the Generic PHY driver or a
> specialized one?
CONFIG_FIXED_PHY=y
CONFIG_SMSC_PHY=y
Just the smsc,lan9115 node, cfr. arch/arm/boot/dts/r8a73a4-ape6evm.dts
> Do you have a way to dump the registers at the time of
> failure and see if BMCR.PDOWN is still set somehow?
Added a hook into "nfs: server not responding", which prints:
MII_BMCR = 0x1900
i.e. BMCR_PDOWN = 0x0800 is still set.
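
For reference, the 0x1900 value can be decoded with a few lines of C. The bit values are repeated from include/uapi/linux/mii.h so that the sketch stands alone; the helper name bmcr_is_powered_down() is made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as in include/uapi/linux/mii.h, repeated for a standalone build. */
#define BMCR_FULLDPLX 0x0100  /* full duplex */
#define BMCR_PDOWN    0x0800  /* power down */
#define BMCR_ANENABLE 0x1000  /* autonegotiation enabled */

/* Hypothetical helper: true if the power-down bit is set in a BMCR value. */
bool bmcr_is_powered_down(uint16_t bmcr)
{
	return (bmcr & BMCR_PDOWN) != 0;
}
```

0x1900 is BMCR_ANENABLE | BMCR_PDOWN | BMCR_FULLDPLX, so bmcr_is_powered_down(0x1900) is true, while the same value with the power-down bit cleared (0x1100) gives false.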
> Does the following help:
>
> diff --git a/drivers/net/ethernet/smsc/smsc911x.c
> b/drivers/net/ethernet/smsc/smsc911x.c
> index 49a6a9167af4..df17190c76c0 100644
> --- a/drivers/net/ethernet/smsc/smsc911x.c
> +++ b/drivers/net/ethernet/smsc/smsc911x.c
> @@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
> if (netif_running(ndev)) {
> netif_device_attach(ndev);
> netif_start_queue(ndev);
> + phy_resume(dev->phydev);
> }
>
Yes it does, after s/dev->/ndev->/.
Thanks!
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* Re: [PATCH net] net: phy: Avoid multiple suspends
From: Heiner Kallweit @ 2020-03-11 21:22 UTC
To: Geert Uytterhoeven, Florian Fainelli
Cc: David Miller, netdev, Yoshihiro Shimoda, Andrew Lunn,
Russell King, Linux Kernel Mailing List, Linux-Renesas
On 11.03.2020 10:17, Geert Uytterhoeven wrote:
> On Tue, Mar 10, 2020 at 5:47 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>> On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
>>> Hi Florian, David,
>>>
>>> On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
>>>> From: Florian Fainelli <f.fainelli@gmail.com>
>>>> Date: Thu, 20 Feb 2020 15:34:53 -0800
>>>>
>>>>> It is currently possible for a PHY device to be suspended as part of a
>>>>> network device driver's suspend call while it is still being attached to
>>>>> that net_device, either via phy_suspend() or implicitly via phy_stop().
>>>>>
>>>>> Later on, when the MDIO bus controller get suspended, we would attempt
>>>>> to suspend again the PHY because it is still attached to a network
>>>>> device.
>>>>>
>>>>> This is both a waste of time and creates an opportunity for improper
>>>>> clock/power management bugs to creep in.
>>>>>
>>>>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
>>>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>>>>
>>>> Applied, and queued up for -stable, thanks Florian.
>>>
>>> This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
>>> After resume from s2ram, Ethernet no longer works:
>>>
>>> PM: suspend exit
>>> nfs: server aaa.bbb.ccc.ddd not responding, still trying
>>> ...
>>>
>>> Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
>>> fixes the issue.
>>>
>>> On both boards, an SMSC LAN9220 is connected to a power-managed local
>>> bus.
>>>
>>> I added some debug code to check when the clock driving the local bus
>>> is stopped and started, but I see no difference before/after. Hence I
>>> suspect the Ethernet chip is no longer reinitialized after resume.
>>
>> Can you provide a complete log?
>
> With some debug info:
>
> SDHI0 Vcc: disabling
> PM: suspend entry (deep)
> Filesystems sync: 0.002 seconds
> Freezing user space processes ... (elapsed 0.001 seconds) done.
> OOM killer disabled.
> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> PM: ==== a3sp/ee120000.sd: stop
> PM: ==== a3sp/ee100000.sd: stop
> smsc911x 8000000.ethernet: smsc911x_suspend:2577
> smsc911x 8000000.ethernet: smsc911x_suspend:2579 running
> smsc911x 8000000.ethernet: smsc911x_suspend:2584
> PM: ==== a3sp/ee200000.mmc: stop
> PM: ==== c4/fec10000.bus: stop
> PM: ==== a3sp/e6c40000.serial: stop
> PM: ==== c5/e61f0000.thermal: stop
> PM: ==== c4/e61c0200.interrupt-controller: stop
> PM: == a3sp: power off
> rmobile_pd_power_down: a3sp
> Disabling non-boot CPUs ...
> PM: ==== c4/e61c0200.interrupt-controller: start
> PM: ==== c5/e61f0000.thermal: start
> PM: ==== a3sp/e6c40000.serial: start
> PM: ==== c4/fec10000.bus: start
> PM: ==== a3sp/ee200000.mmc: start
> smsc911x 8000000.ethernet: smsc911x_resume:2606
> smsc911x 8000000.ethernet: smsc911x_resume:2625 running
> PM: ==== a3sp/ee100000.sd: start
> OOM killer enabled.
> Restarting tasks ... done.
> PM: ==== a3sp/ee120000.sd: start
> PM: suspend exit
> nfs: server aaa.bbb.ccc.ddd not responding, still trying
> ...
>
> But no difference between the good and the bad case, except for the nfs
> failures.
>
>> Do you use the Generic PHY driver or a
>> specialized one?
>
> CONFIG_FIXED_PHY=y
> CONFIG_SMSC_PHY=y
>
> Just the smsc,lan9115 node, cfr. arch/arm/boot/dts/r8a73a4-ape6evm.dts
>
>> Do you have a way to dump the registers at the time of
>> failure and see if BMCR.PDOWN is still set somehow?
>
> Added a hook into "nfs: server not responding", which prints:
>
> MII_BMCR = 0x1900
>
> i.e. BMCR_PDOWN = 0x0800 is still set.
>
>> Does the following help:
>>
>> diff --git a/drivers/net/ethernet/smsc/smsc911x.c
>> b/drivers/net/ethernet/smsc/smsc911x.c
>> index 49a6a9167af4..df17190c76c0 100644
>> --- a/drivers/net/ethernet/smsc/smsc911x.c
>> +++ b/drivers/net/ethernet/smsc/smsc911x.c
>> @@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
>> if (netif_running(ndev)) {
>> netif_device_attach(ndev);
>> netif_start_queue(ndev);
>> + phy_resume(dev->phydev);
>> }
>>
>
> Yes i does, after s/dev->/ndev->/.
> Thanks!
>
This seems to be a workaround, and we may have the same issue with
other drivers too. Could you please test the following as an alternative?
It tackles the issue that mdio_bus_phy_may_suspend() is used in both
suspend AND resume, and the two calls may return different values.
With this patch we call mdio_bus_phy_may_suspend() only when
suspending, and let the phy_device store whether it was suspended
by MDIO bus PM.
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 32a5ceddc..6d6c6a178 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -286,6 +286,8 @@ static int mdio_bus_phy_suspend(struct device *dev)
 	if (!mdio_bus_phy_may_suspend(phydev))
 		return 0;
 
+	phydev->suspended_by_mdio_bus = 1;
+
 	return phy_suspend(phydev);
 }
 
@@ -294,9 +296,11 @@ static int mdio_bus_phy_resume(struct device *dev)
 	struct phy_device *phydev = to_phy_device(dev);
 	int ret;
 
-	if (!mdio_bus_phy_may_suspend(phydev))
+	if (!phydev->suspended_by_mdio_bus)
 		goto no_resume;
 
+	phydev->suspended_by_mdio_bus = 0;
+
 	ret = phy_resume(phydev);
 	if (ret < 0)
 		return ret;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 8b299476b..118de9f5b 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -357,6 +357,7 @@ struct macsec_ops;
  * is_gigabit_capable: Set to true if PHY supports 1000Mbps
  * has_fixups: Set to true if this phy has fixups/quirks.
  * suspended: Set to true if this phy has been suspended successfully.
+ * suspended_by_mdio_bus: Set to true if this phy was suspended by MDIO bus.
  * sysfs_links: Internal boolean tracking sysfs symbolic links setup/removal.
  * loopback_enabled: Set true if this phy has been loopbacked successfully.
  * state: state of the PHY for management purposes
@@ -396,6 +397,7 @@ struct phy_device {
 	unsigned is_gigabit_capable:1;
 	unsigned has_fixups:1;
 	unsigned suspended:1;
+	unsigned suspended_by_mdio_bus:1;
 	unsigned sysfs_links:1;
 	unsigned loopback_enabled:1;
 
--
2.25.1
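
The effect of the extra flag can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel implementation: struct phy_sketch and the sketch_* functions are hypothetical, and only the flag handling from the patch above is mirrored.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct phy_device (illustration only). */
struct phy_sketch {
	bool suspended;              /* mirrors phydev->suspended */
	bool suspended_by_mdio_bus;  /* mirrors the new flag */
	bool powered_down;           /* stands in for BMCR.PDOWN */
};

/* The suspend-time check, reduced to its final !suspended result. */
bool sketch_may_suspend(const struct phy_sketch *phy)
{
	return !phy->suspended;
}

/* Suspend path: remember that the MDIO bus did the suspend. */
void sketch_bus_suspend(struct phy_sketch *phy)
{
	if (!sketch_may_suspend(phy))
		return;
	phy->suspended_by_mdio_bus = true;
	phy->suspended = true;
	phy->powered_down = true;
}

/* Resume path: consult only the remembered flag, not the check. */
void sketch_bus_resume(struct phy_sketch *phy)
{
	if (!phy->suspended_by_mdio_bus)
		return;
	phy->suspended_by_mdio_bus = false;
	phy->suspended = false;
	phy->powered_down = false;
}
```

In this model a PHY suspended by sketch_bus_suspend() is powered up again by sketch_bus_resume(), while a PHY that some other caller had already suspended is touched by neither function and stays under that caller's control.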
* Re: [PATCH net] net: phy: Avoid multiple suspends
From: Geert Uytterhoeven @ 2020-03-12 8:26 UTC
To: Heiner Kallweit
Cc: Florian Fainelli, David Miller, netdev, Yoshihiro Shimoda,
Andrew Lunn, Russell King, Linux Kernel Mailing List,
Linux-Renesas
Hi Heiner,
On Wed, Mar 11, 2020 at 10:22 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
> On 11.03.2020 10:17, Geert Uytterhoeven wrote:
> > On Tue, Mar 10, 2020 at 5:47 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
> >> On 3/10/20 7:16 AM, Geert Uytterhoeven wrote:
> >>> On Mon, Feb 24, 2020 at 5:59 AM David Miller <davem@davemloft.net> wrote:
> >>>> From: Florian Fainelli <f.fainelli@gmail.com>
> >>>> Date: Thu, 20 Feb 2020 15:34:53 -0800
> >>>>
> >>>>> It is currently possible for a PHY device to be suspended as part of a
> >>>>> network device driver's suspend call while it is still being attached to
> >>>>> that net_device, either via phy_suspend() or implicitly via phy_stop().
> >>>>>
> >>>>> Later on, when the MDIO bus controller get suspended, we would attempt
> >>>>> to suspend again the PHY because it is still attached to a network
> >>>>> device.
> >>>>>
> >>>>> This is both a waste of time and creates an opportunity for improper
> >>>>> clock/power management bugs to creep in.
> >>>>>
> >>>>> Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY")
> >>>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> >>>>
> >>>> Applied, and queued up for -stable, thanks Florian.
> >>>
> >>> This patch causes a regression on r8a73a4/ape6evm and sh73a0/kzm9g.
> >>> After resume from s2ram, Ethernet no longer works:
> >>>
> >>> PM: suspend exit
> >>> nfs: server aaa.bbb.ccc.ddd not responding, still trying
> >>> ...
> >>>
> >>> Reverting commit 503ba7c6961034ff ("net: phy: Avoid multiple suspends")
> >>> fixes the issue.
> >>>
> >>> On both boards, an SMSC LAN9220 is connected to a power-managed local
> >>> bus.
> >>>
> >>> I added some debug code to check when the clock driving the local bus
> >>> is stopped and started, but I see no difference before/after. Hence I
> >>> suspect the Ethernet chip is no longer reinitialized after resume.
> >>
> >> Can you provide a complete log?
> >
> > With some debug info:
> >
> > SDHI0 Vcc: disabling
> > PM: suspend entry (deep)
> > Filesystems sync: 0.002 seconds
> > Freezing user space processes ... (elapsed 0.001 seconds) done.
> > OOM killer disabled.
> > Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> > PM: ==== a3sp/ee120000.sd: stop
> > PM: ==== a3sp/ee100000.sd: stop
> > smsc911x 8000000.ethernet: smsc911x_suspend:2577
> > smsc911x 8000000.ethernet: smsc911x_suspend:2579 running
> > smsc911x 8000000.ethernet: smsc911x_suspend:2584
> > PM: ==== a3sp/ee200000.mmc: stop
> > PM: ==== c4/fec10000.bus: stop
> > PM: ==== a3sp/e6c40000.serial: stop
> > PM: ==== c5/e61f0000.thermal: stop
> > PM: ==== c4/e61c0200.interrupt-controller: stop
> > PM: == a3sp: power off
> > rmobile_pd_power_down: a3sp
> > Disabling non-boot CPUs ...
> > PM: ==== c4/e61c0200.interrupt-controller: start
> > PM: ==== c5/e61f0000.thermal: start
> > PM: ==== a3sp/e6c40000.serial: start
> > PM: ==== c4/fec10000.bus: start
> > PM: ==== a3sp/ee200000.mmc: start
> > smsc911x 8000000.ethernet: smsc911x_resume:2606
> > smsc911x 8000000.ethernet: smsc911x_resume:2625 running
> > PM: ==== a3sp/ee100000.sd: start
> > OOM killer enabled.
> > Restarting tasks ... done.
> > PM: ==== a3sp/ee120000.sd: start
> > PM: suspend exit
> > nfs: server aaa.bbb.ccc.ddd not responding, still trying
> > ...
> >
> > But no difference between the good and the bad case, except for the nfs
> > failures.
> >
> >> Do you use the Generic PHY driver or a
> >> specialized one?
> >
> > CONFIG_FIXED_PHY=y
> > CONFIG_SMSC_PHY=y
> >
> > Just the smsc,lan9115 node, cfr. arch/arm/boot/dts/r8a73a4-ape6evm.dts
> >
> >> Do you have a way to dump the registers at the time of
> >> failure and see if BMCR.PDOWN is still set somehow?
> >
> > Added a hook into "nfs: server not responding", which prints:
> >
> > MII_BMCR = 0x1900
> >
> > i.e. BMCR_PDOWN = 0x0800 is still set.
> >
> >> Does the following help:
> >>
> >> diff --git a/drivers/net/ethernet/smsc/smsc911x.c
> >> b/drivers/net/ethernet/smsc/smsc911x.c
> >> index 49a6a9167af4..df17190c76c0 100644
> >> --- a/drivers/net/ethernet/smsc/smsc911x.c
> >> +++ b/drivers/net/ethernet/smsc/smsc911x.c
> >> @@ -2618,6 +2618,7 @@ static int smsc911x_resume(struct device *dev)
> >> if (netif_running(ndev)) {
> >> netif_device_attach(ndev);
> >> netif_start_queue(ndev);
> >> + phy_resume(dev->phydev);
> >> }
> >>
> >
> > Yes i does, after s/dev->/ndev->/.
> > Thanks!
>
> This seems to be a workaround. And the same issue we may have with
I agree.
> other drivers too. Could you please alternatively test the following?
> It tackles the issue that mdio_bus_phy_may_suspend() is used in
> suspend AND resume, and both calls may return different values.
>
> With this patch we call mdio_bus_phy_may_suspend() only when
> suspending, and let the phy_device store whether it was suspended
> by MDIO bus PM.
Thanks, your patch fixes the issue, too.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds