Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH net] net: phy: marvell10g: fix broken PHY interrupts for anyone after us in the driver probe list
@ 2021-08-27 0:15 Vladimir Oltean
2021-08-27 1:32 ` Marek Behún
2021-08-27 1:49 ` Andrew Lunn
0 siblings, 2 replies; 5+ messages in thread
From: Vladimir Oltean @ 2021-08-27 0:15 UTC (permalink / raw)
To: netdev; +Cc: Russell King, Marek Behún, Andrew Lunn, Heiner Kallweit
Enabling interrupts via device tree for the internal PHYs on the
mv88e6390 DSA switch does not work. The driver insists to use poll mode.
Stage one debugging shows that the fwnode_mdiobus_phy_device_register
function calls fwnode_irq_get properly, and phy->irq is set to a valid
interrupt line initially.
But it is then cleared.
Stage two debugging shows that it is cleared here:
phy_probe:
/* Disable the interrupt if the PHY doesn't support it
* but the interrupt is still a valid one
*/
if (!phy_drv_supports_irq(phydrv) && phy_interrupt_is_valid(phydev))
phydev->irq = PHY_POLL;
Okay, so does the "Marvell 88E6390 Family" PHY driver not have the
.config_intr and .handle_interrupt function pointers? Yes it does.
Stage three debugging shows that the PHY device does not attempt a probe
against the "Marvell 88E6390 Family" driver, but against the "mv88x3310"
driver.
Okay, so why does the "mv88x3310" driver match on a mv8836390 internal PHY?
The PHY IDs (MARVELL_PHY_ID_88E6390_FAMILY vs MARVELL_PHY_ID_88X3310)
are way different.
Stage four debugging has us looking through:
phy_device_register
-> device_add
-> bus_probe_device
-> device_initial_probe
-> __device_attach
-> bus_for_each_drv
-> driver_match_device
-> drv->bus->match
-> phy_bus_match
Okay, so as we said, the MII_PHYSID1 of mv88e6390 does not match the
mv88x3310 driver's PHY mask & ID, so why would phy_bus_match return...
..Ahh, phy_bus_match calls a shortcircuit method, phydrv->match_phy_device,
and does not even bother to compare the PHY ID if that is implemented.
So of course, we go inside the marvell10g.c driver and sure enough, it
implements .match_phy_device and does not bother to check the PHY ID.
What's interesting though is that at the end of the device_add() from
phy_device_register(), the driver for the internal PHYs _is_ the proper
"Marvell 88E6390 Family". This is because "mv88x3310" ends up failing to
probe after all, and __device_attach_driver(), to quote:
/*
* Ignore errors returned by ->probe so that the next driver can try
* its luck.
*/
The next (and only other) driver that matches is the 6390 driver. For
this one, phy_probe doesn't fail, and everything expects to work as
normal, EXCEPT phydev->irq has already been cleared by the previous
unsuccessful probe of a driver which did not implement PHY interrupts,
and therefore cleared that IRQ.
Okay, so it is not just Marvell 6390 that has PHY interrupts broken.
Stuff like Atheros, Aquantia, Broadcom, Qualcomm work because they are
lexicographically before Marvell, and stuff like NXP, Realtek, Vitesse
are broken.
This goes to show how fragile it is to reset phydev->irq = PHY_POLL from
the actual beginning of phy_probe itself. That seems like an actual bug
of its own too, since phy_probe has side effects which are not restored
on probe failure, but the line of thought probably was, the same driver
will attempt probe again, so it doesn't matter. Well, looks like it does.
Maybe it would make more sense to move the phydev->irq clearing after
the actual device_add() in phy_device_register() completes, and the
bound driver is the actual final one.
(also, a bit frightening that drivers are permitted to bypass the MDIO
bus matching in such a trivial way and perform PHY reads and writes from
the .match_phy_device method, on devices that do not even belong to
them. In the general case it might not be guaranteed that the MDIO
accesses one driver needs to make to figure out whether to match on a
device is safe for all other PHY devices)
Fixes: a5de4be0aaaa ("net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
drivers/net/phy/marvell10g.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c
index 53a433442803..7bf35b24fd14 100644
--- a/drivers/net/phy/marvell10g.c
+++ b/drivers/net/phy/marvell10g.c
@@ -987,11 +987,17 @@ static int mv3310_get_number_of_ports(struct phy_device *phydev)
static int mv3310_match_phy_device(struct phy_device *phydev)
{
+ if ((phydev->phy_id & MARVELL_PHY_ID_MASK) != MARVELL_PHY_ID_88X3310)
+ return 0;
+
return mv3310_get_number_of_ports(phydev) == 1;
}
static int mv3340_match_phy_device(struct phy_device *phydev)
{
+ if ((phydev->phy_id & MARVELL_PHY_ID_MASK) != MARVELL_PHY_ID_88X3310)
+ return 0;
+
return mv3310_get_number_of_ports(phydev) == 4;
}
--
2.25.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] net: phy: marvell10g: fix broken PHY interrupts for anyone after us in the driver probe list
2021-08-27 0:15 [PATCH net] net: phy: marvell10g: fix broken PHY interrupts for anyone after us in the driver probe list Vladimir Oltean
@ 2021-08-27 1:32 ` Marek Behún
2021-08-27 1:42 ` Andrew Lunn
2021-08-27 15:33 ` Vladimir Oltean
2021-08-27 1:49 ` Andrew Lunn
1 sibling, 2 replies; 5+ messages in thread
From: Marek Behún @ 2021-08-27 1:32 UTC (permalink / raw)
To: Vladimir Oltean; +Cc: netdev, Russell King, Andrew Lunn, Heiner Kallweit
On Fri, 27 Aug 2021 03:15:13 +0300
Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
> Enabling interrupts via device tree for the internal PHYs on the
> mv88e6390 DSA switch does not work. The driver insists to use poll mode.
>
> Stage one debugging shows that the fwnode_mdiobus_phy_device_register
> function calls fwnode_irq_get properly, and phy->irq is set to a valid
> interrupt line initially.
>
> But it is then cleared.
>
> Stage two debugging shows that it is cleared here:
>
> phy_probe:
>
> /* Disable the interrupt if the PHY doesn't support it
> * but the interrupt is still a valid one
> */
> if (!phy_drv_supports_irq(phydrv) && phy_interrupt_is_valid(phydev))
> phydev->irq = PHY_POLL;
>
> Okay, so does the "Marvell 88E6390 Family" PHY driver not have the
> .config_intr and .handle_interrupt function pointers? Yes it does.
>
> Stage three debugging shows that the PHY device does not attempt a probe
> against the "Marvell 88E6390 Family" driver, but against the "mv88x3310"
> driver.
>
> Okay, so why does the "mv88x3310" driver match on a mv8836390 internal PHY?
> The PHY IDs (MARVELL_PHY_ID_88E6390_FAMILY vs MARVELL_PHY_ID_88X3310)
> are way different.
>
> Stage four debugging has us looking through:
>
> phy_device_register
> -> device_add
> -> bus_probe_device
> -> device_initial_probe
> -> __device_attach
> -> bus_for_each_drv
> -> driver_match_device
> -> drv->bus->match
> -> phy_bus_match
>
> Okay, so as we said, the MII_PHYSID1 of mv88e6390 does not match the
> mv88x3310 driver's PHY mask & ID, so why would phy_bus_match return...
>
> ..Ahh, phy_bus_match calls a shortcircuit method, phydrv->match_phy_device,
> and does not even bother to compare the PHY ID if that is implemented.
>
> So of course, we go inside the marvell10g.c driver and sure enough, it
> implements .match_phy_device and does not bother to check the PHY ID.
>
> What's interesting though is that at the end of the device_add() from
> phy_device_register(), the driver for the internal PHYs _is_ the proper
> "Marvell 88E6390 Family". This is because "mv88x3310" ends up failing to
> probe after all, and __device_attach_driver(), to quote:
>
> /*
> * Ignore errors returned by ->probe so that the next driver can try
> * its luck.
> */
:))) /o\
> The next (and only other) driver that matches is the 6390 driver. For
> this one, phy_probe doesn't fail, and everything expects to work as
> normal, EXCEPT phydev->irq has already been cleared by the previous
> unsuccessful probe of a driver which did not implement PHY interrupts,
> and therefore cleared that IRQ.
>
> Okay, so it is not just Marvell 6390 that has PHY interrupts broken.
> Stuff like Atheros, Aquantia, Broadcom, Qualcomm work because they are
> lexicographically before Marvell, and stuff like NXP, Realtek, Vitesse
> are broken.
>
> This goes to show how fragile it is to reset phydev->irq = PHY_POLL from
> the actual beginning of phy_probe itself. That seems like an actual bug
> of its own too, since phy_probe has side effects which are not restored
> on probe failure, but the line of thought probably was, the same driver
> will attempt probe again, so it doesn't matter. Well, looks like it does.
>
> Maybe it would make more sense to move the phydev->irq clearing after
> the actual device_add() in phy_device_register() completes, and the
> bound driver is the actual final one.
>
> (also, a bit frightening that drivers are permitted to bypass the MDIO
> bus matching in such a trivial way and perform PHY reads and writes from
> the .match_phy_device method, on devices that do not even belong to
> them. In the general case it might not be guaranteed that the MDIO
> accesses one driver needs to make to figure out whether to match on a
> device is safe for all other PHY devices)
>
> Fixes: a5de4be0aaaa ("net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340")
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
> drivers/net/phy/marvell10g.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c
> index 53a433442803..7bf35b24fd14 100644
> --- a/drivers/net/phy/marvell10g.c
> +++ b/drivers/net/phy/marvell10g.c
> @@ -987,11 +987,17 @@ static int mv3310_get_number_of_ports(struct phy_device *phydev)
>
> static int mv3310_match_phy_device(struct phy_device *phydev)
> {
> + if ((phydev->phy_id & MARVELL_PHY_ID_MASK) != MARVELL_PHY_ID_88X3310)
> + return 0;
> +
> return mv3310_get_number_of_ports(phydev) == 1;
> }
>
> static int mv3340_match_phy_device(struct phy_device *phydev)
> {
> + if ((phydev->phy_id & MARVELL_PHY_ID_MASK) != MARVELL_PHY_ID_88X3310)
> + return 0;
> +
> return mv3310_get_number_of_ports(phydev) == 4;
> }
>
I fear these checks won't work, since this is a C45 PHY.
You need to check phydev->c45_ids.device_ids[1], instead of
phydev->phy_id.
And even them I am not entirely sure. I will try to test it tomorrow.
Marek
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] net: phy: marvell10g: fix broken PHY interrupts for anyone after us in the driver probe list
2021-08-27 1:32 ` Marek Behún
@ 2021-08-27 1:42 ` Andrew Lunn
2021-08-27 15:33 ` Vladimir Oltean
1 sibling, 0 replies; 5+ messages in thread
From: Andrew Lunn @ 2021-08-27 1:42 UTC (permalink / raw)
To: Marek Behún; +Cc: Vladimir Oltean, netdev, Russell King, Heiner Kallweit
> I fear these checks won't work, since this is a C45 PHY.
>
> You need to check phydev->c45_ids.device_ids[1], instead of
> phydev->phy_id.
I think you are correct.
static int mv211x_match_phy_device(struct phy_device *phydev, bool has_5g)
{
int val;
if ((phydev->c45_ids.device_ids[MDIO_MMD_PMAPMD] &
MARVELL_PHY_ID_MASK) != MARVELL_PHY_ID_88E2110)
return 0;
This seems like a good pattern to follow.
Andrew
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] net: phy: marvell10g: fix broken PHY interrupts for anyone after us in the driver probe list
2021-08-27 0:15 [PATCH net] net: phy: marvell10g: fix broken PHY interrupts for anyone after us in the driver probe list Vladimir Oltean
2021-08-27 1:32 ` Marek Behún
@ 2021-08-27 1:49 ` Andrew Lunn
1 sibling, 0 replies; 5+ messages in thread
From: Andrew Lunn @ 2021-08-27 1:49 UTC (permalink / raw)
To: Vladimir Oltean; +Cc: netdev, Russell King, Marek Behún, Heiner Kallweit
Thanks for the detailed description.
> (also, a bit frightening that drivers are permitted to bypass the MDIO
> bus matching in such a trivial way and perform PHY reads and writes from
> the .match_phy_device method, on devices that do not even belong to
> them.
Yes, but i don't think we can block it. We might want to extend the
comment a bit:
/**
* @match_phy_device: Returns true if this is a suitable
* driver for the given phydev. If NULL, matching is based on
* phy_id and phy_id_mask.
*/
int (*match_phy_device)(struct phy_device *phydev);
I guess we got into this situation because it was assumed that phy_id
and phy_id_mask matched and now we can refine it down further between
two devices which have identical IDs. Which is wrong.
Andrew
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] net: phy: marvell10g: fix broken PHY interrupts for anyone after us in the driver probe list
2021-08-27 1:32 ` Marek Behún
2021-08-27 1:42 ` Andrew Lunn
@ 2021-08-27 15:33 ` Vladimir Oltean
1 sibling, 0 replies; 5+ messages in thread
From: Vladimir Oltean @ 2021-08-27 15:33 UTC (permalink / raw)
To: Marek Behún; +Cc: netdev, Russell King, Andrew Lunn, Heiner Kallweit
On Fri, Aug 27, 2021 at 03:32:29AM +0200, Marek Behún wrote:
> > Fixes: a5de4be0aaaa ("net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340")
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > ---
> > drivers/net/phy/marvell10g.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c
> > index 53a433442803..7bf35b24fd14 100644
> > --- a/drivers/net/phy/marvell10g.c
> > +++ b/drivers/net/phy/marvell10g.c
> > @@ -987,11 +987,17 @@ static int mv3310_get_number_of_ports(struct phy_device *phydev)
> >
> > static int mv3310_match_phy_device(struct phy_device *phydev)
> > {
> > + if ((phydev->phy_id & MARVELL_PHY_ID_MASK) != MARVELL_PHY_ID_88X3310)
> > + return 0;
> > +
> > return mv3310_get_number_of_ports(phydev) == 1;
> > }
> >
> > static int mv3340_match_phy_device(struct phy_device *phydev)
> > {
> > + if ((phydev->phy_id & MARVELL_PHY_ID_MASK) != MARVELL_PHY_ID_88X3310)
> > + return 0;
> > +
> > return mv3310_get_number_of_ports(phydev) == 4;
> > }
> >
>
> I fear these checks won't work, since this is a C45 PHY.
>
> You need to check phydev->c45_ids.device_ids[1], instead of
> phydev->phy_id.
>
> And even them I am not entirely sure. I will try to test it tomorrow.
>
> Marek
Thanks for testing and resending. Last night I was trying to do
something completely different, and it looks like I concentrated all my
attention on the problem and none on the solution here.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-08-27 15:33 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-27 0:15 [PATCH net] net: phy: marvell10g: fix broken PHY interrupts for anyone after us in the driver probe list Vladimir Oltean
2021-08-27 1:32 ` Marek Behún
2021-08-27 1:42 ` Andrew Lunn
2021-08-27 15:33 ` Vladimir Oltean
2021-08-27 1:49 ` Andrew Lunn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).