LKML Archive on lore.kernel.org
* [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource
@ 2015-03-05 21:06 Alex Williamson
  2015-03-06  1:49 ` Jiang Liu
  0 siblings, 1 reply; 12+ messages in thread
From: Alex Williamson @ 2015-03-05 21:06 UTC (permalink / raw)
  To: x86, rjw, mingo, bp, lv.zheng, hpa, bhelgaas, tglx, yinghai, lenb
  Cc: linux-pci, tony.luck, linux-acpi, jiang.liu, linux-kernel

The IRQ resource for a device is established when pci_enable_device()
is called on a fully disabled device (ie. enable_cnt == 0).  With
commit b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ
resources") this same IRQ resource is released when the driver is
unbound from the device, regardless of the enable_cnt.  This presents
the situation that an ill-behaved driver can now make a device
unusable to subsequent drivers by an imbalance in their use of
pci_enable/disable_device().  It's one thing to break your own device
if you're one of these ill-behaved drivers, but it's a serious
regression for secondary drivers like vfio-pci, which are innocent
of the transgressions of the previous driver.

Resolve by pushing the device to a fully disabled state before
releasing the IRQ resource.

Fixes: b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ resources")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/pci/common.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 3d2612b..4810194 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -527,8 +527,19 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action,
 	if (action != BUS_NOTIFY_UNBOUND_DRIVER)
 		return NOTIFY_DONE;
 
-	if (pcibios_disable_irq)
+	if (pcibios_disable_irq) {
+		/*
+		 * Broken drivers may allow a device to be .remove()'d while
+		 * still enabled.  pci_enable_device() will only re-establish
+		 * dev->irq if the devices is fully disabled.  So if we want
+		 * to release the IRQ, we need to make sure the next driver
+		 * can re-establish it using pci_enable_device().
+		 */
+		while (pci_is_enabled(dev))
+			pci_disable_device(dev);
+
 		pcibios_disable_irq(dev);
+	}
 
 	return NOTIFY_OK;
 }

^ permalink raw reply related [flat|nested] 12+ messages in thread
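The failure mode described in the patch above can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct model_dev`, `model_enable()`, `model_disable()`, `unbind_unconditional()`, `unbind_force_disable()` and `next_driver_gets_irq()` are hypothetical names invented for illustration. The model assumes, as the commit message states, that the IRQ resource is established only when the device is enabled from a fully disabled state (enable count 0), and it reduces the unbind notifier to a single flag update.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the state discussed in the patch. */
struct model_dev {
	int enable_cnt;  /* plays the role of the device's enable count */
	bool irq_ready;  /* true while an IRQ resource is established */
};

/* Enable: the IRQ is only established on the 0 -> 1 transition. */
static void model_enable(struct model_dev *d)
{
	if (d->enable_cnt++ == 0)
		d->irq_ready = true;
}

/* Disable: simplified to dropping one reference on the enable count. */
static void model_disable(struct model_dev *d)
{
	if (d->enable_cnt > 0)
		d->enable_cnt--;
}

/* Unbind as described for b4b55cda5874: release regardless of count. */
static void unbind_unconditional(struct model_dev *d)
{
	d->irq_ready = false;
}

/* Unbind as proposed in the patch: force full disable, then release. */
static void unbind_force_disable(struct model_dev *d)
{
	while (d->enable_cnt > 0)
		model_disable(d);
	d->irq_ready = false;
}

/* A subsequent driver (e.g. vfio-pci) enables the device and checks
 * whether it ended up with a usable IRQ. */
static bool next_driver_gets_irq(struct model_dev *d)
{
	model_enable(d);
	return d->irq_ready;
}
```

In this model, a driver that enables the device once and never disables it leaves the enable count at 1. After an unconditional release, the next driver's enable call is no longer a 0 -> 1 transition, so the IRQ is never re-established. Forcing the count back to 0 before the release restores that transition.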
* Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource 2015-03-05 21:06 [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource Alex Williamson @ 2015-03-06 1:49 ` Jiang Liu 2015-03-06 3:51 ` Alex Williamson 0 siblings, 1 reply; 12+ messages in thread From: Jiang Liu @ 2015-03-06 1:49 UTC (permalink / raw) To: Alex Williamson, x86, rjw, mingo, bp, lv.zheng, hpa, bhelgaas, tglx, yinghai, lenb Cc: linux-pci, tony.luck, linux-acpi, linux-kernel On 2015/3/6 5:06, Alex Williamson wrote: > The IRQ resource for a device is established when pci_enabled_device() > is called on a fully disabled device (ie. enable_cnt == 0). With > commit b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ > resources") this same IRQ resource is released when the driver is > unbound from the device, regardless of the enable_cnt. This presents > the situation that an ill-behaved driver can now make a device > unusable to subsequent drivers by an imbalance in their use of > pci_enable/disable_device(). It's one thing to break your own device > if you're one of these ill-behaved drivers, but it's a serious > regression for secondary drivers like vfio-pci, which are innocent > of the transgressions of the previous driver. > > Resolve by pushing the device to a fully disabled state before > releasing the IRQ resource. 
> > Fixes: b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ resources") > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > Cc: Jiang Liu <jiang.liu@linux.intel.com> > --- > arch/x86/pci/common.c | 13 ++++++++++++- > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c > index 3d2612b..4810194 100644 > --- a/arch/x86/pci/common.c > +++ b/arch/x86/pci/common.c > @@ -527,8 +527,19 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action, > if (action != BUS_NOTIFY_UNBOUND_DRIVER) > return NOTIFY_DONE; > > - if (pcibios_disable_irq) > + if (pcibios_disable_irq) { > + /* > + * Broken drivers may allow a device to be .remove()'d while > + * still enabled. pci_enable_device() will only re-establish > + * dev->irq if the devices is fully disabled. So if we want > + * to release the IRQ, we need to make sure the next driver > + * can re-establish it using pci_enable_device(). > + */ > + while (pci_is_enabled(dev)) > + pci_disable_device(dev); > + > pcibios_disable_irq(dev); > + } Hi Alex, Thanks for debugging and fixing it. Will it be feasible to give a debug message to remind those driver authors to correctly disable PCI when unbinding? Regards! Gerry > > return NOTIFY_OK; > } > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource 2015-03-06 1:49 ` Jiang Liu @ 2015-03-06 3:51 ` Alex Williamson 2015-03-11 16:47 ` Alex Williamson 0 siblings, 1 reply; 12+ messages in thread From: Alex Williamson @ 2015-03-06 3:51 UTC (permalink / raw) To: Jiang Liu, Bjorn Helgaas Cc: x86, rjw, mingo, bp, lv.zheng, hpa, tglx, yinghai, lenb, linux-pci, tony.luck, linux-acpi, linux-kernel On Fri, 2015-03-06 at 09:49 +0800, Jiang Liu wrote: > On 2015/3/6 5:06, Alex Williamson wrote: > > The IRQ resource for a device is established when pci_enabled_device() > > is called on a fully disabled device (ie. enable_cnt == 0). With > > commit b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ > > resources") this same IRQ resource is released when the driver is > > unbound from the device, regardless of the enable_cnt. This presents > > the situation that an ill-behaved driver can now make a device > > unusable to subsequent drivers by an imbalance in their use of > > pci_enable/disable_device(). It's one thing to break your own device > > if you're one of these ill-behaved drivers, but it's a serious > > regression for secondary drivers like vfio-pci, which are innocent > > of the transgressions of the previous driver. > > > > Resolve by pushing the device to a fully disabled state before > > releasing the IRQ resource. 
> > > > Fixes: b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ resources") > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > Cc: Jiang Liu <jiang.liu@linux.intel.com> > > --- > > arch/x86/pci/common.c | 13 ++++++++++++- > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c > > index 3d2612b..4810194 100644 > > --- a/arch/x86/pci/common.c > > +++ b/arch/x86/pci/common.c > > @@ -527,8 +527,19 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action, > > if (action != BUS_NOTIFY_UNBOUND_DRIVER) > > return NOTIFY_DONE; > > > > - if (pcibios_disable_irq) > > + if (pcibios_disable_irq) { > > + /* > > + * Broken drivers may allow a device to be .remove()'d while > > + * still enabled. pci_enable_device() will only re-establish > > + * dev->irq if the devices is fully disabled. So if we want > > + * to release the IRQ, we need to make sure the next driver > > + * can re-establish it using pci_enable_device(). > > + */ > > + while (pci_is_enabled(dev)) > > + pci_disable_device(dev); > > + > > pcibios_disable_irq(dev); > > + } > Hi Alex, > Thanks for debugging and fixing it. > Will it be feasible to give a debug message to remind those > driver authors to correctly disable PCI when unbinding? I can certainly add a warning to the loop, it loses a bit of its teeth here though since we can't specify which driver to blame at this point. Maybe that warning and perhaps this enabling roll-back should happen in drivers/pci/pci-driver.c:pci_device_remove(). Bjorn, would you prefer it be done generically there? Thanks, Alex ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource 2015-03-06 3:51 ` Alex Williamson @ 2015-03-11 16:47 ` Alex Williamson 2015-03-11 22:04 ` Rafael J. Wysocki 2015-03-13 2:06 ` [Bugfix] x86/PCI: Release PCI IRQ resource only if PCI device is disabled when unbinding Jiang Liu 0 siblings, 2 replies; 12+ messages in thread From: Alex Williamson @ 2015-03-11 16:47 UTC (permalink / raw) To: Jiang Liu Cc: Bjorn Helgaas, x86, rjw, mingo, bp, lv.zheng, hpa, tglx, yinghai, lenb, linux-pci, tony.luck, linux-acpi, linux-kernel On Thu, 2015-03-05 at 20:51 -0700, Alex Williamson wrote: > On Fri, 2015-03-06 at 09:49 +0800, Jiang Liu wrote: > > On 2015/3/6 5:06, Alex Williamson wrote: > > > The IRQ resource for a device is established when pci_enabled_device() > > > is called on a fully disabled device (ie. enable_cnt == 0). With > > > commit b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ > > > resources") this same IRQ resource is released when the driver is > > > unbound from the device, regardless of the enable_cnt. This presents > > > the situation that an ill-behaved driver can now make a device > > > unusable to subsequent drivers by an imbalance in their use of > > > pci_enable/disable_device(). It's one thing to break your own device > > > if you're one of these ill-behaved drivers, but it's a serious > > > regression for secondary drivers like vfio-pci, which are innocent > > > of the transgressions of the previous driver. > > > > > > Resolve by pushing the device to a fully disabled state before > > > releasing the IRQ resource. 
> > > > > > Fixes: b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ resources") > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > Cc: Jiang Liu <jiang.liu@linux.intel.com> > > > --- > > > arch/x86/pci/common.c | 13 ++++++++++++- > > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > > > diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c > > > index 3d2612b..4810194 100644 > > > --- a/arch/x86/pci/common.c > > > +++ b/arch/x86/pci/common.c > > > @@ -527,8 +527,19 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action, > > > if (action != BUS_NOTIFY_UNBOUND_DRIVER) > > > return NOTIFY_DONE; > > > > > > - if (pcibios_disable_irq) > > > + if (pcibios_disable_irq) { > > > + /* > > > + * Broken drivers may allow a device to be .remove()'d while > > > + * still enabled. pci_enable_device() will only re-establish > > > + * dev->irq if the devices is fully disabled. So if we want > > > + * to release the IRQ, we need to make sure the next driver > > > + * can re-establish it using pci_enable_device(). > > > + */ > > > + while (pci_is_enabled(dev)) > > > + pci_disable_device(dev); > > > + > > > pcibios_disable_irq(dev); > > > + } > > Hi Alex, > > Thanks for debugging and fixing it. > > Will it be feasible to give a debug message to remind those > > driver authors to correctly disable PCI when unbinding? > > I can certainly add a warning to the loop, it loses a bit of its teeth > here though since we can't specify which driver to blame at this point. > Maybe that warning and perhaps this enabling roll-back should happen in > drivers/pci/pci-driver.c:pci_device_remove(). Bjorn, would you prefer > it be done generically there? 
Thanks, Unfortunately there's a long standing comment in pci_device_remove(): /* * We would love to complain here if pci_dev->is_enabled is set, that * the driver should have called pci_disable_device(), but the * unfortunate fact is there are too many odd BIOS and bridge setups * that don't like drivers doing that all of the time. * Oh well, we can dream of sane hardware when we sleep, no matter how * horrible the crap we have to deal with is when we are awake... */ So, unless we can somehow ignore that comment, I suspect forcing the device to be disabled on driver remove, whether done from pci-core or from x86/pci, is going to cause all sorts of breakage. Are the expectations set by b4b55cda5874 really valid? It seems like something needs to be done to allow the IRQ to be automatically re-established on x86 regardless of the driver doing the right thing when releasing the device. We're still looking at a regression for v4.0 as a result of b4b55cda5874. Thanks, Alex ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource 2015-03-11 16:47 ` Alex Williamson @ 2015-03-11 22:04 ` Rafael J. Wysocki 2015-03-11 22:04 ` Luck, Tony 2015-03-13 2:06 ` [Bugfix] x86/PCI: Release PCI IRQ resource only if PCI device is disabled when unbinding Jiang Liu 1 sibling, 1 reply; 12+ messages in thread From: Rafael J. Wysocki @ 2015-03-11 22:04 UTC (permalink / raw) To: Alex Williamson, Jiang Liu Cc: Bjorn Helgaas, x86, mingo, bp, lv.zheng, hpa, tglx, yinghai, lenb, linux-pci, tony.luck, linux-acpi, linux-kernel On Wednesday, March 11, 2015 10:47:30 AM Alex Williamson wrote: > On Thu, 2015-03-05 at 20:51 -0700, Alex Williamson wrote: > > On Fri, 2015-03-06 at 09:49 +0800, Jiang Liu wrote: > > > On 2015/3/6 5:06, Alex Williamson wrote: > > > > The IRQ resource for a device is established when pci_enabled_device() > > > > is called on a fully disabled device (ie. enable_cnt == 0). With > > > > commit b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ > > > > resources") this same IRQ resource is released when the driver is > > > > unbound from the device, regardless of the enable_cnt. This presents > > > > the situation that an ill-behaved driver can now make a device > > > > unusable to subsequent drivers by an imbalance in their use of > > > > pci_enable/disable_device(). It's one thing to break your own device > > > > if you're one of these ill-behaved drivers, but it's a serious > > > > regression for secondary drivers like vfio-pci, which are innocent > > > > of the transgressions of the previous driver. > > > > > > > > Resolve by pushing the device to a fully disabled state before > > > > releasing the IRQ resource. 
> > > > > > > > Fixes: b4b55cda5874 ("x86/PCI: Refine the way to release PCI IRQ resources") > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > > > Cc: Jiang Liu <jiang.liu@linux.intel.com> > > > > --- > > > > arch/x86/pci/common.c | 13 ++++++++++++- > > > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c > > > > index 3d2612b..4810194 100644 > > > > --- a/arch/x86/pci/common.c > > > > +++ b/arch/x86/pci/common.c > > > > @@ -527,8 +527,19 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action, > > > > if (action != BUS_NOTIFY_UNBOUND_DRIVER) > > > > return NOTIFY_DONE; > > > > > > > > - if (pcibios_disable_irq) > > > > + if (pcibios_disable_irq) { > > > > + /* > > > > + * Broken drivers may allow a device to be .remove()'d while > > > > + * still enabled. pci_enable_device() will only re-establish > > > > + * dev->irq if the devices is fully disabled. So if we want > > > > + * to release the IRQ, we need to make sure the next driver > > > > + * can re-establish it using pci_enable_device(). > > > > + */ > > > > + while (pci_is_enabled(dev)) > > > > + pci_disable_device(dev); > > > > + > > > > pcibios_disable_irq(dev); > > > > + } > > > Hi Alex, > > > Thanks for debugging and fixing it. > > > Will it be feasible to give a debug message to remind those > > > driver authors to correctly disable PCI when unbinding? > > > > I can certainly add a warning to the loop, it loses a bit of its teeth > > here though since we can't specify which driver to blame at this point. > > Maybe that warning and perhaps this enabling roll-back should happen in > > drivers/pci/pci-driver.c:pci_device_remove(). Bjorn, would you prefer > > it be done generically there? 
Thanks, > > Unfortunately there's a long standing comment in pci_device_remove(): > > /* > * We would love to complain here if pci_dev->is_enabled is set, that > * the driver should have called pci_disable_device(), but the > * unfortunate fact is there are too many odd BIOS and bridge setups > * that don't like drivers doing that all of the time. > * Oh well, we can dream of sane hardware when we sleep, no matter how > * horrible the crap we have to deal with is when we are awake... > */ > > So, unless we can somehow ignore that comment, I suspect forcing the > device to be disabled on driver remove, whether done from pci-core or > from x86/pci, is going to cause all sorts of breakage. Are the > expectations set by b4b55cda5874 really valid? It seems like something > needs to be done to allow the IRQ to be automatically re-established on > x86 regardless of the driver doing the right thing when releasing the > device. We're still looking at a regression for v4.0 as a result of > b4b55cda5874. In which case we probably should revert commit b4b55cda5874 for the time being. At least I'd be very nervous about any ad-hoc fixes at this stage of the cycle. Gerry? -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource 2015-03-11 22:04 ` Rafael J. Wysocki @ 2015-03-11 22:04 ` Luck, Tony 2015-03-12 1:17 ` Rafael J. Wysocki 0 siblings, 1 reply; 12+ messages in thread From: Luck, Tony @ 2015-03-11 22:04 UTC (permalink / raw) To: Rafael J. Wysocki, Alex Williamson, Jiang Liu Cc: Bjorn Helgaas, x86, mingo, bp, Zheng, Lv, hpa, tglx, yinghai, lenb, linux-pci, linux-acpi, linux-kernel [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1647 bytes --] >> Unfortunately there's a long standing comment in pci_device_remove(): >> >> /* >> * We would love to complain here if pci_dev->is_enabled is set, that >> * the driver should have called pci_disable_device(), but the >> * unfortunate fact is there are too many odd BIOS and bridge setups >> * that don't like drivers doing that all of the time. >> * Oh well, we can dream of sane hardware when we sleep, no matter how >> * horrible the crap we have to deal with is when we are awake... >> */ >> >> So, unless we can somehow ignore that comment, I suspect forcing the >> device to be disabled on driver remove, whether done from pci-core or >> from x86/pci, is going to cause all sorts of breakage. Are the >> expectations set by b4b55cda5874 really valid? It seems like something >> needs to be done to allow the IRQ to be automatically re-established on >> x86 regardless of the driver doing the right thing when releasing the >> device. We're still looking at a regression for v4.0 as a result of >> b4b55cda5874. > > In which case we probably should revert commit b4b55cda5874 for the time being. > > At least I'd be very nervous about any ad-hoc fixes at this stage of the cycle. The comment goes back to the dawn of "git" time ... not sure how much further back. Is this actually still an issue on modern systems? Maybe we need a black list or white list to separate the good from bad systems? 
-Tony ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource 2015-03-11 22:04 ` Luck, Tony @ 2015-03-12 1:17 ` Rafael J. Wysocki 2015-03-12 1:41 ` Jiang Liu 0 siblings, 1 reply; 12+ messages in thread From: Rafael J. Wysocki @ 2015-03-12 1:17 UTC (permalink / raw) To: Luck, Tony Cc: Alex Williamson, Jiang Liu, Bjorn Helgaas, x86, mingo, bp, Zheng, Lv, hpa, tglx, yinghai, lenb, linux-pci, linux-acpi, linux-kernel On Wednesday, March 11, 2015 10:04:42 PM Luck, Tony wrote: > >> Unfortunately there's a long standing comment in pci_device_remove(): > >> > >> /* > >> * We would love to complain here if pci_dev->is_enabled is set, that > >> * the driver should have called pci_disable_device(), but the > >> * unfortunate fact is there are too many odd BIOS and bridge setups > >> * that don't like drivers doing that all of the time. > >> * Oh well, we can dream of sane hardware when we sleep, no matter how > >> * horrible the crap we have to deal with is when we are awake... > >> */ > >> > >> So, unless we can somehow ignore that comment, I suspect forcing the > >> device to be disabled on driver remove, whether done from pci-core or > >> from x86/pci, is going to cause all sorts of breakage. Are the > >> expectations set by b4b55cda5874 really valid? It seems like something > >> needs to be done to allow the IRQ to be automatically re-established on > >> x86 regardless of the driver doing the right thing when releasing the > >> device. We're still looking at a regression for v4.0 as a result of > >> b4b55cda5874. > > > > In which case we probably should revert commit b4b55cda5874 for the time being. > > > > At least I'd be very nervous about any ad-hoc fixes at this stage of the cycle. > > The comment goes back to the dawn of "git" time ... not sure how much further > back. > > Is this actually still an issue on modern systems? Maybe we need a black list > or white list to separate the good from bad systems? 
The answer to that is "We don't know" and in my not so humble opinion it is too risky to try to find out at the end of the cycle. -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource 2015-03-12 1:17 ` Rafael J. Wysocki @ 2015-03-12 1:41 ` Jiang Liu 2015-03-12 16:08 ` Rafael J. Wysocki 0 siblings, 1 reply; 12+ messages in thread From: Jiang Liu @ 2015-03-12 1:41 UTC (permalink / raw) To: Rafael J. Wysocki, Luck, Tony Cc: Alex Williamson, Bjorn Helgaas, x86, mingo, bp, Zheng, Lv, hpa, tglx, yinghai, lenb, linux-pci, linux-acpi, linux-kernel On 2015/3/12 9:17, Rafael J. Wysocki wrote: > On Wednesday, March 11, 2015 10:04:42 PM Luck, Tony wrote: >>>> Unfortunately there's a long standing comment in pci_device_remove(): >>>> >>>> /* >>>> * We would love to complain here if pci_dev->is_enabled is set, that >>>> * the driver should have called pci_disable_device(), but the >>>> * unfortunate fact is there are too many odd BIOS and bridge setups >>>> * that don't like drivers doing that all of the time. >>>> * Oh well, we can dream of sane hardware when we sleep, no matter how >>>> * horrible the crap we have to deal with is when we are awake... >>>> */ >>>> >>>> So, unless we can somehow ignore that comment, I suspect forcing the >>>> device to be disabled on driver remove, whether done from pci-core or >>>> from x86/pci, is going to cause all sorts of breakage. Are the >>>> expectations set by b4b55cda5874 really valid? It seems like something >>>> needs to be done to allow the IRQ to be automatically re-established on >>>> x86 regardless of the driver doing the right thing when releasing the >>>> device. We're still looking at a regression for v4.0 as a result of >>>> b4b55cda5874. >>> >>> In which case we probably should revert commit b4b55cda5874 for the time being. >>> >>> At least I'd be very nervous about any ad-hoc fixes at this stage of the cycle. >> >> The comment goes back to the dawn of "git" time ... not sure how much further >> back. >> >> Is this actually still an issue on modern systems? 
Maybe we need a black list >> or white list to separate the good from bad systems? > > The answer to that is "We don't know" and in my not so humble opinion it is too > risky to try to find out at the end of the cycle. Hi Rafael and Alex, How about a patch which: 1) gives a warning if PCI device is still enabled when unloading driver 2) release PCI interrupt only if PCI device is disabled. By this way, we could support IOAPIC hot-removal on latest platforms and avoid regressions on old platforms. Thanks! Gerry ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource 2015-03-12 1:41 ` Jiang Liu @ 2015-03-12 16:08 ` Rafael J. Wysocki 2015-03-13 1:49 ` Jiang Liu 0 siblings, 1 reply; 12+ messages in thread From: Rafael J. Wysocki @ 2015-03-12 16:08 UTC (permalink / raw) To: Jiang Liu Cc: Luck, Tony, Alex Williamson, Bjorn Helgaas, x86, mingo, bp, Zheng, Lv, hpa, tglx, yinghai, lenb, linux-pci, linux-acpi, linux-kernel On Thursday, March 12, 2015 09:41:21 AM Jiang Liu wrote: > On 2015/3/12 9:17, Rafael J. Wysocki wrote: > > On Wednesday, March 11, 2015 10:04:42 PM Luck, Tony wrote: > >>>> Unfortunately there's a long standing comment in pci_device_remove(): > >>>> > >>>> /* > >>>> * We would love to complain here if pci_dev->is_enabled is set, that > >>>> * the driver should have called pci_disable_device(), but the > >>>> * unfortunate fact is there are too many odd BIOS and bridge setups > >>>> * that don't like drivers doing that all of the time. > >>>> * Oh well, we can dream of sane hardware when we sleep, no matter how > >>>> * horrible the crap we have to deal with is when we are awake... > >>>> */ > >>>> > >>>> So, unless we can somehow ignore that comment, I suspect forcing the > >>>> device to be disabled on driver remove, whether done from pci-core or > >>>> from x86/pci, is going to cause all sorts of breakage. Are the > >>>> expectations set by b4b55cda5874 really valid? It seems like something > >>>> needs to be done to allow the IRQ to be automatically re-established on > >>>> x86 regardless of the driver doing the right thing when releasing the > >>>> device. We're still looking at a regression for v4.0 as a result of > >>>> b4b55cda5874. > >>> > >>> In which case we probably should revert commit b4b55cda5874 for the time being. > >>> > >>> At least I'd be very nervous about any ad-hoc fixes at this stage of the cycle. > >> > >> The comment goes back to the dawn of "git" time ... not sure how much further > >> back. 
> >> > >> Is this actually still an issue on modern systems? Maybe we need a black list > >> or white list to separate the good from bad systems? > > > > The answer to that is "We don't know" and in my not so humble opinion it is too > > risky to try to find out at the end of the cycle. > Hi Rafael and Alex, > How about a patch which: > 1) gives a warning if PCI device is still enabled when unloading driver That may become sort of noisy. I really would prefer to introduce things like that by the beginning of the cycle, not by the end of it. > 2) release PCI interrupt only if PCI device is disabled. > By this way, we could support IOAPIC hot-removal on latest platforms and > avoid regressions on old platforms. Well, please submit a patch for discussion. I would like to know Bjorn's opinion about that too at least. Rafael ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource 2015-03-12 16:08 ` Rafael J. Wysocki @ 2015-03-13 1:49 ` Jiang Liu 0 siblings, 0 replies; 12+ messages in thread From: Jiang Liu @ 2015-03-13 1:49 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Luck, Tony, Alex Williamson, Bjorn Helgaas, x86, mingo, bp, Zheng, Lv, hpa, tglx, yinghai, lenb, linux-pci, linux-acpi, linux-kernel On 2015/3/13 0:08, Rafael J. Wysocki wrote: > On Thursday, March 12, 2015 09:41:21 AM Jiang Liu wrote: >> On 2015/3/12 9:17, Rafael J. Wysocki wrote: >>> On Wednesday, March 11, 2015 10:04:42 PM Luck, Tony wrote: >>>>>> Unfortunately there's a long standing comment in pci_device_remove(): >>>>>> >>>>>> /* >>>>>> * We would love to complain here if pci_dev->is_enabled is set, that >>>>>> * the driver should have called pci_disable_device(), but the >>>>>> * unfortunate fact is there are too many odd BIOS and bridge setups >>>>>> * that don't like drivers doing that all of the time. >>>>>> * Oh well, we can dream of sane hardware when we sleep, no matter how >>>>>> * horrible the crap we have to deal with is when we are awake... >>>>>> */ >>>>>> >>>>>> So, unless we can somehow ignore that comment, I suspect forcing the >>>>>> device to be disabled on driver remove, whether done from pci-core or >>>>>> from x86/pci, is going to cause all sorts of breakage. Are the >>>>>> expectations set by b4b55cda5874 really valid? It seems like something >>>>>> needs to be done to allow the IRQ to be automatically re-established on >>>>>> x86 regardless of the driver doing the right thing when releasing the >>>>>> device. We're still looking at a regression for v4.0 as a result of >>>>>> b4b55cda5874. >>>>> >>>>> In which case we probably should revert commit b4b55cda5874 for the time being. >>>>> >>>>> At least I'd be very nervous about any ad-hoc fixes at this stage of the cycle. >>>> >>>> The comment goes back to the dawn of "git" time ... not sure how much further >>>> back. 
>>>> >>>> Is this actually still an issue on modern systems? Maybe we need a black list >>>> or white list to separate the good from bad systems? >>> >>> The answer to that is "We don't know" and in my not so humble opinion it is too >>> risky to try to find out at the end of the cycle. >> Hi Rafael and Alex, >> How about a patch which: >> 1) gives a warning if PCI device is still enabled when unloading driver > > That may become sort of noisy. I really would prefer to introduce things like > that by the beginning of the cycle, not by the end of it. Will try this on next merging window. >> 2) release PCI interrupt only if PCI device is disabled. >> By this way, we could support IOAPIC hot-removal on latest platforms and >> avoid regressions on old platforms. > > Well, please submit a patch for discussion. > > I would like to know Bjorn's opinion about that too at least. Still testing the patch, will send it out soon. > > Rafael > ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bugfix] x86/PCI: Release PCI IRQ resource only if PCI device is disabled when unbinding
  2015-03-11 16:47 ` Alex Williamson
  2015-03-11 22:04   ` Rafael J. Wysocki
@ 2015-03-13  2:06   ` Jiang Liu
  2015-03-13 21:45     ` Rafael J. Wysocki
  1 sibling, 1 reply; 12+ messages in thread
From: Jiang Liu @ 2015-03-13 2:06 UTC (permalink / raw)
  To: Alex Williamson, Rafael J . Wysocki, Bjorn Helgaas, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86
  Cc: Jiang Liu, bp @ alien8 . de, Lv Zheng, yinghai @ kernel . org,
	lenb @ kernel . org, LKML, linux-pci, linux-acpi

To support IOAPIC hot-removal, we need to release PCI interrupt resource
when unbinding PCI device driver. But due to historical reason,
	/*
	 * We would love to complain here if pci_dev->is_enabled is set, that
	 * the driver should have called pci_disable_device(), but the
	 * unfortunate fact is there are too many odd BIOS and bridge setups
	 * that don't like drivers doing that all of the time.
	 * Oh well, we can dream of sane hardware when we sleep, no matter how
	 * horrible the crap we have to deal with is when we are awake...
	 */
some drivers don't call pci_disable_device() when unloading, which
prevents us from reallocating PCI interrupt resource on reloading
PCI driver and causes regressions.

So release PCI interrupt resource only if PCI device is disabled when
unbinding. By this way, we could support IOAPIC hot-removal on latest
platforms and avoid regressions on old platforms.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/pci/common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 3d2612b68694..8d792142cb2a 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -527,7 +527,7 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action,
 	if (action != BUS_NOTIFY_UNBOUND_DRIVER)
 		return NOTIFY_DONE;
 
-	if (pcibios_disable_irq)
+	if (!pci_is_enabled(dev) && pcibios_disable_irq)
 		pcibios_disable_irq(dev);
 
 	return NOTIFY_OK;
-- 
1.7.10.4

^ permalink raw reply related [flat|nested] 12+ messages in thread
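The effect of the one-line change above can be sketched with a small userspace model. This is an illustration under stated assumptions, not kernel code: `struct sim_dev`, `sim_enable()`, `sim_disable()` and `sim_unbind_release_if_disabled()` are hypothetical names, and the enable count check stands in for `!pci_is_enabled(dev)`. As in the thread, the model assumes the IRQ resource is only established when the device is enabled from a fully disabled state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; names are invented for illustration. */
struct sim_dev {
	int enable_cnt;  /* plays the role of the device's enable count */
	bool irq_ready;  /* true while an IRQ resource is established */
};

/* Enable: the IRQ is only established on the 0 -> 1 transition. */
static void sim_enable(struct sim_dev *d)
{
	if (d->enable_cnt++ == 0)
		d->irq_ready = true;
}

/* Disable: simplified to dropping one reference on the enable count. */
static void sim_disable(struct sim_dev *d)
{
	if (d->enable_cnt > 0)
		d->enable_cnt--;
}

/* Unbind following the patch: release the IRQ only if the driver left
 * the device fully disabled; otherwise leave the IRQ untouched. */
static void sim_unbind_release_if_disabled(struct sim_dev *d)
{
	if (d->enable_cnt == 0)  /* stands in for !pci_is_enabled(dev) */
		d->irq_ready = false;
}
```

In this model, a well-behaved driver that balances enable and disable has its IRQ released on unbind, which is the case the thread associates with IOAPIC hot-removal, and the next enable re-establishes it. A driver that leaves the device enabled keeps the IRQ across the unbind, so the next driver still finds it usable even though its own enable call is not a 0 -> 1 transition.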
* Re: [Bugfix] x86/PCI: Release PCI IRQ resource only if PCI device is disabled when unbinding 2015-03-13 2:06 ` [Bugfix] x86/PCI: Release PCI IRQ resource only if PCI device is disabled when unbinding Jiang Liu @ 2015-03-13 21:45 ` Rafael J. Wysocki 0 siblings, 0 replies; 12+ messages in thread From: Rafael J. Wysocki @ 2015-03-13 21:45 UTC (permalink / raw) To: Jiang Liu, Bjorn Helgaas Cc: Alex Williamson, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, bp @ alien8 . de, Lv Zheng, yinghai @ kernel . org, lenb @ kernel . org, LKML, linux-pci, linux-acpi On Friday, March 13, 2015 10:06:43 AM Jiang Liu wrote: > To support IOAPIC hot-removal, we need to release PCI interrupt resource > when unbinding PCI device driver. But due to historical reason, > /* > * We would love to complain here if pci_dev->is_enabled is set, that > * the driver should have called pci_disable_device(), but the > * unfortunate fact is there are too many odd BIOS and bridge setups > * that don't like drivers doing that all of the time. > * Oh well, we can dream of sane hardware when we sleep, no matter how > * horrible the crap we have to deal with is when we are awake... > */ > some drivers don't call pci_disable_device() when unloading, which > prevents us from reallocating PCI interrupt resource on reloading > PCI driver and causes regressions. > > So release PCI interrupt resource only if PCI device is disabled when > unbinding. By this way, we could support IOAPIC hot-removal on latest > platforms and avoid regressions on old platforms. > > Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> OK, I can agree with that. Bjorn, what do you think? 
> --- > arch/x86/pci/common.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c > index 3d2612b68694..8d792142cb2a 100644 > --- a/arch/x86/pci/common.c > +++ b/arch/x86/pci/common.c > @@ -527,7 +527,7 @@ static int pci_irq_notifier(struct notifier_block *nb, unsigned long action, > if (action != BUS_NOTIFY_UNBOUND_DRIVER) > return NOTIFY_DONE; > > - if (pcibios_disable_irq) > + if (!pci_is_enabled(dev) && pcibios_disable_irq) > pcibios_disable_irq(dev); > > return NOTIFY_OK; > -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2015-03-13 21:21 UTC | newest]

Thread overview: 12+ messages
2015-03-05 21:06 [PATCH] x86/PCI: Fully disable devices before releasing IRQ resource Alex Williamson
2015-03-06  1:49 ` Jiang Liu
2015-03-06  3:51   ` Alex Williamson
2015-03-11 16:47     ` Alex Williamson
2015-03-11 22:04       ` Rafael J. Wysocki
2015-03-11 22:04         ` Luck, Tony
2015-03-12  1:17           ` Rafael J. Wysocki
2015-03-12  1:41             ` Jiang Liu
2015-03-12 16:08               ` Rafael J. Wysocki
2015-03-13  1:49                 ` Jiang Liu
2015-03-13  2:06       ` [Bugfix] x86/PCI: Release PCI IRQ resource only if PCI device is disabled when unbinding Jiang Liu
2015-03-13 21:45         ` Rafael J. Wysocki