LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/5] s390/pci: automatic error recovery
@ 2021-09-06  9:49 Niklas Schnelle
  2021-09-06  9:49 ` [PATCH 1/5] s390/pci: refresh function handle in iomap Niklas Schnelle
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-06  9:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Linas Vepstas, Oliver O'Halloran, Russell Currey,
	linuxppc-dev, linux-kernel, linux-s390, Matthew Rosato,
	Pierre Morel

Hello,

This series implements automatic error recovery for PCI devices on s390
following the scheme outlined at Documentation/PCI/pci-error-recovery.rst
it applies on top of currenct master.

The patches have are almost completely s390 specific except for two patches
exporting existing functionality for use by arch/s390/pci/ code. Nevertheless
I would also appreciate any feedback, especially on the last patch, concerning
the implementation of the error recovery flow. I believe we might be the first
implementation of PCI device recovery in a virtualized setting requiring us to
coordinate the device reset with the hypervisor platform by issuing a disable
and re-enable to the platform as well as starting the recovery following
a platform event.

The outline of the patches is as follows:

Patch 1 and 2 add s390 specific code implementing a reset mechanism that
takes the PCI function out of the platform specific error state.

Patches 3 and 4 export existing common code functions for use by the s390
specific recovery code.

Patch 3 I already sent separately resulting in the discussion below but without
a final conclusion.

https://lore.kernel.org/lkml/20210720150145.640727-1-schnelle@linux.ibm.com/

I believe even though there were some doubts about the use of
pci_dev_is_added() by arch code the existing uses as well as the use in the
final patch of this series warrant this export.

Patch 4 "PCI: Export pci_dev_lock()" is basically an extension to commit
e3a9b1212b9d ("PCI: Export pci_dev_trylock() and pci_dev_unlock()") which
already exported pci_dev_trylock(). In the final patch we make use of
pci_dev_lock() to wait for any other exclusive uses of the pdev to be finished
before starting recovery.

Finally Patch 5 implements the recovery flow as part of the existing s390
specific PCI availability and error event mechanism. Where previously the error
case only set an error indication requiring manual intervention to make the
device usable again. Now we handle the case where firmware has already reset
a PCI function after an error was encountered informing the OS that it should
be ready to be used again. Note that the same event is also issued by the
hypervisor if the function was manually taken into a service mode for example
for firmware upgrade via the hypervisor and is now ready to be used again.

Thanks,
Niklas Schnelle

Niklas Schnelle (5):
  s390/pci: refresh function handle in iomap
  s390/pci: implement reset_slot for hotplug slot
  PCI: Move pci_dev_is/assign_added() to pci.h
  PCI: Export pci_dev_lock()
  s390/pci: implement minimal PCI error recovery

 arch/powerpc/platforms/powernv/pci-sriov.c |   3 -
 arch/powerpc/platforms/pseries/setup.c     |   1 -
 arch/s390/include/asm/pci.h                |   6 +-
 arch/s390/pci/pci.c                        | 143 ++++++++++++++-
 arch/s390/pci/pci_event.c                  | 196 ++++++++++++++++++++-
 arch/s390/pci/pci_insn.c                   |   4 +-
 arch/s390/pci/pci_irq.c                    |   9 +
 arch/s390/pci/pci_sysfs.c                  |   2 -
 drivers/pci/hotplug/acpiphp_glue.c         |   1 -
 drivers/pci/hotplug/s390_pci_hpc.c         |  24 +++
 drivers/pci/pci.c                          |   3 +-
 drivers/pci/pci.h                          |  15 --
 include/linux/pci.h                        |  16 ++
 13 files changed, 389 insertions(+), 34 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/5] s390/pci: refresh function handle in iomap
  2021-09-06  9:49 [PATCH 0/5] s390/pci: automatic error recovery Niklas Schnelle
@ 2021-09-06  9:49 ` Niklas Schnelle
  2021-09-06  9:49 ` [PATCH 2/5] s390/pci: implement reset_slot for hotplug slot Niklas Schnelle
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-06  9:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Linas Vepstas, Oliver O'Halloran, Russell Currey,
	linuxppc-dev, linux-kernel, linux-s390, Matthew Rosato,
	Pierre Morel

The function handle of a PCI function is updated when disabling or
enabling it as well as when the function's availability changes or it
enters the error state.

Until now this only occurred either while there is no struct pci_dev
associated with the function yet or the function became unavailable.
This meant that leaving a stale function handle in the iomap either
didn't happen because there was no iomap yet or it lead to errors on PCI
access but so would the correct disabled function handle.

In the future a CLP Set PCI Function Disable/Enable cycle during PCI
device recovery may be done while the device is bound to a driver.  In
this case we must update the iomap associated with the now-stale
function handle to ensure that the resulting zPCI instruction references
an accurate function handle.

Since the function handle is accessed by the PCI accessor helpers
without locking use READ_ONCE()/WRITE_ONCE() to mark this access and
prevent compiler optimizations that would move the load/store.

With that infrastructure in place let's also properly update the
function handle in the existing cases. This makes sure that in the
future debugging of a zPCI function access through the handle will
show an up to date handle reducing the chance of confusion. Also it
makes sure we have one single place where a zPCI function handle is
updated after initialization.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 arch/s390/include/asm/pci.h |  1 +
 arch/s390/pci/pci.c         | 36 ++++++++++++++++++++++++++++++++----
 arch/s390/pci/pci_event.c   |  6 +++---
 arch/s390/pci/pci_insn.c    |  4 ++--
 4 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index e4803ec51110..5e6cba22a801 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -211,6 +211,7 @@ int zpci_deconfigure_device(struct zpci_dev *zdev);
 int zpci_register_ioat(struct zpci_dev *, u8, u64, u64, u64);
 int zpci_unregister_ioat(struct zpci_dev *, u8);
 void zpci_remove_reserved_devices(void);
+void zpci_update_fh(struct zpci_dev *zdev, u32 fh);
 
 /* CLP */
 int clp_setup_writeback_mio(void);
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index e7e6788d75a8..af22778551c1 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -481,6 +481,34 @@ static void zpci_free_iomap(struct zpci_dev *zdev, int entry)
 	spin_unlock(&zpci_iomap_lock);
 }
 
+static void zpci_do_update_iomap_fh(struct zpci_dev *zdev, u32 fh)
+{
+	int bar, idx;
+
+	spin_lock(&zpci_iomap_lock);
+	for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+		if (!zdev->bars[bar].size)
+			continue;
+		idx = zdev->bars[bar].map_idx;
+		if (!zpci_iomap_start[idx].count)
+			continue;
+		WRITE_ONCE(zpci_iomap_start[idx].fh, zdev->fh);
+	}
+	spin_unlock(&zpci_iomap_lock);
+}
+
+void zpci_update_fh(struct zpci_dev *zdev, u32 fh)
+{
+	if (!fh || zdev->fh == fh)
+		return;
+
+	zdev->fh = fh;
+	if (zpci_use_mio(zdev))
+		return;
+	if (zdev->has_resources && zdev_enabled(zdev))
+		zpci_do_update_iomap_fh(zdev, fh);
+}
+
 static struct resource *__alloc_res(struct zpci_dev *zdev, unsigned long start,
 				    unsigned long size, unsigned long flags)
 {
@@ -668,7 +696,7 @@ int zpci_enable_device(struct zpci_dev *zdev)
 	if (clp_enable_fh(zdev, &fh, ZPCI_NR_DMA_SPACES))
 		rc = -EIO;
 	else
-		zdev->fh = fh;
+		zpci_update_fh(zdev, fh);
 	return rc;
 }
 
@@ -679,14 +707,14 @@ int zpci_disable_device(struct zpci_dev *zdev)
 
 	cc = clp_disable_fh(zdev, &fh);
 	if (!cc) {
-		zdev->fh = fh;
+		zpci_update_fh(zdev, fh);
 	} else if (cc == CLP_RC_SETPCIFN_ALRDY) {
 		pr_info("Disabling PCI function %08x had no effect as it was already disabled\n",
 			zdev->fid);
 		/* Function is already disabled - update handle */
 		rc = clp_refresh_fh(zdev->fid, &fh);
 		if (!rc) {
-			zdev->fh = fh;
+			zpci_update_fh(zdev, fh);
 			rc = -EINVAL;
 		}
 	} else {
@@ -768,7 +796,7 @@ int zpci_scan_configured_device(struct zpci_dev *zdev, u32 fh)
 {
 	int rc;
 
-	zdev->fh = fh;
+	zpci_update_fh(zdev, fh);
 	/* the PCI function will be scanned once function 0 appears */
 	if (!zdev->zbus->bus)
 		return 0;
diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c
index c856f80cb21b..e868d996ec5b 100644
--- a/arch/s390/pci/pci_event.c
+++ b/arch/s390/pci/pci_event.c
@@ -76,7 +76,7 @@ void zpci_event_error(void *data)
 
 static void zpci_event_hard_deconfigured(struct zpci_dev *zdev, u32 fh)
 {
-	zdev->fh = fh;
+	zpci_update_fh(zdev, fh);
 	/* Give the driver a hint that the function is
 	 * already unusable.
 	 */
@@ -117,7 +117,7 @@ static void __zpci_event_availability(struct zpci_ccdf_avail *ccdf)
 		if (!zdev)
 			zpci_create_device(ccdf->fid, ccdf->fh, ZPCI_FN_STATE_STANDBY);
 		else
-			zdev->fh = ccdf->fh;
+			zpci_update_fh(zdev, ccdf->fh);
 		break;
 	case 0x0303: /* Deconfiguration requested */
 		if (zdev) {
@@ -126,7 +126,7 @@ static void __zpci_event_availability(struct zpci_ccdf_avail *ccdf)
 			 */
 			if (zdev->state != ZPCI_FN_STATE_CONFIGURED)
 				break;
-			zdev->fh = ccdf->fh;
+			zpci_update_fh(zdev, ccdf->fh);
 			zpci_deconfigure_device(zdev);
 		}
 		break;
diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
index 2e43996159f0..28d863aaafea 100644
--- a/arch/s390/pci/pci_insn.c
+++ b/arch/s390/pci/pci_insn.c
@@ -163,7 +163,7 @@ static inline int zpci_load_fh(u64 *data, const volatile void __iomem *addr,
 			       unsigned long len)
 {
 	struct zpci_iomap_entry *entry = &zpci_iomap_start[ZPCI_IDX(addr)];
-	u64 req = ZPCI_CREATE_REQ(entry->fh, entry->bar, len);
+	u64 req = ZPCI_CREATE_REQ(READ_ONCE(entry->fh), entry->bar, len);
 
 	return __zpci_load(data, req, ZPCI_OFFSET(addr));
 }
@@ -244,7 +244,7 @@ static inline int zpci_store_fh(const volatile void __iomem *addr, u64 data,
 				unsigned long len)
 {
 	struct zpci_iomap_entry *entry = &zpci_iomap_start[ZPCI_IDX(addr)];
-	u64 req = ZPCI_CREATE_REQ(entry->fh, entry->bar, len);
+	u64 req = ZPCI_CREATE_REQ(READ_ONCE(entry->fh), entry->bar, len);
 
 	return __zpci_store(data, req, ZPCI_OFFSET(addr));
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 2/5] s390/pci: implement reset_slot for hotplug slot
  2021-09-06  9:49 [PATCH 0/5] s390/pci: automatic error recovery Niklas Schnelle
  2021-09-06  9:49 ` [PATCH 1/5] s390/pci: refresh function handle in iomap Niklas Schnelle
@ 2021-09-06  9:49 ` Niklas Schnelle
  2021-09-06  9:49 ` [PATCH 3/5] PCI: Move pci_dev_is/assign_added() to pci.h Niklas Schnelle
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-06  9:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Linas Vepstas, Oliver O'Halloran, Russell Currey,
	linuxppc-dev, linux-kernel, linux-s390, Matthew Rosato,
	Pierre Morel

This is done by adding a zpci_hot_reset_device() call which does a low
level reset of the PCI function without changing its higher level
function state. This way it can be used while the zPCI function is bound
to a driver and with DMA tables being controlled either through the
IOMMU or DMA APIs which is prohibited when using zpci_disable_device()
as that drop existing DMA translations.

As this reset, unlike a normal FLR, also calls zpci_clear_irq() we need
to implement arch_restore_msi_irqs() and make sure we re-enable IRQs for
the PCI function if they were previously disabled.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 arch/s390/include/asm/pci.h        |  1 +
 arch/s390/pci/pci.c                | 58 ++++++++++++++++++++++++++++++
 arch/s390/pci/pci_irq.c            |  9 +++++
 drivers/pci/hotplug/s390_pci_hpc.c | 24 +++++++++++++
 4 files changed, 92 insertions(+)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 5e6cba22a801..2a2ed165a270 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -208,6 +208,7 @@ int zpci_disable_device(struct zpci_dev *);
 int zpci_scan_configured_device(struct zpci_dev *zdev, u32 fh);
 int zpci_deconfigure_device(struct zpci_dev *zdev);
 
+int zpci_hot_reset_device(struct zpci_dev *zdev);
 int zpci_register_ioat(struct zpci_dev *, u8, u64, u64, u64);
 int zpci_unregister_ioat(struct zpci_dev *, u8);
 void zpci_remove_reserved_devices(void);
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index af22778551c1..a6322f45b5bd 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -723,6 +723,64 @@ int zpci_disable_device(struct zpci_dev *zdev)
 	return rc;
 }
 
+/**
+ * zpci_hot_reset_device - perform a reset of the given zPCI function
+ * @zdev: the slot which should be reset
+ *
+ * Performs a low level reset of the zPCI function. The reset is low level in
+ * the sense that the zPCI function can be reset without detaching it from the
+ * common PCI subsystem. The reset may be performed while under control of
+ * either DMA or IOMMU APIs in which case the existing DMA/IOMMU translation
+ * table is reinstated at the end of the reset.
+ *
+ * After the reset the functions internal state is reset to an initial state
+ * equivalent to its state during boot when first probing a driver.
+ * Consequently after reset the PCI function requires re-initialization via the
+ * common PCI code including re-enabling IRQs via pci_alloc_irq_vectors()
+ * and enabling the function via e.g.pci_enablde_device_flags().The caller
+ * must guard against concurrent reset attempts.
+ *
+ * In most cases this function should not be called directly but through
+ * pci_reset_function() or pci_reset_bus() which handle the save/restore and
+ * locking.
+ *
+ * Return: 0 on success and an error value otherwise
+ */
+int zpci_hot_reset_device(struct zpci_dev *zdev)
+{
+	int rc;
+
+	zpci_dbg(3, "reset fid:%x\n", zdev->fid);
+	if (zdev_enabled(zdev)) {
+		/* Disables device access, DMAs and IRQs (reset state) */
+		rc = zpci_disable_device(zdev);
+		/*
+		 * Due to a z/VM vs LPAR inconsistency in the error state the
+		 * FH may indicate an enabled device but disable says the
+		 * device is already disabled don't treat it as an error here.
+		 */
+		if (rc == -EINVAL)
+			rc = 0;
+		if (rc)
+			return rc;
+	}
+
+	rc = zpci_enable_device(zdev);
+	if (rc)
+		return rc;
+
+	if (zdev->dma_table) {
+		rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
+					(u64)zdev->dma_table);
+		if (rc)
+			return rc;
+	} else {
+		zpci_dma_init_device(zdev);
+	}
+
+	return 0;
+}
+
 /**
  * zpci_create_device() - Create a new zpci_dev and add it to the zbus
  * @fid: Function ID of the device to be created
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 9c7de9089939..ab98e7f5b79b 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -391,6 +391,15 @@ void arch_teardown_msi_irqs(struct pci_dev *pdev)
 		airq_iv_free(zpci_ibv[0], zdev->msi_first_bit, zdev->msi_nr_irqs);
 }
 
+void arch_restore_msi_irqs(struct pci_dev *pdev)
+{
+	struct zpci_dev *zdev = to_zpci(pdev);
+
+	if (!zdev->irqs_registered)
+		zpci_set_irq(zdev);
+	default_restore_msi_irqs(pdev);
+}
+
 static struct airq_struct zpci_airq = {
 	.handler = zpci_floating_irq_handler,
 	.isc = PCI_ISC,
diff --git a/drivers/pci/hotplug/s390_pci_hpc.c b/drivers/pci/hotplug/s390_pci_hpc.c
index 014868752cd4..07f28db0eed5 100644
--- a/drivers/pci/hotplug/s390_pci_hpc.c
+++ b/drivers/pci/hotplug/s390_pci_hpc.c
@@ -57,6 +57,29 @@ static int disable_slot(struct hotplug_slot *hotplug_slot)
 	return zpci_deconfigure_device(zdev);
 }
 
+static int reset_slot(struct hotplug_slot *hotplug_slot, int probe)
+{
+	struct zpci_dev *zdev = container_of(hotplug_slot, struct zpci_dev,
+					     hotplug_slot);
+
+	if (zdev->state != ZPCI_FN_STATE_CONFIGURED)
+		return -EIO;
+	/*
+	 * We can't take the zdev->lock as reset_slot may be called during
+	 * probing and/or device removal which already happens under the
+	 * zdev->lock. Instead the user should use the higher level
+	 * pci_reset_function() or pci_bus_reset() which hold the PCI device
+	 * lock preventing concurrent removal. If not using these functions
+	 * holding the PCI device lock is required.
+	 */
+
+	/* As long as the function is configured we can reset */
+	if (probe)
+		return 0;
+
+	return zpci_hot_reset_device(zdev);
+}
+
 static int get_power_status(struct hotplug_slot *hotplug_slot, u8 *value)
 {
 	struct zpci_dev *zdev = container_of(hotplug_slot, struct zpci_dev,
@@ -83,6 +106,7 @@ static int get_adapter_status(struct hotplug_slot *hotplug_slot, u8 *value)
 static const struct hotplug_slot_ops s390_hotplug_slot_ops = {
 	.enable_slot =		enable_slot,
 	.disable_slot =		disable_slot,
+	.reset_slot =		reset_slot,
 	.get_power_status =	get_power_status,
 	.get_adapter_status =	get_adapter_status,
 };
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 3/5] PCI: Move pci_dev_is/assign_added() to pci.h
  2021-09-06  9:49 [PATCH 0/5] s390/pci: automatic error recovery Niklas Schnelle
  2021-09-06  9:49 ` [PATCH 1/5] s390/pci: refresh function handle in iomap Niklas Schnelle
  2021-09-06  9:49 ` [PATCH 2/5] s390/pci: implement reset_slot for hotplug slot Niklas Schnelle
@ 2021-09-06  9:49 ` Niklas Schnelle
  2021-09-07  0:22   ` kernel test robot
  2021-09-07  0:25   ` kernel test robot
  2021-09-06  9:49 ` [PATCH 4/5] PCI: Export pci_dev_lock() Niklas Schnelle
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-06  9:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Linas Vepstas, Oliver O'Halloran, Russell Currey,
	linuxppc-dev, linux-kernel, linux-s390, Matthew Rosato,
	Pierre Morel

The helper function pci_dev_is_added() from drivers/pci/pci.h is used in
PCI arch code of both s390 and powerpc leading to awkward relative
includes. Move it to the global include/linux/pci.h and get rid of these
includes just for that one function.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-sriov.c |  3 ---
 arch/powerpc/platforms/pseries/setup.c     |  1 -
 arch/s390/pci/pci_sysfs.c                  |  2 --
 drivers/pci/hotplug/acpiphp_glue.c         |  1 -
 drivers/pci/pci.h                          | 15 ---------------
 include/linux/pci.h                        | 15 +++++++++++++++
 6 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
index 28aac933a439..2e0ca5451e85 100644
--- a/arch/powerpc/platforms/powernv/pci-sriov.c
+++ b/arch/powerpc/platforms/powernv/pci-sriov.c
@@ -9,9 +9,6 @@
 
 #include "pci.h"
 
-/* for pci_dev_is_added() */
-#include "../../../../drivers/pci/pci.h"
-
 /*
  * The majority of the complexity in supporting SR-IOV on PowerNV comes from
  * the need to put the MMIO space for each VF into a separate PE. Internally
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 0dfaa6ab44cc..08e846ae1853 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -74,7 +74,6 @@
 #include <asm/hvconsole.h>
 
 #include "pseries.h"
-#include "../../../../drivers/pci/pci.h"
 
 DEFINE_STATIC_KEY_FALSE(shared_processor);
 EXPORT_SYMBOL(shared_processor);
diff --git a/arch/s390/pci/pci_sysfs.c b/arch/s390/pci/pci_sysfs.c
index 335c281811c7..40733b93a086 100644
--- a/arch/s390/pci/pci_sysfs.c
+++ b/arch/s390/pci/pci_sysfs.c
@@ -13,8 +13,6 @@
 #include <linux/stat.h>
 #include <linux/pci.h>
 
-#include "../../../drivers/pci/pci.h"
-
 #include <asm/sclp.h>
 
 #define zpci_attr(name, fmt, member)					\
diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index f031302ad401..4cb963f88183 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -38,7 +38,6 @@
 #include <linux/slab.h>
 #include <linux/acpi.h>
 
-#include "../pci.h"
 #include "acpiphp.h"
 
 static LIST_HEAD(bridge_list);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 93dcdd431072..a159cd0f6f05 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -383,21 +383,6 @@ static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
 	return dev->error_state == pci_channel_io_perm_failure;
 }
 
-/* pci_dev priv_flags */
-#define PCI_DEV_ADDED 0
-#define PCI_DPC_RECOVERED 1
-#define PCI_DPC_RECOVERING 2
-
-static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
-{
-	assign_bit(PCI_DEV_ADDED, &dev->priv_flags, added);
-}
-
-static inline bool pci_dev_is_added(const struct pci_dev *dev)
-{
-	return test_bit(PCI_DEV_ADDED, &dev->priv_flags);
-}
-
 #ifdef CONFIG_PCIEAER
 #include <linux/aer.h>
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 540b377ca8f6..ea0e23dbc8ec 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -507,6 +507,21 @@ struct pci_dev {
 	unsigned long	priv_flags;	/* Private flags for the PCI driver */
 };
 
+/* pci_dev priv_flags */
+#define PCI_DEV_ADDED 0
+#define PCI_DPC_RECOVERED 1
+#define PCI_DPC_RECOVERING 2
+
+static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
+{
+	assign_bit(PCI_DEV_ADDED, &dev->priv_flags, added);
+}
+
+static inline bool pci_dev_is_added(const struct pci_dev *dev)
+{
+	return test_bit(PCI_DEV_ADDED, &dev->priv_flags);
+}
+
 static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
 {
 #ifdef CONFIG_PCI_IOV
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 4/5] PCI: Export pci_dev_lock()
  2021-09-06  9:49 [PATCH 0/5] s390/pci: automatic error recovery Niklas Schnelle
                   ` (2 preceding siblings ...)
  2021-09-06  9:49 ` [PATCH 3/5] PCI: Move pci_dev_is/assign_added() to pci.h Niklas Schnelle
@ 2021-09-06  9:49 ` Niklas Schnelle
  2021-09-06  9:49 ` [PATCH 5/5] s390/pci: implement minimal PCI error recovery Niklas Schnelle
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-06  9:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Linas Vepstas, Oliver O'Halloran, Russell Currey,
	linuxppc-dev, linux-kernel, linux-s390, Matthew Rosato,
	Pierre Morel

Commit e3a9b1212b9d ("PCI: Export pci_dev_trylock() and pci_dev_unlock()")
already exported pci_dev_trylock()/pci_dev_unlock() however in some
circumstances such as during error recovery it makes sense to block
waiting to get full access to the device so also export pci_dev_lock().

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 drivers/pci/pci.c   | 3 ++-
 include/linux/pci.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index aacf575c15cf..3f416c6d3b0b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5040,12 +5040,13 @@ static int pci_reset_bus_function(struct pci_dev *dev, int probe)
 	return pci_parent_bus_reset(dev, probe);
 }
 
-static void pci_dev_lock(struct pci_dev *dev)
+void pci_dev_lock(struct pci_dev *dev)
 {
 	pci_cfg_access_lock(dev);
 	/* block PM suspend, driver probe, etc. */
 	device_lock(&dev->dev);
 }
+EXPORT_SYMBOL_GPL(pci_dev_lock);
 
 /* Return 1 on successful lock, 0 on contention */
 int pci_dev_trylock(struct pci_dev *dev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ea0e23dbc8ec..efc78b8d4696 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1639,6 +1639,7 @@ void pci_cfg_access_lock(struct pci_dev *dev);
 bool pci_cfg_access_trylock(struct pci_dev *dev);
 void pci_cfg_access_unlock(struct pci_dev *dev);
 
+void pci_dev_lock(struct pci_dev *dev);
 int pci_dev_trylock(struct pci_dev *dev);
 void pci_dev_unlock(struct pci_dev *dev);
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 5/5] s390/pci: implement minimal PCI error recovery
  2021-09-06  9:49 [PATCH 0/5] s390/pci: automatic error recovery Niklas Schnelle
                   ` (3 preceding siblings ...)
  2021-09-06  9:49 ` [PATCH 4/5] PCI: Export pci_dev_lock() Niklas Schnelle
@ 2021-09-06  9:49 ` Niklas Schnelle
  2021-09-07  2:04 ` [PATCH 0/5] s390/pci: automatic " Oliver O'Halloran
       [not found] ` <CAHrUA34TK6U4TB34FHejott9TdFvSgAedOpmro-Uj2ZwnvzecQ@mail.gmail.com>
  6 siblings, 0 replies; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-06  9:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Linas Vepstas, Oliver O'Halloran, Russell Currey,
	linuxppc-dev, linux-kernel, linux-s390, Matthew Rosato,
	Pierre Morel

When the platform detects an error on a PCI function or a service action
has been performed it is put in the error state and an error event
notification is provided to the OS.

Currently we treat all error event notifications the same and simply set
pdev->error_state = pci_channel_io_perm_failure requiring user
intervention such as use of the recover attribute to get the device
usable again. Despite requiring a manual step this also has the
disadvantage that the device is completely torn down and recreated
resulting in higher level devices such as a block or network device
being recreated. In case of a block device this also means that it may
need to be removed and added to a software raid even if that could
otherwise survive with a temporary degradation.

This is of course not ideal more so since an error notification with PEC
0x3A indicates that the platform already performed error recovery
successfully or that the error state was caused by a service action that
is now finished.

At least in this case we can assume that the error state can be reset
and the function made usable again. So as not to have the disadvantage
of a full tear down and recreation we need to coordinate this recovery
with the driver. Thankfully there is already a well defined recovery
flow for this described in Documentation/PCI/pci-error-recovery.rst.

The implementation of this is somewhat straight forward and simplified
by the fact that our recovery flow is defined per PCI function. As
a reset we use the newly introduced zpci_hot_reset_device() which also
takes the PCI function out of the error state.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 arch/s390/include/asm/pci.h |   4 +-
 arch/s390/pci/pci.c         |  49 ++++++++++
 arch/s390/pci/pci_event.c   | 190 +++++++++++++++++++++++++++++++++++-
 3 files changed, 241 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 2a2ed165a270..558877aff2e5 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -294,8 +294,10 @@ void zpci_debug_exit(void);
 void zpci_debug_init_device(struct zpci_dev *, const char *);
 void zpci_debug_exit_device(struct zpci_dev *);
 
-/* Error reporting */
+/* Error handling */
 int zpci_report_error(struct pci_dev *, struct zpci_report_error_header *);
+int zpci_clear_error_state(struct zpci_dev *zdev);
+int zpci_reset_load_store_blocked(struct zpci_dev *zdev);
 
 #ifdef CONFIG_NUMA
 
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index a6322f45b5bd..77a3e85d43fb 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -954,6 +954,55 @@ int zpci_report_error(struct pci_dev *pdev,
 }
 EXPORT_SYMBOL(zpci_report_error);
 
+/**
+ * zpci_clear_error_state() - Clears the zPCI error state of the device
+ * @zdev: The zdev for which the zPCI error state should be reset
+ *
+ * Clear the zPCI error state of the device. If clearing the zPCI error state
+ * fails the device is left in the error state. In this case it may make sense
+ * to call zpci_io_perm_failure() on the associated pdev if it exists.
+ *
+ * Returns: 0 on success, -EIO otherwise
+ */
+int zpci_clear_error_state(struct zpci_dev *zdev)
+{
+	u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_RESET_ERROR);
+	struct zpci_fib fib = {0};
+	u8 status;
+	int rc;
+
+	rc = zpci_mod_fc(req, &fib, &status);
+	if (rc)
+		return -EIO;
+
+	return 0;
+}
+
+/**
+ * zpci_reset_load_store_blocked() - Re-enables L/S from error state
+ * @zdev: The zdev for which to unblock load/store access
+ *
+ * R-eenables load/store access for a PCI function in the error state while
+ * keeping DMA blocked. In this state drivers can poke MMIO space to determine
+ * if error recovery is possible while catching any rogue DMA access from the
+ * device.
+ *
+ * Returns: 0 on success, -EIO otherwise
+ */
+int zpci_reset_load_store_blocked(struct zpci_dev *zdev)
+{
+	u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_RESET_BLOCK);
+	struct zpci_fib fib = {0};
+	u8 status;
+	int rc;
+
+	rc = zpci_mod_fc(req, &fib, &status);
+	if (rc)
+		return -EIO;
+
+	return 0;
+}
+
 static int zpci_mem_init(void)
 {
 	BUILD_BUG_ON(!is_power_of_2(__alignof__(struct zpci_fmb)) ||
diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c
index e868d996ec5b..ac9ed1572d39 100644
--- a/arch/s390/pci/pci_event.c
+++ b/arch/s390/pci/pci_event.c
@@ -47,15 +47,190 @@ struct zpci_ccdf_avail {
 	u16 pec;			/* PCI event code */
 } __packed;
 
+static inline bool ers_result_indicates_abort(pci_ers_result_t ers_res)
+{
+	switch (ers_res) {
+	case PCI_ERS_RESULT_CAN_RECOVER:
+	case PCI_ERS_RESULT_RECOVERED:
+	case PCI_ERS_RESULT_NEED_RESET:
+		return false;
+	default:
+		return true;
+	}
+}
+
+static pci_ers_result_t zpci_event_notify_error_detected(struct pci_dev *pdev)
+{
+	pci_ers_result_t ers_res = PCI_ERS_RESULT_DISCONNECT;
+	struct pci_driver *driver = pdev->driver;
+
+	pr_debug("%s: calling error_detected() callback\n", pci_name(pdev));
+	ers_res = driver->err_handler->error_detected(pdev,  pdev->error_state);
+	if (ers_result_indicates_abort(ers_res))
+		pr_info("%s: driver can't recover\n", pci_name(pdev));
+	else if (ers_res == PCI_ERS_RESULT_NEED_RESET)
+		pr_debug("%s: driver needs reset to recover\n", pci_name(pdev));
+
+	return ers_res;
+}
+
+static pci_ers_result_t zpci_event_do_error_state_clear(struct pci_dev *pdev)
+{
+	pci_ers_result_t ers_res = PCI_ERS_RESULT_DISCONNECT;
+	struct pci_driver *driver = pdev->driver;
+	struct zpci_dev *zdev = to_zpci(pdev);
+	int rc;
+
+	pr_debug("%s: reset load/store blocked\n", pci_name(pdev));
+	rc = zpci_reset_load_store_blocked(zdev);
+	if (rc) {
+		pr_err("%s: reset load/store blocked failed\n", pci_name(pdev));
+		/* Let's try a full reset instead */
+		return PCI_ERS_RESULT_NEED_RESET;
+	}
+
+	if (driver->err_handler->mmio_enabled) {
+		pr_debug("%s: calling mmio_enabled() callback\n", pci_name(pdev));
+		ers_res = driver->err_handler->mmio_enabled(pdev);
+		if (ers_result_indicates_abort(ers_res)) {
+			pr_info("%s: driver can't recover after enabling MMIO\n", pci_name(pdev));
+			return ers_res;
+		} else if (ers_res == PCI_ERS_RESULT_NEED_RESET) {
+			pr_debug("%s: driver needs reset to recover\n", pci_name(pdev));
+			return ers_res;
+		}
+	}
+
+	pr_debug("%s: clearing error state\n", pci_name(pdev));
+	rc = zpci_clear_error_state(zdev);
+	if (!rc) {
+		pdev->error_state = pci_channel_io_normal;
+	} else {
+		pr_err("%s: resetting error state failed\n", pci_name(pdev));
+		/* Let's try a full reset instead */
+		return PCI_ERS_RESULT_NEED_RESET;
+	}
+
+	return ers_res;
+}
+
+static pci_ers_result_t zpci_event_do_reset(struct pci_dev *pdev)
+{
+	pci_ers_result_t ers_res = PCI_ERS_RESULT_DISCONNECT;
+	struct pci_driver *driver = pdev->driver;
+
+	pr_info("%s: initiating reset\n", pci_name(pdev));
+	if (zpci_hot_reset_device(to_zpci(pdev))) {
+		pr_err("%s: resetting function failed\n", pci_name(pdev));
+		return ers_res;
+	}
+	pdev->error_state = pci_channel_io_normal;
+	if (driver->err_handler->slot_reset) {
+		ers_res = driver->err_handler->slot_reset(pdev);
+		if (ers_result_indicates_abort(ers_res)) {
+			pr_info("%s: driver can't recover after slot reset\n", pci_name(pdev));
+			return ers_res;
+		}
+	}
+
+	return ers_res;
+}
+
+/* zpci_event_attempt_error_recovery - Try to recover the given PCI function
+ * @pdev: PCI function to recover currently in the error state
+ *
+ * We follow the scheme outlined in Documentation/PCI/pci-error-recovery.rst.
+ * With the simplification that recovery always happens per function
+ * and the platform determines which functions are affected for
+ * multi-function devices.
+ */
+static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev)
+{
+	pci_ers_result_t ers_res = PCI_ERS_RESULT_DISCONNECT;
+	struct pci_driver *driver;
+
+	/*
+	 * Ensure that the PCI function is not removed concurrently, no driver
+	 * is unbound or probed and that userspace can't access its
+	 * configuration space while we perform recovery.
+	 */
+	pci_dev_lock(pdev);
+	/*
+	 * Between getting the pdev and locking it the PCI device may have been
+	 * removed e.g. by a concurrent call to recover_store().
+	 */
+	if (!pci_dev_is_added(pdev))
+		goto out_unlock;
+	if (pdev->error_state == pci_channel_io_perm_failure) {
+		ers_res = PCI_ERS_RESULT_DISCONNECT;
+		goto out_unlock;
+	}
+	pdev->error_state = pci_channel_io_frozen;
+
+	driver = pdev->driver;
+	if (!driver || !driver->err_handler || !driver->err_handler->error_detected) {
+		pr_info("%s: driver does not support error recovery\n", pci_name(pdev));
+		goto out_unlock;
+	}
+
+	ers_res = zpci_event_notify_error_detected(pdev);
+	if (ers_result_indicates_abort(ers_res))
+		goto out_unlock;
+
+	if (ers_res == PCI_ERS_RESULT_CAN_RECOVER) {
+		ers_res = zpci_event_do_error_state_clear(pdev);
+		if (ers_result_indicates_abort(ers_res))
+			goto out_unlock;
+	}
+
+	if (ers_res == PCI_ERS_RESULT_NEED_RESET)
+		ers_res = zpci_event_do_reset(pdev);
+
+	if (ers_res != PCI_ERS_RESULT_RECOVERED) {
+		pr_err("%s: recovery failed\n", pci_name(pdev));
+		goto out_unlock;
+	}
+
+	pr_info("%s: resuming operations\n", pci_name(pdev));
+	if (driver->err_handler->resume)
+		driver->err_handler->resume(pdev);
+out_unlock:
+	pci_dev_unlock(pdev);
+
+	return ers_res;
+}
+
+/* zpci_event_io_failure - Report IO failure state es to driver
+ * @pdev: PCI function to report as failed
+ */
+static void zpci_event_io_failure(struct pci_dev *pdev, pci_channel_state_t es)
+{
+	struct pci_driver *driver;
+
+	pci_dev_lock(pdev);
+	if (!pci_dev_is_added(pdev))
+		goto out_unlock;
+	pdev->error_state = es;
+	driver = pdev->driver;
+	if (driver && driver->err_handler && driver->err_handler->error_detected)
+		driver->err_handler->error_detected(pdev, pdev->error_state);
+out_unlock:
+	pci_dev_unlock(pdev);
+}
+
 static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
 {
 	struct zpci_dev *zdev = get_zdev_by_fid(ccdf->fid);
 	struct pci_dev *pdev = NULL;
+	pci_ers_result_t ers_res;
 
 	zpci_err("error CCDF:\n");
 	zpci_err_hex(ccdf, sizeof(*ccdf));
 
 	if (zdev)
+		zpci_update_fh(zdev, ccdf->fh);
+
+	if (zdev->zbus->bus)
 		pdev = pci_get_slot(zdev->zbus->bus, zdev->devfn);
 
 	pr_err("%s: Event 0x%x reports an error for PCI function 0x%x\n",
@@ -64,7 +239,20 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
 	if (!pdev)
 		return;
 
-	pdev->error_state = pci_channel_io_perm_failure;
+	switch (ccdf->pec) {
+	case 0x003a: /* Service Action or Error Recovery Successful */
+		ers_res = zpci_event_attempt_error_recovery(pdev);
+		if (ers_res != PCI_ERS_RESULT_RECOVERED)
+			zpci_event_io_failure(pdev, pci_channel_io_perm_failure);
+		break;
+	default:
+		/*
+		 * Mark as frozen not permanently failed because the device
+		 * could be subsequently recovered by the platform.
+		 */
+		zpci_event_io_failure(pdev, pci_channel_io_frozen);
+		break;
+	}
 	pci_dev_put(pdev);
 }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/5] PCI: Move pci_dev_is/assign_added() to pci.h
  2021-09-06  9:49 ` [PATCH 3/5] PCI: Move pci_dev_is/assign_added() to pci.h Niklas Schnelle
@ 2021-09-07  0:22   ` kernel test robot
  2021-09-07  0:25   ` kernel test robot
  1 sibling, 0 replies; 17+ messages in thread
From: kernel test robot @ 2021-09-07  0:22 UTC (permalink / raw)
  To: Niklas Schnelle, Bjorn Helgaas
  Cc: llvm, kbuild-all, Linas Vepstas, Oliver O'Halloran,
	Russell Currey, linuxppc-dev, linux-kernel, linux-s390,
	Matthew Rosato, Pierre Morel

[-- Attachment #1: Type: text/plain, Size: 28550 bytes --]

Hi Niklas,

I love your patch! Yet something to improve:

[auto build test ERROR on s390/features]
[also build test ERROR on next-20210906]
[cannot apply to pci/next powerpc/next v5.14]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Niklas-Schnelle/s390-pci-automatic-error-recovery/20210906-175309
base:   https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
config: i386-randconfig-a016-20210906 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 6fe2beba7d2a41964af658c8c59dd172683ef739)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/404ed8c00a612e7ae31c50557c80c6726c464863
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Niklas-Schnelle/s390-pci-automatic-error-recovery/20210906-175309
        git checkout 404ed8c00a612e7ae31c50557c80c6726c464863
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers/pci/hotplug/acpiphp_glue.c:330:6: error: implicit declaration of function 'pci_bus_read_dev_vendor_id' [-Werror,-Wimplicit-function-declaration]
           if (pci_bus_read_dev_vendor_id(pbus, PCI_DEVFN(device, function),
               ^
>> drivers/pci/hotplug/acpiphp_glue.c:505:6: error: implicit declaration of function '__pci_bus_size_bridges' [-Werror,-Wimplicit-function-declaration]
                                           __pci_bus_size_bridges(dev->subordinate,
                                           ^
   drivers/pci/hotplug/acpiphp_glue.c:505:6: note: did you mean 'pci_bus_size_bridges'?
   include/linux/pci.h:1336:6: note: 'pci_bus_size_bridges' declared here
   void pci_bus_size_bridges(struct pci_bus *bus);
        ^
>> drivers/pci/hotplug/acpiphp_glue.c:510:3: error: implicit declaration of function '__pci_bus_assign_resources' [-Werror,-Wimplicit-function-declaration]
                   __pci_bus_assign_resources(bus, &add_list, NULL);
                   ^
   drivers/pci/hotplug/acpiphp_glue.c:510:3: note: did you mean 'pci_bus_assign_resources'?
   include/linux/pci.h:1334:6: note: 'pci_bus_assign_resources' declared here
   void pci_bus_assign_resources(const struct pci_bus *bus);
        ^
   drivers/pci/hotplug/acpiphp_glue.c:604:8: error: implicit declaration of function 'pci_bus_read_dev_vendor_id' [-Werror,-Wimplicit-function-declaration]
                           if (pci_bus_read_dev_vendor_id(slot->bus,
                               ^
   drivers/pci/hotplug/acpiphp_glue.c:619:7: error: implicit declaration of function 'pci_bus_read_dev_vendor_id' [-Werror,-Wimplicit-function-declaration]
                   if (pci_bus_read_dev_vendor_id(slot->bus,
                       ^
>> drivers/pci/hotplug/acpiphp_glue.c:660:3: error: implicit declaration of function 'pci_dev_set_disconnected' [-Werror,-Wimplicit-function-declaration]
                   pci_dev_set_disconnected(dev, NULL);
                   ^
>> drivers/pci/hotplug/acpiphp_glue.c:661:7: error: implicit declaration of function 'pci_has_subordinate' [-Werror,-Wimplicit-function-declaration]
                   if (pci_has_subordinate(dev))
                       ^
   7 errors generated.


vim +/pci_bus_read_dev_vendor_id +330 drivers/pci/hotplug/acpiphp_glue.c

4e8662bbd680c5 Kristen Accardi   2006-06-28  217  
3799c5a032aefb Rafael J. Wysocki 2014-02-16  218  /**
3799c5a032aefb Rafael J. Wysocki 2014-02-16  219   * acpiphp_add_context - Add ACPIPHP context to an ACPI device object.
3799c5a032aefb Rafael J. Wysocki 2014-02-16  220   * @handle: ACPI handle of the object to add a context to.
3799c5a032aefb Rafael J. Wysocki 2014-02-16  221   * @lvl: Not used.
3799c5a032aefb Rafael J. Wysocki 2014-02-16  222   * @data: The object's parent ACPIPHP bridge.
3799c5a032aefb Rafael J. Wysocki 2014-02-16  223   * @rv: Not used.
3799c5a032aefb Rafael J. Wysocki 2014-02-16  224   */
3799c5a032aefb Rafael J. Wysocki 2014-02-16  225  static acpi_status acpiphp_add_context(acpi_handle handle, u32 lvl, void *data,
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  226  				       void **rv)
^1da177e4c3f41 Linus Torvalds    2005-04-16  227  {
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  228  	struct acpiphp_bridge *bridge = data;
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  229  	struct acpiphp_context *context;
bbcbfc0eed6220 Rafael J. Wysocki 2014-02-04  230  	struct acpi_device *adev;
^1da177e4c3f41 Linus Torvalds    2005-04-16  231  	struct acpiphp_slot *slot;
^1da177e4c3f41 Linus Torvalds    2005-04-16  232  	struct acpiphp_func *newfunc;
^1da177e4c3f41 Linus Torvalds    2005-04-16  233  	acpi_status status = AE_OK;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  234  	unsigned long long adr;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  235  	int device, function;
e8c331e963c58b Kenji Kaneshige   2008-12-17  236  	struct pci_bus *pbus = bridge->pci_bus;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  237  	struct pci_dev *pdev = bridge->pci_dev;
3b63aaa70e1ccc Jiang Liu         2013-04-12  238  	u32 val;
^1da177e4c3f41 Linus Torvalds    2005-04-16  239  
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  240  	status = acpi_evaluate_integer(handle, "_ADR", NULL, &adr);
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  241  	if (ACPI_FAILURE(status)) {
f26ca1d699e8b5 Toshi Kani        2013-11-27  242  		if (status != AE_NOT_FOUND)
f26ca1d699e8b5 Toshi Kani        2013-11-27  243  			acpi_handle_warn(handle,
f26ca1d699e8b5 Toshi Kani        2013-11-27  244  				"can't evaluate _ADR (%#x)\n", status);
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  245  		return AE_OK;
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  246  	}
bbcbfc0eed6220 Rafael J. Wysocki 2014-02-04  247  	if (acpi_bus_get_device(handle, &adev))
bbcbfc0eed6220 Rafael J. Wysocki 2014-02-04  248  		return AE_OK;
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  249  
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  250  	device = (adr >> 16) & 0xffff;
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  251  	function = adr & 0xffff;
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  252  
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  253  	acpi_lock_hp_context();
bbcbfc0eed6220 Rafael J. Wysocki 2014-02-04  254  	context = acpiphp_init_context(adev);
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  255  	if (!context) {
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  256  		acpi_unlock_hp_context();
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  257  		acpi_handle_err(handle, "No hotplug context\n");
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  258  		return AE_NOT_EXIST;
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  259  	}
bd4674dfc5fc70 Rafael J. Wysocki 2013-07-13  260  	newfunc = &context->func;
bd4674dfc5fc70 Rafael J. Wysocki 2013-07-13  261  	newfunc->function = function;
bda46dbb6626c9 Rafael J. Wysocki 2013-07-13  262  	newfunc->parent = bridge;
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  263  	acpi_unlock_hp_context();
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  264  
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  265  	/*
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  266  	 * If this is a dock device, its _EJ0 should be executed by the dock
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  267  	 * notify handler after calling _DCK.
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  268  	 */
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  269  	if (!is_dock_device(adev) && acpi_has_method(handle, "_EJ0"))
^1da177e4c3f41 Linus Torvalds    2005-04-16  270  		newfunc->flags = FUNC_HAS_EJ0;
^1da177e4c3f41 Linus Torvalds    2005-04-16  271  
ecd046da57d332 Jiang Liu         2013-06-29  272  	if (acpi_has_method(handle, "_STA"))
^1da177e4c3f41 Linus Torvalds    2005-04-16  273  		newfunc->flags |= FUNC_HAS_STA;
^1da177e4c3f41 Linus Torvalds    2005-04-16  274  
^1da177e4c3f41 Linus Torvalds    2005-04-16  275  	/* search for objects that share the same slot */
ad41dd9dd0c8ca Yijing Wang       2013-04-12  276  	list_for_each_entry(slot, &bridge->slots, node)
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  277  		if (slot->device == device)
ac372338b75064 Rafael J. Wysocki 2013-07-13  278  			goto slot_found;
^1da177e4c3f41 Linus Torvalds    2005-04-16  279  
f5afe8064f3087 Eric Sesterhenn   2006-02-28  280  	slot = kzalloc(sizeof(struct acpiphp_slot), GFP_KERNEL);
^1da177e4c3f41 Linus Torvalds    2005-04-16  281  	if (!slot) {
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  282  		acpi_lock_hp_context();
146fc68a4bdd78 Rafael J. Wysocki 2014-02-04  283  		acpiphp_put_context(context);
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  284  		acpi_unlock_hp_context();
146fc68a4bdd78 Rafael J. Wysocki 2014-02-04  285  		return AE_NO_MEMORY;
^1da177e4c3f41 Linus Torvalds    2005-04-16  286  	}
^1da177e4c3f41 Linus Torvalds    2005-04-16  287  
bda46dbb6626c9 Rafael J. Wysocki 2013-07-13  288  	slot->bus = bridge->pci_bus;
^1da177e4c3f41 Linus Torvalds    2005-04-16  289  	slot->device = device;
^1da177e4c3f41 Linus Torvalds    2005-04-16  290  	INIT_LIST_HEAD(&slot->funcs);
^1da177e4c3f41 Linus Torvalds    2005-04-16  291  
ad41dd9dd0c8ca Yijing Wang       2013-04-12  292  	list_add_tail(&slot->node, &bridge->slots);
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  293  
cc6254e00eb676 Rafael J. Wysocki 2014-02-16  294  	/*
cc6254e00eb676 Rafael J. Wysocki 2014-02-16  295  	 * Expose slots to user space for functions that have _EJ0 or _RMV or
cc6254e00eb676 Rafael J. Wysocki 2014-02-16  296  	 * are located in dock stations.  Do not expose them for devices handled
84c8b58ed3addf Mika Westerberg   2018-05-29  297  	 * by the native PCIe hotplug (PCIeHP) or standard PCI hotplug
84c8b58ed3addf Mika Westerberg   2018-05-29  298  	 * (SHPCHP), because that code is supposed to expose slots to user
84c8b58ed3addf Mika Westerberg   2018-05-29  299  	 * space in those cases.
cc6254e00eb676 Rafael J. Wysocki 2014-02-16  300  	 */
3b52b21fa1f44c Rafael J. Wysocki 2014-02-21  301  	if ((acpi_pci_check_ejectable(pbus, handle) || is_dock_device(adev))
84c8b58ed3addf Mika Westerberg   2018-05-29  302  	    && !(pdev && hotplug_is_native(pdev))) {
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  303  		unsigned long long sun;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  304  		int retval;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  305  
^1da177e4c3f41 Linus Torvalds    2005-04-16  306  		bridge->nr_slots++;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  307  		status = acpi_evaluate_integer(handle, "_SUN", NULL, &sun);
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  308  		if (ACPI_FAILURE(status))
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  309  			sun = bridge->nr_slots;
^1da177e4c3f41 Linus Torvalds    2005-04-16  310  
bd950799d9510c Lan Tianyu        2013-09-24  311  		pr_debug("found ACPI PCI Hotplug slot %llu at PCI %04x:%02x:%02x\n",
7342798d0ab850 Rafael J. Wysocki 2013-07-13  312  		    sun, pci_domain_nr(pbus), pbus->number, device);
ac372338b75064 Rafael J. Wysocki 2013-07-13  313  
7342798d0ab850 Rafael J. Wysocki 2013-07-13  314  		retval = acpiphp_register_hotplug_slot(slot, sun);
e27da381417038 MUNEDA Takahiro   2006-02-23  315  		if (retval) {
1aaac07112f040 Rafael J. Wysocki 2013-08-17  316  			slot->slot = NULL;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  317  			bridge->nr_slots--;
f46753c5e354b8 Alex Chiang       2008-06-10  318  			if (retval == -EBUSY)
227f06470502c4 Ryan Desfosses    2014-04-18  319  				pr_warn("Slot %llu already registered by another hotplug driver\n", sun);
f46753c5e354b8 Alex Chiang       2008-06-10  320  			else
227f06470502c4 Ryan Desfosses    2014-04-18  321  				pr_warn("acpiphp_register_hotplug_slot failed (err code = 0x%x)\n", retval);
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  322  		}
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  323  		/* Even if the slot registration fails, we can still use it. */
e27da381417038 MUNEDA Takahiro   2006-02-23  324  	}
^1da177e4c3f41 Linus Torvalds    2005-04-16  325  
ac372338b75064 Rafael J. Wysocki 2013-07-13  326   slot_found:
^1da177e4c3f41 Linus Torvalds    2005-04-16  327  	newfunc->slot = slot;
^1da177e4c3f41 Linus Torvalds    2005-04-16  328  	list_add_tail(&newfunc->sibling, &slot->funcs);
^1da177e4c3f41 Linus Torvalds    2005-04-16  329  
3b63aaa70e1ccc Jiang Liu         2013-04-12 @330  	if (pci_bus_read_dev_vendor_id(pbus, PCI_DEVFN(device, function),
3b63aaa70e1ccc Jiang Liu         2013-04-12  331  				       &val, 60*1000))
bc805a55392a7c Rafael J. Wysocki 2013-07-13  332  		slot->flags |= SLOT_ENABLED;
^1da177e4c3f41 Linus Torvalds    2005-04-16  333  
2e862c51904ddd Rafael J. Wysocki 2013-07-13  334  	return AE_OK;
^1da177e4c3f41 Linus Torvalds    2005-04-16  335  }
^1da177e4c3f41 Linus Torvalds    2005-04-16  336  
364d5094a43ff2 Rajesh Shah       2005-04-28  337  static void cleanup_bridge(struct acpiphp_bridge *bridge)
^1da177e4c3f41 Linus Torvalds    2005-04-16  338  {
3d54a3160fb6ba Jiang Liu         2013-04-12  339  	struct acpiphp_slot *slot;
3d54a3160fb6ba Jiang Liu         2013-04-12  340  	struct acpiphp_func *func;
42f49a6ae5dca9 Rajesh Shah       2005-04-28  341  
3d54a3160fb6ba Jiang Liu         2013-04-12  342  	list_for_each_entry(slot, &bridge->slots, node) {
3d54a3160fb6ba Jiang Liu         2013-04-12  343  		list_for_each_entry(func, &slot->funcs, sibling) {
1a699476e25814 Rafael J. Wysocki 2014-02-06  344  			struct acpi_device *adev = func_to_acpi_device(func);
5a3bc573ae32a7 Rafael J. Wysocki 2013-07-13  345  
1a699476e25814 Rafael J. Wysocki 2014-02-06  346  			acpi_lock_hp_context();
be27b3dcb02335 Rafael J. Wysocki 2014-02-21  347  			adev->hp->notify = NULL;
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  348  			adev->hp->fixup = NULL;
1a699476e25814 Rafael J. Wysocki 2014-02-06  349  			acpi_unlock_hp_context();
42f49a6ae5dca9 Rajesh Shah       2005-04-28  350  		}
9217a984671e8a Rafael J. Wysocki 2014-01-10  351  		slot->flags |= SLOT_IS_GOING_AWAY;
1aaac07112f040 Rafael J. Wysocki 2013-08-17  352  		if (slot->slot)
e27da381417038 MUNEDA Takahiro   2006-02-23  353  			acpiphp_unregister_hotplug_slot(slot);
42f49a6ae5dca9 Rajesh Shah       2005-04-28  354  	}
42f49a6ae5dca9 Rajesh Shah       2005-04-28  355  
3d54a3160fb6ba Jiang Liu         2013-04-12  356  	mutex_lock(&bridge_mutex);
42f49a6ae5dca9 Rajesh Shah       2005-04-28  357  	list_del(&bridge->list);
3d54a3160fb6ba Jiang Liu         2013-04-12  358  	mutex_unlock(&bridge_mutex);
9217a984671e8a Rafael J. Wysocki 2014-01-10  359  
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  360  	acpi_lock_hp_context();
9217a984671e8a Rafael J. Wysocki 2014-01-10  361  	bridge->is_going_away = true;
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  362  	acpi_unlock_hp_context();
^1da177e4c3f41 Linus Torvalds    2005-04-16  363  }
^1da177e4c3f41 Linus Torvalds    2005-04-16  364  
15a1ae74879925 Kristen Accardi   2006-02-23  365  /**
26e6c66e47fe7f Randy Dunlap      2007-11-28  366   * acpiphp_max_busnr - return the highest reserved bus number under the given bus.
15a1ae74879925 Kristen Accardi   2006-02-23  367   * @bus: bus to start search with
15a1ae74879925 Kristen Accardi   2006-02-23  368   */
15a1ae74879925 Kristen Accardi   2006-02-23  369  static unsigned char acpiphp_max_busnr(struct pci_bus *bus)
15a1ae74879925 Kristen Accardi   2006-02-23  370  {
c6f0d5adc21e2d Yijing Wang       2014-02-13  371  	struct pci_bus *tmp;
15a1ae74879925 Kristen Accardi   2006-02-23  372  	unsigned char max, n;
15a1ae74879925 Kristen Accardi   2006-02-23  373  
15a1ae74879925 Kristen Accardi   2006-02-23  374  	/*
15a1ae74879925 Kristen Accardi   2006-02-23  375  	 * pci_bus_max_busnr will return the highest
15a1ae74879925 Kristen Accardi   2006-02-23  376  	 * reserved busnr for all these children.
15a1ae74879925 Kristen Accardi   2006-02-23  377  	 * that is equivalent to the bus->subordinate
15a1ae74879925 Kristen Accardi   2006-02-23  378  	 * value.  We don't want to use the parent's
15a1ae74879925 Kristen Accardi   2006-02-23  379  	 * bus->subordinate value because it could have
15a1ae74879925 Kristen Accardi   2006-02-23  380  	 * padding in it.
15a1ae74879925 Kristen Accardi   2006-02-23  381  	 */
b918c62e086b21 Yinghai Lu        2012-05-17  382  	max = bus->busn_res.start;
15a1ae74879925 Kristen Accardi   2006-02-23  383  
c6f0d5adc21e2d Yijing Wang       2014-02-13  384  	list_for_each_entry(tmp, &bus->children, node) {
c6f0d5adc21e2d Yijing Wang       2014-02-13  385  		n = pci_bus_max_busnr(tmp);
15a1ae74879925 Kristen Accardi   2006-02-23  386  		if (n > max)
15a1ae74879925 Kristen Accardi   2006-02-23  387  			max = n;
15a1ae74879925 Kristen Accardi   2006-02-23  388  	}
15a1ae74879925 Kristen Accardi   2006-02-23  389  	return max;
15a1ae74879925 Kristen Accardi   2006-02-23  390  }
15a1ae74879925 Kristen Accardi   2006-02-23  391  
d06070509147c9 Shaohua Li        2010-02-25  392  static void acpiphp_set_acpi_region(struct acpiphp_slot *slot)
d06070509147c9 Shaohua Li        2010-02-25  393  {
d06070509147c9 Shaohua Li        2010-02-25  394  	struct acpiphp_func *func;
d06070509147c9 Shaohua Li        2010-02-25  395  
d06070509147c9 Shaohua Li        2010-02-25  396  	list_for_each_entry(func, &slot->funcs, sibling) {
d06070509147c9 Shaohua Li        2010-02-25  397  		/* _REG is optional, we don't care about if there is failure */
6dd10c47e91239 Hans de Goede     2020-05-07  398  		acpi_evaluate_reg(func_to_handle(func),
6dd10c47e91239 Hans de Goede     2020-05-07  399  				  ACPI_ADR_SPACE_PCI_CONFIG,
6dd10c47e91239 Hans de Goede     2020-05-07  400  				  ACPI_REG_CONNECT);
d06070509147c9 Shaohua Li        2010-02-25  401  	}
d06070509147c9 Shaohua Li        2010-02-25  402  }
d06070509147c9 Shaohua Li        2010-02-25  403  
1f96a965e30d09 Yinghai Lu        2013-01-21  404  static void check_hotplug_bridge(struct acpiphp_slot *slot, struct pci_dev *dev)
1f96a965e30d09 Yinghai Lu        2013-01-21  405  {
1f96a965e30d09 Yinghai Lu        2013-01-21  406  	struct acpiphp_func *func;
1f96a965e30d09 Yinghai Lu        2013-01-21  407  
1f96a965e30d09 Yinghai Lu        2013-01-21  408  	/* quirk, or pcie could set it already */
1f96a965e30d09 Yinghai Lu        2013-01-21  409  	if (dev->is_hotplug_bridge)
1f96a965e30d09 Yinghai Lu        2013-01-21  410  		return;
1f96a965e30d09 Yinghai Lu        2013-01-21  411  
1f96a965e30d09 Yinghai Lu        2013-01-21  412  	list_for_each_entry(func, &slot->funcs, sibling) {
1f96a965e30d09 Yinghai Lu        2013-01-21  413  		if (PCI_FUNC(dev->devfn) == func->function) {
1f96a965e30d09 Yinghai Lu        2013-01-21  414  			dev->is_hotplug_bridge = 1;
1f96a965e30d09 Yinghai Lu        2013-01-21  415  			break;
1f96a965e30d09 Yinghai Lu        2013-01-21  416  		}
1f96a965e30d09 Yinghai Lu        2013-01-21  417  	}
1f96a965e30d09 Yinghai Lu        2013-01-21  418  }
3b63aaa70e1ccc Jiang Liu         2013-04-12  419  
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  420  static int acpiphp_rescan_slot(struct acpiphp_slot *slot)
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  421  {
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  422  	struct acpiphp_func *func;
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  423  
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  424  	list_for_each_entry(func, &slot->funcs, sibling) {
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  425  		struct acpi_device *adev = func_to_acpi_device(func);
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  426  
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  427  		acpi_bus_scan(adev->handle);
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  428  		if (acpi_device_enumerated(adev))
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  429  			acpi_device_set_power(adev, ACPI_STATE_D0);
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  430  	}
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  431  	return pci_scan_slot(slot->bus, PCI_DEVFN(slot->device, 0));
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  432  }
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  433  
84c8b58ed3addf Mika Westerberg   2018-05-29  434  static void acpiphp_native_scan_bridge(struct pci_dev *bridge)
84c8b58ed3addf Mika Westerberg   2018-05-29  435  {
84c8b58ed3addf Mika Westerberg   2018-05-29  436  	struct pci_bus *bus = bridge->subordinate;
84c8b58ed3addf Mika Westerberg   2018-05-29  437  	struct pci_dev *dev;
84c8b58ed3addf Mika Westerberg   2018-05-29  438  	int max;
84c8b58ed3addf Mika Westerberg   2018-05-29  439  
84c8b58ed3addf Mika Westerberg   2018-05-29  440  	if (!bus)
84c8b58ed3addf Mika Westerberg   2018-05-29  441  		return;
84c8b58ed3addf Mika Westerberg   2018-05-29  442  
84c8b58ed3addf Mika Westerberg   2018-05-29  443  	max = bus->busn_res.start;
84c8b58ed3addf Mika Westerberg   2018-05-29  444  	/* Scan already configured non-hotplug bridges */
84c8b58ed3addf Mika Westerberg   2018-05-29  445  	for_each_pci_bridge(dev, bus) {
84c8b58ed3addf Mika Westerberg   2018-05-29  446  		if (!hotplug_is_native(dev))
84c8b58ed3addf Mika Westerberg   2018-05-29  447  			max = pci_scan_bridge(bus, dev, max, 0);
84c8b58ed3addf Mika Westerberg   2018-05-29  448  	}
84c8b58ed3addf Mika Westerberg   2018-05-29  449  
84c8b58ed3addf Mika Westerberg   2018-05-29  450  	/* Scan non-hotplug bridges that need to be reconfigured */
84c8b58ed3addf Mika Westerberg   2018-05-29  451  	for_each_pci_bridge(dev, bus) {
77adf9355304f8 Mika Westerberg   2019-10-30  452  		if (hotplug_is_native(dev))
77adf9355304f8 Mika Westerberg   2019-10-30  453  			continue;
77adf9355304f8 Mika Westerberg   2019-10-30  454  
84c8b58ed3addf Mika Westerberg   2018-05-29  455  		max = pci_scan_bridge(bus, dev, max, 1);
77adf9355304f8 Mika Westerberg   2019-10-30  456  		if (dev->subordinate) {
77adf9355304f8 Mika Westerberg   2019-10-30  457  			pcibios_resource_survey_bus(dev->subordinate);
77adf9355304f8 Mika Westerberg   2019-10-30  458  			pci_bus_size_bridges(dev->subordinate);
77adf9355304f8 Mika Westerberg   2019-10-30  459  			pci_bus_assign_resources(dev->subordinate);
77adf9355304f8 Mika Westerberg   2019-10-30  460  		}
84c8b58ed3addf Mika Westerberg   2018-05-29  461  	}
84c8b58ed3addf Mika Westerberg   2018-05-29  462  }
84c8b58ed3addf Mika Westerberg   2018-05-29  463  
^1da177e4c3f41 Linus Torvalds    2005-04-16  464  /**
a1d0abcea84573 Rafael J. Wysocki 2013-07-13  465   * enable_slot - enable, configure a slot
^1da177e4c3f41 Linus Torvalds    2005-04-16  466   * @slot: slot to be enabled
f188b99f0b2d33 Mika Westerberg   2018-09-26  467   * @bridge: true if enable is for the whole bridge (not a single slot)
^1da177e4c3f41 Linus Torvalds    2005-04-16  468   *
^1da177e4c3f41 Linus Torvalds    2005-04-16  469   * This function should be called per *physical slot*,
^1da177e4c3f41 Linus Torvalds    2005-04-16  470   * not per each slot object in ACPI namespace.
^1da177e4c3f41 Linus Torvalds    2005-04-16  471   */
f188b99f0b2d33 Mika Westerberg   2018-09-26  472  static void enable_slot(struct acpiphp_slot *slot, bool bridge)
^1da177e4c3f41 Linus Torvalds    2005-04-16  473  {
^1da177e4c3f41 Linus Torvalds    2005-04-16  474  	struct pci_dev *dev;
bda46dbb6626c9 Rafael J. Wysocki 2013-07-13  475  	struct pci_bus *bus = slot->bus;
^1da177e4c3f41 Linus Torvalds    2005-04-16  476  	struct acpiphp_func *func;
84c8b58ed3addf Mika Westerberg   2018-05-29  477  
f188b99f0b2d33 Mika Westerberg   2018-09-26  478  	if (bridge && bus->self && hotplug_is_native(bus->self)) {
84c8b58ed3addf Mika Westerberg   2018-05-29  479  		/*
84c8b58ed3addf Mika Westerberg   2018-05-29  480  		 * If native hotplug is used, it will take care of hotplug
84c8b58ed3addf Mika Westerberg   2018-05-29  481  		 * slot management and resource allocation for hotplug
84c8b58ed3addf Mika Westerberg   2018-05-29  482  		 * bridges. However, ACPI hotplug may still be used for
84c8b58ed3addf Mika Westerberg   2018-05-29  483  		 * non-hotplug bridges to bring in additional devices such
84c8b58ed3addf Mika Westerberg   2018-05-29  484  		 * as a Thunderbolt host controller.
84c8b58ed3addf Mika Westerberg   2018-05-29  485  		 */
84c8b58ed3addf Mika Westerberg   2018-05-29  486  		for_each_pci_bridge(dev, bus) {
84c8b58ed3addf Mika Westerberg   2018-05-29  487  			if (PCI_SLOT(dev->devfn) == slot->device)
84c8b58ed3addf Mika Westerberg   2018-05-29  488  				acpiphp_native_scan_bridge(dev);
84c8b58ed3addf Mika Westerberg   2018-05-29  489  		}
84c8b58ed3addf Mika Westerberg   2018-05-29  490  	} else {
d66ecb7220a70e Jiang Liu         2013-06-23  491  		LIST_HEAD(add_list);
84c8b58ed3addf Mika Westerberg   2018-05-29  492  		int max, pass;
^1da177e4c3f41 Linus Torvalds    2005-04-16  493  
ab1225901da2d4 Mika Westerberg   2013-10-30  494  		acpiphp_rescan_slot(slot);
15a1ae74879925 Kristen Accardi   2006-02-23  495  		max = acpiphp_max_busnr(bus);
42f49a6ae5dca9 Rajesh Shah       2005-04-28  496  		for (pass = 0; pass < 2; pass++) {
24a0c654d7d606 Andy Shevchenko   2017-10-20  497  			for_each_pci_bridge(dev, bus) {
42f49a6ae5dca9 Rajesh Shah       2005-04-28  498  				if (PCI_SLOT(dev->devfn) != slot->device)
42f49a6ae5dca9 Rajesh Shah       2005-04-28  499  					continue;
a1d0abcea84573 Rafael J. Wysocki 2013-07-13  500  
42f49a6ae5dca9 Rajesh Shah       2005-04-28  501  				max = pci_scan_bridge(bus, dev, max, pass);
1f96a965e30d09 Yinghai Lu        2013-01-21  502  				if (pass && dev->subordinate) {
1f96a965e30d09 Yinghai Lu        2013-01-21  503  					check_hotplug_bridge(slot, dev);
d66ecb7220a70e Jiang Liu         2013-06-23  504  					pcibios_resource_survey_bus(dev->subordinate);
84c8b58ed3addf Mika Westerberg   2018-05-29 @505  					__pci_bus_size_bridges(dev->subordinate,
84c8b58ed3addf Mika Westerberg   2018-05-29  506  							       &add_list);
c64b5eead93f9d Kristen Accardi   2005-12-14  507  				}
42f49a6ae5dca9 Rajesh Shah       2005-04-28  508  			}
1f96a965e30d09 Yinghai Lu        2013-01-21  509  		}
d66ecb7220a70e Jiang Liu         2013-06-23 @510  		__pci_bus_assign_resources(bus, &add_list, NULL);
84c8b58ed3addf Mika Westerberg   2018-05-29  511  	}
2dc41281b1d117 Rafael J. Wysocki 2013-09-06  512  
8e5dce35221850 Kristen Accardi   2005-10-18  513  	acpiphp_sanitize_bus(bus);
81ee57326c9ca6 Bjorn Helgaas     2014-08-28  514  	pcie_bus_configure_settings(bus);
d06070509147c9 Shaohua Li        2010-02-25  515  	acpiphp_set_acpi_region(slot);
69643e4829c5cd Ian Campbell      2011-05-11  516  
69643e4829c5cd Ian Campbell      2011-05-11  517  	list_for_each_entry(dev, &bus->devices, bus_list) {
69643e4829c5cd Ian Campbell      2011-05-11  518  		/* Assume that newly added devices are powered on already. */
44bda4b7d26e9f Hari Vyas         2018-07-03  519  		if (!pci_dev_is_added(dev))
69643e4829c5cd Ian Campbell      2011-05-11  520  			dev->current_state = PCI_D0;
69643e4829c5cd Ian Campbell      2011-05-11  521  	}
69643e4829c5cd Ian Campbell      2011-05-11  522  
42f49a6ae5dca9 Rajesh Shah       2005-04-28  523  	pci_bus_add_devices(bus);
42f49a6ae5dca9 Rajesh Shah       2005-04-28  524  
f382a086f3129e Amos Kong         2011-11-25  525  	slot->flags |= SLOT_ENABLED;
58c08628c4fe66 Alex Chiang       2009-10-26  526  	list_for_each_entry(func, &slot->funcs, sibling) {
9d911d7903926a Alex Chiang       2009-05-21  527  		dev = pci_get_slot(bus, PCI_DEVFN(slot->device,
^1da177e4c3f41 Linus Torvalds    2005-04-16  528  						  func->function));
f382a086f3129e Amos Kong         2011-11-25  529  		if (!dev) {
f382a086f3129e Amos Kong         2011-11-25  530  			/* Do not set SLOT_ENABLED flag if some funcs
f382a086f3129e Amos Kong         2011-11-25  531  			   are not added. */
9337a493623d59 Mika Westerberg   2018-05-24  532  			slot->flags &= ~SLOT_ENABLED;
551bcb75b3d9f2 MUNEDA Takahiro   2006-03-22  533  			continue;
f382a086f3129e Amos Kong         2011-11-25  534  		}
3bbfd319034ddc Feilong Lin       2021-03-25  535  		pci_dev_put(dev);
^1da177e4c3f41 Linus Torvalds    2005-04-16  536  	}
^1da177e4c3f41 Linus Torvalds    2005-04-16  537  }
^1da177e4c3f41 Linus Torvalds    2005-04-16  538  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 42097 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/5] PCI: Move pci_dev_is/assign_added() to pci.h
  2021-09-06  9:49 ` [PATCH 3/5] PCI: Move pci_dev_is/assign_added() to pci.h Niklas Schnelle
  2021-09-07  0:22   ` kernel test robot
@ 2021-09-07  0:25   ` kernel test robot
  2021-09-07  7:51     ` Andy Shevchenko
  1 sibling, 1 reply; 17+ messages in thread
From: kernel test robot @ 2021-09-07  0:25 UTC (permalink / raw)
  To: Niklas Schnelle, Bjorn Helgaas
  Cc: kbuild-all, Linas Vepstas, Oliver O'Halloran, Russell Currey,
	linuxppc-dev, linux-kernel, linux-s390, Matthew Rosato,
	Pierre Morel

[-- Attachment #1: Type: text/plain, Size: 27718 bytes --]

Hi Niklas,

I love your patch! Yet something to improve:

[auto build test ERROR on s390/features]
[also build test ERROR on next-20210906]
[cannot apply to pci/next powerpc/next v5.14]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Niklas-Schnelle/s390-pci-automatic-error-recovery/20210906-175309
base:   https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
config: i386-allyesconfig (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/404ed8c00a612e7ae31c50557c80c6726c464863
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Niklas-Schnelle/s390-pci-automatic-error-recovery/20210906-175309
        git checkout 404ed8c00a612e7ae31c50557c80c6726c464863
        # save the attached .config to linux build tree
        make W=1 ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/pci/hotplug/acpiphp_glue.c: In function 'acpiphp_add_context':
>> drivers/pci/hotplug/acpiphp_glue.c:330:6: error: implicit declaration of function 'pci_bus_read_dev_vendor_id' [-Werror=implicit-function-declaration]
     330 |  if (pci_bus_read_dev_vendor_id(pbus, PCI_DEVFN(device, function),
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/pci/hotplug/acpiphp_glue.c: In function 'enable_slot':
>> drivers/pci/hotplug/acpiphp_glue.c:505:6: error: implicit declaration of function '__pci_bus_size_bridges'; did you mean 'pci_bus_size_bridges'? [-Werror=implicit-function-declaration]
     505 |      __pci_bus_size_bridges(dev->subordinate,
         |      ^~~~~~~~~~~~~~~~~~~~~~
         |      pci_bus_size_bridges
>> drivers/pci/hotplug/acpiphp_glue.c:510:3: error: implicit declaration of function '__pci_bus_assign_resources'; did you mean 'pci_bus_assign_resources'? [-Werror=implicit-function-declaration]
     510 |   __pci_bus_assign_resources(bus, &add_list, NULL);
         |   ^~~~~~~~~~~~~~~~~~~~~~~~~~
         |   pci_bus_assign_resources
   drivers/pci/hotplug/acpiphp_glue.c: In function 'trim_stale_devices':
>> drivers/pci/hotplug/acpiphp_glue.c:660:3: error: implicit declaration of function 'pci_dev_set_disconnected' [-Werror=implicit-function-declaration]
     660 |   pci_dev_set_disconnected(dev, NULL);
         |   ^~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/pci/hotplug/acpiphp_glue.c:661:7: error: implicit declaration of function 'pci_has_subordinate' [-Werror=implicit-function-declaration]
     661 |   if (pci_has_subordinate(dev))
         |       ^~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +/pci_bus_read_dev_vendor_id +330 drivers/pci/hotplug/acpiphp_glue.c

4e8662bbd680c5 Kristen Accardi   2006-06-28  217  
3799c5a032aefb Rafael J. Wysocki 2014-02-16  218  /**
3799c5a032aefb Rafael J. Wysocki 2014-02-16  219   * acpiphp_add_context - Add ACPIPHP context to an ACPI device object.
3799c5a032aefb Rafael J. Wysocki 2014-02-16  220   * @handle: ACPI handle of the object to add a context to.
3799c5a032aefb Rafael J. Wysocki 2014-02-16  221   * @lvl: Not used.
3799c5a032aefb Rafael J. Wysocki 2014-02-16  222   * @data: The object's parent ACPIPHP bridge.
3799c5a032aefb Rafael J. Wysocki 2014-02-16  223   * @rv: Not used.
3799c5a032aefb Rafael J. Wysocki 2014-02-16  224   */
3799c5a032aefb Rafael J. Wysocki 2014-02-16  225  static acpi_status acpiphp_add_context(acpi_handle handle, u32 lvl, void *data,
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  226  				       void **rv)
^1da177e4c3f41 Linus Torvalds    2005-04-16  227  {
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  228  	struct acpiphp_bridge *bridge = data;
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  229  	struct acpiphp_context *context;
bbcbfc0eed6220 Rafael J. Wysocki 2014-02-04  230  	struct acpi_device *adev;
^1da177e4c3f41 Linus Torvalds    2005-04-16  231  	struct acpiphp_slot *slot;
^1da177e4c3f41 Linus Torvalds    2005-04-16  232  	struct acpiphp_func *newfunc;
^1da177e4c3f41 Linus Torvalds    2005-04-16  233  	acpi_status status = AE_OK;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  234  	unsigned long long adr;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  235  	int device, function;
e8c331e963c58b Kenji Kaneshige   2008-12-17  236  	struct pci_bus *pbus = bridge->pci_bus;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  237  	struct pci_dev *pdev = bridge->pci_dev;
3b63aaa70e1ccc Jiang Liu         2013-04-12  238  	u32 val;
^1da177e4c3f41 Linus Torvalds    2005-04-16  239  
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  240  	status = acpi_evaluate_integer(handle, "_ADR", NULL, &adr);
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  241  	if (ACPI_FAILURE(status)) {
f26ca1d699e8b5 Toshi Kani        2013-11-27  242  		if (status != AE_NOT_FOUND)
f26ca1d699e8b5 Toshi Kani        2013-11-27  243  			acpi_handle_warn(handle,
f26ca1d699e8b5 Toshi Kani        2013-11-27  244  				"can't evaluate _ADR (%#x)\n", status);
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  245  		return AE_OK;
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  246  	}
bbcbfc0eed6220 Rafael J. Wysocki 2014-02-04  247  	if (acpi_bus_get_device(handle, &adev))
bbcbfc0eed6220 Rafael J. Wysocki 2014-02-04  248  		return AE_OK;
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  249  
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  250  	device = (adr >> 16) & 0xffff;
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  251  	function = adr & 0xffff;
dfb117b3e50c52 Bjorn Helgaas     2012-06-20  252  
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  253  	acpi_lock_hp_context();
bbcbfc0eed6220 Rafael J. Wysocki 2014-02-04  254  	context = acpiphp_init_context(adev);
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  255  	if (!context) {
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  256  		acpi_unlock_hp_context();
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  257  		acpi_handle_err(handle, "No hotplug context\n");
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  258  		return AE_NOT_EXIST;
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  259  	}
bd4674dfc5fc70 Rafael J. Wysocki 2013-07-13  260  	newfunc = &context->func;
bd4674dfc5fc70 Rafael J. Wysocki 2013-07-13  261  	newfunc->function = function;
bda46dbb6626c9 Rafael J. Wysocki 2013-07-13  262  	newfunc->parent = bridge;
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  263  	acpi_unlock_hp_context();
cb7b8cedf6c88b Rafael J. Wysocki 2013-07-13  264  
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  265  	/*
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  266  	 * If this is a dock device, its _EJ0 should be executed by the dock
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  267  	 * notify handler after calling _DCK.
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  268  	 */
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  269  	if (!is_dock_device(adev) && acpi_has_method(handle, "_EJ0"))
^1da177e4c3f41 Linus Torvalds    2005-04-16  270  		newfunc->flags = FUNC_HAS_EJ0;
^1da177e4c3f41 Linus Torvalds    2005-04-16  271  
ecd046da57d332 Jiang Liu         2013-06-29  272  	if (acpi_has_method(handle, "_STA"))
^1da177e4c3f41 Linus Torvalds    2005-04-16  273  		newfunc->flags |= FUNC_HAS_STA;
^1da177e4c3f41 Linus Torvalds    2005-04-16  274  
^1da177e4c3f41 Linus Torvalds    2005-04-16  275  	/* search for objects that share the same slot */
ad41dd9dd0c8ca Yijing Wang       2013-04-12  276  	list_for_each_entry(slot, &bridge->slots, node)
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  277  		if (slot->device == device)
ac372338b75064 Rafael J. Wysocki 2013-07-13  278  			goto slot_found;
^1da177e4c3f41 Linus Torvalds    2005-04-16  279  
f5afe8064f3087 Eric Sesterhenn   2006-02-28  280  	slot = kzalloc(sizeof(struct acpiphp_slot), GFP_KERNEL);
^1da177e4c3f41 Linus Torvalds    2005-04-16  281  	if (!slot) {
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  282  		acpi_lock_hp_context();
146fc68a4bdd78 Rafael J. Wysocki 2014-02-04  283  		acpiphp_put_context(context);
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  284  		acpi_unlock_hp_context();
146fc68a4bdd78 Rafael J. Wysocki 2014-02-04  285  		return AE_NO_MEMORY;
^1da177e4c3f41 Linus Torvalds    2005-04-16  286  	}
^1da177e4c3f41 Linus Torvalds    2005-04-16  287  
bda46dbb6626c9 Rafael J. Wysocki 2013-07-13  288  	slot->bus = bridge->pci_bus;
^1da177e4c3f41 Linus Torvalds    2005-04-16  289  	slot->device = device;
^1da177e4c3f41 Linus Torvalds    2005-04-16  290  	INIT_LIST_HEAD(&slot->funcs);
^1da177e4c3f41 Linus Torvalds    2005-04-16  291  
ad41dd9dd0c8ca Yijing Wang       2013-04-12  292  	list_add_tail(&slot->node, &bridge->slots);
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  293  
cc6254e00eb676 Rafael J. Wysocki 2014-02-16  294  	/*
cc6254e00eb676 Rafael J. Wysocki 2014-02-16  295  	 * Expose slots to user space for functions that have _EJ0 or _RMV or
cc6254e00eb676 Rafael J. Wysocki 2014-02-16  296  	 * are located in dock stations.  Do not expose them for devices handled
84c8b58ed3addf Mika Westerberg   2018-05-29  297  	 * by the native PCIe hotplug (PCIeHP) or standard PCI hotplug
84c8b58ed3addf Mika Westerberg   2018-05-29  298  	 * (SHPCHP), because that code is supposed to expose slots to user
84c8b58ed3addf Mika Westerberg   2018-05-29  299  	 * space in those cases.
cc6254e00eb676 Rafael J. Wysocki 2014-02-16  300  	 */
3b52b21fa1f44c Rafael J. Wysocki 2014-02-21  301  	if ((acpi_pci_check_ejectable(pbus, handle) || is_dock_device(adev))
84c8b58ed3addf Mika Westerberg   2018-05-29  302  	    && !(pdev && hotplug_is_native(pdev))) {
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  303  		unsigned long long sun;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  304  		int retval;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  305  
^1da177e4c3f41 Linus Torvalds    2005-04-16  306  		bridge->nr_slots++;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  307  		status = acpi_evaluate_integer(handle, "_SUN", NULL, &sun);
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  308  		if (ACPI_FAILURE(status))
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  309  			sun = bridge->nr_slots;
^1da177e4c3f41 Linus Torvalds    2005-04-16  310  
bd950799d9510c Lan Tianyu        2013-09-24  311  		pr_debug("found ACPI PCI Hotplug slot %llu at PCI %04x:%02x:%02x\n",
7342798d0ab850 Rafael J. Wysocki 2013-07-13  312  		    sun, pci_domain_nr(pbus), pbus->number, device);
ac372338b75064 Rafael J. Wysocki 2013-07-13  313  
7342798d0ab850 Rafael J. Wysocki 2013-07-13  314  		retval = acpiphp_register_hotplug_slot(slot, sun);
e27da381417038 MUNEDA Takahiro   2006-02-23  315  		if (retval) {
1aaac07112f040 Rafael J. Wysocki 2013-08-17  316  			slot->slot = NULL;
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  317  			bridge->nr_slots--;
f46753c5e354b8 Alex Chiang       2008-06-10  318  			if (retval == -EBUSY)
227f06470502c4 Ryan Desfosses    2014-04-18  319  				pr_warn("Slot %llu already registered by another hotplug driver\n", sun);
f46753c5e354b8 Alex Chiang       2008-06-10  320  			else
227f06470502c4 Ryan Desfosses    2014-04-18  321  				pr_warn("acpiphp_register_hotplug_slot failed (err code = 0x%x)\n", retval);
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  322  		}
bbd34fcdd1b201 Rafael J. Wysocki 2013-07-13  323  		/* Even if the slot registration fails, we can still use it. */
e27da381417038 MUNEDA Takahiro   2006-02-23  324  	}
^1da177e4c3f41 Linus Torvalds    2005-04-16  325  
ac372338b75064 Rafael J. Wysocki 2013-07-13  326   slot_found:
^1da177e4c3f41 Linus Torvalds    2005-04-16  327  	newfunc->slot = slot;
^1da177e4c3f41 Linus Torvalds    2005-04-16  328  	list_add_tail(&newfunc->sibling, &slot->funcs);
^1da177e4c3f41 Linus Torvalds    2005-04-16  329  
3b63aaa70e1ccc Jiang Liu         2013-04-12 @330  	if (pci_bus_read_dev_vendor_id(pbus, PCI_DEVFN(device, function),
3b63aaa70e1ccc Jiang Liu         2013-04-12  331  				       &val, 60*1000))
bc805a55392a7c Rafael J. Wysocki 2013-07-13  332  		slot->flags |= SLOT_ENABLED;
^1da177e4c3f41 Linus Torvalds    2005-04-16  333  
2e862c51904ddd Rafael J. Wysocki 2013-07-13  334  	return AE_OK;
^1da177e4c3f41 Linus Torvalds    2005-04-16  335  }
^1da177e4c3f41 Linus Torvalds    2005-04-16  336  
364d5094a43ff2 Rajesh Shah       2005-04-28  337  static void cleanup_bridge(struct acpiphp_bridge *bridge)
^1da177e4c3f41 Linus Torvalds    2005-04-16  338  {
3d54a3160fb6ba Jiang Liu         2013-04-12  339  	struct acpiphp_slot *slot;
3d54a3160fb6ba Jiang Liu         2013-04-12  340  	struct acpiphp_func *func;
42f49a6ae5dca9 Rajesh Shah       2005-04-28  341  
3d54a3160fb6ba Jiang Liu         2013-04-12  342  	list_for_each_entry(slot, &bridge->slots, node) {
3d54a3160fb6ba Jiang Liu         2013-04-12  343  		list_for_each_entry(func, &slot->funcs, sibling) {
1a699476e25814 Rafael J. Wysocki 2014-02-06  344  			struct acpi_device *adev = func_to_acpi_device(func);
5a3bc573ae32a7 Rafael J. Wysocki 2013-07-13  345  
1a699476e25814 Rafael J. Wysocki 2014-02-06  346  			acpi_lock_hp_context();
be27b3dcb02335 Rafael J. Wysocki 2014-02-21  347  			adev->hp->notify = NULL;
edf5bf34d40804 Rafael J. Wysocki 2014-02-21  348  			adev->hp->fixup = NULL;
1a699476e25814 Rafael J. Wysocki 2014-02-06  349  			acpi_unlock_hp_context();
42f49a6ae5dca9 Rajesh Shah       2005-04-28  350  		}
9217a984671e8a Rafael J. Wysocki 2014-01-10  351  		slot->flags |= SLOT_IS_GOING_AWAY;
1aaac07112f040 Rafael J. Wysocki 2013-08-17  352  		if (slot->slot)
e27da381417038 MUNEDA Takahiro   2006-02-23  353  			acpiphp_unregister_hotplug_slot(slot);
42f49a6ae5dca9 Rajesh Shah       2005-04-28  354  	}
42f49a6ae5dca9 Rajesh Shah       2005-04-28  355  
3d54a3160fb6ba Jiang Liu         2013-04-12  356  	mutex_lock(&bridge_mutex);
42f49a6ae5dca9 Rajesh Shah       2005-04-28  357  	list_del(&bridge->list);
3d54a3160fb6ba Jiang Liu         2013-04-12  358  	mutex_unlock(&bridge_mutex);
9217a984671e8a Rafael J. Wysocki 2014-01-10  359  
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  360  	acpi_lock_hp_context();
9217a984671e8a Rafael J. Wysocki 2014-01-10  361  	bridge->is_going_away = true;
e525506fcb67a9 Rafael J. Wysocki 2014-02-04  362  	acpi_unlock_hp_context();
^1da177e4c3f41 Linus Torvalds    2005-04-16  363  }
^1da177e4c3f41 Linus Torvalds    2005-04-16  364  
15a1ae74879925 Kristen Accardi   2006-02-23  365  /**
26e6c66e47fe7f Randy Dunlap      2007-11-28  366   * acpiphp_max_busnr - return the highest reserved bus number under the given bus.
15a1ae74879925 Kristen Accardi   2006-02-23  367   * @bus: bus to start search with
15a1ae74879925 Kristen Accardi   2006-02-23  368   */
15a1ae74879925 Kristen Accardi   2006-02-23  369  static unsigned char acpiphp_max_busnr(struct pci_bus *bus)
15a1ae74879925 Kristen Accardi   2006-02-23  370  {
c6f0d5adc21e2d Yijing Wang       2014-02-13  371  	struct pci_bus *tmp;
15a1ae74879925 Kristen Accardi   2006-02-23  372  	unsigned char max, n;
15a1ae74879925 Kristen Accardi   2006-02-23  373  
15a1ae74879925 Kristen Accardi   2006-02-23  374  	/*
15a1ae74879925 Kristen Accardi   2006-02-23  375  	 * pci_bus_max_busnr will return the highest
15a1ae74879925 Kristen Accardi   2006-02-23  376  	 * reserved busnr for all these children.
15a1ae74879925 Kristen Accardi   2006-02-23  377  	 * that is equivalent to the bus->subordinate
15a1ae74879925 Kristen Accardi   2006-02-23  378  	 * value.  We don't want to use the parent's
15a1ae74879925 Kristen Accardi   2006-02-23  379  	 * bus->subordinate value because it could have
15a1ae74879925 Kristen Accardi   2006-02-23  380  	 * padding in it.
15a1ae74879925 Kristen Accardi   2006-02-23  381  	 */
b918c62e086b21 Yinghai Lu        2012-05-17  382  	max = bus->busn_res.start;
15a1ae74879925 Kristen Accardi   2006-02-23  383  
c6f0d5adc21e2d Yijing Wang       2014-02-13  384  	list_for_each_entry(tmp, &bus->children, node) {
c6f0d5adc21e2d Yijing Wang       2014-02-13  385  		n = pci_bus_max_busnr(tmp);
15a1ae74879925 Kristen Accardi   2006-02-23  386  		if (n > max)
15a1ae74879925 Kristen Accardi   2006-02-23  387  			max = n;
15a1ae74879925 Kristen Accardi   2006-02-23  388  	}
15a1ae74879925 Kristen Accardi   2006-02-23  389  	return max;
15a1ae74879925 Kristen Accardi   2006-02-23  390  }
15a1ae74879925 Kristen Accardi   2006-02-23  391  
d06070509147c9 Shaohua Li        2010-02-25  392  static void acpiphp_set_acpi_region(struct acpiphp_slot *slot)
d06070509147c9 Shaohua Li        2010-02-25  393  {
d06070509147c9 Shaohua Li        2010-02-25  394  	struct acpiphp_func *func;
d06070509147c9 Shaohua Li        2010-02-25  395  
d06070509147c9 Shaohua Li        2010-02-25  396  	list_for_each_entry(func, &slot->funcs, sibling) {
d06070509147c9 Shaohua Li        2010-02-25  397  		/* _REG is optional, we don't care about if there is failure */
6dd10c47e91239 Hans de Goede     2020-05-07  398  		acpi_evaluate_reg(func_to_handle(func),
6dd10c47e91239 Hans de Goede     2020-05-07  399  				  ACPI_ADR_SPACE_PCI_CONFIG,
6dd10c47e91239 Hans de Goede     2020-05-07  400  				  ACPI_REG_CONNECT);
d06070509147c9 Shaohua Li        2010-02-25  401  	}
d06070509147c9 Shaohua Li        2010-02-25  402  }
d06070509147c9 Shaohua Li        2010-02-25  403  
1f96a965e30d09 Yinghai Lu        2013-01-21  404  static void check_hotplug_bridge(struct acpiphp_slot *slot, struct pci_dev *dev)
1f96a965e30d09 Yinghai Lu        2013-01-21  405  {
1f96a965e30d09 Yinghai Lu        2013-01-21  406  	struct acpiphp_func *func;
1f96a965e30d09 Yinghai Lu        2013-01-21  407  
1f96a965e30d09 Yinghai Lu        2013-01-21  408  	/* quirk, or pcie could set it already */
1f96a965e30d09 Yinghai Lu        2013-01-21  409  	if (dev->is_hotplug_bridge)
1f96a965e30d09 Yinghai Lu        2013-01-21  410  		return;
1f96a965e30d09 Yinghai Lu        2013-01-21  411  
1f96a965e30d09 Yinghai Lu        2013-01-21  412  	list_for_each_entry(func, &slot->funcs, sibling) {
1f96a965e30d09 Yinghai Lu        2013-01-21  413  		if (PCI_FUNC(dev->devfn) == func->function) {
1f96a965e30d09 Yinghai Lu        2013-01-21  414  			dev->is_hotplug_bridge = 1;
1f96a965e30d09 Yinghai Lu        2013-01-21  415  			break;
1f96a965e30d09 Yinghai Lu        2013-01-21  416  		}
1f96a965e30d09 Yinghai Lu        2013-01-21  417  	}
1f96a965e30d09 Yinghai Lu        2013-01-21  418  }
3b63aaa70e1ccc Jiang Liu         2013-04-12  419  
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  420  static int acpiphp_rescan_slot(struct acpiphp_slot *slot)
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  421  {
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  422  	struct acpiphp_func *func;
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  423  
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  424  	list_for_each_entry(func, &slot->funcs, sibling) {
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  425  		struct acpi_device *adev = func_to_acpi_device(func);
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  426  
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  427  		acpi_bus_scan(adev->handle);
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  428  		if (acpi_device_enumerated(adev))
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  429  			acpi_device_set_power(adev, ACPI_STATE_D0);
b6708fbf98ac01 Rafael J. Wysocki 2014-02-04  430  	}
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  431  	return pci_scan_slot(slot->bus, PCI_DEVFN(slot->device, 0));
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  432  }
a47d8c8e72a5fa Rafael J. Wysocki 2013-09-08  433  
84c8b58ed3addf Mika Westerberg   2018-05-29  434  static void acpiphp_native_scan_bridge(struct pci_dev *bridge)
84c8b58ed3addf Mika Westerberg   2018-05-29  435  {
84c8b58ed3addf Mika Westerberg   2018-05-29  436  	struct pci_bus *bus = bridge->subordinate;
84c8b58ed3addf Mika Westerberg   2018-05-29  437  	struct pci_dev *dev;
84c8b58ed3addf Mika Westerberg   2018-05-29  438  	int max;
84c8b58ed3addf Mika Westerberg   2018-05-29  439  
84c8b58ed3addf Mika Westerberg   2018-05-29  440  	if (!bus)
84c8b58ed3addf Mika Westerberg   2018-05-29  441  		return;
84c8b58ed3addf Mika Westerberg   2018-05-29  442  
84c8b58ed3addf Mika Westerberg   2018-05-29  443  	max = bus->busn_res.start;
84c8b58ed3addf Mika Westerberg   2018-05-29  444  	/* Scan already configured non-hotplug bridges */
84c8b58ed3addf Mika Westerberg   2018-05-29  445  	for_each_pci_bridge(dev, bus) {
84c8b58ed3addf Mika Westerberg   2018-05-29  446  		if (!hotplug_is_native(dev))
84c8b58ed3addf Mika Westerberg   2018-05-29  447  			max = pci_scan_bridge(bus, dev, max, 0);
84c8b58ed3addf Mika Westerberg   2018-05-29  448  	}
84c8b58ed3addf Mika Westerberg   2018-05-29  449  
84c8b58ed3addf Mika Westerberg   2018-05-29  450  	/* Scan non-hotplug bridges that need to be reconfigured */
84c8b58ed3addf Mika Westerberg   2018-05-29  451  	for_each_pci_bridge(dev, bus) {
77adf9355304f8 Mika Westerberg   2019-10-30  452  		if (hotplug_is_native(dev))
77adf9355304f8 Mika Westerberg   2019-10-30  453  			continue;
77adf9355304f8 Mika Westerberg   2019-10-30  454  
84c8b58ed3addf Mika Westerberg   2018-05-29  455  		max = pci_scan_bridge(bus, dev, max, 1);
77adf9355304f8 Mika Westerberg   2019-10-30  456  		if (dev->subordinate) {
77adf9355304f8 Mika Westerberg   2019-10-30  457  			pcibios_resource_survey_bus(dev->subordinate);
77adf9355304f8 Mika Westerberg   2019-10-30  458  			pci_bus_size_bridges(dev->subordinate);
77adf9355304f8 Mika Westerberg   2019-10-30  459  			pci_bus_assign_resources(dev->subordinate);
77adf9355304f8 Mika Westerberg   2019-10-30  460  		}
84c8b58ed3addf Mika Westerberg   2018-05-29  461  	}
84c8b58ed3addf Mika Westerberg   2018-05-29  462  }
84c8b58ed3addf Mika Westerberg   2018-05-29  463  
^1da177e4c3f41 Linus Torvalds    2005-04-16  464  /**
a1d0abcea84573 Rafael J. Wysocki 2013-07-13  465   * enable_slot - enable, configure a slot
^1da177e4c3f41 Linus Torvalds    2005-04-16  466   * @slot: slot to be enabled
f188b99f0b2d33 Mika Westerberg   2018-09-26  467   * @bridge: true if enable is for the whole bridge (not a single slot)
^1da177e4c3f41 Linus Torvalds    2005-04-16  468   *
^1da177e4c3f41 Linus Torvalds    2005-04-16  469   * This function should be called per *physical slot*,
^1da177e4c3f41 Linus Torvalds    2005-04-16  470   * not per each slot object in ACPI namespace.
^1da177e4c3f41 Linus Torvalds    2005-04-16  471   */
f188b99f0b2d33 Mika Westerberg   2018-09-26  472  static void enable_slot(struct acpiphp_slot *slot, bool bridge)
^1da177e4c3f41 Linus Torvalds    2005-04-16  473  {
^1da177e4c3f41 Linus Torvalds    2005-04-16  474  	struct pci_dev *dev;
bda46dbb6626c9 Rafael J. Wysocki 2013-07-13  475  	struct pci_bus *bus = slot->bus;
^1da177e4c3f41 Linus Torvalds    2005-04-16  476  	struct acpiphp_func *func;
84c8b58ed3addf Mika Westerberg   2018-05-29  477  
f188b99f0b2d33 Mika Westerberg   2018-09-26  478  	if (bridge && bus->self && hotplug_is_native(bus->self)) {
84c8b58ed3addf Mika Westerberg   2018-05-29  479  		/*
84c8b58ed3addf Mika Westerberg   2018-05-29  480  		 * If native hotplug is used, it will take care of hotplug
84c8b58ed3addf Mika Westerberg   2018-05-29  481  		 * slot management and resource allocation for hotplug
84c8b58ed3addf Mika Westerberg   2018-05-29  482  		 * bridges. However, ACPI hotplug may still be used for
84c8b58ed3addf Mika Westerberg   2018-05-29  483  		 * non-hotplug bridges to bring in additional devices such
84c8b58ed3addf Mika Westerberg   2018-05-29  484  		 * as a Thunderbolt host controller.
84c8b58ed3addf Mika Westerberg   2018-05-29  485  		 */
84c8b58ed3addf Mika Westerberg   2018-05-29  486  		for_each_pci_bridge(dev, bus) {
84c8b58ed3addf Mika Westerberg   2018-05-29  487  			if (PCI_SLOT(dev->devfn) == slot->device)
84c8b58ed3addf Mika Westerberg   2018-05-29  488  				acpiphp_native_scan_bridge(dev);
84c8b58ed3addf Mika Westerberg   2018-05-29  489  		}
84c8b58ed3addf Mika Westerberg   2018-05-29  490  	} else {
d66ecb7220a70e Jiang Liu         2013-06-23  491  		LIST_HEAD(add_list);
84c8b58ed3addf Mika Westerberg   2018-05-29  492  		int max, pass;
^1da177e4c3f41 Linus Torvalds    2005-04-16  493  
ab1225901da2d4 Mika Westerberg   2013-10-30  494  		acpiphp_rescan_slot(slot);
15a1ae74879925 Kristen Accardi   2006-02-23  495  		max = acpiphp_max_busnr(bus);
42f49a6ae5dca9 Rajesh Shah       2005-04-28  496  		for (pass = 0; pass < 2; pass++) {
24a0c654d7d606 Andy Shevchenko   2017-10-20  497  			for_each_pci_bridge(dev, bus) {
42f49a6ae5dca9 Rajesh Shah       2005-04-28  498  				if (PCI_SLOT(dev->devfn) != slot->device)
42f49a6ae5dca9 Rajesh Shah       2005-04-28  499  					continue;
a1d0abcea84573 Rafael J. Wysocki 2013-07-13  500  
42f49a6ae5dca9 Rajesh Shah       2005-04-28  501  				max = pci_scan_bridge(bus, dev, max, pass);
1f96a965e30d09 Yinghai Lu        2013-01-21  502  				if (pass && dev->subordinate) {
1f96a965e30d09 Yinghai Lu        2013-01-21  503  					check_hotplug_bridge(slot, dev);
d66ecb7220a70e Jiang Liu         2013-06-23  504  					pcibios_resource_survey_bus(dev->subordinate);
84c8b58ed3addf Mika Westerberg   2018-05-29 @505  					__pci_bus_size_bridges(dev->subordinate,
84c8b58ed3addf Mika Westerberg   2018-05-29  506  							       &add_list);
c64b5eead93f9d Kristen Accardi   2005-12-14  507  				}
42f49a6ae5dca9 Rajesh Shah       2005-04-28  508  			}
1f96a965e30d09 Yinghai Lu        2013-01-21  509  		}
d66ecb7220a70e Jiang Liu         2013-06-23 @510  		__pci_bus_assign_resources(bus, &add_list, NULL);
84c8b58ed3addf Mika Westerberg   2018-05-29  511  	}
2dc41281b1d117 Rafael J. Wysocki 2013-09-06  512  
8e5dce35221850 Kristen Accardi   2005-10-18  513  	acpiphp_sanitize_bus(bus);
81ee57326c9ca6 Bjorn Helgaas     2014-08-28  514  	pcie_bus_configure_settings(bus);
d06070509147c9 Shaohua Li        2010-02-25  515  	acpiphp_set_acpi_region(slot);
69643e4829c5cd Ian Campbell      2011-05-11  516  
69643e4829c5cd Ian Campbell      2011-05-11  517  	list_for_each_entry(dev, &bus->devices, bus_list) {
69643e4829c5cd Ian Campbell      2011-05-11  518  		/* Assume that newly added devices are powered on already. */
44bda4b7d26e9f Hari Vyas         2018-07-03  519  		if (!pci_dev_is_added(dev))
69643e4829c5cd Ian Campbell      2011-05-11  520  			dev->current_state = PCI_D0;
69643e4829c5cd Ian Campbell      2011-05-11  521  	}
69643e4829c5cd Ian Campbell      2011-05-11  522  
42f49a6ae5dca9 Rajesh Shah       2005-04-28  523  	pci_bus_add_devices(bus);
42f49a6ae5dca9 Rajesh Shah       2005-04-28  524  
f382a086f3129e Amos Kong         2011-11-25  525  	slot->flags |= SLOT_ENABLED;
58c08628c4fe66 Alex Chiang       2009-10-26  526  	list_for_each_entry(func, &slot->funcs, sibling) {
9d911d7903926a Alex Chiang       2009-05-21  527  		dev = pci_get_slot(bus, PCI_DEVFN(slot->device,
^1da177e4c3f41 Linus Torvalds    2005-04-16  528  						  func->function));
f382a086f3129e Amos Kong         2011-11-25  529  		if (!dev) {
f382a086f3129e Amos Kong         2011-11-25  530  			/* Do not set SLOT_ENABLED flag if some funcs
f382a086f3129e Amos Kong         2011-11-25  531  			   are not added. */
9337a493623d59 Mika Westerberg   2018-05-24  532  			slot->flags &= ~SLOT_ENABLED;
551bcb75b3d9f2 MUNEDA Takahiro   2006-03-22  533  			continue;
f382a086f3129e Amos Kong         2011-11-25  534  		}
3bbfd319034ddc Feilong Lin       2021-03-25  535  		pci_dev_put(dev);
^1da177e4c3f41 Linus Torvalds    2005-04-16  536  	}
^1da177e4c3f41 Linus Torvalds    2005-04-16  537  }
^1da177e4c3f41 Linus Torvalds    2005-04-16  538  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 65111 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] s390/pci: automatic error recovery
  2021-09-06  9:49 [PATCH 0/5] s390/pci: automatic error recovery Niklas Schnelle
                   ` (4 preceding siblings ...)
  2021-09-06  9:49 ` [PATCH 5/5] s390/pci: implement minimal PCI error recovery Niklas Schnelle
@ 2021-09-07  2:04 ` Oliver O'Halloran
  2021-09-07  8:45   ` Niklas Schnelle
       [not found] ` <CAHrUA34TK6U4TB34FHejott9TdFvSgAedOpmro-Uj2ZwnvzecQ@mail.gmail.com>
  6 siblings, 1 reply; 17+ messages in thread
From: Oliver O'Halloran @ 2021-09-07  2:04 UTC (permalink / raw)
  To: Niklas Schnelle
  Cc: Bjorn Helgaas, Linas Vepstas, Russell Currey, linuxppc-dev,
	Linux Kernel Mailing List, linux-s390, Matthew Rosato,
	Pierre Morel

On Mon, Sep 6, 2021 at 7:49 PM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
>
> Patch 3 I already sent separately resulting in the discussion below but without
> a final conclusion.
>
> https://lore.kernel.org/lkml/20210720150145.640727-1-schnelle@linux.ibm.com/
>
> I believe even though there were some doubts about the use of
> pci_dev_is_added() by arch code the existing uses as well as the use in the
> final patch of this series warrant this export.

The use of pci_dev_is_added() in arch/powerpc was because in the past
pci_bus_add_device() could be called before pci_device_add(). That was
fixed a while ago so It should be safe to remove those calls now.

> Patch 4 "PCI: Export pci_dev_lock()" is basically an extension to commit
> e3a9b1212b9d ("PCI: Export pci_dev_trylock() and pci_dev_unlock()") which
> already exported pci_dev_trylock(). In the final patch we make use of
> pci_dev_lock() to wait for any other exclusive uses of the pdev to be finished
> before starting recovery.

Hmm, I noticed the EEH
(arch/powerpc/kernel/eeh_driver.c:eeh_pe_report_edev())  and the
generic PCIe error recovery code (see
drivers/pci/pcie/err.c:report_error_detected()) only call
device_lock() before entering the driver's error handling callbacks. I
wonder if they should be using pci_dev_lock() instead. The only real
difference is that pci_dev_lock() will also block user space from
accessing the device's config space while error recovery is in
progress which seems sensible enough.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Fwd: [PATCH 0/5] s390/pci: automatic error recovery
       [not found] ` <CAHrUA34TK6U4TB34FHejott9TdFvSgAedOpmro-Uj2ZwnvzecQ@mail.gmail.com>
@ 2021-09-07  2:10   ` Linas Vepstas
  2021-09-07  7:49   ` Niklas Schnelle
  1 sibling, 0 replies; 17+ messages in thread
From: Linas Vepstas @ 2021-09-07  2:10 UTC (permalink / raw)
  To: linux-kernel, linux-s390

Ooops, try again without the html. --linas

---------- Forwarded message ---------
From: Linas Vepstas <linasvepstas@gmail.com>
Date: Mon, Sep 6, 2021 at 9:05 PM
Subject: Re: [PATCH 0/5] s390/pci: automatic error recovery
To: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>, Oliver O'Halloran
<oohall@gmail.com>, Russell Currey <ruscur@russell.cc>,
<linuxppc-dev@lists.ozlabs.org>, linux-kernel@vger.kernel.org
<linux-kernel@vger.kernel.org>, <linux-s390@vger.kernel.org>, Matthew
Rosato <mjrosato@linux.ibm.com>, Pierre Morel <pmorel@linux.ibm.com>




On Mon, Sep 6, 2021 at 4:49 AM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
>
>  I believe we might be the first
> implementation of PCI device recovery in a virtualized setting requiring us to
> coordinate the device reset with the hypervisor platform by issuing a disable
> and re-enable to the platform as well as starting the recovery following
> a platform event.


I recall none of the details, but SRIOV is a standardized system for
sharing a PCI device across multiple virtual machines. It has detailed
info on what the hypervisor must do, and what the local OS instance
must do to accomplish this.  It's part of the PCI standard, and its
more than a decade old now, maybe two. Being a part of the PCI
standard, it was interoperable with error recovery, to the best of my
recollection. At the time it was introduced, it got pushed very
aggressively.  The x86 hypervisor vendors were aiming at the heart of
zseries, and were militant about it.

-- Linas

-- 
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.




-- 
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] s390/pci: automatic error recovery
       [not found] ` <CAHrUA34TK6U4TB34FHejott9TdFvSgAedOpmro-Uj2ZwnvzecQ@mail.gmail.com>
  2021-09-07  2:10   ` Fwd: " Linas Vepstas
@ 2021-09-07  7:49   ` Niklas Schnelle
  1 sibling, 0 replies; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-07  7:49 UTC (permalink / raw)
  To: linasvepstas
  Cc: Bjorn Helgaas, Oliver O'Halloran, Russell Currey,
	linuxppc-dev, linux-kernel, linux-s390, Matthew Rosato,
	Pierre Morel

On Mon, 2021-09-06 at 21:05 -0500, Linas Vepstas wrote:
> On Mon, Sep 6, 2021 at 4:49 AM Niklas Schnelle <schnelle@linux.ibm.com>
> wrote:
> 
> >  I believe we might be the first
> > implementation of PCI device recovery in a virtualized setting requiring
> > us to
> > coordinate the device reset with the hypervisor platform by issuing a
> > disable
> > and re-enable to the platform as well as starting the recovery following
> > a platform event.
> > 
> 
> I recall none of the details, but SRIOV is a standardized system for
> sharing a PCI device across multiple virtual machines. It has detailed info
> on what the hypervisor must do, and what the local OS instance must do to
> accomplish this.  

Yes and in fact on s390 we make heavy use of SR-IOV.

> It's part of the PCI standard, and its more than a decade
> old now, maybe two. Being a part of the PCI standard, it was interoperable
> with error recovery, to the best of my recollection. 

Maybe I worded things with a bit too much sensationalism and it might
even be that POWER supports error recovery also with virtualization,
though I'm not sure how far that goes.

I believe you are right in that SR-IOV supports the error recovery,
after all this patch set also has to work together with SRIOV enabled
devices. At least on s390 though until this patch set the error
recovery performed by the hypervisor stopped in the hypervisor.

The missing part added by this patch set is coordinating with device
drivers in Linux to determine where use of a recovered device can pick
up after the PCIe level error recovery is done.

As for virtualization this coordination of course needs to cross the
hypervisor/guest boundary and at least for KVM+QEMU I know for a fact
that reporting a PCI error to the guest is currently just a stub that
actually completely stops the guest, so you definitely don't get smooth
error recovery there yet.

> At the time it was
> introduced, it got pushed very aggressively.  The x86 hypervisor vendors
> were aiming at the heart of zseries, and were militant about it.

And yet we're still here, use SR-IOV ourselves and even support Linux +
KVM as a hypervisor you can use just the same on a mainframe, an x86,
POWER, or ARM system.

> 
> -- Linas
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/5] PCI: Move pci_dev_is/assign_added() to pci.h
  2021-09-07  0:25   ` kernel test robot
@ 2021-09-07  7:51     ` Andy Shevchenko
  2021-09-07  8:14       ` Niklas Schnelle
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Shevchenko @ 2021-09-07  7:51 UTC (permalink / raw)
  To: kernel test robot
  Cc: Niklas Schnelle, Bjorn Helgaas, kbuild-all, Linas Vepstas,
	Oliver O'Halloran, Russell Currey,
	open list:LINUX FOR POWERPC PA SEMI PWRFICIENT,
	Linux Kernel Mailing List, linux-s390, Matthew Rosato,
	Pierre Morel

On Tue, Sep 7, 2021 at 3:26 AM kernel test robot <lkp@intel.com> wrote:
>
> Hi Niklas,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on s390/features]
> [also build test ERROR on next-20210906]
> [cannot apply to pci/next powerpc/next v5.14]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url:    https://github.com/0day-ci/linux/commits/Niklas-Schnelle/s390-pci-automatic-error-recovery/20210906-175309
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
> config: i386-allyesconfig (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> reproduce (this is a W=1 build):
>         # https://github.com/0day-ci/linux/commit/404ed8c00a612e7ae31c50557c80c6726c464863
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review Niklas-Schnelle/s390-pci-automatic-error-recovery/20210906-175309
>         git checkout 404ed8c00a612e7ae31c50557c80c6726c464863
>         # save the attached .config to linux build tree
>         make W=1 ARCH=i386
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):

Obviously drivers/pci/pci.h is not only for the above.

When play with headers always do two test builds: allyesconfig and allmodconfig.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/5] PCI: Move pci_dev_is/assign_added() to pci.h
  2021-09-07  7:51     ` Andy Shevchenko
@ 2021-09-07  8:14       ` Niklas Schnelle
  0 siblings, 0 replies; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-07  8:14 UTC (permalink / raw)
  To: Andy Shevchenko, kernel test robot
  Cc: Bjorn Helgaas, kbuild-all, Linas Vepstas, Oliver O'Halloran,
	Russell Currey, open list:LINUX FOR POWERPC PA SEMI PWRFICIENT,
	Linux Kernel Mailing List, linux-s390, Matthew Rosato,
	Pierre Morel

On Tue, 2021-09-07 at 10:51 +0300, Andy Shevchenko wrote:
> On Tue, Sep 7, 2021 at 3:26 AM kernel test robot <lkp@intel.com> wrote:
> > Hi Niklas,
> > 
> > I love your patch! Yet something to improve:
> > 
> > [auto build test ERROR on s390/features]
> > [also build test ERROR on next-20210906]
> > [cannot apply to pci/next powerpc/next v5.14]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch]
> > 
> > url:    https://github.com/0day-ci/linux/commits/Niklas-Schnelle/s390-pci-automatic-error-recovery/20210906-175309
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
> > config: i386-allyesconfig (attached as .config)
> > compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> > reproduce (this is a W=1 build):
> >         # https://github.com/0day-ci/linux/commit/404ed8c00a612e7ae31c50557c80c6726c464863
> >         git remote add linux-review https://github.com/0day-ci/linux
> >         git fetch --no-tags linux-review Niklas-Schnelle/s390-pci-automatic-error-recovery/20210906-175309
> >         git checkout 404ed8c00a612e7ae31c50557c80c6726c464863
> >         # save the attached .config to linux build tree
> >         make W=1 ARCH=i386
> > 
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <lkp@intel.com>
> > 
> > All errors (new ones prefixed by >>):
> 
> Obviously drivers/pci/pci.h is not only for the above.
> 
> When play with headers always do two test builds: allyesconfig and allmodconfig.

You're right and additionally have to built on some other architectures
as well because allyesconfig and allmodconfig both run through fine on
s390. 

I'll look into it but at first glance it looks like I was over reaching
removing the include from drivers/pci/hotplug/acpiphp_glue.c in
addition it's not even the same kind of awkward relative include from
drivers into arch code. Sorry about that.

> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] s390/pci: automatic error recovery
  2021-09-07  2:04 ` [PATCH 0/5] s390/pci: automatic " Oliver O'Halloran
@ 2021-09-07  8:45   ` Niklas Schnelle
  2021-09-07 12:21     ` Niklas Schnelle
  0 siblings, 1 reply; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-07  8:45 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: Bjorn Helgaas, Linas Vepstas, Russell Currey, linuxppc-dev,
	Linux Kernel Mailing List, linux-s390, Matthew Rosato,
	Pierre Morel

On Tue, 2021-09-07 at 12:04 +1000, Oliver O'Halloran wrote:
> On Mon, Sep 6, 2021 at 7:49 PM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> > Patch 3 I already sent separately resulting in the discussion below but without
> > a final conclusion.
> > 
> > https://lore.kernel.org/lkml/20210720150145.640727-1-schnelle@linux.ibm.com/
> > 
> > I believe even though there were some doubts about the use of
> > pci_dev_is_added() by arch code the existing uses as well as the use in the
> > final patch of this series warrant this export.
> 
> The use of pci_dev_is_added() in arch/powerpc was because in the past
> pci_bus_add_device() could be called before pci_device_add(). That was
> fixed a while ago so It should be safe to remove those calls now.

Hmm, ok that confirms Bjorns suspicion and explains how it came to be.
I can certainly sent a patch for that. This would then leave only the
existing use in s390 which I added because of a dead lock prevention
and explained here:
https://lore.kernel.org/lkml/87d15d5eead35c9eaa667958d057cf4a81a8bf13.camel@linux.ibm.com/

Plus the need to use it in the recovery code of this series. I think in
the EEH code the need for a similar check is alleviated by the checks
in the beginning of
arch/powerpc/kernel/eeh_driver.c:eeh_handle_normal_event() especially
eeh_slot_presence_check() which checks presence via the hotplug slot.
I guess we could use our own state tracking in a similar way but felt
like pci_dev_is_added() is the more logical choice.

> 
> > Patch 4 "PCI: Export pci_dev_lock()" is basically an extension to commit
> > e3a9b1212b9d ("PCI: Export pci_dev_trylock() and pci_dev_unlock()") which
> > already exported pci_dev_trylock(). In the final patch we make use of
> > pci_dev_lock() to wait for any other exclusive uses of the pdev to be finished
> > before starting recovery.
> 
> Hmm, I noticed the EEH
> (arch/powerpc/kernel/eeh_driver.c:eeh_pe_report_edev())  and the
> generic PCIe error recovery code (see
> drivers/pci/pcie/err.c:report_error_detected()) only call
> device_lock() before entering the driver's error handling callbacks. I
> wonder if they should be using pci_dev_lock() instead. The only real
> difference is that pci_dev_lock() will also block user space from
> accessing the device's config space while error recovery is in
> progress which seems sensible enough.

I agree that sounds reasonable.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] s390/pci: automatic error recovery
  2021-09-07  8:45   ` Niklas Schnelle
@ 2021-09-07 12:21     ` Niklas Schnelle
  2021-09-08  1:37       ` Oliver O'Halloran
  0 siblings, 1 reply; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-07 12:21 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: Bjorn Helgaas, Linas Vepstas, Russell Currey, linuxppc-dev,
	Linux Kernel Mailing List, linux-s390, Matthew Rosato,
	Pierre Morel

On Tue, 2021-09-07 at 10:45 +0200, Niklas Schnelle wrote:
> On Tue, 2021-09-07 at 12:04 +1000, Oliver O'Halloran wrote:
> > On Mon, Sep 6, 2021 at 7:49 PM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> > > Patch 3 I already sent separately resulting in the discussion below but without
> > > a final conclusion.
> > > 
> > > https://lore.kernel.org/lkml/20210720150145.640727-1-schnelle@linux.ibm.com/
> > > 
> > > I believe even though there were some doubts about the use of
> > > pci_dev_is_added() by arch code the existing uses as well as the use in the
> > > final patch of this series warrant this export.
> > 
> > The use of pci_dev_is_added() in arch/powerpc was because in the past
> > pci_bus_add_device() could be called before pci_device_add(). That was
> > fixed a while ago so It should be safe to remove those calls now.
> 
> Hmm, ok that confirms Bjorns suspicion and explains how it came to be.
> I can certainly sent a patch for that. This would then leave only the
> existing use in s390 which I added because of a dead lock prevention
> and explained here:
> https://lore.kernel.org/lkml/87d15d5eead35c9eaa667958d057cf4a81a8bf13.camel@linux.ibm.com/
> 
> Plus the need to use it in the recovery code of this series. I think in
> the EEH code the need for a similar check is alleviated by the checks
> in the beginning of
> arch/powerpc/kernel/eeh_driver.c:eeh_handle_normal_event() especially
> eeh_slot_presence_check() which checks presence via the hotplug slot.
> I guess we could use our own state tracking in a similar way but felt
> like pci_dev_is_added() is the more logical choice.

Looking into this again, I think we actually can't easily track this
state ourselves outside struct pci_dev. The reason for this is that
when e.g. arch/s390/pci/pci_sysfs.c:recover_store() removes the struct
pci_dev and scans it again the new struct pci_dev re-uses the same
struct zpci_dev because from a platform point of view the PCI device
was never removed but only disabled and re-enabled. Thus we can only
distinguish the stale struct pci_dev by looking at things stored in
struct pci_dev itself.

That said, I think for the recovery case we might be able to drop the
pci_dev_is_added() and rely on pdev->driver != NULL which we check
anyway and that should catch any PCI device that was already removed.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] s390/pci: automatic error recovery
  2021-09-07 12:21     ` Niklas Schnelle
@ 2021-09-08  1:37       ` Oliver O'Halloran
  2021-09-08  8:09         ` Niklas Schnelle
  0 siblings, 1 reply; 17+ messages in thread
From: Oliver O'Halloran @ 2021-09-08  1:37 UTC (permalink / raw)
  To: Niklas Schnelle
  Cc: Bjorn Helgaas, Linas Vepstas, Russell Currey, linuxppc-dev,
	Linux Kernel Mailing List, linux-s390, Matthew Rosato,
	Pierre Morel

On Tue, Sep 7, 2021 at 10:21 PM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
>
> On Tue, 2021-09-07 at 10:45 +0200, Niklas Schnelle wrote:
> > On Tue, 2021-09-07 at 12:04 +1000, Oliver O'Halloran wrote:
> > > On Mon, Sep 6, 2021 at 7:49 PM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> > > > Patch 3 I already sent separately resulting in the discussion below but without
> > > > a final conclusion.
> > > >
> > > > https://lore.kernel.org/lkml/20210720150145.640727-1-schnelle@linux.ibm.com/
> > > >
> > > > I believe even though there were some doubts about the use of
> > > > pci_dev_is_added() by arch code the existing uses as well as the use in the
> > > > final patch of this series warrant this export.
> > >
> > > The use of pci_dev_is_added() in arch/powerpc was because in the past
> > > pci_bus_add_device() could be called before pci_device_add(). That was
> > > fixed a while ago so It should be safe to remove those calls now.
> >
> > Hmm, ok that confirms Bjorns suspicion and explains how it came to be.
> > I can certainly sent a patch for that. This would then leave only the
> > existing use in s390 which I added because of a dead lock prevention
> > and explained here:
> > https://lore.kernel.org/lkml/87d15d5eead35c9eaa667958d057cf4a81a8bf13.camel@linux.ibm.com/
> >
> > Plus the need to use it in the recovery code of this series. I think in
> > the EEH code the need for a similar check is alleviated by the checks
> > in the beginning of
> > arch/powerpc/kernel/eeh_driver.c:eeh_handle_normal_event() especially
> > eeh_slot_presence_check() which checks presence via the hotplug slot.
> > I guess we could use our own state tracking in a similar way but felt
> > like pci_dev_is_added() is the more logical choice.

The slot check is mainly there to prevent attempts to "recover"
devices that have been surprise removed (i.e NVMe hot-unplug). The
actual recovery process operates off the eeh_pe tree which is frozen
in place when an error is detected. If a pci_dev is added or removed
it's not really a problem since those are only ever looked at when
notifying drivers which is done with the rescan_remove lock held. That
said, I wouldn't really encourage anyone to follow the EEH model since
it's pretty byzantine.

> Looking into this again, I think we actually can't easily track this
> state ourselves outside struct pci_dev. The reason for this is that
> when e.g. arch/s390/pci/pci_sysfs.c:recover_store() removes the struct
> pci_dev and scans it again the new struct pci_dev re-uses the same
> struct zpci_dev because from a platform point of view the PCI device
> was never removed but only disabled and re-enabled. Thus we can only
> distinguish the stale struct pci_dev by looking at things stored in
> struct pci_dev itself.

IMO the real problem is removing and re-adding the pci_dev. I think
it's something that's done largely because the PCI core doesn't really
provide any better mechanism for getting a device back into a
known-good state so it's abused to implement error recovery. This is
something that's always annoyed me since it conflates recovery with
hotplug. After a hot-(un)plug we might have a different device or no
device. In the recovery case we expect to start and end with the same
device. Why not apply the same logic to the pci_dev?

Something I was tinkering with before I left IBM was re-working the
way EEH handles recovering devices that don't have a driver with error
handling callbacks to something like:

1. unbind the driver
2. pci_save_state()
3. do the reset
4. pci_restore_state()
5. re-bind the driver

That would allow keeping the pci_dev around and let me delete a pile
of confusing code which handles binding the eeh_dev to the new
pci_dev. The obvious problem with that approach is the assumption the
device is functional enough to allow saving the config space, but I
don't think that's a deal breaker. We could stash a copy of the device
state before we allow drivers to attach and use that to restore the
device after the reset. The end result would be the same known-good
state that we'd get after a re-scan.

> That said, I think for the recovery case we might be able to drop the
> pci_dev_is_added() and rely on pdev->driver != NULL which we check
> anyway and that should catch any PCI device that was already removed.

Would that work if there was an error on a device without a driver
bound? If you're just trying to stop races between recovery and device
removal then pci_dev_is_added() is probably the right tool for the
job. Trying to substitute it with a proxy seems like a bad idea.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] s390/pci: automatic error recovery
  2021-09-08  1:37       ` Oliver O'Halloran
@ 2021-09-08  8:09         ` Niklas Schnelle
  0 siblings, 0 replies; 17+ messages in thread
From: Niklas Schnelle @ 2021-09-08  8:09 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: Bjorn Helgaas, Linas Vepstas, Russell Currey, linuxppc-dev,
	Linux Kernel Mailing List, linux-s390, Matthew Rosato,
	Pierre Morel

On Wed, 2021-09-08 at 11:37 +1000, Oliver O'Halloran wrote:
> On Tue, Sep 7, 2021 at 10:21 PM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> > On Tue, 2021-09-07 at 10:45 +0200, Niklas Schnelle wrote:
> > > On Tue, 2021-09-07 at 12:04 +1000, Oliver O'Halloran wrote:
> > > > On Mon, Sep 6, 2021 at 7:49 PM Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> > > > > Patch 3 I already sent separately resulting in the discussion below but without
> > > > > a final conclusion.
> > > > > 
> > > > > https://lore.kernel.org/lkml/20210720150145.640727-1-schnelle@linux.ibm.com/
> > > > > 
> > > > > I believe even though there were some doubts about the use of
> > > > > pci_dev_is_added() by arch code the existing uses as well as the use in the
> > > > > final patch of this series warrant this export.
> > > > 
> > > > The use of pci_dev_is_added() in arch/powerpc was because in the past
> > > > pci_bus_add_device() could be called before pci_device_add(). That was
> > > > fixed a while ago so It should be safe to remove those calls now.
> > > 
> > > Hmm, ok that confirms Bjorns suspicion and explains how it came to be.
> > > I can certainly sent a patch for that. This would then leave only the
> > > existing use in s390 which I added because of a dead lock prevention
> > > and explained here:
> > > https://lore.kernel.org/lkml/87d15d5eead35c9eaa667958d057cf4a81a8bf13.camel@linux.ibm.com/
> > > 
> > > Plus the need to use it in the recovery code of this series. I think in
> > > the EEH code the need for a similar check is alleviated by the checks
> > > in the beginning of
> > > arch/powerpc/kernel/eeh_driver.c:eeh_handle_normal_event() especially
> > > eeh_slot_presence_check() which checks presence via the hotplug slot.
> > > I guess we could use our own state tracking in a similar way but felt
> > > like pci_dev_is_added() is the more logical choice.
> 
> The slot check is mainly there to prevent attempts to "recover"
> devices that have been surprise removed (i.e NVMe hot-unplug). The
> actual recovery process operates off the eeh_pe tree which is frozen
> in place when an error is detected. If a pci_dev is added or removed
> it's not really a problem since those are only ever looked at when
> notifying drivers which is done with the rescan_remove lock held. 

Thanks for the explanation.

> That
> said, I wouldn't really encourage anyone to follow the EEH model since
> it's pretty byzantine.
> 
> > Looking into this again, I think we actually can't easily track this
> > state ourselves outside struct pci_dev. The reason for this is that
> > when e.g. arch/s390/pci/pci_sysfs.c:recover_store() removes the struct
> > pci_dev and scans it again the new struct pci_dev re-uses the same
> > struct zpci_dev because from a platform point of view the PCI device
> > was never removed but only disabled and re-enabled. Thus we can only
> > distinguish the stale struct pci_dev by looking at things stored in
> > struct pci_dev itself.
> 
> IMO the real problem is removing and re-adding the pci_dev. I think
> it's something that's done largely because the PCI core doesn't really
> provide any better mechanism for getting a device back into a
> known-good state so it's abused to implement error recovery. This is
> something that's always annoyed me since it conflates recovery with
> hotplug. After a hot-(un)plug we might have a different device or no
> device. In the recovery case we expect to start and end with the same
> device. Why not apply the same logic to the pci_dev?

For us there are two cases. First The existing
/sys/bus/pci/devices/<dev>/recover attribute. This does the pci_dev
remove and re-add that you mention and thus we end up with a ne pci_dev
afterwards and I agree that is kind of a dumb way to recover which
(too?) closely resembles unplug/re-plug.

Secondly the automatic error recovery added in this series. Here we
only attempt recovery if we have a driver bound that supports the error
callbacks thus always keeping the same pci_dev. If there is no driver
we give up automatic recovery and are back at the situation without
this series.

> 
> Something I was tinkering with before I left IBM was re-working the
> way EEH handles recovering devices that don't have a driver with error
> handling callbacks to something like:
> 
> 1. unbind the driver
> 2. pci_save_state()
> 3. do the reset
> 4. pci_restore_state()
> 5. re-bind the driver
> 
> That would allow keeping the pci_dev around and let me delete a pile
> of confusing code which handles binding the eeh_dev to the new
> pci_dev.

This sounds like an interesting future approach for us too. Thankfully
our binding of the zpci_dev to the new pci_dev is pretty simple by now.
The main trouble with removing and re-adding a pci_dev is then that
upper layers like block devices are also re-created which really only
happens if we have a driver bound.

>  The obvious problem with that approach is the assumption the
> device is functional enough to allow saving the config space, but I
> don't think that's a deal breaker. We could stash a copy of the device
> state before we allow drivers to attach and use that to restore the
> device after the reset. The end result would be the same known-good
> state that we'd get after a re-scan.
> 
> > That said, I think for the recovery case we might be able to drop the
> > pci_dev_is_added() and rely on pdev->driver != NULL which we check
> > anyway and that should catch any PCI device that was already removed.
> 
> Would that work if there was an error on a device without a driver
> bound? 

For the automatic recovery flow introduced by this series we only
recover if such a driver is bound anyway so that is already a
requirement. Luckily all physical PCI devices we support on our
platform have drivers with that support.

> If you're just trying to stop races between recovery and device
> removal then pci_dev_is_added() is probably the right tool for the
> job. Trying to substitute it with a proxy seems like a bad idea.

Yes I believe at least for the existing recover attribute that does not
require a bound driver we still need pci_dev_is_added().

For the automatic recovery flow I think it would be okay to rely on the
fact that removed devices don't have a driver bound since the recovery
requires a bound driver anyway but yes an explicit pci_dev_is_added()
check as in this patch does feel more clean.




^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-09-08  8:09 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-06  9:49 [PATCH 0/5] s390/pci: automatic error recovery Niklas Schnelle
2021-09-06  9:49 ` [PATCH 1/5] s390/pci: refresh function handle in iomap Niklas Schnelle
2021-09-06  9:49 ` [PATCH 2/5] s390/pci: implement reset_slot for hotplug slot Niklas Schnelle
2021-09-06  9:49 ` [PATCH 3/5] PCI: Move pci_dev_is/assign_added() to pci.h Niklas Schnelle
2021-09-07  0:22   ` kernel test robot
2021-09-07  0:25   ` kernel test robot
2021-09-07  7:51     ` Andy Shevchenko
2021-09-07  8:14       ` Niklas Schnelle
2021-09-06  9:49 ` [PATCH 4/5] PCI: Export pci_dev_lock() Niklas Schnelle
2021-09-06  9:49 ` [PATCH 5/5] s390/pci: implement minimal PCI error recovery Niklas Schnelle
2021-09-07  2:04 ` [PATCH 0/5] s390/pci: automatic " Oliver O'Halloran
2021-09-07  8:45   ` Niklas Schnelle
2021-09-07 12:21     ` Niklas Schnelle
2021-09-08  1:37       ` Oliver O'Halloran
2021-09-08  8:09         ` Niklas Schnelle
     [not found] ` <CAHrUA34TK6U4TB34FHejott9TdFvSgAedOpmro-Uj2ZwnvzecQ@mail.gmail.com>
2021-09-07  2:10   ` Fwd: " Linas Vepstas
2021-09-07  7:49   ` Niklas Schnelle

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).