LKML Archive on lore.kernel.org
* [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support
@ 2020-03-04  2:36 sathyanarayanan.kuppuswamy
  2020-03-04  2:36 ` [PATCH v17 01/12] PCI/ERR: Update error status after reset_link() sathyanarayanan.kuppuswamy
                   ` (11 more replies)
  0 siblings, 12 replies; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

This patchset adds support for the following features:

1. Error Disconnect Recover (EDR) support.
2. _OSC based negotiation support for DPC.

You can find the EDR spec at the following link.

https://members.pcisig.com/wg/PCI-SIG/document/12614

Changes since v16:
 * Removed reset_link from pcie_port_service_driver.
 * Removed pcie_port_find_service().
 * Added pci_dpc_init() in pci_init_capabilities().

Changes since v15:
 * Split patch 3 of the previous set into multiple patches.
 * Refactored EDR driver to use pci_dev instead of dpc_dev.
 * Added some debug logs to EDR driver.
 * Used pci_aer_raw_clear_status() for clearing AER errors in EDR path.
 * Addressed other comments from Bjorn.
 * Rebased patches on top of Bjorn's "PCI/DPC: Move data to struct pci_dev" patch.

Changes since v14:
 * Rebased on top of v5.6-rc1

Changes since v13:
 * Moved all EDR related code to edr.c
 * Addressed Bjorn's comments.

Changes since v12:
 * Addressed Bjorn's comments.
 * Added check for CONFIG_PCIE_EDR before requesting DPC control from firmware.
 * Removed ff_check parameter from AER APIs.
 * Used macros for _OST return status values in DPC driver.

Changes since v11:
 * Allowed error recovery to proceed after successful reset_link().
 * Used correct ACPI handle for sending EDR status.
 * Rebased on top of v5.5-rc5

Changes since v10:
 * Added "edr_enabled" member to dpc priv structure, which is used to cache EDR
   enabling status based on status of pcie_ports_dpc_native and FF mode.
 * Changed type of _DSM argument from Integer to Package in acpi_enable_dpc_port()
   function to fix ACPI related boot warnings.
 * Rebased on top of v5.5-rc3

Changes since v9:
 * Removed caching of pcie_aer_get_firmware_first() in dpc driver.
 * Added proper spec reference in git log for patch 5 & 7.
 * Added new function parameter "ff_check" to pci_cleanup_aer_uncorrect_error_status(),
   pci_aer_clear_fatal_status() and pci_cleanup_aer_error_status_regs() functions.
 * Rebased on top of v5.4-rc5

Changes since v8:
 * Rebased on top of v5.4-rc1

Changes since v7:
 * Updated DSM version number to match the spec.

Changes since v6:
 * Reordered patches so that EDR is enabled only after all necessary support is added to the kernel.
 * Addressed Bjorn's comments.

Changes since v5:
 * Addressed Keith's comments.
 * Added additional check for FF mode in pci_aer_init().
 * Updated commit history of "PCI/DPC: Add support for DPC recovery on NON_FATAL errors" patch.

Changes since v4:
 * Rebased on top of v5.3-rc1
 * Fixed lock/unlock issue in edr_handle_event().
 * Merged "Update error status after reset_link()" patch into this patchset.

Changes since v3:
 * Moved EDR related ACPI functions/definitions to pci-acpi.c
 * Modified commit log in a few patches to include spec references.
 * Added support to handle DPC triggered by NON_FATAL errors.
 * Added edr_lock to protect a PCI device from receiving duplicate EDR notifications.
 * Addressed Bjorn's comments.

Changes since v2:
 * Split EDR support patch into multiple patches.
 * Addressed Bjorn's comments.

Changes since v1:
 * Rebased on top of v5.1-rc1

Kuppuswamy Sathyanarayanan (12):
  PCI/ERR: Update error status after reset_link()
  PCI/AER: Move pci_cleanup_aer_error_status_regs() declaration to pci.h
  PCI/ERR: Remove service dependency in pcie_do_recovery()
  PCI: portdrv: remove unnecessary pcie_port_find_service()
  PCI: portdrv: remove reset_link member from pcie_port_service_driver
  Documentation: PCI: Remove reset_link references
  PCI/ERR: Return status of pcie_do_recovery()
  PCI/DPC: Cache DPC capabilities in pci_init_capabilities()
  PCI/AER: Allow clearing Error Status Register in FF mode
  PCI/DPC: Export DPC error recovery functions
  PCI/DPC: Add Error Disconnect Recover (EDR) support
  PCI/ACPI: Enable EDR support

 Documentation/PCI/pcieaer-howto.rst |  25 ++-
 drivers/acpi/pci_root.c             |  16 ++
 drivers/pci/pci-acpi.c              |   3 +
 drivers/pci/pci.h                   |  16 +-
 drivers/pci/pcie/Kconfig            |  10 ++
 drivers/pci/pcie/Makefile           |   1 +
 drivers/pci/pcie/aer.c              |  34 ++--
 drivers/pci/pcie/dpc.c              |  47 ++++--
 drivers/pci/pcie/edr.c              | 251 ++++++++++++++++++++++++++++
 drivers/pci/pcie/err.c              |  26 +--
 drivers/pci/pcie/portdrv.h          |   5 -
 drivers/pci/pcie/portdrv_core.c     |  21 ---
 drivers/pci/probe.c                 |   2 +
 include/linux/acpi.h                |   6 +-
 include/linux/aer.h                 |   5 -
 include/linux/pci-acpi.h            |   8 +
 include/linux/pci.h                 |   1 +
 17 files changed, 389 insertions(+), 88 deletions(-)
 create mode 100644 drivers/pci/pcie/edr.c

-- 
2.25.1



* [PATCH v17 01/12] PCI/ERR: Update error status after reset_link()
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-04  2:36 ` [PATCH v17 02/12] PCI/AER: Move pci_cleanup_aer_error_status_regs() declaration to pci.h sathyanarayanan.kuppuswamy
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas
  Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy,
	Keith Busch

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") uses
reset_link() to recover from fatal errors. But during fatal error
recovery, if the initial value of the error status is
PCI_ERS_RESULT_DISCONNECT or PCI_ERS_RESULT_NO_AER_DRIVER, then
pcie_do_recovery() reports the recovery as failed even after a
successful reset_link(). So update the error status with the result
of reset_link().
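
For illustration only, here is a minimal user-space sketch of the
status handling described above. It is not kernel code: the enum and
function names are simplified stand-ins, and the model ignores the
driver callbacks and the NEED_RESET path. It only shows why a stale
status makes a successful link reset look like a failure.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel's pci_ers_result_t values. */
enum ers_result {
	ERS_CAN_RECOVER,
	ERS_DISCONNECT,
	ERS_NO_AER_DRIVER,
	ERS_RECOVERED,
};

/* Common tail of the recovery path in this simplified model. */
static bool finish_recovery(enum ers_result status)
{
	if (status == ERS_CAN_RECOVER)
		status = ERS_RECOVERED;
	return status == ERS_RECOVERED;
}

/* Old flow: the reset result is checked but then thrown away. */
static bool old_flow_recovers(enum ers_result detected, enum ers_result reset)
{
	enum ers_result status = detected;

	if (reset != ERS_RECOVERED)
		return false;

	/* status still holds the value collected before the reset */
	return finish_recovery(status);
}

/* New flow: the reset result replaces the earlier status. */
static bool new_flow_recovers(enum ers_result detected, enum ers_result reset)
{
	enum ers_result status = detected;

	status = reset;
	if (status != ERS_RECOVERED)
		return false;

	return finish_recovery(status);
}
```

With detected == ERS_DISCONNECT and a successful reset, the old flow
reports failure while the new flow reports success.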

You can reproduce this issue by triggering a software DPC using the
"DPC Software Trigger" bit in the "DPC Control Register". You should
see a "device recovery failed" message in dmesg, as below.

[  164.659982] pcieport 0000:00:16.0: DPC: containment event, status:0x1f27 source:0x0000
[  164.659989] pcieport 0000:00:16.0: DPC: software trigger detected
[  164.659994] pci 0000:04:00.0: AER: can't recover (no error_detected callback)
[  164.794300] pcieport 0000:00:16.0: AER: device recovery failed

Fixes: bdb5ac85777d ("PCI/ERR: Handle fatal error recovery")
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Keith Busch <keith.busch@intel.com>
---
 drivers/pci/pcie/err.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 01dfc8bb7ca0..eefefe03857a 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -208,9 +208,11 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 	else
 		pci_walk_bus(bus, report_normal_detected, &status);
 
-	if (state == pci_channel_io_frozen &&
-	    reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
-		goto failed;
+	if (state == pci_channel_io_frozen) {
+		status = reset_link(dev, service);
+		if (status != PCI_ERS_RESULT_RECOVERED)
+			goto failed;
+	}
 
 	if (status == PCI_ERS_RESULT_CAN_RECOVER) {
 		status = PCI_ERS_RESULT_RECOVERED;
-- 
2.25.1



* [PATCH v17 02/12] PCI/AER: Move pci_cleanup_aer_error_status_regs() declaration to pci.h
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
  2020-03-04  2:36 ` [PATCH v17 01/12] PCI/ERR: Update error status after reset_link() sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-04  2:36 ` [PATCH v17 03/12] PCI/ERR: Remove service dependency in pcie_do_recovery() sathyanarayanan.kuppuswamy
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Since pci_cleanup_aer_error_status_regs() is only used within the
drivers/pci/ directory, move the function declaration to
drivers/pci/pci.h.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/pci/pci.h   | 5 +++++
 include/linux/aer.h | 5 -----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 6394e7746fb5..a4c360515a69 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -651,12 +651,17 @@ void pci_aer_exit(struct pci_dev *dev);
 extern const struct attribute_group aer_stats_attr_group;
 void pci_aer_clear_fatal_status(struct pci_dev *dev);
 void pci_aer_clear_device_status(struct pci_dev *dev);
+int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
 #else
 static inline void pci_no_aer(void) { }
 static inline void pci_aer_init(struct pci_dev *d) { }
 static inline void pci_aer_exit(struct pci_dev *d) { }
 static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
 static inline void pci_aer_clear_device_status(struct pci_dev *dev) { }
+static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
+{
+	return -EINVAL;
+}
 #endif
 
 #ifdef CONFIG_ACPI
diff --git a/include/linux/aer.h b/include/linux/aer.h
index fa19e01f418a..4e4b4960a3d8 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -45,7 +45,6 @@ struct aer_capability_regs {
 int pci_enable_pcie_error_reporting(struct pci_dev *dev);
 int pci_disable_pcie_error_reporting(struct pci_dev *dev);
 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
-int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
 void pci_save_aer_state(struct pci_dev *dev);
 void pci_restore_aer_state(struct pci_dev *dev);
 #else
@@ -61,10 +60,6 @@ static inline int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
 {
 	return -EINVAL;
 }
-static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
-{
-	return -EINVAL;
-}
 static inline void pci_save_aer_state(struct pci_dev *dev) {}
 static inline void pci_restore_aer_state(struct pci_dev *dev) {}
 #endif
-- 
2.25.1



* [PATCH v17 03/12] PCI/ERR: Remove service dependency in pcie_do_recovery()
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
  2020-03-04  2:36 ` [PATCH v17 01/12] PCI/ERR: Update error status after reset_link() sathyanarayanan.kuppuswamy
  2020-03-04  2:36 ` [PATCH v17 02/12] PCI/AER: Move pci_cleanup_aer_error_status_regs() declaration to pci.h sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-17 14:40   ` Christoph Hellwig
  2020-03-04  2:36 ` [PATCH v17 04/12] PCI: portdrv: remove unnecessary pcie_port_find_service() sathyanarayanan.kuppuswamy
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Currently we pass the PCIe service type to pcie_do_recovery(), which
reset_link() in turn uses to look up the underlying
pcie_port_service_driver and then call the driver specific
reset_link callback. Instead of using this roundabout way, we can
just pass the driver specific reset_link callback function when
calling pcie_do_recovery().

This change also allows code that is not a PCIe port service driver
to call pcie_do_recovery(). This is required for adding Error
Disconnect Recover (EDR) support.
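
For illustration only, the callback selection can be sketched in plain
C as below. This is a user-space model, not kernel code: struct dev,
reset_cb_t, default_reset() and example_driver_reset() are hypothetical
stand-ins, and the sketch assumes that a port with neither a callback
nor a default reset is treated as a failed reset.

```c
#include <stddef.h>

enum ers_result { ERS_DISCONNECT, ERS_RECOVERED };

/* Hypothetical stand-in for struct pci_dev. */
struct dev {
	int is_downstream_port;
	int driver_resets;	/* counts calls to the driver callback */
};

typedef enum ers_result (*reset_cb_t)(struct dev *d);

/* Stand-in for the default reset used for downstream ports. */
static enum ers_result default_reset(struct dev *d)
{
	(void)d;
	return ERS_RECOVERED;
}

/* Hypothetical driver specific reset, passed in by the caller. */
static enum ers_result example_driver_reset(struct dev *d)
{
	d->driver_resets++;
	return ERS_RECOVERED;
}

/* The caller supplies the callback directly; no service lookup. */
static enum ers_result reset_link(struct dev *d, reset_cb_t reset_cb)
{
	if (reset_cb)
		return reset_cb(d);
	if (d->is_downstream_port)
		return default_reset(d);

	/* Assumption: no way to reset the link, so report failure. */
	return ERS_DISCONNECT;
}
```

A caller that has its own reset routine passes it as reset_cb; a caller
without one passes NULL and gets the default behaviour.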

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/pci/pci.h      |  2 +-
 drivers/pci/pcie/aer.c | 11 +++++------
 drivers/pci/pcie/dpc.c |  2 +-
 drivers/pci/pcie/err.c | 16 ++++++++--------
 4 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index a4c360515a69..2962200bfe35 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -548,7 +548,7 @@ static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
 
 /* PCI error reporting and recovery */
 void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
-		      u32 service);
+		      pci_ers_result_t (*reset_cb)(struct pci_dev *pdev));
 
 bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
 #ifdef CONFIG_PCIEASPM
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 4a818b07a1af..1235eca0a2e6 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -102,6 +102,7 @@ struct aer_stats {
 #define ERR_UNCOR_ID(d)			(d >> 16)
 
 static int pcie_aer_disable;
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
 
 void pci_no_aer(void)
 {
@@ -1053,11 +1054,9 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info)
 					info->status);
 		pci_aer_clear_device_status(dev);
 	} else if (info->severity == AER_NONFATAL)
-		pcie_do_recovery(dev, pci_channel_io_normal,
-				 PCIE_PORT_SERVICE_AER);
+		pcie_do_recovery(dev, pci_channel_io_normal, aer_root_reset);
 	else if (info->severity == AER_FATAL)
-		pcie_do_recovery(dev, pci_channel_io_frozen,
-				 PCIE_PORT_SERVICE_AER);
+		pcie_do_recovery(dev, pci_channel_io_frozen, aer_root_reset);
 	pci_dev_put(dev);
 }
 
@@ -1094,10 +1093,10 @@ static void aer_recover_work_func(struct work_struct *work)
 		cper_print_aer(pdev, entry.severity, entry.regs);
 		if (entry.severity == AER_NONFATAL)
 			pcie_do_recovery(pdev, pci_channel_io_normal,
-					 PCIE_PORT_SERVICE_AER);
+					 aer_root_reset);
 		else if (entry.severity == AER_FATAL)
 			pcie_do_recovery(pdev, pci_channel_io_frozen,
-					 PCIE_PORT_SERVICE_AER);
+					 aer_root_reset);
 		pci_dev_put(pdev);
 	}
 }
diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index 6b116d7fdb89..114358d62ddf 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -227,7 +227,7 @@ static irqreturn_t dpc_handler(int irq, void *context)
 	}
 
 	/* We configure DPC so it only triggers on ERR_FATAL */
-	pcie_do_recovery(pdev, pci_channel_io_frozen, PCIE_PORT_SERVICE_DPC);
+	pcie_do_recovery(pdev, pci_channel_io_frozen, dpc_reset_link);
 
 	return IRQ_HANDLED;
 }
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index eefefe03857a..05f87bc9d011 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -162,14 +162,13 @@ static pci_ers_result_t default_reset_link(struct pci_dev *dev)
 	return rc ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
 }
 
-static pci_ers_result_t reset_link(struct pci_dev *dev, u32 service)
+static pci_ers_result_t reset_link(struct pci_dev *dev,
+			pci_ers_result_t (*reset_cb)(struct pci_dev *pdev))
 {
 	pci_ers_result_t status;
-	struct pcie_port_service_driver *driver = NULL;
 
-	driver = pcie_port_find_service(dev, service);
-	if (driver && driver->reset_link) {
-		status = driver->reset_link(dev);
+	if (reset_cb) {
+		status = reset_cb(dev);
 	} else if (pcie_downstream_port(dev)) {
 		status = default_reset_link(dev);
 	} else {
@@ -187,8 +186,9 @@ static pci_ers_result_t reset_link(struct pci_dev *dev, u32 service)
 	return status;
 }
 
-void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
-		      u32 service)
+void pcie_do_recovery(struct pci_dev *dev,
+		      enum pci_channel_state state,
+		      pci_ers_result_t (*reset_cb)(struct pci_dev *pdev))
 {
 	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
 	struct pci_bus *bus;
@@ -209,7 +209,7 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 		pci_walk_bus(bus, report_normal_detected, &status);
 
 	if (state == pci_channel_io_frozen) {
-		status = reset_link(dev, service);
+		status = reset_link(dev, reset_cb);
 		if (status != PCI_ERS_RESULT_RECOVERED)
 			goto failed;
 	}
-- 
2.25.1



* [PATCH v17 04/12] PCI: portdrv: remove unnecessary pcie_port_find_service()
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
                   ` (2 preceding siblings ...)
  2020-03-04  2:36 ` [PATCH v17 03/12] PCI/ERR: Remove service dependency in pcie_do_recovery() sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-04  2:36 ` [PATCH v17 05/12] PCI: portdrv: remove reset_link member from pcie_port_service_driver sathyanarayanan.kuppuswamy
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Since pcie_do_recovery() has been refactored to not depend on the
PCIe service driver, pcie_port_find_service() no longer has any
users. So just remove it.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/pci/pcie/portdrv.h      |  2 --
 drivers/pci/pcie/portdrv_core.c | 21 ---------------------
 2 files changed, 23 deletions(-)

diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 1e673619b101..c5da165ce016 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -161,7 +161,5 @@ static inline int pcie_aer_get_firmware_first(struct pci_dev *pci_dev)
 }
 #endif
 
-struct pcie_port_service_driver *pcie_port_find_service(struct pci_dev *dev,
-							u32 service);
 struct device *pcie_port_find_device(struct pci_dev *dev, u32 service);
 #endif /* _PORTDRV_H_ */
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 5075cb9e850c..50a9522ab07d 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -458,27 +458,6 @@ static int find_service_iter(struct device *device, void *data)
 	return 0;
 }
 
-/**
- * pcie_port_find_service - find the service driver
- * @dev: PCI Express port the service is associated with
- * @service: Service to find
- *
- * Find PCI Express port service driver associated with given service
- */
-struct pcie_port_service_driver *pcie_port_find_service(struct pci_dev *dev,
-							u32 service)
-{
-	struct pcie_port_service_driver *drv;
-	struct portdrv_service_data pdrvs;
-
-	pdrvs.drv = NULL;
-	pdrvs.service = service;
-	device_for_each_child(&dev->dev, &pdrvs, find_service_iter);
-
-	drv = pdrvs.drv;
-	return drv;
-}
-
 /**
  * pcie_port_find_device - find the struct device
  * @dev: PCI Express port the service is associated with
-- 
2.25.1



* [PATCH v17 05/12] PCI: portdrv: remove reset_link member from pcie_port_service_driver
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
                   ` (3 preceding siblings ...)
  2020-03-04  2:36 ` [PATCH v17 04/12] PCI: portdrv: remove unnecessary pcie_port_find_service() sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-17 14:41   ` Christoph Hellwig
  2020-03-04  2:36 ` [PATCH v17 06/12] Documentation: PCI: Remove reset_link references sathyanarayanan.kuppuswamy
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

The reset_link member in struct pcie_port_service_driver was mainly
added to let pcie_do_recovery() trigger the driver specific
reset_link() on PCIe fatal errors. Now that pcie_do_recovery()
accepts the reset_link callback as a function parameter, we no
longer need to use or set reset_link in struct
pcie_port_service_driver. So remove it.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/pci/pcie/aer.c     | 1 -
 drivers/pci/pcie/dpc.c     | 1 -
 drivers/pci/pcie/portdrv.h | 3 ---
 3 files changed, 5 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 1235eca0a2e6..c0540c3761dc 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1500,7 +1500,6 @@ static struct pcie_port_service_driver aerdriver = {
 
 	.probe		= aer_probe,
 	.remove		= aer_remove,
-	.reset_link	= aer_root_reset,
 };
 
 /**
diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index 114358d62ddf..1ae5d94944eb 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -313,7 +313,6 @@ static struct pcie_port_service_driver dpcdriver = {
 	.service	= PCIE_PORT_SERVICE_DPC,
 	.probe		= dpc_probe,
 	.remove		= dpc_remove,
-	.reset_link	= dpc_reset_link,
 };
 
 int __init pcie_dpc_init(void)
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index c5da165ce016..64b5e081cdb2 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -92,9 +92,6 @@ struct pcie_port_service_driver {
 	/* Device driver may resume normal operations */
 	void (*error_resume)(struct pci_dev *dev);
 
-	/* Link Reset Capability - AER service driver specific */
-	pci_ers_result_t (*reset_link)(struct pci_dev *dev);
-
 	int port_type;  /* Type of the port this driver can handle */
 	u32 service;    /* Port service this device represents */
 
-- 
2.25.1



* [PATCH v17 06/12] Documentation: PCI: Remove reset_link references
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
                   ` (4 preceding siblings ...)
  2020-03-04  2:36 ` [PATCH v17 05/12] PCI: portdrv: remove reset_link member from pcie_port_service_driver sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-17 14:42   ` Christoph Hellwig
  2020-03-04  2:36 ` [PATCH v17 07/12] PCI/ERR: Return status of pcie_do_recovery() sathyanarayanan.kuppuswamy
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

After the pcie_do_recovery() refactor, the service driver specific
reset_link function is passed through the reset_cb callback
parameter of pcie_do_recovery() instead of the reset_link member of
struct pcie_port_service_driver. So update the documentation to
reflect this change.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 Documentation/PCI/pcieaer-howto.rst | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/Documentation/PCI/pcieaer-howto.rst b/Documentation/PCI/pcieaer-howto.rst
index 18bdefaafd1a..0f3e5880efb8 100644
--- a/Documentation/PCI/pcieaer-howto.rst
+++ b/Documentation/PCI/pcieaer-howto.rst
@@ -147,7 +147,7 @@ section 3.3.
 Provide callbacks
 -----------------
 
-callback reset_link to reset pci express link
+callback reset_cb to reset pci express link
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 This callback is used to reset the pci express physical link when a
@@ -156,11 +156,8 @@ default reset_link function, but different upstream ports might
 have different specifications to reset pci express link, so all
 upstream ports should provide their own reset_link functions.
 
-In struct pcie_port_service_driver, a new pointer, reset_link, is
-added.
-::
-
-	pci_ers_result_t (*reset_link) (struct pci_dev *dev);
+In pcie_do_recovery function, reset_cb function pointer can be used
+to pass the port specific reset_link callback.
 
 Section 3.2.2.2 provides more detailed info on when to call
 reset_link.
@@ -212,13 +209,13 @@ error_detected(dev, pci_channel_io_frozen) to all drivers within
 a hierarchy in question. Then, performing link reset at upstream is
 necessary. As different kinds of devices might use different approaches
 to reset link, AER port service driver is required to provide the
-function to reset link. Firstly, kernel looks for if the upstream
-component has an aer driver. If it has, kernel uses the reset_link
-callback of the aer driver. If the upstream component has no aer driver
-and the port is downstream port, we will perform a hot reset as the
-default by setting the Secondary Bus Reset bit of the Bridge Control
-register associated with the downstream port. As for upstream ports,
-they should provide their own aer service drivers with reset_link
+function to reset link via reset_cb parameter of pcie_do_recovery()
+function call. If reset_cb is not NULL, recovery function will use it
+to reset the link. If there is no reset_cb callback provided and
+the port is downstream port, we will perform a hot reset as the default
+by setting the Secondary Bus Reset bit of the Bridge Control register
+associated with the downstream port. As for upstream ports,
+they should provide their own reset_link function via reset_cb callback
 function. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER and
 reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
 to mmio_enabled.
@@ -262,7 +259,7 @@ A:
 
 Q:
   What happens if an upstream port service driver does not provide
-  callback reset_link?
+  callback reset_cb?
 
 A:
   Fatal error recovery will fail if the errors are reported by the
-- 
2.25.1



* [PATCH v17 07/12] PCI/ERR: Return status of pcie_do_recovery()
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
                   ` (5 preceding siblings ...)
  2020-03-04  2:36 ` [PATCH v17 06/12] Documentation: PCI: Remove reset_link references sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-04  2:36 ` [PATCH v17 08/12] PCI/DPC: Cache DPC capabilities in pci_init_capabilities() sathyanarayanan.kuppuswamy
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

As per the Downstream Port Containment Related Enhancements ECN to the
PCI Firmware Specification r3.2, sec 4.5.1, table 4-4, Support for Error
Disconnect Recover (EDR) implies that the OS will invalidate the
software state associated with child devices of the port without
attempting to access the child device hardware. If the OS supports
Downstream Port Containment (DPC), as indicated by the OS setting bit 7
of _OSC control field, the OS shall attempt to recover the child devices
if the port implements the Downstream Port Containment Extended
Capability. If the OS continues operation, the OS must inform the
Firmware of the status of the recovery operation via the _OST method.

So to report the status of error recovery via _OST when adding EDR
support, the caller needs to know the result of the recovery. Make
pcie_do_recovery() return its status.
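
For illustration only, a caller could use the returned status roughly
as sketched below. This is a user-space model, not the EDR code from
this series: do_recovery(), handle_event() and the REPORT_* values are
hypothetical placeholders and do not correspond to real _OST status
codes.

```c
enum ers_result { ERS_DISCONNECT, ERS_RECOVERED };

/* Placeholder outcomes that a caller would report to firmware. */
enum fw_report { REPORT_SUCCESS, REPORT_FAILED };

/* Stand-in for a recovery routine that now returns its status. */
static enum ers_result do_recovery(int link_reset_ok)
{
	return link_reset_ok ? ERS_RECOVERED : ERS_DISCONNECT;
}

/* The caller maps the recovery status to what it reports. */
static enum fw_report handle_event(int link_reset_ok)
{
	enum ers_result status = do_recovery(link_reset_ok);

	return status == ERS_RECOVERED ? REPORT_SUCCESS : REPORT_FAILED;
}
```

With a void return type the caller would have no way to choose between
the two reports.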

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/pci/pci.h      |  5 +++--
 drivers/pci/pcie/err.c | 10 ++++++----
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 2962200bfe35..c2c35f152cde 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -547,8 +547,9 @@ static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
 #endif
 
 /* PCI error reporting and recovery */
-void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
-		      pci_ers_result_t (*reset_cb)(struct pci_dev *pdev));
+pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
+			enum pci_channel_state state,
+			pci_ers_result_t (*reset_cb)(struct pci_dev *pdev));
 
 bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
 #ifdef CONFIG_PCIEASPM
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 05f87bc9d011..b560f0096a70 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -186,9 +186,9 @@ static pci_ers_result_t reset_link(struct pci_dev *dev,
 	return status;
 }
 
-void pcie_do_recovery(struct pci_dev *dev,
-		      enum pci_channel_state state,
-		      pci_ers_result_t (*reset_cb)(struct pci_dev *pdev))
+pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
+			enum pci_channel_state state,
+			pci_ers_result_t (*reset_cb)(struct pci_dev *pdev))
 {
 	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
 	struct pci_bus *bus;
@@ -240,11 +240,13 @@ void pcie_do_recovery(struct pci_dev *dev,
 	pci_aer_clear_device_status(dev);
 	pci_cleanup_aer_uncorrect_error_status(dev);
 	pci_info(dev, "device recovery successful\n");
-	return;
+	return status;
 
 failed:
 	pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
 
 	/* TODO: Should kernel panic here? */
 	pci_info(dev, "device recovery failed\n");
+
+	return status;
 }
-- 
2.25.1



* [PATCH v17 08/12] PCI/DPC: Cache DPC capabilities in pci_init_capabilities()
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
                   ` (6 preceding siblings ...)
  2020-03-04  2:36 ` [PATCH v17 07/12] PCI/ERR: Return status of pcie_do_recovery() sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-04  2:36 ` [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode sathyanarayanan.kuppuswamy
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Since we need to re-use the DPC error handling routines in the Error
Disconnect Recover (EDR) driver, move the initialization and caching
of DPC capabilities to pci_init_capabilities().
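
For illustration only, the capability decoding that gets cached can be
modelled in user space as below. struct dpc_info and parse_dpc_cap()
are hypothetical names, and the two mask values are assumptions meant
to mirror PCI_EXP_DPC_CAP_RP_EXT and PCI_EXP_DPC_RP_PIO_LOG_SIZE from
the kernel's pci_regs.h; check them against the header before relying
on them.

```c
#include <stdint.h>

/* Assumed to mirror the kernel's DPC Capability register masks. */
#define DPC_CAP_RP_EXT		0x0020	/* RP extensions supported */
#define DPC_RP_PIO_LOG_SIZE	0x0f00	/* RP PIO log size field */

/* Hypothetical container for the values cached per device. */
struct dpc_info {
	unsigned int rp_extensions;
	unsigned int rp_log_size;
};

/* Decode a raw DPC Capability register value. */
static struct dpc_info parse_dpc_cap(uint16_t cap)
{
	struct dpc_info info = { 0, 0 };

	info.rp_extensions = (cap & DPC_CAP_RP_EXT) ? 1 : 0;
	if (info.rp_extensions) {
		info.rp_log_size = (cap & DPC_RP_PIO_LOG_SIZE) >> 8;
		/* Sizes outside 4..9 are treated as invalid and zeroed. */
		if (info.rp_log_size < 4 || info.rp_log_size > 9)
			info.rp_log_size = 0;
	}
	return info;
}
```

Doing this once at enumeration time means both the DPC service driver
and the EDR path can read the cached values.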

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/pci/pci.h      |  2 ++
 drivers/pci/pcie/dpc.c | 32 ++++++++++++++++++++------------
 drivers/pci/probe.c    |  1 +
 3 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c2c35f152cde..e57e78b619f8 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -448,9 +448,11 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
 #ifdef CONFIG_PCIE_DPC
 void pci_save_dpc_state(struct pci_dev *dev);
 void pci_restore_dpc_state(struct pci_dev *dev);
+void pci_dpc_init(struct pci_dev *pdev);
 #else
 static inline void pci_save_dpc_state(struct pci_dev *dev) {}
 static inline void pci_restore_dpc_state(struct pci_dev *dev) {}
+static inline void pci_dpc_init(struct pci_dev *pdev) {}
 #endif
 
 #ifdef CONFIG_PCI_ATS
diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index 1ae5d94944eb..ad011d6b22c5 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -249,6 +249,26 @@ static irqreturn_t dpc_irq(int irq, void *context)
 	return IRQ_HANDLED;
 }
 
+void pci_dpc_init(struct pci_dev *pdev)
+{
+	u16 cap;
+
+	pdev->dpc_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_DPC);
+	if (!pdev->dpc_cap)
+		return;
+
+	pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CAP, &cap);
+	pdev->dpc_rp_extensions = (cap & PCI_EXP_DPC_CAP_RP_EXT) ? 1 : 0;
+	if (pdev->dpc_rp_extensions) {
+		pdev->dpc_rp_log_size = (cap & PCI_EXP_DPC_RP_PIO_LOG_SIZE) >> 8;
+		if (pdev->dpc_rp_log_size < 4 || pdev->dpc_rp_log_size > 9) {
+			pci_err(pdev, "RP PIO log size %u is invalid\n",
+				pdev->dpc_rp_log_size);
+			pdev->dpc_rp_log_size = 0;
+		}
+	}
+}
+
 #define FLAG(x, y) (((x) & (y)) ? '+' : '-')
 static int dpc_probe(struct pcie_device *dev)
 {
@@ -260,8 +280,6 @@ static int dpc_probe(struct pcie_device *dev)
 	if (pcie_aer_get_firmware_first(pdev) && !pcie_ports_dpc_native)
 		return -ENOTSUPP;
 
-	pdev->dpc_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_DPC);
-
 	status = devm_request_threaded_irq(device, dev->irq, dpc_irq,
 					   dpc_handler, IRQF_SHARED,
 					   "pcie-dpc", pdev);
@@ -274,16 +292,6 @@ static int dpc_probe(struct pcie_device *dev)
 	pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CAP, &cap);
 	pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CTL, &ctl);
 
-	pdev->dpc_rp_extensions = (cap & PCI_EXP_DPC_CAP_RP_EXT) ? 1 : 0;
-	if (pdev->dpc_rp_extensions) {
-		pdev->dpc_rp_log_size = (cap & PCI_EXP_DPC_RP_PIO_LOG_SIZE) >> 8;
-		if (pdev->dpc_rp_log_size < 4 || pdev->dpc_rp_log_size > 9) {
-			pci_err(pdev, "RP PIO log size %u is invalid\n",
-				pdev->dpc_rp_log_size);
-			pdev->dpc_rp_log_size = 0;
-		}
-	}
-
 	ctl = (ctl & 0xfff4) | PCI_EXP_DPC_CTL_EN_FATAL | PCI_EXP_DPC_CTL_INT_EN;
 	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CTL, ctl);
 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 512cb4312ddd..c6f91f886818 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2329,6 +2329,7 @@ static void pci_init_capabilities(struct pci_dev *dev)
 	pci_enable_acs(dev);		/* Enable ACS P2P upstream forwarding */
 	pci_ptm_init(dev);		/* Precision Time Measurement */
 	pci_aer_init(dev);		/* Advanced Error Reporting */
+	pci_dpc_init(dev);		/* Downstream Port Containment */
 
 	pcie_report_downtraining(dev);
 
-- 
2.25.1



* [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
                   ` (7 preceding siblings ...)
  2020-03-04  2:36 ` [PATCH v17 08/12] PCI/DPC: Cache DPC capabilities in pci_init_capabilities() sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-06  5:45   ` Kuppuswamy, Sathyanarayanan
  2020-03-10  2:40   ` Bjorn Helgaas
  2020-03-04  2:36 ` [PATCH v17 10/12] PCI/DPC: Export DPC error recovery functions sathyanarayanan.kuppuswamy
                   ` (2 subsequent siblings)
  11 siblings, 2 replies; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

As per the PCI Firmware Specification r3.2 System Firmware Intermediary
(SFI) _OSC and DPC Updates ECR
(https://members.pcisig.com/wg/PCI-SIG/document/13563), section titled
"DPC Event Handling Implementation Note", page 10, Error Disconnect
Recover (EDR) support allows the OS to handle error recovery and clear
the error registers even in firmware-first (FF) mode. So add a new API,
pci_aer_raw_clear_status(), which clears the AER registers without the
FF mode check.
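
For illustration only (not part of this patch), the resulting layering of
a raw function plus a checked wrapper can be modelled in userspace C.
struct fake_dev, raw_clear_status() and cleanup_status() are hypothetical
names; the real functions operate on struct pci_dev and PCI config space:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a device; not a kernel structure. */
struct fake_dev {
	bool has_aer;         /* AER capability present */
	bool firmware_first;  /* firmware owns AER */
	uint32_t err_status;  /* pending error status bits */
};

/* Raw variant: clears status regardless of who owns AER. */
int raw_clear_status(struct fake_dev *dev)
{
	if (!dev->has_aer)
		return -EIO;
	dev->err_status = 0;
	return 0;
}

/* Checked variant: refuses in firmware-first mode, else defers to raw. */
int cleanup_status(struct fake_dev *dev)
{
	if (dev->firmware_first)
		return -EIO;
	return raw_clear_status(dev);
}
```

In this model a firmware-first device keeps its status bits when
cleanup_status() is called, and only the raw variant, which the EDR path
would use, clears them.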

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/pci/pci.h      |  2 ++
 drivers/pci/pcie/aer.c | 22 ++++++++++++++++++----
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index e57e78b619f8..c239e6dd2542 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -655,6 +655,7 @@ extern const struct attribute_group aer_stats_attr_group;
 void pci_aer_clear_fatal_status(struct pci_dev *dev);
 void pci_aer_clear_device_status(struct pci_dev *dev);
 int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
+int pci_aer_raw_clear_status(struct pci_dev *dev);
 #else
 static inline void pci_no_aer(void) { }
 static inline void pci_aer_init(struct pci_dev *d) { }
@@ -665,6 +666,7 @@ static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
 {
 	return -EINVAL;
 }
+static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
 #endif
 
 #ifdef CONFIG_ACPI
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index c0540c3761dc..41afefa562b7 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -420,7 +420,16 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
 		pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
 }
 
-int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
+/**
+ * pci_aer_raw_clear_status - Clear AER error registers.
+ * @dev: the PCI device
+ *
+ * NOTE: Allows clearing error registers in both FF and
+ * non FF modes.
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_aer_raw_clear_status(struct pci_dev *dev)
 {
 	int pos;
 	u32 status;
@@ -433,9 +442,6 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
 	if (!pos)
 		return -EIO;
 
-	if (pcie_aer_get_firmware_first(dev))
-		return -EIO;
-
 	port_type = pci_pcie_type(dev);
 	if (port_type == PCI_EXP_TYPE_ROOT_PORT) {
 		pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);
@@ -451,6 +457,14 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
 	return 0;
 }
 
+int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
+{
+	if (pcie_aer_get_firmware_first(dev))
+		return -EIO;
+
+	return pci_aer_raw_clear_status(dev);
+}
+
 void pci_save_aer_state(struct pci_dev *dev)
 {
 	struct pci_cap_saved_state *save_state;
-- 
2.25.1



* [PATCH v17 10/12] PCI/DPC: Export DPC error recovery functions
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
                   ` (8 preceding siblings ...)
  2020-03-04  2:36 ` [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-17 14:43   ` Christoph Hellwig
  2020-03-04  2:36 ` [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
  2020-03-04  2:36 ` [PATCH v17 12/12] PCI/ACPI: Enable EDR support sathyanarayanan.kuppuswamy
  11 siblings, 1 reply; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

This is a preparatory patch for adding EDR support.

As per the Downstream Port Containment Related Enhancements ECN to the
PCI Firmware Specification r3.2, sec 4.5.1, table 4-6, if DPC is
controlled by firmware, firmware is responsible for initializing the
Downstream Port Containment Extended Capability Structures per firmware
policy. Further, the OS is permitted to read or write the DPC Control and
Status registers of a port while processing an Error Disconnect Recover
notification from firmware on that port.

To add EDR support we need to re-use the DPC error handling functions, so
export dpc_process_error() and dpc_reset_link().

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/pci/pci.h      |  2 ++
 drivers/pci/pcie/dpc.c | 12 +++++++++---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c239e6dd2542..a475192c553a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -449,6 +449,8 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
 void pci_save_dpc_state(struct pci_dev *dev);
 void pci_restore_dpc_state(struct pci_dev *dev);
 void pci_dpc_init(struct pci_dev *pdev);
+void dpc_process_error(struct pci_dev *pdev);
+pci_ers_result_t dpc_reset_link(struct pci_dev *pdev);
 #else
 static inline void pci_save_dpc_state(struct pci_dev *dev) {}
 static inline void pci_restore_dpc_state(struct pci_dev *dev) {}
diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index ad011d6b22c5..72bfb58918e1 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -89,7 +89,7 @@ static int dpc_wait_rp_inactive(struct pci_dev *pdev)
 	return 0;
 }
 
-static pci_ers_result_t dpc_reset_link(struct pci_dev *pdev)
+pci_ers_result_t dpc_reset_link(struct pci_dev *pdev)
 {
 	u16 cap;
 
@@ -193,9 +193,8 @@ static int dpc_get_aer_uncorrect_severity(struct pci_dev *dev,
 	return 1;
 }
 
-static irqreturn_t dpc_handler(int irq, void *context)
+void dpc_process_error(struct pci_dev *pdev)
 {
-	struct pci_dev *pdev = context;
 	u16 cap = pdev->dpc_cap, status, source, reason, ext_reason;
 	struct aer_err_info info;
 
@@ -225,6 +224,13 @@ static irqreturn_t dpc_handler(int irq, void *context)
 		pci_cleanup_aer_uncorrect_error_status(pdev);
 		pci_aer_clear_fatal_status(pdev);
 	}
+}
+
+static irqreturn_t dpc_handler(int irq, void *context)
+{
+	struct pci_dev *pdev = context;
+
+	dpc_process_error(pdev);
 
 	/* We configure DPC so it only triggers on ERR_FATAL */
 	pcie_do_recovery(pdev, pci_channel_io_frozen, dpc_reset_link);
-- 
2.25.1



* [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
                   ` (9 preceding siblings ...)
  2020-03-04  2:36 ` [PATCH v17 10/12] PCI/DPC: Export DPC error recovery functions sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  2020-03-06  3:47   ` Bjorn Helgaas
  2020-03-04  2:36 ` [PATCH v17 12/12] PCI/ACPI: Enable EDR support sathyanarayanan.kuppuswamy
  11 siblings, 1 reply; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

As per ACPI specification r6.3, sec 5.6.6, when firmware owns Downstream
Port Containment (DPC), it is expected to use the "Error Disconnect
Recover" (EDR) notification to alert OSPM of a DPC event. If the OS
supports EDR, it is expected to handle the software state invalidation
and port recovery, and to let firmware know the recovery status via the
_OST ACPI call. The related _OST status codes can be found in ACPI
specification r6.3, sec 6.3.5.2.

Also, as per the PCI Firmware Specification r3.2 Downstream Port
Containment Related Enhancements ECN, sec 4.5.1, table 4-6, if DPC is
controlled by firmware (firmware-first mode), firmware is responsible
for configuring DPC and the OS is responsible for error recovery. The OS
is allowed to modify DPC registers only during the EDR notification
window.
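
For illustration only (not part of this patch), the _OST status argument
described above can be sketched in userspace C. The helper names edr_bdf()
and edr_ost_status() are hypothetical; the status codes and the bit layout
(bus in 15:8, device in 7:3, function in 2:0, BDF shifted into the upper
16 bits) are taken from the code in this patch:

```c
#include <stdint.h>

#define EDR_OST_SUCCESS 0x80u
#define EDR_OST_FAILED  0x81u

/* Pack bus/device/function as 15:8 = bus, 7:3 = device, 2:0 = function. */
uint16_t edr_bdf(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return (uint16_t)(((unsigned int)bus << 8) |
			  ((dev & 0x1fu) << 3) | (fn & 0x07u));
}

/* Build the _OST status argument: BDF in the upper 16 bits, code below. */
uint32_t edr_ost_status(uint16_t bdf, uint16_t code)
{
	return ((uint32_t)bdf << 16) | code;
}
```

For a port at 3a:02.1, edr_bdf() yields 0x3a11, so a successful recovery
would be reported as 0x3a110080 and a failed one as 0x3a110081.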

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 drivers/pci/pci-acpi.c    |   3 +
 drivers/pci/pcie/Kconfig  |  10 ++
 drivers/pci/pcie/Makefile |   1 +
 drivers/pci/pcie/edr.c    | 249 ++++++++++++++++++++++++++++++++++++++
 include/linux/pci-acpi.h  |   8 ++
 5 files changed, 271 insertions(+)
 create mode 100644 drivers/pci/pcie/edr.c

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 0c02d500158f..6af5d6a04990 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -1258,6 +1258,7 @@ static void pci_acpi_setup(struct device *dev)
 
 	acpi_pci_wakeup(pci_dev, false);
 	acpi_device_power_add_dependent(adev, dev);
+	pci_acpi_add_edr_notifier(pci_dev);
 }
 
 static void pci_acpi_cleanup(struct device *dev)
@@ -1276,6 +1277,8 @@ static void pci_acpi_cleanup(struct device *dev)
 
 		device_set_wakeup_capable(dev, false);
 	}
+
+	pci_acpi_remove_edr_notifier(pci_dev);
 }
 
 static bool pci_acpi_bus_match(struct device *dev)
diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
index 6e3c04b46fb1..772b1f4cb19e 100644
--- a/drivers/pci/pcie/Kconfig
+++ b/drivers/pci/pcie/Kconfig
@@ -140,3 +140,13 @@ config PCIE_BW
 	  This enables PCI Express Bandwidth Change Notification.  If
 	  you know link width or rate changes occur only to correct
 	  unreliable links, you may answer Y.
+
+config PCIE_EDR
+	bool "PCI Express Error Disconnect Recover support"
+	depends on PCIE_DPC && ACPI
+	help
+	  This option adds Error Disconnect Recover support as specified
+	  in the Downstream Port Containment Related Enhancements ECN to
+	  the PCI Firmware Specification r3.2.  Enable this if you want to
+	  support hybrid DPC model which uses both firmware and OS to
+	  implement DPC.
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index efb9d2e71e9e..68da9280ff11 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_PCIE_PME)		+= pme.o
 obj-$(CONFIG_PCIE_DPC)		+= dpc.o
 obj-$(CONFIG_PCIE_PTM)		+= ptm.o
 obj-$(CONFIG_PCIE_BW)		+= bw_notification.o
+obj-$(CONFIG_PCIE_EDR)		+= edr.o
diff --git a/drivers/pci/pcie/edr.c b/drivers/pci/pcie/edr.c
new file mode 100644
index 000000000000..2d8680be0302
--- /dev/null
+++ b/drivers/pci/pcie/edr.c
@@ -0,0 +1,249 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PCI DPC Error Disconnect Recover support driver
+ * Author: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
+ *
+ * Copyright (C) 2020 Intel Corp.
+ */
+
+#define dev_fmt(fmt) "EDR: " fmt
+
+#include <linux/pci.h>
+#include <linux/pci-acpi.h>
+
+#include "portdrv.h"
+#include "../pci.h"
+
+#define EDR_PORT_ENABLE_DSM		0x0C
+#define EDR_PORT_LOCATE_DSM		0x0D
+#define EDR_OST_SUCCESS			0x80
+#define EDR_OST_FAILED			0x81
+
+/*
+ * _DSM wrapper function to enable/disable DPC port.
+ * @pdev   : PCI device structure.
+ *
+ * returns 0 on success or errno on failure.
+ */
+static int acpi_enable_dpc_port(struct pci_dev *pdev)
+{
+	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
+	union acpi_object *obj, argv4, req;
+	int status = 0;
+
+	req.type = ACPI_TYPE_INTEGER;
+	req.integer.value = 1;
+
+	argv4.type = ACPI_TYPE_PACKAGE;
+	argv4.package.count = 1;
+	argv4.package.elements = &req;
+
+	/*
+	 * Per the Downstream Port Containment Related Enhancements ECN to
+	 * the PCI Firmware Specification r3.2, sec 4.6.12,
+	 * EDR_PORT_ENABLE_DSM is optional.  Return success if it's not
+	 * implemented.
+	 */
+	obj = acpi_evaluate_dsm(adev->handle, &pci_acpi_dsm_guid, 5,
+				EDR_PORT_ENABLE_DSM, &argv4);
+	if (!obj)
+		return 0;
+
+	if (obj->type != ACPI_TYPE_INTEGER) {
+		pci_err(pdev, "_DSM 0x0C returns non integer value\n");
+		status = -EIO;
+	}
+
+	if (!status && obj->integer.value != 1) {
+		pci_err(pdev, "failed to enable DPC port\n");
+		status = -EIO;
+	}
+
+	ACPI_FREE(obj);
+
+	return status;
+}
+
+/*
+ * _DSM wrapper function to locate DPC port.
+ * @pdev   : Device which received EDR event.
+ *
+ * returns pci_dev or NULL.
+ */
+static struct pci_dev *acpi_dpc_port_get(struct pci_dev *pdev)
+{
+	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
+	union acpi_object *obj;
+	u16 port;
+
+	obj = acpi_evaluate_dsm(adev->handle, &pci_acpi_dsm_guid, 5,
+				EDR_PORT_LOCATE_DSM, NULL);
+	if (!obj)
+		return pci_dev_get(pdev);
+
+	if (obj->type != ACPI_TYPE_INTEGER) {
+		ACPI_FREE(obj);
+		pci_err(pdev, "_DSM 0x0D returns non integer value\n");
+		return NULL;
+	}
+
+	/*
+	 * Firmware returns DPC port BDF details in following format:
+	 *	15:8 = bus
+	 *	 7:3 = device
+	 *	 2:0 = function
+	 */
+	port = obj->integer.value;
+
+	ACPI_FREE(obj);
+
+	return pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus),
+					   PCI_BUS_NUM(port), port & 0xff);
+}
+
+/*
+ * _OST wrapper function to let firmware know the status of EDR event.
+ * @pdev   : Device used to send _OST.
+ * @edev   : Device which experienced EDR event.
+ * @status: Status of EDR event.
+ */
+static int acpi_send_edr_status(struct pci_dev *pdev, struct pci_dev *edev,
+				u16 status)
+{
+	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
+	u32 ost_status;
+
+	pci_dbg(pdev, "Sending EDR status :%#x\n", status);
+
+	ost_status =  PCI_DEVID(edev->bus->number, edev->devfn);
+	ost_status = (ost_status << 16) | status;
+
+	status = acpi_evaluate_ost(adev->handle,
+				   ACPI_NOTIFY_DISCONNECT_RECOVER,
+				   ost_status, NULL);
+	if (ACPI_FAILURE(status))
+		return -EINVAL;
+
+	return 0;
+}
+
+static void edr_handle_event(acpi_handle handle, u32 event, void *data)
+{
+	struct pci_dev *pdev = data, *edev;
+	pci_ers_result_t estate = PCI_ERS_RESULT_DISCONNECT;
+	u16 status;
+
+	pci_info(pdev, "ACPI event %#x received\n", event);
+
+	if (event != ACPI_NOTIFY_DISCONNECT_RECOVER)
+		return;
+
+	/*
+	 * Check if _DSM(0xD) is available, and if present locate the
+	 * port which issued EDR event.
+	 */
+	edev = acpi_dpc_port_get(pdev);
+	if (!edev) {
+		pci_err(pdev, "Firmware failed to locate DPC port\n");
+		return;
+	}
+
+	pci_dbg(pdev, "Reported EDR dev: %s\n", pci_name(edev));
+
+	/*
+	 * If port does not support DPC, just send the OST:
+	 */
+	if (!edev->dpc_cap) {
+		pci_err(edev, "Firmware BUG, located port doesn't support DPC\n");
+		goto send_ost;
+	}
+
+	/* Check if there is a valid DPC trigger */
+	pci_read_config_word(edev, edev->dpc_cap + PCI_EXP_DPC_STATUS, &status);
+	if (!(status & PCI_EXP_DPC_STATUS_TRIGGER)) {
+		pci_err(edev, "Invalid DPC trigger %#010x\n", status);
+		goto send_ost;
+	}
+
+	dpc_process_error(edev);
+
+	/* Clear AER registers */
+	pci_aer_raw_clear_status(edev);
+
+	/*
+	 * Irrespective of whether the DPC event is triggered by
+	 * ERR_FATAL or ERR_NONFATAL, since the link is already down,
+	 * use the FATAL error recovery path for both cases.
+	 */
+	estate = pcie_do_recovery(edev, pci_channel_io_frozen, dpc_reset_link);
+
+	pci_dbg(edev, "DPC port successfully recovered\n");
+send_ost:
+
+	/*
+	 * If recovery is successful, send _OST(0xF, BDF << 16 | 0x80)
+	 * to firmware. If not successful, send _OST(0xF, BDF << 16 | 0x81).
+	 */
+	if (estate == PCI_ERS_RESULT_RECOVERED)
+		acpi_send_edr_status(pdev, edev, EDR_OST_SUCCESS);
+	else
+		acpi_send_edr_status(pdev, edev, EDR_OST_FAILED);
+
+	pci_dev_put(edev);
+}
+
+void pci_acpi_add_edr_notifier(struct pci_dev *pdev)
+{
+	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
+	acpi_status astatus;
+
+	if (!adev) {
+		pci_dbg(pdev, "No valid ACPI node, so skip EDR init\n");
+		return;
+	}
+
+	/*
+	 * Per the Downstream Port Containment Related Enhancements ECN to
+	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-6, EDR support
+	 * can only be enabled if DPC is controlled by firmware.
+	 *
+	 * TODO: Remove dependency on ACPI FIRMWARE_FIRST bit to
+	 * determine ownership of DPC between firmware or OS.
+	 * Per the Downstream Port Containment Related Enhancements
+	 * ECN to the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
+	 * OS can use bit 7 of _OSC control field to negotiate control
+	 * over DPC Capability.
+	 */
+	if (!pcie_aer_get_firmware_first(pdev) || pcie_ports_dpc_native) {
+		pci_dbg(pdev, "OS handles AER/DPC, so skip EDR init\n");
+		return;
+	}
+
+	astatus = acpi_install_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
+					      edr_handle_event, pdev);
+	if (ACPI_FAILURE(astatus)) {
+		pci_err(pdev, "Install ACPI_SYSTEM_NOTIFY handler failed\n");
+		return;
+	}
+
+	if (acpi_enable_dpc_port(pdev))
+		acpi_remove_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
+					   edr_handle_event);
+
+	pci_dbg(pdev, "EDR notifier is added successfully\n");
+
+	return;
+}
+
+void pci_acpi_remove_edr_notifier(struct pci_dev *pdev)
+{
+	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
+
+	if (!adev)
+		return;
+
+	pci_dbg(pdev, "EDR notifier is removed successfully\n");
+
+	acpi_remove_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
+				   edr_handle_event);
+}
diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
index 62b7fdcc661c..2d155bfb8fbf 100644
--- a/include/linux/pci-acpi.h
+++ b/include/linux/pci-acpi.h
@@ -112,6 +112,14 @@ extern const guid_t pci_acpi_dsm_guid;
 #define RESET_DELAY_DSM			0x08
 #define FUNCTION_DELAY_DSM		0x09
 
+#ifdef CONFIG_PCIE_EDR
+void pci_acpi_add_edr_notifier(struct pci_dev *pdev);
+void pci_acpi_remove_edr_notifier(struct pci_dev *pdev);
+#else
+static inline void pci_acpi_add_edr_notifier(struct pci_dev *pdev) { }
+static inline void pci_acpi_remove_edr_notifier(struct pci_dev *pdev) { }
+#endif /* CONFIG_PCIE_EDR */
+
 #else	/* CONFIG_ACPI */
 static inline void acpi_pci_add_bus(struct pci_bus *bus) { }
 static inline void acpi_pci_remove_bus(struct pci_bus *bus) { }
-- 
2.25.1



* [PATCH v17 12/12] PCI/ACPI: Enable EDR support
  2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
                   ` (10 preceding siblings ...)
  2020-03-04  2:36 ` [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
@ 2020-03-04  2:36 ` sathyanarayanan.kuppuswamy
  11 siblings, 0 replies; 68+ messages in thread
From: sathyanarayanan.kuppuswamy @ 2020-03-04  2:36 UTC (permalink / raw)
  To: bhelgaas
  Cc: linux-pci, linux-kernel, ashok.raj, sathyanarayanan.kuppuswamy,
	Rafael J. Wysocki, Len Brown, Keith Busch, Huong Nguyen,
	Austin Bolen

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

As per the PCI Firmware Specification r3.2 Downstream Port Containment
Related Enhancements ECN, sec 4.5.1, the OS must implement the following
steps to enable and use the EDR feature.

1. The OS can use bit 7 of the _OSC Control Field to negotiate control
over the Downstream Port Containment (DPC) configuration of a PCIe port.
After _OSC negotiation, firmware sets this bit to grant the OS control
over PCIe DPC configuration, and clears it if this feature was requested
and denied, or was not requested.

2. If the OS supports EDR, it should expose that support to the BIOS by
setting bit 7 of the _OSC Support Field. If the OS sets bit 7 of the _OSC
Control Field, it must also set bit 7 of the _OSC Support Field.
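
For illustration only (not part of this patch), the two rules above can be
sketched in userspace C. The macro names and values match the ones this
patch adds to include/linux/acpi.h; the helper functions
osc_request_is_consistent() and os_owns_dpc() are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

#define OSC_PCI_EDR_SUPPORT          0x00000080u  /* Support Field, bit 7 */
#define OSC_PCI_EXPRESS_DPC_CONTROL  0x00000080u  /* Control Field, bit 7 */

/* Rule 2: requesting DPC control requires advertising EDR support. */
bool osc_request_is_consistent(uint32_t support, uint32_t control)
{
	if (control & OSC_PCI_EXPRESS_DPC_CONTROL)
		return (support & OSC_PCI_EDR_SUPPORT) != 0;
	return true;
}

/* Rule 1: the OS owns DPC only if firmware left the control bit set. */
bool os_owns_dpc(uint32_t granted_control)
{
	return (granted_control & OSC_PCI_EXPRESS_DPC_CONTROL) != 0;
}
```

Adding bit 7 to the previous masks gives the updated values in this patch:
0x11f becomes 0x19f for the Support Field and 0x3f becomes 0xbf for the
Control Field.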

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Tested-by: Huong Nguyen <huong.nguyen@dell.com>
Tested-by: Austin Bolen <Austin.Bolen@dell.com>
---
 drivers/acpi/pci_root.c | 16 ++++++++++++++++
 drivers/pci/pcie/edr.c  |  4 +++-
 drivers/pci/probe.c     |  1 +
 include/linux/acpi.h    |  6 ++++--
 include/linux/pci.h     |  1 +
 5 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index d1e666ef3fcc..ad1be5941a00 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -131,6 +131,7 @@ static struct pci_osc_bit_struct pci_osc_support_bit[] = {
 	{ OSC_PCI_CLOCK_PM_SUPPORT, "ClockPM" },
 	{ OSC_PCI_SEGMENT_GROUPS_SUPPORT, "Segments" },
 	{ OSC_PCI_MSI_SUPPORT, "MSI" },
+	{ OSC_PCI_EDR_SUPPORT, "EDR" },
 	{ OSC_PCI_HPX_TYPE_3_SUPPORT, "HPX-Type3" },
 };
 
@@ -141,6 +142,7 @@ static struct pci_osc_bit_struct pci_osc_control_bit[] = {
 	{ OSC_PCI_EXPRESS_AER_CONTROL, "AER" },
 	{ OSC_PCI_EXPRESS_CAPABILITY_CONTROL, "PCIeCapability" },
 	{ OSC_PCI_EXPRESS_LTR_CONTROL, "LTR" },
+	{ OSC_PCI_EXPRESS_DPC_CONTROL, "DPC" },
 };
 
 static void decode_osc_bits(struct acpi_pci_root *root, char *msg, u32 word,
@@ -440,6 +442,8 @@ static void negotiate_os_control(struct acpi_pci_root *root, int *no_aspm,
 		support |= OSC_PCI_ASPM_SUPPORT | OSC_PCI_CLOCK_PM_SUPPORT;
 	if (pci_msi_enabled())
 		support |= OSC_PCI_MSI_SUPPORT;
+	if (IS_ENABLED(CONFIG_PCIE_EDR))
+		support |= OSC_PCI_EDR_SUPPORT;
 
 	decode_osc_support(root, "OS supports", support);
 	status = acpi_pci_osc_support(root, support);
@@ -487,6 +491,16 @@ static void negotiate_os_control(struct acpi_pci_root *root, int *no_aspm,
 			control |= OSC_PCI_EXPRESS_AER_CONTROL;
 	}
 
+	/*
+	 * Per the Downstream Port Containment Related Enhancements ECN to
+	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
+	 * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
+	 * and EDR. So use CONFIG_PCIE_EDR for requesting DPC control which
+	 * will only be turned on if both EDR and DPC are enabled.
+	 */
+	if (IS_ENABLED(CONFIG_PCIE_EDR))
+		control |= OSC_PCI_EXPRESS_DPC_CONTROL;
+
 	requested = control;
 	status = acpi_pci_osc_control_set(handle, &control,
 					  OSC_PCI_EXPRESS_CAPABILITY_CONTROL);
@@ -916,6 +930,8 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
 		host_bridge->native_pme = 0;
 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_LTR_CONTROL))
 		host_bridge->native_ltr = 0;
+	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
+		host_bridge->native_dpc = 0;
 
 	/*
 	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
diff --git a/drivers/pci/pcie/edr.c b/drivers/pci/pcie/edr.c
index 2d8680be0302..45d165e838bb 100644
--- a/drivers/pci/pcie/edr.c
+++ b/drivers/pci/pcie/edr.c
@@ -195,6 +195,7 @@ static void edr_handle_event(acpi_handle handle, u32 event, void *data)
 void pci_acpi_add_edr_notifier(struct pci_dev *pdev)
 {
 	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
+	struct pci_host_bridge *host = pci_find_host_bridge(pdev->bus);
 	acpi_status astatus;
 
 	if (!adev) {
@@ -214,7 +215,8 @@ void pci_acpi_add_edr_notifier(struct pci_dev *pdev)
 	 * OS can use bit 7 of _OSC control field to negotiate control
 	 * over DPC Capability.
 	 */
-	if (!pcie_aer_get_firmware_first(pdev) || pcie_ports_dpc_native) {
+	if (!pcie_aer_get_firmware_first(pdev) || pcie_ports_dpc_native ||
+	    (host->native_dpc)) {
 		pci_dbg(pdev, "OS handles AER/DPC, so skip EDR init\n");
 		return;
 	}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c6f91f886818..f67c007edcae 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -598,6 +598,7 @@ static void pci_init_host_bridge(struct pci_host_bridge *bridge)
 	bridge->native_shpc_hotplug = 1;
 	bridge->native_pme = 1;
 	bridge->native_ltr = 1;
+	bridge->native_dpc = 1;
 }
 
 struct pci_host_bridge *pci_alloc_host_bridge(size_t priv)
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 0f24d701fbdc..b7d3caf6f205 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -530,8 +530,9 @@ extern bool osc_pc_lpi_support_confirmed;
 #define OSC_PCI_CLOCK_PM_SUPPORT		0x00000004
 #define OSC_PCI_SEGMENT_GROUPS_SUPPORT		0x00000008
 #define OSC_PCI_MSI_SUPPORT			0x00000010
+#define OSC_PCI_EDR_SUPPORT			0x00000080
 #define OSC_PCI_HPX_TYPE_3_SUPPORT		0x00000100
-#define OSC_PCI_SUPPORT_MASKS			0x0000011f
+#define OSC_PCI_SUPPORT_MASKS			0x0000019f
 
 /* PCI Host Bridge _OSC: Capabilities DWORD 3: Control Field */
 #define OSC_PCI_EXPRESS_NATIVE_HP_CONTROL	0x00000001
@@ -540,7 +541,8 @@ extern bool osc_pc_lpi_support_confirmed;
 #define OSC_PCI_EXPRESS_AER_CONTROL		0x00000008
 #define OSC_PCI_EXPRESS_CAPABILITY_CONTROL	0x00000010
 #define OSC_PCI_EXPRESS_LTR_CONTROL		0x00000020
-#define OSC_PCI_CONTROL_MASKS			0x0000003f
+#define OSC_PCI_EXPRESS_DPC_CONTROL		0x00000080
+#define OSC_PCI_CONTROL_MASKS			0x000000bf
 
 #define ACPI_GSB_ACCESS_ATTRIB_QUICK		0x00000002
 #define ACPI_GSB_ACCESS_ATTRIB_SEND_RCV         0x00000004
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a0b7e7a53741..7ed7c088c952 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -515,6 +515,7 @@ struct pci_host_bridge {
 	unsigned int	native_shpc_hotplug:1;	/* OS may use SHPC hotplug */
 	unsigned int	native_pme:1;		/* OS may use PCIe PME */
 	unsigned int	native_ltr:1;		/* OS may use PCIe LTR */
+	unsigned int	native_dpc:1;		/* OS may use PCIe DPC */
 	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
 
 	/* Resource alignment requirements */
-- 
2.25.1



* Re: [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support
  2020-03-04  2:36 ` [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
@ 2020-03-06  3:47   ` Bjorn Helgaas
  2020-03-06  6:32     ` Kuppuswamy, Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-06  3:47 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy
  Cc: linux-pci, linux-kernel, ashok.raj, Olof Johansson

[+cc Olof for pcie_ports=dpc-native question]

On Tue, Mar 03, 2020 at 06:36:34PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

> +void pci_acpi_add_edr_notifier(struct pci_dev *pdev)
> +{
> +	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
> +	acpi_status astatus;
> +
> +	if (!adev) {
> +		pci_dbg(pdev, "No valid ACPI node, so skip EDR init\n");
> +		return;
> +	}
> +
> +	/*
> +	 * Per the Downstream Port Containment Related Enhancements ECN to
> +	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-6, EDR support
> +	 * can only be enabled if DPC is controlled by firmware.
> +	 *
> +	 * TODO: Remove dependency on ACPI FIRMWARE_FIRST bit to
> +	 * determine ownership of DPC between firmware or OS.
> +	 * Per the Downstream Port Containment Related Enhancements
> +	 * ECN to the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
> +	 * OS can use bit 7 of _OSC control field to negotiate control
> +	 * over DPC Capability.
> +	 */
> +	if (!pcie_aer_get_firmware_first(pdev) || pcie_ports_dpc_native) {
> +		pci_dbg(pdev, "OS handles AER/DPC, so skip EDR init\n");
> +		return;
> +	}
> +
> +	astatus = acpi_install_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
> +					      edr_handle_event, pdev);

I think this is still problematic.  You mentioned Alex's work [1,2].
We do need to revisit those patches, but I don't really want to defer
*this* question of the EDR notify handler.  Negotiating support of
AER/DPC/EDR is already complicated, and I don't want to complicate it
even more by merging something we already know is not quite right.

I don't understand your comment that "EDR can only be enabled if DPC
is controlled by firmware."  I don't see anything in table 4-6 to that
effect.  The only mention of EDR there is to say that the OS can
access the DPC capability in the EDR processing window, i.e., after
the OS receives the EDR notification and before it clears DPC Trigger
Status.

EDR is a general ACPI feature that is not PCI-specific.  For EDR on
PCI devices, OS support is advertised via _OSC *Support* (table 4-4),
which says:

  Error Disconnect Recover Supported

  The OS sets this bit to 1 if it supports Error Disconnect Recover
  notification on PCI Express Host Bridges, Root Ports and Switch
  Downstream Ports. Otherwise, the OS sets this bit to 0.

I think that means that if we set the "Error Disconnect Recover
Supported" _OSC bit (OSC_PCI_EDR_SUPPORT), we must install a handler
for EDR notifications.  We set OSC_PCI_EDR_SUPPORT whenever
CONFIG_PCIE_EDR=y, so I think we should install the notify handler
here unconditionally (since this file is compiled only when
CONFIG_PCIE_EDR=y).

I don't think we should even test pcie_ports_dpc_native here.  If we
told the platform we can handle EDR notifications, we should be
prepared to get them, regardless of whether the user booted with
"pcie_ports=dpc-native".

It's conceivable that pcie_ports_dpc_native should make us do
something different in the notify handler after we *get* a
notification, but I doubt we should even worry about that.

IIUC, pcie_ports_dpc_native exists because Linux DPC originally worked
even if the OS didn't have control of AER.  eed85ff4c0da7 ("PCI/DPC:
Enable DPC only if AER is available") meant that if Linux didn't have
control of AER, DPC no longer worked.  "pcie_ports=dpc-native" is
basically a way to get that previous behavior of Linux DPC regardless
of AER control.

I don't think that issue applies to EDR.  There's no concept of an OS
"enabling" or "being granted control of" EDR.  The OS merely
advertises that "yes, I'm prepared to handle EDR notifications".
AFAICT, the ECR says nothing about EDR support being conditional on OS
control of AER or DPC.  The notify *handler* might need to do
different things depending on whether we have AER or DPC control, but
the handler itself should be registered regardless.

[1] https://lore.kernel.org/linux-pci/20181115231605.24352-1-mr.nuke.me@gmail.com/
[2] https://lore.kernel.org/linux-pci/20190326172343.28946-1-mr.nuke.me@gmail.com/


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-04  2:36 ` [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode sathyanarayanan.kuppuswamy
@ 2020-03-06  5:45   ` Kuppuswamy, Sathyanarayanan
  2020-03-06 16:04     ` Bjorn Helgaas
  2020-03-10  2:40   ` Bjorn Helgaas
  1 sibling, 1 reply; 68+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-03-06  5:45 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, ashok.raj



On 3/3/2020 6:36 PM, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> As per PCI firmware specification r3.2 System Firmware Intermediary
> (SFI) _OSC and DPC Updates ECR
> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
> Event Handling Implementation Note", page 10, Error Disconnect Recover
> (EDR) support allows OS to handle error recovery and clearing Error
> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
> which allows clearing AER registers without FF mode checks.
> 
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> ---
>   drivers/pci/pci.h      |  2 ++
>   drivers/pci/pcie/aer.c | 22 ++++++++++++++++++----
>   2 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index e57e78b619f8..c239e6dd2542 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -655,6 +655,7 @@ extern const struct attribute_group aer_stats_attr_group;
>   void pci_aer_clear_fatal_status(struct pci_dev *dev);
>   void pci_aer_clear_device_status(struct pci_dev *dev);
>   int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
> +int pci_aer_raw_clear_status(struct pci_dev *dev);
>   #else
>   static inline void pci_no_aer(void) { }
>   static inline void pci_aer_init(struct pci_dev *d) { }
> @@ -665,6 +666,7 @@ static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>   {
>   	return -EINVAL;
>   }
> +int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
It's missing the static specifier and needs to be fixed. I can fix it in
the next version. Bjorn, if there is no need for another version, can you
please make this change?
>   #endif
>   
>   #ifdef CONFIG_ACPI
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index c0540c3761dc..41afefa562b7 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -420,7 +420,16 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>   		pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
>   }
>   
> -int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
> +/**
> + * pci_aer_raw_clear_status - Clear AER error registers.
> + * @dev: the PCI device
> + *
> + * NOTE: Allows clearing error registers in both FF and
> + * non FF modes.
> + *
> + * Returns 0 on success, or negative on failure.
> + */
> +int pci_aer_raw_clear_status(struct pci_dev *dev)
>   {
>   	int pos;
>   	u32 status;
> @@ -433,9 +442,6 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>   	if (!pos)
>   		return -EIO;
>   
> -	if (pcie_aer_get_firmware_first(dev))
> -		return -EIO;
> -
>   	port_type = pci_pcie_type(dev);
>   	if (port_type == PCI_EXP_TYPE_ROOT_PORT) {
>   		pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);
> @@ -451,6 +457,14 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>   	return 0;
>   }
>   
> +int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
> +{
> +	if (pcie_aer_get_firmware_first(dev))
> +		return -EIO;
> +
> +	return pci_aer_raw_clear_status(dev);
> +}
> +
>   void pci_save_aer_state(struct pci_dev *dev)
>   {
>   	struct pci_cap_saved_state *save_state;
> 
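
For readers following along, the split this patch makes can be modeled in
user space. This is only a sketch under stated assumptions: struct fake_dev
and both function names below are hypothetical stand-ins, not the kernel's
struct pci_dev or the real AER helpers, and the register is reduced to a
single write-1-to-clear word.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct fake_dev {
	bool has_aer_cap;	/* models "pos != 0" in the patch */
	bool firmware_first;	/* models pcie_aer_get_firmware_first() */
	uint32_t uncor_status;	/* models a write-1-to-clear status register */
};

/* No FF check: mirrors the role of pci_aer_raw_clear_status(). */
static int raw_clear_status(struct fake_dev *dev)
{
	if (!dev->has_aer_cap)
		return -EIO;

	/* RW1C: writing back the bits that are set clears them. */
	dev->uncor_status &= ~dev->uncor_status;
	return 0;
}

/* Keeps the FF check: mirrors pci_cleanup_aer_error_status_regs(). */
static int checked_clear_status(struct fake_dev *dev)
{
	if (dev->firmware_first)
		return -EIO;

	return raw_clear_status(dev);
}
```

Existing callers keep the firmware-first guard through the wrapper, while
the EDR path can call the raw variant directly.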

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


* Re: [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support
  2020-03-06  3:47   ` Bjorn Helgaas
@ 2020-03-06  6:32     ` Kuppuswamy, Sathyanarayanan
  2020-03-06 21:00       ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-03-06  6:32 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel, ashok.raj, Olof Johansson

Hi Bjorn,

On 3/5/2020 7:47 PM, Bjorn Helgaas wrote:
> [+cc Olof for pcie_ports=dpc-native question]
> 
> On Tue, Mar 03, 2020 at 06:36:34PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
>> +void pci_acpi_add_edr_notifier(struct pci_dev *pdev)
>> +{
>> +	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
>> +	acpi_status astatus;
>> +
>> +	if (!adev) {
>> +		pci_dbg(pdev, "No valid ACPI node, so skip EDR init\n");
>> +		return;
>> +	}
>> +
>> +	/*
>> +	 * Per the Downstream Port Containment Related Enhancements ECN to
>> +	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-6, EDR support
>> +	 * can only be enabled if DPC is controlled by firmware.
>> +	 *
>> +	 * TODO: Remove dependency on ACPI FIRMWARE_FIRST bit to
>> +	 * determine ownership of DPC between firmware or OS.
>> +	 * Per the Downstream Port Containment Related Enhancements
>> +	 * ECN to the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
>> +	 * OS can use bit 7 of _OSC control field to negotiate control
>> +	 * over DPC Capability.
>> +	 */
>> +	if (!pcie_aer_get_firmware_first(pdev) || pcie_ports_dpc_native) {
>> +		pci_dbg(pdev, "OS handles AER/DPC, so skip EDR init\n");
>> +		return;
>> +	}
>> +
>> +	astatus = acpi_install_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
>> +					      edr_handle_event, pdev);
> 
> I think this is still problematic.  You mentioned Alex's work [1,2].
> We do need to revisit those patches, but I don't really want to defer
> *this* question of the EDR notify handler.  Negotiating support of
> AER/DPC/EDR is already complicated, and I don't want to complicate it
> even more by merging something we already know is not quite right.
> 
> I don't understand your comment that "EDR can only be enabled if DPC
> is controlled by firmware."  I don't see anything in table 4-6 to that
> effect.  The only mention of EDR there is to say that the OS can
> access the DPC capability in the EDR processing window, i.e., after
> the OS receives the EDR notification and before it clears DPC Trigger
> Status.

Please check the following spec reference (from table 4-6).

     If control of this feature was requested and denied, firmware is
     responsible for initializing Downstream Port Containment Extended
     Capability Structures per firmware policy. Further, the OS is
     permitted to read or write DPC Control and Status registers of a
     port while processing an Error Disconnect Recover notification from
     firmware on that port.

It specifies that firmware is expected to use EDR notification *only* when
control of DPC is requested and denied (which means firmware owns DPC).
Although it does not explicitly state that we should install the EDR
notification handler only if firmware owns DPC, it mentions that EDR
notification is only used if firmware owns DPC. So why should we install
it if it's not going to be used when the OS owns DPC?

Also check the following reference from section 2 of the EDR ECN. It also
clarifies that the EDR feature is only used when firmware owns DPC.

     PCIe Base Specification suggests that Downstream Port Containment
     may be controlled either by the Firmware or the Operating System. It
     also suggests that the Firmware retain ownership of Downstream Port
     Containment if it also owns AER. When the Firmware owns Downstream
     Port Containment, *it is expected to use the new “Error Disconnect
     Recover” notification to alert OSPM of a Downstream Port Containment
     event*.


> 
> EDR is a general ACPI feature that is not PCI-specific.  For EDR on
> PCI devices, OS support is advertised via _OSC *Support* (table 4-4),
> which says:
> 
>    Error Disconnect Recover Supported
> 
>    The OS sets this bit to 1 if it supports Error Disconnect Recover
>    notification on PCI Express Host Bridges, Root Ports and Switch
>    Downstream Ports. Otherwise, the OS sets this bit to 0.
> 
> I think that means that if we set the "Error Disconnect Recover
> Supported" _OSC bit (OSC_PCI_EDR_SUPPORT), we must install a handler
> for EDR notifications.  We set OSC_PCI_EDR_SUPPORT whenever
> CONFIG_PCIE_EDR=y, so I think we should install the notify handler
> here unconditionally (since this file is compiled only when
> CONFIG_PCIE_EDR=y).

Although the spec does not restrict when to install the EDR notification
handler, it indicates that the notification is only used if firmware owns
DPC. So when the OS owns DPC, there is no need to install it at all.

Although installing it when the OS owns DPC should not affect anything, it
also opens up an additional way for firmware to mess things up. For
example, consider a case where firmware gives the OS control of DPC but
still sends an EDR notification to the OS. Although it's unrealistic, I am
just giving an example.

> 
> I don't think we should even test pcie_ports_dpc_native here.  If we
> told the platform we can handle EDR notifications, we should be
> prepared to get them, regardless of whether the user booted with
> "pcie_ports=dpc-native".

As per the command line parameter documentation, setting
pcie_ports=dpc-native means we will be using the native PCIe service for
DPC. So if DPC is handled by the OS, then per my argument above (EDR is
only useful if DPC is handled by firmware), there is no use in installing
the EDR notification handler.

https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt#L3642

dpc-native - Use native PCIe service for DPC only.

> 
> It's conceivable that pcie_ports_dpc_native should make us do
> something different in the notify handler after we *get* a
> notification, but I doubt we should even worry about that.
> 
> IIUC, pcie_ports_dpc_native exists because Linux DPC originally worked
> even if the OS didn't have control of AER.  eed85ff4c0da7 ("PCI/DPC:
> Enable DPC only if AER is available") meant that if Linux didn't have
> control of AER, DPC no longer worked.  "pcie_ports=dpc-native" is
> basically a way to get that previous behavior of Linux DPC regardless
> of AER control.
> 
> I don't think that issue applies to EDR.  There's no concept of an OS
> "enabling" or "being granted control of" EDR.  The OS merely
> advertises that "yes, I'm prepared to handle EDR notifications".
> AFAICT, the ECR says nothing about EDR support being conditional on OS
> control of AER or DPC.  The notify *handler* might need to do
> different things depending on whether we have AER or DPC control, but
> the handler itself should be registered regardless.
> 
> [1] https://lore.kernel.org/linux-pci/20181115231605.24352-1-mr.nuke.me@gmail.com/
> [2] https://lore.kernel.org/linux-pci/20190326172343.28946-1-mr.nuke.me@gmail.com/
> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-06  5:45   ` Kuppuswamy, Sathyanarayanan
@ 2020-03-06 16:04     ` Bjorn Helgaas
  2020-03-06 16:11       ` Kuppuswamy, Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-06 16:04 UTC (permalink / raw)
  To: Kuppuswamy, Sathyanarayanan; +Cc: linux-pci, linux-kernel, ashok.raj

On Thu, Mar 05, 2020 at 09:45:46PM -0800, Kuppuswamy, Sathyanarayanan wrote:
> On 3/3/2020 6:36 PM, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > 
> > As per PCI firmware specification r3.2 System Firmware Intermediary
> > (SFI) _OSC and DPC Updates ECR
> > (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
> > Event Handling Implementation Note", page 10, Error Disconnect Recover
> > (EDR) support allows OS to handle error recovery and clearing Error
> > Registers even in FF mode. So create new API pci_aer_raw_clear_status()
> > which allows clearing AER registers without FF mode checks.
> > 
> > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > ---
> >   drivers/pci/pci.h      |  2 ++
> >   drivers/pci/pcie/aer.c | 22 ++++++++++++++++++----
> >   2 files changed, 20 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index e57e78b619f8..c239e6dd2542 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -655,6 +655,7 @@ extern const struct attribute_group aer_stats_attr_group;
> >   void pci_aer_clear_fatal_status(struct pci_dev *dev);
> >   void pci_aer_clear_device_status(struct pci_dev *dev);
> >   int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
> > +int pci_aer_raw_clear_status(struct pci_dev *dev);
> >   #else
> >   static inline void pci_no_aer(void) { }
> >   static inline void pci_aer_init(struct pci_dev *d) { }
> > @@ -665,6 +666,7 @@ static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
> >   {
> >   	return -EINVAL;
> >   }
> > +int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }

> It's missing static specifier. It needs to be fixed. I can fix it in
> next version.  Bjorn, if there is no need for next version, can you
> please make this change?

pci_aer_raw_clear_status() is defined in aer.c and called from aer.c
and edr.c, so I do not think it can be static.  Am I missing
something?

I have a review/edr branch that I hope becomes what will be applied.

Bjorn


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-06 16:04     ` Bjorn Helgaas
@ 2020-03-06 16:11       ` Kuppuswamy, Sathyanarayanan
  2020-03-06 16:41         ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-03-06 16:11 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel, ashok.raj



On 3/6/2020 8:04 AM, Bjorn Helgaas wrote:
> On Thu, Mar 05, 2020 at 09:45:46PM -0800, Kuppuswamy, Sathyanarayanan wrote:
>> On 3/3/2020 6:36 PM, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>
>>> As per PCI firmware specification r3.2 System Firmware Intermediary
>>> (SFI) _OSC and DPC Updates ECR
>>> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
>>> Event Handling Implementation Note", page 10, Error Disconnect Recover
>>> (EDR) support allows OS to handle error recovery and clearing Error
>>> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
>>> which allows clearing AER registers without FF mode checks.
>>>
>>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>> ---
>>>    drivers/pci/pci.h      |  2 ++
>>>    drivers/pci/pcie/aer.c | 22 ++++++++++++++++++----
>>>    2 files changed, 20 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>> index e57e78b619f8..c239e6dd2542 100644
>>> --- a/drivers/pci/pci.h
>>> +++ b/drivers/pci/pci.h
>>> @@ -655,6 +655,7 @@ extern const struct attribute_group aer_stats_attr_group;
>>>    void pci_aer_clear_fatal_status(struct pci_dev *dev);
>>>    void pci_aer_clear_device_status(struct pci_dev *dev);
>>>    int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
>>> +int pci_aer_raw_clear_status(struct pci_dev *dev);
>>>    #else
>>>    static inline void pci_no_aer(void) { }
>>>    static inline void pci_aer_init(struct pci_dev *d) { }
>>> @@ -665,6 +666,7 @@ static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>>>    {
>>>    	return -EINVAL;
>>>    }
>>> +int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
> 
>> It's missing static specifier. It needs to be fixed. I can fix it in
>> next version.  Bjorn, if there is no need for next version, can you
>> please make this change?
> 
> pci_aer_raw_clear_status() is defined in aer.c and called from aer.c
> and edr.c, so I do not think it can be static.  Am I missing
> something?
> 
> I have a review/edr branch that I hope becomes what will be applied.
For kernel configs that do not define CONFIG_PCIEAER, the non-static stub
will cause a multiple-definition error, since pci.h is included in many
files.
> 
> Bjorn
> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-06 16:11       ` Kuppuswamy, Sathyanarayanan
@ 2020-03-06 16:41         ` Bjorn Helgaas
  0 siblings, 0 replies; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-06 16:41 UTC (permalink / raw)
  To: Kuppuswamy, Sathyanarayanan; +Cc: linux-pci, linux-kernel, ashok.raj

On Fri, Mar 06, 2020 at 08:11:41AM -0800, Kuppuswamy, Sathyanarayanan wrote:
> On 3/6/2020 8:04 AM, Bjorn Helgaas wrote:
> > On Thu, Mar 05, 2020 at 09:45:46PM -0800, Kuppuswamy, Sathyanarayanan wrote:
> > > On 3/3/2020 6:36 PM, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> > > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > > 
> > > > As per PCI firmware specification r3.2 System Firmware Intermediary
> > > > (SFI) _OSC and DPC Updates ECR
> > > > (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
> > > > Event Handling Implementation Note", page 10, Error Disconnect Recover
> > > > (EDR) support allows OS to handle error recovery and clearing Error
> > > > Registers even in FF mode. So create new API pci_aer_raw_clear_status()
> > > > which allows clearing AER registers without FF mode checks.
> > > > 
> > > > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > > ---
> > > >    drivers/pci/pci.h      |  2 ++
> > > >    drivers/pci/pcie/aer.c | 22 ++++++++++++++++++----
> > > >    2 files changed, 20 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > > > index e57e78b619f8..c239e6dd2542 100644
> > > > --- a/drivers/pci/pci.h
> > > > +++ b/drivers/pci/pci.h
> > > > @@ -655,6 +655,7 @@ extern const struct attribute_group aer_stats_attr_group;
> > > >    void pci_aer_clear_fatal_status(struct pci_dev *dev);
> > > >    void pci_aer_clear_device_status(struct pci_dev *dev);
> > > >    int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
> > > > +int pci_aer_raw_clear_status(struct pci_dev *dev);
> > > >    #else
> > > >    static inline void pci_no_aer(void) { }
> > > >    static inline void pci_aer_init(struct pci_dev *d) { }
> > > > @@ -665,6 +666,7 @@ static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
> > > >    {
> > > >    	return -EINVAL;
> > > >    }
> > > > +int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
> > 
> > > It's missing static specifier. It needs to be fixed. I can fix it in
> > > next version.  Bjorn, if there is no need for next version, can you
> > > please make this change?
> > 
> > pci_aer_raw_clear_status() is defined in aer.c and called from aer.c
> > and edr.c, so I do not think it can be static.  Am I missing
> > something?

> For kernel configs that does not define CONFIG_PCIEAER, it will create
> redefinition error since pci.h can be included in many files.

Oh, right, I thought you were talking about the definition in aer.c.
The stub in pci.h is missing "inline" as well as "static".

Fixed.
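
For reference, the stub pattern being discussed looks roughly like the
sketch below. This is a standalone illustration, not the exact hunk on the
review branch; struct pci_dev is left opaque here so the fragment compiles
outside the kernel tree.

```c
#include <errno.h>

struct pci_dev;	/* opaque here; the stub never dereferences it */

#ifdef CONFIG_PCIEAER
/* Real implementation lives in aer.c; only the declaration is in the header. */
int pci_aer_raw_clear_status(struct pci_dev *dev);
#else
/*
 * "static inline" gives every translation unit that includes the header
 * its own private copy, so many .c files can include it without
 * producing duplicate external definitions at link time.
 */
static inline int pci_aer_raw_clear_status(struct pci_dev *dev)
{
	return -EINVAL;
}
#endif
```

With CONFIG_PCIEAER unset, callers compile against the stub and simply get
-EINVAL back.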


* Re: [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support
  2020-03-06  6:32     ` Kuppuswamy, Sathyanarayanan
@ 2020-03-06 21:00       ` Bjorn Helgaas
  2020-03-06 22:42         ` Kuppuswamy Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-06 21:00 UTC (permalink / raw)
  To: Kuppuswamy, Sathyanarayanan
  Cc: linux-pci, linux-kernel, ashok.raj, Olof Johansson

On Thu, Mar 05, 2020 at 10:32:33PM -0800, Kuppuswamy, Sathyanarayanan wrote:
> On 3/5/2020 7:47 PM, Bjorn Helgaas wrote:
> > On Tue, Mar 03, 2020 at 06:36:34PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > 
> > > +void pci_acpi_add_edr_notifier(struct pci_dev *pdev)
> > > +{
> > > +	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
> > > +	acpi_status astatus;
> > > +
> > > +	if (!adev) {
> > > +		pci_dbg(pdev, "No valid ACPI node, so skip EDR init\n");
> > > +		return;
> > > +	}
> > > +
> > > +	/*
> > > +	 * Per the Downstream Port Containment Related Enhancements ECN to
> > > +	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-6, EDR support
> > > +	 * can only be enabled if DPC is controlled by firmware.
> > > +	 *
> > > +	 * TODO: Remove dependency on ACPI FIRMWARE_FIRST bit to
> > > +	 * determine ownership of DPC between firmware or OS.
> > > +	 * Per the Downstream Port Containment Related Enhancements
> > > +	 * ECN to the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
> > > +	 * OS can use bit 7 of _OSC control field to negotiate control
> > > +	 * over DPC Capability.
> > > +	 */
> > > +	if (!pcie_aer_get_firmware_first(pdev) || pcie_ports_dpc_native) {
> > > +		pci_dbg(pdev, "OS handles AER/DPC, so skip EDR init\n");
> > > +		return;
> > > +	}
> > > +
> > > +	astatus = acpi_install_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
> > > +					      edr_handle_event, pdev);
> > 
> > I think this is still problematic.  You mentioned Alex's work
> > [1,2].  We do need to revisit those patches, but I don't really
> > want to defer *this* question of the EDR notify handler.
> > Negotiating support of AER/DPC/EDR is already complicated, and I
> > don't want to complicate it even more by merging something we
> > already know is not quite right.
> > 
> > I don't understand your comment that "EDR can only be enabled if
> > DPC is controlled by firmware."  I don't see anything in table 4-6
> > to that effect.  The only mention of EDR there is to say that the
> > OS can access the DPC capability in the EDR processing window,
> > i.e., after the OS receives the EDR notification and before it
> > clears DPC Trigger Status.
> 
> Please check the following spec reference (from table 4-6).
> 
>     If control of this feature was requested and denied, firmware is
>     responsible for initializing Downstream Port Containment
>     Extended Capability Structures per firmware policy. Further, the
>     OS is permitted to read or write DPC Control and Status
>     registers of a port while processing an Error Disconnect Recover
>     notification from firmware on that port.
> 
> It specifies firmware is expected to use EDR notification *only*
> when the control of DPC is requested and denied (which means
> firmware owns the DPC).

No, I don't think it says that.  This section tells us how to use _OSC
to negotiate ownership of DPC.  The first sentence you quoted
basically says that if firmware retains control of DPC, firmware is
responsible for initializing the DPC capability.  That part is pretty
obvious: "firmware owns it, so firmware is responsible for configuring
it."

The second sentence is important because it's an exception to the
general rule of "the OS can't touch things owned by the firmware." The
exception is that even if firmware retains control of DPC, the OS is
allowed to access DPC registers during the EDR notification window.

There is nothing here about when firmware is allowed to use EDR.

> Although it does not explicitly state that we should install EDR
> notification handler only if firmware owns DPC, it mentions that EDR
> notification is only used if firmware owns DPC. So why should we
> install it if it's not going to be used when OS owns DPC.

It does not say anything about "EDR notification only being used if
firmware owns DPC."

We should install an EDR notify handler because we told the firmware
that we support EDR notifications.  I don't think we should make it
any more complicated than that.

> Also check the following reference from section 2 of EDR ECN. It also
> clarifies EDR feature is only used when firmware owns DPC.
> 
>     PCIe Base Specification suggests that Downstream Port Containment
>     may be controlled either by the Firmware or the Operating System. It
>     also suggests that the Firmware retain ownership of Downstream Port
>     Containment if it also owns AER. When the Firmware owns Downstream
>     Port Containment, *it is expected to use the new “Error Disconnect
>     Recover” notification to alert OSPM of a Downstream Port Containment
>     event*.

The text in section 2 will not become part of the spec, so we can't
rely on it to tell us how to implement things.  Even if it did, this
section does not say "OS should only install an EDR notify handler if
firmware owns DPC."  It just means that if firmware owns DPC, the OS
will not learn about DPC events directly via DPC interrupts, so
firmware has to use another mechanism, e.g., EDR, to tell the OS about
them.

If an OS requests DPC control, it must support both DPC and EDR (sec
4.5.2.4).  However, I think an OS may support EDR but not DPC
(although your patches don't support this configuration).  In that
case the OS would set the _OSC "EDR Supported" bit, but it would not
request DPC control.  Then the EDR notify handler would "invalidate
the software state associated with child devices of the port" (table
4-4), but it would not "attempt to recover the child devices of ports
implementing DPC."
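
As a rough user-space sketch of that distinction (all names below are
hypothetical and nothing here is kernel code): whether a handler is
installed would depend only on what the OS advertised, and what the
handler does on a notification would depend on whether the OS also
supports DPC.

```c
#include <stdbool.h>

/* Hypothetical summary of what the OS told firmware via _OSC. */
struct os_caps {
	bool edr_supported;	/* OS set the "EDR Supported" bit */
	bool dpc_supported;	/* OS is able to drive DPC recovery */
};

enum edr_action {
	EDR_INVALIDATE_ONLY,		/* drop software state of child devices */
	EDR_INVALIDATE_AND_RECOVER,	/* also attempt recovery of the children */
};

/* Install the notify handler whenever EDR support was advertised. */
static bool want_edr_handler(const struct os_caps *caps)
{
	return caps->edr_supported;
}

/* Choose what the handler does once a notification actually arrives. */
static enum edr_action edr_action_for(const struct os_caps *caps)
{
	return caps->dpc_supported ? EDR_INVALIDATE_AND_RECOVER
				   : EDR_INVALIDATE_ONLY;
}
```

Note that neither decision consults firmware-first state or the
pcie_ports=dpc-native setting in this sketch.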

> > EDR is a general ACPI feature that is not PCI-specific.  For EDR
> > on PCI devices, OS support is advertised via _OSC *Support* (table
> > 4-4), which says:
> > 
> >    Error Disconnect Recover Supported
> > 
> >    The OS sets this bit to 1 if it supports Error Disconnect
> >    Recover notification on PCI Express Host Bridges, Root Ports
> >    and Switch Downstream Ports. Otherwise, the OS sets this bit to
> >    0.
> > 
> > I think that means that if we set the "Error Disconnect Recover
> > Supported" _OSC bit (OSC_PCI_EDR_SUPPORT), we must install a
> > handler for EDR notifications.  We set OSC_PCI_EDR_SUPPORT
> > whenever CONFIG_PCIE_EDR=y, so I think we should install the
> > notify handler here unconditionally (since this file is compiled
> > only when CONFIG_PCIE_EDR=y).
> 
> Although spec does not provide any restrictions on when to install
> EDR notification, it provides reference that notification is only
> used if firmware owns DPC. So when OS owns DPC, there is no need to
> install them at all.

I disagree that the spec tells us that EDR is only used when firmware
owns DPC.

Even if it did, pcie_aer_get_firmware_first() only looks at HEST
tables.  There *might* be some connection between those and DPC
ownership, but that's internal to firmware and I think it's just
asking for trouble if we rely on that connection.

> Although installing them when OS owns DPC should not affect
> anything, it also opens up a additional way for firmware to mess up
> things. For example, consider a case when firmware gives OS control
> of DPC, but still sends EDR notification to OS. Although it's
> unrealistic, I am just giving an example.

Can you outline the problem that occurs in this scenario?  It seems
like the EDR notify handler could still work.  The OS can access DPC
at any time (not just during the EDR window).

> > I don't think we should even test pcie_ports_dpc_native here.  If we
> > told the platform we can handle EDR notifications, we should be
> > prepared to get them, regardless of whether the user booted with
> > "pcie_ports=dpc-native".
> 
> As per the command line parameter documentation, setting
> pcie_ports=dpc-native means, we will be using native PCIe service
> for DPC.  So if DPC is handled by OS, as per my argument mentioned
> above (EDR is only useful if DPC handled by firmware), there is no
> use in installing EDR notification.
> 
> https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt#L3642
> 
> dpc-native - Use native PCIe service for DPC only.

It doesn't hurt anything to install a notify handler that never
receives a notification.  It might be an issue if we tell firmware
we're prepared for notifications but we don't install a handler.

> > It's conceivable that pcie_ports_dpc_native should make us do
> > something different in the notify handler after we *get* a
> > notification, but I doubt we should even worry about that.
> > 
> > IIUC, pcie_ports_dpc_native exists because Linux DPC originally worked
> > even if the OS didn't have control of AER.  eed85ff4c0da7 ("PCI/DPC:
> > Enable DPC only if AER is available") meant that if Linux didn't have
> > control of AER, DPC no longer worked.  "pcie_ports=dpc-native" is
> > basically a way to get that previous behavior of Linux DPC regardless
> > of AER control.
> > 
> > I don't think that issue applies to EDR.  There's no concept of an OS
> > "enabling" or "being granted control of" EDR.  The OS merely
> > advertises that "yes, I'm prepared to handle EDR notifications".
> > AFAICT, the ECR says nothing about EDR support being conditional on OS
> > control of AER or DPC.  The notify *handler* might need to do
> > different things depending on whether we have AER or DPC control, but
> > the handler itself should be registered regardless.
> > 
> > [1] https://lore.kernel.org/linux-pci/20181115231605.24352-1-mr.nuke.me@gmail.com/
> > [2] https://lore.kernel.org/linux-pci/20190326172343.28946-1-mr.nuke.me@gmail.com/
> > 
> 
> -- 
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer


* Re: [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support
  2020-03-06 21:00       ` Bjorn Helgaas
@ 2020-03-06 22:42         ` Kuppuswamy Sathyanarayanan
  2020-03-06 23:23           ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-06 22:42 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel, ashok.raj, Olof Johansson

Hi Bjorn,

On 3/6/20 1:00 PM, Bjorn Helgaas wrote:
> On Thu, Mar 05, 2020 at 10:32:33PM -0800, Kuppuswamy, Sathyanarayanan wrote:
>> On 3/5/2020 7:47 PM, Bjorn Helgaas wrote:
>>> On Tue, Mar 03, 2020 at 06:36:34PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>> +void pci_acpi_add_edr_notifier(struct pci_dev *pdev)
>>>> +{
>>>> +	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
>>>> +	acpi_status astatus;
>>>> +
>>>> +	if (!adev) {
>>>> +		pci_dbg(pdev, "No valid ACPI node, so skip EDR init\n");
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * Per the Downstream Port Containment Related Enhancements ECN to
>>>> +	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-6, EDR support
>>>> +	 * can only be enabled if DPC is controlled by firmware.
>>>> +	 *
>>>> +	 * TODO: Remove dependency on ACPI FIRMWARE_FIRST bit to
>>>> +	 * determine ownership of DPC between firmware or OS.
>>>> +	 * Per the Downstream Port Containment Related Enhancements
>>>> +	 * ECN to the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
>>>> +	 * OS can use bit 7 of _OSC control field to negotiate control
>>>> +	 * over DPC Capability.
>>>> +	 */
>>>> +	if (!pcie_aer_get_firmware_first(pdev) || pcie_ports_dpc_native) {
>>>> +		pci_dbg(pdev, "OS handles AER/DPC, so skip EDR init\n");
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	astatus = acpi_install_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
>>>> +					      edr_handle_event, pdev);
>>>
>>> It does not say anything about "EDR notification only being used if
>>> firmware owns DPC."
>>>
>>> We should install an EDR notify handler because we told the firmware
>>> that we support EDR notifications.  I don't think we should make it
>>> any more complicated than that.
I agree with your above statement. Since we told firmware *we support*
EDR notification, we should make that true by installing the notification
handler unconditionally.

But, based on inferences from the PCI FW 3.2 ECN-DPC spec, the current
use case of EDR notification is only to handle error recovery for the
case where DPC is owned by firmware and firmware sends an EDR event. If
you agree with the above comment, is it alright if we add the following
check in the EDR notification handler?

Although the spec does not restrict it, the currently tested use case of
EDR is handling notifications for the firmware DPC case.

	if (!pcie_aer_get_firmware_first(pdev) ||
	    pcie_ports_dpc_native || host->native_dpc)
		return;



>
>> Also check the following reference from section 2 of EDR ECN. It also
>> clarifies EDR feature is only used when firmware owns DPC.
>>
>>      PCIe Base Specification suggests that Downstream Port Containment
>>      may be controlled either by the Firmware or the Operating System. It
>>      also suggests that the Firmware retain ownership of Downstream Port
>>      Containment if it also owns AER. When the Firmware owns Downstream
>>      Port Containment, *it is expected to use the new “Error Disconnect
>>      Recover” notification to alert OSPM of a Downstream Port Containment
>>      event*.
> The text in section 2 will not become part of the spec, so we can't
> rely on it to tell us how to implement things.  Even if it did, this
> section does not say "OS should only install an EDR notify handler if
> firmware owns DPC."  It just means that if firmware owns DPC, the OS
> will not learn about DPC events directly via DPC interrupts, so
> firmware has to use another mechanism, e.g., EDR, to tell the OS about
> them.
>
> If an OS requests DPC control, it must support both DPC and EDR (sec
> 4.5.2.4).  However, I think an OS may support EDR but not DPC
> (although your patches don't support this configuration).
Are there any use cases for the above configuration? The current PCI FW
3.2 ECN-DPC spec does not mention any use cases where EDR can be used
outside the scope of DPC.

If required, I can add this support; it should be easy. In the non-DPC
case, the EDR notification handler would mostly be empty. Please let me
know if you want me to add this as part of the next patch set.

> In that
> case the OS would set the _OSC "EDR Supported" bit, but it would not
> request DPC control.  Then the EDR notify handler would "invalidate
> the software state associated with child devices of the port" (table
> 4-4), but it would not "attempt to recover the child devices of ports
> implementing DPC."
>
>>> EDR is a general ACPI feature that is not PCI-specific.  For EDR
>>> on PCI devices, OS support is advertised via _OSC *Support* (table
>>> 4-4), which says:
>>>
>>>     Error Disconnect Recover Supported
>>>
>>>     The OS sets this bit to 1 if it supports Error Disconnect
>>>     Recover notification on PCI Express Host Bridges, Root Ports
>>>     and Switch Downstream Ports. Otherwise, the OS sets this bit to
>>>     0.
>>>
>>> I think that means that if we set the "Error Disconnect Recover
>>> Supported" _OSC bit (OSC_PCI_EDR_SUPPORT), we must install a
>>> handler for EDR notifications.  We set OSC_PCI_EDR_SUPPORT
>>> whenever CONFIG_PCIE_EDR=y, so I think we should install the
>>> notify handler here unconditionally (since this file is compiled
>>> only when CONFIG_PCIE_EDR=y).
>> Although spec does not provide any restrictions on when to install
>> EDR notification, it provides reference that notification is only
>> used if firmware owns DPC. So when OS owns DPC, there is no need to
>> install them at all.
> I disagree that the spec tells us that EDR is only used when firmware
> owns DPC.
Agreed.
>
> Even if it did, pcie_aer_get_firmware_first() only looks at HEST
> tables.  There *might* be some connection between those and DPC
> ownership, but that's internal to firmware and I think it's just
> asking for trouble if we rely on that connection.
>
>> Although installing them when OS owns DPC should not affect
>> anything, it also opens up an additional way for firmware to mess up
>> things. For example, consider a case when firmware gives OS control
>> of DPC, but still sends EDR notification to OS. Although it's
>> unrealistic, I am just giving an example.
> Can you outline the problem that occurs in this scenario?  It seems
> like the EDR notify handler could still work.  The OS can access DPC
> at any time (not just during the EDR window).
When the OS owns DPC and firmware sends an EDR event, it could create a
race between the DPC interrupt handler and the EDR event handler,
although from a hardware perspective it should not make a difference,
since both code paths do the same thing.
>
>>> I don't think we should even test pcie_ports_dpc_native here.  If we
>>> told the platform we can handle EDR notifications, we should be
>>> prepared to get them, regardless of whether the user booted with
>>> "pcie_ports=dpc-native".
>> As per the command line parameter documentation, setting
>> pcie_ports=dpc-native means, we will be using native PCIe service
>> for DPC.  So if DPC is handled by OS, as per my argument mentioned
>> above (EDR is only useful if DPC handled by firmware), there is no
>> use in installing EDR notification.
>>
>> https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt#L3642
>>
>> dpc-native - Use native PCIe service for DPC only.
> It doesn't hurt anything to install a notify handler that never
> receives a notification.  It might be an issue if we tell firmware
> we're prepared for notifications but we don't install a handler.
Agreed. Shall I send another version with this and "static inline" fix ?
>
>>> It's conceivable that pcie_ports_dpc_native should make us do
>>> something different in the notify handler after we *get* a
>>> notification, but I doubt we should even worry about that.
>>>
>>> IIUC, pcie_ports_dpc_native exists because Linux DPC originally worked
>>> even if the OS didn't have control of AER.  eed85ff4c0da7 ("PCI/DPC:
>>> Enable DPC only if AER is available") meant that if Linux didn't have
>>> control of AER, DPC no longer worked.  "pcie_ports=dpc-native" is
>>> basically a way to get that previous behavior of Linux DPC regardless
>>> of AER control.
>>>
>>> I don't think that issue applies to EDR.  There's no concept of an OS
>>> "enabling" or "being granted control of" EDR.  The OS merely
>>> advertises that "yes, I'm prepared to handle EDR notifications".
>>> AFAICT, the ECR says nothing about EDR support being conditional on OS
>>> control of AER or DPC.  The notify *handler* might need to do
>>> different things depending on whether we have AER or DPC control, but
>>> the handler itself should be registered regardless.
>>>
>>> [1] https://lore.kernel.org/linux-pci/20181115231605.24352-1-mr.nuke.me@gmail.com/
>>> [2] https://lore.kernel.org/linux-pci/20190326172343.28946-1-mr.nuke.me@gmail.com/
>>>
>> -- 
>> Sathyanarayanan Kuppuswamy
>> Linux Kernel Developer

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support
  2020-03-06 22:42         ` Kuppuswamy Sathyanarayanan
@ 2020-03-06 23:23           ` Bjorn Helgaas
  2020-03-07  0:19             ` Kuppuswamy Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-06 23:23 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: linux-pci, linux-kernel, ashok.raj, Olof Johansson

On Fri, Mar 06, 2020 at 02:42:14PM -0800, Kuppuswamy Sathyanarayanan wrote:
> On 3/6/20 1:00 PM, Bjorn Helgaas wrote:
> > On Thu, Mar 05, 2020 at 10:32:33PM -0800, Kuppuswamy, Sathyanarayanan wrote:
> > > On 3/5/2020 7:47 PM, Bjorn Helgaas wrote:
> > > > On Tue, Mar 03, 2020 at 06:36:34PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> > > > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > > > +void pci_acpi_add_edr_notifier(struct pci_dev *pdev)
> > > > > +{
> > > > > +	struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
> > > > > +	acpi_status astatus;
> > > > > +
> > > > > +	if (!adev) {
> > > > > +		pci_dbg(pdev, "No valid ACPI node, so skip EDR init\n");
> > > > > +		return;
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > > +	 * Per the Downstream Port Containment Related Enhancements ECN to
> > > > > +	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-6, EDR support
> > > > > +	 * can only be enabled if DPC is controlled by firmware.
> > > > > +	 *
> > > > > +	 * TODO: Remove dependency on ACPI FIRMWARE_FIRST bit to
> > > > > +	 * determine ownership of DPC between firmware or OS.
> > > > > +	 * Per the Downstream Port Containment Related Enhancements
> > > > > +	 * ECN to the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
> > > > > +	 * OS can use bit 7 of _OSC control field to negotiate control
> > > > > +	 * over DPC Capability.
> > > > > +	 */
> > > > > +	if (!pcie_aer_get_firmware_first(pdev) || pcie_ports_dpc_native) {
> > > > > +		pci_dbg(pdev, "OS handles AER/DPC, so skip EDR init\n");
> > > > > +		return;
> > > > > +	}
> > > > > +
> > > > > +	astatus = acpi_install_notify_handler(adev->handle, ACPI_SYSTEM_NOTIFY,
> > > > > +					      edr_handle_event, pdev);
> > > > 
> > > > It does not say anything about "EDR notification only being
> > > > used if firmware owns DPC."
> > > > 
> > > > We should install an EDR notify handler because we told the
> > > > firmware that we support EDR notifications.  I don't think we
> > > > should make it any more complicated than that.
>
> I agree with your above statement. Since we told firmware *we
> support* EDR notification, we should make that true by installing
> the notification handler unconditionally.
> 
> But, based on inferences from the PCI FW 3.2 ECN-DPC spec, the current
> use case of EDR notification is only to handle error recovery for the
> case where DPC is owned by firmware and firmware sends an EDR event. If
> you agree with the above comment, is it alright if we add the following
> check in the EDR notification handler?
> 
> Although the spec does not restrict it, the currently tested use case
> of EDR is handling notifications for the firmware DPC case.
> 
> 	if (!pcie_aer_get_firmware_first(pdev) ||
> 	    pcie_ports_dpc_native || host->native_dpc)
> 		return;

No, I do not think we should add a check like this.  There's no basis
in the spec for doing this.  pcie_aer_get_firmware_first() looks at
HEST, which isn't mentioned at all in relation to EDR.  Checks like
this make it really hard to understand the code, and I don't believe
in making things fail simply because we haven't tested the scenario.
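
For illustration only, the rule argued for above (install the notify
handler whenever EDR support was advertised via _OSC, independent of
HEST FIRMWARE_FIRST or "pcie_ports=dpc-native") can be modeled in plain
userspace C.  This is a sketch, not kernel code: struct model_platform,
MODEL_OSC_EDR_SUPPORT and both helpers are hypothetical names invented
for the example, and the bit position is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only (not the kernel's value). */
#define MODEL_OSC_EDR_SUPPORT (1u << 0)

/* Hypothetical model of the inputs discussed in this thread. */
struct model_platform {
	bool config_pcie_edr;    /* stands in for CONFIG_PCIE_EDR=y */
	bool aer_firmware_first; /* stands in for the HEST FIRMWARE_FIRST bit */
	bool dpc_native;         /* stands in for pcie_ports=dpc-native */
};

/* _OSC Support bits the OS advertises: EDR is set whenever it is built in. */
static uint32_t model_osc_support(const struct model_platform *p)
{
	return p->config_pcie_edr ? MODEL_OSC_EDR_SUPPORT : 0;
}

/*
 * Install the handler exactly when EDR support was advertised.
 * aer_firmware_first and dpc_native are deliberately not consulted.
 */
static bool model_install_edr_handler(const struct model_platform *p)
{
	return (model_osc_support(p) & MODEL_OSC_EDR_SUPPORT) != 0;
}
```

In this model the decision depends on one input only, so a platform
with firmware-first AER and one booted with dpc-native both end up with
a handler installed.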

> > > Also check the following reference from section 2 of EDR ECN. It also
> > > clarifies EDR feature is only used when firmware owns DPC.
> > > 
> > >      PCIe Base Specification suggests that Downstream Port Containment
> > >      may be controlled either by the Firmware or the Operating System. It
> > >      also suggests that the Firmware retain ownership of Downstream Port
> > >      Containment if it also owns AER. When the Firmware owns Downstream
> > >      Port Containment, *it is expected to use the new “Error Disconnect
> > >      Recover” notification to alert OSPM of a Downstream Port Containment
> > >      event*.
> > The text in section 2 will not become part of the spec, so we can't
> > rely on it to tell us how to implement things.  Even if it did, this
> > section does not say "OS should only install an EDR notify handler if
> > firmware owns DPC."  It just means that if firmware owns DPC, the OS
> > will not learn about DPC events directly via DPC interrupts, so
> > firmware has to use another mechanism, e.g., EDR, to tell the OS about
> > them.
> > 
> > If an OS requests DPC control, it must support both DPC and EDR
> > (sec 4.5.2.4).  However, I think an OS may support EDR but not DPC
> > (although your patches don't support this configuration).
>
> Are there any use cases for the above configuration? The current PCI
> FW 3.2 ECN-DPC spec does not mention any use cases where EDR can be
> used outside the scope of DPC.
> 
> If required, I can add this support; it should be easy. In the non-DPC
> case, the EDR notification handler would mostly be empty. Please let
> me know if you want me to add this as part of the next patch set.

I don't think there's a need to add support for this.  I just
mentioned it as part of the point that we shouldn't tie EDR to DPC
unnecessarily.

> > > Although installing them when OS owns DPC should not affect
> > > anything, it also opens up an additional way for firmware to mess
> > > up things. For example, consider a case when firmware gives OS
> > > control of DPC, but still sends EDR notification to OS. Although
> > > it's unrealistic, I am just giving an example.
>
> > Can you outline the problem that occurs in this scenario?  It
> > seems like the EDR notify handler could still work.  The OS can
> > access DPC at any time (not just during the EDR window).
>
> When the OS owns DPC and firmware sends an EDR event, it could create
> a race between the DPC interrupt handler and the EDR event handler,
> although from a hardware perspective it should not make a difference,
> since both code paths do the same thing.

Yes, that's true.  I think we should wait until there is a problem
here before doing anything.

> > > > I don't think we should even test pcie_ports_dpc_native here.  If we
> > > > told the platform we can handle EDR notifications, we should be
> > > > prepared to get them, regardless of whether the user booted with
> > > > "pcie_ports=dpc-native".
> > > As per the command line parameter documentation, setting
> > > pcie_ports=dpc-native means, we will be using native PCIe service
> > > for DPC.  So if DPC is handled by OS, as per my argument mentioned
> > > above (EDR is only useful if DPC handled by firmware), there is no
> > > use in installing EDR notification.
> > > 
> > > https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt#L3642
> > > 
> > > dpc-native - Use native PCIe service for DPC only.
> > It doesn't hurt anything to install a notify handler that never
> > receives a notification.  It might be an issue if we tell firmware
> > we're prepared for notifications but we don't install a handler.
> Agreed. Shall I send another version with this and "static inline" fix ?

No need.  Just take a look at my review/edr branch.  I intend to tweak
some commit logs and (maybe) make the "clear status" functions void
since there are only one or two minor uses of the return values.  But
it's pretty much what I hope to merge.

Bjorn

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support
  2020-03-06 23:23           ` Bjorn Helgaas
@ 2020-03-07  0:19             ` Kuppuswamy Sathyanarayanan
  0 siblings, 0 replies; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-07  0:19 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, linux-kernel, ashok.raj, Olof Johansson

Hi Bjorn,

On 3/6/20 3:23 PM, Bjorn Helgaas wrote:
> No need.  Just take a look at my review/edr branch.  I intend to tweak
> some commit logs and (maybe) make the "clear status" functions void
> since there are only one or two minor uses of the return values.  But
> it's pretty much what I hope to merge.

Your review/edr branch seems to have covered everything.
Thanks for working on it.

>
> Bjorn

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-04  2:36 ` [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode sathyanarayanan.kuppuswamy
  2020-03-06  5:45   ` Kuppuswamy, Sathyanarayanan
@ 2020-03-10  2:40   ` Bjorn Helgaas
  2020-03-10  4:28     ` Kuppuswamy, Sathyanarayanan
  1 sibling, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-10  2:40 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy
  Cc: linux-pci, linux-kernel, ashok.raj, Austin Bolen

[+cc Austin, tentative Linux patches on this git branch:
https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/tree/drivers/pci/pcie?h=review/edr]

On Tue, Mar 03, 2020 at 06:36:32PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> As per PCI firmware specification r3.2 System Firmware Intermediary
> (SFI) _OSC and DPC Updates ECR
> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
> Event Handling Implementation Note", page 10, Error Disconnect Recover
> (EDR) support allows OS to handle error recovery and clearing Error
> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
> which allows clearing AER registers without FF mode checks.

I see that this ECR was released as an ECN a few days ago:
https://members.pcisig.com/wg/PCI-SIG/document/14076
Regrettably the title in the PDF still says "ECR" (the rendered title
*page* says "ENGINEERING CHANGE NOTIFICATION", but some metadata
buried in the file says "ECR - SFI _OSC Support and DPC Updates").

Anyway, I think I see the note you refer to (now on page 12):

  IMPLEMENTATION NOTE
  DPC Event Handling

  The flow chart below documents the behavior when firmware maintains
  control of AER and DPC and grants control of PCIe Hot-Plug to the
  operating system.

  ...

  Capture and clear device AER status. OS may choose to offline
  devices3, either via SW (not load driver) or HW (power down device,
  disable Link5,6,7). Otherwise process _HPX, complete device
  enumeration, load drivers

This clearly suggests that the OS should clear device AER status.
However, according to the intro text, firmware has retained control of
AER, so what gives the OS the right to clear AER status?

The Downstream Port Containment Related Enhancements ECN (sec 4.5.1,
table 4-6) contains an exception that allows the OS to read/write
DPC registers during recovery.  But

  - that is for *DPC* registers, not for AER registers, and

  - that exception only applies between OS receipt of the EDR
    notification and OS release of DPC by clearing the DPC Trigger
    Status bit.

The flowchart in the SFI ECN shows the OS releasing DPC before
clearing AER status:

  - Receive EDR notification

  - Cleanup - Notify and unload child drivers below Port

  - Bring Port out of DPC, clear port error status, assign bus numbers
    to child devices.

    I assume this box includes clearing DPC error status and clearing
    Trigger Status?  They seem to be out of order in the box.

  - Evaluate _OST

  - Capture and clear device AER status.

    This seems suspect to me.  Where does it say the OS is allowed to
    write AER status when firmware retains control of AER?

This patch series does things in this order:

  - Receive EDR notification (edr_handle_event(), edr.c)

  - Read, log, and clear DPC error regs (dpc_process_error(), dpc.c).

    This also clears AER uncorrectable error status when the relevant
    HEST entries do not have the FIRMWARE_FIRST bit set.  I think this
    is incorrect: the test should be based on the _OSC negotiation for
    AER ownership, not on the HEST entries.  But this problem
    pre-dates this patch series.

  - Clear AER status (pci_aer_raw_clear_status(), aer.c).

    This is at least inside the EDR recovery window, but again, I
    don't see where it says the OS is allowed to write the AER status.

  - Attempt recovery (pcie_do_recovery(), err.c)

  - Clear DPC Trigger Status (dpc_reset_link(), dpc.c)

  - Evaluate _OST (acpi_send_edr_status(), edr.c)

What am I missing?
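
To make the ordering question concrete, here is a small userspace model
of the two sequences listed above.  It is only a sketch: the enum and
helper names are hypothetical, and it assumes (as guessed above) that
the flowchart's "Bring Port out of DPC" box is where DPC Trigger Status
gets cleared.  The helper reports whether the AER status clear happens
strictly between receipt of the EDR notification and the clearing of
DPC Trigger Status, i.e. inside the window covered by the table 4-6
exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the steps named in the two lists above. */
enum model_step {
	STEP_RECEIVE_EDR,
	STEP_CLEAR_DPC_ERROR_REGS,
	STEP_CLEAR_AER_STATUS,
	STEP_ATTEMPT_RECOVERY,
	STEP_CLEAR_DPC_TRIGGER_STATUS,
	STEP_EVALUATE_OST,
};

/* Order shown in the SFI ECN flowchart (assumption: box 3 releases DPC). */
static const enum model_step model_flowchart_order[] = {
	STEP_RECEIVE_EDR,
	STEP_CLEAR_DPC_TRIGGER_STATUS,
	STEP_EVALUATE_OST,
	STEP_CLEAR_AER_STATUS,
};

/* Order used by this patch series, as listed above. */
static const enum model_step model_series_order[] = {
	STEP_RECEIVE_EDR,
	STEP_CLEAR_DPC_ERROR_REGS,
	STEP_CLEAR_AER_STATUS,
	STEP_ATTEMPT_RECOVERY,
	STEP_CLEAR_DPC_TRIGGER_STATUS,
	STEP_EVALUATE_OST,
};

/* Return the index of step in order[], or -1 if it is absent. */
static int model_find(const enum model_step *order, size_t n,
		      enum model_step step)
{
	for (size_t i = 0; i < n; i++) {
		if (order[i] == step)
			return (int)i;
	}
	return -1;
}

/* True if AER status is cleared after the EDR notify and before release. */
static bool model_aer_clear_in_window(const enum model_step *order, size_t n)
{
	int start = model_find(order, n, STEP_RECEIVE_EDR);
	int aer = model_find(order, n, STEP_CLEAR_AER_STATUS);
	int end = model_find(order, n, STEP_CLEAR_DPC_TRIGGER_STATUS);

	if (start < 0 || aer < 0 || end < 0)
		return false;
	return start < aer && aer < end;
}
```

Under that assumption the flowchart order puts the AER clear outside
the window and the series order puts it inside.  Neither result says
anything about whether the OS is actually permitted to write AER
status; that remains the open question.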

> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> ---
>  drivers/pci/pci.h      |  2 ++
>  drivers/pci/pcie/aer.c | 22 ++++++++++++++++++----
>  2 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index e57e78b619f8..c239e6dd2542 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -655,6 +655,7 @@ extern const struct attribute_group aer_stats_attr_group;
>  void pci_aer_clear_fatal_status(struct pci_dev *dev);
>  void pci_aer_clear_device_status(struct pci_dev *dev);
>  int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
> +int pci_aer_raw_clear_status(struct pci_dev *dev);
>  #else
>  static inline void pci_no_aer(void) { }
>  static inline void pci_aer_init(struct pci_dev *d) { }
> @@ -665,6 +666,7 @@ static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>  {
>  	return -EINVAL;
>  }
> +int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
>  #endif
>  
>  #ifdef CONFIG_ACPI
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index c0540c3761dc..41afefa562b7 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -420,7 +420,16 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>  		pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
>  }
>  
> -int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
> +/**
> + * pci_aer_raw_clear_status - Clear AER error registers.
> + * @dev: the PCI device
> + *
> + * NOTE: Allows clearing error registers in both FF and
> + * non FF modes.
> + *
> + * Returns 0 on success, or negative on failure.
> + */
> +int pci_aer_raw_clear_status(struct pci_dev *dev)
>  {
>  	int pos;
>  	u32 status;
> @@ -433,9 +442,6 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>  	if (!pos)
>  		return -EIO;
>  
> -	if (pcie_aer_get_firmware_first(dev))
> -		return -EIO;
> -
>  	port_type = pci_pcie_type(dev);
>  	if (port_type == PCI_EXP_TYPE_ROOT_PORT) {
>  		pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);
> @@ -451,6 +457,14 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>  	return 0;
>  }
>  
> +int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
> +{
> +	if (pcie_aer_get_firmware_first(dev))
> +		return -EIO;
> +
> +	return pci_aer_raw_clear_status(dev);
> +}
> +
>  void pci_save_aer_state(struct pci_dev *dev)
>  {
>  	struct pci_cap_saved_state *save_state;
> -- 
> 2.25.1
> 
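
For readers following the refactoring in the diff above, the split
between the checked and "raw" entry points can be sketched in userspace
C.  This is a model under stated assumptions, not the kernel
implementation: struct model_dev and the model_* functions are
hypothetical, and config-space access is replaced by plain fields with
write-1-to-clear semantics.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PCIe device with an AER capability. */
struct model_dev {
	bool has_aer_cap;      /* models "pos != 0" in the patch */
	bool firmware_first;   /* models pcie_aer_get_firmware_first() */
	uint32_t uncor_status; /* models PCI_ERR_UNCOR_STATUS */
	uint32_t cor_status;   /* models PCI_ERR_COR_STATUS */
};

/* Write-1-to-clear: every bit set in val is cleared in the register. */
static void model_w1c(uint32_t *reg, uint32_t val)
{
	*reg &= ~val;
}

/* Raw variant: clears status with no firmware-first check. */
static int model_aer_raw_clear_status(struct model_dev *dev)
{
	if (!dev->has_aer_cap)
		return -EIO;

	/* Read the current status and write it back to clear it. */
	model_w1c(&dev->cor_status, dev->cor_status);
	model_w1c(&dev->uncor_status, dev->uncor_status);
	return 0;
}

/* Checked variant: keeps the old behavior by refusing in FF mode. */
static int model_cleanup_aer_error_status_regs(struct model_dev *dev)
{
	if (dev->firmware_first)
		return -EIO;

	return model_aer_raw_clear_status(dev);
}
```

Existing callers of the checked function see no change in behavior,
while the EDR path can call the raw function directly.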

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-10  2:40   ` Bjorn Helgaas
@ 2020-03-10  4:28     ` Kuppuswamy, Sathyanarayanan
  2020-03-10 18:14       ` Austin.Bolen
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-03-10  4:28 UTC (permalink / raw)
  To: Bjorn Helgaas, Austin Bolen; +Cc: linux-pci, linux-kernel, ashok.raj

Hi Bjorn,

On 3/9/2020 7:40 PM, Bjorn Helgaas wrote:
> [+cc Austin, tentative Linux patches on this git branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/tree/drivers/pci/pcie?h=review/edr]
> 
> On Tue, Mar 03, 2020 at 06:36:32PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>
>> As per PCI firmware specification r3.2 System Firmware Intermediary
>> (SFI) _OSC and DPC Updates ECR
>> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
>> Event Handling Implementation Note", page 10, Error Disconnect Recover
>> (EDR) support allows OS to handle error recovery and clearing Error
>> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
>> which allows clearing AER registers without FF mode checks.
> 
> I see that this ECR was released as an ECN a few days ago:
> https://members.pcisig.com/wg/PCI-SIG/document/14076
> Regrettably the title in the PDF still says "ECR" (the rendered title
> *page* says "ENGINEERING CHANGE NOTIFICATION", but some metadata
> buried in the file says "ECR - SFI _OSC Support and DPC Updates").
> 
> Anyway, I think I see the note you refer to (now on page 12):
> 
>    IMPLEMENTATION NOTE
>    DPC Event Handling
> 
>    The flow chart below documents the behavior when firmware maintains
>    control of AER and DPC and grants control of PCIe Hot-Plug to the
>    operating system.
> 
>    ...
> 
>    Capture and clear device AER status. OS may choose to offline
>    devices3, either via SW (not load driver) or HW (power down device,
>    disable Link5,6,7). Otherwise process _HPX, complete device
>    enumeration, load drivers
> 
> This clearly suggests that the OS should clear device AER status.
> However, according to the intro text, firmware has retained control of
> AER, so what gives the OS the right to clear AER status?
> 
> The Downstream Port Containment Related Enhancements ECN (sec 4.5.1,
> table 4-6) contains an exception that allows the OS to read/write
> DPC registers during recovery.  But
> 
>    - that is for *DPC* registers, not for AER registers, and
> 
>    - that exception only applies between OS receipt of the EDR
>      notification and OS release of DPC by clearing the DPC Trigger
>      Status bit.
> 
> The flowchart in the SFI ECN shows the OS releasing DPC before
> clearing AER status:
> 
>    - Receive EDR notification
> 
>    - Cleanup - Notify and unload child drivers below Port
> 
>    - Bring Port out of DPC, clear port error status, assign bus numbers
>      to child devices.
> 
>      I assume this box includes clearing DPC error status and clearing
>      Trigger Status?  They seem to be out of order in the box.
> 
>    - Evaluate _OST
> 
>    - Capture and clear device AER status.
> 
>      This seems suspect to me.  Where does it say the OS is allowed to
>      write AER status when firmware retains control of AER?
> 
> This patch series does things in this order:
> 
>    - Receive EDR notification (edr_handle_event(), edr.c)
> 
>    - Read, log, and clear DPC error regs (dpc_process_error(), dpc.c).
> 
>      This also clears AER uncorrectable error status when the relevant
>      HEST entries do not have the FIRMWARE_FIRST bit set.  I think this
>      is incorrect: the test should be based on the _OSC negotiation for
>      AER ownership, not on the HEST entries.  But this problem
>      pre-dates this patch series.
> 
>    - Clear AER status (pci_aer_raw_clear_status(), aer.c).
> 
>      This is at least inside the EDR recovery window, but again, I
>      don't see where it says the OS is allowed to write the AER status.

The implementation note is the only reference we have regarding clearing
the AER registers.

But since the spec says DPC and AER must always be controlled together
by either the OS or firmware, when firmware relinquishes control over
the DPC registers in the EDR notification window, we can assume that we
also have control over the AER registers.

But I agree that this is not explicitly spelled out anywhere outside the
implementation note.


Austin,

Maybe the ECN (section 4.5.1, table 4-6) needs to be updated to add this
clarification.

> 
>    - Attempt recovery (pcie_do_recovery(), err.c)
> 
>    - Clear DPC Trigger Status (dpc_reset_link(), dpc.c)
> 
>    - Evaluate _OST (acpi_send_edr_status(), edr.c)
> 
> What am I missing?
> 
>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>> ---
>>   drivers/pci/pci.h      |  2 ++
>>   drivers/pci/pcie/aer.c | 22 ++++++++++++++++++----
>>   2 files changed, 20 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>> index e57e78b619f8..c239e6dd2542 100644
>> --- a/drivers/pci/pci.h
>> +++ b/drivers/pci/pci.h
>> @@ -655,6 +655,7 @@ extern const struct attribute_group aer_stats_attr_group;
>>   void pci_aer_clear_fatal_status(struct pci_dev *dev);
>>   void pci_aer_clear_device_status(struct pci_dev *dev);
>>   int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
>> +int pci_aer_raw_clear_status(struct pci_dev *dev);
>>   #else
>>   static inline void pci_no_aer(void) { }
>>   static inline void pci_aer_init(struct pci_dev *d) { }
>> @@ -665,6 +666,7 @@ static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>>   {
>>   	return -EINVAL;
>>   }
>> +int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
>>   #endif
>>   
>>   #ifdef CONFIG_ACPI
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index c0540c3761dc..41afefa562b7 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -420,7 +420,16 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>>   		pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
>>   }
>>   
>> -int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>> +/**
>> + * pci_aer_raw_clear_status - Clear AER error registers.
>> + * @dev: the PCI device
>> + *
>> + * NOTE: Allows clearing error registers in both FF and
>> + * non FF modes.
>> + *
>> + * Returns 0 on success, or negative on failure.
>> + */
>> +int pci_aer_raw_clear_status(struct pci_dev *dev)
>>   {
>>   	int pos;
>>   	u32 status;
>> @@ -433,9 +442,6 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>>   	if (!pos)
>>   		return -EIO;
>>   
>> -	if (pcie_aer_get_firmware_first(dev))
>> -		return -EIO;
>> -
>>   	port_type = pci_pcie_type(dev);
>>   	if (port_type == PCI_EXP_TYPE_ROOT_PORT) {
>>   		pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);
>> @@ -451,6 +457,14 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>>   	return 0;
>>   }
>>   
>> +int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>> +{
>> +	if (pcie_aer_get_firmware_first(dev))
>> +		return -EIO;
>> +
>> +	return pci_aer_raw_clear_status(dev);
>> +}
>> +
>>   void pci_save_aer_state(struct pci_dev *dev)
>>   {
>>   	struct pci_cap_saved_state *save_state;
>> -- 
>> 2.25.1
>>

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-10 18:14       ` Austin.Bolen
@ 2020-03-10 19:32         ` Bjorn Helgaas
  2020-03-10 20:06           ` Austin.Bolen
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-10 19:32 UTC (permalink / raw)
  To: Austin.Bolen
  Cc: sathyanarayanan.kuppuswamy, linux-pci, linux-kernel, ashok.raj

On Tue, Mar 10, 2020 at 06:14:20PM +0000, Austin.Bolen@dell.com wrote:
> On 3/9/2020 11:28 PM, Kuppuswamy, Sathyanarayanan wrote:
> > On 3/9/2020 7:40 PM, Bjorn Helgaas wrote:
> >> [+cc Austin, tentative Linux patches on this git branch:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/tree/drivers/pci/pcie?h=review/edr]
> >>
> >> On Tue, Mar 03, 2020 at 06:36:32PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> >>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> >>>
> >>> As per PCI firmware specification r3.2 System Firmware Intermediary
> >>> (SFI) _OSC and DPC Updates ECR
> >>> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
> >>> Event Handling Implementation Note", page 10, Error Disconnect Recover
> >>> (EDR) support allows OS to handle error recovery and clearing Error
> >>> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
> >>> which allows clearing AER registers without FF mode checks.
> >>
> >> I see that this ECR was released as an ECN a few days ago:
> >> https://members.pcisig.com/wg/PCI-SIG/document/14076
> >> Regrettably the title in the PDF still says "ECR" (the rendered title
> >> *page* says "ENGINEERING CHANGE NOTIFICATION", but some metadata
> >> buried in the file says "ECR - SFI _OSC Support and DPC Updates".
> 
> I'll see if PCI-SIG can update the metadata and repost.

If that's possible, it would be nice to update the metadata for the
"Downstream Port Containment related Enhancements" ECN as well.  That
one currently says "ECR - CardBus Header Proposal", which means that's
what's in the window title bar and icons in the panel.

> >> Anyway, I think I see the note you refer to (now on page 12):
> >>
> >>     IMPLEMENTATION NOTE
> >>     DPC Event Handling
> >>
> >>     The flow chart below documents the behavior when firmware maintains
> >>     control of AER and DPC and grants control of PCIe Hot-Plug to the
> >>     operating system.
> >>
> >>     ...
> >>
> >>     Capture and clear device AER status. OS may choose to offline
> >>     devices3, either via SW (not load driver) or HW (power down device,
> >>     disable Link5,6,7). Otherwise process _HPX, complete device
> >>     enumeration, load drivers
> >>
> >> This clearly suggests that the OS should clear device AER status.
> >> However, according to the intro text, firmware has retained control of
> >> AER, so what gives the OS the right to clear AER status?
> >>
> >> The Downstream Port Containment Related Enhancements ECN (sec 4.5.1,
> >> table 4-6) contains an exception that allows the OS to read/write
> >> DPC registers during recovery.  But
> >>
> >>     - that is for *DPC* registers, not for AER registers, and
> >>
> >>     - that exception only applies between OS receipt of the EDR
> >>       notification and OS release of DPC by clearing the DPC Trigger
> >>       Status bit.
> >>
> >> The flowchart in the SFI ECN shows the OS releasing DPC before
> >> clearing AER status:
> >>
> >>     - Receive EDR notification
> >>
> >>     - Cleanup - Notify and unload child drivers below Port
> >>
> >>     - Bring Port out of DPC, clear port error status, assign bus numbers
> >>       to child devices.
> >>
> >>       I assume this box includes clearing DPC error status and clearing
> >>       Trigger Status?  They seem to be out of order in the box.
> 
> OS clears the DPC Trigger Status bit which will bring port below it out 
> of containment. Then OS will clear the "port" error status bits (i.e., 
> the AER and DPC status bits in the root port or downstream port that 
> triggered containment). I don't think it would hurt to do these two steps 
> in reverse order but don't think it is necessary. Note that error status 
> bits for devices below the port in containment are cleared later after 
> f/w has a chance to log them.

Maybe I'm misreading the DPC enhancements ECN.  I think it says the OS
can read/write DPC registers until it clears the DPC Trigger Status.
If the OS clears Trigger Status first, my understanding is that we're
now out of the EDR notification processing window and the OS is not
permitted to write DPC registers.

If it's OK for the OS to clear Trigger Status before clearing DPC
error status, what is the event that determines when the OS may no
longer read/write the DPC registers?

> >>     - Evaluate _OST
> >>
> >>     - Capture and clear device AER status.
> >>
> >>       This seems suspect to me.  Where does it say the OS is
> >>       allowed to write AER status when firmware retains control
> >>       of AER?
> >>
> >> This patch series does things in this order:
> >>
> >>     - Receive EDR notification (edr_handle_event(), edr.c)
> >>
> >>     - Read, log, and clear DPC error regs (dpc_process_error(),
> >>       dpc.c).
> >>
> >>       This also clears AER uncorrectable error status when the
> >>       relevant HEST entries do not have the FIRMWARE_FIRST bit
> >>       set.  I think this is incorrect: the test should be based on
> >>       the _OSC negotiation for AER ownership, not on the HEST
> >>       entries.  But this problem pre-dates this patch series.
> >>
> >>     - Clear AER status (pci_aer_raw_clear_status(), aer.c).
> >>
> >>       This is at least inside the EDR recovery window, but again,
> >>       I don't see where it says the OS is allowed to write the
> >>       AER status.
> > 
> > Implementation note is the only reference we have regarding
> > clearing the AER registers.
> > 
> > But since the spec says both DPC and AER needs to be always
> > controlled together by the either OS or firmware, and when
> > firmware relinquishes control over DPC registers in EDR
> > notification window, we can assume that we also have control over
> > AER registers.
> > 
> > But I agree that is not explicitly spelled out any where outside
> > the implementation note.

This is all quite unsatisfying since implementation notes are not
normative.  I would far rather reference actual spec text.

> > Austin,
> > 
> > May be ECN (section 4.5.1, table 4-6) needs to be updated to add
> > this clarification.
> 
> Sure we can update to section 4.5.1, table 4-6 to indicate when OS
> can clear the AER status bits. It will just follow what's done in
> the implementation note so I think it's acceptable to follow
> implementation guidance for now.

There are no events after the "clear device AER status" box.  That
seems to mean the OS can write the AER status registers at any time.
But the whole implementation note assumes firmware maintains control
of AER.

> >>     - Attempt recovery (pcie_do_recovery(), err.c)
> >>
> >>     - Clear DPC Trigger Status (dpc_reset_link(), dpc.c)
> >>
> >>     - Evaluate _OST (acpi_send_edr_status(), edr.c)
> >>
> >> What am I missing?

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-10 19:32         ` Bjorn Helgaas
@ 2020-03-10 20:06           ` Austin.Bolen
  2020-03-10 20:41             ` Kuppuswamy Sathyanarayanan
                               ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: Austin.Bolen @ 2020-03-10 20:06 UTC (permalink / raw)
  To: helgaas, Austin.Bolen
  Cc: sathyanarayanan.kuppuswamy, linux-pci, linux-kernel, ashok.raj

On 3/10/2020 2:33 PM, Bjorn Helgaas wrote:
> 
> On Tue, Mar 10, 2020 at 06:14:20PM +0000, Austin.Bolen@dell.com wrote:
>> On 3/9/2020 11:28 PM, Kuppuswamy, Sathyanarayanan wrote:
>>> On 3/9/2020 7:40 PM, Bjorn Helgaas wrote:
>>>> [+cc Austin, tentative Linux patches on this git branch:
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/tree/drivers/pci/pcie?h=review/edr]
>>>>
>>>> On Tue, Mar 03, 2020 at 06:36:32PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>>>
>>>>> As per PCI firmware specification r3.2 System Firmware Intermediary
>>>>> (SFI) _OSC and DPC Updates ECR
>>>>> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
>>>>> Event Handling Implementation Note", page 10, Error Disconnect Recover
>>>>> (EDR) support allows OS to handle error recovery and clearing Error
>>>>> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
>>>>> which allows clearing AER registers without FF mode checks.
>>>>
>>>> I see that this ECR was released as an ECN a few days ago:
>>>> https://members.pcisig.com/wg/PCI-SIG/document/14076
>>>> Regrettably the title in the PDF still says "ECR" (the rendered title
>>>> *page* says "ENGINEERING CHANGE NOTIFICATION", but some metadata
>>>> buried in the file says "ECR - SFI _OSC Support and DPC Updates".
>>
>> I'll see if PCI-SIG can update the metadata and repost.
> 
> If that's possible, it would be nice to update the metadata for the
> "Downstream Port Containment related Enhancements" ECN as well.  That
> one currently says "ECR - CardBus Header Proposal", which means that's
> what's in the window title bar and icons in the panel.

Sure, I'll check.

> 
>>>> Anyway, I think I see the note you refer to (now on page 12):
>>>>
>>>>      IMPLEMENTATION NOTE
>>>>      DPC Event Handling
>>>>
>>>>      The flow chart below documents the behavior when firmware maintains
>>>>      control of AER and DPC and grants control of PCIe Hot-Plug to the
>>>>      operating system.
>>>>
>>>>      ...
>>>>
>>>>      Capture and clear device AER status. OS may choose to offline
>>>>      devices3, either via SW (not load driver) or HW (power down device,
>>>>      disable Link5,6,7). Otherwise process _HPX, complete device
>>>>      enumeration, load drivers
>>>>
>>>> This clearly suggests that the OS should clear device AER status.
>>>> However, according to the intro text, firmware has retained control of
>>>> AER, so what gives the OS the right to clear AER status?
>>>>
>>>> The Downstream Port Containment Related Enhancements ECN (sec 4.5.1,
>>>> table 4-6) contains an exception that allows the OS to read/write
>>>> DPC registers during recovery.  But
>>>>
>>>>      - that is for *DPC* registers, not for AER registers, and
>>>>
>>>>      - that exception only applies between OS receipt of the EDR
>>>>        notification and OS release of DPC by clearing the DPC Trigger
>>>>        Status bit.
>>>>
>>>> The flowchart in the SFI ECN shows the OS releasing DPC before
>>>> clearing AER status:
>>>>
>>>>      - Receive EDR notification
>>>>
>>>>      - Cleanup - Notify and unload child drivers below Port
>>>>
>>>>      - Bring Port out of DPC, clear port error status, assign bus numbers
>>>>        to child devices.
>>>>
>>>>        I assume this box includes clearing DPC error status and clearing
>>>>        Trigger Status?  They seem to be out of order in the box.
>>
>> OS clears the DPC Trigger Status bit which will bring port below it out
>> of containment. Then OS will clear the "port" error status bits (i.e.,
>> the AER and DPC status bits in the root port or downstream port that
>> triggered containment). I don't think it would hurt to do these two steps
>> in reverse order but don't think it is necessary. Note that error status
>> bits for devices below the port in containment are cleared later after
>> f/w has a chance to log them.
> 
> Maybe I'm misreading the DPC enhancements ECN.  I think it says the OS
> can read/write DPC registers until it clears the DPC Trigger Status.
> If the OS clears Trigger Status first, my understanding is that we're
> now out of the EDR notification processing window and the OS is not
> permitted to write DPC registers.
> 
> If it's OK for the OS to clear Trigger Status before clearing DPC
> error status, what is the event that determines when the OS may no
> longer read/write the DPC registers?

I think there are a few different registers to consider: DPC Control,
DPC Status, various AER registers, and the RP PIO registers. At this
point in the flow, the firmware has already had a chance to read all of
them, so it really doesn't matter in which order the OS does those two
things. The firmware isn't going to get notified again until _OST, so by
then both operations will be done and system firmware will have no idea
which order the OS did them in, nor will it care.  But since the
existing normative text specifies an order, I would just follow that.

> 
>>>>      - Evaluate _OST
>>>>
>>>>      - Capture and clear device AER status.
>>>>
>>>>        This seems suspect to me.  Where does it say the OS is
>>>>        allowed to write AER status when firmware retains control
>>>>        of AER?
>>>>
>>>> This patch series does things in this order:
>>>>
>>>>      - Receive EDR notification (edr_handle_event(), edr.c)
>>>>
>>>>      - Read, log, and clear DPC error regs (dpc_process_error(),
>>>>        dpc.c).
>>>>
>>>>        This also clears AER uncorrectable error status when the
>>>>        relevant HEST entries do not have the FIRMWARE_FIRST bit
>>>>        set.  I think this is incorrect: the test should be based on
>>>>        the _OSC negotiation for AER ownership, not on the HEST
>>>>        entries.  But this problem pre-dates this patch series.
>>>>
>>>>      - Clear AER status (pci_aer_raw_clear_status(), aer.c).
>>>>
>>>>        This is at least inside the EDR recovery window, but again,
>>>>        I don't see where it says the OS is allowed to write the
>>>>        AER status.
>>>
>>> Implementation note is the only reference we have regarding
>>> clearing the AER registers.
>>>
>>> But since the spec says both DPC and AER needs to be always
>>> controlled together by the either OS or firmware, and when
>>> firmware relinquishes control over DPC registers in EDR
>>> notification window, we can assume that we also have control over
>>> AER registers.
>>>
>>> But I agree that is not explicitly spelled out any where outside
>>> the implementation note.
> 
> This is all quite unsatisfying since implementation notes are not
> normative.  I would far rather reference actual spec text.

Yes, the change I mention below would be to add normative text.

> 
>>> Austin,
>>>
>>> May be ECN (section 4.5.1, table 4-6) needs to be updated to add
>>> this clarification.
>>
>> Sure we can update to section 4.5.1, table 4-6 to indicate when OS
>> can clear the AER status bits. It will just follow what's done in
>> the implementation note so I think it's acceptable to follow
>> implementation guidance for now.
> 
> There are no events after the "clear device AER status" box.  That
> seems to mean the OS can write the AER status registers at any time.
> But the whole implementation note assumes firmware maintains control
> of AER.
> 

In this model the OS doesn't own DPC or AER, but the model allows the OS to
touch both DPC and AER registers at certain times.  I would view
ownership in this case as who is the primary owner and not who is the
sole entity allowed to access the registers.

For the normative text describing when the OS clears the AER bits following
the informative flow chart, it could say that the OS clears AER as soon as
possible after _OST returns and before the OS processes _HPX and loads
drivers.  Open to other suggestions as well.
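
To make the proposed ordering concrete, here is a rough sketch of the
flow chart boxes as they were summarized earlier in this thread, with a
check that a given sequence follows them. The enum and function names
are made up for illustration and do not come from the ECN or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flow chart boxes as summarized in this thread; names are hypothetical. */
enum edr_step {
	EDR_RECEIVE_NOTIFY,        /* receive EDR notification */
	EDR_UNLOAD_CHILD_DRIVERS,  /* notify and unload drivers below Port */
	EDR_RELEASE_PORT,          /* bring Port out of DPC, clear port status */
	EDR_EVALUATE_OST,          /* evaluate _OST */
	EDR_CLEAR_DEVICE_AER,      /* capture and clear device AER status */
	EDR_HPX_ENUM_LOAD_DRIVERS, /* process _HPX, enumerate, load drivers */
};

/* Returns true if steps[] is in strictly increasing flow chart order. */
bool edr_steps_in_order(const enum edr_step *steps, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (steps[i] <= steps[i - 1])
			return false;
	return true;
}
```

With this model, a sequence that clears device AER status before
evaluating _OST is rejected, which is the constraint the normative text
would need to state.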

>>>>      - Attempt recovery (pcie_do_recovery(), err.c)
>>>>
>>>>      - Clear DPC Trigger Status (dpc_reset_link(), dpc.c)
>>>>
>>>>      - Evaluate _OST (acpi_send_edr_status(), edr.c)
>>>>
>>>> What am I missing?
> 


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-10 20:06           ` Austin.Bolen
@ 2020-03-10 20:41             ` Kuppuswamy Sathyanarayanan
  2020-03-10 20:41               ` Kuppuswamy Sathyanarayanan
  2020-03-10 20:49               ` Austin.Bolen
  2020-03-11 14:45             ` Bjorn Helgaas
  2020-03-11 22:05             ` Bjorn Helgaas
  2 siblings, 2 replies; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-10 20:41 UTC (permalink / raw)
  To: Austin.Bolen, helgaas; +Cc: linux-pci, linux-kernel, ashok.raj

Hi,

On 3/10/20 1:06 PM, Austin.Bolen@dell.com wrote:
> On 3/10/2020 2:33 PM, Bjorn Helgaas wrote:
>> On Tue, Mar 10, 2020 at 06:14:20PM +0000, Austin.Bolen@dell.com wrote:
>>> On 3/9/2020 11:28 PM, Kuppuswamy, Sathyanarayanan wrote:
>>>> On 3/9/2020 7:40 PM, Bjorn Helgaas wrote:
>>>>> [+cc Austin, tentative Linux patches on this git branch:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/tree/drivers/pci/pcie?h=review/edr]
>>>>>
>>>>> On Tue, Mar 03, 2020 at 06:36:32PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>>>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>>>>
>>>>>> As per PCI firmware specification r3.2 System Firmware Intermediary
>>>>>> (SFI) _OSC and DPC Updates ECR
>>>>>> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
>>>>>> Event Handling Implementation Note", page 10, Error Disconnect Recover
>>>>>> (EDR) support allows OS to handle error recovery and clearing Error
>>>>>> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
>>>>>> which allows clearing AER registers without FF mode checks.
>>>>> I see that this ECR was released as an ECN a few days ago:
>>>>> https://members.pcisig.com/wg/PCI-SIG/document/14076
>>>>> Regrettably the title in the PDF still says "ECR" (the rendered title
>>>>> *page* says "ENGINEERING CHANGE NOTIFICATION", but some metadata
>>>>> buried in the file says "ECR - SFI _OSC Support and DPC Updates".
>>> I'll see if PCI-SIG can update the metadata and repost.
>> If that's possible, it would be nice to update the metadata for the
>> "Downstream Port Containment related Enhancements" ECN as well.  That
>> one currently says "ECR - CardBus Header Proposal", which means that's
>> what's in the window title bar and icons in the panel.
> Sure, I'll check.
>
>>>>> Anyway, I think I see the note you refer to (now on page 12):
>>>>>
>>>>>       IMPLEMENTATION NOTE
>>>>>       DPC Event Handling
>>>>>
>>>>>       The flow chart below documents the behavior when firmware maintains
>>>>>       control of AER and DPC and grants control of PCIe Hot-Plug to the
>>>>>       operating system.
>>>>>
>>>>>       ...
>>>>>
>>>>>       Capture and clear device AER status. OS may choose to offline
>>>>>       devices3, either via SW (not load driver) or HW (power down device,
>>>>>       disable Link5,6,7). Otherwise process _HPX, complete device
>>>>>       enumeration, load drivers
>>>>>
>>>>> This clearly suggests that the OS should clear device AER status.
>>>>> However, according to the intro text, firmware has retained control of
>>>>> AER, so what gives the OS the right to clear AER status?
>>>>>
>>>>> The Downstream Port Containment Related Enhancements ECN (sec 4.5.1,
>>>>> table 4-6) contains an exception that allows the OS to read/write
>>>>> DPC registers during recovery.  But
>>>>>
>>>>>       - that is for *DPC* registers, not for AER registers, and
>>>>>
>>>>>       - that exception only applies between OS receipt of the EDR
>>>>>         notification and OS release of DPC by clearing the DPC Trigger
>>>>>         Status bit.
>>>>>
>>>>> The flowchart in the SFI ECN shows the OS releasing DPC before
>>>>> clearing AER status:
>>>>>
>>>>>       - Receive EDR notification
>>>>>
>>>>>       - Cleanup - Notify and unload child drivers below Port
>>>>>
>>>>>       - Bring Port out of DPC, clear port error status, assign bus numbers
>>>>>         to child devices.
>>>>>
>>>>>         I assume this box includes clearing DPC error status and clearing
>>>>>         Trigger Status?  They seem to be out of order in the box.
>>> OS clears the DPC Trigger Status bit which will bring port below it out
>>> of containment. Then OS will clear the "port" error status bits (i.e.,
>>> the AER and DPC status bits in the root port or downstream port that
>>> triggered containment). I don't think it would hurt to do these two steps
>>> in reverse order but don't think it is necessary. Note that error status
>>> bits for devices below the port in containment are cleared later after
>>> f/w has a chance to log them.
>> Maybe I'm misreading the DPC enhancements ECN.  I think it says the OS
>> can read/write DPC registers until it clears the DPC Trigger Status.
>> If the OS clears Trigger Status first, my understanding is that we're
>> now out of the EDR notification processing window and the OS is not
>> permitted to write DPC registers.
>>
>> If it's OK for the OS to clear Trigger Status before clearing DPC
>> error status, what is the event that determines when the OS may no
>> longer read/write the DPC registers?
> I think there are a few different registers to consider: DPC Control,
> DPC Status, various AER registers, and the RP PIO registers. At this
> point in the flow, the firmware has already had a chance to read all of
> them, so it really doesn't matter in which order the OS does those two
> things. The firmware isn't going to get notified again until _OST, so by
> then both operations will be done and system firmware will have no idea
> which order the OS did them in, nor will it care.  But since the
> existing normative text specifies an order, I would just follow that.
I think the correct order is to clear the port error status *before clearing
the DPC Trigger Status bit*.

Please check the following spec reference (change to 4.5.1 Table 4-6):

    the OS is permitted to read or write DPC Control and Status registers
    of a port while processing an Error Disconnect Recover notification
    from firmware on that port. Error Disconnect Recover notification
    processing begins with the Error Disconnect Recover notify from
    Firmware, and *ends when the OS releases DPC by clearing the DPC
    Trigger Status bit*. Firmware can read DPC Trigger Status bit to
    determine the ownership of DPC Control and Status registers. Firmware
    is not permitted to write to DPC Control and Status registers if DPC
    Trigger Status is set i.e. the link is in DPC state. *Outside of the
    Error Disconnect Recover notification processing window, the OS is
    not permitted to modify DPC Control or Status registers*; only
    firmware is allowed to.

Since the EDR processing window ends with clearing the DPC Trigger Status
bit, the OS needs to clear the DPC and AER registers before it ends.
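
A small user-space model of that window, as a sketch only: struct
mock_port and the mock_* functions are hypothetical, and treating the
AER status as covered by the same window is the assumption argued for
above, not something the quoted text states. The window opens on the
EDR notify and closes when the OS clears Trigger Status, so an error
status write attempted afterwards is refused.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one port; field names are illustrative only. */
struct mock_port {
	bool edr_notified;     /* firmware has sent the EDR notify */
	bool trigger_status;   /* DPC Trigger Status: port is in containment */
	uint32_t error_status; /* DPC/AER error status bits */
};

/* The window is open from the EDR notify until Trigger Status is cleared. */
bool mock_in_edr_window(const struct mock_port *p)
{
	return p->edr_notified && p->trigger_status;
}

/* Firmware signals the OS; containment has already been triggered. */
void mock_edr_notify(struct mock_port *p, uint32_t errors)
{
	p->trigger_status = true;
	p->error_status = errors;
	p->edr_notified = true;
}

/* OS clears error status; refused outside the window. */
int mock_os_clear_error_status(struct mock_port *p)
{
	if (!mock_in_edr_window(p))
		return -EPERM;
	p->error_status = 0;
	return 0;
}

/* OS releases DPC; this also closes the window. */
int mock_os_clear_trigger_status(struct mock_port *p)
{
	if (!mock_in_edr_window(p))
		return -EPERM;
	p->trigger_status = false;
	p->edr_notified = false;
	return 0;
}
```

Clearing error status first and Trigger Status second succeeds in this
model; the reverse order leaves the error bits set because the second
write falls outside the window.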

Austin,

I think the order needs to be reversed in the implementation note.
>
>>>>>       - Evaluate _OST
>>>>>
>>>>>       - Capture and clear device AER status.
>>>>>
>>>>>         This seems suspect to me.  Where does it say the OS is
>>>>>         allowed to write AER status when firmware retains control
>>>>>         of AER?
>>>>>
>>>>> This patch series does things in this order:
>>>>>
>>>>>       - Receive EDR notification (edr_handle_event(), edr.c)
>>>>>
>>>>>       - Read, log, and clear DPC error regs (dpc_process_error(),
>>>>>         dpc.c).
>>>>>
>>>>>         This also clears AER uncorrectable error status when the
>>>>>         relevant HEST entries do not have the FIRMWARE_FIRST bit
>>>>>         set.  I think this is incorrect: the test should be based on
>>>>>         the _OSC negotiation for AER ownership, not on the HEST
>>>>>         entries.  But this problem pre-dates this patch series.
>>>>>
>>>>>       - Clear AER status (pci_aer_raw_clear_status(), aer.c).
>>>>>
>>>>>         This is at least inside the EDR recovery window, but again,
>>>>>         I don't see where it says the OS is allowed to write the
>>>>>         AER status.
>>>> Implementation note is the only reference we have regarding
>>>> clearing the AER registers.
>>>>
>>>> But since the spec says both DPC and AER needs to be always
>>>> controlled together by the either OS or firmware, and when
>>>> firmware relinquishes control over DPC registers in EDR
>>>> notification window, we can assume that we also have control over
>>>> AER registers.
>>>>
>>>> But I agree that is not explicitly spelled out any where outside
>>>> the implementation note.
>> This is all quite unsatisfying since implementation notes are not
>> normative.  I would far rather reference actual spec text.
> Yes, the change I mention below would be to add normative text.
>
>>>> Austin,
>>>>
>>>> May be ECN (section 4.5.1, table 4-6) needs to be updated to add
>>>> this clarification.
>>> Sure we can update to section 4.5.1, table 4-6 to indicate when OS
>>> can clear the AER status bits. It will just follow what's done in
>>> the implementation note so I think it's acceptable to follow
>>> implementation guidance for now.
>> There are no events after the "clear device AER status" box.  That
>> seems to mean the OS can write the AER status registers at any time.
>> But the whole implementation note assumes firmware maintains control
>> of AER.
>>
> In this model the OS doesn't own DPC or AER but the model allows OS to
> touch both DPC and AER registers at certain times.  I would view
> ownership in this case as who is the primary owner and not who is the
> sole entity allowed to access the registers.
>
> For the normative text describing when OS clears the AER bits following
> the informative flow chart, it could say that OS clears AER as soon as
> possible after OST returns and before OS processes _HPX and loading
> drivers.  Open to other suggestions as well.
I think it's better to have another handshake between the OS and
firmware to avoid unnecessary races.
>
>>>>>       - Attempt recovery (pcie_do_recovery(), err.c)
>>>>>
>>>>>       - Clear DPC Trigger Status (dpc_reset_link(), dpc.c)
>>>>>
>>>>>       - Evaluate _OST (acpi_send_edr_status(), edr.c)
>>>>>
>>>>> What am I missing?

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-10 20:41             ` Kuppuswamy Sathyanarayanan
@ 2020-03-10 20:41               ` Kuppuswamy Sathyanarayanan
  2020-03-10 20:49               ` Austin.Bolen
  1 sibling, 0 replies; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-10 20:41 UTC (permalink / raw)
  To: Austin.Bolen, helgaas; +Cc: linux-pci, linux-kernel, ashok.raj

Hi,

On 3/10/20 1:06 PM, Austin.Bolen@dell.com wrote:
> On 3/10/2020 2:33 PM, Bjorn Helgaas wrote:
>> On Tue, Mar 10, 2020 at 06:14:20PM +0000, Austin.Bolen@dell.com wrote:
>>> On 3/9/2020 11:28 PM, Kuppuswamy, Sathyanarayanan wrote:
>>>> On 3/9/2020 7:40 PM, Bjorn Helgaas wrote:
>>>>> [+cc Austin, tentative Linux patches on this git branch:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/tree/drivers/pci/pcie?h=review/edr]
>>>>>
>>>>> On Tue, Mar 03, 2020 at 06:36:32PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>>>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>>>>
>>>>>> As per PCI firmware specification r3.2 System Firmware Intermediary
>>>>>> (SFI) _OSC and DPC Updates ECR
>>>>>> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
>>>>>> Event Handling Implementation Note", page 10, Error Disconnect Recover
>>>>>> (EDR) support allows OS to handle error recovery and clearing Error
>>>>>> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
>>>>>> which allows clearing AER registers without FF mode checks.
>>>>> I see that this ECR was released as an ECN a few days ago:
>>>>> https://members.pcisig.com/wg/PCI-SIG/document/14076
>>>>> Regrettably the title in the PDF still says "ECR" (the rendered title
>>>>> *page* says "ENGINEERING CHANGE NOTIFICATION", but some metadata
>>>>> buried in the file says "ECR - SFI _OSC Support and DPC Updates".
>>> I'll see if PCI-SIG can update the metadata and repost.
>> If that's possible, it would be nice to update the metadata for the
>> "Downstream Port Containment related Enhancements" ECN as well.  That
>> one currently says "ECR - CardBus Header Proposal", which means that's
>> what's in the window title bar and icons in the panel.
> Sure, I'll check.
>
>>>>> Anyway, I think I see the note you refer to (now on page 12):
>>>>>
>>>>>       IMPLEMENTATION NOTE
>>>>>       DPC Event Handling
>>>>>
>>>>>       The flow chart below documents the behavior when firmware maintains
>>>>>       control of AER and DPC and grants control of PCIe Hot-Plug to the
>>>>>       operating system.
>>>>>
>>>>>       ...
>>>>>
>>>>>       Capture and clear device AER status. OS may choose to offline
>>>>>       devices3, either via SW (not load driver) or HW (power down device,
>>>>>       disable Link5,6,7). Otherwise process _HPX, complete device
>>>>>       enumeration, load drivers
>>>>>
>>>>> This clearly suggests that the OS should clear device AER status.
>>>>> However, according to the intro text, firmware has retained control of
>>>>> AER, so what gives the OS the right to clear AER status?
>>>>>
>>>>> The Downstream Port Containment Related Enhancements ECN (sec 4.5.1,
>>>>> table 4-6) contains an exception that allows the OS to read/write
>>>>> DPC registers during recovery.  But
>>>>>
>>>>>       - that is for *DPC* registers, not for AER registers, and
>>>>>
>>>>>       - that exception only applies between OS receipt of the EDR
>>>>>         notification and OS release of DPC by clearing the DPC Trigger
>>>>>         Status bit.
>>>>>
>>>>> The flowchart in the SFI ECN shows the OS releasing DPC before
>>>>> clearing AER status:
>>>>>
>>>>>       - Receive EDR notification
>>>>>
>>>>>       - Cleanup - Notify and unload child drivers below Port
>>>>>
>>>>>       - Bring Port out of DPC, clear port error status, assign bus numbers
>>>>>         to child devices.
>>>>>
>>>>>         I assume this box includes clearing DPC error status and clearing
>>>>>         Trigger Status?  They seem to be out of order in the box.
>>> OS clears the DPC Trigger Status bit which will bring port below it out
>>> of containment. Then OS will clear the "port" error status bits (i.e.,
>>> the AER and DPC status bits in the root port or downstream port that
>>> triggered containment). I don't think it would hurt to do these two steps
>>> in reverse order but don't think it is necessary. Note that error status
>>> bits for devices below the port in containment are cleared later after
>>> f/w has a chance to log them.
>> Maybe I'm misreading the DPC enhancements ECN.  I think it says the OS
>> can read/write DPC registers until it clears the DPC Trigger Status.
>> If the OS clears Trigger Status first, my understanding is that we're
>> now out of the EDR notification processing window and the OS is not
>> permitted to write DPC registers.
>>
>> If it's OK for the OS to clear Trigger Status before clearing DPC
>> error status, what is the event that determines when the OS may no
>> longer read/write the DPC registers?
> I think there are a few different registers to consider: DPC Control,
> DPC Status, various AER registers, and the RP PIO registers. At this
> point in the flow, the firmware has already had a chance to read all of
> them, so it really doesn't matter in which order the OS does those two
> things. The firmware isn't going to get notified again until _OST, so by
> then both operations will be done and system firmware will have no idea
> which order the OS did them in, nor will it care.  But since the
> existing normative text specifies an order, I would just follow that.
I think the correct order is to clear the port error status *before clearing
the DPC status trigger*.

Please check the following spec reference (change to 4.5.1 Table 4-6)

the OS is permitted to read or write DPC Control and Status registers of a
port while processing an Error Disconnect Recover notification from firmware
on that port. Error Disconnect Recover notification processing begins with
the Error Disconnect Recover notify from Firmware, and *ends when the OS
releases DPC by clearing the DPC Trigger Status bit*. Firmware can read DPC
Trigger Status bit to determine the ownership of DPC Control and Status
registers. Firmware is not permitted to write to DPC Control and Status
registers if DPC Trigger Status is set i.e. the link is in DPC state.
*Outside of the Error Disconnect Recover notification processing window, the
OS is not permitted to modify DPC Control or Status registers*; only
firmware is allowed to.

Since the EDR processing window ends with clearing DPC Trigger status bit,
OS needs to clear DPC and AER registers before it ends.
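
To make the ordering argument concrete, here is a minimal sketch in plain C.
It is a toy model, not kernel code: struct port_model and the os_*() helpers
are hypothetical names invented for illustration. It assumes the processing
window closes the moment DPC Trigger Status is cleared, so any attempt to
clear port error status after that point is refused:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the port that triggered containment. */
struct port_model {
	bool trigger_status;      /* DPC Trigger Status: set while contained */
	uint32_t dpc_err_status;  /* stand-in for DPC error status bits */
	uint32_t aer_status;      /* stand-in for the port's AER status bits */
};

/* Assumption: the EDR window is open exactly while Trigger Status is set. */
bool os_in_edr_window(const struct port_model *p)
{
	return p->trigger_status;
}

/* Clear port error status; returns -1 if the window has already closed. */
int os_clear_port_error_status(struct port_model *p)
{
	if (!os_in_edr_window(p))
		return -1;
	p->dpc_err_status = 0;
	p->aer_status = 0;
	return 0;
}

/* Releasing DPC ends the window, so this has to be the last step. */
void os_clear_trigger_status(struct port_model *p)
{
	p->trigger_status = false;
}
```

With this model, calling os_clear_port_error_status() and then
os_clear_trigger_status() succeeds, while the reverse order leaves the
status bits set.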

Austin,

I think the order needs to be reversed in the implementation note.
>
>>>>>       - Evaluate _OST
>>>>>
>>>>>       - Capture and clear device AER status.
>>>>>
>>>>>         This seems suspect to me.  Where does it say the OS is
>>>>>         allowed to write AER status when firmware retains control
>>>>>         of AER?
>>>>>
>>>>> This patch series does things in this order:
>>>>>
>>>>>       - Receive EDR notification (edr_handle_event(), edr.c)
>>>>>
>>>>>       - Read, log, and clear DPC error regs (dpc_process_error(),
>>>>>         dpc.c).
>>>>>
>>>>>         This also clears AER uncorrectable error status when the
>>>>>         relevant HEST entries do not have the FIRMWARE_FIRST bit
>>>>>         set.  I think this is incorrect: the test should be based
>>>>>         the _OSC negotiation for AER ownership, not on the HEST
>>>>>         entries.  But this problem pre-dates this patch series.
>>>>>
>>>>>       - Clear AER status (pci_aer_raw_clear_status(), aer.c).
>>>>>
>>>>>         This is at least inside the EDR recovery window, but again,
>>>>>         I don't see where it says the OS is allowed to write the
>>>>>         AER status.
>>>> Implementation note is the only reference we have regarding
>>>> clearing the AER registers.
>>>>
>>>> But since the spec says both DPC and AER need to always be
>>>> controlled together by either the OS or firmware, and when
>>>> firmware relinquishes control over DPC registers in the EDR
>>>> notification window, we can assume that we also have control over
>>>> AER registers.
>>>>
>>>> But I agree that is not explicitly spelled out anywhere outside
>>>> the implementation note.
>> This is all quite unsatisfying since implementation notes are not
>> normative.  I would far rather reference actual spec text.
> Yes, the change I mention below would be to add normative text.
>
>>>> Austin,
>>>>
>>>> Maybe the ECN (section 4.5.1, table 4-6) needs to be updated to add
>>>> this clarification.
>>> Sure, we can update section 4.5.1, table 4-6 to indicate when OS
>>> can clear the AER status bits. It will just follow what's done in
>>> the implementation note so I think it's acceptable to follow
>>> implementation guidance for now.
>> There are no events after the "clear device AER status" box.  That
>> seems to mean the OS can write the AER status registers at any time.
>> But the whole implementation note assumes firmware maintains control
>> of AER.
>>
> In this model the OS doesn't own DPC or AER but the model allows OS to
> touch both DPC and AER registers at certain times.  I would view
> ownership in this case as who is the primary owner and not who is the
> sole entity allowed to access the registers.
>
> For the normative text describing when OS clears the AER bits following
> the informative flow chart, it could say that OS clears AER as soon as
> possible after OST returns and before OS processes _HPX and loading
> drivers.  Open to other suggestions as well.
I think it's better to have another handshake between OS and
firmware to avoid unnecessary races.
>
>>>>>       - Attempt recovery (pcie_do_recovery(), err.c)
>>>>>
>>>>>       - Clear DPC Trigger Status (dpc_reset_link(), dpc.c)
>>>>>
>>>>>       - Evaluate _OST (acpi_send_edr_status(), edr.c)
>>>>>
>>>>> What am I missing?

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer



* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-10 20:41             ` Kuppuswamy Sathyanarayanan
  2020-03-10 20:41               ` Kuppuswamy Sathyanarayanan
@ 2020-03-10 20:49               ` Austin.Bolen
  1 sibling, 0 replies; 68+ messages in thread
From: Austin.Bolen @ 2020-03-10 20:49 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy, Austin.Bolen, helgaas
  Cc: linux-pci, linux-kernel, ashok.raj

On 3/10/2020 3:44 PM, Kuppuswamy Sathyanarayanan wrote:
> 
<snip>
> 
> Please check the following spec reference (change to 4.5.1 Table 4-6)
> 
> the OS is permitted to read or write DPC Control and Status registers of a
> port while processing an Error Disconnect Recover notification from firmware
> on that port. Error Disconnect Recover notification processing begins with
> the Error Disconnect Recover notify from Firmware, and *ends when the OS
> releases DPC by clearing the DPC Trigger Status bit*. Firmware can read DPC
> Trigger Status bit to determine the ownership of DPC Control and Status
> registers. Firmware is not permitted to write to DPC Control and Status
> registers if DPC Trigger Status is set i.e. the link is in DPC state.
> *Outside of the Error Disconnect Recover notification processing window, the
> OS is not permitted to modify DPC Control or Status registers*; only
> firmware is allowed to.
> 
> Since the EDR processing window ends with clearing DPC Trigger status bit,
> OS needs to clear DPC and AER registers before it ends.
> 
> Austin,
> 
> I think the order needs to be reversed in the implementation note.

Agreed.


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-10 20:06           ` Austin.Bolen
  2020-03-10 20:41             ` Kuppuswamy Sathyanarayanan
@ 2020-03-11 14:45             ` Bjorn Helgaas
  2020-03-11 15:19               ` Austin.Bolen
  2020-03-11 22:05             ` Bjorn Helgaas
  2 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-11 14:45 UTC (permalink / raw)
  To: Austin.Bolen
  Cc: sathyanarayanan.kuppuswamy, linux-pci, linux-kernel, ashok.raj

On Tue, Mar 10, 2020 at 08:06:21PM +0000, Austin.Bolen@dell.com wrote:
> On 3/10/2020 2:33 PM, Bjorn Helgaas wrote:
> > On Tue, Mar 10, 2020 at 06:14:20PM +0000, Austin.Bolen@dell.com wrote:
> >> On 3/9/2020 11:28 PM, Kuppuswamy, Sathyanarayanan wrote:
> >>> On 3/9/2020 7:40 PM, Bjorn Helgaas wrote:
> >>>> [+cc Austin, tentative Linux patches on this git branch:
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/tree/drivers/pci/pcie?h=review/edr]
> >>>>
> >>>> On Tue, Mar 03, 2020 at 06:36:32PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> >>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> >>>>>
> >>>>> As per PCI firmware specification r3.2 System Firmware Intermediary
> >>>>> (SFI) _OSC and DPC Updates ECR
> >>>>> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
> >>>>> Event Handling Implementation Note", page 10, Error Disconnect Recover
> >>>>> (EDR) support allows OS to handle error recovery and clearing Error
> >>>>> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
> >>>>> which allows clearing AER registers without FF mode checks.

> >> OS clears the DPC Trigger Status bit which will bring port below it out
> >> of containment. Then OS will clear the "port" error status bits (i.e.,
> >> the AER and DPC status bits in the root port or downstream port that
> >> triggered containment). I don't think it would hurt to do these two steps
> >> in reverse order but don't think it is necessary.

> >> Note that error status bits for devices below the port in
> >> containment are cleared later after f/w has a chance to log them.

Thanks for pointing out this wrinkle about devices below the port in
containment.  I think we might have an issue here with the current
series because evaluating _OST is the last thing the EDR notify
handler does.  More below.

> > Maybe I'm misreading the DPC enhancements ECN.  I think it says the OS
> > can read/write DPC registers until it clears the DPC Trigger Status.
> > If the OS clears Trigger Status first, my understanding is that we're
> > now out of the EDR notification processing window and the OS is not
> > permitted to write DPC registers.
> > 
> > If it's OK for the OS to clear Trigger Status before clearing DPC
> > error status, what is the event that determines when the OS may no
> > longer read/write the DPC registers?
> 
> I think there are a few different registers to consider... DPC
> Control, DPC Status, various AER registers, and the RP PIO
> registers. At this point in the flow, the firmware has already had a
> chance to read all of them and so it really doesn't matter the order
> the OS does those two things. The firmware isn't going to get
> notified again until _OST so by then both operations will be done and
> system firmware will have no idea which order the OS did them in,
> nor will it care.  But since the existing normative text specifies
> an order, I would just follow that.

OK, this series clears DPC error status before clearing DPC Trigger
Status, so I think we can keep that as-is.

> > There are no events after the "clear device AER status" box.  That
> > seems to mean the OS can write the AER status registers at any
> > time.  But the whole implementation note assumes firmware
> > maintains control of AER.
> 
> In this model the OS doesn't own DPC or AER but the model allows OS
> to touch both DPC and AER registers at certain times.  I would view
> ownership in this case as who is the primary owner and not who is
> the sole entity allowed to access the registers.

I'm not sure how to translate the idea of primary ownership into code.

> For the normative text describing when OS clears the AER bits
> following the informative flow chart, it could say that OS clears
> AER as soon as possible after OST returns and before OS processes
> _HPX and loading drivers.  Open to other suggestions as well.

I'm not sure what to do with "as soon as possible" either.  That
doesn't seem like something firmware and the OS can agree on.

For the port that triggered DPC containment, I think the easiest thing
to understand and implement would be to allow AER access during the
same EDR processing window where DPC access is allowed.

For child devices of that port, obviously it's impossible to access
AER registers until DPC Trigger Status is cleared, and the flowchart
says the OS shouldn't access them until after _OST.
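
A rough sketch of that split, as a toy model in plain C. The enum and both
helpers are hypothetical names, not anything in the kernel or the ECN, and
the rule for child devices encodes only my reading of the flowchart. How
long after _OST the child access stays valid is the open question, and the
model does not try to answer it:

```c
#include <stdbool.h>

/* Hypothetical phases of EDR handling, in the order they occur. */
enum edr_phase {
	EDR_IDLE,            /* no EDR notification pending */
	EDR_WINDOW,          /* notify received, DPC Trigger Status still set */
	EDR_TRIGGER_CLEARED, /* Trigger Status cleared, _OST not yet evaluated */
	EDR_OST_DONE,        /* _OST has been evaluated */
};

/*
 * Port that triggered containment: assume AER status may be cleared
 * only in the same window where DPC register access is allowed.
 */
bool os_may_clear_port_aer(enum edr_phase phase)
{
	return phase == EDR_WINDOW;
}

/*
 * Devices below that port: unreachable while contained, and assumed
 * off limits until _OST has been evaluated.
 */
bool os_may_clear_child_aer(enum edr_phase phase)
{
	return phase == EDR_OST_DONE;
}
```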

I'm actually not sure we currently do *anything* with child device AER
info in the EDR path.  pcie_do_recovery() does walk the sub-hierarchy
of child devices, but it only calls error handling callbacks in the
child drivers; it doesn't do anything with the child AER registers
itself.  And of course, this happens before _OST, so it would be too
early in any case.  But maybe I'm missing something here.

BTW, if/when this is updated, I have another question: the _OSC DPC
control bit currently allows the OS to write DPC Control during that
window.  I understand the OS writing the RW1C *Status* bits to clear
them, but it seems like writing the DPC Control register is likely to
cause issues.  The same question would apply to the AER access we're
talking about.
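
For anyone less familiar with the distinction, a minimal sketch of RW1C
versus ordinary read-write semantics. The two helpers are hypothetical and
only model how the register contents change on a write; they are not kernel
accessors:

```c
#include <stdint.h>

/*
 * RW1C (write-1-to-clear): every bit written as 1 is cleared, bits
 * written as 0 are left untouched.  Returns the new register contents.
 */
uint32_t rw1c_write(uint32_t reg, uint32_t written)
{
	return reg & ~written;
}

/*
 * Ordinary RW: the written value simply replaces the old contents, so
 * a careless write can change configuration that firmware set up.
 */
uint32_t rw_write(uint32_t reg, uint32_t written)
{
	(void)reg; /* old contents are discarded */
	return written;
}
```

Writing back the value just read from an RW1C status register clears
exactly the bits that were set and cannot set anything new, which is why
clearing status is fairly safe while writing a control register is not.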

Bjorn


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 14:45             ` Bjorn Helgaas
@ 2020-03-11 15:19               ` Austin.Bolen
  2020-03-11 17:12                 ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Austin.Bolen @ 2020-03-11 15:19 UTC (permalink / raw)
  To: helgaas, Austin.Bolen
  Cc: sathyanarayanan.kuppuswamy, linux-pci, linux-kernel, ashok.raj

On 3/11/2020 9:46 AM, Bjorn Helgaas wrote:
> 
> On Tue, Mar 10, 2020 at 08:06:21PM +0000, Austin.Bolen@dell.com wrote:
>> On 3/10/2020 2:33 PM, Bjorn Helgaas wrote:
>>> On Tue, Mar 10, 2020 at 06:14:20PM +0000, Austin.Bolen@dell.com wrote:
>>>> On 3/9/2020 11:28 PM, Kuppuswamy, Sathyanarayanan wrote:
>>>>> On 3/9/2020 7:40 PM, Bjorn Helgaas wrote:
>>>>>> [+cc Austin, tentative Linux patches on this git branch:
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/tree/drivers/pci/pcie?h=review/edr]
>>>>>>
>>>>>> On Tue, Mar 03, 2020 at 06:36:32PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>>>>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>>>>>
>>>>>>> As per PCI firmware specification r3.2 System Firmware Intermediary
>>>>>>> (SFI) _OSC and DPC Updates ECR
>>>>>>> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
>>>>>>> Event Handling Implementation Note", page 10, Error Disconnect Recover
>>>>>>> (EDR) support allows OS to handle error recovery and clearing Error
>>>>>>> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
>>>>>>> which allows clearing AER registers without FF mode checks.
> 
>>>> OS clears the DPC Trigger Status bit which will bring port below it out
>>>> of containment. Then OS will clear the "port" error status bits (i.e.,
>>>> the AER and DPC status bits in the root port or downstream port that
>>>> triggered containment). I don't think it would hurt to do these two steps
>>>> in reverse order but don't think it is necessary.
> 
>>>> Note that error status bits for devices below the port in
>>>> containment are cleared later after f/w has a chance to log them.
> 
> Thanks for pointing out this wrinkle about devices below the port in
> containment.  I think we might have an issue here with the current
> series because evaluating _OST is the last thing the EDR notify
> handler does.  More below.
> 
>>> Maybe I'm misreading the DPC enhancements ECN.  I think it says the OS
>>> can read/write DPC registers until it clears the DPC Trigger Status.
>>> If the OS clears Trigger Status first, my understanding is that we're
>>> now out of the EDR notification processing window and the OS is not
>>> permitted to write DPC registers.
>>>
>>> If it's OK for the OS to clear Trigger Status before clearing DPC
>>> error status, what is the event that determines when the OS may no
>>> longer read/write the DPC registers?
>>
>> I think there are a few different registers to consider... DPC
>> Control, DPC Status, various AER registers, and the RP PIO
>> registers. At this point in the flow, the firmware has already had a
>> chance to read all of them and so it really doesn't matter the order
>> the OS does those two things. The firmware isn't going to get
>> notified again until _OST so by then both operations will be done and
>> system firmware will have no idea which order the OS did them in,
>> nor will it care.  But since the existing normative text specifies
>> an order, I would just follow that.
> 
> OK, this series clears DPC error status before clearing DPC Trigger
> Status, so I think we can keep that as-is.
> 
>>> There are no events after the "clear device AER status" box.  That
>>> seems to mean the OS can write the AER status registers at any
>>> time.  But the whole implementation note assumes firmware
>>> maintains control of AER.
>>
>> In this model the OS doesn't own DPC or AER but the model allows OS
>> to touch both DPC and AER registers at certain times.  I would view
>> ownership in this case as who is the primary owner and not who is
>> the sole entity allowed to access the registers.
> 
> I'm not sure how to translate the idea of primary ownership into code.

I would just add text that says when it's ok for the OS to touch these bits
even when it doesn't own them, similar to what's done for the DPC bits.

> 
>> For the normative text describing when OS clears the AER bits
>> following the informative flow chart, it could say that OS clears
>> AER as soon as possible after OST returns and before OS processes
>> _HPX and loading drivers.  Open to other suggestions as well.
> 
> I'm not sure what to do with "as soon as possible" either.  That
> doesn't seem like something firmware and the OS can agree on.
> 

I can just state that it's done after OST returns but before _HPX or 
driver is loaded. Any time in that range is fine. I can't get super 
specific here because different OSes do different things.  Even for a 
given OS they change over time. And I need something generic enough to 
support a wide variety of OS implementations.

> For the port that triggered DPC containment, I think the easiest thing
> to understand and implement would be to allow AER access during the
> same EDR processing window where DPC access is allowed.
Agreed.

> 
> For child devices of that port, obviously it's impossible to access
> AER registers until DPC Trigger Status is cleared, and the flowchart
> says the OS shouldn't access them until after _OST.
> 
> I'm actually not sure we currently do *anything* with child device AER
> info in the EDR path.  pcie_do_recovery() does walk the sub-hierarchy
> of child devices, but it only calls error handling callbacks in the
> child drivers; it doesn't do anything with the child AER registers
> itself.  And of course, this happens before _OST, so it would be too
> early in any case.  But maybe I'm missing something here.

My understanding is that the OS reads/clears AER in the case where OS has
native control of AER.  Feedback from OSVs is they wanted to continue to
do that to keep the native OS controlled AER and FF mechanism similar.
The other way we could have done it would be to have the firmware
read/clear AER and report them to OS via APEI.

> 
> BTW, if/when this is updated, I have another question: the _OSC DPC
> control bit currently allows the OS to write DPC Control during that
> window.  I understand the OS writing the RW1C *Status* bits to clear
> them, but it seems like writing the DPC Control register is likely to
> cause issues.  The same question would apply to the AER access we're
> talking about.

We could specify which particular bits can and can't be touched.  But 
it's hard to maintain as new bits are added.  Probably better to add 
some guidance that OS should read/clear error status, DPC Trigger 
Status, etc. but shouldn't change masks/severity/control bits/etc.
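
As a strawman for what that guidance might look like when reduced to code,
here is a small C sketch. The enum and the helper are hypothetical, and the
classification is only one possible reading of the proposal above, not spec
text:

```c
#include <stdbool.h>

/* Hypothetical register categories for the port in containment. */
enum reg_kind {
	REG_ERROR_STATUS,    /* RW1C error status bits (AER, DPC) */
	REG_TRIGGER_STATUS,  /* DPC Trigger Status */
	REG_MASK,            /* error mask registers */
	REG_SEVERITY,        /* error severity registers */
	REG_CONTROL,         /* control registers such as DPC Control */
};

/*
 * Assumed policy: during EDR processing the OS may read/clear status,
 * but leaves masks, severity, and control as firmware configured them.
 */
bool os_may_modify_during_edr(enum reg_kind kind)
{
	switch (kind) {
	case REG_ERROR_STATUS:
	case REG_TRIGGER_STATUS:
		return true;
	case REG_MASK:
	case REG_SEVERITY:
	case REG_CONTROL:
		return false;
	}
	return false; /* unknown kinds are treated as off limits */
}
```

Defaulting unknown kinds to "not allowed" is a deliberate choice here: new
registers added by later specs would stay firmware-owned until guidance
says otherwise.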

> 
> Bjorn
> 



* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 15:19               ` Austin.Bolen
@ 2020-03-11 17:12                 ` Bjorn Helgaas
  2020-03-11 17:27                   ` Austin.Bolen
  2020-03-11 18:12                   ` Kuppuswamy Sathyanarayanan
  0 siblings, 2 replies; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-11 17:12 UTC (permalink / raw)
  To: Austin.Bolen
  Cc: sathyanarayanan.kuppuswamy, linux-pci, linux-kernel, ashok.raj

On Wed, Mar 11, 2020 at 03:19:44PM +0000, Austin.Bolen@dell.com wrote:
> On 3/11/2020 9:46 AM, Bjorn Helgaas wrote:
> > On Tue, Mar 10, 2020 at 08:06:21PM +0000, Austin.Bolen@dell.com wrote:
> >> On 3/10/2020 2:33 PM, Bjorn Helgaas wrote:
> >>> On Tue, Mar 10, 2020 at 06:14:20PM +0000, Austin.Bolen@dell.com wrote:
> >>>> On 3/9/2020 11:28 PM, Kuppuswamy, Sathyanarayanan wrote:
> >>>>> On 3/9/2020 7:40 PM, Bjorn Helgaas wrote:
> >>>>>> [+cc Austin, tentative Linux patches on this git branch:
> >>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/tree/drivers/pci/pcie?h=review/edr]
> >>>>>>
> >>>>>> On Tue, Mar 03, 2020 at 06:36:32PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> >>>>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> >>>>>>>
> >>>>>>> As per PCI firmware specification r3.2 System Firmware Intermediary
> >>>>>>> (SFI) _OSC and DPC Updates ECR
> >>>>>>> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
> >>>>>>> Event Handling Implementation Note", page 10, Error Disconnect Recover
> >>>>>>> (EDR) support allows OS to handle error recovery and clearing Error
> >>>>>>> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
> >>>>>>> which allows clearing AER registers without FF mode checks.
> > 
> >>>> OS clears the DPC Trigger Status bit which will bring port below it out
> >>>> of containment. Then OS will clear the "port" error status bits (i.e.,
> >>>> the AER and DPC status bits in the root port or downstream port that
> >>>> triggered containment). I don't think it would hurt to do these two steps
> >>>> in reverse order but don't think it is necessary.
> > 
> >>>> Note that error status bits for devices below the port in
> >>>> containment are cleared later after f/w has a chance to log them.
> > 
> > Thanks for pointing out this wrinkle about devices below the port in
> > containment.  I think we might have an issue here with the current
> > series because evaluating _OST is the last thing the EDR notify
> > handler does.  More below.
> > 
> >>> Maybe I'm misreading the DPC enhancements ECN.  I think it says the OS
> >>> can read/write DPC registers until it clears the DPC Trigger Status.
> >>> If the OS clears Trigger Status first, my understanding is that we're
> >>> now out of the EDR notification processing window and the OS is not
> >>> permitted to write DPC registers.
> >>>
> >>> If it's OK for the OS to clear Trigger Status before clearing DPC
> >>> error status, what is the event that determines when the OS may no
> >>> longer read/write the DPC registers?
> >>
> >> I think there are a few different registers to consider... DPC
> >> Control, DPC Status, various AER registers, and the RP PIO
> >> registers. At this point in the flow, the firmware has already had a
> >> chance to read all of them and so it really doesn't matter the order
> >> the OS does those two things. The firmware isn't going to get
> >> notified again until _OST so by then both operations will be done and
> >> system firmware will have no idea which order the OS did them in,
> >> nor will it care.  But since the existing normative text specifies
> >> an order, I would just follow that.
> > 
> > OK, this series clears DPC error status before clearing DPC Trigger
> > Status, so I think we can keep that as-is.
> > 

> >>> There are no events after the "clear device AER status" box.
> >>> That seems to mean the OS can write the AER status registers at
> >>> any time.  But the whole implementation note assumes firmware
> >>> maintains control of AER.
> >>
> >> In this model the OS doesn't own DPC or AER but the model allows
> >> OS to touch both DPC and AER registers at certain times.  I would
> >> view ownership in this case as who is the primary owner and not
> >> who is the sole entity allowed to access the registers.
> > 
> > I'm not sure how to translate the idea of primary ownership into
> > code.
> 
> I would just add text that said when it's ok for OS to touch these
> bits even when they don't own them similar to what's done for the
> DPC bits.

I'm probably missing your intent, but that sounds like "the OS can
read/write AER bits whenever it wants, regardless of ownership."

That doesn't sound practical to me, and I don't think it's really
similar to DPC, where it's pretty clear that the OS can touch DPC bits
it doesn't own but only *during the EDR processing window*.

> >> For the normative text describing when OS clears the AER bits
> >> following the informative flow chart, it could say that OS clears
> >> AER as soon as possible after OST returns and before OS processes
> >> _HPX and loading drivers.  Open to other suggestions as well.
> > 
> > I'm not sure what to do with "as soon as possible" either.  That
> > doesn't seem like something firmware and the OS can agree on.
> 
> I can just state that it's done after OST returns but before _HPX or
> driver is loaded. Any time in that range is fine. I can't get super
> specific here because different OSes do different things.  Even for
> a given OS they change over time. And I need something generic
> enough to support a wide variety of OS implementations.

Yeah.  I don't know how to solve this.

Linux doesn't actually unload and reload drivers for the child devices
(Sathy, correct me if I'm wrong here) even though DPC containment
takes the link down and effectively unplugs and replugs the device.  I
would *like* to handle it like hotplug, but some higher-level software
doesn't deal well with things like storage devices disappearing and
reappearing.

Since Linux doesn't actually re-enumerate the child devices, it
wouldn't evaluate _HPX again.  It would probably be cleaner if it did,
but it's all tied up with the whole unplug/replug problem.

> > For child devices of that port, obviously it's impossible to
> > access AER registers until DPC Trigger Status is cleared, and the
> > flowchart says the OS shouldn't access them until after _OST.
> > 
> > I'm actually not sure we currently do *anything* with child device
> > AER info in the EDR path.  pcie_do_recovery() does walk the
> > sub-hierarchy of child devices, but it only calls error handling
> > callbacks in the child drivers; it doesn't do anything with the
> > child AER registers itself.  And of course, this happens before
> > _OST, so it would be too early in any case.  But maybe I'm missing
> > something here.
> 
> My understanding is that the OS read/clears AER in the case where OS
> has native control of AER.  Feedback from OSVs is they wanted to
> continue to do that to keep the native OS controlled AER and FF
> mechanism similar.  The other way we could have done it would be to
> have the firmware read/clear AER and report them to OS via APEI.

When Linux has native control of AER, it reads/clears AER status.
The flowchart is for the case where firmware has AER control, so I
guess Linux would not field AER interrupts and wouldn't expect to
read/clear AER status.  So I *guess* Linux would assume APEI?  But
that doesn't seem to be what the flowchart assumes.

> > BTW, if/when this is updated, I have another question: the _OSC
> > DPC control bit currently allows the OS to write DPC Control
> > during that window.  I understand the OS writing the RW1C *Status*
> > bits to clear them, but it seems like writing the DPC Control
> > register is likely to cause issues.  The same question would apply
> > to the AER access we're talking about.
> 
> We could specify which particular bits can and can't be touched.
> But it's hard to maintain as new bits are added.  Probably better to
> add some guidance that OS should read/clear error status, DPC
> Trigger Status, etc. but shouldn't change masks/severity/control
> bits/etc.

Yeah.  I didn't mean at the level of individual bits; I was thinking
more of status/log/etc vs control registers.  But maybe even that is
hard, I dunno.


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 17:12                 ` Bjorn Helgaas
@ 2020-03-11 17:27                   ` Austin.Bolen
  2020-03-11 20:33                     ` Bjorn Helgaas
  2020-03-11 18:12                   ` Kuppuswamy Sathyanarayanan
  1 sibling, 1 reply; 68+ messages in thread
From: Austin.Bolen @ 2020-03-11 17:27 UTC (permalink / raw)
  To: helgaas, Austin.Bolen
  Cc: sathyanarayanan.kuppuswamy, linux-pci, linux-kernel, ashok.raj

On 3/11/2020 12:12 PM, Bjorn Helgaas wrote:
> 
<SNIP>
> 
> I'm probably missing your intent, but that sounds like "the OS can
> read/write AER bits whenever it wants, regardless of ownership."
> 
> That doesn't sound practical to me, and I don't think it's really
> similar to DPC, where it's pretty clear that the OS can touch DPC bits
> it doesn't own but only *during the EDR processing window*.

Yes, by treating AER bits like DPC bits I meant I'd define the specific 
time windows when OS can touch the AER status bits similar to how it's 
done for DPC in the current ECN.

> 
>>>> For the normative text describing when OS clears the AER bits
>>>> following the informative flow chart, it could say that OS clears
>>>> AER as soon as possible after OST returns and before OS processes
>>>> _HPX and loading drivers.  Open to other suggestions as well.
>>>
>>> I'm not sure what to do with "as soon as possible" either.  That
>>> doesn't seem like something firmware and the OS can agree on.
>>
>> I can just state that it's done after OST returns but before _HPX or
>> driver is loaded. Any time in that range is fine. I can't get super
>> specific here because different OSes do different things.  Even for
>> a given OS they change over time. And I need something generic
>> enough to support a wide variety of OS implementations.
> 
> Yeah.  I don't know how to solve this.
> 
> Linux doesn't actually unload and reload drivers for the child devices
> (Sathy, correct me if I'm wrong here) even though DPC containment
> takes the link down and effectively unplugs and replugs the device.  I
> would *like* to handle it like hotplug, but some higher-level software
> doesn't deal well with things like storage devices disappearing and
> reappearing.
> 
> Since Linux doesn't actually re-enumerate the child devices, it
> wouldn't evaluate _HPX again.  It would probably be cleaner if it did,
> but it's all tied up with the whole unplug/replug problem.

DPC resets everything below it and so to get it back up and running it 
would mean that all buses and resources need to be assigned, _HPX 
evaluated, and drivers reloaded. If those things don't happen then the 
whole hierarchy below the port that triggered DPC will be inaccessible.

For higher level software not handling storage device disappearing due 
to hot-plug, they will have the same problem with DPC since DPC holds 
the port in the disabled state (and hence will be inaccessible). And 
once DPC is released the devices will be unconfigured and so still 
inaccessible to upper-level software.  A lot of upper-level storage 
software I've seen can already handle this gracefully.

> 
>>> For child devices of that port, obviously it's impossible to
>>> access AER registers until DPC Trigger Status is cleared, and the
>>> flowchart says the OS shouldn't access them until after _OST.
>>>
>>> I'm actually not sure we currently do *anything* with child device
>>> AER info in the EDR path.  pcie_do_recovery() does walk the
>>> sub-hierarchy of child devices, but it only calls error handling
>>> callbacks in the child drivers; it doesn't do anything with the
>>> child AER registers itself.  And of course, this happens before
>>> _OST, so it would be too early in any case.  But maybe I'm missing
>>> something here.
>>
>> My understanding is that the OS read/clears AER in the case where OS
>> has native control of AER.  Feedback from OSVs is they wanted to
>> continue to do that to keep the native OS controlled AER and FF
>> mechanism similar.  The other way we could have done it would be to
>> have the firmware read/clear AER and report them to OS via APEI.
> 
> When Linux has native control of AER, it reads/clears AER status.
> The flowchart is for the case where firmware has AER control, so I
> guess Linux would not field AER interrupts and wouldn't expect to
> read/clear AER status.  So I *guess* Linux would assume APEI?  But
> that doesn't seem to be what the flowchart assumes.

Correct on the flowchart.  The OSVs we talked with did not want to use 
APEI.  They wanted to read and clear AER themselves and hence the 
flowchart is written that way.

> 
>>> BTW, if/when this is updated, I have another question: the _OSC
>>> DPC control bit currently allows the OS to write DPC Control
>>> during that window.  I understand the OS writing the RW1C *Status*
>>> bits to clear them, but it seems like writing the DPC Control
>>> register is likely to cause issues.  The same question would apply
>>> to the AER access we're talking about.
>>
>> We could specify which particular bits can and can't be touched.
>> But it's hard to maintain as new bits are added.  Probably better to
>> add some guidance that OS should read/clear error status, DPC
>> Trigger Status, etc. but shouldn't change masks/severity/control
>> bits/etc.
> 
> Yeah.  I didn't mean at the level of individual bits; I was thinking
> more of status/log/etc vs control registers.  But maybe even that is
> hard, I dunno.
> 
I'll see if I can break it out by register.



* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 17:12                 ` Bjorn Helgaas
  2020-03-11 17:27                   ` Austin.Bolen
@ 2020-03-11 18:12                   ` Kuppuswamy Sathyanarayanan
  1 sibling, 0 replies; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-11 18:12 UTC (permalink / raw)
  To: Bjorn Helgaas, Austin.Bolen; +Cc: linux-pci, linux-kernel, ashok.raj


On 3/11/20 10:12 AM, Bjorn Helgaas wrote:
>
>> I can just state that it's done after OST returns but before _HPX or
>> driver is loaded. Any time in that range is fine. I can't get super
>> specific here because different OSes do different things.  Even for
>> a given OS they change over time. And I need something generic
>> enough to support a wide variety of OS implementations.
> Yeah.  I don't know how to solve this.
>
> Linux doesn't actually unload and reload drivers for the child devices
> (Sathy, correct me if I'm wrong here) even though DPC containment
> takes the link down and effectively unplugs and replugs the device.  I
> would *like* to handle it like hotplug, but some higher-level software
> doesn't deal well with things like storage devices disappearing and
> reappearing.
>
> Since Linux doesn't actually re-enumerate the child devices, it
> wouldn't evaluate _HPX again.  It would probably be cleaner if it did,
> but it's all tied up with the whole unplug/replug problem.
Yes, re-enumeration of child devices is handled by the hot-plug path.
AFAIK, with the current PCI driver design, it's very difficult to create
a dependency between the current DPC handler and the hot-plug device
enumeration handler.

>
>>> For child devices of that port, obviously it's impossible to
>>> access AER registers until DPC Trigger Status is cleared, and the
>>> flowchart says the OS shouldn't access them until after _OST.
>>>
>>> I'm actually not sure we currently do *anything* with child device
>>> AER info in the EDR path.  pcie_do_recovery() does walk the
>>> sub-hierarchy of child devices, but it only calls error handling
>>> callbacks in the child drivers; it doesn't do anything with the
>>> child AER registers itself.  And of course, this happens before
>>> _OST, so it would be too early in any case.  But maybe I'm missing
>>> something here.
>> My understanding is that the OS read/clears AER in the case where OS
>> has native control of AER.  Feedback from OSVs is they wanted to
>> continue to do that to keep the native OS controlled AER and FF
>> mechanism similar.  The other way we could have done it would be to
>> have the firmware read/clear AER and report them to OS via APEI.
> When Linux has native control of AER, it reads/clears AER status.
> The flowchart is for the case where firmware has AER control, so I
> guess Linux would not field AER interrupts and wouldn't expect to
> read/clear AER status.  So I *guess* Linux would assume APEI?  But
> that doesn't seem to be what the flowchart assumes.
Yes. In the EDR case, given our current Linux driver design, it will be
very difficult to implement the "clear the AER status of child devices"
part of the flowchart without some spec changes. This is why I did not
implement that part in the current patch set.

I think instead of depending on DPC Trigger Status to end the EDR
notification window, we should depend on some sort of handshake
between OS and firmware (maybe some changes to _OST arg1 0:15, and
use _OST for it). That change would give us a window to clear the
AER registers properly.
>
>>> BTW, if/when this is updated, I have another question: the _OSC
>>> DPC control bit currently allows the OS to write DPC Control
>>> during that window.  I understand the OS writing the RW1C *Status*
>>> bits to clear them, but it seems like writing the DPC Control
>>> register is likely to cause issues.  The same question would apply
>>> to the AER access we're talking about.
>> We could specify which particular bits can and can't be touched.
>> But it's hard to maintain as new bits are added.  Probably better to
>> add some guidance that OS should read/clear error status, DPC
>> Trigger Status, etc. but shouldn't change masks/severity/control
>> bits/etc.
> Yeah.  I didn't mean at the level of individual bits; I was thinking
> more of status/log/etc vs control registers.  But maybe even that is
> hard, I dunno.

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer



* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 17:27                   ` Austin.Bolen
@ 2020-03-11 20:33                     ` Bjorn Helgaas
  2020-03-11 21:25                       ` Kuppuswamy Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-11 20:33 UTC (permalink / raw)
  To: Austin.Bolen
  Cc: sathyanarayanan.kuppuswamy, linux-pci, linux-kernel, ashok.raj

On Wed, Mar 11, 2020 at 05:27:35PM +0000, Austin.Bolen@dell.com wrote:
> On 3/11/2020 12:12 PM, Bjorn Helgaas wrote:
> > 
> <SNIP>
> > 
> > I'm probably missing your intent, but that sounds like "the OS can
> > read/write AER bits whenever it wants, regardless of ownership."
> > 
> > That doesn't sound practical to me, and I don't think it's really
> > similar to DPC, where it's pretty clear that the OS can touch DPC bits
> > it doesn't own but only *during the EDR processing window*.
> 
> Yes, by treating AER bits like DPC bits I meant I'd define the specific 
> time windows when OS can touch the AER status bits similar to how it's 
> done for DPC in the current ECN.

Makes sense, thanks.

> >>>> For the normative text describing when OS clears the AER bits
> >>>> following the informative flow chart, it could say that OS clears
> >>>> AER as soon as possible after OST returns and before OS processes
> >>>> _HPX and loading drivers.  Open to other suggestions as well.
> >>>
> >>> I'm not sure what to do with "as soon as possible" either.  That
> >>> doesn't seem like something firmware and the OS can agree on.
> >>
> >> I can just state that it's done after OST returns but before _HPX or
> >> driver is loaded. Any time in that range is fine. I can't get super
> >> specific here because different OSes do different things.  Even for
> >> a given OS they change over time. And I need something generic
> >> enough to support a wide variety of OS implementations.
> > 
> > Yeah.  I don't know how to solve this.
> > 
> > Linux doesn't actually unload and reload drivers for the child devices
> > (Sathy, correct me if I'm wrong here) even though DPC containment
> > takes the link down and effectively unplugs and replugs the device.  I
> > would *like* to handle it like hotplug, but some higher-level software
> > doesn't deal well with things like storage devices disappearing and
> > reappearing.
> > 
> > Since Linux doesn't actually re-enumerate the child devices, it
> > wouldn't evaluate _HPX again.  It would probably be cleaner if it did,
> > but it's all tied up with the whole unplug/replug problem.
> 
> DPC resets everything below it and so to get it back up and running it 
> would mean that all buses and resources need to be assigned, _HPX 
> evaluated, and drivers reloaded. If those things don't happen then the 
> whole hierarchy below the port that triggered DPC will be inaccessible.

Hmm, I think I might be confusing this with another situation.  Sathy,
can you help me understand this?  I don't have a way to actually
exercise this EDR path.  Is there some way the pciehp hotplug driver
gets involved here?

Here's how this seems to work as far as I can tell:

  - Linux does not have DPC or AER control

  - Linux installs EDR notify handler

  - Linux evaluates DPC Enable _DSM

  - DPC containment event occurs

  - Firmware fields DPC interrupt

  - DPC event is not a surprise remove

  - Firmware sends EDR notification

  - Linux EDR notify handler evaluates Locate _DSM

  - Linux reads and logs DPC and AER error information for port in
    containment mode.  [If it was an RP PIO error, Linux clears RP PIO
    error status, which is an asymmetry with the non-RP PIO path.]

  - Linux clears AER error status (pci_aer_raw_clear_status())

  - Linux calls driver .error_detected() methods for all child devices
    of the port in containment mode (pcie_do_recovery()).  These
    devices are inaccessible because the link is down.

  - Linux clears DPC Trigger Status (dpc_reset_link() from
    pcie_do_recovery()).

  - Linux calls driver .mmio_enabled() methods for all child devices.

This is where I get lost.  These child devices are now accessible, but
they've been reset, so I don't know how their config space got
restored.  Did pciehp enumerate them?  Did we do something like
pci_restore_state()?  I don't see where either of these happens.
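
The gap described above can be pictured with a toy model. This is only a
sketch under stated assumptions: every name below is hypothetical and none
of it is kernel code. It just separates "the link is back up" from "the
child's config space has been reprogrammed", which are the two conditions a
driver's .mmio_enabled() callback would both need.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not kernel APIs. */
struct toy_child {
	bool link_up;          /* can config/MMIO cycles reach the device? */
	bool config_restored;  /* have BARs, bus numbers, etc. been reprogrammed? */
};

/* DPC containment: the link goes down and the child is reset. */
static void toy_dpc_trigger(struct toy_child *c)
{
	c->link_up = false;
	c->config_restored = false;
}

/* Clearing DPC Trigger Status only releases containment (link comes back). */
static void toy_clear_trigger_status(struct toy_child *c)
{
	c->link_up = true;
}

/* Stand-in for whatever reprograms the child (a rescan, a state restore, ...). */
static void toy_restore_config(struct toy_child *c)
{
	assert(c->link_up);    /* cannot reprogram a device behind a down link */
	c->config_restored = true;
}

/* The child is only usable once both conditions hold. */
static bool toy_child_usable(const struct toy_child *c)
{
	return c->link_up && c->config_restored;
}
```

In this model, calling toy_clear_trigger_status() alone leaves
toy_child_usable() false; some separate step has to play the role of
toy_restore_config(), and the question in this message is which code path,
if any, does that.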

> For higher level software not handling storage device disappearing due 
> to hot-plug, they will have the same problem with DPC since DPC holds 
> the port in the disabled state (and hence will be inaccessible). And 
> once DPC is released the devices will be unconfigured and so still 
> inaccessible to upper-level software.  A lot of upper-level storage 
> software I've seen can already handle this gracefully.
> 
> >>> For child devices of that port, obviously it's impossible to
> >>> access AER registers until DPC Trigger Status is cleared, and the
> >>> flowchart says the OS shouldn't access them until after _OST.
> >>>
> >>> I'm actually not sure we currently do *anything* with child device
> >>> AER info in the EDR path.  pcie_do_recovery() does walk the
> >>> sub-hierarchy of child devices, but it only calls error handling
> >>> callbacks in the child drivers; it doesn't do anything with the
> >>> child AER registers itself.  And of course, this happens before
> >>> _OST, so it would be too early in any case.  But maybe I'm missing
> >>> something here.
> >>
> >> My understanding is that the OS read/clears AER in the case where OS
> >> has native control of AER.  Feedback from OSVs is they wanted to
> >> continue to do that to keep the native OS controlled AER and FF
> >> mechanism similar.  The other way we could have done it would be to
> >> have the firmware read/clear AER and report them to OS via APEI.
> > 
> > When Linux has native control of AER, it reads/clears AER status.
> > The flowchart is for the case where firmware has AER control, so I
> > guess Linux would not field AER interrupts and wouldn't expect to
> > read/clear AER status.  So I *guess* Linux would assume APEI?  But
> > that doesn't seem to be what the flowchart assumes.
> 
> Correct on the flowchart.  The OSVs we talked with did not want to use 
> APEI.  They wanted to read and clear AER themselves and hence the 
> flowchart is written that way.

So they want to basically do native AER handling even though firmware
owns AER?  My head hurts.

Bjorn


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 20:33                     ` Bjorn Helgaas
@ 2020-03-11 21:25                       ` Kuppuswamy Sathyanarayanan
  2020-03-11 21:53                         ` Austin.Bolen
  2020-03-11 22:13                         ` Bjorn Helgaas
  0 siblings, 2 replies; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-11 21:25 UTC (permalink / raw)
  To: Bjorn Helgaas, Austin.Bolen; +Cc: linux-pci, linux-kernel, ashok.raj

Hi,

On 3/11/20 1:33 PM, Bjorn Helgaas wrote:
> On Wed, Mar 11, 2020 at 05:27:35PM +0000, Austin.Bolen@dell.com wrote:
>> On 3/11/2020 12:12 PM, Bjorn Helgaas wrote:
>> <SNIP>
>>> I'm probably missing your intent, but that sounds like "the OS can
>>> read/write AER bits whenever it wants, regardless of ownership."
>>>
>>> That doesn't sound practical to me, and I don't think it's really
>>> similar to DPC, where it's pretty clear that the OS can touch DPC bits
>>> it doesn't own but only *during the EDR processing window*.
>> Yes, by treating AER bits like DPC bits I meant I'd define the specific
>> time windows when OS can touch the AER status bits similar to how it's
>> done for DPC in the current ECN.
> Makes sense, thanks.
>
>>>>>> For the normative text describing when OS clears the AER bits
>>>>>> following the informative flow chart, it could say that OS clears
>>>>>> AER as soon as possible after OST returns and before OS processes
>>>>>> _HPX and loading drivers.  Open to other suggestions as well.
>>>>> I'm not sure what to do with "as soon as possible" either.  That
>>>>> doesn't seem like something firmware and the OS can agree on.
>>>> I can just state that it's done after OST returns but before _HPX or
>>>> driver is loaded. Any time in that range is fine. I can't get super
>>>> specific here because different OSes do different things.  Even for
>>>> a given OS they change over time. And I need something generic
>>>> enough to support a wide variety of OS implementations.
>>> Yeah.  I don't know how to solve this.
>>>
>>> Linux doesn't actually unload and reload drivers for the child devices
>>> (Sathy, correct me if I'm wrong here) even though DPC containment
>>> takes the link down and effectively unplugs and replugs the device.  I
>>> would *like* to handle it like hotplug, but some higher-level software
>>> doesn't deal well with things like storage devices disappearing and
>>> reappearing.
>>>
>>> Since Linux doesn't actually re-enumerate the child devices, it
>>> wouldn't evaluate _HPX again.  It would probably be cleaner if it did,
>>> but it's all tied up with the whole unplug/replug problem.
>> DPC resets everything below it and so to get it back up and running it
>> would mean that all buses and resources need to be assigned, _HPX
>> evaluated, and drivers reloaded. If those things don't happen then the
>> whole hierarchy below the port that triggered DPC will be inaccessible.
> Hmm, I think I might be confusing this with another situation.  Sathy,
> can you help me understand this?  I don't have a way to actually
> exercise this EDR path.  Is there some way the pciehp hotplug driver
> gets involved here?
>
> Here's how this seems to work as far as I can tell:
>
>    - Linux does not have DPC or AER control
>
>    - Linux installs EDR notify handler
>
>    - Linux evaluates DPC Enable _DSM
>
>    - DPC containment event occurs
>
>    - Firmware fields DPC interrupt
>
>    - DPC event is not a surprise remove
>
>    - Firmware sends EDR notification
>
>    - Linux EDR notify handler evaluates Locate _DSM
>
>    - Linux reads and logs DPC and AER error information for port in
>      containment mode.  [If it was an RP PIO error, Linux clears RP PIO
>      error status, which is an asymmetry with the non-RP PIO path.]
>
>    - Linux clears AER error status (pci_aer_raw_clear_status())
>
>    - Linux calls driver .error_detected() methods for all child devices
>      of the port in containment mode (pcie_do_recovery()).  These
>      devices are inaccessible because the link is down.
>
>    - Linux clears DPC Trigger Status (dpc_reset_link() from
>      pcie_do_recovery()).
>
>    - Linux calls driver .mmio_enabled() methods for all child devices.
>
> This is where I get lost.  These child devices are now accessible, but
> they've been reset, so I don't know how their config space got
> restored.  Did pciehp enumerate them?  Did we do something like
> pci_restore_state()?  I don't see where either of these happens.
AFAIK, AER error status registers are sticky (RW1CS) and hence
will be preserved during reset.
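
As a rough illustration of what "sticky, write-1-to-clear" means here, a
toy register model might look like the following. The struct and function
names are hypothetical, and the sketch assumes only a hot reset (not loss
of power); it is not how the kernel accesses these registers.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register model; the struct and function names are hypothetical. */
struct toy_reg {
	uint32_t value;
	uint32_t sticky_mask;  /* bits that survive a hot reset */
};

/* Hardware latches an error: the status bit is set. */
static void toy_hw_set(struct toy_reg *r, uint32_t bits)
{
	r->value |= bits;
}

/* RW1C write: each 1 written clears that bit, each 0 leaves it alone. */
static void toy_write_rw1c(struct toy_reg *r, uint32_t val)
{
	r->value &= ~val;
}

/* Hot reset: non-sticky bits return to 0, sticky bits are untouched. */
static void toy_hot_reset(struct toy_reg *r)
{
	r->value &= r->sticky_mask;
}

/* Usage: an error latched before the reset is still readable afterwards. */
static uint32_t toy_demo(void)
{
	struct toy_reg status = { .value = 0, .sticky_mask = 0x1 };

	toy_hw_set(&status, 0x1);
	toy_hot_reset(&status);
	assert(status.value == 0x1);   /* sticky bit survived the reset */
	toy_write_rw1c(&status, 0x1);  /* OS clears it by writing 1 */
	return status.value;
}
```

Note that this only covers the status bits; it says nothing about the
non-sticky parts of config space, which is the restore question asked above.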
>
> So they want to basically do native AER handling even though firmware
> owns AER?  My head hurts.
No, it's meant only for clearing AER registers. In the EDR path, since
the OS owns clearing DPC registers, they want to let the OS own clearing
AER registers as well. Also, it would give the OS a chance to decide
whether to keep the device on, based on the error status and history of
the attached device.
>
> Bjorn

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer



* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 21:25                       ` Kuppuswamy Sathyanarayanan
@ 2020-03-11 21:53                         ` Austin.Bolen
  2020-03-11 22:11                           ` Kuppuswamy Sathyanarayanan
  2020-03-11 22:13                         ` Bjorn Helgaas
  1 sibling, 1 reply; 68+ messages in thread
From: Austin.Bolen @ 2020-03-11 21:53 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy, helgaas, Austin.Bolen
  Cc: linux-pci, linux-kernel, ashok.raj

On 3/11/2020 4:27 PM, Kuppuswamy Sathyanarayanan wrote:
> 
> Hi,
> 
> On 3/11/20 1:33 PM, Bjorn Helgaas wrote:
>> On Wed, Mar 11, 2020 at 05:27:35PM +0000, Austin.Bolen@dell.com wrote:
>>> On 3/11/2020 12:12 PM, Bjorn Helgaas wrote:
>>> <SNIP>
>>>> I'm probably missing your intent, but that sounds like "the OS can
>>>> read/write AER bits whenever it wants, regardless of ownership."
>>>>
>>>> That doesn't sound practical to me, and I don't think it's really
>>>> similar to DPC, where it's pretty clear that the OS can touch DPC bits
>>>> it doesn't own but only *during the EDR processing window*.
>>> Yes, by treating AER bits like DPC bits I meant I'd define the specific
>>> time windows when OS can touch the AER status bits similar to how it's
>>> done for DPC in the current ECN.
>> Makes sense, thanks.
>>
>>>>>>> For the normative text describing when OS clears the AER bits
>>>>>>> following the informative flow chart, it could say that OS clears
>>>>>>> AER as soon as possible after OST returns and before OS processes
>>>>>>> _HPX and loading drivers.  Open to other suggestions as well.
>>>>>> I'm not sure what to do with "as soon as possible" either.  That
>>>>>> doesn't seem like something firmware and the OS can agree on.
>>>>> I can just state that it's done after OST returns but before _HPX or
>>>>> driver is loaded. Any time in that range is fine. I can't get super
>>>>> specific here because different OSes do different things.  Even for
>>>>> a given OS they change over time. And I need something generic
>>>>> enough to support a wide variety of OS implementations.
>>>> Yeah.  I don't know how to solve this.
>>>>
>>>> Linux doesn't actually unload and reload drivers for the child devices
>>>> (Sathy, correct me if I'm wrong here) even though DPC containment
>>>> takes the link down and effectively unplugs and replugs the device.  I
>>>> would *like* to handle it like hotplug, but some higher-level software
>>>> doesn't deal well with things like storage devices disappearing and
>>>> reappearing.
>>>>
>>>> Since Linux doesn't actually re-enumerate the child devices, it
>>>> wouldn't evaluate _HPX again.  It would probably be cleaner if it did,
>>>> but it's all tied up with the whole unplug/replug problem.
>>> DPC resets everything below it and so to get it back up and running it
>>> would mean that all buses and resources need to be assigned, _HPX
>>> evaluated, and drivers reloaded. If those things don't happen then the
>>> whole hierarchy below the port that triggered DPC will be inaccessible.
>> Hmm, I think I might be confusing this with another situation.  Sathy,
>> can you help me understand this?  I don't have a way to actually
>> exercise this EDR path.  Is there some way the pciehp hotplug driver
>> gets involved here?

If the port has hot-plug enabled then DPC trigger will cause the link to 
go down (disabled state) and will generate a DLLSC hot-plug interrupt. 
When DPC is released, the link will become active and generate another 
DLLSC hot-plug interrupt.
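
A minimal sketch of that behavior, assuming a port that raises one DLLSC
notification per link state transition when hot-plug is enabled. All names
are hypothetical and do not correspond to pciehp or the DPC driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model; names are hypothetical and do not match any kernel driver. */
struct toy_port {
	bool hotplug_capable;  /* slot reports link state changes */
	bool link_active;
	int dllsc_events;      /* Data Link Layer State Changed notifications */
};

/* Change link state; a hot-plug capable port raises DLLSC on each transition. */
static void toy_set_link(struct toy_port *p, bool active)
{
	if (p->link_active == active)
		return;
	p->link_active = active;
	if (p->hotplug_capable)
		p->dllsc_events++;
}

/* DPC trigger forces the link down; release lets it become active again. */
static void toy_port_dpc_trigger(struct toy_port *p) { toy_set_link(p, false); }
static void toy_port_dpc_release(struct toy_port *p) { toy_set_link(p, true); }

/* Count DLLSC events across one containment/release cycle. */
static int toy_cycle_events(bool hotplug_capable)
{
	struct toy_port p = { .hotplug_capable = hotplug_capable, .link_active = true };

	toy_port_dpc_trigger(&p);
	assert(!p.link_active);
	toy_port_dpc_release(&p);
	return p.dllsc_events;
}
```

In this model a hot-plug capable port sees two events per cycle and a
non-hot-plug port sees none, so nothing would prompt re-enumeration there.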

>>
>> Here's how this seems to work as far as I can tell:
>>
>>     - Linux does not have DPC or AER control
>>
>>     - Linux installs EDR notify handler
>>
>>     - Linux evaluates DPC Enable _DSM
>>
>>     - DPC containment event occurs
>>
>>     - Firmware fields DPC interrupt
>>
>>     - DPC event is not a surprise remove
>>
>>     - Firmware sends EDR notification
>>
>>     - Linux EDR notify handler evaluates Locate _DSM
>>
>>     - Linux reads and logs DPC and AER error information for port in
>>       containment mode.  [If it was an RP PIO error, Linux clears RP PIO
>>       error status, which is an asymmetry with the non-RP PIO path.]
>>
>>     - Linux clears AER error status (pci_aer_raw_clear_status())
>>
>>     - Linux calls driver .error_detected() methods for all child devices
>>       of the port in containment mode (pcie_do_recovery()).  These
>>       devices are inaccessible because the link is down.
>>
>>     - Linux clears DPC Trigger Status (dpc_reset_link() from
>>       pcie_do_recovery()).
>>
>>     - Linux calls driver .mmio_enabled() methods for all child devices.
>>
>> This is where I get lost.  These child devices are now accessible, but
>> they've been reset, so I don't know how their config space got
>> restored.  Did pciehp enumerate them?  Did we do something like
>> pci_restore_state()?  I don't see where either of these happens.
> AFAIK, AER error status registers  are sticky (RW1CS) and hence
> will be preserved during reset.

In our testing, the device directly connected to the port that was 
contained does get reprogrammed and the driver is reloaded.  These are 
hot-plug slots and so might be due to DLLSC hot-plug interrupt when 
containment is released and link goes back to active state.

However, if a switch is connected to the port where DPC was triggered 
then we do not see the whole switch hierarchy being re-enumerated.

Also, DPC could be enabled on non-hot-plug slots so can't always rely on 
hot-plug to re-init devices in the recovery path.

>>
>> So they want to basically do native AER handling even though firmware
>> owns AER?  My head hurts.
> No, Its meant only for clearing AER registers. In EDR path, since
> OS owns clearing DPC registers, they want to let OS own clearing AER
> registers as well. Also,  it would give OS a chance to decide whether
> we want to keep the device on based on error status and history of the
> device attached.

Right.  The way it was pitched to me was that the OSVs wanted to 
read/clear the error status bits so they could re-use the code that does 
that when OS natively owns AER/DPC.

>>
>> Bjorn
> 



* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-10 20:06           ` Austin.Bolen
  2020-03-10 20:41             ` Kuppuswamy Sathyanarayanan
  2020-03-11 14:45             ` Bjorn Helgaas
@ 2020-03-11 22:05             ` Bjorn Helgaas
  2 siblings, 0 replies; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-11 22:05 UTC (permalink / raw)
  To: Austin.Bolen
  Cc: sathyanarayanan.kuppuswamy, linux-pci, linux-kernel, ashok.raj

On Tue, Mar 10, 2020 at 08:06:21PM +0000, Austin.Bolen@dell.com wrote:
> On 3/10/2020 2:33 PM, Bjorn Helgaas wrote:

> > If that's possible, it would be nice to update the metadata for the
> > "Downstream Port Containment related Enhancements" ECN as well.  That
> > one currently says "ECR - CardBus Header Proposal", which means that's
> > what's in the window title bar and icons in the panel.
> 
> Sure, I'll check.

FWIW, the PCI Firmware Specification, Rev 3.2, dated "Final - Jan 28,
2019" also has the same metadata problem.


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 21:53                         ` Austin.Bolen
@ 2020-03-11 22:11                           ` Kuppuswamy Sathyanarayanan
  2020-03-11 22:23                             ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-11 22:11 UTC (permalink / raw)
  To: Austin.Bolen, helgaas; +Cc: linux-pci, linux-kernel, ashok.raj


On 3/11/20 2:53 PM, Austin.Bolen@dell.com wrote:
> On 3/11/2020 4:27 PM, Kuppuswamy Sathyanarayanan wrote:
>> Hi,
>>
>> On 3/11/20 1:33 PM, Bjorn Helgaas wrote:
>>> On Wed, Mar 11, 2020 at 05:27:35PM +0000, Austin.Bolen@dell.com wrote:
>>>> On 3/11/2020 12:12 PM, Bjorn Helgaas wrote:
>>>> <SNIP>
>>>>> I'm probably missing your intent, but that sounds like "the OS can
>>>>> read/write AER bits whenever it wants, regardless of ownership."
>>>>>
>>>>> That doesn't sound practical to me, and I don't think it's really
>>>>> similar to DPC, where it's pretty clear that the OS can touch DPC bits
>>>>> it doesn't own but only *during the EDR processing window*.
>>>> Yes, by treating AER bits like DPC bits I meant I'd define the specific
>>>> time windows when OS can touch the AER status bits similar to how it's
>>>> done for DPC in the current ECN.
>>> Makes sense, thanks.
>>>
>>>>>>>> For the normative text describing when OS clears the AER bits
>>>>>>>> following the informative flow chart, it could say that OS clears
>>>>>>>> AER as soon as possible after OST returns and before OS processes
>>>>>>>> _HPX and loading drivers.  Open to other suggestions as well.
>>>>>>> I'm not sure what to do with "as soon as possible" either.  That
>>>>>>> doesn't seem like something firmware and the OS can agree on.
>>>>>> I can just state that it's done after OST returns but before _HPX or
>>>>>> driver is loaded. Any time in that range is fine. I can't get super
>>>>>> specific here because different OSes do different things.  Even for
>>>>>> a given OS they change over time. And I need something generic
>>>>>> enough to support a wide variety of OS implementations.
>>>>> Yeah.  I don't know how to solve this.
>>>>>
>>>>> Linux doesn't actually unload and reload drivers for the child devices
>>>>> (Sathy, correct me if I'm wrong here) even though DPC containment
>>>>> takes the link down and effectively unplugs and replugs the device.  I
>>>>> would *like* to handle it like hotplug, but some higher-level software
>>>>> doesn't deal well with things like storage devices disappearing and
>>>>> reappearing.
>>>>>
>>>>> Since Linux doesn't actually re-enumerate the child devices, it
>>>>> wouldn't evaluate _HPX again.  It would probably be cleaner if it did,
>>>>> but it's all tied up with the whole unplug/replug problem.
>>>> DPC resets everything below it and so to get it back up and running it
>>>> would mean that all buses and resources need to be assigned, _HPX
>>>> evaluated, and drivers reloaded. If those things don't happen then the
>>>> whole hierarchy below the port that triggered DPC will be inaccessible.
>>> Hmm, I think I might be confusing this with another situation.  Sathy,
>>> can you help me understand this?  I don't have a way to actually
>>> exercise this EDR path.  Is there some way the pciehp hotplug driver
>>> gets involved here?
> If the port has hot-plug enabled then DPC trigger will cause the link to
> go down (disabled state) and will generate a DLLSC hot-plug interrupt.
> When DPC is released, the link will become active and generate another
> DLLSC hot-plug interrupt.
Yes, device/driver enumeration and removal will be triggered by the
DLLSC interrupt in the pciehp driver.
>
>>> Here's how this seems to work as far as I can tell:
>>>
>>>      - Linux does not have DPC or AER control
>>>
>>>      - Linux installs EDR notify handler
>>>
>>>      - Linux evaluates DPC Enable _DSM
>>>
>>>      - DPC containment event occurs
>>>
>>>      - Firmware fields DPC interrupt
>>>
>>>      - DPC event is not a surprise remove
>>>
>>>      - Firmware sends EDR notification
>>>
>>>      - Linux EDR notify handler evaluates Locate _DSM
>>>
>>>      - Linux reads and logs DPC and AER error information for port in
>>>        containment mode.  [If it was an RP PIO error, Linux clears RP PIO
>>>        error status, which is an asymmetry with the non-RP PIO path.]
>>>
>>>      - Linux clears AER error status (pci_aer_raw_clear_status())
>>>
>>>      - Linux calls driver .error_detected() methods for all child devices
>>>        of the port in containment mode (pcie_do_recovery()).  These
>>>        devices are inaccessible because the link is down.
>>>
>>>      - Linux clears DPC Trigger Status (dpc_reset_link() from
>>>        pcie_do_recovery()).
>>>
>>>      - Linux calls driver .mmio_enabled() methods for all child devices.
>>>
>>> This is where I get lost.  These child devices are now accessible, but
>>> they've been reset, so I don't know how their config space got
>>> restored.  Did pciehp enumerate them?  Did we do something like
>>> pci_restore_state()?  I don't see where either of these happens.
>> AFAIK, AER error status registers  are sticky (RW1CS) and hence
>> will be preserved during reset.
> In our testing, the device directly connected to the port that was
> contained does get reprogrammed and the driver is reloaded.  These are
> hot-plug slots and so might be due to DLLSC hot-plug interrupt when
> containment is released and link goes back to active state.
>
> However, if a switch is connected to the port where DPC was triggered
> then we do not see the whole switch hierarchy being re-enumerated.
Now that I have hardware to verify this scenario, I will look into
it. I suspect there is a transient state in the link status that causes
this disconnect issue. But I think this issue is not related to
EDR support and hence should be reproducible with native handling
as well.
>
> Also, DPC could be enabled on non-hot-plug slots so can't always rely on
> hot-plug to re-init devices in the recovery path.
If hotplug is not supported, there is support for enumerating
devices via polling or ACPI events. But note that the enumeration
path is independent of the error handler path, and hence there is no
explicit trigger or event from the error handler path to kick-start
enumeration.
>
>>> So they want to basically do native AER handling even though firmware
>>> owns AER?  My head hurts.
>> No, Its meant only for clearing AER registers. In EDR path, since
>> OS owns clearing DPC registers, they want to let OS own clearing AER
>> registers as well. Also,  it would give OS a chance to decide whether
>> we want to keep the device on based on error status and history of the
>> device attached.
> Right.  The way it was pitched to me was that the OSVs wanted to
> read/clear the error status bits so they could re-use the code that does
> that when OS natively owns AER/DPC.
>
>>> Bjorn

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer



* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 21:25                       ` Kuppuswamy Sathyanarayanan
  2020-03-11 21:53                         ` Austin.Bolen
@ 2020-03-11 22:13                         ` Bjorn Helgaas
  2020-03-11 22:41                           ` Kuppuswamy Sathyanarayanan
  1 sibling, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-11 22:13 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

On Wed, Mar 11, 2020 at 02:25:18PM -0700, Kuppuswamy Sathyanarayanan wrote:
> On 3/11/20 1:33 PM, Bjorn Helgaas wrote:
> > On Wed, Mar 11, 2020 at 05:27:35PM +0000, Austin.Bolen@dell.com wrote:
> > > On 3/11/2020 12:12 PM, Bjorn Helgaas wrote:
> > > <SNIP>
> > > > > > > For the normative text describing when OS clears the AER bits
> > > > > > > following the informative flow chart, it could say that OS clears
> > > > > > > AER as soon as possible after OST returns and before OS processes
> > > > > > > _HPX and loading drivers.  Open to other suggestions as well.
> > > > > > I'm not sure what to do with "as soon as possible" either.  That
> > > > > > doesn't seem like something firmware and the OS can agree on.
> > > > > I can just state that it's done after OST returns but before _HPX or
> > > > > driver is loaded. Any time in that range is fine. I can't get super
> > > > > specific here because different OSes do different things.  Even for
> > > > > a given OS they change over time. And I need something generic
> > > > > enough to support a wide variety of OS implementations.
> > > > Yeah.  I don't know how to solve this.
> > > > 
> > > > Linux doesn't actually unload and reload drivers for the child devices
> > > > (Sathy, correct me if I'm wrong here) even though DPC containment
> > > > takes the link down and effectively unplugs and replugs the device.  I
> > > > would *like* to handle it like hotplug, but some higher-level software
> > > > doesn't deal well with things like storage devices disappearing and
> > > > reappearing.
> > > > 
> > > > Since Linux doesn't actually re-enumerate the child devices, it
> > > > wouldn't evaluate _HPX again.  It would probably be cleaner if it did,
> > > > but it's all tied up with the whole unplug/replug problem.

> > > DPC resets everything below it and so to get it back up and running it
> > > would mean that all buses and resources need to be assigned, _HPX
> > > evaluated, and drivers reloaded. If those things don't happen then the
> > > whole hierarchy below the port that triggered DPC will be inaccessible.

> > Hmm, I think I might be confusing this with another situation.  Sathy,
> > can you help me understand this?  I don't have a way to actually
> > exercise this EDR path.  Is there some way the pciehp hotplug driver
> > gets involved here?
> > 
> > Here's how this seems to work as far as I can tell:
> > 
> >    - Linux does not have DPC or AER control
> > 
> >    - Linux installs EDR notify handler
> > 
> >    - Linux evaluates DPC Enable _DSM
> > 
> >    - DPC containment event occurs
> > 
> >    - Firmware fields DPC interrupt
> > 
> >    - DPC event is not a surprise remove
> > 
> >    - Firmware sends EDR notification
> > 
> >    - Linux EDR notify handler evaluates Locate _DSM
> > 
> >    - Linux reads and logs DPC and AER error information for port in
> >      containment mode.  [If it was an RP PIO error, Linux clears RP PIO
> >      error status, which is an asymmetry with the non-RP PIO path.]
> > 
> >    - Linux clears AER error status (pci_aer_raw_clear_status())
> > 
> >    - Linux calls driver .error_detected() methods for all child devices
> >      of the port in containment mode (pcie_do_recovery()).  These
> >      devices are inaccessible because the link is down.
> > 
> >    - Linux clears DPC Trigger Status (dpc_reset_link() from
> >      pcie_do_recovery()).
> > 
> >    - Linux calls driver .mmio_enabled() methods for all child devices.
> > 
> > This is where I get lost.  These child devices are now accessible, but
> > they've been reset, so I don't know how their config space got
> > restored.  Did pciehp enumerate them?  Did we do something like
> > pci_restore_state()?  I don't see where either of these happens.

> AFAIK, AER error status registers are sticky (RW1CS) and hence
> will be preserved during reset.

I'm not concerned about the AER registers.  I'm wondering about bus
numbers & windows (for bridges), BAR settings, MSI programming, etc.:
all the normal stuff the driver expects.  Or do we actually detach the
driver, remove the device, hot-add the device, re-enumerate, and
rebind the driver?

> > So they want to basically do native AER handling even though firmware
> > owns AER?  My head hurts.

> No, it's meant only for clearing AER registers. In EDR path, since
> OS owns clearing DPC registers, they want to let OS own clearing AER
> registers as well. Also, it would give OS a chance to decide
> whether we want to keep the device on based on error status and
> history of the device attached.

It's obviously not meant "only for clearing AER registers" if the OS
is going to decide things based on the error status.  How is deciding
things and clearing AER registers different from native AER handling?

This sort of makes a mockery of the idea of "AER ownership".  But I
guess the spec doesn't actually say anything that limits OS access to
the AER capability, even if firmware retains "control".  It does
restrict *firmware* from modifying the AER cap if it grants control
to the OS, but not the other way around.

Bjorn

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 22:11                           ` Kuppuswamy Sathyanarayanan
@ 2020-03-11 22:23                             ` Bjorn Helgaas
  2020-03-11 23:07                               ` Kuppuswamy Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-11 22:23 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

On Wed, Mar 11, 2020 at 03:11:06PM -0700, Kuppuswamy Sathyanarayanan wrote:
> On 3/11/20 2:53 PM, Austin.Bolen@dell.com wrote:
> > On 3/11/2020 4:27 PM, Kuppuswamy Sathyanarayanan wrote:
> > > On 3/11/20 1:33 PM, Bjorn Helgaas wrote:
> > > > On Wed, Mar 11, 2020 at 05:27:35PM +0000, Austin.Bolen@dell.com wrote:
> > > > > On 3/11/2020 12:12 PM, Bjorn Helgaas wrote:
> > > > > <SNIP>
> > > > > > I'm probably missing your intent, but that sounds like "the OS can
> > > > > > read/write AER bits whenever it wants, regardless of ownership."
> > > > > > 
> > > > > > That doesn't sound practical to me, and I don't think it's really
> > > > > > similar to DPC, where it's pretty clear that the OS can touch DPC bits
> > > > > > it doesn't own but only *during the EDR processing window*.
> > > > > Yes, by treating AER bits like DPC bits I meant I'd define the specific
> > > > > time windows when OS can touch the AER status bits similar to how it's
> > > > > done for DPC in the current ECN.
> > > > Makes sense, thanks.
> > > > 
> > > > > > > > > For the normative text describing when OS clears the AER bits
> > > > > > > > > following the informative flow chart, it could say that OS clears
> > > > > > > > > AER as soon as possible after OST returns and before OS processes
> > > > > > > > > _HPX and loading drivers.  Open to other suggestions as well.
> > > > > > > > I'm not sure what to do with "as soon as possible" either.  That
> > > > > > > > doesn't seem like something firmware and the OS can agree on.
> > > > > > > I can just state that it's done after OST returns but before _HPX or
> > > > > > > driver is loaded. Any time in that range is fine. I can't get super
> > > > > > > specific here because different OSes do different things.  Even for
> > > > > > > a given OS they change over time. And I need something generic
> > > > > > > enough to support a wide variety of OS implementations.
> > > > > > Yeah.  I don't know how to solve this.
> > > > > > 
> > > > > > Linux doesn't actually unload and reload drivers for the child devices
> > > > > > (Sathy, correct me if I'm wrong here) even though DPC containment
> > > > > > takes the link down and effectively unplugs and replugs the device.  I
> > > > > > would *like* to handle it like hotplug, but some higher-level software
> > > > > > doesn't deal well with things like storage devices disappearing and
> > > > > > reappearing.
> > > > > > 
> > > > > > Since Linux doesn't actually re-enumerate the child devices, it
> > > > > > wouldn't evaluate _HPX again.  It would probably be cleaner if it did,
> > > > > > but it's all tied up with the whole unplug/replug problem.
> > > > > DPC resets everything below it and so to get it back up and running it
> > > > > would mean that all buses and resources need to be assigned, _HPX
> > > > > evaluated, and drivers reloaded. If those things don't happen then the
> > > > > whole hierarchy below the port that triggered DPC will be inaccessible.
> > > > Hmm, I think I might be confusing this with another situation.  Sathy,
> > > > can you help me understand this?  I don't have a way to actually
> > > > exercise this EDR path.  Is there some way the pciehp hotplug driver
> > > > gets involved here?
> > If the port has hot-plug enabled then DPC trigger will cause the link to
> > go down (disabled state) and will generate a DLLSC hot-plug interrupt.
> > When DPC is released, the link will become active and generate another
> > DLLSC hot-plug interrupt.
> Yes, device/driver enumeration and removal will be triggered by the DLLSC
> state change interrupt in the pciehp driver.
> > 
> > > > Here's how this seems to work as far as I can tell:
> > > > 
> > > >      - Linux does not have DPC or AER control
> > > > 
> > > >      - Linux installs EDR notify handler
> > > > 
> > > >      - Linux evaluates DPC Enable _DSM
> > > > 
> > > >      - DPC containment event occurs
> > > > 
> > > >      - Firmware fields DPC interrupt
> > > > 
> > > >      - DPC event is not a surprise remove
> > > > 
> > > >      - Firmware sends EDR notification
> > > > 
> > > >      - Linux EDR notify handler evaluates Locate _DSM
> > > > 
> > > >      - Linux reads and logs DPC and AER error information for port in
> > > >        containment mode.  [If it was an RP PIO error, Linux clears RP PIO
> > > >        error status, which is an asymmetry with the non-RP PIO path.]
> > > > 
> > > >      - Linux clears AER error status (pci_aer_raw_clear_status())
> > > > 
> > > >      - Linux calls driver .error_detected() methods for all child devices
> > > >        of the port in containment mode (pcie_do_recovery()).  These
> > > >        devices are inaccessible because the link is down.
> > > > 
> > > >      - Linux clears DPC Trigger Status (dpc_reset_link() from
> > > >        pcie_do_recovery()).
> > > > 
> > > >      - Linux calls driver .mmio_enabled() methods for all child devices.
> > > > 
> > > > This is where I get lost.  These child devices are now accessible, but
> > > > they've been reset, so I don't know how their config space got
> > > > restored.  Did pciehp enumerate them?  Did we do something like
> > > > pci_restore_state()?  I don't see where either of these happens.
> > > > AFAIK, AER error status registers are sticky (RW1CS) and hence
> > > will be preserved during reset.
> > In our testing, the device directly connected to the port that was
> > contained does get reprogrammed and the driver is reloaded.  These are
> > hot-plug slots and so might be due to DLLSC hot-plug interrupt when
> > containment is released and link goes back to active state.
> > 
> > However, if a switch is connected to the port where DPC was triggered
> > then we do not see the whole switch hierarchy being re-enumerated.
> Now that I have hardware to verify this scenario, I will look into
> it. I suspect there is a transient state in link status which causes
> this disconnect issue. But I think this issue is not related to
> EDR support and hence should be reproducible in native handling
> as well.
> > 
> > Also, DPC could be enabled on non-hot-plug slots so can't always rely on
> > hot-plug to re-init devices in the recovery path.
> If hotplug is not supported then there is support to enumerate
> devices via polling or ACPI events. But a point to note
> here is, enumeration path is independent of error handler path, and
> hence there is no explicit trigger or event from error handler path
> to enumeration path to kick start the enumeration.

Is any synchronization needed here between the EDR path and the
hotplug/enumeration path?

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 22:13                         ` Bjorn Helgaas
@ 2020-03-11 22:41                           ` Kuppuswamy Sathyanarayanan
  0 siblings, 0 replies; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-11 22:41 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj


On 3/11/20 3:13 PM, Bjorn Helgaas wrote:
> On Wed, Mar 11, 2020 at 02:25:18PM -0700, Kuppuswamy Sathyanarayanan wrote:
>> On 3/11/20 1:33 PM, Bjorn Helgaas wrote:
>>> On Wed, Mar 11, 2020 at 05:27:35PM +0000, Austin.Bolen@dell.com wrote:
>>>> On 3/11/2020 12:12 PM, Bjorn Helgaas wrote:
>>>> <SNIP>
>>>>>>>> For the normative text describing when OS clears the AER bits
>>>>>>>> following the informative flow chart, it could say that OS clears
>>>>>>>> AER as soon as possible after OST returns and before OS processes
>>>>>>>> _HPX and loading drivers.  Open to other suggestions as well.
>>>>>>> I'm not sure what to do with "as soon as possible" either.  That
>>>>>>> doesn't seem like something firmware and the OS can agree on.
>>>>>> I can just state that it's done after OST returns but before _HPX or
>>>>>> driver is loaded. Any time in that range is fine. I can't get super
>>>>>> specific here because different OSes do different things.  Even for
>>>>>> a given OS they change over time. And I need something generic
>>>>>> enough to support a wide variety of OS implementations.
>>>>> Yeah.  I don't know how to solve this.
>>>>>
>>>>> Linux doesn't actually unload and reload drivers for the child devices
>>>>> (Sathy, correct me if I'm wrong here) even though DPC containment
>>>>> takes the link down and effectively unplugs and replugs the device.  I
>>>>> would *like* to handle it like hotplug, but some higher-level software
>>>>> doesn't deal well with things like storage devices disappearing and
>>>>> reappearing.
>>>>>
>>>>> Since Linux doesn't actually re-enumerate the child devices, it
>>>>> wouldn't evaluate _HPX again.  It would probably be cleaner if it did,
>>>>> but it's all tied up with the whole unplug/replug problem.
>>>> DPC resets everything below it and so to get it back up and running it
>>>> would mean that all buses and resources need to be assigned, _HPX
>>>> evaluated, and drivers reloaded. If those things don't happen then the
>>>> whole hierarchy below the port that triggered DPC will be inaccessible.
>>> Hmm, I think I might be confusing this with another situation.  Sathy,
>>> can you help me understand this?  I don't have a way to actually
>>> exercise this EDR path.  Is there some way the pciehp hotplug driver
>>> gets involved here?
>>>
>>> Here's how this seems to work as far as I can tell:
>>>
>>>     - Linux does not have DPC or AER control
>>>
>>>     - Linux installs EDR notify handler
>>>
>>>     - Linux evaluates DPC Enable _DSM
>>>
>>>     - DPC containment event occurs
>>>
>>>     - Firmware fields DPC interrupt
>>>
>>>     - DPC event is not a surprise remove
>>>
>>>     - Firmware sends EDR notification
>>>
>>>     - Linux EDR notify handler evaluates Locate _DSM
>>>
>>>     - Linux reads and logs DPC and AER error information for port in
>>>       containment mode.  [If it was an RP PIO error, Linux clears RP PIO
>>>       error status, which is an asymmetry with the non-RP PIO path.]
>>>
>>>     - Linux clears AER error status (pci_aer_raw_clear_status())
>>>
>>>     - Linux calls driver .error_detected() methods for all child devices
>>>       of the port in containment mode (pcie_do_recovery()).  These
>>>       devices are inaccessible because the link is down.
>>>
>>>     - Linux clears DPC Trigger Status (dpc_reset_link() from
>>>       pcie_do_recovery()).
>>>
>>>     - Linux calls driver .mmio_enabled() methods for all child devices.
>>>
>>> This is where I get lost.  These child devices are now accessible, but
>>> they've been reset, so I don't know how their config space got
>>> restored.  Did pciehp enumerate them?  Did we do something like
>>> pci_restore_state()?  I don't see where either of these happens.
>> AFAIK, AER error status registers are sticky (RW1CS) and hence
>> will be preserved during reset.
> I'm not concerned about the AER registers.  I'm wondering about bus
> numbers & windows (for bridges), BAR settings, MSI programming, etc.:
> all the normal stuff the driver expects.  Or do we actually detach the
> driver, remove the device, hot-add the device, re-enumerate, and
> rebind the driver?
Yes, the link down event removes the device and detaches the
driver. It will be re-added once the link comes up. Please see
pciehp_handle_presence_or_link_change() and
pciehp_unconfigure_device() for details.
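
As a rough illustration of the remove-on-link-down / re-add-on-link-up
behaviour described above, here is a minimal user-space sketch. It is
not kernel code: struct slot_model and slot_handle_link_change() are
hypothetical names made up for this example, and the real logic lives
in the pciehp functions named above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a hot-plug slot; not the kernel's data structure. */
struct slot_model {
	bool link_active;      /* last link state seen */
	bool device_present;   /* child device is enumerated */
	bool driver_bound;     /* a driver is attached to the child */
	int  enumerations;     /* how many times the child was enumerated */
};

/* Model of handling a Data Link Layer State Changed (DLLSC) event. */
static void slot_handle_link_change(struct slot_model *s, bool link_up)
{
	assert(s);

	if (link_up == s->link_active)
		return;                    /* no transition, nothing to do */

	s->link_active = link_up;
	if (!link_up) {
		/* Link Down: detach the driver, then remove the device. */
		s->driver_bound = false;
		s->device_present = false;
	} else {
		/* Link Up: re-enumerate the device and rebind the driver. */
		s->device_present = true;
		s->driver_bound = true;
		s->enumerations++;
	}
}
```

In this model a DPC trigger followed by a release shows up as one
Link Down and one Link Up, so the child ends up enumerated twice
over its lifetime with a fresh driver binding.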
>
>>> So they want to basically do native AER handling even though firmware
>>> owns AER?  My head hurts.
>> No, it's meant only for clearing AER registers. In EDR path, since
>> OS owns clearing DPC registers, they want to let OS own clearing AER
>> registers as well. Also, it would give OS a chance to decide
>> whether we want to keep the device on based on error status and
>> history of the device attached.
> It's obviously not meant "only for clearing AER registers" if the OS
> is going to decide things based on the error status.  How is deciding
> things and clearing AER registers different from native AER handling?
In the EDR case, firmware detects error events *first* and decides whether
to attempt recovery itself or to notify the OS. Even after the OS has
recovered the device, firmware can decide whether to keep the link on or
off. This is the main difference between the native and EDR models.

But once the OS receives the notification, it can attempt recovery and
error handling similar to the native model.
>
> This sort of makes a mockery of the idea of "AER ownership".  But I
> guess the spec doesn't actually say anything that limits OS access to
> the AER capability, even if firmware retains "control".  It does
> restrict *firmware* from modifying the AER cap if it grants control
> to the OS, but not the other way around.
>
> Bjorn

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 22:23                             ` Bjorn Helgaas
@ 2020-03-11 23:07                               ` Kuppuswamy Sathyanarayanan
  2020-03-12 19:53                                 ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-11 23:07 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

Hi Bjorn,

Re-sending the response in text mode.

On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
> Is any synchronization needed here between the EDR path and the
> hotplug/enumeration path?
If we want to follow the implementation note step by step (in sequence), then
we need some synchronization between the EDR path and the enumeration path.
But if it's OK to achieve the same end result by following the steps out of
sequence, then we don't need to create any dependency between the EDR and
enumeration paths. Currently we follow the latter approach.

For example, consider the case in the flow chart where, after sending success
via _OST, firmware decides to stop the recovery of the device.

If we follow the flow chart as is, then the steps should be:

1. Clear the DPC Trigger Status.
2. Send success code via _OST, and wait for return from _OST.
3. On successful return, enumerate the child devices and reassign
   bus numbers.

In the current approach, the steps followed are:

1. Clear the DPC Trigger Status.
2. Send success code via _OST.
2. In parallel, the LINK UP event path will enumerate the child devices.
3. If firmware decides not to recover the device, then the LINK DOWN event
   will eventually remove them again.
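
To make the comparison of the two orderings concrete, here is a small
user-space sketch. It is only a model under stated assumptions:
struct edr_result and the three functions are hypothetical, the
"in parallel" step is flattened into straight-line code, and
fw_keeps_link_up stands for firmware's decision after _OST.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical end state of the child devices after an EDR event. */
struct edr_result {
	bool enumerated;   /* child devices are present at the end */
	int  enum_count;   /* how many times they were enumerated */
	int  remove_count; /* how many times they were removed afterwards */
};

/* Flow-chart order: enumerate only after _OST returns successfully. */
static struct edr_result edr_flowchart_order(bool fw_keeps_link_up)
{
	struct edr_result r = { false, 0, 0 };

	/* 1. Clear DPC Trigger Status (link retrains). */
	/* 2. Send success via _OST and wait for it to return. */
	if (fw_keeps_link_up) {
		/* 3. Enumerate child devices and reassign bus numbers. */
		r.enumerated = true;
		r.enum_count++;
	}
	return r;
}

/* Current order: Link Up enumeration is not gated on the _OST return. */
static struct edr_result edr_current_order(bool fw_keeps_link_up)
{
	struct edr_result r = { false, 0, 0 };

	/* 1. Clear DPC Trigger Status (link retrains). */
	/* 2. Send success via _OST; meanwhile Link Up enumerates. */
	r.enumerated = true;
	r.enum_count++;

	if (!fw_keeps_link_up) {
		/* 3. Firmware drops the link; Link Down removes the devices. */
		r.enumerated = false;
		r.remove_count++;
	}

	/* A removal can only follow an enumeration in this model. */
	assert(r.enum_count >= r.remove_count);
	return r;
}

/* Both orders are expected to converge on the same final state. */
static bool edr_orders_agree(bool fw_keeps_link_up)
{
	return edr_flowchart_order(fw_keeps_link_up).enumerated ==
	       edr_current_order(fw_keeps_link_up).enumerated;
}
```

The final state matches in both cases; the difference is that the
current order may do a transient enumerate-then-remove cycle when
firmware decides not to recover the device.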

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-11 23:07                               ` Kuppuswamy Sathyanarayanan
@ 2020-03-12 19:53                                 ` Bjorn Helgaas
  2020-03-12 21:02                                   ` Austin.Bolen
  2020-03-12 21:59                                   ` Kuppuswamy Sathyanarayanan
  0 siblings, 2 replies; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-12 19:53 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
> On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
> > Is any synchronization needed here between the EDR path and the
> > hotplug/enumeration path?
>
> If we want to follow the implementation note step by step (in
> sequence) then we need some synchronization between EDR path and
> enumeration path. But if it's OK to achieve the same end result by
> following steps out of sequence then we don't need to create any
> dependency between EDR and enumeration paths. Currently we follow
> the latter approach.

What would the synchronization look like?

Ideally I think it would be better to follow the order in the
flowchart if it's not too onerous.  That will make the code easier to
understand.  The current situation with this dependency on pciehp and
what it will do leaves a lot of things implicit.
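
One possible shape for such synchronization, purely as a sketch: the
enumeration path defers link events while EDR recovery owns the port
and replays the latest one afterwards. Everything here (struct
port_sync and the three functions) is hypothetical and single-threaded;
real code would need proper locking and is not what the patches
currently do.

```c
#include <assert.h>
#include <stdbool.h>

enum link_event { LINK_DOWN, LINK_UP };

/* Hypothetical per-port state shared by the EDR and enumeration paths. */
struct port_sync {
	bool recovery_active;     /* set while the EDR path owns the port */
	bool have_pending;        /* an event arrived during recovery */
	enum link_event pending;  /* most recent deferred event */
	enum link_event last;     /* last event the enumeration path acted on */
	int  handled;             /* number of events acted on */
};

/* EDR path: take ownership of the port before starting recovery. */
static void edr_recovery_begin(struct port_sync *p)
{
	assert(!p->recovery_active);
	p->recovery_active = true;
}

/* Enumeration path: act now, or defer while recovery is in progress. */
static void enum_link_event(struct port_sync *p, enum link_event ev)
{
	if (p->recovery_active) {
		p->pending = ev;       /* keep only the latest link state */
		p->have_pending = true;
		return;
	}
	p->last = ev;
	p->handled++;
}

/* EDR path is done (after _OST has returned): replay the deferred event. */
static void edr_recovery_end(struct port_sync *p)
{
	p->recovery_active = false;
	if (p->have_pending) {
		p->have_pending = false;
		enum_link_event(p, p->pending);
	}
}
```

With this shape, the Link Down/Link Up pair caused by DPC containment
collapses into a single Link Up handled after recovery, which is
closer to the flow-chart order.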

What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?

IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
unbinds the drivers and removes the devices.  If that doesn't happen,
and Linux clears the DPC trigger to bring the link back up, will those
drivers try to operate uninitialized devices?

Does EDR need a dependency on CONFIG_HOTPLUG_PCI_PCIE?

> For example, consider the case in flow chart where after sending
> success _OST, firmware decides to stop the recovery of the device.
> 
> if we follow the flow chart as is then the steps should be,
> 
> 1. clear the DPC status trigger
> 2. Send success code via _OST, and wait for return from _OST
> 3. if successful return then enumerate the child devices and
> reassign bus numbers.
> 
> In current approach the steps followed are,
> 
> 1. Clear the DPC status trigger.
> 2. Send success code via _OST
> 2. In parallel, LINK UP event path will enumerate the child devices.
> 3. if firmware decides not to recover the device, then LINK DOWN
> event will eventually remove them again.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-12 19:53                                 ` Bjorn Helgaas
@ 2020-03-12 21:02                                   ` Austin.Bolen
  2020-03-12 21:29                                     ` Kuppuswamy Sathyanarayanan
  2020-03-12 21:59                                   ` Kuppuswamy Sathyanarayanan
  1 sibling, 1 reply; 68+ messages in thread
From: Austin.Bolen @ 2020-03-12 21:02 UTC (permalink / raw)
  To: helgaas, sathyanarayanan.kuppuswamy
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

On 3/12/2020 2:53 PM, Bjorn Helgaas wrote:
> 
> [EXTERNAL EMAIL]
> 
> On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
>> On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
>>> Is any synchronization needed here between the EDR path and the
>>> hotplug/enumeration path?
>>
>> If we want to follow the implementation note step by step (in
>> sequence) then we need some synchronization between EDR path and
>> enumeration path. But if it's OK to achieve the same end result by
>> following steps out of sequence then we don't need to create any
>> dependency between EDR and enumeration paths. Currently we follow
>> the latter approach.
> 
> What would the synchronization look like?
> 
> Ideally I think it would be better to follow the order in the
> flowchart if it's not too onerous.  That will make the code easier to
> understand.  The current situation with this dependency on pciehp and
> what it will do leaves a lot of things implicit.
> 
> What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
> 
> IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
> unbinds the drivers and removes the devices.  If that doesn't happen,
> and Linux clears the DPC trigger to bring the link back up, will those
> drivers try to operate uninitialized devices?
> 
> Does EDR need a dependency on CONFIG_HOTPLUG_PCI_PCIE?

From one of Sathya's other responses:

"If hotplug is not supported then there is support to enumerate
devices via polling or ACPI events. But a point to note
here is, enumeration path is independent of error handler path, and
hence there is no explicit trigger or event from error handler path
to enumeration path to kick start the enumeration."

The EDR standard doesn't have any dependency on hot-plug. It sounds like 
in the current implementation there's some manual intervention needed if 
hot-plug is not supported? Ideally recovery would kick in automatically 
but requiring manual intervention is a good first step.

> 
>> For example, consider the case in flow chart where after sending
>> success _OST, firmware decides to stop the recovery of the device.
>>
>> if we follow the flow chart as is then the steps should be,
>>
>> 1. clear the DPC status trigger
>> 2. Send success code via _OST, and wait for return from _OST
>> 3. if successful return then enumerate the child devices and
>> reassign bus numbers.
>>
>> In current approach the steps followed are,
>>
>> 1. Clear the DPC status trigger.
>> 2. Send success code via _OST

Success in step 2 is assuming the device trained and config space is
accessible, correct?  If the device was removed or device config space is
not accessible, then failure status should be sent via _OST.

>> 2. In parallel, LINK UP event path will enumerate the child devices.
>> 3. if firmware decides not to recover the device, then LINK DOWN
>> event will eventually remove them again.
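
The _OST status rule asked about above (success only if the link
trained and config space is accessible, failure otherwise) can be
sketched as below. This is an illustration only: struct port_state and
edr_pick_ost_status() are hypothetical, and the numeric status values
are assumptions for the sketch rather than values taken from this
thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative status codes; the numeric values are assumptions. */
enum ost_status { OST_SUCCESS = 0x80, OST_FAILED = 0x81 };

/* Hypothetical snapshot of the port after DPC Trigger Status is cleared. */
struct port_state {
	bool link_trained;        /* link came back up */
	bool config_accessible;   /* child config space reads succeed */
};

/* Choose the status to report to firmware via _OST. */
static enum ost_status edr_pick_ost_status(const struct port_state *s)
{
	assert(s);

	if (s->link_trained && s->config_accessible)
		return OST_SUCCESS;

	return OST_FAILED;        /* device removed or not reachable */
}
```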
> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-12 21:02                                   ` Austin.Bolen
@ 2020-03-12 21:29                                     ` Kuppuswamy Sathyanarayanan
  2020-03-12 21:52                                       ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-12 21:29 UTC (permalink / raw)
  To: Austin.Bolen, helgaas; +Cc: linux-pci, linux-kernel, ashok.raj

Hi,

On 3/12/20 2:02 PM, Austin.Bolen@dell.com wrote:
> On 3/12/2020 2:53 PM, Bjorn Helgaas wrote:
>> [EXTERNAL EMAIL]
>>
>> On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
>>> On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
>>>> Is any synchronization needed here between the EDR path and the
>>>> hotplug/enumeration path?
>>> If we want to follow the implementation note step by step (in
>>> sequence) then we need some synchronization between EDR path and
>>> enumeration path. But if it's OK to achieve the same end result by
>>> following steps out of sequence then we don't need to create any
>>> dependency between EDR and enumeration paths. Currently we follow
>>> the latter approach.
>> What would the synchronization look like?
>>
>> Ideally I think it would be better to follow the order in the
>> flowchart if it's not too onerous.  That will make the code easier to
>> understand.  The current situation with this dependency on pciehp and
>> what it will do leaves a lot of things implicit.
>>
>> What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
>>
>> IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
>> unbinds the drivers and removes the devices.  If that doesn't happen,
>> and Linux clears the DPC trigger to bring the link back up, will those
>> drivers try to operate uninitialized devices?
>>
>> Does EDR need a dependency on CONFIG_HOTPLUG_PCI_PCIE?
>   From one of Sathya's other responses:
>
> "If hotplug is not supported then there is support to enumerate
> devices via polling or ACPI events. But a point to note
> here is, enumeration path is independent of error handler path, and
> hence there is no explicit trigger or event from error handler path
> to enumeration path to kick start the enumeration."
>
> The EDR standard doesn't have any dependency on hot-plug. It sounds like
> in the current implementation there's some manual intervention needed if
> hot-plug is not supported?
No, there is no need for manual intervention even in non-hotplug
cases.

In the ACPI events case, we rely on an ACPI event to kick-start
enumeration.  In the polling model, an independent polling
thread kick-starts enumeration.

Both of these enumeration models are totally independent and have
no dependency on the error handler thread.

We decide which model to use based on hardware capability and
_OSC negotiation or a kernel command line option.
> Ideally recovery would kick in automatically
> but requiring manual intervention is a good first step.
>
>>> For example, consider the case in flow chart where after sending
>>> success _OST, firmware decides to stop the recovery of the device.
>>>
>>> if we follow the flow chart as is then the steps should be,
>>>
>>> 1. clear the DPC status trigger
>>> 2. Send success code via _OST, and wait for return from _OST
>>> 3. if successful return then enumerate the child devices and
>>> reassign bus numbers.
>>>
>>> In current approach the steps followed are,
>>>
>>> 1. Clear the DPC status trigger.
>>> 2. Send success code via _OST
> Success in step 2 is assuming device trained and config space is
> accessible correct?
Yes, we send the success code only if the link is trained and on.
> If device was removed or device config space is not
> accessible then failure status should be sent via _OST.
Yes.
>
>>> 2. In parallel, LINK UP event path will enumerate the child devices.
>>> 3. if firmware decides not to recover the device, then LINK DOWN
>>> event will eventually remove them again.
>
-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-12 21:29                                     ` Kuppuswamy Sathyanarayanan
@ 2020-03-12 21:52                                       ` Bjorn Helgaas
  2020-03-12 22:02                                         ` Kuppuswamy Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-12 21:52 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

On Thu, Mar 12, 2020 at 02:29:58PM -0700, Kuppuswamy Sathyanarayanan wrote:
> Hi,
> 
> On 3/12/20 2:02 PM, Austin.Bolen@dell.com wrote:
> > On 3/12/2020 2:53 PM, Bjorn Helgaas wrote:
> > > On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > > On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
> > > > > Is any synchronization needed here between the EDR path and the
> > > > > hotplug/enumeration path?
> > > > If we want to follow the implementation note step by step (in
> > > > sequence) then we need some synchronization between EDR path and
> > > > enumeration path. But if it's OK to achieve the same end result by
> > > > following steps out of sequence then we don't need to create any
> > > > dependency between EDR and enumeration paths. Currently we follow
> > > > the latter approach.
> > > What would the synchronization look like?
> > > 
> > > Ideally I think it would be better to follow the order in the
> > > flowchart if it's not too onerous.  That will make the code easier to
> > > understand.  The current situation with this dependency on pciehp and
> > > what it will do leaves a lot of things implicit.
> > > 
> > > What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
> > > 
> > > IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
> > > unbinds the drivers and removes the devices.  If that doesn't happen,
> > > and Linux clears the DPC trigger to bring the link back up, will those
> > > drivers try to operate uninitialized devices?
> > > 
> > > Does EDR need a dependency on CONFIG_HOTPLUG_PCI_PCIE?
> >   From one of Sathya's other responses:
> > 
> > "If hotplug is not supported then there is support to enumerate
> > devices via polling  or ACPI events. But a point to note
> > here is, enumeration path is independent of error handler path, and
> > hence there is no explicit trigger or event from error handler path
> > to enumeration path to kick start the enumeration."
> > 
> > The EDR standard doesn't have any dependency on hot-plug. It sounds like
> > in the current implementation there's some manual intervention needed if
> > hot-plug is not supported?
> No, there is no need for manual intervention even in non hotplug
> cases.
> 
> For ACPI events case, we would rely on ACPI event to kick start the
> enumeration.  And for polling model, there is an independent polling
> thread which will kick start the enumeration.

I'm guessing the ACPI case works via hotplug_is_native(): if
CONFIG_HOTPLUG_PCI_PCIE=n, pciehp_is_native() returns false, and
acpiphp manages hotplug.

What if CONFIG_HOTPLUG_PCI_ACPI=n also?

Where is the polling thread?

> Above both enumeration models are totally independent and has
> no dependency on error handler thread.

I see they're currently independent from the EDR thread, but it's not
clear to me that there's no dependency.  After all, both EDR and the
hotplug paths are operating on the same devices at roughly the same
time, so we should have some story about what keeps them from getting
in each other's way.

> We will decide which model to use based on hardware capability and
> _OSC negotiation or kernel command line option.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-12 19:53                                 ` Bjorn Helgaas
  2020-03-12 21:02                                   ` Austin.Bolen
@ 2020-03-12 21:59                                   ` Kuppuswamy Sathyanarayanan
  2020-03-12 22:32                                     ` Bjorn Helgaas
  1 sibling, 1 reply; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-12 21:59 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

Hi Bjorn,

On 3/12/20 12:53 PM, Bjorn Helgaas wrote:
> On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
>> On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
>>> Is any synchronization needed here between the EDR path and the
>>> hotplug/enumeration path?
>> If we want to follow the implementation note step by step (in
>> sequence) then we need some synchronization between EDR path and
>> enumeration path. But if it's OK to achieve the same end result by
>> following steps out of sequence then we don't need to create any
>> dependency between EDR and enumeration paths. Currently we follow
>> the latter approach.
> What would the synchronization look like?
We might need some way to disable the enumeration path until
we get a response from firmware.

In the native hotplug case, I think we can do it in two ways.

1. Disable hotplug notification in the Slot Control register
     (pcie_disable_notification()).
2. Somehow block the hotplug driver from processing new
     events (not sure how feasible that is).

Method 1 would be easy, but I am not sure whether it's all
right to disable notifications at arbitrary times. I think, unless we
clear the status as well, we might get some issues due to stale
notification history.

For the ACPI event case, I am not sure whether we have a
communication protocol in place to temporarily disable receiving
ACPI events.

For the polling model, we need to disable the polling
timer thread until we receive the _OST response from firmware.
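
To illustrate the general idea of holding off enumeration until firmware
responds, here is a rough userspace toy model. It is not kernel code and
does not use any real pciehp or ACPI interface; port_model, edr_begin(),
link_up_event() and edr_firmware_response() are all hypothetical names
made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one port; every name here is hypothetical. */
struct port_model {
	bool enum_blocked;	/* set while waiting for the firmware response */
	bool link_up_pending;	/* a link-up event arrived while blocked */
	int enumerations;	/* how many times the child bus was scanned */
};

/* EDR handler starts: hold off enumeration. */
void edr_begin(struct port_model *p)
{
	assert(p);
	p->enum_blocked = true;
	p->link_up_pending = false;
}

/* Link-up seen by a hotplug interrupt, an ACPI event, or the poll timer. */
void link_up_event(struct port_model *p)
{
	assert(p);
	if (p->enum_blocked)
		p->link_up_pending = true;	/* defer instead of scanning now */
	else
		p->enumerations++;
}

/* Firmware answered; recover == false means it gave up on the device. */
void edr_firmware_response(struct port_model *p, bool recover)
{
	assert(p);
	p->enum_blocked = false;
	if (recover && p->link_up_pending)
		p->enumerations++;		/* replay the deferred event */
	p->link_up_pending = false;
}
```

In this model a link-up event that arrives between edr_begin() and
edr_firmware_response() is deferred rather than lost, which is the part
that would need care in any real implementation.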
>
> Ideally I think it would be better to follow the order in the
> flowchart if it's not too onerous.
None of the above changes will be pretty, and I don't think they
will be simple either.
>   That will make the code easier to
> understand.  The current situation with this dependency on pciehp and
> what it will do leaves a lot of things implicit.
>
> What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
>
> IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
> unbinds the drivers and removes the devices.

>   If that doesn't happen,
> and Linux clears the DPC trigger to bring the link back up, will those
> drivers try to operate uninitialized devices?
I don't think this will happen. In the DPC reset_link path, before we
bring up the device, we first wait for the link to go down
using pcie_wait_for_link(pdev, false).
>
> Does EDR need a dependency on CONFIG_HOTPLUG_PCI_PCIE?
No, enumeration can happen in other ways as well (ACPI events, polling, etc.).
>
>> For example, consider the case in flow chart where after sending
>> success _OST, firmware decides to stop the recovery of the device.
>>
>> if we follow the flow chart as is then the steps should be,
>>
>> 1. clear the DPC status trigger
>> 2. Send success code via _OST, and wait for return from _OST
>> 3. if successful return then enumerate the child devices and
>> reassign bus numbers.
>>
>> In current approach the steps followed are,
>>
>> 1. Clear the DPC status trigger.
>> 2. Send success code via _OST
>> 2. In parallel, LINK UP event path will enumerate the child devices.
>> 3. if firmware decides not to recover the device, then LINK DOWN
>> event will eventually remove them again.

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-12 21:52                                       ` Bjorn Helgaas
@ 2020-03-12 22:02                                         ` Kuppuswamy Sathyanarayanan
  2020-03-12 22:36                                           ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-12 22:02 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

Hi,

On 3/12/20 2:52 PM, Bjorn Helgaas wrote:
> On Thu, Mar 12, 2020 at 02:29:58PM -0700, Kuppuswamy Sathyanarayanan wrote:
>> Hi,
>>
>> On 3/12/20 2:02 PM, Austin.Bolen@dell.com wrote:
>>> On 3/12/2020 2:53 PM, Bjorn Helgaas wrote:
>>>> On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
>>>>> On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
>>>>>> Is any synchronization needed here between the EDR path and the
>>>>>> hotplug/enumeration path?
>>>>> If we want to follow the implementation note step by step (in
>>>>> sequence) then we need some synchronization between EDR path and
>>>>> enumeration path. But if it's OK to achieve the same end result by
>>>>> following steps out of sequence then we don't need to create any
>>>>> dependency between EDR and enumeration paths. Currently we follow
>>>>> the latter approach.
>>>> What would the synchronization look like?
>>>>
>>>> Ideally I think it would be better to follow the order in the
>>>> flowchart if it's not too onerous.  That will make the code easier to
>>>> understand.  The current situation with this dependency on pciehp and
>>>> what it will do leaves a lot of things implicit.
>>>>
>>>> What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
>>>>
>>>> IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
>>>> unbinds the drivers and removes the devices.  If that doesn't happen,
>>>> and Linux clears the DPC trigger to bring the link back up, will those
>>>> drivers try to operate uninitialized devices?
>>>>
>>>> Does EDR need a dependency on CONFIG_HOTPLUG_PCI_PCIE?
>>>    From one of Sathya's other responses:
>>>
>>> "If hotplug is not supported then there is support to enumerate
>>> devices via polling  or ACPI events. But a point to note
>>> here is, enumeration path is independent of error handler path, and
>>> hence there is no explicit trigger or event from error handler path
>>> to enumeration path to kick start the enumeration."
>>>
>>> The EDR standard doesn't have any dependency on hot-plug. It sounds like
>>> in the current implementation there's some manual intervention needed if
>>> hot-plug is not supported?
>> No, there is no need for manual intervention even in non hotplug
>> cases.
>>
>> For ACPI events case, we would rely on ACPI event to kick start the
>> enumeration.  And for polling model, there is an independent polling
>> thread which will kick start the enumeration.
> I'm guessing the ACPI case works via hotplug_is_native(): if
> CONFIG_HOTPLUG_PCI_PCIE=n, pciehp_is_native() returns false, and
> acpiphp manages hotplug.
>
> What if CONFIG_HOTPLUG_PCI_ACPI=n also?
If none of the automatic scans is enabled, then we might need some
manual trigger (rescan). But this would be needed in the native
DPC case as well.
>
> Where is the polling thread?
drivers/pci/hotplug/pciehp_hpc.c
>
>> Above both enumeration models are totally independent and has
>> no dependency on error handler thread.
> I see they're currently independent from the EDR thread, but it's not
> clear to me that there's no dependency.  After all, both EDR and the
> hotplug paths are operating on the same devices at roughly the same
> time, so we should have some story about what keeps them from getting
> in each other's way.
>
>> We will decide which model to use based on hardware capability and
>> _OSC negotiation or kernel command line option.

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-12 21:59                                   ` Kuppuswamy Sathyanarayanan
@ 2020-03-12 22:32                                     ` Bjorn Helgaas
  2020-03-13  6:22                                       ` Kuppuswamy, Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-12 22:32 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

On Thu, Mar 12, 2020 at 02:59:15PM -0700, Kuppuswamy Sathyanarayanan wrote:
> Hi Bjorn,
> 
> On 3/12/20 12:53 PM, Bjorn Helgaas wrote:
> > On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
> > > > Is any synchronization needed here between the EDR path and the
> > > > hotplug/enumeration path?
> > > If we want to follow the implementation note step by step (in
> > > sequence) then we need some synchronization between EDR path and
> > > enumeration path. But if it's OK to achieve the same end result by
> > > following steps out of sequence then we don't need to create any
> > > dependency between EDR and enumeration paths. Currently we follow
> > > the latter approach.
> > What would the synchronization look like?
> we might need some way to disable the enumeration path till
> we get response from firmware.
> 
> In native hot plug case, I think we can do it in two ways.
> 
> 1. Disable hotplug notification in slot ctl registers.
>     (pcie_disable_notification())
> 2. Some how block hotplug driver from processing the new
>     events (not sure how feasible its).
> 
> Following method 1 would be easy, But I am not sure whether
> its alright to disable them randomly. I think, unless we
> clear the status as well, we might get some issues due to stale
> notification history.
> 
> For ACPI event case, I am not sure whether we have some
> communication protocol in place to disable receiving ACPI
> events temporarily.
> 
> For polling model, we need to disable to the polling
> timer thread till we receive _OST response from firmware.
> > 
> > Ideally I think it would be better to follow the order in the
> > flowchart if it's not too onerous.
> None of the above changes will be pretty and I think it will
> not be simple as well.
> >   That will make the code easier to
> > understand.  The current situation with this dependency on pciehp and
> > what it will do leaves a lot of things implicit.
> > 
> > What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
> > 
> > IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
> > unbinds the drivers and removes the devices.
> 
> >  If that doesn't happen, and Linux clears the DPC trigger to bring
> >  the link back up, will those drivers try to operate uninitialized
> >  devices?
>
> I don't think this will happen. In DPC reset_link before we bring up
> the device we wait for link to go down first using
> pcie_wait_for_link(pdev, false) function.

I understand that, but these child devices were reset when DPC
disabled the link.  When the link comes back up, their BARs contain
zeros.

If CONFIG_HOTPLUG_PCI_PCIE=y, the DLLSC interrupt will cause pciehp to
unbind the driver.  It seems like the unbind races with the EDR notify
handler.  If pciehp unbinds the driver before edr_handle_event() calls
pcie_do_recovery(), this seems fine -- we'll call dpc_reset_link(),
which brings up the link, we won't call any driver callbacks because
there's no driver, and another DLLSC interrupt will cause pciehp to
re-enumerate, which will re-initialize the device, then rebind the
driver.

If the EDR notify handler runs before pciehp unbinds the driver,
couldn't EDR bring up the link and call driver .mmio_enabled() before
the device has been initialized?

If CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=y, I could
believe that the situations are similar to the above.

What if CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=n?  Then
I assume there's nothing to unbind the driver, so pcie_do_recovery()
will call the driver .mmio_enabled() and other recovery callbacks on a
device that hasn't been initialized?
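
The cases above can be written down as a small userspace toy model. It
only encodes the concern as raised in this mail, not verified kernel
behavior; dev_model and recovery_hits_uninitialized() are hypothetical
names, and "hotplug_enabled" stands for either pciehp or acpiphp being
available to unbind the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device state; hypothetical, for illustration only. */
struct dev_model {
	bool driver_bound;
	bool initialized;	/* BARs and other config are valid */
};

/*
 * Returns true if recovery would call driver callbacks on a device
 * that has been reset but not re-initialized.
 *
 * hotplug_enabled:  some hotplug driver exists to unbind the driver
 * unbind_wins_race: that unbind completes before recovery runs
 */
bool recovery_hits_uninitialized(bool hotplug_enabled, bool unbind_wins_race)
{
	struct dev_model dev = { .driver_bound = true, .initialized = true };

	/* Without a hotplug driver, nothing can win the race. */
	assert(hotplug_enabled || !unbind_wins_race);

	/* DPC disables the link; the child device is reset. */
	dev.initialized = false;

	/* The hotplug driver unbinds only if it gets there first. */
	if (hotplug_enabled && unbind_wins_race)
		dev.driver_bound = false;

	/* Recovery brings the link up and calls any bound driver. */
	return dev.driver_bound && !dev.initialized;
}
```

Only the case where a hotplug driver exists and wins the race comes out
clean in this model; the other two combinations end with callbacks
running against an uninitialized device.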

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-12 22:02                                         ` Kuppuswamy Sathyanarayanan
@ 2020-03-12 22:36                                           ` Bjorn Helgaas
  0 siblings, 0 replies; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-12 22:36 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

On Thu, Mar 12, 2020 at 03:02:07PM -0700, Kuppuswamy Sathyanarayanan wrote:
> On 3/12/20 2:52 PM, Bjorn Helgaas wrote:
> > On Thu, Mar 12, 2020 at 02:29:58PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > On 3/12/20 2:02 PM, Austin.Bolen@dell.com wrote:
> > > > On 3/12/2020 2:53 PM, Bjorn Helgaas wrote:
> > > > > On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > > > > On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
> > > > > > > Is any synchronization needed here between the EDR path and the
> > > > > > > hotplug/enumeration path?
> > > > > > If we want to follow the implementation note step by step (in
> > > > > > sequence) then we need some synchronization between EDR path and
> > > > > > enumeration path. But if it's OK to achieve the same end result by
> > > > > > following steps out of sequence then we don't need to create any
> > > > > > dependency between EDR and enumeration paths. Currently we follow
> > > > > > the latter approach.
> > > > > What would the synchronization look like?
> > > > > 
> > > > > Ideally I think it would be better to follow the order in the
> > > > > flowchart if it's not too onerous.  That will make the code easier to
> > > > > understand.  The current situation with this dependency on pciehp and
> > > > > what it will do leaves a lot of things implicit.
> > > > > 
> > > > > What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
> > > > > 
> > > > > IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
> > > > > unbinds the drivers and removes the devices.  If that doesn't happen,
> > > > > and Linux clears the DPC trigger to bring the link back up, will those
> > > > > drivers try to operate uninitialized devices?
> > > > > 
> > > > > Does EDR need a dependency on CONFIG_HOTPLUG_PCI_PCIE?
> > > >    From one of Sathya's other responses:
> > > > 
> > > > "If hotplug is not supported then there is support to enumerate
> > > > devices via polling  or ACPI events. But a point to note
> > > > here is, enumeration path is independent of error handler path, and
> > > > hence there is no explicit trigger or event from error handler path
> > > > to enumeration path to kick start the enumeration."
> > > > 
> > > > The EDR standard doesn't have any dependency on hot-plug. It sounds like
> > > > in the current implementation there's some manual intervention needed if
> > > > hot-plug is not supported?
> > >
> > > No, there is no need for manual intervention even in non hotplug
> > > cases.
> > > 
> > > For ACPI events case, we would rely on ACPI event to kick start the
> > > enumeration.  And for polling model, there is an independent polling
> > > thread which will kick start the enumeration.
>
> > I'm guessing the ACPI case works via hotplug_is_native(): if
> > CONFIG_HOTPLUG_PCI_PCIE=n, pciehp_is_native() returns false, and
> > acpiphp manages hotplug.
> > 
> > What if CONFIG_HOTPLUG_PCI_ACPI=n also?
>
> If none of the auto scans are enabled then we might need some
> manual trigger (rescan). But this would be needed in native
> DPC case as well.
> > 
> > Where is the polling thread?
>
> drivers/pci/hotplug/pciehp_hpc.c

Only if CONFIG_HOTPLUG_PCI_PCIE=y, obviously.  My question is about
what happens when CONFIG_HOTPLUG_PCI_PCIE=n.

I'm not as concerned about requiring a manual rescan.  That's
inconvenient, but doesn't seem like a big deal because that's what you
expect with no hotplug driver.

What I *am* worried about is calling driver callbacks on a device that
has been reset but not initialized.  That could cause all sorts of
havoc because the driver thinks it can trust BARs and other
configuration.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-12 22:32                                     ` Bjorn Helgaas
@ 2020-03-13  6:22                                       ` Kuppuswamy, Sathyanarayanan
  2020-03-13 19:28                                         ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-03-13  6:22 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj

Hi Bjorn,

On 3/12/2020 3:32 PM, Bjorn Helgaas wrote:
> On Thu, Mar 12, 2020 at 02:59:15PM -0700, Kuppuswamy Sathyanarayanan wrote:
>> Hi Bjorn,
>>
>> On 3/12/20 12:53 PM, Bjorn Helgaas wrote:
>>> On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
>>>> On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
>>>>> Is any synchronization needed here between the EDR path and the
>>>>> hotplug/enumeration path?
>>>> If we want to follow the implementation note step by step (in
>>>> sequence) then we need some synchronization between EDR path and
>>>> enumeration path. But if it's OK to achieve the same end result by
>>>> following steps out of sequence then we don't need to create any
>>>> dependency between EDR and enumeration paths. Currently we follow
>>>> the latter approach.
>>> What would the synchronization look like?
>> we might need some way to disable the enumeration path till
>> we get response from firmware.
>>
>> In native hot plug case, I think we can do it in two ways.
>>
>> 1. Disable hotplug notification in slot ctl registers.
>>      (pcie_disable_notification())
>> 2. Some how block hotplug driver from processing the new
>>      events (not sure how feasible its).
>>
>> Following method 1 would be easy, But I am not sure whether
>> its alright to disable them randomly. I think, unless we
>> clear the status as well, we might get some issues due to stale
>> notification history.
>>
>> For ACPI event case, I am not sure whether we have some
>> communication protocol in place to disable receiving ACPI
>> events temporarily.
>>
>> For polling model, we need to disable to the polling
>> timer thread till we receive _OST response from firmware.
>>>
>>> Ideally I think it would be better to follow the order in the
>>> flowchart if it's not too onerous.
>> None of the above changes will be pretty and I think it will
>> not be simple as well.
>>>    That will make the code easier to
>>> understand.  The current situation with this dependency on pciehp and
>>> what it will do leaves a lot of things implicit.
>>>
>>> What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
>>>
>>> IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
>>> unbinds the drivers and removes the devices.
>>
>>>   If that doesn't happen, and Linux clears the DPC trigger to bring
>>>   the link back up, will those drivers try to operate uninitialized
>>>   devices?
>>
>> I don't think this will happen. In DPC reset_link before we bring up
>> the device we wait for link to go down first using
>> pcie_wait_for_link(pdev, false) function.
> 
> I understand that, but these child devices were reset when DPC
> disabled the link.  When the link comes back up, their BARs contain
> zeros.
> 
> If CONFIG_HOTPLUG_PCI_PCIE=y, the DLLSC interrupt will cause pciehp to
> unbind the driver.  It seems like the unbind races with the EDR notify
> handler.

Agreed. But even if there is a race condition, after clearing the DPC trigger
status, if the hotplug driver properly removes/re-enumerates the driver then
the end result will still be the same. There should be no functional impact.

> If pciehp unbinds the driver before edr_handle_event() calls
> pcie_do_recovery(), this seems fine -- we'll call dpc_reset_link(),
> which brings up the link, we won't call any driver callbacks because
> there's no driver, and another DLLSC interrupt will cause pciehp to
> re-enumerate, which will re-initialize the device, then rebind the
> driver.
> 
> If the EDR notify handler runs before pciehp unbinds the driver,
In the above case, from the kernel's perspective the device is still
accessible and, IIUC, the kernel will try to recover it in
pcie_do_recovery() using one of the callbacks.

int (*mmio_enabled)(struct pci_dev *dev);
int (*slot_reset)(struct pci_dev *dev);
void (*resume)(struct pci_dev *dev);

One of these callbacks will do pci_restore_state() to restore the
device, and IO will not be attempted in these callbacks until the device
is successfully recovered.

> couldn't EDR bring up the link and call driver .mmio_enabled() before
> the device has been initialized?
Calling mmio_enabled in this case should not be a problem, right?

Please check the following content from
Documentation/PCI/pci-error-recovery.rst. IIUC (following content),
IO will not be attempted until the device is successfully
re-configured.

STEP 2: MMIO Enabled
--------------------
This callback is made if all drivers on a segment agree that they can
try to recover and if no automatic link reset was performed by the HW.
If the platform can't just re-enable IOs without a slot reset or a link
reset, it will not call this callback, and instead will have gone
directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)


> 
> If CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=y, I could
> believe that the situations are similar to the above.
> 
> What if CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=n?  Then
> I assume there's nothing to unbind the driver, so pcie_do_recovery()
> will call the driver .mmio_enabled() and other recovery callbacks on a
> device that hasn't been initialized?

Probably the device config will be restored in the .slot_reset() callback,
which will make the device functional again.

Also, since hotplug is not supported in the above case, topology changes
will not be supported.

> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-13  6:22                                       ` Kuppuswamy, Sathyanarayanan
@ 2020-03-13 19:28                                         ` Bjorn Helgaas
  2020-03-13 20:26                                           ` Kuppuswamy Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-13 19:28 UTC (permalink / raw)
  To: Kuppuswamy, Sathyanarayanan
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj, Russell Currey,
	Sam Bobroff, Oliver O'Halloran

[+cc Russell, Sam, Oliver since we're talking about the error recovery
flow.  The code we're talking about is at [1]]

On Thu, Mar 12, 2020 at 11:22:13PM -0700, Kuppuswamy, Sathyanarayanan wrote:
> On 3/12/2020 3:32 PM, Bjorn Helgaas wrote:
> > On Thu, Mar 12, 2020 at 02:59:15PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > On 3/12/20 12:53 PM, Bjorn Helgaas wrote:
> > > > On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > > > On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
> > > > > > Is any synchronization needed here between the EDR path and the
> > > > > > hotplug/enumeration path?
> > > > > If we want to follow the implementation note step by step (in
> > > > > sequence) then we need some synchronization between EDR path and
> > > > > enumeration path. But if it's OK to achieve the same end result by
> > > > > following steps out of sequence then we don't need to create any
> > > > > dependency between EDR and enumeration paths. Currently we follow
> > > > > the latter approach.
> > > > What would the synchronization look like?
> > > we might need some way to disable the enumeration path till
> > > we get response from firmware.
> > > 
> > > In native hot plug case, I think we can do it in two ways.
> > > 
> > > 1. Disable hotplug notification in slot ctl registers.
> > >      (pcie_disable_notification())
> > > 2. Some how block hotplug driver from processing the new
> > >      events (not sure how feasible its).
> > > 
> > > Following method 1 would be easy, But I am not sure whether
> > > its alright to disable them randomly. I think, unless we
> > > clear the status as well, we might get some issues due to stale
> > > notification history.
> > > 
> > > For ACPI event case, I am not sure whether we have some
> > > communication protocol in place to disable receiving ACPI
> > > events temporarily.
> > > 
> > > For polling model, we need to disable to the polling
> > > timer thread till we receive _OST response from firmware.
> > > > 
> > > > Ideally I think it would be better to follow the order in the
> > > > flowchart if it's not too onerous.
> > > None of the above changes will be pretty and I think it will
> > > not be simple as well.
> > > >    That will make the code easier to
> > > > understand.  The current situation with this dependency on pciehp and
> > > > what it will do leaves a lot of things implicit.
> > > > 
> > > > What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
> > > > 
> > > > IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
> > > > unbinds the drivers and removes the devices.
> > > 
> > > >   If that doesn't happen, and Linux clears the DPC trigger to bring
> > > >   the link back up, will those drivers try to operate uninitialized
> > > >   devices?
> > > 
> > > I don't think this will happen. In DPC reset_link before we bring up
> > > the device we wait for link to go down first using
> > > pcie_wait_for_link(pdev, false) function.
> > 
> > I understand that, but these child devices were reset when DPC
> > disabled the link.  When the link comes back up, their BARs
> > contain zeros.
> > 
> > If CONFIG_HOTPLUG_PCI_PCIE=y, the DLLSC interrupt will cause
> > pciehp to unbind the driver.  It seems like the unbind races with
> > the EDR notify handler.
> 
> Agree. But even if there is a race condition, after clearing DPC
> trigger status, if hotplug driver properly removes/re-enumerates the
> driver then the end result will still be same. There should be no
> functional impact.
> 
> > If pciehp unbinds the driver before edr_handle_event() calls
> > pcie_do_recovery(), this seems fine -- we'll call
> > dpc_reset_link(), which brings up the link, we won't call any
> > driver callbacks because there's no driver, and another DLLSC
> > interrupt will cause pciehp to re-enumerate, which will
> > re-initialize the device, then rebind the driver.
> > 
> > If the EDR notify handler runs before pciehp unbinds the driver,
>
> In the above case, from the kernel perspective device is still
> accessible and IIUC, it will try to recover it in pcie_do_recovery()
> using one of the callbacks.
> 
> int (*mmio_enabled)(struct pci_dev *dev);
> int (*slot_reset)(struct pci_dev *dev);
> void (*resume)(struct pci_dev *dev);
> 
> One of these callbacks will do pci_restore_state() to restore the
> device, and IO will not attempted in these callbacks until the device
> is successfully recovered.

That might be what *should* happen, but I don't think it's what
*does* happen.

I don't think we use .mmio_enabled() and .slot_reset() for EDR
because Linux EDR currently depends on DPC, so we'll be using
dpc_reset_link(), which normally returns PCI_ERS_RESULT_RECOVERED,
so pcie_do_recovery() skips .mmio_enabled() and .slot_reset().
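
As a rough illustration of that flow, here is a simplified userspace
model of the callback selection. It is a sketch, not the kernel's
pcie_do_recovery(): the enum and flag names are hypothetical stand-ins,
error_detected() is reduced to a single "driver_vote" input, and the
model assumes .mmio_enabled() and .slot_reset() report success when
they run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's pci_ers_result_t values. */
enum ers_result {
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
	ERS_RECOVERED,
};

/* Bit flags recording which driver callbacks the model invoked. */
enum {
	CB_MMIO_ENABLED = 1 << 0,
	CB_SLOT_RESET   = 1 << 1,
	CB_RESUME       = 1 << 2,
};

/*
 * frozen:       the link was disabled (as it is when DPC has triggered)
 * driver_vote:  what the driver's error_detected() returned
 * reset_result: what the link-reset step returned (used only if frozen)
 *
 * Returns the set of later callbacks that would run.
 */
int model_recovery(bool frozen, enum ers_result driver_vote,
		   enum ers_result reset_result)
{
	enum ers_result status = driver_vote;
	int called = 0;

	if (frozen) {
		/* In this model a link reset either recovers or gives up. */
		assert(reset_result == ERS_RECOVERED ||
		       reset_result == ERS_DISCONNECT);
		status = reset_result;
		if (status != ERS_RECOVERED)
			return called;
	}
	if (status == ERS_CAN_RECOVER) {
		called |= CB_MMIO_ENABLED;
		status = ERS_RECOVERED;
	}
	if (status == ERS_NEED_RESET) {
		called |= CB_SLOT_RESET;
		status = ERS_RECOVERED;
	}
	if (status != ERS_RECOVERED)
		return called;

	called |= CB_RESUME;
	return called;
}
```

With frozen set and the reset step reporting ERS_RECOVERED, the model
reaches only the resume callback, which matches the reasoning above.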

I looked at the first few .resume() implementations (FWIW, I used [2]
to find them), and none of them calls pci_restore_state() before doing
I/O to the device:

  ioat_pcie_error_resume()
  pci_resume() (hfi1)
  qib_pci_resume()
  cxl_pci_resume()
  genwqe_err_resume()
  ...

But I assume you've tested EDR with some driver that *does* call
pci_restore_state()?  Or maybe you have pciehp enabled, and it always
wins the race and unbinds the driver before the EDR notification?  It
would be interesting to make pciehp *lose* the race and see if
anything breaks.

pci-error-recovery.rst does not mention any requirement for the driver
to call pci_restore_state(), and I think any state restoration like
that should be the responsibility of the PCI core, not the driver.

> > couldn't EDR bring up the link and call driver .mmio_enabled() before
> > the device has been initialized?
>
> Calling mmio_enabled in this case should not be a problem right?
>
> Please check the following content from
> Documentation/PCI/pci-error-recovery.rst. IIUC (following content),
> IO will not be attempted until the device is successfully
> re-configured.
> 
> STEP 2: MMIO Enabled
> --------------------
> This callback is made if all drivers on a segment agree that they can
> try to recover and if no automatic link reset was performed by the HW.
> If the platform can't just re-enable IOs without a slot reset or a link
> reset, it will not call this callback, and instead will have gone
> directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
> 
> > If CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=y, I could
> > believe that the situations are similar to the above.
> > 
> > What if CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=n?  Then
> > I assume there's nothing to unbind the driver, so pcie_do_recovery()
> > will call the driver .mmio_enabled() and other recovery callbacks on a
> > device that hasn't been initialized?
> 
> probably in .slot_reset() callback device config will be restored and it
> will make the device functional again.

I don't think .mmio_enabled() is a problem because IIUC, the device
should not have been reset before calling .mmio_enabled().

But I think .slot_reset() *is* a problem.  I looked at several
.slot_reset() implementations ([3]); some called pci_restore_state(),
but many did not.

If no hotplug driver is enabled, I think the .slot_reset() callbacks
that do not call pci_restore_state() are broken.

> Also since in above case hotplug is not supported, topology change will
> not be supported.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/?h=review/edr
[2] F='\.resume'; git grep -A 10 "struct pci_error_handlers" | grep "$F\s*=" | sed -e "s/.*$F\s*=\s*//" -e 's/,\s*$//'
[3] F='\.slot_reset'; git grep -A 10 "struct pci_error_handlers" | grep "$F\s*=" | sed -e "s/.*$F\s*=\s*//" -e 's/,\s*$//'


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-13 19:28                                         ` Bjorn Helgaas
@ 2020-03-13 20:26                                           ` Kuppuswamy Sathyanarayanan
  2020-03-19 23:03                                             ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2020-03-13 20:26 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj, Russell Currey,
	Sam Bobroff, Oliver O'Halloran

Hi Bjorn,

On 3/13/20 12:28 PM, Bjorn Helgaas wrote:
> [+cc Russell, Sam, Oliver since we're talking about the error recovery
> flow.  The code we're talking about is at [1]]
>
> On Thu, Mar 12, 2020 at 11:22:13PM -0700, Kuppuswamy, Sathyanarayanan wrote:
>> On 3/12/2020 3:32 PM, Bjorn Helgaas wrote:
>>> On Thu, Mar 12, 2020 at 02:59:15PM -0700, Kuppuswamy Sathyanarayanan wrote:
>>>> On 3/12/20 12:53 PM, Bjorn Helgaas wrote:
>>>>> On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
>>>>>> On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
>>>>>>> Is any synchronization needed here between the EDR path and the
>>>>>>> hotplug/enumeration path?
>>>>>> If we want to follow the implementation note step by step (in
>>>>>> sequence) then we need some synchronization between EDR path and
>>>>>> enumeration path. But if it's OK to achieve the same end result by
>>>>>> following steps out of sequence then we don't need to create any
>>>>>> dependency between EDR and enumeration paths. Currently we follow
>>>>>> the latter approach.
>>>>> What would the synchronization look like?
>>>> we might need some way to disable the enumeration path till
>>>> we get response from firmware.
>>>>
>>>> In native hot plug case, I think we can do it in two ways.
>>>>
>>>> 1. Disable hotplug notification in slot ctl registers.
>>>>       (pcie_disable_notification())
>>>> 2. Some how block hotplug driver from processing the new
>>>>       events (not sure how feasible its).
>>>>
>>>> Following method 1 would be easy, But I am not sure whether
>>>> its alright to disable them randomly. I think, unless we
>>>> clear the status as well, we might get some issues due to stale
>>>> notification history.
>>>>
>>>> For ACPI event case, I am not sure whether we have some
>>>> communication protocol in place to disable receiving ACPI
>>>> events temporarily.
>>>>
>>>> For polling model, we need to disable to the polling
>>>> timer thread till we receive _OST response from firmware.
>>>>> Ideally I think it would be better to follow the order in the
>>>>> flowchart if it's not too onerous.
>>>> None of the above changes will be pretty and I think it will
>>>> not be simple as well.
>>>>>     That will make the code easier to
>>>>> understand.  The current situation with this dependency on pciehp and
>>>>> what it will do leaves a lot of things implicit.
>>>>>
>>>>> What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
>>>>>
>>>>> IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
>>>>> unbinds the drivers and removes the devices.
>>>>>    If that doesn't happen, and Linux clears the DPC trigger to bring
>>>>>    the link back up, will those drivers try to operate uninitialized
>>>>>    devices?
>>>> I don't think this will happen. In DPC reset_link before we bring up
>>>> the device we wait for link to go down first using
>>>> pcie_wait_for_link(pdev, false) function.
>>> I understand that, but these child devices were reset when DPC
>>> disabled the link.  When the link comes back up, their BARs
>>> contain zeros.
>>>
>>> If CONFIG_HOTPLUG_PCI_PCIE=y, the DLLSC interrupt will cause
>>> pciehp to unbind the driver.  It seems like the unbind races with
>>> the EDR notify handler.
>> Agree. But even if there is a race condition, after clearing DPC
>> trigger status, if hotplug driver properly removes/re-enumerates the
>> driver then the end result will still be same. There should be no
>> functional impact.
>>
>>> If pciehp unbinds the driver before edr_handle_event() calls
>>> pcie_do_recovery(), this seems fine -- we'll call
>>> dpc_reset_link(), which brings up the link, we won't call any
>>> driver callbacks because there's no driver, and another DLLSC
>>> interrupt will cause pciehp to re-enumerate, which will
>>> re-initialize the device, then rebind the driver.
>>>
>>> If the EDR notify handler runs before pciehp unbinds the driver,
>> In the above case, from the kernel perspective device is still
>> accessible and IIUC, it will try to recover it in pcie_do_recovery()
>> using one of the callbacks.
>>
>> int (*mmio_enabled)(struct pci_dev *dev);
>> int (*slot_reset)(struct pci_dev *dev);
>> void (*resume)(struct pci_dev *dev);
>>
>> One of these callbacks will do pci_restore_state() to restore the
>> device, and IO will not attempted in these callbacks until the device
>> is successfully recovered.
> That might be what *should* happen, but I don't think it's what
> *does* happen.
>
> I don't think we use .mmio_enabled() and .slot_reset() for EDR
> because Linux EDR currently depends on DPC, so we'll be using
> dpc_reset_link(), which normally returns PCI_ERS_RESULT_RECOVERED,
> so pcie_do_recovery() skips .mmio_enabled() and .slot_reset().
After our discussion about non-hotplug cases, I am thinking
that the reset_link() callback should not return
PCI_ERS_RESULT_RECOVERED in non-hotplug cases. If the link is
successfully reset, it should return PCI_ERS_RESULT_NEED_RESET.
This will let pcie_do_recovery() proceed to .slot_reset() and
successfully recover the device.

Any comments?
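
To make the idea concrete, here is a rough userspace model of the
callback sequence. It is only a sketch, not the real
pcie_do_recovery(): the enum, the flag names and
model_fatal_recovery() are made up for illustration, the driver
callbacks are assumed to always succeed, and it assumes the check
after the link reset is relaxed so that a "need reset" result is not
treated as a failure.

```c
/* Simplified stand-ins for the pci_ers_result_t values named above. */
enum ers_result {
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
	ERS_RECOVERED,
};

/* Bit flags recording which driver callbacks the model invoked. */
enum {
	CB_MMIO_ENABLED = 1,
	CB_SLOT_RESET   = 2,
	CB_RESUME       = 4,
};

/*
 * Model of the callback sequence after a fatal (frozen) error.
 * link_status is what the reset_link() callback returned.
 * Returns the set of callbacks that would have been called.
 */
static int model_fatal_recovery(enum ers_result link_status)
{
	enum ers_result status = link_status;
	int called = 0;

	/* A failed link reset ends recovery before any callback runs. */
	if (status == ERS_DISCONNECT)
		return called;

	if (status == ERS_CAN_RECOVER) {
		called |= CB_MMIO_ENABLED;
		status = ERS_RECOVERED;
	}

	if (status == ERS_NEED_RESET) {
		called |= CB_SLOT_RESET;
		status = ERS_RECOVERED;
	}

	if (status != ERS_RECOVERED)
		return called;

	called |= CB_RESUME;
	return called;
}
```

In this model, a "recovered" link reset goes straight to the resume
callback, while a "need reset" result runs the slot reset callback
first and then resume.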
>
> I looked at the first few .resume() implementations (FWIW, I used [2]
> to find them), and none of them calls pci_restore_state() before doing
> I/O to the device:
>
>    ioat_pcie_error_resume()
>    pci_resume() (hfi1)
>    qib_pci_resume()
>    cxl_pci_resume()
>    genwqe_err_resume()
>    ...
>
> But I assume you've tested EDR with some driver that *does* call
> pci_restore_state()?  Or maybe you have pciehp enabled,
Yes. I have tested it only with hotplug enabled. Let me try to disable
hotplug and verify the cases.
> and it always
> wins the race and unbinds the driver before the EDR notification?  It
> would be interesting to make pciehp *lose* the race and see if
> anything breaks.
>
> pci-error-recovery.rst does not mention any requirement for the driver
> to call pci_restore_state(), and I think any state restoration like
> that should be the responsibility of the PCI core, not the driver.
>
>>> couldn't EDR bring up the link and call driver .mmio_enabled() before
>>> the device has been initialized?
>> Calling mmio_enabled in this case should not be a problem right?
>>
>> Please check the following content from
>> Documentation/PCI/pci-error-recovery.rst. IIUC (following content),
>> IO will not be attempted until the device is successfully
>> re-configured.
>>
>> STEP 2: MMIO Enabled
>> --------------------
>> This callback is made if all drivers on a segment agree that they can
>> try to recover and if no automatic link reset was performed by the HW.
>> If the platform can't just re-enable IOs without a slot reset or a link
>> reset, it will not call this callback, and instead will have gone
>> directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
>>
>>> If CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=y, I could
>>> believe that the situations are similar to the above.
>>>
>>> What if CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=n?  Then
>>> I assume there's nothing to unbind the driver, so pcie_do_recovery()
>>> will call the driver .mmio_enabled() and other recovery callbacks on a
>>> device that hasn't been initialized?
>> probably in .slot_reset() callback device config will be restored and it
>> will make the device functional again.
> I don't think .mmio_enabled() is a problem because IIUC, the device
> should not have been reset before calling .mmio_enabled().
In the hotplug case, it is possible. Since reset_link() is called
before .mmio_enabled(), the device might be in the reset state by the
time .mmio_enabled() is called.
>
> But I think .slot_reset() *is* a problem.  I looked at several
> .slot_reset() implementations ([3]); some called pci_restore_state(),
> but many did not.
>
> If no hotplug driver is enabled, I think the .slot_reset() callbacks
> that do not call pci_restore_state() are broken.
Yes, agreed. Maybe the documentation needs to be explicit about it?
>
>> Also since in above case hotplug is not supported, topology change will
>> not be supported.
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/?h=review/edr
> [2] F='\.resume'; git grep -A 10 "struct pci_error_handlers" | grep "$F\s*=" | sed -e "s/.*$F\s*=\s*//" -e 's/,\s*$//'
> [3] F='\.slot_reset'; git grep -A 10 "struct pci_error_handlers" | grep "$F\s*=" | sed -e "s/.*$F\s*=\s*//" -e 's/,\s*$//'

-- 
Sathyanarayanan Kuppuswamy
Linux kernel developer



* Re: [PATCH v17 03/12] PCI/ERR: Remove service dependency in pcie_do_recovery()
  2020-03-04  2:36 ` [PATCH v17 03/12] PCI/ERR: Remove service dependency in pcie_do_recovery() sathyanarayanan.kuppuswamy
@ 2020-03-17 14:40   ` Christoph Hellwig
  0 siblings, 0 replies; 68+ messages in thread
From: Christoph Hellwig @ 2020-03-17 14:40 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy; +Cc: bhelgaas, linux-pci, linux-kernel, ashok.raj

> -static pci_ers_result_t reset_link(struct pci_dev *dev, u32 service)
> +static pci_ers_result_t reset_link(struct pci_dev *dev,
> +			pci_ers_result_t (*reset_cb)(struct pci_dev *pdev))
>  {
>  	pci_ers_result_t status;
> -	struct pcie_port_service_driver *driver = NULL;
>  
> -	driver = pcie_port_find_service(dev, service);
> -	if (driver && driver->reset_link) {
> -		status = driver->reset_link(dev);
> +	if (reset_cb) {
> +		status = reset_cb(dev);

As far as I can tell reset_cb is never NULL.  So all the code below
is dead, and the remainder of reset_link is so trivial that it can
be inlined into the only caller.
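
Roughly what I have in mind, as a standalone model rather than a
patch: every type and name below is a stand-in I made up for
illustration, and I am assuming the only thing left after the call is
mapping any non-recovered result to "disconnect".

```c
/* Stand-ins for the kernel types and result codes; not the real ones. */
enum ers_result { ERS_DISCONNECT, ERS_RECOVERED };

struct model_dev {
	int link_up;
};

typedef enum ers_result (*reset_link_fn)(struct model_dev *dev);

/* Example callback that brings the link back up. */
static enum ers_result model_reset_ok(struct model_dev *dev)
{
	dev->link_up = 1;
	return ERS_RECOVERED;
}

/* Example callback that fails to bring the link back up. */
static enum ers_result model_reset_fail(struct model_dev *dev)
{
	dev->link_up = 0;
	return ERS_DISCONNECT;
}

/*
 * Frozen-state step of a recovery routine with the helper folded in:
 * the callback is invoked directly, with no NULL check and no fallback
 * path, because every caller is assumed to pass a valid function.
 */
static enum ers_result model_recover_frozen(struct model_dev *dev,
					    reset_link_fn reset_link)
{
	enum ers_result status = reset_link(dev);

	if (status != ERS_RECOVERED)
		return ERS_DISCONNECT;

	return status;
}
```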


* Re: [PATCH v17 05/12] PCI: portdrv: remove reset_link member from pcie_port_service_driver
  2020-03-04  2:36 ` [PATCH v17 05/12] PCI: portdrv: remove reset_link member from pcie_port_service_driver sathyanarayanan.kuppuswamy
@ 2020-03-17 14:41   ` Christoph Hellwig
  2020-03-17 14:55     ` Kuppuswamy, Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Christoph Hellwig @ 2020-03-17 14:41 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy; +Cc: bhelgaas, linux-pci, linux-kernel, ashok.raj

On Tue, Mar 03, 2020 at 06:36:28PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> reset_link member in struct pcie_port_service_driver was
> mainly added to let pcie_do_recovery() trigger the driver
> specific reset_link() on PCIe fatal errors. But after
> modifying the pcie_do_recovery() function to accept reset_link
> callback as function parameter, we no longer have need to use
> or set reset_link in struct pcie_port_service_driver. So remove
> it.

This should be folded into
"PCI/ERR: Remove service dependency in pcie_do_recovery()"


* Re: [PATCH v17 06/12] Documentation: PCI: Remove reset_link references
  2020-03-04  2:36 ` [PATCH v17 06/12] Documentation: PCI: Remove reset_link references sathyanarayanan.kuppuswamy
@ 2020-03-17 14:42   ` Christoph Hellwig
  2020-03-17 15:05     ` Kuppuswamy, Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Christoph Hellwig @ 2020-03-17 14:42 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy; +Cc: bhelgaas, linux-pci, linux-kernel, ashok.raj

On Tue, Mar 03, 2020 at 06:36:29PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> After pcie_do_recovery() refactor, instead of reset_link
> member in struct pcie_port_service_driver, we use reset_cb
> callback parameter in pcie_do_recovery() function to pass
> the service driver specific reset_link function. So modify
> the Documentation to reflect the latest changes.

This should be folded into the patch removing the method.


* Re: [PATCH v17 10/12] PCI/DPC: Export DPC error recovery functions
  2020-03-04  2:36 ` [PATCH v17 10/12] PCI/DPC: Export DPC error recovery functions sathyanarayanan.kuppuswamy
@ 2020-03-17 14:43   ` Christoph Hellwig
  0 siblings, 0 replies; 68+ messages in thread
From: Christoph Hellwig @ 2020-03-17 14:43 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy; +Cc: bhelgaas, linux-pci, linux-kernel, ashok.raj

Nothing is actually exported here (fortunately!).


* Re: [PATCH v17 05/12] PCI: portdrv: remove reset_link member from pcie_port_service_driver
  2020-03-17 14:41   ` Christoph Hellwig
@ 2020-03-17 14:55     ` Kuppuswamy, Sathyanarayanan
  0 siblings, 0 replies; 68+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-03-17 14:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bhelgaas, linux-pci, linux-kernel, ashok.raj

Hi,

On 3/17/20 7:41 AM, Christoph Hellwig wrote:
> On Tue, Mar 03, 2020 at 06:36:28PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> This should be folded into
> "PCI/ERR: Remove service dependency in pcie_do_recovery()"
I think Bjorn already folded them together. Please check.
https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=review/edr&id=7a18dc6506f108db3dc40f5cd779bc15270c4183

> 


* Re: [PATCH v17 06/12] Documentation: PCI: Remove reset_link references
  2020-03-17 14:42   ` Christoph Hellwig
@ 2020-03-17 15:05     ` Kuppuswamy, Sathyanarayanan
  2020-03-17 15:07       ` Christoph Hellwig
  0 siblings, 1 reply; 68+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-03-17 15:05 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: bhelgaas, linux-pci, linux-kernel, ashok.raj



On 3/17/20 7:42 AM, Christoph Hellwig wrote:
> On Tue, Mar 03, 2020 at 06:36:29PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>
> 
> This should be folded into the patch removing the method.
This one is also folded into the same commit:
https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=review/edr&id=7a18dc6506f108db3dc40f5cd779bc15270c4183
> 


* Re: [PATCH v17 06/12] Documentation: PCI: Remove reset_link references
  2020-03-17 15:05     ` Kuppuswamy, Sathyanarayanan
@ 2020-03-17 15:07       ` Christoph Hellwig
  2020-03-17 16:03         ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Christoph Hellwig @ 2020-03-17 15:07 UTC (permalink / raw)
  To: Kuppuswamy, Sathyanarayanan
  Cc: Christoph Hellwig, bhelgaas, linux-pci, linux-kernel, ashok.raj

On Tue, Mar 17, 2020 at 08:05:50AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > 
> > 
> > This should be folded into the patch removing the method.
> This is also folded in the mentioned patch.
> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=review/edr&id=7a18dc6506f108db3dc40f5cd779bc15270c4183

I can't find that series anywhere on the list.  What did I miss?


* Re: [PATCH v17 06/12] Documentation: PCI: Remove reset_link references
  2020-03-17 15:07       ` Christoph Hellwig
@ 2020-03-17 16:03         ` Bjorn Helgaas
  2020-03-17 17:06           ` Christoph Hellwig
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-17 16:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kuppuswamy, Sathyanarayanan, Bjorn Helgaas, linux-pci, LKML, Raj, Ashok

On Tue, Mar 17, 2020 at 10:09 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Tue, Mar 17, 2020 at 08:05:50AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> > > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > >
> > >
> > > This should be folded into the patch removing the method.
> > This is also folded in the mentioned patch.
> > https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=review/edr&id=7a18dc6506f108db3dc40f5cd779bc15270c4183
>
> I can't find that series anywhere on the list.  What did I miss?

We've still been discussing other issues (access to AER registers,
synchronization between EDR and hotplug, etc) in other parts of this
thread.  The git branch Sathy pointed to above is my local branch.
I'll send it to the list before putting it into -next, but I wanted to
make progress on some of these other issues first.

Bjorn


* Re: [PATCH v17 06/12] Documentation: PCI: Remove reset_link references
  2020-03-17 16:03         ` Bjorn Helgaas
@ 2020-03-17 17:06           ` Christoph Hellwig
  2020-03-19 22:52             ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Christoph Hellwig @ 2020-03-17 17:06 UTC (permalink / raw)
  To: bjorn
  Cc: Christoph Hellwig, Kuppuswamy, Sathyanarayanan, Bjorn Helgaas,
	linux-pci, LKML, Raj, Ashok

On Tue, Mar 17, 2020 at 11:03:36AM -0500, Bjorn Helgaas wrote:
> On Tue, Mar 17, 2020 at 10:09 AM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Tue, Mar 17, 2020 at 08:05:50AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> > > > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > > >
> > > >
> > > > This should be folded into the patch removing the method.
> > > This is also folded in the mentioned patch.
> > > https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=review/edr&id=7a18dc6506f108db3dc40f5cd779bc15270c4183
> >
> > I can't find that series anywhere on the list.  What did I miss?
> 
> We've still been discussing other issues (access to AER registers,
> synchronization between EDR and hotplug, etc) in other parts of this
> thread.  The git branch Sathy pointed to above is my local branch.
> I'll send it to the list before putting it into -next, but I wanted to
> make progress on some of these other issues first.

A few nitpicks:

PCI/ERR: Update error status after reset_link():

 - there are two "if (state == pci_channel_io_frozen)"
   right after each other now, merging them would make the code a little
   easier to read.

PCI/DPC: Move DPC data into struct pci_dev:

 - dpc_rp_extensions probably should be a "bool : 1"

PCI/ERR: Remove service dependency in pcie_do_recovery():

 - as mentioned to Kuppuswamy, reset_cb is never NULL, so a lot of
   dead code in reset_link can be removed.  Also, reset_link should be
   merged into pcie_do_recovery.  That would also allow the argument
   to be named reset_link, which might be a bit more descriptive than
   reset_cb.

PCI/DPC: Cache DPC capabilities in pci_init_capabilities():

 - I think the pci_dpc_init could be cleaned up a bit to:

	...
	pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CAP, &cap);
	if (!(cap & PCI_EXP_DPC_CAP_RP_EXT))
		return;
	pdev->dpc_rp_extensions = true;
	pdev->dpc_rp_log_size = (cap & PCI_EXP_DPC_RP_PIO_LOG_SIZE) >> 8;
	...
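
For anyone who wants to check the field extraction outside the kernel,
here is a standalone sketch of the same logic.  The struct and
function names are made up, and the two mask values are my
recollection of the corresponding PCI_EXP_DPC_* definitions in
pci_regs.h, so treat them as assumptions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror PCI_EXP_DPC_CAP_RP_EXT and PCI_EXP_DPC_RP_PIO_LOG_SIZE. */
#define MODEL_DPC_CAP_RP_EXT		0x0020u	/* Root Port Extensions bit */
#define MODEL_DPC_RP_PIO_LOG_SIZE	0x0F00u	/* RP PIO Log Size field */

/* Hypothetical container for the two cached values. */
struct model_dpc_info {
	bool rp_extensions;
	uint8_t rp_log_size;
};

/* Decode a DPC capability word the way the snippet above does. */
static struct model_dpc_info model_parse_dpc_cap(uint16_t cap)
{
	struct model_dpc_info info = { false, 0 };

	/* Without RP extensions there is no log size to cache. */
	if (!(cap & MODEL_DPC_CAP_RP_EXT))
		return info;

	info.rp_extensions = true;
	info.rp_log_size = (cap & MODEL_DPC_RP_PIO_LOG_SIZE) >> 8;
	return info;
}
```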


* Re: [PATCH v17 06/12] Documentation: PCI: Remove reset_link references
  2020-03-17 17:06           ` Christoph Hellwig
@ 2020-03-19 22:52             ` Bjorn Helgaas
  0 siblings, 0 replies; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-19 22:52 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: bjorn, Kuppuswamy, Sathyanarayanan, linux-pci, LKML, Raj, Ashok

On Tue, Mar 17, 2020 at 10:06:54AM -0700, Christoph Hellwig wrote:
> On Tue, Mar 17, 2020 at 11:03:36AM -0500, Bjorn Helgaas wrote:
> > On Tue, Mar 17, 2020 at 10:09 AM Christoph Hellwig <hch@infradead.org> wrote:
> > > On Tue, Mar 17, 2020 at 08:05:50AM -0700, Kuppuswamy, Sathyanarayanan wrote:
> > > > > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > > >
> > > > > This should be folded into the patch removing the method.
> > > > This is also folded in the mentioned patch.
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=review/edr&id=7a18dc6506f108db3dc40f5cd779bc15270c4183
> > >
> > > I can't find that series anywhere on the list.  What did I miss?
> > 
> > We've still been discussing other issues (access to AER registers,
> > synchronization between EDR and hotplug, etc) in other parts of this
> > thread.  The git branch Sathy pointed to above is my local branch.
> > I'll send it to the list before putting it into -next, but I wanted to
> > make progress on some of these other issues first.
> 
> A few nitpicks:
> 
> PCI/ERR: Update error status after reset_link():
> 
>  - there are two "if (state == pci_channel_io_frozen)"
>    right after each other now, merging them would make the code a little
>    easier to read.

Merged, thanks.

> PCI/DPC: Move DPC data into struct pci_dev:
> 
>  - dpc_rp_extensions probable should be a "bool : 1"

I actually had not seen "bool : 1" used, but you're right, there are
several.  There aren't any in drivers/pci, though, so I'm inclined to
stay consistent with "unsigned int : 1" unless there's an advantage,
and then I'd probably convert all of drivers/pci over.

My rule of thumb has been [1], where Linus suggests "unsigned int
percpu:1", but maybe that should be updated.
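
One behavioral difference I'm aware of, shown as a small standalone
sketch (the struct and helper names are made up): storing a value
other than 0 or 1 keeps only the low bit in an "unsigned int : 1"
field, but is converted to true in a "bool : 1" field.

```c
#include <stdbool.h>

struct model_flags_uint {
	unsigned int flag:1;
};

struct model_flags_bool {
	bool flag:1;
};

/* Store v in a 1-bit unsigned field; only the low bit survives. */
static unsigned int model_store_uint(unsigned int v)
{
	struct model_flags_uint f = { 0 };

	f.flag = v;
	return f.flag;
}

/* Store v in a 1-bit bool field; any nonzero value becomes 1. */
static unsigned int model_store_bool(unsigned int v)
{
	struct model_flags_bool f = { 0 };

	f.flag = v;
	return f.flag;
}
```

So something like "flag = cap & 0x0020" would only behave as intended
with the bool variant; with "unsigned int : 1" the caller has to
normalize the value first (e.g., with "!!").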

> PCI/ERR: Remove service dependency in pcie_do_recovery():
> 
>  - as mentioned to Kuppuswamy the reset_cb is never NULL, and thus
>    a lot of dead code in reset_link can be removed.

Agreed, thanks, I removed that dead code.

>    Also reset_link should be merged into pcie_do_recovery.  That
>    would also enable to call the argument reset_link, which might be
>    a bit more descriptive than reset_cb.

I didn't do this because it sounds like it might be a separate patch.
But maybe Sathy can do this in the next round?

> PCI/DPC: Cache DPC capabilities in pci_init_capabilities():
> 
>  - I think the pci_dpc_init could be cleaned up a bit to:
> 
> 	...
> 	pci_read_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_CAP, &cap);
> 	if (!(cap & PCI_EXP_DPC_CAP_RP_EXT))
> 		return;
> 	pdev->dpc_rp_extensions = true;
> 	pdev->dpc_rp_log_size = (cap & PCI_EXP_DPC_RP_PIO_LOG_SIZE) >> 8;
> 	...

Nice, thanks!  I made this change, too.

Thanks a lot for reviewing this!

Bjorn


[1] https://lore.kernel.org/linux-arm-kernel/CA+55aFxnePDimkVKVtv3gNmRGcwc8KQ5mHYvUxY8sAQg6yvVYg@mail.gmail.com/


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-13 20:26                                           ` Kuppuswamy Sathyanarayanan
@ 2020-03-19 23:03                                             ` Bjorn Helgaas
  2020-03-19 23:20                                               ` Kuppuswamy, Sathyanarayanan
  0 siblings, 1 reply; 68+ messages in thread
From: Bjorn Helgaas @ 2020-03-19 23:03 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj, Russell Currey,
	Sam Bobroff, Oliver O'Halloran

On Fri, Mar 13, 2020 at 01:26:28PM -0700, Kuppuswamy Sathyanarayanan wrote:
> On 3/13/20 12:28 PM, Bjorn Helgaas wrote:
> > [+cc Russell, Sam, Oliver since we're talking about the error recovery
> > flow.  The code we're talking about is at [1]]
> > 
> > On Thu, Mar 12, 2020 at 11:22:13PM -0700, Kuppuswamy, Sathyanarayanan wrote:
> > > On 3/12/2020 3:32 PM, Bjorn Helgaas wrote:
> > > > On Thu, Mar 12, 2020 at 02:59:15PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > > > On 3/12/20 12:53 PM, Bjorn Helgaas wrote:
> > > > > > On Wed, Mar 11, 2020 at 04:07:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > > > > > On 3/11/20 3:23 PM, Bjorn Helgaas wrote:
> > > > > > > > Is any synchronization needed here between the EDR path and the
> > > > > > > > hotplug/enumeration path?
> > > > > > > If we want to follow the implementation note step by step (in
> > > > > > > sequence) then we need some synchronization between EDR path and
> > > > > > > enumeration path. But if it's OK to achieve the same end result by
> > > > > > > following steps out of sequence then we don't need to create any
> > > > > > > dependency between EDR and enumeration paths. Currently we follow
> > > > > > > the latter approach.
> > > > > > What would the synchronization look like?
> > > > > we might need some way to disable the enumeration path till
> > > > > we get response from firmware.
> > > > > 
> > > > > In native hot plug case, I think we can do it in two ways.
> > > > > 
> > > > > 1. Disable hotplug notification in slot ctl registers.
> > > > >       (pcie_disable_notification())
> > > > > 2. Some how block hotplug driver from processing the new
> > > > >       events (not sure how feasible its).
> > > > > 
> > > > > Following method 1 would be easy, But I am not sure whether
> > > > > its alright to disable them randomly. I think, unless we
> > > > > clear the status as well, we might get some issues due to stale
> > > > > notification history.
> > > > > 
> > > > > For ACPI event case, I am not sure whether we have some
> > > > > communication protocol in place to disable receiving ACPI
> > > > > events temporarily.
> > > > > 
> > > > > For polling model, we need to disable to the polling
> > > > > timer thread till we receive _OST response from firmware.
> > > > > > Ideally I think it would be better to follow the order in the
> > > > > > flowchart if it's not too onerous.
> > > > > None of the above changes will be pretty and I think it will
> > > > > not be simple as well.
> > > > > >     That will make the code easier to
> > > > > > understand.  The current situation with this dependency on pciehp and
> > > > > > what it will do leaves a lot of things implicit.
> > > > > > 
> > > > > > What happens if CONFIG_PCIE_EDR=y but CONFIG_HOTPLUG_PCI_PCIE=n?
> > > > > > 
> > > > > > IIUC, when DPC triggers, pciehp is what fields the DLLSC interrupt and
> > > > > > unbinds the drivers and removes the devices.
> > > > > >    If that doesn't happen, and Linux clears the DPC trigger to bring
> > > > > >    the link back up, will those drivers try to operate uninitialized
> > > > > >    devices?
> > > > > I don't think this will happen. In DPC reset_link before we bring up
> > > > > the device we wait for link to go down first using
> > > > > pcie_wait_for_link(pdev, false) function.
> > > > I understand that, but these child devices were reset when DPC
> > > > disabled the link.  When the link comes back up, their BARs
> > > > contain zeros.
> > > > 
> > > > If CONFIG_HOTPLUG_PCI_PCIE=y, the DLLSC interrupt will cause
> > > > pciehp to unbind the driver.  It seems like the unbind races with
> > > > the EDR notify handler.
> > > Agree. But even if there is a race condition, after clearing DPC
> > > trigger status, if hotplug driver properly removes/re-enumerates the
> > > driver then the end result will still be same. There should be no
> > > functional impact.
> > > 
> > > > If pciehp unbinds the driver before edr_handle_event() calls
> > > > pcie_do_recovery(), this seems fine -- we'll call
> > > > dpc_reset_link(), which brings up the link, we won't call any
> > > > driver callbacks because there's no driver, and another DLLSC
> > > > interrupt will cause pciehp to re-enumerate, which will
> > > > re-initialize the device, then rebind the driver.
> > > > 
> > > > If the EDR notify handler runs before pciehp unbinds the driver,
> > > In the above case, from the kernel perspective device is still
> > > accessible and IIUC, it will try to recover it in pcie_do_recovery()
> > > using one of the callbacks.
> > > 
> > > int (*mmio_enabled)(struct pci_dev *dev);
> > > int (*slot_reset)(struct pci_dev *dev);
> > > void (*resume)(struct pci_dev *dev);
> > > 
> > > One of these callbacks will do pci_restore_state() to restore the
> > > device, and IO will not attempted in these callbacks until the device
> > > is successfully recovered.
> > That might be what *should* happen, but I don't think it's what
> > *does* happen.
> > 
> > I don't think we use .mmio_enabled() and .slot_reset() for EDR
> > because Linux EDR currently depends on DPC, so we'll be using
> > dpc_reset_link(), which normally returns PCI_ERS_RESULT_RECOVERED,
> > so pcie_do_recovery() skips .mmio_enabled() and .slot_reset().
> After our discussion about non-hotplug cases, I am thinking
> that reset_link() callback should not return
> PCI_ERS_RESULT_RECOVERED in non hotplug cases. If
> successfully reset-ed, it should return PCI_ERS_RESULT_NEED_RESET.
> This will enable pcie_do_recovery() to proceed to .slot_reset() to
> successfully recover the device.
> 
> Any comments ?
> > 
> > I looked at the first few .resume() implementations (FWIW, I used [2]
> > to find them), and none of them calls pci_restore_state() before doing
> > I/O to the device:
> > 
> >    ioat_pcie_error_resume()
> >    pci_resume() (hfi1)
> >    qib_pci_resume()
> >    cxl_pci_resume()
> >    genwqe_err_resume()
> >    ...
> > 
> > But I assume you've tested EDR with some driver that *does* call
> > pci_restore_state()?  Or maybe you have pciehp enabled,
> Yes. I have tested it only with hotplug enabled. Let me try to disable
> hotplug and verify the cases.
> > and it always
> > wins the race and unbinds the driver before the EDR notification?  It
> > would be interesting to make pciehp *lose* the race and see if
> > anything breaks.
> > 
> > pci-error-recovery.rst does not mention any requirement for the driver
> > to call pci_restore_state(), and I think any state restoration like
> > that should be the responsibility of the PCI core, not the driver.
> > 
> > > > couldn't EDR bring up the link and call driver .mmio_enabled() before
> > > > the device has been initialized?
> > > Calling mmio_enabled in this case should not be a problem right?
> > > 
> > > Please check the following content from
> > > Documentation/PCI/pci-error-recovery.rst. IIUC (following content),
> > > IO will not be attempted until the device is successfully
> > > re-configured.
> > > 
> > > STEP 2: MMIO Enabled
> > > --------------------
> > > This callback is made if all drivers on a segment agree that they can
> > > try to recover and if no automatic link reset was performed by the HW.
> > > If the platform can't just re-enable IOs without a slot reset or a link
> > > reset, it will not call this callback, and instead will have gone
> > > directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
> > > 
> > > > If CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=y, I could
> > > > believe that the situations are similar to the above.
> > > > 
> > > > What if CONFIG_HOTPLUG_PCI_PCIE=n and CONFIG_HOTPLUG_PCI_ACPI=n?  Then
> > > > I assume there's nothing to unbind the driver, so pcie_do_recovery()
> > > > will call the driver .mmio_enabled() and other recovery callbacks on a
> > > > device that hasn't been initialized?
> > > probably in .slot_reset() callback device config will be restored and it
> > > will make the device functional again.
> > I don't think .mmio_enabled() is a problem because IIUC, the device
> > should not have been reset before calling .mmio_enabled().
> In hotplug case, it is possible. since reset_link() is called before
> .mmio_enabled, the device might be in reset state by the time
> .mmio_enabled is called.
> > 
> > But I think .slot_reset() *is* a problem.  I looked at several
> > .slot_reset() implementations ([3]); some called pci_restore_state(),
> > but many did not.
> > 
> > If no hotplug driver is enabled, I think the .slot_reset() callbacks
> > that do not call pci_restore_state() are broken.
> Yes. Agree. May be the documentation needs to be explicit about it ?

Sorry, I got distracted here and lost the flow of the conversation.
I haven't been able to think about the synchronization question and
your comments above yet.

I do think that when pci_restore_state() is required, it should be
done by the PCI core, not by the drivers.  But I think that's out of
scope for this series, so probably a project for later.

I made a few of the updates Christoph suggested and updated the
review/edr branch.  Do you want to start with that as the basis for a
v18 posting?


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-19 23:03                                             ` Bjorn Helgaas
@ 2020-03-19 23:20                                               ` Kuppuswamy, Sathyanarayanan
  0 siblings, 0 replies; 68+ messages in thread
From: Kuppuswamy, Sathyanarayanan @ 2020-03-19 23:20 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Austin.Bolen, linux-pci, linux-kernel, ashok.raj, Russell Currey,
	Sam Bobroff, Oliver O'Halloran

Hi Bjorn,

On 3/19/20 4:03 PM, Bjorn Helgaas wrote:
> I made a few of the updates Christoph suggested and updated the
> review/edr branch.  Do you want to start with that as the basis for a
> v18 posting?

Do you want to merge your version of the patch set first, or wait until
we address all of the following open issues?

1. Move the reset_link callback to pcie_do_recovery().
2. If a recovery issue exists in the non-hotplug case, investigate
   and fix it (this needs more testing and review).
3. Synchronize the EDR handler with the hotplug DLLSC state change handler.


* Re: [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode
  2020-03-10  4:28     ` Kuppuswamy, Sathyanarayanan
@ 2020-03-10 18:14       ` Austin.Bolen
  2020-03-10 19:32         ` Bjorn Helgaas
  0 siblings, 1 reply; 68+ messages in thread
From: Austin.Bolen @ 2020-03-10 18:14 UTC (permalink / raw)
  To: sathyanarayanan.kuppuswamy, helgaas, Austin.Bolen
  Cc: linux-pci, linux-kernel, ashok.raj

On 3/9/2020 11:28 PM, Kuppuswamy, Sathyanarayanan wrote:
> 
> Hi Bjorn,
> 
> On 3/9/2020 7:40 PM, Bjorn Helgaas wrote:
>> [+cc Austin, tentative Linux patches on this git branch:
>> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/tree/drivers/pci/pcie?h=review/edr]
>>
>> On Tue, Mar 03, 2020 at 06:36:32PM -0800, sathyanarayanan.kuppuswamy@linux.intel.com wrote:
>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>
>>> As per PCI firmware specification r3.2 System Firmware Intermediary
>>> (SFI) _OSC and DPC Updates ECR
>>> (https://members.pcisig.com/wg/PCI-SIG/document/13563), sec titled "DPC
>>> Event Handling Implementation Note", page 10, Error Disconnect Recover
>>> (EDR) support allows OS to handle error recovery and clearing Error
>>> Registers even in FF mode. So create new API pci_aer_raw_clear_status()
>>> which allows clearing AER registers without FF mode checks.
>>
>> I see that this ECR was released as an ECN a few days ago:
>> https://members.pcisig.com/wg/PCI-SIG/document/14076
>> Regrettably the title in the PDF still says "ECR" (the rendered title
>> *page* says "ENGINEERING CHANGE NOTIFICATION", but some metadata
>> buried in the file says "ECR - SFI _OSC Support and DPC Updates".

I'll see if PCI-SIG can update the metadata and repost.

>>
>> Anyway, I think I see the note you refer to (now on page 12):
>>
>>     IMPLEMENTATION NOTE
>>     DPC Event Handling
>>
>>     The flow chart below documents the behavior when firmware maintains
>>     control of AER and DPC and grants control of PCIe Hot-Plug to the
>>     operating system.
>>
>>     ...
>>
>>     Capture and clear device AER status. OS may choose to offline
>>     devices3, either via SW (not load driver) or HW (power down device,
>>     disable Link5,6,7). Otherwise process _HPX, complete device
>>     enumeration, load drivers
>>
>> This clearly suggests that the OS should clear device AER status.
>> However, according to the intro text, firmware has retained control of
>> AER, so what gives the OS the right to clear AER status?
>>
>> The Downstream Port Containment Related Enhancements ECN (sec 4.5.1,
>> table 4-6) contains an exception that allows the OS to read/write
>> DPC registers during recovery.  But
>>
>>     - that is for *DPC* registers, not for AER registers, and
>>
>>     - that exception only applies between OS receipt of the EDR
>>       notification and OS release of DPC by clearing the DPC Trigger
>>       Status bit.
>>
>> The flowchart in the SFI ECN shows the OS releasing DPC before
>> clearing AER status:
>>
>>     - Receive EDR notification
>>
>>     - Cleanup - Notify and unload child drivers below Port
>>
>>     - Bring Port out of DPC, clear port error status, assign bus numbers
>>       to child devices.
>>
>>       I assume this box includes clearing DPC error status and clearing
>>       Trigger Status?  They seem to be out of order in the box.

The OS clears the DPC Trigger Status bit, which will bring the port below
it out of containment. Then the OS will clear the "port" error status
bits (i.e., the AER and DPC status bits in the root port or downstream
port that triggered containment). I don't think it would hurt to do
these two steps in reverse order, but I don't think it is necessary.
Note that error status bits for devices below the port in containment
are cleared later, after f/w has a chance to log them.
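
To make the ordering concrete, here is a minimal user-space sketch of the
two steps described above. It is only an illustration, not kernel code:
struct sim_port, the sim_* helpers and SIM_DPC_TRIGGER are hypothetical
names, and the model assumes for simplicity that every status bit is
write-1-to-clear (RW1C).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a port; not the kernel's struct pci_dev. */
struct sim_port {
	uint16_t dpc_status;       /* bit 0 models DPC Trigger Status */
	uint32_t aer_uncor_status; /* models AER uncorrectable error status */
};

#define SIM_DPC_TRIGGER 0x0001u

/* RW1C semantics: writing 1 clears a bit, writing 0 leaves it unchanged. */
static void sim_rw1c16(uint16_t *reg, uint16_t val)
{
	*reg &= (uint16_t)~val;
}

static void sim_rw1c32(uint32_t *reg, uint32_t val)
{
	*reg &= ~val;
}

/* The port is in containment while the trigger bit is set. */
static int sim_in_containment(const struct sim_port *p)
{
	return (p->dpc_status & SIM_DPC_TRIGGER) != 0;
}

static void sim_release_port(struct sim_port *p)
{
	assert(p != NULL);

	/* Step 1: clear Trigger Status, which ends containment. */
	sim_rw1c16(&p->dpc_status, SIM_DPC_TRIGGER);

	/* Step 2: clear the remaining "port" status bits (DPC and AER). */
	sim_rw1c16(&p->dpc_status, p->dpc_status);
	sim_rw1c32(&p->aer_uncor_status, p->aer_uncor_status);
}
```

In this model, swapping the two steps leaves the port in the same final
state, which matches the remark that the reverse order should not hurt.
Status bits of devices below the port are deliberately not touched here.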

>>
>>     - Evaluate _OST
>>
>>     - Capture and clear device AER status.
>>
>>       This seems suspect to me.  Where does it say the OS is allowed to
>>       write AER status when firmware retains control of AER?
>>
>> This patch series does things in this order:
>>
>>     - Receive EDR notification (edr_handle_event(), edr.c)
>>
>>     - Read, log, and clear DPC error regs (dpc_process_error(), dpc.c).
>>
>>       This also clears AER uncorrectable error status when the relevant
>>       HEST entries do not have the FIRMWARE_FIRST bit set.  I think this
>>       is incorrect: the test should be based the _OSC negotiation for
>>       AER ownership, not on the HEST entries.  But this problem
>>       pre-dates this patch series.
>>
>>     - Clear AER status (pci_aer_raw_clear_status(), aer.c).
>>
>>       This is at least inside the EDR recovery window, but again, I
>>       don't see where it says the OS is allowed to write the AER status.
> 
> Implementation note is the only reference we have regarding clearing the
> AER registers.
> 
> But since the spec says both DPC and AER needs to be always controlled
> together by the either OS or firmware, and when firmware relinquishes
> control over DPC registers in EDR notification window, we can assume
> that we also have control over AER registers.
> 
> But I agree that is not explicitly spelled out any where outside the
> implementation note.
> 
> 
> Austin,
> 
> May be ECN (section 4.5.1, table 4-6) needs to be updated to add this
> clarification.

Sure, we can update section 4.5.1, table 4-6 to indicate when the OS can
clear the AER status bits. It will just follow what's done in the
implementation note, so I think it's acceptable to follow the
implementation guidance for now.

> 
>>
>>     - Attempt recovery (pcie_do_recovery(), err.c)
>>
>>     - Clear DPC Trigger Status (dpc_reset_link(), dpc.c)
>>
>>     - Evaluate _OST (acpi_send_edr_status(), edr.c)
>>
>> What am I missing?
>>
>>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>> ---
>>>    drivers/pci/pci.h      |  2 ++
>>>    drivers/pci/pcie/aer.c | 22 ++++++++++++++++++----
>>>    2 files changed, 20 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>> index e57e78b619f8..c239e6dd2542 100644
>>> --- a/drivers/pci/pci.h
>>> +++ b/drivers/pci/pci.h
>>> @@ -655,6 +655,7 @@ extern const struct attribute_group aer_stats_attr_group;
>>>    void pci_aer_clear_fatal_status(struct pci_dev *dev);
>>>    void pci_aer_clear_device_status(struct pci_dev *dev);
>>>    int pci_cleanup_aer_error_status_regs(struct pci_dev *dev);
>>> +int pci_aer_raw_clear_status(struct pci_dev *dev);
>>>    #else
>>>    static inline void pci_no_aer(void) { }
>>>    static inline void pci_aer_init(struct pci_dev *d) { }
>>> @@ -665,6 +666,7 @@ static inline int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>>>    {
>>>    	return -EINVAL;
>>>    }
>>> +int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
>>>    #endif
>>>    
>>>    #ifdef CONFIG_ACPI
>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>> index c0540c3761dc..41afefa562b7 100644
>>> --- a/drivers/pci/pcie/aer.c
>>> +++ b/drivers/pci/pcie/aer.c
>>> @@ -420,7 +420,16 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>>>    		pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
>>>    }
>>>    
>>> -int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>>> +/**
>>> + * pci_aer_raw_clear_status - Clear AER error registers.
>>> + * @dev: the PCI device
>>> + *
>>> + * NOTE: Allows clearing error registers in both FF and
>>> + * non FF modes.
>>> + *
>>> + * Returns 0 on success, or negative on failure.
>>> + */
>>> +int pci_aer_raw_clear_status(struct pci_dev *dev)
>>>    {
>>>    	int pos;
>>>    	u32 status;
>>> @@ -433,9 +442,6 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>>>    	if (!pos)
>>>    		return -EIO;
>>>    
>>> -	if (pcie_aer_get_firmware_first(dev))
>>> -		return -EIO;
>>> -
>>>    	port_type = pci_pcie_type(dev);
>>>    	if (port_type == PCI_EXP_TYPE_ROOT_PORT) {
>>>    		pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);
>>> @@ -451,6 +457,14 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>>>    	return 0;
>>>    }
>>>    
>>> +int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>>> +{
>>> +	if (pcie_aer_get_firmware_first(dev))
>>> +		return -EIO;
>>> +
>>> +	return pci_aer_raw_clear_status(dev);
>>> +}
>>> +
>>>    void pci_save_aer_state(struct pci_dev *dev)
>>>    {
>>>    	struct pci_cap_saved_state *save_state;
>>> -- 
>>> 2.25.1
>>>
> 
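
For readers skimming the quoted diff: the change splits the old function
into a "raw" clear that skips the firmware-first (FF) check and a wrapper
that keeps it. The user-space sketch below shows only that shape. It is an
assumption-laden illustration: struct sim_dev and the sim_* functions are
hypothetical stand-ins, not the kernel's struct pci_dev or the real AER
helpers, and the status registers are reduced to plain fields.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device with an AER capability. */
struct sim_dev {
	bool has_aer;        /* models "AER capability found" */
	bool firmware_first; /* models the FF mode check */
	uint32_t uncor_status;
	uint32_t cor_status;
};

/* Raw variant: clears status regardless of FF mode (the EDR path). */
static int sim_raw_clear_status(struct sim_dev *dev)
{
	assert(dev != NULL);

	if (!dev->has_aer)
		return -EIO;

	dev->cor_status = 0;
	dev->uncor_status = 0;
	return 0;
}

/* Checked variant: existing callers still refuse to clear in FF mode. */
static int sim_cleanup_status(struct sim_dev *dev)
{
	assert(dev != NULL);

	if (dev->firmware_first)
		return -EIO;

	return sim_raw_clear_status(dev);
}
```

The point of the split is that existing callers keep their behavior, while
a caller that believes it may clear the registers in FF mode (here, the
EDR path) has to opt in explicitly by calling the raw variant.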



end of thread, other threads:[~2020-03-19 23:20 UTC | newest]

Thread overview: 68+ messages
2020-03-04  2:36 [PATCH v17 00/12] Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
2020-03-04  2:36 ` [PATCH v17 01/12] PCI/ERR: Update error status after reset_link() sathyanarayanan.kuppuswamy
2020-03-04  2:36 ` [PATCH v17 02/12] PCI/AER: Move pci_cleanup_aer_error_status_regs() declaration to pci.h sathyanarayanan.kuppuswamy
2020-03-04  2:36 ` [PATCH v17 03/12] PCI/ERR: Remove service dependency in pcie_do_recovery() sathyanarayanan.kuppuswamy
2020-03-17 14:40   ` Christoph Hellwig
2020-03-04  2:36 ` [PATCH v17 04/12] PCI: portdrv: remove unnecessary pcie_port_find_service() sathyanarayanan.kuppuswamy
2020-03-04  2:36 ` [PATCH v17 05/12] PCI: portdrv: remove reset_link member from pcie_port_service_driver sathyanarayanan.kuppuswamy
2020-03-17 14:41   ` Christoph Hellwig
2020-03-17 14:55     ` Kuppuswamy, Sathyanarayanan
2020-03-04  2:36 ` [PATCH v17 06/12] Documentation: PCI: Remove reset_link references sathyanarayanan.kuppuswamy
2020-03-17 14:42   ` Christoph Hellwig
2020-03-17 15:05     ` Kuppuswamy, Sathyanarayanan
2020-03-17 15:07       ` Christoph Hellwig
2020-03-17 16:03         ` Bjorn Helgaas
2020-03-17 17:06           ` Christoph Hellwig
2020-03-19 22:52             ` Bjorn Helgaas
2020-03-04  2:36 ` [PATCH v17 07/12] PCI/ERR: Return status of pcie_do_recovery() sathyanarayanan.kuppuswamy
2020-03-04  2:36 ` [PATCH v17 08/12] PCI/DPC: Cache DPC capabilities in pci_init_capabilities() sathyanarayanan.kuppuswamy
2020-03-04  2:36 ` [PATCH v17 09/12] PCI/AER: Allow clearing Error Status Register in FF mode sathyanarayanan.kuppuswamy
2020-03-06  5:45   ` Kuppuswamy, Sathyanarayanan
2020-03-06 16:04     ` Bjorn Helgaas
2020-03-06 16:11       ` Kuppuswamy, Sathyanarayanan
2020-03-06 16:41         ` Bjorn Helgaas
2020-03-10  2:40   ` Bjorn Helgaas
2020-03-10  4:28     ` Kuppuswamy, Sathyanarayanan
2020-03-10 18:14       ` Austin.Bolen
2020-03-10 19:32         ` Bjorn Helgaas
2020-03-10 20:06           ` Austin.Bolen
2020-03-10 20:41             ` Kuppuswamy Sathyanarayanan
2020-03-10 20:41               ` Kuppuswamy Sathyanarayanan
2020-03-10 20:49               ` Austin.Bolen
2020-03-11 14:45             ` Bjorn Helgaas
2020-03-11 15:19               ` Austin.Bolen
2020-03-11 17:12                 ` Bjorn Helgaas
2020-03-11 17:27                   ` Austin.Bolen
2020-03-11 20:33                     ` Bjorn Helgaas
2020-03-11 21:25                       ` Kuppuswamy Sathyanarayanan
2020-03-11 21:53                         ` Austin.Bolen
2020-03-11 22:11                           ` Kuppuswamy Sathyanarayanan
2020-03-11 22:23                             ` Bjorn Helgaas
2020-03-11 23:07                               ` Kuppuswamy Sathyanarayanan
2020-03-12 19:53                                 ` Bjorn Helgaas
2020-03-12 21:02                                   ` Austin.Bolen
2020-03-12 21:29                                     ` Kuppuswamy Sathyanarayanan
2020-03-12 21:52                                       ` Bjorn Helgaas
2020-03-12 22:02                                         ` Kuppuswamy Sathyanarayanan
2020-03-12 22:36                                           ` Bjorn Helgaas
2020-03-12 21:59                                   ` Kuppuswamy Sathyanarayanan
2020-03-12 22:32                                     ` Bjorn Helgaas
2020-03-13  6:22                                       ` Kuppuswamy, Sathyanarayanan
2020-03-13 19:28                                         ` Bjorn Helgaas
2020-03-13 20:26                                           ` Kuppuswamy Sathyanarayanan
2020-03-19 23:03                                             ` Bjorn Helgaas
2020-03-19 23:20                                               ` Kuppuswamy, Sathyanarayanan
2020-03-11 22:13                         ` Bjorn Helgaas
2020-03-11 22:41                           ` Kuppuswamy Sathyanarayanan
2020-03-11 18:12                   ` Kuppuswamy Sathyanarayanan
2020-03-11 22:05             ` Bjorn Helgaas
2020-03-04  2:36 ` [PATCH v17 10/12] PCI/DPC: Export DPC error recovery functions sathyanarayanan.kuppuswamy
2020-03-17 14:43   ` Christoph Hellwig
2020-03-04  2:36 ` [PATCH v17 11/12] PCI/DPC: Add Error Disconnect Recover (EDR) support sathyanarayanan.kuppuswamy
2020-03-06  3:47   ` Bjorn Helgaas
2020-03-06  6:32     ` Kuppuswamy, Sathyanarayanan
2020-03-06 21:00       ` Bjorn Helgaas
2020-03-06 22:42         ` Kuppuswamy Sathyanarayanan
2020-03-06 23:23           ` Bjorn Helgaas
2020-03-07  0:19             ` Kuppuswamy Sathyanarayanan
2020-03-04  2:36 ` [PATCH v17 12/12] PCI/ACPI: Enable EDR support sathyanarayanan.kuppuswamy
