LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Sinan Kaya <okaya@codeaurora.org>,
	Bjorn Helgaas <bhelgaas@google.com>
Subject: [PATCH 4.9 27/74] PCI: Wait up to 60 seconds for device to become ready after FLR
Date: Fri, 27 Apr 2018 15:58:17 +0200	[thread overview]
Message-ID: <20180427135711.058892150@linuxfoundation.org> (raw)
In-Reply-To: <20180427135709.899303463@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sinan Kaya <okaya@codeaurora.org>

commit 821cdad5c46cae94ce65b9a98614c70a6ff021f8 upstream.

Sporadic reset issues have been observed with an Intel 750 NVMe drive while
assigning the physical function to the guest machine.  The sequence of
events observed is as follows:

  - perform a Function Level Reset (FLR)
  - sleep up to 1000ms total
  - read ~0 from PCI_COMMAND (CRS completion for config read)
  - warn that the device didn't return from FLR
  - touch the device before it's ready
  - device drops config writes when we restore register settings (there's
    no mechanism for software to learn about CRS completions for writes)
  - incomplete register restore leaves device in inconsistent state
  - device probe fails because device is in inconsistent state

After reset, an endpoint may respond to config requests with Configuration
Request Retry Status (CRS) to indicate that it is not ready to accept new
requests. See PCIe r3.1, sec 2.3.1 and 6.6.2.

Increase the timeout value from 1 second to 60 seconds to cover the period
where device responds with CRS and also report polling progress.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
[bhelgaas: include the mandatory 100ms in the delays we print]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/pci/pci.c |   52 +++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 37 insertions(+), 15 deletions(-)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3756,27 +3756,49 @@ int pci_wait_for_pending_transaction(str
 }
 EXPORT_SYMBOL(pci_wait_for_pending_transaction);
 
-/*
- * We should only need to wait 100ms after FLR, but some devices take longer.
- * Wait for up to 1000ms for config space to return something other than -1.
- * Intel IGD requires this when an LCD panel is attached.  We read the 2nd
- * dword because VFs don't implement the 1st dword.
- */
 static void pci_flr_wait(struct pci_dev *dev)
 {
-	int i = 0;
+	int delay = 1, timeout = 60000;
 	u32 id;
 
-	do {
-		msleep(100);
+	/*
+	 * Per PCIe r3.1, sec 6.6.2, a device must complete an FLR within
+	 * 100ms, but may silently discard requests while the FLR is in
+	 * progress.  Wait 100ms before trying to access the device.
+	 */
+	msleep(100);
+
+	/*
+	 * After 100ms, the device should not silently discard config
+	 * requests, but it may still indicate that it needs more time by
+	 * responding to them with CRS completions.  The Root Port will
+	 * generally synthesize ~0 data to complete the read (except when
+	 * CRS SV is enabled and the read was for the Vendor ID; in that
+	 * case it synthesizes 0x0001 data).
+	 *
+	 * Wait for the device to return a non-CRS completion.  Read the
+	 * Command register instead of Vendor ID so we don't have to
+	 * contend with the CRS SV value.
+	 */
+	pci_read_config_dword(dev, PCI_COMMAND, &id);
+	while (id == ~0) {
+		if (delay > timeout) {
+			dev_warn(&dev->dev, "not ready %dms after FLR; giving up\n",
+				 100 + delay - 1);
+			return;
+		}
+
+		if (delay > 1000)
+			dev_info(&dev->dev, "not ready %dms after FLR; waiting\n",
+				 100 + delay - 1);
+
+		msleep(delay);
+		delay *= 2;
 		pci_read_config_dword(dev, PCI_COMMAND, &id);
-	} while (i++ < 10 && id == ~0);
+	}
 
-	if (id == ~0)
-		dev_warn(&dev->dev, "Failed to return from FLR\n");
-	else if (i > 1)
-		dev_info(&dev->dev, "Required additional %dms to return from FLR\n",
-			 (i - 1) * 100);
+	if (delay > 1000)
+		dev_info(&dev->dev, "ready %dms after FLR\n", 100 + delay - 1);
 }
 
 static int pcie_flr(struct pci_dev *dev, int probe)

  parent reply	other threads:[~2018-04-27 13:58 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-27 13:57 [PATCH 4.9 00/74] 4.9.97-stable review Greg Kroah-Hartman
2018-04-27 13:57 ` [PATCH 4.9 01/74] cifs: do not allow creating sockets except with SMB1 posix exensions Greg Kroah-Hartman
2018-04-27 13:57 ` [PATCH 4.9 02/74] x86/tsc: Prevent 32bit truncation in calc_hpet_ref() Greg Kroah-Hartman
2018-04-27 13:57 ` [PATCH 4.9 03/74] drm/vc4: Fix memory leak during BO teardown Greg Kroah-Hartman
2018-04-27 13:57 ` [PATCH 4.9 04/74] drm/i915: Fix LSPCON TMDS output buffer enabling from low-power state Greg Kroah-Hartman
2018-04-27 13:57 ` [PATCH 4.9 05/74] i2c: i801: store and restore the SLVCMD register at load and unload Greg Kroah-Hartman
2018-04-27 13:57 ` [PATCH 4.9 06/74] i2c: i801: Save register SMBSLVCMD value only once Greg Kroah-Hartman
2018-04-27 13:57 ` [PATCH 4.9 07/74] i2c: i801: Restore configuration at shutdown Greg Kroah-Hartman
2018-04-27 13:57 ` [PATCH 4.9 08/74] usb: musb: fix enumeration after resume Greg Kroah-Hartman
2018-04-27 13:57 ` [PATCH 4.9 09/74] usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 10/74] usb: musb: Fix external abort in musb_remove on omap2430 Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 11/74] MIPS: Generic: Fix big endian CPUs on generic machine Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 12/74] Input: drv260x - fix initializing overdrive voltage Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 13/74] power: supply: bq2415x: check for NULL acpi_id to avoid null pointer dereference Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 14/74] [media] stk-webcam: fix an endian bug in stk_camera_read_reg() Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 15/74] OF: Prevent unaligned access in of_alias_scan() Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 16/74] ath9k_hw: check if the chip failed to wake up Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 17/74] jbd2: fix use after free in kjournald2() Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 18/74] Revert "perf tools: Decompress kernel module when reading DSO data" Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 19/74] perf: Fix sample_max_stack maximum check Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 20/74] perf: Return proper values for user stack errors Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 21/74] RDMA/mlx5: Fix NULL dereference while accessing XRC_TGT QPs Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 22/74] drm/i915/bxt, glk: Increase PCODE timeouts during CDCLK freq changing Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 23/74] mac80211_hwsim: fix use-after-free bug in hwsim_exit_net Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 24/74] r8152: add Linksys USB3GIGV1 id Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 25/74] Revert "pinctrl: intel: Initialize GPIO properly when used through irqchip" Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 26/74] Revert "ath10k: send (re)assoc peer command when NSS changed" Greg Kroah-Hartman
2018-04-27 13:58 ` Greg Kroah-Hartman [this message]
2018-04-27 13:58 ` [PATCH 4.9 28/74] s390: introduce CPU alternatives Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 29/74] s390: enable CPU alternatives unconditionally Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 30/74] KVM: s390: wire up bpb feature Greg Kroah-Hartman
2018-04-27 15:43   ` Christian Borntraeger
2018-04-27 13:58 ` [PATCH 4.9 31/74] s390: scrub registers on kernel entry and KVM exit Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 32/74] s390: add optimized array_index_mask_nospec Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 33/74] s390/alternative: use a copy of the facility bit mask Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 34/74] s390: add options to change branch prediction behaviour for the kernel Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 35/74] s390: run user space and KVM guests with modified branch prediction Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 36/74] s390: introduce execute-trampolines for branches Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 37/74] KVM: s390: force bp isolation for VSIE Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 38/74] s390: Replace IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*) Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 39/74] s390: do not bypass BPENTER for interrupt system calls Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 40/74] s390/entry.S: fix spurious zeroing of r0 Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 41/74] s390: move nobp parameter functions to nospec-branch.c Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 42/74] s390: add automatic detection of the spectre defense Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 43/74] s390: report spectre mitigation via syslog Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 44/74] s390: add sysfs attributes for spectre Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 45/74] s390: correct nospec auto detection init order Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 46/74] s390: correct module section names for expoline code revert Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 47/74] bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 48/74] KEYS: DNS: limit the length of option strings Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 49/74] l2tp: check sockaddr length in pppol2tp_connect() Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 50/74] net: validate attribute sizes in neigh_dump_table() Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 51/74] llc: delete timers synchronously in llc_sk_free() Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 52/74] tcp: dont read out-of-bounds opsize Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 53/74] team: avoid adding twice the same option to the event list Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 54/74] team: fix netconsole setup over team Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 55/74] packet: fix bitfield update race Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 56/74] tipc: add policy for TIPC_NLA_NET_ADDR Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 57/74] pppoe: check sockaddr length in pppoe_connect() Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 58/74] vlan: Fix reading memory beyond skb->tail in skb_vlan_tagged_multi Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 59/74] sctp: do not check port in sctp_inet6_cmp_addr Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 60/74] net: sched: ife: signal not finding metaid Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 61/74] llc: hold llc_sap before release_sock() Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 62/74] llc: fix NULL pointer deref for SOCK_ZAPPED Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 63/74] net: ethernet: ti: cpsw: fix tx vlan priority mapping Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 64/74] net: fix deadlock while clearing neighbor proxy table Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 65/74] tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 66/74] net: af_packet: fix race in PACKET_{R|T}X_RING Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 67/74] ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 68/74] strparser: Fix incorrect strp->need_bytes value Greg Kroah-Hartman
2018-04-27 13:58 ` [PATCH 4.9 69/74] scsi: mptsas: Disable WRITE SAME Greg Kroah-Hartman
2018-04-27 13:59 ` [PATCH 4.9 70/74] cdrom: information leak in cdrom_ioctl_media_changed() Greg Kroah-Hartman
2018-04-27 13:59 ` [PATCH 4.9 71/74] s390/cio: update chpid descriptor after resource accessibility event Greg Kroah-Hartman
2018-04-27 13:59 ` [PATCH 4.9 72/74] s390/dasd: fix IO error for newly defined devices Greg Kroah-Hartman
2018-04-27 13:59 ` [PATCH 4.9 73/74] s390/uprobes: implement arch_uretprobe_is_alive() Greg Kroah-Hartman
2018-04-27 13:59 ` [PATCH 4.9 74/74] ACPI / video: Only default only_lcd to true on Win8-ready _desktops_ Greg Kroah-Hartman
2018-04-27 18:13 ` [PATCH 4.9 00/74] 4.9.97-stable review Shuah Khan
2018-04-27 20:33 ` Dan Rue
2018-04-27 20:44 ` kernelci.org bot
2018-04-28 14:26 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180427135711.058892150@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=bhelgaas@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=okaya@codeaurora.org \
    --cc=stable@vger.kernel.org \
    --subject='Re: [PATCH 4.9 27/74] PCI: Wait up to 60 seconds for device to become ready after FLR' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).