From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Parav Pandit <parav@nvidia.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Sasha Levin <sashal@kernel.org>,
	virtualization@lists.linux-foundation.org
Subject: [PATCH AUTOSEL 5.13 07/26] virtio_pci: Support surprise removal of virtio pci device
Date: Mon, 23 Aug 2021 20:53:37 -0400
Message-ID: <20210824005356.630888-7-sashal@kernel.org>
In-Reply-To: <20210824005356.630888-1-sashal@kernel.org>

From: Parav Pandit <parav@nvidia.com>

[ Upstream commit 43bb40c5b92659966bdf4bfe584fde0a3575a049 ]

When a virtio PCI device undergoes surprise removal (aka async removal in
the PCIe spec), mark the device as broken so that any upper layer driver
can abort any outstanding operation.

When a virtio net PCI device that is in use by NetworkManager undergoes
surprise removal, the call trace below was observed.

kernel:watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [kworker/1:1:27059]
watchdog: BUG: soft lockup - CPU#1 stuck for 52s! [kworker/1:1:27059]
CPU: 1 PID: 27059 Comm: kworker/1:1 Tainted: G S      W I  L    5.13.0-hotplug+ #8
Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.9.4 11/06/2020
Workqueue: events linkwatch_event
RIP: 0010:virtnet_send_command+0xfc/0x150 [virtio_net]
Call Trace:
 virtnet_set_rx_mode+0xcf/0x2a7 [virtio_net]
 ? __hw_addr_create_ex+0x85/0xc0
 __dev_mc_add+0x72/0x80
 igmp6_group_added+0xa7/0xd0
 ipv6_mc_up+0x3c/0x60
 ipv6_find_idev+0x36/0x80
 addrconf_add_dev+0x1e/0xa0
 addrconf_dev_config+0x71/0x130
 addrconf_notify+0x1f5/0xb40
 ? rtnl_is_locked+0x11/0x20
 ? __switch_to_asm+0x42/0x70
 ? finish_task_switch+0xaf/0x2c0
 ? raw_notifier_call_chain+0x3e/0x50
 raw_notifier_call_chain+0x3e/0x50
 netdev_state_change+0x67/0x90
 linkwatch_do_dev+0x3c/0x50
 __linkwatch_run_queue+0xd2/0x220
 linkwatch_event+0x21/0x30
 process_one_work+0x1c8/0x370
 worker_thread+0x30/0x380
 ? process_one_work+0x370/0x370
 kthread+0x118/0x140
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x1f/0x30

Hence, add the ability to abort the command on surprise removal, which
prevents an infinite loop and a system lockup.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Link: https://lore.kernel.org/r/20210721142648.1525924-5-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/virtio/virtio_pci_common.c | 7 +++++++
 1 file changed, 7 insertions(+)
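
Note for readers: the trace shows the CPU stuck in virtnet_send_command(),
which busy-waits for the device to complete a control command. The loop
can only end early if the queue is marked broken. The userspace model
below is a minimal sketch of that interaction, not kernel code. Every
name in it (model_vq, model_dev, model_remove, model_send_command, the
cmd_result values and max_spins) is hypothetical and exists only for
illustration. The spin bound is an assumption added so that the model
terminates; the real loop has no such bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; none of these are kernel APIs. */
struct model_vq {
	bool broken;    /* models the state set by virtio_break_device() */
	bool buf_ready; /* models a completion posted by the device */
};

struct model_dev {
	bool present;   /* models the result of pci_device_is_present() */
	struct model_vq cvq;
};

enum cmd_result {
	CMD_DONE,       /* device completed the command */
	CMD_ABORTED,    /* wait ended because the queue is broken */
	CMD_STUCK,      /* spin bound reached: models the soft lockup */
};

/* Models the remove path in this patch: break the device only if it is gone. */
static void model_remove(struct model_dev *dev)
{
	assert(dev != NULL);
	if (!dev->present)
		dev->cvq.broken = true;
}

/*
 * Models a command that polls for completion. max_spins bounds the loop
 * so that the model always terminates; *spins reports the iterations used.
 */
static enum cmd_result model_send_command(const struct model_vq *vq,
					  unsigned long max_spins,
					  unsigned long *spins)
{
	unsigned long n = 0;

	assert(vq != NULL && spins != NULL);
	while (!vq->buf_ready && !vq->broken && n < max_spins)
		n++;
	*spins = n;

	if (vq->buf_ready)
		return CMD_DONE;
	if (vq->broken)
		return CMD_ABORTED;
	return CMD_STUCK;
}
```

In this model, a device that is absent but not yet marked broken makes
model_send_command() run to the spin bound, while calling model_remove()
first makes it return CMD_ABORTED without spinning.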

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 222d630c41fc..b35bb2d57f62 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -576,6 +576,13 @@ static void virtio_pci_remove(struct pci_dev *pci_dev)
 	struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
 	struct device *dev = get_device(&vp_dev->vdev.dev);
 
+	/*
+	 * Device is marked broken on surprise removal so that virtio upper
+	 * layers can abort any ongoing operation.
+	 */
+	if (!pci_device_is_present(pci_dev))
+		virtio_break_device(&vp_dev->vdev);
+
 	pci_disable_sriov(pci_dev);
 
 	unregister_virtio_device(&vp_dev->vdev);
-- 
2.30.2

