LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v4 1/2] PCI: hv: Serialize the present and eject work items
@ 2018-03-15 14:20 Dexuan Cui
2018-03-15 17:01 ` Lorenzo Pieralisi
0 siblings, 1 reply; 4+ messages in thread
From: Dexuan Cui @ 2018-03-15 14:20 UTC (permalink / raw)
To: lorenzo.pieralisi, bhelgaas, linux-pci, KY Srinivasan, Stephen Hemminger
Cc: olaf, Haiyang Zhang, driverdev-devel, linux-kernel,
Michael Kelley (EOSG),
marcelo.cerri, apw, vkuznets, jasowang
When we hot-remove the device, we first receive a PCI_EJECT message and
then receive a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
The first message is offloaded to hv_eject_device_work(), and the second
is offloaded to pci_devices_present_work(). Both the paths can be running
list_del(&hpdev->list_entry), causing general protection fault, because
system_wq can run them concurrently.
The patch eliminates the race condition.
Tested-by: Adrian Suhov <v-adsuho@microsoft.com>
Tested-by: Chris Valean <v-chvale@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: stable@vger.kernel.org
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
---
drivers/pci/host/pci-hyperv.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 2faf38eab785..5e67dff779a7 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -461,6 +461,8 @@ struct hv_pcibus_device {
struct retarget_msi_interrupt retarget_msi_interrupt_params;
spinlock_t retarget_msi_interrupt_lock;
+
+ struct workqueue_struct *wq;
};
/*
@@ -1770,7 +1772,7 @@ static void hv_pci_devices_present(struct hv_pcibus_device *hbus,
spin_unlock_irqrestore(&hbus->device_list_lock, flags);
get_hvpcibus(hbus);
- schedule_work(&dr_wrk->wrk);
+ queue_work(hbus->wq, &dr_wrk->wrk);
}
/**
@@ -1848,7 +1850,7 @@ static void hv_pci_eject_device(struct hv_pci_dev *hpdev)
get_pcichild(hpdev, hv_pcidev_ref_pnp);
INIT_WORK(&hpdev->wrk, hv_eject_device_work);
get_hvpcibus(hpdev->hbus);
- schedule_work(&hpdev->wrk);
+ queue_work(hpdev->hbus->wq, &hpdev->wrk);
}
/**
@@ -2463,11 +2465,17 @@ static int hv_pci_probe(struct hv_device *hdev,
spin_lock_init(&hbus->retarget_msi_interrupt_lock);
sema_init(&hbus->enum_sem, 1);
init_completion(&hbus->remove_event);
+ hbus->wq = alloc_ordered_workqueue("hv_pci_%x", 0,
+ hbus->sysdata.domain);
+ if (!hbus->wq) {
+ ret = -ENOMEM;
+ goto free_bus;
+ }
ret = vmbus_open(hdev->channel, pci_ring_size, pci_ring_size, NULL, 0,
hv_pci_onchannelcallback, hbus);
if (ret)
- goto free_bus;
+ goto destroy_wq;
hv_set_drvdata(hdev, hbus);
@@ -2536,6 +2544,8 @@ static int hv_pci_probe(struct hv_device *hdev,
hv_free_config_window(hbus);
close:
vmbus_close(hdev->channel);
+destroy_wq:
+ destroy_workqueue(hbus->wq);
free_bus:
free_page((unsigned long)hbus);
return ret;
@@ -2615,6 +2625,7 @@ static int hv_pci_remove(struct hv_device *hdev)
irq_domain_free_fwnode(hbus->sysdata.fwnode);
put_hvpcibus(hbus);
wait_for_completion(&hbus->remove_event);
+ destroy_workqueue(hbus->wq);
free_page((unsigned long)hbus);
return 0;
}
--
2.7.4
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v4 1/2] PCI: hv: Serialize the present and eject work items
2018-03-15 14:20 [PATCH v4 1/2] PCI: hv: Serialize the present and eject work items Dexuan Cui
@ 2018-03-15 17:01 ` Lorenzo Pieralisi
2018-03-15 17:55 ` Dexuan Cui
0 siblings, 1 reply; 4+ messages in thread
From: Lorenzo Pieralisi @ 2018-03-15 17:01 UTC (permalink / raw)
To: Dexuan Cui
Cc: olaf, Stephen Hemminger, linux-pci, Haiyang Zhang,
driverdev-devel, linux-kernel, Michael Kelley (EOSG),
apw, marcelo.cerri, bhelgaas, vkuznets, jasowang
On Thu, Mar 15, 2018 at 02:20:53PM +0000, Dexuan Cui wrote:
> When we hot-remove the device, we first receive a PCI_EJECT message and
> then receive a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
>
> The first message is offloaded to hv_eject_device_work(), and the second
> is offloaded to pci_devices_present_work(). Both the paths can be running
> list_del(&hpdev->list_entry), causing general protection fault, because
> system_wq can run them concurrently.
>
> The patch eliminates the race condition.
>
> Tested-by: Adrian Suhov <v-adsuho@microsoft.com>
> Tested-by: Chris Valean <v-chvale@microsoft.com>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
> Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: stable@vger.kernel.org
I need to know either what commit you are fixing (ie Fixes: tag - which
is preferrable) or you tell me which kernel versions we are targeting
for the stable backport.
Thanks,
Lorenzo
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Cc: Jack Morgenstein <jackm@mellanox.com>
> Cc: Stephen Hemminger <sthemmin@microsoft.com>
> Cc: K. Y. Srinivasan <kys@microsoft.com>
> ---
> drivers/pci/host/pci-hyperv.c | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
> index 2faf38eab785..5e67dff779a7 100644
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -461,6 +461,8 @@ struct hv_pcibus_device {
> struct retarget_msi_interrupt retarget_msi_interrupt_params;
>
> spinlock_t retarget_msi_interrupt_lock;
> +
> + struct workqueue_struct *wq;
> };
>
> /*
> @@ -1770,7 +1772,7 @@ static void hv_pci_devices_present(struct hv_pcibus_device *hbus,
> spin_unlock_irqrestore(&hbus->device_list_lock, flags);
>
> get_hvpcibus(hbus);
> - schedule_work(&dr_wrk->wrk);
> + queue_work(hbus->wq, &dr_wrk->wrk);
> }
>
> /**
> @@ -1848,7 +1850,7 @@ static void hv_pci_eject_device(struct hv_pci_dev *hpdev)
> get_pcichild(hpdev, hv_pcidev_ref_pnp);
> INIT_WORK(&hpdev->wrk, hv_eject_device_work);
> get_hvpcibus(hpdev->hbus);
> - schedule_work(&hpdev->wrk);
> + queue_work(hpdev->hbus->wq, &hpdev->wrk);
> }
>
> /**
> @@ -2463,11 +2465,17 @@ static int hv_pci_probe(struct hv_device *hdev,
> spin_lock_init(&hbus->retarget_msi_interrupt_lock);
> sema_init(&hbus->enum_sem, 1);
> init_completion(&hbus->remove_event);
> + hbus->wq = alloc_ordered_workqueue("hv_pci_%x", 0,
> + hbus->sysdata.domain);
> + if (!hbus->wq) {
> + ret = -ENOMEM;
> + goto free_bus;
> + }
>
> ret = vmbus_open(hdev->channel, pci_ring_size, pci_ring_size, NULL, 0,
> hv_pci_onchannelcallback, hbus);
> if (ret)
> - goto free_bus;
> + goto destroy_wq;
>
> hv_set_drvdata(hdev, hbus);
>
> @@ -2536,6 +2544,8 @@ static int hv_pci_probe(struct hv_device *hdev,
> hv_free_config_window(hbus);
> close:
> vmbus_close(hdev->channel);
> +destroy_wq:
> + destroy_workqueue(hbus->wq);
> free_bus:
> free_page((unsigned long)hbus);
> return ret;
> @@ -2615,6 +2625,7 @@ static int hv_pci_remove(struct hv_device *hdev)
> irq_domain_free_fwnode(hbus->sysdata.fwnode);
> put_hvpcibus(hbus);
> wait_for_completion(&hbus->remove_event);
> + destroy_workqueue(hbus->wq);
> free_page((unsigned long)hbus);
> return 0;
> }
> --
> 2.7.4
>
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: [PATCH v4 1/2] PCI: hv: Serialize the present and eject work items
2018-03-15 17:01 ` Lorenzo Pieralisi
@ 2018-03-15 17:55 ` Dexuan Cui
2018-03-15 18:28 ` Dexuan Cui
0 siblings, 1 reply; 4+ messages in thread
From: Dexuan Cui @ 2018-03-15 17:55 UTC (permalink / raw)
To: Lorenzo Pieralisi
Cc: olaf, Stephen Hemminger, linux-pci, Haiyang Zhang,
driverdev-devel, linux-kernel, Michael Kelley (EOSG),
apw, marcelo.cerri, bhelgaas, vkuznets, jasowang
> From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Sent: Thursday, March 15, 2018 10:02
> On Thu, Mar 15, 2018 at 02:20:53PM +0000, Dexuan Cui wrote:
> > When we hot-remove the device, we first receive a PCI_EJECT message and
> > then receive a PCI_BUS_RELATIONS message with bus_rel->device_count ==
> 0.
> >
> > The first message is offloaded to (), and the second
> > is offloaded to pci_devices_present_work(). Both the paths can be running
> > list_del(&hpdev->list_entry), causing general protection fault, because
> > system_wq can run them concurrently.
> >
> > The patch eliminates the race condition.
> >
> > Cc: stable@vger.kernel.org
>
> I need to know either what commit you are fixing (ie Fixes: tag - which
> is preferrable) or you tell me which kernel versions we are targeting
> for the stable backport.
>
> Thanks,
> Lorenzo
Sorry. Here I was hesitant to add a "Fixes:" because the bug was there the first day
when the driver was introduced.
Please use
Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
or
Cc: <stable@vger.kernel.org> # v4.6+
Thanks,
-- Dexuan
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: [PATCH v4 1/2] PCI: hv: Serialize the present and eject work items
2018-03-15 17:55 ` Dexuan Cui
@ 2018-03-15 18:28 ` Dexuan Cui
0 siblings, 0 replies; 4+ messages in thread
From: Dexuan Cui @ 2018-03-15 18:28 UTC (permalink / raw)
To: 'Lorenzo Pieralisi'
Cc: 'olaf@aepfle.de',
Stephen Hemminger, 'linux-pci@vger.kernel.org',
Haiyang Zhang, 'driverdev-devel@linuxdriverproject.org',
'linux-kernel@vger.kernel.org', Michael Kelley (EOSG),
'apw@canonical.com',
'marcelo.cerri@canonical.com',
'bhelgaas@google.com', 'vkuznets@redhat.com',
'jasowang@redhat.com'
> From: Dexuan Cui
> > From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > I need to know either what commit you are fixing (ie Fixes: tag - which
> > is preferrable) or you tell me which kernel versions we are targeting
> > for the stable backport.
> > Lorenzo
>
> Sorry. Here I was hesitant to add a "Fixes:" because the bug was there the first
> day
> when the driver was introduced.
>
> Please use
> Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft
> Hyper-V VMs")
> or
> Cc: <stable@vger.kernel.org> # v4.6+
BTW, the bug here is a race condtion which couldn't be easily hit in the past,
probably because most of the time only one PCI device was only added into the
VM once. But now it's becoming typical that a VM can have 4 GPU devices so we
start to notice this bug. With 7 Mellanox VFs assigned to a VM, we can easily
reproduce the bug by "hot-remove and hot-add VFs" test:
general protection fault: 0000 [#1] SMP
...
Workqueue: events hv_eject_device_work [pci_hyperv]
task: ffff8800ed5e5400 ti: ffff8800ee674000 task.ti: ffff8800ee674000
RIP: 0010:[<ffffffffc025c5ce>] ... hv_eject_device_work+0xbe/0x160 [pci_hyperv]
...
Call Trace:
[<ffffffff8183c3a6>] ? __schedule+0x3b6/0xa30
[<ffffffff8109a585>] process_one_work+0x165/0x480
[<ffffffff8109a8eb>] worker_thread+0x4b/0x4c0
[<ffffffff8109a8a0>] ? process_one_work+0x480/0x480
[<ffffffff810a0c25>] kthread+0xe5/0x100
[<ffffffff810a0b40>] ? kthread_create_on_node+0x1e0/0x1e0
[<ffffffff81840f0f>] ret_from_fork+0x3f/0x70
[<ffffffff810a0b40>] ? kthread_create_on_node+0x1e0/0x1e0
Code: ...
RIP [<ffffffffc025c5ce>] hv_eject_device_work+0xbe/0x160 [pci_hyperv]
...
BUG: unable to handle kernel paging request at ffffffffffffffd8
IP: [<ffffffff810a12d0>] kthread_data+0x10/0x20
PGD 1e0d067 PUD 1e0f067 PMD 0
Oops: 0000 [#2] SMP
Thanks,
-- Dexuan
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2018-03-15 18:28 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-15 14:20 [PATCH v4 1/2] PCI: hv: Serialize the present and eject work items Dexuan Cui
2018-03-15 17:01 ` Lorenzo Pieralisi
2018-03-15 17:55 ` Dexuan Cui
2018-03-15 18:28 ` Dexuan Cui
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).