LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Re: [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
@ 2014-12-04 12:06 Konrad Rzeszutek Wilk
  2014-12-04 12:24 ` [Xen-devel] " David Vrabel
  0 siblings, 1 reply; 18+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-12-04 12:06 UTC (permalink / raw)
  To: David Vrabel
  Cc: Boris Ostrovsky, linux-pci, xen-devel, linux-kernel, bhelgaas

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=utf-8, Size: 830 bytes --]


On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>
> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
> > 
> > Instead of doing all this complex dance, we depend on the toolstack 
> > doing the right thing. As such implement the 'do_flr' SysFS attribute 
> > which 'xl' uses when a device is detached or attached from/to a guest. 
> > It bypasses the need to worry about the PCI lock. 
>
> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
> proposed). 
>

Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).

Unless you mean an different name (reset2).

> David 
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 12:06 [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute Konrad Rzeszutek Wilk
@ 2014-12-04 12:24 ` David Vrabel
  2014-12-04 13:10   ` Sander Eikelenboom
  0 siblings, 1 reply; 18+ messages in thread
From: David Vrabel @ 2014-12-04 12:24 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, David Vrabel
  Cc: bhelgaas, linux-pci, Boris Ostrovsky, linux-kernel, xen-devel

[-- Attachment #1: Type: text/plain, Size: 1224 bytes --]

On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
> 
> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>>
>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
>>>
>>> Instead of doing all this complex dance, we depend on the toolstack 
>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
>>> which 'xl' uses when a device is detached or attached from/to a guest. 
>>> It bypasses the need to worry about the PCI lock. 
>>
>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
>> proposed). 
>>
> 
> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).

It is only needed if the core won't provide one.

+static int pcistub_try_create_reset_file(struct pci_dev *pci)
+{
+       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
+       struct device *dev = &pci->dev;
+       int ret;
+
+       /* Already have a per-function reset? */
+       if (pci_probe_reset_function(pci) == 0)
+               return 0;
+
+       ret = device_create_file(dev, &dev_attr_reset);
+       if (ret < 0)
+               return ret;
+       dev_data->created_reset_file = true;
+       return 0;
+}

David



[-- Attachment #2: 0002-xen-pciback-provide-a-reset-sysfs-file-to-try-harder.patch --]
[-- Type: text/x-diff, Size: 6659 bytes --]

>From e65a814106bc9994ec47b5317f7cdc975270d27f Mon Sep 17 00:00:00 2001
From: David Vrabel <david.vrabel@citrix.com>
Date: Thu, 10 Jul 2014 13:09:25 +0100
Subject: [PATCH 2/2] xen-pciback: provide a "reset" sysfs file to try harder
 at an SBR

The standard implementation of pci_reset_function() does not try an
SBR if there are other sibling devices.  This is a common
configuration for multi-function devices (e.g., GPUs with a HDMI audio
device function).

If all the functions are co-assigned to the same domain at the same
time, then it is safe to perform an SBR to reset all functions at the
same time.

Add a "reset" sysfs file with the same interface as the standard one
(i.e., write 1 to reset) that will try an SBR if all sibling devices
are unbound or bound to pciback.

Note that this is weaker than the requirement for functions to be
simultaneously co-assigned, but the toolstack is expected to ensure
this.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 4e8ba38..d6c29aa 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -14,6 +14,7 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 #include <linux/atomic.h>
+#include <linux/delay.h>
 #include <xen/events.h>
 #include <asm/xen/pci.h>
 #include <asm/xen/hypervisor.h>
@@ -60,6 +61,114 @@ static LIST_HEAD(pcistub_devices);
 static int initialize_devices;
 static LIST_HEAD(seized_devices);
 
+/*
+ * pci_reset_function() will only work if there is a mechanism to
+ * reset that single function (e.g., FLR or a D-state transition).
+ * For PCI hardware that has two or more functions but no per-function
+ * reset, we can do a bus reset iff all the functions are co-assigned
+ * to the same domain.
+ *
+ * If a function has no per-function reset mechanism the 'reset' sysfs
+ * file that the toolstack uses to reset a function prior to assigning
+ * the device will be missing.  In this case, pciback adds its own
+ * which will try a bus reset.
+ *
+ * Note: pciback does not check for co-assigment before doing a bus
+ * reset, only that the devices are bound to pciback.  The toolstack
+ * is assumed to have done the right thing.
+ */
+static int __pcistub_reset_function(struct pci_dev *dev)
+{
+	struct pci_dev *pdev;
+	u16 ctrl;
+	int ret;
+
+	ret = __pci_reset_function_locked(dev);
+	if (ret == 0)
+		return 0;
+
+	if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
+		return -ENOTTY;
+
+	list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
+		if (pdev != dev && (!pdev->driver
+				    || strcmp(pdev->driver->name, "pciback")))
+			return -ENOTTY;
+		pci_save_state(pdev);
+	}
+
+	pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
+	ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
+	msleep(200);
+
+	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
+	msleep(200);
+
+	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
+		pci_restore_state(pdev);
+
+	return 0;
+}
+
+static int pcistub_reset_function(struct pci_dev *dev)
+{
+	int ret;
+
+	device_lock(&dev->dev);
+	ret = __pcistub_reset_function(dev);
+	device_unlock(&dev->dev);
+
+	return ret;
+}
+
+static ssize_t pcistub_reset_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	unsigned long val;
+	ssize_t result = strict_strtoul(buf, 0, &val);
+
+	if (result < 0)
+		return result;
+
+	if (val != 1)
+		return -EINVAL;
+
+	result = pcistub_reset_function(pdev);
+	if (result < 0)
+		return result;
+	return count;
+}
+static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
+
+static int pcistub_try_create_reset_file(struct pci_dev *pci)
+{
+	struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
+	struct device *dev = &pci->dev;
+	int ret;
+
+	/* Already have a per-function reset? */
+	if (pci_probe_reset_function(pci) == 0)
+		return 0;
+
+	ret = device_create_file(dev, &dev_attr_reset);
+	if (ret < 0)
+		return ret;
+	dev_data->created_reset_file = true;
+	return 0;
+}
+
+static void pcistub_remove_reset_file(struct pci_dev *pci)
+{
+	struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
+
+	if (dev_data && dev_data->created_reset_file)
+		device_remove_file(&pci->dev, &dev_attr_reset);
+}
+
 static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev;
@@ -100,7 +209,8 @@ static void pcistub_device_release(struct kref *kref)
 	/* Call the reset function which does not take lock as this
 	 * is called from "unbind" which takes a device_lock mutex.
 	 */
-	__pci_reset_function_locked(dev);
+	__pcistub_reset_function(psdev->dev);
+
 	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
 		dev_dbg(&dev->dev, "Could not reload PCI state\n");
 	else
@@ -268,7 +378,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* This is OK - we are running from workqueue context
 	 * and want to inhibit the user from fiddling with 'reset'
 	 */
-	pci_reset_function(dev);
+	pcistub_reset_function(dev);
 	pci_restore_state(psdev->dev);
 
 	/* This disables the device. */
@@ -392,7 +502,7 @@ static int pcistub_init_device(struct pci_dev *dev)
 		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
 	else {
 		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
-		__pci_reset_function_locked(dev);
+		__pcistub_reset_function(dev);
 		pci_restore_state(dev);
 	}
 	/* Now disable the device (this also ensures some private device
@@ -401,6 +511,10 @@ static int pcistub_init_device(struct pci_dev *dev)
 	dev_dbg(&dev->dev, "reset device\n");
 	xen_pcibk_reset_device(dev);
 
+	err = pcistub_try_create_reset_file(dev);
+	if (err < 0)
+		goto config_release;
+
 	dev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
 	return 0;
 
@@ -526,6 +640,8 @@ static void pcistub_remove(struct pci_dev *dev)
 
 	dev_dbg(&dev->dev, "removing\n");
 
+	pcistub_remove_reset_file(dev);
+
 	spin_lock_irqsave(&pcistub_devices_lock, flags);
 
 	xen_pcibk_config_quirk_release(dev);
diff --git a/drivers/xen/xen-pciback/pciback.h b/drivers/xen/xen-pciback/pciback.h
index f72af87..708ade9 100644
--- a/drivers/xen/xen-pciback/pciback.h
+++ b/drivers/xen/xen-pciback/pciback.h
@@ -42,6 +42,7 @@ struct xen_pcibk_device {
 struct xen_pcibk_dev_data {
 	struct list_head config_fields;
 	struct pci_saved_state *pci_saved_state;
+	bool created_reset_file;
 	unsigned int permissive:1;
 	unsigned int warned_on_write:1;
 	unsigned int enable_intx:1;

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 12:24 ` [Xen-devel] " David Vrabel
@ 2014-12-04 13:10   ` Sander Eikelenboom
  2014-12-04 13:43     ` David Vrabel
  0 siblings, 1 reply; 18+ messages in thread
From: Sander Eikelenboom @ 2014-12-04 13:10 UTC (permalink / raw)
  To: David Vrabel
  Cc: Konrad Rzeszutek Wilk, bhelgaas, linux-pci, Boris Ostrovsky,
	linux-kernel, xen-devel


Thursday, December 4, 2014, 1:24:47 PM, you wrote:

> On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
>> 
>> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>>>
>>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
>>>>
>>>> Instead of doing all this complex dance, we depend on the toolstack 
>>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
>>>> which 'xl' uses when a device is detached or attached from/to a guest. 
>>>> It bypasses the need to worry about the PCI lock. 
>>>
>>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
>>> proposed). 
>>>
>> 
>> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).

> It is only needed if the core won't provide one.

> +static int pcistub_try_create_reset_file(struct pci_dev *pci)
> +{
> +       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
> +       struct device *dev = &pci->dev;
> +       int ret;
> +
> +       /* Already have a per-function reset? */
> +       if (pci_probe_reset_function(pci) == 0)
> +               return 0;
> +
> +       ret = device_create_file(dev, &dev_attr_reset);
> +       if (ret < 0)
> +               return ret;
+       dev_data->>created_reset_file = true;
> +       return 0;
> +}

Wouldn't the "core-reset-sysfs-file" be still wired to the end up calling 
"pci.c:__pci_dev_reset" ?

The problem with that function is that from my testing it seems that the 
first option "pci_dev_specific_reset" always seems to return succes, so all the
other options are skipped (flr, pm, slot, bus). However the device it self is 
not properly reset enough (perhaps the pci_dev_specific_reset is good enough for 
none virtualization purposes and it's probably the least intrusive. For 
virtualization however it would be nice to be sure it resets properly, or have a 
way to force a specific reset routine.) 

So it's the ordering and skipping of the other resets that seems to make
this workaround necessary in the first place.

> David






^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 13:10   ` Sander Eikelenboom
@ 2014-12-04 13:43     ` David Vrabel
  2014-12-04 14:09       ` Sander Eikelenboom
  0 siblings, 1 reply; 18+ messages in thread
From: David Vrabel @ 2014-12-04 13:43 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Konrad Rzeszutek Wilk, bhelgaas, linux-pci, Boris Ostrovsky,
	linux-kernel, xen-devel

On 04/12/14 13:10, Sander Eikelenboom wrote:
> 
> Thursday, December 4, 2014, 1:24:47 PM, you wrote:
> 
>> On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
>>>
>>> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>>>>
>>>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
>>>>>
>>>>> Instead of doing all this complex dance, we depend on the toolstack 
>>>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
>>>>> which 'xl' uses when a device is detached or attached from/to a guest. 
>>>>> It bypasses the need to worry about the PCI lock. 
>>>>
>>>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
>>>> proposed). 
>>>>
>>>
>>> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).
> 
>> It is only needed if the core won't provide one.
> 
>> +static int pcistub_try_create_reset_file(struct pci_dev *pci)
>> +{
>> +       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
>> +       struct device *dev = &pci->dev;
>> +       int ret;
>> +
>> +       /* Already have a per-function reset? */
>> +       if (pci_probe_reset_function(pci) == 0)
>> +               return 0;
>> +
>> +       ret = device_create_file(dev, &dev_attr_reset);
>> +       if (ret < 0)
>> +               return ret;
> +       dev_data->>created_reset_file = true;
>> +       return 0;
>> +}
> 
> Wouldn't the "core-reset-sysfs-file" be still wired to the end up calling 
> "pci.c:__pci_dev_reset" ?
> 
> The problem with that function is that from my testing it seems that the 
> first option "pci_dev_specific_reset" always seems to return succes, so all the
> other options are skipped (flr, pm, slot, bus). However the device it self is 
> not properly reset enough (perhaps the pci_dev_specific_reset is good enough for 
> none virtualization purposes and it's probably the least intrusive. For 
> virtualization however it would be nice to be sure it resets properly, or have a 
> way to force a specific reset routine.)

Then you need work with the maintainer for those specific devices or
drivers to fix their specific reset function.

I'm not adding stuff to pciback to workaround broken quirks.

David

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 13:43     ` David Vrabel
@ 2014-12-04 14:09       ` Sander Eikelenboom
  2014-12-04 14:14         ` Sander Eikelenboom
  2014-12-04 14:31         ` David Vrabel
  0 siblings, 2 replies; 18+ messages in thread
From: Sander Eikelenboom @ 2014-12-04 14:09 UTC (permalink / raw)
  To: David Vrabel
  Cc: Konrad Rzeszutek Wilk, bhelgaas, linux-pci, Boris Ostrovsky,
	linux-kernel, xen-devel


Thursday, December 4, 2014, 2:43:06 PM, you wrote:

> On 04/12/14 13:10, Sander Eikelenboom wrote:
>> 
>> Thursday, December 4, 2014, 1:24:47 PM, you wrote:
>> 
>>> On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
>>>>
>>>> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>>>>>
>>>>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
>>>>>>
>>>>>> Instead of doing all this complex dance, we depend on the toolstack 
>>>>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
>>>>>> which 'xl' uses when a device is detached or attached from/to a guest. 
>>>>>> It bypasses the need to worry about the PCI lock. 
>>>>>
>>>>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
>>>>> proposed). 
>>>>>
>>>>
>>>> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).
>> 
>>> It is only needed if the core won't provide one.
>> 
>>> +static int pcistub_try_create_reset_file(struct pci_dev *pci)
>>> +{
>>> +       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
>>> +       struct device *dev = &pci->dev;
>>> +       int ret;
>>> +
>>> +       /* Already have a per-function reset? */
>>> +       if (pci_probe_reset_function(pci) == 0)
>>> +               return 0;
>>> +
>>> +       ret = device_create_file(dev, &dev_attr_reset);
>>> +       if (ret < 0)
>>> +               return ret;
>> +       dev_data->>created_reset_file = true;
>>> +       return 0;
>>> +}
>> 
>> Wouldn't the "core-reset-sysfs-file" be still wired to the end up calling 
>> "pci.c:__pci_dev_reset" ?
>> 
>> The problem with that function is that from my testing it seems that the 
>> first option "pci_dev_specific_reset" always seems to return succes, so all the
>> other options are skipped (flr, pm, slot, bus). However the device it self is 
>> not properly reset enough (perhaps the pci_dev_specific_reset is good enough for 
>> none virtualization purposes and it's probably the least intrusive. For 
>> virtualization however it would be nice to be sure it resets properly, or have a 
>> way to force a specific reset routine.)

> Then you need work with the maintainer for those specific devices or
> drivers to fix their specific reset function.

> I'm not adding stuff to pciback to workaround broken quirks.

OK that's a pretty clear message there, so if one wants to use pci and vga
passthrough one should better use KVM and vfio-pci. 

vfio-pci has:
- logic to do the try-slot-bus-reset logic
- it has quirks specific to vga passthrough
implemented in from the looks of it a quite clean driver.
(the main issue with it so far was you could only seize devices based on 
vendor and device id, which can be a problem when you have multiple devices.
However that was resolved recently if i am correct.)

And neither of those will be supported by xen-pciback if i get your message 
right ?

--
Sander

> David



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 14:09       ` Sander Eikelenboom
@ 2014-12-04 14:14         ` Sander Eikelenboom
  2014-12-04 14:31         ` David Vrabel
  1 sibling, 0 replies; 18+ messages in thread
From: Sander Eikelenboom @ 2014-12-04 14:14 UTC (permalink / raw)
  To: David Vrabel
  Cc: Konrad Rzeszutek Wilk, bhelgaas, linux-pci, Boris Ostrovsky,
	linux-kernel, xen-devel

Hello Sander,

Thursday, December 4, 2014, 3:09:09 PM, you wrote:


> Thursday, December 4, 2014, 2:43:06 PM, you wrote:

>> On 04/12/14 13:10, Sander Eikelenboom wrote:
>>> 
>>> Thursday, December 4, 2014, 1:24:47 PM, you wrote:
>>> 
>>>> On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
>>>>>
>>>>> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>>>>>>
>>>>>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
>>>>>>>
>>>>>>> Instead of doing all this complex dance, we depend on the toolstack 
>>>>>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
>>>>>>> which 'xl' uses when a device is detached or attached from/to a guest. 
>>>>>>> It bypasses the need to worry about the PCI lock. 
>>>>>>
>>>>>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
>>>>>> proposed). 
>>>>>>
>>>>>
>>>>> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).
>>> 
>>>> It is only needed if the core won't provide one.
>>> 
>>>> +static int pcistub_try_create_reset_file(struct pci_dev *pci)
>>>> +{
>>>> +       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
>>>> +       struct device *dev = &pci->dev;
>>>> +       int ret;
>>>> +
>>>> +       /* Already have a per-function reset? */
>>>> +       if (pci_probe_reset_function(pci) == 0)
>>>> +               return 0;
>>>> +
>>>> +       ret = device_create_file(dev, &dev_attr_reset);
>>>> +       if (ret < 0)
>>>> +               return ret;
>>> +       dev_data->>created_reset_file = true;
>>>> +       return 0;
>>>> +}
>>> 
>>> Wouldn't the "core-reset-sysfs-file" be still wired to the end up calling 
>>> "pci.c:__pci_dev_reset" ?
>>> 
>>> The problem with that function is that from my testing it seems that the 
>>> first option "pci_dev_specific_reset" always seems to return succes, so all the
>>> other options are skipped (flr, pm, slot, bus). However the device it self is 
>>> not properly reset enough (perhaps the pci_dev_specific_reset is good enough for 
>>> none virtualization purposes and it's probably the least intrusive. For 
>>> virtualization however it would be nice to be sure it resets properly, or have a 
>>> way to force a specific reset routine.)

>> Then you need work with the maintainer for those specific devices or
>> drivers to fix their specific reset function.

>> I'm not adding stuff to pciback to workaround broken quirks.

> OK that's a pretty clear message there, so if one wants to use pci and vga
> passthrough one should better use KVM and vfio-pci. 

> vfio-pci has:
> - logic to do the try-slot-bus-reset logic
> - it has quirks specific to vga passthrough
Hrmm have to correct my self because the vga-pt quirks are part of the vfio-pci 
part in qemu.

The try-slot-bus-reset logic is part of the kernel vfio-pci driver though and they faced the same locking issue it 
seems:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=61cf16d8bd38c3dc52033ea75d5b1f8368514a17
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=890ed578df82f5b7b5a874f9f2fa4f117305df5f

> implemented in from the looks of it a quite clean driver.
> (the main issue with it so far was you could only seize devices based on 
> vendor and device id, which can be a problem when you have multiple devices.
> However that was resolved recently if i am correct.)

> And neither of those will be supported by xen-pciback if i get your message 
> right ?

> --
> Sander

>> David



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 14:09       ` Sander Eikelenboom
  2014-12-04 14:14         ` Sander Eikelenboom
@ 2014-12-04 14:31         ` David Vrabel
  2014-12-04 14:50           ` Sander Eikelenboom
  2014-12-04 19:05           ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 18+ messages in thread
From: David Vrabel @ 2014-12-04 14:31 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Konrad Rzeszutek Wilk, bhelgaas, linux-pci, Boris Ostrovsky,
	linux-kernel, xen-devel

On 04/12/14 14:09, Sander Eikelenboom wrote:
> 
> Thursday, December 4, 2014, 2:43:06 PM, you wrote:
> 
>> On 04/12/14 13:10, Sander Eikelenboom wrote:
>>>
>>> Thursday, December 4, 2014, 1:24:47 PM, you wrote:
>>>
>>>> On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
>>>>>
>>>>> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>>>>>>
>>>>>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
>>>>>>>
>>>>>>> Instead of doing all this complex dance, we depend on the toolstack 
>>>>>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
>>>>>>> which 'xl' uses when a device is detached or attached from/to a guest. 
>>>>>>> It bypasses the need to worry about the PCI lock. 
>>>>>>
>>>>>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
>>>>>> proposed). 
>>>>>>
>>>>>
>>>>> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).
>>>
>>>> It is only needed if the core won't provide one.
>>>
>>>> +static int pcistub_try_create_reset_file(struct pci_dev *pci)
>>>> +{
>>>> +       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
>>>> +       struct device *dev = &pci->dev;
>>>> +       int ret;
>>>> +
>>>> +       /* Already have a per-function reset? */
>>>> +       if (pci_probe_reset_function(pci) == 0)
>>>> +               return 0;
>>>> +
>>>> +       ret = device_create_file(dev, &dev_attr_reset);
>>>> +       if (ret < 0)
>>>> +               return ret;
>>> +       dev_data->>created_reset_file = true;
>>>> +       return 0;
>>>> +}
>>>
>>> Wouldn't the "core-reset-sysfs-file" be still wired to the end up calling 
>>> "pci.c:__pci_dev_reset" ?
>>>
>>> The problem with that function is that from my testing it seems that the 
>>> first option "pci_dev_specific_reset" always seems to return succes, so all the
>>> other options are skipped (flr, pm, slot, bus). However the device it self is 
>>> not properly reset enough (perhaps the pci_dev_specific_reset is good enough for 
>>> none virtualization purposes and it's probably the least intrusive. For 
>>> virtualization however it would be nice to be sure it resets properly, or have a 
>>> way to force a specific reset routine.)
> 
>> Then you need work with the maintainer for those specific devices or
>> drivers to fix their specific reset function.
> 
>> I'm not adding stuff to pciback to workaround broken quirks.
> 
> OK that's a pretty clear message there, so if one wants to use pci and vga
> passthrough one should better use KVM and vfio-pci.

Have you (or anyone else) ever raised the problem with the broken reset
quirk for certain devices with the relevant maintainer?

> vfio-pci has:
> - logic to do the try-slot-bus-reset logic

Just because vfio-pci fixed it incorrectly doesn't mean pciback has to
as well.

It makes no sense for both pciback and vfio-pci to workaround problems
with pci_function_reset() in different ways -- it should be fixed in the
core PCI code so both can benefit and make use of the same code.

David


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 14:31         ` David Vrabel
@ 2014-12-04 14:50           ` Sander Eikelenboom
  2014-12-04 15:39             ` Alex Williamson
  2014-12-04 19:05           ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 18+ messages in thread
From: Sander Eikelenboom @ 2014-12-04 14:50 UTC (permalink / raw)
  To: David Vrabel, bhelgaas, Alex Williamson
  Cc: Konrad Rzeszutek Wilk, linux-pci, Boris Ostrovsky, linux-kernel,
	xen-devel


Thursday, December 4, 2014, 3:31:11 PM, you wrote:

> On 04/12/14 14:09, Sander Eikelenboom wrote:
>> 
>> Thursday, December 4, 2014, 2:43:06 PM, you wrote:
>> 
>>> On 04/12/14 13:10, Sander Eikelenboom wrote:
>>>>
>>>> Thursday, December 4, 2014, 1:24:47 PM, you wrote:
>>>>
>>>>> On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
>>>>>>
>>>>>> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>>>>>>>
>>>>>>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
>>>>>>>>
>>>>>>>> Instead of doing all this complex dance, we depend on the toolstack 
>>>>>>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
>>>>>>>> which 'xl' uses when a device is detached or attached from/to a guest. 
>>>>>>>> It bypasses the need to worry about the PCI lock. 
>>>>>>>
>>>>>>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
>>>>>>> proposed). 
>>>>>>>
>>>>>>
>>>>>> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).
>>>>
>>>>> It is only needed if the core won't provide one.
>>>>
>>>>> +static int pcistub_try_create_reset_file(struct pci_dev *pci)
>>>>> +{
>>>>> +       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
>>>>> +       struct device *dev = &pci->dev;
>>>>> +       int ret;
>>>>> +
>>>>> +       /* Already have a per-function reset? */
>>>>> +       if (pci_probe_reset_function(pci) == 0)
>>>>> +               return 0;
>>>>> +
>>>>> +       ret = device_create_file(dev, &dev_attr_reset);
>>>>> +       if (ret < 0)
>>>>> +               return ret;
>>>> +       dev_data->>created_reset_file = true;
>>>>> +       return 0;
>>>>> +}
>>>>
>>>> Wouldn't the "core-reset-sysfs-file" be still wired to the end up calling 
>>>> "pci.c:__pci_dev_reset" ?
>>>>
>>>> The problem with that function is that from my testing it seems that the 
>>>> first option "pci_dev_specific_reset" always seems to return succes, so all the
>>>> other options are skipped (flr, pm, slot, bus). However the device it self is 
>>>> not properly reset enough (perhaps the pci_dev_specific_reset is good enough for 
>>>> none virtualization purposes and it's probably the least intrusive. For 
>>>> virtualization however it would be nice to be sure it resets properly, or have a 
>>>> way to force a specific reset routine.)
>> 
>>> Then you need work with the maintainer for those specific devices or
>>> drivers to fix their specific reset function.
>> 
>>> I'm not adding stuff to pciback to workaround broken quirks.
>> 
>> OK that's a pretty clear message there, so if one wants to use pci and vga
>> passthrough one should better use KVM and vfio-pci.

> Have you (or anyone else) ever raised the problem with the broken reset
> quirk for certain devices with the relevant maintainer?

>> vfio-pci has:
>> - logic to do the try-slot-bus-reset logic

> Just because vfio-pci fixed it incorrectly doesn't mean pciback has to
> as well.

Depends on what you call an "incorrect fix" .. it fixes a quirk .. 
you can say that's incorrect, but then you would have to remove 50% of
the kernel and Xen code as well.

(i do in general agree it's better to strive for a generic solution though,
that's exactly why i brought up that that function doesn't seem to work perfect
for virtualization purposes) 

> It makes no sense for both pciback and vfio-pci to workaround problems
> with pci_function_reset() in different ways -- it should be fixed in the
> core PCI code so both can benefit and make use of the same code.

Well perhaps Bjorn knows why the order of resets and skipping the rest as
implemented in "pci.c:__pci_dev_reset" was implemented in that way ?

Especially what is the expectation about pci_dev_specific_reset doing a proper 
reset for say a vga-card:
- i know it doesn't work on a radeon card (doesn't blank screen, on next guest 
  boot reports it's already posted, powermanagement doesn't work).
- while with a slot/bus reset, that all just works fine, screen blanks 
  immediately and everything else also works.

Added Alex as well since he added this workaround for KVM/vfio-pci, perhaps he knows why
he introduced the workaround in vfio-pci instead of trying to fix it in core pci 
code ?

--
Sander


> David




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 14:50           ` Sander Eikelenboom
@ 2014-12-04 15:39             ` Alex Williamson
  2014-12-04 16:25               ` Sander Eikelenboom
  2014-12-05 10:30               ` David Vrabel
  0 siblings, 2 replies; 18+ messages in thread
From: Alex Williamson @ 2014-12-04 15:39 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: David Vrabel, bhelgaas, Konrad Rzeszutek Wilk, linux-pci,
	Boris Ostrovsky, linux-kernel, xen-devel

On Thu, 2014-12-04 at 15:50 +0100, Sander Eikelenboom wrote:
> Thursday, December 4, 2014, 3:31:11 PM, you wrote:
> 
> > On 04/12/14 14:09, Sander Eikelenboom wrote:
> >> 
> >> Thursday, December 4, 2014, 2:43:06 PM, you wrote:
> >> 
> >>> On 04/12/14 13:10, Sander Eikelenboom wrote:
> >>>>
> >>>> Thursday, December 4, 2014, 1:24:47 PM, you wrote:
> >>>>
> >>>>> On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
> >>>>>>
> >>>>>> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
> >>>>>>>
> >>>>>>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
> >>>>>>>>
> >>>>>>>> Instead of doing all this complex dance, we depend on the toolstack 
> >>>>>>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
> >>>>>>>> which 'xl' uses when a device is detached or attached from/to a guest. 
> >>>>>>>> It bypasses the need to worry about the PCI lock. 
> >>>>>>>
> >>>>>>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
> >>>>>>> proposed). 
> >>>>>>>
> >>>>>>
> >>>>>> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).
> >>>>
> >>>>> It is only needed if the core won't provide one.
> >>>>
> >>>>> +static int pcistub_try_create_reset_file(struct pci_dev *pci)
> >>>>> +{
> >>>>> +       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
> >>>>> +       struct device *dev = &pci->dev;
> >>>>> +       int ret;
> >>>>> +
> >>>>> +       /* Already have a per-function reset? */
> >>>>> +       if (pci_probe_reset_function(pci) == 0)
> >>>>> +               return 0;
> >>>>> +
> >>>>> +       ret = device_create_file(dev, &dev_attr_reset);
> >>>>> +       if (ret < 0)
> >>>>> +               return ret;
> >>>> +       dev_data->>created_reset_file = true;
> >>>>> +       return 0;
> >>>>> +}
> >>>>
> >>>> Wouldn't the "core-reset-sysfs-file" be still wired to the end up calling 
> >>>> "pci.c:__pci_dev_reset" ?
> >>>>
> >>>> The problem with that function is that from my testing it seems that the 
> >>>> first option "pci_dev_specific_reset" always seems to return succes, so all the
> >>>> other options are skipped (flr, pm, slot, bus). However the device it self is 
> >>>> not properly reset enough (perhaps the pci_dev_specific_reset is good enough for 
> >>>> none virtualization purposes and it's probably the least intrusive. For 
> >>>> virtualization however it would be nice to be sure it resets properly, or have a 
> >>>> way to force a specific reset routine.)
> >> 
> >>> Then you need work with the maintainer for those specific devices or
> >>> drivers to fix their specific reset function.
> >> 
> >>> I'm not adding stuff to pciback to workaround broken quirks.
> >> 
> >> OK that's a pretty clear message there, so if one wants to use pci and vga
> >> passthrough one should better use KVM and vfio-pci.
> 
> > Have you (or anyone else) ever raised the problem with the broken reset
> > quirk for certain devices with the relevant maintainer?
> 
> >> vfio-pci has:
> >> - logic to do the try-slot-bus-reset logic
> 
> > Just because vfio-pci fixed it incorrectly doesn't mean pciback has to
> > as well.
> 
> Depends on what you call an "incorrect fix" .. it fixes a quirk .. 
> you can say that's incorrect, but then you would have to remove 50% of
> the kernel and Xen code as well.
> 
> (i do in general agree it's better to strive for a generic solution though,
> that's exactly why i brought up that that function doesn't seem to work perfect
> for virtualization purposes) 
> 
> > It makes no sense for both pciback and vfio-pci to workaround problems
> > with pci_function_reset() in different ways -- it should be fixed in the
> > core PCI code so both can benefit and make use of the same code.
> 
> Well perhaps Bjorn knows why the order of resets and skipping the rest as
> implemented in "pci.c:__pci_dev_reset" was implemented in that way ?
> 
> Especially what is the expectation about pci_dev_specific_reset doing a proper 
> reset for say a vga-card:
> - i know it doesn't work on a radeon card (doesn't blank screen, on next guest 
>   boot reports it's already posted, powermanagement doesn't work).
> - while with a slot/bus reset, that all just works fine, screen blanks 
>   immediately and everything else also works.
> 
> Added Alex as well since he added this workaround for KVM/vfio-pci, perhaps he knows why
> he introduced the workaround in vfio-pci instead of trying to fix it in core pci 
> code ?

I don't know what workaround you're talking about.  As devices are
released from the user, vfio-pci attempts to reset them.  If
pci_reset_function() returns success we mark the device clean, otherwise
it gets marked dirty.  Each time a device is released, if there are
dirty devices we test whether we can try a bus/slot reset to clean them.
In the case of assigning a GPU this typically means that the GPU or
audio function come through first, there's no reset mechanism so it gets
marked dirty, the next device comes through and we manage to try a bus
reset.  vfio-pci does not have any device specific resets, all
functionality is added to the PCI-core, thank-you-very-much.  I even
posted a generic PCI quirk patch recently that marks AMD VGA PM reset as
bad so that pci_reset_function() won't claim that worked.  All VGA
access quirks are done in QEMU, the kernel doesn't have any business in
remapping config space over MMIO regions or trapping other config space
backdoors.

I have never heard of problems with the dev specific reset claiming to
work and not doing anything, there are only a few of these, it should be
easy to debug.

I didn't read the original patch, but the title alone of this patch is
quite confusing.  FLR is specifically a function-level-reset, so one
would expect 'do_flr' to be function specific.  The pci-sysfs 'reset'
attribute is already function specific.  If pci_reset_function() isn't
doing the job and we need to use bus/slot reset, it's clearly not an
FLR.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 15:39             ` Alex Williamson
@ 2014-12-04 16:25               ` Sander Eikelenboom
  2014-12-04 16:55                 ` Alex Williamson
  2014-12-05 10:30               ` David Vrabel
  1 sibling, 1 reply; 18+ messages in thread
From: Sander Eikelenboom @ 2014-12-04 16:25 UTC (permalink / raw)
  To: Alex Williamson
  Cc: David Vrabel, bhelgaas, Konrad Rzeszutek Wilk, linux-pci,
	Boris Ostrovsky, linux-kernel, xen-devel


Thursday, December 4, 2014, 4:39:06 PM, you wrote:

> On Thu, 2014-12-04 at 15:50 +0100, Sander Eikelenboom wrote:
>> Thursday, December 4, 2014, 3:31:11 PM, you wrote:
>> 
>> > On 04/12/14 14:09, Sander Eikelenboom wrote:
>> >> 
>> >> Thursday, December 4, 2014, 2:43:06 PM, you wrote:
>> >> 
>> >>> On 04/12/14 13:10, Sander Eikelenboom wrote:
>> >>>>
>> >>>> Thursday, December 4, 2014, 1:24:47 PM, you wrote:
>> >>>>
>> >>>>> On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
>> >>>>>>
>> >>>>>> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>> >>>>>>>
>> >>>>>>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
>> >>>>>>>>
>> >>>>>>>> Instead of doing all this complex dance, we depend on the toolstack 
>> >>>>>>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
>> >>>>>>>> which 'xl' uses when a device is detached or attached from/to a guest. 
>> >>>>>>>> It bypasses the need to worry about the PCI lock. 
>> >>>>>>>
>> >>>>>>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
>> >>>>>>> proposed). 
>> >>>>>>>
>> >>>>>>
>> >>>>>> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).
>> >>>>
>> >>>>> It is only needed if the core won't provide one.
>> >>>>
>> >>>>> +static int pcistub_try_create_reset_file(struct pci_dev *pci)
>> >>>>> +{
>> >>>>> +       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
>> >>>>> +       struct device *dev = &pci->dev;
>> >>>>> +       int ret;
>> >>>>> +
>> >>>>> +       /* Already have a per-function reset? */
>> >>>>> +       if (pci_probe_reset_function(pci) == 0)
>> >>>>> +               return 0;
>> >>>>> +
>> >>>>> +       ret = device_create_file(dev, &dev_attr_reset);
>> >>>>> +       if (ret < 0)
>> >>>>> +               return ret;
>> >>>> +       dev_data->>created_reset_file = true;
>> >>>>> +       return 0;
>> >>>>> +}
>> >>>>
>> >>>> Wouldn't the "core-reset-sysfs-file" be still wired to the end up calling 
>> >>>> "pci.c:__pci_dev_reset" ?
>> >>>>
>> >>>> The problem with that function is that from my testing it seems that the 
>> >>>> first option "pci_dev_specific_reset" always seems to return succes, so all the
>> >>>> other options are skipped (flr, pm, slot, bus). However the device it self is 
>> >>>> not properly reset enough (perhaps the pci_dev_specific_reset is good enough for 
>> >>>> none virtualization purposes and it's probably the least intrusive. For 
>> >>>> virtualization however it would be nice to be sure it resets properly, or have a 
>> >>>> way to force a specific reset routine.)
>> >> 
>> >>> Then you need work with the maintainer for those specific devices or
>> >>> drivers to fix their specific reset function.
>> >> 
>> >>> I'm not adding stuff to pciback to workaround broken quirks.
>> >> 
>> >> OK that's a pretty clear message there, so if one wants to use pci and vga
>> >> passthrough one should better use KVM and vfio-pci.
>> 
>> > Have you (or anyone else) ever raised the problem with the broken reset
>> > quirk for certain devices with the relevant maintainer?
>> 
>> >> vfio-pci has:
>> >> - logic to do the try-slot-bus-reset logic
>> 
>> > Just because vfio-pci fixed it incorrectly doesn't mean pciback has to
>> > as well.
>> 
>> Depends on what you call an "incorrect fix" .. it fixes a quirk .. 
>> you can say that's incorrect, but then you would have to remove 50% of
>> the kernel and Xen code as well.
>> 
>> (i do in general agree it's better to strive for a generic solution though,
>> that's exactly why i brought up that that function doesn't seem to work perfect
>> for virtualization purposes) 
>> 
>> > It makes no sense for both pciback and vfio-pci to workaround problems
>> > with pci_function_reset() in different ways -- it should be fixed in the
>> > core PCI code so both can benefit and make use of the same code.
>> 
>> Well perhaps Bjorn knows why the order of resets and skipping the rest as
>> implemented in "pci.c:__pci_dev_reset" was implemented in that way ?
>> 
>> Especially what is the expectation about pci_dev_specific_reset doing a proper 
>> reset for say a vga-card:
>> - i know it doesn't work on a radeon card (doesn't blank screen, on next guest 
>>   boot reports it's already posted, powermanagement doesn't work).
>> - while with a slot/bus reset, that all just works fine, screen blanks 
>>   immediately and everything else also works.
>> 
>> Added Alex as well since he added this workaround for KVM/vfio-pci, perhaps he knows why
>> he introduced the workaround in vfio-pci instead of trying to fix it in core pci 
>> code ?

> I don't know what workaround you're talking about.  As devices are
> released from the user, vfio-pci attempts to reset them.  If
> pci_reset_function() returns success we mark the device clean, otherwise
> it gets marked dirty.  Each time a device is released, if there are
> dirty devices we test whether we can try a bus/slot reset to clean them.
> In the case of assigning a GPU this typically means that the GPU or
> audio function come through first, there's no reset mechanism so it gets
> marked dirty, the next device comes through and we manage to try a bus
> reset.  vfio-pci does not have any device specific resets, all
> functionality is added to the PCI-core, thank-you-very-much.  I even
> posted a generic PCI quirk patch recently that marks AMD VGA PM reset as
> bad so that pci_reset_function() won't claim that worked.  All VGA
> access quirks are done in QEMU, the kernel doesn't have any business in
> remapping config space over MMIO regions or trapping other config space
> backdoors.

Thanks for your insightful reply!

With "workaround" I was trying to refer to "vfio_pci_try_bus_reset()" which
implements how to reset the devices, it indeed uses function you introduced in
pci core code (with a solution for locking issues Konrad also seems to have 
ran into: 
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=61cf16d8bd38c3dc52033ea75d5b1f8368514a17

David seems to be arguing the whole "vfio_pci_try_bus_reset()" should be not 
needed and just doing calling "pci_reset_function()" (directly or by
echo "1" > /sys/bus/pci/devices/BDF/reset shoud always magically do the
right thing. 
(Which in my opinion seems the contradict with the mere existence
of "vfio_pci_try_bus_reset()" (i don't think you would have implemented it 
when you would have deemed it unnecessary)) 

> I have never heard of problems with the dev specific reset claiming to
> work and not doing anything, there are only a few of these, it should be
> easy to debug.

> I didn't read the original patch, but the title alone of this patch is
> quite confusing.  FLR is specifically a function-level-reset, so one
> would expect 'do_flr' to be function specific.  The pci-sysfs 'reset'
> attribute is already function specific.  If pci_reset_function() isn't
> doing the job and we need to use bus/slot reset, it's clearly not an
> FLR.  Thanks,
> Alex

The name "do_flr" is coming from the Xen xl toolstack which historically has 
code that tries to reset devices using a echo "BDF" > /sys/bus/pci/drivers/pciback/do_flr

But the name "do_flr" and the debug messages indeed are incorrect (it's not 
doing a flr nor a D3/PM reset), confusing and should not be used.

And as you seem to have solved the locking issue for vfio-pci, it is probably 
possible for xen-pciback to do the same. Instead of letting xen-pciback
work around the locking problem by deferring to the xl toolstack the resetting
logic could be kept into xen-pciback it self. 
That would also mean that the sysfs attribute would be unnecessary and make 
the naming issue moot.

--
Sander



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 16:25               ` Sander Eikelenboom
@ 2014-12-04 16:55                 ` Alex Williamson
  0 siblings, 0 replies; 18+ messages in thread
From: Alex Williamson @ 2014-12-04 16:55 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: David Vrabel, bhelgaas, Konrad Rzeszutek Wilk, linux-pci,
	Boris Ostrovsky, linux-kernel, xen-devel

On Thu, 2014-12-04 at 17:25 +0100, Sander Eikelenboom wrote:
> Thursday, December 4, 2014, 4:39:06 PM, you wrote:
> 
> > On Thu, 2014-12-04 at 15:50 +0100, Sander Eikelenboom wrote:
> >> Thursday, December 4, 2014, 3:31:11 PM, you wrote:
> >> 
> >> > On 04/12/14 14:09, Sander Eikelenboom wrote:
> >> >> 
> >> >> Thursday, December 4, 2014, 2:43:06 PM, you wrote:
> >> >> 
> >> >>> On 04/12/14 13:10, Sander Eikelenboom wrote:
> >> >>>>
> >> >>>> Thursday, December 4, 2014, 1:24:47 PM, you wrote:
> >> >>>>
> >> >>>>> On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
> >> >>>>>>
> >> >>>>>> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
> >> >>>>>>>
> >> >>>>>>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
> >> >>>>>>>>
> >> >>>>>>>> Instead of doing all this complex dance, we depend on the toolstack 
> >> >>>>>>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
> >> >>>>>>>> which 'xl' uses when a device is detached or attached from/to a guest. 
> >> >>>>>>>> It bypasses the need to worry about the PCI lock. 
> >> >>>>>>>
> >> >>>>>>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
> >> >>>>>>> proposed). 
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).
> >> >>>>
> >> >>>>> It is only needed if the core won't provide one.
> >> >>>>
> >> >>>>> +static int pcistub_try_create_reset_file(struct pci_dev *pci)
> >> >>>>> +{
> >> >>>>> +       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
> >> >>>>> +       struct device *dev = &pci->dev;
> >> >>>>> +       int ret;
> >> >>>>> +
> >> >>>>> +       /* Already have a per-function reset? */
> >> >>>>> +       if (pci_probe_reset_function(pci) == 0)
> >> >>>>> +               return 0;
> >> >>>>> +
> >> >>>>> +       ret = device_create_file(dev, &dev_attr_reset);
> >> >>>>> +       if (ret < 0)
> >> >>>>> +               return ret;
> >> >>>> +       dev_data->>created_reset_file = true;
> >> >>>>> +       return 0;
> >> >>>>> +}
> >> >>>>
> >> >>>> Wouldn't the "core-reset-sysfs-file" be still wired to the end up calling 
> >> >>>> "pci.c:__pci_dev_reset" ?
> >> >>>>
> >> >>>> The problem with that function is that from my testing it seems that the 
> >> >>>> first option "pci_dev_specific_reset" always seems to return succes, so all the
> >> >>>> other options are skipped (flr, pm, slot, bus). However the device it self is 
> >> >>>> not properly reset enough (perhaps the pci_dev_specific_reset is good enough for 
> >> >>>> none virtualization purposes and it's probably the least intrusive. For 
> >> >>>> virtualization however it would be nice to be sure it resets properly, or have a 
> >> >>>> way to force a specific reset routine.)
> >> >> 
> >> >>> Then you need work with the maintainer for those specific devices or
> >> >>> drivers to fix their specific reset function.
> >> >> 
> >> >>> I'm not adding stuff to pciback to workaround broken quirks.
> >> >> 
> >> >> OK that's a pretty clear message there, so if one wants to use pci and vga
> >> >> passthrough one should better use KVM and vfio-pci.
> >> 
> >> > Have you (or anyone else) ever raised the problem with the broken reset
> >> > quirk for certain devices with the relevant maintainer?
> >> 
> >> >> vfio-pci has:
> >> >> - logic to do the try-slot-bus-reset logic
> >> 
> >> > Just because vfio-pci fixed it incorrectly doesn't mean pciback has to
> >> > as well.
> >> 
> >> Depends on what you call an "incorrect fix" .. it fixes a quirk .. 
> >> you can say that's incorrect, but then you would have to remove 50% of
> >> the kernel and Xen code as well.
> >> 
> >> (i do in general agree it's better to strive for a generic solution though,
> >> that's exactly why i brought up that that function doesn't seem to work perfect
> >> for virtualization purposes) 
> >> 
> >> > It makes no sense for both pciback and vfio-pci to workaround problems
> >> > with pci_function_reset() in different ways -- it should be fixed in the
> >> > core PCI code so both can benefit and make use of the same code.
> >> 
> >> Well perhaps Bjorn knows why the order of resets and skipping the rest as
> >> implemented in "pci.c:__pci_dev_reset" was implemented in that way ?
> >> 
> >> Especially what is the expectation about pci_dev_specific_reset doing a proper 
> >> reset for say a vga-card:
> >> - i know it doesn't work on a radeon card (doesn't blank screen, on next guest 
> >>   boot reports it's already posted, powermanagement doesn't work).
> >> - while with a slot/bus reset, that all just works fine, screen blanks 
> >>   immediately and everything else also works.
> >> 
> >> Added Alex as well since he added this workaround for KVM/vfio-pci, perhaps he knows why
> >> he introduced the workaround in vfio-pci instead of trying to fix it in core pci 
> >> code ?
> 
> > I don't know what workaround you're talking about.  As devices are
> > released from the user, vfio-pci attempts to reset them.  If
> > pci_reset_function() returns success we mark the device clean, otherwise
> > it gets marked dirty.  Each time a device is released, if there are
> > dirty devices we test whether we can try a bus/slot reset to clean them.
> > In the case of assigning a GPU this typically means that the GPU or
> > audio function come through first, there's no reset mechanism so it gets
> > marked dirty, the next device comes through and we manage to try a bus
> > reset.  vfio-pci does not have any device specific resets, all
> > functionality is added to the PCI-core, thank-you-very-much.  I even
> > posted a generic PCI quirk patch recently that marks AMD VGA PM reset as
> > bad so that pci_reset_function() won't claim that worked.  All VGA
> > access quirks are done in QEMU, the kernel doesn't have any business in
> > remapping config space over MMIO regions or trapping other config space
> > backdoors.
> 
> Thanks for your insightful reply!
> 
> With "workaround" I was trying to refer to "vfio_pci_try_bus_reset()" which
> implements how to reset the devices, it indeed uses function you introduced in
> pci core code (with a solution for locking issues Konrad also seems to have 
> ran into: 
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=61cf16d8bd38c3dc52033ea75d5b1f8368514a17
> 
> David seems to be arguing the whole "vfio_pci_try_bus_reset()" should be not 
> needed and just doing calling "pci_reset_function()" (directly or by
> echo "1" > /sys/bus/pci/devices/BDF/reset shoud always magically do the
> right thing. 
> (Which in my opinion seems the contradict with the mere existence
> of "vfio_pci_try_bus_reset()" (i don't think you would have implemented it 
> when you would have deemed it unnecessary)) 

That truly would be magic because a bus/slot reset and function reset
are completely different beasts.  QEMU, through vfio-pci, makes use of
both.  Take for instance hot-plugging the second port of a dual-port NIC
to a guest, where the first port may be (a) assigned to the same guest,
(b) assigned to a different guest, (c) in-use by the host, or (d)
not-in-use.  For a hotplug I can only make use of a bus/slot reset in
one of those cases (d).  For a cold-plug or VM reset, only two (a,d).  I
don't see how pci_reset_function() can have that sort of visibility to
the ownership and usage of a given device.  vfio-pci doesn't even have
this visibility, which is why the distinction is made in QEMU.  vfio-pci
is just a conduit and gatekeeper to the PCI-core interfaces, for
instance preventing QEMU from doing a reset in the (b) and (c) cases.
What prevents that in the Xen case?  Userspace?

> > I have never heard of problems with the dev specific reset claiming to
> > work and not doing anything, there are only a few of these, it should be
> > easy to debug.
> 
> > I didn't read the original patch, but the title alone of this patch is
> > quite confusing.  FLR is specifically a function-level-reset, so one
> > would expect 'do_flr' to be function specific.  The pci-sysfs 'reset'
> > attribute is already function specific.  If pci_reset_function() isn't
> > doing the job and we need to use bus/slot reset, it's clearly not an
> > FLR.  Thanks,
> > Alex
> 
> The name "do_flr" is coming from the Xen xl toolstack which historically has 
> code that tries to reset devices using a echo "BDF" > /sys/bus/pci/drivers/pciback/do_flr

Redundant to /sys/bus/pci/devices/DDDD:BB:DD.F/reset

> But the name "do_flr" and the debug messages indeed are incorrect (it's not 
> doing a flr nor a D3/PM reset), confusing and should not be used.
> 
> And as you seem to have solved the locking issue for vfio-pci, it is probably 
> possible for xen-pciback to do the same. Instead of letting xen-pciback
> work around the locking problem by deferring to the xl toolstack the resetting
> logic could be kept into xen-pciback it self. 
> That would also mean that the sysfs attribute would be unnecessary and make 
> the naming issue moot.

I would consider the try_*_reset() interfaces to be a workaround for
existing locking issues which are much harder to solve.  It makes the
vfio-pci reset-on-release a best effort approach, which is generally
fine.  For vfio I can't rely on a toolstack, nor maybe should you.
There's always a chance that the VM/user is sent a kill -9 and it's the
kernel's job to protect itself and return things to a quiescent state.
This is why I don't simply have QEMU send a bus reset on shutdown or put
reset policy that can affect other users or the host in userspace.
Thanks,

Alex


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 14:31         ` David Vrabel
  2014-12-04 14:50           ` Sander Eikelenboom
@ 2014-12-04 19:05           ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 18+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-12-04 19:05 UTC (permalink / raw)
  To: David Vrabel, alex.williamson
  Cc: Sander Eikelenboom, bhelgaas, linux-pci, Boris Ostrovsky,
	linux-kernel, xen-devel

On Thu, Dec 04, 2014 at 02:31:11PM +0000, David Vrabel wrote:
> On 04/12/14 14:09, Sander Eikelenboom wrote:
> > 
> > Thursday, December 4, 2014, 2:43:06 PM, you wrote:
> > 
> >> On 04/12/14 13:10, Sander Eikelenboom wrote:
> >>>
> >>> Thursday, December 4, 2014, 1:24:47 PM, you wrote:
> >>>
> >>>> On 04/12/14 12:06, Konrad Rzeszutek Wilk wrote:
> >>>>>
> >>>>> On Dec 4, 2014 6:30 AM, David Vrabel <david.vrabel@citrix.com> wrote:
> >>>>>>
> >>>>>> On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote: 
> >>>>>>>
> >>>>>>> Instead of doing all this complex dance, we depend on the toolstack 
> >>>>>>> doing the right thing. As such implement the 'do_flr' SysFS attribute 
> >>>>>>> which 'xl' uses when a device is detached or attached from/to a guest. 
> >>>>>>> It bypasses the need to worry about the PCI lock. 
> >>>>>>
> >>>>>> No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly 
> >>>>>> proposed). 
> >>>>>>
> >>>>>
> >>>>> Which does not work as the kobj will complain (as there is already an 'reset' associated with the PCI device).
> >>>
> >>>> It is only needed if the core won't provide one.
> >>>
> >>>> +static int pcistub_try_create_reset_file(struct pci_dev *pci)
> >>>> +{
> >>>> +       struct xen_pcibk_dev_data *dev_data = pci_get_drvdata(pci);
> >>>> +       struct device *dev = &pci->dev;
> >>>> +       int ret;
> >>>> +
> >>>> +       /* Already have a per-function reset? */
> >>>> +       if (pci_probe_reset_function(pci) == 0)
> >>>> +               return 0;
> >>>> +
> >>>> +       ret = device_create_file(dev, &dev_attr_reset);
> >>>> +       if (ret < 0)
> >>>> +               return ret;
> >>> +       dev_data->>created_reset_file = true;
> >>>> +       return 0;
> >>>> +}
> >>>
> >>> Wouldn't the "core-reset-sysfs-file" be still wired to the end up calling 
> >>> "pci.c:__pci_dev_reset" ?
> >>>
> >>> The problem with that function is that from my testing it seems that the 
> >>> first option "pci_dev_specific_reset" always seems to return succes, so all the
> >>> other options are skipped (flr, pm, slot, bus). However the device it self is 
> >>> not properly reset enough (perhaps the pci_dev_specific_reset is good enough for 
> >>> none virtualization purposes and it's probably the least intrusive. For 
> >>> virtualization however it would be nice to be sure it resets properly, or have a 
> >>> way to force a specific reset routine.)
> > 
> >> Then you need work with the maintainer for those specific devices or
> >> drivers to fix their specific reset function.
> > 
> >> I'm not adding stuff to pciback to workaround broken quirks.
> > 
> > OK that's a pretty clear message there, so if one wants to use pci and vga
> > passthrough one should better use KVM and vfio-pci.
> 
> Have you (or anyone else) ever raised the problem with the broken reset
> quirk for certain devices with the relevant maintainer?
> 
> > vfio-pci has:
> > - logic to do the try-slot-bus-reset logic
> 
> Just because vfio-pci fixed it incorrectly doesn't mean pciback has to
> as well.
> 
> It makes no sense for both pciback and vfio-pci to workaround problems
> with pci_function_reset() in different ways -- it should be fixed in the
> core PCI code so both can benefit and make use of the same code.

We seem to be going in circles.

This thread: http://patchwork.ozlabs.org/patch/368668/

has an interesting discussion that pretty much touches all of this
and I believe it ends with David agreeing with Alex that an
bus-reset is a perfect use-case.

I believe the contention was on how to expose this interface
to the user-space. David's was thinking to use 'reset' while
I used 'do_flr' (which is a misleading name but hte toolstack
already does it). Perhaps we should just have (as David suggested)
an 'bus_reset' on the devices.

I am going to go on a limb and presume that David was thinking
that this 'bus_reset' would be exposed in the generic PCI land?

> 
> David
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-04 15:39             ` Alex Williamson
  2014-12-04 16:25               ` Sander Eikelenboom
@ 2014-12-05 10:30               ` David Vrabel
  2014-12-05 17:22                 ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 18+ messages in thread
From: David Vrabel @ 2014-12-05 10:30 UTC (permalink / raw)
  To: Alex Williamson, Sander Eikelenboom
  Cc: bhelgaas, Konrad Rzeszutek Wilk, linux-pci, Boris Ostrovsky,
	linux-kernel, xen-devel

On 04/12/14 15:39, Alex Williamson wrote:
> 
> I don't know what workaround you're talking about.  As devices are
> released from the user, vfio-pci attempts to reset them.  If
> pci_reset_function() returns success we mark the device clean, otherwise
> it gets marked dirty.  Each time a device is released, if there are
> dirty devices we test whether we can try a bus/slot reset to clean them.
> In the case of assigning a GPU this typically means that the GPU or
> audio function come through first, there's no reset mechanism so it gets
> marked dirty, the next device comes through and we manage to try a bus
> reset.  vfio-pci does not have any device specific resets, all
> functionality is added to the PCI-core, thank-you-very-much.  I even
> posted a generic PCI quirk patch recently that marks AMD VGA PM reset as
> bad so that pci_reset_function() won't claim that worked.  All VGA
> access quirks are done in QEMU, the kernel doesn't have any business in
> remapping config space over MMIO regions or trapping other config space
> backdoors.

Thanks for the info Alex, I hadn't got around to actually looking and
the vfio-pci code and was just going to what Sander said.

We probably do need to have a more in depth look at now PCI devices and
handled by both the toolstack and pciback but in the short term I would
like a simple solution that does not extend the ABI.

David

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-05 10:30               ` David Vrabel
@ 2014-12-05 17:22                 ` Konrad Rzeszutek Wilk
  2014-12-08 10:38                   ` David Vrabel
  0 siblings, 1 reply; 18+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-12-05 17:22 UTC (permalink / raw)
  To: David Vrabel
  Cc: Alex Williamson, Sander Eikelenboom, bhelgaas, linux-pci,
	Boris Ostrovsky, linux-kernel, xen-devel

On Fri, Dec 05, 2014 at 10:30:01AM +0000, David Vrabel wrote:
> On 04/12/14 15:39, Alex Williamson wrote:
> > 
> > I don't know what workaround you're talking about.  As devices are
> > released from the user, vfio-pci attempts to reset them.  If
> > pci_reset_function() returns success we mark the device clean, otherwise
> > it gets marked dirty.  Each time a device is released, if there are
> > dirty devices we test whether we can try a bus/slot reset to clean them.
> > In the case of assigning a GPU this typically means that the GPU or
> > audio function come through first, there's no reset mechanism so it gets
> > marked dirty, the next device comes through and we manage to try a bus
> > reset.  vfio-pci does not have any device specific resets, all
> > functionality is added to the PCI-core, thank-you-very-much.  I even
> > posted a generic PCI quirk patch recently that marks AMD VGA PM reset as
> > bad so that pci_reset_function() won't claim that worked.  All VGA
> > access quirks are done in QEMU, the kernel doesn't have any business in
> > remapping config space over MMIO regions or trapping other config space
> > backdoors.
> 
> Thanks for the info Alex, I hadn't got around to actually looking and
> the vfio-pci code and was just going to what Sander said.
> 
> We probably do need to have a more in depth look at now PCI devices and
> handled by both the toolstack and pciback but in the short term I would
> like a simple solution that does not extend the ABI.

Could you enumerate the 'simple solution' then please? I am having
a frustrating time figuring out what it is that you are proposing.


> 
> David

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-05 17:22                 ` Konrad Rzeszutek Wilk
@ 2014-12-08 10:38                   ` David Vrabel
  2014-12-08 16:04                     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 18+ messages in thread
From: David Vrabel @ 2014-12-08 10:38 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Alex Williamson, Sander Eikelenboom, bhelgaas, linux-pci,
	Boris Ostrovsky, linux-kernel, xen-devel

On 05/12/14 17:22, Konrad Rzeszutek Wilk wrote:
> On Fri, Dec 05, 2014 at 10:30:01AM +0000, David Vrabel wrote:
>> On 04/12/14 15:39, Alex Williamson wrote:
>>>
>>> I don't know what workaround you're talking about.  As devices are
>>> released from the user, vfio-pci attempts to reset them.  If
>>> pci_reset_function() returns success we mark the device clean, otherwise
>>> it gets marked dirty.  Each time a device is released, if there are
>>> dirty devices we test whether we can try a bus/slot reset to clean them.
>>> In the case of assigning a GPU this typically means that the GPU or
>>> audio function come through first, there's no reset mechanism so it gets
>>> marked dirty, the next device comes through and we manage to try a bus
>>> reset.  vfio-pci does not have any device specific resets, all
>>> functionality is added to the PCI-core, thank-you-very-much.  I even
>>> posted a generic PCI quirk patch recently that marks AMD VGA PM reset as
>>> bad so that pci_reset_function() won't claim that worked.  All VGA
>>> access quirks are done in QEMU, the kernel doesn't have any business in
>>> remapping config space over MMIO regions or trapping other config space
>>> backdoors.
>>
>> Thanks for the info Alex, I hadn't got around to actually looking and
>> the vfio-pci code and was just going to what Sander said.
>>
>> We probably do need to have a more in depth look at now PCI devices and
>> handled by both the toolstack and pciback but in the short term I would
>> like a simple solution that does not extend the ABI.
> 
> Could you enumerate the 'simple solution' then please? I am having
> a frustrating time figuring out what it is that you are proposing.

I've posted it repeatedly.

David

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-08 10:38                   ` David Vrabel
@ 2014-12-08 16:04                     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 18+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-12-08 16:04 UTC (permalink / raw)
  To: David Vrabel
  Cc: Alex Williamson, Sander Eikelenboom, bhelgaas, linux-pci,
	Boris Ostrovsky, linux-kernel, xen-devel

On Mon, Dec 08, 2014 at 10:38:09AM +0000, David Vrabel wrote:
> On 05/12/14 17:22, Konrad Rzeszutek Wilk wrote:
> > On Fri, Dec 05, 2014 at 10:30:01AM +0000, David Vrabel wrote:
> >> On 04/12/14 15:39, Alex Williamson wrote:
> >>>
> >>> I don't know what workaround you're talking about.  As devices are
> >>> released from the user, vfio-pci attempts to reset them.  If
> >>> pci_reset_function() returns success we mark the device clean, otherwise
> >>> it gets marked dirty.  Each time a device is released, if there are
> >>> dirty devices we test whether we can try a bus/slot reset to clean them.
> >>> In the case of assigning a GPU this typically means that the GPU or
> >>> audio function come through first, there's no reset mechanism so it gets
> >>> marked dirty, the next device comes through and we manage to try a bus
> >>> reset.  vfio-pci does not have any device specific resets, all
> >>> functionality is added to the PCI-core, thank-you-very-much.  I even
> >>> posted a generic PCI quirk patch recently that marks AMD VGA PM reset as
> >>> bad so that pci_reset_function() won't claim that worked.  All VGA
> >>> access quirks are done in QEMU, the kernel doesn't have any business in
> >>> remapping config space over MMIO regions or trapping other config space
> >>> backdoors.
> >>
> >> Thanks for the info Alex, I hadn't got around to actually looking and
> >> the vfio-pci code and was just going to what Sander said.
> >>
> >> We probably do need to have a more in depth look at now PCI devices and
> >> handled by both the toolstack and pciback but in the short term I would
> >> like a simple solution that does not extend the ABI.
> > 
> > Could you enumerate the 'simple solution' then please? I am having
> > a frustrating time figuring out what it is that you are proposing.
> 
> I've posted it repeatedly.

Are you referring to http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg01270.html
which is still waiting for your feedback?


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-03 21:40 ` [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute Konrad Rzeszutek Wilk
@ 2014-12-04 11:30   ` David Vrabel
  0 siblings, 0 replies; 18+ messages in thread
From: David Vrabel @ 2014-12-04 11:30 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, bhelgaas, linux-pci, linux-kernel,
	xen-devel, boris.ostrovsky

On 03/12/14 21:40, Konrad Rzeszutek Wilk wrote:
> 
> Instead of doing all this complex dance, we depend on the toolstack
> doing the right thing. As such implement the 'do_flr' SysFS attribute
> which 'xl' uses when a device is detached or attached from/to a guest.
> It bypasses the need to worry about the PCI lock.

No.  Get pciback to add its own "reset" sysfs file (as I have repeatedly
proposed).

David

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute
  2014-12-03 21:40 [PATCH v5] Fixes for PCI backend for 3.19 Konrad Rzeszutek Wilk
@ 2014-12-03 21:40 ` Konrad Rzeszutek Wilk
  2014-12-04 11:30   ` David Vrabel
  0 siblings, 1 reply; 18+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-12-03 21:40 UTC (permalink / raw)
  To: bhelgaas, linux-pci, linux-kernel, xen-devel, boris.ostrovsky,
	david.vrabel
  Cc: Konrad Rzeszutek Wilk

The life-cycle of a PCI device in Xen pciback is complex
and is constrained by the PCI generic locking mechanism.

It starts with the device being binded to us - for which
we do a device function reset (and done via SysFS
so the PCI lock is held)

If the device is unbinded from us - we also do a function
reset (also done via SysFS so the PCI lock is held).

If the device is un-assigned from a guest - we do a function
reset (no PCI lock).

All on the individual PCI function level (so bus:device:function).

Unfortunatly a function reset is not adequate for certain
PCIe devices. The reset for an individual PCI function "means
device must support FLR (PCIe or AF), PM reset on D3hot->D0
device specific reset, or be a singleton device on a bus
a secondary bus reset.  FLR does not have widespread support,
reset is not very reliable, and bus topology is dictated by the
and device design.  We need to provide a means for a user to
a bus reset in cases where the existing mechanisms are not
 or not reliable. " (Adam Williamson, 'vfio-pci: PCI hot reset
interface' commit 8b27ee60bfd6bbb84d2df28fa706c5c5081066ca).

As such to do a slot or a bus reset is we need another mechanism.
This is not exposed SysFS as there is no good way of exposing
a bus topology there.

This is due to the complexity - we MUST know that the different
functions off a PCIe device are not in use by other drivers, or
if they are in use (say one of them is assigned to a guest
and the other is idle) - it is still OK to reset the slot
(assuming both of them are owned by Xen pciback).

This patch does that by doing an slot or bus reset (if
slot not supported) if all of the functions of a PCIe
device belong to Xen PCIback. We do not care if the device is
in-use as we depend on the toolstack to be aware of this -
however if it is we will WARN the user.

Due to the complexity with the PCI lock we cannot do
the reset when a device is binded ('echo $BDF > bind')
or when unbinded ('echo $BDF > unbind') as the pci_[slot|bus]_reset
also take the same lock resulting in a dead-lock.

Putting the reset function in a workqueue or thread
won't work either - as we have to do the reset function
outside the 'unbind' context (it holds the PCI lock).
But once you 'unbind' a device the device is no longer
under the ownership of Xen pciback and the pci_set_drvdata
has been reset so we cannot use a thread for this.

Instead of doing all this complex dance, we depend on the toolstack
doing the right thing. As such implement the 'do_flr' SysFS attribute
which 'xl' uses when a device is detached or attached from/to a guest.
It bypasses the need to worry about the PCI lock.

To not inadvertly do a bus reset that would affect devices that
are in use by other drivers (other than Xen pciback) prior
to the reset we check that all of the devices under the bridge
are owned by Xen pciback. If they are not we do not do
the bus (or slot) reset.

We also warn the user if the device is in use - but still
continue with the reset. This should not happen as the toolstack
also does the check.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 Documentation/ABI/testing/sysfs-driver-pciback |  12 +++
 drivers/xen/xen-pciback/pci_stub.c             | 124 ++++++++++++++++++++++---
 2 files changed, 125 insertions(+), 11 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
index 6a733bf..2d4e32f 100644
--- a/Documentation/ABI/testing/sysfs-driver-pciback
+++ b/Documentation/ABI/testing/sysfs-driver-pciback
@@ -11,3 +11,15 @@ Description:
                 #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
                 will allow the guest to read and write to the configuration
                 register 0x0E.
+
+
+What:           /sys/bus/pci/drivers/pciback/do_flr
+Date:           December 2014
+KernelVersion:  3.19
+Contact:        xen-devel@lists.xenproject.org
+Description:
+                An option to slot or bus reset an PCI device owned by
+                Xen PCI backend. Writing a string of DDDD:BB:DD.F will cause
+                the driver to perform an slot or bus reset if the device
+                supports. It also checks to make sure that all of the devices
+                under the bridge are owned by Xen PCI backend.
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index cc3cbb4..f830bf4 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -100,14 +100,9 @@ static void pcistub_device_release(struct kref *kref)
 
 	xen_unregister_device_domain_owner(dev);
 
-	/* Call the reset function which does not take lock as this
-	 * is called from "unbind" which takes a device_lock mutex.
-	 */
-	__pci_reset_function_locked(dev);
+	/* Reset is done by the toolstack by using 'do_flr' on the SysFS. */
 	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
 		dev_info(&dev->dev, "Could not reload PCI state\n");
-	else
-		pci_restore_state(dev);
 
 	if (dev->msix_cap) {
 		struct physdev_pci_device ppdev = {
@@ -123,9 +118,6 @@ static void pcistub_device_release(struct kref *kref)
 				 err);
 	}
 
-	/* Disable the device */
-	xen_pcibk_reset_device(dev);
-
 	kfree(dev_data);
 	pci_set_drvdata(dev, NULL);
 
@@ -242,6 +234,87 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
 	return found_dev;
 }
 
+struct wrapper_args {
+	struct pci_dev *dev;
+	int in_use;
+};
+
+static int pcistub_pci_walk_wrapper(struct pci_dev *dev, void *data)
+{
+	struct pcistub_device *psdev, *found_psdev = NULL;
+	struct wrapper_args *arg = data;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pcistub_devices_lock, flags);
+	list_for_each_entry(psdev, &pcistub_devices, dev_list) {
+		if (psdev->dev == dev) {
+			found_psdev = psdev;
+			if (psdev->pdev)
+				arg->in_use++;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+	dev_dbg(&dev->dev, "%s\n", found_psdev ? "OK" : "not owned by us!");
+
+	if (!found_psdev)
+		arg->dev = dev;
+	return found_psdev ? 0 : 1;
+}
+
+static int pcistub_reset_pci_dev(struct pci_dev *dev)
+{
+	struct xen_pcibk_dev_data *dev_data;
+	struct wrapper_args arg = { .dev = NULL, .in_use = 0 };
+	bool slot = false, bus = false;
+	int rc;
+
+	if (!dev)
+		return -EINVAL;
+
+	if (!pci_probe_reset_slot(dev->slot))
+		slot = true;
+	else if (!pci_probe_reset_bus(dev->bus)) {
+		/* We won't attempt to reset a root bridge. */
+		if (!pci_is_root_bus(dev->bus))
+			bus = true;
+	}
+	dev_dbg(&dev->dev, "resetting (FLR, D3, %s %s) the device\n",
+		slot ? "slot" : "", bus ? "bus" : "");
+
+	pci_walk_bus(dev->bus, pcistub_pci_walk_wrapper, &arg);
+
+	if (arg.in_use)
+		dev_err(&dev->dev, "is in use!\n");
+
+	/*
+	 * Takes the PCI lock. OK to do it as we are never called
+	 * from 'unbind' state and don't deadlock.
+	 */
+	dev_data = pci_get_drvdata(dev);
+	if (!pci_load_saved_state(dev, dev_data->pci_saved_state))
+		pci_restore_state(dev);
+
+	pci_reset_function(dev);
+
+	/* This disables the device. */
+	xen_pcibk_reset_device(dev);
+
+	/* And cleanup up our emulated fields. */
+	xen_pcibk_config_reset_dev(dev);
+
+	if (!bus && !slot)
+		return 0;
+
+	/* All slots or devices under the bus should be part of pcistub! */
+	if (arg.dev) {
+		dev_err(&dev->dev, "depends on: %s!\n", pci_name(arg.dev));
+		return -EBUSY;
+	}
+	return slot ? pci_try_reset_slot(dev->slot) :
+		      pci_try_reset_bus(dev->bus);
+}
+
 /*
  * Called when:
  *  - XenBus state has been reconfigure (pci unplug). See xen_pcibk_remove_device
@@ -277,8 +350,9 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	* pcistub and xen_pcibk when AER is in processing
 	*/
 	down_write(&pcistub_sem);
-	/* Cleanup our device
-	 * (so it's ready for the next domain)
+	/* Cleanup our device (so it's ready for the next domain)
+	 * That is the job of the toolstack which has to call 'do_flr' before
+	 * providing the PCI device to a guest (see pcistub_do_flr).
 	 */
 	device_lock_assert(&dev->dev);
 	__pci_reset_function_locked(dev);
@@ -1389,6 +1463,29 @@ static ssize_t permissive_show(struct device_driver *drv, char *buf)
 static DRIVER_ATTR(permissive, S_IRUSR | S_IWUSR, permissive_show,
 		   permissive_add);
 
+static ssize_t pcistub_do_flr(struct device_driver *drv, const char *buf,
+				size_t count)
+{
+	int domain, bus, slot, func;
+	int err;
+	struct pcistub_device *psdev;
+
+	err = str_to_slot(buf, &domain, &bus, &slot, &func);
+	if (err)
+		goto out;
+
+	psdev = pcistub_device_find(domain, bus, slot, func);
+	if (psdev) {
+		err = pcistub_reset_pci_dev(psdev->dev);
+		pcistub_device_put(psdev);
+	} else
+		err = -ENODEV;
+out:
+	if (!err)
+		err = count;
+	return err;
+}
+static DRIVER_ATTR(do_flr, S_IWUSR, NULL, pcistub_do_flr);
 static void pcistub_exit(void)
 {
 	driver_remove_file(&xen_pcibk_pci_driver.driver, &driver_attr_new_slot);
@@ -1402,6 +1499,8 @@ static void pcistub_exit(void)
 			   &driver_attr_irq_handlers);
 	driver_remove_file(&xen_pcibk_pci_driver.driver,
 			   &driver_attr_irq_handler_state);
+	driver_remove_file(&xen_pcibk_pci_driver.driver,
+			   &driver_attr_do_flr);
 	pci_unregister_driver(&xen_pcibk_pci_driver);
 }
 
@@ -1495,6 +1594,9 @@ static int __init pcistub_init(void)
 	if (!err)
 		err = driver_create_file(&xen_pcibk_pci_driver.driver,
 					&driver_attr_irq_handler_state);
+	if (!err)
+		err = driver_create_file(&xen_pcibk_pci_driver.driver,
+					&driver_attr_do_flr);
 	if (err)
 		pcistub_exit();
 
-- 
1.9.3


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2014-12-08 16:04 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-04 12:06 [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute Konrad Rzeszutek Wilk
2014-12-04 12:24 ` [Xen-devel] " David Vrabel
2014-12-04 13:10   ` Sander Eikelenboom
2014-12-04 13:43     ` David Vrabel
2014-12-04 14:09       ` Sander Eikelenboom
2014-12-04 14:14         ` Sander Eikelenboom
2014-12-04 14:31         ` David Vrabel
2014-12-04 14:50           ` Sander Eikelenboom
2014-12-04 15:39             ` Alex Williamson
2014-12-04 16:25               ` Sander Eikelenboom
2014-12-04 16:55                 ` Alex Williamson
2014-12-05 10:30               ` David Vrabel
2014-12-05 17:22                 ` Konrad Rzeszutek Wilk
2014-12-08 10:38                   ` David Vrabel
2014-12-08 16:04                     ` Konrad Rzeszutek Wilk
2014-12-04 19:05           ` Konrad Rzeszutek Wilk
  -- strict thread matches above, loose matches on Subject: below --
2014-12-03 21:40 [PATCH v5] Fixes for PCI backend for 3.19 Konrad Rzeszutek Wilk
2014-12-03 21:40 ` [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute Konrad Rzeszutek Wilk
2014-12-04 11:30   ` David Vrabel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).