LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier()
@ 2021-11-28  2:50 Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 01/17] iommu: Add device dma ownership set/release interfaces Lu Baolu
                   ` (18 more replies)
  0 siblings, 19 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

The original post and intent of this series is here.
https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/

Change log:
v1: initial post
  - https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/

v2:
  - Move kernel dma ownership auto-claiming from driver core to bus
    callback. [Greg/Christoph/Robin/Jason]
    https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/T/#m153706912b770682cb12e3c28f57e171aa1f9d0c

  - Code and interface refactoring for iommu_set/release_dma_owner()
    interfaces. [Jason]
    https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/T/#mea70ed8e4e3665aedf32a5a0a7db095bf680325e

  - [NEW]Add new iommu_attach/detach_device_shared() interfaces for
    multiple devices group. [Robin/Jason]
    https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/T/#mea70ed8e4e3665aedf32a5a0a7db095bf680325e
  
  - [NEW]Use iommu_attach/detach_device_shared() in drm/tegra drivers.

  - Refactoring and description refinement.

This is based on v5.16-rc2 and available on github:
https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-v2

Best regards,
baolu

Jason Gunthorpe (2):
  vfio: Delete the unbound_list
  drm/tegra: Use the iommu dma_owner mechanism

Lu Baolu (15):
  iommu: Add device dma ownership set/release interfaces
  driver core: Add dma_unconfigure callback in bus_type
  PCI: Add driver dma ownership management
  driver core: platform: Add driver dma ownership management
  amba: Add driver dma ownership management
  bus: fsl-mc: Add driver dma ownership management
  PCI: pci_stub: Suppress kernel DMA ownership auto-claiming
  PCI: portdrv: Suppress kernel DMA ownership auto-claiming
  iommu: Add security context management for assigned devices
  iommu: Expose group variants of dma ownership interfaces
  iommu: Add iommu_at[de]tach_device_shared() for multi-device groups
  vfio: Set DMA USER ownership for VFIO devices
  vfio: Remove use of vfio_group_viable()
  vfio: Remove iommu group notifier
  iommu: Remove iommu group changes notifier

 include/linux/amba/bus.h              |   1 +
 include/linux/device/bus.h            |   3 +
 include/linux/fsl/mc.h                |   5 +
 include/linux/iommu.h                 |  93 ++++++--
 include/linux/pci.h                   |   5 +
 include/linux/platform_device.h       |   1 +
 drivers/amba/bus.c                    |  30 ++-
 drivers/base/dd.c                     |   7 +-
 drivers/base/platform.c               |  30 ++-
 drivers/bus/fsl-mc/fsl-mc-bus.c       |  26 +-
 drivers/gpu/drm/tegra/dc.c            |   1 +
 drivers/gpu/drm/tegra/drm.c           |  55 ++---
 drivers/gpu/drm/tegra/gr2d.c          |   1 +
 drivers/gpu/drm/tegra/gr3d.c          |   1 +
 drivers/gpu/drm/tegra/vic.c           |   1 +
 drivers/iommu/iommu.c                 | 329 ++++++++++++++++++++------
 drivers/pci/pci-driver.c              |  21 ++
 drivers/pci/pci-stub.c                |   1 +
 drivers/pci/pcie/portdrv_pci.c        |   2 +
 drivers/vfio/fsl-mc/vfio_fsl_mc.c     |   1 +
 drivers/vfio/pci/vfio_pci.c           |   1 +
 drivers/vfio/platform/vfio_amba.c     |   1 +
 drivers/vfio/platform/vfio_platform.c |   1 +
 drivers/vfio/vfio.c                   | 248 ++-----------------
 24 files changed, 502 insertions(+), 363 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 01/17] iommu: Add device dma ownership set/release interfaces
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 02/17] driver core: Add dma_unconfigure callback in bus_type Lu Baolu
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

From the perspective of who is initiating the device to do DMA, device
DMA could be divided into the following types:

        DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver
			through the kernel DMA API.
        DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel
			driver with its own PRIVATE domain.
	DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by
			userspace.

Different DMA ownerships are exclusive for all devices in the same iommu
group as an iommu group is the smallest granularity of device isolation
and protection that the IOMMU subsystem can guarantee. This extends the
iommu core to enforce this exclusion.

Basically two new interfaces are provided:

        int iommu_device_set_dma_owner(struct device *dev,
                enum iommu_dma_owner type, void *owner_cookie);
        void iommu_device_release_dma_owner(struct device *dev,
                enum iommu_dma_owner type);

Although above interfaces are per-device, DMA owner is tracked per group
under the hood. An iommu group cannot have different dma ownership set
at the same time. Violation of this assumption fails
iommu_device_set_dma_owner().

Kernel driver which does DMA have DMA_OWNER_DMA_API automatically set/
released in the driver binding/unbinding process (see next patch).

Kernel driver which doesn't do DMA could avoid setting the owner type.
Device bound to such driver is considered same as a driver-less device
which is compatible to all owner types.

Userspace driver framework (e.g. vfio) should set
DMA_OWNER_PRIVATE_DOMAIN_USER for a device before the userspace is allowed
to access it, plus a owner cookie pointer to mark the user identity so a
single group cannot be operated by multiple users simultaneously. Vice
versa, the owner type should be released after the user access permission
is withdrawn.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h | 36 +++++++++++++++++
 drivers/iommu/iommu.c | 93 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 129 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d2f3435e7d17..24676b498f38 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -162,6 +162,23 @@ enum iommu_dev_features {
 	IOMMU_DEV_FEAT_IOPF,
 };
 
+/**
+ * enum iommu_dma_owner - IOMMU DMA ownership
+ * @DMA_OWNER_NONE: No DMA ownership.
+ * @DMA_OWNER_DMA_API: Device DMAs are initiated by a kernel driver through
+ *			the kernel DMA API.
+ * @DMA_OWNER_PRIVATE_DOMAIN: Device DMAs are initiated by a kernel driver
+ *			which provides an UNMANAGED domain.
+ * @DMA_OWNER_PRIVATE_DOMAIN_USER: Device DMAs are initiated by userspace,
+ *			kernel ensures that DMAs never go to kernel memory.
+ */
+enum iommu_dma_owner {
+	DMA_OWNER_NONE,
+	DMA_OWNER_DMA_API,
+	DMA_OWNER_PRIVATE_DOMAIN,
+	DMA_OWNER_PRIVATE_DOMAIN_USER,
+};
+
 #define IOMMU_PASID_INVALID	(-1U)
 
 #ifdef CONFIG_IOMMU_API
@@ -681,6 +698,10 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
 void iommu_sva_unbind_device(struct iommu_sva *handle);
 u32 iommu_sva_get_pasid(struct iommu_sva *handle);
 
+int iommu_device_set_dma_owner(struct device *dev, enum iommu_dma_owner owner,
+			       void *owner_cookie);
+void iommu_device_release_dma_owner(struct device *dev, enum iommu_dma_owner owner);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1081,6 +1102,21 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
 {
 	return NULL;
 }
+
+static inline int iommu_device_set_dma_owner(struct device *dev,
+					     enum iommu_dma_owner owner,
+					     void *owner_cookie)
+{
+	if (owner != DMA_OWNER_DMA_API)
+		return -EINVAL;
+
+	return 0;
+}
+
+static inline void iommu_device_release_dma_owner(struct device *dev,
+						  enum iommu_dma_owner owner)
+{
+}
 #endif /* CONFIG_IOMMU_API */
 
 /**
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 8b86406b7162..1de520a07518 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -48,6 +48,9 @@ struct iommu_group {
 	struct iommu_domain *default_domain;
 	struct iommu_domain *domain;
 	struct list_head entry;
+	enum iommu_dma_owner dma_owner;
+	refcount_t owner_cnt;
+	void *owner_cookie;
 };
 
 struct group_device {
@@ -621,6 +624,7 @@ struct iommu_group *iommu_group_alloc(void)
 	INIT_LIST_HEAD(&group->devices);
 	INIT_LIST_HEAD(&group->entry);
 	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
+	group->dma_owner = DMA_OWNER_NONE;
 
 	ret = ida_simple_get(&iommu_group_ida, 0, 0, GFP_KERNEL);
 	if (ret < 0) {
@@ -3351,3 +3355,92 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
 
 	return ret;
 }
+
+static int __iommu_group_set_dma_owner(struct iommu_group *group,
+				       enum iommu_dma_owner owner,
+				       void *owner_cookie)
+{
+	if (refcount_inc_not_zero(&group->owner_cnt)) {
+		if (group->dma_owner != owner ||
+		    group->owner_cookie != owner_cookie) {
+			refcount_dec(&group->owner_cnt);
+			return -EBUSY;
+		}
+
+		return 0;
+	}
+
+	group->dma_owner = owner;
+	group->owner_cookie = owner_cookie;
+	refcount_set(&group->owner_cnt, 1);
+
+	return 0;
+}
+
+static void __iommu_group_release_dma_owner(struct iommu_group *group,
+					    enum iommu_dma_owner owner)
+{
+	if (WARN_ON(group->dma_owner != owner))
+		return;
+
+	if (!refcount_dec_and_test(&group->owner_cnt))
+		return;
+
+	group->dma_owner = DMA_OWNER_NONE;
+}
+
+/**
+ * iommu_device_set_dma_owner() - Set DMA ownership of a device
+ * @dev: The device.
+ * @owner: DMA ownership type.
+ * @owner_cookie: Caller specified pointer. Could be used for exclusive
+ *                declaration. Could be NULL.
+ *
+ * Set the DMA ownership of a device. The different ownerships are
+ * exclusive. The caller could specify a owner_cookie pointer so that
+ * the same DMA ownership could be exclusive among different owners.
+ */
+int iommu_device_set_dma_owner(struct device *dev, enum iommu_dma_owner owner,
+			       void *owner_cookie)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+	int ret;
+
+	if (!group) {
+		if (owner == DMA_OWNER_DMA_API)
+			return 0;
+		else
+			return -ENODEV;
+	}
+
+	mutex_lock(&group->mutex);
+	ret = __iommu_group_set_dma_owner(group, owner, owner_cookie);
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_device_set_dma_owner);
+
+/**
+ * iommu_device_release_dma_owner() - Release DMA ownership of a device
+ * @dev: The device.
+ * @owner: The DMA ownership type.
+ *
+ * Release the DMA ownership claimed by iommu_device_set_dma_owner().
+ */
+void iommu_device_release_dma_owner(struct device *dev, enum iommu_dma_owner owner)
+{
+	struct iommu_group *group = iommu_group_get(dev);
+
+	if (!group) {
+		WARN_ON(owner != DMA_OWNER_DMA_API);
+		return;
+	}
+
+	mutex_lock(&group->mutex);
+	__iommu_group_release_dma_owner(group, owner);
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+}
+EXPORT_SYMBOL_GPL(iommu_device_release_dma_owner);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 02/17] driver core: Add dma_unconfigure callback in bus_type
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 01/17] iommu: Add device dma ownership set/release interfaces Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  8:02   ` Greg Kroah-Hartman
  2021-11-28  2:50 ` [PATCH v2 03/17] PCI: Add driver dma ownership management Lu Baolu
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

The bus_type structure defines dma_configure() callback for bus drivers
to configure DMA on the devices. This adds the paired dma_unconfigure()
callback and calls it during driver unbinding so that bus drivers can do
some cleanup work.

One use case for this paired DMA callbacks is for the bus driver to check
for DMA ownership conflicts during driver binding, where multiple devices
belonging to a same IOMMU group (the minimum granularity of isolation and
protection) may be assigned to kernel drivers or user space respectively.

Without this change, for example, the vfio driver has to listen to a bus
BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict.
This leads to bad user experience since careless driver binding operation
may crash the system if the admin overlooks the group restriction. Aside
from bad design, this leads to a security problem as a root user, even with
lockdown=integrity, can force the kernel to BUG.

With this change, the bus driver could check and set the DMA ownership in
driver binding process and fail on ownership conflicts. The DMA ownership
should be released during driver unbinding.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/linux-iommu/20210922123931.GI327412@nvidia.com/
Link: https://lore.kernel.org/linux-iommu/20210928115751.GK964074@nvidia.com/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/device/bus.h | 3 +++
 drivers/base/dd.c          | 7 ++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index a039ab809753..ef54a71e5f8f 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -59,6 +59,8 @@ struct fwnode_handle;
  *		bus supports.
  * @dma_configure:	Called to setup DMA configuration on a device on
  *			this bus.
+ * @dma_unconfigure:	Called to cleanup DMA configuration on a device on
+ *			this bus.
  * @pm:		Power management operations of this bus, callback the specific
  *		device driver's pm-ops.
  * @iommu_ops:  IOMMU specific operations for this bus, used to attach IOMMU
@@ -103,6 +105,7 @@ struct bus_type {
 	int (*num_vf)(struct device *dev);
 
 	int (*dma_configure)(struct device *dev);
+	void (*dma_unconfigure)(struct device *dev);
 
 	const struct dev_pm_ops *pm;
 
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 68ea1f949daa..a37aafff5fde 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -577,7 +577,7 @@ static int really_probe(struct device *dev, struct device_driver *drv)
 	if (dev->bus->dma_configure) {
 		ret = dev->bus->dma_configure(dev);
 		if (ret)
-			goto probe_failed;
+			goto pinctrl_bind_failed;
 	}
 
 	ret = driver_sysfs_add(dev);
@@ -660,6 +660,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
 	if (dev->bus)
 		blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 					     BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
+	if (dev->bus->dma_unconfigure)
+		dev->bus->dma_unconfigure(dev);
 pinctrl_bind_failed:
 	device_links_no_driver(dev);
 	devres_release_all(dev);
@@ -1204,6 +1206,9 @@ static void __device_release_driver(struct device *dev, struct device *parent)
 		else if (drv->remove)
 			drv->remove(dev);
 
+		if (dev->bus->dma_unconfigure)
+			dev->bus->dma_unconfigure(dev);
+
 		device_links_driver_cleanup(dev);
 
 		devres_release_all(dev);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 03/17] PCI: Add driver dma ownership management
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 01/17] iommu: Add device dma ownership set/release interfaces Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 02/17] driver core: Add dma_unconfigure callback in bus_type Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 04/17] driver core: platform: " Lu Baolu
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

Multiple PCI devices may be placed in the same IOMMU group because they
cannot be isolated from each other. These devices must either be entirely
under kernel control or userspace control, never a mixture. This checks
and sets DMA ownership during driver binding, and vice versa, release the
ownership during driver unbinding.

Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
the userspace framework drivers (vfio etc.) which need to manually claim
DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/pci.h      |  5 +++++
 drivers/pci/pci-driver.c | 21 +++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 18a75c8e615c..1b29af0ab43b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -882,6 +882,10 @@ struct module;
  *              created once it is bound to the driver.
  * @driver:	Driver model structure.
  * @dynids:	List of dynamically added device IDs.
+ * @suppress_auto_claim_dma_owner: Disable auto claiming of kernel DMA owner.
+ *		Drivers which don't require DMA or want to manually claim the
+ *		owner type (e.g. userspace driver frameworks) could set this
+ *		flag.
  */
 struct pci_driver {
 	struct list_head	node;
@@ -900,6 +904,7 @@ struct pci_driver {
 	const struct attribute_group **dev_groups;
 	struct device_driver	driver;
 	struct pci_dynids	dynids;
+	bool suppress_auto_claim_dma_owner;
 };
 
 static inline struct pci_driver *to_pci_driver(struct device_driver *drv)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 588588cfda48..cad299d79f1a 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -20,6 +20,7 @@
 #include <linux/of_device.h>
 #include <linux/acpi.h>
 #include <linux/dma-map-ops.h>
+#include <linux/iommu.h>
 #include "pci.h"
 #include "pcie/portdrv.h"
 
@@ -1590,9 +1591,16 @@ static int pci_bus_num_vf(struct device *dev)
  */
 static int pci_dma_configure(struct device *dev)
 {
+	struct pci_driver *driver = to_pci_driver(dev->driver);
 	struct device *bridge;
 	int ret = 0;
 
+	if (!driver->suppress_auto_claim_dma_owner) {
+		ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
+		if (ret)
+			return ret;
+	}
+
 	bridge = pci_get_host_bridge_device(to_pci_dev(dev));
 
 	if (IS_ENABLED(CONFIG_OF) && bridge->parent &&
@@ -1605,9 +1613,21 @@ static int pci_dma_configure(struct device *dev)
 	}
 
 	pci_put_host_bridge_device(bridge);
+
+	if (ret && !driver->suppress_auto_claim_dma_owner)
+		iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+
 	return ret;
 }
 
+static void pci_dma_unconfigure(struct device *dev)
+{
+	struct pci_driver *driver = to_pci_driver(dev->driver);
+
+	if (!driver->suppress_auto_claim_dma_owner)
+		iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+}
+
 struct bus_type pci_bus_type = {
 	.name		= "pci",
 	.match		= pci_bus_match,
@@ -1621,6 +1641,7 @@ struct bus_type pci_bus_type = {
 	.pm		= PCI_PM_OPS_PTR,
 	.num_vf		= pci_bus_num_vf,
 	.dma_configure	= pci_dma_configure,
+	.dma_unconfigure = pci_dma_unconfigure,
 };
 EXPORT_SYMBOL(pci_bus_type);
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 04/17] driver core: platform: Add driver dma ownership management
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (2 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 03/17] PCI: Add driver dma ownership management Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  8:10   ` Greg Kroah-Hartman
  2021-11-28  2:50 ` [PATCH v2 05/17] amba: " Lu Baolu
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

Multiple platform devices may be placed in the same IOMMU group because
they cannot be isolated from each other. These devices must either be
entirely under kernel control or userspace control, never a mixture. This
checks and sets DMA ownership during driver binding, and release the
ownership during driver unbinding.

Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
the userspace framework drivers (vfio etc.) which need to manually claim
DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/platform_device.h |  1 +
 drivers/base/platform.c         | 30 +++++++++++++++++++++++++++++-
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 7c96f169d274..779bcf2a851c 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -210,6 +210,7 @@ struct platform_driver {
 	struct device_driver driver;
 	const struct platform_device_id *id_table;
 	bool prevent_deferred_probe;
+	bool suppress_auto_claim_dma_owner;
 };
 
 #define to_platform_driver(drv)	(container_of((drv), struct platform_driver, \
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 598acf93a360..df4b385c8a52 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -30,6 +30,7 @@
 #include <linux/property.h>
 #include <linux/kmemleak.h>
 #include <linux/types.h>
+#include <linux/iommu.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -1465,6 +1466,32 @@ int platform_dma_configure(struct device *dev)
 	return ret;
 }
 
+static int _platform_dma_configure(struct device *dev)
+{
+	struct platform_driver *drv = to_platform_driver(dev->driver);
+	int ret;
+
+	if (!drv->suppress_auto_claim_dma_owner) {
+		ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
+		if (ret)
+			return ret;
+	}
+
+	ret = platform_dma_configure(dev);
+	if (ret && !drv->suppress_auto_claim_dma_owner)
+		iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+
+	return ret;
+}
+
+static void _platform_dma_unconfigure(struct device *dev)
+{
+	struct platform_driver *drv = to_platform_driver(dev->driver);
+
+	if (!drv->suppress_auto_claim_dma_owner)
+		iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+}
+
 static const struct dev_pm_ops platform_dev_pm_ops = {
 	SET_RUNTIME_PM_OPS(pm_generic_runtime_suspend, pm_generic_runtime_resume, NULL)
 	USE_PLATFORM_PM_SLEEP_OPS
@@ -1478,7 +1505,8 @@ struct bus_type platform_bus_type = {
 	.probe		= platform_probe,
 	.remove		= platform_remove,
 	.shutdown	= platform_shutdown,
-	.dma_configure	= platform_dma_configure,
+	.dma_configure	= _platform_dma_configure,
+	.dma_unconfigure = _platform_dma_unconfigure,
 	.pm		= &platform_dev_pm_ops,
 };
 EXPORT_SYMBOL_GPL(platform_bus_type);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 05/17] amba: Add driver dma ownership management
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (3 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 04/17] driver core: platform: " Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 06/17] bus: fsl-mc: " Lu Baolu
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

Multiple amba devices may be placed in the same IOMMU group because
they cannot be isolated from each other. These devices must either be
entirely under kernel control or userspace control, never a mixture. This
checks and sets DMA ownership during driver binding, and release the
ownership during driver unbinding.

Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
the userspace framework drivers (vfio etc.) which need to manually claim
DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/amba/bus.h |  1 +
 drivers/amba/bus.c       | 30 +++++++++++++++++++++++++++++-
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/include/linux/amba/bus.h b/include/linux/amba/bus.h
index edfcf7a14dcd..745c5a60ddd8 100644
--- a/include/linux/amba/bus.h
+++ b/include/linux/amba/bus.h
@@ -79,6 +79,7 @@ struct amba_driver {
 	void			(*remove)(struct amba_device *);
 	void			(*shutdown)(struct amba_device *);
 	const struct amba_id	*id_table;
+	bool suppress_auto_claim_dma_owner;
 };
 
 /*
diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index 720aa6cdd402..cd0cd9d141ec 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -20,6 +20,7 @@
 #include <linux/platform_device.h>
 #include <linux/reset.h>
 #include <linux/of_irq.h>
+#include <linux/iommu.h>
 
 #include <asm/irq.h>
 
@@ -305,6 +306,32 @@ static const struct dev_pm_ops amba_pm = {
 	)
 };
 
+static int amba_dma_configure(struct device *dev)
+{
+	struct amba_driver *drv = to_amba_driver(dev->driver);
+	int ret;
+
+	if (!drv->suppress_auto_claim_dma_owner) {
+		ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
+		if (ret)
+			return ret;
+	}
+
+	ret = platform_dma_configure(dev);
+	if (ret && !drv->suppress_auto_claim_dma_owner)
+		iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+
+	return ret;
+}
+
+static void amba_dma_unconfigure(struct device *dev)
+{
+	struct amba_driver *drv = to_amba_driver(dev->driver);
+
+	if (!drv->suppress_auto_claim_dma_owner)
+		iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+}
+
 /*
  * Primecells are part of the Advanced Microcontroller Bus Architecture,
  * so we call the bus "amba".
@@ -319,7 +346,8 @@ struct bus_type amba_bustype = {
 	.probe		= amba_probe,
 	.remove		= amba_remove,
 	.shutdown	= amba_shutdown,
-	.dma_configure	= platform_dma_configure,
+	.dma_configure	= amba_dma_configure,
+	.dma_unconfigure = amba_dma_unconfigure,
 	.pm		= &amba_pm,
 };
 EXPORT_SYMBOL_GPL(amba_bustype);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 06/17] bus: fsl-mc: Add driver dma ownership management
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (4 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 05/17] amba: " Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 07/17] PCI: pci_stub: Suppress kernel DMA ownership auto-claiming Lu Baolu
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

Multiple fsl-mc devices may be placed in the same IOMMU group because
they cannot be isolated from each other. These devices must either be
entirely under kernel control or userspace control, never a mixture. This
checks and sets DMA ownership during driver binding, and release the
ownership during driver unbinding.

Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
the userspace framework drivers (vfio etc.) which need to manually claim
DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/fsl/mc.h          |  5 +++++
 drivers/bus/fsl-mc/fsl-mc-bus.c | 26 ++++++++++++++++++++++++--
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/linux/fsl/mc.h b/include/linux/fsl/mc.h
index e026f6c48b49..3a26421ee3dc 100644
--- a/include/linux/fsl/mc.h
+++ b/include/linux/fsl/mc.h
@@ -32,6 +32,10 @@ struct fsl_mc_io;
  * @shutdown: Function called at shutdown time to quiesce the device
  * @suspend: Function called when a device is stopped
  * @resume: Function called when a device is resumed
+ * @suppress_auto_claim_dma_owner: Disable auto claiming of kernel DMA owner.
+ *		Drivers which don't require DMA or want to manually claim the
+ *		owner type (e.g. userspace driver frameworks) could set this
+ *		flag.
  *
  * Generic DPAA device driver object for device drivers that are registered
  * with a DPRC bus. This structure is to be embedded in each device-specific
@@ -45,6 +49,7 @@ struct fsl_mc_driver {
 	void (*shutdown)(struct fsl_mc_device *dev);
 	int (*suspend)(struct fsl_mc_device *dev, pm_message_t state);
 	int (*resume)(struct fsl_mc_device *dev);
+	bool suppress_auto_claim_dma_owner;
 };
 
 #define to_fsl_mc_driver(_drv) \
diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 8fd4a356a86e..23dd5f0070e7 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -140,15 +140,36 @@ static int fsl_mc_dma_configure(struct device *dev)
 {
 	struct device *dma_dev = dev;
 	struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+	struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
 	u32 input_id = mc_dev->icid;
+	int ret;
+
+	if (!mc_drv->suppress_auto_claim_dma_owner) {
+		ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
+		if (ret)
+			return ret;
+	}
 
 	while (dev_is_fsl_mc(dma_dev))
 		dma_dev = dma_dev->parent;
 
 	if (dev_of_node(dma_dev))
-		return of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+		ret = of_dma_configure_id(dev, dma_dev->of_node, 0, &input_id);
+	else
+		ret = acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+
+	if (ret && !mc_drv->suppress_auto_claim_dma_owner)
+		iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
+
+	return ret;
+}
+
+static void fsl_mc_dma_unconfigure(struct device *dev)
+{
+	struct fsl_mc_driver *mc_drv = to_fsl_mc_driver(dev->driver);
 
-	return acpi_dma_configure_id(dev, DEV_DMA_COHERENT, &input_id);
+	if (!mc_drv->suppress_auto_claim_dma_owner)
+		iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
 }
 
 static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
@@ -312,6 +333,7 @@ struct bus_type fsl_mc_bus_type = {
 	.match = fsl_mc_bus_match,
 	.uevent = fsl_mc_bus_uevent,
 	.dma_configure  = fsl_mc_dma_configure,
+	.dma_unconfigure = fsl_mc_dma_unconfigure,
 	.dev_groups = fsl_mc_dev_groups,
 	.bus_groups = fsl_mc_bus_groups,
 };
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 07/17] PCI: pci_stub: Suppress kernel DMA ownership auto-claiming
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (5 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 06/17] bus: fsl-mc: " Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 08/17] PCI: portdrv: " Lu Baolu
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

The pci_dma_configure() marks the iommu_group as containing only devices
with kernel drivers that manage DMA. Avoid this default behavior for the
pci_stub because it does not program any DMA itself.  This allows the
pci_stub still able to be used by the admin to block driver binding after
applying the DMA ownership to vfio.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/pci/pci-stub.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/pci-stub.c b/drivers/pci/pci-stub.c
index e408099fea52..14b9b9f2ad2b 100644
--- a/drivers/pci/pci-stub.c
+++ b/drivers/pci/pci-stub.c
@@ -36,6 +36,7 @@ static struct pci_driver stub_driver = {
 	.name		= "pci-stub",
 	.id_table	= NULL,	/* only dynamic id's */
 	.probe		= pci_stub_probe,
+	.suppress_auto_claim_dma_owner = true,
 };
 
 static int __init pci_stub_init(void)
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 08/17] PCI: portdrv: Suppress kernel DMA ownership auto-claiming
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (6 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 07/17] PCI: pci_stub: Suppress kernel DMA ownership auto-claiming Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 09/17] iommu: Add security context management for assigned devices Lu Baolu
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

IOMMU grouping on PCI necessitates that if we lack isolation on a bridge
then all of the downstream devices will be part of the same IOMMU group
as the bridge. The existing vfio framework allows the portdrv driver to
be bound to the bridge while its downstream devices are assigned to user
space. The pci_dma_configure() marks the iommu_group as containing only
devices with kernel drivers that manage DMA. Avoid this default behavior
for the portdrv driver in order for compatibility with the current vfio
policy.

The commit 5f096b14d421b ("vfio: Whitelist PCI bridges") extended above
policy to all kernel drivers of bridge class. This is not always safe.
For example, The shpchp_core driver relies on the PCI MMIO access for the
controller functionality. With its downstream devices assigned to the
userspace, the MMIO might be changed through user initiated P2P accesses
without any notification. This might break the kernel driver integrity
and lead to some unpredictable consequences.

For any bridge driver, in order to avoiding default kernel DMA ownership
claiming, we should consider:

 1) Does the bridge driver use DMA? Calling pci_set_master() or
    a dma_map_* API is a sure indicate the driver is doing DMA

 2) If the bridge driver uses MMIO, is it tolerant to hostile
    userspace also touching the same MMIO registers via P2P DMA
    attacks?

Conservatively if the driver maps an MMIO region at all, we can say that
it fails the test.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/pci/pcie/portdrv_pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 35eca6277a96..c66a83f2c987 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -202,6 +202,8 @@ static struct pci_driver pcie_portdriver = {
 
 	.err_handler	= &pcie_portdrv_err_handler,
 
+	.suppress_auto_claim_dma_owner = true,
+
 	.driver.pm	= PCIE_PORTDRV_PM_OPS,
 };
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 09/17] iommu: Add security context management for assigned devices
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (7 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 08/17] PCI: portdrv: " Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 10/17] iommu: Expose group variants of dma ownership interfaces Lu Baolu
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

When an iommu group has DMA_OWNER_PRIVATE_DOMAIN_USER set for the first
time, it is a contract that the group could be assigned to userspace from
now on. The group must be detached from the default iommu domain and all
devices in this group are blocked from doing DMA until it is attached to a
user controlled iommu_domain. Correspondingly, the default domain should
be reattached after the last DMA_OWNER_USER is released.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/iommu.c | 42 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1de520a07518..0cba04a8ea3b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -292,7 +292,12 @@ int iommu_probe_device(struct device *dev)
 	mutex_lock(&group->mutex);
 	iommu_alloc_default_domain(group, dev);
 
-	if (group->default_domain) {
+	/*
+	 * If any device in the group has been initialized for user dma,
+	 * avoid attaching the default domain.
+	 */
+	if (group->default_domain &&
+	    group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN_USER) {
 		ret = __iommu_attach_device(group->default_domain, dev);
 		if (ret) {
 			mutex_unlock(&group->mutex);
@@ -2324,7 +2329,7 @@ static int __iommu_attach_group(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (group->default_domain && group->domain != group->default_domain)
+	if (group->domain && group->domain != group->default_domain)
 		return -EBUSY;
 
 	ret = __iommu_group_for_each_dev(group, domain,
@@ -2361,7 +2366,12 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (!group->default_domain) {
+	/*
+	 * If any device in the group has been initialized for user dma,
+	 * avoid re-attaching the default domain.
+	 */
+	if (!group->default_domain ||
+	    group->dma_owner == DMA_OWNER_PRIVATE_DOMAIN_USER) {
 		__iommu_group_for_each_dev(group, domain,
 					   iommu_group_do_detach_device);
 		group->domain = NULL;
@@ -3371,6 +3381,23 @@ static int __iommu_group_set_dma_owner(struct iommu_group *group,
 	}
 
 	group->dma_owner = owner;
+
+	/*
+	 * We must ensure that any device DMAs issued after this call
+	 * are discarded. DMAs can only reach real memory once someone
+	 * has attached a real domain.
+	 */
+	if (owner == DMA_OWNER_PRIVATE_DOMAIN_USER) {
+		if (group->domain) {
+			if (group->domain != group->default_domain) {
+				group->dma_owner = DMA_OWNER_NONE;
+				return -EBUSY;
+			}
+
+			__iommu_detach_group(group->domain, group);
+		}
+	}
+
 	group->owner_cookie = owner_cookie;
 	refcount_set(&group->owner_cnt, 1);
 
@@ -3387,6 +3414,15 @@ static void __iommu_group_release_dma_owner(struct iommu_group *group,
 		return;
 
 	group->dma_owner = DMA_OWNER_NONE;
+
+	/*
+	 * The UNMANAGED domain should be detached before all USER
+	 * owners have been released.
+	 */
+	if (owner == DMA_OWNER_PRIVATE_DOMAIN_USER) {
+		if (!WARN_ON(group->domain) && group->default_domain)
+			__iommu_attach_group(group->default_domain, group);
+	}
 }
 
 /**
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 10/17] iommu: Expose group variants of dma ownership interfaces
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (8 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 09/17] iommu: Add security context management for assigned devices Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 11/17] iommu: Add iommu_at[de]tach_device_shared() for multi-device groups Lu Baolu
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

The vfio needs to set DMA_OWNER_PRIVATE_DOMAIN_USER for the entire group
when attaching it to a vfio container. Expose group variants of setting/
releasing dma ownership for this purpose.

This also exposes the helper iommu_group_dma_owner_unclaimed() for vfio
to report to userspace if the group is viable to user assignment for
compatibility with VFIO_GROUP_FLAGS_VIABLE.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h | 21 ++++++++++++++++
 drivers/iommu/iommu.c | 58 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 24676b498f38..afcc07bc8d41 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -701,6 +701,10 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle);
 int iommu_device_set_dma_owner(struct device *dev, enum iommu_dma_owner owner,
 			       void *owner_cookie);
 void iommu_device_release_dma_owner(struct device *dev, enum iommu_dma_owner owner);
+int iommu_group_set_dma_owner(struct iommu_group *group, enum iommu_dma_owner owner,
+			      void *owner_cookie);
+void iommu_group_release_dma_owner(struct iommu_group *group, enum iommu_dma_owner owner);
+bool iommu_group_dma_owner_unclaimed(struct iommu_group *group);
 
 #else /* CONFIG_IOMMU_API */
 
@@ -1117,6 +1121,23 @@ static inline void iommu_device_release_dma_owner(struct device *dev,
 						  enum iommu_dma_owner owner)
 {
 }
+
+static inline int iommu_group_set_dma_owner(struct iommu_group *group,
+					    enum iommu_dma_owner owner,
+					    void *owner_cookie)
+{
+	return -EINVAL;
+}
+
+static inline void iommu_group_release_dma_owner(struct iommu_group *group,
+						 enum iommu_dma_owner owner)
+{
+}
+
+static inline bool iommu_group_dma_owner_unclaimed(struct iommu_group *group)
+{
+	return false;
+}
 #endif /* CONFIG_IOMMU_API */
 
 /**
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0cba04a8ea3b..423197db99a9 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3425,6 +3425,64 @@ static void __iommu_group_release_dma_owner(struct iommu_group *group,
 	}
 }
 
+/**
+ * iommu_group_set_dma_owner() - Set DMA ownership of a group
+ * @group: The group.
+ * @owner: DMA owner type.
+ * @owner_cookie: Caller specified pointer. Could be used for exclusive
+ *                declaration. Could be NULL.
+ *
+ * This is to support backward compatibility for legacy vfio which manages
+ * dma ownership in group level. New invocations on this interface should be
+ * prohibited. Instead, please turn to iommu_device_set_dma_owner().
+ */
+int iommu_group_set_dma_owner(struct iommu_group *group, enum iommu_dma_owner owner,
+			      void *owner_cookie)
+{
+	int ret;
+
+	mutex_lock(&group->mutex);
+	ret = __iommu_group_set_dma_owner(group, owner, owner_cookie);
+	mutex_unlock(&group->mutex);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_group_set_dma_owner);
+
+/**
+ * iommu_group_release_dma_owner() - Release DMA ownership of a group
+ * @group: The group.
+ * @owner: DMA owner type.
+ *
+ * Release the DMA ownership claimed by iommu_group_set_dma_owner().
+ */
+void iommu_group_release_dma_owner(struct iommu_group *group, enum iommu_dma_owner owner)
+{
+	mutex_lock(&group->mutex);
+	__iommu_group_release_dma_owner(group, owner);
+	mutex_unlock(&group->mutex);
+}
+EXPORT_SYMBOL_GPL(iommu_group_release_dma_owner);
+
+/**
+ * iommu_group_dma_owner_unclaimed() - Is group dma ownership claimed
+ * @group: The group.
+ *
+ * This provides status check on a given group. It is racey and only for
+ * non-binding status reporting.
+ */
+bool iommu_group_dma_owner_unclaimed(struct iommu_group *group)
+{
+	enum iommu_dma_owner owner;
+
+	mutex_lock(&group->mutex);
+	owner = group->dma_owner;
+	mutex_unlock(&group->mutex);
+
+	return owner == DMA_OWNER_NONE;
+}
+EXPORT_SYMBOL_GPL(iommu_group_dma_owner_unclaimed);
+
 /**
  * iommu_device_set_dma_owner() - Set DMA ownership of a device
  * @dev: The device.
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 11/17] iommu: Add iommu_at[de]tach_device_shared() for multi-device groups
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (9 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 10/17] iommu: Expose group variants of dma ownership interfaces Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 12/17] vfio: Set DMA USER ownership for VFIO devices Lu Baolu
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

The iommu_attach/detach_device() interfaces were exposed for the device
drivers to attach/detach their own domains. The commit <426a273834eae>
("iommu: Limit iommu_attach/detach_device to device with their own group")
restricted them to singleton groups to avoid different device in a group
attaching different domain.

As we've introduced device DMA ownership into the iommu core. We can now
introduce interfaces for muliple-device groups, and "all devices are in the
same address space" is still guaranteed.

The iommu_attach/detach_device_shared() could be used when multiple drivers
sharing the group claim the DMA_OWNER_PRIVATE_DOMAIN ownership. The first
call of iommu_attach_device_shared() attaches the domain to the group.
Other drivers could join it later. The domain will be detached from the
group after all drivers unjoin it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h | 13 +++++++++
 drivers/iommu/iommu.c | 61 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 74 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index afcc07bc8d41..8c81ba11ae8c 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -705,6 +705,8 @@ int iommu_group_set_dma_owner(struct iommu_group *group, enum iommu_dma_owner ow
 			      void *owner_cookie);
 void iommu_group_release_dma_owner(struct iommu_group *group, enum iommu_dma_owner owner);
 bool iommu_group_dma_owner_unclaimed(struct iommu_group *group);
+int iommu_attach_device_shared(struct iommu_domain *domain, struct device *dev);
+void iommu_detach_device_shared(struct iommu_domain *domain, struct device *dev);
 
 #else /* CONFIG_IOMMU_API */
 
@@ -745,11 +747,22 @@ static inline int iommu_attach_device(struct iommu_domain *domain,
 	return -ENODEV;
 }
 
+static inline int iommu_attach_device_shared(struct iommu_domain *domain,
+					     struct device *dev)
+{
+	return -ENODEV;
+}
+
 static inline void iommu_detach_device(struct iommu_domain *domain,
 				       struct device *dev)
 {
 }
 
+static inline void iommu_detach_device_shared(struct iommu_domain *domain,
+					      struct device *dev)
+{
+}
+
 static inline struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
 {
 	return NULL;
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 423197db99a9..f9cb96acbac8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -50,6 +50,7 @@ struct iommu_group {
 	struct list_head entry;
 	enum iommu_dma_owner dma_owner;
 	refcount_t owner_cnt;
+	refcount_t attach_cnt;
 	void *owner_cookie;
 };
 
@@ -2026,6 +2027,41 @@ int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_attach_device);
 
+int iommu_attach_device_shared(struct iommu_domain *domain, struct device *dev)
+{
+	struct iommu_group *group;
+	int ret = 0;
+
+	group = iommu_group_get(dev);
+	if (!group)
+		return -ENODEV;
+
+	mutex_lock(&group->mutex);
+	if (refcount_inc_not_zero(&group->attach_cnt)) {
+		if (group->domain != domain ||
+		    (group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN &&
+		     group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN_USER)) {
+			refcount_dec(&group->attach_cnt);
+			ret = -EBUSY;
+		}
+
+		goto unlock_out;
+	}
+
+	ret = __iommu_attach_group(domain, group);
+	if (ret)
+		goto unlock_out;
+
+	refcount_set(&group->owner_cnt, 1);
+
+unlock_out:
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_attach_device_shared);
+
 int iommu_deferred_attach(struct device *dev, struct iommu_domain *domain)
 {
 	const struct iommu_ops *ops = domain->ops;
@@ -2281,6 +2317,31 @@ void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_detach_device);
 
+void iommu_detach_device_shared(struct iommu_domain *domain, struct device *dev)
+{
+	struct iommu_group *group;
+
+	group = iommu_group_get(dev);
+	if (!group)
+		return;
+
+	mutex_lock(&group->mutex);
+	if (WARN_ON(group->domain != domain ||
+		    (group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN &&
+		     group->dma_owner != DMA_OWNER_PRIVATE_DOMAIN_USER)))
+		goto unlock_out;
+
+	if (refcount_dec_and_test(&group->owner_cnt)) {
+		__iommu_detach_group(domain, group);
+		group->dma_owner = DMA_OWNER_NONE;
+	}
+
+unlock_out:
+	mutex_unlock(&group->mutex);
+	iommu_group_put(group);
+}
+EXPORT_SYMBOL_GPL(iommu_detach_device_shared);
+
 struct iommu_domain *iommu_get_domain_for_dev(struct device *dev)
 {
 	struct iommu_domain *domain;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 12/17] vfio: Set DMA USER ownership for VFIO devices
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (10 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 11/17] iommu: Add iommu_at[de]tach_device_shared() for multi-device groups Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 13/17] vfio: Remove use of vfio_group_viable() Lu Baolu
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

Set DMA_OWNER_PRIVATE_DOMAIN_USER when an iommu group is set to a
container, and release DMA_OWNER_USER once the iommu group is unset
from a container.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/vfio/fsl-mc/vfio_fsl_mc.c     |  1 +
 drivers/vfio/pci/vfio_pci.c           |  1 +
 drivers/vfio/platform/vfio_amba.c     |  1 +
 drivers/vfio/platform/vfio_platform.c |  1 +
 drivers/vfio/vfio.c                   | 13 ++++++++++++-
 5 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
index 6e2e62c6f47a..5f36ffbb07d1 100644
--- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c
+++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c
@@ -588,6 +588,7 @@ static struct fsl_mc_driver vfio_fsl_mc_driver = {
 		.name	= "vfio-fsl-mc",
 		.owner	= THIS_MODULE,
 	},
+	.suppress_auto_claim_dma_owner = true,
 };
 
 static int __init vfio_fsl_mc_driver_init(void)
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index a5ce92beb655..31810b074aa4 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -193,6 +193,7 @@ static struct pci_driver vfio_pci_driver = {
 	.remove			= vfio_pci_remove,
 	.sriov_configure	= vfio_pci_sriov_configure,
 	.err_handler		= &vfio_pci_core_err_handlers,
+	.suppress_auto_claim_dma_owner = true,
 };
 
 static void __init vfio_pci_fill_ids(void)
diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c
index badfffea14fb..e598d6af90ad 100644
--- a/drivers/vfio/platform/vfio_amba.c
+++ b/drivers/vfio/platform/vfio_amba.c
@@ -95,6 +95,7 @@ static struct amba_driver vfio_amba_driver = {
 		.name = "vfio-amba",
 		.owner = THIS_MODULE,
 	},
+	.suppress_auto_claim_dma_owner = true,
 };
 
 module_amba_driver(vfio_amba_driver);
diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c
index 68a1c87066d7..8bda68775b97 100644
--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -76,6 +76,7 @@ static struct platform_driver vfio_platform_driver = {
 	.driver	= {
 		.name	= "vfio-platform",
 	},
+	.suppress_auto_claim_dma_owner = true,
 };
 
 module_platform_driver(vfio_platform_driver);
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 82fb75464f92..00f89acfec39 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1198,6 +1198,9 @@ static void __vfio_group_unset_container(struct vfio_group *group)
 		driver->ops->detach_group(container->iommu_data,
 					  group->iommu_group);
 
+	iommu_group_release_dma_owner(group->iommu_group,
+				      DMA_OWNER_PRIVATE_DOMAIN_USER);
+
 	group->container = NULL;
 	wake_up(&group->container_q);
 	list_del(&group->container_next);
@@ -1282,13 +1285,21 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 		goto unlock_out;
 	}
 
+	ret = iommu_group_set_dma_owner(group->iommu_group,
+					DMA_OWNER_PRIVATE_DOMAIN_USER, f.file);
+	if (ret)
+		goto unlock_out;
+
 	driver = container->iommu_driver;
 	if (driver) {
 		ret = driver->ops->attach_group(container->iommu_data,
 						group->iommu_group,
 						group->type);
-		if (ret)
+		if (ret) {
+			iommu_group_release_dma_owner(group->iommu_group,
+						      DMA_OWNER_PRIVATE_DOMAIN_USER);
 			goto unlock_out;
+		}
 	}
 
 	group->container = container;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 13/17] vfio: Remove use of vfio_group_viable()
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (11 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 12/17] vfio: Set DMA USER ownership for VFIO devices Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 14/17] vfio: Delete the unbound_list Lu Baolu
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

As DMA USER ownership is claimed for the iommu group when a vfio group is
added to a vfio container, the vfio group viability is guaranteed as long
as group->container_users > 0. Remove those unnecessary group viability
checks which are only hit when group->container_users is not zero.

The only remaining reference is in GROUP_GET_STATUS, which could be called
at any time when group fd is valid. Here we just replace the
vfio_group_viable() by directly calling iommu core to get viability status.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/vfio/vfio.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 00f89acfec39..63e24f746b82 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1316,12 +1316,6 @@ static int vfio_group_set_container(struct vfio_group *group, int container_fd)
 	return ret;
 }
 
-static bool vfio_group_viable(struct vfio_group *group)
-{
-	return (iommu_group_for_each_dev(group->iommu_group,
-					 group, vfio_dev_viable) == 0);
-}
-
 static int vfio_group_add_container_user(struct vfio_group *group)
 {
 	if (!atomic_inc_not_zero(&group->container_users))
@@ -1331,7 +1325,7 @@ static int vfio_group_add_container_user(struct vfio_group *group)
 		atomic_dec(&group->container_users);
 		return -EPERM;
 	}
-	if (!group->container->iommu_driver || !vfio_group_viable(group)) {
+	if (!group->container->iommu_driver) {
 		atomic_dec(&group->container_users);
 		return -EINVAL;
 	}
@@ -1349,7 +1343,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 	int ret = 0;
 
 	if (0 == atomic_read(&group->container_users) ||
-	    !group->container->iommu_driver || !vfio_group_viable(group))
+	    !group->container->iommu_driver)
 		return -EINVAL;
 
 	if (group->type == VFIO_NO_IOMMU && !capable(CAP_SYS_RAWIO))
@@ -1441,11 +1435,11 @@ static long vfio_group_fops_unl_ioctl(struct file *filep,
 
 		status.flags = 0;
 
-		if (vfio_group_viable(group))
-			status.flags |= VFIO_GROUP_FLAGS_VIABLE;
-
 		if (group->container)
-			status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET;
+			status.flags |= VFIO_GROUP_FLAGS_CONTAINER_SET |
+					VFIO_GROUP_FLAGS_VIABLE;
+		else if (iommu_group_dma_owner_unclaimed(group->iommu_group))
+			status.flags |= VFIO_GROUP_FLAGS_VIABLE;
 
 		if (copy_to_user((void __user *)arg, &status, minsz))
 			return -EFAULT;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 14/17] vfio: Delete the unbound_list
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (12 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 13/17] vfio: Remove use of vfio_group_viable() Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 15/17] vfio: Remove iommu group notifier Lu Baolu
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

From: Jason Gunthorpe <jgg@nvidia.com>

commit 60720a0fc646 ("vfio: Add device tracking during unbind") added the
unbound list to plug a problem with KVM where KVM_DEV_VFIO_GROUP_DEL
relied on vfio_group_get_external_user() succeeding to return the
vfio_group from a group file descriptor. The unbound list allowed
vfio_group_get_external_user() to continue to succeed in edge cases.

However commit 5d6dee80a1e9 ("vfio: New external user group/file match")
deleted the call to vfio_group_get_external_user() during
KVM_DEV_VFIO_GROUP_DEL. Instead vfio_external_group_match_file() is used
to directly match the file descriptor to the group pointer.

This in turn avoids the call down to vfio_dev_viable() during
KVM_DEV_VFIO_GROUP_DEL and also avoids the trouble the first commit was
trying to fix.

There are no other users of vfio_dev_viable() that care about the time
after vfio_unregister_group_dev() returns, so simply delete the
unbound_list entirely.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/vfio/vfio.c | 74 ++-------------------------------------------
 1 file changed, 2 insertions(+), 72 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 63e24f746b82..5c81346367b1 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -62,11 +62,6 @@ struct vfio_container {
 	bool				noiommu;
 };
 
-struct vfio_unbound_dev {
-	struct device			*dev;
-	struct list_head		unbound_next;
-};
-
 struct vfio_group {
 	struct device 			dev;
 	struct cdev			cdev;
@@ -79,8 +74,6 @@ struct vfio_group {
 	struct notifier_block		nb;
 	struct list_head		vfio_next;
 	struct list_head		container_next;
-	struct list_head		unbound_list;
-	struct mutex			unbound_lock;
 	atomic_t			opened;
 	wait_queue_head_t		container_q;
 	enum vfio_group_type		type;
@@ -340,16 +333,8 @@ vfio_group_get_from_iommu(struct iommu_group *iommu_group)
 static void vfio_group_release(struct device *dev)
 {
 	struct vfio_group *group = container_of(dev, struct vfio_group, dev);
-	struct vfio_unbound_dev *unbound, *tmp;
-
-	list_for_each_entry_safe(unbound, tmp,
-				 &group->unbound_list, unbound_next) {
-		list_del(&unbound->unbound_next);
-		kfree(unbound);
-	}
 
 	mutex_destroy(&group->device_lock);
-	mutex_destroy(&group->unbound_lock);
 	iommu_group_put(group->iommu_group);
 	ida_free(&vfio.group_ida, MINOR(group->dev.devt));
 	kfree(group);
@@ -381,8 +366,6 @@ static struct vfio_group *vfio_group_alloc(struct iommu_group *iommu_group,
 	refcount_set(&group->users, 1);
 	INIT_LIST_HEAD(&group->device_list);
 	mutex_init(&group->device_lock);
-	INIT_LIST_HEAD(&group->unbound_list);
-	mutex_init(&group->unbound_lock);
 	init_waitqueue_head(&group->container_q);
 	group->iommu_group = iommu_group;
 	/* put in vfio_group_release() */
@@ -571,19 +554,8 @@ static int vfio_dev_viable(struct device *dev, void *data)
 	struct vfio_group *group = data;
 	struct vfio_device *device;
 	struct device_driver *drv = READ_ONCE(dev->driver);
-	struct vfio_unbound_dev *unbound;
-	int ret = -EINVAL;
 
-	mutex_lock(&group->unbound_lock);
-	list_for_each_entry(unbound, &group->unbound_list, unbound_next) {
-		if (dev == unbound->dev) {
-			ret = 0;
-			break;
-		}
-	}
-	mutex_unlock(&group->unbound_lock);
-
-	if (!ret || !drv || vfio_dev_driver_allowed(dev, drv))
+	if (!drv || vfio_dev_driver_allowed(dev, drv))
 		return 0;
 
 	device = vfio_group_get_device(group, dev);
@@ -592,7 +564,7 @@ static int vfio_dev_viable(struct device *dev, void *data)
 		return 0;
 	}
 
-	return ret;
+	return -EINVAL;
 }
 
 /**
@@ -634,7 +606,6 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
 {
 	struct vfio_group *group = container_of(nb, struct vfio_group, nb);
 	struct device *dev = data;
-	struct vfio_unbound_dev *unbound;
 
 	switch (action) {
 	case IOMMU_GROUP_NOTIFY_ADD_DEVICE:
@@ -663,28 +634,6 @@ static int vfio_iommu_group_notifier(struct notifier_block *nb,
 			__func__, iommu_group_id(group->iommu_group),
 			dev->driver->name);
 		break;
-	case IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER:
-		dev_dbg(dev, "%s: group %d unbound from driver\n", __func__,
-			iommu_group_id(group->iommu_group));
-		/*
-		 * XXX An unbound device in a live group is ok, but we'd
-		 * really like to avoid the above BUG_ON by preventing other
-		 * drivers from binding to it.  Once that occurs, we have to
-		 * stop the system to maintain isolation.  At a minimum, we'd
-		 * want a toggle to disable driver auto probe for this device.
-		 */
-
-		mutex_lock(&group->unbound_lock);
-		list_for_each_entry(unbound,
-				    &group->unbound_list, unbound_next) {
-			if (dev == unbound->dev) {
-				list_del(&unbound->unbound_next);
-				kfree(unbound);
-				break;
-			}
-		}
-		mutex_unlock(&group->unbound_lock);
-		break;
 	}
 	return NOTIFY_OK;
 }
@@ -889,29 +838,10 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
 void vfio_unregister_group_dev(struct vfio_device *device)
 {
 	struct vfio_group *group = device->group;
-	struct vfio_unbound_dev *unbound;
 	unsigned int i = 0;
 	bool interrupted = false;
 	long rc;
 
-	/*
-	 * When the device is removed from the group, the group suddenly
-	 * becomes non-viable; the device has a driver (until the unbind
-	 * completes), but it's not present in the group.  This is bad news
-	 * for any external users that need to re-acquire a group reference
-	 * in order to match and release their existing reference.  To
-	 * solve this, we track such devices on the unbound_list to bridge
-	 * the gap until they're fully unbound.
-	 */
-	unbound = kzalloc(sizeof(*unbound), GFP_KERNEL);
-	if (unbound) {
-		unbound->dev = device->dev;
-		mutex_lock(&group->unbound_lock);
-		list_add(&unbound->unbound_next, &group->unbound_list);
-		mutex_unlock(&group->unbound_lock);
-	}
-	WARN_ON(!unbound);
-
 	vfio_device_put(device);
 	rc = try_wait_for_completion(&device->comp);
 	while (rc <= 0) {
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 15/17] vfio: Remove iommu group notifier
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (13 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 14/17] vfio: Delete the unbound_list Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 16/17] iommu: Remove iommu group changes notifier Lu Baolu
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

The iommu core and driver core have been enhanced to avoid unsafe driver
binding to a live group after iommu_group_set_dma_owner(PRIVATE_USER)
has been called. There's no need to register iommu group notifier. This
removes the iommu group notifer which contains BUG_ON() and WARN().

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/vfio/vfio.c | 147 --------------------------------------------
 1 file changed, 147 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 5c81346367b1..33d984ff3cc5 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -71,7 +71,6 @@ struct vfio_group {
 	struct vfio_container		*container;
 	struct list_head		device_list;
 	struct mutex			device_lock;
-	struct notifier_block		nb;
 	struct list_head		vfio_next;
 	struct list_head		container_next;
 	atomic_t			opened;
@@ -274,8 +273,6 @@ void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops)
 }
 EXPORT_SYMBOL_GPL(vfio_unregister_iommu_driver);
 
-static int vfio_iommu_group_notifier(struct notifier_block *nb,
-				     unsigned long action, void *data);
 static void vfio_group_get(struct vfio_group *group);
 
 /**
@@ -395,13 +392,6 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 		goto err_put;
 	}
 
-	group->nb.notifier_call = vfio_iommu_group_notifier;
-	err = iommu_group_register_notifier(iommu_group, &group->nb);
-	if (err) {
-		ret = ERR_PTR(err);
-		goto err_put;
-	}
-
 	mutex_lock(&vfio.group_lock);
 
 	/* Did we race creating this group? */
@@ -422,7 +412,6 @@ static struct vfio_group *vfio_create_group(struct iommu_group *iommu_group,
 
 err_unlock:
 	mutex_unlock(&vfio.group_lock);
-	iommu_group_unregister_notifier(group->iommu_group, &group->nb);
 err_put:
 	put_device(&group->dev);
 	return ret;
@@ -447,7 +436,6 @@ static void vfio_group_put(struct vfio_group *group)
 	cdev_device_del(&group->cdev, &group->dev);
 	mutex_unlock(&vfio.group_lock);
 
-	iommu_group_unregister_notifier(group->iommu_group, &group->nb);
 	put_device(&group->dev);
 }
 
@@ -503,141 +491,6 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
 	return NULL;
 }
 
-/*
- * Some drivers, like pci-stub, are only used to prevent other drivers from
- * claiming a device and are therefore perfectly legitimate for a user owned
- * group.  The pci-stub driver has no dependencies on DMA or the IOVA mapping
- * of the device, but it does prevent the user from having direct access to
- * the device, which is useful in some circumstances.
- *
- * We also assume that we can include PCI interconnect devices, ie. bridges.
- * IOMMU grouping on PCI necessitates that if we lack isolation on a bridge
- * then all of the downstream devices will be part of the same IOMMU group as
- * the bridge.  Thus, if placing the bridge into the user owned IOVA space
- * breaks anything, it only does so for user owned devices downstream.  Note
- * that error notification via MSI can be affected for platforms that handle
- * MSI within the same IOVA space as DMA.
- */
-static const char * const vfio_driver_allowed[] = { "pci-stub" };
-
-static bool vfio_dev_driver_allowed(struct device *dev,
-				    struct device_driver *drv)
-{
-	if (dev_is_pci(dev)) {
-		struct pci_dev *pdev = to_pci_dev(dev);
-
-		if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
-			return true;
-	}
-
-	return match_string(vfio_driver_allowed,
-			    ARRAY_SIZE(vfio_driver_allowed),
-			    drv->name) >= 0;
-}
-
-/*
- * A vfio group is viable for use by userspace if all devices are in
- * one of the following states:
- *  - driver-less
- *  - bound to a vfio driver
- *  - bound to an otherwise allowed driver
- *  - a PCI interconnect device
- *
- * We use two methods to determine whether a device is bound to a vfio
- * driver.  The first is to test whether the device exists in the vfio
- * group.  The second is to test if the device exists on the group
- * unbound_list, indicating it's in the middle of transitioning from
- * a vfio driver to driver-less.
- */
-static int vfio_dev_viable(struct device *dev, void *data)
-{
-	struct vfio_group *group = data;
-	struct vfio_device *device;
-	struct device_driver *drv = READ_ONCE(dev->driver);
-
-	if (!drv || vfio_dev_driver_allowed(dev, drv))
-		return 0;
-
-	device = vfio_group_get_device(group, dev);
-	if (device) {
-		vfio_device_put(device);
-		return 0;
-	}
-
-	return -EINVAL;
-}
-
-/**
- * Async device support
- */
-static int vfio_group_nb_add_dev(struct vfio_group *group, struct device *dev)
-{
-	struct vfio_device *device;
-
-	/* Do we already know about it?  We shouldn't */
-	device = vfio_group_get_device(group, dev);
-	if (WARN_ON_ONCE(device)) {
-		vfio_device_put(device);
-		return 0;
-	}
-
-	/* Nothing to do for idle groups */
-	if (!atomic_read(&group->container_users))
-		return 0;
-
-	/* TODO Prevent device auto probing */
-	dev_WARN(dev, "Device added to live group %d!\n",
-		 iommu_group_id(group->iommu_group));
-
-	return 0;
-}
-
-static int vfio_group_nb_verify(struct vfio_group *group, struct device *dev)
-{
-	/* We don't care what happens when the group isn't in use */
-	if (!atomic_read(&group->container_users))
-		return 0;
-
-	return vfio_dev_viable(dev, group);
-}
-
-static int vfio_iommu_group_notifier(struct notifier_block *nb,
-				     unsigned long action, void *data)
-{
-	struct vfio_group *group = container_of(nb, struct vfio_group, nb);
-	struct device *dev = data;
-
-	switch (action) {
-	case IOMMU_GROUP_NOTIFY_ADD_DEVICE:
-		vfio_group_nb_add_dev(group, dev);
-		break;
-	case IOMMU_GROUP_NOTIFY_DEL_DEVICE:
-		/*
-		 * Nothing to do here.  If the device is in use, then the
-		 * vfio sub-driver should block the remove callback until
-		 * it is unused.  If the device is unused or attached to a
-		 * stub driver, then it should be released and we don't
-		 * care that it will be going away.
-		 */
-		break;
-	case IOMMU_GROUP_NOTIFY_BIND_DRIVER:
-		dev_dbg(dev, "%s: group %d binding to driver\n", __func__,
-			iommu_group_id(group->iommu_group));
-		break;
-	case IOMMU_GROUP_NOTIFY_BOUND_DRIVER:
-		dev_dbg(dev, "%s: group %d bound to driver %s\n", __func__,
-			iommu_group_id(group->iommu_group), dev->driver->name);
-		BUG_ON(vfio_group_nb_verify(group, dev));
-		break;
-	case IOMMU_GROUP_NOTIFY_UNBIND_DRIVER:
-		dev_dbg(dev, "%s: group %d unbinding from driver %s\n",
-			__func__, iommu_group_id(group->iommu_group),
-			dev->driver->name);
-		break;
-	}
-	return NOTIFY_OK;
-}
-
 /**
  * VFIO driver API
  */
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 16/17] iommu: Remove iommu group changes notifier
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (14 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 15/17] vfio: Remove iommu group notifier Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  2:50 ` [PATCH v2 17/17] drm/tegra: Use the iommu dma_owner mechanism Lu Baolu
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu,
	Christoph Hellwig

The iommu group changes notifer is not referenced in the tree. Remove it
to avoid dead code.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 include/linux/iommu.h | 23 -------------
 drivers/iommu/iommu.c | 75 -------------------------------------------
 2 files changed, 98 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 8c81ba11ae8c..e7eeac801bcd 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -419,13 +419,6 @@ static inline void iommu_iotlb_gather_init(struct iommu_iotlb_gather *gather)
 	};
 }
 
-#define IOMMU_GROUP_NOTIFY_ADD_DEVICE		1 /* Device added */
-#define IOMMU_GROUP_NOTIFY_DEL_DEVICE		2 /* Pre Device removed */
-#define IOMMU_GROUP_NOTIFY_BIND_DRIVER		3 /* Pre Driver bind */
-#define IOMMU_GROUP_NOTIFY_BOUND_DRIVER		4 /* Post Driver bind */
-#define IOMMU_GROUP_NOTIFY_UNBIND_DRIVER	5 /* Pre Driver unbind */
-#define IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER	6 /* Post Driver unbind */
-
 extern int bus_set_iommu(struct bus_type *bus, const struct iommu_ops *ops);
 extern int bus_iommu_probe(struct bus_type *bus);
 extern bool iommu_present(struct bus_type *bus);
@@ -498,10 +491,6 @@ extern int iommu_group_for_each_dev(struct iommu_group *group, void *data,
 extern struct iommu_group *iommu_group_get(struct device *dev);
 extern struct iommu_group *iommu_group_ref_get(struct iommu_group *group);
 extern void iommu_group_put(struct iommu_group *group);
-extern int iommu_group_register_notifier(struct iommu_group *group,
-					 struct notifier_block *nb);
-extern int iommu_group_unregister_notifier(struct iommu_group *group,
-					   struct notifier_block *nb);
 extern int iommu_register_device_fault_handler(struct device *dev,
 					iommu_dev_fault_handler_t handler,
 					void *data);
@@ -915,18 +904,6 @@ static inline void iommu_group_put(struct iommu_group *group)
 {
 }
 
-static inline int iommu_group_register_notifier(struct iommu_group *group,
-						struct notifier_block *nb)
-{
-	return -ENODEV;
-}
-
-static inline int iommu_group_unregister_notifier(struct iommu_group *group,
-						  struct notifier_block *nb)
-{
-	return 0;
-}
-
 static inline
 int iommu_register_device_fault_handler(struct device *dev,
 					iommu_dev_fault_handler_t handler,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f9cb96acbac8..9cb1cebfd884 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -18,7 +18,6 @@
 #include <linux/errno.h>
 #include <linux/iommu.h>
 #include <linux/idr.h>
-#include <linux/notifier.h>
 #include <linux/err.h>
 #include <linux/pci.h>
 #include <linux/bitops.h>
@@ -40,7 +39,6 @@ struct iommu_group {
 	struct kobject *devices_kobj;
 	struct list_head devices;
 	struct mutex mutex;
-	struct blocking_notifier_head notifier;
 	void *iommu_data;
 	void (*iommu_data_release)(void *iommu_data);
 	char *name;
@@ -629,7 +627,6 @@ struct iommu_group *iommu_group_alloc(void)
 	mutex_init(&group->mutex);
 	INIT_LIST_HEAD(&group->devices);
 	INIT_LIST_HEAD(&group->entry);
-	BLOCKING_INIT_NOTIFIER_HEAD(&group->notifier);
 	group->dma_owner = DMA_OWNER_NONE;
 
 	ret = ida_simple_get(&iommu_group_ida, 0, 0, GFP_KERNEL);
@@ -905,10 +902,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 	if (ret)
 		goto err_put_group;
 
-	/* Notify any listeners about change to group. */
-	blocking_notifier_call_chain(&group->notifier,
-				     IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
-
 	trace_add_device_to_group(group->id, dev);
 
 	dev_info(dev, "Adding to iommu group %d\n", group->id);
@@ -950,10 +943,6 @@ void iommu_group_remove_device(struct device *dev)
 
 	dev_info(dev, "Removing from iommu group %d\n", group->id);
 
-	/* Pre-notify listeners that a device is being removed. */
-	blocking_notifier_call_chain(&group->notifier,
-				     IOMMU_GROUP_NOTIFY_DEL_DEVICE, dev);
-
 	mutex_lock(&group->mutex);
 	list_for_each_entry(tmp_device, &group->devices, list) {
 		if (tmp_device->dev == dev) {
@@ -1076,36 +1065,6 @@ void iommu_group_put(struct iommu_group *group)
 }
 EXPORT_SYMBOL_GPL(iommu_group_put);
 
-/**
- * iommu_group_register_notifier - Register a notifier for group changes
- * @group: the group to watch
- * @nb: notifier block to signal
- *
- * This function allows iommu group users to track changes in a group.
- * See include/linux/iommu.h for actions sent via this notifier.  Caller
- * should hold a reference to the group throughout notifier registration.
- */
-int iommu_group_register_notifier(struct iommu_group *group,
-				  struct notifier_block *nb)
-{
-	return blocking_notifier_chain_register(&group->notifier, nb);
-}
-EXPORT_SYMBOL_GPL(iommu_group_register_notifier);
-
-/**
- * iommu_group_unregister_notifier - Unregister a notifier
- * @group: the group to watch
- * @nb: notifier block to signal
- *
- * Unregister a previously registered group notifier block.
- */
-int iommu_group_unregister_notifier(struct iommu_group *group,
-				    struct notifier_block *nb)
-{
-	return blocking_notifier_chain_unregister(&group->notifier, nb);
-}
-EXPORT_SYMBOL_GPL(iommu_group_unregister_notifier);
-
 /**
  * iommu_register_device_fault_handler() - Register a device fault handler
  * @dev: the device
@@ -1654,14 +1613,8 @@ static int remove_iommu_group(struct device *dev, void *data)
 static int iommu_bus_notifier(struct notifier_block *nb,
 			      unsigned long action, void *data)
 {
-	unsigned long group_action = 0;
 	struct device *dev = data;
-	struct iommu_group *group;
 
-	/*
-	 * ADD/DEL call into iommu driver ops if provided, which may
-	 * result in ADD/DEL notifiers to group->notifier
-	 */
 	if (action == BUS_NOTIFY_ADD_DEVICE) {
 		int ret;
 
@@ -1672,34 +1625,6 @@ static int iommu_bus_notifier(struct notifier_block *nb,
 		return NOTIFY_OK;
 	}
 
-	/*
-	 * Remaining BUS_NOTIFYs get filtered and republished to the
-	 * group, if anyone is listening
-	 */
-	group = iommu_group_get(dev);
-	if (!group)
-		return 0;
-
-	switch (action) {
-	case BUS_NOTIFY_BIND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_BIND_DRIVER;
-		break;
-	case BUS_NOTIFY_BOUND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_BOUND_DRIVER;
-		break;
-	case BUS_NOTIFY_UNBIND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_UNBIND_DRIVER;
-		break;
-	case BUS_NOTIFY_UNBOUND_DRIVER:
-		group_action = IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER;
-		break;
-	}
-
-	if (group_action)
-		blocking_notifier_call_chain(&group->notifier,
-					     group_action, dev);
-
-	iommu_group_put(group);
 	return 0;
 }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 17/17] drm/tegra: Use the iommu dma_owner mechanism
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (15 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 16/17] iommu: Remove iommu group changes notifier Lu Baolu
@ 2021-11-28  2:50 ` Lu Baolu
  2021-11-28  8:10 ` [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Greg Kroah-Hartman
  2021-12-06  2:07 ` Lu Baolu
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-28  2:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel, Lu Baolu

From: Jason Gunthorpe <jgg@nvidia.com>

Tegra joins many platform devices onto the same iommu_domain and builds
sort-of a DMA API on top of it.

Given that iommu_attach/detatch_device_shared() has supported this usage
model. Each device that wants to use the special domain will use
suppress_auto_claim_dma_owner and call iommu_attach_device_shared() which
will use dma owner framework to lock out other usages of the group and
refcount the domain attachment.

When the last device calls detatch the domain will be disconnected.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/gpu/drm/tegra/dc.c   |  1 +
 drivers/gpu/drm/tegra/drm.c  | 55 ++++++++++++++++++------------------
 drivers/gpu/drm/tegra/gr2d.c |  1 +
 drivers/gpu/drm/tegra/gr3d.c |  1 +
 drivers/gpu/drm/tegra/vic.c  |  1 +
 5 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index a29d64f87563..3a2458f56fbc 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -3111,4 +3111,5 @@ struct platform_driver tegra_dc_driver = {
 	},
 	.probe = tegra_dc_probe,
 	.remove = tegra_dc_remove,
+	.suppress_auto_claim_dma_owner = true,
 };
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 8d37d6b00562..e6e2da799674 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -928,12 +928,15 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
 	return 0;
 }
 
+/*
+ * Clients which use this function must set suppress_auto_claim_dma_owner in
+ * their platform_driver's device_driver struct.
+ */
 int host1x_client_iommu_attach(struct host1x_client *client)
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(client->dev);
 	struct drm_device *drm = dev_get_drvdata(client->host);
 	struct tegra_drm *tegra = drm->dev_private;
-	struct iommu_group *group = NULL;
 	int err;
 
 	/*
@@ -941,47 +944,45 @@ int host1x_client_iommu_attach(struct host1x_client *client)
 	 * not the shared IOMMU domain, don't try to attach it to a different
 	 * domain. This allows using the IOMMU-backed DMA API.
 	 */
+	client->group = NULL;
 	if (domain && domain != tegra->domain)
+		return iommu_device_set_dma_owner(client->dev,
+						  DMA_OWNER_DMA_API, NULL);
+
+	if (!tegra->domain)
 		return 0;
 
-	if (tegra->domain) {
-		group = iommu_group_get(client->dev);
-		if (!group)
-			return -ENODEV;
-
-		if (domain != tegra->domain) {
-			err = iommu_attach_group(tegra->domain, group);
-			if (err < 0) {
-				iommu_group_put(group);
-				return err;
-			}
-		}
+	err = iommu_device_set_dma_owner(client->dev,
+					 DMA_OWNER_PRIVATE_DOMAIN, NULL);
+	if (err)
+		return err;
 
-		tegra->use_explicit_iommu = true;
+	err = iommu_attach_device_shared(tegra->domain, client->dev);
+	if (err) {
+		iommu_device_release_dma_owner(client->dev,
+					       DMA_OWNER_PRIVATE_DOMAIN);
+		return err;
 	}
 
-	client->group = group;
-
+	tegra->use_explicit_iommu = true;
+	client->group = client->dev->iommu_group;
 	return 0;
 }
 
 void host1x_client_iommu_detach(struct host1x_client *client)
 {
+	struct iommu_domain *domain = iommu_get_domain_for_dev(client->dev);
 	struct drm_device *drm = dev_get_drvdata(client->host);
 	struct tegra_drm *tegra = drm->dev_private;
-	struct iommu_domain *domain;
 
-	if (client->group) {
-		/*
-		 * Devices that are part of the same group may no longer be
-		 * attached to a domain at this point because their group may
-		 * have been detached by an earlier client.
-		 */
-		domain = iommu_get_domain_for_dev(client->dev);
-		if (domain)
-			iommu_detach_group(tegra->domain, client->group);
+	if (domain && domain != tegra->domain)
+		return iommu_device_release_dma_owner(client->dev,
+						      DMA_OWNER_DMA_API);
 
-		iommu_group_put(client->group);
+	if (client->group) {
+		iommu_detach_device_shared(tegra->domain, client->dev);
+		iommu_device_release_dma_owner(client->dev,
+					       DMA_OWNER_PRIVATE_DOMAIN);
 		client->group = NULL;
 	}
 }
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c
index de288cba3905..8d92141c3c86 100644
--- a/drivers/gpu/drm/tegra/gr2d.c
+++ b/drivers/gpu/drm/tegra/gr2d.c
@@ -271,4 +271,5 @@ struct platform_driver tegra_gr2d_driver = {
 	},
 	.probe = gr2d_probe,
 	.remove = gr2d_remove,
+	.suppress_auto_claim_dma_owner = true,
 };
diff --git a/drivers/gpu/drm/tegra/gr3d.c b/drivers/gpu/drm/tegra/gr3d.c
index 24442ade0da3..33c4954977ab 100644
--- a/drivers/gpu/drm/tegra/gr3d.c
+++ b/drivers/gpu/drm/tegra/gr3d.c
@@ -400,4 +400,5 @@ struct platform_driver tegra_gr3d_driver = {
 	},
 	.probe = gr3d_probe,
 	.remove = gr3d_remove,
+	.suppress_auto_claim_dma_owner = true,
 };
diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index c02010ff2b7f..114df067ea00 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -527,6 +527,7 @@ struct platform_driver tegra_vic_driver = {
 	},
 	.probe = vic_probe,
 	.remove = vic_remove,
+	.suppress_auto_claim_dma_owner = true,
 };
 
 #if IS_ENABLED(CONFIG_ARCH_TEGRA_124_SOC)
-- 
2.25.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 02/17] driver core: Add dma_unconfigure callback in bus_type
  2021-11-28  2:50 ` [PATCH v2 02/17] driver core: Add dma_unconfigure callback in bus_type Lu Baolu
@ 2021-11-28  8:02   ` Greg Kroah-Hartman
  2021-11-29  4:03     ` Lu Baolu
  0 siblings, 1 reply; 27+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-28  8:02 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Alex Williamson, Bjorn Helgaas, Jason Gunthorpe,
	Christoph Hellwig, Kevin Tian, Ashok Raj, Will Deacon,
	Robin Murphy, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, iommu, linux-pci, kvm,
	linux-kernel

On Sun, Nov 28, 2021 at 10:50:36AM +0800, Lu Baolu wrote:
> The bus_type structure defines dma_configure() callback for bus drivers
> to configure DMA on the devices. This adds the paired dma_unconfigure()
> callback and calls it during driver unbinding so that bus drivers can do
> some cleanup work.
> 
> One use case for this paired DMA callbacks is for the bus driver to check
> for DMA ownership conflicts during driver binding, where multiple devices
> belonging to a same IOMMU group (the minimum granularity of isolation and
> protection) may be assigned to kernel drivers or user space respectively.
> 
> Without this change, for example, the vfio driver has to listen to a bus
> BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict.
> This leads to bad user experience since careless driver binding operation
> may crash the system if the admin overlooks the group restriction. Aside
> from bad design, this leads to a security problem as a root user, even with
> lockdown=integrity, can force the kernel to BUG.
> 
> With this change, the bus driver could check and set the DMA ownership in
> driver binding process and fail on ownership conflicts. The DMA ownership
> should be released during driver unbinding.
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Link: https://lore.kernel.org/linux-iommu/20210922123931.GI327412@nvidia.com/
> Link: https://lore.kernel.org/linux-iommu/20210928115751.GK964074@nvidia.com/
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  include/linux/device/bus.h | 3 +++
>  drivers/base/dd.c          | 7 ++++++-
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
> index a039ab809753..ef54a71e5f8f 100644
> --- a/include/linux/device/bus.h
> +++ b/include/linux/device/bus.h
> @@ -59,6 +59,8 @@ struct fwnode_handle;
>   *		bus supports.
>   * @dma_configure:	Called to setup DMA configuration on a device on
>   *			this bus.
> + * @dma_unconfigure:	Called to cleanup DMA configuration on a device on
> + *			this bus.

"dma_cleanup()" is a better name for this, don't you think?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 04/17] driver core: platform: Add driver dma ownership management
  2021-11-28  2:50 ` [PATCH v2 04/17] driver core: platform: " Lu Baolu
@ 2021-11-28  8:10   ` Greg Kroah-Hartman
  2021-11-28 23:15     ` Jason Gunthorpe
  0 siblings, 1 reply; 27+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-28  8:10 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Alex Williamson, Bjorn Helgaas, Jason Gunthorpe,
	Christoph Hellwig, Kevin Tian, Ashok Raj, Will Deacon,
	Robin Murphy, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, iommu, linux-pci, kvm,
	linux-kernel

On Sun, Nov 28, 2021 at 10:50:38AM +0800, Lu Baolu wrote:
> Multiple platform devices may be placed in the same IOMMU group because
> they cannot be isolated from each other. These devices must either be
> entirely under kernel control or userspace control, never a mixture. This
> checks and sets DMA ownership during driver binding, and release the
> ownership during driver unbinding.
> 
> Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
> claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
> the userspace framework drivers (vfio etc.) which need to manually claim
> DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.

Why would any vfio driver be a platform driver?  That should never be
the case as they obviously are not platform drivers, they are virtual
ones.

> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  include/linux/platform_device.h |  1 +
>  drivers/base/platform.c         | 30 +++++++++++++++++++++++++++++-
>  2 files changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
> index 7c96f169d274..779bcf2a851c 100644
> --- a/include/linux/platform_device.h
> +++ b/include/linux/platform_device.h
> @@ -210,6 +210,7 @@ struct platform_driver {
>  	struct device_driver driver;
>  	const struct platform_device_id *id_table;
>  	bool prevent_deferred_probe;
> +	bool suppress_auto_claim_dma_owner;

What platform driver needs this change?

>  };
>  
>  #define to_platform_driver(drv)	(container_of((drv), struct platform_driver, \
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 598acf93a360..df4b385c8a52 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -30,6 +30,7 @@
>  #include <linux/property.h>
>  #include <linux/kmemleak.h>
>  #include <linux/types.h>
> +#include <linux/iommu.h>
>  
>  #include "base.h"
>  #include "power/power.h"
> @@ -1465,6 +1466,32 @@ int platform_dma_configure(struct device *dev)
>  	return ret;
>  }
>  
> +static int _platform_dma_configure(struct device *dev)
> +{
> +	struct platform_driver *drv = to_platform_driver(dev->driver);
> +	int ret;
> +
> +	if (!drv->suppress_auto_claim_dma_owner) {
> +		ret = iommu_device_set_dma_owner(dev, DMA_OWNER_DMA_API, NULL);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	ret = platform_dma_configure(dev);
> +	if (ret && !drv->suppress_auto_claim_dma_owner)
> +		iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
> +
> +	return ret;
> +}
> +
> +static void _platform_dma_unconfigure(struct device *dev)
> +{
> +	struct platform_driver *drv = to_platform_driver(dev->driver);
> +
> +	if (!drv->suppress_auto_claim_dma_owner)
> +		iommu_device_release_dma_owner(dev, DMA_OWNER_DMA_API);
> +}
> +
>  static const struct dev_pm_ops platform_dev_pm_ops = {
>  	SET_RUNTIME_PM_OPS(pm_generic_runtime_suspend, pm_generic_runtime_resume, NULL)
>  	USE_PLATFORM_PM_SLEEP_OPS
> @@ -1478,7 +1505,8 @@ struct bus_type platform_bus_type = {
>  	.probe		= platform_probe,
>  	.remove		= platform_remove,
>  	.shutdown	= platform_shutdown,
> -	.dma_configure	= platform_dma_configure,
> +	.dma_configure	= _platform_dma_configure,

What happened to the original platform_dma_configure() function?

And single "_" prefixes are odd, please just spell out what the
difference is in the function name, "_" gives us no hint at all.

thnaks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier()
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (16 preceding siblings ...)
  2021-11-28  2:50 ` [PATCH v2 17/17] drm/tegra: Use the iommu dma_owner mechanism Lu Baolu
@ 2021-11-28  8:10 ` Greg Kroah-Hartman
  2021-11-29  3:59   ` Lu Baolu
  2021-12-06  2:07 ` Lu Baolu
  18 siblings, 1 reply; 27+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-28  8:10 UTC (permalink / raw)
  To: Lu Baolu
  Cc: Joerg Roedel, Alex Williamson, Bjorn Helgaas, Jason Gunthorpe,
	Christoph Hellwig, Kevin Tian, Ashok Raj, Will Deacon,
	Robin Murphy, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, iommu, linux-pci, kvm,
	linux-kernel

On Sun, Nov 28, 2021 at 10:50:34AM +0800, Lu Baolu wrote:
> The original post and intent of this series is here.
> https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/

Please put the intent in the message, dont make us go and dig it up
again.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 04/17] driver core: platform: Add driver dma ownership management
  2021-11-28  8:10   ` Greg Kroah-Hartman
@ 2021-11-28 23:15     ` Jason Gunthorpe
  2021-11-29 10:34       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 27+ messages in thread
From: Jason Gunthorpe @ 2021-11-28 23:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Lu Baolu, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Christoph Hellwig, Kevin Tian, Ashok Raj, Will Deacon,
	Robin Murphy, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, iommu, linux-pci, kvm,
	linux-kernel

On Sun, Nov 28, 2021 at 09:10:14AM +0100, Greg Kroah-Hartman wrote:
> On Sun, Nov 28, 2021 at 10:50:38AM +0800, Lu Baolu wrote:
> > Multiple platform devices may be placed in the same IOMMU group because
> > they cannot be isolated from each other. These devices must either be
> > entirely under kernel control or userspace control, never a mixture. This
> > checks and sets DMA ownership during driver binding, and release the
> > ownership during driver unbinding.
> > 
> > Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
> > claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
> > the userspace framework drivers (vfio etc.) which need to manually claim
> > DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.
> 
> Why would any vfio driver be a platform driver?  

Why not? VFIO implements drivers for most physical device types
these days. Why wouldn't platform be included?

> That should never be the case as they obviously are not platform
> drivers, they are virtual ones.

Huh?

> > diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
> > index 7c96f169d274..779bcf2a851c 100644
> > +++ b/include/linux/platform_device.h
> > @@ -210,6 +210,7 @@ struct platform_driver {
> >  	struct device_driver driver;
> >  	const struct platform_device_id *id_table;
> >  	bool prevent_deferred_probe;
> > +	bool suppress_auto_claim_dma_owner;
> 
> What platform driver needs this change?

It is in patch 12:

--- a/drivers/vfio/platform/vfio_platform.c
+++ b/drivers/vfio/platform/vfio_platform.c
@@ -76,6 +76,7 @@ static struct platform_driver vfio_platform_driver = {
        .driver = {
                .name   = "vfio-platform",
        },
+       .suppress_auto_claim_dma_owner = true,
 };

Which is how VFIO provides support to DPDK for some Ethernet
controllers embedded in a few ARM SOCs.

It is also used in patch 17 in five tegra platform_drivers to make
their sharing of an iommu group between possibly related
platform_driver's safer.

> >  	USE_PLATFORM_PM_SLEEP_OPS
> > @@ -1478,7 +1505,8 @@ struct bus_type platform_bus_type = {
> >  	.probe		= platform_probe,
> >  	.remove		= platform_remove,
> >  	.shutdown	= platform_shutdown,
> > -	.dma_configure	= platform_dma_configure,
> > +	.dma_configure	= _platform_dma_configure,
> 
> What happened to the original platform_dma_configure() function?

It is still called. The issue here is that platform_dma_configure has
nothing to do with platform and is being re-used by AMBA.

Probably the resolution to both remarks is to rename
platform_dma_configure to something sensible (firwmare dma configure
maybe?) and use it in all places that do the of & acpi stuff -
pci/amba/platform at least.

Jason

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier()
  2021-11-28  8:10 ` [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Greg Kroah-Hartman
@ 2021-11-29  3:59   ` Lu Baolu
  0 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-29  3:59 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: baolu.lu, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj,
	Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel

Hi Greg,

On 11/28/21 4:10 PM, Greg Kroah-Hartman wrote:
> On Sun, Nov 28, 2021 at 10:50:34AM +0800, Lu Baolu wrote:
>> The original post and intent of this series is here.
>> https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/
> 
> Please put the intent in the message, dont make us go and dig it up
> again.

Sure! I will include the message next time. Thanks!

> thanks,
> 
> greg k-h
> 

Best regards,
baolu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 02/17] driver core: Add dma_unconfigure callback in bus_type
  2021-11-28  8:02   ` Greg Kroah-Hartman
@ 2021-11-29  4:03     ` Lu Baolu
  0 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-11-29  4:03 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: baolu.lu, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj,
	Will Deacon, Robin Murphy, Dan Williams, rafael, Diana Craciun,
	Cornelia Huck, Eric Auger, Liu Yi L, Jacob jun Pan,
	Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel

On 11/28/21 4:02 PM, Greg Kroah-Hartman wrote:
> On Sun, Nov 28, 2021 at 10:50:36AM +0800, Lu Baolu wrote:
>> The bus_type structure defines dma_configure() callback for bus drivers
>> to configure DMA on the devices. This adds the paired dma_unconfigure()
>> callback and calls it during driver unbinding so that bus drivers can do
>> some cleanup work.
>>
>> One use case for this paired DMA callbacks is for the bus driver to check
>> for DMA ownership conflicts during driver binding, where multiple devices
>> belonging to a same IOMMU group (the minimum granularity of isolation and
>> protection) may be assigned to kernel drivers or user space respectively.
>>
>> Without this change, for example, the vfio driver has to listen to a bus
>> BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict.
>> This leads to bad user experience since careless driver binding operation
>> may crash the system if the admin overlooks the group restriction. Aside
>> from bad design, this leads to a security problem as a root user, even with
>> lockdown=integrity, can force the kernel to BUG.
>>
>> With this change, the bus driver could check and set the DMA ownership in
>> driver binding process and fail on ownership conflicts. The DMA ownership
>> should be released during driver unbinding.
>>
>> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
>> Link: https://lore.kernel.org/linux-iommu/20210922123931.GI327412@nvidia.com/
>> Link: https://lore.kernel.org/linux-iommu/20210928115751.GK964074@nvidia.com/
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   include/linux/device/bus.h | 3 +++
>>   drivers/base/dd.c          | 7 ++++++-
>>   2 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
>> index a039ab809753..ef54a71e5f8f 100644
>> --- a/include/linux/device/bus.h
>> +++ b/include/linux/device/bus.h
>> @@ -59,6 +59,8 @@ struct fwnode_handle;
>>    *		bus supports.
>>    * @dma_configure:	Called to setup DMA configuration on a device on
>>    *			this bus.
>> + * @dma_unconfigure:	Called to cleanup DMA configuration on a device on
>> + *			this bus.
> 
> "dma_cleanup()" is a better name for this, don't you think?

I agree with you. dma_cleanup() is more explicit and better here.

> 
> thanks,
> 
> greg k-h
> 

Best regards,
baolu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 04/17] driver core: platform: Add driver dma ownership management
  2021-11-28 23:15     ` Jason Gunthorpe
@ 2021-11-29 10:34       ` Greg Kroah-Hartman
  2021-11-29 12:59         ` Jason Gunthorpe
  0 siblings, 1 reply; 27+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-29 10:34 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Lu Baolu, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Christoph Hellwig, Kevin Tian, Ashok Raj, Will Deacon,
	Robin Murphy, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, iommu, linux-pci, kvm,
	linux-kernel

On Sun, Nov 28, 2021 at 07:15:09PM -0400, Jason Gunthorpe wrote:
> On Sun, Nov 28, 2021 at 09:10:14AM +0100, Greg Kroah-Hartman wrote:
> > On Sun, Nov 28, 2021 at 10:50:38AM +0800, Lu Baolu wrote:
> > > Multiple platform devices may be placed in the same IOMMU group because
> > > they cannot be isolated from each other. These devices must either be
> > > entirely under kernel control or userspace control, never a mixture. This
> > > checks and sets DMA ownership during driver binding, and release the
> > > ownership during driver unbinding.
> > > 
> > > Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
> > > claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
> > > the userspace framework drivers (vfio etc.) which need to manually claim
> > > DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.
> > 
> > Why would any vfio driver be a platform driver?  
> 
> Why not? VFIO implements drivers for most physical device types
> these days. Why wouldn't platform be included?

Because "platform" is not a real device type.  It's a catch-all for
devices that are only described by firmware, so why would you have a
virtual device for that?  Why would that be needed?

> > > diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
> > > index 7c96f169d274..779bcf2a851c 100644
> > > +++ b/include/linux/platform_device.h
> > > @@ -210,6 +210,7 @@ struct platform_driver {
> > >  	struct device_driver driver;
> > >  	const struct platform_device_id *id_table;
> > >  	bool prevent_deferred_probe;
> > > +	bool suppress_auto_claim_dma_owner;
> > 
> > What platform driver needs this change?
> 
> It is in patch 12:
> 
> --- a/drivers/vfio/platform/vfio_platform.c
> +++ b/drivers/vfio/platform/vfio_platform.c

Ok, nevermind, you do have a virtual platform device, which personally,
I find crazy as why would firmware export a "virtual device"?

> @@ -76,6 +76,7 @@ static struct platform_driver vfio_platform_driver = {
>         .driver = {
>                 .name   = "vfio-platform",
>         },
> +       .suppress_auto_claim_dma_owner = true,
>  };
> 
> Which is how VFIO provides support to DPDK for some Ethernet
> controllers embedded in a few ARM SOCs.

Ick.  Where does the DT file for these devices live that describe a
"virtual device" to match with this driver?

> It is also used in patch 17 in five tegra platform_drivers to make
> their sharing of an iommu group between possibly related
> platform_driver's safer.

Safer how?

> > >  	USE_PLATFORM_PM_SLEEP_OPS
> > > @@ -1478,7 +1505,8 @@ struct bus_type platform_bus_type = {
> > >  	.probe		= platform_probe,
> > >  	.remove		= platform_remove,
> > >  	.shutdown	= platform_shutdown,
> > > -	.dma_configure	= platform_dma_configure,
> > > +	.dma_configure	= _platform_dma_configure,
> > 
> > What happened to the original platform_dma_configure() function?
> 
> It is still called. The issue here is that platform_dma_configure has
> nothing to do with platform and is being re-used by AMBA.

Ick, why?  AMBA needs to be a real bus type and use their own functions
if needed.  There is nothing here that makes this obvious that someone
else is using those functions and that the platform bus should only be
using these "new" functions.

> Probably the resolution to both remarks is to rename
> platform_dma_configure to something sensible (firwmare dma configure
> maybe?) and use it in all places that do the of & acpi stuff -
> pci/amba/platform at least.

That would be better than what is being proposed here.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 04/17] driver core: platform: Add driver dma ownership management
  2021-11-29 10:34       ` Greg Kroah-Hartman
@ 2021-11-29 12:59         ` Jason Gunthorpe
  0 siblings, 0 replies; 27+ messages in thread
From: Jason Gunthorpe @ 2021-11-29 12:59 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Lu Baolu, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Christoph Hellwig, Kevin Tian, Ashok Raj, Will Deacon,
	Robin Murphy, Dan Williams, rafael, Diana Craciun, Cornelia Huck,
	Eric Auger, Liu Yi L, Jacob jun Pan, Chaitanya Kulkarni,
	Stuart Yoder, Laurentiu Tudor, Thierry Reding, David Airlie,
	Daniel Vetter, Jonathan Hunter, Li Yang, iommu, linux-pci, kvm,
	linux-kernel

On Mon, Nov 29, 2021 at 11:34:39AM +0100, Greg Kroah-Hartman wrote:
> On Sun, Nov 28, 2021 at 07:15:09PM -0400, Jason Gunthorpe wrote:
> > On Sun, Nov 28, 2021 at 09:10:14AM +0100, Greg Kroah-Hartman wrote:
> > > On Sun, Nov 28, 2021 at 10:50:38AM +0800, Lu Baolu wrote:
> > > > Multiple platform devices may be placed in the same IOMMU group because
> > > > they cannot be isolated from each other. These devices must either be
> > > > entirely under kernel control or userspace control, never a mixture. This
> > > > checks and sets DMA ownership during driver binding, and release the
> > > > ownership during driver unbinding.
> > > > 
> > > > Driver may set a new flag (suppress_auto_claim_dma_owner) to disable auto
> > > > claiming DMA_OWNER_DMA_API ownership in the binding process. For instance,
> > > > the userspace framework drivers (vfio etc.) which need to manually claim
> > > > DMA_OWNER_PRIVATE_DOMAIN_USER when assigning a device to userspace.
> > > 
> > > Why would any vfio driver be a platform driver?  
> > 
> > Why not? VFIO implements drivers for most physical device types
> > these days. Why wouldn't platform be included?
> 
> Because "platform" is not a real device type.  It's a catch-all for
> devices that are only described by firmware, so why would you have a
> virtual device for that?  Why would that be needed?

Why does it matter how a physical device is enumerated?
PCI/DT/ACPI/setup.c - it doesn't matter. There is still a physical
device with physical DMA and MMIO.

As long as people are making ethernet controllers and other
interesting devices enumerated through platform_device there will be
need to expose them to userspace through VFIO too.

> Ok, nevermind, you do have a virtual platform device, which personally,
> I find crazy as why would firmware export a "virtual device"?

Why do you keep saying "virtual device"?

The "VF" in vfio refers to language in the PCI-SIG SRIOV
specification. You are better to think of the V as meaning "for
virtualization".

Today vfio is just a nonsense acronym. The subsystem's job is to allow
user space to fully operate a physical HW device, including using its
DMA and interrupts. With the advent of DPDK/SPDK/etc it isn't even
related to virtualization use-cases any more.

It is a lot like UIO except VFIO can operate devices that use DMA too.

> > @@ -76,6 +76,7 @@ static struct platform_driver vfio_platform_driver = {
> >         .driver = {
> >                 .name   = "vfio-platform",
> >         },
> > +       .suppress_auto_claim_dma_owner = true,
> >  };
> > 
> > Which is how VFIO provides support to DPDK for some Ethernet
> > controllers embedded in a few ARM SOCs.
> 
> Ick.  Where does the DT file for these devices live that describe a
> "virtual device" to match with this driver?

The DT describes a physical ethernet device that would normally load
it's netdev driver. If the admin wishes to use that physical ethernet
device in userspace, say with DPDK, then the admin switches the driver
from netdev to vfio.

The OF compatible string 'calxeda,hb-xgmac' is one example:
 Documentation/devicetree/bindings/net/calxeda-xgmac.yaml
 arch/arm/boot/dts/ecx-common.dtsi
 drivers/net/ethernet/calxeda/xgmac.c
 drivers/vfio/platform/reset/vfio_platform_calxedaxgmac.c

We all recently went over the switching mechanim in a lot of detail
when you Ack'd this patch series:

https://lore.kernel.org/kvm/20210721161609.68223-1-yishaih@nvidia.com/

(though vfio platform has not yet been revised to use this mechanism,
it still uses the sysfs driver_override)

> > It is also used in patch 17 in five tegra platform_drivers to make
> > their sharing of an iommu group between possibly related
> > platform_driver's safer.
> 
> Safer how?

tegra makes assumptions that the DT configures the IOMMU in a certain
way. If the DT does something else then the kernel will probably corrupt
memory with wild DMAs. After this series the conflicting IOMMU usages
will be detected and blocked instead.

Thanks,
Jason

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier()
  2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
                   ` (17 preceding siblings ...)
  2021-11-28  8:10 ` [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Greg Kroah-Hartman
@ 2021-12-06  2:07 ` Lu Baolu
  18 siblings, 0 replies; 27+ messages in thread
From: Lu Baolu @ 2021-12-06  2:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Joerg Roedel, Alex Williamson, Bjorn Helgaas,
	Jason Gunthorpe, Christoph Hellwig, Kevin Tian, Ashok Raj
  Cc: baolu.lu, Will Deacon, Robin Murphy, Dan Williams, rafael,
	Diana Craciun, Cornelia Huck, Eric Auger, Liu Yi L,
	Jacob jun Pan, Chaitanya Kulkarni, Stuart Yoder, Laurentiu Tudor,
	Thierry Reding, David Airlie, Daniel Vetter, Jonathan Hunter,
	Li Yang, iommu, linux-pci, kvm, linux-kernel

On 11/28/21 10:50 AM, Lu Baolu wrote:
> The original post and intent of this series is here.
> https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/
> 
> Change log:
> v1: initial post
>    - https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/
> 
> v2:
>    - Move kernel dma ownership auto-claiming from driver core to bus
>      callback. [Greg/Christoph/Robin/Jason]
>      https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/T/#m153706912b770682cb12e3c28f57e171aa1f9d0c
> 
>    - Code and interface refactoring for iommu_set/release_dma_owner()
>      interfaces. [Jason]
>      https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/T/#mea70ed8e4e3665aedf32a5a0a7db095bf680325e
> 
>    - [NEW]Add new iommu_attach/detach_device_shared() interfaces for
>      multiple devices group. [Robin/Jason]
>      https://lore.kernel.org/linux-iommu/20211115020552.2378167-1-baolu.lu@linux.intel.com/T/#mea70ed8e4e3665aedf32a5a0a7db095bf680325e
>    
>    - [NEW]Use iommu_attach/detach_device_shared() in drm/tegra drivers.
> 
>    - Refactoring and description refinement.
> 
> This is based on v5.16-rc2 and available on github:
> https://github.com/LuBaolu/intel-iommu/commits/iommu-dma-ownership-v2

The v3 of this series has been posted here:

https://lore.kernel.org/linux-iommu/20211206015903.88687-1-baolu.lu@linux.intel.com/

Best regards,
baolu

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2021-12-06  2:07 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-28  2:50 [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Lu Baolu
2021-11-28  2:50 ` [PATCH v2 01/17] iommu: Add device dma ownership set/release interfaces Lu Baolu
2021-11-28  2:50 ` [PATCH v2 02/17] driver core: Add dma_unconfigure callback in bus_type Lu Baolu
2021-11-28  8:02   ` Greg Kroah-Hartman
2021-11-29  4:03     ` Lu Baolu
2021-11-28  2:50 ` [PATCH v2 03/17] PCI: Add driver dma ownership management Lu Baolu
2021-11-28  2:50 ` [PATCH v2 04/17] driver core: platform: " Lu Baolu
2021-11-28  8:10   ` Greg Kroah-Hartman
2021-11-28 23:15     ` Jason Gunthorpe
2021-11-29 10:34       ` Greg Kroah-Hartman
2021-11-29 12:59         ` Jason Gunthorpe
2021-11-28  2:50 ` [PATCH v2 05/17] amba: " Lu Baolu
2021-11-28  2:50 ` [PATCH v2 06/17] bus: fsl-mc: " Lu Baolu
2021-11-28  2:50 ` [PATCH v2 07/17] PCI: pci_stub: Suppress kernel DMA ownership auto-claiming Lu Baolu
2021-11-28  2:50 ` [PATCH v2 08/17] PCI: portdrv: " Lu Baolu
2021-11-28  2:50 ` [PATCH v2 09/17] iommu: Add security context management for assigned devices Lu Baolu
2021-11-28  2:50 ` [PATCH v2 10/17] iommu: Expose group variants of dma ownership interfaces Lu Baolu
2021-11-28  2:50 ` [PATCH v2 11/17] iommu: Add iommu_at[de]tach_device_shared() for multi-device groups Lu Baolu
2021-11-28  2:50 ` [PATCH v2 12/17] vfio: Set DMA USER ownership for VFIO devices Lu Baolu
2021-11-28  2:50 ` [PATCH v2 13/17] vfio: Remove use of vfio_group_viable() Lu Baolu
2021-11-28  2:50 ` [PATCH v2 14/17] vfio: Delete the unbound_list Lu Baolu
2021-11-28  2:50 ` [PATCH v2 15/17] vfio: Remove iommu group notifier Lu Baolu
2021-11-28  2:50 ` [PATCH v2 16/17] iommu: Remove iommu group changes notifier Lu Baolu
2021-11-28  2:50 ` [PATCH v2 17/17] drm/tegra: Use the iommu dma_owner mechanism Lu Baolu
2021-11-28  8:10 ` [PATCH v2 00/17] Fix BUG_ON in vfio_iommu_group_notifier() Greg Kroah-Hartman
2021-11-29  3:59   ` Lu Baolu
2021-12-06  2:07 ` Lu Baolu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).