LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v2 0/4] Retrieving zPCI specific info with VFIO
@ 2019-05-17 16:16 Pierre Morel
  2019-05-17 16:16 ` [PATCH v2 1/4] s390: pci: Exporting access to CLP PCI function and PCI group Pierre Morel
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Pierre Morel @ 2019-05-17 16:16 UTC (permalink / raw)
  To: sebott
  Cc: gerald.schaefer, pasic, borntraeger, walling, linux-s390, iommu,
	joro, linux-kernel, alex.williamson, kvm, schwidefsky,
	heiko.carstens, robin.murphy

Using the PCI VFIO interface allows userland, a.k.a. QEMU, to retrieve
ZPCI specific information without knowing Z specific identifiers
like the function ID or the function handle of the zPCI function
hidden behind the PCI interface.

By using the VFIO_IOMMU_GET_INFO ioctl we enter the vfio_iommu_type1
ioctl callback and can insert there the treatment for a new Z specific
capability.

Once in vfio_iommu_type1 we can retrieve the real iommu device,
s390_iommu and call the get_attr iommu operation's callback
in which we can retrieve the zdev device and start clp operations
to retrieve Z specific values the guest driver is concerned with.

To share the code with arch/s390/pci/pci_clp.c the original functions
in pci_clp.c to query PCI functions and PCI functions group are
modified so that they can be exported.

A new function clp_query_pci() replaces clp_query_pci_fn() and
the previous calls to clp_query_pci_fn() and clp_query_pci_fngrp()
are replaced with calls to zdev_query_pci_fn() and zdev_query_pci_fngrp()
using a zdev pointer as argument.


Pierre Morel (4):
  s390: pci: Exporting access to CLP PCI function and PCI group
  vfio: vfio_iommu_type1: Define VFIO_IOMMU_INFO_CAPABILITIES
  s390: iommu: Adding get attributes for s390_iommu
  vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

 arch/s390/include/asm/pci.h     |   3 +
 arch/s390/pci/pci_clp.c         |  70 ++++++++++++-----------
 drivers/iommu/s390-iommu.c      |  77 +++++++++++++++++++++++++
 drivers/vfio/vfio_iommu_type1.c | 122 +++++++++++++++++++++++++++++++++++++++-
 include/linux/iommu.h           |   4 ++
 include/uapi/linux/vfio.h       |  67 ++++++++++++++++++++++
 6 files changed, 308 insertions(+), 35 deletions(-)

-- 
2.7.4

Changelog
From V1:
- no export of the symbol of the new zPCI CLP functions
(Robin)
- adding descriptions for the VFIO capabilities
(Alex)
- defining the structure of the data retrieved through
  VFIO_IOMMU_GET_INFO
(Alex)
- code modifications to allow architecture independence
  for the capabilities
(Alex)


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/4] s390: pci: Exporting access to CLP PCI function and PCI group
  2019-05-17 16:16 [PATCH v2 0/4] Retrieving zPCI specific info with VFIO Pierre Morel
@ 2019-05-17 16:16 ` Pierre Morel
  2019-05-17 16:16 ` [PATCH v2 2/4] vfio: vfio_iommu_type1: Define VFIO_IOMMU_INFO_CAPABILITIES Pierre Morel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Pierre Morel @ 2019-05-17 16:16 UTC (permalink / raw)
  To: sebott
  Cc: gerald.schaefer, pasic, borntraeger, walling, linux-s390, iommu,
	joro, linux-kernel, alex.williamson, kvm, schwidefsky,
	heiko.carstens, robin.murphy

For the generic implementation of VFIO PCI we need to retrieve
the hardware configuration for the PCI functions and the
PCI function groups.

We modify the internal function using CLP Query PCI function and
CLP query PCI function group so that they can be called from
outside the S390 architecture PCI code and prefix the two
functions with "zdev" to make clear that they can be called
knowing only the associated zdevice.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
---
 arch/s390/include/asm/pci.h |  3 ++
 arch/s390/pci/pci_clp.c     | 70 +++++++++++++++++++++++----------------------
 2 files changed, 39 insertions(+), 34 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 305befd..e66b246 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -261,4 +261,7 @@ cpumask_of_pcibus(const struct pci_bus *bus)
 
 #endif /* CONFIG_NUMA */
 
+int zdev_query_pci_fngrp(struct zpci_dev *zdev,
+			 struct clp_req_rsp_query_pci_grp *rrb);
+int zdev_query_pci_fn(struct zpci_dev *zdev, struct clp_req_rsp_query_pci *rrb);
 #endif
diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
index 3a36b07..c57f675 100644
--- a/arch/s390/pci/pci_clp.c
+++ b/arch/s390/pci/pci_clp.c
@@ -113,31 +113,16 @@ static void clp_store_query_pci_fngrp(struct zpci_dev *zdev,
 	}
 }
 
-static int clp_query_pci_fngrp(struct zpci_dev *zdev, u8 pfgid)
+int zdev_query_pci_fngrp(struct zpci_dev *zdev,
+			 struct clp_req_rsp_query_pci_grp *rrb)
 {
-	struct clp_req_rsp_query_pci_grp *rrb;
-	int rc;
-
-	rrb = clp_alloc_block(GFP_KERNEL);
-	if (!rrb)
-		return -ENOMEM;
-
 	memset(rrb, 0, sizeof(*rrb));
 	rrb->request.hdr.len = sizeof(rrb->request);
 	rrb->request.hdr.cmd = CLP_QUERY_PCI_FNGRP;
 	rrb->response.hdr.len = sizeof(rrb->response);
-	rrb->request.pfgid = pfgid;
+	rrb->request.pfgid = zdev->pfgid;
 
-	rc = clp_req(rrb, CLP_LPS_PCI);
-	if (!rc && rrb->response.hdr.rsp == CLP_RC_OK)
-		clp_store_query_pci_fngrp(zdev, &rrb->response);
-	else {
-		zpci_err("Q PCI FGRP:\n");
-		zpci_err_clp(rrb->response.hdr.rsp, rc);
-		rc = -EIO;
-	}
-	clp_free_block(rrb);
-	return rc;
+	return clp_req(rrb, CLP_LPS_PCI);
 }
 
 static int clp_store_query_pci_fn(struct zpci_dev *zdev,
@@ -174,32 +159,49 @@ static int clp_store_query_pci_fn(struct zpci_dev *zdev,
 	return 0;
 }
 
-static int clp_query_pci_fn(struct zpci_dev *zdev, u32 fh)
+int zdev_query_pci_fn(struct zpci_dev *zdev, struct clp_req_rsp_query_pci *rrb)
+{
+
+	memset(rrb, 0, sizeof(*rrb));
+	rrb->request.hdr.len = sizeof(rrb->request);
+	rrb->request.hdr.cmd = CLP_QUERY_PCI_FN;
+	rrb->response.hdr.len = sizeof(rrb->response);
+	rrb->request.fh = zdev->fh;
+
+	return clp_req(rrb, CLP_LPS_PCI);
+}
+
+static int clp_query_pci(struct zpci_dev *zdev)
 {
 	struct clp_req_rsp_query_pci *rrb;
+	struct clp_req_rsp_query_pci_grp *grrb;
 	int rc;
 
 	rrb = clp_alloc_block(GFP_KERNEL);
 	if (!rrb)
 		return -ENOMEM;
 
-	memset(rrb, 0, sizeof(*rrb));
-	rrb->request.hdr.len = sizeof(rrb->request);
-	rrb->request.hdr.cmd = CLP_QUERY_PCI_FN;
-	rrb->response.hdr.len = sizeof(rrb->response);
-	rrb->request.fh = fh;
-
-	rc = clp_req(rrb, CLP_LPS_PCI);
-	if (!rc && rrb->response.hdr.rsp == CLP_RC_OK) {
-		rc = clp_store_query_pci_fn(zdev, &rrb->response);
-		if (rc)
-			goto out;
-		rc = clp_query_pci_fngrp(zdev, rrb->response.pfgid);
-	} else {
+	rc = zdev_query_pci_fn(zdev, rrb);
+	if (rc || rrb->response.hdr.rsp != CLP_RC_OK) {
 		zpci_err("Q PCI FN:\n");
 		zpci_err_clp(rrb->response.hdr.rsp, rc);
 		rc = -EIO;
+		goto out;
 	}
+	rc = clp_store_query_pci_fn(zdev, &rrb->response);
+	if (rc)
+		goto out;
+
+	grrb = (struct clp_req_rsp_query_pci_grp *)rrb;
+	rc = zdev_query_pci_fngrp(zdev, grrb);
+	if (rc || grrb->response.hdr.rsp != CLP_RC_OK) {
+		zpci_err("Q PCI FGRP:\n");
+		zpci_err_clp(grrb->response.hdr.rsp, rc);
+		rc = -EIO;
+		goto out;
+	}
+	clp_store_query_pci_fngrp(zdev, &grrb->response);
+
 out:
 	clp_free_block(rrb);
 	return rc;
@@ -219,7 +221,7 @@ int clp_add_pci_device(u32 fid, u32 fh, int configured)
 	zdev->fid = fid;
 
 	/* Query function properties and update zdev */
-	rc = clp_query_pci_fn(zdev, fh);
+	rc = clp_query_pci(zdev);
 	if (rc)
 		goto error;
 
-- 
2.7.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 2/4] vfio: vfio_iommu_type1: Define VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-17 16:16 [PATCH v2 0/4] Retrieving zPCI specific info with VFIO Pierre Morel
  2019-05-17 16:16 ` [PATCH v2 1/4] s390: pci: Exporting access to CLP PCI function and PCI group Pierre Morel
@ 2019-05-17 16:16 ` Pierre Morel
  2019-05-17 16:16 ` [PATCH v2 3/4] s390: iommu: Adding get attributes for s390_iommu Pierre Morel
  2019-05-17 16:16 ` [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES Pierre Morel
  3 siblings, 0 replies; 18+ messages in thread
From: Pierre Morel @ 2019-05-17 16:16 UTC (permalink / raw)
  To: sebott
  Cc: gerald.schaefer, pasic, borntraeger, walling, linux-s390, iommu,
	joro, linux-kernel, alex.williamson, kvm, schwidefsky,
	heiko.carstens, robin.murphy

We add a capabilities functionality to VFIO_IOMMU_GET_INFO.

This will allow the VFIO_IOMMU_GET_INFO ioctl to retrieve IOMMU
specific information.

we define a new flag VFIO_IOMMU_INFO_CAPABILITIES in the
vfio_iommu_type1_info structure and two Z-PCI specific
capabilities:
VFIO_IOMMU_INFO_CAP_QFN: to query Z-PCI function information
VFIO_IOMMU_INFO_CAP_QGRP: to query for Z-PCI group information
and we define the associated information structures.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 include/uapi/linux/vfio.h | 67 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 8f10748..aed0e72 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -715,6 +715,73 @@ struct vfio_iommu_type1_info {
 	__u32	flags;
 #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
 	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
+#define VFIO_IOMMU_INFO_CAPABILITIES (1 << 1)  /* support capabilities info */
+	__u64   cap_offset;     /* Offset within info struct of first cap */
+};
+
+/*
+ * The VFIO IOMMU INFO  PCI function capability allows to retrieve
+ * Z-PCI function specific data needed by the VFIO user to provide
+ * them to the guest function's driver.
+ *
+ * The structures below define version 1 of this capability.
+ */
+#define VFIO_IOMMU_INFO_CAP_QFN		1
+
+struct vfio_iommu_pci_function {
+	__u32 ignored;
+	__u32 format;		/* Structure format */
+	__u64 reserved1;
+	__u16 vfn;		/* Virtual function number */
+	__u8 u;			/* utility string presence */
+	__u8 gid;		/* Function group */
+	__u32 fid;		/* Function identifier */
+	__u8 bar_size[6];	/* Bar size */
+	__u16 pchid;		/* Physical channel ID */
+	__u32 bar[6];		/* PCI Bar address */
+	__u64 reserved2;
+	__u64 sdma;		/* Start available DMA */
+	__u64 edma;		/* End available DMA */
+	__u32 reserved3[11];
+	__u32 uid;		/* User's identifier */
+	__u8 util_str[64];	/* Adapter specific utility string */
+};
+
+struct vfio_iommu_type1_info_pcifn {
+	struct vfio_info_cap_header header;
+	struct vfio_iommu_pci_function response;
+};
+
+/*
+ * The VFIO IOMMU INFO PCI function group capability allows to retrieve
+ * information, specific to a group of Z-PCI functions, needed by
+ * the VFIO user to provide them to the guest function's driver.
+ *
+ * The structures below define version 1 of this capability.
+ */
+#define VFIO_IOMMU_INFO_CAP_QGRP	2
+
+struct vfio_iommu_pci_function_group {
+	__u32 ignored;
+	__u32 format;		/* Structure format */
+	__u64 reserved1;
+	__u16 noi;		/* Maximum number of interruptions */
+	__u8 version;		/* Version */
+	__u8 flags;		/* Flags */
+#define VFIO_IOMMU_ZPCI_REFRESH	0x01
+#define VFIO_IOMMU_ZPCI_FRAME	0x02
+	__u16 maxstbl;		/* Maximum store-block length */
+	__u16 mui;		/* Measurement block update interval */
+	__u64 reserved3;
+	__u64 dasm;		/* DMA Address space mask */
+	__u64 msia;		/* MSI Address */
+	__u64 reserved4;
+	__u64 reserved5;
+};
+
+struct vfio_iommu_type1_info_pcifg {
+	struct vfio_info_cap_header header;
+	struct vfio_iommu_pci_function_group response;
 };
 
 #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
-- 
2.7.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] s390: iommu: Adding get attributes for s390_iommu
  2019-05-17 16:16 [PATCH v2 0/4] Retrieving zPCI specific info with VFIO Pierre Morel
  2019-05-17 16:16 ` [PATCH v2 1/4] s390: pci: Exporting access to CLP PCI function and PCI group Pierre Morel
  2019-05-17 16:16 ` [PATCH v2 2/4] vfio: vfio_iommu_type1: Define VFIO_IOMMU_INFO_CAPABILITIES Pierre Morel
@ 2019-05-17 16:16 ` Pierre Morel
  2019-05-17 16:16 ` [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES Pierre Morel
  3 siblings, 0 replies; 18+ messages in thread
From: Pierre Morel @ 2019-05-17 16:16 UTC (permalink / raw)
  To: sebott
  Cc: gerald.schaefer, pasic, borntraeger, walling, linux-s390, iommu,
	joro, linux-kernel, alex.williamson, kvm, schwidefsky,
	heiko.carstens, robin.murphy

We add the "get attributes" callback to the S390 iommu operations
to retrieve the S390 specific attributes through the call of zPCI
dedicated CLP functions.

The caller can use the following attributes and retrieve:
DOMAIN_ATTR_ZPCI_FN_SIZE: the size of the Z-PCI function attributes
DOMAIN_ATTR_ZPCI_GRP_SIZE: the size of the Z-PCI function group attributes
DOMAIN_ATTR_ZPCI_FN: the Z-PCI function attributes
DOMAIN_ATTR_ZPCI_GRP: the Z-PCI function group attributes

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 drivers/iommu/s390-iommu.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/iommu.h      |  4 +++
 2 files changed, 81 insertions(+)

diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 22d4db3..98082f0 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -363,6 +363,82 @@ void zpci_destroy_iommu(struct zpci_dev *zdev)
 	iommu_device_sysfs_remove(&zdev->iommu_dev);
 }
 
+struct zpci_dev *get_zpci(struct s390_domain *s390_domain)
+{
+	struct s390_domain_device *domain_device;
+
+	domain_device = list_first_entry(&s390_domain->devices,
+					 struct s390_domain_device, list);
+	if (!domain_device)
+		return NULL;
+	return domain_device->zdev;
+}
+
+static int s390_domain_get_fn(struct iommu_domain *domain, void *data)
+{
+	struct zpci_dev *zdev;
+	struct clp_req_rsp_query_pci *rrb;
+	int rc;
+
+	zdev = get_zpci(to_s390_domain(domain));
+	if (!zdev)
+		return -ENODEV;
+	rrb = (struct clp_req_rsp_query_pci *)
+	      __get_free_pages(GFP_KERNEL, get_order(CLP_BLK_SIZE));
+	if (!rrb)
+		return -ENOMEM;
+	rc = zdev_query_pci_fn(zdev, rrb);
+
+	if (!rc && rrb->response.hdr.rsp == CLP_RC_OK)
+		memcpy(data, &rrb->response, sizeof(struct clp_rsp_query_pci));
+	else
+		rc = -EIO;
+	free_pages((unsigned long) rrb, get_order(CLP_BLK_SIZE));
+	return rc;
+}
+
+static int s390_domain_get_grp(struct iommu_domain *domain, void *data)
+{
+	struct zpci_dev *zdev;
+	struct clp_req_rsp_query_pci_grp *rrb;
+	int rc;
+
+	zdev = get_zpci(to_s390_domain(domain));
+	if (!zdev)
+		return -ENODEV;
+	rrb = (struct clp_req_rsp_query_pci_grp *)
+	      __get_free_pages(GFP_KERNEL, get_order(CLP_BLK_SIZE));
+	if (!rrb)
+		return -ENOMEM;
+
+	rc = zdev_query_pci_fngrp(zdev, rrb);
+	if (!rc && rrb->response.hdr.rsp == CLP_RC_OK)
+		memcpy(data, &rrb->response,
+		       sizeof(struct clp_rsp_query_pci_grp));
+	else
+		rc = -EIO;
+
+	free_pages((unsigned long) rrb, get_order(CLP_BLK_SIZE));
+	return rc;
+}
+
+static int s390_domain_get_attr(struct iommu_domain *domain,
+				enum iommu_attr attr, void *data)
+{
+	switch (attr) {
+	case DOMAIN_ATTR_ZPCI_FN_SIZE:
+		return sizeof(struct clp_rsp_query_pci);
+	case DOMAIN_ATTR_ZPCI_GRP_SIZE:
+		return sizeof(struct clp_rsp_query_pci_grp);
+	case DOMAIN_ATTR_ZPCI_FN:
+		return s390_domain_get_fn(domain, data);
+	case DOMAIN_ATTR_ZPCI_GRP:
+		return s390_domain_get_grp(domain, data);
+	default:
+		return -ENODEV;
+	}
+}
+
 static const struct iommu_ops s390_iommu_ops = {
 	.capable = s390_iommu_capable,
 	.domain_alloc = s390_domain_alloc,
@@ -376,6 +452,7 @@ static const struct iommu_ops s390_iommu_ops = {
 	.remove_device = s390_iommu_remove_device,
 	.device_group = generic_device_group,
 	.pgsize_bitmap = S390_IOMMU_PGSIZES,
+	.domain_get_attr = s390_domain_get_attr,
 };
 
 static int __init s390_iommu_init(void)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ffbbc7e..ebdcac4 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -125,6 +125,10 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
 	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
+	DOMAIN_ATTR_ZPCI_FN_SIZE,
+	DOMAIN_ATTR_ZPCI_GRP_SIZE,
+	DOMAIN_ATTR_ZPCI_FN,
+	DOMAIN_ATTR_ZPCI_GRP,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
2.7.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-17 16:16 [PATCH v2 0/4] Retrieving zPCI specific info with VFIO Pierre Morel
                   ` (2 preceding siblings ...)
  2019-05-17 16:16 ` [PATCH v2 3/4] s390: iommu: Adding get attributes for s390_iommu Pierre Morel
@ 2019-05-17 16:16 ` Pierre Morel
  2019-05-17 16:41   ` Alex Williamson
                     ` (2 more replies)
  3 siblings, 3 replies; 18+ messages in thread
From: Pierre Morel @ 2019-05-17 16:16 UTC (permalink / raw)
  To: sebott
  Cc: gerald.schaefer, pasic, borntraeger, walling, linux-s390, iommu,
	joro, linux-kernel, alex.williamson, kvm, schwidefsky,
	heiko.carstens, robin.murphy

We implement the capability interface for VFIO_IOMMU_GET_INFO.

When calling the ioctl, the user must specify
VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
must check in the answer if capabilities are supported.

The iommu get_attr callback will be used to retrieve the specific
attributes and fill the capabilities.

Currently two Z-PCI specific capabilities will be queried and
filled by the underlying Z specific s390_iommu:
VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
and
VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.

Other architectures may add new capabilities in the same way
after enhancing the architecture specific IOMMU driver.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 drivers/vfio/vfio_iommu_type1.c | 122 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 121 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d0f731c..9435647 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1658,6 +1658,97 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
 	return ret;
 }
 
+static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
+				    struct vfio_info_cap *caps, size_t size)
+{
+	struct vfio_iommu_type1_info_pcifn *info_fn;
+	int ret;
+
+	info_fn = kzalloc(size, GFP_KERNEL);
+	if (!info_fn)
+		return -ENOMEM;
+
+	ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
+				    &info_fn->response);
+	if (ret < 0)
+		goto free_fn;
+
+	info_fn->header.id = VFIO_IOMMU_INFO_CAP_QFN;
+	ret = vfio_info_add_capability(caps, &info_fn->header, size);
+
+free_fn:
+	kfree(info_fn);
+	return ret;
+}
+
+static int vfio_iommu_type1_zpci_grp(struct iommu_domain *domain,
+				     struct vfio_info_cap *caps,
+				     size_t grp_size)
+{
+	struct vfio_iommu_type1_info_pcifg *info_grp;
+	int ret;
+
+	info_grp = kzalloc(grp_size, GFP_KERNEL);
+	if (!info_grp)
+		return -ENOMEM;
+
+	ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_GRP,
+				    (void *) &info_grp->response);
+	if (ret < 0)
+		goto free_grp;
+	info_grp->header.id = VFIO_IOMMU_INFO_CAP_QGRP;
+	ret = vfio_info_add_capability(caps, &info_grp->header, grp_size);
+
+free_grp:
+	kfree(info_grp);
+	return ret;
+}
+
+int vfio_iommu_type1_caps(struct vfio_iommu *iommu, struct vfio_info_cap *caps,
+			  size_t size)
+{
+	struct vfio_domain *d;
+	unsigned long total_size, fn_size, grp_size;
+	int ret;
+
+	d = list_first_entry(&iommu->domain_list, struct vfio_domain, next);
+	if (!d)
+		return -ENODEV;
+
+	/* First compute the size the user must provide */
+	total_size = 0;
+	fn_size = iommu_domain_get_attr(d->domain,
+					DOMAIN_ATTR_ZPCI_FN_SIZE, NULL);
+	if (fn_size > 0) {
+		fn_size +=  sizeof(struct vfio_info_cap_header);
+		total_size += fn_size;
+	}
+
+	grp_size = iommu_domain_get_attr(d->domain,
+					 DOMAIN_ATTR_ZPCI_GRP_SIZE, NULL);
+	if (grp_size > 0) {
+		grp_size +=  sizeof(struct vfio_info_cap_header);
+		total_size += grp_size;
+	}
+
+	if (total_size > size) {
+		/* Tell caller to call us with a greater buffer */
+		caps->size = total_size;
+		return 0;
+	}
+
+	if (fn_size) {
+		ret = vfio_iommu_type1_zpci_fn(d->domain, caps, fn_size);
+		if (ret)
+			return ret;
+	}
+
+	if (grp_size)
+		ret = vfio_iommu_type1_zpci_grp(d->domain, caps, grp_size);
+
+	return ret;
+}
+
 static long vfio_iommu_type1_ioctl(void *iommu_data,
 				   unsigned int cmd, unsigned long arg)
 {
@@ -1679,6 +1770,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 		}
 	} else if (cmd == VFIO_IOMMU_GET_INFO) {
 		struct vfio_iommu_type1_info info;
+		struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
+		int ret;
 
 		minsz = offsetofend(struct vfio_iommu_type1_info, iova_pgsizes);
 
@@ -1688,7 +1781,34 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		info.flags = VFIO_IOMMU_INFO_PGSIZES;
+		if (info.flags & VFIO_IOMMU_INFO_CAPABILITIES) {
+			minsz = offsetofend(struct vfio_iommu_type1_info,
+					    cap_offset);
+			if (info.argsz < minsz)
+				return -EINVAL;
+			ret = vfio_iommu_type1_caps(iommu, &caps,
+						    info.argsz - sizeof(info));
+			if (ret)
+				return ret;
+		}
+		if (caps.size) {
+			if (info.argsz < sizeof(info) + caps.size) {
+				info.argsz = sizeof(info) + caps.size;
+				info.cap_offset = 0;
+			} else {
+				if (copy_to_user((void __user *)arg +
+						 sizeof(info), caps.buf,
+						 caps.size)) {
+					kfree(caps.buf);
+					return -EFAULT;
+				}
+
+				info.cap_offset = sizeof(info);
+			}
+			kfree(caps.buf);
+		}
+
+		info.flags |= VFIO_IOMMU_INFO_PGSIZES;
 
 		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
 
-- 
2.7.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-17 16:16 ` [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES Pierre Morel
@ 2019-05-17 16:41   ` Alex Williamson
  2019-05-17 18:04     ` Pierre Morel
  2019-05-20  3:02   ` kbuild test robot
  2019-05-20  3:02   ` [RFC PATCH] vfio: vfio_iommu_type1: vfio_iommu_type1_caps() can be static kbuild test robot
  2 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2019-05-17 16:41 UTC (permalink / raw)
  To: Pierre Morel
  Cc: sebott, gerald.schaefer, pasic, borntraeger, walling, linux-s390,
	iommu, joro, linux-kernel, kvm, schwidefsky, heiko.carstens,
	robin.murphy

On Fri, 17 May 2019 18:16:50 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:

> We implement the capability interface for VFIO_IOMMU_GET_INFO.
> 
> When calling the ioctl, the user must specify
> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
> must check in the answer if capabilities are supported.
> 
> The iommu get_attr callback will be used to retrieve the specific
> attributes and fill the capabilities.
> 
> Currently two Z-PCI specific capabilities will be queried and
> filled by the underlying Z specific s390_iommu:
> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
> and
> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
> 
> Other architectures may add new capabilities in the same way
> after enhancing the architecture specific IOMMU driver.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 122 +++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 121 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index d0f731c..9435647 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1658,6 +1658,97 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>  	return ret;
>  }
>  
> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
> +				    struct vfio_info_cap *caps, size_t size)
> +{
> +	struct vfio_iommu_type1_info_pcifn *info_fn;
> +	int ret;
> +
> +	info_fn = kzalloc(size, GFP_KERNEL);
> +	if (!info_fn)
> +		return -ENOMEM;
> +
> +	ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
> +				    &info_fn->response);

What ensures that the 'struct clp_rsp_query_pci' returned from this
get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
Why does the latter contains so many reserved fields (beyond simply
alignment) for a user API?  What fields of these structures are
actually useful to userspace?  Should any fields not be exposed to the
user?  Aren't BAR sizes redundant to what's available through the vfio
PCI API?  I'm afraid that simply redefining an internal structure as
the API leaves a lot to be desired too.  Thanks,

Alex

> +	if (ret < 0)
> +		goto free_fn;
> +
> +	info_fn->header.id = VFIO_IOMMU_INFO_CAP_QFN;
> +	ret = vfio_info_add_capability(caps, &info_fn->header, size);
> +
> +free_fn:
> +	kfree(info_fn);
> +	return ret;
> +}
> +
> +static int vfio_iommu_type1_zpci_grp(struct iommu_domain *domain,
> +				     struct vfio_info_cap *caps,
> +				     size_t grp_size)
> +{
> +	struct vfio_iommu_type1_info_pcifg *info_grp;
> +	int ret;
> +
> +	info_grp = kzalloc(grp_size, GFP_KERNEL);
> +	if (!info_grp)
> +		return -ENOMEM;
> +
> +	ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_GRP,
> +				    (void *) &info_grp->response);
> +	if (ret < 0)
> +		goto free_grp;
> +	info_grp->header.id = VFIO_IOMMU_INFO_CAP_QGRP;
> +	ret = vfio_info_add_capability(caps, &info_grp->header, grp_size);
> +
> +free_grp:
> +	kfree(info_grp);
> +	return ret;
> +}
> +
> +int vfio_iommu_type1_caps(struct vfio_iommu *iommu, struct vfio_info_cap *caps,
> +			  size_t size)
> +{
> +	struct vfio_domain *d;
> +	unsigned long total_size, fn_size, grp_size;
> +	int ret;
> +
> +	d = list_first_entry(&iommu->domain_list, struct vfio_domain, next);
> +	if (!d)
> +		return -ENODEV;
> +
> +	/* First compute the size the user must provide */
> +	total_size = 0;
> +	fn_size = iommu_domain_get_attr(d->domain,
> +					DOMAIN_ATTR_ZPCI_FN_SIZE, NULL);
> +	if (fn_size > 0) {
> +		fn_size +=  sizeof(struct vfio_info_cap_header);
> +		total_size += fn_size;
> +	}
> +
> +	grp_size = iommu_domain_get_attr(d->domain,
> +					 DOMAIN_ATTR_ZPCI_GRP_SIZE, NULL);
> +	if (grp_size > 0) {
> +		grp_size +=  sizeof(struct vfio_info_cap_header);
> +		total_size += grp_size;
> +	}
> +
> +	if (total_size > size) {
> +		/* Tell caller to call us with a greater buffer */
> +		caps->size = total_size;
> +		return 0;
> +	}
> +
> +	if (fn_size) {
> +		ret = vfio_iommu_type1_zpci_fn(d->domain, caps, fn_size);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	if (grp_size)
> +		ret = vfio_iommu_type1_zpci_grp(d->domain, caps, grp_size);
> +
> +	return ret;
> +}
> +
>  static long vfio_iommu_type1_ioctl(void *iommu_data,
>  				   unsigned int cmd, unsigned long arg)
>  {
> @@ -1679,6 +1770,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  		}
>  	} else if (cmd == VFIO_IOMMU_GET_INFO) {
>  		struct vfio_iommu_type1_info info;
> +		struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
> +		int ret;
>  
>  		minsz = offsetofend(struct vfio_iommu_type1_info, iova_pgsizes);
>  
> @@ -1688,7 +1781,34 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  		if (info.argsz < minsz)
>  			return -EINVAL;
>  
> -		info.flags = VFIO_IOMMU_INFO_PGSIZES;
> +		if (info.flags & VFIO_IOMMU_INFO_CAPABILITIES) {
> +			minsz = offsetofend(struct vfio_iommu_type1_info,
> +					    cap_offset);
> +			if (info.argsz < minsz)
> +				return -EINVAL;
> +			ret = vfio_iommu_type1_caps(iommu, &caps,
> +						    info.argsz - sizeof(info));
> +			if (ret)
> +				return ret;
> +		}
> +		if (caps.size) {
> +			if (info.argsz < sizeof(info) + caps.size) {
> +				info.argsz = sizeof(info) + caps.size;
> +				info.cap_offset = 0;
> +			} else {
> +				if (copy_to_user((void __user *)arg +
> +						 sizeof(info), caps.buf,
> +						 caps.size)) {
> +					kfree(caps.buf);
> +					return -EFAULT;
> +				}
> +
> +				info.cap_offset = sizeof(info);
> +			}
> +			kfree(caps.buf);
> +		}
> +
> +		info.flags |= VFIO_IOMMU_INFO_PGSIZES;
>  
>  		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
>  


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-17 16:41   ` Alex Williamson
@ 2019-05-17 18:04     ` Pierre Morel
  2019-05-20 11:19       ` Pierre Morel
  0 siblings, 1 reply; 18+ messages in thread
From: Pierre Morel @ 2019-05-17 18:04 UTC (permalink / raw)
  To: Alex Williamson
  Cc: sebott, gerald.schaefer, pasic, borntraeger, walling, linux-s390,
	iommu, joro, linux-kernel, kvm, schwidefsky, heiko.carstens,
	robin.murphy

On 17/05/2019 18:41, Alex Williamson wrote:
> On Fri, 17 May 2019 18:16:50 +0200
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> We implement the capability interface for VFIO_IOMMU_GET_INFO.
>>
>> When calling the ioctl, the user must specify
>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
>> must check in the answer if capabilities are supported.
>>
>> The iommu get_attr callback will be used to retrieve the specific
>> attributes and fill the capabilities.
>>
>> Currently two Z-PCI specific capabilities will be queried and
>> filled by the underlying Z specific s390_iommu:
>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
>> and
>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
>>
>> Other architectures may add new capabilities in the same way
>> after enhancing the architecture specific IOMMU driver.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   drivers/vfio/vfio_iommu_type1.c | 122 +++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 121 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index d0f731c..9435647 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -1658,6 +1658,97 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>>   	return ret;
>>   }
>>   
>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
>> +				    struct vfio_info_cap *caps, size_t size)
>> +{
>> +	struct vfio_iommu_type1_info_pcifn *info_fn;
>> +	int ret;
>> +
>> +	info_fn = kzalloc(size, GFP_KERNEL);
>> +	if (!info_fn)
>> +		return -ENOMEM;
>> +
>> +	ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
>> +				    &info_fn->response);
> 
> What ensures that the 'struct clp_rsp_query_pci' returned from this
> get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
> Why does the latter contains so many reserved fields (beyond simply
> alignment) for a user API?  What fields of these structures are
> actually useful to userspace?  Should any fields not be exposed to the
> user?  Aren't BAR sizes redundant to what's available through the vfio
> PCI API?  I'm afraid that simply redefining an internal structure as
> the API leaves a lot to be desired too.  Thanks,
> 
> Alex
> 
Hi Alex,

I simply used the structure returned by the firmware to be sure to be 
consistent with future evolutions and facilitate the copy from CLP and 
to userland.

If you prefer, and I understand that this is the case, I can define a 
specific VFIO_IOMMU structure with only the fields relevant to the user, 
leaving future enhancement of the user's interface being implemented in 
another kernel patch when the time has come.

In fact, the struct will have all defined fields I used but not the BAR 
size and address (at least for now because there are special cases we do 
not support yet with bars).
All the reserved fields can go away.

Is it more conform to your idea?

Also I have 2 interfaces:

s390_iommu.get_attr <-I1-> VFIO_IOMMU <-I2-> userland

Do you prefer:
- 2 different structures, no CLP raw structure
- the CLP raw structure for I1 and a VFIO specific structure for I2
- the same VFIO structure for both I1 and I2

Thank you if you could give me a direction for this.

Thanks for the comments, and thanks a lot to have answered so quickly.

Pierre







-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-17 16:16 ` [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES Pierre Morel
  2019-05-17 16:41   ` Alex Williamson
@ 2019-05-20  3:02   ` kbuild test robot
  2019-05-20  3:02   ` [RFC PATCH] vfio: vfio_iommu_type1: vfio_iommu_type1_caps() can be static kbuild test robot
  2 siblings, 0 replies; 18+ messages in thread
From: kbuild test robot @ 2019-05-20  3:02 UTC (permalink / raw)
  To: Pierre Morel
  Cc: kbuild-all, sebott, gerald.schaefer, pasic, borntraeger, walling,
	linux-s390, iommu, joro, linux-kernel, alex.williamson, kvm,
	schwidefsky, heiko.carstens, robin.murphy

Hi Pierre,

I love your patch! Perhaps something to improve:

[auto build test WARNING on s390/features]
[also build test WARNING on v5.1 next-20190517]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Pierre-Morel/s390-pci-Exporting-access-to-CLP-PCI-function-and-PCI-group/20190520-025155
base:   https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)

>> drivers/vfio/vfio_iommu_type1.c:1707:5: sparse: sparse: symbol 'vfio_iommu_type1_caps' was not declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [RFC PATCH] vfio: vfio_iommu_type1: vfio_iommu_type1_caps() can be static
  2019-05-17 16:16 ` [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES Pierre Morel
  2019-05-17 16:41   ` Alex Williamson
  2019-05-20  3:02   ` kbuild test robot
@ 2019-05-20  3:02   ` kbuild test robot
  2 siblings, 0 replies; 18+ messages in thread
From: kbuild test robot @ 2019-05-20  3:02 UTC (permalink / raw)
  To: Pierre Morel
  Cc: kbuild-all, sebott, gerald.schaefer, pasic, borntraeger, walling,
	linux-s390, iommu, joro, linux-kernel, alex.williamson, kvm,
	schwidefsky, heiko.carstens, robin.murphy


Fixes: f10b2b74bbea ("vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES")
Signed-off-by: kbuild test robot <lkp@intel.com>
---
 vfio_iommu_type1.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 9435647..46a4939 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1704,8 +1704,8 @@ static int vfio_iommu_type1_zpci_grp(struct iommu_domain *domain,
 	return ret;
 }
 
-int vfio_iommu_type1_caps(struct vfio_iommu *iommu, struct vfio_info_cap *caps,
-			  size_t size)
+static int vfio_iommu_type1_caps(struct vfio_iommu *iommu, struct vfio_info_cap *caps,
+				 size_t size)
 {
 	struct vfio_domain *d;
 	unsigned long total_size, fn_size, grp_size;

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-17 18:04     ` Pierre Morel
@ 2019-05-20 11:19       ` Pierre Morel
  2019-05-20 14:27         ` Cornelia Huck
  0 siblings, 1 reply; 18+ messages in thread
From: Pierre Morel @ 2019-05-20 11:19 UTC (permalink / raw)
  To: Alex Williamson
  Cc: sebott, gerald.schaefer, pasic, borntraeger, walling, linux-s390,
	iommu, joro, linux-kernel, kvm, schwidefsky, heiko.carstens,
	robin.murphy

On 17/05/2019 20:04, Pierre Morel wrote:
> On 17/05/2019 18:41, Alex Williamson wrote:
>> On Fri, 17 May 2019 18:16:50 +0200
>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>
>>> We implement the capability interface for VFIO_IOMMU_GET_INFO.
>>>
>>> When calling the ioctl, the user must specify
>>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
>>> must check in the answer if capabilities are supported.
>>>
>>> The iommu get_attr callback will be used to retrieve the specific
>>> attributes and fill the capabilities.
>>>
>>> Currently two Z-PCI specific capabilities will be queried and
>>> filled by the underlying Z specific s390_iommu:
>>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
>>> and
>>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
>>>
>>> Other architectures may add new capabilities in the same way
>>> after enhancing the architecture specific IOMMU driver.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>> ---
>>>   drivers/vfio/vfio_iommu_type1.c | 122 
>>> +++++++++++++++++++++++++++++++++++++++-
>>>   1 file changed, 121 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/vfio/vfio_iommu_type1.c 
>>> b/drivers/vfio/vfio_iommu_type1.c
>>> index d0f731c..9435647 100644
>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>> @@ -1658,6 +1658,97 @@ static int 
>>> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>>>       return ret;
>>>   }
>>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
>>> +                    struct vfio_info_cap *caps, size_t size)
>>> +{
>>> +    struct vfio_iommu_type1_info_pcifn *info_fn;
>>> +    int ret;
>>> +
>>> +    info_fn = kzalloc(size, GFP_KERNEL);
>>> +    if (!info_fn)
>>> +        return -ENOMEM;
>>> +
>>> +    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
>>> +                    &info_fn->response);
>>
>> What ensures that the 'struct clp_rsp_query_pci' returned from this
>> get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
>> Why does the latter contains so many reserved fields (beyond simply
>> alignment) for a user API?  What fields of these structures are
>> actually useful to userspace?  Should any fields not be exposed to the
>> user?  Aren't BAR sizes redundant to what's available through the vfio
>> PCI API?  I'm afraid that simply redefining an internal structure as
>> the API leaves a lot to be desired too.  Thanks,
>>
>> Alex
>>
> Hi Alex,
> 
> I simply used the structure returned by the firmware to be sure to be 
> consistent with future evolutions and facilitate the copy from CLP and 
> to userland.
> 
> If you prefer, and I understand that this is the case, I can define a 
> specific VFIO_IOMMU structure with only the fields relevant to the user, 
> leaving future enhancement of the user's interface being implemented in 
> another kernel patch when the time has come.
> 
> In fact, the struct will have all defined fields I used but not the BAR 
> size and address (at least for now because there are special cases we do 
> not support yet with bars).
> All the reserved fields can go away.
> 
> Is it more conform to your idea?
> 
> Also I have 2 interfaces:
> 
> s390_iommu.get_attr <-I1-> VFIO_IOMMU <-I2-> userland
> 
> Do you prefer:
> - 2 different structures, no CLP raw structure
> - the CLP raw structure for I1 and a VFIO specific structure for I2

Hi Alex,

I am back again on this.
This solution here above seems to me the best one but in this way I must 
include S390 specific include inside the iommu_type1, which is AFAIU not 
a good thing.
It seems that the powerpc architecture use a solution with a dedicated 
VFIO_IOMMU, the vfio_iommu_spar_tce.

Wouldn't it be a solution for s390 too, to use the vfio_iommu_type1 as a 
basis to have a s390 dedicated solution.
Then it becomes easier to have on one side the s390_iommu interface, 
S390 specific, and on the other side a VFIO interface without a blind 
copy of the firmware values.

Do you think it is a viable solution?

Thanks,
Pierre



> - the same VFIO structure for both I1 and I2
> 
> Thank you if you could give me a direction for this.
> 
> Thanks for the comments, and thanks a lot to have answered so quickly.
> 
> Pierre
> 
> 
> 
> 
> 
> 
> 


-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-20 11:19       ` Pierre Morel
@ 2019-05-20 14:27         ` Cornelia Huck
  2019-05-20 16:31           ` Pierre Morel
  0 siblings, 1 reply; 18+ messages in thread
From: Cornelia Huck @ 2019-05-20 14:27 UTC (permalink / raw)
  To: Pierre Morel
  Cc: Alex Williamson, sebott, gerald.schaefer, pasic, borntraeger,
	walling, linux-s390, iommu, joro, linux-kernel, kvm, schwidefsky,
	heiko.carstens, robin.murphy

On Mon, 20 May 2019 13:19:23 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 17/05/2019 20:04, Pierre Morel wrote:
> > On 17/05/2019 18:41, Alex Williamson wrote:  
> >> On Fri, 17 May 2019 18:16:50 +0200
> >> Pierre Morel <pmorel@linux.ibm.com> wrote:
> >>  
> >>> We implement the capability interface for VFIO_IOMMU_GET_INFO.
> >>>
> >>> When calling the ioctl, the user must specify
> >>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
> >>> must check in the answer if capabilities are supported.
> >>>
> >>> The iommu get_attr callback will be used to retrieve the specific
> >>> attributes and fill the capabilities.
> >>>
> >>> Currently two Z-PCI specific capabilities will be queried and
> >>> filled by the underlying Z specific s390_iommu:
> >>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
> >>> and
> >>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
> >>>
> >>> Other architectures may add new capabilities in the same way
> >>> after enhancing the architecture specific IOMMU driver.
> >>>
> >>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> >>> ---
> >>>   drivers/vfio/vfio_iommu_type1.c | 122 
> >>> +++++++++++++++++++++++++++++++++++++++-
> >>>   1 file changed, 121 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/vfio/vfio_iommu_type1.c 
> >>> b/drivers/vfio/vfio_iommu_type1.c
> >>> index d0f731c..9435647 100644
> >>> --- a/drivers/vfio/vfio_iommu_type1.c
> >>> +++ b/drivers/vfio/vfio_iommu_type1.c
> >>> @@ -1658,6 +1658,97 @@ static int 
> >>> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
> >>>       return ret;
> >>>   }
> >>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
> >>> +                    struct vfio_info_cap *caps, size_t size)
> >>> +{
> >>> +    struct vfio_iommu_type1_info_pcifn *info_fn;
> >>> +    int ret;
> >>> +
> >>> +    info_fn = kzalloc(size, GFP_KERNEL);
> >>> +    if (!info_fn)
> >>> +        return -ENOMEM;
> >>> +
> >>> +    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
> >>> +                    &info_fn->response);  
> >>
> >> What ensures that the 'struct clp_rsp_query_pci' returned from this
> >> get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
> >> Why does the latter contains so many reserved fields (beyond simply
> >> alignment) for a user API?  What fields of these structures are
> >> actually useful to userspace?  Should any fields not be exposed to the
> >> user?  Aren't BAR sizes redundant to what's available through the vfio
> >> PCI API?  I'm afraid that simply redefining an internal structure as
> >> the API leaves a lot to be desired too.  Thanks,
> >>
> >> Alex
> >>  
> > Hi Alex,
> > 
> > I simply used the structure returned by the firmware to be sure to be 
> > consistent with future evolutions and facilitate the copy from CLP and 
> > to userland.
> > 
> > If you prefer, and I understand that this is the case, I can define a 
> > specific VFIO_IOMMU structure with only the fields relevant to the user, 
> > leaving future enhancement of the user's interface being implemented in 
> > another kernel patch when the time has come.
> > 
> > In fact, the struct will have all defined fields I used but not the BAR 
> > size and address (at least for now because there are special cases we do 
> > not support yet with bars).
> > All the reserved fields can go away.
> > 
> > Is it more conform to your idea?
> > 
> > Also I have 2 interfaces:
> > 
> > s390_iommu.get_attr <-I1-> VFIO_IOMMU <-I2-> userland
> > 
> > Do you prefer:
> > - 2 different structures, no CLP raw structure
> > - the CLP raw structure for I1 and a VFIO specific structure for I2  

<entering from the sideline>

IIUC, get_attr extracts various data points via clp, and we then make
it available to userspace. The clp interface needs to be abstracted
away at some point... one question from me: Is there a chance that
someone else may want to make use of the userspace interface (extra
information about a function)? If yes, I'd expect the get_attr to
obtain some kind of portable information already (basically your third
option, below).

> 
> Hi Alex,
> 
> I am back again on this.
> This solution here above seems to me the best one but in this way I must 
> include S390 specific include inside the iommu_type1, which is AFAIU not 
> a good thing.
> It seems that the powerpc architecture use a solution with a dedicated 
> VFIO_IOMMU, the vfio_iommu_spar_tce.
> 
> Wouldn't it be a solution for s390 too, to use the vfio_iommu_type1 as a 
> basis to have a s390 dedicated solution.
> Then it becomes easier to have on one side the s390_iommu interface, 
> S390 specific, and on the other side a VFIO interface without a blind 
> copy of the firmware values.

If nobody else would want this exact interface, it might be a solution.
It would still be better not to encode clp data explicitly in the
userspace interface.

> 
> Do you think it is a viable solution?
> 
> Thanks,
> Pierre
> 
> 
> 
> > - the same VFIO structure for both I1 and I2

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-20 14:27         ` Cornelia Huck
@ 2019-05-20 16:31           ` Pierre Morel
  2019-05-20 18:23             ` Alex Williamson
  0 siblings, 1 reply; 18+ messages in thread
From: Pierre Morel @ 2019-05-20 16:31 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Alex Williamson, sebott, gerald.schaefer, pasic, borntraeger,
	walling, linux-s390, iommu, joro, linux-kernel, kvm, schwidefsky,
	heiko.carstens, robin.murphy

On 20/05/2019 16:27, Cornelia Huck wrote:
> On Mon, 20 May 2019 13:19:23 +0200
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> On 17/05/2019 20:04, Pierre Morel wrote:
>>> On 17/05/2019 18:41, Alex Williamson wrote:
>>>> On Fri, 17 May 2019 18:16:50 +0200
>>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>>   
>>>>> We implement the capability interface for VFIO_IOMMU_GET_INFO.
>>>>>
>>>>> When calling the ioctl, the user must specify
>>>>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
>>>>> must check in the answer if capabilities are supported.
>>>>>
>>>>> The iommu get_attr callback will be used to retrieve the specific
>>>>> attributes and fill the capabilities.
>>>>>
>>>>> Currently two Z-PCI specific capabilities will be queried and
>>>>> filled by the underlying Z specific s390_iommu:
>>>>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
>>>>> and
>>>>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
>>>>>
>>>>> Other architectures may add new capabilities in the same way
>>>>> after enhancing the architecture specific IOMMU driver.
>>>>>
>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>> ---
>>>>>    drivers/vfio/vfio_iommu_type1.c | 122
>>>>> +++++++++++++++++++++++++++++++++++++++-
>>>>>    1 file changed, 121 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c
>>>>> b/drivers/vfio/vfio_iommu_type1.c
>>>>> index d0f731c..9435647 100644
>>>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>>>> @@ -1658,6 +1658,97 @@ static int
>>>>> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>>>>>        return ret;
>>>>>    }
>>>>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
>>>>> +                    struct vfio_info_cap *caps, size_t size)
>>>>> +{
>>>>> +    struct vfio_iommu_type1_info_pcifn *info_fn;
>>>>> +    int ret;
>>>>> +
>>>>> +    info_fn = kzalloc(size, GFP_KERNEL);
>>>>> +    if (!info_fn)
>>>>> +        return -ENOMEM;
>>>>> +
>>>>> +    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
>>>>> +                    &info_fn->response);
>>>>
>>>> What ensures that the 'struct clp_rsp_query_pci' returned from this
>>>> get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
>>>> Why does the latter contains so many reserved fields (beyond simply
>>>> alignment) for a user API?  What fields of these structures are
>>>> actually useful to userspace?  Should any fields not be exposed to the
>>>> user?  Aren't BAR sizes redundant to what's available through the vfio
>>>> PCI API?  I'm afraid that simply redefining an internal structure as
>>>> the API leaves a lot to be desired too.  Thanks,
>>>>
>>>> Alex
>>>>   
>>> Hi Alex,
>>>
>>> I simply used the structure returned by the firmware to be sure to be
>>> consistent with future evolutions and facilitate the copy from CLP and
>>> to userland.
>>>
>>> If you prefer, and I understand that this is the case, I can define a
>>> specific VFIO_IOMMU structure with only the fields relevant to the user,
>>> leaving future enhancement of the user's interface being implemented in
>>> another kernel patch when the time has come.
>>>
>>> In fact, the struct will have all defined fields I used but not the BAR
>>> size and address (at least for now because there are special cases we do
>>> not support yet with bars).
>>> All the reserved fields can go away.
>>>
>>> Is it more conform to your idea?
>>>
>>> Also I have 2 interfaces:
>>>
>>> s390_iommu.get_attr <-I1-> VFIO_IOMMU <-I2-> userland
>>>
>>> Do you prefer:
>>> - 2 different structures, no CLP raw structure
>>> - the CLP raw structure for I1 and a VFIO specific structure for I2
> 
> <entering from the sideline>
> 
> IIUC, get_attr extracts various data points via clp, and we then make
> it available to userspace. The clp interface needs to be abstracted
> away at some point... one question from me: Is there a chance that
> someone else may want to make use of the userspace interface (extra
> information about a function)? If yes, I'd expect the get_attr to
> obtain some kind of portable information already (basically your third
> option, below).

Yes, seems the most reasonable.
In this case I need to share the structure definition between:
userspace through vfio.h
vfio_iommu (this is obvious)
s390_iommu

It is this third include which made me doubt.
But when you re formulate it it looks the more reasonable because there 
are much less changes.

Thanks for the help, I start this way, still wait one day or two to see 
if any comment against this solution comes and send the update.

Thanks,
Pierre

> 
>>
>> Hi Alex,
>>
>> I am back again on this.
>> This solution here above seems to me the best one but in this way I must
>> include S390 specific include inside the iommu_type1, which is AFAIU not
>> a good thing.
>> It seems that the powerpc architecture use a solution with a dedicated
>> VFIO_IOMMU, the vfio_iommu_spar_tce.
>>
>> Wouldn't it be a solution for s390 too, to use the vfio_iommu_type1 as a
>> basis to have a s390 dedicated solution.
>> Then it becomes easier to have on one side the s390_iommu interface,
>> S390 specific, and on the other side a VFIO interface without a blind
>> copy of the firmware values.
> 
> If nobody else would want this exact interface, it might be a solution.
> It would still be better not to encode clp data explicitly in the
> userspace interface.
> 
>>
>> Do you think it is a viable solution?
>>
>> Thanks,
>> Pierre
>>
>>
>>
>>> - the same VFIO structure for both I1 and I2
> 


-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-20 16:31           ` Pierre Morel
@ 2019-05-20 18:23             ` Alex Williamson
  2019-05-21  9:14               ` Pierre Morel
  0 siblings, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2019-05-20 18:23 UTC (permalink / raw)
  To: Pierre Morel
  Cc: Cornelia Huck, sebott, gerald.schaefer, pasic, borntraeger,
	walling, linux-s390, iommu, joro, linux-kernel, kvm, schwidefsky,
	heiko.carstens, robin.murphy

On Mon, 20 May 2019 18:31:08 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 20/05/2019 16:27, Cornelia Huck wrote:
> > On Mon, 20 May 2019 13:19:23 +0200
> > Pierre Morel <pmorel@linux.ibm.com> wrote:
> >   
> >> On 17/05/2019 20:04, Pierre Morel wrote:  
> >>> On 17/05/2019 18:41, Alex Williamson wrote:  
> >>>> On Fri, 17 May 2019 18:16:50 +0200
> >>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
> >>>>     
> >>>>> We implement the capability interface for VFIO_IOMMU_GET_INFO.
> >>>>>
> >>>>> When calling the ioctl, the user must specify
> >>>>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
> >>>>> must check in the answer if capabilities are supported.
> >>>>>
> >>>>> The iommu get_attr callback will be used to retrieve the specific
> >>>>> attributes and fill the capabilities.
> >>>>>
> >>>>> Currently two Z-PCI specific capabilities will be queried and
> >>>>> filled by the underlying Z specific s390_iommu:
> >>>>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
> >>>>> and
> >>>>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
> >>>>>
> >>>>> Other architectures may add new capabilities in the same way
> >>>>> after enhancing the architecture specific IOMMU driver.
> >>>>>
> >>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> >>>>> ---
> >>>>>    drivers/vfio/vfio_iommu_type1.c | 122
> >>>>> +++++++++++++++++++++++++++++++++++++++-
> >>>>>    1 file changed, 121 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c
> >>>>> b/drivers/vfio/vfio_iommu_type1.c
> >>>>> index d0f731c..9435647 100644
> >>>>> --- a/drivers/vfio/vfio_iommu_type1.c
> >>>>> +++ b/drivers/vfio/vfio_iommu_type1.c
> >>>>> @@ -1658,6 +1658,97 @@ static int
> >>>>> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
> >>>>>        return ret;
> >>>>>    }
> >>>>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
> >>>>> +                    struct vfio_info_cap *caps, size_t size)
> >>>>> +{
> >>>>> +    struct vfio_iommu_type1_info_pcifn *info_fn;
> >>>>> +    int ret;
> >>>>> +
> >>>>> +    info_fn = kzalloc(size, GFP_KERNEL);
> >>>>> +    if (!info_fn)
> >>>>> +        return -ENOMEM;
> >>>>> +
> >>>>> +    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
> >>>>> +                    &info_fn->response);  
> >>>>
> >>>> What ensures that the 'struct clp_rsp_query_pci' returned from this
> >>>> get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
> >>>> Why does the latter contains so many reserved fields (beyond simply
> >>>> alignment) for a user API?  What fields of these structures are
> >>>> actually useful to userspace?  Should any fields not be exposed to the
> >>>> user?  Aren't BAR sizes redundant to what's available through the vfio
> >>>> PCI API?  I'm afraid that simply redefining an internal structure as
> >>>> the API leaves a lot to be desired too.  Thanks,
> >>>>
> >>>> Alex
> >>>>     
> >>> Hi Alex,
> >>>
> >>> I simply used the structure returned by the firmware to be sure to be
> >>> consistent with future evolutions and facilitate the copy from CLP and
> >>> to userland.
> >>>
> >>> If you prefer, and I understand that this is the case, I can define a
> >>> specific VFIO_IOMMU structure with only the fields relevant to the user,
> >>> leaving future enhancement of the user's interface being implemented in
> >>> another kernel patch when the time has come.

TBH, I had no idea that CLP is an s390 firmware interface and this is
just dumping that to userspace.  The cover letter says:

  Using the PCI VFIO interface allows userland, a.k.a. QEMU, to
  retrieve ZPCI specific information without knowing Z specific
  identifiers like the function ID or the function handle of the zPCI
  function hidden behind the PCI interface.

But what does this allow userland to do and what specific pieces of
information do they need?  We do have a case already where Intel
graphics devices have a table (OpRegion) living in host system memory
that we expose via a vfio region, so it wouldn't be unprecedented to do
something like this, but as Connie suggests, if we knew what was being
consumed here and why, maybe we could generalize it into something
useful for others.

> >>> In fact, the struct will have all defined fields I used but not the BAR
> >>> size and address (at least for now because there are special cases we do
> >>> not support yet with bars).
> >>> All the reserved fields can go away.
> >>>
> >>> Is it more conform to your idea?
> >>>
> >>> Also I have 2 interfaces:
> >>>
> >>> s390_iommu.get_attr <-I1-> VFIO_IOMMU <-I2-> userland
> >>>
> >>> Do you prefer:
> >>> - 2 different structures, no CLP raw structure
> >>> - the CLP raw structure for I1 and a VFIO specific structure for I2  
> > 
> > <entering from the sideline>
> > 
> > IIUC, get_attr extracts various data points via clp, and we then make
> > it available to userspace. The clp interface needs to be abstracted
> > away at some point... one question from me: Is there a chance that
> > someone else may want to make use of the userspace interface (extra
> > information about a function)? If yes, I'd expect the get_attr to
> > obtain some kind of portable information already (basically your third
> > option, below).

I agree, but I also suspect we're pretty deep into s390
eccentricities.  An ioctl on the IOMMU container to get information
about a PCI function (singular) really seems like it can only exist on
a system where the actual PCI hardware is already being virtualized to
the host system.  I don't think this excludes us from the conversation
about what we're actually trying to expose and what it enables in
userspace though.
 
> Yes, seems the most reasonable.
> In this case I need to share the structure definition between:
> userspace through vfio.h
> vfio_iommu (this is obvious)
> s390_iommu
> 
> It is this third include which made me doubt.
> But when you re formulate it it looks the more reasonable because there 
> are much less changes.

It depends on what we settle on for get_attr.  If there are discrete
features that vfio_iommu_type1 can query and assemble into the
userspace response, the s390_iommu doesn't need to know the resulting
structure.  Even if it's just a CLP structure from the get_attr, why
would s390_iommu be responsible for formatting that into a user
structure vs vfio_iommu?  I don't think we want get_attr passing vfio
specific structures.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-20 18:23             ` Alex Williamson
@ 2019-05-21  9:14               ` Pierre Morel
  2019-05-21 11:11                 ` Cornelia Huck
  2019-05-21 14:59                 ` Alex Williamson
  0 siblings, 2 replies; 18+ messages in thread
From: Pierre Morel @ 2019-05-21  9:14 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Cornelia Huck, sebott, gerald.schaefer, pasic, borntraeger,
	walling, linux-s390, iommu, joro, linux-kernel, kvm, schwidefsky,
	heiko.carstens, robin.murphy

On 20/05/2019 20:23, Alex Williamson wrote:
> On Mon, 20 May 2019 18:31:08 +0200
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> On 20/05/2019 16:27, Cornelia Huck wrote:
>>> On Mon, 20 May 2019 13:19:23 +0200
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>    
>>>> On 17/05/2019 20:04, Pierre Morel wrote:
>>>>> On 17/05/2019 18:41, Alex Williamson wrote:
>>>>>> On Fri, 17 May 2019 18:16:50 +0200
>>>>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>>>>      
>>>>>>> We implement the capability interface for VFIO_IOMMU_GET_INFO.
>>>>>>>
>>>>>>> When calling the ioctl, the user must specify
>>>>>>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
>>>>>>> must check in the answer if capabilities are supported.
>>>>>>>
>>>>>>> The iommu get_attr callback will be used to retrieve the specific
>>>>>>> attributes and fill the capabilities.
>>>>>>>
>>>>>>> Currently two Z-PCI specific capabilities will be queried and
>>>>>>> filled by the underlying Z specific s390_iommu:
>>>>>>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
>>>>>>> and
>>>>>>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
>>>>>>>
>>>>>>> Other architectures may add new capabilities in the same way
>>>>>>> after enhancing the architecture specific IOMMU driver.
>>>>>>>
>>>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>>>> ---
>>>>>>>     drivers/vfio/vfio_iommu_type1.c | 122
>>>>>>> +++++++++++++++++++++++++++++++++++++++-
>>>>>>>     1 file changed, 121 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c
>>>>>>> b/drivers/vfio/vfio_iommu_type1.c
>>>>>>> index d0f731c..9435647 100644
>>>>>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>>>>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>>>>>> @@ -1658,6 +1658,97 @@ static int
>>>>>>> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>>>>>>>         return ret;
>>>>>>>     }
>>>>>>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
>>>>>>> +                    struct vfio_info_cap *caps, size_t size)
>>>>>>> +{
>>>>>>> +    struct vfio_iommu_type1_info_pcifn *info_fn;
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    info_fn = kzalloc(size, GFP_KERNEL);
>>>>>>> +    if (!info_fn)
>>>>>>> +        return -ENOMEM;
>>>>>>> +
>>>>>>> +    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
>>>>>>> +                    &info_fn->response);
>>>>>>
>>>>>> What ensures that the 'struct clp_rsp_query_pci' returned from this
>>>>>> get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
>>>>>> Why does the latter contains so many reserved fields (beyond simply
>>>>>> alignment) for a user API?  What fields of these structures are
>>>>>> actually useful to userspace?  Should any fields not be exposed to the
>>>>>> user?  Aren't BAR sizes redundant to what's available through the vfio
>>>>>> PCI API?  I'm afraid that simply redefining an internal structure as
>>>>>> the API leaves a lot to be desired too.  Thanks,
>>>>>>
>>>>>> Alex
>>>>>>      
>>>>> Hi Alex,
>>>>>
>>>>> I simply used the structure returned by the firmware to be sure to be
>>>>> consistent with future evolutions and facilitate the copy from CLP and
>>>>> to userland.
>>>>>
>>>>> If you prefer, and I understand that this is the case, I can define a
>>>>> specific VFIO_IOMMU structure with only the fields relevant to the user,
>>>>> leaving future enhancement of the user's interface being implemented in
>>>>> another kernel patch when the time has come.
> 
> TBH, I had no idea that CLP is an s390 firmware interface and this is
> just dumping that to userspace.  The cover letter says:
> 
>    Using the PCI VFIO interface allows userland, a.k.a. QEMU, to
>    retrieve ZPCI specific information without knowing Z specific
>    identifiers like the function ID or the function handle of the zPCI
>    function hidden behind the PCI interface.
> 
> But what does this allow userland to do and what specific pieces of
> information do they need?  We do have a case already where Intel
> graphics devices have a table (OpRegion) living in host system memory
> that we expose via a vfio region, so it wouldn't be unprecedented to do
> something like this, but as Connie suggests, if we knew what was being
> consumed here and why, maybe we could generalize it into something
> useful for others.

OK, sorry I try to explain better.

1) A short description, of zPCI functions and groups

IN Z, PCI cards, leave behind an adapter between subchannels and PCI.
We access PCI cards through 2 ways:
- dedicated PCI instructions (pci_load/pci_store/pci/store_block)
- DMA
We receive events through
- Adapter interrupts
- CHSC events

The adapter propose an IOMMU to protect the DMA
and the interrupt handling goes through a MSIX like interface handled by 
the adapter.

The architecture specific PCI do the interface between the standard PCI 
level and the zPCI function (PCI + DMA/IOMMU/Interrupt)

To handle the communication through the "zPCI way" the CLP interface 
provides instructions to retrieve informations from the adapters.

There are different group of functions having same functionalities.

clp_list give us a list from zPCI functions
clp_query_pci_function returns informations specific to a function
clp_query_group returns information on a function group


2) Why do we need it in the guest

We need to provide the guest with information on the adapters and zPCI 
functions returned by the clp_query instruction so that the guest's 
driver gets the right information on how the way to the zPCI function 
has been built in the host.


When a guest issues the CLP instructions we intercept the clp command in 
QEMU and we need to feed the response with the right values for the guest.
The "right" values are not the raw CLP response values:

- some identifier must be virtualized, like UID and FID,

- some values must match what the host received from the CLP response, 
like the size of the transmited blocks, the DMA Address Space Mask, 
number of interrupt, MSIA

- some other must match what the host handled with the adapter and 
function, the start and end of DMA,

- some what the host IOMMU driver supports (frame size),



3) We have three different way to get This information:

The PCI Linux interface is a standard PCI interface and some Z specific 
information is available in sysfs.
Not all the information needed to be returned inside the CLP response is 
available.
So we can not use the sysfs interface to get all the information.

There is a CLP ioctl interface but this interface is not secure in that 
it returns the information for all adapters in the system.

The VFIO interface offers the advantage to point to a single PCI 
function, so more secure than the clp ioctl interface.
Coupled with the s390_iommu we get access to the zPCI CLP instruction 
and to the values handled by the zPCI driver.


4) Until now we used to fill the CLP response to the guest inside QEMU 
with fixed values corresponding to the only PCI card we supported.
To support new cards we need to get the right values from the kernel out.

> 
>>>>> In fact, the struct will have all defined fields I used but not the BAR
>>>>> size and address (at least for now because there are special cases we do
>>>>> not support yet with bars).
>>>>> All the reserved fields can go away.
>>>>>
>>>>> Is it more conform to your idea?
>>>>>
>>>>> Also I have 2 interfaces:
>>>>>
>>>>> s390_iommu.get_attr <-I1-> VFIO_IOMMU <-I2-> userland
>>>>>
>>>>> Do you prefer:
>>>>> - 2 different structures, no CLP raw structure
>>>>> - the CLP raw structure for I1 and a VFIO specific structure for I2
>>>
>>> <entering from the sideline>
>>>
>>> IIUC, get_attr extracts various data points via clp, and we then make
>>> it available to userspace. The clp interface needs to be abstracted
>>> away at some point... one question from me: Is there a chance that
>>> someone else may want to make use of the userspace interface (extra
>>> information about a function)? If yes, I'd expect the get_attr to
>>> obtain some kind of portable information already (basically your third
>>> option, below).
> 
> I agree, but I also suspect we're pretty deep into s390
> eccentricities.  An ioctl on the IOMMU container to get information
> about a PCI function (singular) really seems like it can only exist on
> a system where the actual PCI hardware is already being virtualized to
> the host system.  I don't think this excludes us from the conversation
> about what we're actually trying to expose and what it enables in
> userspace though.

I hope I answered these question in this email, above.

>   
>> Yes, seems the most reasonable.
>> In this case I need to share the structure definition between:
>> userspace through vfio.h
>> vfio_iommu (this is obvious)
>> s390_iommu
>>
>> It is this third include which made me doubt.
>> But when you re formulate it it looks the more reasonable because there
>> are much less changes.
> 
> It depends on what we settle on for get_attr.  If there are discrete
> features that vfio_iommu_type1 can query and assemble into the
> userspace response, the s390_iommu doesn't need to know the resulting
> structure.  Even if it's just a CLP structure from the get_attr, why
> would s390_iommu be responsible for formatting that into a user
> structure vs vfio_iommu?  I don't think we want get_attr passing vfio
> specific structures.  Thanks,
> 
> Alex
> 

OK, yes, I can do this and setup one ATTR for each feature and 
reassemble it in VFIO_IOMMU_TYPE1

Thanks,
Pierre

-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-21  9:14               ` Pierre Morel
@ 2019-05-21 11:11                 ` Cornelia Huck
  2019-05-21 12:46                   ` Pierre Morel
  2019-05-21 14:59                 ` Alex Williamson
  1 sibling, 1 reply; 18+ messages in thread
From: Cornelia Huck @ 2019-05-21 11:11 UTC (permalink / raw)
  To: Pierre Morel
  Cc: Alex Williamson, sebott, gerald.schaefer, pasic, borntraeger,
	walling, linux-s390, iommu, joro, linux-kernel, kvm, schwidefsky,
	heiko.carstens, robin.murphy

On Tue, 21 May 2019 11:14:38 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:

> 1) A short description, of zPCI functions and groups
> 
> IN Z, PCI cards, leave behind an adapter between subchannels and PCI.
> We access PCI cards through 2 ways:
> - dedicated PCI instructions (pci_load/pci_store/pci/store_block)
> - DMA

Quick question: What about the new pci instructions? Anything that
needs to be considered there?

> We receive events through
> - Adapter interrupts

Note for the non-s390 folks: These are (I/O) interrupts that are not
tied to a specific device. MSI-X is mapped to this.

> - CHSC events

Another note for the non-s390 folks: This is a notification mechanism
that is using machine check interrupts; more information is retrieved
via a special instruction (chsc).

> 
> The adapter propose an IOMMU to protect the DMA
> and the interrupt handling goes through a MSIX like interface handled by 
> the adapter.
> 
> The architecture specific PCI do the interface between the standard PCI 
> level and the zPCI function (PCI + DMA/IOMMU/Interrupt)
> 
> To handle the communication through the "zPCI way" the CLP interface 
> provides instructions to retrieve informations from the adapters.
> 
> There are different group of functions having same functionalities.
> 
> clp_list give us a list from zPCI functions
> clp_query_pci_function returns informations specific to a function
> clp_query_group returns information on a function group
> 
> 
> 2) Why do we need it in the guest
> 
> We need to provide the guest with information on the adapters and zPCI 
> functions returned by the clp_query instruction so that the guest's 
> driver gets the right information on how the way to the zPCI function 
> has been built in the host.
> 
> 
> When a guest issues the CLP instructions we intercept the clp command in 
> QEMU and we need to feed the response with the right values for the guest.
> The "right" values are not the raw CLP response values:
> 
> - some identifier must be virtualized, like UID and FID,
> 
> - some values must match what the host received from the CLP response, 
> like the size of the transmited blocks, the DMA Address Space Mask, 
> number of interrupt, MSIA
> 
> - some other must match what the host handled with the adapter and 
> function, the start and end of DMA,
> 
> - some what the host IOMMU driver supports (frame size),
> 
> 
> 
> 3) We have three different way to get This information:
> 
> The PCI Linux interface is a standard PCI interface and some Z specific 
> information is available in sysfs.
> Not all the information needed to be returned inside the CLP response is 
> available.
> So we can not use the sysfs interface to get all the information.
> 
> There is a CLP ioctl interface but this interface is not secure in that 
> it returns the information for all adapters in the system.
> 
> The VFIO interface offers the advantage to point to a single PCI 
> function, so more secure than the clp ioctl interface.
> Coupled with the s390_iommu we get access to the zPCI CLP instruction 
> and to the values handled by the zPCI driver.
> 
> 
> 4) Until now we used to fill the CLP response to the guest inside QEMU 
> with fixed values corresponding to the only PCI card we supported.
> To support new cards we need to get the right values from the kernel out.

IIRC, the current code fills in values that make sense for one specific
type of card only, right? We also use the same values for emulated
cards (virtio); I assume that they are not completely weird for that
case...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-21 11:11                 ` Cornelia Huck
@ 2019-05-21 12:46                   ` Pierre Morel
  0 siblings, 0 replies; 18+ messages in thread
From: Pierre Morel @ 2019-05-21 12:46 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Alex Williamson, sebott, gerald.schaefer, pasic, borntraeger,
	walling, linux-s390, iommu, joro, linux-kernel, kvm, schwidefsky,
	heiko.carstens, robin.murphy

On 21/05/2019 13:11, Cornelia Huck wrote:
> On Tue, 21 May 2019 11:14:38 +0200
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> 1) A short description, of zPCI functions and groups
>>
>> IN Z, PCI cards, leave behind an adapter between subchannels and PCI.
>> We access PCI cards through 2 ways:
>> - dedicated PCI instructions (pci_load/pci_store/pci/store_block)
>> - DMA
> 
> Quick question: What about the new pci instructions? Anything that
> needs to be considered there?

No and yes.

No because they should be used when pci_{load,stor,store_block} are 
interpreted AFAIU.
And currently we only use interception.

Yes, because, the CLP part, use to setup the translations IIUC, (do not 
ask me for details now), will need to be re-issued by the kernel after 
some modifications and this will also need a way from QEMU S390 PCI down 
to the ZPCI driver.
Way that I try to setup with this patch.

So answer is not now but we should keep in mind that we will 
definitively need a way down to the zpci low level in the host.

> 
>> We receive events through
>> - Adapter interrupts
> 
> Note for the non-s390 folks: These are (I/O) interrupts that are not
> tied to a specific device. MSI-X is mapped to this.
> 
>> - CHSC events
> 
> Another note for the non-s390 folks: This is a notification mechanism
> that is using machine check interrupts; more information is retrieved
> via a special instruction (chsc).
> 

thanks, it is yes better to explain better :)

>>
>> The adapter propose an IOMMU to protect the DMA
>> and the interrupt handling goes through a MSIX like interface handled by
>> the adapter.
>>
>> The architecture specific PCI do the interface between the standard PCI
>> level and the zPCI function (PCI + DMA/IOMMU/Interrupt)
>>
>> To handle the communication through the "zPCI way" the CLP interface
>> provides instructions to retrieve informations from the adapters.
>>
>> There are different group of functions having same functionalities.
>>
>> clp_list give us a list from zPCI functions
>> clp_query_pci_function returns informations specific to a function
>> clp_query_group returns information on a function group
>>
>>
>> 2) Why do we need it in the guest
>>
>> We need to provide the guest with information on the adapters and zPCI
>> functions returned by the clp_query instruction so that the guest's
>> driver gets the right information on how the way to the zPCI function
>> has been built in the host.
>>
>>
>> When a guest issues the CLP instructions we intercept the clp command in
>> QEMU and we need to feed the response with the right values for the guest.
>> The "right" values are not the raw CLP response values:
>>
>> - some identifier must be virtualized, like UID and FID,
>>
>> - some values must match what the host received from the CLP response,
>> like the size of the transmited blocks, the DMA Address Space Mask,
>> number of interrupt, MSIA
>>
>> - some other must match what the host handled with the adapter and
>> function, the start and end of DMA,
>>
>> - some what the host IOMMU driver supports (frame size),
>>
>>
>>
>> 3) We have three different way to get This information:
>>
>> The PCI Linux interface is a standard PCI interface and some Z specific
>> information is available in sysfs.
>> Not all the information needed to be returned inside the CLP response is
>> available.
>> So we can not use the sysfs interface to get all the information.
>>
>> There is a CLP ioctl interface but this interface is not secure in that
>> it returns the information for all adapters in the system.
>>
>> The VFIO interface offers the advantage to point to a single PCI
>> function, so more secure than the clp ioctl interface.
>> Coupled with the s390_iommu we get access to the zPCI CLP instruction
>> and to the values handled by the zPCI driver.
>>
>>
>> 4) Until now we used to fill the CLP response to the guest inside QEMU
>> with fixed values corresponding to the only PCI card we supported.
>> To support new cards we need to get the right values from the kernel out.
> 
> IIRC, the current code fills in values that make sense for one specific
> type of card only, right?

yes, right

> We also use the same values for emulated
> cards (virtio); I assume that they are not completely weird for that
> case...
> 

No they are not.

For emulated cards, all is done inside QEMU, we do not need kernel 
access, the emulated cards get a specific emulation function and group 
assigned with pre-defined values.

I sent a QEMU patch related to this.
Even the kernel interface will change with the changes in the kernel 
patch, the emulation should continue in this way.

Regards,
Pierre










-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-21  9:14               ` Pierre Morel
  2019-05-21 11:11                 ` Cornelia Huck
@ 2019-05-21 14:59                 ` Alex Williamson
  2019-05-21 15:33                   ` Pierre Morel
  1 sibling, 1 reply; 18+ messages in thread
From: Alex Williamson @ 2019-05-21 14:59 UTC (permalink / raw)
  To: Pierre Morel
  Cc: Cornelia Huck, sebott, gerald.schaefer, pasic, borntraeger,
	walling, linux-s390, iommu, joro, linux-kernel, kvm, schwidefsky,
	heiko.carstens, robin.murphy

On Tue, 21 May 2019 11:14:38 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 20/05/2019 20:23, Alex Williamson wrote:
> > On Mon, 20 May 2019 18:31:08 +0200
> > Pierre Morel <pmorel@linux.ibm.com> wrote:
> >   
> >> On 20/05/2019 16:27, Cornelia Huck wrote:  
> >>> On Mon, 20 May 2019 13:19:23 +0200
> >>> Pierre Morel <pmorel@linux.ibm.com> wrote:
> >>>      
> >>>> On 17/05/2019 20:04, Pierre Morel wrote:  
> >>>>> On 17/05/2019 18:41, Alex Williamson wrote:  
> >>>>>> On Fri, 17 May 2019 18:16:50 +0200
> >>>>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
> >>>>>>        
> >>>>>>> We implement the capability interface for VFIO_IOMMU_GET_INFO.
> >>>>>>>
> >>>>>>> When calling the ioctl, the user must specify
> >>>>>>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
> >>>>>>> must check in the answer if capabilities are supported.
> >>>>>>>
> >>>>>>> The iommu get_attr callback will be used to retrieve the specific
> >>>>>>> attributes and fill the capabilities.
> >>>>>>>
> >>>>>>> Currently two Z-PCI specific capabilities will be queried and
> >>>>>>> filled by the underlying Z specific s390_iommu:
> >>>>>>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
> >>>>>>> and
> >>>>>>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
> >>>>>>>
> >>>>>>> Other architectures may add new capabilities in the same way
> >>>>>>> after enhancing the architecture specific IOMMU driver.
> >>>>>>>
> >>>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> >>>>>>> ---
> >>>>>>>     drivers/vfio/vfio_iommu_type1.c | 122
> >>>>>>> +++++++++++++++++++++++++++++++++++++++-
> >>>>>>>     1 file changed, 121 insertions(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c
> >>>>>>> b/drivers/vfio/vfio_iommu_type1.c
> >>>>>>> index d0f731c..9435647 100644
> >>>>>>> --- a/drivers/vfio/vfio_iommu_type1.c
> >>>>>>> +++ b/drivers/vfio/vfio_iommu_type1.c
> >>>>>>> @@ -1658,6 +1658,97 @@ static int
> >>>>>>> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
> >>>>>>>         return ret;
> >>>>>>>     }
> >>>>>>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
> >>>>>>> +                    struct vfio_info_cap *caps, size_t size)
> >>>>>>> +{
> >>>>>>> +    struct vfio_iommu_type1_info_pcifn *info_fn;
> >>>>>>> +    int ret;
> >>>>>>> +
> >>>>>>> +    info_fn = kzalloc(size, GFP_KERNEL);
> >>>>>>> +    if (!info_fn)
> >>>>>>> +        return -ENOMEM;
> >>>>>>> +
> >>>>>>> +    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
> >>>>>>> +                    &info_fn->response);  
> >>>>>>
> >>>>>> What ensures that the 'struct clp_rsp_query_pci' returned from this
> >>>>>> get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
> >>>>>> Why does the latter contains so many reserved fields (beyond simply
> >>>>>> alignment) for a user API?  What fields of these structures are
> >>>>>> actually useful to userspace?  Should any fields not be exposed to the
> >>>>>> user?  Aren't BAR sizes redundant to what's available through the vfio
> >>>>>> PCI API?  I'm afraid that simply redefining an internal structure as
> >>>>>> the API leaves a lot to be desired too.  Thanks,
> >>>>>>
> >>>>>> Alex
> >>>>>>        
> >>>>> Hi Alex,
> >>>>>
> >>>>> I simply used the structure returned by the firmware to be sure to be
> >>>>> consistent with future evolutions and facilitate the copy from CLP and
> >>>>> to userland.
> >>>>>
> >>>>> If you prefer, and I understand that this is the case, I can define a
> >>>>> specific VFIO_IOMMU structure with only the fields relevant to the user,
> >>>>> leaving future enhancement of the user's interface being implemented in
> >>>>> another kernel patch when the time has come.  
> > 
> > TBH, I had no idea that CLP is an s390 firmware interface and this is
> > just dumping that to userspace.  The cover letter says:
> > 
> >    Using the PCI VFIO interface allows userland, a.k.a. QEMU, to
> >    retrieve ZPCI specific information without knowing Z specific
> >    identifiers like the function ID or the function handle of the zPCI
> >    function hidden behind the PCI interface.
> > 
> > But what does this allow userland to do and what specific pieces of
> > information do they need?  We do have a case already where Intel
> > graphics devices have a table (OpRegion) living in host system memory
> > that we expose via a vfio region, so it wouldn't be unprecedented to do
> > something like this, but as Connie suggests, if we knew what was being
> > consumed here and why, maybe we could generalize it into something
> > useful for others.  
> 
> OK, sorry I try to explain better.
> 
> 1) A short description, of zPCI functions and groups
> 
> IN Z, PCI cards, leave behind an adapter between subchannels and PCI.
> We access PCI cards through 2 ways:
> - dedicated PCI instructions (pci_load/pci_store/pci/store_block)
> - DMA
> We receive events through
> - Adapter interrupts
> - CHSC events
> 
> The adapter propose an IOMMU to protect the DMA
> and the interrupt handling goes through a MSIX like interface handled by 
> the adapter.
> 
> The architecture specific PCI do the interface between the standard PCI 
> level and the zPCI function (PCI + DMA/IOMMU/Interrupt)
> 
> To handle the communication through the "zPCI way" the CLP interface 
> provides instructions to retrieve informations from the adapters.
> 
> There are different group of functions having same functionalities.
> 
> clp_list give us a list from zPCI functions
> clp_query_pci_function returns informations specific to a function
> clp_query_group returns information on a function group
> 
> 
> 2) Why do we need it in the guest
> 
> We need to provide the guest with information on the adapters and zPCI 
> functions returned by the clp_query instruction so that the guest's 
> driver gets the right information on how the way to the zPCI function 
> has been built in the host.
> 
> 
> When a guest issues the CLP instructions we intercept the clp command in 
> QEMU and we need to feed the response with the right values for the guest.
> The "right" values are not the raw CLP response values:
> 
> - some identifier must be virtualized, like UID and FID,
> 
> - some values must match what the host received from the CLP response, 
> like the size of the transmited blocks, the DMA Address Space Mask, 
> number of interrupt, MSIA
> 
> - some other must match what the host handled with the adapter and 
> function, the start and end of DMA,
> 
> - some what the host IOMMU driver supports (frame size),

This seems very reminiscent of virtualizing PCI config space... so why
is this being proposed as a VFIO IOMMU ioctl extension?  These are all
function level characteristics, right?  Should this be a capability on
the VFIO device, or perhaps a region like we used for the Intel
OpRegion (though the structure size seems more akin to a capability
here)?  As I mentioned in my previous reply, tying this into the IOMMU
interface seemed to rely on (I assume) an one-to-one-to-one mapping of
PCI function to IOMMU group to IOMMU domain, but that doesn't still
doesn't necessarily lend itself to using the IOMMU for device level
information.  If there is IOMMU info, perhaps it needs to be split, ie.
expose a frame size via domain_get_attr, expose device level features
via a device capability, let QEMU assemble these into something
coherent to emulate the clp interface.

> 3) We have three different way to get This information:
> 
> The PCI Linux interface is a standard PCI interface and some Z specific 
> information is available in sysfs.
> Not all the information needed to be returned inside the CLP response is 
> available.
> So we can not use the sysfs interface to get all the information.
> 
> There is a CLP ioctl interface but this interface is not secure in that 
> it returns the information for all adapters in the system.
> 
> The VFIO interface offers the advantage to point to a single PCI 
> function, so more secure than the clp ioctl interface.
> Coupled with the s390_iommu we get access to the zPCI CLP instruction 
> and to the values handled by the zPCI driver.
> 
> 
> 4) Until now we used to fill the CLP response to the guest inside QEMU 
> with fixed values corresponding to the only PCI card we supported.
> To support new cards we need to get the right values from the kernel out.

If it's already emulated, I much prefer figuring out how to expose the
right pieces of information via an appropriate interface to virtualize
fields that are actually necessary rather than simply providing an
interface to dump the clp info straight to userspace and pipe it to the
VM.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES
  2019-05-21 14:59                 ` Alex Williamson
@ 2019-05-21 15:33                   ` Pierre Morel
  0 siblings, 0 replies; 18+ messages in thread
From: Pierre Morel @ 2019-05-21 15:33 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Cornelia Huck, sebott, gerald.schaefer, pasic, borntraeger,
	walling, linux-s390, iommu, joro, linux-kernel, kvm, schwidefsky,
	heiko.carstens, robin.murphy

On 21/05/2019 16:59, Alex Williamson wrote:
> On Tue, 21 May 2019 11:14:38 +0200
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> On 20/05/2019 20:23, Alex Williamson wrote:
>>> On Mon, 20 May 2019 18:31:08 +0200
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>    
>>>> On 20/05/2019 16:27, Cornelia Huck wrote:
>>>>> On Mon, 20 May 2019 13:19:23 +0200
>>>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>>>       
>>>>>> On 17/05/2019 20:04, Pierre Morel wrote:
>>>>>>> On 17/05/2019 18:41, Alex Williamson wrote:
>>>>>>>> On Fri, 17 May 2019 18:16:50 +0200
>>>>>>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>>>>>>         
>>>>>>>>> We implement the capability interface for VFIO_IOMMU_GET_INFO.
>>>>>>>>>
>>>>>>>>> When calling the ioctl, the user must specify
>>>>>>>>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
>>>>>>>>> must check in the answer if capabilities are supported.
>>>>>>>>>
>>>>>>>>> The iommu get_attr callback will be used to retrieve the specific
>>>>>>>>> attributes and fill the capabilities.
>>>>>>>>>
>>>>>>>>> Currently two Z-PCI specific capabilities will be queried and
>>>>>>>>> filled by the underlying Z specific s390_iommu:
>>>>>>>>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
>>>>>>>>> and
>>>>>>>>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
>>>>>>>>>
>>>>>>>>> Other architectures may add new capabilities in the same way
>>>>>>>>> after enhancing the architecture specific IOMMU driver.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>>>>>>> ---
>>>>>>>>>      drivers/vfio/vfio_iommu_type1.c | 122
>>>>>>>>> +++++++++++++++++++++++++++++++++++++++-
>>>>>>>>>      1 file changed, 121 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/vfio/vfio_iommu_type1.c
>>>>>>>>> b/drivers/vfio/vfio_iommu_type1.c
>>>>>>>>> index d0f731c..9435647 100644
>>>>>>>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>>>>>>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>>>>>>>> @@ -1658,6 +1658,97 @@ static int
>>>>>>>>> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>>>>>>>>>          return ret;
>>>>>>>>>      }
>>>>>>>>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
>>>>>>>>> +                    struct vfio_info_cap *caps, size_t size)
>>>>>>>>> +{
>>>>>>>>> +    struct vfio_iommu_type1_info_pcifn *info_fn;
>>>>>>>>> +    int ret;
>>>>>>>>> +
>>>>>>>>> +    info_fn = kzalloc(size, GFP_KERNEL);
>>>>>>>>> +    if (!info_fn)
>>>>>>>>> +        return -ENOMEM;
>>>>>>>>> +
>>>>>>>>> +    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
>>>>>>>>> +                    &info_fn->response);
>>>>>>>>
>>>>>>>> What ensures that the 'struct clp_rsp_query_pci' returned from this
>>>>>>>> get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
>>>>>>>> Why does the latter contains so many reserved fields (beyond simply
>>>>>>>> alignment) for a user API?  What fields of these structures are
>>>>>>>> actually useful to userspace?  Should any fields not be exposed to the
>>>>>>>> user?  Aren't BAR sizes redundant to what's available through the vfio
>>>>>>>> PCI API?  I'm afraid that simply redefining an internal structure as
>>>>>>>> the API leaves a lot to be desired too.  Thanks,
>>>>>>>>
>>>>>>>> Alex
>>>>>>>>         
>>>>>>> Hi Alex,
>>>>>>>
>>>>>>> I simply used the structure returned by the firmware to be sure to be
>>>>>>> consistent with future evolutions and facilitate the copy from CLP and
>>>>>>> to userland.
>>>>>>>
>>>>>>> If you prefer, and I understand that this is the case, I can define a
>>>>>>> specific VFIO_IOMMU structure with only the fields relevant to the user,
>>>>>>> leaving future enhancement of the user's interface being implemented in
>>>>>>> another kernel patch when the time has come.
>>>
>>> TBH, I had no idea that CLP is an s390 firmware interface and this is
>>> just dumping that to userspace.  The cover letter says:
>>>
>>>     Using the PCI VFIO interface allows userland, a.k.a. QEMU, to
>>>     retrieve ZPCI specific information without knowing Z specific
>>>     identifiers like the function ID or the function handle of the zPCI
>>>     function hidden behind the PCI interface.
>>>
>>> But what does this allow userland to do and what specific pieces of
>>> information do they need?  We do have a case already where Intel
>>> graphics devices have a table (OpRegion) living in host system memory
>>> that we expose via a vfio region, so it wouldn't be unprecedented to do
>>> something like this, but as Connie suggests, if we knew what was being
>>> consumed here and why, maybe we could generalize it into something
>>> useful for others.
>>
>> OK, sorry I try to explain better.
>>
>> 1) A short description, of zPCI functions and groups
>>
>> IN Z, PCI cards, leave behind an adapter between subchannels and PCI.
>> We access PCI cards through 2 ways:
>> - dedicated PCI instructions (pci_load/pci_store/pci/store_block)
>> - DMA
>> We receive events through
>> - Adapter interrupts
>> - CHSC events
>>
>> The adapter propose an IOMMU to protect the DMA
>> and the interrupt handling goes through a MSIX like interface handled by
>> the adapter.
>>
>> The architecture specific PCI do the interface between the standard PCI
>> level and the zPCI function (PCI + DMA/IOMMU/Interrupt)
>>
>> To handle the communication through the "zPCI way" the CLP interface
>> provides instructions to retrieve informations from the adapters.
>>
>> There are different group of functions having same functionalities.
>>
>> clp_list give us a list from zPCI functions
>> clp_query_pci_function returns informations specific to a function
>> clp_query_group returns information on a function group
>>
>>
>> 2) Why do we need it in the guest
>>
>> We need to provide the guest with information on the adapters and zPCI
>> functions returned by the clp_query instruction so that the guest's
>> driver gets the right information on how the way to the zPCI function
>> has been built in the host.
>>
>>
>> When a guest issues the CLP instructions we intercept the clp command in
>> QEMU and we need to feed the response with the right values for the guest.
>> The "right" values are not the raw CLP response values:
>>
>> - some identifier must be virtualized, like UID and FID,
>>
>> - some values must match what the host received from the CLP response,
>> like the size of the transmited blocks, the DMA Address Space Mask,
>> number of interrupt, MSIA
>>
>> - some other must match what the host handled with the adapter and
>> function, the start and end of DMA,
>>
>> - some what the host IOMMU driver supports (frame size),
> 
> This seems very reminiscent of virtualizing PCI config space... so why
> is this being proposed as a VFIO IOMMU ioctl extension?  These are all
> function level characteristics, right?  Should this be a capability on
> the VFIO device, or perhaps a region like we used for the Intel
> OpRegion (though the structure size seems more akin to a capability
> here)?  As I mentioned in my previous reply, tying this into the IOMMU
> interface seemed to rely on (I assume) an one-to-one-to-one mapping of
> PCI function to IOMMU group to IOMMU domain, but that doesn't still
> doesn't necessarily lend itself to using the IOMMU for device level
> information.  If there is IOMMU info, perhaps it needs to be split, ie.
> expose a frame size via domain_get_attr, expose device level features
> via a device capability, let QEMU assemble these into something
> coherent to emulate the clp interface.
> 
>> 3) We have three different way to get This information:
>>
>> The PCI Linux interface is a standard PCI interface and some Z specific
>> information is available in sysfs.
>> Not all the information needed to be returned inside the CLP response is
>> available.
>> So we can not use the sysfs interface to get all the information.
>>
>> There is a CLP ioctl interface but this interface is not secure in that
>> it returns the information for all adapters in the system.
>>
>> The VFIO interface offers the advantage to point to a single PCI
>> function, so more secure than the clp ioctl interface.
>> Coupled with the s390_iommu we get access to the zPCI CLP instruction
>> and to the values handled by the zPCI driver.
>>
>>
>> 4) Until now we used to fill the CLP response to the guest inside QEMU
>> with fixed values corresponding to the only PCI card we supported.
>> To support new cards we need to get the right values from the kernel out.
> 
> If it's already emulated, I much prefer figuring out how to expose the
> right pieces of information via an appropriate interface to virtualize
> fields that are actually necessary rather than simply providing an
> interface to dump the clp info straight to userspace and pipe it to the
> VM.  Thanks,
> 
> Alex
> 

OK, I understand. Seems very clear, IOMMU features through IOMMU 
interface device features through device interface.

Some times I do not understand what I did. Seems I messed up.

Thanks to have take time to explain.

Been back in a while with a better series.

Regards,
Pierre




-- 
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-05-21 15:33 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-17 16:16 [PATCH v2 0/4] Retrieving zPCI specific info with VFIO Pierre Morel
2019-05-17 16:16 ` [PATCH v2 1/4] s390: pci: Exporting access to CLP PCI function and PCI group Pierre Morel
2019-05-17 16:16 ` [PATCH v2 2/4] vfio: vfio_iommu_type1: Define VFIO_IOMMU_INFO_CAPABILITIES Pierre Morel
2019-05-17 16:16 ` [PATCH v2 3/4] s390: iommu: Adding get attributes for s390_iommu Pierre Morel
2019-05-17 16:16 ` [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES Pierre Morel
2019-05-17 16:41   ` Alex Williamson
2019-05-17 18:04     ` Pierre Morel
2019-05-20 11:19       ` Pierre Morel
2019-05-20 14:27         ` Cornelia Huck
2019-05-20 16:31           ` Pierre Morel
2019-05-20 18:23             ` Alex Williamson
2019-05-21  9:14               ` Pierre Morel
2019-05-21 11:11                 ` Cornelia Huck
2019-05-21 12:46                   ` Pierre Morel
2019-05-21 14:59                 ` Alex Williamson
2019-05-21 15:33                   ` Pierre Morel
2019-05-20  3:02   ` kbuild test robot
2019-05-20  3:02   ` [RFC PATCH] vfio: vfio_iommu_type1: vfio_iommu_type1_caps() can be static kbuild test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).