LKML Archive on
help / color / mirror / Atom feed
From: Jean-Philippe Brucker <>
To: "Tian, Kevin" <>,
	Alex Williamson <>
Cc: Jacob Pan <>,
	LKML <>,
	Joerg Roedel <>,
	David Woodhouse <>,
	Greg Kroah-Hartman <>,
	"Wysocki, Rafael J" <>,
	"Liu, Yi L" <>,
	"Raj, Ashok" <>,
	Christoph Hellwig <>,
	Lu Baolu <>,
	Yi L <>,
	Auger Eric <>
Subject: Re: [PATCH v4 04/22] iommu/vt-d: add bind_pasid_table function
Date: Wed, 30 May 2018 12:53:53 +0100	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

On 30/05/18 04:45, Tian, Kevin wrote:
>>>>>> On SMMUv3 the minimum alignment for base_ptr is 64 bytes, so a
>>>> guest
>>>>>> under a vSMMU might pass a pointer that's not aligned on 4k.
>>>>> PASID table pointer for VT-d is 4K aligned.
>>>>>> Maybe this information could be part of the data passed to
>> userspace
>>>>>> about IOMMU table formats and features? They're not part of this
>>>>>> series, but I think we wanted to communicate IOMMU-specific
>> features
>>>>>> via sysfs.
>>>>> Agreed, I believe Yi Liu is working on a sysfs interface such that QEMU
>>>>> can match IOMMU model and features.
>>>> Digging this up again since v5 still has this issue.  The IOMMU API is
>>>> a kernel internal abstraction of the IOMMU.  sysfs is a userspace
>>>> interface.  Are we suggesting that the /only/ way to make use of the
>>>> internal IOMMU API here is to have a user provided opaque pasid table
>>>> that we can't even do minimal compatibility sanity testing on and we
>>>> simply hope that hardware covers all the fault conditions without
>>>> taking the host down with it?  I guess we have to assume the latter
>>>> since the user has full control of the table, but I have a hard time
>>>> getting past lack of internal ability to use the interface and no
>>>> ability to provide even the slimmest sanity testing.  Thanks,
>>> checking size, alignment, ... is OK, which I think is already considered
>>> by vendor IOMMU driver. However sanity testing table format might
>>> be difficult. The initial table provided by guest is likely just all ZEROs.
>>> whatever format violation may be caught only when a PASID entry
>>> is updated...
>> There's sanity testing the actual contents of the table, which I agree
>> would be difficult and would likely require some sort of shadowing at
>> additional overhead, but what about even basic consistency checking?
>> For example, is it possible that due to hardware variations a user
>> might generate a table which works on some systems but not others?
>> What
>> if two table formats are sufficiently similar that the IOMMU driver
>> puts an incompatible table in place but it continuously generates
>> faults, how do we debug that?  As an intermediary in this whole process
>> I'd really rather be able to identify that the user claims to be
>> providing a TypeA table but the IOMMU only supports TypeB, so clearly
>> this won't work.  I don't see that we have that capability.  Thanks,
> I remember we ever discussed to define some vendor/model ID,
> which can be retrieved by user space and then passed back when
> doing table binding. Then above simple model matching check can
> be done accordingly. It is actually a basic requirement when using 
> virtio-iommu, same driver expecting to work on all vendor IOMMUs.
> However I don't remember whether/where that logic is implemented
> in this series (especially when there are two tracks moving in parallel).
> I'll leave to Jacob/Jean to further comment.

For Arm we do need some form of sanity checking. As each architecture
version brings a new set of features that may be supported and enabled
individually, we need to communicate fine-grained features to users.
They describes the general capability of the physical IOMMU, and also
which fields are available in the PASID table (entries are 512-bits and
leave some space for future extensions).

In the past I briefly tried using a ioctl-based interface through VFIO
only, but it seemed more complicated to extend than sysfs for this kind
of probing.

Note that the following is from my own prototype. I'm not sure how much
Yi Liu's implementation differs but I think this was roughly what we
agreed on last time. In sysfs an IOMMU device is described with:

* A model number, for example intel-vtd=1, arm-smmu-v3=2.
* Properties and features, describing in detail what the pIOMMU device
  and driver support.


For example an SMMUv3:

The model number is described as a property
/sys/class/iommu/smmu.0x00000000e0600000/arm-smmu-v3/model = 2

A few feature bits and values:
.../arm-smmu-v3/asid_bits	// max address space ID bits, %d
.../arm-smmu-v3/ssid_bits	// max substream ID (PASID) bits, %d
.../arm-smmu-v3/input_bits	// max input address size, %d
.../arm-smmu-v3/output_bits	// max output address size, %d
.../arm-smmu-v3/btm		// broadcast TLB maintenance, enabled/disabled
.../arm-smmu-v3/httu		// Hardware table update, access+dirty/access/none
.../arm-smmu-v3/stall		// transaction stalling, enabled/disabled/force

(Note that the base pointer alignment previously discussed could be
implied by the model number, or added explicitly here.)

Which page table formats are supported:
I'm not sure yet what values these will have, they might simply contain
arbitrary format numbers because fields available in the page tables can
be deduced from the above features bits. (Out of laziness, in my
prototype I just describe a preferred format in a pgtable_format file)

As you can imagine I'd rather not pass the fine details back to the
kernel in bind_pasid_table. The list of features is growing, and
describing them is a pain. It could be done for debugging purpose, but
all we'd be achieving is telling the kernel that userspace has read the
values, not that the guest intends to use them. The guest selects
features by writing PASID table entries, which aren't read by the host.

If the guest writes invalid values in the PASID table then yes, we have
to rely on the hardware to contain the fault and not bring the host down
with it. If the IOMMU cannot do that, then the driver really shouldn't
implement bind_pasid_table... Otherwise, a fault while reading the PASID
table can be injected into the guest as an unrecoverable fault
patch 10) or printed by the host when debugging.

However I think the model number should be added to pasid_table_config.
For one thing it gives us a simple sanity-check, but it also tells which
other fields are valid in pasid_table_config. Arm-smmu-v3 needs at least
two additional 8-bit fields describing the PASID table format (number of
levels and PASID0 behaviour), which are written to device context tables
when installing the PASID table pointer.

Compatibility: new optional features are easy to add to a given model,
just add a new sysfs file. If in the future, the host describes a new
feature that is mandatory, or implements a different PASID table format,
how does it ensure that user understands it? Perhaps use a new model
number for this, e.g. "arm-smmu-v3-a=3", with similar features. I think
it would be the same if the host stops supporting a feature for a given
model, because they are ABI. But we can also define default values from
the start, for example "if ssid_bits file isn't present, default value
is 0 - PASID not supported"


  reply	other threads:[~2018-05-30 11:54 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-16 21:48 [PATCH v4 00/22] IOMMU and VT-d driver support for Shared Virtual Address (SVA) Jacob Pan
2018-04-16 21:48 ` [PATCH v4 01/22] iommu: introduce bind_pasid_table API function Jacob Pan
2018-04-16 21:48 ` [PATCH v4 02/22] iommu/vt-d: move device_domain_info to header Jacob Pan
2018-04-16 21:48 ` [PATCH v4 03/22] iommu/vt-d: add a flag for pasid table bound status Jacob Pan
2018-04-16 21:48 ` [PATCH v4 04/22] iommu/vt-d: add bind_pasid_table function Jacob Pan
2018-04-17 19:10   ` Alex Williamson
2018-04-20 18:25     ` Jean-Philippe Brucker
2018-04-20 23:42       ` Jacob Pan
2018-05-29 20:09         ` Alex Williamson
2018-05-30  1:41           ` Tian, Kevin
2018-05-30  3:17             ` Alex Williamson
2018-05-30  3:45               ` Tian, Kevin
2018-05-30 11:53                 ` Jean-Philippe Brucker [this message]
2018-05-30 19:52                   ` Jacob Pan
2018-05-31  9:09                     ` Jean-Philippe Brucker
2018-06-05 17:32                       ` Jacob Pan
2018-06-06 11:20                         ` Jean-Philippe Brucker
2018-06-06 21:22                           ` Jacob Pan
2018-06-07 13:21                             ` Jean-Philippe Brucker
2018-04-20 23:22     ` Jacob Pan
2018-04-16 21:48 ` [PATCH v4 05/22] iommu: introduce iommu invalidate API function Jacob Pan
2018-04-20 18:19   ` Jean-Philippe Brucker
2018-04-23 20:43     ` Jacob Pan
2018-04-27 18:07       ` Jean-Philippe Brucker
2018-04-28  2:41         ` Tian, Kevin
2018-05-01 22:58         ` Jacob Pan
2018-05-02  9:31           ` Jean-Philippe Brucker
2018-05-04  4:46             ` Jacob Pan
2018-05-04 18:07               ` Jacob Pan
2018-05-08 10:35                 ` Jean-Philippe Brucker
2018-05-09 12:55                   ` Jacob Pan
2018-05-05 22:19   ` Jerry Snitselaar
2018-05-07 15:41     ` Jacob Pan
2018-04-16 21:48 ` [PATCH v4 06/22] iommu/vt-d: add definitions for PFSID Jacob Pan
2018-04-16 21:48 ` [PATCH v4 07/22] iommu/vt-d: fix dev iotlb pfsid use Jacob Pan
2018-04-16 21:48 ` [PATCH v4 08/22] iommu/vt-d: support flushing more translation cache types Jacob Pan
2018-04-16 21:48 ` [PATCH v4 09/22] iommu/vt-d: add svm/sva invalidate function Jacob Pan
2018-04-17 19:10   ` Alex Williamson
2018-04-20 22:36     ` Jacob Pan
2018-04-16 21:48 ` [PATCH v4 10/22] iommu: introduce device fault data Jacob Pan
2018-04-23 10:11   ` Jean-Philippe Brucker
2018-04-23 11:54     ` Jacob Pan
2018-05-20  8:17   ` Liu, Yi L
2018-05-21 23:16     ` Jacob Pan
2018-04-16 21:49 ` [PATCH v4 11/22] driver core: add per device iommu param Jacob Pan
2018-04-23 10:26   ` Greg Kroah-Hartman
2018-04-16 21:49 ` [PATCH v4 12/22] iommu: introduce device fault report API Jacob Pan
2018-04-23 11:30   ` Jean-Philippe Brucker
2018-04-24 18:29     ` Jacob Pan
2018-04-30 16:53   ` Jean-Philippe Brucker
2018-04-30 18:54     ` Jacob Pan
2018-04-16 21:49 ` [PATCH v4 13/22] iommu: introduce page response function Jacob Pan
2018-04-23 11:47   ` Jean-Philippe Brucker
2018-04-23 12:16     ` Jacob Pan
2018-04-23 15:50       ` Jean-Philippe Brucker
2018-04-16 21:49 ` [PATCH v4 14/22] iommu: handle page response timeout Jacob Pan
2018-04-23 15:36   ` Jean-Philippe Brucker
2018-04-25 15:37     ` Jacob Pan
2018-04-30 10:58       ` Jean-Philippe Brucker
2018-04-30 17:54         ` Jacob Pan
2018-04-16 21:49 ` [PATCH v4 15/22] iommu/config: add build dependency for dmar Jacob Pan
2018-04-16 21:49 ` [PATCH v4 16/22] iommu/vt-d: report non-recoverable faults to device Jacob Pan
2018-04-16 21:49 ` [PATCH v4 17/22] iommu/intel-svm: report device page request Jacob Pan
2018-04-16 21:49 ` [PATCH v4 18/22] iommu/intel-svm: replace dev ops with fault report API Jacob Pan
2018-04-16 21:49 ` [PATCH v4 19/22] iommu/intel-svm: do not flush iotlb for viommu Jacob Pan
2018-04-16 21:49 ` [PATCH v4 20/22] iommu/vt-d: add intel iommu page response function Jacob Pan
2018-04-16 21:49 ` [PATCH v4 21/22] trace/iommu: add sva trace events Jacob Pan
2018-04-16 21:49 ` [PATCH v4 22/22] iommu: use sva invalidate and device fault trace event Jacob Pan
  -- strict thread matches above, loose matches on Subject: below --
2018-03-23  3:11 [PATCH v4 00/22] IOMMU and VT-d driver support for Shared Virtual Address (SVA) Jacob Pan
2018-03-23  3:11 ` [PATCH v4 04/22] iommu/vt-d: add bind_pasid_table function Jacob Pan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
    --subject='Re: [PATCH v4 04/22] iommu/vt-d: add bind_pasid_table function' \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).