LKML Archive on lore.kernel.org
From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Auger Eric <eric.auger@redhat.com>
Cc: iommu@lists.linux-foundation.org,
	LKML <linux-kernel@vger.kernel.org>,
	Joerg Roedel <joro@8bytes.org>,
	David Woodhouse <dwmw2@infradead.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Jean-Philippe Brucker <jean-philippe.brucker@arm.com>,
	Yi Liu <yi.l.liu@intel.com>, "Tian, Kevin" <kevin.tian@intel.com>,
	Raj Ashok <ashok.raj@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Andriy Shevchenko <andriy.shevchenko@linux.intel.com>,
	jacob.jun.pan@linux.intel.com
Subject: Re: [PATCH v3 04/16] ioasid: Add custom IOASID allocator
Date: Thu, 23 May 2019 08:40:19 -0700
Message-ID: <20190523084019.7f940aa5@jacob-builder>
In-Reply-To: <a33797e9-d34b-b0a9-4f39-700dce8252b3@redhat.com>

On Thu, 23 May 2019 09:14:07 +0200
Auger Eric <eric.auger@redhat.com> wrote:

> Hi Jacob,
> 
> On 5/22/19 9:42 PM, Jacob Pan wrote:
> > On Tue, 21 May 2019 11:55:55 +0200
> > Auger Eric <eric.auger@redhat.com> wrote:
> >   
> >> Hi Jacob,
> >>
> >> On 5/4/19 12:32 AM, Jacob Pan wrote:  
> >>> Sometimes, IOASID allocation must be handled by platform specific
> >>> code. The use cases are guest vIOMMU and pvIOMMU where IOASIDs
> >>> need to be allocated by the host via enlightened or paravirt
> >>> interfaces.
> >>>
> >>> This patch adds an extension to the IOASID allocator APIs such
> >>> that platform drivers can register a custom allocator, possibly
> >>> at boot time, to take over the allocation. Xarray is still used
> >>> for tracking and searching purposes internal to the IOASID code.
> >>> Private data of an IOASID can also be set after the allocation.
> >>>
> >>> There can be multiple custom allocators registered but only one is
> >>> used at a time. In case of hot removal of devices that provides
> >>> the allocator, all IOASIDs must be freed prior to unregistering
> >>> the allocator. Default XArray based allocator cannot be mixed with
> >>> custom allocators, i.e. custom allocators will not be used if
> >>> there are outstanding IOASIDs allocated by the default XA
> >>> allocator.
> >>>
> >>> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> >>> ---
> >>>  drivers/iommu/ioasid.c | 125
> >>> +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed,
> >>> 125 insertions(+)
> >>>
> >>> diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
> >>> index 99f5e0a..ed2915a 100644
> >>> --- a/drivers/iommu/ioasid.c
> >>> +++ b/drivers/iommu/ioasid.c
> >>> @@ -17,6 +17,100 @@ struct ioasid_data {
> >>>  };
> >>>  
> >>>  static DEFINE_XARRAY_ALLOC(ioasid_xa);
> >>> +static DEFINE_MUTEX(ioasid_allocator_lock);
> >>> +static struct ioasid_allocator *active_custom_allocator;
> >>> +
> >>> +static LIST_HEAD(custom_allocators);
> >>> +/*
> >>> + * A flag to track if ioasid default allocator is in use, this
> >>> will
> >>> + * prevent custom allocator from being used. The reason is that
> >>> custom allocator
> >>> + * must have unadulterated space to track private data with
> >>> xarray, there cannot
> >>> + * be a mix between default and custom allocated IOASIDs.
> >>> + */
> >>> +static int default_allocator_active;
> >>> +
> >>> +/**
> >>> + * ioasid_register_allocator - register a custom allocator
> >>> + * @allocator: the custom allocator to be registered
> >>> + *
> >>> + * Custom allocators take precedence over the default xarray
> >>> based allocator.
> >>> + * Private data associated with the ASID are managed by ASID
> >>> common code
> >>> + * similar to data stored in xa.
> >>> + *
> >>> + * There can be multiple allocators registered but only one is
> >>> active. In case
> >>> + * of runtime removal of a custom allocator, the next one is
> >>> activated based
> >>> + * on the registration ordering.
> >>> + */
> >>> +int ioasid_register_allocator(struct ioasid_allocator *allocator)
> >>> +{
> >>> +	struct ioasid_allocator *pallocator;
> >>> +	int ret = 0;
> >>> +
> >>> +	if (!allocator)
> >>> +		return -EINVAL;    
> >> is it really necessary? Isn't it the caller's responsibility?
> > makes sense. will remove this one and below.  
> >>> +
> >>> +	mutex_lock(&ioasid_allocator_lock);
> >>> +	/*
> >>> +	 * No particular preference since all custom allocators
> >>> end up calling
> >>> +	 * the host to allocate IOASIDs. We activate the first
> >>> one and keep
> >>> +	 * the later registered allocators in a list in case the
> >>> first one gets
> >>> +	 * removed due to hotplug.
> >>> +	 */
> >>> +	if (list_empty(&custom_allocators))
> >>> +		active_custom_allocator = allocator;
> >>> +	else {
> >>> +		/* Check if the allocator is already registered
> >>> */
> >>> +		list_for_each_entry(pallocator,
> >>> &custom_allocators, list) {
> >>> +			if (pallocator == allocator) {
> >>> +				pr_err("IOASID allocator already
> >>> registered\n");
> >>> +				ret = -EEXIST;
> >>> +				goto out_unlock;
> >>> +			}
> >>> +		}
> >>> +	}
> >>> +	list_add_tail(&allocator->list, &custom_allocators);
> >>> +
> >>> +out_unlock:
> >>> +	mutex_unlock(&ioasid_allocator_lock);
> >>> +	return ret;
> >>> +}
> >>> +EXPORT_SYMBOL_GPL(ioasid_register_allocator);
> >>> +
> >>> +/**
> >>> + * ioasid_unregister_allocator - Remove a custom IOASID allocator
> >>> + * @allocator: the custom allocator to be removed
> >>> + *
> >>> + * Remove an allocator from the list, activate the next allocator
> >>> in
> >>> + * the order it was registered.
> >>> + */
> >>> +void ioasid_unregister_allocator(struct ioasid_allocator
> >>> *allocator) +{
> >>> +	if (!allocator)
> >>> +		return;    
> >> is it really necessary?  
> >>> +
> >>> +	if (list_empty(&custom_allocators)) {
> >>> +		pr_warn("No custom IOASID allocators active!\n");
> >>> +		return;
> >>> +	}
> >>> +
> >>> +	mutex_lock(&ioasid_allocator_lock);
> >>> +	list_del(&allocator->list);
> >>> +	if (list_empty(&custom_allocators)) {
> >>> +		pr_info("No custom IOASID allocators\n");
> >>> +		/*
> >>> +		 * All IOASIDs should have been freed before the
> >>> last custom
> >>> +		 * allocator is unregistered. Unless default
> >>> allocator is in
> >>> +		 * use.
> >>> +		 */
> >>> +		BUG_ON(!xa_empty(&ioasid_xa)
> >>> && !default_allocator_active);
> >>> +		active_custom_allocator = NULL;
> >>> +	} else if (allocator == active_custom_allocator) {    
> >> In case you are removing the active custom allocator don't you also
> >> need to check that all ioasids were freed. Otherwise you are likely
> >> to switch to a different allocator whereas the asid space is
> >> partially populated.  
> > The assumption is that all custom allocators on the same guest will
> > end up calling the same host allocator. Having multiple custom
> > allocators in the list is just a way to support multiple (p)vIOMMUs
> > with hotplug. Therefore, we neither can nor need to free all PASIDs
> > when one custom allocator goes away. This is a different situation than
> > switching between default allocator and custom allocator, where
> > custom allocator has to start with a clean space.  
> Although I understand your specific use case, this framework may have
> other users, where custom allocators behave differently.
> 
> Also the commit msg says:"
> In case of hot removal of devices that provides the
> allocator, all IOASIDs must be freed prior to unregistering the
> allocator."
> 
Right, the commit message and the code are inconsistent.
Consider the following scenario on a single guest with two vIOMMUs:
1. vIOMMU1 registers allocator A1 first, so A1 becomes active.
2. vIOMMU2 registers allocator A2, which is only stored in the
   allocator list.
3. A device belonging to vIOMMU1 calls bind_sva() and allocates PASID1
   from A1.
4. A device belonging to vIOMMU2 calls bind_sva() and allocates PASID2,
   also from A1.
5. vIOMMU1 is hot removed: PASID1 is freed, then A1 is unregistered.
6. The IOASID framework removes A1 and installs A2 as the active
   allocator, but PASID2 is still in use. Freeing PASID2 at this point
   would be unnecessarily disruptive.
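
To make the steps concrete, here is a small userspace model of the
sequence above. It is only a sketch, not kernel code: the model_* names
and the fixed-size list are made up for illustration, and it counts
outstanding PASIDs instead of tracking them in an xarray.

```c
/* Userspace model of the scenario; all model_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_ALLOCATORS 4

struct model_allocator {
	const char *name;
};

struct model_state {
	struct model_allocator *list[MODEL_MAX_ALLOCATORS]; /* registration order */
	size_t nr;                      /* number of registered allocators */
	struct model_allocator *active; /* allocator currently serving requests */
	int outstanding;                /* PASIDs allocated and not yet freed */
};

/* The first registered allocator becomes active; later ones only queue up. */
static int model_register(struct model_state *s, struct model_allocator *a)
{
	if (s->nr == MODEL_MAX_ALLOCATORS)
		return -1;
	s->list[s->nr++] = a;
	if (!s->active)
		s->active = a;
	return 0;
}

/* Every allocation goes through whichever allocator is active. */
static struct model_allocator *model_alloc(struct model_state *s)
{
	if (!s->active)
		return NULL;
	s->outstanding++;
	return s->active;
}

static void model_free(struct model_state *s)
{
	if (s->outstanding > 0)
		s->outstanding--;
}

/* Remove @a; if it was active, promote the oldest remaining allocator. */
static void model_unregister(struct model_state *s, struct model_allocator *a)
{
	size_t i, j;

	for (i = 0; i < s->nr; i++) {
		if (s->list[i] != a)
			continue;
		for (j = i; j + 1 < s->nr; j++)
			s->list[j] = s->list[j + 1];
		s->nr--;
		break;
	}
	if (s->active == a)
		s->active = s->nr ? s->list[0] : NULL;
}

/* Replay steps 1-6; returns the PASIDs still live when A2 takes over. */
int model_replay(struct model_state *s, struct model_allocator *a1,
		 struct model_allocator *a2)
{
	model_register(s, a1);   /* step 1 */
	model_register(s, a2);   /* step 2 */
	model_alloc(s);          /* step 3: PASID1, served by A1 */
	model_alloc(s);          /* step 4: PASID2, also served by A1 */
	model_free(s);           /* step 5: free PASID1 ... */
	model_unregister(s, a1); /* ... then unregister A1 */
	assert(s->active == a2); /* step 6: A2 is now the active allocator */
	return s->outstanding;
}
```

Replaying the steps on a zeroed model_state leaves one PASID
outstanding at the point where A2 becomes active, which is exactly the
case the current code does not handle.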

I can think of a solution:
 - Add a flag, IOASID_ALLOC_RETAIN, that is passed when registering a
   custom ioasid allocator. It means that all outstanding PASIDs are
   retained when switching to another custom allocator. Of course this
   does not cover switching to the default allocator, which does not
   have the RETAIN flag.

 - For allocators that do not have this flag, their PASIDs must be
   freed upon unregistering.
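
Roughly, the check at unregister time could look like the sketch below.
This is an untested userspace illustration, not a proposed patch: only
the IOASID_ALLOC_RETAIN name comes from the idea above, while its bit
value, the sketch_* names and the flags field are assumptions. It also
assumes that the allocator taking over must carry the flag as well,
which is one possible reading of the rule.

```c
/* Userspace sketch of the proposed flag; sketch_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IOASID_ALLOC_RETAIN (1u << 0) /* bit value is an assumption */

struct sketch_allocator {
	unsigned int flags;
};

/*
 * Decide whether @old may be unregistered right now.
 * @next is the custom allocator that would take over, or NULL when the
 * framework would fall back to the default allocator.
 * @outstanding is the number of PASIDs that have not been freed yet.
 */
bool sketch_may_unregister(const struct sketch_allocator *old,
			   const struct sketch_allocator *next,
			   unsigned int outstanding)
{
	assert(old != NULL);

	if (outstanding == 0)
		return true;  /* clean space, any switch is fine */
	if (next == NULL)
		return false; /* the default allocator never has RETAIN */

	/* Assumption: both sides must declare that PASIDs can be retained. */
	return (old->flags & IOASID_ALLOC_RETAIN) &&
	       (next->flags & IOASID_ALLOC_RETAIN);
}
```

With this, the scenario above passes (A1 and A2 both set the flag and
PASID2 stays allocated), while an allocator without the flag still has
to free everything first.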

Any thoughts?

Jacob

> Thanks
> 
> Eric
> > 
> >    
> >>> +		active_custom_allocator =
> >>> list_entry(&custom_allocators, struct ioasid_allocator, list);
> >>> +		pr_info("IOASID allocator changed");
> >>> +	}
> >>> +	mutex_unlock(&ioasid_allocator_lock);
> >>> +}
> >>> +EXPORT_SYMBOL_GPL(ioasid_unregister_allocator);
> >>>  
> >>>  /**
> >>>   * ioasid_set_data - Set private data for an allocated ioasid
> >>> @@ -68,6 +162,29 @@ ioasid_t ioasid_alloc(struct ioasid_set *set,
> >>> ioasid_t min, ioasid_t max, data->set = set;
> >>>  	data->private = private;
> >>>  
> >>> +	mutex_lock(&ioasid_allocator_lock);
> >>> +	/*
> >>> +	 * Use custom allocator if available, otherwise use
> >>> default.
> >>> +	 * However, if there are active IOASIDs already been
> >>> allocated by default
> >>> +	 * allocator, custom allocator cannot be used.
> >>> +	 */
> >>> +	if (!default_allocator_active &&
> >>> active_custom_allocator) {
> >>> +		id = active_custom_allocator->alloc(min, max,
> >>> active_custom_allocator->pdata);
> >>> +		if (id == INVALID_IOASID) {
> >>> +			pr_err("Failed ASID allocation by custom
> >>> allocator\n");
> >>> +			mutex_unlock(&ioasid_allocator_lock);
> >>> +			goto exit_free;
> >>> +		}
> >>> +		/*
> >>> +		 * Use XA to manage private data also sanity
> >>> check custom
> >>> +		 * allocator for duplicates.
> >>> +		 */
> >>> +		min = id;
> >>> +		max = id + 1;
> >>> +	} else
> >>> +		default_allocator_active = 1;    
> >> nit: true?  
> > yes, i can turn default_allocator_active into a bool type.
> >   
> >>> +	mutex_unlock(&ioasid_allocator_lock);
> >>> +
> >>>  	if (xa_alloc(&ioasid_xa, &id, data, XA_LIMIT(min, max), GFP_KERNEL)) {
> >>>  		pr_err("Failed to alloc ioasid from %d to %d\n", min, max);
> >>>  		goto exit_free;
> >>> @@ -91,9 +208,17 @@ void ioasid_free(ioasid_t ioasid)
> >>>  {
> >>>  	struct ioasid_data *ioasid_data;
> >>>  
> >>> +	mutex_lock(&ioasid_allocator_lock);
> >>> +	if (active_custom_allocator)
> >>> +		active_custom_allocator->free(ioasid,
> >>> active_custom_allocator->pdata);
> >>> +	mutex_unlock(&ioasid_allocator_lock);
> >>> +
> >>>  	ioasid_data = xa_erase(&ioasid_xa, ioasid);
> >>>  
> >>>  	kfree_rcu(ioasid_data, rcu);
> >>> +
> >>> +	if (xa_empty(&ioasid_xa))
> >>> +		default_allocator_active = 0;    
> >> Isn't it racy? What if an xa_alloc occurs in between?
> >>
> >>  
> > yes, i will move it under the mutex. Thanks.  
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(ioasid_free);
> >>>  
> >>>     
> >>
> >> Thanks
> >>
> >> Eric  
> > 
> > [Jacob Pan]
> >   

[Jacob Pan]
