LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Michael Kelley <mikelley@microsoft.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Tianyu Lan <ltykernel@gmail.com>,
	KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	Dexuan Cui <decui@microsoft.com>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"will@kernel.org" <will@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>, "x86@kernel.org" <x86@kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"luto@kernel.org" <luto@kernel.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>,
	"boris.ostrovsky@oracle.com" <boris.ostrovsky@oracle.com>,
	"jgross@suse.com" <jgross@suse.com>,
	"sstabellini@kernel.org" <sstabellini@kernel.org>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"jejb@linux.ibm.com" <jejb@linux.ibm.com>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"arnd@arndb.de" <arnd@arndb.de>,
	"m.szyprowski@samsung.com" <m.szyprowski@samsung.com>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"brijesh.singh@amd.com" <brijesh.singh@amd.com>,
	"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	Tianyu Lan <Tianyu.Lan@microsoft.com>,
	"pgonda@google.com" <pgonda@google.com>,
	"martin.b.radev@gmail.com" <martin.b.radev@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"rppt@kernel.org" <rppt@kernel.org>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	"aneesh.kumar@linux.ibm.com" <aneesh.kumar@linux.ibm.com>,
	"krish.sadhukhan@oracle.com" <krish.sadhukhan@oracle.com>,
	"saravanand@fb.com" <saravanand@fb.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"rientjes@google.com" <rientjes@google.com>,
	"ardb@kernel.org" <ardb@kernel.org>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	vkuznets <vkuznets@redhat.com>,
	"parri.andrea@gmail.com" <parri.andrea@gmail.com>,
	"dave.hansen@intel.com" <dave.hansen@intel.com>
Subject: RE: [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support
Date: Thu, 2 Sep 2021 15:57:24 +0000	[thread overview]
Message-ID: <MWHPR21MB1593060DCFD854FDA14604D3D7CE9@MWHPR21MB1593.namprd21.prod.outlook.com> (raw)
In-Reply-To: <20210902075939.GB14986@lst.de>

From: Christoph Hellwig <hch@lst.de> Sent: Thursday, September 2, 2021 1:00 AM
> 
> On Tue, Aug 31, 2021 at 05:16:19PM +0000, Michael Kelley wrote:
> > As a quick overview, I think there are four places where the
> > shared_gpa_boundary must be applied to adjust the guest physical
> > address that is used.  Each requires mapping a corresponding
> > virtual address range.  Here are the four places:
> >
> > 1)  The so-called "monitor pages" that are a core communication
> > mechanism between the guest and Hyper-V.  These are two single
> > pages, and the mapping is handled by calling memremap() for
> > each of the two pages.  See Patch 7 of Tianyu's series.
> 
> Ah, interesting.
> 
> > 3)  The network driver send and receive buffers.  vmap_phys_range()
> > should work here.
> 
> Actually it won't.  The problem with these buffers is that they are
> physically non-contiguous allocations.  

Indeed you are right.  These buffers are allocated with vzalloc().

> We really have two sensible options:
> 
>  1) use vmap_pfn as in the current series.  But in that case I think
>     we should get rid of the other mapping created by vmalloc.  I
>     though a bit about finding a way to apply the offset in vmalloc
>     itself, but I think it would be too invasive to the normal fast
>     path.  So the other sub-option would be to allocate the pages
>     manually (maybe even using high order allocations to reduce TLB
>     pressure) and then remap them

What's the benefit of getting rid of the other mapping created by
vmalloc if it isn't referenced?  Just page table space?  The default sizes
are a 16 Meg receive buffer and a 1 Meg send buffer for each VMbus
channel used by netvsc, and usually the max number of channels
is 8.  So there's 128 Meg of virtual space to be saved on the receive
buffers,  which could be worth it.

Allocating the pages manually is also an option, but we have to
be careful about high order allocations.  While typically these buffers
are allocated during system boot, these synthetic NICs can be hot
added and removed while the VM is running.   The channel count
can also be changed while the VM is running.  So multiple 16 Meg
receive buffer allocations may need to be done after the system has
been running a long time.

>  2) do away with the contiguous kernel mapping entirely.  This means
>     the simple memcpy calls become loops over kmap_local_pfn.  As
>     I just found out for the send side that would be pretty easy,
>     but the receive side would be more work.  We'd also need to check
>     the performance implications.

Doing away with the contiguous kernel mapping entirely seems like
it would result in fairly messy code to access the buffer.  What's the
benefit of doing away with the mapping?  I'm not an expert on the
netvsc driver, but decoding the incoming packets is already fraught
with complexities because of the nature of the protocol with Hyper-V.
The contiguous kernel mapping at least keeps the basics sane.

> 
> > 4) The swiotlb memory used for bounce buffers.  vmap_phys_range()
> > should work here as well.
> 
> Or memremap if it works for 1.
> 
> > Case #2 above does unusual mapping.  The ring buffer consists of a ring
> > buffer header page, followed by one or more pages that are the actual
> > ring buffer.  The pages making up the actual ring buffer are mapped
> > twice in succession.  For example, if the ring buffer has 4 pages
> > (one header page and three ring buffer pages), the contiguous
> > virtual mapping must cover these seven pages:  0, 1, 2, 3, 1, 2, 3.
> > The duplicate contiguous mapping allows the code that is reading
> > or writing the actual ring buffer to not be concerned about wrap-around
> > because writing off the end of the ring buffer is automatically
> > wrapped-around by the mapping.  The amount of data read or
> > written in one batch never exceeds the size of the ring buffer, and
> > after a batch is read or written, the read or write indices are adjusted
> > to put them back into the range of the first mapping of the actual
> > ring buffer pages.  So there's method to the madness, and the
> > technique works pretty well.  But this kind of mapping is not
> > amenable to using vmap_phys_range().
> 
> Hmm.  Can you point me to where this is mapped?  Especially for the
> classic non-isolated case where no vmap/vmalloc mapping is involved
> at all?

The existing code is in hv_ringbuffer_init() in drivers/hv/ring_buffer.c.
The code hasn't changed in a while, so any recent upstream code tree
is valid to look at.  The memory pages are typically allocated
in vmbus_alloc_ring() in drivers/hv/channel.c.

Michael

  parent reply	other threads:[~2021-09-02 15:57 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-27 17:20 Tianyu Lan
2021-08-27 17:20 ` [PATCH V4 01/13] x86/hyperv: Initialize GHCB page in Isolation VM Tianyu Lan
2021-09-02  0:15   ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 02/13] x86/hyperv: Initialize shared memory boundary in the " Tianyu Lan
2021-09-02  0:15   ` Michael Kelley
2021-09-02  6:35     ` Tianyu Lan
2021-08-27 17:21 ` [PATCH V4 03/13] x86/hyperv: Add new hvcall guest address host visibility support Tianyu Lan
2021-09-02  0:16   ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 04/13] hyperv: Mark vmbus ring buffer visible to host in Isolation VM Tianyu Lan
2021-08-27 17:41   ` Greg KH
2021-08-27 17:44     ` Tianyu Lan
2021-09-02  0:17   ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 05/13] hyperv: Add Write/Read MSR registers via ghcb page Tianyu Lan
2021-08-27 17:41   ` Greg KH
2021-08-27 17:46     ` Tianyu Lan
2021-09-02  3:32   ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 06/13] hyperv: Add ghcb hvcall support for SNP VM Tianyu Lan
2021-09-02  0:20   ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 07/13] hyperv/Vmbus: Add SNP support for VMbus channel initiate message Tianyu Lan
2021-09-02  0:21   ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 08/13] hyperv/vmbus: Initialize VMbus ring buffer for Isolation VM Tianyu Lan
2021-09-02  0:23   ` Michael Kelley
2021-09-02 13:35     ` Tianyu Lan
2021-09-02 16:14       ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 09/13] DMA: Add dma_map_decrypted/dma_unmap_encrypted() function Tianyu Lan
2021-08-27 17:21 ` [PATCH V4 10/13] x86/Swiotlb: Add Swiotlb bounce buffer remap function for HV IVM Tianyu Lan
2021-08-27 17:21 ` [PATCH V4 11/13] hyperv/IOMMU: Enable swiotlb bounce buffer for Isolation VM Tianyu Lan
2021-09-02  1:27   ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 12/13] hv_netvsc: Add Isolation VM support for netvsc driver Tianyu Lan
2021-09-02  2:34   ` Michael Kelley
2021-09-02  4:56     ` Michael Kelley
2021-08-27 17:21 ` [PATCH V4 13/13] hv_storvsc: Add Isolation VM support for storvsc driver Tianyu Lan
2021-09-02  2:08   ` Michael Kelley
2021-08-30 12:00 ` [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support Christoph Hellwig
2021-08-31 15:20   ` Tianyu Lan
2021-09-02  7:51     ` Christoph Hellwig
2021-08-31 17:16   ` Michael Kelley
2021-09-02  7:59     ` Christoph Hellwig
2021-09-02 11:21       ` Tianyu Lan
2021-09-02 15:57       ` Michael Kelley [this message]
2021-09-14 14:41         ` Tianyu Lan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=MWHPR21MB1593060DCFD854FDA14604D3D7CE9@MWHPR21MB1593.namprd21.prod.outlook.com \
    --to=mikelley@microsoft.com \
    --cc=Tianyu.Lan@microsoft.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=brijesh.singh@amd.com \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=decui@microsoft.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=haiyangz@microsoft.com \
    --cc=hannes@cmpxchg.org \
    --cc=hch@lst.de \
    --cc=hpa@zytor.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jejb@linux.ibm.com \
    --cc=jgross@suse.com \
    --cc=joro@8bytes.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=konrad.wilk@oracle.com \
    --cc=krish.sadhukhan@oracle.com \
    --cc=kuba@kernel.org \
    --cc=kys@microsoft.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=ltykernel@gmail.com \
    --cc=luto@kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=martin.b.radev@gmail.com \
    --cc=martin.petersen@oracle.com \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=parri.andrea@gmail.com \
    --cc=peterz@infradead.org \
    --cc=pgonda@google.com \
    --cc=rientjes@google.com \
    --cc=robin.murphy@arm.com \
    --cc=rppt@kernel.org \
    --cc=saravanand@fb.com \
    --cc=sstabellini@kernel.org \
    --cc=sthemmin@microsoft.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=vkuznets@redhat.com \
    --cc=wei.liu@kernel.org \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    --subject='RE: [PATCH V4 00/13] x86/Hyper-V: Add Hyper-V Isolation VM support' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).