LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: Liran Alon <liran.alon@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Alexandre Chartre <alexandre.chartre@oracle.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Radim Krcmar <rkrcmar@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	kvm list <kvm@vger.kernel.org>, X86 ML <x86@kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	jan.setjeeilers@oracle.com, Jonathan Adams <jwadams@google.com>
Subject: Re: [RFC KVM 00/27] KVM Address Space Isolation
Date: Mon, 13 May 2019 19:07:36 -0700	[thread overview]
Message-ID: <CALCETrXK8+tUxNA=iVDse31nFRZyiQYvcrQxV1JaidhnL4GC0w@mail.gmail.com> (raw)
In-Reply-To: <C2A30CC6-1459-4182-B71A-D8FF121A19F2@oracle.com>

On Mon, May 13, 2019 at 2:09 PM Liran Alon <liran.alon@oracle.com> wrote:
>
>
>
> > On 13 May 2019, at 21:17, Andy Lutomirski <luto@kernel.org> wrote:
> >
> >> I expect that the KVM address space can eventually be expanded to include
> >> the ioctl syscall entries. By doing so, and also adding the KVM page table
> >> to the process userland page table (which should be safe to do because the
> >> KVM address space doesn't have any secret), we could potentially handle the
> >> KVM ioctl without having to switch to the kernel pagetable (thus effectively
> >> eliminating KPTI for KVM). Then the only overhead would be if a VM-Exit has
> >> to be handled using the full kernel address space.
> >>
> >
> > In the hopefully common case where a VM exits and then gets re-entered
> > without needing to load full page tables, what code actually runs?
> > I'm trying to understand when the optimization of not switching is
> > actually useful.
> >
> > Allowing ioctl() without switching to kernel tables sounds...
> > extremely complicated.  It also makes the dubious assumption that user
> > memory contains no secrets.
>
> Let me attempt to clarify what we were thinking when creating this patch series:
>
> 1) It is never safe to execute one hyperthread inside guest while it’s sibling hyperthread runs in a virtual address space which contains secrets of host or other guests.
> This is because we assume that using some speculative gadget (such as half-Spectrev2 gadget), it will be possible to populate *some* CPU core resource which could then be *somehow* leaked by the hyperthread running inside guest. In case of L1TF, this would be data populated to the L1D cache.
>
> 2) Because of (1), every time a hyperthread runs inside host kernel, we must make sure it’s sibling is not running inside guest. i.e. We must kick the sibling hyperthread outside of guest using IPI.
>
> 3) From (2), we should have theoretically deduced that for every #VMExit, there is a need to kick the sibling hyperthread also outside of guest until the #VMExit is completed. Such a patch series was implemented at some point but it had (obviously) significant performance hit.
>
>
4) The main goal of this patch series is to preserve (2), but to avoid
the overhead specified in (3).
>
> The way this patch series achieves (4) is by observing that during the run of a VM, most #VMExits can be handled rather quickly and locally inside KVM and doesn’t need to reference any data that is not relevant to this VM or KVM code. Therefore, if we will run these #VMExits in an isolated virtual address space (i.e. KVM isolated address space), there is no need to kick the sibling hyperthread from guest while these #VMExits handlers run.

Thanks!  This clarifies a lot of things.

> The hope is that the very vast majority of #VMExit handlers will be able to completely run without requiring to switch to full address space. Therefore, avoiding the performance hit of (2).
> However, for the very few #VMExits that does require to run in full kernel address space, we must first kick the sibling hyperthread outside of guest and only then switch to full kernel address space and only once all hyperthreads return to KVM address space, then allow then to enter into guest.

What exactly does "kick" mean in this context?  It sounds like you're
going to need to be able to kick sibling VMs from extremely atomic
contexts like NMI and MCE.

  reply	other threads:[~2019-05-14  2:07 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-13 14:38 Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 01/27] kernel: Export memory-management symbols required for KVM address space isolation Alexandre Chartre
2019-05-13 15:15   ` Peter Zijlstra
2019-05-13 15:17     ` Liran Alon
2019-05-13 14:38 ` [RFC KVM 02/27] KVM: x86: Introduce address_space_isolation module parameter Alexandre Chartre
2019-05-13 15:46   ` Andy Lutomirski
2019-05-13 15:55     ` Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 03/27] KVM: x86: Introduce KVM separate virtual address space Alexandre Chartre
2019-05-13 15:45   ` Andy Lutomirski
2019-05-13 16:04     ` Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 04/27] KVM: x86: Switch to KVM address space on entry to guest Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 05/27] KVM: x86: Add handler to exit kvm isolation Alexandre Chartre
2019-05-13 15:49   ` Andy Lutomirski
2019-05-13 16:10     ` Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 06/27] KVM: x86: Exit KVM isolation on IRQ entry Alexandre Chartre
2019-05-13 15:51   ` Andy Lutomirski
2019-05-13 16:28     ` Alexandre Chartre
2019-05-13 18:13       ` Andy Lutomirski
2019-05-14  7:07         ` Peter Zijlstra
2019-05-14  7:58           ` Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 07/27] KVM: x86: Switch to host address space when may access sensitive data Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 08/27] KVM: x86: Optimize branches which checks if address space isolation enabled Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 09/27] kvm/isolation: function to track buffers allocated for the KVM page table Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 10/27] kvm/isolation: add KVM page table entry free functions Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 11/27] kvm/isolation: add KVM page table entry offset functions Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 12/27] kvm/isolation: add KVM page table entry allocation functions Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 13/27] kvm/isolation: add KVM page table entry set functions Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 14/27] kvm/isolation: functions to copy page table entries for a VA range Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 15/27] kvm/isolation: keep track of VA range mapped in KVM address space Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 16/27] kvm/isolation: functions to clear page table entries for a VA range Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 17/27] kvm/isolation: improve mapping copy when mapping is already present Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 18/27] kvm/isolation: function to copy page table entries for percpu buffer Alexandre Chartre
2019-05-13 18:18   ` Andy Lutomirski
2019-05-14  7:09     ` Peter Zijlstra
2019-05-14  8:25       ` Alexandre Chartre
2019-05-14  8:34         ` Andy Lutomirski
2019-05-14  9:41           ` Alexandre Chartre
2019-05-14 15:23             ` Andy Lutomirski
2019-05-14 16:24               ` Alexandre Chartre
2019-05-14 17:05                 ` Peter Zijlstra
2019-05-14 18:09                   ` Sean Christopherson
2019-05-14 20:33                     ` Andy Lutomirski
2019-05-14 21:06                       ` Sean Christopherson
2019-05-14 21:55                         ` Andy Lutomirski
2019-05-14 22:38                           ` Sean Christopherson
2019-05-18  0:05                             ` Jonathan Adams
2019-05-14 20:27                   ` Andy Lutomirski
2019-05-13 14:38 ` [RFC KVM 19/27] kvm/isolation: initialize the KVM page table with core mappings Alexandre Chartre
2019-05-13 15:50   ` Dave Hansen
2019-05-13 16:00     ` Andy Lutomirski
2019-05-13 17:00       ` Alexandre Chartre
2019-05-13 16:46     ` Sean Christopherson
2019-05-13 16:47     ` Alexandre Chartre
2019-05-14 10:26       ` Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 20/27] kvm/isolation: initialize the KVM page table with vmx specific data Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 21/27] kvm/isolation: initialize the KVM page table with vmx VM data Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 22/27] kvm/isolation: initialize the KVM page table with vmx cpu data Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 23/27] kvm/isolation: initialize the KVM page table with the vcpu tasks Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 24/27] kvm/isolation: KVM page fault handler Alexandre Chartre
2019-05-13 15:15   ` Peter Zijlstra
2019-05-13 21:25     ` Liran Alon
2019-05-14  2:02       ` Andy Lutomirski
2019-05-14  7:21         ` Peter Zijlstra
2019-05-14 15:36           ` Alexandre Chartre
2019-05-14 15:43             ` Andy Lutomirski
2019-05-13 16:02   ` Andy Lutomirski
2019-05-13 16:21     ` Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 25/27] kvm/isolation: implement actual KVM isolation enter/exit Alexandre Chartre
2019-05-13 15:16   ` Peter Zijlstra
2019-05-13 16:01   ` Andy Lutomirski
2019-05-13 14:38 ` [RFC KVM 26/27] kvm/isolation: initialize the KVM page table with KVM memslots Alexandre Chartre
2019-05-13 14:38 ` [RFC KVM 27/27] kvm/isolation: initialize the KVM page table with KVM buses Alexandre Chartre
2019-05-13 16:42 ` [RFC KVM 00/27] KVM Address Space Isolation Liran Alon
2019-05-13 18:17 ` Andy Lutomirski
2019-05-13 21:08   ` Liran Alon
2019-05-14  2:07     ` Andy Lutomirski [this message]
2019-05-14  7:37       ` Peter Zijlstra
2019-05-14 21:32         ` Jan Setje-Eilers
2019-05-14  8:05       ` Liran Alon
2019-05-14  7:29     ` Peter Zijlstra
2019-05-14  7:57       ` Liran Alon
2019-05-14  8:33     ` Alexandre Chartre
2019-05-13 19:31 ` Nakajima, Jun
2019-05-13 21:16   ` Liran Alon
     [not found]     ` <D07C8F51-F2DF-4C8B-AB3B-0DFABD5F4C33@intel.com>
2019-05-13 21:53       ` Liran Alon
2019-05-15 12:52 ` Alexandre Chartre

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALCETrXK8+tUxNA=iVDse31nFRZyiQYvcrQxV1JaidhnL4GC0w@mail.gmail.com' \
    --to=luto@kernel.org \
    --cc=alexandre.chartre@oracle.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jan.setjeeilers@oracle.com \
    --cc=jwadams@google.com \
    --cc=konrad.wilk@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=liran.alon@oracle.com \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rkrcmar@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --subject='Re: [RFC KVM 00/27] KVM Address Space Isolation' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).