LKML Archive on lore.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: Lai Jiangshan <laijs@linux.alibaba.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>,
	linux-kernel@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Avi Kivity <avi@redhat.com>,
	kvm@vger.kernel.org
Subject: Re: [PATCH 2/7] KVM: X86: Synchronize the shadow pagetable before link it
Date: Fri, 3 Sep 2021 16:06:30 +0000
Message-ID: <YTJIBr/lm5QU/Z3W@google.com>
In-Reply-To: <c8cd9508-7516-0891-f507-4b869d7e4322@linux.alibaba.com>

On Fri, Sep 03, 2021, Lai Jiangshan wrote:
> 
> On 2021/9/3 07:54, Sean Christopherson wrote:
> > 
> >   trace_get_page:
> > diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> > index 50ade6450ace..5b13918a55c2 100644
> > --- a/arch/x86/kvm/mmu/paging_tmpl.h
> > +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> > @@ -704,6 +704,10 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
> >   			access = gw->pt_access[it.level - 2];
> >   			sp = kvm_mmu_get_page(vcpu, table_gfn, fault->addr,
> >   					      it.level-1, false, access);
> > +			if (sp->unsync_children) {
> > +				kvm_make_all_cpus_request(KVM_REQ_MMU_SYNC, vcpu);
> > +				return RET_PF_RETRY;
> 
> Making KVM_REQ_MMU_SYNC usable remotely is a good idea.
> But if the sp is not linked, the @sp might not be synced even if we
> try many times.  So we should continue to link it.

Hrm, yeah.  The sp has to be linked in at least one mmu, but it may not be linked
in the current mmu, so KVM would have to sync all roots across all current and
previous mmus in order to guarantee the target page is linked.  Eww.
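
To illustrate with a toy model (plain userspace C, not KVM code; toy_page and
both helpers are hypothetical names), a walk from the current root alone can
miss a page that is only reachable from a previous root:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_FANOUT 4

/* Toy shadow page: a node with a few child links (hypothetical layout). */
struct toy_page {
	struct toy_page *child[TOY_FANOUT];
};

/* True if 'target' can be reached by walking down from 'root'. */
static bool toy_reachable(const struct toy_page *root,
			  const struct toy_page *target)
{
	if (!root)
		return false;
	if (root == target)
		return true;
	for (size_t i = 0; i < TOY_FANOUT; i++) {
		if (toy_reachable(root->child[i], target))
			return true;
	}
	return false;
}

/* True if 'target' is reachable from any of the 'n' roots. */
static bool toy_reachable_from_any(struct toy_page *const roots[], size_t n,
				   const struct toy_page *target)
{
	assert(roots != NULL);

	for (size_t i = 0; i < n; i++) {
		if (toy_reachable(roots[i], target))
			return true;
	}
	return false;
}
```

If the page hangs only off a previous root, toy_reachable() on the current root
returns false while toy_reachable_from_any() over {current, previous} returns
true, which is why a sync of just the current root would not be enough.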

> But if we continue to link it, KVM_REQ_MMU_SYNC should be extended to
> sync all roots (current root and prev_roots).  And maybe add a
> KVM_REQ_MMU_SYNC_CURRENT for current root syncing.
> 
> It is not going to be simple.  I have a new way to sync pages
> that also fixes the problem, but it includes several non-fix patches.
> 
> We need to fix this problem in the simplest way.  In my patch
> mmu_sync_children() has a @root argument.  I think we can disallow
> releasing the lock when @root is false. Is it OK?

With a caveat, it should work.  I was exploring that option before the remote
sync idea.

The danger is inducing a stall in the host (RCU, etc...) if sp is an upper-level
entry, e.g. with 5-level paging it can even be a PML4.  My thought for that is to
skip the yield if there are fewer than N unsync children remaining, and then bail
out if the caller doesn't allow yielding.  If mmu_sync_children() fails, restart
the guest and go through the entire page fault path.  Worst case scenario, it will
take a "few" rounds for the vCPU to finally resolve the page fault.
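
A rough userspace model of that policy, purely as a sketch: the toy_* names,
the threshold value and the "budget" stand-in for need_resched() are all
assumptions, and it syncs one child per iteration instead of walking batches
the way the real code does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical value; the actual threshold is deliberately left open. */
#define TOY_YIELD_THRESHOLD 4

/* Toy shadow page: only the number of unsync children matters here. */
struct toy_sp {
	int unsync_children;
};

/* Toy host: "needs to reschedule" once 'budget' syncs have been spent. */
struct toy_host {
	int quantum;	/* budget granted after each yield */
	int budget;	/* syncs left before the host wants the CPU back */
	int yields;	/* how many times the lock was dropped */
};

static bool toy_need_resched(const struct toy_host *host)
{
	return host->budget <= 0;
}

/* Returns 0 when everything is synced, -EINTR if it had to bail early. */
static int toy_sync_children(struct toy_sp *parent, struct toy_host *host,
			     bool can_yield)
{
	assert(parent && host);

	while (parent->unsync_children > 0) {
		parent->unsync_children--;	/* "sync" one child */
		host->budget--;

		/* Below the threshold, just finish up instead of yielding. */
		if (parent->unsync_children > TOY_YIELD_THRESHOLD &&
		    toy_need_resched(host)) {
			if (!can_yield)
				return -EINTR;

			host->yields++;		/* drop and retake the lock */
			host->budget = host->quantum;
		}
	}
	return 0;
}
```

With 10 unsync children and a budget of 3, a !can_yield caller gets -EINTR
with 7 children left; because that progress is kept, a second attempt with a
fresh budget runs to completion, which is the "few rounds" case above.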

Regarding params, please use "can_yield" instead of "root" to match similar logic
in the TDP MMU, and return an int instead of a bool.

Thanks!

---
 arch/x86/kvm/mmu/mmu.c         | 18 ++++++++++++------
 arch/x86/kvm/mmu/paging_tmpl.h |  3 +++
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 4853c033e6ce..5be990cdb2be 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2024,8 +2024,8 @@ static void mmu_pages_clear_parents(struct mmu_page_path *parents)
 	} while (!sp->unsync_children);
 }

-static void mmu_sync_children(struct kvm_vcpu *vcpu,
-			      struct kvm_mmu_page *parent)
+static int mmu_sync_children(struct kvm_vcpu *vcpu,
+			     struct kvm_mmu_page *parent, bool can_yield)
 {
 	int i;
 	struct kvm_mmu_page *sp;
@@ -2050,7 +2050,15 @@ static void mmu_sync_children(struct kvm_vcpu *vcpu,
 			flush |= kvm_sync_page(vcpu, sp, &invalid_list);
 			mmu_pages_clear_parents(&parents);
 		}
-		if (need_resched() || rwlock_needbreak(&vcpu->kvm->mmu_lock)) {
+		/*
+		 * Don't yield if there are fewer than <N> unsync children
+		 * remaining, just finish up and get out.
+		 */
+		if (parent->unsync_children > SOME_ARBITRARY_THRESHOLD &&
+		    (need_resched() || rwlock_needbreak(&vcpu->kvm->mmu_lock))) {
+			if (!can_yield)
+				return -EINTR;
+
 			kvm_mmu_flush_or_zap(vcpu, &invalid_list, false, flush);
 			cond_resched_rwlock_write(&vcpu->kvm->mmu_lock);
 			flush = false;
@@ -2058,6 +2066,7 @@ static void mmu_sync_children(struct kvm_vcpu *vcpu,
 	}

 	kvm_mmu_flush_or_zap(vcpu, &invalid_list, false, flush);
+	return 0;
 }

 static void __clear_sp_write_flooding_count(struct kvm_mmu_page *sp)
@@ -2143,9 +2152,6 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 			kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
 		}

-		if (sp->unsync_children)
-			kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
-
 		__clear_sp_write_flooding_count(sp);

 trace_get_page:
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 50ade6450ace..2ff123ec0d64 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -704,6 +704,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 			access = gw->pt_access[it.level - 2];
 			sp = kvm_mmu_get_page(vcpu, table_gfn, fault->addr,
 					      it.level-1, false, access);
+			if (sp->unsync_children &&
+			    mmu_sync_children(vcpu, sp, false))
+				return RET_PF_RETRY;
 		}

 		/*
--
