From: Lai Jiangshan
To: Sean Christopherson
Cc: Lai Jiangshan, linux-kernel@vger.kernel.org, Paolo Bonzini,
    Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
    Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org,
    "H. Peter Anvin", Marcelo Tosatti, Avi Kivity, kvm@vger.kernel.org
Subject: Re: [PATCH 2/7] KVM: X86: Synchronize the shadow pagetable before link it
Date: Sat, 4 Sep 2021 01:00:33 +0800
References: <20210824075524.3354-1-jiangshanlai@gmail.com>
    <20210824075524.3354-3-jiangshanlai@gmail.com>
    <7067bec0-8a15-1a18-481e-e2ea79575dcf@linux.alibaba.com>

On 2021/9/4 00:40, Sean Christopherson wrote:
> On Sat, Sep 04, 2021, Lai Jiangshan wrote:
>>
>> On 2021/9/4 00:06, Sean Christopherson wrote:
>>
>>> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
>>> index 50ade6450ace..2ff123ec0d64 100644
>>> --- a/arch/x86/kvm/mmu/paging_tmpl.h
>>> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
>>> @@ -704,6 +704,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
>>>  			access = gw->pt_access[it.level - 2];
>>>  			sp = kvm_mmu_get_page(vcpu, table_gfn, fault->addr,
>>>  					      it.level-1, false, access);
>>> +			if (sp->unsync_children &&
>>> +			    mmu_sync_children(vcpu, sp, false))
>>> +				return RET_PF_RETRY;
>>
>> It was like my first (unsent) fix.  Just return RET_PF_RETRY when break.
>>
>> And then I thought that it'd be better to retry fetching directly rather than
>> retry guest when the conditions are still valid/unchanged to avoid all the
>> next guest page walking and GUP().  Although the code does not check all
>> conditions such as interrupt event pending. (we can add that too)
>
> But not in a bug fix that needs to go to stable branches.

Good point, it is too complicated for a fix. I accept just "return RET_PF_RETRY"
(and there is no need for "SOME_ARBITRARY_THRESHOLD").

Is that OK?  I will update the patch accordingly.

>
>> I think it is a good design to allow break mmu_lock when mmu is handling
>> heavy work.
>
> I don't disagree in principle, but I question the relevance/need.  I doubt this
> code is relevant to nested TDP performance as hypervisors generally don't do the
> type of PTE manipulations that would lead to linking an existing unsync sp.  And
> for legacy shadow paging, my preference would be to put it into maintenance-only
> mode as much as possible.  I'm not dead set against new features/functionality
> for shadow paging, but for something like dropping mmu_lock in the page fault path,
> IMO there needs to be performance numbers to justify such a change.
>

I understood the concern and the relevance/need.
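
For reference, the control flow settled on in this thread (sync the unsync
children before linking, and return a retry if the sync does not complete)
can be sketched in plain userspace C. This is only a toy model, not the
kernel code: every toy_* name is hypothetical, and reading a nonzero return
from the sync helper as "the sync was interrupted" is an assumption made
for illustration.

```c
#include <stdbool.h>

/* Toy stand-ins; none of these are the real KVM definitions. */
enum toy_pf_result { TOY_PF_FIXED, TOY_PF_RETRY };

struct toy_shadow_page {
	int unsync_children;	/* children that are still out of sync */
};

/*
 * Hypothetical sync helper. Returns true when the sync was interrupted
 * and left unfinished, false once every child is back in sync.
 */
static bool toy_sync_children(struct toy_shadow_page *sp, bool interrupted)
{
	if (interrupted)
		return true;	/* nothing was synced, caller must not link */

	sp->unsync_children = 0;
	return false;
}

/* Sync before linking; give up with RETRY if the sync did not complete. */
static enum toy_pf_result toy_fetch(struct toy_shadow_page *sp, bool interrupted)
{
	if (sp->unsync_children && toy_sync_children(sp, interrupted))
		return TOY_PF_RETRY;	/* simple path: let the fault happen again */

	/* ... sp would be linked into its parent here ... */
	return TOY_PF_FIXED;
}
```

Calling toy_fetch() on a page with unsync children and interrupted set to
true yields TOY_PF_RETRY and leaves the page untouched; a second call with
interrupted set to false completes the sync and yields TOY_PF_FIXED. The
alternative discussed above, retrying the fetch in place instead of going
back to the guest, would turn the early return into a loop, which is the
extra complexity being kept out of the stable fix.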