LKML Archive on lore.kernel.org
* [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt
@ 2021-11-22  1:58 Aili Yao
  2021-11-22 19:13 ` Sean Christopherson
  0 siblings, 1 reply; 9+ messages in thread
From: Aili Yao @ 2021-11-22  1:58 UTC
  To: pbonzini, seanjc, vkuznets, wanpengli, jmattson, joro, tglx,
	mingo, bp, dave.hansen
  Cc: x86, hpa, kvm, linux-kernel, yaoaili

From: Aili Yao <yaoaili@kingsoft.com>

When some physical cores are isolated, they are not necessarily used for
KVM guests. They may be used for other purposes such as DPDK, or some KVM
guests may run on isolated cores while others do not. In these cases the
global pi_inject_timer setting is not enough, and KVM may make the wrong
decision:

In such a scenario, guests without isolated cores are not permitted to
use the VMX preemption timer, and the tscdeadline fastpath is also
disabled. Both lead to a performance penalty.

So check whether vcpu->cpu is isolated; if it is not, don't post the
timer interrupt.

Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
---
 arch/x86/kvm/lapic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 759952dd1222..72dde5532101 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -34,6 +34,7 @@
 #include <asm/delay.h>
 #include <linux/atomic.h>
 #include <linux/jump_label.h>
+#include <linux/sched/isolation.h>
 #include "kvm_cache_regs.h"
 #include "irq.h"
 #include "ioapic.h"
@@ -113,7 +114,8 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
 
 static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
 {
-	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
+	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
+		!housekeeping_cpu(vcpu->cpu, HK_FLAG_TIMER);
 }
 
 bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
-- 
2.25.1
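
For readers following the thread, the effect described in the commit
message can be modelled in user space. This is only a sketch, not kernel
code: struct vcpu_model and its fields are hypothetical stand-ins for
pi_inject_timer, kvm_vcpu_apicv_active(), kvm_mwait_in_guest(),
kvm_x86_ops.set_hv_timer and housekeeping_cpu(), and the hv-timer predicate
follows the pre-patch form of kvm_can_use_hv_timer() quoted later in the
thread.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for the state the kernel helpers consult. */
struct vcpu_model {
	bool pi_inject_timer;   /* global module parameter */
	bool apicv_active;      /* kvm_vcpu_apicv_active(vcpu) */
	bool mwait_in_guest;    /* kvm_mwait_in_guest(vcpu->kvm) */
	bool has_set_hv_timer;  /* kvm_x86_ops.set_hv_timer is non-NULL */
	bool on_housekeeping;   /* housekeeping_cpu(vcpu->cpu, HK_FLAG_TIMER) */
};

/* Before the patch: isolation is not considered at all. */
bool can_post_timer_old(const struct vcpu_model *v)
{
	return v->pi_inject_timer && v->apicv_active;
}

/* With the patch: post only when the vCPU is not on a housekeeping CPU. */
bool can_post_timer_new(const struct vcpu_model *v)
{
	return v->pi_inject_timer && v->apicv_active && !v->on_housekeeping;
}

/* The hv (VMX preemption) timer is refused whenever timer posting is allowed. */
bool can_use_hv_timer(const struct vcpu_model *v, bool can_post)
{
	return v->has_set_hv_timer && !(v->mwait_in_guest || can_post);
}
```

With pi_inject_timer enabled globally, a guest on a housekeeping CPU loses
the hv timer under the old predicate and keeps it under the new one.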



* Re: [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt
  2021-11-22  1:58 [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt Aili Yao
@ 2021-11-22 19:13 ` Sean Christopherson
  2021-11-23  2:57   ` Wanpeng Li
  2021-11-23  8:18   ` Aili Yao
  0 siblings, 2 replies; 9+ messages in thread
From: Sean Christopherson @ 2021-11-22 19:13 UTC
  To: Aili Yao
  Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, tglx, mingo, bp,
	dave.hansen, x86, hpa, kvm, linux-kernel, yaoaili

On Mon, Nov 22, 2021, Aili Yao wrote:
> @@ -113,7 +114,8 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
>  
>  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
>  {
> -	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> +	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
> +		!housekeeping_cpu(vcpu->cpu, HK_FLAG_TIMER);

I don't think this is safe, vcpu->cpu will be -1 if the vCPU isn't scheduled in.
This also doesn't play nice with the admin forcing pi_inject_timer=1.  Not saying
there's a reasonable use case for doing that, but it's supported today and this
would break that behavior.  It would also lead to weird behavior if a vCPU were
migrated on/off a housekeeping vCPU.  Again, probably not a reasonable use case,
but I don't see anything that would outright prevent that behavior.

The existing behavior also feels a bit unsafe as pi_inject_timer is writable while
KVM is running, though I suppose that's orthogonal to this discussion.

Rather than check vcpu->cpu, is there an existing vCPU flag that can be queried,
e.g. KVM_HINTS_REALTIME?

>  }
>  
>  bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
> -- 
> 2.25.1
> 


* Re: [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt
  2021-11-22 19:13 ` Sean Christopherson
@ 2021-11-23  2:57   ` Wanpeng Li
  2021-11-23  4:11     ` yaoaili [么爱利]
  2021-11-23  8:18   ` Aili Yao
  1 sibling, 1 reply; 9+ messages in thread
From: Wanpeng Li @ 2021-11-23  2:57 UTC
  To: Sean Christopherson
  Cc: Aili Yao, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, the arch/x86 maintainers,
	H. Peter Anvin, kvm, LKML, yaoaili

On Tue, 23 Nov 2021 at 03:14, Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Nov 22, 2021, Aili Yao wrote:
> > @@ -113,7 +114,8 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
> >
> >  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
> >  {
> > -     return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> > +     return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
> > +             !housekeeping_cpu(vcpu->cpu, HK_FLAG_TIMER);
>
> I don't think this is safe, vcpu->cpu will be -1 if the vCPU isn't scheduled in.
> This also doesn't play nice with the admin forcing pi_inject_timer=1.  Not saying
> there's a reasonable use case for doing that, but it's supported today and this
> would break that behavior.  It would also lead to weird behavior if a vCPU were
> migrated on/off a housekeeping vCPU.  Again, probably not a reasonable use case,
> but I don't see anything that would outright prevent that behavior.
>
> The existing behavior also feels a bit unsafe as pi_inject_timer is writable while
> KVM is running, though I suppose that's orthogonal to this discussion.
>
> Rather than check vcpu->cpu, is there an existing vCPU flag that can be queried,
> e.g. KVM_HINTS_REALTIME?

How about something like below:

From 67f605120e212384cb3d5788ba8c83f15659503b Mon Sep 17 00:00:00 2001
From: Wanpeng Li <wanpengli@tencent.com>
Date: Tue, 23 Nov 2021 10:36:10 +0800
Subject: [PATCH] KVM: LAPIC: To keep the vCPUs in non-root mode for timer-pi

From: Wanpeng Li <wanpengli@tencent.com>

Commit 0c5f81dad46 ("KVM: LAPIC: Inject timer interrupt via posted
interrupt") mentions that the host admin should tune the guest setup well,
so that vCPUs are placed on isolated pCPUs, with several surplus pCPUs left
for *busy* housekeeping. It is better to disable mwait/hlt/pause vmexits to
keep the vCPUs in non-root mode. However, pCPUs may be isolated for other
purposes such as DPDK, or some guests may be isolated and others not. Add a
kvm_mwait_in_guest() check to kvm_can_post_timer_interrupt(), since timer
posted interrupts bring no benefit without keeping the vCPUs in non-root
mode.

Reported-by: Aili Yao <yaoaili@kingsoft.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kvm/lapic.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 759952dd1222..8257566d44c7 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -113,14 +113,13 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)

 static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
 {
-    return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
+    return pi_inject_timer && kvm_mwait_in_guest(vcpu->kvm) && kvm_vcpu_apicv_active(vcpu);
 }

 bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
 {
     return kvm_x86_ops.set_hv_timer
-           && !(kvm_mwait_in_guest(vcpu->kvm) ||
-            kvm_can_post_timer_interrupt(vcpu));
+           && !kvm_mwait_in_guest(vcpu->kvm);
 }
 EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);


* RE: [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt
  2021-11-23  2:57   ` Wanpeng Li
@ 2021-11-23  4:11     ` yaoaili [么爱利]
  2021-11-23  6:24       ` Wanpeng Li
  0 siblings, 1 reply; 9+ messages in thread
From: yaoaili [么爱利] @ 2021-11-23  4:11 UTC
  To: Wanpeng Li, Sean Christopherson
  Cc: Aili Yao, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, the arch/x86 maintainers,
	H. Peter Anvin, kvm, LKML

> On Tue, 23 Nov 2021 at 03:14, Sean Christopherson <seanjc@google.com>
> wrote:
> >
> > On Mon, Nov 22, 2021, Aili Yao wrote:
> > > @@ -113,7 +114,8 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic
> > > *apic)
> > >
> > >  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)  {
> > > -     return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> > > +     return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
> > > +             !housekeeping_cpu(vcpu->cpu, HK_FLAG_TIMER);
> >
> > I don't think this is safe, vcpu->cpu will be -1 if the vCPU isn't scheduled in.

Yes, vcpu->cpu is -1 before the vCPU is created, but in my environments it
didn't trigger this issue. I need to dig more, thanks!
Maybe I need a validity check here.
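
A minimal sketch of what such a validity check could look like, written as
a user-space model rather than kernel code. housekeeping_cpu_model() and the
housekeeping[] table are hypothetical stand-ins for housekeeping_cpu(), and
treating an unscheduled vCPU (cpu == -1) as "do not post" is an assumption,
not something the thread settled.

```c
#include <stdbool.h>

#define NR_CPUS_MODEL 8

/* Hypothetical table: true means the CPU does housekeeping work. */
static const bool housekeeping[NR_CPUS_MODEL] = {
	true, true, false, false, false, false, false, false,
};

/* Stand-in for housekeeping_cpu(); the caller must pass a valid index. */
bool housekeeping_cpu_model(int cpu)
{
	return housekeeping[cpu];
}

/*
 * Validate the index before the lookup. vcpu_cpu is -1 while the vCPU
 * is not scheduled in, and indexing with it would read out of bounds.
 */
bool can_post_timer_checked(bool pi_inject_timer, bool apicv_active,
			    int vcpu_cpu)
{
	if (vcpu_cpu < 0 || vcpu_cpu >= NR_CPUS_MODEL)
		return false;	/* assumption: no posting without a valid pCPU */

	return pi_inject_timer && apicv_active &&
	       !housekeeping_cpu_model(vcpu_cpu);
}
```

Whether returning false is the right fallback for the -1 case is a policy
question; the sketch only shows where the check would sit.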

> > This also doesn't play nice with the admin forcing pi_inject_timer=1.
> > Not saying there's a reasonable use case for doing that, but it's
> > supported today and this would break that behavior.  It would also
> > lead to weird behavior if a vCPU were migrated on/off a housekeeping
> > vCPU.  Again, probably not a reasonable use case, but I don't see anything
> > that would outright prevent that behavior.

Yes, this is not a common operation, but I did test some scenarios:
1. isolated CPU --> housekeeping CPU
    The isolated guest's timer is on a housekeeping CPU. On migration,
    kvm_can_post_timer_interrupt() returns false, so the timer may be
    migrated to vcpu->cpu. This seems to work in my test.
2. isolated CPU --> isolated CPU
    The isolated guest's timer is on a housekeeping CPU. On migration,
    kvm_can_post_timer_interrupt() returns true, so the timer is not
    migrated.
3. housekeeping CPU --> isolated CPU
    The non-isolated guest's timer is usually on vcpu->cpu. On migration to
    an isolated CPU, kvm_can_post_timer_interrupt() returns true, so the
    timer remains on the same CPU. This seems to work in my test.
4. housekeeping CPU --> housekeeping CPU
    The timer is migrated.
So this does not seem to cause a problem in practice.
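
The four scenarios above can be summarised in a small user-space model. It
is only a sketch of what this reply reports, not of the kernel's timer
migration code: enum cpu_kind, can_post_on() and timer_is_migrated() are
hypothetical names, and pi_inject_timer and APICv are assumed to be enabled
throughout.

```c
#include <stdbool.h>

enum cpu_kind { CPU_ISOLATED, CPU_HOUSEKEEPING };

/* Patched predicate with pi_inject_timer and APICv assumed enabled. */
bool can_post_on(enum cpu_kind dest)
{
	return dest != CPU_HOUSEKEEPING;
}

/*
 * As described in the scenarios, the timer is migrated to the vCPU's new
 * pCPU only when posting is not allowed there; otherwise it stays put.
 */
bool timer_is_migrated(enum cpu_kind src, enum cpu_kind dest)
{
	(void)src;	/* the reported outcome depends only on the destination */
	return !can_post_on(dest);
}
```

In this model the result depends only on where the vCPU lands, which matches
cases 1 to 4 as listed.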

> >
> > The existing behavior also feels a bit unsafe as pi_inject_timer is
> > writable while KVM is running, though I suppose that's orthogonal to this
> > discussion.
> >
> > Rather than check vcpu->cpu, is there an existing vCPU flag that can
> > be queried, e.g. KVM_HINTS_REALTIME?
> 
> How about something like below:
> 
> @@ -113,14 +113,13 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
>
>  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
>  {
> -    return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> +    return pi_inject_timer && kvm_mwait_in_guest(vcpu->kvm) && kvm_vcpu_apicv_active(vcpu);
>  }
>
>  bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
>  {
>      return kvm_x86_ops.set_hv_timer
> -           && !(kvm_mwait_in_guest(vcpu->kvm) ||
> -            kvm_can_post_timer_interrupt(vcpu));
> +           && !kvm_mwait_in_guest(vcpu->kvm);
>  }
>  EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);

This method seems quicker and safer, but I have one question: can
kvm_mwait_in_guest() guarantee that the CPU is isolated? In some production
environments MWAIT is usually disabled in the host, and even in guests with
isolated CPUs. Also, kvm_mwait_in_guest can be set to true for guests whose
CPUs are merely pinned, not isolated.

Thanks


* Re: [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt
  2021-11-23  4:11     ` yaoaili [么爱利]
@ 2021-11-23  6:24       ` Wanpeng Li
  2021-11-23  7:02         ` Aili Yao
  0 siblings, 1 reply; 9+ messages in thread
From: Wanpeng Li @ 2021-11-23  6:24 UTC
  To: yaoaili [么爱利]
  Cc: Sean Christopherson, Aili Yao, Paolo Bonzini, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen,
	the arch/x86 maintainers, H. Peter Anvin, kvm, LKML

On Tue, 23 Nov 2021 at 12:11, yaoaili [么爱利] <yaoaili@kingsoft.com> wrote:
>
> This method seems more quick and safe, but I have one question: Does this kvm_mwait_in_guest
> can guarantee the CPU isolated,  in some production environments and usually,  MWAIT feature is disabled in host
> and even guests with isolated CPUs.  And also we can set guests kvm_mwait_in_guest true with CPUs just pinned, not isolated.

You won't benefit from timer posted interrupts if mwait is not exposed
to the guest, since you can't keep the CPU in non-root mode.
kvm_mwait_in_guest() will not guarantee that the CPU is isolated, but
what is still bothering you?
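
To make the proposal concrete, the two predicates from the patch earlier in
this thread can be modelled in user space. This is a sketch under stated
assumptions, not kernel code: struct guest_model and the _v2 function names
are hypothetical, and each field stands in for the kernel helper named in
its comment.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-guest state the predicates read. */
struct guest_model {
	bool pi_inject_timer;   /* global module parameter */
	bool apicv_active;      /* kvm_vcpu_apicv_active(vcpu) */
	bool mwait_in_guest;    /* kvm_mwait_in_guest(vcpu->kvm) */
	bool has_set_hv_timer;  /* kvm_x86_ops.set_hv_timer is non-NULL */
};

/* Proposed kvm_can_post_timer_interrupt(): requires mwait in the guest. */
bool can_post_timer_v2(const struct guest_model *g)
{
	return g->pi_inject_timer && g->mwait_in_guest && g->apicv_active;
}

/* Proposed kvm_can_use_hv_timer(): depends only on mwait, not on posting. */
bool can_use_hv_timer_v2(const struct guest_model *g)
{
	return g->has_set_hv_timer && !g->mwait_in_guest;
}
```

Under this model the two predicates can never both be true, and a guest
without mwait exposed keeps the hv timer regardless of pi_inject_timer.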

   Wanpeng


* Re: [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt
  2021-11-23  6:24       ` Wanpeng Li
@ 2021-11-23  7:02         ` Aili Yao
  2021-11-23  7:22           ` Wanpeng Li
  0 siblings, 1 reply; 9+ messages in thread
From: Aili Yao @ 2021-11-23  7:02 UTC
  To: Wanpeng Li
  Cc: yaoaili [么爱利],
	Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, the arch/x86 maintainers,
	H. Peter Anvin, kvm, LKML

On Tue, 23 Nov 2021 14:24:19 +0800
Wanpeng Li <kernellwp@gmail.com> wrote:

> You won't benefit from timer posted-interrupt if mwait is not exposed
> to the guest since you can't keep CPU in non-root mode.
> kvm_mwait_in_guest() will not guarantee the CPU is isolated, but
> what's still bothering?

Sorry, did I miss something?

What I have in mind: MWAIT may be disabled in the BIOS, so the host will
use the HLT instruction instead for idle. In such a configuration, MWAIT
in the guest will also be disabled even if you try to set
kvm_mwait_in_guest to true. But HLT and PAUSE may still not exit the
guest, so the posted interrupt still counts?

With the current code, we can migrate a guest between isolated and
housekeeping CPUs, or change the CPU pinning on the fly. We allow this
even though the operation is not commonly used, right?

Thanks!
Aili Yao





* Re: [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt
  2021-11-23  7:02         ` Aili Yao
@ 2021-11-23  7:22           ` Wanpeng Li
  0 siblings, 0 replies; 9+ messages in thread
From: Wanpeng Li @ 2021-11-23  7:22 UTC
  To: Aili Yao
  Cc: yaoaili [么爱利],
	Sean Christopherson, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, the arch/x86 maintainers,
	H. Peter Anvin, kvm, LKML

On Tue, 23 Nov 2021 at 15:04, Aili Yao <yaoaili126@gmail.com> wrote:
>
> On Tue, 23 Nov 2021 14:24:19 +0800
> Wanpeng Li <kernellwp@gmail.com> wrote:
>
> > On Tue, 23 Nov 2021 at 12:11, yaoaili [么爱利] <yaoaili@kingsoft.com>
> > wrote:
> > >
> > > > On Tue, 23 Nov 2021 at 03:14, Sean Christopherson
> > > > <seanjc@google.com> wrote:
> > > > >
> > > > > On Mon, Nov 22, 2021, Aili Yao wrote:
> > > > > > From: Aili Yao <yaoaili@kingsoft.com>
> > > > > >
> > > > > > > When we isolate some physical cores, We may not use them for
> > > > > > kvm guests, We may use them for other purposes like DPDK, or
> > > > > > we can make some kvm guests isolated and some not, the global
> > > > > > judgement pi_inject_timer is not enough; We may make wrong
> > > > > > decisions:
> > > > > >
> > > > > > In such a scenario, the guests without isolated cores will
> > > > > > not be permitted to use vmx preemption timer, and tscdeadline
> > > > > > fastpath also be disabled, both will lead to performance
> > > > > > penalty.
> > > > > >
> > > > > > So check whether the vcpu->cpu is isolated, if not, don't
> > > > > > post timer interrupt.
> > > > > >
> > > > > > Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
> > > > > > ---
> > > > > >  arch/x86/kvm/lapic.c | 4 +++-
> > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > > > > > index 759952dd1222..72dde5532101 100644
> > > > > > --- a/arch/x86/kvm/lapic.c
> > > > > > +++ b/arch/x86/kvm/lapic.c
> > > > > > @@ -34,6 +34,7 @@
> > > > > >  #include <asm/delay.h>
> > > > > >  #include <linux/atomic.h>
> > > > > >  #include <linux/jump_label.h>
> > > > > > +#include <linux/sched/isolation.h>
> > > > > >  #include "kvm_cache_regs.h"
> > > > > >  #include "irq.h"
> > > > > >  #include "ioapic.h"
> > > > > > @@ -113,7 +114,8 @@ static inline u32 kvm_x2apic_id(struct
> > > > > > kvm_lapic *apic)
> > > > > >
> > > > > >  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu
> > > > > > *vcpu)  {
> > > > > > -     return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> > > > > > +     return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
> > > > > > +             !housekeeping_cpu(vcpu->cpu, HK_FLAG_TIMER);
> > > > >
> > > > > I don't think this is safe, vcpu->cpu will be -1 if the vCPU
> > > > > isn't scheduled in.
> > >
> > > Yes, vcpu->cpu is  -1 before vcpu create, but in my environments,
> > > it didn't trigger this issue. I need to dig more, Thanks!
> > > Maybe I need one valid check here.
> > >
> > > > > This also doesn't play nice with the admin forcing
> > > > > pi_inject_timer=1. Not saying there's a reasonable use case for
> > > > > doing that, but it's supported today and this would break that
> > > > > behavior.  It would also lead to weird behavior if a vCPU were
> > > > > migrated on/off a housekeeping vCPU.  Again, probably not a
> > > > > reasonable use case, but I don't see anything
> > > > > > that would outright prevent that behavior.
> > >
> > > > Yes, this is not one common operation, But I did do test some
> > > > scenarios:
> > > > 1. isolated cpu --> housekeeping cpu;
> > > >     isolated guest timer is in housekeeping CPU, for migration,
> > > >     kvm_can_post_timer_interrupt will return false, so the timer may be
> > > >     migrated to vcpu->cpu; This seems works in my test;
> > > > 2. isolated --> isolated
> > > >     Isolated guest timer is in housekeeping cpu, for migration,
> > > >     kvm_can_post_timer_interrupt return true, timer is not migrated
> > > > 3. housekeeping CPU --> isolated CPU
> > > >     non-isolated CPU timer is usually in vcpu->cpu, for migration
> > > >     to isolated, kvm_can_post_timer_interrupt will be true, the timer
> > > >     remain on the same CPU; This seems works in my test;
> > > > 4. housekeeping CPU --> housekeeping CPU
> > > >     timer migrated;
> > > > It seems this is not an affecting problem;
> > >
> > > > >
> > > > > The existing behavior also feels a bit unsafe as
> > > > > pi_inject_timer is writable while KVM is running, though I
> > > > > supposed that's orthogonal to this
> > > > > > discussion.
> > > > >
> > > > > Rather than check vcpu->cpu, is there an existing vCPU flag
> > > > > that can be queried, e.g. KVM_HINTS_REALTIME?
> > > >
> > > > How about something like below:
> > > >
> > > > > From 67f605120e212384cb3d5788ba8c83f15659503b Mon Sep 17 00:00:00 2001
> > > > From: Wanpeng Li <wanpengli@tencent.com>
> > > > Date: Tue, 23 Nov 2021 10:36:10 +0800
> > > > > Subject: [PATCH] KVM: LAPIC: To keep the vCPUs in non-root mode
> > > > > for timer-pi
> > > >
> > > > From: Wanpeng Li <wanpengli@tencent.com>
> > > >
> > > > As commit 0c5f81dad46 (KVM: LAPIC: Inject timer interrupt via
> > > > posted interrupt) mentioned that the host admin should well tune
> > > > the guest setup, so that vCPUs are placed on isolated pCPUs, and
> > > > > with several pCPUs surplus for *busy* housekeeping.
> > > > It is better to disable mwait/hlt/pause vmexits to keep the vCPUs
> > > > in non-root mode. However, we may isolate pCPUs for other purpose
> > > > like DPDK or we can make some guests isolated and others not,
> > > > Let's add the checking kvm_mwait_in_guest() to
> > > > kvm_can_post_timer_interrupt() since we can't benefit from timer
> > > > posted-interrupt w/o keeping the vCPUs in non-root mode.
> > > >
> > > > Reported-by: Aili Yao <yaoaili@kingsoft.com>
> > > > Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> > > > ---
> > > >  arch/x86/kvm/lapic.c | 5 ++---
> > > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > >
> > > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > > > index 759952dd1222..8257566d44c7 100644
> > > > --- a/arch/x86/kvm/lapic.c
> > > > +++ b/arch/x86/kvm/lapic.c
> > > > @@ -113,14 +113,13 @@ static inline u32 kvm_x2apic_id(struct
> > > > kvm_lapic *apic)
> > > >
> > > >  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
> > > > {
> > > > -    return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> > > > +    return pi_inject_timer && kvm_mwait_in_guest(vcpu->kvm) &&
> > > > kvm_vcpu_apicv_active(vcpu);
> > > >  }
> > > >
> > > >  bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)  {
> > > >      return kvm_x86_ops.set_hv_timer
> > > > -           && !(kvm_mwait_in_guest(vcpu->kvm) ||
> > > > -            kvm_can_post_timer_interrupt(vcpu));
> > > > +           && !kvm_mwait_in_guest(vcpu->kvm);
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);
> > >
> > > This method seems more quick and safe, but I have one question:
> > > Does this kvm_mwait_in_guest can guarantee the CPU isolated,  in
> > > some production environments and usually,  MWAIT feature is
> > > disabled in host and even guests with isolated CPUs.  And also we
> > > can set guests kvm_mwait_in_guest true with CPUs just pinned, not
> > > isolated.
> >
> > You won't benefit from timer posted-interrupt if mwait is not exposed
> > to the guest since you can't keep CPU in non-root mode.
> > kvm_mwait_in_guest() will not guarantee the CPU is isolated, but
> > what's still bothering?
>
> Sorry, did I miss something?
>
> What I have in mind: MWAIT may be disabled in the BIOS, so the host
> uses the HLT instruction for idle instead. In such a configuration,
> MWAIT in the guest is also disabled, even if you try to set
> kvm_mwait_in_guest to true. Even so, HLT and PAUSE may still not exit
> the guest, so doesn't the posted interrupt still count?

I prefer to expose mwait/hlt/pause to the guest together. If you aren't
exposing mwait, you don't need ultra-low scheduling latency or maximum
performance, so why do you care about latency from the timer?

>
> With the current code, we can migrate a guest between isolated and
> housekeeping CPUs, or change the CPU pinning on the fly. We allow this
> even though the operation is not commonly used, right?

My patch will not prevent using the VMX preemption timer or the
tscdeadline fastpath.
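
The effect of the two predicate changes can be checked with a small
userspace model. This is only a sketch: struct vcpu_model and the
*_old/*_new function names are hypothetical stand-ins, and the boolean
fields replace the real pi_inject_timer, kvm_vcpu_apicv_active(),
kvm_mwait_in_guest() and kvm_x86_ops.set_hv_timer lookups. The logic is
copied from the "-" and "+" lines of the diff quoted above.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the state the KVM predicates read. */
struct vcpu_model {
	bool pi_inject_timer;	/* models the pi_inject_timer module parameter */
	bool apicv_active;	/* models kvm_vcpu_apicv_active(vcpu) */
	bool mwait_in_guest;	/* models kvm_mwait_in_guest(vcpu->kvm) */
	bool has_hv_timer;	/* models kvm_x86_ops.set_hv_timer being set */
};

/* Predicates as they stand before the proposed change. */
bool can_post_timer_old(const struct vcpu_model *v)
{
	return v->pi_inject_timer && v->apicv_active;
}

bool can_use_hv_timer_old(const struct vcpu_model *v)
{
	return v->has_hv_timer &&
	       !(v->mwait_in_guest || can_post_timer_old(v));
}

/* Predicates as written in the proposed change. */
bool can_post_timer_new(const struct vcpu_model *v)
{
	return v->pi_inject_timer && v->mwait_in_guest && v->apicv_active;
}

bool can_use_hv_timer_new(const struct vcpu_model *v)
{
	return v->has_hv_timer && !v->mwait_in_guest;
}
```

In this model, a guest with pi_inject_timer and APICv enabled but
without mwait exposed loses the hv timer under the old predicates and
keeps it under the new ones.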

    Wanpeng

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt
  2021-11-22 19:13 ` Sean Christopherson
  2021-11-23  2:57   ` Wanpeng Li
@ 2021-11-23  8:18   ` Aili Yao
  2021-11-29  3:28     ` Aili Yao
  1 sibling, 1 reply; 9+ messages in thread
From: Aili Yao @ 2021-11-23  8:18 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, tglx, mingo, bp,
	dave.hansen, x86, hpa, kvm, linux-kernel, yaoaili

On Mon, 22 Nov 2021 19:13:02 +0000
Sean Christopherson <seanjc@google.com> wrote:

> On Mon, Nov 22, 2021, Aili Yao wrote:
> > From: Aili Yao <yaoaili@kingsoft.com>
> > 
> > When we isolate some pyhiscal cores, We may not use them for kvm
> > guests, We may use them for other purposes like DPDK, or we can
> > make some kvm guests isolated and some not, the global judgement
> > pi_inject_timer is not enough; We may make wrong decisions:
> > 
> > In such a scenario, the guests without isolated cores will not be
> > permitted to use vmx preemption timer, and tscdeadline fastpath
> > also be disabled, both will lead to performance penalty.
> > 
> > So check whether the vcpu->cpu is isolated, if not, don't post timer
> > interrupt.
> > 
> > Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
> > ---
> >  arch/x86/kvm/lapic.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 759952dd1222..72dde5532101 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -34,6 +34,7 @@
> >  #include <asm/delay.h>
> >  #include <linux/atomic.h>
> >  #include <linux/jump_label.h>
> > +#include <linux/sched/isolation.h>
> >  #include "kvm_cache_regs.h"
> >  #include "irq.h"
> >  #include "ioapic.h"
> > @@ -113,7 +114,8 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
> >
> >  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
> >  {
> > -	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> > +	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
> > +		!housekeeping_cpu(vcpu->cpu, HK_FLAG_TIMER);  
> 
> I don't think this is safe, vcpu->cpu will be -1 if the vCPU isn't
> scheduled in. 

I checked this. It seems vcpu->cpu is set to a valid value when the
vCPU is created (kvm_vm_ioctl_create_vcpu()), and only after that can we
configure the LAPIC through the vCPU fd and start the timer, so this may
not be a real problem.

Currently the patch seems to work as expected in my tests, so it may be
one possible candidate for the issue listed above.

Thanks

> This also doesn't play nice with the admin forcing
> pi_inject_timer=1.  Not saying there's a reasonable use case for
> doing that, but it's supported today and this would break that
> behavior.  It would also lead to weird behavior if a vCPU were
> migrated on/off a housekeeping vCPU.  Again, probably not a
> reasonable use case, but I don't see anything that would outright
> prevent that behavior.
> 
> The existing behavior also feels a bit unsafe as pi_inject_timer is
> writable while KVM is running, though I supposed that's orthogonal to
> this discussion.
> 
> Rather than check vcpu->cpu, is there an existing vCPU flag that can
> be queried, e.g. KVM_HINTS_REALTIME?
> 
> >  }
> >  
> >  bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
> > -- 
> > 2.25.1
> >   


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt
  2021-11-23  8:18   ` Aili Yao
@ 2021-11-29  3:28     ` Aili Yao
  0 siblings, 0 replies; 9+ messages in thread
From: Aili Yao @ 2021-11-29  3:28 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: pbonzini, vkuznets, wanpengli, jmattson, joro, tglx, mingo, bp,
	dave.hansen, x86, hpa, kvm, linux-kernel, yaoaili

On Tue, 23 Nov 2021 16:18:34 +0800
Aili Yao <yaoaili126@gmail.com> wrote:

> On Mon, 22 Nov 2021 19:13:02 +0000
> Sean Christopherson <seanjc@google.com> wrote:
> 
> > On Mon, Nov 22, 2021, Aili Yao wrote:
> > > From: Aili Yao <yaoaili@kingsoft.com>
> > > 
> > > When we isolate some pyhiscal cores, We may not use them for kvm
> > > guests, We may use them for other purposes like DPDK, or we can
> > > make some kvm guests isolated and some not, the global judgement
> > > pi_inject_timer is not enough; We may make wrong decisions:
> > > 
> > > In such a scenario, the guests without isolated cores will not be
> > > permitted to use vmx preemption timer, and tscdeadline fastpath
> > > also be disabled, both will lead to performance penalty.
> > > 
> > > So check whether the vcpu->cpu is isolated, if not, don't post timer
> > > interrupt.
> > > 
> > > Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
> > > ---
> > >  arch/x86/kvm/lapic.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > index 759952dd1222..72dde5532101 100644
> > > --- a/arch/x86/kvm/lapic.c
> > > +++ b/arch/x86/kvm/lapic.c
> > > @@ -34,6 +34,7 @@
> > >  #include <asm/delay.h>
> > >  #include <linux/atomic.h>
> > >  #include <linux/jump_label.h>
> > > +#include <linux/sched/isolation.h>
> > >  #include "kvm_cache_regs.h"
> > >  #include "irq.h"
> > >  #include "ioapic.h"
> > > @@ -113,7 +114,8 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
> > >
> > >  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
> > >  {
> > > -	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> > > +	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
> > > +		!housekeeping_cpu(vcpu->cpu, HK_FLAG_TIMER);  
> > 
> > I don't think this is safe, vcpu->cpu will be -1 if the vCPU isn't
> > scheduled in. 
> 
> > I checked this. It seems vcpu->cpu is set to a valid value when the
> > vCPU is created (kvm_vm_ioctl_create_vcpu()),

Really sorry, my code base is too old. This vcpu->cpu assignment has
been removed in the latest code, so this housekeeping_cpu() check will
cause a problem.
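
For illustration only, a userspace sketch of what a validity check on
vcpu->cpu could look like. Everything here is an assumption:
MODEL_NR_CPUS, model_housekeeping_cpu() and model_can_post_timer() are
hypothetical names, the housekeeping set is modelled as a plain bitmask
instead of housekeeping_cpu(cpu, HK_FLAG_TIMER), and refusing to post
while the vCPU has no pCPU is just one possible policy. It does not
address the pi_inject_timer=1 or migration concerns raised earlier in
the thread.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 8	/* hypothetical size of the modelled machine */

/* Hypothetical stand-in for housekeeping_cpu(): bit N set means CPU N
 * does housekeeping work. Caller must pass 0 <= cpu < MODEL_NR_CPUS. */
bool model_housekeeping_cpu(int cpu, unsigned int housekeeping_mask)
{
	return (housekeeping_mask >> cpu) & 1u;
}

bool model_can_post_timer(bool pi_inject_timer, bool apicv_active,
			  int vcpu_cpu, unsigned int housekeeping_mask)
{
	if (!pi_inject_timer || !apicv_active)
		return false;

	/* vcpu->cpu is -1 while the vCPU is not scheduled in, so there is
	 * no pCPU to test; this sketch declines to post in that case. */
	if (vcpu_cpu < 0 || vcpu_cpu >= MODEL_NR_CPUS)
		return false;

	/* Post only when the vCPU runs on an isolated (non-housekeeping) CPU. */
	return !model_housekeeping_cpu(vcpu_cpu, housekeeping_mask);
}
```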

Thanks!

> > and only after that can we
> > configure the LAPIC through the vCPU fd and start the timer, so this may
> > not be a real problem.
> > 
> > Currently the patch seems to work as expected in my tests, so it may be
> > one possible candidate for the issue listed above.
> 
> Thanks
> 
> > This also doesn't play nice with the admin forcing
> > pi_inject_timer=1.  Not saying there's a reasonable use case for
> > doing that, but it's supported today and this would break that
> > behavior.  It would also lead to weird behavior if a vCPU were
> > migrated on/off a housekeeping vCPU.  Again, probably not a
> > reasonable use case, but I don't see anything that would outright
> > prevent that behavior.
> > 
> > The existing behavior also feels a bit unsafe as pi_inject_timer is
> > writable while KVM is running, though I supposed that's orthogonal to
> > this discussion.
> > 
> > Rather than check vcpu->cpu, is there an existing vCPU flag that can
> > be queried, e.g. KVM_HINTS_REALTIME?
> > 
> > >  }
> > >  
> > >  bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
> > > -- 
> > > 2.25.1
> > >   
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-11-29  3:31 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-22  1:58 [PATCH] KVM: LAPIC: Per vCPU control over kvm_can_post_timer_interrupt Aili Yao
2021-11-22 19:13 ` Sean Christopherson
2021-11-23  2:57   ` Wanpeng Li
2021-11-23  4:11     ` yaoaili [么爱利]
2021-11-23  6:24       ` Wanpeng Li
2021-11-23  7:02         ` Aili Yao
2021-11-23  7:22           ` Wanpeng Li
2021-11-23  8:18   ` Aili Yao
2021-11-29  3:28     ` Aili Yao
