LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
@ 2021-11-08  9:59 Kele Huang
  2021-11-08 10:30 ` Paolo Bonzini
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Kele Huang @ 2021-11-08  9:59 UTC (permalink / raw)
  To: pbonzini
  Cc: chaiwen.cc, xieyongji, dengliang.1214, pizhenwei, wanpengli,
	seanjc, huangkele, Vitaly Kuznetsov, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, kvm, linux-kernel

Currently, AVIC is disabled if x2apic feature is exposed to guest
or in-kernel PIT is in re-injection mode.

We can enable AVIC with options:

  Kmod args:
  modprobe kvm_amd avic=1 nested=0 npt=1
  QEMU args:
  ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...

When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
can accelerate IPI operations for guest. However, the relationship
between AVIC and PV_SEND_IPI feature is not sorted out.

In logical, AVIC accelerates most of frequently IPI operations
without VMM intervention, while the re-hooking of apic->send_IPI_xxx
from PV_SEND_IPI feature masks out it. People can get confused
if AVIC is enabled while getting lots of hypercall kvm_exits
from IPI.

In performance, benchmark tool
https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
shows below results:

  Test env:
  CPU: AMD EPYC 7742 64-Core Processor
  2 vCPUs pinned 1:1
  idle=poll

  Test result (average ns per IPI of lots of running):
  PV_SEND_IPI 	: 1860
  AVIC 		: 1390

Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
do have some solid performance test results to this.

This patch fixes this by masking out PV_SEND_IPI feature when
AVIC is enabled in setting up of guest vCPUs' CPUID.

Signed-off-by: Kele Huang <huangkele@bytedance.com>
---
 arch/x86/kvm/cpuid.c   |  4 ++--
 arch/x86/kvm/svm/svm.c | 13 +++++++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 2d70edb0f323..cc22975e2ac5 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -194,8 +194,6 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		best->ecx |= XFEATURE_MASK_FPSSE;
 	}
 
-	kvm_update_pv_runtime(vcpu);
-
 	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
 	vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
 
@@ -208,6 +206,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	/* Invoke the vendor callback only after the above state is updated. */
 	static_call(kvm_x86_vcpu_after_set_cpuid)(vcpu);
 
+	kvm_update_pv_runtime(vcpu);
+
 	/*
 	 * Except for the MMU, which needs to do its thing any vendor specific
 	 * adjustments to the reserved GPA bits.
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b36ca4e476c2..b13bcfb2617c 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4114,6 +4114,19 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		if (nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))
 			kvm_request_apicv_update(vcpu->kvm, false,
 						 APICV_INHIBIT_REASON_NESTED);
+
+		if (!guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) &&
+				!(nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))) {
+			/*
+			 * PV_SEND_IPI feature masks out AVIC acceleration to IPI.
+			 * So, we do not expose PV_SEND_IPI feature to guest when
+			 * AVIC is enabled.
+			 */
+			best = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
+			if (best && enable_apicv &&
+					(best->eax & (1 << KVM_FEATURE_PV_SEND_IPI)))
+				best->eax &= ~(1 << KVM_FEATURE_PV_SEND_IPI);
+		}
 	}
 	init_vmcb_after_set_cpuid(vcpu);
 }
-- 
2.11.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-08  9:59 [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC Kele Huang
@ 2021-11-08 10:30 ` Paolo Bonzini
  2021-11-08 11:08   ` Maxim Levitsky
  2021-11-08 10:45 ` Vitaly Kuznetsov
  2021-11-16  2:04 ` Wanpeng Li
  2 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2021-11-08 10:30 UTC (permalink / raw)
  To: Kele Huang
  Cc: chaiwen.cc, xieyongji, dengliang.1214, pizhenwei, wanpengli,
	seanjc, Vitaly Kuznetsov, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, kvm, linux-kernel

On 11/8/21 10:59, Kele Huang wrote:
> Currently, AVIC is disabled if x2apic feature is exposed to guest
> or in-kernel PIT is in re-injection mode.
> 
> We can enable AVIC with options:
> 
>    Kmod args:
>    modprobe kvm_amd avic=1 nested=0 npt=1
>    QEMU args:
>    ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...
> 
> When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
> can accelerate IPI operations for guest. However, the relationship
> between AVIC and PV_SEND_IPI feature is not sorted out.
> 
> In logical, AVIC accelerates most of frequently IPI operations
> without VMM intervention, while the re-hooking of apic->send_IPI_xxx
> from PV_SEND_IPI feature masks out it. People can get confused
> if AVIC is enabled while getting lots of hypercall kvm_exits
> from IPI.
> 
> In performance, benchmark tool
> https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
> shows below results:
> 
>    Test env:
>    CPU: AMD EPYC 7742 64-Core Processor
>    2 vCPUs pinned 1:1
>    idle=poll
> 
>    Test result (average ns per IPI of lots of running):
>    PV_SEND_IPI 	: 1860
>    AVIC 		: 1390
> 
> Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
> do have some solid performance test results to this.
> 
> This patch fixes this by masking out PV_SEND_IPI feature when
> AVIC is enabled in setting up of guest vCPUs' CPUID.
> 
> Signed-off-by: Kele Huang <huangkele@bytedance.com>

AVIC can change across migration.  I think we should instead use a new 
KVM_HINTS_* bit (KVM_HINTS_ACCELERATED_LAPIC or something like that). 
The KVM_HINTS_* bits are intended to be changeable across migration, 
even though we don't have for now anything equivalent to the Hyper-V 
reenlightenment interrupt.

Paolo

> ---
>   arch/x86/kvm/cpuid.c   |  4 ++--
>   arch/x86/kvm/svm/svm.c | 13 +++++++++++++
>   2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 2d70edb0f323..cc22975e2ac5 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -194,8 +194,6 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   		best->ecx |= XFEATURE_MASK_FPSSE;
>   	}
>   
> -	kvm_update_pv_runtime(vcpu);
> -
>   	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
>   	vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
>   
> @@ -208,6 +206,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   	/* Invoke the vendor callback only after the above state is updated. */
>   	static_call(kvm_x86_vcpu_after_set_cpuid)(vcpu);
>   
> +	kvm_update_pv_runtime(vcpu);
> +
>   	/*
>   	 * Except for the MMU, which needs to do its thing any vendor specific
>   	 * adjustments to the reserved GPA bits.
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index b36ca4e476c2..b13bcfb2617c 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4114,6 +4114,19 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>   		if (nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))
>   			kvm_request_apicv_update(vcpu->kvm, false,
>   						 APICV_INHIBIT_REASON_NESTED);
> +
> +		if (!guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) &&
> +				!(nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))) {
> +			/*
> +			 * PV_SEND_IPI feature masks out AVIC acceleration to IPI.
> +			 * So, we do not expose PV_SEND_IPI feature to guest when
> +			 * AVIC is enabled.
> +			 */
> +			best = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
> +			if (best && enable_apicv &&
> +					(best->eax & (1 << KVM_FEATURE_PV_SEND_IPI)))
> +				best->eax &= ~(1 << KVM_FEATURE_PV_SEND_IPI);
> +		}
>   	}
>   	init_vmcb_after_set_cpuid(vcpu);
>   }
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-08  9:59 [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC Kele Huang
  2021-11-08 10:30 ` Paolo Bonzini
@ 2021-11-08 10:45 ` Vitaly Kuznetsov
  2021-11-16  2:04 ` Wanpeng Li
  2 siblings, 0 replies; 13+ messages in thread
From: Vitaly Kuznetsov @ 2021-11-08 10:45 UTC (permalink / raw)
  To: Kele Huang, pbonzini
  Cc: chaiwen.cc, xieyongji, dengliang.1214, pizhenwei, wanpengli,
	seanjc, huangkele, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	kvm, linux-kernel

Kele Huang <huangkele@bytedance.com> writes:

> Currently, AVIC is disabled if x2apic feature is exposed to guest
> or in-kernel PIT is in re-injection mode.
>
> We can enable AVIC with options:
>
>   Kmod args:
>   modprobe kvm_amd avic=1 nested=0 npt=1
>   QEMU args:
>   ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...
>
> When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
> can accelerate IPI operations for guest. However, the relationship
> between AVIC and PV_SEND_IPI feature is not sorted out.
>
> In logical, AVIC accelerates most of frequently IPI operations
> without VMM intervention, while the re-hooking of apic->send_IPI_xxx
> from PV_SEND_IPI feature masks out it. People can get confused
> if AVIC is enabled while getting lots of hypercall kvm_exits
> from IPI.
>
> In performance, benchmark tool
> https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
> shows below results:
>
>   Test env:
>   CPU: AMD EPYC 7742 64-Core Processor
>   2 vCPUs pinned 1:1
>   idle=poll
>
>   Test result (average ns per IPI of lots of running):
>   PV_SEND_IPI 	: 1860
>   AVIC 		: 1390
>
> Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
> do have some solid performance test results to this.
>
> This patch fixes this by masking out PV_SEND_IPI feature when
> AVIC is enabled in setting up of guest vCPUs' CPUID.
>
> Signed-off-by: Kele Huang <huangkele@bytedance.com>
> ---
>  arch/x86/kvm/cpuid.c   |  4 ++--
>  arch/x86/kvm/svm/svm.c | 13 +++++++++++++
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 2d70edb0f323..cc22975e2ac5 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -194,8 +194,6 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  		best->ecx |= XFEATURE_MASK_FPSSE;
>  	}
>  
> -	kvm_update_pv_runtime(vcpu);
> -
>  	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
>  	vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
>  
> @@ -208,6 +206,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	/* Invoke the vendor callback only after the above state is updated. */
>  	static_call(kvm_x86_vcpu_after_set_cpuid)(vcpu);
>  
> +	kvm_update_pv_runtime(vcpu);
> +
>  	/*
>  	 * Except for the MMU, which needs to do its thing any vendor specific
>  	 * adjustments to the reserved GPA bits.
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index b36ca4e476c2..b13bcfb2617c 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4114,6 +4114,19 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  		if (nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))
>  			kvm_request_apicv_update(vcpu->kvm, false,
>  						 APICV_INHIBIT_REASON_NESTED);
> +
> +		if (!guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) &&
> +				!(nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))) {
> +			/*
> +			 * PV_SEND_IPI feature masks out AVIC acceleration to IPI.
> +			 * So, we do not expose PV_SEND_IPI feature to guest when
> +			 * AVIC is enabled.
> +			 */
> +			best = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
> +			if (best && enable_apicv &&
> +					(best->eax & (1 << KVM_FEATURE_PV_SEND_IPI)))
> +				best->eax &= ~(1 << KVM_FEATURE_PV_SEND_IPI);
> +		}

Personally, I'd prefer this to be done in userspace (e.g. QEMU) as with
this patch it becomes very un-obvious why in certain cases some feature
bits are missing. This also breaks migration from KVM pre-patch to KVM
post-patch with e.g. KVM_CAP_ENFORCE_PV_FEATURE_CPUID: the feature will
just disappear from underneath the guest.

What we don't have in KVM is something like KVM_GET_RECOMMENDED_CPUID
at least for KVM PV/Hyper-V features. We could've made it easier for
userspace to make 'default' decisions.

>  	}
>  	init_vmcb_after_set_cpuid(vcpu);
>  }

-- 
Vitaly


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-08 10:30 ` Paolo Bonzini
@ 2021-11-08 11:08   ` Maxim Levitsky
  2021-11-08 11:14     ` zhenwei pi
  2021-11-16  2:48     ` Wanpeng Li
  0 siblings, 2 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-11-08 11:08 UTC (permalink / raw)
  To: Paolo Bonzini, Kele Huang
  Cc: chaiwen.cc, xieyongji, dengliang.1214, pizhenwei, wanpengli,
	seanjc, Vitaly Kuznetsov, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, kvm, linux-kernel

On Mon, 2021-11-08 at 11:30 +0100, Paolo Bonzini wrote:
> On 11/8/21 10:59, Kele Huang wrote:
> > Currently, AVIC is disabled if x2apic feature is exposed to guest
> > or in-kernel PIT is in re-injection mode.
> > 
> > We can enable AVIC with options:
> > 
> >    Kmod args:
> >    modprobe kvm_amd avic=1 nested=0 npt=1
> >    QEMU args:
> >    ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...
> > 
> > When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
> > can accelerate IPI operations for guest. However, the relationship
> > between AVIC and PV_SEND_IPI feature is not sorted out.
> > 
> > In logical, AVIC accelerates most of frequently IPI operations
> > without VMM intervention, while the re-hooking of apic->send_IPI_xxx
> > from PV_SEND_IPI feature masks out it. People can get confused
> > if AVIC is enabled while getting lots of hypercall kvm_exits
> > from IPI.
> > 
> > In performance, benchmark tool
> > https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
> > shows below results:
> > 
> >    Test env:
> >    CPU: AMD EPYC 7742 64-Core Processor
> >    2 vCPUs pinned 1:1
> >    idle=poll
> > 
> >    Test result (average ns per IPI of lots of running):
> >    PV_SEND_IPI 	: 1860
> >    AVIC 		: 1390
> > 
> > Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
> > do have some solid performance test results to this.
> > 
> > This patch fixes this by masking out PV_SEND_IPI feature when
> > AVIC is enabled in setting up of guest vCPUs' CPUID.
> > 
> > Signed-off-by: Kele Huang <huangkele@bytedance.com>
> 
> AVIC can change across migration.  I think we should instead use a new 
> KVM_HINTS_* bit (KVM_HINTS_ACCELERATED_LAPIC or something like that). 
> The KVM_HINTS_* bits are intended to be changeable across migration, 
> even though we don't have for now anything equivalent to the Hyper-V 
> reenlightenment interrupt.

Note that the same issue exists with HyperV. It also has PV APIC,
which is harmful when AVIC is enabled (that is guest uses it instead
of using AVIC, negating AVIC benefits).

Also note that Intel recently posted IPI virtualizaion, which
will make this issue relevant to APICv too soon.

I don't yet know if there is a solution to this which doesn't
involve some management software decision (e.g libvirt or higher).

Best regards,
	Maxim Levitsky

> 
> Paolo
> 
> > ---
> >   arch/x86/kvm/cpuid.c   |  4 ++--
> >   arch/x86/kvm/svm/svm.c | 13 +++++++++++++
> >   2 files changed, 15 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 2d70edb0f323..cc22975e2ac5 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -194,8 +194,6 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >   		best->ecx |= XFEATURE_MASK_FPSSE;
> >   	}
> >   
> > -	kvm_update_pv_runtime(vcpu);
> > -
> >   	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
> >   	vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
> >   
> > @@ -208,6 +206,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >   	/* Invoke the vendor callback only after the above state is updated. */
> >   	static_call(kvm_x86_vcpu_after_set_cpuid)(vcpu);
> >   
> > +	kvm_update_pv_runtime(vcpu);
> > +
> >   	/*
> >   	 * Except for the MMU, which needs to do its thing any vendor specific
> >   	 * adjustments to the reserved GPA bits.
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index b36ca4e476c2..b13bcfb2617c 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -4114,6 +4114,19 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
> >   		if (nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))
> >   			kvm_request_apicv_update(vcpu->kvm, false,
> >   						 APICV_INHIBIT_REASON_NESTED);
> > +
> > +		if (!guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) &&
> > +				!(nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))) {
> > +			/*
> > +			 * PV_SEND_IPI feature masks out AVIC acceleration to IPI.
> > +			 * So, we do not expose PV_SEND_IPI feature to guest when
> > +			 * AVIC is enabled.
> > +			 */
> > +			best = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
> > +			if (best && enable_apicv &&
> > +					(best->eax & (1 << KVM_FEATURE_PV_SEND_IPI)))
> > +				best->eax &= ~(1 << KVM_FEATURE_PV_SEND_IPI);
> > +		}
> >   	}
> >   	init_vmcb_after_set_cpuid(vcpu);
> >   }
> > 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-08 11:08   ` Maxim Levitsky
@ 2021-11-08 11:14     ` zhenwei pi
  2021-11-08 11:18       ` Paolo Bonzini
  2021-11-16  2:48     ` Wanpeng Li
  1 sibling, 1 reply; 13+ messages in thread
From: zhenwei pi @ 2021-11-08 11:14 UTC (permalink / raw)
  To: Maxim Levitsky, Paolo Bonzini, Kele Huang
  Cc: chaiwen.cc, xieyongji, dengliang.1214, wanpengli, seanjc,
	Vitaly Kuznetsov, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	kvm, linux-kernel

On 11/8/21 7:08 PM, Maxim Levitsky wrote:
> On Mon, 2021-11-08 at 11:30 +0100, Paolo Bonzini wrote:
>> On 11/8/21 10:59, Kele Huang wrote:
>>> Currently, AVIC is disabled if x2apic feature is exposed to guest
>>> or in-kernel PIT is in re-injection mode.
>>>
>>> We can enable AVIC with options:
>>>
>>>     Kmod args:
>>>     modprobe kvm_amd avic=1 nested=0 npt=1
>>>     QEMU args:
>>>     ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...
>>>
>>> When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
>>> can accelerate IPI operations for guest. However, the relationship
>>> between AVIC and PV_SEND_IPI feature is not sorted out.
>>>
>>> In logical, AVIC accelerates most of frequently IPI operations
>>> without VMM intervention, while the re-hooking of apic->send_IPI_xxx
>>> from PV_SEND_IPI feature masks out it. People can get confused
>>> if AVIC is enabled while getting lots of hypercall kvm_exits
>>> from IPI.
>>>
>>> In performance, benchmark tool
>>> https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
>>> shows below results:
>>>
>>>     Test env:
>>>     CPU: AMD EPYC 7742 64-Core Processor
>>>     2 vCPUs pinned 1:1
>>>     idle=poll
>>>
>>>     Test result (average ns per IPI of lots of running):
>>>     PV_SEND_IPI 	: 1860
>>>     AVIC 		: 1390
>>>
>>> Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
>>> do have some solid performance test results to this.
>>>
>>> This patch fixes this by masking out PV_SEND_IPI feature when
>>> AVIC is enabled in setting up of guest vCPUs' CPUID.
>>>
>>> Signed-off-by: Kele Huang <huangkele@bytedance.com>
>>
>> AVIC can change across migration.  I think we should instead use a new
>> KVM_HINTS_* bit (KVM_HINTS_ACCELERATED_LAPIC or something like that).
>> The KVM_HINTS_* bits are intended to be changeable across migration,
>> even though we don't have for now anything equivalent to the Hyper-V
>> reenlightenment interrupt.
> 
> Note that the same issue exists with HyperV. It also has PV APIC,
> which is harmful when AVIC is enabled (that is guest uses it instead
> of using AVIC, negating AVIC benefits).
> 
> Also note that Intel recently posted IPI virtualizaion, which
> will make this issue relevant to APICv too soon.
> 
> I don't yet know if there is a solution to this which doesn't
> involve some management software decision (e.g libvirt or higher).
> 
> Best regards,
> 	Maxim Levitsky
> 

For QEMU, "-cpu host,kvm-pv-ipi=off" can disable kvm-pv-ipi.
And for libvirt, I posted a patch to disable kvm-pv-ipi by libvirt xml, 
link:
https://github.com/libvirt/libvirt/commit/b2757b697e29fa86972a4638a5879dccc8add2ad

>>
>> Paolo
>>
>>> ---
>>>    arch/x86/kvm/cpuid.c   |  4 ++--
>>>    arch/x86/kvm/svm/svm.c | 13 +++++++++++++
>>>    2 files changed, 15 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index 2d70edb0f323..cc22975e2ac5 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -194,8 +194,6 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>>>    		best->ecx |= XFEATURE_MASK_FPSSE;
>>>    	}
>>>    
>>> -	kvm_update_pv_runtime(vcpu);
>>> -
>>>    	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
>>>    	vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
>>>    
>>> @@ -208,6 +206,8 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>>>    	/* Invoke the vendor callback only after the above state is updated. */
>>>    	static_call(kvm_x86_vcpu_after_set_cpuid)(vcpu);
>>>    
>>> +	kvm_update_pv_runtime(vcpu);
>>> +
>>>    	/*
>>>    	 * Except for the MMU, which needs to do its thing any vendor specific
>>>    	 * adjustments to the reserved GPA bits.
>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>> index b36ca4e476c2..b13bcfb2617c 100644
>>> --- a/arch/x86/kvm/svm/svm.c
>>> +++ b/arch/x86/kvm/svm/svm.c
>>> @@ -4114,6 +4114,19 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>>>    		if (nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))
>>>    			kvm_request_apicv_update(vcpu->kvm, false,
>>>    						 APICV_INHIBIT_REASON_NESTED);
>>> +
>>> +		if (!guest_cpuid_has(vcpu, X86_FEATURE_X2APIC) &&
>>> +				!(nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM))) {
>>> +			/*
>>> +			 * PV_SEND_IPI feature masks out AVIC acceleration to IPI.
>>> +			 * So, we do not expose PV_SEND_IPI feature to guest when
>>> +			 * AVIC is enabled.
>>> +			 */
>>> +			best = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
>>> +			if (best && enable_apicv &&
>>> +					(best->eax & (1 << KVM_FEATURE_PV_SEND_IPI)))
>>> +				best->eax &= ~(1 << KVM_FEATURE_PV_SEND_IPI);
>>> +		}
>>>    	}
>>>    	init_vmcb_after_set_cpuid(vcpu);
>>>    }
>>>
> 
> 

-- 
zhenwei pi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-08 11:14     ` zhenwei pi
@ 2021-11-08 11:18       ` Paolo Bonzini
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2021-11-08 11:18 UTC (permalink / raw)
  To: zhenwei pi, Maxim Levitsky, Kele Huang
  Cc: chaiwen.cc, xieyongji, dengliang.1214, wanpengli, seanjc,
	Vitaly Kuznetsov, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	kvm, linux-kernel

On 11/8/21 12:14, zhenwei pi wrote:
> I don't yet know if there is a solution to this which doesn't
> involve some management software decision (e.g libvirt or higher).

If we use a new hint, there's no problems with migration from/to older 
QEMU and libvirt's host-model/host-passthrough would pick up the bit 
automatically.

(I'm not sure if libvirt knows that hints can change across migration, 
though).

Paolo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-08  9:59 [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC Kele Huang
  2021-11-08 10:30 ` Paolo Bonzini
  2021-11-08 10:45 ` Vitaly Kuznetsov
@ 2021-11-16  2:04 ` Wanpeng Li
  2 siblings, 0 replies; 13+ messages in thread
From: Wanpeng Li @ 2021-11-16  2:04 UTC (permalink / raw)
  To: Kele Huang
  Cc: Paolo Bonzini, chaiwen.cc, xieyongji, dengliang.1214, zhenwei pi,
	Wanpeng Li, Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, the arch/x86 maintainers, H. Peter Anvin, kvm, LKML

On Mon, 8 Nov 2021 at 20:48, Kele Huang <huangkele@bytedance.com> wrote:
>
> Currently, AVIC is disabled if x2apic feature is exposed to guest
> or in-kernel PIT is in re-injection mode.
>
> We can enable AVIC with options:
>
>   Kmod args:
>   modprobe kvm_amd avic=1 nested=0 npt=1
>   QEMU args:
>   ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...
>
> When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
> can accelerate IPI operations for guest. However, the relationship
> between AVIC and PV_SEND_IPI feature is not sorted out.
>
> In logical, AVIC accelerates most of frequently IPI operations
> without VMM intervention, while the re-hooking of apic->send_IPI_xxx
> from PV_SEND_IPI feature masks out it. People can get confused
> if AVIC is enabled while getting lots of hypercall kvm_exits
> from IPI.
>
> In performance, benchmark tool
> https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
> shows below results:
>
>   Test env:
>   CPU: AMD EPYC 7742 64-Core Processor
>   2 vCPUs pinned 1:1
>   idle=poll
>
>   Test result (average ns per IPI of lots of running):
>   PV_SEND_IPI   : 1860
>   AVIC          : 1390
>
> Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
> do have some solid performance test results to this.
>
> This patch fixes this by masking out PV_SEND_IPI feature when
> AVIC is enabled in setting up of guest vCPUs' CPUID.

This is the second time in community you bytedance guys post patches
w/o evaluating ipi broadcast performance.
https://lore.kernel.org/all/CANRm+Cx597FNRUCyVz1D=B6Vs2GX3Sw57X7Muk+yMpi_hb+v1w@mail.gmail.com/T/#u

    Wanpeng

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-08 11:08   ` Maxim Levitsky
  2021-11-08 11:14     ` zhenwei pi
@ 2021-11-16  2:48     ` Wanpeng Li
  2021-11-16  2:56       ` zhenwei pi
  1 sibling, 1 reply; 13+ messages in thread
From: Wanpeng Li @ 2021-11-16  2:48 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, Kele Huang, chaiwen.cc, xieyongji, dengliang.1214,
	zhenwei pi, Wanpeng Li, Sean Christopherson, Vitaly Kuznetsov,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, the arch/x86 maintainers,
	H. Peter Anvin, kvm, LKML

On Mon, 8 Nov 2021 at 22:09, Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> On Mon, 2021-11-08 at 11:30 +0100, Paolo Bonzini wrote:
> > On 11/8/21 10:59, Kele Huang wrote:
> > > Currently, AVIC is disabled if x2apic feature is exposed to guest
> > > or in-kernel PIT is in re-injection mode.
> > >
> > > We can enable AVIC with options:
> > >
> > >    Kmod args:
> > >    modprobe kvm_amd avic=1 nested=0 npt=1
> > >    QEMU args:
> > >    ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...
> > >
> > > When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
> > > can accelerate IPI operations for guest. However, the relationship
> > > between AVIC and PV_SEND_IPI feature is not sorted out.
> > >
> > > In logical, AVIC accelerates most of frequently IPI operations
> > > without VMM intervention, while the re-hooking of apic->send_IPI_xxx
> > > from PV_SEND_IPI feature masks out it. People can get confused
> > > if AVIC is enabled while getting lots of hypercall kvm_exits
> > > from IPI.
> > >
> > > In performance, benchmark tool
> > > https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
> > > shows below results:
> > >
> > >    Test env:
> > >    CPU: AMD EPYC 7742 64-Core Processor
> > >    2 vCPUs pinned 1:1
> > >    idle=poll
> > >
> > >    Test result (average ns per IPI of lots of running):
> > >    PV_SEND_IPI      : 1860
> > >    AVIC             : 1390
> > >
> > > Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
> > > do have some solid performance test results to this.
> > >
> > > This patch fixes this by masking out PV_SEND_IPI feature when
> > > AVIC is enabled in setting up of guest vCPUs' CPUID.
> > >
> > > Signed-off-by: Kele Huang <huangkele@bytedance.com>
> >
> > AVIC can change across migration.  I think we should instead use a new
> > KVM_HINTS_* bit (KVM_HINTS_ACCELERATED_LAPIC or something like that).
> > The KVM_HINTS_* bits are intended to be changeable across migration,
> > even though we don't have for now anything equivalent to the Hyper-V
> > reenlightenment interrupt.
>
> Note that the same issue exists with HyperV. It also has PV APIC,
> which is harmful when AVIC is enabled (that is guest uses it instead
> of using AVIC, negating AVIC benefits).
>
> Also note that Intel recently posted IPI virtualizaion, which
> will make this issue relevant to APICv too soon.

The recently posted Intel IPI virtualization will accelerate unicast
ipi but not broadcast ipis, AMD AVIC accelerates unicast ipi well but
accelerates broadcast ipis worse than pv ipis. Could we just handle
unicast ipi here?

    Wanpeng

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-16  2:48     ` Wanpeng Li
@ 2021-11-16  2:56       ` zhenwei pi
  2021-11-16  9:06         ` Chao Gao
  0 siblings, 1 reply; 13+ messages in thread
From: zhenwei pi @ 2021-11-16  2:56 UTC (permalink / raw)
  To: Wanpeng Li, Maxim Levitsky
  Cc: Paolo Bonzini, Kele Huang, chaiwen.cc, xieyongji, dengliang.1214,
	Wanpeng Li, Sean Christopherson, Vitaly Kuznetsov, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, the arch/x86 maintainers, H. Peter Anvin, kvm, LKML



On 11/16/21 10:48 AM, Wanpeng Li wrote:
> On Mon, 8 Nov 2021 at 22:09, Maxim Levitsky <mlevitsk@redhat.com> wrote:
>>
>> On Mon, 2021-11-08 at 11:30 +0100, Paolo Bonzini wrote:
>>> On 11/8/21 10:59, Kele Huang wrote:
>>>> Currently, AVIC is disabled if x2apic feature is exposed to guest
>>>> or in-kernel PIT is in re-injection mode.
>>>>
>>>> We can enable AVIC with options:
>>>>
>>>>     Kmod args:
>>>>     modprobe kvm_amd avic=1 nested=0 npt=1
>>>>     QEMU args:
>>>>     ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...
>>>>
>>>> When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
>>>> can accelerate IPI operations for guest. However, the relationship
>>>> between AVIC and PV_SEND_IPI feature is not sorted out.
>>>>
>>>> In logical, AVIC accelerates most of frequently IPI operations
>>>> without VMM intervention, while the re-hooking of apic->send_IPI_xxx
>>>> from PV_SEND_IPI feature masks out it. People can get confused
>>>> if AVIC is enabled while getting lots of hypercall kvm_exits
>>>> from IPI.
>>>>
>>>> In performance, benchmark tool
>>>> https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
>>>> shows below results:
>>>>
>>>>     Test env:
>>>>     CPU: AMD EPYC 7742 64-Core Processor
>>>>     2 vCPUs pinned 1:1
>>>>     idle=poll
>>>>
>>>>     Test result (average ns per IPI of lots of running):
>>>>     PV_SEND_IPI      : 1860
>>>>     AVIC             : 1390
>>>>
>>>> Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
>>>> do have some solid performance test results to this.
>>>>
>>>> This patch fixes this by masking out PV_SEND_IPI feature when
>>>> AVIC is enabled in setting up of guest vCPUs' CPUID.
>>>>
>>>> Signed-off-by: Kele Huang <huangkele@bytedance.com>
>>>
>>> AVIC can change across migration.  I think we should instead use a new
>>> KVM_HINTS_* bit (KVM_HINTS_ACCELERATED_LAPIC or something like that).
>>> The KVM_HINTS_* bits are intended to be changeable across migration,
>>> even though we don't have for now anything equivalent to the Hyper-V
>>> reenlightenment interrupt.
>>
>> Note that the same issue exists with HyperV. It also has PV APIC,
>> which is harmful when AVIC is enabled (that is guest uses it instead
>> of using AVIC, negating AVIC benefits).
>>
>> Also note that Intel recently posted IPI virtualizaion, which
>> will make this issue relevant to APICv too soon.
> 
> The recently posted Intel IPI virtualization will accelerate unicast
> ipi but not broadcast ipis, AMD AVIC accelerates unicast ipi well but
> accelerates broadcast ipis worse than pv ipis. Could we just handle
> unicast ipi here?
> 
>      Wanpeng
> 
Depend on the number of target vCPUs, broadcast IPIs gets unstable 
performance on AVIC, and usually worse than PV Send IPI.
So agree with Wanpeng's point, is it possible to separate single IPI and 
broadcast IPI on a hardware acceleration platform?

-- 
zhenwei pi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-16  2:56       ` zhenwei pi
@ 2021-11-16  9:06         ` Chao Gao
  2021-11-16  9:30           ` [External] " 黄科乐
                             ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Chao Gao @ 2021-11-16  9:06 UTC (permalink / raw)
  To: zhenwei pi
  Cc: Wanpeng Li, Maxim Levitsky, Paolo Bonzini, Kele Huang,
	chaiwen.cc, xieyongji, dengliang.1214, Wanpeng Li,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	the arch/x86 maintainers, H. Peter Anvin, kvm, LKML

On Tue, Nov 16, 2021 at 10:56:25AM +0800, zhenwei pi wrote:
>
>
>On 11/16/21 10:48 AM, Wanpeng Li wrote:
>> On Mon, 8 Nov 2021 at 22:09, Maxim Levitsky <mlevitsk@redhat.com> wrote:
>> > 
>> > On Mon, 2021-11-08 at 11:30 +0100, Paolo Bonzini wrote:
>> > > On 11/8/21 10:59, Kele Huang wrote:
>> > > > Currently, AVIC is disabled if x2apic feature is exposed to guest
>> > > > or in-kernel PIT is in re-injection mode.
>> > > > 
>> > > > We can enable AVIC with options:
>> > > > 
>> > > >     Kmod args:
>> > > >     modprobe kvm_amd avic=1 nested=0 npt=1
>> > > >     QEMU args:
>> > > >     ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...
>> > > > 
>> > > > When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
>> > > > can accelerate IPI operations for guest. However, the relationship
>> > > > between AVIC and PV_SEND_IPI feature is not sorted out.
>> > > > 
>> > > > In logical, AVIC accelerates most of frequently IPI operations
>> > > > without VMM intervention, while the re-hooking of apic->send_IPI_xxx
>> > > > from PV_SEND_IPI feature masks out it. People can get confused
>> > > > if AVIC is enabled while getting lots of hypercall kvm_exits
>> > > > from IPI.
>> > > > 
>> > > > In performance, benchmark tool
>> > > > https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
>> > > > shows below results:
>> > > > 
>> > > >     Test env:
>> > > >     CPU: AMD EPYC 7742 64-Core Processor
>> > > >     2 vCPUs pinned 1:1
>> > > >     idle=poll
>> > > > 
>> > > >     Test result (average ns per IPI of lots of running):
>> > > >     PV_SEND_IPI      : 1860
>> > > >     AVIC             : 1390
>> > > > 
>> > > > Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
>> > > > do have some solid performance test results to this.
>> > > > 
>> > > > This patch fixes this by masking out PV_SEND_IPI feature when
>> > > > AVIC is enabled in setting up of guest vCPUs' CPUID.
>> > > > 
>> > > > Signed-off-by: Kele Huang <huangkele@bytedance.com>
>> > > 
>> > > AVIC can change across migration.  I think we should instead use a new
>> > > KVM_HINTS_* bit (KVM_HINTS_ACCELERATED_LAPIC or something like that).
>> > > The KVM_HINTS_* bits are intended to be changeable across migration,
>> > > even though we don't have for now anything equivalent to the Hyper-V
>> > > reenlightenment interrupt.
>> > 
>> > Note that the same issue exists with HyperV. It also has PV APIC,
>> > which is harmful when AVIC is enabled (that is guest uses it instead
>> > of using AVIC, negating AVIC benefits).
>> > 
>> > Also note that Intel recently posted IPI virtualizaion, which
>> > will make this issue relevant to APICv too soon.
>> 
>> The recently posted Intel IPI virtualization will accelerate unicast
>> ipi but not broadcast ipis, AMD AVIC accelerates unicast ipi well but
>> accelerates broadcast ipis worse than pv ipis. Could we just handle
>> unicast ipi here?
>> 
>>      Wanpeng
>> 
>Depend on the number of target vCPUs, broadcast IPIs gets unstable
>performance on AVIC, and usually worse than PV Send IPI.
>So agree with Wanpeng's point, is it possible to separate single IPI and
>broadcast IPI on a hardware acceleration platform?

Actually, this is how kernel works in x2apic mode: use PV interface
(hypercall) to send multi-cast IPIs and write ICR MSR directly to send
unicast IPIs.

But if guest works in xapic mode, both unicast and multi-cast are issued
via PV interface. It is a side-effect introduced by commit aaffcfd1e82d.

how about just correcting the logic for xapic:

From 13447b221252b64cd85ed1329f7d917afa54efc8 Mon Sep 17 00:00:00 2001
From: Jiaqing Zhao <jiaqing.zhao@intel.com>
Date: Fri, 9 Apr 2021 13:53:39 +0800
Subject: [PATCH 1/2] x86/apic/flat: Add specific send IPI logic

Currently, apic_flat.send_IPI() uses default_send_IPI_single(), which
is a wrapper of apic->send_IPI_mask(). Since commit aaffcfd1e82d
("KVM: X86: Implement PV IPIs in linux guest"), KVM PV IPI driver will
override apic->send_IPI_mask(), and may cause unwated side effects.

This patch removes such side effects by creating a specific send_IPI
method.

Signed-off-by: Jiaqing Zhao <jiaqing.zhao@intel.com>
---
 arch/x86/kernel/apic/apic_flat_64.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c
index 8f72b4351c9f..3196bf220230 100644
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -64,6 +64,13 @@ static void flat_send_IPI_mask(const struct cpumask *cpumask, int vector)
 	_flat_send_IPI_mask(mask, vector);
 }

+static void flat_send_IPI_single(int cpu, int vector)
+{
+	unsigned long mask = cpumask_bits(cpumask_of(cpu))[0];
+
+	_flat_send_IPI_mask(mask, vector);
+}
+
 static void
 flat_send_IPI_mask_allbutself(const struct cpumask *cpumask, int vector)
 {
@@ -132,7 +139,7 @@ static struct apic apic_flat __ro_after_init = {

 	.calc_dest_apicid		= apic_flat_calc_apicid,

-	.send_IPI			= default_send_IPI_single,
+	.send_IPI			= flat_send_IPI_single,
 	.send_IPI_mask			= flat_send_IPI_mask,
 	.send_IPI_mask_allbutself	= flat_send_IPI_mask_allbutself,
 	.send_IPI_allbutself		= default_send_IPI_allbutself,
--
2.27.0



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [External] Re: Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-16  9:06         ` Chao Gao
@ 2021-11-16  9:30           ` 黄科乐
  2021-11-16  9:30           ` Wanpeng Li
       [not found]           ` <CAKUug92xp7mU_KB66jGtdYRhgQpgfCm67r+3kMOMdbrGOrTQcA@mail.gmail.com>
  2 siblings, 0 replies; 13+ messages in thread
From: 黄科乐 @ 2021-11-16  9:30 UTC (permalink / raw)
  To: Chao Gao
  Cc: zhenwei pi, Wanpeng Li, Maxim Levitsky, Paolo Bonzini,
	chaiwen.cc, xieyongji, dengliang.1214, Wanpeng Li,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	the arch/x86 maintainers, H. Peter Anvin, kvm, LKML

> The recently posted Intel IPI virtualization will accelerate unicast
> ipi but not broadcast ipis, AMD AVIC accelerates unicast ipi well but
> accelerates broadcast ipis worse than pv ipis. Could we just handle
> unicast ipi here?

Thanks for the explanation! It is true that AVIC does not always perform better
than PV IPI, actually not even swx2apic.

> So agree with Wanpeng's point, is it possible to separate single IPI and
> broadcast IPI on a hardware acceleration platform?


> how about just correcting the logic for xapic:

> From 13447b221252b64cd85ed1329f7d917afa54efc8 Mon Sep 17 00:00:00 2001
> From: Jiaqing Zhao <jiaqing.zhao@intel.com>
> Date: Fri, 9 Apr 2021 13:53:39 +0800
> Subject: [PATCH 1/2] x86/apic/flat: Add specific send IPI logic

> Currently, apic_flat.send_IPI() uses default_send_IPI_single(), which
> is a wrapper of apic->send_IPI_mask(). Since commit aaffcfd1e82d
> ("KVM: X86: Implement PV IPIs in linux guest"), KVM PV IPI driver will
> override apic->send_IPI_mask(), and may cause unwated side effects.

> This patch removes such side effects by creating a specific send_IPI
> method.

> Signed-off-by: Jiaqing Zhao <jiaqing.zhao@intel.com>

Actually, I think this issue is more about how to sort out the relationship
between AVIC and PV IPI. As far as I understand, currently, no matter
the option from userspace or the determination made in kernel works
in some way, but not in the migration scenario. For instance, migration with
AVIC feature changes can make guests lose the PV IPI feature needlessly.
Besides, the current patch is not consistent with
KVM_CAP_ENFORCE_PV_FEATURE_CPUID.
Paolo's advice about using a new hint shall work well. Currently try
working on it.
Best regards,
Kele

On Tue, Nov 16, 2021 at 4:56 PM Chao Gao <chao.gao@intel.com> wrote:
>
> On Tue, Nov 16, 2021 at 10:56:25AM +0800, zhenwei pi wrote:
> >
> >
> >On 11/16/21 10:48 AM, Wanpeng Li wrote:
> >> On Mon, 8 Nov 2021 at 22:09, Maxim Levitsky <mlevitsk@redhat.com> wrote:
> >> >
> >> > On Mon, 2021-11-08 at 11:30 +0100, Paolo Bonzini wrote:
> >> > > On 11/8/21 10:59, Kele Huang wrote:
> >> > > > Currently, AVIC is disabled if x2apic feature is exposed to guest
> >> > > > or in-kernel PIT is in re-injection mode.
> >> > > >
> >> > > > We can enable AVIC with options:
> >> > > >
> >> > > >     Kmod args:
> >> > > >     modprobe kvm_amd avic=1 nested=0 npt=1
> >> > > >     QEMU args:
> >> > > >     ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...
> >> > > >
> >> > > > When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
> >> > > > can accelerate IPI operations for guest. However, the relationship
> >> > > > between AVIC and PV_SEND_IPI feature is not sorted out.
> >> > > >
> >> > > > In logical, AVIC accelerates most of frequently IPI operations
> >> > > > without VMM intervention, while the re-hooking of apic->send_IPI_xxx
> >> > > > from PV_SEND_IPI feature masks out it. People can get confused
> >> > > > if AVIC is enabled while getting lots of hypercall kvm_exits
> >> > > > from IPI.
> >> > > >
> >> > > > In performance, benchmark tool
> >> > > > https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
> >> > > > shows below results:
> >> > > >
> >> > > >     Test env:
> >> > > >     CPU: AMD EPYC 7742 64-Core Processor
> >> > > >     2 vCPUs pinned 1:1
> >> > > >     idle=poll
> >> > > >
> >> > > >     Test result (average ns per IPI of lots of running):
> >> > > >     PV_SEND_IPI      : 1860
> >> > > >     AVIC             : 1390
> >> > > >
> >> > > > Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
> >> > > > do have some solid performance test results to this.
> >> > > >
> >> > > > This patch fixes this by masking out PV_SEND_IPI feature when
> >> > > > AVIC is enabled in setting up of guest vCPUs' CPUID.
> >> > > >
> >> > > > Signed-off-by: Kele Huang <huangkele@bytedance.com>
> >> > >
> >> > > AVIC can change across migration.  I think we should instead use a new
> >> > > KVM_HINTS_* bit (KVM_HINTS_ACCELERATED_LAPIC or something like that).
> >> > > The KVM_HINTS_* bits are intended to be changeable across migration,
> >> > > even though we don't have for now anything equivalent to the Hyper-V
> >> > > reenlightenment interrupt.
> >> >
> >> > Note that the same issue exists with HyperV. It also has PV APIC,
> >> > which is harmful when AVIC is enabled (that is guest uses it instead
> >> > of using AVIC, negating AVIC benefits).
> >> >
> >> > Also note that Intel recently posted IPI virtualizaion, which
> >> > will make this issue relevant to APICv too soon.
> >>
> >> The recently posted Intel IPI virtualization will accelerate unicast
> >> ipi but not broadcast ipis, AMD AVIC accelerates unicast ipi well but
> >> accelerates broadcast ipis worse than pv ipis. Could we just handle
> >> unicast ipi here?
> >>
> >>      Wanpeng
> >>
> >Depend on the number of target vCPUs, broadcast IPIs gets unstable
> >performance on AVIC, and usually worse than PV Send IPI.
> >So agree with Wanpeng's point, is it possible to separate single IPI and
> >broadcast IPI on a hardware acceleration platform?
>
> Actually, this is how kernel works in x2apic mode: use PV interface
> (hypercall) to send multi-cast IPIs and write ICR MSR directly to send
> unicast IPIs.
>
> But if guest works in xapic mode, both unicast and multi-cast are issued
> via PV interface. It is a side-effect introduced by commit aaffcfd1e82d.
>
> how about just correcting the logic for xapic:
>
> From 13447b221252b64cd85ed1329f7d917afa54efc8 Mon Sep 17 00:00:00 2001
> From: Jiaqing Zhao <jiaqing.zhao@intel.com>
> Date: Fri, 9 Apr 2021 13:53:39 +0800
> Subject: [PATCH 1/2] x86/apic/flat: Add specific send IPI logic
>
> Currently, apic_flat.send_IPI() uses default_send_IPI_single(), which
> is a wrapper of apic->send_IPI_mask(). Since commit aaffcfd1e82d
> ("KVM: X86: Implement PV IPIs in linux guest"), KVM PV IPI driver will
> override apic->send_IPI_mask(), and may cause unwated side effects.
>
> This patch removes such side effects by creating a specific send_IPI
> method.
>
> Signed-off-by: Jiaqing Zhao <jiaqing.zhao@intel.com>
> ---
>  arch/x86/kernel/apic/apic_flat_64.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c
> index 8f72b4351c9f..3196bf220230 100644
> --- a/arch/x86/kernel/apic/apic_flat_64.c
> +++ b/arch/x86/kernel/apic/apic_flat_64.c
> @@ -64,6 +64,13 @@ static void flat_send_IPI_mask(const struct cpumask *cpumask, int vector)
>         _flat_send_IPI_mask(mask, vector);
>  }
>
> +static void flat_send_IPI_single(int cpu, int vector)
> +{
> +       unsigned long mask = cpumask_bits(cpumask_of(cpu))[0];
> +
> +       _flat_send_IPI_mask(mask, vector);
> +}
> +
>  static void
>  flat_send_IPI_mask_allbutself(const struct cpumask *cpumask, int vector)
>  {
> @@ -132,7 +139,7 @@ static struct apic apic_flat __ro_after_init = {
>
>         .calc_dest_apicid               = apic_flat_calc_apicid,
>
> -       .send_IPI                       = default_send_IPI_single,
> +       .send_IPI                       = flat_send_IPI_single,
>         .send_IPI_mask                  = flat_send_IPI_mask,
>         .send_IPI_mask_allbutself       = flat_send_IPI_mask_allbutself,
>         .send_IPI_allbutself            = default_send_IPI_allbutself,
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
  2021-11-16  9:06         ` Chao Gao
  2021-11-16  9:30           ` [External] " 黄科乐
@ 2021-11-16  9:30           ` Wanpeng Li
       [not found]           ` <CAKUug92xp7mU_KB66jGtdYRhgQpgfCm67r+3kMOMdbrGOrTQcA@mail.gmail.com>
  2 siblings, 0 replies; 13+ messages in thread
From: Wanpeng Li @ 2021-11-16  9:30 UTC (permalink / raw)
  To: Chao Gao
  Cc: zhenwei pi, Maxim Levitsky, Paolo Bonzini, Kele Huang,
	chaiwen.cc, xieyongji, dengliang.1214, Wanpeng Li,
	Sean Christopherson, Vitaly Kuznetsov, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	the arch/x86 maintainers, H. Peter Anvin, kvm, LKML

On Tue, 16 Nov 2021 at 16:56, Chao Gao <chao.gao@intel.com> wrote:
>
> On Tue, Nov 16, 2021 at 10:56:25AM +0800, zhenwei pi wrote:
> >
> >
> >On 11/16/21 10:48 AM, Wanpeng Li wrote:
> >> On Mon, 8 Nov 2021 at 22:09, Maxim Levitsky <mlevitsk@redhat.com> wrote:
> >> >
> >> > On Mon, 2021-11-08 at 11:30 +0100, Paolo Bonzini wrote:
> >> > > On 11/8/21 10:59, Kele Huang wrote:
> >> > > > Currently, AVIC is disabled if x2apic feature is exposed to guest
> >> > > > or in-kernel PIT is in re-injection mode.
> >> > > >
> >> > > > We can enable AVIC with options:
> >> > > >
> >> > > >     Kmod args:
> >> > > >     modprobe kvm_amd avic=1 nested=0 npt=1
> >> > > >     QEMU args:
> >> > > >     ... -cpu host,-x2apic -global kvm-pit.lost_tick_policy=discard ...
> >> > > >
> >> > > > When LAPIC works in xapic mode, both AVIC and PV_SEND_IPI feature
> >> > > > can accelerate IPI operations for guest. However, the relationship
> >> > > > between AVIC and PV_SEND_IPI feature is not sorted out.
> >> > > >
> >> > > > In logical, AVIC accelerates most of frequently IPI operations
> >> > > > without VMM intervention, while the re-hooking of apic->send_IPI_xxx
> >> > > > from PV_SEND_IPI feature masks out it. People can get confused
> >> > > > if AVIC is enabled while getting lots of hypercall kvm_exits
> >> > > > from IPI.
> >> > > >
> >> > > > In performance, benchmark tool
> >> > > > https://lore.kernel.org/kvm/20171219085010.4081-1-ynorov@caviumnetworks.com/
> >> > > > shows below results:
> >> > > >
> >> > > >     Test env:
> >> > > >     CPU: AMD EPYC 7742 64-Core Processor
> >> > > >     2 vCPUs pinned 1:1
> >> > > >     idle=poll
> >> > > >
> >> > > >     Test result (average ns per IPI of lots of running):
> >> > > >     PV_SEND_IPI      : 1860
> >> > > >     AVIC             : 1390
> >> > > >
> >> > > > Besides, disscussions in https://lkml.org/lkml/2021/10/20/423
> >> > > > do have some solid performance test results to this.
> >> > > >
> >> > > > This patch fixes this by masking out PV_SEND_IPI feature when
> >> > > > AVIC is enabled in setting up of guest vCPUs' CPUID.
> >> > > >
> >> > > > Signed-off-by: Kele Huang <huangkele@bytedance.com>
> >> > >
> >> > > AVIC can change across migration.  I think we should instead use a new
> >> > > KVM_HINTS_* bit (KVM_HINTS_ACCELERATED_LAPIC or something like that).
> >> > > The KVM_HINTS_* bits are intended to be changeable across migration,
> >> > > even though we don't have for now anything equivalent to the Hyper-V
> >> > > reenlightenment interrupt.
> >> >
> >> > Note that the same issue exists with HyperV. It also has PV APIC,
> >> > which is harmful when AVIC is enabled (that is guest uses it instead
> >> > of using AVIC, negating AVIC benefits).
> >> >
> >> > Also note that Intel recently posted IPI virtualizaion, which
> >> > will make this issue relevant to APICv too soon.
> >>
> >> The recently posted Intel IPI virtualization will accelerate unicast
> >> ipi but not broadcast ipis, AMD AVIC accelerates unicast ipi well but
> >> accelerates broadcast ipis worse than pv ipis. Could we just handle
> >> unicast ipi here?
> >>
> >>      Wanpeng
> >>
> >Depend on the number of target vCPUs, broadcast IPIs gets unstable
> >performance on AVIC, and usually worse than PV Send IPI.
> >So agree with Wanpeng's point, is it possible to separate single IPI and
> >broadcast IPI on a hardware acceleration platform?
>
> Actually, this is how kernel works in x2apic mode: use PV interface
> (hypercall) to send multi-cast IPIs and write ICR MSR directly to send
> unicast IPIs.
>
> But if guest works in xapic mode, both unicast and multi-cast are issued
> via PV interface. It is a side-effect introduced by commit aaffcfd1e82d.
>
> how about just correcting the logic for xapic:
>
> From 13447b221252b64cd85ed1329f7d917afa54efc8 Mon Sep 17 00:00:00 2001
> From: Jiaqing Zhao <jiaqing.zhao@intel.com>
> Date: Fri, 9 Apr 2021 13:53:39 +0800
> Subject: [PATCH 1/2] x86/apic/flat: Add specific send IPI logic
>
> Currently, apic_flat.send_IPI() uses default_send_IPI_single(), which
> is a wrapper of apic->send_IPI_mask(). Since commit aaffcfd1e82d
> ("KVM: X86: Implement PV IPIs in linux guest"), KVM PV IPI driver will
> override apic->send_IPI_mask(), and may cause unwated side effects.
>
> This patch removes such side effects by creating a specific send_IPI
> method.

This looks reasonable to me.

    Wanpeng

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [External] Re: Re: [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC
       [not found]           ` <CAKUug92xp7mU_KB66jGtdYRhgQpgfCm67r+3kMOMdbrGOrTQcA@mail.gmail.com>
@ 2021-11-16 15:57             ` Sean Christopherson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-11-16 15:57 UTC (permalink / raw)
  To: 黄科乐
  Cc: Chao Gao, zhenwei pi, Wanpeng Li, Maxim Levitsky, Paolo Bonzini,
	chaiwen.cc, xieyongji, dengliang.1214, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen,
	the arch/x86 maintainers, H. Peter Anvin, kvm, LKML

On Tue, Nov 16, 2021, 黄科乐 wrote:
> > The recently posted Intel IPI virtualization will accelerate unicast
> > ipi but not broadcast ipis, AMD AVIC accelerates unicast ipi well but
> > accelerates broadcast ipis worse than pv ipis. Could we just handle
> > unicast ipi here?
> 
> Thanks for the explanation! It is true that AVIC does not always perform
> better
> than PV IPI, actually not even swx2apic.
> 
> > So agree with Wanpeng's point, is it possible to separate single IPI and
> > broadcast IPI on a hardware acceleration platform?
> 
> 
> > how about just correcting the logic for xapic:
> 
> > From 13447b221252b64cd85ed1329f7d917afa54efc8 Mon Sep 17 00:00:00 2001
> > From: Jiaqing Zhao <jiaqing.zhao@intel.com>
> > Date: Fri, 9 Apr 2021 13:53:39 +0800
> > Subject: [PATCH 1/2] x86/apic/flat: Add specific send IPI logic
> 
> > Currently, apic_flat.send_IPI() uses default_send_IPI_single(), which
> > is a wrapper of apic->send_IPI_mask(). Since commit aaffcfd1e82d
> > ("KVM: X86: Implement PV IPIs in linux guest"), KVM PV IPI driver will
> > override apic->send_IPI_mask(), and may cause unwated side effects.
> 
> > This patch removes such side effects by creating a specific send_IPI
> > method.
> 
> > Signed-off-by: Jiaqing Zhao <jiaqing.zhao@intel.com>
> 
> Actually, I think this issue is more about how to sort out the relationship
> between AVIC and PV IPI. As far as I understand, currently, no matter
> the option from userspace or the determination made in kernel works
> in some way, but not in the migration scenario. For instance, migration with
> AVIC feature changes can make guests lose the PV IPI feature needlessly.
> Besides, the current patch is not consistent with
> KVM_CAP_ENFORCE_PV_FEATURE_CPUID.
> Paolo's advice about using a new hint shall work well. Currently try
> working on it.

IIUC, you want to have the guest switch between AVIC and PV IPI when the guest
is migrated?  That doesn't require a new hint, it would be just as easy for the
host to manipulate CPUID.KVM_FEATURE_PV_SEND_IPI as it would a new CPUID hint.

The real trick will be getting the guest to be aware of the CPUID and reconfigure
it's APIC setup on the fly.

Or did I misundersetand what you meant by "migration with AVIC feature changes"?

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-11-16 15:58 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-08  9:59 [RFC] KVM: x86: SVM: don't expose PV_SEND_IPI feature with AVIC Kele Huang
2021-11-08 10:30 ` Paolo Bonzini
2021-11-08 11:08   ` Maxim Levitsky
2021-11-08 11:14     ` zhenwei pi
2021-11-08 11:18       ` Paolo Bonzini
2021-11-16  2:48     ` Wanpeng Li
2021-11-16  2:56       ` zhenwei pi
2021-11-16  9:06         ` Chao Gao
2021-11-16  9:30           ` [External] " 黄科乐
2021-11-16  9:30           ` Wanpeng Li
     [not found]           ` <CAKUug92xp7mU_KB66jGtdYRhgQpgfCm67r+3kMOMdbrGOrTQcA@mail.gmail.com>
2021-11-16 15:57             ` [External] " Sean Christopherson
2021-11-08 10:45 ` Vitaly Kuznetsov
2021-11-16  2:04 ` Wanpeng Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).