LKML Archive on lore.kernel.org
* [PATCH v2 0/7] Use static_call for kvm_pmu_ops
@ 2021-11-08 11:10 Like Xu
2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
` (6 more replies)
0 siblings, 7 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
linux-kernel
Hi,
This is a successor to the previous patch set from Jason Baron [1]. Let's
convert kvm_pmu_ops to use static_call. It shows good performance gains for
a typical perf use case [2] in the guest (results in patch 7/7).
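For context, the conversion follows the same pattern already used for
kvm_x86_ops. Below is a minimal sketch of a before/after call site
(simplified for illustration; see the individual patches for the exact code):

	/* before: load kvm_x86_ops.pmu_ops, then take an indirect branch */
	ret = kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);

	/* after: a direct call, patched in from kvm_arch_hardware_setup() */
	ret = static_call(kvm_x86_pmu_is_valid_msr)(vcpu, msr);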
V1 -> V2 Changelog:
- Export kvm_pmu_is_valid_msr() for nVMX [Sean]
- Land memcpy() above kvm_ops_static_call_update() [Sean]
- Move the pmu_ops to kvm_x86_init_ops and tagged as __initdata. [Sean]
- Move the kvm_ops_static_call_update() to x86.c [Sean]
- Drop kvm_pmu_ops_static_call_update() [Sean]
- Fix the WARNING that KVM_X86_OP macros should not use a trailing semicolon
Previous:
https://lore.kernel.org/kvm/20211103070310.43380-1-likexu@tencent.com/
[1] https://lore.kernel.org/lkml/cover.1610680941.git.jbaron@akamai.com/
[2] perf record -e branch-instructions -e branch-misses \
-e cache-misses -e cache-references -e cpu-cycles \
-e instructions ./workload
Thanks,
Like Xu (7):
KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX
KVM: x86: Fix WARNING that macros should not use a trailing semicolon
KVM: x86: Move kvm_ops_static_call_update() to x86.c
KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata
KVM: x86: Introduce definitions to support static calls for
kvm_pmu_ops
KVM: x86: Use static calls to reduce kvm_pmu_ops overhead
arch/x86/include/asm/kvm-x86-ops.h | 218 ++++++++++++-------------
arch/x86/include/asm/kvm-x86-pmu-ops.h | 32 ++++
arch/x86/include/asm/kvm_host.h | 14 +-
arch/x86/kvm/pmu.c | 46 +++---
arch/x86/kvm/pmu.h | 9 +-
arch/x86/kvm/svm/pmu.c | 2 +-
arch/x86/kvm/svm/svm.c | 2 +-
arch/x86/kvm/vmx/nested.c | 2 +-
arch/x86/kvm/vmx/pmu_intel.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 2 +-
arch/x86/kvm/x86.c | 16 +-
11 files changed, 199 insertions(+), 146 deletions(-)
create mode 100644 arch/x86/include/asm/kvm-x86-pmu-ops.h
--
2.33.0
* [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX
2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
@ 2021-11-08 11:10 ` Like Xu
2021-12-08 17:55 ` Sean Christopherson
2021-11-08 11:10 ` [PATCH v2 2/7] KVM: x86: Fix WARNING that macros should not use a trailing semicolon Like Xu
` (5 subsequent siblings)
6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
linux-kernel
From: Like Xu <likexu@tencent.com>
Let's export kvm_pmu_is_valid_msr() for nVMX instead of exposing all of
kvm_pmu_ops for this one case. The reduced access scope will help to
optimize the kvm_x86_ops.pmu_ops code later.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
arch/x86/kvm/pmu.c | 1 +
arch/x86/kvm/vmx/nested.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 0772bad9165c..aa6ac9c7aff2 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -398,6 +398,7 @@ bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
}
+EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);
static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
{
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b4ee5e9f9e20..6c6bc8b2072a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4796,7 +4796,7 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
return;
vmx = to_vmx(vcpu);
- if (kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
+ if (kvm_pmu_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
vmx->nested.msrs.entry_ctls_high |=
VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
vmx->nested.msrs.exit_ctls_high |=
--
2.33.0
* [PATCH v2 2/7] KVM: x86: Fix WARNING that macros should not use a trailing semicolon
2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
@ 2021-11-08 11:10 ` Like Xu
2021-11-08 11:10 ` [PATCH v2 3/7] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
` (4 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
linux-kernel
From: Like Xu <likexu@tencent.com>
scripts/checkpatch.pl complains about the KVM_X86_OP macro as follows:
WARNING: macros should not use a trailing semicolon
+#define KVM_X86_OP(func) \
+ DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
Signed-off-by: Like Xu <likexu@tencent.com>
---
arch/x86/include/asm/kvm-x86-ops.h | 218 ++++++++++++++---------------
arch/x86/include/asm/kvm_host.h | 4 +-
arch/x86/kvm/x86.c | 2 +-
3 files changed, 112 insertions(+), 112 deletions(-)
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index cefe1d81e2e8..8f1035cc2c06 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -12,115 +12,115 @@ BUILD_BUG_ON(1)
* case where there is no definition or a function name that
* doesn't match the typical naming convention is supplied.
*/
-KVM_X86_OP_NULL(hardware_enable)
-KVM_X86_OP_NULL(hardware_disable)
-KVM_X86_OP_NULL(hardware_unsetup)
-KVM_X86_OP_NULL(cpu_has_accelerated_tpr)
-KVM_X86_OP(has_emulated_msr)
-KVM_X86_OP(vcpu_after_set_cpuid)
-KVM_X86_OP(vm_init)
-KVM_X86_OP_NULL(vm_destroy)
-KVM_X86_OP(vcpu_create)
-KVM_X86_OP(vcpu_free)
-KVM_X86_OP(vcpu_reset)
-KVM_X86_OP(prepare_guest_switch)
-KVM_X86_OP(vcpu_load)
-KVM_X86_OP(vcpu_put)
-KVM_X86_OP(update_exception_bitmap)
-KVM_X86_OP(get_msr)
-KVM_X86_OP(set_msr)
-KVM_X86_OP(get_segment_base)
-KVM_X86_OP(get_segment)
-KVM_X86_OP(get_cpl)
-KVM_X86_OP(set_segment)
-KVM_X86_OP_NULL(get_cs_db_l_bits)
-KVM_X86_OP(set_cr0)
-KVM_X86_OP(is_valid_cr4)
-KVM_X86_OP(set_cr4)
-KVM_X86_OP(set_efer)
-KVM_X86_OP(get_idt)
-KVM_X86_OP(set_idt)
-KVM_X86_OP(get_gdt)
-KVM_X86_OP(set_gdt)
-KVM_X86_OP(sync_dirty_debug_regs)
-KVM_X86_OP(set_dr7)
-KVM_X86_OP(cache_reg)
-KVM_X86_OP(get_rflags)
-KVM_X86_OP(set_rflags)
-KVM_X86_OP(tlb_flush_all)
-KVM_X86_OP(tlb_flush_current)
-KVM_X86_OP_NULL(tlb_remote_flush)
-KVM_X86_OP_NULL(tlb_remote_flush_with_range)
-KVM_X86_OP(tlb_flush_gva)
-KVM_X86_OP(tlb_flush_guest)
-KVM_X86_OP(run)
-KVM_X86_OP_NULL(handle_exit)
-KVM_X86_OP_NULL(skip_emulated_instruction)
-KVM_X86_OP_NULL(update_emulated_instruction)
-KVM_X86_OP(set_interrupt_shadow)
-KVM_X86_OP(get_interrupt_shadow)
-KVM_X86_OP(patch_hypercall)
-KVM_X86_OP(set_irq)
-KVM_X86_OP(set_nmi)
-KVM_X86_OP(queue_exception)
-KVM_X86_OP(cancel_injection)
-KVM_X86_OP(interrupt_allowed)
-KVM_X86_OP(nmi_allowed)
-KVM_X86_OP(get_nmi_mask)
-KVM_X86_OP(set_nmi_mask)
-KVM_X86_OP(enable_nmi_window)
-KVM_X86_OP(enable_irq_window)
-KVM_X86_OP(update_cr8_intercept)
-KVM_X86_OP(check_apicv_inhibit_reasons)
-KVM_X86_OP(refresh_apicv_exec_ctrl)
-KVM_X86_OP(hwapic_irr_update)
-KVM_X86_OP(hwapic_isr_update)
-KVM_X86_OP_NULL(guest_apic_has_interrupt)
-KVM_X86_OP(load_eoi_exitmap)
-KVM_X86_OP(set_virtual_apic_mode)
-KVM_X86_OP_NULL(set_apic_access_page_addr)
-KVM_X86_OP(deliver_posted_interrupt)
-KVM_X86_OP_NULL(sync_pir_to_irr)
-KVM_X86_OP(set_tss_addr)
-KVM_X86_OP(set_identity_map_addr)
-KVM_X86_OP(get_mt_mask)
-KVM_X86_OP(load_mmu_pgd)
-KVM_X86_OP_NULL(has_wbinvd_exit)
-KVM_X86_OP(get_l2_tsc_offset)
-KVM_X86_OP(get_l2_tsc_multiplier)
-KVM_X86_OP(write_tsc_offset)
-KVM_X86_OP(write_tsc_multiplier)
-KVM_X86_OP(get_exit_info)
-KVM_X86_OP(check_intercept)
-KVM_X86_OP(handle_exit_irqoff)
-KVM_X86_OP_NULL(request_immediate_exit)
-KVM_X86_OP(sched_in)
-KVM_X86_OP_NULL(update_cpu_dirty_logging)
-KVM_X86_OP_NULL(pre_block)
-KVM_X86_OP_NULL(post_block)
-KVM_X86_OP_NULL(vcpu_blocking)
-KVM_X86_OP_NULL(vcpu_unblocking)
-KVM_X86_OP_NULL(update_pi_irte)
-KVM_X86_OP_NULL(start_assignment)
-KVM_X86_OP_NULL(apicv_post_state_restore)
-KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt)
-KVM_X86_OP_NULL(set_hv_timer)
-KVM_X86_OP_NULL(cancel_hv_timer)
-KVM_X86_OP(setup_mce)
-KVM_X86_OP(smi_allowed)
-KVM_X86_OP(enter_smm)
-KVM_X86_OP(leave_smm)
-KVM_X86_OP(enable_smi_window)
-KVM_X86_OP_NULL(mem_enc_op)
-KVM_X86_OP_NULL(mem_enc_reg_region)
-KVM_X86_OP_NULL(mem_enc_unreg_region)
-KVM_X86_OP(get_msr_feature)
-KVM_X86_OP(can_emulate_instruction)
-KVM_X86_OP(apic_init_signal_blocked)
-KVM_X86_OP_NULL(enable_direct_tlbflush)
-KVM_X86_OP_NULL(migrate_timers)
-KVM_X86_OP(msr_filter_changed)
-KVM_X86_OP_NULL(complete_emulated_msr)
+KVM_X86_OP_NULL(hardware_enable);
+KVM_X86_OP_NULL(hardware_disable);
+KVM_X86_OP_NULL(hardware_unsetup);
+KVM_X86_OP_NULL(cpu_has_accelerated_tpr);
+KVM_X86_OP(has_emulated_msr);
+KVM_X86_OP(vcpu_after_set_cpuid);
+KVM_X86_OP(vm_init);
+KVM_X86_OP_NULL(vm_destroy);
+KVM_X86_OP(vcpu_create);
+KVM_X86_OP(vcpu_free);
+KVM_X86_OP(vcpu_reset);
+KVM_X86_OP(prepare_guest_switch);
+KVM_X86_OP(vcpu_load);
+KVM_X86_OP(vcpu_put);
+KVM_X86_OP(update_exception_bitmap);
+KVM_X86_OP(get_msr);
+KVM_X86_OP(set_msr);
+KVM_X86_OP(get_segment_base);
+KVM_X86_OP(get_segment);
+KVM_X86_OP(get_cpl);
+KVM_X86_OP(set_segment);
+KVM_X86_OP_NULL(get_cs_db_l_bits);
+KVM_X86_OP(set_cr0);
+KVM_X86_OP(is_valid_cr4);
+KVM_X86_OP(set_cr4);
+KVM_X86_OP(set_efer);
+KVM_X86_OP(get_idt);
+KVM_X86_OP(set_idt);
+KVM_X86_OP(get_gdt);
+KVM_X86_OP(set_gdt);
+KVM_X86_OP(sync_dirty_debug_regs);
+KVM_X86_OP(set_dr7);
+KVM_X86_OP(cache_reg);
+KVM_X86_OP(get_rflags);
+KVM_X86_OP(set_rflags);
+KVM_X86_OP(tlb_flush_all);
+KVM_X86_OP(tlb_flush_current);
+KVM_X86_OP_NULL(tlb_remote_flush);
+KVM_X86_OP_NULL(tlb_remote_flush_with_range);
+KVM_X86_OP(tlb_flush_gva);
+KVM_X86_OP(tlb_flush_guest);
+KVM_X86_OP(run);
+KVM_X86_OP_NULL(handle_exit);
+KVM_X86_OP_NULL(skip_emulated_instruction);
+KVM_X86_OP_NULL(update_emulated_instruction);
+KVM_X86_OP(set_interrupt_shadow);
+KVM_X86_OP(get_interrupt_shadow);
+KVM_X86_OP(patch_hypercall);
+KVM_X86_OP(set_irq);
+KVM_X86_OP(set_nmi);
+KVM_X86_OP(queue_exception);
+KVM_X86_OP(cancel_injection);
+KVM_X86_OP(interrupt_allowed);
+KVM_X86_OP(nmi_allowed);
+KVM_X86_OP(get_nmi_mask);
+KVM_X86_OP(set_nmi_mask);
+KVM_X86_OP(enable_nmi_window);
+KVM_X86_OP(enable_irq_window);
+KVM_X86_OP(update_cr8_intercept);
+KVM_X86_OP(check_apicv_inhibit_reasons);
+KVM_X86_OP(refresh_apicv_exec_ctrl);
+KVM_X86_OP(hwapic_irr_update);
+KVM_X86_OP(hwapic_isr_update);
+KVM_X86_OP_NULL(guest_apic_has_interrupt);
+KVM_X86_OP(load_eoi_exitmap);
+KVM_X86_OP(set_virtual_apic_mode);
+KVM_X86_OP_NULL(set_apic_access_page_addr);
+KVM_X86_OP(deliver_posted_interrupt);
+KVM_X86_OP_NULL(sync_pir_to_irr);
+KVM_X86_OP(set_tss_addr);
+KVM_X86_OP(set_identity_map_addr);
+KVM_X86_OP(get_mt_mask);
+KVM_X86_OP(load_mmu_pgd);
+KVM_X86_OP_NULL(has_wbinvd_exit);
+KVM_X86_OP(get_l2_tsc_offset);
+KVM_X86_OP(get_l2_tsc_multiplier);
+KVM_X86_OP(write_tsc_offset);
+KVM_X86_OP(write_tsc_multiplier);
+KVM_X86_OP(get_exit_info);
+KVM_X86_OP(check_intercept);
+KVM_X86_OP(handle_exit_irqoff);
+KVM_X86_OP_NULL(request_immediate_exit);
+KVM_X86_OP(sched_in);
+KVM_X86_OP_NULL(update_cpu_dirty_logging);
+KVM_X86_OP_NULL(pre_block);
+KVM_X86_OP_NULL(post_block);
+KVM_X86_OP_NULL(vcpu_blocking);
+KVM_X86_OP_NULL(vcpu_unblocking);
+KVM_X86_OP_NULL(update_pi_irte);
+KVM_X86_OP_NULL(start_assignment);
+KVM_X86_OP_NULL(apicv_post_state_restore);
+KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt);
+KVM_X86_OP_NULL(set_hv_timer);
+KVM_X86_OP_NULL(cancel_hv_timer);
+KVM_X86_OP(setup_mce);
+KVM_X86_OP(smi_allowed);
+KVM_X86_OP(enter_smm);
+KVM_X86_OP(leave_smm);
+KVM_X86_OP(enable_smi_window);
+KVM_X86_OP_NULL(mem_enc_op);
+KVM_X86_OP_NULL(mem_enc_reg_region);
+KVM_X86_OP_NULL(mem_enc_unreg_region);
+KVM_X86_OP(get_msr_feature);
+KVM_X86_OP(can_emulate_instruction);
+KVM_X86_OP(apic_init_signal_blocked);
+KVM_X86_OP_NULL(enable_direct_tlbflush);
+KVM_X86_OP_NULL(migrate_timers);
+KVM_X86_OP(msr_filter_changed);
+KVM_X86_OP_NULL(complete_emulated_msr);
#undef KVM_X86_OP
#undef KVM_X86_OP_NULL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 88fce6ab4bbd..c2a4a362f3e2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1532,14 +1532,14 @@ extern bool __read_mostly enable_apicv;
extern struct kvm_x86_ops kvm_x86_ops;
#define KVM_X86_OP(func) \
- DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func));
+ DECLARE_STATIC_CALL(kvm_x86_##func, *(((struct kvm_x86_ops *)0)->func))
#define KVM_X86_OP_NULL KVM_X86_OP
#include <asm/kvm-x86-ops.h>
static inline void kvm_ops_static_call_update(void)
{
#define KVM_X86_OP(func) \
- static_call_update(kvm_x86_##func, kvm_x86_ops.func);
+ static_call_update(kvm_x86_##func, kvm_x86_ops.func)
#define KVM_X86_OP_NULL KVM_X86_OP
#include <asm/kvm-x86-ops.h>
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac83d873d65b..775051070627 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -125,7 +125,7 @@ EXPORT_SYMBOL_GPL(kvm_x86_ops);
#define KVM_X86_OP(func) \
DEFINE_STATIC_CALL_NULL(kvm_x86_##func, \
- *(((struct kvm_x86_ops *)0)->func));
+ *(((struct kvm_x86_ops *)0)->func))
#define KVM_X86_OP_NULL KVM_X86_OP
#include <asm/kvm-x86-ops.h>
EXPORT_STATIC_CALL_GPL(kvm_x86_get_cs_db_l_bits);
--
2.33.0
* [PATCH v2 3/7] KVM: x86: Move kvm_ops_static_call_update() to x86.c
2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
2021-11-08 11:10 ` [PATCH v2 2/7] KVM: x86: Fix WARNING that macros should not use a trailing semicolon Like Xu
@ 2021-11-08 11:10 ` Like Xu
2021-11-08 11:10 ` [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
` (3 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
linux-kernel
From: Like Xu <likexu@tencent.com>
kvm_ops_static_call_update() is defined in kvm_host.h. That's completely
unnecessary, as it should have exactly one caller, kvm_arch_hardware_setup().
As a preparatory step, move kvm_ops_static_call_update() to x86.c so that it
can later reference the kvm_pmu_ops stuff.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
arch/x86/include/asm/kvm_host.h | 8 --------
arch/x86/kvm/x86.c | 8 ++++++++
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c2a4a362f3e2..c2d4ee2973c5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1536,14 +1536,6 @@ extern struct kvm_x86_ops kvm_x86_ops;
#define KVM_X86_OP_NULL KVM_X86_OP
#include <asm/kvm-x86-ops.h>
-static inline void kvm_ops_static_call_update(void)
-{
-#define KVM_X86_OP(func) \
- static_call_update(kvm_x86_##func, kvm_x86_ops.func)
-#define KVM_X86_OP_NULL KVM_X86_OP
-#include <asm/kvm-x86-ops.h>
-}
-
#define __KVM_HAVE_ARCH_VM_ALLOC
static inline struct kvm *kvm_arch_alloc_vm(void)
{
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 775051070627..0aee0a078d6f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11300,6 +11300,14 @@ void kvm_arch_hardware_disable(void)
drop_user_return_notifiers();
}
+static inline void kvm_ops_static_call_update(void)
+{
+#define KVM_X86_OP(func) \
+ static_call_update(kvm_x86_##func, kvm_x86_ops.func)
+#define KVM_X86_OP_NULL KVM_X86_OP
+#include <asm/kvm-x86-ops.h>
+}
+
int kvm_arch_hardware_setup(void *opaque)
{
struct kvm_x86_init_ops *ops = opaque;
--
2.33.0
* [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
` (2 preceding siblings ...)
2021-11-08 11:10 ` [PATCH v2 3/7] KVM: x86: Move kvm_ops_static_call_update() to x86.c Like Xu
@ 2021-11-08 11:10 ` Like Xu
2021-12-08 18:10 ` Sean Christopherson
2021-11-08 11:10 ` [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata Like Xu
` (2 subsequent siblings)
6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
linux-kernel
From: Like Xu <likexu@tencent.com>
Replace the kvm_pmu_ops pointer in common x86 with an instance of the
struct to save one pointer dereference when invoking functions. Copy the
struct by value to set the ops during kvm_init().
Signed-off-by: Like Xu <likexu@tencent.com>
---
arch/x86/kvm/pmu.c | 41 ++++++++++++++++++++++-------------------
arch/x86/kvm/pmu.h | 4 +++-
arch/x86/kvm/x86.c | 1 +
3 files changed, 26 insertions(+), 20 deletions(-)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index aa6ac9c7aff2..353989bf0102 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -47,6 +47,9 @@
* * AMD: [0 .. AMD64_NUM_COUNTERS-1] <=> gp counters
*/
+struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
+EXPORT_SYMBOL_GPL(kvm_pmu_ops);
+
static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
{
struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
@@ -214,7 +217,7 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
ARCH_PERFMON_EVENTSEL_CMASK |
HSW_IN_TX |
HSW_IN_TX_CHECKPOINTED))) {
- config = kvm_x86_ops.pmu_ops->find_arch_event(pmc_to_pmu(pmc),
+ config = kvm_pmu_ops.find_arch_event(pmc_to_pmu(pmc),
event_select,
unit_mask);
if (config != PERF_COUNT_HW_MAX)
@@ -268,7 +271,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
pmc->current_config = (u64)ctrl;
pmc_reprogram_counter(pmc, PERF_TYPE_HARDWARE,
- kvm_x86_ops.pmu_ops->find_fixed_event(idx),
+ kvm_pmu_ops.find_fixed_event(idx),
!(en_field & 0x2), /* exclude user */
!(en_field & 0x1), /* exclude kernel */
pmi, false, false);
@@ -277,7 +280,7 @@ EXPORT_SYMBOL_GPL(reprogram_fixed_counter);
void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx)
{
- struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, pmc_idx);
+ struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, pmc_idx);
if (!pmc)
return;
@@ -299,7 +302,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
int bit;
for_each_set_bit(bit, pmu->reprogram_pmi, X86_PMC_IDX_MAX) {
- struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+ struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, bit);
if (unlikely(!pmc || !pmc->perf_event)) {
clear_bit(bit, pmu->reprogram_pmi);
@@ -321,7 +324,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
/* check if idx is a valid index to access PMU */
int kvm_pmu_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
{
- return kvm_x86_ops.pmu_ops->is_valid_rdpmc_ecx(vcpu, idx);
+ return kvm_pmu_ops.is_valid_rdpmc_ecx(vcpu, idx);
}
bool is_vmware_backdoor_pmc(u32 pmc_idx)
@@ -371,7 +374,7 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
if (is_vmware_backdoor_pmc(idx))
return kvm_pmu_rdpmc_vmware(vcpu, idx, data);
- pmc = kvm_x86_ops.pmu_ops->rdpmc_ecx_to_pmc(vcpu, idx, &mask);
+ pmc = kvm_pmu_ops.rdpmc_ecx_to_pmc(vcpu, idx, &mask);
if (!pmc)
return 1;
@@ -387,23 +390,23 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
{
if (lapic_in_kernel(vcpu)) {
- if (kvm_x86_ops.pmu_ops->deliver_pmi)
- kvm_x86_ops.pmu_ops->deliver_pmi(vcpu);
+ if (kvm_pmu_ops.deliver_pmi)
+ kvm_pmu_ops.deliver_pmi(vcpu);
kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
}
}
bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
{
- return kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr) ||
- kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, msr);
+ return kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr) ||
+ kvm_pmu_ops.is_valid_msr(vcpu, msr);
}
EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);
static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
- struct kvm_pmc *pmc = kvm_x86_ops.pmu_ops->msr_idx_to_pmc(vcpu, msr);
+ struct kvm_pmc *pmc = kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr);
if (pmc)
__set_bit(pmc->idx, pmu->pmc_in_use);
@@ -411,13 +414,13 @@ static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
- return kvm_x86_ops.pmu_ops->get_msr(vcpu, msr_info);
+ return kvm_pmu_ops.get_msr(vcpu, msr_info);
}
int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
- return kvm_x86_ops.pmu_ops->set_msr(vcpu, msr_info);
+ return kvm_pmu_ops.set_msr(vcpu, msr_info);
}
/* refresh PMU settings. This function generally is called when underlying
@@ -426,7 +429,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
*/
void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
{
- kvm_x86_ops.pmu_ops->refresh(vcpu);
+ kvm_pmu_ops.refresh(vcpu);
}
void kvm_pmu_reset(struct kvm_vcpu *vcpu)
@@ -434,7 +437,7 @@ void kvm_pmu_reset(struct kvm_vcpu *vcpu)
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
irq_work_sync(&pmu->irq_work);
- kvm_x86_ops.pmu_ops->reset(vcpu);
+ kvm_pmu_ops.reset(vcpu);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
@@ -442,7 +445,7 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
memset(pmu, 0, sizeof(*pmu));
- kvm_x86_ops.pmu_ops->init(vcpu);
+ kvm_pmu_ops.init(vcpu);
init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
pmu->event_count = 0;
pmu->need_cleanup = false;
@@ -474,14 +477,14 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
pmu->pmc_in_use, X86_PMC_IDX_MAX);
for_each_set_bit(i, bitmask, X86_PMC_IDX_MAX) {
- pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, i);
+ pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, i);
if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc))
pmc_stop_counter(pmc);
}
- if (kvm_x86_ops.pmu_ops->cleanup)
- kvm_x86_ops.pmu_ops->cleanup(vcpu);
+ if (kvm_pmu_ops.cleanup)
+ kvm_pmu_ops.cleanup(vcpu);
bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX);
}
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 0e4f2b1fa9fb..b2fe135d395a 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -17,6 +17,8 @@
#define MAX_FIXED_COUNTERS 3
+extern struct kvm_pmu_ops kvm_pmu_ops;
+
struct kvm_event_hw_type_mapping {
u8 eventsel;
u8 unit_mask;
@@ -92,7 +94,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
{
- return kvm_x86_ops.pmu_ops->pmc_is_enabled(pmc);
+ return kvm_pmu_ops.pmc_is_enabled(pmc);
}
static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0aee0a078d6f..ca9a76abb6ba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11323,6 +11323,7 @@ int kvm_arch_hardware_setup(void *opaque)
return r;
memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
+ memcpy(&kvm_pmu_ops, kvm_x86_ops.pmu_ops, sizeof(kvm_pmu_ops));
kvm_ops_static_call_update();
if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
--
2.33.0
* [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata
2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
` (3 preceding siblings ...)
2021-11-08 11:10 ` [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
@ 2021-11-08 11:10 ` Like Xu
2021-12-08 18:17 ` Sean Christopherson
2021-11-08 11:10 ` [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops Like Xu
2021-11-08 11:10 ` [PATCH v2 7/7] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
linux-kernel
From: Like Xu <likexu@tencent.com>
The pmu_ops should be moved to kvm_x86_init_ops and tagged as __initdata.
That'll save those precious few bytes, and more importantly make
the original ops unreachable, i.e. make it harder to sneak in post-init
modification bugs.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
arch/x86/include/asm/kvm_host.h | 4 ++--
arch/x86/kvm/svm/pmu.c | 2 +-
arch/x86/kvm/svm/svm.c | 2 +-
arch/x86/kvm/vmx/pmu_intel.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 2 +-
arch/x86/kvm/x86.c | 2 +-
6 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c2d4ee2973c5..00760a3ac88c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1436,8 +1436,7 @@ struct kvm_x86_ops {
int cpu_dirty_log_size;
void (*update_cpu_dirty_logging)(struct kvm_vcpu *vcpu);
- /* pmu operations of sub-arch */
- const struct kvm_pmu_ops *pmu_ops;
+ /* nested operations of sub-arch */
const struct kvm_x86_nested_ops *nested_ops;
/*
@@ -1516,6 +1515,7 @@ struct kvm_x86_init_ops {
int (*hardware_setup)(void);
struct kvm_x86_ops *runtime_ops;
+ struct kvm_pmu_ops *pmu_ops;
};
struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index fdf587f19c5f..4554cbc3820c 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -319,7 +319,7 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
}
}
-struct kvm_pmu_ops amd_pmu_ops = {
+struct kvm_pmu_ops amd_pmu_ops __initdata = {
.find_arch_event = amd_find_arch_event,
.find_fixed_event = amd_find_fixed_event,
.pmc_is_enabled = amd_pmc_is_enabled,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 21bb81710e0f..8834d7d2b991 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4681,7 +4681,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
.sched_in = svm_sched_in,
- .pmu_ops = &amd_pmu_ops,
.nested_ops = &svm_nested_ops,
.deliver_posted_interrupt = svm_deliver_avic_intr,
@@ -4717,6 +4716,7 @@ static struct kvm_x86_init_ops svm_init_ops __initdata = {
.check_processor_compatibility = svm_check_processor_compat,
.runtime_ops = &svm_x86_ops,
+ .pmu_ops = &amd_pmu_ops,
};
static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b8e0d21b7c8a..c0b905d032c8 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -703,7 +703,7 @@ static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
intel_pmu_release_guest_lbr_event(vcpu);
}
-struct kvm_pmu_ops intel_pmu_ops = {
+struct kvm_pmu_ops intel_pmu_ops __initdata = {
.find_arch_event = intel_find_arch_event,
.find_fixed_event = intel_find_fixed_event,
.pmc_is_enabled = intel_pmc_is_enabled,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 71f54d85f104..ce787d2e8e08 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7680,7 +7680,6 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
.pre_block = vmx_pre_block,
.post_block = vmx_post_block,
- .pmu_ops = &intel_pmu_ops,
.nested_ops = &vmx_nested_ops,
.update_pi_irte = pi_update_irte,
@@ -7922,6 +7921,7 @@ static struct kvm_x86_init_ops vmx_init_ops __initdata = {
.hardware_setup = hardware_setup,
.runtime_ops = &vmx_x86_ops,
+ .pmu_ops = &intel_pmu_ops,
};
static void vmx_cleanup_l1d_flush(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ca9a76abb6ba..70dc8f41329c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11323,7 +11323,7 @@ int kvm_arch_hardware_setup(void *opaque)
return r;
memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
- memcpy(&kvm_pmu_ops, kvm_x86_ops.pmu_ops, sizeof(kvm_pmu_ops));
+ memcpy(&kvm_pmu_ops, ops->pmu_ops, sizeof(kvm_pmu_ops));
kvm_ops_static_call_update();
if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
--
2.33.0
* [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops
2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
` (4 preceding siblings ...)
2021-11-08 11:10 ` [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata Like Xu
@ 2021-11-08 11:10 ` Like Xu
2021-12-08 18:35 ` Sean Christopherson
2021-11-08 11:10 ` [PATCH v2 7/7] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead Like Xu
6 siblings, 1 reply; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
linux-kernel
From: Like Xu <likexu@tencent.com>
Use static calls to improve kvm_pmu_ops performance. Introduce the
definitions that will be used by a subsequent patch to actualize the
savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
definition of static calls. This header is also intended to be used
to simplify the definition of amd_pmu_ops and intel_pmu_ops.
Like what we did for kvm_x86_ops, 'pmu_ops' can be covered by
static calls in a similar manner for a small but not negligible
performance gain, especially on older models.
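For a single op, e.g. is_valid_msr, the new header is expected to expand
roughly as below (an illustrative sketch only; the static_call_update()
step is added by the next patch):

	/* pmu.h: declare the static call, typed after the ops member */
	DECLARE_STATIC_CALL(kvm_x86_pmu_is_valid_msr,
			    *(((struct kvm_pmu_ops *)0)->is_valid_msr));

	/* pmu.c: define it with a NULL target until hardware setup */
	DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_is_valid_msr,
				*(((struct kvm_pmu_ops *)0)->is_valid_msr));

	/* x86.c (next patch): point it at the vendor implementation */
	static_call_update(kvm_x86_pmu_is_valid_msr, kvm_pmu_ops.is_valid_msr);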
Signed-off-by: Like Xu <likexu@tencent.com>
---
arch/x86/include/asm/kvm-x86-pmu-ops.h | 32 ++++++++++++++++++++++++++
arch/x86/kvm/pmu.c | 6 +++++
arch/x86/kvm/pmu.h | 5 ++++
3 files changed, 43 insertions(+)
create mode 100644 arch/x86/include/asm/kvm-x86-pmu-ops.h
diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
new file mode 100644
index 000000000000..b7713b16d21d
--- /dev/null
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_NULL)
+BUILD_BUG_ON(1)
+#endif
+
+/*
+ * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_NULL() are used to
+ * help generate "static_call()"s. They are also intended for use when defining
+ * the amd/intel KVM_X86_PMU_OPs. KVM_X86_PMU_OP() can be used
+ * for those functions that follow the [amd|intel]_func_name convention.
+ * KVM_X86_PMU_OP_NULL() can leave a NULL definition for the
+ * case where there is no definition or a function name that
+ * doesn't match the typical naming convention is supplied.
+ */
+KVM_X86_PMU_OP(find_arch_event);
+KVM_X86_PMU_OP(find_fixed_event);
+KVM_X86_PMU_OP(pmc_is_enabled);
+KVM_X86_PMU_OP(pmc_idx_to_pmc);
+KVM_X86_PMU_OP(rdpmc_ecx_to_pmc);
+KVM_X86_PMU_OP(msr_idx_to_pmc);
+KVM_X86_PMU_OP(is_valid_rdpmc_ecx);
+KVM_X86_PMU_OP(is_valid_msr);
+KVM_X86_PMU_OP(get_msr);
+KVM_X86_PMU_OP(set_msr);
+KVM_X86_PMU_OP(refresh);
+KVM_X86_PMU_OP(init);
+KVM_X86_PMU_OP(reset);
+KVM_X86_PMU_OP_NULL(deliver_pmi);
+KVM_X86_PMU_OP_NULL(cleanup);
+
+#undef KVM_X86_PMU_OP
+#undef KVM_X86_PMU_OP_NULL
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 353989bf0102..bfdd9f2bc0fa 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -50,6 +50,12 @@
struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
EXPORT_SYMBOL_GPL(kvm_pmu_ops);
+#define KVM_X86_PMU_OP(func) \
+ DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func, \
+ *(((struct kvm_pmu_ops *)0)->func))
+#define KVM_X86_PMU_OP_NULL KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
+
static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
{
struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index b2fe135d395a..40e0b523637b 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -45,6 +45,11 @@ struct kvm_pmu_ops {
void (*cleanup)(struct kvm_vcpu *vcpu);
};
+#define KVM_X86_PMU_OP(func) \
+ DECLARE_STATIC_CALL(kvm_x86_pmu_##func, *(((struct kvm_pmu_ops *)0)->func))
+#define KVM_X86_PMU_OP_NULL KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
+
static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
{
struct kvm_pmu *pmu = pmc_to_pmu(pmc);
--
2.33.0
* [PATCH v2 7/7] KVM: x86: Use static calls to reduce kvm_pmu_ops overhead
2021-11-08 11:10 [PATCH v2 0/7] Use static_call for kvm_pmu_ops Like Xu
` (5 preceding siblings ...)
2021-11-08 11:10 ` [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops Like Xu
@ 2021-11-08 11:10 ` Like Xu
6 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-11-08 11:10 UTC (permalink / raw)
To: Paolo Bonzini, Sean Christopherson
Cc: kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov, Joerg Roedel,
linux-kernel
From: Like Xu <likexu@tencent.com>
Convert kvm_pmu_ops to use static calls.
Here are the worst sched_clock() nanosecond numbers for the kvm_pmu_ops
functions that are called most often (up to seven-digit call counts) when
running a single perf test case in a guest on an ICX 2.70GHz host
(mitigations=on):
                | legacy | static call
----------------+--------+-------------
.pmc_idx_to_pmc |  10946 | 10047 (8%)
.pmc_is_enabled |  11291 | 11175 (1%)
.msr_idx_to_pmc |  13526 | 12346 (8%)
.is_valid_msr   |  10895 | 10484 (3%)
Signed-off-by: Like Xu <likexu@tencent.com>
---
arch/x86/kvm/pmu.c | 36 +++++++++++++++++-------------------
arch/x86/kvm/pmu.h | 2 +-
arch/x86/kvm/x86.c | 5 +++++
3 files changed, 23 insertions(+), 20 deletions(-)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index bfdd9f2bc0fa..c86ff3057e2c 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -223,7 +223,7 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
ARCH_PERFMON_EVENTSEL_CMASK |
HSW_IN_TX |
HSW_IN_TX_CHECKPOINTED))) {
- config = kvm_pmu_ops.find_arch_event(pmc_to_pmu(pmc),
+ config = static_call(kvm_x86_pmu_find_arch_event)(pmc_to_pmu(pmc),
event_select,
unit_mask);
if (config != PERF_COUNT_HW_MAX)
@@ -277,7 +277,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
pmc->current_config = (u64)ctrl;
pmc_reprogram_counter(pmc, PERF_TYPE_HARDWARE,
- kvm_pmu_ops.find_fixed_event(idx),
+ static_call(kvm_x86_pmu_find_fixed_event)(idx),
!(en_field & 0x2), /* exclude user */
!(en_field & 0x1), /* exclude kernel */
pmi, false, false);
@@ -286,7 +286,7 @@ EXPORT_SYMBOL_GPL(reprogram_fixed_counter);
void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx)
{
- struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, pmc_idx);
+ struct kvm_pmc *pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, pmc_idx);
if (!pmc)
return;
@@ -308,7 +308,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
int bit;
for_each_set_bit(bit, pmu->reprogram_pmi, X86_PMC_IDX_MAX) {
- struct kvm_pmc *pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, bit);
+ struct kvm_pmc *pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, bit);
if (unlikely(!pmc || !pmc->perf_event)) {
clear_bit(bit, pmu->reprogram_pmi);
@@ -330,7 +330,7 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
/* check if idx is a valid index to access PMU */
int kvm_pmu_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
{
- return kvm_pmu_ops.is_valid_rdpmc_ecx(vcpu, idx);
+ return static_call(kvm_x86_pmu_is_valid_rdpmc_ecx)(vcpu, idx);
}
bool is_vmware_backdoor_pmc(u32 pmc_idx)
@@ -380,7 +380,7 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
if (is_vmware_backdoor_pmc(idx))
return kvm_pmu_rdpmc_vmware(vcpu, idx, data);
- pmc = kvm_pmu_ops.rdpmc_ecx_to_pmc(vcpu, idx, &mask);
+ pmc = static_call(kvm_x86_pmu_rdpmc_ecx_to_pmc)(vcpu, idx, &mask);
if (!pmc)
return 1;
@@ -396,23 +396,22 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
{
if (lapic_in_kernel(vcpu)) {
- if (kvm_pmu_ops.deliver_pmi)
- kvm_pmu_ops.deliver_pmi(vcpu);
+ static_call_cond(kvm_x86_pmu_deliver_pmi)(vcpu);
kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC);
}
}
bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
{
- return kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr) ||
- kvm_pmu_ops.is_valid_msr(vcpu, msr);
+ return static_call(kvm_x86_pmu_msr_idx_to_pmc)(vcpu, msr) ||
+ static_call(kvm_x86_pmu_is_valid_msr)(vcpu, msr);
}
EXPORT_SYMBOL_GPL(kvm_pmu_is_valid_msr);
static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
- struct kvm_pmc *pmc = kvm_pmu_ops.msr_idx_to_pmc(vcpu, msr);
+ struct kvm_pmc *pmc = static_call(kvm_x86_pmu_msr_idx_to_pmc)(vcpu, msr);
if (pmc)
__set_bit(pmc->idx, pmu->pmc_in_use);
@@ -420,13 +419,13 @@ static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)
int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
- return kvm_pmu_ops.get_msr(vcpu, msr_info);
+ return static_call(kvm_x86_pmu_get_msr)(vcpu, msr_info);
}
int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
- return kvm_pmu_ops.set_msr(vcpu, msr_info);
+ return static_call(kvm_x86_pmu_set_msr)(vcpu, msr_info);
}
/* refresh PMU settings. This function generally is called when underlying
@@ -435,7 +434,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
*/
void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
{
- kvm_pmu_ops.refresh(vcpu);
+ static_call(kvm_x86_pmu_refresh)(vcpu);
}
void kvm_pmu_reset(struct kvm_vcpu *vcpu)
@@ -443,7 +442,7 @@ void kvm_pmu_reset(struct kvm_vcpu *vcpu)
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
irq_work_sync(&pmu->irq_work);
- kvm_pmu_ops.reset(vcpu);
+ static_call(kvm_x86_pmu_reset)(vcpu);
}
void kvm_pmu_init(struct kvm_vcpu *vcpu)
@@ -451,7 +450,7 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
memset(pmu, 0, sizeof(*pmu));
- kvm_pmu_ops.init(vcpu);
+ static_call(kvm_x86_pmu_init)(vcpu);
init_irq_work(&pmu->irq_work, kvm_pmi_trigger_fn);
pmu->event_count = 0;
pmu->need_cleanup = false;
@@ -483,14 +482,13 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
pmu->pmc_in_use, X86_PMC_IDX_MAX);
for_each_set_bit(i, bitmask, X86_PMC_IDX_MAX) {
- pmc = kvm_pmu_ops.pmc_idx_to_pmc(pmu, i);
+ pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, i);
if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc))
pmc_stop_counter(pmc);
}
- if (kvm_pmu_ops.cleanup)
- kvm_pmu_ops.cleanup(vcpu);
+ static_call_cond(kvm_x86_pmu_cleanup)(vcpu);
bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX);
}
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 40e0b523637b..a4bfd4200d67 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -99,7 +99,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
{
- return kvm_pmu_ops.pmc_is_enabled(pmc);
+ return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
}
static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 70dc8f41329c..c5db444d5f4a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11306,6 +11306,11 @@ static inline void kvm_ops_static_call_update(void)
static_call_update(kvm_x86_##func, kvm_x86_ops.func)
#define KVM_X86_OP_NULL KVM_X86_OP
#include <asm/kvm-x86-ops.h>
+
+#define KVM_X86_PMU_OP(func) \
+ static_call_update(kvm_x86_pmu_##func, kvm_pmu_ops.func)
+#define KVM_X86_PMU_OP_NULL KVM_X86_PMU_OP
+#include <asm/kvm-x86-pmu-ops.h>
}
int kvm_arch_hardware_setup(void *opaque)
--
2.33.0
* Re: [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX
2021-11-08 11:10 ` [PATCH v2 1/7] KVM: x86: Export kvm_pmu_is_valid_msr() for nVMX Like Xu
@ 2021-12-08 17:55 ` Sean Christopherson
0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 17:55 UTC (permalink / raw)
To: Like Xu
Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
Joerg Roedel, linux-kernel
This is doing more than exporting a function; the export isn't even the focal
point of the patch.
On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
>
> Let's export kvm_pmu_is_valid_msr() for nVMX, instead of
Please wrap at ~75 chars.
> exporting all kvm_pmu_ops for this one case.
kvm_pmu_ops doesn't exist as of this patch, it comes later in the series.
> The reduced access scope will help to optimize the kvm_x86_ops.pmu_ops stuff
> later.
The changelog needs to explain why it's ok to add the msr_idx_to_pmc() check.
Something like:
KVM: nVMX: Use kvm_pmu_is_valid_msr() to check for PERF_GLOBAL_CTRL support
Use the generic kvm_pmu_is_valid_msr() helper when determining whether or not
PERF_GLOBAL_CTRL is exposed to the guest and thus can be loaded on nested
VM-Enter/VM-Exit. The extra (indirect!) call to msr_idx_to_pmc() that comes
with the helper is unnecessary, but harmless, as it's guaranteed to return
false for MSR_CORE_PERF_GLOBAL_CTRL and this is already a very slow path.
Using the helper will allow future code to use static_call() for the PMU ops
without having to export any static_call definitions.
Export kvm_pmu_is_valid_msr() as necessary.
All that said, looking at this whole thing again, I think I'd prefer:
From d217914c9897d9a2cfd01073284a933ace4709b7 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Wed, 8 Dec 2021 09:48:20 -0800
Subject: [PATCH] KVM: nVMX: Refactor PMU refresh to avoid referencing
kvm_x86_ops.pmu_ops
Refactor the nested VMX PMU refresh helper to pass it a flag stating
whether or not the vCPU has PERF_GLOBAL_CTRL instead of having the nVMX
helper query the information by bouncing through kvm_x86_ops.pmu_ops.
This will allow a future patch to use static_call() for the PMU ops
without having to export any static call definitions from common x86.
Alternatively, nVMX could call kvm_pmu_is_valid_msr() to indirectly use
kvm_x86_ops.pmu_ops, but that would require exporting kvm_pmu_is_valid_msr()
and would incur an extra layer of indirection.
Opportunistically rename the helper to keep line lengths somewhat
reasonable, and to better capture its high-level role.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/nested.c | 5 +++--
arch/x86/kvm/vmx/nested.h | 3 ++-
arch/x86/kvm/vmx/pmu_intel.c | 3 ++-
3 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 08e785871985..c87a81071288 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4773,7 +4773,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
return 0;
}
-void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
+void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
+ bool vcpu_has_perf_global_ctrl)
{
struct vcpu_vmx *vmx;
@@ -4781,7 +4782,7 @@ void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
return;
vmx = to_vmx(vcpu);
- if (kvm_x86_ops.pmu_ops->is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL)) {
+ if (vcpu_has_perf_global_ctrl) {
vmx->nested.msrs.entry_ctls_high |=
VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
vmx->nested.msrs.exit_ctls_high |=
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index b69a80f43b37..c92cea0b8ccc 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -32,7 +32,8 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
int vmx_get_vmx_msr(struct nested_vmx_msrs *msrs, u32 msr_index, u64 *pdata);
int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
u32 vmx_instruction_info, bool wr, int len, gva_t *ret);
-void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
+void nested_vmx_pmu_refresh(struct kvm_vcpu *vcpu,
+ bool vcpu_has_perf_global_ctrl);
void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu);
bool nested_vmx_check_io_bitmaps(struct kvm_vcpu *vcpu, unsigned int port,
int size);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 1b7456b2177b..7ca870d0249d 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -529,7 +529,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
bitmap_set(pmu->all_valid_pmc_idx,
INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
- nested_vmx_pmu_entry_exit_ctls_update(vcpu);
+ nested_vmx_pmu_refresh(vcpu,
+ intel_is_valid_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL));
if (intel_pmu_lbr_is_compatible(vcpu))
x86_perf_get_lbr(&lbr_desc->records);
--
2.34.1.400.ga245620fadb-goog
* Re: [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection
2021-11-08 11:10 ` [PATCH v2 4/7] KVM: x86: Copy kvm_pmu_ops by value to eliminate layer of indirection Like Xu
@ 2021-12-08 18:10 ` Sean Christopherson
0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 18:10 UTC (permalink / raw)
To: Like Xu
Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
Joerg Roedel, linux-kernel
On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
>
> Replace the kvm_pmu_ops pointer in common x86 with an instance of the
> struct to save one pointer dereference when invoking functions. Copy the
> struct by value to set the ops during kvm_init().
>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
> arch/x86/kvm/pmu.c | 41 ++++++++++++++++++++++-------------------
> arch/x86/kvm/pmu.h | 4 +++-
> arch/x86/kvm/x86.c | 1 +
> 3 files changed, 26 insertions(+), 20 deletions(-)
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index aa6ac9c7aff2..353989bf0102 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -47,6 +47,9 @@
> * * AMD: [0 .. AMD64_NUM_COUNTERS-1] <=> gp counters
> */
>
> +struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
> +EXPORT_SYMBOL_GPL(kvm_pmu_ops);
This export isn't necessary.
* Re: [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata
2021-11-08 11:10 ` [PATCH v2 5/7] KVM: x86: Move .pmu_ops to kvm_x86_init_ops and tagged as __initdata Like Xu
@ 2021-12-08 18:17 ` Sean Christopherson
0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 18:17 UTC (permalink / raw)
To: Like Xu
Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
Joerg Roedel, linux-kernel
s/tagged/tag
On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
>
> The pmu_ops should be moved to kvm_x86_init_ops and tagged as __initdata.
State what the patch does, not what "should" be done.
Now that pmu_ops is copied by value during kvm_arch_hardware_setup(), move
the pointer to kvm_x86_init_ops and tag implementations as __initdata to make
the implementations unreachable once KVM is loaded, e.g. to make it harder to
sneak in post-init modification bugs.
> That'll save those precious few bytes, and more importantly make
> the original ops unreachable, i.e. make it harder to sneak in post-init
> modification bugs.
>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
> arch/x86/include/asm/kvm_host.h | 4 ++--
> arch/x86/kvm/svm/pmu.c | 2 +-
> arch/x86/kvm/svm/svm.c | 2 +-
> arch/x86/kvm/vmx/pmu_intel.c | 2 +-
> arch/x86/kvm/vmx/vmx.c | 2 +-
> arch/x86/kvm/x86.c | 2 +-
> 6 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index c2d4ee2973c5..00760a3ac88c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1436,8 +1436,7 @@ struct kvm_x86_ops {
> int cpu_dirty_log_size;
> void (*update_cpu_dirty_logging)(struct kvm_vcpu *vcpu);
>
> - /* pmu operations of sub-arch */
> - const struct kvm_pmu_ops *pmu_ops;
> + /* nested operations of sub-arch */
No need for the new comment.
> const struct kvm_x86_nested_ops *nested_ops;
>
> /*
Nits aside,
Reviewed-by: Sean Christopherson <seanjc@google.com>
I'd also be a-ok squashing this with the copy-by-value patch.
* Re: [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops
2021-11-08 11:10 ` [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops Like Xu
@ 2021-12-08 18:35 ` Sean Christopherson
2021-12-09 12:50 ` Like Xu
0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2021-12-08 18:35 UTC (permalink / raw)
To: Like Xu
Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
Joerg Roedel, linux-kernel
On Mon, Nov 08, 2021, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
>
> Use static calls to improve kvm_pmu_ops performance. Introduce the
> definitions that will be used by a subsequent patch to actualize the
> savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
> definition of static calls. This header is also intended to be
> used to simplify the defition of amd_pmu_ops and intel_pmu_ops.
>
> Like what we did for kvm_x86_ops, 'pmu_ops' can be covered by
> static calls in a simlilar manner for insignificant but not
> negligible performance impact, especially on older models.
>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
This absolutely shouldn't be separated from patch 7/7. By _defining_ the static
calls but not providing the logic to actually _update_ the calls, it's entirely
possible to add static_call() invocations that will compile cleanly without any
chance of doing the right thing at runtime.
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 0236c1a953d0..804f98b5552e 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -99,7 +99,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
{
- return kvm_pmu_ops.pmc_is_enabled(pmc);
+ return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
}
static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
> @@ -0,0 +1,32 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_NULL)
> +BUILD_BUG_ON(1)
> +#endif
> +
> +/*
> + * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_NULL() are used to
Please use all 80 chars.
> + * help generate "static_call()"s. They are also intended for use when defining
> + * the amd/intel KVM_X86_PMU_OPs. KVM_X86_PMU_OP() can be used
AMD/Intel since this is referring to the vendor and not to function names (like
the below reference).
> + * for those functions that follow the [amd|intel]_func_name convention.
> + * KVM_X86_PMU_OP_NULL() can leave a NULL definition for the
As below, please drop the _NULL() variant.
> + * case where there is no definition or a function name that
> + * doesn't match the typical naming convention is supplied.
> + */
...
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 353989bf0102..bfdd9f2bc0fa 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -50,6 +50,12 @@
> struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
> EXPORT_SYMBOL_GPL(kvm_pmu_ops);
>
> +#define KVM_X86_PMU_OP(func) \
> + DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func, \
> + *(((struct kvm_pmu_ops *)0)->func))
> +#define KVM_X86_PMU_OP_NULL KVM_X86_PMU_OP
> +#include <asm/kvm-x86-pmu-ops.h>
> +
> static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
> {
> struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index b2fe135d395a..40e0b523637b 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -45,6 +45,11 @@ struct kvm_pmu_ops {
> void (*cleanup)(struct kvm_vcpu *vcpu);
> };
>
> +#define KVM_X86_PMU_OP(func) \
> + DECLARE_STATIC_CALL(kvm_x86_pmu_##func, *(((struct kvm_pmu_ops *)0)->func))
> +#define KVM_X86_PMU_OP_NULL KVM_X86_PMU_OP
I don't want to proliferate the pointless and bitrot-prone KVM_X86_OP_NULL macro,
just omit this. I'll send a patch to drop KVM_X86_OP_NULL.
> +#include <asm/kvm-x86-pmu-ops.h>
> +
> static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
> {
> struct kvm_pmu *pmu = pmc_to_pmu(pmc);
> --
> 2.33.0
>
* Re: [PATCH v2 6/7] KVM: x86: Introduce definitions to support static calls for kvm_pmu_ops
2021-12-08 18:35 ` Sean Christopherson
@ 2021-12-09 12:50 ` Like Xu
0 siblings, 0 replies; 13+ messages in thread
From: Like Xu @ 2021-12-09 12:50 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, Wanpeng Li, Jim Mattson, Vitaly Kuznetsov,
Joerg Roedel, linux-kernel
Hi Sean,
On 9/12/2021 2:35 am, Sean Christopherson wrote:
> On Mon, Nov 08, 2021, Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> Use static calls to improve kvm_pmu_ops performance. Introduce the
>> definitions that will be used by a subsequent patch to actualize the
>> savings. Add a new kvm-x86-pmu-ops.h header that can be used for the
>> definition of static calls. This header is also intended to be
>> used to simplify the defition of amd_pmu_ops and intel_pmu_ops.
>>
>> Like what we did for kvm_x86_ops, 'pmu_ops' can be covered by
>> static calls in a simlilar manner for insignificant but not
>> negligible performance impact, especially on older models.
>>
>> Signed-off-by: Like Xu <likexu@tencent.com>
>> ---
>
> This absolutely shouldn't be separated from patch 7/7. By _defining_ the static
> calls but not providing the logic to actually _update_ the calls, it's entirely
> possible to add static_call() invocations that will compile cleanly without any
> chance of doing the right thing at runtime.
>
> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
> index 0236c1a953d0..804f98b5552e 100644
> --- a/arch/x86/kvm/pmu.h
> +++ b/arch/x86/kvm/pmu.h
> @@ -99,7 +99,7 @@ static inline bool pmc_is_fixed(struct kvm_pmc *pmc)
>
> static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
> {
> - return kvm_pmu_ops.pmc_is_enabled(pmc);
> + return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
> }
>
> static inline bool kvm_valid_perf_global_ctrl(struct kvm_pmu *pmu,
>
>> @@ -0,0 +1,32 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#if !defined(KVM_X86_PMU_OP) || !defined(KVM_X86_PMU_OP_NULL)
>> +BUILD_BUG_ON(1)
>> +#endif
>> +
>> +/*
>> + * KVM_X86_PMU_OP() and KVM_X86_PMU_OP_NULL() are used to
>
> Please use all 80 chars.
>
>> + * help generate "static_call()"s. They are also intended for use when defining
>> + * the amd/intel KVM_X86_PMU_OPs. KVM_X86_PMU_OP() can be used
>
> AMD/Intel since this is referring to the vendor and not to function names (like
> the below reference).
>
>> + * for those functions that follow the [amd|intel]_func_name convention.
>> + * KVM_X86_PMU_OP_NULL() can leave a NULL definition for the
>
> As below, please drop the _NULL() variant.
>
>> + * case where there is no definition or a function name that
>> + * doesn't match the typical naming convention is supplied.
>> + */
>
> ...
>
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index 353989bf0102..bfdd9f2bc0fa 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -50,6 +50,12 @@
>> struct kvm_pmu_ops kvm_pmu_ops __read_mostly;
>> EXPORT_SYMBOL_GPL(kvm_pmu_ops);
>>
>> +#define KVM_X86_PMU_OP(func) \
>> + DEFINE_STATIC_CALL_NULL(kvm_x86_pmu_##func, \
>> + *(((struct kvm_pmu_ops *)0)->func))
>> +#define KVM_X86_PMU_OP_NULL KVM_X86_PMU_OP
>> +#include <asm/kvm-x86-pmu-ops.h>
>> +
>> static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
>> {
>> struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
>> diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
>> index b2fe135d395a..40e0b523637b 100644
>> --- a/arch/x86/kvm/pmu.h
>> +++ b/arch/x86/kvm/pmu.h
>> @@ -45,6 +45,11 @@ struct kvm_pmu_ops {
>> void (*cleanup)(struct kvm_vcpu *vcpu);
>> };
>>
>> +#define KVM_X86_PMU_OP(func) \
>> + DECLARE_STATIC_CALL(kvm_x86_pmu_##func, *(((struct kvm_pmu_ops *)0)->func))
>> +#define KVM_X86_PMU_OP_NULL KVM_X86_PMU_OP
>
> I don't want to proliferate the pointless and bitrot-prone KVM_X86_OP_NULL macro,
> just omit this. I'll send a patch to drop KVM_X86_OP_NULL.
Thanks for your clear comments on this patch set.
I will send out V3 once KVM_X86_OP_NULL is dropped and the comment in
arch/x86/include/asm/kvm-x86-ops.h is updated.
>
>> +#include <asm/kvm-x86-pmu-ops.h>
>> +
>> static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
>> {
>> struct kvm_pmu *pmu = pmc_to_pmu(pmc);
>> --
>> 2.33.0
>>
>