LKML Archive on lore.kernel.org
* [PATCH v5 00/13] Introduce Architectural LBR for vPMU
@ 2021-07-09 10:04 Yang Weijiang
  2021-07-09 10:04 ` [PATCH v5 01/13] perf/x86/intel: Fix the comment about guest LBR support on KVM Yang Weijiang
                   ` (12 more replies)
  0 siblings, 13 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:04 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Yang Weijiang

Architectural Last Branch Records (Arch LBR) is documented in release
319433-040 of the Intel Architecture Instruction Set Extensions and Future
Features Programming Reference [0].

The main advantages of Arch LBR are [1]:
- Faster context switching due to XSAVES support and faster reset of
  LBR MSRs via the new DEPTH MSR
- Faster LBR read for a non-PEBS event due to XSAVES support, which
  lowers the overhead of the NMI handler.
- Linux kernel can support the LBR features without knowing the model
  number of the current CPU.

From the end user's point of view, Arch LBR is used in the same way as
the legacy LBR support that has already been merged in mainline.

[0] https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-and-future-features-programming-reference.html
[1] https://lore.kernel.org/lkml/1593780569-62993-1-git-send-email-kan.liang@linux.intel.com/

Note: to make the patch set buildable against the kernel tree, this series
includes 3 CET patches queued on Paolo's queue branch.

Previous version:
v4:
https://lkml.kernel.org/kvm/20210510081535.94184-1-like.xu@linux.intel.com/

Like Xu (6):
  perf/x86/intel: Fix the comment about guest LBR support on KVM
  perf/x86/lbr: Simplify the exposure check for the LBR_INFO registers
  KVM: vmx/pmu: Emulate MSR_ARCH_LBR_DEPTH for guest Arch LBR
  KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL for guest Arch LBR
  KVM: x86: Refine the matching and clearing logic for supported_xss
  KVM: x86: Add XSAVE Support for Architectural LBR

Sean Christopherson (1):
  KVM: x86: Report XSS as an MSR to be saved if there are supported
    features

Yang Weijiang (6):
  KVM: x86: Add arch LBR MSRs to msrs_to_save_all list
  KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  KVM: x86/pmu: Refactor code to support guest Arch LBR
  KVM: x86: Refresh CPUID on writes to MSR_IA32_XSS
  KVM: x86/vmx: Check Arch LBR config when returning perf capabilities
  KVM: x86/cpuid: Advertise Arch LBR feature in CPUID

 arch/x86/events/intel/core.c     |   3 +-
 arch/x86/events/intel/lbr.c      |   6 +-
 arch/x86/include/asm/kvm_host.h  |   1 +
 arch/x86/include/asm/msr-index.h |   1 +
 arch/x86/include/asm/vmx.h       |   4 ++
 arch/x86/kvm/cpuid.c             |  44 ++++++++++--
 arch/x86/kvm/vmx/capabilities.h  |  25 ++++---
 arch/x86/kvm/vmx/pmu_intel.c     | 114 +++++++++++++++++++++++++++----
 arch/x86/kvm/vmx/vmx.c           |  50 ++++++++++++--
 arch/x86/kvm/vmx/vmx.h           |   4 ++
 arch/x86/kvm/x86.c               |  24 ++++++-
 11 files changed, 237 insertions(+), 39 deletions(-)

-- 
2.21.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 01/13] perf/x86/intel: Fix the comment about guest LBR support on KVM
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
@ 2021-07-09 10:04 ` Yang Weijiang
  2021-07-09 10:05 ` [PATCH v5 02/13] perf/x86/lbr: Simplify the exposure check for the LBR_INFO registers Yang Weijiang
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:04 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Like Xu, Peter Zijlstra, Yang Weijiang

From: Like Xu <like.xu@linux.intel.com>

Starting from v5.12, KVM reports guest LBR and extra_regs support
when the host has relevant support. Just delete this part of the
comment and fix a typo incidentally.

Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/events/intel/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index e28892270c58..84fdb8a085e5 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -6249,8 +6249,7 @@ __init int intel_pmu_init(void)
 					  x86_pmu.intel_ctrl);
 	/*
 	 * Access LBR MSR may cause #GP under certain circumstances.
-	 * E.g. KVM doesn't support LBR MSR
-	 * Check all LBT MSR here.
+	 * Check all LBR MSR here.
 	 * Disable LBR access if any LBR MSRs can not be accessed.
 	 */
 	if (x86_pmu.lbr_tos && !check_msr(x86_pmu.lbr_tos, 0x3UL))
-- 
2.21.1



* [PATCH v5 02/13] perf/x86/lbr: Simplify the exposure check for the LBR_INFO registers
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
  2021-07-09 10:04 ` [PATCH v5 01/13] perf/x86/intel: Fix the comment about guest LBR support on KVM Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 10:05 ` [PATCH v5 03/13] KVM: x86: Add arch LBR MSRs to msrs_to_save_all list Yang Weijiang
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Like Xu, Peter Zijlstra, Yang Weijiang

From: Like Xu <like.xu@linux.intel.com>

The x86_pmu.lbr_info is 0 unless explicitly initialized, so there's
no point checking x86_pmu.intel_cap.lbr_format.

Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/events/intel/lbr.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index e8453de7a964..46dc8f6cc21a 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1848,12 +1848,10 @@ void __init intel_pmu_arch_lbr_init(void)
  */
 int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
 {
-	int lbr_fmt = x86_pmu.intel_cap.lbr_format;
-
 	lbr->nr = x86_pmu.lbr_nr;
 	lbr->from = x86_pmu.lbr_from;
 	lbr->to = x86_pmu.lbr_to;
-	lbr->info = (lbr_fmt == LBR_FORMAT_INFO) ? x86_pmu.lbr_info : 0;
+	lbr->info = x86_pmu.lbr_info;
 
 	return 0;
 }
-- 
2.21.1



* [PATCH v5 03/13] KVM: x86: Add arch LBR MSRs to msrs_to_save_all list
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
  2021-07-09 10:04 ` [PATCH v5 01/13] perf/x86/intel: Fix the comment about guest LBR support on KVM Yang Weijiang
  2021-07-09 10:05 ` [PATCH v5 02/13] perf/x86/lbr: Simplify the exposure check for the LBR_INFO registers Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 18:24   ` Jim Mattson
  2021-07-09 10:05 ` [PATCH v5 04/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_DEPTH for guest Arch LBR Yang Weijiang
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Yang Weijiang

The Arch LBR MSRs MSR_ARCH_LBR_DEPTH and MSR_ARCH_LBR_CTL are saved and
restored by the userspace application if they're available.

Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/x86.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e0f4a46649d7..b586a45fce2b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1298,6 +1298,7 @@ static const u32 msrs_to_save_all[] = {
 	MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
+	MSR_ARCH_LBR_CTL, MSR_ARCH_LBR_DEPTH,
 };
 
 static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
@@ -6049,6 +6050,11 @@ static void kvm_init_msr_list(void)
 			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
 				continue;
 			break;
+		case MSR_ARCH_LBR_DEPTH:
+		case MSR_ARCH_LBR_CTL:
+			if (!kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
+				continue;
+			break;
 		default:
 			break;
 		}
-- 
2.21.1



* [PATCH v5 04/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_DEPTH for guest Arch LBR
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
                   ` (2 preceding siblings ...)
  2021-07-09 10:05 ` [PATCH v5 03/13] KVM: x86: Add arch LBR MSRs to msrs_to_save_all list Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 20:35   ` Jim Mattson
  2021-07-09 10:05 ` [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL " Yang Weijiang
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Like Xu, Yang Weijiang

From: Like Xu <like.xu@linux.intel.com>

The number of Arch LBR entries available is determined by the value
in host MSR_ARCH_LBR_DEPTH.DEPTH. The supported LBR depth values are
enumerated in CPUID.(EAX=1CH, ECX=0):EAX[7:0]. For each bit "n" set
in this field, an MSR_ARCH_LBR_DEPTH.DEPTH value of "8*(n+1)" is
supported.

On a guest write to MSR_ARCH_LBR_DEPTH, all LBR entries are reset to 0.
KVM emulates the reset behavior by introducing lbr_desc->arch_lbr_reset.
KVM writes the guest-requested value to the native MSR_ARCH_LBR_DEPTH
(this is safe because the two values will be the same) when the Arch LBR
record MSRs are passed through to the guest.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/pmu_intel.c | 46 +++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.h       |  3 +++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 9efc1a6b8693..da68f0e74702 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -211,7 +211,7 @@ static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
 static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	int ret;
+	int ret = 0;
 
 	switch (msr) {
 	case MSR_CORE_PERF_FIXED_CTR_CTRL:
@@ -220,6 +220,10 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
 		ret = pmu->version > 1;
 		break;
+	case MSR_ARCH_LBR_DEPTH:
+		if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
+			ret = guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR);
+		break;
 	default:
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -250,6 +254,7 @@ static inline void intel_pmu_release_guest_lbr_event(struct kvm_vcpu *vcpu)
 	if (lbr_desc->event) {
 		perf_event_release_kernel(lbr_desc->event);
 		lbr_desc->event = NULL;
+		lbr_desc->arch_lbr_reset = false;
 		vcpu_to_pmu(vcpu)->event_count--;
 	}
 }
@@ -348,10 +353,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu,
 	return true;
 }
 
+/*
+ * Check if the requested depth value is supported
+ * based on bits [7:0] of the guest's CPUID.1CH.EAX.
+ */
+static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
+{
+	struct kvm_cpuid_entry2 *best;
+
+	best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
+	if (best && depth && (depth < 65) && !(depth & 7))
+		return best->eax & BIT_ULL(depth / 8 - 1);
+
+	return false;
+}
+
 static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	struct kvm_pmc *pmc;
+	struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
 	u32 msr = msr_info->index;
 
 	switch (msr) {
@@ -367,6 +388,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
 		msr_info->data = pmu->global_ovf_ctrl;
 		return 0;
+	case MSR_ARCH_LBR_DEPTH:
+		msr_info->data = lbr_desc->records.nr;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -393,6 +417,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	struct kvm_pmc *pmc;
+	struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
 	u32 msr = msr_info->index;
 	u64 data = msr_info->data;
 
@@ -427,6 +452,12 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 0;
 		}
 		break;
+	case MSR_ARCH_LBR_DEPTH:
+		if (!arch_lbr_depth_is_valid(vcpu, data))
+			return 1;
+		lbr_desc->records.nr = data;
+		lbr_desc->arch_lbr_reset = true;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -566,6 +597,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
 	lbr_desc->records.nr = 0;
 	lbr_desc->event = NULL;
 	lbr_desc->msr_passthrough = false;
+	lbr_desc->arch_lbr_reset = false;
 }
 
 static void intel_pmu_reset(struct kvm_vcpu *vcpu)
@@ -623,6 +655,15 @@ static void intel_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
 		intel_pmu_legacy_freezing_lbrs_on_pmi(vcpu);
 }
 
+static void intel_pmu_arch_lbr_reset(struct kvm_vcpu *vcpu)
+{
+	struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
+
+	/* Software write to IA32_LBR_DEPTH will reset all LBR entries. */
+	wrmsrl(MSR_ARCH_LBR_DEPTH, lbr_desc->records.nr);
+	lbr_desc->arch_lbr_reset = false;
+}
+
 static void vmx_update_intercept_for_lbr_msrs(struct kvm_vcpu *vcpu, bool set)
 {
 	struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu);
@@ -654,6 +695,9 @@ static inline void vmx_enable_lbr_msrs_passthrough(struct kvm_vcpu *vcpu)
 {
 	struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
 
+	if (unlikely(lbr_desc->arch_lbr_reset))
+		intel_pmu_arch_lbr_reset(vcpu);
+
 	if (lbr_desc->msr_passthrough)
 		return;
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 16e4e457ba23..cc362e2d3eaa 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -116,6 +116,9 @@ struct lbr_desc {
 
 	/* True if LBRs are marked as not intercepted in the MSR bitmap */
 	bool msr_passthrough;
+
+	/* Reset all LBR entries on a guest write to MSR_ARCH_LBR_DEPTH */
+	bool arch_lbr_reset;
 };
 
 /*
-- 
2.21.1



* [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL for guest Arch LBR
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
                   ` (3 preceding siblings ...)
  2021-07-09 10:05 ` [PATCH v5 04/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_DEPTH for guest Arch LBR Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 21:55   ` Jim Mattson
  2021-07-10  0:42   ` kernel test robot
  2021-07-09 10:05 ` [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state Yang Weijiang
                   ` (7 subsequent siblings)
  12 siblings, 2 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Like Xu, Yang Weijiang

From: Like Xu <like.xu@linux.intel.com>

Arch LBR is enabled by setting MSR_ARCH_LBR_CTL.LBREn to 1. A new guest
state field named "Guest IA32_LBR_CTL" is added to enhance guest LBR usage.
When guest Arch LBR is enabled, a guest LBR event is created just as it is
for model-specific LBR. The guest LBR enable bit is cleared during host PMI
handling so the guest sees the expected configuration.

On processors that support Arch LBR, MSR_IA32_DEBUGCTLMSR[bit 0] has no
meaning. It can be written to 0 or 1, but reads always return 0.
Like IA32_DEBUGCTL, the IA32_ARCH_LBR_CTL MSR is also preserved on INIT.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/events/intel/lbr.c      |  2 --
 arch/x86/include/asm/msr-index.h |  1 +
 arch/x86/include/asm/vmx.h       |  2 ++
 arch/x86/kvm/vmx/pmu_intel.c     | 31 ++++++++++++++++++++++++++-----
 arch/x86/kvm/vmx/vmx.c           |  9 +++++++++
 5 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 46dc8f6cc21a..0e8e6c23928f 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -168,8 +168,6 @@ enum {
 	 ARCH_LBR_RETURN		|\
 	 ARCH_LBR_OTHER_BRANCH)
 
-#define ARCH_LBR_CTL_MASK			0x7f000e
-
 static void intel_pmu_lbr_filter(struct cpu_hw_events *cpuc);
 
 static __always_inline bool is_lbr_call_stack_bit_set(u64 config)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 211ba3375ee9..229c955b5cfc 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -169,6 +169,7 @@
 #define LBR_INFO_BR_TYPE		(0xfull << LBR_INFO_BR_TYPE_OFFSET)
 
 #define MSR_ARCH_LBR_CTL		0x000014ce
+#define ARCH_LBR_CTL_MASK		0x7f000e
 #define ARCH_LBR_CTL_LBREN		BIT(0)
 #define ARCH_LBR_CTL_CPL_OFFSET		1
 #define ARCH_LBR_CTL_CPL		(0x3ull << ARCH_LBR_CTL_CPL_OFFSET)
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 0ffaa3156a4e..ea3be961cc8e 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -245,6 +245,8 @@ enum vmcs_field {
 	GUEST_BNDCFGS_HIGH              = 0x00002813,
 	GUEST_IA32_RTIT_CTL		= 0x00002814,
 	GUEST_IA32_RTIT_CTL_HIGH	= 0x00002815,
+	GUEST_IA32_LBR_CTL		= 0x00002816,
+	GUEST_IA32_LBR_CTL_HIGH		= 0x00002817,
 	HOST_IA32_PAT			= 0x00002c00,
 	HOST_IA32_PAT_HIGH		= 0x00002c01,
 	HOST_IA32_EFER			= 0x00002c02,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index da68f0e74702..4500c564c63a 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -19,6 +19,11 @@
 #include "pmu.h"
 
 #define MSR_PMC_FULL_WIDTH_BIT      (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)
+/*
+ * For both Arch LBR and legacy LBR, setting bit 0 (LBREn) of the
+ * corresponding control MSR to 1 enables LBR recording.
+ */
+#define KVM_ARCH_LBR_CTL_MASK	(ARCH_LBR_CTL_MASK | ARCH_LBR_CTL_LBREN)
 
 static struct kvm_event_hw_type_mapping intel_arch_events[] = {
 	/* Index must match CPUID 0x0A.EBX bit vector */
@@ -221,6 +226,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 		ret = pmu->version > 1;
 		break;
 	case MSR_ARCH_LBR_DEPTH:
+	case MSR_ARCH_LBR_CTL:
 		if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
 			ret = guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR);
 		break;
@@ -391,6 +397,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_ARCH_LBR_DEPTH:
 		msr_info->data = lbr_desc->records.nr;
 		return 0;
+	case MSR_ARCH_LBR_CTL:
+		msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -458,6 +467,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		lbr_desc->records.nr = data;
 		lbr_desc->arch_lbr_reset = true;
 		return 0;
+	case MSR_ARCH_LBR_CTL:
+		if (data & ~KVM_ARCH_LBR_CTL_MASK)
+			break;
+		vmcs_write64(GUEST_IA32_LBR_CTL, data);
+		if (intel_pmu_lbr_is_enabled(vcpu) && !lbr_desc->event &&
+		    (data & ARCH_LBR_CTL_LBREN))
+			intel_pmu_create_guest_lbr_event(vcpu);
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -636,12 +653,16 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu)
  */
 static void intel_pmu_legacy_freezing_lbrs_on_pmi(struct kvm_vcpu *vcpu)
 {
-	u64 data = vmcs_read64(GUEST_IA32_DEBUGCTL);
+	u32 lbr_ctl_field = GUEST_IA32_DEBUGCTL;
 
-	if (data & DEBUGCTLMSR_FREEZE_LBRS_ON_PMI) {
-		data &= ~DEBUGCTLMSR_LBR;
-		vmcs_write64(GUEST_IA32_DEBUGCTL, data);
-	}
+	if (!(vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_FREEZE_LBRS_ON_PMI))
+		return;
+
+	if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR) &&
+	    guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+		lbr_ctl_field = GUEST_IA32_LBR_CTL;
+
+	vmcs_write64(lbr_ctl_field, vmcs_read64(lbr_ctl_field) & ~0x1ULL);
 }
 
 static void intel_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9f2793c89155..1a79ac1757af 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2088,6 +2088,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 						VM_EXIT_SAVE_DEBUG_CONTROLS)
 			get_vmcs12(vcpu)->guest_ia32_debugctl = data;
 
+		/*
+		 * For Arch LBR, IA32_DEBUGCTL[bit 0] has no meaning.
+		 * It can be written to 0 or 1, but reads will always return 0.
+		 */
+		if (guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+			data &= ~DEBUGCTLMSR_LBR;
+
 		vmcs_write64(GUEST_IA32_DEBUGCTL, data);
 		if (intel_pmu_lbr_is_enabled(vcpu) && !to_vmx(vcpu)->lbr_desc.event &&
 		    (data & DEBUGCTLMSR_LBR))
@@ -4527,6 +4534,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 		vmcs_writel(GUEST_SYSENTER_ESP, 0);
 		vmcs_writel(GUEST_SYSENTER_EIP, 0);
 		vmcs_write64(GUEST_IA32_DEBUGCTL, 0);
+		if (static_cpu_has(X86_FEATURE_ARCH_LBR))
+			vmcs_write64(GUEST_IA32_LBR_CTL, 0);
 	}
 
 	kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
-- 
2.21.1



* [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
                   ` (4 preceding siblings ...)
  2021-07-09 10:05 ` [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL " Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 22:54   ` Jim Mattson
  2021-07-09 10:05 ` [PATCH v5 07/13] KVM: x86/pmu: Refactor code to support guest Arch LBR Yang Weijiang
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Yang Weijiang

If the host is using MSR_ARCH_LBR_CTL, save it before VM-entry
and restore it after VM-exit.

Co-developed-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1a79ac1757af..0d714e76e2d5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1397,6 +1397,26 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
 		decache_tsc_multiplier(vmx);
 }
 
+static inline unsigned long get_lbrctlmsr(void)
+{
+	unsigned long lbrctlmsr = 0;
+
+	if (!static_cpu_has(X86_FEATURE_ARCH_LBR))
+		return 0;
+
+	rdmsrl(MSR_ARCH_LBR_CTL, lbrctlmsr);
+
+	return lbrctlmsr;
+}
+
+static inline void update_lbrctlmsr(unsigned long lbrctlmsr)
+{
+	if (!static_cpu_has(X86_FEATURE_ARCH_LBR))
+		return;
+
+	wrmsrl(MSR_ARCH_LBR_CTL, lbrctlmsr);
+}
+
 /*
  * Switches to specified vcpu, until a matching vcpu_put(), but assumes
  * vcpu mutex is already taken.
@@ -1410,6 +1430,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	vmx_vcpu_pi_load(vcpu, cpu);
 
 	vmx->host_debugctlmsr = get_debugctlmsr();
+	vmx->host_lbrctlmsr = get_lbrctlmsr();
 }
 
 static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
@@ -6797,6 +6818,8 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	/* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
 	if (vmx->host_debugctlmsr)
 		update_debugctlmsr(vmx->host_debugctlmsr);
+	if (vmx->host_lbrctlmsr)
+		update_lbrctlmsr(vmx->host_lbrctlmsr);
 
 #ifndef CONFIG_X86_64
 	/*
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index cc362e2d3eaa..69e243fea23d 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -328,6 +328,7 @@ struct vcpu_vmx {
 	u64 current_tsc_ratio;
 
 	unsigned long host_debugctlmsr;
+	unsigned long host_lbrctlmsr;
 
 	/*
 	 * Only bits masked by msr_ia32_feature_control_valid_bits can be set in
-- 
2.21.1



* [PATCH v5 07/13] KVM: x86/pmu: Refactor code to support guest Arch LBR
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
                   ` (5 preceding siblings ...)
  2021-07-09 10:05 ` [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 10:05 ` [PATCH v5 08/13] KVM: x86: Refresh CPUID on writes to MSR_IA32_XSS Yang Weijiang
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Yang Weijiang

Take Arch LBR into account when doing sanity checks before programming
the vPMU for the guest. Pass the Arch LBR recording MSRs through to the
guest for better performance. Note, Arch LBR and legacy LBR support are
mutually exclusive, i.e., they're never both available on the same
platform.

Co-developed-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/pmu_intel.c | 37 +++++++++++++++++++++++++++++-------
 arch/x86/kvm/vmx/vmx.c       |  3 +++
 2 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 4500c564c63a..4041f1d9ba36 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -180,12 +180,16 @@ static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr)
 
 bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu)
 {
+	if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
+		return guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR);
+
 	/*
 	 * As a first step, a guest could only enable LBR feature if its
 	 * cpu model is the same as the host because the LBR registers
 	 * would be pass-through to the guest and they're model specific.
 	 */
-	return boot_cpu_data.x86_model == guest_cpuid_model(vcpu);
+	return !boot_cpu_has(X86_FEATURE_ARCH_LBR) &&
+		boot_cpu_data.x86_model == guest_cpuid_model(vcpu);
 }
 
 bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
@@ -203,12 +207,19 @@ static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
 	if (!intel_pmu_lbr_is_enabled(vcpu))
 		return ret;
 
-	ret = (index == MSR_LBR_SELECT) || (index == MSR_LBR_TOS) ||
-		(index >= records->from && index < records->from + records->nr) ||
-		(index >= records->to && index < records->to + records->nr);
+	if (!guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+		ret = (index == MSR_LBR_SELECT) || (index == MSR_LBR_TOS);
+
+	if (!ret) {
+		ret = (index >= records->from &&
+		       index < records->from + records->nr) ||
+		      (index >= records->to &&
+		       index < records->to + records->nr);
+	}
 
 	if (!ret && records->info)
-		ret = (index >= records->info && index < records->info + records->nr);
+		ret = (index >= records->info &&
+		       index < records->info + records->nr);
 
 	return ret;
 }
@@ -697,6 +708,9 @@ static void vmx_update_intercept_for_lbr_msrs(struct kvm_vcpu *vcpu, bool set)
 			vmx_set_intercept_for_msr(vcpu, lbr->info + i, MSR_TYPE_RW, set);
 	}
 
+	if (guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+		return;
+
 	vmx_set_intercept_for_msr(vcpu, MSR_LBR_SELECT, MSR_TYPE_RW, set);
 	vmx_set_intercept_for_msr(vcpu, MSR_LBR_TOS, MSR_TYPE_RW, set);
 }
@@ -740,10 +754,13 @@ void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 	struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
+	bool lbr_enable = guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR) ?
+		(vmcs_read64(GUEST_IA32_LBR_CTL) & ARCH_LBR_CTL_LBREN) :
+		(vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR);
 
 	if (!lbr_desc->event) {
 		vmx_disable_lbr_msrs_passthrough(vcpu);
-		if (vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR)
+		if (lbr_enable)
 			goto warn;
 		if (test_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use))
 			goto warn;
@@ -760,13 +777,19 @@ void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu)
 	return;
 
 warn:
+	if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
+		wrmsrl(MSR_ARCH_LBR_DEPTH, lbr_desc->records.nr);
 	pr_warn_ratelimited("kvm: vcpu-%d: fail to passthrough LBR.\n",
 		vcpu->vcpu_id);
 }
 
 static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
 {
-	if (!(vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR))
+	bool lbr_enable = guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR) ?
+		(vmcs_read64(GUEST_IA32_LBR_CTL) & ARCH_LBR_CTL_LBREN) :
+		(vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR);
+
+	if (!lbr_enable)
 		intel_pmu_release_guest_lbr_event(vcpu);
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0d714e76e2d5..2d23c0296611 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -671,6 +671,9 @@ static bool is_valid_passthrough_msr(u32 msr)
 	case MSR_LBR_NHM_TO ... MSR_LBR_NHM_TO + 31:
 	case MSR_LBR_CORE_FROM ... MSR_LBR_CORE_FROM + 8:
 	case MSR_LBR_CORE_TO ... MSR_LBR_CORE_TO + 8:
+	case MSR_ARCH_LBR_FROM_0 ... MSR_ARCH_LBR_FROM_0 + 31:
+	case MSR_ARCH_LBR_TO_0 ... MSR_ARCH_LBR_TO_0 + 31:
+	case MSR_ARCH_LBR_INFO_0 ... MSR_ARCH_LBR_INFO_0 + 31:
 		/* LBR MSRs. These are handled in vmx_update_intercept_for_lbr_msrs() */
 		return true;
 	}
-- 
2.21.1



* [PATCH v5 08/13] KVM: x86: Refresh CPUID on writes to MSR_IA32_XSS
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
                   ` (6 preceding siblings ...)
  2021-07-09 10:05 ` [PATCH v5 07/13] KVM: x86/pmu: Refactor code to support guest Arch LBR Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 10:05 ` [PATCH v5 09/13] KVM: x86: Report XSS as an MSR to be saved if there are supported features Yang Weijiang
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Yang Weijiang

Update CPUID.0xD.0x1, which reports the currently required storage size
of all features enabled via XCR0 | XSS, whenever the guest's XSS is modified.

Note, KVM does not yet support any XSS based features, i.e. supported_xss
is guaranteed to be zero at this time.

Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20210203113421.5759-3-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            | 21 ++++++++++++++++++---
 arch/x86/kvm/x86.c              |  7 +++++--
 3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9c7ced0e3171..a98b15cefc6b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -651,6 +651,7 @@ struct kvm_vcpu_arch {
 
 	u64 xcr0;
 	u64 guest_supported_xcr0;
+	u64 guest_supported_xss;
 
 	struct kvm_pio_request pio;
 	void *pio_data;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b4da665bb892..d6e343809b25 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -131,9 +131,24 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, false);
 
 	best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
-	if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
-		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
-		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+	if (best) {
+		if (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
+		    cpuid_entry_has(best, X86_FEATURE_XSAVEC))  {
+			u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss;
+
+			best->ebx = xstate_required_size(xstate, true);
+		}
+
+		if (!cpuid_entry_has(best, X86_FEATURE_XSAVES)) {
+			best->ecx = 0;
+			best->edx = 0;
+		}
+		vcpu->arch.guest_supported_xss =
+			(((u64)best->edx << 32) | best->ecx) & supported_xss;
+
+	} else {
+		vcpu->arch.guest_supported_xss = 0;
+	}
 
 	best = kvm_find_cpuid_entry(vcpu, KVM_CPUID_FEATURES, 0);
 	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b586a45fce2b..5f2e13c9f507 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3265,9 +3265,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than
 		 * XSAVES/XRSTORS to save/restore PT MSRs.
 		 */
-		if (data & ~supported_xss)
+		if (data & ~vcpu->arch.guest_supported_xss)
 			return 1;
-		vcpu->arch.ia32_xss = data;
+		if (vcpu->arch.ia32_xss != data) {
+			vcpu->arch.ia32_xss = data;
+			kvm_update_cpuid_runtime(vcpu);
+		}
 		break;
 	case MSR_SMI_COUNT:
 		if (!msr_info->host_initiated)
-- 
2.21.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 09/13] KVM: x86: Report XSS as an MSR to be saved if there are supported features
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
                   ` (7 preceding siblings ...)
  2021-07-09 10:05 ` [PATCH v5 08/13] KVM: x86: Refresh CPUID on writes to MSR_IA32_XSS Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 10:05 ` [PATCH v5 10/13] KVM: x86: Refine the matching and clearing logic for supported_xss Yang Weijiang
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Sean Christopherson, Yang Weijiang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Add MSR_IA32_XSS to the list of MSRs reported to userspace if
supported_xss is non-zero, i.e. KVM supports at least one XSS-based
feature.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20210203113421.5759-2-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/x86.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5f2e13c9f507..c225260c949e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1299,6 +1299,7 @@ static const u32 msrs_to_save_all[] = {
 	MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
 	MSR_ARCH_LBR_CTL, MSR_ARCH_LBR_DEPTH,
+	MSR_IA32_XSS,
 };
 
 static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
@@ -6058,6 +6059,10 @@ static void kvm_init_msr_list(void)
 			if (!kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
 				continue;
 			break;
+		case MSR_IA32_XSS:
+			if (!supported_xss)
+				continue;
+			break;
 		default:
 			break;
 		}
-- 
2.21.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 10/13] KVM: x86: Refine the matching and clearing logic for supported_xss
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
                   ` (8 preceding siblings ...)
  2021-07-09 10:05 ` [PATCH v5 09/13] KVM: x86: Report XSS as an MSR to be saved if there are supported features Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 10:05 ` [PATCH v5 11/13] KVM: x86: Add XSAVE Support for Architectural LBR Yang Weijiang
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Like Xu, Yang Weijiang

From: Like Xu <like.xu@linux.intel.com>

Refine the existing code path that clears supported_xss: initialize
supported_xss by filtering host_xss through the KVM_SUPPORTED_XSS mask, then
update its value by clearing bits (rather than setting them).

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 5 +++--
 arch/x86/kvm/x86.c     | 6 +++++-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2d23c0296611..636c50b95038 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7349,9 +7349,10 @@ static __init void vmx_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_UMIP);
 
 	/* CPUID 0xD.1 */
-	supported_xss = 0;
-	if (!cpu_has_vmx_xsaves())
+	if (!cpu_has_vmx_xsaves()) {
 		kvm_cpu_cap_clear(X86_FEATURE_XSAVES);
+		supported_xss = 0;
+	}
 
 	/* CPUID 0x80000001 and 0x7 (RDPID) */
 	if (!cpu_has_vmx_rdtscp()) {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c225260c949e..2424d475a4d7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -203,6 +203,8 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs;
 				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
 				| XFEATURE_MASK_PKRU)
 
+#define KVM_SUPPORTED_XSS     0
+
 u64 __read_mostly host_efer;
 EXPORT_SYMBOL_GPL(host_efer);
 
@@ -10654,8 +10656,10 @@ int kvm_arch_hardware_setup(void *opaque)
 
 	rdmsrl_safe(MSR_EFER, &host_efer);
 
-	if (boot_cpu_has(X86_FEATURE_XSAVES))
+	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
 		rdmsrl(MSR_IA32_XSS, host_xss);
+		supported_xss = host_xss & KVM_SUPPORTED_XSS;
+	}
 
 	r = ops->hardware_setup();
 	if (r != 0)
-- 
2.21.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 11/13] KVM: x86: Add XSAVE Support for Architectural LBR
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
                   ` (9 preceding siblings ...)
  2021-07-09 10:05 ` [PATCH v5 10/13] KVM: x86: Refine the matching and clearing logic for supported_xss Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 10:05 ` [PATCH v5 12/13] KVM: x86/vmx: Check Arch LBR config when return perf capabilities Yang Weijiang
  2021-07-09 10:05 ` [PATCH v5 13/13] KVM: x86/cpuid: Advise Arch LBR feature in CPUID Yang Weijiang
  12 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Like Xu, Yang Weijiang

From: Like Xu <like.xu@linux.intel.com>

On processors that support XSAVES and XRSTORS, Architectural LBR XSAVE
support is enumerated in CPUID.(EAX=0DH, ECX=1):ECX[bit 15].
The detailed state-component sub-leaf for Arch LBR is enumerated in
CPUID.(EAX=0DH, ECX=0FH).

XSAVES provides a faster means than RDMSR for the guest to read all LBRs.
When guest IA32_XSS[bit 15] is set, the Arch LBR state can be saved with
XSAVES and restored with XRSTORS, given the appropriate RFBM.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 4 ++++
 arch/x86/kvm/x86.c     | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 636c50b95038..3eccda710495 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7337,6 +7337,10 @@ static __init void vmx_set_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_INVPCID);
 	if (vmx_pt_mode_is_host_guest())
 		kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
+	if (!cpu_has_vmx_arch_lbr()) {
+		kvm_cpu_cap_clear(X86_FEATURE_ARCH_LBR);
+		supported_xss &= ~XFEATURE_MASK_LBR;
+	}
 
 	if (!enable_sgx) {
 		kvm_cpu_cap_clear(X86_FEATURE_SGX);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2424d475a4d7..c09522c1f3ec 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -203,7 +203,7 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs;
 				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
 				| XFEATURE_MASK_PKRU)
 
-#define KVM_SUPPORTED_XSS     0
+#define KVM_SUPPORTED_XSS     XFEATURE_MASK_LBR
 
 u64 __read_mostly host_efer;
 EXPORT_SYMBOL_GPL(host_efer);
-- 
2.21.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 12/13] KVM: x86/vmx: Check Arch LBR config when return perf capabilities
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
                   ` (10 preceding siblings ...)
  2021-07-09 10:05 ` [PATCH v5 11/13] KVM: x86: Add XSAVE Support for Architectural LBR Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  2021-07-09 10:05 ` [PATCH v5 13/13] KVM: x86/cpuid: Advise Arch LBR feature in CPUID Yang Weijiang
  12 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Yang Weijiang

Two new bit fields (VM_EXIT_CLEAR_IA32_LBR_CTL and VM_ENTRY_LOAD_IA32_LBR_CTL)
are added to support guest Arch LBR. Both bits must be set to make Arch LBR
work correctly in both the guest and the host.

Co-developed-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/vmx.h      |  2 ++
 arch/x86/kvm/vmx/capabilities.h | 25 +++++++++++++++++--------
 arch/x86/kvm/vmx/vmx.c          |  6 ++++--
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index ea3be961cc8e..d9b1dffc4638 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -95,6 +95,7 @@
 #define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
 #define VM_EXIT_PT_CONCEAL_PIP			0x01000000
 #define VM_EXIT_CLEAR_IA32_RTIT_CTL		0x02000000
+#define VM_EXIT_CLEAR_IA32_LBR_CTL		0x04000000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
 
@@ -108,6 +109,7 @@
 #define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
 #define VM_ENTRY_PT_CONCEAL_PIP			0x00020000
 #define VM_ENTRY_LOAD_IA32_RTIT_CTL		0x00040000
+#define VM_ENTRY_LOAD_IA32_LBR_CTL		0x00200000
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
 
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index aa0e7872fcc9..b65e4087c9a9 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -377,20 +377,29 @@ static inline bool vmx_pt_mode_is_host_guest(void)
 	return pt_mode == PT_MODE_HOST_GUEST;
 }
 
-static inline u64 vmx_get_perf_capabilities(void)
+static inline bool cpu_has_vmx_arch_lbr(void)
 {
-	u64 perf_cap = 0;
-
-	if (boot_cpu_has(X86_FEATURE_PDCM))
-		rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);
-
-	perf_cap &= PMU_CAP_LBR_FMT;
+	return (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_IA32_LBR_CTL) &&
+		(vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_LBR_CTL);
+}
 
+static inline u64 vmx_get_perf_capabilities(void)
+{
 	/*
 	 * Since counters are virtualized, KVM would support full
 	 * width counting unconditionally, even if the host lacks it.
 	 */
-	return PMU_CAP_FW_WRITES | perf_cap;
+	u64 perf_cap = PMU_CAP_FW_WRITES;
+	u64 host_perf_cap = 0;
+
+	if (boot_cpu_has(X86_FEATURE_PDCM))
+		rdmsrl(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
+
+	perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;
+	if (boot_cpu_has(X86_FEATURE_ARCH_LBR) && !cpu_has_vmx_arch_lbr())
+		perf_cap &= ~PMU_CAP_LBR_FMT;
+
+	return perf_cap;
 }
 
 static inline u64 vmx_supported_debugctl(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3eccda710495..ae12fd7ac44e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2620,7 +2620,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	      VM_EXIT_LOAD_IA32_EFER |
 	      VM_EXIT_CLEAR_BNDCFGS |
 	      VM_EXIT_PT_CONCEAL_PIP |
-	      VM_EXIT_CLEAR_IA32_RTIT_CTL;
+	      VM_EXIT_CLEAR_IA32_RTIT_CTL |
+	      VM_EXIT_CLEAR_IA32_LBR_CTL;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
@@ -2644,7 +2645,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	      VM_ENTRY_LOAD_IA32_EFER |
 	      VM_ENTRY_LOAD_BNDCFGS |
 	      VM_ENTRY_PT_CONCEAL_PIP |
-	      VM_ENTRY_LOAD_IA32_RTIT_CTL;
+	      VM_ENTRY_LOAD_IA32_RTIT_CTL |
+	      VM_ENTRY_LOAD_IA32_LBR_CTL;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
 				&_vmentry_control) < 0)
 		return -EIO;
-- 
2.21.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 13/13] KVM: x86/cpuid: Advise Arch LBR feature in CPUID
  2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
                   ` (11 preceding siblings ...)
  2021-07-09 10:05 ` [PATCH v5 12/13] KVM: x86/vmx: Check Arch LBR config when return perf capabilities Yang Weijiang
@ 2021-07-09 10:05 ` Yang Weijiang
  12 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-09 10:05 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, jmattson, wei.w.wang, like.xu.linux,
	kvm, linux-kernel
  Cc: Yang Weijiang

Add the Arch LBR feature bit to the CPU cap-mask to expose the feature.
Currently only the maximum LBR depth is available to the guest, and it
is kept consistent with the host's Arch LBR settings.

Co-developed-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/cpuid.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index d6e343809b25..b51bfeaccea3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -490,7 +490,7 @@ void kvm_set_cpu_caps(void)
 		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
 		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
 		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
-		F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16)
+		F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16) | F(ARCH_LBR)
 	);
 
 	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
@@ -902,6 +902,27 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 				goto out;
 		}
 		break;
+	/* Architectural LBR */
+	case 0x1c: {
+		u64 lbr_depth_mask = entry->eax & 0xff;
+
+		if (!lbr_depth_mask ||
+		    !kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR)) {
+			entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
+			break;
+		}
+		/*
+		 * KVM only exposes the maximum supported depth, which is the
+		 * fixed value used on the host side.
+		 * KVM doesn't allow VMM userspace to adjust LBR depth because
+		 * guest LBR emulation depends on the configuration of host LBR
+		 * driver.
+		 */
+		lbr_depth_mask = 1UL << (fls(lbr_depth_mask) - 1);
+		entry->eax &= ~0xff;
+		entry->eax |= lbr_depth_mask;
+		break;
+	}
 	case KVM_CPUID_SIGNATURE: {
 		static const char signature[12] = "KVMKVMKVM\0\0";
 		const u32 *sigptr = (const u32 *)signature;
-- 
2.21.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 03/13] KVM: x86: Add arch LBR MSRs to msrs_to_save_all list
  2021-07-09 10:05 ` [PATCH v5 03/13] KVM: x86: Add arch LBR MSRs to msrs_to_save_all list Yang Weijiang
@ 2021-07-09 18:24   ` Jim Mattson
  2021-07-12  8:55     ` Yang Weijiang
  0 siblings, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2021-07-09 18:24 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, seanjc, vkuznets, wei.w.wang, like.xu.linux, kvm, linux-kernel

On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>
> Arch LBR MSR_ARCH_LBR_DEPTH and MSR_ARCH_LBR_CTL are {saved|restored}
> by userspace application if they're available.
>
> Suggested-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 04/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_DEPTH for guest Arch LBR
  2021-07-09 10:05 ` [PATCH v5 04/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_DEPTH for guest Arch LBR Yang Weijiang
@ 2021-07-09 20:35   ` Jim Mattson
  2021-07-12  9:17     ` Yang Weijiang
  0 siblings, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2021-07-09 20:35 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, seanjc, vkuznets, wei.w.wang, like.xu.linux, kvm,
	linux-kernel, Like Xu

On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>
> From: Like Xu <like.xu@linux.intel.com>
>
> The number of Arch LBR entries available is determined by the value
> in host MSR_ARCH_LBR_DEPTH.DEPTH. The supported LBR depth values are
> enumerated in CPUID.(EAX=01CH, ECX=0):EAX[7:0]. For each bit "n" set
> in this field, the MSR_ARCH_LBR_DEPTH.DEPTH value of "8*(n+1)" is
> supported.
>
> On a guest write to MSR_ARCH_LBR_DEPTH, all LBR entries are reset to 0.
> KVM emulates the reset behavior by introducing lbr_desc->arch_lbr_reset.
> KVM writes guest requested value to the native ARCH_LBR_DEPTH MSR
> (this is safe because the two values will be the same) when the Arch LBR
> records MSRs are pass-through to the guest.
>
> Signed-off-by: Like Xu <like.xu@linux.intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---

> @@ -393,6 +417,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  {
>         struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>         struct kvm_pmc *pmc;
> +       struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
>         u32 msr = msr_info->index;
>         u64 data = msr_info->data;
>
> @@ -427,6 +452,12 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>                         return 0;
>                 }
>                 break;
> +       case MSR_ARCH_LBR_DEPTH:
> +               if (!arch_lbr_depth_is_valid(vcpu, data))
> +                       return 1;

Does this imply that, when restoring a vCPU, KVM_SET_CPUID2 must be
called before KVM_SET_MSRS, so that arch_lbr_depth_is_valid() knows
what to do? Is this documented anywhere?

> +               lbr_desc->records.nr = data;
> +               lbr_desc->arch_lbr_reset = true;

Doesn't this make it impossible to restore vCPU state, since the LBRs
will be reset on the next VM-entry? At the very least, you probably
shouldn't set arch_lbr_reset when the MSR write is host-initiated.

However, there is another problem: arch_lbr_reset isn't serialized
anywhere. If you fix the host-initiated issue, then you still have a
problem if the last guest instruction prior to suspending the vCPU was
a write to IA32_LBR_DEPTH. If there is no subsequent VM-entry prior to
saving the vCPU state, then the LBRs will be saved/restored as part of
the guest XSAVE state, and they will not get cleared on resuming the
vCPU.

> +               return 0;
>         default:
>                 if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>                     (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> @@ -566,6 +597,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
>         lbr_desc->records.nr = 0;
>         lbr_desc->event = NULL;
>         lbr_desc->msr_passthrough = false;
> +       lbr_desc->arch_lbr_reset = false;

I'm not sure this is entirely correct. If the last guest instruction
prior to a warm reset was a write to IA32_LBR_DEPTH, then the LBRs
should be cleared (and arch_lbr_reset will be true). However, if you
clear that flag here, the LBRs will never get cleared.

>  }
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL for guest Arch LBR
  2021-07-09 10:05 ` [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL " Yang Weijiang
@ 2021-07-09 21:55   ` Jim Mattson
  2021-07-12  9:36     ` Yang Weijiang
  2021-07-10  0:42   ` kernel test robot
  1 sibling, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2021-07-09 21:55 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, seanjc, vkuznets, wei.w.wang, like.xu.linux, kvm,
	linux-kernel, Like Xu

On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>
> From: Like Xu <like.xu@linux.intel.com>
>
> Arch LBRs are enabled by setting MSR_ARCH_LBR_CTL.LBREn to 1. A new guest
> state field named "Guest IA32_LBR_CTL" is added to enhance guest LBR usage.
> When guest Arch LBR is enabled, a guest LBR event will be created like the
> model-specific LBR does. Clear guest LBR enable bit on host PMI handling so
> guest can see expected config.
>
> On processors that support Arch LBR, MSR_IA32_DEBUGCTLMSR[bit 0] has no
> meaning. It can be written to 0 or 1, but reads will always return 0.
> Like IA32_DEBUGCTL, IA32_ARCH_LBR_CTL msr is also reserved on INIT.

I suspect you mean "preserved" rather than "reserved."

> Signed-off-by: Like Xu <like.xu@linux.intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/events/intel/lbr.c      |  2 --
>  arch/x86/include/asm/msr-index.h |  1 +
>  arch/x86/include/asm/vmx.h       |  2 ++
>  arch/x86/kvm/vmx/pmu_intel.c     | 31 ++++++++++++++++++++++++++-----
>  arch/x86/kvm/vmx/vmx.c           |  9 +++++++++
>  5 files changed, 38 insertions(+), 7 deletions(-)
>

> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index da68f0e74702..4500c564c63a 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -19,6 +19,11 @@
>  #include "pmu.h"
>
>  #define MSR_PMC_FULL_WIDTH_BIT      (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)
> +/*
> + * Regardless of the Arch LBR or legacy LBR, when the LBR_EN bit 0 of the
> + * corresponding control MSR is set to 1, LBR recording will be enabled.
> + */

Is this comment misplaced? It doesn't seem to have anything to do with
the macro being defined below.

> @@ -458,6 +467,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>                 lbr_desc->records.nr = data;
>                 lbr_desc->arch_lbr_reset = true;
>                 return 0;
> +       case MSR_ARCH_LBR_CTL:
> +               if (data & ~KVM_ARCH_LBR_CTL_MASK)

Is a static mask sufficient? Per the Intel® Architecture Instruction
Set Extensions and Future Features Programming Reference, some of
these bits may not be supported on all microarchitectures. See Table
7-8. CPUID Leaf 01CH Enumeration of Architectural LBR Capabilities.

> +                       break;
> +               vmcs_write64(GUEST_IA32_LBR_CTL, data);
> +               if (intel_pmu_lbr_is_enabled(vcpu) && !lbr_desc->event &&
> +                   (data & ARCH_LBR_CTL_LBREN))
> +                       intel_pmu_create_guest_lbr_event(vcpu);

Nothing has to be done when the LBREN bit goes from 1 to 0?

> +               return 0;
>         default:
>                 if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>                     (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {

Per the Intel® Architecture Instruction Set Extensions and Future
Features Programming Reference, "IA32_LBR_CTL.LBREn is saved and
cleared on #SMI, and restored on RSM." I don't see that happening
anywhere. That manual also says, "On a warm reset...IA32_LBR_CTL.LBREn
is cleared to 0, disabling LBRs." I don't see that happening either.

I have a question about section 7.1.4.4 in that manual. It says, "On a
debug breakpoint event (#DB), IA32_LBR_CTL.LBREn is cleared." When,
exactly, does that happen? In particular, if kvm synthesizes such an
event (for example, in kvm_vcpu_do_singlestep), does
IA32_LBR_CTL.LBREn automatically get cleared (after loading the guest
IA32_LBR_CTL value from the VMCS)? Or does kvm need to explicitly
clear that bit in the VMCS before injecting the #DB?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-09 10:05 ` [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state Yang Weijiang
@ 2021-07-09 22:54   ` Jim Mattson
  2021-07-09 23:41     ` Jim Mattson
  2021-07-12  9:50     ` Yang Weijiang
  0 siblings, 2 replies; 40+ messages in thread
From: Jim Mattson @ 2021-07-09 22:54 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, seanjc, vkuznets, wei.w.wang, like.xu.linux, kvm, linux-kernel

On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>
> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> and reload it after vm-exit.

I don't see anything being done here "before VM-entry" or "after
VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.

In any case, I don't see why this one MSR is special. It seems that if
the host is using the architectural LBR MSRs, then *all* of the host
architectural LBR MSRs have to be saved on vcpu_load and restored on
vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-09 22:54   ` Jim Mattson
@ 2021-07-09 23:41     ` Jim Mattson
  2021-07-12  9:53       ` Yang Weijiang
  2021-07-12  9:50     ` Yang Weijiang
  1 sibling, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2021-07-09 23:41 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, seanjc, vkuznets, wei.w.wang, like.xu.linux, kvm, linux-kernel

On Fri, Jul 9, 2021 at 3:54 PM Jim Mattson <jmattson@google.com> wrote:
>
> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >
> > If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> > and reload it after vm-exit.
>
> I don't see anything being done here "before VM-entry" or "after
> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
>
> In any case, I don't see why this one MSR is special. It seems that if
> the host is using the architectural LBR MSRs, then *all* of the host
> architectural LBR MSRs have to be saved on vcpu_load and restored on
> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?

It does seem like there is something special about IA32_LBR_DEPTH, though...

Section 7.3.1 of the Intel® Architecture Instruction Set Extensions
and Future Features Programming Reference
says, "IA32_LBR_DEPTH is saved by XSAVES, but it is not written by
XRSTORS in any circumstance." It seems like that would require some
special handling if the host depth and the guest depth do not match.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL for guest Arch LBR
  2021-07-09 10:05 ` [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL " Yang Weijiang
  2021-07-09 21:55   ` Jim Mattson
@ 2021-07-10  0:42   ` kernel test robot
  1 sibling, 0 replies; 40+ messages in thread
From: kernel test robot @ 2021-07-10  0:42 UTC (permalink / raw)
  To: Yang Weijiang, pbonzini, seanjc, vkuznets, jmattson, wei.w.wang,
	like.xu.linux, kvm, linux-kernel
  Cc: kbuild-all, Like Xu, Yang Weijiang

[-- Attachment #1: Type: text/plain, Size: 4293 bytes --]

Hi Yang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v5.13]
[also build test WARNING on next-20210709]
[cannot apply to kvm/queue tip/perf/core tip/x86/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yang-Weijiang/Introduce-Architectural-LBR-for-vPMU/20210709-175347
base:    62fb9874f5da54fdb243003b386128037319b219
config: x86_64-rhel-8.3-kselftests (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-341-g8af24329-dirty
        # https://github.com/0day-ci/linux/commit/2fedefbda774e3c9031b139ab34fde2f5916c645
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yang-Weijiang/Introduce-Architectural-LBR-for-vPMU/20210709-175347
        git checkout 2fedefbda774e3c9031b139ab34fde2f5916c645
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)
   arch/x86/kvm/vmx/pmu_intel.c: note: in included file (through arch/x86/kvm/vmx/vmx_ops.h, arch/x86/kvm/vmx/vmx.h, arch/x86/kvm/vmx/nested.h):
>> arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a058a becomes 58a)
>> arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a058a becomes 58a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a008a becomes 8a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a008a becomes 8a)
   arch/x86/kvm/vmx/evmcs.h:81:30: sparse: sparse: cast truncates bits from constant value (a008a becomes 8a)

vim +81 arch/x86/kvm/vmx/evmcs.h

75edce8a45486f arch/x86/kvm/vmx/evmcs.h Sean Christopherson 2018-12-03  77  
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  78  static __always_inline int get_evmcs_offset(unsigned long field,
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  79  					    u16 *clean_field)
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  80  {
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20 @81  	unsigned int index = ROL16(field, 6);
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  82  	const struct evmcs_field *evmcs_field;
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  83  
75edce8a45486f arch/x86/kvm/vmx/evmcs.h Sean Christopherson 2018-12-03  84  	if (unlikely(index >= nr_evmcs_1_fields)) {
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  85  		WARN_ONCE(1, "KVM: accessing unsupported EVMCS field %lx\n",
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  86  			  field);
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  87  		return -ENOENT;
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  88  	}
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  89  
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  90  	evmcs_field = &vmcs_field_to_evmcs_1[index];
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  91  
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  92  	if (clean_field)
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  93  		*clean_field = evmcs_field->clean_field;
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  94  
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  95  	return evmcs_field->offset;
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  96  }
773e8a0425c923 arch/x86/kvm/vmx_evmcs.h Vitaly Kuznetsov    2018-03-20  97  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 41898 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 03/13] KVM: x86: Add arch LBR MSRs to msrs_to_save_all list
  2021-07-09 18:24   ` Jim Mattson
@ 2021-07-12  8:55     ` Yang Weijiang
  0 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-12  8:55 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang,
	like.xu.linux, kvm, linux-kernel

On Fri, Jul 09, 2021 at 11:24:46AM -0700, Jim Mattson wrote:
> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >
> > Arch LBR MSR_ARCH_LBR_DEPTH and MSR_ARCH_LBR_CTL are {saved|restored}
> > by userspace application if they're available.
> >
> > Suggested-by: Jim Mattson <jmattson@google.com>
> > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> Reviewed-by: Jim Mattson <jmattson@google.com>
Thanks Jim for the review!

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 04/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_DEPTH for guest Arch LBR
  2021-07-09 20:35   ` Jim Mattson
@ 2021-07-12  9:17     ` Yang Weijiang
  0 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-12  9:17 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang,
	like.xu.linux, kvm, linux-kernel, Like Xu

On Fri, Jul 09, 2021 at 01:35:34PM -0700, Jim Mattson wrote:
> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >
> > From: Like Xu <like.xu@linux.intel.com>
> >
> > The number of Arch LBR entries available is determined by the value
> > in host MSR_ARCH_LBR_DEPTH.DEPTH. The supported LBR depth values are
> > enumerated in CPUID.(EAX=01CH, ECX=0):EAX[7:0]. For each bit "n" set
> > in this field, the MSR_ARCH_LBR_DEPTH.DEPTH value of "8*(n+1)" is
> > supported.
> >
> > On a guest write to MSR_ARCH_LBR_DEPTH, all LBR entries are reset to 0.
> > KVM emulates the reset behavior by introducing lbr_desc->arch_lbr_reset.
> > KVM writes guest requested value to the native ARCH_LBR_DEPTH MSR
> > (this is safe because the two values will be the same) when the Arch LBR
> > records MSRs are pass-through to the guest.
> >
> > Signed-off-by: Like Xu <like.xu@linux.intel.com>
> > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > ---
> 
> > @@ -393,6 +417,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >  {
> >         struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> >         struct kvm_pmc *pmc;
> > +       struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
> >         u32 msr = msr_info->index;
> >         u64 data = msr_info->data;
> >
> > @@ -427,6 +452,12 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >                         return 0;
> >                 }
> >                 break;
> > +       case MSR_ARCH_LBR_DEPTH:
> > +               if (!arch_lbr_depth_is_valid(vcpu, data))
> > +                       return 1;
> 
> Does this imply that, when restoring a vCPU, KVM_SET_CPUID2 must be
> called before KVM_SET_MSRS, so that arch_lbr_depth_is_valid() knows
> what to do? Is this documented anywhere?
There shouldn't be such an assumption :-D, I'll check and modify it.
Thanks for pointing it out!
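The depth-validity rule quoted above (for each bit n set in CPUID.(EAX=01CH, ECX=0):EAX[7:0], depth 8*(n+1) is supported) can be sketched in isolation. This is a hedged userspace model, not the KVM helper itself; `supported_mask` stands in for the guest's CPUID.1CH EAX[7:0] bits, and the function name merely echoes the one under discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A depth value is valid only when it equals 8*(n+1) for some bit n set in
 * supported_mask (the CPUID.1CH:EAX[7:0] depth bitmap).  With 8 bits in the
 * bitmap, the largest expressible depth is 8*8 = 64.
 */
static bool arch_lbr_depth_is_valid(uint8_t supported_mask, uint64_t depth)
{
	/* Depth must be a non-zero multiple of 8, at most 64. */
	if (depth == 0 || depth > 64 || (depth % 8))
		return false;
	/* Bit n in the mask corresponds to depth 8*(n+1). */
	return supported_mask & (1u << (depth / 8 - 1));
}
```

For instance, a mask of 0x04 (only bit 2 set) admits only a depth of 24. Note this sketch deliberately depends on CPUID state, which is the ordering concern Jim raises: the check only gives the right answer after the vCPU's CPUID has been configured.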

> 
> > +               lbr_desc->records.nr = data;
> > +               lbr_desc->arch_lbr_reset = true;
> 
> Doesn't this make it impossible to restore vCPU state, since the LBRs
> will be reset on the next VM-entry? At the very least, you probably
> shouldn't set arch_lbr_reset when the MSR write is host-initiated.
Host/Guest operation should be identified, will change it.

> 
> However, there is another problem: arch_lbr_reset isn't serialized
> anywhere. If you fix the host-initiated issue, then you still have a
> problem if the last guest instruction prior to suspending the vCPU was
> a write to IA32_LBR_DEPTH. If there is no subsequent VM-entry prior to
> saving the vCPU state, then the LBRs will be saved/restored as part of
> the guest XSAVE state, and they will not get cleared on resuming the
> vCPU.
Yes, it's a problem. I'll replace the code with an on-spot MSR write to
reset it.

> 
> > +               return 0;
> >         default:
> >                 if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
> >                     (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> > @@ -566,6 +597,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
> >         lbr_desc->records.nr = 0;
> >         lbr_desc->event = NULL;
> >         lbr_desc->msr_passthrough = false;
> > +       lbr_desc->arch_lbr_reset = false;
> 
> I'm not sure this is entirely correct. If the last guest instruction
> prior to a warm reset was a write to IA32_LBR_DEPTH, then the LBRs
> should be cleared (and arch_lbr_reset will be true). However, if you
> clear that flag here, the LBRs will never get cleared.
I hope the on-spot reset can avoid the above issue too.
Thanks!

> 
> >  }
> >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL for guest Arch LBR
  2021-07-09 21:55   ` Jim Mattson
@ 2021-07-12  9:36     ` Yang Weijiang
  2021-07-12 10:10       ` Like Xu
  0 siblings, 1 reply; 40+ messages in thread
From: Yang Weijiang @ 2021-07-12  9:36 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang,
	like.xu.linux, kvm, linux-kernel, Like Xu

On Fri, Jul 09, 2021 at 02:55:35PM -0700, Jim Mattson wrote:
> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >
> > From: Like Xu <like.xu@linux.intel.com>
> >
> > Arch LBRs are enabled by setting MSR_ARCH_LBR_CTL.LBREn to 1. A new guest
> > state field named "Guest IA32_LBR_CTL" is added to enhance guest LBR usage.
> > When guest Arch LBR is enabled, a guest LBR event will be created like the
> > model-specific LBR does. Clear guest LBR enable bit on host PMI handling so
> > guest can see expected config.
> >
> > On processors that support Arch LBR, MSR_IA32_DEBUGCTLMSR[bit 0] has no
> > meaning. It can be written to 0 or 1, but reads will always return 0.
> > Like IA32_DEBUGCTL, IA32_ARCH_LBR_CTL msr is also reserved on INIT.
> 
> I suspect you mean "preserved" rather than "reserved."
Yes, should be preserved.

> 
> > Signed-off-by: Like Xu <like.xu@linux.intel.com>
> > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > ---
> >  arch/x86/events/intel/lbr.c      |  2 --
> >  arch/x86/include/asm/msr-index.h |  1 +
> >  arch/x86/include/asm/vmx.h       |  2 ++
> >  arch/x86/kvm/vmx/pmu_intel.c     | 31 ++++++++++++++++++++++++++-----
> >  arch/x86/kvm/vmx/vmx.c           |  9 +++++++++
> >  5 files changed, 38 insertions(+), 7 deletions(-)
> >
> 
> > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> > index da68f0e74702..4500c564c63a 100644
> > --- a/arch/x86/kvm/vmx/pmu_intel.c
> > +++ b/arch/x86/kvm/vmx/pmu_intel.c
> > @@ -19,6 +19,11 @@
> >  #include "pmu.h"
> >
> >  #define MSR_PMC_FULL_WIDTH_BIT      (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)
> > +/*
> > + * Regardless of the Arch LBR or legacy LBR, when the LBR_EN bit 0 of the
> > + * corresponding control MSR is set to 1, LBR recording will be enabled.
> > + */
> 
> Is this comment misplaced? It doesn't seem to have anything to do with
> the macro being defined below.
Agree, will put this in the commit message.
> 
> > @@ -458,6 +467,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >                 lbr_desc->records.nr = data;
> >                 lbr_desc->arch_lbr_reset = true;
> >                 return 0;
> > +       case MSR_ARCH_LBR_CTL:
> > +               if (data & ~KVM_ARCH_LBR_CTL_MASK)
> 
> Is a static mask sufficient? Per the Intel® Architecture Instruction
> Set Extensions and Future Features Programming Reference, some of
> these bits may not be supported on all microarchitectures. See Table
> 7-8. CPUID Leaf 01CH Enumeration of Architectural LBR Capabilities.
Yes, more sanity checks are required, thanks!
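One way to address Jim's point is to derive the writable-bits mask from the guest's CPUID.1CH enumeration instead of a static constant. The sketch below is an assumption-laden illustration, not the eventual patch: the EBX bit positions (0 = CPL filtering, 1 = branch-type filtering, 2 = call-stack mode) follow my reading of the spec's leaf 01CH layout, and the `ARCH_LBR_CTL_*` values mimic the kernel's naming but are defined locally here.

```c
#include <stdint.h>

/* IA32_LBR_CTL bit groups (positions assumed from the spec). */
#define ARCH_LBR_CTL_LBREN        (1ULL << 0)       /* enable recording */
#define ARCH_LBR_CTL_CPL_MASK     (0x3ULL << 1)     /* OS | USR filtering */
#define ARCH_LBR_CTL_STACK        (1ULL << 3)       /* call-stack mode */
#define ARCH_LBR_CTL_FILTER_MASK  (0x7fULL << 16)   /* branch-type filters */

/*
 * Build the set of bits a guest may legally write to MSR_ARCH_LBR_CTL:
 * start from LBREn alone and add each group only when the corresponding
 * CPUID.1CH:EBX capability bit is enumerated for the guest.
 */
static uint64_t arch_lbr_ctl_mask(uint32_t cpuid_1c_ebx)
{
	uint64_t mask = ARCH_LBR_CTL_LBREN;

	if (cpuid_1c_ebx & (1u << 0))	/* CPL filtering supported */
		mask |= ARCH_LBR_CTL_CPL_MASK;
	if (cpuid_1c_ebx & (1u << 1))	/* branch filtering supported */
		mask |= ARCH_LBR_CTL_FILTER_MASK;
	if (cpuid_1c_ebx & (1u << 2))	/* call-stack mode supported */
		mask |= ARCH_LBR_CTL_STACK;
	return mask;
}
```

A write would then be rejected when `data & ~arch_lbr_ctl_mask(ebx)` is non-zero, so bits unsupported on the guest's enumerated microarchitecture fault instead of being silently accepted.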

> 
> > +                       break;
> > +               vmcs_write64(GUEST_IA32_LBR_CTL, data);
> > +               if (intel_pmu_lbr_is_enabled(vcpu) && !lbr_desc->event &&
> > +                   (data & ARCH_LBR_CTL_LBREN))
> > +                       intel_pmu_create_guest_lbr_event(vcpu);
> 
> Nothing has to be done when the LBREN bit goes from 1 to 0?
Need to release the event and reset related flag when the bit goes from
1 to 0. Thanks!
> 
> > +               return 0;
> >         default:
> >                 if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
> >                     (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> 
> Per the Intel® Architecture Instruction Set Extensions and Future
> Features Programming Reference, "IA32_LBR_CTL.LBREn is saved and
> cleared on #SMI, and restored on RSM." I don't see that happening
> anywhere. That manual also says, "On a warm reset...IA32_LBR_CTL.LBREn
> is cleared to 0, disabling LBRs." I don't see that happening either.

Yes, I'll add related code to make it consistent with spec, thanks!
> 
> I have a question about section 7.1.4.4 in that manual. It says, "On a
> debug breakpoint event (#DB), IA32_LBR_CTL.LBREn is cleared." When,
> exactly, does that happen? In particular, if kvm synthesizes such an
> event (for example, in kvm_vcpu_do_singlestep), does
> IA32_LBR_CTL.LBREn automatically get cleared (after loading the guest
> IA32_LBR_CTL value from the VMCS)? Or does kvm need to explicitly
> clear that bit in the VMCS before injecting the #DB?
OK, I don't have an answer now; I'll ask the architects to get a clear answer on this.
Thanks for raising the question!


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-09 22:54   ` Jim Mattson
  2021-07-09 23:41     ` Jim Mattson
@ 2021-07-12  9:50     ` Yang Weijiang
  2021-07-12 17:23       ` Jim Mattson
  1 sibling, 1 reply; 40+ messages in thread
From: Yang Weijiang @ 2021-07-12  9:50 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang,
	like.xu.linux, kvm, linux-kernel

On Fri, Jul 09, 2021 at 03:54:53PM -0700, Jim Mattson wrote:
> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >
> > If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> > and reload it after vm-exit.
> 
> I don't see anything being done here "before VM-entry" or "after
> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
> 
> In any case, I don't see why this one MSR is special. It seems that if
> the host is using the architectural LBR MSRs, then *all* of the host
> architectural LBR MSRs have to be saved on vcpu_load and restored on
> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
I looked back on the discussion thread:
https://patchwork.kernel.org/project/kvm/patch/20210303135756.1546253-8-like.xu@linux.intel.com/
not sure why this code was added, but IMO, although the FPU save/restore in the
outer loop covers this LBR MSR, the operation points are far away from the
VM-entry/exit points, i.e., the guest MSR setting could leak to the host side
for a significantly long time, which may hurt host-side profiling accuracy.
If we save/restore it manually, it'll mitigate the issue significantly.
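The swap scheme being debated here can be modeled in a few lines. This is a hedged sketch only: `rdmsrl_sim()`/`wrmsrl_sim()` replace the kernel's real MSR accessors with a simulated register so the scheme is runnable outside the kernel, and the `struct vcpu` fields are invented for illustration.

```c
#include <stdint.h>

/* Simulated MSR_ARCH_LBR_CTL, standing in for rdmsrl()/wrmsrl(). */
static uint64_t msr_arch_lbr_ctl;
static uint64_t rdmsrl_sim(void) { return msr_arch_lbr_ctl; }
static void wrmsrl_sim(uint64_t v) { msr_arch_lbr_ctl = v; }

struct vcpu {
	uint64_t host_lbr_ctl;	/* host value stashed while guest runs */
	uint64_t guest_lbr_ctl;	/* guest's view of the MSR */
};

/* On vcpu_load: save the host's value, install the guest's. */
static void lbr_ctl_vcpu_load(struct vcpu *v)
{
	v->host_lbr_ctl = rdmsrl_sim();
	wrmsrl_sim(v->guest_lbr_ctl);
}

/* On vcpu_put: capture the guest's value, restore the host's. */
static void lbr_ctl_vcpu_put(struct vcpu *v)
{
	v->guest_lbr_ctl = rdmsrl_sim();
	wrmsrl_sim(v->host_lbr_ctl);
}
```

Jim's objection still applies to this shape: between `lbr_ctl_vcpu_load()` and `lbr_ctl_vcpu_put()` the host observes the guest's value, so any host LBR recording in that window intermingles guest and host branch streams.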

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-09 23:41     ` Jim Mattson
@ 2021-07-12  9:53       ` Yang Weijiang
  2021-07-12 10:19         ` Like Xu
  0 siblings, 1 reply; 40+ messages in thread
From: Yang Weijiang @ 2021-07-12  9:53 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang,
	like.xu.linux, kvm, linux-kernel

On Fri, Jul 09, 2021 at 04:41:30PM -0700, Jim Mattson wrote:
> On Fri, Jul 9, 2021 at 3:54 PM Jim Mattson <jmattson@google.com> wrote:
> >
> > On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> > >
> > > If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> > > and reload it after vm-exit.
> >
> > I don't see anything being done here "before VM-entry" or "after
> > VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
> >
> > In any case, I don't see why this one MSR is special. It seems that if
> > the host is using the architectural LBR MSRs, then *all* of the host
> > architectural LBR MSRs have to be saved on vcpu_load and restored on
> > vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> > that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> > restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
> 
> It does seem like there is something special about IA32_LBR_DEPTH, though...
> 
> Section 7.3.1 of the Intel® Architecture Instruction Set Extensions
> and Future Features Programming Reference
> says, "IA32_LBR_DEPTH is saved by XSAVES, but it is not written by
> XRSTORS in any circumstance." It seems like that would require some
> special handling if the host depth and the guest depth do not match.
In our vPMU design, the guest depth is always kept the same as the host's,
so this won't be a problem. But I'll double-check the code again, thanks!

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL for guest Arch LBR
  2021-07-12  9:36     ` Yang Weijiang
@ 2021-07-12 10:10       ` Like Xu
  2021-07-13  9:05         ` Yang Weijiang
  0 siblings, 1 reply; 40+ messages in thread
From: Like Xu @ 2021-07-12 10:10 UTC (permalink / raw)
  To: Yang Weijiang, Jim Mattson
  Cc: pbonzini, seanjc, vkuznets, wei.w.wang, kvm, linux-kernel, Like Xu

On 12/7/2021 5:36 pm, Yang Weijiang wrote:
> On Fri, Jul 09, 2021 at 02:55:35PM -0700, Jim Mattson wrote:
>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>>>
>>> From: Like Xu <like.xu@linux.intel.com>
>>>
>>> Arch LBRs are enabled by setting MSR_ARCH_LBR_CTL.LBREn to 1. A new guest
>>> state field named "Guest IA32_LBR_CTL" is added to enhance guest LBR usage.
>>> When guest Arch LBR is enabled, a guest LBR event will be created like the
>>> model-specific LBR does. Clear guest LBR enable bit on host PMI handling so
>>> guest can see expected config.
>>>
>>> On processors that support Arch LBR, MSR_IA32_DEBUGCTLMSR[bit 0] has no
>>> meaning. It can be written to 0 or 1, but reads will always return 0.
>>> Like IA32_DEBUGCTL, IA32_ARCH_LBR_CTL msr is also reserved on INIT.
>>
>> I suspect you mean "preserved" rather than "reserved."
> Yes, should be preserved.
> 
>>
>>> Signed-off-by: Like Xu <like.xu@linux.intel.com>
>>> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>>> ---
>>>   arch/x86/events/intel/lbr.c      |  2 --
>>>   arch/x86/include/asm/msr-index.h |  1 +
>>>   arch/x86/include/asm/vmx.h       |  2 ++
>>>   arch/x86/kvm/vmx/pmu_intel.c     | 31 ++++++++++++++++++++++++++-----
>>>   arch/x86/kvm/vmx/vmx.c           |  9 +++++++++
>>>   5 files changed, 38 insertions(+), 7 deletions(-)
>>>
>>
>>> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
>>> index da68f0e74702..4500c564c63a 100644
>>> --- a/arch/x86/kvm/vmx/pmu_intel.c
>>> +++ b/arch/x86/kvm/vmx/pmu_intel.c
>>> @@ -19,6 +19,11 @@
>>>   #include "pmu.h"
>>>
>>>   #define MSR_PMC_FULL_WIDTH_BIT      (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)
>>> +/*
>>> + * Regardless of the Arch LBR or legacy LBR, when the LBR_EN bit 0 of the
>>> + * corresponding control MSR is set to 1, LBR recording will be enabled.
>>> + */
>>
>> Is this comment misplaced? It doesn't seem to have anything to do with
>> the macro being defined below.
> Agree, will put this in commit message.
>>
>>> @@ -458,6 +467,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>>                  lbr_desc->records.nr = data;
>>>                  lbr_desc->arch_lbr_reset = true;
>>>                  return 0;
>>> +       case MSR_ARCH_LBR_CTL:
>>> +               if (data & ~KVM_ARCH_LBR_CTL_MASK)
>>
>> Is a static mask sufficient? Per the Intel® Architecture Instruction
>> Set Extensions and Future Features Programming Reference, some of
>> these bits may not be supported on all microarchitectures. See Table
>> 7-8. CPUID Leaf 01CH Enumeration of Architectural LBR Capabilities.
> Yes, more sanity checks are required, thanks!
> 
>>
>>> +                       break;
>>> +               vmcs_write64(GUEST_IA32_LBR_CTL, data);
>>> +               if (intel_pmu_lbr_is_enabled(vcpu) && !lbr_desc->event &&
>>> +                   (data & ARCH_LBR_CTL_LBREN))
>>> +                       intel_pmu_create_guest_lbr_event(vcpu);
>>
>> Nothing has to be done when the LBREN bit goes from 1 to 0?
> Need to release the event and reset related flag when the bit goes from
> 1 to 0. Thanks!

No need to release the LBR event and it will be lazily released.

>>
>>> +               return 0;
>>>          default:
>>>                  if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
>>>                      (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
>>
>> Per the Intel® Architecture Instruction Set Extensions and Future
>> Features Programming Reference, "IA32_LBR_CTL.LBREn is saved and
>> cleared on #SMI, and restored on RSM." I don't see that happening
>> anywhere. That manual also says, "On a warm reset...IA32_LBR_CTL.LBREn
>> is cleared to 0, disabling LBRs." I don't see that happening either.
> 
> Yes, I'll add related code to make it consistent with spec, thanks!
>>
>> I have a question about section 7.1.4.4 in that manual. It says, "On a
>> debug breakpoint event (#DB), IA32_LBR_CTL.LBREn is cleared." When,
>> exactly, does that happen? In particular, if kvm synthesizes such an
>> event (for example, in kvm_vcpu_do_singlestep), does
>> IA32_LBR_CTL.LBREn automatically get cleared (after loading the guest
>> IA32_LBR_CTL value from the VMCS)? Or does kvm need to explicitly
>> clear that bit in the VMCS before injecting the #DB?
> OK, I don't have answer now, will ask the Arch to get clear answer on this,
> thanks for raising the question!

I think we also need a kvm-unit-tests case to cover it (as well as the
legacy LBR).

> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-12  9:53       ` Yang Weijiang
@ 2021-07-12 10:19         ` Like Xu
  2021-07-12 17:20           ` Jim Mattson
  0 siblings, 1 reply; 40+ messages in thread
From: Like Xu @ 2021-07-12 10:19 UTC (permalink / raw)
  To: Yang Weijiang, Jim Mattson
  Cc: pbonzini, seanjc, vkuznets, wei.w.wang, kvm, linux-kernel, kan.liang

On 12/7/2021 5:53 pm, Yang Weijiang wrote:
> On Fri, Jul 09, 2021 at 04:41:30PM -0700, Jim Mattson wrote:
>> On Fri, Jul 9, 2021 at 3:54 PM Jim Mattson <jmattson@google.com> wrote:
>>>
>>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>>>>
>>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
>>>> and reload it after vm-exit.
>>>
>>> I don't see anything being done here "before VM-entry" or "after
>>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
>>>
>>> In any case, I don't see why this one MSR is special. It seems that if
>>> the host is using the architectural LBR MSRs, then *all* of the host
>>> architectural LBR MSRs have to be saved on vcpu_load and restored on
>>> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
>>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
>>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
>>
>> It does seem like there is something special about IA32_LBR_DEPTH, though...
>>
>> Section 7.3.1 of the Intel® Architecture Instruction Set Extensions
>> and Future Features Programming Reference
>> says, "IA32_LBR_DEPTH is saved by XSAVES, but it is not written by
>> XRSTORS in any circumstance." It seems like that would require some
>> special handling if the host depth and the guest depth do not match.
> In our vPMU design, guest depth is alway kept the same as that of host,
> so this won't be a problem. But I'll double check the code again, thanks!

KVM only exposes the host's depth value to the user space
so the guest can only use the same depth as the host.

A further reason is that the host perf driver currently
does not support different threads using different depths.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-12 10:19         ` Like Xu
@ 2021-07-12 17:20           ` Jim Mattson
  2021-07-12 17:45             ` Jim Mattson
  2021-07-13  9:53             ` Yang Weijiang
  0 siblings, 2 replies; 40+ messages in thread
From: Jim Mattson @ 2021-07-12 17:20 UTC (permalink / raw)
  To: Like Xu
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang, kvm,
	linux-kernel, kan.liang

On Mon, Jul 12, 2021 at 3:19 AM Like Xu <like.xu.linux@gmail.com> wrote:
>
> On 12/7/2021 5:53 pm, Yang Weijiang wrote:
> > On Fri, Jul 09, 2021 at 04:41:30PM -0700, Jim Mattson wrote:
> >> On Fri, Jul 9, 2021 at 3:54 PM Jim Mattson <jmattson@google.com> wrote:
> >>>
> >>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >>>>
> >>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> >>>> and reload it after vm-exit.
> >>>
> >>> I don't see anything being done here "before VM-entry" or "after
> >>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
> >>>
> >>> In any case, I don't see why this one MSR is special. It seems that if
> >>> the host is using the architectural LBR MSRs, then *all* of the host
> >>> architectural LBR MSRs have to be saved on vcpu_load and restored on
> >>> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> >>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> >>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
> >>
> >> It does seem like there is something special about IA32_LBR_DEPTH, though...
> >>
> >> Section 7.3.1 of the Intel® Architecture Instruction Set Extensions
> >> and Future Features Programming Reference
> >> says, "IA32_LBR_DEPTH is saved by XSAVES, but it is not written by
> >> XRSTORS in any circumstance." It seems like that would require some
> >> special handling if the host depth and the guest depth do not match.
> > In our vPMU design, guest depth is alway kept the same as that of host,
> > so this won't be a problem. But I'll double check the code again, thanks!
>
> KVM only exposes the host's depth value to the user space
> so the guest can only use the same depth as the host.

The allowed depth supplied by KVM_GET_SUPPORTED_CPUID isn't enforced,
though, is it?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-12  9:50     ` Yang Weijiang
@ 2021-07-12 17:23       ` Jim Mattson
  2021-07-13  9:47         ` Yang Weijiang
  0 siblings, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2021-07-12 17:23 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, seanjc, vkuznets, wei.w.wang, like.xu.linux, kvm, linux-kernel

On Mon, Jul 12, 2021 at 2:36 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>
> On Fri, Jul 09, 2021 at 03:54:53PM -0700, Jim Mattson wrote:
> > On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> > >
> > > If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> > > and reload it after vm-exit.
> >
> > I don't see anything being done here "before VM-entry" or "after
> > VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
> >
> > In any case, I don't see why this one MSR is special. It seems that if
> > the host is using the architectural LBR MSRs, then *all* of the host
> > architectural LBR MSRs have to be saved on vcpu_load and restored on
> > vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> > that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> > restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
> I looked back on the discussion thread:
> https://patchwork.kernel.org/project/kvm/patch/20210303135756.1546253-8-like.xu@linux.intel.com/
> not sure why this code is added, but IMO, although fpu save/restore in outer loop
> covers this LBR MSR, but the operation points are far away from vm-entry/exit
> point, i.e., the guest MSR setting could leak to host side for a signicant
> long of time, it may cause host side profiling accuracy. if we save/restore it
> manually, it'll mitigate the issue signifcantly.

I'll be interested to see how you distinguish the intermingled branch
streams, if you allow the host to record LBRs while the LBR MSRs
contain guest values!

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-12 17:20           ` Jim Mattson
@ 2021-07-12 17:45             ` Jim Mattson
  2021-07-13  9:49               ` Like Xu
  2021-07-13  9:53             ` Yang Weijiang
  1 sibling, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2021-07-12 17:45 UTC (permalink / raw)
  To: Like Xu
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang, kvm,
	linux-kernel, kan.liang

On Mon, Jul 12, 2021 at 10:20 AM Jim Mattson <jmattson@google.com> wrote:
>
> On Mon, Jul 12, 2021 at 3:19 AM Like Xu <like.xu.linux@gmail.com> wrote:
> >
> > On 12/7/2021 5:53 pm, Yang Weijiang wrote:
> > > On Fri, Jul 09, 2021 at 04:41:30PM -0700, Jim Mattson wrote:
> > >> On Fri, Jul 9, 2021 at 3:54 PM Jim Mattson <jmattson@google.com> wrote:
> > >>>
> > >>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> > >>>>
> > >>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> > >>>> and reload it after vm-exit.
> > >>>
> > >>> I don't see anything being done here "before VM-entry" or "after
> > >>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
> > >>>
> > >>> In any case, I don't see why this one MSR is special. It seems that if
> > >>> the host is using the architectural LBR MSRs, then *all* of the host
> > >>> architectural LBR MSRs have to be saved on vcpu_load and restored on
> > >>> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> > >>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> > >>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
> > >>
> > >> It does seem like there is something special about IA32_LBR_DEPTH, though...
> > >>
> > >> Section 7.3.1 of the Intel® Architecture Instruction Set Extensions
> > >> and Future Features Programming Reference
> > >> says, "IA32_LBR_DEPTH is saved by XSAVES, but it is not written by
> > >> XRSTORS in any circumstance." It seems like that would require some
> > >> special handling if the host depth and the guest depth do not match.
> > > In our vPMU design, guest depth is alway kept the same as that of host,
> > > so this won't be a problem. But I'll double check the code again, thanks!
> >
> > KVM only exposes the host's depth value to the user space
> > so the guest can only use the same depth as the host.
>
> The allowed depth supplied by KVM_GET_SUPPORTED_CPUID isn't enforced,
> though, is it?

Also, doesn't this end up being a major constraint on future
platforms? Every host that this vCPU will ever run on will have to use
the same LBR depth as the host on which it was started.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL for guest Arch LBR
  2021-07-12 10:10       ` Like Xu
@ 2021-07-13  9:05         ` Yang Weijiang
  0 siblings, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-13  9:05 UTC (permalink / raw)
  To: Like Xu
  Cc: Yang Weijiang, Jim Mattson, pbonzini, seanjc, vkuznets,
	wei.w.wang, kvm, linux-kernel, Like Xu

On Mon, Jul 12, 2021 at 06:10:30PM +0800, Like Xu wrote:
> On 12/7/2021 5:36 pm, Yang Weijiang wrote:
> >On Fri, Jul 09, 2021 at 02:55:35PM -0700, Jim Mattson wrote:
> >>On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >>>
> >>>From: Like Xu <like.xu@linux.intel.com>
> >>>
> >>>Arch LBRs are enabled by setting MSR_ARCH_LBR_CTL.LBREn to 1. A new guest
> >>>state field named "Guest IA32_LBR_CTL" is added to enhance guest LBR usage.
> >>>When guest Arch LBR is enabled, a guest LBR event will be created like the
> >>>model-specific LBR does. Clear guest LBR enable bit on host PMI handling so
> >>>guest can see expected config.
> >>>
> >>>On processors that support Arch LBR, MSR_IA32_DEBUGCTLMSR[bit 0] has no
> >>>meaning. It can be written to 0 or 1, but reads will always return 0.
> >>>Like IA32_DEBUGCTL, IA32_ARCH_LBR_CTL msr is also reserved on INIT.
> >>
> >>I suspect you mean "preserved" rather than "reserved."
> >Yes, should be preserved.
> >
> >>
> >>>Signed-off-by: Like Xu <like.xu@linux.intel.com>
> >>>Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> >>>---
> >>>  arch/x86/events/intel/lbr.c      |  2 --
> >>>  arch/x86/include/asm/msr-index.h |  1 +
> >>>  arch/x86/include/asm/vmx.h       |  2 ++
> >>>  arch/x86/kvm/vmx/pmu_intel.c     | 31 ++++++++++++++++++++++++++-----
> >>>  arch/x86/kvm/vmx/vmx.c           |  9 +++++++++
> >>>  5 files changed, 38 insertions(+), 7 deletions(-)
> >>>
> >>
> >>>diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> >>>index da68f0e74702..4500c564c63a 100644
> >>>--- a/arch/x86/kvm/vmx/pmu_intel.c
> >>>+++ b/arch/x86/kvm/vmx/pmu_intel.c
> >>>@@ -19,6 +19,11 @@
> >>>  #include "pmu.h"
> >>>
> >>>  #define MSR_PMC_FULL_WIDTH_BIT      (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)
> >>>+/*
> >>>+ * Regardless of the Arch LBR or legacy LBR, when the LBR_EN bit 0 of the
> >>>+ * corresponding control MSR is set to 1, LBR recording will be enabled.
> >>>+ */
> >>
> >>Is this comment misplaced? It doesn't seem to have anything to do with
> >>the macro being defined below.
> >Agree, will put this in commit message.
> >>
> >>>@@ -458,6 +467,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >>>                 lbr_desc->records.nr = data;
> >>>                 lbr_desc->arch_lbr_reset = true;
> >>>                 return 0;
> >>>+       case MSR_ARCH_LBR_CTL:
> >>>+               if (data & ~KVM_ARCH_LBR_CTL_MASK)
> >>
> >>Is a static mask sufficient? Per the Intel® Architecture Instruction
> >>Set Extensions and Future Features Programming Reference, some of
> >>these bits may not be supported on all microarchitectures. See Table
> >>7-8. CPUID Leaf 01CH Enumeration of Architectural LBR Capabilities.
> >Yes, more sanity checks are required, thanks!
> >
> >>
> >>>+                       break;
> >>>+               vmcs_write64(GUEST_IA32_LBR_CTL, data);
> >>>+               if (intel_pmu_lbr_is_enabled(vcpu) && !lbr_desc->event &&
> >>>+                   (data & ARCH_LBR_CTL_LBREN))
> >>>+                       intel_pmu_create_guest_lbr_event(vcpu);
> >>
> >>Nothing has to be done when the LBREN bit goes from 1 to 0?
> >Need to release the event and reset related flag when the bit goes from
> >1 to 0. Thanks!
> 
> No need to release the LBR event and it will be lazily released.
>
I forgot the lazy cleanup :-), thanks!
 
> >>
> >>>+               return 0;
> >>>         default:
> >>>                 if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
> >>>                     (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
> >>
> >>Per the Intel® Architecture Instruction Set Extensions and Future
> >>Features Programming Reference, "IA32_LBR_CTL.LBREn is saved and
> >>cleared on #SMI, and restored on RSM." I don't see that happening
> >>anywhere. That manual also says, "On a warm reset...IA32_LBR_CTL.LBREn
> >>is cleared to 0, disabling LBRs." I don't see that happening either.
> >
> >Yes, I'll add related code to make it consistent with spec, thanks!
> >>
> >>I have a question about section 7.1.4.4 in that manual. It says, "On a
> >>debug breakpoint event (#DB), IA32_LBR_CTL.LBREn is cleared." When,
> >>exactly, does that happen? In particular, if kvm synthesizes such an
> >>event (for example, in kvm_vcpu_do_singlestep), does
> >>IA32_LBR_CTL.LBREn automatically get cleared (after loading the guest
> >>IA32_LBR_CTL value from the VMCS)? Or does kvm need to explicitly
> >>clear that bit in the VMCS before injecting the #DB?
> >OK, I don't have answer now, will ask the Arch to get clear answer on this,
> >thanks for raising the question!
> 
> I think we also need a kvm-unit-tests to cover it (as well as the legacy
> LBR).
> 
Yes, I'll add one later.

> >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-12 17:23       ` Jim Mattson
@ 2021-07-13  9:47         ` Yang Weijiang
  2021-07-13 10:16           ` Like Xu
  0 siblings, 1 reply; 40+ messages in thread
From: Yang Weijiang @ 2021-07-13  9:47 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang,
	like.xu.linux, kvm, linux-kernel

On Mon, Jul 12, 2021 at 10:23:02AM -0700, Jim Mattson wrote:
> On Mon, Jul 12, 2021 at 2:36 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >
> > On Fri, Jul 09, 2021 at 03:54:53PM -0700, Jim Mattson wrote:
> > > On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> > > >
> > > > If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> > > > and reload it after vm-exit.
> > >
> > > I don't see anything being done here "before VM-entry" or "after
> > > VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
> > >
> > > In any case, I don't see why this one MSR is special. It seems that if
> > > the host is using the architectural LBR MSRs, then *all* of the host
> > > architectural LBR MSRs have to be saved on vcpu_load and restored on
> > > vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> > > that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> > > restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
> > I looked back on the discussion thread:
> > https://patchwork.kernel.org/project/kvm/patch/20210303135756.1546253-8-like.xu@linux.intel.com/
> > not sure why this code is added, but IMO, although fpu save/restore in outer loop
> > covers this LBR MSR, but the operation points are far away from vm-entry/exit
> > point, i.e., the guest MSR setting could leak to host side for a signicant
> > long of time, it may cause host side profiling accuracy. if we save/restore it
> > manually, it'll mitigate the issue signifcantly.
> 
> I'll be interested to see how you distinguish the intermingled branch
> streams, if you allow the host to record LBRs while the LBR MSRs
> contain guest values!
I'll check whether an inner, simplified xsave/restore of the guest/host LBR MSRs is meaningful; the worst case is to drop this patch, since it's not correct to re-enable only the host LBR ctl while still leaving guest LBR data in the MSRs. Thanks for the reminder!


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-12 17:45             ` Jim Mattson
@ 2021-07-13  9:49               ` Like Xu
  2021-07-13 17:00                 ` Jim Mattson
  0 siblings, 1 reply; 40+ messages in thread
From: Like Xu @ 2021-07-13  9:49 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang, kvm,
	linux-kernel, kan.liang

On 13/7/2021 1:45 am, Jim Mattson wrote:
> On Mon, Jul 12, 2021 at 10:20 AM Jim Mattson <jmattson@google.com> wrote:
>>
>> On Mon, Jul 12, 2021 at 3:19 AM Like Xu <like.xu.linux@gmail.com> wrote:
>>>
>>> On 12/7/2021 5:53 pm, Yang Weijiang wrote:
>>>> On Fri, Jul 09, 2021 at 04:41:30PM -0700, Jim Mattson wrote:
>>>>> On Fri, Jul 9, 2021 at 3:54 PM Jim Mattson <jmattson@google.com> wrote:
>>>>>>
>>>>>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>>>>>>>
>>>>>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
>>>>>>> and reload it after vm-exit.
>>>>>>
>>>>>> I don't see anything being done here "before VM-entry" or "after
>>>>>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
>>>>>>
>>>>>> In any case, I don't see why this one MSR is special. It seems that if
>>>>>> the host is using the architectural LBR MSRs, then *all* of the host
>>>>>> architectural LBR MSRs have to be saved on vcpu_load and restored on
>>>>>> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
>>>>>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
>>>>>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
>>>>>
>>>>> It does seem like there is something special about IA32_LBR_DEPTH, though...
>>>>>
>>>>> Section 7.3.1 of the Intel® Architecture Instruction Set Extensions
>>>>> and Future Features Programming Reference
>>>>> says, "IA32_LBR_DEPTH is saved by XSAVES, but it is not written by
>>>>> XRSTORS in any circumstance." It seems like that would require some
>>>>> special handling if the host depth and the guest depth do not match.
>>>> In our vPMU design, guest depth is alway kept the same as that of host,
>>>> so this won't be a problem. But I'll double check the code again, thanks!
>>>
>>> KVM only exposes the host's depth value to the user space
>>> so the guest can only use the same depth as the host.
>>
>> The allowed depth supplied by KVM_GET_SUPPORTED_CPUID isn't enforced,
>> though, is it?

Like other hardware-dependent features, the functionality is not guaranteed to work properly if the guest uses an unsupported CPUID.

> 
> Also, doesn't this end up being a major constraint on future
> platforms? Every host that this vCPU will ever run on will have to use
> the same LBR depth as the host on which it was started.
> 

As a first step, we made the guest LBR feature only available for the
"migratable=off" user space, which is why we intentionally did not add
MSR_ARCH_LBR_* stuff to msrs_to_save_all[] in earlier versions.

But hopefully, we may make it at least migratable for Arch LBR.

I'm personally curious about the cost of using XSAVES to switch the
guest LBR MSRs during the VMX transition, and if the cost is unacceptable,
we may ask host perf to adjust different depths for threads.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-12 17:20           ` Jim Mattson
  2021-07-12 17:45             ` Jim Mattson
@ 2021-07-13  9:53             ` Yang Weijiang
  1 sibling, 0 replies; 40+ messages in thread
From: Yang Weijiang @ 2021-07-13  9:53 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Like Xu, Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang,
	kvm, linux-kernel, kan.liang

On Mon, Jul 12, 2021 at 10:20:04AM -0700, Jim Mattson wrote:
> On Mon, Jul 12, 2021 at 3:19 AM Like Xu <like.xu.linux@gmail.com> wrote:
> >
> > On 12/7/2021 5:53 pm, Yang Weijiang wrote:
> > > On Fri, Jul 09, 2021 at 04:41:30PM -0700, Jim Mattson wrote:
> > >> On Fri, Jul 9, 2021 at 3:54 PM Jim Mattson <jmattson@google.com> wrote:
> > >>>
> > >>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> > >>>>
> > >>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> > >>>> and reload it after vm-exit.
> > >>>
> > >>> I don't see anything being done here "before VM-entry" or "after
> > >>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
> > >>>
> > >>> In any case, I don't see why this one MSR is special. It seems that if
> > >>> the host is using the architectural LBR MSRs, then *all* of the host
> > >>> architectural LBR MSRs have to be saved on vcpu_load and restored on
> > >>> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> > >>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> > >>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
> > >>
> > >> It does seem like there is something special about IA32_LBR_DEPTH, though...
> > >>
> > >> Section 7.3.1 of the Intel® Architecture Instruction Set Extensions
> > >> and Future Features Programming Reference
> > >> says, "IA32_LBR_DEPTH is saved by XSAVES, but it is not written by
> > >> XRSTORS in any circumstance." It seems like that would require some
> > >> special handling if the host depth and the guest depth do not match.
> > > In our vPMU design, guest depth is alway kept the same as that of host,
> > > so this won't be a problem. But I'll double check the code again, thanks!
> >
> > KVM only exposes the host's depth value to the user space
> > so the guest can only use the same depth as the host.
> 
> The allowed depth supplied by KVM_GET_SUPPORTED_CPUID isn't enforced,
> though, is it?
Do you mean we should enforce a check on the depth written from user space via KVM_SET_CPUID2?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-13  9:47         ` Yang Weijiang
@ 2021-07-13 10:16           ` Like Xu
  2021-07-13 17:12             ` Jim Mattson
  0 siblings, 1 reply; 40+ messages in thread
From: Like Xu @ 2021-07-13 10:16 UTC (permalink / raw)
  To: Yang Weijiang, Jim Mattson
  Cc: pbonzini, seanjc, vkuznets, wei.w.wang, kvm, linux-kernel

On 13/7/2021 5:47 pm, Yang Weijiang wrote:
> On Mon, Jul 12, 2021 at 10:23:02AM -0700, Jim Mattson wrote:
>> On Mon, Jul 12, 2021 at 2:36 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>>>
>>> On Fri, Jul 09, 2021 at 03:54:53PM -0700, Jim Mattson wrote:
>>>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>>>>>
>>>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
>>>>> and reload it after vm-exit.
>>>>
>>>> I don't see anything being done here "before VM-entry" or "after
>>>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
>>>>
>>>> In any case, I don't see why this one MSR is special. It seems that if
>>>> the host is using the architectural LBR MSRs, then *all* of the host
>>>> architectural LBR MSRs have to be saved on vcpu_load and restored on
>>>> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
>>>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
>>>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
>>> I looked back on the discussion thread:
>>> https://patchwork.kernel.org/project/kvm/patch/20210303135756.1546253-8-like.xu@linux.intel.com/
>>> not sure why this code is added, but IMO, although fpu save/restore in outer loop
>>> covers this LBR MSR, but the operation points are far away from vm-entry/exit
>>> point, i.e., the guest MSR setting could leak to host side for a signicant
>>> long of time, it may cause host side profiling accuracy. if we save/restore it
>>> manually, it'll mitigate the issue signifcantly.
>>
>> I'll be interested to see how you distinguish the intermingled branch
>> streams, if you allow the host to record LBRs while the LBR MSRs
>> contain guest values!

It is fine for the guest that the real LBR MSRs still contain the guest
values after vm-exit, as long as there is no other LBR user in the current thread.

(The perf subsystem makes this data visible only to the current thread)

Except for MSR_ARCH_LBR_CTL, we don't want to add MSR switch overhead to
the VMX transition (just think about {from, to, info} * 32 entries).

If we have another LBR user (such as a "perf kvm") in the current thread,
the host and guest LBR users will create separate LBR events that compete
for the LBR in the current thread.

The final arbiter is the host perf scheduler. The host perf will
save/restore the contents of the LBR when switching between two
LBR events.

Indeed, if the LBR hardware is assigned to the host LBR event before
vm-entry, then the guest LBR feature will be broken and a warning
will be triggered on the host.

LBR is an exclusive hardware resource and cannot be shared
by different host/guest lbr_select configurations.

> I'll check if an inner simplified xsave/restore to guest/host LBR MSRs is meaningful,
> the worst case is to drop this patch since it's not correct to only enable host lbr ctl
> while still leaves guest LBR data in the MSRs. Thanks for the reminder!
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-13  9:49               ` Like Xu
@ 2021-07-13 17:00                 ` Jim Mattson
  2021-07-14 13:33                   ` Like Xu
  0 siblings, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2021-07-13 17:00 UTC (permalink / raw)
  To: Like Xu
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang, kvm,
	linux-kernel, kan.liang

On Tue, Jul 13, 2021 at 2:49 AM Like Xu <like.xu.linux@gmail.com> wrote:
>
> On 13/7/2021 1:45 am, Jim Mattson wrote:
> > On Mon, Jul 12, 2021 at 10:20 AM Jim Mattson <jmattson@google.com> wrote:
> >>
> >> On Mon, Jul 12, 2021 at 3:19 AM Like Xu <like.xu.linux@gmail.com> wrote:
> >>>
> >>> On 12/7/2021 5:53 pm, Yang Weijiang wrote:
> >>>> On Fri, Jul 09, 2021 at 04:41:30PM -0700, Jim Mattson wrote:
> >>>>> On Fri, Jul 9, 2021 at 3:54 PM Jim Mattson <jmattson@google.com> wrote:
> >>>>>>
> >>>>>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >>>>>>>
> >>>>>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> >>>>>>> and reload it after vm-exit.
> >>>>>>
> >>>>>> I don't see anything being done here "before VM-entry" or "after
> >>>>>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
> >>>>>>
> >>>>>> In any case, I don't see why this one MSR is special. It seems that if
> >>>>>> the host is using the architectural LBR MSRs, then *all* of the host
> >>>>>> architectural LBR MSRs have to be saved on vcpu_load and restored on
> >>>>>> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> >>>>>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> >>>>>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
> >>>>>
> >>>>> It does seem like there is something special about IA32_LBR_DEPTH, though...
> >>>>>
> >>>>> Section 7.3.1 of the Intel® Architecture Instruction Set Extensions
> >>>>> and Future Features Programming Reference
> >>>>> says, "IA32_LBR_DEPTH is saved by XSAVES, but it is not written by
> >>>>> XRSTORS in any circumstance." It seems like that would require some
> >>>>> special handling if the host depth and the guest depth do not match.
> >>>> In our vPMU design, guest depth is alway kept the same as that of host,
> >>>> so this won't be a problem. But I'll double check the code again, thanks!
> >>>
> >>> KVM only exposes the host's depth value to the user space
> >>> so the guest can only use the same depth as the host.
> >>
> >> The allowed depth supplied by KVM_GET_SUPPORTED_CPUID isn't enforced,
> >> though, is it?
>
> Like other hardware dependent features, the functionality will not
> promise to work properly if the guest uses the unsupported CPUID.

It's fine if it doesn't work in the guest, but can't a guest with the
wrong depth prevent the host LBRs from being reloaded when switching
back to the host state? It's definitely not okay for an ill-configured
guest to break host functionality.

> >
> > Also, doesn't this end up being a major constraint on future
> > platforms? Every host that this vCPU will ever run on will have to use
> > the same LBR depth as the host on which it was started.
> >
>
> As a first step, we made the guest LBR feature only available for the
> "migratable=off" user space, which is why we intentionally did not add
> MSR_ARCH_LBR_* stuff to msrs_to_save_all[] in earlier versions.

We have no such concept in our user space. Features that are not
migratable should clearly be identified as such by an appropriate KVM
API. At present, I don't believe there is such an API.

> But hopefully, we may make it at least migratable for Arch LBR.
>
> I'm personally curious about the cost of using XSAVES to swicth
> guest lbr msrs during vmx transaction, and if the cost is unacceptable,
> we may ask the perf host to adjust different depths for threads.
>
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-13 10:16           ` Like Xu
@ 2021-07-13 17:12             ` Jim Mattson
  2021-07-14 13:55               ` Like Xu
  0 siblings, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2021-07-13 17:12 UTC (permalink / raw)
  To: Like Xu
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang, kvm, linux-kernel

On Tue, Jul 13, 2021 at 3:16 AM Like Xu <like.xu.linux@gmail.com> wrote:
>
> On 13/7/2021 5:47 pm, Yang Weijiang wrote:
> > On Mon, Jul 12, 2021 at 10:23:02AM -0700, Jim Mattson wrote:
> >> On Mon, Jul 12, 2021 at 2:36 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >>>
> >>> On Fri, Jul 09, 2021 at 03:54:53PM -0700, Jim Mattson wrote:
> >>>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
> >>>>>
> >>>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
> >>>>> and reload it after vm-exit.
> >>>>
> >>>> I don't see anything being done here "before VM-entry" or "after
> >>>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
> >>>>
> >>>> In any case, I don't see why this one MSR is special. It seems that if
> >>>> the host is using the architectural LBR MSRs, then *all* of the host
> >>>> architectural LBR MSRs have to be saved on vcpu_load and restored on
> >>>> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
> >>>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
> >>>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
> >>> I looked back on the discussion thread:
> >>> https://patchwork.kernel.org/project/kvm/patch/20210303135756.1546253-8-like.xu@linux.intel.com/
> >>> not sure why this code is added, but IMO, although fpu save/restore in outer loop
> >>> covers this LBR MSR, but the operation points are far away from vm-entry/exit
> >>> point, i.e., the guest MSR setting could leak to host side for a signicant
> >>> long of time, it may cause host side profiling accuracy. if we save/restore it
> >>> manually, it'll mitigate the issue signifcantly.
> >>
> >> I'll be interested to see how you distinguish the intermingled branch
> >> streams, if you allow the host to record LBRs while the LBR MSRs
> >> contain guest values!
>
> The guest is pretty fine that the real LBR MSRs contain the guest values
> even after vm-exit if there is no other LBR user in the current thread.
>
> (The perf subsystem makes this data visible only to the current thread)
>
> Except for MSR_ARCH_LBR_CTL, we don't want to add msr switch overhead to
> the vmx transaction (just think about {from, to, info} * 32 entries).
>
> If we have other LBR user (such as a "perf kvm") in the current thread,
> the host/guest LBR user will create separate LBR events to compete for
> who can use the LBR in the the current thread.
>
> The final arbiter is the host perf scheduler. The host perf will
> save/restore the contents of the LBR when switching between two
> LBR events.
>
> Indeed, if the LBR hardware is assigned to the host LBR event before
> vm-entry, then the guest LBR feature will be broken and a warning
> will be triggered on the host.

Are you saying that the guest LBR feature only works some of the time?
How are failures communicated to the guest? If this feature doesn't
follow the architectural specification, perhaps you should consider
offering a paravirtual feature instead.

Warnings on the host, by the way, are almost completely useless. How
do I surface such a warning to a customer who has a misbehaving VM? At
the very least, user space should be notified of KVM emulation errors,
so I can get an appropriate message to the customer.

> LBR is the kind of exclusive hardware resource and cannot be shared
> by different host/guest lbr_select configurations.

In that case, it definitely sounds like guest architectural LBRs
should be a paravirtual feature, since you can't actually virtualize
the hardware.

> > I'll check if an inner simplified xsave/restore to guest/host LBR MSRs is meaningful,
> > the worst case is to drop this patch since it's not correct to only enable host lbr ctl
> > while still leaves guest LBR data in the MSRs. Thanks for the reminder!
> >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-13 17:00                 ` Jim Mattson
@ 2021-07-14 13:33                   ` Like Xu
  2021-07-14 16:15                     ` Jim Mattson
  0 siblings, 1 reply; 40+ messages in thread
From: Like Xu @ 2021-07-14 13:33 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang, kvm,
	linux-kernel, kan.liang

On 14/7/2021 1:00 am, Jim Mattson wrote:
> On Tue, Jul 13, 2021 at 2:49 AM Like Xu <like.xu.linux@gmail.com> wrote:
>>
>> On 13/7/2021 1:45 am, Jim Mattson wrote:
>>> On Mon, Jul 12, 2021 at 10:20 AM Jim Mattson <jmattson@google.com> wrote:
>>>>
>>>> On Mon, Jul 12, 2021 at 3:19 AM Like Xu <like.xu.linux@gmail.com> wrote:
>>>>>
>>>>> On 12/7/2021 5:53 pm, Yang Weijiang wrote:
>>>>>> On Fri, Jul 09, 2021 at 04:41:30PM -0700, Jim Mattson wrote:
>>>>>>> On Fri, Jul 9, 2021 at 3:54 PM Jim Mattson <jmattson@google.com> wrote:
>>>>>>>>
>>>>>>>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>>>>>>>>>
>>>>>>>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
>>>>>>>>> and reload it after vm-exit.
>>>>>>>>
>>>>>>>> I don't see anything being done here "before VM-entry" or "after
>>>>>>>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
>>>>>>>>
>>>>>>>> In any case, I don't see why this one MSR is special. It seems that if
>>>>>>>> the host is using the architectural LBR MSRs, then *all* of the host
>>>>>>>> architectural LBR MSRs have to be saved on vcpu_load and restored on
>>>>>>>> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
>>>>>>>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
>>>>>>>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
>>>>>>>
>>>>>>> It does seem like there is something special about IA32_LBR_DEPTH, though...
>>>>>>>
>>>>>>> Section 7.3.1 of the Intel® Architecture Instruction Set Extensions
>>>>>>> and Future Features Programming Reference
>>>>>>> says, "IA32_LBR_DEPTH is saved by XSAVES, but it is not written by
>>>>>>> XRSTORS in any circumstance." It seems like that would require some
>>>>>>> special handling if the host depth and the guest depth do not match.
>>>>>> In our vPMU design, guest depth is alway kept the same as that of host,
>>>>>> so this won't be a problem. But I'll double check the code again, thanks!
>>>>>
>>>>> KVM only exposes the host's depth value to the user space
>>>>> so the guest can only use the same depth as the host.
>>>>
>>>> The allowed depth supplied by KVM_GET_SUPPORTED_CPUID isn't enforced,
>>>> though, is it?
>>
>> Like other hardware dependent features, the functionality will not
>> promise to work properly if the guest uses the unsupported CPUID.
> 
> It's fine if it doesn't work in the guest, but can't a guest with the
> wrong depth prevent the host LBRs from being reloaded when switching
> back to the host state? It's definitely not okay for an ill-configured
> guest to break host functionality.

If the ownership of the LBR changes, there must be two (or more) LBR events
involved in the perf event scheduling switch for the current task, and the
perf subsystem callbacks will save the previous LBR event's state and
restore the next LBR event's state, taking the different depths into account.

> 
>>>
>>> Also, doesn't this end up being a major constraint on future
>>> platforms? Every host that this vCPU will ever run on will have to use
>>> the same LBR depth as the host on which it was started.
>>>
>>
>> As a first step, we made the guest LBR feature only available for the
>> "migratable=off" user space, which is why we intentionally did not add
>> MSR_ARCH_LBR_* stuff to msrs_to_save_all[] in earlier versions.
> 
> We have no such concept in our user space. Features that are not
> migratable should clearly be identified as such by an appropriate KVM
> API. At present, I don't believe there is such an API.

I couldn't agree with you more on this point.

We do have a code gap to make Arch LBR migratable for any KVM user.

> 
>> But hopefully, we may make it at least migratable for Arch LBR.
>>
>> I'm personally curious about the cost of using XSAVES to swicth
>> guest lbr msrs during vmx transaction, and if the cost is unacceptable,
>> we may ask the perf host to adjust different depths for threads.
>>
>>
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-13 17:12             ` Jim Mattson
@ 2021-07-14 13:55               ` Like Xu
  0 siblings, 0 replies; 40+ messages in thread
From: Like Xu @ 2021-07-14 13:55 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang, kvm, linux-kernel

On 14/7/2021 1:12 am, Jim Mattson wrote:
> On Tue, Jul 13, 2021 at 3:16 AM Like Xu <like.xu.linux@gmail.com> wrote:
>>
>> On 13/7/2021 5:47 pm, Yang Weijiang wrote:
>>> On Mon, Jul 12, 2021 at 10:23:02AM -0700, Jim Mattson wrote:
>>>> On Mon, Jul 12, 2021 at 2:36 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>>>>>
>>>>> On Fri, Jul 09, 2021 at 03:54:53PM -0700, Jim Mattson wrote:
>>>>>> On Fri, Jul 9, 2021 at 2:51 AM Yang Weijiang <weijiang.yang@intel.com> wrote:
>>>>>>>
>>>>>>> If host is using MSR_ARCH_LBR_CTL then save it before vm-entry
>>>>>>> and reload it after vm-exit.
>>>>>>
>>>>>> I don't see anything being done here "before VM-entry" or "after
>>>>>> VM-exit." This code seems to be invoked on vcpu_load and vcpu_put.
>>>>>>
>>>>>> In any case, I don't see why this one MSR is special. It seems that if
>>>>>> the host is using the architectural LBR MSRs, then *all* of the host
>>>>>> architectural LBR MSRs have to be saved on vcpu_load and restored on
>>>>>> vcpu_put. Shouldn't  kvm_load_guest_fpu() and kvm_put_guest_fpu() do
>>>>>> that via the calls to kvm_save_current_fpu(vcpu->arch.user_fpu) and
>>>>>> restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state)?
>>>>> I looked back on the discussion thread:
>>>>> https://patchwork.kernel.org/project/kvm/patch/20210303135756.1546253-8-like.xu@linux.intel.com/
>>>>> not sure why this code is added, but IMO, although fpu save/restore in outer loop
>>>>> covers this LBR MSR, but the operation points are far away from vm-entry/exit
>>>>> point, i.e., the guest MSR setting could leak to host side for a signicant
>>>>> long of time, it may cause host side profiling accuracy. if we save/restore it
>>>>> manually, it'll mitigate the issue signifcantly.
>>>>
>>>> I'll be interested to see how you distinguish the intermingled branch
>>>> streams, if you allow the host to record LBRs while the LBR MSRs
>>>> contain guest values!
>>
>> The guest is pretty fine that the real LBR MSRs contain the guest values
>> even after vm-exit if there is no other LBR user in the current thread.
>>
>> (The perf subsystem makes this data visible only to the current thread)
>>
>> Except for MSR_ARCH_LBR_CTL, we don't want to add msr switch overhead to
>> the vmx transaction (just think about {from, to, info} * 32 entries).
>>
>> If we have other LBR user (such as a "perf kvm") in the current thread,
>> the host/guest LBR user will create separate LBR events to compete for
>> who can use the LBR in the the current thread.
>>
>> The final arbiter is the host perf scheduler. The host perf will
>> save/restore the contents of the LBR when switching between two
>> LBR events.
>>
>> Indeed, if the LBR hardware is assigned to the host LBR event before
>> vm-entry, then the guest LBR feature will be broken and a warning
>> will be triggered on the host.
> 
> Are you saying that the guest LBR feature only works some of the time?

If other LBR events preempt KVM-owned LBR events on the current CPU,
we lose the guest LBR feature in the next vm-entry time slice.

> How are failures communicated to the guest? If this feature doesn't
> follow the architectural specification, perhaps you should consider
> offering a paravirtual feature instead.

A failure notification wouldn't bring the guest anything meaningful,
because by then the guest has lost both the hardware support and the
LBR data it captured.

> 
> Warnings on the host, by the way, are almost completely useless. How
> do I surface such a warning to a customer who has a misbehaving VM? At
> the very least, user space should be notified of KVM emulation errors,
> so I can get an appropriate message to the customer.

We have recommended in the earlier legacy LBR enabling commits
that CSP administrators are better off not using LBR on the host,
so as not to interfere with a guest that has its own LBR enabled.

> 
>> LBR is the kind of exclusive hardware resource and cannot be shared
>> by different host/guest lbr_select configurations.
> 
> In that case, it definitely sounds like guest architectural LBRs
> should be a paravirtual feature, since you can't actually virtualize
> the hardware.

My earlier plan was to have the VMX MSR switch handle the LBR MSRs when
another LBR event is competing (XSAVES may help a lot with the overhead),
on the expectation that this competition would be very short-lived. The
alternative is to force host perf to always treat the guest's demand as
the first-priority user in every situation, which is even harder to get
PeterZ to buy into.

> 
>>> I'll check if an inner simplified xsave/restore to guest/host LBR MSRs is meaningful,
>>> the worst case is to drop this patch since it's not correct to only enable host lbr ctl
>>> while still leaves guest LBR data in the MSRs. Thanks for the reminder!
>>>
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state
  2021-07-14 13:33                   ` Like Xu
@ 2021-07-14 16:15                     ` Jim Mattson
  0 siblings, 0 replies; 40+ messages in thread
From: Jim Mattson @ 2021-07-14 16:15 UTC (permalink / raw)
  To: Like Xu
  Cc: Yang Weijiang, pbonzini, seanjc, vkuznets, wei.w.wang, kvm,
	linux-kernel, kan.liang

On Wed, Jul 14, 2021 at 6:33 AM Like Xu <like.xu.linux@gmail.com> wrote:
>
> On 14/7/2021 1:00 am, Jim Mattson wrote:

> > We have no such concept in our user space. Features that are not
> > migratable should clearly be identified as such by an appropriate KVM
> > API. At present, I don't believe there is such an API.
>
> I couldn't agree with you more on this point.

Maybe KVM_GET_SUPPORTED_CPUID could be made more useful if it took an
argument, such as 'MAX', 'MIGRATABLE', etc.

> We do have a code gap to make Arch LBR migratable for any KVM user.

Right. My point is that the feature as architected appears to be
migratable. Therefore, it would be reasonable for userspace to assume
that any KVM implementation that claims to support the feature (via
KVM_GET_SUPPORTED_CPUID) should support it in a fully migratable way
(i.e. if the hardware supports the guest LBR depth, then it should be
able to run the VM, without loss of functionality).



Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-09 10:04 [PATCH v5 00/13] Introduce Architectural LBR for vPMU Yang Weijiang
2021-07-09 10:04 ` [PATCH v5 01/13] perf/x86/intel: Fix the comment about guest LBR support on KVM Yang Weijiang
2021-07-09 10:05 ` [PATCH v5 02/13] perf/x86/lbr: Simplify the exposure check for the LBR_INFO registers Yang Weijiang
2021-07-09 10:05 ` [PATCH v5 03/13] KVM: x86: Add arch LBR MSRs to msrs_to_save_all list Yang Weijiang
2021-07-09 18:24   ` Jim Mattson
2021-07-12  8:55     ` Yang Weijiang
2021-07-09 10:05 ` [PATCH v5 04/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_DEPTH for guest Arch LBR Yang Weijiang
2021-07-09 20:35   ` Jim Mattson
2021-07-12  9:17     ` Yang Weijiang
2021-07-09 10:05 ` [PATCH v5 05/13] KVM: vmx/pmu: Emulate MSR_ARCH_LBR_CTL " Yang Weijiang
2021-07-09 21:55   ` Jim Mattson
2021-07-12  9:36     ` Yang Weijiang
2021-07-12 10:10       ` Like Xu
2021-07-13  9:05         ` Yang Weijiang
2021-07-10  0:42   ` kernel test robot
2021-07-09 10:05 ` [PATCH v5 06/13] KVM: x86/vmx: Save/Restore host MSR_ARCH_LBR_CTL state Yang Weijiang
2021-07-09 22:54   ` Jim Mattson
2021-07-09 23:41     ` Jim Mattson
2021-07-12  9:53       ` Yang Weijiang
2021-07-12 10:19         ` Like Xu
2021-07-12 17:20           ` Jim Mattson
2021-07-12 17:45             ` Jim Mattson
2021-07-13  9:49               ` Like Xu
2021-07-13 17:00                 ` Jim Mattson
2021-07-14 13:33                   ` Like Xu
2021-07-14 16:15                     ` Jim Mattson
2021-07-13  9:53             ` Yang Weijiang
2021-07-12  9:50     ` Yang Weijiang
2021-07-12 17:23       ` Jim Mattson
2021-07-13  9:47         ` Yang Weijiang
2021-07-13 10:16           ` Like Xu
2021-07-13 17:12             ` Jim Mattson
2021-07-14 13:55               ` Like Xu
2021-07-09 10:05 ` [PATCH v5 07/13] KVM: x86/pmu: Refactor code to support guest Arch LBR Yang Weijiang
2021-07-09 10:05 ` [PATCH v5 08/13] KVM: x86: Refresh CPUID on writes to MSR_IA32_XSS Yang Weijiang
2021-07-09 10:05 ` [PATCH v5 09/13] KVM: x86: Report XSS as an MSR to be saved if there are supported features Yang Weijiang
2021-07-09 10:05 ` [PATCH v5 10/13] KVM: x86: Refine the matching and clearing logic for supported_xss Yang Weijiang
2021-07-09 10:05 ` [PATCH v5 11/13] KVM: x86: Add XSAVE Support for Architectural LBR Yang Weijiang
2021-07-09 10:05 ` [PATCH v5 12/13] KVM: x86/vmx: Check Arch LBR config when return perf capabilities Yang Weijiang
2021-07-09 10:05 ` [PATCH v5 13/13] KVM: x86/cpuid: Advise Arch LBR feature in CPUID Yang Weijiang
