LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes
@ 2020-03-02 19:57 Sean Christopherson
  2020-03-02 19:57 ` [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range Sean Christopherson
                   ` (6 more replies)
  0 siblings, 7 replies; 40+ messages in thread
From: Sean Christopherson @ 2020-03-02 19:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Jan Kiszka, Xiaoyao Li

Two fixes related to out-of-range CPUID emulation and related cleanup on
top.

I have a unit test and also manually verified a few interesting cases.
I'm not planning on posting the unit test at this time because I haven't
figured out how to avoid false positives, e.g. if a random in-bounds
leaf just happens to match the output of a max basic leaf.  It might be
doable by hardcoding the cpu model?

Sean Christopherson (6):
  KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr
  KVM: x86: Drop return value from kvm_cpuid()
  KVM: x86: Rename "found" variable in kvm_cpuid() to
    "exact_entry_exists"
  KVM: x86: Add requested index to the CPUID tracepoint

 arch/x86/include/asm/kvm_emulate.h |  3 ++-
 arch/x86/kvm/cpuid.c               | 26 ++++++++++++--------------
 arch/x86/kvm/cpuid.h               |  2 +-
 arch/x86/kvm/emulate.c             | 10 +---------
 arch/x86/kvm/trace.h               | 13 ++++++++-----
 arch/x86/kvm/x86.c                 | 10 ++++++++--
 6 files changed, 32 insertions(+), 32 deletions(-)

-- 
2.24.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-02 19:57 [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Sean Christopherson
@ 2020-03-02 19:57 ` Sean Christopherson
  2020-03-02 20:26   ` Jan Kiszka
  2020-03-03  2:50   ` Xiaoyao Li
  2020-03-02 19:57 ` [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges Sean Christopherson
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 40+ messages in thread
From: Sean Christopherson @ 2020-03-02 19:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Jan Kiszka, Xiaoyao Li

Rework kvm_cpuid() to query entry->function when adjusting the output
values so that the original function (in the aptly named "function") is
preserved for tracing.  This fixes a bug where trace_kvm_cpuid() will
trace the max function for a range instead of the requested function if
the requested function is out-of-range and an entry for the max function
exists.

Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/cpuid.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b1c469446b07..6be012937eba 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -997,12 +997,12 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
 	return max && function <= max->eax;
 }
 
+/* Returns true if the requested leaf/function exists in guest CPUID. */
 bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 	       u32 *ecx, u32 *edx, bool check_limit)
 {
-	u32 function = *eax, index = *ecx;
+	const u32 function = *eax, index = *ecx;
 	struct kvm_cpuid_entry2 *entry;
-	struct kvm_cpuid_entry2 *max;
 	bool found;
 
 	entry = kvm_find_cpuid_entry(vcpu, function, index);
@@ -1015,18 +1015,17 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 	 */
 	if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
 	    !cpuid_function_in_range(vcpu, function)) {
-		max = kvm_find_cpuid_entry(vcpu, 0, 0);
-		if (max) {
-			function = max->eax;
-			entry = kvm_find_cpuid_entry(vcpu, function, index);
-		}
+		entry = kvm_find_cpuid_entry(vcpu, 0, 0);
+		if (entry)
+			entry = kvm_find_cpuid_entry(vcpu, entry->eax, index);
 	}
 	if (entry) {
 		*eax = entry->eax;
 		*ebx = entry->ebx;
 		*ecx = entry->ecx;
 		*edx = entry->edx;
-		if (function == 7 && index == 0) {
+
+		if (entry->function == 7 && index == 0) {
 			u64 data;
 		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
 			    (data & TSX_CTRL_CPUID_CLEAR))
-- 
2.24.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  2020-03-02 19:57 [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Sean Christopherson
  2020-03-02 19:57 ` [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range Sean Christopherson
@ 2020-03-02 19:57 ` Sean Christopherson
  2020-03-02 21:59   ` Jim Mattson
  2020-03-03  3:25   ` Jim Mattson
  2020-03-02 19:57 ` [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr Sean Christopherson
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 40+ messages in thread
From: Sean Christopherson @ 2020-03-02 19:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Jan Kiszka, Xiaoyao Li

Extend the mask in cpuid_function_in_range() for finding the "class" of
the function to 0xfffffff00.  While there is no official definition of
what constitutes a class, e.g. arguably bits 31:16 should be the class
and bits 15:0 the functions within that class, the Hypervisor logic
effectively uses bits 31:8 as the class by virtue of checking for
different bases in increments of 0x100, e.g. KVM advertises its CPUID
functions starting at 0x40000100 when HyperV features are advertised at
the default base of 0x40000000.

Masking against 0x80000000 only handles basic and extended leafs, which
results in Centaur and Hypervisor range checks being performed against
the basic CPUID range, e.g. if CPUID.0x40000000.EAX=0x4000000A and there
is no entry for CPUID.0x40000006, then function 0x40000006 would be
incorrectly reported as out of bounds.

The bad range check doesn't cause function problems for any known VMM
because out-of-range semantics only come into play if the exact entry
isn't found, and VMMs either support a very limited Hypervisor range,
e.g. the official KVM range is 0x40000000-0x40000001 (effectively no
room for undefined leafs) or explicitly defines gaps to be zero, e.g.
Qemu explicitly creates zeroed entries up to the Cenatur and Hypervisor
limits (the latter comes into play when providing HyperV features).

The bad behavior can be visually confirmed by dumping CPUID output in
the guest when running Qemu with a stable TSC, as Qemu extends the limit
of range 0x40000000 to 0x40000010 to advertise VMware's cpuid_freq,
without defining zeroed entries for 0x40000002 - 0x4000000f.

Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6be012937eba..c320126e0118 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -993,7 +993,7 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
 {
 	struct kvm_cpuid_entry2 *max;
 
-	max = kvm_find_cpuid_entry(vcpu, function & 0x80000000, 0);
+	max = kvm_find_cpuid_entry(vcpu, function & 0xffffff00u, 0);
 	return max && function <= max->eax;
 }
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr
  2020-03-02 19:57 [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Sean Christopherson
  2020-03-02 19:57 ` [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range Sean Christopherson
  2020-03-02 19:57 ` [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges Sean Christopherson
@ 2020-03-02 19:57 ` Sean Christopherson
  2020-03-03  8:48   ` Paolo Bonzini
  2020-03-02 19:57 ` [PATCH 4/6] KVM: x86: Drop return value from kvm_cpuid() Sean Christopherson
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2020-03-02 19:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Jan Kiszka, Xiaoyao Li

Add a helper to retrieve cpuid_maxphyaddr() instead of manually
calculating the value in the emulator via raw CPUID output.  In addition
to consolidating logic, this also paves the way toward simplifying
kvm_cpuid(), whose somewhat confusing return value exists purely to
support the emulator's maxphyaddr calculation.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/asm/kvm_emulate.h |  1 +
 arch/x86/kvm/emulate.c             | 10 +---------
 arch/x86/kvm/x86.c                 |  6 ++++++
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index bf5f5e476f65..ded06515d30f 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -222,6 +222,7 @@ struct x86_emulate_ops {
 
 	bool (*get_cpuid)(struct x86_emulate_ctxt *ctxt, u32 *eax, u32 *ebx,
 			  u32 *ecx, u32 *edx, bool check_limit);
+	int (*get_cpuid_maxphyaddr)(struct x86_emulate_ctxt *ctxt);
 	bool (*guest_has_long_mode)(struct x86_emulate_ctxt *ctxt);
 	bool (*guest_has_movbe)(struct x86_emulate_ctxt *ctxt);
 	bool (*guest_has_fxsr)(struct x86_emulate_ctxt *ctxt);
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index dd19fb3539e0..bf02ed51e90f 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4244,16 +4244,8 @@ static int check_cr_write(struct x86_emulate_ctxt *ctxt)
 
 		ctxt->ops->get_msr(ctxt, MSR_EFER, &efer);
 		if (efer & EFER_LMA) {
-			u64 maxphyaddr;
-			u32 eax, ebx, ecx, edx;
+			int maxphyaddr = ctxt->ops->get_cpuid_maxphyaddr(ctxt);
 
-			eax = 0x80000008;
-			ecx = 0;
-			if (ctxt->ops->get_cpuid(ctxt, &eax, &ebx, &ecx,
-						 &edx, false))
-				maxphyaddr = eax & 0xff;
-			else
-				maxphyaddr = 36;
 			rsvd = rsvd_bits(maxphyaddr, 63);
 			if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_PCIDE)
 				rsvd &= ~X86_CR3_PCID_NOFLUSH;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ddd1d296bd20..5467ee71c25b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6209,6 +6209,11 @@ static bool emulator_get_cpuid(struct x86_emulate_ctxt *ctxt,
 	return kvm_cpuid(emul_to_vcpu(ctxt), eax, ebx, ecx, edx, check_limit);
 }
 
+static int emulator_get_cpuid_maxphyaddr(struct x86_emulate_ctxt *ctxt)
+{
+	return cpuid_maxphyaddr(emul_to_vcpu(ctxt));
+}
+
 static bool emulator_guest_has_long_mode(struct x86_emulate_ctxt *ctxt)
 {
 	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_LM);
@@ -6301,6 +6306,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.fix_hypercall       = emulator_fix_hypercall,
 	.intercept           = emulator_intercept,
 	.get_cpuid           = emulator_get_cpuid,
+	.get_cpuid_maxphyaddr= emulator_get_cpuid_maxphyaddr,
 	.guest_has_long_mode = emulator_guest_has_long_mode,
 	.guest_has_movbe     = emulator_guest_has_movbe,
 	.guest_has_fxsr      = emulator_guest_has_fxsr,
-- 
2.24.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 4/6] KVM: x86: Drop return value from kvm_cpuid()
  2020-03-02 19:57 [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Sean Christopherson
                   ` (2 preceding siblings ...)
  2020-03-02 19:57 ` [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr Sean Christopherson
@ 2020-03-02 19:57 ` Sean Christopherson
  2020-03-02 19:57 ` [PATCH 5/6] KVM: x86: Rename "found" variable in kvm_cpuid() to "exact_entry_exists" Sean Christopherson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Sean Christopherson @ 2020-03-02 19:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Jan Kiszka, Xiaoyao Li

Remove the boolean return from kvm_cpuid() now that all callers ignore
the return value.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/asm/kvm_emulate.h | 2 +-
 arch/x86/kvm/cpuid.c               | 4 +---
 arch/x86/kvm/cpuid.h               | 2 +-
 arch/x86/kvm/x86.c                 | 4 ++--
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index ded06515d30f..20a26dc792ce 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -220,7 +220,7 @@ struct x86_emulate_ops {
 			 struct x86_instruction_info *info,
 			 enum x86_intercept_stage stage);
 
-	bool (*get_cpuid)(struct x86_emulate_ctxt *ctxt, u32 *eax, u32 *ebx,
+	void (*get_cpuid)(struct x86_emulate_ctxt *ctxt, u32 *eax, u32 *ebx,
 			  u32 *ecx, u32 *edx, bool check_limit);
 	int (*get_cpuid_maxphyaddr)(struct x86_emulate_ctxt *ctxt);
 	bool (*guest_has_long_mode)(struct x86_emulate_ctxt *ctxt);
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index c320126e0118..869526930cf7 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -997,8 +997,7 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
 	return max && function <= max->eax;
 }
 
-/* Returns true if the requested leaf/function exists in guest CPUID. */
-bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
+void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 	       u32 *ecx, u32 *edx, bool check_limit)
 {
 	const u32 function = *eax, index = *ecx;
@@ -1049,7 +1048,6 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 		}
 	}
 	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, found);
-	return found;
 }
 EXPORT_SYMBOL_GPL(kvm_cpuid);
 
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 7366c618aa04..5df64ff25663 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -22,7 +22,7 @@ int kvm_vcpu_ioctl_set_cpuid2(struct kvm_vcpu *vcpu,
 int kvm_vcpu_ioctl_get_cpuid2(struct kvm_vcpu *vcpu,
 			      struct kvm_cpuid2 *cpuid,
 			      struct kvm_cpuid_entry2 __user *entries);
-bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
+void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 	       u32 *ecx, u32 *edx, bool check_limit);
 
 int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5467ee71c25b..bfff92fcf0d1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6203,10 +6203,10 @@ static int emulator_intercept(struct x86_emulate_ctxt *ctxt,
 	return kvm_x86_ops->check_intercept(emul_to_vcpu(ctxt), info, stage);
 }
 
-static bool emulator_get_cpuid(struct x86_emulate_ctxt *ctxt,
+static void emulator_get_cpuid(struct x86_emulate_ctxt *ctxt,
 			u32 *eax, u32 *ebx, u32 *ecx, u32 *edx, bool check_limit)
 {
-	return kvm_cpuid(emul_to_vcpu(ctxt), eax, ebx, ecx, edx, check_limit);
+	kvm_cpuid(emul_to_vcpu(ctxt), eax, ebx, ecx, edx, check_limit);
 }
 
 static int emulator_get_cpuid_maxphyaddr(struct x86_emulate_ctxt *ctxt)
-- 
2.24.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 5/6] KVM: x86: Rename "found" variable in kvm_cpuid() to "exact_entry_exists"
  2020-03-02 19:57 [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Sean Christopherson
                   ` (3 preceding siblings ...)
  2020-03-02 19:57 ` [PATCH 4/6] KVM: x86: Drop return value from kvm_cpuid() Sean Christopherson
@ 2020-03-02 19:57 ` Sean Christopherson
  2020-03-02 20:20   ` Jan Kiszka
  2020-03-02 19:57 ` [PATCH 6/6] KVM: x86: Add requested index to the CPUID tracepoint Sean Christopherson
  2020-03-03  8:48 ` [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Paolo Bonzini
  6 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2020-03-02 19:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Jan Kiszka, Xiaoyao Li

Rename "found" in kvm_cpuid() to "exact_entry_exists" to better convey
that the intent of the tracepoint's "found/not found" output is to trace
whether the output values are for the actual requested leaf or for some
other (likely unrelated) leaf that was found while processing entries to
emulate funky CPU behavior, e.g. the max basic leaf on Intel CPUs when
the requested CPUID leaf is out of range.

Suggested-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/cpuid.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 869526930cf7..b0a4f3c17932 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1002,10 +1002,10 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 {
 	const u32 function = *eax, index = *ecx;
 	struct kvm_cpuid_entry2 *entry;
-	bool found;
+	bool exact_entry_exists;
 
 	entry = kvm_find_cpuid_entry(vcpu, function, index);
-	found = entry;
+	exact_entry_exists = !!entry;
 	/*
 	 * Intel CPUID semantics treats any query for an out-of-range
 	 * leaf as if the highest basic leaf (i.e. CPUID.0H:EAX) were
@@ -1047,7 +1047,7 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 			}
 		}
 	}
-	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, found);
+	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, exact_entry_exists);
 }
 EXPORT_SYMBOL_GPL(kvm_cpuid);
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 6/6] KVM: x86: Add requested index to the CPUID tracepoint
  2020-03-02 19:57 [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Sean Christopherson
                   ` (4 preceding siblings ...)
  2020-03-02 19:57 ` [PATCH 5/6] KVM: x86: Rename "found" variable in kvm_cpuid() to "exact_entry_exists" Sean Christopherson
@ 2020-03-02 19:57 ` Sean Christopherson
  2020-03-07  9:48   ` Jan Kiszka
  2020-03-03  8:48 ` [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Paolo Bonzini
  6 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2020-03-02 19:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Jan Kiszka, Xiaoyao Li

Output the requested index when tracing CPUID emulation; it's basically
mandatory for leafs where the index is meaningful, and is helpful for
verifying KVM correctness even when the index isn't meaningful, e.g. the
trace for a Linux guest's hypervisor_cpuid_base() probing appears to
be broken (returns all zeroes) at first glance, but is correct because
the index is non-zero, i.e. the output values correspond to random index
in the maximum basic leaf.

Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/cpuid.c |  3 ++-
 arch/x86/kvm/trace.h | 13 ++++++++-----
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b0a4f3c17932..a3c9f6bf43f3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1047,7 +1047,8 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 			}
 		}
 	}
-	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, exact_entry_exists);
+	trace_kvm_cpuid(function, index, *eax, *ebx, *ecx, *edx,
+			exact_entry_exists);
 }
 EXPORT_SYMBOL_GPL(kvm_cpuid);
 
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index f194dd058470..aa372d0119f0 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -151,12 +151,14 @@ TRACE_EVENT(kvm_fast_mmio,
  * Tracepoint for cpuid.
  */
 TRACE_EVENT(kvm_cpuid,
-	TP_PROTO(unsigned int function, unsigned long rax, unsigned long rbx,
-		 unsigned long rcx, unsigned long rdx, bool found),
-	TP_ARGS(function, rax, rbx, rcx, rdx, found),
+	TP_PROTO(unsigned int function, unsigned int index, unsigned long rax,
+		 unsigned long rbx, unsigned long rcx, unsigned long rdx,
+		 bool found),
+	TP_ARGS(function, index, rax, rbx, rcx, rdx, found),
 
 	TP_STRUCT__entry(
 		__field(	unsigned int,	function	)
+		__field(	unsigned int,	index		)
 		__field(	unsigned long,	rax		)
 		__field(	unsigned long,	rbx		)
 		__field(	unsigned long,	rcx		)
@@ -166,6 +168,7 @@ TRACE_EVENT(kvm_cpuid,
 
 	TP_fast_assign(
 		__entry->function	= function;
+		__entry->index		= index;
 		__entry->rax		= rax;
 		__entry->rbx		= rbx;
 		__entry->rcx		= rcx;
@@ -173,8 +176,8 @@ TRACE_EVENT(kvm_cpuid,
 		__entry->found		= found;
 	),
 
-	TP_printk("func %x rax %lx rbx %lx rcx %lx rdx %lx, cpuid entry %s",
-		  __entry->function, __entry->rax,
+	TP_printk("func %x idx %x rax %lx rbx %lx rcx %lx rdx %lx, cpuid entry %s",
+		  __entry->function, __entry->index, __entry->rax,
 		  __entry->rbx, __entry->rcx, __entry->rdx,
 		  __entry->found ? "found" : "not found")
 );
-- 
2.24.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 5/6] KVM: x86: Rename "found" variable in kvm_cpuid() to "exact_entry_exists"
  2020-03-02 19:57 ` [PATCH 5/6] KVM: x86: Rename "found" variable in kvm_cpuid() to "exact_entry_exists" Sean Christopherson
@ 2020-03-02 20:20   ` Jan Kiszka
  2020-03-02 20:35     ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Kiszka @ 2020-03-02 20:20 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li

On 02.03.20 20:57, Sean Christopherson wrote:
> Rename "found" in kvm_cpuid() to "exact_entry_exists" to better convey
> that the intent of the tracepoint's "found/not found" output is to trace
> whether the output values are for the actual requested leaf or for some
> other (likely unrelated) leaf that was found while processing entries to
> emulate funky CPU behavior, e.g. the max basic leaf on Intel CPUs when
> the requested CPUID leaf is out of range.
> 
> Suggested-by: Jan Kiszka <jan.kiszka@siemens.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>   arch/x86/kvm/cpuid.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 869526930cf7..b0a4f3c17932 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1002,10 +1002,10 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>   {
>   	const u32 function = *eax, index = *ecx;
>   	struct kvm_cpuid_entry2 *entry;
> -	bool found;
> +	bool exact_entry_exists;
>   
>   	entry = kvm_find_cpuid_entry(vcpu, function, index);
> -	found = entry;
> +	exact_entry_exists = !!entry;
>   	/*
>   	 * Intel CPUID semantics treats any query for an out-of-range
>   	 * leaf as if the highest basic leaf (i.e. CPUID.0H:EAX) were
> @@ -1047,7 +1047,7 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>   			}
>   		}
>   	}
> -	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, found);
> +	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, exact_entry_exists);

Actually, I think we also what to change output in the tracepoint.

Jan

>   }
>   EXPORT_SYMBOL_GPL(kvm_cpuid);
>   
> 

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-02 19:57 ` [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range Sean Christopherson
@ 2020-03-02 20:26   ` Jan Kiszka
  2020-03-02 20:49     ` Sean Christopherson
  2020-03-03  2:50   ` Xiaoyao Li
  1 sibling, 1 reply; 40+ messages in thread
From: Jan Kiszka @ 2020-03-02 20:26 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li

On 02.03.20 20:57, Sean Christopherson wrote:
> Rework kvm_cpuid() to query entry->function when adjusting the output
> values so that the original function (in the aptly named "function") is
> preserved for tracing.  This fixes a bug where trace_kvm_cpuid() will
> trace the max function for a range instead of the requested function if
> the requested function is out-of-range and an entry for the max function
> exists.
> 
> Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
> Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>   arch/x86/kvm/cpuid.c | 15 +++++++--------
>   1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index b1c469446b07..6be012937eba 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -997,12 +997,12 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
>   	return max && function <= max->eax;
>   }
>   
> +/* Returns true if the requested leaf/function exists in guest CPUID. */
>   bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>   	       u32 *ecx, u32 *edx, bool check_limit)
>   {
> -	u32 function = *eax, index = *ecx;
> +	const u32 function = *eax, index = *ecx;
>   	struct kvm_cpuid_entry2 *entry;
> -	struct kvm_cpuid_entry2 *max;
>   	bool found;
>   
>   	entry = kvm_find_cpuid_entry(vcpu, function, index);
> @@ -1015,18 +1015,17 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>   	 */
>   	if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
>   	    !cpuid_function_in_range(vcpu, function)) {
> -		max = kvm_find_cpuid_entry(vcpu, 0, 0);
> -		if (max) {
> -			function = max->eax;
> -			entry = kvm_find_cpuid_entry(vcpu, function, index);
> -		}
> +		entry = kvm_find_cpuid_entry(vcpu, 0, 0);
> +		if (entry)
> +			entry = kvm_find_cpuid_entry(vcpu, entry->eax, index);
>   	}
>   	if (entry) {
>   		*eax = entry->eax;
>   		*ebx = entry->ebx;
>   		*ecx = entry->ecx;
>   		*edx = entry->edx;
> -		if (function == 7 && index == 0) {
> +
> +		if (entry->function == 7 && index == 0) {
>   			u64 data;
>   		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
>   			    (data & TSX_CTRL_CPUID_CLEAR))
> 

What about the !entry case below this? It was impacted by the function 
capping so far, not it's no longer.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 5/6] KVM: x86: Rename "found" variable in kvm_cpuid() to "exact_entry_exists"
  2020-03-02 20:20   ` Jan Kiszka
@ 2020-03-02 20:35     ` Sean Christopherson
  2020-03-02 20:48       ` Jan Kiszka
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2020-03-02 20:35 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

On Mon, Mar 02, 2020 at 09:20:52PM +0100, Jan Kiszka wrote:
> On 02.03.20 20:57, Sean Christopherson wrote:
> >Rename "found" in kvm_cpuid() to "exact_entry_exists" to better convey
> >that the intent of the tracepoint's "found/not found" output is to trace
> >whether the output values are for the actual requested leaf or for some
> >other (likely unrelated) leaf that was found while processing entries to
> >emulate funky CPU behavior, e.g. the max basic leaf on Intel CPUs when
> >the requested CPUID leaf is out of range.
> >
> >Suggested-by: Jan Kiszka <jan.kiszka@siemens.com>
> >Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> >---
> >  arch/x86/kvm/cpuid.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> >diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> >index 869526930cf7..b0a4f3c17932 100644
> >--- a/arch/x86/kvm/cpuid.c
> >+++ b/arch/x86/kvm/cpuid.c
> >@@ -1002,10 +1002,10 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> >  {
> >  	const u32 function = *eax, index = *ecx;
> >  	struct kvm_cpuid_entry2 *entry;
> >-	bool found;
> >+	bool exact_entry_exists;
> >  	entry = kvm_find_cpuid_entry(vcpu, function, index);
> >-	found = entry;
> >+	exact_entry_exists = !!entry;
> >  	/*
> >  	 * Intel CPUID semantics treats any query for an out-of-range
> >  	 * leaf as if the highest basic leaf (i.e. CPUID.0H:EAX) were
> >@@ -1047,7 +1047,7 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> >  			}
> >  		}
> >  	}
> >-	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, found);
> >+	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, exact_entry_exists);
> 
> Actually, I think we also what to change output in the tracepoint.

Oh, I definitely want to change it, but AIUI it's ABI and shouldn't be
changed.  Paolo?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 5/6] KVM: x86: Rename "found" variable in kvm_cpuid() to "exact_entry_exists"
  2020-03-02 20:35     ` Sean Christopherson
@ 2020-03-02 20:48       ` Jan Kiszka
  0 siblings, 0 replies; 40+ messages in thread
From: Jan Kiszka @ 2020-03-02 20:48 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

On 02.03.20 21:35, Sean Christopherson wrote:
> On Mon, Mar 02, 2020 at 09:20:52PM +0100, Jan Kiszka wrote:
>> On 02.03.20 20:57, Sean Christopherson wrote:
>>> Rename "found" in kvm_cpuid() to "exact_entry_exists" to better convey
>>> that the intent of the tracepoint's "found/not found" output is to trace
>>> whether the output values are for the actual requested leaf or for some
>>> other (likely unrelated) leaf that was found while processing entries to
>>> emulate funky CPU behavior, e.g. the max basic leaf on Intel CPUs when
>>> the requested CPUID leaf is out of range.
>>>
>>> Suggested-by: Jan Kiszka <jan.kiszka@siemens.com>
>>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>>> ---
>>>   arch/x86/kvm/cpuid.c | 6 +++---
>>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index 869526930cf7..b0a4f3c17932 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -1002,10 +1002,10 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>>   {
>>>   	const u32 function = *eax, index = *ecx;
>>>   	struct kvm_cpuid_entry2 *entry;
>>> -	bool found;
>>> +	bool exact_entry_exists;
>>>   	entry = kvm_find_cpuid_entry(vcpu, function, index);
>>> -	found = entry;
>>> +	exact_entry_exists = !!entry;
>>>   	/*
>>>   	 * Intel CPUID semantics treats any query for an out-of-range
>>>   	 * leaf as if the highest basic leaf (i.e. CPUID.0H:EAX) were
>>> @@ -1047,7 +1047,7 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>>   			}
>>>   		}
>>>   	}
>>> -	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, found);
>>> +	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, exact_entry_exists);
>>
>> Actually, I think we also what to change output in the tracepoint.
> 
> Oh, I definitely want to change it, but AIUI it's ABI and shouldn't be
> changed.  Paolo?
> 

This can be discovered via the format string in sysfs and handled by the 
consumer. But I'm not an expert in the tracing ABI.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-02 20:26   ` Jan Kiszka
@ 2020-03-02 20:49     ` Sean Christopherson
  2020-03-02 20:59       ` Jan Kiszka
  2020-03-03  2:27       ` Xiaoyao Li
  0 siblings, 2 replies; 40+ messages in thread
From: Sean Christopherson @ 2020-03-02 20:49 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

On Mon, Mar 02, 2020 at 09:26:54PM +0100, Jan Kiszka wrote:
> On 02.03.20 20:57, Sean Christopherson wrote:
> >Rework kvm_cpuid() to query entry->function when adjusting the output
> >values so that the original function (in the aptly named "function") is
> >preserved for tracing.  This fixes a bug where trace_kvm_cpuid() will
> >trace the max function for a range instead of the requested function if
> >the requested function is out-of-range and an entry for the max function
> >exists.
> >
> >Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
> >Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
> >Cc: Jim Mattson <jmattson@google.com>
> >Cc: Xiaoyao Li <xiaoyao.li@intel.com>
> >Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> >---
> >  arch/x86/kvm/cpuid.c | 15 +++++++--------
> >  1 file changed, 7 insertions(+), 8 deletions(-)
> >
> >diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> >index b1c469446b07..6be012937eba 100644
> >--- a/arch/x86/kvm/cpuid.c
> >+++ b/arch/x86/kvm/cpuid.c
> >@@ -997,12 +997,12 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
> >  	return max && function <= max->eax;
> >  }
> >+/* Returns true if the requested leaf/function exists in guest CPUID. */
> >  bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> >  	       u32 *ecx, u32 *edx, bool check_limit)
> >  {
> >-	u32 function = *eax, index = *ecx;
> >+	const u32 function = *eax, index = *ecx;
> >  	struct kvm_cpuid_entry2 *entry;
> >-	struct kvm_cpuid_entry2 *max;
> >  	bool found;
> >  	entry = kvm_find_cpuid_entry(vcpu, function, index);
> >@@ -1015,18 +1015,17 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> >  	 */
> >  	if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
> >  	    !cpuid_function_in_range(vcpu, function)) {
> >-		max = kvm_find_cpuid_entry(vcpu, 0, 0);
> >-		if (max) {
> >-			function = max->eax;
> >-			entry = kvm_find_cpuid_entry(vcpu, function, index);
> >-		}
> >+		entry = kvm_find_cpuid_entry(vcpu, 0, 0);
> >+		if (entry)
> >+			entry = kvm_find_cpuid_entry(vcpu, entry->eax, index);
> >  	}
> >  	if (entry) {
> >  		*eax = entry->eax;
> >  		*ebx = entry->ebx;
> >  		*ecx = entry->ecx;
> >  		*edx = entry->edx;
> >-		if (function == 7 && index == 0) {
> >+
> >+		if (entry->function == 7 && index == 0) {
> >  			u64 data;
> >  		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
> >  			    (data & TSX_CTRL_CPUID_CLEAR))
> >
> 
> What about the !entry case below this? It was impacted by the function
> capping so far, not it's no longer.

Hmm, the only way the output would be different is in a really contrived
scenario where userspace doesn't provide an entry for the max basic leaf.

The !entry path can only be reached with "orig_function != function" if
orig_function is out of range and there is no entry for the max basic leaf.
The adjustments for 0xb/0x1f require the max basic leaf to be 0xb or 0x1f,
and to take effect with !entry would require there to be a CPUID.max.1 but
not a CPUID.max.0.  That'd be a violation of Intel's SDM, i.e. it's bogus
userspace input and IMO can be ignored.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-02 20:49     ` Sean Christopherson
@ 2020-03-02 20:59       ` Jan Kiszka
  2020-03-03  2:27       ` Xiaoyao Li
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Kiszka @ 2020-03-02 20:59 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

On 02.03.20 21:49, Sean Christopherson wrote:
> On Mon, Mar 02, 2020 at 09:26:54PM +0100, Jan Kiszka wrote:
>> On 02.03.20 20:57, Sean Christopherson wrote:
>>> Rework kvm_cpuid() to query entry->function when adjusting the output
>>> values so that the original function (in the aptly named "function") is
>>> preserved for tracing.  This fixes a bug where trace_kvm_cpuid() will
>>> trace the max function for a range instead of the requested function if
>>> the requested function is out-of-range and an entry for the max function
>>> exists.
>>>
>>> Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
>>> Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
>>> Cc: Jim Mattson <jmattson@google.com>
>>> Cc: Xiaoyao Li <xiaoyao.li@intel.com>
>>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>>> ---
>>>   arch/x86/kvm/cpuid.c | 15 +++++++--------
>>>   1 file changed, 7 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index b1c469446b07..6be012937eba 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -997,12 +997,12 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
>>>   	return max && function <= max->eax;
>>>   }
>>> +/* Returns true if the requested leaf/function exists in guest CPUID. */
>>>   bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>>   	       u32 *ecx, u32 *edx, bool check_limit)
>>>   {
>>> -	u32 function = *eax, index = *ecx;
>>> +	const u32 function = *eax, index = *ecx;
>>>   	struct kvm_cpuid_entry2 *entry;
>>> -	struct kvm_cpuid_entry2 *max;
>>>   	bool found;
>>>   	entry = kvm_find_cpuid_entry(vcpu, function, index);
>>> @@ -1015,18 +1015,17 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>>   	 */
>>>   	if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
>>>   	    !cpuid_function_in_range(vcpu, function)) {
>>> -		max = kvm_find_cpuid_entry(vcpu, 0, 0);
>>> -		if (max) {
>>> -			function = max->eax;
>>> -			entry = kvm_find_cpuid_entry(vcpu, function, index);
>>> -		}
>>> +		entry = kvm_find_cpuid_entry(vcpu, 0, 0);
>>> +		if (entry)
>>> +			entry = kvm_find_cpuid_entry(vcpu, entry->eax, index);
>>>   	}
>>>   	if (entry) {
>>>   		*eax = entry->eax;
>>>   		*ebx = entry->ebx;
>>>   		*ecx = entry->ecx;
>>>   		*edx = entry->edx;
>>> -		if (function == 7 && index == 0) {
>>> +
>>> +		if (entry->function == 7 && index == 0) {
>>>   			u64 data;
>>>   		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
>>>   			    (data & TSX_CTRL_CPUID_CLEAR))
>>>
>>
>> What about the !entry case below this? It was impacted by the function
>> capping so far, not it's no longer.
> 
> Hmm, the only way the output would be different is in a really contrived
> scenario where userspace doesn't provide an entry for the max basic leaf.

I think I've seen that, a cap to 0x10, with QEMU and '-cpu host# when 
providing intentionally bogus values to cpuid.

Jan

> 
> The !entry path can only be reached with "orig_function != function" if
> orig_function is out of range and there is no entry for the max basic leaf.
> The adjustments for 0xb/0x1f require the max basic leaf to be 0xb or 0x1f,
> and to take effect with !entry would require there to be a CPUID.max.1 but
> not a CPUID.max.0.  That'd be a violation of Intel's SDM, i.e. it's bogus
> userspace input and IMO can be ignored.
> 

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  2020-03-02 19:57 ` [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges Sean Christopherson
@ 2020-03-02 21:59   ` Jim Mattson
  2020-03-03  0:57     ` Sean Christopherson
  2020-03-03  3:25   ` Jim Mattson
  1 sibling, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2020-03-02 21:59 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Jan Kiszka, Xiaoyao Li

On Mon, Mar 2, 2020 at 11:57 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> Extend the mask in cpuid_function_in_range() for finding the "class" of
> the function to 0xfffffff00.  While there is no official definition of
> what constitutes a class, e.g. arguably bits 31:16 should be the class
> and bits 15:0 the functions within that class, the Hypervisor logic
> effectively uses bits 31:8 as the class by virtue of checking for
> different bases in increments of 0x100, e.g. KVM advertises its CPUID
> functions starting at 0x40000100 when HyperV features are advertised at
> the default base of 0x40000000.

This convention deserves explicit documentation outside of the commit message.

> Masking against 0x80000000 only handles basic and extended leafs, which
> results in Centaur and Hypervisor range checks being performed against
> the basic CPUID range, e.g. if CPUID.0x40000000.EAX=0x4000000A and there
> is no entry for CPUID.0x40000006, then function 0x40000006 would be
> incorrectly reported as out of bounds.
>
> The bad range check doesn't cause function problems for any known VMM
> because out-of-range semantics only come into play if the exact entry
> isn't found, and VMMs either support a very limited Hypervisor range,
> e.g. the official KVM range is 0x40000000-0x40000001 (effectively no
> room for undefined leafs) or explicitly defines gaps to be zero, e.g.
> Qemu explicitly creates zeroed entries up to the Cenatur and Hypervisor
> limits (the latter comes into play when providing HyperV features).

Does Centaur implement the bizarre Intel behavior for out-of-bound
entries? It seems that if there are Centaur leaves defined, the CPUD
semantics should be those specified by Centaur.

> The bad behavior can be visually confirmed by dumping CPUID output in
> the guest when running Qemu with a stable TSC, as Qemu extends the limit
> of range 0x40000000 to 0x40000010 to advertise VMware's cpuid_freq,
> without defining zeroed entries for 0x40000002 - 0x4000000f.
>
> Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
> Cc: Jim Mattson <jmattson@google.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/kvm/cpuid.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 6be012937eba..c320126e0118 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -993,7 +993,7 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
>  {
>         struct kvm_cpuid_entry2 *max;
>
> -       max = kvm_find_cpuid_entry(vcpu, function & 0x80000000, 0);
> +       max = kvm_find_cpuid_entry(vcpu, function & 0xffffff00u, 0);

This assumes that CPUID.(function & 0xffffff00):EAX always contains
the maximum input value for the 256-entry range sharing the high 24
bits. I don't believe that convention has ever been established or
documented.

>         return max && function <= max->eax;
>  }
>
> --
> 2.24.1
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  2020-03-02 21:59   ` Jim Mattson
@ 2020-03-03  0:57     ` Sean Christopherson
  0 siblings, 0 replies; 40+ messages in thread
From: Sean Christopherson @ 2020-03-03  0:57 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Jan Kiszka, Xiaoyao Li

On Mon, Mar 02, 2020 at 01:59:10PM -0800, Jim Mattson wrote:
> On Mon, Mar 2, 2020 at 11:57 AM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > Extend the mask in cpuid_function_in_range() for finding the "class" of
> > the function to 0xfffffff00.  While there is no official definition of
> > what constitutes a class, e.g. arguably bits 31:16 should be the class
> > and bits 15:0 the functions within that class, the Hypervisor logic
> > effectively uses bits 31:8 as the class by virtue of checking for
> > different bases in increments of 0x100, e.g. KVM advertises its CPUID
> > functions starting at 0x40000100 when HyperV features are advertised at
> > the default base of 0x40000000.
> 
> This convention deserves explicit documentation outside of the commit message.

No argument there.
 
> > Masking against 0x80000000 only handles basic and extended leafs, which
> > results in Centaur and Hypervisor range checks being performed against
> > the basic CPUID range, e.g. if CPUID.0x40000000.EAX=0x4000000A and there
> > is no entry for CPUID.0x40000006, then function 0x40000006 would be
> > incorrectly reported as out of bounds.
> >
> > The bad range check doesn't cause function problems for any known VMM
> > because out-of-range semantics only come into play if the exact entry
> > isn't found, and VMMs either support a very limited Hypervisor range,
> > e.g. the official KVM range is 0x40000000-0x40000001 (effectively no
> > room for undefined leafs) or explicitly defines gaps to be zero, e.g.
> > Qemu explicitly creates zeroed entries up to the Cenatur and Hypervisor
> > limits (the latter comes into play when providing HyperV features).
> 
> Does Centaur implement the bizarre Intel behavior for out-of-bound
> entries? It seems that if there are Centaur leaves defined, the CPUD
> semantics should be those specified by Centaur.

Ah, right, because this code triggers on !=AMD, not ==Intel.  Your guess
is as good as mine, I've dug around a few times trying to track down a spec
for Centaur/VIA without success.

I would say that KVM's emulation behavior should probably be all or
nothing, i.e. either due Intel's silly logic for all ranges/classes or do
it for none.

> > The bad behavior can be visually confirmed by dumping CPUID output in
> > the guest when running Qemu with a stable TSC, as Qemu extends the limit
> > of range 0x40000000 to 0x40000010 to advertise VMware's cpuid_freq,
> > without defining zeroed entries for 0x40000002 - 0x4000000f.
> >
> > Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
> > Cc: Jim Mattson <jmattson@google.com>
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > ---
> >  arch/x86/kvm/cpuid.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 6be012937eba..c320126e0118 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -993,7 +993,7 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
> >  {
> >         struct kvm_cpuid_entry2 *max;
> >
> > -       max = kvm_find_cpuid_entry(vcpu, function & 0x80000000, 0);
> > +       max = kvm_find_cpuid_entry(vcpu, function & 0xffffff00u, 0);
> 
> This assumes that CPUID.(function & 0xffffff00):EAX always contains
> the maximum input value for the 256-entry range sharing the high 24
> bits. I don't believe that convention has ever been established or
> documented.

Not sure if it's formally documented, but it's well established.  The
closest thing I could find to documentation is the lkml thread where what's
implemented today (AFAICT) was proposed.

https://lore.kernel.org/lkml/48E3BBC1.2050607@goop.org/


The relevant linux code in Linux (arch/x86/include/asm/processor.h), where
@leaves contains the kernel's required minimum leaf to enable paravirt
stuff for the hypervisor.

static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)
{
        uint32_t base, eax, signature[3];

        for (base = 0x40000000; base < 0x40010000; base += 0x100) {
                cpuid(base, &eax, &signature[0], &signature[1], &signature[2]);

                if (!memcmp(sig, signature, 12) &&
                    (leaves == 0 || ((eax - base) >= leaves)))
                        return base;
        }

        return 0;
}

> >         return max && function <= max->eax;
> >  }
> >
> > --
> > 2.24.1
> >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-02 20:49     ` Sean Christopherson
  2020-03-02 20:59       ` Jan Kiszka
@ 2020-03-03  2:27       ` Xiaoyao Li
  2020-03-03  3:45         ` Sean Christopherson
  1 sibling, 1 reply; 40+ messages in thread
From: Xiaoyao Li @ 2020-03-03  2:27 UTC (permalink / raw)
  To: Sean Christopherson, Jan Kiszka
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

On 3/3/2020 4:49 AM, Sean Christopherson wrote:
> On Mon, Mar 02, 2020 at 09:26:54PM +0100, Jan Kiszka wrote:
>> On 02.03.20 20:57, Sean Christopherson wrote:
>>> Rework kvm_cpuid() to query entry->function when adjusting the output
>>> values so that the original function (in the aptly named "function") is
>>> preserved for tracing.  This fixes a bug where trace_kvm_cpuid() will
>>> trace the max function for a range instead of the requested function if
>>> the requested function is out-of-range and an entry for the max function
>>> exists.
>>>
>>> Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
>>> Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
>>> Cc: Jim Mattson <jmattson@google.com>
>>> Cc: Xiaoyao Li <xiaoyao.li@intel.com>
>>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>>> ---
>>>   arch/x86/kvm/cpuid.c | 15 +++++++--------
>>>   1 file changed, 7 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index b1c469446b07..6be012937eba 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -997,12 +997,12 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
>>>   	return max && function <= max->eax;
>>>   }
>>> +/* Returns true if the requested leaf/function exists in guest CPUID. */
>>>   bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>>   	       u32 *ecx, u32 *edx, bool check_limit)
>>>   {
>>> -	u32 function = *eax, index = *ecx;
>>> +	const u32 function = *eax, index = *ecx;
>>>   	struct kvm_cpuid_entry2 *entry;
>>> -	struct kvm_cpuid_entry2 *max;
>>>   	bool found;
>>>   	entry = kvm_find_cpuid_entry(vcpu, function, index);
>>> @@ -1015,18 +1015,17 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>>   	 */
>>>   	if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
>>>   	    !cpuid_function_in_range(vcpu, function)) {
>>> -		max = kvm_find_cpuid_entry(vcpu, 0, 0);
>>> -		if (max) {
>>> -			function = max->eax;
>>> -			entry = kvm_find_cpuid_entry(vcpu, function, index);
>>> -		}
>>> +		entry = kvm_find_cpuid_entry(vcpu, 0, 0);
>>> +		if (entry)
>>> +			entry = kvm_find_cpuid_entry(vcpu, entry->eax, index);
>>>   	}
>>>   	if (entry) {
>>>   		*eax = entry->eax;
>>>   		*ebx = entry->ebx;
>>>   		*ecx = entry->ecx;
>>>   		*edx = entry->edx;
>>> -		if (function == 7 && index == 0) {
>>> +
>>> +		if (entry->function == 7 && index == 0) {
>>>   			u64 data;
>>>   		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
>>>   			    (data & TSX_CTRL_CPUID_CLEAR))
>>>
>>
>> What about the !entry case below this? It was impacted by the function
>> capping so far, not it's no longer.
> 
> Hmm, the only way the output would be different is in a really contrived
> scenario where userspace doesn't provide an entry for the max basic leaf.
> 
> The !entry path can only be reached with "orig_function != function" if
> orig_function is out of range and there is no entry for the max basic leaf.

> The adjustments for 0xb/0x1f require the max basic leaf to be 0xb or 0x1f,
> and to take effect with !entry would require there to be a CPUID.max.1 but
> not a CPUID.max.0.  That'd be a violation of Intel's SDM, i.e. it's bogus
> userspace input and IMO can be ignored.
> 

Sorry I cannot catch you. Why it's a violation of Intel's SDM?

Supposing the max basic is 0x1f, and it queries cpuid(0x20, 0x5),
it should return cpuid(0x1f, 0x5).

But based on this patch, it returns all zeros.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-02 19:57 ` [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range Sean Christopherson
  2020-03-02 20:26   ` Jan Kiszka
@ 2020-03-03  2:50   ` Xiaoyao Li
  2020-03-03  4:08     ` Sean Christopherson
  1 sibling, 1 reply; 40+ messages in thread
From: Xiaoyao Li @ 2020-03-03  2:50 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Jan Kiszka

On 3/3/2020 3:57 AM, Sean Christopherson wrote:
> Rework kvm_cpuid() to query entry->function when adjusting the output
> values so that the original function (in the aptly named "function") is
> preserved for tracing.  This fixes a bug where trace_kvm_cpuid() will
> trace the max function for a range instead of the requested function if
> the requested function is out-of-range and an entry for the max function
> exists.
> 
> Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
> Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>   arch/x86/kvm/cpuid.c | 15 +++++++--------
>   1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index b1c469446b07..6be012937eba 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -997,12 +997,12 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
>   	return max && function <= max->eax;
>   }
>   
> +/* Returns true if the requested leaf/function exists in guest CPUID. */
>   bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>   	       u32 *ecx, u32 *edx, bool check_limit)
>   {
> -	u32 function = *eax, index = *ecx;
> +	const u32 function = *eax, index = *ecx;
>   	struct kvm_cpuid_entry2 *entry;
> -	struct kvm_cpuid_entry2 *max;
>   	bool found;
>   
>   	entry = kvm_find_cpuid_entry(vcpu, function, index);
> @@ -1015,18 +1015,17 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>   	 */
>   	if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
>   	    !cpuid_function_in_range(vcpu, function)) {
> -		max = kvm_find_cpuid_entry(vcpu, 0, 0);
> -		if (max) {
> -			function = max->eax;
> -			entry = kvm_find_cpuid_entry(vcpu, function, index);
> -		}
> +		entry = kvm_find_cpuid_entry(vcpu, 0, 0);
> +		if (entry)
> +			entry = kvm_find_cpuid_entry(vcpu, entry->eax, index);

There is a problem.

when queried leaf is out of range on Intel CPU, it returns the maximum 
basic leaf, and any dependence on input ECX (i.e., subleaf) value in the 
basic leaf is honored. As disclaimed in SDM of CPUID instruction.

The ECX should be honored if and only the leaf has a significant index.
If the leaf doesn't has a significant index, it just ignores the EDX 
input in bare metal.

So it should be something like:

if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
	!cpuid_function_in_range(vcpu, function)) {
	entry = kvm_find_cpuid_entry(vcpu, 0, 0);
	if (entry) {
		entry = kvm_find_cpuid_entry(vcpu, entry->eax, 0);
		if (entry &&
		    entry->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX ) {
			entry = kvm_find_cpuid_entry(vcpu, entry->eax,
						     index);
		}
	}
}

>   	}
>   	if (entry) {
>   		*eax = entry->eax;
>   		*ebx = entry->ebx;
>   		*ecx = entry->ecx;
>   		*edx = entry->edx;
> -		if (function == 7 && index == 0) {
> +
> +		if (entry->function == 7 && index == 0) {
>   			u64 data;
>   		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
>   			    (data & TSX_CTRL_CPUID_CLEAR))
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  2020-03-02 19:57 ` [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges Sean Christopherson
  2020-03-02 21:59   ` Jim Mattson
@ 2020-03-03  3:25   ` Jim Mattson
  2020-03-03  4:25     ` Jim Mattson
  1 sibling, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2020-03-03  3:25 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Jan Kiszka, Xiaoyao Li

On Mon, Mar 2, 2020 at 11:57 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:

> The bad behavior can be visually confirmed by dumping CPUID output in
> the guest when running Qemu with a stable TSC, as Qemu extends the limit
> of range 0x40000000 to 0x40000010 to advertise VMware's cpuid_freq,
> without defining zeroed entries for 0x40000002 - 0x4000000f.

I think it could be reasonably argued that this is a userspace bug.
Clearly, when userspace explicitly supplies the results for a leaf,
those results override the default CPUID values for that leaf. But I
haven't seen it documented anywhere that leaves *not* explicitly
supplied by userspace will override the default CPUID values, just
because they happen to appear in some magic range.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-03  2:27       ` Xiaoyao Li
@ 2020-03-03  3:45         ` Sean Christopherson
  2020-03-03  4:02           ` Xiaoyao Li
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2020-03-03  3:45 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Jan Kiszka, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, kvm, linux-kernel

On Tue, Mar 03, 2020 at 10:27:47AM +0800, Xiaoyao Li wrote:
> On 3/3/2020 4:49 AM, Sean Christopherson wrote:
> >On Mon, Mar 02, 2020 at 09:26:54PM +0100, Jan Kiszka wrote:
> >>On 02.03.20 20:57, Sean Christopherson wrote:
> >>>Rework kvm_cpuid() to query entry->function when adjusting the output
> >>>values so that the original function (in the aptly named "function") is
> >>>preserved for tracing.  This fixes a bug where trace_kvm_cpuid() will
> >>>trace the max function for a range instead of the requested function if
> >>>the requested function is out-of-range and an entry for the max function
> >>>exists.
> >>>
> >>>Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
> >>>Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>>Cc: Jim Mattson <jmattson@google.com>
> >>>Cc: Xiaoyao Li <xiaoyao.li@intel.com>
> >>>Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> >>>---
> >>>  arch/x86/kvm/cpuid.c | 15 +++++++--------
> >>>  1 file changed, 7 insertions(+), 8 deletions(-)
> >>>
> >>>diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> >>>index b1c469446b07..6be012937eba 100644
> >>>--- a/arch/x86/kvm/cpuid.c
> >>>+++ b/arch/x86/kvm/cpuid.c
> >>>@@ -997,12 +997,12 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
> >>>  	return max && function <= max->eax;
> >>>  }
> >>>+/* Returns true if the requested leaf/function exists in guest CPUID. */
> >>>  bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> >>>  	       u32 *ecx, u32 *edx, bool check_limit)
> >>>  {
> >>>-	u32 function = *eax, index = *ecx;
> >>>+	const u32 function = *eax, index = *ecx;
> >>>  	struct kvm_cpuid_entry2 *entry;
> >>>-	struct kvm_cpuid_entry2 *max;
> >>>  	bool found;
> >>>  	entry = kvm_find_cpuid_entry(vcpu, function, index);
> >>>@@ -1015,18 +1015,17 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> >>>  	 */
> >>>  	if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
> >>>  	    !cpuid_function_in_range(vcpu, function)) {
> >>>-		max = kvm_find_cpuid_entry(vcpu, 0, 0);
> >>>-		if (max) {
> >>>-			function = max->eax;
> >>>-			entry = kvm_find_cpuid_entry(vcpu, function, index);
> >>>-		}
> >>>+		entry = kvm_find_cpuid_entry(vcpu, 0, 0);
> >>>+		if (entry)
> >>>+			entry = kvm_find_cpuid_entry(vcpu, entry->eax, index);
> >>>  	}
> >>>  	if (entry) {
> >>>  		*eax = entry->eax;
> >>>  		*ebx = entry->ebx;
> >>>  		*ecx = entry->ecx;
> >>>  		*edx = entry->edx;
> >>>-		if (function == 7 && index == 0) {
> >>>+
> >>>+		if (entry->function == 7 && index == 0) {
> >>>  			u64 data;
> >>>  		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
> >>>  			    (data & TSX_CTRL_CPUID_CLEAR))
> >>>
> >>
> >>What about the !entry case below this? It was impacted by the function
> >>capping so far, not it's no longer.
> >
> >Hmm, the only way the output would be different is in a really contrived
> >scenario where userspace doesn't provide an entry for the max basic leaf.
> >
> >The !entry path can only be reached with "orig_function != function" if
> >orig_function is out of range and there is no entry for the max basic leaf.
> 
> >The adjustments for 0xb/0x1f require the max basic leaf to be 0xb or 0x1f,
> >and to take effect with !entry would require there to be a CPUID.max.1 but
> >not a CPUID.max.0.  That'd be a violation of Intel's SDM, i.e. it's bogus
> >userspace input and IMO can be ignored.
> >
> 
> Sorry I cannot catch you. Why it's a violation of Intel's SDM?

The case being discussed above would look like:

KVM CPUID Entries:
   Function   Index Output
   0x00000000 0x00: eax=0x0000000b ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
   0x00000001 0x00: eax=0x000906ea ebx=0x03000800 ecx=0xfffa3223 edx=0x0f8bfbff
   0x00000002 0x00: eax=0x00000001 ebx=0x00000000 ecx=0x0000004d edx=0x002c307d
   0x00000003 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000004 0x00: eax=0x00000121 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001
   0x00000004 0x01: eax=0x00000122 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001
   0x00000004 0x02: eax=0x00000143 ebx=0x03c0003f ecx=0x00000fff edx=0x00000001
   0x00000004 0x03: eax=0x00000163 ebx=0x03c0003f ecx=0x00003fff edx=0x00000006
   0x00000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000003 edx=0x00000000
   0x00000006 0x00: eax=0x00000004 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000007 0x00: eax=0x00000000 ebx=0x009c4fbb ecx=0x00000004 edx=0x84000000
   0x00000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x00000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
   0x0000000a 0x00: eax=0x07300402 ebx=0x00000000 ecx=0x00000000 edx=0x00000603
--> MISSING CPUID.0xB.0
   0x0000000b 0x01: eax=0x00000000 ebx=0x00000001 ecx=0x00000201 edx=0x00000003

CPUID.0xB.0 does not exist, so output.ECX=0, which indicates an invalid
level-type.

The SDM states (for CPUID.0xB):

   If an input value n in ECX returns the invalid level-type of 0 in ECX[15:8],
   other input values with ECX > n also return 0 in ECX[15:8]

That means returning a valid level-type in CPUID.0xB.1 as above violates
the SDM's definition of how leaf 0xB works.  I'm arguing we can ignore the
adjustments that would be done on output.E{C,D} for an out of range leaf
because the model is bogus.

> Supposing the max basic is 0x1f, and it queries cpuid(0x20, 0x5),
> it should return cpuid(0x1f, 0x5).
> 
> But based on this patch, it returns all zeros.

Have you tested the patch, or is your comment based on the above discussion
and/or code inspection?  Honest question, because I've thoroughly tested
the above scenario and it works as you describe, but now I'm worried I
completely botched my testing.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-03  3:45         ` Sean Christopherson
@ 2020-03-03  4:02           ` Xiaoyao Li
  2020-03-03  4:12             ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Xiaoyao Li @ 2020-03-03  4:02 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Jan Kiszka, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, kvm, linux-kernel

On 3/3/2020 11:45 AM, Sean Christopherson wrote:
> On Tue, Mar 03, 2020 at 10:27:47AM +0800, Xiaoyao Li wrote:
>> On 3/3/2020 4:49 AM, Sean Christopherson wrote:
>>> On Mon, Mar 02, 2020 at 09:26:54PM +0100, Jan Kiszka wrote:
>>>> On 02.03.20 20:57, Sean Christopherson wrote:
>>>>> Rework kvm_cpuid() to query entry->function when adjusting the output
>>>>> values so that the original function (in the aptly named "function") is
>>>>> preserved for tracing.  This fixes a bug where trace_kvm_cpuid() will
>>>>> trace the max function for a range instead of the requested function if
>>>>> the requested function is out-of-range and an entry for the max function
>>>>> exists.
>>>>>
>>>>> Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
>>>>> Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>> Cc: Jim Mattson <jmattson@google.com>
>>>>> Cc: Xiaoyao Li <xiaoyao.li@intel.com>
>>>>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>>>>> ---
>>>>>   arch/x86/kvm/cpuid.c | 15 +++++++--------
>>>>>   1 file changed, 7 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>>>> index b1c469446b07..6be012937eba 100644
>>>>> --- a/arch/x86/kvm/cpuid.c
>>>>> +++ b/arch/x86/kvm/cpuid.c
>>>>> @@ -997,12 +997,12 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
>>>>>   	return max && function <= max->eax;
>>>>>   }
>>>>> +/* Returns true if the requested leaf/function exists in guest CPUID. */
>>>>>   bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>>>>   	       u32 *ecx, u32 *edx, bool check_limit)
>>>>>   {
>>>>> -	u32 function = *eax, index = *ecx;
>>>>> +	const u32 function = *eax, index = *ecx;
>>>>>   	struct kvm_cpuid_entry2 *entry;
>>>>> -	struct kvm_cpuid_entry2 *max;
>>>>>   	bool found;
>>>>>   	entry = kvm_find_cpuid_entry(vcpu, function, index);
>>>>> @@ -1015,18 +1015,17 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>>>>   	 */
>>>>>   	if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
>>>>>   	    !cpuid_function_in_range(vcpu, function)) {
>>>>> -		max = kvm_find_cpuid_entry(vcpu, 0, 0);
>>>>> -		if (max) {
>>>>> -			function = max->eax;
>>>>> -			entry = kvm_find_cpuid_entry(vcpu, function, index);
>>>>> -		}
>>>>> +		entry = kvm_find_cpuid_entry(vcpu, 0, 0);
>>>>> +		if (entry)
>>>>> +			entry = kvm_find_cpuid_entry(vcpu, entry->eax, index);
>>>>>   	}
>>>>>   	if (entry) {
>>>>>   		*eax = entry->eax;
>>>>>   		*ebx = entry->ebx;
>>>>>   		*ecx = entry->ecx;
>>>>>   		*edx = entry->edx;
>>>>> -		if (function == 7 && index == 0) {
>>>>> +
>>>>> +		if (entry->function == 7 && index == 0) {
>>>>>   			u64 data;
>>>>>   		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
>>>>>   			    (data & TSX_CTRL_CPUID_CLEAR))
>>>>>
>>>>
>>>> What about the !entry case below this? It was impacted by the function
>>>> capping so far, not it's no longer.
>>>
>>> Hmm, the only way the output would be different is in a really contrived
>>> scenario where userspace doesn't provide an entry for the max basic leaf.
>>>
>>> The !entry path can only be reached with "orig_function != function" if
>>> orig_function is out of range and there is no entry for the max basic leaf.
>>
>>> The adjustments for 0xb/0x1f require the max basic leaf to be 0xb or 0x1f,
>>> and to take effect with !entry would require there to be a CPUID.max.1 but
>>> not a CPUID.max.0.  That'd be a violation of Intel's SDM, i.e. it's bogus
>>> userspace input and IMO can be ignored.
>>>
>>
>> Sorry I cannot catch you. Why it's a violation of Intel's SDM?
> 
> The case being discussed above would look like:
> 
> KVM CPUID Entries:
>     Function   Index Output
>     0x00000000 0x00: eax=0x0000000b ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
>     0x00000001 0x00: eax=0x000906ea ebx=0x03000800 ecx=0xfffa3223 edx=0x0f8bfbff
>     0x00000002 0x00: eax=0x00000001 ebx=0x00000000 ecx=0x0000004d edx=0x002c307d
>     0x00000003 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
>     0x00000004 0x00: eax=0x00000121 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001
>     0x00000004 0x01: eax=0x00000122 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001
>     0x00000004 0x02: eax=0x00000143 ebx=0x03c0003f ecx=0x00000fff edx=0x00000001
>     0x00000004 0x03: eax=0x00000163 ebx=0x03c0003f ecx=0x00003fff edx=0x00000006
>     0x00000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000003 edx=0x00000000
>     0x00000006 0x00: eax=0x00000004 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
>     0x00000007 0x00: eax=0x00000000 ebx=0x009c4fbb ecx=0x00000004 edx=0x84000000
>     0x00000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
>     0x00000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
>     0x0000000a 0x00: eax=0x07300402 ebx=0x00000000 ecx=0x00000000 edx=0x00000603
> --> MISSING CPUID.0xB.0
>     0x0000000b 0x01: eax=0x00000000 ebx=0x00000001 ecx=0x00000201 edx=0x00000003
> 
> CPUID.0xB.0 does not exist, so output.ECX=0, which indicates an invalid
> level-type.
> 
> The SDM states (for CPUID.0xB):
> 
>     If an input value n in ECX returns the invalid level-type of 0 in ECX[15:8],
>     other input values with ECX > n also return 0 in ECX[15:8]
> 
> That means returning a valid level-type in CPUID.0xB.1 as above violates
> the SDM's definition of how leaf 0xB works.  I'm arguing we can ignore the
> adjustments that would be done on output.E{C,D} for an out of range leaf
> because the model is bogus.

Right.

So we'd better do something in KVM_SET_CPUID* , to avoid userspace set 
bogus cpuid.

>> Supposing the max basic is 0x1f, and it queries cpuid(0x20, 0x5),
>> it should return cpuid(0x1f, 0x5).
>>
>> But based on this patch, it returns all zeros.
> 
> Have you tested the patch, or is your comment based on the above discussion
> and/or code inspection?  Honest question, because I've thoroughly tested
> the above scenario and it works as you describe, but now I'm worried I
> completely botched my testing.
> 

No, I didn't test.

Leaf 0xB and 0x1f are special cases when they are the maximum basic 
leaf, because no matter what subleaf is, there is always a non-zero 
E[CX,DX].

If cpuid.0 returns maximum basic leaf as 0xB/0x1F, when queried leaf is 
greater, it should always return a non-zero value.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-03  2:50   ` Xiaoyao Li
@ 2020-03-03  4:08     ` Sean Christopherson
  2020-03-03  4:16       ` Xiaoyao Li
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2020-03-03  4:08 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Jan Kiszka

On Tue, Mar 03, 2020 at 10:50:03AM +0800, Xiaoyao Li wrote:
> On 3/3/2020 3:57 AM, Sean Christopherson wrote:
> >Rework kvm_cpuid() to query entry->function when adjusting the output
> >values so that the original function (in the aptly named "function") is
> >preserved for tracing.  This fixes a bug where trace_kvm_cpuid() will
> >trace the max function for a range instead of the requested function if
> >the requested function is out-of-range and an entry for the max function
> >exists.
> >
> >Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
> >Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
> >Cc: Jim Mattson <jmattson@google.com>
> >Cc: Xiaoyao Li <xiaoyao.li@intel.com>
> >Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> >---
> >  arch/x86/kvm/cpuid.c | 15 +++++++--------
> >  1 file changed, 7 insertions(+), 8 deletions(-)
> >
> >diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> >index b1c469446b07..6be012937eba 100644
> >--- a/arch/x86/kvm/cpuid.c
> >+++ b/arch/x86/kvm/cpuid.c
> >@@ -997,12 +997,12 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
> >  	return max && function <= max->eax;
> >  }
> >+/* Returns true if the requested leaf/function exists in guest CPUID. */
> >  bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> >  	       u32 *ecx, u32 *edx, bool check_limit)
> >  {
> >-	u32 function = *eax, index = *ecx;
> >+	const u32 function = *eax, index = *ecx;
> >  	struct kvm_cpuid_entry2 *entry;
> >-	struct kvm_cpuid_entry2 *max;
> >  	bool found;
> >  	entry = kvm_find_cpuid_entry(vcpu, function, index);
> >@@ -1015,18 +1015,17 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> >  	 */
> >  	if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
> >  	    !cpuid_function_in_range(vcpu, function)) {
> >-		max = kvm_find_cpuid_entry(vcpu, 0, 0);
> >-		if (max) {
> >-			function = max->eax;
> >-			entry = kvm_find_cpuid_entry(vcpu, function, index);
> >-		}
> >+		entry = kvm_find_cpuid_entry(vcpu, 0, 0);
> >+		if (entry)
> >+			entry = kvm_find_cpuid_entry(vcpu, entry->eax, index);
> 
> There is a problem.
> 
> when queried leaf is out of range on Intel CPU, it returns the maximum basic
> leaf, and any dependence on input ECX (i.e., subleaf) value in the basic
> leaf is honored. As disclaimed in SDM of CPUID instruction.

That's what the code above does.

> The ECX should be honored if and only the leaf has a significant index.
> If the leaf doesn't has a significant index, it just ignores the EDX input

s/EDX/ECX

> in bare metal.
>
> So it should be something like:
> 
> if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
> 	!cpuid_function_in_range(vcpu, function)) {
> 	entry = kvm_find_cpuid_entry(vcpu, 0, 0);
> 	if (entry) {
> 		entry = kvm_find_cpuid_entry(vcpu, entry->eax, 0);
> 		if (entry &&
> 		    entry->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX ) {

This is unnecessary IMO.  The only scenario where SIGNFICANT_INDEX is 0
and cpuid_entry(entry->eax, 0) != cpuid_entry(entry->eax, index) is if
userspace created a cpuid entry for index>0 with SIGNFICANT_INDEX.  That's
a busted model, e.g. it'd be the SDM equivalent of an Intel CPU having
different output for CPUID.0x16.0 and CPUID.16.5 despite the SDM stating
that the CPUID.0x16 ignores the index.

E.g. on my system with a max basic leaf of 0x16

$ cpuid -1 -r
CPU:
   0x00000000 0x00: eax=0x00000016 ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
...
   0x00000016 0x00: eax=0x00000e74 ebx=0x0000125c ecx=0x00000064 edx=0x00000000

$ cpuid -1 -r -l 0x16
CPU:
   0x00000016 0x00: eax=0x00000e74 ebx=0x0000125c ecx=0x00000064 edx=0x00000000
~ $ cpuid -1 -r -l 0x16 -s 4
CPU:
   0x00000016 0x04: eax=0x00000e74 ebx=0x0000125c ecx=0x00000064 edx=0x00000000
~ $ cpuid -1 -r -l 0x16 -s 466
CPU:
   0x00000016 0x1d2: eax=0x00000e74 ebx=0x0000125c ecx=0x00000064 edx=0x00000000


If it returned anything else for CPUID.0x16.0x4 then it'd be a CPU bug.
Same thing here, it's a userspace bug if it creates a CPUID entry that
shouldn't exist.  E.g. ignoring Intel's silly "max basic leaf" behavior
for the moment, if userspace created a entry for CPUID.0x0.N it would
break the Linux kernel's cpu_detect(), as it doesn't initialize ECX when
doing CPUID.0x0.

> 			entry = kvm_find_cpuid_entry(vcpu, entry->eax,
> 						     index);
> 		}
> 	}
> }
> 
> >  	}
> >  	if (entry) {
> >  		*eax = entry->eax;
> >  		*ebx = entry->ebx;
> >  		*ecx = entry->ecx;
> >  		*edx = entry->edx;
> >-		if (function == 7 && index == 0) {
> >+
> >+		if (entry->function == 7 && index == 0) {
> >  			u64 data;
> >  		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
> >  			    (data & TSX_CTRL_CPUID_CLEAR))
> >
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-03  4:02           ` Xiaoyao Li
@ 2020-03-03  4:12             ` Sean Christopherson
  2020-03-03  4:30               ` Xiaoyao Li
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2020-03-03  4:12 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Jan Kiszka, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, kvm, linux-kernel

On Tue, Mar 03, 2020 at 12:02:39PM +0800, Xiaoyao Li wrote:
> On 3/3/2020 11:45 AM, Sean Christopherson wrote:
> >On Tue, Mar 03, 2020 at 10:27:47AM +0800, Xiaoyao Li wrote:
> >>Sorry I cannot catch you. Why it's a violation of Intel's SDM?
> >
> >The case being discussed above would look like:
> >
> >KVM CPUID Entries:
> >    Function   Index Output
> >    0x00000000 0x00: eax=0x0000000b ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
> >    0x00000001 0x00: eax=0x000906ea ebx=0x03000800 ecx=0xfffa3223 edx=0x0f8bfbff
> >    0x00000002 0x00: eax=0x00000001 ebx=0x00000000 ecx=0x0000004d edx=0x002c307d
> >    0x00000003 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
> >    0x00000004 0x00: eax=0x00000121 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001
> >    0x00000004 0x01: eax=0x00000122 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001
> >    0x00000004 0x02: eax=0x00000143 ebx=0x03c0003f ecx=0x00000fff edx=0x00000001
> >    0x00000004 0x03: eax=0x00000163 ebx=0x03c0003f ecx=0x00003fff edx=0x00000006
> >    0x00000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000003 edx=0x00000000
> >    0x00000006 0x00: eax=0x00000004 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
> >    0x00000007 0x00: eax=0x00000000 ebx=0x009c4fbb ecx=0x00000004 edx=0x84000000
> >    0x00000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
> >    0x00000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
> >    0x0000000a 0x00: eax=0x07300402 ebx=0x00000000 ecx=0x00000000 edx=0x00000603
> >--> MISSING CPUID.0xB.0
> >    0x0000000b 0x01: eax=0x00000000 ebx=0x00000001 ecx=0x00000201 edx=0x00000003
> >
> >CPUID.0xB.0 does not exist, so output.ECX=0, which indicates an invalid
> >level-type.
> >
> >The SDM states (for CPUID.0xB):
> >
> >    If an input value n in ECX returns the invalid level-type of 0 in ECX[15:8],
> >    other input values with ECX > n also return 0 in ECX[15:8]
> >
> >That means returning a valid level-type in CPUID.0xB.1 as above violates
> >the SDM's definition of how leaf 0xB works.  I'm arguing we can ignore the
> >adjustments that would be done on output.E{C,D} for an out of range leaf
> >because the model is bogus.
> 
> Right.
> 
> So we'd better do something in KVM_SET_CPUID* , to avoid userspace set bogus
> cpuid.
> 
> >>Supposing the max basic is 0x1f, and it queries cpuid(0x20, 0x5),
> >>it should return cpuid(0x1f, 0x5).
> >>
> >>But based on this patch, it returns all zeros.
> >
> >Have you tested the patch, or is your comment based on the above discussion
> >and/or code inspection?  Honest question, because I've thoroughly tested
> >the above scenario and it works as you describe, but now I'm worried I
> >completely botched my testing.
> >
> 
> No, I didn't test.
> 
> Leaf 0xB and 0x1f are special cases when they are the maximum basic leaf,
> because no matter what subleaf is, there is always a non-zero E[CX,DX].
> 
> If cpuid.0 returns maximum basic leaf as 0xB/0x1F, when queried leaf is
> greater, it should always return a non-zero value.

Yes, and that's userspace's responsibility to not screw up.  E.g. if
userspace didn't create CPUID.0xB.0 (as above) then it's not KVM's fault
for returning zeros when the guest executes CPUID.0xB.0.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-03  4:08     ` Sean Christopherson
@ 2020-03-03  4:16       ` Xiaoyao Li
  0 siblings, 0 replies; 40+ messages in thread
From: Xiaoyao Li @ 2020-03-03  4:16 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Jan Kiszka

On 3/3/2020 12:08 PM, Sean Christopherson wrote:
> On Tue, Mar 03, 2020 at 10:50:03AM +0800, Xiaoyao Li wrote:
>> On 3/3/2020 3:57 AM, Sean Christopherson wrote:
>>> Rework kvm_cpuid() to query entry->function when adjusting the output
>>> values so that the original function (in the aptly named "function") is
>>> preserved for tracing.  This fixes a bug where trace_kvm_cpuid() will
>>> trace the max function for a range instead of the requested function if
>>> the requested function is out-of-range and an entry for the max function
>>> exists.
>>>
>>> Fixes: 43561123ab37 ("kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH")
>>> Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
>>> Cc: Jim Mattson <jmattson@google.com>
>>> Cc: Xiaoyao Li <xiaoyao.li@intel.com>
>>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>>> ---
>>>   arch/x86/kvm/cpuid.c | 15 +++++++--------
>>>   1 file changed, 7 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index b1c469446b07..6be012937eba 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -997,12 +997,12 @@ static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
>>>   	return max && function <= max->eax;
>>>   }
>>> +/* Returns true if the requested leaf/function exists in guest CPUID. */
>>>   bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>>   	       u32 *ecx, u32 *edx, bool check_limit)
>>>   {
>>> -	u32 function = *eax, index = *ecx;
>>> +	const u32 function = *eax, index = *ecx;
>>>   	struct kvm_cpuid_entry2 *entry;
>>> -	struct kvm_cpuid_entry2 *max;
>>>   	bool found;
>>>   	entry = kvm_find_cpuid_entry(vcpu, function, index);
>>> @@ -1015,18 +1015,17 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>>   	 */
>>>   	if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
>>>   	    !cpuid_function_in_range(vcpu, function)) {
>>> -		max = kvm_find_cpuid_entry(vcpu, 0, 0);
>>> -		if (max) {
>>> -			function = max->eax;
>>> -			entry = kvm_find_cpuid_entry(vcpu, function, index);
>>> -		}
>>> +		entry = kvm_find_cpuid_entry(vcpu, 0, 0);
>>> +		if (entry)
>>> +			entry = kvm_find_cpuid_entry(vcpu, entry->eax, index);
>>
>> There is a problem.
>>
>> when queried leaf is out of range on Intel CPU, it returns the maximum basic
>> leaf, and any dependence on input ECX (i.e., subleaf) value in the basic
>> leaf is honored. As disclaimed in SDM of CPUID instruction.
> 
> That's what the code above does.
> 
>> The ECX should be honored if and only the leaf has a significant index.
>> If the leaf doesn't has a significant index, it just ignores the EDX input
> 
> s/EDX/ECX
> 
>> in bare metal.
>>
>> So it should be something like:
>>
>> if (!entry && check_limit && !guest_cpuid_is_amd(vcpu) &&
>> 	!cpuid_function_in_range(vcpu, function)) {
>> 	entry = kvm_find_cpuid_entry(vcpu, 0, 0);
>> 	if (entry) {
>> 		entry = kvm_find_cpuid_entry(vcpu, entry->eax, 0);
>> 		if (entry &&
>> 		    entry->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX ) {
> 
> This is unnecessary IMO.  The only scenario where SIGNFICANT_INDEX is 0
> and cpuid_entry(entry->eax, 0) != cpuid_entry(entry->eax, index) is if
> userspace created a cpuid entry for index>0 with SIGNFICANT_INDEX.  

I just forgot that is_matching_cpuid_entry() has taken SIGNIFICANT_INDEX 
into account.

Please ignore my stupid noise.

> a busted model, e.g. it'd be the SDM equivalent of an Intel CPU having
> different output for CPUID.0x16.0 and CPUID.16.5 despite the SDM stating
> that the CPUID.0x16 ignores the index.
> 
> E.g. on my system with a max basic leaf of 0x16
> 
> $ cpuid -1 -r
> CPU:
>     0x00000000 0x00: eax=0x00000016 ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
> ...
>     0x00000016 0x00: eax=0x00000e74 ebx=0x0000125c ecx=0x00000064 edx=0x00000000
> 
> $ cpuid -1 -r -l 0x16
> CPU:
>     0x00000016 0x00: eax=0x00000e74 ebx=0x0000125c ecx=0x00000064 edx=0x00000000
> ~ $ cpuid -1 -r -l 0x16 -s 4
> CPU:
>     0x00000016 0x04: eax=0x00000e74 ebx=0x0000125c ecx=0x00000064 edx=0x00000000
> ~ $ cpuid -1 -r -l 0x16 -s 466
> CPU:
>     0x00000016 0x1d2: eax=0x00000e74 ebx=0x0000125c ecx=0x00000064 edx=0x00000000
> 
> 
> If it returned anything else for CPUID.0x16.0x4 then it'd be a CPU bug.
> Same thing here, it's a userspace bug if it creates a CPUID entry that
> shouldn't exist.  E.g. ignoring Intel's silly "max basic leaf" behavior
> for the moment, if userspace created a entry for CPUID.0x0.N it would
> break the Linux kernel's cpu_detect(), as it doesn't initialize ECX when
> doing CPUID.0x0.
> 
>> 			entry = kvm_find_cpuid_entry(vcpu, entry->eax,
>> 						     index);
>> 		}
>> 	}
>> }
>>
>>>   	}
>>>   	if (entry) {
>>>   		*eax = entry->eax;
>>>   		*ebx = entry->ebx;
>>>   		*ecx = entry->ecx;
>>>   		*edx = entry->edx;
>>> -		if (function == 7 && index == 0) {
>>> +
>>> +		if (entry->function == 7 && index == 0) {
>>>   			u64 data;
>>>   		        if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
>>>   			    (data & TSX_CTRL_CPUID_CLEAR))
>>>
>>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  2020-03-03  3:25   ` Jim Mattson
@ 2020-03-03  4:25     ` Jim Mattson
  2020-03-03  4:58       ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2020-03-03  4:25 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Jan Kiszka, Xiaoyao Li

On Mon, Mar 2, 2020 at 7:25 PM Jim Mattson <jmattson@google.com> wrote:
>
> On Mon, Mar 2, 2020 at 11:57 AM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
>
> > The bad behavior can be visually confirmed by dumping CPUID output in
> > the guest when running Qemu with a stable TSC, as Qemu extends the limit
> > of range 0x40000000 to 0x40000010 to advertise VMware's cpuid_freq,
> > without defining zeroed entries for 0x40000002 - 0x4000000f.
>
> I think it could be reasonably argued that this is a userspace bug.
> Clearly, when userspace explicitly supplies the results for a leaf,
> those results override the default CPUID values for that leaf. But I
> haven't seen it documented anywhere that leaves *not* explicitly
> supplied by userspace will override the default CPUID values, just
> because they happen to appear in some magic range.

In fact, the more I think about it, the original change is correct, at
least in this regard. Your "fix" introduces undocumented and
unfathomable behavior.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range
  2020-03-03  4:12             ` Sean Christopherson
@ 2020-03-03  4:30               ` Xiaoyao Li
  0 siblings, 0 replies; 40+ messages in thread
From: Xiaoyao Li @ 2020-03-03  4:30 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Jan Kiszka, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, kvm, linux-kernel

On 3/3/2020 12:12 PM, Sean Christopherson wrote:
> On Tue, Mar 03, 2020 at 12:02:39PM +0800, Xiaoyao Li wrote:
>> On 3/3/2020 11:45 AM, Sean Christopherson wrote:
>>> On Tue, Mar 03, 2020 at 10:27:47AM +0800, Xiaoyao Li wrote:
>>>> Sorry I cannot catch you. Why it's a violation of Intel's SDM?
>>>
>>> The case being discussed above would look like:
>>>
>>> KVM CPUID Entries:
>>>     Function   Index Output
>>>     0x00000000 0x00: eax=0x0000000b ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
>>>     0x00000001 0x00: eax=0x000906ea ebx=0x03000800 ecx=0xfffa3223 edx=0x0f8bfbff
>>>     0x00000002 0x00: eax=0x00000001 ebx=0x00000000 ecx=0x0000004d edx=0x002c307d
>>>     0x00000003 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
>>>     0x00000004 0x00: eax=0x00000121 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001
>>>     0x00000004 0x01: eax=0x00000122 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001
>>>     0x00000004 0x02: eax=0x00000143 ebx=0x03c0003f ecx=0x00000fff edx=0x00000001
>>>     0x00000004 0x03: eax=0x00000163 ebx=0x03c0003f ecx=0x00003fff edx=0x00000006
>>>     0x00000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000003 edx=0x00000000
>>>     0x00000006 0x00: eax=0x00000004 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
>>>     0x00000007 0x00: eax=0x00000000 ebx=0x009c4fbb ecx=0x00000004 edx=0x84000000
>>>     0x00000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
>>>     0x00000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
>>>     0x0000000a 0x00: eax=0x07300402 ebx=0x00000000 ecx=0x00000000 edx=0x00000603
>>> --> MISSING CPUID.0xB.0
>>>     0x0000000b 0x01: eax=0x00000000 ebx=0x00000001 ecx=0x00000201 edx=0x00000003
>>>
>>> CPUID.0xB.0 does not exist, so output.ECX=0, which indicates an invalid
>>> level-type.
>>>
>>> The SDM states (for CPUID.0xB):
>>>
>>>     If an input value n in ECX returns the invalid level-type of 0 in ECX[15:8],
>>>     other input values with ECX > n also return 0 in ECX[15:8]
>>>
>>> That means returning a valid level-type in CPUID.0xB.1 as above violates
>>> the SDM's definition of how leaf 0xB works.  I'm arguing we can ignore the
>>> adjustments that would be done on output.E{C,D} for an out of range leaf
>>> because the model is bogus.
>>
>> Right.
>>
>> So we'd better do something in KVM_SET_CPUID* , to avoid userspace set bogus
>> cpuid.
>>
>>>> Supposing the max basic is 0x1f, and it queries cpuid(0x20, 0x5),
>>>> it should return cpuid(0x1f, 0x5).
>>>>
>>>> But based on this patch, it returns all zeros.
>>>
>>> Have you tested the patch, or is your comment based on the above discussion
>>> and/or code inspection?  Honest question, because I've thoroughly tested
>>> the above scenario and it works as you describe, but now I'm worried I
>>> completely botched my testing.
>>>
>>
>> No, I didn't test.
>>
>> Leaf 0xB and 0x1f are special cases when they are the maximum basic leaf,
>> because no matter what subleaf is, there is always a non-zero E[CX,DX].
>>
>> If cpuid.0 returns maximum basic leaf as 0xB/0x1F, when queried leaf is
>> greater, it should always return a non-zero value.
> 
> Yes, and that's userspace's responsibility to not screw up.  E.g. if
> userspace didn't create CPUID.0xB.0 (as above) then it's not KVM's fault
> for returning zeros when the guest executes CPUID.0xB.0.
> 

But this needs userspace to create all the subleaf of 0xB/0x1F. with a 
correct userspace, for example, it only creates one more subleaf of 
0xB/0x1F to indicate from this subleaf, no valid level-type anymore.

So when maximum basic leaf is 0x1f, cpuid(0x20, bigger than the first 
invalid subleaf created by userspace) returns all-zero.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  2020-03-03  4:25     ` Jim Mattson
@ 2020-03-03  4:58       ` Sean Christopherson
  2020-03-03 17:42         ` Jim Mattson
  0 siblings, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2020-03-03  4:58 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Jan Kiszka, Xiaoyao Li

On Mon, Mar 02, 2020 at 08:25:31PM -0800, Jim Mattson wrote:
> On Mon, Mar 2, 2020 at 7:25 PM Jim Mattson <jmattson@google.com> wrote:
> >
> > On Mon, Mar 2, 2020 at 11:57 AM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> >
> > > The bad behavior can be visually confirmed by dumping CPUID output in
> > > the guest when running Qemu with a stable TSC, as Qemu extends the limit
> > > of range 0x40000000 to 0x40000010 to advertise VMware's cpuid_freq,
> > > without defining zeroed entries for 0x40000002 - 0x4000000f.
> >
> > I think it could be reasonably argued that this is a userspace bug.
> > Clearly, when userspace explicitly supplies the results for a leaf,
> > those results override the default CPUID values for that leaf. But I
> > haven't seen it documented anywhere that leaves *not* explicitly
> > supplied by userspace will override the default CPUID values, just
> > because they happen to appear in some magic range.
> 
> In fact, the more I think about it, the original change is correct, at
> least in this regard. Your "fix" introduces undocumented and
> unfathomable behavior.

Heh, the takeaway from this is that whatever we decide on needs to be
documented somewhere :-)

I wouldn't say it's unfathomable, conceptually it seems like the intent
of the hypervisor range was to mimic the basic and extended ranges.  The
whole thing is arbitrary behavior.  Of course if Intel CPUs would just
return 0s on undefined leafs it would be a lot less arbitrary :-)
 
Anyways, I don't have a strong opinion on whether this patch stays or goes.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr
  2020-03-02 19:57 ` [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr Sean Christopherson
@ 2020-03-03  8:48   ` Paolo Bonzini
  2020-03-03  9:48     ` Jan Kiszka
  2020-03-03 16:28     ` Sean Christopherson
  0 siblings, 2 replies; 40+ messages in thread
From: Paolo Bonzini @ 2020-03-03  8:48 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Jan Kiszka, Xiaoyao Li

On 02/03/20 20:57, Sean Christopherson wrote:
> Add a helper to retrieve cpuid_maxphyaddr() instead of manually
> calculating the value in the emulator via raw CPUID output.  In addition
> to consolidating logic, this also paves the way toward simplifying
> kvm_cpuid(), whose somewhat confusing return value exists purely to
> support the emulator's maxphyaddr calculation.
> 
> No functional change intended.

I don't think this is a particularly useful change.  Yes, it's not
intuitive but is it more than a matter of documentation (and possibly
moving the check_cr_write snippet into a separate function)?

Paolo

> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/include/asm/kvm_emulate.h |  1 +
>  arch/x86/kvm/emulate.c             | 10 +---------
>  arch/x86/kvm/x86.c                 |  6 ++++++
>  3 files changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
> index bf5f5e476f65..ded06515d30f 100644
> --- a/arch/x86/include/asm/kvm_emulate.h
> +++ b/arch/x86/include/asm/kvm_emulate.h
> @@ -222,6 +222,7 @@ struct x86_emulate_ops {
>  
>  	bool (*get_cpuid)(struct x86_emulate_ctxt *ctxt, u32 *eax, u32 *ebx,
>  			  u32 *ecx, u32 *edx, bool check_limit);
> +	int (*get_cpuid_maxphyaddr)(struct x86_emulate_ctxt *ctxt);
>  	bool (*guest_has_long_mode)(struct x86_emulate_ctxt *ctxt);
>  	bool (*guest_has_movbe)(struct x86_emulate_ctxt *ctxt);
>  	bool (*guest_has_fxsr)(struct x86_emulate_ctxt *ctxt);
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index dd19fb3539e0..bf02ed51e90f 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -4244,16 +4244,8 @@ static int check_cr_write(struct x86_emulate_ctxt *ctxt)
>  
>  		ctxt->ops->get_msr(ctxt, MSR_EFER, &efer);
>  		if (efer & EFER_LMA) {
> -			u64 maxphyaddr;
> -			u32 eax, ebx, ecx, edx;
> +			int maxphyaddr = ctxt->ops->get_cpuid_maxphyaddr(ctxt);
>  
> -			eax = 0x80000008;
> -			ecx = 0;
> -			if (ctxt->ops->get_cpuid(ctxt, &eax, &ebx, &ecx,
> -						 &edx, false))
> -				maxphyaddr = eax & 0xff;
> -			else
> -				maxphyaddr = 36;
>  			rsvd = rsvd_bits(maxphyaddr, 63);
>  			if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_PCIDE)
>  				rsvd &= ~X86_CR3_PCID_NOFLUSH;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ddd1d296bd20..5467ee71c25b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6209,6 +6209,11 @@ static bool emulator_get_cpuid(struct x86_emulate_ctxt *ctxt,
>  	return kvm_cpuid(emul_to_vcpu(ctxt), eax, ebx, ecx, edx, check_limit);
>  }
>  
> +static int emulator_get_cpuid_maxphyaddr(struct x86_emulate_ctxt *ctxt)
> +{
> +	return cpuid_maxphyaddr(emul_to_vcpu(ctxt));
> +}
> +
>  static bool emulator_guest_has_long_mode(struct x86_emulate_ctxt *ctxt)
>  {
>  	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_LM);
> @@ -6301,6 +6306,7 @@ static const struct x86_emulate_ops emulate_ops = {
>  	.fix_hypercall       = emulator_fix_hypercall,
>  	.intercept           = emulator_intercept,
>  	.get_cpuid           = emulator_get_cpuid,
> +	.get_cpuid_maxphyaddr= emulator_get_cpuid_maxphyaddr,
>  	.guest_has_long_mode = emulator_guest_has_long_mode,
>  	.guest_has_movbe     = emulator_guest_has_movbe,
>  	.guest_has_fxsr      = emulator_guest_has_fxsr,
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes
  2020-03-02 19:57 [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Sean Christopherson
                   ` (5 preceding siblings ...)
  2020-03-02 19:57 ` [PATCH 6/6] KVM: x86: Add requested index to the CPUID tracepoint Sean Christopherson
@ 2020-03-03  8:48 ` Paolo Bonzini
  2020-03-03 16:38   ` Sean Christopherson
  6 siblings, 1 reply; 40+ messages in thread
From: Paolo Bonzini @ 2020-03-03  8:48 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Jan Kiszka, Xiaoyao Li

On 02/03/20 20:57, Sean Christopherson wrote:
> Two fixes related to out-of-range CPUID emulation and related cleanup on
> top.
> 
> I have a unit test and also manually verified a few interesting cases.
> I'm not planning on posting the unit test at this time because I haven't
> figured out how to avoid false positives, e.g. if a random in-bounds
> leaf just happens to match the output of a max basic leaf.  It might be
> doable by hardcoding the cpu model?

It would be best suited for selftests rather than kvm-unit-tests.  But I 
don't really see the benefit of anything more than just

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b1c469446b07..c1abf5de4461 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1001,6 +1001,7 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 	       u32 *ecx, u32 *edx, bool check_limit)
 {
 	u32 function = *eax, index = *ecx;
+	u32 orig_function = function;
 	struct kvm_cpuid_entry2 *entry;
 	struct kvm_cpuid_entry2 *max;
 	bool found;
@@ -1049,7 +1050,7 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
 			}
 		}
 	}
-	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, found);
+	trace_kvm_cpuid(orig_function, *eax, *ebx, *ecx, *edx, found);
 	return found;
 }
 EXPORT_SYMBOL_GPL(kvm_cpuid);


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr
  2020-03-03  8:48   ` Paolo Bonzini
@ 2020-03-03  9:48     ` Jan Kiszka
  2020-03-03 10:14       ` Paolo Bonzini
  2020-03-03 16:28     ` Sean Christopherson
  1 sibling, 1 reply; 40+ messages in thread
From: Jan Kiszka @ 2020-03-03  9:48 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li

On 03.03.20 09:48, Paolo Bonzini wrote:
> On 02/03/20 20:57, Sean Christopherson wrote:
>> Add a helper to retrieve cpuid_maxphyaddr() instead of manually
>> calculating the value in the emulator via raw CPUID output.  In addition
>> to consolidating logic, this also paves the way toward simplifying
>> kvm_cpuid(), whose somewhat confusing return value exists purely to
>> support the emulator's maxphyaddr calculation.
>>
>> No functional change intended.
> 
> I don't think this is a particularly useful change.  Yes, it's not
> intuitive but is it more than a matter of documentation (and possibly
> moving the check_cr_write snippet into a separate function)?

Besides the non obvious return value of the current function, this 
approach also avoids leaving cpuid traces for querying maxphyaddr, which 
is also not very intuitive IMHO.

Jan

> 
> Paolo
> 
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
>> ---
>>   arch/x86/include/asm/kvm_emulate.h |  1 +
>>   arch/x86/kvm/emulate.c             | 10 +---------
>>   arch/x86/kvm/x86.c                 |  6 ++++++
>>   3 files changed, 8 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
>> index bf5f5e476f65..ded06515d30f 100644
>> --- a/arch/x86/include/asm/kvm_emulate.h
>> +++ b/arch/x86/include/asm/kvm_emulate.h
>> @@ -222,6 +222,7 @@ struct x86_emulate_ops {
>>   
>>   	bool (*get_cpuid)(struct x86_emulate_ctxt *ctxt, u32 *eax, u32 *ebx,
>>   			  u32 *ecx, u32 *edx, bool check_limit);
>> +	int (*get_cpuid_maxphyaddr)(struct x86_emulate_ctxt *ctxt);
>>   	bool (*guest_has_long_mode)(struct x86_emulate_ctxt *ctxt);
>>   	bool (*guest_has_movbe)(struct x86_emulate_ctxt *ctxt);
>>   	bool (*guest_has_fxsr)(struct x86_emulate_ctxt *ctxt);
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index dd19fb3539e0..bf02ed51e90f 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -4244,16 +4244,8 @@ static int check_cr_write(struct x86_emulate_ctxt *ctxt)
>>   
>>   		ctxt->ops->get_msr(ctxt, MSR_EFER, &efer);
>>   		if (efer & EFER_LMA) {
>> -			u64 maxphyaddr;
>> -			u32 eax, ebx, ecx, edx;
>> +			int maxphyaddr = ctxt->ops->get_cpuid_maxphyaddr(ctxt);
>>   
>> -			eax = 0x80000008;
>> -			ecx = 0;
>> -			if (ctxt->ops->get_cpuid(ctxt, &eax, &ebx, &ecx,
>> -						 &edx, false))
>> -				maxphyaddr = eax & 0xff;
>> -			else
>> -				maxphyaddr = 36;
>>   			rsvd = rsvd_bits(maxphyaddr, 63);
>>   			if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_PCIDE)
>>   				rsvd &= ~X86_CR3_PCID_NOFLUSH;
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index ddd1d296bd20..5467ee71c25b 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -6209,6 +6209,11 @@ static bool emulator_get_cpuid(struct x86_emulate_ctxt *ctxt,
>>   	return kvm_cpuid(emul_to_vcpu(ctxt), eax, ebx, ecx, edx, check_limit);
>>   }
>>   
>> +static int emulator_get_cpuid_maxphyaddr(struct x86_emulate_ctxt *ctxt)
>> +{
>> +	return cpuid_maxphyaddr(emul_to_vcpu(ctxt));
>> +}
>> +
>>   static bool emulator_guest_has_long_mode(struct x86_emulate_ctxt *ctxt)
>>   {
>>   	return guest_cpuid_has(emul_to_vcpu(ctxt), X86_FEATURE_LM);
>> @@ -6301,6 +6306,7 @@ static const struct x86_emulate_ops emulate_ops = {
>>   	.fix_hypercall       = emulator_fix_hypercall,
>>   	.intercept           = emulator_intercept,
>>   	.get_cpuid           = emulator_get_cpuid,
>> +	.get_cpuid_maxphyaddr= emulator_get_cpuid_maxphyaddr,
>>   	.guest_has_long_mode = emulator_guest_has_long_mode,
>>   	.guest_has_movbe     = emulator_guest_has_movbe,
>>   	.guest_has_fxsr      = emulator_guest_has_fxsr,
>>
> 

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr
  2020-03-03  9:48     ` Jan Kiszka
@ 2020-03-03 10:14       ` Paolo Bonzini
  2020-03-04 20:47         ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Paolo Bonzini @ 2020-03-03 10:14 UTC (permalink / raw)
  To: Jan Kiszka, Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li

On 03/03/20 10:48, Jan Kiszka wrote:
>>
>> I don't think this is a particularly useful change.  Yes, it's not
>> intuitive but is it more than a matter of documentation (and possibly
>> moving the check_cr_write snippet into a separate function)?
> 
> Besides the non obvious return value of the current function, this
> approach also avoids leaving cpuid traces for querying maxphyaddr, which
> is also not very intuitive IMHO.

There are already other cases where we leave CPUID traces.  We can just
stop tracing if check_limit (which should be renamed to from_guest) is
true; there are other internal cases which call ctxt->ops->get_cpuid,
such as vendor_intel, and those should also use check_limit==true and
check the return value of ctxt->ops->get_cpuid.

Paolo


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr
  2020-03-03  8:48   ` Paolo Bonzini
  2020-03-03  9:48     ` Jan Kiszka
@ 2020-03-03 16:28     ` Sean Christopherson
  2020-03-03 17:21       ` Paolo Bonzini
  1 sibling, 1 reply; 40+ messages in thread
From: Sean Christopherson @ 2020-03-03 16:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Jan Kiszka, Xiaoyao Li

On Tue, Mar 03, 2020 at 09:48:03AM +0100, Paolo Bonzini wrote:
> On 02/03/20 20:57, Sean Christopherson wrote:
> > Add a helper to retrieve cpuid_maxphyaddr() instead of manually
> > calculating the value in the emulator via raw CPUID output.  In addition
> > to consolidating logic, this also paves the way toward simplifying
> > kvm_cpuid(), whose somewhat confusing return value exists purely to
> > support the emulator's maxphyaddr calculation.
> > 
> > No functional change intended.
> 
> I don't think this is a particularly useful change.  Yes, it's not
> intuitive but is it more than a matter of documentation (and possibly
> moving the check_cr_write snippet into a separate function)?

I really don't like duplicating the maxphyaddr logic.  I'm paranoid
something will come along and change the "effective" maxphyaddr and we'll
forget all about the emulator, e.g. SEV, TME and paravirt XO all dance
around maxphyaddr.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes
  2020-03-03  8:48 ` [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Paolo Bonzini
@ 2020-03-03 16:38   ` Sean Christopherson
  0 siblings, 0 replies; 40+ messages in thread
From: Sean Christopherson @ 2020-03-03 16:38 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Jan Kiszka, Xiaoyao Li

On Tue, Mar 03, 2020 at 09:48:44AM +0100, Paolo Bonzini wrote:
> On 02/03/20 20:57, Sean Christopherson wrote:
> > Two fixes related to out-of-range CPUID emulation and related cleanup on
> > top.
> > 
> > I have a unit test and also manually verified a few interesting cases.
> > I'm not planning on posting the unit test at this time because I haven't
> > figured out how to avoid false positives, e.g. if a random in-bounds
> > leaf just happens to match the output of a max basic leaf.  It might be
> > doable by hardcoding the cpu model?
> 
> It would be best suited for selftests rather than kvm-unit-tests.  But I 
> don't really see the benefit of anything more than just

Gotta save those stack bytes?

I got a bit confused by the "max" variable; I thought it would hold the
max basic leaf, not CPUID.0x0.  Removing it seemed easier than trying to
come up with a better name :-)
 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index b1c469446b07..c1abf5de4461 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1001,6 +1001,7 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>  	       u32 *ecx, u32 *edx, bool check_limit)
>  {
>  	u32 function = *eax, index = *ecx;
> +	u32 orig_function = function;
>  	struct kvm_cpuid_entry2 *entry;
>  	struct kvm_cpuid_entry2 *max;
>  	bool found;
> @@ -1049,7 +1050,7 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>  			}
>  		}
>  	}
> -	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, found);
> +	trace_kvm_cpuid(orig_function, *eax, *ebx, *ecx, *edx, found);
>  	return found;
>  }
>  EXPORT_SYMBOL_GPL(kvm_cpuid);
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr
  2020-03-03 16:28     ` Sean Christopherson
@ 2020-03-03 17:21       ` Paolo Bonzini
  0 siblings, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2020-03-03 17:21 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Jan Kiszka, Xiaoyao Li

On 03/03/20 17:28, Sean Christopherson wrote:
>> I don't think this is a particularly useful change.  Yes, it's not
>> intuitive but is it more than a matter of documentation (and possibly
>> moving the check_cr_write snippet into a separate function)?
> 
> I really don't like duplicating the maxphyaddr logic.  I'm paranoid
> something will come along and change the "effective" maxphyaddr and we'll
> forget all about the emulator, e.g. SEV, TME and paravirt XO all dance
> around maxphyaddr.

Well, I'm paranoid about breaking everything else... :)

Adding a separate emulator_get_maxphyaddr function would at least
simplify grepping, since searching for 0x80000008 isn't exactly intuitive.

Paolo


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  2020-03-03  4:58       ` Sean Christopherson
@ 2020-03-03 17:42         ` Jim Mattson
  2020-03-03 18:01           ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Jim Mattson @ 2020-03-03 17:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Jan Kiszka, Xiaoyao Li

Unfathomable was the wrong word. I can see what you're trying to do. I
just don't think it's defensible. I suspect that Intel CPU architects
will be surprised and disappointed to find that the maximum effective
value of CPUID.0H:EAX is now 255, and that they have to define
CPUID.100H:EAX as the "maximum leaf between 100H and 1FFH" if they
want to define any leaves between 100H and 1FFH.

Furthermore, AMD has only ceded 4000_0000h through 4000_00FFh to
hypervisors, so kvm's use of 40000100H through 400001FFH appears to be
a land grab, akin to VIA's unilateral grab of the C0000000H leaves.
Admittedly, one could argue that the 40000000H leaves are not AMD's to
apportion, since AMD and Intel appear to have reached a detente by
splitting the available space down the middle. Intel, who seems to be
the recognized authority for this range, declares the entire range
from 40000000H through 4FFFFFFFH to be invalid. Make of that what you
will.

In any event, no one has ever documented what's supposed to happen if
you leave gaps in the 4xxxxxxxH range when defining synthesized CPUID
leaves under kvm.

On Mon, Mar 2, 2020 at 8:58 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Mon, Mar 02, 2020 at 08:25:31PM -0800, Jim Mattson wrote:
> > On Mon, Mar 2, 2020 at 7:25 PM Jim Mattson <jmattson@google.com> wrote:
> > >
> > > On Mon, Mar 2, 2020 at 11:57 AM Sean Christopherson
> > > <sean.j.christopherson@intel.com> wrote:
> > >
> > > > The bad behavior can be visually confirmed by dumping CPUID output in
> > > > the guest when running Qemu with a stable TSC, as Qemu extends the limit
> > > > of range 0x40000000 to 0x40000010 to advertise VMware's cpuid_freq,
> > > > without defining zeroed entries for 0x40000002 - 0x4000000f.
> > >
> > > I think it could be reasonably argued that this is a userspace bug.
> > > Clearly, when userspace explicitly supplies the results for a leaf,
> > > those results override the default CPUID values for that leaf. But I
> > > haven't seen it documented anywhere that leaves *not* explicitly
> > > supplied by userspace will override the default CPUID values, just
> > > because they happen to appear in some magic range.
> >
> > In fact, the more I think about it, the original change is correct, at
> > least in this regard. Your "fix" introduces undocumented and
> > unfathomable behavior.
>
> Heh, the takeaway from this is that whatever we decide on needs to be
> documented somewhere :-)
>
> I wouldn't say it's unfathomable, conceptually it seems like the intent
> of the hypervisor range was to mimic the basic and extended ranges.  The
> whole thing is arbitrary behavior.  Of course if Intel CPUs would just
> return 0s on undefined leafs it would be a lot less arbitrary :-)
>
> Anyways, I don't have a strong opinion on whether this patch stays or goes.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  2020-03-03 17:42         ` Jim Mattson
@ 2020-03-03 18:01           ` Sean Christopherson
  2020-03-03 18:08             ` Jim Mattson
  2020-03-04 11:18             ` Paolo Bonzini
  0 siblings, 2 replies; 40+ messages in thread
From: Sean Christopherson @ 2020-03-03 18:01 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Jan Kiszka, Xiaoyao Li

On Tue, Mar 03, 2020 at 09:42:42AM -0800, Jim Mattson wrote:
> Unfathomable was the wrong word.

I dunno, one could argue that the behavior of Intel CPUs for CPUID is
unfathomable and I was just trying to follow suit :-D

>  I can see what you're trying to do. I
> just don't think it's defensible. I suspect that Intel CPU architects
> will be surprised and disappointed to find that the maximum effective
> value of CPUID.0H:EAX is now 255, and that they have to define
> CPUID.100H:EAX as the "maximum leaf between 100H and 1FFH" if they
> want to define any leaves between 100H and 1FFH.

Hmm, ya, I agree that applying a 0xffffff00 mask to all classes of CPUID
ranges is straight up wrong.

> Furthermore, AMD has only ceded 4000_0000h through 4000_00FFh to
> hypervisors, so kvm's use of 40000100H through 400001FFH appears to be
> a land grab, akin to VIA's unilateral grab of the C0000000H leaves.
> Admittedly, one could argue that the 40000000H leaves are not AMD's to
> apportion, since AMD and Intel appear to have reached a detente by
> splitting the available space down the middle. Intel, who seems to be
> the recognized authority for this range, declares the entire range
> from 40000000H through 4FFFFFFFH to be invalid. Make of that what you
> will.
> 
> In any event, no one has ever documented what's supposed to happen if
> you leave gaps in the 4xxxxxxxH range when defining synthesized CPUID
> leaves under kvm.

Probably stating the obvious, but for me, the least suprising thing is for
such leafs to output zeros.  It also feels safer, e.g. a guest that's
querying hypervisor support is less likely to be led astray by all zeros
than by a random feature bits being set.

What about something like this?  Along with a comment and documentation...

static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
{
	struct kvm_cpuid_entry2 *max;

	if (function >= 0x40000000 && function <= 0x4fffffff)
		max = kvm_find_cpuid_entry(vcpu, function & 0xffffff00, 0);
	else
		max = kvm_find_cpuid_entry(vcpu, function & 0x80000000, 0);
	return max && function <= max->eax;
}

> On Mon, Mar 2, 2020 at 8:58 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > On Mon, Mar 02, 2020 at 08:25:31PM -0800, Jim Mattson wrote:
> > > On Mon, Mar 2, 2020 at 7:25 PM Jim Mattson <jmattson@google.com> wrote:
> > > >
> > > > On Mon, Mar 2, 2020 at 11:57 AM Sean Christopherson
> > > > <sean.j.christopherson@intel.com> wrote:
> > > >
> > > > > The bad behavior can be visually confirmed by dumping CPUID output in
> > > > > the guest when running Qemu with a stable TSC, as Qemu extends the limit
> > > > > of range 0x40000000 to 0x40000010 to advertise VMware's cpuid_freq,
> > > > > without defining zeroed entries for 0x40000002 - 0x4000000f.
> > > >
> > > > I think it could be reasonably argued that this is a userspace bug.
> > > > Clearly, when userspace explicitly supplies the results for a leaf,
> > > > those results override the default CPUID values for that leaf. But I
> > > > haven't seen it documented anywhere that leaves *not* explicitly
> > > > supplied by userspace will override the default CPUID values, just
> > > > because they happen to appear in some magic range.
> > >
> > > In fact, the more I think about it, the original change is correct, at
> > > least in this regard. Your "fix" introduces undocumented and
> > > unfathomable behavior.
> >
> > Heh, the takeaway from this is that whatever we decide on needs to be
> > documented somewhere :-)
> >
> > I wouldn't say it's unfathomable, conceptually it seems like the intent
> > of the hypervisor range was to mimic the basic and extended ranges.  The
> > whole thing is arbitrary behavior.  Of course if Intel CPUs would just
> > return 0s on undefined leafs it would be a lot less arbitrary :-)
> >
> > Anyways, I don't have a strong opinion on whether this patch stays or goes.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  2020-03-03 18:01           ` Sean Christopherson
@ 2020-03-03 18:08             ` Jim Mattson
  2020-03-04 11:18             ` Paolo Bonzini
  1 sibling, 0 replies; 40+ messages in thread
From: Jim Mattson @ 2020-03-03 18:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel,
	kvm list, LKML, Jan Kiszka, Xiaoyao Li

On Tue, Mar 3, 2020 at 10:01 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Mar 03, 2020 at 09:42:42AM -0800, Jim Mattson wrote:
> > Unfathomable was the wrong word.
>
> I dunno, one could argue that the behavior of Intel CPUs for CPUID is
> unfathomable and I was just trying to follow suit :-D
>
> >  I can see what you're trying to do. I
> > just don't think it's defensible. I suspect that Intel CPU architects
> > will be surprised and disappointed to find that the maximum effective
> > value of CPUID.0H:EAX is now 255, and that they have to define
> > CPUID.100H:EAX as the "maximum leaf between 100H and 1FFH" if they
> > want to define any leaves between 100H and 1FFH.
>
> Hmm, ya, I agree that applying a 0xffffff00 mask to all classes of CPUID
> ranges is straight up wrong.
>
> > Furthermore, AMD has only ceded 4000_0000h through 4000_00FFh to
> > hypervisors, so kvm's use of 40000100H through 400001FFH appears to be
> > a land grab, akin to VIA's unilateral grab of the C0000000H leaves.
> > Admittedly, one could argue that the 40000000H leaves are not AMD's to
> > apportion, since AMD and Intel appear to have reached a detente by
> > splitting the available space down the middle. Intel, who seems to be
> > the recognized authority for this range, declares the entire range
> > from 40000000H through 4FFFFFFFH to be invalid. Make of that what you
> > will.
> >
> > In any event, no one has ever documented what's supposed to happen if
> > you leave gaps in the 4xxxxxxxH range when defining synthesized CPUID
> > leaves under kvm.
>
> Probably stating the obvious, but for me, the least suprising thing is for
> such leafs to output zeros.  It also feels safer, e.g. a guest that's
> querying hypervisor support is less likely to be led astray by all zeros
> than by a random feature bits being set.
>
> What about something like this?  Along with a comment and documentation...
>
> static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
> {
>         struct kvm_cpuid_entry2 *max;
>
>         if (function >= 0x40000000 && function <= 0x4fffffff)
>                 max = kvm_find_cpuid_entry(vcpu, function & 0xffffff00, 0);
>         else
>                 max = kvm_find_cpuid_entry(vcpu, function & 0x80000000, 0);
>         return max && function <= max->eax;
> }

I can get behind that. The behavior of the 4xxxxxxxH leaves under kvm
is arguably up to kvm (though AMD may disagree).

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges
  2020-03-03 18:01           ` Sean Christopherson
  2020-03-03 18:08             ` Jim Mattson
@ 2020-03-04 11:18             ` Paolo Bonzini
  1 sibling, 0 replies; 40+ messages in thread
From: Paolo Bonzini @ 2020-03-04 11:18 UTC (permalink / raw)
  To: Sean Christopherson, Jim Mattson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, kvm list, LKML,
	Jan Kiszka, Xiaoyao Li

On 03/03/20 19:01, Sean Christopherson wrote:
> static bool cpuid_function_in_range(struct kvm_vcpu *vcpu, u32 function)
> {
> 	struct kvm_cpuid_entry2 *max;
> 
> 	if (function >= 0x40000000 && function <= 0x4fffffff)
> 		max = kvm_find_cpuid_entry(vcpu, function & 0xffffff00, 0);
> 	else
> 		max = kvm_find_cpuid_entry(vcpu, function & 0x80000000, 0);
> 	return max && function <= max->eax;
> }

Yes, this is a good idea (except it should be & 0xc0000000 to cover
Centaur).

Paolo


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr
  2020-03-03 10:14       ` Paolo Bonzini
@ 2020-03-04 20:47         ` Sean Christopherson
  0 siblings, 0 replies; 40+ messages in thread
From: Sean Christopherson @ 2020-03-04 20:47 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Jan Kiszka, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

On Tue, Mar 03, 2020 at 11:14:22AM +0100, Paolo Bonzini wrote:
> On 03/03/20 10:48, Jan Kiszka wrote:
> >>
> >> I don't think this is a particularly useful change.  Yes, it's not
> >> intuitive but is it more than a matter of documentation (and possibly
> >> moving the check_cr_write snippet into a separate function)?
> > 
> > Besides the non obvious return value of the current function, this
> > approach also avoids leaving cpuid traces for querying maxphyaddr, which
> > is also not very intuitive IMHO.
> 
> There are already other cases where we leave CPUID traces.  We can just
> stop tracing if check_limit (which should be renamed to from_guest) is
> true; there are other internal cases which call ctxt->ops->get_cpuid,
> such as vendor_intel, and those should also use check_limit==true and
> check the return value of ctxt->ops->get_cpuid.

No, the vendor checks that use get_cpuid() shouldn't do check_limit=true,
they're looking for an exact match on the vendor.

Not that it matters.  @check_limit only comes into play on a vendor check
if CPUID.0 doesn't exist, and @check_limit only effects the output if
CPUID.0 _does_ exist.  I.e. the output for CPUID.0 is unaffected by
@check_limit.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/6] KVM: x86: Add requested index to the CPUID tracepoint
  2020-03-02 19:57 ` [PATCH 6/6] KVM: x86: Add requested index to the CPUID tracepoint Sean Christopherson
@ 2020-03-07  9:48   ` Jan Kiszka
  2020-03-10  4:00     ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Jan Kiszka @ 2020-03-07  9:48 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Xiaoyao Li

On 02.03.20 20:57, Sean Christopherson wrote:
> Output the requested index when tracing CPUID emulation; it's basically
> mandatory for leafs where the index is meaningful, and is helpful for
> verifying KVM correctness even when the index isn't meaningful, e.g. the
> trace for a Linux guest's hypervisor_cpuid_base() probing appears to
> be broken (returns all zeroes) at first glance, but is correct because
> the index is non-zero, i.e. the output values correspond to random index
> in the maximum basic leaf.
>
> Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Cc: Jan Kiszka <jan.kiszka@siemens.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>   arch/x86/kvm/cpuid.c |  3 ++-
>   arch/x86/kvm/trace.h | 13 ++++++++-----
>   2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index b0a4f3c17932..a3c9f6bf43f3 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1047,7 +1047,8 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>   			}
>   		}
>   	}
> -	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, exact_entry_exists);
> +	trace_kvm_cpuid(function, index, *eax, *ebx, *ecx, *edx,
> +			exact_entry_exists);
>   }
>   EXPORT_SYMBOL_GPL(kvm_cpuid);
>
> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> index f194dd058470..aa372d0119f0 100644
> --- a/arch/x86/kvm/trace.h
> +++ b/arch/x86/kvm/trace.h
> @@ -151,12 +151,14 @@ TRACE_EVENT(kvm_fast_mmio,
>    * Tracepoint for cpuid.
>    */
>   TRACE_EVENT(kvm_cpuid,
> -	TP_PROTO(unsigned int function, unsigned long rax, unsigned long rbx,
> -		 unsigned long rcx, unsigned long rdx, bool found),
> -	TP_ARGS(function, rax, rbx, rcx, rdx, found),
> +	TP_PROTO(unsigned int function, unsigned int index, unsigned long rax,
> +		 unsigned long rbx, unsigned long rcx, unsigned long rdx,
> +		 bool found),
> +	TP_ARGS(function, index, rax, rbx, rcx, rdx, found),
>
>   	TP_STRUCT__entry(
>   		__field(	unsigned int,	function	)
> +		__field(	unsigned int,	index		)
>   		__field(	unsigned long,	rax		)
>   		__field(	unsigned long,	rbx		)
>   		__field(	unsigned long,	rcx		)
> @@ -166,6 +168,7 @@ TRACE_EVENT(kvm_cpuid,
>
>   	TP_fast_assign(
>   		__entry->function	= function;
> +		__entry->index		= index;
>   		__entry->rax		= rax;
>   		__entry->rbx		= rbx;
>   		__entry->rcx		= rcx;
> @@ -173,8 +176,8 @@ TRACE_EVENT(kvm_cpuid,
>   		__entry->found		= found;
>   	),
>
> -	TP_printk("func %x rax %lx rbx %lx rcx %lx rdx %lx, cpuid entry %s",
> -		  __entry->function, __entry->rax,
> +	TP_printk("func %x idx %x rax %lx rbx %lx rcx %lx rdx %lx, cpuid entry %s",
> +		  __entry->function, __entry->index, __entry->rax,
>   		  __entry->rbx, __entry->rcx, __entry->rdx,
>   		  __entry->found ? "found" : "not found")
>   );
>

What happened to this patch in your v2 round?

Jan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 6/6] KVM: x86: Add requested index to the CPUID tracepoint
  2020-03-07  9:48   ` Jan Kiszka
@ 2020-03-10  4:00     ` Sean Christopherson
  0 siblings, 0 replies; 40+ messages in thread
From: Sean Christopherson @ 2020-03-10  4:00 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Xiaoyao Li

On Sat, Mar 07, 2020 at 10:48:25AM +0100, Jan Kiszka wrote:
> On 02.03.20 20:57, Sean Christopherson wrote:
> >Output the requested index when tracing CPUID emulation; it's basically
> >mandatory for leafs where the index is meaningful, and is helpful for
> >verifying KVM correctness even when the index isn't meaningful, e.g. the
> >trace for a Linux guest's hypervisor_cpuid_base() probing appears to
> >be broken (returns all zeroes) at first glance, but is correct because
> >the index is non-zero, i.e. the output values correspond to random index
> >in the maximum basic leaf.
> >
> >Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
> >Cc: Jan Kiszka <jan.kiszka@siemens.com>
> >Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> >---
> >  arch/x86/kvm/cpuid.c |  3 ++-
> >  arch/x86/kvm/trace.h | 13 ++++++++-----
> >  2 files changed, 10 insertions(+), 6 deletions(-)
> >
> >diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> >index b0a4f3c17932..a3c9f6bf43f3 100644
> >--- a/arch/x86/kvm/cpuid.c
> >+++ b/arch/x86/kvm/cpuid.c
> >@@ -1047,7 +1047,8 @@ void kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
> >  			}
> >  		}
> >  	}
> >-	trace_kvm_cpuid(function, *eax, *ebx, *ecx, *edx, exact_entry_exists);
> >+	trace_kvm_cpuid(function, index, *eax, *ebx, *ecx, *edx,
> >+			exact_entry_exists);
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_cpuid);
> >
> >diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> >index f194dd058470..aa372d0119f0 100644
> >--- a/arch/x86/kvm/trace.h
> >+++ b/arch/x86/kvm/trace.h
> >@@ -151,12 +151,14 @@ TRACE_EVENT(kvm_fast_mmio,
> >   * Tracepoint for cpuid.
> >   */
> >  TRACE_EVENT(kvm_cpuid,
> >-	TP_PROTO(unsigned int function, unsigned long rax, unsigned long rbx,
> >-		 unsigned long rcx, unsigned long rdx, bool found),
> >-	TP_ARGS(function, rax, rbx, rcx, rdx, found),
> >+	TP_PROTO(unsigned int function, unsigned int index, unsigned long rax,
> >+		 unsigned long rbx, unsigned long rcx, unsigned long rdx,
> >+		 bool found),
> >+	TP_ARGS(function, index, rax, rbx, rcx, rdx, found),
> >
> >  	TP_STRUCT__entry(
> >  		__field(	unsigned int,	function	)
> >+		__field(	unsigned int,	index		)
> >  		__field(	unsigned long,	rax		)
> >  		__field(	unsigned long,	rbx		)
> >  		__field(	unsigned long,	rcx		)
> >@@ -166,6 +168,7 @@ TRACE_EVENT(kvm_cpuid,
> >
> >  	TP_fast_assign(
> >  		__entry->function	= function;
> >+		__entry->index		= index;
> >  		__entry->rax		= rax;
> >  		__entry->rbx		= rbx;
> >  		__entry->rcx		= rcx;
> >@@ -173,8 +176,8 @@ TRACE_EVENT(kvm_cpuid,
> >  		__entry->found		= found;
> >  	),
> >
> >-	TP_printk("func %x rax %lx rbx %lx rcx %lx rdx %lx, cpuid entry %s",
> >-		  __entry->function, __entry->rax,
> >+	TP_printk("func %x idx %x rax %lx rbx %lx rcx %lx rdx %lx, cpuid entry %s",
> >+		  __entry->function, __entry->index, __entry->rax,
> >  		  __entry->rbx, __entry->rcx, __entry->rdx,
> >  		  __entry->found ? "found" : "not found")
> >  );
> >
> 
> What happened to this patch in your v2 round?

I completely forgot about it...

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2020-03-10  4:00 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-02 19:57 [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Sean Christopherson
2020-03-02 19:57 ` [PATCH 1/6] KVM: x86: Fix tracing of CPUID.function when function is out-of-range Sean Christopherson
2020-03-02 20:26   ` Jan Kiszka
2020-03-02 20:49     ` Sean Christopherson
2020-03-02 20:59       ` Jan Kiszka
2020-03-03  2:27       ` Xiaoyao Li
2020-03-03  3:45         ` Sean Christopherson
2020-03-03  4:02           ` Xiaoyao Li
2020-03-03  4:12             ` Sean Christopherson
2020-03-03  4:30               ` Xiaoyao Li
2020-03-03  2:50   ` Xiaoyao Li
2020-03-03  4:08     ` Sean Christopherson
2020-03-03  4:16       ` Xiaoyao Li
2020-03-02 19:57 ` [PATCH 2/6] KVM: x86: Fix CPUID range check for Centaur and Hypervisor ranges Sean Christopherson
2020-03-02 21:59   ` Jim Mattson
2020-03-03  0:57     ` Sean Christopherson
2020-03-03  3:25   ` Jim Mattson
2020-03-03  4:25     ` Jim Mattson
2020-03-03  4:58       ` Sean Christopherson
2020-03-03 17:42         ` Jim Mattson
2020-03-03 18:01           ` Sean Christopherson
2020-03-03 18:08             ` Jim Mattson
2020-03-04 11:18             ` Paolo Bonzini
2020-03-02 19:57 ` [PATCH 3/6] KVM: x86: Add dedicated emulator helper for grabbing CPUID.maxphyaddr Sean Christopherson
2020-03-03  8:48   ` Paolo Bonzini
2020-03-03  9:48     ` Jan Kiszka
2020-03-03 10:14       ` Paolo Bonzini
2020-03-04 20:47         ` Sean Christopherson
2020-03-03 16:28     ` Sean Christopherson
2020-03-03 17:21       ` Paolo Bonzini
2020-03-02 19:57 ` [PATCH 4/6] KVM: x86: Drop return value from kvm_cpuid() Sean Christopherson
2020-03-02 19:57 ` [PATCH 5/6] KVM: x86: Rename "found" variable in kvm_cpuid() to "exact_entry_exists" Sean Christopherson
2020-03-02 20:20   ` Jan Kiszka
2020-03-02 20:35     ` Sean Christopherson
2020-03-02 20:48       ` Jan Kiszka
2020-03-02 19:57 ` [PATCH 6/6] KVM: x86: Add requested index to the CPUID tracepoint Sean Christopherson
2020-03-07  9:48   ` Jan Kiszka
2020-03-10  4:00     ` Sean Christopherson
2020-03-03  8:48 ` [PATCH 0/6] KVM: x86: CPUID emulation and tracing fixes Paolo Bonzini
2020-03-03 16:38   ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).