LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v6 00/11] Intel Processor Trace virtualization enabling
@ 2018-03-20 11:21 Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 01/11] perf/x86/intel/pt: Move Intel-PT MSR bit definitions to a public header Luwei Kang
                   ` (11 more replies)
  0 siblings, 12 replies; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro, Luwei Kang

Hi All,

Here is a patch-series which adding Processor Trace enabling in KVM guest. You can get It's software developer manuals from:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
In Chapter 5 INTEL PROCESSOR TRACE: VMX IMPROVEMENTS.

Introduction:
Intel Processor Trace (Intel PT) is an extension of Intel Architecture that captures information about software execution using dedicated hardware facilities that cause only minimal performance perturbation to the software being traced. Details on the Intel PT infrastructure and trace capabilities can be found in the Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C.

The suite of architecture changes serve to simplify the process of virtualizing Intel PT for use by a guest software. There are two primary elements to this new architecture support for VMX support improvements made for Intel PT.
1. Addition of a new guest IA32_RTIT_CTL value field to the VMCS.
  — This serves to speed and simplify the process of disabling trace on VM exit, and restoring it on VM entry.
2. Enabling use of EPT to redirect PT output.
  — This enables the VMM to elect to virtualize the PT output buffer using EPT. In this mode, the CPU will treat PT output addresses as Guest Physical Addresses (GPAs) and translate them using EPT. This means that Intel PT output reads (of the ToPA table) and writes (of trace output) can cause EPT violations, and other output events.

Processor Trace virtualization can be work in one of 3 possible modes by set new option "pt_mode". Default value is system mode.
 a. system-wide: trace both host/guest and output to host buffer;
 b. host-only: only trace host and output to host buffer;
 c. host-guest: trace host/guest simultaneous and output to their respective buffer.

>From V5:
 - rename the function from pt_cap_get_ex() to __pt_cap_get();
 - replace the most of function from vmx_pt_supported() to "pt_mode == PT_MODE_HOST_GUEST"(or !=).

>From V4:
 - add data check when setting the value of MSR_IA32_RTIT_CTL;
 - Invoke new interface to set the intercept of MSRs read/write after "MSR bitmap per-vcpu" patches.

>From V3:
 - change default mode to SYSTEM mode;
 - add a new patch to move PT out of scattered features;
 - add a new fucntion kvm_get_pt_addr_cnt() to get the number of address ranges;
 - add a new function vmx_set_rtit_ctl() to set the value of guest RTIT_CTL, GUEST_IA32_RTIT_CTL and MSRs intercept.

>From v2:
 - replace *_PT_SUPPRESS_PIP to *_PT_CONCEAL_PIP;
 - clean SECONDARY_EXEC_PT_USE_GPA, VM_EXIT_CLEAR_IA32_RTIT_CTL and VM_ENTRY_LOAD_IA32_RTIT_CTL in SYSTEM mode. These bits must be all set or all clean;
 - move processor tracing out of scattered features;
 - add a new function to enable/disable intercept MSRs read/write;
 - add all Intel PT MSRs read/write and disable intercept when PT is enabled in guest;
 - disable Intel PT and enable intercept MSRs when L1 guest VMXON;
 - performance optimization.
   In Host only mode. we just need to save host RTIT_CTL before vm-entry and restore host RTIT_CTL after vm-exit;
   In HOST_GUEST mode. we need to save and restore all MSRs only when PT has enabled in guest.
 - use XSAVES/XRESTORES implement context switch.
   Haven't implementation in this version and still in debuging. will make a separate patch work on this.

>From v1:
 - remove guest-only mode because guest-only mode can be covered by host-guest mode;
 - always set "use GPA for processor tracing" in secondary execution control if it can be;
 - trap RTIT_CTL read/write. Forbid write this msr when VMXON in L1 hypervisor.

Chao Peng (8):
  perf/x86/intel/pt: Move Intel-PT MSR bit definitions to a public
    header
  perf/x86/intel/pt: Change pt_cap_get() to a public function
  KVM: x86: Add Intel Processor Trace virtualization mode
  KVM: x86: Add Intel Processor Trace cpuid emulation
  KVM: x86: Add Intel processor trace context for each vcpu
  KVM: x86: Implement Intel Processor Trace context switch
  KVM: x86: Implement Intel Processor Trace MSRs read/write
  KVM: x86: Set intercept for Intel PT MSRs

Luwei Kang (3):
  perf/x86/intel/pt: Introduce a new function to get the capability of
    Intel PT
  KVM: x86: Introduce a function to initialize the PT configuration
  KVM: x86: Disable Intel Processor Trace when VMXON in L1 guest

 arch/x86/events/intel/pt.c       |  16 +-
 arch/x86/events/intel/pt.h       |  58 ------
 arch/x86/include/asm/intel_pt.h  |  40 ++++
 arch/x86/include/asm/kvm_host.h  |   1 +
 arch/x86/include/asm/msr-index.h |  38 ++++
 arch/x86/include/asm/vmx.h       |   8 +
 arch/x86/kvm/cpuid.c             |  22 ++-
 arch/x86/kvm/svm.c               |   6 +
 arch/x86/kvm/vmx.c               | 412 ++++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/x86.c               |  33 +++-
 10 files changed, 565 insertions(+), 69 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 01/11] perf/x86/intel/pt: Move Intel-PT MSR bit definitions to a public header
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-04-30 13:57   ` Peter Zijlstra
  2018-03-20 11:21 ` [PATCH v6 02/11] perf/x86/intel/pt: Change pt_cap_get() to a public function Luwei Kang
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
	Chao Peng, Luwei Kang

From: Chao Peng <chao.p.peng@linux.intel.com>

Intel Processor Trace virtualization enabling in guest need
to use these MSR bits, so move them to public header msr-index.h.
Introduce RTIT_CTL_FABRIC_EN and sync the definitions to
latest spec.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/events/intel/pt.c       |  4 ++--
 arch/x86/events/intel/pt.h       | 37 -------------------------------------
 arch/x86/include/asm/msr-index.h | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 81fd41d..6d6dd4f 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -271,7 +271,7 @@ static int __init pt_pmu_hw_init(void)
 	return ret;
 }
 
-#define RTIT_CTL_CYC_PSB (RTIT_CTL_CYCLEACC	| \
+#define RTIT_CTL_CYC_PSB (RTIT_CTL_CYC_EN	| \
 			  RTIT_CTL_CYC_THRESH	| \
 			  RTIT_CTL_PSB_FREQ)
 
@@ -293,7 +293,7 @@ static int __init pt_pmu_hw_init(void)
 
 #define PT_CONFIG_MASK (RTIT_CTL_TRACEEN	| \
 			RTIT_CTL_TSC_EN		| \
-			RTIT_CTL_DISRETC	| \
+			RTIT_CTL_DIS_RETC	| \
 			RTIT_CTL_BRANCH_EN	| \
 			RTIT_CTL_CYC_PSB	| \
 			RTIT_CTL_MTC		| \
diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h
index 0eb41d0..0050ca1 100644
--- a/arch/x86/events/intel/pt.h
+++ b/arch/x86/events/intel/pt.h
@@ -20,43 +20,6 @@
 #define __INTEL_PT_H__
 
 /*
- * PT MSR bit definitions
- */
-#define RTIT_CTL_TRACEEN		BIT(0)
-#define RTIT_CTL_CYCLEACC		BIT(1)
-#define RTIT_CTL_OS			BIT(2)
-#define RTIT_CTL_USR			BIT(3)
-#define RTIT_CTL_PWR_EVT_EN		BIT(4)
-#define RTIT_CTL_FUP_ON_PTW		BIT(5)
-#define RTIT_CTL_CR3EN			BIT(7)
-#define RTIT_CTL_TOPA			BIT(8)
-#define RTIT_CTL_MTC_EN			BIT(9)
-#define RTIT_CTL_TSC_EN			BIT(10)
-#define RTIT_CTL_DISRETC		BIT(11)
-#define RTIT_CTL_PTW_EN			BIT(12)
-#define RTIT_CTL_BRANCH_EN		BIT(13)
-#define RTIT_CTL_MTC_RANGE_OFFSET	14
-#define RTIT_CTL_MTC_RANGE		(0x0full << RTIT_CTL_MTC_RANGE_OFFSET)
-#define RTIT_CTL_CYC_THRESH_OFFSET	19
-#define RTIT_CTL_CYC_THRESH		(0x0full << RTIT_CTL_CYC_THRESH_OFFSET)
-#define RTIT_CTL_PSB_FREQ_OFFSET	24
-#define RTIT_CTL_PSB_FREQ      		(0x0full << RTIT_CTL_PSB_FREQ_OFFSET)
-#define RTIT_CTL_ADDR0_OFFSET		32
-#define RTIT_CTL_ADDR0      		(0x0full << RTIT_CTL_ADDR0_OFFSET)
-#define RTIT_CTL_ADDR1_OFFSET		36
-#define RTIT_CTL_ADDR1      		(0x0full << RTIT_CTL_ADDR1_OFFSET)
-#define RTIT_CTL_ADDR2_OFFSET		40
-#define RTIT_CTL_ADDR2      		(0x0full << RTIT_CTL_ADDR2_OFFSET)
-#define RTIT_CTL_ADDR3_OFFSET		44
-#define RTIT_CTL_ADDR3      		(0x0full << RTIT_CTL_ADDR3_OFFSET)
-#define RTIT_STATUS_FILTEREN		BIT(0)
-#define RTIT_STATUS_CONTEXTEN		BIT(1)
-#define RTIT_STATUS_TRIGGEREN		BIT(2)
-#define RTIT_STATUS_BUFFOVF		BIT(3)
-#define RTIT_STATUS_ERROR		BIT(4)
-#define RTIT_STATUS_STOPPED		BIT(5)
-
-/*
  * Single-entry ToPA: when this close to region boundary, switch
  * buffers to avoid losing data.
  */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 53d5b1b..2211087 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -106,7 +106,43 @@
 #define MSR_PEBS_LD_LAT_THRESHOLD	0x000003f6
 
 #define MSR_IA32_RTIT_CTL		0x00000570
+#define RTIT_CTL_TRACEEN		BIT(0)
+#define RTIT_CTL_CYC_EN			BIT(1)
+#define RTIT_CTL_OS			BIT(2)
+#define RTIT_CTL_USR			BIT(3)
+#define RTIT_CTL_PWR_EVT_EN		BIT(4)
+#define RTIT_CTL_FUP_ON_PTW		BIT(5)
+#define RTIT_CTL_FABRIC_EN		BIT(6)
+#define RTIT_CTL_CR3_FILTER		BIT(7)
+#define RTIT_CTL_TOPA			BIT(8)
+#define RTIT_CTL_MTC_EN			BIT(9)
+#define RTIT_CTL_TSC_EN			BIT(10)
+#define RTIT_CTL_DIS_RETC		BIT(11)
+#define RTIT_CTL_PTW_EN			BIT(12)
+#define RTIT_CTL_BRANCH_EN		BIT(13)
+#define RTIT_CTL_MTC_RANGE_OFFSET	14
+#define RTIT_CTL_MTC_RANGE		(0x0full << RTIT_CTL_MTC_RANGE_OFFSET)
+#define RTIT_CTL_CYC_THRESH_OFFSET	19
+#define RTIT_CTL_CYC_THRESH		(0x0full << RTIT_CTL_CYC_THRESH_OFFSET)
+#define RTIT_CTL_PSB_FREQ_OFFSET	24
+#define RTIT_CTL_PSB_FREQ		(0x0full << RTIT_CTL_PSB_FREQ_OFFSET)
+#define RTIT_CTL_ADDR0_OFFSET		32
+#define RTIT_CTL_ADDR0			(0x0full << RTIT_CTL_ADDR0_OFFSET)
+#define RTIT_CTL_ADDR1_OFFSET		36
+#define RTIT_CTL_ADDR1			(0x0full << RTIT_CTL_ADDR1_OFFSET)
+#define RTIT_CTL_ADDR2_OFFSET		40
+#define RTIT_CTL_ADDR2			(0x0full << RTIT_CTL_ADDR2_OFFSET)
+#define RTIT_CTL_ADDR3_OFFSET		44
+#define RTIT_CTL_ADDR3			(0x0full << RTIT_CTL_ADDR3_OFFSET)
 #define MSR_IA32_RTIT_STATUS		0x00000571
+#define RTIT_STATUS_FILTEREN		BIT(0)
+#define RTIT_STATUS_CONTEXTEN		BIT(1)
+#define RTIT_STATUS_TRIGGEREN		BIT(2)
+#define RTIT_STATUS_BUFFOVF		BIT(3)
+#define RTIT_STATUS_ERROR		BIT(4)
+#define RTIT_STATUS_STOPPED		BIT(5)
+#define RTIT_STATUS_BYTECNT_OFFSET	32
+#define RTIT_STATUS_BYTECNT		(0x1ffffull << RTIT_STATUS_BYTECNT_OFFSET)
 #define MSR_IA32_RTIT_ADDR0_A		0x00000580
 #define MSR_IA32_RTIT_ADDR0_B		0x00000581
 #define MSR_IA32_RTIT_ADDR1_A		0x00000582
@@ -115,6 +151,7 @@
 #define MSR_IA32_RTIT_ADDR2_B		0x00000585
 #define MSR_IA32_RTIT_ADDR3_A		0x00000586
 #define MSR_IA32_RTIT_ADDR3_B		0x00000587
+#define MSR_IA32_RTIT_ADDR_COUNT	8
 #define MSR_IA32_RTIT_CR3_MATCH		0x00000572
 #define MSR_IA32_RTIT_OUTPUT_BASE	0x00000560
 #define MSR_IA32_RTIT_OUTPUT_MASK	0x00000561
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 02/11] perf/x86/intel/pt: Change pt_cap_get() to a public function
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 01/11] perf/x86/intel/pt: Move Intel-PT MSR bit definitions to a public header Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-04-30 13:59   ` Peter Zijlstra
  2018-03-20 11:21 ` [PATCH v6 03/11] perf/x86/intel/pt: Introduce a new function to get the capability of Intel PT Luwei Kang
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
	Chao Peng, Luwei Kang

From: Chao Peng <chao.p.peng@linux.intel.com>

Change pt_cap_get() to a public function so that KVM can access it.
Introduce new capablility PT_CAP_output_subsys to support
of output to Trace Transport subsystem.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/events/intel/pt.c      |  4 +++-
 arch/x86/events/intel/pt.h      | 21 ---------------------
 arch/x86/include/asm/intel_pt.h | 24 ++++++++++++++++++++++++
 3 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 6d6dd4f..d89dd8c 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -68,6 +68,7 @@
 	PT_CAP(topa_output,		0, CPUID_ECX, BIT(0)),
 	PT_CAP(topa_multiple_entries,	0, CPUID_ECX, BIT(1)),
 	PT_CAP(single_range_output,	0, CPUID_ECX, BIT(2)),
+	PT_CAP(output_subsys,		0, CPUID_ECX, BIT(3)),
 	PT_CAP(payloads_lip,		0, CPUID_ECX, BIT(31)),
 	PT_CAP(num_address_ranges,	1, CPUID_EAX, 0x3),
 	PT_CAP(mtc_periods,		1, CPUID_EAX, 0xffff0000),
@@ -75,7 +76,7 @@
 	PT_CAP(psb_periods,		1, CPUID_EBX, 0xffff0000),
 };
 
-static u32 pt_cap_get(enum pt_capabilities cap)
+u32 pt_cap_get(enum pt_capabilities cap)
 {
 	struct pt_cap_desc *cd = &pt_caps[cap];
 	u32 c = pt_pmu.caps[cd->leaf * PT_CPUID_REGS_NUM + cd->reg];
@@ -83,6 +84,7 @@ static u32 pt_cap_get(enum pt_capabilities cap)
 
 	return (c & cd->mask) >> shift;
 }
+EXPORT_SYMBOL_GPL(pt_cap_get);
 
 static ssize_t pt_cap_show(struct device *cdev,
 			   struct device_attribute *attr,
diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h
index 0050ca1..269e15a 100644
--- a/arch/x86/events/intel/pt.h
+++ b/arch/x86/events/intel/pt.h
@@ -45,30 +45,9 @@ struct topa_entry {
 	u64	rsvd4	: 16;
 };
 
-#define PT_CPUID_LEAVES		2
-#define PT_CPUID_REGS_NUM	4 /* number of regsters (eax, ebx, ecx, edx) */
-
 /* TSC to Core Crystal Clock Ratio */
 #define CPUID_TSC_LEAF		0x15
 
-enum pt_capabilities {
-	PT_CAP_max_subleaf = 0,
-	PT_CAP_cr3_filtering,
-	PT_CAP_psb_cyc,
-	PT_CAP_ip_filtering,
-	PT_CAP_mtc,
-	PT_CAP_ptwrite,
-	PT_CAP_power_event_trace,
-	PT_CAP_topa_output,
-	PT_CAP_topa_multiple_entries,
-	PT_CAP_single_range_output,
-	PT_CAP_payloads_lip,
-	PT_CAP_num_address_ranges,
-	PT_CAP_mtc_periods,
-	PT_CAP_cycle_thresholds,
-	PT_CAP_psb_periods,
-};
-
 struct pt_pmu {
 	struct pmu		pmu;
 	u32			caps[PT_CPUID_REGS_NUM * PT_CPUID_LEAVES];
diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index b523f51..2de4db0 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -2,10 +2,34 @@
 #ifndef _ASM_X86_INTEL_PT_H
 #define _ASM_X86_INTEL_PT_H
 
+#define PT_CPUID_LEAVES		2
+#define PT_CPUID_REGS_NUM	4 /* number of regsters (eax, ebx, ecx, edx) */
+
+enum pt_capabilities {
+	PT_CAP_max_subleaf = 0,
+	PT_CAP_cr3_filtering,
+	PT_CAP_psb_cyc,
+	PT_CAP_ip_filtering,
+	PT_CAP_mtc,
+	PT_CAP_ptwrite,
+	PT_CAP_power_event_trace,
+	PT_CAP_topa_output,
+	PT_CAP_topa_multiple_entries,
+	PT_CAP_single_range_output,
+	PT_CAP_output_subsys,
+	PT_CAP_payloads_lip,
+	PT_CAP_num_address_ranges,
+	PT_CAP_mtc_periods,
+	PT_CAP_cycle_thresholds,
+	PT_CAP_psb_periods,
+};
+
 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
 void cpu_emergency_stop_pt(void);
+extern u32 pt_cap_get(enum pt_capabilities cap);
 #else
 static inline void cpu_emergency_stop_pt(void) {}
+static inline u32 pt_cap_get(enum pt_capabilities cap) { return 0; }
 #endif
 
 #endif /* _ASM_X86_INTEL_PT_H */
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 03/11] perf/x86/intel/pt: Introduce a new function to get the capability of Intel PT
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 01/11] perf/x86/intel/pt: Move Intel-PT MSR bit definitions to a public header Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 02/11] perf/x86/intel/pt: Change pt_cap_get() to a public function Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-04-30 14:01   ` Peter Zijlstra
  2018-03-20 11:21 ` [PATCH v6 04/11] KVM: x86: Add Intel Processor Trace virtualization mode Luwei Kang
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro, Luwei Kang

Because of the guest CPUID information may different with
host(some bits may mask off in guest) so introduce a new
function to get the capability of Intel PT.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/events/intel/pt.c      | 10 ++++++++--
 arch/x86/include/asm/intel_pt.h |  2 ++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index d89dd8c..c66eb90 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -76,14 +76,20 @@
 	PT_CAP(psb_periods,		1, CPUID_EBX, 0xffff0000),
 };
 
-u32 pt_cap_get(enum pt_capabilities cap)
+u32 __pt_cap_get(u32 *caps, enum pt_capabilities cap)
 {
 	struct pt_cap_desc *cd = &pt_caps[cap];
-	u32 c = pt_pmu.caps[cd->leaf * PT_CPUID_REGS_NUM + cd->reg];
+	u32 c = caps[cd->leaf * PT_CPUID_REGS_NUM + cd->reg];
 	unsigned int shift = __ffs(cd->mask);
 
 	return (c & cd->mask) >> shift;
 }
+EXPORT_SYMBOL_GPL(__pt_cap_get);
+
+u32 pt_cap_get(enum pt_capabilities cap)
+{
+	return __pt_cap_get(pt_pmu.caps, cap);
+}
 EXPORT_SYMBOL_GPL(pt_cap_get);
 
 static ssize_t pt_cap_show(struct device *cdev,
diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index 2de4db0..3a4f524 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -27,9 +27,11 @@ enum pt_capabilities {
 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
 void cpu_emergency_stop_pt(void);
 extern u32 pt_cap_get(enum pt_capabilities cap);
+extern u32 __pt_cap_get(u32 *caps, enum pt_capabilities cap);
 #else
 static inline void cpu_emergency_stop_pt(void) {}
 static inline u32 pt_cap_get(enum pt_capabilities cap) { return 0; }
+static u32 __pt_cap_get(u32 *caps, enum pt_capabilities cap) { return 0; }
 #endif
 
 #endif /* _ASM_X86_INTEL_PT_H */
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 04/11] KVM: x86: Add Intel Processor Trace virtualization mode
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
                   ` (2 preceding siblings ...)
  2018-03-20 11:21 ` [PATCH v6 03/11] perf/x86/intel/pt: Introduce a new function to get the capability of Intel PT Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 05/11] KVM: x86: Add Intel Processor Trace cpuid emulation Luwei Kang
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
	Chao Peng, Luwei Kang

From: Chao Peng <chao.p.peng@linux.intel.com>

Intel PT virtualization can be work in one of 3 possible modes:
a. system-wide: trace both host/guest and output to host buffer;
b. host-only: only trace host and output to host buffer;
c. host-guest: trace host/guest simultaneous and output to their
   respective buffer.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/include/asm/intel_pt.h  |  6 ++++
 arch/x86/include/asm/msr-index.h |  1 +
 arch/x86/include/asm/vmx.h       |  8 +++++
 arch/x86/kvm/vmx.c               | 68 +++++++++++++++++++++++++++++++++++++---
 4 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index 3a4f524..43ad260 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -5,6 +5,12 @@
 #define PT_CPUID_LEAVES		2
 #define PT_CPUID_REGS_NUM	4 /* number of regsters (eax, ebx, ecx, edx) */
 
+enum pt_mode {
+	PT_MODE_SYSTEM = 0,
+	PT_MODE_HOST,
+	PT_MODE_HOST_GUEST,
+};
+
 enum pt_capabilities {
 	PT_CAP_max_subleaf = 0,
 	PT_CAP_cr3_filtering,
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 2211087..2e890a2 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -790,6 +790,7 @@
 #define VMX_BASIC_INOUT		0x0040000000000000LLU
 
 /* MSR_IA32_VMX_MISC bits */
+#define MSR_IA32_VMX_MISC_INTEL_PT                 (1ULL << 14)
 #define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
 #define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE   0x1F
 /* AMD-V MSRs */
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 8b67807..9e828d4 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -76,7 +76,9 @@
 #define SECONDARY_EXEC_SHADOW_VMCS              0x00004000
 #define SECONDARY_EXEC_RDSEED_EXITING		0x00010000
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
+#define SECONDARY_EXEC_PT_CONCEAL_VMX		0x00080000
 #define SECONDARY_EXEC_XSAVES			0x00100000
+#define SECONDARY_EXEC_PT_USE_GPA		0x01000000
 #define SECONDARY_EXEC_TSC_SCALING              0x02000000
 
 #define PIN_BASED_EXT_INTR_MASK                 0x00000001
@@ -97,6 +99,8 @@
 #define VM_EXIT_LOAD_IA32_EFER                  0x00200000
 #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER       0x00400000
 #define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
+#define VM_EXIT_PT_CONCEAL_PIP			0x01000000
+#define VM_EXIT_CLEAR_IA32_RTIT_CTL		0x02000000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
 
@@ -108,6 +112,8 @@
 #define VM_ENTRY_LOAD_IA32_PAT			0x00004000
 #define VM_ENTRY_LOAD_IA32_EFER                 0x00008000
 #define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
+#define VM_ENTRY_PT_CONCEAL_PIP			0x00020000
+#define VM_ENTRY_LOAD_IA32_RTIT_CTL		0x00040000
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
 
@@ -234,6 +240,8 @@ enum vmcs_field {
 	GUEST_PDPTR3_HIGH               = 0x00002811,
 	GUEST_BNDCFGS                   = 0x00002812,
 	GUEST_BNDCFGS_HIGH              = 0x00002813,
+	GUEST_IA32_RTIT_CTL		= 0x00002814,
+	GUEST_IA32_RTIT_CTL_HIGH	= 0x00002815,
 	HOST_IA32_PAT			= 0x00002c00,
 	HOST_IA32_PAT_HIGH		= 0x00002c01,
 	HOST_IA32_EFER			= 0x00002c02,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1b8d122..1093ae0 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -53,6 +53,7 @@
 #include <asm/mmu_context.h>
 #include <asm/microcode.h>
 #include <asm/nospec-branch.h>
+#include <asm/intel_pt.h>
 
 #include "trace.h"
 #include "pmu.h"
@@ -194,6 +195,10 @@
 static int ple_window_max        = KVM_VMX_DEFAULT_PLE_WINDOW_MAX;
 module_param(ple_window_max, int, S_IRUGO);
 
+/* Default is SYSTEM mode. */
+static int __read_mostly pt_mode = PT_MODE_SYSTEM;
+module_param(pt_mode, int, S_IRUGO);
+
 extern const ulong vmx_return;
 
 #define NR_AUTOLOAD_MSRS 8
@@ -1319,6 +1324,19 @@ static inline bool cpu_has_vmx_vmfunc(void)
 		SECONDARY_EXEC_ENABLE_VMFUNC;
 }
 
+static inline bool cpu_has_vmx_intel_pt(void)
+{
+	u64 vmx_msr;
+
+	rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
+	return vmx_msr & MSR_IA32_VMX_MISC_INTEL_PT;
+}
+
+static inline bool cpu_has_vmx_pt_use_gpa(void)
+{
+	return vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_PT_USE_GPA;
+}
+
 static inline bool report_flexpriority(void)
 {
 	return flexpriority_enabled;
@@ -3805,6 +3823,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 			SECONDARY_EXEC_RDRAND_EXITING |
 			SECONDARY_EXEC_ENABLE_PML |
 			SECONDARY_EXEC_TSC_SCALING |
+			SECONDARY_EXEC_PT_USE_GPA |
+			SECONDARY_EXEC_PT_CONCEAL_VMX |
 			SECONDARY_EXEC_ENABLE_VMFUNC;
 		if (adjust_vmx_controls(min2, opt2,
 					MSR_IA32_VMX_PROCBASED_CTLS2,
@@ -3849,7 +3869,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
 #endif
 	opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
-		VM_EXIT_CLEAR_BNDCFGS;
+		VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_PT_CONCEAL_PIP |
+		VM_EXIT_CLEAR_IA32_RTIT_CTL;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
@@ -3868,11 +3889,20 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
 
 	min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
-	opt = VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS;
+	opt = VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS |
+		VM_ENTRY_PT_CONCEAL_PIP | VM_ENTRY_LOAD_IA32_RTIT_CTL;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
 				&_vmentry_control) < 0)
 		return -EIO;
 
+	if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_PT_USE_GPA) ||
+		!(_vmexit_control & VM_EXIT_CLEAR_IA32_RTIT_CTL) ||
+		!(_vmentry_control & VM_ENTRY_LOAD_IA32_RTIT_CTL)) {
+		_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_PT_USE_GPA;
+		_vmexit_control &= ~VM_EXIT_CLEAR_IA32_RTIT_CTL;
+		_vmentry_control &= ~VM_ENTRY_LOAD_IA32_RTIT_CTL;
+	}
+
 	rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);
 
 	/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -5573,6 +5603,28 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 	return exec_control;
 }
 
+static u32 vmx_vmexit_control(struct vcpu_vmx *vmx)
+{
+	u32 vmexit_control = vmcs_config.vmexit_ctrl;
+
+	if (pt_mode == PT_MODE_SYSTEM)
+		vmexit_control &= ~(VM_EXIT_CLEAR_IA32_RTIT_CTL |
+				    VM_EXIT_PT_CONCEAL_PIP);
+
+	return vmexit_control;
+}
+
+static u32 vmx_vmentry_control(struct vcpu_vmx *vmx)
+{
+	u32 vmentry_control = vmcs_config.vmentry_ctrl;
+
+	if (pt_mode == PT_MODE_SYSTEM)
+		vmentry_control &= ~(VM_ENTRY_PT_CONCEAL_PIP |
+				     VM_ENTRY_LOAD_IA32_RTIT_CTL);
+
+	return vmentry_control;
+}
+
 static bool vmx_rdrand_supported(void)
 {
 	return vmcs_config.cpu_based_2nd_exec_ctrl &
@@ -5709,6 +5761,10 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)
 		}
 	}
 
+	if (pt_mode == PT_MODE_SYSTEM)
+		exec_control &= ~(SECONDARY_EXEC_PT_USE_GPA |
+				  SECONDARY_EXEC_PT_CONCEAL_VMX);
+
 	vmx->secondary_exec_control = exec_control;
 }
 
@@ -5819,10 +5875,10 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx)
 	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
 		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, vmx->arch_capabilities);
 
-	vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);
+	vm_exit_controls_init(vmx, vmx_vmexit_control(vmx));
 
 	/* 22.2.1, 20.8.1 */
-	vm_entry_controls_init(vmx, vmcs_config.vmentry_ctrl);
+	vm_entry_controls_init(vmx, vmx_vmentry_control(vmx));
 
 	vmx->vcpu.arch.cr0_guest_owned_bits = X86_CR0_TS;
 	vmcs_writel(CR0_GUEST_HOST_MASK, ~X86_CR0_TS);
@@ -7190,6 +7246,10 @@ static __init int hardware_setup(void)
 
 	kvm_mce_cap_supported |= MCG_LMCE_P;
 
+	if (!enable_ept || !pt_cap_get(PT_CAP_topa_output) ||
+		!cpu_has_vmx_intel_pt() || !cpu_has_vmx_pt_use_gpa())
+		pt_mode = PT_MODE_SYSTEM;
+
 	return alloc_kvm_area();
 
 out:
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 05/11] KVM: x86: Add Intel Processor Trace cpuid emulation
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
                   ` (3 preceding siblings ...)
  2018-03-20 11:21 ` [PATCH v6 04/11] KVM: x86: Add Intel Processor Trace virtualization mode Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 06/11] KVM: x86: Add Intel processor trace context for each vcpu Luwei Kang
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
	Chao Peng, Luwei Kang

From: Chao Peng <chao.p.peng@linux.intel.com>

Expose Intel Processor Trace to guest only when PT work in
HOST_GUEST mode.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            | 22 ++++++++++++++++++++--
 arch/x86/kvm/svm.c              |  6 ++++++
 arch/x86/kvm/vmx.c              |  6 ++++++
 4 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8fb6028..966b6e4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1040,6 +1040,7 @@ struct kvm_x86_ops {
 	bool (*mpx_supported)(void);
 	bool (*xsaves_supported)(void);
 	bool (*umip_emulated)(void);
+	bool (*pt_supported)(void);
 
 	int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr);
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 82055b9..e04bf67 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -336,6 +336,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	unsigned f_mpx = kvm_mpx_supported() ? F(MPX) : 0;
 	unsigned f_xsaves = kvm_x86_ops->xsaves_supported() ? F(XSAVES) : 0;
 	unsigned f_umip = kvm_x86_ops->umip_emulated() ? F(UMIP) : 0;
+	unsigned f_intel_pt = kvm_x86_ops->pt_supported() ? F(INTEL_PT) : 0;
 
 	/* cpuid 1.edx */
 	const u32 kvm_cpuid_1_edx_x86_features =
@@ -393,7 +394,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | f_mpx | F(RDSEED) |
 		F(ADX) | F(SMAP) | F(AVX512IFMA) | F(AVX512F) | F(AVX512PF) |
 		F(AVX512ER) | F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB) | F(AVX512DQ) |
-		F(SHA_NI) | F(AVX512BW) | F(AVX512VL);
+		F(SHA_NI) | F(AVX512BW) | F(AVX512VL) | f_intel_pt;
 
 	/* cpuid 0xD.1.eax */
 	const u32 kvm_cpuid_D_1_eax_x86_features =
@@ -423,7 +424,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 
 	switch (function) {
 	case 0:
-		entry->eax = min(entry->eax, (u32)0xd);
+		entry->eax = min(entry->eax, (u32)(f_intel_pt ? 0x14 : 0xd));
 		break;
 	case 1:
 		entry->edx &= kvm_cpuid_1_edx_x86_features;
@@ -595,6 +596,23 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		}
 		break;
 	}
+	/* Intel PT */
+	case 0x14: {
+		int t, times = entry->eax;
+
+		if (!f_intel_pt)
+			break;
+
+		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+		for (t = 1; t <= times; ++t) {
+			if (*nent >= maxnent)
+				goto out;
+			do_cpuid_1_ent(&entry[t], function, t);
+			entry[t].flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+			++*nent;
+		}
+		break;
+	}
 	case KVM_CPUID_SIGNATURE: {
 		static const char signature[12] = "KVMKVMKVM\0\0";
 		const u32 *sigptr = (const u32 *)signature;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9729b7e..c838894 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5741,6 +5741,11 @@ static bool svm_umip_emulated(void)
 	return false;
 }
 
+static bool svm_pt_supported(void)
+{
+	return false;
+}
+
 static bool svm_has_wbinvd_exit(void)
 {
 	return true;
@@ -6961,6 +6966,7 @@ static int svm_unregister_enc_region(struct kvm *kvm,
 	.mpx_supported = svm_mpx_supported,
 	.xsaves_supported = svm_xsaves_supported,
 	.umip_emulated = svm_umip_emulated,
+	.pt_supported = svm_pt_supported,
 
 	.set_supported_cpuid = svm_set_supported_cpuid,
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1093ae0..84c4c6d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9401,6 +9401,11 @@ static bool vmx_umip_emulated(void)
 		SECONDARY_EXEC_DESC;
 }
 
+static bool vmx_pt_supported(void)
+{
+	return (pt_mode == PT_MODE_HOST_GUEST);
+}
+
 static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
 {
 	u32 exit_intr_info;
@@ -12581,6 +12586,7 @@ static int enable_smi_window(struct kvm_vcpu *vcpu)
 	.mpx_supported = vmx_mpx_supported,
 	.xsaves_supported = vmx_xsaves_supported,
 	.umip_emulated = vmx_umip_emulated,
+	.pt_supported = vmx_pt_supported,
 
 	.check_nested_events = vmx_check_nested_events,
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 06/11] KVM: x86: Add Intel processor trace context for each vcpu
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
                   ` (4 preceding siblings ...)
  2018-03-20 11:21 ` [PATCH v6 05/11] KVM: x86: Add Intel Processor Trace cpuid emulation Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 07/11] KVM: x86: Implement Intel Processor Trace context switch Luwei Kang
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
	Chao Peng, Luwei Kang

From: Chao Peng <chao.p.peng@linux.intel.com>

Add a data structure to save Intel Processor Trace context.
It mainly include the value of Intel PT host/guest MSRs
and guest CPUID information.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/kvm/vmx.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 84c4c6d..940df0e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -596,6 +596,23 @@ static inline int pi_test_sn(struct pi_desc *pi_desc)
 			(unsigned long *)&pi_desc->control);
 }
 
+struct pt_ctx {
+	u64 ctl;
+	u64 status;
+	u64 output_base;
+	u64 output_mask;
+	u64 cr3_match;
+	u64 addrs[MSR_IA32_RTIT_ADDR_COUNT];
+};
+
+struct pt_desc {
+	u64 ctl_bitmask;
+	u32 range_cnt;
+	u32 caps[PT_CPUID_REGS_NUM * PT_CPUID_LEAVES];
+	struct pt_ctx host;
+	struct pt_ctx guest;
+};
+
 struct vcpu_vmx {
 	struct kvm_vcpu       vcpu;
 	unsigned long         host_rsp;
@@ -692,6 +709,8 @@ struct vcpu_vmx {
 	 */
 	u64 msr_ia32_feature_control;
 	u64 msr_ia32_feature_control_valid_bits;
+
+	struct pt_desc pt_desc;
 };
 
 enum segment_cache_field {
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 07/11] KVM: x86: Implement Intel Processor Trace context switch
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
                   ` (5 preceding siblings ...)
  2018-03-20 11:21 ` [PATCH v6 06/11] KVM: x86: Add Intel processor trace context for each vcpu Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 08/11] KVM: x86: Introduce a function to initialize the PT configuration Luwei Kang
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
	Chao Peng, Luwei Kang

From: Chao Peng <chao.p.peng@linux.intel.com>

Load/Store Intel processor trace register in context switch.
MSR IA32_RTIT_CTL is loaded/stored automatically from VMCS.
In HOST mode, we just need to restore the status of IA32_RTIT_CTL.
In HOST_GUEST mode, we need load/resore PT MSRs only when PT is
enabled in guest.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/kvm/vmx.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 940df0e..34bcb30 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2187,6 +2187,55 @@ static unsigned long segment_base(u16 selector)
 }
 #endif
 
+static inline void pt_load_msr(struct pt_ctx *ctx, u32 range_cnt)
+{
+	u32 i;
+
+	wrmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
+	wrmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
+	wrmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
+	wrmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
+	for (i = 0; i < range_cnt; i++)
+		wrmsrl(MSR_IA32_RTIT_ADDR0_A + i, ctx->addrs[i]);
+}
+
+static inline void pt_save_msr(struct pt_ctx *ctx, u32 range_cnt)
+{
+	u32 i;
+
+	rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
+	rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
+	rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
+	rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
+	for (i = 0; i < range_cnt; i++)
+		rdmsrl(MSR_IA32_RTIT_ADDR0_A + i, ctx->addrs[i]);
+}
+
+static void pt_guest_enter(struct vcpu_vmx *vmx)
+{
+	if (pt_mode == PT_MODE_HOST || pt_mode == PT_MODE_HOST_GUEST)
+		rdmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
+
+	if (pt_mode == PT_MODE_HOST_GUEST &&
+		vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
+		wrmsrl(MSR_IA32_RTIT_CTL, 0);
+		pt_save_msr(&vmx->pt_desc.host, vmx->pt_desc.range_cnt);
+		pt_load_msr(&vmx->pt_desc.guest, vmx->pt_desc.range_cnt);
+	}
+}
+
+static void pt_guest_exit(struct vcpu_vmx *vmx)
+{
+	if (pt_mode == PT_MODE_HOST_GUEST &&
+		vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
+		pt_save_msr(&vmx->pt_desc.guest, vmx->pt_desc.range_cnt);
+		pt_load_msr(&vmx->pt_desc.host, vmx->pt_desc.range_cnt);
+	}
+
+	if (pt_mode == PT_MODE_HOST || pt_mode == PT_MODE_HOST_GUEST)
+		wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
+}
+
 static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -5912,6 +5961,13 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx)
 		vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg));
 		vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
 	}
+
+	if (pt_mode == PT_MODE_HOST_GUEST) {
+		memset(&vmx->pt_desc, 0, sizeof(vmx->pt_desc));
+		/* Bit[6~0] are forced to 1, writes are ignored. */
+		vmx->pt_desc.guest.output_mask = 0x7F;
+		vmcs_write64(GUEST_IA32_RTIT_CTL, 0);
+	}
 }
 
 static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
@@ -9632,6 +9688,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	    vcpu->arch.pkru != vmx->host_pkru)
 		__write_pkru(vcpu->arch.pkru);
 
+	pt_guest_enter(vmx);
+
 	atomic_switch_perf_msrs(vmx);
 
 	vmx_arm_hv_timer(vcpu);
@@ -9811,6 +9869,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 				  | (1 << VCPU_EXREG_CR3));
 	vcpu->arch.regs_dirty = 0;
 
+	pt_guest_exit(vmx);
+
 	/*
 	 * eager fpu is enabled if PKEY is supported and CR4 is switched
 	 * back on host, so it is safe to read guest PKRU from current
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 08/11] KVM: x86: Introduce a function to initialize the PT configuration
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
                   ` (6 preceding siblings ...)
  2018-03-20 11:21 ` [PATCH v6 07/11] KVM: x86: Implement Intel Processor Trace context switch Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 09/11] KVM: x86: Implement Intel Processor Trace MSRs read/write Luwei Kang
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro, Luwei Kang

Initialize the Intel PT configuration when cpuid update.
Include cpuid inforamtion, rtit_ctl bit mask and the number of
address ranges.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/kvm/vmx.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 34bcb30..5166d72 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10191,6 +10191,71 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
 #undef cr4_fixed1_update
 }
 
+static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct kvm_cpuid_entry2 *best = NULL;
+	int i;
+
+	for (i = 0; i < PT_CPUID_LEAVES; i++) {
+		best = kvm_find_cpuid_entry(vcpu, 0x14, i);
+		if (!best)
+			return;
+		vmx->pt_desc.caps[CPUID_EAX + i*PT_CPUID_REGS_NUM] = best->eax;
+		vmx->pt_desc.caps[CPUID_EBX + i*PT_CPUID_REGS_NUM] = best->ebx;
+		vmx->pt_desc.caps[CPUID_ECX + i*PT_CPUID_REGS_NUM] = best->ecx;
+		vmx->pt_desc.caps[CPUID_EDX + i*PT_CPUID_REGS_NUM] = best->edx;
+	}
+
+	/* Get the number of configurable Address Ranges for filtering */
+	vmx->pt_desc.range_cnt = __pt_cap_get(vmx->pt_desc.caps,
+						PT_CAP_num_address_ranges);
+
+	/* Clear the no dependency bits */
+	vmx->pt_desc.ctl_bitmask = ~(RTIT_CTL_TRACEEN | RTIT_CTL_OS |
+			RTIT_CTL_USR | RTIT_CTL_TSC_EN | RTIT_CTL_DIS_RETC);
+
+	/* If CPUID.(EAX=14H,ECX=0):EBX[0]=1 CR3Filter can be set */
+	if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_cr3_filtering))
+		vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_CR3_FILTER;
+
+	/*
+	 * If CPUID.(EAX=14H,ECX=0):EBX[1]=1 CYCEn, CycThresh and
+	 * PSBFreq can be set
+	 */
+	if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_psb_cyc))
+		vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_CYC_EN |
+			RTIT_CTL_CYC_THRESH | RTIT_CTL_PSB_FREQ);
+
+	/*
+	 * If CPUID.(EAX=14H,ECX=0):EBX[3]=1 MTCEn BranchEn and
+	 * MTCFreq can be set
+	 */
+	if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_mtc))
+		vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_MTC_EN |
+			RTIT_CTL_BRANCH_EN | RTIT_CTL_MTC_RANGE);
+
+	/* If CPUID.(EAX=14H,ECX=0):EBX[4]=1 FUPonPTW and PTWEn can be set */
+	if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_ptwrite))
+		vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_FUP_ON_PTW |
+						RTIT_CTL_PTW_EN);
+
+	/* If CPUID.(EAX=14H,ECX=0):EBX[5]=1 PwrEvEn can be set */
+	if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_power_event_trace))
+		vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_PWR_EVT_EN;
+
+	/* If CPUID.(EAX=14H,ECX=0):ECX[0]=1 ToPA can be set */
+	if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_topa_output))
+		vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_TOPA;
+	/* If CPUID.(EAX=14H,ECX=0):ECX[3]=1 FabircEn can be set */
+	if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_output_subsys))
+		vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_FABRIC_EN;
+
+	/* unmask address range configure area */
+	for (i = 0; i < vmx->pt_desc.range_cnt; i++)
+		vmx->pt_desc.ctl_bitmask &= ~(0xf << (32 + i*4));
+}
+
 static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -10209,6 +10274,10 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 
 	if (nested_vmx_allowed(vcpu))
 		nested_vmx_cr_fixed1_bits_update(vcpu);
+
+	if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
+		guest_cpuid_has(vcpu, X86_FEATURE_INTEL_PT))
+		update_intel_pt_cfg(vcpu);
 }
 
 static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 09/11] KVM: x86: Implement Intel Processor Trace MSRs read/write
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
                   ` (7 preceding siblings ...)
  2018-03-20 11:21 ` [PATCH v6 08/11] KVM: x86: Introduce a function to initialize the PT configuration Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 10/11] KVM: x86: Set intercept for Intel PT MSRs Luwei Kang
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
	Chao Peng, Luwei Kang

From: Chao Peng <chao.p.peng@linux.intel.com>

Implement Intel Processor Trace MSRs read/write.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/include/asm/intel_pt.h |   8 ++
 arch/x86/kvm/vmx.c              | 163 ++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              |  33 +++++++-
 3 files changed, 203 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index 43ad260..dc0f3f0 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -5,6 +5,14 @@
 #define PT_CPUID_LEAVES		2
 #define PT_CPUID_REGS_NUM	4 /* number of regsters (eax, ebx, ecx, edx) */
 
+#define MSR_IA32_RTIT_STATUS_MASK (~(RTIT_STATUS_FILTEREN | \
+		RTIT_STATUS_CONTEXTEN | RTIT_STATUS_TRIGGEREN | \
+		RTIT_STATUS_ERROR | RTIT_STATUS_STOPPED | \
+		RTIT_STATUS_BYTECNT))
+
+#define MSR_IA32_RTIT_OUTPUT_BASE_MASK \
+	(~((1UL << cpuid_query_maxphyaddr(vcpu)) - 1) | 0x7f)
+
 enum pt_mode {
 	PT_MODE_SYSTEM = 0,
 	PT_MODE_HOST,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5166d72..0334100 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2573,6 +2573,77 @@ static void vmx_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
 		vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, interruptibility);
 }
 
+static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned long value;
+
+	/*
+	 * Any MSR write that attempts to change bits marked reserved will
+	 * case a #GP fault.
+	 */
+	if (data & vmx->pt_desc.ctl_bitmask)
+		return 1;
+
+	/*
+	 * Any attempt to modify IA32_RTIT_CTL while TraceEn is set will
+	 * result in a #GP unless the same write also clears TraceEn.
+	 */
+	if ((vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) &&
+		((vmx->pt_desc.guest.ctl ^ data) & ~RTIT_CTL_TRACEEN))
+		return 1;
+
+	/*
+	 * WRMSR to IA32_RTIT_CTL that sets TraceEn but clears this bit
+	 * and FabricEn would cause #GP, if
+	 * CPUID.(EAX=14H, ECX=0):ECX.SNGLRGNOUT[bit 2] = 0
+	 */
+	if ((data & RTIT_CTL_TRACEEN) && !(data & RTIT_CTL_TOPA) &&
+		!(data & RTIT_CTL_FABRIC_EN) &&
+		!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_single_range_output))
+		return 1;
+
+	/*
+	 * MTCFreq, CycThresh and PSBFreq encodings check, any MSR write that
+	 * utilize encodings marked reserved will casue a #GP fault.
+	 */
+	value = __pt_cap_get(vmx->pt_desc.caps, PT_CAP_mtc_periods);
+	if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_mtc) &&
+			!test_bit((data & RTIT_CTL_MTC_RANGE) >>
+			RTIT_CTL_MTC_RANGE_OFFSET, &value))
+		return 1;
+	value = __pt_cap_get(vmx->pt_desc.caps, PT_CAP_cycle_thresholds);
+	if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_psb_cyc) &&
+			!test_bit((data & RTIT_CTL_CYC_THRESH) >>
+			RTIT_CTL_CYC_THRESH_OFFSET, &value))
+		return 1;
+	value = __pt_cap_get(vmx->pt_desc.caps, PT_CAP_psb_periods);
+	if (__pt_cap_get(vmx->pt_desc.caps, PT_CAP_psb_cyc) &&
+			!test_bit((data & RTIT_CTL_PSB_FREQ) >>
+			RTIT_CTL_PSB_FREQ_OFFSET, &value))
+		return 1;
+
+	/*
+	 * If ADDRx_CFG is reserved or the encodings is >2 will
+	 * cause a #GP fault.
+	 */
+	value = (data & RTIT_CTL_ADDR0) >> RTIT_CTL_ADDR0_OFFSET;
+	if ((value && (vmx->pt_desc.range_cnt < 1)) || (value > 2))
+		return 1;
+	value = (data & RTIT_CTL_ADDR1) >> RTIT_CTL_ADDR1_OFFSET;
+	if ((value && (vmx->pt_desc.range_cnt < 2)) || (value > 2))
+		return 1;
+	value = (data & RTIT_CTL_ADDR2) >> RTIT_CTL_ADDR2_OFFSET;
+	if ((value && (vmx->pt_desc.range_cnt < 3)) || (value > 2))
+		return 1;
+	value = (data & RTIT_CTL_ADDR3) >> RTIT_CTL_ADDR3_OFFSET;
+	if ((value && (vmx->pt_desc.range_cnt < 4)) || (value > 2))
+		return 1;
+
+	return 0;
+}
+
+
 static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
 	unsigned long rip;
@@ -3459,6 +3530,48 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		msr_info->data = vcpu->arch.ia32_xss;
 		break;
+	case MSR_IA32_RTIT_CTL:
+		if (pt_mode != PT_MODE_HOST_GUEST)
+			return 1;
+		msr_info->data = vmx->pt_desc.guest.ctl;
+		break;
+	case MSR_IA32_RTIT_STATUS:
+		if (pt_mode != PT_MODE_HOST_GUEST)
+			return 1;
+		msr_info->data = vmx->pt_desc.guest.status;
+		break;
+	case MSR_IA32_RTIT_CR3_MATCH:
+		if ((pt_mode != PT_MODE_HOST_GUEST) ||
+			!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_cr3_filtering))
+			return 1;
+		msr_info->data = vmx->pt_desc.guest.cr3_match;
+		break;
+	case MSR_IA32_RTIT_OUTPUT_BASE:
+		if ((pt_mode != PT_MODE_HOST_GUEST) ||
+			(!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_topa_output) &&
+			 !__pt_cap_get(vmx->pt_desc.caps,
+					PT_CAP_single_range_output)))
+			return 1;
+		msr_info->data = vmx->pt_desc.guest.output_base;
+		break;
+	case MSR_IA32_RTIT_OUTPUT_MASK:
+		if ((pt_mode != PT_MODE_HOST_GUEST) ||
+			(!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_topa_output) &&
+			 !__pt_cap_get(vmx->pt_desc.caps,
+					PT_CAP_single_range_output)))
+			return 1;
+		msr_info->data = vmx->pt_desc.guest.output_mask;
+		break;
+	case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B:
+		if ((pt_mode != PT_MODE_HOST_GUEST) ||
+			(msr_info->index - MSR_IA32_RTIT_ADDR0_A >=
+				2 * __pt_cap_get(vmx->pt_desc.caps,
+					PT_CAP_num_address_ranges)))
+			return 1;
+		msr_info->data =
+			vmx->pt_desc.guest.addrs[msr_info->index -
+						MSR_IA32_RTIT_ADDR0_A];
+		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -3647,6 +3760,56 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			clear_atomic_switch_msr(vmx, MSR_IA32_XSS);
 		break;
+	case MSR_IA32_RTIT_CTL:
+		if ((pt_mode != PT_MODE_HOST_GUEST) ||
+			vmx_rtit_ctl_check(vcpu, data))
+			return 1;
+		vmcs_write64(GUEST_IA32_RTIT_CTL, data);
+		vmx->pt_desc.guest.ctl = data;
+		break;
+	case MSR_IA32_RTIT_STATUS:
+		if ((pt_mode != PT_MODE_HOST_GUEST) ||
+			(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
+			(data & MSR_IA32_RTIT_STATUS_MASK))
+			return 1;
+		vmx->pt_desc.guest.status = data;
+		break;
+	case MSR_IA32_RTIT_CR3_MATCH:
+		if ((pt_mode != PT_MODE_HOST_GUEST) ||
+			(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
+			!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_cr3_filtering))
+			return 1;
+		vmx->pt_desc.guest.cr3_match = data;
+		break;
+	case MSR_IA32_RTIT_OUTPUT_BASE:
+		if ((pt_mode != PT_MODE_HOST_GUEST) ||
+			(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
+			(!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_topa_output) &&
+			 !__pt_cap_get(vmx->pt_desc.caps,
+					PT_CAP_single_range_output)) ||
+			(data & MSR_IA32_RTIT_OUTPUT_BASE_MASK))
+			return 1;
+		vmx->pt_desc.guest.output_base = data;
+		break;
+	case MSR_IA32_RTIT_OUTPUT_MASK:
+		if ((pt_mode != PT_MODE_HOST_GUEST) ||
+			(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
+			(!__pt_cap_get(vmx->pt_desc.caps, PT_CAP_topa_output) &&
+			 !__pt_cap_get(vmx->pt_desc.caps,
+					PT_CAP_single_range_output)))
+			return 1;
+		vmx->pt_desc.guest.output_mask = data;
+		break;
+	case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B:
+		if ((pt_mode != PT_MODE_HOST_GUEST) ||
+			(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
+			(msr_info->index - MSR_IA32_RTIT_ADDR0_A >=
+				2 * __pt_cap_get(vmx->pt_desc.caps,
+					PT_CAP_num_address_ranges)))
+			return 1;
+		vmx->pt_desc.guest.addrs[msr_info->index -
+				MSR_IA32_RTIT_ADDR0_A] = data;
+		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 26a8cd9..a8845d3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -69,6 +69,7 @@
 #include <asm/irq_remapping.h>
 #include <asm/mshyperv.h>
 #include <asm/hypervisor.h>
+#include <asm/intel_pt.h>
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -1020,7 +1021,13 @@ bool kvm_rdpmc(struct kvm_vcpu *vcpu)
 #endif
 	MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
 	MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
-	MSR_IA32_SPEC_CTRL, MSR_IA32_ARCH_CAPABILITIES
+	MSR_IA32_SPEC_CTRL, MSR_IA32_ARCH_CAPABILITIES,
+	MSR_IA32_RTIT_CTL, MSR_IA32_RTIT_STATUS, MSR_IA32_RTIT_CR3_MATCH,
+	MSR_IA32_RTIT_OUTPUT_BASE, MSR_IA32_RTIT_OUTPUT_MASK,
+	MSR_IA32_RTIT_ADDR0_A, MSR_IA32_RTIT_ADDR0_B,
+	MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
+	MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
+	MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
 };
 
 static unsigned num_msrs_to_save;
@@ -4581,6 +4588,30 @@ static void kvm_init_msr_list(void)
 			if (!kvm_x86_ops->rdtscp_supported())
 				continue;
 			break;
+		case MSR_IA32_RTIT_CTL:
+		case MSR_IA32_RTIT_STATUS:
+			if (!kvm_x86_ops->pt_supported())
+				continue;
+			break;
+		case MSR_IA32_RTIT_CR3_MATCH:
+			if (!kvm_x86_ops->pt_supported() ||
+					!pt_cap_get(PT_CAP_cr3_filtering))
+				continue;
+			break;
+		case MSR_IA32_RTIT_OUTPUT_BASE:
+		case MSR_IA32_RTIT_OUTPUT_MASK:
+			if (!kvm_x86_ops->pt_supported() ||
+				(!pt_cap_get(PT_CAP_topa_output) &&
+				 !pt_cap_get(PT_CAP_single_range_output)))
+				continue;
+			break;
+		case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B: {
+			if (!kvm_x86_ops->pt_supported() ||
+				msrs_to_save[i] - MSR_IA32_RTIT_ADDR0_A >=
+				2 * pt_cap_get(PT_CAP_num_address_ranges))
+				continue;
+			break;
+		}
 		default:
 			break;
 		}
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 10/11] KVM: x86: Set intercept for Intel PT MSRs
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
                   ` (8 preceding siblings ...)
  2018-03-20 11:21 ` [PATCH v6 09/11] KVM: x86: Implement Intel Processor Trace MSRs read/write Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-03-20 11:21 ` [PATCH v6 11/11] KVM: x86: Disable Intel Processor Trace when VMXON in L1 guest Luwei Kang
  2018-04-30 13:17 ` [PATCH v6 00/11] Intel Processor Trace virtualization enabling Paolo Bonzini
  11 siblings, 0 replies; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro,
	Chao Peng, Luwei Kang

From: Chao Peng <chao.p.peng@linux.intel.com>

Disable interrcept Intel PT MSRs only when Intel PT is
enabled in guest. But MSR_IA32_RTIT_CTL will alway be
intercept.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/kvm/vmx.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 0334100..27185da 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -941,6 +941,7 @@ static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
 static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
 static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
 							  u32 msr, int type);
+static void pt_set_intercept_for_msr(struct vcpu_vmx *vmx, bool flag);
 
 static DEFINE_PER_CPU(struct vmcs *, vmxarea);
 static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -3765,6 +3766,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			vmx_rtit_ctl_check(vcpu, data))
 			return 1;
 		vmcs_write64(GUEST_IA32_RTIT_CTL, data);
+		pt_set_intercept_for_msr(vmx, !(data & RTIT_CTL_TRACEEN));
 		vmx->pt_desc.guest.ctl = data;
 		break;
 	case MSR_IA32_RTIT_STATUS:
@@ -5554,6 +5556,24 @@ static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu)
 	vmx->msr_bitmap_mode = mode;
 }
 
+static void pt_set_intercept_for_msr(struct vcpu_vmx *vmx, bool flag)
+{
+	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
+	u32 i;
+
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_STATUS,
+							MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_BASE,
+							MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_MASK,
+							MSR_TYPE_RW, flag);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_CR3_MATCH,
+							MSR_TYPE_RW, flag);
+	for (i = 0; i < vmx->pt_desc.range_cnt; i++)
+		vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_ADDR0_A + i,
+							MSR_TYPE_RW, flag);
+}
+
 static bool vmx_get_enable_apicv(struct kvm_vcpu *vcpu)
 {
 	return enable_apicv;
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 11/11] KVM: x86: Disable Intel Processor Trace when VMXON in L1 guest
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
                   ` (9 preceding siblings ...)
  2018-03-20 11:21 ` [PATCH v6 10/11] KVM: x86: Set intercept for Intel PT MSRs Luwei Kang
@ 2018-03-20 11:21 ` Luwei Kang
  2018-04-30 13:17 ` [PATCH v6 00/11] Intel Processor Trace virtualization enabling Paolo Bonzini
  11 siblings, 0 replies; 17+ messages in thread
From: Luwei Kang @ 2018-03-20 11:21 UTC (permalink / raw)
  To: kvm
  Cc: tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro, Luwei Kang

Currently, Intel Processor Trace do not support tracing in L1 guest
VMX operation(IA32_VMX_MISC[bit 14] is 0). As mentioned in SDM,
on these type of processors, execution of the VMXON instruction will
clears IA32_RTIT_CTL.TraceEn and any attempt to write IA32_RTIT_CTL
causes a general-protection exception (#GP).

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/kvm/vmx.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 27185da..77a28f3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3763,7 +3763,8 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_RTIT_CTL:
 		if ((pt_mode != PT_MODE_HOST_GUEST) ||
-			vmx_rtit_ctl_check(vcpu, data))
+			vmx_rtit_ctl_check(vcpu, data) ||
+			vmx->nested.vmxon)
 			return 1;
 		vmcs_write64(GUEST_IA32_RTIT_CTL, data);
 		pt_set_intercept_for_msr(vmx, !(data & RTIT_CTL_TRACEEN));
@@ -7869,6 +7870,12 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
 	if (ret)
 		return ret;
 
+	if (pt_mode == PT_MODE_HOST_GUEST) {
+		vmx->pt_desc.guest.ctl = 0;
+		vmcs_write64(GUEST_IA32_RTIT_CTL, 0);
+		pt_set_intercept_for_msr(vmx, 1);
+	}
+
 	nested_vmx_succeed(vcpu);
 	return kvm_skip_emulated_instruction(vcpu);
 }
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 00/11] Intel Processor Trace virtualization enabling
  2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
                   ` (10 preceding siblings ...)
  2018-03-20 11:21 ` [PATCH v6 11/11] KVM: x86: Disable Intel Processor Trace when VMXON in L1 guest Luwei Kang
@ 2018-04-30 13:17 ` Paolo Bonzini
  11 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2018-04-30 13:17 UTC (permalink / raw)
  To: Luwei Kang, kvm
  Cc: tglx, mingo, hpa, x86, rkrcmar, linux-kernel, joro,
	Peter Zijlstra, Ingo Molnar

On 20/03/2018 12:21, Luwei Kang wrote:
> Hi All,
> 
> Here is a patch-series which adding Processor Trace enabling in KVM
> guest. 
> 
> Processor Trace virtualization can be work in one of 3 possible modes
> by set new option "pt_mode". Default value is system mode.
> a. system-wide: trace both host/guest and output to host buffer;
> b. host-only: only trace host and output to host buffer;
> c. host-guest: trace host/guest simultaneous and output to their
> respective buffer.

tip folks, could you please ack and/or prepare a topic branch for
patches 1-3 in this series?

Thanks!

Paolo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 01/11] perf/x86/intel/pt: Move Intel-PT MSR bit definitions to a public header
  2018-03-20 11:21 ` [PATCH v6 01/11] perf/x86/intel/pt: Move Intel-PT MSR bit definitions to a public header Luwei Kang
@ 2018-04-30 13:57   ` Peter Zijlstra
  2018-05-02  7:41     ` Kang, Luwei
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2018-04-30 13:57 UTC (permalink / raw)
  To: Luwei Kang
  Cc: kvm, tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel,
	joro, Chao Peng

On Tue, Mar 20, 2018 at 07:21:48PM +0800, Luwei Kang wrote:
> From: Chao Peng <chao.p.peng@linux.intel.com>
> 
> Intel Processor Trace virtualization enabling in guest need
> to use these MSR bits, so move them to public header msr-index.h.
> Introduce RTIT_CTL_FABRIC_EN and sync the definitions to
> latest spec.

You forgot to Cc the maintainers.

Also, this patch does 2 things, I think we have a rule that says that's
a bad thing.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 02/11] perf/x86/intel/pt: Change pt_cap_get() to a public function
  2018-03-20 11:21 ` [PATCH v6 02/11] perf/x86/intel/pt: Change pt_cap_get() to a public function Luwei Kang
@ 2018-04-30 13:59   ` Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2018-04-30 13:59 UTC (permalink / raw)
  To: Luwei Kang
  Cc: kvm, tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel,
	joro, Chao Peng

On Tue, Mar 20, 2018 at 07:21:49PM +0800, Luwei Kang wrote:
> From: Chao Peng <chao.p.peng@linux.intel.com>
> 
> Change pt_cap_get() to a public function so that KVM can access it.
> Introduce new capablility PT_CAP_output_subsys to support
> of output to Trace Transport subsystem.

That is again, more than a single change.

Also, the Changelog is far too terse, why do you need a new capability?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 03/11] perf/x86/intel/pt: Introduce a new function to get the capability of Intel PT
  2018-03-20 11:21 ` [PATCH v6 03/11] perf/x86/intel/pt: Introduce a new function to get the capability of Intel PT Luwei Kang
@ 2018-04-30 14:01   ` Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2018-04-30 14:01 UTC (permalink / raw)
  To: Luwei Kang
  Cc: kvm, tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel, joro

On Tue, Mar 20, 2018 at 07:21:50PM +0800, Luwei Kang wrote:
> Because of the guest CPUID information may different with
> host(some bits may mask off in guest) so introduce a new
> function to get the capability of Intel PT.

That's not English, also you need whitespace before a parenthesis.

Also, what?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v6 01/11] perf/x86/intel/pt: Move Intel-PT MSR bit definitions to a public header
  2018-04-30 13:57   ` Peter Zijlstra
@ 2018-05-02  7:41     ` Kang, Luwei
  0 siblings, 0 replies; 17+ messages in thread
From: Kang, Luwei @ 2018-05-02  7:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kvm, tglx, mingo, hpa, x86, pbonzini, rkrcmar, linux-kernel,
	joro, Chao Peng

> > Intel Processor Trace virtualization enabling in guest need to use
> > these MSR bits, so move them to public header msr-index.h.
> > Introduce RTIT_CTL_FABRIC_EN and sync the definitions to latest spec.
> 
> You forgot to Cc the maintainers.
> 
> Also, this patch does 2 things, I think we have a rule that says that's a bad thing.

I will split it into two patches, one patch to make these #define in a public header and another patch to add new bits for Intel Processor Trace MSRs. Thank for the review.

Thanks,
Luwei Kang

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-05-02  7:41 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-20 11:21 [PATCH v6 00/11] Intel Processor Trace virtualization enabling Luwei Kang
2018-03-20 11:21 ` [PATCH v6 01/11] perf/x86/intel/pt: Move Intel-PT MSR bit definitions to a public header Luwei Kang
2018-04-30 13:57   ` Peter Zijlstra
2018-05-02  7:41     ` Kang, Luwei
2018-03-20 11:21 ` [PATCH v6 02/11] perf/x86/intel/pt: Change pt_cap_get() to a public function Luwei Kang
2018-04-30 13:59   ` Peter Zijlstra
2018-03-20 11:21 ` [PATCH v6 03/11] perf/x86/intel/pt: Introduce a new function to get the capability of Intel PT Luwei Kang
2018-04-30 14:01   ` Peter Zijlstra
2018-03-20 11:21 ` [PATCH v6 04/11] KVM: x86: Add Intel Processor Trace virtualization mode Luwei Kang
2018-03-20 11:21 ` [PATCH v6 05/11] KVM: x86: Add Intel Processor Trace cpuid emulation Luwei Kang
2018-03-20 11:21 ` [PATCH v6 06/11] KVM: x86: Add Intel processor trace context for each vcpu Luwei Kang
2018-03-20 11:21 ` [PATCH v6 07/11] KVM: x86: Implement Intel Processor Trace context switch Luwei Kang
2018-03-20 11:21 ` [PATCH v6 08/11] KVM: x86: Introduce a function to initialize the PT configuration Luwei Kang
2018-03-20 11:21 ` [PATCH v6 09/11] KVM: x86: Implement Intel Processor Trace MSRs read/write Luwei Kang
2018-03-20 11:21 ` [PATCH v6 10/11] KVM: x86: Set intercept for Intel PT MSRs Luwei Kang
2018-03-20 11:21 ` [PATCH v6 11/11] KVM: x86: Disable Intel Processor Trace when VMXON in L1 guest Luwei Kang
2018-04-30 13:17 ` [PATCH v6 00/11] Intel Processor Trace virtualization enabling Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).