LKML Archive on lore.kernel.org
* [PATCH v7 0/6] Add TDX Guest Support (boot support)
@ 2021-10-05 23:05 Kuppuswamy Sathyanarayanan
  2021-10-05 23:05 ` [PATCH v7 1/6] x86/boot: Add a trampoline for APs booting in 64-bit mode Kuppuswamy Sathyanarayanan
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2021-10-05 23:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	Kuppuswamy Sathyanarayanan, linux-kernel

Hi All,

Intel's Trust Domain Extensions (TDX) protect guest VMs from malicious
hosts and some physical attacks. This series adds boot code support
and some additional fixes required for successful boot of TDX guest.

This series is the continuation of the patch series titled "Add TDX Guest
Support (Initial support)" and "Add TDX Guest Support (#VE handler
support)", which added initial support and #VE handler support for TDX
guests. You can find the related patchsets at the following links.

[set 1, v8] - https://lore.kernel.org/lkml/20211005025205.1784480-1-sathyanarayanan.kuppuswamy@linux.intel.com/
[set 2, v7] - https://lore.kernel.org/lkml/20211005204136.1812078-1-sathyanarayanan.kuppuswamy@linux.intel.com/

Also please note that this series alone is not necessarily fully
functional.

You can find TDX related documents in the following link.

https://software.intel.com/content/www/br/pt/develop/articles/intel-trust-domain-extensions.html

Changes since v6:
 * Included "x86/split_lock: Fix the split lock #AC handling when
   running as guest" as part of this series.
 * Rebased on top of other TDX patches (set 1 & set 2).

Changes since v5:
 * Rebased on top of Tom Lendacky's CC guest change.
 * Rebased on top of v5.15-rc1.
 * No functional changes.

Changes since v4:
 * Renamed tdg_* prefix with tdx_*.
 * No functional changes.

Changes since v3:
 * Rebased on top of Tom Lendacky's protected guest change.
 * No functional changes.

Changes since v2:
 * Rebased on top of v5.14-rc1.
 * No functional changes.

Changes since v1:
 * Rebased on top of the v3 version of "Add TDX Guest Support (Initial
   support)" patchset. Since it changed the TDCALL implementation, the
   other dependent patches had to be rebased.

Kuppuswamy Sathyanarayanan (2):
  x86/topology: Disable CPU online/offline control for TDX guest
  x86: Skip WBINVD instruction for VM guest

Sean Christopherson (3):
  x86/boot: Add a trampoline for APs booting in 64-bit mode
  x86/boot: Avoid #VE during boot for TDX platforms
  x86/tdx: Forcefully disable legacy PIC for TDX guests

Xiaoyao Li (1):
  x86/split_lock: Fix the split lock #AC handling when running as guest

 arch/x86/boot/compressed/head_64.S       | 16 +++++--
 arch/x86/boot/compressed/pgtable.h       |  2 +-
 arch/x86/include/asm/acenv.h             |  7 ++-
 arch/x86/include/asm/realmode.h          | 12 +++++
 arch/x86/kernel/cpu/intel.c              |  7 ++-
 arch/x86/kernel/head_64.S                | 20 +++++++-
 arch/x86/kernel/smpboot.c                |  2 +-
 arch/x86/kernel/tdx.c                    | 19 ++++++++
 arch/x86/kernel/topology.c               |  3 +-
 arch/x86/realmode/rm/header.S            |  1 +
 arch/x86/realmode/rm/trampoline_64.S     | 59 ++++++++++++++++++++++--
 arch/x86/realmode/rm/trampoline_common.S | 12 ++++-
 12 files changed, 145 insertions(+), 15 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v7 1/6] x86/boot: Add a trampoline for APs booting in 64-bit mode
  2021-10-05 23:05 [PATCH v7 0/6] Add TDX Guest Support (boot support) Kuppuswamy Sathyanarayanan
@ 2021-10-05 23:05 ` Kuppuswamy Sathyanarayanan
  2021-10-05 23:05 ` [PATCH v7 2/6] x86/boot: Avoid #VE during boot for TDX platforms Kuppuswamy Sathyanarayanan
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2021-10-05 23:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	Kuppuswamy Sathyanarayanan, linux-kernel

From: Sean Christopherson <sean.j.christopherson@intel.com>

Add a trampoline for booting APs in 64-bit mode via a software handoff
with BIOS, and use the new trampoline for the ACPI MP wake protocol used
by TDX. MADT MP wake protocol details can be found in ACPI specification
r6.4, sec titled "Multiprocessor Wakeup Structure" (index 5.2.12.19).

Extend the real mode IDT pointer by four bytes to support LIDT in 64-bit
mode.  For the GDT pointer, create a new entry as the existing storage
for the pointer occupies the zero entry in the GDT itself.

Reported-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---

Changes since v6:
 * None
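
For readers following the commit message above, the descriptor-table
pointer sizes it refers to can be sketched in C. This is only an
illustrative model, not part of the patch: the struct names are
hypothetical, and the packed attribute assumes GCC or Clang. It shows why
the 6-byte tr_idt storage has to grow by four bytes before LIDT can be
used in 64-bit mode, and why tr_gdt64 is laid out as a 2-byte limit
followed by two 4-byte words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the LIDT/LGDT memory operand outside 64-bit mode. */
struct desc_ptr32 {
	uint16_t limit;		/* table size in bytes, minus one */
	uint32_t base;		/* 32-bit linear base address */
} __attribute__((packed));

/* Hypothetical model of the LIDT/LGDT memory operand in 64-bit mode. */
struct desc_ptr64 {
	uint16_t limit;		/* table size in bytes, minus one */
	uint64_t base;		/* 64-bit linear base address */
} __attribute__((packed));

/* 2 + 4 bytes: matches the old ".fill 1, 6, 0" storage for tr_idt. */
_Static_assert(sizeof(struct desc_ptr32) == 6, "32-bit pointer is 6 bytes");

/* 2 + 8 bytes: matches ".short 0" followed by ".quad 0". */
_Static_assert(sizeof(struct desc_ptr64) == 10, "64-bit pointer is 10 bytes");

/* Number of extra bytes needed so one buffer can serve both modes. */
static inline size_t desc_ptr_extra_bytes(void)
{
	return sizeof(struct desc_ptr64) - sizeof(struct desc_ptr32);
}
```

Reserving the larger size is harmless for the 32-bit path, since a
32-bit LIDT only reads the first six bytes.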

 arch/x86/include/asm/realmode.h          | 12 ++++++++
 arch/x86/kernel/smpboot.c                |  2 +-
 arch/x86/realmode/rm/header.S            |  1 +
 arch/x86/realmode/rm/trampoline_64.S     | 38 ++++++++++++++++++++++++
 arch/x86/realmode/rm/trampoline_common.S | 12 +++++++-
 5 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/realmode.h b/arch/x86/include/asm/realmode.h
index 5db5d083c873..a3b1d9264c3d 100644
--- a/arch/x86/include/asm/realmode.h
+++ b/arch/x86/include/asm/realmode.h
@@ -12,6 +12,7 @@
 #ifndef __ASSEMBLY__
 
 #include <linux/types.h>
+#include <linux/cc_platform.h>
 #include <asm/io.h>
 
 /* This must match data at realmode/rm/header.S */
@@ -25,6 +26,7 @@ struct real_mode_header {
 	u32	sev_es_trampoline_start;
 #endif
 #ifdef CONFIG_X86_64
+	u32	trampoline_start64;
 	u32	trampoline_pgd;
 #endif
 	/* ACPI S3 wakeup */
@@ -88,6 +90,16 @@ static inline void set_real_mode_mem(phys_addr_t mem)
 	real_mode_header = (struct real_mode_header *) __va(mem);
 }
 
+/* Common helper function to get start IP address */
+static inline unsigned long get_trampoline_start_ip(struct real_mode_header *rmh)
+{
+#ifdef CONFIG_X86_64
+	if (cc_platform_has(CC_ATTR_GUEST_TDX))
+		return rmh->trampoline_start64;
+#endif
+	return rmh->trampoline_start;
+}
+
 void reserve_real_mode(void);
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 85f6e242b6b4..3e9c9c33bef2 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1034,7 +1034,7 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
 		       int *cpu0_nmi_registered)
 {
 	/* start_ip had better be page-aligned! */
-	unsigned long start_ip = real_mode_header->trampoline_start;
+	unsigned long start_ip = get_trampoline_start_ip(real_mode_header);
 
 	unsigned long boot_error = 0;
 	unsigned long timeout;
diff --git a/arch/x86/realmode/rm/header.S b/arch/x86/realmode/rm/header.S
index 8c1db5bf5d78..2eb62be6d256 100644
--- a/arch/x86/realmode/rm/header.S
+++ b/arch/x86/realmode/rm/header.S
@@ -24,6 +24,7 @@ SYM_DATA_START(real_mode_header)
 	.long	pa_sev_es_trampoline_start
 #endif
 #ifdef CONFIG_X86_64
+	.long	pa_trampoline_start64
 	.long	pa_trampoline_pgd;
 #endif
 	/* ACPI S3 wakeup */
diff --git a/arch/x86/realmode/rm/trampoline_64.S b/arch/x86/realmode/rm/trampoline_64.S
index cc8391f86cdb..ae112a91592f 100644
--- a/arch/x86/realmode/rm/trampoline_64.S
+++ b/arch/x86/realmode/rm/trampoline_64.S
@@ -161,6 +161,19 @@ SYM_CODE_START(startup_32)
 	ljmpl	$__KERNEL_CS, $pa_startup_64
 SYM_CODE_END(startup_32)
 
+SYM_CODE_START(pa_trampoline_compat)
+	/*
+	 * In compatibility mode.  Prep ESP and DX for startup_32, then disable
+	 * paging and complete the switch to legacy 32-bit mode.
+	 */
+	movl	$rm_stack_end, %esp
+	movw	$__KERNEL_DS, %dx
+
+	movl	$X86_CR0_PE, %eax
+	movl	%eax, %cr0
+	ljmpl   $__KERNEL32_CS, $pa_startup_32
+SYM_CODE_END(pa_trampoline_compat)
+
 	.section ".text64","ax"
 	.code64
 	.balign 4
@@ -169,6 +182,20 @@ SYM_CODE_START(startup_64)
 	jmpq	*tr_start(%rip)
 SYM_CODE_END(startup_64)
 
+SYM_CODE_START(trampoline_start64)
+	/*
+	 * APs start here on a direct transfer from 64-bit BIOS with identity
+	 * mapped page tables.  Load the kernel's GDT in order to gear down to
+	 * 32-bit mode (to handle 4-level vs. 5-level paging), and to (re)load
+	 * segment registers.  Load the zero IDT so any fault triggers a
+	 * shutdown instead of jumping back into BIOS.
+	 */
+	lidt	tr_idt(%rip)
+	lgdt	tr_gdt64(%rip)
+
+	ljmpl	*tr_compat(%rip)
+SYM_CODE_END(trampoline_start64)
+
 	.section ".rodata","a"
 	# Duplicate the global descriptor table
 	# so the kernel can live anywhere
@@ -182,6 +209,17 @@ SYM_DATA_START(tr_gdt)
 	.quad	0x00cf93000000ffff	# __KERNEL_DS
 SYM_DATA_END_LABEL(tr_gdt, SYM_L_LOCAL, tr_gdt_end)
 
+SYM_DATA_START(tr_gdt64)
+	.short	tr_gdt_end - tr_gdt - 1	# gdt limit
+	.long	pa_tr_gdt
+	.long	0
+SYM_DATA_END(tr_gdt64)
+
+SYM_DATA_START(tr_compat)
+	.long	pa_trampoline_compat
+	.short	__KERNEL32_CS
+SYM_DATA_END(tr_compat)
+
 	.bss
 	.balign	PAGE_SIZE
 SYM_DATA(trampoline_pgd, .space PAGE_SIZE)
diff --git a/arch/x86/realmode/rm/trampoline_common.S b/arch/x86/realmode/rm/trampoline_common.S
index 5033e640f957..4331c32c47f8 100644
--- a/arch/x86/realmode/rm/trampoline_common.S
+++ b/arch/x86/realmode/rm/trampoline_common.S
@@ -1,4 +1,14 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 	.section ".rodata","a"
 	.balign	16
-SYM_DATA_LOCAL(tr_idt, .fill 1, 6, 0)
+
+/*
+ * When a bootloader hands off to the kernel in 32-bit mode an
+ * IDT with a 2-byte limit and 4-byte base is needed. When a boot
+ * loader hands off to a kernel 64-bit mode the base address
+ * extends to 8-bytes. Reserve enough space for either scenario.
+ */
+SYM_DATA_START_LOCAL(tr_idt)
+	.short  0
+	.quad   0
+SYM_DATA_END(tr_idt)
-- 
2.25.1



* [PATCH v7 2/6] x86/boot: Avoid #VE during boot for TDX platforms
  2021-10-05 23:05 [PATCH v7 0/6] Add TDX Guest Support (boot support) Kuppuswamy Sathyanarayanan
  2021-10-05 23:05 ` [PATCH v7 1/6] x86/boot: Add a trampoline for APs booting in 64-bit mode Kuppuswamy Sathyanarayanan
@ 2021-10-05 23:05 ` Kuppuswamy Sathyanarayanan
  2021-10-17 19:49   ` Thomas Gleixner
  2021-10-05 23:05 ` [PATCH v7 3/6] x86/topology: Disable CPU online/offline control for TDX guest Kuppuswamy Sathyanarayanan
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2021-10-05 23:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	Kuppuswamy Sathyanarayanan, linux-kernel

From: Sean Christopherson <seanjc@google.com>

There are a few MSRs and control register bits which the kernel
normally needs to modify during boot. But, TDX disallows
modification of these registers to help provide consistent security
guarantees. Fortunately, TDX ensures that these are all in the correct
state before the kernel loads, which means the kernel has no need to
modify them.

The conditions to avoid are:

  * Any writes to the EFER MSR
  * Clearing CR0.NE
  * Clearing CR4.MCE

This theoretically makes guest boot more fragile. If, for instance,
EFER was set up incorrectly and a WRMSR was performed, it would trigger
an early exception panic, or a triple fault if it happens before early
exceptions are set up. However, this is likely to trip up the guest
BIOS long before control reaches the kernel. In any case, these kinds
of problems are unlikely to occur in production environments, and
developers have good debug tools to fix them quickly.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---

Changes since v6:
 * None
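
The register handling described above can be modelled in plain C. This
is a rough sketch rather than the boot code itself: the helper names
boot_cr4() and efer_needs_write() are made up for illustration, and only
the bit positions (CR4.PAE = bit 5, CR4.MCE = bit 6, CR4.LA57 = bit 12,
EFER.LME = bit 8) are taken from the architecture. It mirrors the idea of
the assembly changes: carry over the current CR4.MCE value instead of
clearing it, and skip the EFER write when the wanted bits are already
set.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR4_PAE	(1u << 5)
#define X86_CR4_MCE	(1u << 6)
#define X86_CR4_LA57	(1u << 12)
#define EFER_LME	(1u << 8)

/*
 * Hypothetical helper: compute the CR4 value to load during boot.
 * Everything except MCE is dropped from the current value, so MCE is
 * never cleared by the write (clearing it would #VE in a TDX guest).
 */
static uint32_t boot_cr4(uint32_t cur_cr4, bool want_la57)
{
	uint32_t cr4 = cur_cr4 & X86_CR4_MCE;

	cr4 |= X86_CR4_PAE;
	if (want_la57)
		cr4 |= X86_CR4_LA57;
	return cr4;
}

/*
 * Hypothetical helper: report whether a WRMSR to EFER is needed.
 * If all wanted bits are already set, the write can be skipped.
 */
static bool efer_needs_write(uint64_t cur_efer, uint64_t wanted_bits)
{
	return (cur_efer | wanted_bits) != cur_efer;
}
```

On a TDX guest the firmware is expected to have EFER set up already, so
efer_needs_write() would return false and the WRMSR is never reached.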

 arch/x86/boot/compressed/head_64.S   | 16 ++++++++++++----
 arch/x86/boot/compressed/pgtable.h   |  2 +-
 arch/x86/kernel/head_64.S            | 20 ++++++++++++++++++--
 arch/x86/realmode/rm/trampoline_64.S | 23 +++++++++++++++++++----
 4 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 572c535cf45b..e71001b380fe 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -643,12 +643,20 @@ SYM_CODE_START(trampoline_32bit_src)
 	movl	$MSR_EFER, %ecx
 	rdmsr
 	btsl	$_EFER_LME, %eax
+	/* Avoid writing EFER if no change was made (for TDX guest) */
+	jc	1f
 	wrmsr
-	popl	%edx
+1:	popl	%edx
 	popl	%ecx
 
 	/* Enable PAE and LA57 (if required) paging modes */
-	movl	$X86_CR4_PAE, %eax
+	movl	%cr4, %eax
+	/*
+	 * Clear all bits except CR4.MCE, which is preserved.
+	 * Clearing CR4.MCE will #VE in TDX guests.
+	 */
+	andl	$X86_CR4_MCE, %eax
+	orl	$X86_CR4_PAE, %eax
 	testl	%edx, %edx
 	jz	1f
 	orl	$X86_CR4_LA57, %eax
@@ -662,8 +670,8 @@ SYM_CODE_START(trampoline_32bit_src)
 	pushl	$__KERNEL_CS
 	pushl	%eax
 
-	/* Enable paging again */
-	movl	$(X86_CR0_PG | X86_CR0_PE), %eax
+	/* Enable paging again. Avoid clearing X86_CR0_NE for TDX */
+	movl	$(X86_CR0_PG | X86_CR0_NE | X86_CR0_PE), %eax
 	movl	%eax, %cr0
 
 	lret
diff --git a/arch/x86/boot/compressed/pgtable.h b/arch/x86/boot/compressed/pgtable.h
index 6ff7e81b5628..cc9b2529a086 100644
--- a/arch/x86/boot/compressed/pgtable.h
+++ b/arch/x86/boot/compressed/pgtable.h
@@ -6,7 +6,7 @@
 #define TRAMPOLINE_32BIT_PGTABLE_OFFSET	0
 
 #define TRAMPOLINE_32BIT_CODE_OFFSET	PAGE_SIZE
-#define TRAMPOLINE_32BIT_CODE_SIZE	0x70
+#define TRAMPOLINE_32BIT_CODE_SIZE	0x80
 
 #define TRAMPOLINE_32BIT_STACK_END	TRAMPOLINE_32BIT_SIZE
 
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d8b3ebd2bb85..96beac9eff42 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -141,7 +141,13 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 1:
 
 	/* Enable PAE mode, PGE and LA57 */
-	movl	$(X86_CR4_PAE | X86_CR4_PGE), %ecx
+	movq	%cr4, %rcx
+	/*
+	 * Clear all bits except CR4.MCE, which is preserved.
+	 * Clearing CR4.MCE will #VE in TDX guests.
+	 */
+	andl	$X86_CR4_MCE, %ecx
+	orl	$(X86_CR4_PAE | X86_CR4_PGE), %ecx
 #ifdef CONFIG_X86_5LEVEL
 	testl	$1, __pgtable_l5_enabled(%rip)
 	jz	1f
@@ -229,13 +235,23 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	/* Setup EFER (Extended Feature Enable Register) */
 	movl	$MSR_EFER, %ecx
 	rdmsr
+	/*
+	 * Preserve current value of EFER for comparison and to skip
+	 * EFER writes if no change was made (for TDX guest)
+	 */
+	movl    %eax, %edx
 	btsl	$_EFER_SCE, %eax	/* Enable System Call */
 	btl	$20,%edi		/* No Execute supported? */
 	jnc     1f
 	btsl	$_EFER_NX, %eax
 	btsq	$_PAGE_BIT_NX,early_pmd_flags(%rip)
-1:	wrmsr				/* Make changes effective */
 
+	/* Avoid writing EFER if no change was made (for TDX guest) */
+1:	cmpl	%edx, %eax
+	je	1f
+	xor	%edx, %edx
+	wrmsr				/* Make changes effective */
+1:
 	/* Setup cr0 */
 	movl	$CR0_STATE, %eax
 	/* Make changes effective */
diff --git a/arch/x86/realmode/rm/trampoline_64.S b/arch/x86/realmode/rm/trampoline_64.S
index ae112a91592f..0fdd74054044 100644
--- a/arch/x86/realmode/rm/trampoline_64.S
+++ b/arch/x86/realmode/rm/trampoline_64.S
@@ -143,13 +143,27 @@ SYM_CODE_START(startup_32)
 	movl	%eax, %cr3
 
 	# Set up EFER
+	movl	$MSR_EFER, %ecx
+	rdmsr
+	/*
+	 * Skip writing to EFER if the register already has desired
+	 * value (to avoid #VE for the TDX guest).
+	 */
+	cmp	pa_tr_efer, %eax
+	jne	.Lwrite_efer
+	cmp	pa_tr_efer + 4, %edx
+	je	.Ldone_efer
+.Lwrite_efer:
 	movl	pa_tr_efer, %eax
 	movl	pa_tr_efer + 4, %edx
-	movl	$MSR_EFER, %ecx
 	wrmsr
 
-	# Enable paging and in turn activate Long Mode
-	movl	$(X86_CR0_PG | X86_CR0_WP | X86_CR0_PE), %eax
+.Ldone_efer:
+	/*
+	 * Enable paging and in turn activate Long Mode. Avoid clearing
+	 * X86_CR0_NE for TDX.
+	 */
+	movl	$(X86_CR0_PG | X86_CR0_WP | X86_CR0_NE | X86_CR0_PE), %eax
 	movl	%eax, %cr0
 
 	/*
@@ -169,7 +183,8 @@ SYM_CODE_START(pa_trampoline_compat)
 	movl	$rm_stack_end, %esp
 	movw	$__KERNEL_DS, %dx
 
-	movl	$X86_CR0_PE, %eax
+	/* Avoid clearing X86_CR0_NE for TDX */
+	movl	$(X86_CR0_NE | X86_CR0_PE), %eax
 	movl	%eax, %cr0
 	ljmpl   $__KERNEL32_CS, $pa_startup_32
 SYM_CODE_END(pa_trampoline_compat)
-- 
2.25.1



* [PATCH v7 3/6] x86/topology: Disable CPU online/offline control for TDX guest
  2021-10-05 23:05 [PATCH v7 0/6] Add TDX Guest Support (boot support) Kuppuswamy Sathyanarayanan
  2021-10-05 23:05 ` [PATCH v7 1/6] x86/boot: Add a trampoline for APs booting in 64-bit mode Kuppuswamy Sathyanarayanan
  2021-10-05 23:05 ` [PATCH v7 2/6] x86/boot: Avoid #VE during boot for TDX platforms Kuppuswamy Sathyanarayanan
@ 2021-10-05 23:05 ` Kuppuswamy Sathyanarayanan
  2021-10-17 19:23   ` Thomas Gleixner
  2021-10-05 23:05 ` [PATCH v7 4/6] x86/tdx: Forcefully disable legacy PIC for TDX guests Kuppuswamy Sathyanarayanan
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2021-10-05 23:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	Kuppuswamy Sathyanarayanan, linux-kernel

As per the Intel TDX Virtual Firmware Design Guide, sec titled "AP
initialization in OS" (index 4.3.5) and sec titled "Hotplug Device"
(index 9.4), all unused CPUs are put in a spinning state by TDVF until
the OS requests CPU bring-up via the mailbox address passed in the ACPI
MADT table. Since all unused CPUs are always in a spinning state by
default, there is no point in supporting the dynamic CPU online/offline
feature. The current generation of TDVF therefore does not support CPU
hotplug. It may be supported in the next generation.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---

Changes since v6:
 * None
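
As a reading aid, the two mechanisms this patch relies on can be sketched
as a small userspace model. It is not kernel code: struct hp_state,
try_offline() and cpu_hotpluggable() are hypothetical stand-ins, assumed
here only to show the control flow. The first part models a hotplug
state whose teardown callback vetoes the offline request with
-EOPNOTSUPP; the second models the changed condition in
arch_register_cpu().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CPU hotplug callback. */
typedef int (*cpuhp_cb)(unsigned int cpu);

/* Hypothetical stand-in for one registered hotplug state. */
struct hp_state {
	cpuhp_cb startup;	/* called when a CPU comes online */
	cpuhp_cb teardown;	/* called when a CPU goes offline */
};

/* Models tdx_cpu_offline_prepare(): always refuse to go offline. */
static int offline_veto(unsigned int cpu)
{
	(void)cpu;
	return -EOPNOTSUPP;
}

/* A non-zero return from the teardown callback aborts the offline. */
static int try_offline(const struct hp_state *st, unsigned int cpu)
{
	if (st->teardown)
		return st->teardown(cpu);
	return 0;
}

/* Models the new hotpluggable condition in arch_register_cpu(). */
static bool cpu_hotpluggable(int num, bool cpu0_hotpluggable, bool tdx_guest)
{
	return (num || cpu0_hotpluggable) && !tdx_guest;
}
```

With the hotpluggable flag cleared, the CPU device would not offer the
online control in the first place; the callback acts as a second line of
defence for in-kernel callers.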

 arch/x86/kernel/tdx.c      | 16 ++++++++++++++++
 arch/x86/kernel/topology.c |  3 ++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index a66520405109..203deb57c4c9 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -4,6 +4,8 @@
 #undef pr_fmt
 #define pr_fmt(fmt)     "tdx: " fmt
 
+#include <linux/cpuhotplug.h>
+
 #include <asm/tdx.h>
 #include <asm/vmx.h>
 #include <asm/insn.h>
@@ -307,6 +309,17 @@ static int tdx_handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 	return insn.length;
 }
 
+static int tdx_cpu_offline_prepare(unsigned int cpu)
+{
+	/*
+	 * Per Intel TDX Virtual Firmware Design Guide,
+	 * sec 4.3.5 and sec 9.4, Hotplug is not supported
+	 * in TDX platforms. So don't support CPU
+	 * offline feature once it is turned on.
+	 */
+	return -EOPNOTSUPP;
+}
+
 unsigned long tdx_get_ve_info(struct ve_info *ve)
 {
 	struct tdx_module_output out = {0};
@@ -451,5 +464,8 @@ void __init tdx_early_init(void)
 	pv_ops.irq.safe_halt = tdx_safe_halt;
 	pv_ops.irq.halt = tdx_halt;
 
+	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "tdx:cpu_hotplug",
+			  NULL, tdx_cpu_offline_prepare);
+
 	pr_info("Guest initialized\n");
 }
diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
index bd83748e2bde..ded34eda5bac 100644
--- a/arch/x86/kernel/topology.c
+++ b/arch/x86/kernel/topology.c
@@ -32,6 +32,7 @@
 #include <linux/init.h>
 #include <linux/smp.h>
 #include <linux/irq.h>
+#include <linux/cc_platform.h>
 #include <asm/io_apic.h>
 #include <asm/cpu.h>
 
@@ -130,7 +131,7 @@ int arch_register_cpu(int num)
 			}
 		}
 	}
-	if (num || cpu0_hotpluggable)
+	if ((num || cpu0_hotpluggable) && !cc_platform_has(CC_ATTR_GUEST_TDX))
 		per_cpu(cpu_devices, num).cpu.hotpluggable = 1;
 
 	return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
-- 
2.25.1



* [PATCH v7 4/6] x86/tdx: Forcefully disable legacy PIC for TDX guests
  2021-10-05 23:05 [PATCH v7 0/6] Add TDX Guest Support (boot support) Kuppuswamy Sathyanarayanan
                   ` (2 preceding siblings ...)
  2021-10-05 23:05 ` [PATCH v7 3/6] x86/topology: Disable CPU online/offline control for TDX guest Kuppuswamy Sathyanarayanan
@ 2021-10-05 23:05 ` Kuppuswamy Sathyanarayanan
  2021-10-05 23:05 ` [PATCH v7 5/6] x86: Skip WBINVD instruction for VM guest Kuppuswamy Sathyanarayanan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2021-10-05 23:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	Kuppuswamy Sathyanarayanan, linux-kernel

From: Sean Christopherson <sean.j.christopherson@intel.com>

Disable the legacy PIC (8259) for TDX guests as the PIC cannot be
supported by the VMM. TDX Module does not allow direct IRQ injection,
and using posted interrupt style delivery requires the guest to EOI
the IRQ, which diverges from the legacy PIC behavior.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---

Changes since v6:
 * None
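
The one-line change below swaps the active PIC operations for a no-op
implementation. A minimal sketch of that "null object" pattern follows.
The struct and function names are invented for this example and the
fields are a simplification (the real struct legacy_pic has more
members), so treat it as an assumption-laden model rather than the
kernel's definition.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of a legacy PIC operations table. */
struct pic_ops {
	int nr_legacy_irqs;	/* number of IRQs routed via the PIC */
	bool (*probe)(void);	/* returns true if a PIC is present */
};

static bool real_probe(void) { return true; }
static bool null_probe(void) { return false; }

/* Default table: a 8259-style PIC with 16 legacy IRQs. */
static struct pic_ops default_pic = { 16, real_probe };

/* Null table: no legacy IRQs and no device to probe. */
static struct pic_ops null_pic = { 0, null_probe };

/* All callers go through this pointer and never test for TDX. */
static struct pic_ops *active_pic = &default_pic;

/* Models the assignment added to tdx_early_init(). */
static void guest_early_init(bool tdx_guest)
{
	if (tdx_guest)
		active_pic = &null_pic;
}
```

The design choice is that code using the PIC stays unchanged; only the
pointer is redirected once during early init.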

 arch/x86/kernel/tdx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index 203deb57c4c9..987dc3ee5bbf 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -7,6 +7,7 @@
 #include <linux/cpuhotplug.h>
 
 #include <asm/tdx.h>
+#include <asm/i8259.h>
 #include <asm/vmx.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
@@ -464,6 +465,8 @@ void __init tdx_early_init(void)
 	pv_ops.irq.safe_halt = tdx_safe_halt;
 	pv_ops.irq.halt = tdx_halt;
 
+	legacy_pic = &null_legacy_pic;
+
 	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "tdx:cpu_hotplug",
 			  NULL, tdx_cpu_offline_prepare);
 
-- 
2.25.1



* [PATCH v7 5/6] x86: Skip WBINVD instruction for VM guest
  2021-10-05 23:05 [PATCH v7 0/6] Add TDX Guest Support (boot support) Kuppuswamy Sathyanarayanan
                   ` (3 preceding siblings ...)
  2021-10-05 23:05 ` [PATCH v7 4/6] x86/tdx: Forcefully disable legacy PIC for TDX guests Kuppuswamy Sathyanarayanan
@ 2021-10-05 23:05 ` Kuppuswamy Sathyanarayanan
  2021-10-05 23:05 ` [PATCH v7 6/6] x86/split_lock: Fix the split lock #AC handling when running as guest Kuppuswamy Sathyanarayanan
  2021-10-06  8:21 ` [PATCH v7 0/6] Add TDX Guest Support (boot support) David Hildenbrand
  6 siblings, 0 replies; 17+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2021-10-05 23:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	Kuppuswamy Sathyanarayanan, linux-kernel

VM guests that support ACPI use standard ACPI mechanisms to signal
sleep state entry (including reboot) to the host. The ACPI
specification mandates WBINVD on any sleep state entry with the
expectation that the platform is only responsible for maintaining the
state of memory over sleep states, not preserving dirty data in any
CPU caches. ACPI cache flushing requirements pre-date the advent of
virtualization. Given guest sleep state entry does not affect any
host power rails it is not required to flush caches. The host is
responsible for maintaining cache state over its own bare metal sleep
state transitions that power-off the cache. A TDX guest, unlike a
typical guest, will machine check if the CPU cache is powered off.

Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: linux-acpi@vger.kernel.org
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---

Changes since v6:
 * None
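
The macro change below can be tried out in userspace with a small
model. The names used here (running_under_hypervisor, fake_wbinvd,
FLUSH_CPU_CACHE, enter_sleep_state) are placeholders assumed for the
sketch; the flag stands in for boot_cpu_has(X86_FEATURE_HYPERVISOR) and
the counter stands in for the real WBINVD instruction. The do/while (0)
wrapper is what lets the multi-statement macro sit safely in an
unbraced if/else.

```c
#include <stdbool.h>

/* Placeholder for boot_cpu_has(X86_FEATURE_HYPERVISOR). */
static bool running_under_hypervisor;

/* Counts how often the (fake) cache flush was executed. */
static int flush_count;

static void fake_wbinvd(void)
{
	flush_count++;
}

/* Same shape as the new ACPI_FLUSH_CPU_CACHE(): flush only on bare metal. */
#define FLUSH_CPU_CACHE()				\
do {							\
	if (!running_under_hypervisor)			\
		fake_wbinvd();				\
} while (0)

/* The macro expands to a single statement, so this if/else stays valid. */
static void enter_sleep_state(bool need_flush)
{
	if (need_flush)
		FLUSH_CPU_CACHE();
	else
		(void)0;	/* nothing to do */
}
```

Without the do/while (0) wrapper, the inner if would capture the else
branch in enter_sleep_state() and change its meaning.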

 arch/x86/include/asm/acenv.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/acenv.h b/arch/x86/include/asm/acenv.h
index 9aff97f0de7f..d4162e94bee8 100644
--- a/arch/x86/include/asm/acenv.h
+++ b/arch/x86/include/asm/acenv.h
@@ -10,10 +10,15 @@
 #define _ASM_X86_ACENV_H
 
 #include <asm/special_insns.h>
+#include <asm/cpu.h>
 
 /* Asm macros */
 
-#define ACPI_FLUSH_CPU_CACHE()	wbinvd()
+#define ACPI_FLUSH_CPU_CACHE()				\
+do {							\
+	if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))	\
+		wbinvd();				\
+} while (0)
 
 int __acpi_acquire_global_lock(unsigned int *lock);
 int __acpi_release_global_lock(unsigned int *lock);
-- 
2.25.1



* [PATCH v7 6/6] x86/split_lock: Fix the split lock #AC handling when running as guest
  2021-10-05 23:05 [PATCH v7 0/6] Add TDX Guest Support (boot support) Kuppuswamy Sathyanarayanan
                   ` (4 preceding siblings ...)
  2021-10-05 23:05 ` [PATCH v7 5/6] x86: Skip WBINVD instruction for VM guest Kuppuswamy Sathyanarayanan
@ 2021-10-05 23:05 ` Kuppuswamy Sathyanarayanan
  2021-10-13 20:30   ` Sean Christopherson
  2021-10-06  8:21 ` [PATCH v7 0/6] Add TDX Guest Support (boot support) David Hildenbrand
  6 siblings, 1 reply; 17+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2021-10-05 23:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	Kuppuswamy Sathyanarayanan, linux-kernel

From: Xiaoyao Li <xiaoyao.li@intel.com>

If the kernel runs as a guest and the hypervisor enables
MSR_TEST_CTRL.SPLIT_LOCK_DETECT while the guest is running, the guest
can get a split lock #AC even though sld_state is sld_off.

For a kernel mode #AC, the kernel always dies ("split lock"), so no
further action is needed.

For a user mode #AC, the kernel should treat sld_off (the default state
when the feature is not available) as fatal as well.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
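
To make the behaviour change easier to review, the old and new
conditions can be compared side by side in a standalone sketch. The
function names handled_before() and handled_after() are hypothetical;
each returns what handle_user_split_lock() would return (true means
"warn and continue", false means "let the caller treat it as fatal").
The enum values follow the sld_off / sld_warn / sld_fatal states named
in this patch.

```c
#include <stdbool.h>

enum sld_state { sld_off, sld_warn, sld_fatal };

/* Old condition: only EFLAGS.AC or sld_fatal make the #AC fatal. */
static bool handled_before(bool eflags_ac, enum sld_state state)
{
	if (eflags_ac || state == sld_fatal)
		return false;
	return true;
}

/* New condition: anything other than sld_warn is fatal, including sld_off. */
static bool handled_after(bool eflags_ac, enum sld_state state)
{
	if (eflags_ac || state != sld_warn)
		return false;
	return true;
}
```

The only input where the two differ is sld_off with EFLAGS.AC clear,
which is the case a guest hits when the hypervisor has enabled split
lock detection behind its back.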
 arch/x86/kernel/cpu/intel.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 01d7935feaed..47f0bc95ce2a 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1190,7 +1190,12 @@ static void bus_lock_init(void)
 
 bool handle_user_split_lock(struct pt_regs *regs, long error_code)
 {
-	if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
+	/*
+	 * In virtualization environment, it can get split lock #AC even when
+	 * sld_off but hypervisor enables it.
+	 * Thus only handles when sld_warn explicitly.
+	 */
+	if ((regs->flags & X86_EFLAGS_AC) || sld_state != sld_warn)
 		return false;
 	split_lock_warn(regs->ip);
 	return true;
-- 
2.25.1



* Re: [PATCH v7 0/6] Add TDX Guest Support (boot support)
  2021-10-05 23:05 [PATCH v7 0/6] Add TDX Guest Support (boot support) Kuppuswamy Sathyanarayanan
                   ` (5 preceding siblings ...)
  2021-10-05 23:05 ` [PATCH v7 6/6] x86/split_lock: Fix the split lock #AC handling when running as guest Kuppuswamy Sathyanarayanan
@ 2021-10-06  8:21 ` David Hildenbrand
  6 siblings, 0 replies; 17+ messages in thread
From: David Hildenbrand @ 2021-10-06  8:21 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Paolo Bonzini, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	linux-kernel

On 06.10.21 01:05, Kuppuswamy Sathyanarayanan wrote:
> Hi All,
> 
> Intel's Trust Domain Extensions (TDX) protect guest VMs from malicious
> hosts and some physical attacks. This series adds boot code support
> and some additional fixes required for successful boot of TDX guest.
> 
> This series is the continuation of the patch series titled "Add TDX Guest
> Support (Initial support)" and "Add TDX Guest Support (#VE handler support
> )", which added initial support and #VE handler support for TDX guests. You
> can find the related patchsets in the following links.
> 
> [set 1, v8] - https://lore.kernel.org/lkml/20211005025205.1784480-1-sathyanarayanan.kuppuswamy@linux.intel.com/
> [set 2, v7] - https://lore.kernel.org/lkml/20211005204136.1812078-1-sathyanarayanan.kuppuswamy@linux.intel.com/
> 
> Also please note that this series alone is not necessarily fully
> functional.

I had a quick peek over all patches and nothing jumped out at me
(however, I am by no means an expert on early x86 code!).


-- 
Thanks,

David / dhildenb



* Re: [PATCH v7 6/6] x86/split_lock: Fix the split lock #AC handling when running as guest
  2021-10-05 23:05 ` [PATCH v7 6/6] x86/split_lock: Fix the split lock #AC handling when running as guest Kuppuswamy Sathyanarayanan
@ 2021-10-13 20:30   ` Sean Christopherson
  2021-10-13 21:32     ` Sathyanarayanan Kuppuswamy
  0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2021-10-13 20:30 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin, Dave Hansen, Tony Luck,
	Dan Williams, Andi Kleen, Kirill Shutemov,
	Kuppuswamy Sathyanarayanan, linux-kernel

On Tue, Oct 05, 2021, Kuppuswamy Sathyanarayanan wrote:
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> 
> If running as guest and hypervisor enables
> MSR_TEST_CTRL.SPLIT_LOCK_DETECT during its running, it can get split
> lock #AC even though sld_state is sld_off.

That's a hypervisor bug, no?  The hypervisor should never inject a fault that
the guest cannot reasonably expect.

> For kernel mode #AC, it always dies("split lock"), no more action
> needed.
> 
> For user mode #AC, it should treat sld_off (default state when feature
> is not available) as fatal as well.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/intel.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> index 01d7935feaed..47f0bc95ce2a 100644
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -1190,7 +1190,12 @@ static void bus_lock_init(void)
>  
>  bool handle_user_split_lock(struct pt_regs *regs, long error_code)
>  {
> -	if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
> +	/*
> +	 * In virtualization environment, it can get split lock #AC even when
> +	 * sld_off but hypervisor enables it.
> +	 * Thus only handles when sld_warn explicitly.
> +	 */
> +	if ((regs->flags & X86_EFLAGS_AC) || sld_state != sld_warn)
>  		return false;
>  	split_lock_warn(regs->ip);
>  	return true;
> -- 
> 2.25.1
> 


* Re: [PATCH v7 6/6] x86/split_lock: Fix the split lock #AC handling when running as guest
  2021-10-13 20:30   ` Sean Christopherson
@ 2021-10-13 21:32     ` Sathyanarayanan Kuppuswamy
  2021-10-14  1:24       ` Xiaoyao Li
  0 siblings, 1 reply; 17+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2021-10-13 21:32 UTC (permalink / raw)
  To: Sean Christopherson, Kuppuswamy Sathyanarayanan, xiaoyao.li
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin, Dave Hansen, Tony Luck,
	Dan Williams, Andi Kleen, Kirill Shutemov,
	Kuppuswamy Sathyanarayanan, linux-kernel

+ Xiaoyao

On 10/13/21 1:30 PM, Sean Christopherson wrote:
> On Tue, Oct 05, 2021, Kuppuswamy Sathyanarayanan wrote:
>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>>
>> If running as guest and hypervisor enables
>> MSR_TEST_CTRL.SPLIT_LOCK_DETECT during its running, it can get split
>> lock #AC even though sld_state is sld_off.
> That's a hypervisor bug, no?  The hypervisor should never inject a fault that
> the guest cannot reasonably expect.
>
>> For kernel mode #AC, it always dies("split lock"), no more action
>> needed.
>>
>> For user mode #AC, it should treat sld_off (default state when feature
>> is not available) as fatal as well.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>> ---
>>   arch/x86/kernel/cpu/intel.c | 7 ++++++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
>> index 01d7935feaed..47f0bc95ce2a 100644
>> --- a/arch/x86/kernel/cpu/intel.c
>> +++ b/arch/x86/kernel/cpu/intel.c
>> @@ -1190,7 +1190,12 @@ static void bus_lock_init(void)
>>   
>>   bool handle_user_split_lock(struct pt_regs *regs, long error_code)
>>   {
>> -	if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
>> +	/*
>> +	 * In virtualization environment, it can get split lock #AC even when
>> +	 * sld_off but hypervisor enables it.
>> +	 * Thus only handles when sld_warn explicitly.
>> +	 */
>> +	if ((regs->flags & X86_EFLAGS_AC) || sld_state != sld_warn)
>>   		return false;
>>   	split_lock_warn(regs->ip);
>>   	return true;
>> -- 
>> 2.25.1
>>
-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer



* Re: [PATCH v7 6/6] x86/split_lock: Fix the split lock #AC handling when running as guest
  2021-10-13 21:32     ` Sathyanarayanan Kuppuswamy
@ 2021-10-14  1:24       ` Xiaoyao Li
  2021-10-14  2:21         ` Sathyanarayanan Kuppuswamy
  2021-10-14 15:04         ` Sean Christopherson
  0 siblings, 2 replies; 17+ messages in thread
From: Xiaoyao Li @ 2021-10-14  1:24 UTC (permalink / raw)
  To: Sathyanarayanan Kuppuswamy, Sean Christopherson,
	Kuppuswamy Sathyanarayanan, xiaoyao.li
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin, Dave Hansen, Tony Luck,
	Dan Williams, Andi Kleen, Kirill Shutemov,
	Kuppuswamy Sathyanarayanan, linux-kernel

On 10/14/2021 5:32 AM, Sathyanarayanan Kuppuswamy wrote:
> + Xiaoyao
> 
> On 10/13/21 1:30 PM, Sean Christopherson wrote:
>> On Tue, Oct 05, 2021, Kuppuswamy Sathyanarayanan wrote:
>>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>>>
>>> If running as guest and hypervisor enables
>>> MSR_TEST_CTRL.SPLIT_LOCK_DETECT during its running, it can get split
>>> lock #AC even though sld_state is sld_off.
>> That's a hypervisor bug, no?  The hypervisor should never inject a 
>> fault that
>> the guest cannot reasonably expect.

What if the hypervisor doesn't intercept #AC and the host enables
SPLIT_LOCK_DETECT while the guest is running? That's exactly the case TDX
is facing.

(BTW, this patch is incomplete: no matter what sld_state is, the #AC
should always be treated as fatal, even for sld_warn, because sld_warn
doesn't guarantee that SPLIT_LOCK_DETECT is available.)

Sathya, we need to drop this one. Andi has another one to disable split
lock detection for SPR. That's a better direction.

>>> For kernel mode #AC, it always dies("split lock"), no more action
>>> needed.
>>>
>>> For user mode #AC, it should treat sld_off (default state when feature
>>> is not available) as fatal as well.
>>>
>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>> Signed-off-by: Kuppuswamy Sathyanarayanan 
>>> <sathyanarayanan.kuppuswamy@linux.intel.com>
>>> ---
>>>   arch/x86/kernel/cpu/intel.c | 7 ++++++-
>>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
>>> index 01d7935feaed..47f0bc95ce2a 100644
>>> --- a/arch/x86/kernel/cpu/intel.c
>>> +++ b/arch/x86/kernel/cpu/intel.c
>>> @@ -1190,7 +1190,12 @@ static void bus_lock_init(void)
>>>   bool handle_user_split_lock(struct pt_regs *regs, long error_code)
>>>   {
>>> -    if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
>>> +    /*
>>> +     * In virtualization environment, it can get split lock #AC even 
>>> when
>>> +     * sld_off but hypervisor enables it.
>>> +     * Thus only handles when sld_warn explicitly.
>>> +     */
>>> +    if ((regs->flags & X86_EFLAGS_AC) || sld_state != sld_warn)
>>>           return false;
>>>       split_lock_warn(regs->ip);
>>>       return true;
>>> -- 
>>> 2.25.1
>>>
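
The effect of the one-line change quoted above can be modelled in user space.
This is a minimal sketch, not kernel code: the enum lists only the three
states named in this thread, and handled_before(), handled_after() and
sld_model_selfcheck() are hypothetical names introduced for illustration.
A return value of false stands for "not handled here", which the patch
description treats as fatal for the user task.

```c
#include <assert.h>
#include <stdbool.h>

/* Only the states named in the thread; the values are illustrative. */
enum split_lock_detect_state { sld_off, sld_warn, sld_fatal };

/* Condition before the patch: refuse only EFLAGS.AC faults and sld_fatal. */
bool handled_before(bool eflags_ac, enum split_lock_detect_state s)
{
	if (eflags_ac || s == sld_fatal)
		return false;
	return true;	/* the kernel would call split_lock_warn() here */
}

/* Condition after the patch: refuse everything except sld_warn. */
bool handled_after(bool eflags_ac, enum split_lock_detect_state s)
{
	if (eflags_ac || s != sld_warn)
		return false;
	return true;
}

/* The two conditions disagree only for sld_off with EFLAGS.AC clear. */
void sld_model_selfcheck(void)
{
	assert(handled_before(false, sld_off));
	assert(!handled_after(false, sld_off));
	assert(handled_before(false, sld_warn) && handled_after(false, sld_warn));
	assert(!handled_before(false, sld_fatal) && !handled_after(false, sld_fatal));
}
```

In this model sld_warn still takes the warn path after the patch, which is
the case the reply above calls incomplete.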



* Re: [PATCH v7 6/6] x86/split_lock: Fix the split lock #AC handling when running as guest
  2021-10-14  1:24       ` Xiaoyao Li
@ 2021-10-14  2:21         ` Sathyanarayanan Kuppuswamy
  2021-10-14 15:04         ` Sean Christopherson
  1 sibling, 0 replies; 17+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2021-10-14  2:21 UTC (permalink / raw)
  To: Xiaoyao Li, Sean Christopherson, Kuppuswamy Sathyanarayanan
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin, Dave Hansen, Tony Luck,
	Dan Williams, Andi Kleen, Kirill Shutemov,
	Kuppuswamy Sathyanarayanan, linux-kernel


On 10/13/21 6:24 PM, Xiaoyao Li wrote:
> On 10/14/2021 5:32 AM, Sathyanarayanan Kuppuswamy wrote:
>> + Xiaoyao
>>
>> On 10/13/21 1:30 PM, Sean Christopherson wrote:
>>> On Tue, Oct 05, 2021, Kuppuswamy Sathyanarayanan wrote:
>>>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>>>>
>>>> If running as guest and hypervisor enables
>>>> MSR_TEST_CTRL.SPLIT_LOCK_DETECT during its running, it can get split
>>>> lock #AC even though sld_state is sld_off.
>>> That's a hypervisor bug, no?  The hypervisor should never inject a 
>>> fault that
>>> the guest cannot reasonably expect.
>
> What if hypervisor doesn't intercept #AC and host enables 
> SPLIT_LOCK_DETECT during guest running? That's exactly the case TDX is 
> facing.
>
> (BTW, this patch is not complete that no matter what state sld_state 
> is, it should always treat it as fatal even sld_warn because sld_warn 
> doesn't guarantee SPLIT_LOCK_DETECT is available)
>
> Sathya, we need to drop this one. Andi has another one to disable split 
> lock detection for SPR. That's a better direction.

Ok. I will fix this in the next submission.

>
>>>> For kernel mode #AC, it always dies("split lock"), no more action
>>>> needed.
>>>>
>>>> For user mode #AC, it should treat sld_off (default state when feature
>>>> is not available) as fatal as well.
>>>>
>>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>>> Signed-off-by: Kuppuswamy Sathyanarayanan 
>>>> <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>> ---
>>>>   arch/x86/kernel/cpu/intel.c | 7 ++++++-
>>>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
>>>> index 01d7935feaed..47f0bc95ce2a 100644
>>>> --- a/arch/x86/kernel/cpu/intel.c
>>>> +++ b/arch/x86/kernel/cpu/intel.c
>>>> @@ -1190,7 +1190,12 @@ static void bus_lock_init(void)
>>>>   bool handle_user_split_lock(struct pt_regs *regs, long error_code)
>>>>   {
>>>> -    if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
>>>> +    /*
>>>> +     * In virtualization environment, it can get split lock #AC 
>>>> even when
>>>> +     * sld_off but hypervisor enables it.
>>>> +     * Thus only handles when sld_warn explicitly.
>>>> +     */
>>>> +    if ((regs->flags & X86_EFLAGS_AC) || sld_state != sld_warn)
>>>>           return false;
>>>>       split_lock_warn(regs->ip);
>>>>       return true;
>>>> -- 
>>>> 2.25.1
>>>>
>
-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer



* Re: [PATCH v7 6/6] x86/split_lock: Fix the split lock #AC handling when running as guest
  2021-10-14  1:24       ` Xiaoyao Li
  2021-10-14  2:21         ` Sathyanarayanan Kuppuswamy
@ 2021-10-14 15:04         ` Sean Christopherson
  2021-10-15  1:30           ` Xiaoyao Li
  1 sibling, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2021-10-14 15:04 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Sathyanarayanan Kuppuswamy, Kuppuswamy Sathyanarayanan,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin, Dave Hansen, Tony Luck,
	Dan Williams, Andi Kleen, Kirill Shutemov,
	Kuppuswamy Sathyanarayanan, linux-kernel

On Thu, Oct 14, 2021, Xiaoyao Li wrote:
> On 10/14/2021 5:32 AM, Sathyanarayanan Kuppuswamy wrote:
> > + Xiaoyao
> > 
> > On 10/13/21 1:30 PM, Sean Christopherson wrote:
> > > On Tue, Oct 05, 2021, Kuppuswamy Sathyanarayanan wrote:
> > > > From: Xiaoyao Li <xiaoyao.li@intel.com>
> > > > 
> > > > If running as guest and hypervisor enables
> > > > MSR_TEST_CTRL.SPLIT_LOCK_DETECT during its running, it can get split
> > > > lock #AC even though sld_state is sld_off.
> > > That's a hypervisor bug, no?  The hypervisor should never inject a fault
> > > that the guest cannot reasonably expect.
> 
> What if hypervisor doesn't intercept #AC and host enables SPLIT_LOCK_DETECT
> during guest running? That's exactly the case TDX is facing.

That's a hypervisor bug.  Since it sounds like the TDX Module buries its head in
the sand for split-lock #AC, KVM should refuse to run TDX guests if split-lock #AC
is enabled.  Ideally the TDX Module would provide support for conditionally
intercepting #AC, e.g. intercept and re-inject "normal" #AC, and exit to the VMM
for split-lock #AC.  That would give VMMs the option of enabling split-lock
detection in fatal mode for guests.


* Re: [PATCH v7 6/6] x86/split_lock: Fix the split lock #AC handling when running as guest
  2021-10-14 15:04         ` Sean Christopherson
@ 2021-10-15  1:30           ` Xiaoyao Li
  0 siblings, 0 replies; 17+ messages in thread
From: Xiaoyao Li @ 2021-10-15  1:30 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Sathyanarayanan Kuppuswamy, Kuppuswamy Sathyanarayanan,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin, Dave Hansen, Tony Luck,
	Dan Williams, Andi Kleen, Kirill Shutemov,
	Kuppuswamy Sathyanarayanan, linux-kernel

On 10/14/2021 11:04 PM, Sean Christopherson wrote:
> On Thu, Oct 14, 2021, Xiaoyao Li wrote:
>> On 10/14/2021 5:32 AM, Sathyanarayanan Kuppuswamy wrote:
>>> + Xiaoyao
>>>
>>> On 10/13/21 1:30 PM, Sean Christopherson wrote:
>>>> On Tue, Oct 05, 2021, Kuppuswamy Sathyanarayanan wrote:
>>>>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>>>>>
>>>>> If running as guest and hypervisor enables
>>>>> MSR_TEST_CTRL.SPLIT_LOCK_DETECT during its running, it can get split
>>>>> lock #AC even though sld_state is sld_off.
>>>> That's a hypervisor bug, no?  The hypervisor should never inject a fault
>>>> that the guest cannot reasonably expect.
>>
>> What if hypervisor doesn't intercept #AC and host enables SPLIT_LOCK_DETECT
>> during guest running? That's exactly the case TDX is facing.
> 
> That's a hypervisor bug.  Since it sounds like the TDX Module buries its head in
> the sand for split-lock #AC, KVM should refuse to run TDX guests if split-lock #AC
> is enabled.  Ideally the TDX Module would provide support for conditionally
> intercepting #AC, e.g. intercept and re-inject "normal" #AC, and exit to the VMM
> for split-lock #AC.  That would give VMMs the option of enabling split-lock
> detection in fatal mode for guests.
> 

We have the bus lock VM exit for that.


* Re: [PATCH v7 3/6] x86/topology: Disable CPU online/offline control for TDX guest
  2021-10-05 23:05 ` [PATCH v7 3/6] x86/topology: Disable CPU online/offline control for TDX guest Kuppuswamy Sathyanarayanan
@ 2021-10-17 19:23   ` Thomas Gleixner
  2021-10-17 19:28     ` Sathyanarayanan Kuppuswamy
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2021-10-17 19:23 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	Kuppuswamy Sathyanarayanan, linux-kernel

On Tue, Oct 05 2021 at 16:05, Kuppuswamy Sathyanarayanan wrote:
>  
> +static int tdx_cpu_offline_prepare(unsigned int cpu)
> +{
> +	/*
> +	 * Per Intel TDX Virtual Firmware Design Guide,
> +	 * sec 4.3.5 and sec 9.4, Hotplug is not supported
> +	 * in TDX platforms. So don't support CPU
> +	 * offline feature once it is turned on.
> +	 */
> +	return -EOPNOTSUPP;
> +}
> +
>  unsigned long tdx_get_ve_info(struct ve_info *ve)
>  {
>  	struct tdx_module_output out = {0};
> @@ -451,5 +464,8 @@ void __init tdx_early_init(void)
>  	pv_ops.irq.safe_halt = tdx_safe_halt;
>  	pv_ops.irq.halt = tdx_halt;
>  
> +	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "tdx:cpu_hotplug",
> +			  NULL, tdx_cpu_offline_prepare);

Seriously? This lets the unplug start, which starts to kick off tasks
from the CPU just to make it fail a few steps later?

The obvious place to prevent this is the CPU hotplug code itself, right?

Thanks,

        tglx
---
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 192e43a87407..c544eb6c79d3 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1178,6 +1178,8 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
 
 static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
 {
+	if (cc_platform_has(CC_HOTPLUG_DISABLED))
+		return -ENOTSUPP;
 	if (cpu_hotplug_disabled)
 		return -EBUSY;
 	return _cpu_down(cpu, 0, target);
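
The difference between the two approaches can be sketched in user space.
This is a rough model under stated assumptions, not kernel code: struct
hotplug_model and both functions are hypothetical, teardown_steps_run
stands in for whatever work _cpu_down() has started, and MODEL_ENOTSUPP
mirrors the kernel-internal ENOTSUPP value (524), which <errno.h> does not
export.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal errno used in the diff above; defined here for the model. */
#define MODEL_ENOTSUPP 524

/* Hypothetical stand-ins for kernel state. */
struct hotplug_model {
	bool cc_hotplug_disabled;	/* models cc_platform_has(CC_HOTPLUG_DISABLED) */
	bool cpu_hotplug_disabled;	/* models cpu_hotplug_disabled being non-zero */
	int teardown_steps_run;		/* teardown work already started */
};

/* Check in the hotplug core: refuse before any teardown work begins. */
int model_cpu_down(struct hotplug_model *m)
{
	assert(m != NULL);
	if (m->cc_hotplug_disabled)
		return -MODEL_ENOTSUPP;
	if (m->cpu_hotplug_disabled)
		return -EBUSY;
	m->teardown_steps_run++;	/* stands in for _cpu_down() */
	return 0;
}

/* Check in a teardown callback: work has started by the time it fails. */
int model_cpu_down_via_callback(struct hotplug_model *m)
{
	assert(m != NULL);
	if (m->cpu_hotplug_disabled)
		return -EBUSY;
	m->teardown_steps_run++;	/* teardown begins first */
	if (m->cc_hotplug_disabled)
		return -EOPNOTSUPP;	/* the callback rejects it afterwards */
	return 0;
}
```

With cc_hotplug_disabled set, the first function returns with
teardown_steps_run still at 0 and the second returns with it at 1. Note
that the two patches also return different error codes (-ENOTSUPP here,
-EOPNOTSUPP in the original callback).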


        


* Re: [PATCH v7 3/6] x86/topology: Disable CPU online/offline control for TDX guest
  2021-10-17 19:23   ` Thomas Gleixner
@ 2021-10-17 19:28     ` Sathyanarayanan Kuppuswamy
  0 siblings, 0 replies; 17+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2021-10-17 19:28 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	linux-kernel


On 10/17/21 12:23 PM, Thomas Gleixner wrote:
> Seriously? This lets the unplug start, which starts to kick off tasks
> from the CPU just to make it fail a few steps later?
>
> The obvious place to prevent this is the CPU hotplug code itself, right?
>
> Thanks,
>
>          tglx
> ---
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 192e43a87407..c544eb6c79d3 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1178,6 +1178,8 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
>   
>   static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
>   {
> +	if (cc_platform_has(CC_HOTPLUG_DISABLED))
> +		return -ENOTSUPP;
>   	if (cpu_hotplug_disabled)
>   		return -EBUSY;
>   	return _cpu_down(cpu, 0, target);

Makes sense. I will use it in the next version.

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer



* Re: [PATCH v7 2/6] x86/boot: Avoid #VE during boot for TDX platforms
  2021-10-05 23:05 ` [PATCH v7 2/6] x86/boot: Avoid #VE during boot for TDX platforms Kuppuswamy Sathyanarayanan
@ 2021-10-17 19:49   ` Thomas Gleixner
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2021-10-17 19:49 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan, Ingo Molnar, Borislav Petkov, x86,
	Paolo Bonzini, David Hildenbrand, Andrea Arcangeli,
	Josh Poimboeuf, H . Peter Anvin
  Cc: Dave Hansen, Tony Luck, Dan Williams, Andi Kleen,
	Kirill Shutemov, Sean Christopherson, Kuppuswamy Sathyanarayanan,
	Kuppuswamy Sathyanarayanan, linux-kernel

On Tue, Oct 05 2021 at 16:05, Kuppuswamy Sathyanarayanan wrote:
>  
>  	/* Enable PAE and LA57 (if required) paging modes */
> -	movl	$X86_CR4_PAE, %eax
> +	movl	%cr4, %eax
> +	/*
> +	 * Clear all bits except CR4.MCE, which is preserved.
> +	 * Clearing CR4.MCE will #VE in TDX guests.

Sure. But what's the side effect for non TDX?

> +	 */
> +	andl	$X86_CR4_MCE, %eax
> +	orl	$X86_CR4_PAE, %eax
>  	testl	%edx, %edx
>  	jz	1f
>  	orl	$X86_CR4_LA57, %eax
> @@ -662,8 +670,8 @@ SYM_CODE_START(trampoline_32bit_src)
>  	pushl	$__KERNEL_CS
>  	pushl	%eax
>  
> -	/* Enable paging again */
> -	movl	$(X86_CR0_PG | X86_CR0_PE), %eax
> +	/* Enable paging again. Avoid clearing X86_CR0_NE for TDX */

Ditto.

The changelog is not providing any information either.

Also instead of '... TDX' all over the place please add sensible defines
and add comments to those in one place. There is really no need to
sprinkle TDX all over the place.

Thanks,

        tglx
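
The CR4 arithmetic in the quoted hunk can be worked through in C. This is
a minimal sketch, not the trampoline itself: the bit positions are the
architectural ones (PAE bit 5, MCE bit 6, LA57 bit 12), while
trampoline_cr4() and CR4_PRESERVE_MASK are hypothetical names, the latter
showing one way a single commented define could replace per-site TDX
remarks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_CR4_PAE	(1u << 5)
#define X86_CR4_MCE	(1u << 6)
#define X86_CR4_LA57	(1u << 12)

/*
 * Hypothetical define: CR4 bits carried over from the current value.
 * Per the patch comment, clearing CR4.MCE causes a #VE in TDX guests.
 */
#define CR4_PRESERVE_MASK	X86_CR4_MCE

/* Computes the value the patched sequence would load into CR4. */
uint32_t trampoline_cr4(uint32_t old_cr4, bool want_la57)
{
	uint32_t cr4 = old_cr4 & CR4_PRESERVE_MASK;	/* andl $X86_CR4_MCE, %eax */

	cr4 |= X86_CR4_PAE;				/* orl  $X86_CR4_PAE, %eax */
	if (want_la57)					/* testl %edx, %edx; jz 1f */
		cr4 |= X86_CR4_LA57;

	/* No bit outside these three can survive the mask. */
	assert((cr4 & ~(X86_CR4_MCE | X86_CR4_PAE | X86_CR4_LA57)) == 0);
	return cr4;
}
```

In this model the only difference from the old "movl $X86_CR4_PAE, %eax"
is that MCE stays set when it was already set on entry; with MCE clear on
entry the result is identical to the old code.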

