LKML Archive on lore.kernel.org
* [PATCH v2 0/2] x86/asm: Pin sensitive CR4 and CR0 bits
@ 2019-06-04 23:44 Kees Cook
  2019-06-04 23:44 ` [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits Kees Cook
  2019-06-04 23:44 ` [PATCH v2 2/2] x86/asm: Pin sensitive CR0 bits Kees Cook
  0 siblings, 2 replies; 9+ messages in thread
From: Kees Cook @ 2019-06-04 23:44 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kees Cook, Linus Torvalds, x86, Peter Zijlstra, Dave Hansen,
	linux-kernel

Hi,

Here's a v2 that hopefully addresses the concerns from the v1 thread[1] on
CR4 pinning. Now it's using static branches to avoid potential atomicity
problems (though perhaps that is overkill), and it has dropped the
needless volatile marking in favor of proper asm constraint flags. The
one piece that eluded me, but which I think is okay, is delaying bit
setting on a per-CPU basis. But since the bits are global state and we
don't have read-only per-CPU data, it seemed safe as I've got it here.

Full patch 1 commit log follows, just in case it's useful to have it
in this cover letter...

[1] https://lkml.kernel.org/r/CAHk-=wjNes0wn0KUMMY6dOK_sN69z2TiGpDZ2cyzYF07s64bXQ@mail.gmail.com

-Kees


Several recent exploits have used direct calls to the native_write_cr4()
function to disable SMEP and SMAP before continuing their exploits
using userspace memory access. This pins bits of CR4 so that they cannot
be changed through a common function. This is not intended to be general
ROP protection (which would require CFI to defend against properly), but
rather a way to avoid trivial direct function calling (or CFI bypasses
via a matching function prototype) as seen in:

https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-packet.html
(https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-7308)

The goals of this change:
 - pin specific bits (SMEP, SMAP, and UMIP) when writing CR4.
 - avoid setting the bits too early (they must become pinned only after
   CPU feature detection and selection has finished).
 - keep the pinning mask read-only during normal runtime.
 - check the pinning after the register write to catch jumps past the
   preceding "or".

The mask is marked __ro_after_init so that it cannot first be disabled
with a malicious write.

Since these bits are global state (once established by the boot CPU
and kernel boot parameters), they are safe to write to secondary CPUs
before those CPUs have finished feature detection. As such, the bits are
written with an "or" performed before the register write, as that is
both simpler and avoids needing storage in a location we don't have:
read-only per-CPU data. (Note that initialization via cr4_init_shadow()
isn't early enough to avoid early native_write_cr4() calls.)

A check is performed after the register write because an attack could
simply jump over the "or" preceding the register write. Such a direct
jump is possible because of how the compiler may build this function
(especially with frame pointers removed): it may not set up a stack
frame (function exit may be just a retq without pops), which is
sufficient for trivial exploitation like the timer overwrites mentioned
above.

The asm argument constraints gain the "+" modifier so that the compiler
makes no ordering assumptions about the arguments or memory, and treats
them as changed.

---
v2:
- move setup until after CPU feature detection and selection.
- refactor to use static branches to have atomic enabling.
- only perform the "or" after a failed check.
---

Kees Cook (2):
  x86/asm: Pin sensitive CR4 bits
  x86/asm: Pin sensitive CR0 bits

 arch/x86/include/asm/special_insns.h | 41 ++++++++++++++++++++++++++--
 arch/x86/kernel/cpu/common.c         | 18 ++++++++++++
 2 files changed, 57 insertions(+), 2 deletions(-)

-- 
2.17.1



* [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits
  2019-06-04 23:44 [PATCH v2 0/2] x86/asm: Pin sensitive CR4 and CR0 bits Kees Cook
@ 2019-06-04 23:44 ` Kees Cook
  2019-06-05 14:28   ` Kees Cook
  2019-06-14 14:57   ` Thomas Gleixner
  2019-06-04 23:44 ` [PATCH v2 2/2] x86/asm: Pin sensitive CR0 bits Kees Cook
  1 sibling, 2 replies; 9+ messages in thread
From: Kees Cook @ 2019-06-04 23:44 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kees Cook, Linus Torvalds, x86, Peter Zijlstra, Dave Hansen,
	linux-kernel

Several recent exploits have used direct calls to the native_write_cr4()
function to disable SMEP and SMAP before continuing their exploits
using userspace memory access. This pins bits of CR4 so that they cannot
be changed through a common function. This is not intended to be general
ROP protection (which would require CFI to defend against properly), but
rather a way to avoid trivial direct function calling (or CFI bypasses
via a matching function prototype) as seen in:

https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-packet.html
(https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-7308)

The goals of this change:
 - pin specific bits (SMEP, SMAP, and UMIP) when writing CR4.
 - avoid setting the bits too early (they must become pinned only after
   CPU feature detection and selection has finished).
 - keep the pinning mask read-only during normal runtime.
 - check the pinning after the register write to catch jumps past the
   preceding "or".

The mask is marked __ro_after_init so that it cannot first be disabled
with a malicious write.

Since these bits are global state (once established by the boot CPU
and kernel boot parameters), they are safe to write to secondary CPUs
before those CPUs have finished feature detection. As such, the bits are
written with an "or" performed before the register write, as that is
both simpler and avoids needing storage in a location we don't have:
read-only per-CPU data. (Note that initialization via cr4_init_shadow()
isn't early enough to avoid early native_write_cr4() calls.)

A check is performed after the register write because an attack could
simply jump over the "or" preceding the register write. Such a direct
jump is possible because of how the compiler may build this function
(especially with frame pointers removed): it may not set up a stack
frame (function exit may be just a retq without pops), which is
sufficient for trivial exploitation like the timer overwrites mentioned
above.

The asm argument constraints gain the "+" modifier so that the compiler
makes no ordering assumptions about the arguments or memory, and treats
them as changed.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
v2:
- move setup until after CPU feature detection and selection.
- refactor to use static branches to have atomic enabling.
- only perform the "or" after a failed check.
---
 arch/x86/include/asm/special_insns.h | 24 +++++++++++++++++++++++-
 arch/x86/kernel/cpu/common.c         | 18 ++++++++++++++++++
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 0a3c4cab39db..284a77d52fea 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -6,6 +6,8 @@
 #ifdef __KERNEL__
 
 #include <asm/nops.h>
+#include <asm/processor-flags.h>
+#include <linux/jump_label.h>
 
 /*
  * Volatile isn't enough to prevent the compiler from reordering the
@@ -16,6 +18,10 @@
  */
 extern unsigned long __force_order;
 
+/* Starts false and gets enabled once CPU feature detection is done. */
+DECLARE_STATIC_KEY_FALSE(cr_pinning);
+extern unsigned long cr4_pinned_bits;
+
 static inline unsigned long native_read_cr0(void)
 {
 	unsigned long val;
@@ -74,7 +80,23 @@ static inline unsigned long native_read_cr4(void)
 
 static inline void native_write_cr4(unsigned long val)
 {
-	asm volatile("mov %0,%%cr4": : "r" (val), "m" (__force_order));
+	unsigned long bits_missing = 0;
+
+set_register:
+	if (static_branch_likely(&cr_pinning))
+		val |= cr4_pinned_bits;
+
+	asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
+
+	if (static_branch_likely(&cr_pinning)) {
+		if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
+			bits_missing = ~val & cr4_pinned_bits;
+			goto set_register;
+		}
+		/* Warn after we've set the missing bits. */
+		WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
+			  bits_missing);
+	}
 }
 
 #ifdef CONFIG_X86_64
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 2c57fffebf9b..6b210be12734 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -366,6 +366,23 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
 	cr4_clear_bits(X86_CR4_UMIP);
 }
 
+DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
+unsigned long cr4_pinned_bits __ro_after_init;
+
+/*
+ * Once CPU feature detection is finished (and boot params have been
+ * parsed), record any of the sensitive CR bits that are set, and
+ * enable CR pinning.
+ */
+static void __init setup_cr_pinning(void)
+{
+	unsigned long mask;
+
+	mask = (X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP);
+	cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & mask;
+	static_key_enable(&cr_pinning.key);
+}
+
 /*
  * Protection Keys are not available in 32-bit mode.
  */
@@ -1464,6 +1481,7 @@ void __init identify_boot_cpu(void)
 	enable_sep_cpu();
 #endif
 	cpu_detect_tlb(&boot_cpu_data);
+	setup_cr_pinning();
 }
 
 void identify_secondary_cpu(struct cpuinfo_x86 *c)
-- 
2.17.1



* [PATCH v2 2/2] x86/asm: Pin sensitive CR0 bits
  2019-06-04 23:44 [PATCH v2 0/2] x86/asm: Pin sensitive CR4 and CR0 bits Kees Cook
  2019-06-04 23:44 ` [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits Kees Cook
@ 2019-06-04 23:44 ` Kees Cook
  2019-06-14 14:58   ` Thomas Gleixner
  1 sibling, 1 reply; 9+ messages in thread
From: Kees Cook @ 2019-06-04 23:44 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kees Cook, Linus Torvalds, x86, Peter Zijlstra, Dave Hansen,
	linux-kernel

With sensitive CR4 bits now pinned, it's possible that the WP bit of
CR0 might become a target as well. Following the same reasoning as for
the CR4 pinning, this pins CR0's WP bit (though here it can be done
with a static value).

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/include/asm/special_insns.h | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 284a77d52fea..9c9fd3760079 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -31,7 +31,22 @@ static inline unsigned long native_read_cr0(void)
 
 static inline void native_write_cr0(unsigned long val)
 {
-	asm volatile("mov %0,%%cr0": : "r" (val), "m" (__force_order));
+	unsigned long bits_missing = 0;
+
+set_register:
+	if (static_branch_likely(&cr_pinning))
+		val |= X86_CR0_WP;
+
+	asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order));
+
+	if (static_branch_likely(&cr_pinning)) {
+		if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) {
+			bits_missing = X86_CR0_WP;
+			goto set_register;
+		}
+		/* Warn after we've set the missing bits. */
+		WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n");
+	}
 }
 
 static inline unsigned long native_read_cr2(void)
-- 
2.17.1



* Re: [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits
  2019-06-04 23:44 ` [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits Kees Cook
@ 2019-06-05 14:28   ` Kees Cook
  2019-06-14 14:57   ` Thomas Gleixner
  1 sibling, 0 replies; 9+ messages in thread
From: Kees Cook @ 2019-06-05 14:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, x86, Peter Zijlstra, Dave Hansen, linux-kernel

On Tue, Jun 04, 2019 at 04:44:21PM -0700, Kees Cook wrote:
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 2c57fffebf9b..6b210be12734 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -366,6 +366,23 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
>  	cr4_clear_bits(X86_CR4_UMIP);
>  }
>  
> +DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
> +unsigned long cr4_pinned_bits __ro_after_init;

I missed EXPORT_SYMBOL()s for these -- I will fix in v3.

-- 
Kees Cook


* Re: [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits
  2019-06-04 23:44 ` [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits Kees Cook
  2019-06-05 14:28   ` Kees Cook
@ 2019-06-14 14:57   ` Thomas Gleixner
  2019-06-15  3:19     ` Kees Cook
  1 sibling, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2019-06-14 14:57 UTC (permalink / raw)
  To: Kees Cook; +Cc: Linus Torvalds, x86, Peter Zijlstra, Dave Hansen, linux-kernel

On Tue, 4 Jun 2019, Kees Cook wrote:
> ---
> v2:
> - move setup until after CPU feature detection and selection.
> - refactor to use static branches to have atomic enabling.
> - only perform the "or" after a failed check.

Maybe I'm missing something, but

>  static inline unsigned long native_read_cr0(void)
>  {
>  	unsigned long val;
> @@ -74,7 +80,23 @@ static inline unsigned long native_read_cr4(void)
>  
>  static inline void native_write_cr4(unsigned long val)
>  {
> -	asm volatile("mov %0,%%cr4": : "r" (val), "m" (__force_order));
> +	unsigned long bits_missing = 0;
> +
> +set_register:
> +	if (static_branch_likely(&cr_pinning))
> +		val |= cr4_pinned_bits;

The or happens before the first write and therefore

> +	asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
> +
> +	if (static_branch_likely(&cr_pinning)) {
> +		if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {

this path can never be entered when the function is called properly. Sure,
for the attack scenario the jump directly to the mov cr4 instruction is
caught.

Wouldn't it make sense to catch situations where a regular caller provides
@val with pinned bits unset? I.e. move the OR into this code path after
storing bits_missing.

> +			bits_missing = ~val & cr4_pinned_bits;
> +			goto set_register;
> +		}
> +		/* Warn after we've set the missing bits. */
> +		WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
> +			  bits_missing);
> +	}
>  }

Something like this:

	unsigned long bits_missing = 0;

again:
	asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));

	if (static_branch_likely(&cr_pinning)) {
		if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
			bits_missing = ~val & cr4_pinned_bits;
			val |= cr4_pinned_bits;
			goto again;
		}
		/* Warn after we've set the missing bits. */
		WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
			  bits_missing);
	}

Thanks,

	tglx


* Re: [PATCH v2 2/2] x86/asm: Pin sensitive CR0 bits
  2019-06-04 23:44 ` [PATCH v2 2/2] x86/asm: Pin sensitive CR0 bits Kees Cook
@ 2019-06-14 14:58   ` Thomas Gleixner
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2019-06-14 14:58 UTC (permalink / raw)
  To: Kees Cook; +Cc: Linus Torvalds, x86, Peter Zijlstra, Dave Hansen, linux-kernel

On Tue, 4 Jun 2019, Kees Cook wrote:

> With sensitive CR4 bits pinned now, it's possible that the WP bit for
> CR0 might become a target as well. Following the same reasoning for
> the CR4 pinning, this pins CR0's WP bit (but this can be done with a
> static value).
> 
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  arch/x86/include/asm/special_insns.h | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> index 284a77d52fea..9c9fd3760079 100644
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -31,7 +31,22 @@ static inline unsigned long native_read_cr0(void)
>  
>  static inline void native_write_cr0(unsigned long val)
>  {
> -	asm volatile("mov %0,%%cr0": : "r" (val), "m" (__force_order));
> +	unsigned long bits_missing = 0;
> +
> +set_register:
> +	if (static_branch_likely(&cr_pinning))
> +		val |= X86_CR0_WP;
> +
> +	asm volatile("mov %0,%%cr0": "+r" (val), "+m" (__force_order));
> +
> +	if (static_branch_likely(&cr_pinning)) {
> +		if (unlikely((val & X86_CR0_WP) != X86_CR0_WP)) {
> +			bits_missing = X86_CR0_WP;
> +			goto set_register;

Same comment as for the cr4 variant.

Thanks,

	tglx


* Re: [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits
  2019-06-14 14:57   ` Thomas Gleixner
@ 2019-06-15  3:19     ` Kees Cook
  2019-06-16 22:26       ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Kees Cook @ 2019-06-15  3:19 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, x86, Peter Zijlstra, Dave Hansen, linux-kernel

On Fri, Jun 14, 2019 at 04:57:36PM +0200, Thomas Gleixner wrote:
> Wouldn't it make sense to catch situations where a regular caller provides
> @val with pinned bits unset? I.e. move the OR into this code path after
> storing bits_missing.

I mentioned this in the commit log, but maybe I wasn't very clear:

> > Since these bits are global state (once established by the boot CPU
> > and kernel boot parameters), they are safe to write to secondary CPUs
> > before those CPUs have finished feature detection. As such, the bits are
> > written with an "or" performed before the register write as that is both
> > easier and uses a few bytes less storage of a location we don't have:
> > read-only per-CPU data. (Note that initialization via cr4_init_shadow()
> > isn't early enough to avoid early native_write_cr4() calls.)

Basically, there are calls of native_write_cr4() made very early in
secondary CPU init that would hit the "eek missing bit" case, and using
the cr4_init_shadow() trick mentioned by Linus still wasn't early
enough. So, since feature detection for these bits is "done" (i.e.
secondary CPUs will match the boot CPU's for these bits), it's safe to
set them "early". To avoid this, we'd need a per-cpu "am I ready to set
these bits?" state, and it'd need to be read-only-after-init, which is
not a section that exists and seems wasteful to create (4095 bytes
unused) for this feature alone.

> Something like this:
> 
> 	unsigned long bits_missing = 0;
> 
> again:
> 	asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
> 
> 	if (static_branch_likely(&cr_pinning)) {
> 		if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
> 			bits_missing = ~val & cr4_pinned_bits;
> 			val |= cr4_pinned_bits;
> 			goto again;
> 		}
> 		/* Warn after we've set the missing bits. */
> 		WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
> 			  bits_missing);
> 	}

Yup, that's actually exactly what I had tried. Should I try to track down
the very early callers and OR in the bits at the call sites instead? (This
had occurred to me also, but it seemed operationally equivalent to ORing
at the start of native_write_cr4(), so I didn't even bother trying it).

-- 
Kees Cook


* Re: [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits
  2019-06-15  3:19     ` Kees Cook
@ 2019-06-16 22:26       ` Thomas Gleixner
  2019-06-18  4:09         ` Kees Cook
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2019-06-16 22:26 UTC (permalink / raw)
  To: Kees Cook; +Cc: Linus Torvalds, x86, Peter Zijlstra, Dave Hansen, linux-kernel

On Fri, 14 Jun 2019, Kees Cook wrote:

> On Fri, Jun 14, 2019 at 04:57:36PM +0200, Thomas Gleixner wrote:
> > Wouldn't it make sense to catch situations where a regular caller provides
> > @val with pinned bits unset? I.e. move the OR into this code path after
> > storing bits_missing.
> 
> I mentioned this in the commit log, but maybe I wasn't very clear:
> 
> > > Since these bits are global state (once established by the boot CPU
> > > and kernel boot parameters), they are safe to write to secondary CPUs
> > > before those CPUs have finished feature detection. As such, the bits are
> > > written with an "or" performed before the register write as that is both
> > > easier and uses a few bytes less storage of a location we don't have:
> > > read-only per-CPU data. (Note that initialization via cr4_init_shadow()
> > > isn't early enough to avoid early native_write_cr4() calls.)

Right, I know, but I was really aiming for the extra benefit of catching
legit kernel callers which feed crap into that function. I'm not so fond of
silently papering over crap.

> Basically, there are calls of native_write_cr4() made very early in
> secondary CPU init that would hit the "eek missing bit" case, and using
> the cr4_init_shadow() trick mentioned by Linus still wasn't early
> enough. So, since feature detection for these bits is "done" (i.e.
> secondary CPUs will match the boot CPU's for these bits), it's safe to
> set them "early". To avoid this, we'd need a per-cpu "am I ready to set
> these bits?" state, and it'd need to be read-only-after-init, which is
> not a section that exists and seems wasteful to create (4095 bytes
> unused) for this feature alone.
> 
> > Something like this:
> > 
> > 	unsigned long bits_missing = 0;
> > 
> > again:
> > 	asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
> > 
> > 	if (static_branch_likely(&cr_pinning)) {
> > 		if (unlikely((val & cr4_pinned_bits) != cr4_pinned_bits)) {
> > 			bits_missing = ~val & cr4_pinned_bits;
> > 			val |= cr4_pinned_bits;
> > 			goto again;
> > 		}
> > 		/* Warn after we've set the missing bits. */
> > 		WARN_ONCE(bits_missing, "CR4 bits went missing: %lx!?\n",
> > 			  bits_missing);
> > 	}
> 
> Yup, that's actually exactly what I had tried. Should I try to track down
> the very early callers and OR in the bits at the call sites instead? (This
> had occurred to me also, but it seemed operationally equivalent to ORing
> at the start of native_write_cr4(), so I didn't even bother trying it).

Nah. The straightforward approach is to do it right when the secondary CPU
comes to life, before it does any cr4 write (except for the code in
entry_64.S), something like the below.

Thanks,

	tglx
	
8<-------------

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 362dd8953f48..f9304b356a83 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -205,13 +205,17 @@ static int enable_start_cpu0;
  */
 static void notrace start_secondary(void *unused)
 {
+	unsigned long cr4 = __read_cr4();
+
 	/*
 	 * Don't put *anything* except direct CPU state initialization
 	 * before cpu_init(), SMP booting is too fragile that we want to
 	 * limit the things done here to the most necessary things.
 	 */
 	if (boot_cpu_has(X86_FEATURE_PCID))
-		__write_cr4(__read_cr4() | X86_CR4_PCIDE);
+		cr4 |= X86_CR4_PCIDE;
+
+	__write_cr4(cr4 | cr4_pinned_bits);
 
 #ifdef CONFIG_X86_32
 	/* switch away from the initial page table */



* Re: [PATCH v2 1/2] x86/asm: Pin sensitive CR4 bits
  2019-06-16 22:26       ` Thomas Gleixner
@ 2019-06-18  4:09         ` Kees Cook
  0 siblings, 0 replies; 9+ messages in thread
From: Kees Cook @ 2019-06-18  4:09 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, x86, Peter Zijlstra, Dave Hansen, linux-kernel

On Mon, Jun 17, 2019 at 12:26:01AM +0200, Thomas Gleixner wrote:
> Nah. The straight forward approach is to do right when the secondary CPU
> comes into life, before it does any cr4 write (except for the code in
> entry_64.S), something like the below.

Okay, will do; thanks! :) I'll respin things and get it tested.

-- 
Kees Cook

