LKML Archive on lore.kernel.org
* [PATCH 0/2] x86: sp0 fixes
@ 2015-03-07 1:50 Andy Lutomirski
2015-03-07 1:50 ` [PATCH 1/2] x86: Delay loading sp0 slightly on task switch Andy Lutomirski
2015-03-07 1:50 ` [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32 Andy Lutomirski
0 siblings, 2 replies; 9+ messages in thread
From: Andy Lutomirski @ 2015-03-07 1:50 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Borislav Petkov, Oleg Nesterov, Denys Vlasenko, Andy Lutomirski
I broke x86_32 and I made an inadvertent change to both bitnesses.
Undo the inadvertent change and fix x86_32.
This isn't as pretty as I hoped. Sorry.
Andy Lutomirski (2):
x86: Delay loading sp0 slightly on task switch
x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on
x86_32
arch/x86/include/asm/processor.h | 11 ++++++++++-
arch/x86/include/asm/thread_info.h | 4 +---
arch/x86/kernel/cpu/common.c | 13 +++++++++++--
arch/x86/kernel/process_32.c | 17 ++++++++++-------
arch/x86/kernel/process_64.c | 6 +++---
arch/x86/kernel/smpboot.c | 2 ++
arch/x86/kernel/traps.c | 4 ++--
7 files changed, 39 insertions(+), 18 deletions(-)
--
2.1.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] x86: Delay loading sp0 slightly on task switch
2015-03-07 1:50 [PATCH 0/2] x86: sp0 fixes Andy Lutomirski
@ 2015-03-07 1:50 ` Andy Lutomirski
2015-03-07 8:37 ` [tip:x86/asm] x86/asm/entry: " tip-bot for Andy Lutomirski
2015-03-07 1:50 ` [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32 Andy Lutomirski
1 sibling, 1 reply; 9+ messages in thread
From: Andy Lutomirski @ 2015-03-07 1:50 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Borislav Petkov, Oleg Nesterov, Denys Vlasenko, Andy Lutomirski
The change:
75182b1632a8 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()")
had the unintended side effect of changing the return value of
current_thread_info() during part of the context switch process.
Change it back.
This has no effect as far as I can tell -- it's just for
consistency.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---
arch/x86/kernel/process_32.c | 10 +++++-----
arch/x86/kernel/process_64.c | 6 +++---
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index d3460af3d27a..0405cab6634d 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -256,11 +256,6 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
fpu = switch_fpu_prepare(prev_p, next_p, cpu);
/*
- * Reload esp0.
- */
- load_sp0(tss, next);
-
- /*
* Save away %gs. No need to save %fs, as it was saved on the
* stack on entry. No need to save %es and %ds, as those are
* always kernel segments while inside the kernel. Doing this
@@ -310,6 +305,11 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
*/
arch_end_context_switch(next_p);
+ /*
+ * Reload esp0. This changes current_thread_info().
+ */
+ load_sp0(tss, next);
+
this_cpu_write(kernel_stack,
(unsigned long)task_stack_page(next_p) +
THREAD_SIZE - KERNEL_STACK_OFFSET);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 2cd562f96c1f..1e393d27d701 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -283,9 +283,6 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
fpu = switch_fpu_prepare(prev_p, next_p, cpu);
- /* Reload esp0 and ss1. */
- load_sp0(tss, next);
-
/* We must save %fs and %gs before load_TLS() because
* %fs and %gs may be cleared by load_TLS().
*
@@ -413,6 +410,9 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
task_thread_info(prev_p)->saved_preempt_count = this_cpu_read(__preempt_count);
this_cpu_write(__preempt_count, task_thread_info(next_p)->saved_preempt_count);
+ /* Reload esp0 and ss1. This changes current_thread_info(). */
+ load_sp0(tss, next);
+
this_cpu_write(kernel_stack,
(unsigned long)task_stack_page(next_p) +
THREAD_SIZE - KERNEL_STACK_OFFSET);
--
2.1.0
* [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32
2015-03-07 1:50 [PATCH 0/2] x86: sp0 fixes Andy Lutomirski
2015-03-07 1:50 ` [PATCH 1/2] x86: Delay loading sp0 slightly on task switch Andy Lutomirski
@ 2015-03-07 1:50 ` Andy Lutomirski
2015-03-07 8:37 ` [tip:x86/asm] x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() " tip-bot for Andy Lutomirski
2015-03-26 13:30 ` [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack " Boris Ostrovsky
1 sibling, 2 replies; 9+ messages in thread
From: Andy Lutomirski @ 2015-03-07 1:50 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Borislav Petkov, Oleg Nesterov, Denys Vlasenko, Andy Lutomirski
I broke 32-bit kernels. The implementation of sp0 was correct as
far as I can tell, but sp0 was much weirder on x86_32 than I
realized. It has the following issues:
- Init's sp0 is inconsistent with everything else's: non-init tasks
are offset by 8 bytes. (I have no idea why, and the comment is unhelpful.)
- vm86 does crazy things to sp0.
Fix it up by replacing this_cpu_sp0() with current_top_of_stack()
and using a new percpu variable to track the top of the stack on
x86_32.
Fixes: 75182b1632a8 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()")
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---
arch/x86/include/asm/processor.h | 11 ++++++++++-
arch/x86/include/asm/thread_info.h | 4 +---
arch/x86/kernel/cpu/common.c | 13 +++++++++++--
arch/x86/kernel/process_32.c | 11 +++++++----
arch/x86/kernel/smpboot.c | 2 ++
arch/x86/kernel/traps.c | 4 ++--
6 files changed, 33 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f5e3ec63767d..48a61c1c626e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -284,6 +284,10 @@ struct tss_struct {
DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss);
+#ifdef CONFIG_X86_32
+DECLARE_PER_CPU(unsigned long, cpu_current_top_of_stack);
+#endif
+
/*
* Save the original ist values for checking stack pointers during debugging
*/
@@ -564,9 +568,14 @@ static inline void native_swapgs(void)
#endif
}
-static inline unsigned long this_cpu_sp0(void)
+static inline unsigned long current_top_of_stack(void)
{
+#ifdef CONFIG_X86_64
return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
+#else
+ /* sp0 on x86_32 is special in and around vm86 mode. */
+ return this_cpu_read_stable(cpu_current_top_of_stack);
+#endif
}
#ifdef CONFIG_PARAVIRT
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index a2fa1899494e..7740edd56fed 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -158,9 +158,7 @@ DECLARE_PER_CPU(unsigned long, kernel_stack);
static inline struct thread_info *current_thread_info(void)
{
- struct thread_info *ti;
- ti = (void *)(this_cpu_sp0() - THREAD_SIZE);
- return ti;
+ return (struct thread_info *)(current_top_of_stack() - THREAD_SIZE);
}
static inline unsigned long current_stack_pointer(void)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 5d0f0cc7ea26..76348334b934 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1130,8 +1130,8 @@ DEFINE_PER_CPU_FIRST(union irq_stack_union,
irq_stack_union) __aligned(PAGE_SIZE) __visible;
/*
- * The following four percpu variables are hot. Align current_task to
- * cacheline size such that all four fall in the same cacheline.
+ * The following percpu variables are hot. Align current_task to
+ * cacheline size such that they fall in the same cacheline.
*/
DEFINE_PER_CPU(struct task_struct *, current_task) ____cacheline_aligned =
&init_task;
@@ -1226,6 +1226,15 @@ DEFINE_PER_CPU(int, __preempt_count) = INIT_PREEMPT_COUNT;
EXPORT_PER_CPU_SYMBOL(__preempt_count);
DEFINE_PER_CPU(struct task_struct *, fpu_owner_task);
+/*
+ * On x86_32, vm86 modifies tss.sp0, so sp0 isn't a reliable way to find
+ * the top of the kernel stack. Use an extra percpu variable to track the
+ * top of the kernel stack directly.
+ */
+DEFINE_PER_CPU(unsigned long, cpu_current_top_of_stack) =
+ (unsigned long)&init_thread_union + THREAD_SIZE;
+EXPORT_PER_CPU_SYMBOL(cpu_current_top_of_stack);
+
#ifdef CONFIG_CC_STACKPROTECTOR
DEFINE_PER_CPU_ALIGNED(struct stack_canary, stack_canary);
#endif
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 0405cab6634d..1b9963faf4eb 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -306,13 +306,16 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
arch_end_context_switch(next_p);
/*
- * Reload esp0. This changes current_thread_info().
+ * Reload esp0, kernel_stack, and current_top_of_stack. This changes
+ * current_thread_info().
*/
load_sp0(tss, next);
-
this_cpu_write(kernel_stack,
- (unsigned long)task_stack_page(next_p) +
- THREAD_SIZE - KERNEL_STACK_OFFSET);
+ (unsigned long)task_stack_page(next_p) +
+ THREAD_SIZE - KERNEL_STACK_OFFSET);
+ this_cpu_write(cpu_current_top_of_stack,
+ (unsigned long)task_stack_page(next_p) +
+ THREAD_SIZE);
/*
* Restore %gs if needed (which is common)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index febc6aabc72e..759388c538cf 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -806,6 +806,8 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
#ifdef CONFIG_X86_32
/* Stack for startup_32 can be just as for start_secondary onwards */
irq_ctx_init(cpu);
+ per_cpu(cpu_current_top_of_stack, cpu) =
+ (unsigned long)task_stack_page(idle) + THREAD_SIZE;
#else
clear_tsk_thread_flag(idle, TIF_FORK);
initial_gs = per_cpu_offset(cpu);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index fa290586ed37..081252c44cde 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -174,8 +174,8 @@ void ist_begin_non_atomic(struct pt_regs *regs)
* will catch asm bugs and any attempt to use ist_preempt_enable
* from double_fault.
*/
- BUG_ON((unsigned long)(this_cpu_sp0() - current_stack_pointer()) >=
- THREAD_SIZE);
+ BUG_ON((unsigned long)(current_top_of_stack() -
+ current_stack_pointer()) >= THREAD_SIZE);
preempt_count_sub(HARDIRQ_OFFSET);
}
--
2.1.0
* [tip:x86/asm] x86/asm/entry: Delay loading sp0 slightly on task switch
2015-03-07 1:50 ` [PATCH 1/2] x86: Delay loading sp0 slightly on task switch Andy Lutomirski
@ 2015-03-07 8:37 ` tip-bot for Andy Lutomirski
0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-03-07 8:37 UTC (permalink / raw)
To: linux-tip-commits
Cc: luto, linux-kernel, oleg, mingo, tglx, torvalds, dvlasenk, hpa, bp
Commit-ID: b27559a433bb6080d95c2593d4a2b81401197911
Gitweb: http://git.kernel.org/tip/b27559a433bb6080d95c2593d4a2b81401197911
Author: Andy Lutomirski <luto@amacapital.net>
AuthorDate: Fri, 6 Mar 2015 17:50:18 -0800
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 7 Mar 2015 09:34:03 +0100
x86/asm/entry: Delay loading sp0 slightly on task switch
The change:
75182b1632a8 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()")
had the unintended side effect of changing the return value of
current_thread_info() during part of the context switch process.
Change it back.
This has no effect as far as I can tell -- it's just for
consistency.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9fcaa47dd8487db59eed7a3911b6ae409476763e.1425692936.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/process_32.c | 10 +++++-----
arch/x86/kernel/process_64.c | 6 +++---
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index d3460af..0405cab 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -256,11 +256,6 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
fpu = switch_fpu_prepare(prev_p, next_p, cpu);
/*
- * Reload esp0.
- */
- load_sp0(tss, next);
-
- /*
* Save away %gs. No need to save %fs, as it was saved on the
* stack on entry. No need to save %es and %ds, as those are
* always kernel segments while inside the kernel. Doing this
@@ -310,6 +305,11 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
*/
arch_end_context_switch(next_p);
+ /*
+ * Reload esp0. This changes current_thread_info().
+ */
+ load_sp0(tss, next);
+
this_cpu_write(kernel_stack,
(unsigned long)task_stack_page(next_p) +
THREAD_SIZE - KERNEL_STACK_OFFSET);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 2cd562f..1e393d2 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -283,9 +283,6 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
fpu = switch_fpu_prepare(prev_p, next_p, cpu);
- /* Reload esp0 and ss1. */
- load_sp0(tss, next);
-
/* We must save %fs and %gs before load_TLS() because
* %fs and %gs may be cleared by load_TLS().
*
@@ -413,6 +410,9 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
task_thread_info(prev_p)->saved_preempt_count = this_cpu_read(__preempt_count);
this_cpu_write(__preempt_count, task_thread_info(next_p)->saved_preempt_count);
+ /* Reload esp0 and ss1. This changes current_thread_info(). */
+ load_sp0(tss, next);
+
this_cpu_write(kernel_stack,
(unsigned long)task_stack_page(next_p) +
THREAD_SIZE - KERNEL_STACK_OFFSET);
* [tip:x86/asm] x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32
2015-03-07 1:50 ` [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32 Andy Lutomirski
@ 2015-03-07 8:37 ` tip-bot for Andy Lutomirski
2015-03-09 13:04 ` Denys Vlasenko
2015-03-26 13:30 ` [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack " Boris Ostrovsky
1 sibling, 1 reply; 9+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-03-07 8:37 UTC (permalink / raw)
To: linux-tip-commits
Cc: torvalds, mingo, bp, dvlasenk, luto, linux-kernel, oleg, hpa, tglx
Commit-ID: a7fcf28d431ef70afaa91496e64e16dc51dccec4
Gitweb: http://git.kernel.org/tip/a7fcf28d431ef70afaa91496e64e16dc51dccec4
Author: Andy Lutomirski <luto@amacapital.net>
AuthorDate: Fri, 6 Mar 2015 17:50:19 -0800
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 7 Mar 2015 09:34:03 +0100
x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32
I broke 32-bit kernels. The implementation of sp0 was correct
as far as I can tell, but sp0 was much weirder on x86_32 than I
realized. It has the following issues:
- Init's sp0 is inconsistent with everything else's: non-init tasks
are offset by 8 bytes. (I have no idea why, and the comment is unhelpful.)
- vm86 does crazy things to sp0.
Fix it up by replacing this_cpu_sp0() with
current_top_of_stack() and using a new percpu variable to track
the top of the stack on x86_32.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 75182b1632a8 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()")
Link: http://lkml.kernel.org/r/d09dbe270883433776e0cbee3c7079433349e96d.1425692936.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/processor.h | 11 ++++++++++-
arch/x86/include/asm/thread_info.h | 4 +---
arch/x86/kernel/cpu/common.c | 13 +++++++++++--
arch/x86/kernel/process_32.c | 11 +++++++----
arch/x86/kernel/smpboot.c | 2 ++
arch/x86/kernel/traps.c | 4 ++--
6 files changed, 33 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f5e3ec6..48a61c1 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -284,6 +284,10 @@ struct tss_struct {
DECLARE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss);
+#ifdef CONFIG_X86_32
+DECLARE_PER_CPU(unsigned long, cpu_current_top_of_stack);
+#endif
+
/*
* Save the original ist values for checking stack pointers during debugging
*/
@@ -564,9 +568,14 @@ static inline void native_swapgs(void)
#endif
}
-static inline unsigned long this_cpu_sp0(void)
+static inline unsigned long current_top_of_stack(void)
{
+#ifdef CONFIG_X86_64
return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
+#else
+ /* sp0 on x86_32 is special in and around vm86 mode. */
+ return this_cpu_read_stable(cpu_current_top_of_stack);
+#endif
}
#ifdef CONFIG_PARAVIRT
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index a2fa189..7740edd 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -158,9 +158,7 @@ DECLARE_PER_CPU(unsigned long, kernel_stack);
static inline struct thread_info *current_thread_info(void)
{
- struct thread_info *ti;
- ti = (void *)(this_cpu_sp0() - THREAD_SIZE);
- return ti;
+ return (struct thread_info *)(current_top_of_stack() - THREAD_SIZE);
}
static inline unsigned long current_stack_pointer(void)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 5d0f0cc..7634833 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1130,8 +1130,8 @@ DEFINE_PER_CPU_FIRST(union irq_stack_union,
irq_stack_union) __aligned(PAGE_SIZE) __visible;
/*
- * The following four percpu variables are hot. Align current_task to
- * cacheline size such that all four fall in the same cacheline.
+ * The following percpu variables are hot. Align current_task to
+ * cacheline size such that they fall in the same cacheline.
*/
DEFINE_PER_CPU(struct task_struct *, current_task) ____cacheline_aligned =
&init_task;
@@ -1226,6 +1226,15 @@ DEFINE_PER_CPU(int, __preempt_count) = INIT_PREEMPT_COUNT;
EXPORT_PER_CPU_SYMBOL(__preempt_count);
DEFINE_PER_CPU(struct task_struct *, fpu_owner_task);
+/*
+ * On x86_32, vm86 modifies tss.sp0, so sp0 isn't a reliable way to find
+ * the top of the kernel stack. Use an extra percpu variable to track the
+ * top of the kernel stack directly.
+ */
+DEFINE_PER_CPU(unsigned long, cpu_current_top_of_stack) =
+ (unsigned long)&init_thread_union + THREAD_SIZE;
+EXPORT_PER_CPU_SYMBOL(cpu_current_top_of_stack);
+
#ifdef CONFIG_CC_STACKPROTECTOR
DEFINE_PER_CPU_ALIGNED(struct stack_canary, stack_canary);
#endif
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 0405cab..1b9963f 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -306,13 +306,16 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
arch_end_context_switch(next_p);
/*
- * Reload esp0. This changes current_thread_info().
+ * Reload esp0, kernel_stack, and current_top_of_stack. This changes
+ * current_thread_info().
*/
load_sp0(tss, next);
-
this_cpu_write(kernel_stack,
- (unsigned long)task_stack_page(next_p) +
- THREAD_SIZE - KERNEL_STACK_OFFSET);
+ (unsigned long)task_stack_page(next_p) +
+ THREAD_SIZE - KERNEL_STACK_OFFSET);
+ this_cpu_write(cpu_current_top_of_stack,
+ (unsigned long)task_stack_page(next_p) +
+ THREAD_SIZE);
/*
* Restore %gs if needed (which is common)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index febc6aa..759388c 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -806,6 +806,8 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
#ifdef CONFIG_X86_32
/* Stack for startup_32 can be just as for start_secondary onwards */
irq_ctx_init(cpu);
+ per_cpu(cpu_current_top_of_stack, cpu) =
+ (unsigned long)task_stack_page(idle) + THREAD_SIZE;
#else
clear_tsk_thread_flag(idle, TIF_FORK);
initial_gs = per_cpu_offset(cpu);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index fa29058..081252c 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -174,8 +174,8 @@ void ist_begin_non_atomic(struct pt_regs *regs)
* will catch asm bugs and any attempt to use ist_preempt_enable
* from double_fault.
*/
- BUG_ON((unsigned long)(this_cpu_sp0() - current_stack_pointer()) >=
- THREAD_SIZE);
+ BUG_ON((unsigned long)(current_top_of_stack() -
+ current_stack_pointer()) >= THREAD_SIZE);
preempt_count_sub(HARDIRQ_OFFSET);
}
* Re: [tip:x86/asm] x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32
2015-03-07 8:37 ` [tip:x86/asm] x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() " tip-bot for Andy Lutomirski
@ 2015-03-09 13:04 ` Denys Vlasenko
2015-03-09 13:15 ` Andy Lutomirski
0 siblings, 1 reply; 9+ messages in thread
From: Denys Vlasenko @ 2015-03-09 13:04 UTC (permalink / raw)
To: Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin,
Oleg Nesterov, Ingo Molnar, Linus Torvalds, Andy Lutomirski,
Denys Vlasenko, Borislav Petkov
Cc: linux-tip-commits
On Sat, Mar 7, 2015 at 9:37 AM, tip-bot for Andy Lutomirski
<tipbot@zytor.com> wrote:
> Commit-ID: a7fcf28d431ef70afaa91496e64e16dc51dccec4
> Gitweb: http://git.kernel.org/tip/a7fcf28d431ef70afaa91496e64e16dc51dccec4
> Author: Andy Lutomirski <luto@amacapital.net>
> AuthorDate: Fri, 6 Mar 2015 17:50:19 -0800
> Committer: Ingo Molnar <mingo@kernel.org>
> CommitDate: Sat, 7 Mar 2015 09:34:03 +0100
>
> x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32
>
> I broke 32-bit kernels. The implementation of sp0 was correct
> as far as I can tell, but sp0 was much weirder on x86_32 than I
> realized. It has the following issues:
>
> - Init's sp0 is inconsistent with everything else's: non-init tasks
> are offset by 8 bytes. (I have no idea why, and the comment is unhelpful.)
>
> - vm86 does crazy things to sp0.
>
> Fix it up by replacing this_cpu_sp0() with
> current_top_of_stack() and using a new percpu variable to track
> the top of the stack on x86_32.
Looks like the hope that tss.sp0 is a reliable variable
which points to top of stack didn't really play out :(
Recent relevant commits in x86/entry were:
x86/asm/entry: Add this_cpu_sp0() to read sp0 for the current cpu
- added accessor to tss.sp0
"We currently store references to the top of the kernel stack in
multiple places: kernel_stack (with an offset) and
init_tss.x86_tss.sp0 (no offset). The latter is defined by
hardware and is a clean canonical way to find the top of the
stack. Add an accessor so we can start using it."
x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()
- equivalent change, no win/no loss
x86/asm/entry/64/compat: Change the 32-bit sysenter code to use sp0
- Even though it did remove one insn, we can get the same
if KERNEL_STACK_OFFSET will be eliminated
x86: Delay loading sp0 slightly on task switch
- simple fix, nothing needed to be added
x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32
- added a percpu var cpu_current_top_of_stack
- needs to set it in do_boot_cpu()
- added ifdef forest:
+#ifdef CONFIG_X86_64
return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
+#else
+ /* sp0 on x86_32 is special in and around vm86 mode. */
+ return this_cpu_read_stable(cpu_current_top_of_stack);
+#endif
End result is, now the 32-bit kernel has two per-cpu variables,
cpu_current_top_of_stack and kernel_stack.
cpu_current_top_of_stack is essentially "real top of stack",
and kernel_stack is "real top of stack - KERNEL_STACK_OFFSET".
When/if we get rid of KERNEL_STACK_OFFSET,
we can also get rid of kernel_stack, since it will be the same as
cpu_current_top_of_stack (which is a better name anyway).
* Re: [tip:x86/asm] x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32
2015-03-09 13:04 ` Denys Vlasenko
@ 2015-03-09 13:15 ` Andy Lutomirski
0 siblings, 0 replies; 9+ messages in thread
From: Andy Lutomirski @ 2015-03-09 13:15 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Linux Kernel Mailing List, Thomas Gleixner, H. Peter Anvin,
Oleg Nesterov, Ingo Molnar, Linus Torvalds, Denys Vlasenko,
Borislav Petkov, linux-tip-commits
On Mon, Mar 9, 2015 at 6:04 AM, Denys Vlasenko <vda.linux@googlemail.com> wrote:
> On Sat, Mar 7, 2015 at 9:37 AM, tip-bot for Andy Lutomirski
> <tipbot@zytor.com> wrote:
>> Commit-ID: a7fcf28d431ef70afaa91496e64e16dc51dccec4
>> Gitweb: http://git.kernel.org/tip/a7fcf28d431ef70afaa91496e64e16dc51dccec4
>> Author: Andy Lutomirski <luto@amacapital.net>
>> AuthorDate: Fri, 6 Mar 2015 17:50:19 -0800
>> Committer: Ingo Molnar <mingo@kernel.org>
>> CommitDate: Sat, 7 Mar 2015 09:34:03 +0100
>>
>> x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32
>>
>> I broke 32-bit kernels. The implementation of sp0 was correct
>> as far as I can tell, but sp0 was much weirder on x86_32 than I
>> realized. It has the following issues:
>>
>> - Init's sp0 is inconsistent with everything else's: non-init tasks
>> are offset by 8 bytes. (I have no idea why, and the comment is unhelpful.)
>>
>> - vm86 does crazy things to sp0.
>>
>> Fix it up by replacing this_cpu_sp0() with
>> current_top_of_stack() and using a new percpu variable to track
>> the top of the stack on x86_32.
>
> Looks like the hope that tss.sp0 is a reliable variable
> which points to top of stack didn't really play out :(
>
> Recent relevant commits in x86/entry were:
>
> x86/asm/entry: Add this_cpu_sp0() to read sp0 for the current cpu
> - added accessor to tss.sp0
> "We currently store references to the top of the kernel stack in
> multiple places: kernel_stack (with an offset) and
> init_tss.x86_tss.sp0 (no offset). The latter is defined by
> hardware and is a clean canonical way to find the top of the
> stack. Add an accessor so we can start using it."
>
> x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()
> - equivalent change, no win/no loss
>
> x86/asm/entry/64/compat: Change the 32-bit sysenter code to use sp0
> - Even though it did remove one insn, we can get the same
> if KERNEL_STACK_OFFSET will be eliminated
>
> x86: Delay loading sp0 slightly on task switch
> - simple fix, nothing needed to be added
>
> x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32
> - added a percpu var cpu_current_top_of_stack
> - needs to set it in do_boot_cpu()
> - added ifdef forest:
> +#ifdef CONFIG_X86_64
> return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
> +#else
> + /* sp0 on x86_32 is special in and around vm86 mode. */
> + return this_cpu_read_stable(cpu_current_top_of_stack);
> +#endif
>
>
>
> End result is, now the 32-bit kernel has two per-cpu variables,
> cpu_current_top_of_stack and kernel_stack.
>
> cpu_current_top_of_stack is essentially "real top of stack",
> and kernel_stack is "real top of stack - KERNEL_STACK_OFFSET".
>
> When/if we get rid of KERNEL_STACK_OFFSET,
> we can also get rid of kernel_stack, since it will be the same as
> cpu_current_top_of_stack (which is a better name anyway).
Exactly.
I think the next step might be to decouple GET_THREAD_INFO and friends
from kernel_stack. I think that might be enough to get rid of
kernel_stack on 32-bit. 64 has two other remaining users: the syscall
entries.
--Andy
* Re: [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32
2015-03-07 1:50 ` [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32 Andy Lutomirski
2015-03-07 8:37 ` [tip:x86/asm] x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() " tip-bot for Andy Lutomirski
@ 2015-03-26 13:30 ` Boris Ostrovsky
2015-03-26 18:33 ` Andy Lutomirski
1 sibling, 1 reply; 9+ messages in thread
From: Boris Ostrovsky @ 2015-03-26 13:30 UTC (permalink / raw)
To: Andy Lutomirski, x86, linux-kernel
Cc: Borislav Petkov, Oleg Nesterov, Denys Vlasenko, xen-devel
On 03/06/2015 08:50 PM, Andy Lutomirski wrote:
> I broke 32-bit kernels. The implementation of sp0 was correct as
> far as I can tell, but sp0 was much weirder on x86_32 than I
> realized. It has the following issues:
>
> - Init's sp0 is inconsistent with everything else's: non-init tasks
> are offset by 8 bytes. (I have no idea why, and the comment is unhelpful.)
>
> - vm86 does crazy things to sp0.
>
> Fix it up by replacing this_cpu_sp0() with current_top_of_stack()
> and using a new percpu variable to track the top of the stack on
> x86_32.
>
> Fixes: 75182b1632a8 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()")
> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
> ---
...
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index febc6aabc72e..759388c538cf 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -806,6 +806,8 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
> #ifdef CONFIG_X86_32
> /* Stack for startup_32 can be just as for start_secondary onwards */
> irq_ctx_init(cpu);
> + per_cpu(cpu_current_top_of_stack, cpu) =
> + (unsigned long)task_stack_page(idle) + THREAD_SIZE;
> #else
> clear_tsk_thread_flag(idle, TIF_FORK);
> initial_gs = per_cpu_offset(cpu);
Andy,
We need a similar change for Xen, otherwise 32-bit PV guests are not
happy. Is the patch above final (and then should I submit a separate
patch) or are you still working on it (and if so, please add the change
below)?
-boris
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 1c5e760..561d6f5 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -444,6 +444,8 @@ static int xen_cpu_up(unsigned int cpu, struct
task_struct *idle)
per_cpu(current_task, cpu) = idle;
#ifdef CONFIG_X86_32
irq_ctx_init(cpu);
+ per_cpu(cpu_current_top_of_stack, cpu) =
+ (unsigned long)task_stack_page(idle) + THREAD_SIZE;
#else
clear_tsk_thread_flag(idle, TIF_FORK);
#endif
* Re: [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32
2015-03-26 13:30 ` [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack " Boris Ostrovsky
@ 2015-03-26 18:33 ` Andy Lutomirski
0 siblings, 0 replies; 9+ messages in thread
From: Andy Lutomirski @ 2015-03-26 18:33 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: xen-devel, Borislav Petkov, X86 ML, Denys Vlasenko,
Oleg Nesterov, linux-kernel
On Mar 26, 2015 6:32 AM, "Boris Ostrovsky" <boris.ostrovsky@oracle.com> wrote:
>
> On 03/06/2015 08:50 PM, Andy Lutomirski wrote:
>>
>> I broke 32-bit kernels. The implementation of sp0 was correct as
>> far as I can tell, but sp0 was much weirder on x86_32 than I
>> realized. It has the following issues:
>>
>> - Init's sp0 is inconsistent with everything else's: non-init tasks
>> are offset by 8 bytes. (I have no idea why, and the comment is unhelpful.)
>>
>> - vm86 does crazy things to sp0.
>>
>> Fix it up by replacing this_cpu_sp0() with current_top_of_stack()
>> and using a new percpu variable to track the top of the stack on
>> x86_32.
>>
>> Fixes: 75182b1632a8 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()")
>> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
>> ---
>
>
> ...
>
>
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index febc6aabc72e..759388c538cf 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -806,6 +806,8 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
>> #ifdef CONFIG_X86_32
>> /* Stack for startup_32 can be just as for start_secondary onwards */
>> irq_ctx_init(cpu);
>> + per_cpu(cpu_current_top_of_stack, cpu) =
>> + (unsigned long)task_stack_page(idle) + THREAD_SIZE;
>> #else
>> clear_tsk_thread_flag(idle, TIF_FORK);
>> initial_gs = per_cpu_offset(cpu);
>
>
>
> Andy,
>
> We need a similar change for Xen, otherwise 32-bit PV guests are not happy. Is the patch above final (and then should I submit a separate patch) or are you still working on it (and if so, please add the change below)?
>
My patch is final -- it's been in -tip for a while now.
It would be really nice if we could merge the bits of Xen and native
initialization that are identical rather than needing to duplicate all
this code.
--Andy
end of thread, other threads:[~2015-03-26 18:33 UTC | newest]
Thread overview: 9+ messages
-- links below jump to the message on this page --
2015-03-07 1:50 [PATCH 0/2] x86: sp0 fixes Andy Lutomirski
2015-03-07 1:50 ` [PATCH 1/2] x86: Delay loading sp0 slightly on task switch Andy Lutomirski
2015-03-07 8:37 ` [tip:x86/asm] x86/asm/entry: " tip-bot for Andy Lutomirski
2015-03-07 1:50 ` [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32 Andy Lutomirski
2015-03-07 8:37 ` [tip:x86/asm] x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() " tip-bot for Andy Lutomirski
2015-03-09 13:04 ` Denys Vlasenko
2015-03-09 13:15 ` Andy Lutomirski
2015-03-26 13:30 ` [PATCH 2/2] x86: Replace this_cpu_sp0 with current_top_of_stack " Boris Ostrovsky
2015-03-26 18:33 ` Andy Lutomirski