LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
@ 2017-11-30  8:03 Jiri Slaby
  2017-11-30  8:03 ` [PATCH 2/2] x86/stacktrace: make sure reliable stack trace reached userspace Jiri Slaby
  2017-11-30 19:57 ` [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC Josh Poimboeuf
  0 siblings, 2 replies; 19+ messages in thread
From: Jiri Slaby @ 2017-11-30  8:03 UTC (permalink / raw)
  To: mingo
  Cc: linux-kernel, Jiri Slaby, Thomas Gleixner, H. Peter Anvin, x86,
	Josh Poimboeuf

save_stack_trace_reliable now returns "non reliable" when there are
kernel pt_regs on stack. This means an interrupt or exception happened.
Somewhere down the route. It is a problem for frame pointer unwinder,
because the frame might now have been set up yet when the irq happened,
so it might fail to unwind from the interrupted function.

With ORC, this is not a problem, as ORC has out-of-band data. We can
find ORC data even for the IP in interrupted function and always unwind
one level up.

So introduce `unwind_regs_reliable' which decides if this is an issue
for the currently selected unwinder at all and change the code
accordingly.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/include/asm/unwind.h | 21 +++++++++++++++++++++
 arch/x86/kernel/stacktrace.c  | 21 ++++++++-------------
 2 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index 5be2fb23825a..2e345b3ef1d4 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -73,6 +73,27 @@ static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)
 }
 #endif
 
+#if defined(CONFIG_UNWINDER_ORC)
+/*
+ * ORC is never afraid of stored regs -- out of band data tell him what to do
+ * at each instruction reliably.
+ */
+static inline bool unwind_regs_reliable(struct pt_regs *regs)
+{
+	return true;
+}
+#else
+/*
+ * Kernel mode registers on the stack indicate an in-kernel interrupt or
+ * exception (e.g., preemption or a page fault), which can make frame pointers
+ * unreliable.
+ */
+static inline bool unwind_regs_reliable(struct pt_regs *regs)
+{
+	return user_mode(regs);
+}
+#endif
+
 #ifdef CONFIG_UNWINDER_ORC
 void unwind_init(void);
 void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 77835bc021c7..221a03e251bb 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -103,20 +103,15 @@ __save_stack_trace_reliable(struct stack_trace *trace,
 	     unwind_next_frame(&state)) {
 
 		regs = unwind_get_entry_regs(&state);
-		if (regs) {
-			/*
-			 * Kernel mode registers on the stack indicate an
-			 * in-kernel interrupt or exception (e.g., preemption
-			 * or a page fault), which can make frame pointers
-			 * unreliable.
-			 */
-			if (!user_mode(regs))
-				return -EINVAL;
 
-			/*
-			 * The last frame contains the user mode syscall
-			 * pt_regs.  Skip it and finish the unwind.
-			 */
+		if (regs && !unwind_regs_reliable(regs))
+			return -EINVAL;
+
+		/*
+		 * The last frame contains the user mode syscall pt_regs.  Skip
+		 * it and finish the unwind.
+		 */
+		if (regs && user_mode(regs)) {
 			unwind_next_frame(&state);
 			if (!unwind_done(&state)) {
 				STACKTRACE_DUMP_ONCE(task);
-- 
2.15.0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 2/2] x86/stacktrace: make sure reliable stack trace reached userspace
  2017-11-30  8:03 [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC Jiri Slaby
@ 2017-11-30  8:03 ` Jiri Slaby
  2017-11-30 19:57 ` [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC Josh Poimboeuf
  1 sibling, 0 replies; 19+ messages in thread
From: Jiri Slaby @ 2017-11-30  8:03 UTC (permalink / raw)
  To: mingo
  Cc: linux-kernel, Jiri Slaby, Thomas Gleixner, H. Peter Anvin, x86,
	Josh Poimboeuf

Make sure that save_stack_trace_reliable reached userspace when saying
"this stack trace is reliable."

Place this check only after stack unwinding error check, so that the
stack is printed if something really went wrong during unwinding.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/kernel/stacktrace.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 221a03e251bb..2153aff1b459 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -98,6 +98,7 @@ __save_stack_trace_reliable(struct stack_trace *trace,
 	struct unwind_state state;
 	struct pt_regs *regs;
 	unsigned long addr;
+	bool reached_user = false;
 
 	for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
 	     unwind_next_frame(&state)) {
@@ -117,6 +118,7 @@ __save_stack_trace_reliable(struct stack_trace *trace,
 				STACKTRACE_DUMP_ONCE(task);
 				return -EINVAL;
 			}
+			reached_user = true;
 			break;
 		}
 
@@ -142,6 +144,9 @@ __save_stack_trace_reliable(struct stack_trace *trace,
 		return -EINVAL;
 	}
 
+	if (!reached_user)
+		return -EINVAL;
+
 	if (trace->nr_entries < trace->max_entries)
 		trace->entries[trace->nr_entries++] = ULONG_MAX;
 
-- 
2.15.0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2017-11-30  8:03 [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC Jiri Slaby
  2017-11-30  8:03 ` [PATCH 2/2] x86/stacktrace: make sure reliable stack trace reached userspace Jiri Slaby
@ 2017-11-30 19:57 ` Josh Poimboeuf
  2017-11-30 19:59   ` Josh Poimboeuf
  2017-12-14 21:58   ` Jiri Slaby
  1 sibling, 2 replies; 19+ messages in thread
From: Josh Poimboeuf @ 2017-11-30 19:57 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On Thu, Nov 30, 2017 at 09:03:24AM +0100, Jiri Slaby wrote:
> save_stack_trace_reliable now returns "non reliable" when there are
> kernel pt_regs on stack. This means an interrupt or exception happened.
> Somewhere down the route. It is a problem for frame pointer unwinder,
> because the frame might now have been set up yet when the irq happened,
> so it might fail to unwind from the interrupted function.
> 
> With ORC, this is not a problem, as ORC has out-of-band data. We can
> find ORC data even for the IP in interrupted function and always unwind
> one level up.
> 
> So introduce `unwind_regs_reliable' which decides if this is an issue
> for the currently selected unwinder at all and change the code
> accordingly.

Thanks.  I'm thinking there a few ways we can simplify things.  (Most of
these are actually issues with the existing code.)

- Currently we check to make sure that there's no frame *after* the user
  space regs.  I think there's no way that could ever happen and the
  check is overkill.

- We should probably remove the STACKTRACE_DUMP_ONCE() warnings.  There
  are some known places where a stack trace will fail, particularly with
  generated code.  I wish we had a DEBUG_WARN_ON() macro which used
  pr_debug(), but oh well.  At least the livepatch code has some helpful
  pr_warn()s, those are probably good enough.

- The unwind->error checks are superfluous.  The only errors we need to
  check for are (a) whether the FP unwinder encountered a kernel irq and
  b) whether we reached the final user regs frame.  So I think
  unwind->error can be removed altogether.

So with those changes in mind, how about something like this (plus
comments)?

	for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
	     unwind_next_frame(&state)) {

		regs = unwind_get_entry_regs(&state);
		if (regs) {
			if (user_mode(regs))
				goto success;

			if (IS_ENABLED(CONFIG_FRAME_POINTER))
				return -EINVAL;
		}

		addr = unwind_get_return_address(&state);
		if (!addr)
			return -EINVAL;

		if (save_stack_address(trace, addr, false))
			return -EINVAL;
	}

	return -EINVAL;

success:
	if (trace->nr_entries < trace->max_entries)
		trace->entries[trace->nr_entries++] = ULONG_MAX;

	return 0;

After these changes I believe we can enable
CONFIG_HAVE_RELIABLE_STACKTRACE for ORC.

Also, when you post the next version, please cc the live patching
mailing list, since this is directly relevant to livepatch.

Thanks!

-- 
Josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2017-11-30 19:57 ` [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC Josh Poimboeuf
@ 2017-11-30 19:59   ` Josh Poimboeuf
  2017-12-01  7:34     ` Jiri Slaby
  2017-12-14 21:58   ` Jiri Slaby
  1 sibling, 1 reply; 19+ messages in thread
From: Josh Poimboeuf @ 2017-11-30 19:59 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86, live-patching

On Thu, Nov 30, 2017 at 01:57:10PM -0600, Josh Poimboeuf wrote:
> On Thu, Nov 30, 2017 at 09:03:24AM +0100, Jiri Slaby wrote:
> > save_stack_trace_reliable now returns "non reliable" when there are
> > kernel pt_regs on stack. This means an interrupt or exception happened.
> > Somewhere down the route. It is a problem for frame pointer unwinder,
> > because the frame might now have been set up yet when the irq happened,
> > so it might fail to unwind from the interrupted function.
> > 
> > With ORC, this is not a problem, as ORC has out-of-band data. We can
> > find ORC data even for the IP in interrupted function and always unwind
> > one level up.
> > 
> > So introduce `unwind_regs_reliable' which decides if this is an issue
> > for the currently selected unwinder at all and change the code
> > accordingly.
> 
> Thanks.  I'm thinking there a few ways we can simplify things.  (Most of
> these are actually issues with the existing code.)
> 
> - Currently we check to make sure that there's no frame *after* the user
>   space regs.  I think there's no way that could ever happen and the
>   check is overkill.
> 
> - We should probably remove the STACKTRACE_DUMP_ONCE() warnings.  There
>   are some known places where a stack trace will fail, particularly with
>   generated code.  I wish we had a DEBUG_WARN_ON() macro which used
>   pr_debug(), but oh well.  At least the livepatch code has some helpful
>   pr_warn()s, those are probably good enough.
    ^^^^^^^
    meant to say pr_debug()s.

Also adding the live patching mailing list as an FYI.

> 
> - The unwind->error checks are superfluous.  The only errors we need to
>   check for are (a) whether the FP unwinder encountered a kernel irq and
>   b) whether we reached the final user regs frame.  So I think
>   unwind->error can be removed altogether.
> 
> So with those changes in mind, how about something like this (plus
> comments)?
> 
> 	for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
> 	     unwind_next_frame(&state)) {
> 
> 		regs = unwind_get_entry_regs(&state);
> 		if (regs) {
> 			if (user_mode(regs))
> 				goto success;
> 
> 			if (IS_ENABLED(CONFIG_FRAME_POINTER))
> 				return -EINVAL;
> 		}
> 
> 		addr = unwind_get_return_address(&state);
> 		if (!addr)
> 			return -EINVAL;
> 
> 		if (save_stack_address(trace, addr, false))
> 			return -EINVAL;
> 	}
> 
> 	return -EINVAL;
> 
> success:
> 	if (trace->nr_entries < trace->max_entries)
> 		trace->entries[trace->nr_entries++] = ULONG_MAX;
> 
> 	return 0;
> 
> After these changes I believe we can enable
> CONFIG_HAVE_RELIABLE_STACKTRACE for ORC.
> 
> Also, when you post the next version, please cc the live patching
> mailing list, since this is directly relevant to livepatch.
> 
> Thanks!

-- 
Josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2017-11-30 19:59   ` Josh Poimboeuf
@ 2017-12-01  7:34     ` Jiri Slaby
  0 siblings, 0 replies; 19+ messages in thread
From: Jiri Slaby @ 2017-12-01  7:34 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86, live-patching

On 11/30/2017, 08:59 PM, Josh Poimboeuf wrote:
> On Thu, Nov 30, 2017 at 01:57:10PM -0600, Josh Poimboeuf wrote:
>> On Thu, Nov 30, 2017 at 09:03:24AM +0100, Jiri Slaby wrote:
>>> save_stack_trace_reliable now returns "non reliable" when there are
>>> kernel pt_regs on stack. This means an interrupt or exception happened.
>>> Somewhere down the route. It is a problem for frame pointer unwinder,
>>> because the frame might now have been set up yet when the irq happened,
>>> so it might fail to unwind from the interrupted function.
>>>
>>> With ORC, this is not a problem, as ORC has out-of-band data. We can
>>> find ORC data even for the IP in interrupted function and always unwind
>>> one level up.
>>>
>>> So introduce `unwind_regs_reliable' which decides if this is an issue
>>> for the currently selected unwinder at all and change the code
>>> accordingly.
>>
>> Thanks.  I'm thinking there a few ways we can simplify things.  (Most of
>> these are actually issues with the existing code.)
>>
>> - Currently we check to make sure that there's no frame *after* the user
>>   space regs.  I think there's no way that could ever happen and the
>>   check is overkill.
>>
>> - We should probably remove the STACKTRACE_DUMP_ONCE() warnings.  There
>>   are some known places where a stack trace will fail, particularly with
>>   generated code.  I wish we had a DEBUG_WARN_ON() macro which used
>>   pr_debug(), but oh well.  At least the livepatch code has some helpful
>>   pr_warn()s, those are probably good enough.
>     ^^^^^^^
>     meant to say pr_debug()s.
> 
> Also adding the live patching mailing list as an FYI.

The changes are now in:
https://git.kernel.org/pub/scm/linux/kernel/git/jirislaby/linux.git/log/?h=devel

Letting 01 bot to run through them, then I will send a v2.

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2017-11-30 19:57 ` [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC Josh Poimboeuf
  2017-11-30 19:59   ` Josh Poimboeuf
@ 2017-12-14 21:58   ` Jiri Slaby
  2017-12-18  3:37     ` Josh Poimboeuf
  1 sibling, 1 reply; 19+ messages in thread
From: Jiri Slaby @ 2017-12-14 21:58 UTC (permalink / raw)
  To: Josh Poimboeuf, Jiri Slaby
  Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On 11/30/2017, 08:57 PM, Josh Poimboeuf wrote:
> So with those changes in mind, how about something like this (plus
> comments)?
> 
> 	for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
> 	     unwind_next_frame(&state)) {
> 
> 		regs = unwind_get_entry_regs(&state);
> 		if (regs) {
> 			if (user_mode(regs))
> 				goto success;
> 
> 			if (IS_ENABLED(CONFIG_FRAME_POINTER))
> 				return -EINVAL;
> 		}
> 
> 		addr = unwind_get_return_address(&state);
> 		if (!addr)
> 			return -EINVAL;
> 
> 		if (save_stack_address(trace, addr, false))
> 			return -EINVAL;
> 	}
> 
> 	return -EINVAL;

Kthreads and idle tasks hit this error as they have no user regs on the
stack obviously :).

So making it:
        if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
                return -EINVAL;

works, but is not reliable now. So I believe, we cannot live without
unwind->error to differentiate between "unwind_done() == true" because:
* full stack unwound and the stack type is set to UNKNOWN
* unwinding failed and the stack type is set to UNKNOWN

Or perhaps introduce stack type BOTTOM, NONE, or NOMORE meaning the
bottom of the stacks reached?

> success:
> 	if (trace->nr_entries < trace->max_entries)
> 		trace->entries[trace->nr_entries++] = ULONG_MAX;
> 
> 	return 0;
> 
> After these changes I believe we can enable
> CONFIG_HAVE_RELIABLE_STACKTRACE for ORC.

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2017-12-14 21:58   ` Jiri Slaby
@ 2017-12-18  3:37     ` Josh Poimboeuf
  2017-12-18  6:59       ` Jiri Slaby
  2017-12-20 17:45       ` Josh Poimboeuf
  0 siblings, 2 replies; 19+ messages in thread
From: Josh Poimboeuf @ 2017-12-18  3:37 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On Thu, Dec 14, 2017 at 10:58:35PM +0100, Jiri Slaby wrote:
> On 11/30/2017, 08:57 PM, Josh Poimboeuf wrote:
> > So with those changes in mind, how about something like this (plus
> > comments)?
> > 
> > 	for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
> > 	     unwind_next_frame(&state)) {
> > 
> > 		regs = unwind_get_entry_regs(&state);
> > 		if (regs) {
> > 			if (user_mode(regs))
> > 				goto success;
> > 
> > 			if (IS_ENABLED(CONFIG_FRAME_POINTER))
> > 				return -EINVAL;
> > 		}
> > 
> > 		addr = unwind_get_return_address(&state);
> > 		if (!addr)
> > 			return -EINVAL;
> > 
> > 		if (save_stack_address(trace, addr, false))
> > 			return -EINVAL;
> > 	}
> > 
> > 	return -EINVAL;
> 
> Kthreads and idle tasks hit this error as they have no user regs on the
> stack obviously :).

Doh, sorry, I forgot about that.

> 
> So making it:
>         if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
>                 return -EINVAL;
> 
> works, but is not reliable now. So I believe, we cannot live without
> unwind->error to differentiate between "unwind_done() == true" because:
> * full stack unwound and the stack type is set to UNKNOWN
> * unwinding failed and the stack type is set to UNKNOWN
> 
> Or perhaps introduce stack type BOTTOM, NONE, or NOMORE meaning the
> bottom of the stacks reached?

Yeah, we'll need something... I need to think about it a little more.

-- 
Josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2017-12-18  3:37     ` Josh Poimboeuf
@ 2017-12-18  6:59       ` Jiri Slaby
  2017-12-20 17:45       ` Josh Poimboeuf
  1 sibling, 0 replies; 19+ messages in thread
From: Jiri Slaby @ 2017-12-18  6:59 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86,
	Miroslav Benes

On 12/18/2017, 04:37 AM, Josh Poimboeuf wrote:
> On Thu, Dec 14, 2017 at 10:58:35PM +0100, Jiri Slaby wrote:
>>> 	return -EINVAL;
>>
>> Kthreads and idle tasks hit this error as they have no user regs on the
>> stack obviously :).
> 
> Doh, sorry, I forgot about that.

FWIW and to be complete, as Mira Benes pointed out in some other thread,
also zombies/dead tasks belong here which have emptied stack.

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2017-12-18  3:37     ` Josh Poimboeuf
  2017-12-18  6:59       ` Jiri Slaby
@ 2017-12-20 17:45       ` Josh Poimboeuf
  2017-12-20 19:07         ` Jiri Slaby
  1 sibling, 1 reply; 19+ messages in thread
From: Josh Poimboeuf @ 2017-12-20 17:45 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On Sun, Dec 17, 2017 at 09:37:49PM -0600, Josh Poimboeuf wrote:
> > 
> > So making it:
> >         if (!(task->flags & (PF_KTHREAD | PF_IDLE)))
> >                 return -EINVAL;
> > 
> > works, but is not reliable now. So I believe, we cannot live without
> > unwind->error to differentiate between "unwind_done() == true" because:
> > * full stack unwound and the stack type is set to UNKNOWN
> > * unwinding failed and the stack type is set to UNKNOWN
> > 
> > Or perhaps introduce stack type BOTTOM, NONE, or NOMORE meaning the
> > bottom of the stacks reached?
> 
> Yeah, we'll need something... I need to think about it a little more.

I can update the ORC unwinder to set unwind->error in case it runs into
an issue or it doesn't reach the "end", like the FP unwinder does.

It might not be until 2018 though.  But in the meantime you can go ahead
and update your patches accordingly and then we can combine them for
testing next year.

-- 
Josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2017-12-20 17:45       ` Josh Poimboeuf
@ 2017-12-20 19:07         ` Jiri Slaby
  2018-04-16 22:16           ` Josh Poimboeuf
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Slaby @ 2017-12-20 19:07 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On 12/20/2017, 06:45 PM, Josh Poimboeuf wrote:
> It might not be until 2018 though.  But in the meantime you can go ahead
> and update your patches accordingly and then we can combine them for
> testing next year.

I already did ;). So when you have that ready, I will send it on top
right after.

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2017-12-20 19:07         ` Jiri Slaby
@ 2018-04-16 22:16           ` Josh Poimboeuf
  2018-04-19 13:42             ` Josh Poimboeuf
  0 siblings, 1 reply; 19+ messages in thread
From: Josh Poimboeuf @ 2018-04-16 22:16 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On Wed, Dec 20, 2017 at 08:07:17PM +0100, Jiri Slaby wrote:
> On 12/20/2017, 06:45 PM, Josh Poimboeuf wrote:
> > It might not be until 2018 though.  But in the meantime you can go ahead
> > and update your patches accordingly and then we can combine them for
> > testing next year.
> 
> I already did ;). So when you have that ready, I will send it on top
> right after.

Sorry for the delay...  Here's a (lightly tested) patch.  Can you test
it with the latest version of your patches?

----

From: Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH] x86/unwind/orc: Detect the end of the stack

The existing UNWIND_HINT_EMPTY annotations happen to be good indicators
of where entry code calls into C code for the first time.  So also use
them to mark the end of the stack for the ORC unwinder.

Use that information to set unwind->error if the ORC unwinder doesn't
unwind all the way to the end.  This will be needed for enabling
HAVE_RELIABLE_STACKTRACE for the ORC unwinder so we can use it with the
livepatch consistency model.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/entry/entry_64.S                      |  1 +
 arch/x86/include/asm/orc_types.h               |  2 ++
 arch/x86/include/asm/unwind_hints.h            | 16 +++++----
 arch/x86/kernel/unwind_orc.c                   | 50 ++++++++++++++++----------
 tools/objtool/arch/x86/include/asm/orc_types.h |  2 ++
 5 files changed, 46 insertions(+), 25 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ab4f451aeb97..e35dbc507425 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -408,6 +408,7 @@ ENTRY(ret_from_fork)
 
 1:
 	/* kernel thread */
+	UNWIND_HINT_EMPTY
 	movq	%r12, %rdi
 	CALL_NOSPEC %rbx
 	/*
diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h
index 9c9dc579bd7d..46f516dd80ce 100644
--- a/arch/x86/include/asm/orc_types.h
+++ b/arch/x86/include/asm/orc_types.h
@@ -88,6 +88,7 @@ struct orc_entry {
 	unsigned	sp_reg:4;
 	unsigned	bp_reg:4;
 	unsigned	type:2;
+	unsigned	end:1;
 } __packed;
 
 /*
@@ -101,6 +102,7 @@ struct unwind_hint {
 	s16		sp_offset;
 	u8		sp_reg;
 	u8		type;
+	u8		end;
 };
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
index bae46fc6b9de..0bcdb1279361 100644
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -26,7 +26,7 @@
  * the debuginfo as necessary.  It will also warn if it sees any
  * inconsistencies.
  */
-.macro UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=0 type=ORC_TYPE_CALL
+.macro UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=0 type=ORC_TYPE_CALL end=0
 #ifdef CONFIG_STACK_VALIDATION
 .Lunwind_hint_ip_\@:
 	.pushsection .discard.unwind_hints
@@ -35,12 +35,14 @@
 		.short \sp_offset
 		.byte \sp_reg
 		.byte \type
+		.byte \end
+		.balign 4
 	.popsection
 #endif
 .endm
 
 .macro UNWIND_HINT_EMPTY
-	UNWIND_HINT sp_reg=ORC_REG_UNDEFINED
+	UNWIND_HINT sp_reg=ORC_REG_UNDEFINED end=1
 .endm
 
 .macro UNWIND_HINT_REGS base=%rsp offset=0 indirect=0 extra=1 iret=0
@@ -86,19 +88,21 @@
 
 #else /* !__ASSEMBLY__ */
 
-#define UNWIND_HINT(sp_reg, sp_offset, type)			\
+#define UNWIND_HINT(sp_reg, sp_offset, type, end)		\
 	"987: \n\t"						\
 	".pushsection .discard.unwind_hints\n\t"		\
 	/* struct unwind_hint */				\
 	".long 987b - .\n\t"					\
-	".short " __stringify(sp_offset) "\n\t"		\
+	".short " __stringify(sp_offset) "\n\t"			\
 	".byte " __stringify(sp_reg) "\n\t"			\
 	".byte " __stringify(type) "\n\t"			\
+	".byte " __stringify(end) "\n\t"			\
+	".balign 4 \n\t"					\
 	".popsection\n\t"
 
-#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE)
+#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE, 0)
 
-#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE)
+#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE, 0)
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index feb28fee6cea..8dd7f1fa3885 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -352,7 +352,7 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr
 
 bool unwind_next_frame(struct unwind_state *state)
 {
-	unsigned long ip_p, sp, orig_ip, prev_sp = state->sp;
+	unsigned long ip_p, sp, orig_ip = state->ip, prev_sp = state->sp;
 	enum stack_type prev_type = state->stack_info.type;
 	struct orc_entry *orc;
 	bool indirect = false;
@@ -363,7 +363,7 @@ bool unwind_next_frame(struct unwind_state *state)
 	/* Don't let modules unload while we're reading their ORC data. */
 	preempt_disable();
 
-	/* Have we reached the end? */
+	/* End-of-stack check for user tasks: */
 	if (state->regs && user_mode(state->regs))
 		goto done;
 
@@ -374,9 +374,16 @@ bool unwind_next_frame(struct unwind_state *state)
 	 * calls and calls to noreturn functions.
 	 */
 	orc = orc_find(state->signal ? state->ip : state->ip - 1);
-	if (!orc || orc->sp_reg == ORC_REG_UNDEFINED)
-		goto done;
-	orig_ip = state->ip;
+	if (!orc)
+		goto err;
+
+	/* End-of-stack check for kernel threads: */
+	if (orc->sp_reg == ORC_REG_UNDEFINED) {
+		if (!orc->end)
+			goto err;
+
+		goto the_end;
+	}
 
 	/* Find the previous frame's stack: */
 	switch (orc->sp_reg) {
@@ -402,7 +409,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg R10 at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->r10;
 		break;
@@ -411,7 +418,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg R13 at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->r13;
 		break;
@@ -420,7 +427,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg DI at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->di;
 		break;
@@ -429,7 +436,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg DX at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->dx;
 		break;
@@ -437,12 +444,12 @@ bool unwind_next_frame(struct unwind_state *state)
 	default:
 		orc_warn("unknown SP base reg %d for ip %pB\n",
 			 orc->sp_reg, (void *)state->ip);
-		goto done;
+		goto err;
 	}
 
 	if (indirect) {
 		if (!deref_stack_reg(state, sp, &sp))
-			goto done;
+			goto err;
 	}
 
 	/* Find IP, SP and possibly regs: */
@@ -451,7 +458,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		ip_p = sp - sizeof(long);
 
 		if (!deref_stack_reg(state, ip_p, &state->ip))
-			goto done;
+			goto err;
 
 		state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx,
 						  state->ip, (void *)ip_p);
@@ -465,7 +472,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!deref_stack_regs(state, sp, &state->ip, &state->sp)) {
 			orc_warn("can't dereference registers at %p for ip %pB\n",
 				 (void *)sp, (void *)orig_ip);
-			goto done;
+			goto err;
 		}
 
 		state->regs = (struct pt_regs *)sp;
@@ -477,7 +484,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!deref_stack_iret_regs(state, sp, &state->ip, &state->sp)) {
 			orc_warn("can't dereference iret registers at %p for ip %pB\n",
 				 (void *)sp, (void *)orig_ip);
-			goto done;
+			goto err;
 		}
 
 		state->regs = (void *)sp - IRET_FRAME_OFFSET;
@@ -500,18 +507,18 @@ bool unwind_next_frame(struct unwind_state *state)
 
 	case ORC_REG_PREV_SP:
 		if (!deref_stack_reg(state, sp + orc->bp_offset, &state->bp))
-			goto done;
+			goto err;
 		break;
 
 	case ORC_REG_BP:
 		if (!deref_stack_reg(state, state->bp + orc->bp_offset, &state->bp))
-			goto done;
+			goto err;
 		break;
 
 	default:
 		orc_warn("unknown BP base reg %d for ip %pB\n",
 			 orc->bp_reg, (void *)orig_ip);
-		goto done;
+		goto err;
 	}
 
 	/* Prevent a recursive loop due to bad ORC data: */
@@ -520,16 +527,21 @@ bool unwind_next_frame(struct unwind_state *state)
 	    state->sp <= prev_sp) {
 		orc_warn("stack going in the wrong direction? ip=%pB\n",
 			 (void *)orig_ip);
-		goto done;
+		goto err;
 	}
 
 	preempt_enable();
 	return true;
 
-done:
+err:
+	state->error = true;
+
+the_end:
 	preempt_enable();
 	state->stack_info.type = STACK_TYPE_UNKNOWN;
 	return false;
+
+
 }
 EXPORT_SYMBOL_GPL(unwind_next_frame);
 
diff --git a/tools/objtool/arch/x86/include/asm/orc_types.h b/tools/objtool/arch/x86/include/asm/orc_types.h
index 9c9dc579bd7d..46f516dd80ce 100644
--- a/tools/objtool/arch/x86/include/asm/orc_types.h
+++ b/tools/objtool/arch/x86/include/asm/orc_types.h
@@ -88,6 +88,7 @@ struct orc_entry {
 	unsigned	sp_reg:4;
 	unsigned	bp_reg:4;
 	unsigned	type:2;
+	unsigned	end:1;
 } __packed;
 
 /*
@@ -101,6 +102,7 @@ struct unwind_hint {
 	s16		sp_offset;
 	u8		sp_reg;
 	u8		type;
+	u8		end;
 };
 #endif /* __ASSEMBLY__ */
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2018-04-16 22:16           ` Josh Poimboeuf
@ 2018-04-19 13:42             ` Josh Poimboeuf
  2018-05-10 12:33               ` Jiri Slaby
  0 siblings, 1 reply; 19+ messages in thread
From: Josh Poimboeuf @ 2018-04-19 13:42 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On Mon, Apr 16, 2018 at 05:16:53PM -0500, Josh Poimboeuf wrote:
> On Wed, Dec 20, 2017 at 08:07:17PM +0100, Jiri Slaby wrote:
> > On 12/20/2017, 06:45 PM, Josh Poimboeuf wrote:
> > > It might not be until 2018 though.  But in the meantime you can go ahead
> > > and update your patches accordingly and then we can combine them for
> > > testing next year.
> > 
> > I already did ;). So when you have that ready, I will send it on top
> > right after.
> 
> Sorry for the delay...  Here's a (lightly tested) patch.  Can you test
> it with the latest version of your patches?

This one actually compiles:

From: Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH] x86/unwind/orc: Detect the end of the stack

The existing UNWIND_HINT_EMPTY annotations happen to be good indicators
of where entry code calls into C code for the first time.  So also use
them to mark the end of the stack for the ORC unwinder.

Use that information to set unwind->error if the ORC unwinder doesn't
unwind all the way to the end.  This will be needed for enabling
HAVE_RELIABLE_STACKTRACE for the ORC unwinder so we can use it with the
livepatch consistency model.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/entry/entry_64.S                      |  1 +
 arch/x86/include/asm/orc_types.h               |  2 +
 arch/x86/include/asm/unwind_hints.h            | 16 +++++---
 arch/x86/kernel/unwind_orc.c                   | 52 ++++++++++++++++----------
 tools/objtool/arch/x86/include/asm/orc_types.h |  2 +
 5 files changed, 47 insertions(+), 26 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ab4f451aeb97..e35dbc507425 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -408,6 +408,7 @@ ENTRY(ret_from_fork)
 
 1:
 	/* kernel thread */
+	UNWIND_HINT_EMPTY
 	movq	%r12, %rdi
 	CALL_NOSPEC %rbx
 	/*
diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h
index 9c9dc579bd7d..46f516dd80ce 100644
--- a/arch/x86/include/asm/orc_types.h
+++ b/arch/x86/include/asm/orc_types.h
@@ -88,6 +88,7 @@ struct orc_entry {
 	unsigned	sp_reg:4;
 	unsigned	bp_reg:4;
 	unsigned	type:2;
+	unsigned	end:1;
 } __packed;
 
 /*
@@ -101,6 +102,7 @@ struct unwind_hint {
 	s16		sp_offset;
 	u8		sp_reg;
 	u8		type;
+	u8		end;
 };
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
index bae46fc6b9de..0bcdb1279361 100644
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -26,7 +26,7 @@
  * the debuginfo as necessary.  It will also warn if it sees any
  * inconsistencies.
  */
-.macro UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=0 type=ORC_TYPE_CALL
+.macro UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=0 type=ORC_TYPE_CALL end=0
 #ifdef CONFIG_STACK_VALIDATION
 .Lunwind_hint_ip_\@:
 	.pushsection .discard.unwind_hints
@@ -35,12 +35,14 @@
 		.short \sp_offset
 		.byte \sp_reg
 		.byte \type
+		.byte \end
+		.balign 4
 	.popsection
 #endif
 .endm
 
 .macro UNWIND_HINT_EMPTY
-	UNWIND_HINT sp_reg=ORC_REG_UNDEFINED
+	UNWIND_HINT sp_reg=ORC_REG_UNDEFINED end=1
 .endm
 
 .macro UNWIND_HINT_REGS base=%rsp offset=0 indirect=0 extra=1 iret=0
@@ -86,19 +88,21 @@
 
 #else /* !__ASSEMBLY__ */
 
-#define UNWIND_HINT(sp_reg, sp_offset, type)			\
+#define UNWIND_HINT(sp_reg, sp_offset, type, end)		\
 	"987: \n\t"						\
 	".pushsection .discard.unwind_hints\n\t"		\
 	/* struct unwind_hint */				\
 	".long 987b - .\n\t"					\
-	".short " __stringify(sp_offset) "\n\t"		\
+	".short " __stringify(sp_offset) "\n\t"			\
 	".byte " __stringify(sp_reg) "\n\t"			\
 	".byte " __stringify(type) "\n\t"			\
+	".byte " __stringify(end) "\n\t"			\
+	".balign 4 \n\t"					\
 	".popsection\n\t"
 
-#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE)
+#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE, 0)
 
-#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE)
+#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE, 0)
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index feb28fee6cea..8b3c665a65c3 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -352,7 +352,7 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr
 
 bool unwind_next_frame(struct unwind_state *state)
 {
-	unsigned long ip_p, sp, orig_ip, prev_sp = state->sp;
+	unsigned long ip_p, sp, orig_ip = state->ip, prev_sp = state->sp;
 	enum stack_type prev_type = state->stack_info.type;
 	struct orc_entry *orc;
 	bool indirect = false;
@@ -363,9 +363,9 @@ bool unwind_next_frame(struct unwind_state *state)
 	/* Don't let modules unload while we're reading their ORC data. */
 	preempt_disable();
 
-	/* Have we reached the end? */
+	/* End-of-stack check for user tasks: */
 	if (state->regs && user_mode(state->regs))
-		goto done;
+		goto the_end;
 
 	/*
 	 * Find the orc_entry associated with the text address.
@@ -374,9 +374,16 @@ bool unwind_next_frame(struct unwind_state *state)
 	 * calls and calls to noreturn functions.
 	 */
 	orc = orc_find(state->signal ? state->ip : state->ip - 1);
-	if (!orc || orc->sp_reg == ORC_REG_UNDEFINED)
-		goto done;
-	orig_ip = state->ip;
+	if (!orc)
+		goto err;
+
+	/* End-of-stack check for kernel threads: */
+	if (orc->sp_reg == ORC_REG_UNDEFINED) {
+		if (!orc->end)
+			goto err;
+
+		goto the_end;
+	}
 
 	/* Find the previous frame's stack: */
 	switch (orc->sp_reg) {
@@ -402,7 +409,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg R10 at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->r10;
 		break;
@@ -411,7 +418,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg R13 at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->r13;
 		break;
@@ -420,7 +427,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg DI at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->di;
 		break;
@@ -429,7 +436,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg DX at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->dx;
 		break;
@@ -437,12 +444,12 @@ bool unwind_next_frame(struct unwind_state *state)
 	default:
 		orc_warn("unknown SP base reg %d for ip %pB\n",
 			 orc->sp_reg, (void *)state->ip);
-		goto done;
+		goto err;
 	}
 
 	if (indirect) {
 		if (!deref_stack_reg(state, sp, &sp))
-			goto done;
+			goto err;
 	}
 
 	/* Find IP, SP and possibly regs: */
@@ -451,7 +458,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		ip_p = sp - sizeof(long);
 
 		if (!deref_stack_reg(state, ip_p, &state->ip))
-			goto done;
+			goto err;
 
 		state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx,
 						  state->ip, (void *)ip_p);
@@ -465,7 +472,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!deref_stack_regs(state, sp, &state->ip, &state->sp)) {
 			orc_warn("can't dereference registers at %p for ip %pB\n",
 				 (void *)sp, (void *)orig_ip);
-			goto done;
+			goto err;
 		}
 
 		state->regs = (struct pt_regs *)sp;
@@ -477,7 +484,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!deref_stack_iret_regs(state, sp, &state->ip, &state->sp)) {
 			orc_warn("can't dereference iret registers at %p for ip %pB\n",
 				 (void *)sp, (void *)orig_ip);
-			goto done;
+			goto err;
 		}
 
 		state->regs = (void *)sp - IRET_FRAME_OFFSET;
@@ -500,18 +507,18 @@ bool unwind_next_frame(struct unwind_state *state)
 
 	case ORC_REG_PREV_SP:
 		if (!deref_stack_reg(state, sp + orc->bp_offset, &state->bp))
-			goto done;
+			goto err;
 		break;
 
 	case ORC_REG_BP:
 		if (!deref_stack_reg(state, state->bp + orc->bp_offset, &state->bp))
-			goto done;
+			goto err;
 		break;
 
 	default:
 		orc_warn("unknown BP base reg %d for ip %pB\n",
 			 orc->bp_reg, (void *)orig_ip);
-		goto done;
+		goto err;
 	}
 
 	/* Prevent a recursive loop due to bad ORC data: */
@@ -520,16 +527,21 @@ bool unwind_next_frame(struct unwind_state *state)
 	    state->sp <= prev_sp) {
 		orc_warn("stack going in the wrong direction? ip=%pB\n",
 			 (void *)orig_ip);
-		goto done;
+		goto err;
 	}
 
 	preempt_enable();
 	return true;
 
-done:
+err:
+	state->error = true;
+
+the_end:
 	preempt_enable();
 	state->stack_info.type = STACK_TYPE_UNKNOWN;
 	return false;
+
+
 }
 EXPORT_SYMBOL_GPL(unwind_next_frame);
 
diff --git a/tools/objtool/arch/x86/include/asm/orc_types.h b/tools/objtool/arch/x86/include/asm/orc_types.h
index 9c9dc579bd7d..46f516dd80ce 100644
--- a/tools/objtool/arch/x86/include/asm/orc_types.h
+++ b/tools/objtool/arch/x86/include/asm/orc_types.h
@@ -88,6 +88,7 @@ struct orc_entry {
 	unsigned	sp_reg:4;
 	unsigned	bp_reg:4;
 	unsigned	type:2;
+	unsigned	end:1;
 } __packed;
 
 /*
@@ -101,6 +102,7 @@ struct unwind_hint {
 	s16		sp_offset;
 	u8		sp_reg;
 	u8		type;
+	u8		end;
 };
 #endif /* __ASSEMBLY__ */
 
-- 
2.14.3

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2018-04-19 13:42             ` Josh Poimboeuf
@ 2018-05-10 12:33               ` Jiri Slaby
  2018-05-10 12:49                 ` Josh Poimboeuf
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Slaby @ 2018-05-10 12:33 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On 04/19/2018, 03:42 PM, Josh Poimboeuf wrote:
> On Mon, Apr 16, 2018 at 05:16:53PM -0500, Josh Poimboeuf wrote:
>> On Wed, Dec 20, 2017 at 08:07:17PM +0100, Jiri Slaby wrote:
>>> On 12/20/2017, 06:45 PM, Josh Poimboeuf wrote:
>>>> It might not be until 2018 though.  But in the meantime you can go ahead
>>>> and update your patches accordingly and then we can combine them for
>>>> testing next year.
>>>
>>> I already did ;). So when you have that ready, I will send it on top
>>> right after.
>>
>> Sorry for the delay...  Here's a (lightly tested) patch.  Can you test
>> it with the latest version of your patches?
> 
> This one actually compiles:
> 
> From: Josh Poimboeuf <jpoimboe@redhat.com>
> Subject: [PATCH] x86/unwind/orc: Detect the end of the stack

With this patch applied, livepatching never completes. Kthreads are in
the unpatched state forever. I have no details yet.

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2018-05-10 12:33               ` Jiri Slaby
@ 2018-05-10 12:49                 ` Josh Poimboeuf
  2018-05-11  7:01                   ` Jiri Slaby
  0 siblings, 1 reply; 19+ messages in thread
From: Josh Poimboeuf @ 2018-05-10 12:49 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On Thu, May 10, 2018 at 02:33:03PM +0200, Jiri Slaby wrote:
> On 04/19/2018, 03:42 PM, Josh Poimboeuf wrote:
> > On Mon, Apr 16, 2018 at 05:16:53PM -0500, Josh Poimboeuf wrote:
> >> On Wed, Dec 20, 2017 at 08:07:17PM +0100, Jiri Slaby wrote:
> >>> On 12/20/2017, 06:45 PM, Josh Poimboeuf wrote:
> >>>> It might not be until 2018 though.  But in the meantime you can go ahead
> >>>> and update your patches accordingly and then we can combine them for
> >>>> testing next year.
> >>>
> >>> I already did ;). So when you have that ready, I will send it on top
> >>> right after.
> >>
> >> Sorry for the delay...  Here's a (lightly tested) patch.  Can you test
> >> it with the latest version of your patches?
> > 
> > This one actually compiles:
> > 
> > From: Josh Poimboeuf <jpoimboe@redhat.com>
> > Subject: [PATCH] x86/unwind/orc: Detect the end of the stack
> 
> With this patch applied, livepatching never completes. Kthreads are in
> the unpatched state forever. I have no details yet.

Hm, I thought I had tested that, but now I'm not sure.  Let me try it
again.

-- 
Josh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2018-05-10 12:49                 ` Josh Poimboeuf
@ 2018-05-11  7:01                   ` Jiri Slaby
  2018-05-11 15:12                     ` Josh Poimboeuf
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Slaby @ 2018-05-11  7:01 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On 05/10/2018, 02:49 PM, Josh Poimboeuf wrote:
> On Thu, May 10, 2018 at 02:33:03PM +0200, Jiri Slaby wrote:
>> On 04/19/2018, 03:42 PM, Josh Poimboeuf wrote:
>>> On Mon, Apr 16, 2018 at 05:16:53PM -0500, Josh Poimboeuf wrote:
>>>> On Wed, Dec 20, 2017 at 08:07:17PM +0100, Jiri Slaby wrote:
>>>>> On 12/20/2017, 06:45 PM, Josh Poimboeuf wrote:
>>>>>> It might not be until 2018 though.  But in the meantime you can go ahead
>>>>>> and update your patches accordingly and then we can combine them for
>>>>>> testing next year.
>>>>>
>>>>> I already did ;). So when you have that ready, I will send it on top
>>>>> right after.
>>>>
>>>> Sorry for the delay...  Here's a (lightly tested) patch.  Can you test
>>>> it with the latest version of your patches?
>>>
>>> This one actually compiles:
>>>
>>> From: Josh Poimboeuf <jpoimboe@redhat.com>
>>> Subject: [PATCH] x86/unwind/orc: Detect the end of the stack
>>
>> With this patch applied, livepatching never completes. Kthreads are in
>> the unpatched state forever. I have no details yet.
> 
> Hm, I thought I had tested that, but now I'm not sure.  Let me try it
> again.

We need to propagate end from hints to orcs. This works for me™:

commit ad05e939b5809db104528731ed0147ad59466db5
Author: Jiri Slaby <jslaby@suse.cz>
Date:   Fri May 11 09:00:12 2018 +0200

    objtool: take end hint into account

    Signed-off-by: Jiri Slaby <jslaby@suse.cz>

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 5409f6f..fef15ce 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1109,6 +1109,7 @@ static int read_unwind_hints(struct objtool_file
*file)

                cfa->offset = hint->sp_offset;
                insn->state.type = hint->type;
+               insn->state.end = hint->end;
        }

        return 0;
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index c6b68fc..000ecf3 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -30,6 +30,7 @@ struct insn_state {
        struct cfi_reg regs[CFI_NUM_REGS];
        int stack_size;
        unsigned char type;
+       unsigned char end;
        bool bp_scratch;
        bool drap;
        int drap_reg, drap_offset;
diff --git a/tools/objtool/orc_dump.c b/tools/objtool/orc_dump.c
index c334382..a9bf1861 100644
--- a/tools/objtool/orc_dump.c
+++ b/tools/objtool/orc_dump.c
@@ -203,7 +203,9 @@ int orc_dump(const char *_objname)

                print_reg(orc[i].bp_reg, orc[i].bp_offset);

-               printf(" type:%s\n", orc_type_name(orc[i].type));
+               printf(" type:%s", orc_type_name(orc[i].type));
+
+               printf(" end:%u\n", orc[i].end);
        }

        elf_end(elf);
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index 18384d9..f413cc2 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -31,6 +31,9 @@ int create_orc(struct objtool_file *file)
                struct cfi_reg *cfa = &insn->state.cfa;
                struct cfi_reg *bp = &insn->state.regs[CFI_BP];

+               //if (insn->hint)
+                       orc->end = insn->state.end;
+
                if (cfa->base == CFI_UNDEFINED) {
                        orc->sp_reg = ORC_REG_UNDEFINED;
                        continue;

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2018-05-11  7:01                   ` Jiri Slaby
@ 2018-05-11 15:12                     ` Josh Poimboeuf
  2018-05-14 11:08                       ` Jiri Slaby
  0 siblings, 1 reply; 19+ messages in thread
From: Josh Poimboeuf @ 2018-05-11 15:12 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On Fri, May 11, 2018 at 09:01:30AM +0200, Jiri Slaby wrote:
> On 05/10/2018, 02:49 PM, Josh Poimboeuf wrote:
> > On Thu, May 10, 2018 at 02:33:03PM +0200, Jiri Slaby wrote:
> >> On 04/19/2018, 03:42 PM, Josh Poimboeuf wrote:
> >>> On Mon, Apr 16, 2018 at 05:16:53PM -0500, Josh Poimboeuf wrote:
> >>>> On Wed, Dec 20, 2017 at 08:07:17PM +0100, Jiri Slaby wrote:
> >>>>> On 12/20/2017, 06:45 PM, Josh Poimboeuf wrote:
> >>>>>> It might not be until 2018 though.  But in the meantime you can go ahead
> >>>>>> and update your patches accordingly and then we can combine them for
> >>>>>> testing next year.
> >>>>>
> >>>>> I already did ;). So when you have that ready, I will send it on top
> >>>>> right after.
> >>>>
> >>>> Sorry for the delay...  Here's a (lightly tested) patch.  Can you test
> >>>> it with the latest version of your patches?
> >>>
> >>> This one actually compiles:
> >>>
> >>> From: Josh Poimboeuf <jpoimboe@redhat.com>
> >>> Subject: [PATCH] x86/unwind/orc: Detect the end of the stack
> >>
> >> With this patch applied, livepatching never completes. Kthreads are in
> >> the unpatched state forever. I have no details yet.
> > 
> > Hm, I thought I had tested that, but now I'm not sure.  Let me try it
> > again.
> 
> We need to propagate end from hints to orcs. This works for me™:

Ah, right, I forgot to connect the dots.  Here's an updated patch with a
few more tweaks.  Can you test?  I assume you're going to carry this as
part of your patches to enable HAVE_RELIABLE_STACKTRACE?


From: Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH] x86/unwind/orc: Detect the end of the stack

The existing UNWIND_HINT_EMPTY annotations happen to be good indicators
of where entry code calls into C code for the first time.  So also use
them to mark the end of the stack for the ORC unwinder.

Use that information to set unwind->error if the ORC unwinder doesn't
unwind all the way to the end.  This will be needed for enabling
HAVE_RELIABLE_STACKTRACE for the ORC unwinder so we can use it with the
livepatch consistency model.

Thanks to Jiri Slaby for teaching the ORCs.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/entry/entry_64.S                     |  1 +
 arch/x86/include/asm/orc_types.h              |  2 +
 arch/x86/include/asm/unwind_hints.h           | 16 +++---
 arch/x86/kernel/unwind_orc.c                  | 52 +++++++++++--------
 .../objtool/arch/x86/include/asm/orc_types.h  |  2 +
 tools/objtool/check.c                         |  1 +
 tools/objtool/check.h                         |  2 +-
 tools/objtool/orc_dump.c                      |  3 +-
 tools/objtool/orc_gen.c                       |  1 +
 9 files changed, 51 insertions(+), 29 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ab4f451aeb97..e35dbc507425 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -408,6 +408,7 @@ ENTRY(ret_from_fork)
 
 1:
 	/* kernel thread */
+	UNWIND_HINT_EMPTY
 	movq	%r12, %rdi
 	CALL_NOSPEC %rbx
 	/*
diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h
index 9c9dc579bd7d..46f516dd80ce 100644
--- a/arch/x86/include/asm/orc_types.h
+++ b/arch/x86/include/asm/orc_types.h
@@ -88,6 +88,7 @@ struct orc_entry {
 	unsigned	sp_reg:4;
 	unsigned	bp_reg:4;
 	unsigned	type:2;
+	unsigned	end:1;
 } __packed;
 
 /*
@@ -101,6 +102,7 @@ struct unwind_hint {
 	s16		sp_offset;
 	u8		sp_reg;
 	u8		type;
+	u8		end;
 };
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
index bae46fc6b9de..0bcdb1279361 100644
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -26,7 +26,7 @@
  * the debuginfo as necessary.  It will also warn if it sees any
  * inconsistencies.
  */
-.macro UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=0 type=ORC_TYPE_CALL
+.macro UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=0 type=ORC_TYPE_CALL end=0
 #ifdef CONFIG_STACK_VALIDATION
 .Lunwind_hint_ip_\@:
 	.pushsection .discard.unwind_hints
@@ -35,12 +35,14 @@
 		.short \sp_offset
 		.byte \sp_reg
 		.byte \type
+		.byte \end
+		.balign 4
 	.popsection
 #endif
 .endm
 
 .macro UNWIND_HINT_EMPTY
-	UNWIND_HINT sp_reg=ORC_REG_UNDEFINED
+	UNWIND_HINT sp_reg=ORC_REG_UNDEFINED end=1
 .endm
 
 .macro UNWIND_HINT_REGS base=%rsp offset=0 indirect=0 extra=1 iret=0
@@ -86,19 +88,21 @@
 
 #else /* !__ASSEMBLY__ */
 
-#define UNWIND_HINT(sp_reg, sp_offset, type)			\
+#define UNWIND_HINT(sp_reg, sp_offset, type, end)		\
 	"987: \n\t"						\
 	".pushsection .discard.unwind_hints\n\t"		\
 	/* struct unwind_hint */				\
 	".long 987b - .\n\t"					\
-	".short " __stringify(sp_offset) "\n\t"		\
+	".short " __stringify(sp_offset) "\n\t"			\
 	".byte " __stringify(sp_reg) "\n\t"			\
 	".byte " __stringify(type) "\n\t"			\
+	".byte " __stringify(end) "\n\t"			\
+	".balign 4 \n\t"					\
 	".popsection\n\t"
 
-#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE)
+#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE, 0)
 
-#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE)
+#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE, 0)
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index feb28fee6cea..26038eacf74a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -198,7 +198,7 @@ static int orc_sort_cmp(const void *_a, const void *_b)
 	 * whitelisted .o files which didn't get objtool generation.
 	 */
 	orc_a = cur_orc_table + (a - cur_orc_ip_table);
-	return orc_a->sp_reg == ORC_REG_UNDEFINED ? -1 : 1;
+	return orc_a->sp_reg == ORC_REG_UNDEFINED && !orc_a->end ? -1 : 1;
 }
 
 #ifdef CONFIG_MODULES
@@ -352,7 +352,7 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr
 
 bool unwind_next_frame(struct unwind_state *state)
 {
-	unsigned long ip_p, sp, orig_ip, prev_sp = state->sp;
+	unsigned long ip_p, sp, orig_ip = state->ip, prev_sp = state->sp;
 	enum stack_type prev_type = state->stack_info.type;
 	struct orc_entry *orc;
 	bool indirect = false;
@@ -363,9 +363,9 @@ bool unwind_next_frame(struct unwind_state *state)
 	/* Don't let modules unload while we're reading their ORC data. */
 	preempt_disable();
 
-	/* Have we reached the end? */
+	/* End-of-stack check for user tasks: */
 	if (state->regs && user_mode(state->regs))
-		goto done;
+		goto the_end;
 
 	/*
 	 * Find the orc_entry associated with the text address.
@@ -374,9 +374,16 @@ bool unwind_next_frame(struct unwind_state *state)
 	 * calls and calls to noreturn functions.
 	 */
 	orc = orc_find(state->signal ? state->ip : state->ip - 1);
-	if (!orc || orc->sp_reg == ORC_REG_UNDEFINED)
-		goto done;
-	orig_ip = state->ip;
+	if (!orc)
+		goto err;
+
+	/* End-of-stack check for kernel threads: */
+	if (orc->sp_reg == ORC_REG_UNDEFINED) {
+		if (!orc->end)
+			goto err;
+
+		goto the_end;
+	}
 
 	/* Find the previous frame's stack: */
 	switch (orc->sp_reg) {
@@ -402,7 +409,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg R10 at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->r10;
 		break;
@@ -411,7 +418,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg R13 at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->r13;
 		break;
@@ -420,7 +427,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg DI at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->di;
 		break;
@@ -429,7 +436,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg DX at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->dx;
 		break;
@@ -437,12 +444,12 @@ bool unwind_next_frame(struct unwind_state *state)
 	default:
 		orc_warn("unknown SP base reg %d for ip %pB\n",
 			 orc->sp_reg, (void *)state->ip);
-		goto done;
+		goto err;
 	}
 
 	if (indirect) {
 		if (!deref_stack_reg(state, sp, &sp))
-			goto done;
+			goto err;
 	}
 
 	/* Find IP, SP and possibly regs: */
@@ -451,7 +458,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		ip_p = sp - sizeof(long);
 
 		if (!deref_stack_reg(state, ip_p, &state->ip))
-			goto done;
+			goto err;
 
 		state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx,
 						  state->ip, (void *)ip_p);
@@ -465,7 +472,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!deref_stack_regs(state, sp, &state->ip, &state->sp)) {
 			orc_warn("can't dereference registers at %p for ip %pB\n",
 				 (void *)sp, (void *)orig_ip);
-			goto done;
+			goto err;
 		}
 
 		state->regs = (struct pt_regs *)sp;
@@ -477,7 +484,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!deref_stack_iret_regs(state, sp, &state->ip, &state->sp)) {
 			orc_warn("can't dereference iret registers at %p for ip %pB\n",
 				 (void *)sp, (void *)orig_ip);
-			goto done;
+			goto err;
 		}
 
 		state->regs = (void *)sp - IRET_FRAME_OFFSET;
@@ -500,18 +507,18 @@ bool unwind_next_frame(struct unwind_state *state)
 
 	case ORC_REG_PREV_SP:
 		if (!deref_stack_reg(state, sp + orc->bp_offset, &state->bp))
-			goto done;
+			goto err;
 		break;
 
 	case ORC_REG_BP:
 		if (!deref_stack_reg(state, state->bp + orc->bp_offset, &state->bp))
-			goto done;
+			goto err;
 		break;
 
 	default:
 		orc_warn("unknown BP base reg %d for ip %pB\n",
 			 orc->bp_reg, (void *)orig_ip);
-		goto done;
+		goto err;
 	}
 
 	/* Prevent a recursive loop due to bad ORC data: */
@@ -520,13 +527,16 @@ bool unwind_next_frame(struct unwind_state *state)
 	    state->sp <= prev_sp) {
 		orc_warn("stack going in the wrong direction? ip=%pB\n",
 			 (void *)orig_ip);
-		goto done;
+		goto err;
 	}
 
 	preempt_enable();
 	return true;
 
-done:
+err:
+	state->error = true;
+
+the_end:
 	preempt_enable();
 	state->stack_info.type = STACK_TYPE_UNKNOWN;
 	return false;
diff --git a/tools/objtool/arch/x86/include/asm/orc_types.h b/tools/objtool/arch/x86/include/asm/orc_types.h
index 9c9dc579bd7d..46f516dd80ce 100644
--- a/tools/objtool/arch/x86/include/asm/orc_types.h
+++ b/tools/objtool/arch/x86/include/asm/orc_types.h
@@ -88,6 +88,7 @@ struct orc_entry {
 	unsigned	sp_reg:4;
 	unsigned	bp_reg:4;
 	unsigned	type:2;
+	unsigned	end:1;
 } __packed;
 
 /*
@@ -101,6 +102,7 @@ struct unwind_hint {
 	s16		sp_offset;
 	u8		sp_reg;
 	u8		type;
+	u8		end;
 };
 #endif /* __ASSEMBLY__ */
 
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 5409f6f6c48d..fef15cea9f1e 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1109,6 +1109,7 @@ static int read_unwind_hints(struct objtool_file *file)
 
 		cfa->offset = hint->sp_offset;
 		insn->state.type = hint->type;
+		insn->state.end = hint->end;
 	}
 
 	return 0;
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index c6b68fcb926f..95700a2bcb7c 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -31,7 +31,7 @@ struct insn_state {
 	int stack_size;
 	unsigned char type;
 	bool bp_scratch;
-	bool drap;
+	bool drap, end;
 	int drap_reg, drap_offset;
 	struct cfi_reg vals[CFI_NUM_REGS];
 };
diff --git a/tools/objtool/orc_dump.c b/tools/objtool/orc_dump.c
index c3343820916a..faa444270ee3 100644
--- a/tools/objtool/orc_dump.c
+++ b/tools/objtool/orc_dump.c
@@ -203,7 +203,8 @@ int orc_dump(const char *_objname)
 
 		print_reg(orc[i].bp_reg, orc[i].bp_offset);
 
-		printf(" type:%s\n", orc_type_name(orc[i].type));
+		printf(" type:%s end:%d\n",
+		       orc_type_name(orc[i].type), orc[i].end);
 	}
 
 	elf_end(elf);
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index 18384d9be4e1..229c12e3fcf6 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -86,6 +86,7 @@ int create_orc(struct objtool_file *file)
 		orc->sp_offset = cfa->offset;
 		orc->bp_offset = bp->offset;
 		orc->type = insn->state.type;
+		orc->end = insn->state.end;
 	}
 
 	return 0;
-- 
2.17.0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2018-05-11 15:12                     ` Josh Poimboeuf
@ 2018-05-14 11:08                       ` Jiri Slaby
  2018-05-14 13:42                         ` Josh Poimboeuf
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Slaby @ 2018-05-14 11:08 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On 05/11/2018, 05:12 PM, Josh Poimboeuf wrote:
> I assume you're going to carry this as
> part of your patches to enable HAVE_RELIABLE_STACKTRACE?

Sure.

> --- a/tools/objtool/orc_dump.c
> +++ b/tools/objtool/orc_dump.c
> @@ -203,7 +203,8 @@ int orc_dump(const char *_objname)
>  
>  		print_reg(orc[i].bp_reg, orc[i].bp_offset);
>  
> -		printf(" type:%s\n", orc_type_name(orc[i].type));
> +		printf(" type:%s end:%d\n",
> +		       orc_type_name(orc[i].type), orc[i].end);

Maybe just " type:%s%s\n" and »orc[i].end ? " END" : ""«?

>  	}
>  
>  	elf_end(elf);
> diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
> index 18384d9be4e1..229c12e3fcf6 100644
> --- a/tools/objtool/orc_gen.c
> +++ b/tools/objtool/orc_gen.c
> @@ -86,6 +86,7 @@ int create_orc(struct objtool_file *file)
>  		orc->sp_offset = cfa->offset;
>  		orc->bp_offset = bp->offset;
>  		orc->type = insn->state.type;
> +		orc->end = insn->state.end;

So you moved the assignment below of 'if (cfa->base == CFI_UNDEFINED)' –
this assignment will never happen, as I have just verified.

--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -31,6 +31,8 @@ int create_orc(struct objtool_file *file)
                struct cfi_reg *cfa = &insn->state.cfa;
                struct cfi_reg *bp = &insn->state.regs[CFI_BP];

+               orc->end = insn->state.end;
+
                if (cfa->base == CFI_UNDEFINED) {
                        orc->sp_reg = ORC_REG_UNDEFINED;
                        continue;
@@ -86,7 +88,6 @@ int create_orc(struct objtool_file *file)
                orc->sp_offset = cfa->offset;
                orc->bp_offset = bp->offset;
                orc->type = insn->state.type;
-               orc->end = insn->state.end;
        }

        return 0;


thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2018-05-14 11:08                       ` Jiri Slaby
@ 2018-05-14 13:42                         ` Josh Poimboeuf
  2018-05-14 14:00                           ` Jiri Slaby
  0 siblings, 1 reply; 19+ messages in thread
From: Josh Poimboeuf @ 2018-05-14 13:42 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On Mon, May 14, 2018 at 01:08:40PM +0200, Jiri Slaby wrote:
> On 05/11/2018, 05:12 PM, Josh Poimboeuf wrote:
> > I assume you're going to carry this as
> > part of your patches to enable HAVE_RELIABLE_STACKTRACE?
> 
> Sure.
> 
> > --- a/tools/objtool/orc_dump.c
> > +++ b/tools/objtool/orc_dump.c
> > @@ -203,7 +203,8 @@ int orc_dump(const char *_objname)
> >  
> >  		print_reg(orc[i].bp_reg, orc[i].bp_offset);
> >  
> > -		printf(" type:%s\n", orc_type_name(orc[i].type));
> > +		printf(" type:%s end:%d\n",
> > +		       orc_type_name(orc[i].type), orc[i].end);
> 
> Maybe just " type:%s%s\n" and »orc[i].end ? " END" : ""«?

All those 'end:0' strings are a bit repetitive, but I think that's ok.
'type:call' already has the same issue.  I'd rather keep the formatting
consistent with the rest of the printed variables.

> >  	}
> >  
> >  	elf_end(elf);
> > diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
> > index 18384d9be4e1..229c12e3fcf6 100644
> > --- a/tools/objtool/orc_gen.c
> > +++ b/tools/objtool/orc_gen.c
> > @@ -86,6 +86,7 @@ int create_orc(struct objtool_file *file)
> >  		orc->sp_offset = cfa->offset;
> >  		orc->bp_offset = bp->offset;
> >  		orc->type = insn->state.type;
> > +		orc->end = insn->state.end;
> 
> So you moved the assignment below of 'if (cfa->base == CFI_UNDEFINED)' –
> this assignment will never happen, as I have just verified.
> 
> --- a/tools/objtool/orc_gen.c
> +++ b/tools/objtool/orc_gen.c
> @@ -31,6 +31,8 @@ int create_orc(struct objtool_file *file)
>                 struct cfi_reg *cfa = &insn->state.cfa;
>                 struct cfi_reg *bp = &insn->state.regs[CFI_BP];
> 
> +               orc->end = insn->state.end;
> +
>                 if (cfa->base == CFI_UNDEFINED) {
>                         orc->sp_reg = ORC_REG_UNDEFINED;
>                         continue;
> @@ -86,7 +88,6 @@ int create_orc(struct objtool_file *file)
>                 orc->sp_offset = cfa->offset;
>                 orc->bp_offset = bp->offset;
>                 orc->type = insn->state.type;
> -               orc->end = insn->state.end;
>         }
> 
>         return 0;

Oops.  How about this one:

From: Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH v2] x86/unwind/orc: Detect the end of the stack

The existing UNWIND_HINT_EMPTY annotations happen to be good indicators
of where entry code calls into C code for the first time.  So also use
them to mark the end of the stack for the ORC unwinder.

Use that information to set unwind->error if the ORC unwinder doesn't
unwind all the way to the end.  This will be needed for enabling
HAVE_RELIABLE_STACKTRACE for the ORC unwinder so we can use it with the
livepatch consistency model.

Thanks to Jiri Slaby for teaching the ORCs about the unwind hints.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/entry/entry_64.S                     |  1 +
 arch/x86/include/asm/orc_types.h              |  2 +
 arch/x86/include/asm/unwind_hints.h           | 16 +++---
 arch/x86/kernel/unwind_orc.c                  | 52 +++++++++++--------
 .../objtool/arch/x86/include/asm/orc_types.h  |  2 +
 tools/objtool/check.c                         |  1 +
 tools/objtool/check.h                         |  2 +-
 tools/objtool/orc_dump.c                      |  3 +-
 tools/objtool/orc_gen.c                       |  2 +
 9 files changed, 52 insertions(+), 29 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ab4f451aeb97..e35dbc507425 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -408,6 +408,7 @@ ENTRY(ret_from_fork)
 
 1:
 	/* kernel thread */
+	UNWIND_HINT_EMPTY
 	movq	%r12, %rdi
 	CALL_NOSPEC %rbx
 	/*
diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h
index 9c9dc579bd7d..46f516dd80ce 100644
--- a/arch/x86/include/asm/orc_types.h
+++ b/arch/x86/include/asm/orc_types.h
@@ -88,6 +88,7 @@ struct orc_entry {
 	unsigned	sp_reg:4;
 	unsigned	bp_reg:4;
 	unsigned	type:2;
+	unsigned	end:1;
 } __packed;
 
 /*
@@ -101,6 +102,7 @@ struct unwind_hint {
 	s16		sp_offset;
 	u8		sp_reg;
 	u8		type;
+	u8		end;
 };
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
index bae46fc6b9de..0bcdb1279361 100644
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -26,7 +26,7 @@
  * the debuginfo as necessary.  It will also warn if it sees any
  * inconsistencies.
  */
-.macro UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=0 type=ORC_TYPE_CALL
+.macro UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=0 type=ORC_TYPE_CALL end=0
 #ifdef CONFIG_STACK_VALIDATION
 .Lunwind_hint_ip_\@:
 	.pushsection .discard.unwind_hints
@@ -35,12 +35,14 @@
 		.short \sp_offset
 		.byte \sp_reg
 		.byte \type
+		.byte \end
+		.balign 4
 	.popsection
 #endif
 .endm
 
 .macro UNWIND_HINT_EMPTY
-	UNWIND_HINT sp_reg=ORC_REG_UNDEFINED
+	UNWIND_HINT sp_reg=ORC_REG_UNDEFINED end=1
 .endm
 
 .macro UNWIND_HINT_REGS base=%rsp offset=0 indirect=0 extra=1 iret=0
@@ -86,19 +88,21 @@
 
 #else /* !__ASSEMBLY__ */
 
-#define UNWIND_HINT(sp_reg, sp_offset, type)			\
+#define UNWIND_HINT(sp_reg, sp_offset, type, end)		\
 	"987: \n\t"						\
 	".pushsection .discard.unwind_hints\n\t"		\
 	/* struct unwind_hint */				\
 	".long 987b - .\n\t"					\
-	".short " __stringify(sp_offset) "\n\t"		\
+	".short " __stringify(sp_offset) "\n\t"			\
 	".byte " __stringify(sp_reg) "\n\t"			\
 	".byte " __stringify(type) "\n\t"			\
+	".byte " __stringify(end) "\n\t"			\
+	".balign 4 \n\t"					\
 	".popsection\n\t"
 
-#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE)
+#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE, 0)
 
-#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE)
+#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE, 0)
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index feb28fee6cea..26038eacf74a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -198,7 +198,7 @@ static int orc_sort_cmp(const void *_a, const void *_b)
 	 * whitelisted .o files which didn't get objtool generation.
 	 */
 	orc_a = cur_orc_table + (a - cur_orc_ip_table);
-	return orc_a->sp_reg == ORC_REG_UNDEFINED ? -1 : 1;
+	return orc_a->sp_reg == ORC_REG_UNDEFINED && !orc_a->end ? -1 : 1;
 }
 
 #ifdef CONFIG_MODULES
@@ -352,7 +352,7 @@ static bool deref_stack_iret_regs(struct unwind_state *state, unsigned long addr
 
 bool unwind_next_frame(struct unwind_state *state)
 {
-	unsigned long ip_p, sp, orig_ip, prev_sp = state->sp;
+	unsigned long ip_p, sp, orig_ip = state->ip, prev_sp = state->sp;
 	enum stack_type prev_type = state->stack_info.type;
 	struct orc_entry *orc;
 	bool indirect = false;
@@ -363,9 +363,9 @@ bool unwind_next_frame(struct unwind_state *state)
 	/* Don't let modules unload while we're reading their ORC data. */
 	preempt_disable();
 
-	/* Have we reached the end? */
+	/* End-of-stack check for user tasks: */
 	if (state->regs && user_mode(state->regs))
-		goto done;
+		goto the_end;
 
 	/*
 	 * Find the orc_entry associated with the text address.
@@ -374,9 +374,16 @@ bool unwind_next_frame(struct unwind_state *state)
 	 * calls and calls to noreturn functions.
 	 */
 	orc = orc_find(state->signal ? state->ip : state->ip - 1);
-	if (!orc || orc->sp_reg == ORC_REG_UNDEFINED)
-		goto done;
-	orig_ip = state->ip;
+	if (!orc)
+		goto err;
+
+	/* End-of-stack check for kernel threads: */
+	if (orc->sp_reg == ORC_REG_UNDEFINED) {
+		if (!orc->end)
+			goto err;
+
+		goto the_end;
+	}
 
 	/* Find the previous frame's stack: */
 	switch (orc->sp_reg) {
@@ -402,7 +409,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg R10 at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->r10;
 		break;
@@ -411,7 +418,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg R13 at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->r13;
 		break;
@@ -420,7 +427,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg DI at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->di;
 		break;
@@ -429,7 +436,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!state->regs || !state->full_regs) {
 			orc_warn("missing regs for base reg DX at ip %pB\n",
 				 (void *)state->ip);
-			goto done;
+			goto err;
 		}
 		sp = state->regs->dx;
 		break;
@@ -437,12 +444,12 @@ bool unwind_next_frame(struct unwind_state *state)
 	default:
 		orc_warn("unknown SP base reg %d for ip %pB\n",
 			 orc->sp_reg, (void *)state->ip);
-		goto done;
+		goto err;
 	}
 
 	if (indirect) {
 		if (!deref_stack_reg(state, sp, &sp))
-			goto done;
+			goto err;
 	}
 
 	/* Find IP, SP and possibly regs: */
@@ -451,7 +458,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		ip_p = sp - sizeof(long);
 
 		if (!deref_stack_reg(state, ip_p, &state->ip))
-			goto done;
+			goto err;
 
 		state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx,
 						  state->ip, (void *)ip_p);
@@ -465,7 +472,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!deref_stack_regs(state, sp, &state->ip, &state->sp)) {
 			orc_warn("can't dereference registers at %p for ip %pB\n",
 				 (void *)sp, (void *)orig_ip);
-			goto done;
+			goto err;
 		}
 
 		state->regs = (struct pt_regs *)sp;
@@ -477,7 +484,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!deref_stack_iret_regs(state, sp, &state->ip, &state->sp)) {
 			orc_warn("can't dereference iret registers at %p for ip %pB\n",
 				 (void *)sp, (void *)orig_ip);
-			goto done;
+			goto err;
 		}
 
 		state->regs = (void *)sp - IRET_FRAME_OFFSET;
@@ -500,18 +507,18 @@ bool unwind_next_frame(struct unwind_state *state)
 
 	case ORC_REG_PREV_SP:
 		if (!deref_stack_reg(state, sp + orc->bp_offset, &state->bp))
-			goto done;
+			goto err;
 		break;
 
 	case ORC_REG_BP:
 		if (!deref_stack_reg(state, state->bp + orc->bp_offset, &state->bp))
-			goto done;
+			goto err;
 		break;
 
 	default:
 		orc_warn("unknown BP base reg %d for ip %pB\n",
 			 orc->bp_reg, (void *)orig_ip);
-		goto done;
+		goto err;
 	}
 
 	/* Prevent a recursive loop due to bad ORC data: */
@@ -520,13 +527,16 @@ bool unwind_next_frame(struct unwind_state *state)
 	    state->sp <= prev_sp) {
 		orc_warn("stack going in the wrong direction? ip=%pB\n",
 			 (void *)orig_ip);
-		goto done;
+		goto err;
 	}
 
 	preempt_enable();
 	return true;
 
-done:
+err:
+	state->error = true;
+
+the_end:
 	preempt_enable();
 	state->stack_info.type = STACK_TYPE_UNKNOWN;
 	return false;
diff --git a/tools/objtool/arch/x86/include/asm/orc_types.h b/tools/objtool/arch/x86/include/asm/orc_types.h
index 9c9dc579bd7d..46f516dd80ce 100644
--- a/tools/objtool/arch/x86/include/asm/orc_types.h
+++ b/tools/objtool/arch/x86/include/asm/orc_types.h
@@ -88,6 +88,7 @@ struct orc_entry {
 	unsigned	sp_reg:4;
 	unsigned	bp_reg:4;
 	unsigned	type:2;
+	unsigned	end:1;
 } __packed;
 
 /*
@@ -101,6 +102,7 @@ struct unwind_hint {
 	s16		sp_offset;
 	u8		sp_reg;
 	u8		type;
+	u8		end;
 };
 #endif /* __ASSEMBLY__ */
 
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 5409f6f6c48d..fef15cea9f1e 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1109,6 +1109,7 @@ static int read_unwind_hints(struct objtool_file *file)
 
 		cfa->offset = hint->sp_offset;
 		insn->state.type = hint->type;
+		insn->state.end = hint->end;
 	}
 
 	return 0;
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index c6b68fcb926f..95700a2bcb7c 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -31,7 +31,7 @@ struct insn_state {
 	int stack_size;
 	unsigned char type;
 	bool bp_scratch;
-	bool drap;
+	bool drap, end;
 	int drap_reg, drap_offset;
 	struct cfi_reg vals[CFI_NUM_REGS];
 };
diff --git a/tools/objtool/orc_dump.c b/tools/objtool/orc_dump.c
index c3343820916a..faa444270ee3 100644
--- a/tools/objtool/orc_dump.c
+++ b/tools/objtool/orc_dump.c
@@ -203,7 +203,8 @@ int orc_dump(const char *_objname)
 
 		print_reg(orc[i].bp_reg, orc[i].bp_offset);
 
-		printf(" type:%s\n", orc_type_name(orc[i].type));
+		printf(" type:%s end:%d\n",
+		       orc_type_name(orc[i].type), orc[i].end);
 	}
 
 	elf_end(elf);
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index 18384d9be4e1..3f98dcfbc177 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -31,6 +31,8 @@ int create_orc(struct objtool_file *file)
 		struct cfi_reg *cfa = &insn->state.cfa;
 		struct cfi_reg *bp = &insn->state.regs[CFI_BP];
 
+		orc->end = insn->state.end;
+
 		if (cfa->base == CFI_UNDEFINED) {
 			orc->sp_reg = ORC_REG_UNDEFINED;
 			continue;
-- 
2.17.0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC
  2018-05-14 13:42                         ` Josh Poimboeuf
@ 2018-05-14 14:00                           ` Jiri Slaby
  0 siblings, 0 replies; 19+ messages in thread
From: Jiri Slaby @ 2018-05-14 14:00 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: mingo, linux-kernel, Thomas Gleixner, H. Peter Anvin, x86

On 05/14/2018, 03:42 PM, Josh Poimboeuf wrote:
> Oops.  How about this one:

Yup, we are there.

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-05-14 14:00 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-30  8:03 [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC Jiri Slaby
2017-11-30  8:03 ` [PATCH 2/2] x86/stacktrace: make sure reliable stack trace reached userspace Jiri Slaby
2017-11-30 19:57 ` [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC Josh Poimboeuf
2017-11-30 19:59   ` Josh Poimboeuf
2017-12-01  7:34     ` Jiri Slaby
2017-12-14 21:58   ` Jiri Slaby
2017-12-18  3:37     ` Josh Poimboeuf
2017-12-18  6:59       ` Jiri Slaby
2017-12-20 17:45       ` Josh Poimboeuf
2017-12-20 19:07         ` Jiri Slaby
2018-04-16 22:16           ` Josh Poimboeuf
2018-04-19 13:42             ` Josh Poimboeuf
2018-05-10 12:33               ` Jiri Slaby
2018-05-10 12:49                 ` Josh Poimboeuf
2018-05-11  7:01                   ` Jiri Slaby
2018-05-11 15:12                     ` Josh Poimboeuf
2018-05-14 11:08                       ` Jiri Slaby
2018-05-14 13:42                         ` Josh Poimboeuf
2018-05-14 14:00                           ` Jiri Slaby

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).