LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Masami Hiramatsu <mhiramat@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>,
linux-kernel@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Andy Lutomirski <luto@kernel.org>,
Nicolai Stange <nstange@suse.de>,
Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>,
x86@kernel.org, Jiri Kosina <jikos@kernel.org>,
Miroslav Benes <mbenes@suse.cz>, Petr Mladek <pmladek@suse.com>,
Joe Lawrence <joe.lawrence@redhat.com>,
Shuah Khan <shuah@kernel.org>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Tim Chen <tim.c.chen@linux.intel.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Mimi Zohar <zohar@linux.ibm.com>, Juergen Gross <jgross@suse.com>,
Nick Desaulniers <ndesaulniers@google.com>,
Nayna Jain <nayna@linux.ibm.com>,
Masahiro Yamada <yamada.masahiro@socionext.com>,
Joerg Roedel <jroedel@suse.de>,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH 2/4] x86/kprobes: Fix frame pointer annotations
Date: Thu, 9 May 2019 23:01:06 +0900 [thread overview]
Message-ID: <20190509230106.3551b08553440d125e437f66@kernel.org> (raw)
In-Reply-To: <20190509081431.GO2589@hirez.programming.kicks-ass.net>
On Thu, 9 May 2019 10:14:31 +0200
Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, May 09, 2019 at 10:20:30AM +0900, Masami Hiramatsu wrote:
> > Hi Josh,
> >
> > On Wed, 8 May 2019 13:48:48 -0500
> > Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > > On Wed, May 08, 2019 at 05:39:07PM +0200, Peter Zijlstra wrote:
> > > > On Wed, May 08, 2019 at 07:42:48AM -0500, Josh Poimboeuf wrote:
> > > > > On Wed, May 08, 2019 at 02:04:16PM +0200, Peter Zijlstra wrote:
> > > >
> > > > > > Do the x86_64 variants also want some ORC annotation?
> > > > >
> > > > > Maybe so. Though it looks like regs->ip isn't saved. The saved
> > > > > registers might need to be tweaked. I'll need to look into it.
> > > >
> > > > What all these sites do (and maybe we should look at unifying them
> > > > somehow) is turn a CALL frame (aka RET-IP) into an exception frame (aka
> > > > pt_regs).
> > > >
> > > > So regs->ip will be the return address (which is fixed up to be the CALL
> > > > address in the handler).
> > >
> > > But from what I can tell, trampoline_handler() hard-codes regs->ip to
> > > point to kretprobe_trampoline(), and the original return address is
> > > placed in regs->sp.
> > >
> > > Masami, is there a reason why regs->ip doesn't have the original return
> > > address and regs->sp doesn't have the original SP? I think that would
> > > help the unwinder understand things.
> >
> > Yes, for regs->ip, there is a histrical reason. Since previously, we had
> > an int3 at trampoline, so the user (kretprobe) handler expects that
> > regs->ip is trampoline address and ri->ret_addr is original return address.
> > It is better to check the other archs, but I think it is possible to
> > change the regs->ip to original return address, since no one cares such
> > "fixed address". :)
> >
> > For the regs->sp, there are 2 reasons.
> >
> > For x86-64, it's just for over-optimizing (reduce stack usage).
> > I think we can make a gap for putting return address, something like
> >
> > "kretprobe_trampoline:\n"
> > #ifdef CONFIG_X86_64
> > " pushq %rsp\n" /* Make a gap for return address */
> > " pushq 0(%rsp)\n" /* Copy original stack pointer */
> > " pushfq\n"
> > SAVE_REGS_STRING
> > " movq %rsp, %rdi\n"
> > " call trampoline_handler\n"
> > /* Push the true return address to the bottom */
> > " movq %rax, 20*8(%rsp)\n"
> > RESTORE_REGS_STRING
> > " popfq\n"
> > " addq $8, %rsp\n" /* Skip original stack pointer */
> >
> > For i386 (x86-32), there is no other way to keep ®s->sp as
> > the original stack pointer. It has to be changed with this series,
> > maybe as same as x86-64.
>
> Right; I already fixed that in my patch changing i386's pt_regs.
I see it, and it is good to me. :)
> But what I'd love to do is something like the belwo patch, and make all
> the trampolines (very much including ftrace) use that. Such that we then
> only have 1 copy of this magic (well, 2 because x86_64 also needs an
> implementation of this of course).
OK, but I will make kretprobe integrated with func-graph tracer,
since it is inefficient that we have 2 different hidden return stack...
Anyway,
> Changing ftrace over to this would be a little more work but it can
> easily chain things a little to get its original context back:
>
> ENTRY(ftrace_regs_caller)
> GLOBAL(ftrace_regs_func)
> push ftrace_stub
> push ftrace_regs_handler
> jmp call_to_exception_trampoline
> END(ftrace_regs_caller)
>
> typedef void (*ftrace_func_t)(unsigned long, unsigned long, struct ftrace_op *, struct pt_regs *);
>
> struct ftrace_regs_stack {
> ftrace_func_t func;
> unsigned long parent_ip;
> };
>
> void ftrace_regs_handler(struct pr_regs *regs)
> {
> struct ftrace_regs_stack *st = (void *)regs->sp;
> ftrace_func_t func = st->func;
>
> regs->sp += sizeof(long); /* pop func */
Sorry, why pop here?
>
> func(regs->ip, st->parent_ip, function_trace_op, regs);
> }
>
> Hmm? I didn't look into the function_graph thing, but I imagine it can
> be added without too much pain.
Yes, that should be good for function_graph trampoline too.
We use very similar technic.
>
> ---
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -1576,3 +1576,100 @@ ENTRY(rewind_stack_do_exit)
> call do_exit
> 1: jmp 1b
> END(rewind_stack_do_exit)
> +
> +/*
> + * Transforms a CALL frame into an exception frame; IOW it pretends the CALL we
> + * just did was in fact scribbled with an INT3.
> + *
> + * Use this trampoline like:
> + *
> + * PUSH $func
> + * JMP call_to_exception_trampoline
> + *
> + * $func will see regs->ip point at the CALL instruction and must therefore
> + * modify regs->ip in order to make progress (just like a normal INT3 scribbled
> + * CALL).
> + *
> + * NOTE: we do not restore any of the segment registers.
> + */
> +ENTRY(call_to_exception_trampoline)
> + /*
> + * On entry the stack looks like:
> + *
> + * 2*4(%esp) <previous context>
> + * 1*4(%esp) RET-IP
> + * 0*4(%esp) func
> + *
> + * transform this into:
> + *
> + * 19*4(%esp) <previous context>
> + * 18*4(%esp) gap / RET-IP
> + * 17*4(%esp) gap / func
> + * 16*4(%esp) ss
> + * 15*415*4(%esp) sp / <previous context>
isn't this "&<previous context>" ?
> + * 14*4(%esp) flags
> + * 13*4(%esp) cs
> + * 12*4(%esp) ip / RET-IP
> + * 11*4(%esp) orig_eax
> + * 10*4(%esp) gs
> + * 9*4(%esp) fs
> + * 8*4(%esp) es
> + * 7*4(%esp) ds
> + * 6*4(%esp) eax
> + * 5*4(%esp) ebp
> + * 4*4(%esp) edi
> + * 3*4(%esp) esi
> + * 2*4(%esp) edx
> + * 1*4(%esp) ecx
> + * 0*4(%esp) ebx
> + */
> + pushl %ss
> + pushl %esp # points at ss
> + addl $3*4, (%esp) # point it at <previous context>
> + pushfl
> + pushl %cs
> + pushl 5*4(%esp) # RET-IP
> + subl 5, (%esp) # point at CALL instruction
> + pushl $-1
> + pushl %gs
> + pushl %fs
> + pushl %es
> + pushl %ds
> + pushl %eax
> + pushl %ebp
> + pushl %edi
> + pushl %esi
> + pushl %edx
> + pushl %ecx
> + pushl %ebx
> +
> + ENCODE_FRAME_POINTER
> +
> + movl %esp, %eax # 1st argument: pt_regs
> +
> + movl 17*4(%esp), %ebx # func
> + CALL_NOSPEC %ebx
> +
> + movl PT_OLDESP(%esp), %eax
Is PT_OLDESP(%esp) "<previous context>" or "&<previous contex>"?
> +
> + movl PT_EIP(%esp), %ecx
> + movl %ecx, -1*4(%eax)
Ah, OK, so $func must set the true return address to regs->ip
instead of returning it.
> +
> + movl PT_EFLAGS(%esp), %ecx
> + movl %ecx, -2*4(%eax)
> +
> + movl PT_EAX(%esp), %ecx
> + movl %ecx, -3*4(%eax)
So, at this point, the stack becomes
18*4(%esp) RET-IP
17*4(%esp) eflags
16*4(%esp) eax
Correct?
> +
> + popl %ebx
> + popl %ecx
> + popl %edx
> + popl %esi
> + popl %edi
> + popl %ebp
> +
> + lea -3*4(%eax), %esp
> + popl %eax
> + popfl
> + ret
> +END(call_to_exception_trampoline)
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -731,29 +731,8 @@ asm(
> ".global kretprobe_trampoline\n"
> ".type kretprobe_trampoline, @function\n"
> "kretprobe_trampoline:\n"
> - /* We don't bother saving the ss register */
> -#ifdef CONFIG_X86_64
> - " pushq %rsp\n"
> - " pushfq\n"
> - SAVE_REGS_STRING
> - " movq %rsp, %rdi\n"
> - " call trampoline_handler\n"
> - /* Replace saved sp with true return address. */
> - " movq %rax, 19*8(%rsp)\n"
> - RESTORE_REGS_STRING
> - " popfq\n"
> -#else
> - " pushl %esp\n"
> - " pushfl\n"
> - SAVE_REGS_STRING
> - " movl %esp, %eax\n"
> - " call trampoline_handler\n"
> - /* Replace saved sp with true return address. */
> - " movl %eax, 15*4(%esp)\n"
> - RESTORE_REGS_STRING
> - " popfl\n"
> -#endif
> - " ret\n"
Here, we need a gap for storing ret-ip, because kretprobe_trampoline is
the address which is returned from the target function. We have no
"ret-ip" here at this point. So something like
+ "push $0\n" /* This is a gap, will be filled with real return address*/
> + "push trampoline_handler\n"
> + "jmp call_to_exception_trampoline\n"
> ".size kretprobe_trampoline, .-kretprobe_trampoline\n"
> );
> NOKPROBE_SYMBOL(kretprobe_trampoline);
> @@ -791,12 +770,7 @@ static __used void *trampoline_handler(s
>
> INIT_HLIST_HEAD(&empty_rp);
> kretprobe_hash_lock(current, &head, &flags);
> - /* fixup registers */
> - regs->cs = __KERNEL_CS;
> -#ifdef CONFIG_X86_32
> - regs->cs |= get_kernel_rpl();
> - regs->gs = 0;
> -#endif
> +
> /* We use pt_regs->sp for return address holder. */
> frame_pointer = ®s->sp;
> regs->ip = trampoline_address;
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
next prev parent reply other threads:[~2019-05-09 14:01 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-08 7:49 [PATCH 0/4] x86: int3 fallout Peter Zijlstra
2019-05-08 7:49 ` [PATCH 1/4] x86/entry/32: Clean up return from interrupt preemption path Peter Zijlstra
2019-05-08 7:49 ` [PATCH 2/4] x86/kprobes: Fix frame pointer annotations Peter Zijlstra
2019-05-08 11:54 ` Josh Poimboeuf
2019-05-08 12:04 ` Peter Zijlstra
2019-05-08 12:40 ` Peter Zijlstra
2019-05-08 12:42 ` Josh Poimboeuf
2019-05-08 15:39 ` Peter Zijlstra
2019-05-08 18:48 ` Josh Poimboeuf
2019-05-09 1:20 ` Masami Hiramatsu
2019-05-09 8:14 ` Peter Zijlstra
2019-05-09 9:27 ` Peter Zijlstra
2019-05-09 14:00 ` Josh Poimboeuf
2019-05-09 14:01 ` Masami Hiramatsu [this message]
2019-05-09 17:14 ` Peter Zijlstra
2019-05-10 4:58 ` Masami Hiramatsu
2019-05-10 12:31 ` Peter Zijlstra
2019-05-11 0:52 ` Masami Hiramatsu
2019-05-10 12:40 ` Peter Zijlstra
2019-05-11 0:56 ` Masami Hiramatsu
2019-05-13 8:15 ` Peter Zijlstra
2019-05-09 16:20 ` Andy Lutomirski
2019-05-09 17:18 ` Peter Zijlstra
2019-05-09 17:43 ` Steven Rostedt
2019-05-10 3:21 ` Masami Hiramatsu
2019-05-10 12:14 ` Peter Zijlstra
2019-05-10 12:17 ` Peter Zijlstra
2019-05-10 14:54 ` Steven Rostedt
2019-05-09 17:37 ` Steven Rostedt
2019-05-09 18:26 ` Peter Zijlstra
2019-05-09 18:36 ` Steven Rostedt
2019-05-08 7:49 ` [PATCH 3/4] x86/ftrace: Add pt_regs frame annotations Peter Zijlstra
2019-05-08 7:49 ` [RFC][PATCH 4/4] x86_32: Provide consistent pt_regs Peter Zijlstra
2019-05-08 11:57 ` Josh Poimboeuf
2019-05-08 12:46 ` Ingo Molnar
2019-05-08 20:58 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190509230106.3551b08553440d125e437f66@kernel.org \
--to=mhiramat@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=bigeasy@linutronix.de \
--cc=bp@alien8.de \
--cc=hpa@zytor.com \
--cc=jgross@suse.com \
--cc=jikos@kernel.org \
--cc=joe.lawrence@redhat.com \
--cc=jpoimboe@redhat.com \
--cc=jroedel@suse.de \
--cc=konrad.wilk@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mbenes@suse.cz \
--cc=mingo@kernel.org \
--cc=nayna@linux.ibm.com \
--cc=ndesaulniers@google.com \
--cc=nstange@suse.de \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=shuah@kernel.org \
--cc=tglx@linutronix.de \
--cc=tim.c.chen@linux.intel.com \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
--cc=yamada.masahiro@socionext.com \
--cc=zohar@linux.ibm.com \
--subject='Re: [PATCH 2/4] x86/kprobes: Fix frame pointer annotations' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).