From: Steven Rostedt <rostedt@goodmis.org>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Christoph Hellwig <hch@infradead.org>,
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
Gregory Haskins <ghaskins@novell.com>,
Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
Thomas Gleixner <tglx@linutronix.de>,
Tim Bird <tim.bird@am.sony.com>, Sam Ravnborg <sam@ravnborg.org>,
"Frank Ch. Eigler" <fche@redhat.com>,
Steven Rostedt <srostedt@redhat.com>,
Philippe Gerum <rpm@xenomai.org>
Subject: Re: [RFC PATCH 01/22 -v2] Add basic support for gcc profiler instrumentation
Date: Thu, 10 Jan 2008 14:54:04 -0500 (EST)
Message-ID: <Pine.LNX.4.58.0801101411000.15391@gandalf.stny.rr.com>
In-Reply-To: <478661B9.7050406@siemens.com>
On Thu, 10 Jan 2008, Jan Kiszka wrote:
> Steven Rostedt wrote:
> > Index: linux-compile-i386.git/Makefile
> > ===================================================================
> > --- linux-compile-i386.git.orig/Makefile 2008-01-09 14:09:36.000000000 -0500
> > +++ linux-compile-i386.git/Makefile 2008-01-09 14:10:07.000000000 -0500
> > @@ -509,6 +509,10 @@ endif
> >
> > include $(srctree)/arch/$(SRCARCH)/Makefile
> >
> > +# MCOUNT expects frame pointer
>
> This comment looks stray.
Actually it's not ;-)
The original code had something like this:
#if CONFIG_MCOUNT
KBUILD_CFLAGS += ...
#else
#if CONFIG_FRAME_POINTER
KBUILD_CFLAGS += ...
#else
KBUILD_CFLAGS += ...
#endif
#endif
Sam Ravnborg suggested moving that logic into the Kbuild system, which I
did. I left the comment there just to let others know that MCOUNT expects
the FRAME_POINTER flags. But I guess we can nuke that comment anyway; it
just leads to confusion.
>
> > +ifdef CONFIG_MCOUNT
> > +KBUILD_CFLAGS += -pg
> > +endif
> > ifdef CONFIG_FRAME_POINTER
> > KBUILD_CFLAGS += -fno-omit-frame-pointer -fno-optimize-sibling-calls
> > else
> > Index: linux-compile-i386.git/arch/x86/Kconfig
> > ===================================================================
> > --- linux-compile-i386.git.orig/arch/x86/Kconfig 2008-01-09 14:09:36.000000000 -0500
> > +++ linux-compile-i386.git/arch/x86/Kconfig 2008-01-09 14:10:07.000000000 -0500
> > @@ -28,6 +28,10 @@ config GENERIC_CMOS_UPDATE
> > bool
> > default y
> >
> > +config ARCH_HAS_MCOUNT
> > + bool
> > + default y
> > +
> > config CLOCKSOURCE_WATCHDOG
> > bool
> > default y
> > Index: linux-compile-i386.git/arch/x86/kernel/Makefile_32
> > ===================================================================
> > --- linux-compile-i386.git.orig/arch/x86/kernel/Makefile_32 2008-01-09 14:09:36.000000000 -0500
> > +++ linux-compile-i386.git/arch/x86/kernel/Makefile_32 2008-01-09 14:10:07.000000000 -0500
> > @@ -23,6 +23,7 @@ obj-$(CONFIG_APM) += apm_32.o
> > obj-$(CONFIG_X86_SMP) += smp_32.o smpboot_32.o tsc_sync.o
> > obj-$(CONFIG_SMP) += smpcommon_32.o
> > obj-$(CONFIG_X86_TRAMPOLINE) += trampoline_32.o
> > +obj-$(CONFIG_MCOUNT) += mcount-wrapper.o
>
> So far the code organization is different for 32 and 64 bit. I would
> suggest to either
>
> o move both trampolines into entry_*.S or
> o put them in something like mcount-wrapper_32/64.S.
Yeah, that's a relic from -rt. I never liked it, but I was just too lazy
to change it. I think I'll move the mcount wrapper into entry_32.S.
>
> > obj-$(CONFIG_X86_MPPARSE) += mpparse_32.o
> > obj-$(CONFIG_X86_LOCAL_APIC) += apic_32.o nmi_32.o
> > obj-$(CONFIG_X86_IO_APIC) += io_apic_32.o
> > Index: linux-compile-i386.git/arch/x86/kernel/mcount-wrapper.S
> > ===================================================================
> > --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> > +++ linux-compile-i386.git/arch/x86/kernel/mcount-wrapper.S 2008-01-09 14:10:07.000000000 -0500
> > @@ -0,0 +1,25 @@
> > +/*
> > + * linux/arch/x86/mcount-wrapper.S
> > + *
> > + * Copyright (C) 2004 Ingo Molnar
> > + */
> > +
> > +.globl mcount
> > +mcount:
> > + cmpl $0, mcount_enabled
> > + jz out
> > +
> > + push %ebp
> > + mov %esp, %ebp
>
> What is the benefit of having a call frame in this trampoline? We used
> to carry this in the i386 mcount tracer for Adeos/I-pipe too (it was
> derived from the -rt code), but I just successfully tested a removal
> patch. Also glibc [1] doesn't include it.
Hmm, what about having frame pointers on? Isn't that a requirement?
>
> > + pushl %eax
> > + pushl %ecx
> > + pushl %edx
> > +
> > + call __mcount
>
> I think this indirection should be avoided, just like the 64-bit version
> and glibc do.
I thought about that too, but didn't have the time to look into the
calling convention for that. <does a quick look at glibc>
# objdump --start-address 0x`nm /lib/libc-2.7.so | sed -ne '/ mcount$/s/^\([0-9a-f]*\).*/\1/p'` -D /lib/libc-2.7.so |head -28 |tail -12
49201cd0 <_mcount>:
49201cd0: 50 push %eax
49201cd1: 51 push %ecx
49201cd2: 52 push %edx
49201cd3: 8b 54 24 0c mov 0xc(%esp),%edx
49201cd7: 8b 45 04 mov 0x4(%ebp),%eax
49201cda: e8 91 f4 ff ff call 49201170 <__mcount_internal>
49201cdf: 5a pop %edx
49201ce0: 59 pop %ecx
49201ce1: 58 pop %eax
49201ce2: c3 ret
49201ce3: 90 nop
Until I find out more about the frame pointers, I'll leave the ebp copy in.
>
> > +
> > + popl %edx
> > + popl %ecx
> > + popl %eax
> > + popl %ebp
> > +out:
> > + ret
>
> ....
[...]
> > +/** __mcount - hook for profiling
> > + *
> > + * This routine is called from the arch specific mcount routine, that in turn is
> > + * called from code inserted by gcc -pg.
> > + */
> > +notrace void __mcount(void)
> > +{
> > + mcount_trace_function(CALLER_ADDR1, CALLER_ADDR2);
> > +}
>
> mcount_trace_function should always be called from the assembly
> trampoline, IMO.
I'll try that.
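
To make the suggestion concrete, here is a minimal userspace sketch of the
dispatch path with the __mcount() indirection removed: the stub tests
mcount_enabled and calls through mcount_trace_function itself. The names
mcount_func_t, mcount_enabled, mcount_trace_function,
register_mcount_function and clear_mcount_function come from the patch.
The function bodies, dummy_mcount_tracer, record_tracer, mcount_stub and the
-1 error return are assumptions made up for illustration, and a plain C
function stands in for the assembly trampoline.

```c
#include <assert.h>
#include <stddef.h>

/* Same callback shape as mcount_func_t in the patch. */
typedef void (*mcount_func_t)(unsigned long ip, unsigned long parent_ip);

/* Assumption: a do-nothing default so the hook pointer is never NULL. */
void dummy_mcount_tracer(unsigned long ip, unsigned long parent_ip)
{
	(void)ip;
	(void)parent_ip;
}

/* Names mirror the patch; the bodies below are guesses for illustration. */
mcount_func_t mcount_trace_function = dummy_mcount_tracer;
int mcount_enabled;

int register_mcount_function(mcount_func_t func)
{
	if (func == NULL)
		return -1;	/* hypothetical error code */
	mcount_trace_function = func;
	mcount_enabled = 1;
	return 0;
}

void clear_mcount_function(void)
{
	mcount_enabled = 0;
	mcount_trace_function = dummy_mcount_tracer;
}

/*
 * Hypothetical stand-in for the assembly stub: test the flag, then call
 * the registered hook directly, with no intermediate C wrapper.
 */
void mcount_stub(unsigned long ip, unsigned long parent_ip)
{
	if (!mcount_enabled)
		return;
	assert(mcount_trace_function != NULL);
	mcount_trace_function(ip, parent_ip);
}

/* Example tracer: remembers the last pair it saw and counts calls. */
unsigned long last_ip, last_parent_ip, hits;

void record_tracer(unsigned long ip, unsigned long parent_ip)
{
	last_ip = ip;
	last_parent_ip = parent_ip;
	hits++;
}
```

Calling register_mcount_function(record_tracer) and then
mcount_stub(0x10, 0x20) leaves last_ip and last_parent_ip at 0x10 and 0x20.
After clear_mcount_function() the stub returns without calling anything.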
> > Index: linux-compile-i386.git/include/linux/mcount.h
> > ===================================================================
> > --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> > +++ linux-compile-i386.git/include/linux/mcount.h 2008-01-09 15:17:20.000000000 -0500
> > @@ -0,0 +1,21 @@
> > +#ifndef _LINUX_MCOUNT_H
> > +#define _LINUX_MCOUNT_H
> > +
> > +#ifdef CONFIG_MCOUNT
> > +extern int mcount_enabled;
> > +
> > +#include <linux/linkage.h>
> > +
> > +#define CALLER_ADDR0 ((unsigned long)__builtin_return_address(0))
> > +#define CALLER_ADDR1 ((unsigned long)__builtin_return_address(1))
> > +#define CALLER_ADDR2 ((unsigned long)__builtin_return_address(2))
>
> Still used when __mcount would be gone?
Will be used later on by the tracers. Actually, I wish this was in a more
generic kernel header, since I find myself typing
"__builtin_return_address" quite often.
>
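
For anyone unfamiliar with the builtin behind these macros, here is a small
userspace sketch of CALLER_ADDR0. The macro is copied from the patch;
who_called_me() is a made-up name. Only level 0 is used, because GCC
documents nonzero levels of __builtin_return_address as unreliable (they
walk the frame chain, so CALLER_ADDR1 and CALLER_ADDR2 depend on frame
pointers being present).

```c
#include <assert.h>

#define CALLER_ADDR0 ((unsigned long)__builtin_return_address(0))

/*
 * Hypothetical helper: returns the address the function will return to,
 * i.e. the instruction right after the call in its caller. noinline keeps
 * it a real call, so there is a return address to report.
 */
__attribute__((noinline)) unsigned long who_called_me(void)
{
	unsigned long ra = CALLER_ADDR0;

	assert(ra != 0);
	return ra;
}
```

A caller can print the result with printf("%#lx\n", who_called_me()) and
look the address up in the binary's symbol table to see which function
made the call.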
> > +
> > +typedef void (*mcount_func_t)(unsigned long ip, unsigned long parent_ip);
> > +
> > +extern void mcount(void);
> > +
> > +int register_mcount_function(mcount_func_t func);
> > +void clear_mcount_function(void);
> > +
> > +#endif /* CONFIG_MCOUNT */
> > +#endif /* _LINUX_MCOUNT_H */
> > Index: linux-compile-i386.git/arch/x86/kernel/entry_64.S
> > ===================================================================
> > --- linux-compile-i386.git.orig/arch/x86/kernel/entry_64.S 2008-01-09 14:09:36.000000000 -0500
> > +++ linux-compile-i386.git/arch/x86/kernel/entry_64.S 2008-01-09 14:10:07.000000000 -0500
> > @@ -53,6 +53,46 @@
> >
> > .code64
> >
> > +#ifdef CONFIG_MCOUNT
> > +
> > +ENTRY(mcount)
> > + cmpl $0, mcount_enabled
> > + jz out
> > +
> > + push %rbp
> > + mov %rsp,%rbp
>
> Same as for x86_32.
Same here: I need to check the frame pointer requirement too.
>
> > +
> > + push %r11
> > + push %r10
>
> glibc [2] doesn't save those two, and we were also happy without them so
> far. Or are there nasty corner-cases in the kernel?
Probably not. I'll see what happens without them.
>
> > + push %r9
> > + push %r8
> > + push %rdi
> > + push %rsi
> > + push %rdx
> > + push %rcx
> > + push %rax
>
> SAVE_ARGS/RESTORE_ARGS and glibc use explicit rsp manipulation + movq
> instead of push/pop. I wonder if there is a small advantage, but I'm not
> that deep into this arch.
Yeah, it's probably a bit faster to do the mov instead. I'll add that.
>
> > +
> > + mov 0x0(%rbp),%rax
> > + mov 0x8(%rbp),%rdi
> > + mov 0x8(%rax),%rsi
>
> See [2] for saving one instruction here. :)
hehe, yeah, will do.
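
As a reading aid for the three loads quoted above, here is a toy model in C
of what they fetch. It assumes the traced function has already set up its
own frame (push %rbp; mov %rsp,%rbp) before calling mcount, which is the
frame pointer dependency discussed earlier in this mail. The stack is an
array of words, "addresses" are array indices with higher index meaning
older, and all names (mcount_args, read_mcount_args, build_toy_stack) and
the slot numbers are invented for the example.

```c
#include <assert.h>

enum { STACK_WORDS = 16 };

struct mcount_args {
	unsigned long ip;		/* return address into the traced function */
	unsigned long parent_ip;	/* return address into its caller */
};

/* Mirrors the three loads, with rbp as it is after the stub's own prologue. */
struct mcount_args read_mcount_args(const unsigned long *stack,
				    unsigned long rbp)
{
	struct mcount_args args;
	unsigned long rax;

	assert(rbp + 1 < STACK_WORDS);
	rax = stack[rbp];			/* mov 0x0(%rbp),%rax */
	args.ip = stack[rbp + 1];		/* mov 0x8(%rbp),%rdi */
	assert(rax + 1 < STACK_WORDS);
	args.parent_ip = stack[rax + 1];	/* mov 0x8(%rax),%rsi */
	return args;
}

/* Lays out two linked frames and returns the stub's frame pointer. */
unsigned long build_toy_stack(unsigned long *stack, unsigned long ip,
			      unsigned long parent_ip)
{
	const unsigned long traced_fp = 10;	/* frame of the traced function */
	const unsigned long stub_fp = 4;	/* frame set up by the stub */

	stack[traced_fp] = 14;			/* saved %rbp of the parent, unused here */
	stack[traced_fp + 1] = parent_ip;	/* return address into the parent */
	stack[stub_fp] = traced_fp;		/* saved %rbp of the traced function */
	stack[stub_fp + 1] = ip;		/* return address into the traced function */
	return stub_fp;
}
```

Building the toy stack with ip 0x1111 and parent_ip 0x2222 and passing the
returned frame pointer to read_mcount_args() gives back the same two
values, which are the arguments the stub hands to mcount_trace_function.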
>
> > +
> > + call *mcount_trace_function
> > +
> > + pop %rax
> > + pop %rcx
> > + pop %rdx
> > + pop %rsi
> > + pop %rdi
> > + pop %r8
> > + pop %r9
> > + pop %r10
> > + pop %r11
> > +
> > + pop %rbp
> > +out:
> > + ret
> > +#endif
> > +
> > #ifndef CONFIG_PREEMPT
> > #define retint_kernel retint_restore_args
> > #endif
>
> This generic approach is very appreciated here as well. It would take
> away the burden of maintaining the arch-dependent stubs within I-pipe.
>
> What we could contribute later on is a blackfin trampoline, there is
> just still a bug in their toolchain which breaks mcount for modules. But
> I could check with the bfin guys again about the progress and underline
> the importance of this long-pending issue.
Thanks,
-- Steve