LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Christoph Hellwig <hch@infradead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
	Gregory Haskins <ghaskins@novell.com>,
	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Tim Bird <tim.bird@am.sony.com>, Sam Ravnborg <sam@ravnborg.org>,
	"Frank Ch. Eigler" <fche@redhat.com>,
	John Stultz <johnstul@us.ibm.com>,
	Arjan van de Ven <arjan@infradead.org>,
	Steven Rostedt <srostedt@redhat.com>
Subject: Re: [PATCH 02/22 -v7] Add basic support for gcc profiler instrumentation
Date: Wed, 30 Jan 2008 09:28:19 -0500 (EST)	[thread overview]
Message-ID: <Pine.LNX.4.58.0801300921020.29020@gandalf.stny.rr.com> (raw)
In-Reply-To: <Pine.LNX.4.58.0801300845180.29020@gandalf.stny.rr.com>


On Wed, 30 Jan 2008, Steven Rostedt wrote:
> well, actually, I disagree. I only set mcount_enabled=1 when I'm about to
> test something. You're right that we want the impact of the test least
> affected, but when we have mcount_enabled=1 we usually also have a
> function that's attached and in that case this change is negligible. But
> on the normal case where mcount_enabled=0, this change may have a bigger
> impact.
>
> Remember CONFIG_MCOUNT=y && mcount_enabled=0 is (15% overhead)
>          CONFIG_MCOUNT=y && mcount_enabled=1 dummy func (49% overhead)
>          CONFIG_MCOUNT=y && mcount_enabled=1 trace func (500% overhead)
>
> The trace func is the one that will be most likely used when analyzing. It
> gives hackbench a 500% overhead, so I'm expecting this change to be
> negligible in that case. But after I find what's wrong, I like to rebuild
> the kernel without rebooting so I like to have mcount_enabled=0 have the
> smallest impact ;-)
>
> I'll put back the original code and run some new numbers.

I just ran with the original version of that test (on x86_64, the same box
as the previous tests were done, with the same kernel and config except
for this change)

Here's the numbers with the new design (the one that was used in this
patch):

mcount disabled:
 Avg: 4.8638 (15.934498% overhead)

mcount enabled:
 Avg: 6.2819 (49.736610% overhead)

function tracing:
 Avg: 25.2035 (500.755607% overhead)

Now changing the code to:

ENTRY(mcount)
        /* likely(mcount_enabled) */
        cmpl $0, mcount_enabled
        jz out

        /* taken from glibc */
        subq $0x38, %rsp
        movq %rax, (%rsp)
        movq %rcx, 8(%rsp)
        movq %rdx, 16(%rsp)
        movq %rsi, 24(%rsp)
        movq %rdi, 32(%rsp)
        movq %r8, 40(%rsp)
        movq %r9, 48(%rsp)

        movq 0x38(%rsp), %rsi
        movq 8(%rbp), %rdi

        call   *mcount_trace_function

        movq 48(%rsp), %r9
        movq 40(%rsp), %r8
        movq 32(%rsp), %rdi
        movq 24(%rsp), %rsi
        movq 16(%rsp), %rdx
        movq 8(%rsp), %rcx
        movq (%rsp), %rax
        addq $0x38, %rsp

out:
        retq


mcount disabled:
 Avg: 4.908 (16.988058% overhead)

mcount enabled:
 Avg: 6.244. (48.840369% overhead)

function tracing:
 Avg: 25.1963 (500.583987% overhead)


The change seems to cause a 1% overhead difference. With mcount disabled,
the newer code has a 1% performance benefit. With mcount enabled as well
as with tracing on, the old code has the 1% benefit.

But 1% has a bigger impact on something that is 15% than it does on
something that is 48% or 500%, so I'm keeping the newer version.

-- Steve


  reply	other threads:[~2008-01-30 14:28 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-30  3:15 [PATCH 00/22 -v7] mcount and latency tracing utility -v7 Steven Rostedt
2008-01-30  3:15 ` [PATCH 01/22 -v7] printk - dont wakeup klogd with interrupts disabled Steven Rostedt
2008-01-30  3:15 ` [PATCH 02/22 -v7] Add basic support for gcc profiler instrumentation Steven Rostedt
2008-01-30  8:46   ` Peter Zijlstra
2008-01-30 13:08     ` Steven Rostedt
2008-01-30 14:09       ` Steven Rostedt
2008-01-30 14:25         ` Peter Zijlstra
2008-02-01 22:34           ` Paul E. McKenney
2008-02-02  1:56             ` Steven Rostedt
2008-02-02 21:41               ` Paul E. McKenney
2008-02-04 17:09                 ` Steven Rostedt
2008-02-04 21:40                   ` Paul E. McKenney
2008-02-04 22:03                     ` Steven Rostedt
2008-02-04 22:41                       ` Mathieu Desnoyers
2008-02-05  6:11                         ` Paul E. McKenney
2008-02-05  5:13                       ` Paul E. McKenney
2008-01-30 13:21   ` Jan Kiszka
2008-01-30 13:53     ` Steven Rostedt
2008-01-30 14:28       ` Steven Rostedt [this message]
2008-01-30  3:15 ` [PATCH 03/22 -v7] Annotate core code that should not be traced Steven Rostedt
2008-01-30  3:15 ` [PATCH 04/22 -v7] x86_64: notrace annotations Steven Rostedt
2008-01-30  3:15 ` [PATCH 05/22 -v7] add notrace annotations to vsyscall Steven Rostedt
2008-01-30  8:49   ` Peter Zijlstra
2008-01-30 13:15     ` Steven Rostedt
2008-01-30  3:15 ` [PATCH 06/22 -v7] handle accurate time keeping over long delays Steven Rostedt
2008-01-30  3:15 ` [PATCH 07/22 -v7] initialize the clock source to jiffies clock Steven Rostedt
2008-01-30  3:15 ` [PATCH 08/22 -v7] add get_monotonic_cycles Steven Rostedt
2008-01-30  3:15 ` [PATCH 09/22 -v7] add notrace annotations to timing events Steven Rostedt
2008-01-30  3:15 ` [PATCH 10/22 -v7] mcount based trace in the form of a header file library Steven Rostedt
2008-01-30  3:15 ` [PATCH 11/22 -v7] Add context switch marker to sched.c Steven Rostedt
2008-01-30  3:15 ` [PATCH 12/22 -v7] Make the task State char-string visible to all Steven Rostedt
2008-01-30  3:15 ` [PATCH 13/22 -v7] Add tracing of context switches Steven Rostedt
2008-01-30  3:15 ` [PATCH 14/22 -v7] Generic command line storage Steven Rostedt
2008-01-30  3:15 ` [PATCH 15/22 -v7] trace generic call to schedule switch Steven Rostedt
2008-01-30  3:15 ` [PATCH 16/22 -v7] Add marker in try_to_wake_up Steven Rostedt
2008-01-30  3:15 ` [PATCH 17/22 -v7] mcount tracer for wakeup latency timings Steven Rostedt
2008-01-30  9:31   ` Peter Zijlstra
2008-01-30 13:18     ` Steven Rostedt
2008-01-30  3:15 ` [PATCH 18/22 -v7] Trace irq disabled critical timings Steven Rostedt
2008-01-30  3:15 ` [PATCH 19/22 -v7] trace preempt off " Steven Rostedt
2008-01-30  9:40   ` Peter Zijlstra
2008-01-30 13:40     ` Steven Rostedt
2008-01-30  3:15 ` [PATCH 20/22 -v7] Add markers to various events Steven Rostedt
2008-01-30  3:15 ` [PATCH 21/22 -v7] Add event tracer Steven Rostedt
2008-01-30  3:15 ` [PATCH 22/22 -v7] Critical latency timings histogram Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.58.0801300921020.29020@gandalf.stny.rr.com \
    --to=rostedt@goodmis.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@ghostprotocols.net \
    --cc=akpm@linux-foundation.org \
    --cc=arjan@infradead.org \
    --cc=fche@redhat.com \
    --cc=ghaskins@novell.com \
    --cc=hch@infradead.org \
    --cc=jan.kiszka@siemens.com \
    --cc=johnstul@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=mingo@elte.hu \
    --cc=sam@ravnborg.org \
    --cc=srostedt@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=tim.bird@am.sony.com \
    --cc=torvalds@linux-foundation.org \
    --subject='Re: [PATCH 02/22 -v7] Add basic support for gcc profiler instrumentation' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).