From: Andi Kleen <>
To: "Metzger, Markus T" <>
Cc:,,,,, "Siddha, Suresh B" <>
Subject: Re: ptrace API extensions for BTS
Date: Fri, 7 Dec 2007 12:18:17 +0100
Message-ID: <> (raw)
In-Reply-To: <>

On Friday 07 December 2007 10:11:04 Metzger, Markus T wrote:
> Roland, Andi,
> I would like to discuss the ptrace user interface for the BTS extension.
> In previous emails,
> Andi suggested a stream-like interface, but is also OK with an
> array-like interface (as far as I understood).
> Roland is dubious about the ptrace API additions.
> I would like to settle the discussion and find an interface that
> everybody can agree to, so I can implement that interface and we can
> move forward with the patch.

The most efficient interface would be zero copy, with the tracer user
process supplying memory that is pinned (get_user_pages()) subject to the
mlock rlimit. The kernel would then tell the CPU to log directly into

Kernel buffers would be only needed for the per CPU kernel 

Then the only information that would need to be passed with
system calls would be wakeup, tail position and perhaps a wrapping
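
As a rough illustration of how a tracer could consume such a buffer, here
is a minimal sketch that assumes the kernel reports a tail index and a
"wrapped" flag for a ring of fixed-size records. The names bts_rec,
bts_pos and bts_linearize() are hypothetical and not part of any existing
or proposed interface; the real BTS record layout is arch specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout, for illustration only. */
struct bts_rec {
	uint64_t from;
	uint64_t to;
};

/* Hypothetical state the kernel would hand to the tracer. */
struct bts_pos {
	size_t tail;	/* index of the next slot the CPU will write */
	int wrapped;	/* nonzero once the ring has wrapped at least once */
};

/*
 * Copy the valid records, oldest first, from the ring `buf` of `nslots`
 * slots into `out`, which must have room for nslots entries.
 * Returns the number of records copied.
 */
static size_t bts_linearize(const struct bts_rec *buf, size_t nslots,
			    struct bts_pos pos, struct bts_rec *out)
{
	size_t n = 0, i;

	assert(pos.tail < nslots);

	if (pos.wrapped) {
		/* The oldest records run from tail to the end of the ring. */
		for (i = pos.tail; i < nslots; i++)
			out[n++] = buf[i];
	}
	/* The newest records run from the start of the ring up to tail. */
	for (i = 0; i < pos.tail; i++)
		out[n++] = buf[i];

	return n;
}
```

With only these two values crossing the system call boundary, the tracer
can order the records without the kernel copying any of them.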

> Regarding 1, we currently provide scheduling timestamps, which are arch

That's actually broken because you don't log the CPU number.
sched_clock() without the associated CPU number is meaningless
on systems without a synchronized, P-state invariant TSC
[that is, older Intel systems or some larger current systems].

And even if you log the CPU number, it is unclear how user space
would make sense of it. It generally can't; even the kernel
can't. Perhaps better to just not supply any time stamps for this.
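
To make the limitation concrete, here is a small sketch of what a consumer
could safely conclude from such stamps if the CPU number were logged. The
struct and function names are hypothetical. The point is only that two
stamps taken on different CPUs cannot be ordered when the clocks are not
synchronized.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stamp: a per-CPU clock value plus the CPU it was read on. */
struct stamp {
	uint32_t cpu;
	uint64_t clock;
};

enum order { ORDER_BEFORE = -1, ORDER_SAME = 0, ORDER_AFTER = 1,
	     ORDER_UNKNOWN = 2 };

/*
 * Compare two stamps. Without synchronized clocks the only defensible
 * answer for stamps from different CPUs is "unknown".
 */
static enum order stamp_order(struct stamp a, struct stamp b)
{
	if (a.cpu != b.cpu)
		return ORDER_UNKNOWN;
	if (a.clock < b.clock)
		return ORDER_BEFORE;
	if (a.clock > b.clock)
		return ORDER_AFTER;
	return ORDER_SAME;
}
```

A traced task that migrates between CPUs would therefore produce runs of
stamps that cannot be related to each other.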

Even on systems that don't have the unsynchronized TSC problem above,
it can be tricky to convert the TSC into real time. For one, right now
we don't report the TSC frequency. It usually tends to be the
frequency of the highest P-state, but finding that out is also
difficult and unreliable (rounding errors) and might not
always be true in the future. Anyway, that could be solved
by reporting it separately in /proc/cpuinfo, but given all
the other problems I have my doubts it is really worth it. I would
suggest dropping the time stamp.
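
For reference, the conversion itself is simple once a frequency is known;
the hard part is getting a trustworthy frequency. The sketch below assumes
the TSC frequency is available in kHz from some source, which is exactly
the piece that is not reported today. tsc_delta_to_ns() is a hypothetical
helper, not an existing kernel or libc function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a TSC delta to nanoseconds, given an assumed TSC frequency in
 * kHz: ns = cycles * 1000000 / tsc_khz. The division is split into
 * quotient and remainder so the intermediate product stays within
 * 64 bits for realistic deltas. The result is truncated, not rounded.
 */
static uint64_t tsc_delta_to_ns(uint64_t cycles, uint32_t tsc_khz)
{
	assert(tsc_khz != 0);

	uint64_t q = cycles / tsc_khz;	/* whole milliseconds */
	uint64_t r = cycles % tsc_khz;	/* leftover cycles, less than 1 ms */

	return q * 1000000u + (r * 1000000u) / tsc_khz;
}
```

For example, 2400000 cycles at an assumed 2400000 kHz come out as
1000000 ns. If the frequency estimate is off by 1000 kHz (2399000 instead
of 2400000), a one second interval is already misreported by roughly
0.4 ms, which shows how a slightly wrong frequency turns into a visible
error.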

> Additional architectures may want to (re)use and extend the x86 bts
> record, or they may want to invent their own format. In the former case,

I think that's actually not a good goal. If the code is so complicated
that sharing it makes sense, then you did something wrong :)

