LKML Archive on
help / color / mirror / Atom feed
From: Mikael Pettersson <>
Subject: Re: [PATCH][3/7] perfctr-2.7.2 for 2.6.6-mm2: x86_64
Date: Sat, 15 May 2004 16:42:46 +0200 (MEST)	[thread overview]
Message-ID: <> (raw)

On Fri, 14 May 2004 17:14:43 +0200, Andi Kleen wrote:
>Mikael Pettersson <> writes:
>Before merging all that I would definitely recommend some generic
>module to allocate performance counters. IBM had a patch for this
>long ago, and it is even more needed now.

No. Managing the mapping to physical counters can and should be
done in user-space. Due to hardware irregularities, user-space
needs the ability to control the mapping. Due to the low-overhead
user-space sampling code, user-space needs the ability to read
the mapping. None of this is a kernel-level problem -- user-space
already does all of this.

Computing the mapping in the kernel would need lots of code, unless
it cheats like oprofile does on the P4 and hides most counters.
Neither hiding counters nor computing imperfect mappings is acceptable.

>> diff -ruN linux-2.6.6-mm2/drivers/perfctr/x86_64.c linux-2.6.6-mm2.perfctr-2.7.2.x86_64/drivers/perfctr/x86_64.c
>> --- linux-2.6.6-mm2/drivers/perfctr/x86_64.c	1970-01-01 01:00:00.000000000 +0100
>> +++ linux-2.6.6-mm2.perfctr-2.7.2.x86_64/drivers/perfctr/x86_64.c	2004-05-14 14:45:43.990230001 +0200
>> @@ -0,0 +1,660 @@
>> +/* $Id: x86_64.c,v 1.27 2004/05/13 23:32:50 mikpe Exp $
>> + * x86_64 performance-monitoring counters driver.
>Can't you share most/all of that file with i386 ? 
>You'll want that definitely once you support Intel CPUs too, 
>and you have to do that eventually.
>Same for include/asm-x86_64/perfctr.h

Yes, that's on my TODO list. x86_64 started as a separate file because
it was so much simpler than x86. Now that Intel will do a P4-like
64-bitter, it makes sense to merge.

>+struct per_cpu_cache {	/* roughly a subset of perfctr_cpu_state */
>+	union {
>+		unsigned int id;	/* cache owner id */
>+	} k1;
>+	struct {
>+		/* NOTE: these caches have physical indices, not virtual */
>+		unsigned int evntsel[4];
>+	} control;
>+} ____cacheline_aligned;
>+static struct per_cpu_cache per_cpu_cache[NR_CPUS] __cacheline_aligned;
>This should use per_cpu_data

Yes. per_cpu() doesn't (or at least didn't originally) work in modules,
I just forgot to change this when I removed modular support.

>+static unsigned int new_id(void)
>Why can't that wrap? Maybe it should use the functions in lib/idr.c ?

idr.c didn't exist when this was written. The driver doesn't
release ids because the ids propagate from the high-level driver's
state to any number of per-cpu caches -- and way back it also
propagated to child processes but that's no longer the case.

I will change things so that the high-level drivers are required
to invoke the low-level driver before the control state dies.
This allows the low-level driver to release the cache id.

>+	if( perfctr_cstatus_has_tsc(cstatus) )
>+		rdtscl(ctrs->tsc);
>+	nrctrs = perfctr_cstatus_nractrs(cstatus);
>+	for(i = 0; i < nrctrs; ++i) {
>+		unsigned int pmc = state->pmc[i].map;
>+		rdpmc_low(pmc, ctrs->pmc[i]);
>+	}
>K8 has speculative rdtsc. Most likely you want a sync_core() somewhere
>in there.

What's the cost for sync_core()? The counts don't have to be

>The way you set your brackets is weird.

I have an old habit of spacing the parens in conditionals so
as to make the conditional expression more visible. I can
change that; I know it looks awkward to other people.

>Why do you check for K8 C stepping? I don't see any code that
>does anything special with that.

Old API mistake. The previous APIs placed the burden of CPU
detection on the driver not user-space (it's exported via the
perfctr_info stuff), but this is broken and wrong. I wasn't
able to change that previously for compatibility reasons, but
with a new API (however it will turn out to be) there will
have to be a new user-space library, so this is no longer
a problem. Thanks for reminding me.


             reply	other threads:[~2004-05-15 14:44 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-05-15 14:42 Mikael Pettersson [this message]
2004-05-15 19:16 ` Andi Kleen
2004-05-15 20:40   ` John Reiser
2004-05-15 20:49     ` Andi Kleen
  -- strict thread matches above, loose matches on Subject: below --
2004-05-16  9:58 Mikael Pettersson
2004-05-15 14:44 Mikael Pettersson
2004-05-15 19:26 ` Andi Kleen
     [not found] <>
2004-05-14 15:14 ` Andi Kleen
2004-05-15  5:37   ` Bryan O'Sullivan
2004-05-15  9:09     ` Andi Kleen
2004-05-16  4:15       ` Bryan O'Sullivan
2004-05-14 14:11 Mikael Pettersson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \
    --subject='Re: [PATCH][3/7] perfctr-2.7.2 for 2.6.6-mm2: x86_64' \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).