LKML Archive on lore.kernel.org
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Carl Love <cel@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>,
	linuxppc-dev@ozlabs.org, cbe-oss-dev@ozlabs.org,
	oprofile-list@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
Subject: Re: [Cbe-oss-dev] [RFC, PATCH] CELL Oprofile SPU profiling updated patch
Date: Thu, 15 Feb 2007 13:50:48 -0800
Message-ID: <20070215215047.GE1913@linux.vnet.ibm.com>
In-Reply-To: <1171570918.31179.36.camel@dyn9047021078.beaverton.ibm.com>

On Thu, Feb 15, 2007 at 12:21:58PM -0800, Carl Love wrote:
> On Thu, 2007-02-15 at 15:37 +0100, Arnd Bergmann wrote:

[ . . . ]

> > I agree with Milton that it would be far nicer even to calculate
> > the value from user space, but since you say that would
> > violate the oprofile interface conventions, let's not go there.
> > In order to make this code nicer on the user, you should probably
> > insert a 'cond_resched()' somewhere in the loop, maybe every
> > 500 iterations or so.
> > 
> > It also looks like there is whitespace damage in the code here.
> 
> I will double check on the whitespace damage.  I thought I had gotten
> all that out.  
> 
> I have done some quick measurements.  The above method limits the loop
> to at most 2^16 iterations.  Based on running the algorithm in user
> space, it takes about 3ms of computation time to do the loop 2^16 times.
> 
> At the very least, we need to put the resched in, say, every 10,000
> iterations which would be about every 0.5ms.  Should we do a resched
> more often?  
> 
> Additionally we could up the size of the table to 512 which would reduce
> the maximum time to about 1.5ms.  What do people think about increasing
> the table size?

Is this 1.5ms with interrupts disabled?  This time period is problematic
from a realtime perspective if so -- need to be able to preempt.

						Thanx, Paul

> A little more general discussion about the logarithmic algorithm and
> limiting the range.  The hardware supports a 24-bit LFSR value. This
> means the user can ask to capture a sample every N cycles, where N is in
> the range of 1 to 2^24.  The OProfile user tool enforces a minimum value
> of N to make sure the overhead of OProfile doesn't bring the machine to
> its knees.  The minimum value is not intended to guarantee the
> performance impact of OProfile will not be significant.  It is left as
> an exercise for the user to pick an N that will give minimal performance
> impact.  We set the lower limit for N for SPU profiling to 100,000. This
> is actually high enough that we don't seem to see much performance
> impact when running OProfile.  If the user picked N=2^24 then for a
> 3.2GHz machine you would get about 200 samples per second on each node,
> where a sample consists of the PC value for all 8 SPUs on the node.  If
> the user wanted to do a relatively long OProfile run, I can see where
> they might use N=2^24 to avoid gathering too much data.  My gut feeling
> is that the sampling frequency for N=2^24 is not so low that someone
> would never want to use it when doing long runs.  Hence, we should not
> arbitrarily reduce the maximum value for N.  Although I would expect
> that the typical value for N will be in the range of several hundred
> thousand to a few million.
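
The sampling rates quoted above follow from dividing the clock rate by N. A minimal sketch of that arithmetic, assuming one sample is taken every N cycles; the function name samples_per_second is hypothetical and is not part of OProfile or the patch:

```c
#include <assert.h>

/*
 * Samples per second when one sample is captured every n cycles.
 * Hypothetical helper for illustration only.
 */
static double samples_per_second(double clock_hz, unsigned long n)
{
	assert(n > 0);	/* N ranges from 1 to 2^24 in the discussion */
	return clock_hz / (double)n;
}
```

With a 3.2GHz clock, N=2^24 gives roughly 190.7 samples per second, which matches the "about 200" figure, and the N=100,000 lower limit gives 32,000 samples per second.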
> 
> As for using a logarithmic spacing of the precomputed values, this
> approach means that the space between the precomputed values at the high
> end would be much larger than 2^14, assuming 256 precomputed values.
> That means it could take much longer than 3ms to get the needed LFSR
> value for a large N.  By evenly spacing the precomputed values, we can
> ensure that for all N it will take less than 3ms to get the value.
> Personally, I am more comfortable with a hard limit on the compute time
> than a variable time that could get much bigger than the 1ms threshold
> that Arnd wants for resched.  Any thoughts?
> that Arnd wants for resched.  Any thoughts?
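
The evenly spaced table and the periodic resched discussed above can be sketched in user space as follows. This is only an illustration under stated assumptions: the tap positions (24, 23, 22, 17) are a commonly tabulated 24-bit choice and are not claimed to be the Cell hardware's polynomial, maybe_yield() is a stand-in for cond_resched(), the table is built at start-up rather than precomputed, and all function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LFSR_BITS   24
#define LFSR_MASK   ((1u << LFSR_BITS) - 1)
#define TABLE_SIZE  256
#define SPACING     (1u << 16)	/* 2^24 / 256 steps between entries */
#define YIELD_EVERY 10000	/* iteration count suggested in the thread */

static uint32_t lfsr_table[TABLE_SIZE];
static unsigned long yield_calls;

/* Advance the LFSR by one step (illustrative taps, see lead-in). */
static uint32_t lfsr_step(uint32_t s)
{
	uint32_t bit = ((s >> 23) ^ (s >> 22) ^ (s >> 21) ^ (s >> 16)) & 1u;

	return ((s << 1) | bit) & LFSR_MASK;
}

/* Stand-in for cond_resched(); here it only counts invocations. */
static void maybe_yield(void)
{
	yield_calls++;
}

/* lfsr_table[i] holds the state after i * SPACING steps from seed. */
static void build_table(uint32_t seed)
{
	uint32_t s = seed & LFSR_MASK;

	for (unsigned int i = 0; i < TABLE_SIZE; i++) {
		lfsr_table[i] = s;
		for (uint32_t j = 0; j < SPACING; j++)
			s = lfsr_step(s);
	}
}

/*
 * State after n steps, for n < 2^24.  Starts from the nearest table
 * entry at or below n, so the loop runs at most SPACING - 1 times.
 */
static uint32_t lfsr_after(uint32_t n)
{
	assert(n < TABLE_SIZE * SPACING);

	uint32_t s = lfsr_table[n / SPACING];
	uint32_t rem = n % SPACING;

	for (uint32_t i = 0; i < rem; i++) {
		s = lfsr_step(s);
		if ((i + 1) % YIELD_EVERY == 0)
			maybe_yield();
	}
	return s;
}
```

Because the entries are evenly spaced, the worst-case loop length is the same for every N, which is the hard bound on compute time argued for above. Doubling TABLE_SIZE to 512 would halve SPACING and therefore the worst case.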
> 
> > 
> > > +
> > > +/* This interface allows a profiler (e.g., OProfile) to store
> > > + * spu_context information needed for profiling, allowing it to
> > > + * be saved across context save/restore operation.
> > > + *
> > > + * Assumes the caller has already incremented the ref count to
> > > + * profile_info; then spu_context_destroy must call kref_put
> > > + * on prof_info_kref.
> > > + */
> > > +void spu_set_profile_private(struct spu_context * ctx, void * profile_info,
> > > +			     struct kref * prof_info_kref,
> > > +			     void (* prof_info_release) (struct kref * kref))
> > > +{
> > > +	ctx->profile_private = profile_info;
> > > +	ctx->prof_priv_kref = prof_info_kref;
> > > +	ctx->prof_priv_release = prof_info_release;
> > > +}
> > > +EXPORT_SYMBOL_GPL(spu_set_profile_private);
> > 
> > I think you don't need the profile_private member here, if you just use
> > container_of with ctx->prof_priv_kref in all users.
> > 
> > 	Arnd <><
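
Arnd's container_of suggestion can be illustrated with a small user-space sketch. The struct kref below is a stand-in for the kernel type, container_of is re-defined locally in its usual offsetof form, and struct spu_prof_info with its sample_count field is a hypothetical example of what the profiler might embed a kref in; none of these names come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Local re-definition of the usual container_of idiom. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* User-space stand-in for the kernel's struct kref. */
struct kref {
	int refcount;
};

/* Hypothetical profiler data with the kref embedded in it. */
struct spu_prof_info {
	unsigned long sample_count;
	struct kref kref;
};

/*
 * Recover the enclosing structure from a pointer to its kref, so a
 * separate pointer to the profile data would not need to be stored.
 */
static struct spu_prof_info *prof_info_from_kref(struct kref *kref)
{
	assert(kref != NULL);
	return container_of(kref, struct spu_prof_info, kref);
}
```

With this layout, a context that stores only the kref pointer can still reach the profile data, which is why the separate profile_private member would become redundant.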
