LKML Archive on lore.kernel.org
* CONFIG_SLOW_TIMESTAMPS was Re: ANNOUNCE: CE Linux Forum - Specification V1.0 draft
From: Andi Kleen @ 2004-05-20 9:56 UTC
To: Theodore Ts'o; +Cc: tim.bird, linux-kernel
Theodore Ts'o <tytso@mit.edu> writes:
>
> B) When CONFIG_FAST_TIMESTAMPS is enabled, the kernel SHALL provide
> the following 2 routines:
>
> 2.1 void store_timestamp(timestamp_t *t)
sched_clock() already exists and does that, although it is a bit confusingly
named.
Or just use do_gettimeofday() which should be fast - it is used
in networking after all.
> 2.2 void timestamp_to_timeval(timestamp_t *t, struct timeval *tv)
I don't think this API is a good idea.
The obvious way to implement timestamp_t is just store the CPU integrated
time stamp counter into it (TSC in x86 terminology). But there are CPUs
where the TSC frequency changes when the CPU changes its core frequency
for power saving purposes.
(Newer designs generally avoid this problem, but older designs often have
it.) Now, to convert a TSC value you need an xtime value together with the
TSC value sampled at that same moment, and you need to know about
all frequency changes that happened in between, so that you can
compute the TSC offset.
do_gettimeofday() can handle this in an atomic manner; a CPU frequency
change just has to resync xtime (I am not sure it does this currently,
but the API allows it at least).
But splitting it into two functions like in your spec makes this
impossible, because you would need to keep track of all the
possible frequency changes in the timestamp_t between the store and
the timestamp_to_timeval to compute the TSC offset, which is obviously
not practical.
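To make the problem concrete, here is a minimal user-space sketch in C (not
kernel code). It models a counter that ran at different frequencies over
several segments, and compares a deferred conversion that only knows one
frequency with a conversion that knows the whole frequency history. All names
here (freq_segment, cycles_to_usec_naive, cycles_to_usec_exact, total_cycles)
are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the counter advanced `cycles` ticks while at `hz`. */
struct freq_segment {
    uint64_t cycles; /* ticks counted while running at this frequency */
    uint64_t hz;     /* counter frequency during this segment */
};

/* Naive deferred conversion: assumes one frequency for the whole interval. */
static uint64_t cycles_to_usec_naive(uint64_t cycles, uint64_t hz)
{
    assert(hz != 0);
    return cycles * 1000000ULL / hz;
}

/* Exact conversion: needs every frequency change between store and convert. */
static uint64_t cycles_to_usec_exact(const struct freq_segment *seg, int n)
{
    uint64_t usec = 0;

    for (int i = 0; i < n; i++) {
        assert(seg[i].hz != 0);
        usec += seg[i].cycles * 1000000ULL / seg[i].hz;
    }
    return usec;
}

/* The raw tick delta that a stored timestamp pair would show. */
static uint64_t total_cycles(const struct freq_segment *seg, int n)
{
    uint64_t sum = 0;

    for (int i = 0; i < n; i++)
        sum += seg[i].cycles;
    return sum;
}
```

With one second at 2 GHz followed by one second at 1 GHz, the exact
conversion yields 2,000,000 us, while the naive conversion using only the
final 1 GHz frequency yields 3,000,000 us. The raw timestamp alone does not
carry enough information to tell the two cases apart.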
Anyways, the only way to avoid this problem would be to not use the
CPU TSC, but some external timer that is independent of the CPU
frequency (this is sometimes already done to work around other
problems). The problem is just that such external timers in the
southbridge are usually a factor of 100 or more slower than a TSC access
that can stay inside the CPU, because an access first has to cross the
slow CPU front side bus.
This would turn your fast time stamp into an extremely slow time stamp.
You would probably be better off just using do_gettimeofday(), which
actually has a chance to use the TSC sanely and is not really that
slow usually.
BTW there are other problems on multiprocessor systems with this that
I don't want to go into.
-Andi
* Re: CONFIG_SLOW_TIMESTAMPS was Re: ANNOUNCE: CE Linux Forum - Specification V1.0 draft
From: Tim Bird @ 2004-05-21 21:44 UTC
To: Andi Kleen; +Cc: Theodore Ts'o, linux-kernel
Andi Kleen wrote:
> Theodore Ts'o <tytso@mit.edu> writes:
>
>>B) When CONFIG_FAST_TIMESTAMPS is enabled, the kernel SHALL provide
>>the following 2 routines:
>>
>> 2.1 void store_timestamp(timestamp_t *t)
>
>
> sched_clock() already exists and does that, although it is a bit confusingly
> named.
I'm not familiar with this call - I'll take a look at it.
Does this replace get_cycles()?
> Or just use do_gettimeofday() which should be fast - it is used
> in networking after all.
Actually, this call was inspired by the networking guys' desire
to use something besides do_gettimeofday().
See http://lwn.net/Articles/61269/
>
>
>> 2.2 void timestamp_to_timeval(timestamp_t *t, struct timeval *tv)
>
>
> I don't think this API is a good idea.
>
> The obvious way to implement timestamp_t is just store the CPU integrated
> time stamp counter into it (TSC in x86 terminology). But there are CPUs
> where the TSC frequency changes when the CPU changes its core frequency
> for power saving purposes.
The specification for this is in the context of bootup time.
See http://tree.celinuxforum.org/pubwiki/moin.cgi/TimingAPISpecification_5fR1
The spec. acknowledges that changes in clock frequency will be
problematic for this API, but ignores that in the context
of use during the bootup sequence, where the clock frequency
should be stable.
However, if these calls are used outside of bootup context,
then cpu frequency changes are something that would have to be
dealt with. David Miller's proposed solution for this is to
fall back to do_gettimeofday() when a processor lacks
adequate features to support this properly. We would probably
do the same thing in our implementations.
You acknowledge that newer TSC implementations (and things like
it on other platforms) now avoid this problem. In the future,
I think this will be less of a problem. Hopefully the specification
and discussions like this will help inform other semiconductor
vendors about desired features for fast timestamp clocks in new
chips.
>...
> Anyways, the only way to avoid this problem would be to not use the
> CPU TSC, but some external timer that is independent of the CPU
> frequency (this is sometimes already done to work around other
> problems). The problem is just that such external timers in the
> southbridge are usually a factor of 100 or more slower than a TSC access
> that can stay inside the CPU, because an access first has to cross the
> slow CPU front side bus.
>
> This would turn your fast time stamp into an extremely slow time stamp.
Yes. Our preference would be for all chips to provide good TSC-like
features. Most of the major semiconductor vendors (and SOC makers
for CE products) are members of the forum. So hopefully we can drill
this into their brains.
> You would probably be better off just using do_gettimeofday(), which
> actually has a chance to use the TSC sanely and is not really that
> slow usually.
As I said above, the idea to decouple the timestamp retrieval from
the conversion to normalized units came from the suggestion by
network guys that do_gettimeofday has too much overhead for their
packet-stamping requirements. In their case, very often the stamps
are not used, and so they thought it would be nice if the conversion
of the values into normalized units (as do_gettimeofday does) were
a separate operation, that could be performed only as needed.
In the case of instrumentation code, the conversion to normalized
units (seconds/microseconds) can often be deferred until the
dump is post-processed in user space. So if it's possible (but
only if it's possible, mind you :-), then it's nice to separate
these two operations.
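As a rough illustration of that split, here is a user-space sketch in C. The
hot path only stores a raw counter value; the conversion to microseconds
happens later, at dump time. This is not the CELF implementation. The names
(trace_buf, trace_record, trace_usec_since_start) are hypothetical, the raw
value is passed in by the caller so the sketch stays platform independent,
and the conversion assumes a constant counter frequency.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_MAX 64

/* Hypothetical trace record: only the raw counter value is stored. */
struct trace_entry {
    uint32_t event_id;
    uint64_t raw;      /* raw counter ticks, unconverted */
};

struct trace_buf {
    struct trace_entry e[TRACE_MAX];
    size_t count;
};

/* Hot path: one store, no division, no locking. Returns 0 on success. */
static int trace_record(struct trace_buf *b, uint32_t event_id, uint64_t raw)
{
    if (b->count >= TRACE_MAX)
        return -1;
    b->e[b->count].event_id = event_id;
    b->e[b->count].raw = raw;
    b->count++;
    return 0;
}

/*
 * Cold path (post-processing): convert the ticks elapsed since the first
 * entry into microseconds, assuming a constant counter frequency `hz`.
 */
static uint64_t trace_usec_since_start(const struct trace_buf *b, size_t i,
                                       uint64_t hz)
{
    assert(hz != 0 && i < b->count);
    return (b->e[i].raw - b->e[0].raw) * 1000000ULL / hz;
}
```

For example, two entries recorded 500,000 ticks apart on a 500 MHz counter
are reported as 1000 us apart once the dump is converted.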
Instrumentation code is extremely sensitive to the overhead
introduced by the timing code. Even for code which uses the
TSC we have seen something like 8% variance in timing of bootup code
because of the introduction of the instrumentation.
Another issue with do_gettimeofday() is that it is not available
as early as we would like in the bootup sequence. This is because
it requires setup, which usually doesn't happen until setup_arch().
(I know, this happens pretty soon, but we want to capture everything
from start_kernel and beyond.)
Finally, another reason (which is only mentioned very briefly
in the spec.) for not using do_gettimeofday is that we envision
instrumentation scenarios where the time measured goes even
back before start_kernel (that is, it includes the bootloader).
It is pretty easy to get uniformity of time measurements
across both firmware and the kernel using a TSC-like feature, because
the TSC requires no setup. do_gettimeofday() does not have this property.
Other CPU-external clocks could be configured by firmware, and
left alone by the kernel, but then read by this API, to achieve
this objective (of measurement uniformity across the kernel
start boundary).
Thanks very much for the feedback. I'll take a look at sched_clock(),
and, as I said above, we'll probably use do_gettimeofday() for
architectures where the CPU or platform doesn't have features that
support our needs.
The basic idea of this particular feature is to create a standardized
API available on all platforms that instrumentation system authors
can rely on to be there. A lot of the very good instrumentation
systems (LTT, timepegs, kernel function instrumentation) have historically
relied on TSC reads and many have not been portable to other platforms.
This specification, and the implementations that we're working on now
related to it, are intended to help aid in increasing that portability.
Here's a page of background info, for those interested:
http://tree.celinuxforum.org/pubwiki/moin.cgi/InstrumentationAPI
=============================
Tim Bird
Architecture Group Co-Chair
CE Linux Forum
Senior Staff Engineer
Sony Electronics
E-mail: Tim.Bird@am.sony.com
=============================
* Re: CONFIG_SLOW_TIMESTAMPS was Re: ANNOUNCE: CE Linux Forum - Specification V1.0 draft
From: Andi Kleen @ 2004-05-22 7:58 UTC
To: Tim Bird; +Cc: Andi Kleen, Theodore Ts'o, linux-kernel
On Fri, May 21, 2004 at 02:44:04PM -0700, Tim Bird wrote:
> Andi Kleen wrote:
>
> >Theodore Ts'o <tytso@mit.edu> writes:
> >
> >>B) When CONFIG_FAST_TIMESTAMPS is enabled, the kernel SHALL provide
> >>the following 2 routines:
> >>
> >> 2.1 void store_timestamp(timestamp_t *t)
> >
> >
> >sched_clock() already exists and does that, although it is a bit
> >confusingly
> >named.
>
> I'm not familiar with this call - I'll take a look at it.
> Does this replace get_cycles()?
Kind of.
They have slightly different requirements based on their original
callers. get_cycles was only intended for the random device and all that matters
is that its result is somewhat, umm..., random.
sched_clock is a fast clock, not synchronized across CPUs, intended for
use by the scheduler.
At least on i386 they are currently implemented with the same mechanism,
but that doesn't need to be always the case.
> You acknowledge that newer TSC implementations (and things like
> it on other platforms) now avoid this problem. In the future,
> I think this will be less of a problem. Hopefully the specification
> and discussions like this will help inform other semiconductor
> vendors about desired features for fast timestamp clocks in new
> chips.
That's very optimistic.
I know the x86 vendors are slowly addressing it, but most chips around still have a variable
TSC. The only x86 that has it fixed right now AFAIK is Intel's new Prescott
core (which is still pretty uncommon).
For embedded CPUs I have no data.
> As I said above, the idea to decouple the timestamp retrieval from
> the conversion to normalized units came from the suggestion by
> network guys that do_gettimeofday has too much overhead for their
> packet-stamping requirements. In their case, very often the stamps
That was probably me (if you check the original thread about this issue).
But I eventually implemented a better fix for networking - just don't
take a timestamp unless you really need it. This better version is used
in the current code.
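The idea can be sketched in user-space C as follows. This is not the actual
networking code; every name in it (stamp_users, stamp_enable, stamp_disable,
pkt_receive, slow_clock_usec) is hypothetical. A counter tracks how many
consumers have asked for timestamps, and the receive path only reads the
clock when that counter is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packet: stamp_usec is only valid when `stamped` is true. */
struct pkt {
    bool stamped;
    uint64_t stamp_usec;
};

/* Number of consumers that currently want receive timestamps. */
static int stamp_users;

/* Stand-in for an expensive clock read; counts how often it is called. */
static unsigned long clock_reads;

static uint64_t slow_clock_usec(void)
{
    clock_reads++;
    return 1000 * clock_reads; /* deterministic fake time */
}

static void stamp_enable(void)
{
    stamp_users++;
}

static void stamp_disable(void)
{
    assert(stamp_users > 0);
    stamp_users--;
}

/* Receive path: pay for the clock read only if somebody needs the stamp. */
static void pkt_receive(struct pkt *p)
{
    if (stamp_users > 0) {
        p->stamp_usec = slow_clock_usec();
        p->stamped = true;
    } else {
        p->stamp_usec = 0;
        p->stamped = false;
    }
}
```

In the common case where nobody has called stamp_enable(), pkt_receive()
never touches the clock at all, so the cost of the clock read disappears from
the hot path instead of merely being made cheaper.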
> are not used, and so they thought it would be nice if the conversion
> of the values into normalized units (as do_gettimeofday does) were
> a separate operation, that could be performed only as needed.
While the conversion can be a bit slow (due to the memory barriers
needed for the xtime seqlock) the real issue is the slow external
clock access for the timestamp on some systems.
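For readers unfamiliar with the pattern, here is a simplified single-writer
seqlock in user-space C11, showing where the ordering cost on the read side
comes from. It is a sketch, not the kernel's seqlock implementation. The
names (clock_state, clock_init, clock_write, clock_read) are hypothetical,
and it uses sequentially consistent atomics throughout, which is more
conservative than a tuned implementation needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared time state protected by a sequence counter. */
struct clock_state {
    atomic_uint seq;        /* odd while the writer is mid-update */
    _Atomic uint64_t sec;
    _Atomic uint64_t usec;
};

static void clock_init(struct clock_state *c)
{
    atomic_init(&c->seq, 0);
    atomic_init(&c->sec, 0);
    atomic_init(&c->usec, 0);
}

/* Writer side; assumes there is only ever one writer at a time. */
static void clock_write(struct clock_state *c, uint64_t sec, uint64_t usec)
{
    unsigned s = atomic_load(&c->seq);

    atomic_store(&c->seq, s + 1);   /* mark update in progress (odd) */
    atomic_store(&c->sec, sec);
    atomic_store(&c->usec, usec);
    atomic_store(&c->seq, s + 2);   /* publish (even again) */
}

/* Reader side: retry until a consistent snapshot has been observed. */
static void clock_read(struct clock_state *c, uint64_t *sec, uint64_t *usec)
{
    unsigned s0, s1;

    do {
        s0 = atomic_load(&c->seq);
        *sec = atomic_load(&c->sec);
        *usec = atomic_load(&c->usec);
        s1 = atomic_load(&c->seq);
    } while ((s0 & 1) || s0 != s1);
}
```

Every read pays for two ordered loads of the sequence counter around the
data, plus a possible retry. That overhead is real but bounded, which is why
the slow external clock access matters more.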
-Andi