LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Re: How how latent should non-preemptive scheduling be?
[not found] ` <fa.Jx/Ygtm46CVRawlA6OnfYNn6cN0@ifi.uio.no>
@ 2008-09-18 7:26 ` Sitsofe Wheeler
[not found] ` <fa.iIHgL48F3T5VAqFw3mqaf9Pzrs4@ifi.uio.no>
1 sibling, 0 replies; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-18 7:26 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Arjan van de Ven, linux-kernel, Ingo Molnar
Peter Zijlstra wrote:
> On Wed, 2008-09-17 at 14:54 -0700, Arjan van de Ven wrote:
>> On Wed, 17 Sep 2008 22:48:55 +0100
>>
>>>> Cause Maximum
>>>>
>> Percentage
>>
>> Scheduler: waiting for cpu 208 msec 59.4 %
>>
>>
>> you're rather CPU bound, and your process was woken up but didn't run for over 200 milliseconds..
>> that sounds like a scheduler fairness issue!
>
> Really hard subject. Perfect fairness requires 0 latency - which with a
> CPU only being able to run one thing at a time is impossible. So what
> latency ends up being is a measure for the convergence towards fairness.
>
> Anyway - 200ms isn't too weird depending on the circumstances. We start
> out with a 20ms latency for UP, we then multiply with 1+log2(nr_cpus)
> which in say a quad core machine ends up with 60ms. That ought to mean
> that under light load the max latency should not exceed twice that
> (basically a consequence of the Nyquist-Shannon sampling theorem IIRC).
>
> Now, if you get get under some load (by default: nr_running > 5) the
> expected latency starts to linearly grow with nr_running.
>
>>From what I gather from the reply to this email the machine was not
> doing much (and after having looked up the original email I see its a
> eeeeeeeee atom - which is dual cpu iirc, so that yields 40ms default) -
> so 200 is definately on the high side.
No, it's not an eeeeee atom. It's an eee celeron M (900Mhz) so it's
definitely a single CPU with no hyperthreading (and SMP is not enabled
in the kernel config either). It has less grunt that the atom and can't
do cpu scaling either (although it seems to have C states).
The load average is less than 0.5 but obviously I don't know if it is
periodically spiking over 5 and then smoothing out.
> What you can do to investigate this, is use the sched_wakeup tracer from
> ftrace, that should give a function trace of the highest wakeup latency
> showing what the kernel is doing.
Thanks for the hint - I was wondering where to look next.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
[not found] ` <fa.Td8xkKZKMSMghlJmEYefTRVF2kc@ifi.uio.no>
@ 2008-09-19 11:54 ` Sitsofe Wheeler
2008-09-19 14:20 ` Peter Zijlstra
2008-09-22 11:57 ` Ingo Molnar
0 siblings, 2 replies; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-19 11:54 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt
Ingo Molnar wrote:
> here's two quick howtos:
>
> http://redhat.com/~mingo/sched-devel.git/readme-tracer.txt
> http://redhat.com/~mingo/sched-devel.git/howto-trace-latencies.txt
These two files appear to be identical. Is this intentional?
Anyway after following your instructions a putting together a small
script to dice the output, I collated the 10 switches which took the
longest:
# Top ten longest switches
# Rel TS Process Abs TS
0.122161 hald-3423 1867.821170 ***
0.039438 <idle>-0 1867.379054
0.036318 hald-3423 1867.669009
0.031362 <idle>-0 1868.002762
0.030000 hald-3423 1867.699009
0.028933 <idle>-0 1867.529238
0.028539 <idle>-0 1867.228861
0.028196 <idle>-0 1867.128731
0.027763 <idle>-0 1868.101449
0.027513 <idle>-0 1867.028606
# tracer: sched_switch from around longest switch
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [00] 1867.608017: 0:140:R + 3:115:S
<idle>-0 [00] 1867.608038: 0:140:R + 3423:120:D
<idle>-0 [00] 1867.608045: 0:140:R ==> 3:115:R
ksoftirqd/0-3 [00] 1867.608048: 3:115:S ==> 3423:120:R
hald-3423 [00] 1867.629350: 3423:120:R + 6096:120:S
hald-3423 [00] 1867.632691: 3423:120:R + 3827:120:S
hald-3423 [00] 1867.669009: 3423:120:R + 3998:120:S
hald-3423 [00] 1867.699009: 3423:120:R + 6097:120:S
***hald-3423 [00] 1867.821170: 3423:120:R ==> 6096:120:R
rhythmbox-6096 [00] 1867.821219: 6096:120:S ==> 6097:120:R
rhythmbox-6097 [00] 1867.821262: 6097:120:R + 3826:120:S
rhythmbox-6097 [00] 1867.821289: 6097:120:S ==> 3826:120:R
pulseaudio-3826 [00] 1867.821332: 3826:120:R + 6097:120:S
pulseaudio-3826 [00] 1867.821374: 3826:120:S ==> 6097:120:R
rhythmbox-6097 [00] 1867.821380: 6097:120:S ==> 3998:120:R
rhythmbox-3998 [00] 1867.821709: 3998:120:S ==> 3827:120:R
pulseaudio-3827 [00] 1867.824041: 3827:120:R + 3826:120:S
> LatencyTOP version 0.4 (C) 2008 Intel Corporation
>
> Cause Maximum Percentage
> Scheduler: waiting for cpu 152.4 msec 13.8 %
> Userspace lock contention 5.0 msec 68.0 %
> Waiting for event (poll) 5.0 msec 14.3 %
> Waiting for event (select) 4.9 msec 3.5 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu 1.9 msec 0.1 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_re 1.9 msec 0.2 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu 1.9 msec 0.1 %
> Executing raw SCSI command 1.1 msec 0.0 %
> Waiting for TTY to finish sending 0.4 msec 0.0 %
>
>
> Process rhythmbox (3998) Total: 328.3 msec
> Scheduler: waiting for cpu 152.4 msec 80.0 %
> Userspace lock contention 4.1 msec 15.7 %
> Waiting for event (poll) 2.7 msec 4.1 %
> you need to enable:
>
> CONFIG_SCHED_TRACER=y
> CONFIG_CONTEXT_SWITCH_TRACER=y
I actually have both of these enabled but there's still no wakeup tracer
(as mentioned in the git kernel documentation http://tinyurl.com/4f9s4l
). The good news is that your instructions don't need the wakeup tracer.
> it's not particularly well named though. Why doesnt it say
> LATENCY_TRACER or something?
I agree it would be nicer if it had a better name.
>> Additionally I think I found a trigger - unplugging the power cable
>> from the EeePC and having it run on battery seems to then set off this
>> periodic stall every 30 seconds... There's no CPU frequency scaling
>> enabled either (Celeron M's seemingly don't have P states and support
>> for cpufreq is configured out).
>
> sounds like potential SMM triggered latencies.
I have just gone away and read about the SMM (
http://blogs.msdn.com/carmencr/archive/2005/08/31/458609.aspx ). If
you're right there is pretty much nothing that can be done about the
problem : (
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-19 11:54 ` Sitsofe Wheeler
@ 2008-09-19 14:20 ` Peter Zijlstra
2008-09-22 11:57 ` Ingo Molnar
1 sibling, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2008-09-19 14:20 UTC (permalink / raw)
To: Sitsofe Wheeler
Cc: Ingo Molnar, Arjan van de Ven, linux-kernel, Steven Rostedt
On Fri, 2008-09-19 at 12:54 +0100, Sitsofe Wheeler wrote:
> Ingo Molnar wrote:
> > here's two quick howtos:
> >
> > http://redhat.com/~mingo/sched-devel.git/readme-tracer.txt
> > http://redhat.com/~mingo/sched-devel.git/howto-trace-latencies.txt
>
> These two files appear to be identical. Is this intentional?
>
> Anyway after following your instructions a putting together a small
> script to dice the output, I collated the 10 switches which took the
> longest:
>
> # Top ten longest switches
> # Rel TS Process Abs TS
> 0.122161 hald-3423 1867.821170 ***
> 0.039438 <idle>-0 1867.379054
> 0.036318 hald-3423 1867.669009
> 0.031362 <idle>-0 1868.002762
> 0.030000 hald-3423 1867.699009
> 0.028933 <idle>-0 1867.529238
> 0.028539 <idle>-0 1867.228861
> 0.028196 <idle>-0 1867.128731
> 0.027763 <idle>-0 1868.101449
> 0.027513 <idle>-0 1867.028606
>
> # tracer: sched_switch from around longest switch
> #
> # TASK-PID CPU# TIMESTAMP FUNCTION
> # | | | | |
> <idle>-0 [00] 1867.608017: 0:140:R + 3:115:S
> <idle>-0 [00] 1867.608038: 0:140:R + 3423:120:D
> <idle>-0 [00] 1867.608045: 0:140:R ==> 3:115:R
> ksoftirqd/0-3 [00] 1867.608048: 3:115:S ==> 3423:120:R
> hald-3423 [00] 1867.629350: 3423:120:R + 6096:120:S
> hald-3423 [00] 1867.632691: 3423:120:R + 3827:120:S
> hald-3423 [00] 1867.669009: 3423:120:R + 3998:120:S
> hald-3423 [00] 1867.699009: 3423:120:R + 6097:120:S
> ***hald-3423 [00] 1867.821170: 3423:120:R ==> 6096:120:R
> rhythmbox-6096 [00] 1867.821219: 6096:120:S ==> 6097:120:R
> rhythmbox-6097 [00] 1867.821262: 6097:120:R + 3826:120:S
> rhythmbox-6097 [00] 1867.821289: 6097:120:S ==> 3826:120:R
> pulseaudio-3826 [00] 1867.821332: 3826:120:R + 6097:120:S
> pulseaudio-3826 [00] 1867.821374: 3826:120:S ==> 6097:120:R
> rhythmbox-6097 [00] 1867.821380: 6097:120:S ==> 3998:120:R
> rhythmbox-3998 [00] 1867.821709: 3998:120:S ==> 3827:120:R
> pulseaudio-3827 [00] 1867.824041: 3827:120:R + 3826:120:S
>
Its actually function tracer output I'm interested in.. that shows what
all its doing to make it take 120+ms.
I thought we had a wakeup latency tracer exacty like we have preempt and
irq off latency tracers, Steve, where'd that go?
> > sounds like potential SMM triggered latencies.
>
> I have just gone away and read about the SMM (
> http://blogs.msdn.com/carmencr/archive/2005/08/31/458609.aspx ). If
> you're right there is pretty much nothing that can be done about the
> problem : (
Yeah, SMM/SMI is nasty stuff :-(
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-19 11:54 ` Sitsofe Wheeler
2008-09-19 14:20 ` Peter Zijlstra
@ 2008-09-22 11:57 ` Ingo Molnar
2008-09-22 12:07 ` Steven Rostedt
2008-09-23 6:33 ` Sitsofe Wheeler
1 sibling, 2 replies; 28+ messages in thread
From: Ingo Molnar @ 2008-09-22 11:57 UTC (permalink / raw)
To: Sitsofe Wheeler
Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt
* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> Ingo Molnar wrote:
>> here's two quick howtos:
>>
>> http://redhat.com/~mingo/sched-devel.git/readme-tracer.txt
>> http://redhat.com/~mingo/sched-devel.git/howto-trace-latencies.txt
>
> These two files appear to be identical. Is this intentional?
>
> Anyway after following your instructions a putting together a small
> script to dice the output, I collated the 10 switches which took the
> longest:
>
> # Top ten longest switches
> # Rel TS Process Abs TS
> 0.122161 hald-3423 1867.821170 ***
> 0.039438 <idle>-0 1867.379054
> 0.036318 hald-3423 1867.669009
> 0.031362 <idle>-0 1868.002762
> 0.030000 hald-3423 1867.699009
> 0.028933 <idle>-0 1867.529238
> 0.028539 <idle>-0 1867.228861
> 0.028196 <idle>-0 1867.128731
> 0.027763 <idle>-0 1868.101449
> 0.027513 <idle>-0 1867.028606
>
> # tracer: sched_switch from around longest switch
> #
> # TASK-PID CPU# TIMESTAMP FUNCTION
> # | | | | |
> <idle>-0 [00] 1867.608017: 0:140:R + 3:115:S
> <idle>-0 [00] 1867.608038: 0:140:R + 3423:120:D
> <idle>-0 [00] 1867.608045: 0:140:R ==> 3:115:R
> ksoftirqd/0-3 [00] 1867.608048: 3:115:S ==> 3423:120:R
> hald-3423 [00] 1867.629350: 3423:120:R + 6096:120:S
> hald-3423 [00] 1867.632691: 3423:120:R + 3827:120:S
> hald-3423 [00] 1867.669009: 3423:120:R + 3998:120:S
> hald-3423 [00] 1867.699009: 3423:120:R + 6097:120:S
> ***hald-3423 [00] 1867.821170: 3423:120:R ==> 6096:120:R
> rhythmbox-6096 [00] 1867.821219: 6096:120:S ==> 6097:120:R
> rhythmbox-6097 [00] 1867.821262: 6097:120:R + 3826:120:S
> rhythmbox-6097 [00] 1867.821289: 6097:120:S ==> 3826:120:R
> pulseaudio-3826 [00] 1867.821332: 3826:120:R + 6097:120:S
> pulseaudio-3826 [00] 1867.821374: 3826:120:S ==> 6097:120:R
> rhythmbox-6097 [00] 1867.821380: 6097:120:S ==> 3998:120:R
> rhythmbox-3998 [00] 1867.821709: 3998:120:S ==> 3827:120:R
> pulseaudio-3827 [00] 1867.824041: 3827:120:R + 3826:120:S
>
>
>> LatencyTOP version 0.4 (C) 2008 Intel Corporation
>>
>> Cause Maximum Percentage
>> Scheduler: waiting for cpu 152.4 msec 13.8 %
>> Userspace lock contention 5.0 msec 68.0 %
>> Waiting for event (poll) 5.0 msec 14.3 %
>> Waiting for event (select) 4.9 msec 3.5 %
>> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu 1.9 msec 0.1 %
>> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_re 1.9 msec 0.2 %
>> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu 1.9 msec 0.1 %
>> Executing raw SCSI command 1.1 msec 0.0 %
>> Waiting for TTY to finish sending 0.4 msec 0.0 %
>>
>>
>> Process rhythmbox (3998) Total: 328.3 msec
>> Scheduler: waiting for cpu 152.4 msec 80.0 %
>> Userspace lock contention 4.1 msec 15.7 %
>> Waiting for event (poll) 2.7 msec 4.1 %
>
>
>> you need to enable:
>>
>> CONFIG_SCHED_TRACER=y
>> CONFIG_CONTEXT_SWITCH_TRACER=y
>
> I actually have both of these enabled but there's still no wakeup tracer
> (as mentioned in the git kernel documentation http://tinyurl.com/4f9s4l
> ). The good news is that your instructions don't need the wakeup tracer.
>
>> it's not particularly well named though. Why doesnt it say
>> LATENCY_TRACER or something?
>
> I agree it would be nicer if it had a better name.
>
>>> Additionally I think I found a trigger - unplugging the power cable
>>> from the EeePC and having it run on battery seems to then set off
>>> this periodic stall every 30 seconds... There's no CPU frequency
>>> scaling enabled either (Celeron M's seemingly don't have P states and
>>> support for cpufreq is configured out).
>>
>> sounds like potential SMM triggered latencies.
>
> I have just gone away and read about the SMM (
> http://blogs.msdn.com/carmencr/archive/2005/08/31/458609.aspx ). If
> you're right there is pretty much nothing that can be done about the
> problem : (
well, since they went away after you enabled CONFIG_PREEMPT=y, they are
definitely in-kernel latencies, not any external SMM latencies.
I.e. they are inherently fixable. Could you enable:
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FTRACE_MCOUNT_RECORD=y
that should make the traces a lot more verbose - every kernel function
executed in the latency path will be logged. That way we'll be able to
say which one takes that long.
note, you might have to increase /debug/tracing/trace_entries to get a
long enough trace to capture the relevant portion of the latency.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-22 11:57 ` Ingo Molnar
@ 2008-09-22 12:07 ` Steven Rostedt
2008-09-23 6:33 ` Sitsofe Wheeler
1 sibling, 0 replies; 28+ messages in thread
From: Steven Rostedt @ 2008-09-22 12:07 UTC (permalink / raw)
To: Ingo Molnar
Cc: Sitsofe Wheeler, Peter Zijlstra, Arjan van de Ven, linux-kernel
On Mon, 22 Sep 2008, Ingo Molnar wrote:
> * Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> >>
> >> sounds like potential SMM triggered latencies.
> >
> > I have just gone away and read about the SMM (
> > http://blogs.msdn.com/carmencr/archive/2005/08/31/458609.aspx ). If
> > you're right there is pretty much nothing that can be done about the
> > problem : (
>
> well, since they went away after you enabled CONFIG_PREEMPT=y, they are
> definitely in-kernel latencies, not any external SMM latencies.
>
> I.e. they are inherently fixable. Could you enable:
>
> CONFIG_DYNAMIC_FTRACE=y
> CONFIG_FTRACE_MCOUNT_RECORD=y
>
> that should make the traces a lot more verbose - every kernel function
> executed in the latency path will be logged. That way we'll be able to
> say which one takes that long.
>
> note, you might have to increase /debug/tracing/trace_entries to get a
> long enough trace to capture the relevant portion of the latency.
Also note, to modify trace_entries, you must be in the none (nop?) tracer,
otherwise the size will not be effected.
If you find the trace is also too big, you can echo a list of functions
into:
/debug/tracing/set_ftrace_notrace
to not trace those functions. using '>' will remove any existing function
in that file, but using '>>' will append functions to the file.
For a list of functions that you can add, see:
/debug/tracing/available_filter_functions
-- Steve
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-22 11:57 ` Ingo Molnar
2008-09-22 12:07 ` Steven Rostedt
@ 2008-09-23 6:33 ` Sitsofe Wheeler
2008-09-23 11:53 ` Ingo Molnar
1 sibling, 1 reply; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-23 6:33 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt
Ingo Molnar wrote:
> well, since they went away after you enabled CONFIG_PREEMPT=y, they are
> definitely in-kernel latencies, not any external SMM latencies.
>
> I.e. they are inherently fixable. Could you enable:
>
> CONFIG_DYNAMIC_FTRACE=y
> CONFIG_FTRACE_MCOUNT_RECORD=y
>
> that should make the traces a lot more verbose - every kernel function
> executed in the latency path will be logged. That way we'll be able to
> say which one takes that long.
I do not appear to have the CONFIG_FTRACE_MCOUNT_RECORD option in
2.6.27rc7. Is it an option that is only in -tip ?
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-23 6:33 ` Sitsofe Wheeler
@ 2008-09-23 11:53 ` Ingo Molnar
2008-09-23 16:30 ` Sitsofe Wheeler
0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2008-09-23 11:53 UTC (permalink / raw)
To: Sitsofe Wheeler
Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt
* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> Ingo Molnar wrote:
>> well, since they went away after you enabled CONFIG_PREEMPT=y, they are
>> definitely in-kernel latencies, not any external SMM latencies.
>>
>> I.e. they are inherently fixable. Could you enable:
>>
>> CONFIG_DYNAMIC_FTRACE=y
>> CONFIG_FTRACE_MCOUNT_RECORD=y
>>
>> that should make the traces a lot more verbose - every kernel function
>> executed in the latency path will be logged. That way we'll be able to
>> say which one takes that long.
>
> I do not appear to have the CONFIG_FTRACE_MCOUNT_RECORD option in
> 2.6.27rc7. Is it an option that is only in -tip ?
yeah - it's a new ftrace feature queued up for v2.6.28.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-23 11:53 ` Ingo Molnar
@ 2008-09-23 16:30 ` Sitsofe Wheeler
2008-09-23 19:39 ` Sitsofe Wheeler
0 siblings, 1 reply; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-23 16:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt
Ingo Molnar wrote:
> * Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
>
>> Ingo Molnar wrote:
>>> well, since they went away after you enabled CONFIG_PREEMPT=y, they are
>>> definitely in-kernel latencies, not any external SMM latencies.
>>>
>>> I.e. they are inherently fixable. Could you enable:
>>>
>>> CONFIG_DYNAMIC_FTRACE=y
>>> CONFIG_FTRACE_MCOUNT_RECORD=y
>>>
>>> that should make the traces a lot more verbose - every kernel function
>>> executed in the latency path will be logged. That way we'll be able to
>>> say which one takes that long.
>> I do not appear to have the CONFIG_FTRACE_MCOUNT_RECORD option in
>> 2.6.27rc7. Is it an option that is only in -tip ?
>
> yeah - it's a new ftrace feature queued up for v2.6.28.
I've been struggling to boot -tip/master - currently it blows up just
after printing SLUB information saying:
BUG: unable to handle kernel NULL pointer dereference at 00000004
IP: [<c0120078>] account_system_time+0x48/0x120
*pde = 00000000
Thread overran stack, or stack corrupted
Oops: 0002 [#1] PREEMPT
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-23 16:30 ` Sitsofe Wheeler
@ 2008-09-23 19:39 ` Sitsofe Wheeler
2008-09-23 22:01 ` Sitsofe Wheeler
0 siblings, 1 reply; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-23 19:39 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Peter Zijlstra, Arjan van de Ven, linux-kernel,
Steven Rostedt
Sitsofe Wheeler wrote:
> I've been struggling to boot -tip/master - currently it blows up just
> after printing SLUB information saying:
>
> BUG: unable to handle kernel NULL pointer dereference at 00000004
> IP: [<c0120078>] account_system_time+0x48/0x120
> *pde = 00000000
> Thread overran stack, or stack corrupted
> Oops: 0002 [#1] PREEMPT
OK this BUG seems to have gone away in the past few hours.
Ingo when you were asking for the ftrace report I presume that would be
for a non preempt kernel (as a preemptive one only showed a very worst
latency of 19657 us in one exceptional case)?
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-23 19:39 ` Sitsofe Wheeler
@ 2008-09-23 22:01 ` Sitsofe Wheeler
2008-09-27 20:48 ` Ingo Molnar
0 siblings, 1 reply; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-23 22:01 UTC (permalink / raw)
Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Arjan van de Ven,
Steven Rostedt
Sitsofe Wheeler wrote:
> Ingo when you were asking for the ftrace report I presume that would be
> for a non preempt kernel (as a preemptive one only showed a very worst
> latency of 19657 us in one exceptional case)?
With a non preempt kernel I found the logs would simply become too big
without filtering. On Peter's suggestion I used the following to remove
the most frequently called functions:
echo acpi_os_release_object > set_ftrace_notrace && echo kmem_cache_* >>
set_ftrace_notrace && echo acpi_ut_* >> set_ftrace_notrace
By doing counts across multiple runs I would say that the most
frequently called functions are the following (in most frequently called
order). The counts are definitely approximate but are reasonable
relative to each other.
475325 acpi_os_release_object (acpi_ut_delete_generic_state)
406895 kmem_cache_free (acpi_os_release_object)
402838 kmem_cache_alloc (acpi_ut_create_generic_state)
132968 acpi_ut_update_ref_count (acpi_ut_update_object_reference)
131041 acpi_ut_pop_generic_state (acpi_ut_update_object_reference)
131036 acpi_ut_delete_generic_state (acpi_ut_update_object_reference)
131025 acpi_ut_create_generic_state (acpi_ut_create_update_state)
131023 acpi_ut_create_update_state_and_push
(acpi_ut_update_object_reference)
131020 acpi_ut_create_update_state (acpi_ut_create_update_state_and_push)
131018 acpi_ut_push_generic_state (acpi_ut_create_update_state_and_push)
60147 acpi_ns_get_next_node (acpi_ns_delete_namespace_by_owner)
28974 acpi_ns_get_next_valid_node (acpi_ns_get_next_node)
Logs with the filtering on can be seen here (15Mbytes decompressed each):
http://sucs.org/~sits/test/eeepc-debug/20080923/latency_trace.gz
http://sucs.org/~sits/test/eeepc-debug/20080923/trace.txt.gz
It looks like lots of acpi state is created and deleted...
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-23 22:01 ` Sitsofe Wheeler
@ 2008-09-27 20:48 ` Ingo Molnar
2008-09-28 20:56 ` Sitsofe Wheeler
0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2008-09-27 20:48 UTC (permalink / raw)
To: Sitsofe Wheeler
Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt
* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> Logs with the filtering on can be seen here (15Mbytes decompressed each):
> http://sucs.org/~sits/test/eeepc-debug/20080923/latency_trace.gz
> http://sucs.org/~sits/test/eeepc-debug/20080923/trace.txt.gz
>
> It looks like lots of acpi state is created and deleted...
yeah. The latency starts here:
cat-5901 0dNh. 1us : default_wake_function (__wake_up_common)
cat-5901 0.Nh. 2us : kill_fasync (snd_pcm_period_elapsed)
[...]
and ends here:
[...]
cat-5901 0.N.. 270501us+: mutex_lock (acpi_ec_transaction)
cat-5901 0d... 270507us : __cond_resched (_cond_resched)
270 _milliseconds_ later. That's excessive.
The main overhead is starting here:
cat-5901 0.N.. 167us : acpi_ds_result_push (acpi_ds_exec_end_op)
lots of ACPI code executed ...
does it get better if you have CONFIG_PREEMPT_VOLUNTARY=y enabled? That
_should_ break up this section neatly. If it doesnt then please add a
might_sleep() check to kernel/kernel/semaphore.c's down_timeout()
function - that is called a number of times in this trace.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-27 20:48 ` Ingo Molnar
@ 2008-09-28 20:56 ` Sitsofe Wheeler
2008-09-29 8:37 ` Ingo Molnar
0 siblings, 1 reply; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-28 20:56 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt
Ingo Molnar wrote:
> does it get better if you have CONFIG_PREEMPT_VOLUNTARY=y enabled? That
CONFIG_PREEMPT_VOLUNTARY=y has always been enabled in these results.
> _should_ break up this section neatly. If it doesnt then please add a
> might_sleep() check to kernel/kernel/semaphore.c's down_timeout()
> function - that is called a number of times in this trace.
I added might_sleep() to the start of down_timeout() but it neither
printed anything to dmesg nor changed the latency issue...
Strangely stalling only seems to turn up in linux-tip kernels with very
little debugging options set within them. I have a linux-tip that has
lots of extra debugging options set and this problem doesn't show up...
I've also asked about this on the ACPI mailing list and had a huge
amount of help from finding a good point for a cond_resched -
http://marc.info/?l=linux-acpi&m=122236450105747&w=2 . Certain things
(like games) still stutter when the battery is read but it's enough to
play back sound in rhythmbox without stuttering.
Any ideas why the issue would go away with a debug kernel though?
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-28 20:56 ` Sitsofe Wheeler
@ 2008-09-29 8:37 ` Ingo Molnar
2008-09-29 23:11 ` Sitsofe Wheeler
0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2008-09-29 8:37 UTC (permalink / raw)
To: Sitsofe Wheeler
Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt
* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> Ingo Molnar wrote:
>> does it get better if you have CONFIG_PREEMPT_VOLUNTARY=y enabled? That
>>
>
> CONFIG_PREEMPT_VOLUNTARY=y has always been enabled in these results.
okay - then this definitely needs fixed.
>> _should_ break up this section neatly. If it doesnt then please add a
>> might_sleep() check to kernel/kernel/semaphore.c's down_timeout()
>> function - that is called a number of times in this trace.
>
> I added might_sleep() to the start of down_timeout() but it neither
> printed anything to dmesg nor changed the latency issue...
>
> Strangely stalling only seems to turn up in linux-tip kernels with
> very little debugging options set within them. I have a linux-tip that
> has lots of extra debugging options set and this problem doesn't show
> up...
>
> I've also asked about this on the ACPI mailing list and had a huge
> amount of help from finding a good point for a cond_resched -
> http://marc.info/?l=linux-acpi&m=122236450105747&w=2 . Certain things
> (like games) still stutter when the battery is read but it's enough to
> play back sound in rhythmbox without stuttering.
>
> Any ideas why the issue would go away with a debug kernel though?
hm, that's weird indeed. You could try to trace the full battery readout
itself, via a script like this:
cat /debug/tracing/trace > /dev/null # drain trace
cat /proc/acpi/whatever
cat /debug/tracing/trace > trace.txt
this way you should be seeing a full trace of the whole thing. (with the
'ftrace' tracer plugin set in current_tracer)
then, assuming the trace.txt is complete, you can check where it
reschedules and why. Search for 'schedule' calls. Compare the debug
versus non-debug traces.
You can even turn on stack backtrace tracing feature, via:
echo stacktrace > /debug/tracing/iter_ctrl
this adds extra trace entries that show the call path of every
reschedule. [this attribute is default-disabled.]
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-29 8:37 ` Ingo Molnar
@ 2008-09-29 23:11 ` Sitsofe Wheeler
2008-09-30 11:22 ` Ingo Molnar
0 siblings, 1 reply; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-29 23:11 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt
Ingo Molnar wrote:
> * Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
>
>> Any ideas why the issue would go away with a debug kernel though?
>
> hm, that's weird indeed. You could try to trace the full battery readout
> itself, via a script like this:
>
> cat /debug/tracing/trace > /dev/null # drain trace
> cat /proc/acpi/whatever
> cat /debug/tracing/trace > trace.txt
>
> You can even turn on stack backtrace tracing feature, via:
>
> echo stacktrace > /debug/tracing/iter_ctrl
>
> this adds extra trace entries that show the call path of every
> reschedule. [this attribute is default-disabled.]
(Something bad seems to be happening with my kernels as I have been
finding strange problems like network-manager detecting the wifi as a
wired interface unless I did a make clean on my kernel sources before
compiling). The stalling issue seems to keep coming and going and is
currently showing up in both debug and non debug kernels.
There are traces when looking at battery information on both AC (where
the problem is never there) and battery (where the problem always is
there) here:
http://sucs.org/~sits/test/eeepc-debug/20080929/trace-on-ac.txt.bz2
http://sucs.org/~sits/test/eeepc-debug/20080929/trace-on-battery.txt.bz2
(15Mbytes uncompressed)
I wasn't quite sure how to tell when a reschedule was done. I simply did
a grep schedule on the file and most of it seemed reasonable. One part
that caught my eye was the following:
speaker-test-3891 [000] 1392.751699: __switch_to <-schedule
<idle>-0 [000] 1392.752070: account_scheduler_latency
<-enqueue_task_fair
<idle>-0 [000] 1392.752091: __switch_to <-schedule
cat-7717 [000] 1392.752092: del_timer <-schedule_timeout
cat-7717 [000] 1392.772263: account_scheduler_latency
<-enqueue_task_fair
cat-7717 [000] 1392.773224: __tasklet_schedule <-kbd_event
cat-7717 [000] 1392.773225: schedule_console_callback
<-kbd_event
cat-7717 [000] 1392.773225: schedule_work
<-schedule_console_callback
cat-7717 [000] 1392.773226: queue_work <-schedule_work
cat-7717 [000] 1392.773232: account_scheduler_latency
<-enqueue_task_fair
cat-7717 [000] 1392.773240: account_scheduler_latency
<-enqueue_task_fair
cat-7717 [000] 1392.773251: schedule_delayed_work
<-put_queue
cat-7717 [000] 1392.773251: queue_delayed_work
<-schedule_delayed_work
cat-7717 [000] 1392.773258: schedule_console_callback
<-kbd_event
cat-7717 [000] 1392.773258: schedule_work
<-schedule_console_callback
cat-7717 [000] 1392.773259: queue_work <-schedule_work
cat-7717 [000] 1392.773267: schedule_console_callback
<-kbd_event
cat-7717 [000] 1392.773268: schedule_work
<-schedule_console_callback
cat-7717 [000] 1392.773268: queue_work <-schedule_work
cat-7717 [000] 1392.807931: account_scheduler_latency
<-enqueue_task_fair
cat-7717 [000] 1392.839464: __tasklet_schedule <-ath5k_intr
cat-7717 [000] 1392.839506: account_scheduler_latency
<-enqueue_task_fair
cat-7717 [000] 1392.941867: __tasklet_schedule <-ath5k_intr
cat-7717 [000] 1393.005963: __tasklet_schedule <-ath5k_intr
cat-7717 [000] 1393.033222: __switch_to <-schedule
Here a schedule seemingly doesn't happen for a few hundredths of a second...
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-29 23:11 ` Sitsofe Wheeler
@ 2008-09-30 11:22 ` Ingo Molnar
2008-09-30 13:18 ` Sitsofe Wheeler
0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2008-09-30 11:22 UTC (permalink / raw)
To: Sitsofe Wheeler
Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt
* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> cat-7717 [000] 1392.941867: __tasklet_schedule <-ath5k_intr
> cat-7717 [000] 1393.005963: __tasklet_schedule <-ath5k_intr
> cat-7717 [000] 1393.033222: __switch_to <-schedule
>
> Here a schedule seemingly doesn't happen for a few hundredths of a second...
that's about 60 msecs. ath5k_intr IRQ handling overhead? IRQ handlers
are non-preemptible. (even under CONFIG_PREEMPT=y)
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-30 11:22 ` Ingo Molnar
@ 2008-09-30 13:18 ` Sitsofe Wheeler
2008-10-04 10:50 ` Reading EeePC900 battery info causes stalls when using SLUB (was Re: How how latent should non-preemptive scheduling be?) Sitsofe Wheeler
0 siblings, 1 reply; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-30 13:18 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt
Ingo Molnar wrote:
> * Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
>
>> cat-7717 [000] 1392.941867: __tasklet_schedule <-ath5k_intr
>> cat-7717 [000] 1393.005963: __tasklet_schedule <-ath5k_intr
>> cat-7717 [000] 1393.033222: __switch_to <-schedule
>>
>> Here a schedule seemingly doesn't happen for a few hundredths of a second...
>
> that's about 60 msecs. ath5k_intr IRQ handling overhead? IRQ handlers
> are non-preemptible. (even under CONFIG_PREEMPT=y)
What about the piece leading up to that:
cat-7717 [000] 1392.752092: del_timer <-schedule_timeout
cat-7717 [000] 1392.772263: account_scheduler_latency
<-enqueue_task_fair
cat-7717 [000] 1392.773224: __tasklet_schedule <-kbd_event
cat-7717 [000] 1392.773225: schedule_console_callback
<-kbd_event
cat-7717 [000] 1392.773225: schedule_work
<-schedule_console_callback
cat-7717 [000] 1392.773226: queue_work <-schedule_work
cat-7717 [000] 1392.773232: account_scheduler_latency
<-enqueue_task_fair
cat-7717 [000] 1392.773240: account_scheduler_latency
<-enqueue_task_fair
cat-7717 [000] 1392.773251: schedule_delayed_work
<-put_queue
cat-7717 [000] 1392.773251: queue_delayed_work
<-schedule_delayed_work
cat-7717 [000] 1392.773258: schedule_console_callback
<-kbd_event
cat-7717 [000] 1392.773258: schedule_work
<-schedule_console_callback
cat-7717 [000] 1392.773259: queue_work <-schedule_work
cat-7717 [000] 1392.773267: schedule_console_callback
<-kbd_event
cat-7717 [000] 1392.773268: schedule_work
<-schedule_console_callback
cat-7717 [000] 1392.773268: queue_work <-schedule_work
cat-7717 [000] 1392.807931: account_scheduler_latency
<-enqueue_task_fair
cat-7717 [000] 1392.839464: __tasklet_schedule <-ath5k_intr
Here's a snippet from
http://sucs.org/~sits/test/eeepc-debug/20080929/trace-on-battery.txt.bz2 :
cat-7717 [000] 1392.752092: del_timer <-schedule_timeout
cat-7717 [000] 1392.752092: acpi_ec_check_status
<-acpi_ec_wait
cat-7717 [000] 1392.752093: constant_test_bit
<-acpi_ec_check_status
cat-7717 [000] 1392.752095: mutex_unlock
<-acpi_ec_transaction
cat-7717 [000] 1392.752096: acpi_ex_enter_interpreter
<-acpi_ev_address_space_dispatch
cat-7717 [000] 1392.752097: acpi_os_wait_semaphore
<-acpi_ut_acquire_mutex
cat-7717 [000] 1392.752097: down_timeout
<-acpi_os_wait_semaphore
cat-7717 [000] 1392.752098: acpi_ex_release_global_lock
<-acpi_ex_read_data_from_field
cat-7717 [000] 1392.752099: acpi_ex_release_mutex_object
<-acpi_ex_release_global_lock
cat-7717 [000] 1392.752100: acpi_ev_release_global_lock
<-acpi_ex_release_mutex_object
cat-7717 [000] 1392.752100: __acpi_release_global_lock
<-acpi_ev_release_global_lock
cat-7717 [000] 1392.752101: acpi_os_signal_semaphore
<-acpi_ev_release_global_lock
cat-7717 [000] 1392.752102: up <-acpi_os_signal_semaphore
...
cat-7717 [000] 1392.839447:
acpi_ex_truncate_for32bit_table <-acpi_ds_exec_end_op
cat-7717 [000] 1392.839447: acpi_ps_next_parse_state
<-acpi_ps_parse_loop
cat-7717 [000] 1392.839448: acpi_ps_complete_op
<-acpi_ps_parse_loop
cat-7717 [000] 1392.839448: acpi_ps_complete_this_op
<-acpi_ps_complete_op
cat-7717 [000] 1392.839449: acpi_ps_delete_parse_tree
<-acpi_ps_complete_this_op
cat-7717 [000] 1392.839449: acpi_ps_get_arg
<-acpi_ps_delete_parse_tree
cat-7717 [000] 1392.839450: acpi_ps_get_opcode_info
<-acpi_ps_get_arg
cat-7717 [000] 1392.839450: acpi_ps_get_arg
<-acpi_ps_delete_parse_tree
cat-7717 [000] 1392.839451: acpi_ps_get_opcode_info
<-acpi_ps_get_arg
cat-7717 [000] 1392.839452: do_IRQ <-common_interrupt
cat-7717 [000] 1392.839453: irq_enter <-do_IRQ
cat-7717 [000] 1392.839453: irq_to_desc <-do_IRQ
cat-7717 [000] 1392.839454: handle_fasteoi_irq <-do_IRQ
cat-7717 [000] 1392.839455: handle_IRQ_event
<-handle_fasteoi_irq
cat-7717 [000] 1392.839455: usb_hcd_irq <-handle_IRQ_event
cat-7717 [000] 1392.839456: uhci_irq <-usb_hcd_irq
cat-7717 [000] 1392.839458: ath5k_intr <-handle_IRQ_event
cat-7717 [000] 1392.839459: ath5k_hw_is_intr_pending
<-ath5k_intr
cat-7717 [000] 1392.839461: ath5k_hw_get_isr <-ath5k_intr
cat-7717 [000] 1392.839464: __tasklet_schedule <-ath5k_intr
Why didn't another process cut in from 1392.752092 to 1392.839464? In
the schedule grep normally runs for cat are half as long e.g.:
<idle>-0 [000] 1392.747102: __switch_to <-schedule
cat-7717 [000] 1392.747103: del_timer <-schedule_timeout
cat-7717 [000] 1392.747107:
schedule_timeout_uninterruptible <-msleep
cat-7717 [000] 1392.747107: schedule_timeout
<-schedule_timeout_uninterruptible
cat-7717 [000] 1392.747108: init_timer <-schedule_timeout
cat-7717 [000] 1392.747108: __mod_timer <-schedule_timeout
cat-7717 [000] 1392.747111: __switch_to <-schedule
<idle>-0 [000] 1392.749067: account_scheduler_latency
<-enqueue_task_fair
Further why (after taking a break to process the ath5k interrupt) did it
not schedule another process other than cat until 1393.033226:
cat-7717 [000] 1393.033215: set_normalized_timespec
<-ktime_get_ts
cat-7717 [000] 1393.033216: ktime_add_safe
<-hrtimer_start_range_ns
cat-7717 [000] 1393.033217: ktime_add_safe
<-hrtimer_start_range_ns
cat-7717 [000] 1393.033217: enqueue_hrtimer
<-hrtimer_start_range_ns
cat-7717 [000] 1393.033222: __switch_to <-schedule
hald-addon-inpu-3237 [000] 1393.033226: fget_light <-do_sys_poll
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Reading EeePC900 battery info causes stalls when using SLUB (was Re: How how latent should non-preemptive scheduling be?)
2008-09-30 13:18 ` Sitsofe Wheeler
@ 2008-10-04 10:50 ` Sitsofe Wheeler
0 siblings, 0 replies; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-10-04 10:50 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Peter Zijlstra, Arjan van de Ven, Steven Rostedt,
Christoph Lameter
Sitsofe Wheeler wrote:
> I have an EeePC 900 (Intel Celeron 900Mhz) and it seems to be
> skipping while playing sound through various desktop apps with a
> 2.6.27rc6 kernel. It is running off an SD card which really shows up
After weeks of head scratching as to why these latencies didn't occur
using the xandros 2.6.21.4 kernel (but keeping the same userspace) when
my own kernels would always show this problem I finally found the answer
after reading
http://tservice.net.ru/~s0mbre/blog/devel/other/2008_10_01 on kernel
planet - SLUB can cause regressions compared to SLAB. Switching from
SLUB to SLAB made the problem more or less disappear (which I guess
makes sense given the large number of kmem_* calls that are made when
reading the battery).
My understanding with the memory allocators is that SLOB is the smallest
and simplest. Its forté is that it uses very little memory which makes
it ideal for embedded systems and is the easiest allocator to
understand. The downside is that it might tend towards memory
fragmentation the longer a system runs and is apparently a little bit
slower than SLAB. The SLAB allocator is something that the linux kernel
has had for many years and was a good performer until the pressures to
perform with large SMP systems started to come in to play (at which
point lots of memory would be used up on fixed structures -
http://lwn.net/Articles/229984/ ). SLUB is the newest allocator, scales
up well and has finer grained diagnostics that SLAB.
Prior to today my understanding was that SLUB would be able to replace
SLAB in all scenarios and perform just as well or better. However now
I'm not so sure (SLAB appears to be less latent than SLUB). Other than
SLUB's newness are there cases where SLAB should be chosen instead of
SLUB (e.g. uniprocessor desktop systems with hundreds of megabytes of
RAM)? Will SLAB exist as an alternative to SLUB for certain setups?
--
Sitsofe | http://sucs.org/~sits
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-21 20:56 ` Matt Keenan
@ 2008-09-22 6:50 ` Sitsofe Wheeler
0 siblings, 0 replies; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-22 6:50 UTC (permalink / raw)
To: Matt Keenan; +Cc: linux-kernel, Ingo Molnar
Matt Keenan wrote:
> If you use ALSA you could write a custom .asoundrc file and increase
> ALSA's default buffer size, this gives the hardware a longer buffer to
> play which may work around the problem. This used to do the trick for me
> with the old scheduler with slow machines. Not a fix but if nothing else
> works...
In all fairness turning on preemption also helps (I don't hear any drops
or see any stalls once it's on) but I'm trying to work out if this is an
underlying problem that can be fixed (and then there is the case of the
default desktop distros).
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-17 21:18 Sitsofe Wheeler
2008-09-17 21:28 ` Sitsofe Wheeler
2008-09-17 21:34 ` Arjan van de Ven
@ 2008-09-21 20:56 ` Matt Keenan
2008-09-22 6:50 ` Sitsofe Wheeler
2 siblings, 1 reply; 28+ messages in thread
From: Matt Keenan @ 2008-09-21 20:56 UTC (permalink / raw)
To: Sitsofe Wheeler; +Cc: linux-kernel, Ingo Molnar
On Wed, 2008-09-17 at 22:18 +0100, Sitsofe Wheeler wrote:
> Hi,
>
> I have an EeePC 900 (Intel Celeron 900Mhz) and it seems to be skipping
> while playing sound through various desktop apps with a 2.6.27rc6
> kernel. It is running off an SD card which really shows up slow writes
> but the sound is seemingly skipping even when ext3 is not being used.
[snip snip]
If you use ALSA you could write a custom .asoundrc file and increase
ALSA's default buffer size, this gives the hardware a longer buffer to
play which may work around the problem. This used to do the trick for me
with the old scheduler with slow machines. Not a fix but if nothing else
works...
Matt
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-18 18:25 ` Sitsofe Wheeler
@ 2008-09-19 8:44 ` Ingo Molnar
0 siblings, 0 replies; 28+ messages in thread
From: Ingo Molnar @ 2008-09-19 8:44 UTC (permalink / raw)
To: Sitsofe Wheeler
Cc: Peter Zijlstra, Arjan van de Ven, linux-kernel, Steven Rostedt
* Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> Peter Zijlstra wrote:
>> What you can do to investigate this, is use the sched_wakeup tracer from
>> ftrace, that should give a function trace of the highest wakeup latency
>> showing what the kernel is doing.
>
> I struggled to find documentation of ftrace because it's quite new. I
> have come across
> http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/1.0/html/Realtime_Tuning_Guide/sect-Realtime_Tuning_Guide-Realtime_Specific_Tuning-Using_the_ftrace_Utility_for_Tracing_Latencies.html
> and
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/ftrace.txt;h=d330fe3103da9c9a3cb8f888ac7255ce48e666d4;hb=45e9c0de2e86485f8b6633fd64ab19cfbff167f6
> .
>
> Thanks to those I started up the debugfs filesystem and went to the
> trace directories but the on tracers in available_tracers are
> ftrace sched_switch none
>
> I can't see anything in the code that would disable wakeup... Any ideas
> on what might be wrong? I'm using a 2.6.27rc6 kernel.
here's two quick howtos:
http://redhat.com/~mingo/sched-devel.git/readme-tracer.txt
http://redhat.com/~mingo/sched-devel.git/howto-trace-latencies.txt
you need to enable:
CONFIG_SCHED_TRACER=y
CONFIG_CONTEXT_SWITCH_TRACER=y
it's not particularly well named though. Why doesnt it say
LATENCY_TRACER or something?
> Additionally I think I found a trigger - unplugging the power cable
> from the EeePC and having it run on battery seems to then set off this
> periodic stall every 30 seconds... There's no CPU frequency scaling
> enabled either (Celeron M's seemingly don't have P states and support
> for cpufreq is configured out).
sounds like potential SMM triggered latencies.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-18 2:42 ` Peter Zijlstra
@ 2008-09-18 18:25 ` Sitsofe Wheeler
2008-09-19 8:44 ` Ingo Molnar
0 siblings, 1 reply; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-18 18:25 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Arjan van de Ven, linux-kernel, Ingo Molnar
Peter Zijlstra wrote:
> What you can do to investigate this, is use the sched_wakeup tracer from
> ftrace, that should give a function trace of the highest wakeup latency
> showing what the kernel is doing.
I struggled to find documentation of ftrace because it's quite new. I
have come across
http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/1.0/html/Realtime_Tuning_Guide/sect-Realtime_Tuning_Guide-Realtime_Specific_Tuning-Using_the_ftrace_Utility_for_Tracing_Latencies.html
and
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/ftrace.txt;h=d330fe3103da9c9a3cb8f888ac7255ce48e666d4;hb=45e9c0de2e86485f8b6633fd64ab19cfbff167f6
.
Thanks to those I started up the debugfs filesystem and went to the
trace directories but the on tracers in available_tracers are
ftrace sched_switch none
I can't see anything in the code that would disable wakeup... Any ideas
on what might be wrong? I'm using a 2.6.27rc6 kernel.
Additionally I think I found a trigger - unplugging the power cable from
the EeePC and having it run on battery seems to then set off this
periodic stall every 30 seconds... There's no CPU frequency scaling
enabled either (Celeron M's seemingly don't have P states and support
for cpufreq is configured out).
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-17 21:54 ` Arjan van de Ven
2008-09-17 22:29 ` Sitsofe Wheeler
@ 2008-09-18 2:42 ` Peter Zijlstra
2008-09-18 18:25 ` Sitsofe Wheeler
1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2008-09-18 2:42 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Sitsofe Wheeler, linux-kernel, Ingo Molnar
On Wed, 2008-09-17 at 14:54 -0700, Arjan van de Ven wrote:
> On Wed, 17 Sep 2008 22:48:55 +0100
> Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
>
> > Arjan van de Ven wrote:
> > > this says you haven't done "make install" on the latencytop
> > > directory so it's not translating things for you.. can you do that
> > > please?
> >
> > > Cause Maximum
> > >
> Percentage
>
> Scheduler: waiting for cpu 208 msec 59.4 %
>
>
> you're rather CPU bound, and your process was woken up but didn't run for over 200 milliseconds..
> that sounds like a scheduler fairness issue!
Really hard subject. Perfect fairness requires 0 latency - which with a
CPU only being able to run one thing at a time is impossible. So what
latency ends up being is a measure for the convergence towards fairness.
Anyway - 200ms isn't too weird depending on the circumstances. We start
out with a 20ms latency for UP, we then multiply with 1+log2(nr_cpus)
which in say a quad core machine ends up with 60ms. That ought to mean
that under light load the max latency should not exceed twice that
(basically a consequence of the Nyquist-Shannon sampling theorem IIRC).
Now, if you get get under some load (by default: nr_running > 5) the
expected latency starts to linearly grow with nr_running.
>From what I gather from the reply to this email the machine was not
doing much (and after having looked up the original email I see its a
eeeeeeeee atom - which is dual cpu iirc, so that yields 40ms default) -
so 200 is definately on the high side.
What you can do to investigate this, is use the sched_wakeup tracer from
ftrace, that should give a function trace of the highest wakeup latency
showing what the kernel is doing.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-17 21:54 ` Arjan van de Ven
@ 2008-09-17 22:29 ` Sitsofe Wheeler
2008-09-18 2:42 ` Peter Zijlstra
1 sibling, 0 replies; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-17 22:29 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: linux-kernel, Ingo Molnar, peterz
Arjan van de Ven wrote:
> On Wed, 17 Sep 2008 22:48:55 +0100
> Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
>
>> Arjan van de Ven wrote:
>>> this says you haven't done "make install" on the latencytop
>>> directory so it's not translating things for you.. can you do that
>>> please?
>>> Cause Maximum
>>>
> Percentage
>
> Scheduler: waiting for cpu 208 msec 59.4 %
>
>
> you're rather CPU bound, and your process was woken up but didn't run for over 200 milliseconds..
> that sounds like a scheduler fairness issue!
I'm at a bit of a loss to explain this. I'm not using the group
scheduler and the load average is less than 0.4. CPU usage does seem to
spike a fair bit (I'm not sure why rhythmbox needs 50%+ for decoding
oggs from time to time) but there always seems to be 20% CPU free...
I don't know where to start debugging this one and I'm suspicious of the
way the problem would happen every 30 seconds too... (this laptop is
using ath5k for its wifi)
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-17 21:48 ` How how latent should non-preemptive scheduling be? Sitsofe Wheeler
@ 2008-09-17 21:54 ` Arjan van de Ven
2008-09-17 22:29 ` Sitsofe Wheeler
2008-09-18 2:42 ` Peter Zijlstra
0 siblings, 2 replies; 28+ messages in thread
From: Arjan van de Ven @ 2008-09-17 21:54 UTC (permalink / raw)
To: Sitsofe Wheeler; +Cc: linux-kernel, Ingo Molnar, peterz
On Wed, 17 Sep 2008 22:48:55 +0100
Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> Arjan van de Ven wrote:
> > this says you haven't done "make install" on the latencytop
> > directory so it's not translating things for you.. can you do that
> > please?
>
> > Cause Maximum
> >
Percentage
Scheduler: waiting for cpu 208 msec 59.4 %
you're rather CPU bound, and your process was woken up but didn't run for over 200 milliseconds..
that sounds like a scheduler fairness issue!
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
[not found] ` <fa.808p0ZtU9DCpeky4KfNS8Drdw9w@ifi.uio.no>
@ 2008-09-17 21:48 ` Sitsofe Wheeler
2008-09-17 21:54 ` Arjan van de Ven
0 siblings, 1 reply; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-17 21:48 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: linux-kernel, Ingo Molnar
Arjan van de Ven wrote:
> this says you haven't done "make install" on the latencytop directory
> so it's not translating things for you.. can you do that please?
> Cause Maximum Percentage
> Scheduler: waiting for cpu 208.3 msec 59.4 %
> Userspace lock contention 4.9 msec 29.7 %
> Waiting for event (poll) 4.8 msec 7.1 %
> Waiting for event (select) 4.5 msec 0.6 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_re 1.9 msec 2.0 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu 1.9 msec 0.7 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu 1.9 msec 0.6 %
> Executing raw SCSI command 0.7 msec 0.0 %
> ex_resolve_to_value
>
>
>
>
>
> Process firefox-bin (5756) Total: 317.2 msec
> Scheduler: waiting for cpu 208.3 msec 96.4 %
> Waiting for event (poll) 3.7 msec 3.0 %
> Userspace lock contention 1.0 msec 0.7 %
>
>
>
>
>
>
>
>
> pdflush gnome-vfs-daemo bash top bash pulseaudio gconf-helper bash rhythmbox firefox run-mozilla.sh firefox-bin
Better?
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-17 21:18 Sitsofe Wheeler
2008-09-17 21:28 ` Sitsofe Wheeler
@ 2008-09-17 21:34 ` Arjan van de Ven
2008-09-21 20:56 ` Matt Keenan
2 siblings, 0 replies; 28+ messages in thread
From: Arjan van de Ven @ 2008-09-17 21:34 UTC (permalink / raw)
To: Sitsofe Wheeler; +Cc: linux-kernel, Ingo Molnar
On Wed, 17 Sep 2008 22:18:18 +0100
Sitsofe Wheeler <sitsofe@yahoo.com> wrote:
> Hi,
>
> I have an EeePC 900 (Intel Celeron 900Mhz) and it seems to be
> skipping while playing sound through various desktop apps with a
> 2.6.27rc6 kernel. It is running off an SD card which really shows up
> slow writes but the sound is seemingly skipping even when ext3 is not
> being used. When look at latencytop I often see results similar to
> this:
>
> > Cause Maximum
> > Percentage Scheduler: waiting for cpu 247.2
> > msec 64.6 % futex_wait do_futex sys_futex
> > sysenter_do_call 5.0 msec 24.8 % do_select
> > core_sys_select sys_select sysenter_do_c 4.7 msec 0.7 %
> > do_sys_poll sys_poll sysenter_do_call 4.5
this says you haven't done "make install" on the latencytop directory
so it's not translating things for you.. can you do that please?
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: How how latent should non-preemptive scheduling be?
2008-09-17 21:18 Sitsofe Wheeler
@ 2008-09-17 21:28 ` Sitsofe Wheeler
2008-09-17 21:34 ` Arjan van de Ven
2008-09-21 20:56 ` Matt Keenan
2 siblings, 0 replies; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-17 21:28 UTC (permalink / raw)
To: linux-kernel; +Cc: Ingo Molnar, Arjan van de Ven
Sitsofe Wheeler wrote:
> I have an EeePC 900 (Intel Celeron 900Mhz) and it seems to be skipping
> while playing sound through various desktop apps with a 2.6.27rc6
> kernel. It is running off an SD card which really shows up slow writes
> but the sound is seemingly skipping even when ext3 is not being used.
> When look at latencytop I often see results similar to this:
One other small datapoint. When the skipping starts to occur regularly
it seems to be exactly 30 seconds apart. ALSA doesn't report any xruns
either...
^ permalink raw reply [flat|nested] 28+ messages in thread
* How how latent should non-preemptive scheduling be?
@ 2008-09-17 21:18 Sitsofe Wheeler
2008-09-17 21:28 ` Sitsofe Wheeler
` (2 more replies)
0 siblings, 3 replies; 28+ messages in thread
From: Sitsofe Wheeler @ 2008-09-17 21:18 UTC (permalink / raw)
To: linux-kernel; +Cc: Ingo Molnar
Hi,
I have an EeePC 900 (Intel Celeron 900Mhz) and it seems to be skipping
while playing sound through various desktop apps with a 2.6.27rc6
kernel. It is running off an SD card which really shows up slow writes
but the sound is seemingly skipping even when ext3 is not being used.
When look at latencytop I often see results similar to this:
> Cause Maximum Percentage
> Scheduler: waiting for cpu 247.2 msec 64.6 %
> futex_wait do_futex sys_futex sysenter_do_call 5.0 msec 24.8 %
> do_select core_sys_select sys_select sysenter_do_c 4.7 msec 0.7 %
> do_sys_poll sys_poll sysenter_do_call 4.5 msec 5.6 %
> do_sys_poll sys_ppoll sysenter_do_call 4.2 msec 1.7 %
> rt_mutex_timed_lock futex_lock_pi do_futex sys_fut 3.5 msec 0.5 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_re 1.9 msec 1.3 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu 1.9 msec 0.4 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu 1.9 msec 0.5 %
> on acpi_ex_field_datum_io acpi_ex_extract_from_field acpi_ex_read_data_from_field acpi_ex_resolve_node_to_value acpi_ex_resolve_to_va
> lue
>
>
>
> Process rhythmbox (5694) Total: 358.4 msec
> Scheduler: waiting for cpu 221.8 msec 69.9 %
> futex_wait do_futex sys_futex sysenter_do_call 4.5 msec 26.3 %
> do_sys_poll sys_poll sysenter_do_call 3.8 msec 3.2 %
> do_select core_sys_select sys_select sysenter_do_c 1.0 msec 0.4 %
> rt_mutex_timed_lock futex_lock_pi do_futex sys_fut 0.2 msec 0.1 %
> Cause Maximum Percentage
> Scheduler: waiting for cpu 234.8 msec 31.1 %
> futex_wait do_futex sys_futex sysenter_do_call 5.0 msec 34.4 %
> do_select core_sys_select sys_select sysenter_do_c 5.0 msec 10.3 %
> do_sys_poll sys_ppoll sysenter_do_call 5.0 msec 6.2 %
> do_sys_poll sys_poll sysenter_do_call 5.0 msec 16.4 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_re 1.9 msec 0.9 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu 1.9 msec 0.3 %
> msleep acpi_ec_wait acpi_ec_transaction acpi_ec_bu 1.9 msec 0.3 %
> blk_execute_rq scsi_execute scsi_execute_req scsi_ 1.0 msec 0.0 %
>
>
> Process firefox-bin (5756) Total: 482.2 msec
> Scheduler: waiting for cpu 234.8 msec 95.5 %
> do_sys_poll sys_poll sysenter_do_call 4.1 msec 3.5 %
> futex_wait do_futex sys_futex sysenter_do_call 3.9 msec 1.0 %
> do_select core_sys_select sys_select sysenter_do_c 0.2 msec 0.0 %
top seems to show that there is CPU time still available. The kernel is
compiled with voluntary preemption, dynticks and a HZ of 1000.
Is the scheduler waiting for CPU anything to do with the skips? If so
what sort of maximum should I be expecting? If you are listing to music
is it expected that you have preemption on? Would preemption even help?
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2008-10-04 10:50 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <fa.ERZTl/6uH+mhNoef5fPJKTRjJag@ifi.uio.no>
[not found] ` <fa.PtPFzP5kIJVCCov6YCewrh+o4z4@ifi.uio.no>
[not found] ` <fa.C6WSm5Rh2Nb+Qho7b0qDOZ9RPV8@ifi.uio.no>
[not found] ` <fa.ch6j4qXs/2sFpLkHz5fXrtjTR8c@ifi.uio.no>
[not found] ` <fa.Jx/Ygtm46CVRawlA6OnfYNn6cN0@ifi.uio.no>
2008-09-18 7:26 ` How how latent should non-preemptive scheduling be? Sitsofe Wheeler
[not found] ` <fa.iIHgL48F3T5VAqFw3mqaf9Pzrs4@ifi.uio.no>
[not found] ` <fa.Td8xkKZKMSMghlJmEYefTRVF2kc@ifi.uio.no>
2008-09-19 11:54 ` Sitsofe Wheeler
2008-09-19 14:20 ` Peter Zijlstra
2008-09-22 11:57 ` Ingo Molnar
2008-09-22 12:07 ` Steven Rostedt
2008-09-23 6:33 ` Sitsofe Wheeler
2008-09-23 11:53 ` Ingo Molnar
2008-09-23 16:30 ` Sitsofe Wheeler
2008-09-23 19:39 ` Sitsofe Wheeler
2008-09-23 22:01 ` Sitsofe Wheeler
2008-09-27 20:48 ` Ingo Molnar
2008-09-28 20:56 ` Sitsofe Wheeler
2008-09-29 8:37 ` Ingo Molnar
2008-09-29 23:11 ` Sitsofe Wheeler
2008-09-30 11:22 ` Ingo Molnar
2008-09-30 13:18 ` Sitsofe Wheeler
2008-10-04 10:50 ` Reading EeePC900 battery info causes stalls when using SLUB (was Re: How how latent should non-preemptive scheduling be?) Sitsofe Wheeler
[not found] <fa.vMKgvqjqmYnI2J40GHoTENeYm8U@ifi.uio.no>
[not found] ` <fa.808p0ZtU9DCpeky4KfNS8Drdw9w@ifi.uio.no>
2008-09-17 21:48 ` How how latent should non-preemptive scheduling be? Sitsofe Wheeler
2008-09-17 21:54 ` Arjan van de Ven
2008-09-17 22:29 ` Sitsofe Wheeler
2008-09-18 2:42 ` Peter Zijlstra
2008-09-18 18:25 ` Sitsofe Wheeler
2008-09-19 8:44 ` Ingo Molnar
2008-09-17 21:18 Sitsofe Wheeler
2008-09-17 21:28 ` Sitsofe Wheeler
2008-09-17 21:34 ` Arjan van de Ven
2008-09-21 20:56 ` Matt Keenan
2008-09-22 6:50 ` Sitsofe Wheeler
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).