LKML Archive on lore.kernel.org
* Regression in latest sched-git
@ 2008-02-12 18:53 Dhaval Giani
2008-02-12 19:40 ` Peter Zijlstra
2008-02-14 11:20 ` Peter Zijlstra
0 siblings, 2 replies; 9+ messages in thread
From: Dhaval Giani @ 2008-02-12 18:53 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Srivatsa Vaddagiri, Peter Zijlstra, lkml
Hi Ingo,
I've been running the latest sched-git through some tests. Here is
essentially what I am doing,
1. Mount the control group
2. Create 3-4 groups
3. Start kernbench inside each group
4. Run cpu hogs in each group
Essentially the idea is to see how the system responds under extreme CPU
load.
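Roughly as a script, the setup looks like the following (the mount point,
group names and kernel-tree path are placeholders here, not my exact setup):

  mount -t cgroup -o cpu none /dev/cgroup

  for g in grp1 grp2 grp3 grp4; do
          mkdir /dev/cgroup/$g
          # start a shell inside the group, then run kernbench and a cpu hog there
          sh -c "echo \$\$ > /dev/cgroup/$g/tasks
                 cd /path/to/linux-2.6 && kernbench &
                 while :; do :; done" &
  done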
This is what I get (and this is in a shell which belongs to the root
group)
[root@llm11 ~]# time sleep 1
real 0m1.212s
user 0m0.004s
sys 0m0.000s
[root@llm11 ~]# time sleep 1
real 0m1.200s
user 0m0.000s
sys 0m0.004s
[root@llm11 ~]# time sleep 1
real 0m1.266s
user 0m0.000s
sys 0m0.000s
[root@llm11 ~]# time sleep 1
real 0m1.113s
user 0m0.000s
sys 0m0.000s
[root@llm11 ~]#
On the sched-devel tree that I have, the same gives me the following
results.
[root@llm11 ~]# time sleep 1
real 0m1.057s
user 0m0.000s
sys 0m0.004s
[root@llm11 ~]# time sleep 1
real 0m1.038s
user 0m0.000s
sys 0m0.004s
[root@llm11 ~]# time sleep 1
real 0m1.075s
user 0m0.000s
sys 0m0.000s
[root@llm11 ~]# time sleep 1
real 0m1.071s
user 0m0.000s
sys 0m0.000s
[root@llm11 ~]# time sleep 1
real 0m1.073s
user 0m0.000s
sys 0m0.004s
[root@llm11 ~]# time sleep 1
real 0m1.055s
user 0m0.000s
sys 0m0.004s
I agree this is not a great test. It's getting a bit late here. I
will put together some better test cases tomorrow morning (and if you have
some, I can try those as well). I just did not want the tree to get merged
in without further discussion.
--
regards,
Dhaval
* Re: Regression in latest sched-git
2008-02-12 18:53 Regression in latest sched-git Dhaval Giani
@ 2008-02-12 19:40 ` Peter Zijlstra
2008-02-13 3:00 ` Srivatsa Vaddagiri
2008-02-14 11:20 ` Peter Zijlstra
1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2008-02-12 19:40 UTC (permalink / raw)
To: Dhaval Giani; +Cc: Ingo Molnar, Srivatsa Vaddagiri, lkml
On Wed, 2008-02-13 at 00:23 +0530, Dhaval Giani wrote:
> Hi Ingo,
>
> I've been running the latest sched-git through some tests. Here is
> essentially what I am doing,
>
> 1. Mount the control group
> 2. Create 3-4 groups
> 3. Start kernbench inside each group
> 4. Run cpu hogs in each group
>
> Essentially the idea is to see how the system responds under extreme CPU
> load.
> This is what I get (and this is in a shell which belongs to the root
> group)
> [root@llm11 ~]# time sleep 1
>
> real 0m1.212s
> user 0m0.004s
> sys 0m0.000s
> On the sched-devel tree that I have, the same gives me the following
> results.
>
> [root@llm11 ~]# time sleep 1
>
> real 0m1.057s
> user 0m0.000s
> sys 0m0.004s
Yes, latency isolation is the one thing I had to sacrifice in order to
get the normal latencies under control.
The problem with the old code is that under light load: a kernel make
-j2 as root, under an otherwise idle X session, generates latencies up
to 120ms on my UP laptop. (uid grouping; two active users: peter, root).
Others have reported latencies up to 300ms, and Ingo found a 700ms
latency on his machine.
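(Concretely, 'light load' means nothing more exotic than something like

  # as root, on an otherwise idle X session, UP box, uid grouping
  cd /path/to/linux-2.6 && make -j2

while I keep using the desktop as user peter; the kernel-tree path above is
just a placeholder.)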
The source for this problem is I think the vruntime driven wakeup
preemption (but I'm not quite sure). The other things that rely on
global vruntime are sleeper fairness and yield. Now while I can't
possibly care less about yield, the loss of sleeper fairness is somewhat
sad (NB. turning it off with the old group scheduling does improve life
somewhat).
So my first attempt at getting a global vruntime was flattening the
whole RQ structure, you can see that patch in sched.git (I really ought
to have posted that, will do so tomorrow).
With the experience gained from doing that, I think it might be possible
to construct a hierarchical RQ model that has synced vruntime; but
thinking about that still makes my head hurt.
Anyway, yes, it's not ideal, but it does the more common case of light
load much better - I basically had to tell people to disable
CONFIG_FAIR_GROUP_SCHED in order to use their computer, which is sad,
because it's the default and we want it to be the default in the cgroup
future.
So yes, I share your concern, let's work on this together.
* Re: Regression in latest sched-git
2008-02-12 19:40 ` Peter Zijlstra
@ 2008-02-13 3:00 ` Srivatsa Vaddagiri
2008-02-13 12:51 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Srivatsa Vaddagiri @ 2008-02-13 3:00 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Dhaval Giani, Ingo Molnar, lkml
On Tue, Feb 12, 2008 at 08:40:08PM +0100, Peter Zijlstra wrote:
> Yes, latency isolation is the one thing I had to sacrifice in order to
> get the normal latencies under control.
Hi Peter,
I don't have an easy solution in mind either to meet both fairness
and latency goals in an acceptable way.
But I am puzzled at the max latency numbers you have provided below:
> The problem with the old code is that under light load: a kernel make
> -j2 as root, under an otherwise idle X session, generates latencies up
> to 120ms on my UP laptop. (uid grouping; two active users: peter, root).
If it was just two active users, then max latency should be:
latency to schedule user entity (~10ms?) +
latency to schedule task within that user
20-30 ms seems a more reasonable max latency to expect in this scenario.
120ms seems abnormal, unless the user had a large number of tasks.
On the same lines, I can't understand how we can be seeing 700ms latency
(below) unless we had a large number of active groups/users and a large number of
tasks within each group/user.
> Others have reported latencies up to 300ms, and Ingo found a 700ms
> latency on his machine.
>
> The source for this problem is I think the vruntime driven wakeup
> preemption (but I'm not quite sure). The other things that rely on
> global vruntime are sleeper fairness and yield. Now while I can't
> possibly care less about yield, the loss of sleeper fairness is somewhat
> sad (NB. turning it off with the old group scheduling does improve life
> somewhat).
>
> So my first attempt at getting a global vruntime was flattening the
> whole RQ structure, you can see that patch in sched.git (I really ought
> to have posted that, will do so tomorrow).
We will do some exhaustive testing with this approach. My main concern
with this is that it may compromise the level of isolation between two
groups (imagine one group does a fork-bomb and how it would affect
fairness for other groups).
> With the experience gained from doing that, I think it might be possible
> to construct a hierarchical RQ model that has synced vruntime; but
> thinking about that still makes my head hurt.
>
> Anyway, yes, it's not ideal, but it does the more common case of light
> load much better - I basically had to tell people to disable
> CONFIG_FAIR_GROUP_SCHED in order to use their computer, which is sad,
> because it's the default and we want it to be the default in the cgroup
> future.
>
> So yes, I share your concern, let's work on this together.
--
Regards,
vatsa
* Re: Regression in latest sched-git
2008-02-13 3:00 ` Srivatsa Vaddagiri
@ 2008-02-13 12:51 ` Peter Zijlstra
2008-02-13 16:34 ` Dhaval Giani
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2008-02-13 12:51 UTC (permalink / raw)
To: vatsa; +Cc: Dhaval Giani, Ingo Molnar, lkml
On Wed, 2008-02-13 at 08:30 +0530, Srivatsa Vaddagiri wrote:
> On Tue, Feb 12, 2008 at 08:40:08PM +0100, Peter Zijlstra wrote:
> > Yes, latency isolation is the one thing I had to sacrifice in order to
> > get the normal latencies under control.
>
> Hi Peter,
> I don't have an easy solution in mind either to meet both fairness
> and latency goals in an acceptable way.
Ah, do be careful with 'fairness' here. The single RQ is fair wrt cpu
time, just not quite as 'fair' wrt latency.
> But I am puzzled at the max latency numbers you have provided below:
>
> > The problem with the old code is that under light load: a kernel make
> > -j2 as root, under an otherwise idle X session, generates latencies up
> > to 120ms on my UP laptop. (uid grouping; two active users: peter, root).
>
> If it was just two active users, then max latency should be:
>
> latency to schedule user entity (~10ms?) +
> latency to schedule task within that user
>
> 20-30 ms seems a more reasonable max latency to expect in this scenario.
> 120ms seems abnormal, unless the user had a large number of tasks.
>
> On the same lines, I can't understand how we can be seeing 700ms latency
> (below) unless we had a large number of active groups/users and a large number of
> tasks within each group/user.
All I can say is that it's trivial to reproduce these horrid latencies.
As for Ingo's setup, the worst that he does is run distcc with (32?)
instances on that machine - and I assume he has that user niced waay
down.
> > Others have reported latencies up to 300ms, and Ingo found a 700ms
> > latency on his machine.
> >
> > The source for this problem is I think the vruntime driven wakeup
> > preemption (but I'm not quite sure). The other things that rely on
> > global vruntime are sleeper fairness and yield. Now while I can't
> > possibly care less about yield, the loss of sleeper fairness is somewhat
> > sad (NB. turning it off with the old group scheduling does improve life
> > somewhat).
> >
> > So my first attempt at getting a global vruntime was flattening the
> > whole RQ structure, you can see that patch in sched.git (I really ought
> > to have posted that, will do so tomorrow).
>
> We will do some exhaustive testing with this approach. My main concern
> with this is that it may compromise the level of isolation between two
> groups (imagine one group does a fork-bomb and how it would affect
> fairness for other groups).
Again, be careful with the fairness issue. CPU time should still be
fair, but yes, other groups might experience some latencies.
* Re: Regression in latest sched-git
2008-02-13 12:51 ` Peter Zijlstra
@ 2008-02-13 16:34 ` Dhaval Giani
2008-02-13 16:37 ` Dhaval Giani
2008-02-13 17:06 ` Srivatsa Vaddagiri
0 siblings, 2 replies; 9+ messages in thread
From: Dhaval Giani @ 2008-02-13 16:34 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: vatsa, Ingo Molnar, lkml
On Wed, Feb 13, 2008 at 01:51:18PM +0100, Peter Zijlstra wrote:
>
> On Wed, 2008-02-13 at 08:30 +0530, Srivatsa Vaddagiri wrote:
> > On Tue, Feb 12, 2008 at 08:40:08PM +0100, Peter Zijlstra wrote:
> > > Yes, latency isolation is the one thing I had to sacrifice in order to
> > > get the normal latencies under control.
> >
> > Hi Peter,
> > I don't have an easy solution in mind either to meet both fairness
> > and latency goals in an acceptable way.
>
> Ah, do be careful with 'fairness' here. The single RQ is fair wrt cpu
> time, just not quite as 'fair' wrt latency.
>
> > But I am puzzled at the max latency numbers you have provided below:
> >
> > > The problem with the old code is that under light load: a kernel make
> > > -j2 as root, under an otherwise idle X session, generates latencies up
> > > to 120ms on my UP laptop. (uid grouping; two active users: peter, root).
> >
> > If it was just two active users, then max latency should be:
> >
> > latency to schedule user entity (~10ms?) +
> > latency to schedule task within that user
> >
> > 20-30 ms seems a more reasonable max latency to expect in this scenario.
> > 120ms seems abnormal, unless the user had a large number of tasks.
> >
> > On the same lines, I can't understand how we can be seeing 700ms latency
> > (below) unless we had a large number of active groups/users and a large number of
> > tasks within each group/user.
>
> All I can say is that it's trivial to reproduce these horrid latencies.
>
Hi Peter,
I've been trying to reproduce the latencies, and the worst I have
managed is only 80ms. On average I am getting around 60ms. This is with
a make -j4 as root, and dhaval running other programs (with maxcpus=1).
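For example, one crude way to sample the worst-case sleep overshoot from a
shell, along the lines of the "time sleep 1" runs earlier (just a sketch,
assuming GNU time is installed), would be:

  for i in $(seq 1 50); do
          /usr/bin/time -f "%e" sleep 0.1 2>&1    # elapsed seconds (on stderr)
  done | awk '{ d = ($1 - 0.1) * 1000; if (d > max) max = d }
              END { printf "max overshoot: %.0f ms\n", max }'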
> As for Ingo's setup, the worst that he does is run distcc with (32?)
> instances on that machine - and I assume he has that user niced waay
> down.
>
> > > Others have reported latencies up to 300ms, and Ingo found a 700ms
> > > latency on his machine.
> > >
> > > The source for this problem is I think the vruntime driven wakeup
> > > preemption (but I'm not quite sure). The other things that rely on
> > > global vruntime are sleeper fairness and yield. Now while I can't
> > > possibly care less about yield, the loss of sleeper fairness is somewhat
> > > sad (NB. turning it off with the old group scheduling does improve life
> > > somewhat).
> > >
> > > So my first attempt at getting a global vruntime was flattening the
> > > whole RQ structure, you can see that patch in sched.git (I really ought
> > > to have posted that, will do so tomorrow).
> >
> > We will do some exhaustive testing with this approach. My main concern
> > with this is that it may compromise the level of isolation between two
> > groups (imagine one group does a fork-bomb and how it would affect
> > fairness for other groups).
>
> Again, be careful with the fairness issue. CPU time should still be
> fair, but yes, other groups might experience some latencies.
>
I know I am missing something, but aren't we trying to reduce latencies
here?
--
regards,
Dhaval
* Re: Regression in latest sched-git
2008-02-13 16:34 ` Dhaval Giani
@ 2008-02-13 16:37 ` Dhaval Giani
2008-02-13 16:43 ` Peter Zijlstra
2008-02-13 17:06 ` Srivatsa Vaddagiri
1 sibling, 1 reply; 9+ messages in thread
From: Dhaval Giani @ 2008-02-13 16:37 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: vatsa, Ingo Molnar, lkml
On Wed, Feb 13, 2008 at 10:04:44PM +0530, Dhaval Giani wrote:
> > > On the same lines, I can't understand how we can be seeing 700ms latency
> > > (below) unless we had a large number of active groups/users and a large number of
> > > tasks within each group/user.
> >
> > All I can say is that it's trivial to reproduce these horrid latencies.
> >
>
> Hi Peter,
>
> I've been trying to reproduce the latencies, and the worst I have
> managed is only 80ms. On average I am getting around 60ms. This is with
> a make -j4 as root, and dhaval running other programs (with maxcpus=1).
>
I'm totally missing it here. Any more hints on how to reproduce?
--
regards,
Dhaval
* Re: Regression in latest sched-git
2008-02-13 16:37 ` Dhaval Giani
@ 2008-02-13 16:43 ` Peter Zijlstra
0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2008-02-13 16:43 UTC (permalink / raw)
To: Dhaval Giani; +Cc: vatsa, Ingo Molnar, lkml
On Wed, 2008-02-13 at 22:07 +0530, Dhaval Giani wrote:
> On Wed, Feb 13, 2008 at 10:04:44PM +0530, Dhaval Giani wrote:
> > > > On the same lines, I can't understand how we can be seeing 700ms latency
> > > > (below) unless we had a large number of active groups/users and a large number of
> > > > tasks within each group/user.
> > >
> > > All I can say is that it's trivial to reproduce these horrid latencies.
> > >
> >
> > Hi Peter,
> >
> > I've been trying to reproduce the latencies, and the worst I have
> > managed is only 80ms. On average I am getting around 60ms. This is with
> > a make -j4 as root, and dhaval running other programs (with maxcpus=1).
> >
>
> I'm totally missing it here. Any more hints on how to reproduce?
Not really, this is the recipe I took from Lukas Hejtmanek's report and
it worked for me.
I'll see if I can find some time to try the ftrace patches to narrow
this down.
* Re: Regression in latest sched-git
2008-02-13 16:34 ` Dhaval Giani
2008-02-13 16:37 ` Dhaval Giani
@ 2008-02-13 17:06 ` Srivatsa Vaddagiri
1 sibling, 0 replies; 9+ messages in thread
From: Srivatsa Vaddagiri @ 2008-02-13 17:06 UTC (permalink / raw)
To: Dhaval Giani; +Cc: Peter Zijlstra, Ingo Molnar, lkml
On Wed, Feb 13, 2008 at 10:04:44PM +0530, Dhaval Giani wrote:
> I know I am missing something, but aren't we trying to reduce latencies
> here?
I guess Peter is referring to the latency in seeing fairness results. In
other words, with the single-rq approach, you may require more time for the groups
to converge on fairness.
--
Regards,
vatsa
* Re: Regression in latest sched-git
2008-02-12 18:53 Regression in latest sched-git Dhaval Giani
2008-02-12 19:40 ` Peter Zijlstra
@ 2008-02-14 11:20 ` Peter Zijlstra
1 sibling, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2008-02-14 11:20 UTC (permalink / raw)
To: Dhaval Giani; +Cc: Ingo Molnar, Srivatsa Vaddagiri, lkml, Mike Galbraith
Hi Dhaval,
How does this patch (on top of today's sched-devel.git) work for you?
It keeps my laptop nice and spiffy when I run
let i=0; while [ $i -lt 100 ]; do let i+=1; while :; do :; done & done
under a third user (nobody). This generates huge latencies for the nobody
user (up to 1.6s), but root and peter don't seem to get above 40ms.
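(Since nobody's login shell is typically nologin, one way to kick that loop
off, for example, is via su with an explicit shell:

  su -s /bin/bash nobody -c \
    'let i=0; while [ $i -lt 100 ]; do let i+=1; while :; do :; done & done; wait'

though any way of getting 100 busy loops running under a third uid will do.)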
---
include/linux/sched.h | 1 +
kernel/sched_fair.c | 6 +++++-
2 files changed, 6 insertions(+), 1 deletion(-)
Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -925,6 +925,7 @@ struct sched_entity {
u64 exec_start;
u64 sum_exec_runtime;
u64 vruntime;
+ u64 vperiod;
u64 prev_sum_exec_runtime;
#ifdef CONFIG_SCHEDSTATS
Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -220,9 +220,11 @@ static inline u64 min_vruntime(u64 min_v
static inline s64 entity_key(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
- return se->vruntime - cfs_rq->min_vruntime;
+ return se->vruntime + se->vperiod - cfs_rq->min_vruntime;
}
+static u64 sched_vslice_add(struct cfs_rq *cfs_rq, struct sched_entity *se);
+
/*
* Enqueue an entity into the rb-tree:
*/
@@ -240,6 +242,8 @@ static void __enqueue_entity(struct cfs_
if (se == cfs_rq->curr)
return;
+ se->vperiod = sched_vslice_add(cfs_rq, se);
+
cfs_rq = &rq_of(cfs_rq)->cfs;
link = &cfs_rq->tasks_timeline.rb_node;