LKML Archive on lore.kernel.org
* [patch] restore sched_exec load balance heuristics
From: Ken Chen @ 2008-11-06 19:40 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Linux Kernel Mailing List

We've seen a long-standing performance regression in sys_execve on several
upstream kernels, mostly on workloads that call execve heavily.  The main
cause is a change in the sched_exec load balance heuristics.  For example,
on a 2.6.11 kernel, the exec'ing task keeps running on the same CPU if it
is the only task running there.  However, 2.6.13 and later kernels walk
the sched-domain looking for the most idle CPU, and that search does not
treat the CPU of the exec'ing task as idle.  The exec'ing task therefore
bounces all over the place, which leads to poor CPU cache and NUMA
locality.  (The workload happens to share common data between successive
exec'ed programs.)

This execve heuristic was removed from the upstream kernel by this git commit:

commit 68767a0ae428801649d510d9a65bb71feed44dd1
Author: Nick Piggin <nickpiggin@yahoo.com.au>
Date:   Sat Jun 25 14:57:20 2005 -0700

[PATCH] sched: schedstats update for balance on fork
Add SCHEDSTAT statistics for sched-balance-fork.

From the commit description, it appears that deleting the heuristics
was an accident, as the commit is supposedly just for schedstats.

So, restore the sched-exec load balancing heuristic: skip balancing if
the exec'ing task is the only task running on its CPU.  The logic makes
sense: a newly exec'ed program should continue to run on the current CPU,
since moving it to another idle CPU neither changes any load imbalance
nor helps anything.  Staying on the same CPU preserves cache and NUMA
locality.

Signed-off-by: Ken Chen <kenchen@google.com>

diff --git a/kernel/sched.c b/kernel/sched.c
index e8819bc..4ad1907 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2873,7 +2873,12 @@ out:
  */
 void sched_exec(void)
 {
-	int new_cpu, this_cpu = get_cpu();
+	int new_cpu, this_cpu;
+
+	if (this_rq()->nr_running <= 1)
+		return;
+
+	this_cpu = get_cpu();
 	new_cpu = sched_balance_self(this_cpu, SD_BALANCE_EXEC);
 	put_cpu();
 	if (new_cpu != this_cpu)
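
The effect of the early return can be illustrated with a small userspace
model.  This is only a sketch, not kernel code: struct rq_model,
pick_least_loaded() and exec_target_cpu() are hypothetical names, and
pick_least_loaded() is a deliberately crude stand-in for
sched_balance_self(), which in the kernel walks the sched-domains and
weighs load rather than just comparing nr_running.

```c
/*
 * Userspace model of the exec-time placement decision.
 * All names here are hypothetical; nothing below is kernel API.
 */

/* Stand-in for a per-CPU runqueue; only the field the patch tests. */
struct rq_model {
	unsigned long nr_running;	/* runnable tasks, current one included */
};

/*
 * Crude stand-in for sched_balance_self(): pick the CPU with the fewest
 * runnable tasks.  Ties go to the lowest-numbered CPU.
 */
int pick_least_loaded(const struct rq_model *rqs, int ncpus)
{
	int cpu, best = 0;

	for (cpu = 1; cpu < ncpus; cpu++)
		if (rqs[cpu].nr_running < rqs[best].nr_running)
			best = cpu;
	return best;
}

/* Return the CPU the exec'ing task should run on after exec. */
int exec_target_cpu(const struct rq_model *rqs, int ncpus, int this_cpu)
{
	/*
	 * Restored heuristic: the exec'ing task is the only runnable task
	 * on this CPU, so moving it would not fix any imbalance.
	 */
	if (rqs[this_cpu].nr_running <= 1)
		return this_cpu;

	return pick_least_loaded(rqs, ncpus);
}
```

With runqueue lengths { 0, 0, 1, 3 }, exec_target_cpu() keeps a task that
is alone on CPU 2 where it is, whereas calling pick_least_loaded()
unconditionally would move it to CPU 0, because CPU 2 counts the exec'ing
task itself and so never looks idle.  A task exec'ing on the busy CPU 3
is still sent to CPU 0 in both cases.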


Thread overview: 8+ messages
2008-11-06 19:40 [patch] restore sched_exec load balance heuristics Ken Chen
2008-11-06 20:07 ` Ingo Molnar
2008-11-06 20:32   ` Ken Chen
2008-11-06 20:38     ` Ingo Molnar
2008-11-06 20:49     ` Chris Friesen
2008-11-10  8:50   ` Peter Zijlstra
2008-11-10  9:29     ` Ingo Molnar
2008-11-10 12:54       ` Peter Zijlstra
