LKML Archive on lore.kernel.org
* [PATCH] sched: rsdl improvements
@ 2007-03-21 17:29 Con Kolivas
2007-03-21 23:27 ` Artur Skawina
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Con Kolivas @ 2007-03-21 17:29 UTC (permalink / raw)
To: linux list, ck list, Ingo Molnar, Andrew Morton
Hi all
As my time at the pc is limited I unfortunately cannot spend it responding to
the huge number of emails I got in response to RSDL. Instead, here's a patch.
I may be offline for extended periods at a time still so please others feel
free to poke at the code and don't take it personally if I don't respond to
your emails.
Note no interactive boost idea here.
Patch is for 2.6.21-rc4-mm1. I have not spent the time trying to bring other
bases in sync.
---
Further improve the deterministic nature of the RSDL cpu scheduler and make
the rr_interval tunable.
By only giving out priority slots to tasks at the current runqueue's
prio_level or below we can make the cpu allocation not altered by accounting
issues across major_rotation periods. This makes the cpu allocation and
latencies more deterministic, and decreases maximum latencies substantially.
This change removes the possibility that tasks can get bursts of cpu activity
which can favour towards interactive tasks but also favour towards cpu bound
tasks which happen to wait on other activity (such as I/O) and is a net
gain.
This change also makes negative nice values less harmful to latencies of more
niced tasks, and should lead to less preemption which might decrease the
context switch rate and subsequently improve throughput.
The rr_interval can be made a tunable such that if an environment exists that
is not as latency sensitive, it can be increased for maximum throughput.
A tiny change checking for MAX_PRIO in normal_prio() may prevent oopses on
bootup on large SMP due to forking off the idle task.
Other minor cleanups.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
---
Documentation/sysctl/kernel.txt | 12 +++++
kernel/sched.c | 94 ++++++++++++++++++++++------------------
kernel/sysctl.c | 25 ++++++++--
3 files changed, 83 insertions(+), 48 deletions(-)
Index: linux-2.6.21-rc4-mm1/Documentation/sysctl/kernel.txt
===================================================================
--- linux-2.6.21-rc4-mm1.orig/Documentation/sysctl/kernel.txt 2007-03-21 20:53:50.000000000 +1100
+++ linux-2.6.21-rc4-mm1/Documentation/sysctl/kernel.txt 2007-03-21 20:54:19.000000000 +1100
@@ -43,6 +43,7 @@ show up in /proc/sys/kernel:
- printk
- real-root-dev ==> Documentation/initrd.txt
- reboot-cmd [ SPARC only ]
+- rr_interval
- rtsig-max
- rtsig-nr
- sem
@@ -288,6 +289,17 @@ rebooting. ???
==============================================================
+rr_interval:
+
+This is the smallest duration that any cpu process scheduling unit
+will run for. Increasing this value can increase throughput of cpu
+bound tasks substantially but at the expense of increased latencies
+overall. This value is in _ticks_ and the default value chosen depends
+on the number of cpus available at scheduler initialisation. Valid
+values are from 1-100.
+
+==============================================================
+
rtsig-max & rtsig-nr:
The file rtsig-max can be used to tune the maximum number
Index: linux-2.6.21-rc4-mm1/kernel/sched.c
===================================================================
--- linux-2.6.21-rc4-mm1.orig/kernel/sched.c 2007-03-21 20:53:50.000000000 +1100
+++ linux-2.6.21-rc4-mm1/kernel/sched.c 2007-03-22 03:58:42.000000000 +1100
@@ -93,8 +93,10 @@ unsigned long long __attribute__((weak))
/*
* This is the time all tasks within the same priority round robin.
* Set to a minimum of 8ms. Scales with number of cpus and rounds with HZ.
+ * Tunable via /proc interface.
*/
-static unsigned int rr_interval __read_mostly;
+int rr_interval __read_mostly;
+
#define RR_INTERVAL 8
#define DEF_TIMESLICE (rr_interval * 20)
@@ -686,19 +688,32 @@ static inline void task_new_array(struct
p->rotation = rq->prio_rotation;
}
+/* Find the first slot from the relevant prio_matrix entry */
static inline int first_prio_slot(struct task_struct *p)
{
return SCHED_PRIO(find_first_zero_bit(
prio_matrix[USER_PRIO(p->static_prio)], PRIO_RANGE));
}
-static inline int next_prio_slot(struct task_struct *p, int prio)
+/* Is a dynamic_prio part of the allocated slots for this static_prio */
+static inline int entitled_slot(int static_prio, int dynamic_prio)
+{
+ return !test_bit(USER_PRIO(dynamic_prio),
+ prio_matrix[USER_PRIO(static_prio)]);
+}
+
+/*
+ * Find the first unused slot by this task that is also in its prio_matrix
+ * level.
+ */
+static inline int next_entitled_slot(struct task_struct *p, struct rq *rq)
{
DECLARE_BITMAP(tmp, PRIO_RANGE);
+
bitmap_or(tmp, p->bitmap, prio_matrix[USER_PRIO(p->static_prio)],
PRIO_RANGE);
return SCHED_PRIO(find_next_zero_bit(tmp, PRIO_RANGE,
- USER_PRIO(prio)));
+ USER_PRIO(rq->prio_level)));
}
static void queue_expired(struct task_struct *p, struct rq *rq)
@@ -725,23 +740,12 @@ static void queue_expired(struct task_st
static void recalc_task_prio(struct task_struct *p, struct rq *rq)
{
struct prio_array *array = rq->active;
- int queue_prio, search_prio = MAX_RT_PRIO;
-
- /*
- * SCHED_BATCH tasks never start at better priority than any other
- * task that is already running since they are flagged as latency
- * insensitive. This means they never cause greater latencies in other
- * non SCHED_BATCH tasks of the same nice level, but they still will
- * not be exposed to high latencies themselves.
- */
- if (unlikely(p->policy == SCHED_BATCH))
- search_prio = rq->prio_level;
+ int queue_prio;
if (p->rotation == rq->prio_rotation) {
if (p->array == array) {
if (p->time_slice && rq_quota(rq, p->prio))
return;
- search_prio = p->prio;
} else if (p->array == rq->expired) {
queue_expired(p, rq);
return;
@@ -750,7 +754,7 @@ static void recalc_task_prio(struct task
} else
task_new_array(p, rq);
- queue_prio = next_prio_slot(p, search_prio);
+ queue_prio = next_entitled_slot(p, rq);
if (queue_prio >= MAX_PRIO) {
queue_expired(p, rq);
return;
@@ -907,7 +911,7 @@ static inline int normal_prio(struct tas
if (has_rt_policy(p))
return MAX_RT_PRIO-1 - p->rt_priority;
/* Other tasks all have normal_prio set in recalc_task_prio */
- if (likely(p->prio >= MAX_RT_PRIO))
+ if (likely(p->prio >= MAX_RT_PRIO && p->prio < MAX_PRIO))
return p->prio;
else
return p->static_prio;
@@ -942,10 +946,10 @@ static int effective_prio(struct task_st
*/
static unsigned int rr_quota(struct task_struct *p)
{
- int neg_nice = -TASK_NICE(p), rr = rr_interval;
+ int nice = TASK_NICE(p), rr = rr_interval;
- if (neg_nice > 6 && !rt_task(p)) {
- rr *= neg_nice * neg_nice;
+ if (nice < -6 && !rt_task(p)) {
+ rr *= nice * nice;
rr /= 40;
}
return rr;
@@ -1583,7 +1587,7 @@ int fastcall wake_up_state(struct task_s
return try_to_wake_up(p, state, 0);
}
-static void task_running_tick(struct rq *rq, struct task_struct *p);
+static void task_running_tick(struct rq *rq, struct task_struct *p, int tick);
/*
* Perform scheduler related setup for a newly forked process p.
* p is forked by current.
@@ -1645,7 +1649,7 @@ void fastcall sched_fork(struct task_str
* a problem.
*/
current->time_slice = 1;
- task_running_tick(cpu_rq(cpu), current);
+ task_running_tick(cpu_rq(cpu), current, 0);
}
local_irq_enable();
out:
@@ -1720,14 +1724,16 @@ void fastcall wake_up_new_task(struct ta
*/
void fastcall sched_exit(struct task_struct *p)
{
+ struct task_struct *parent;
unsigned long flags;
struct rq *rq;
- rq = task_rq_lock(p->parent, &flags);
- if (p->first_time_slice && task_cpu(p) == task_cpu(p->parent)) {
- p->parent->time_slice += p->time_slice;
- if (unlikely(p->parent->time_slice > p->quota))
- p->parent->time_slice = p->quota;
+ parent = p->parent;
+ rq = task_rq_lock(parent, &flags);
+ if (p->first_time_slice && task_cpu(p) == task_cpu(parent)) {
+ parent->time_slice += p->time_slice;
+ if (unlikely(parent->time_slice > parent->quota))
+ parent->time_slice = parent->quota;
}
task_rq_unlock(rq, &flags);
}
@@ -3372,7 +3378,7 @@ static inline void rotate_runqueue_prior
rq_quota(rq, new_prio_level) += 1;
}
-static void task_running_tick(struct rq *rq, struct task_struct *p)
+static void task_running_tick(struct rq *rq, struct task_struct *p, int tick)
{
if (unlikely(!task_queued(p))) {
/* Task has expired but was not scheduled yet */
@@ -3395,6 +3401,13 @@ static void task_running_tick(struct rq
if (!--p->time_slice)
task_expired_entitlement(rq, p);
/*
+ * If we're actually calling this function not in a scheduler_tick
+ * we are doing so to fix accounting across fork and should not be
+ * deducting anything from rq_quota.
+ */
+ if (!tick)
+ goto out_unlock;
+ /*
* We only employ the deadline mechanism if we run over the quota.
* It allows aliasing problems around the scheduler_tick to be
* less harmful.
@@ -3405,6 +3418,7 @@ static void task_running_tick(struct rq
rotate_runqueue_priority(rq);
set_tsk_need_resched(p);
}
+out_unlock:
spin_unlock(&rq->lock);
}
@@ -3423,7 +3437,7 @@ void scheduler_tick(void)
update_cpu_clock(p, rq, now);
if (!idle_at_tick)
- task_running_tick(rq, p);
+ task_running_tick(rq, p, 1);
#ifdef CONFIG_SMP
update_load(rq);
rq->idle_at_tick = idle_at_tick;
@@ -3469,20 +3483,13 @@ EXPORT_SYMBOL(sub_preempt_count);
#endif
-/* Is a dynamic_prio part of the allocated slots for this static_prio */
-static inline int entitled_slot(int static_prio, int dynamic_prio)
-{
- return !test_bit(USER_PRIO(dynamic_prio),
- prio_matrix[USER_PRIO(static_prio)]);
-}
-
/*
* If a task is queued at a priority that isn't from its bitmap we exchange
* by setting one of the entitlement bits.
*/
-static inline void exchange_slot(struct task_struct *p, int prio)
+static inline void exchange_slot(struct task_struct *p, struct rq *rq)
{
- int slot = next_prio_slot(p, prio);
+ int slot = next_entitled_slot(p, rq);
if (slot < MAX_PRIO)
__set_bit(USER_PRIO(slot), p->bitmap);
@@ -3524,6 +3531,7 @@ retry:
}
queue = array->queue + idx;
next = list_entry(queue->next, struct task_struct, run_list);
+ rq->prio_level = idx;
/*
* When the task is chosen it is checked to see if its quota has been
* added to this runqueue level which is only performed once per
@@ -3533,17 +3541,16 @@ retry:
/* Task has moved during major rotation */
task_new_array(next, rq);
if (!entitled_slot(next->static_prio, idx))
- exchange_slot(next, idx);
+ exchange_slot(next, rq);
set_task_entitlement(next);
rq_quota(rq, idx) += next->quota;
} else if (!test_bit(USER_PRIO(idx), next->bitmap)) {
/* Task has moved during minor rotation */
if (!entitled_slot(next->static_prio, idx))
- exchange_slot(next, idx);
+ exchange_slot(next, rq);
set_task_entitlement(next);
rq_quota(rq, idx) += next->quota;
}
- rq->prio_level = idx;
/*
* next needs to have its prio and array reset here in case the
* values are wrong due to priority rotation.
@@ -3632,8 +3639,11 @@ need_resched_nonpreemptible:
next = list_entry(queue->next, struct task_struct, run_list);
}
switch_tasks:
- if (next == rq->idle)
+ if (next == rq->idle) {
+ rq->prio_level = MAX_RT_PRIO;
+ rq->prio_rotation++;
schedstat_inc(rq, sched_goidle);
+ }
prefetch(next);
prefetch_stack(next);
clear_tsk_need_resched(prev);
Index: linux-2.6.21-rc4-mm1/kernel/sysctl.c
===================================================================
--- linux-2.6.21-rc4-mm1.orig/kernel/sysctl.c 2007-03-21 20:53:50.000000000 +1100
+++ linux-2.6.21-rc4-mm1/kernel/sysctl.c 2007-03-21 20:56:16.000000000 +1100
@@ -79,6 +79,7 @@ extern int percpu_pagelist_fraction;
extern int compat_log;
extern int maps_protect;
extern int print_fatal_signals;
+extern int rr_interval;
#if defined(CONFIG_ADAPTIVE_READAHEAD)
extern int readahead_ratio;
@@ -167,6 +168,13 @@ int sysctl_legacy_va_layout;
#endif
+/* Constants for minimum and maximum testing in vm_table.
+ We use these as one-element integer vectors. */
+static int __read_mostly zero;
+static int __read_mostly one = 1;
+static int __read_mostly one_hundred = 100;
+
+
/* The default sysctl tables: */
static ctl_table root_table[] = {
@@ -515,6 +523,17 @@ static ctl_table kern_table[] = {
.mode = 0444,
.proc_handler = &proc_dointvec,
},
+ {
+ .ctl_name = CTL_UNNUMBERED,
+ .procname = "rr_interval",
+ .data = &rr_interval,
+ .maxlen = sizeof (int),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec_minmax,
+ .strategy = &sysctl_intvec,
+ .extra1 = &one,
+ .extra2 = &one_hundred,
+ },
#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
{
.ctl_name = KERN_UNKNOWN_NMI_PANIC,
@@ -631,12 +650,6 @@ static ctl_table kern_table[] = {
{ .ctl_name = 0 }
};
-/* Constants for minimum and maximum testing in vm_table.
- We use these as one-element integer vectors. */
-static int zero;
-static int one_hundred = 100;
-
-
static ctl_table vm_table[] = {
{
.ctl_name = VM_OVERCOMMIT_MEMORY,
--
-ck
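[Editor's note] The heart of the patch above is the replacement of next_prio_slot() by next_entitled_slot(), which refuses slots above the runqueue's current prio_level. The logic can be sketched as standalone C — a simplified illustration, not kernel code: the priority space is flattened to a single 40-bit user-priority word, next_zero_bit() stands in for the kernel's find_next_zero_bit(), and the example matrix row in the comments is invented.

```c
#include <stdint.h>

#define PRIO_RANGE 40  /* nice -20..19 maps to user priorities 0..39 */

/* lowest clear bit at or after 'start'; PRIO_RANGE if none
 * (a stand-in for the kernel's find_next_zero_bit()) */
static int next_zero_bit(uint64_t map, int start)
{
    for (int i = start; i < PRIO_RANGE; i++)
        if (!(map & ((uint64_t)1 << i)))
            return i;
    return PRIO_RANGE;
}

/* The patched rule: a task may only take a slot that it has not already
 * used this rotation (task_used), that its static-priority row of the
 * prio_matrix entitles it to (matrix_row, where a set bit means "not
 * entitled"), and that is at or below the runqueue's current prio_level. */
static int next_entitled_slot(uint64_t task_used, uint64_t matrix_row,
                              int prio_level)
{
    /* mirrors bitmap_or(): a bit set in either map rules the slot out */
    return next_zero_bit(task_used | matrix_row, prio_level);
}
```

For example, with a matrix row entitling slots 2, 5 and 9, a task that has already consumed slot 2 gets slot 5 while the runqueue is at prio_level 0, but only slot 9 once prio_level has advanced past 5 — which is how the change keeps cpu allocation from being skewed by accounting across major_rotation periods.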
* Re: [PATCH] sched: rsdl improvements
2007-03-21 17:29 [PATCH] sched: rsdl improvements Con Kolivas
@ 2007-03-21 23:27 ` Artur Skawina
2007-03-21 23:48 ` Jeffrey Hundstad
2007-03-21 23:36 ` [PATCH] sched: rsdl improvements Andrew Morton
2007-03-22 14:46 ` Christian
2 siblings, 1 reply; 11+ messages in thread
From: Artur Skawina @ 2007-03-21 23:27 UTC (permalink / raw)
To: Con Kolivas; +Cc: linux list, ck list, Ingo Molnar, Andrew Morton
Con Kolivas wrote:
> Note no interactive boost idea here.
>
> Patch is for 2.6.21-rc4-mm1. I have not spent the time trying to bring other
> bases in sync.
I've tried RSDLv.31+this on 2.6.20.3 as I'm not tracking -mm.
> Further improve the deterministic nature of the RSDL cpu scheduler and make
> the rr_interval tunable.
>
> By only giving out priority slots to tasks at the current runqueue's
> prio_level or below we can make the cpu allocation not altered by accounting
> issues across major_rotation periods. This makes the cpu allocation and
> latencies more deterministic, and decreases maximum latencies substantially.
> This change removes the possibility that tasks can get bursts of cpu activity
> which can favour towards interactive tasks but also favour towards cpu bound
> tasks which happen to wait on other activity (such as I/O) and is a net
> gain.
I'm not sure this is going in the right direction... I'm writing
this while compiling a kernel w/ "nice -20 make -j2" and X is almost
unusable -- even the X pointer jumps instead of moving smoothly like
it always did; I had to stop the build to be able to quickly finish
this as the latency is making it hard to properly position the cursor...
Hmm, this is weird; I've tried various nice values for the build and
19 is the only one triggering this, w/ 18 and less the cursor moves
smoothly, but there are short sub-second stalls. nice=0 isn't much
different.
RSDL 0.31 was behaving properly, and only exhibited problems when
the box was overloaded w/ non-niced tasks; Right now even a properly
niced background job kills interactivity completely.
artur
* Re: [PATCH] sched: rsdl improvements
2007-03-21 17:29 [PATCH] sched: rsdl improvements Con Kolivas
2007-03-21 23:27 ` Artur Skawina
@ 2007-03-21 23:36 ` Andrew Morton
2007-03-22 5:03 ` Con Kolivas
2007-03-22 14:46 ` Christian
2 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2007-03-21 23:36 UTC (permalink / raw)
To: Con Kolivas; +Cc: linux list, ck list, Ingo Molnar
On Thu, 22 Mar 2007 04:29:44 +1100
Con Kolivas <kernel@kolivas.org> wrote:
> Further improve the deterministic nature of the RSDL cpu scheduler and make
> the rr_interval tunable.
I might actually need to drop RSDL from next -mm, see if those sched oopses
which several people have reported go away.
* Re: [PATCH] sched: rsdl improvements
2007-03-21 23:27 ` Artur Skawina
@ 2007-03-21 23:48 ` Jeffrey Hundstad
2007-03-22 0:15 ` Artur Skawina
2007-03-22 0:24 ` Con Kolivas
0 siblings, 2 replies; 11+ messages in thread
From: Jeffrey Hundstad @ 2007-03-21 23:48 UTC (permalink / raw)
To: Artur Skawina
Cc: Con Kolivas, linux list, ck list, Ingo Molnar, Andrew Morton
Artur Skawina wrote:
> Con Kolivas wrote:
>
>> Note no interactive boost idea here.
>>
>> Patch is for 2.6.21-rc4-mm1. I have not spent the time trying to bring other
>> bases in sync.
>>
>
> I've tried RSDLv.31+this on 2.6.20.3 as i'm not tracking -mm.
>
>
>> Further improve the deterministic nature of the RSDL cpu scheduler and make
>> the rr_interval tunable.
>>
>> By only giving out priority slots to tasks at the current runqueue's
>> prio_level or below we can make the cpu allocation not altered by accounting
>> issues across major_rotation periods. This makes the cpu allocation and
>> latencies more deterministic, and decreases maximum latencies substantially.
>> This change removes the possibility that tasks can get bursts of cpu activity
>> which can favour towards interactive tasks but also favour towards cpu bound
>> tasks which happen to wait on other activity (such as I/O) and is a net
>> gain.
>>
>
> I'm not sure this is going in the right direction... I'm writing
> this while compiling a kernel w/ "nice -20 make -j2" and X is almost
>
Did you mean "nice -20"? If so, that should have slowed X quite a bit.
Try "nice 19" instead.
nice(1):
Run COMMAND with an adjusted niceness, which affects process
scheduling. With no COMMAND, print the current niceness. Nicenesses
range from -20 (most favorable scheduling) to 19 (least favorable).
* Re: [PATCH] sched: rsdl improvements
2007-03-21 23:48 ` Jeffrey Hundstad
@ 2007-03-22 0:15 ` Artur Skawina
2007-03-22 0:24 ` Con Kolivas
1 sibling, 0 replies; 11+ messages in thread
From: Artur Skawina @ 2007-03-22 0:15 UTC (permalink / raw)
To: Jeffrey Hundstad; +Cc: Con Kolivas, linux list, ck list, Ingo Molnar
Jeffrey Hundstad wrote:
>> I'm not sure this is going in the right direction... I'm writing
>> this while compiling a kernel w/ "nice -20 make -j2" and X is almost
>>
> Did you mean "nice -20"? If so, that should have slowed X quite a bit.
> Try "nice 19" instead.
I did try "nice --20" too :) It resulted in long X stalls, but I don't
think that's a reasonable load so I did not mention it.
"nice -20 cmd" runs cmd at nice==19.
Usage: nice [OPTION] [COMMAND [ARG]...]
Run COMMAND with an adjusted niceness, which affects process scheduling.
With no COMMAND, print the current niceness. Nicenesses range from
-20 (most favorable scheduling) to 19 (least favorable).
-n, --adjustment=N add integer N to the niceness (default 10)
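[Editor's note] The distinction the subthread trips over can be checked from any shell, since a bare `nice` with no COMMAND prints the current niceness (standard coreutils behaviour; raising niceness needs no privileges):

```shell
# Historical one-dash syntax: "nice -20" means "add 20 to the niceness"
# (clamped to the maximum of 19), NOT "run at niceness -20".
nice -20 nice     # prints 19
nice -n 10 nice   # prints 10
# Negative niceness needs the double dash (and root): nice --20 / nice -n -20
```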
* Re: [PATCH] sched: rsdl improvements
2007-03-21 23:48 ` Jeffrey Hundstad
2007-03-22 0:15 ` Artur Skawina
@ 2007-03-22 0:24 ` Con Kolivas
2007-03-22 0:52 ` Con Kolivas
1 sibling, 1 reply; 11+ messages in thread
From: Con Kolivas @ 2007-03-22 0:24 UTC (permalink / raw)
To: Jeffrey Hundstad
Cc: Artur Skawina, linux list, ck list, Ingo Molnar, Andrew Morton
On Thursday 22 March 2007 10:48, Jeffrey Hundstad wrote:
> Artur Skawina wrote:
> > Con Kolivas wrote:
> >> Note no interactive boost idea here.
> >>
> >> Patch is for 2.6.21-rc4-mm1. I have not spent the time trying to bring
> >> other bases in sync.
> >
> > I've tried RSDLv.31+this on 2.6.20.3 as i'm not tracking -mm.
> >
> >> Further improve the deterministic nature of the RSDL cpu scheduler and
> >> make the rr_interval tunable.
> >>
> >> By only giving out priority slots to tasks at the current runqueue's
> >> prio_level or below we can make the cpu allocation not altered by
> >> accounting issues across major_rotation periods. This makes the cpu
> >> allocation and latencies more deterministic, and decreases maximum
> >> latencies substantially. This change removes the possibility that tasks
> >> can get bursts of cpu activity which can favour towards interactive
> >> tasks but also favour towards cpu bound tasks which happen to wait on
> >> other activity (such as I/O) and is a net gain.
> >
> > I'm not sure this is going in the right direction... I'm writing
> > this while compiling a kernel w/ "nice -20 make -j2" and X is almost
>
> Did you mean "nice -20"? If so, that should have slowed X quite a bit.
> Try "nice 19" instead.
>
> nice(1):
> Run COMMAND with an adjusted niceness, which affects process
> scheduling. With no COMMAND, print the current niceness. Nicenesses
> range from -20 (most favorable scheduling) to 19 (least favorable).
No, he's right. Something scrambled my brain and I've completely left out the
part where I offer the old bursts as a tunable option as well, which
unintentionally killed off SCHED_BATCH as an entity. I'll have to put that in
as an additional patch, sorry, as this by itself is not always a win. Hang in
there.
--
-ck
* Re: [PATCH] sched: rsdl improvements
2007-03-22 0:24 ` Con Kolivas
@ 2007-03-22 0:52 ` Con Kolivas
2007-03-22 2:04 ` [PATCH] sched: rsdl check for niced tasks lowering prio level Con Kolivas
0 siblings, 1 reply; 11+ messages in thread
From: Con Kolivas @ 2007-03-22 0:52 UTC (permalink / raw)
To: Jeffrey Hundstad
Cc: Artur Skawina, linux list, ck list, Ingo Molnar, Andrew Morton
On Thursday 22 March 2007 11:24, Con Kolivas wrote:
> On Thursday 22 March 2007 10:48, Jeffrey Hundstad wrote:
> > Artur Skawina wrote:
> > > Con Kolivas wrote:
> > >> Note no interactive boost idea here.
> > >>
> > >> Patch is for 2.6.21-rc4-mm1. I have not spent the time trying to bring
> > >> other bases in sync.
> > >
> > > I've tried RSDLv.31+this on 2.6.20.3 as i'm not tracking -mm.
> > >
> > >> Further improve the deterministic nature of the RSDL cpu scheduler and
> > >> make the rr_interval tunable.
> > >>
> > >> By only giving out priority slots to tasks at the current runqueue's
> > >> prio_level or below we can make the cpu allocation not altered by
> > >> accounting issues across major_rotation periods. This makes the cpu
> > >> allocation and latencies more deterministic, and decreases maximum
> > >> latencies substantially. This change removes the possibility that
> > >> tasks can get bursts of cpu activity which can favour towards
> > >> interactive tasks but also favour towards cpu bound tasks which happen
> > >> to wait on other activity (such as I/O) and is a net gain.
> > >
> > > I'm not sure this is going in the right direction... I'm writing
> > > this while compiling a kernel w/ "nice -20 make -j2" and X is almost
> >
> > Did you mean "nice -20"? If so, that should have slowed X quite a bit.
> > Try "nice 19" instead.
> >
> > nice(1):
> > Run COMMAND with an adjusted niceness, which affects process
> > scheduling. With no COMMAND, print the current niceness. Nicenesses
> > range from -20 (most favorable scheduling) to 19 (least favorable).
>
> No he's right. Something scrambled my brain and I've completely left out
> the part where I offer the old bursts as a tunable option as well, which
> unintentionally killed off SCHED_BATCH as an entity. I'll have to put that
> as an additional patch sorry as this by itself is not always a win. Hang in
> there.
Actually, reworking the priority matrix to always have a slot at position 1
should fix this without needing a tunable. That is a better approach so I'll
do that.
--
-ck
* [PATCH] sched: rsdl check for niced tasks lowering prio level
2007-03-22 0:52 ` Con Kolivas
@ 2007-03-22 2:04 ` Con Kolivas
2007-03-22 13:34 ` Artur Skawina
0 siblings, 1 reply; 11+ messages in thread
From: Con Kolivas @ 2007-03-22 2:04 UTC (permalink / raw)
To: Jeffrey Hundstad
Cc: Artur Skawina, linux list, ck list, Ingo Molnar, Andrew Morton
Here is the best fix for the bug pointed out. Thanks.
I'll try and find pc time to wrap these two patches together and make a v0.32
available.
---
Ensure niced tasks are not inappropriately limiting sleeping unniced tasks
by explicitly checking what the best static priority that has run this
major rotation was.
Reimplement SCHED_BATCH using this check.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
---
kernel/sched.c | 33 ++++++++++++++++++++++++---------
1 file changed, 24 insertions(+), 9 deletions(-)
Index: linux-2.6.21-rc4-mm1/kernel/sched.c
===================================================================
--- linux-2.6.21-rc4-mm1.orig/kernel/sched.c 2007-03-22 12:44:05.000000000 +1100
+++ linux-2.6.21-rc4-mm1/kernel/sched.c 2007-03-22 12:58:26.000000000 +1100
@@ -201,8 +201,11 @@ struct rq {
struct prio_array *active, *expired, arrays[2];
unsigned long *dyn_bitmap, *exp_bitmap;
- int prio_level;
- /* The current dynamic priority level this runqueue is at */
+ int prio_level, best_static_prio;
+ /*
+ * The current dynamic priority level this runqueue is at, and the
+ * best static priority queued this major rotation.
+ */
unsigned long prio_rotation;
/* How many times we have rotated the priority queue */
@@ -704,16 +707,24 @@ static inline int entitled_slot(int stat
/*
* Find the first unused slot by this task that is also in its prio_matrix
- * level.
+ * level. Ensure that the prio_level is not unnecessarily low by checking
+ * that best_static_prio this major rotation was not a niced task.
+ * SCHED_BATCH tasks do not perform this check so they do not induce
+ * latencies in tasks of any nice level.
*/
static inline int next_entitled_slot(struct task_struct *p, struct rq *rq)
{
- DECLARE_BITMAP(tmp, PRIO_RANGE);
+ if (p->static_prio < rq->best_static_prio && p->policy != SCHED_BATCH)
+ return SCHED_PRIO(find_first_zero_bit(p->bitmap, PRIO_RANGE));
+ else {
+ DECLARE_BITMAP(tmp, PRIO_RANGE);
- bitmap_or(tmp, p->bitmap, prio_matrix[USER_PRIO(p->static_prio)],
- PRIO_RANGE);
- return SCHED_PRIO(find_next_zero_bit(tmp, PRIO_RANGE,
- USER_PRIO(rq->prio_level)));
+ bitmap_or(tmp, p->bitmap,
+ prio_matrix[USER_PRIO(p->static_prio)],
+ PRIO_RANGE);
+ return SCHED_PRIO(find_next_zero_bit(tmp, PRIO_RANGE,
+ USER_PRIO(rq->prio_level)));
+ }
}
static void queue_expired(struct task_struct *p, struct rq *rq)
@@ -3315,6 +3326,7 @@ static inline void major_prio_rotation(s
rq->active = new_array;
rq->exp_bitmap = rq->expired->prio_bitmap;
rq->dyn_bitmap = rq->active->prio_bitmap;
+ rq->best_static_prio = MAX_PRIO;
rq->prio_rotation++;
}
@@ -3640,10 +3652,12 @@ need_resched_nonpreemptible:
}
switch_tasks:
if (next == rq->idle) {
+ rq->best_static_prio = MAX_PRIO;
rq->prio_level = MAX_RT_PRIO;
rq->prio_rotation++;
schedstat_inc(rq, sched_goidle);
- }
+ } else if (next->static_prio < rq->best_static_prio)
+ rq->best_static_prio = next->static_prio;
prefetch(next);
prefetch_stack(next);
clear_tsk_need_resched(prev);
@@ -7093,6 +7107,7 @@ void __init sched_init(void)
lockdep_set_class(&rq->lock, &rq->rq_lock_key);
rq->nr_running = 0;
rq->prio_rotation = 0;
+ rq->best_static_prio = MAX_PRIO;
rq->prio_level = MAX_RT_PRIO;
rq->active = rq->arrays;
rq->expired = rq->arrays + 1;
--
-ck
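[Editor's note] The behavioural fix in this second patch amounts to one extra branch in next_entitled_slot(). It can be sketched in the same simplified standalone form as before — not kernel code: flat 40-bit bitmaps, a stand-in next_zero_bit(), and an invented example matrix row; static_prio 100..139 corresponds to nice -20..19 as in mainline.

```c
#include <stdint.h>

#define PRIO_RANGE   40
#define SCHED_NORMAL  0
#define SCHED_BATCH   3

/* stand-in for the kernel's find_{first,next}_zero_bit() */
static int next_zero_bit(uint64_t map, int start)
{
    for (int i = start; i < PRIO_RANGE; i++)
        if (!(map & ((uint64_t)1 << i)))
            return i;
    return PRIO_RANGE;
}

/* A task whose static priority beats the best seen this major rotation
 * ignores both the matrix and the runqueue's prio_level and takes its
 * first unused slot, so niced tasks can no longer hold it down.
 * SCHED_BATCH skips that shortcut and therefore induces no latency. */
static int next_entitled_slot(int static_prio, int policy, uint64_t used,
                              uint64_t matrix_row, int prio_level,
                              int best_static_prio)
{
    if (static_prio < best_static_prio && policy != SCHED_BATCH)
        return next_zero_bit(used, 0);
    return next_zero_bit(used | matrix_row, prio_level);
}
```

For example, after a burst of nice 19 work (best_static_prio 139, prio_level advanced to 6), a waking nice 0 task (static_prio 120) now queues at slot 0 instead of waiting behind the niced tasks; the same task under SCHED_BATCH still waits for its entitled slot.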
* Re: [PATCH] sched: rsdl improvements
2007-03-21 23:36 ` [PATCH] sched: rsdl improvements Andrew Morton
@ 2007-03-22 5:03 ` Con Kolivas
0 siblings, 0 replies; 11+ messages in thread
From: Con Kolivas @ 2007-03-22 5:03 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux list, ck list, Ingo Molnar
On Thursday 22 March 2007 10:36, Andrew Morton wrote:
> On Thu, 22 Mar 2007 04:29:44 +1100
>
> Con Kolivas <kernel@kolivas.org> wrote:
> > Further improve the deterministic nature of the RSDL cpu scheduler and
> > make the rr_interval tunable.
>
> I might actually need to drop RSDL from next -mm, see if those sched oopses
> which several people have reported go away.
I did mention them in the changelog further down. While it may not be
immediately apparent from the minimal emails I'm sending, I am trying hard to
address every known regression in the time allotted. Without access to the
hardware, though, I'm reliant on others testing it, so I can't know for certain
if I've fixed them.
--
-ck
* Re: [PATCH] sched: rsdl check for niced tasks lowering prio level
2007-03-22 2:04 ` [PATCH] sched: rsdl check for niced tasks lowering prio level Con Kolivas
@ 2007-03-22 13:34 ` Artur Skawina
0 siblings, 0 replies; 11+ messages in thread
From: Artur Skawina @ 2007-03-22 13:34 UTC (permalink / raw)
To: Con Kolivas
Cc: Jeffrey Hundstad, linux list, ck list, Ingo Molnar, Andrew Morton
Con Kolivas wrote:
> Here is the best fix for the bug pointed out. Thanks.
> Ensure niced tasks are not inappropriately limiting sleeping unniced tasks
> by explicitly checking what the best static priority that has run this
> major rotation was.
yes, this made the machine usable again.
After noticing that the context switch rate during a "nice -19 make
-j2" has now been halved vs stock 2.6.20, compiled a kernel under
both and RSDL w/these two patches was 2% faster; so this may be the
right direction after all :)
artur
* Re: [PATCH] sched: rsdl improvements
2007-03-21 17:29 [PATCH] sched: rsdl improvements Con Kolivas
2007-03-21 23:27 ` Artur Skawina
2007-03-21 23:36 ` [PATCH] sched: rsdl improvements Andrew Morton
@ 2007-03-22 14:46 ` Christian
2 siblings, 0 replies; 11+ messages in thread
From: Christian @ 2007-03-22 14:46 UTC (permalink / raw)
To: Con Kolivas, linux list; +Cc: ck list
Hello List!
I've been using the new scheduler for a few days and I have to say that this is
an amazing improvement for gaming loads. When playing enemy-territory the
animations are completely smooth and fluid, without any hiccups. ET now runs
clearly better on Linux than on my Win XP partition (same HW). I have never
seen such smooth animations on any OS before.
(I have LD_PRELOADED a libnoyield ripped from a post here on lkml)
I can play et and have a 'nice make -j4' in the background and the only thing
you notice is longer load-times/more latency on disk access. I can't tell if
the compilation is finished or not while gaming ;-)
On everyday desktop usage there is a slight improvement too, but for me the
current mainline scheduler has no problems there: I couldn't tell which
scheduler is in use by only doing desktop-related things.
I didn't find any severe regressions on my dual core system. No problems with
sound or video even with make -j8 kernel compile in the background.
When I start a single cpu-hog it tends to jump around cores every second or
so. I think this is a regression relative to the current scheduler, which tends
to have better affinity management on multicore.
Really nice work Con! ;-)
-Christian