LKML Archive on lore.kernel.org
* [PATCH RESEND 1/2] sched/rt: Check to push the task when changing its affinity
@ 2015-02-04  1:12 Xunlei Pang
  2015-02-04  1:12 ` [PATCH RESEND 2/2] sched/rt: Add check_preempt_equal_prio() logic in pick_next_task_rt() Xunlei Pang
  2015-02-04  3:14 ` [PATCH RESEND 1/2] sched/rt: Check to push the task when changing its affinity Steven Rostedt
  0 siblings, 2 replies; 6+ messages in thread
From: Xunlei Pang @ 2015-02-04  1:12 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Steven Rostedt, Juri Lelli, Xunlei Pang

From: Xunlei Pang <pang.xunlei@linaro.org>

Affinity constraints can leave a runqueue RT-overloaded for longer
than necessary, so when the affinity of any runnable RT task
changes, we should check whether to trigger balancing; otherwise
real-time response is delayed unnecessarily. Unfortunately, the
current RT global scheduler doesn't trigger anything in this case.

For example: on a 2-CPU system, two runnable FIFO tasks with the
same rt_priority are bound to CPU0; call them rt1 (running) and
rt2 (runnable). CPU1 has no RT tasks. Someone then sets the
affinity of rt2 to 0x3 (i.e. CPU0 and CPU1), but rt2 still can't
be scheduled until rt1 enters schedule(), which can add
significant response latency for rt2.

So, when set_cpus_allowed_rt() detects such a case, check whether
to trigger a push.

Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
---
 kernel/sched/rt.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 59 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index f4d4b07..4dacb6e 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1428,7 +1428,7 @@ static struct sched_rt_entity *pick_next_rt_entity(struct rq *rq,
 	return next;
 }
 
-static struct task_struct *_pick_next_task_rt(struct rq *rq)
+static struct task_struct *_pick_next_task_rt(struct rq *rq, int peek_only)
 {
 	struct sched_rt_entity *rt_se;
 	struct task_struct *p;
@@ -1441,7 +1441,8 @@ static struct task_struct *_pick_next_task_rt(struct rq *rq)
 	} while (rt_rq);
 
 	p = rt_task_of(rt_se);
-	p->se.exec_start = rq_clock_task(rq);
+	if (!peek_only)
+		p->se.exec_start = rq_clock_task(rq);
 
 	return p;
 }
@@ -1476,7 +1477,7 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev)
 
 	put_prev_task(rq, prev);
 
-	p = _pick_next_task_rt(rq);
+	p = _pick_next_task_rt(rq, 0);
 
 	/* The running task is never eligible for pushing */
 	dequeue_pushable_task(rq, p);
@@ -1886,28 +1887,69 @@ static void set_cpus_allowed_rt(struct task_struct *p,
 				const struct cpumask *new_mask)
 {
 	struct rq *rq;
-	int weight;
+	int old_weight, new_weight;
+	int preempt_push = 0, direct_push = 0;
 
 	BUG_ON(!rt_task(p));
 
 	if (!task_on_rq_queued(p))
 		return;
 
-	weight = cpumask_weight(new_mask);
+	old_weight = p->nr_cpus_allowed;
+	new_weight = cpumask_weight(new_mask);
+
+	rq = task_rq(p);
+
+	if (new_weight > 1 &&
+	    rt_task(rq->curr) &&
+	    !test_tsk_need_resched(rq->curr)) {
+		/*
+		 * Set new mask information to prepare pushing.
+		 * It's safe to do this here.
+		 */
+		cpumask_copy(&p->cpus_allowed, new_mask);
+		p->nr_cpus_allowed = new_weight;
+
+		if (task_running(rq, p) &&
+		    cpumask_test_cpu(task_cpu(p), new_mask) &&
+		    cpupri_find(&rq->rd->cpupri, p, NULL)) {
+			/*
+			 * At this point, current task gets migratable most
+			 * likely due to the change of its affinity, let's
+			 * figure out if we can migrate it.
+			 *
+			 * Is there any task with the same priority as that
+			 * of current task? If found one, we should resched.
+			 * NOTE: The target may be unpushable.
+			 */
+			if (p->prio == rq->rt.highest_prio.next) {
+				/* One target just in pushable_tasks list. */
+				requeue_task_rt(rq, p, 0);
+				preempt_push = 1;
+			} else if (rq->rt.rt_nr_total > 1) {
+				struct task_struct *next;
+
+				requeue_task_rt(rq, p, 0);
+				/* peek only */
+				next = _pick_next_task_rt(rq, 1);
+				if (next != p && next->prio == p->prio)
+					preempt_push = 1;
+			}
+		} else if (!task_running(rq, p))
+			direct_push = 1;
+	}
 
 	/*
 	 * Only update if the process changes its state from whether it
 	 * can migrate or not.
 	 */
-	if ((p->nr_cpus_allowed > 1) == (weight > 1))
-		return;
-
-	rq = task_rq(p);
+	if ((old_weight > 1) == (new_weight > 1))
+		goto out;
 
 	/*
 	 * The process used to be able to migrate OR it can now migrate
 	 */
-	if (weight <= 1) {
+	if (new_weight <= 1) {
 		if (!task_current(rq, p))
 			dequeue_pushable_task(rq, p);
 		BUG_ON(!rq->rt.rt_nr_migratory);
@@ -1919,6 +1961,13 @@ static void set_cpus_allowed_rt(struct task_struct *p,
 	}
 
 	update_rt_migration(&rq->rt);
+
+out:
+	if (direct_push)
+		push_rt_tasks(rq);
+
+	if (preempt_push)
+		resched_curr(rq);
 }
 
 /* Assumes rq->lock is held */
-- 
1.9.1
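
The decision that the patch above adds to set_cpus_allowed_rt() can be summarized with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the names `affinity_change`, `decide_push` and `migratability_changed` are hypothetical, and the two equal-priority checks in the patch (the `highest_prio.next` comparison and the peek after requeueing) are collapsed into a single `equal_prio_queued` flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the patch's decision logic; not kernel code. */
enum push_action { PUSH_NONE, PUSH_DIRECT, PUSH_PREEMPT };

struct affinity_change {
	int  new_weight;        /* number of CPUs in the new mask */
	bool curr_is_rt;        /* rq->curr is an RT task */
	bool curr_need_resched; /* rq->curr already has need_resched set */
	bool task_running;      /* p is the task currently on the CPU */
	bool cpu_in_new_mask;   /* p's current CPU is in the new mask */
	bool lower_cpu_exists;  /* stands in for a non-zero cpupri_find() */
	bool equal_prio_queued; /* another queued task has p's priority */
};

/* Mirrors the "(old_weight > 1) == (new_weight > 1)" early-exit test. */
static bool migratability_changed(int old_weight, int new_weight)
{
	return (old_weight > 1) != (new_weight > 1);
}

static enum push_action decide_push(const struct affinity_change *c)
{
	assert(c->new_weight >= 1);

	/* Nothing to do if p stays pinned or a resched is already pending. */
	if (c->new_weight <= 1 || !c->curr_is_rt || c->curr_need_resched)
		return PUSH_NONE;

	if (c->task_running) {
		/* Running task: reschedule so it can be pushed afterwards. */
		if (c->cpu_in_new_mask && c->lower_cpu_exists &&
		    c->equal_prio_queued)
			return PUSH_PREEMPT;
		return PUSH_NONE;
	}

	/* Queued but not running: try to push it right away. */
	return PUSH_DIRECT;
}
```

In the commit message's example, rt2 is queued but not running and its new mask has weight 2, so this model returns `PUSH_DIRECT`, which corresponds to the `push_rt_tasks(rq)` call at the `out:` label.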




* [PATCH RESEND 2/2] sched/rt: Add check_preempt_equal_prio() logic in pick_next_task_rt()
  2015-02-04  1:12 [PATCH RESEND 1/2] sched/rt: Check to push the task when changing its affinity Xunlei Pang
@ 2015-02-04  1:12 ` Xunlei Pang
  2015-02-04  3:17   ` Steven Rostedt
  2015-02-04  3:14 ` [PATCH RESEND 1/2] sched/rt: Check to push the task when changing its affinity Steven Rostedt
  1 sibling, 1 reply; 6+ messages in thread
From: Xunlei Pang @ 2015-02-04  1:12 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Steven Rostedt, Juri Lelli, Xunlei Pang

From: Xunlei Pang <pang.xunlei@linaro.org>

check_preempt_curr() doesn't call sched_class::check_preempt_curr
when the current task belongs to a higher scheduling class. So if
a DL task is running when check_preempt_curr() is called for an RT
task, check_preempt_equal_prio() is never reached, which may add
response latency for this RT task if it is pinned and some
same-priority migratable RT tasks are already queued.

We should do a similar thing in select_task_rq_rt() when first
picking RT tasks after running out of DL tasks.

This patch tackles the issue by peeking at the next RT task (RT1)
and, if RT1 is migratable, requeueing it to the tail of the rq
using requeue_task_rt(rq, p, 0). In this way:
- If there is another RT task (RT2) with the same priority as
  RT1, RT2 is finally picked as the running task, while RT1 is
  pushed onto another CPU via RT1's post_schedule(), as RT1 is
  migratable. The difference from check_preempt_equal_prio()
  here is that we don't care whether RT2 is migratable.

- Otherwise, if there is no RT task with the same priority as RT1,
  RT1 is still picked as the running task after the requeue.

Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
---
 kernel/sched/rt.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 4dacb6e..b2385ee 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1477,6 +1477,21 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev)
 
 	put_prev_task(rq, prev);
 
+#ifdef CONFIG_SMP
+		/*
+		 * If there's a running deadline task, check_preempt_curr()
+		 * doesn't invoke check_preempt_curr_rt() for rt tasks, so
+		 * we can do it here.
+		 */
+		if (prev->sched_class == &dl_sched_class &&
+		    rq->rt.rt_nr_total > 1) {
+			p = _pick_next_task_rt(rq, 1); /* peek only */
+			if (p->nr_cpus_allowed != 1 &&
+			    cpupri_find(&rq->rd->cpupri, p, NULL))
+				requeue_task_rt(rq, p, 0);
+		}
+#endif
+
 	p = _pick_next_task_rt(rq, 0);
 
 	/* The running task is never eligible for pushing */
-- 
1.9.1
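
The requeue-to-tail behaviour described in the commit message above can be illustrated with a toy queue. This is a minimal sketch, assuming a single priority level modelled as an array with the head at index 0; `toy_task`, `toy_queue`, `peek_head`, `requeue_tail` and `pick_after_dl` are hypothetical names, and the `migratable` flag stands in for the `nr_cpus_allowed != 1` and cpupri_find() checks in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8

/* Hypothetical toy types; not the kernel's rt_rq or task_struct. */
struct toy_task {
	const char *name;
	bool migratable; /* can run elsewhere and a target CPU exists */
};

/* One priority level of a run queue, head at index 0. */
struct toy_queue {
	struct toy_task *tasks[MAX_TASKS];
	size_t len;
};

/* Look at the head without changing any state. */
static struct toy_task *peek_head(const struct toy_queue *q)
{
	assert(q->len > 0);
	return q->tasks[0];
}

/* Move the head to the tail, like requeue_task_rt(rq, p, 0). */
static void requeue_tail(struct toy_queue *q)
{
	assert(q->len > 0);
	struct toy_task *head = q->tasks[0];

	for (size_t i = 1; i < q->len; i++)
		q->tasks[i - 1] = q->tasks[i];
	q->tasks[q->len - 1] = head;
}

/* First pick after a deadline task stops running on this CPU. */
static struct toy_task *pick_after_dl(struct toy_queue *q)
{
	if (q->len > 1 && peek_head(q)->migratable)
		requeue_tail(q);
	return peek_head(q);
}
```

With RT1 (migratable) ahead of RT2 in the queue, `pick_after_dl()` returns RT2 and leaves RT1 queued behind it, where a later push can move it. With RT1 alone, RT1 is still returned, matching the second bullet of the commit message.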




* Re: [PATCH RESEND 1/2] sched/rt: Check to push the task when changing its affinity
  2015-02-04  1:12 [PATCH RESEND 1/2] sched/rt: Check to push the task when changing its affinity Xunlei Pang
  2015-02-04  1:12 ` [PATCH RESEND 2/2] sched/rt: Add check_preempt_equal_prio() logic in pick_next_task_rt() Xunlei Pang
@ 2015-02-04  3:14 ` Steven Rostedt
  2015-02-04 12:59   ` Xunlei Pang
  1 sibling, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2015-02-04  3:14 UTC (permalink / raw)
  To: Xunlei Pang; +Cc: linux-kernel, Peter Zijlstra, Juri Lelli, Xunlei Pang

On Wed,  4 Feb 2015 09:12:20 +0800
Xunlei Pang <xlpang@126.com> wrote:

> From: Xunlei Pang <pang.xunlei@linaro.org>
> 
> We may suffer from extra rt overload rq due to the affinity,
> so when the affinity of any runnable rt task is changed, we
> should check to trigger balancing, otherwise it will cause
> some unnecessary delayed real-time response. Unfortunately,
> current RT global scheduler doesn't trigger anything.
> 
> For example: a 2-cpu system with two runnable FIFO tasks(same
> rt_priority) bound on CPU0, let's name them rt1(running) and
> rt2(runnable) respectively; CPU1 has no RTs. Then, someone sets
> the affinity of rt2 to 0x3(i.e. CPU0 and CPU1), but after this,
> rt2 still can't be scheduled until rt1 enters schedule(), this
> definitely causes some/big response latency for rt2.
> 

I understand the issue you point out, but I have to be honest and say
that I really do not like this approach.

> So, when doing set_cpus_allowed_rt(), if detecting such cases,
> check to trigger a push behaviour.
> 
> Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
> ---
>  kernel/sched/rt.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 59 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index f4d4b07..4dacb6e 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1428,7 +1428,7 @@ static struct sched_rt_entity *pick_next_rt_entity(struct rq *rq,
>  	return next;
>  }
>  
> -static struct task_struct *_pick_next_task_rt(struct rq *rq)
> +static struct task_struct *_pick_next_task_rt(struct rq *rq, int peek_only)
>  {

peek_only should be bool, but don't worry about it, I think this isn't
needed.

>  	struct sched_rt_entity *rt_se;
>  	struct task_struct *p;
> @@ -1441,7 +1441,8 @@ static struct task_struct *_pick_next_task_rt(struct rq *rq)
>  	} while (rt_rq);
>  
>  	p = rt_task_of(rt_se);
> -	p->se.exec_start = rq_clock_task(rq);
> +	if (!peek_only)
> +		p->se.exec_start = rq_clock_task(rq);
>  
>  	return p;
>  }
> @@ -1476,7 +1477,7 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev)
>  
>  	put_prev_task(rq, prev);
>  
> -	p = _pick_next_task_rt(rq);
> +	p = _pick_next_task_rt(rq, 0);
>  
>  	/* The running task is never eligible for pushing */
>  	dequeue_pushable_task(rq, p);
> @@ -1886,28 +1887,69 @@ static void set_cpus_allowed_rt(struct task_struct *p,
>  				const struct cpumask *new_mask)
>  {
>  	struct rq *rq;
> -	int weight;
> +	int old_weight, new_weight;
> +	int preempt_push = 0, direct_push = 0;
>  
>  	BUG_ON(!rt_task(p));
>  
>  	if (!task_on_rq_queued(p))
>  		return;
>  
> -	weight = cpumask_weight(new_mask);
> +	old_weight = p->nr_cpus_allowed;
> +	new_weight = cpumask_weight(new_mask);
> +
> +	rq = task_rq(p);
> +
> +	if (new_weight > 1 &&
> +	    rt_task(rq->curr) &&
> +	    !test_tsk_need_resched(rq->curr)) {
> +		/*
> +		 * Set new mask information to prepare pushing.
> +		 * It's safe to do this here.

Please explain why it is safe.

> +		 */
> +		cpumask_copy(&p->cpus_allowed, new_mask);
> +		p->nr_cpus_allowed = new_weight;
> +
> +		if (task_running(rq, p) &&
> +		    cpumask_test_cpu(task_cpu(p), new_mask) &&
> +		    cpupri_find(&rq->rd->cpupri, p, NULL)) {

Hmm, You called cpupri_find() which should also return a mask of the
CPUs with the lowest priorities. I wonder if we could utilize this
information instead of doing it twice? Of course things could change by
the time the task migrates.

> +			/*
> +			 * At this point, current task gets migratable most
> +			 * likely due to the change of its affinity, let's
> +			 * figure out if we can migrate it.
> +			 *
> +			 * Is there any task with the same priority as that
> +			 * of current task? If found one, we should resched.
> +			 * NOTE: The target may be unpushable.
> +			 */
> +			if (p->prio == rq->rt.highest_prio.next) {
> +				/* One target just in pushable_tasks list. */
> +				requeue_task_rt(rq, p, 0);
> +				preempt_push = 1;
> +			} else if (rq->rt.rt_nr_total > 1) {
> +				struct task_struct *next;
> +
> +				requeue_task_rt(rq, p, 0);
> +				/* peek only */
> +				next = _pick_next_task_rt(rq, 1);
> +				if (next != p && next->prio == p->prio)
> +					preempt_push = 1;
> +			}

I'm thinking it would be better just to send an IPI to the CPU that
figures this out and pushes a task off of itself.

> +		} else if (!task_running(rq, p))
> +			direct_push = 1;
> +	}
>  
>  	/*
>  	 * Only update if the process changes its state from whether it
>  	 * can migrate or not.
>  	 */
> -	if ((p->nr_cpus_allowed > 1) == (weight > 1))
> -		return;
> -
> -	rq = task_rq(p);
> +	if ((old_weight > 1) == (new_weight > 1))
> +		goto out;
>  
>  	/*
>  	 * The process used to be able to migrate OR it can now migrate
>  	 */
> -	if (weight <= 1) {
> +	if (new_weight <= 1) {
>  		if (!task_current(rq, p))
>  			dequeue_pushable_task(rq, p);
>  		BUG_ON(!rq->rt.rt_nr_migratory);
> @@ -1919,6 +1961,13 @@ static void set_cpus_allowed_rt(struct task_struct *p,
>  	}
>  
>  	update_rt_migration(&rq->rt);
> +
> +out:
> +	if (direct_push)
> +		push_rt_tasks(rq);
> +
> +	if (preempt_push)
> +		resched_curr(rq);

I don't know. This just doesn't seem clean.

-- Steve

>  }
>  
>  /* Assumes rq->lock is held */
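
The remark above about cpupri_find() also returning a mask can be made concrete with a toy lookup. This is a sketch under loud assumptions: `toy_find_lowest` is a hypothetical function, not the kernel's cpupri_find(); priorities here are plain integers where a larger value means a more important task and 0 means idle; and CPU sets are bits in a `uint32_t`. It shows that a caller passing NULL for the mask gets only a yes/no answer, while a caller passing a mask could reuse it to choose a target CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/*
 * Hypothetical stand-in for a cpupri-style lookup. Returns true if some
 * allowed CPU runs at a strictly lower priority than task_prio, and, when
 * lowest_mask is non-NULL, stores the allowed CPUs at the lowest such level.
 */
static bool toy_find_lowest(const int cpu_prio[TOY_NR_CPUS], uint32_t allowed,
			    int task_prio, uint32_t *lowest_mask)
{
	int best = task_prio; /* only strictly lower levels qualify */
	uint32_t mask = 0;

	assert(cpu_prio);

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		uint32_t bit = UINT32_C(1) << cpu;

		if (!(allowed & bit))
			continue;
		if (cpu_prio[cpu] < best) {
			/* New lowest level: restart the mask. */
			best = cpu_prio[cpu];
			mask = bit;
		} else if (cpu_prio[cpu] == best && mask != 0) {
			/* Another CPU at the current lowest level. */
			mask |= bit;
		}
	}

	if (lowest_mask)
		*lowest_mask = mask;
	return mask != 0;
}
```

As the review notes, any mask obtained this way may be stale by the time the task actually migrates, so a caller would still need to revalidate the chosen CPU.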



* Re: [PATCH RESEND 2/2] sched/rt: Add check_preempt_equal_prio() logic in pick_next_task_rt()
  2015-02-04  1:12 ` [PATCH RESEND 2/2] sched/rt: Add check_preempt_equal_prio() logic in pick_next_task_rt() Xunlei Pang
@ 2015-02-04  3:17   ` Steven Rostedt
  2015-02-04 13:05     ` Xunlei Pang
  0 siblings, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2015-02-04  3:17 UTC (permalink / raw)
  To: Xunlei Pang; +Cc: linux-kernel, Peter Zijlstra, Juri Lelli, Xunlei Pang

On Wed,  4 Feb 2015 09:12:21 +0800
Xunlei Pang <xlpang@126.com> wrote:

> From: Xunlei Pang <pang.xunlei@linaro.org>
> 
> check_preempt_curr() doesn't call sched_class::check_preempt_curr
> when the class of current is a higher level. So if there is a DL
> task running when doing this for RT, check_preempt_equal_prio()
> will definitely miss, which may result in some response latency
> for this RT task if it is pinned and there're some same-priority
> migratable rt tasks already queued.
> 
> We should do the similar thing in select_task_rq_rt() when first
> picking rt tasks after running out of DL tasks.
> 
> This patch tackles the issue by peeking the next rt task(RT1), and
> if find RT1 migratable, just requeue it to the tail of the rq using
> requeue_task_rt(rq, p, 0). In this way:
> - If there do have another rt task(RT2) with the same priority as
>   RT1, RT2 will finally be picked as the running task. While RT1
>   will be pushed onto another cpu via RT1's post_schedule(), as
>   RT1 is migratable. The difference from check_preempt_equal_prio()
>   here is that we just don't care whether RT2 is migratable.
> 
> - Otherwise, if there's no rt task with the same priority as RT1,
>   RT1 will still be picked as the running task after the requeuing.
> 
> Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
> ---
>  kernel/sched/rt.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 4dacb6e..b2385ee 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1477,6 +1477,21 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev)
>  
>  	put_prev_task(rq, prev);
>  
> +#ifdef CONFIG_SMP
> +		/*
> +		 * If there's a running deadline task, check_preempt_curr()
> +		 * doesn't invoke check_preempt_curr_rt() for rt tasks, so
> +		 * we can do it here.
> +		 */

Why the strange indentation?

> +		if (prev->sched_class == &dl_sched_class &&
> +		    rq->rt.rt_nr_total > 1) {
> +			p = _pick_next_task_rt(rq, 1); /* peek only */

I hate the "peek only". Just split the function into two, where you
have something like check_next_task(rq) which does your "peek only"
and the __pick_next_task_rt() calls check_next_task() first and then
runs the rest of the code.

-- Steve

> +			if (p->nr_cpus_allowed != 1 &&
> +			    cpupri_find(&rq->rd->cpupri, p, NULL))
> +				requeue_task_rt(rq, p, 0);
> +		}
> +#endif
> +
>  	p = _pick_next_task_rt(rq, 0);
>  
>  	/* The running task is never eligible for pushing */



* Re: [PATCH RESEND 1/2] sched/rt: Check to push the task when changing its affinity
  2015-02-04  3:14 ` [PATCH RESEND 1/2] sched/rt: Check to push the task when changing its affinity Steven Rostedt
@ 2015-02-04 12:59   ` Xunlei Pang
  0 siblings, 0 replies; 6+ messages in thread
From: Xunlei Pang @ 2015-02-04 12:59 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Xunlei Pang, lkml, Peter Zijlstra, Juri Lelli

Hi Steve,

On 4 February 2015 at 11:14, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed,  4 Feb 2015 09:12:20 +0800
> Xunlei Pang <xlpang@126.com> wrote:
>
>> From: Xunlei Pang <pang.xunlei@linaro.org>
>>
>> +              */
>> +             cpumask_copy(&p->cpus_allowed, new_mask);
>> +             p->nr_cpus_allowed = new_weight;
>> +
>> +             if (task_running(rq, p) &&
>> +                 cpumask_test_cpu(task_cpu(p), new_mask) &&
>> +                 cpupri_find(&rq->rd->cpupri, p, NULL)) {
>
> Hmm, You called cpupri_find() which should also return a mask of the
> CPUs with the lowest priorities. I wonder if we could utilize this
> information instead of doing it twice? Of course things could change by
> the time the task migrates.

We do this only when the target task is running, so the only ways I
can think of to migrate it are resched_curr() or stop_one_cpu() :)
I think resched_curr() would be better.

>
>> +                     /*
>> +                      * At this point, current task gets migratable most
>> +                      * likely due to the change of its affinity, let's
>> +                      * figure out if we can migrate it.
>> +                      *
>> +                      * Is there any task with the same priority as that
>> +                      * of current task? If found one, we should resched.
>> +                      * NOTE: The target may be unpushable.
>> +                      */
>> +                     if (p->prio == rq->rt.highest_prio.next) {
>> +                             /* One target just in pushable_tasks list. */
>> +                             requeue_task_rt(rq, p, 0);
>> +                             preempt_push = 1;
>> +                     } else if (rq->rt.rt_nr_total > 1) {
>> +                             struct task_struct *next;
>> +
>> +                             requeue_task_rt(rq, p, 0);
>> +                             /* peek only */
>> +                             next = _pick_next_task_rt(rq, 1);
>> +                             if (next != p && next->prio == p->prio)
>> +                                     preempt_push = 1;
>> +                     }
>
> I'm thinking it would be better just to send an IPI to the CPU that
> figures this out and pushes a task off of itself.

My thought is that we should try our best not to disturb the running
task. Calling push_rt_tasks() directly here instead of sending an IPI
is similar to the logic in task_woken_rt().

>
>> +             } else if (!task_running(rq, p))
>> +                     direct_push = 1;
>> +     }
>>
>>       /*
>>        * Only update if the process changes its state from whether it
>>        * can migrate or not.
>>        */
>> -     if ((p->nr_cpus_allowed > 1) == (weight > 1))
>> -             return;
>> -
>> -     rq = task_rq(p);
>> +     if ((old_weight > 1) == (new_weight > 1))
>> +             goto out;
>>
>>       /*
>>        * The process used to be able to migrate OR it can now migrate
>>        */
>> -     if (weight <= 1) {
>> +     if (new_weight <= 1) {
>>               if (!task_current(rq, p))
>>                       dequeue_pushable_task(rq, p);
>>               BUG_ON(!rq->rt.rt_nr_migratory);
>> @@ -1919,6 +1961,13 @@ static void set_cpus_allowed_rt(struct task_struct *p,
>>       }
>>
>>       update_rt_migration(&rq->rt);
>> +
>> +out:
>> +     if (direct_push)
>> +             push_rt_tasks(rq);
>> +
>> +     if (preempt_push)
>> +             resched_curr(rq);
>
> I don't know. This just doesn't seem clean.
>

Thanks for your time; any suggestions would be helpful.

Regards,
Xunlei


* Re: [PATCH RESEND 2/2] sched/rt: Add check_preempt_equal_prio() logic in pick_next_task_rt()
  2015-02-04  3:17   ` Steven Rostedt
@ 2015-02-04 13:05     ` Xunlei Pang
  0 siblings, 0 replies; 6+ messages in thread
From: Xunlei Pang @ 2015-02-04 13:05 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Xunlei Pang, lkml, Peter Zijlstra, Juri Lelli

Hi Steve,

On 4 February 2015 at 11:17, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed,  4 Feb 2015 09:12:21 +0800
> Xunlei Pang <xlpang@126.com> wrote:
>
>> From: Xunlei Pang <pang.xunlei@linaro.org>
>>
>> check_preempt_curr() doesn't call sched_class::check_preempt_curr
>> when the class of current is a higher level. So if there is a DL
>> task running when doing this for RT, check_preempt_equal_prio()
>> will definitely miss, which may result in some response latency
>> for this RT task if it is pinned and there're some same-priority
>> migratable rt tasks already queued.
>>
>> We should do the similar thing in select_task_rq_rt() when first
>> picking rt tasks after running out of DL tasks.
>>
>> This patch tackles the issue by peeking the next rt task(RT1), and
>> if find RT1 migratable, just requeue it to the tail of the rq using
>> requeue_task_rt(rq, p, 0). In this way:
>> - If there do have another rt task(RT2) with the same priority as
>>   RT1, RT2 will finally be picked as the running task. While RT1
>>   will be pushed onto another cpu via RT1's post_schedule(), as
>>   RT1 is migratable. The difference from check_preempt_equal_prio()
>>   here is that we just don't care whether RT2 is migratable.
>>
>> - Otherwise, if there's no rt task with the same priority as RT1,
>>   RT1 will still be picked as the running task after the requeuing.
>>
>> Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
>> ---
>>  kernel/sched/rt.c | 15 +++++++++++++++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
>> index 4dacb6e..b2385ee 100644
>> --- a/kernel/sched/rt.c
>> +++ b/kernel/sched/rt.c
>> @@ -1477,6 +1477,21 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev)
>>
>>       put_prev_task(rq, prev);
>>
>> +#ifdef CONFIG_SMP
>> +             /*
>> +              * If there's a running deadline task, check_preempt_curr()
>> +              * doesn't invoke check_preempt_curr_rt() for rt tasks, so
>> +              * we can do it here.
>> +              */
>
> Why the strange indentation?
>

Thanks for catching this, I'll fix it.

>> +             if (prev->sched_class == &dl_sched_class &&
>> +                 rq->rt.rt_nr_total > 1) {
>> +                     p = _pick_next_task_rt(rq, 1); /* peek only */
>
> I hate the "peek only". Just split the function into two, where you
> have something like check_next_task(rq) which does your "peek only"
> and the __pick_next_task_rt() calls check_next_task() first and then
> runs the rest of the code.
>

This sounds good, I'll make a new peek_next_task_rt() as the base one.

Thanks,
Xunlei
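
The split agreed on above, a side-effect-free peek that the real pick builds on, might look roughly like the following in a toy setting. This is a sketch only: `toy_rt_task`, `toy_rt_rq`, `peek_next_task` and `pick_next_task` are hypothetical names, and the single `head` pointer replaces the kernel's walk through the RT entity hierarchy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy types; not the kernel's task_struct or rq. */
struct toy_rt_task {
	int prio;
	uint64_t exec_start;
};

struct toy_rt_rq {
	struct toy_rt_task *head; /* next task to run */
	uint64_t clock;           /* stands in for rq_clock_task(rq) */
};

/* Base helper: find the next task without modifying it. */
static struct toy_rt_task *peek_next_task(const struct toy_rt_rq *rq)
{
	assert(rq->head);
	return rq->head;
}

/* Real pick: peek first, then apply the accounting side effect. */
static struct toy_rt_task *pick_next_task(struct toy_rt_rq *rq)
{
	struct toy_rt_task *p = peek_next_task(rq);

	p->exec_start = rq->clock;
	return p;
}
```

Compared with a `peek_only` flag, each caller's intent is visible from the function name, and the peek path takes a const run queue so it cannot update `exec_start` by accident.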

