LKML Archive on lore.kernel.org
From: Steven Rostedt <rostedt@goodmis.org>
To: Xunlei Pang <xlpang@126.com>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@gmail.com>,
	Xunlei Pang <pang.xunlei@linaro.org>
Subject: Re: [PATCH RESEND 1/2] sched/rt: Check to push the task when changing its affinity
Date: Tue, 3 Feb 2015 22:14:19 -0500	[thread overview]
Message-ID: <20150203221419.4428225e@grimm.local.home> (raw)
In-Reply-To: <1423012341-30265-1-git-send-email-xlpang@126.com>

On Wed,  4 Feb 2015 09:12:20 +0800
Xunlei Pang <xlpang@126.com> wrote:

> From: Xunlei Pang <pang.xunlei@linaro.org>
> 
> An rq can end up RT-overloaded purely because of task affinity, so
> when the affinity of a runnable RT task changes, we should check
> whether to trigger balancing; otherwise the change can cause
> unnecessary real-time response latency. Unfortunately, the current
> global RT scheduler triggers nothing in this case.
> 
> For example: on a 2-CPU system, two runnable FIFO tasks with the same
> rt_priority are bound to CPU0; call them rt1 (running) and rt2
> (runnable). CPU1 has no RT tasks. Someone then sets the affinity of
> rt2 to 0x3 (i.e. CPU0 and CPU1), but rt2 still cannot be scheduled
> until rt1 enters schedule(), which can cause significant response
> latency for rt2.
> 

I understand the issue you point out, but I have to be honest and say
that I really do not like this approach.

> So, when set_cpus_allowed_rt() detects such a case, check whether to
> trigger a push.
> 
> Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
> ---
>  kernel/sched/rt.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 59 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index f4d4b07..4dacb6e 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1428,7 +1428,7 @@ static struct sched_rt_entity *pick_next_rt_entity(struct rq *rq,
>  	return next;
>  }
>  
> -static struct task_struct *_pick_next_task_rt(struct rq *rq)
> +static struct task_struct *_pick_next_task_rt(struct rq *rq, int peek_only)
>  {

peek_only should be a bool, but don't worry about it; I don't think this
is needed anyway.

>  	struct sched_rt_entity *rt_se;
>  	struct task_struct *p;
> @@ -1441,7 +1441,8 @@ static struct task_struct *_pick_next_task_rt(struct rq *rq)
>  	} while (rt_rq);
>  
>  	p = rt_task_of(rt_se);
> -	p->se.exec_start = rq_clock_task(rq);
> +	if (!peek_only)
> +		p->se.exec_start = rq_clock_task(rq);
>  
>  	return p;
>  }
> @@ -1476,7 +1477,7 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev)
>  
>  	put_prev_task(rq, prev);
>  
> -	p = _pick_next_task_rt(rq);
> +	p = _pick_next_task_rt(rq, 0);
>  
>  	/* The running task is never eligible for pushing */
>  	dequeue_pushable_task(rq, p);
> @@ -1886,28 +1887,69 @@ static void set_cpus_allowed_rt(struct task_struct *p,
>  				const struct cpumask *new_mask)
>  {
>  	struct rq *rq;
> -	int weight;
> +	int old_weight, new_weight;
> +	int preempt_push = 0, direct_push = 0;
>  
>  	BUG_ON(!rt_task(p));
>  
>  	if (!task_on_rq_queued(p))
>  		return;
>  
> -	weight = cpumask_weight(new_mask);
> +	old_weight = p->nr_cpus_allowed;
> +	new_weight = cpumask_weight(new_mask);
> +
> +	rq = task_rq(p);
> +
> +	if (new_weight > 1 &&
> +	    rt_task(rq->curr) &&
> +	    !test_tsk_need_resched(rq->curr)) {
> +		/*
> +		 * Set new mask information to prepare pushing.
> +		 * It's safe to do this here.

Please explain why it is safe.

> +		 */
> +		cpumask_copy(&p->cpus_allowed, new_mask);
> +		p->nr_cpus_allowed = new_weight;
> +
> +		if (task_running(rq, p) &&
> +		    cpumask_test_cpu(task_cpu(p), new_mask) &&
> +		    cpupri_find(&rq->rd->cpupri, p, NULL)) {

Hmm, you called cpupri_find(), which also returns a mask of the CPUs
with the lowest priorities. I wonder if we could utilize that
information instead of computing it twice? Of course, things could
change by the time the task migrates.

> +			/*
> +			 * At this point, current task gets migratable most
> +			 * likely due to the change of its affinity, let's
> +			 * figure out if we can migrate it.
> +			 *
> +			 * Is there any task with the same priority as that
> +			 * of current task? If found one, we should resched.
> +			 * NOTE: The target may be unpushable.
> +			 */
> +			if (p->prio == rq->rt.highest_prio.next) {
> +				/* One target just in pushable_tasks list. */
> +				requeue_task_rt(rq, p, 0);
> +				preempt_push = 1;
> +			} else if (rq->rt.rt_nr_total > 1) {
> +				struct task_struct *next;
> +
> +				requeue_task_rt(rq, p, 0);
> +				/* peek only */
> +				next = _pick_next_task_rt(rq, 1);
> +				if (next != p && next->prio == p->prio)
> +					preempt_push = 1;
> +			}

I'm thinking it would be better to simply send an IPI to that CPU and
let it figure this out and push a task off of itself.

> +		} else if (!task_running(rq, p))
> +			direct_push = 1;
> +	}
>  
>  	/*
>  	 * Only update if the process changes its state from whether it
>  	 * can migrate or not.
>  	 */
> -	if ((p->nr_cpus_allowed > 1) == (weight > 1))
> -		return;
> -
> -	rq = task_rq(p);
> +	if ((old_weight > 1) == (new_weight > 1))
> +		goto out;
>  
>  	/*
>  	 * The process used to be able to migrate OR it can now migrate
>  	 */
> -	if (weight <= 1) {
> +	if (new_weight <= 1) {
>  		if (!task_current(rq, p))
>  			dequeue_pushable_task(rq, p);
>  		BUG_ON(!rq->rt.rt_nr_migratory);
> @@ -1919,6 +1961,13 @@ static void set_cpus_allowed_rt(struct task_struct *p,
>  	}
>  
>  	update_rt_migration(&rq->rt);
> +
> +out:
> +	if (direct_push)
> +		push_rt_tasks(rq);
> +
> +	if (preempt_push)
> +		resched_curr(rq);

I don't know. This just doesn't seem clean.

-- Steve

>  }
>  
>  /* Assumes rq->lock is held */


