From: Benjamin Segall <bsegall@google.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>,
	peterz@infradead.org, bristot@redhat.com,
	dietmar.eggemann@arm.com, joshdon@google.com,
	juri.lelli@redhat.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
	linux@rasmusvillemoes.dk, mgorman@suse.de, mingo@kernel.org,
	rostedt@goodmis.org, valentin.schneider@arm.com,
	vincent.guittot@linaro.org
Subject: Re: [PATCH 1/1] sched/fair: improve yield_to vs fairness
Date: Tue, 27 Jul 2021 11:57:13 -0700
Message-ID: <xm2635rza8l2.fsf@google.com>
In-Reply-To: <1acd7520-bd4b-d43d-302a-8dcacf6defa5@de.ibm.com> (Christian Borntraeger's message of "Mon, 26 Jul 2021 20:41:15 +0200")

Christian Borntraeger <borntraeger@de.ibm.com> writes:

> On 23.07.21 18:21, Mel Gorman wrote:
>> On Fri, Jul 23, 2021 at 02:36:21PM +0200, Christian Borntraeger wrote:
>>>> sched: Do not select highest priority task to run if it should be skipped
>>>>
>>>> <SNIP>
>>>>
>>>> index 44c452072a1b..ddc0212d520f 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>> @@ -4522,7 +4522,8 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
>>>>    			se = second;
>>>>    	}
>>>> -	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1) {
>>>> +	if (cfs_rq->next &&
>>>> +	    (cfs_rq->skip == left || wakeup_preempt_entity(cfs_rq->next, left) < 1)) {
>>>>    		/*
>>>>    		 * Someone really wants this to run. If it's not unfair, run it.
>>>>    		 */
>>>>
>>>
>>> I do see a reduction in ignored yields, but from a performance aspect for my
>>> testcases this patch does not provide a benefit, while the simple
>>> 	curr->vruntime += sysctl_sched_min_granularity;
>>> does.
>> I'm still not a fan because vruntime gets distorted. From the docs
>>     Small detail: on "ideal" hardware, at any time all tasks would have the same
>>     p->se.vruntime value --- i.e., tasks would execute simultaneously and no task
>>     would ever get "out of balance" from the "ideal" share of CPU time
>> If yield_to impacts this "ideal share" then it could have other
>> consequences.
>> I think your patch may be performing better in your test case because every
>> "wrong" task selected that is not the yield_to target gets penalised and
>> so the yield_to target gets pushed up the list.
>> 
>>> I still think that your approach is probably the cleaner one, any chance to improve this
>>> somehow?
>>>
>> Potentially. The patch was a bit off because while it noticed that skip
>> was not being obeyed, the fix was clumsy and isolated. The current flow is
>> 1. pick se == left as the candidate
>> 2. try pick a different se if the "ideal" candidate is a skip candidate
>> 3. Ignore the se update if next or last are set
>> Step 3 looks off because it ignores skip if next or last buddies are set
>> and I don't think that was intended. Can you try this?
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 44c452072a1b..d56f7772a607 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4522,12 +4522,12 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
>>   			se = second;
>>   	}
>> -	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1) {
>> +	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, se) < 1) {
>>   		/*
>>   		 * Someone really wants this to run. If it's not unfair, run it.
>>   		 */
>>   		se = cfs_rq->next;
>> -	} else if (cfs_rq->last && wakeup_preempt_entity(cfs_rq->last, left) < 1) {
>> +	} else if (cfs_rq->last && wakeup_preempt_entity(cfs_rq->last, se) < 1) {
>>   		/*
>>   		 * Prefer last buddy, try to return the CPU to a preempted task.
>>   		 */
>> 
>
> This one alone does not seem to make a difference. Neither in ignored yield, nor
> in performance.
>
> Your first patch does really help in terms of ignored yields when
> all threads are pinned to one host CPU. After that we seem to have no ignored
> yields at all. But it does not affect the performance of my testcase.
> I did some more experiments and I removed the wakeup_preempt_entity checks in
> pick_next_entity - assuming that this will result in source always being stopped
> and target always being picked. But still, no performance difference.
> As soon as I play with vruntime I do see a difference (but only without the cpu cgroup
> controller). I will try to better understand the scheduler logic and do some more
> testing. If you have anything that I should test, let me know.
>
> Christian

If both yielder and target are in the same cpu cgroup or the cpu cgroup
controller is disabled (i.e., if cfs_rq_of(&p->se) is the same for both),
you could try

	if (p->se.vruntime > rq->curr->se.vruntime)
		swap(p->se.vruntime, rq->curr->se.vruntime);

in addition to the existing buddy flags, as an entirely fair vruntime boost
to the target.
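
As a minimal userspace sketch of the swap described above (not kernel code):
struct toy_se and toy_yield_swap_vruntime() are hypothetical names that model
only the vruntime field. The comparison uses a signed delta, as the kernel's
entity_before() does, so that it stays correct if the counter wraps; the
snippet in the mail uses a plain ">". The swap only exchanges the two values,
so the sum of both vruntimes is unchanged, which is the sense in which the
boost is fair.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct sched_entity; only vruntime is modelled. */
struct toy_se {
	uint64_t vruntime;
};

/*
 * Hypothetical helper: if the yield target has the larger vruntime
 * (it would be picked later than the yielder), exchange the two values.
 * Returns true if a swap took place.
 */
static bool toy_yield_swap_vruntime(struct toy_se *curr, struct toy_se *target)
{
	/* Signed delta keeps the comparison valid across wraparound. */
	if ((int64_t)(target->vruntime - curr->vruntime) > 0) {
		uint64_t tmp = target->vruntime;

		target->vruntime = curr->vruntime;
		curr->vruntime = tmp;
		return true;
	}
	return false;
}
```

With curr at 100 and target at 250, one call leaves curr at 250 and target at
100; a second call does nothing because the target is already ahead.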

For when they aren't direct siblings, you /could/ use find_matching_se,
but it's much less clear that's desirable, since it would yield vruntime
for the entire hierarchy to the target's hierarchy.
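
To illustrate why the non-sibling case is less clear, here is a toy userspace
model of the walk that find_matching_se performs. It is a sketch under
assumptions: struct toy_rq, struct toy_entity and toy_find_matching() are
hypothetical names, and the layout is simplified from the kernel's. Both
entities are moved up their parent chains until they sit on the same runqueue.

```c
#include <stddef.h>

/* Toy runqueue; the id is only there to tell instances apart. */
struct toy_rq {
	int id;
};

/* Toy scheduling entity (a task or a group). */
struct toy_entity {
	int depth;                 /* 0 for entities on the root runqueue */
	struct toy_entity *parent; /* group entity one level up, NULL at root */
	struct toy_rq *rq;         /* runqueue this entity is queued on */
};

/*
 * Hypothetical helper: walk both entities up until they share a runqueue.
 * Assumes a consistent tree in which all depth-0 entities share the root rq.
 */
static void toy_find_matching(struct toy_entity **a, struct toy_entity **b)
{
	/* First bring both entities to the same depth. */
	while ((*a)->depth > (*b)->depth)
		*a = (*a)->parent;
	while ((*b)->depth > (*a)->depth)
		*b = (*b)->parent;

	/* Then climb in lockstep until they are siblings. */
	while ((*a)->rq != (*b)->rq) {
		*a = (*a)->parent;
		*b = (*b)->parent;
	}
}
```

For two tasks in different groups the walk ends at the group entities, not at
the tasks, so a vruntime swap at that level would move time between whole
groups rather than between the two tasks.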
