LKML Archive on lore.kernel.org help / color / mirror / Atom feed
From: Vincent Guittot <vincent.guittot@linaro.org> To: Josh Don <joshdon@google.com> Cc: Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>, Juri Lelli <juri.lelli@redhat.com>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Steven Rostedt <rostedt@goodmis.org>, Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>, Daniel Bristot de Oliveira <bristot@redhat.com>, Paul Turner <pjt@google.com>, Oleg Rombakh <olegrom@google.com>, Viresh Kumar <viresh.kumar@linaro.org>, Steve Sistare <steven.sistare@oracle.com>, Tejun Heo <tj@kernel.org>, Rik van Riel <riel@surriel.com>, linux-kernel <linux-kernel@vger.kernel.org> Subject: Re: [PATCH 2/2] sched: adjust SCHED_IDLE interactions Date: Fri, 13 Aug 2021 14:43:12 +0200 [thread overview] Message-ID: <CAKfTPtB=ao5yrE3OtEj7mZYPNeMGCEB4rGMRb=vN5QfF=ySGiw@mail.gmail.com> (raw) In-Reply-To: <CABk29Ns8P9AGy7Tpo6duOeEh=ZFWM1jO8FnvhZhktfcA0GWOpw@mail.gmail.com> On Thu, 12 Aug 2021 at 23:09, Josh Don <joshdon@google.com> wrote: > > > > @@ -697,8 +699,18 @@ static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se) > > > slice = __calc_delta(slice, se->load.weight, load); > > > } > > > > > > - if (sched_feat(BASE_SLICE)) > > > - slice = max(slice, (u64)w); > > > + if (sched_feat(BASE_SLICE)) { > > > + /* > > > + * SCHED_IDLE entities are not subject to min_granularity if > > > + * they are competing with non SCHED_IDLE entities. As a result, > > > + * non SCHED_IDLE entities will have reduced latency to get back > > > + * on cpu, at the cost of increased context switch frequency of > > > + * SCHED_IDLE entities. > > > + */ > > > > Ensuring that the entity will have a minimum runtime has been added to > > ensure that we let enough time to move forward. > > If you exclude sched_idle entities from this min runtime, the > > sched_slice of an idle_entity will be really small. > > I don't have details of your example above but I can imagine that it's > > a 16 cpus system which means a sysctl_sched_min_granularity=3.75ms > > which explains the 4ms running time of an idle entity > > For a 16 cpus system, the sched_slice of an idle_entity in your > > example in the cover letter is: 6*(1+log2(16))*3/1027=87us. Of course > > this become even worse with more threads and cgroups or thread with > > ncie prio -19 > > > > This value is then used to set the next hrtimer event in SCHED_HRTICK > > and 87us is too small to make any progress > > > > The 1ms of your test comes from the tick which could be a good > > candidate for a min value or the > > normalized_sysctl_sched_min_granularity which has the advantage of not > > increasing with number of CPU > > Fair point, this shouldn't completely ignore min granularity. Something like > > unsigned int sysctl_sched_idle_min_granularity = NSEC_PER_MSEC; > > (and still only using this value instead of the default > min_granularity when the SCHED_IDLE entity is competing with normal > entities) Yes that looks like a good option Also note that with a NSEC_PER_MSEC default value, the sched_idle entity will most probably run 2 ticks instead of the 1 tick (HZ=1000) that you have with your proposal because a bit less than a full tick is accounted to the running thread (the time spent in interrupt is not accounted as an example) so sysctl_sched_idle_min_granularity of 1ms with HZ=1000 will most propably run 2 ticks. Instead you could reuse the default 750000ULL value of sched_idle_min_granularity That being said sysctl_sched_idle_min_granularity = normalized_sysctl_sched_min_granularity * scale_factor which means that normalized_sysctl_sched_min_granularity stays the same (750000ULL) whatever the number of cpus > > > > @@ -4216,7 +4228,15 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) > > > if (sched_feat(GENTLE_FAIR_SLEEPERS)) > > > thresh >>= 1; > > > > > > - vruntime -= thresh; > > > + /* > > > + * Don't give sleep credit to a SCHED_IDLE entity if we're > > > + * placing it onto a cfs_rq with non SCHED_IDLE entities. > > > + */ > > > + if (!se_is_idle(se) || > > > + cfs_rq->h_nr_running == cfs_rq->idle_h_nr_running) > > > > Can't this condition above create unfairness between idle entities ? > > idle thread 1 wake up while normal thread is running > > normal thread thread sleeps immediately after > > idle thread 2 wakes up just after and gets some credits compared to the 1st one. > > Yes, this sacrifices some idle<->idle fairness when there is a normal > thread that comes and goes. One alternative is to simply further > reduce thresh for idle entities. That will interfere with idle<->idle > fairness when there are no normal threads, which is why I opted for > the former. On second thought though, the former fairness issue seems > more problematic. Thoughts on applying a smaller sleep credit > threshold universally to idle entities? This one is a bit more complex to set. With adding 1, you favor the already runnable tasks by ensuring that they have or will run a slice during this period before sched_idle task But as soon as you subtract something to min_vruntime, the task will most probably be scheduled at the next tick if other tasks already run for a while (at least a sched period). If we use sysctl_sched_min_granularity for sched_idle tasks that wake up instead of sysctl_sched_latency, we will ensure that a sched_idle task will not preempt a normal task, which woke up few ms before, and we will keep some fairness for sched_idle task that sleeps compare to other. so a thresh of sysctl_sched_min_granularity (3.75ms with 16 cpus ) should not disturb your UC and keep some benefit for newly wake up sched_ide task
next prev parent reply other threads:[~2021-08-13 12:43 UTC|newest] Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-07-30 2:00 [PATCH v2 0/2] SCHED_IDLE extensions Josh Don 2021-07-30 2:00 ` [PATCH v2 1/2] sched: cgroup SCHED_IDLE support Josh Don 2021-08-03 2:14 ` jun qian 2021-08-03 20:37 ` Josh Don 2021-08-05 10:18 ` Peter Zijlstra 2021-08-05 17:13 ` Tejun Heo 2021-08-05 23:54 ` Josh Don 2021-08-11 13:48 ` Vincent Guittot 2021-08-23 9:26 ` [tip: sched/core] sched: Cgroup " tip-bot2 for Josh Don 2021-07-30 2:00 ` [PATCH 2/2] sched: adjust SCHED_IDLE interactions Josh Don 2021-08-11 13:31 ` Vincent Guittot 2021-08-12 21:09 ` Josh Don 2021-08-13 12:43 ` Vincent Guittot [this message] 2021-08-13 23:55 ` Josh Don 2021-08-16 12:31 ` Peter Zijlstra 2021-08-17 23:48 ` Josh Don 2021-08-16 12:52 ` Peter Zijlstra 2021-08-16 12:56 ` Vincent Guittot 2021-08-17 23:40 ` Josh Don 2021-08-16 12:31 ` Peter Zijlstra
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to='CAKfTPtB=ao5yrE3OtEj7mZYPNeMGCEB4rGMRb=vN5QfF=ySGiw@mail.gmail.com' \ --to=vincent.guittot@linaro.org \ --cc=bristot@redhat.com \ --cc=bsegall@google.com \ --cc=dietmar.eggemann@arm.com \ --cc=joshdon@google.com \ --cc=juri.lelli@redhat.com \ --cc=linux-kernel@vger.kernel.org \ --cc=mgorman@suse.de \ --cc=mingo@redhat.com \ --cc=olegrom@google.com \ --cc=peterz@infradead.org \ --cc=pjt@google.com \ --cc=riel@surriel.com \ --cc=rostedt@goodmis.org \ --cc=steven.sistare@oracle.com \ --cc=tj@kernel.org \ --cc=viresh.kumar@linaro.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).