LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: "Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>
To: Mel Gorman <mgorman@techsingularity.net>,
	LKML <linux-kernel@vger.kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Aubrey Li <aubrey.li@linux.intel.com>,
	yangyicong <yangyicong@huawei.com>
Subject: RE: [PATCH 7/9] sched/fair: Enforce proportional scan limits when scanning for an idle core
Date: Mon, 2 Aug 2021 10:52:01 +0000	[thread overview]
Message-ID: <58167022b9074ed9951b09ab6ba1983e@hisilicon.com> (raw)
In-Reply-To: <20210726102247.21437-8-mgorman@techsingularity.net>



> -----Original Message-----
> From: Mel Gorman [mailto:mgorman@techsingularity.net]
> Sent: Monday, July 26, 2021 10:23 PM
> To: LKML <linux-kernel@vger.kernel.org>
> Cc: Ingo Molnar <mingo@kernel.org>; Peter Zijlstra <peterz@infradead.org>;
> Vincent Guittot <vincent.guittot@linaro.org>; Valentin Schneider
> <valentin.schneider@arm.com>; Aubrey Li <aubrey.li@linux.intel.com>; Mel
> Gorman <mgorman@techsingularity.net>
> Subject: [PATCH 7/9] sched/fair: Enforce proportional scan limits when scanning
> for an idle core
> 
> When scanning for a single CPU, the scan is limited based on the estimated
> average idle time for a domain to reduce the risk that more time is spent
> scanning for idle CPUs than we are idle for.
> 
> With SMT, if an idle core is expected to exist there is no scan depth
> limits so the scan depth may or may not be related to average idle time.
> Unfortunately has_idle_cores can be very inaccurate when workloads are
> rapidly entering/exiting idle (e.g. hackbench).
> 
> As the scan depth is now proportional to cores and not CPUs, enforce
> SIS_PROP for idle core scans.
> 
> The performance impact of this is variable and is neither a universal
> gain nor loss. In some cases, has_idle_cores will be cleared prematurely
> because the whole domain was not scanned but has_idle_cores is already
> known to be an inaccurate heuristic. There is also additional cost because
> time calculations are made even for an idle core scan and the delta is
> calculated for both scan successes and failures. Finally, SMT siblings
> may be used prematurely due to scan depth limitations.
> 
> On the flip side, scan depth is now consistent for both core and smt
> scans. The reduction in scan depth improves performance in some cases
> and wakeup latency is reduced in some cases.
> 
> There were few changes identified in the SIS statistics but notably,
> "SIS Core Hit" was slightly reduced in tbench as thread counts increased,
> presumably due to the core search depth being throttled.
> 
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  kernel/sched/fair.c | 33 +++++++++++++++++++--------------
>  1 file changed, 19 insertions(+), 14 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 20b9255ebf97..b180205e6b25 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6232,7 +6232,7 @@ static int select_idle_cpu(struct task_struct *p, struct
> sched_domain *sd, bool
> 
>  	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> 
> -	if (sched_feat(SIS_PROP) && !has_idle_core) {
> +	if (sched_feat(SIS_PROP)) {
>  		u64 avg_cost, avg_idle, span_avg;
>  		unsigned long now = jiffies;
> 
> @@ -6265,30 +6265,35 @@ static int select_idle_cpu(struct task_struct *p, struct
> sched_domain *sd, bool
>  		if (has_idle_core) {
>  			i = select_idle_core(p, cpu, cpus, &idle_cpu);
>  			if ((unsigned int)i < nr_cpumask_bits)
> -				return i;
> +				break;
> 
> +			nr -= sched_smt_weight;
>  		} else {
> -			if (!--nr)
> -				return -1;
>  			idle_cpu = __select_idle_cpu(cpu, p);
>  			if ((unsigned int)idle_cpu < nr_cpumask_bits)
>  				break;
> +			nr--;
>  		}
> +
> +		if (nr < 0)
> +			break;
>  	}
> 
> -	if (has_idle_core)
> -		set_idle_cores(target, false);
> +	if ((unsigned int)idle_cpu < nr_cpumask_bits) {
> +		if (has_idle_core)
> +			set_idle_cores(target, false);
> 

For example, if we have 16 cpus(8 SMT2 cores). In case core7 is idle,
we only have scanned core0+core1(cpu0-cpu3) and if these two cores
are not idle, but here we set has_idle_cores to false while core7 is
idle. It seems incorrect.

> -	if (sched_feat(SIS_PROP) && !has_idle_core) {
> -		time = cpu_clock(this) - time;
> +		if (sched_feat(SIS_PROP)) {
> +			time = cpu_clock(this) - time;
> 
> -		/*
> -		 * Account for the scan cost of wakeups against the average
> -		 * idle time.
> -		 */
> -		this_rq->wake_avg_idle -= min(this_rq->wake_avg_idle, time);
> +			/*
> +			 * Account for the scan cost of wakeups against the average
> +			 * idle time.
> +			 */
> +			this_rq->wake_avg_idle -= min(this_rq->wake_avg_idle, time);
> 
> -		update_avg(&this_sd->avg_scan_cost, time);
> +			update_avg(&this_sd->avg_scan_cost, time);
> +		}
>  	}
> 
>  	return idle_cpu;
> --
> 2.26.2


Thanks
Barry


  reply	other threads:[~2021-08-02 10:52 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-26 10:22 [RFC PATCH 0/9] Modify and/or delete SIS_PROP Mel Gorman
2021-07-26 10:22 ` [PATCH 1/9] sched/fair: Track efficiency of select_idle_sibling Mel Gorman
2021-07-26 10:22 ` [PATCH 2/9] sched/fair: Track efficiency of task recent_used_cpu Mel Gorman
2021-07-26 10:22 ` [PATCH 3/9] sched/fair: Track efficiency of select_idle_core Mel Gorman
2021-07-26 10:22 ` [PATCH 4/9] sched/fair: Use prev instead of new target as recent_used_cpu Mel Gorman
2021-07-26 10:22 ` [PATCH 5/9] sched/fair: Avoid a second scan of target in select_idle_cpu Mel Gorman
2021-07-26 10:22 ` [PATCH 6/9] sched/fair: Make select_idle_cpu() proportional to cores Mel Gorman
2021-07-26 10:22 ` [PATCH 7/9] sched/fair: Enforce proportional scan limits when scanning for an idle core Mel Gorman
2021-08-02 10:52   ` Song Bao Hua (Barry Song) [this message]
2021-08-04 10:22     ` Mel Gorman
2021-07-26 10:22 ` [PATCH 8/9] sched/fair: select idle cpu from idle cpumask for task wakeup Mel Gorman
2021-08-02 10:41   ` Song Bao Hua (Barry Song)
2021-08-04 10:26     ` Mel Gorman
2021-08-05  0:23       ` Aubrey Li
2021-09-17  3:44         ` Barry Song
2021-09-17  4:15         ` Barry Song
2021-09-17  9:11           ` Aubrey Li
2021-09-17 13:35             ` Mel Gorman
2021-07-26 10:22 ` [PATCH 9/9] sched/core: Delete SIS_PROP and rely on the idle cpu mask Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58167022b9074ed9951b09ab6ba1983e@hisilicon.com \
    --to=song.bao.hua@hisilicon.com \
    --cc=aubrey.li@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    --cc=yangyicong@huawei.com \
    --subject='RE: [PATCH 7/9] sched/fair: Enforce proportional scan limits when scanning for an idle core' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).