From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 27 Aug 2021 12:43:14 -0700
From: Ricardo Neri
To: Vincent Guittot
Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Srikar Dronamraju, Nicholas Piggin, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Len Brown, Srinivas Pandruvada, Tim Chen, Aubrey Li, "Ravi V. Shankar", Ricardo Neri, "Rafael J.
Wysocki", Quentin Perret, "Joel Fernandes (Google)", linuxppc-dev@lists.ozlabs.org, linux-kernel, Aubrey Li, Daniel Bristot de Oliveira
Subject: Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance
Message-ID: <20210827194314.GA14720@ranerica-svr.sc.intel.com>
References: <20210810144145.18776-1-ricardo.neri-calderon@linux.intel.com> <20210810144145.18776-7-ricardo.neri-calderon@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.9.4 (2018-02-28)
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Aug 27, 2021 at 05:17:22PM +0200, Vincent Guittot wrote:
> On Fri, 27 Aug 2021 at 16:50, Peter Zijlstra wrote:
> >
> > On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> > > > +/**
> > > > + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> > > > + * @dst_cpu:	Destination CPU of the load balancing
> > > > + * @sds:	Load-balancing data with statistics of the local group
> > > > + * @sgs:	Load-balancing statistics of the candidate busiest group
> > > > + * @sg:	The candidate busiet group
> > > > + *
> > > > + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> > > > + * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
> > > > + * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
> > > > + * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
> > > > + *
> > > > + * If both @dst_cpu and @sg have SMT siblings, even the number of idle CPUs
> > > > + * between @sds::local and @sg. Thus, pull tasks from @sg if the difference
> > > > + * between the number of busy CPUs is 2 or more. If the difference is of 1,
> > > > + * only pull if @dst_cpu has higher priority.
> > > > + * If @sg does not have SMT siblings
> > > > + * only pull tasks if all of the SMT siblings of @dst_cpu are idle and @sg
> > > > + * has lower priority.
> > > > + */
> > > > +static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> > > > +				    struct sg_lb_stats *sgs,
> > > > +				    struct sched_group *sg)
> > > > +{
> > > > +#ifdef CONFIG_SCHED_SMT
> > > > +	bool local_is_smt, sg_is_smt;
> > > > +	int sg_busy_cpus;
> > > > +
> > > > +	local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> > > > +	sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> > > > +
> > > > +	sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
> > > > +
> > > > +	if (!local_is_smt) {
> > > > +		/*
> > > > +		 * If we are here, @dst_cpu is idle and does not have SMT
> > > > +		 * siblings. Pull tasks if candidate group has two or more
> > > > +		 * busy CPUs.
> > > > +		 */
> > > > +		if (sg_is_smt && sg_busy_cpus >= 2)
> > > > +			return true;
> > > > +
> > > > +		/*
> > > > +		 * @dst_cpu does not have SMT siblings. @sg may have SMT
> > > > +		 * siblings and only one is busy. In such case, @dst_cpu
> > > > +		 * can help if it has higher priority and is idle.
> > > > +		 */
> > > > +		return !sds->local_stat.group_util &&
> > >
> > > sds->local_stat.group_util can't be used to decide if a CPU or group
> > > of CPUs is idle. util_avg is usually not null when a CPU becomes idle
> > > and you can have to wait more than 300ms before it becomes Null
> > > At the opposite, the utilization of a CPU can be null but a task with
> > > null utilization has just woken up on it.
> > > Utilization is used to reflect the average work of the CPU or group of
> > > CPUs but not the current state
> >
> > If you want immediate idle, sgs->nr_running == 0 or sgs->idle_cpus ==
> > sgs->group_weight come to mind.
>
> yes, I have the same in mind

Thank you very much Vincent and Peter for the feedback! I'll look at using
these stats to determine immediate idle.
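
For reference, a minimal userspace sketch of the kind of check suggested
above. It assumes a cut-down stand-in for struct sg_lb_stats holding only
fields named in this thread; struct sg_lb_stats_sketch and sg_is_idle_now()
are hypothetical names used for illustration, not existing kernel code, and
the final patch may well look different.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for the scheduler's struct sg_lb_stats.
 * Only the fields discussed in this thread are modeled.
 */
struct sg_lb_stats_sketch {
	unsigned long group_util;	/* averaged utilization, lags the current state */
	unsigned int group_weight;	/* number of CPUs in the group */
	unsigned int idle_cpus;		/* number of CPUs that are idle right now */
};

/*
 * Hypothetical helper: report whether every CPU of the group is idle at
 * this instant. It deliberately ignores group_util, which can stay
 * non-zero for a while after the CPUs go idle, or be zero while a task
 * has just woken up.
 */
static bool sg_is_idle_now(const struct sg_lb_stats_sketch *sgs)
{
	return sgs->idle_cpus == sgs->group_weight;
}
```

With something like this, a group whose utilization has not yet decayed is
still treated as idle, and a group with zero utilization but a busy CPU is
not.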
> > >
> > > > +			sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> > > > +	}
> > > > +
> > > > +	/* @dst_cpu has SMT siblings. */
> > > > +
> > > > +	if (sg_is_smt) {
> > > > +		int local_busy_cpus = sds->local->group_weight -
> > > > +				      sds->local_stat.idle_cpus;
> > > > +		int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
> > > > +
> > > > +		/* Local can always help to even the number busy CPUs. */
> > >
> > > default behavior of the load balance already tries to even the number
> > > of idle CPUs.
> >
> > Right, but I suppose this is because we're trapped here and have to deal
> > with the SMT-SMT case too. Ricardo, can you clarify?
>
> IIUC, this function is used in sg_lb_stats to set
> sgs->group_asym_packing which is then used to set the group state to
> group_asym_packing and force asym migration.
> But if we only want to even the number of busy CPUs between the group,
> we should not need to set the group state to group_asym_packing

Yes, what Vincent describes is the intent. In that case, I think it is
probably not necessary to even the number of idle CPUs here.

> > >
> > > > +		if (busy_cpus_delta >= 2)
> > > > +			return true;
> > > > +
> > > > +		if (busy_cpus_delta == 1)
> > > > +			return sched_asym_prefer(dst_cpu,
> > > > +						 sg->asym_prefer_cpu);

I think we should keep this check in order to move tasks to higher-priority
CPUs.

> > > > +
> > > > +		return false;
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * @sg does not have SMT siblings. Ensure that @sds::local does not end
> > > > +	 * up with more than one busy SMT sibling and only pull tasks if there
> > > > +	 * are not busy CPUs. As CPUs move in and out of idle state frequently,
> > > > +	 * also check the group utilization to smoother the decision.
> > > > +	 */
> > > > +	if (!sds->local_stat.group_util)
> > >
> > > same comment as above about the meaning of group_util == 0

I will look into using the suggested statistics.

Thanks and BR,
Ricardo