LKML Archive on lore.kernel.org
From: Venkatesh Pallipadi <venki@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>, linux-kernel@vger.kernel.org,
	Paul Turner <pjt@google.com>,
	Suresh Siddha <suresh.b.siddha@intel.com>,
	Mike Galbraith <efault@gmx.de>
Subject: Re: [PATCH] sched: Resolve sd_idle and first_idle_cpu Catch-22 - v1
Date: Mon, 7 Feb 2011 10:21:58 -0800
Message-ID: <AANLkTi=xo8bq6uTjYQ7yxJ69ATry1Xs3q0deE7UN9jPj@mail.gmail.com>
In-Reply-To: <1297086642.13327.15.camel@laptop>

On Mon, Feb 7, 2011 at 5:50 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, 2011-02-04 at 13:25 -0800, Venkatesh Pallipadi wrote:
>> Consider a system with { [ (A B) (C D) ] [ (E F) (G H) ] },
>> () denoting SMT siblings, [] cores on same socket and {} system wide
>> Further, A, C and D are idle, B is busy and one of EFGH has excess load.
>>
>> With sd_idle logic, a check in rebalance_domains() converts tick
>> based load balance requests from CPU A to busy load balance for core
>> and above domains (lower rate of balance and higher load_idx).
>
> the
>	if (load_balance())
>		idle = CPU_NOT_IDLE;
> bit, right?
>
>> With first_idle_cpu logic, when CPU C or D tries to balance across domains
>> the logic finds CPU A as first idle CPU in the group and nominates CPU A to
>> idle balance across sockets.
>
> Right..
>
>> But, sd_idle above would not allow CPU A to do cross socket idle balance
>> as CPU A switches its higher level balancing to busy balance.
>
> Because it fails the sd->flags & SD_SHARE_CPUPOWER test at the beginning
> of load_balance() and hence sd_idle will remain 0, right?
>
> I'm just not quite sure how we then end up returning !0 for
> load_balance(), both branches returning -1 seem conditional on
> SD_SHARE_CPUPOWER but the [ (A B) (C D) ], domain doesn't have that set.
>

For the (A B) domain, SD_SHARE_CPUPOWER is set, and when A finds that B is
busy, it sets its sd_idle to 1 during its SMT sibling balance (once
every 2-4 ticks) and load_balance() returns -1 in this case.
rebalance_domains() then sees this -1 and makes the load_balance() calls
for the CORE and NUMA domains as CPU_NOT_IDLE, thus increasing the
load_balance period.

>> So, this can result in no cross socket balancing for extended periods.
>
> Which is bad
>
>> The fix here adds additional check to detect sd_idle logic in
>> first_idle_cpu code path. We will now nominate (in order of preference):
>> * First fully idle CPU
>> * First semi-idle CPU
>> * First CPU
>>
>> Note that this solution works fine for 2 SMT siblings case and won't be
>> perfect in picking proper semi-idle in case of more than 2 SMT threads.
>
> All these SMT exceptions make my head hurt, can't we clean that up
> instead of making them worse?
>
> Why is SMT treated differently from say a shared cache? In both cases we
> want to spread the load as wide as possible to provide as much of the
> resources to the few runnable tasks.
>

IIRC, the reason for the whole sd_idle part was to have less
aggressive load balance when one SMT sibling is busy and the other is idle,
in order not to take CPU cycles away from the busy sibling. Suresh
will know the exact reasoning behind this and which CPUs and which
workload this helped.

Since this patch, I started looking at sd_idle more closely and I have
two other patches fixing problems related to sd_idle in its current
form. I agree that the sd_idle stuff is making the code a lot more
complicated and we can clean/remove it.

Thanks,
Venki
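
For readers following the thread, the nomination order from the patch description can be illustrated with a toy model. This is only a sketch in Python, not kernel code: the `Cpu` type and the functions `build_group`, `nominate_first_idle` and `nominate_with_fix` are hypothetical names made up for this example. "Semi-idle" is assumed here to mean an idle CPU whose SMT sibling is busy, which matches the 2-sibling case discussed above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Cpu:
    name: str
    idle: bool
    sibling_busy: bool  # True if any SMT sibling on the same core is busy


def build_group(cores: List[List[Tuple[str, bool]]]) -> List[Cpu]:
    """Flatten cores of (name, idle) pairs into one group, in CPU order."""
    group = []
    for core in cores:
        for name, idle in core:
            sibling_busy = any(
                not other_idle for other, other_idle in core if other != name
            )
            group.append(Cpu(name, idle, sibling_busy))
    return group


def nominate_first_idle(group: List[Cpu]) -> Cpu:
    """Behaviour described as first_idle_cpu: first idle CPU, else first CPU."""
    for cpu in group:
        if cpu.idle:
            return cpu
    return group[0]


def nominate_with_fix(group: List[Cpu]) -> Cpu:
    """Order from the patch description: fully idle, then semi-idle, then first."""
    for cpu in group:
        if cpu.idle and not cpu.sibling_busy:
            return cpu  # first fully idle CPU
    for cpu in group:
        if cpu.idle:
            return cpu  # first semi-idle CPU
    return group[0]  # first CPU


# Socket [ (A B) (C D) ] from the example: A, C and D idle, B busy.
socket0 = build_group([[("A", True), ("B", False)], [("C", True), ("D", True)]])

print(nominate_first_idle(socket0).name)  # A
print(nominate_with_fix(socket0).name)    # C
```

In this model the first-idle rule picks A, which is exactly the CPU whose higher-level balancing has been switched to busy balance because B is busy. The preference order from the patch picks C instead, which can still idle balance across sockets.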
Thread overview: 24+ messages

2011-02-04 20:51 [PATCH] sched: Resolve sd_idle and first_idle_cpu Catch-22 Venkatesh Pallipadi
2011-02-04 21:25 ` [PATCH] sched: Resolve sd_idle and first_idle_cpu Catch-22 - v1 Venkatesh Pallipadi
2011-02-07 13:50 ` Peter Zijlstra
2011-02-07 18:21 ` Venkatesh Pallipadi [this message]
2011-02-07 19:53 ` Suresh Siddha
2011-02-08 17:37 ` Venkatesh Pallipadi
2011-02-08 18:13 ` Misc sd_idle related fixes Venkatesh Pallipadi
2011-02-09  9:29 ` Peter Zijlstra
2011-02-10 17:24 ` Venkatesh Pallipadi
2011-02-08 18:13 ` [PATCH 1/3] sched: Resolve sd_idle and first_idle_cpu Catch-22 Venkatesh Pallipadi
2011-02-08 18:13 ` [PATCH 2/3] sched: fix_up broken SMT load balance dilation Venkatesh Pallipadi
2011-02-08 18:13 ` [PATCH 3/3] sched: newidle balance set idle_timestamp only on successful pull Venkatesh Pallipadi
2011-02-09  3:37 ` Mike Galbraith
2011-02-09 15:55 ` [PATCH] sched: Resolve sd_idle and first_idle_cpu Catch-22 - v1 Peter Zijlstra
2011-02-12  1:20 ` Suresh Siddha
2011-02-14 22:38 ` [PATCH] sched: Wholesale removal of sd_idle logic Venkatesh Pallipadi
2011-02-15 17:01 ` Vaidyanathan Srinivasan
2011-02-15 18:26 ` Venkatesh Pallipadi
2011-02-16  8:53 ` Vaidyanathan Srinivasan
2011-02-16 11:43 ` Peter Zijlstra
2011-02-16 13:50 ` [tip:sched/core] " tip-bot for Venkatesh Pallipadi
2011-02-15  9:15 ` [PATCH] sched: Resolve sd_idle and first_idle_cpu Catch-22 - v1 Peter Zijlstra
2011-02-15 19:11 ` Suresh Siddha
2011-02-18  1:05 ` Alex,Shi