From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752778AbbBYH5M (ORCPT ); Wed, 25 Feb 2015 02:57:12 -0500
Received: from out114-135.biz.mail.alibaba.com ([205.204.114.135]:47161 "EHLO out11.biz.mail.alibaba.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752666AbbBYH5K (ORCPT ); Wed, 25 Feb 2015 02:57:10 -0500
X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R141e4;FP=0|-1|-1|-1|0|-1|-1|-1;HT=r41f05015;MF=hillf.zj@alibaba-inc.com;PH=DS;RN=7;RT=7;SR=0;
Reply-To: "Hillf Danton"
From: "Hillf Danton"
To: "Steven Rostedt"
Cc: "Ingo Molnar", "Peter Zijlstra", "'Thomas Gleixner'", "'Clark Williams'", "'Mike Galbraith'", "linux-kernel"
Subject: Re: [RFC][PATCH v2] sched/rt: Use IPI to trigger RT task push migration instead of pulling
Date: Wed, 25 Feb 2015 15:56:21 +0800
Message-ID: <07af01d050d0$8ba39e80$a2eadb80$@alibaba-inc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Outlook 14.0
Thread-Index: AdBQz3GuxR79TSYvS9y26DTOf+Bohg==
Content-Language: zh-cn
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

> +static void try_to_push_tasks(void *arg)
> +{
> +	struct rt_rq *rt_rq = arg;
> +	struct rq *rq, *next_rq;
> +	int next_cpu = -1;
> +	int next_prio = MAX_PRIO + 1;
> +	int this_prio;
> +	int src_prio;
> +	int prio;
> +	int this_cpu;
> +	int success;
> +	int cpu;
> +
> +	/* Make sure we can see csd_cpu */
> +	smp_rmb();
> +
> +	this_cpu = rt_rq->push_csd_cpu;
> +
> +	/* Paranoid check */
> +	BUG_ON(this_cpu != smp_processor_id());
> +
> +	rq = cpu_rq(this_cpu);
> +
> +	/*
> +	 * If there's nothing to push here, then see if another queue
> +	 * can push instead.
> +	 */
> +	if (!has_pushable_tasks(rq))
> +		goto pass_the_ipi;
> +
> +	raw_spin_lock(&rq->lock);
> +	success = push_rt_task(rq);
> +	raw_spin_unlock(&rq->lock);
> +
> +	if (success)
> +		goto done;

Does the latency, 150us over a 20 hour run, go up if we goto done directly?

Hillf
> +
> +	/* Nothing was pushed, try another queue */
> +pass_the_ipi:
> +
> +	/*
> +	 * We use the priority that determined to send to this CPU
> +	 * even if the priority for this CPU changed. This is used
> +	 * to determine what other CPUs to send to, to keep from
> +	 * doing a ping pong from each CPU.
> +	 */
> +	this_prio = rt_rq->push_csd_prio;
> +	src_prio = rt_rq->highest_prio.curr;
> +
> +	for_each_cpu(cpu, rq->rd->rto_mask) {
> +		if (this_cpu == cpu)
> +			continue;
> +
> +		/*
> +		 * This function was called because some rq lowered its
> +		 * priority. It then searched for the highest priority
> +		 * rq that had overloaded tasks and sent an smp function
> +		 * call to that cpu to call this function to push its
> +		 * tasks. But when it got here, the task was either
> +		 * already pushed, or due to affinity, could not move
> +		 * the overloaded task.
> +		 *
> +		 * Now we need to see if there's another overloaded rq that
> +		 * has an RT task that can migrate to that CPU.
> +		 *
> +		 * We need to be careful, we do not want to cause a ping
> +		 * pong between this CPU and another CPU that has an RT task
> +		 * that can migrate, but not to the CPU that lowered its
> +		 * priority. Since the lowering priority CPU finds the highest
> +		 * priority rq to send to, we will ignore any rq that is of higher
> +		 * priority than this current one. That is, if a rq scheduled a
> +		 * task of higher priority, the schedule itself would do the
> +		 * push or pull then. We can safely ignore higher priority rqs.
> +		 * And if there's one that is the same priority, since the CPUS
> +		 * are searched in order we will ignore CPUS of the same priority
> +		 * unless the CPU number is greater than this CPU's number.
> +		 */
> +		next_rq = cpu_rq(cpu);
> +
> +		/* Use a single read for the next prio for decision making */
> +		prio = READ_ONCE(next_rq->rt.highest_prio.next);
> +
> +		/* Looking for highest priority */
> +		if (prio >= next_prio)
> +			continue;
> +
> +		/* Make sure that the rq can push to the source rq */
> +		if (prio >= src_prio)
> +			continue;
> +
> +		/* If the prio is higher than the current prio, ignore it */
> +		if (prio < this_prio)
> +			continue;
> +
> +		/*
> +		 * If the prio is equal to the current prio, only use it
> +		 * if the cpu number is greater than the current cpu.
> +		 * This prevents a ping pong effect.
> +		 */
> +		if (prio == this_prio && cpu < this_cpu)
> +			continue;
> +
> +		next_prio = prio;
> +		next_cpu = cpu;
> +	}
> +
> +	/* Nothing found, do nothing */
> +	if (next_cpu < 0)
> +		goto done;
> +
> +	/*
> +	 * Now we can not send another smp async function due to locking,
> +	 * use irq_work instead.
> +	 */
> +
> +	rt_rq->push_csd_cpu = next_cpu;
> +	rt_rq->push_csd_prio = next_prio;
> +
> +	/* Make sure the next cpu is seen on remote CPU */
> +	smp_mb();
> +
> +	irq_work_queue_on(&rt_rq->push_csd_work, next_cpu);
> +
> +	return;
> +
> +done:
> +	rt_rq->push_csd_pending = 0;
> +
> +	/* Now make sure the src CPU can see this update */
> +	smp_wmb();
> +}
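
For anyone following the pass_the_ipi discussion, the candidate selection in the quoted loop can be modelled in user space. This is only a rough sketch under stated assumptions, not the patch itself: pick_next_cpu(), next_prio_of[] and overloaded[] are hypothetical names that stand in for the rto_mask walk and the READ_ONCE() of rt.highest_prio.next, MAX_PRIO is hard-coded to 140, and a lower number means a higher priority, as in the kernel. Locking, barriers and the irq_work hand-off are left out.

```c
#include <assert.h>

/* Assumed to match the kernel's MAX_PRIO; lower value = higher priority. */
#define MAX_PRIO 140

/*
 * Hypothetical user-space model of the selection loop in the quoted
 * try_to_push_tasks().
 *
 *   next_prio_of[cpu]  models rt.highest_prio.next of that CPU's rq
 *   overloaded[cpu]    models membership in rd->rto_mask (non-zero = set)
 *   this_cpu/this_prio model push_csd_cpu/push_csd_prio
 *   src_prio           models highest_prio.curr of the source rt_rq
 *
 * Returns the CPU that would receive the next IPI, or -1 if none qualifies.
 */
int pick_next_cpu(const int *next_prio_of, const int *overloaded, int ncpus,
		  int this_cpu, int this_prio, int src_prio)
{
	int next_cpu = -1;
	int next_prio = MAX_PRIO + 1;
	int cpu;

	assert(ncpus >= 0);

	for (cpu = 0; cpu < ncpus; cpu++) {
		int prio;

		if (!overloaded[cpu] || cpu == this_cpu)
			continue;

		prio = next_prio_of[cpu];

		/* Keep only the highest priority candidate seen so far */
		if (prio >= next_prio)
			continue;

		/* The candidate must be able to push to the source rq */
		if (prio >= src_prio)
			continue;

		/* Ignore rqs of higher priority than the one that got us here */
		if (prio < this_prio)
			continue;

		/* Same priority: only CPUs numbered above this one qualify */
		if (prio == this_prio && cpu < this_cpu)
			continue;

		next_prio = prio;
		next_cpu = cpu;
	}

	return next_cpu;
}
```

With next priorities {50, 50, 60, 50}, all four CPUs overloaded, this_cpu = 1, this_prio = 50 and src_prio = 100, the sketch returns CPU 3: CPU 0 is dropped by the equal-priority tie-break, CPU 2 becomes a candidate at 60, and CPU 3 replaces it because 50 is the higher priority.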