From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752644AbYJYExE (ORCPT ); Sat, 25 Oct 2008 00:53:04 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1750840AbYJYEwy (ORCPT ); Sat, 25 Oct 2008 00:52:54 -0400 Received: from E23SMTP01.au.ibm.com ([202.81.18.162]:43484 "EHLO e23smtp01.au.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750762AbYJYEwx (ORCPT ); Sat, 25 Oct 2008 00:52:53 -0400 Date: Sat, 25 Oct 2008 10:22:38 +0530 From: Gautham R Shenoy To: "Paul E. McKenney" Cc: Thomas Gleixner , Peter Zijlstra , Ingo Molnar , linux-kernel@vger.kernel.org Subject: Re: [PATCH] timers: Handle HRTIMER_CB_IRQSAFE_UNLOCKED correctly from softirq context Message-ID: <20081025045238.GA27934@in.ibm.com> Reply-To: ego@in.ibm.com References: <20081023215003.GC1480@in.ibm.com> <20081025023303.GD6248@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20081025023303.GD6248@linux.vnet.ibm.com> User-Agent: Mutt/1.5.17 (2007-11-01) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Oct 24, 2008 at 07:33:03PM -0700, Paul E. McKenney wrote: > On Fri, Oct 24, 2008 at 03:20:03AM +0530, Gautham R Shenoy wrote: > > timers: Handle HRTIMER_CB_IRQSAFE_UNLOCKED correctly from softirq context > > > > From: Gautham R Shenoy > > > > While migrating the the CB_IRQSAFE_UNLOCKED timers during a cpu-offline, > > we queue them on the cb_pending list, so that they won't go > > stale. > > > > Thus, when the callbacks of the timers run from the softirq context, > > they could run into potential deadlocks, since these callbacks > > assume that they're running with irq's disabled, thereby annoying > > lockdep (see below)! > > > > Fix this by emulating hardirq context while running these callbacks from > > the hrtimer softirq. > > Good catch!!! One confusion (probably on my part) below. Hmm.. Let's see... > > > diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c > > index cdec83e..60aaad6 100644 > > --- a/kernel/hrtimer.c > > +++ b/kernel/hrtimer.c > > @@ -1188,6 +1188,7 @@ static void run_hrtimer_pending(struct hrtimer_cpu_base *cpu_base) > > enum hrtimer_restart (*fn)(struct hrtimer *); > > struct hrtimer *timer; > > int restart; > > + int emulate_hardirq_ctx = 0; > > > > timer = list_entry(cpu_base->cb_pending.next, > > struct hrtimer, cb_entry); > > @@ -1196,12 +1197,26 @@ static void run_hrtimer_pending(struct hrtimer_cpu_base *cpu_base) > > timer_stats_account_hrtimer(timer); > > > > fn = timer->function; > > + /* > > + * A timer might have been added to the cb_pending list > > + * when it was migrated during a cpu-offline operation. > > + * Emulate hardirq context for such timers. > > + */ > > + if (timer->cb_mode == HRTIMER_CB_IRQSAFE_PERCPU || > > + timer->cb_mode == HRTIMER_CB_IRQSAFE_UNLOCKED) > > + emulate_hardirq_ctx = 1; > > Is this the case where we need to emulate a hardirq context? Indeed it is. > > > + > > __remove_hrtimer(timer, timer->base, HRTIMER_STATE_CALLBACK, 0); > > - spin_unlock_irq(&cpu_base->lock); ^^^^^^^^^^ ----------------------------------------(A) > > + spin_unlock(&cpu_base->lock); > > > > - restart = fn(timer); > > + if (likely(!emulate_hardirq_ctx)) { > > If so, why the "!" above? Since this function runs from the HRTIMER_SOFTIRQ, it enables interrupts before calling the hrtimer-callback function. This has been the original behaviour (see A above) We would like to retain this behavior for the normal cases where we don't want to emulate hard-irq context. So if you see below, all we're doing in the normal case is enabling the interrupts before calling the callback in the normal case, and keeping the interrupts disabled when we want to emulate the hard-irq case. Because HRTIMER_CB_IRQSAFE_UNLOCKED callbacks assume that they run with interrupts disabled. > > Or am I misinterpreting the name of the variable? > > Thanx, Paul > > > + local_irq_enable(); ^^^^^^^^ ----------------------- (B) > > + restart = fn(timer); > > + local_irq_disable(); > > + } else > > + restart = fn(timer); > > Probably it may appear as though we're doing something special for the "!" case, but infact it's the opposite. Hmmm, how about the following instead ? Thanks and Regards gautham. ---------> timers: Handle HRTIMER_CB_IRQSAFE_UNLOCKED correctly from softirq context From: Gautham R Shenoy While migrating the the CB_IRQSAFE_UNLOCKED timers during a cpu-offline, we queue them on the cb_pending list, so that they won't go stale. Thus, when the callbacks of the timers run from the softirq context, they could run into potential deadlocks, since these callbacks assume that they're running with irq's disabled, thereby annoying lockdep! Fix this by emulating hardirq context while running these callbacks from the hrtimer softirq. ================================= [ INFO: inconsistent lock state ] 2.6.27 #2 -------------------------------- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage. ksoftirqd/0/4 [HC0[0]:SC1[1]:HE1:SE0] takes: (&rq->lock){++..}, at: [] sched_rt_period_timer+0x9e/0x1fc {in-hardirq-W} state was registered at: [] __lock_acquire+0x549/0x121e [] native_sched_clock+0x88/0x99 [] clocksource_get_next+0x39/0x3f [] update_wall_time+0x616/0x7df [] lock_acquire+0x5a/0x74 [] scheduler_tick+0x3a/0x18d [] _spin_lock+0x1c/0x45 [] scheduler_tick+0x3a/0x18d [] scheduler_tick+0x3a/0x18d [] update_process_times+0x3a/0x44 [] tick_periodic+0x63/0x6d [] tick_handle_periodic+0x14/0x5e [] timer_interrupt+0x44/0x4a [] handle_IRQ_event+0x13/0x3d [] handle_level_irq+0x79/0xbd [] do_IRQ+0x69/0x7d [] common_interrupt+0x28/0x30 [] aac_probe_one+0x1a3/0x3f3 [] _spin_unlock_irqrestore+0x36/0x39 [] setup_irq+0x1be/0x1f9 [] start_kernel+0x259/0x2c5 [] 0xffffffff irq event stamp: 50102 hardirqs last enabled at (50102): [] _spin_unlock_irq+0x20/0x23 hardirqs last disabled at (50101): [] _spin_lock_irq+0xa/0x4b softirqs last enabled at (50088): [] do_softirq+0x37/0x4d softirqs last disabled at (50099): [] do_softirq+0x37/0x4d other info that might help us debug this: no locks held by ksoftirqd/0/4. stack backtrace: Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.27 #2 [] print_usage_bug+0x13e/0x147 [] mark_lock+0x493/0x797 [] __lock_acquire+0x5be/0x121e [] lock_acquire+0x5a/0x74 [] sched_rt_period_timer+0x9e/0x1fc [] _spin_lock+0x1c/0x45 [] sched_rt_period_timer+0x9e/0x1fc [] sched_rt_period_timer+0x9e/0x1fc [] finish_task_switch+0x41/0xbd [] native_sched_clock+0x88/0x99 [] sched_rt_period_timer+0x0/0x1fc [] run_hrtimer_pending+0x54/0xe5 [] sched_rt_period_timer+0x0/0x1fc [] __do_softirq+0x7b/0xef [] do_softirq+0x37/0x4d [] ksoftirqd+0x56/0xc5 [] ksoftirqd+0x0/0xc5 [] kthread+0x38/0x5d [] kthread+0x0/0x5d [] kernel_thread_helper+0x7/0x10 ======================= Signed-off-by: Gautham R Shenoy Cc: Thomas Gleixner Cc: Peter Zijlstra --- kernel/hrtimer.c | 17 ++++++++++++++++- 1 files changed, 16 insertions(+), 1 deletions(-) diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c index cdec83e..f49bc14 100644 --- a/kernel/hrtimer.c +++ b/kernel/hrtimer.c @@ -1188,6 +1188,7 @@ static void run_hrtimer_pending(struct hrtimer_cpu_base *cpu_base) enum hrtimer_restart (*fn)(struct hrtimer *); struct hrtimer *timer; int restart; + int emulate_hardirq_ctx = 0; timer = list_entry(cpu_base->cb_pending.next, struct hrtimer, cb_entry); @@ -1196,10 +1197,24 @@ static void run_hrtimer_pending(struct hrtimer_cpu_base *cpu_base) timer_stats_account_hrtimer(timer); fn = timer->function; + /* + * A timer might have been added to the cb_pending list + * when it was migrated during a cpu-offline operation. + * Emulate hardirq context for such timers. + */ + if (timer->cb_mode == HRTIMER_CB_IRQSAFE_PERCPU || + timer->cb_mode == HRTIMER_CB_IRQSAFE_UNLOCKED) + emulate_hardirq_ctx = 1; + __remove_hrtimer(timer, timer->base, HRTIMER_STATE_CALLBACK, 0); spin_unlock_irq(&cpu_base->lock); - restart = fn(timer); + if (unlikely(emulate_hardirq_ctx)) { + local_irq_disable(); + restart = fn(timer); + local_irq_enable(); + } else + restart = fn(timer); spin_lock_irq(&cpu_base->lock); -- Thanks and Regards gautham