LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Ed Tomlinson <edt@aei.ca>
To: paulmck@linux.vnet.ibm.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dipankar Sarma <dipankar@in.ibm.com>,
	linux-kernel@vger.kernel.org
Subject: Re: INFO: possible circular locking dependency detected
Date: Sat, 16 Jul 2011 21:56:56 -0400	[thread overview]
Message-ID: <201107162156.56830.edt@aei.ca> (raw)
In-Reply-To: <20110717000217.GB2370@linux.vnet.ibm.com>

On Saturday 16 July 2011 20:02:17 Paul E. McKenney wrote:
> On Sat, Jul 16, 2011 at 03:42:30PM -0400, Ed Tomlinson wrote:
> > On Friday 15 July 2011 18:04:47 Paul E. McKenney wrote:
> > > On Fri, Jul 15, 2011 at 05:48:06PM -0400, Ed Tomlinson wrote:
> > > > On Friday 15 July 2011 12:56:13 Paul E. McKenney wrote:
> > > 
> > > [ . . . ]
> > > 
> > > > > OK.  Ed, would you be willing to try the patch out?
> > > > 
> > > > I am booted at the same git commit with a bluetooth and the disable local_bh around softirq()
> > > > patch from this thread.  So far so good.  Not sure how 'easy' this one is to trigger a second time -
> > > > I've been running with threadirq enabled since .39 came out.  Last night was the first deadlock...
> > > > If nothing happened post rc6 to make it more likely it could be a while before it triggers again.
> > > 
> > > Thank you for trying it out, Ed!  And I know that you will not be shy
> > > should the problem recur.  ;-)
> > 
> > Found this in dmesg this afternoon.  This time, though X was dead, I was able to cancel and restart
> > it.  This is with Peter's patch to call softirq() with local_bh disabled.
> 
> Hmmm...  Was RCU_BOOST enabled?  If so, could you please try the
> following patch?  If not, more thought is required.
>

Paul,

No boost set. 

grover linux # grep RCU .config
# RCU Subsystem
CONFIG_TREE_PREEMPT_RCU=y
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_FANOUT=64
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_RCU_BOOST is not set
# CONFIG_PROVE_RCU is not set
# CONFIG_SPARSE_RCU_POINTER is not set
CONFIG_RCU_TORTURE_TEST=m
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_CPU_STALL_VERBOSE is not set

thinking cap time I would guess.

If I enable boost do you think the patch might help?

Ed

> 
> ------------------------------------------------------------------------
> 
> rcu: Fix RCU_BOOST race handling current->rcu_read_unlock_special
> 
> The RCU_BOOST commits for TREE_PREEMPT_RCU introduced an other-task
> write to a new RCU_READ_UNLOCK_BOOSTED bit in the task_struct structure's
> ->rcu_read_unlock_special field, but, as noted by Steven Rostedt, without
> correctly synchronizing all accesses to ->rcu_read_unlock_special.
> This could result in bits in ->rcu_read_unlock_special being spuriously
> set and cleared due to conflicting accesses, which in turn could result
> in deadlocks between the rcu_node structure's ->lock and the scheduler's
> rq and pi locks.  These deadlocks would result from RCU incorrectly
> believing that the just-ended RCU read-side critical section had been
> preempted and/or boosted.  If that RCU read-side critical section was
> executed with either rq or pi locks held, RCU's ensuing (incorrect)
> calls to the scheduler would cause the scheduler to attempt to once
> again acquire the rq and pi locks, resulting in deadlock.  More complex
> deadlock cycles are also possible, involving multiple rq and pi locks
> as well as locks from multiple rcu_node structures.
> 
> This commit fixes synchronization by creating ->rcu_boosted field in
> task_struct that is accessed and modified only when holding the ->lock
> in the rcu_node structure on which the task is queued (on that rcu_node
> structure's ->blkd_tasks list).  This results in tasks accessing only
> their own current->rcu_read_unlock_special fields, making unsynchronized
> access once again legal, and keeping the rcu_read_unlock() fastpath free
> of atomic instructions and memory barriers.
> 
> The reason that the rcu_read_unlock() fastpath does not need to access
> the new current->rcu_boosted field is that this new field cannot
> be non-zero unless the RCU_READ_UNLOCK_BLOCKED bit is set in the
> current->rcu_read_unlock_special field.  Therefore, rcu_read_unlock()
> need only test current->rcu_read_unlock_special: if that is zero, then
> current->rcu_boosted must also be zero.
> 
> This bug does not affect TINY_PREEMPT_RCU because this implementation
> of RCU accesses current->rcu_read_unlock_special with irqs disabled,
> thus preventing races on the !SMP systems that TINY_PREEMPT_RCU runs on.
> 
> Maybe-reported-by: Dave Jones <davej@redhat.com>
> Maybe-reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 496770a..76676a4 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1254,6 +1254,9 @@ struct task_struct {
>  #ifdef CONFIG_PREEMPT_RCU
>  	int rcu_read_lock_nesting;
>  	char rcu_read_unlock_special;
> +#if defined(CONFIG_RCU_BOOST) && defined(CONFIG_TREE_PREEMPT_RCU)
> +	int rcu_boosted;
> +#endif /* #if defined(CONFIG_RCU_BOOST) && defined(CONFIG_TREE_PREEMPT_RCU) */
>  	struct list_head rcu_node_entry;
>  #endif /* #ifdef CONFIG_PREEMPT_RCU */
>  #ifdef CONFIG_TREE_PREEMPT_RCU
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 75113cb..8d38a98 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -342,6 +342,11 @@ static void rcu_read_unlock_special(struct task_struct *t)
>  #ifdef CONFIG_RCU_BOOST
>  		if (&t->rcu_node_entry == rnp->boost_tasks)
>  			rnp->boost_tasks = np;
> +		/* Snapshot and clear ->rcu_boosted with rcu_node lock held. */
> +		if (t->rcu_boosted) {
> +			special |= RCU_READ_UNLOCK_BOOSTED;
> +			t->rcu_boosted = 0;
> +		}
>  #endif /* #ifdef CONFIG_RCU_BOOST */
>  		t->rcu_blocked_node = NULL;
>  
> @@ -358,7 +363,6 @@ static void rcu_read_unlock_special(struct task_struct *t)
>  #ifdef CONFIG_RCU_BOOST
>  		/* Unboost if we were boosted. */
>  		if (special & RCU_READ_UNLOCK_BOOSTED) {
> -			t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_BOOSTED;
>  			rt_mutex_unlock(t->rcu_boost_mutex);
>  			t->rcu_boost_mutex = NULL;
>  		}
> @@ -1174,7 +1178,7 @@ static int rcu_boost(struct rcu_node *rnp)
>  	t = container_of(tb, struct task_struct, rcu_node_entry);
>  	rt_mutex_init_proxy_locked(&mtx, t);
>  	t->rcu_boost_mutex = &mtx;
> -	t->rcu_read_unlock_special |= RCU_READ_UNLOCK_BOOSTED;
> +	t->rcu_boosted = 1;
>  	raw_spin_unlock_irqrestore(&rnp->lock, flags);
>  	rt_mutex_lock(&mtx);  /* Side effect: boosts task t's priority. */
>  	rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
> 
> 

  reply	other threads:[~2011-07-17  1:57 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-14 14:49 Sergey Senozhatsky
2011-07-14 16:41 ` Peter Zijlstra
2011-07-14 16:57   ` Paul E. McKenney
2011-07-14 19:16     ` Sergey Senozhatsky
2011-07-14 19:15   ` Sergey Senozhatsky
2011-07-14 19:34     ` Paul E. McKenney
2011-07-14 19:38       ` Dave Jones
2011-07-14 20:33         ` Paul E. McKenney
2011-07-14 19:38       ` Sergey Senozhatsky
2011-07-14 16:58 ` Steven Rostedt
2011-07-14 17:02   ` Steven Rostedt
2011-07-14 17:05     ` Paul E. McKenney
2011-07-14 17:32       ` Steven Rostedt
2011-07-14 17:46         ` Steven Rostedt
2011-07-14 19:18           ` Paul E. McKenney
2011-07-14 19:41             ` Steven Rostedt
2011-07-14 20:33               ` Paul E. McKenney
2011-07-15 11:05             ` Ed Tomlinson
2011-07-15 11:29               ` Peter Zijlstra
2011-07-15 11:35                 ` Ed Tomlinson
2011-07-15 11:39                 ` Peter Zijlstra
2011-07-15 18:11                   ` Paul E. McKenney
2011-07-15 12:42                 ` Paul E. McKenney
2011-07-15 13:07                   ` Peter Zijlstra
2011-07-15 14:36                     ` Paul E. McKenney
2011-07-15 15:04                       ` Peter Zijlstra
2011-07-15 15:59                         ` Paul E. McKenney
2011-07-15 16:11                           ` Peter Zijlstra
2011-07-15 16:56                             ` Paul E. McKenney
2011-07-15 21:48                               ` Ed Tomlinson
2011-07-15 22:04                                 ` Paul E. McKenney
2011-07-16 19:42                                   ` Ed Tomlinson
2011-07-17  0:02                                     ` Paul E. McKenney
2011-07-17  1:56                                       ` Ed Tomlinson [this message]
2011-07-17 14:28                                         ` Paul E. McKenney
2011-07-18 15:15                                           ` Paul E. McKenney
2011-07-18  9:29                                     ` Peter Zijlstra
2011-07-18 15:29                                       ` Paul E. McKenney
2011-07-15 16:55                     ` Steven Rostedt
2011-07-15 17:03                       ` Paul E. McKenney
2011-07-15 17:16                         ` Steven Rostedt
2011-07-15 17:24                           ` Paul E. McKenney
2011-07-15 17:42                             ` Steven Rostedt
2011-07-15 18:33                               ` Paul E. McKenney
  -- strict thread matches above, loose matches on Subject: below --
2011-08-07 16:22 Justin P. Mattock
2011-08-11 20:57 ` Justin P. Mattock
2009-12-06 10:11 Richard Zidlicky
2009-10-10 23:09 John Kacur
2007-02-08 15:03 Pedro M. López
2006-10-16 14:05 alpha @ steudten Engineering
2006-10-16 14:32 ` Nick Piggin
2006-10-16 15:42   ` Randy Dunlap
2006-10-16 15:46     ` Nick Piggin
2006-10-19  6:02   ` Andrew Morton
2006-10-19  6:30     ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201107162156.56830.edt@aei.ca \
    --to=edt@aei.ca \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=dipankar@in.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=tglx@linutronix.de \
    --subject='Re: INFO: possible circular locking dependency detected' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).