LKML Archive on lore.kernel.org
* rcu: frequent rcu lockups
@ 2015-03-11 19:57 Sasha Levin
  2015-03-11 20:17 ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2015-03-11 19:57 UTC (permalink / raw)
  To: Paul E. McKenney, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan
  Cc: LKML, Dave Jones

Hi all,

I've started seeing the following hang pretty often during my fuzzing. The
system proceeds to lock up after that.

[ 3209.655703] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 3209.655703] 	Tasks blocked on level-1 rcu_node (CPUs 16-31):
[ 3209.655703] 	(detected by 0, t=30502 jiffies, g=48799, c=48798, q=1730)
[ 3209.655703] All QSes seen, last rcu_preempt kthread activity 1 (4295246069-4295246068), jiffies_till_next_fqs=1, root ->qsmask 0x2
[ 3209.655703] trinity-c24     R  running task    26944  9338   9110 0x10080000
[ 3209.655703]  0000000000002396 00000000e68fa48e ffff880050607dc8 ffffffffa427679b
[ 3209.655703]  ffff880050607d98 ffffffffb1b36000 0000000000000001 00000001000440f4
[ 3209.655703]  ffffffffb1b351c8 dffffc0000000000 ffff880050622000 ffffffffb1721000
[ 3209.655703] Call Trace:
[ 3209.655703] <IRQ> sched_show_task (kernel/sched/core.c:4542)
[ 3209.655703] rcu_check_callbacks (kernel/rcu/tree.c:1225 kernel/rcu/tree.c:1331 kernel/rcu/tree.c:3389 kernel/rcu/tree.c:3453 kernel/rcu/tree.c:2683)
[ 3209.655703] ? acct_account_cputime (kernel/tsacct.c:168)
[ 3209.655703] update_process_times (./arch/x86/include/asm/preempt.h:22 kernel/time/timer.c:1386)
[ 3209.655703] tick_periodic (kernel/time/tick-common.c:92)
[ 3209.655703] ? tick_handle_periodic (kernel/time/tick-common.c:105)
[ 3209.655703] tick_handle_periodic (kernel/time/tick-common.c:105)
[ 3209.655703] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:891)
[ 3209.655703] smp_trace_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:934 include/linux/jump_label.h:114 ./arch/x86/include/asm/trace/irq_vectors.h:45 arch/x86/kernel/apic/apic.c:935)
[ 3209.655703] trace_apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
[ 3209.655703] <EOI> ? add_wait_queue (include/linux/wait.h:116 kernel/sched/wait.c:29)
[ 3209.655703] ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/paravirt.h:809 include/linux/spinlock_api_smp.h:162 kernel/locking/spinlock.c:191)
[ 3209.655703] add_wait_queue (kernel/sched/wait.c:31)
[ 3209.655703] do_wait (kernel/exit.c:1473)
[ 3209.655703] ? trace_rcu_dyntick (include/trace/events/rcu.h:363 (discriminator 19))
[ 3209.655703] ? wait_consider_task (kernel/exit.c:1465)
[ 3209.655703] ? find_get_pid (kernel/pid.c:490)
[ 3209.655703] SyS_wait4 (kernel/exit.c:1618 kernel/exit.c:1586)
[ 3209.655703] ? perf_syscall_exit (kernel/trace/trace_syscalls.c:549)
[ 3209.655703] ? SyS_waitid (kernel/exit.c:1586)
[ 3209.655703] ? kill_orphaned_pgrp (kernel/exit.c:1444)
[ 3209.655703] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1592)
[ 3209.655703] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
[ 3209.655703] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)


Thanks,
Sasha


* Re: rcu: frequent rcu lockups
  2015-03-11 19:57 rcu: frequent rcu lockups Sasha Levin
@ 2015-03-11 20:17 ` Paul E. McKenney
  2015-03-11 20:18   ` Sasha Levin
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-03-11 20:17 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On Wed, Mar 11, 2015 at 03:57:32PM -0400, Sasha Levin wrote:
> Hi all,
> 
> I've started seeing the following hang pretty often during my fuzzing. The
> system proceeds to lock up after that.
> 
> [ 3209.655703] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [ 3209.655703] 	Tasks blocked on level-1 rcu_node (CPUs 16-31):
> [ 3209.655703] 	(detected by 0, t=30502 jiffies, g=48799, c=48798, q=1730)
> [ 3209.655703] All QSes seen, last rcu_preempt kthread activity 1 (4295246069-4295246068), jiffies_till_next_fqs=1, root ->qsmask 0x2
> [ 3209.655703] trinity-c24     R  running task    26944  9338   9110 0x10080000
> [ 3209.655703]  0000000000002396 00000000e68fa48e ffff880050607dc8 ffffffffa427679b
> [ 3209.655703]  ffff880050607d98 ffffffffb1b36000 0000000000000001 00000001000440f4
> [ 3209.655703]  ffffffffb1b351c8 dffffc0000000000 ffff880050622000 ffffffffb1721000
> [ 3209.655703] Call Trace:
> [ 3209.655703] <IRQ> sched_show_task (kernel/sched/core.c:4542)
> [ 3209.655703] rcu_check_callbacks (kernel/rcu/tree.c:1225 kernel/rcu/tree.c:1331 kernel/rcu/tree.c:3389 kernel/rcu/tree.c:3453 kernel/rcu/tree.c:2683)
> [ 3209.655703] ? acct_account_cputime (kernel/tsacct.c:168)
> [ 3209.655703] update_process_times (./arch/x86/include/asm/preempt.h:22 kernel/time/timer.c:1386)
> [ 3209.655703] tick_periodic (kernel/time/tick-common.c:92)
> [ 3209.655703] ? tick_handle_periodic (kernel/time/tick-common.c:105)
> [ 3209.655703] tick_handle_periodic (kernel/time/tick-common.c:105)
> [ 3209.655703] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:891)
> [ 3209.655703] smp_trace_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:934 include/linux/jump_label.h:114 ./arch/x86/include/asm/trace/irq_vectors.h:45 arch/x86/kernel/apic/apic.c:935)
> [ 3209.655703] trace_apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
> [ 3209.655703] <EOI> ? add_wait_queue (include/linux/wait.h:116 kernel/sched/wait.c:29)
> [ 3209.655703] ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/paravirt.h:809 include/linux/spinlock_api_smp.h:162 kernel/locking/spinlock.c:191)
> [ 3209.655703] add_wait_queue (kernel/sched/wait.c:31)
> [ 3209.655703] do_wait (kernel/exit.c:1473)
> [ 3209.655703] ? trace_rcu_dyntick (include/trace/events/rcu.h:363 (discriminator 19))
> [ 3209.655703] ? wait_consider_task (kernel/exit.c:1465)
> [ 3209.655703] ? find_get_pid (kernel/pid.c:490)
> [ 3209.655703] SyS_wait4 (kernel/exit.c:1618 kernel/exit.c:1586)
> [ 3209.655703] ? perf_syscall_exit (kernel/trace/trace_syscalls.c:549)
> [ 3209.655703] ? SyS_waitid (kernel/exit.c:1586)
> [ 3209.655703] ? kill_orphaned_pgrp (kernel/exit.c:1444)
> [ 3209.655703] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1592)
> [ 3209.655703] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
> [ 3209.655703] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)

OK, that is not good.

What version are you running and what is your .config?

							Thanx, Paul



* Re: rcu: frequent rcu lockups
  2015-03-11 20:17 ` Paul E. McKenney
@ 2015-03-11 20:18   ` Sasha Levin
  2015-03-11 20:41     ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2015-03-11 20:18 UTC (permalink / raw)
  To: paulmck
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

[-- Attachment #1: Type: text/plain, Size: 3077 bytes --]

On 03/11/2015 04:17 PM, Paul E. McKenney wrote:
> On Wed, Mar 11, 2015 at 03:57:32PM -0400, Sasha Levin wrote:
>> Hi all,
>>
>> I've started seeing the following hang pretty often during my fuzzing. The
>> system proceeds to lock up after that.
>>
>> [ 3209.655703] INFO: rcu_preempt detected stalls on CPUs/tasks:
>> [ 3209.655703] 	Tasks blocked on level-1 rcu_node (CPUs 16-31):
>> [ 3209.655703] 	(detected by 0, t=30502 jiffies, g=48799, c=48798, q=1730)
>> [ 3209.655703] All QSes seen, last rcu_preempt kthread activity 1 (4295246069-4295246068), jiffies_till_next_fqs=1, root ->qsmask 0x2
>> [ 3209.655703] trinity-c24     R  running task    26944  9338   9110 0x10080000
>> [ 3209.655703]  0000000000002396 00000000e68fa48e ffff880050607dc8 ffffffffa427679b
>> [ 3209.655703]  ffff880050607d98 ffffffffb1b36000 0000000000000001 00000001000440f4
>> [ 3209.655703]  ffffffffb1b351c8 dffffc0000000000 ffff880050622000 ffffffffb1721000
>> [ 3209.655703] Call Trace:
>> [ 3209.655703] <IRQ> sched_show_task (kernel/sched/core.c:4542)
>> [ 3209.655703] rcu_check_callbacks (kernel/rcu/tree.c:1225 kernel/rcu/tree.c:1331 kernel/rcu/tree.c:3389 kernel/rcu/tree.c:3453 kernel/rcu/tree.c:2683)
>> [ 3209.655703] ? acct_account_cputime (kernel/tsacct.c:168)
>> [ 3209.655703] update_process_times (./arch/x86/include/asm/preempt.h:22 kernel/time/timer.c:1386)
>> [ 3209.655703] tick_periodic (kernel/time/tick-common.c:92)
>> [ 3209.655703] ? tick_handle_periodic (kernel/time/tick-common.c:105)
>> [ 3209.655703] tick_handle_periodic (kernel/time/tick-common.c:105)
>> [ 3209.655703] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:891)
>> [ 3209.655703] smp_trace_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:934 include/linux/jump_label.h:114 ./arch/x86/include/asm/trace/irq_vectors.h:45 arch/x86/kernel/apic/apic.c:935)
>> [ 3209.655703] trace_apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
>> [ 3209.655703] <EOI> ? add_wait_queue (include/linux/wait.h:116 kernel/sched/wait.c:29)
>> [ 3209.655703] ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/paravirt.h:809 include/linux/spinlock_api_smp.h:162 kernel/locking/spinlock.c:191)
>> [ 3209.655703] add_wait_queue (kernel/sched/wait.c:31)
>> [ 3209.655703] do_wait (kernel/exit.c:1473)
>> [ 3209.655703] ? trace_rcu_dyntick (include/trace/events/rcu.h:363 (discriminator 19))
>> [ 3209.655703] ? wait_consider_task (kernel/exit.c:1465)
>> [ 3209.655703] ? find_get_pid (kernel/pid.c:490)
>> [ 3209.655703] SyS_wait4 (kernel/exit.c:1618 kernel/exit.c:1586)
>> [ 3209.655703] ? perf_syscall_exit (kernel/trace/trace_syscalls.c:549)
>> [ 3209.655703] ? SyS_waitid (kernel/exit.c:1586)
>> [ 3209.655703] ? kill_orphaned_pgrp (kernel/exit.c:1444)
>> [ 3209.655703] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1592)
>> [ 3209.655703] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
>> [ 3209.655703] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)
> 
> OK, that is not good.
> 
> What version are you running and what is your .config?

latest -next. .config attached.


Thanks,
Sasha


[-- Attachment #2: config-sasha.gz --]
[-- Type: application/gzip, Size: 40069 bytes --]


* Re: rcu: frequent rcu lockups
  2015-03-11 20:18   ` Sasha Levin
@ 2015-03-11 20:41     ` Paul E. McKenney
  2015-03-11 22:18       ` Sasha Levin
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-03-11 20:41 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On Wed, Mar 11, 2015 at 04:18:49PM -0400, Sasha Levin wrote:
> On 03/11/2015 04:17 PM, Paul E. McKenney wrote:
> > On Wed, Mar 11, 2015 at 03:57:32PM -0400, Sasha Levin wrote:
> >> Hi all,
> >>
> >> I've started seeing the following hang pretty often during my fuzzing. The
> >> system proceeds to lock up after that.
> >>
> >> [ 3209.655703] INFO: rcu_preempt detected stalls on CPUs/tasks:
> >> [ 3209.655703] 	Tasks blocked on level-1 rcu_node (CPUs 16-31):
> >> [ 3209.655703] 	(detected by 0, t=30502 jiffies, g=48799, c=48798, q=1730)
> >> [ 3209.655703] All QSes seen, last rcu_preempt kthread activity 1 (4295246069-4295246068), jiffies_till_next_fqs=1, root ->qsmask 0x2
> >> [ 3209.655703] trinity-c24     R  running task    26944  9338   9110 0x10080000
> >> [ 3209.655703]  0000000000002396 00000000e68fa48e ffff880050607dc8 ffffffffa427679b
> >> [ 3209.655703]  ffff880050607d98 ffffffffb1b36000 0000000000000001 00000001000440f4
> >> [ 3209.655703]  ffffffffb1b351c8 dffffc0000000000 ffff880050622000 ffffffffb1721000
> >> [ 3209.655703] Call Trace:
> >> [ 3209.655703] <IRQ> sched_show_task (kernel/sched/core.c:4542)
> >> [ 3209.655703] rcu_check_callbacks (kernel/rcu/tree.c:1225 kernel/rcu/tree.c:1331 kernel/rcu/tree.c:3389 kernel/rcu/tree.c:3453 kernel/rcu/tree.c:2683)
> >> [ 3209.655703] ? acct_account_cputime (kernel/tsacct.c:168)
> >> [ 3209.655703] update_process_times (./arch/x86/include/asm/preempt.h:22 kernel/time/timer.c:1386)
> >> [ 3209.655703] tick_periodic (kernel/time/tick-common.c:92)
> >> [ 3209.655703] ? tick_handle_periodic (kernel/time/tick-common.c:105)
> >> [ 3209.655703] tick_handle_periodic (kernel/time/tick-common.c:105)
> >> [ 3209.655703] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:891)
> >> [ 3209.655703] smp_trace_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:934 include/linux/jump_label.h:114 ./arch/x86/include/asm/trace/irq_vectors.h:45 arch/x86/kernel/apic/apic.c:935)
> >> [ 3209.655703] trace_apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
> >> [ 3209.655703] <EOI> ? add_wait_queue (include/linux/wait.h:116 kernel/sched/wait.c:29)
> >> [ 3209.655703] ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/paravirt.h:809 include/linux/spinlock_api_smp.h:162 kernel/locking/spinlock.c:191)
> >> [ 3209.655703] add_wait_queue (kernel/sched/wait.c:31)
> >> [ 3209.655703] do_wait (kernel/exit.c:1473)
> >> [ 3209.655703] ? trace_rcu_dyntick (include/trace/events/rcu.h:363 (discriminator 19))
> >> [ 3209.655703] ? wait_consider_task (kernel/exit.c:1465)
> >> [ 3209.655703] ? find_get_pid (kernel/pid.c:490)
> >> [ 3209.655703] SyS_wait4 (kernel/exit.c:1618 kernel/exit.c:1586)
> >> [ 3209.655703] ? perf_syscall_exit (kernel/trace/trace_syscalls.c:549)
> >> [ 3209.655703] ? SyS_waitid (kernel/exit.c:1586)
> >> [ 3209.655703] ? kill_orphaned_pgrp (kernel/exit.c:1444)
> >> [ 3209.655703] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1592)
> >> [ 3209.655703] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
> >> [ 3209.655703] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)
> > 
> > OK, that is not good.
> > 
> > What version are you running and what is your .config?
> 
> latest -next. .config attached.

Aha, I forgot to update rcu/next.  I have now updated it, so it should
make it there today or tomorrow.  In the meantime, does the following
commit help?

Also, how quickly does your test setup reproduce this?

							Thanx, Paul

------------------------------------------------------------------------

rcu: Yet another fix for preemption and CPU hotplug

As noted earlier, the following sequence of events can occur when
running PREEMPT_RCU and HOTPLUG_CPU on a system with a multi-level
rcu_node combining tree:

1.	A group of tasks block on CPUs corresponding to a given leaf
    	rcu_node structure while within RCU read-side critical sections.
2.	All CPUs corresponding to that rcu_node structure go offline.
3.	The next grace period starts, but because there are still tasks
    	blocked, the upper-level bits corresponding to this leaf rcu_node
    	structure remain set.
4.	All the tasks exit their RCU read-side critical sections and
    	remove themselves from the leaf rcu_node structure's list,
    	leaving it empty.
5.	But because there now is code to check for this condition at
    	force-quiescent-state time, the upper bits are cleared and the
    	grace period completes.
    
However, there is another complication that can occur following step 4 above:
    
4a.	The grace period starts, and the leaf rcu_node structure's
    	gp_tasks pointer is set to NULL because there are no tasks
    	blocked on this structure.
4b.	One of the CPUs corresponding to the leaf rcu_node structure
    	comes back online.
4c.	An endless stream of tasks is preempted within RCU read-side
    	critical sections on this CPU, such that the ->blkd_tasks
    	list is always non-empty.
    
The grace period will never end.

This commit therefore makes the force-quiescent-state processing check only
for absence of tasks blocking the current grace period rather than absence
of tasks altogether.  This will cause a quiescent state to be reported if
the current leaf rcu_node structure is not blocking the current grace period
and its parent thinks that it is, regardless of how RCU managed to get
itself into this state.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # 4.0.x

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index cc2e9bebf585..f9a56523e8dd 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2212,8 +2212,8 @@ static void rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
 	unsigned long mask;
 	struct rcu_node *rnp_p;
 
-	WARN_ON_ONCE(rsp == &rcu_bh_state || rsp == &rcu_sched_state);
-	if (rnp->qsmask != 0 || rcu_preempt_blocked_readers_cgp(rnp)) {
+	if (rcu_state_p == &rcu_sched_state || rsp != rcu_state_p ||
+	    rnp->qsmask != 0 || rcu_preempt_blocked_readers_cgp(rnp)) {
 		raw_spin_unlock_irqrestore(&rnp->lock, flags);
 		return;  /* Still need more quiescent states! */
 	}
@@ -2221,9 +2221,8 @@ static void rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
 	rnp_p = rnp->parent;
 	if (rnp_p == NULL) {
 		/*
-		 * Either there is only one rcu_node in the tree,
-		 * or tasks were kicked up to root rcu_node due to
-		 * CPUs going offline.
+		 * Only one rcu_node structure in the tree, so don't
+		 * try to report up to its nonexistent parent!
 		 */
 		rcu_report_qs_rsp(rsp, flags);
 		return;
@@ -2715,8 +2714,29 @@ static void force_qs_rnp(struct rcu_state *rsp,
 			return;
 		}
 		if (rnp->qsmask == 0) {
-			rcu_initiate_boost(rnp, flags); /* releases rnp->lock */
-			continue;
+			if (rcu_state_p == &rcu_sched_state ||
+			    rsp != rcu_state_p ||
+			    rcu_preempt_blocked_readers_cgp(rnp)) {
+				/*
+				 * No point in scanning bits because they
+				 * are all zero.  But we might need to
+				 * priority-boost blocked readers.
+				 */
+				rcu_initiate_boost(rnp, flags);
+				/* rcu_initiate_boost() releases rnp->lock */
+				continue;
+			}
+			if (rnp->parent &&
+			    (rnp->parent->qsmask & rnp->grpmask)) {
+				/*
+				 * Race between grace-period
+				 * initialization and task exiting RCU
+				 * read-side critical section: Report.
+				 */
+				rcu_report_unblock_qs_rnp(rsp, rnp, flags);
+				/* rcu_report_unblock_qs_rnp() rlses ->lock */
+				continue;
+			}
 		}
 		cpu = rnp->grplo;
 		bit = 1;
@@ -2731,15 +2751,6 @@ static void force_qs_rnp(struct rcu_state *rsp,
 		if (mask != 0) {
 			/* Idle/offline CPUs, report. */
 			rcu_report_qs_rnp(mask, rsp, rnp, flags);
-		} else if (rnp->parent &&
-			 list_empty(&rnp->blkd_tasks) &&
-			 !rnp->qsmask &&
-			 (rnp->parent->qsmask & rnp->grpmask)) {
-			/*
-			 * Race between grace-period initialization and task
-			 * existing RCU read-side critical section, report.
-			 */
-			rcu_report_unblock_qs_rnp(rsp, rnp, flags);
 		} else {
 			/* Nothing to do here, so just drop the lock. */
 			raw_spin_unlock_irqrestore(&rnp->lock, flags);
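
For reference, the rcu_preempt_blocked_readers_cgp() predicate that the new
checks rely on is quite small; this is a sketch from memory of the 4.0-era
version in kernel/rcu/tree_plugin.h, so the exact body in the tree may differ:

	/*
	 * Return nonzero if there are tasks blocking the *current* grace
	 * period on this leaf rcu_node structure.  Tasks merely queued on
	 * ->blkd_tasks but not blocking the current grace period do not
	 * count, which is the distinction force_qs_rnp() now depends on.
	 */
	static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp)
	{
		return rnp->gp_tasks != NULL;
	}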



* Re: rcu: frequent rcu lockups
  2015-03-11 20:41     ` Paul E. McKenney
@ 2015-03-11 22:18       ` Sasha Levin
  2015-03-11 23:01         ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2015-03-11 22:18 UTC (permalink / raw)
  To: paulmck
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On 03/11/2015 04:41 PM, Paul E. McKenney wrote:
> Aha, I forgot to update rcu/next.  I have now updated it, so it should
> make it there today or tomorrow.  In the meantime, does the following
> commit help?
> 
> Also, how quickly does your test setup reproduce this?

I usually hit it within an hour.

With the commit I didn't hit it yet, but I do see 4 different WARNings:

[ 1002.577249] WARNING: CPU: 19 PID: 22311 at kernel/rcu/tree.c:2190 rcu_report_qs_rnp+0x41e/0x590()
[ 1002.582438] Modules linked in:
[ 1002.582438] CPU: 19 PID: 22311 Comm: trinity-c58 Not tainted 4.0.0-rc3-next-20150311-sasha-00035-g0a45b48 #2051
[ 1002.582438]  ffff88047f200000 00000000727b89c5 ffff880672807cc8 ffffffff9ea97ec9
[ 1002.582438]  0000000000000000 0000000000000000 ffff880672807d18 ffffffff941e242a
[ 1002.582438]  0000000000000282 ffffffff943339be dffffc0000000000 ffffffffa1721000
[ 1002.582438] Call Trace:
[ 1002.582438] <IRQ> dump_stack (lib/dump_stack.c:52)
[ 1002.582438] warn_slowpath_common (kernel/panic.c:447)
[ 1002.582438] ? rcu_report_qs_rnp (kernel/rcu/tree.c:2190 (discriminator 3))
[ 1002.582438] warn_slowpath_null (kernel/panic.c:481)
[ 1002.582438] rcu_report_qs_rnp (kernel/rcu/tree.c:2190 (discriminator 3))
[ 1002.582438] rcu_process_callbacks (kernel/rcu/tree.c:2288 kernel/rcu/tree.c:2324 kernel/rcu/tree.c:2816 kernel/rcu/tree.c:2849)
[ 1002.582438] ? __lock_is_held (kernel/locking/lockdep.c:3518)
[ 1002.582438] __do_softirq (kernel/softirq.c:273 include/linux/jump_label.h:114 include/trace/events/irq.h:126 kernel/softirq.c:274)
[ 1002.582438] irq_exit (kernel/softirq.c:350 kernel/softirq.c:391)
[ 1002.582438] smp_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:918)
[ 1002.582438] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
[ 1002.582438] <EOI> ? __lock_is_held (kernel/locking/lockdep.c:3518)
[ 1002.582438] ? lock_acquire (./arch/x86/include/asm/paravirt.h:809 kernel/locking/lockdep.c:3605)
[ 1002.582438] ? rmap_walk (mm/rmap.c:1569 mm/rmap.c:1594)
[ 1002.582438] down_read (./arch/x86/include/asm/rwsem.h:65 kernel/locking/rwsem.c:24)
[ 1002.582438] ? rmap_walk (mm/rmap.c:1569 mm/rmap.c:1594)
[ 1002.582438] ? up_read (./arch/x86/include/asm/rwsem.h:156 kernel/locking/rwsem.c:81)
[ 1002.582438] ? page_remove_rmap (mm/rmap.c:1209)
[ 1002.582438] rmap_walk (mm/rmap.c:1569 mm/rmap.c:1594)
[ 1002.582438] try_to_munlock (mm/rmap.c:1446)
[ 1002.582438] ? try_to_unmap (mm/rmap.c:1446)
[ 1002.582438] ? page_remove_rmap (mm/rmap.c:1209)
[ 1002.582438] ? invalid_migration_vma (mm/rmap.c:1382)
[ 1002.582438] ? anon_vma_prepare (mm/rmap.c:497)
[ 1002.582438] __munlock_isolated_page (mm/mlock.c:132)
[ 1002.582438] __munlock_pagevec (mm/mlock.c:332)
[ 1002.582438] ? __munlock_isolate_lru_page (mm/mlock.c:277)
[ 1002.582438] ? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:154 kernel/locking/spinlock.c:183)
[ 1002.582438] ? __get_locked_pte (mm/memory.c:1432)
[ 1002.582438] munlock_vma_pages_range (mm/mlock.c:477)
[ 1002.582438] ? munlock_vma_page (mm/mlock.c:424)
[ 1002.582438] ? __lock_acquire (kernel/locking/lockdep.c:2019 kernel/locking/lockdep.c:3184)
[ 1002.582438] ? kfree (mm/slub.c:1300 mm/slub.c:2707 mm/slub.c:3416)
[ 1002.582438] ? __lock_acquire (kernel/locking/lockdep.c:2019 kernel/locking/lockdep.c:3184)
[ 1002.582438] ? vma_merge (mm/mmap.c:1085)
[ 1002.582438] mlock_fixup (mm/mlock.c:549)
[ 1002.582438] do_mlockall (mm/mlock.c:677)
[ 1002.582438] ? do_mlock (mm/mlock.c:658)
[ 1002.582438] sys_munlockall (mm/mlock.c:720)
[ 1002.582438] ia32_do_call (arch/x86/ia32/ia32entry.S:486)

[ 1034.418016] WARNING: CPU: 0 PID: 8 at kernel/rcu/tree.c:1936 rcu_gp_kthread+0x2a7c/0x4dd0()
[ 1034.419933] Modules linked in:
[ 1034.420802] CPU: 0 PID: 8 Comm: rcu_preempt Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48 #2051
[ 1034.423329]  ffff8800496f8000 000000005fe03e34 ffff880049ac7b28 ffffffff9ea97ec9
[ 1034.426460]  0000000000000000 0000000000000000 ffff880049ac7b78 ffffffff941e242a
[ 1034.426460]  0000000000022000 ffffffff9433cafc ffffffffa1725040 00000001000110de
[ 1034.426460] Call Trace:
[ 1034.426460] dump_stack (lib/dump_stack.c:52)
[ 1034.426460] warn_slowpath_common (kernel/panic.c:447)
[ 1034.426460] ? rcu_gp_kthread (kernel/rcu/tree.c:1936 kernel/rcu/tree.c:2061)
[ 1034.426460] warn_slowpath_null (kernel/panic.c:481)
[ 1034.426460] rcu_gp_kthread (kernel/rcu/tree.c:1936 kernel/rcu/tree.c:2061)
[ 1034.426460] ? plist_check_head (lib/plist.c:62)
[ 1034.426460] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
[ 1034.426460] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2554 kernel/locking/lockdep.c:2601)
[ 1034.426460] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
[ 1034.426460] ? __schedule (kernel/sched/core.c:2803)
[ 1034.426460] ? ___preempt_schedule (arch/x86/lib/thunk_64.S:51)
[ 1034.426460] ? abort_exclusive_wait (kernel/sched/wait.c:292)
[ 1034.426460] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
[ 1034.426460] kthread (kernel/kthread.c:207)
[ 1034.426460] ? flush_kthread_work (kernel/kthread.c:176)
[ 1034.426460] ? _raw_spin_unlock_irq (./arch/x86/include/asm/paravirt.h:819 include/linux/spinlock_api_smp.h:170 kernel/locking/spinlock.c:199)
[ 1034.426460] ? flush_kthread_work (kernel/kthread.c:176)
[ 1034.426460] ret_from_fork (arch/x86/kernel/entry_64.S:565)
[ 1034.426460] ? flush_kthread_work (kernel/kthread.c:176)

[ 1034.501448] WARNING: CPU: 0 PID: 8 at kernel/rcu/tree_plugin.h:479 rcu_gp_kthread+0x1e58/0x4dd0()
[ 1034.505431] Modules linked in:
[ 1034.507242] CPU: 0 PID: 8 Comm: rcu_preempt Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48 #2051
[ 1034.509348]  ffff8800496f8000 000000005fe03e34 ffff880049ac7b28 ffffffff9ea97ec9
[ 1034.509348]  0000000000000000 0000000000000000 ffff880049ac7b78 ffffffff941e242a
[ 1034.509348]  ffffffffa1723038 ffffffff9433bed8 ffffffffa1723040 ffff880050622000
[ 1034.509348] Call Trace:
[ 1034.509348] dump_stack (lib/dump_stack.c:52)
[ 1034.509348] warn_slowpath_common (kernel/panic.c:447)
[ 1034.509348] ? rcu_gp_kthread (kernel/rcu/tree_plugin.h:479 kernel/rcu/tree.c:1834 kernel/rcu/tree.c:1993)
[ 1034.509348] warn_slowpath_null (kernel/panic.c:481)
[ 1034.509348] rcu_gp_kthread (kernel/rcu/tree_plugin.h:479 kernel/rcu/tree.c:1834 kernel/rcu/tree.c:1993)
[ 1034.509348] ? plist_check_head (lib/plist.c:62)
[ 1034.509348] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
[ 1034.509348] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2554 kernel/locking/lockdep.c:2601)
[ 1034.509348] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
[ 1034.509348] ? __schedule (kernel/sched/core.c:2803)
[ 1034.509348] ? ___preempt_schedule (arch/x86/lib/thunk_64.S:51)
[ 1034.509348] ? abort_exclusive_wait (kernel/sched/wait.c:292)
[ 1034.509348] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
[ 1034.509348] kthread (kernel/kthread.c:207)
[ 1034.509348] ? flush_kthread_work (kernel/kthread.c:176)
[ 1034.509348] ? _raw_spin_unlock_irq (./arch/x86/include/asm/paravirt.h:819 include/linux/spinlock_api_smp.h:170 kernel/locking/spinlock.c:199)
[ 1034.509348] ? flush_kthread_work (kernel/kthread.c:176)
[ 1034.509348] ret_from_fork (arch/x86/kernel/entry_64.S:565)
[ 1034.509348] ? flush_kthread_work (kernel/kthread.c:176)

[ 2733.830899] WARNING: CPU: 0 PID: 8 at kernel/rcu/tree.c:1935 rcu_gp_kthread+0x2aa6/0x4dd0()
[ 2733.839509] Modules linked in:
[ 2733.839509] CPU: 0 PID: 8 Comm: rcu_preempt Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48 #2051
[ 2733.839509]  ffff8800496f8000 000000005fe03e34 ffff880049ac7b28 ffffffff9ea97ec9
[ 2733.839509]  0000000000000000 0000000000000000 ffff880049ac7b78 ffffffff941e242a
[ 2733.839509]  0000000000022000 ffffffff9433cb26 ffffffffa1723040 0000000100038724
[ 2733.839509] Call Trace:
[ 2733.839509] dump_stack (lib/dump_stack.c:52)
[ 2733.839509] warn_slowpath_common (kernel/panic.c:447)
[ 2733.839509] ? rcu_gp_kthread (kernel/rcu/tree.c:1935 kernel/rcu/tree.c:2061)
[ 2733.839509] warn_slowpath_null (kernel/panic.c:481)
[ 2733.839509] rcu_gp_kthread (kernel/rcu/tree.c:1935 kernel/rcu/tree.c:2061)
[ 2733.839509] ? plist_check_head (lib/plist.c:62)
[ 2733.839509] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
[ 2733.839509] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2554 kernel/locking/lockdep.c:2601)
[ 2733.839509] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
[ 2733.839509] ? __schedule (kernel/sched/core.c:2803)
[ 2733.839509] ? ___preempt_schedule (arch/x86/lib/thunk_64.S:51)
[ 2733.839509] ? abort_exclusive_wait (kernel/sched/wait.c:292)
[ 2733.839509] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
[ 2733.839509] kthread (kernel/kthread.c:207)
[ 2733.839509] ? flush_kthread_work (kernel/kthread.c:176)
[ 2733.839509] ? _raw_spin_unlock_irq (./arch/x86/include/asm/paravirt.h:819 include/linux/spinlock_api_smp.h:170 kernel/locking/spinlock.c:199)
[ 2733.839509] ? flush_kthread_work (kernel/kthread.c:176)
[ 2733.839509] ret_from_fork (arch/x86/kernel/entry_64.S:565)
[ 2733.839509] ? flush_kthread_work (kernel/kthread.c:176)


Thanks,
Sasha


* Re: rcu: frequent rcu lockups
  2015-03-11 22:18       ` Sasha Levin
@ 2015-03-11 23:01         ` Paul E. McKenney
  2015-03-11 23:06           ` Sasha Levin
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-03-11 23:01 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On Wed, Mar 11, 2015 at 06:18:57PM -0400, Sasha Levin wrote:
> On 03/11/2015 04:41 PM, Paul E. McKenney wrote:
> > Aha, I forgot to update rcu/next.  I have now updated it, so it should
> > make it there today or tomorrow.  In the meantime, does the following
> > commit help?
> > 
> > Also, how quickly does your test setup reproduce this?
> 
> I usually hit it within an hour.

I clearly need to work more on rcutorture -- took tens of hours to hit
it there.

> With the commit I didn't hit it yet, but I do see 4 different WARNings:

I wish that I could say that I am surprised, but the sad fact is that
I am still shaking the bugs out.  

> [ 1002.577249] WARNING: CPU: 19 PID: 22311 at kernel/rcu/tree.c:2190 rcu_report_qs_rnp+0x41e/0x590()

I haven't seen this particular one, but it is another likely symptom.

> [ 1002.582438] Modules linked in:
> [ 1002.582438] CPU: 19 PID: 22311 Comm: trinity-c58 Not tainted 4.0.0-rc3-next-20150311-sasha-00035-g0a45b48 #2051
> [ 1002.582438]  ffff88047f200000 00000000727b89c5 ffff880672807cc8 ffffffff9ea97ec9
> [ 1002.582438]  0000000000000000 0000000000000000 ffff880672807d18 ffffffff941e242a
> [ 1002.582438]  0000000000000282 ffffffff943339be dffffc0000000000 ffffffffa1721000
> [ 1002.582438] Call Trace:
> [ 1002.582438] <IRQ> dump_stack (lib/dump_stack.c:52)
> [ 1002.582438] warn_slowpath_common (kernel/panic.c:447)
> [ 1002.582438] ? rcu_report_qs_rnp (kernel/rcu/tree.c:2190 (discriminator 3))
> [ 1002.582438] warn_slowpath_null (kernel/panic.c:481)
> [ 1002.582438] rcu_report_qs_rnp (kernel/rcu/tree.c:2190 (discriminator 3))
> [ 1002.582438] rcu_process_callbacks (kernel/rcu/tree.c:2288 kernel/rcu/tree.c:2324 kernel/rcu/tree.c:2816 kernel/rcu/tree.c:2849)
> [ 1002.582438] ? __lock_is_held (kernel/locking/lockdep.c:3518)
> [ 1002.582438] __do_softirq (kernel/softirq.c:273 include/linux/jump_label.h:114 include/trace/events/irq.h:126 kernel/softirq.c:274)
> [ 1002.582438] irq_exit (kernel/softirq.c:350 kernel/softirq.c:391)
> [ 1002.582438] smp_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:918)
> [ 1002.582438] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
> [ 1002.582438] <EOI> ? __lock_is_held (kernel/locking/lockdep.c:3518)
> [ 1002.582438] ? lock_acquire (./arch/x86/include/asm/paravirt.h:809 kernel/locking/lockdep.c:3605)
> [ 1002.582438] ? rmap_walk (mm/rmap.c:1569 mm/rmap.c:1594)
> [ 1002.582438] down_read (./arch/x86/include/asm/rwsem.h:65 kernel/locking/rwsem.c:24)
> [ 1002.582438] ? rmap_walk (mm/rmap.c:1569 mm/rmap.c:1594)
> [ 1002.582438] ? up_read (./arch/x86/include/asm/rwsem.h:156 kernel/locking/rwsem.c:81)
> [ 1002.582438] ? page_remove_rmap (mm/rmap.c:1209)
> [ 1002.582438] rmap_walk (mm/rmap.c:1569 mm/rmap.c:1594)
> [ 1002.582438] try_to_munlock (mm/rmap.c:1446)
> [ 1002.582438] ? try_to_unmap (mm/rmap.c:1446)
> [ 1002.582438] ? page_remove_rmap (mm/rmap.c:1209)
> [ 1002.582438] ? invalid_migration_vma (mm/rmap.c:1382)
> [ 1002.582438] ? anon_vma_prepare (mm/rmap.c:497)
> [ 1002.582438] __munlock_isolated_page (mm/mlock.c:132)
> [ 1002.582438] __munlock_pagevec (mm/mlock.c:332)
> [ 1002.582438] ? __munlock_isolate_lru_page (mm/mlock.c:277)
> [ 1002.582438] ? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:154 kernel/locking/spinlock.c:183)
> [ 1002.582438] ? __get_locked_pte (mm/memory.c:1432)
> [ 1002.582438] munlock_vma_pages_range (mm/mlock.c:477)
> [ 1002.582438] ? munlock_vma_page (mm/mlock.c:424)
> [ 1002.582438] ? __lock_acquire (kernel/locking/lockdep.c:2019 kernel/locking/lockdep.c:3184)
> [ 1002.582438] ? kfree (mm/slub.c:1300 mm/slub.c:2707 mm/slub.c:3416)
> [ 1002.582438] ? __lock_acquire (kernel/locking/lockdep.c:2019 kernel/locking/lockdep.c:3184)
> [ 1002.582438] ? vma_merge (mm/mmap.c:1085)
> [ 1002.582438] mlock_fixup (mm/mlock.c:549)
> [ 1002.582438] do_mlockall (mm/mlock.c:677)
> [ 1002.582438] ? do_mlock (mm/mlock.c:658)
> [ 1002.582438] sys_munlockall (mm/mlock.c:720)
> [ 1002.582438] ia32_do_call (arch/x86/ia32/ia32entry.S:486)
> 
> [ 1034.418016] WARNING: CPU: 0 PID: 8 at kernel/rcu/tree.c:1936 rcu_gp_kthread+0x2a7c/0x4dd0()

I have seen this one, tracking it down.

> [ 1034.419933] Modules linked in:
> [ 1034.420802] CPU: 0 PID: 8 Comm: rcu_preempt Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48 #2051
> [ 1034.423329]  ffff8800496f8000 000000005fe03e34 ffff880049ac7b28 ffffffff9ea97ec9
> [ 1034.426460]  0000000000000000 0000000000000000 ffff880049ac7b78 ffffffff941e242a
> [ 1034.426460]  0000000000022000 ffffffff9433cafc ffffffffa1725040 00000001000110de
> [ 1034.426460] Call Trace:
> [ 1034.426460] dump_stack (lib/dump_stack.c:52)
> [ 1034.426460] warn_slowpath_common (kernel/panic.c:447)
> [ 1034.426460] ? rcu_gp_kthread (kernel/rcu/tree.c:1936 kernel/rcu/tree.c:2061)
> [ 1034.426460] warn_slowpath_null (kernel/panic.c:481)
> [ 1034.426460] rcu_gp_kthread (kernel/rcu/tree.c:1936 kernel/rcu/tree.c:2061)
> [ 1034.426460] ? plist_check_head (lib/plist.c:62)
> [ 1034.426460] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
> [ 1034.426460] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2554 kernel/locking/lockdep.c:2601)
> [ 1034.426460] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
> [ 1034.426460] ? __schedule (kernel/sched/core.c:2803)
> [ 1034.426460] ? ___preempt_schedule (arch/x86/lib/thunk_64.S:51)
> [ 1034.426460] ? abort_exclusive_wait (kernel/sched/wait.c:292)
> [ 1034.426460] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
> [ 1034.426460] kthread (kernel/kthread.c:207)
> [ 1034.426460] ? flush_kthread_work (kernel/kthread.c:176)
> [ 1034.426460] ? _raw_spin_unlock_irq (./arch/x86/include/asm/paravirt.h:819 include/linux/spinlock_api_smp.h:170 kernel/locking/spinlock.c:199)
> [ 1034.426460] ? flush_kthread_work (kernel/kthread.c:176)
> [ 1034.426460] ret_from_fork (arch/x86/kernel/entry_64.S:565)
> [ 1034.426460] ? flush_kthread_work (kernel/kthread.c:176)
> 
> [ 1034.501448] WARNING: CPU: 0 PID: 8 at kernel/rcu/tree_plugin.h:479 rcu_gp_kthread+0x1e58/0x4dd0()

And this one as well.

> [ 1034.505431] Modules linked in:
> [ 1034.507242] CPU: 0 PID: 8 Comm: rcu_preempt Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48 #2051
> [ 1034.509348]  ffff8800496f8000 000000005fe03e34 ffff880049ac7b28 ffffffff9ea97ec9
> [ 1034.509348]  0000000000000000 0000000000000000 ffff880049ac7b78 ffffffff941e242a
> [ 1034.509348]  ffffffffa1723038 ffffffff9433bed8 ffffffffa1723040 ffff880050622000
> [ 1034.509348] Call Trace:
> [ 1034.509348] dump_stack (lib/dump_stack.c:52)
> [ 1034.509348] warn_slowpath_common (kernel/panic.c:447)
> [ 1034.509348] ? rcu_gp_kthread (kernel/rcu/tree_plugin.h:479 kernel/rcu/tree.c:1834 kernel/rcu/tree.c:1993)
> [ 1034.509348] warn_slowpath_null (kernel/panic.c:481)
> [ 1034.509348] rcu_gp_kthread (kernel/rcu/tree_plugin.h:479 kernel/rcu/tree.c:1834 kernel/rcu/tree.c:1993)
> [ 1034.509348] ? plist_check_head (lib/plist.c:62)
> [ 1034.509348] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
> [ 1034.509348] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2554 kernel/locking/lockdep.c:2601)
> [ 1034.509348] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
> [ 1034.509348] ? __schedule (kernel/sched/core.c:2803)
> [ 1034.509348] ? ___preempt_schedule (arch/x86/lib/thunk_64.S:51)
> [ 1034.509348] ? abort_exclusive_wait (kernel/sched/wait.c:292)
> [ 1034.509348] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
> [ 1034.509348] kthread (kernel/kthread.c:207)
> [ 1034.509348] ? flush_kthread_work (kernel/kthread.c:176)
> [ 1034.509348] ? _raw_spin_unlock_irq (./arch/x86/include/asm/paravirt.h:819 include/linux/spinlock_api_smp.h:170 kernel/locking/spinlock.c:199)
> [ 1034.509348] ? flush_kthread_work (kernel/kthread.c:176)
> [ 1034.509348] ret_from_fork (arch/x86/kernel/entry_64.S:565)
> [ 1034.509348] ? flush_kthread_work (kernel/kthread.c:176)
> 
> [ 2733.830899] WARNING: CPU: 0 PID: 8 at kernel/rcu/tree.c:1935 rcu_gp_kthread+0x2aa6/0x4dd0()

And this one, too.

> [ 2733.839509] Modules linked in:
> [ 2733.839509] CPU: 0 PID: 8 Comm: rcu_preempt Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48 #2051
> [ 2733.839509]  ffff8800496f8000 000000005fe03e34 ffff880049ac7b28 ffffffff9ea97ec9
> [ 2733.839509]  0000000000000000 0000000000000000 ffff880049ac7b78 ffffffff941e242a
> [ 2733.839509]  0000000000022000 ffffffff9433cb26 ffffffffa1723040 0000000100038724
> [ 2733.839509] Call Trace:
> [ 2733.839509] dump_stack (lib/dump_stack.c:52)
> [ 2733.839509] warn_slowpath_common (kernel/panic.c:447)
> [ 2733.839509] ? rcu_gp_kthread (kernel/rcu/tree.c:1935 kernel/rcu/tree.c:2061)
> [ 2733.839509] warn_slowpath_null (kernel/panic.c:481)
> [ 2733.839509] rcu_gp_kthread (kernel/rcu/tree.c:1935 kernel/rcu/tree.c:2061)
> [ 2733.839509] ? plist_check_head (lib/plist.c:62)
> [ 2733.839509] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
> [ 2733.839509] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2554 kernel/locking/lockdep.c:2601)
> [ 2733.839509] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
> [ 2733.839509] ? __schedule (kernel/sched/core.c:2803)
> [ 2733.839509] ? ___preempt_schedule (arch/x86/lib/thunk_64.S:51)
> [ 2733.839509] ? abort_exclusive_wait (kernel/sched/wait.c:292)
> [ 2733.839509] ? cond_synchronize_rcu (kernel/rcu/tree.c:1972)
> [ 2733.839509] kthread (kernel/kthread.c:207)
> [ 2733.839509] ? flush_kthread_work (kernel/kthread.c:176)
> [ 2733.839509] ? _raw_spin_unlock_irq (./arch/x86/include/asm/paravirt.h:819 include/linux/spinlock_api_smp.h:170 kernel/locking/spinlock.c:199)
> [ 2733.839509] ? flush_kthread_work (kernel/kthread.c:176)
> [ 2733.839509] ret_from_fork (arch/x86/kernel/entry_64.S:565)
> [ 2733.839509] ? flush_kthread_work (kernel/kthread.c:176)

I did locate a similar bug in the expedited RCU code path, and perhaps
that will provide a clue to the other problems as well.

							Thanx, Paul



* Re: rcu: frequent rcu lockups
  2015-03-11 23:01         ` Paul E. McKenney
@ 2015-03-11 23:06           ` Sasha Levin
  2015-03-11 23:16             ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2015-03-11 23:06 UTC (permalink / raw)
  To: paulmck
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On 03/11/2015 07:01 PM, Paul E. McKenney wrote:
>> With the commit I didn't hit it yet, but I do see 4 different WARNings:
> I wish that I could say that I am surprised, but the sad fact is that
> I am still shaking the bugs out.  

I have one more to add:

[   93.330539] WARNING: CPU: 1 PID: 8 at kernel/rcu/tree_plugin.h:476 rcu_gp_kthread+0x1eaa/0x4dd0()
[   93.333515] Modules linked in:
[   93.334678] CPU: 1 PID: 8 Comm: rcu_preempt Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48-dirty #2052
[   93.338402]  ffff8800496f8000 00000000f3e72ee7 ffff880049ac7b28 ffffffff9ea97e89
[   93.339495]  0000000000000000 0000000000000000 ffff880049ac7b78 ffffffff941e242a
[   93.339495]  ffffffffa1721038 ffffffff9433bf2a ffffffffa1721040 ffff8800a2822000
[   93.339495] Call Trace:
[   93.339495]  [<ffffffff9ea97e89>] dump_stack+0x4f/0x7b
[   93.339495]  [<ffffffff941e242a>] warn_slowpath_common+0xca/0x130
[   93.339495]  [<ffffffff9433bf2a>] ? rcu_gp_kthread+0x1eaa/0x4dd0
[   93.339495]  [<ffffffff941e266a>] warn_slowpath_null+0x1a/0x20
[   93.339495]  [<ffffffff9433bf2a>] rcu_gp_kthread+0x1eaa/0x4dd0
[   93.339495]  [<ffffffff95f2346c>] ? plist_check_head+0x3c/0x50
[   93.339495]  [<ffffffff9433a080>] ? cond_synchronize_rcu+0x20/0x20
[   93.339495]  [<ffffffff942dffb5>] ? trace_hardirqs_on_caller+0x475/0x610
[   93.339495]  [<ffffffff9eb0bdf9>] ? _raw_spin_unlock_irq+0x69/0xa0
[   93.339495]  [<ffffffff9eaf95e3>] ? __schedule+0x993/0x1b90
[   93.339495]  [<ffffffff95f49e39>] ? ___preempt_schedule+0x12/0x14
[   93.339495]  [<ffffffff942c4530>] ? abort_exclusive_wait+0x1d0/0x1d0
[   93.339495]  [<ffffffff9433a080>] ? cond_synchronize_rcu+0x20/0x20
[   93.339495]  [<ffffffff94248b56>] kthread+0x216/0x2e0
[   93.339495]  [<ffffffff94248940>] ? flush_kthread_work+0x300/0x300
[   93.339495]  [<ffffffff9eb0bdc0>] ? _raw_spin_unlock_irq+0x30/0xa0
[   93.339495]  [<ffffffff94248940>] ? flush_kthread_work+0x300/0x300
[   93.339495]  [<ffffffff9eb0d293>] ret_from_fork+0x53/0x80
[   93.339495]  [<ffffffff94248940>] ? flush_kthread_work+0x300/0x300


Thanks,
Sasha


* Re: rcu: frequent rcu lockups
  2015-03-11 23:06           ` Sasha Levin
@ 2015-03-11 23:16             ` Paul E. McKenney
  2015-03-12 12:28               ` Sasha Levin
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-03-11 23:16 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On Wed, Mar 11, 2015 at 07:06:40PM -0400, Sasha Levin wrote:
> On 03/11/2015 07:01 PM, Paul E. McKenney wrote:
> >> With the commit I didn't hit it yet, but I do see 4 different WARNings:
> > I wish that I could say that I am surprised, but the sad fact is that
> > I am still shaking the bugs out.  
> 
> I have one more to add:
> 
> [   93.330539] WARNING: CPU: 1 PID: 8 at kernel/rcu/tree_plugin.h:476 rcu_gp_kthread+0x1eaa/0x4dd0()

A bit different, but still in the class of a combining-tree bitmask
handling bug.

							Thanx, Paul

> [   93.333515] Modules linked in:
> [   93.334678] CPU: 1 PID: 8 Comm: rcu_preempt Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48-dirty #2052
> [   93.338402]  ffff8800496f8000 00000000f3e72ee7 ffff880049ac7b28 ffffffff9ea97e89
> [   93.339495]  0000000000000000 0000000000000000 ffff880049ac7b78 ffffffff941e242a
> [   93.339495]  ffffffffa1721038 ffffffff9433bf2a ffffffffa1721040 ffff8800a2822000
> [   93.339495] Call Trace:
> [   93.339495]  [<ffffffff9ea97e89>] dump_stack+0x4f/0x7b
> [   93.339495]  [<ffffffff941e242a>] warn_slowpath_common+0xca/0x130
> [   93.339495]  [<ffffffff9433bf2a>] ? rcu_gp_kthread+0x1eaa/0x4dd0
> [   93.339495]  [<ffffffff941e266a>] warn_slowpath_null+0x1a/0x20
> [   93.339495]  [<ffffffff9433bf2a>] rcu_gp_kthread+0x1eaa/0x4dd0
> [   93.339495]  [<ffffffff95f2346c>] ? plist_check_head+0x3c/0x50
> [   93.339495]  [<ffffffff9433a080>] ? cond_synchronize_rcu+0x20/0x20
> [   93.339495]  [<ffffffff942dffb5>] ? trace_hardirqs_on_caller+0x475/0x610
> [   93.339495]  [<ffffffff9eb0bdf9>] ? _raw_spin_unlock_irq+0x69/0xa0
> [   93.339495]  [<ffffffff9eaf95e3>] ? __schedule+0x993/0x1b90
> [   93.339495]  [<ffffffff95f49e39>] ? ___preempt_schedule+0x12/0x14
> [   93.339495]  [<ffffffff942c4530>] ? abort_exclusive_wait+0x1d0/0x1d0
> [   93.339495]  [<ffffffff9433a080>] ? cond_synchronize_rcu+0x20/0x20
> [   93.339495]  [<ffffffff94248b56>] kthread+0x216/0x2e0
> [   93.339495]  [<ffffffff94248940>] ? flush_kthread_work+0x300/0x300
> [   93.339495]  [<ffffffff9eb0bdc0>] ? _raw_spin_unlock_irq+0x30/0xa0
> [   93.339495]  [<ffffffff94248940>] ? flush_kthread_work+0x300/0x300
> [   93.339495]  [<ffffffff9eb0d293>] ret_from_fork+0x53/0x80
> [   93.339495]  [<ffffffff94248940>] ? flush_kthread_work+0x300/0x300
> 
> 
> Thanks,
> Sasha
> 



* Re: rcu: frequent rcu lockups
  2015-03-11 23:16             ` Paul E. McKenney
@ 2015-03-12 12:28               ` Sasha Levin
  2015-03-12 12:32                 ` Sasha Levin
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2015-03-12 12:28 UTC (permalink / raw)
  To: paulmck
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On 03/11/2015 07:16 PM, Paul E. McKenney wrote:
> On Wed, Mar 11, 2015 at 07:06:40PM -0400, Sasha Levin wrote:
>> > On 03/11/2015 07:01 PM, Paul E. McKenney wrote:
>>>> > >> With the commit I didn't hit it yet, but I do see 4 different WARNings:
>>> > > I wish that I could say that I am surprised, but the sad fact is that
>>> > > I am still shaking the bugs out.  
>> > 
>> > I have one more to add:
>> > 
>> > [   93.330539] WARNING: CPU: 1 PID: 8 at kernel/rcu/tree_plugin.h:476 rcu_gp_kthread+0x1eaa/0x4dd0()
> A bit different, but still in the class of a combining-tree bitmask
> handling bug.

I left it overnight, and am still seeing hangs, although (don't hold me to
this) there seem to be significantly fewer of them.

[ 4423.001809] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 4423.001809]  Tasks blocked on level-1 rcu_node (CPUs 16-31):
[ 4423.001809]  (detected by 0, t=30502 jiffies, g=60989, c=60988, q=18648)
[ 4423.001809] All QSes seen, last rcu_preempt kthread activity 1 (4295375352-4295375351), jiffies_till_next_fqs=1, root ->qsmask 0x2
[ 4423.001809] trinity-c0      R  running task    27480 15862   9833 0x10080000
[ 4423.001809]  0000000000002669 00000000ac401e1d ffff880050607de8 ffffffff9327679b
[ 4423.001809]  ffff880050607db8 ffffffffa0b36000 0000000000000001 00000001000639f7
[ 4423.001809]  ffffffffa0b351c8 dffffc0000000000 ffff880050622000 ffffffffa0721000
[ 4423.001809] Call Trace:
[ 4423.001809] <IRQ> sched_show_task (kernel/sched/core.c:4542)
[ 4423.001809] rcu_check_callbacks (kernel/rcu/tree.c:1225 kernel/rcu/tree.c:1331 kernel/rcu/tree.c:3400 kernel/rcu/tree.c:3464 kernel/rcu/tree.c:2682)
[ 4423.001809] ? acct_account_cputime (kernel/tsacct.c:168)
[ 4423.001809] update_process_times (./arch/x86/include/asm/preempt.h:22 kernel/time/timer.c:1386)
[ 4423.001809] tick_periodic (kernel/time/tick-common.c:92)
[ 4423.001809] ? tick_handle_periodic (kernel/time/tick-common.c:105)
[ 4423.001809] tick_handle_periodic (kernel/time/tick-common.c:105)
[ 4423.001809] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:891)
[ 4423.001809] ? irq_enter (kernel/softirq.c:338)
[ 4423.001809] smp_apic_timer_interrupt (./arch/x86/include/asm/apic.h:650 arch/x86/kernel/apic/apic.c:915)
[ 4423.001809] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
[ 4423.001809] <EOI> ? remove_wait_queue (include/linux/wait.h:145 kernel/sched/wait.c:50)
[ 4423.001809] ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/paravirt.h:809 include/linux/spinlock_api_smp.h:162 kernel/locking/spinlock.c:191)
[ 4423.001809] remove_wait_queue (kernel/sched/wait.c:52)
[ 4423.001809] do_wait (kernel/exit.c:1465 (discriminator 1))
[ 4423.001809] ? wait_consider_task (kernel/exit.c:1465)
[ 4423.001809] ? find_get_pid (kernel/pid.c:490)
[ 4423.001809] SyS_wait4 (kernel/exit.c:1618 kernel/exit.c:1586)
[ 4423.001809] ? SyS_waitid (kernel/exit.c:1586)
[ 4423.001809] ? kill_orphaned_pgrp (kernel/exit.c:1444)
[ 4423.001809] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1592)
[ 4423.001809] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
[ 4423.001809] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)


Thanks,
Sasha


* Re: rcu: frequent rcu lockups
  2015-03-12 12:28               ` Sasha Levin
@ 2015-03-12 12:32                 ` Sasha Levin
  2015-03-12 14:07                   ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2015-03-12 12:32 UTC (permalink / raw)
  To: paulmck
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On 03/12/2015 08:28 AM, Sasha Levin wrote:
> On 03/11/2015 07:16 PM, Paul E. McKenney wrote:
>> > On Wed, Mar 11, 2015 at 07:06:40PM -0400, Sasha Levin wrote:
>>>> >> > On 03/11/2015 07:01 PM, Paul E. McKenney wrote:
>>>>>>>> >>>> > >> With the commit I didn't hit it yet, but I do see 4 different WARNings:
>>>>>> >>> > > I wish that I could say that I am surprised, but the sad fact is that
>>>>>> >>> > > I am still shaking the bugs out.  
>>>> >> > 
>>>> >> > I have one more to add:
>>>> >> > 
>>>> >> > [   93.330539] WARNING: CPU: 1 PID: 8 at kernel/rcu/tree_plugin.h:476 rcu_gp_kthread+0x1eaa/0x4dd0()
>> > A bit different, but still in the class of a combining-tree bitmask
>> > handling bug.
> I left it overnight, and am still seeing hangs, although (don't hold me to
> this) there seem to be significantly fewer of them.

In one of the cases, KASan ended up complaining about something odd going on in
rcu_check_callbacks():

[ 7170.779042] BUG: KASan: out of bounds on stack in sched_show_task+0x324/0x3d0 at addr ffff88075315fdd8
[ 7170.779042] Read of size 8 by task trinity-c0/15862
[ 7170.779042] page:ffffea001d4c57c0 count:0 mapcount:0 mapping:          (null) index:0x0
[ 7170.779042] flags: 0x5afffff80000000()
[ 7170.779042] page dumped because: kasan: bad access detected
[ 7170.779042] CPU: 0 PID: 15862 Comm: trinity-c0 Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48-dirty #2052
[ 7170.779042]  ffff88075315fdd8 00000000ac401e1d ffff880050607cd8 ffffffff9da97e89
[ 7170.779042]  1ffffd4003a98aff ffff880050607d68 ffff880050607d58 ffffffff9360eda4
[ 7170.779042]  ffff880050607d98 ffffffff9da8c9df 0000000000000082 ffffffffa0260429
[ 7170.779042] Call Trace:
[ 7170.779042] <IRQ> dump_stack (lib/dump_stack.c:52)
[ 7170.779042] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[ 7170.779042] ? printk (kernel/printk/printk.c:1854)
[ 7170.779042] ? bitmap_weight (include/linux/bitmap.h:303)
[ 7170.779042] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
[ 7170.779042] ? sched_show_task (kernel/sched/core.c:4526)
[ 7170.779042] sched_show_task (kernel/sched/core.c:4526)
[ 7170.779042] rcu_check_callbacks (kernel/rcu/tree.c:1225 kernel/rcu/tree.c:1331 kernel/rcu/tree.c:3400 kernel/rcu/tree.c:3464 kernel/rcu/tree.c:2682)
[ 7170.779042] update_process_times (./arch/x86/include/asm/preempt.h:22 kernel/time/timer.c:1386)
[ 7170.779042] tick_periodic (kernel/time/tick-common.c:92)
[ 7170.779042] ? tick_handle_periodic (kernel/time/tick-common.c:105)
[ 7170.779042] tick_handle_periodic (kernel/time/tick-common.c:105)
[ 7170.779042] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:891)
[ 7170.779042] ? irq_enter (kernel/softirq.c:338)
[ 7170.779042] smp_apic_timer_interrupt (./arch/x86/include/asm/apic.h:650 arch/x86/kernel/apic/apic.c:915)
[ 7170.779042] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
[ 7170.779042] <EOI> ? do_raw_read_lock (include/asm-generic/qrwlock.h:104 kernel/locking/spinlock_debug.c:204)
[ 7170.779042] _raw_read_lock (kernel/locking/spinlock.c:224)
[ 7170.779042] ? do_wait (kernel/exit.c:1498 (discriminator 1))
[ 7170.779042] do_wait (kernel/exit.c:1498 (discriminator 1))
[ 7170.779042] ? wait_consider_task (kernel/exit.c:1465)
[ 7170.779042] ? find_get_pid (kernel/pid.c:490)
[ 7170.779042] SyS_wait4 (kernel/exit.c:1618 kernel/exit.c:1586)
[ 7170.779042] ? SyS_waitid (kernel/exit.c:1586)
[ 7170.779042] ? kill_orphaned_pgrp (kernel/exit.c:1444)
[ 7170.779042] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1592)
[ 7170.779042] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
[ 7170.779042] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)
[ 7170.779042] Memory state around the buggy address:
[ 7170.779042]  ffff88075315fc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 7170.779042]  ffff88075315fd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 7170.779042] >ffff88075315fd80: 00 f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 f4 f4
[ 7170.779042]                                                     ^
[ 7170.779042]  ffff88075315fe00: f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1
[ 7170.779042]  ffff88075315fe80: f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 f4 00


Thanks,
Sasha


* Re: rcu: frequent rcu lockups
  2015-03-12 12:32                 ` Sasha Levin
@ 2015-03-12 14:07                   ` Paul E. McKenney
  2015-03-13 17:39                     ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-03-12 14:07 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On Thu, Mar 12, 2015 at 08:32:05AM -0400, Sasha Levin wrote:
> On 03/12/2015 08:28 AM, Sasha Levin wrote:
> > On 03/11/2015 07:16 PM, Paul E. McKenney wrote:
> >> > On Wed, Mar 11, 2015 at 07:06:40PM -0400, Sasha Levin wrote:
> >>>> >> > On 03/11/2015 07:01 PM, Paul E. McKenney wrote:
> >>>>>>>> >>>> > >> With the commit I didn't hit it yet, but I do see 4 different WARNings:
> >>>>>> >>> > > I wish that I could say that I am surprised, but the sad fact is that
> >>>>>> >>> > > I am still shaking the bugs out.  
> >>>> >> > 
> >>>> >> > I have one more to add:
> >>>> >> > 
> >>>> >> > [   93.330539] WARNING: CPU: 1 PID: 8 at kernel/rcu/tree_plugin.h:476 rcu_gp_kthread+0x1eaa/0x4dd0()
> >> > A bit different, but still in the class of a combining-tree bitmask
> >> > handling bug.
> > I left it overnight, and am still seeing hangs, although (don't hold me to
> > this) there seem to be significantly fewer of them.
> 
> In one of the cases, KASan ended up complaining about something odd going on in
> rcu_check_callbacks():

Hmmm...  Looks like I have a synchronization bug between RCU CPU stall
warnings and task exit or some such.  Promising clue, thank you!

							Thanx, Paul

> [ 7170.779042] BUG: KASan: out of bounds on stack in sched_show_task+0x324/0x3d0 at addr ffff88075315fdd8
> [ 7170.779042] Read of size 8 by task trinity-c0/15862
> [ 7170.779042] page:ffffea001d4c57c0 count:0 mapcount:0 mapping:          (null) index:0x0
> [ 7170.779042] flags: 0x5afffff80000000()
> [ 7170.779042] page dumped because: kasan: bad access detected
> [ 7170.779042] CPU: 0 PID: 15862 Comm: trinity-c0 Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48-dirty #2052
> [ 7170.779042]  ffff88075315fdd8 00000000ac401e1d ffff880050607cd8 ffffffff9da97e89
> [ 7170.779042]  1ffffd4003a98aff ffff880050607d68 ffff880050607d58 ffffffff9360eda4
> [ 7170.779042]  ffff880050607d98 ffffffff9da8c9df 0000000000000082 ffffffffa0260429
> [ 7170.779042] Call Trace:
> [ 7170.779042] <IRQ> dump_stack (lib/dump_stack.c:52)
> [ 7170.779042] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
> [ 7170.779042] ? printk (kernel/printk/printk.c:1854)
> [ 7170.779042] ? bitmap_weight (include/linux/bitmap.h:303)
> [ 7170.779042] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
> [ 7170.779042] ? sched_show_task (kernel/sched/core.c:4526)
> [ 7170.779042] sched_show_task (kernel/sched/core.c:4526)
> [ 7170.779042] rcu_check_callbacks (kernel/rcu/tree.c:1225 kernel/rcu/tree.c:1331 kernel/rcu/tree.c:3400 kernel/rcu/tree.c:3464 kernel/rcu/tree.c:2682)
> [ 7170.779042] update_process_times (./arch/x86/include/asm/preempt.h:22 kernel/time/timer.c:1386)
> [ 7170.779042] tick_periodic (kernel/time/tick-common.c:92)
> [ 7170.779042] ? tick_handle_periodic (kernel/time/tick-common.c:105)
> [ 7170.779042] tick_handle_periodic (kernel/time/tick-common.c:105)
> [ 7170.779042] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:891)
> [ 7170.779042] ? irq_enter (kernel/softirq.c:338)
> [ 7170.779042] smp_apic_timer_interrupt (./arch/x86/include/asm/apic.h:650 arch/x86/kernel/apic/apic.c:915)
> [ 7170.779042] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
> [ 7170.779042] <EOI> ? do_raw_read_lock (include/asm-generic/qrwlock.h:104 kernel/locking/spinlock_debug.c:204)
> [ 7170.779042] _raw_read_lock (kernel/locking/spinlock.c:224)
> [ 7170.779042] ? do_wait (kernel/exit.c:1498 (discriminator 1))
> [ 7170.779042] do_wait (kernel/exit.c:1498 (discriminator 1))
> [ 7170.779042] ? wait_consider_task (kernel/exit.c:1465)
> [ 7170.779042] ? find_get_pid (kernel/pid.c:490)
> [ 7170.779042] SyS_wait4 (kernel/exit.c:1618 kernel/exit.c:1586)
> [ 7170.779042] ? SyS_waitid (kernel/exit.c:1586)
> [ 7170.779042] ? kill_orphaned_pgrp (kernel/exit.c:1444)
> [ 7170.779042] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1592)
> [ 7170.779042] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
> [ 7170.779042] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)
> [ 7170.779042] Memory state around the buggy address:
> [ 7170.779042]  ffff88075315fc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 7170.779042]  ffff88075315fd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 7170.779042] >ffff88075315fd80: 00 f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 f4 f4
> [ 7170.779042]                                                     ^
> [ 7170.779042]  ffff88075315fe00: f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1
> [ 7170.779042]  ffff88075315fe80: f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 f4 00
> 
> 
> Thanks,
> Sasha
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: rcu: frequent rcu lockups
  2015-03-12 14:07                   ` Paul E. McKenney
@ 2015-03-13 17:39                     ` Paul E. McKenney
  2015-03-15 20:43                       ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-03-13 17:39 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On Thu, Mar 12, 2015 at 07:07:49AM -0700, Paul E. McKenney wrote:
> On Thu, Mar 12, 2015 at 08:32:05AM -0400, Sasha Levin wrote:
> > On 03/12/2015 08:28 AM, Sasha Levin wrote:
> > > On 03/11/2015 07:16 PM, Paul E. McKenney wrote:
> > >> > On Wed, Mar 11, 2015 at 07:06:40PM -0400, Sasha Levin wrote:
> > >>>> >> > On 03/11/2015 07:01 PM, Paul E. McKenney wrote:
> > >>>>>>>> >>>> > >> With the commit I didn't hit it yet, but I do see 4 different WARNings:
> > >>>>>> >>> > > I wish that I could say that I am surprised, but the sad fact is that
> > >>>>>> >>> > > I am still shaking the bugs out.  
> > >>>> >> > 
> > >>>> >> > I have one more to add:
> > >>>> >> > 
> > >>>> >> > [   93.330539] WARNING: CPU: 1 PID: 8 at kernel/rcu/tree_plugin.h:476 rcu_gp_kthread+0x1eaa/0x4dd0()
> > >> > A bit different, but still in the class of a combining-tree bitmask
> > >> > handling bug.
> > > I left it overnight, and am still seeing hangs. Although (and don't hold me
> > > to that) they seem to be significantly less frequent.
> > 
> > In one of the cases, KASan ended up complaining about something odd going on in
> > rcu_check_callbacks():
> 
> Hmmm...  Looks like I have a synchronization blow-up between RCU CPU stall
> warnings and task exit or some such.  Promising clue, thank you!

But I am not seeing it.  There should be no synchronization issues
with sched_show_task(current), as the current task cannot exit while
it is calling sched_show_task(current).  There is a remote-task
sched_show_task(t), but it is invoked only on tasks that are on RCU's
list of tasks that have blocked within an RCU read-side critical section
during the current grace period.  Such a task might exit while in this
critical section, but in that case exit_rcu() will take care of this, and
it will acquire the same lock held across the call to sched_show_task(t).
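
A minimal sketch of the pattern, assuming the ->blkd_tasks walk and the
exit_rcu() dequeue both run under the leaf rcu_node ->lock -- names and
list handling are approximate, not verbatim kernel code:

	/* Sketch: the stall-warning side dumps blocked tasks under rnp->lock. */
	static void stall_dump_blocked_tasks(struct rcu_node *rnp)
	{
		unsigned long flags;
		struct task_struct *t;

		raw_spin_lock_irqsave(&rnp->lock, flags);
		list_for_each_entry(t, &rnp->blkd_tasks, rcu_node_entry)
			sched_show_task(t);  /* t cannot be freed while rnp->lock is held */
		raw_spin_unlock_irqrestore(&rnp->lock, flags);
	}

	/*
	 * Exit side: exit_rcu() ends the read-side critical section, and the
	 * task is removed from rnp->blkd_tasks under this same rnp->lock
	 * before its task_struct can go away, so the dump above is safe.
	 */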

I suppose this could happen if someone re-enabled preemption late
in the exit() code path and then entered an RCU read-side critical
section, but that would be a bad breakage of the exit() path.

Am I missing something here?

							Thanx, Paul

> > [ 7170.779042] BUG: KASan: out of bounds on stack in sched_show_task+0x324/0x3d0 at addr ffff88075315fdd8
> > [ 7170.779042] Read of size 8 by task trinity-c0/15862
> > [ 7170.779042] page:ffffea001d4c57c0 count:0 mapcount:0 mapping:          (null) index:0x0
> > [ 7170.779042] flags: 0x5afffff80000000()
> > [ 7170.779042] page dumped because: kasan: bad access detected
> > [ 7170.779042] CPU: 0 PID: 15862 Comm: trinity-c0 Tainted: G        W       4.0.0-rc3-next-20150311-sasha-00035-g0a45b48-dirty #2052
> > [ 7170.779042]  ffff88075315fdd8 00000000ac401e1d ffff880050607cd8 ffffffff9da97e89
> > [ 7170.779042]  1ffffd4003a98aff ffff880050607d68 ffff880050607d58 ffffffff9360eda4
> > [ 7170.779042]  ffff880050607d98 ffffffff9da8c9df 0000000000000082 ffffffffa0260429
> > [ 7170.779042] Call Trace:
> > [ 7170.779042] <IRQ> dump_stack (lib/dump_stack.c:52)
> > [ 7170.779042] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
> > [ 7170.779042] ? printk (kernel/printk/printk.c:1854)
> > [ 7170.779042] ? bitmap_weight (include/linux/bitmap.h:303)
> > [ 7170.779042] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
> > [ 7170.779042] ? sched_show_task (kernel/sched/core.c:4526)
> > [ 7170.779042] sched_show_task (kernel/sched/core.c:4526)
> > [ 7170.779042] rcu_check_callbacks (kernel/rcu/tree.c:1225 kernel/rcu/tree.c:1331 kernel/rcu/tree.c:3400 kernel/rcu/tree.c:3464 kernel/rcu/tree.c:2682)
> > [ 7170.779042] update_process_times (./arch/x86/include/asm/preempt.h:22 kernel/time/timer.c:1386)
> > [ 7170.779042] tick_periodic (kernel/time/tick-common.c:92)
> > [ 7170.779042] ? tick_handle_periodic (kernel/time/tick-common.c:105)
> > [ 7170.779042] tick_handle_periodic (kernel/time/tick-common.c:105)
> > [ 7170.779042] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:891)
> > [ 7170.779042] ? irq_enter (kernel/softirq.c:338)
> > [ 7170.779042] smp_apic_timer_interrupt (./arch/x86/include/asm/apic.h:650 arch/x86/kernel/apic/apic.c:915)
> > [ 7170.779042] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
> > [ 7170.779042] <EOI> ? do_raw_read_lock (include/asm-generic/qrwlock.h:104 kernel/locking/spinlock_debug.c:204)
> > [ 7170.779042] _raw_read_lock (kernel/locking/spinlock.c:224)
> > [ 7170.779042] ? do_wait (kernel/exit.c:1498 (discriminator 1))
> > [ 7170.779042] do_wait (kernel/exit.c:1498 (discriminator 1))
> > [ 7170.779042] ? wait_consider_task (kernel/exit.c:1465)
> > [ 7170.779042] ? find_get_pid (kernel/pid.c:490)
> > [ 7170.779042] SyS_wait4 (kernel/exit.c:1618 kernel/exit.c:1586)
> > [ 7170.779042] ? SyS_waitid (kernel/exit.c:1586)
> > [ 7170.779042] ? kill_orphaned_pgrp (kernel/exit.c:1444)
> > [ 7170.779042] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1592)
> > [ 7170.779042] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
> > [ 7170.779042] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)
> > [ 7170.779042] Memory state around the buggy address:
> > [ 7170.779042]  ffff88075315fc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [ 7170.779042]  ffff88075315fd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [ 7170.779042] >ffff88075315fd80: 00 f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 f4 f4
> > [ 7170.779042]                                                     ^
> > [ 7170.779042]  ffff88075315fe00: f4 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1
> > [ 7170.779042]  ffff88075315fe80: f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 f4 00
> > 
> > 
> > Thanks,
> > Sasha
> > 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: rcu: frequent rcu lockups
  2015-03-13 17:39                     ` Paul E. McKenney
@ 2015-03-15 20:43                       ` Paul E. McKenney
  2015-03-15 23:32                         ` Sasha Levin
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-03-15 20:43 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On Fri, Mar 13, 2015 at 10:39:43AM -0700, Paul E. McKenney wrote:
> On Thu, Mar 12, 2015 at 07:07:49AM -0700, Paul E. McKenney wrote:
> > On Thu, Mar 12, 2015 at 08:32:05AM -0400, Sasha Levin wrote:
> > > On 03/12/2015 08:28 AM, Sasha Levin wrote:
> > > > On 03/11/2015 07:16 PM, Paul E. McKenney wrote:
> > > >> > On Wed, Mar 11, 2015 at 07:06:40PM -0400, Sasha Levin wrote:
> > > >>>> >> > On 03/11/2015 07:01 PM, Paul E. McKenney wrote:
> > > >>>>>>>> >>>> > >> With the commit I didn't hit it yet, but I do see 4 different WARNings:
> > > >>>>>> >>> > > I wish that I could say that I am surprised, but the sad fact is that
> > > >>>>>> >>> > > I am still shaking the bugs out.  
> > > >>>> >> > 
> > > >>>> >> > I have one more to add:
> > > >>>> >> > 
> > > >>>> >> > [   93.330539] WARNING: CPU: 1 PID: 8 at kernel/rcu/tree_plugin.h:476 rcu_gp_kthread+0x1eaa/0x4dd0()
> > > >> > A bit different, but still in the class of a combining-tree bitmask
> > > >> > handling bug.
> > > > I left it overnight, and am still seeing hangs. Although (and don't hold me
> > > > to that) they seem to be significantly less frequent.
> > > 
> > > In one of the cases, KASan ended up complaining about something odd going on in
> > > rcu_check_callbacks():
> > 
> > Hmmm...  Looks like I have a synchronization blow-up between RCU CPU stall
> > warnings and task exit or some such.  Promising clue, thank you!
> 
> But I am not seeing it.  There should be no synchronization issues
> with sched_show_task(current), as the current task cannot exit while
> it is calling sched_show_task(current).  There is a remote-task
> sched_show_task(t), but it is invoked only on tasks that are on RCU's
> list of tasks that have blocked within an RCU read-side critical section
> during the current grace period.  Such a task might exit while in this
> critical section, but in that case exit_rcu() will take care of this, and
> it will acquire the same lock held across the call to sched_show_task(t).
> 
> I suppose this could happen if someone re-enabled preemption late
> in the exit() code path and then entered an RCU read-side critical
> section, but that would be a bad breakage of the exit() path.
> 
> Am I missing something here?

But I did find a bug that would result in the other warnings, and could
also result in too-short grace periods, which could in turn result in
arbitrarily arbitrary misbehavior.  The patch below, which is also on
its way into -next, should fix this.  Please let me know how it does
for you.

							Thanx, Paul

------------------------------------------------------------------------

    rcu: Associate quiescent-state reports with grace period
    
    As noted in earlier commit logs, CPU hotplug operations running
    concurrently with grace-period initialization can result in a given
    leaf rcu_node structure having all CPUs offline and no blocked readers,
    but with this rcu_node structure nevertheless blocking the current
    grace period.  Therefore, the quiescent-state forcing code now checks
    for this situation and repairs it.
    
    Unfortunately, this checking can result in false positives, for example,
    when the last task has just removed itself from this leaf rcu_node
    structure, but has not yet started clearing the ->qsmask bits further
    up the structure.  This means that the grace-period kthread (which
    forces quiescent states) and some other task might be attempting to
    concurrently clear these ->qsmask bits.  This is usually not a problem:
    One of these tasks will be the first to acquire the upper-level rcu_node
    structure's lock and will therefore clear the bit, and the other task,
    seeing the bit already cleared, will stop trying to clear bits.
    
    Sadly, this means that the following unusual sequence of events -can-
    result in a problem:
    
    1.	The grace-period kthread wins, and clears the ->qsmask bits.
    
    2.	This is the last thing blocking the current grace period, so
    	that the grace-period kthread clears ->qsmask bits all the way
    	to the root and finds that the root ->qsmask field is now zero.
    
    3.	Another grace period is required, so that the grace period kthread
    	initializes it, including setting all the needed qsmask bits.
    
    4.	The leaf rcu_node structure (the one that started this whole
    	mess) is blocking this new grace period, either because it
    	has at least one online CPU or because there is at least one
    	task that had blocked within an RCU read-side critical section
    	while running on one of this leaf rcu_node structure's CPUs.
    	(And yes, that CPU might well have gone offline before the
    	grace period in step (3) above started, which can mean that
    	there is a task on the leaf rcu_node structure's ->blkd_tasks
    	list, but ->qsmask equal to zero.)
    
    5.	The other kthread didn't get around to trying to clear the upper
    	level ->qsmask bits until all the above had happened.  This means
    	that it now sees bits set in the upper-level ->qsmask field, so it
    	proceeds to clear them.  Too bad that it is doing so on behalf of
    	a quiescent state that does not apply to the current grace period!
    
    This sequence of events can result in the new grace period being too
    short.  It can also result in the new grace period ending before the
    leaf rcu_node structure's ->qsmask bits have been cleared, which will
    result in splats during initialization of the next grace period.  In
    addition, it can result in tasks blocking the new grace period still
    being queued at the start of the next grace period, which will result
    in other splats.  Sasha's testing turned up another of these splats,
    as did rcutorture testing.  (And yes, rcutorture is being adjusted to
    make these splats show up more quickly.  Which probably is having the
    undesirable side effect of making other problems show up less quickly.
    Can't have everything!)
    
    Reported-by: Sasha Levin <sasha.levin@oracle.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 664b052f82a9..098046362b39 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2162,14 +2162,14 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags)
  */
 static void
 rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
-		  struct rcu_node *rnp, unsigned long flags)
+		  struct rcu_node *rnp, unsigned long gps, unsigned long flags)
 	__releases(rnp->lock)
 {
 	struct rcu_node *rnp_c;
 
 	/* Walk up the rcu_node hierarchy. */
 	for (;;) {
-		if (!(rnp->qsmask & mask)) {
+		if (!(rnp->qsmask & mask) || rnp->gpnum != gps) {
 
 			/* Our bit has already been cleared, so done. */
 			raw_spin_unlock_irqrestore(&rnp->lock, flags);
@@ -2220,6 +2220,7 @@ static void rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
 				      struct rcu_node *rnp, unsigned long flags)
 	__releases(rnp->lock)
 {
+	unsigned long gps;
 	unsigned long mask;
 	struct rcu_node *rnp_p;
 
@@ -2240,11 +2241,12 @@ static void rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
 	}
 
 	/* Report up the rest of the hierarchy. */
+	gps = rnp->gpnum;
 	mask = rnp->grpmask;
 	raw_spin_unlock(&rnp->lock);	/* irqs remain disabled. */
 	raw_spin_lock(&rnp_p->lock);	/* irqs already disabled. */
 	smp_mb__after_unlock_lock();
-	rcu_report_qs_rnp(mask, rsp, rnp_p, flags);
+	rcu_report_qs_rnp(mask, rsp, rnp_p, gps, flags);
 }
 
 /*
@@ -2295,7 +2297,8 @@ rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp)
 		 */
 		needwake = rcu_accelerate_cbs(rsp, rnp, rdp);
 
-		rcu_report_qs_rnp(mask, rsp, rnp, flags); /* rlses rnp->lock */
+		rcu_report_qs_rnp(mask, rsp, rnp, rnp->gpnum, flags);
+		/* ^^^ Released rnp->lock */
 		if (needwake)
 			rcu_gp_kthread_wake(rsp);
 	}
@@ -2754,8 +2757,8 @@ static void force_qs_rnp(struct rcu_state *rsp,
 			}
 		}
 		if (mask != 0) {
-			/* Idle/offline CPUs, report. */
-			rcu_report_qs_rnp(mask, rsp, rnp, flags);
+			/* Idle/offline CPUs, report (releases rnp->lock). */
+			rcu_report_qs_rnp(mask, rsp, rnp, rnp->gpnum, flags);
 		} else {
 			/* Nothing to do here, so just drop the lock. */
 			raw_spin_unlock_irqrestore(&rnp->lock, flags);
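
As a side note, the essential idea of the change above can be restated
outside of any kernel context -- a toy sketch with invented names, not
part of the patch: a deferred quiescent-state report carries the
grace-period number it was generated under, and is simply dropped if
that grace period has already ended.

	/* Toy illustration only; struct and function names are made up. */
	struct toy_node {
		unsigned long gpnum;	/* number of the current grace period */
		unsigned long qsmask;	/* bits still blocking that grace period */
	};

	/* Caller samples gps = node->gpnum under the node's lock when the
	 * report is generated, and still holds that lock here. */
	static void toy_report_qs(struct toy_node *node, unsigned long mask,
				  unsigned long gps)
	{
		if (!(node->qsmask & mask) || node->gpnum != gps)
			return;			/* already cleared, or a stale report */
		node->qsmask &= ~mask;
	}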


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: rcu: frequent rcu lockups
  2015-03-15 20:43                       ` Paul E. McKenney
@ 2015-03-15 23:32                         ` Sasha Levin
  2015-03-16  0:02                           ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2015-03-15 23:32 UTC (permalink / raw)
  To: paulmck
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On 03/15/2015 04:43 PM, Paul E. McKenney wrote:
> But I did find a bug that would result in the other warnings, and could
> also result in too-short grace periods, which could in turn result in
> arbitrarily arbitrary misbehavior.  The patch below, which is also on
> its way into -next, should fix this.  Please let me know how it does
> for you.

I've stopped seeing the warnings I've previously reported, but started
seeing a new one:

[  788.564596] WARNING: CPU: 12 PID: 9711 at kernel/rcu/tree.c:2201 rcu_report_qs_rnp+0x42e/0x5a0()
[  788.568123] Modules linked in:
[  788.568123] CPU: 12 PID: 9711 Comm: trinity-main Not tainted 4.0.0-rc3-next-20150313-sasha-00041-g83a3dc8-dirty #2078
[  788.568123]  ffff8803a1ba0000 00000000400df16a ffff880442807cc8 ffffffffb1ab01ca
[  788.568123]  0000000000000000 0000000000000000 ffff880442807d18 ffffffffa71e261a
[  788.568123]  dffffc0000000000 ffffffffa733d2ee ffff880442807d28 ffffffffb4724000
[  788.568123] Call Trace:
[  788.568123] <IRQ> dump_stack (lib/dump_stack.c:52)
[  788.568123] warn_slowpath_common (kernel/panic.c:447)
[  788.568123] ? rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
[  788.568123] warn_slowpath_null (kernel/panic.c:481)
[  788.568123] rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
[  788.568123] rcu_process_callbacks (kernel/rcu/tree.c:2302 kernel/rcu/tree.c:2338 kernel/rcu/tree.c:2824 kernel/rcu/tree.c:2857)
[  788.568123] __do_softirq (kernel/softirq.c:273 include/linux/jump_label.h:114 include/trace/events/irq.h:126 kernel/softirq.c:274)
[  788.568123] irq_exit (kernel/softirq.c:350 kernel/softirq.c:391)
[  788.568123] smp_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:918)
[  788.568123] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
[  788.568123] <EOI> ? mark_held_locks (kernel/locking/lockdep.c:2525)
[  788.568123] finish_task_switch (kernel/sched/core.c:2231)
[  788.568123] __schedule (kernel/sched/core.c:2337 kernel/sched/core.c:2795)
[  788.568123] schedule (./arch/x86/include/asm/bitops.h:311 (discriminator 1) kernel/sched/core.c:2824 (discriminator 1))
[  788.568123] schedule_preempt_disabled (kernel/sched/core.c:2856)
[  788.568123] mutex_lock_nested (kernel/locking/mutex.c:585 kernel/locking/mutex.c:623)
[  788.568123] kernfs_iop_permission (fs/kernfs/inode.c:366)
[  788.568123] __inode_permission (fs/namei.c:374 fs/namei.c:408)
[  788.568123] inode_permission (fs/namei.c:460)
[  788.568123] link_path_walk (fs/namei.c:1520 fs/namei.c:1782)
[  788.568123] path_init (fs/namei.c:1947)
[  788.568123] path_lookupat (fs/namei.c:1989)
[  788.568123] filename_lookup (fs/namei.c:2025)
[  788.568123] user_path_at_empty (fs/namei.c:2189)
[  788.568123] user_path_at (fs/namei.c:2200)
[  788.568123] vfs_fstatat (fs/stat.c:106)
[  788.568123] SYSC_newfstatat (fs/stat.c:298)
[  788.568123] SyS_newfstatat (fs/stat.c:291)
[  788.568123] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: rcu: frequent rcu lockups
  2015-03-15 23:32                         ` Sasha Levin
@ 2015-03-16  0:02                           ` Paul E. McKenney
  2015-03-20 14:15                             ` Sasha Levin
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-03-16  0:02 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On Sun, Mar 15, 2015 at 07:32:32PM -0400, Sasha Levin wrote:
> On 03/15/2015 04:43 PM, Paul E. McKenney wrote:
> > But I did find a bug that would result in the other warnings, and could
> > also result in too-short grace periods, which could in turn result in
> > arbitrarily arbitrary misbehavior.  The patch below, which is also on
> > its way into -next, should fix this.  Please let me know how it does
> > for you.
> 
> I've stopped seeing the warnings I've previously reported, but started
> seeing a new one:
> 
> [  788.564596] WARNING: CPU: 12 PID: 9711 at kernel/rcu/tree.c:2201 rcu_report_qs_rnp+0x42e/0x5a0()
> [  788.568123] Modules linked in:
> [  788.568123] CPU: 12 PID: 9711 Comm: trinity-main Not tainted 4.0.0-rc3-next-20150313-sasha-00041-g83a3dc8-dirty #2078
> [  788.568123]  ffff8803a1ba0000 00000000400df16a ffff880442807cc8 ffffffffb1ab01ca
> [  788.568123]  0000000000000000 0000000000000000 ffff880442807d18 ffffffffa71e261a
> [  788.568123]  dffffc0000000000 ffffffffa733d2ee ffff880442807d28 ffffffffb4724000
> [  788.568123] Call Trace:
> [  788.568123] <IRQ> dump_stack (lib/dump_stack.c:52)
> [  788.568123] warn_slowpath_common (kernel/panic.c:447)
> [  788.568123] ? rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
> [  788.568123] warn_slowpath_null (kernel/panic.c:481)
> [  788.568123] rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
> [  788.568123] rcu_process_callbacks (kernel/rcu/tree.c:2302 kernel/rcu/tree.c:2338 kernel/rcu/tree.c:2824 kernel/rcu/tree.c:2857)
> [  788.568123] __do_softirq (kernel/softirq.c:273 include/linux/jump_label.h:114 include/trace/events/irq.h:126 kernel/softirq.c:274)
> [  788.568123] irq_exit (kernel/softirq.c:350 kernel/softirq.c:391)
> [  788.568123] smp_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:918)
> [  788.568123] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
> [  788.568123] <EOI> ? mark_held_locks (kernel/locking/lockdep.c:2525)
> [  788.568123] finish_task_switch (kernel/sched/core.c:2231)
> [  788.568123] __schedule (kernel/sched/core.c:2337 kernel/sched/core.c:2795)
> [  788.568123] schedule (./arch/x86/include/asm/bitops.h:311 (discriminator 1) kernel/sched/core.c:2824 (discriminator 1))
> [  788.568123] schedule_preempt_disabled (kernel/sched/core.c:2856)
> [  788.568123] mutex_lock_nested (kernel/locking/mutex.c:585 kernel/locking/mutex.c:623)
> [  788.568123] kernfs_iop_permission (fs/kernfs/inode.c:366)
> [  788.568123] __inode_permission (fs/namei.c:374 fs/namei.c:408)
> [  788.568123] inode_permission (fs/namei.c:460)
> [  788.568123] link_path_walk (fs/namei.c:1520 fs/namei.c:1782)
> [  788.568123] path_init (fs/namei.c:1947)
> [  788.568123] path_lookupat (fs/namei.c:1989)
> [  788.568123] filename_lookup (fs/namei.c:2025)
> [  788.568123] user_path_at_empty (fs/namei.c:2189)
> [  788.568123] user_path_at (fs/namei.c:2200)
> [  788.568123] vfs_fstatat (fs/stat.c:106)
> [  788.568123] SYSC_newfstatat (fs/stat.c:298)
> [  788.568123] SyS_newfstatat (fs/stat.c:291)
> [  788.568123] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)

OK, I guess it would help to update the WARN_ON()s while I am at it.  :-/

Here is an updated patch that replaces the one resulting in the above
splat.
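
Relative to the previous version, the change is confined to the warning
logic (this is a paraphrase of the diff below, not an extra change): the
pre-existing WARN_ON_ONCE(rnp_c->qsmask), done right after stepping up to
the parent rcu_node, could fire when the child's ->qsmask had already been
re-initialized for a new grace period.  The updated patch records that
value in oldmask and checks it only after the ->gpnum test has confirmed
the report still applies to the current grace period, roughly:

	for (;;) {
		if (!(rnp->qsmask & mask) || rnp->gpnum != gps) {
			/* Already cleared, or stale report: bail without warning. */
			raw_spin_unlock_irqrestore(&rnp->lock, flags);
			return;
		}
		WARN_ON_ONCE(oldmask);	/* checked only for current-GP reports */
		/* ... clear the bit and walk up as before ... */
		oldmask = rnp_c->qsmask;	/* child state, rechecked next pass */
	}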

								Thanx, Paul

------------------------------------------------------------------------

    rcu: Associate quiescent-state reports with grace period
    
    As noted in earlier commit logs, CPU hotplug operations running
    concurrently with grace-period initialization can result in a given
    leaf rcu_node structure having all CPUs offline and no blocked readers,
    but with this rcu_node structure nevertheless blocking the current
    grace period.  Therefore, the quiescent-state forcing code now checks
    for this situation and repairs it.
    
    Unfortunately, this checking can result in false positives, for example,
    when the last task has just removed itself from this leaf rcu_node
    structure, but has not yet started clearing the ->qsmask bits further
    up the structure.  This means that the grace-period kthread (which
    forces quiescent states) and some other task might be attempting to
    concurrently clear these ->qsmask bits.  This is usually not a problem:
    One of these tasks will be the first to acquire the upper-level rcu_node
    structure's lock and will therefore clear the bit, and the other task,
    seeing the bit already cleared, will stop trying to clear bits.
    
    Sadly, this means that the following unusual sequence of events -can-
    result in a problem:
    
    1.	The grace-period kthread wins, and clears the ->qsmask bits.
    
    2.	This is the last thing blocking the current grace period, so
    	that the grace-period kthread clears ->qsmask bits all the way
    	to the root and finds that the root ->qsmask field is now zero.
    
    3.	Another grace period is required, so that the grace period kthread
    	initializes it, including setting all the needed qsmask bits.
    
    4.	The leaf rcu_node structure (the one that started this whole
    	mess) is blocking this new grace period, either because it
    	has at least one online CPU or because there is at least one
    	task that had blocked within an RCU read-side critical section
    	while running on one of this leaf rcu_node structure's CPUs.
    	(And yes, that CPU might well have gone offline before the
    	grace period in step (3) above started, which can mean that
    	there is a task on the leaf rcu_node structure's ->blkd_tasks
    	list, but ->qsmask equal to zero.)
    
    5.	The other kthread didn't get around to trying to clear the upper
    	level ->qsmask bits until all the above had happened.  This means
    	that it now sees bits set in the upper-level ->qsmask field, so it
    	proceeds to clear them.  Too bad that it is doing so on behalf of
    	a quiescent state that does not apply to the current grace period!
    
    This sequence of events can result in the new grace period being too
    short.  It can also result in the new grace period ending before the
    leaf rcu_node structure's ->qsmask bits have been cleared, which will
    result in splats during initialization of the next grace period.  In
    addition, it can result in tasks blocking the new grace period still
    being queued at the start of the next grace period, which will result
    in other splats.  Sasha's testing turned up another of these splats,
    as did rcutorture testing.  (And yes, rcutorture is being adjusted to
    make these splats show up more quickly.  Which probably is having the
    undesirable side effect of making other problems show up less quickly.
    Can't have everything!)
    
    Reported-by: Sasha Levin <sasha.levin@oracle.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 664b052f82a9..e8699fb94ed2 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2156,25 +2156,32 @@ static void rcu_report_qs_rsp(struct rcu_state *rsp, unsigned long flags)
  * Similar to rcu_report_qs_rdp(), for which it is a helper function.
  * Allows quiescent states for a group of CPUs to be reported at one go
  * to the specified rcu_node structure, though all the CPUs in the group
- * must be represented by the same rcu_node structure (which need not be
- * a leaf rcu_node structure, though it often will be).  That structure's
- * lock must be held upon entry, and it is released before return.
+ * must be represented by the same rcu_node structure (which need not be a
+ * leaf rcu_node structure, though it often will be).  The gps parameter
+ * is the grace-period snapshot, which means that the quiescent states
+ * are valid only if rnp->gpnum is equal to gps.  That structure's lock
+ * must be held upon entry, and it is released before return.
  */
 static void
 rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
-		  struct rcu_node *rnp, unsigned long flags)
+		  struct rcu_node *rnp, unsigned long gps, unsigned long flags)
 	__releases(rnp->lock)
 {
+	unsigned long oldmask = 0;
 	struct rcu_node *rnp_c;
 
 	/* Walk up the rcu_node hierarchy. */
 	for (;;) {
-		if (!(rnp->qsmask & mask)) {
+		if (!(rnp->qsmask & mask) || rnp->gpnum != gps) {
 
-			/* Our bit has already been cleared, so done. */
+			/*
+			 * Our bit has already been cleared, or the
+			 * relevant grace period is already over, so done.
+			 */
 			raw_spin_unlock_irqrestore(&rnp->lock, flags);
 			return;
 		}
+		WARN_ON_ONCE(oldmask); /* Any child must be all zeroed! */
 		rnp->qsmask &= ~mask;
 		trace_rcu_quiescent_state_report(rsp->name, rnp->gpnum,
 						 mask, rnp->qsmask, rnp->level,
@@ -2198,7 +2205,7 @@ rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
 		rnp = rnp->parent;
 		raw_spin_lock_irqsave(&rnp->lock, flags);
 		smp_mb__after_unlock_lock();
-		WARN_ON_ONCE(rnp_c->qsmask);
+		oldmask = rnp_c->qsmask;
 	}
 
 	/*
@@ -2220,6 +2227,7 @@ static void rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
 				      struct rcu_node *rnp, unsigned long flags)
 	__releases(rnp->lock)
 {
+	unsigned long gps;
 	unsigned long mask;
 	struct rcu_node *rnp_p;
 
@@ -2239,12 +2247,13 @@ static void rcu_report_unblock_qs_rnp(struct rcu_state *rsp,
 		return;
 	}
 
-	/* Report up the rest of the hierarchy. */
+	/* Report up the rest of the hierarchy, tracking current ->gpnum. */
+	gps = rnp->gpnum;
 	mask = rnp->grpmask;
 	raw_spin_unlock(&rnp->lock);	/* irqs remain disabled. */
 	raw_spin_lock(&rnp_p->lock);	/* irqs already disabled. */
 	smp_mb__after_unlock_lock();
-	rcu_report_qs_rnp(mask, rsp, rnp_p, flags);
+	rcu_report_qs_rnp(mask, rsp, rnp_p, gps, flags);
 }
 
 /*
@@ -2295,7 +2304,8 @@ rcu_report_qs_rdp(int cpu, struct rcu_state *rsp, struct rcu_data *rdp)
 		 */
 		needwake = rcu_accelerate_cbs(rsp, rnp, rdp);
 
-		rcu_report_qs_rnp(mask, rsp, rnp, flags); /* rlses rnp->lock */
+		rcu_report_qs_rnp(mask, rsp, rnp, rnp->gpnum, flags);
+		/* ^^^ Released rnp->lock */
 		if (needwake)
 			rcu_gp_kthread_wake(rsp);
 	}
@@ -2754,8 +2764,8 @@ static void force_qs_rnp(struct rcu_state *rsp,
 			}
 		}
 		if (mask != 0) {
-			/* Idle/offline CPUs, report. */
-			rcu_report_qs_rnp(mask, rsp, rnp, flags);
+			/* Idle/offline CPUs, report (releases rnp->lock). */
+			rcu_report_qs_rnp(mask, rsp, rnp, rnp->gpnum, flags);
 		} else {
 			/* Nothing to do here, so just drop the lock. */
 			raw_spin_unlock_irqrestore(&rnp->lock, flags);


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: rcu: frequent rcu lockups
  2015-03-16  0:02                           ` Paul E. McKenney
@ 2015-03-20 14:15                             ` Sasha Levin
  2015-03-20 14:29                               ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2015-03-20 14:15 UTC (permalink / raw)
  To: paulmck
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On 03/15/2015 08:02 PM, Paul E. McKenney wrote:
> On Sun, Mar 15, 2015 at 07:32:32PM -0400, Sasha Levin wrote:
>> > On 03/15/2015 04:43 PM, Paul E. McKenney wrote:
>>> > > But I did find a bug that would result in the other warnings, and could
>>> > > also result in too-short grace periods, which could in turn result in
>>> > > arbitrarily arbitrary misbehavior.  The patch below, which is also on
>>> > > its way into -next, should fix this.  Please let me know how it does
>>> > > for you.
>> > 
>> > I've stopped seeing the warnings I've previously reported, but started
>> > seeing a new one:
>> > 
>> > [  788.564596] WARNING: CPU: 12 PID: 9711 at kernel/rcu/tree.c:2201 rcu_report_qs_rnp+0x42e/0x5a0()
>> > [  788.568123] Modules linked in:
>> > [  788.568123] CPU: 12 PID: 9711 Comm: trinity-main Not tainted 4.0.0-rc3-next-20150313-sasha-00041-g83a3dc8-dirty #2078
>> > [  788.568123]  ffff8803a1ba0000 00000000400df16a ffff880442807cc8 ffffffffb1ab01ca
>> > [  788.568123]  0000000000000000 0000000000000000 ffff880442807d18 ffffffffa71e261a
>> > [  788.568123]  dffffc0000000000 ffffffffa733d2ee ffff880442807d28 ffffffffb4724000
>> > [  788.568123] Call Trace:
>> > [  788.568123] <IRQ> dump_stack (lib/dump_stack.c:52)
>> > [  788.568123] warn_slowpath_common (kernel/panic.c:447)
>> > [  788.568123] ? rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
>> > [  788.568123] warn_slowpath_null (kernel/panic.c:481)
>> > [  788.568123] rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
>> > [  788.568123] rcu_process_callbacks (kernel/rcu/tree.c:2302 kernel/rcu/tree.c:2338 kernel/rcu/tree.c:2824 kernel/rcu/tree.c:2857)
>> > [  788.568123] __do_softirq (kernel/softirq.c:273 include/linux/jump_label.h:114 include/trace/events/irq.h:126 kernel/softirq.c:274)
>> > [  788.568123] irq_exit (kernel/softirq.c:350 kernel/softirq.c:391)
>> > [  788.568123] smp_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:918)
>> > [  788.568123] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
>> > [  788.568123] <EOI> ? mark_held_locks (kernel/locking/lockdep.c:2525)
>> > [  788.568123] finish_task_switch (kernel/sched/core.c:2231)
>> > [  788.568123] __schedule (kernel/sched/core.c:2337 kernel/sched/core.c:2795)
>> > [  788.568123] schedule (./arch/x86/include/asm/bitops.h:311 (discriminator 1) kernel/sched/core.c:2824 (discriminator 1))
>> > [  788.568123] schedule_preempt_disabled (kernel/sched/core.c:2856)
>> > [  788.568123] mutex_lock_nested (kernel/locking/mutex.c:585 kernel/locking/mutex.c:623)
>> > [  788.568123] kernfs_iop_permission (fs/kernfs/inode.c:366)
>> > [  788.568123] __inode_permission (fs/namei.c:374 fs/namei.c:408)
>> > [  788.568123] inode_permission (fs/namei.c:460)
>> > [  788.568123] link_path_walk (fs/namei.c:1520 fs/namei.c:1782)
>> > [  788.568123] path_init (fs/namei.c:1947)
>> > [  788.568123] path_lookupat (fs/namei.c:1989)
>> > [  788.568123] filename_lookup (fs/namei.c:2025)
>> > [  788.568123] user_path_at_empty (fs/namei.c:2189)
>> > [  788.568123] user_path_at (fs/namei.c:2200)
>> > [  788.568123] vfs_fstatat (fs/stat.c:106)
>> > [  788.568123] SYSC_newfstatat (fs/stat.c:298)
>> > [  788.568123] SyS_newfstatat (fs/stat.c:291)
>> > [  788.568123] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)
> OK, I guess it would help to update the WARN_ON()s while I am at it.  :-/
> 
> Here is an updated patch that replaces the one resulting in the above
> splat.

That did the trick for me, but I don't think it made it to -next?


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: rcu: frequent rcu lockups
  2015-03-20 14:15                             ` Sasha Levin
@ 2015-03-20 14:29                               ` Paul E. McKenney
  2015-03-20 14:46                                 ` Sasha Levin
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-03-20 14:29 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On Fri, Mar 20, 2015 at 10:15:34AM -0400, Sasha Levin wrote:
> On 03/15/2015 08:02 PM, Paul E. McKenney wrote:
> > On Sun, Mar 15, 2015 at 07:32:32PM -0400, Sasha Levin wrote:
> >> > On 03/15/2015 04:43 PM, Paul E. McKenney wrote:
> >>> > > But I did find a bug that would result in the other warnings, and could
> >>> > > also result in too-short grace periods, which could in turn result in
> >>> > > arbitrarily arbitrary misbehavior.  The patch below, which is also on
> >>> > > its way into -next, should fix this.  Please let me know how it does
> >>> > > for you.
> >> > 
> >> > I've stopped seeing the warnings I've previously reported, but started
> >> > seeing a new one:
> >> > 
> >> > [  788.564596] WARNING: CPU: 12 PID: 9711 at kernel/rcu/tree.c:2201 rcu_report_qs_rnp+0x42e/0x5a0()
> >> > [  788.568123] Modules linked in:
> >> > [  788.568123] CPU: 12 PID: 9711 Comm: trinity-main Not tainted 4.0.0-rc3-next-20150313-sasha-00041-g83a3dc8-dirty #2078
> >> > [  788.568123]  ffff8803a1ba0000 00000000400df16a ffff880442807cc8 ffffffffb1ab01ca
> >> > [  788.568123]  0000000000000000 0000000000000000 ffff880442807d18 ffffffffa71e261a
> >> > [  788.568123]  dffffc0000000000 ffffffffa733d2ee ffff880442807d28 ffffffffb4724000
> >> > [  788.568123] Call Trace:
> >> > [  788.568123] <IRQ> dump_stack (lib/dump_stack.c:52)
> >> > [  788.568123] warn_slowpath_common (kernel/panic.c:447)
> >> > [  788.568123] ? rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
> >> > [  788.568123] warn_slowpath_null (kernel/panic.c:481)
> >> > [  788.568123] rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
> >> > [  788.568123] rcu_process_callbacks (kernel/rcu/tree.c:2302 kernel/rcu/tree.c:2338 kernel/rcu/tree.c:2824 kernel/rcu/tree.c:2857)
> >> > [  788.568123] __do_softirq (kernel/softirq.c:273 include/linux/jump_label.h:114 include/trace/events/irq.h:126 kernel/softirq.c:274)
> >> > [  788.568123] irq_exit (kernel/softirq.c:350 kernel/softirq.c:391)
> >> > [  788.568123] smp_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:918)
> >> > [  788.568123] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
> >> > [  788.568123] <EOI> ? mark_held_locks (kernel/locking/lockdep.c:2525)
> >> > [  788.568123] finish_task_switch (kernel/sched/core.c:2231)
> >> > [  788.568123] __schedule (kernel/sched/core.c:2337 kernel/sched/core.c:2795)
> >> > [  788.568123] schedule (./arch/x86/include/asm/bitops.h:311 (discriminator 1) kernel/sched/core.c:2824 (discriminator 1))
> >> > [  788.568123] schedule_preempt_disabled (kernel/sched/core.c:2856)
> >> > [  788.568123] mutex_lock_nested (kernel/locking/mutex.c:585 kernel/locking/mutex.c:623)
> >> > [  788.568123] kernfs_iop_permission (fs/kernfs/inode.c:366)
> >> > [  788.568123] __inode_permission (fs/namei.c:374 fs/namei.c:408)
> >> > [  788.568123] inode_permission (fs/namei.c:460)
> >> > [  788.568123] link_path_walk (fs/namei.c:1520 fs/namei.c:1782)
> >> > [  788.568123] path_init (fs/namei.c:1947)
> >> > [  788.568123] path_lookupat (fs/namei.c:1989)
> >> > [  788.568123] filename_lookup (fs/namei.c:2025)
> >> > [  788.568123] user_path_at_empty (fs/namei.c:2189)
> >> > [  788.568123] user_path_at (fs/namei.c:2200)
> >> > [  788.568123] vfs_fstatat (fs/stat.c:106)
> >> > [  788.568123] SYSC_newfstatat (fs/stat.c:298)
> >> > [  788.568123] SyS_newfstatat (fs/stat.c:291)
> >> > [  788.568123] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)
> > OK, I guess it would help to update the WARN_ON()s while I am at it.  :-/
> > 
> > Here is an updated patch that replaces the one resulting in the above
> > splat.
> 
> That did the trick for me, but I don't think it made it to -next?

Good to hear!  May I add your Tested-by?

And thank you for the reminder on -next, pushing it out now.

								Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: rcu: frequent rcu lockups
  2015-03-20 14:29                               ` Paul E. McKenney
@ 2015-03-20 14:46                                 ` Sasha Levin
  2015-03-20 15:35                                   ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Sasha Levin @ 2015-03-20 14:46 UTC (permalink / raw)
  To: paulmck
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On 03/20/2015 10:29 AM, Paul E. McKenney wrote:
> On Fri, Mar 20, 2015 at 10:15:34AM -0400, Sasha Levin wrote:
>> On 03/15/2015 08:02 PM, Paul E. McKenney wrote:
>>> On Sun, Mar 15, 2015 at 07:32:32PM -0400, Sasha Levin wrote:
>>>>> On 03/15/2015 04:43 PM, Paul E. McKenney wrote:
>>>>>>> But I did find a bug that would result in the other warnings, and could
>>>>>>> also result in too-short grace periods, which could in turn result in
>>>>>>> arbitrarily arbitrary misbehavior.  The patch below, which is also on
>>>>>>> its way into -next, should fix this.  Please let me know how it does
>>>>>>> for you.
>>>>>
>>>>> I've stopped seeing the warnings I've previously reported, but started
>>>>> seeing a new one:
>>>>>
>>>>> [  788.564596] WARNING: CPU: 12 PID: 9711 at kernel/rcu/tree.c:2201 rcu_report_qs_rnp+0x42e/0x5a0()
>>>>> [  788.568123] Modules linked in:
>>>>> [  788.568123] CPU: 12 PID: 9711 Comm: trinity-main Not tainted 4.0.0-rc3-next-20150313-sasha-00041-g83a3dc8-dirty #2078
>>>>> [  788.568123]  ffff8803a1ba0000 00000000400df16a ffff880442807cc8 ffffffffb1ab01ca
>>>>> [  788.568123]  0000000000000000 0000000000000000 ffff880442807d18 ffffffffa71e261a
>>>>> [  788.568123]  dffffc0000000000 ffffffffa733d2ee ffff880442807d28 ffffffffb4724000
>>>>> [  788.568123] Call Trace:
>>>>> [  788.568123] <IRQ> dump_stack (lib/dump_stack.c:52)
>>>>> [  788.568123] warn_slowpath_common (kernel/panic.c:447)
>>>>> [  788.568123] ? rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
>>>>> [  788.568123] warn_slowpath_null (kernel/panic.c:481)
>>>>> [  788.568123] rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
>>>>> [  788.568123] rcu_process_callbacks (kernel/rcu/tree.c:2302 kernel/rcu/tree.c:2338 kernel/rcu/tree.c:2824 kernel/rcu/tree.c:2857)
>>>>> [  788.568123] __do_softirq (kernel/softirq.c:273 include/linux/jump_label.h:114 include/trace/events/irq.h:126 kernel/softirq.c:274)
>>>>> [  788.568123] irq_exit (kernel/softirq.c:350 kernel/softirq.c:391)
>>>>> [  788.568123] smp_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:918)
>>>>> [  788.568123] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
>>>>> [  788.568123] <EOI> ? mark_held_locks (kernel/locking/lockdep.c:2525)
>>>>> [  788.568123] finish_task_switch (kernel/sched/core.c:2231)
>>>>> [  788.568123] __schedule (kernel/sched/core.c:2337 kernel/sched/core.c:2795)
>>>>> [  788.568123] schedule (./arch/x86/include/asm/bitops.h:311 (discriminator 1) kernel/sched/core.c:2824 (discriminator 1))
>>>>> [  788.568123] schedule_preempt_disabled (kernel/sched/core.c:2856)
>>>>> [  788.568123] mutex_lock_nested (kernel/locking/mutex.c:585 kernel/locking/mutex.c:623)
>>>>> [  788.568123] kernfs_iop_permission (fs/kernfs/inode.c:366)
>>>>> [  788.568123] __inode_permission (fs/namei.c:374 fs/namei.c:408)
>>>>> [  788.568123] inode_permission (fs/namei.c:460)
>>>>> [  788.568123] link_path_walk (fs/namei.c:1520 fs/namei.c:1782)
>>>>> [  788.568123] path_init (fs/namei.c:1947)
>>>>> [  788.568123] path_lookupat (fs/namei.c:1989)
>>>>> [  788.568123] filename_lookup (fs/namei.c:2025)
>>>>> [  788.568123] user_path_at_empty (fs/namei.c:2189)
>>>>> [  788.568123] user_path_at (fs/namei.c:2200)
>>>>> [  788.568123] vfs_fstatat (fs/stat.c:106)
>>>>> [  788.568123] SYSC_newfstatat (fs/stat.c:298)
>>>>> [  788.568123] SyS_newfstatat (fs/stat.c:291)
>>>>> [  788.568123] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)
>>> OK, I guess it would help to update the WARN_ON()s while I am at it.  :-/
>>>
>>> Here is an updated patch that replaces the one resulting in the above
>>> splat.
>>
>> That did the trick for me, but I don't think it made it to -next?
> 
> Good to hear!  May I add your Tested-by?

Sure.

> And thank you for the reminder on -next, pushing it out now.

Thanks!


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: rcu: frequent rcu lockups
  2015-03-20 14:46                                 ` Sasha Levin
@ 2015-03-20 15:35                                   ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2015-03-20 15:35 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	LKML, Dave Jones

On Fri, Mar 20, 2015 at 10:46:33AM -0400, Sasha Levin wrote:
> On 03/20/2015 10:29 AM, Paul E. McKenney wrote:
> > On Fri, Mar 20, 2015 at 10:15:34AM -0400, Sasha Levin wrote:
> >> On 03/15/2015 08:02 PM, Paul E. McKenney wrote:
> >>> On Sun, Mar 15, 2015 at 07:32:32PM -0400, Sasha Levin wrote:
> >>>>> On 03/15/2015 04:43 PM, Paul E. McKenney wrote:
> >>>>>>> But I did find a bug that would result in the other warnings, and could
> >>>>>>> also result in too-short grace periods, which could in turn result in
> >>>>>>> arbitrarily arbitrary misbehavior.  The patch below, which is also on
> >>>>>>> its way into -next, should fix this.  Please let me know how it does
> >>>>>>> for you.
> >>>>>
> >>>>> I've stopped seeing the warnings I've previously reported, but started
> >>>>> seeing a new one:
> >>>>>
> >>>>> [  788.564596] WARNING: CPU: 12 PID: 9711 at kernel/rcu/tree.c:2201 rcu_report_qs_rnp+0x42e/0x5a0()
> >>>>> [  788.568123] Modules linked in:
> >>>>> [  788.568123] CPU: 12 PID: 9711 Comm: trinity-main Not tainted 4.0.0-rc3-next-20150313-sasha-00041-g83a3dc8-dirty #2078
> >>>>> [  788.568123]  ffff8803a1ba0000 00000000400df16a ffff880442807cc8 ffffffffb1ab01ca
> >>>>> [  788.568123]  0000000000000000 0000000000000000 ffff880442807d18 ffffffffa71e261a
> >>>>> [  788.568123]  dffffc0000000000 ffffffffa733d2ee ffff880442807d28 ffffffffb4724000
> >>>>> [  788.568123] Call Trace:
> >>>>> [  788.568123] <IRQ> dump_stack (lib/dump_stack.c:52)
> >>>>> [  788.568123] warn_slowpath_common (kernel/panic.c:447)
> >>>>> [  788.568123] ? rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
> >>>>> [  788.568123] warn_slowpath_null (kernel/panic.c:481)
> >>>>> [  788.568123] rcu_report_qs_rnp (kernel/rcu/tree.c:2201 (discriminator 3))
> >>>>> [  788.568123] rcu_process_callbacks (kernel/rcu/tree.c:2302 kernel/rcu/tree.c:2338 kernel/rcu/tree.c:2824 kernel/rcu/tree.c:2857)
> >>>>> [  788.568123] __do_softirq (kernel/softirq.c:273 include/linux/jump_label.h:114 include/trace/events/irq.h:126 kernel/softirq.c:274)
> >>>>> [  788.568123] irq_exit (kernel/softirq.c:350 kernel/softirq.c:391)
> >>>>> [  788.568123] smp_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:918)
> >>>>> [  788.568123] apic_timer_interrupt (arch/x86/kernel/entry_64.S:920)
> >>>>> [  788.568123] <EOI> ? mark_held_locks (kernel/locking/lockdep.c:2525)
> >>>>> [  788.568123] finish_task_switch (kernel/sched/core.c:2231)
> >>>>> [  788.568123] __schedule (kernel/sched/core.c:2337 kernel/sched/core.c:2795)
> >>>>> [  788.568123] schedule (./arch/x86/include/asm/bitops.h:311 (discriminator 1) kernel/sched/core.c:2824 (discriminator 1))
> >>>>> [  788.568123] schedule_preempt_disabled (kernel/sched/core.c:2856)
> >>>>> [  788.568123] mutex_lock_nested (kernel/locking/mutex.c:585 kernel/locking/mutex.c:623)
> >>>>> [  788.568123] kernfs_iop_permission (fs/kernfs/inode.c:366)
> >>>>> [  788.568123] __inode_permission (fs/namei.c:374 fs/namei.c:408)
> >>>>> [  788.568123] inode_permission (fs/namei.c:460)
> >>>>> [  788.568123] link_path_walk (fs/namei.c:1520 fs/namei.c:1782)
> >>>>> [  788.568123] path_init (fs/namei.c:1947)
> >>>>> [  788.568123] path_lookupat (fs/namei.c:1989)
> >>>>> [  788.568123] filename_lookup (fs/namei.c:2025)
> >>>>> [  788.568123] user_path_at_empty (fs/namei.c:2189)
> >>>>> [  788.568123] user_path_at (fs/namei.c:2200)
> >>>>> [  788.568123] vfs_fstatat (fs/stat.c:106)
> >>>>> [  788.568123] SYSC_newfstatat (fs/stat.c:298)
> >>>>> [  788.568123] SyS_newfstatat (fs/stat.c:291)
> >>>>> [  788.568123] tracesys_phase2 (arch/x86/kernel/entry_64.S:347)
> >>> OK, I guess it would help to update the WARN_ON()s while I am at it.  :-/
> >>>
> >>> Here is an updated patch that replaces the one resulting in the above
> >>> splat.
> >>
> >> That did the trick for me, but I don't think it made it to -next?
> > 
> > Good to hear!  May I add your Tested-by?
> 
> Sure.
> 
> > And thank you for the reminder on -next, pushing it out now.
> 
> Thanks!

Both done, thank you!

							Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread

Thread overview: 19+ messages
2015-03-11 19:57 rcu: frequent rcu lockups Sasha Levin
2015-03-11 20:17 ` Paul E. McKenney
2015-03-11 20:18   ` Sasha Levin
2015-03-11 20:41     ` Paul E. McKenney
2015-03-11 22:18       ` Sasha Levin
2015-03-11 23:01         ` Paul E. McKenney
2015-03-11 23:06           ` Sasha Levin
2015-03-11 23:16             ` Paul E. McKenney
2015-03-12 12:28               ` Sasha Levin
2015-03-12 12:32                 ` Sasha Levin
2015-03-12 14:07                   ` Paul E. McKenney
2015-03-13 17:39                     ` Paul E. McKenney
2015-03-15 20:43                       ` Paul E. McKenney
2015-03-15 23:32                         ` Sasha Levin
2015-03-16  0:02                           ` Paul E. McKenney
2015-03-20 14:15                             ` Sasha Levin
2015-03-20 14:29                               ` Paul E. McKenney
2015-03-20 14:46                                 ` Sasha Levin
2015-03-20 15:35                                   ` Paul E. McKenney
