LKML Archive on lore.kernel.org
* [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks
@ 2018-04-11 13:50 Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning Boqun Feng
                   ` (19 more replies)
  0 siblings, 20 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:50 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

Hi Ingo and Peter,

This is V6 for recursive read lock support in lockdep. I moved the
explanation about reasoning to patch #1, which will help understand this
whole series. This patchset is based on v4.16.

Other changes since V5:

*	Rewrite the explanation of the reasoning, focusing on the proof
	of equivalence between closed strong paths and deadlock
	possibility.

*	Rewrite the detection for the irq-safe->irq-unsafe check: not only
	do we now support deadlock detection for recursive read locks, but
	we also save two BFS searches (one backwards and one forwards) in
	the detection. Thanks a lot for the discussion with Peter Zijlstra.

*	Annotate SRCU related primitives with 'check' lockdep
	annotations, so that we can detect deadlocks related to SRCU.
	A self test case is also added. The use case is provided by Paul
	E. McKenney.

*	Make __bfs(.match) return bool, as suggested by Peter Zijlstra.

*	Improve the readability of the code based on good suggestions from
	Peter Zijlstra. Hopefully nobody's brain gets hurt this time ;-)

*	Minor fixes for typos.

V1: https://marc.info/?l=linux-kernel&m=150393341825453
V2: https://marc.info/?l=linux-kernel&m=150468649417950
V3: https://marc.info/?l=linux-kernel&m=150637795424969
V4: https://marc.info/?l=linux-kernel&m=151550860121565
V5: https://marc.info/?l=linux-kernel&m=151928315529363


As Peter pointed out:

	https://marc.info/?l=linux-kernel&m=150349072023540

Lockdep currently has limited support for recursive read locks; the
following deadlock case cannot be detected:

	read_lock(A);
				lock(B);
	lock(B);
				write_lock(A);

I got some inspiration from Gautham R Shenoy:

	https://lwn.net/Articles/332801/

, and came up with this series.

The basic idea is:

*	Add recursive read locks into the graph

*	Classify dependencies into -(RR)->, -(NR)->, -(RN)->,
	-(NN)->, where R stands for recursive read lock, N stands for
	other locks (i.e. non-recursive read locks and write locks).

*	Define strong dependency paths as the paths of dependencies that
	don't have two adjacent dependencies of the form -(*R)-> and -(R*)->.

*	Extend __bfs() to only traverse on strong dependency paths.

*	If __bfs() finds a strong dependency circle, then a deadlock is
	reported.
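
To make the strong-path rule above concrete, here is a minimal standalone
sketch. It is not code from this series: the enum dep_type encoding and the
path_is_strong() helper are hypothetical names chosen for illustration, and
the real check is done inside __bfs() on lock_list entries.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical encoding of the four dependency types; not lockdep's own. */
enum dep_type { DEP_NN, DEP_NR, DEP_RN, DEP_RR };

/* -(*R)->: the lock the edge points to is acquired as a recursive read lock. */
static bool ends_recursive(enum dep_type d)
{
	return d == DEP_NR || d == DEP_RR;
}

/* -(R*)->: the lock the edge starts from is held as a recursive read lock. */
static bool starts_recursive(enum dep_type d)
{
	return d == DEP_RN || d == DEP_RR;
}

/*
 * A path is strong if no -(*R)-> edge is directly followed by a -(R*)-> edge.
 * For a closed path (a circle) the last edge is also followed by the first.
 */
bool path_is_strong(const enum dep_type *deps, size_t n, bool closed)
{
	for (size_t i = 0; i + 1 < n; i++)
		if (ends_recursive(deps[i]) && starts_recursive(deps[i + 1]))
			return false;

	if (closed && n > 0 &&
	    ends_recursive(deps[n - 1]) && starts_recursive(deps[0]))
		return false;

	return true;
}
```

With this encoding, the undetected case from the top of this mail is the
circle A -(RN)-> B -(NN)-> A, which the helper reports as strong. If the
second task took read_lock(A) instead of write_lock(A), the circle would be
A -(RN)-> B -(NR)-> A, which is not strong and indeed cannot deadlock.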

The whole series consists of 20 patches:

1.	Add documentation for recursive read lock deadlock detection
	reasoning

2.	Do a clean up on the return value of __bfs() and its friends.

3.	Make __bfs() able to visit every dependency until a match is
	found. The old version of __bfs() could only visit each lock
	class once, and this is insufficient if we are going to add
	recursive read locks into the dependency graph.

4.	Redefine LOCK*_STATE*: now LOCK*_STATE_RR stands for recursive
	read lock only and LOCK*_STATE stands for write lock and
	non-recursive read lock.

5.	Reduce the size of lock_list::distance.

6-7	Extend __bfs() to be able to traverse the strong dependency
	paths after recursive read locks are added into the graph.

8.	Make __bfs(.match) return bool.

9-11	Adjust check_redundant(), check_noncircular() and
	check_irq_usage() to take recursive read locks into consideration.

12.	Finally add recursive read locks into the dependency graph.

13-14	Adjust lock cache chain key generation to take recursive read
	locks into consideration, and provide a test case.

15-16	Add more test cases.

17.	Revert commit d82fed752942 ("locking/lockdep/selftests: Fix
	mixed read-write ABBA tests").

18.	Add myself as a LOCKING PRIMITIVES reviewer.

19-20	Annotate SRCU correctly for deadlock detection, and provide a
	test case.

This series passed all the lockdep selftest cases (including those I
introduce).

Tests and comments are welcome!

Regards,
Boqun

* [RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
@ 2018-04-11 13:50 ` Boqun Feng
  2018-04-15  0:38   ` Randy Dunlap
  2018-04-27 13:50   ` Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 02/20] lockdep: Demagic the return value of BFS Boqun Feng
                   ` (18 subsequent siblings)
  19 siblings, 2 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:50 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney,
	Boqun Feng, Jonathan Corbet, open list:DOCUMENTATION

This patch adds the documentation piece for the reasoning of deadlock
detection related to recursive read locks. The following sections are
added:

*	Explain what a recursive read lock is, and what deadlock cases
	they could introduce.

*	Introduce the notations for different types of dependencies, and
	the definition of strong paths.

*	Prove that a closed strong path is both sufficient and necessary
	for deadlock detection with recursive read locks involved. The
	proof also explains why we call the path "strong".

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 Documentation/locking/lockdep-design.txt | 178 +++++++++++++++++++++++++++++++
 1 file changed, 178 insertions(+)

diff --git a/Documentation/locking/lockdep-design.txt b/Documentation/locking/lockdep-design.txt
index 9de1c158d44c..6bb9e90e2c4f 100644
--- a/Documentation/locking/lockdep-design.txt
+++ b/Documentation/locking/lockdep-design.txt
@@ -284,3 +284,181 @@ Run the command and save the output, then compare against the output from
 a later run of this command to identify the leakers.  This same output
 can also help you find situations where runtime lock initialization has
 been omitted.
+
+Recursive read locks:
+---------------------
+
+Lockdep now is equipped with deadlock detection for recursive read locks.
+
+Recursive read locks, as their name indicates, are the locks able to be
+acquired recursively. Unlike non-recursive read locks, recursive read locks
+only get blocked by current write lock *holders*, not by write lock
+*waiters*. For example:
+
+	TASK A:			TASK B:
+
+	read_lock(X);
+
+				write_lock(X);
+
+	read_lock(X);
+
+is not a deadlock for recursive read locks: while task B is waiting for lock X,
+the second read_lock() doesn't need to wait because it's a recursive read lock.
+However, if the read_lock() is a non-recursive read lock, then the above case is
+a deadlock, because even though the write_lock() in TASK B can not get the lock,
+it can still block the second read_lock() in TASK A.
+
+Note that a lock can be a write lock (exclusive lock), a non-recursive read
+lock (non-recursive shared lock) or a recursive read lock (recursive shared
+lock), depending on the lock operations used to acquire it (more specifically,
+the value of the 'read' parameter for lock_acquire()). In other words, a single
+lock instance has three types of acquisition depending on the acquisition
+functions: exclusive, non-recursive read, and recursive read.
+
+To be concise, we call write locks and non-recursive read locks
+"non-recursive" locks and recursive read locks "recursive" locks.
+
+Recursive locks don't block each other, while non-recursive locks do (this is
+even true for two non-recursive read locks). A non-recursive lock can block the
+corresponding recursive lock, and vice versa.
+
+A deadlock case with recursive locks involved is as follows:
+
+	TASK A:			TASK B:
+
+	read_lock(X);
+				read_lock(Y);
+	write_lock(Y);
+				write_lock(X);
+
+Task A is waiting for task B to read_unlock() Y and task B is waiting for task
+A to read_unlock() X.
+
+Dependency types and strong dependency paths:
+---------------------------------------------
+In order to detect deadlocks as above, lockdep needs to track different dependencies.
+There are 4 categories for dependency edges in the lockdep graph:
+
+1) -(NN)->: non-recursive to non-recursive dependency. "X -(NN)-> Y" means
+            X -> Y and both X and Y are non-recursive locks.
+
+2) -(RN)->: recursive to non-recursive dependency. "X -(RN)-> Y" means
+            X -> Y and X is a recursive lock and Y is a non-recursive lock.
+
+3) -(NR)->: non-recursive to recursive dependency. "X -(NR)-> Y" means
+            X -> Y and X is a non-recursive lock and Y is a recursive lock.
+
+4) -(RR)->: recursive to recursive dependency. "X -(RR)-> Y" means
+            X -> Y and both X and Y are recursive locks.
+
+Note that given two locks, they may have multiple dependencies between them, for example:
+
+	TASK A:
+
+	read_lock(X);
+	write_lock(Y);
+	...
+
+	TASK B:
+
+	write_lock(X);
+	write_lock(Y);
+
+, we have both X -(RN)-> Y and X -(NN)-> Y in the dependency graph.
+
+We use -(*N)-> for edges that are either -(RN)-> or -(NN)->, and similarly for
+-(N*)->, -(*R)-> and -(R*)->.
+
+A "path" is a series of adjacent dependency edges in the graph. We define a
+"strong" path, which indicates a strong dependency throughout each dependency
+in the path, as a path that doesn't have two adjacent edges (dependencies) of
+the form -(*R)-> and -(R*)->. In other words, a "strong" path is a walk from one
+lock to another through the lock dependencies such that, for any X -> Y -> Z in
+the path (where X, Y, Z are locks), if the walk from X to Y is through a -(NR)->
+or -(RR)-> dependency, the walk from Y to Z must not be through a -(RN)-> or
+-(RR)-> dependency; otherwise it's not a strong path.
+
+We will see why the path is called "strong" in the next section.
+
+Recursive Read Deadlock Detection:
+----------------------------------
+
+We now prove two things:
+
+Lemma 1:
+
+If there is a closed strong path (i.e. a strong circle), then there is a
+combination of locking sequences that causes deadlock. I.e. a strong circle is
+sufficient for deadlock detection.
+
+Lemma 2:
+
+If there is no closed strong path (i.e. strong circle), then there is no
+combination of locking sequences that could cause deadlock. I.e.  strong
+circles are necessary for deadlock detection.
+
+With these two lemmas, we can say that a closed strong path is both sufficient
+and necessary for deadlocks, therefore a closed strong path is equivalent to
+deadlock possibility. Because a closed strong path stands for a dependency chain
+that could cause deadlocks, we call it "strong", considering there are
+dependency circles that won't cause deadlocks.
+
+Proof for sufficiency (Lemma 1):
+
+Let's say we have a strong circle:
+
+	L1 -> L2 ... -> Ln -> L1
+
+, which means we have dependencies:
+
+	L1 -> L2
+	L2 -> L3
+	...
+	Ln-1 -> Ln
+	Ln -> L1
+
+We now can construct a combination of locking sequences that cause deadlock:
+
+Firstly let's make one CPU/task get the L1 in L1 -> L2, and then another get
+the L2 in L2 -> L3, and so on. After this, all of the Lx in Lx -> Lx+1 are
+held by different CPU/tasks.
+
+Then, because we have L1 -> L2, the holder of L1 is going to acquire L2 in
+L1 -> L2. However, since L2 is already held by another CPU/task, and L1 -> L2
+and L2 -> L3 are not -(*R)-> and -(R*)-> (the definition of strong), the holder
+of L1 can not get L2; it has to wait for L2's holder to release it.
+
+Moreover, we can reach a similar conclusion for L2's holder: it has to wait for
+L3's holder to release, and so on. We can now prove that Lx's holder has to wait
+for Lx+1's holder to release, and note that Ln+1 is L1, so we have a circular
+waiting scenario where nobody can make progress, which is a deadlock.
+
+Proof for necessity (Lemma 2):
+
+Lemma 2 is equivalent to: If there is a deadlock scenario, then there must be a
+strong circle in the dependency graph.
+
+According to Wikipedia[1], if there is a deadlock, then there must be a circular
+waiting scenario, which means there are N CPU/tasks, where CPU/task P1 is waiting
+for a lock held by P2, and P2 is waiting for a lock held by P3, ... and Pn is
+waiting for a lock held by P1. Let's name the lock Px is waiting for as Lx. Since
+P1 is waiting for L1 and holding Ln, we will have Ln -> L1 in the dependency
+graph. Similarly, we have L1 -> L2, L2 -> L3, ..., Ln-1 -> Ln in the dependency
+graph, which means we have a circle:
+
+	Ln -> L1 -> L2 -> ... -> Ln
+
+, and now let's prove the circle is strong:
+
+For a lock Lx, Px contributes the dependency Lx-1 -> Lx and Px+1 contributes
+the dependency Lx -> Lx+1. Since Px is waiting for Px+1 to release Lx, Lx can
+not be recursive in both Lx -> Lx+1 and Lx-1 -> Lx, because recursive locks
+don't block each other. Therefore Lx-1 -> Lx and Lx -> Lx+1 can not be a
+-(*R)-> -(R*)-> pair, and this is true for any lock in the circle, so the
+circle is strong.
+
+References:
+-----------
+[1]: https://en.wikipedia.org/wiki/Deadlock
+[2]: Shibu, K. (2009). Intro To Embedded Systems (1st ed.). Tata McGraw-Hill
-- 
2.16.2

* [RFC tip/locking/lockdep v6 02/20] lockdep: Demagic the return value of BFS
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning Boqun Feng
@ 2018-04-11 13:50 ` Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 03/20] lockdep: Make __bfs() visit every dependency until a match Boqun Feng
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:50 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

__bfs() could return four magic numbers:

	1: search succeeds, but none match.
	0: search succeeds, find one match.
	-1: search fails because the cq is full.
	-2: search fails because an invalid node is found.

This patch cleans things up by using an enum type for the return value
of __bfs() and its friends. This improves the readability of the code
and, further, could help if we want to extend the BFS.
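
As a rough illustration of how a caller reads with the enum, here is a
standalone sketch. The enum values and bfs_error() mirror the ones in the
diff below; report_search() is a hypothetical caller made up for this
example, and its return values only stand in for what the real callers do
(print_bfs_bug() and the report printers).

```c
#include <stdbool.h>

/* Same names and values as the enum added by this patch. */
enum bfs_result {
	BFS_EINVALIDNODE = -2,
	BFS_EQUEUEFULL = -1,
	BFS_RMATCH = 0,
	BFS_RNOMATCH = 1,
};

/* Errors are the negative values, so one comparison covers all of them. */
static inline bool bfs_error(enum bfs_result res)
{
	return res < 0;
}

/*
 * Hypothetical caller, not part of the patch: map a search result onto a
 * "1 = nothing to report, 0 = something was reported" convention.
 */
int report_search(enum bfs_result res)
{
	if (bfs_error(res))
		return 0;	/* stands in for reporting a BFS failure */
	if (res == BFS_RNOMATCH)
		return 1;	/* no offending node was found */
	return 0;		/* BFS_RMATCH: stands in for printing the report */
}
```

The point of the sketch is only that each branch now names the outcome it
handles, instead of comparing against 1, 0 or a negative number.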

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 136 ++++++++++++++++++++++++++++-------------------
 1 file changed, 80 insertions(+), 56 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 89b5f83f1969..2dbaff381778 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -984,21 +984,52 @@ static inline int get_lock_depth(struct lock_list *child)
 	}
 	return depth;
 }
+/*
+ * Return values of a bfs search:
+ *
+ * BFS_E* indicates an error
+ * BFS_R* indicates a result (match or not)
+ *
+ * BFS_EINVALIDNODE: Find an invalid node in the graph.
+ *
+ * BFS_EQUEUEFULL: The queue is full while doing the bfs.
+ *
+ * BFS_RMATCH: Find the matched node in the graph, and put that node into
+ *            *@target_entry.
+ *
+ * BFS_RNOMATCH: Haven't found the matched node and keep *@target_entry
+ *              _unchanged_.
+ */
+enum bfs_result {
+	BFS_EINVALIDNODE = -2,
+	BFS_EQUEUEFULL = -1,
+	BFS_RMATCH = 0,
+	BFS_RNOMATCH = 1,
+};
 
-static int __bfs(struct lock_list *source_entry,
-		 void *data,
-		 int (*match)(struct lock_list *entry, void *data),
-		 struct lock_list **target_entry,
-		 int forward)
+/*
+ * bfs_result < 0 means error
+ */
+
+static inline bool bfs_error(enum bfs_result res)
+{
+	return res < 0;
+}
+
+static enum bfs_result __bfs(struct lock_list *source_entry,
+			     void *data,
+			     int (*match)(struct lock_list *entry, void *data),
+			     struct lock_list **target_entry,
+			     int forward)
 {
 	struct lock_list *entry;
 	struct list_head *head;
 	struct circular_queue *cq = &lock_cq;
-	int ret = 1;
+	enum bfs_result ret = BFS_RNOMATCH;
 
 	if (match(source_entry, data)) {
 		*target_entry = source_entry;
-		ret = 0;
+		ret = BFS_RMATCH;
 		goto exit;
 	}
 
@@ -1019,7 +1050,7 @@ static int __bfs(struct lock_list *source_entry,
 		__cq_dequeue(cq, (unsigned long *)&lock);
 
 		if (!lock->class) {
-			ret = -2;
+			ret = BFS_EINVALIDNODE;
 			goto exit;
 		}
 
@@ -1036,12 +1067,12 @@ static int __bfs(struct lock_list *source_entry,
 				mark_lock_accessed(entry, lock);
 				if (match(entry, data)) {
 					*target_entry = entry;
-					ret = 0;
+					ret = BFS_RMATCH;
 					goto exit;
 				}
 
 				if (__cq_enqueue(cq, (unsigned long)entry)) {
-					ret = -1;
+					ret = BFS_EQUEUEFULL;
 					goto exit;
 				}
 				cq_depth = __cq_get_elem_count(cq);
@@ -1054,19 +1085,21 @@ static int __bfs(struct lock_list *source_entry,
 	return ret;
 }
 
-static inline int __bfs_forwards(struct lock_list *src_entry,
-			void *data,
-			int (*match)(struct lock_list *entry, void *data),
-			struct lock_list **target_entry)
+static inline enum bfs_result
+__bfs_forwards(struct lock_list *src_entry,
+	       void *data,
+	       int (*match)(struct lock_list *entry, void *data),
+	       struct lock_list **target_entry)
 {
 	return __bfs(src_entry, data, match, target_entry, 1);
 
 }
 
-static inline int __bfs_backwards(struct lock_list *src_entry,
-			void *data,
-			int (*match)(struct lock_list *entry, void *data),
-			struct lock_list **target_entry)
+static inline enum bfs_result
+__bfs_backwards(struct lock_list *src_entry,
+		void *data,
+		int (*match)(struct lock_list *entry, void *data),
+		struct lock_list **target_entry)
 {
 	return __bfs(src_entry, data, match, target_entry, 0);
 
@@ -1299,13 +1332,13 @@ unsigned long lockdep_count_backward_deps(struct lock_class *class)
 
 /*
  * Prove that the dependency graph starting at <entry> can not
- * lead to <target>. Print an error and return 0 if it does.
+ * lead to <target>. Print an error and return BFS_RMATCH if it does.
  */
-static noinline int
+static noinline enum bfs_result
 check_noncircular(struct lock_list *root, struct lock_class *target,
-		struct lock_list **target_entry)
+		  struct lock_list **target_entry)
 {
-	int result;
+	enum bfs_result result;
 
 	debug_atomic_inc(nr_cyclic_checks);
 
@@ -1314,11 +1347,11 @@ check_noncircular(struct lock_list *root, struct lock_class *target,
 	return result;
 }
 
-static noinline int
+static noinline enum bfs_result
 check_redundant(struct lock_list *root, struct lock_class *target,
 		struct lock_list **target_entry)
 {
-	int result;
+	enum bfs_result result;
 
 	debug_atomic_inc(nr_redundant_checks);
 
@@ -1347,15 +1380,12 @@ static inline int usage_match(struct lock_list *entry, void *bit)
  *
  * Return 0 if such a node exists in the subgraph, and put that node
  * into *@target_entry.
- *
- * Return 1 otherwise and keep *@target_entry unchanged.
- * Return <0 on error.
  */
-static int
+static enum bfs_result
 find_usage_forwards(struct lock_list *root, enum lock_usage_bit bit,
 			struct lock_list **target_entry)
 {
-	int result;
+	enum bfs_result result;
 
 	debug_atomic_inc(nr_find_usage_forwards_checks);
 
@@ -1367,18 +1397,12 @@ find_usage_forwards(struct lock_list *root, enum lock_usage_bit bit,
 /*
  * Find a node in the backwards-direction dependency sub-graph starting
  * at @root->class that matches @bit.
- *
- * Return 0 if such a node exists in the subgraph, and put that node
- * into *@target_entry.
- *
- * Return 1 otherwise and keep *@target_entry unchanged.
- * Return <0 on error.
  */
-static int
+static enum bfs_result
 find_usage_backwards(struct lock_list *root, enum lock_usage_bit bit,
 			struct lock_list **target_entry)
 {
-	int result;
+	enum bfs_result result;
 
 	debug_atomic_inc(nr_find_usage_backwards_checks);
 
@@ -1586,18 +1610,18 @@ check_usage(struct task_struct *curr, struct held_lock *prev,
 
 	this.class = hlock_class(prev);
 	ret = find_usage_backwards(&this, bit_backwards, &target_entry);
-	if (ret < 0)
+	if (bfs_error(ret))
 		return print_bfs_bug(ret);
-	if (ret == 1)
-		return ret;
+	if (ret == BFS_RNOMATCH)
+		return 1;
 
 	that.parent = NULL;
 	that.class = hlock_class(next);
 	ret = find_usage_forwards(&that, bit_forwards, &target_entry1);
-	if (ret < 0)
+	if (bfs_error(ret))
 		return print_bfs_bug(ret);
-	if (ret == 1)
-		return ret;
+	if (ret == BFS_RNOMATCH)
+		return 1;
 
 	return print_bad_irq_dependency(curr, &this, &that,
 			target_entry, target_entry1,
@@ -1834,10 +1858,10 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 	       struct held_lock *next, int distance, struct stack_trace *trace,
 	       int (*save)(struct stack_trace *trace))
 {
-	struct lock_list *uninitialized_var(target_entry);
 	struct lock_list *entry;
+	enum bfs_result ret;
 	struct lock_list this;
-	int ret;
+	struct lock_list *uninitialized_var(target_entry);
 
 	/*
 	 * Prove that the new <prev> -> <next> dependency would not
@@ -1851,7 +1875,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 	this.class = hlock_class(next);
 	this.parent = NULL;
 	ret = check_noncircular(&this, hlock_class(prev), &target_entry);
-	if (unlikely(!ret)) {
+	if (unlikely(ret == BFS_RMATCH)) {
 		if (!trace->entries) {
 			/*
 			 * If @save fails here, the printing might trigger
@@ -1862,7 +1886,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 		}
 		return print_circular_bug(&this, target_entry, next, prev, trace);
 	}
-	else if (unlikely(ret < 0))
+	else if (unlikely(bfs_error(ret)))
 		return print_bfs_bug(ret);
 
 	if (!check_prev_add_irq(curr, prev, next))
@@ -1900,11 +1924,11 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 	this.class = hlock_class(prev);
 	this.parent = NULL;
 	ret = check_redundant(&this, hlock_class(next), &target_entry);
-	if (!ret) {
+	if (ret == BFS_RMATCH) {
 		debug_atomic_inc(nr_redundant);
 		return 2;
 	}
-	if (ret < 0)
+	if (bfs_error(ret))
 		return print_bfs_bug(ret);
 
 
@@ -2633,16 +2657,16 @@ static int
 check_usage_forwards(struct task_struct *curr, struct held_lock *this,
 		     enum lock_usage_bit bit, const char *irqclass)
 {
-	int ret;
+	enum bfs_result ret;
 	struct lock_list root;
 	struct lock_list *uninitialized_var(target_entry);
 
 	root.parent = NULL;
 	root.class = hlock_class(this);
 	ret = find_usage_forwards(&root, bit, &target_entry);
-	if (ret < 0)
+	if (bfs_error(ret))
 		return print_bfs_bug(ret);
-	if (ret == 1)
+	if (ret == BFS_RNOMATCH)
 		return ret;
 
 	return print_irq_inversion_bug(curr, &root, target_entry,
@@ -2657,17 +2681,17 @@ static int
 check_usage_backwards(struct task_struct *curr, struct held_lock *this,
 		      enum lock_usage_bit bit, const char *irqclass)
 {
-	int ret;
+	enum bfs_result ret;
 	struct lock_list root;
 	struct lock_list *uninitialized_var(target_entry);
 
 	root.parent = NULL;
 	root.class = hlock_class(this);
 	ret = find_usage_backwards(&root, bit, &target_entry);
-	if (ret < 0)
+	if (bfs_error(ret))
 		return print_bfs_bug(ret);
-	if (ret == 1)
-		return ret;
+	if (ret == BFS_RNOMATCH)
+		return 1;
 
 	return print_irq_inversion_bug(curr, &root, target_entry,
 					this, 0, irqclass);
-- 
2.16.2

* [RFC tip/locking/lockdep v6 03/20] lockdep: Make __bfs() visit every dependency until a match
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 02/20] lockdep: Demagic the return value of BFS Boqun Feng
@ 2018-04-11 13:50 ` Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 04/20] lockdep: Redefine LOCK_*_STATE* bits Boqun Feng
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:50 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

Currently, __bfs() will do a breadth-first search in the dependency
graph and visit each lock class in the graph exactly once, so for
example, in the following graph:

	A ---------> B
	|            ^
	|            |
	+----------> C

a __bfs() call starting at A will visit B through dependency A -> B and
visit C through dependency A -> C, and that's it. IOW, __bfs() will not
visit dependency C -> B.

This is OK for now, as we only have strong dependencies in the
dependency graph, so whenever there is a traversal path from A to B in
__bfs(), it means A has a strong dependency to B (IOW, B depends on A
strongly). So there is no need to visit all dependencies in the graph.

However, as we are going to add recursive read locks into the dependency
graph, not all paths will mean strong dependencies afterwards. In the
same example above, dependency A -> B may be a weak dependency while the
traversal A -> C -> B may be a strong dependency path. With the old
way of __bfs() (i.e. visiting every lock class exactly once), we would
miss the strong dependency path, which would result in failing to find
a deadlock. To cure this for the future, we need to find a way for
__bfs() to visit each dependency, rather than each class, exactly once
in the search until we find a match.

The solution is simple:

We used to mark lock_class::dep_gen_id (with lockdep_dependency_gen_id)
to indicate that a class has been visited in __bfs(). Now we change the
semantics a little bit: we mark lock_class::dep_gen_id to indicate that
_all the dependencies_ in its locks_{after,before} have been visited in
the __bfs() (note we only take one direction in a __bfs() search). In this
way, every dependency is guaranteed to be visited until we find a match.
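
The changed marking can be tried out on the three-class example above with
a small standalone sketch. Everything in it (struct edge, struct graph,
bfs_visit_edges()) is a hypothetical miniature written for this mail and
not lockdep's data structures; it also leaves out the match callback and
the circular queue limits.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLASSES 8
#define MAX_EDGES   16

/* Hypothetical miniature of the dependency graph; not lockdep's structures. */
struct edge {
	int from, to;		/* class indices, each below MAX_CLASSES */
};

struct graph {
	size_t nr_edges;	/* at most MAX_EDGES */
	struct edge edges[MAX_EDGES];
};

/*
 * Breadth-first walk from @src. A class is marked once its outgoing list has
 * been expanded, rather than when it is first reached, so every dependency
 * reachable from @src is visited exactly once. The visited edges are stored
 * in @order (room for MAX_EDGES entries) and their count is returned.
 */
size_t bfs_visit_edges(const struct graph *g, int src, struct edge *order)
{
	bool expanded[MAX_CLASSES] = { false };
	int queue[MAX_EDGES + 1];	/* the source plus one entry per edge */
	size_t head = 0, tail = 0, visited = 0;

	queue[tail++] = src;
	while (head < tail) {
		int cur = queue[head++];

		/* Skip a class whose outgoing list was already walked. */
		if (expanded[cur])
			continue;
		expanded[cur] = true;

		for (size_t i = 0; i < g->nr_edges; i++) {
			if (g->edges[i].from != cur)
				continue;
			order[visited++] = g->edges[i];
			queue[tail++] = g->edges[i].to;
		}
	}
	return visited;
}
```

Numbering A, B, C as 0, 1, 2 and starting from A, the walk records A -> B,
A -> C and then C -> B. B is queued twice but expanded only once, which is
the same idea as checking the mark when an entry is dequeued in the diff
below.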

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 61 +++++++++++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 27 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 2dbaff381778..f39a071ef0a8 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -948,24 +948,20 @@ static inline unsigned int  __cq_get_elem_count(struct circular_queue *cq)
 	return (cq->rear - cq->front) & CQ_MASK;
 }
 
-static inline void mark_lock_accessed(struct lock_list *lock,
-					struct lock_list *parent)
+static inline void mark_lock_list_accessed(struct lock_class *class)
 {
-	unsigned long nr;
+	class->dep_gen_id = lockdep_dependency_gen_id;
+}
 
-	nr = lock - list_entries;
-	WARN_ON(nr >= nr_list_entries); /* Out-of-bounds, input fail */
+static inline void visit_lock_entry(struct lock_list *lock,
+				    struct lock_list *parent)
+{
 	lock->parent = parent;
-	lock->class->dep_gen_id = lockdep_dependency_gen_id;
 }
 
-static inline unsigned long lock_accessed(struct lock_list *lock)
+static inline unsigned long lock_list_accessed(struct lock_class *class)
 {
-	unsigned long nr;
-
-	nr = lock - list_entries;
-	WARN_ON(nr >= nr_list_entries); /* Out-of-bounds, input fail */
-	return lock->class->dep_gen_id == lockdep_dependency_gen_id;
+	return class->dep_gen_id == lockdep_dependency_gen_id;
 }
 
 static inline struct lock_list *get_lock_parent(struct lock_list *child)
@@ -1054,6 +1050,18 @@ static enum bfs_result __bfs(struct lock_list *source_entry,
 			goto exit;
 		}
 
+		/*
+		 * If we have visited all the dependencies from this @lock to
+		 * others (iow, if we have visited all lock_list entries in
+		 * @lock->class->locks_{after,before}) we skip, otherwise go
+		 * and visit all the dependencies in the list and mark this
+		 * list accessed.
+		 */
+		if (lock_list_accessed(lock->class))
+			continue;
+		else
+			mark_lock_list_accessed(lock->class);
+
 		if (forward)
 			head = &lock->class->locks_after;
 		else
@@ -1062,23 +1070,22 @@ static enum bfs_result __bfs(struct lock_list *source_entry,
 		DEBUG_LOCKS_WARN_ON(!irqs_disabled());
 
 		list_for_each_entry_rcu(entry, head, entry) {
-			if (!lock_accessed(entry)) {
-				unsigned int cq_depth;
-				mark_lock_accessed(entry, lock);
-				if (match(entry, data)) {
-					*target_entry = entry;
-					ret = BFS_RMATCH;
-					goto exit;
-				}
+			unsigned int cq_depth;
 
-				if (__cq_enqueue(cq, (unsigned long)entry)) {
-					ret = BFS_EQUEUEFULL;
-					goto exit;
-				}
-				cq_depth = __cq_get_elem_count(cq);
-				if (max_bfs_queue_depth < cq_depth)
-					max_bfs_queue_depth = cq_depth;
+			visit_lock_entry(entry, lock);
+			if (match(entry, data)) {
+				*target_entry = entry;
+				ret = BFS_RMATCH;
+				goto exit;
+			}
+
+			if (__cq_enqueue(cq, (unsigned long)entry)) {
+				ret = BFS_EQUEUEFULL;
+				goto exit;
 			}
+			cq_depth = __cq_get_elem_count(cq);
+			if (max_bfs_queue_depth < cq_depth)
+				max_bfs_queue_depth = cq_depth;
 		}
 	}
 exit:
-- 
2.16.2

* [RFC tip/locking/lockdep v6 04/20] lockdep: Redefine LOCK_*_STATE* bits
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (2 preceding siblings ...)
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 03/20] lockdep: Make __bfs() visit every dependency until a match Boqun Feng
@ 2018-04-11 13:50 ` Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 05/20] lockdep: Reduce the size of lock_list::distance Boqun Feng
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:50 UTC
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney,
	Boqun Feng, Jonathan Corbet, open list:DOCUMENTATION

There are three types of lock acquisitions: write, non-recursive read
and recursive read. Among these, write locks and non-recursive read locks
are no different from the viewpoint of deadlock detection, because a
write acquisition of the corresponding lock on an independent CPU or
task makes a non-recursive read lock act as a write lock in the sense of
deadlock. So we could treat them as the same type (named
"non-recursive lock") in lockdep.

In the irq lock inversion detection (safe->unsafe deadlock
detection), we used to distinguish write locks from read locks (both
non-recursive and recursive ones). That classification can be improved,
as a non-recursive read lock behaves the same as a write lock, so this
patch redefines the meanings of LOCK_{USED_IN, ENABLED}_STATE*.

old:
	LOCK_* : stands for write lock
	LOCK_*_READ: stands for read lock (non-recursive and recursive)
new:
	LOCK_* : stands for non-recursive lock (write lock and
	non-recursive read lock)
	LOCK_*_RR: stands for recursive read lock

Such a change is needed for a future improvement on recursive read
related irq inversion deadlock detection.
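
The old and new groupings can be written down as a tiny standalone sketch.
It assumes the usual encoding of lock_acquire()'s 'read' argument (0 for
write, 1 for non-recursive read, 2 for recursive read). The enum and both
helper names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/*
 * Assumed encoding of lock_acquire()'s 'read' argument:
 * 0 = write, 1 = non-recursive read, 2 = recursive read.
 */
enum acquire_kind {
	ACQ_WRITE = 0,
	ACQ_READ = 1,
	ACQ_RECURSIVE_READ = 2,
};

/* Old scheme: every read acquisition selects the LOCK_*_READ usage bits. */
bool old_selects_read_bits(enum acquire_kind kind)
{
	return kind != ACQ_WRITE;
}

/* New scheme: only recursive read acquisitions select the LOCK_*_RR bits. */
bool new_selects_rr_bits(enum acquire_kind kind)
{
	return kind == ACQ_RECURSIVE_READ;
}
```

The only acquisition type that changes groups is the non-recursive read: it
used to share bits with recursive reads and now shares them with writes.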

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 Documentation/locking/lockdep-design.txt |  6 +++---
 kernel/locking/lockdep.c                 | 28 ++++++++++++++--------------
 kernel/locking/lockdep_internals.h       | 16 ++++++++--------
 kernel/locking/lockdep_proc.c            | 12 ++++++------
 4 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/Documentation/locking/lockdep-design.txt b/Documentation/locking/lockdep-design.txt
index 6bb9e90e2c4f..53ede30ce16d 100644
--- a/Documentation/locking/lockdep-design.txt
+++ b/Documentation/locking/lockdep-design.txt
@@ -30,9 +30,9 @@ State
 The validator tracks lock-class usage history into 4n + 1 separate state bits:
 
 - 'ever held in STATE context'
-- 'ever held as readlock in STATE context'
+- 'ever held as recursive readlock in STATE context'
 - 'ever held with STATE enabled'
-- 'ever held as readlock with STATE enabled'
+- 'ever held as recursive readlock with STATE enabled'
 
 Where STATE can be either one of (kernel/locking/lockdep_states.h)
  - hardirq
@@ -51,7 +51,7 @@ locking error messages, inside curlies. A contrived example:
     (&sio_locks[i].lock){-.-...}, at: [<c02867fd>] mutex_lock+0x21/0x24
 
 
-The bit position indicates STATE, STATE-read, for each of the states listed
+The bit position indicates STATE, STATE-RR, for each of the states listed
 above, and the character displayed in each indicates:
 
    '.'  acquired while irqs disabled and not in irq context
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index f39a071ef0a8..14af2327b52a 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -448,10 +448,10 @@ DEFINE_PER_CPU(struct lockdep_stats, lockdep_stats);
  */
 
 #define __USAGE(__STATE)						\
-	[LOCK_USED_IN_##__STATE] = "IN-"__stringify(__STATE)"-W",	\
-	[LOCK_ENABLED_##__STATE] = __stringify(__STATE)"-ON-W",		\
-	[LOCK_USED_IN_##__STATE##_READ] = "IN-"__stringify(__STATE)"-R",\
-	[LOCK_ENABLED_##__STATE##_READ] = __stringify(__STATE)"-ON-R",
+	[LOCK_USED_IN_##__STATE] = "IN-"__stringify(__STATE),		\
+	[LOCK_ENABLED_##__STATE] = __stringify(__STATE)"-ON",		\
+	[LOCK_USED_IN_##__STATE##_RR] = "IN-"__stringify(__STATE)"-RR",	\
+	[LOCK_ENABLED_##__STATE##_RR] = __stringify(__STATE)"-ON-RR",
 
 static const char *usage_str[] =
 {
@@ -492,7 +492,7 @@ void get_usage_chars(struct lock_class *class, char usage[LOCK_USAGE_CHARS])
 
 #define LOCKDEP_STATE(__STATE) 						\
 	usage[i++] = get_usage_char(class, LOCK_USED_IN_##__STATE);	\
-	usage[i++] = get_usage_char(class, LOCK_USED_IN_##__STATE##_READ);
+	usage[i++] = get_usage_char(class, LOCK_USED_IN_##__STATE##_RR);
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 
@@ -1645,7 +1645,7 @@ static const char *state_names[] = {
 
 static const char *state_rnames[] = {
 #define LOCKDEP_STATE(__STATE) \
-	__stringify(__STATE)"-READ",
+	__stringify(__STATE)"-RR",
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 };
@@ -3039,14 +3039,14 @@ static int mark_irqflags(struct task_struct *curr, struct held_lock *hlock)
 	 * mark the lock as used in these contexts:
 	 */
 	if (!hlock->trylock) {
-		if (hlock->read) {
+		if (hlock->read == 2) {
 			if (curr->hardirq_context)
 				if (!mark_lock(curr, hlock,
-						LOCK_USED_IN_HARDIRQ_READ))
+						LOCK_USED_IN_HARDIRQ_RR))
 					return 0;
 			if (curr->softirq_context)
 				if (!mark_lock(curr, hlock,
-						LOCK_USED_IN_SOFTIRQ_READ))
+						LOCK_USED_IN_SOFTIRQ_RR))
 					return 0;
 		} else {
 			if (curr->hardirq_context)
@@ -3058,13 +3058,13 @@ static int mark_irqflags(struct task_struct *curr, struct held_lock *hlock)
 		}
 	}
 	if (!hlock->hardirqs_off) {
-		if (hlock->read) {
+		if (hlock->read == 2) {
 			if (!mark_lock(curr, hlock,
-					LOCK_ENABLED_HARDIRQ_READ))
+					LOCK_ENABLED_HARDIRQ_RR))
 				return 0;
 			if (curr->softirqs_enabled)
 				if (!mark_lock(curr, hlock,
-						LOCK_ENABLED_SOFTIRQ_READ))
+						LOCK_ENABLED_SOFTIRQ_RR))
 					return 0;
 		} else {
 			if (!mark_lock(curr, hlock,
@@ -3170,9 +3170,9 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	switch (new_bit) {
 #define LOCKDEP_STATE(__STATE)			\
 	case LOCK_USED_IN_##__STATE:		\
-	case LOCK_USED_IN_##__STATE##_READ:	\
+	case LOCK_USED_IN_##__STATE##_RR:	\
 	case LOCK_ENABLED_##__STATE:		\
-	case LOCK_ENABLED_##__STATE##_READ:
+	case LOCK_ENABLED_##__STATE##_RR:
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 		ret = mark_lock_irq(curr, this, new_bit);
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index d459d624ba2a..93002d337936 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -13,9 +13,9 @@
 enum lock_usage_bit {
 #define LOCKDEP_STATE(__STATE)		\
 	LOCK_USED_IN_##__STATE,		\
-	LOCK_USED_IN_##__STATE##_READ,	\
+	LOCK_USED_IN_##__STATE##_RR,	\
 	LOCK_ENABLED_##__STATE,		\
-	LOCK_ENABLED_##__STATE##_READ,
+	LOCK_ENABLED_##__STATE##_RR,
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 	LOCK_USED,
@@ -30,9 +30,9 @@ enum lock_usage_bit {
 enum {
 #define LOCKDEP_STATE(__STATE)						\
 	__LOCKF(USED_IN_##__STATE)					\
-	__LOCKF(USED_IN_##__STATE##_READ)				\
+	__LOCKF(USED_IN_##__STATE##_RR)				\
 	__LOCKF(ENABLED_##__STATE)					\
-	__LOCKF(ENABLED_##__STATE##_READ)
+	__LOCKF(ENABLED_##__STATE##_RR)
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 	__LOCKF(USED)
@@ -41,10 +41,10 @@ enum {
 #define LOCKF_ENABLED_IRQ (LOCKF_ENABLED_HARDIRQ | LOCKF_ENABLED_SOFTIRQ)
 #define LOCKF_USED_IN_IRQ (LOCKF_USED_IN_HARDIRQ | LOCKF_USED_IN_SOFTIRQ)
 
-#define LOCKF_ENABLED_IRQ_READ \
-		(LOCKF_ENABLED_HARDIRQ_READ | LOCKF_ENABLED_SOFTIRQ_READ)
-#define LOCKF_USED_IN_IRQ_READ \
-		(LOCKF_USED_IN_HARDIRQ_READ | LOCKF_USED_IN_SOFTIRQ_READ)
+#define LOCKF_ENABLED_IRQ_RR \
+		(LOCKF_ENABLED_HARDIRQ_RR | LOCKF_ENABLED_SOFTIRQ_RR)
+#define LOCKF_USED_IN_IRQ_RR \
+		(LOCKF_USED_IN_HARDIRQ_RR | LOCKF_USED_IN_SOFTIRQ_RR)
 
 /*
  * CONFIG_LOCKDEP_SMALL is defined for sparc. Sparc requires .text,
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index ad69bbc9bd28..630a6bc6e24c 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -252,17 +252,17 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 			nr_hardirq_safe++;
 		if (class->usage_mask & LOCKF_ENABLED_HARDIRQ)
 			nr_hardirq_unsafe++;
-		if (class->usage_mask & LOCKF_USED_IN_IRQ_READ)
+		if (class->usage_mask & LOCKF_USED_IN_IRQ_RR)
 			nr_irq_read_safe++;
-		if (class->usage_mask & LOCKF_ENABLED_IRQ_READ)
+		if (class->usage_mask & LOCKF_ENABLED_IRQ_RR)
 			nr_irq_read_unsafe++;
-		if (class->usage_mask & LOCKF_USED_IN_SOFTIRQ_READ)
+		if (class->usage_mask & LOCKF_USED_IN_SOFTIRQ_RR)
 			nr_softirq_read_safe++;
-		if (class->usage_mask & LOCKF_ENABLED_SOFTIRQ_READ)
+		if (class->usage_mask & LOCKF_ENABLED_SOFTIRQ_RR)
 			nr_softirq_read_unsafe++;
-		if (class->usage_mask & LOCKF_USED_IN_HARDIRQ_READ)
+		if (class->usage_mask & LOCKF_USED_IN_HARDIRQ_RR)
 			nr_hardirq_read_safe++;
-		if (class->usage_mask & LOCKF_ENABLED_HARDIRQ_READ)
+		if (class->usage_mask & LOCKF_ENABLED_HARDIRQ_RR)
 			nr_hardirq_read_unsafe++;
 
 #ifdef CONFIG_PROVE_LOCKING
-- 
2.16.2


* [RFC tip/locking/lockdep v6 05/20] lockdep: Reduce the size of lock_list::distance
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (3 preceding siblings ...)
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 04/20] lockdep: Redefine LOCK_*_STATE* bits Boqun Feng
@ 2018-04-11 13:50 ` Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 06/20] lockdep: Introduce lock_list::dep Boqun Feng
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

lock_list::distance is never greater than MAX_LOCKDEP_DEPTH (which is
48 right now), so a u16 will fit. This patch reduces the size of
lock_list::distance to save space, so that we can introduce other fields
to help detect recursive read lock deadlocks without increasing the size
of the lock_list structure.
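
As a rough illustration of the space argument, the userspace sketch below
compares two simplified stand-ins for struct lock_list. The struct names,
the helper names and the reduced field set are assumptions made for this
example and are not the kernel definition; the one-byte fields stand for
the kind of extra fields the changelog mentions.

```c
#include <stdint.h>

/* Value quoted in the changelog above. */
#define MAX_LOCKDEP_DEPTH 48

/* Simplified stand-in (hypothetical): distance stored as an int. */
struct lock_list_int_sketch {
	void *class_ptr;
	int distance;
	void *parent;
};

/* Simplified stand-in (hypothetical): 16-bit distance plus two u8 fields. */
struct lock_list_u16_sketch {
	void *class_ptr;
	uint16_t distance;
	uint8_t extra_a;
	uint8_t extra_b;
	void *parent;
};

/* Returns nonzero if the u16 layout is no larger than the int layout. */
static int u16_layout_is_no_larger(void)
{
	return sizeof(struct lock_list_u16_sketch) <=
	       sizeof(struct lock_list_int_sketch);
}

/* Returns nonzero if the largest possible distance fits in 16 bits. */
static int max_depth_fits_u16(void)
{
	return MAX_LOCKDEP_DEPTH <= UINT16_MAX;
}
```

On ABIs where int is four bytes, the 16-bit distance and the two one-byte
fields should occupy the bytes the int used to take, so the two stand-ins
are expected to have the same size.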

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 include/linux/lockdep.h  | 2 +-
 kernel/locking/lockdep.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 6fc77d4dbdcd..d2af32387aaa 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -186,7 +186,7 @@ struct lock_list {
 	struct list_head		entry;
 	struct lock_class		*class;
 	struct stack_trace		trace;
-	int				distance;
+	u16				distance;
 
 	/*
 	 * The parent field is used to implement breadth-first search, and the
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 14af2327b52a..1806060c88ce 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -859,7 +859,7 @@ static struct lock_list *alloc_list_entry(void)
  * Add a new dependency to the head of the list:
  */
 static int add_lock_to_list(struct lock_class *this, struct list_head *head,
-			    unsigned long ip, int distance,
+			    unsigned long ip, u16 distance,
 			    struct stack_trace *trace)
 {
 	struct lock_list *entry;
@@ -1996,7 +1996,7 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 		goto out_bug;
 
 	for (;;) {
-		int distance = curr->lockdep_depth - depth + 1;
+		u16 distance = curr->lockdep_depth - depth + 1;
 		hlock = curr->held_locks + depth - 1;
 
 		/*
-- 
2.16.2


* [RFC tip/locking/lockdep v6 06/20] lockdep: Introduce lock_list::dep
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (4 preceding siblings ...)
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 05/20] lockdep: Reduce the size of lock_list::distance Boqun Feng
@ 2018-04-11 13:50 ` Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 07/20] lockdep: Extend __bfs() to work with multiple types of dependencies Boqun Feng
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

To add recursive read locks into the dependency graph, we need to store
the types of dependencies for the BFS later. There are four types of
dependencies:

*	Non-recursive -> Non-recursive dependencies: NN
	e.g. write_lock(prev) held and try to acquire write_lock(next),
	which can be represented as "prev -(NN)-> next".

*	Recursive -> Non-recursive dependencies: RN
	e.g. read_lock(prev) held and try to acquire write_lock(next),
	which can be represented as "prev -(RN)-> next".

*	Non-recursive -> Recursive dependencies: NR
	e.g. write_lock(prev) held and try to acquire read_lock(next),
	which can be represented as "prev -(NR)-> next".

*	Recursive -> Recursive dependencies: RR
	e.g. read_lock(prev) held and try to acquire read_lock(next),
	which can be represented as "prev -(RR)-> next".

So we use 4 bits in lock_list::dep, one for the presence of each type.
Helper functions and macros are also introduced to convert a pair of
locks into a lock_list::dep bit and to accumulate the different types of
dependencies recorded for the same pair of lock classes.
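
The encoding can be tried out in userspace with the minimal sketch below.
It mirrors the DEP_*_BIT values and the bit calculation from this patch,
but struct held_lock_sketch and the *_sketch function names are stand-ins
invented for the example; only the ->read convention (2 means recursive
read) is taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for struct held_lock; only ->read matters here:
 * 0 = write, 1 = non-recursive read, 2 = recursive read.
 */
struct held_lock_sketch {
	unsigned int read;
};

/* bit0 is "prev is non-recursive", bit1 is "next is non-recursive". */
#define DEP_RR_BIT 0
#define DEP_NR_BIT 1
#define DEP_RN_BIT 2
#define DEP_NN_BIT 3

#define DEP_RR_MASK (1U << DEP_RR_BIT)
#define DEP_NR_MASK (1U << DEP_NR_BIT)
#define DEP_RN_MASK (1U << DEP_RN_BIT)
#define DEP_NN_MASK (1U << DEP_NN_BIT)

/* Bit number of the dependency type for prev -> next. */
static inline unsigned int
calc_dep_bit_sketch(const struct held_lock_sketch *prev,
		    const struct held_lock_sketch *next)
{
	return (prev->read != 2) + ((next->read != 2) << 1);
}

/* One-hot mask, suitable for OR-ing into a dep bitmap. */
static inline uint8_t
calc_dep_sketch(const struct held_lock_sketch *prev,
		const struct held_lock_sketch *next)
{
	return (uint8_t)(1U << calc_dep_bit_sketch(prev, next));
}
```

For instance, holding a write lock and then taking a recursive read lock
yields DEP_NR_MASK, and OR-ing in a later write-after-write acquisition of
the same pair of classes yields DEP_NR_MASK | DEP_NN_MASK.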

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 include/linux/lockdep.h  |  2 ++
 kernel/locking/lockdep.c | 76 +++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index d2af32387aaa..cbd257f62877 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -187,6 +187,8 @@ struct lock_list {
 	struct lock_class		*class;
 	struct stack_trace		trace;
 	u16				distance;
+	/* bitmap of different dependencies from head to this */
+	u8				dep;
 
 	/*
 	 * The parent field is used to implement breadth-first search, and the
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 1806060c88ce..0892855b6c57 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -859,7 +859,7 @@ static struct lock_list *alloc_list_entry(void)
  * Add a new dependency to the head of the list:
  */
 static int add_lock_to_list(struct lock_class *this, struct list_head *head,
-			    unsigned long ip, u16 distance,
+			    unsigned long ip, u16 distance, u8 dep,
 			    struct stack_trace *trace)
 {
 	struct lock_list *entry;
@@ -872,6 +872,7 @@ static int add_lock_to_list(struct lock_class *this, struct list_head *head,
 		return 0;
 
 	entry->class = this;
+	entry->dep = dep;
 	entry->distance = distance;
 	entry->trace = *trace;
 	/*
@@ -1012,6 +1013,41 @@ static inline bool bfs_error(enum bfs_result res)
 	return res < 0;
 }
 
+/*
+ * DEP_*_BIT in lock_list::dep
+ *
+ * For dependency @prev -> @next:
+ *
+ *   RR: both @prev and @next are recursive read locks, i.e. ->read == 2.
+ *   NR: @prev is a non-recursive and @next is recursive.
+ *   RN: @prev is recursive and @next is non-recursive.
+ *   NN: both @prev and @next are non-recursive.
+ *
+ * Note that we define the value of DEP_*_BITs so that:
+ *   bit0 is prev->read != 2
+ *   bit1 is next->read != 2
+ */
+#define DEP_RR_BIT (0 + (0 << 1)) /* 0 */
+#define DEP_NR_BIT (1 + (0 << 1)) /* 1 */
+#define DEP_RN_BIT (0 + (1 << 1)) /* 2 */
+#define DEP_NN_BIT (1 + (1 << 1)) /* 3 */
+
+#define DEP_RR_MASK (1U << (DEP_RR_BIT))
+#define DEP_NR_MASK (1U << (DEP_NR_BIT))
+#define DEP_RN_MASK (1U << (DEP_RN_BIT))
+#define DEP_NN_MASK (1U << (DEP_NN_BIT))
+
+static inline unsigned int
+__calc_dep_bit(struct held_lock *prev, struct held_lock *next)
+{
+	return (prev->read != 2) + ((next->read != 2) << 1);
+}
+
+static inline u8 calc_dep(struct held_lock *prev, struct held_lock *next)
+{
+	return 1U << __calc_dep_bit(prev, next);
+}
+
 static enum bfs_result __bfs(struct lock_list *source_entry,
 			     void *data,
 			     int (*match)(struct lock_list *entry, void *data),
@@ -1921,7 +1957,35 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 		if (entry->class == hlock_class(next)) {
 			if (distance == 1)
 				entry->distance = 1;
-			return 1;
+			entry->dep |= calc_dep(prev, next);
+
+			/*
+			 * Also, update the reverse dependency in @next's
+			 * ->locks_before list.
+			 *
+			 *  Here we reuse @entry as the cursor, which is fine
+			 *  because we won't go to the next iteration of the
+			 *  outer loop:
+			 *
+			 *  For normal cases, we return in the inner loop.
+			 *
+			 *  If we fail to return, we have inconsistency, i.e.
+			 *  <prev>::locks_after contains <next> while
+			 *  <next>::locks_before doesn't contain <prev>. In
+			 *  that case, we return after the inner and indicate
+			 *  something is wrong.
+			 */
+			list_for_each_entry(entry, &hlock_class(next)->locks_before, entry) {
+				if (entry->class == hlock_class(prev)) {
+					if (distance == 1)
+						entry->distance = 1;
+					entry->dep |= calc_dep(next, prev);
+					return 1;
+				}
+			}
+
+			/* <prev> is not found in <next>::locks_before */
+			return 0;
 		}
 	}
 
@@ -1948,14 +2012,18 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 	 */
 	ret = add_lock_to_list(hlock_class(next),
 			       &hlock_class(prev)->locks_after,
-			       next->acquire_ip, distance, trace);
+			       next->acquire_ip, distance,
+			       calc_dep(prev, next),
+			       trace);
 
 	if (!ret)
 		return 0;
 
 	ret = add_lock_to_list(hlock_class(prev),
 			       &hlock_class(next)->locks_before,
-			       next->acquire_ip, distance, trace);
+			       next->acquire_ip, distance,
+			       calc_dep(next, prev),
+			       trace);
 	if (!ret)
 		return 0;
 
-- 
2.16.2


* [RFC tip/locking/lockdep v6 07/20] lockdep: Extend __bfs() to work with multiple types of dependencies
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (5 preceding siblings ...)
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 06/20] lockdep: Introduce lock_list::dep Boqun Feng
@ 2018-04-11 13:50 ` Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 08/20] lockdep: Make __bfs(.match) return bool Boqun Feng
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

Now we have four types of dependencies in the dependency graph, and not
all the paths carry real dependencies (the dependencies that may cause
a deadlock), for example:

	Given lock A and B, if we have:

	CPU1			CPU2
	=============		==============
	write_lock(A);		read_lock(B);
	read_lock(B);		write_lock(A);

	then we have dependencies A -(NR)-> B, and B -(RN)-> A, and a
	dependency path A -(NR)-> B -(RN)-> A.

	In lockdep w/o recursive locks, a dependency path from A to A
	means a deadlock. However, the above case is obviously not a
	deadlock, because no one holds B exclusively, therefore no one
	waits for the other to release B, so whichever of CPU1 and CPU2
	gets A first will run without blocking.

	As a result, dependency path A -(NR)-> B -(RN)-> A is not a
	real/strong dependency that could cause a deadlock.

From the observation above, we know that for a dependency path to be
real/strong, we need at least one exclusive holder for each lock in the
dependency path; otherwise, one could get the resource from the shared
holder and make progress.

Therefore we have the definition: If a path of dependencies doesn't have
two adjacent dependencies as -(*R)-> L -(R*)->, where L is some lock, it
is a strong dependency path, otherwise it's not.

Now our mission is to make __bfs() traverse only the strong dependency
paths, which is simple: in lock_list::only_xr we record whether the
previous lock_list of the path only has -(*R)-> dependencies. When we
pick a dependency during the traversal, we 1) filter out any -(R*)->
dependency if the previous lock_list only has -(*R)-> dependencies
(i.e. ->only_xr is true), and 2) set the next lock_list::only_xr to
true if only -(*R)-> dependencies are left after the filtering in 1),
and to false otherwise.
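
The two rules above can be sketched as a single step function in
userspace. The masks follow the DEP_*_MASK layout of the previous patch;
strong_step_sketch() and its signature are hypothetical and only model
the per-edge decision, not the queue handling of __bfs().

```c
#include <stdbool.h>
#include <stdint.h>

#define DEP_RR_MASK (1U << 0)
#define DEP_NR_MASK (1U << 1)
#define DEP_RN_MASK (1U << 2)
#define DEP_NN_MASK (1U << 3)

/*
 * Hypothetical model of one traversal step.
 *
 * @prev_only_xr: whether the previous edge on the path only had *R types
 * @dep:          bitmap of dependency types recorded on the current edge
 * @next_only_xr: set to the ->only_xr value for the next step, if taken
 *
 * Returns false if the edge must be skipped, true if it can be followed.
 */
static bool strong_step_sketch(bool prev_only_xr, uint8_t dep,
			       bool *next_only_xr)
{
	/* Rule 1: *R followed by R* is not strong, so drop RR and RN. */
	if (prev_only_xr)
		dep &= (uint8_t)~(DEP_RR_MASK | DEP_RN_MASK);

	/* Nothing usable is left on this edge. */
	if (!dep)
		return false;

	/* Rule 2: only *R left means no RN and no NN bit is set. */
	*next_only_xr = !(dep & (DEP_RN_MASK | DEP_NN_MASK));
	return true;
}
```

Applied to the example path A -(NR)-> B -(RN)-> A with a write-locked root,
the first step is taken and leaves ->only_xr true, and the second step is
then skipped, so the path is not reported as strong.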

With this extension for __bfs(), we now need to initialize the root of
__bfs() properly (with a correct ->only_xr). To do so, we introduce some
helper functions, which also clean up the __bfs() root initialization
code a little.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 include/linux/lockdep.h  |   2 +
 kernel/locking/lockdep.c | 103 +++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 88 insertions(+), 17 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index cbd257f62877..6b661717a594 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -189,6 +189,8 @@ struct lock_list {
 	u16				distance;
 	/* bitmap of different dependencies from head to this */
 	u8				dep;
+	/* used by BFS to record whether "prev -> this" only has *R */
+	u8				only_xr;
 
 	/*
 	 * The parent field is used to implement breadth-first search, and the
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 0892855b6c57..53ce81e8a6a9 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1037,6 +1037,10 @@ static inline bool bfs_error(enum bfs_result res)
 #define DEP_RN_MASK (1U << (DEP_RN_BIT))
 #define DEP_NN_MASK (1U << (DEP_NN_BIT))
 
+/* dep masks for *N and N* */
+#define DEP_XN_MASK (DEP_RN_MASK | DEP_NN_MASK)
+#define DEP_NX_MASK (DEP_NR_MASK | DEP_NN_MASK)
+
 static inline unsigned int
 __calc_dep_bit(struct held_lock *prev, struct held_lock *next)
 {
@@ -1048,6 +1052,61 @@ static inline u8 calc_dep(struct held_lock *prev, struct held_lock *next)
 	return 1U << __calc_dep_bit(prev, next);
 }
 
+/*
+ * Initialize a lock_list entry @lock belonging to @class as the root for a BFS
+ * search.
+ */
+static inline void __bfs_init_root(struct lock_list *lock,
+				   struct lock_class *class)
+{
+	lock->class = class;
+	lock->parent = NULL;
+	lock->only_xr = 0;
+}
+
+/*
+ * Initialize a lock_list entry @lock based on a lock acquisition @hlock as the
+ * root for a BFS search.
+ *
+ * ->only_xr of the initial lock node is set to @hlock->read == 2, to make sure
+ * that <prev> -> @hlock and @hlock -> <whatever __bfs() found> is not *R and
+ * R*.
+ */
+static inline void bfs_init_root(struct lock_list *lock,
+				 struct held_lock *hlock)
+{
+	__bfs_init_root(lock, hlock_class(hlock));
+	lock->only_xr = (hlock->read == 2);
+}
+
+/*
+ * Breadth-First Search to find a strong path in the dependency graph.
+ *
+ * @source_entry: the source of the path we are searching for.
+ * @data: data used for the second parameter of @match function
+ * @match: match function for the search
+ * @target_entry: pointer to the target of a matched path
+ * @forward: direction of path, the lockdep dependency forward or backward
+ *
+ * We may have multiple edges (considering different kinds of dependencies,
+ * e.g. NR and RN) between two nodes in the dependency graph. But
+ * only the strong dependency path in the graph is relevant to deadlocks. A
+ * strong dependency path is a dependency path that doesn't have two adjacent
+ * dependencies as *R -> R*, the reason why strong dependency path can be
+ * defined as this is:
+ *
+ *     In order to make tasks/CPUs to block one by one in a dependency path,
+ *     there must be at least one exclusive holder for each lock, i.e. two
+ *     adjacent dependencies can be *N -> N*, *R -> N* or *N -> R*, but not *R
+ *     -> R*.
+ *
+ * In __bfs(), we only traverse in the strong dependency path:
+ *
+ *     In lock_list::only_xr, we record whether the previous dependency only
+ *     has *R in the search, and if it does (prev only has *R), we filter out
+ *     any R* in the current dependency and after that, the ->only_xr is set
+ *     according to whether we only have *R left.
+ */
 static enum bfs_result __bfs(struct lock_list *source_entry,
 			     void *data,
 			     int (*match)(struct lock_list *entry, void *data),
@@ -1078,6 +1137,7 @@ static enum bfs_result __bfs(struct lock_list *source_entry,
 
 	while (!__cq_empty(cq)) {
 		struct lock_list *lock;
+		bool prev_only_xr;
 
 		__cq_dequeue(cq, (unsigned long *)&lock);
 
@@ -1103,10 +1163,28 @@ static enum bfs_result __bfs(struct lock_list *source_entry,
 		else
 			head = &lock->class->locks_before;
 
+		prev_only_xr = lock->only_xr;
+
 		DEBUG_LOCKS_WARN_ON(!irqs_disabled());
 
 		list_for_each_entry_rcu(entry, head, entry) {
 			unsigned int cq_depth;
+			u8 dep = entry->dep;
+
+			/*
+			 * Mask out all R* if we only have *R in previous
+			 * step, because *R -> R* don't make up a strong
+			 * dependency.
+			 */
+			if (prev_only_xr)
+				dep &= ~(DEP_RR_MASK | DEP_RN_MASK);
+
+			/* If nothing left, we skip */
+			if (!dep)
+				continue;
+
+			/* If there are only *R left, set that for the next step */
+			entry->only_xr = !(dep & (DEP_RN_MASK | DEP_NN_MASK));
 
 			visit_lock_entry(entry, lock);
 			if (match(entry, data)) {
@@ -1334,8 +1412,7 @@ unsigned long lockdep_count_forward_deps(struct lock_class *class)
 	unsigned long ret, flags;
 	struct lock_list this;
 
-	this.parent = NULL;
-	this.class = class;
+	__bfs_init_root(&this, class);
 
 	local_irq_save(flags);
 	arch_spin_lock(&lockdep_lock);
@@ -1361,8 +1438,7 @@ unsigned long lockdep_count_backward_deps(struct lock_class *class)
 	unsigned long ret, flags;
 	struct lock_list this;
 
-	this.parent = NULL;
-	this.class = class;
+	__bfs_init_root(&this, class);
 
 	local_irq_save(flags);
 	arch_spin_lock(&lockdep_lock);
@@ -1649,17 +1725,14 @@ check_usage(struct task_struct *curr, struct held_lock *prev,
 	struct lock_list *uninitialized_var(target_entry);
 	struct lock_list *uninitialized_var(target_entry1);
 
-	this.parent = NULL;
-
-	this.class = hlock_class(prev);
+	bfs_init_root(&this, prev);
 	ret = find_usage_backwards(&this, bit_backwards, &target_entry);
 	if (bfs_error(ret))
 		return print_bfs_bug(ret);
 	if (ret == BFS_RNOMATCH)
 		return 1;
 
-	that.parent = NULL;
-	that.class = hlock_class(next);
+	bfs_init_root(&that, next);
 	ret = find_usage_forwards(&that, bit_forwards, &target_entry1);
 	if (bfs_error(ret))
 		return print_bfs_bug(ret);
@@ -1915,8 +1988,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 	 * We are using global variables to control the recursion, to
 	 * keep the stackframe size of the recursive functions low:
 	 */
-	this.class = hlock_class(next);
-	this.parent = NULL;
+	bfs_init_root(&this, next);
 	ret = check_noncircular(&this, hlock_class(prev), &target_entry);
 	if (unlikely(ret == BFS_RMATCH)) {
 		if (!trace->entries) {
@@ -1992,8 +2064,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 	/*
 	 * Is the <prev> -> <next> link redundant?
 	 */
-	this.class = hlock_class(prev);
-	this.parent = NULL;
+	bfs_init_root(&this, prev);
 	ret = check_redundant(&this, hlock_class(next), &target_entry);
 	if (ret == BFS_RMATCH) {
 		debug_atomic_inc(nr_redundant);
@@ -2736,8 +2807,7 @@ check_usage_forwards(struct task_struct *curr, struct held_lock *this,
 	struct lock_list root;
 	struct lock_list *uninitialized_var(target_entry);
 
-	root.parent = NULL;
-	root.class = hlock_class(this);
+	bfs_init_root(&root, this);
 	ret = find_usage_forwards(&root, bit, &target_entry);
 	if (bfs_error(ret))
 		return print_bfs_bug(ret);
@@ -2760,8 +2830,7 @@ check_usage_backwards(struct task_struct *curr, struct held_lock *this,
 	struct lock_list root;
 	struct lock_list *uninitialized_var(target_entry);
 
-	root.parent = NULL;
-	root.class = hlock_class(this);
+	bfs_init_root(&root, this);
 	ret = find_usage_backwards(&root, bit, &target_entry);
 	if (bfs_error(ret))
 		return print_bfs_bug(ret);
-- 
2.16.2


* [RFC tip/locking/lockdep v6 08/20] lockdep: Make __bfs(.match) return bool
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (6 preceding siblings ...)
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 07/20] lockdep: Extend __bfs() to work with multiple types of dependencies Boqun Feng
@ 2018-04-11 13:50 ` Boqun Feng
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 09/20] lockdep: Support deadlock detection for recursive read locks in check_noncircular() Boqun Feng
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

The "match" parameter of __bfs() is used for checking whether we hit a
match in the search, therefore it should return a boolean value rather
than an integer for better readability.

This patch then changes the return type of the function parameter and the
match functions to bool.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 53ce81e8a6a9..df1637db923a 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1109,7 +1109,7 @@ static inline void bfs_init_root(struct lock_list *lock,
  */
 static enum bfs_result __bfs(struct lock_list *source_entry,
 			     void *data,
-			     int (*match)(struct lock_list *entry, void *data),
+			     bool (*match)(struct lock_list *entry, void *data),
 			     struct lock_list **target_entry,
 			     int forward)
 {
@@ -1209,7 +1209,7 @@ static enum bfs_result __bfs(struct lock_list *source_entry,
 static inline enum bfs_result
 __bfs_forwards(struct lock_list *src_entry,
 	       void *data,
-	       int (*match)(struct lock_list *entry, void *data),
+	       bool (*match)(struct lock_list *entry, void *data),
 	       struct lock_list **target_entry)
 {
 	return __bfs(src_entry, data, match, target_entry, 1);
@@ -1219,7 +1219,7 @@ __bfs_forwards(struct lock_list *src_entry,
 static inline enum bfs_result
 __bfs_backwards(struct lock_list *src_entry,
 		void *data,
-		int (*match)(struct lock_list *entry, void *data),
+		bool (*match)(struct lock_list *entry, void *data),
 		struct lock_list **target_entry)
 {
 	return __bfs(src_entry, data, match, target_entry, 0);
@@ -1333,7 +1333,7 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
 	return 0;
 }
 
-static inline int class_equal(struct lock_list *entry, void *data)
+static inline bool class_equal(struct lock_list *entry, void *data)
 {
 	return entry->class == data;
 }
@@ -1392,10 +1392,10 @@ static noinline int print_bfs_bug(int ret)
 	return 0;
 }
 
-static int noop_count(struct lock_list *entry, void *data)
+static bool noop_count(struct lock_list *entry, void *data)
 {
 	(*(unsigned long *)data)++;
-	return 0;
+	return false;
 }
 
 static unsigned long __lockdep_count_forward_deps(struct lock_list *this)
@@ -1486,7 +1486,7 @@ check_redundant(struct lock_list *root, struct lock_class *target,
  * without creating any illegal irq-safe -> irq-unsafe lock dependency.
  */
 
-static inline int usage_match(struct lock_list *entry, void *bit)
+static inline bool usage_match(struct lock_list *entry, void *bit)
 {
 	return entry->class->usage_mask & (1 << (enum lock_usage_bit)bit);
 }
-- 
2.16.2


* [RFC tip/locking/lockdep v6 09/20] lockdep: Support deadlock detection for recursive read locks in check_noncircular()
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (7 preceding siblings ...)
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 08/20] lockdep: Make __bfs(.match) return bool Boqun Feng
@ 2018-04-11 13:50 ` Boqun Feng
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 10/20] lockdep: Adjust check_redundant() for recursive read change Boqun Feng
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

Currently, lockdep only has limited support for deadlock detection for
recursive read locks.

This patch adds deadlock detection for recursive read locks. The
basic idea is:

When we are about to add dependency B -> A into the dependency graph, we
use check_noncircular() to find whether we have a strong dependency path
A -> .. -> B, so that we have a strong dependency cycle:

	 A -> .. -> B -> A

i.e. a cycle which doesn't have two adjacent dependencies as
-(*R)-> L -(R*)->. Similar to the definition of strong dependency paths,
if there are two adjacent dependencies like that, there is at least one
lock which doesn't have an exclusive holder, so there is no deadlock.

Since A -> .. -> B is already a strong dependency path, if either
B -> A is N* or A -> .. -> B is *N, the cycle A -> .. -> B -> A is
strong, otherwise it is not. So we introduce a new match function
hlock_conflict() to replace class_equal() for the deadlock check in
check_noncircular().
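
The closing condition can be modelled in userspace as below. This is only
a sketch of the boolean test described above: closes_strong_cycle_sketch()
and its parameters are hypothetical names, and the class comparison done
by the real match function is reduced to a flag.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the cycle check for a new dependency B -> A.
 *
 * @path_found:   a strong path A -> .. -> B was reached by the search
 * @b_read:       ->read of the held lock B (2 means recursive read)
 * @path_only_xr: the last edge of A -> .. -> B only has *R types
 *
 * The cycle is strong if B -> A is N* or the path ends with *N, because
 * then no *R -> R* pair appears at B.
 */
static bool closes_strong_cycle_sketch(bool path_found, unsigned int b_read,
				       bool path_only_xr)
{
	return path_found && (b_read != 2 || !path_only_xr);
}
```

Only the combination "B is held as a recursive read lock and the path
reaches B through *R edges only" lets a found path pass without being
treated as a deadlock.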

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index df1637db923a..6b5d43687c3b 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1338,6 +1338,33 @@ static inline bool class_equal(struct lock_list *entry, void *data)
 	return entry->class == data;
 }
 
+/*
+ * We are about to add B -> A into the dependency graph, and in __bfs() a
+ * strong dependency path A -> .. -> B is found: hlock_class equals
+ * entry->class.
+ *
+ * We will have a deadlock case (conflict) if A -> .. -> B -> A is a strong
+ * dependency cycle, that means:
+ *
+ * Either
+ *
+ *     a) B -> A is N*
+ *
+ * or
+ *
+ *     b) A -> .. -> B is *N (i.e. A -> .. -(*N)-> B)
+ *
+ * as then we don't have *R -> R* in the cycle.
+ */
+static inline bool hlock_conflict(struct lock_list *entry, void *data)
+{
+	struct held_lock *hlock = (struct held_lock *)data;
+
+	return hlock_class(hlock) == entry->class && /* Found A -> .. -> B */
+	       (hlock->read != 2 || /* B -> A is N* */
+		!entry->only_xr); /* A -> .. -> B is *N */
+}
+
 static noinline int print_circular_bug(struct lock_list *this,
 				struct lock_list *target,
 				struct held_lock *check_src,
@@ -1450,18 +1477,18 @@ unsigned long lockdep_count_backward_deps(struct lock_class *class)
 }
 
 /*
- * Prove that the dependency graph starting at <entry> can not
+ * Prove that the dependency graph starting at <root> can not
  * lead to <target>. Print an error and return BFS_RMATCH if it does.
  */
 static noinline enum bfs_result
-check_noncircular(struct lock_list *root, struct lock_class *target,
+check_noncircular(struct lock_list *root, struct held_lock *target,
 		  struct lock_list **target_entry)
 {
 	enum bfs_result result;
 
 	debug_atomic_inc(nr_cyclic_checks);
 
-	result = __bfs_forwards(root, target, class_equal, target_entry);
+	result = __bfs_forwards(root, target, hlock_conflict, target_entry);
 
 	return result;
 }
@@ -1989,7 +2016,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 	 * keep the stackframe size of the recursive functions low:
 	 */
 	bfs_init_root(&this, next);
-	ret = check_noncircular(&this, hlock_class(prev), &target_entry);
+	ret = check_noncircular(&this, prev, &target_entry);
 	if (unlikely(ret == BFS_RMATCH)) {
 		if (!trace->entries) {
 			/*
-- 
2.16.2

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC tip/locking/lockdep v6 10/20] lockdep: Adjust check_redundant() for recursive read change
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (8 preceding siblings ...)
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 09/20] lockdep: Support deadlock detection for recursive read locks in check_noncircular() Boqun Feng
@ 2018-04-11 13:51 ` Boqun Feng
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 11/20] lockdep: Fix recursive read lock related safe->unsafe detection Boqun Feng
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

check_redundant() reports redundancy if the BFS search finds a path
that could replace the about-to-add dependency. With the recursive read
lock changes, we need to change the match function for
check_redundant(), because the path needs to match not only the lock
class but also the dependency kinds. For example, if the about-to-add
dependency @prev -> @next is A -(RN)-> B, and we find a path A -(R*)->
.. -(*R)-> B in the dependency graph with __bfs() (for simplicity, we
can also say we find an RR path from A to B), we cannot replace the
dependency with that path in the BFS search, because an RN dependency
can be followed by an RN dependency to form a strong path, whereas an
RR path cannot.

Further, we can also replace an RN dependency with an NN path. That
means if we find a path which is stronger than or equal to the
about-to-add dependency, we can report the redundancy. By "stronger", we
mean that both the start and the end of the path are not weaker than the
start and the end of the dependency, so that we can replace the
dependency with that path.

To make sure we find a path whose start point is not weaker than the
about-to-add dependency, we use a trick: the ->only_xr of the root
(start point) of __bfs() is initialized as @prev->read != 2. Therefore
if @prev is N, __bfs() will only pick N* for the first dependency;
otherwise, __bfs() can pick N* or R* for the first dependency.

To make sure we find a path whose end point is not weaker than the
about-to-add dependency, we replace the match function for __bfs() in
check_redundant(): we check for the case that either @next is R
(anything is not weaker than it) or the end point of the path is N
(which is not weaker than anything).
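
The two halves of the check can be sketched in isolation as below. This
is a minimal model, assuming simplified red_* structs that are
hypothetical and not the kernel's; red_root_only_xr() corresponds to the
root setup in check_prev_add() and red_hlock_equal() to the new match
function.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct lock_list (illustration only). */
struct red_entry {
	int class_id;	/* lock class reached by the BFS */
	bool only_xr;	/* true: the path found so far ends with *R */
};

/* Hypothetical stand-in for struct held_lock (illustration only). */
struct red_hlock {
	int class_id;
	int read;	/* 0: write, 1: read, 2: recursive read */
};

/*
 * Start point: if @prev is N, pretend the (non-existent) previous
 * dependency was *R, so that __bfs() may only pick N* first.
 */
static bool red_root_only_xr(const struct red_hlock *prev)
{
	return prev->read != 2;
}

/* End point: the path must not end weaker than @prev -> @next does. */
static bool red_hlock_equal(const struct red_entry *entry,
			    const struct red_hlock *next)
{
	return next->class_id == entry->class_id &&	/* found A -> .. -> B */
	       (next->read == 2 ||			/* A -> B is *R */
		!entry->only_xr);			/* A -> .. -> B is *N */
}
```

Note that the read test is the mirror image of the one used for the
circle check: there a recursive reader weakens the match, here it makes
every path end acceptable.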

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 53 ++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 47 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 6b5d43687c3b..6135d1836ed3 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1333,9 +1333,40 @@ print_circular_bug_header(struct lock_list *entry, unsigned int depth,
 	return 0;
 }
 
-static inline bool class_equal(struct lock_list *entry, void *data)
+/*
+ * We are about to add A -> B into the dependency graph, and in __bfs() a
+ * strong dependency path A -> .. -> B is found: hlock_class equals
+ * entry->class.
+ *
+ * If A -> .. -> B can replace A -> B in any __bfs() search (means the former
+ * is _stronger_ than or equal to the latter), we consider A -> B as redundant.
+ * For example if A -> .. -> B is NN (i.e. A -(N*)-> .. -(*N)-> B), and A -> B
+ * is NR or NN, then we don't need to add A -> B into the dependency graph, as
+ * any strong path ..-> A -> B ->.. we can get with having dependency A -> B,
+ * we could already get an equivalent path ..-> A -> .. -> B -> .. with A -> ..
+ * -> B. Therefore A -> B is redundant.
+ *
+ * We need to make sure both the start and the end of A -> .. -> B is not
+ * weaker than A -> B. For the start part, please see the comment before
+ * call-site of check_redundant() in check_prev_add(). For the end part, we
+ * need:
+ *
+ * Either
+ *
+ *     a) A -> B is *R (everything is not weaker than that)
+ *
+ * or
+ *
+ *     b) A -> .. -> B is *N (nothing is stronger than this)
+ *
+ */
+static inline bool hlock_equal(struct lock_list *entry, void *data)
 {
-	return entry->class == data;
+	struct held_lock *hlock = (struct held_lock *)data;
+
+	return hlock_class(hlock) == entry->class && /* Found A -> .. -> B */
+	       (hlock->read == 2 ||  /* A -> B is *R */
+		!entry->only_xr); /* A -> .. -> B is *N */
 }
 
 /*
@@ -1494,14 +1525,14 @@ check_noncircular(struct lock_list *root, struct held_lock *target,
 }
 
 static noinline enum bfs_result
-check_redundant(struct lock_list *root, struct lock_class *target,
+check_redundant(struct lock_list *root, struct held_lock *target,
 		struct lock_list **target_entry)
 {
 	enum bfs_result result;
 
 	debug_atomic_inc(nr_redundant_checks);
 
-	result = __bfs_forwards(root, target, class_equal, target_entry);
+	result = __bfs_forwards(root, target, hlock_equal, target_entry);
 
 	return result;
 }
@@ -2090,9 +2121,19 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 
 	/*
 	 * Is the <prev> -> <next> link redundant?
+	 *
+	 * Special setup for check_redundant().
+	 *
+	 * To report redundant, we need to find a strong dependency path that
+	 * is equal to or stronger than <prev> -> <next>. So if <prev> is N,
+	 * we need to let __bfs() only search for a path starting at a N*, we
+	 * achieve this by setting the initial node's ->only_xr to true in
+	 * that case. And if <prev> is R, we set initial ->only_xr to false
+	 * because both R* (equal) and N* (stronger) are redundant.
 	 */
-	bfs_init_root(&this, prev);
-	ret = check_redundant(&this, hlock_class(next), &target_entry);
+	__bfs_init_root(&this, hlock_class(prev));
+	this.only_xr = prev->read != 2;
+	ret = check_redundant(&this, next, &target_entry);
 	if (ret == BFS_RMATCH) {
 		debug_atomic_inc(nr_redundant);
 		return 2;
-- 
2.16.2

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC tip/locking/lockdep v6 11/20] lockdep: Fix recursive read lock related safe->unsafe detection
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (9 preceding siblings ...)
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 10/20] lockdep: Adjust check_redundant() for recursive read change Boqun Feng
@ 2018-04-11 13:51 ` Boqun Feng
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 12/20] lockdep: Add recursive read locks into dependency graph Boqun Feng
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

An irq-safe -> irq-unsafe lock dependency is illegal, because it could
introduce the "irq inversion" problem: while an irq-unsafe lock is
held, an interrupt arrives and its handler tries to acquire an irq-safe
lock. That adds a temporary dependency from the irq-unsafe lock to the
irq-safe lock and, as a result, a deadlock.

There are four cases for irq inversion deadlocks:

(-(X..Y)-> means a strong dependency path that starts with a --(X*)-->
dependency and ends with a --(*Y)--> dependency.)

1.	An irq-safe lock L1 has a dependency -> .. -> to an
	irq-unsafe lock L2.

2.	An irq-read-safe lock L1 has a dependency -(N*)-> .. -> to an
	irq-unsafe lock L2.

3.	An irq-safe lock L1 has a dependency -> .. -(*N)-> to an
	irq-read-unsafe lock L2.

4.	An irq-read-safe lock L1 has a dependency -(N*)-> .. -(*N)-> to
	an irq-read-unsafe lock L2.
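
The four cases collapse into a single predicate, which might be sketched
as below. This is only a model: struct toy_inversion and its field names
are hypothetical and simply record what is known about L1, L2 and the
path between them.

```c
#include <stdbool.h>

/* Hypothetical summary of L1, L2 and the path L1 -> .. -> L2. */
struct toy_inversion {
	bool l1_safe;		/* L1 is irq-(write-)safe */
	bool l1_read_safe;	/* L1 is irq-read-safe */
	bool path_starts_n;	/* the path starts with N* */
	bool path_ends_n;	/* the path ends with *N */
	bool l2_unsafe;		/* L2 is irq-(write-)unsafe */
	bool l2_read_unsafe;	/* L2 is irq-read-unsafe */
};

/* Returns true for any of the four irq inversion cases. */
static bool toy_is_irq_inversion(const struct toy_inversion *v)
{
	/* A read usage on L1 only counts if the path leaves L1 with N*. */
	bool safe_end = v->l1_safe ||
			(v->l1_read_safe && v->path_starts_n);
	/* A read usage on L2 only counts if the path reaches L2 with *N. */
	bool unsafe_end = v->l2_unsafe ||
			  (v->l2_read_unsafe && v->path_ends_n);

	return safe_end && unsafe_end;
}
```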

The current check_usage() only checks 1) and 2). With the enhanced
__bfs(), we can combine all four in find_usage_{back,for}wards() by
using another match function. The idea is: if we have a dependency path
L1 -> .. -> L2, and the temporary irq inversion dependency from unsafe
to safe is L2 -> L1, we need L1 -> .. -> L2 -> L1 to be a strong
dependency circle to prove it's a deadlock. So we need either L2 -> L1
to be *N or L1 -> .. -> L2 to be N*, which means either L1 is an
irq-(write-)safe lock or the backwards BFS search finds that ->only_xr
is false for L1. Likewise for L2. usage_match() is updated for exactly
this logic.

Moreover, with the new find_usage_{back,for}wards(), mark_lock_irq() can
be simplified, and since the self inversion case is already covered, we
can further kill valid_state().

Another thing is that since we now find the inversion in one search, and
__bfs() only reports whether a match is found or not, we lack the
information needed by print_bad_irq_dependency(). So we introduce a new
function, match_bit(), which returns the usage bit matched in __bfs();
it must be called after we find a match.
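
To make the bit handling concrete, here is a small stand-alone model of
the match and of the bit recovery. It assumes the usage bit layout that
exclusive_bit() relies on (bit 0 is read/write, bit 1 is the direction);
the TOY_* constants and struct irq_entry are hypothetical and cover one
irq state only.

```c
#include <stdbool.h>

/* Assumed layout for one irq state: bit 0 = read, bit 1 = direction. */
enum {
	TOY_USED_IN		= 0,
	TOY_USED_IN_READ	= 1,
	TOY_ENABLED		= 2,
	TOY_ENABLED_READ	= 3,
};

/* Hypothetical stand-in for struct lock_list plus its class usage. */
struct irq_entry {
	unsigned long usage_mask;	/* usage bits of the lock class */
	bool only_xr;			/* path is R* (backwards) or *R (forwards) */
};

static bool toy_usage_match(const struct irq_entry *entry, int bit)
{
	unsigned long mask, read_mask;

	bit &= ~1;			/* remove the read usage bit */
	mask = 1UL << bit;
	read_mask = 1UL << (bit + 1);

	return (entry->usage_mask & mask) ||	/* write usage: any path */
	       (!entry->only_xr &&		/* read usage: N path only */
		(entry->usage_mask & read_mask));
}

/* Only meaningful after toy_usage_match() returned true for @dir_bit. */
static int toy_match_bit(const struct irq_entry *entry, int dir_bit)
{
	/* No write usage means the match must have been the read usage. */
	return dir_bit + !(entry->usage_mask & (1UL << dir_bit));
}
```

Because the write usage is tested first, a class that has both usages
set is reported with the write bit.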

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
Peter,

I went further than our discussion:

	https://marc.info/?l=linux-kernel&m=151938583827331

and found that we actually only need one backwards BFS and one forwards
BFS for check_irq_usage(); there is no need for two check_usage()s
there, so I open-coded check_usage() into check_irq_usage() and cleaned
up other things.

Given that I added quite a few lines of comments, the diff actually
shows we save a few lines of code.

 kernel/locking/lockdep.c | 299 ++++++++++++++++++++++++++---------------------
 1 file changed, 163 insertions(+), 136 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 6135d1836ed3..18eb76189727 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -461,6 +461,26 @@ static const char *usage_str[] =
 	[LOCK_USED] = "INITIAL USE",
 };
 
+static const char * const state_names[] = {
+#define LOCKDEP_STATE(__STATE) \
+	__stringify(__STATE),
+#include "lockdep_states.h"
+#undef LOCKDEP_STATE
+};
+
+static const char * const state_rnames[] = {
+#define LOCKDEP_STATE(__STATE) \
+	__stringify(__STATE)"-RR",
+#include "lockdep_states.h"
+#undef LOCKDEP_STATE
+};
+
+static inline const char *state_name(enum lock_usage_bit bit)
+{
+	return (bit & 1) ? state_rnames[bit >> 2] : state_names[bit >> 2];
+}
+
+
 const char * __get_key_name(struct lockdep_subclass_key *key, char *str)
 {
 	return kallsyms_lookup((unsigned long)key, NULL, NULL, NULL, str);
@@ -1542,11 +1562,44 @@ check_redundant(struct lock_list *root, struct held_lock *target,
  * Forwards and backwards subgraph searching, for the purposes of
  * proving that two subgraphs can be connected by a new dependency
  * without creating any illegal irq-safe -> irq-unsafe lock dependency.
+ *
+ * irq-safe -> irq-unsafe lock dependency is illegal, because it could happen
+ * that when irq-unsafe lock L2 is held, an interrupt happens and the
+ * corresponding handler tries to acquire irq-safe lock L1, and that creates
+ * a temporary dependency L2 -> L1, and since we already find a dependency from
+ * L1 to L2, which means we have a circle L2 -> L1 -> .. -> L2. But note that
+ * the circle has to be a strong path for a deadlock, so we need to rule out
+ * case where 1) L1 -> .. -> L2 is R* and L1 is only irq-read-safe lock and 2)
+ * L1 -> .. -> L2 is *R and L2 is only irq-read-unsafe lock.
  */
 
+/*
+ * The match function for usage checks.
+ *
+ * As above, if in the BFS, entry->only_xr is true (means L1 -> .. is R* in
+ * backwards searching or .. -> L2 is *R in forwards searching), then read
+ * usage of @entry doesn't count as strong. So we only test read usage if
+ * entry->only_xr is false.
+ *
+ * In other words, if we find a write usage (either irq-safe or irq-unsafe), we
+ * don't need to care about the type of the path (i.e. we can ignore
+ * ->only_xr). Otherwise, ->only_xr has to be false, and we also need a read
+ * usage.
+ */
 static inline bool usage_match(struct lock_list *entry, void *bit)
 {
-	return entry->class->usage_mask & (1 << (enum lock_usage_bit)bit);
+	enum lock_usage_bit ub = (enum lock_usage_bit)bit;
+	unsigned long mask;
+	unsigned long read_mask;
+	unsigned long usage;
+
+	ub &= ~1; /* remove the read usage bit */
+	mask = 1UL << ub;
+	read_mask = 1UL << (ub + 1);
+	usage = entry->class->usage_mask;
+
+	return (usage & mask) || /* write usage works with any path */
+	       (!entry->only_xr && (usage & read_mask)); /* read usage only works with *N path */
 }
 
 
@@ -1708,8 +1761,7 @@ print_bad_irq_dependency(struct task_struct *curr,
 			 struct held_lock *prev,
 			 struct held_lock *next,
 			 enum lock_usage_bit bit1,
-			 enum lock_usage_bit bit2,
-			 const char *irqclass)
+			 enum lock_usage_bit bit2)
 {
 	if (!debug_locks_off_graph_unlock() || debug_locks_silent)
 		return 0;
@@ -1717,7 +1769,7 @@ print_bad_irq_dependency(struct task_struct *curr,
 	pr_warn("\n");
 	pr_warn("=====================================================\n");
 	pr_warn("WARNING: %s-safe -> %s-unsafe lock order detected\n",
-		irqclass, irqclass);
+		state_name(bit1), state_name(bit2));
 	print_kernel_ident();
 	pr_warn("-----------------------------------------------------\n");
 	pr_warn("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
@@ -1737,15 +1789,15 @@ print_bad_irq_dependency(struct task_struct *curr,
 	pr_cont("\n");
 
 	pr_warn("\nbut this new dependency connects a %s-irq-safe lock:\n",
-		irqclass);
+		state_name(bit1));
 	print_lock_name(backwards_entry->class);
-	pr_warn("\n... which became %s-irq-safe at:\n", irqclass);
+	pr_warn("\n... which became %s-irq-safe at:\n", state_name(bit1));
 
 	print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
 
-	pr_warn("\nto a %s-irq-unsafe lock:\n", irqclass);
+	pr_warn("\nto a %s-irq-unsafe lock:\n", state_name(bit2));
 	print_lock_name(forwards_entry->class);
-	pr_warn("\n... which became %s-irq-unsafe at:\n", irqclass);
+	pr_warn("\n... which became %s-irq-unsafe at:\n", state_name(bit2));
 	pr_warn("...");
 
 	print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
@@ -1756,13 +1808,13 @@ print_bad_irq_dependency(struct task_struct *curr,
 
 	lockdep_print_held_locks(curr);
 
-	pr_warn("\nthe dependencies between %s-irq-safe lock and the holding lock:\n", irqclass);
+	pr_warn("\nthe dependencies between %s-irq-safe lock and the holding lock:\n", state_name(bit1));
 	if (!save_trace(&prev_root->trace))
 		return 0;
 	print_shortest_lock_dependencies(backwards_entry, prev_root);
 
 	pr_warn("\nthe dependencies between the lock to be acquired");
-	pr_warn(" and %s-irq-unsafe lock:\n", irqclass);
+	pr_warn(" and %s-irq-unsafe lock:\n", state_name(bit2));
 	if (!save_trace(&next_root->trace))
 		return 0;
 	print_shortest_lock_dependencies(forwards_entry, next_root);
@@ -1773,55 +1825,6 @@ print_bad_irq_dependency(struct task_struct *curr,
 	return 0;
 }
 
-static int
-check_usage(struct task_struct *curr, struct held_lock *prev,
-	    struct held_lock *next, enum lock_usage_bit bit_backwards,
-	    enum lock_usage_bit bit_forwards, const char *irqclass)
-{
-	int ret;
-	struct lock_list this, that;
-	struct lock_list *uninitialized_var(target_entry);
-	struct lock_list *uninitialized_var(target_entry1);
-
-	bfs_init_root(&this, prev);
-	ret = find_usage_backwards(&this, bit_backwards, &target_entry);
-	if (bfs_error(ret))
-		return print_bfs_bug(ret);
-	if (ret == BFS_RNOMATCH)
-		return 1;
-
-	bfs_init_root(&that, next);
-	ret = find_usage_forwards(&that, bit_forwards, &target_entry1);
-	if (bfs_error(ret))
-		return print_bfs_bug(ret);
-	if (ret == BFS_RNOMATCH)
-		return 1;
-
-	return print_bad_irq_dependency(curr, &this, &that,
-			target_entry, target_entry1,
-			prev, next,
-			bit_backwards, bit_forwards, irqclass);
-}
-
-static const char *state_names[] = {
-#define LOCKDEP_STATE(__STATE) \
-	__stringify(__STATE),
-#include "lockdep_states.h"
-#undef LOCKDEP_STATE
-};
-
-static const char *state_rnames[] = {
-#define LOCKDEP_STATE(__STATE) \
-	__stringify(__STATE)"-RR",
-#include "lockdep_states.h"
-#undef LOCKDEP_STATE
-};
-
-static inline const char *state_name(enum lock_usage_bit bit)
-{
-	return (bit & 1) ? state_rnames[bit >> 2] : state_names[bit >> 2];
-}
-
 static int exclusive_bit(int new_bit)
 {
 	/*
@@ -1844,32 +1847,66 @@ static int exclusive_bit(int new_bit)
 	return state | (dir ^ 2);
 }
 
+/*
+ * We found a lock L whose class is @entry->class and L has a usage matching
+ * @dir_bit with usage_match() in a BFS search. And we need to figure out which
+ * exact usage bit is matched by usage_match().
+ *
+ * This function must be called after find_usage_{for,back}wards() and we find
+ * a match.
+ *
+ * Since we already find a match, if the lock doesn't have write usage, that
+ * means the usage we matched was read usage, otherwise it was write usage
+ * (because we check write usage in usage_match() first).
+ */
+static enum lock_usage_bit match_bit(struct lock_list *entry,
+				     enum lock_usage_bit dir_bit)
+{
+	unsigned long mask = 1UL << dir_bit;
+	unsigned long usage = entry->class->usage_mask;
+
+	return dir_bit + !(usage & mask);
+}
+
 static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
-			   struct held_lock *next, enum lock_usage_bit bit)
+			   struct held_lock *next, enum lock_usage_bit dir_bit)
 {
-	/*
-	 * Prove that the new dependency does not connect a hardirq-safe
-	 * lock with a hardirq-unsafe lock - to achieve this we search
-	 * the backwards-subgraph starting at <prev>, and the
-	 * forwards-subgraph starting at <next>:
-	 */
-	if (!check_usage(curr, prev, next, bit,
-			   exclusive_bit(bit), state_name(bit)))
-		return 0;
+	int ret;
+	struct lock_list this, that;
+	struct lock_list *uninitialized_var(target_entry);
+	struct lock_list *uninitialized_var(target_entry1);
+	enum lock_usage_bit excl_bit;
 
-	bit++; /* _READ */
+	/* the read/write bit should be zero */
+	if (WARN_ON_ONCE(dir_bit & 1))
+		dir_bit &= ~1;
+
+	excl_bit = exclusive_bit(dir_bit);
 
 	/*
-	 * Prove that the new dependency does not connect a hardirq-safe-read
-	 * lock with a hardirq-unsafe lock - to achieve this we search
-	 * the backwards-subgraph starting at <prev>, and the
-	 * forwards-subgraph starting at <next>:
+	 * Prove that the new dependency does not connect a irq-safe lock with
+	 * a irq-unsafe lock - to achieve this we search the backwards-subgraph
+	 * starting at <prev>, and the forwards-subgraph starting at <next>:
 	 */
-	if (!check_usage(curr, prev, next, bit,
-			   exclusive_bit(bit), state_name(bit)))
-		return 0;
+	bfs_init_root(&this, prev);
+	ret = find_usage_backwards(&this, dir_bit, &target_entry);
+	if (bfs_error(ret))
+		return print_bfs_bug(ret);
+	if (ret == BFS_RNOMATCH)
+		return 1;
 
-	return 1;
+	bfs_init_root(&that, next);
+	ret = find_usage_forwards(&that, excl_bit, &target_entry1);
+	if (bfs_error(ret))
+		return print_bfs_bug(ret);
+	if (ret == BFS_RNOMATCH)
+		return 1;
+
+	return print_bad_irq_dependency(curr, &this, &that,
+			target_entry, target_entry1,
+			prev, next,
+			match_bit(target_entry, dir_bit),
+			match_bit(target_entry1, excl_bit));
 }
 
 static int
@@ -2782,18 +2819,6 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,
 	return 0;
 }
 
-/*
- * Print out an error if an invalid bit is set:
- */
-static inline int
-valid_state(struct task_struct *curr, struct held_lock *this,
-	    enum lock_usage_bit new_bit, enum lock_usage_bit bad_bit)
-{
-	if (unlikely(hlock_class(this)->usage_mask & (1 << bad_bit)))
-		return print_usage_bug(curr, this, bad_bit, new_bit);
-	return 1;
-}
-
 static int mark_lock(struct task_struct *curr, struct held_lock *this,
 		     enum lock_usage_bit new_bit);
 
@@ -2863,27 +2888,48 @@ print_irq_inversion_bug(struct task_struct *curr,
 	return 0;
 }
 
+/*
+ * Forwards and backwards subgraph searching, for the purposes of
+ * proving that a dependency path can become an illegal irq-safe ->
+ * irq-unsafe lock dependency because of the newly found usage.
+ *
+ * A special case other than what we describe in comments before usage_match()
+ * is, a lock L adds usage USED_IN while it already has ENABLED usage or vice
+ * versa, unless a lock only adds USED_IN_READ or ENABLED_READ usage to each
+ * other, which is not a deadlock.
+ *
+ * In the special case, find_usage_forwards() and find_usage_backwards() still
+ * work, because __bfs() will first try the root node. We just need to split
+ * this out for a different debug report (self irq inversion).
+ */
+
 /*
  * Prove that in the forwards-direction subgraph starting at <this>
  * there is no lock matching <mask>:
  */
 static int
 check_usage_forwards(struct task_struct *curr, struct held_lock *this,
-		     enum lock_usage_bit bit, const char *irqclass)
+		     enum lock_usage_bit bit, enum lock_usage_bit excl_bit)
 {
 	enum bfs_result ret;
 	struct lock_list root;
 	struct lock_list *uninitialized_var(target_entry);
 
 	bfs_init_root(&root, this);
-	ret = find_usage_forwards(&root, bit, &target_entry);
+	ret = find_usage_forwards(&root, excl_bit, &target_entry);
 	if (bfs_error(ret))
 		return print_bfs_bug(ret);
 	if (ret == BFS_RNOMATCH)
 		return ret;
 
-	return print_irq_inversion_bug(curr, &root, target_entry,
-					this, 1, irqclass);
+	/* self inversion */
+	if (target_entry->class == hlock_class(this))
+		return print_usage_bug(curr, this,
+				       match_bit(target_entry, excl_bit),
+				       bit);
+	else
+		return print_irq_inversion_bug(curr, &root, target_entry,
+					       this, 1, state_name(bit));
 }
 
 /*
@@ -2892,21 +2938,27 @@ check_usage_forwards(struct task_struct *curr, struct held_lock *this,
  */
 static int
 check_usage_backwards(struct task_struct *curr, struct held_lock *this,
-		      enum lock_usage_bit bit, const char *irqclass)
+		      enum lock_usage_bit bit, enum lock_usage_bit excl_bit)
 {
 	enum bfs_result ret;
 	struct lock_list root;
 	struct lock_list *uninitialized_var(target_entry);
 
 	bfs_init_root(&root, this);
-	ret = find_usage_backwards(&root, bit, &target_entry);
+	ret = find_usage_backwards(&root, excl_bit, &target_entry);
 	if (bfs_error(ret))
 		return print_bfs_bug(ret);
 	if (ret == BFS_RNOMATCH)
 		return 1;
 
-	return print_irq_inversion_bug(curr, &root, target_entry,
-					this, 0, irqclass);
+	/* self inversion */
+	if (target_entry->class == hlock_class(this))
+		return print_usage_bug(curr, this,
+				       match_bit(target_entry, excl_bit),
+				       bit);
+	else
+		return print_irq_inversion_bug(curr, &root, target_entry,
+					       this, 0, state_name(bit));
 }
 
 void print_irqtrace_events(struct task_struct *curr)
@@ -2942,8 +2994,6 @@ static int SOFTIRQ_verbose(struct lock_class *class)
 	return 0;
 }
 
-#define STRICT_READ_CHECKS	1
-
 static int (*state_verbose_f[])(struct lock_class *class) = {
 #define LOCKDEP_STATE(__STATE) \
 	__STATE##_verbose,
@@ -2965,44 +3015,21 @@ mark_lock_irq(struct task_struct *curr, struct held_lock *this,
 		enum lock_usage_bit new_bit)
 {
 	int excl_bit = exclusive_bit(new_bit);
-	int read = new_bit & 1;
-	int dir = new_bit & 2;
-
-	/*
-	 * mark USED_IN has to look forwards -- to ensure no dependency
-	 * has ENABLED state, which would allow recursion deadlocks.
-	 *
-	 * mark ENABLED has to look backwards -- to ensure no dependee
-	 * has USED_IN state, which, again, would allow  recursion deadlocks.
-	 */
-	check_usage_f usage = dir ?
-		check_usage_backwards : check_usage_forwards;
 
-	/*
-	 * Validate that this particular lock does not have conflicting
-	 * usage states.
-	 */
-	if (!valid_state(curr, this, new_bit, excl_bit))
-		return 0;
-
-	/*
-	 * Validate that the lock dependencies don't have conflicting usage
-	 * states.
-	 */
-	if ((!read || !dir || STRICT_READ_CHECKS) &&
-			!usage(curr, this, excl_bit, state_name(new_bit & ~1)))
-		return 0;
-
-	/*
-	 * Check for read in write conflicts
-	 */
-	if (!read) {
-		if (!valid_state(curr, this, new_bit, excl_bit + 1))
+	if (new_bit & 2) {
+		/*
+		 * mark ENABLED has to look backwards -- to ensure no dependee
+		 * has USED_IN state, which, again, would allow recursion
+		 * deadlocks.
+		 */
+		if (!check_usage_backwards(curr, this, new_bit, excl_bit))
 			return 0;
-
-		if (STRICT_READ_CHECKS &&
-			!usage(curr, this, excl_bit + 1,
-				state_name(new_bit + 1)))
+	} else {
+		/*
+		 * mark USED_IN has to look forwards -- to ensure no dependency
+		 * has ENABLED state, which would allow recursion deadlocks.
+		 */
+		if (!check_usage_forwards(curr, this, new_bit, excl_bit))
 			return 0;
 	}
 
-- 
2.16.2

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC tip/locking/lockdep v6 12/20] lockdep: Add recursive read locks into dependency graph
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (10 preceding siblings ...)
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 11/20] lockdep: Fix recursive read lock related safe->unsafe detection Boqun Feng
@ 2018-04-11 13:51 ` Boqun Feng
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 13/20] lockdep/selftest: Add a R-L/L-W test case specific to chain cache behavior Boqun Feng
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

Now that we have all the fundamentals to handle recursive read locks, we
add them into the dependency graph.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 18eb76189727..45a370438add 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2102,16 +2102,6 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 	if (!check_prev_add_irq(curr, prev, next))
 		return 0;
 
-	/*
-	 * For recursive read-locks we do all the dependency checks,
-	 * but we dont store read-triggered dependencies (only
-	 * write-triggered dependencies). This ensures that only the
-	 * write-side dependencies matter, and that if for example a
-	 * write-lock never takes any other locks, then the reads are
-	 * equivalent to a NOP.
-	 */
-	if (next->read == 2 || prev->read == 2)
-		return 1;
 	/*
 	 * Is the <prev> -> <next> dependency already present?
 	 *
@@ -2243,11 +2233,7 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 		u16 distance = curr->lockdep_depth - depth + 1;
 		hlock = curr->held_locks + depth - 1;
 
-		/*
-		 * Only non-recursive-read entries get new dependencies
-		 * added:
-		 */
-		if (hlock->read != 2 && hlock->check) {
+		if (hlock->check) {
 			int ret = check_prev_add(curr, hlock, next, distance, &trace, save_trace);
 			if (!ret)
 				return 0;
-- 
2.16.2

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC tip/locking/lockdep v6 13/20] lockdep/selftest: Add a R-L/L-W test case specific to chain cache behavior
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (11 preceding siblings ...)
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 12/20] lockdep: Add recursive read locks into dependency graph Boqun Feng
@ 2018-04-11 13:51 ` Boqun Feng
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 14/20] lockdep: Take read/write status in consideration when generate chainkey Boqun Feng
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

As our chain cache doesn't distinguish read/write locks, even though we
can detect a read-lock/lock-write deadlock in check_noncircular(), we
can still be fooled if a read-lock/lock-read case (which is not a
deadlock) comes first.

So introduce this test case to specifically test the chain cache
behavior on detecting recursive read lock related deadlocks.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 lib/locking-selftest.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index b5c1293ce147..700f9aa19db6 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -395,6 +395,49 @@ static void rwsem_ABBA1(void)
 	MU(Y1); // should fail
 }
 
+/*
+ * read_lock(A)
+ * spin_lock(B)
+ *		spin_lock(B)
+ *		write_lock(A)
+ *
+ * This test case is aimed at poking whether the chain cache prevents us from
+ * detecting a read-lock/lock-write deadlock: if the chain cache doesn't differ
+ * read/write locks, the following case may happen
+ *
+ * 	{ read_lock(A)->lock(B) dependency exists }
+ *
+ * 	P0:
+ * 	lock(B);
+ * 	read_lock(A);
+ *
+ *	{ Not a deadlock, B -> A is added in the chain cache }
+ *
+ *	P1:
+ *	lock(B);
+ *	write_lock(A);
+ *
+ *	{ B->A found in chain cache, not reported as a deadlock }
+ *
+ */
+static void rlock_chaincache_ABBA1(void)
+{
+	RL(X1);
+	L(Y1);
+	U(Y1);
+	RU(X1);
+
+	L(Y1);
+	RL(X1);
+	RU(X1);
+	U(Y1);
+
+	L(Y1);
+	WL(X1);
+	WU(X1);
+	U(Y1); // should fail
+}
+
 /*
  * read_lock(A)
  * spin_lock(B)
@@ -2055,6 +2098,10 @@ void locking_selftest(void)
 	pr_cont("             |");
 	dotest(rwsem_ABBA3, FAILURE, LOCKTYPE_RWSEM);
 
+	print_testname("chain cached mixed R-L/L-W ABBA");
+	pr_cont("             |");
+	dotest(rlock_chaincache_ABBA1, FAILURE, LOCKTYPE_RWLOCK);
+
 	printk("  --------------------------------------------------------------------------\n");
 
 	/*
-- 
2.16.2

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC tip/locking/lockdep v6 14/20] lockdep: Take read/write status in consideration when generate chainkey
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (12 preceding siblings ...)
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 13/20] lockdep/selftest: Add a R-L/L-W test case specific to chain cache behavior Boqun Feng
@ 2018-04-11 13:51 ` Boqun Feng
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 15/20] lockdep/selftest: Unleash irq_read_recursion2 and add more Boqun Feng
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

Currently, the chainkey of a lock chain is a hash sum of the class_idx
of all the held locks; the read/write status is not taken into
consideration while generating the chainkey. This could result in a
problem, if we have:

	P1()
	{
		read_lock(B);
		lock(A);
	}

	P2()
	{
		lock(A);
		read_lock(B);
	}

	P3()
	{
		lock(A);
		write_lock(B);
	}

and P1(), P2(), P3() run one by one. When running P2(), lockdep finds
that the lock chain A -> B is not a deadlock, so the chain is added to
the chain cache. Then, when running P3(), even though it is a deadlock,
we could miss it because of the chain cache hit. This can be confirmed
by the self test case "chain cached mixed R-L/L-W ABBA".

To resolve this, we use the concept of "hlock_id" to generate the
chainkey. The hlock_id is a tuple (hlock->class_idx, hlock->read), which
fits in a u16 type. With this, the chainkeys are different if the lock
sequences have the same locks but different read/write status.

Besides, since we use "hlock_id" to generate chainkeys, the chain_hlocks
array now stores the "hlock_id"s rather than lock_class indexes.
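
The effect of the new key can be illustrated with a small user-space
sketch. It is not the kernel code: the demo_* names are made up for
illustration, the bit widths (13 bits of class index, 2 bits of read
state) are assumptions taken from the BUILD_BUG_ON() in this patch, and
the mixing step is a simple FNV-1a-style stand-in rather than the real
iterate_chain_key().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed widths: 13 bits of class index, 2 bits of read state. */
#define DEMO_KEYS_BITS	13
#define DEMO_KEYS	(1u << DEMO_KEYS_BITS)

/* Hypothetical, trimmed-down stand-in for struct held_lock. */
struct demo_held_lock {
	unsigned int class_idx;	/* 1-based class index, below DEMO_KEYS */
	unsigned int read;	/* 0: write, 1: read, 2: recursive read */
};

/* Pack (class_idx, read) into 16 bits, as hlock_id() does. */
static uint16_t demo_hlock_id(const struct demo_held_lock *h)
{
	return (uint16_t)(h->class_idx | (h->read << DEMO_KEYS_BITS));
}

/* Recover the class index: keep only the low DEMO_KEYS_BITS bits. */
static unsigned int demo_class_idx(uint16_t id)
{
	return id & (DEMO_KEYS - 1);
}

/* Recover the read state from the bits above the class index. */
static unsigned int demo_read(uint16_t id)
{
	return (unsigned int)id >> DEMO_KEYS_BITS;
}

/* Hypothetical mixing step (FNV-1a style), NOT iterate_chain_key(). */
static uint64_t demo_iterate_chain_key(uint64_t key, uint16_t id)
{
	key ^= id;
	key *= 1099511628211ULL;
	return key;
}

/* Fold a lock sequence into a key, with or without the read state. */
static uint64_t demo_chain_key(const struct demo_held_lock *locks, int n,
			       int use_hlock_id)
{
	uint64_t key = 14695981039346656037ULL;
	int i;

	for (i = 0; i < n; i++) {
		uint16_t id = use_hlock_id ? demo_hlock_id(&locks[i])
					   : (uint16_t)locks[i].class_idx;

		key = demo_iterate_chain_key(key, id);
	}
	return key;
}

/* Returns 1 if the keys of P2 and P3 collide, 0 otherwise. */
int demo_keys_collide(int use_hlock_id)
{
	/* P2: lock(A); read_lock(B);   P3: lock(A); write_lock(B); */
	const struct demo_held_lock p2[] = { { 1, 0 }, { 2, 2 } };
	const struct demo_held_lock p3[] = { { 1, 0 }, { 2, 0 } };

	/* The packing must be lossless. */
	assert(demo_class_idx(demo_hlock_id(&p2[1])) == 2);
	assert(demo_read(demo_hlock_id(&p2[1])) == 2);

	return demo_chain_key(p2, 2, use_hlock_id) ==
	       demo_chain_key(p3, 2, use_hlock_id);
}
```

In this model, demo_keys_collide(0) reports a collision between P2 and
P3 (class indexes only), while demo_keys_collide(1) does not, which is
the property the chain cache needs.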

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 kernel/locking/lockdep.c | 56 +++++++++++++++++++++++++++++++-----------------
 1 file changed, 36 insertions(+), 20 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 45a370438add..d88fded8b339 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -307,6 +307,21 @@ static struct hlist_head classhash_table[CLASSHASH_SIZE];
 
 static struct hlist_head chainhash_table[CHAINHASH_SIZE];
 
+/*
+ * the id of held_lock
+ */
+static inline u16 hlock_id(struct held_lock *hlock)
+{
+	BUILD_BUG_ON(MAX_LOCKDEP_KEYS_BITS + 2 > 16);
+
+	return (hlock->class_idx | (hlock->read << MAX_LOCKDEP_KEYS_BITS));
+}
+
+static inline unsigned int chain_hlock_class_idx(u16 hlock_id)
+{
+	return hlock_id & (MAX_LOCKDEP_KEYS - 1);
+}
+
 /*
  * The hash key of the lock dependency chains is a hash itself too:
  * it's a hash of all locks taken up to that lock, including that lock.
@@ -2283,7 +2298,10 @@ static u16 chain_hlocks[MAX_LOCKDEP_CHAIN_HLOCKS];
 
 struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i)
 {
-	return lock_classes + chain_hlocks[chain->base + i];
+	u16 chain_hlock = chain_hlocks[chain->base + i];
+	unsigned int class_idx = chain_hlock_class_idx(chain_hlock);
+
+	return lock_classes + class_idx - 1;
 }
 
 /*
@@ -2309,12 +2327,12 @@ static inline int get_first_held_lock(struct task_struct *curr,
 /*
  * Returns the next chain_key iteration
  */
-static u64 print_chain_key_iteration(int class_idx, u64 chain_key)
+static u64 print_chain_key_iteration(u16 hlock_id, u64 chain_key)
 {
-	u64 new_chain_key = iterate_chain_key(chain_key, class_idx);
+	u64 new_chain_key = iterate_chain_key(chain_key, hlock_id);
 
-	printk(" class_idx:%d -> chain_key:%016Lx",
-		class_idx,
+	printk(" hlock_id:%d -> chain_key:%016Lx",
+		(unsigned int)hlock_id,
 		(unsigned long long)new_chain_key);
 	return new_chain_key;
 }
@@ -2330,12 +2348,12 @@ print_chain_keys_held_locks(struct task_struct *curr, struct held_lock *hlock_ne
 	printk("depth: %u\n", depth + 1);
 	for (i = get_first_held_lock(curr, hlock_next); i < depth; i++) {
 		hlock = curr->held_locks + i;
-		chain_key = print_chain_key_iteration(hlock->class_idx, chain_key);
+		chain_key = print_chain_key_iteration(hlock_id(hlock), chain_key);
 
 		print_lock(hlock);
 	}
 
-	print_chain_key_iteration(hlock_next->class_idx, chain_key);
+	print_chain_key_iteration(hlock_id(hlock_next), chain_key);
 	print_lock(hlock_next);
 }
 
@@ -2343,14 +2361,14 @@ static void print_chain_keys_chain(struct lock_chain *chain)
 {
 	int i;
 	u64 chain_key = 0;
-	int class_id;
+	u16 hlock_id;
 
 	printk("depth: %u\n", chain->depth);
 	for (i = 0; i < chain->depth; i++) {
-		class_id = chain_hlocks[chain->base + i];
-		chain_key = print_chain_key_iteration(class_id + 1, chain_key);
+		hlock_id = chain_hlocks[chain->base + i];
+		chain_key = print_chain_key_iteration(hlock_id, chain_key);
 
-		print_lock_name(lock_classes + class_id);
+		print_lock_name(lock_classes + chain_hlock_class_idx(hlock_id) - 1);
 		printk("\n");
 	}
 }
@@ -2399,7 +2417,7 @@ static int check_no_collision(struct task_struct *curr,
 	}
 
 	for (j = 0; j < chain->depth - 1; j++, i++) {
-		id = curr->held_locks[i].class_idx - 1;
+		id = hlock_id(&curr->held_locks[i]);
 
 		if (DEBUG_LOCKS_WARN_ON(chain_hlocks[chain->base + j] != id)) {
 			print_collision(curr, hlock, chain);
@@ -2456,8 +2474,8 @@ static inline int add_chain_cache_classes(unsigned int prev,
 	if (likely(nr_chain_hlocks + chain->depth <= MAX_LOCKDEP_CHAIN_HLOCKS)) {
 		chain->base = nr_chain_hlocks;
 		nr_chain_hlocks += chain->depth;
-		chain_hlocks[chain->base] = prev - 1;
-		chain_hlocks[chain->base + 1] = next -1;
+		chain_hlocks[chain->base] = prev;
+		chain_hlocks[chain->base + 1] = next;
 	}
 #ifdef CONFIG_DEBUG_LOCKDEP
 	/*
@@ -2491,7 +2509,6 @@ static inline int add_chain_cache(struct task_struct *curr,
 				  struct held_lock *hlock,
 				  u64 chain_key)
 {
-	struct lock_class *class = hlock_class(hlock);
 	struct hlist_head *hash_head = chainhashentry(chain_key);
 	struct lock_chain *chain;
 	int i, j;
@@ -2530,10 +2547,9 @@ static inline int add_chain_cache(struct task_struct *curr,
 	if (likely(nr_chain_hlocks + chain->depth <= MAX_LOCKDEP_CHAIN_HLOCKS)) {
 		chain->base = nr_chain_hlocks;
 		for (j = 0; j < chain->depth - 1; j++, i++) {
-			int lock_id = curr->held_locks[i].class_idx - 1;
-			chain_hlocks[chain->base + j] = lock_id;
+			chain_hlocks[chain->base + j] = hlock_id(&curr->held_locks[i]);
 		}
-		chain_hlocks[chain->base + j] = class - lock_classes;
+		chain_hlocks[chain->base + j] = hlock_id(hlock);
 	}
 
 	if (nr_chain_hlocks < MAX_LOCKDEP_CHAIN_HLOCKS)
@@ -2731,7 +2747,7 @@ static void check_chain_key(struct task_struct *curr)
 		if (prev_hlock && (prev_hlock->irq_context !=
 							hlock->irq_context))
 			chain_key = 0;
-		chain_key = iterate_chain_key(chain_key, hlock->class_idx);
+		chain_key = iterate_chain_key(chain_key, hlock_id(hlock));
 		prev_hlock = hlock;
 	}
 	if (chain_key != curr->curr_chain_key) {
@@ -3672,7 +3688,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 		chain_key = 0;
 		chain_head = 1;
 	}
-	chain_key = iterate_chain_key(chain_key, class_idx);
+	chain_key = iterate_chain_key(chain_key, hlock_id(hlock));
 
 	if (nest_lock && !__lock_is_held(nest_lock, -1))
 		return print_lock_nested_lock_not_held(curr, hlock, ip);
-- 
2.16.2

* [RFC tip/locking/lockdep v6 15/20] lockdep/selftest: Unleash irq_read_recursion2 and add more
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (13 preceding siblings ...)
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 14/20] lockdep: Take read/write status in consideration when generate chainkey Boqun Feng
@ 2018-04-11 13:51 ` Boqun Feng
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 16/20] lockdep/selftest: Add more recursive read related test cases Boqun Feng
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

Now that we can handle recursive read related irq inversion deadlocks
correctly, uncomment irq_read_recursion2 and add more test cases.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 lib/locking-selftest.c | 59 ++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 47 insertions(+), 12 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 700f9aa19db6..c2f06b423da8 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -1052,20 +1052,28 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_inversion_soft_wlock)
 #define E3()				\
 					\
 	IRQ_ENTER();			\
-	RL(A);				\
+	LOCK(A);			\
 	L(B);				\
 	U(B);				\
-	RU(A);				\
+	UNLOCK(A);			\
 	IRQ_EXIT();
 
 /*
- * Generate 12 testcases:
+ * Generate 24 testcases:
  */
 #include "locking-selftest-hardirq.h"
-GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_hard)
+#include "locking-selftest-rlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_hard_rlock)
+
+#include "locking-selftest-wlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_hard_wlock)
 
 #include "locking-selftest-softirq.h"
-GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
+#include "locking-selftest-rlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft_rlock)
+
+#include "locking-selftest-wlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft_wlock)
 
 #undef E1
 #undef E2
@@ -1079,8 +1087,8 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 					\
 	IRQ_DISABLE();			\
 	L(B);				\
-	WL(A);				\
-	WU(A);				\
+	LOCK(A);			\
+	UNLOCK(A);			\
 	U(B);				\
 	IRQ_ENABLE();
 
@@ -1097,13 +1105,21 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 	IRQ_EXIT();
 
 /*
- * Generate 12 testcases:
+ * Generate 24 testcases:
  */
 #include "locking-selftest-hardirq.h"
-// GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_hard)
+#include "locking-selftest-rlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_hard_rlock)
+
+#include "locking-selftest-wlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_hard_wlock)
 
 #include "locking-selftest-softirq.h"
-// GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_soft)
+#include "locking-selftest-rlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_soft_rlock)
+
+#include "locking-selftest-wlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_soft_wlock)
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # define I_SPINLOCK(x)	lockdep_reset_lock(&lock_##x.dep_map)
@@ -1256,6 +1272,25 @@ static inline void print_testname(const char *testname)
 	dotest(name##_rlock_##nr, SUCCESS, LOCKTYPE_RWLOCK);	\
 	pr_cont("\n");
 
+#define DO_TESTCASE_2RW(desc, name, nr)				\
+	print_testname(desc"/"#nr);				\
+	pr_cont("      |");					\
+	dotest(name##_wlock_##nr, FAILURE, LOCKTYPE_RWLOCK);	\
+	dotest(name##_rlock_##nr, SUCCESS, LOCKTYPE_RWLOCK);	\
+	pr_cont("\n");
+
+#define DO_TESTCASE_2x2RW(desc, name, nr)			\
+	DO_TESTCASE_2RW("hard-"desc, name##_hard, nr)		\
+	DO_TESTCASE_2RW("soft-"desc, name##_soft, nr)		\
+
+#define DO_TESTCASE_6x2x2RW(desc, name)				\
+	DO_TESTCASE_2x2RW(desc, name, 123);			\
+	DO_TESTCASE_2x2RW(desc, name, 132);			\
+	DO_TESTCASE_2x2RW(desc, name, 213);			\
+	DO_TESTCASE_2x2RW(desc, name, 231);			\
+	DO_TESTCASE_2x2RW(desc, name, 312);			\
+	DO_TESTCASE_2x2RW(desc, name, 321);
+
 #define DO_TESTCASE_6(desc, name)				\
 	print_testname(desc);					\
 	dotest(name##_spin, FAILURE, LOCKTYPE_SPIN);		\
@@ -2114,8 +2149,8 @@ void locking_selftest(void)
 	DO_TESTCASE_6x6("safe-A + unsafe-B #2", irqsafe4);
 	DO_TESTCASE_6x6RW("irq lock-inversion", irq_inversion);
 
-	DO_TESTCASE_6x2("irq read-recursion", irq_read_recursion);
-//	DO_TESTCASE_6x2B("irq read-recursion #2", irq_read_recursion2);
+	DO_TESTCASE_6x2x2RW("irq read-recursion", irq_read_recursion);
+	DO_TESTCASE_6x2x2RW("irq read-recursion #2", irq_read_recursion2);
 
 	ww_tests();
 
-- 
2.16.2

* [RFC tip/locking/lockdep v6 16/20] lockdep/selftest: Add more recursive read related test cases
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (14 preceding siblings ...)
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 15/20] lockdep/selftest: Unleash irq_read_recursion2 and add more Boqun Feng
@ 2018-04-11 13:51 ` Boqun Feng
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 17/20] Revert "locking/lockdep/selftests: Fix mixed read-write ABBA tests" Boqun Feng
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

Add the following four test cases:

1.	X --(NR)--> Y --(NR)--> Z --(NR)--> X is a deadlock.

2.	X --(NN)--> Y --(RR)--> Z --(NR)--> X is a deadlock.

3.	X --(NN)--> Y --(RR)--> Z --(RN)--> X is not a deadlock.

4.	X --(NR)--> Y --(RR)--> Z --(NN)--> X is not a deadlock.

These self test cases are valuable for developing recursive read
related deadlock detection.
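
The expected results of the four cases above can be checked with a
small user-space sketch. It assumes the rule described in patch #1 of
this series, as I read it: in "-(ab)->", 'a' says how the first lock is
held and 'b' how the second one is acquired ('N' non-recursive, 'R'
recursive read), and a closed path is only a deadlock if no -(*R)-> is
immediately followed by -(R*)->. The demo_* names are hypothetical and
this is not how lockdep implements the check.

```c
#include <stddef.h>

/* One dependency edge "-(ab)->": 'N' non-recursive, 'R' recursive read. */
struct demo_dep {
	char from;	/* how the first lock of the edge is held */
	char to;	/* how the second lock of the edge is acquired */
};

/*
 * A closed path is "strong" if no edge ending in 'R' is directly
 * followed by an edge starting with 'R'; the last edge wraps around
 * to the first one. Returns 1 for strong, 0 otherwise.
 */
static int demo_cycle_is_strong(const struct demo_dep *deps, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const struct demo_dep *cur = &deps[i];
		const struct demo_dep *next = &deps[(i + 1) % n];

		if (cur->to == 'R' && next->from == 'R')
			return 0;
	}
	return 1;
}

/* Cases 1-4 from the list above; returns 1 (deadlock), 0, or -1. */
int demo_case_is_deadlock(int nr)
{
	static const struct demo_dep cases[4][3] = {
		{ { 'N', 'R' }, { 'N', 'R' }, { 'N', 'R' } },	/* case 1 */
		{ { 'N', 'N' }, { 'R', 'R' }, { 'N', 'R' } },	/* case 2 */
		{ { 'N', 'N' }, { 'R', 'R' }, { 'R', 'N' } },	/* case 3 */
		{ { 'N', 'R' }, { 'R', 'R' }, { 'N', 'N' } },	/* case 4 */
	};

	if (nr < 1 || nr > 4)
		return -1;
	return demo_cycle_is_strong(cases[nr - 1], 3);
}
```

Under that assumption, cases 1 and 2 come out as strong cycles
(deadlocks), while cases 3 and 4 each contain an R-to-R junction and
do not.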

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 lib/locking-selftest.c | 161 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 161 insertions(+)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index c2f06b423da8..6b7a28d84fc4 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -1033,6 +1033,133 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_inversion_soft_wlock)
 #undef E2
 #undef E3
 
+/*
+ * write-read / write-read / write-read deadlock even if read is recursive
+ */
+
+#define E1()				\
+					\
+	WL(X1);				\
+	RL(Y1);				\
+	RU(Y1);				\
+	WU(X1);
+
+#define E2()				\
+					\
+	WL(Y1);				\
+	RL(Z1);				\
+	RU(Z1);				\
+	WU(Y1);
+
+#define E3()				\
+					\
+	WL(Z1);				\
+	RL(X1);				\
+	RU(X1);				\
+	WU(Z1);
+
+#include "locking-selftest-rlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(W1R2_W2R3_W3R1)
+
+#undef E1
+#undef E2
+#undef E3
+
+/*
+ * write-write / read-read / write-read deadlock even if read is recursive
+ */
+
+#define E1()				\
+					\
+	WL(X1);				\
+	WL(Y1);				\
+	WU(Y1);				\
+	WU(X1);
+
+#define E2()				\
+					\
+	RL(Y1);				\
+	RL(Z1);				\
+	RU(Z1);				\
+	RU(Y1);
+
+#define E3()				\
+					\
+	WL(Z1);				\
+	RL(X1);				\
+	RU(X1);				\
+	WU(Z1);
+
+#include "locking-selftest-rlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(W1W2_R2R3_W3R1)
+
+#undef E1
+#undef E2
+#undef E3
+
+/*
+ * write-write / read-read / read-write is not deadlock when read is recursive
+ */
+
+#define E1()				\
+					\
+	WL(X1);				\
+	WL(Y1);				\
+	WU(Y1);				\
+	WU(X1);
+
+#define E2()				\
+					\
+	RL(Y1);				\
+	RL(Z1);				\
+	RU(Z1);				\
+	RU(Y1);
+
+#define E3()				\
+					\
+	RL(Z1);				\
+	WL(X1);				\
+	WU(X1);				\
+	RU(Z1);
+
+#include "locking-selftest-rlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(W1R2_R2R3_W3W1)
+
+#undef E1
+#undef E2
+#undef E3
+
+/*
+ * write-read / read-read / write-write is not deadlock when read is recursive
+ */
+
+#define E1()				\
+					\
+	WL(X1);				\
+	RL(Y1);				\
+	RU(Y1);				\
+	WU(X1);
+
+#define E2()				\
+					\
+	RL(Y1);				\
+	RL(Z1);				\
+	RU(Z1);				\
+	RU(Y1);
+
+#define E3()				\
+					\
+	WL(Z1);				\
+	WL(X1);				\
+	WU(X1);				\
+	WU(Z1);
+
+#include "locking-selftest-rlock.h"
+GENERATE_PERMUTATIONS_3_EVENTS(W1W2_R2R3_R3W1)
+
+#undef E1
+#undef E2
+#undef E3
 /*
  * read-lock / write-lock recursion that is actually safe.
  */
@@ -1258,6 +1385,19 @@ static inline void print_testname(const char *testname)
 	dotest(name##_##nr, FAILURE, LOCKTYPE_RWLOCK);		\
 	pr_cont("\n");
 
+#define DO_TESTCASE_1RR(desc, name, nr)				\
+	print_testname(desc"/"#nr);				\
+	pr_cont("             |");				\
+	dotest(name##_##nr, SUCCESS, LOCKTYPE_RWLOCK);		\
+	pr_cont("\n");
+
+#define DO_TESTCASE_1RRB(desc, name, nr)			\
+	print_testname(desc"/"#nr);				\
+	pr_cont("             |");				\
+	dotest(name##_##nr, FAILURE, LOCKTYPE_RWLOCK);		\
+	pr_cont("\n");
+
+
 #define DO_TESTCASE_3(desc, name, nr)				\
 	print_testname(desc"/"#nr);				\
 	dotest(name##_spin_##nr, FAILURE, LOCKTYPE_SPIN);	\
@@ -1367,6 +1507,22 @@ static inline void print_testname(const char *testname)
 	DO_TESTCASE_2IB(desc, name, 312);			\
 	DO_TESTCASE_2IB(desc, name, 321);
 
+#define DO_TESTCASE_6x1RR(desc, name)				\
+	DO_TESTCASE_1RR(desc, name, 123);			\
+	DO_TESTCASE_1RR(desc, name, 132);			\
+	DO_TESTCASE_1RR(desc, name, 213);			\
+	DO_TESTCASE_1RR(desc, name, 231);			\
+	DO_TESTCASE_1RR(desc, name, 312);			\
+	DO_TESTCASE_1RR(desc, name, 321);
+
+#define DO_TESTCASE_6x1RRB(desc, name)				\
+	DO_TESTCASE_1RRB(desc, name, 123);			\
+	DO_TESTCASE_1RRB(desc, name, 132);			\
+	DO_TESTCASE_1RRB(desc, name, 213);			\
+	DO_TESTCASE_1RRB(desc, name, 231);			\
+	DO_TESTCASE_1RRB(desc, name, 312);			\
+	DO_TESTCASE_1RRB(desc, name, 321);
+
 #define DO_TESTCASE_6x6(desc, name)				\
 	DO_TESTCASE_6I(desc, name, 123);			\
 	DO_TESTCASE_6I(desc, name, 132);			\
@@ -2137,6 +2293,11 @@ void locking_selftest(void)
 	pr_cont("             |");
 	dotest(rlock_chaincache_ABBA1, FAILURE, LOCKTYPE_RWLOCK);
 
+	DO_TESTCASE_6x1RRB("rlock W1R2/W2R3/W3R1", W1R2_W2R3_W3R1);
+	DO_TESTCASE_6x1RRB("rlock W1W2/R2R3/W3R1", W1W2_R2R3_W3R1);
+	DO_TESTCASE_6x1RR("rlock W1W2/R2R3/R3W1", W1W2_R2R3_R3W1);
+	DO_TESTCASE_6x1RR("rlock W1R2/R2R3/W3W1", W1R2_R2R3_W3W1);
+
 	printk("  --------------------------------------------------------------------------\n");
 
 	/*
-- 
2.16.2

* [RFC tip/locking/lockdep v6 17/20] Revert "locking/lockdep/selftests: Fix mixed read-write ABBA tests"
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (15 preceding siblings ...)
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 16/20] lockdep/selftest: Add more recursive read related test cases Boqun Feng
@ 2018-04-11 13:51 ` Boqun Feng
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 18/20] MAINTAINERS: Add myself as a LOCKING PRIMITIVES reviewer Boqun Feng
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

This reverts commit d82fed75294229abc9d757f08a4817febae6c4f4.

Since we can now handle mixed read-write deadlock detection properly,
the self test reports the deadlock as expected, so this work-around is
no longer needed.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 lib/locking-selftest.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 6b7a28d84fc4..79270288fa28 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -2266,14 +2266,6 @@ void locking_selftest(void)
 	print_testname("mixed read-lock/lock-write ABBA");
 	pr_cont("             |");
 	dotest(rlock_ABBA1, FAILURE, LOCKTYPE_RWLOCK);
-#ifdef CONFIG_PROVE_LOCKING
-	/*
-	 * Lockdep does indeed fail here, but there's nothing we can do about
-	 * that now.  Don't kill lockdep for it.
-	 */
-	unexpected_testcase_failures--;
-#endif
-
 	pr_cont("             |");
 	dotest(rwsem_ABBA1, FAILURE, LOCKTYPE_RWSEM);
 
-- 
2.16.2

* [RFC tip/locking/lockdep v6 18/20] MAINTAINERS: Add myself as a LOCKING PRIMITIVES reviewer
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (16 preceding siblings ...)
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 17/20] Revert "locking/lockdep/selftests: Fix mixed read-write ABBA tests" Boqun Feng
@ 2018-04-11 13:51 ` Boqun Feng
  2018-04-11 13:56 ` [RFC tip/locking/lockdep v6 19/20] rcu: Equip sleepable RCU with lockdep dependency graph checks Boqun Feng
  2018-04-11 13:57 ` [RFC tip/locking/lockdep v6 20/20] lockdep/selftest: Add a test case for SRCU Boqun Feng
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney,
	Boqun Feng, David S. Miller, Mauro Carvalho Chehab,
	Greg Kroah-Hartman, Linus Walleij, Andrew Morton, Randy Dunlap

The recursive read lock detection work touches most core parts of
lockdep, so add myself as a dedicated reviewer. This helps people find
me if any of my code introduces problems or misunderstandings, or if
they need help parsing logs related to recursive read locks.

Besides, I'd be glad to help with any lock-related code.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6e950b8b4a41..255e5f25df54 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8243,6 +8243,7 @@ F:	Documentation/admin-guide/LSM/LoadPin.rst
 LOCKING PRIMITIVES
 M:	Peter Zijlstra <peterz@infradead.org>
 M:	Ingo Molnar <mingo@redhat.com>
+R:	Boqun Feng <boqun.feng@gmail.com>
 L:	linux-kernel@vger.kernel.org
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core
 S:	Maintained
-- 
2.16.2

* [RFC tip/locking/lockdep v6 19/20] rcu: Equip sleepable RCU with lockdep dependency graph checks
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (17 preceding siblings ...)
  2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 18/20] MAINTAINERS: Add myself as a LOCKING PRIMITIVES reviewer Boqun Feng
@ 2018-04-11 13:56 ` Boqun Feng
  2018-04-11 18:57   ` Paul E. McKenney
  2018-04-11 13:57 ` [RFC tip/locking/lockdep v6 20/20] lockdep/selftest: Add a test case for SRCU Boqun Feng
  19 siblings, 1 reply; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney,
	Boqun Feng, Lai Jiangshan, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers

Although all flavors of RCU are annotated correctly with lockdep
annotations as recursive read locks, the 'check' parameter for their
calls to lock_acquire() is unset, which means RCU read locks are not
added into the lockdep dependency graph. This is fine for all flavors
except sleepable RCU, because the deadlock scenarios for them are
simple: calling synchronize_rcu() and its friends inside their read-side
critical sections. But for sleepable RCU, as there may be multiple
instances with multiple classes, there are more deadlock cases.
Consider the following:

	TASK 1				TASK 2
	=======				========
	i = srcu_read_lock(&sa);	i = srcu_read_lock(&sb);
	synchronize_srcu(&sb);		synchronize_srcu(&sa);
	srcu_read_unlock(&sa, i);	srcu_read_unlock(&sb, i);

Neither TASK 1 nor TASK 2 can leave its read-side critical section,
because each is waiting for the other at synchronize_srcu().

With the new improvement for lockdep, which allows us to detect
deadlocks for recursive read locks, we can actually detect this. What we
need to do is simply: a) mark srcu_read_{,un}lock() as 'check'
lock_acquire() and b) annotate synchronize_srcu() as an empty
grab-and-drop for a write lock (because synchronize_srcu() will wait for
previous srcu_read_lock() to release, and won't block the next
srcu_read_lock(), just like an empty write lock section).

This patch adds those annotations to allow us to check for deadlocks
related to sleepable RCU with lockdep.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 include/linux/srcu.h  | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/rcu/srcutiny.c |  2 ++
 kernel/rcu/srcutree.c |  2 ++
 3 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 33c1c698df09..23f397bd192c 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -99,6 +99,49 @@ static inline int srcu_read_lock_held(const struct srcu_struct *sp)
 	return lock_is_held(&sp->dep_map);
 }
 
+/**
+ * lockdep annotations for srcu_read_{un,}lock, and synchronize_srcu():
+ *
+ * srcu_read_lock() and srcu_read_unlock() are similar to rcu_read_lock() and
+ * rcu_read_unlock(): they are recursive read locks. But because we mark them
+ * as "check", they will be added into the lockdep dependency graph for
+ * deadlock detection. We also annotate synchronize_srcu() as a
+ * write_lock()+write_unlock(), because synchronize_srcu() will wait for any
+ * corresponding previous srcu_read_lock() to release, and that acts like an
+ * empty grab-and-drop write lock.
+ *
+ * We do so because multiple sleepable RCU instances may cause a deadlock as
+ * follows:
+ *
+ *   Task 1:
+ *     ia = srcu_read_lock(&srcu_A);
+ *     synchronize_srcu(&srcu_B);
+ *     srcu_read_unlock(&srcu_A, ia);
+ *
+ *   Task 2:
+ *     ib = srcu_read_lock(&srcu_B);
+ *     synchronize_srcu(&srcu_A);
+ *     srcu_read_unlock(&srcu_B, ib);
+ *
+ * And we want lockdep to detect this or more complicated deadlock with SRCU
+ * involved.
+ */
+static inline void srcu_lock_acquire(struct lockdep_map *map)
+{
+	lock_map_acquire_read(map);
+}
+
+static inline void srcu_lock_release(struct lockdep_map *map)
+{
+	lock_map_release(map);
+}
+
+static inline void srcu_lock_sync(struct lockdep_map *map)
+{
+	lock_map_acquire(map);
+	lock_map_release(map);
+}
+
 #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
 static inline int srcu_read_lock_held(const struct srcu_struct *sp)
@@ -106,6 +149,10 @@ static inline int srcu_read_lock_held(const struct srcu_struct *sp)
 	return 1;
 }
 
+#define srcu_lock_acquire(m)	do { } while (0)
+#define srcu_lock_release(m)	do { } while (0)
+#define srcu_lock_sync(m)	do { } while (0)
+
 #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
 /**
@@ -157,7 +204,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 	int retval;
 
 	retval = __srcu_read_lock(sp);
-	rcu_lock_acquire(&(sp)->dep_map);
+	srcu_lock_acquire(&(sp)->dep_map);
 	return retval;
 }
 
@@ -171,7 +218,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
 	__releases(sp)
 {
-	rcu_lock_release(&(sp)->dep_map);
+	srcu_lock_release(&(sp)->dep_map);
 	__srcu_read_unlock(sp, idx);
 }
 
diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
index 76ac5f50b2c7..bc89cb48d800 100644
--- a/kernel/rcu/srcutiny.c
+++ b/kernel/rcu/srcutiny.c
@@ -188,6 +188,8 @@ void synchronize_srcu(struct srcu_struct *sp)
 {
 	struct rcu_synchronize rs;
 
+	srcu_lock_sync(&sp->dep_map);
+
 	init_rcu_head_on_stack(&rs.head);
 	init_completion(&rs.completion);
 	call_srcu(sp, &rs.head, wakeme_after_rcu);
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index d5cea81378cc..e2628e9275b9 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -997,6 +997,8 @@ EXPORT_SYMBOL_GPL(synchronize_srcu_expedited);
  */
 void synchronize_srcu(struct srcu_struct *sp)
 {
+	srcu_lock_sync(&sp->dep_map);
+
 	if (srcu_might_be_idle(sp) || rcu_gp_is_expedited())
 		synchronize_srcu_expedited(sp);
 	else
-- 
2.16.2

* [RFC tip/locking/lockdep v6 20/20] lockdep/selftest: Add a test case for SRCU
  2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
                   ` (18 preceding siblings ...)
  2018-04-11 13:56 ` [RFC tip/locking/lockdep v6 19/20] rcu: Equip sleepable RCU with lockdep dependency graph checks Boqun Feng
@ 2018-04-11 13:57 ` Boqun Feng
  19 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-11 13:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney, Boqun Feng

Since we can now detect deadlock cases for sleepable RCU, add a self
test case. More complex scenarios may be added to srcu_tests() later.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 lib/locking-selftest.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 79270288fa28..5756ca1827e2 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -49,6 +49,7 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
 #define LOCKTYPE_RWSEM	0x8
 #define LOCKTYPE_WW	0x10
 #define LOCKTYPE_RTMUTEX 0x20
+#define LOCKTYPE_SRCU	0x40
 
 static struct ww_acquire_ctx t, t2;
 static struct ww_mutex o, o2, o3;
@@ -86,6 +87,11 @@ static DEFINE_RT_MUTEX(rtmutex_D);
 
 #endif
 
+#ifdef CONFIG_SRCU
+static struct srcu_struct srcu_A;
+static struct srcu_struct srcu_B;
+#endif
+
 /*
  * Locks that we initialize dynamically as well so that
  * e.g. X1 and X2 becomes two instances of the same class,
@@ -163,6 +169,11 @@ static void init_shared_classes(void)
 	__rt_mutex_init(&rtmutex_Z2, __func__, &rt_Z);
 #endif
 
+#ifdef CONFIG_SRCU
+	init_srcu_struct(&srcu_A);
+	init_srcu_struct(&srcu_B);
+#endif
+
 	init_class_X(&lock_X1, &rwlock_X1, &mutex_X1, &rwsem_X1);
 	init_class_X(&lock_X2, &rwlock_X2, &mutex_X2, &rwsem_X2);
 
@@ -2200,6 +2211,30 @@ static void ww_tests(void)
 	pr_cont("\n");
 }
 
+static void srcu_ABBA(void)
+{
+	int ia, ib;
+
+	ia = srcu_read_lock(&srcu_A);
+	synchronize_srcu(&srcu_B);
+	srcu_read_unlock(&srcu_A, ia);
+
+	ib = srcu_read_lock(&srcu_B);
+	synchronize_srcu(&srcu_A);
+	srcu_read_unlock(&srcu_B, ib); // should fail
+}
+
+static void srcu_tests(void)
+{
+	printk("  --------------------------------------------------------------------------\n");
+	printk("  | SRCU tests |\n");
+	printk("  ---------------\n");
+	print_testname("ABBA read-sync/read-sync");
+	dotest(srcu_ABBA, FAILURE, LOCKTYPE_SRCU);
+	pr_cont("\n");
+}
+
+
 void locking_selftest(void)
 {
 	/*
@@ -2306,6 +2341,7 @@ void locking_selftest(void)
 	DO_TESTCASE_6x2x2RW("irq read-recursion #2", irq_read_recursion2);
 
 	ww_tests();
+	srcu_tests();
 
 	if (unexpected_testcase_failures) {
 		printk("-----------------------------------------------------------------\n");
-- 
2.16.2

* Re: [RFC tip/locking/lockdep v6 19/20] rcu: Equip sleepable RCU with lockdep dependency graph checks
  2018-04-11 13:56 ` [RFC tip/locking/lockdep v6 19/20] rcu: Equip sleepable RCU with lockdep dependency graph checks Boqun Feng
@ 2018-04-11 18:57   ` Paul E. McKenney
  2018-04-12  2:12     ` Boqun Feng
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2018-04-11 18:57 UTC (permalink / raw)
  To: Boqun Feng
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Andrea Parri,
	Lai Jiangshan, Josh Triplett, Steven Rostedt, Mathieu Desnoyers

On Wed, Apr 11, 2018 at 09:56:44PM +0800, Boqun Feng wrote:
> Although all flavors of RCU are annotated correctly with lockdep
> annotations as recursive read locks, the 'check' parameter for their
> calls to lock_acquire() is unset, which means RCU read locks are not
> added into the lockdep dependency graph. This is fine for all flavors
> except sleepable RCU, because the deadlock scenarios for them are
> simple: calling synchronize_rcu() and its friends inside their read-side
> critical sections. But for sleepable RCU, as there may be multiple
> instances with multiple classes, there are more deadlock cases.
> Consider the following:
> 
> 	TASK 1				TASK 2
> 	=======				========
> 	i = srcu_read_lock(&sa);	i = srcu_read_lock(&sb);
> 	synchronize_srcu(&sb);		synchronize_srcu(&sa);
> 	srcu_read_unlock(&sa, i);	srcu_read_unlock(&sb, i);
> 
> Neither TASK 1 nor TASK 2 can leave its read-side critical section,
> because each is waiting for the other at synchronize_srcu().
> 
> With the new improvement for lockdep, which allows us to detect
> deadlocks for recursive read locks, we can actually detect this. What we
> need to do is simply: a) mark srcu_read_{,un}lock() as 'check'
> lock_acquire() and b) annotate synchronize_srcu() as an empty
> grab-and-drop for a write lock (because synchronize_srcu() will wait for
> previous srcu_read_lock() to release, and won't block the next
> srcu_read_lock(), just like an empty write lock section).
> 
> This patch adds those annotations to allow us to check for deadlocks
> related to sleepable RCU with lockdep.
> 
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>

Very cool!

One question though...  Won't this report a false-positive self-deadlock if
srcu_read_lock() is invoked from an interrupt handler?

							Thanx, Paul

> ---
>  include/linux/srcu.h  | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
>  kernel/rcu/srcutiny.c |  2 ++
>  kernel/rcu/srcutree.c |  2 ++
>  3 files changed, 53 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> index 33c1c698df09..23f397bd192c 100644
> --- a/include/linux/srcu.h
> +++ b/include/linux/srcu.h
> @@ -99,6 +99,49 @@ static inline int srcu_read_lock_held(const struct srcu_struct *sp)
>  	return lock_is_held(&sp->dep_map);
>  }
> 
> +/**
> + * lockdep annotations for srcu_read_{un,}lock, and synchronize_srcu():
> + *
> + * srcu_read_lock() and srcu_read_unlock() are similar to rcu_read_lock() and
> + * rcu_read_unlock(), they are recursive read locks. But we mark them as
> + * "check", they will be added into lockdep dependency graph for deadlock
> + * detection. And we also annotate synchronize_srcu() as a
> + * write_lock()+write_unlock(), because synchronize_srcu() will wait for any
> + * corresponding previous srcu_read_lock() to release, and that acts like a
> + * empty grab-and-drop write lock.
> + *
> + * We do so because multiple sleepable rcu instances may cause deadlock as
> + * follow:
> + *
> + *   Task 1:
> + *     ia = srcu_read_lock(&srcu_A);
> + *     synchronize_srcu(&srcu_B);
> + *     srcu_read_unlock(&srcu_A, ia);
> + *
> + *   Task 2:
> + *     ib = srcu_read_lock(&srcu_B);
> + *     synchronize_srcu(&srcu_A);
> + *     srcu_read_unlock(&srcu_B, ib);
> + *
> + * And we want lockdep to detect this or more complicated deadlock with SRCU
> + * involved.
> + */
> +static inline void srcu_lock_acquire(struct lockdep_map *map)
> +{
> +	lock_map_acquire_read(map);
> +}
> +
> +static inline void srcu_lock_release(struct lockdep_map *map)
> +{
> +	lock_map_release(map);
> +}
> +
> +static inline void srcu_lock_sync(struct lockdep_map *map)
> +{
> +	lock_map_acquire(map);
> +	lock_map_release(map);
> +}
> +
>  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> 
>  static inline int srcu_read_lock_held(const struct srcu_struct *sp)
> @@ -106,6 +149,10 @@ static inline int srcu_read_lock_held(const struct srcu_struct *sp)
>  	return 1;
>  }
> 
> +#define srcu_lock_acquire(m)	do { } while (0)
> +#define srcu_lock_release(m)	do { } while (0)
> +#define srcu_lock_sync(m)	do { } while (0)
> +
>  #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> 
>  /**
> @@ -157,7 +204,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
>  	int retval;
> 
>  	retval = __srcu_read_lock(sp);
> -	rcu_lock_acquire(&(sp)->dep_map);
> +	srcu_lock_acquire(&(sp)->dep_map);
>  	return retval;
>  }
> 
> @@ -171,7 +218,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
>  static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
>  	__releases(sp)
>  {
> -	rcu_lock_release(&(sp)->dep_map);
> +	srcu_lock_release(&(sp)->dep_map);
>  	__srcu_read_unlock(sp, idx);
>  }
> 
> diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
> index 76ac5f50b2c7..bc89cb48d800 100644
> --- a/kernel/rcu/srcutiny.c
> +++ b/kernel/rcu/srcutiny.c
> @@ -188,6 +188,8 @@ void synchronize_srcu(struct srcu_struct *sp)
>  {
>  	struct rcu_synchronize rs;
> 
> +	srcu_lock_sync(&sp->dep_map);
> +
>  	init_rcu_head_on_stack(&rs.head);
>  	init_completion(&rs.completion);
>  	call_srcu(sp, &rs.head, wakeme_after_rcu);
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index d5cea81378cc..e2628e9275b9 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -997,6 +997,8 @@ EXPORT_SYMBOL_GPL(synchronize_srcu_expedited);
>   */
>  void synchronize_srcu(struct srcu_struct *sp)
>  {
> +	srcu_lock_sync(&sp->dep_map);
> +
>  	if (srcu_might_be_idle(sp) || rcu_gp_is_expedited())
>  		synchronize_srcu_expedited(sp);
>  	else
> -- 
> 2.16.2
> 

* Re: [RFC tip/locking/lockdep v6 19/20] rcu: Equip sleepable RCU with lockdep dependency graph checks
  2018-04-11 18:57   ` Paul E. McKenney
@ 2018-04-12  2:12     ` Boqun Feng
  2018-04-12  9:12       ` Peter Zijlstra
  0 siblings, 1 reply; 28+ messages in thread
From: Boqun Feng @ 2018-04-12  2:12 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Andrea Parri,
	Lai Jiangshan, Josh Triplett, Steven Rostedt, Mathieu Desnoyers

On Wed, Apr 11, 2018 at 11:57:30AM -0700, Paul E. McKenney wrote:
> On Wed, Apr 11, 2018 at 09:56:44PM +0800, Boqun Feng wrote:
> > Although all flavors of RCU are annotated correctly with lockdep
> > annotations as recursive read locks, the 'check' parameter for their
> > calls to lock_acquire() is unset. Which means RCU read locks are not
> > added into the lockdep dependency graph. This is fine for all flavors
> > except sleepable RCU, because the deadlock scenarios for them are
> > simple: calling synchronize_rcu() and its friends inside their read-side
> > critical sections. But for sleepable RCU, as there may be multiple
> > instances with multiple classes, there are more deadlock cases.
> > Considering the following:
> > 
> > 	TASK 1				TASK 2
> > 	=======				========
> > 	i = srcu_read_lock(&sa);	i = srcu_read_lock(&sb);
> > 	synchronize_srcu(&sb);		synchronize_srcu(&sa);
> > 	srcu_read_unlock(&sa);		srcu_read_unlock(&sb);
> > 
> > Neither TASK 1 or 2 could go out of the read-side critical sections,
> > because they are waiting for each other at synchronize_srcu().
> > 
> > With the new improvement for lockdep, which allows us to detect
> > deadlocks for recursive read locks, we can actually detect this. What we
> > need to do are simply: a) mark srcu_read_{,un}lock() as 'check'
> > lock_acquire() and b) annotate synchronize_srcu() as a empty
> > grab-and-drop for a write lock (because synchronize_srcu() will wait for
> > previous srcu_read_lock() to release, and won't block the next
> > srcu_read_lock(), just like a empty write lock section).
> > 
> > This patch adds those to allow we check deadlocks related to sleepable
> > RCU with lockdep.
> > 
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> 
> Very cool!
> 
> One question though...  Won't this report a false-positive self-deadlock if
> srcu_read_lock() is invoked from an interrupt handler?
> 

Ah.. right. The false positive happens because synchronize_srcu() is
annotated as an irq-write-unsafe lock. That should be fixed:
synchronize_srcu() doesn't block an srcu_read_lock(), and the empty
write-lock critical section in srcu_lock_sync() should be treated as an
atomic grab-and-drop (i.e. nothing can interrupt it), so there is no irq
inversion problem.

A trivial fix/hack would be adding local_irq_disable() and
local_irq_enable() around srcu_lock_sync() like:

	static inline void srcu_lock_sync(struct lockdep_map *map)
	{
		local_irq_disable();
		lock_map_acquire(map);
		lock_map_release(map);
		local_irq_enable();
	}
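
To make the false positive concrete, here is a small userspace sketch of
the hardirq usage check. It is not lockdep's actual code: the toy_* names
and the four usage bits are assumptions made up for illustration, loosely
modelled on lockdep's "used in hardirq" and "hardirq enabled" usage states.
It shows that a reader seen in hardirq context plus a write acquisition
with irqs enabled gets flagged, while the same write acquisition with irqs
off does not:

```c
#include <stdbool.h>

/* Usage bits of a toy lock class (illustrative names, not the kernel's). */
enum toy_usage {
	TOY_USED_IN_HARDIRQ      = 1 << 0, /* write-acquired in hardirq context */
	TOY_USED_IN_HARDIRQ_READ = 1 << 1, /* read-acquired in hardirq context */
	TOY_ENABLED_HARDIRQ      = 1 << 2, /* write-acquired with hardirqs on */
	TOY_ENABLED_HARDIRQ_READ = 1 << 3, /* read-acquired with hardirqs on */
};

struct toy_class {
	unsigned int usage;
};

/*
 * Record one acquisition. Returns true if it is consistent with the usage
 * seen so far, false if the toy model would report an irq inversion.
 */
bool toy_acquire(struct toy_class *c, bool read, bool in_hardirq,
		 bool hardirqs_off)
{
	unsigned int bit, conflict;

	if (in_hardirq) {
		bit = read ? TOY_USED_IN_HARDIRQ_READ : TOY_USED_IN_HARDIRQ;
		/* A reader in irq only conflicts with an irq-unsafe writer. */
		conflict = read ? TOY_ENABLED_HARDIRQ
				: (TOY_ENABLED_HARDIRQ | TOY_ENABLED_HARDIRQ_READ);
	} else if (!hardirqs_off) {
		bit = read ? TOY_ENABLED_HARDIRQ_READ : TOY_ENABLED_HARDIRQ;
		/* An irq-unsafe writer conflicts with any in-irq user. */
		conflict = read ? TOY_USED_IN_HARDIRQ
				: (TOY_USED_IN_HARDIRQ | TOY_USED_IN_HARDIRQ_READ);
	} else {
		/* Process context with irqs off: no irq usage bit recorded. */
		return true;
	}

	c->usage |= bit;
	return !(c->usage & conflict);
}

/* srcu_read_lock() in an irq handler, sync annotated with irqs enabled. */
bool toy_srcu_plain_sync(void)
{
	struct toy_class srcu = { 0 };
	bool ok = toy_acquire(&srcu, true, true, true);

	return ok && toy_acquire(&srcu, false, false, false);
}

/* Same reader, but the sync annotation runs with irqs disabled. */
bool toy_srcu_irqs_off_sync(void)
{
	struct toy_class srcu = { 0 };
	bool ok = toy_acquire(&srcu, true, true, true);

	return ok && toy_acquire(&srcu, false, false, true);
}
```

In this model toy_srcu_plain_sync() reports a conflict and
toy_srcu_irqs_off_sync() does not, which is the effect the
local_irq_disable()/local_irq_enable() pair is meant to have.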

However, it might be better if lockdep could provide some annotation
API for such an empty critical section to say the grab-and-drop is
atomic. Something like:

	/*
	 * Annotate a wait point for all previous critical sections to
	 * finish.
	 *
	 * This won't make @map an irq-unsafe lock, no matter whether it's
	 * called w/ or w/o irqs disabled.
	 */
	lock_wait_unlock(struct lockdep_map *map, ..)

And in this primitive, we do something similar to
lock_acquire()+lock_release(). This primitive could be used elsewhere,
as I believe we have several empty grab-and-drop critical sections for
lockdep annotations, e.g. in start_flush_work().

Thoughts?

This certainly requires a bit more work; in the meantime, I will add
another self-test case which has srcu_read_lock() called in irq context.
Thanks!

Regards,
Boqun

> 							Thanx, Paul
> 
> > ---
> >  include/linux/srcu.h  | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
> >  kernel/rcu/srcutiny.c |  2 ++
> >  kernel/rcu/srcutree.c |  2 ++
> >  3 files changed, 53 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> > index 33c1c698df09..23f397bd192c 100644
> > --- a/include/linux/srcu.h
> > +++ b/include/linux/srcu.h
> > @@ -99,6 +99,49 @@ static inline int srcu_read_lock_held(const struct srcu_struct *sp)
> >  	return lock_is_held(&sp->dep_map);
> >  }
> > 
> > +/**
> > + * lockdep annotations for srcu_read_{un,}lock, and synchronize_srcu():
> > + *
> > + * srcu_read_lock() and srcu_read_unlock() are similar to rcu_read_lock() and
> > + * rcu_read_unlock(), they are recursive read locks. But we mark them as
> > + * "check", they will be added into lockdep dependency graph for deadlock
> > + * detection. And we also annotate synchronize_srcu() as a
> > + * write_lock()+write_unlock(), because synchronize_srcu() will wait for any
> > + * corresponding previous srcu_read_lock() to release, and that acts like a
> > + * empty grab-and-drop write lock.
> > + *
> > + * We do so because multiple sleepable rcu instances may cause deadlock as
> > + * follow:
> > + *
> > + *   Task 1:
> > + *     ia = srcu_read_lock(&srcu_A);
> > + *     synchronize_srcu(&srcu_B);
> > + *     srcu_read_unlock(&srcu_A, ia);
> > + *
> > + *   Task 2:
> > + *     ib = srcu_read_lock(&srcu_B);
> > + *     synchronize_srcu(&srcu_A);
> > + *     srcu_read_unlock(&srcu_B, ib);
> > + *
> > + * And we want lockdep to detect this or more complicated deadlock with SRCU
> > + * involved.
> > + */
> > +static inline void srcu_lock_acquire(struct lockdep_map *map)
> > +{
> > +	lock_map_acquire_read(map);
> > +}
> > +
> > +static inline void srcu_lock_release(struct lockdep_map *map)
> > +{
> > +	lock_map_release(map);
> > +}
> > +
> > +static inline void srcu_lock_sync(struct lockdep_map *map)
> > +{
> > +	lock_map_acquire(map);
> > +	lock_map_release(map);
> > +}
> > +
> >  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> > 
> >  static inline int srcu_read_lock_held(const struct srcu_struct *sp)
> > @@ -106,6 +149,10 @@ static inline int srcu_read_lock_held(const struct srcu_struct *sp)
> >  	return 1;
> >  }
> > 
> > +#define srcu_lock_acquire(m)	do { } while (0)
> > +#define srcu_lock_release(m)	do { } while (0)
> > +#define srcu_lock_sync(m)	do { } while (0)
> > +
> >  #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> > 
> >  /**
> > @@ -157,7 +204,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
> >  	int retval;
> > 
> >  	retval = __srcu_read_lock(sp);
> > -	rcu_lock_acquire(&(sp)->dep_map);
> > +	srcu_lock_acquire(&(sp)->dep_map);
> >  	return retval;
> >  }
> > 
> > @@ -171,7 +218,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
> >  static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
> >  	__releases(sp)
> >  {
> > -	rcu_lock_release(&(sp)->dep_map);
> > +	srcu_lock_release(&(sp)->dep_map);
> >  	__srcu_read_unlock(sp, idx);
> >  }
> > 
> > diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
> > index 76ac5f50b2c7..bc89cb48d800 100644
> > --- a/kernel/rcu/srcutiny.c
> > +++ b/kernel/rcu/srcutiny.c
> > @@ -188,6 +188,8 @@ void synchronize_srcu(struct srcu_struct *sp)
> >  {
> >  	struct rcu_synchronize rs;
> > 
> > +	srcu_lock_sync(&sp->dep_map);
> > +
> >  	init_rcu_head_on_stack(&rs.head);
> >  	init_completion(&rs.completion);
> >  	call_srcu(sp, &rs.head, wakeme_after_rcu);
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index d5cea81378cc..e2628e9275b9 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -997,6 +997,8 @@ EXPORT_SYMBOL_GPL(synchronize_srcu_expedited);
> >   */
> >  void synchronize_srcu(struct srcu_struct *sp)
> >  {
> > +	srcu_lock_sync(&sp->dep_map);
> > +
> >  	if (srcu_might_be_idle(sp) || rcu_gp_is_expedited())
> >  		synchronize_srcu_expedited(sp);
> >  	else
> > -- 
> > 2.16.2
> > 
> 

* Re: [RFC tip/locking/lockdep v6 19/20] rcu: Equip sleepable RCU with lockdep dependency graph checks
  2018-04-12  2:12     ` Boqun Feng
@ 2018-04-12  9:12       ` Peter Zijlstra
  2018-04-13 13:24         ` Boqun Feng
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2018-04-12  9:12 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Paul E. McKenney, linux-kernel, Ingo Molnar, Andrea Parri,
	Lai Jiangshan, Josh Triplett, Steven Rostedt, Mathieu Desnoyers

On Thu, Apr 12, 2018 at 10:12:33AM +0800, Boqun Feng wrote:
> A trivial fix/hack would be adding local_irq_disable() and
> local_irq_enable() around srcu_lock_sync() like:
> 
> 	static inline void srcu_lock_sync(struct lockdep_map *map)
> 	{
> 		local_irq_disable();
> 		lock_map_acquire(map);
> 		lock_map_release(map);
> 		local_irq_enable();
> 	}
> 
> However, it might be better if lockdep could provide some annotation
> API for such an empty critical section to say the grab-and-drop is
> atomic. Something like:
> 
> 	/*
> 	 * Annotate a wait point for all previous critical sections to
> 	 * finish.
> 	 *
> 	 * This won't make @map an irq-unsafe lock, no matter whether it's
> 	 * called w/ or w/o irqs disabled.
> 	 */
> 	lock_wait_unlock(struct lockdep_map *map, ..)
> 
> And in this primitive, we do something similar to
> lock_acquire()+lock_release(). This primitive could be used elsewhere,
> as I believe we have several empty grab-and-drop critical sections for
> lockdep annotations, e.g. in start_flush_work().
> 
> Thoughts?
> 
> This certainly requires a bit more work; in the meantime, I will add
> another self-test case which has srcu_read_lock() called in irq context.

Yeah, I've never really bothered to clean those things up, but I don't
see any reason to stop you from doing it ;-)

As to the initial pattern with disabling IRQs, I think I've seen code
like that before, and in general performance isn't a top priority
(within reason) when you're running lockdep kernels, so I've usually let
it be.

* Re: [RFC tip/locking/lockdep v6 19/20] rcu: Equip sleepable RCU with lockdep dependency graph checks
  2018-04-12  9:12       ` Peter Zijlstra
@ 2018-04-13 13:24         ` Boqun Feng
  0 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-13 13:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Paul E. McKenney, linux-kernel, Ingo Molnar, Andrea Parri,
	Lai Jiangshan, Josh Triplett, Steven Rostedt, Mathieu Desnoyers

On Thu, Apr 12, 2018 at 11:12:17AM +0200, Peter Zijlstra wrote:
> On Thu, Apr 12, 2018 at 10:12:33AM +0800, Boqun Feng wrote:
> > A trivial fix/hack would be adding local_irq_disable() and
> > local_irq_enable() around srcu_lock_sync() like:
> > 
> > 	static inline void srcu_lock_sync(struct lockdep_map *map)
> > 	{
> > 		local_irq_disable();
> > 		lock_map_acquire(map);
> > 		lock_map_release(map);
> > 		local_irq_enable();
> > 	}
> > 
> > However, it might be better if lockdep could provide some annotation
> > API for such an empty critical section to say the grab-and-drop is
> > atomic. Something like:
> > 
> > 	/*
> > 	 * Annotate a wait point for all previous critical sections to
> > 	 * finish.
> > 	 *
> > 	 * This won't make @map an irq-unsafe lock, no matter whether it's
> > 	 * called w/ or w/o irqs disabled.
> > 	 */
> > 	lock_wait_unlock(struct lockdep_map *map, ..)
> > 
> > And in this primitive, we do something similar to
> > lock_acquire()+lock_release(). This primitive could be used elsewhere,
> > as I believe we have several empty grab-and-drop critical sections for
> > lockdep annotations, e.g. in start_flush_work().
> > 
> > Thoughts?
> > 
> > This certainly requires a bit more work; in the meantime, I will add
> > another self-test case which has srcu_read_lock() called in irq context.
> 
> Yeah, I've never really bothered to clean those things up, but I don't
> see any reason to stop you from doing it ;-)
> 
> As to the initial pattern with disabling IRQs, I think I've seen code
> like that before, and in general performance isn't a top priority

Yeah, I saw we used that pattern in del_timer_sync()

> (within reason) when you're running lockdep kernels, so I've usually let
> it be.

Turns out it's not very hard to write a working version of
lock_wait_unlock() ;-) Just call __lock_acquire() and __lock_release()
back-to-back, with the hardirqs_off argument of __lock_acquire() set to 1:

	/*
	 * lock_sync() - wait for all previous critical sections to finish.
	 *
	 * Simply an acquire+release annotation with hardirqs_off set to 1: because
	 * no lock is actually held, this annotation alone is as safe against
	 * interrupts as if irqs were off.
	 */
	void lock_sync(struct lockdep_map *lock, unsigned subclass, int read,
		       int check, struct lockdep_map *nest_lock, unsigned long ip)
	{
		unsigned long flags;

		if (unlikely(current->lockdep_recursion))
			return;

		raw_local_irq_save(flags);
		check_flags(flags);

		current->lockdep_recursion = 1;
		/* trylock = 0, hardirqs_off = 1: annotate as if irqs were disabled */
		__lock_acquire(lock, subclass, 0, read, check, 1, nest_lock, ip, 0, 0);
		if (__lock_release(lock, 0, ip))
			check_chain_key(current);

		current->lockdep_recursion = 0;
		raw_local_irq_restore(flags);
	}
	EXPORT_SYMBOL_GPL(lock_sync);

I renamed it to lock_sync() because, most of the time, we use this
annotation for a "sync point" with other critical sections. We could
avoid some overhead by refactoring __lock_acquire() and __lock_release()
into helper functions, but I think this version is good enough for now,
and at least better than disabling IRQs around lock_map_acquire() +
lock_map_release() ;-)

Thoughts?

Regards,
Boqun


* Re: [RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning Boqun Feng
@ 2018-04-15  0:38   ` Randy Dunlap
  2018-04-16  6:29     ` Boqun Feng
  2018-04-27 13:50   ` Boqun Feng
  1 sibling, 1 reply; 28+ messages in thread
From: Randy Dunlap @ 2018-04-15  0:38 UTC (permalink / raw)
  To: Boqun Feng, linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney,
	Jonathan Corbet, open list:DOCUMENTATION

Hi,

Just a few typos etc. below...

On 04/11/2018 06:50 AM, Boqun Feng wrote:
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  Documentation/locking/lockdep-design.txt | 178 +++++++++++++++++++++++++++++++
>  1 file changed, 178 insertions(+)
> 
> diff --git a/Documentation/locking/lockdep-design.txt b/Documentation/locking/lockdep-design.txt
> index 9de1c158d44c..6bb9e90e2c4f 100644
> --- a/Documentation/locking/lockdep-design.txt
> +++ b/Documentation/locking/lockdep-design.txt
> @@ -284,3 +284,181 @@ Run the command and save the output, then compare against the output from
>  a later run of this command to identify the leakers.  This same output
>  can also help you find situations where runtime lock initialization has
>  been omitted.
> +
> +Recursive read locks:
> +---------------------
> +
> +Lockdep now is equipped with deadlock detection for recursive read locks.
> +
> +Recursive read locks, as their name indicates, are the locks able to be
> +acquired recursively. Unlike non-recursive read locks, recursive read locks
> +only get blocked by current write lock *holders* other than write lock
> +*waiters*, for example:
> +
> +	TASK A:			TASK B:
> +
> +	read_lock(X);
> +
> +				write_lock(X);
> +
> +	read_lock(X);
> +
> +is not a deadlock for recursive read locks, as while the task B is waiting for
> +the lock X, the second read_lock() doesn't need to wait because it's a recursive
> +read lock. However if the read_lock() is non-recursive read lock, then the above
> +case is a deadlock, because even if the write_lock() in TASK B can not get the
> +lock, but it can block the second read_lock() in TASK A.
> +
> +Note that a lock can be a write lock (exclusive lock), a non-recursive read
> +lock (non-recursive shared lock) or a recursive read lock (recursive shared
> +lock), depending on the lock operations used to acquire it (more specifically,
> +the value of the 'read' parameter for lock_acquire()). In other words, a single
> +lock instance has three types of acquisition depending on the acquisition
> +functions: exclusive, non-recursive read, and recursive read.
> +
> +To be concise, we call that write locks and non-recursive read locks as
> +"non-recursive" locks and recursive read locks as "recursive" locks.
> +
> +Recursive locks don't block each other, while non-recursive locks do (this is
> +even true for two non-recursive read locks). A non-recursive lock can block the
> +corresponding recursive lock, and vice versa.
> +
> +A deadlock case with recursive locks involved is as follow:
> +
> +	TASK A:			TASK B:
> +
> +	read_lock(X);
> +				read_lock(Y);
> +	write_lock(Y);
> +				write_lock(X);
> +
> +Task A is waiting for task B to read_unlock() Y and task B is waiting for task
> +A to read_unlock() X.
> +
> +Dependency types and strong dependency paths:
> +---------------------------------------------
> +In order to detect deadlocks as above, lockdep needs to track different dependencies.
> +There are 4 categories for dependency edges in the lockdep graph:
> +
> +1) -(NN)->: non-recursive to non-recursive dependency. "X -(NN)-> Y" means
> +            X -> Y and both X and Y are non-recursive locks.
> +
> +2) -(RN)->: recursive to non-recursive dependency. "X -(RN)-> Y" means
> +            X -> Y and X is recursive read lock and Y is non-recursive lock.
> +
> +3) -(NR)->: non-recursive to recursive dependency, "X -(NR)-> Y" means
> +            X -> Y and X is non-recursive lock and Y is recursive lock.
> +
> +4) -(RR)->: recursive to recursive dependency, "X -(RR)-> Y" means
> +            X -> Y and both X and Y are recursive locks.
> +
> +Note that given two locks, they may have multiple dependencies between them, for example:
> +
> +	TASK A:
> +
> +	read_lock(X);
> +	write_lock(Y);
> +	...
> +
> +	TASK B:
> +
> +	write_lock(X);
> +	write_lock(Y);
> +
> +, we have both X -(RN)-> Y and X -(NN)-> Y in the dependency graph.
> +
> +We use -(*N)-> for edges that is either -(RN)-> or -(NN)->, the similar for -(N*)->,
> +-(*R)-> and -(R*)->
> +
> +A "path" is a series of conjunct dependency edges in the graph. And we define a
> +"strong" path, which indicates the strong dependency throughout each dependency
> +in the path, as the path that doesn't have two conjunct edges (dependencies) as
> +-(*R)-> and -(R*)->. In other words, a "strong" path is a path from a lock
> +walking to another through the lock dependencies, and if X -> Y -> Z in the
> +path (where X, Y, Z are locks), if the walk from X to Y is through a -(NR)-> or
> +-(RR)-> dependency, the walk from Y to Z must not be through a -(RN)-> or
> +-(RR)-> dependency, otherwise it's not a strong path.
> +
> +We will see why the path is called "strong" in next section.
> +
> +Recursive Read Deadlock Detection:
> +----------------------------------
> +
> +We now prove two things:
> +
> +Lemma 1:
> +
> +If there is a closed strong path (i.e. a strong cirle), then there is a

??                                                 circle

> +combination of locking sequences that causes deadlock. I.e. a strong circle is
> +sufficient for deadlock detection.
> +
> +Lemma 2:
> +
> +If there is no closed strong path (i.e. strong cirle), then there is no

??                                                circle

> +combination of locking sequences that could cause deadlock. I.e.  strong
> +circles are necessary for deadlock detection.
> +
> +With these two Lemmas, we can easily say a closed strong path is both sufficient
> +and necessary for deadlocks, therefore a closed strong path is equivalent to
> +deadlock possibility. As a closed strong path stands for a dependency chain that
> +could cause deadlocks, so we call it "strong", considering there are dependency
> +circles that won't cause deadlocks.
> +
> +Proof for sufficiency (Lemma 1):
> +
> +Let's say we have a strong cirlce:

                              circle:

> +
> +	L1 -> L2 ... -> Ln -> L1
> +
> +, which means we have dependencies:
> +
> +	L1 -> L2
> +	L2 -> L3
> +	...
> +	Ln-1 -> Ln
> +	Ln -> L1
> +
> +We now can construct a combination of locking sequences that cause deadlock:
> +
> +Firstly let's make one CPU/task get the L1 in L1 -> L2, and then another get
> +the L2 in L2 -> L3, and so on. After this, all of the Lx in Lx -> Lx+1 are
> +held by different CPU/tasks.
> +
> +And then because we have L1 -> L2, so the holder of L1 is going to acquire L2
> +in L1 -> L2, however since L2 is already held by another CPU/task, plus L1 ->
> +L2 and L2 -> L3 are not *R and R* (the definition of strong), therefore the
> +holder of L1 can not get L2, it has to wait L2's holder to release.
> +
> +Moreover, we can have a similar conclusion for L2's holder: it has to wait L3's
> +holder to release, and so on. We now can proof that Lx's holder has to wait for

                                            prove

> +Lx+1's holder to release, and note that Ln+1 is L1, so we have a circular
> +waiting scenario and nobody can get progress, therefore a deadlock.
> +
> +Proof for necessary (Lemma 2):
> +
> +Lemma 2 is equivalent to: If there is a deadlock scenario, then there must be a
> +strong circle in the dependency graph.
> +
> +According to Wikipedia[1], if there is a deadlock, then there must be a circular
> +waiting scenario, means there are N CPU/tasks, where CPU/task P1 is waiting for
> +a lock held by P2, and P2 is waiting for a lock held by P3, ... and Pn is waiting
> +for a lock held by P1. Let's name the lock Px is waiting as Lx, so since P1 is waiting
> +for L1 and holding Ln, so we will have Ln -> L1 in the dependency graph. Similarly,
> +we have L1 -> L2, L2 -> L3, ..., Ln-1 -> Ln in the dependency graph, which means we
> +have a circle:
> +
> +	Ln -> L1 -> L2 -> ... -> Ln
> +
> +, and now let's prove the circle is strong:
> +
> +For a lock Lx, Px contributes the dependency Lx-1 -> Lx and Px+1 contributes
> +the dependency Lx -> Lx+1, and since Px is waiting for Px+1 to release Lx,
> +so Lx can not be both recursive in Lx -> Lx+1 and Lx-1 -> Lx, because recursive
> +locks don't block each other, therefore Lx-1 -> Lx and Lx -> Lx+1 can not be a
> +-(*R)-> -(R*)-> pair, and this is true for any lock in the circle, therefore,
> +the circle is strong.
> +
> +References:
> +-----------
> +[1]: https://en.wikipedia.org/wiki/Deadlock
> +[2]: Shibu, K. (2009). Intro To Embedded Systems (1st ed.). Tata McGraw-Hill
> 
I would also change all /can not/ to /cannot/...
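
For readers following the strong-path definition in the quoted patch, the
rule "no -(*R)-> edge immediately followed by a -(R*)-> edge" can be
sketched as a check over a closed path. This is a userspace illustration
only, not the BFS that lockdep implements; struct toy_edge and the toy_*
functions are hypothetical names. The first example is the
read_lock(X)/write_lock(Y) vs. read_lock(Y)/write_lock(X) deadlock from the
patch (the SRCU case from patch 19 has the same shape, with
srcu_read_lock() as the recursive read and synchronize_srcu() as the
write). The second is a circle that is not strong and therefore not a
deadlock:

```c
#include <stdbool.h>
#include <stddef.h>

/* One dependency edge "prev -> next" and how each end was acquired. */
struct toy_edge {
	bool prev_recursive; /* true for -(R*)->, false for -(N*)-> */
	bool next_recursive; /* true for -(*R)->, false for -(*N)-> */
};

/*
 * Return true if the closed path edges[0] -> ... -> edges[n-1] -> edges[0]
 * is strong, i.e. no -(*R)-> edge is immediately followed by a -(R*)-> edge.
 */
bool toy_circle_is_strong(const struct toy_edge *edges, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		const struct toy_edge *in = &edges[i];
		const struct toy_edge *out = &edges[(i + 1) % n];

		/* Recursive readers on both sides of a lock don't block. */
		if (in->next_recursive && out->prev_recursive)
			return false;
	}
	return true;
}

/* TASK A: read_lock(X); write_lock(Y).  TASK B: read_lock(Y); write_lock(X). */
bool toy_example_deadlock(void)
{
	const struct toy_edge circle[] = {
		{ .prev_recursive = true, .next_recursive = false }, /* X -(RN)-> Y */
		{ .prev_recursive = true, .next_recursive = false }, /* Y -(RN)-> X */
	};

	return toy_circle_is_strong(circle, 2);
}

/* TASK A: write_lock(X); read_lock(Y).  TASK B: read_lock(Y); write_lock(X). */
bool toy_example_no_deadlock(void)
{
	const struct toy_edge circle[] = {
		{ .prev_recursive = false, .next_recursive = true },  /* X -(NR)-> Y */
		{ .prev_recursive = true,  .next_recursive = false }, /* Y -(RN)-> X */
	};

	return toy_circle_is_strong(circle, 2);
}
```

In the second example TASK A's recursive read_lock(Y) is not blocked by
TASK B holding Y for read, so TASK A can finish and release X; the circle
exists in the graph but is not strong.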

-- 
~Randy

* Re: [RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning
  2018-04-15  0:38   ` Randy Dunlap
@ 2018-04-16  6:29     ` Boqun Feng
  0 siblings, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-16  6:29 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Andrea Parri,
	Paul E. McKenney, Jonathan Corbet, open list:DOCUMENTATION

On Sat, Apr 14, 2018 at 05:38:54PM -0700, Randy Dunlap wrote:
> Hi,
> 

Hello Randy,

> Just a few typos etc. below...
> 

Thanks! I fixed those typos according to your comments.

> On 04/11/2018 06:50 AM, Boqun Feng wrote:
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > ---
> >  Documentation/locking/lockdep-design.txt | 178 +++++++++++++++++++++++++++++++
> >  1 file changed, 178 insertions(+)
> > 
> > diff --git a/Documentation/locking/lockdep-design.txt b/Documentation/locking/lockdep-design.txt
> > index 9de1c158d44c..6bb9e90e2c4f 100644
> > --- a/Documentation/locking/lockdep-design.txt
> > +++ b/Documentation/locking/lockdep-design.txt
> > @@ -284,3 +284,181 @@ Run the command and save the output, then compare against the output from
> >  a later run of this command to identify the leakers.  This same output
> >  can also help you find situations where runtime lock initialization has
> >  been omitted.
> > +
> > +Recursive read locks:
> > +---------------------
> > +
> > +Lockdep now is equipped with deadlock detection for recursive read locks.
> > +
> > +Recursive read locks, as their name indicates, are the locks able to be
> > +acquired recursively. Unlike non-recursive read locks, recursive read locks
> > +only get blocked by current write lock *holders* other than write lock
> > +*waiters*, for example:
> > +
> > +	TASK A:			TASK B:
> > +
> > +	read_lock(X);
> > +
> > +				write_lock(X);
> > +
> > +	read_lock(X);
> > +
> > +is not a deadlock for recursive read locks, as while the task B is waiting for
> > +the lock X, the second read_lock() doesn't need to wait because it's a recursive
> > +read lock. However if the read_lock() is non-recursive read lock, then the above
> > +case is a deadlock, because even if the write_lock() in TASK B can not get the
> > +lock, but it can block the second read_lock() in TASK A.
> > +
> > +Note that a lock can be a write lock (exclusive lock), a non-recursive read
> > +lock (non-recursive shared lock) or a recursive read lock (recursive shared
> > +lock), depending on the lock operations used to acquire it (more specifically,
> > +the value of the 'read' parameter for lock_acquire()). In other words, a single
> > +lock instance has three types of acquisition depending on the acquisition
> > +functions: exclusive, non-recursive read, and recursive read.
> > +
> > +To be concise, we call that write locks and non-recursive read locks as
> > +"non-recursive" locks and recursive read locks as "recursive" locks.
> > +
> > +Recursive locks don't block each other, while non-recursive locks do (this is
> > +even true for two non-recursive read locks). A non-recursive lock can block the
> > +corresponding recursive lock, and vice versa.
> > +
> > +A deadlock case with recursive locks involved is as follow:
> > +
> > +	TASK A:			TASK B:
> > +
> > +	read_lock(X);
> > +				read_lock(Y);
> > +	write_lock(Y);
> > +				write_lock(X);
> > +
> > +Task A is waiting for task B to read_unlock() Y and task B is waiting for task
> > +A to read_unlock() X.
> > +
> > +Dependency types and strong dependency paths:
> > +---------------------------------------------
> > +In order to detect deadlocks as above, lockdep needs to track different dependencies.
> > +There are 4 categories for dependency edges in the lockdep graph:
> > +
> > +1) -(NN)->: non-recursive to non-recursive dependency. "X -(NN)-> Y" means
> > +            X -> Y and both X and Y are non-recursive locks.
> > +
> > +2) -(RN)->: recursive to non-recursive dependency. "X -(RN)-> Y" means
> > +            X -> Y and X is recursive read lock and Y is non-recursive lock.
> > +
> > +3) -(NR)->: non-recursive to recursive dependency, "X -(NR)-> Y" means
> > +            X -> Y and X is non-recursive lock and Y is recursive lock.
> > +
> > +4) -(RR)->: recursive to recursive dependency, "X -(RR)-> Y" means
> > +            X -> Y and both X and Y are recursive locks.
> > +
> > +Note that given two locks, they may have multiple dependencies between them, for example:
> > +
> > +	TASK A:
> > +
> > +	read_lock(X);
> > +	write_lock(Y);
> > +	...
> > +
> > +	TASK B:
> > +
> > +	write_lock(X);
> > +	write_lock(Y);
> > +
> > +, we have both X -(RN)-> Y and X -(NN)-> Y in the dependency graph.
> > +
> > +We use -(*N)-> for edges that is either -(RN)-> or -(NN)->, the similar for -(N*)->,
> > +-(*R)-> and -(R*)->
> > +
> > +A "path" is a series of conjunct dependency edges in the graph. And we define a
> > +"strong" path, which indicates the strong dependency throughout each dependency
> > +in the path, as the path that doesn't have two conjunct edges (dependencies) as
> > +-(*R)-> and -(R*)->. In other words, a "strong" path is a path from a lock
> > +walking to another through the lock dependencies, and if X -> Y -> Z in the
> > +path (where X, Y, Z are locks), if the walk from X to Y is through a -(NR)-> or
> > +-(RR)-> dependency, the walk from Y to Z must not be through a -(RN)-> or
> > +-(RR)-> dependency, otherwise it's not a strong path.
> > +
> > +We will see why the path is called "strong" in next section.
> > +
> > +Recursive Read Deadlock Detection:
> > +----------------------------------
> > +
> > +We now prove two things:
> > +
> > +Lemma 1:
> > +
> > +If there is a closed strong path (i.e. a strong cirle), then there is a
> 
> ??                                                 circle
> 
> > +combination of locking sequences that causes deadlock. I.e. a strong circle is
> > +sufficient for deadlock detection.
> > +
> > +Lemma 2:
> > +
> > +If there is no closed strong path (i.e. strong cirle), then there is no
> 
> ??                                                circle
> 
> > +combination of locking sequences that could cause deadlock. I.e.  strong
> > +circles are necessary for deadlock detection.
> > +
> > +With these two Lemmas, we can easily say a closed strong path is both sufficient
> > +and necessary for deadlocks, therefore a closed strong path is equivalent to
> > +deadlock possibility. As a closed strong path stands for a dependency chain that
> > +could cause deadlocks, so we call it "strong", considering there are dependency
> > +circles that won't cause deadlocks.
> > +
> > +Proof for sufficiency (Lemma 1):
> > +
> > +Let's say we have a strong cirlce:
> 
>                               circle:
> 
> > +
> > +	L1 -> L2 ... -> Ln -> L1
> > +
> > +, which means we have dependencies:
> > +
> > +	L1 -> L2
> > +	L2 -> L3
> > +	...
> > +	Ln-1 -> Ln
> > +	Ln -> L1
> > +
> > +We now can construct a combination of locking sequences that cause deadlock:
> > +
> > +Firstly let's make one CPU/task get the L1 in L1 -> L2, and then another get
> > +the L2 in L2 -> L3, and so on. After this, all of the Lx in Lx -> Lx+1 are
> > +held by different CPU/tasks.
> > +
> > +And then because we have L1 -> L2, so the holder of L1 is going to acquire L2
> > +in L1 -> L2, however since L2 is already held by another CPU/task, plus L1 ->
> > +L2 and L2 -> L3 are not *R and R* (the definition of strong), therefore the
> > +holder of L1 can not get L2, it has to wait L2's holder to release.
> > +
> > +Moreover, we can have a similar conclusion for L2's holder: it has to wait L3's
> > +holder to release, and so on. We now can proof that Lx's holder has to wait for
> 
>                                             prove
> 
> > +Lx+1's holder to release, and note that Ln+1 is L1, so we have a circular
> > +waiting scenario and nobody can get progress, therefore a deadlock.
> > +
> > +Proof for necessary (Lemma 2):
> > +
> > +Lemma 2 is equivalent to: If there is a deadlock scenario, then there must be a
> > +strong circle in the dependency graph.
> > +
> > +According to Wikipedia[1], if there is a deadlock, then there must be a circular
> > +waiting scenario, means there are N CPU/tasks, where CPU/task P1 is waiting for
> > +a lock held by P2, and P2 is waiting for a lock held by P3, ... and Pn is waiting
> > +for a lock held by P1. Let's name the lock Px is waiting as Lx, so since P1 is waiting
> > +for L1 and holding Ln, so we will have Ln -> L1 in the dependency graph. Similarly,
> > +we have L1 -> L2, L2 -> L3, ..., Ln-1 -> Ln in the dependency graph, which means we
> > +have a circle:
> > +
> > +	Ln -> L1 -> L2 -> ... -> Ln
> > +
> > +, and now let's prove the circle is strong:
> > +
> > +For a lock Lx, Px contributes the dependency Lx-1 -> Lx and Px+1 contributes
> > +the dependency Lx -> Lx+1, and since Px is waiting for Px+1 to release Lx,
> > +so Lx can not be both recursive in Lx -> Lx+1 and Lx-1 -> Lx, because recursive
> > +locks don't block each other, therefore Lx-1 -> Lx and Lx -> Lx+1 can not be a
> > +-(*R)-> -(R*)-> pair, and this is true for any lock in the circle, therefore,
> > +the circle is strong.
> > +
> > +References:
> > +-----------
> > +[1]: https://en.wikipedia.org/wiki/Deadlock
> > +[2]: Shibu, K. (2009). Intro To Embedded Systems (1st ed.). Tata McGraw-Hill
> > 
> I would also change all /can not/ to /cannot/...

Agreed. I will use 'cannot' for any future version, thanks a lot!

Regards,
Boqun

> 
> -- 
> ~Randy


* Re: [RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning
  2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning Boqun Feng
  2018-04-15  0:38   ` Randy Dunlap
@ 2018-04-27 13:50   ` Boqun Feng
  1 sibling, 0 replies; 28+ messages in thread
From: Boqun Feng @ 2018-04-27 13:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Andrea Parri, Paul E. McKenney,
	Jonathan Corbet, open list:DOCUMENTATION, willy, ktkhai, jlayton,
	bfields, viro, linux-fsdevel, longman, Will Deacon

(Cc'ing more people)

On Wed, Apr 11, 2018 at 09:50:51PM +0800, Boqun Feng wrote:
> This patch add the documentation piece for the reasoning of deadlock
> detection related to recursive read lock. The following sections are
> added:
> 
> *	Explain what is a recursive read lock, and what deadlock cases
> 	they could introduce.
> 
> *	Introduce the notations for different types of dependencies, and
> 	the definition of strong paths.
> 
> *	Proof for a closed strong path is both sufficient and necessary
> 	for deadlock detections with recursive read locks involved. The
> 	proof could also explain why we call the path "strong"
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
>  Documentation/locking/lockdep-design.txt | 178 +++++++++++++++++++++++++++++++
>  1 file changed, 178 insertions(+)
> 
> diff --git a/Documentation/locking/lockdep-design.txt b/Documentation/locking/lockdep-design.txt
> index 9de1c158d44c..6bb9e90e2c4f 100644
> --- a/Documentation/locking/lockdep-design.txt
> +++ b/Documentation/locking/lockdep-design.txt
> @@ -284,3 +284,181 @@ Run the command and save the output, then compare against the output from
>  a later run of this command to identify the leakers.  This same output
>  can also help you find situations where runtime lock initialization has
>  been omitted.
> +
> +Recursive read locks:
> +---------------------
> +
> +Lockdep now is equipped with deadlock detection for recursive read locks.
> +
> +Recursive read locks, as their name indicates, are the locks able to be
> +acquired recursively. Unlike non-recursive read locks, recursive read locks
> +only get blocked by current write lock *holders* other than write lock
> +*waiters*, for example:
> +
> +	TASK A:			TASK B:
> +
> +	read_lock(X);
> +
> +				write_lock(X);
> +
> +	read_lock(X);
> +
> +is not a deadlock for recursive read locks, as while the task B is waiting for
> +the lock X, the second read_lock() doesn't need to wait because it's a recursive
> +read lock. However if the read_lock() is non-recursive read lock, then the above
> +case is a deadlock, because even if the write_lock() in TASK B can not get the
> +lock, but it can block the second read_lock() in TASK A.
> +
> +Note that a lock can be a write lock (exclusive lock), a non-recursive read
> +lock (non-recursive shared lock) or a recursive read lock (recursive shared
> +lock), depending on the lock operations used to acquire it (more specifically,
> +the value of the 'read' parameter for lock_acquire()). In other words, a single
> +lock instance has three types of acquisition depending on the acquisition
> +functions: exclusive, non-recursive read, and recursive read.
> +
> +To be concise, we call that write locks and non-recursive read locks as
> +"non-recursive" locks and recursive read locks as "recursive" locks.
> +
> +Recursive locks don't block each other, while non-recursive locks do (this is
> +even true for two non-recursive read locks). A non-recursive lock can block the
> +corresponding recursive lock, and vice versa.
> +
> +A deadlock case with recursive locks involved is as follow:
> +
> +	TASK A:			TASK B:
> +
> +	read_lock(X);
> +				read_lock(Y);
> +	write_lock(Y);
> +				write_lock(X);
> +
> +Task A is waiting for task B to read_unlock() Y and task B is waiting for task
> +A to read_unlock() X.
> +
> +Dependency types and strong dependency paths:
> +---------------------------------------------
> +In order to detect deadlocks as above, lockdep needs to track different dependencies.
> +There are 4 categories for dependency edges in the lockdep graph:
> +
> +1) -(NN)->: non-recursive to non-recursive dependency. "X -(NN)-> Y" means
> +            X -> Y and both X and Y are non-recursive locks.
> +
> +2) -(RN)->: recursive to non-recursive dependency. "X -(RN)-> Y" means
> +            X -> Y and X is recursive read lock and Y is non-recursive lock.
> +
> +3) -(NR)->: non-recursive to recursive dependency, "X -(NR)-> Y" means
> +            X -> Y and X is non-recursive lock and Y is recursive lock.
> +
> +4) -(RR)->: recursive to recursive dependency, "X -(RR)-> Y" means
> +            X -> Y and both X and Y are recursive locks.
> +
> +Note that given two locks, they may have multiple dependencies between them, for example:
> +
> +	TASK A:
> +
> +	read_lock(X);
> +	write_lock(Y);
> +	...
> +
> +	TASK B:
> +
> +	write_lock(X);
> +	write_lock(Y);
> +
> +, we have both X -(RN)-> Y and X -(NN)-> Y in the dependency graph.
> +
> +We use -(*N)-> for edges that is either -(RN)-> or -(NN)->, the similar for -(N*)->,
> +-(*R)-> and -(R*)->
> +
> +A "path" is a series of conjunct dependency edges in the graph. And we define a
> +"strong" path, which indicates the strong dependency throughout each dependency
> +in the path, as the path that doesn't have two conjunct edges (dependencies) as
> +-(*R)-> and -(R*)->. In other words, a "strong" path is a path from a lock
> +walking to another through the lock dependencies, and if X -> Y -> Z in the
> +path (where X, Y, Z are locks), if the walk from X to Y is through a -(NR)-> or
> +-(RR)-> dependency, the walk from Y to Z must not be through a -(RN)-> or
> +-(RR)-> dependency, otherwise it's not a strong path.
> +

So Matthew's request for better deadlock detection for rwlock can be
solved by:

	https://marc.info/?l=linux-kernel&m=152483640529740&w=2k

However, that brings up a new challenge for deadlock detection of
recursive read locks, because my previous design assumed that a lock
cannot have both non-recursive readers and recursive readers. With that
assumption, the definition of a "strong" path works.

Now that rwlock_t may have both non-recursive readers and recursive
readers, we have more interesting cases:

Case 1:

	<in irq handler>
	read_lock(&A); // recursive
				spin_lock_irq(&B);
	spin_lock(&B);
				read_lock(&A);

, is a deadlock. And we have circle A -(RN)-> B -(NN)-> A (note that "N"
stands for non-recursive).

Case 2:

	<in irq handler>
	spin_lock(&A);
				read_lock(&B);
	read_lock(&B); // recursive
				spin_lock_irq(&A);

, is not a deadlock, even if we have circle A -(NR)-> B -(NN)-> A.

So we need to redefine the "strong" path, because the blocking
conditions differ between non-recursive and recursive readers: a
recursive reader can block a non-recursive reader, while a non-recursive
reader cannot block a recursive reader.

Let's mark non-recursive readers as S (shared) and writers as W. Then a
"strong" path is a path without conjunct edges of the form
-(*R)-> -(S*)-> or -(*R)-> -(R*)->. We can prove this with similar
reasoning based on the new blocking conditions.
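
To make the rule concrete, here is a minimal sketch in C. The enum and
both function names are made up for illustration and are not lockdep
code; the sketch only encodes the blocking conditions described in this
thread (recursive readers are never blocked by other readers, everything
else blocks).

```c
#include <stdbool.h>

/* Hypothetical acquisition types for this sketch, not lockdep's own. */
enum acq_type {
	ACQ_W,	/* writer */
	ACQ_S,	/* non-recursive (shared) reader */
	ACQ_R,	/* recursive reader */
};

/*
 * Can a lock held as @holder make an acquisition of type @waiter wait?
 * A writer blocks everyone; a reader of either kind blocks writers and
 * non-recursive readers, but never a recursive reader.
 */
static bool acq_blocks(enum acq_type holder, enum acq_type waiter)
{
	return holder == ACQ_W || waiter != ACQ_R;
}

/*
 * Conjunct edges X -> Y -> Z: @y_acquired is how Y is taken in X -> Y,
 * @y_held is how Y is held in Y -> Z. The pair is strong only if the
 * holder of Y can block the acquirer of Y.
 */
static bool edge_pair_is_strong(enum acq_type y_acquired,
				enum acq_type y_held)
{
	return acq_blocks(y_held, y_acquired);
}
```

Under these assumptions the only non-strong pairs are the ones where Y is
acquired as R and held as S or R, i.e. exactly -(*R)-> -(S*)-> and
-(*R)-> -(R*)->.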

Luckily, we don't need to change the code much, because rather than
recording RR, RN, NR and NN, we can record:

1)	-(R/S N)->: the first lock is reader (recursive or not) and the
		    second is non-recursive (writer or non-recursive
		    reader)

2)	-(R/S R)->: the first lock is reader and the second is recursive

3)	-(W N)->: the first lock is writer and the second is non-recursive

4)	-(W R)->: the first lock is writer and the second is recursive

, and a "strong" path is a path without conjunct edges -(*R)-> -(R/S*)->.
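
As a rough sketch of how small that check is, assume each edge is stored
as two bits, one per end. The EX_DEP_* names and both helpers below are
hypothetical and chosen for this example only; they are not the names
used in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-bit edge encoding, for illustration only. */
#define EX_FIRST_READER		(1U << 0) /* first lock is R or S, else W */
#define EX_SECOND_RECURSIVE	(1U << 1) /* second lock is R, else N */

#define EX_DEP_W_N	0U
#define EX_DEP_RS_N	EX_FIRST_READER
#define EX_DEP_W_R	EX_SECOND_RECURSIVE
#define EX_DEP_RS_R	(EX_FIRST_READER | EX_SECOND_RECURSIVE)

/* A pair of conjunct edges is not strong iff it is -(*R)-> -(R/S*)->. */
static bool ex_pair_is_strong(unsigned int prev, unsigned int next)
{
	return !((prev & EX_SECOND_RECURSIVE) && (next & EX_FIRST_READER));
}

/*
 * edges[i] leads from lock i to lock (i + 1) % n, so the path is closed
 * and the last edge is also checked against the first one.
 */
static bool ex_circle_is_strong(const unsigned int *edges, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		if (!ex_pair_is_strong(edges[i], edges[(i + 1) % n]))
			return false;
	return true;
}
```

In this encoding the circle of Case 1, A -(RN)-> B -(NN)-> A, becomes
{ EX_DEP_RS_N, EX_DEP_W_N } and is strong, while the circle of Case 2,
A -(NR)-> B -(NN)-> A, becomes { EX_DEP_W_R, EX_DEP_RS_N } and is not.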

As a result, only a little code needs to change, and I have already
done that in a local branch (though a lot of documentation updates are
still needed, and I haven't done those yet for the new version).

There remain two questions before I make a move:

1.	Changing the annotation for queued rwlocks needs extra work for
	lockdep self test cases, but could help reveal more bugs. Do
	people want this very soon?

2.	Or should we focus on deadlock detection for recursive read locks
	first? Once that is settled, we can move on to annotating queued
	rwlocks properly.

I ask because doing the two together seems too big for one patchset, and
both will probably introduce some regressions, so...

Thoughts?

Regards,
Boqun
> +We will see why the path is called "strong" in next section.
> +
> +Recursive Read Deadlock Detection:
> +----------------------------------
> +
> +We now prove two things:
> +
> +Lemma 1:
> +
> +If there is a closed strong path (i.e. a strong cirle), then there is a
> +combination of locking sequences that causes deadlock. I.e. a strong circle is
> +sufficient for deadlock detection.
> +
> +Lemma 2:
> +
> +If there is no closed strong path (i.e. strong cirle), then there is no
> +combination of locking sequences that could cause deadlock. I.e.  strong
> +circles are necessary for deadlock detection.
> +
> +With these two Lemmas, we can easily say a closed strong path is both sufficient
> +and necessary for deadlocks, therefore a closed strong path is equivalent to
> +deadlock possibility. As a closed strong path stands for a dependency chain that
> +could cause deadlocks, so we call it "strong", considering there are dependency
> +circles that won't cause deadlocks.
> +
> +Proof for sufficiency (Lemma 1):
> +
> +Let's say we have a strong cirlce:
> +
> +	L1 -> L2 ... -> Ln -> L1
> +
> +, which means we have dependencies:
> +
> +	L1 -> L2
> +	L2 -> L3
> +	...
> +	Ln-1 -> Ln
> +	Ln -> L1
> +
> +We now can construct a combination of locking sequences that cause deadlock:
> +
> +Firstly let's make one CPU/task get the L1 in L1 -> L2, and then another get
> +the L2 in L2 -> L3, and so on. After this, all of the Lx in Lx -> Lx+1 are
> +held by different CPU/tasks.
> +
> +And then because we have L1 -> L2, so the holder of L1 is going to acquire L2
> +in L1 -> L2, however since L2 is already held by another CPU/task, plus L1 ->
> +L2 and L2 -> L3 are not *R and R* (the definition of strong), therefore the
> +holder of L1 can not get L2, it has to wait L2's holder to release.
> +
> +Moreover, we can have a similar conclusion for L2's holder: it has to wait L3's
> +holder to release, and so on. We now can proof that Lx's holder has to wait for
> +Lx+1's holder to release, and note that Ln+1 is L1, so we have a circular
> +waiting scenario and nobody can get progress, therefore a deadlock.
> +
> +Proof for necessary (Lemma 2):
> +
> +Lemma 2 is equivalent to: If there is a deadlock scenario, then there must be a
> +strong circle in the dependency graph.
> +
> +According to Wikipedia[1], if there is a deadlock, then there must be a circular
> +waiting scenario, means there are N CPU/tasks, where CPU/task P1 is waiting for
> +a lock held by P2, and P2 is waiting for a lock held by P3, ... and Pn is waiting
> +for a lock held by P1. Let's name the lock Px is waiting as Lx, so since P1 is waiting
> +for L1 and holding Ln, so we will have Ln -> L1 in the dependency graph. Similarly,
> +we have L1 -> L2, L2 -> L3, ..., Ln-1 -> Ln in the dependency graph, which means we
> +have a circle:
> +
> +	Ln -> L1 -> L2 -> ... -> Ln
> +
> +, and now let's prove the circle is strong:
> +
> +For a lock Lx, Px contributes the dependency Lx-1 -> Lx and Px+1 contributes
> +the dependency Lx -> Lx+1, and since Px is waiting for Px+1 to release Lx,
> +so Lx can not be both recursive in Lx -> Lx+1 and Lx-1 -> Lx, because recursive
> +locks don't block each other, therefore Lx-1 -> Lx and Lx -> Lx+1 can not be a
> +-(*R)-> -(R*)-> pair, and this is true for any lock in the circle, therefore,
> +the circle is strong.
> +
> +References:
> +-----------
> +[1]: https://en.wikipedia.org/wiki/Deadlock
> +[2]: Shibu, K. (2009). Intro To Embedded Systems (1st ed.). Tata McGraw-Hill
> -- 
> 2.16.2
> 


Thread overview: 28+ messages
2018-04-11 13:50 [RFC tip/locking/lockdep v6 00/20] lockdep: Support deadlock detection for recursive read locks Boqun Feng
2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 01/20] lockdep/Documention: Recursive read lock detection reasoning Boqun Feng
2018-04-15  0:38   ` Randy Dunlap
2018-04-16  6:29     ` Boqun Feng
2018-04-27 13:50   ` Boqun Feng
2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 02/20] lockdep: Demagic the return value of BFS Boqun Feng
2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 03/20] lockdep: Make __bfs() visit every dependency until a match Boqun Feng
2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 04/20] lockdep: Redefine LOCK_*_STATE* bits Boqun Feng
2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 05/20] lockdep: Reduce the size of lock_list::distance Boqun Feng
2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 06/20] lockdep: Introduce lock_list::dep Boqun Feng
2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 07/20] lockdep: Extend __bfs() to work with multiple types of dependencies Boqun Feng
2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 08/20] lockdep: Make __bfs(.match) return bool Boqun Feng
2018-04-11 13:50 ` [RFC tip/locking/lockdep v6 09/20] lockdep: Support deadlock detection for recursive read locks in check_noncircular() Boqun Feng
2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 10/20] lockdep: Adjust check_redundant() for recursive read change Boqun Feng
2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 11/20] lockdep: Fix recursive read lock related safe->unsafe detection Boqun Feng
2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 12/20] lockdep: Add recursive read locks into dependency graph Boqun Feng
2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 13/20] lockdep/selftest: Add a R-L/L-W test case specific to chain cache behavior Boqun Feng
2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 14/20] lockdep: Take read/write status in consideration when generate chainkey Boqun Feng
2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 15/20] lockdep/selftest: Unleash irq_read_recursion2 and add more Boqun Feng
2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 16/20] lockdep/selftest: Add more recursive read related test cases Boqun Feng
2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 17/20] Revert "locking/lockdep/selftests: Fix mixed read-write ABBA tests" Boqun Feng
2018-04-11 13:51 ` [RFC tip/locking/lockdep v6 18/20] MAINTAINERS: Add myself as a LOCKING PRIMITIVES reviewer Boqun Feng
2018-04-11 13:56 ` [RFC tip/locking/lockdep v6 19/20] rcu: Equip sleepable RCU with lockdep dependency graph checks Boqun Feng
2018-04-11 18:57   ` Paul E. McKenney
2018-04-12  2:12     ` Boqun Feng
2018-04-12  9:12       ` Peter Zijlstra
2018-04-13 13:24         ` Boqun Feng
2018-04-11 13:57 ` [RFC tip/locking/lockdep v6 20/20] lockdep/selftest: Add a test case for SRCU Boqun Feng
