Linux-Fsdevel Archive on lore.kernel.org
* [PATCH] tools/memory-model: document the "one-time init" pattern
@ 2020-07-17  4:44 Eric Biggers
  2020-07-17  5:49 ` Sedat Dilek
                   ` (5 more replies)
  0 siblings, 6 replies; 39+ messages in thread
From: Eric Biggers @ 2020-07-17  4:44 UTC
  To: linux-kernel, linux-arch, Paul E . McKenney
  Cc: linux-fsdevel, Akira Yokosawa, Alan Stern, Andrea Parri,
	Boqun Feng, Daniel Lustig, Darrick J . Wong, Dave Chinner,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon

From: Eric Biggers <ebiggers@google.com>

The "one-time init" pattern is implemented incorrectly in various places
in the kernel.  And when people do try to implement it correctly, it is
unclear what to use.  Try to give some proper guidance.

This is motivated by the discussion at
https://lkml.kernel.org/linux-fsdevel/20200713033330.205104-1-ebiggers@kernel.org/T/#u
regarding fixing the initialization of super_block::s_dio_done_wq.

Cc: Akira Yokosawa <akiyks@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Lustig <dlustig@nvidia.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jade Alglave <j.alglave@ucl.ac.uk>
Cc: Luc Maranget <luc.maranget@inria.fr>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 tools/memory-model/Documentation/recipes.txt | 151 +++++++++++++++++++
 1 file changed, 151 insertions(+)

diff --git a/tools/memory-model/Documentation/recipes.txt b/tools/memory-model/Documentation/recipes.txt
index 7fe8d7aa3029..04beb06dbfc7 100644
--- a/tools/memory-model/Documentation/recipes.txt
+++ b/tools/memory-model/Documentation/recipes.txt
@@ -519,6 +519,157 @@ CPU1 puts the waiting task to sleep and CPU0 fails to wake it up.
 
 Note that use of locking can greatly simplify this pattern.
 
+One-time init
+-------------
+
+The "one-time init" pattern applies when multiple tasks can race to
+initialize the same data structure(s) on first use.
+
+In many cases, it's best to just avoid the need for this by simply
+initializing the data ahead of time.
+
+But in cases where the data would often go unused, one-time init can be
+appropriate to avoid wasting kernel resources.  It can also be
+appropriate if the initialization has other prerequisites which preclude
+it being done ahead of time.
+
+First, consider whether your data (a) has global or static scope, (b) can
+be initialized from atomic context, and (c) cannot fail to be initialized.
+If all of those apply, just use DO_ONCE() from <linux/once.h>:
+
+	DO_ONCE(func);
+
+If that doesn't apply, you'll have to implement one-time init yourself.
+
+The simplest implementation just uses a mutex and an 'inited' flag.
+This implementation should be used where feasible:
+
+	static bool foo_inited;
+	static DEFINE_MUTEX(foo_init_mutex);
+
+	int init_foo_if_needed(void)
+	{
+		int err = 0;
+
+		mutex_lock(&foo_init_mutex);
+		if (!foo_inited) {
+			err = init_foo();
+			if (err == 0)
+				foo_inited = true;
+		}
+		mutex_unlock(&foo_init_mutex);
+		return err;
+	}
+
+The above example uses static variables, but this solution also works
+for initializing something that is part of another data structure.  The
+mutex may still be static.
+
+In cases where taking the mutex in the "already initialized" case
+presents scalability concerns, the implementation can be optimized to
+check the 'inited' flag outside the mutex.  Unfortunately, this
+optimization is often implemented incorrectly by using a plain load.
+That violates the memory model and may result in unpredictable behavior.
+
+A correct implementation is:
+
+	static bool foo_inited;
+	static DEFINE_MUTEX(foo_init_mutex);
+
+	int init_foo_if_needed(void)
+	{
+		int err = 0;
+
+		/* pairs with smp_store_release() below */
+		if (smp_load_acquire(&foo_inited))
+			return 0;
+
+		mutex_lock(&foo_init_mutex);
+		if (!foo_inited) {
+			err = init_foo();
+			if (err == 0) /* pairs with smp_load_acquire() above */
+				smp_store_release(&foo_inited, true);
+		}
+		mutex_unlock(&foo_init_mutex);
+		return err;
+	}
+
+If only a single data structure is being initialized, then the pointer
+itself can take the place of the 'inited' flag:
+
+	static struct foo *foo;
+	static DEFINE_MUTEX(foo_init_mutex);
+
+	int init_foo_if_needed(void)
+	{
+		int err = 0;
+
+		/* pairs with smp_store_release() below */
+		if (smp_load_acquire(&foo))
+			return 0;
+
+		mutex_lock(&foo_init_mutex);
+		if (!foo) {
+			struct foo *p = alloc_foo();
+
+			if (p) /* pairs with smp_load_acquire() above */
+				smp_store_release(&foo, p);
+			else
+				err = -ENOMEM;
+		}
+		mutex_unlock(&foo_init_mutex);
+		return err;
+	}
+
+There are also cases in which the smp_load_acquire() can be replaced by
+the more lightweight READ_ONCE().  (smp_store_release() is still
+required.)  Specifically, if all initialized memory is transitively
+reachable from the pointer itself, then there is no control dependency
+so the data dependency barrier provided by READ_ONCE() is sufficient.
+
+However, using the READ_ONCE() optimization is discouraged for
+nontrivial data structures, as it can be difficult to determine if there
+is a control dependency.  For complex data structures it may depend on
+internal implementation details of other kernel subsystems.
+
+For the single-pointer case, a further optimized implementation
+eliminates the mutex and instead uses compare-and-exchange:
+
+	static struct foo *foo;
+
+	int init_foo_if_needed(void)
+	{
+		struct foo *p;
+
+		/* pairs with successful cmpxchg_release() below */
+		if (smp_load_acquire(&foo))
+			return 0;
+
+		p = alloc_foo();
+		if (!p)
+			return -ENOMEM;
+
+		/* on success, pairs with smp_load_acquire() above and below */
+		if (cmpxchg_release(&foo, NULL, p) != NULL) {
+			free_foo(p);
+			/* pairs with successful cmpxchg_release() above */
+			smp_load_acquire(&foo);
+		}
+		return 0;
+	}
+
+Note that when the cmpxchg_release() fails due to another task already
+having done it, a second smp_load_acquire() is required, since we still
+need to acquire the data that the other task released.  You may be
+tempted to upgrade cmpxchg_release() to cmpxchg() with the goal of it
+acting as both ACQUIRE and RELEASE, but that doesn't work here because
+cmpxchg() only guarantees memory ordering if it succeeds.
+
+Because of the above subtlety, the version with the mutex instead of
+cmpxchg_release() should be preferred, except potentially in cases where
+it is difficult to provide anything other than a global mutex and where
+the one-time data is part of a frequently allocated structure.  In that
+case, a global mutex might present scalability concerns.
 
 Rules of thumb
 ==============
-- 
2.27.0

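For readers who want to experiment with the flag-plus-mutex recipe outside
the kernel, here is a rough userspace sketch. It is not part of the patch.
It assumes C11 atomics and pthreads as stand-ins for smp_load_acquire(),
smp_store_release() and the kernel mutex. foo_value, foo_init_calls,
worker() and run_demo() are hypothetical names added only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical payload that init_foo() sets up. */
static int foo_value;
/* Counts how often init_foo() actually ran (only touched under the mutex). */
static int foo_init_calls;

static atomic_bool foo_inited;
static pthread_mutex_t foo_init_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical initializer; returns 0 on success. */
static int init_foo(void)
{
	foo_value = 42;
	foo_init_calls++;
	return 0;
}

int init_foo_if_needed(void)
{
	int err = 0;

	/* Fast path: this acquire load pairs with the release store below. */
	if (atomic_load_explicit(&foo_inited, memory_order_acquire))
		return 0;

	pthread_mutex_lock(&foo_init_mutex);
	/* Re-check under the mutex; the mutex orders this relaxed load. */
	if (!atomic_load_explicit(&foo_inited, memory_order_relaxed)) {
		err = init_foo();
		if (err == 0) /* pairs with the acquire load above */
			atomic_store_explicit(&foo_inited, true,
					      memory_order_release);
	}
	pthread_mutex_unlock(&foo_init_mutex);
	return err;
}

static void *worker(void *arg)
{
	(void)arg;
	if (init_foo_if_needed() == 0)
		assert(foo_value == 42); /* initialized data must be visible */
	return NULL;
}

/* Races four threads on first use; returns how many times init_foo() ran. */
int run_demo(void)
{
	pthread_t t[4];

	for (int i = 0; i < 4; i++)
		pthread_create(&t[i], NULL, worker, NULL);
	for (int i = 0; i < 4; i++)
		pthread_join(t[i], NULL);
	return foo_init_calls;
}
```

Calling run_demo() should report that init_foo() ran exactly once, however
the threads interleave. The C11 orderings only approximate the kernel
primitives, so treat this as an illustration and not as a statement about
the Linux-kernel memory model.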

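The compare-and-exchange recipe can be sketched in userspace in the same
spirit. This is again an assumption-laden analogue and not kernel code:
atomic_compare_exchange_strong_explicit() with release ordering on success
and relaxed ordering on failure stands in for cmpxchg_release(), and the
explicit second acquire load mirrors the one the patch requires. The layout
of struct foo, the bodies of alloc_foo() and free_foo(), and demo() are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical one-time data. */
struct foo {
	int value;
};

static _Atomic(struct foo *) foo;

/* Hypothetical allocator: returns a fully initialized object or NULL. */
static struct foo *alloc_foo(void)
{
	struct foo *p = malloc(sizeof(*p));

	if (p)
		p->value = 42;
	return p;
}

static void free_foo(struct foo *p)
{
	free(p);
}

int init_foo_if_needed(void)
{
	struct foo *expected = NULL;
	struct foo *p;

	/* pairs with a successful compare-exchange (release) below */
	if (atomic_load_explicit(&foo, memory_order_acquire))
		return 0;

	p = alloc_foo();
	if (!p)
		return -ENOMEM;

	/* release on success; no ordering is requested on failure */
	if (!atomic_compare_exchange_strong_explicit(&foo, &expected, p,
						     memory_order_release,
						     memory_order_relaxed)) {
		/* Another thread published first: drop our copy ... */
		free_foo(p);
		/* ... and acquire the data that the winner released. */
		(void)atomic_load_explicit(&foo, memory_order_acquire);
	}
	return 0;
}

/* Single-threaded smoke check: a second call must not replace the object. */
int demo(void)
{
	struct foo *first, *second;

	if (init_foo_if_needed() != 0)
		return -1;
	first = atomic_load_explicit(&foo, memory_order_acquire);
	if (init_foo_if_needed() != 0)
		return -1;
	second = atomic_load_explicit(&foo, memory_order_acquire);
	assert(first == second);
	return first->value;
}
```

Unlike the kernel's cmpxchg family, the C11 call lets the caller name a
failure ordering, so a userspace version could request acquire on failure
instead of issuing the second load. The sketch keeps the second load so
that it stays close to the recipe in the patch.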
Thread overview: 39+ messages
2020-07-17  4:44 [PATCH] tools/memory-model: document the "one-time init" pattern Eric Biggers
2020-07-17  5:49 ` Sedat Dilek
2020-07-17 12:35 ` Matthew Wilcox
2020-07-17 14:26 ` Alan Stern
2020-07-17 17:47 ` Matthew Wilcox
2020-07-17 17:51   ` Alan Stern
2020-07-18  1:02     ` Eric Biggers
2020-07-27 12:51       ` Matthew Wilcox
2020-07-17 21:05   ` Darrick J. Wong
2020-07-18  0:44   ` Darrick J. Wong
2020-07-18  1:38   ` Eric Biggers
2020-07-18  2:13     ` Matthew Wilcox
2020-07-18  5:28       ` Eric Biggers
2020-07-18 14:35         ` Alan Stern
2020-07-20  2:07         ` Dave Chinner
2020-07-20  9:00           ` Peter Zijlstra
2020-07-27 15:17         ` Alan Stern
2020-07-27 15:28           ` Matthew Wilcox
2020-07-27 16:01             ` Paul E. McKenney
2020-07-27 16:31             ` Alan Stern
2020-07-27 16:59               ` Matthew Wilcox
2020-07-27 19:13                 ` Alan Stern
2020-07-17 20:53 ` Darrick J. Wong
2020-07-18  0:58   ` Eric Biggers
2020-07-18  1:25     ` Alan Stern
2020-07-18  1:40       ` Matthew Wilcox
2020-07-18  2:00       ` Dave Chinner
2020-07-18 14:21         ` Alan Stern
2020-07-18  2:00       ` Eric Biggers
2020-07-18  1:42 ` Dave Chinner
2020-07-18 14:08   ` Alan Stern
2020-07-20  1:33     ` Dave Chinner
2020-07-20 14:52       ` Alan Stern
2020-07-20 15:37         ` Darrick J. Wong
2020-07-20 15:39         ` Matthew Wilcox
2020-07-20 16:04           ` Paul E. McKenney
2020-07-20 16:48             ` peterz
2020-07-20 22:06               ` Paul E. McKenney
2020-07-20 16:12           ` Alan Stern
