Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH] tools/memory-model: document the "one-time init" pattern
@ 2020-07-17  4:44 Eric Biggers
  2020-07-17  5:49 ` Sedat Dilek
                   ` (5 more replies)
  0 siblings, 6 replies; 39+ messages in thread
From: Eric Biggers @ 2020-07-17  4:44 UTC (permalink / raw)
  To: linux-kernel, linux-arch, Paul E . McKenney
  Cc: linux-fsdevel, Akira Yokosawa, Alan Stern, Andrea Parri,
	Boqun Feng, Daniel Lustig, Darrick J . Wong, Dave Chinner,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon

From: Eric Biggers <ebiggers@google.com>

The "one-time init" pattern is implemented incorrectly in various places
in the kernel.  And when people do try to implement it correctly, it is
unclear what to use.  Try to give some proper guidance.

This is motivated by the discussion at
https://lkml.kernel.org/linux-fsdevel/20200713033330.205104-1-ebiggers@kernel.org/T/#u
regarding fixing the initialization of super_block::s_dio_done_wq.

Cc: Akira Yokosawa <akiyks@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Lustig <dlustig@nvidia.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jade Alglave <j.alglave@ucl.ac.uk>
Cc: Luc Maranget <luc.maranget@inria.fr>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 tools/memory-model/Documentation/recipes.txt | 151 +++++++++++++++++++
 1 file changed, 151 insertions(+)

diff --git a/tools/memory-model/Documentation/recipes.txt b/tools/memory-model/Documentation/recipes.txt
index 7fe8d7aa3029..04beb06dbfc7 100644
--- a/tools/memory-model/Documentation/recipes.txt
+++ b/tools/memory-model/Documentation/recipes.txt
@@ -519,6 +519,157 @@ CPU1 puts the waiting task to sleep and CPU0 fails to wake it up.
 
 Note that use of locking can greatly simplify this pattern.
 
+One-time init
+-------------
+
+The "one-time init" pattern is when multiple tasks can race to
+initialize the same data structure(s) on first use.
+
+In many cases, it's best to just avoid the need for this by simply
+initializing the data ahead of time.
+
+But in cases where the data would often go unused, one-time init can be
+appropriate to avoid wasting kernel resources.  It can also be
+appropriate if the initialization has other prerequisites which preclude
+it being done ahead of time.
+
+First, consider if your data has (a) global or static scope, (b) can be
+initialized from atomic context, and (c) cannot fail to be initialized.
+If all of those apply, just use DO_ONCE() from <linux/once.h>:
+
+	DO_ONCE(func);
+
+If that doesn't apply, you'll have to implement one-time init yourself.
+
+The simplest implementation just uses a mutex and an 'inited' flag.
+This implementation should be used where feasible:
+
+	static bool foo_inited;
+	static DEFINE_MUTEX(foo_init_mutex);
+
+	int init_foo_if_needed(void)
+	{
+		int err = 0;
+
+		mutex_lock(&foo_init_mutex);
+		if (!foo_inited) {
+			err = init_foo();
+			if (err == 0)
+				foo_inited = true;
+		}
+		mutex_unlock(&foo_init_mutex);
+		return err;
+	}
+
+The above example uses static variables, but this solution also works
+for initializing something that is part of another data structure.  The
+mutex may still be static.
+
+In where cases where taking the mutex in the "already initialized" case
+presents scalability concerns, the implementation can be optimized to
+check the 'inited' flag outside the mutex.  Unfortunately, this
+optimization is often implemented incorrectly by using a plain load.
+That violates the memory model and may result in unpredictable behavior.
+
+A correct implementation is:
+
+	static bool foo_inited;
+	static DEFINE_MUTEX(foo_init_mutex);
+
+	int init_foo_if_needed(void)
+	{
+		int err = 0;
+
+		/* pairs with smp_store_release() below */
+		if (smp_load_acquire(&foo_inited))
+			return 0;
+
+		mutex_lock(&foo_init_mutex);
+		if (!foo_inited) {
+			err = init_foo();
+			if (err == 0) /* pairs with smp_load_acquire() above */
+				smp_store_release(&foo_inited, true);
+		}
+		mutex_unlock(&foo_init_mutex);
+		return err;
+	}
+
+If only a single data structure is being initialized, then the pointer
+itself can take the place of the 'inited' flag:
+
+	static struct foo *foo;
+	static DEFINE_MUTEX(foo_init_mutex);
+
+	int init_foo_if_needed(void)
+	{
+		int err = 0;
+
+		/* pairs with smp_store_release() below */
+		if (smp_load_acquire(&foo))
+			return 0;
+
+		mutex_lock(&foo_init_mutex);
+		if (!foo) {
+			struct foo *p = alloc_foo();
+
+			if (p) /* pairs with smp_load_acquire() above */
+				smp_store_release(&foo, p);
+			else
+				err = -ENOMEM;
+		}
+		mutex_unlock(&foo_init_mutex);
+		return err;
+	}
+
+There are also cases in which the smp_load_acquire() can be replaced by
+the more lightweight READ_ONCE().  (smp_store_release() is still
+required.)  Specifically, if all initialized memory is transitively
+reachable from the pointer itself, then there is no control dependency
+so the data dependency barrier provided by READ_ONCE() is sufficient.
+
+However, using the READ_ONCE() optimization is discouraged for
+nontrivial data structures, as it can be difficult to determine if there
+is a control dependency.  For complex data structures it may depend on
+internal implementation details of other kernel subsystems.
+
+For the single-pointer case, a further optimized implementation
+eliminates the mutex and instead uses compare-and-exchange:
+
+	static struct foo *foo;
+
+	int init_foo_if_needed(void)
+	{
+		struct foo *p;
+
+		/* pairs with successful cmpxchg_release() below */
+		if (smp_load_acquire(&foo))
+			return 0;
+
+		p = alloc_foo();
+		if (!p)
+			return -ENOMEM;
+
+		/* on success, pairs with smp_load_acquire() above and below */
+		if (cmpxchg_release(&foo, NULL, p) != NULL) {
+			free_foo(p);
+			/* pairs with successful cmpxchg_release() above */
+			smp_load_acquire(&foo);
+		}
+		return 0;
+	}
+
+Note that when the cmpxchg_release() fails due to another task already
+having done it, a second smp_load_acquire() is required, since we still
+need to acquire the data that the other task released.  You may be
+tempted to upgrade cmpxchg_release() to cmpxchg() with the goal of it
+acting as both ACQUIRE and RELEASE, but that doesn't work here because
+cmpxchg() only guarantees memory ordering if it succeeds.
+
+Because of the above subtlety, the version with the mutex instead of
+cmpxchg_release() should be preferred, except potentially in cases where
+it is difficult to provide anything other than a global mutex and where
+the one-time data is part of a frequently allocated structure.  In that
+case, a global mutex might present scalability concerns.
 
 Rules of thumb
 ==============
-- 
2.27.0


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17  4:44 [PATCH] tools/memory-model: document the "one-time init" pattern Eric Biggers
@ 2020-07-17  5:49 ` Sedat Dilek
  2020-07-17 12:35 ` Matthew Wilcox
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 39+ messages in thread
From: Sedat Dilek @ 2020-07-17  5:49 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-kernel, linux-arch, Paul E . McKenney, linux-fsdevel,
	Akira Yokosawa, Alan Stern, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 6:48 AM Eric Biggers <ebiggers@kernel.org> wrote:
>
> From: Eric Biggers <ebiggers@google.com>
...
> This is motivated by the discussion at
> https://lkml.kernel.org/linux-fsdevel/20200713033330.205104-1-ebiggers@kernel.org/T/#u
...
> +In where cases where taking the mutex in the "already initialized" case

"In cases where..." (drop first "where")

> +presents scalability concerns, the implementation can be optimized to
> +check the 'inited' flag outside the mutex.  Unfortunately, this
> +optimization is often implemented incorrectly by using a plain load.
> +That violates the memory model and may result in unpredictable behavior.

- Sedat -

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17  4:44 [PATCH] tools/memory-model: document the "one-time init" pattern Eric Biggers
  2020-07-17  5:49 ` Sedat Dilek
@ 2020-07-17 12:35 ` Matthew Wilcox
  2020-07-17 14:26 ` Alan Stern
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 39+ messages in thread
From: Matthew Wilcox @ 2020-07-17 12:35 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-kernel, linux-arch, Paul E . McKenney, linux-fsdevel,
	Akira Yokosawa, Alan Stern, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> +The simplest implementation just uses a mutex and an 'inited' flag.

There's a perfectly good real word "initialised" / initialized.
https://chambers.co.uk/search/?query=inited&title=21st

> +For the single-pointer case, a further optimized implementation
> +eliminates the mutex and instead uses compare-and-exchange:
> +
> +	static struct foo *foo;
> +
> +	int init_foo_if_needed(void)
> +	{
> +		struct foo *p;
> +
> +		/* pairs with successful cmpxchg_release() below */
> +		if (smp_load_acquire(&foo))
> +			return 0;
> +
> +		p = alloc_foo();
> +		if (!p)
> +			return -ENOMEM;
> +
> +		/* on success, pairs with smp_load_acquire() above and below */
> +		if (cmpxchg_release(&foo, NULL, p) != NULL) {
> +			free_foo(p);
> +			/* pairs with successful cmpxchg_release() above */
> +			smp_load_acquire(&foo);
> +		}
> +		return 0;
> +	}
> +
> +Note that when the cmpxchg_release() fails due to another task already
> +having done it, a second smp_load_acquire() is required, since we still
> +need to acquire the data that the other task released.  You may be
> +tempted to upgrade cmpxchg_release() to cmpxchg() with the goal of it
> +acting as both ACQUIRE and RELEASE, but that doesn't work here because
> +cmpxchg() only guarantees memory ordering if it succeeds.
> +
> +Because of the above subtlety, the version with the mutex instead of
> +cmpxchg_release() should be preferred, except potentially in cases where
> +it is difficult to provide anything other than a global mutex and where
> +the one-time data is part of a frequently allocated structure.  In that
> +case, a global mutex might present scalability concerns.

There are concerns other than scalability where we might want to eliminate
the mutex.  For example, if (likely) alloc_foo() needs to allocate memory
and we would need foo to perform page writeback, then either we must
allocate foo using GFP_NOFS or do without the mutex, lest we deadlock
on this new mutex.

You might think this would argue for just using GFP_NOFS always, but
GFP_NOFS is a big hammer which forbids reclaiming from any filesystem,
whereas we might only need this foo to reclaim from a particular
filesystem.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17  4:44 [PATCH] tools/memory-model: document the "one-time init" pattern Eric Biggers
  2020-07-17  5:49 ` Sedat Dilek
  2020-07-17 12:35 ` Matthew Wilcox
@ 2020-07-17 14:26 ` Alan Stern
  2020-07-17 17:47 ` Matthew Wilcox
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 39+ messages in thread
From: Alan Stern @ 2020-07-17 14:26 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-kernel, linux-arch, Paul E . McKenney, linux-fsdevel,
	Akira Yokosawa, Andrea Parri, Boqun Feng, Daniel Lustig,
	Darrick J . Wong, Dave Chinner, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:

...
> +Note that when the cmpxchg_release() fails due to another task already
> +having done it, a second smp_load_acquire() is required, since we still
> +need to acquire the data that the other task released.

When people talk about releasing data, they usually mean something 
different from what you mean here (i.e., they usually mean 
deallocating).  Likewise, acquiring data (to the extent that it means 
anything) would generally be regarded as meaning a simple read.  I 
recommend changing the last phrase above to:

	... since we still need a load-acquire of the data on which
	the other task performed a store-release.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17  4:44 [PATCH] tools/memory-model: document the "one-time init" pattern Eric Biggers
                   ` (2 preceding siblings ...)
  2020-07-17 14:26 ` Alan Stern
@ 2020-07-17 17:47 ` Matthew Wilcox
  2020-07-17 17:51   ` Alan Stern
                     ` (3 more replies)
  2020-07-17 20:53 ` Darrick J. Wong
  2020-07-18  1:42 ` Dave Chinner
  5 siblings, 4 replies; 39+ messages in thread
From: Matthew Wilcox @ 2020-07-17 17:47 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-kernel, linux-arch, Paul E . McKenney, linux-fsdevel,
	Akira Yokosawa, Alan Stern, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> +If that doesn't apply, you'll have to implement one-time init yourself.
> +
> +The simplest implementation just uses a mutex and an 'inited' flag.
> +This implementation should be used where feasible:

I think some syntactic sugar should make it feasible for normal people
to implement the most efficient version of this just like they use locks.

> +For the single-pointer case, a further optimized implementation
> +eliminates the mutex and instead uses compare-and-exchange:
> +
> +	static struct foo *foo;
> +
> +	int init_foo_if_needed(void)
> +	{
> +		struct foo *p;
> +
> +		/* pairs with successful cmpxchg_release() below */
> +		if (smp_load_acquire(&foo))
> +			return 0;
> +
> +		p = alloc_foo();
> +		if (!p)
> +			return -ENOMEM;
> +
> +		/* on success, pairs with smp_load_acquire() above and below */
> +		if (cmpxchg_release(&foo, NULL, p) != NULL) {

Why do we have cmpxchg_release() anyway?  Under what circumstances is
cmpxchg() useful _without_ having release semantics?

> +			free_foo(p);
> +			/* pairs with successful cmpxchg_release() above */
> +			smp_load_acquire(&foo);
> +		}
> +		return 0;
> +	}

How about something like this ...

once.h:

static struct init_once_pointer {
	void *p;
};

static inline void *once_get(struct init_once_pointer *oncep)
{ ... }

static inline bool once_store(struct init_once_pointer *oncep, void *p)
{ ... }

--- foo.c ---

struct foo *get_foo(gfp_t gfp)
{
	static struct init_once_pointer my_foo;
	struct foo *foop;

	foop = once_get(&my_foo);
	if (foop)
		return foop;

	foop = alloc_foo(gfp);
	if (!once_store(&my_foo, foop)) {
		free_foo(foop);
		foop = once_get(&my_foo);
	}

	return foop;
}

Any kernel programmer should be able to handle that pattern.  And no mutex!

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17 17:47 ` Matthew Wilcox
@ 2020-07-17 17:51   ` Alan Stern
  2020-07-18  1:02     ` Eric Biggers
  2020-07-17 21:05   ` Darrick J. Wong
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 39+ messages in thread
From: Alan Stern @ 2020-07-17 17:51 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Eric Biggers, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 06:47:50PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
...
> > +		/* on success, pairs with smp_load_acquire() above and below */
> > +		if (cmpxchg_release(&foo, NULL, p) != NULL) {
> 
> Why do we have cmpxchg_release() anyway?  Under what circumstances is
> cmpxchg() useful _without_ having release semantics?

To answer just the last question: cmpxchg() is useful for lock 
acquisition, in which case it needs to have acquire semantics rather 
than release semantics.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17  4:44 [PATCH] tools/memory-model: document the "one-time init" pattern Eric Biggers
                   ` (3 preceding siblings ...)
  2020-07-17 17:47 ` Matthew Wilcox
@ 2020-07-17 20:53 ` Darrick J. Wong
  2020-07-18  0:58   ` Eric Biggers
  2020-07-18  1:42 ` Dave Chinner
  5 siblings, 1 reply; 39+ messages in thread
From: Darrick J. Wong @ 2020-07-17 20:53 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-kernel, linux-arch, Paul E . McKenney, linux-fsdevel,
	Akira Yokosawa, Alan Stern, Andrea Parri, Boqun Feng,
	Daniel Lustig, Dave Chinner, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> The "one-time init" pattern is implemented incorrectly in various places
> in the kernel.  And when people do try to implement it correctly, it is
> unclear what to use.  Try to give some proper guidance.
> 
> This is motivated by the discussion at
> https://lkml.kernel.org/linux-fsdevel/20200713033330.205104-1-ebiggers@kernel.org/T/#u
> regarding fixing the initialization of super_block::s_dio_done_wq.
> 
> Cc: Akira Yokosawa <akiyks@gmail.com>
> Cc: Alan Stern <stern@rowland.harvard.edu>
> Cc: Andrea Parri <parri.andrea@gmail.com>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Daniel Lustig <dlustig@nvidia.com>
> Cc: Darrick J. Wong <darrick.wong@oracle.com>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Jade Alglave <j.alglave@ucl.ac.uk>
> Cc: Luc Maranget <luc.maranget@inria.fr>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Will Deacon <will@kernel.org>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  tools/memory-model/Documentation/recipes.txt | 151 +++++++++++++++++++
>  1 file changed, 151 insertions(+)
> 
> diff --git a/tools/memory-model/Documentation/recipes.txt b/tools/memory-model/Documentation/recipes.txt
> index 7fe8d7aa3029..04beb06dbfc7 100644
> --- a/tools/memory-model/Documentation/recipes.txt
> +++ b/tools/memory-model/Documentation/recipes.txt
> @@ -519,6 +519,157 @@ CPU1 puts the waiting task to sleep and CPU0 fails to wake it up.
>  
>  Note that use of locking can greatly simplify this pattern.
>  
> +One-time init
> +-------------
> +
> +The "one-time init" pattern is when multiple tasks can race to
> +initialize the same data structure(s) on first use.
> +
> +In many cases, it's best to just avoid the need for this by simply
> +initializing the data ahead of time.
> +
> +But in cases where the data would often go unused, one-time init can be
> +appropriate to avoid wasting kernel resources.  It can also be
> +appropriate if the initialization has other prerequisites which preclude
> +it being done ahead of time.
> +
> +First, consider if your data has (a) global or static scope, (b) can be
> +initialized from atomic context, and (c) cannot fail to be initialized.
> +If all of those apply, just use DO_ONCE() from <linux/once.h>:
> +
> +	DO_ONCE(func);
> +
> +If that doesn't apply, you'll have to implement one-time init yourself.
> +
> +The simplest implementation just uses a mutex and an 'inited' flag.
> +This implementation should be used where feasible:
> +
> +	static bool foo_inited;
> +	static DEFINE_MUTEX(foo_init_mutex);
> +
> +	int init_foo_if_needed(void)
> +	{
> +		int err = 0;
> +
> +		mutex_lock(&foo_init_mutex);
> +		if (!foo_inited) {
> +			err = init_foo();
> +			if (err == 0)
> +				foo_inited = true;
> +		}
> +		mutex_unlock(&foo_init_mutex);
> +		return err;
> +	}
> +
> +The above example uses static variables, but this solution also works
> +for initializing something that is part of another data structure.  The
> +mutex may still be static.
> +
> +In where cases where taking the mutex in the "already initialized" case
> +presents scalability concerns, the implementation can be optimized to
> +check the 'inited' flag outside the mutex.  Unfortunately, this
> +optimization is often implemented incorrectly by using a plain load.
> +That violates the memory model and may result in unpredictable behavior.
> +
> +A correct implementation is:
> +
> +	static bool foo_inited;
> +	static DEFINE_MUTEX(foo_init_mutex);
> +
> +	int init_foo_if_needed(void)
> +	{
> +		int err = 0;
> +
> +		/* pairs with smp_store_release() below */
> +		if (smp_load_acquire(&foo_inited))
> +			return 0;
> +
> +		mutex_lock(&foo_init_mutex);
> +		if (!foo_inited) {
> +			err = init_foo();
> +			if (err == 0) /* pairs with smp_load_acquire() above */
> +				smp_store_release(&foo_inited, true);
> +		}
> +		mutex_unlock(&foo_init_mutex);
> +		return err;
> +	}
> +
> +If only a single data structure is being initialized, then the pointer
> +itself can take the place of the 'inited' flag:
> +
> +	static struct foo *foo;
> +	static DEFINE_MUTEX(foo_init_mutex);
> +
> +	int init_foo_if_needed(void)
> +	{
> +		int err = 0;
> +
> +		/* pairs with smp_store_release() below */
> +		if (smp_load_acquire(&foo))
> +			return 0;
> +
> +		mutex_lock(&foo_init_mutex);
> +		if (!foo) {
> +			struct foo *p = alloc_foo();
> +
> +			if (p) /* pairs with smp_load_acquire() above */
> +				smp_store_release(&foo, p);
> +			else
> +				err = -ENOMEM;
> +		}
> +		mutex_unlock(&foo_init_mutex);
> +		return err;
> +	}
> +
> +There are also cases in which the smp_load_acquire() can be replaced by
> +the more lightweight READ_ONCE().  (smp_store_release() is still
> +required.)  Specifically, if all initialized memory is transitively
> +reachable from the pointer itself, then there is no control dependency

I don't quite understand what "transitively reachable from the pointer
itself" means?  Does that describe the situation where all the objects
reachable through the object that the global struct foo pointer points
at are /only/ reachable via that global pointer?

Aside from that question, I very much like having this recipe in the
kernel documentation.  Thank you for putting this together!

--D

> +so the data dependency barrier provided by READ_ONCE() is sufficient.
> +
> +However, using the READ_ONCE() optimization is discouraged for
> +nontrivial data structures, as it can be difficult to determine if there
> +is a control dependency.  For complex data structures it may depend on
> +internal implementation details of other kernel subsystems.
> +
> +For the single-pointer case, a further optimized implementation
> +eliminates the mutex and instead uses compare-and-exchange:
> +
> +	static struct foo *foo;
> +
> +	int init_foo_if_needed(void)
> +	{
> +		struct foo *p;
> +
> +		/* pairs with successful cmpxchg_release() below */
> +		if (smp_load_acquire(&foo))
> +			return 0;
> +
> +		p = alloc_foo();
> +		if (!p)
> +			return -ENOMEM;
> +
> +		/* on success, pairs with smp_load_acquire() above and below */
> +		if (cmpxchg_release(&foo, NULL, p) != NULL) {
> +			free_foo(p);
> +			/* pairs with successful cmpxchg_release() above */
> +			smp_load_acquire(&foo);
> +		}
> +		return 0;
> +	}
> +
> +Note that when the cmpxchg_release() fails due to another task already
> +having done it, a second smp_load_acquire() is required, since we still
> +need to acquire the data that the other task released.  You may be
> +tempted to upgrade cmpxchg_release() to cmpxchg() with the goal of it
> +acting as both ACQUIRE and RELEASE, but that doesn't work here because
> +cmpxchg() only guarantees memory ordering if it succeeds.
> +
> +Because of the above subtlety, the version with the mutex instead of
> +cmpxchg_release() should be preferred, except potentially in cases where
> +it is difficult to provide anything other than a global mutex and where
> +the one-time data is part of a frequently allocated structure.  In that
> +case, a global mutex might present scalability concerns.
>  
>  Rules of thumb
>  ==============
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17 17:47 ` Matthew Wilcox
  2020-07-17 17:51   ` Alan Stern
@ 2020-07-17 21:05   ` Darrick J. Wong
  2020-07-18  0:44   ` Darrick J. Wong
  2020-07-18  1:38   ` Eric Biggers
  3 siblings, 0 replies; 39+ messages in thread
From: Darrick J. Wong @ 2020-07-17 21:05 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Eric Biggers, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Alan Stern, Andrea Parri,
	Boqun Feng, Daniel Lustig, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 06:47:50PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> > +If that doesn't apply, you'll have to implement one-time init yourself.
> > +
> > +The simplest implementation just uses a mutex and an 'inited' flag.
> > +This implementation should be used where feasible:
> 
> I think some syntactic sugar should make it feasible for normal people
> to implement the most efficient version of this just like they use locks.
> 
> > +For the single-pointer case, a further optimized implementation
> > +eliminates the mutex and instead uses compare-and-exchange:
> > +
> > +	static struct foo *foo;
> > +
> > +	int init_foo_if_needed(void)
> > +	{
> > +		struct foo *p;
> > +
> > +		/* pairs with successful cmpxchg_release() below */
> > +		if (smp_load_acquire(&foo))
> > +			return 0;
> > +
> > +		p = alloc_foo();
> > +		if (!p)
> > +			return -ENOMEM;
> > +
> > +		/* on success, pairs with smp_load_acquire() above and below */
> > +		if (cmpxchg_release(&foo, NULL, p) != NULL) {
> 
> Why do we have cmpxchg_release() anyway?  Under what circumstances is
> cmpxchg() useful _without_ having release semantics?
> 
> > +			free_foo(p);
> > +			/* pairs with successful cmpxchg_release() above */
> > +			smp_load_acquire(&foo);
> > +		}
> > +		return 0;
> > +	}
> 
> How about something like this ...
> 
> once.h:
> 
> static struct init_once_pointer {
> 	void *p;
> };
> 
> static inline void *once_get(struct init_once_pointer *oncep)
> { ... }
> 
> static inline bool once_store(struct init_once_pointer *oncep, void *p)
> { ... }
> 
> --- foo.c ---
> 
> struct foo *get_foo(gfp_t gfp)
> {
> 	static struct init_once_pointer my_foo;
> 	struct foo *foop;
> 
> 	foop = once_get(&my_foo);
> 	if (foop)
> 		return foop;
> 
> 	foop = alloc_foo(gfp);
> 	if (!once_store(&my_foo, foop)) {
> 		free_foo(foop);
> 		foop = once_get(&my_foo);
> 	}
> 
> 	return foop;
> }
> 
> Any kernel programmer should be able to handle that pattern.  And no mutex!

I like it... :)

--D

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17 17:47 ` Matthew Wilcox
  2020-07-17 17:51   ` Alan Stern
  2020-07-17 21:05   ` Darrick J. Wong
@ 2020-07-18  0:44   ` Darrick J. Wong
  2020-07-18  1:38   ` Eric Biggers
  3 siblings, 0 replies; 39+ messages in thread
From: Darrick J. Wong @ 2020-07-18  0:44 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Eric Biggers, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Alan Stern, Andrea Parri,
	Boqun Feng, Daniel Lustig, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 06:47:50PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> > +If that doesn't apply, you'll have to implement one-time init yourself.
> > +
> > +The simplest implementation just uses a mutex and an 'inited' flag.
> > +This implementation should be used where feasible:
> 
> I think some syntactic sugar should make it feasible for normal people
> to implement the most efficient version of this just like they use locks.
> 
> > +For the single-pointer case, a further optimized implementation
> > +eliminates the mutex and instead uses compare-and-exchange:
> > +
> > +	static struct foo *foo;
> > +
> > +	int init_foo_if_needed(void)
> > +	{
> > +		struct foo *p;
> > +
> > +		/* pairs with successful cmpxchg_release() below */
> > +		if (smp_load_acquire(&foo))
> > +			return 0;
> > +
> > +		p = alloc_foo();
> > +		if (!p)
> > +			return -ENOMEM;
> > +
> > +		/* on success, pairs with smp_load_acquire() above and below */
> > +		if (cmpxchg_release(&foo, NULL, p) != NULL) {
> 
> Why do we have cmpxchg_release() anyway?  Under what circumstances is
> cmpxchg() useful _without_ having release semantics?
> 
> > +			free_foo(p);
> > +			/* pairs with successful cmpxchg_release() above */
> > +			smp_load_acquire(&foo);
> > +		}
> > +		return 0;
> > +	}
> 
> How about something like this ...
> 
> once.h:
> 
> static struct init_once_pointer {
> 	void *p;
> };
> 
> static inline void *once_get(struct init_once_pointer *oncep)
> { ... }
> 
> static inline bool once_store(struct init_once_pointer *oncep, void *p)
> { ... }
> 
> --- foo.c ---
> 
> struct foo *get_foo(gfp_t gfp)
> {
> 	static struct init_once_pointer my_foo;
> 	struct foo *foop;
> 
> 	foop = once_get(&my_foo);
> 	if (foop)
> 		return foop;
> 
> 	foop = alloc_foo(gfp);
> 	if (!once_store(&my_foo, foop)) {
> 		free_foo(foop);
> 		foop = once_get(&my_foo);
> 	}
> 
> 	return foop;
> }
> 
> Any kernel programmer should be able to handle that pattern.  And no mutex!

That would be even better.

--D

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17 20:53 ` Darrick J. Wong
@ 2020-07-18  0:58   ` Eric Biggers
  2020-07-18  1:25     ` Alan Stern
  0 siblings, 1 reply; 39+ messages in thread
From: Eric Biggers @ 2020-07-18  0:58 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: linux-kernel, linux-arch, Paul E . McKenney, linux-fsdevel,
	Akira Yokosawa, Alan Stern, Andrea Parri, Boqun Feng,
	Daniel Lustig, Dave Chinner, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Fri, Jul 17, 2020 at 01:53:40PM -0700, Darrick J. Wong wrote:
> > +There are also cases in which the smp_load_acquire() can be replaced by
> > +the more lightweight READ_ONCE().  (smp_store_release() is still
> > +required.)  Specifically, if all initialized memory is transitively
> > +reachable from the pointer itself, then there is no control dependency
> 
> I don't quite understand what "transitively reachable from the pointer
> itself" means?  Does that describe the situation where all the objects
> reachable through the object that the global struct foo pointer points
> at are /only/ reachable via that global pointer?
> 

The intent is that "transitively reachable" means that all initialized memory
can be reached by dereferencing the pointer in some way, e.g. p->a->b[5]->c.

It could also be the case that allocating the object initializes some global or
static data, which isn't reachable in that way.  Access to that data would then
be a control dependency, which a data dependency barrier wouldn't work for.

It's possible I misunderstood something.  (Note the next paragraph does say that
using READ_ONCE() is discouraged, exactly for this reason -- it can be hard to
tell whether it's correct.)  Suggestions of what to write here are appreciated.

- Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17 17:51   ` Alan Stern
@ 2020-07-18  1:02     ` Eric Biggers
  2020-07-27 12:51       ` Matthew Wilcox
  0 siblings, 1 reply; 39+ messages in thread
From: Eric Biggers @ 2020-07-18  1:02 UTC (permalink / raw)
  To: Alan Stern
  Cc: Matthew Wilcox, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 01:51:38PM -0400, Alan Stern wrote:
> On Fri, Jul 17, 2020 at 06:47:50PM +0100, Matthew Wilcox wrote:
> > On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> ...
> > > +		/* on success, pairs with smp_load_acquire() above and below */
> > > +		if (cmpxchg_release(&foo, NULL, p) != NULL) {
> > 
> > Why do we have cmpxchg_release() anyway?  Under what circumstances is
> > cmpxchg() useful _without_ having release semantics?
> 
> To answer just the last question: cmpxchg() is useful for lock 
> acquisition, in which case it needs to have acquire semantics rather 
> than release semantics.
> 

To clarify, there are 4 versions of cmpxchg:

	cmpxchg(): does ACQUIRE and RELEASE (on success)
	cmpxchg_acquire(): does ACQUIRE only (on success)
	cmpxchg_release(): does RELEASE only (on success)
	cmpxchg_relaxed(): no barriers

The problem here is that here we need RELEASE on success and ACQUIRE on failure.
But no version guarantees any barrier on failure.

So as far as I can tell, the best we can do is use cmpxchg_release() (or
cmpxchg() which would be stronger but unnecessary), followed by a separate
ACQUIRE on failure.

- Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  0:58   ` Eric Biggers
@ 2020-07-18  1:25     ` Alan Stern
  2020-07-18  1:40       ` Matthew Wilcox
                         ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Alan Stern @ 2020-07-18  1:25 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Darrick J. Wong, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Dave Chinner, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Fri, Jul 17, 2020 at 05:58:57PM -0700, Eric Biggers wrote:
> On Fri, Jul 17, 2020 at 01:53:40PM -0700, Darrick J. Wong wrote:
> > > +There are also cases in which the smp_load_acquire() can be replaced by
> > > +the more lightweight READ_ONCE().  (smp_store_release() is still
> > > +required.)  Specifically, if all initialized memory is transitively
> > > +reachable from the pointer itself, then there is no control dependency
> > 
> > I don't quite understand what "transitively reachable from the pointer
> > itself" means?  Does that describe the situation where all the objects
> > reachable through the object that the global struct foo pointer points
> > at are /only/ reachable via that global pointer?
> > 
> 
> The intent is that "transitively reachable" means that all initialized memory
> can be reached by dereferencing the pointer in some way, e.g. p->a->b[5]->c.
> 
> It could also be the case that allocating the object initializes some global or
> static data, which isn't reachable in that way.  Access to that data would then
> be a control dependency, which a data dependency barrier wouldn't work for.
> 
> It's possible I misunderstood something.  (Note the next paragraph does say that
> using READ_ONCE() is discouraged, exactly for this reason -- it can be hard to
> tell whether it's correct.)  Suggestions of what to write here are appreciated.

Perhaps something like this:

	Specifically, if the only way to reach the initialized memory 
	involves dereferencing the pointer itself then READ_ONCE() is 
	sufficient.  This is because there will be an address dependency 
	between reading the pointer and accessing the memory, which will 
	ensure proper ordering.  But if some of the initialized memory 
	is reachable some other way (for example, if it is global or 
	static data) then there need not be an address dependency, 
	merely a control dependency (checking whether the pointer is 
	non-NULL).  Control dependencies do not always ensure ordering 
	-- certainly not for reads, and depending on the compiler, 
	possibly not for some writes -- and therefore a load-acquire is 
	necessary.

Perhaps this is more wordy than you want, but it does get the important 
ideas across.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17 17:47 ` Matthew Wilcox
                     ` (2 preceding siblings ...)
  2020-07-18  0:44   ` Darrick J. Wong
@ 2020-07-18  1:38   ` Eric Biggers
  2020-07-18  2:13     ` Matthew Wilcox
  3 siblings, 1 reply; 39+ messages in thread
From: Eric Biggers @ 2020-07-18  1:38 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, linux-arch, Paul E . McKenney, linux-fsdevel,
	Akira Yokosawa, Alan Stern, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 06:47:50PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> > +If that doesn't apply, you'll have to implement one-time init yourself.
> > +
> > +The simplest implementation just uses a mutex and an 'inited' flag.
> > +This implementation should be used where feasible:
> 
> I think some syntactic sugar should make it feasible for normal people
> to implement the most efficient version of this just like they use locks.

Note that the cmpxchg version is not necessarily the "most efficient".

If the one-time initialization is expensive, e.g. if it allocates a lot of
memory or if takes a long time, it could be better to use the mutex version so
that at most one task does it.

> How about something like this ...
> 
> once.h:
> 
> static struct init_once_pointer {
> 	void *p;
> };
> 
> static inline void *once_get(struct init_once_pointer *oncep)
> { ... }
> 
> static inline bool once_store(struct init_once_pointer *oncep, void *p)
> { ... }
> 
> --- foo.c ---
> 
> struct foo *get_foo(gfp_t gfp)
> {
> 	static struct init_once_pointer my_foo;
> 	struct foo *foop;
> 
> 	foop = once_get(&my_foo);
> 	if (foop)
> 		return foop;
> 
> 	foop = alloc_foo(gfp);
> 	if (!once_store(&my_foo, foop)) {
> 		free_foo(foop);
> 		foop = once_get(&my_foo);
> 	}
> 
> 	return foop;
> }
> 
> Any kernel programmer should be able to handle that pattern.  And no mutex!

I don't think this version would be worthwhile.  It eliminates type safety due
to the use of 'void *', and doesn't actually save any lines of code.  Nor does
it eliminate the need to correctly implement the cmpxchg failure case, which is
tricky (it must free the object and get the new one) and will be rarely tested.

It also forces all users of the struct to use this helper function to access it.
That could be considered a good thing, but it's also bad because even with
one-time init there's still usually some sort of ordering of "initialization"
vs. "use".  Just taking a random example I'm familiar with, we do one-time init
of inode::i_crypt_info when we open an encrypted file, so we guarantee it's set
for all I/O to the file, where we then simply access ->i_crypt_info directly.
We don't want the code to read like it's initializing ->i_crypt_info in the
middle of ->writepages(), since that would be wrong.

An improvement might be to make once_store() take the free function as a
parameter so that it would handle the failure case for you:

struct foo *get_foo(gfp_t gfp)
{
	static struct init_once_pointer my_foo;
	struct foo *foop;
 
 	foop = once_get(&my_foo);
 	if (!foop) {
		foop = alloc_foo(gfp);
		if (foop)
			once_store(&my_foo, foop, free_foo);
	}
 	return foop;
}

I'm not sure even that version would be worthwhile, though.

What I do like is DO_ONCE() in <linux/once.h>, which is used as just:

    DO_ONCE(func)

But it has limitations:

   - Only atomic context
   - Initialization can't fail
   - Only global/static data

We could address the first two limitations by splitting it into DO_ONCE_ATOMIC()
and DO_ONCE_BLOCKING(), and by allowing the initialization function to return an
error code.  That would make it work for all global/static data cases, I think.

Ideally we'd have something almost as simple for non-global/non-static data too.
I'm not sure the best way to do it, though.  It would have to be something more
complex like:

    ONETIME_INIT_DATA(&my_foo, alloc_foo, free_foo)

- Eric

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  1:25     ` Alan Stern
@ 2020-07-18  1:40       ` Matthew Wilcox
  2020-07-18  2:00       ` Dave Chinner
  2020-07-18  2:00       ` Eric Biggers
  2 siblings, 0 replies; 39+ messages in thread
From: Matthew Wilcox @ 2020-07-18  1:40 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric Biggers, Darrick J. Wong, linux-kernel, linux-arch,
	Paul E . McKenney, linux-fsdevel, Akira Yokosawa, Andrea Parri,
	Boqun Feng, Daniel Lustig, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 09:25:55PM -0400, Alan Stern wrote:
> On Fri, Jul 17, 2020 at 05:58:57PM -0700, Eric Biggers wrote:
> > On Fri, Jul 17, 2020 at 01:53:40PM -0700, Darrick J. Wong wrote:
> > > > +There are also cases in which the smp_load_acquire() can be replaced by
> > > > +the more lightweight READ_ONCE().  (smp_store_release() is still
> > > > +required.)  Specifically, if all initialized memory is transitively
> > > > +reachable from the pointer itself, then there is no control dependency
> > > 
> > > I don't quite understand what "transitively reachable from the pointer
> > > itself" means?  Does that describe the situation where all the objects
> > > reachable through the object that the global struct foo pointer points
> > > at are /only/ reachable via that global pointer?
> > > 
> > 
> > The intent is that "transitively reachable" means that all initialized memory
> > can be reached by dereferencing the pointer in some way, e.g. p->a->b[5]->c.
> > 
> > It could also be the case that allocating the object initializes some global or
> > static data, which isn't reachable in that way.  Access to that data would then
> > be a control dependency, which a data dependency barrier wouldn't work for.
> > 
> > It's possible I misunderstood something.  (Note the next paragraph does say that
> > using READ_ONCE() is discouraged, exactly for this reason -- it can be hard to
> > tell whether it's correct.)  Suggestions of what to write here are appreciated.
> 
> Perhaps something like this:
> 
> 	Specifically, if the only way to reach the initialized memory 
> 	involves dereferencing the pointer itself then READ_ONCE() is 
> 	sufficient.  This is because there will be an address dependency 
> 	between reading the pointer and accessing the memory, which will 
> 	ensure proper ordering.  But if some of the initialized memory 
> 	is reachable some other way (for example, if it is global or 
> 	static data) then there need not be an address dependency, 
> 	merely a control dependency (checking whether the pointer is 
> 	non-NULL).  Control dependencies do not always ensure ordering 
> 	-- certainly not for reads, and depending on the compiler, 
> 	possibly not for some writes -- and therefore a load-acquire is 
> 	necessary.
> 
> Perhaps this is more wordy than you want, but it does get the important 
> ideas across.

I don't think we should worry about wordsmithing this.  We should just
say "Use the init_pointer_once API" and then people who want to worry
about optimising the implementation of that API never have to talk to
the people who want to use that API.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-17  4:44 [PATCH] tools/memory-model: document the "one-time init" pattern Eric Biggers
                   ` (4 preceding siblings ...)
  2020-07-17 20:53 ` Darrick J. Wong
@ 2020-07-18  1:42 ` Dave Chinner
  2020-07-18 14:08   ` Alan Stern
  5 siblings, 1 reply; 39+ messages in thread
From: Dave Chinner @ 2020-07-18  1:42 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-kernel, linux-arch, Paul E . McKenney, linux-fsdevel,
	Akira Yokosawa, Alan Stern, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> The "one-time init" pattern is implemented incorrectly in various places
> in the kernel.  And when people do try to implement it correctly, it is
> unclear what to use.  Try to give some proper guidance.
> 
> This is motivated by the discussion at
> https://lkml.kernel.org/linux-fsdevel/20200713033330.205104-1-ebiggers@kernel.org/T/#u
> regarding fixing the initialization of super_block::s_dio_done_wq.

You're still using words that the target audience of the
documentation will not understand.

This is known as the "curse of knowledge" cognative bias, where
subject matter experts try to explain something to non-experts using
terms only subject matter experts understand....

This is one of the reasons that the LKMM documetnation is so damn
difficult to read and understand: just understanding the vocabulary
it uses requires a huge learning curve, and it's not defined
anywhere. Understanding the syntax of examples requires a huge
learning curve, because it's not defined anywhere. 

Recipes are *not useful* if you need to understand the LKMM
documenation to select the correct recipe to use. Recipes are not
useful if you say "here's 5 different variations of the same thing,
up to you to understand which one you need to use". Recipes are not
useful if changes in other code can silently break the recipe that
was selected by the user by carefully considering the most optimal
variant at the time they selected it.

i.e. Recipes are not for experts who understand the LKMM - recipes
are for developers who don't really understand how the LKMM all
works and just want a single, solid, reliable pattern they can use
just about everywhere for that specific operation.

Performance and optimisation doesn't even enter the picture here -
we need to provide a simple, easy to use and understand pattern that
just works. We need to stop making this harder than it should be.

So....

> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Will Deacon <will@kernel.org>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  tools/memory-model/Documentation/recipes.txt | 151 +++++++++++++++++++
>  1 file changed, 151 insertions(+)
> 
> diff --git a/tools/memory-model/Documentation/recipes.txt b/tools/memory-model/Documentation/recipes.txt
> index 7fe8d7aa3029..04beb06dbfc7 100644
> --- a/tools/memory-model/Documentation/recipes.txt
> +++ b/tools/memory-model/Documentation/recipes.txt
> @@ -519,6 +519,157 @@ CPU1 puts the waiting task to sleep and CPU0 fails to wake it up.
>  
>  Note that use of locking can greatly simplify this pattern.
>  
> +One-time init
> +-------------
> +
> +The "one-time init" pattern is when multiple tasks can race to
> +initialize the same data structure(s) on first use.
> +
> +In many cases, it's best to just avoid the need for this by simply
> +initializing the data ahead of time.
> +
> +But in cases where the data would often go unused, one-time init can be
> +appropriate to avoid wasting kernel resources.  It can also be
> +appropriate if the initialization has other prerequisites which preclude
> +it being done ahead of time.
> +
> +First, consider if your data has (a) global or static scope, (b) can be
> +initialized from atomic context, and (c) cannot fail to be initialized.
> +If all of those apply, just use DO_ONCE() from <linux/once.h>:
> +
> +	DO_ONCE(func);
> +
> +If that doesn't apply, you'll have to implement one-time init yourself.
> +
> +The simplest implementation just uses a mutex and an 'inited' flag.
> +This implementation should be used where feasible:
> +
> +	static bool foo_inited;
> +	static DEFINE_MUTEX(foo_init_mutex);
> +
> +	int init_foo_if_needed(void)
> +	{
> +		int err = 0;
> +
> +		mutex_lock(&foo_init_mutex);
> +		if (!foo_inited) {
> +			err = init_foo();
> +			if (err == 0)
> +				foo_inited = true;
> +		}
> +		mutex_unlock(&foo_init_mutex);
> +		return err;
> +	}
> +
> +The above example uses static variables, but this solution also works
> +for initializing something that is part of another data structure.  The
> +mutex may still be static.

All good up to here - people will see this and understand that this
is the pattern they want to use, and DO_ONCE() is a great, simple
API that is easy to use.

What needs to follow is a canonical example of how to do it
locklessly and efficiently, without describing conditional use of it
using words like "initialised memory is transitively reachable" (I
don't know WTF that means!).  Don't discuss potential optimisations,
control flow/data dependencies, etc, because failing to understand
those details are the reason people are looking for a simple recipe
that does what they need in the first place ("curse of knowledge").

However, I think the whole problem around code like this is that it
is being open-coded and that is the reason people get it wrong.
Hence I agree with Willy that this needs to be wrapped in a simple,
easy to use and hard to get wrong APIs for the patterns we expect to
see people use.

And the recipes should doucment the use of that API for the
init-once pattern, not try to teach people how to open-code their
own init-once pattern that they will continue to screw up....

As a result, I think the examples should document correct use of the
API for the two main variants it would be used for. The first
variant has an external "inited" flag that handles multiple
structure initialisations, and the second variant handles allocation
and initialisation of a single structure that is stored and accessed
by a single location.

Work out an API to do these things correctly, then write the recipes
to use them. Then people like yourself can argue all day and night
on how to best optimise them, and people like myself can just ignore
that all knowing that my init_once() call will always do the right
thing.

> +For the single-pointer case, a further optimized implementation
> +eliminates the mutex and instead uses compare-and-exchange:
> +
> +	static struct foo *foo;
> +
> +	int init_foo_if_needed(void)
> +	{
> +		struct foo *p;
> +
> +		/* pairs with successful cmpxchg_release() below */
> +		if (smp_load_acquire(&foo))
> +			return 0;
> +
> +		p = alloc_foo();
> +		if (!p)
> +			return -ENOMEM;
> +
> +		/* on success, pairs with smp_load_acquire() above and below */
> +		if (cmpxchg_release(&foo, NULL, p) != NULL) {
> +			free_foo(p);
> +			/* pairs with successful cmpxchg_release() above */
> +			smp_load_acquire(&foo);

This is the failure path, not the success. So it describing it as
pairing with "successful cmpxchg_release() above" when the code that
is executing is in the path where  the cmpxchg_release() above -just
failed- doesn't help anyone understand exactly what it is pairing
with.

This is why I said in that other thread "just saying 'pairs with
<foo>' is not sufficient to explain to the reader exactly what the
memory barrier is doing. You needed two full paragraphs to explain
why this was put here and that, to me, indicate the example and
expected use case is wrong.

But even then, I think this example is incorrect and doesn't fit the
patterns people might expect. That is, if this init function
-returned foo for the caller to use-, then the smp_load_acquire() on
failure is necessary to ensure the initialisation done by the racing
context is correct seen. But this function doesn't return foo, and
so the smp_load_acquire() is required in whatever context is trying
to access the contents of foo, not the init function.

Hence I think this example is likely incorrect and will lead to bugs
because it does not, in any way, indicate that
smp_load_acquire(&foo) must always be used in the contexts where foo
may accessed before (or during) the init function has been run...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  1:25     ` Alan Stern
  2020-07-18  1:40       ` Matthew Wilcox
@ 2020-07-18  2:00       ` Dave Chinner
  2020-07-18 14:21         ` Alan Stern
  2020-07-18  2:00       ` Eric Biggers
  2 siblings, 1 reply; 39+ messages in thread
From: Dave Chinner @ 2020-07-18  2:00 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric Biggers, Darrick J. Wong, linux-kernel, linux-arch,
	Paul E . McKenney, linux-fsdevel, Akira Yokosawa, Andrea Parri,
	Boqun Feng, Daniel Lustig, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Fri, Jul 17, 2020 at 09:25:55PM -0400, Alan Stern wrote:
> On Fri, Jul 17, 2020 at 05:58:57PM -0700, Eric Biggers wrote:
> > On Fri, Jul 17, 2020 at 01:53:40PM -0700, Darrick J. Wong wrote:
> > > > +There are also cases in which the smp_load_acquire() can be replaced by
> > > > +the more lightweight READ_ONCE().  (smp_store_release() is still
> > > > +required.)  Specifically, if all initialized memory is transitively
> > > > +reachable from the pointer itself, then there is no control dependency
> > > 
> > > I don't quite understand what "transitively reachable from the pointer
> > > itself" means?  Does that describe the situation where all the objects
> > > reachable through the object that the global struct foo pointer points
> > > at are /only/ reachable via that global pointer?
> > > 
> > 
> > The intent is that "transitively reachable" means that all initialized memory
> > can be reached by dereferencing the pointer in some way, e.g. p->a->b[5]->c.
> > 
> > It could also be the case that allocating the object initializes some global or
> > static data, which isn't reachable in that way.  Access to that data would then
> > be a control dependency, which a data dependency barrier wouldn't work for.
> > 
> > It's possible I misunderstood something.  (Note the next paragraph does say that
> > using READ_ONCE() is discouraged, exactly for this reason -- it can be hard to
> > tell whether it's correct.)  Suggestions of what to write here are appreciated.
> 
> Perhaps something like this:
> 
> 	Specifically, if the only way to reach the initialized memory 
> 	involves dereferencing the pointer itself then READ_ONCE() is 
> 	sufficient.  This is because there will be an address dependency 
> 	between reading the pointer and accessing the memory, which will 
> 	ensure proper ordering.  But if some of the initialized memory 
> 	is reachable some other way (for example, if it is global or 
> 	static data) then there need not be an address dependency, 
> 	merely a control dependency (checking whether the pointer is 
> 	non-NULL).  Control dependencies do not always ensure ordering 
> 	-- certainly not for reads, and depending on the compiler, 
> 	possibly not for some writes -- and therefore a load-acquire is 
> 	necessary.

Recipes are aimed at people who simply don't understand any of that
goobledegook. This won't help them -write correct code-.

> Perhaps this is more wordy than you want, but it does get the important 
> ideas across.

You think they are important because you understand what those words
mean.  Large numbers of developers do not understand what they mean,
nor how to put them into practise correctly.

Seriously: if you want people to use this stuff correctly, you need
to -dumb it down-, not make it even more challenging by explaining
words people don't understand with yet more words they don't
understand...

This is the "curse of knowledge" cognative bias in a nutshell.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  1:25     ` Alan Stern
  2020-07-18  1:40       ` Matthew Wilcox
  2020-07-18  2:00       ` Dave Chinner
@ 2020-07-18  2:00       ` Eric Biggers
  2 siblings, 0 replies; 39+ messages in thread
From: Eric Biggers @ 2020-07-18  2:00 UTC (permalink / raw)
  To: Alan Stern
  Cc: Darrick J. Wong, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Dave Chinner, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Fri, Jul 17, 2020 at 09:25:55PM -0400, Alan Stern wrote:
> On Fri, Jul 17, 2020 at 05:58:57PM -0700, Eric Biggers wrote:
> > On Fri, Jul 17, 2020 at 01:53:40PM -0700, Darrick J. Wong wrote:
> > > > +There are also cases in which the smp_load_acquire() can be replaced by
> > > > +the more lightweight READ_ONCE().  (smp_store_release() is still
> > > > +required.)  Specifically, if all initialized memory is transitively
> > > > +reachable from the pointer itself, then there is no control dependency
> > > 
> > > I don't quite understand what "transitively reachable from the pointer
> > > itself" means?  Does that describe the situation where all the objects
> > > reachable through the object that the global struct foo pointer points
> > > at are /only/ reachable via that global pointer?
> > > 
> > 
> > The intent is that "transitively reachable" means that all initialized memory
> > can be reached by dereferencing the pointer in some way, e.g. p->a->b[5]->c.
> > 
> > It could also be the case that allocating the object initializes some global or
> > static data, which isn't reachable in that way.  Access to that data would then
> > be a control dependency, which a data dependency barrier wouldn't work for.
> > 
> > It's possible I misunderstood something.  (Note the next paragraph does say that
> > using READ_ONCE() is discouraged, exactly for this reason -- it can be hard to
> > tell whether it's correct.)  Suggestions of what to write here are appreciated.
> 
> Perhaps something like this:
> 
> 	Specifically, if the only way to reach the initialized memory 
> 	involves dereferencing the pointer itself then READ_ONCE() is 
> 	sufficient.  This is because there will be an address dependency 
> 	between reading the pointer and accessing the memory, which will 
> 	ensure proper ordering.  But if some of the initialized memory 
> 	is reachable some other way (for example, if it is global or 
> 	static data) then there need not be an address dependency, 
> 	merely a control dependency (checking whether the pointer is 
> 	non-NULL).  Control dependencies do not always ensure ordering 
> 	-- certainly not for reads, and depending on the compiler, 
> 	possibly not for some writes -- and therefore a load-acquire is 
> 	necessary.
> 
> Perhaps this is more wordy than you want, but it does get the important 
> ideas across.
> 

How about:

	There are also cases in which the smp_load_acquire() can be replaced by
	the more lightweight READ_ONCE().  (smp_store_release() is still
	required.)  Specifically, if the only way to reach the initialized
	memory involves dereferencing the pointer itself, then the data
	dependency barrier provided by READ_ONCE() is sufficient.  However, if
	some of the initialized memory is reachable some other way (for example,
	if it is global or static data) then there need not be an address
	dependency, merely a control dependency (checking whether the pointer is
	non-NULL).  READ_ONCE() is *not* sufficient in that case.

	The optimization of replacing smp_load_acquire() with READ_ONCE() is
	discouraged for nontrivial data structures, since it can be difficult to
	determine if it is correct.  In particular, for complex data structures
	the correctness of the READ_ONCE() optimization may depend on internal
	implementation details of other kernel subsystems.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  1:38   ` Eric Biggers
@ 2020-07-18  2:13     ` Matthew Wilcox
  2020-07-18  5:28       ` Eric Biggers
  0 siblings, 1 reply; 39+ messages in thread
From: Matthew Wilcox @ 2020-07-18  2:13 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-kernel, linux-arch, Paul E . McKenney, linux-fsdevel,
	Akira Yokosawa, Alan Stern, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 06:38:39PM -0700, Eric Biggers wrote:
> On Fri, Jul 17, 2020 at 06:47:50PM +0100, Matthew Wilcox wrote:
> > On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> > > +If that doesn't apply, you'll have to implement one-time init yourself.
> > > +
> > > +The simplest implementation just uses a mutex and an 'inited' flag.
> > > +This implementation should be used where feasible:
> > 
> > I think some syntactic sugar should make it feasible for normal people
> > to implement the most efficient version of this just like they use locks.
> 
> Note that the cmpxchg version is not necessarily the "most efficient".
> 
> If the one-time initialization is expensive, e.g. if it allocates a lot of
> memory or if takes a long time, it could be better to use the mutex version so
> that at most one task does it.

Sure, but I think those are far less common than just allocating a single
thing.

> > How about something like this ...
> > 
> > once.h:
> > 
> > static struct init_once_pointer {
> > 	void *p;
> > };
> > 
> > static inline void *once_get(struct init_once_pointer *oncep)
> > { ... }
> > 
> > static inline bool once_store(struct init_once_pointer *oncep, void *p)
> > { ... }
> > 
> > --- foo.c ---
> > 
> > struct foo *get_foo(gfp_t gfp)
> > {
> > 	static struct init_once_pointer my_foo;
> > 	struct foo *foop;
> > 
> > 	foop = once_get(&my_foo);
> > 	if (foop)
> > 		return foop;
> > 
> > 	foop = alloc_foo(gfp);
> > 	if (!once_store(&my_foo, foop)) {
> > 		free_foo(foop);
> > 		foop = once_get(&my_foo);
> > 	}
> > 
> > 	return foop;
> > }
> > 
> > Any kernel programmer should be able to handle that pattern.  And no mutex!
> 
> I don't think this version would be worthwhile.  It eliminates type safety due
> to the use of 'void *', and doesn't actually save any lines of code.  Nor does
> it eliminate the need to correctly implement the cmpxchg failure case, which is
> tricky (it must free the object and get the new one) and will be rarely tested.

You're missing the point.  It prevents people from trying to optimise
"can I use READ_ONCE() here, or do I need to use smp_rmb()?"  The type
safety is provided by the get_foo() function.  I suppose somebody could
play some games with _Generic or something, but there's really no need to.
It's like using a list_head and casting to the container_of.

> It also forces all users of the struct to use this helper function to access it.
> That could be considered a good thing, but it's also bad because even with
> one-time init there's still usually some sort of ordering of "initialization"
> vs. "use".  Just taking a random example I'm familiar with, we do one-time init
> of inode::i_crypt_info when we open an encrypted file, so we guarantee it's set
> for all I/O to the file, where we then simply access ->i_crypt_info directly.
> We don't want the code to read like it's initializing ->i_crypt_info in the
> middle of ->writepages(), since that would be wrong.

Right, and I wouldn't use this pattern for that.  You can't get to
writepages without having opened the file, so just initialising the
pointer in open is fine.

> An improvement might be to make once_store() take the free function as a
> parameter so that it would handle the failure case for you:
> 
> struct foo *get_foo(gfp_t gfp)
> {
> 	static struct init_once_pointer my_foo;
> 	struct foo *foop;
>  
>  	foop = once_get(&my_foo);
>  	if (!foop) {
> 		foop = alloc_foo(gfp);
> 		if (foop)
> 			once_store(&my_foo, foop, free_foo);

Need to mark once_store as __must_check to avoid the bug you have here:

			foop = once_store(&my_foo, foop, free_foo);

Maybe we could use a macro for once_store so we could write:

void *once_get(struct init_pointer_once *);
int once_store(struct init_pointer_once *, void *);

#define once_alloc(s, o_alloc, o_free) ({                               \
        void *__p = o_alloc;                                            \
        if (__p) {                                                      \
                if (!once_store(s, __p)) {                              \
                        o_free(__p);                                    \
                        __p = once_get(s);                              \
                }                                                       \
        }                                                               \
        __p;                                                            \
})

---

struct foo *alloc_foo(gfp_t);
void free_foo(struct foo *);

struct foo *get_foo(gfp_t gfp)
{
        static struct init_pointer_once my_foo;
        struct foo *foop;

        foop = once_get(&my_foo);
        if (!foop)
                foop = once_alloc(&my_foo, alloc_foo(gfp), free_foo);
        return foop;
}

That's pretty hard to misuse (I compile-tested it, and it works).

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  2:13     ` Matthew Wilcox
@ 2020-07-18  5:28       ` Eric Biggers
  2020-07-18 14:35         ` Alan Stern
                           ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Eric Biggers @ 2020-07-18  5:28 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, linux-arch, Paul E . McKenney, linux-fsdevel,
	Akira Yokosawa, Alan Stern, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Sat, Jul 18, 2020 at 03:13:04AM +0100, Matthew Wilcox wrote:
> On Fri, Jul 17, 2020 at 06:38:39PM -0700, Eric Biggers wrote:
> > On Fri, Jul 17, 2020 at 06:47:50PM +0100, Matthew Wilcox wrote:
> > > On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> > > > +If that doesn't apply, you'll have to implement one-time init yourself.
> > > > +
> > > > +The simplest implementation just uses a mutex and an 'inited' flag.
> > > > +This implementation should be used where feasible:
> > > 
> > > I think some syntactic sugar should make it feasible for normal people
> > > to implement the most efficient version of this just like they use locks.
> > 
> > Note that the cmpxchg version is not necessarily the "most efficient".
> > 
> > If the one-time initialization is expensive, e.g. if it allocates a lot of
> > memory or if takes a long time, it could be better to use the mutex version so
> > that at most one task does it.
> 
> Sure, but I think those are far less common than just allocating a single
> thing.
> 
> > > How about something like this ...
> > > 
> > > once.h:
> > > 
> > > static struct init_once_pointer {
> > > 	void *p;
> > > };
> > > 
> > > static inline void *once_get(struct init_once_pointer *oncep)
> > > { ... }
> > > 
> > > static inline bool once_store(struct init_once_pointer *oncep, void *p)
> > > { ... }
> > > 
> > > --- foo.c ---
> > > 
> > > struct foo *get_foo(gfp_t gfp)
> > > {
> > > 	static struct init_once_pointer my_foo;
> > > 	struct foo *foop;
> > > 
> > > 	foop = once_get(&my_foo);
> > > 	if (foop)
> > > 		return foop;
> > > 
> > > 	foop = alloc_foo(gfp);
> > > 	if (!once_store(&my_foo, foop)) {
> > > 		free_foo(foop);
> > > 		foop = once_get(&my_foo);
> > > 	}
> > > 
> > > 	return foop;
> > > }
> > > 
> > > Any kernel programmer should be able to handle that pattern.  And no mutex!
> > 
> > I don't think this version would be worthwhile.  It eliminates type safety due
> > to the use of 'void *', and doesn't actually save any lines of code.  Nor does
> > it eliminate the need to correctly implement the cmpxchg failure case, which is
> > tricky (it must free the object and get the new one) and will be rarely tested.
> 
> You're missing the point.  It prevents people from trying to optimise
> "can I use READ_ONCE() here, or do I need to use smp_rmb()?"  The type
> safety is provided by the get_foo() function.  I suppose somebody could
> play some games with _Generic or something, but there's really no need to.
> It's like using a list_head and casting to the container_of.
> 
> > It also forces all users of the struct to use this helper function to access it.
> > That could be considered a good thing, but it's also bad because even with
> > one-time init there's still usually some sort of ordering of "initialization"
> > vs. "use".  Just taking a random example I'm familiar with, we do one-time init
> > of inode::i_crypt_info when we open an encrypted file, so we guarantee it's set
> > for all I/O to the file, where we then simply access ->i_crypt_info directly.
> > We don't want the code to read like it's initializing ->i_crypt_info in the
> > middle of ->writepages(), since that would be wrong.
> 
> Right, and I wouldn't use this pattern for that.  You can't get to
> writepages without having opened the file, so just initialising the
> pointer in open is fine.
> 
> > An improvement might be to make once_store() take the free function as a
> > parameter so that it would handle the failure case for you:
> > 
> > struct foo *get_foo(gfp_t gfp)
> > {
> > 	static struct init_once_pointer my_foo;
> > 	struct foo *foop;
> >  
> >  	foop = once_get(&my_foo);
> >  	if (!foop) {
> > 		foop = alloc_foo(gfp);
> > 		if (foop)
> > 			once_store(&my_foo, foop, free_foo);
> 
> Need to mark once_store as __must_check to avoid the bug you have here:
> 
> 			foop = once_store(&my_foo, foop, free_foo);
> 
> Maybe we could use a macro for once_store so we could write:
> 
> void *once_get(struct init_pointer_once *);
> int once_store(struct init_pointer_once *, void *);
> 
> #define once_alloc(s, o_alloc, o_free) ({                               \
>         void *__p = o_alloc;                                            \
>         if (__p) {                                                      \
>                 if (!once_store(s, __p)) {                              \
>                         o_free(__p);                                    \
>                         __p = once_get(s);                              \
>                 }                                                       \
>         }                                                               \
>         __p;                                                            \
> })
> 
> ---
> 
> struct foo *alloc_foo(gfp_t);
> void free_foo(struct foo *);
> 
> struct foo *get_foo(gfp_t gfp)
> {
>         static struct init_pointer_once my_foo;
>         struct foo *foop;
> 
>         foop = once_get(&my_foo);
>         if (!foop)
>                 foop = once_alloc(&my_foo, alloc_foo(gfp), free_foo);
>         return foop;
> }
> 
> That's pretty hard to misuse (I compile-tested it, and it works).

I'm still not sure this is the best API.

It's very much designed around the cmpxchg solution, which is clever and is more
efficient in some cases.  But it's not really a good default.

The cmpxchg solution isn't simply a different implementation of one-time-init,
but rather it changes the *behavior* so that the allocation/initialization can
happen multiple times concurrently.  So it's not really "once" anymore.

I think most people would find this unexpected, especially if they're used to
one-time-init in programming languages that support it natively (e.g., C++
static local variables, to use a random example...).

Concurrent attempts at the initialization can cause problems if it involves
allocating a lot of memory or other kernel resources, does CPU-intensive work,
calls out to user-mode helpers, calls non-thread-safe code, tries to register
things with duplicate names, etc.

It also means the user has provide a free() function, even in cases like static
data in non-modular code where free() would otherwise not be needed.

And since the concurrent allocation (and free()) case is unlikely to get tested,
users may not notice problems until too late.

Using 'struct init_once_pointer' is nice to prevent use of the pointer without
either initializing it or executing a memory barrier.  But most of the time
that's not what people want.  Usually code is structured such that the
initialization happens at one point, and then later accesses just want the plain
pointer.  And almost everyone will need to free() the struct at some point,
which again should just be a plain access.

So if we want any new macros to be usable as widely as possible, I think they'll
need to use use plain pointers.

It would also be nice to have the APIs/macros for the different cases (single
strucct, multiple structs, static data) be consistent and work similarly.

What do people think about the following instead?  (Not proofread / tested yet,
so please comment on the high-level approach, not minor mistakes :-) )

#define DO_ONCE_ATOMIC()
	/* renamed from current DO_ONCE() */

/**
 * DO_ONCE_BLOCKING() - call a function exactly once
 * @once_func: the function to call exactly once
 * @...: additional arguments to pass to @once_func (optional)
 *
 * Return: 0 on success (done or already done), or an error returned by
 *	   @once_func.
 */
#define DO_ONCE_BLOCKING(once_func, ...)				\
({									\
	static DEFINE_MUTEX(mutex);					\
 	static bool done;						\
 	int err = 0;							\
									\
	if (!smp_load_acquire(&done)) {					\
		mutex_lock(&mutex);					\
		if (!done) {						\
			err = once_func(__VA_ARGS__);			\
			if (!err)					\
				smp_store_release(&done, true);		\
		}							\
		mutex_unlock(&mutex);					\
	}								\
 	err;								\
})

/**
 * INIT_ONCE() - do one-time initialization
 * @done: pointer to a 'bool' flag that tracks whether initialization has been
 *	  done yet or not.  Must be false by default.
 * @mutex: pointer to a mutex to use to synchronize executions of @init_func
 * @init_func: the one-time initialization function
 * @...: additional arguments to pass to @init_func (optional)
 *
 * This is a more general version of DO_ONCE_BLOCKING() which supports
 * non-static data by allowing the user to specify their own 'done' flag and
 * mutex.
 *
 * Return: 0 on success (done or already done), or a negative errno value
 *	   returned by @init_func.
 */
#define INIT_ONCE(done, mutex, init_func, ...)				\
({									\
 	int err = 0;							\
									\
	if (!smp_load_acquire(done)) {					\
		mutex_lock(mutex);					\
		if (!*(done)) {						\
			err = init_func(__VA_ARGS__);			\
			if (!err)					\
				smp_store_release((done), true);	\
		}							\
		mutex_unlock(mutex);					\
	}								\
 	err;								\
})

/**
 * INIT_ONCE_PTR() - do one-time allocation of a data structure
 * @ptr: pointer to the data structure's pointer. Must be NULL by default.
 * @mutex: pointer to a mutex to use to synchronize executions of @alloc_func
 * @alloc_func: the allocation function
 * @...: additional arguments to pass to @alloc_func (optional)
 *
 * Like INIT_ONCE(), but eliminates the need for the 'done' flag by assuming a
 * single pointer is being initialized.
 *
 * Return: 0 on success (done or already done), or an error from @alloc_func.
 *         If @alloc_func returns an ERR_PTR(), PTR_ERR() will be returned.
 *         If @alloc_func returns NULL, -ENOMEM will be returned.
 */
#define INIT_ONCE_PTR(ptr, mutex, alloc_func, ...)			\
({									\
	int err = 0;							\
									\
	if (!smp_load_acquire(ptr)) {					\
		mutex_lock(mutex);					\
		if (!*(ptr)) {						\
			typeof (*(ptr)) p = alloc_func(__VA_ARGS__);	\
									\
			if (!IS_ERR_OR_NULL(p))				\
				smp_store_release((ptr), p);		\
			else						\
 				err = p ? PTR_ERR(p) : -ENOMEM;		\
		}							\
		mutex_unlock(mutex);					\
	}								\
 	err;								\
})


In the fs/direct-io.c case we'd use:

int sb_init_dio_done_wq(struct super_block *sb)
{
	static DEFINE_MUTEX(sb_init_dio_done_mutex);

	return INIT_ONCE_PTR(&sb->s_dio_done_wq, &sb_init_dio_done_mutex,
			     alloc_workqueue,
			     "dio/%s", WQ_MEM_RECLAIM, 0, sb->s_id);
}

The only part I really don't like is the way arguments are passed to the
alloc_func.  We could also make it work like the following, though it would
break the usual rules since it looks like the function call is always executed,
but it's not:

	return INIT_ONCE_PTR(&sb->s_dio_done_wq, &sb_init_dio_done_mutex,
			     alloc_workqueue("dio/%s", WQ_MEM_RECLAIM, 0,
					     sb->s_id));

And for static data, here are a couple quick examples of how I'd use
DO_ONCE_BLOCKING() in fs/crypto/:

diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index 9e82a8856aba..2661849c0a53 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -37,7 +37,6 @@ MODULE_PARM_DESC(num_prealloc_crypto_pages,
 static mempool_t *fscrypt_bounce_page_pool = NULL;
 
 static struct workqueue_struct *fscrypt_read_workqueue;
-static DEFINE_MUTEX(fscrypt_init_mutex);
 
 struct kmem_cache *fscrypt_info_cachep;
 
@@ -304,6 +303,15 @@ int fscrypt_decrypt_block_inplace(const struct inode *inode, struct page *page,
 }
 EXPORT_SYMBOL(fscrypt_decrypt_block_inplace);
 
+static int fscrypt_do_initialize(void)
+{
+	fscrypt_bounce_page_pool =
+		mempool_create_page_pool(num_prealloc_crypto_pages, 0);
+	if (!fscrypt_bounce_page_pool)
+		return -ENOMEM;
+	return 0;
+}
+
 /**
  * fscrypt_initialize() - allocate major buffers for fs encryption.
  * @cop_flags:  fscrypt operations flags
@@ -315,26 +323,11 @@ EXPORT_SYMBOL(fscrypt_decrypt_block_inplace);
  */
 int fscrypt_initialize(unsigned int cop_flags)
 {
-	int err = 0;
-
 	/* No need to allocate a bounce page pool if this FS won't use it. */
 	if (cop_flags & FS_CFLG_OWN_PAGES)
 		return 0;
 
-	mutex_lock(&fscrypt_init_mutex);
-	if (fscrypt_bounce_page_pool)
-		goto out_unlock;
-
-	err = -ENOMEM;
-	fscrypt_bounce_page_pool =
-		mempool_create_page_pool(num_prealloc_crypto_pages, 0);
-	if (!fscrypt_bounce_page_pool)
-		goto out_unlock;
-
-	err = 0;
-out_unlock:
-	mutex_unlock(&fscrypt_init_mutex);
-	return err;
+	return DO_ONCE_BLOCKING(fscrypt_do_initialize);
 }
 
 void fscrypt_msg(const struct inode *inode, const char *level,
diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
index d828e3df898b..2744a3daceb8 100644
--- a/fs/crypto/fname.c
+++ b/fs/crypto/fname.c
@@ -63,28 +63,22 @@ struct fscrypt_nokey_name {
 
 static struct crypto_shash *sha256_hash_tfm;
 
-static int fscrypt_do_sha256(const u8 *data, unsigned int data_len, u8 *result)
+static int fscrypt_init_sha256(void)
 {
-	struct crypto_shash *tfm = READ_ONCE(sha256_hash_tfm);
+	sha256_hash_tfm = crypto_alloc_shash("sha256", 0, 0);
+	return PTR_ERR_OR_ZERO(sha256_hash_tfm);
+}
 
-	if (unlikely(!tfm)) {
-		struct crypto_shash *prev_tfm;
+static int fscrypt_do_sha256(const u8 *data, unsigned int data_len, u8 *result)
+{
+	int err = DO_ONCE_BLOCKING(fscrypt_init_sha256);
 
-		tfm = crypto_alloc_shash("sha256", 0, 0);
-		if (IS_ERR(tfm)) {
-			fscrypt_err(NULL,
-				    "Error allocating SHA-256 transform: %ld",
-				    PTR_ERR(tfm));
-			return PTR_ERR(tfm);
-		}
-		prev_tfm = cmpxchg(&sha256_hash_tfm, NULL, tfm);
-		if (prev_tfm) {
-			crypto_free_shash(tfm);
-			tfm = prev_tfm;
-		}
+	if (err) {
+		fscrypt_err(NULL, "Error allocating SHA-256 transform: %d",
+			    err);
+		return err;
 	}
-
-	return crypto_shash_tfm_digest(tfm, data, data_len, result);
+	return crypto_shash_tfm_digest(sha256_hash_tfm, data, data_len, result);
 }

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  1:42 ` Dave Chinner
@ 2020-07-18 14:08   ` Alan Stern
  2020-07-20  1:33     ` Dave Chinner
  0 siblings, 1 reply; 39+ messages in thread
From: Alan Stern @ 2020-07-18 14:08 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Eric Biggers, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

> This is one of the reasons that the LKMM documetnation is so damn
> difficult to read and understand: just understanding the vocabulary
> it uses requires a huge learning curve, and it's not defined
> anywhere. Understanding the syntax of examples requires a huge
> learning curve, because it's not defined anywhere. 

Have you seen tools/memory-model/Documentation/explanation.txt?  That
file was specifically written for non-experts to help them overcome the
learning curve.  It tries to define the vocabulary as terms are
introduced and to avoid using obscure syntax.

If you think it needs improvement and can give some specific details
about where it falls short, I would like to hear them.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  2:00       ` Dave Chinner
@ 2020-07-18 14:21         ` Alan Stern
  0 siblings, 0 replies; 39+ messages in thread
From: Alan Stern @ 2020-07-18 14:21 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Eric Biggers, Darrick J. Wong, linux-kernel, linux-arch,
	Paul E . McKenney, linux-fsdevel, Akira Yokosawa, Andrea Parri,
	Boqun Feng, Daniel Lustig, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Sat, Jul 18, 2020 at 12:00:01PM +1000, Dave Chinner wrote:
> Recipes are aimed at people who simply don't understand any of that
> goobledegook. This won't help them -write correct code-.

Indeed.	 Perhaps this writeup belongs in a different document (with a 
pointer	from the recipes file), and the actual recipe itself should
await the development of a general and robust API.

Alan Stern


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  5:28       ` Eric Biggers
@ 2020-07-18 14:35         ` Alan Stern
  2020-07-20  2:07         ` Dave Chinner
  2020-07-27 15:17         ` Alan Stern
  2 siblings, 0 replies; 39+ messages in thread
From: Alan Stern @ 2020-07-18 14:35 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Matthew Wilcox, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 10:28:18PM -0700, Eric Biggers wrote:
> /**
>  * INIT_ONCE() - do one-time initialization
>  * @done: pointer to a 'bool' flag that tracks whether initialization has been
>  *	  done yet or not.  Must be false by default.
>  * @mutex: pointer to a mutex to use to synchronize executions of @init_func
>  * @init_func: the one-time initialization function
>  * @...: additional arguments to pass to @init_func (optional)
>  *
>  * This is a more general version of DO_ONCE_BLOCKING() which supports
>  * non-static data by allowing the user to specify their own 'done' flag and
>  * mutex.
>  *
>  * Return: 0 on success (done or already done), or a negative errno value
>  *	   returned by @init_func.

It might be worth pointing out explicitly that init_func can be called 
multiple times, if it returns an error.

>  */
> #define INIT_ONCE(done, mutex, init_func, ...)				\
> ({									\
>  	int err = 0;							\
> 									\
> 	if (!smp_load_acquire(done)) {					\
> 		mutex_lock(mutex);					\
> 		if (!*(done)) {						\
> 			err = init_func(__VA_ARGS__);			\
> 			if (!err)					\
> 				smp_store_release((done), true);	\
> 		}							\
> 		mutex_unlock(mutex);					\
> 	}								\
>  	err;								\
> })

If this macro is invoked in multiple places for the same object (which 
is not unlikely), there is a distinct risk that people will supply 
different mutexes or done variables for the invocations.

IMO a better approach would be to have a macro which, given a variable 
name v, generates an actual init_once_v() function.  Then code wanting 
to use v would call init_once_v() first, with no danger of inconsistent 
usage.  You can fill in the details...

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18 14:08   ` Alan Stern
@ 2020-07-20  1:33     ` Dave Chinner
  2020-07-20 14:52       ` Alan Stern
  0 siblings, 1 reply; 39+ messages in thread
From: Dave Chinner @ 2020-07-20  1:33 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric Biggers, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Sat, Jul 18, 2020 at 10:08:11AM -0400, Alan Stern wrote:
> > This is one of the reasons that the LKMM documetnation is so damn
> > difficult to read and understand: just understanding the vocabulary
> > it uses requires a huge learning curve, and it's not defined
> > anywhere. Understanding the syntax of examples requires a huge
> > learning curve, because it's not defined anywhere. 
> 
> Have you seen tools/memory-model/Documentation/explanation.txt?

<raises eyebrow>

Well, yes. Several times. I look at it almost daily, but that
doesn't mean it's approachable, easy to read or even -that I
understand what large parts of it say-. IOWs, that's one of the 
problematic documents that I've been saying have a huge learning
curve.

So, if I say "the LKMM documentation", I mean -all- of the
documentation, not just some tiny subset of it. I've read it all,
I've even installed herd7 so I can run the litmus tests to see if
that helps me understand the documentation better.

That only increased the WTF factor because the documentation of that
stuff is far, far more impenetrable than the LKMM documentation.  If
the LKMM learnign curve is near vertical, then the stuff in the
herd7 tools is an -overhang-. And I most certainly can't climb
that....

/me idly wonders if you recognise that your comment is, yet again, a
classic demonstration of the behaviour the "curse of knowledge"
cognitive bias describes.

> That
> file was specifically written for non-experts to help them overcome the
> learning curve.  It tries to define the vocabulary as terms are
> introduced and to avoid using obscure syntax.

It tries to teach people about what a memory model is at the same
time it tries to define the LKMM. What it doesn't do at all is
teach people how to write safe code. People want to write safe code,
not become "memory model experts".

Memory models are -your expertise- but they aren't mine. My
expertise is filesystems: I don't care about the nitty gritty
details of memory models, I just want to be able to write lockless
algorithms correctly. Which, I might point out, I've been doing for
well over a decade...

> If you think it needs improvement and can give some specific
> details about where it falls short, I would like to hear them.

Haven't you understood anything I've been saying? That developers
don't care about how the theory behind the memory model  or how it
works - we just want to be able to write safe code. And to do that
quickly and efficiently. The "make the documentation more complex"
response is the wrong direction. Please *dumb it down* to the most
basic, simplest, common concurrency patterns that programmers use
and then write APIs to do those things that *hide the memory model
for the programmer*.

Adding documentation about all the possible things you could do,
all the optimisations you could make, all the intricate, subtle
variations you can use, etc is not helpful. It might be interesting
to you, but I just want -somethign that works- and not have to
understand the LKMM to get stuff done.

Example: I know how smp_load_acquire() works. I know that I can
expect the same behavioural semantics from smp_cond_load_acquire().
But I don't care how the implementation of smp_cond_load_acquire()
is optimised to minimise ordering barriers as it spins. That sort of
optimisation is your job, not mine - I just want a function that
will spin safely until a specific value is seen and then return with
acquire semantics on the successful load.....

Can you see the difference between "understanding the LKMM
documenation" vs "using a well defined API that provides commonly
used functionality" to write correct, optimal code that needs to
spin waiting for some other context to update a variable?

That's the problem the LKMM documentation fails to address. It is
written to explain the theory behind the LKMM rather than provide
developers with pointers to the templates and APIs that implement
the lockless co-ordination functionality they want to use....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  5:28       ` Eric Biggers
  2020-07-18 14:35         ` Alan Stern
@ 2020-07-20  2:07         ` Dave Chinner
  2020-07-20  9:00           ` Peter Zijlstra
  2020-07-27 15:17         ` Alan Stern
  2 siblings, 1 reply; 39+ messages in thread
From: Dave Chinner @ 2020-07-20  2:07 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Matthew Wilcox, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Alan Stern, Andrea Parri,
	Boqun Feng, Daniel Lustig, Darrick J . Wong, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 10:28:18PM -0700, Eric Biggers wrote:
> What do people think about the following instead?  (Not proofread / tested yet,
> so please comment on the high-level approach, not minor mistakes :-) )

No huge long macros, please.

We don't accept people writing long complex static inline functions,
so for the same reasons it is not acceptable to write long complex
macros.  Especially ones that use variable arguments and embed
invisible locking within them.

The whole serialisation/atomic/ordering APIs have fallen badly off
the macro cliff, to the point where finding out something as simple
as the order of parameters passed to cmpxchg and what semantics it
provides requires macro-spelunking 5 layers deep to find the generic
implementation function that contains a comment describing what it
does....

That's yet another barrier to understanding what all the different
synchronisation primitives do.

....

> In the fs/direct-io.c case we'd use:
> 
> int sb_init_dio_done_wq(struct super_block *sb)
> {
> 	static DEFINE_MUTEX(sb_init_dio_done_mutex);
> 
> 	return INIT_ONCE_PTR(&sb->s_dio_done_wq, &sb_init_dio_done_mutex,
> 			     alloc_workqueue,
> 			     "dio/%s", WQ_MEM_RECLAIM, 0, sb->s_id);
> }

Yeah, that's pretty horrible...

I *much* prefer an API like Willy's suggested to somethign like
this. Just because you can hide all sorts of stuff in macros doesn't
mean you should.

> The only part I really don't like is the way arguments are passed to the
> alloc_func.  We could also make it work like the following, though it would
> break the usual rules since it looks like the function call is always executed,
> but it's not:
> 
> 	return INIT_ONCE_PTR(&sb->s_dio_done_wq, &sb_init_dio_done_mutex,
> 			     alloc_workqueue("dio/%s", WQ_MEM_RECLAIM, 0,
> 					     sb->s_id));

Yeah, that's even worse. Code that does not do what it looks like it
should needs to be nuked from orbit.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-20  2:07         ` Dave Chinner
@ 2020-07-20  9:00           ` Peter Zijlstra
  0 siblings, 0 replies; 39+ messages in thread
From: Peter Zijlstra @ 2020-07-20  9:00 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Eric Biggers, Matthew Wilcox, linux-kernel, linux-arch,
	Paul E . McKenney, linux-fsdevel, Akira Yokosawa, Alan Stern,
	Andrea Parri, Boqun Feng, Daniel Lustig, Darrick J . Wong,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Will Deacon

On Mon, Jul 20, 2020 at 12:07:31PM +1000, Dave Chinner wrote:
> The whole serialisation/atomic/ordering APIs have fallen badly off
> the macro cliff, to the point where finding out something as simple
> as the order of parameters passed to cmpxchg and what semantics it
> provides requires macro-spelunking 5 layers deep to find the generic
> implementation function that contains a comment describing what it
> does....
> 
> That's yet another barrier to understanding what all the different
> synchronisation primitives do.

Documentation/atomic_t.txt ?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-20  1:33     ` Dave Chinner
@ 2020-07-20 14:52       ` Alan Stern
  2020-07-20 15:37         ` Darrick J. Wong
  2020-07-20 15:39         ` Matthew Wilcox
  0 siblings, 2 replies; 39+ messages in thread
From: Alan Stern @ 2020-07-20 14:52 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Eric Biggers, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Mon, Jul 20, 2020 at 11:33:20AM +1000, Dave Chinner wrote:
> On Sat, Jul 18, 2020 at 10:08:11AM -0400, Alan Stern wrote:
> > > This is one of the reasons that the LKMM documetnation is so damn
> > > difficult to read and understand: just understanding the vocabulary
> > > it uses requires a huge learning curve, and it's not defined
> > > anywhere. Understanding the syntax of examples requires a huge
> > > learning curve, because it's not defined anywhere. 
> > 
> > Have you seen tools/memory-model/Documentation/explanation.txt?
> 
> <raises eyebrow>
> 
> Well, yes. Several times. I look at it almost daily, but that
> doesn't mean it's approachable, easy to read or even -that I
> understand what large parts of it say-. IOWs, that's one of the 
> problematic documents that I've been saying have a huge learning
> curve.

Can you be more specific?  For example, exactly where does it start to 
become unapproachable or difficult to read?

Don't forget that this document was meant to help mitigate the LKMM's 
learning curve.  If it isn't successful, I want to improve it.

> So, if I say "the LKMM documentation", I mean -all- of the
> documentation, not just some tiny subset of it. I've read it all,
> I've even installed herd7 so I can run the litmus tests to see if
> that helps me understand the documentation better.
> 
> That only increased the WTF factor because the documentation of that
> stuff is far, far more impenetrable than the LKMM documentation.  If
> the LKMM learnign curve is near vertical, then the stuff in the
> herd7 tools is an -overhang-. And I most certainly can't climb
> that....

I can't argue with that.  Really understanding herd7 does require a 
pretty extensive background in the field.

> /me idly wonders if you recognise that your comment is, yet again, a
> classic demonstration of the behaviour the "curse of knowledge"
> cognitive bias describes.

Not at all.  I think you are confusing several different things.

For one, at a purely literal level my comment could not possibly be 
taken as a demonstration of "curse of knowledge" behavior, because it 
was a simple question: Have you seen explanation.txt?  Nothing obscure 
or outré about that.

For another, you appear to be confusing the LKMM with the kernel's API, 
and reference documents with programming guides (or recipes).  I'm sure 
that you aren't doing this deliberately and are well aware of these 
distinctions, but that's the impression your email leaves.

> > That
> > file was specifically written for non-experts to help them overcome the
> > learning curve.  It tries to define the vocabulary as terms are
> > introduced and to avoid using obscure syntax.
> 
> It tries to teach people about what a memory model is at the same
> time it tries to define the LKMM. What it doesn't do at all is
> teach people how to write safe code.

Of course it doesn't.  It was never meant to.  You can see this right in 
the filename "explanation.txt"; its purpose is to explain the LKMM.  
Nobody ever claimed it teaches how to write safe code or how to use the 
kernel's concurrent-programming API.  Those things belong in a separate 
document, such as recipes.txt.

>  People want to write safe code,
> not become "memory model experts".

Speak for yourself.  I personally want both, and no doubt there are 
others who feel the same way.

> Memory models are -your expertise- but they aren't mine. My
> expertise is filesystems: I don't care about the nitty gritty
> details of memory models, I just want to be able to write lockless
> algorithms correctly. Which, I might point out, I've been doing for
> well over a decade...

That's perfectly fine; I understand completely.  But your criticism is 
misplaced: It should be applied to recipes.txt, not to explanation.txt.

And remember: It was _you_ who claimed: "just understanding the 
vocabulary [the LKMM] uses requires a huge learning curve, and it's not 
defined anywhere".  explanation.txt shows that this statement is at 
least partly wrong.  Besides, given that you don't care about the nitty 
gritty details of memory models in any case, why are you complaining 
that understanding the LKMM is so hard?

My impression is that you really want to complain about the inadequate 
quality of recipes.txt as a programmers' guide.  Fine, but don't extend 
that to a blanket condemnation of all the LKMM documentation.

> > If you think it needs improvement and can give some specific
> > details about where it falls short, I would like to hear them.
> 
> Haven't you understood anything I've been saying? That developers
> don't care about how the theory behind the memory model  or how it
> works - we just want to be able to write safe code.

Again, speak for yourself.

>  And to do that
> quickly and efficiently. The "make the documentation more complex"
> response is the wrong direction. Please *dumb it down* to the most
> basic, simplest, common concurrency patterns that programmers use
> and then write APIs to do those things that *hide the memory model
> for the programmer*.
> 
> Adding documentation about all the possible things you could do,
> all the optimisations you could make, all the intricate, subtle
> variations you can use, etc is not helpful. It might be interesting
> to you, but I just want -somethign that works- and not have to
> understand the LKMM to get stuff done.

In principle, both can be included in the same document.  Say, with the 
more in-depth discussions relegated to specially marked-off sections 
that readers are invited to skip if they aren't interested.

> Example: I know how smp_load_acquire() works. I know that I can
> expect the same behavioural semantics from smp_cond_load_acquire().
> But I don't care how the implementation of smp_cond_load_acquire()
> is optimised to minimise ordering barriers as it spins. That sort of
> optimisation is your job, not mine - I just want a function that
> will spin safely until a specific value is seen and then return with
> acquire semantics on the successful load.....
> 
> Can you see the difference between "understanding the LKMM
> documenation" vs "using a well defined API that provides commonly
> used functionality" to write correct, optimal code that needs to
> spin waiting for some other context to update a variable?

Certainly I can.  Can't _you_ see the difference between a document that 
helps people "understand the LKMM" and one that demonstrates "using a 
well defined API that provides commonly used functionality"?

> That's the problem the LKMM documentation fails to address. It is
> written to explain the theory behind the LKMM rather than provide
> developers with pointers to the templates and APIs that implement
> the lockless co-ordination functionality they want to use....

That's the difference between a reference document and a programmers' 
guide.  Grousing that one isn't the other is futile.

On the other hand, pointing out specific areas of improvement for a 
document that was meant to be a programmers' guide can be very helpful. 
You may not be inclined to spend any time editing recipes.txt, but 
perhaps you could point out a few of the specific areas most in need of 
work?

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-20 14:52       ` Alan Stern
@ 2020-07-20 15:37         ` Darrick J. Wong
  2020-07-20 15:39         ` Matthew Wilcox
  1 sibling, 0 replies; 39+ messages in thread
From: Darrick J. Wong @ 2020-07-20 15:37 UTC (permalink / raw)
  To: Alan Stern
  Cc: Dave Chinner, Eric Biggers, linux-kernel, linux-arch,
	Paul E . McKenney, linux-fsdevel, Akira Yokosawa, Andrea Parri,
	Boqun Feng, Daniel Lustig, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Mon, Jul 20, 2020 at 10:52:11AM -0400, Alan Stern wrote:
> On Mon, Jul 20, 2020 at 11:33:20AM +1000, Dave Chinner wrote:
> > On Sat, Jul 18, 2020 at 10:08:11AM -0400, Alan Stern wrote:
> > > > This is one of the reasons that the LKMM documetnation is so damn
> > > > difficult to read and understand: just understanding the vocabulary
> > > > it uses requires a huge learning curve, and it's not defined
> > > > anywhere. Understanding the syntax of examples requires a huge
> > > > learning curve, because it's not defined anywhere. 
> > > 
> > > Have you seen tools/memory-model/Documentation/explanation.txt?
> > 
> > <raises eyebrow>
> > 
> > Well, yes. Several times. I look at it almost daily, but that
> > doesn't mean it's approachable, easy to read or even -that I
> > understand what large parts of it say-. IOWs, that's one of the 
> > problematic documents that I've been saying have a huge learning
> > curve.
> 
> Can you be more specific?  For example, exactly where does it start to 
> become unapproachable or difficult to read?
> 
> Don't forget that this document was meant to help mitigate the LKMM's 
> learning curve.  If it isn't successful, I want to improve it.
> 
> > So, if I say "the LKMM documentation", I mean -all- of the
> > documentation, not just some tiny subset of it. I've read it all,
> > I've even installed herd7 so I can run the litmus tests to see if
> > that helps me understand the documentation better.
> > 
> > That only increased the WTF factor because the documentation of that
> > stuff is far, far more impenetrable than the LKMM documentation.  If
> > the LKMM learnign curve is near vertical, then the stuff in the
> > herd7 tools is an -overhang-. And I most certainly can't climb
> > that....
> 
> I can't argue with that.  Really understanding herd7 does require a 
> pretty extensive background in the field.
> 
> > /me idly wonders if you recognise that your comment is, yet again, a
> > classic demonstration of the behaviour the "curse of knowledge"
> > cognitive bias describes.
> 
> Not at all.  I think you are confusing several different things.
> 
> For one, at a purely literal level my comment could not possibly be 
> taken as a demonstration of "curse of knowledge" behavior, because it 
> was a simple question: Have you seen explanation.txt?  Nothing obscure 
> or outré about that.
> 
> For another, you appear to be confusing the LKMM with the kernel's API, 
> and reference documents with programming guides (or recipes).  I'm sure 
> that you aren't doing this deliberately and are well aware of these 
> distinctions, but that's the impression your email leaves.
> 
> > > That
> > > file was specifically written for non-experts to help them overcome the
> > > learning curve.  It tries to define the vocabulary as terms are
> > > introduced and to avoid using obscure syntax.
> > 
> > It tries to teach people about what a memory model is at the same
> > time it tries to define the LKMM. What it doesn't do at all is
> > teach people how to write safe code.
> 
> Of course it doesn't.  It was never meant to.  You can see this right in 
> the filename "explanation.txt"; its purpose is to explain the LKMM.  
> Nobody ever claimed it teaches how to write safe code or how to use the 
> kernel's concurrent-programming API.  Those things belong in a separate 
> document, such as recipes.txt.
> 
> >  People want to write safe code,
> > not become "memory model experts".
> 
> Speak for yourself.  I personally want both, and no doubt there are 
> others who feel the same way.

Ok then.  I'm a lazy a***** maintainer who wants to write safe code
without first having to understand the memory models of *multiple CPU
architectures*, and I'm perfectly fine with accepting the smallest
common featureset across those CPUs so that I can think as little as
possible about things outside my area of expertise.

Willy's proposed API looks good enough for me, and if that's all I ever
have to know, I'm good with that.  Eric's proposed recipes patch is a
decent breadcrumb should I ever desire to know more.  I don't expect
I'll read the LKMM, just like I don't expect any of the rest of you will
read the XFS internals book.

(Well, ok, I imagine Dave has...)

--D

> 
> > Memory models are -your expertise- but they aren't mine. My
> > expertise is filesystems: I don't care about the nitty gritty
> > details of memory models, I just want to be able to write lockless
> > algorithms correctly. Which, I might point out, I've been doing for
> > well over a decade...
> 
> That's perfectly fine; I understand completely.  But your criticism is 
> misplaced: It should be applied to recipes.txt, not to explanation.txt.
> 
> And remember: It was _you_ who claimed: "just understanding the 
> vocabulary [the LKMM] uses requires a huge learning curve, and it's not 
> defined anywhere".  explanation.txt shows that this statement is at 
> least partly wrong.  Besides, given that you don't care about the nitty 
> gritty details of memory models in any case, why are you complaining 
> that understanding the LKMM is so hard?
> 
> My impression is that you really want to complain about the inadequate 
> quality of recipes.txt as a programmers' guide.  Fine, but don't extend 
> that to a blanket condemnation of all the LKMM documentation.
> 
> > > If you think it needs improvement and can give some specific
> > > details about where it falls short, I would like to hear them.
> > 
> > Haven't you understood anything I've been saying? That developers
> > don't care about how the theory behind the memory model  or how it
> > works - we just want to be able to write safe code.
> 
> Again, speak for yourself.
> 
> >  And to do that
> > quickly and efficiently. The "make the documentation more complex"
> > response is the wrong direction. Please *dumb it down* to the most
> > basic, simplest, common concurrency patterns that programmers use
> > and then write APIs to do those things that *hide the memory model
> > for the programmer*.
> > 
> > Adding documentation about all the possible things you could do,
> > all the optimisations you could make, all the intricate, subtle
> > variations you can use, etc is not helpful. It might be interesting
> > to you, but I just want -somethign that works- and not have to
> > understand the LKMM to get stuff done.
> 
> In principle, both can be included in the same document.  Say, with the 
> more in-depth discussions relegated to specially marked-off sections 
> that readers are invited to skip if they aren't interested.
> 
> > Example: I know how smp_load_acquire() works. I know that I can
> > expect the same behavioural semantics from smp_cond_load_acquire().
> > But I don't care how the implementation of smp_cond_load_acquire()
> > is optimised to minimise ordering barriers as it spins. That sort of
> > optimisation is your job, not mine - I just want a function that
> > will spin safely until a specific value is seen and then return with
> > acquire semantics on the successful load.....
> > 
> > Can you see the difference between "understanding the LKMM
> > documenation" vs "using a well defined API that provides commonly
> > used functionality" to write correct, optimal code that needs to
> > spin waiting for some other context to update a variable?
> 
> Certainly I can.  Can't _you_ see the difference between a document that 
> helps people "understand the LKMM" and one that demonstrates "using a 
> well defined API that provides commonly used functionality"?
> 
> > That's the problem the LKMM documentation fails to address. It is
> > written to explain the theory behind the LKMM rather than provide
> > developers with pointers to the templates and APIs that implement
> > the lockless co-ordination functionality they want to use....
> 
> That's the difference between a reference document and a programmers' 
> guide.  Grousing that one isn't the other is futile.
> 
> On the other hand, pointing out specific areas of improvement for a 
> document that was meant to be a programmers' guide can be very helpful. 
> You may not be inclined to spend any time editing recipes.txt, but 
> perhaps you could point out a few of the specific areas most in need of 
> work?
> 
> Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-20 14:52       ` Alan Stern
  2020-07-20 15:37         ` Darrick J. Wong
@ 2020-07-20 15:39         ` Matthew Wilcox
  2020-07-20 16:04           ` Paul E. McKenney
  2020-07-20 16:12           ` Alan Stern
  1 sibling, 2 replies; 39+ messages in thread
From: Matthew Wilcox @ 2020-07-20 15:39 UTC (permalink / raw)
  To: Alan Stern
  Cc: Dave Chinner, Eric Biggers, linux-kernel, linux-arch,
	Paul E . McKenney, linux-fsdevel, Akira Yokosawa, Andrea Parri,
	Boqun Feng, Daniel Lustig, Darrick J . Wong, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Mon, Jul 20, 2020 at 10:52:11AM -0400, Alan Stern wrote:
> On Mon, Jul 20, 2020 at 11:33:20AM +1000, Dave Chinner wrote:
> > On Sat, Jul 18, 2020 at 10:08:11AM -0400, Alan Stern wrote:
> > > > This is one of the reasons that the LKMM documetnation is so damn
> > > > difficult to read and understand: just understanding the vocabulary
> > > > it uses requires a huge learning curve, and it's not defined
> > > > anywhere. Understanding the syntax of examples requires a huge
> > > > learning curve, because it's not defined anywhere. 
> > > 
> > > Have you seen tools/memory-model/Documentation/explanation.txt?
> > 
> > <raises eyebrow>
> > 
> > Well, yes. Several times. I look at it almost daily, but that
> > doesn't mean it's approachable, easy to read or even -that I
> > understand what large parts of it say-. IOWs, that's one of the 
> > problematic documents that I've been saying have a huge learning
> > curve.
> 
> Can you be more specific?  For example, exactly where does it start to 
> become unapproachable or difficult to read?
> 
> Don't forget that this document was meant to help mitigate the LKMM's 
> learning curve.  If it isn't successful, I want to improve it.

I can't speak for Dave, but the introduction to that documentation makes
it clear to me that it's not the document I want to read.

: This document describes the ideas underlying the LKMM.  It is meant
: for people who want to understand how the model was designed.  It does
: not go into the details of the code in the .bell and .cat files;
: rather, it explains in English what the code expresses symbolically.

I don't want to know how the model was designed.  I want to write a
device driver, or filesystem, or whatever.

Honestly, even the term "release semantics" trips me up _every_ time.
It's a barrier to understanding because I have to translate it into "Oh,
he means it's like an unlock".  Why can't you just say "unlock semantics"?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-20 15:39         ` Matthew Wilcox
@ 2020-07-20 16:04           ` Paul E. McKenney
  2020-07-20 16:48             ` peterz
  2020-07-20 16:12           ` Alan Stern
  1 sibling, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2020-07-20 16:04 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alan Stern, Dave Chinner, Eric Biggers, linux-kernel, linux-arch,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, David Howells, Jade Alglave,
	Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon

On Mon, Jul 20, 2020 at 04:39:11PM +0100, Matthew Wilcox wrote:
> On Mon, Jul 20, 2020 at 10:52:11AM -0400, Alan Stern wrote:
> > On Mon, Jul 20, 2020 at 11:33:20AM +1000, Dave Chinner wrote:
> > > On Sat, Jul 18, 2020 at 10:08:11AM -0400, Alan Stern wrote:
> > > > > This is one of the reasons that the LKMM documetnation is so damn
> > > > > difficult to read and understand: just understanding the vocabulary
> > > > > it uses requires a huge learning curve, and it's not defined
> > > > > anywhere. Understanding the syntax of examples requires a huge
> > > > > learning curve, because it's not defined anywhere. 
> > > > 
> > > > Have you seen tools/memory-model/Documentation/explanation.txt?
> > > 
> > > <raises eyebrow>
> > > 
> > > Well, yes. Several times. I look at it almost daily, but that
> > > doesn't mean it's approachable, easy to read or even -that I
> > > understand what large parts of it say-. IOWs, that's one of the 
> > > problematic documents that I've been saying have a huge learning
> > > curve.
> > 
> > Can you be more specific?  For example, exactly where does it start to 
> > become unapproachable or difficult to read?
> > 
> > Don't forget that this document was meant to help mitigate the LKMM's 
> > learning curve.  If it isn't successful, I want to improve it.
> 
> I can't speak for Dave, but the introduction to that documentation makes
> it clear to me that it's not the document I want to read.
> 
> : This document describes the ideas underlying the LKMM.  It is meant
> : for people who want to understand how the model was designed.  It does
> : not go into the details of the code in the .bell and .cat files;
> : rather, it explains in English what the code expresses symbolically.
> 
> I don't want to know how the model was designed.  I want to write a
> device driver, or filesystem, or whatever.
> 
> Honestly, even the term "release semantics" trips me up _every_ time.
> It's a barrier to understanding because I have to translate it into "Oh,
> he means it's like an unlock".  Why can't you just say "unlock semantics"?

One way to think of it is "release" as in "release a lock".

But to answer your question:

1.	The rest of the industry settled on "release" for that form
	of atomics a very long time ago.  Yes, we could insist on
	Linux-kernel exceptionalism, but that doesn't seem consistent
	with the large number of people coming into the kernel, even if
	only briefly.

2.	If we were to say "unlock" instead of "release", consistency
	would demand that we also say "lock" instead of "acquire".
	But "lock" is subtlely different than "acquire", and there is
	a history of people requesting further divergence.

3.	At least one architecture has started using GCC intrinsics
	to implement atomics.  I would expect more to follow.  And
	the GCC intrinsics say "_release".

4.	More cynically (and I do not bleach my hair, so yes, I do have the
	hair color for it), the Linux kernel community is sufficiently
	creative in its (ab)use of whatever tools happen to be lying
	around that any name will become misleading over time in any case.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-20 15:39         ` Matthew Wilcox
  2020-07-20 16:04           ` Paul E. McKenney
@ 2020-07-20 16:12           ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Alan Stern @ 2020-07-20 16:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Dave Chinner, Eric Biggers, linux-kernel, linux-arch,
	Paul E . McKenney, linux-fsdevel, Akira Yokosawa, Andrea Parri,
	Boqun Feng, Daniel Lustig, Darrick J . Wong, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Mon, Jul 20, 2020 at 04:39:11PM +0100, Matthew Wilcox wrote:
> Honestly, even the term "release semantics" trips me up _every_ time.
> It's a barrier to understanding because I have to translate it into "Oh,
> he means it's like an unlock".  Why can't you just say "unlock semantics"?

It's not as bad as all that; people do talk about acquiring and 
releasing locks, and presumably you don't have any trouble understanding 
those terms.  In fact this usage is quite common -- and I believe it's 
where the names "acquire semantics" and "release semantics" came from 
originally.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-20 16:04           ` Paul E. McKenney
@ 2020-07-20 16:48             ` peterz
  2020-07-20 22:06               ` Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: peterz @ 2020-07-20 16:48 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Matthew Wilcox, Alan Stern, Dave Chinner, Eric Biggers,
	linux-kernel, linux-arch, linux-fsdevel, Akira Yokosawa,
	Andrea Parri, Boqun Feng, Daniel Lustig, Darrick J . Wong,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Will Deacon

On Mon, Jul 20, 2020 at 09:04:34AM -0700, Paul E. McKenney wrote:
> 2.	If we were to say "unlock" instead of "release", consistency
> 	would demand that we also say "lock" instead of "acquire".
> 	But "lock" is subtlely different than "acquire", and there is
> 	a history of people requesting further divergence.

This, acquire/release are RCpc, while (with the exception of Power)
LOCK/UNLOCK are RCsc.

( Or did we settle on RCtso for our release/acquire order? I have vague
memories of a long-ish thread, but seem to have forgotten the outcome,
if any. )

Lots of subtlety and head-aches right about there. Anyway, it would be
awesome if we can get Power into the RCsc locking camp :-)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-20 16:48             ` peterz
@ 2020-07-20 22:06               ` Paul E. McKenney
  0 siblings, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2020-07-20 22:06 UTC (permalink / raw)
  To: peterz
  Cc: Matthew Wilcox, Alan Stern, Dave Chinner, Eric Biggers,
	linux-kernel, linux-arch, linux-fsdevel, Akira Yokosawa,
	Andrea Parri, Boqun Feng, Daniel Lustig, Darrick J . Wong,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Will Deacon

On Mon, Jul 20, 2020 at 06:48:50PM +0200, peterz@infradead.org wrote:
> On Mon, Jul 20, 2020 at 09:04:34AM -0700, Paul E. McKenney wrote:
> > 2.	If we were to say "unlock" instead of "release", consistency
> > 	would demand that we also say "lock" instead of "acquire".
> > 	But "lock" is subtlely different than "acquire", and there is
> > 	a history of people requesting further divergence.
> 
> This, acquire/release are RCpc, while (with the exception of Power)
> LOCK/UNLOCK are RCsc.
> 
> ( Or did we settle on RCtso for our release/acquire order? I have vague
> memories of a long-ish thread, but seem to have forgotten the outcome,
> if any. )
> 
> Lots of subtlety and head-aches right about there. Anyway, it would be
> awesome if we can get Power into the RCsc locking camp :-)

I will let you take that one up with the Power folks.

But I should give an example of a current difference between lock and
acquire, and just to spread the blame, I will pick on an architecture
other than Power.  ;-)

With lock acquisition, the following orders the access to X and Z:

	WRITE_ONCE(X, 1);
	spin_lock(&my_lock);
	smp_mb__after_lock();
	r1 = READ_ONCE(Z);

But if we replace the lock acquisition with a load acquire, there are
no ordering guarantees for the accesses to X and Z:

	WRITE_ONCE(X, 1);
	r2 = smp_load_acquire(&Y);
	smp_mb__after_lock();  // Yeah, there is no lock.  ;-)
	r3 = READ_ONCE(Z);

There -is- ordering between the accesses to Y and Z, but not to X and Z.
This is not a theoretical issue.  The x86 platform really can reorder
the access to X to follow that of both Y and Z.

So the memory-model divergence between lock acquisition and acquire
loads is very real in the here and now.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  1:02     ` Eric Biggers
@ 2020-07-27 12:51       ` Matthew Wilcox
  0 siblings, 0 replies; 39+ messages in thread
From: Matthew Wilcox @ 2020-07-27 12:51 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Alan Stern, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 06:02:47PM -0700, Eric Biggers wrote:
> On Fri, Jul 17, 2020 at 01:51:38PM -0400, Alan Stern wrote:
> > On Fri, Jul 17, 2020 at 06:47:50PM +0100, Matthew Wilcox wrote:
> > > On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> > ...
> > > > +		/* on success, pairs with smp_load_acquire() above and below */
> > > > +		if (cmpxchg_release(&foo, NULL, p) != NULL) {
> > > 
> > > Why do we have cmpxchg_release() anyway?  Under what circumstances is
> > > cmpxchg() useful _without_ having release semantics?
> > 
> > To answer just the last question: cmpxchg() is useful for lock 
> > acquisition, in which case it needs to have acquire semantics rather 
> > than release semantics.
> > 
> 
> To clarify, there are 4 versions of cmpxchg:
> 
> 	cmpxchg(): does ACQUIRE and RELEASE (on success)
> 	cmpxchg_acquire(): does ACQUIRE only (on success)
> 	cmpxchg_release(): does RELEASE only (on success)
> 	cmpxchg_relaxed(): no barriers
> 
> The problem here is that here we need RELEASE on success and ACQUIRE on failure.
> But no version guarantees any barrier on failure.

Why not?  Do CPU designers not do load-linked-with-acquire-semantics?
Or is it our fault for not using the appropriate instruction?

> So as far as I can tell, the best we can do is use cmpxchg_release() (or
> cmpxchg() which would be stronger but unnecessary), followed by a separate
> ACQUIRE on failure.

OK, but that detail needs to be hidden behind a higher level primitive,
not exposed to device driver writers.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-18  5:28       ` Eric Biggers
  2020-07-18 14:35         ` Alan Stern
  2020-07-20  2:07         ` Dave Chinner
@ 2020-07-27 15:17         ` Alan Stern
  2020-07-27 15:28           ` Matthew Wilcox
  2 siblings, 1 reply; 39+ messages in thread
From: Alan Stern @ 2020-07-27 15:17 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Matthew Wilcox, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Fri, Jul 17, 2020 at 10:28:18PM -0700, Eric Biggers wrote:
> I'm still not sure this is the best API.

I cast my vote for something along the following lines.  It's simple,
easily understood and easily used.  The approach has two variants: One
that returns an integer and one that returns a pointer.  I'll use the
pointer variant to illustrate.

Given a type "T", an object x of type pointer-to-T, and a function
"func" that takes various arguments and returns a pointer-to-T, the
accepted API for calling func once would be to create once_func() as
follows:

T *once_func(T **ppt, args...)
{
	static DEFINE_MUTEX(mut);
	T *p;

	p = smp_load_acquire(ppt);	/* Mild optimization */
	if (p)
		return p;

	mutex_lock(mut);
	p = smp_load_acquire(ppt);
	if (!p) {
		p = func(args...);
		if (!IS_ERR_OR_NULL(p))
			smp_store_release(ppt, p);
	}
	mutex_unlock(mut);
	return p;
}

Users then would have to call once_func(&x, args...) and check the
result.  Different x objects would constitute different "once"
domains.

(In the integer variant, x, p and the return type of func are all int,
and ppt is an int *.  Everything else is the same.  This variant would
be used in cases where you're not allocating anything, you're doing
some other sort of initialization only once.)

While this would be a perfectly good recipe in itself, the whole thing
can be made much simpler for users by creating a MAKE_ONCE_FUNC macro
which would generate once_func given the type T, the name "func", and
the args.  The result is type-safe.

IMO the fact that once_func() is not inline is an advantage, not a
drawback.  Yes, it doesn't actually do any allocation or anything like
that -- the idea is that once_func's purpose is merely to ensure that
func is successfully called only once.  Any memory allocation or other
stuff of that sort should be handled by func.

In fact, the only drawback I can think of is that because this relies
on a single mutex for all the different possible x's, it might lead to
locking conflicts (if func had to call once_func() recursively, for
example).  In most reasonable situations such conflicts would not
arise.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-27 15:17         ` Alan Stern
@ 2020-07-27 15:28           ` Matthew Wilcox
  2020-07-27 16:01             ` Paul E. McKenney
  2020-07-27 16:31             ` Alan Stern
  0 siblings, 2 replies; 39+ messages in thread
From: Matthew Wilcox @ 2020-07-27 15:28 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric Biggers, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Mon, Jul 27, 2020 at 11:17:46AM -0400, Alan Stern wrote:
> Given a type "T", an object x of type pointer-to-T, and a function
> "func" that takes various arguments and returns a pointer-to-T, the
> accepted API for calling func once would be to create once_func() as
> follows:
> 
> T *once_func(T **ppt, args...)
> {
> 	static DEFINE_MUTEX(mut);
> 	T *p;
> 
> 	p = smp_load_acquire(ppt);	/* Mild optimization */
> 	if (p)
> 		return p;
> 
> 	mutex_lock(mut);
> 	p = smp_load_acquire(ppt);
> 	if (!p) {
> 		p = func(args...);
> 		if (!IS_ERR_OR_NULL(p))
> 			smp_store_release(ppt, p);
> 	}
> 	mutex_unlock(mut);
> 	return p;
> }
> 
> Users then would have to call once_func(&x, args...) and check the
> result.  Different x objects would constitute different "once"
> domains.
[...]
> In fact, the only drawback I can think of is that because this relies
> on a single mutex for all the different possible x's, it might lead to
> locking conflicts (if func had to call once_func() recursively, for
> example).  In most reasonable situations such conflicts would not
> arise.

Another drawback for this approach relative to my get_foo() approach
upthread is that, because we don't have compiler support, there's no
enforcement that accesses to 'x' go through once_func().  My approach
wraps accesses in a deliberately-opaque struct so you have to write
some really ugly code to get at the raw value, and it's just easier to
call get_foo().

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-27 15:28           ` Matthew Wilcox
@ 2020-07-27 16:01             ` Paul E. McKenney
  2020-07-27 16:31             ` Alan Stern
  1 sibling, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2020-07-27 16:01 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Alan Stern, Eric Biggers, linux-kernel, linux-arch,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Mon, Jul 27, 2020 at 04:28:27PM +0100, Matthew Wilcox wrote:
> On Mon, Jul 27, 2020 at 11:17:46AM -0400, Alan Stern wrote:
> > Given a type "T", an object x of type pointer-to-T, and a function
> > "func" that takes various arguments and returns a pointer-to-T, the
> > accepted API for calling func once would be to create once_func() as
> > follows:
> > 
> > T *once_func(T **ppt, args...)
> > {
> > 	static DEFINE_MUTEX(mut);
> > 	T *p;
> > 
> > 	p = smp_load_acquire(ppt);	/* Mild optimization */
> > 	if (p)
> > 		return p;
> > 
> > 	mutex_lock(mut);
> > 	p = smp_load_acquire(ppt);
> > 	if (!p) {
> > 		p = func(args...);
> > 		if (!IS_ERR_OR_NULL(p))
> > 			smp_store_release(ppt, p);
> > 	}
> > 	mutex_unlock(mut);
> > 	return p;
> > }
> > 
> > Users then would have to call once_func(&x, args...) and check the
> > result.  Different x objects would constitute different "once"
> > domains.
> [...]
> > In fact, the only drawback I can think of is that because this relies
> > on a single mutex for all the different possible x's, it might lead to
> > locking conflicts (if func had to call once_func() recursively, for
> > example).  In most reasonable situations such conflicts would not
> > arise.
> 
> Another drawback for this approach relative to my get_foo() approach
> upthread is that, because we don't have compiler support, there's no
> enforcement that accesses to 'x' go through once_func().  My approach
> wraps accesses in a deliberately-opaque struct so you have to write
> some really ugly code to get at the raw value, and it's just easier to
> call get_foo().

Could ACCESS_PRIVATE() help in this case?  This relies on sparse rather
than the compiler, but some of the testing services do run sparse
regularly.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-27 15:28           ` Matthew Wilcox
  2020-07-27 16:01             ` Paul E. McKenney
@ 2020-07-27 16:31             ` Alan Stern
  2020-07-27 16:59               ` Matthew Wilcox
  1 sibling, 1 reply; 39+ messages in thread
From: Alan Stern @ 2020-07-27 16:31 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Eric Biggers, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Mon, Jul 27, 2020 at 04:28:27PM +0100, Matthew Wilcox wrote:
> On Mon, Jul 27, 2020 at 11:17:46AM -0400, Alan Stern wrote:
> > Given a type "T", an object x of type pointer-to-T, and a function
> > "func" that takes various arguments and returns a pointer-to-T, the
> > accepted API for calling func once would be to create once_func() as
> > follows:
> > 
> > T *once_func(T **ppt, args...)
> > {
> > 	static DEFINE_MUTEX(mut);
> > 	T *p;
> > 
> > 	p = smp_load_acquire(ppt);	/* Mild optimization */
> > 	if (p)
> > 		return p;
> > 
> > 	mutex_lock(mut);
> > 	p = smp_load_acquire(ppt);
> > 	if (!p) {
> > 		p = func(args...);
> > 		if (!IS_ERR_OR_NULL(p))
> > 			smp_store_release(ppt, p);
> > 	}
> > 	mutex_unlock(mut);
> > 	return p;
> > }
> > 
> > Users then would have to call once_func(&x, args...) and check the
> > result.  Different x objects would constitute different "once"
> > domains.
> [...]
> > In fact, the only drawback I can think of is that because this relies
> > on a single mutex for all the different possible x's, it might lead to
> > locking conflicts (if func had to call once_func() recursively, for
> > example).  In most reasonable situations such conflicts would not
> > arise.
> 
> Another drawback for this approach relative to my get_foo() approach
> upthread is that, because we don't have compiler support, there's no
> enforcement that accesses to 'x' go through once_func().  My approach
> wraps accesses in a deliberately-opaque struct so you have to write
> some really ugly code to get at the raw value, and it's just easier to
> call get_foo().

Something like that could be included in once_func too.  It's relatively 
tangential to the main point I was making, which was to settle on an 
overall API and discuss how it should be described in recipes.txt.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-27 16:31             ` Alan Stern
@ 2020-07-27 16:59               ` Matthew Wilcox
  2020-07-27 19:13                 ` Alan Stern
  0 siblings, 1 reply; 39+ messages in thread
From: Matthew Wilcox @ 2020-07-27 16:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: Eric Biggers, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Mon, Jul 27, 2020 at 12:31:49PM -0400, Alan Stern wrote:
> On Mon, Jul 27, 2020 at 04:28:27PM +0100, Matthew Wilcox wrote:
> > On Mon, Jul 27, 2020 at 11:17:46AM -0400, Alan Stern wrote:
> > > Given a type "T", an object x of type pointer-to-T, and a function
> > > "func" that takes various arguments and returns a pointer-to-T, the
> > > accepted API for calling func once would be to create once_func() as
> > > follows:
> > > 
> > > T *once_func(T **ppt, args...)
> > > {
> > > 	static DEFINE_MUTEX(mut);
> > > 	T *p;
> > > 
> > > 	p = smp_load_acquire(ppt);	/* Mild optimization */
> > > 	if (p)
> > > 		return p;
> > > 
> > > 	mutex_lock(mut);
> > > 	p = smp_load_acquire(ppt);
> > > 	if (!p) {
> > > 		p = func(args...);
> > > 		if (!IS_ERR_OR_NULL(p))
> > > 			smp_store_release(ppt, p);
> > > 	}
> > > 	mutex_unlock(mut);
> > > 	return p;
> > > }
> > > 
> > > Users then would have to call once_func(&x, args...) and check the
> > > result.  Different x objects would constitute different "once"
> > > domains.
> > [...]
> > > In fact, the only drawback I can think of is that because this relies
> > > on a single mutex for all the different possible x's, it might lead to
> > > locking conflicts (if func had to call once_func() recursively, for
> > > example).  In most reasonable situations such conflicts would not
> > > arise.
> > 
> > Another drawback for this approach relative to my get_foo() approach
> > upthread is that, because we don't have compiler support, there's no
> > enforcement that accesses to 'x' go through once_func().  My approach
> > wraps accesses in a deliberately-opaque struct so you have to write
> > some really ugly code to get at the raw value, and it's just easier to
> > call get_foo().
> 
> Something like that could be included in once_func too.  It's relatively 
> tangential to the main point I was making, which was to settle on an 
> overall API and discuss how it should be described in recipes.txt.

Then I think you're trying to describe something which is too complicated
because it's overly general.  I don't think device drivers should contain
"smp_load_acquire" and "smp_store_release".  Most device driver authors
struggle with spinlocks and mutexes.

The once_get() / once_store() API:

struct foo *get_foo(gfp_t gfp)
{
	static struct once_pointer my_foo;
	struct foo *foop;

	foop = once_get(&my_foo);
	if (foop)
		return foop;

	foop = alloc_foo(gfp);
	if (foop && !once_store(&my_foo, foop)) {
		free_foo(foop);
		foop = once_get(&my_foo);
	}

	return foop;
}

is easy to understand.  There's no need to talk about acquire and release
semantics, barriers, reordering, ... it all just works in the obvious way
that it's written.

If you want to put a mutex around all this, you can:

struct foo *get_foo(gfp_t gfp)
{
	static DEFINE_MUTEX(m);
	static struct init_once_pointer my_foo;
	struct foo *foop;

	foop = once_get(&my_foo);
	if (foop)
		return foop;

	mutex_lock(&m);
	foop = once_get(&my_foo);
	if (!foop) {
		foop = alloc_foo(gfp);
		if (foop && !once_store(&my_foo, foop)) {
			free_foo(foop);
			foop = once_get(&my_foo);
		}
	}
	mutex_unlock(&m);

	return foop;
}

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] tools/memory-model: document the "one-time init" pattern
  2020-07-27 16:59               ` Matthew Wilcox
@ 2020-07-27 19:13                 ` Alan Stern
  0 siblings, 0 replies; 39+ messages in thread
From: Alan Stern @ 2020-07-27 19:13 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Eric Biggers, linux-kernel, linux-arch, Paul E . McKenney,
	linux-fsdevel, Akira Yokosawa, Andrea Parri, Boqun Feng,
	Daniel Lustig, Darrick J . Wong, Dave Chinner, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra,
	Will Deacon

On Mon, Jul 27, 2020 at 05:59:17PM +0100, Matthew Wilcox wrote:
> On Mon, Jul 27, 2020 at 12:31:49PM -0400, Alan Stern wrote:
> > On Mon, Jul 27, 2020 at 04:28:27PM +0100, Matthew Wilcox wrote:
> > > On Mon, Jul 27, 2020 at 11:17:46AM -0400, Alan Stern wrote:
> > > > Given a type "T", an object x of type pointer-to-T, and a function
> > > > "func" that takes various arguments and returns a pointer-to-T, the
> > > > accepted API for calling func once would be to create once_func() as
> > > > follows:
> > > > 
> > > > T *once_func(T **ppt, args...)
> > > > {
> > > > 	static DEFINE_MUTEX(mut);
> > > > 	T *p;
> > > > 
> > > > 	p = smp_load_acquire(ppt);	/* Mild optimization */
> > > > 	if (p)
> > > > 		return p;
> > > > 
> > > > 	mutex_lock(mut);
> > > > 	p = smp_load_acquire(ppt);
> > > > 	if (!p) {
> > > > 		p = func(args...);
> > > > 		if (!IS_ERR_OR_NULL(p))
> > > > 			smp_store_release(ppt, p);
> > > > 	}
> > > > 	mutex_unlock(mut);
> > > > 	return p;
> > > > }
> > > > 
> > > > Users then would have to call once_func(&x, args...) and check the
> > > > result.  Different x objects would constitute different "once"
> > > > domains.
> > > [...]
> > > > In fact, the only drawback I can think of is that because this relies
> > > > on a single mutex for all the different possible x's, it might lead to
> > > > locking conflicts (if func had to call once_func() recursively, for
> > > > example).  In most reasonable situations such conflicts would not
> > > > arise.
> > > 
> > > Another drawback for this approach relative to my get_foo() approach
> > > upthread is that, because we don't have compiler support, there's no
> > > enforcement that accesses to 'x' go through once_func().  My approach
> > > wraps accesses in a deliberately-opaque struct so you have to write
> > > some really ugly code to get at the raw value, and it's just easier to
> > > call get_foo().
> > 
> > Something like that could be included in once_func too.  It's relatively 
> > tangential to the main point I was making, which was to settle on an 
> > overall API and discuss how it should be described in recipes.txt.
> 
> Then I think you're trying to describe something which is too complicated
> because it's overly general.  I don't think device drivers should contain
> "smp_load_acquire" and "smp_store_release".  Most device driver authors
> struggle with spinlocks and mutexes.

Then I didn't explain my proposal clearly enough.  It doesn't require 
device driver authors to know anything about smp_load_acquire, 
smp_store_release, spinlocks, or mutexes.

Suppose an author wants to allocate and initialize a struct foo exactly 
once.  Then the driver code would contain something like this:

struct foo *foop;

static struct foo *alloc_foo(gfp_t gfp)
{
	... allocate and initialize ...
}

MAKE_ONCE_FUNC(struct foo, alloc_foo, (gfp_t gfp), (gfp))

The code to use it is:

	struct foo *p = once_alloc_foo(&foop, GFP_KERNEL);

If you don't like the global pointer, encapsulate it as follows:

struct foo *get_foo(grp_t gfp)
{
	static struct foo *foop;

	return once_alloc_foo(&foop, gfp);
}

and have users call get_foo instead of once_alloc_foo.

It's hard to imagine this getting much simpler.

> The once_get() / once_store() API:
> 
> struct foo *get_foo(gfp_t gfp)
> {
> 	static struct once_pointer my_foo;
> 	struct foo *foop;
> 
> 	foop = once_get(&my_foo);
> 	if (foop)
> 		return foop;
> 
> 	foop = alloc_foo(gfp);
> 	if (foop && !once_store(&my_foo, foop)) {
> 		free_foo(foop);
> 		foop = once_get(&my_foo);
> 	}
> 
> 	return foop;
> }
> 
> is easy to understand.  There's no need to talk about acquire and release
> semantics, barriers, reordering, ... it all just works in the obvious way
> that it's written.

The MAKE_ONCE_FUNC API is just as easy to understand and requires less 
boilerplate.  It's type-safe whereas your once_pointer structures 
aren't.  And it's more general, in the sense that it provides a way to 
call a function only once, as opposed to a way to store a pointer only 
once.

Alan Stern

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, back to index

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-17  4:44 [PATCH] tools/memory-model: document the "one-time init" pattern Eric Biggers
2020-07-17  5:49 ` Sedat Dilek
2020-07-17 12:35 ` Matthew Wilcox
2020-07-17 14:26 ` Alan Stern
2020-07-17 17:47 ` Matthew Wilcox
2020-07-17 17:51   ` Alan Stern
2020-07-18  1:02     ` Eric Biggers
2020-07-27 12:51       ` Matthew Wilcox
2020-07-17 21:05   ` Darrick J. Wong
2020-07-18  0:44   ` Darrick J. Wong
2020-07-18  1:38   ` Eric Biggers
2020-07-18  2:13     ` Matthew Wilcox
2020-07-18  5:28       ` Eric Biggers
2020-07-18 14:35         ` Alan Stern
2020-07-20  2:07         ` Dave Chinner
2020-07-20  9:00           ` Peter Zijlstra
2020-07-27 15:17         ` Alan Stern
2020-07-27 15:28           ` Matthew Wilcox
2020-07-27 16:01             ` Paul E. McKenney
2020-07-27 16:31             ` Alan Stern
2020-07-27 16:59               ` Matthew Wilcox
2020-07-27 19:13                 ` Alan Stern
2020-07-17 20:53 ` Darrick J. Wong
2020-07-18  0:58   ` Eric Biggers
2020-07-18  1:25     ` Alan Stern
2020-07-18  1:40       ` Matthew Wilcox
2020-07-18  2:00       ` Dave Chinner
2020-07-18 14:21         ` Alan Stern
2020-07-18  2:00       ` Eric Biggers
2020-07-18  1:42 ` Dave Chinner
2020-07-18 14:08   ` Alan Stern
2020-07-20  1:33     ` Dave Chinner
2020-07-20 14:52       ` Alan Stern
2020-07-20 15:37         ` Darrick J. Wong
2020-07-20 15:39         ` Matthew Wilcox
2020-07-20 16:04           ` Paul E. McKenney
2020-07-20 16:48             ` peterz
2020-07-20 22:06               ` Paul E. McKenney
2020-07-20 16:12           ` Alan Stern

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lkml.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lkml.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git