Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Eric Biggers <ebiggers@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-ext4@vger.kernel.org
Subject: Re: [PATCH] fs/direct-io: avoid data race on ->s_dio_done_wq
Date: Wed, 15 Jul 2020 11:30:08 +1000
Message-ID: <20200715013008.GD2005@dread.disaster.area> (raw)
In-Reply-To: <20200713033330.205104-1-ebiggers@kernel.org>

On Sun, Jul 12, 2020 at 08:33:30PM -0700, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
> 
> Fix the preliminary checks for ->s_dio_done_wq to use READ_ONCE(), since
> it's a data race, and technically the behavior is undefined without
> READ_ONCE().  Also, on one CPU architecture (Alpha), the data read
> dependency barrier included in READ_ONCE() is needed to guarantee that
> the pointed-to struct is seen as fully initialized.
> 
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  fs/direct-io.c       | 8 +++-----
>  fs/internal.h        | 9 ++++++++-
>  fs/iomap/direct-io.c | 3 +--
>  3 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/direct-io.c b/fs/direct-io.c
> index 6d5370eac2a8..26221ae24156 100644
> --- a/fs/direct-io.c
> +++ b/fs/direct-io.c
> @@ -590,7 +590,7 @@ static inline int dio_bio_reap(struct dio *dio, struct dio_submit *sdio)
>   * filesystems that don't need it and also allows us to create the workqueue
>   * late enough so the we can include s_id in the name of the workqueue.
>   */
> -int sb_init_dio_done_wq(struct super_block *sb)
> +int __sb_init_dio_done_wq(struct super_block *sb)
>  {
>  	struct workqueue_struct *old;
>  	struct workqueue_struct *wq = alloc_workqueue("dio/%s",
> @@ -615,9 +615,7 @@ static int dio_set_defer_completion(struct dio *dio)
>  	if (dio->defer_completion)
>  		return 0;
>  	dio->defer_completion = true;
> -	if (!sb->s_dio_done_wq)
> -		return sb_init_dio_done_wq(sb);
> -	return 0;
> +	return sb_init_dio_done_wq(sb);
>  }
>  
>  /*
> @@ -1250,7 +1248,7 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
>  		retval = 0;
>  		if (iocb->ki_flags & IOCB_DSYNC)
>  			retval = dio_set_defer_completion(dio);
> -		else if (!dio->inode->i_sb->s_dio_done_wq) {
> +		else {
>  			/*
>  			 * In case of AIO write racing with buffered read we
>  			 * need to defer completion. We can't decide this now,
> diff --git a/fs/internal.h b/fs/internal.h
> index 9b863a7bd708..6736c9eee978 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -178,7 +178,14 @@ extern void mnt_pin_kill(struct mount *m);
>  extern const struct dentry_operations ns_dentry_operations;
>  
>  /* direct-io.c: */
> -int sb_init_dio_done_wq(struct super_block *sb);
> +int __sb_init_dio_done_wq(struct super_block *sb);
> +static inline int sb_init_dio_done_wq(struct super_block *sb)
> +{
> +	/* pairs with cmpxchg() in __sb_init_dio_done_wq() */
> +	if (likely(READ_ONCE(sb->s_dio_done_wq)))
> +		return 0;
> +	return __sb_init_dio_done_wq(sb);
> +}

Ummm, why don't you just add this check in sb_init_dio_done_wq(). I
don't see any need for adding another level of function call
abstraction in the source code?

Also, you need to explain the reason for the READ_ONCE() existing
rather than just saying "it pairs with <some other operation>".
Knowing what operation it pairs with doesn't explain why the pairing
is necessary in the first place, and that leads to nobody reading
the code being able to understand what this is protecting against.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply index

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-13  3:33 Eric Biggers
2020-07-15  1:30 ` Dave Chinner [this message]
2020-07-15  2:37   ` Eric Biggers
2020-07-15  8:01     ` Dave Chinner
2020-07-15 16:13       ` Eric Biggers
2020-07-15 16:41         ` Darrick J. Wong
2020-07-16  1:46         ` Dave Chinner
2020-07-16  2:39           ` Eric Biggers
2020-07-16  2:47           ` Matthew Wilcox
2020-07-16  3:19             ` Eric Biggers
2020-07-16  5:33             ` Eric Biggers
2020-07-16  8:16               ` Dave Chinner
2020-07-17  1:36                 ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200715013008.GD2005@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=ebiggers@kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lkml.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lkml.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git