Linux-Fsdevel Archive on lore.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Jan Kara <jack@suse.cz>
Cc: Mikulas Patocka <mpatocka@redhat.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	Eric Sandeen <esandeen@redhat.com>,
	Dave Chinner <dchinner@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: the "read" syscall sees partial effects of the "write" syscall
Date: Mon, 21 Sep 2020 09:41:03 +1000
Message-ID: <20200920234103.GX12096@dread.disaster.area>
In-Reply-To: <20200918131317.GH18920@quack2.suse.cz>

On Fri, Sep 18, 2020 at 03:13:17PM +0200, Jan Kara wrote:
> On Fri 18-09-20 08:25:28, Mikulas Patocka wrote:
> > I'd like to ask about this problem: when we write to a file, the kernel 
> > takes the write inode lock. When we read from a file, no lock is taken - 
> > thus the read syscall can read data that are halfway modified by the write 
> > syscall.
> > 
> > The standard specifies the effects of the write syscall are atomic - see 
> > this:
> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_07
> 
> Yes, but no Linux filesystem (except for XFS AFAIK) follows the POSIX spec
> in this regard. Mostly because the mixed read-write performance sucks when
> you follow it (not that it would absolutely have to suck - you can use
> clever locking with range locks but nobody does it currently). In practice,
> the read-write atomicity works on Linux only on per-page basis for buffered
> IO.

We come across this from time to time with POSIX-compliant
applications being ported from other Unixes that rely on a write
from one thread being seen atomically by a read from another
thread. There are quite a few custom enterprise apps around that
rely on this POSIX behaviour, especially ones that came from
Unixes that actually provided POSIX-compliant behaviour.

IOWs, from an upstream POV, POSIX atomic write behaviour doesn't
matter very much. From an enterprise distro POV it's often a
different story....
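
For anyone who wants to see whether a given filesystem shows the
behaviour discussed above, here is a minimal userspace sketch. It is
not from this thread and is only an illustration: the names is_torn()
and count_torn_reads(), the 64 KiB buffer size and the iteration count
are all assumptions. One thread repeatedly overwrites the same range
with a buffer filled with a single byte value, while another thread
reads that range and counts reads that contain more than one value. A
non-zero count suggests that read saw a partially applied write; a
zero count proves nothing, since the race may simply not have been hit.

```python
import os
import tempfile
import threading

# Assumption: 64 KiB spans several pages on common configurations, so a
# read that is only atomic per page has a chance to observe a mix.
BUF_SIZE = 64 * 1024


def is_torn(buf: bytes) -> bool:
    """Return True if buf holds more than one distinct byte value."""
    return len(set(buf)) > 1


def count_torn_reads(path: str, iterations: int = 2000) -> int:
    """Race one writer thread against a reader; return the torn-read count."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    # Start from a uniform file so the first reads are well defined.
    os.pwrite(fd, bytes([0]) * BUF_SIZE, 0)
    stop = threading.Event()

    def writer() -> None:
        value = 0
        while not stop.is_set():
            value = (value + 1) % 256
            # Each write fills the whole range with one byte value.
            # Short writes are not handled; this is only a sketch.
            os.pwrite(fd, bytes([value]) * BUF_SIZE, 0)

    thread = threading.Thread(target=writer)
    thread.start()
    torn = 0
    try:
        for _ in range(iterations):
            data = os.pread(fd, BUF_SIZE, 0)
            if len(data) == BUF_SIZE and is_torn(data):
                torn += 1
    finally:
        stop.set()
        thread.join()
        os.close(fd)
    return torn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        count = count_torn_reads(os.path.join(tmpdir, "atomic-test"))
        print(f"torn reads: {count}")
```

To test a particular filesystem, pass count_torn_reads() a path on that
filesystem instead of a temporary directory. The result depends on
timing, so it will vary from run to run.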

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

