LKML Archive on lore.kernel.org
From: Trond Myklebust <email@example.com>
To: "firstname.lastname@example.org" <email@example.com>
Cc: "firstname.lastname@example.org" <email@example.com>
Subject: Re: Canvassing for network filesystem write size vs page size
Date: Thu, 5 Aug 2021 17:43:09 +0000
Message-ID: <firstname.lastname@example.org>
In-Reply-To: <CAHk-=wjyEk9EuYgE3nBnRCRd_AmRYVOGACEjt0X33QnORd5email@example.com>

On Thu, 2021-08-05 at 10:27 -0700, Linus Torvalds wrote:
> On Thu, Aug 5, 2021 at 9:36 AM David Howells <firstname.lastname@example.org> wrote:
> >
> > Some network filesystems, however, currently keep track of which
> > byte ranges are modified within a dirty page (AFS does; NFS seems
> > to also) and only write out the modified data.
>
> NFS definitely does. I haven't used NFS in two decades, but I worked
> on some of the code (read: I made nfs use the page cache both for
> reading and writing) back in my Transmeta days, because NFSv2 was the
> default filesystem setup back then.
>
> See fs/nfs/write.c, although I have to admit that I don't recognize
> that code any more.
>
> It's fairly important to be able to do streaming writes without
> having to read the old contents for some loads. And read-modify-write
> cycles are death for performance, so you really want to coalesce
> writes until you have the whole page.
>
> That said, I suspect it's also *very* filesystem-specific, to the
> point where it might not be worth trying to do in some generic
> manner.
>
> In particular, NFS had things like interesting credential issues, so
> if you have multiple concurrent writers that used different 'struct
> file *' to write to the file, you can't just mix the writes. You have
> to sync the writes from one writer before you start the writes for
> the next one, because one might succeed and the other not.
>
> So you can't just treat it as some random "page cache with dirty byte
> extents". You really have to be careful about credentials, timeouts,
> etc, and the pending writes have to keep a fair amount of state
> around.
>
> At least that was the case two decades ago.
>
> [ goes off and looks. See "nfs_write_begin()" and friends in
> fs/nfs/file.c for some of the examples of these things, although it
> looks like the code is less aggressive about avoiding the
> read-modify-write case than I thought I remembered, and only does it
> for write-only opens ]
>

All correct. However, there is also the issue that even if we have done
a read-modify-write, we can't always extend the write to cover the
entire page. If you look at nfs_can_extend_write(), you'll note that we
don't extend the page data if the file is range locked, if the
attributes have not been revalidated, or if the page cache contents are
suspected to be invalid for some other reason.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
email@example.com
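To make the two constraints discussed above concrete, here is a small user-space sketch in C. It is not the NFS client code: struct dirty_req, struct file_state, record_write(), can_extend() and extend_to_page() are all hypothetical names invented for illustration, and a plain integer writer_id stands in for the credentials attached to a 'struct file *'. The sketch assumes one pending dirty byte range per page, assumes that a write from a different writer or a non-adjacent write forces a flush before it can be recorded, and assumes that extending to the whole page also requires the rest of the page to hold valid data.

```c
#include <stdbool.h>

#define PAGE_SZ 4096u /* assumed page size for this sketch */

/* Hypothetical per-page record of one pending dirty byte range. */
struct dirty_req {
    unsigned int start; /* first dirty byte within the page */
    unsigned int end;   /* one past the last dirty byte */
    int writer_id;      /* stand-in for the writer's credentials */
    bool pending;       /* true once a range has been recorded */
};

/* Hypothetical summary of the conditions named in the reply above. */
struct file_state {
    bool range_locked;      /* file has byte-range locks */
    bool attrs_revalidated; /* cached attributes are known to be fresh */
    bool cache_suspect;     /* page cache contents may be invalid */
    bool page_uptodate;     /* assumption: rest of the page holds valid data */
};

enum merge_result { MERGED, NEED_FLUSH };

/*
 * Record a write of len bytes at offset off within the page.
 * Returns NEED_FLUSH when the pending range must be written out first:
 * either the writer differs (writes must not be mixed) or the new range
 * neither overlaps nor touches the pending one.
 */
static enum merge_result record_write(struct dirty_req *req, unsigned int off,
                                      unsigned int len, int writer_id)
{
    unsigned int end = off + len;

    if (!req->pending) {
        req->start = off;
        req->end = end;
        req->writer_id = writer_id;
        req->pending = true;
        return MERGED;
    }
    if (req->writer_id != writer_id || off > req->end || end < req->start)
        return NEED_FLUSH;

    /* Same writer and adjacent or overlapping: grow the dirty range. */
    if (off < req->start)
        req->start = off;
    if (end > req->end)
        req->end = end;
    return MERGED;
}

/* The dirty range may only grow to the full page if nothing blocks it. */
static bool can_extend(const struct file_state *st)
{
    return st->page_uptodate && !st->range_locked &&
           st->attrs_revalidated && !st->cache_suspect;
}

/* Extend the pending range to cover the whole page when that is allowed. */
static void extend_to_page(struct dirty_req *req, const struct file_state *st)
{
    if (req->pending && can_extend(st)) {
        req->start = 0;
        req->end = PAGE_SZ;
    }
}
```

In this model a caller that gets NEED_FLUSH would write out the pending range, clear it, and call record_write() again. When extension is refused, only the bytes in [start, end) go out, which is the partial-page write case the thread is asking about.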
Thread overview: 19+ messages

2021-08-05 10:19 Could it be made possible to offer "supplementary" data to a DIO write ? David Howells
2021-08-05 12:37 ` Matthew Wilcox
2021-08-05 13:07 ` David Howells
2021-08-05 13:35 ` Matthew Wilcox
2021-08-05 14:38 ` David Howells
2021-08-05 15:06 ` Matthew Wilcox
2021-08-05 15:38 ` David Howells
2021-08-05 16:35 ` Canvassing for network filesystem write size vs page size David Howells
2021-08-05 17:27 ` Linus Torvalds
2021-08-05 17:43 ` Trond Myklebust [this message]
2021-08-05 22:11 ` Matthew Wilcox
2021-08-06 13:42 ` David Howells
2021-08-06 14:17 ` Matthew Wilcox
2021-08-06 15:04 ` David Howells
2021-08-05 17:52 ` Adam Borowski
2021-08-05 18:50 ` Jeff Layton
2021-08-05 23:47 ` Matthew Wilcox
2021-08-06 13:44 ` David Howells
2021-08-05 17:45 ` Could it be made possible to offer "supplementary" data to a DIO write ? Adam Borowski