Linux-Fsdevel Archive on lore.kernel.org help / color / mirror / Atom feed
From: "Matthew Wilcox (Oracle)" <willy@infradead.org> To: linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>, linux-mm@kvack.org, v9fs-developer@lists.sourceforge.net, linux-kernel@vger.kernel.org, linux-afs@lists.infradead.org, ceph-devel@vger.kernel.org, linux-cifs@vger.kernel.org, ecryptfs@vger.kernel.org, linux-um@lists.infradead.org, linux-mtd@lists.infradead.org, Richard Weinberger <richard@nod.at> Subject: [PATCH 01/13] mm: Add AOP_UPDATED_PAGE return value Date: Thu, 17 Sep 2020 16:10:38 +0100 [thread overview] Message-ID: <20200917151050.5363-2-willy@infradead.org> (raw) In-Reply-To: <20200917151050.5363-1-willy@infradead.org> Allow synchronous ->readpage implementations to execute more efficiently by skipping the re-locking of the page. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> --- Documentation/filesystems/locking.rst | 7 ++++--- Documentation/filesystems/vfs.rst | 21 ++++++++++++++------- include/linux/fs.h | 5 +++++ mm/filemap.c | 12 ++++++++++-- 4 files changed, 33 insertions(+), 12 deletions(-) diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index 64f94a18d97e..06a7a8bf2362 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -269,7 +269,7 @@ locking rules: ops PageLocked(page) i_rwsem ====================== ======================== ========= writepage: yes, unlocks (see below) -readpage: yes, unlocks +readpage: yes, may unlock writepages: set_page_dirty no readahead: yes, unlocks @@ -294,8 +294,9 @@ swap_deactivate: no ->write_begin(), ->write_end() and ->readpage() may be called from the request handler (/dev/loop). -->readpage() unlocks the page, either synchronously or via I/O -completion. +->readpage() may return AOP_UPDATED_PAGE if the page is now Uptodate +or 0 if the page will be unlocked asynchronously by I/O completion. +If it returns -errno, it should unlock the page. ->readahead() unlocks the pages that I/O is attempted on like ->readpage(). diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index ca52c82e5bb5..16248c299aaa 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -643,7 +643,7 @@ set_page_dirty to write data into the address_space, and writepage and writepages to writeback data to storage. Adding and removing pages to/from an address_space is protected by the -inode's i_mutex. +inode's i_rwsem held exclusively. When data is written to a page, the PG_Dirty flag should be set. It typically remains set until writepage asks for it to be written. This @@ -757,12 +757,19 @@ cache in your filesystem. The following members are defined: ``readpage`` called by the VM to read a page from backing store. The page - will be Locked when readpage is called, and should be unlocked - and marked uptodate once the read completes. If ->readpage - discovers that it needs to unlock the page for some reason, it - can do so, and then return AOP_TRUNCATED_PAGE. In this case, - the page will be relocated, relocked and if that all succeeds, - ->readpage will be called again. + will be Locked and !Uptodate when readpage is called. Ideally, + the filesystem will bring the page Uptodate and return + AOP_UPDATED_PAGE. If the filesystem encounters an error, it + should unlock the page and return a negative errno without marking + the page Uptodate. It does not need to mark the page as Error. + If the filesystem returns 0, this means the page will be unlocked + asynchronously by I/O completion. The VFS will wait for the + page to be unlocked, so there is no advantage to executing this + operation asynchronously. + + The filesystem can also return AOP_TRUNCATED_PAGE to indicate + that it had to unlock the page to avoid a deadlock. The caller + will re-check the page cache and call ->readpage again. ``writepages`` called by the VM to write out pages associated with the diff --git a/include/linux/fs.h b/include/linux/fs.h index e019ea2f1347..6fc650050d20 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -273,6 +273,10 @@ struct iattr { * reference, it should drop it before retrying. Returned * by readpage(). * + * @AOP_UPDATED_PAGE: The readpage method has brought the page Uptodate + * without releasing the page lock. This is suitable for synchronous + * implementations of readpage. + * * address_space_operation functions return these large constants to indicate * special semantics to the caller. These are much larger than the bytes in a * page to allow for functions that return the number of bytes operated on in a @@ -282,6 +286,7 @@ struct iattr { enum positive_aop_returns { AOP_WRITEPAGE_ACTIVATE = 0x80000, AOP_TRUNCATED_PAGE = 0x80001, + AOP_UPDATED_PAGE = 0x80002, }; #define AOP_FLAG_CONT_EXPAND 0x0001 /* called from cont_expand */ diff --git a/mm/filemap.c b/mm/filemap.c index 1aaea26556cc..131a2aaa1537 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2254,8 +2254,10 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb, * PG_error will be set again if readpage fails. */ ClearPageError(page); - /* Start the actual read. The read will unlock the page. */ + /* Start the actual read. The read may unlock the page. */ error = mapping->a_ops->readpage(filp, page); + if (error == AOP_UPDATED_PAGE) + goto page_ok; if (unlikely(error)) { if (error == AOP_TRUNCATED_PAGE) { @@ -2619,7 +2621,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf) */ if (unlikely(!PageUptodate(page))) goto page_not_uptodate; - +page_ok: /* * We've made it this far and we had to drop our mmap_lock, now is the * time to return to the upper layer and have it re-find the vma and @@ -2654,6 +2656,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf) ClearPageError(page); fpin = maybe_unlock_mmap_for_io(vmf, fpin); error = mapping->a_ops->readpage(file, page); + if (error == AOP_UPDATED_PAGE) + goto page_ok; if (!error) { wait_on_page_locked(page); if (!PageUptodate(page)) @@ -2867,6 +2871,10 @@ static struct page *do_read_cache_page(struct address_space *mapping, err = filler(data, page); else err = mapping->a_ops->readpage(data, page); + if (err == AOP_UPDATED_PAGE) { + unlock_page(page); + goto out; + } if (err < 0) { put_page(page); -- 2.28.0
next prev parent reply other threads:[~2020-09-17 15:13 UTC|newest] Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-09-17 15:10 [PATCH 00/13] Allow readpage to return a locked page Matthew Wilcox (Oracle) 2020-09-17 15:10 ` Matthew Wilcox (Oracle) [this message] 2020-09-17 22:03 ` [PATCH 01/13] mm: Add AOP_UPDATED_PAGE return value Matthew Wilcox 2020-09-17 15:10 ` [PATCH 02/13] 9p: Tell the VFS that readpage was synchronous Matthew Wilcox (Oracle) 2020-09-18 5:59 ` [V9fs-developer] " Dominique Martinet 2020-09-18 11:19 ` Matthew Wilcox 2020-09-18 12:30 ` Dominique Martinet 2020-09-17 15:10 ` [PATCH 03/13] afs: " Matthew Wilcox (Oracle) 2020-09-17 15:10 ` [PATCH 04/13] ceph: " Matthew Wilcox (Oracle) 2020-09-17 16:49 ` Jeff Layton 2020-09-17 15:10 ` [PATCH 05/13] cifs: " Matthew Wilcox (Oracle) 2020-09-17 15:10 ` [PATCH 06/13] cramfs: " Matthew Wilcox (Oracle) 2020-09-17 15:10 ` [PATCH 07/13] ecryptfs: " Matthew Wilcox (Oracle) 2020-09-17 15:10 ` [PATCH 08/13] fuse: " Matthew Wilcox (Oracle) 2020-09-17 15:10 ` [PATCH 09/13] hostfs: " Matthew Wilcox (Oracle) 2020-09-17 15:10 ` [PATCH 10/13] jffs2: " Matthew Wilcox (Oracle) 2020-09-17 15:10 ` [PATCH 11/13] ubifs: " Matthew Wilcox (Oracle) 2020-09-17 20:46 ` Richard Weinberger 2020-09-17 15:10 ` [PATCH 12/13] udf: " Matthew Wilcox (Oracle) 2020-09-24 9:00 ` Jan Kara 2020-09-17 15:10 ` [PATCH 13/13] vboxsf: " Matthew Wilcox (Oracle) 2020-09-17 22:56 ` [PATCH 14/13] iomap: Inline iomap_iop_set_range_uptodate into its one caller Matthew Wilcox (Oracle) 2020-09-17 22:56 ` [PATCH 15/13] iomap: Inline iomap_read_finish " Matthew Wilcox (Oracle) 2020-09-19 6:31 ` Christoph Hellwig 2020-09-17 22:56 ` [PATCH 16/13] iomap: Make readpage synchronous Matthew Wilcox (Oracle) 2020-09-19 6:39 ` Christoph Hellwig 2020-09-19 6:43 ` Christoph Hellwig 2020-09-19 17:03 ` Matthew Wilcox 2020-09-19 17:10 ` Matthew Wilcox 2020-09-19 6:31 ` [PATCH 14/13] iomap: Inline iomap_iop_set_range_uptodate into its one caller Christoph Hellwig
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200917151050.5363-2-willy@infradead.org \ --to=willy@infradead.org \ --cc=ceph-devel@vger.kernel.org \ --cc=ecryptfs@vger.kernel.org \ --cc=linux-afs@lists.infradead.org \ --cc=linux-cifs@vger.kernel.org \ --cc=linux-fsdevel@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=linux-mtd@lists.infradead.org \ --cc=linux-um@lists.infradead.org \ --cc=richard@nod.at \ --cc=v9fs-developer@lists.sourceforge.net \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).