Linux-Fsdevel Archive on lore.kernel.org
* [RFC v2 0/2] Fix gfs2 readahead deadlocks
@ 2020-07-03  9:53 Andreas Gruenbacher
  2020-07-03  9:53 ` [RFC v2 1/2] fs: Add IOCB_NOIO flag for generic_file_read_iter Andreas Gruenbacher
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Andreas Gruenbacher @ 2020-07-03  9:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matthew Wilcox, Dave Chinner, linux-fsdevel, linux-mm,
	linux-kernel, Andreas Gruenbacher

Here's an improved version.  If the IOCB_NOIO flag can be added right
away, we can just fix the locking in gfs2.

Thanks,
Andreas

Andreas Gruenbacher (2):
  fs: Add IOCB_NOIO flag for generic_file_read_iter
  gfs2: Rework read and page fault locking

 fs/gfs2/aops.c     | 45 +--------------------------------------
 fs/gfs2/file.c     | 52 ++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/fs.h |  1 +
 mm/filemap.c       | 17 +++++++++++++--
 4 files changed, 67 insertions(+), 48 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC v2 1/2] fs: Add IOCB_NOIO flag for generic_file_read_iter
  2020-07-03  9:53 [RFC v2 0/2] Fix gfs2 readahead deadlocks Andreas Gruenbacher
@ 2020-07-03  9:53 ` Andreas Gruenbacher
  2020-07-03 11:41   ` Matthew Wilcox
  2020-07-03  9:53 ` [RFC v2 2/2] gfs2: Rework read and page fault locking Andreas Gruenbacher
  2020-07-03 19:24 ` [RFC v2 0/2] Fix gfs2 readahead deadlocks Linus Torvalds
  2 siblings, 1 reply; 9+ messages in thread
From: Andreas Gruenbacher @ 2020-07-03  9:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matthew Wilcox, Dave Chinner, linux-fsdevel, linux-mm,
	linux-kernel, Andreas Gruenbacher

Add an IOCB_NOIO flag that indicates to generic_file_read_iter that it
shouldn't trigger any filesystem I/O for the actual request or for
readahead.  This allows filesystems that support it to do tentative
reads out of the page cache, and to take the appropriate locks and retry
the reads only if the requested pages are not cached.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 include/linux/fs.h |  1 +
 mm/filemap.c       | 17 +++++++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3f881a892ea7..1ab2ea19e883 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -315,6 +315,7 @@ enum rw_hint {
 #define IOCB_SYNC		(1 << 5)
 #define IOCB_WRITE		(1 << 6)
 #define IOCB_NOWAIT		(1 << 7)
+#define IOCB_NOIO		(1 << 8)
 
 struct kiocb {
 	struct file		*ki_filp;
diff --git a/mm/filemap.c b/mm/filemap.c
index f0ae9a6308cb..22f7ff2d369e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2028,7 +2028,7 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
 
 		page = find_get_page(mapping, index);
 		if (!page) {
-			if (iocb->ki_flags & IOCB_NOWAIT)
+			if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_NOIO))
 				goto would_block;
 			page_cache_sync_readahead(mapping,
 					ra, filp,
@@ -2038,6 +2038,10 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
 				goto no_cached_page;
 		}
 		if (PageReadahead(page)) {
+			if (iocb->ki_flags & IOCB_NOIO) {
+				put_page(page);
+				goto out;
+			}
 			page_cache_async_readahead(mapping,
 					ra, filp, page,
 					index, last_index - index);
@@ -2249,9 +2253,18 @@ EXPORT_SYMBOL_GPL(generic_file_buffered_read);
  *
  * This is the "read_iter()" routine for all filesystems
  * that can use the page cache directly.
+ *
+ * The IOCB_NOWAIT flag in iocb->ki_flags indicates that -EAGAIN shall
+ * be returned when no data can be read without waiting for I/O requests
+ * to complete; it doesn't prevent readahead.
+ *
+ * The IOCB_NOIO flag in iocb->ki_flags indicates that -EAGAIN shall be
+ * returned when no data can be read without issuing new I/O requests,
+ * and 0 shall be returned when readahead would have been triggered.
+ *
  * Return:
  * * number of bytes copied, even for partial reads
- * * negative error code if nothing was read
+ * * negative error code (or 0 if IOCB_NOIO) if nothing was read
  */
 ssize_t
 generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
-- 
2.26.2


^ permalink raw reply	[flat|nested] 9+ messages in thread
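
The per-page decision that patch 1/2 adds to generic_file_buffered_read
can be modelled in user space.  What follows is only a sketch under
stated assumptions, not kernel code: the two flag values are copied from
the patch, while enum page_action, classify_page() and
classify_page_examples() are hypothetical names introduced for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values copied from the patch; all other names are illustrative. */
#define IOCB_NOWAIT	(1 << 7)
#define IOCB_NOIO	(1 << 8)

enum page_action {
	COPY_PAGE,		/* cached: copy the data to the caller */
	COPY_AND_READAHEAD,	/* cached: also start async readahead */
	SYNC_READAHEAD,		/* not cached: I/O may be issued */
	WOULD_BLOCK,		/* not cached: give up with -EAGAIN */
	STOP_SHORT,		/* readahead marker seen under IOCB_NOIO */
};

/* Model of the checks around find_get_page() and PageReadahead(). */
static enum page_action classify_page(int ki_flags, bool cached,
				      bool readahead_mark)
{
	if (!cached) {
		/* Both flags refuse to start a synchronous read. */
		if (ki_flags & (IOCB_NOWAIT | IOCB_NOIO))
			return WOULD_BLOCK;
		return SYNC_READAHEAD;
	}
	if (readahead_mark) {
		/* Only IOCB_NOIO also refuses to start async readahead. */
		if (ki_flags & IOCB_NOIO)
			return STOP_SHORT;
		return COPY_AND_READAHEAD;
	}
	return COPY_PAGE;
}

/* A few spot checks that double as usage examples. */
static void classify_page_examples(void)
{
	assert(classify_page(0, false, false) == SYNC_READAHEAD);
	assert(classify_page(IOCB_NOWAIT, true, true) == COPY_AND_READAHEAD);
	assert(classify_page(IOCB_NOIO, true, true) == STOP_SHORT);
}
```

The model makes the difference between the two flags visible: both
refuse to read an uncached page, but only IOCB_NOIO also stops at a
page carrying the readahead marker.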

* [RFC v2 2/2] gfs2: Rework read and page fault locking
  2020-07-03  9:53 [RFC v2 0/2] Fix gfs2 readahead deadlocks Andreas Gruenbacher
  2020-07-03  9:53 ` [RFC v2 1/2] fs: Add IOCB_NOIO flag for generic_file_read_iter Andreas Gruenbacher
@ 2020-07-03  9:53 ` Andreas Gruenbacher
  2020-07-03 11:38   ` Matthew Wilcox
  2020-07-03 19:24 ` [RFC v2 0/2] Fix gfs2 readahead deadlocks Linus Torvalds
  2 siblings, 1 reply; 9+ messages in thread
From: Andreas Gruenbacher @ 2020-07-03  9:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matthew Wilcox, Dave Chinner, linux-fsdevel, linux-mm,
	linux-kernel, Andreas Gruenbacher

So far, gfs2 has taken the inode glocks inside the ->readpage and
->readahead address space operations.  Since commit d4388340ae0b ("fs:
convert mpage_readpages to mpage_readahead"), gfs2_readahead is passed
the pages to read ahead locked.  With that, the current holder of the
inode glock may be trying to lock one of those pages while
gfs2_readahead is trying to take the inode glock, resulting in a
deadlock.

Fix that by moving the lock taking to the higher-level ->read_iter file
and ->fault vm operations.  This also gets rid of an ugly lock inversion
workaround in gfs2_readpage.

The cache consistency model of filesystems like gfs2 is such that if
data is found in the page cache, the data is up to date and can be used
without taking any filesystem locks.  If a page is not cached,
filesystem locks must be taken before populating the page cache.

To avoid taking the inode glock when the data is already cached,
gfs2_file_read_iter first tries to read the data with the IOCB_NOIO flag
set.  If that fails, the inode glock is taken and the operation is
retried with the IOCB_NOIO flag cleared.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/aops.c | 45 +------------------------------------------
 fs/gfs2/file.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 51 insertions(+), 46 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 72c9560f4467..68cd700a2719 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -468,21 +468,10 @@ static int stuffed_readpage(struct gfs2_inode *ip, struct page *page)
 }
 
 
-/**
- * __gfs2_readpage - readpage
- * @file: The file to read a page for
- * @page: The page to read
- *
- * This is the core of gfs2's readpage. It's used by the internal file
- * reading code as in that case we already hold the glock. Also it's
- * called by gfs2_readpage() once the required lock has been granted.
- */
-
 static int __gfs2_readpage(void *file, struct page *page)
 {
 	struct gfs2_inode *ip = GFS2_I(page->mapping->host);
 	struct gfs2_sbd *sdp = GFS2_SB(page->mapping->host);
-
 	int error;
 
 	if (i_blocksize(page->mapping->host) == PAGE_SIZE &&
@@ -505,36 +494,11 @@ static int __gfs2_readpage(void *file, struct page *page)
  * gfs2_readpage - read a page of a file
  * @file: The file to read
  * @page: The page of the file
- *
- * This deals with the locking required. We have to unlock and
- * relock the page in order to get the locking in the right
- * order.
  */
 
 static int gfs2_readpage(struct file *file, struct page *page)
 {
-	struct address_space *mapping = page->mapping;
-	struct gfs2_inode *ip = GFS2_I(mapping->host);
-	struct gfs2_holder gh;
-	int error;
-
-	unlock_page(page);
-	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
-	error = gfs2_glock_nq(&gh);
-	if (unlikely(error))
-		goto out;
-	error = AOP_TRUNCATED_PAGE;
-	lock_page(page);
-	if (page->mapping == mapping && !PageUptodate(page))
-		error = __gfs2_readpage(file, page);
-	else
-		unlock_page(page);
-	gfs2_glock_dq(&gh);
-out:
-	gfs2_holder_uninit(&gh);
-	if (error && error != AOP_TRUNCATED_PAGE)
-		lock_page(page);
-	return error;
+	return __gfs2_readpage(file, page);
 }
 
 /**
@@ -598,16 +562,9 @@ static void gfs2_readahead(struct readahead_control *rac)
 {
 	struct inode *inode = rac->mapping->host;
 	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_holder gh;
 
-	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
-	if (gfs2_glock_nq(&gh))
-		goto out_uninit;
 	if (!gfs2_is_stuffed(ip))
 		mpage_readahead(rac, gfs2_block_map);
-	gfs2_glock_dq(&gh);
-out_uninit:
-	gfs2_holder_uninit(&gh);
 }
 
 /**
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index fe305e4bfd37..bebde537ac8c 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -558,8 +558,29 @@ static vm_fault_t gfs2_page_mkwrite(struct vm_fault *vmf)
 	return block_page_mkwrite_return(ret);
 }
 
+static vm_fault_t gfs2_fault(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_holder gh;
+	vm_fault_t ret;
+	int err;
+
+	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
+	err = gfs2_glock_nq(&gh);
+	if (err) {
+		ret = block_page_mkwrite_return(err);
+		goto out_uninit;
+	}
+	ret = filemap_fault(vmf);
+	gfs2_glock_dq(&gh);
+out_uninit:
+	gfs2_holder_uninit(&gh);
+	return ret;
+}
+
 static const struct vm_operations_struct gfs2_vm_ops = {
-	.fault = filemap_fault,
+	.fault = gfs2_fault,
 	.map_pages = filemap_map_pages,
 	.page_mkwrite = gfs2_page_mkwrite,
 };
@@ -824,6 +845,9 @@ static ssize_t gfs2_file_direct_write(struct kiocb *iocb, struct iov_iter *from)
 
 static ssize_t gfs2_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
+	struct gfs2_inode *ip;
+	struct gfs2_holder gh;
+	size_t written = 0;
 	ssize_t ret;
 
 	if (iocb->ki_flags & IOCB_DIRECT) {
@@ -832,7 +856,31 @@ static ssize_t gfs2_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			return ret;
 		iocb->ki_flags &= ~IOCB_DIRECT;
 	}
-	return generic_file_read_iter(iocb, to);
+	iocb->ki_flags |= IOCB_NOIO;
+	ret = generic_file_read_iter(iocb, to);
+	iocb->ki_flags &= ~IOCB_NOIO;
+	if (ret >= 0) {
+		if (!iov_iter_count(to))
+			return ret;
+		written = ret;
+	} else {
+		if (ret != -EAGAIN)
+			return ret;
+		if (iocb->ki_flags & IOCB_NOWAIT)
+			return ret;
+	}
+	ip = GFS2_I(iocb->ki_filp->f_mapping->host);
+	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
+	ret = gfs2_glock_nq(&gh);
+	if (ret)
+		goto out_uninit;
+	ret = generic_file_read_iter(iocb, to);
+	if (ret > 0)
+		written += ret;
+	gfs2_glock_dq(&gh);
+out_uninit:
+	gfs2_holder_uninit(&gh);
+	return written ? written : ret;
 }
 
 /**
-- 
2.26.2


^ permalink raw reply	[flat|nested] 9+ messages in thread
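
The two-phase read in the patched gfs2_file_read_iter (a lockless
attempt with IOCB_NOIO, then a retry under the inode glock) can be
sketched in user space.  This is a simplified model under stated
assumptions, not the gfs2 implementation: struct model_file,
model_read() and model_file_read() are hypothetical stand-ins, the page
cache is a boolean array, the glock is a counter, and reads are
restricted to whole pages.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define IOCB_NOIO	(1 << 8)	/* value taken from patch 1/2 */
#define NR_PAGES	4		/* illustrative file size in pages */
#define PAGE_SZ		4096		/* illustrative page size */

/* Hypothetical stand-ins for the page cache and the inode glock. */
struct model_file {
	bool cached[NR_PAGES];	/* which pages are in the "page cache" */
	size_t pos;		/* current read position, page aligned */
	int glock_taken;	/* how often the "glock" was acquired */
};

/*
 * Stand-in for generic_file_read_iter(): copy whole pages starting at
 * f->pos.  With IOCB_NOIO, stop at the first uncached page; without it,
 * "read" the page from disk by marking it cached.
 */
static ssize_t model_read(struct model_file *f, int flags, size_t *count)
{
	ssize_t done = 0;

	while (*count >= PAGE_SZ && f->pos / PAGE_SZ < NR_PAGES) {
		size_t idx = f->pos / PAGE_SZ;

		if (!f->cached[idx]) {
			if (flags & IOCB_NOIO)
				return done ? done : -EAGAIN;
			f->cached[idx] = true;	/* simulated disk read */
		}
		f->pos += PAGE_SZ;
		*count -= PAGE_SZ;
		done += PAGE_SZ;
	}
	return done;
}

/* Mirrors the control flow of the patched gfs2_file_read_iter(). */
static ssize_t model_file_read(struct model_file *f, size_t count)
{
	size_t written = 0;
	ssize_t ret;

	ret = model_read(f, IOCB_NOIO, &count);	/* lockless attempt */
	if (ret >= 0) {
		if (!count)
			return ret;	/* fully served from the cache */
		written = ret;
	} else if (ret != -EAGAIN) {
		return ret;
	}
	f->glock_taken++;	/* gfs2_glock_nq() would go here */
	ret = model_read(f, 0, &count);
	if (ret > 0)
		written += ret;
	/* gfs2_glock_dq() would go here */
	return written ? (ssize_t)written : ret;
}
```

In this model a read that is fully cached never touches the lock
counter, and a partially cached read returns the sum of the bytes
copied in both phases, which is what the written accumulator in the
patch is for.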

* Re: [RFC v2 2/2] gfs2: Rework read and page fault locking
  2020-07-03  9:53 ` [RFC v2 2/2] gfs2: Rework read and page fault locking Andreas Gruenbacher
@ 2020-07-03 11:38   ` Matthew Wilcox
  2020-07-03 11:44     ` Matthew Wilcox
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2020-07-03 11:38 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Linus Torvalds, Dave Chinner, linux-fsdevel, linux-mm, linux-kernel

On Fri, Jul 03, 2020 at 11:53:25AM +0200, Andreas Gruenbacher wrote:
> So far, gfs2 has taken the inode glocks inside the ->readpage and
> ->readahead address space operations.  Since commit d4388340ae0b ("fs:
> convert mpage_readpages to mpage_readahead"), gfs2_readahead is passed
> the pages to read ahead locked.  With that, the current holder of the
> inode glock may be trying to lock one of those pages while
> gfs2_readahead is trying to take the inode glock, resulting in a
> deadlock.
> 
> Fix that by moving the lock taking to the higher-level ->read_iter file
> and ->fault vm operations.  This also gets rid of an ugly lock inversion
> workaround in gfs2_readpage.
> 
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>

> -/**
> - * __gfs2_readpage - readpage
> - * @file: The file to read a page for
> - * @page: The page to read
> - *
> - * This is the core of gfs2's readpage. It's used by the internal file
> - * reading code as in that case we already hold the glock. Also it's
> - * called by gfs2_readpage() once the required lock has been granted.
> - */
> -
>  static int __gfs2_readpage(void *file, struct page *page)

You could go a little further and rename this function to plain
gfs2_readpage().

gfs2_internal_read() should switch from read_cache_page() to
read_mapping_page().

>  {
>  	struct gfs2_inode *ip = GFS2_I(page->mapping->host);
>  	struct gfs2_sbd *sdp = GFS2_SB(page->mapping->host);
> -
>  	int error;
>  
>  	if (i_blocksize(page->mapping->host) == PAGE_SIZE &&
> @@ -505,36 +494,11 @@ static int __gfs2_readpage(void *file, struct page *page)
>   * gfs2_readpage - read a page of a file
>   * @file: The file to read
>   * @page: The page of the file
> - *
> - * This deals with the locking required. We have to unlock and
> - * relock the page in order to get the locking in the right
> - * order.
>   */

I'd drop the kernel-doc comments on method implementations entirely,
unless there's something useful to say ... which there isn't any more
(yay!)

> @@ -598,16 +562,9 @@ static void gfs2_readahead(struct readahead_control *rac)
>  {
>  	struct inode *inode = rac->mapping->host;
>  	struct gfs2_inode *ip = GFS2_I(inode);
> -	struct gfs2_holder gh;
>  
> -	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
> -	if (gfs2_glock_nq(&gh))
> -		goto out_uninit;
>  	if (!gfs2_is_stuffed(ip))
>  		mpage_readahead(rac, gfs2_block_map);

I think you probably want to make this:

	if (i_blocksize(page->mapping->host) == PAGE_SIZE &&
	    !page_has_buffers(page))
		error = iomap_readahead(rac, &gfs2_iomap_ops);
	else if (!gfs2_is_stuffed(ip))
		error = mpage_readahead(rac, gfs2_block_map);

... but I understand not wanting to make that change at this point
in the release cycle.

I'm happy for the patches to go in as-is, just wanted to point out these
improvements that could be made.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC v2 1/2] fs: Add IOCB_NOIO flag for generic_file_read_iter
  2020-07-03  9:53 ` [RFC v2 1/2] fs: Add IOCB_NOIO flag for generic_file_read_iter Andreas Gruenbacher
@ 2020-07-03 11:41   ` Matthew Wilcox
  2020-07-05 15:08     ` Andreas Gruenbacher
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2020-07-03 11:41 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Linus Torvalds, Dave Chinner, linux-fsdevel, linux-mm, linux-kernel

On Fri, Jul 03, 2020 at 11:53:24AM +0200, Andreas Gruenbacher wrote:
> Add an IOCB_NOIO flag that indicates to generic_file_read_iter that it
> shouldn't trigger any filesystem I/O for the actual request or for
> readahead.  This allows filesystems that support it to do tentative
> reads out of the page cache, and to take the appropriate locks and retry
> the reads only if the requested pages are not cached.
> 
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>

> @@ -2249,9 +2253,18 @@ EXPORT_SYMBOL_GPL(generic_file_buffered_read);
>   *
>   * This is the "read_iter()" routine for all filesystems
>   * that can use the page cache directly.
> + *
> + * The IOCB_NOWAIT flag in iocb->ki_flags indicates that -EAGAIN shall
> + * be returned when no data can be read without waiting for I/O requests
> + * to complete; it doesn't prevent readahead.
> + *
> + * The IOCB_NOIO flag in iocb->ki_flags indicates that -EAGAIN shall be
> + * returned when no data can be read without issuing new I/O requests,
> + * and 0 shall be returned when readahead would have been triggered.

s/shall/may/ -- if we read a previous page then hit a readahead page,
we'll return a positive value.  If the first page we hit is a readahead
page, then yes, we'll return zero.

Again, I'm happy for the patch to go in as-is without this nitpick.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC v2 2/2] gfs2: Rework read and page fault locking
  2020-07-03 11:38   ` Matthew Wilcox
@ 2020-07-03 11:44     ` Matthew Wilcox
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Wilcox @ 2020-07-03 11:44 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Linus Torvalds, Dave Chinner, linux-fsdevel, linux-mm, linux-kernel

On Fri, Jul 03, 2020 at 12:38:01PM +0100, Matthew Wilcox wrote:
> > @@ -598,16 +562,9 @@ static void gfs2_readahead(struct readahead_control *rac)
> >  {
> >  	struct inode *inode = rac->mapping->host;
> >  	struct gfs2_inode *ip = GFS2_I(inode);
> > -	struct gfs2_holder gh;
> >  
> > -	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
> > -	if (gfs2_glock_nq(&gh))
> > -		goto out_uninit;
> >  	if (!gfs2_is_stuffed(ip))
> >  		mpage_readahead(rac, gfs2_block_map);
> 
> I think you probably want to make this:
> 
> 	if (i_blocksize(page->mapping->host) == PAGE_SIZE &&
> 	    !page_has_buffers(page))
> 		error = iomap_readahead(rac, &gfs2_iomap_ops);
> 	else if (!gfs2_is_stuffed(ip))
> 		error = mpage_readahead(rac, gfs2_block_map);
> 
> ... but I understand not wanting to make that change at this point
> in the release cycle.

That was stupid.  I meant to write out:

	if (i_blocksize(rac->mapping->host) == PAGE_SIZE)
		iomap_readahead(rac, &gfs2_iomap_ops);
	else if (!gfs2_is_stuffed(ip))
		mpage_readahead(rac, gfs2_block_map);

Since the pages are freshly allocated, they can't have buffers, and
the mapping comes out of the readahead_control, not from the page.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC v2 0/2] Fix gfs2 readahead deadlocks
  2020-07-03  9:53 [RFC v2 0/2] Fix gfs2 readahead deadlocks Andreas Gruenbacher
  2020-07-03  9:53 ` [RFC v2 1/2] fs: Add IOCB_NOIO flag for generic_file_read_iter Andreas Gruenbacher
  2020-07-03  9:53 ` [RFC v2 2/2] gfs2: Rework read and page fault locking Andreas Gruenbacher
@ 2020-07-03 19:24 ` Linus Torvalds
  2020-07-03 19:26   ` Linus Torvalds
  2 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2020-07-03 19:24 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Matthew Wilcox, Dave Chinner, linux-fsdevel, Linux-MM,
	Linux Kernel Mailing List

On Fri, Jul 3, 2020 at 2:53 AM Andreas Gruenbacher <agruenba@redhat.com> wrote:
>
> Here's an improved version.  If the IOCB_NOIO flag can be added right
> away, we can just fix the locking in gfs2.

I see nothing wrong with this, and would be ok with getting the
patches as pulls from the gfs2 tree despite touching generic code.

Maybe wait a bit for others to comment (I see Willy already did), but
it seems like a fairly straightforward improvement, and the IOCB_NOIO
flag conceptually seems to match well with the IOCB_NOWAIT one, so
this all makes sense to me.

              Linus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC v2 0/2] Fix gfs2 readahead deadlocks
  2020-07-03 19:24 ` [RFC v2 0/2] Fix gfs2 readahead deadlocks Linus Torvalds
@ 2020-07-03 19:26   ` Linus Torvalds
  0 siblings, 0 replies; 9+ messages in thread
From: Linus Torvalds @ 2020-07-03 19:26 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Matthew Wilcox, Dave Chinner, linux-fsdevel, Linux-MM,
	Linux Kernel Mailing List

On Fri, Jul 3, 2020 at 12:24 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I see nothing wrong with this [..]

Again, I didn't actually look at the gfs2 parts, but I assume you've
done all the testing of the deadlocks etc.

The IOCB_NOIO patch you can add my acked-by to, fwiw.

          Linus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC v2 1/2] fs: Add IOCB_NOIO flag for generic_file_read_iter
  2020-07-03 11:41   ` Matthew Wilcox
@ 2020-07-05 15:08     ` Andreas Gruenbacher
  0 siblings, 0 replies; 9+ messages in thread
From: Andreas Gruenbacher @ 2020-07-05 15:08 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linus Torvalds, Dave Chinner, linux-fsdevel, Linux-MM, LKML

On Fri, Jul 3, 2020 at 1:41 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Fri, Jul 03, 2020 at 11:53:24AM +0200, Andreas Gruenbacher wrote:
> > Add an IOCB_NOIO flag that indicates to generic_file_read_iter that it
> > shouldn't trigger any filesystem I/O for the actual request or for
> > readahead.  This allows filesystems that support it to do tentative
> > reads out of the page cache, and to take the appropriate locks and retry
> > the reads only if the requested pages are not cached.
> >
> > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
>
> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>
> > @@ -2249,9 +2253,18 @@ EXPORT_SYMBOL_GPL(generic_file_buffered_read);
> >   *
> >   * This is the "read_iter()" routine for all filesystems
> >   * that can use the page cache directly.
> > + *
> > + * The IOCB_NOWAIT flag in iocb->ki_flags indicates that -EAGAIN shall
> > + * be returned when no data can be read without waiting for I/O requests
> > + * to complete; it doesn't prevent readahead.
> > + *
> > + * The IOCB_NOIO flag in iocb->ki_flags indicates that -EAGAIN shall be
> > + * returned when no data can be read without issuing new I/O requests,
> > + * and 0 shall be returned when readahead would have been triggered.
>
> s/shall/may/ -- if we read a previous page then hit a readahead page,
> we'll return a positive value.  If the first page we hit is a readahead
> page, then yes, we'll return zero.

How about this?

 * The IOCB_NOIO flag in iocb->ki_flags indicates that no new I/O
 * requests shall be made for the read or for readahead.  When no data
 * can be read, -EAGAIN shall be returned.  When readahead would be
 * triggered, a short read (possibly of length 0) shall be returned.

> Again, I'm happy for the patch to go in as-is without this nitpick.

Thanks,
Andreas


^ permalink raw reply	[flat|nested] 9+ messages in thread
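
The return values discussed in this subthread (-EAGAIN, 0, or a
positive short read) can be illustrated with a small user-space model
of an IOCB_NOIO read over a run of pages.  This is a sketch, not the
kernel loop: enum page_state and noio_read_result() are hypothetical
names, PAGE_SZ is an illustrative page size, and only whole pages are
counted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PAGE_SZ 4096	/* illustrative page size */

enum page_state {
	PAGE_MISSING,	/* not in the page cache */
	PAGE_CACHED,	/* cached, no readahead marker */
	PAGE_CACHED_RA,	/* cached, carries the readahead marker */
};

/* Result of an IOCB_NOIO read over pages[0..nr-1] in this model. */
static long noio_read_result(const enum page_state *pages, size_t nr)
{
	long written = 0;
	long error = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (pages[i] == PAGE_MISSING) {
			error = -EAGAIN;	/* the would_block path */
			break;
		}
		if (pages[i] == PAGE_CACHED_RA)
			break;	/* stop without an error */
		written += PAGE_SZ;
	}
	/* Data already copied takes precedence over the error code. */
	return written ? written : error;
}
```

A cached page followed by a readahead-marked page yields a positive
short read, a readahead-marked first page yields 0, and a missing first
page yields -EAGAIN, consistent with the reworded comment proposed
above.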

end of thread

Thread overview: 9+ messages
2020-07-03  9:53 [RFC v2 0/2] Fix gfs2 readahead deadlocks Andreas Gruenbacher
2020-07-03  9:53 ` [RFC v2 1/2] fs: Add IOCB_NOIO flag for generic_file_read_iter Andreas Gruenbacher
2020-07-03 11:41   ` Matthew Wilcox
2020-07-05 15:08     ` Andreas Gruenbacher
2020-07-03  9:53 ` [RFC v2 2/2] gfs2: Rework read and page fault locking Andreas Gruenbacher
2020-07-03 11:38   ` Matthew Wilcox
2020-07-03 11:44     ` Matthew Wilcox
2020-07-03 19:24 ` [RFC v2 0/2] Fix gfs2 readahead deadlocks Linus Torvalds
2020-07-03 19:26   ` Linus Torvalds
