Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v2] Resolve LRU page-pinning issue for file-backed pages
@ 2021-01-05  7:05 Chris Goldsworthy
  2021-01-05  7:05 ` [PATCH v2] fs/buffer.c: Revoke LRU when trying to drop buffers Chris Goldsworthy
  0 siblings, 1 reply; 2+ messages in thread
From: Chris Goldsworthy @ 2021-01-05  7:05 UTC (permalink / raw)
  To: Alexander Viro, linux-fsdevel, Matthew Wilcox, linux-mm, linux-kernel
  Cc: Chris Goldsworthy

It is possible for file-backed pages to end up in a contiguous memory area
(CMA), such that the relevant page must be migrated using the .migratepage()
callback when its backing physical memory is selected for use in an CMA
allocation (through cma_alloc()).  However, if a set of address space
operations (AOPs) for a file-backed page lacks a migratepage() page call-back,
fallback_migrate_page() will be used instead, which through
try_to_release_page() calls try_to_free_buffers() (which is called directly or
through a try_to_free_buffers() callback.  try_to_free_buffers() in turn calls
drop_buffers()

drop_buffers() itself can fail due to the buffer_head associated with a page
being busy. However, it is possible that the buffer_head is on an LRU list for
a CPU, such that we can try removing the buffer_head from that list, in order
to successfully release the page.  Do this.

v1: https://lore.kernel.org/lkml/cover.1606194703.git.cgoldswo@codeaurora.org/T/#m3a44b5745054206665455625ccaf27379df8a190
Original version of the patch (with updates to make to account for changes in
on_each_cpu_cond()).

v2: Follow Matthew Wilcox's suggestion of reducing the number of calls to
on_each_cpu_cond(), by iterating over a page's busy buffer_heads inside of
on_each_cpu_cond(). To copy from his e-mail, we go from:

for_each_buffer
	for_each_cpu
		for_each_lru_entry

to:

for_each_cpu
	for_each_buffer
		for_each_lru_entry

This is done using xarrays, which I found to be the cleanest data structure to
use, though a pre-allocated array of page_size(page) / bh->b_size elements might
be more performant.

Laura Abbott (1):
  fs/buffer.c: Revoke LRU when trying to drop buffers

 fs/buffer.c   | 85 +++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 fs/internal.h |  5 ++++
 2 files changed, 85 insertions(+), 5 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


^ permalink raw reply	[flat|nested] 2+ messages in thread

* [PATCH v2] fs/buffer.c: Revoke LRU when trying to drop buffers
  2021-01-05  7:05 [PATCH v2] Resolve LRU page-pinning issue for file-backed pages Chris Goldsworthy
@ 2021-01-05  7:05 ` Chris Goldsworthy
  0 siblings, 0 replies; 2+ messages in thread
From: Chris Goldsworthy @ 2021-01-05  7:05 UTC (permalink / raw)
  To: Alexander Viro, linux-fsdevel, Matthew Wilcox, linux-mm, linux-kernel
  Cc: Laura Abbott, Chris Goldsworthy

From: Laura Abbott <lauraa@codeaurora.org>

When a buffer is added to the LRU list, a reference is taken which is
not dropped until the buffer is evicted from the LRU list. This is the
correct behavior, however this LRU reference will prevent the buffer
from being dropped. This means that the buffer can't actually be dropped
until it is selected for eviction. There's no bound on the time spent
on the LRU list, which means that the buffer may be undroppable for
very long periods of time. Given that migration involves dropping
buffers, the associated page is now unmigratible for long periods of
time as well. CMA relies on being able to migrate a specific range
of pages, so these types of failures make CMA significantly
less reliable, especially under high filesystem usage.

Rather than waiting for the LRU algorithm to eventually kick out
the buffer, explicitly remove the buffer from the LRU list when trying
to drop it. There is still the possibility that the buffer
could be added back on the list, but that indicates the buffer is
still in use and would probably have other 'in use' indicates to
prevent dropping.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Matthew Wilcox <willy@infradead.org>
---
 fs/buffer.c   | 85 +++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 fs/internal.h |  5 ++++
 2 files changed, 85 insertions(+), 5 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 96c7604..536fb5b 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -48,6 +48,7 @@
 #include <linux/sched/mm.h>
 #include <trace/events/block.h>
 #include <linux/fscrypt.h>
+#include <linux/xarray.h>
 
 #include "internal.h"
 
@@ -1471,12 +1472,63 @@ static bool has_bh_in_lru(int cpu, void *dummy)
 	return false;
 }
 
+static void __evict_bhs_lru(void *arg)
+{
+	struct bh_lru *b = &get_cpu_var(bh_lrus);
+	struct busy_bhs_container *busy_bhs = arg;
+	struct buffer_head *bh;
+	int i;
+
+	XA_STATE(xas, &busy_bhs->xarray, 0);
+
+	xas_for_each(&xas, bh, busy_bhs->size) {
+		for (i = 0; i < BH_LRU_SIZE; i++) {
+			if (b->bhs[i] == bh) {
+				brelse(b->bhs[i]);
+				b->bhs[i] = NULL;
+				break;
+			}
+		}
+
+		bh = bh->b_this_page;
+	}
+
+	put_cpu_var(bh_lrus);
+}
+
+static bool page_has_bhs_in_lru(int cpu, void *arg)
+{
+	struct bh_lru *b = per_cpu_ptr(&bh_lrus, cpu);
+	struct busy_bhs_container *busy_bhs = arg;
+	struct buffer_head *bh;
+	int i;
+
+	XA_STATE(xas, &busy_bhs->xarray, 0);
+
+	xas_for_each(&xas, bh, busy_bhs->size) {
+		for (i = 0; i < BH_LRU_SIZE; i++) {
+			if (b->bhs[i] == bh)
+				return true;
+		}
+
+		bh = bh->b_this_page;
+	}
+
+	return false;
+
+}
 void invalidate_bh_lrus(void)
 {
 	on_each_cpu_cond(has_bh_in_lru, invalidate_bh_lru, NULL, 1);
 }
 EXPORT_SYMBOL_GPL(invalidate_bh_lrus);
 
+static void evict_bh_lrus(struct busy_bhs_container *busy_bhs)
+{
+	on_each_cpu_cond(page_has_bhs_in_lru, __evict_bhs_lru,
+			 busy_bhs, 1);
+}
+
 void set_bh_page(struct buffer_head *bh,
 		struct page *page, unsigned long offset)
 {
@@ -3242,14 +3294,36 @@ drop_buffers(struct page *page, struct buffer_head **buffers_to_free)
 {
 	struct buffer_head *head = page_buffers(page);
 	struct buffer_head *bh;
+	struct busy_bhs_container busy_bhs;
+	int xa_ret, ret = 0;
+
+	xa_init(&busy_bhs.xarray);
+	busy_bhs.size = 0;
 
 	bh = head;
 	do {
-		if (buffer_busy(bh))
-			goto failed;
+		if (buffer_busy(bh)) {
+			xa_ret = xa_err(xa_store(&busy_bhs.xarray, busy_bhs.size++,
+						 bh, GFP_ATOMIC));
+			if (xa_ret)
+				goto out;
+		}
 		bh = bh->b_this_page;
 	} while (bh != head);
 
+	if (busy_bhs.size) {
+		/*
+		 * Check if the busy failure was due to an outstanding
+		 * LRU reference
+		 */
+		evict_bh_lrus(&busy_bhs);
+		do {
+			if (buffer_busy(bh))
+				goto out;
+		} while (bh != head);
+	}
+
+	ret = 1;
 	do {
 		struct buffer_head *next = bh->b_this_page;
 
@@ -3259,9 +3333,10 @@ drop_buffers(struct page *page, struct buffer_head **buffers_to_free)
 	} while (bh != head);
 	*buffers_to_free = head;
 	detach_page_private(page);
-	return 1;
-failed:
-	return 0;
+out:
+	xa_destroy(&busy_bhs.xarray);
+
+	return ret;
 }
 
 int try_to_free_buffers(struct page *page)
diff --git a/fs/internal.h b/fs/internal.h
index 77c50be..00f17c4 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -15,6 +15,7 @@ struct mount;
 struct shrink_control;
 struct fs_context;
 struct user_namespace;
+struct xarray;
 
 /*
  * block_dev.c
@@ -49,6 +50,10 @@ static inline int emergency_thaw_bdev(struct super_block *sb)
  */
 extern int __block_write_begin_int(struct page *page, loff_t pos, unsigned len,
 		get_block_t *get_block, struct iomap *iomap);
+struct busy_bhs_container {
+	struct xarray xarray;
+	int size;
+};
 
 /*
  * char_dev.c
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-01-05  7:07 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-05  7:05 [PATCH v2] Resolve LRU page-pinning issue for file-backed pages Chris Goldsworthy
2021-01-05  7:05 ` [PATCH v2] fs/buffer.c: Revoke LRU when trying to drop buffers Chris Goldsworthy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).