From: Mel Gorman <mgorman@techsingularity.net>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>,
Zhang Qiang <Qiang.Zhang@windriver.com>,
Yanfei Xu <yanfei.xu@windriver.com>,
Chuck Lever <chuck.lever@oracle.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Matteo Croce <mcroce@microsoft.com>,
Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Mel Gorman <mgorman@techsingularity.net>
Subject: [PATCH 1/4] mm/page_alloc: Avoid page allocator recursion with pagesets.lock held
Date: Tue, 13 Jul 2021 16:20:57 +0100
Message-ID: <20210713152100.10381-2-mgorman@techsingularity.net>
In-Reply-To: <20210713152100.10381-1-mgorman@techsingularity.net>
Syzbot is reporting potential deadlocks due to pagesets.lock when
PAGE_OWNER is enabled. One example from Desmond Cheong Zhi Xi is
as follows:
  __alloc_pages_bulk()
    local_lock_irqsave(&pagesets.lock, flags) <---- outer lock here
    prep_new_page():
      post_alloc_hook():
        set_page_owner():
          __set_page_owner():
            save_stack():
              stack_depot_save():
                alloc_pages():
                  alloc_page_interleave():
                    __alloc_pages():
                      get_page_from_freelist():
                        rmqueue():
                          rmqueue_pcplist():
                            local_lock_irqsave(&pagesets.lock, flags);
                            *** DEADLOCK ***
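
The recursion in the call chain above can be modelled in userspace. The
following is only a sketch under stated assumptions: struct demo_lock and
every demo_* function are hypothetical stand-ins, not kernel APIs, and the
model returns -EDEADLK at the point where the real lock would be taken a
second time in the same context.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a non-recursive lock such as pagesets.lock. */
struct demo_lock {
	bool held;
};

static struct demo_lock pcp_lock;	/* stands in for pagesets.lock */
static bool owner_tracking;		/* stands in for page_owner_inited */

/* Returns 0 on success, -EDEADLK where a re-acquisition would occur. */
static int demo_lock_acquire(struct demo_lock *l)
{
	if (l->held)
		return -EDEADLK;
	l->held = true;
	return 0;
}

static void demo_lock_release(struct demo_lock *l)
{
	l->held = false;
}

/* Models rmqueue_pcplist(): the single-page path takes the lock itself. */
static int demo_alloc_one(void)
{
	int err = demo_lock_acquire(&pcp_lock);

	if (err)
		return err;
	demo_lock_release(&pcp_lock);
	return 0;
}

/* Models prep_new_page(): with tracking on, saving the stack allocates. */
static int demo_prep_page(void)
{
	return owner_tracking ? demo_alloc_one() : 0;
}

/* Models the bulk path: preps a page while still holding the lock. */
static int demo_bulk_alloc(bool tracking)
{
	int err;

	owner_tracking = tracking;
	err = demo_lock_acquire(&pcp_lock);
	if (err)
		return err;
	err = demo_prep_page();
	demo_lock_release(&pcp_lock);
	return err;
}
```

In this model, demo_bulk_alloc(false) succeeds while demo_bulk_alloc(true)
reports the nested acquisition, which mirrors why the problem only shows
up when PAGE_OWNER is enabled.
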
Zhang, Qiang also reported:
BUG: sleeping function called from invalid context at mm/page_alloc.c:5179
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
.....
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:96
___might_sleep.cold+0x1f1/0x237 kernel/sched/core.c:9153
prepare_alloc_pages+0x3da/0x580 mm/page_alloc.c:5179
__alloc_pages+0x12f/0x500 mm/page_alloc.c:5375
alloc_page_interleave+0x1e/0x200 mm/mempolicy.c:2147
alloc_pages+0x238/0x2a0 mm/mempolicy.c:2270
stack_depot_save+0x39d/0x4e0 lib/stackdepot.c:303
save_stack+0x15e/0x1e0 mm/page_owner.c:120
__set_page_owner+0x50/0x290 mm/page_owner.c:181
prep_new_page mm/page_alloc.c:2445 [inline]
__alloc_pages_bulk+0x8b9/0x1870 mm/page_alloc.c:5313
alloc_pages_bulk_array_node include/linux/gfp.h:557 [inline]
vm_area_alloc_pages mm/vmalloc.c:2775 [inline]
__vmalloc_area_node mm/vmalloc.c:2845 [inline]
__vmalloc_node_range+0x39d/0x960 mm/vmalloc.c:2947
__vmalloc_node mm/vmalloc.c:2996 [inline]
vzalloc+0x67/0x80 mm/vmalloc.c:3066
There are a number of ways it could be fixed. The page owner code could
be audited to strip GFP flags that allow sleeping, but that would impair
the functionality of PAGE_OWNER if allocations fail. The bulk allocator
could add a special case to release/reacquire the lock for prep_new_page
and look up the PCP after the lock is reacquired, at the cost of
performance. The pages requiring prep could be tracked using the least
significant bit and looping through the array, although that is more
complicated for the list interface. The options are relatively complex
and the second one still incurs a performance penalty when PAGE_OWNER is
active, so this patch takes the simple approach -- disable bulk allocation
if PAGE_OWNER is active. The caller will be forced to allocate one page at
a time, incurring a performance penalty, but PAGE_OWNER already carries a
performance penalty.
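
The effect of the simple approach on a caller can be sketched in userspace
as follows. This is an illustration under assumptions, not the kernel
code: owner_tracking stands in for page_owner_inited, malloc() stands in
for the single-page allocator, and all demo_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static bool owner_tracking;	/* hypothetical stand-in for page_owner_inited */

/* Models a single-page allocation (the fallback path). */
static void *demo_alloc_page(void)
{
	return malloc(4096);
}

/*
 * Models a bulk allocator with an array interface: fills empty slots
 * in pages[] and returns the number of populated slots.
 */
static size_t demo_alloc_bulk(size_t nr_pages, void **pages)
{
	size_t nr_populated = 0;

	/* Skip slots that an earlier call already populated. */
	while (nr_populated < nr_pages && pages[nr_populated])
		nr_populated++;
	if (nr_populated == nr_pages)
		return nr_populated;

	if (owner_tracking) {
		/* Tracking active: allocate exactly one page and return. */
		pages[nr_populated] = demo_alloc_page();
		if (pages[nr_populated])
			nr_populated++;
		return nr_populated;
	}

	/* Tracking inactive: fill every remaining slot in one call. */
	while (nr_populated < nr_pages) {
		void *page = demo_alloc_page();

		if (!page)
			break;
		pages[nr_populated++] = page;
	}
	return nr_populated;
}

/* Caller loop: retry until the array is full or no progress is made. */
static size_t demo_fill(size_t nr_pages, void **pages, bool tracking)
{
	size_t calls = 0, done = 0;

	owner_tracking = tracking;
	while (done < nr_pages) {
		size_t now = demo_alloc_bulk(nr_pages, pages);

		calls++;
		if (now == done)
			break;	/* allocation failed, give up */
		done = now;
	}
	return calls;	/* number of allocator calls that were needed */
}
```

With tracking off, one call fills the whole array; with tracking on, the
caller needs one call per page, which is the performance penalty the
paragraph above accepts.
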
Fixes: dbbee9d5cd83 ("mm/page_alloc: convert per-cpu list protection to local_lock")
Reported-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reported-by: "Zhang, Qiang" <Qiang.Zhang@windriver.com>
Reported-and-tested-by: syzbot+127fd7828d6eeb611703@syzkaller.appspotmail.com
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Rafael Aquini <aquini@redhat.com>
---
mm/page_alloc.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3b97e17806be..6ef86f338151 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5239,6 +5239,18 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	if (nr_pages - nr_populated == 1)
 		goto failed;
 
+#ifdef CONFIG_PAGE_OWNER
+	/*
+	 * PAGE_OWNER may recurse into the allocator to allocate space to
+	 * save the stack with pagesets.lock held. Releasing/reacquiring
+	 * removes much of the performance benefit of bulk allocation so
+	 * force the caller to allocate one page at a time as it'll have
+	 * similar performance to added complexity to the bulk allocator.
+	 */
+	if (static_branch_unlikely(&page_owner_inited))
+		goto failed;
+#endif
+
 	/* May set ALLOC_NOFRAGMENT, fragmentation will return 1 page. */
 	gfp &= gfp_allowed_mask;
 	alloc_gfp = gfp;
--
2.26.2