LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>
Subject: Re: [mmotm] BUG: Bad page state in process khugepaged ?
Date: Fri, 11 Feb 2011 11:58:58 -0800 (PST) [thread overview]
Message-ID: <alpine.LSU.2.00.1102111132560.3814@sister.anvils> (raw)
In-Reply-To: <20110211104906.GE3347@random.random>
On Fri, 11 Feb 2011, Andrea Arcangeli wrote:
> On Thu, Feb 10, 2011 at 11:02:50PM -0800, Hugh Dickins wrote:
> > There is a separate little issue here, Andrea.
> >
> > Although we went to some trouble for bad_page() to take the page out
> > of circulation yet let the system continue, your VM_BUG_ON(!PageBuddy)
> > inside __ClearPageBuddy(page), from two callsites in bad_page(), is
> > turning it into a fatal error when CONFIG_DEBUG_VM.
>
> I see what you mean. Of course it is only a problem after bad_page
> already triggered.... but then it trigger an BUG_ON instead of only a
> bad_page.
>
> > You could that only MM developers switch CONFIG_DEBUG_VM=y, and they
> > would like bad_page() to be fatal; maybe, but if so we should do that
> > as an intentional patch, rather than as an unexpected side-effect ;)
>
> Fedora kernels are built with CONFIG_DEBUG_VM, all my kernels runs
> with CONFIG_DEBUG_VM too, so we want it to be as "production" as
> possible, and we don't want DEBUG_VM to decrease any reliability (only
> to increase it of course).
Oh, I hadn't realized Fedora use it. I wonder if that's wise, I thought
Nick introduced it partly for the more expensive checks, and there might
be one or two of those around - those bad_range()s in page_alloc.c?
>
> > I noticed this a few days ago, but hadn't quite decided whether just to
> > remove the VM_BUG_ON, or move it to __ClearPageBuddy's third callsite,
> > or... doesn't matter much.
> >
> > I do also wonder if PageBuddy would better be _mapcount -something else:
> > if we've got a miscounted page (itself unlikely of course), there's a
> > chance that its _mapcount will be further decremented after it has been
> > freed: whereupon it will go from -1 to -2, PageBuddy at present. The
> > special avoidance of PageBuddy being that it can pull a whole block of
> > pages into misuse if its mistaken.
>
> Agreed. What about the below?
>
> =====
> Subject: mm: PageBuddy cleanups
>
> From: Andrea Arcangeli <aarcange@redhat.com>
>
> bad_page could VM_BUG_ON(!PageBuddy(page)) inside __ClearPageBuddy().
> I prefer to keep the VM_BUG_ON for safety and to add a if to solve it.
Too much iffery: I ended up preferring it in rmv_page_order() myself.
>
> Change the _mapcount value indicating PageBuddy from -2 to -1024 for more
> robusteness against page_mapcount() undeflows.
But the patch actually says -1024*1024: either would do.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> Reported-by: Hugh Dickins <hughd@google.com>
> ---
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index f6385fc..fa16ba0 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -402,16 +402,22 @@ static inline void init_page_count(struct page *page)
> /*
> * PageBuddy() indicate that the page is free and in the buddy system
> * (see mm/page_alloc.c).
> + *
> + * PAGE_BUDDY_MAPCOUNT_VALUE must be <= -2 but better not too close to
> + * -2 so that an underflow of the page_mapcount() won't be mistaken
> + * for a genuine PAGE_BUDDY_MAPCOUNT_VALUE.
Yes, good to comment that, thanks.
> */
> +#define PAGE_BUDDY_MAPCOUNT_VALUE (-1024*1024)
> +
> static inline int PageBuddy(struct page *page)
> {
> - return atomic_read(&page->_mapcount) == -2;
> + return atomic_read(&page->_mapcount) == PAGE_BUDDY_MAPCOUNT_VALUE;
> }
>
> static inline void __SetPageBuddy(struct page *page)
> {
> VM_BUG_ON(atomic_read(&page->_mapcount) != -1);
> - atomic_set(&page->_mapcount, -2);
> + atomic_set(&page->_mapcount, PAGE_BUDDY_MAPCOUNT_VALUE);
> }
>
> static inline void __ClearPageBuddy(struct page *page)
Yes, that's fine, 0xfff00000 looks unlikely enough (and my
imagination for "deadbeef"-like magic is too drowsy today).
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a873e61..8aac134 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -286,7 +286,9 @@ static void bad_page(struct page *page)
>
> /* Don't complain about poisoned pages */
> if (PageHWPoison(page)) {
> - __ClearPageBuddy(page);
> + /* __ClearPageBuddy VM_BUG_ON(!PageBuddy(page)) */
> + if (PageBuddy(page))
> + __ClearPageBuddy(page);
> return;
> }
>
> @@ -317,7 +319,8 @@ static void bad_page(struct page *page)
> dump_stack();
> out:
> /* Leave bad fields for debug, except PageBuddy could make trouble */
> - __ClearPageBuddy(page);
> + if (PageBuddy(page)) /* __ClearPageBuddy VM_BUG_ON(!PageBuddy(page)) */
> + __ClearPageBuddy(page);
> add_taint(TAINT_BAD_PAGE);
> }
>
Okay I suppose: it seems rather laboured to me, I think I'd have just
moved the VM_BUG_ON into rmv_page_order() if I'd done the patch; but
since I was too lazy to do it, I'd better be grateful for yours!
Hugh
next prev parent reply other threads:[~2011-02-11 19:59 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-02-09 6:10 KAMEZAWA Hiroyuki
2011-02-09 6:40 ` KAMEZAWA Hiroyuki
2011-02-09 6:50 ` Daisuke Nishimura
2011-02-09 6:52 ` KAMEZAWA Hiroyuki
2011-02-09 20:07 ` Andrea Arcangeli
2011-02-11 7:02 ` Hugh Dickins
2011-02-11 10:49 ` Andrea Arcangeli
2011-02-11 19:58 ` Hugh Dickins [this message]
2011-02-11 20:24 ` Andrea Arcangeli
2011-02-14 22:24 ` Johannes Weiner
2011-02-09 7:23 ` [PATCH][BUGFIX] memcg: fix leak of accounting at failure path of hugepage collapsing KAMEZAWA Hiroyuki
2011-02-09 7:51 ` Daisuke Nishimura
2011-02-09 9:51 ` Johannes Weiner
2011-02-10 2:49 ` Minchan Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.00.1102111132560.3814@sister.anvils \
--to=hughd@google.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nishimura@mxp.nes.nec.co.jp \
--subject='Re: [mmotm] BUG: Bad page state in process khugepaged ?' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).