LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Naoya Horiguchi <nao.horiguchi@gmail.com>,
	Oscar Salvador <osalvador@suse.de>,
	Muchun Song <songmuchun@bytedance.com>,
	linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Tony Luck <tony.luck@intel.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 1/2] mm,hwpoison: fix race with hugetlb page allocation
Date: Fri, 4 Jun 2021 16:55:40 -0700	[thread overview]
Message-ID: <69922ccb-9ab9-33a6-4aa6-ea47431d60a7@oracle.com> (raw)
In-Reply-To: <20210603233632.2964832-2-nao.horiguchi@gmail.com>

On 6/3/21 4:36 PM, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> 
> When hugetlb page fault (under overcommitting situation) and
> memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following race:
> 
>     CPU0:                           CPU1:
> 
>                                     gather_surplus_pages()
>                                       page = alloc_surplus_huge_page()
>     memory_failure_hugetlb()
>       get_hwpoison_page(page)
>         __get_hwpoison_page(page)
>           get_page_unless_zero(page)
>                                       zero = put_page_testzero(page)
>                                       VM_BUG_ON_PAGE(!zero, page)
>                                       enqueue_huge_page(h, page)
>       put_page(page)
> 
> __get_hwpoison_page() only checks the page refcount before taking an
> additional one for memory error handling, which is not enough because
> there's a time window where compound pages have non-zero refcount during
> hugetlb page initialization.
> 
> So make __get_hwpoison_page() check page status a bit more for hugetlb
> pages with get_hwpoison_huge_page(). Checking hugetlb-specific flags
> under hugetlb_lock makes sure that the hugetlb page is not transitive.
> It's notable that another new function, HWPoisonHandlable(), is helpful
> to prevent a race against other transitive page states (like a generic
> compound page just before PageHuge becomes true).

Thanks!

I believe this is good enough to prevent the race.  Since access to the
page is not synchronized in any way, it would still be "theoretically
possible" to race.  For example,

    CPU0:                           CPU1:

    HWPoisonHandlable true
    				    page freed to buddy
				    page allocated from buddy
				    page added to hugetlb pool
				    alloc_surplus_huge_page

But, this is very Very VERY unlikely to ever happen.

> 
> Fixes: ead07f6a867b ("mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling")
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Reported-by: Muchun Song <songmuchun@bytedance.com>
> Cc: stable@vger.kernel.org # 5.12+
> ---
> ChangeLog v6:
> - defined HWPoisonHandlable(),
> - updated comment and patch description.
> ---
>  include/linux/hugetlb.h |  6 ++++++
>  mm/hugetlb.c            | 15 +++++++++++++++
>  mm/memory-failure.c     | 29 +++++++++++++++++++++++++++--
>  3 files changed, 48 insertions(+), 2 deletions(-)

Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
-- 
Mike Kravetz

  reply	other threads:[~2021-06-04 23:56 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-03 23:36 [PATCH v6 0/2] hwpoison: " Naoya Horiguchi
2021-06-03 23:36 ` [PATCH v6 1/2] mm,hwpoison: " Naoya Horiguchi
2021-06-04 23:55   ` Mike Kravetz [this message]
2021-08-12  4:28   ` Luck, Tony
2021-08-12  9:03     ` HORIGUCHI NAOYA(堀口 直也)
2021-08-12 15:25       ` Luck, Tony
2021-08-13  6:29         ` HORIGUCHI NAOYA(堀口 直也)
2021-08-13 15:07           ` Luck, Tony
2021-08-16 17:12             ` Naoya Horiguchi
2021-08-16 17:56               ` Luck, Tony
2021-08-17  5:40                 ` HORIGUCHI NAOYA(堀口 直也)
2021-06-03 23:36 ` [PATCH v6 2/2] mm,hwpoison: make get_hwpoison_page() call get_any_page() Naoya Horiguchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=69922ccb-9ab9-33a6-4aa6-ea47431d60a7@oracle.com \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=nao.horiguchi@gmail.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=osalvador@suse.de \
    --cc=songmuchun@bytedance.com \
    --cc=tony.luck@intel.com \
    --subject='Re: [PATCH v6 1/2] mm,hwpoison: fix race with hugetlb page allocation' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).