From: Muchun Song <songmuchun@bytedance.com>
To: mike.kravetz@oracle.com, akpm@linux-foundation.org, osalvador@suse.de,
    mhocko@suse.com, song.bao.hua@hisilicon.com, david@redhat.com,
    chenhuang5@huawei.com, bodeddub@amazon.com, corbet@lwn.net,
    willy@infradead.org
Cc: duanxiongchun@bytedance.com, fam.zheng@bytedance.com, smuchun@gmail.com,
    zhengqi.arch@bytedance.com, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Muchun Song <songmuchun@bytedance.com>
Subject: [PATCH RESEND v2 0/4] Free the 2nd vmemmap page associated with each HugeTLB page
Date: Fri, 17 Sep 2021 11:48:11 +0800
Message-ID: <20210917034815.80264-1-songmuchun@bytedance.com>

Hi,

This series significantly reduces the struct page overhead of 2MB HugeTLB
pages. I'd like to get some review input. Thanks.

After the feature of "Free some vmemmap pages of HugeTLB page" is enabled,
the mapping of the vmemmap addresses associated with a 2MB HugeTLB page
looks like the figure below.

 HugeTLB                    struct pages(8 pages)          page frame(8 pages)
 +-----------+ ---virt_to_page---> +-----------+   mapping to   +-----------+---> PG_head
 |           |                     |     0     | -------------> |     0     |
 |           |                     +-----------+                +-----------+
 |           |                     |     1     | -------------> |     1     |
 |           |                     +-----------+                +-----------+
 |           |                     |     2     | ----------------^ ^ ^ ^ ^ ^
 |           |                     +-----------+                   | | | | |
 |           |                     |     3     | ------------------+ | | | |
 |           |                     +-----------+                     | | | |
 |           |                     |     4     | --------------------+ | | |
 |    2MB    |                     +-----------+                       | | |
 |           |                     |     5     | ----------------------+ | |
 |           |                     +-----------+                         | |
 |           |                     |     6     | ------------------------+ |
 |           |                     +-----------+                           |
 |           |                     |     7     | --------------------------+
 |           |                     +-----------+
 |           |
 |           |
 +-----------+

As we can see, the 2nd vmemmap page frame (indexed by 1) is reused and
remapped. However, the 2nd vmemmap page frame can also be freed to the
buddy allocator, which changes the mapping from the figure above to the
figure below.

 HugeTLB                    struct pages(8 pages)          page frame(8 pages)
 +-----------+ ---virt_to_page---> +-----------+   mapping to   +-----------+---> PG_head
 |           |                     |     0     | -------------> |     0     |
 |           |                     +-----------+                +-----------+
 |           |                     |     1     | ---------------^ ^ ^ ^ ^ ^ ^
 |           |                     +-----------+                  | | | | | |
 |           |                     |     2     | -----------------+ | | | | |
 |           |                     +-----------+                    | | | | |
 |           |                     |     3     | -------------------+ | | | |
 |           |                     +-----------+                      | | | |
 |           |                     |     4     | ---------------------+ | | |
 |    2MB    |                     +-----------+                        | | |
 |           |                     |     5     | -----------------------+ | |
 |           |                     +-----------+                          | |
 |           |                     |     6     | -------------------------+ |
 |           |                     +-----------+                            |
 |           |                     |     7     | ---------------------------+
 |           |                     +-----------+
 |           |
 |           |
 +-----------+

After we do this, all tail vmemmap pages (1-7) are mapped to the head
vmemmap page frame (0). In other words, more than one page struct with
PG_head set is associated with each HugeTLB page. We know that there is
only one real head page struct; the tail page structs with PG_head set are
fake head page structs. We need a way to distinguish between these two
types of page structs so that compound_head(), PageHead() and PageTail()
work properly when they are passed a tail page struct that has PG_head
set.

The following code snippet describes how to distinguish between a real and
a fake head page struct.

    if (test_bit(PG_head, &page->flags)) {
        unsigned long head = READ_ONCE(page[1].compound_head);

        if (head & 1) {
            if (head == (unsigned long)page + 1)
                ==> head page struct
            else
                ==> tail page struct
        } else
            ==> head page struct
    }

We can safely access the fields of @page[1] when @page has PG_head set,
because @page then belongs to a compound page composed of at least two
contiguous pages.

The main implementation is in patch 1.

On our server, this patchset saves an extra 2GB of memory when there is
1 TB of 2 MB HugeTLB pages: that is 524288 HugeTLB pages, each of which
frees one more 4 KB vmemmap page. With 1 GB HugeTLB pages, the same 1 TB
is only 1024 HugeTLB pages, so it can only save 4MB. For 2 MB HugeTLB
pages, it is a nice gain.

Changelog in v2:
  1. Drop two patches of introducing PAGEFLAGS_MASK from this series.
  2. Let page_head_if_fake() return page instead of NULL.
  3. Add a selftest to check if PageHead or PageTail work well.

Muchun Song (4):
  mm: hugetlb: free the 2nd vmemmap page associated with each HugeTLB
    page
  mm: hugetlb: replace hugetlb_free_vmemmap_enabled with a static_key
  mm: sparsemem: use page table lock to protect kernel pmd operations
  selftests: vm: add a hugetlb test case

 Documentation/admin-guide/kernel-parameters.txt |   2 +-
 include/linux/hugetlb.h                         |   6 +-
 include/linux/page-flags.h                      |  77 ++++++++++++-
 mm/hugetlb_vmemmap.c                            |  64 ++++++-----
 mm/ptdump.c                                     |  16 ++-
 mm/sparse-vmemmap.c                             |  70 +++++++++---
 tools/testing/selftests/vm/vmemmap_hugetlb.c    | 139 ++++++++++++++++++++++++
 7 files changed, 320 insertions(+), 54 deletions(-)
 create mode 100644 tools/testing/selftests/vm/vmemmap_hugetlb.c

-- 
2.11.0
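
For readers who want to play with the real-versus-fake head check from the
cover letter outside the kernel, here is a minimal userspace sketch. It is
not the code from patch 1. Every sim_* name, the 64-byte struct size, the
bit position used for the PG_head stand-in and the base address are
assumptions made up for illustration. The model keeps a single backing
frame of 64 structs and resolves all 512 "virtual" struct page addresses of
one 2 MB HugeTLB page to it, which mimics vmemmap pages 1-7 being remapped
to frame 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of the sim_* names are kernel APIs. */
#define SIM_PG_HEAD      (1UL << 0)   /* stand-in for the PG_head flag bit */
#define SIM_STRUCT_SIZE  64UL         /* assumed sizeof(struct page) */
#define SIM_PER_FRAME    64UL         /* 4 KB frame / 64-byte struct */
#define SIM_NR_STRUCTS   512UL        /* 2 MB HugeTLB page / 4 KB base pages */
#define SIM_VMEMMAP_BASE 0x100000UL   /* made-up virtual base address */

struct sim_page {
	unsigned long flags;
	unsigned long compound_head;  /* bit 0 set => tail, rest is head address */
};

/* The only backing frame: vmemmap pages 0-7 all resolve to it. */
static struct sim_page sim_frame0[SIM_PER_FRAME];

/* "Virtual address" of the idx-th struct page of the HugeTLB page. */
static unsigned long sim_vaddr(unsigned long idx)
{
	return SIM_VMEMMAP_BASE + idx * SIM_STRUCT_SIZE;
}

/* Model of the remapping: every virtual struct aliases a slot in frame 0. */
static const struct sim_page *sim_deref(unsigned long vaddr)
{
	assert(vaddr >= SIM_VMEMMAP_BASE);
	unsigned long idx = (vaddr - SIM_VMEMMAP_BASE) / SIM_STRUCT_SIZE;

	return &sim_frame0[idx % SIM_PER_FRAME];
}

/* Fill frame 0 the way a compound page would: one head, then tails. */
static void sim_init(void)
{
	sim_frame0[0].flags = SIM_PG_HEAD;
	sim_frame0[0].compound_head = 0;
	for (unsigned long i = 1; i < SIM_PER_FRAME; i++) {
		sim_frame0[i].flags = 0;
		sim_frame0[i].compound_head = sim_vaddr(0) | 1UL;
	}
}

/* Same decision tree as the snippet in the cover letter. */
static bool sim_is_real_head(unsigned long vaddr)
{
	if (!(sim_deref(vaddr)->flags & SIM_PG_HEAD))
		return false;

	unsigned long head = sim_deref(vaddr + SIM_STRUCT_SIZE)->compound_head;

	if (head & 1UL)
		return head == vaddr + 1;  /* a fake head sees the real head's address */
	return true;
}

/* How many virtual structs appear to have PG_head set. */
static unsigned long sim_count_pg_head(void)
{
	unsigned long n = 0;

	for (unsigned long i = 0; i < SIM_NR_STRUCTS; i++)
		n += (sim_deref(sim_vaddr(i))->flags & SIM_PG_HEAD) != 0;
	return n;
}

/* How many of them pass the real-head check. */
static unsigned long sim_count_real_heads(void)
{
	unsigned long n = 0;

	for (unsigned long i = 0; i < SIM_NR_STRUCTS; i++)
		n += sim_is_real_head(sim_vaddr(i));
	return n;
}
```

After calling sim_init(), sim_count_pg_head() returns 8 (one struct per
vmemmap page appears to carry PG_head) while sim_count_real_heads()
returns 1, which is the distinction the check is meant to recover.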