LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Andi Kleen <andi@firstfloor.org>
To: linux-kernel@vger.kernel.org, pj@sgi.com, linux-mm@kvack.org,
	nickpiggin@yahoo.com.au
Subject: [PATCH] [0/18] GB pages hugetlb support
Date: Mon, 17 Mar 2008 02:58:13 +0100 (CET)	[thread overview]
Message-ID: <20080317258.659191058@firstfloor.org> (raw)


This patchkit supports GB pages for hugetlb on x86-64 in addition to 
2MB pages.   This is the sucessor of an earlier much simpler
patchkit that allowed to set the hugepagesz globally at boot
to 1GB pages. The advantage of this more complex patchkit
is that it allows 2MB page users and 1GB page users to 
coexist (although not on the same hugetlbfs mount points) 

It first adds some straight-forward infrastructure 
to hugetlbfs to support multiple page sizes. Then it uses that
infrastructure to implement support for huge pages > MAX_ORDER 
(which can be allocated at boot with bootmem only). Then 
the x86-64 port is extended to support 1GB pages on CPUs
that support them (AMD Quad Cores)

There is no support for i386 because GB pages are only available in
long mode.

The variable page size support is currently limited to the
specific use case of the single additional 1GB page size.
Using it for more page sizes (especially those < MAX_ORDER)
would require some more work, although the basic infrastructure
is all in place and the incremental work will be small.
But I didn't bother to implement some corner cases not needed
for the GB page case. I usually added comments so they
should be easy to find (and fix) later however :)

I hacked in also cpuset support. It would be good if 
Paul double checked that.

GB pages are only intended to be used in special situations, like
dedicated databases where complicated configuration does not matter. 
That is why they have some limitations:
- Can be only allocated at boot (using hugepagesz=1G hugepages=...) 
- Can't be freed at runtime
- One hugetlbfs mount per page size (using the pagesize=... mount 
option). This is a little awkward, but greatly simplified the
code.
- No IPC SHM support currently (would not be very hard to do, 
but it is unclear what the best API for this is. Suggestions
welcome)

Some of this would be fixable later.

Known issues:
- GB pages are not reported in total memory, which gives
confusing free(1) output
- I have still to explain myself how and if free_pgd_pages works
on hugetlb, both with 1GB and with 2MB pages. 
- cpuset support is a little dubious, but the code was 
even before quite strange.
- lockdep sometimes complains about recursive page_table_locks
for shared hugetlb memory, but as far as I can see I didn't
actually change this area. Looks a little dubious, might
be a false positive too.
- hugemmap04 from LTP fails. Cause unknown currently

-Andi

             reply	other threads:[~2008-03-17  1:58 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-17  1:58 Andi Kleen [this message]
2008-03-17  1:58 ` [PATCH] [1/18] Convert hugeltlb.c over to pass global state around in a structure Andi Kleen
2008-03-17 20:15   ` Adam Litke
2008-03-18 12:05   ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [2/18] Add basic support for more than one hstate in hugetlbfs Andi Kleen
2008-03-17 20:22   ` Adam Litke
2008-03-17 20:44     ` Andi Kleen
2008-03-18 12:23   ` Mel Gorman
2008-03-23 10:38   ` KOSAKI Motohiro
2008-03-23 11:28     ` Andi Kleen
2008-03-23 11:30       ` KOSAKI Motohiro
2008-03-17  1:58 ` [PATCH] [3/18] Convert /proc output code over to report multiple hstates Andi Kleen
2008-03-18 12:28   ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [4/18] Add basic support for more than one hstate in hugetlbfs Andi Kleen
2008-03-17  8:09   ` Paul Jackson
2008-03-17  8:15     ` Andi Kleen
2008-03-17 20:28   ` Adam Litke
2008-03-18 14:11   ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [5/18] Expand the hugetlbfs sysctls to handle arrays for all hstates Andi Kleen
2008-03-18 14:34   ` Mel Gorman
2008-03-18 16:49     ` Andi Kleen
2008-03-18 17:01       ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [6/18] Add support to have individual hstates for each hugetlbfs mount Andi Kleen
2008-03-18 14:10   ` Adam Litke
2008-03-18 15:02   ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [7/18] Abstract out the NUMA node round robin code into a separate function Andi Kleen
2008-03-18 15:42   ` Mel Gorman
2008-03-18 15:47     ` Andi Kleen
2008-03-18 16:04       ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [8/18] Add a __alloc_bootmem_node_nopanic Andi Kleen
2008-03-18 15:54   ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [9/18] Export prep_compound_page to the hugetlb allocator Andi Kleen
2008-03-17  1:58 ` [PATCH] [10/18] Factor out new huge page preparation code into separate function Andi Kleen
2008-03-17 20:31   ` Adam Litke
2008-03-18 16:02   ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [11/18] Fix alignment bug in bootmem allocator Andi Kleen
2008-03-17  2:19   ` Yinghai Lu
2008-03-17  7:02     ` Andi Kleen
2008-03-17  7:17       ` Yinghai Lu
2008-03-17  7:31         ` Yinghai Lu
2008-03-17  7:41           ` Andi Kleen
2008-03-17  7:53             ` Yinghai Lu
2008-03-17  8:10               ` Yinghai Lu
2008-03-17  8:17                 ` Andi Kleen
2008-03-17  8:56               ` Andi Kleen
2008-03-17 18:52                 ` Yinghai Lu
2008-03-17 21:27                   ` Yinghai Lu
2008-03-18  2:06                     ` Yinghai Lu
2008-03-18 16:18   ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [12/18] Add support to allocate hugetlb pages that are larger than MAX_ORDER Andi Kleen
2008-03-18 16:27   ` Mel Gorman
2008-04-09 16:05   ` Andrew Hastings
2008-04-09 17:56     ` Andi Kleen
2008-03-17  1:58 ` [PATCH] [13/18] Add support to allocate hugepages of different size with hugepages= Andi Kleen
2008-03-18 16:32   ` Mel Gorman
2008-03-18 16:45     ` Andi Kleen
2008-03-18 16:46       ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [14/18] Clean up hugetlb boot time printk Andi Kleen
2008-03-18 16:37   ` Mel Gorman
2008-03-17  1:58 ` [PATCH] [15/18] Add support to x86-64 to allocate and lookup GB pages in hugetlb Andi Kleen
2008-03-17  1:58 ` [PATCH] [16/18] Add huge pud support to hugetlbfs Andi Kleen
2008-03-17  1:58 ` [PATCH] [17/18] Add huge pud support to mm/memory.c Andi Kleen
2008-03-17  1:58 ` [PATCH] [18/18] Implement hugepagesz= option for x86-64 Andi Kleen
2008-03-17  9:29   ` Paul Jackson
2008-03-17  9:59     ` Andi Kleen
2008-03-17 10:02       ` Paul Jackson
2008-03-17  3:11 ` [PATCH] [0/18] GB pages hugetlb support Paul Jackson
2008-03-17  7:00   ` Andi Kleen
2008-03-17  7:00     ` Paul Jackson
2008-03-17  7:29       ` Andi Kleen
2008-03-17  5:35 ` Paul Jackson
2008-03-17  6:58   ` Andi Kleen
2008-03-17  9:26 ` Paul Jackson
2008-03-17 15:05 ` Adam Litke
2008-03-17 15:33   ` Andi Kleen
2008-03-17 15:59     ` Adam Litke

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080317258.659191058@firstfloor.org \
    --to=andi@firstfloor.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=pj@sgi.com \
    --subject='Re: [PATCH] [0/18] GB pages hugetlb support' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).