LKML Archive on lore.kernel.org
From: David Rientjes <rientjes@google.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>
Subject: Re: [PATCH 0/4] hugetlbfs: optionally reserve all fs pages at mount time
Date: Tue, 3 Mar 2015 21:49:55 -0800 (PST)
Message-ID: <alpine.DEB.2.10.1503032145110.12253@chino.kir.corp.google.com>
In-Reply-To: <1425432106-17214-1-git-send-email-mike.kravetz@oracle.com>

On Tue, 3 Mar 2015, Mike Kravetz wrote:

> hugetlbfs allocates huge pages from the global pool as needed.  Even if
> the global pool contains a sufficient number of pages for the filesystem
> size at mount time, those global pages could be grabbed for some other
> use.  As a result, filesystem huge page allocations may fail due to lack
> of pages.
> 
> Applications such as a database want to use huge pages for performance
> reasons.  hugetlbfs filesystem semantics with ownership and modes work
> well to manage access to a pool of huge pages.  However, the application
> would like some reasonable assurance that allocations will not fail due
> to a lack of huge pages.  At application startup time, the application
> would like to configure itself to use a specific number of huge pages.
> Before starting, the application can check to make sure that enough
> huge pages exist in the system global pools.  What the application wants
> is exclusive use of a subpool of huge pages.
> 
> Add a new hugetlbfs mount option 'reserved' to specify that the number
> of pages associated with the size of the filesystem will be reserved.  If
> there are insufficient pages, the mount will fail.  The reservation is
> maintained for the duration of the filesystem so that as pages are
> allocated and freed a sufficient number of pages remains reserved.
> 

This functionality is somewhat limited because it's not possible to
reserve a subset of the size for a single mount point; it's either all or
nothing.  It shouldn't be too difficult to add a reserved=<value>
option where <value> is <= size.  Done that way, you should be
able to omit size= entirely for unlimited hugepages but still ensure that
a low watermark of hugepages is reserved for the database.
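
To make the suggested semantics concrete, here is a rough userspace sketch of
the accounting, not kernel code.  The struct, the function names and the
NO_LIMIT sentinel are all hypothetical and do not correspond to anything in
the posted patches.  It assumes reserved=<value> acts as a low watermark that
is taken from the global pool at mount time, that size= is optional, and that
freed pages refill the reserve before going back to the global pool.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_LIMIT (-1L)	/* hypothetical: size= was omitted */

/* Hypothetical model of one mount; not the kernel's subpool structure. */
struct pool_model {
	long global_free;	/* free huge pages left in the global pool */
	long max_pages;		/* size= in pages, or NO_LIMIT */
	long min_pages;		/* reserved=<value> in pages */
	long used_pages;	/* pages currently in use by this mount */
	long rsv_held;		/* reserved pages held but not yet used */
};

/* Mount fails if reserved > size or the global pool cannot cover it. */
bool model_mount(struct pool_model *p, long global_free,
		 long size, long reserved)
{
	if (reserved < 0)
		return false;
	if (size != NO_LIMIT && reserved > size)
		return false;
	if (reserved > global_free)
		return false;

	p->global_free = global_free - reserved;
	p->max_pages = size;
	p->min_pages = reserved;
	p->used_pages = 0;
	p->rsv_held = reserved;
	return true;
}

/* Allocate one page: consume the reserve first, then the global pool. */
bool model_get_page(struct pool_model *p)
{
	if (p->max_pages != NO_LIMIT && p->used_pages >= p->max_pages)
		return false;

	if (p->rsv_held > 0)
		p->rsv_held--;
	else if (p->global_free > 0)
		p->global_free--;
	else
		return false;

	p->used_pages++;
	return true;
}

/* Free one page: refill the reserve up to the watermark first. */
void model_put_page(struct pool_model *p)
{
	assert(p->used_pages > 0);
	p->used_pages--;

	if (p->used_pages + p->rsv_held < p->min_pages)
		p->rsv_held++;
	else
		p->global_free++;
}
```

In this model used_pages + rsv_held never drops below min_pages, so the
database always has its watermark available, while a mount without size= can
still grow into whatever the global pool has left.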

> Comments from RFC addressed/incorporated
> 
> Mike Kravetz (4):
>   hugetlbfs: add reserved mount fields to subpool structure
>   hugetlbfs: coordinate global and subpool reserve accounting
>   hugetlbfs: accept subpool reserved option and setup accordingly
>   hugetlbfs: document reserved mount option
> 
>  Documentation/vm/hugetlbpage.txt | 18 ++++++++------
>  fs/hugetlbfs/inode.c             | 15 ++++++++++--
>  include/linux/hugetlb.h          |  7 ++++++
>  mm/hugetlb.c                     | 53 +++++++++++++++++++++++++++++++++-------
>  4 files changed, 75 insertions(+), 18 deletions(-)

Thread overview: 9+ messages
2015-03-04  1:21 Mike Kravetz
2015-03-04  1:21 ` [PATCH 1/4] hugetlbfs: add reserved mount fields to subpool structure Mike Kravetz
2015-03-04  1:21 ` [PATCH 2/4] hugetlbfs: coordinate global and subpool reserve accounting Mike Kravetz
2015-03-04  1:21 ` [PATCH 3/4] hugetlbfs: accept subpool reserved option and setup accordingly Mike Kravetz
2015-03-04  1:21 ` [PATCH 4/4] hugetlbfs: document reserved mount option Mike Kravetz
2015-03-04  5:49 ` David Rientjes [this message]
2015-03-04 17:21   ` [PATCH 0/4] hugetlbfs: optionally reserve all fs pages at mount time Mike Kravetz
2015-03-06 22:13 ` Andi Kleen
2015-03-06 22:30   ` Mike Kravetz
