From: David Rientjes
Date: Tue, 3 Mar 2015 21:49:55 -0800 (PST)
To: Mike Kravetz
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton, Davidlohr Bueso, Aneesh Kumar, Joonsoo Kim
Subject: Re: [PATCH 0/4] hugetlbfs: optionally reserve all fs pages at mount time
In-Reply-To: <1425432106-17214-1-git-send-email-mike.kravetz@oracle.com>

On Tue, 3 Mar 2015, Mike Kravetz wrote:

> hugetlbfs allocates huge pages from the global pool as needed. Even if
> the global pool contains a sufficient number of pages for the filesystem
> size at mount time, those global pages could be grabbed for some other
> use. As a result, filesystem huge page allocations may fail due to lack
> of pages.
>
> Applications such as a database want to use huge pages for performance
> reasons. hugetlbfs filesystem semantics with ownership and modes work
> well to manage access to a pool of huge pages. However, the application
> would like some reasonable assurance that allocations will not fail due
> to a lack of huge pages. At application startup time, the application
> would like to configure itself to use a specific number of huge pages.
> Before starting, the application can check to make sure that enough
> huge pages exist in the system global pools. What the application wants
> is exclusive use of a subpool of huge pages.
>
> Add a new hugetlbfs mount option 'reserved' to specify that the number
> of pages associated with the size of the filesystem will be reserved. If
> there are insufficient pages, the mount will fail. The reservation is
> maintained for the duration of the filesystem so that as pages are
> allocated and freed a sufficient number of pages remains reserved.
>

This functionality is somewhat limited because it's not possible to
reserve a subset of the size for a single mount point; it's either all or
nothing. It shouldn't be too difficult to just add a reserved=<value>
option where <value> is <= size. If it's done that way, you should be able
to omit size= entirely for unlimited hugepages but always ensure that a
low watermark of hugepages is reserved for the database.

> Comments from RFC addressed/incorporated
>
> Mike Kravetz (4):
>   hugetlbfs: add reserved mount fields to subpool structure
>   hugetlbfs: coordinate global and subpool reserve accounting
>   hugetlbfs: accept subpool reserved option and setup accordingly
>   hugetlbfs: document reserved mount option
>
>  Documentation/vm/hugetlbpage.txt | 18 ++++++++------
>  fs/hugetlbfs/inode.c             | 15 ++++++++++--
>  include/linux/hugetlb.h          |  7 ++++++
>  mm/hugetlb.c                     | 53 +++++++++++++++++++++++++++++++++-------
>  4 files changed, 75 insertions(+), 18 deletions(-)
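
To make the reserved= suggestion above concrete, here is a minimal userspace
sketch of the accounting it seems to imply. This is not the code from the
series and not the kernel's subpool implementation: struct pool, struct
subpool, SUBPOOL_NO_LIMIT, subpool_mount(), subpool_get_page() and
subpool_put_page() are all hypothetical names chosen for illustration. It
assumes a subpool has an optional maximum (size=, possibly omitted) and a
low watermark (reserved=) that is taken out of the global pool at mount
time and refilled first when pages are freed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical marker for "size= was omitted": no upper limit. */
#define SUBPOOL_NO_LIMIT (-1L)

/* Hypothetical model of the global pool: pages neither in use nor reserved. */
struct pool {
	long avail;
};

/* Hypothetical model of one hugetlbfs mount's subpool. */
struct subpool {
	struct pool *global;
	long max_pages;   /* size= limit, or SUBPOOL_NO_LIMIT */
	long min_pages;   /* reserved= low watermark */
	long used_pages;  /* pages currently allocated by this mount */
	long rsv_left;    /* reserved pages not yet handed out */
};

/* Mount: fail unless the whole low watermark can be set aside up front. */
bool subpool_mount(struct subpool *sp, struct pool *g,
		   long max_pages, long min_pages)
{
	if (min_pages < 0)
		return false;
	if (max_pages != SUBPOOL_NO_LIMIT && min_pages > max_pages)
		return false;           /* reserved must be <= size */
	if (g->avail < min_pages)
		return false;           /* insufficient pages: mount fails */

	g->avail -= min_pages;
	sp->global = g;
	sp->max_pages = max_pages;
	sp->min_pages = min_pages;
	sp->used_pages = 0;
	sp->rsv_left = min_pages;
	return true;
}

/* Allocate: consume the reservation first, then fall back to the global pool. */
bool subpool_get_page(struct subpool *sp)
{
	if (sp->max_pages != SUBPOOL_NO_LIMIT &&
	    sp->used_pages >= sp->max_pages)
		return false;

	if (sp->rsv_left > 0)
		sp->rsv_left--;
	else if (sp->global->avail > 0)
		sp->global->avail--;
	else
		return false;

	sp->used_pages++;
	return true;
}

/* Free: refill the reservation up to the watermark before returning pages. */
void subpool_put_page(struct subpool *sp)
{
	assert(sp->used_pages > 0);
	sp->used_pages--;

	if (sp->used_pages < sp->min_pages)
		sp->rsv_left++;
	else
		sp->global->avail++;
}
```

With this model, a mount with reserved=4 and no size= can grow until the
global pool is empty, while used_pages + rsv_left never drops below 4 for
the lifetime of the mount.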