LKML Archive on lore.kernel.org
* [PATCH V3] Allow compaction of unevictable pages
@ 2015-03-09 20:48 Eric B Munson
  2015-03-09 20:51 ` Rik van Riel
  2015-03-10 11:22 ` Peter Zijlstra
  0 siblings, 2 replies; 4+ messages in thread
From: Eric B Munson @ 2015-03-09 20:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric B Munson, Vlastimil Babka, Thomas Gleixner,
	Christoph Lameter, Peter Zijlstra, Mel Gorman, David Rientjes,
	linux-mm, linux-kernel

Currently, pages which are marked as unevictable are protected from
compaction, but not from other types of migration.  The mlock
description does not promise that all page faults will be avoided, only
major ones, so this protection is not necessary.  This extra protection
can cause problems for applications that are using mlock to avoid
swapping pages out, but require order > 0 allocations to continue to
succeed in a fragmented environment.  This patch removes the
ISOLATE_UNEVICTABLE mode and the check for it in __isolate_lru_page().
Removing this check allows the removal of the isolate_mode argument from
isolate_migratepages_block() because it can compute the required mode
from the compact_control structure.
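
As a rough userspace model of the change (not kernel code), the sketch
below mirrors the check removed from __isolate_lru_page() and the mode
computation that now lives in isolate_migratepages_block().  The
model_* names and the enum are illustrative assumptions; only the two
ISOLATE_* bit values are taken from the mmzone.h hunk of this patch.
The model ignores every other condition the real function tests.

```c
#include <stdbool.h>

/* Bit values as they appear in include/linux/mmzone.h in this patch. */
#define ISOLATE_ASYNC_MIGRATE 0x4u
#define ISOLATE_UNEVICTABLE   0x8u /* removed by this patch */

/* Hypothetical stand-in for the compaction migrate mode. */
enum model_migrate_mode {
	MODEL_MIGRATE_ASYNC,
	MODEL_MIGRATE_SYNC_LIGHT,
	MODEL_MIGRATE_SYNC,
};

/* Mode computation, as now done inside isolate_migratepages_block(). */
static unsigned int model_isolate_mode(enum model_migrate_mode mode)
{
	return mode == MODEL_MIGRATE_ASYNC ? ISOLATE_ASYNC_MIGRATE : 0;
}

/* Before: an unevictable page is skipped unless the caller opts in. */
static bool model_may_isolate_old(bool unevictable, unsigned int mode)
{
	if (unevictable && !(mode & ISOLATE_UNEVICTABLE))
		return false;
	return true;
}

/* After: the unevictable flag no longer blocks isolation. */
static bool model_may_isolate_new(bool unevictable, unsigned int mode)
{
	(void)unevictable;
	(void)mode;
	return true;
}
```

In this model, a compaction caller never set ISOLATE_UNEVICTABLE, so
model_may_isolate_old() rejects every unevictable page for it, while a
caller that passed the flag (the range isolation path in the diff) was
already allowed through.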

To illustrate this problem I wrote a quick test program that mmaps a
large number of 1MB files filled with random data.  These maps are
created locked and read only.  Then every other mmap is unmapped and I
attempt to allocate huge pages to the static huge page pool.  Without
this patch I am unable to allocate any huge pages after fragmenting
memory.  With it, I can allocate almost all the space freed by unmapping
as huge pages.
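
The core of such a test might look like the sketch below.  It is not
the actual test program: the helper names are hypothetical, and the use
of MAP_PRIVATE with MAP_LOCKED is an assumption about how "locked and
read only" maps were created.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_LOCKED */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map len bytes of fd read-only.  With locked set, MAP_LOCKED asks the
 * kernel to lock the pages in memory, as mlock() would.
 * Returns NULL on failure.
 */
static void *map_file_readonly(int fd, size_t len, int locked)
{
	int flags = MAP_PRIVATE | (locked ? MAP_LOCKED : 0);
	void *p = mmap(NULL, len, PROT_READ, flags, fd, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Unmap every other mapping (indices 0, 2, 4, ...), leaving the odd
 * ones in place.  Returns the number of mappings released.
 */
static size_t unmap_every_other(void **maps, size_t n, size_t len)
{
	size_t freed = 0;

	for (size_t i = 0; i < n; i += 2) {
		if (maps[i] && munmap(maps[i], len) == 0) {
			maps[i] = NULL;
			freed++;
		}
	}
	return freed;
}
```

A driver would call map_file_readonly() with locked set on each 1MB
file, call unmap_every_other(), and then, as root, raise
/proc/sys/vm/nr_hugepages to grow the static huge page pool.  Locking
many mappings may also require a larger RLIMIT_MEMLOCK.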

Signed-off-by: Eric B Munson <emunson@akamai.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 include/linux/mmzone.h |    2 --
 mm/compaction.c        |   13 +++++--------
 mm/vmscan.c            |    4 ----
 3 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f279d9c..599fb01 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -232,8 +232,6 @@ struct lruvec {
 #define ISOLATE_UNMAPPED	((__force isolate_mode_t)0x2)
 /* Isolate for asynchronous migration */
 #define ISOLATE_ASYNC_MIGRATE	((__force isolate_mode_t)0x4)
-/* Isolate unevictable pages */
-#define ISOLATE_UNEVICTABLE	((__force isolate_mode_t)0x8)
 
 /* LRU Isolation modes. */
 typedef unsigned __bitwise__ isolate_mode_t;
diff --git a/mm/compaction.c b/mm/compaction.c
index 8c0d945..9bdf1d7 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -650,7 +650,6 @@ static bool too_many_isolated(struct zone *zone)
  * @cc:		Compaction control structure.
  * @low_pfn:	The first PFN to isolate
  * @end_pfn:	The one-past-the-last PFN to isolate, within same pageblock
- * @isolate_mode: Isolation mode to be used.
  *
  * Isolate all pages that can be migrated from the range specified by
  * [low_pfn, end_pfn). The range is expected to be within same pageblock.
@@ -664,7 +663,7 @@ static bool too_many_isolated(struct zone *zone)
  */
 static unsigned long
 isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
-			unsigned long end_pfn, isolate_mode_t isolate_mode)
+			unsigned long end_pfn)
 {
 	struct zone *zone = cc->zone;
 	unsigned long nr_scanned = 0, nr_isolated = 0;
@@ -674,6 +673,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 	bool locked = false;
 	struct page *page = NULL, *valid_page = NULL;
 	unsigned long start_pfn = low_pfn;
+	const isolate_mode_t isolate_mode =
+		(cc->mode == MIGRATE_ASYNC ? ISOLATE_ASYNC_MIGRATE : 0);
 
 	/*
 	 * Ensure that there are not too many pages isolated from the LRU
@@ -872,8 +873,7 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
 		if (!pageblock_pfn_to_page(pfn, block_end_pfn, cc->zone))
 			continue;
 
-		pfn = isolate_migratepages_block(cc, pfn, block_end_pfn,
-							ISOLATE_UNEVICTABLE);
+		pfn = isolate_migratepages_block(cc, pfn, block_end_pfn);
 
 		/*
 		 * In case of fatal failure, release everything that might
@@ -1056,8 +1056,6 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
 {
 	unsigned long low_pfn, end_pfn;
 	struct page *page;
-	const isolate_mode_t isolate_mode =
-		(cc->mode == MIGRATE_ASYNC ? ISOLATE_ASYNC_MIGRATE : 0);
 
 	/*
 	 * Start at where we last stopped, or beginning of the zone as
@@ -1102,8 +1100,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
 			continue;
 
 		/* Perform the isolation */
-		low_pfn = isolate_migratepages_block(cc, low_pfn, end_pfn,
-								isolate_mode);
+		low_pfn = isolate_migratepages_block(cc, low_pfn, end_pfn);
 
 		if (!low_pfn || cc->contended) {
 			acct_isolated(zone, cc);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5e8eadd..3b2a444 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1234,10 +1234,6 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode)
 	if (!PageLRU(page))
 		return ret;
 
-	/* Compaction should not handle unevictable pages but CMA can do so */
-	if (PageUnevictable(page) && !(mode & ISOLATE_UNEVICTABLE))
-		return ret;
-
 	ret = -EBUSY;
 
 	/*
-- 
1.7.9.5



* Re: [PATCH V3] Allow compaction of unevictable pages
  2015-03-09 20:48 [PATCH V3] Allow compaction of unevictable pages Eric B Munson
@ 2015-03-09 20:51 ` Rik van Riel
  2015-03-10 11:22 ` Peter Zijlstra
  1 sibling, 0 replies; 4+ messages in thread
From: Rik van Riel @ 2015-03-09 20:51 UTC (permalink / raw)
  To: Eric B Munson, Andrew Morton
  Cc: Vlastimil Babka, Thomas Gleixner, Christoph Lameter,
	Peter Zijlstra, Mel Gorman, David Rientjes, linux-mm,
	linux-kernel

On 03/09/2015 04:48 PM, Eric B Munson wrote:
> Currently, pages which are marked as unevictable are protected from
> compaction, but not from other types of migration.  The mlock
> description does not promise that all page faults will be avoided, only
> major ones, so this protection is not necessary.  This extra protection
> can cause problems for applications that are using mlock to avoid
> swapping pages out, but require order > 0 allocations to continue to
> succeed in a fragmented environment.  This patch removes the
> ISOLATE_UNEVICTABLE mode and the check for it in __isolate_lru_page().
> Removing this check allows the removal of the isolate_mode argument from
> isolate_migratepages_block() because it can compute the required mode
> from the compact_control structure.
>
> To illustrate this problem I wrote a quick test program that mmaps a
> large number of 1MB files filled with random data.  These maps are
> created locked and read only.  Then every other mmap is unmapped and I
> attempt to allocate huge pages to the static huge page pool.  Without
> this patch I am unable to allocate any huge pages after fragmenting
> memory.  With it, I can allocate almost all the space freed by unmapping
> as huge pages.
>
> Signed-off-by: Eric B Munson <emunson@akamai.com>
> Acked-by: David Rientjes <rientjes@google.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: David Rientjes <rientjes@google.com>
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org

Acked-by: Rik van Riel <riel@redhat.com>



* Re: [PATCH V3] Allow compaction of unevictable pages
  2015-03-09 20:48 [PATCH V3] Allow compaction of unevictable pages Eric B Munson
  2015-03-09 20:51 ` Rik van Riel
@ 2015-03-10 11:22 ` Peter Zijlstra
  2015-03-10 14:12   ` Eric B Munson
  1 sibling, 1 reply; 4+ messages in thread
From: Peter Zijlstra @ 2015-03-10 11:22 UTC (permalink / raw)
  To: Eric B Munson
  Cc: Andrew Morton, Vlastimil Babka, Thomas Gleixner,
	Christoph Lameter, Mel Gorman, David Rientjes, linux-mm,
	linux-kernel

On Mon, Mar 09, 2015 at 04:48:43PM -0400, Eric B Munson wrote:
> Currently, pages which are marked as unevictable are protected from
> compaction, but not from other types of migration.  The mlock
> description does not promise that all page faults will be avoided, only
> major ones, so this protection is not necessary.  This extra protection
> can cause problems for applications that are using mlock to avoid
> swapping pages out, but require order > 0 allocations to continue to
> succeed in a fragmented environment.  This patch removes the
> ISOLATE_UNEVICTABLE mode and the check for it in __isolate_lru_page().
> Removing this check allows the removal of the isolate_mode argument from
> isolate_migratepages_block() because it can compute the required mode
> from the compact_control structure.
> 
> To illustrate this problem I wrote a quick test program that mmaps a
> large number of 1MB files filled with random data.  These maps are
> created locked and read only.  Then every other mmap is unmapped and I
> attempt to allocate huge pages to the static huge page pool.  Without
> this patch I am unable to allocate any huge pages after fragmenting
> memory.  With it, I can allocate almost all the space freed by unmapping
> as huge pages.

So mlock() is part of the POSIX real-time spec. For real-time purposes
we very much do _NOT_ want page migration to happen.

So while you might be following the letter of the spec you're very much
violating the spirit of the thing.

Also, there is another solution to your problem; you can compact
mlock'ed pages at mlock() time.

Furthermore, I would once again like to remind people of my VM_PINNED
patches. The only thing that needs happening there is someone needs to
deobfuscate the IB code.


* Re: [PATCH V3] Allow compaction of unevictable pages
  2015-03-10 11:22 ` Peter Zijlstra
@ 2015-03-10 14:12   ` Eric B Munson
  0 siblings, 0 replies; 4+ messages in thread
From: Eric B Munson @ 2015-03-10 14:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Morton, Vlastimil Babka, Thomas Gleixner,
	Christoph Lameter, Mel Gorman, David Rientjes, linux-mm,
	linux-kernel


On Tue, 10 Mar 2015, Peter Zijlstra wrote:

> On Mon, Mar 09, 2015 at 04:48:43PM -0400, Eric B Munson wrote:
> > Currently, pages which are marked as unevictable are protected from
> > compaction, but not from other types of migration.  The mlock
> > description does not promise that all page faults will be avoided, only
> > major ones, so this protection is not necessary.  This extra protection
> > can cause problems for applications that are using mlock to avoid
> > swapping pages out, but require order > 0 allocations to continue to
> > succeed in a fragmented environment.  This patch removes the
> > ISOLATE_UNEVICTABLE mode and the check for it in __isolate_lru_page().
> > Removing this check allows the removal of the isolate_mode argument from
> > isolate_migratepages_block() because it can compute the required mode
> > from the compact_control structure.
> > 
> > To illustrate this problem I wrote a quick test program that mmaps a
> > large number of 1MB files filled with random data.  These maps are
> > created locked and read only.  Then every other mmap is unmapped and I
> > attempt to allocate huge pages to the static huge page pool.  Without
> > this patch I am unable to allocate any huge pages after fragmenting
> > memory.  With it, I can allocate almost all the space freed by unmapping
> > as huge pages.
> 
> So mlock() is part of the POSIX real-time spec. For real-time purposes
> we very much do _NOT_ want page migration to happen.
> 
> So while you might be following the letter of the spec you're very much
> violating the spirit of the thing.
> 

Fair enough, but the documentation in the mlock manpage only explicitly
promises to prevent major faults.  If this patch is not taken, then the
manpage for mlock needs to have a note added explaining that mlock
prevents compaction as well.  The confusion our userspace devs had stems
from this, as they thought they could use mlock to avoid swapping but
still benefit from compaction for order > 0 allocations.
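
For anyone who wants to observe the distinction from userspace, a
minimal sketch using getrusage() follows.  The struct and function
names are hypothetical.  A touch that races with migration of a locked
page would be expected to count as a minor fault, not a major one,
though I have not measured this.

```c
#include <sys/resource.h>

/* Snapshot of this process's page-fault counters. */
struct fault_counts {
	long minor; /* faults serviced without I/O (ru_minflt) */
	long major; /* faults that required I/O (ru_majflt) */
};

/* Read the counters via getrusage(); returns 0 on success, -1 on error. */
static int read_fault_counts(struct fault_counts *out)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) != 0)
		return -1;
	out->minor = ru.ru_minflt;
	out->major = ru.ru_majflt;
	return 0;
}
```

Taking one snapshot before and one after a latency-sensitive section
and subtracting them shows how many faults of each kind it incurred.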

> Also, there is another solution to your problem; you can compact
> mlock'ed pages at mlock() time.

This might work for some cases, I'd have to spend some time thinking on
it, but it won't work in my case.  Memory is fragmented by unmapping
as data is no longer needed.  So we really do need to compact the
locked pages that are left.

> 
> Furthermore, I would once again like to remind people of my VM_PINNED
> patches. The only thing that needs happening there is someone needs to
> deobfuscate the IB code.

Hence my attempt to kick that discussion last week.  Unfortunately, I
cannot provide any help with the IB code.  Having this mechanism would
give us a way to continue to allow real-time users to avoid all faults
while giving anyone that wants to avoid only major faults a way to do
so.


