* [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28
@ 2007-03-01 10:02 Mel Gorman
  2007-03-01 10:02 ` [PATCH 1/12] Add a bitmap that is used to track flags affecting a block of pages Mel Gorman
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:02 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm

Here is the latest revision of the anti-fragmentation patches. Of
particular note in this version is the special treatment of high-order
atomic allocations. Care is taken to group them together and to avoid
grouping pages of other types near them. Artificial tests imply that it
works. I'm trying to get the hardware together that would allow a "real"
test to be set up. If anyone already has a setup and test that can trigger
the atomic-allocation problem, I'd appreciate a test of these patches and a
report. The second major change is that these patches apply cleanly on top
of the patches that implement anti-fragmentation through zones.

kernbench shows effectively no performance difference, varying between -0.2%
and +2% on a variety of test machines. Success rates for huge page allocation
are dramatically increased. For example, on a ppc64 machine, the vanilla
kernel was only able to allocate 1% of memory as hugepages, and this was
due to the single hugepage's worth of memory reserved via min_free_kbytes.
With these patches applied, 17% of memory was allocatable as superpages.
With reclaim-related fixes from Andy Whitcroft this rose to 40%, and further
reclaim-related improvements should increase it further.

Changelog Since V28
o Group high-order atomic allocations together
o It is no longer required to set min_free_kbytes to 10% of memory. A value
  of 16384 will be sufficient in most cases
o Now applied with zone-based anti-fragmentation
o Fix incorrect VM_BUG_ON within buffered_rmqueue()
o Reorder the stack so later patches do not back out work from earlier patches
o Fix bug where journal pages were being treated as movable
o Bias placement of non-movable pages to lower PFNs
o More aggressive clustering of reclaimable pages in reaction to workloads
  like updatedb that flood the inode caches

Changelog Since V27

o Renamed anti-fragmentation to Page Clustering. Anti-fragmentation was giving
  the mistaken impression that it was a 100% solution for high-order
  allocations. Instead, it greatly increases the chances that high-order
  allocations will succeed and lays the foundation for defragmentation and
  memory hot-remove to work properly
o Redefine page groupings based on ability to migrate or reclaim instead of
  basing on reclaimability alone
o Get rid of spurious inits
o Per-cpu lists are no longer split up per-type. Instead the per-cpu list is
  searched for a page of the appropriate type
o Added more explanation commentary
o Fix up bug in pageblock code where the bitmap was used before being
  initialised

Changelog Since V26
o Fix double init of lists in setup_pageset

Changelog Since V25
o Fix loop order of for_each_rclmtype_order so that order of loop matches args
o gfpflags_to_rclmtype uses gfp_t instead of unsigned long
o Rename get_pageblock_type() to get_page_rclmtype()
o Fix alignment problem in move_freepages()
o Add mechanism for assigning flags to blocks of pages instead of page->flags
o On fallback, do not examine the preferred list of free pages a second time

The purpose of these patches is to reduce external fragmentation by grouping
pages of related types together. When pages are migrated (or reclaimed under
memory pressure), large contiguous pages will be freed. 

These patches work by categorising allocations according to their ability to
migrate; a short sketch of how callers flag this follows the list:

Movable - The pages may be moved with the page migration mechanism. These are
	generally userspace pages. 

Reclaimable - These are allocations for some kernel caches that are
	reclaimable or allocations that are known to be very short-lived.

Unmovable - These are pages allocated by the kernel that are not trivially
	reclaimed. For example, the memory allocated for a loaded module
	would be in this category. By default, allocations are considered
	to be of this type.

HighAtomic - These are high-order allocations belonging to callers that
	cannot sleep or perform any IO. In practice, this is restricted to
	jumbo frame allocation for network receive. It is assumed that the
	allocations are short-lived.
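
As a rough sketch, the movable category is communicated through the caller's
GFP flags (patches 2 and 3 below add __GFP_MOVABLE for this), with no flag
meaning unmovable; how the reclaimable and atomic cases are flagged is
handled by later patches in the series and is not shown here:

	/* User page cache or anonymous memory: can migrate or be reclaimed */
	page = alloc_page_vma(GFP_HIGHUSER | __GFP_MOVABLE, vma, addr);

	/* Plain kernel allocation: no flag, treated as unmovable by default */
	page = alloc_page(GFP_KERNEL);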

Instead of having one free list per order in struct free_area, there is
one for each migrate type. Once a MAX_ORDER_NR_PAGES block of pages is
split for a type of allocation, it is added to the free lists for that
type, in effect reserving it. Hence, over time, pages of the different
types can be clustered together.

When the preferred free lists are depleted, the largest possible block is
taken from an alternative list. Buddies that are split from that large block
are placed on the preferred allocation-type free lists to mitigate
fragmentation.
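
In data-structure terms, this is a condensed sketch of what patch 4 below
implements:

	struct free_area {
		/* One free list per migrate type instead of a single list */
		struct list_head	free_list[MIGRATE_TYPES];
		unsigned long		nr_free;
	};

	/* On fallback, __rmqueue_fallback() searches the other types' lists
	 * from the largest order downwards so that the unused buddies can
	 * seed the preferred type's free lists */
	page = __rmqueue_fallback(zone, order, start_migratetype);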

This implementation gives a best-effort attempt at low fragmentation in all
zones. Ideally, min_free_kbytes should be set to a value equivalent to
4 * (1 << (MAX_ORDER-1)) pages in most cases. This would be 16384 on x86
and x86_64, for example.
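
As a worked example, assuming MAX_ORDER is 11 and pages are 4K, as on x86:

	4 * (1 << (MAX_ORDER-1)) = 4 * 1024 pages = 4096 pages
	4096 pages * 4K per page = 16384K, i.e. min_free_kbytes=16384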

Our tests show that about 60-70% of physical memory can be allocated on
a desktop after a few days uptime. In benchmarks and stress tests, we are
finding that 80% of memory is available as contiguous blocks at the end of
the test. To compare, a standard kernel was getting < 1% of memory as large
pages on a desktop and about 8-12% of memory as large pages at the end of
stress tests.

Following this email are 12 patches that implement this page grouping
feature. The first patch introduces a mechanism for storing flags related
to a whole block of pages. Then allocations are split between movable and
all other allocations. Following that are patches to deal with per-cpu
pages and make the mechanism configurable. The next patch moves free pages
between lists when partially allocated blocks are used for pages of another
migrate type. A later patch groups reclaimable kernel allocations such as
inode caches together. The final patch related to grouping keeps high-order
atomic allocations clustered together.

The last two patches are more concerned with controlling fragmentation. The
second last patch biases placement of non-movable allocations towards the
start of memory. This is with a view to supporting memory hot-remove of
DIMMs with higher PFNs in the future. The biasing could be enforced much
more heavily, but it would have a cost. The last patch aggressively clusters
reclaimable pages like inode caches together.

-- 
Mel Gorman
Part-time PhD Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* [PATCH 1/12] Add a bitmap that is used to track flags affecting a block of pages
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
@ 2007-03-01 10:02 ` Mel Gorman
  2007-03-01 10:03 ` [PATCH 2/12] Add __GFP_MOVABLE for callers to flag allocations from high memory that may be migrated Mel Gorman
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:02 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


The fragmentation reduction strategy needs to track whether pages within a
block can be moved or reclaimed so that pages are freed to the appropriate
list. This patch adds a bitmap for flags affecting a whole MAX_ORDER block
of pages.

In non-SPARSEMEM configurations, the bitmap is stored in the struct zone
and allocated during initialisation. SPARSEMEM statically allocates the
bitmap in a struct mem_section so that bitmaps do not have to be resized
during memory hotadd. This wastes a small amount of memory per unused section
(usually sizeof(unsigned long)) but the complexity of dynamically allocating
the memory is quite high.
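
For reference, this is how patch 4 later consumes the interface; PB_migrate
and PB_migrate_end are defined there rather than in this patch:

	/* Read and write the migrate-type bits of the block holding 'page' */
	migratetype = get_pageblock_flags_group(page, PB_migrate,
						PB_migrate_end);
	set_pageblock_flags_group(page, (unsigned long)migratetype,
						PB_migrate, PB_migrate_end);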

Additional credit to Andy Whitcroft, who reviewed an earlier implementation
of the mechanism and suggested how to make it a *lot* cleaner.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 include/linux/mmzone.h          |   13 ++++
 include/linux/pageblock-flags.h |   51 +++++++++++++++
 mm/page_alloc.c                 |  113 +++++++++++++++++++++++++++++++++++
 3 files changed, 177 insertions(+)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-clean/include/linux/mmzone.h linux-2.6.20-mm2-001_pageblock_bits/include/linux/mmzone.h
--- linux-2.6.20-mm2-clean/include/linux/mmzone.h	2007-02-19 01:22:30.000000000 +0000
+++ linux-2.6.20-mm2-001_pageblock_bits/include/linux/mmzone.h	2007-02-20 18:23:25.000000000 +0000
@@ -13,6 +13,7 @@
 #include <linux/init.h>
 #include <linux/seqlock.h>
 #include <linux/nodemask.h>
+#include <linux/pageblock-flags.h>
 #include <asm/atomic.h>
 #include <asm/page.h>
 
@@ -209,6 +210,14 @@ struct zone {
 #endif
 	struct free_area	free_area[MAX_ORDER];
 
+#ifndef CONFIG_SPARSEMEM
+	/*
+	 * Flags for a MAX_ORDER_NR_PAGES block. See pageblock-flags.h.
+	 * In SPARSEMEM, this map is stored in struct mem_section
+	 */
+	unsigned long		*pageblock_flags;
+#endif /* CONFIG_SPARSEMEM */
+
 
 	ZONE_PADDING(_pad1_)
 
@@ -661,6 +670,9 @@ extern struct zone *next_zone(struct zon
 #define PAGES_PER_SECTION       (1UL << PFN_SECTION_SHIFT)
 #define PAGE_SECTION_MASK	(~(PAGES_PER_SECTION-1))
 
+#define SECTION_BLOCKFLAGS_BITS \
+		((SECTION_SIZE_BITS - (MAX_ORDER-1)) * NR_PAGEBLOCK_BITS)
+
 #if (MAX_ORDER - 1 + PAGE_SHIFT) > SECTION_SIZE_BITS
 #error Allocator MAX_ORDER exceeds SECTION_SIZE
 #endif
@@ -680,6 +692,7 @@ struct mem_section {
 	 * before using it wrong.
 	 */
 	unsigned long section_mem_map;
+	DECLARE_BITMAP(pageblock_flags, SECTION_BLOCKFLAGS_BITS);
 };
 
 #ifdef CONFIG_SPARSEMEM_EXTREME
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-clean/include/linux/pageblock-flags.h linux-2.6.20-mm2-001_pageblock_bits/include/linux/pageblock-flags.h
--- linux-2.6.20-mm2-clean/include/linux/pageblock-flags.h	2007-02-19 13:53:43.000000000 +0000
+++ linux-2.6.20-mm2-001_pageblock_bits/include/linux/pageblock-flags.h	2007-02-20 19:44:47.000000000 +0000
@@ -0,0 +1,51 @@
+/*
+ * Macros for manipulating and testing flags related to a
+ * MAX_ORDER_NR_PAGES block of pages.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation version 2 of the License
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2006
+ *
+ * Original author, Mel Gorman
+ * Major cleanups and reduction of bit operations, Andy Whitcroft
+ */
+#ifndef PAGEBLOCK_FLAGS_H
+#define PAGEBLOCK_FLAGS_H
+
+#include <linux/types.h>
+
+/* Macro to aid the definition of ranges of bits */
+#define PB_range(name, required_bits) \
+	name, name ## _end = (name + required_bits) - 1
+
+/* Bit indices that affect a whole block of pages */
+enum pageblock_bits {
+	NR_PAGEBLOCK_BITS
+};
+
+/* Forward declaration */
+struct page;
+
+/* Declarations for getting and setting flags. See mm/page_alloc.c */
+unsigned long get_pageblock_flags_group(struct page *page,
+					int start_bitidx, int end_bitidx);
+void set_pageblock_flags_group(struct page *page, unsigned long flags,
+					int start_bitidx, int end_bitidx);
+
+#define get_pageblock_flags(page) \
+			get_pageblock_flags_group(page, 0, NR_PAGEBLOCK_BITS-1)
+#define set_pageblock_flags(page) \
+			set_pageblock_flags_group(page, 0, NR_PAGEBLOCK_BITS-1)
+
+#endif	/* PAGEBLOCK_FLAGS_H */
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-clean/mm/page_alloc.c linux-2.6.20-mm2-001_pageblock_bits/mm/page_alloc.c
--- linux-2.6.20-mm2-clean/mm/page_alloc.c	2007-02-19 01:22:35.000000000 +0000
+++ linux-2.6.20-mm2-001_pageblock_bits/mm/page_alloc.c	2007-02-20 18:23:25.000000000 +0000
@@ -2715,6 +2715,41 @@ static void __init calculate_node_totalp
 							realtotalpages);
 }
 
+#ifndef CONFIG_SPARSEMEM
+/*
+ * Calculate the size of the zone->pageblock_flags bitmap in bytes.
+ * Start by making sure zonesize is a multiple of MAX_ORDER_NR_PAGES by
+ * rounding up. Then figure NR_PAGEBLOCK_BITS worth of bits per
+ * MAX_ORDER_NR_PAGES block, round what is now in bits up to the nearest
+ * unsigned long, and return the result in bytes.
+ */
+static unsigned long __init usemap_size(unsigned long zonesize)
+{
+	unsigned long usemapsize;
+
+	usemapsize = roundup(zonesize, MAX_ORDER_NR_PAGES);
+	usemapsize = usemapsize >> (MAX_ORDER-1);
+	usemapsize *= NR_PAGEBLOCK_BITS;
+	usemapsize = roundup(usemapsize, 8 * sizeof(unsigned long));
+
+	return usemapsize / 8;
+}
+
+static void __init setup_usemap(struct pglist_data *pgdat,
+				struct zone *zone, unsigned long zonesize)
+{
+	unsigned long usemapsize = usemap_size(zonesize);
+	zone->pageblock_flags = NULL;
+	if (usemapsize) {
+		zone->pageblock_flags = alloc_bootmem_node(pgdat, usemapsize);
+		memset(zone->pageblock_flags, 0, usemapsize);
+	}
+}
+#else
+static inline void setup_usemap(struct pglist_data *pgdat,
+				struct zone *zone, unsigned long zonesize) {}
+#endif /* CONFIG_SPARSEMEM */
+
 /*
  * Set up the zone data structures:
  *   - mark all pages reserved
@@ -2795,6 +2830,7 @@ static void __meminit free_area_init_cor
 		if (!size)
 			continue;
 
+		setup_usemap(pgdat, zone, size);
 		ret = init_currently_empty_zone(zone, zone_start_pfn,
 						size, MEMMAP_EARLY);
 		BUG_ON(ret);
@@ -3512,4 +3548,81 @@ EXPORT_SYMBOL(pfn_to_page);
 EXPORT_SYMBOL(page_to_pfn);
 #endif /* CONFIG_OUT_OF_LINE_PFN_TO_PAGE */
 
+/* Return a pointer to the bitmap storing bits affecting a block of pages */
+static inline unsigned long *get_pageblock_bitmap(struct zone *zone,
+							unsigned long pfn)
+{
+#ifdef CONFIG_SPARSEMEM
+	unsigned long blockpfn;
+	blockpfn = pfn & ~(MAX_ORDER_NR_PAGES - 1);
+	return __pfn_to_section(blockpfn)->pageblock_flags;
+#else
+	return zone->pageblock_flags;
+#endif /* CONFIG_SPARSEMEM */
+}
+
+static inline int pfn_to_bitidx(struct zone *zone, unsigned long pfn)
+{
+#ifdef CONFIG_SPARSEMEM
+	pfn &= (PAGES_PER_SECTION-1);
+	return (pfn >> (MAX_ORDER-1)) * NR_PAGEBLOCK_BITS;
+#else
+	pfn = pfn - zone->zone_start_pfn;
+	return (pfn >> (MAX_ORDER-1)) * NR_PAGEBLOCK_BITS;
+#endif /* CONFIG_SPARSEMEM */
+}
 
+/**
+ * get_pageblock_flags_group - Return the requested group of flags for the MAX_ORDER_NR_PAGES block of pages
+ * @page: The page within the block of interest
+ * @start_bitidx: The first bit of interest to retrieve
+ * @end_bitidx: The last bit of interest
+ * returns pageblock_bits flags
+ */
+unsigned long get_pageblock_flags_group(struct page *page,
+					int start_bitidx, int end_bitidx)
+{
+	struct zone *zone;
+	unsigned long *bitmap;
+	unsigned long pfn, bitidx;
+	unsigned long flags = 0;
+	unsigned long value = 1;
+
+	zone = page_zone(page);
+	pfn = page_to_pfn(page);
+	bitmap = get_pageblock_bitmap(zone, pfn);
+	bitidx = pfn_to_bitidx(zone, pfn);
+
+	for (; start_bitidx <= end_bitidx; start_bitidx++, value <<= 1)
+		if (test_bit(bitidx + start_bitidx, bitmap))
+			flags |= value;
+
+	return flags;
+}
+
+/**
+ * set_pageblock_flags_group - Set the requested group of flags for a MAX_ORDER_NR_PAGES block of pages
+ * @page: The page within the block of interest
+ * @start_bitidx: The first bit of interest
+ * @end_bitidx: The last bit of interest
+ * @flags: The flags to set
+ */
+void set_pageblock_flags_group(struct page *page, unsigned long flags,
+					int start_bitidx, int end_bitidx)
+{
+	struct zone *zone;
+	unsigned long *bitmap;
+	unsigned long pfn, bitidx;
+	unsigned long value = 1;
+
+	zone = page_zone(page);
+	pfn = page_to_pfn(page);
+	bitmap = get_pageblock_bitmap(zone, pfn);
+	bitidx = pfn_to_bitidx(zone, pfn);
+
+	for (; start_bitidx <= end_bitidx; start_bitidx++, value <<= 1)
+		if (flags & value)
+			__set_bit(bitidx + start_bitidx, bitmap);
+		else
+			__clear_bit(bitidx + start_bitidx, bitmap);
+}


* [PATCH 2/12] Add __GFP_MOVABLE for callers to flag allocations from high memory that may be migrated
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
  2007-03-01 10:02 ` [PATCH 1/12] Add a bitmap that is used to track flags affecting a block of pages Mel Gorman
@ 2007-03-01 10:03 ` Mel Gorman
  2007-03-01 10:03 ` [PATCH 3/12] Add __GFP_MOVABLE for callers to flag allocations from low " Mel Gorman
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:03 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


It is often known at allocation time whether a page may be migrated or
not. This patch adds a flag called __GFP_MOVABLE and a new mask called
GFP_HIGH_MOVABLE. Allocations using __GFP_MOVABLE can either be migrated
using the page migration mechanism or reclaimed by syncing with backing
storage and discarding.

An API function called alloc_zeroed_user_highpage_movable(), very similar
to alloc_zeroed_user_highpage(), is added for __GFP_MOVABLE allocations.
The flags used by alloc_zeroed_user_highpage() are not changed because that
would change the semantics of an existing API. After this patch is applied
there are no in-kernel users of alloc_zeroed_user_highpage(), so it should
probably be marked deprecated if this patch is merged.
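
As a condensed view, the anonymous-fault and copy-on-write conversions in
the mm/memory.c hunks below become:

	/* Zero-filled anonymous page: owned by userspace, can migrate */
	page = alloc_zeroed_user_highpage_movable(vma, address);

	/* Copy-on-write target: likewise flagged movable */
	new_page = alloc_page_vma(GFP_HIGH_MOVABLE, vma, address);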

Note that this patch includes a minor cleanup to the use of __GFP_ZERO
in shmem.c to keep all flag modifications to inode->mapping in the
shmem_dir_alloc() helper function. This clean-up suggestion is courtesy of
Hugh Dickins.

Additional credit goes to Christoph Lameter and Linus Torvalds for shaping
the concept. Credit to Hugh Dickins for catching issues with the shmem swap
vector and ramfs allocations.

[hugh@veritas.com: __GFP_ZERO cleanup]

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 fs/inode.c                |   10 ++++++--
 fs/ramfs/inode.c          |    1 
 include/asm-alpha/page.h  |    3 +-
 include/asm-cris/page.h   |    3 +-
 include/asm-h8300/page.h  |    3 +-
 include/asm-i386/page.h   |    3 +-
 include/asm-ia64/page.h   |    5 ++--
 include/asm-m32r/page.h   |    3 +-
 include/asm-s390/page.h   |    3 +-
 include/asm-x86_64/page.h |    3 +-
 include/linux/gfp.h       |   10 +++++++-
 include/linux/highmem.h   |   51 +++++++++++++++++++++++++++++++++++++++--
 mm/memory.c               |    8 +++---
 mm/mempolicy.c            |    4 +--
 mm/migrate.c              |    2 -
 mm/shmem.c                |    7 ++++-
 mm/swap_prefetch.c        |    2 -
 mm/swap_state.c           |    2 -
 18 files changed, 98 insertions(+), 25 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/fs/inode.c linux-2.6.20-mm2-002_clustering_flags/fs/inode.c
--- linux-2.6.20-mm2-001_pageblock_bits/fs/inode.c	2007-02-19 01:21:38.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/fs/inode.c	2007-02-20 18:25:33.000000000 +0000
@@ -145,7 +145,7 @@ static struct inode *alloc_inode(struct 
 		mapping->a_ops = &empty_aops;
  		mapping->host = inode;
 		mapping->flags = 0;
-		mapping_set_gfp_mask(mapping, GFP_HIGHUSER);
+		mapping_set_gfp_mask(mapping, GFP_HIGH_MOVABLE);
 		mapping->assoc_mapping = NULL;
 		mapping->backing_dev_info = &default_backing_dev_info;
 
@@ -521,7 +521,13 @@ repeat:
  *	new_inode 	- obtain an inode
  *	@sb: superblock
  *
- *	Allocates a new inode for given superblock.
+ *	Allocates a new inode for given superblock. The default gfp_mask
+ *	for allocations related to inode->i_mapping is GFP_HIGH_MOVABLE. If
+ *	HIGHMEM pages are unsuitable or it is known that pages allocated
+ *	for the page cache are not reclaimable or migratable,
+ *	mapping_set_gfp_mask() must be called with suitable flags on the
+ *	newly created inode's mapping
+ *
  */
 struct inode *new_inode(struct super_block *sb)
 {
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/fs/ramfs/inode.c linux-2.6.20-mm2-002_clustering_flags/fs/ramfs/inode.c
--- linux-2.6.20-mm2-001_pageblock_bits/fs/ramfs/inode.c	2007-02-19 01:21:42.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/fs/ramfs/inode.c	2007-02-20 18:25:33.000000000 +0000
@@ -61,6 +61,7 @@ struct inode *ramfs_get_inode(struct sup
 		inode->i_blocks = 0;
 		inode->i_mapping->a_ops = &ramfs_aops;
 		inode->i_mapping->backing_dev_info = &ramfs_backing_dev_info;
+		mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
 		inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
 		switch (mode & S_IFMT) {
 		default:
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/include/asm-alpha/page.h linux-2.6.20-mm2-002_clustering_flags/include/asm-alpha/page.h
--- linux-2.6.20-mm2-001_pageblock_bits/include/asm-alpha/page.h	2007-02-04 18:44:54.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/include/asm-alpha/page.h	2007-02-20 18:25:33.000000000 +0000
@@ -17,7 +17,8 @@
 extern void clear_page(void *page);
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 
-#define alloc_zeroed_user_highpage(vma, vaddr) alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vma, vmaddr)
+#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
+	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vmaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 
 extern void copy_page(void * _to, void * _from);
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/include/asm-cris/page.h linux-2.6.20-mm2-002_clustering_flags/include/asm-cris/page.h
--- linux-2.6.20-mm2-001_pageblock_bits/include/asm-cris/page.h	2007-02-04 18:44:54.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/include/asm-cris/page.h	2007-02-20 18:25:33.000000000 +0000
@@ -20,7 +20,8 @@
 #define clear_user_page(page, vaddr, pg)    clear_page(page)
 #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
 
-#define alloc_zeroed_user_highpage(vma, vaddr) alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vma, vaddr)
+#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
+	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 
 /*
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/include/asm-h8300/page.h linux-2.6.20-mm2-002_clustering_flags/include/asm-h8300/page.h
--- linux-2.6.20-mm2-001_pageblock_bits/include/asm-h8300/page.h	2007-02-04 18:44:54.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/include/asm-h8300/page.h	2007-02-20 18:25:33.000000000 +0000
@@ -22,7 +22,8 @@
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
 
-#define alloc_zeroed_user_highpage(vma, vaddr) alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vma, vaddr)
+#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
+	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 
 /*
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/include/asm-i386/page.h linux-2.6.20-mm2-002_clustering_flags/include/asm-i386/page.h
--- linux-2.6.20-mm2-001_pageblock_bits/include/asm-i386/page.h	2007-02-19 01:21:54.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/include/asm-i386/page.h	2007-02-20 18:25:33.000000000 +0000
@@ -34,7 +34,8 @@
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
 
-#define alloc_zeroed_user_highpage(vma, vaddr) alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vma, vaddr)
+#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
+	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 
 /*
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/include/asm-ia64/page.h linux-2.6.20-mm2-002_clustering_flags/include/asm-ia64/page.h
--- linux-2.6.20-mm2-001_pageblock_bits/include/asm-ia64/page.h	2007-02-04 18:44:54.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/include/asm-ia64/page.h	2007-02-20 18:25:33.000000000 +0000
@@ -87,9 +87,10 @@ do {						\
 } while (0)
 
 
-#define alloc_zeroed_user_highpage(vma, vaddr) \
+#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
 ({						\
-	struct page *page = alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vma, vaddr); \
+	struct page *page = alloc_page_vma(
+		GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr); \
 	if (page)				\
  		flush_dcache_page(page);	\
 	page;					\
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/include/asm-m32r/page.h linux-2.6.20-mm2-002_clustering_flags/include/asm-m32r/page.h
--- linux-2.6.20-mm2-001_pageblock_bits/include/asm-m32r/page.h	2007-02-19 01:21:54.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/include/asm-m32r/page.h	2007-02-20 18:25:33.000000000 +0000
@@ -15,7 +15,8 @@ extern void copy_page(void *to, void *fr
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
 
-#define alloc_zeroed_user_highpage(vma, vaddr) alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vma, vaddr)
+#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
+	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 
 /*
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/include/asm-s390/page.h linux-2.6.20-mm2-002_clustering_flags/include/asm-s390/page.h
--- linux-2.6.20-mm2-001_pageblock_bits/include/asm-s390/page.h	2007-02-04 18:44:54.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/include/asm-s390/page.h	2007-02-20 18:25:33.000000000 +0000
@@ -64,7 +64,8 @@ static inline void copy_page(void *to, v
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
 
-#define alloc_zeroed_user_highpage(vma, vaddr) alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vma, vaddr)
+#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
+	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 
 /*
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/include/asm-x86_64/page.h linux-2.6.20-mm2-002_clustering_flags/include/asm-x86_64/page.h
--- linux-2.6.20-mm2-001_pageblock_bits/include/asm-x86_64/page.h	2007-02-04 18:44:54.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/include/asm-x86_64/page.h	2007-02-20 18:25:33.000000000 +0000
@@ -51,7 +51,8 @@ void copy_page(void *, void *);
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
 
-#define alloc_zeroed_user_highpage(vma, vaddr) alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vma, vaddr)
+#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
+	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 /*
  * These are used to make use of C type-checking..
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/include/linux/gfp.h linux-2.6.20-mm2-002_clustering_flags/include/linux/gfp.h
--- linux-2.6.20-mm2-001_pageblock_bits/include/linux/gfp.h	2007-02-19 01:22:30.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/include/linux/gfp.h	2007-02-20 18:25:33.000000000 +0000
@@ -30,6 +30,9 @@ struct vm_area_struct;
  * cannot handle allocation failures.
  *
  * __GFP_NORETRY: The VM implementation must not retry indefinitely.
+ *
+ * __GFP_MOVABLE: Flag that this page will be movable by the page migration
+ * mechanism or reclaimed
  */
 #define __GFP_WAIT	((__force gfp_t)0x10u)	/* Can wait and reschedule? */
 #define __GFP_HIGH	((__force gfp_t)0x20u)	/* Should access emergency pools? */
@@ -46,6 +49,7 @@ struct vm_area_struct;
 #define __GFP_NOMEMALLOC ((__force gfp_t)0x10000u) /* Don't use emergency reserves */
 #define __GFP_HARDWALL   ((__force gfp_t)0x20000u) /* Enforce hardwall cpuset memory allocs */
 #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
+#define __GFP_MOVABLE	((__force gfp_t)0x80000u) /* Page is movable */
 
 #define __GFP_BITS_SHIFT 20	/* Room for 20 __GFP_FOO bits */
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
@@ -54,7 +58,8 @@ struct vm_area_struct;
 #define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
 			__GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
 			__GFP_NOFAIL|__GFP_NORETRY|__GFP_NO_GROW|__GFP_COMP| \
-			__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE)
+			__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE| \
+			__GFP_MOVABLE)
 
 /* This equals 0, but use constants in case they ever change */
 #define GFP_NOWAIT	(GFP_ATOMIC & ~__GFP_HIGH)
@@ -66,6 +71,9 @@ struct vm_area_struct;
 #define GFP_USER	(__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
 #define GFP_HIGHUSER	(__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL | \
 			 __GFP_HIGHMEM)
+#define GFP_HIGH_MOVABLE	(__GFP_WAIT | __GFP_IO | __GFP_FS | \
+				 __GFP_HARDWALL | __GFP_HIGHMEM | \
+				 __GFP_MOVABLE)
 
 #ifdef CONFIG_NUMA
 #define GFP_THISNODE	(__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/include/linux/highmem.h linux-2.6.20-mm2-002_clustering_flags/include/linux/highmem.h
--- linux-2.6.20-mm2-001_pageblock_bits/include/linux/highmem.h	2007-02-04 18:44:54.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/include/linux/highmem.h	2007-02-20 18:25:33.000000000 +0000
@@ -62,10 +62,27 @@ static inline void clear_user_highpage(s
 }
 
 #ifndef __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
+/**
+ * __alloc_zeroed_user_highpage - Allocate a zeroed HIGHMEM page for a VMA with caller-specified movable GFP flags
+ * @movableflags: The GFP flags related to the page's future ability to move like __GFP_MOVABLE
+ * @vma: The VMA the page is to be allocated for
+ * @vaddr: The virtual address the page will be inserted into
+ *
+ * This function will allocate a page for a VMA but the caller is expected
+ * to specify via movableflags whether the page will be movable in the
+ * future or not
+ *
+ * An architecture may override this function by defining
+ * __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE and providing their own
+ * implementation.
+ */
 static inline struct page *
-alloc_zeroed_user_highpage(struct vm_area_struct *vma, unsigned long vaddr)
+__alloc_zeroed_user_highpage(gfp_t movableflags,
+			struct vm_area_struct *vma,
+			unsigned long vaddr)
 {
-	struct page *page = alloc_page_vma(GFP_HIGHUSER, vma, vaddr);
+	struct page *page = alloc_page_vma(GFP_HIGHUSER | movableflags,
+			vma, vaddr);
 
 	if (page)
 		clear_user_highpage(page, vaddr);
@@ -74,6 +91,36 @@ alloc_zeroed_user_highpage(struct vm_are
 }
 #endif
 
+/**
+ * alloc_zeroed_user_highpage - Allocate a zeroed HIGHMEM page for a VMA
+ * @vma: The VMA the page is to be allocated for
+ * @vaddr: The virtual address the page will be inserted into
+ *
+ * This function will allocate a page for a VMA that the caller knows will
+ * not be able to move in the future using move_pages() or reclaim. If it
+ * is known that the page can move, use alloc_zeroed_user_highpage_movable
+ */
+static inline struct page *
+alloc_zeroed_user_highpage(struct vm_area_struct *vma, unsigned long vaddr)
+{
+	return __alloc_zeroed_user_highpage(0, vma, vaddr);
+}
+
+/**
+ * alloc_zeroed_user_highpage_movable - Allocate a zeroed HIGHMEM page for a VMA that the caller knows can move
+ * @vma: The VMA the page is to be allocated for
+ * @vaddr: The virtual address the page will be inserted into
+ *
+ * This function will allocate a page for a VMA that the caller knows will
+ * be able to migrate in the future using move_pages(), or be reclaimed
+ */
+static inline struct page *
+alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
+					unsigned long vaddr)
+{
+	return __alloc_zeroed_user_highpage(__GFP_MOVABLE, vma, vaddr);
+}
+
 static inline void clear_highpage(struct page *page)
 {
 	void *kaddr = kmap_atomic(page, KM_USER0);
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/mm/memory.c linux-2.6.20-mm2-002_clustering_flags/mm/memory.c
--- linux-2.6.20-mm2-001_pageblock_bits/mm/memory.c	2007-02-19 01:22:35.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/mm/memory.c	2007-02-20 18:25:33.000000000 +0000
@@ -1761,11 +1761,11 @@ gotten:
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
 	if (old_page == ZERO_PAGE(address)) {
-		new_page = alloc_zeroed_user_highpage(vma, address);
+		new_page = alloc_zeroed_user_highpage_movable(vma, address);
 		if (!new_page)
 			goto oom;
 	} else {
-		new_page = alloc_page_vma(GFP_HIGHUSER, vma, address);
+		new_page = alloc_page_vma(GFP_HIGH_MOVABLE, vma, address);
 		if (!new_page)
 			goto oom;
 		cow_user_page(new_page, old_page, address, vma);
@@ -2283,7 +2283,7 @@ static int do_anonymous_page(struct mm_s
 
 		if (unlikely(anon_vma_prepare(vma)))
 			goto oom;
-		page = alloc_zeroed_user_highpage(vma, address);
+		page = alloc_zeroed_user_highpage_movable(vma, address);
 		if (!page)
 			goto oom;
 
@@ -2384,7 +2384,7 @@ retry:
 
 			if (unlikely(anon_vma_prepare(vma)))
 				goto oom;
-			page = alloc_page_vma(GFP_HIGHUSER, vma, address);
+			page = alloc_page_vma(GFP_HIGH_MOVABLE, vma, address);
 			if (!page)
 				goto oom;
 			copy_user_highpage(page, new_page, address, vma);
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/mm/mempolicy.c linux-2.6.20-mm2-002_clustering_flags/mm/mempolicy.c
--- linux-2.6.20-mm2-001_pageblock_bits/mm/mempolicy.c	2007-02-19 01:22:35.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/mm/mempolicy.c	2007-02-20 18:25:33.000000000 +0000
@@ -603,7 +603,7 @@ static void migrate_page_add(struct page
 
 static struct page *new_node_page(struct page *page, unsigned long node, int **x)
 {
-	return alloc_pages_node(node, GFP_HIGHUSER, 0);
+	return alloc_pages_node(node, GFP_HIGH_MOVABLE, 0);
 }
 
 /*
@@ -719,7 +719,7 @@ static struct page *new_vma_page(struct 
 {
 	struct vm_area_struct *vma = (struct vm_area_struct *)private;
 
-	return alloc_page_vma(GFP_HIGHUSER, vma, page_address_in_vma(page, vma));
+	return alloc_page_vma(GFP_HIGH_MOVABLE, vma, page_address_in_vma(page, vma));
 }
 #else
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/mm/migrate.c linux-2.6.20-mm2-002_clustering_flags/mm/migrate.c
--- linux-2.6.20-mm2-001_pageblock_bits/mm/migrate.c	2007-02-19 01:22:35.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/mm/migrate.c	2007-02-20 18:25:33.000000000 +0000
@@ -755,7 +755,7 @@ static struct page *new_page_node(struct
 
 	*result = &pm->status;
 
-	return alloc_pages_node(pm->node, GFP_HIGHUSER | GFP_THISNODE, 0);
+	return alloc_pages_node(pm->node, GFP_HIGH_MOVABLE | GFP_THISNODE, 0);
 }
 
 /*
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/mm/shmem.c linux-2.6.20-mm2-002_clustering_flags/mm/shmem.c
--- linux-2.6.20-mm2-001_pageblock_bits/mm/shmem.c	2007-02-19 01:22:35.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/mm/shmem.c	2007-02-20 18:25:33.000000000 +0000
@@ -93,8 +93,11 @@ static inline struct page *shmem_dir_all
 	 * The above definition of ENTRIES_PER_PAGE, and the use of
 	 * BLOCKS_PER_PAGE on indirect pages, assume PAGE_CACHE_SIZE:
 	 * might be reconsidered if it ever diverges from PAGE_SIZE.
+	 *
+	 * __GFP_MOVABLE is masked out as swap vectors cannot move
 	 */
-	return alloc_pages(gfp_mask, PAGE_CACHE_SHIFT-PAGE_SHIFT);
+	return alloc_pages((gfp_mask & ~__GFP_MOVABLE) | __GFP_ZERO,
+				PAGE_CACHE_SHIFT-PAGE_SHIFT);
 }
 
 static inline void shmem_dir_free(struct page *page)
@@ -371,7 +374,7 @@ static swp_entry_t *shmem_swp_alloc(stru
 		}
 
 		spin_unlock(&info->lock);
-		page = shmem_dir_alloc(mapping_gfp_mask(inode->i_mapping) | __GFP_ZERO);
+		page = shmem_dir_alloc(mapping_gfp_mask(inode->i_mapping));
 		if (page)
 			set_page_private(page, 0);
 		spin_lock(&info->lock);
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/mm/swap_prefetch.c linux-2.6.20-mm2-002_clustering_flags/mm/swap_prefetch.c
--- linux-2.6.20-mm2-001_pageblock_bits/mm/swap_prefetch.c	2007-02-19 01:22:35.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/mm/swap_prefetch.c	2007-02-20 18:25:33.000000000 +0000
@@ -204,7 +204,7 @@ static enum trickle_return trickle_swap_
 	 * Get a new page to read from swap. We have already checked the
 	 * watermarks so __alloc_pages will not call on reclaim.
 	 */
-	page = alloc_pages_node(node, GFP_HIGHUSER & ~__GFP_WAIT, 0);
+	page = alloc_pages_node(node, GFP_HIGH_MOVABLE & ~__GFP_WAIT, 0);
 	if (unlikely(!page)) {
 		ret = TRICKLE_DELAY;
 		goto out;
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-001_pageblock_bits/mm/swap_state.c linux-2.6.20-mm2-002_clustering_flags/mm/swap_state.c
--- linux-2.6.20-mm2-001_pageblock_bits/mm/swap_state.c	2007-02-19 01:22:35.000000000 +0000
+++ linux-2.6.20-mm2-002_clustering_flags/mm/swap_state.c	2007-02-20 18:25:33.000000000 +0000
@@ -340,7 +340,7 @@ struct page *read_swap_cache_async(swp_e
 		 * Get a new page to read into from swap.
 		 */
 		if (!new_page) {
-			new_page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
+			new_page = alloc_page_vma(GFP_HIGH_MOVABLE, vma, addr);
 			if (!new_page)
 				break;		/* Out of memory */
 		}


* [PATCH 3/12] Add __GFP_MOVABLE for callers to flag allocations from low memory that may be migrated
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
  2007-03-01 10:02 ` [PATCH 1/12] Add a bitmap that is used to track flags affecting a block of pages Mel Gorman
  2007-03-01 10:03 ` [PATCH 2/12] Add __GFP_MOVABLE for callers to flag allocations from high memory that may be migrated Mel Gorman
@ 2007-03-01 10:03 ` Mel Gorman
  2007-03-01 10:03 ` [PATCH 4/12] Split the free lists for movable and unmovable allocations Mel Gorman
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:03 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


This patch flags the allocations from low memory that may be migrated.

A GFP_USER_MOVABLE similar to GFP_HIGH_MOVABLE is not provided in this patch
because it would only be used once.  This patch uses __GFP_MOVABLE twice
for a GFP_USER and a GFP_NOIO allocation. There is little point defining
GFP_*_MOVABLE for one use unless people feel it would help self-documentation.
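
Both uses reduce to one-line changes, visible in the hunks below:

	/* bdget(): block device page cache may be migrated or reclaimed */
	mapping_set_gfp_mask(&inode->i_data, GFP_USER|__GFP_MOVABLE);

	/* submit_bh(): flag the bio allocation movable */
	bio = bio_alloc(GFP_NOIO|__GFP_MOVABLE, 1);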

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 block_dev.c |    2 +-
 buffer.c    |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-002_clustering_flags/fs/block_dev.c linux-2.6.20-mm2-003_additional_flags/fs/block_dev.c
--- linux-2.6.20-mm2-002_clustering_flags/fs/block_dev.c	2007-02-19 01:21:36.000000000 +0000
+++ linux-2.6.20-mm2-003_additional_flags/fs/block_dev.c	2007-02-20 18:27:38.000000000 +0000
@@ -576,7 +576,7 @@ struct block_device *bdget(dev_t dev)
 		inode->i_rdev = dev;
 		inode->i_bdev = bdev;
 		inode->i_data.a_ops = &def_blk_aops;
-		mapping_set_gfp_mask(&inode->i_data, GFP_USER);
+		mapping_set_gfp_mask(&inode->i_data, GFP_USER|__GFP_MOVABLE);
 		inode->i_data.backing_dev_info = &default_backing_dev_info;
 		spin_lock(&bdev_lock);
 		list_add(&bdev->bd_list, &all_bdevs);
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-002_clustering_flags/fs/buffer.c linux-2.6.20-mm2-003_additional_flags/fs/buffer.c
--- linux-2.6.20-mm2-002_clustering_flags/fs/buffer.c	2007-02-19 01:21:36.000000000 +0000
+++ linux-2.6.20-mm2-003_additional_flags/fs/buffer.c	2007-02-20 18:27:38.000000000 +0000
@@ -2652,7 +2652,7 @@ int submit_bh(int rw, struct buffer_head
 	 * from here on down, it's all bio -- do the initial mapping,
 	 * submit_bio -> generic_make_request may further map this bio around
 	 */
-	bio = bio_alloc(GFP_NOIO, 1);
+	bio = bio_alloc(GFP_NOIO|__GFP_MOVABLE, 1);
 
 	bio->bi_sector = bh->b_blocknr * (bh->b_size >> 9);
 	bio->bi_bdev = bh->b_bdev;


* [PATCH 4/12] Split the free lists for movable and unmovable allocations
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
                   ` (2 preceding siblings ...)
  2007-03-01 10:03 ` [PATCH 3/12] Add __GFP_MOVABLE for callers to flag allocations from low " Mel Gorman
@ 2007-03-01 10:03 ` Mel Gorman
  2007-03-01 10:04 ` [PATCH 5/12] Choose pages from the per-cpu list based on migration type Mel Gorman
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:03 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


This patch adds the core of the fragmentation reduction strategy. It works
by grouping pages together based on their ability to migrate or be
reclaimed. To do this, each free list in zone->free_area is broken up into
MIGRATE_TYPES separate lists.
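
A condensed view of the resulting lookups, both implemented in the
page_alloc.c hunks below:

	/* Freeing: the block's migrate type selects the free list */
	migratetype = get_pageblock_migratetype(page);
	list_add(&page->lru, &zone->free_area[order].free_list[migratetype]);

	/* Allocating: the caller's GFP flags select the free list */
	migratetype = gfpflags_to_migratetype(gfp_flags);
	page = __rmqueue(zone, order, migratetype);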

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 include/linux/mmzone.h          |   10 ++
 include/linux/pageblock-flags.h |    1 
 mm/page_alloc.c                 |  140 +++++++++++++++++++++++++++++------
 3 files changed, 127 insertions(+), 24 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-003_additional_flags/include/linux/mmzone.h linux-2.6.20-mm2-004_clustering_core/include/linux/mmzone.h
--- linux-2.6.20-mm2-003_additional_flags/include/linux/mmzone.h	2007-02-20 18:23:25.000000000 +0000
+++ linux-2.6.20-mm2-004_clustering_core/include/linux/mmzone.h	2007-02-20 18:29:42.000000000 +0000
@@ -25,8 +25,16 @@
 #endif
 #define MAX_ORDER_NR_PAGES (1 << (MAX_ORDER - 1))
 
+#define MIGRATE_UNMOVABLE     0
+#define MIGRATE_MOVABLE       1
+#define MIGRATE_TYPES         2
+
+#define for_each_migratetype_order(order, type) \
+	for (order = 0; order < MAX_ORDER; order++) \
+		for (type = 0; type < MIGRATE_TYPES; type++)
+
 struct free_area {
-	struct list_head	free_list;
+	struct list_head	free_list[MIGRATE_TYPES];
 	unsigned long		nr_free;
 };
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-003_additional_flags/include/linux/pageblock-flags.h linux-2.6.20-mm2-004_clustering_core/include/linux/pageblock-flags.h
--- linux-2.6.20-mm2-003_additional_flags/include/linux/pageblock-flags.h	2007-02-20 19:44:47.000000000 +0000
+++ linux-2.6.20-mm2-004_clustering_core/include/linux/pageblock-flags.h	2007-02-20 19:29:13.000000000 +0000
@@ -31,6 +31,7 @@
 
 /* Bit indices that affect a whole block of pages */
 enum pageblock_bits {
+	PB_range(PB_migrate, 1), /* 1 bit required for migrate types */
 	NR_PAGEBLOCK_BITS
 };
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-003_additional_flags/mm/page_alloc.c linux-2.6.20-mm2-004_clustering_core/mm/page_alloc.c
--- linux-2.6.20-mm2-003_additional_flags/mm/page_alloc.c	2007-02-20 18:23:25.000000000 +0000
+++ linux-2.6.20-mm2-004_clustering_core/mm/page_alloc.c	2007-02-20 18:29:42.000000000 +0000
@@ -136,6 +136,22 @@ static unsigned long __initdata dma_rese
 #endif /* CONFIG_MEMORY_HOTPLUG_RESERVE */
 #endif /* CONFIG_ARCH_POPULATES_NODE_MAP */
 
+static inline int get_pageblock_migratetype(struct page *page)
+{
+	return get_pageblock_flags_group(page, PB_migrate, PB_migrate_end);
+}
+
+static void set_pageblock_migratetype(struct page *page, int migratetype)
+{
+	set_pageblock_flags_group(page, (unsigned long)migratetype,
+					PB_migrate, PB_migrate_end);
+}
+
+static inline int gfpflags_to_migratetype(gfp_t gfp_flags)
+{
+	return ((gfp_flags & __GFP_MOVABLE) != 0);
+}
+
 #ifdef CONFIG_DEBUG_VM
 static int page_outside_zone_boundaries(struct zone *zone, struct page *page)
 {
@@ -406,6 +422,7 @@ static inline void __free_one_page(struc
 {
 	unsigned long page_idx;
 	int order_size = 1 << order;
+	int migratetype = get_pageblock_migratetype(page);
 
 	if (unlikely(PageCompound(page)))
 		destroy_compound_page(page, order);
@@ -418,7 +435,6 @@ static inline void __free_one_page(struc
 	__mod_zone_page_state(zone, NR_FREE_PAGES, order_size);
 	while (order < MAX_ORDER-1) {
 		unsigned long combined_idx;
-		struct free_area *area;
 		struct page *buddy;
 
 		buddy = __page_find_buddy(page, page_idx, order);
@@ -426,8 +442,7 @@ static inline void __free_one_page(struc
 			break;		/* Move the buddy up one level. */
 
 		list_del(&buddy->lru);
-		area = zone->free_area + order;
-		area->nr_free--;
+		zone->free_area[order].nr_free--;
 		rmv_page_order(buddy);
 		combined_idx = __find_combined_index(page_idx, order);
 		page = page + (combined_idx - page_idx);
@@ -435,7 +450,8 @@ static inline void __free_one_page(struc
 		order++;
 	}
 	set_page_order(page, order);
-	list_add(&page->lru, &zone->free_area[order].free_list);
+	list_add(&page->lru,
+		&zone->free_area[order].free_list[migratetype]);
 	zone->free_area[order].nr_free++;
 }
 
@@ -575,7 +591,8 @@ void fastcall __init __free_pages_bootme
  * -- wli
  */
 static inline void expand(struct zone *zone, struct page *page,
- 	int low, int high, struct free_area *area)
+	int low, int high, struct free_area *area,
+	int migratetype)
 {
 	unsigned long size = 1 << high;
 
@@ -584,7 +601,7 @@ static inline void expand(struct zone *z
 		high--;
 		size >>= 1;
 		VM_BUG_ON(bad_range(zone, &page[size]));
-		list_add(&page[size].lru, &area->free_list);
+		list_add(&page[size].lru, &area->free_list[migratetype]);
 		area->nr_free++;
 		set_page_order(&page[size], high);
 	}
@@ -638,31 +655,95 @@ static int prep_new_page(struct page *pa
 	return 0;
 }
 
+/*
+ * This array describes the order in which free lists are fallen back on
+ * when the free lists for the desired migrate type are depleted
+ */
+static int fallbacks[MIGRATE_TYPES][MIGRATE_TYPES-1] = {
+	[MIGRATE_UNMOVABLE] = { MIGRATE_MOVABLE   },
+	[MIGRATE_MOVABLE]   = { MIGRATE_UNMOVABLE },
+};
+
+/* Remove an element from the buddy allocator from the fallback list */
+static struct page *__rmqueue_fallback(struct zone *zone, int order,
+						int start_migratetype)
+{
+	struct free_area * area;
+	int current_order;
+	struct page *page;
+	int migratetype, i;
+
+	/* Find the largest possible block of pages in the other list */
+	for (current_order = MAX_ORDER-1; current_order >= order;
+						--current_order) {
+		for (i = 0; i < MIGRATE_TYPES - 1; i++) {
+			migratetype = fallbacks[start_migratetype][i];
+
+			area = &(zone->free_area[current_order]);
+			if (list_empty(&area->free_list[migratetype]))
+				continue;
+
+			page = list_entry(area->free_list[migratetype].next,
+					struct page, lru);
+			area->nr_free--;
+
+			/*
+			 * If breaking a large block of pages, place the buddies
+			 * on the preferred allocation list
+			 */
+			if (unlikely(current_order >= MAX_ORDER / 2))
+				migratetype = start_migratetype;
+
+			/* Remove the page from the freelists */
+			list_del(&page->lru);
+			rmv_page_order(page);
+			__mod_zone_page_state(zone, NR_FREE_PAGES,
+							-(1UL << order));
+
+			if (current_order == MAX_ORDER - 1)
+				set_pageblock_migratetype(page,
+							start_migratetype);
+
+			expand(zone, page, order, current_order, area, migratetype);
+			return page;
+		}
+	}
+
+	return NULL;
+}
+
 /* 
  * Do the hard work of removing an element from the buddy allocator.
  * Call me with the zone->lock already held.
  */
-static struct page *__rmqueue(struct zone *zone, unsigned int order)
+static struct page *__rmqueue(struct zone *zone, unsigned int order,
+						int migratetype)
 {
 	struct free_area * area;
 	unsigned int current_order;
 	struct page *page;
 
+	/* Find a page of the appropriate size in the preferred list */
 	for (current_order = order; current_order < MAX_ORDER; ++current_order) {
-		area = zone->free_area + current_order;
-		if (list_empty(&area->free_list))
+		area = &(zone->free_area[current_order]);
+		if (list_empty(&area->free_list[migratetype]))
 			continue;
 
-		page = list_entry(area->free_list.next, struct page, lru);
+		page = list_entry(area->free_list[migratetype].next,
+							struct page, lru);
 		list_del(&page->lru);
 		rmv_page_order(page);
 		area->nr_free--;
 		__mod_zone_page_state(zone, NR_FREE_PAGES, - (1UL << order));
-		expand(zone, page, order, current_order, area);
-		return page;
+		expand(zone, page, order, current_order, area, migratetype);
+		goto got_page;
 	}
 
-	return NULL;
+	page = __rmqueue_fallback(zone, order, migratetype);
+
+got_page:
+
+	return page;
 }
 
 /* 
@@ -671,13 +752,14 @@ static struct page *__rmqueue(struct zon
  * Returns the number of new pages which were placed at *list.
  */
 static int rmqueue_bulk(struct zone *zone, unsigned int order, 
-			unsigned long count, struct list_head *list)
+			unsigned long count, struct list_head *list,
+			int migratetype)
 {
 	int i;
 	
 	spin_lock(&zone->lock);
 	for (i = 0; i < count; ++i) {
-		struct page *page = __rmqueue(zone, order);
+		struct page *page = __rmqueue(zone, order, migratetype);
 		if (unlikely(page == NULL))
 			break;
 		list_add_tail(&page->lru, list);
@@ -779,7 +861,7 @@ void mark_free_pages(struct zone *zone)
 {
 	unsigned long pfn, max_zone_pfn;
 	unsigned long flags;
-	int order;
+	int order, t;
 	struct list_head *curr;
 
 	if (!zone->spanned_pages)
@@ -796,14 +878,15 @@ void mark_free_pages(struct zone *zone)
 				ClearPageNosaveFree(page);
 		}
 
-	for (order = MAX_ORDER - 1; order >= 0; --order)
-		list_for_each(curr, &zone->free_area[order].free_list) {
+	for_each_migratetype_order(order, t) {
+		list_for_each(curr, &zone->free_area[order].free_list[t]) {
 			unsigned long i;
 
 			pfn = page_to_pfn(list_entry(curr, struct page, lru));
 			for (i = 0; i < (1UL << order); i++)
 				SetPageNosaveFree(pfn_to_page(pfn + i));
 		}
+	}
 
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
@@ -893,6 +976,7 @@ static struct page *buffered_rmqueue(str
 	struct page *page;
 	int cold = !!(gfp_flags & __GFP_COLD);
 	int cpu;
+	int migratetype = gfpflags_to_migratetype(gfp_flags);
 
 again:
 	cpu  = get_cpu();
@@ -903,7 +987,7 @@ again:
 		local_irq_save(flags);
 		if (!pcp->count) {
 			pcp->count = rmqueue_bulk(zone, 0,
-						pcp->batch, &pcp->list);
+					pcp->batch, &pcp->list, migratetype);
 			if (unlikely(!pcp->count))
 				goto failed;
 		}
@@ -912,7 +996,7 @@ again:
 		pcp->count--;
 	} else {
 		spin_lock_irqsave(&zone->lock, flags);
-		page = __rmqueue(zone, order);
+		page = __rmqueue(zone, order, migratetype);
 		spin_unlock(&zone->lock);
 		if (!page)
 			goto failed;
@@ -2083,6 +2167,16 @@ void __meminit memmap_init_zone(unsigned
 		init_page_count(page);
 		reset_page_mapcount(page);
 		SetPageReserved(page);
+
+		/*
+		 * Mark the block movable so that blocks are reserved for
+		 * movable at startup. This will force kernel allocations
+		 * to reserve their blocks rather than leaking throughout
+		 * the address space during boot when many long-lived
+		 * kernel allocations are made
+		 */
+		set_pageblock_migratetype(page, MIGRATE_MOVABLE);
+
 		INIT_LIST_HEAD(&page->lru);
 #ifdef WANT_PAGE_VIRTUAL
 		/* The shift won't overflow because ZONE_NORMAL is below 4G. */
@@ -2098,9 +2192,9 @@ void __meminit memmap_init_zone(unsigned
 void zone_init_free_lists(struct pglist_data *pgdat, struct zone *zone,
 				unsigned long size)
 {
-	int order;
-	for (order = 0; order < MAX_ORDER ; order++) {
-		INIT_LIST_HEAD(&zone->free_area[order].free_list);
+	int order, t;
+	for_each_migratetype_order(order, t) {
+		INIT_LIST_HEAD(&zone->free_area[order].free_list[t]);
 		zone->free_area[order].nr_free = 0;
 	}
 }


* [PATCH 5/12] Choose pages from the per-cpu list based on migration type
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
                   ` (3 preceding siblings ...)
  2007-03-01 10:03 ` [PATCH 4/12] Split the free lists for movable and unmovable allocations Mel Gorman
@ 2007-03-01 10:04 ` Mel Gorman
  2007-03-01 10:04 ` [PATCH 6/12] Add a configure option to group pages by mobility Mel Gorman
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:04 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


The freelists for each migrate type can slowly become polluted due to the
per-cpu lists. Consider the following sequence of events:

1. A 2^(MAX_ORDER-1) block is reserved for __GFP_MOVABLE pages
2. An order-0 page is allocated from the newly reserved block
3. The page is freed and placed on the per-cpu list
4. alloc_page() is called with GFP_KERNEL as the gfp_mask
5. The per-cpu list is used to satisfy the allocation

This results in a kernel page sitting in the middle of a migratable region.
This patch prevents that leak from occurring by storing the MIGRATE_ type
of the page in page->private. On allocation, only a page of the desired
type will be returned, else more pages will be allocated. This may
temporarily allow a per-cpu list to go over the pcp->high limit, but it
will be corrected on the next free. Care is taken to preserve the hotness
of pages recently freed.
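
In outline, the per-cpu allocation path becomes the following, a condensed
version of the buffered_rmqueue() hunk below:

	/* Scan the per-cpu list for a page of the desired migrate type */
	list_for_each_entry(page, &pcp->list, lru)
		if (page_private(page) == migratetype)
			break;

	/* If the scan found nothing, refill the list with that type */
	if (&page->lru == &pcp->list)
		pcp->count += rmqueue_bulk(zone, 0, pcp->batch,
						&pcp->list, migratetype);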

The additional code is not measurably slower for the workloads we've tested.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 page_alloc.c |   28 ++++++++++++++++++++++++----
 1 files changed, 24 insertions(+), 4 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-004_clustering_core/mm/page_alloc.c linux-2.6.20-mm2-005_percpu/mm/page_alloc.c
--- linux-2.6.20-mm2-004_clustering_core/mm/page_alloc.c	2007-02-20 18:29:42.000000000 +0000
+++ linux-2.6.20-mm2-005_percpu/mm/page_alloc.c	2007-02-20 18:31:48.000000000 +0000
@@ -762,7 +762,8 @@ static int rmqueue_bulk(struct zone *zon
 		struct page *page = __rmqueue(zone, order, migratetype);
 		if (unlikely(page == NULL))
 			break;
-		list_add_tail(&page->lru, list);
+		list_add(&page->lru, list);
+		set_page_private(page, migratetype);
 	}
 	spin_unlock(&zone->lock);
 	return i;
@@ -927,6 +928,7 @@ static void fastcall free_hot_cold_page(
 	local_irq_save(flags);
 	__count_vm_event(PGFREE);
 	list_add(&page->lru, &pcp->list);
+	set_page_private(page, get_pageblock_migratetype(page));
 	pcp->count++;
 	if (pcp->count >= pcp->high) {
 		free_pages_bulk(zone, pcp->batch, &pcp->list, 0);
@@ -991,9 +993,27 @@ again:
 			if (unlikely(!pcp->count))
 				goto failed;
 		}
-		page = list_entry(pcp->list.next, struct page, lru);
-		list_del(&page->lru);
-		pcp->count--;
+		/* Find a page of the appropriate migrate type */
+		list_for_each_entry(page, &pcp->list, lru) {
+			if (page_private(page) == migratetype) {
+				list_del(&page->lru);
+				pcp->count--;
+				break;
+			}
+		}
+
+		/*
+		 * Check if a page of the appropriate migrate type
+		 * was found. If not, allocate more to the pcp list
+		 */
+		if (&page->lru == &pcp->list) {
+			pcp->count += rmqueue_bulk(zone, 0,
+					pcp->batch, &pcp->list, migratetype);
+			page = list_entry(pcp->list.next, struct page, lru);
+			VM_BUG_ON(page_private(page) != migratetype);
+			list_del(&page->lru);
+			pcp->count--;
+		}
 	} else {
 		spin_lock_irqsave(&zone->lock, flags);
 		page = __rmqueue(zone, order, migratetype);


* [PATCH 6/12] Add a configure option to group pages by mobility
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
                   ` (4 preceding siblings ...)
  2007-03-01 10:04 ` [PATCH 5/12] Choose pages from the per-cpu list based on migration type Mel Gorman
@ 2007-03-01 10:04 ` Mel Gorman
  2007-03-01 10:04 ` [PATCH 7/12] Drain per-cpu lists when high-order allocations fail Mel Gorman
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:04 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


The grouping mechanism has some memory overhead and a more complex
allocation path. This patch allows the strategy to be disabled for
small-memory systems, or when a workload is known to suffer because of it.
It also serves to show where the page-grouping strategy interacts with the
standard buddy allocator.
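
For example, a small-memory system disables the strategy entirely with an
ordinary configuration fragment (illustrative):

	# CONFIG_PAGE_GROUP_BY_MOBILITY is not set

in which case MIGRATE_TYPES collapses to a single type and the stub
versions of the helpers in the diff below are compiled instead.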


Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
---

 include/linux/mmzone.h |    6 ++++++
 init/Kconfig           |   13 +++++++++++++
 mm/page_alloc.c        |   31 +++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-005_percpu/include/linux/mmzone.h linux-2.6.20-mm2-006_configurable/include/linux/mmzone.h
--- linux-2.6.20-mm2-005_percpu/include/linux/mmzone.h	2007-02-20 18:29:42.000000000 +0000
+++ linux-2.6.20-mm2-006_configurable/include/linux/mmzone.h	2007-02-20 18:33:41.000000000 +0000
@@ -25,9 +25,15 @@
 #endif
 #define MAX_ORDER_NR_PAGES (1 << (MAX_ORDER - 1))
 
+#ifdef CONFIG_PAGE_GROUP_BY_MOBILITY
 #define MIGRATE_UNMOVABLE     0
 #define MIGRATE_MOVABLE       1
 #define MIGRATE_TYPES         2
+#else
+#define MIGRATE_UNMOVABLE     0
+#define MIGRATE_MOVABLE       0
+#define MIGRATE_TYPES         1
+#endif
 
 #define for_each_migratetype_order(order, type) \
 	for (order = 0; order < MAX_ORDER; order++) \
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-005_percpu/init/Kconfig linux-2.6.20-mm2-006_configurable/init/Kconfig
--- linux-2.6.20-mm2-005_percpu/init/Kconfig	2007-02-19 01:22:33.000000000 +0000
+++ linux-2.6.20-mm2-006_configurable/init/Kconfig	2007-02-20 18:33:41.000000000 +0000
@@ -556,6 +556,19 @@ config SLOB
 	default !SLAB
 	bool
 
+config PAGE_GROUP_BY_MOBILITY
+	bool "Group pages based on their mobility in the page allocator"
+	def_bool y
+	help
+	  The standard allocator will fragment memory over time which means
+	  that high order allocations will fail even if kswapd is running. If
+	  this option is set, the allocator will try and group page types
+	  based on their ability to migrate or reclaim. This is a best effort
+	  attempt at lowering fragmentation which a few workloads care about.
+	  The loss is a more complex allocator that may perform slower. If
+	  you are interested in working with large pages, say Y and set
+	  /proc/sys/vm/min_free_kbytes to 16384. Otherwise say N
+
 menu "Loadable module support"
 
 config MODULES
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-005_percpu/mm/page_alloc.c linux-2.6.20-mm2-006_configurable/mm/page_alloc.c
--- linux-2.6.20-mm2-005_percpu/mm/page_alloc.c	2007-02-20 18:31:48.000000000 +0000
+++ linux-2.6.20-mm2-006_configurable/mm/page_alloc.c	2007-02-20 18:33:41.000000000 +0000
@@ -136,6 +136,7 @@ static unsigned long __initdata dma_rese
 #endif /* CONFIG_MEMORY_HOTPLUG_RESERVE */
 #endif /* CONFIG_ARCH_POPULATES_NODE_MAP */
 
+#ifdef CONFIG_PAGE_GROUP_BY_MOBILITY
 static inline int get_pageblock_migratetype(struct page *page)
 {
 	return get_pageblock_flags_group(page, PB_migrate, PB_migrate_end);
@@ -152,6 +153,22 @@ static inline int gfpflags_to_migratetyp
 	return ((gfp_flags & __GFP_MOVABLE) != 0);
 }
 
+#else
+static inline int get_pageblock_migratetype(struct page *page)
+{
+	return MIGRATE_UNMOVABLE;
+}
+
+static void set_pageblock_migratetype(struct page *page, int migratetype)
+{
+}
+
+static inline int gfpflags_to_migratetype(gfp_t gfp_flags)
+{
+	return MIGRATE_UNMOVABLE;
+}
+#endif /* CONFIG_PAGE_GROUP_BY_MOBILITY */
+
 #ifdef CONFIG_DEBUG_VM
 static int page_outside_zone_boundaries(struct zone *zone, struct page *page)
 {
@@ -655,6 +672,7 @@ static int prep_new_page(struct page *pa
 	return 0;
 }
 
+#ifdef CONFIG_PAGE_GROUP_BY_MOBILITY
 /*
  * This array describes the order lists are fallen back to when
  * the free lists for the desirable migrate type are depleted
@@ -711,6 +729,13 @@ static struct page *__rmqueue_fallback(s
 
 	return NULL;
 }
+#else
+static struct page *__rmqueue_fallback(struct zone *zone, int order,
+						int start_migratetype)
+{
+	return NULL;
+}
+#endif /* CONFIG_PAGE_GROUP_BY_MOBILITY */
 
 /* 
  * Do the hard work of removing an element from the buddy allocator.
@@ -993,6 +1018,7 @@ again:
 			if (unlikely(!pcp->count))
 				goto failed;
 		}
+#ifdef CONFIG_PAGE_GROUP_BY_MOBILITY
 		/* Find a page of the appropriate migrate type */
 		list_for_each_entry(page, &pcp->list, lru) {
 			if (page_private(page) == migratetype) {
@@ -1014,6 +1040,11 @@ again:
 			list_del(&page->lru);
 			pcp->count--;
 		}
+#else
+		page = list_entry(pcp->list.next, struct page, lru);
+		list_del(&page->lru);
+		pcp->count--;
+#endif /* CONFIG_PAGE_GROUP_BY_MOBILITY */
 	} else {
 		spin_lock_irqsave(&zone->lock, flags);
 		page = __rmqueue(zone, order, migratetype);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 7/12] Drain per-cpu lists when high-order allocations fail
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
                   ` (5 preceding siblings ...)
  2007-03-01 10:04 ` [PATCH 6/12] Add a configure option to group pages by mobility Mel Gorman
@ 2007-03-01 10:04 ` Mel Gorman
  2007-03-01 10:05 ` [PATCH 8/12] Move free pages between lists on steal Mel Gorman
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:04 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


Per-cpu pages can accidentally cause fragmentation because, although free,
they are pinned pages in an otherwise contiguous block. For example, a
single order-0 page held on a CPU's per-cpu list can prevent an
otherwise-free MAX_ORDER block from coalescing. When this patch is applied,
the per-cpu caches are drained after direct reclaim is entered if the
requested order is greater than 0. It simply reuses the code already used
by suspend and hotplug.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 page_alloc.c |   28 +++++++++++++++++++++++++++-
 1 files changed, 27 insertions(+), 1 deletion(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-006_configurable/mm/page_alloc.c linux-2.6.20-mm2-007_drainpercpu/mm/page_alloc.c
--- linux-2.6.20-mm2-006_configurable/mm/page_alloc.c	2007-02-20 18:33:41.000000000 +0000
+++ linux-2.6.20-mm2-007_drainpercpu/mm/page_alloc.c	2007-02-20 18:35:52.000000000 +0000
@@ -916,7 +916,9 @@ void mark_free_pages(struct zone *zone)
 
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
+#endif /* CONFIG_PM */
 
+#if defined(CONFIG_PM) || defined(CONFIG_PAGE_GROUP_BY_MOBILITY)
 /*
  * Spill all of this CPU's per-cpu pages back into the buddy allocator.
  */
@@ -928,7 +930,28 @@ void drain_local_pages(void)
 	__drain_pages(smp_processor_id());
 	local_irq_restore(flags);	
 }
-#endif /* CONFIG_PM */
+
+void smp_drain_local_pages(void *arg)
+{
+	drain_local_pages();
+}
+
+/*
+ * Spill all the per-cpu pages from all CPUs back into the buddy allocator
+ */
+void drain_all_local_pages(void)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__drain_pages(smp_processor_id());
+	local_irq_restore(flags);
+
+	smp_call_function(smp_drain_local_pages, NULL, 0, 1);
+}
+#else
+void drain_all_local_pages(void) {}
+#endif /* CONFIG_PM || CONFIG_PAGE_GROUP_BY_MOBILITY */
 
 /*
  * Free a 0-order page
@@ -1557,6 +1580,9 @@ nofail_alloc:
 
 	cond_resched();
 
+	if (order != 0)
+		drain_all_local_pages();
+
 	if (likely(did_some_progress)) {
 		page = get_page_from_freelist(gfp_mask, order,
 						zonelist, alloc_flags);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 8/12] Move free pages between lists on steal
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
                   ` (6 preceding siblings ...)
  2007-03-01 10:04 ` [PATCH 7/12] Drain per-cpu lists when high-order allocations fail Mel Gorman
@ 2007-03-01 10:05 ` Mel Gorman
  2007-03-01 10:05 ` [PATCH 9/12] Group short-lived and reclaimable kernel allocations Mel Gorman
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:05 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


When a fallback occurs, free pages belonging to one allocation type end up
stored on the free lists of another. When a large steal occurs, this patch
moves all the free pages within the stolen block to the lists of the
preferred type.
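
As a worked example of the alignment done by move_freepages_block() in the
diff below (assuming MAX_ORDER is 11, so MAX_ORDER_NR_PAGES is 1024): if
the stolen page is at pfn 5000, start_pfn is rounded down to
5000 & ~1023 == 4096 and the free pages within pfns 4096-5119 are moved to
the free lists of the new migrate type.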

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 page_alloc.c |   65 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 62 insertions(+), 3 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-007_drainpercpu/mm/page_alloc.c linux-2.6.20-mm2-008_movefree/mm/page_alloc.c
--- linux-2.6.20-mm2-007_drainpercpu/mm/page_alloc.c	2007-02-20 18:35:52.000000000 +0000
+++ linux-2.6.20-mm2-008_movefree/mm/page_alloc.c	2007-02-20 18:38:07.000000000 +0000
@@ -682,6 +682,63 @@ static int fallbacks[MIGRATE_TYPES][MIGR
 	[MIGRATE_MOVABLE]   = { MIGRATE_UNMOVABLE },
 };
 
+/*
+ * Move the free pages in a range to the free lists of the requested type.
+ * Note that start_page and end_page are not aligned on a MAX_ORDER_NR_PAGES
+ * boundary. If alignment is required, use move_freepages_block()
+ */
+int move_freepages(struct zone *zone,
+			struct page *start_page, struct page *end_page,
+			int migratetype)
+{
+	struct page *page;
+	unsigned long order;
+	int blocks_moved = 0;
+
+	BUG_ON(page_zone(start_page) != page_zone(end_page));
+
+	for (page = start_page; page < end_page;) {
+		if (!PageBuddy(page)) {
+			page++;
+			continue;
+		}
+#ifdef CONFIG_HOLES_IN_ZONE
+		if (!pfn_valid(page_to_pfn(page))) {
+			page++;
+			continue;
+		}
+#endif
+
+		order = page_order(page);
+		list_del(&page->lru);
+		list_add(&page->lru,
+			&zone->free_area[order].free_list[migratetype]);
+		page += 1 << order;
+		blocks_moved++;
+	}
+
+	return blocks_moved;
+}
+
+int move_freepages_block(struct zone *zone, struct page *page, int migratetype)
+{
+	unsigned long start_pfn;
+	struct page *start_page, *end_page;
+
+	start_pfn = page_to_pfn(page);
+	start_pfn = start_pfn & ~(MAX_ORDER_NR_PAGES-1);
+	start_page = pfn_to_page(start_pfn);
+	end_page = start_page + MAX_ORDER_NR_PAGES;
+
+	/* Do not cross zone boundaries */
+	if (page_zone(page) != page_zone(start_page))
+		start_page = page;
+	if (page_zone(page) != page_zone(end_page))
+		return 0;
+
+	return move_freepages(zone, start_page, end_page, migratetype);
+}
+
 /* Remove an element from the buddy allocator from the fallback list */
 static struct page *__rmqueue_fallback(struct zone *zone, int order,
 						int start_migratetype)
@@ -706,11 +763,13 @@ static struct page *__rmqueue_fallback(s
 			area->nr_free--;
 
 			/*
-			 * If breaking a large block of pages, place the buddies
-			 * on the preferred allocation list
+			 * If breaking a large block of pages, move all free
+			 * pages to the preferred allocation list
 			 */
-			if (unlikely(current_order >= MAX_ORDER / 2))
+			if (unlikely(current_order >= MAX_ORDER / 2)) {
 				migratetype = start_migratetype;
+				move_freepages_block(zone, page, migratetype);
+			}
 
 			/* Remove the page from the freelists */
 			list_del(&page->lru);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 9/12] Group short-lived and reclaimable kernel allocations
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
                   ` (7 preceding siblings ...)
  2007-03-01 10:05 ` [PATCH 8/12] Move free pages between lists on steal Mel Gorman
@ 2007-03-01 10:05 ` Mel Gorman
  2007-03-01 10:05 ` [PATCH 10/12] Group high-order atomic allocations Mel Gorman
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:05 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


This patch marks a number of allocations that are either short-lived, such
as network buffers, or reclaimable, such as inode allocations. When
something like updatedb runs, long-lived and unmovable kernel allocations
tend to be spread throughout the address space, which increases
fragmentation.

This patch groups these allocations together as much as possible by adding
a new migrate type. The MIGRATE_RECLAIMABLE type is for allocations that
can be reclaimed on demand but not moved; i.e. they can be "migrated" by
deleting them and re-reading the information from elsewhere.
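
For illustration only (the call site and cache below are hypothetical, not
part of this patch), a subsystem opts in by passing __GFP_RECLAIMABLE,
either directly or through the set_migrateflags() helper added in the diff
below:

	/* Hypothetical example: tag an allocation as reclaimable-but-unmovable */
	struct my_obj *obj = kmem_cache_alloc(my_cachep,
				set_migrateflags(GFP_KERNEL, __GFP_RECLAIMABLE));

gfpflags_to_migratetype() then packs the two flags into a two-bit index:
plain GFP_KERNEL maps to MIGRATE_UNMOVABLE (0), __GFP_RECLAIMABLE to
MIGRATE_RECLAIMABLE (1) and __GFP_MOVABLE to MIGRATE_MOVABLE (2).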

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 fs/buffer.c                     |    6 ++++--
 fs/dcache.c                     |    2 +-
 fs/ext2/super.c                 |    3 ++-
 fs/ext3/super.c                 |    2 +-
 fs/jbd/journal.c                |    6 ++++--
 fs/jbd/revoke.c                 |    6 ++++--
 fs/ntfs/inode.c                 |    4 ++--
 fs/proc/base.c                  |   13 +++++++------
 fs/proc/generic.c               |    2 +-
 fs/reiserfs/super.c             |    3 ++-
 include/linux/gfp.h             |   16 +++++++++++++---
 include/linux/mmzone.h          |    6 ++++--
 include/linux/pageblock-flags.h |    2 +-
 lib/radix-tree.c                |    6 ++++--
 mm/page_alloc.c                 |   10 +++++++---
 mm/shmem.c                      |    7 +++++--
 net/core/skbuff.c               |    1 +
 17 files changed, 63 insertions(+), 32 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/fs/buffer.c linux-2.6.20-mm2-009_cluster_reclaimable/fs/buffer.c
--- linux-2.6.20-mm2-008_movefree/fs/buffer.c	2007-02-20 18:27:38.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/fs/buffer.c	2007-02-20 18:46:51.000000000 +0000
@@ -989,7 +989,8 @@ grow_dev_page(struct block_device *bdev,
 	struct page *page;
 	struct buffer_head *bh;
 
-	page = find_or_create_page(inode->i_mapping, index, GFP_NOFS);
+	page = find_or_create_page(inode->i_mapping, index,
+					GFP_NOFS|__GFP_RECLAIMABLE);
 	if (!page)
 		return NULL;
 
@@ -2928,7 +2929,8 @@ static void recalc_bh_state(void)
 	
 struct buffer_head *alloc_buffer_head(gfp_t gfp_flags)
 {
-	struct buffer_head *ret = kmem_cache_alloc(bh_cachep, gfp_flags);
+	struct buffer_head *ret = kmem_cache_alloc(bh_cachep,
+				set_migrateflags(gfp_flags, __GFP_RECLAIMABLE));
 	if (ret) {
 		get_cpu_var(bh_accounting).nr++;
 		recalc_bh_state();
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/fs/dcache.c linux-2.6.20-mm2-009_cluster_reclaimable/fs/dcache.c
--- linux-2.6.20-mm2-008_movefree/fs/dcache.c	2007-02-19 01:21:37.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/fs/dcache.c	2007-02-20 18:46:51.000000000 +0000
@@ -900,7 +900,7 @@ struct dentry *d_alloc(struct dentry * p
 	struct dentry *dentry;
 	char *dname;
 
-	dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL); 
+	dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL|__GFP_RECLAIMABLE);
 	if (!dentry)
 		return NULL;
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/fs/ext2/super.c linux-2.6.20-mm2-009_cluster_reclaimable/fs/ext2/super.c
--- linux-2.6.20-mm2-008_movefree/fs/ext2/super.c	2007-02-19 01:21:37.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/fs/ext2/super.c	2007-02-20 18:46:51.000000000 +0000
@@ -140,7 +140,8 @@ static struct kmem_cache * ext2_inode_ca
 static struct inode *ext2_alloc_inode(struct super_block *sb)
 {
 	struct ext2_inode_info *ei;
-	ei = (struct ext2_inode_info *)kmem_cache_alloc(ext2_inode_cachep, GFP_KERNEL);
+	ei = (struct ext2_inode_info *)kmem_cache_alloc(ext2_inode_cachep,
+						GFP_KERNEL|__GFP_RECLAIMABLE);
 	if (!ei)
 		return NULL;
 #ifdef CONFIG_EXT2_FS_POSIX_ACL
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/fs/ext3/super.c linux-2.6.20-mm2-009_cluster_reclaimable/fs/ext3/super.c
--- linux-2.6.20-mm2-008_movefree/fs/ext3/super.c	2007-02-19 01:21:37.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/fs/ext3/super.c	2007-02-20 18:46:51.000000000 +0000
@@ -445,7 +445,7 @@ static struct inode *ext3_alloc_inode(st
 {
 	struct ext3_inode_info *ei;
 
-	ei = kmem_cache_alloc(ext3_inode_cachep, GFP_NOFS);
+	ei = kmem_cache_alloc(ext3_inode_cachep, GFP_NOFS|__GFP_RECLAIMABLE);
 	if (!ei)
 		return NULL;
 #ifdef CONFIG_EXT3_FS_POSIX_ACL
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/fs/jbd/journal.c linux-2.6.20-mm2-009_cluster_reclaimable/fs/jbd/journal.c
--- linux-2.6.20-mm2-008_movefree/fs/jbd/journal.c	2007-02-19 01:21:38.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/fs/jbd/journal.c	2007-02-20 18:46:51.000000000 +0000
@@ -1735,7 +1735,8 @@ static struct journal_head *journal_allo
 #ifdef CONFIG_JBD_DEBUG
 	atomic_inc(&nr_journal_heads);
 #endif
-	ret = kmem_cache_alloc(journal_head_cache, GFP_NOFS);
+	ret = kmem_cache_alloc(journal_head_cache,
+			set_migrateflags(GFP_NOFS, __GFP_RECLAIMABLE));
 	if (ret == 0) {
 		jbd_debug(1, "out of memory for journal_head\n");
 		if (time_after(jiffies, last_warning + 5*HZ)) {
@@ -1745,7 +1746,8 @@ static struct journal_head *journal_allo
 		}
 		while (ret == 0) {
 			yield();
-			ret = kmem_cache_alloc(journal_head_cache, GFP_NOFS);
+			ret = kmem_cache_alloc(journal_head_cache,
+					GFP_NOFS|__GFP_RECLAIMABLE);
 		}
 	}
 	return ret;
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/fs/jbd/revoke.c linux-2.6.20-mm2-009_cluster_reclaimable/fs/jbd/revoke.c
--- linux-2.6.20-mm2-008_movefree/fs/jbd/revoke.c	2007-02-04 18:44:54.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/fs/jbd/revoke.c	2007-02-20 18:46:51.000000000 +0000
@@ -206,7 +206,8 @@ int journal_init_revoke(journal_t *journ
 	while((tmp >>= 1UL) != 0UL)
 		shift++;
 
-	journal->j_revoke_table[0] = kmem_cache_alloc(revoke_table_cache, GFP_KERNEL);
+	journal->j_revoke_table[0] = kmem_cache_alloc(revoke_table_cache,
+					GFP_KERNEL|__GFP_RECLAIMABLE);
 	if (!journal->j_revoke_table[0])
 		return -ENOMEM;
 	journal->j_revoke = journal->j_revoke_table[0];
@@ -229,7 +230,8 @@ int journal_init_revoke(journal_t *journ
 	for (tmp = 0; tmp < hash_size; tmp++)
 		INIT_LIST_HEAD(&journal->j_revoke->hash_table[tmp]);
 
-	journal->j_revoke_table[1] = kmem_cache_alloc(revoke_table_cache, GFP_KERNEL);
+	journal->j_revoke_table[1] = kmem_cache_alloc(revoke_table_cache,
+					GFP_KERNEL|__GFP_RECLAIMABLE);
 	if (!journal->j_revoke_table[1]) {
 		kfree(journal->j_revoke_table[0]->hash_table);
 		kmem_cache_free(revoke_table_cache, journal->j_revoke_table[0]);
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/fs/ntfs/inode.c linux-2.6.20-mm2-009_cluster_reclaimable/fs/ntfs/inode.c
--- linux-2.6.20-mm2-008_movefree/fs/ntfs/inode.c	2007-02-04 18:44:54.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/fs/ntfs/inode.c	2007-02-20 18:46:51.000000000 +0000
@@ -324,7 +324,7 @@ struct inode *ntfs_alloc_big_inode(struc
 	ntfs_inode *ni;
 
 	ntfs_debug("Entering.");
-	ni = kmem_cache_alloc(ntfs_big_inode_cache, GFP_NOFS);
+	ni = kmem_cache_alloc(ntfs_big_inode_cache, GFP_NOFS|__GFP_RECLAIMABLE);
 	if (likely(ni != NULL)) {
 		ni->state = 0;
 		return VFS_I(ni);
@@ -349,7 +349,7 @@ static inline ntfs_inode *ntfs_alloc_ext
 	ntfs_inode *ni;
 
 	ntfs_debug("Entering.");
-	ni = kmem_cache_alloc(ntfs_inode_cache, GFP_NOFS);
+	ni = kmem_cache_alloc(ntfs_inode_cache, GFP_NOFS|__GFP_RECLAIMABLE);
 	if (likely(ni != NULL)) {
 		ni->state = 0;
 		return ni;
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/fs/proc/base.c linux-2.6.20-mm2-009_cluster_reclaimable/fs/proc/base.c
--- linux-2.6.20-mm2-008_movefree/fs/proc/base.c	2007-02-19 01:21:42.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/fs/proc/base.c	2007-02-20 18:46:51.000000000 +0000
@@ -521,7 +521,7 @@ static ssize_t proc_info_read(struct fil
 		count = PROC_BLOCK_SIZE;
 
 	length = -ENOMEM;
-	if (!(page = __get_free_page(GFP_KERNEL)))
+	if (!(page = __get_free_page(GFP_KERNEL|__GFP_RECLAIMABLE)))
 		goto out;
 
 	length = PROC_I(inode)->op.proc_read(task, (char*)page);
@@ -634,7 +634,7 @@ static ssize_t mem_write(struct file * f
 		goto out;
 
 	copied = -ENOMEM;
-	page = (char *)__get_free_page(GFP_USER);
+	page = (char *)__get_free_page(GFP_USER|__GFP_RECLAIMABLE);
 	if (!page)
 		goto out;
 
@@ -825,7 +825,7 @@ static ssize_t proc_loginuid_write(struc
 		/* No partial writes. */
 		return -EINVAL;
 	}
-	page = (char*)__get_free_page(GFP_USER);
+	page = (char*)__get_free_page(GFP_USER|__GFP_RECLAIMABLE);
 	if (!page)
 		return -ENOMEM;
 	length = -EFAULT;
@@ -1007,7 +1007,8 @@ static int do_proc_readlink(struct dentr
 			    char __user *buffer, int buflen)
 {
 	struct inode * inode;
-	char *tmp = (char*)__get_free_page(GFP_KERNEL), *path;
+	char *tmp = (char*)__get_free_page(GFP_KERNEL|__GFP_RECLAIMABLE);
+	char *path;
 	int len;
 
 	if (!tmp)
@@ -1658,7 +1659,7 @@ static ssize_t proc_pid_attr_read(struct
 	if (count > PAGE_SIZE)
 		count = PAGE_SIZE;
 	length = -ENOMEM;
-	if (!(page = __get_free_page(GFP_KERNEL)))
+	if (!(page = __get_free_page(GFP_KERNEL|__GFP_RECLAIMABLE)))
 		goto out;
 
 	length = security_getprocattr(task,
@@ -1693,7 +1694,7 @@ static ssize_t proc_pid_attr_write(struc
 		goto out;
 
 	length = -ENOMEM;
-	page = (char*)__get_free_page(GFP_USER);
+	page = (char*)__get_free_page(GFP_USER|__GFP_RECLAIMABLE);
 	if (!page)
 		goto out;
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/fs/proc/generic.c linux-2.6.20-mm2-009_cluster_reclaimable/fs/proc/generic.c
--- linux-2.6.20-mm2-008_movefree/fs/proc/generic.c	2007-02-19 01:21:42.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/fs/proc/generic.c	2007-02-20 18:46:51.000000000 +0000
@@ -74,7 +74,7 @@ proc_file_read(struct file *file, char _
 		nbytes = MAX_NON_LFS - pos;
 
 	dp = PDE(inode);
-	if (!(page = (char*) __get_free_page(GFP_KERNEL)))
+	if (!(page = (char*) __get_free_page(GFP_KERNEL|__GFP_RECLAIMABLE)))
 		return -ENOMEM;
 
 	spin_lock(&dp->pde_unload_lock);
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/fs/reiserfs/super.c linux-2.6.20-mm2-009_cluster_reclaimable/fs/reiserfs/super.c
--- linux-2.6.20-mm2-008_movefree/fs/reiserfs/super.c	2007-02-19 01:21:46.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/fs/reiserfs/super.c	2007-02-20 18:46:51.000000000 +0000
@@ -496,7 +496,8 @@ static struct inode *reiserfs_alloc_inod
 {
 	struct reiserfs_inode_info *ei;
 	ei = (struct reiserfs_inode_info *)
-	    kmem_cache_alloc(reiserfs_inode_cachep, GFP_KERNEL);
+	    kmem_cache_alloc(reiserfs_inode_cachep,
+						GFP_KERNEL|__GFP_RECLAIMABLE);
 	if (!ei)
 		return NULL;
 	return &ei->vfs_inode;
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/include/linux/gfp.h linux-2.6.20-mm2-009_cluster_reclaimable/include/linux/gfp.h
--- linux-2.6.20-mm2-008_movefree/include/linux/gfp.h	2007-02-20 18:25:33.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/include/linux/gfp.h	2007-02-20 18:47:46.000000000 +0000
@@ -49,9 +49,10 @@ struct vm_area_struct;
 #define __GFP_NOMEMALLOC ((__force gfp_t)0x10000u) /* Don't use emergency reserves */
 #define __GFP_HARDWALL   ((__force gfp_t)0x20000u) /* Enforce hardwall cpuset memory allocs */
 #define __GFP_THISNODE	((__force gfp_t)0x40000u)/* No fallback, no policies */
-#define __GFP_MOVABLE	((__force gfp_t)0x80000u) /* Page is movable */
+#define __GFP_RECLAIMABLE ((__force gfp_t)0x80000u) /* Page is reclaimable */
+#define __GFP_MOVABLE	((__force gfp_t)0x100000u)  /* Page is movable */
 
-#define __GFP_BITS_SHIFT 20	/* Room for 20 __GFP_FOO bits */
+#define __GFP_BITS_SHIFT 21	/* Room for 21 __GFP_FOO bits */
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /* if you forget to add the bitmask here kernel will crash, period */
@@ -59,7 +60,10 @@ struct vm_area_struct;
 			__GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
 			__GFP_NOFAIL|__GFP_NORETRY|__GFP_NO_GROW|__GFP_COMP| \
 			__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE| \
-			__GFP_MOVABLE)
+			__GFP_RECLAIMABLE|__GFP_MOVABLE)
+
+/* This mask makes up all the page movable related flags */
+#define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE|__GFP_MOVABLE)
 
 /* This equals 0, but use constants in case they ever change */
 #define GFP_NOWAIT	(GFP_ATOMIC & ~__GFP_HIGH)
@@ -108,6 +112,12 @@ static inline enum zone_type gfp_zone(gf
 	return ZONE_NORMAL;
 }
 
+static inline gfp_t set_migrateflags(gfp_t gfp, gfp_t migrate_flags)
+{
+	BUG_ON((gfp & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
+	return (gfp & ~(GFP_MOVABLE_MASK)) | migrate_flags;
+}
+
 /*
  * There is only one page-allocator function, and two main namespaces to
  * it. The alloc_page*() variants return 'struct page *' and as such
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/include/linux/mmzone.h linux-2.6.20-mm2-009_cluster_reclaimable/include/linux/mmzone.h
--- linux-2.6.20-mm2-008_movefree/include/linux/mmzone.h	2007-02-20 18:33:41.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/include/linux/mmzone.h	2007-02-20 18:46:51.000000000 +0000
@@ -27,10 +27,12 @@
 
 #ifdef CONFIG_PAGE_GROUP_BY_MOBILITY
 #define MIGRATE_UNMOVABLE     0
-#define MIGRATE_MOVABLE       1
-#define MIGRATE_TYPES         2
+#define MIGRATE_RECLAIMABLE   1
+#define MIGRATE_MOVABLE       2
+#define MIGRATE_TYPES         3
 #else
 #define MIGRATE_UNMOVABLE     0
+#define MIGRATE_UNRECLAIMABLE 0
 #define MIGRATE_MOVABLE       0
 #define MIGRATE_TYPES         1
 #endif
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/include/linux/pageblock-flags.h linux-2.6.20-mm2-009_cluster_reclaimable/include/linux/pageblock-flags.h
--- linux-2.6.20-mm2-008_movefree/include/linux/pageblock-flags.h	2007-02-20 19:29:13.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/include/linux/pageblock-flags.h	2007-02-20 19:31:18.000000000 +0000
@@ -31,7 +31,7 @@
 
 /* Bit indices that affect a whole block of pages */
 enum pageblock_bits {
-	PB_range(PB_migrate, 1), /* 1 bit required for migrate types */
+	PB_range(PB_migrate, 2), /* 2 bits required for migrate types */
 	NR_PAGEBLOCK_BITS
 };
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/lib/radix-tree.c linux-2.6.20-mm2-009_cluster_reclaimable/lib/radix-tree.c
--- linux-2.6.20-mm2-008_movefree/lib/radix-tree.c	2007-02-19 01:22:34.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/lib/radix-tree.c	2007-02-20 18:46:51.000000000 +0000
@@ -93,7 +93,8 @@ radix_tree_node_alloc(struct radix_tree_
 	struct radix_tree_node *ret;
 	gfp_t gfp_mask = root_gfp_mask(root);
 
-	ret = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
+	ret = kmem_cache_alloc(radix_tree_node_cachep,
+				set_migrateflags(gfp_mask, __GFP_RECLAIMABLE));
 	if (ret == NULL && !(gfp_mask & __GFP_WAIT)) {
 		struct radix_tree_preload *rtp;
 
@@ -137,7 +138,8 @@ int radix_tree_preload(gfp_t gfp_mask)
 	rtp = &__get_cpu_var(radix_tree_preloads);
 	while (rtp->nr < ARRAY_SIZE(rtp->nodes)) {
 		preempt_enable();
-		node = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
+		node = kmem_cache_alloc(radix_tree_node_cachep,
+				set_migrateflags(gfp_mask, __GFP_RECLAIMABLE));
 		if (node == NULL)
 			goto out;
 		preempt_disable();
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/mm/page_alloc.c linux-2.6.20-mm2-009_cluster_reclaimable/mm/page_alloc.c
--- linux-2.6.20-mm2-008_movefree/mm/page_alloc.c	2007-02-20 18:38:07.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/mm/page_alloc.c	2007-02-20 18:46:51.000000000 +0000
@@ -150,7 +150,10 @@ static void set_pageblock_migratetype(st
 
 static inline int gfpflags_to_migratetype(gfp_t gfp_flags)
 {
-	return ((gfp_flags & __GFP_MOVABLE) != 0);
+	WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
+
+	return (((gfp_flags & __GFP_MOVABLE) != 0) << 1) |
+		((gfp_flags & __GFP_RECLAIMABLE) != 0);
 }
 
 #else
@@ -678,8 +681,9 @@ static int prep_new_page(struct page *pa
  * the free lists for the desirable migrate type are depleted
  */
 static int fallbacks[MIGRATE_TYPES][MIGRATE_TYPES-1] = {
-	[MIGRATE_UNMOVABLE] = { MIGRATE_MOVABLE   },
-	[MIGRATE_MOVABLE]   = { MIGRATE_UNMOVABLE },
+	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE   },
+	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE   },
+	[MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE },
 };
 
 /*
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/mm/shmem.c linux-2.6.20-mm2-009_cluster_reclaimable/mm/shmem.c
--- linux-2.6.20-mm2-008_movefree/mm/shmem.c	2007-02-20 18:25:33.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/mm/shmem.c	2007-02-20 18:46:51.000000000 +0000
@@ -983,7 +983,9 @@ shmem_alloc_page(gfp_t gfp, struct shmem
 	pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, idx);
 	pvma.vm_pgoff = idx;
 	pvma.vm_end = PAGE_SIZE;
-	page = alloc_page_vma(gfp | __GFP_ZERO, &pvma, 0);
+	page = alloc_page_vma(
+			set_migrateflags(gfp | __GFP_ZERO, __GFP_RECLAIMABLE),
+								&pvma, 0);
 	mpol_free(pvma.vm_policy);
 	return page;
 }
@@ -1003,7 +1005,8 @@ shmem_swapin(struct shmem_inode_info *in
 static inline struct page *
 shmem_alloc_page(gfp_t gfp,struct shmem_inode_info *info, unsigned long idx)
 {
-	return alloc_page(gfp | __GFP_ZERO);
+	return alloc_page(
+			set_migrateflags(gfp | __GFP_ZERO, __GFP_RECLAIMABLE));
 }
 #endif
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-008_movefree/net/core/skbuff.c linux-2.6.20-mm2-009_cluster_reclaimable/net/core/skbuff.c
--- linux-2.6.20-mm2-008_movefree/net/core/skbuff.c	2007-02-19 01:22:35.000000000 +0000
+++ linux-2.6.20-mm2-009_cluster_reclaimable/net/core/skbuff.c	2007-02-20 18:46:51.000000000 +0000
@@ -170,6 +170,7 @@ struct sk_buff *__alloc_skb(unsigned int
 	u8 *data;
 
 	cache = fclone ? skbuff_fclone_cache : skbuff_head_cache;
+	gfp_mask = set_migrateflags(gfp_mask, __GFP_RECLAIMABLE);
 
 	/* Get the HEAD */
 	skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 10/12] Group high-order atomic allocations
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
                   ` (8 preceding siblings ...)
  2007-03-01 10:05 ` [PATCH 9/12] Group short-lived and reclaimable kernel allocations Mel Gorman
@ 2007-03-01 10:05 ` Mel Gorman
  2007-03-01 10:06 ` [PATCH 11/12] Bias the placement of kernel pages at lower PFNs Mel Gorman
  2007-03-01 10:06 ` [PATCH 12/12] Be more aggressive about stealing when MIGRATE_RECLAIMABLE allocations fall back Mel Gorman
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:05 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


In rare cases, the kernel needs to allocate a high-order block of pages
without sleeping. For example, this is the case with e1000 cards configured
to use jumbo frames.  Migrating or reclaiming pages in this situation is
not an option.

This patch groups these allocations together as much as possible by adding
a new migrate type. The MIGRATE_HIGHATOMIC type is exactly what it sounds
like. Care is taken that pages of other migrate types do not use the same
blocks as high-order atomic allocations.
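
By way of illustration (the call site below is hypothetical), an order-2
GFP_ATOMIC allocation, such as a driver allocating a jumbo-frame receive
buffer, is classified as MIGRATE_HIGHATOMIC by the new
allocflags_to_migratetype() because its order is greater than 0 and
GFP_ATOMIC lacks __GFP_WAIT:

	/* Hypothetical example: order > 0 and atomic -> MIGRATE_HIGHATOMIC */
	struct page *page = alloc_pages(GFP_ATOMIC, 2);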

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 include/linux/mmzone.h |    4 +++-
 mm/page_alloc.c        |   36 ++++++++++++++++++++++++++++++------
 2 files changed, 33 insertions(+), 7 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-009_cluster_reclaimable/include/linux/mmzone.h linux-2.6.20-mm2-010_cluster_atomic/include/linux/mmzone.h
--- linux-2.6.20-mm2-009_cluster_reclaimable/include/linux/mmzone.h	2007-02-20 18:46:51.000000000 +0000
+++ linux-2.6.20-mm2-010_cluster_atomic/include/linux/mmzone.h	2007-02-20 18:50:00.000000000 +0000
@@ -29,11 +29,13 @@
 #define MIGRATE_UNMOVABLE     0
 #define MIGRATE_RECLAIMABLE   1
 #define MIGRATE_MOVABLE       2
-#define MIGRATE_TYPES         3
+#define MIGRATE_HIGHATOMIC    3
+#define MIGRATE_TYPES         4
 #else
 #define MIGRATE_UNMOVABLE     0
 #define MIGRATE_UNRECLAIMABLE 0
 #define MIGRATE_MOVABLE       0
+#define MIGRATE_HIGHATOMIC    0
 #define MIGRATE_TYPES         1
 #endif
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-009_cluster_reclaimable/mm/page_alloc.c linux-2.6.20-mm2-010_cluster_atomic/mm/page_alloc.c
--- linux-2.6.20-mm2-009_cluster_reclaimable/mm/page_alloc.c	2007-02-20 18:46:51.000000000 +0000
+++ linux-2.6.20-mm2-010_cluster_atomic/mm/page_alloc.c	2007-02-20 18:50:00.000000000 +0000
@@ -148,10 +148,16 @@ static void set_pageblock_migratetype(st
 					PB_migrate, PB_migrate_end);
 }
 
-static inline int gfpflags_to_migratetype(gfp_t gfp_flags)
+static inline int allocflags_to_migratetype(gfp_t gfp_flags, int order)
 {
 	WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
 
+	/* Cluster high-order atomic allocations together */
+	if (unlikely(order > 0) &&
+			(!(gfp_flags & __GFP_WAIT) || in_interrupt()))
+		return MIGRATE_HIGHATOMIC;
+
+	/* Cluster based on mobility */
 	return (((gfp_flags & __GFP_MOVABLE) != 0) << 1) |
 		((gfp_flags & __GFP_RECLAIMABLE) != 0);
 }
@@ -166,7 +172,7 @@ static void set_pageblock_migratetype(st
 {
 }
 
-static inline int gfpflags_to_migratetype(gfp_t gfp_flags)
+static inline int allocflags_to_migratetype(gfp_t gfp_flags, int order)
 {
 	return MIGRATE_UNMOVABLE;
 }
@@ -681,9 +687,10 @@ static int prep_new_page(struct page *pa
  * the free lists for the desirable migrate type are depleted
  */
 static int fallbacks[MIGRATE_TYPES][MIGRATE_TYPES-1] = {
-	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE   },
-	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE   },
-	[MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE },
+	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE,   MIGRATE_HIGHATOMIC },
+	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE,   MIGRATE_HIGHATOMIC },
+	[MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_HIGHATOMIC },
+	[MIGRATE_HIGHATOMIC]  = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_MOVABLE    },
 };
 
 /*
@@ -751,13 +758,24 @@ static struct page *__rmqueue_fallback(s
 	int current_order;
 	struct page *page;
 	int migratetype, i;
+	int nonatomic_fallback_atomic = 0;
 
+retry:
 	/* Find the largest possible block of pages in the other list */
 	for (current_order = MAX_ORDER-1; current_order >= order;
 						--current_order) {
 		for (i = 0; i < MIGRATE_TYPES - 1; i++) {
 			migratetype = fallbacks[start_migratetype][i];
 
+			/*
+			 * Make it hard to fall back to blocks used for
+			 * high-order atomic allocations
+			 */
+			if (migratetype == MIGRATE_HIGHATOMIC &&
+				start_migratetype != MIGRATE_UNMOVABLE &&
+				!nonatomic_fallback_atomic)
+				continue;
+
 			area = &(zone->free_area[current_order]);
 			if (list_empty(&area->free_list[migratetype]))
 				continue;
@@ -790,6 +808,12 @@ static struct page *__rmqueue_fallback(s
 		}
 	}
 
+	/* Allow fallback to high-order atomic blocks if memory is that low */
+	if (!nonatomic_fallback_atomic) {
+		nonatomic_fallback_atomic = 1;
+		goto retry;
+	}
+
 	return NULL;
 }
 #else
@@ -1089,7 +1113,7 @@ static struct page *buffered_rmqueue(str
 	struct page *page;
 	int cold = !!(gfp_flags & __GFP_COLD);
 	int cpu;
-	int migratetype = gfpflags_to_migratetype(gfp_flags);
+	int migratetype = allocflags_to_migratetype(gfp_flags, order);
 
 again:
 	cpu  = get_cpu();

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 11/12] Bias the placement of kernel pages at lower PFNs
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
                   ` (9 preceding siblings ...)
  2007-03-01 10:05 ` [PATCH 10/12] Group high-order atomic allocations Mel Gorman
@ 2007-03-01 10:06 ` Mel Gorman
  2007-03-01 10:06 ` [PATCH 12/12] Be more aggressive about stealing when MIGRATE_RECLAIMABLE allocations fall back Mel Gorman
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:06 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


This patch chooses blocks with lower PFNs when placing kernel allocations.
This is particularly important during fallback in low-memory situations to
stop unmovable pages from being scattered throughout the entire address
space.
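
As an illustrative example: if the fallback free list being examined holds
free pages at pfns 9000, 4100 and 12000, the min_page() helper in the diff
below returns the page at pfn 4100, so the kernel allocation is satisfied
from the lowest block available.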

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 page_alloc.c |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-010_cluster_atomic/mm/page_alloc.c linux-2.6.20-mm2-011_biasplacement/mm/page_alloc.c
--- linux-2.6.20-mm2-010_cluster_atomic/mm/page_alloc.c	2007-02-20 18:50:00.000000000 +0000
+++ linux-2.6.20-mm2-011_biasplacement/mm/page_alloc.c	2007-02-20 18:52:18.000000000 +0000
@@ -750,6 +750,23 @@ int move_freepages_block(struct zone *zo
 	return move_freepages(zone, start_page, end_page, migratetype);
 }
 
+/* Return the page with the lowest PFN in the list */
+static struct page *min_page(struct list_head *list)
+{
+	unsigned long min_pfn = -1UL;
+	struct page *min_page = NULL, *page;
+
+	list_for_each_entry(page, list, lru) {
+		unsigned long pfn = page_to_pfn(page);
+		if (pfn < min_pfn) {
+			min_pfn = pfn;
+			min_page = page;
+		}
+	}
+
+	return min_page;
+}
+
 /* Remove an element from the buddy allocator from the fallback list */
 static struct page *__rmqueue_fallback(struct zone *zone, int order,
 						int start_migratetype)
@@ -780,8 +797,11 @@ retry:
 			if (list_empty(&area->free_list[migratetype]))
 				continue;
 
+			/* Bias kernel allocations towards low pfns */
 			page = list_entry(area->free_list[migratetype].next,
 					struct page, lru);
+			if (unlikely(start_migratetype != MIGRATE_MOVABLE))
+				page = min_page(&area->free_list[migratetype]);
 			area->nr_free--;
 
 			/*

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 12/12] Be more aggressive about stealing when MIGRATE_RECLAIMABLE allocations fall back
  2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
                   ` (10 preceding siblings ...)
  2007-03-01 10:06 ` [PATCH 11/12] Bias the placement of kernel pages at lower PFNs Mel Gorman
@ 2007-03-01 10:06 ` Mel Gorman
  11 siblings, 0 replies; 13+ messages in thread
From: Mel Gorman @ 2007-03-01 10:06 UTC (permalink / raw)
  To: akpm; +Cc: Mel Gorman, linux-kernel, linux-mm


MIGRATE_RECLAIMABLE allocations tend to be very bursty in nature, for
example when updatedb starts. Such bursts are likely to occur when
MAX_ORDER blocks of pages are not free, which means updatedb can scatter
MIGRATE_RECLAIMABLE pages throughout the address space. This patch is more
aggressive about stealing blocks of pages for MIGRATE_RECLAIMABLE.
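
As a worked example of the "over half free" test in the diff below
(assuming MAX_ORDER is 11, so a block holds 1024 pages and half of it is
1 << (MAX_ORDER-2) == 512): a reclaimable fallback at current_order 5 that
moves 20 free chunks yields 20 << 5 == 640 >= 512, so ownership of the
whole block is claimed with set_pageblock_migratetype(), provided the list
stolen from is not MIGRATE_HIGHATOMIC. Note the test approximates the
free-page count by treating every chunk moved as if it were of
current_order.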

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---

 page_alloc.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.20-mm2-011_biasplacement/mm/page_alloc.c linux-2.6.20-mm2-012_grabbyreclaim/mm/page_alloc.c
--- linux-2.6.20-mm2-011_biasplacement/mm/page_alloc.c	2007-02-20 18:52:18.000000000 +0000
+++ linux-2.6.20-mm2-012_grabbyreclaim/mm/page_alloc.c	2007-02-20 18:54:35.000000000 +0000
@@ -806,11 +806,23 @@ retry:
 
 			/*
 			 * If breaking a large block of pages, move all free
-			 * pages to the preferred allocation list
+			 * pages to the preferred allocation list. If falling
+			 * back for a reclaimable kernel allocation, be more
+			 * aggressive about taking ownership of free pages
 			 */
-			if (unlikely(current_order >= MAX_ORDER / 2)) {
+			if (unlikely(current_order >= MAX_ORDER / 2) ||
+					start_migratetype == MIGRATE_RECLAIMABLE) {
+				unsigned long pages;
+				pages = move_freepages_block(zone, page,
+								start_migratetype);
+
+				/* Claim the whole block if over half of it is free */
+				if ((pages << current_order) >= (1 << (MAX_ORDER-2)) &&
+						migratetype != MIGRATE_HIGHATOMIC)
+					set_pageblock_migratetype(page,
+								start_migratetype);
+
 				migratetype = start_migratetype;
-				move_freepages_block(zone, page, migratetype);
 			}
 
 			/* Remove the page from the freelists */

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2007-03-01 10:06 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-03-01 10:02 [PATCH 0/12] Group pages of related mobility together to reduce external fragmentation v28 Mel Gorman
2007-03-01 10:02 ` [PATCH 1/12] Add a bitmap that is used to track flags affecting a block of pages Mel Gorman
2007-03-01 10:03 ` [PATCH 2/12] Add __GFP_MOVABLE for callers to flag allocations from high memory that may be migrated Mel Gorman
2007-03-01 10:03 ` [PATCH 3/12] Add __GFP_MOVABLE for callers to flag allocations from low " Mel Gorman
2007-03-01 10:03 ` [PATCH 4/12] Split the free lists for movable and unmovable allocations Mel Gorman
2007-03-01 10:04 ` [PATCH 5/12] Choose pages from the per-cpu list based on migration type Mel Gorman
2007-03-01 10:04 ` [PATCH 6/12] Add a configure option to group pages by mobility Mel Gorman
2007-03-01 10:04 ` [PATCH 7/12] Drain per-cpu lists when high-order allocations fail Mel Gorman
2007-03-01 10:05 ` [PATCH 8/12] Move free pages between lists on steal Mel Gorman
2007-03-01 10:05 ` [PATCH 9/12] Group short-lived and reclaimable kernel allocations Mel Gorman
2007-03-01 10:05 ` [PATCH 10/12] Group high-order atomic allocations Mel Gorman
2007-03-01 10:06 ` [PATCH 11/12] Bias the placement of kernel pages at lower PFNs Mel Gorman
2007-03-01 10:06 ` [PATCH 12/12] Be more aggressive about stealing when MIGRATE_RECLAIMABLE allocations fall back Mel Gorman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).