* [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects
@ 2007-11-06 19:51 Christoph Lameter
  2007-11-06 19:51 ` [patch 01/28] cpu alloc: The allocator Christoph Lameter
                   ` (28 more replies)
  0 siblings, 29 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

In various places the kernel maintains arrays of pointers indexed by
processor number. These are used to locate objects that need to be used
when executing on a specific processor. Both the slab allocator
and the page allocator use these arrays, and there the arrays sit in
performance critical code. The allocpercpu functionality is a simple
allocator that provides these arrays. However, such arrays have several
drawbacks:

1. The arrays become huge for large systems and may be very sparsely
   populated if they are dimensioned for NR_CPUS: an architecture
   like IA64 allows up to 4k cpus, yet the kernel may then be booted on a
   machine that only supports 8 processors.

2. The arrays cause surrounding variables to no longer fit into a single
   cacheline. The layout of core data structures is typically optimized so
   that variables frequently used together are placed in the same cacheline.
   Arrays of pointers move these variables far apart and destroy this effect.

3. A processor frequently follows only one pointer: its own. That
   cacheline therefore has to be kept in the cache, yet the neighboring
   pointers all belong to other processors and are rarely used. A whole
   cacheline of 128 bytes may be consumed while only 8 bytes of it are in
   constant use. It would be better to be able to place more useful
   information in this cacheline.

4. The lookup of the per cpu object is expensive and requires multiple
   memory accesses to:

   A) smp_processor_id()
   B) pointer to the base of the per cpu pointer array
   C) pointer to the per cpu object in the pointer array
   D) the per cpu object itself.

5. Each use of allocpercpu requires its own per cpu array. On large
   systems these large arrays have to be allocated again and again.

6. Processor hotplug cannot effectively track the per cpu objects
   since the VM cannot find all the memory that was allocated for
   a specific cpu. It is impossible to add or remove objects in
   a consistent way. The allocpercpu subsystem was extended to
   add that capability, but it is not used since doing so would
   require adding cpu hotplug callbacks to each and every use of
   allocpercpu in the kernel.

The patchset here provides a cpu allocator that arranges data differently.
Objects are placed tightly in linear areas reserved for each processor.
The areas are of a fixed size so that address calculation can be used
instead of a table lookup. This means that:

6. The VM knows where all the per cpu variables are and it could remove
   or add cpu areas as cpus come online or go offline.

5. There is no need for per cpu pointer arrays.

4. The lookup of a per cpu object is simple and requires memory accesses only to:

   A) smp_processor_id()
   B) cpu pointer to the object
   C) the per cpu object itself.

3. The access to the unfriendly cacheline that contains only a single
   useful pointer is avoided entirely. The cache footprint is reduced.

2. Surrounding variables can be placed in the same cacheline.
   This allows SLUB, for example, to avoid caching values in per cpu
   structures, since the kmem_cache structure is finally available
   without the need to access a cache cold cacheline.

1. A single pointer can be used regardless of the number of processors
   in the system.

The cpu allocator manages data beginning at CPU_AREA_BASE. The pointer to
access item DATA on processor X can then be calculated as

POINTER = CPU_AREA_BASE + DATA + (X << (CPU_AREA_ORDER + PAGE_SHIFT))

This makes the allocator rely on a fixed address for the cpu area and on
a fixed size of memory for each processor (similar to S/390's
way of addressing percpu variables).

The allocator can be configured in two ways:

1. Static configuration

	The cpu areas are directly mapped memory addresses. Thus
	the memory in the cpu areas is fixed and is allocated
	as a static variable.

	The default configuration of the cpu allocator (if no arch code
	changed settings) is to reserve a 64k area for each processor.

2. Virtual configuration

	The cpu areas are virtualized. Memory in the cpu areas is allocated
	on demand. The MMU is used to map the allocated memory into the
	cpu areas (in the same way the virtual memmap functionality does).

	The maximum size of the cpu areas depends only on the amount
	of virtual memory available. The virtualization can use large
	mappings (PMDs f.e.) in order to avoid the TLB pressure that could
	occur on systems with a small base page size when heavy use of the
	cpu areas is made.


This patchset increases the speed of the SLUB fastpath, and it is likely
that similar results can be obtained for other kernel subsystems:


Allocation of 10000 objects of each size. Measurement of the cycles
for each action:

Size  SLUBmm	cpu alloc
-------------------------
   8  45	38
  16  49	43
  32  61	53
  64  82	75
 128  188	176
 256  207	204
 512  260	250
1024  398	391
2048  530	511
4096  342	376

Allocation and then immediate freeing of an object. Measured in cycles
for each alloc/free action:

alloc/free test
    SLUBmm	cpu alloc
    68-72	56-58

The cpu allocator also removes the differences in handling SMP, UP and NUMA
in the slab and page allocators and simplifies the code. It is advantageous
even on UP to place per cpu data from different zones or different slabs in
the same cacheline. Cpu alloc makes uniform handling of cpu data possible on
all three types of configuration.

The cpu allocator also decreases the memory needs for per cpu storage.

On a classic configuration with SLAB, 32 processors and the allocation of a 4 byte
counter via allocpercpu one needs the following on a 64 bit platform:

32 * 8		256	Array indexed by processor
32 * 32		1024	32 objects. The minimum allocation size of SLAB is 32.
------------------------------------------------------------------------------
Total		1280 bytes

cpu alloc needs

32 * 4		128 bytes

This is one tenth of the storage. Granted, this is the worst case scenario
for a 32 processor system, but it shows the savings that can be had. cpu
alloc can allocate 10 counters in the same cacheline for the price of one
with allocpercpu. The allocpercpu counters are likely dispersed all over
memory, so multiple cachelines (in the worst case 10) need to be kept in
the cache if those counters need constant updating. cpu alloc keeps the
10 counters in a single cacheline, and it can fit up to 16 counters into
one cacheline if the machine has a 64 byte cacheline size.

The use of the cpu area is usually pretty minimal. 32 bit SMP systems
typically use about 8k of cpu area space after bootup, 64 bit SMP around
16k. Small NUMA systems (8p 4node) use about 64k. Large NUMA systems may
need a megabyte of cpu area.

The usage of the per cpu areas typically increases by

1. New slabs being created (needs about 12 bytes per slab on 32 bit, 20 on 64 bit)
2. New devices being mounted that need cpu data for statistics
3. Network devices statistics
4. Special network features (Dave needs to run 100000 IP tunnels)

The current use of the cpu area can be seen in the field

	cpu_bytes

in /proc/vmstat

Drawbacks:

1. The per cpu area size is fixed

   If we use a virtually mapped area then this is not a problem if there
   is sufficient virtual space. The 100000 IP tunnels are only realistic
   with a virtually mapped cpu area.

2. The cpu allocator cannot control the allocation of individual objects
   the way allocpercpu can. In actuality that capability is never used,
   except in net/iucv/iucv.c where we have a single case of a per cpu
   allocation being used to allocate GFP_DMA structures(!). A patch is
   provided that replaces that use of allocpercpu with explicit calls to
   allocators for each object in iucv.c.

TODO:
- Currently only i386, ia64 and x86_64 arch definitions are provided.
  Other arches fall back to 64k static configurations.
- Cpu hotplug support. Currently we simply allocate for all possible processors.
  We could reduce this to only online processors if we could allocate the
  cpu area for a new processor before the callbacks are run, and free the
  cpu areas for a processor going down after all the callbacks for it
  have run.
- There are various modifications to exotic configurations that still need
  some testing (f.e. s/390 iucv--whatever that is--) etc. Tests were
  done on UP (i386), SMP (i386, x86_64) and NUMA (x86_64, ia64).

The patchset implements cpu alloc and then gradually replaces all uses of
allocpercpu in the kernel. The last patch removes the allocpercpu support.
If the last patch is not applied then allocpercpu can coexist with cpu alloc.

The patchset is available also via

git pull git://git.kernel.org/pub/scm/linux/kernel/git/christoph/slab.git cpu_alloc


The following patches are based on the linux-2.6 git tree +

git://git.kernel.org/pub/scm/linux/kernel/git/christoph/slab.git performance

(which is the mm version of SLUB)

-- 


* [patch 01/28] cpu alloc: The allocator
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-08 12:34   ` Peter Zijlstra
                     ` (2 more replies)
  2007-11-06 19:51 ` [patch 02/28] cpu alloc: x86_64 support Christoph Lameter
                   ` (27 subsequent siblings)
  28 siblings, 3 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_core --]
[-- Type: text/plain, Size: 16918 bytes --]

The core portion of the cpu allocator.

The per cpu allocator allows dynamic allocation of memory on all
processors simultaneously. A bitmap is used to track used areas.
The allocator implements tight packing to reduce the cache footprint
and increase speed, since cacheline contention is typically not a concern
for memory mainly used by a single cpu. Small objects fill up the gaps
left by larger allocations that required alignment.

Signed-off-by: Christoph Lameter <clameter@sgi.com>


---
 include/linux/cpu_alloc.h |   56 ++++++
 include/linux/mm.h        |   13 +
 include/linux/vmstat.h    |    2 
 mm/Kconfig                |   33 +++
 mm/Makefile               |    3 
 mm/cpu_alloc.c            |  407 ++++++++++++++++++++++++++++++++++++++++++++++
 mm/vmstat.c               |    1 
 7 files changed, 512 insertions(+), 3 deletions(-)

Index: linux-2.6/mm/cpu_alloc.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/mm/cpu_alloc.c	2007-11-06 06:05:06.000000000 +0000
@@ -0,0 +1,407 @@
+/*
+ * Cpu allocator - Manage objects allocated for each processor
+ *
+ * (C) 2007 SGI, Christoph Lameter <clameter@sgi.com>
+ * 	Basic implementation with allocation and free from a dedicated per
+ * 	cpu area.
+ *
+ * The per cpu allocator allows dynamic allocation of memory on all
+ * processors simultaneously. A bitmap is used to track used areas.
+ * The allocator implements tight packing to reduce the cache footprint
+ * and increase speed since cacheline contention is typically not a concern
+ * for memory mainly used by a single cpu. Small objects will fill up gaps
+ * left by larger allocations that required alignments.
+ */
+#include <linux/mm.h>
+#include <linux/mmzone.h>
+#include <linux/module.h>
+#include <linux/cpu_alloc.h>
+#include <linux/bitmap.h>
+#include <linux/vmalloc.h>
+#include <linux/bootmem.h>
+#include <linux/sched.h>	/* i386 definition of init_mm */
+#include <linux/highmem.h>	/* i386 dependency on highmem config */
+#include <asm/pgtable.h>
+#include <asm/pgalloc.h>
+
+/*
+ * Basic allocation unit. A bit map is created to track the use of each
+ * UNIT_SIZE element in the cpu area.
+ */
+
+#define UNIT_SIZE sizeof(int)
+#define UNITS_PER_BLOCK (ALLOC_SIZE / UNIT_SIZE)
+
+/*
+ * Lock to protect the bitmap and the meta data for the cpu allocator.
+ */
+static DEFINE_SPINLOCK(cpu_alloc_map_lock);
+
+#ifdef CONFIG_CPU_AREA_VIRTUAL
+
+/*
+ * Virtualized cpu area. The cpu area can be extended if more space is needed.
+ */
+
+#define cpu_area ((u8 *)(CPU_AREA_BASE))
+#define ALLOC_SIZE (1UL << (CONFIG_CPU_AREA_ALLOC_ORDER + PAGE_SHIFT))
+
+/*
+ * The maximum number of blocks is the maximum size of the
+ * cpu area for one processor divided by the size of an allocation
+ * block.
+ */
+#define MAX_BLOCKS (1UL << (CONFIG_CPU_AREA_ORDER - \
+				CONFIG_CPU_AREA_ALLOC_ORDER))
+
+
+static unsigned long *cpu_alloc_map = NULL;
+static int cpu_alloc_map_order = -1;	/* Size of the bitmap in page order */
+static unsigned long active_blocks;	/* Number of blocks allocated on each cpu */
+static unsigned long units_free;	/* Number of available units */
+static unsigned long units_total;	/* Total units that are managed */
+
+/*
+ * Allocate a block of memory to be used to provide cpu area memory
+ * or to extend the bitmap for the cpu map.
+ */
+void *cpu_area_alloc_block(unsigned long size, gfp_t flags, int node)
+{
+	struct page *page = alloc_pages_node(node,
+			flags, get_order(size));
+	if (page)
+		return page_address(page);
+	return NULL;
+}
+
+pte_t *cpu_area_pte_populate(pmd_t *pmd, unsigned long addr,
+						gfp_t flags, int node)
+{
+	pte_t *pte = pte_offset_kernel(pmd, addr);
+	if (pte_none(*pte)) {
+		pte_t entry;
+		void *p = cpu_area_alloc_block(PAGE_SIZE, flags, node);
+		if (!p)
+			return 0;
+		entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
+		set_pte_at(&init_mm, addr, pte, entry);
+	}
+	return pte;
+}
+
+pmd_t *cpu_area_pmd_populate(pud_t *pud, unsigned long addr,
+						gfp_t flags, int node)
+{
+	pmd_t *pmd = pmd_offset(pud, addr);
+	if (pmd_none(*pmd)) {
+		void *p = cpu_area_alloc_block(PAGE_SIZE, flags, node);
+		if (!p)
+			return 0;
+		pmd_populate_kernel(&init_mm, pmd, p);
+	}
+	return pmd;
+}
+
+pud_t *cpu_area_pud_populate(pgd_t *pgd, unsigned long addr,
+						gfp_t flags, int node)
+{
+	pud_t *pud = pud_offset(pgd, addr);
+	if (pud_none(*pud)) {
+		void *p = cpu_area_alloc_block(PAGE_SIZE, flags, node);
+		if (!p)
+			return 0;
+		pud_populate(&init_mm, pud, p);
+	}
+	return pud;
+}
+
+pgd_t *cpu_area_pgd_populate(unsigned long addr, gfp_t flags, int node)
+{
+	pgd_t *pgd = pgd_offset_k(addr);
+	if (pgd_none(*pgd)) {
+		void *p = cpu_area_alloc_block(PAGE_SIZE, flags, node);
+		if (!p)
+			return 0;
+		pgd_populate(&init_mm, pgd, p);
+	}
+	return pgd;
+}
+
+int cpu_area_populate_basepages(void *start, unsigned long size,
+						gfp_t flags, int node)
+{
+	unsigned long addr = (unsigned long)start;
+	unsigned long end = addr + size;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	for (; addr < end; addr += PAGE_SIZE) {
+		pgd = cpu_area_pgd_populate(addr, flags, node);
+		if (!pgd)
+			return -ENOMEM;
+		pud = cpu_area_pud_populate(pgd, addr, flags, node);
+		if (!pud)
+			return -ENOMEM;
+		pmd = cpu_area_pmd_populate(pud, addr, flags, node);
+		if (!pmd)
+			return -ENOMEM;
+		pte = cpu_area_pte_populate(pmd, addr, flags, node);
+		if (!pte)
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/*
+ * If no other population function is defined then this function will stand
+ * in and provide the capability to map PAGE_SIZE pages into the cpu area.
+ */
+int __attribute__((weak)) cpu_area_populate(void *start, unsigned long size,
+					gfp_t flags, int node)
+{
+	return cpu_area_populate_basepages(start, size, flags, node);
+}
+
+/*
+ * Extend the areas on all processors. This function may be called repeatedly
+ * until we have enough space to accommodate a newly allocated object.
+ *
+ * Must hold the cpu_alloc_map_lock on entry. Will drop the lock and then
+ * regain it.
+ */
+static int expand_cpu_area(gfp_t flags)
+{
+	unsigned long blocks = active_blocks;
+	unsigned long bits;
+	int cpu;
+	int err = -ENOMEM;
+	int map_order;
+	unsigned long *new_map = NULL;
+	void *start;
+
+	if (active_blocks == MAX_BLOCKS)
+		goto out;
+
+	spin_unlock(&cpu_alloc_map_lock);
+
+	/*
+	 * Determine the size of the bit map needed
+	 */
+	bits = (blocks + 1) * UNITS_PER_BLOCK;
+	map_order = get_order(DIV_ROUND_UP(bits, 8));
+	start = cpu_area + \
+		(blocks << (PAGE_SHIFT + CONFIG_CPU_AREA_ALLOC_ORDER));
+
+	for_each_possible_cpu(cpu) {
+		err = cpu_area_populate(CPU_PTR(start, cpu), ALLOC_SIZE,
+			flags, cpu_to_node(cpu));
+
+		if (err) {
+			spin_lock(&cpu_alloc_map_lock);
+			goto out;
+		}
+	}
+
+	if (map_order > cpu_alloc_map_order) {
+		new_map = cpu_area_alloc_block(PAGE_SIZE << map_order,
+						flags | __GFP_ZERO, 0);
+		if (!new_map)
+			goto out;
+	}
+
+	spin_lock(&cpu_alloc_map_lock);
+
+	/*
+	 * We dropped the lock. Another processor may have already extended
+	 * the cpu area size as needed.
+	 */
+	if (blocks != active_blocks) {
+		if (new_map)
+			free_pages((unsigned long)new_map,
+						map_order);
+		err = 0;
+		goto out;
+	}
+
+	if (new_map) {
+		/*
+		 * Need to extend the bitmap
+		 */
+		if (cpu_alloc_map)
+			memcpy(new_map, cpu_alloc_map,
+				PAGE_SIZE << cpu_alloc_map_order);
+		cpu_alloc_map = new_map;
+		cpu_alloc_map_order = map_order;
+	}
+
+	active_blocks++;
+	units_total += UNITS_PER_BLOCK;
+	units_free += UNITS_PER_BLOCK;
+	err = 0;
+out:
+	return err;
+}
+
+#else
+
+/*
+ * Static fallback configuration. The cpu areas are of a fixed size and
+ * cannot be extended. Such configurations are mainly useful on
+ * machines that do not have MMU support.
+ */
+#define MAX_BLOCKS 1
+#define ALLOC_SIZE (1UL << (CONFIG_CPU_AREA_ORDER + PAGE_SHIFT))
+
+static u8 cpu_area[NR_CPUS * ALLOC_SIZE];
+static DECLARE_BITMAP(cpu_alloc_map, UNITS_PER_BLOCK);
+static int units_free = UNITS_PER_BLOCK;
+#define cpu_alloc_map_order CONFIG_CPU_AREA_ORDER
+#define units_total UNITS_PER_BLOCK
+
+static inline int expand_cpu_area(gfp_t flags)
+{
+	return -ENOSYS;
+}
+#endif
+
+static int first_free;		/* First known free unit */
+
+/*
+ * How many units are needed for an object of a given size
+ */
+static int size_to_units(unsigned long size)
+{
+	return DIV_ROUND_UP(size, UNIT_SIZE);
+}
+
+/*
+ * Mark an object as used in the cpu_alloc_map
+ *
+ * Must hold cpu_alloc_map_lock
+ */
+static void set_map(int start, int length)
+{
+	while (length-- > 0)
+		__set_bit(start++, cpu_alloc_map);
+}
+
+/*
+ * Mark an area as freed.
+ *
+ * Must hold cpu_alloc_map_lock
+ */
+static void clear_map(int start, int length)
+{
+	while (length-- > 0)
+		__clear_bit(start++, cpu_alloc_map);
+}
+
+/*
+ * Allocate an object of a certain size
+ *
+ * Returns a special pointer that can be used with CPU_PTR to find the
+ * address of the object for a certain cpu.
+ */
+void *cpu_alloc(unsigned long size, gfp_t gfpflags, unsigned long align)
+{
+	unsigned long start;
+	int units = size_to_units(size);
+	void *ptr;
+	int first;
+	unsigned long map_size;
+
+	BUG_ON(gfpflags & ~(GFP_RECLAIM_MASK | __GFP_ZERO));
+
+	spin_lock(&cpu_alloc_map_lock);
+
+restart:
+	map_size = PAGE_SIZE << cpu_alloc_map_order;
+	first = 1;
+	start = first_free;
+
+	for ( ; ; ) {
+
+		start = find_next_zero_bit(cpu_alloc_map, map_size, start);
+		if (first)
+			first_free = start;
+
+		if (start >= units_total) {
+			if (expand_cpu_area(gfpflags))
+				goto out_of_memory;
+			goto restart;
+		}
+
+		/*
+		 * Check alignment and that there is enough space after
+		 * the starting unit.
+		 */
+		if (start % (align / UNIT_SIZE) == 0 &&
+			find_next_bit(cpu_alloc_map, map_size, start + 1)
+							>= start + units)
+				break;
+		start++;
+		first = 0;
+	}
+
+	if (first)
+		first_free = start + units;
+
+	while (start + units > units_total) {
+		if (expand_cpu_area(gfpflags))
+			goto out_of_memory;
+	}
+
+	set_map(start, units);
+	units_free -= units;
+	__count_vm_events(CPU_BYTES, units * UNIT_SIZE);
+
+	spin_unlock(&cpu_alloc_map_lock);
+
+	ptr = cpu_area + start * UNIT_SIZE;
+
+	if (gfpflags & __GFP_ZERO) {
+		int cpu;
+
+		for_each_possible_cpu(cpu)
+			memset(CPU_PTR(ptr, cpu), 0, size);
+	}
+
+	return ptr;
+
+out_of_memory:
+	spin_unlock(&cpu_alloc_map_lock);
+	return NULL;
+}
+EXPORT_SYMBOL(cpu_alloc);
+
+/*
+ * Free an object. The pointer must be a cpu pointer allocated
+ * via cpu_alloc.
+ */
+void cpu_free(void *start, unsigned long size)
+{
+	int units = size_to_units(size);
+	int index;
+	u8 *p = start;
+
+	BUG_ON(p < cpu_area);
+	index = (p - cpu_area) / UNIT_SIZE;
+	BUG_ON(!test_bit(index, cpu_alloc_map) ||
+			index >= units_total);
+
+	spin_lock(&cpu_alloc_map_lock);
+
+	clear_map(index, units);
+	units_free += units;
+	__count_vm_events(CPU_BYTES, -units * UNIT_SIZE);
+	if (index < first_free)
+		first_free = index;
+
+	spin_unlock(&cpu_alloc_map_lock);
+}
+EXPORT_SYMBOL(cpu_free);
+
+
Index: linux-2.6/include/linux/vmstat.h
===================================================================
--- linux-2.6.orig/include/linux/vmstat.h	2007-11-06 06:05:02.000000000 +0000
+++ linux-2.6/include/linux/vmstat.h	2007-11-06 06:05:06.000000000 +0000
@@ -36,7 +36,7 @@
 		FOR_ALL_ZONES(PGSCAN_KSWAPD),
 		FOR_ALL_ZONES(PGSCAN_DIRECT),
 		PGINODESTEAL, SLABS_SCANNED, KSWAPD_STEAL, KSWAPD_INODESTEAL,
-		PAGEOUTRUN, ALLOCSTALL, PGROTATED,
+		PAGEOUTRUN, ALLOCSTALL, PGROTATED, CPU_BYTES,
 		NR_VM_EVENT_ITEMS
 };
 
Index: linux-2.6/mm/Makefile
===================================================================
--- linux-2.6.orig/mm/Makefile	2007-11-06 06:05:02.000000000 +0000
+++ linux-2.6/mm/Makefile	2007-11-06 06:05:06.000000000 +0000
@@ -11,7 +11,7 @@
 			   page_alloc.o page-writeback.o pdflush.o \
 			   readahead.o swap.o truncate.o vmscan.o \
 			   prio_tree.o util.o mmzone.o vmstat.o backing-dev.o \
-			   page_isolation.o $(mmu-y)
+			   page_isolation.o cpu_alloc.o $(mmu-y)
 
 obj-$(CONFIG_BOUNCE)	+= bounce.o
 obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o thrash.o
@@ -30,4 +30,3 @@
 obj-$(CONFIG_MIGRATION) += migrate.o
 obj-$(CONFIG_SMP) += allocpercpu.o
 obj-$(CONFIG_QUICKLIST) += quicklist.o
-
Index: linux-2.6/mm/vmstat.c
===================================================================
--- linux-2.6.orig/mm/vmstat.c	2007-11-06 06:05:02.000000000 +0000
+++ linux-2.6/mm/vmstat.c	2007-11-06 06:05:06.000000000 +0000
@@ -642,6 +642,7 @@
 	"allocstall",
 
 	"pgrotated",
+	"cpu_bytes",
 #endif
 };
 
Index: linux-2.6/include/linux/cpu_alloc.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/include/linux/cpu_alloc.h	2007-11-06 06:05:06.000000000 +0000
@@ -0,0 +1,56 @@
+/*
+ * include/linux/cpu_alloc.h - cpu allocator definitions
+ *
+ * The cpu allocator allows allocating an array of objects on all processors.
+ * A single pointer can then be used to access the instance of the object
+ * on a particular processor.
+ *
+ * Cpu objects are typically small. The allocator packs them tightly
+ * to increase the chance on each access that a per cpu object is already
+ * cached. Alignments may be specified but the intent is to align the data
+ * properly due to cpu alignment constraints and not to avoid cacheline
+ * contention. Any holes left by aligning objects are filled up with smaller
+ * objects that are allocated later.
+ *
+ * Cpu data can be allocated using CPU_ALLOC. The resulting pointer is
+ * pointing to the instance of the variable on cpu 0. It is generally an
+ * error to use the pointer directly unless we are running on cpu 0. So
+ * the use is valid during boot for example.
+ *
+ * The GFP flags have their usual function: __GFP_ZERO zeroes the object
+ * and other flags may be used to control reclaim behavior if the cpu
+ * areas have to be extended. However, zones cannot be selected nor
+ * can locality constraint flags be used.
+ *
+ * CPU_PTR() may be used to calculate the pointer for a specific processor.
+ * CPU_PTR is highly scalable since it simply adds the shifted value of
+ * smp_processor_id() to the base.
+ *
+ * Note: Synchronization is up to caller. If preemption is disabled then
+ * it is generally safe to access cpu variables (unless they are also
+ * handled from an interrupt context).
+ */
+#ifndef _LINUX_CPU_ALLOC_H_
+#define _LINUX_CPU_ALLOC_H_
+
+#define CPU_OFFSET(__cpu) \
+	((unsigned long)(__cpu) << (CONFIG_CPU_AREA_ORDER + PAGE_SHIFT))
+
+#define CPU_PTR(__p, __cpu) ((__typeof__(__p))((void *)(__p) + \
+							CPU_OFFSET(__cpu)))
+
+#define CPU_ALLOC(type, flags)	cpu_alloc(sizeof(type), flags, \
+					__alignof__(type))
+#define CPU_FREE(pointer)	cpu_free(pointer, sizeof(*(pointer)))
+
+#define THIS_CPU(__p)	CPU_PTR(__p, smp_processor_id())
+#define __THIS_CPU(__p)	CPU_PTR(__p, raw_smp_processor_id())
+
+/*
+ * Raw calls
+ */
+void *cpu_alloc(unsigned long size, gfp_t gfp, unsigned long align);
+void cpu_free(void *cpu_pointer, unsigned long size);
+
+#endif /* _LINUX_CPU_ALLOC_H_ */
+
Index: linux-2.6/mm/Kconfig
===================================================================
--- linux-2.6.orig/mm/Kconfig	2007-11-06 06:05:02.000000000 +0000
+++ linux-2.6/mm/Kconfig	2007-11-06 06:06:01.000000000 +0000
@@ -194,3 +194,15 @@
 config VIRT_TO_BUS
 	def_bool y
 	depends on !ARCH_NO_VIRT_TO_BUS
+
+config CPU_AREA_ORDER
+	int "Maximum order of CPU area"
+	default "16" if CPU_AREA_VIRTUAL
+	default "4" if !CPU_AREA_VIRTUAL
+	help
+	  Sets the maximum amount of memory that can be allocated via cpu_alloc.
+	  The size is set in page order. The size set (times the maximum
+	  number of processors) determines the amount of virtual memory that
+	  is set aside for the per cpu areas.
+
+
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h	2007-11-06 06:05:02.000000000 +0000
+++ linux-2.6/include/linux/mm.h	2007-11-06 06:05:06.000000000 +0000
@@ -1141,5 +1141,18 @@
 						unsigned long pages, int node);
 int vmemmap_populate(struct page *start_page, unsigned long pages, int node);
 
+pgd_t *cpu_area_pgd_populate(unsigned long addr, gfp_t flags, int node);
+pud_t *cpu_area_pud_populate(pgd_t *pgd, unsigned long addr,
+						gfp_t flags, int node);
+pmd_t *cpu_area_pmd_populate(pud_t *pud, unsigned long addr,
+						gfp_t flags, int node);
+pte_t *cpu_area_pte_populate(pmd_t *pmd, unsigned long addr,
+						gfp_t flags, int node);
+void *cpu_area_alloc_block(unsigned long size, gfp_t flags, int node);
+int cpu_area_populate_basepages(void *start, unsigned long size,
+						gfp_t flags, int node);
+int cpu_area_populate(void *start, unsigned long size,
+						gfp_t flags, int node);
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_MM_H */

-- 


* [patch 02/28] cpu alloc: x86_64 support
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
  2007-11-06 19:51 ` [patch 01/28] cpu alloc: The allocator Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 03/28] cpu alloc: IA64 support Christoph Lameter
                   ` (26 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_x86_64 --]
[-- Type: text/plain, Size: 4073 bytes --]

Set up a cpu area that allows the use of up to 256MB for each processor.

Cpu memory use can grow rapidly. F.e. if we assume that a pageset
occupies 64 bytes of memory and we have 3 zones per node, then 1k nodes
with 16k cpus need 3 * 1k * 16k = 50 million pagesets, or 3072 pagesets
per processor. This results in a total of 3.2 GB of pagesets. So each cpu
needs around 200k of cpu storage for the page allocator alone.

For the UP and SMP case map the area using 4k ptes. Typical use of per cpu
data is around 16k for UP and SMP configurations. Allocating a 2M
segment would be overkill. There is enough reserve around to run
a few 100000 ip tunnels if one wants to, and lots of other frills.

For NUMA map the area using 2M PMDs. A large NUMA system may use
lots of cpu data for the page allocator data alone. We typically
have large amounts of memory around. Using a 2M page size reduces
TLB pressure.

Some numbers for envisioned maximum configurations of NUMA systems:

4k cpu configurations with 256 nodes:

	4096 * 256MB = 1TB of virtual space.

Maximum theoretical configuration 16384 processors 1k nodes:

	16384 * 256MB = 4TB of virtual space.

Both fit within the established limits.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 arch/x86/Kconfig.x86_64      |   13 +++++++++++++
 arch/x86/mm/init_64.c        |   39 +++++++++++++++++++++++++++++++++++++++
 include/asm-x86/pgtable_64.h |    1 +
 3 files changed, 53 insertions(+)

Index: linux-2.6/arch/x86/mm/init_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init_64.c	2007-11-04 09:45:12.000000000 -0800
+++ linux-2.6/arch/x86/mm/init_64.c	2007-11-04 16:34:05.000000000 -0800
@@ -28,6 +28,7 @@
 #include <linux/module.h>
 #include <linux/memory_hotplug.h>
 #include <linux/nmi.h>
+#include <linux/cpu_alloc.h>
 
 #include <asm/processor.h>
 #include <asm/system.h>
@@ -781,3 +782,41 @@ int __meminit vmemmap_populate(struct pa
 	return 0;
 }
 #endif
+
+#ifdef CONFIG_NUMA
+int __meminit cpu_area_populate(void *start, unsigned long size,
+						gfp_t flags, int node)
+{
+	unsigned long addr = (unsigned long)start;
+	unsigned long end = addr + size;
+	unsigned long next;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	for (; addr < end; addr = next) {
+		next = pmd_addr_end(addr, end);
+
+		pgd = cpu_area_pgd_populate(addr, flags, node);
+		if (!pgd)
+			return -ENOMEM;
+		pud = cpu_area_pud_populate(pgd, addr, flags, node);
+		if (!pud)
+			return -ENOMEM;
+
+		pmd = pmd_offset(pud, addr);
+		if (pmd_none(*pmd)) {
+			pte_t entry;
+			void *p = cpu_area_alloc_block(PMD_SIZE, flags, node);
+			if (!p)
+				return -ENOMEM;
+
+			entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
+			mk_pte_huge(entry);
+			set_pmd(pmd, __pmd(pte_val(entry)));
+		}
+	}
+
+	return 0;
+}
+#endif
Index: linux-2.6/arch/x86/Kconfig.x86_64
===================================================================
--- linux-2.6.orig/arch/x86/Kconfig.x86_64	2007-11-04 09:45:12.000000000 -0800
+++ linux-2.6/arch/x86/Kconfig.x86_64	2007-11-04 16:16:06.000000000 -0800
@@ -137,6 +137,19 @@ config ARCH_HAS_ILOG2_U64
 	bool
 	default n
 
+config CPU_AREA_VIRTUAL
+	bool
+	default y
+
+config CPU_AREA_ORDER
+	int
+	default "16"
+
+config CPU_AREA_ALLOC_ORDER
+	int
+	default "9" if NUMA
+	default "0" if !NUMA
+
 source "init/Kconfig"
 
 
Index: linux-2.6/include/asm-x86/pgtable_64.h
===================================================================
--- linux-2.6.orig/include/asm-x86/pgtable_64.h	2007-11-04 09:45:12.000000000 -0800
+++ linux-2.6/include/asm-x86/pgtable_64.h	2007-11-04 16:16:06.000000000 -0800
@@ -138,6 +138,7 @@ static inline pte_t ptep_get_and_clear_f
 #define VMALLOC_START    _AC(0xffffc20000000000, UL)
 #define VMALLOC_END      _AC(0xffffe1ffffffffff, UL)
 #define VMEMMAP_START	 _AC(0xffffe20000000000, UL)
+#define CPU_AREA_BASE	 _AC(0xfffff20000000000, UL)
 #define MODULES_VADDR    _AC(0xffffffff88000000, UL)
 #define MODULES_END      _AC(0xfffffffffff00000, UL)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)

-- 


* [patch 03/28] cpu alloc: IA64 support
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
  2007-11-06 19:51 ` [patch 01/28] cpu alloc: The allocator Christoph Lameter
  2007-11-06 19:51 ` [patch 02/28] cpu alloc: x86_64 support Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 04/28] cpu alloc: i386 support Christoph Lameter
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_ia64 --]
[-- Type: text/plain, Size: 3292 bytes --]

Typical use of per cpu memory for a small system (8G, 8p, 4 nodes) is less
than 64k of per cpu memory. This increases rapidly for larger systems, where
we can get up to 512k or 1M of memory used for cpu storage.

The maximum size of the cpu area is 128MB of memory.

The cpu area is placed in region 5 with the kernel, vmemmap and vmalloc areas.
So with this version we are limited to PAGE_SIZEd mappings. The cpu area and the
vmemmap area could use a large TLB size.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 arch/ia64/Kconfig          |   19 +++++++++++++++++++
 include/asm-ia64/pgtable.h |   27 +++++++++++++++++++++------
 2 files changed, 40 insertions(+), 6 deletions(-)

Index: linux-2.6/include/asm-ia64/pgtable.h
===================================================================
--- linux-2.6.orig/include/asm-ia64/pgtable.h	2007-11-06 06:11:58.000000000 +0000
+++ linux-2.6/include/asm-ia64/pgtable.h	2007-11-06 06:28:37.000000000 +0000
@@ -224,21 +224,41 @@
  */
 
 
+/*
+ * Layout of RGN_GATE
+ *
+ * 47 bits wide (16kb pages)
+ *
+ * 0xa000000000000000-0xa000002000000000	8G	Kernel data area
+ * 0xa000002000000000-0xa000400000000000	64T	vmalloc
+ * 0xa000400000000000-0xa000600000000000	32T	vmemmap
+ * 0xa000600000000000-0xa000800000000000	32T	cpu area
+ *
+ * 55 bits wide (64kb pages)
+ *
+ * 0xa000000000000000-0xa000002000000000	8G	Kernel data area
+ * 0xa000002000000000-0xa040000000000000	16P	vmalloc
+ * 0xa040000000000000-0xa060000000000000	8P	vmemmap
+ * 0xa060000000000000-0xa080000000000000	8P	cpu area
+ */
+
 #define VMALLOC_START		(RGN_BASE(RGN_GATE) + 0x200000000UL)
+#define VMALLOC_END_INIT	(RGN_BASE(RGN_GATE) + (1UL << (4*PAGE_SHIFT - 10)))
+
 #ifdef CONFIG_VIRTUAL_MEM_MAP
-# define VMALLOC_END_INIT	(RGN_BASE(RGN_GATE) + (1UL << (4*PAGE_SHIFT - 9)))
 # define VMALLOC_END		vmalloc_end
   extern unsigned long vmalloc_end;
 #else
+# define VMALLOC_END VMALLOC_END_INIT
+#endif
+
 #if defined(CONFIG_SPARSEMEM) && defined(CONFIG_SPARSEMEM_VMEMMAP)
 /* SPARSEMEM_VMEMMAP uses half of vmalloc... */
-# define VMALLOC_END		(RGN_BASE(RGN_GATE) + (1UL << (4*PAGE_SHIFT - 10)))
-# define vmemmap		((struct page *)VMALLOC_END)
-#else
-# define VMALLOC_END		(RGN_BASE(RGN_GATE) + (1UL << (4*PAGE_SHIFT - 9)))
-#endif
+# define vmemmap		((struct page *)VMALLOC_END_INIT)
 #endif
 
+#define CPU_AREA_BASE		(RGN_BASE(RGN_GATE) + (3UL << (4*PAGE_SHIFT - 11)))
+
 /* fs/proc/kcore.c */
 #define	kc_vaddr_to_offset(v) ((v) - RGN_BASE(RGN_GATE))
 #define	kc_offset_to_vaddr(o) ((o) + RGN_BASE(RGN_GATE))
Index: linux-2.6/arch/ia64/Kconfig
===================================================================
--- linux-2.6.orig/arch/ia64/Kconfig	2007-11-06 06:11:57.000000000 +0000
+++ linux-2.6/arch/ia64/Kconfig	2007-11-06 06:12:27.000000000 +0000
@@ -99,6 +99,25 @@
 	bool
 	default y
 
+config CPU_AREA_VIRTUAL
+	bool
+	default y
+
+# Maximum of 128 MB cpu_alloc space per cpu
+config CPU_AREA_ORDER
+	int
+	default "13"
+
+#
+# Expand the area in PAGE_SIZE steps. We could be using
+# huge pages if we would move this into region 5. But with the move
+# to 64k we will likely be fine here. 64k is more than enough
+# for most configurations.
+#
+config CPU_AREA_ALLOC_ORDER
+	int
+	default "0"
+
 choice
 	prompt "System type"
 	default IA64_GENERIC

-- 


* [patch 04/28] cpu alloc: i386 support
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (2 preceding siblings ...)
  2007-11-06 19:51 ` [patch 03/28] cpu alloc: IA64 support Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 05/28] cpu alloc: Use in SLUB Christoph Lameter
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_i386 --]
[-- Type: text/plain, Size: 2861 bytes --]

Set up a 256 kB area for each processor's cpu area below the FIXADDR area.

The use of the cpu alloc area is pretty minimal on i386. An 8p system
with no extras uses only ~8kb. So 256kb should be plenty. A configuration
that supports up to 8 processors takes up 2MB of the scarce
virtual address space.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 arch/x86/Kconfig.i386        |   12 ++++++++++++
 arch/x86/mm/init_32.c        |    3 +++
 include/asm-x86/pgtable_32.h |    7 +++++--
 3 files changed, 20 insertions(+), 2 deletions(-)

Index: linux-2.6/arch/x86/Kconfig.i386
===================================================================
--- linux-2.6.orig/arch/x86/Kconfig.i386	2007-11-06 05:05:19.000000000 +0000
+++ linux-2.6/arch/x86/Kconfig.i386	2007-11-06 05:26:39.000000000 +0000
@@ -95,6 +95,18 @@
 	bool
 	default y
 
+config CPU_AREA_VIRTUAL
+	bool
+	default y
+
+config CPU_AREA_ORDER
+	int
+	default "6"
+
+config CPU_AREA_ALLOC_ORDER
+	int
+	default "0"
+
 source "init/Kconfig"
 
 menu "Processor type and features"
Index: linux-2.6/arch/x86/mm/init_32.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init_32.c	2007-11-06 05:01:23.000000000 +0000
+++ linux-2.6/arch/x86/mm/init_32.c	2007-11-06 05:07:33.000000000 +0000
@@ -674,6 +674,7 @@
 #if 1 /* double-sanity-check paranoia */
 	printk("virtual kernel memory layout:\n"
 	       "    fixmap  : 0x%08lx - 0x%08lx   (%4ld kB)\n"
+	       "    cpu area: 0x%08lx - 0x%08lx   (%4ld kb)\n"
 #ifdef CONFIG_HIGHMEM
 	       "    pkmap   : 0x%08lx - 0x%08lx   (%4ld kB)\n"
 #endif
@@ -684,6 +685,8 @@
 	       "      .text : 0x%08lx - 0x%08lx   (%4ld kB)\n",
 	       FIXADDR_START, FIXADDR_TOP,
 	       (FIXADDR_TOP - FIXADDR_START) >> 10,
+	       CPU_AREA_BASE, FIXADDR_START,
+	       (FIXADDR_START - CPU_AREA_BASE) >> 10,
 
 #ifdef CONFIG_HIGHMEM
 	       PKMAP_BASE, PKMAP_BASE+LAST_PKMAP*PAGE_SIZE,
Index: linux-2.6/include/asm-x86/pgtable_32.h
===================================================================
--- linux-2.6.orig/include/asm-x86/pgtable_32.h	2007-11-06 05:01:23.000000000 +0000
+++ linux-2.6/include/asm-x86/pgtable_32.h	2007-11-06 05:07:33.000000000 +0000
@@ -79,11 +79,14 @@
 #define VMALLOC_START	(((unsigned long) high_memory + \
 			2*VMALLOC_OFFSET-1) & ~(VMALLOC_OFFSET-1))
 #ifdef CONFIG_HIGHMEM
-# define VMALLOC_END	(PKMAP_BASE-2*PAGE_SIZE)
+# define CPU_AREA_BASE	(PKMAP_BASE - NR_CPUS * \
+				(1 << (CONFIG_CPU_AREA_ORDER + PAGE_SHIFT)))
 #else
-# define VMALLOC_END	(FIXADDR_START-2*PAGE_SIZE)
+# define CPU_AREA_BASE	(FIXADDR_START - NR_CPUS * \
+				(1 << (CONFIG_CPU_AREA_ORDER + PAGE_SHIFT)))
 #endif
 
+#define VMALLOC_END	(CPU_AREA_BASE - 2 * PAGE_SIZE)
 /*
  * _PAGE_PSE set in the page directory entry just means that
  * the page directory entry points directly to a 4MB-aligned block of

-- 


* [patch 05/28] cpu alloc: Use in SLUB
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (3 preceding siblings ...)
  2007-11-06 19:51 ` [patch 04/28] cpu alloc: i386 support Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 06/28] cpu alloc: Remove SLUB fields Christoph Lameter
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_slub --]
[-- Type: text/plain, Size: 11114 bytes --]

Using cpu alloc removes the need for the per cpu arrays in the kmem_cache
struct. These could get quite big if we have to support systems with up to
thousands of cpus. The use of cpu alloc means that:

1. The size of kmem_cache for SMP configuration shrinks since we will only
   need 1 pointer instead of NR_CPUS. The same pointer can be used by all
   processors. Reduces cache footprint of the allocator.

2. We can dynamically size kmem_cache according to the actual nodes in the
   system meaning less memory overhead for configurations that may potentially
   support up to 1k NUMA nodes.

3. We can remove the fiddling with allocating and releasing kmem_cache_cpu
   structures when bringing up and shutting down cpus. The cpu alloc
   logic does it all for us.

4. Fastpath performance increases by another 20% vs. the earlier improvements.
   Instead of a fastpath taking 45-50 cycles it is now possible to get
   below 40.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
---
 include/linux/slub_def.h |    6 -
 mm/slub.c                |  183 ++++++-----------------------------------------
 2 files changed, 26 insertions(+), 163 deletions(-)

Index: linux-2.6/include/linux/slub_def.h
===================================================================
--- linux-2.6.orig/include/linux/slub_def.h	2007-11-05 22:37:13.000000000 -0800
+++ linux-2.6/include/linux/slub_def.h	2007-11-05 22:41:15.000000000 -0800
@@ -34,6 +34,7 @@ struct kmem_cache_node {
  * Slab cache management.
  */
 struct kmem_cache {
+	struct kmem_cache_cpu *cpu_slab;
 	/* Used for retriving partial slabs etc */
 	unsigned long flags;
 	int size;		/* The size of an object including meta data */
@@ -63,11 +64,6 @@ struct kmem_cache {
 	int defrag_ratio;
 	struct kmem_cache_node *node[MAX_NUMNODES];
 #endif
-#ifdef CONFIG_SMP
-	struct kmem_cache_cpu *cpu_slab[NR_CPUS];
-#else
-	struct kmem_cache_cpu cpu_slab;
-#endif
 };
 
 /*
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2007-11-05 22:39:00.000000000 -0800
+++ linux-2.6/mm/slub.c	2007-11-05 22:41:15.000000000 -0800
@@ -21,6 +21,7 @@
 #include <linux/ctype.h>
 #include <linux/kallsyms.h>
 #include <linux/memory.h>
+#include <linux/cpu_alloc.h>
 
 /*
  * Lock order:
@@ -239,15 +240,6 @@ static inline struct kmem_cache_node *ge
 #endif
 }
 
-static inline struct kmem_cache_cpu *get_cpu_slab(struct kmem_cache *s, int cpu)
-{
-#ifdef CONFIG_SMP
-	return s->cpu_slab[cpu];
-#else
-	return &s->cpu_slab;
-#endif
-}
-
 /*
  * The end pointer in a slab is special. It points to the first object in the
  * slab but has bit 0 set to mark it.
@@ -1471,7 +1463,7 @@ static inline void flush_slab(struct kme
  */
 static inline void __flush_cpu_slab(struct kmem_cache *s, int cpu)
 {
-	struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
+	struct kmem_cache_cpu *c = CPU_PTR(s->cpu_slab, cpu);
 
 	if (likely(c && c->page))
 		flush_slab(s, c);
@@ -1486,15 +1478,7 @@ static void flush_cpu_slab(void *d)
 
 static void flush_all(struct kmem_cache *s)
 {
-#ifdef CONFIG_SMP
 	on_each_cpu(flush_cpu_slab, s, 1, 1);
-#else
-	unsigned long flags;
-
-	local_irq_save(flags);
-	flush_cpu_slab(s);
-	local_irq_restore(flags);
-#endif
 }
 
 /*
@@ -1528,7 +1512,7 @@ static noinline unsigned long get_new_sl
 	if (!page)
 		return 0;
 
-	*pc = c = get_cpu_slab(s, smp_processor_id());
+	*pc = c = THIS_CPU(s->cpu_slab);
 	if (c->page)
 		flush_slab(s, c);
 	c->page = page;
@@ -1640,25 +1624,26 @@ static void __always_inline *slab_alloc(
 	struct kmem_cache_cpu *c;
 
 #ifdef CONFIG_FAST_CMPXCHG_LOCAL
-	c = get_cpu_slab(s, get_cpu());
+	preempt_disable();
+	c = THIS_CPU(s->cpu_slab);
 	do {
 		object = c->freelist;
 		if (unlikely(is_end(object) || !node_match(c, node))) {
 			object = __slab_alloc(s, gfpflags, node, addr, c);
 			if (unlikely(!object)) {
-				put_cpu();
+				preempt_enable();
 				goto out;
 			}
 			break;
 		}
 	} while (cmpxchg_local(&c->freelist, object, object[c->offset])
 								!= object);
-	put_cpu();
+	preempt_enable();
 #else
 	unsigned long flags;
 
 	local_irq_save(flags);
-	c = get_cpu_slab(s, smp_processor_id());
+	c = THIS_CPU(s->cpu_slab);
 	if (unlikely((is_end(c->freelist)) || !node_match(c, node))) {
 
 		object = __slab_alloc(s, gfpflags, node, addr, c);
@@ -1783,7 +1768,8 @@ static void __always_inline slab_free(st
 #ifdef CONFIG_FAST_CMPXCHG_LOCAL
 	void **freelist;
 
-	c = get_cpu_slab(s, get_cpu());
+	preempt_disable();
+	c = THIS_CPU(s->cpu_slab);
 	debug_check_no_locks_freed(object, s->objsize);
 	do {
 		freelist = c->freelist;
@@ -1805,13 +1791,13 @@ static void __always_inline slab_free(st
 		}
 		object[c->offset] = freelist;
 	} while (cmpxchg_local(&c->freelist, freelist, object) != freelist);
-	put_cpu();
+	preempt_enable();
 #else
 	unsigned long flags;
 
 	local_irq_save(flags);
 	debug_check_no_locks_freed(object, s->objsize);
-	c = get_cpu_slab(s, smp_processor_id());
+	c = THIS_CPU(s->cpu_slab);
 	if (likely(page == c->page && c->node >= 0)) {
 		object[c->offset] = c->freelist;
 		c->freelist = object;
@@ -2014,130 +2000,19 @@ static void init_kmem_cache_node(struct 
 #endif
 }
 
-#ifdef CONFIG_SMP
-/*
- * Per cpu array for per cpu structures.
- *
- * The per cpu array places all kmem_cache_cpu structures from one processor
- * close together meaning that it becomes possible that multiple per cpu
- * structures are contained in one cacheline. This may be particularly
- * beneficial for the kmalloc caches.
- *
- * A desktop system typically has around 60-80 slabs. With 100 here we are
- * likely able to get per cpu structures for all caches from the array defined
- * here. We must be able to cover all kmalloc caches during bootstrap.
- *
- * If the per cpu array is exhausted then fall back to kmalloc
- * of individual cachelines. No sharing is possible then.
- */
-#define NR_KMEM_CACHE_CPU 100
-
-static DEFINE_PER_CPU(struct kmem_cache_cpu,
-				kmem_cache_cpu)[NR_KMEM_CACHE_CPU];
-
-static DEFINE_PER_CPU(struct kmem_cache_cpu *, kmem_cache_cpu_free);
-static cpumask_t kmem_cach_cpu_free_init_once = CPU_MASK_NONE;
-
-static struct kmem_cache_cpu *alloc_kmem_cache_cpu(struct kmem_cache *s,
-							int cpu, gfp_t flags)
-{
-	struct kmem_cache_cpu *c = per_cpu(kmem_cache_cpu_free, cpu);
-
-	if (c)
-		per_cpu(kmem_cache_cpu_free, cpu) =
-				(void *)c->freelist;
-	else {
-		/* Table overflow: So allocate ourselves */
-		c = kmalloc_node(
-			ALIGN(sizeof(struct kmem_cache_cpu), cache_line_size()),
-			flags, cpu_to_node(cpu));
-		if (!c)
-			return NULL;
-	}
-
-	init_kmem_cache_cpu(s, c);
-	return c;
-}
-
-static void free_kmem_cache_cpu(struct kmem_cache_cpu *c, int cpu)
-{
-	if (c < per_cpu(kmem_cache_cpu, cpu) ||
-			c > per_cpu(kmem_cache_cpu, cpu) + NR_KMEM_CACHE_CPU) {
-		kfree(c);
-		return;
-	}
-	c->freelist = (void *)per_cpu(kmem_cache_cpu_free, cpu);
-	per_cpu(kmem_cache_cpu_free, cpu) = c;
-}
-
-static void free_kmem_cache_cpus(struct kmem_cache *s)
-{
-	int cpu;
-
-	for_each_online_cpu(cpu) {
-		struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
-
-		if (c) {
-			s->cpu_slab[cpu] = NULL;
-			free_kmem_cache_cpu(c, cpu);
-		}
-	}
-}
-
 static int alloc_kmem_cache_cpus(struct kmem_cache *s, gfp_t flags)
 {
 	int cpu;
 
-	for_each_online_cpu(cpu) {
-		struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
-
-		if (c)
-			continue;
-
-		c = alloc_kmem_cache_cpu(s, cpu, flags);
-		if (!c) {
-			free_kmem_cache_cpus(s);
-			return 0;
-		}
-		s->cpu_slab[cpu] = c;
-	}
-	return 1;
-}
-
-/*
- * Initialize the per cpu array.
- */
-static void init_alloc_cpu_cpu(int cpu)
-{
-	int i;
+	s->cpu_slab = CPU_ALLOC(struct kmem_cache_cpu, flags);
 
-	if (cpu_isset(cpu, kmem_cach_cpu_free_init_once))
-		return;
-
-	for (i = NR_KMEM_CACHE_CPU - 1; i >= 0; i--)
-		free_kmem_cache_cpu(&per_cpu(kmem_cache_cpu, cpu)[i], cpu);
-
-	cpu_set(cpu, kmem_cach_cpu_free_init_once);
-}
-
-static void __init init_alloc_cpu(void)
-{
-	int cpu;
+	if (!s->cpu_slab)
+		return 0;
 
 	for_each_online_cpu(cpu)
-		init_alloc_cpu_cpu(cpu);
-  }
-
-#else
-static inline void free_kmem_cache_cpus(struct kmem_cache *s) {}
-static inline void init_alloc_cpu(void) {}
-
-static inline int alloc_kmem_cache_cpus(struct kmem_cache *s, gfp_t flags)
-{
-	init_kmem_cache_cpu(s, &s->cpu_slab);
+		init_kmem_cache_cpu(s, CPU_PTR(s->cpu_slab, cpu));
 	return 1;
 }
-#endif
 
 #ifdef CONFIG_NUMA
 /*
@@ -2451,9 +2326,8 @@ static inline int kmem_cache_close(struc
 	int node;
 
 	flush_all(s);
-
+	CPU_FREE(s->cpu_slab);
 	/* Attempt to free all objects */
-	free_kmem_cache_cpus(s);
 	for_each_node_state(node, N_NORMAL_MEMORY) {
 		struct kmem_cache_node *n = get_node(s, node);
 
@@ -2957,8 +2831,6 @@ void __init kmem_cache_init(void)
 	int i;
 	int caches = 0;
 
-	init_alloc_cpu();
-
 #ifdef CONFIG_NUMA
 	/*
 	 * Must first have the slab cache available for the allocations of the
@@ -3018,11 +2890,12 @@ void __init kmem_cache_init(void)
 	for (i = KMALLOC_SHIFT_LOW; i < PAGE_SHIFT; i++)
 		kmalloc_caches[i]. name =
 			kasprintf(GFP_KERNEL, "kmalloc-%d", 1 << i);
-
 #ifdef CONFIG_SMP
 	register_cpu_notifier(&slab_notifier);
-	kmem_size = offsetof(struct kmem_cache, cpu_slab) +
-				nr_cpu_ids * sizeof(struct kmem_cache_cpu *);
+#endif
+#ifdef CONFIG_NUMA
+	kmem_size = offsetof(struct kmem_cache, node) +
+				nr_node_ids * sizeof(struct kmem_cache_node *);
 #else
 	kmem_size = sizeof(struct kmem_cache);
 #endif
@@ -3119,7 +2992,7 @@ struct kmem_cache *kmem_cache_create(con
 		 * per cpu structures
 		 */
 		for_each_online_cpu(cpu)
-			get_cpu_slab(s, cpu)->objsize = s->objsize;
+			CPU_PTR(s->cpu_slab, cpu)->objsize = s->objsize;
 		s->inuse = max_t(int, s->inuse, ALIGN(size, sizeof(void *)));
 		up_write(&slub_lock);
 		if (sysfs_slab_alias(s, name))
@@ -3164,11 +3037,9 @@ static int __cpuinit slab_cpuup_callback
 	switch (action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
-		init_alloc_cpu_cpu(cpu);
 		down_read(&slub_lock);
 		list_for_each_entry(s, &slab_caches, list)
-			s->cpu_slab[cpu] = alloc_kmem_cache_cpu(s, cpu,
-							GFP_KERNEL);
+			init_kmem_cache_cpu(s, CPU_PTR(s->cpu_slab, cpu));
 		up_read(&slub_lock);
 		break;
 
@@ -3178,13 +3049,9 @@ static int __cpuinit slab_cpuup_callback
 	case CPU_DEAD_FROZEN:
 		down_read(&slub_lock);
 		list_for_each_entry(s, &slab_caches, list) {
-			struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
-
 			local_irq_save(flags);
 			__flush_cpu_slab(s, cpu);
 			local_irq_restore(flags);
-			free_kmem_cache_cpu(c, cpu);
-			s->cpu_slab[cpu] = NULL;
 		}
 		up_read(&slub_lock);
 		break;
@@ -3656,7 +3523,7 @@ static unsigned long slab_objects(struct
 	for_each_possible_cpu(cpu) {
 		struct page *page;
 		int node;
-		struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
+		struct kmem_cache_cpu *c = CPU_PTR(s->cpu_slab, cpu);
 
 		if (!c)
 			continue;
@@ -3723,7 +3590,7 @@ static int any_slab_objects(struct kmem_
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
-		struct kmem_cache_cpu *c = get_cpu_slab(s, cpu);
+		struct kmem_cache_cpu *c = CPU_PTR(s->cpu_slab, cpu);
 
 		if (c && c->page)
 			return 1;

-- 


* [patch 06/28] cpu alloc: Remove SLUB fields
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (4 preceding siblings ...)
  2007-11-06 19:51 ` [patch 05/28] cpu alloc: Use in SLUB Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 07/28] cpu alloc: page allocator conversion Christoph Lameter
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_slub_remove_cpu_parameter --]
[-- Type: text/plain, Size: 5883 bytes --]

Remove the fields in kmem_cache_cpu that were used to cache data from
kmem_cache when the two were in different cachelines. The cacheline that
holds the per cpu pointer now also holds these values. We can cut the size
of kmem_cache_cpu almost in half.

The get_freepointer() and set_freepointer() functions that used to be
intended only for the slow path are now also useful for the hot path, since
access to the field no longer requires an additional cacheline fetch. This
results in consistent handling of the free pointer for objects throughout
SLUB.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
---
 include/linux/slub_def.h |    3 --
 mm/slub.c                |   50 +++++++++++++++--------------------------------
 2 files changed, 17 insertions(+), 36 deletions(-)

Index: linux-2.6/include/linux/slub_def.h
===================================================================
--- linux-2.6.orig/include/linux/slub_def.h	2007-11-05 22:41:15.000000000 -0800
+++ linux-2.6/include/linux/slub_def.h	2007-11-05 22:41:24.000000000 -0800
@@ -15,9 +15,6 @@ struct kmem_cache_cpu {
 	void **freelist;
 	struct page *page;
 	int node;
-	unsigned int offset;
-	unsigned int objsize;
-	unsigned int objects;
 };
 
 struct kmem_cache_node {
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2007-11-05 22:41:15.000000000 -0800
+++ linux-2.6/mm/slub.c	2007-11-05 22:42:28.000000000 -0800
@@ -274,13 +274,6 @@ static inline int check_valid_pointer(st
 	return 1;
 }
 
-/*
- * Slow version of get and set free pointer.
- *
- * This version requires touching the cache lines of kmem_cache which
- * we avoid to do in the fast alloc free paths. There we obtain the offset
- * from the page struct.
- */
 static inline void *get_freepointer(struct kmem_cache *s, void *object)
 {
 	return *(void **)(object + s->offset);
@@ -1438,10 +1431,10 @@ static void deactivate_slab(struct kmem_
 
 		/* Retrieve object from cpu_freelist */
 		object = c->freelist;
-		c->freelist = c->freelist[c->offset];
+		c->freelist = get_freepointer(s, c->freelist);
 
 		/* And put onto the regular freelist */
-		object[c->offset] = page->freelist;
+		set_freepointer(s, object, page->freelist);
 		page->freelist = object;
 		page->inuse--;
 	}
@@ -1573,8 +1566,8 @@ load_freelist:
 		goto debug;
 
 	object = c->page->freelist;
-	c->freelist = object[c->offset];
-	c->page->inuse = c->objects;
+	c->freelist = get_freepointer(s, object);
+	c->page->inuse = s->objects;
 	c->page->freelist = c->page->end;
 	c->node = page_to_nid(c->page);
 unlock_out:
@@ -1602,7 +1595,7 @@ debug:
 		goto another_slab;
 
 	c->page->inuse++;
-	c->page->freelist = object[c->offset];
+	c->page->freelist = get_freepointer(s, object);
 	c->node = -1;
 	goto unlock_out;
 }
@@ -1636,8 +1629,8 @@ static void __always_inline *slab_alloc(
 			}
 			break;
 		}
-	} while (cmpxchg_local(&c->freelist, object, object[c->offset])
-								!= object);
+	} while (cmpxchg_local(&c->freelist, object,
+			get_freepointer(s, object)) != object);
 	preempt_enable();
 #else
 	unsigned long flags;
@@ -1653,13 +1646,13 @@ static void __always_inline *slab_alloc(
 		}
 	} else {
 		object = c->freelist;
-		c->freelist = object[c->offset];
+		c->freelist = get_freepointer(s, object);
 	}
 	local_irq_restore(flags);
 #endif
 
 	if (unlikely((gfpflags & __GFP_ZERO)))
-		memset(object, 0, c->objsize);
+		memset(object, 0, s->objsize);
 out:
 	return object;
 }
@@ -1687,7 +1680,7 @@ EXPORT_SYMBOL(kmem_cache_alloc_node);
  * handling required then we can return immediately.
  */
 static void __slab_free(struct kmem_cache *s, struct page *page,
-				void *x, void *addr, unsigned int offset)
+				void *x, void *addr)
 {
 	void *prior;
 	void **object = (void *)x;
@@ -1703,7 +1696,8 @@ static void __slab_free(struct kmem_cach
 	if (unlikely(state & SLABDEBUG))
 		goto debug;
 checks_ok:
-	prior = object[offset] = page->freelist;
+	prior = page->freelist;
+	set_freepointer(s, object, prior);
 	page->freelist = object;
 	page->inuse--;
 
@@ -1786,10 +1780,10 @@ static void __always_inline slab_free(st
 		 * since the freelist pointers are unique per slab.
 		 */
 		if (unlikely(page != c->page || c->node < 0)) {
-			__slab_free(s, page, x, addr, c->offset);
+			__slab_free(s, page, x, addr);
 			break;
 		}
-		object[c->offset] = freelist;
+		set_freepointer(s, object, freelist);
 	} while (cmpxchg_local(&c->freelist, freelist, object) != freelist);
 	preempt_enable();
 #else
@@ -1799,10 +1793,10 @@ static void __always_inline slab_free(st
 	debug_check_no_locks_freed(object, s->objsize);
 	c = THIS_CPU(s->cpu_slab);
 	if (likely(page == c->page && c->node >= 0)) {
-		object[c->offset] = c->freelist;
+		set_freepointer(s, object, c->freelist);
 		c->freelist = object;
 	} else
-		__slab_free(s, page, x, addr, c->offset);
+		__slab_free(s, page, x, addr);
 
 	local_irq_restore(flags);
 #endif
@@ -1984,9 +1978,6 @@ static void init_kmem_cache_cpu(struct k
 	c->page = NULL;
 	c->freelist = (void *)PAGE_MAPPING_ANON;
 	c->node = 0;
-	c->offset = s->offset / sizeof(void *);
-	c->objsize = s->objsize;
-	c->objects = s->objects;
 }
 
 static void init_kmem_cache_node(struct kmem_cache_node *n)
@@ -2978,21 +2969,14 @@ struct kmem_cache *kmem_cache_create(con
 	down_write(&slub_lock);
 	s = find_mergeable(size, align, flags, name, ctor);
 	if (s) {
-		int cpu;
-
 		s->refcount++;
+
 		/*
 		 * Adjust the object sizes so that we clear
 		 * the complete object on kzalloc.
 		 */
 		s->objsize = max(s->objsize, (int)size);
 
-		/*
-		 * And then we need to update the object size in the
-		 * per cpu structures
-		 */
-		for_each_online_cpu(cpu)
-			CPU_PTR(s->cpu_slab, cpu)->objsize = s->objsize;
 		s->inuse = max_t(int, s->inuse, ALIGN(size, sizeof(void *)));
 		up_write(&slub_lock);
 		if (sysfs_slab_alias(s, name))

-- 


* [patch 07/28] cpu alloc: page allocator conversion
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (5 preceding siblings ...)
  2007-11-06 19:51 ` [patch 06/28] cpu alloc: Remove SLUB fields Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 08/28] cpu alloc: percpu_counter conversion Christoph Lameter
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_page_alloc --]
[-- Type: text/plain, Size: 13484 bytes --]

Use the new cpu_alloc functionality to avoid per cpu arrays in struct zone.
This drastically reduces the size of struct zone for systems with a large
number of processors and allows critical variables of struct zone to be
placed in one cacheline even on very large systems.

Another effect is that the pagesets of one processor are placed near one
another. If multiple pagesets from different zones fit into one cacheline,
additional cacheline fetches can be avoided on the hot paths when
allocating memory from multiple zones.

Surprisingly, this clears up much of the painful NUMA bringup. Bootstrap
becomes simpler because the same scheme is used for UP, SMP and NUMA.
#ifdefs are reduced and the zone_pcp macro can be dropped.

Hotplug handling is also simplified since cpu alloc can bring up and shut
down the cpu area of a specific cpu as a whole, so there is no need to
allocate or free individual pagesets.
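
As a minimal sketch (illustration only, not part of the patch; the
CPU_ALLOC/CPU_PTR/THIS_CPU/CPU_FREE interface comes from the cpu
allocator patch at the head of this series), the conversion boils down to:

	struct per_cpu_pages *pcp;
	struct per_cpu_pageset *pset;

	/* One pointer into the cpu areas replaces the NR_CPUS array */
	zone->pageset = CPU_ALLOC(struct per_cpu_pageset,
					GFP_KERNEL | __GFP_ZERO);

	/* Hot path: pageset of the executing cpu by address calculation */
	pcp = &THIS_CPU(zone->pageset)->pcp[cold];

	/* Slow path and statistics: pageset of an arbitrary cpu */
	pset = CPU_PTR(zone->pageset, cpu);

	/* Teardown releases the object in all cpu areas at once */
	CPU_FREE(zone->pageset);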

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/mm.h     |    4 -
 include/linux/mmzone.h |   12 ---
 mm/page_alloc.c        |  162 +++++++++++++++++++------------------------------
 mm/vmstat.c            |   15 ++--
 4 files changed, 74 insertions(+), 119 deletions(-)

Index: linux-2.6/include/linux/mmzone.h
===================================================================
--- linux-2.6.orig/include/linux/mmzone.h	2007-11-05 18:46:55.654284629 -0800
+++ linux-2.6/include/linux/mmzone.h	2007-11-05 19:37:52.171534088 -0800
@@ -121,13 +121,7 @@ struct per_cpu_pageset {
 	s8 stat_threshold;
 	s8 vm_stat_diff[NR_VM_ZONE_STAT_ITEMS];
 #endif
-} ____cacheline_aligned_in_smp;
-
-#ifdef CONFIG_NUMA
-#define zone_pcp(__z, __cpu) ((__z)->pageset[(__cpu)])
-#else
-#define zone_pcp(__z, __cpu) (&(__z)->pageset[(__cpu)])
-#endif
+};
 
 enum zone_type {
 #ifdef CONFIG_ZONE_DMA
@@ -231,10 +225,8 @@ struct zone {
 	 */
 	unsigned long		min_unmapped_pages;
 	unsigned long		min_slab_pages;
-	struct per_cpu_pageset	*pageset[NR_CPUS];
-#else
-	struct per_cpu_pageset	pageset[NR_CPUS];
 #endif
+	struct per_cpu_pageset	*pageset;
 	/*
 	 * free areas of different sizes
 	 */
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c	2007-11-05 18:46:55.670285090 -0800
+++ linux-2.6/mm/page_alloc.c	2007-11-05 19:37:52.171534088 -0800
@@ -43,6 +43,7 @@
 #include <linux/backing-dev.h>
 #include <linux/fault-inject.h>
 #include <linux/page-isolation.h>
+#include <linux/cpu_alloc.h>
 
 #include <asm/tlbflush.h>
 #include <asm/div64.h>
@@ -912,7 +913,7 @@ static void __drain_pages(unsigned int c
 		if (!populated_zone(zone))
 			continue;
 
-		pset = zone_pcp(zone, cpu);
+		pset = CPU_PTR(zone->pageset, cpu);
 		for (i = 0; i < ARRAY_SIZE(pset->pcp); i++) {
 			struct per_cpu_pages *pcp;
 
@@ -1011,8 +1012,8 @@ static void fastcall free_hot_cold_page(
 	arch_free_page(page, 0);
 	kernel_map_pages(page, 1, 0);
 
-	pcp = &zone_pcp(zone, get_cpu())->pcp[cold];
 	local_irq_save(flags);
+	pcp = &THIS_CPU(zone->pageset)->pcp[cold];
 	__count_vm_event(PGFREE);
 	list_add(&page->lru, &pcp->list);
 	set_page_private(page, get_pageblock_migratetype(page));
@@ -1022,7 +1023,6 @@ static void fastcall free_hot_cold_page(
 		pcp->count -= pcp->batch;
 	}
 	local_irq_restore(flags);
-	put_cpu();
 }
 
 void fastcall free_hot_page(struct page *page)
@@ -1064,16 +1064,14 @@ static struct page *buffered_rmqueue(str
 	unsigned long flags;
 	struct page *page;
 	int cold = !!(gfp_flags & __GFP_COLD);
-	int cpu;
 	int migratetype = allocflags_to_migratetype(gfp_flags);
 
 again:
-	cpu  = get_cpu();
 	if (likely(order == 0)) {
 		struct per_cpu_pages *pcp;
 
-		pcp = &zone_pcp(zone, cpu)->pcp[cold];
 		local_irq_save(flags);
+		pcp = &THIS_CPU(zone->pageset)->pcp[cold];
 		if (!pcp->count) {
 			pcp->count = rmqueue_bulk(zone, 0,
 					pcp->batch, &pcp->list, migratetype);
@@ -1106,7 +1104,6 @@ again:
 	__count_zone_vm_events(PGALLOC, zone, 1 << order);
 	zone_statistics(zonelist, zone);
 	local_irq_restore(flags);
-	put_cpu();
 
 	VM_BUG_ON(bad_range(zone, page));
 	if (prep_new_page(page, order, gfp_flags))
@@ -1115,7 +1112,6 @@ again:
 
 failed:
 	local_irq_restore(flags);
-	put_cpu();
 	return NULL;
 }
 
@@ -1809,7 +1805,7 @@ void show_free_areas(void)
 		for_each_online_cpu(cpu) {
 			struct per_cpu_pageset *pageset;
 
-			pageset = zone_pcp(zone, cpu);
+			pageset = CPU_PTR(zone->pageset, cpu);
 
 			printk("CPU %4d: Hot: hi:%5d, btch:%4d usd:%4d   "
 			       "Cold: hi:%5d, btch:%4d usd:%4d\n",
@@ -2644,82 +2640,33 @@ static void setup_pagelist_highmark(stru
 		pcp->batch = PAGE_SHIFT * 8;
 }
 
-
-#ifdef CONFIG_NUMA
 /*
- * Boot pageset table. One per cpu which is going to be used for all
- * zones and all nodes. The parameters will be set in such a way
- * that an item put on a list will immediately be handed over to
- * the buddy list. This is safe since pageset manipulation is done
- * with interrupts disabled.
- *
- * Some NUMA counter updates may also be caught by the boot pagesets.
- *
- * The boot_pagesets must be kept even after bootup is complete for
- * unused processors and/or zones. They do play a role for bootstrapping
- * hotplugged processors.
- *
- * zoneinfo_show() and maybe other functions do
- * not check if the processor is online before following the pageset pointer.
- * Other parts of the kernel may not check if the zone is available.
+ * Dynamically allocate memory for the per cpu pageset array in struct zone.
  */
-static struct per_cpu_pageset boot_pageset[NR_CPUS];
-
-/*
- * Dynamically allocate memory for the
- * per cpu pageset array in struct zone.
- */
-static int __cpuinit process_zones(int cpu)
+static void __cpuinit process_zones(int cpu)
 {
-	struct zone *zone, *dzone;
+	struct zone *zone;
 	int node = cpu_to_node(cpu);
 
 	node_set_state(node, N_CPU);	/* this node has a cpu */
 
 	for_each_zone(zone) {
+		struct per_cpu_pageset *pcp =
+				CPU_PTR(zone->pageset, cpu);
 
 		if (!populated_zone(zone))
 			continue;
 
-		zone_pcp(zone, cpu) = kmalloc_node(sizeof(struct per_cpu_pageset),
-					 GFP_KERNEL, node);
-		if (!zone_pcp(zone, cpu))
-			goto bad;
-
-		setup_pageset(zone_pcp(zone, cpu), zone_batchsize(zone));
+		setup_pageset(pcp, zone_batchsize(zone));
 
 		if (percpu_pagelist_fraction)
-			setup_pagelist_highmark(zone_pcp(zone, cpu),
-			 	(zone->present_pages / percpu_pagelist_fraction));
-	}
-
-	return 0;
-bad:
-	for_each_zone(dzone) {
-		if (!populated_zone(dzone))
-			continue;
-		if (dzone == zone)
-			break;
-		kfree(zone_pcp(dzone, cpu));
-		zone_pcp(dzone, cpu) = NULL;
-	}
-	return -ENOMEM;
-}
+			setup_pagelist_highmark(pcp, zone->present_pages /
+						percpu_pagelist_fraction);
 
-static inline void free_zone_pagesets(int cpu)
-{
-	struct zone *zone;
-
-	for_each_zone(zone) {
-		struct per_cpu_pageset *pset = zone_pcp(zone, cpu);
-
-		/* Free per_cpu_pageset if it is slab allocated */
-		if (pset != &boot_pageset[cpu])
-			kfree(pset);
-		zone_pcp(zone, cpu) = NULL;
 	}
 }
 
+#ifdef CONFIG_SMP
 static int __cpuinit pageset_cpuup_callback(struct notifier_block *nfb,
 		unsigned long action,
 		void *hcpu)
@@ -2730,14 +2677,7 @@ static int __cpuinit pageset_cpuup_callb
 	switch (action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
-		if (process_zones(cpu))
-			ret = NOTIFY_BAD;
-		break;
-	case CPU_UP_CANCELED:
-	case CPU_UP_CANCELED_FROZEN:
-	case CPU_DEAD:
-	case CPU_DEAD_FROZEN:
-		free_zone_pagesets(cpu);
+		process_zones(cpu);
 		break;
 	default:
 		break;
@@ -2747,21 +2687,34 @@ static int __cpuinit pageset_cpuup_callb
 
 static struct notifier_block __cpuinitdata pageset_notifier =
 	{ &pageset_cpuup_callback, NULL, 0 };
+#endif
 
 void __init setup_per_cpu_pageset(void)
 {
-	int err;
-
-	/* Initialize per_cpu_pageset for cpu 0.
+	/*
+	 * Initialize per_cpu settings for the boot cpu.
 	 * A cpuup callback will do this for every cpu
-	 * as it comes online
+	 * as it comes online.
+	 *
+	 * This is also initializing the cpu areas for the
+	 * pagesets.
 	 */
-	err = process_zones(smp_processor_id());
-	BUG_ON(err);
-	register_cpu_notifier(&pageset_notifier);
-}
+	struct zone *zone;
 
+	for_each_zone(zone) {
+
+		if (!populated_zone(zone))
+			continue;
+
+		zone->pageset = CPU_ALLOC(struct per_cpu_pageset,
+					GFP_KERNEL|__GFP_ZERO);
+		BUG_ON(!zone->pageset);
+	}
+	process_zones(smp_processor_id());
+#ifdef CONFIG_SMP
+	register_cpu_notifier(&pageset_notifier);
 #endif
+}
 
 static noinline __init_refok
 int zone_wait_table_init(struct zone *zone, unsigned long zone_size_pages)
@@ -2808,21 +2761,30 @@ int zone_wait_table_init(struct zone *zo
 
 static __meminit void zone_pcp_init(struct zone *zone)
 {
-	int cpu;
-	unsigned long batch = zone_batchsize(zone);
+	static struct per_cpu_pageset boot_pageset;
 
-	for (cpu = 0; cpu < NR_CPUS; cpu++) {
-#ifdef CONFIG_NUMA
-		/* Early boot. Slab allocator not functional yet */
-		zone_pcp(zone, cpu) = &boot_pageset[cpu];
-		setup_pageset(&boot_pageset[cpu],0);
-#else
-		setup_pageset(zone_pcp(zone,cpu), batch);
-#endif
-	}
+	/*
+	 * Fake a cpu_alloc pointer that can take the required
+	 * offset to get to the boot pageset. This is only
+	 * needed for the boot pageset while bootstrapping
+	 * the new zone. In the course of zone bootstrap
+	 * setup_per_cpu_pageset() will do the proper CPU_ALLOC and
+	 * set things up the right way.
+	 *
+	 * Deferral allows CPU_ALLOC() to use the boot pageset
+	 * to allocate the initial memory to get going and then provide
+	 * the proper memory when called from setup_per_cpu_pageset() to
+	 * install the proper pagesets.
+	 *
+	 * Deferral also allows slab allocators to perform their
+	 * initialization without resorting to bootmem.
+	 */
+	zone->pageset = &boot_pageset - CPU_OFFSET(smp_processor_id());
+	setup_pageset(&boot_pageset, 0);
 	if (zone->present_pages)
-		printk(KERN_DEBUG "  %s zone: %lu pages, LIFO batch:%lu\n",
-			zone->name, zone->present_pages, batch);
+		printk(KERN_DEBUG "  %s zone: %lu pages, LIFO batch:%u\n",
+			zone->name, zone->present_pages,
+			zone_batchsize(zone));
 }
 
 __meminit int init_currently_empty_zone(struct zone *zone,
@@ -4237,11 +4199,13 @@ int percpu_pagelist_fraction_sysctl_hand
 	ret = proc_dointvec_minmax(table, write, file, buffer, length, ppos);
 	if (!write || (ret == -EINVAL))
 		return ret;
-	for_each_zone(zone) {
-		for_each_online_cpu(cpu) {
+	for_each_online_cpu(cpu) {
+		for_each_zone(zone) {
 			unsigned long  high;
+
 			high = zone->present_pages / percpu_pagelist_fraction;
-			setup_pagelist_highmark(zone_pcp(zone, cpu), high);
+			setup_pagelist_highmark(CPU_PTR(zone->pageset, cpu),
+									high);
 		}
 	}
 	return 0;
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h	2007-11-05 19:37:48.142034448 -0800
+++ linux-2.6/include/linux/mm.h	2007-11-05 19:37:52.171534088 -0800
@@ -931,11 +931,7 @@ extern void show_mem(void);
 extern void si_meminfo(struct sysinfo * val);
 extern void si_meminfo_node(struct sysinfo *val, int nid);
 
-#ifdef CONFIG_NUMA
 extern void setup_per_cpu_pageset(void);
-#else
-static inline void setup_per_cpu_pageset(void) {}
-#endif
 
 /* prio_tree.c */
 void vma_prio_tree_add(struct vm_area_struct *, struct vm_area_struct *old);
Index: linux-2.6/mm/vmstat.c
===================================================================
--- linux-2.6.orig/mm/vmstat.c	2007-11-05 19:37:48.142034448 -0800
+++ linux-2.6/mm/vmstat.c	2007-11-05 19:39:08.322033649 -0800
@@ -14,6 +14,7 @@
 #include <linux/module.h>
 #include <linux/cpu.h>
 #include <linux/sched.h>
+#include <linux/cpu_alloc.h>
 
 #ifdef CONFIG_VM_EVENT_COUNTERS
 DEFINE_PER_CPU(struct vm_event_state, vm_event_states) = {{0}};
@@ -147,7 +148,8 @@ static void refresh_zone_stat_thresholds
 		threshold = calculate_threshold(zone);
 
 		for_each_online_cpu(cpu)
-			zone_pcp(zone, cpu)->stat_threshold = threshold;
+			CPU_PTR(zone->pageset, cpu)->stat_threshold
+							= threshold;
 	}
 }
 
@@ -157,7 +159,8 @@ static void refresh_zone_stat_thresholds
 void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
 				int delta)
 {
-	struct per_cpu_pageset *pcp = zone_pcp(zone, smp_processor_id());
+	struct per_cpu_pageset *pcp = THIS_CPU(zone->pageset);
+
 	s8 *p = pcp->vm_stat_diff + item;
 	long x;
 
@@ -210,7 +213,7 @@ EXPORT_SYMBOL(mod_zone_page_state);
  */
 void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
 {
-	struct per_cpu_pageset *pcp = zone_pcp(zone, smp_processor_id());
+	struct per_cpu_pageset *pcp = THIS_CPU(zone->pageset);
 	s8 *p = pcp->vm_stat_diff + item;
 
 	(*p)++;
@@ -231,7 +234,7 @@ EXPORT_SYMBOL(__inc_zone_page_state);
 
 void __dec_zone_state(struct zone *zone, enum zone_stat_item item)
 {
-	struct per_cpu_pageset *pcp = zone_pcp(zone, smp_processor_id());
+	struct per_cpu_pageset *pcp = THIS_CPU(zone->pageset);
 	s8 *p = pcp->vm_stat_diff + item;
 
 	(*p)--;
@@ -307,7 +310,7 @@ void refresh_cpu_vm_stats(int cpu)
 		if (!populated_zone(zone))
 			continue;
 
-		p = zone_pcp(zone, cpu);
+		p = CPU_PTR(zone->pageset, cpu);
 
 		for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
 			if (p->vm_stat_diff[i]) {
@@ -684,7 +687,7 @@ static void zoneinfo_show_print(struct s
 		struct per_cpu_pageset *pageset;
 		int j;
 
-		pageset = zone_pcp(zone, i);
+		pageset = CPU_PTR(zone->pageset, i);
 		for (j = 0; j < ARRAY_SIZE(pageset->pcp); j++) {
 			seq_printf(m,
 				   "\n    cpu: %i pcp: %i"

-- 


* [patch 08/28] cpu alloc: percpu_counter conversion
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (6 preceding siblings ...)
  2007-11-06 19:51 ` [patch 07/28] cpu alloc: page allocator conversion Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 09/28] cpu alloc: crash_notes conversion Christoph Lameter
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_percpu_counters --]
[-- Type: text/plain, Size: 2734 bytes --]

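The conversion is mechanical: the s32 counters move from alloc_percpu()
into the per cpu areas, so every update dereferences CPU_PTR() into a
compact linear region instead of chasing a per cpu pointer array. A usage
sketch (illustration only, not part of the patch; assumes the stock
percpu_counter interface):

	static struct percpu_counter events;

	percpu_counter_init(&events, 0);  /* counters = CPU_ALLOC(s32, ...) */
	percpu_counter_add(&events, 1);   /* adds to *CPU_PTR(counters, cpu) */
	percpu_counter_destroy(&events);  /* CPU_FREE(counters) */
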
Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/percpu_counter.h |    1 -
 lib/percpu_counter.c           |   13 +++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)

Index: linux-2.6/include/linux/percpu_counter.h
===================================================================
--- linux-2.6.orig/include/linux/percpu_counter.h	2007-11-04 13:18:33.000000000 -0800
+++ linux-2.6/include/linux/percpu_counter.h	2007-11-04 13:18:47.000000000 -0800
@@ -10,7 +10,6 @@
 #include <linux/smp.h>
 #include <linux/list.h>
 #include <linux/threads.h>
-#include <linux/percpu.h>
 #include <linux/types.h>
 
 #ifdef CONFIG_SMP
Index: linux-2.6/lib/percpu_counter.c
===================================================================
--- linux-2.6.orig/lib/percpu_counter.c	2007-11-04 13:18:33.000000000 -0800
+++ linux-2.6/lib/percpu_counter.c	2007-11-04 16:43:52.000000000 -0800
@@ -8,6 +8,7 @@
 #include <linux/init.h>
 #include <linux/cpu.h>
 #include <linux/module.h>
+#include <linux/cpu_alloc.h>
 
 #ifdef CONFIG_HOTPLUG_CPU
 static LIST_HEAD(percpu_counters);
@@ -20,7 +21,7 @@ void percpu_counter_set(struct percpu_co
 
 	spin_lock(&fbc->lock);
 	for_each_possible_cpu(cpu) {
-		s32 *pcount = per_cpu_ptr(fbc->counters, cpu);
+		s32 *pcount = CPU_PTR(fbc->counters, cpu);
 		*pcount = 0;
 	}
 	fbc->count = amount;
@@ -34,7 +35,7 @@ void __percpu_counter_add(struct percpu_
 	s32 *pcount;
 	int cpu = get_cpu();
 
-	pcount = per_cpu_ptr(fbc->counters, cpu);
+	pcount = CPU_PTR(fbc->counters, cpu);
 	count = *pcount + amount;
 	if (count >= batch || count <= -batch) {
 		spin_lock(&fbc->lock);
@@ -60,7 +61,7 @@ s64 __percpu_counter_sum(struct percpu_c
 	spin_lock(&fbc->lock);
 	ret = fbc->count;
 	for_each_online_cpu(cpu) {
-		s32 *pcount = per_cpu_ptr(fbc->counters, cpu);
+		s32 *pcount = CPU_PTR(fbc->counters, cpu);
 		ret += *pcount;
 	}
 	spin_unlock(&fbc->lock);
@@ -74,7 +75,7 @@ int percpu_counter_init(struct percpu_co
 {
 	spin_lock_init(&fbc->lock);
 	fbc->count = amount;
-	fbc->counters = alloc_percpu(s32);
+	fbc->counters = CPU_ALLOC(s32, GFP_KERNEL|__GFP_ZERO);
 	if (!fbc->counters)
 		return -ENOMEM;
 #ifdef CONFIG_HOTPLUG_CPU
@@ -101,7 +102,7 @@ void percpu_counter_destroy(struct percp
 	if (!fbc->counters)
 		return;
 
-	free_percpu(fbc->counters);
+	CPU_FREE(fbc->counters);
 #ifdef CONFIG_HOTPLUG_CPU
 	mutex_lock(&percpu_counters_lock);
 	list_del(&fbc->list);
@@ -127,7 +128,7 @@ static int __cpuinit percpu_counter_hotc
 		unsigned long flags;
 
 		spin_lock_irqsave(&fbc->lock, flags);
-		pcount = per_cpu_ptr(fbc->counters, cpu);
+		pcount = CPU_PTR(fbc->counters, cpu);
 		fbc->count += *pcount;
 		*pcount = 0;
 		spin_unlock_irqrestore(&fbc->lock, flags);

-- 


* [patch 09/28] cpu alloc: crash_notes conversion
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (7 preceding siblings ...)
  2007-11-06 19:51 ` [patch 08/28] cpu alloc: percpu_counter conversion Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 10/28] cpu alloc: workqueue conversion Christoph Lameter
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_kexec --]
[-- Type: text/plain, Size: 2669 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 arch/ia64/kernel/crash.c |    2 +-
 drivers/base/cpu.c       |    3 ++-
 kernel/kexec.c           |    5 +++--
 3 files changed, 6 insertions(+), 4 deletions(-)

Index: linux-2.6/kernel/kexec.c
===================================================================
--- linux-2.6.orig/kernel/kexec.c	2007-10-26 17:52:21.593476581 -0700
+++ linux-2.6/kernel/kexec.c	2007-11-05 11:32:23.422191793 -0800
@@ -24,6 +24,7 @@
 #include <linux/utsrelease.h>
 #include <linux/utsname.h>
 #include <linux/numa.h>
+#include <linux/cpu_alloc.h>
 
 #include <asm/page.h>
 #include <asm/uaccess.h>
@@ -1122,7 +1123,7 @@ void crash_save_cpu(struct pt_regs *regs
 	 * squirrelled away.  ELF notes happen to provide
 	 * all of that, so there is no need to invent something new.
 	 */
-	buf = (u32*)per_cpu_ptr(crash_notes, cpu);
+	buf = (u32*)CPU_PTR(crash_notes, cpu);
 	if (!buf)
 		return;
 	memset(&prstatus, 0, sizeof(prstatus));
@@ -1136,7 +1137,7 @@ void crash_save_cpu(struct pt_regs *regs
 static int __init crash_notes_memory_init(void)
 {
 	/* Allocate memory for saving cpu registers. */
-	crash_notes = alloc_percpu(note_buf_t);
+	crash_notes = CPU_ALLOC(note_buf_t, GFP_KERNEL|__GFP_ZERO);
 	if (!crash_notes) {
 		printk("Kexec: Memory allocation for saving cpu register"
 		" states failed\n");
Index: linux-2.6/drivers/base/cpu.c
===================================================================
--- linux-2.6.orig/drivers/base/cpu.c	2007-10-26 17:52:00.852693291 -0700
+++ linux-2.6/drivers/base/cpu.c	2007-11-05 11:32:50.005821748 -0800
@@ -10,6 +10,7 @@
 #include <linux/topology.h>
 #include <linux/device.h>
 #include <linux/node.h>
+#include <linux/cpu_alloc.h>
 
 #include "base.h"
 
@@ -95,7 +96,7 @@ static ssize_t show_crash_notes(struct s
 	 * boot up and this data does not change there after. Hence this
 	 * operation should be safe. No locking required.
 	 */
-	addr = __pa(per_cpu_ptr(crash_notes, cpunum));
+	addr = __pa(CPU_PTR(crash_notes, cpunum));
 	rc = sprintf(buf, "%Lx\n", addr);
 	return rc;
 }
Index: linux-2.6/arch/ia64/kernel/crash.c
===================================================================
--- linux-2.6.orig/arch/ia64/kernel/crash.c	2007-10-26 17:51:57.652573158 -0700
+++ linux-2.6/arch/ia64/kernel/crash.c	2007-11-05 11:26:27.323321824 -0800
@@ -71,7 +71,7 @@ crash_save_this_cpu(void)
 	dst[46] = (unsigned long)ia64_rse_skip_regs((unsigned long *)dst[46],
 			sof - sol);
 
-	buf = (u64 *) per_cpu_ptr(crash_notes, cpu);
+	buf = (u64 *) CPU_PTR(crash_notes, cpu);
 	if (!buf)
 		return;
 	buf = append_elf_note(buf, KEXEC_CORE_NOTE_NAME, NT_PRSTATUS, prstatus,

-- 


* [patch 10/28] cpu alloc: workqueue conversion
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (8 preceding siblings ...)
  2007-11-06 19:51 ` [patch 09/28] cpu alloc: crash_notes conversion Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 11/28] cpu alloc: ACPI cstate handling conversion Christoph Lameter
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_workqueue --]
[-- Type: text/plain, Size: 3629 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>


---
 kernel/workqueue.c |   28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

Index: linux-2.6/kernel/workqueue.c
===================================================================
--- linux-2.6.orig/kernel/workqueue.c	2007-11-04 13:10:18.000000000 -0800
+++ linux-2.6/kernel/workqueue.c	2007-11-04 13:14:22.000000000 -0800
@@ -33,6 +33,7 @@
 #include <linux/kallsyms.h>
 #include <linux/debug_locks.h>
 #include <linux/lockdep.h>
+#include <linux/cpu_alloc.h>
 
 /*
  * The per-CPU workqueue (if single thread, we always use the first
@@ -100,7 +101,7 @@ struct cpu_workqueue_struct *wq_per_cpu(
 {
 	if (unlikely(is_single_threaded(wq)))
 		cpu = singlethread_cpu;
-	return per_cpu_ptr(wq->cpu_wq, cpu);
+	return CPU_PTR(wq->cpu_wq, cpu);
 }
 
 /*
@@ -398,7 +399,7 @@ void fastcall flush_workqueue(struct wor
 	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
 	lock_release(&wq->lockdep_map, 1, _THIS_IP_);
 	for_each_cpu_mask(cpu, *cpu_map)
-		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
+		flush_cpu_workqueue(CPU_PTR(wq->cpu_wq, cpu));
 }
 EXPORT_SYMBOL_GPL(flush_workqueue);
 
@@ -478,7 +479,7 @@ static void wait_on_work(struct work_str
 	cpu_map = wq_cpu_map(wq);
 
 	for_each_cpu_mask(cpu, *cpu_map)
-		wait_on_cpu_work(per_cpu_ptr(wq->cpu_wq, cpu), work);
+		wait_on_cpu_work(CPU_PTR(wq->cpu_wq, cpu), work);
 }
 
 static int __cancel_work_timer(struct work_struct *work,
@@ -601,21 +602,21 @@ int schedule_on_each_cpu(work_func_t fun
 	int cpu;
 	struct work_struct *works;
 
-	works = alloc_percpu(struct work_struct);
+	works = CPU_ALLOC(struct work_struct, GFP_KERNEL);
 	if (!works)
 		return -ENOMEM;
 
 	preempt_disable();		/* CPU hotplug */
 	for_each_online_cpu(cpu) {
-		struct work_struct *work = per_cpu_ptr(works, cpu);
+		struct work_struct *work = CPU_PTR(works, cpu);
 
 		INIT_WORK(work, func);
 		set_bit(WORK_STRUCT_PENDING, work_data_bits(work));
-		__queue_work(per_cpu_ptr(keventd_wq->cpu_wq, cpu), work);
+		__queue_work(CPU_PTR(keventd_wq->cpu_wq, cpu), work);
 	}
 	preempt_enable();
 	flush_workqueue(keventd_wq);
-	free_percpu(works);
+	CPU_FREE(works);
 	return 0;
 }
 
@@ -664,7 +665,7 @@ int current_is_keventd(void)
 
 	BUG_ON(!keventd_wq);
 
-	cwq = per_cpu_ptr(keventd_wq->cpu_wq, cpu);
+	cwq = CPU_PTR(keventd_wq->cpu_wq, cpu);
 	if (current == cwq->thread)
 		ret = 1;
 
@@ -675,7 +676,7 @@ int current_is_keventd(void)
 static struct cpu_workqueue_struct *
 init_cpu_workqueue(struct workqueue_struct *wq, int cpu)
 {
-	struct cpu_workqueue_struct *cwq = per_cpu_ptr(wq->cpu_wq, cpu);
+	struct cpu_workqueue_struct *cwq = CPU_PTR(wq->cpu_wq, cpu);
 
 	cwq->wq = wq;
 	spin_lock_init(&cwq->lock);
@@ -732,7 +733,8 @@ struct workqueue_struct *__create_workqu
 	if (!wq)
 		return NULL;
 
-	wq->cpu_wq = alloc_percpu(struct cpu_workqueue_struct);
+	wq->cpu_wq = CPU_ALLOC(struct cpu_workqueue_struct,
+					GFP_KERNEL|__GFP_ZERO);
 	if (!wq->cpu_wq) {
 		kfree(wq);
 		return NULL;
@@ -814,11 +816,11 @@ void destroy_workqueue(struct workqueue_
 	mutex_unlock(&workqueue_mutex);
 
 	for_each_cpu_mask(cpu, *cpu_map) {
-		cwq = per_cpu_ptr(wq->cpu_wq, cpu);
+		cwq = CPU_PTR(wq->cpu_wq, cpu);
 		cleanup_workqueue_thread(cwq, cpu);
 	}
 
-	free_percpu(wq->cpu_wq);
+	CPU_FREE(wq->cpu_wq);
 	kfree(wq);
 }
 EXPORT_SYMBOL_GPL(destroy_workqueue);
@@ -847,7 +849,7 @@ static int __devinit workqueue_cpu_callb
 	}
 
 	list_for_each_entry(wq, &workqueues, list) {
-		cwq = per_cpu_ptr(wq->cpu_wq, cpu);
+		cwq = CPU_PTR(wq->cpu_wq, cpu);
 
 		switch (action) {
 		case CPU_UP_PREPARE:

-- 


* [patch 11/28] cpu alloc: ACPI cstate handling conversion
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (9 preceding siblings ...)
  2007-11-06 19:51 ` [patch 10/28] cpu alloc: workqueue conversion Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 12/28] cpu alloc: genhd statistics conversion Christoph Lameter
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_acpi --]
[-- Type: text/plain, Size: 4062 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 arch/x86/kernel/acpi/cstate.c              |   10 ++++++----
 arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c |    8 +++++---
 drivers/acpi/processor_perflib.c           |    5 +++--
 3 files changed, 14 insertions(+), 9 deletions(-)

Index: linux-2.6/arch/x86/kernel/acpi/cstate.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/acpi/cstate.c	2007-11-05 22:46:10.000000000 -0800
+++ linux-2.6/arch/x86/kernel/acpi/cstate.c	2007-11-06 09:58:17.000000000 -0800
@@ -15,6 +15,7 @@
 
 #include <acpi/processor.h>
 #include <asm/acpi.h>
+#include <linux/cpu_alloc.h>
 
 /*
  * Initialize bm_flags based on the CPU cache properties
@@ -87,7 +88,7 @@ int acpi_processor_ffh_cstate_probe(unsi
 	if (reg->bit_offset != NATIVE_CSTATE_BEYOND_HALT)
 		return -1;
 
-	percpu_entry = per_cpu_ptr(cpu_cstate_entry, cpu);
+	percpu_entry = CPU_PTR(cpu_cstate_entry, cpu);
 	percpu_entry->states[cx->index].eax = 0;
 	percpu_entry->states[cx->index].ecx = 0;
 
@@ -138,7 +139,7 @@ void acpi_processor_ffh_cstate_enter(str
 	unsigned int cpu = smp_processor_id();
 	struct cstate_entry *percpu_entry;
 
-	percpu_entry = per_cpu_ptr(cpu_cstate_entry, cpu);
+	percpu_entry = CPU_PTR(cpu_cstate_entry, cpu);
 	mwait_idle_with_hints(percpu_entry->states[cx->index].eax,
 	                      percpu_entry->states[cx->index].ecx);
 }
@@ -150,13 +151,14 @@ static int __init ffh_cstate_init(void)
 	if (c->x86_vendor != X86_VENDOR_INTEL)
 		return -1;
 
-	cpu_cstate_entry = alloc_percpu(struct cstate_entry);
+	cpu_cstate_entry = CPU_ALLOC(struct cstate_entry,
+					GFP_KERNEL|__GFP_ZERO);
 	return 0;
 }
 
 static void __exit ffh_cstate_exit(void)
 {
-	free_percpu(cpu_cstate_entry);
+	CPU_FREE(cpu_cstate_entry);
 	cpu_cstate_entry = NULL;
 }
 
Index: linux-2.6/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c	2007-11-05 22:46:10.000000000 -0800
+++ linux-2.6/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c	2007-11-06 09:58:17.000000000 -0800
@@ -36,6 +36,7 @@
 
 #include <linux/acpi.h>
 #include <acpi/processor.h>
+#include <linux/cpu_alloc.h>
 
 #include <asm/io.h>
 #include <asm/msr.h>
@@ -513,7 +514,8 @@ static int __init acpi_cpufreq_early_ini
 {
 	dprintk("acpi_cpufreq_early_init\n");
 
-	acpi_perf_data = alloc_percpu(struct acpi_processor_performance);
+	acpi_perf_data = CPU_ALLOC(struct acpi_processor_performance,
+						GFP_KERNEL|__GFP_ZERO);
 	if (!acpi_perf_data) {
 		dprintk("Memory allocation error for acpi_perf_data.\n");
 		return -ENOMEM;
@@ -569,7 +571,7 @@ static int acpi_cpufreq_cpu_init(struct 
 	if (!data)
 		return -ENOMEM;
 
-	data->acpi_data = percpu_ptr(acpi_perf_data, cpu);
+	data->acpi_data = CPU_PTR(acpi_perf_data, cpu);
 	drv_data[cpu] = data;
 
 	if (cpu_has(c, X86_FEATURE_CONSTANT_TSC))
@@ -782,7 +784,7 @@ static void __exit acpi_cpufreq_exit(voi
 
 	cpufreq_unregister_driver(&acpi_cpufreq_driver);
 
-	free_percpu(acpi_perf_data);
+	CPU_FREE(acpi_perf_data);
 
 	return;
 }
Index: linux-2.6/drivers/acpi/processor_perflib.c
===================================================================
--- linux-2.6.orig/drivers/acpi/processor_perflib.c	2007-11-05 22:46:10.000000000 -0800
+++ linux-2.6/drivers/acpi/processor_perflib.c	2007-11-06 09:58:46.000000000 -0800
@@ -30,6 +30,7 @@
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/cpufreq.h>
+#include <linux/cpu_alloc.h>
 
 #ifdef CONFIG_X86_ACPI_CPUFREQ_PROC_INTF
 #include <linux/proc_fs.h>
@@ -567,12 +568,12 @@ int acpi_processor_preregister_performan
 			continue;
 		}
 
-		if (!performance || !percpu_ptr(performance, i)) {
+		if (!performance || !CPU_PTR(performance, i)) {
 			retval = -EINVAL;
 			continue;
 		}
 
-		pr->performance = percpu_ptr(performance, i);
+		pr->performance = CPU_PTR(performance, i);
 		cpu_set(i, pr->performance->shared_cpu_map);
 		if (acpi_processor_get_psd(pr)) {
 			retval = -EINVAL;

-- 


* [patch 12/28] cpu alloc: genhd statistics conversion
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (10 preceding siblings ...)
  2007-11-06 19:51 ` [patch 11/28] cpu alloc: ACPI cstate handling conversion Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 13/28] cpu alloc: blktrace conversion Christoph Lameter
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_genhd --]
[-- Type: text/plain, Size: 1960 bytes --]

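Callers of the disk statistics macros are unaffected; only the expansion
underneath changes from per_cpu_ptr() to THIS_CPU()/CPU_PTR(). A usage
sketch (illustration only, not part of the patch; disk, nr_sectors and
nr_reads are placeholder variables):

	/* Caller holds off preemption, as with the old macros */
	__disk_stat_add(disk, sectors[0], nr_sectors);	/* this cpu, reads */
	nr_reads = disk_stat_read(disk, ios[0]);	/* sum over all cpus */
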
Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/genhd.h |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Index: linux-2.6/include/linux/genhd.h
===================================================================
--- linux-2.6.orig/include/linux/genhd.h	2007-11-04 19:32:40.000000000 -0800
+++ linux-2.6/include/linux/genhd.h	2007-11-04 19:35:51.000000000 -0800
@@ -12,6 +12,7 @@
 #include <linux/types.h>
 
 #ifdef CONFIG_BLOCK
+#include <linux/cpu_alloc.h>
 
 enum {
 /* These three have identical behaviour; use the second one if DOS FDISK gets
@@ -158,21 +159,21 @@ struct disk_attribute {
  */
 #ifdef	CONFIG_SMP
 #define __disk_stat_add(gendiskp, field, addnd) 	\
-	(per_cpu_ptr(gendiskp->dkstats, smp_processor_id())->field += addnd)
+	(THIS_CPU(gendiskp->dkstats)->field += addnd)
 
 #define disk_stat_read(gendiskp, field)					\
 ({									\
 	typeof(gendiskp->dkstats->field) res = 0;			\
 	int i;								\
 	for_each_possible_cpu(i)					\
-		res += per_cpu_ptr(gendiskp->dkstats, i)->field;	\
+		res += CPU_PTR(gendiskp->dkstats, i)->field;	\
 	res;								\
 })
 
 static inline void disk_stat_set_all(struct gendisk *gendiskp, int value)	{
 	int i;
 	for_each_possible_cpu(i)
-		memset(per_cpu_ptr(gendiskp->dkstats, i), value,
+		memset(CPU_PTR(gendiskp->dkstats, i), value,
 				sizeof (struct disk_stats));
 }		
 				
@@ -209,7 +210,7 @@ static inline void disk_stat_set_all(str
 #ifdef  CONFIG_SMP
 static inline int init_disk_stats(struct gendisk *disk)
 {
-	disk->dkstats = alloc_percpu(struct disk_stats);
+	disk->dkstats = CPU_ALLOC(struct disk_stats, GFP_KERNEL | __GFP_ZERO);
 	if (!disk->dkstats)
 		return 0;
 	return 1;
@@ -217,7 +218,7 @@ static inline int init_disk_stats(struct
 
 static inline void free_disk_stats(struct gendisk *disk)
 {
-	free_percpu(disk->dkstats);
+	CPU_FREE(disk->dkstats);
 }
 #else	/* CONFIG_SMP */
 static inline int init_disk_stats(struct gendisk *disk)

-- 


* [patch 13/28] cpu alloc: blktrace conversion
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (11 preceding siblings ...)
  2007-11-06 19:51 ` [patch 12/28] cpu alloc: genhd statistics conversion Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 14/28] cpu alloc: SRCU Christoph Lameter
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_blktrace --]
[-- Type: text/plain, Size: 1825 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 block/blktrace.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Index: linux-2.6/block/blktrace.c
===================================================================
--- linux-2.6.orig/block/blktrace.c	2007-11-04 13:14:44.000000000 -0800
+++ linux-2.6/block/blktrace.c	2007-11-04 13:16:22.000000000 -0800
@@ -18,12 +18,12 @@
 #include <linux/kernel.h>
 #include <linux/blkdev.h>
 #include <linux/blktrace_api.h>
-#include <linux/percpu.h>
 #include <linux/init.h>
 #include <linux/mutex.h>
 #include <linux/debugfs.h>
 #include <linux/time.h>
 #include <asm/uaccess.h>
+#include <linux/cpu_alloc.h>
 
 static DEFINE_PER_CPU(unsigned long long, blk_trace_cpu_offset) = { 0, };
 static unsigned int blktrace_seq __read_mostly = 1;
@@ -155,7 +155,7 @@ void __blk_add_trace(struct blk_trace *b
 	t = relay_reserve(bt->rchan, sizeof(*t) + pdu_len);
 	if (t) {
 		cpu = smp_processor_id();
-		sequence = per_cpu_ptr(bt->sequence, cpu);
+		sequence = CPU_PTR(bt->sequence, cpu);
 
 		t->magic = BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION;
 		t->sequence = ++(*sequence);
@@ -227,7 +227,7 @@ static void blk_trace_cleanup(struct blk
 	relay_close(bt->rchan);
 	debugfs_remove(bt->dropped_file);
 	blk_remove_tree(bt->dir);
-	free_percpu(bt->sequence);
+	CPU_FREE(bt->sequence);
 	kfree(bt);
 }
 
@@ -338,7 +338,7 @@ int do_blk_trace_setup(struct request_qu
 	if (!bt)
 		goto err;
 
-	bt->sequence = alloc_percpu(unsigned long);
+	bt->sequence = CPU_ALLOC(unsigned long, GFP_KERNEL | __GFP_ZERO);
 	if (!bt->sequence)
 		goto err;
 
@@ -387,7 +387,7 @@ err:
 	if (bt) {
 		if (bt->dropped_file)
 			debugfs_remove(bt->dropped_file);
-		free_percpu(bt->sequence);
+		CPU_FREE(bt->sequence);
 		if (bt->rchan)
 			relay_close(bt->rchan);
 		kfree(bt);

-- 


* [patch 14/28] cpu alloc: SRCU
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (12 preceding siblings ...)
  2007-11-06 19:51 ` [patch 13/28] cpu alloc: blktrace conversion Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:51 ` [patch 15/28] cpu alloc: XFS counters Christoph Lameter
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_rcu --]
[-- Type: text/plain, Size: 2789 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 kernel/rcutorture.c |    4 ++--
 kernel/srcu.c       |   13 +++++++------
 2 files changed, 9 insertions(+), 8 deletions(-)

Index: linux-2.6/kernel/srcu.c
===================================================================
--- linux-2.6.orig/kernel/srcu.c	2007-11-04 19:32:38.000000000 -0800
+++ linux-2.6/kernel/srcu.c	2007-11-04 19:38:09.000000000 -0800
@@ -26,7 +26,7 @@
 
 #include <linux/module.h>
 #include <linux/mutex.h>
-#include <linux/percpu.h>
+#include <linux/cpu_alloc.h>
 #include <linux/preempt.h>
 #include <linux/rcupdate.h>
 #include <linux/sched.h>
@@ -46,7 +46,8 @@ int init_srcu_struct(struct srcu_struct 
 {
 	sp->completed = 0;
 	mutex_init(&sp->mutex);
-	sp->per_cpu_ref = alloc_percpu(struct srcu_struct_array);
+	sp->per_cpu_ref = CPU_ALLOC(struct srcu_struct_array,
+						GFP_KERNEL|__GFP_ZERO);
 	return (sp->per_cpu_ref ? 0 : -ENOMEM);
 }
 
@@ -62,7 +63,7 @@ static int srcu_readers_active_idx(struc
 
 	sum = 0;
 	for_each_possible_cpu(cpu)
-		sum += per_cpu_ptr(sp->per_cpu_ref, cpu)->c[idx];
+		sum += CPU_PTR(sp->per_cpu_ref, cpu)->c[idx];
 	return sum;
 }
 
@@ -94,7 +95,7 @@ void cleanup_srcu_struct(struct srcu_str
 	WARN_ON(sum);  /* Leakage unless caller handles error. */
 	if (sum != 0)
 		return;
-	free_percpu(sp->per_cpu_ref);
+	CPU_FREE(sp->per_cpu_ref);
 	sp->per_cpu_ref = NULL;
 }
 
@@ -113,7 +114,7 @@ int srcu_read_lock(struct srcu_struct *s
 	preempt_disable();
 	idx = sp->completed & 0x1;
 	barrier();  /* ensure compiler looks -once- at sp->completed. */
-	per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
+	THIS_CPU(sp->per_cpu_ref)->c[idx]++;
 	srcu_barrier();  /* ensure compiler won't misorder critical section. */
 	preempt_enable();
 	return idx;
@@ -133,7 +134,7 @@ void srcu_read_unlock(struct srcu_struct
 {
 	preempt_disable();
 	srcu_barrier();  /* ensure compiler won't misorder critical section. */
-	per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]--;
+	THIS_CPU(sp->per_cpu_ref)->c[idx]--;
 	preempt_enable();
 }
 
Index: linux-2.6/kernel/rcutorture.c
===================================================================
--- linux-2.6.orig/kernel/rcutorture.c	2007-11-04 19:32:38.000000000 -0800
+++ linux-2.6/kernel/rcutorture.c	2007-11-04 19:33:51.000000000 -0800
@@ -441,8 +441,8 @@ static int srcu_torture_stats(char *page
 		       torture_type, TORTURE_FLAG, idx);
 	for_each_possible_cpu(cpu) {
 		cnt += sprintf(&page[cnt], " %d(%d,%d)", cpu,
-			       per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[!idx],
-			       per_cpu_ptr(srcu_ctl.per_cpu_ref, cpu)->c[idx]);
+			       CPU_PTR(srcu_ctl.per_cpu_ref, cpu)->c[!idx],
+			       CPU_PTR(srcu_ctl.per_cpu_ref, cpu)->c[idx]);
 	}
 	cnt += sprintf(&page[cnt], "\n");
 	return cnt;

-- 


* [patch 15/28] cpu alloc: XFS counters
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (13 preceding siblings ...)
  2007-11-06 19:51 ` [patch 14/28] cpu alloc: SRCU Christoph Lameter
@ 2007-11-06 19:51 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 16/28] cpu alloc: NFS statistics Christoph Lameter
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:51 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_xfs --]
[-- Type: text/plain, Size: 3015 bytes --]

Also remove the useless zeroing after allocation: alloc_percpu already
zeroed the objects, and the replacement CPU_ALLOC is called with
__GFP_ZERO.
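
For reference, the removed initialization pass was the following (CPU_ALLOC
with __GFP_ZERO hands back zeroed objects just as alloc_percpu did):

	for_each_online_cpu(i) {
		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
	}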

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/xfs/xfs_mount.c |   24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

Index: linux-2.6/fs/xfs/xfs_mount.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_mount.c	2007-11-04 19:31:30.000000000 -0800
+++ linux-2.6/fs/xfs/xfs_mount.c	2007-11-04 19:55:16.000000000 -0800
@@ -1924,7 +1924,7 @@ xfs_icsb_cpu_notify(
 
 	mp = (xfs_mount_t *)container_of(nfb, xfs_mount_t, m_icsb_notifier);
 	cntp = (xfs_icsb_cnts_t *)
-			per_cpu_ptr(mp->m_sb_cnts, (unsigned long)hcpu);
+			CPU_PTR(mp->m_sb_cnts, (unsigned long)hcpu);
 	switch (action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
@@ -1976,10 +1976,7 @@ int
 xfs_icsb_init_counters(
 	xfs_mount_t	*mp)
 {
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
+	mp->m_sb_cnts = CPU_ALLOC(xfs_icsb_cnts_t, GFP_KERNEL | __GFP_ZERO);
 	if (mp->m_sb_cnts == NULL)
 		return -ENOMEM;
 
@@ -1989,11 +1986,6 @@ xfs_icsb_init_counters(
 	register_hotcpu_notifier(&mp->m_icsb_notifier);
 #endif /* CONFIG_HOTPLUG_CPU */
 
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
-	}
-
 	mutex_init(&mp->m_icsb_mutex);
 
 	/*
@@ -2026,7 +2018,7 @@ xfs_icsb_destroy_counters(
 {
 	if (mp->m_sb_cnts) {
 		unregister_hotcpu_notifier(&mp->m_icsb_notifier);
-		free_percpu(mp->m_sb_cnts);
+		CPU_FREE(mp->m_sb_cnts);
 	}
 	mutex_destroy(&mp->m_icsb_mutex);
 }
@@ -2056,7 +2048,7 @@ xfs_icsb_lock_all_counters(
 	int		i;
 
 	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
+		cntp = (xfs_icsb_cnts_t *)CPU_PTR(mp->m_sb_cnts, i);
 		xfs_icsb_lock_cntr(cntp);
 	}
 }
@@ -2069,7 +2061,7 @@ xfs_icsb_unlock_all_counters(
 	int		i;
 
 	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
+		cntp = (xfs_icsb_cnts_t *)CPU_PTR(mp->m_sb_cnts, i);
 		xfs_icsb_unlock_cntr(cntp);
 	}
 }
@@ -2089,7 +2081,7 @@ xfs_icsb_count(
 		xfs_icsb_lock_all_counters(mp);
 
 	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
+		cntp = (xfs_icsb_cnts_t *)CPU_PTR(mp->m_sb_cnts, i);
 		cnt->icsb_icount += cntp->icsb_icount;
 		cnt->icsb_ifree += cntp->icsb_ifree;
 		cnt->icsb_fdblocks += cntp->icsb_fdblocks;
@@ -2167,7 +2159,7 @@ xfs_icsb_enable_counter(
 
 	xfs_icsb_lock_all_counters(mp);
 	for_each_online_cpu(i) {
-		cntp = per_cpu_ptr(mp->m_sb_cnts, i);
+		cntp = CPU_PTR(mp->m_sb_cnts, i);
 		switch (field) {
 		case XFS_SBS_ICOUNT:
 			cntp->icsb_icount = count + resid;
@@ -2307,7 +2299,7 @@ xfs_icsb_modify_counters(
 	might_sleep();
 again:
 	cpu = get_cpu();
-	icsbp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, cpu);
+	icsbp = (xfs_icsb_cnts_t *)CPU_PTR(mp->m_sb_cnts, cpu);
 
 	/*
 	 * if the counter is disabled, go to slow path

-- 


* [patch 16/28] cpu alloc: NFS statistics
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (14 preceding siblings ...)
  2007-11-06 19:51 ` [patch 15/28] cpu alloc: XFS counters Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 17/28] cpu alloc: neighbour statistics Christoph Lameter
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_nfs --]
[-- Type: text/plain, Size: 2382 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 fs/nfs/iostat.h           |    9 +++++----
 fs/nfs/super.c            |    2 +-
 include/linux/neighbour.h |    1 +
 3 files changed, 7 insertions(+), 5 deletions(-)

Index: linux-2.6/fs/nfs/iostat.h
===================================================================
--- linux-2.6.orig/fs/nfs/iostat.h	2007-11-04 20:13:52.000000000 -0800
+++ linux-2.6/fs/nfs/iostat.h	2007-11-04 20:16:48.000000000 -0800
@@ -20,6 +20,7 @@
 
 #ifndef _NFS_IOSTAT
 #define _NFS_IOSTAT
+#include <linux/cpu_alloc.h>
 
 #define NFS_IOSTAT_VERS		"1.0"
 
@@ -123,7 +124,7 @@ static inline void nfs_inc_server_stats(
 	int cpu;
 
 	cpu = get_cpu();
-	iostats = per_cpu_ptr(server->io_stats, cpu);
+	iostats = CPU_PTR(server->io_stats, cpu);
 	iostats->events[stat] ++;
 	put_cpu_no_resched();
 }
@@ -139,7 +140,7 @@ static inline void nfs_add_server_stats(
 	int cpu;
 
 	cpu = get_cpu();
-	iostats = per_cpu_ptr(server->io_stats, cpu);
+	iostats = CPU_PTR(server->io_stats, cpu);
 	iostats->bytes[stat] += addend;
 	put_cpu_no_resched();
 }
@@ -151,13 +152,13 @@ static inline void nfs_add_stats(struct 
 
 static inline struct nfs_iostats *nfs_alloc_iostats(void)
 {
-	return alloc_percpu(struct nfs_iostats);
+	return CPU_ALLOC(struct nfs_iostats, GFP_KERNEL | __GFP_ZERO);
 }
 
 static inline void nfs_free_iostats(struct nfs_iostats *stats)
 {
 	if (stats != NULL)
-		free_percpu(stats);
+		CPU_FREE(stats);
 }
 
 #endif
Index: linux-2.6/include/linux/neighbour.h
===================================================================
--- linux-2.6.orig/include/linux/neighbour.h	2007-11-04 20:13:52.000000000 -0800
+++ linux-2.6/include/linux/neighbour.h	2007-11-04 20:16:48.000000000 -0800
@@ -2,6 +2,7 @@
 #define __LINUX_NEIGHBOUR_H
 
 #include <linux/netlink.h>
+#include <linux/cpu_alloc.h>
 
 struct ndmsg
 {
Index: linux-2.6/fs/nfs/super.c
===================================================================
--- linux-2.6.orig/fs/nfs/super.c	2007-11-04 20:15:41.000000000 -0800
+++ linux-2.6/fs/nfs/super.c	2007-11-04 20:16:51.000000000 -0800
@@ -529,7 +529,7 @@ static int nfs_show_stats(struct seq_fil
 		struct nfs_iostats *stats;
 
 		preempt_disable();
-		stats = per_cpu_ptr(nfss->io_stats, cpu);
+		stats = CPU_PTR(nfss->io_stats, cpu);
 
 		for (i = 0; i < __NFSIOS_COUNTSMAX; i++)
 			totals.events[i] += stats->events[i];

-- 


* [patch 17/28] cpu alloc: neighbour statistics
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (15 preceding siblings ...)
  2007-11-06 19:52 ` [patch 16/28] cpu alloc: NFS statistics Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 18/28] cpu alloc: tcp statistics Christoph Lameter
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_neighbour --]
[-- Type: text/plain, Size: 2522 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/net/neighbour.h |    2 +-
 net/core/neighbour.c    |   12 +++++++-----
 2 files changed, 8 insertions(+), 6 deletions(-)

Index: linux-2.6/net/core/neighbour.c
===================================================================
--- linux-2.6.orig/net/core/neighbour.c	2007-11-04 20:14:21.000000000 -0800
+++ linux-2.6/net/core/neighbour.c	2007-11-04 20:16:08.000000000 -0800
@@ -35,6 +35,7 @@
 #include <linux/random.h>
 #include <linux/string.h>
 #include <linux/log2.h>
+#include <linux/cpu_alloc.h>
 
 #define NEIGH_DEBUG 1
 
@@ -1348,7 +1349,8 @@ void neigh_table_init_no_netlink(struct 
 			kmem_cache_create(tbl->id, tbl->entry_size, 0,
 					  SLAB_HWCACHE_ALIGN|SLAB_PANIC,
 					  NULL);
-	tbl->stats = alloc_percpu(struct neigh_statistics);
+	tbl->stats = CPU_ALLOC(struct neigh_statistics,
+					GFP_KERNEL | __GFP_ZERO);
 	if (!tbl->stats)
 		panic("cannot create neighbour cache statistics");
 
@@ -1435,7 +1437,7 @@ int neigh_table_clear(struct neigh_table
 	kfree(tbl->phash_buckets);
 	tbl->phash_buckets = NULL;
 
-	free_percpu(tbl->stats);
+	CPU_FREE(tbl->stats);
 	tbl->stats = NULL;
 
 	kmem_cache_destroy(tbl->kmem_cachep);
@@ -1692,7 +1694,7 @@ static int neightbl_fill_info(struct sk_
 		for_each_possible_cpu(cpu) {
 			struct neigh_statistics	*st;
 
-			st = per_cpu_ptr(tbl->stats, cpu);
+			st = CPU_PTR(tbl->stats, cpu);
 			ndst.ndts_allocs		+= st->allocs;
 			ndst.ndts_destroys		+= st->destroys;
 			ndst.ndts_hash_grows		+= st->hash_grows;
@@ -2341,7 +2343,7 @@ static void *neigh_stat_seq_start(struct
 		if (!cpu_possible(cpu))
 			continue;
 		*pos = cpu+1;
-		return per_cpu_ptr(tbl->stats, cpu);
+		return CPU_PTR(tbl->stats, cpu);
 	}
 	return NULL;
 }
@@ -2356,7 +2358,7 @@ static void *neigh_stat_seq_next(struct 
 		if (!cpu_possible(cpu))
 			continue;
 		*pos = cpu+1;
-		return per_cpu_ptr(tbl->stats, cpu);
+		return CPU_PTR(tbl->stats, cpu);
 	}
 	return NULL;
 }
Index: linux-2.6/include/net/neighbour.h
===================================================================
--- linux-2.6.orig/include/net/neighbour.h	2007-11-04 20:15:41.000000000 -0800
+++ linux-2.6/include/net/neighbour.h	2007-11-04 20:16:24.000000000 -0800
@@ -83,7 +83,7 @@ struct neigh_statistics
 #define NEIGH_CACHE_STAT_INC(tbl, field)				\
 	do {								\
 		preempt_disable();					\
-		(per_cpu_ptr((tbl)->stats, smp_processor_id())->field)++; \
+		(THIS_CPU((tbl)->stats)->field)++; \
 		preempt_enable();					\
 	} while (0)
 

-- 


* [patch 18/28] cpu alloc: tcp statistics
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (16 preceding siblings ...)
  2007-11-06 19:52 ` [patch 17/28] cpu alloc: neigbour statistics Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 19/28] cpu alloc: convert scratches Christoph Lameter
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_net_tcp --]
[-- Type: text/plain, Size: 1631 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>


---
 net/ipv4/tcp.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Index: linux-2.6/net/ipv4/tcp.c
===================================================================
--- linux-2.6.orig/net/ipv4/tcp.c	2007-11-04 19:01:58.000000000 -0800
+++ linux-2.6/net/ipv4/tcp.c	2007-11-04 19:30:19.000000000 -0800
@@ -2273,7 +2273,7 @@ static void __tcp_free_md5sig_pool(struc
 {
 	int cpu;
 	for_each_possible_cpu(cpu) {
-		struct tcp_md5sig_pool *p = *per_cpu_ptr(pool, cpu);
+		struct tcp_md5sig_pool *p = *CPU_PTR(pool, cpu);
 		if (p) {
 			if (p->md5_desc.tfm)
 				crypto_free_hash(p->md5_desc.tfm);
@@ -2281,7 +2281,7 @@ static void __tcp_free_md5sig_pool(struc
 			p = NULL;
 		}
 	}
-	free_percpu(pool);
+	CPU_FREE(pool);
 }
 
 void tcp_free_md5sig_pool(void)
@@ -2305,7 +2305,7 @@ static struct tcp_md5sig_pool **__tcp_al
 	int cpu;
 	struct tcp_md5sig_pool **pool;
 
-	pool = alloc_percpu(struct tcp_md5sig_pool *);
+	pool = CPU_ALLOC(struct tcp_md5sig_pool *, GFP_KERNEL);
 	if (!pool)
 		return NULL;
 
@@ -2316,7 +2316,7 @@ static struct tcp_md5sig_pool **__tcp_al
 		p = kzalloc(sizeof(*p), GFP_KERNEL);
 		if (!p)
 			goto out_free;
-		*per_cpu_ptr(pool, cpu) = p;
+		*CPU_PTR(pool, cpu) = p;
 
 		hash = crypto_alloc_hash("md5", 0, CRYPTO_ALG_ASYNC);
 		if (!hash || IS_ERR(hash))
@@ -2381,7 +2381,7 @@ struct tcp_md5sig_pool *__tcp_get_md5sig
 	if (p)
 		tcp_md5sig_users++;
 	spin_unlock_bh(&tcp_md5sig_pool_lock);
-	return (p ? *per_cpu_ptr(p, cpu) : NULL);
+	return (p ? *CPU_PTR(p, cpu) : NULL);
 }
 
 EXPORT_SYMBOL(__tcp_get_md5sig_pool);

-- 


* [patch 19/28] cpu alloc: convert scratches
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (17 preceding siblings ...)
  2007-11-06 19:52 ` [patch 18/28] cpu alloc: tcp statistics Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 20/28] cpu alloc: dmaengine conversion Christoph Lameter
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_ipv4_icomp --]
[-- Type: text/plain, Size: 6109 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 net/ipv4/ipcomp.c  |   26 +++++++++++++-------------
 net/ipv6/ipcomp6.c |   26 +++++++++++++-------------
 2 files changed, 26 insertions(+), 26 deletions(-)

Index: linux-2.6/net/ipv4/ipcomp.c
===================================================================
--- linux-2.6.orig/net/ipv4/ipcomp.c	2007-11-05 09:41:23.000000000 -0800
+++ linux-2.6/net/ipv4/ipcomp.c	2007-11-05 09:41:37.000000000 -0800
@@ -47,8 +47,8 @@ static int ipcomp_decompress(struct xfrm
 	int dlen = IPCOMP_SCRATCH_SIZE;
 	const u8 *start = skb->data;
 	const int cpu = get_cpu();
-	u8 *scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
-	struct crypto_comp *tfm = *per_cpu_ptr(ipcd->tfms, cpu);
+	u8 *scratch = *CPU_PTR(ipcomp_scratches, cpu);
+	struct crypto_comp *tfm = *CPU_PTR(ipcd->tfms, cpu);
 	int err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);
 
 	if (err)
@@ -102,8 +102,8 @@ static int ipcomp_compress(struct xfrm_s
 	int dlen = IPCOMP_SCRATCH_SIZE;
 	u8 *start = skb->data;
 	const int cpu = get_cpu();
-	u8 *scratch = *per_cpu_ptr(ipcomp_scratches, cpu);
-	struct crypto_comp *tfm = *per_cpu_ptr(ipcd->tfms, cpu);
+	u8 *scratch = *CPU_PTR(ipcomp_scratches, cpu);
+	struct crypto_comp *tfm = *CPU_PTR(ipcd->tfms, cpu);
 	int err = crypto_comp_compress(tfm, start, plen, scratch, &dlen);
 
 	if (err)
@@ -251,9 +251,9 @@ static void ipcomp_free_scratches(void)
 		return;
 
 	for_each_possible_cpu(i)
-		vfree(*per_cpu_ptr(scratches, i));
+		vfree(*CPU_PTR(scratches, i));
 
-	free_percpu(scratches);
+	CPU_FREE(scratches);
 }
 
 static void **ipcomp_alloc_scratches(void)
@@ -264,7 +264,7 @@ static void **ipcomp_alloc_scratches(voi
 	if (ipcomp_scratch_users++)
 		return ipcomp_scratches;
 
-	scratches = alloc_percpu(void *);
+	scratches = CPU_ALLOC(void *, GFP_KERNEL);
 	if (!scratches)
 		return NULL;
 
@@ -274,7 +274,7 @@ static void **ipcomp_alloc_scratches(voi
 		void *scratch = vmalloc(IPCOMP_SCRATCH_SIZE);
 		if (!scratch)
 			return NULL;
-		*per_cpu_ptr(scratches, i) = scratch;
+		*CPU_PTR(scratches, i) = scratch;
 	}
 
 	return scratches;
@@ -302,10 +302,10 @@ static void ipcomp_free_tfms(struct cryp
 		return;
 
 	for_each_possible_cpu(cpu) {
-		struct crypto_comp *tfm = *per_cpu_ptr(tfms, cpu);
+		struct crypto_comp *tfm = *CPU_PTR(tfms, cpu);
 		crypto_free_comp(tfm);
 	}
-	free_percpu(tfms);
+	CPU_FREE(tfms);
 }
 
 static struct crypto_comp **ipcomp_alloc_tfms(const char *alg_name)
@@ -321,7 +321,7 @@ static struct crypto_comp **ipcomp_alloc
 		struct crypto_comp *tfm;
 
 		tfms = pos->tfms;
-		tfm = *per_cpu_ptr(tfms, cpu);
+		tfm = *CPU_PTR(tfms, cpu);
 
 		if (!strcmp(crypto_comp_name(tfm), alg_name)) {
 			pos->users++;
@@ -337,7 +337,7 @@ static struct crypto_comp **ipcomp_alloc
 	INIT_LIST_HEAD(&pos->list);
 	list_add(&pos->list, &ipcomp_tfms_list);
 
-	pos->tfms = tfms = alloc_percpu(struct crypto_comp *);
+	pos->tfms = tfms = CPU_ALLOC(struct crypto_comp *, GFP_KERNEL);
 	if (!tfms)
 		goto error;
 
@@ -346,7 +346,7 @@ static struct crypto_comp **ipcomp_alloc
 							    CRYPTO_ALG_ASYNC);
 		if (!tfm)
 			goto error;
-		*per_cpu_ptr(tfms, cpu) = tfm;
+		*CPU_PTR(tfms, cpu) = tfm;
 	}
 
 	return tfms;
Index: linux-2.6/net/ipv6/ipcomp6.c
===================================================================
--- linux-2.6.orig/net/ipv6/ipcomp6.c	2007-11-05 09:41:23.000000000 -0800
+++ linux-2.6/net/ipv6/ipcomp6.c	2007-11-05 09:43:13.000000000 -0800
@@ -87,8 +87,8 @@ static int ipcomp6_input(struct xfrm_sta
 	start = skb->data;
 
 	cpu = get_cpu();
-	scratch = *per_cpu_ptr(ipcomp6_scratches, cpu);
-	tfm = *per_cpu_ptr(ipcd->tfms, cpu);
+	scratch = *CPU_PTR(ipcomp6_scratches, cpu);
+	tfm = *CPU_PTR(ipcd->tfms, cpu);
 
 	err = crypto_comp_decompress(tfm, start, plen, scratch, &dlen);
 	if (err)
@@ -139,8 +139,8 @@ static int ipcomp6_output(struct xfrm_st
 	start = skb->data;
 
 	cpu = get_cpu();
-	scratch = *per_cpu_ptr(ipcomp6_scratches, cpu);
-	tfm = *per_cpu_ptr(ipcd->tfms, cpu);
+	scratch = *CPU_PTR(ipcomp6_scratches, cpu);
+	tfm = *CPU_PTR(ipcd->tfms, cpu);
 
 	err = crypto_comp_compress(tfm, start, plen, scratch, &dlen);
 	if (err || (dlen + sizeof(*ipch)) >= plen) {
@@ -262,12 +262,12 @@ static void ipcomp6_free_scratches(void)
 		return;
 
 	for_each_possible_cpu(i) {
-		void *scratch = *per_cpu_ptr(scratches, i);
+		void *scratch = *CPU_PTR(scratches, i);
 
 		vfree(scratch);
 	}
 
-	free_percpu(scratches);
+	CPU_FREE(scratches);
 }
 
 static void **ipcomp6_alloc_scratches(void)
@@ -278,7 +278,7 @@ static void **ipcomp6_alloc_scratches(vo
 	if (ipcomp6_scratch_users++)
 		return ipcomp6_scratches;
 
-	scratches = alloc_percpu(void *);
+	scratches = CPU_ALLOC(void *, GFP_KERNEL);
 	if (!scratches)
 		return NULL;
 
@@ -288,7 +288,7 @@ static void **ipcomp6_alloc_scratches(vo
 		void *scratch = vmalloc(IPCOMP_SCRATCH_SIZE);
 		if (!scratch)
 			return NULL;
-		*per_cpu_ptr(scratches, i) = scratch;
+		*CPU_PTR(scratches, i) = scratch;
 	}
 
 	return scratches;
@@ -316,10 +316,10 @@ static void ipcomp6_free_tfms(struct cry
 		return;
 
 	for_each_possible_cpu(cpu) {
-		struct crypto_comp *tfm = *per_cpu_ptr(tfms, cpu);
+		struct crypto_comp *tfm = *CPU_PTR(tfms, cpu);
 		crypto_free_comp(tfm);
 	}
-	free_percpu(tfms);
+	CPU_FREE(tfms);
 }
 
 static struct crypto_comp **ipcomp6_alloc_tfms(const char *alg_name)
@@ -335,7 +335,7 @@ static struct crypto_comp **ipcomp6_allo
 		struct crypto_comp *tfm;
 
 		tfms = pos->tfms;
-		tfm = *per_cpu_ptr(tfms, cpu);
+		tfm = *CPU_PTR(tfms, cpu);
 
 		if (!strcmp(crypto_comp_name(tfm), alg_name)) {
 			pos->users++;
@@ -351,7 +351,7 @@ static struct crypto_comp **ipcomp6_allo
 	INIT_LIST_HEAD(&pos->list);
 	list_add(&pos->list, &ipcomp6_tfms_list);
 
-	pos->tfms = tfms = alloc_percpu(struct crypto_comp *);
+	pos->tfms = tfms = CPU_ALLOC(struct crypto_comp *, GFP_KERNEL);
 	if (!tfms)
 		goto error;
 
@@ -360,7 +360,7 @@ static struct crypto_comp **ipcomp6_allo
 							    CRYPTO_ALG_ASYNC);
 		if (!tfm)
 			goto error;
-		*per_cpu_ptr(tfms, cpu) = tfm;
+		*CPU_PTR(tfms, cpu) = tfm;
 	}
 
 	return tfms;

-- 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [patch 20/28] cpu alloc: dmaengine conversion
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (18 preceding siblings ...)
  2007-11-06 19:52 ` [patch 19/28] cpu alloc: convert scratches Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 21/28] cpu alloc: convert loopback statistics Christoph Lameter
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_dmaengine --]
[-- Type: text/plain, Size: 4524 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 drivers/dma/dmaengine.c   |   27 ++++++++++++++-------------
 include/linux/dmaengine.h |    5 +++--
 2 files changed, 17 insertions(+), 15 deletions(-)

Index: linux-2.6/drivers/dma/dmaengine.c
===================================================================
--- linux-2.6.orig/drivers/dma/dmaengine.c	2007-11-04 19:31:08.000000000 -0800
+++ linux-2.6/drivers/dma/dmaengine.c	2007-11-04 19:34:09.000000000 -0800
@@ -84,7 +84,7 @@ static ssize_t show_memcpy_count(struct 
 	int i;
 
 	for_each_possible_cpu(i)
-		count += per_cpu_ptr(chan->local, i)->memcpy_count;
+		count += CPU_PTR(chan->local, i)->memcpy_count;
 
 	return sprintf(buf, "%lu\n", count);
 }
@@ -96,7 +96,7 @@ static ssize_t show_bytes_transferred(st
 	int i;
 
 	for_each_possible_cpu(i)
-		count += per_cpu_ptr(chan->local, i)->bytes_transferred;
+		count += CPU_PTR(chan->local, i)->bytes_transferred;
 
 	return sprintf(buf, "%lu\n", count);
 }
@@ -110,7 +110,7 @@ static ssize_t show_in_use(struct class_
 		atomic_read(&chan->refcount.refcount) > 1)
 		in_use = 1;
 	else {
-		if (local_read(&(per_cpu_ptr(chan->local,
+		if (local_read(&(CPU_PTR(chan->local,
 			get_cpu())->refcount)) > 0)
 			in_use = 1;
 		put_cpu();
@@ -227,7 +227,7 @@ static void dma_chan_free_rcu(struct rcu
 	int bias = 0x7FFFFFFF;
 	int i;
 	for_each_possible_cpu(i)
-		bias -= local_read(&per_cpu_ptr(chan->local, i)->refcount);
+		bias -= local_read(&CPU_PTR(chan->local, i)->refcount);
 	atomic_sub(bias, &chan->refcount.refcount);
 	kref_put(&chan->refcount, dma_chan_cleanup);
 }
@@ -379,7 +379,8 @@ int dma_async_device_register(struct dma
 
 	/* represent channels in sysfs. Probably want devs too */
 	list_for_each_entry(chan, &device->channels, device_node) {
-		chan->local = alloc_percpu(typeof(*chan->local));
+		chan->local = CPU_ALLOC(typeof(*chan->local),
+					GFP_KERNEL | __GFP_ZERO);
 		if (chan->local == NULL)
 			continue;
 
@@ -392,7 +393,7 @@ int dma_async_device_register(struct dma
 		rc = class_device_register(&chan->class_dev);
 		if (rc) {
 			chancnt--;
-			free_percpu(chan->local);
+			CPU_FREE(chan->local);
 			chan->local = NULL;
 			goto err_out;
 		}
@@ -418,7 +419,7 @@ err_out:
 		kref_put(&device->refcount, dma_async_device_cleanup);
 		class_device_unregister(&chan->class_dev);
 		chancnt--;
-		free_percpu(chan->local);
+		CPU_FREE(chan->local);
 	}
 	return rc;
 }
@@ -494,8 +495,8 @@ dma_async_memcpy_buf_to_buf(struct dma_c
 	cookie = tx->tx_submit(tx);
 
 	cpu = get_cpu();
-	per_cpu_ptr(chan->local, cpu)->bytes_transferred += len;
-	per_cpu_ptr(chan->local, cpu)->memcpy_count++;
+	CPU_PTR(chan->local, cpu)->bytes_transferred += len;
+	CPU_PTR(chan->local, cpu)->memcpy_count++;
 	put_cpu();
 
 	return cookie;
@@ -538,8 +539,8 @@ dma_async_memcpy_buf_to_pg(struct dma_ch
 	cookie = tx->tx_submit(tx);
 
 	cpu = get_cpu();
-	per_cpu_ptr(chan->local, cpu)->bytes_transferred += len;
-	per_cpu_ptr(chan->local, cpu)->memcpy_count++;
+	CPU_PTR(chan->local, cpu)->bytes_transferred += len;
+	CPU_PTR(chan->local, cpu)->memcpy_count++;
 	put_cpu();
 
 	return cookie;
@@ -584,8 +585,8 @@ dma_async_memcpy_pg_to_pg(struct dma_cha
 	cookie = tx->tx_submit(tx);
 
 	cpu = get_cpu();
-	per_cpu_ptr(chan->local, cpu)->bytes_transferred += len;
-	per_cpu_ptr(chan->local, cpu)->memcpy_count++;
+	CPU_PTR(chan->local, cpu)->bytes_transferred += len;
+	CPU_PTR(chan->local, cpu)->memcpy_count++;
 	put_cpu();
 
 	return cookie;
Index: linux-2.6/include/linux/dmaengine.h
===================================================================
--- linux-2.6.orig/include/linux/dmaengine.h	2007-11-04 19:31:08.000000000 -0800
+++ linux-2.6/include/linux/dmaengine.h	2007-11-04 19:35:07.000000000 -0800
@@ -27,6 +27,7 @@
 #include <linux/completion.h>
 #include <linux/rcupdate.h>
 #include <linux/dma-mapping.h>
+#include <linux/cpu_alloc.h>
 
 /**
  * enum dma_state - resource PNP/power managment state
@@ -150,7 +151,7 @@ static inline void dma_chan_get(struct d
 	if (unlikely(chan->slow_ref))
 		kref_get(&chan->refcount);
 	else {
-		local_inc(&(per_cpu_ptr(chan->local, get_cpu())->refcount));
+		local_inc(&CPU_PTR(chan->local, get_cpu())->refcount);
 		put_cpu();
 	}
 }
@@ -160,7 +161,7 @@ static inline void dma_chan_put(struct d
 	if (unlikely(chan->slow_ref))
 		kref_put(&chan->refcount, dma_chan_cleanup);
 	else {
-		local_dec(&(per_cpu_ptr(chan->local, get_cpu())->refcount));
+		local_dec(&CPU_PTR(chan->local, get_cpu())->refcount);
 		put_cpu();
 	}
 }

-- 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [patch 21/28] cpu alloc: convert loopback statistics
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (19 preceding siblings ...)
  2007-11-06 19:52 ` [patch 20/28] cpu alloc: dmaengine conversion Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 22/28] cpu alloc: veth conversion Christoph Lameter
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_loopback --]
[-- Type: text/plain, Size: 1424 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 drivers/net/loopback.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Index: linux-2.6/drivers/net/loopback.c
===================================================================
--- linux-2.6.orig/drivers/net/loopback.c	2007-11-04 19:15:14.000000000 -0800
+++ linux-2.6/drivers/net/loopback.c	2007-11-04 19:16:38.000000000 -0800
@@ -156,7 +156,7 @@ static int loopback_xmit(struct sk_buff 
 
-	/* it's OK to use per_cpu_ptr() because BHs are off */
+	/* it's OK to use THIS_CPU() because BHs are off */
 	pcpu_lstats = netdev_priv(dev);
-	lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
+	lb_stats = THIS_CPU(pcpu_lstats);
 	lb_stats->bytes += skb->len;
 	lb_stats->packets++;
 
@@ -177,7 +177,7 @@ static struct net_device_stats *get_stat
 	for_each_possible_cpu(i) {
 		const struct pcpu_lstats *lb_stats;
 
-		lb_stats = per_cpu_ptr(pcpu_lstats, i);
+		lb_stats = CPU_PTR(pcpu_lstats, i);
 		bytes   += lb_stats->bytes;
 		packets += lb_stats->packets;
 	}
@@ -205,7 +205,7 @@ static int loopback_dev_init(struct net_
 {
 	struct pcpu_lstats *lstats;
 
-	lstats = alloc_percpu(struct pcpu_lstats);
+	lstats = CPU_ALLOC(struct pcpu_lstats, GFP_KERNEL | __GFP_ZERO);
 	if (!lstats)
 		return -ENOMEM;
 
@@ -217,7 +217,7 @@ static void loopback_dev_free(struct net
 {
 	struct pcpu_lstats *lstats = netdev_priv(dev);
 
-	free_percpu(lstats);
+	CPU_FREE(lstats);
 	free_netdev(dev);
 }
 

-- 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [patch 22/28] cpu alloc: veth conversion
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (20 preceding siblings ...)
  2007-11-06 19:52 ` [patch 21/28] cpu alloc: convert loopback statistics Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 23/28] cpu alloc: Chelsio statistics conversion Christoph Lameter
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_veth --]
[-- Type: text/plain, Size: 1667 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 drivers/net/veth.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/net/veth.c
===================================================================
--- linux-2.6.orig/drivers/net/veth.c	2007-11-04 19:17:22.000000000 -0800
+++ linux-2.6/drivers/net/veth.c	2007-11-04 19:19:11.000000000 -0800
@@ -162,7 +162,7 @@ static int veth_xmit(struct sk_buff *skb
 	rcv_priv = netdev_priv(rcv);
 
 	cpu = smp_processor_id();
-	stats = per_cpu_ptr(priv->stats, cpu);
+	stats = CPU_PTR(priv->stats, cpu);
 
 	if (!(rcv->flags & IFF_UP))
 		goto outf;
@@ -183,7 +183,7 @@ static int veth_xmit(struct sk_buff *skb
 	stats->tx_bytes += length;
 	stats->tx_packets++;
 
-	stats = per_cpu_ptr(rcv_priv->stats, cpu);
+	stats = CPU_PTR(rcv_priv->stats, cpu);
 	stats->rx_bytes += length;
 	stats->rx_packets++;
 
@@ -217,7 +217,7 @@ static struct net_device_stats *veth_get
 	dev_stats->tx_dropped = 0;
 
 	for_each_online_cpu(cpu) {
-		stats = per_cpu_ptr(priv->stats, cpu);
+		stats = CPU_PTR(priv->stats, cpu);
 
 		dev_stats->rx_packets += stats->rx_packets;
 		dev_stats->tx_packets += stats->tx_packets;
@@ -261,7 +261,7 @@ static int veth_dev_init(struct net_devi
 	struct veth_net_stats *stats;
 	struct veth_priv *priv;
 
-	stats = alloc_percpu(struct veth_net_stats);
+	stats = CPU_ALLOC(struct veth_net_stats, GFP_KERNEL | __GFP_ZERO);
 	if (stats == NULL)
 		return -ENOMEM;
 
@@ -275,7 +275,7 @@ static void veth_dev_free(struct net_dev
 	struct veth_priv *priv;
 
 	priv = netdev_priv(dev);
-	free_percpu(priv->stats);
+	CPU_FREE(priv->stats);
 	free_netdev(dev);
 }
 

-- 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [patch 23/28] cpu alloc: Chelsio statistics conversion
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (21 preceding siblings ...)
  2007-11-06 19:52 ` [patch 22/28] cpu alloc: veth conversion Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 24/28] cpu alloc: convert mib handling to cpu alloc Christoph Lameter
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_chelsio --]
[-- Type: text/plain, Size: 2237 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 drivers/net/chelsio/sge.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

Index: linux-2.6/drivers/net/chelsio/sge.c
===================================================================
--- linux-2.6.orig/drivers/net/chelsio/sge.c	2007-11-04 19:19:23.000000000 -0800
+++ linux-2.6/drivers/net/chelsio/sge.c	2007-11-04 19:21:01.000000000 -0800
@@ -805,7 +805,7 @@ void t1_sge_destroy(struct sge *sge)
 	int i;
 
 	for_each_port(sge->adapter, i)
-		free_percpu(sge->port_stats[i]);
+		CPU_FREE(sge->port_stats[i]);
 
 	kfree(sge->tx_sched);
 	free_tx_resources(sge);
@@ -984,7 +984,7 @@ void t1_sge_get_port_stats(const struct 
 
 	memset(ss, 0, sizeof(*ss));
 	for_each_possible_cpu(cpu) {
-		struct sge_port_stats *st = per_cpu_ptr(sge->port_stats[port], cpu);
+		struct sge_port_stats *st = CPU_PTR(sge->port_stats[port], cpu);
 
 		ss->rx_packets += st->rx_packets;
 		ss->rx_cso_good += st->rx_cso_good;
@@ -1380,7 +1380,7 @@ static void sge_rx(struct sge *sge, stru
 	__skb_pull(skb, sizeof(*p));
 
 	skb->dev->last_rx = jiffies;
-	st = per_cpu_ptr(sge->port_stats[p->iff], smp_processor_id());
+	st = THIS_CPU(sge->port_stats[p->iff]);
 	st->rx_packets++;
 
 	skb->protocol = eth_type_trans(skb, adapter->port[p->iff].dev);
@@ -1848,7 +1848,7 @@ int t1_start_xmit(struct sk_buff *skb, s
 {
 	struct adapter *adapter = dev->priv;
 	struct sge *sge = adapter->sge;
-	struct sge_port_stats *st = per_cpu_ptr(sge->port_stats[dev->if_port], smp_processor_id());
+	struct sge_port_stats *st = THIS_CPU(sge->port_stats[dev->if_port]);
 	struct cpl_tx_pkt *cpl;
 	struct sk_buff *orig_skb = skb;
 	int ret;
@@ -2165,7 +2165,8 @@ struct sge * __devinit t1_sge_create(str
 	sge->jumbo_fl = t1_is_T1B(adapter) ? 1 : 0;
 
 	for_each_port(adapter, i) {
-		sge->port_stats[i] = alloc_percpu(struct sge_port_stats);
+		sge->port_stats[i] = CPU_ALLOC(struct sge_port_stats,
+					GFP_KERNEL | __GFP_ZERO);
 		if (!sge->port_stats[i])
 			goto nomem_port;
 	}
@@ -2209,7 +2210,7 @@ struct sge * __devinit t1_sge_create(str
 	return sge;
 nomem_port:
 	while (i >= 0) {
-		free_percpu(sge->port_stats[i]);
+		CPU_FREE(sge->port_stats[i]);
 		--i;
 	}
 	kfree(sge);

-- 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [patch 24/28] cpu alloc: convert mib handling to cpu alloc
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (22 preceding siblings ...)
  2007-11-06 19:52 ` [patch 23/28] cpu alloc: Chelsio statistics conversion Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 25/28] cpu alloc: Explicitly code allocpercpu calls in iucv Christoph Lameter
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_net_snmp --]
[-- Type: text/plain, Size: 11082 bytes --]

Use the cpu alloc functions for the mib handling in the net layer.
The snmp_mib_free() API gains a size parameter because cpu_free()
requires the size of the object being freed.
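
As an illustration (a minimal sketch, not part of the patch), a caller
of the size-aware interface now looks like this. The "my_" names are
hypothetical; snmp_mib_init()/snmp_mib_free() have the signatures
declared in include/net/ip.h as modified below:

#include <net/ip.h>

struct my_mib {
	unsigned long mibs[16];
};

static void *my_statistics[2];	/* [0]: softirq half, [1]: user half */

static int __init my_mib_init(void)
{
	/* Both halves are now backed by cpu_alloc() */
	return snmp_mib_init(my_statistics, sizeof(struct my_mib),
			     __alignof__(struct my_mib));
}

static void my_mib_exit(void)
{
	/*
	 * The object size must be passed back; cpu_free() needs it
	 * to return the space to the cpu area.
	 */
	snmp_mib_free(my_statistics, sizeof(struct my_mib));
}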

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/net/ip.h    |    2 +-
 include/net/snmp.h  |   14 +++++++-------
 net/dccp/proto.c    |   12 +++++++-----
 net/ipv4/af_inet.c  |   32 ++++++++++++++++++--------------
 net/ipv6/addrconf.c |   10 +++++-----
 net/ipv6/af_inet6.c |   18 +++++++++---------
 net/sctp/proc.c     |    4 ++--
 net/sctp/protocol.c |   13 ++++++++-----
 8 files changed, 57 insertions(+), 48 deletions(-)

Index: linux-2.6/include/net/snmp.h
===================================================================
--- linux-2.6.orig/include/net/snmp.h	2007-11-04 20:19:49.000000000 -0800
+++ linux-2.6/include/net/snmp.h	2007-11-04 20:57:16.000000000 -0800
@@ -133,18 +133,18 @@ struct linux_mib {
 #define SNMP_STAT_USRPTR(name)	(name[1])
 
 #define SNMP_INC_STATS_BH(mib, field) 	\
-	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field]++)
+	(__THIS_CPU(mib[0])->mibs[field]++)
 #define SNMP_INC_STATS_OFFSET_BH(mib, field, offset)	\
-	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field + (offset)]++)
+	(__THIS_CPU(mib[0])->mibs[field + (offset)]++)
 #define SNMP_INC_STATS_USER(mib, field) \
-	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field]++)
+	(__THIS_CPU(mib[1])->mibs[field]++)
 #define SNMP_INC_STATS(mib, field) 	\
-	(per_cpu_ptr(mib[!in_softirq()], raw_smp_processor_id())->mibs[field]++)
+	(__THIS_CPU(mib[!in_softirq()])->mibs[field]++)
 #define SNMP_DEC_STATS(mib, field) 	\
-	(per_cpu_ptr(mib[!in_softirq()], raw_smp_processor_id())->mibs[field]--)
+	(__THIS_CPU(mib[!in_softirq()])->mibs[field]--)
 #define SNMP_ADD_STATS_BH(mib, field, addend) 	\
-	(per_cpu_ptr(mib[0], raw_smp_processor_id())->mibs[field] += addend)
+	(__THIS_CPU(mib[0])->mibs[field] += addend)
 #define SNMP_ADD_STATS_USER(mib, field, addend) 	\
-	(per_cpu_ptr(mib[1], raw_smp_processor_id())->mibs[field] += addend)
+	(__THIS_CPU(mib[1])->mibs[field] += addend)
 
 #endif
Index: linux-2.6/net/dccp/proto.c
===================================================================
--- linux-2.6.orig/net/dccp/proto.c	2007-11-04 20:19:49.000000000 -0800
+++ linux-2.6/net/dccp/proto.c	2007-11-04 20:57:16.000000000 -0800
@@ -990,11 +990,13 @@ static int __init dccp_mib_init(void)
 {
 	int rc = -ENOMEM;
 
-	dccp_statistics[0] = alloc_percpu(struct dccp_mib);
+	dccp_statistics[0] = CPU_ALLOC(struct dccp_mib,
+					GFP_KERNEL | __GFP_ZERO);
 	if (dccp_statistics[0] == NULL)
 		goto out;
 
-	dccp_statistics[1] = alloc_percpu(struct dccp_mib);
+	dccp_statistics[1] = CPU_ALLOC(struct dccp_mib,
+					GFP_KERNEL | __GFP_ZERO);
 	if (dccp_statistics[1] == NULL)
 		goto out_free_one;
 
@@ -1002,7 +1004,7 @@ static int __init dccp_mib_init(void)
 out:
 	return rc;
 out_free_one:
-	free_percpu(dccp_statistics[0]);
+	CPU_FREE(dccp_statistics[0]);
 	dccp_statistics[0] = NULL;
 	goto out;
 
@@ -1010,8 +1012,8 @@ out_free_one:
 
 static void dccp_mib_exit(void)
 {
-	free_percpu(dccp_statistics[0]);
-	free_percpu(dccp_statistics[1]);
+	CPU_FREE(dccp_statistics[0]);
+	CPU_FREE(dccp_statistics[1]);
 	dccp_statistics[0] = dccp_statistics[1] = NULL;
 }
 
Index: linux-2.6/net/sctp/protocol.c
===================================================================
--- linux-2.6.orig/net/sctp/protocol.c	2007-11-04 20:19:49.000000000 -0800
+++ linux-2.6/net/sctp/protocol.c	2007-11-04 21:11:37.000000000 -0800
@@ -61,6 +61,7 @@
 #include <net/addrconf.h>
 #include <net/inet_common.h>
 #include <net/inet_ecn.h>
+#include <linux/cpu_alloc.h>
 
 /* Global data structures. */
 struct sctp_globals sctp_globals __read_mostly;
@@ -970,12 +971,14 @@ int sctp_register_pf(struct sctp_pf *pf,
 
 static int __init init_sctp_mibs(void)
 {
-	sctp_statistics[0] = alloc_percpu(struct sctp_mib);
+	sctp_statistics[0] = CPU_ALLOC(struct sctp_mib,
+					GFP_KERNEL | __GFP_ZERO);
 	if (!sctp_statistics[0])
 		return -ENOMEM;
-	sctp_statistics[1] = alloc_percpu(struct sctp_mib);
+	sctp_statistics[1] = CPU_ALLOC(struct sctp_mib,
+					GFP_KERNEL | __GFP_ZERO);
 	if (!sctp_statistics[1]) {
-		free_percpu(sctp_statistics[0]);
+		CPU_FREE(sctp_statistics[0]);
 		return -ENOMEM;
 	}
 	return 0;
@@ -984,8 +987,8 @@ static int __init init_sctp_mibs(void)
 
 static void cleanup_sctp_mibs(void)
 {
-	free_percpu(sctp_statistics[0]);
-	free_percpu(sctp_statistics[1]);
+	CPU_FREE(sctp_statistics[0]);
+	CPU_FREE(sctp_statistics[1]);
 }
 
 /* Initialize the universe into something sensible.  */
Index: linux-2.6/net/ipv4/af_inet.c
===================================================================
--- linux-2.6.orig/net/ipv4/af_inet.c	2007-11-04 20:19:49.000000000 -0800
+++ linux-2.6/net/ipv4/af_inet.c	2007-11-04 21:01:34.000000000 -0800
@@ -88,6 +88,7 @@
 #include <linux/poll.h>
 #include <linux/netfilter_ipv4.h>
 #include <linux/random.h>
+#include <linux/cpu_alloc.h>
 
 #include <asm/uaccess.h>
 #include <asm/system.h>
@@ -1230,8 +1231,8 @@ unsigned long snmp_fold_field(void *mib[
 	int i;
 
 	for_each_possible_cpu(i) {
-		res += *(((unsigned long *) per_cpu_ptr(mib[0], i)) + offt);
-		res += *(((unsigned long *) per_cpu_ptr(mib[1], i)) + offt);
+		res += *(((unsigned long *) CPU_PTR(mib[0], i)) + offt);
+		res += *(((unsigned long *) CPU_PTR(mib[1], i)) + offt);
 	}
 	return res;
 }
@@ -1240,26 +1241,28 @@ EXPORT_SYMBOL_GPL(snmp_fold_field);
 int snmp_mib_init(void *ptr[2], size_t mibsize, size_t mibalign)
 {
 	BUG_ON(ptr == NULL);
-	ptr[0] = __alloc_percpu(mibsize);
+	ptr[0] = cpu_alloc(mibsize, GFP_KERNEL | __GFP_ZERO,
+					mibalign);
 	if (!ptr[0])
 		goto err0;
-	ptr[1] = __alloc_percpu(mibsize);
+	ptr[1] = cpu_alloc(mibsize, GFP_KERNEL | __GFP_ZERO,
+					mibalign);
 	if (!ptr[1])
 		goto err1;
 	return 0;
 err1:
-	free_percpu(ptr[0]);
+	cpu_free(ptr[0], mibsize);
 	ptr[0] = NULL;
 err0:
 	return -ENOMEM;
 }
 EXPORT_SYMBOL_GPL(snmp_mib_init);
 
-void snmp_mib_free(void *ptr[2])
+void snmp_mib_free(void *ptr[2], size_t mibsize)
 {
 	BUG_ON(ptr == NULL);
-	free_percpu(ptr[0]);
-	free_percpu(ptr[1]);
+	cpu_free(ptr[0], mibsize);
+	cpu_free(ptr[1], mibsize);
 	ptr[0] = ptr[1] = NULL;
 }
 EXPORT_SYMBOL_GPL(snmp_mib_free);
@@ -1324,17 +1327,18 @@ static int __init init_ipv4_mibs(void)
 	return 0;
 
 err_udplite_mib:
-	snmp_mib_free((void **)udp_statistics);
+	snmp_mib_free((void **)udp_statistics, sizeof(struct udp_mib));
 err_udp_mib:
-	snmp_mib_free((void **)tcp_statistics);
+	snmp_mib_free((void **)tcp_statistics, sizeof(struct tcp_mib));
 err_tcp_mib:
-	snmp_mib_free((void **)icmpmsg_statistics);
+	snmp_mib_free((void **)icmpmsg_statistics,
+					sizeof(struct icmpmsg_mib));
 err_icmpmsg_mib:
-	snmp_mib_free((void **)icmp_statistics);
+	snmp_mib_free((void **)icmp_statistics, sizeof(struct icmp_mib));
 err_icmp_mib:
-	snmp_mib_free((void **)ip_statistics);
+	snmp_mib_free((void **)ip_statistics, sizeof(struct ipstats_mib));
 err_ip_mib:
-	snmp_mib_free((void **)net_statistics);
+	snmp_mib_free((void **)net_statistics, sizeof(struct linux_mib));
 err_net_mib:
 	return -ENOMEM;
 }
Index: linux-2.6/net/sctp/proc.c
===================================================================
--- linux-2.6.orig/net/sctp/proc.c	2007-11-04 20:19:49.000000000 -0800
+++ linux-2.6/net/sctp/proc.c	2007-11-04 20:57:16.000000000 -0800
@@ -86,10 +86,10 @@ fold_field(void *mib[], int nr)
 
 	for_each_possible_cpu(i) {
 		res +=
-		    *((unsigned long *) (((void *) per_cpu_ptr(mib[0], i)) +
+		    *((unsigned long *) (((void *)CPU_PTR(mib[0], i)) +
 					 sizeof (unsigned long) * nr));
 		res +=
-		    *((unsigned long *) (((void *) per_cpu_ptr(mib[1], i)) +
+		    *((unsigned long *) (((void *)CPU_PTR(mib[1], i)) +
 					 sizeof (unsigned long) * nr));
 	}
 	return res;
Index: linux-2.6/include/net/ip.h
===================================================================
--- linux-2.6.orig/include/net/ip.h	2007-11-04 20:58:19.000000000 -0800
+++ linux-2.6/include/net/ip.h	2007-11-04 20:58:30.000000000 -0800
@@ -170,7 +170,7 @@ DECLARE_SNMP_STAT(struct linux_mib, net_
 
 extern unsigned long snmp_fold_field(void *mib[], int offt);
 extern int snmp_mib_init(void *ptr[2], size_t mibsize, size_t mibalign);
-extern void snmp_mib_free(void *ptr[2]);
+extern void snmp_mib_free(void *ptr[2], size_t mibsize);
 
 extern void inet_get_local_port_range(int *low, int *high);
 
Index: linux-2.6/net/ipv6/addrconf.c
===================================================================
--- linux-2.6.orig/net/ipv6/addrconf.c	2007-11-04 21:06:09.000000000 -0800
+++ linux-2.6/net/ipv6/addrconf.c	2007-11-04 21:07:56.000000000 -0800
@@ -271,18 +271,18 @@ static int snmp6_alloc_dev(struct inet6_
 	return 0;
 
 err_icmpmsg:
-	snmp_mib_free((void **)idev->stats.icmpv6);
+	snmp_mib_free((void **)idev->stats.icmpv6, sizeof(struct icmpv6_mib));
 err_icmp:
-	snmp_mib_free((void **)idev->stats.ipv6);
+	snmp_mib_free((void **)idev->stats.ipv6, sizeof(struct ipstats_mib));
 err_ip:
 	return -ENOMEM;
 }
 
 static void snmp6_free_dev(struct inet6_dev *idev)
 {
-	snmp_mib_free((void **)idev->stats.icmpv6msg);
-	snmp_mib_free((void **)idev->stats.icmpv6);
-	snmp_mib_free((void **)idev->stats.ipv6);
+	snmp_mib_free((void **)idev->stats.icmpv6msg, sizeof(struct icmpv6msg_mib));
+	snmp_mib_free((void **)idev->stats.icmpv6, sizeof(struct icmpv6_mib));
+	snmp_mib_free((void **)idev->stats.ipv6, sizeof(struct ipstats_mib));
 }
 
 /* Nobody refers to this device, we may destroy it. */
Index: linux-2.6/net/ipv6/af_inet6.c
===================================================================
--- linux-2.6.orig/net/ipv6/af_inet6.c	2007-11-04 21:02:39.000000000 -0800
+++ linux-2.6/net/ipv6/af_inet6.c	2007-11-04 21:05:36.000000000 -0800
@@ -731,13 +731,13 @@ static int __init init_ipv6_mibs(void)
 	return 0;
 
 err_udplite_mib:
-	snmp_mib_free((void **)udp_stats_in6);
+	snmp_mib_free((void **)udp_stats_in6, sizeof(struct udp_mib));
 err_udp_mib:
-	snmp_mib_free((void **)icmpv6msg_statistics);
+	snmp_mib_free((void **)icmpv6msg_statistics, sizeof(struct icmpv6msg_mib));
 err_icmpmsg_mib:
-	snmp_mib_free((void **)icmpv6_statistics);
+	snmp_mib_free((void **)icmpv6_statistics, sizeof(struct icmpv6_mib));
 err_icmp_mib:
-	snmp_mib_free((void **)ipv6_statistics);
+	snmp_mib_free((void **)ipv6_statistics, sizeof(struct ipstats_mib));
 err_ip_mib:
 	return -ENOMEM;
 
@@ -745,11 +745,11 @@ err_ip_mib:
 
 static void cleanup_ipv6_mibs(void)
 {
-	snmp_mib_free((void **)ipv6_statistics);
-	snmp_mib_free((void **)icmpv6_statistics);
-	snmp_mib_free((void **)icmpv6msg_statistics);
-	snmp_mib_free((void **)udp_stats_in6);
-	snmp_mib_free((void **)udplite_stats_in6);
+	snmp_mib_free((void **)ipv6_statistics, sizeof(struct ipstats_mib));
+	snmp_mib_free((void **)icmpv6_statistics, sizeof(struct icmpv6_mib));
+	snmp_mib_free((void **)icmpv6msg_statistics, sizeof(struct icmpv6msg_mib));
+	snmp_mib_free((void **)udp_stats_in6, sizeof(struct udp_mib));
+	snmp_mib_free((void **)udplite_stats_in6, sizeof(struct udp_mib));
 }
 
 static int __init inet6_init(void)

-- 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [patch 25/28] cpu alloc: Explicitly code allocpercpu calls in iucv
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (23 preceding siblings ...)
  2007-11-06 19:52 ` [patch 24/28] cpu alloc: convert mib handling to cpu alloc Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 26/28] cpu alloc: Use for infiniband Christoph Lameter
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_iucv --]
[-- Type: text/plain, Size: 9854 bytes --]

iucv is the only user of the allocpercpu functions that populate and
depopulate per cpu data piecemeal as processors come up and go down.
It is the only allocpercpu user that does I/O on per cpu objects
(which is difficult with virtually mapped memory), and the only
allocpercpu user that needs GFP_DMA allocations.

Remove the allocpercpu calls from iucv and code the allocation and
freeing manually.
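
A minimal sketch of the resulting pattern (the "my_" names are purely
illustrative): the allocpercpu handle becomes a static NR_CPUS-sized
array of pointers, and each slot is filled with kmalloc_node() so the
object is node local and, with GFP_DMA, stays below 2G:

#include <linux/slab.h>
#include <linux/cpumask.h>
#include <linux/topology.h>

struct my_block {
	u8 data[256];	/* stand-in for the real parameter block */
};

static struct my_block *my_blocks[NR_CPUS];

static int my_alloc_blocks(void)
{
	int cpu;

	for_each_online_cpu(cpu) {
		/* GFP_DMA gets memory below 2G, as iucv requires */
		my_blocks[cpu] = kmalloc_node(sizeof(struct my_block),
					GFP_KERNEL | GFP_DMA,
					cpu_to_node(cpu));
		if (!my_blocks[cpu])
			goto out_free;
	}
	return 0;

out_free:
	/* kfree(NULL) is a no-op, so walking all slots is safe */
	for_each_possible_cpu(cpu) {
		kfree(my_blocks[cpu]);
		my_blocks[cpu] = NULL;
	}
	return -ENOMEM;
}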

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 net/iucv/iucv.c |   99 +++++++++++++++++++++++++++++++-------------------------
 1 file changed, 56 insertions(+), 43 deletions(-)

Index: linux-2.6/net/iucv/iucv.c
===================================================================
--- linux-2.6.orig/net/iucv/iucv.c	2007-11-04 21:15:10.000000000 -0800
+++ linux-2.6/net/iucv/iucv.c	2007-11-04 21:22:50.000000000 -0800
@@ -97,7 +97,7 @@ struct iucv_irq_list {
 	struct iucv_irq_data data;
 };
 
-static struct iucv_irq_data *iucv_irq_data;
+static struct iucv_irq_data *iucv_irq_data[NR_CPUS];
 static cpumask_t iucv_buffer_cpumask = CPU_MASK_NONE;
 static cpumask_t iucv_irq_cpumask = CPU_MASK_NONE;
 
@@ -277,7 +277,7 @@ union iucv_param {
 /*
  * Anchor for per-cpu IUCV command parameter block.
  */
-static union iucv_param *iucv_param;
+static union iucv_param *iucv_param[NR_CPUS];
 
 /**
  * iucv_call_b2f0
@@ -356,7 +356,7 @@ static void iucv_allow_cpu(void *data)
 	 *	0x10 - Flag to allow priority message completion interrupts
 	 *	0x08 - Flag to allow IUCV control interrupts
 	 */
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[cpu];
 	memset(parm, 0, sizeof(union iucv_param));
 	parm->set_mask.ipmask = 0xf8;
 	iucv_call_b2f0(IUCV_SETMASK, parm);
@@ -377,7 +377,7 @@ static void iucv_block_cpu(void *data)
 	union iucv_param *parm;
 
 	/* Disable all iucv interrupts. */
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[cpu];
 	memset(parm, 0, sizeof(union iucv_param));
 	iucv_call_b2f0(IUCV_SETMASK, parm);
 
@@ -401,9 +401,9 @@ static void iucv_declare_cpu(void *data)
 		return;
 
 	/* Declare interrupt buffer. */
-	parm = percpu_ptr(iucv_param, cpu);
+	parm = iucv_param[cpu];
 	memset(parm, 0, sizeof(union iucv_param));
-	parm->db.ipbfadr1 = virt_to_phys(percpu_ptr(iucv_irq_data, cpu));
+	parm->db.ipbfadr1 = virt_to_phys(iucv_irq_data[cpu]);
 	rc = iucv_call_b2f0(IUCV_DECLARE_BUFFER, parm);
 	if (rc) {
 		char *err = "Unknown";
@@ -458,7 +458,7 @@ static void iucv_retrieve_cpu(void *data
 	iucv_block_cpu(NULL);
 
 	/* Retrieve interrupt buffer. */
-	parm = percpu_ptr(iucv_param, cpu);
+	parm = iucv_param[cpu];
 	iucv_call_b2f0(IUCV_RETRIEVE_BUFFER, parm);
 
 	/* Clear indication that an iucv buffer exists for this cpu. */
@@ -558,13 +558,13 @@ static int __cpuinit iucv_cpu_notify(str
 	switch (action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
-		if (!percpu_populate(iucv_irq_data,
-				     sizeof(struct iucv_irq_data),
-				     GFP_KERNEL|GFP_DMA, cpu))
+		iucv_irq_data[cpu] = kmalloc_node(sizeof(struct iucv_irq_data),
+					GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
+		if (!iucv_irq_data[cpu])
 			return NOTIFY_BAD;
-		if (!percpu_populate(iucv_param, sizeof(union iucv_param),
-				     GFP_KERNEL|GFP_DMA, cpu)) {
-			percpu_depopulate(iucv_irq_data, cpu);
+		iucv_param[cpu] = kmalloc_node(sizeof(union iucv_param),
+				     GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
+		if (!iucv_param[cpu])
 			return NOTIFY_BAD;
 		}
 		break;
@@ -572,8 +572,10 @@ static int __cpuinit iucv_cpu_notify(str
 	case CPU_UP_CANCELED_FROZEN:
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
-		percpu_depopulate(iucv_param, cpu);
-		percpu_depopulate(iucv_irq_data, cpu);
+		kfree(iucv_param[cpu]);
+		iucv_param[cpu] = NULL;
+		kfree(iucv_irq_data[cpu]);
+		iucv_irq_data[cpu] = NULL;
 		break;
 	case CPU_ONLINE:
 	case CPU_ONLINE_FROZEN:
@@ -612,7 +614,7 @@ static int iucv_sever_pathid(u16 pathid,
 {
 	union iucv_param *parm;
 
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	if (userdata)
 		memcpy(parm->ctrl.ipuser, userdata, sizeof(parm->ctrl.ipuser));
@@ -755,7 +757,7 @@ int iucv_path_accept(struct iucv_path *p
 
 	local_bh_disable();
 	/* Prepare parameter block. */
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	parm->ctrl.ippathid = path->pathid;
 	parm->ctrl.ipmsglim = path->msglim;
@@ -799,7 +801,7 @@ int iucv_path_connect(struct iucv_path *
 	BUG_ON(in_atomic());
 	spin_lock_bh(&iucv_table_lock);
 	iucv_cleanup_queue();
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	parm->ctrl.ipmsglim = path->msglim;
 	parm->ctrl.ipflags1 = path->flags;
@@ -854,7 +856,7 @@ int iucv_path_quiesce(struct iucv_path *
 	int rc;
 
 	local_bh_disable();
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	if (userdata)
 		memcpy(parm->ctrl.ipuser, userdata, sizeof(parm->ctrl.ipuser));
@@ -881,7 +883,7 @@ int iucv_path_resume(struct iucv_path *p
 	int rc;
 
 	local_bh_disable();
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	if (userdata)
 		memcpy(parm->ctrl.ipuser, userdata, sizeof(parm->ctrl.ipuser));
@@ -936,7 +938,7 @@ int iucv_message_purge(struct iucv_path 
 	int rc;
 
 	local_bh_disable();
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	parm->purge.ippathid = path->pathid;
 	parm->purge.ipmsgid = msg->id;
@@ -1003,7 +1005,7 @@ int iucv_message_receive(struct iucv_pat
 	}
 
 	local_bh_disable();
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	parm->db.ipbfadr1 = (u32)(addr_t) buffer;
 	parm->db.ipbfln1f = (u32) size;
@@ -1040,7 +1042,7 @@ int iucv_message_reject(struct iucv_path
 	int rc;
 
 	local_bh_disable();
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	parm->db.ippathid = path->pathid;
 	parm->db.ipmsgid = msg->id;
@@ -1074,7 +1076,7 @@ int iucv_message_reply(struct iucv_path 
 	int rc;
 
 	local_bh_disable();
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	if (flags & IUCV_IPRMDATA) {
 		parm->dpl.ippathid = path->pathid;
@@ -1118,7 +1120,7 @@ int iucv_message_send(struct iucv_path *
 	int rc;
 
 	local_bh_disable();
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	if (flags & IUCV_IPRMDATA) {
 		/* Message of 8 bytes can be placed into the parameter list. */
@@ -1172,7 +1174,7 @@ int iucv_message_send2way(struct iucv_pa
 	int rc;
 
 	local_bh_disable();
-	parm = percpu_ptr(iucv_param, smp_processor_id());
+	parm = iucv_param[smp_processor_id()];
 	memset(parm, 0, sizeof(union iucv_param));
 	if (flags & IUCV_IPRMDATA) {
 		parm->dpl.ippathid = path->pathid;
@@ -1559,7 +1561,7 @@ static void iucv_external_interrupt(u16 
 	struct iucv_irq_data *p;
 	struct iucv_irq_list *work;
 
-	p = percpu_ptr(iucv_irq_data, smp_processor_id());
+	p = iucv_irq_data[smp_processor_id()];
 	if (p->ippathid >= iucv_max_pathid) {
 		printk(KERN_WARNING "iucv_do_int: Got interrupt with "
 		       "pathid %d > max_connections (%ld)\n",
@@ -1598,6 +1600,7 @@ static void iucv_external_interrupt(u16 
 static int __init iucv_init(void)
 {
 	int rc;
+	int cpu;
 
 	if (!MACHINE_IS_VM) {
 		rc = -EPROTONOSUPPORT;
@@ -1617,19 +1620,21 @@ static int __init iucv_init(void)
 		rc = PTR_ERR(iucv_root);
 		goto out_bus;
 	}
-	/* Note: GFP_DMA used to get memory below 2G */
-	iucv_irq_data = percpu_alloc(sizeof(struct iucv_irq_data),
-				     GFP_KERNEL|GFP_DMA);
-	if (!iucv_irq_data) {
-		rc = -ENOMEM;
-		goto out_root;
-	}
-	/* Allocate parameter blocks. */
-	iucv_param = percpu_alloc(sizeof(union iucv_param),
-				  GFP_KERNEL|GFP_DMA);
-	if (!iucv_param) {
-		rc = -ENOMEM;
-		goto out_extint;
+
+	for_each_online_cpu(cpu) {
+			/* Note: GFP_DMA used to get memory below 2G */
+		iucv_irq_data[cpu] = kmalloc_node(sizeof(struct iucv_irq_data),
+				     GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
+		if (!iucv_irq_data[cpu]) {
+			rc = -ENOMEM;
+			goto out_root;
+
+		/* Allocate parameter blocks. */
+		iucv_param[cpu] = kmalloc_node(sizeof(union iucv_param),
+				  GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
+		if (!iucv_param[cpu]) {
+			rc = -ENOMEM;
+			goto out_extint;
 	}
 	register_hotcpu_notifier(&iucv_cpu_notifier);
 	ASCEBC(iucv_error_no_listener, 16);
@@ -1639,7 +1644,10 @@ static int __init iucv_init(void)
 	return 0;
 
 out_extint:
-	percpu_free(iucv_irq_data);
+	for_each_cpu(cpu) {
+		kfree(iucv_irq_data[cpu]);
+		iucv_irq_data[cpu] = NULL;
+	}
 out_root:
 	s390_root_dev_unregister(iucv_root);
 out_bus:
@@ -1658,6 +1666,7 @@ out:
 static void __exit iucv_exit(void)
 {
 	struct iucv_irq_list *p, *n;
+	int cpu;
 
 	spin_lock_irq(&iucv_queue_lock);
 	list_for_each_entry_safe(p, n, &iucv_task_queue, list)
@@ -1666,8 +1675,12 @@ static void __exit iucv_exit(void)
 		kfree(p);
 	spin_unlock_irq(&iucv_queue_lock);
 	unregister_hotcpu_notifier(&iucv_cpu_notifier);
-	percpu_free(iucv_param);
-	percpu_free(iucv_irq_data);
+	for_each_cpu(cpu) {
+		kfree(iucv_param[cpu]);
+		iucv_param[cpu] = NULL;
+		kfree(iucv_irq_data[cpu]);
+		iucv_irq_data[cpu] = NULL;
+	}
 	s390_root_dev_unregister(iucv_root);
 	bus_unregister(&iucv_bus);
 	unregister_external_interrupt(0x4000, iucv_external_interrupt);

-- 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [patch 26/28] cpu alloc: Use for infiniband
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (24 preceding siblings ...)
  2007-11-06 19:52 ` [patch 25/28] cpu alloc: Explicitly code allocpercpu calls in iucv Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 27/28] cpu alloc: Use in the crypto subsystem Christoph Lameter
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_infiniband --]
[-- Type: text/plain, Size: 3556 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 drivers/infiniband/hw/ehca/ehca_irq.c |   22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

Index: linux-2.6/drivers/infiniband/hw/ehca/ehca_irq.c
===================================================================
--- linux-2.6.orig/drivers/infiniband/hw/ehca/ehca_irq.c	2007-11-05 16:04:55.350534140 -0800
+++ linux-2.6/drivers/infiniband/hw/ehca/ehca_irq.c	2007-11-05 16:39:25.151283658 -0800
@@ -646,7 +646,7 @@ static void queue_comp_task(struct ehca_
 	cpu_id = find_next_online_cpu(pool);
 	BUG_ON(!cpu_online(cpu_id));
 
-	cct = per_cpu_ptr(pool->cpu_comp_tasks, cpu_id);
+	cct = CPU_PTR(pool->cpu_comp_tasks, cpu_id);
 	BUG_ON(!cct);
 
 	spin_lock_irqsave(&cct->task_lock, flags);
@@ -654,7 +654,7 @@ static void queue_comp_task(struct ehca_
 	spin_unlock_irqrestore(&cct->task_lock, flags);
 	if (cq_jobs > 0) {
 		cpu_id = find_next_online_cpu(pool);
-		cct = per_cpu_ptr(pool->cpu_comp_tasks, cpu_id);
+		cct = CPU_PTR(pool->cpu_comp_tasks, cpu_id);
 		BUG_ON(!cct);
 	}
 
@@ -727,7 +727,7 @@ static struct task_struct *create_comp_t
 {
 	struct ehca_cpu_comp_task *cct;
 
-	cct = per_cpu_ptr(pool->cpu_comp_tasks, cpu);
+	cct = CPU_PTR(pool->cpu_comp_tasks, cpu);
 	spin_lock_init(&cct->task_lock);
 	INIT_LIST_HEAD(&cct->cq_list);
 	init_waitqueue_head(&cct->wait_queue);
@@ -743,7 +743,7 @@ static void destroy_comp_task(struct ehc
 	struct task_struct *task;
 	unsigned long flags_cct;
 
-	cct = per_cpu_ptr(pool->cpu_comp_tasks, cpu);
+	cct = CPU_PTR(pool->cpu_comp_tasks, cpu);
 
 	spin_lock_irqsave(&cct->task_lock, flags_cct);
 
@@ -759,7 +759,7 @@ static void destroy_comp_task(struct ehc
 
 static void __cpuinit take_over_work(struct ehca_comp_pool *pool, int cpu)
 {
-	struct ehca_cpu_comp_task *cct = per_cpu_ptr(pool->cpu_comp_tasks, cpu);
+	struct ehca_cpu_comp_task *cct = CPU_PTR(pool->cpu_comp_tasks, cpu);
 	LIST_HEAD(list);
 	struct ehca_cq *cq;
 	unsigned long flags_cct;
@@ -772,8 +772,7 @@ static void __cpuinit take_over_work(str
 		cq = list_entry(cct->cq_list.next, struct ehca_cq, entry);
 
 		list_del(&cq->entry);
-		__queue_comp_task(cq, per_cpu_ptr(pool->cpu_comp_tasks,
-						  smp_processor_id()));
+		__queue_comp_task(cq, THIS_CPU(pool->cpu_comp_tasks));
 	}
 
 	spin_unlock_irqrestore(&cct->task_lock, flags_cct);
@@ -799,14 +798,14 @@ static int __cpuinit comp_pool_callback(
 	case CPU_UP_CANCELED:
 	case CPU_UP_CANCELED_FROZEN:
 		ehca_gen_dbg("CPU: %x (CPU_CANCELED)", cpu);
-		cct = per_cpu_ptr(pool->cpu_comp_tasks, cpu);
+		cct = CPU_PTR(pool->cpu_comp_tasks, cpu);
 		kthread_bind(cct->task, any_online_cpu(cpu_online_map));
 		destroy_comp_task(pool, cpu);
 		break;
 	case CPU_ONLINE:
 	case CPU_ONLINE_FROZEN:
 		ehca_gen_dbg("CPU: %x (CPU_ONLINE)", cpu);
-		cct = per_cpu_ptr(pool->cpu_comp_tasks, cpu);
+		cct = CPU_PTR(pool->cpu_comp_tasks, cpu);
 		kthread_bind(cct->task, cpu);
 		wake_up_process(cct->task);
 		break;
@@ -849,7 +848,8 @@ int ehca_create_comp_pool(void)
 	spin_lock_init(&pool->last_cpu_lock);
 	pool->last_cpu = any_online_cpu(cpu_online_map);
 
-	pool->cpu_comp_tasks = alloc_percpu(struct ehca_cpu_comp_task);
+	pool->cpu_comp_tasks = CPU_ALLOC(struct ehca_cpu_comp_task,
+						GFP_KERNEL | __GFP_ZERO);
 	if (pool->cpu_comp_tasks == NULL) {
 		kfree(pool);
 		return -EINVAL;
@@ -883,6 +883,6 @@ void ehca_destroy_comp_pool(void)
 		if (cpu_online(i))
 			destroy_comp_task(pool, i);
 	}
-	free_percpu(pool->cpu_comp_tasks);
+	CPU_FREE(pool->cpu_comp_tasks);
 	kfree(pool);
 }

-- 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [patch 27/28] cpu alloc: Use in the crypto subsystem.
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (25 preceding siblings ...)
  2007-11-06 19:52 ` [patch 26/28] cpu alloc: Use for infiniband Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-06 19:52 ` [patch 28/28] cpu alloc: Remove the allocpercpu functionality Christoph Lameter
  2007-11-07 13:10 ` [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Martin Schwidefsky
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_crypto --]
[-- Type: text/plain, Size: 2246 bytes --]

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 crypto/async_tx/async_tx.c |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

Index: linux-2.6/crypto/async_tx/async_tx.c
===================================================================
--- linux-2.6.orig/crypto/async_tx/async_tx.c	2007-11-05 09:46:04.000000000 -0800
+++ linux-2.6/crypto/async_tx/async_tx.c	2007-11-05 09:49:56.000000000 -0800
@@ -207,10 +207,10 @@ static void async_tx_rebalance(void)
 	for_each_dma_cap_mask(cap, dma_cap_mask_all)
 		for_each_possible_cpu(cpu) {
 			struct dma_chan_ref *ref =
-				per_cpu_ptr(channel_table[cap], cpu)->ref;
+				CPU_PTR(channel_table[cap], cpu)->ref;
 			if (ref) {
 				atomic_set(&ref->count, 0);
-				per_cpu_ptr(channel_table[cap], cpu)->ref =
+				CPU_PTR(channel_table[cap], cpu)->ref =
 									NULL;
 			}
 		}
@@ -223,7 +223,7 @@ static void async_tx_rebalance(void)
 			else
 				new = get_chan_ref_by_cap(cap, -1);
 
-			per_cpu_ptr(channel_table[cap], cpu)->ref = new;
+			CPU_PTR(channel_table[cap], cpu)->ref = new;
 		}
 
 	spin_unlock_irqrestore(&async_tx_lock, flags);
@@ -327,7 +327,8 @@ async_tx_init(void)
 	clear_bit(DMA_INTERRUPT, dma_cap_mask_all.bits);
 
 	for_each_dma_cap_mask(cap, dma_cap_mask_all) {
-		channel_table[cap] = alloc_percpu(struct chan_ref_percpu);
+		channel_table[cap] = CPU_ALLOC(struct chan_ref_percpu,
+						GFP_KERNEL | __GFP_ZERO);
 		if (!channel_table[cap])
 			goto err;
 	}
@@ -343,7 +344,7 @@ err:
 	printk(KERN_ERR "async_tx: initialization failure\n");
 
 	while (--cap >= 0)
-		free_percpu(channel_table[cap]);
+		CPU_FREE(channel_table[cap]);
 
 	return 1;
 }
@@ -356,7 +357,7 @@ static void __exit async_tx_exit(void)
 
 	for_each_dma_cap_mask(cap, dma_cap_mask_all)
 		if (channel_table[cap])
-			free_percpu(channel_table[cap]);
+			CPU_FREE(channel_table[cap]);
 
 	dma_async_client_unregister(&async_tx_dma);
 }
@@ -378,7 +379,7 @@ async_tx_find_channel(struct dma_async_t
 	else if (likely(channel_table_initialized)) {
 		struct dma_chan_ref *ref;
 		int cpu = get_cpu();
-		ref = per_cpu_ptr(channel_table[tx_type], cpu)->ref;
+		ref = CPU_PTR(channel_table[tx_type], cpu)->ref;
 		put_cpu();
 		return ref ? ref->chan : NULL;
 	} else

-- 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [patch 28/28] cpu alloc: Remove the allocpercpu functionality
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (26 preceding siblings ...)
  2007-11-06 19:52 ` [patch 27/28] cpu alloc: Use in the crypto subsystem Christoph Lameter
@ 2007-11-06 19:52 ` Christoph Lameter
  2007-11-07 13:10 ` [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Martin Schwidefsky
  28 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-06 19:52 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, David Miller, Eric Dumazet, Martin Schwidefsky

[-- Attachment #1: cpu_alloc_remove_allocpercpu --]
[-- Type: text/plain, Size: 7814 bytes --]

No users of allocpercpu are left after the earlier patches in this
series have been applied. Remove the code that implements allocpercpu.
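
For reference, a before/after sketch of what the conversion means for
a typical user of the removed interface. The counter structure and
"my_" functions are hypothetical; CPU_ALLOC()/THIS_CPU()/CPU_FREE()
are the cpu alloc interfaces introduced at the start of this series:

#include <linux/cpu_alloc.h>

struct my_counter {
	unsigned long events;
};

static struct my_counter *counters;

static int my_init(void)
{
	/* Was: counters = alloc_percpu(struct my_counter); */
	counters = CPU_ALLOC(struct my_counter, GFP_KERNEL | __GFP_ZERO);
	return counters ? 0 : -ENOMEM;
}

static void my_event(void)
{
	/*
	 * Was: per_cpu_ptr(counters, smp_processor_id())->events++;
	 * THIS_CPU() is only safe if preemption (or BHs) is already
	 * disabled, as in the loopback conversion earlier.
	 */
	THIS_CPU(counters)->events++;
}

static void my_exit(void)
{
	/* Was: free_percpu(counters); */
	CPU_FREE(counters);
}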

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 include/linux/percpu.h |   79 ------------------------------
 mm/Makefile            |    1 
 mm/allocpercpu.c       |  127 -------------------------------------------------
 3 files changed, 207 deletions(-)

Index: linux-2.6/include/linux/percpu.h
===================================================================
--- linux-2.6.orig/include/linux/percpu.h	2007-11-04 19:25:18.000000000 -0800
+++ linux-2.6/include/linux/percpu.h	2007-11-04 19:25:31.000000000 -0800
@@ -31,83 +31,4 @@
 	&__get_cpu_var(var); }))
 #define put_cpu_var(var) preempt_enable()
 
-#ifdef CONFIG_SMP
-
-struct percpu_data {
-	void *ptrs[NR_CPUS];
-};
-
-#define __percpu_disguise(pdata) (struct percpu_data *)~(unsigned long)(pdata)
-/* 
- * Use this to get to a cpu's version of the per-cpu object dynamically
- * allocated. Non-atomic access to the current CPU's version should
- * probably be combined with get_cpu()/put_cpu().
- */ 
-#define percpu_ptr(ptr, cpu)                              \
-({                                                        \
-        struct percpu_data *__p = __percpu_disguise(ptr); \
-        (__typeof__(ptr))__p->ptrs[(cpu)];	          \
-})
-
-extern void *percpu_populate(void *__pdata, size_t size, gfp_t gfp, int cpu);
-extern void percpu_depopulate(void *__pdata, int cpu);
-extern int __percpu_populate_mask(void *__pdata, size_t size, gfp_t gfp,
-				  cpumask_t *mask);
-extern void __percpu_depopulate_mask(void *__pdata, cpumask_t *mask);
-extern void *__percpu_alloc_mask(size_t size, gfp_t gfp, cpumask_t *mask);
-extern void percpu_free(void *__pdata);
-
-#else /* CONFIG_SMP */
-
-#define percpu_ptr(ptr, cpu) ({ (void)(cpu); (ptr); })
-
-static inline void percpu_depopulate(void *__pdata, int cpu)
-{
-}
-
-static inline void __percpu_depopulate_mask(void *__pdata, cpumask_t *mask)
-{
-}
-
-static inline void *percpu_populate(void *__pdata, size_t size, gfp_t gfp,
-				    int cpu)
-{
-	return percpu_ptr(__pdata, cpu);
-}
-
-static inline int __percpu_populate_mask(void *__pdata, size_t size, gfp_t gfp,
-					 cpumask_t *mask)
-{
-	return 0;
-}
-
-static __always_inline void *__percpu_alloc_mask(size_t size, gfp_t gfp, cpumask_t *mask)
-{
-	return kzalloc(size, gfp);
-}
-
-static inline void percpu_free(void *__pdata)
-{
-	kfree(__pdata);
-}
-
-#endif /* CONFIG_SMP */
-
-#define percpu_populate_mask(__pdata, size, gfp, mask) \
-	__percpu_populate_mask((__pdata), (size), (gfp), &(mask))
-#define percpu_depopulate_mask(__pdata, mask) \
-	__percpu_depopulate_mask((__pdata), &(mask))
-#define percpu_alloc_mask(size, gfp, mask) \
-	__percpu_alloc_mask((size), (gfp), &(mask))
-
-#define percpu_alloc(size, gfp) percpu_alloc_mask((size), (gfp), cpu_online_map)
-
-/* (legacy) interface for use without CPU hotplug handling */
-
-#define __alloc_percpu(size)	percpu_alloc_mask((size), GFP_KERNEL, \
-						  cpu_possible_map)
-#define alloc_percpu(type)	(type *)__alloc_percpu(sizeof(type))
-#define free_percpu(ptr)	percpu_free((ptr))
-#define per_cpu_ptr(ptr, cpu)	percpu_ptr((ptr), (cpu))
-
 #endif /* __LINUX_PERCPU_H */
Index: linux-2.6/mm/Makefile
===================================================================
--- linux-2.6.orig/mm/Makefile	2007-11-04 19:25:00.000000000 -0800
+++ linux-2.6/mm/Makefile	2007-11-04 19:25:04.000000000 -0800
@@ -28,5 +28,4 @@ obj-$(CONFIG_SLUB) += slub.o
 obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
 obj-$(CONFIG_FS_XIP) += filemap_xip.o
 obj-$(CONFIG_MIGRATION) += migrate.o
-obj-$(CONFIG_SMP) += allocpercpu.o
 obj-$(CONFIG_QUICKLIST) += quicklist.o
Index: linux-2.6/mm/allocpercpu.c
===================================================================
--- linux-2.6.orig/mm/allocpercpu.c	2007-11-04 19:24:51.000000000 -0800
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,127 +0,0 @@
-/*
- * linux/mm/allocpercpu.c
- *
- * Separated from slab.c August 11, 2006 Christoph Lameter <clameter@sgi.com>
- */
-#include <linux/mm.h>
-#include <linux/module.h>
-
-/**
- * percpu_depopulate - depopulate per-cpu data for given cpu
- * @__pdata: per-cpu data to depopulate
- * @cpu: depopulate per-cpu data for this cpu
- *
- * Depopulating per-cpu data for a cpu going offline would be a typical
- * use case. You need to register a cpu hotplug handler for that purpose.
- */
-void percpu_depopulate(void *__pdata, int cpu)
-{
-	struct percpu_data *pdata = __percpu_disguise(__pdata);
-
-	kfree(pdata->ptrs[cpu]);
-	pdata->ptrs[cpu] = NULL;
-}
-EXPORT_SYMBOL_GPL(percpu_depopulate);
-
-/**
- * percpu_depopulate_mask - depopulate per-cpu data for some cpu's
- * @__pdata: per-cpu data to depopulate
- * @mask: depopulate per-cpu data for cpu's selected through mask bits
- */
-void __percpu_depopulate_mask(void *__pdata, cpumask_t *mask)
-{
-	int cpu;
-	for_each_cpu_mask(cpu, *mask)
-		percpu_depopulate(__pdata, cpu);
-}
-EXPORT_SYMBOL_GPL(__percpu_depopulate_mask);
-
-/**
- * percpu_populate - populate per-cpu data for given cpu
- * @__pdata: per-cpu data to populate further
- * @size: size of per-cpu object
- * @gfp: may sleep or not etc.
- * @cpu: populate per-data for this cpu
- *
- * Populating per-cpu data for a cpu coming online would be a typical
- * use case. You need to register a cpu hotplug handler for that purpose.
- * Per-cpu object is populated with zeroed buffer.
- */
-void *percpu_populate(void *__pdata, size_t size, gfp_t gfp, int cpu)
-{
-	struct percpu_data *pdata = __percpu_disguise(__pdata);
-	int node = cpu_to_node(cpu);
-
-	BUG_ON(pdata->ptrs[cpu]);
-	if (node_online(node))
-		pdata->ptrs[cpu] = kmalloc_node(size, gfp|__GFP_ZERO, node);
-	else
-		pdata->ptrs[cpu] = kzalloc(size, gfp);
-	return pdata->ptrs[cpu];
-}
-EXPORT_SYMBOL_GPL(percpu_populate);
-
-/**
- * percpu_populate_mask - populate per-cpu data for more cpu's
- * @__pdata: per-cpu data to populate further
- * @size: size of per-cpu object
- * @gfp: may sleep or not etc.
- * @mask: populate per-cpu data for cpu's selected through mask bits
- *
- * Per-cpu objects are populated with zeroed buffers.
- */
-int __percpu_populate_mask(void *__pdata, size_t size, gfp_t gfp,
-			   cpumask_t *mask)
-{
-	cpumask_t populated = CPU_MASK_NONE;
-	int cpu;
-
-	for_each_cpu_mask(cpu, *mask)
-		if (unlikely(!percpu_populate(__pdata, size, gfp, cpu))) {
-			__percpu_depopulate_mask(__pdata, &populated);
-			return -ENOMEM;
-		} else
-			cpu_set(cpu, populated);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(__percpu_populate_mask);
-
-/**
- * percpu_alloc_mask - initial setup of per-cpu data
- * @size: size of per-cpu object
- * @gfp: may sleep or not etc.
- * @mask: populate per-data for cpu's selected through mask bits
- *
- * Populating per-cpu data for all online cpu's would be a typical use case,
- * which is simplified by the percpu_alloc() wrapper.
- * Per-cpu objects are populated with zeroed buffers.
- */
-void *__percpu_alloc_mask(size_t size, gfp_t gfp, cpumask_t *mask)
-{
-	void *pdata = kzalloc(sizeof(struct percpu_data), gfp);
-	void *__pdata = __percpu_disguise(pdata);
-
-	if (unlikely(!pdata))
-		return NULL;
-	if (likely(!__percpu_populate_mask(__pdata, size, gfp, mask)))
-		return __pdata;
-	kfree(pdata);
-	return NULL;
-}
-EXPORT_SYMBOL_GPL(__percpu_alloc_mask);
-
-/**
- * percpu_free - final cleanup of per-cpu data
- * @__pdata: object to clean up
- *
- * We simply clean up any per-cpu object left. No need for the client to
- * track and specify through a bis mask which per-cpu objects are to free.
- */
-void percpu_free(void *__pdata)
-{
-	if (unlikely(!__pdata))
-		return;
-	__percpu_depopulate_mask(__pdata, &cpu_possible_map);
-	kfree(__percpu_disguise(__pdata));
-}
-EXPORT_SYMBOL_GPL(percpu_free);

-- 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects
  2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
                   ` (27 preceding siblings ...)
  2007-11-06 19:52 ` [patch 28/28] cpu alloc: Remove the allocpercpu functionality Christoph Lameter
@ 2007-11-07 13:10 ` Martin Schwidefsky
  2007-11-07 18:05   ` Christoph Lameter
  28 siblings, 1 reply; 78+ messages in thread
From: Martin Schwidefsky @ 2007-11-07 13:10 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, linux-mm, linux-kernel, David Miller, Eric Dumazet, heiko.carstens

On Tue, 2007-11-06 at 11:51 -0800, Christoph Lameter wrote:
> TODO:
> - Currently only i386, ia64 and x86_64 arch definitions are provided.
>   Other arches fall back to 64k static configurations.
> - Cpu hotplug support. Currently we simply allocate for all possible processors.
>   We could reduce this to only online processors if we could allocate the
>   cpu area for the new processor before the callbacks are run, and if we
>   could free the cpu areas for a processor going down after all the
>   callbacks for it have run.
> - There are various modifications to exotic configurations that still need
>   some testing (f.e. s/390 iucv--whatever that is--) etc. Tests were
>   done on UP (i386), SMP (i386, x86_64) and NUMA (x86_64, ia64).

IUCV = Inter-User-Communication-Vehicle. Nice name, isn't it?

It is a z/VM hypervisor interface that allows the different guests to
communicate with each other. net/iucv/iucv.c is the base code,
net/iucv/af_iucv.c implements the socket interface.

The patch you provided for iucv has a few bugs, which are corrected with
the patch below (please merge it with patch #25). With it the new cpu
alloc code works fine on s390. We will likely want to switch to a
virtual configuration as well; for now we can live with the static
configuration.

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.

---

 net/iucv/iucv.c |   18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/net/iucv/iucv.c b/net/iucv/iucv.c
index 0d77dba..7698f6c 100644
--- a/net/iucv/iucv.c
+++ b/net/iucv/iucv.c
@@ -566,7 +566,6 @@ static int __cpuinit iucv_cpu_notify(struct notifier_block *self,
 				     GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
 		if (!iucv_param[cpu])
 			return NOTIFY_BAD;
-		}
 		break;
 	case CPU_UP_CANCELED:
 	case CPU_UP_CANCELED_FROZEN:
@@ -1622,19 +1621,21 @@ static int __init iucv_init(void)
 	}
 
 	for_each_online_cpu(cpu) {
-			/* Note: GFP_DMA used to get memory below 2G */
+		/* Note: GFP_DMA used to get memory below 2G */
 		iucv_irq_data[cpu] = kmalloc_node(sizeof(struct iucv_irq_data),
 				     GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
 		if (!iucv_irq_data[cpu]) {
 			rc = -ENOMEM;
-			goto out_root;
+			goto out_free;
+		}
 
 		/* Allocate parameter blocks. */
 		iucv_param[cpu] = kmalloc_node(sizeof(union iucv_param),
 				  GFP_KERNEL|GFP_DMA, cpu_to_node(cpu));
 		if (!iucv_param[cpu]) {
 			rc = -ENOMEM;
-			goto out_extint;
+			goto out_free;
+		}
 	}
 	register_hotcpu_notifier(&iucv_cpu_notifier);
 	ASCEBC(iucv_error_no_listener, 16);
@@ -1643,12 +1644,13 @@ static int __init iucv_init(void)
 	iucv_available = 1;
 	return 0;
 
-out_extint:
-	for_each_cpu(cpu) {
+out_free:
+	for_each_possible_cpu(cpu) {
+		kfree(iucv_param[cpu]);
+		iucv_param[cpu] = NULL;
 		kfree(iucv_irq_data[cpu]);
 		iucv_irq_data[cpu] = NULL;
 	}
-out_root:
 	s390_root_dev_unregister(iucv_root);
 out_bus:
 	bus_unregister(&iucv_bus);
@@ -1675,7 +1677,7 @@ static void __exit iucv_exit(void)
 		kfree(p);
 	spin_unlock_irq(&iucv_queue_lock);
 	unregister_hotcpu_notifier(&iucv_cpu_notifier);
-	for_each_cpu(cpu) {
+	for_each_possible_cpu(cpu) {
 		kfree(iucv_param[cpu]);
 		iucv_param[cpu] = NULL;
 		kfree(iucv_irq_data[cpu]);



^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects
  2007-11-07 13:10 ` [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Martin Schwidefsky
@ 2007-11-07 18:05   ` Christoph Lameter
  0 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-07 18:05 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: akpm, linux-mm, linux-kernel, David Miller, Eric Dumazet, heiko.carstens

On Wed, 7 Nov 2007, Martin Schwidefsky wrote:

> The patch you provided for iucv has a few bugs which are corrected with
> the patch below (please merge with patch #25). With it new cpu alloc
> code works fine on s390. We likely want to switch to a virtual
> configuration as well. For now we can live with the static
> configuration.

Ok. Thanks!

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-06 19:51 ` [patch 01/28] cpu alloc: The allocator Christoph Lameter
@ 2007-11-08 12:34   ` Peter Zijlstra
  2007-11-08 12:37     ` Peter Zijlstra
  2007-11-08 18:33     ` Christoph Lameter
       [not found]   ` <1194522615.6289.136.camel@twins>
  2007-11-13 11:15   ` David Miller
  2 siblings, 2 replies; 78+ messages in thread
From: Peter Zijlstra @ 2007-11-08 12:34 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: akpm, linux-kernel

On Tue, 2007-11-06 at 11:51 -0800, Christoph Lameter wrote:

> +/*
> + * Lock to protect the bitmap and the meta data for the cpu allocator.
> + */
> +static DEFINE_SPINLOCK(cpu_alloc_map_lock);

I thought you got nightmares from global locks :-)

> +/*
> + * Allocate an object of a certain size
> + *
> + * Returns a special pointer that can be used with CPU_PTR to find the
> + * address of the object for a certain cpu.
> + */
> +void *cpu_alloc(unsigned long size, gfp_t gfpflags, unsigned long align)
> +{
> +	unsigned long start;
> +	int units = size_to_units(size);
> +	void *ptr;
> +	int first;
> +	unsigned long map_size;
> +
> +	BUG_ON(gfpflags & ~(GFP_RECLAIM_MASK | __GFP_ZERO));
> +
> +	spin_lock(&cpu_alloc_map_lock);
> +
> +restart:
> +	map_size = PAGE_SIZE << cpu_alloc_map_order;
> +	first = 1;
> +	start = first_free;
> +
> +	for ( ; ; ) {
> +
> +		start = find_next_zero_bit(cpu_alloc_map, map_size, start);
> +		if (first)
> +			first_free = start;
> +
> +		if (start >= units_total) {
> +			if (expand_cpu_area(gfpflags))
> +				goto out_of_memory;
> +			goto restart;
> +		}
> +
> +		/*
> +		 * Check alignment and that there is enough space after
> +		 * the starting unit.
> +		 */
> +		if (start % (align / UNIT_SIZE) == 0 &&
> +			find_next_bit(cpu_alloc_map, map_size, start + 1)
> +							>= start + units)
> +				break;
> +		start++;
> +		first = 0;
> +	}
> +
> +	if (first)
> +		first_free = start + units;
> +
> +	while (start + units > units_total) {
> +		if (expand_cpu_area(gfpflags))
> +			goto out_of_memory;
> +	}
> +
> +	set_map(start, units);
> +	units_free -= units;
> +	__count_vm_events(CPU_BYTES, units * UNIT_SIZE);
> +
> +	spin_unlock(&cpu_alloc_map_lock);
> +
> +	ptr = cpu_area + start * UNIT_SIZE;
> +
> +	if (gfpflags & __GFP_ZERO) {
> +		int cpu;
> +
> +		for_each_possible_cpu(cpu)
> +			memset(CPU_PTR(ptr, cpu), 0, size);
> +	}
> +
> +	return ptr;
> +
> +out_of_memory:
> +	spin_unlock(&cpu_alloc_map_lock);
> +	return NULL;
> +}
> +EXPORT_SYMBOL(cpu_alloc);
> +
> +/*
> + * Free an object. The pointer must be a cpu pointer allocated
> + * via cpu_alloc.
> + */
> +void cpu_free(void *start, unsigned long size)
> +{
> +	int units = size_to_units(size);
> +	int index;
> +	u8 *p = start;
> +
> +	BUG_ON(p < cpu_area);
> +	index = (p - cpu_area) / UNIT_SIZE;
> +	BUG_ON(!test_bit(index, cpu_alloc_map) ||
> +			index >= units_total);
> +
> +	spin_lock(&cpu_alloc_map_lock);
> +
> +	clear_map(index, units);
> +	units_free += units;
> +	__count_vm_events(CPU_BYTES, -units * UNIT_SIZE);
> +	if (index < first_free)
> +		first_free = index;
> +
> +	spin_unlock(&cpu_alloc_map_lock);
> +}
> +EXPORT_SYMBOL(cpu_free);

Why a bitmap allocator and not a heap allocator?

Also, looking at the lock usage, this thing is not IRQ safe, so it
should not be called from hardirq context. Please document this.

> +#ifndef _LINUX_CPU_ALLOC_H_
> +#define _LINUX_CPU_ALLOC_H_
> +
> +#define CPU_OFFSET(__cpu) \
> +	((unsigned long)(__cpu) << (CONFIG_CPU_AREA_ORDER + PAGE_SHIFT))
> +
> +#define CPU_PTR(__p, __cpu) ((__typeof__(__p))((void *)(__p) + \
> +							CPU_OFFSET(__cpu)))
> +
> +#define CPU_ALLOC(type, flags)	cpu_alloc(sizeof(type), flags, \
> +					__alignof__(type))
> +#define CPU_FREE(pointer)	cpu_free(pointer, sizeof(*(pointer)))
> +
> +#define THIS_CPU(__p)	CPU_PTR(__p, smp_processor_id())
> +#define __THIS_CPU(__p)	CPU_PTR(__p, raw_smp_processor_id())
> +
> +/*
> + * Raw calls
> + */
> +void *cpu_alloc(unsigned long size, gfp_t gfp, unsigned long align);
> +void cpu_free(void *cpu_pointer, unsigned long size);
> +
> +#endif /* _LINUX_CPU_ALLOC_H_ */

Like I said in the previous mail (which due to creative mailing on your
end never made it out to the lists), I dislike those shouting macros.
Please lowercase them.



^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-08 12:34   ` Peter Zijlstra
@ 2007-11-08 12:37     ` Peter Zijlstra
  2007-11-08 18:33     ` Christoph Lameter
  1 sibling, 0 replies; 78+ messages in thread
From: Peter Zijlstra @ 2007-11-08 12:37 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: akpm, linux-kernel

On Thu, 2007-11-08 at 13:34 +0100, Peter Zijlstra wrote:
> On Tue, 2007-11-06 at 11:51 -0800, Christoph Lameter wrote:

> > +#ifndef _LINUX_CPU_ALLOC_H_
> > +#define _LINUX_CPU_ALLOC_H_
> > +
> > +#define CPU_OFFSET(__cpu) \
> > +	((unsigned long)(__cpu) << (CONFIG_CPU_AREA_ORDER + PAGE_SHIFT))
> > +
> > +#define CPU_PTR(__p, __cpu) ((__typeof__(__p))((void *)(__p) + \
> > +							CPU_OFFSET(__cpu)))
> > +
> > +#define CPU_ALLOC(type, flags)	cpu_alloc(sizeof(type), flags, \
> > +					__alignof__(type))
> > +#define CPU_FREE(pointer)	cpu_free(pointer, sizeof(*(pointer)))
> > +
> > +#define THIS_CPU(__p)	CPU_PTR(__p, smp_processor_id())
> > +#define __THIS_CPU(__p)	CPU_PTR(__p, raw_smp_processor_id())
> > +
> > +/*
> > + * Raw calls
> > + */
> > +void *cpu_alloc(unsigned long size, gfp_t gfp, unsigned long align);
> > +void cpu_free(void *cpu_pointer, unsigned long size);
> > +
> > +#endif /* _LINUX_CPU_ALLOC_H_ */
> 
> Like I said in the previous mail (which due to creative mailing on your
> end never made it out to the lists), I dislike those shouting macros.
> Please lowercase them.

sed -i -e 's/CPU_OFFSET/cpu_offset/g' -e 's/CPU_PTR/cpu_ptr/' -e
's/CPU_ALLOC/cpu_alloc_type/g'  -e 's/cpu_free/__cpu_free/g' -e
's/CPU_FREE/cpu_free/' -e 's/THIS_CPU/this_cpu/g' patches/*.patch

should get you there.


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-08 12:34   ` Peter Zijlstra
  2007-11-08 12:37     ` Peter Zijlstra
@ 2007-11-08 18:33     ` Christoph Lameter
  2007-11-08 18:50       ` Christoph Lameter
  1 sibling, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-08 18:33 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: akpm, linux-kernel

On Thu, 8 Nov 2007, Peter Zijlstra wrote:

> On Tue, 2007-11-06 at 11:51 -0800, Christoph Lameter wrote:
> 
> > +/*
> > + * Lock to protect the bitmap and the meta data for the cpu allocator.
> > + */
> > +static DEFINE_SPINLOCK(cpu_alloc_map_lock);
> 
> I thought you got nightmares from global locks :-)

Yes, but this one is rarely taken.

> Why a bitmap allocator and not a heap allocator?

Because the allocator must be able to deal with small 4 byte entities.
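
For the curious, the unit bookkeeping behind this is roughly the following
(a sketch only; UNIT_SIZE is assumed here to be the 4-byte granularity
mentioned above, and the patch's actual definition may differ):

	#define UNIT_SIZE 4	/* assumed: smallest supported allocation */

	static inline int size_to_units(unsigned long size)
	{
		return (size + UNIT_SIZE - 1) / UNIT_SIZE;
	}

	/*
	 * Each bit in cpu_alloc_map covers one UNIT_SIZE chunk, so a 4-byte
	 * counter costs a single bit of metadata. A heap allocator's
	 * per-object header would be larger than the object itself.
	 */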

> Also, looking at the lock usage, this thing is not IRQ safe, so it
> should not be called from hardirq context. Please document this.

Ok.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-08 18:33     ` Christoph Lameter
@ 2007-11-08 18:50       ` Christoph Lameter
  2007-11-08 20:19         ` Peter Zijlstra
  0 siblings, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-08 18:50 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: akpm, linux-kernel

On Thu, 8 Nov 2007, Christoph Lameter wrote:

> > Also, looking at the lock usage, this thing is not IRQ safe, so it
> > should not be called from hardirq context. Please document this.

Well I went the other way and made it work like the slab allocators.


cpu_alloc: Make it irq safe

Use the same method as SLAB/SLUB to make the allocator interrupt safe:
disable interrupts while allocator metadata is processed, and reenable
interrupts during page allocator calls if __GFP_WAIT is set in the flags
passed to the allocator.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/cpu_alloc.c |   16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

Index: linux-2.6/mm/cpu_alloc.c
===================================================================
--- linux-2.6.orig/mm/cpu_alloc.c	2007-11-07 16:49:40.069701326 -0800
+++ linux-2.6/mm/cpu_alloc.c	2007-11-08 10:45:43.172294260 -0800
@@ -186,6 +186,8 @@ static int expand_cpu_area(gfp_t flags)
 		goto out;
 
 	spin_unlock(&cpu_alloc_map_lock);
+	if (flags & __GFP_WAIT)
+		local_irq_enable();
 
 	/*
 	 * Determine the size of the bit map needed
@@ -212,6 +214,8 @@ static int expand_cpu_area(gfp_t flags)
 			goto out;
 	}
 
+	if (flags & __GFP_WAIT)
+		local_irq_disable();
 	spin_lock(&cpu_alloc_map_lock);
 
 	/*
@@ -312,10 +316,11 @@ void *cpu_alloc(unsigned long size, gfp_
 	void *ptr;
 	int first;
 	unsigned long map_size;
+	unsigned long flags;
 
 	BUG_ON(gfpflags & ~(GFP_RECLAIM_MASK | __GFP_ZERO));
 
-	spin_lock(&cpu_alloc_map_lock);
+	spin_lock_irqsave(&cpu_alloc_map_lock, flags);
 
 restart:
 	map_size = PAGE_SIZE << cpu_alloc_map_order;
@@ -358,7 +363,7 @@ restart:
 	units_free -= units;
 	__count_vm_events(CPU_BYTES, units * UNIT_SIZE);
 
-	spin_unlock(&cpu_alloc_map_lock);
+	spin_unlock_irqrestore(&cpu_alloc_map_lock, flags);
 
 	ptr = cpu_area + start * UNIT_SIZE;
 
@@ -372,7 +377,7 @@ restart:
 	return ptr;
 
 out_of_memory:
-	spin_unlock(&cpu_alloc_map_lock);
+	spin_unlock_irqrestore(&cpu_alloc_map_lock, flags);
 	return NULL;
 }
 EXPORT_SYMBOL(cpu_alloc);
@@ -386,13 +391,14 @@ void cpu_free(void *start, unsigned long
 	int units = size_to_units(size);
 	int index;
 	u8 *p = start;
+	unsigned long flags;
 
 	BUG_ON(p < cpu_area);
 	index = (p - cpu_area) / UNIT_SIZE;
 	BUG_ON(!test_bit(index, cpu_alloc_map) ||
 			index >= units_total);
 
-	spin_lock(&cpu_alloc_map_lock);
+	spin_lock_irqsave(&cpu_alloc_map_lock, flags);
 
 	clear_map(index, units);
 	units_free += units;
@@ -400,6 +406,6 @@ void cpu_free(void *start, unsigned long
 	if (index < first_free)
 		first_free = index;
 
-	spin_unlock(&cpu_alloc_map_lock);
+	spin_unlock_irqrestore(&cpu_alloc_map_lock, flags);
 }
 EXPORT_SYMBOL(cpu_free);

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
       [not found]     ` <Pine.LNX.4.64.0711081030380.7871@schroedinger.engr.sgi.com>
@ 2007-11-08 20:19       ` Peter Zijlstra
  2007-11-08 20:24         ` Christoph Lameter
  2007-11-08 23:26         ` David Miller
  0 siblings, 2 replies; 78+ messages in thread
From: Peter Zijlstra @ 2007-11-08 20:19 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: akpm, linux-kernel


On Thu, 2007-11-08 at 10:31 -0800, Christoph Lameter wrote:
> On Thu, 8 Nov 2007, Peter Zijlstra wrote:
> 
> > > +#define CPU_OFFSET(__cpu) \
> > > +	((unsigned long)(__cpu) << (CONFIG_CPU_AREA_ORDER + PAGE_SHIFT))
> > > +
> > > +#define CPU_PTR(__p, __cpu) ((__typeof__(__p))((void *)(__p) + \
> > > +							CPU_OFFSET(__cpu)))
> > > +
> > > +#define CPU_ALLOC(type, flags)	cpu_alloc(sizeof(type), flags, \
> > > +					__alignof__(type))
> > > +#define CPU_FREE(pointer)	cpu_free(pointer, sizeof(*(pointer)))
> > > +
> > > +#define THIS_CPU(__p)	CPU_PTR(__p, smp_processor_id())
> > > +#define __THIS_CPU(__p)	CPU_PTR(__p, raw_smp_processor_id())
> > > +
> > > +/*
> > > + * Raw calls
> > > + */
> > > +void *cpu_alloc(unsigned long size, gfp_t gfp, unsigned long align);
> > > +void cpu_free(void *cpu_pointer, unsigned long size);
> > > +
> > > +#endif /* _LINUX_CPU_ALLOC_H_ */
> > 
> > I don't like those shouting macros.
> 
> The convention for macros is to use upper case.

We have plenty of macros that look like regular functions. And as the
primary interface to this functionality these shouting things look really
out of place.


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-08 18:50       ` Christoph Lameter
@ 2007-11-08 20:19         ` Peter Zijlstra
  0 siblings, 0 replies; 78+ messages in thread
From: Peter Zijlstra @ 2007-11-08 20:19 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: akpm, linux-kernel


On Thu, 2007-11-08 at 10:50 -0800, Christoph Lameter wrote:
> On Thu, 8 Nov 2007, Christoph Lameter wrote:
> 
> > > Also, looking at the lock usage, this thing is not IRQ safe, so it
> > > should not be called from hardirq context. Please document this.
> 
> Well I went the other way and made it work like the slab allocators.

Nice.


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-08 20:19       ` Peter Zijlstra
@ 2007-11-08 20:24         ` Christoph Lameter
  2007-11-08 23:26           ` David Miller
  2007-11-08 23:26         ` David Miller
  1 sibling, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-08 20:24 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: akpm, linux-kernel

On Thu, 8 Nov 2007, Peter Zijlstra wrote:

> > The convention for macros is to use upper case.
> 
> We have plent macros that look like regular functions. And as a primary
> interface to this functionality these shouting things look really out of
> place.

One point of the patchset is to clean up the messy handling of the
allocpercpu interface, which uses lower case for macros. It is a bit
confusing that what looks like a function, alloc_percpu(), can take a type
argument. I think this needs to be uppercase for clarity.
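
To illustrate the confusion (a minimal made-up example; 'struct stats' is
not from the patchset):

	/* Looks like a function call, yet takes a type -- only possible
	 * because alloc_percpu() is secretly a macro. */
	struct stats *a = alloc_percpu(struct stats);

	/* The upper-case spelling makes the macro nature obvious. */
	struct stats *b = CPU_ALLOC(struct stats, GFP_KERNEL);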



^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-08 20:19       ` Peter Zijlstra
  2007-11-08 20:24         ` Christoph Lameter
@ 2007-11-08 23:26         ` David Miller
  1 sibling, 0 replies; 78+ messages in thread
From: David Miller @ 2007-11-08 23:26 UTC (permalink / raw)
  To: peterz; +Cc: clameter, akpm, linux-kernel

From: Peter Zijlstra <peterz@infradead.org>
Date: Thu, 08 Nov 2007 21:19:08 +0100

> 
> On Thu, 2007-11-08 at 10:31 -0800, Christoph Lameter wrote:
> > On Thu, 8 Nov 2007, Peter Zijlstra wrote:
> > 
> > > I don't like those shouting macros.
> > 
> > The convention for macros is to use upper case.
> 
> We have plenty of macros that look like regular functions. And as the
> primary interface to this functionality these shouting things look really
> out of place.

I disagree, macros in upper case make sense here.  Macros should SHOUT
at you because CPP is MAGIC and has side effects that normal real
functions do not have, and therefore you need to be REMINDED.

And, honestly, aren't there more important issues about his patches to
review than macro capitalization?

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-08 20:24         ` Christoph Lameter
@ 2007-11-08 23:26           ` David Miller
  0 siblings, 0 replies; 78+ messages in thread
From: David Miller @ 2007-11-08 23:26 UTC (permalink / raw)
  To: clameter; +Cc: peterz, akpm, linux-kernel

From: Christoph Lameter <clameter@sgi.com>
Date: Thu, 8 Nov 2007 12:24:22 -0800 (PST)

> On Thu, 8 Nov 2007, Peter Zijlstra wrote:
> 
> > > The convention for macros is to use upper case.
> > 
> > We have plenty of macros that look like regular functions. And as the
> > primary interface to this functionality these shouting things look really
> > out of place.
> 
> One point of the patchset is to clean up the messy handling of the
> allocpercpu interface, which uses lower case for macros. It is a bit
> confusing that what looks like a function, alloc_percpu(), can take a type
> argument. I think this needs to be uppercase for clarity.

Without a doubt.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-06 19:51 ` [patch 01/28] cpu alloc: The allocator Christoph Lameter
  2007-11-08 12:34   ` Peter Zijlstra
       [not found]   ` <1194522615.6289.136.camel@twins>
@ 2007-11-13 11:15   ` David Miller
  2007-11-13 21:40     ` Christoph Lameter
                       ` (2 more replies)
  2 siblings, 3 replies; 78+ messages in thread
From: David Miller @ 2007-11-13 11:15 UTC (permalink / raw)
  To: clameter; +Cc: akpm, linux-mm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Tue, 06 Nov 2007 11:51:45 -0800

> The core portion of the cpu allocator.
> 
> The per cpu allocator allows dynamic allocation of memory on all
> processor simultaneously. A bitmap is used to track used areas.
> The allocator implements tight packing to reduce the cache footprint
> and increase speed since cacheline contention is typically not a concern
> for memory mainly used by a single cpu. Small objects will fill up gaps
> left by larger allocations that required alignments.
> 
> Signed-off-by: Christoph Lameter <clameter@sgi.com>

Unfortunately, sparc64 fails to boot even with just this patch
applied.

The problem is that for the non-virtualized case this patch bloats up
the BSS section to be more than 8MB in size.  Sparc64 kernel images
cannot be more than 8MB in size total due to various boot loader and
firmware limitations.

I have NR_CPUS set to 64, but it can be up to 4096 on sparc64.

Yes, I could add virtualized area support to sparc64, but we cannot
impose this on every platform.

One thing you could do is simply use a vmalloc allocation in the
non-virtualized case.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-13 11:15   ` David Miller
@ 2007-11-13 21:40     ` Christoph Lameter
  2007-11-13 21:58       ` Eric Dumazet
  2007-11-14  1:30       ` David Miller
  2007-11-13 22:20     ` Christoph Lameter
  2007-11-14  1:06     ` Andi Kleen
  2 siblings, 2 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-13 21:40 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, linux-mm, linux-kernel, dada1, schwidefsky

On Tue, 13 Nov 2007, David Miller wrote:

> One thing you could do is simply use a vmalloc allocation in the
> non-virtualized case.

Yuck. Meaning to add more crappy code. The bss limitation to 8M is a bit
strange though. Do other platforms have the same issues?

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-13 21:40     ` Christoph Lameter
@ 2007-11-13 21:58       ` Eric Dumazet
  2007-11-13 22:00         ` Christoph Lameter
  2007-11-13 22:02         ` Christoph Lameter
  2007-11-14  1:30       ` David Miller
  1 sibling, 2 replies; 78+ messages in thread
From: Eric Dumazet @ 2007-11-13 21:58 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: David Miller, akpm, linux-mm, linux-kernel, schwidefsky

Christoph Lameter wrote:
> On Tue, 13 Nov 2007, David Miller wrote:
> 
>> One thing you could do is simply use a vmalloc allocation in the
>> non-virtualized case.
> 
> Yuck. Meaning to add more crappy code. The bss limitations to 8M is a bit 
> strange though. Do other platforms have the same issues? 

Maybe not so crappy, because even for i386 you might not use a strict
vmalloc() implementation but at least reserve percpu space inside the
vmalloc range (i.e. not use a dedicated area as your current patchset does).

This is because NR_CPUS defaults to 32 on i386 (with a limit of 256), so
reserving 256*256KB = 64 MB of virtual space might be too much (this is half
the typical vmalloc area).

The idea would be:

- Reserving an area of NR_CPUS*256KB inside vmalloc() space (but of course not 
allocating pages)

- Then, for each cpu that is not possible, 'release' its 256KB area and give
it back to the vmalloc free-area pool.

Once you add the needed helpers to mm/vmalloc.c, no need to use a BSS
megablob anymore?
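
A rough sketch of that idea follows; cpu_area_release_range() stands in
for one of those new mm/vmalloc.c helpers and does not exist today:

	#include <linux/vmalloc.h>
	#include <linux/cpumask.h>

	#define CPU_AREA_SIZE	(256UL * 1024)

	static void *cpu_area_base;

	static int __init cpu_area_reserve(void)
	{
		struct vm_struct *vm;
		int cpu;

		/* Reserve virtual space only; no pages are populated yet. */
		vm = get_vm_area(NR_CPUS * CPU_AREA_SIZE, VM_ALLOC);
		if (!vm)
			return -ENOMEM;
		cpu_area_base = vm->addr;

		/* Hand the slots of cpus that can never come online back to
		 * the vmalloc free-area pool (the hypothetical helper). */
		for (cpu = 0; cpu < NR_CPUS; cpu++)
			if (!cpu_possible(cpu))
				cpu_area_release_range(cpu_area_base +
						cpu * CPU_AREA_SIZE,
						CPU_AREA_SIZE);
		return 0;
	}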

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-13 21:58       ` Eric Dumazet
@ 2007-11-13 22:00         ` Christoph Lameter
  2007-11-14  1:33           ` David Miller
  2007-11-13 22:02         ` Christoph Lameter
  1 sibling, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-13 22:00 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, akpm, linux-mm, linux-kernel, schwidefsky

On Tue, 13 Nov 2007, Eric Dumazet wrote:

> Once you add the needed helpers to mm/vmalloc.c, no need to use a BSS
> megablob anymore?

Well I think all of this can be avoided by simply copying the existing 
vmemmap helper functions and providing a virtual address for sparc64.


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-13 21:58       ` Eric Dumazet
  2007-11-13 22:00         ` Christoph Lameter
@ 2007-11-13 22:02         ` Christoph Lameter
  1 sibling, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-13 22:02 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, akpm, linux-mm, linux-kernel, schwidefsky

On Tue, 13 Nov 2007, Eric Dumazet wrote:

> This is because NR_CPUS defaults to 32 on i386 (with a limit of 256), so
> reserving 256*256KB = 64 MB of virtual space might be too much (this is half
> the typical vmalloc area).

It defaults to 8 unless you use a NUMA system.

config NR_CPUS
        int "Maximum number of CPUs (2-255)"
        range 2 255
        depends on SMP
        default "32" if X86_NUMAQ || X86_SUMMIT || X86_BIGSMP || X86_ES7000
        default "8"


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-13 11:15   ` David Miller
  2007-11-13 21:40     ` Christoph Lameter
@ 2007-11-13 22:20     ` Christoph Lameter
  2007-11-14  1:36       ` David Miller
  2007-11-14  1:37       ` David Miller
  2007-11-14  1:06     ` Andi Kleen
  2 siblings, 2 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-13 22:20 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, linux-mm, linux-kernel, dada1, schwidefsky

On Tue, 13 Nov 2007, David Miller wrote:

> Yes, I could add virtualized area support to sparc64, but we cannot
> impose this on every platform.

Other platforms do not have the 8MB restriction nor do they have so many 
processors.

Here is a draft of a virtual cpu area implementation for sparc64. It uses
the VMEMMAP chunks:

---
 arch/sparc64/Kconfig          |   12 ++++++++++++
 arch/sparc64/mm/init.c        |   34 ++++++++++++++++++++++++++++++++++
 include/asm-sparc64/pgtable.h |    1 +
 3 files changed, 47 insertions(+)

Index: linux-2.6/arch/sparc64/mm/init.c
===================================================================
--- linux-2.6.orig/arch/sparc64/mm/init.c	2007-11-13 14:09:44.619500290 -0800
+++ linux-2.6/arch/sparc64/mm/init.c	2007-11-13 14:17:49.794860210 -0800
@@ -1697,6 +1697,40 @@ int __meminit vmemmap_populate(struct pa
 }
 #endif /* CONFIG_SPARSEMEM_VMEMMAP */
 
+int cpu_area_populate(void *start, unsigned long size, gfp_t flags, int node)
+{
+	unsigned long vstart = (unsigned long) start;
+	unsigned long vend = (unsigned long) (start + size);
+	unsigned long phys_start = (vstart - CPU_AREA_BASE);
+	unsigned long phys_end = (vend - CPU_AREA_BASE);
+	unsigned long addr = phys_start & VMEMMAP_CHUNK_MASK;
+	unsigned long end = VMEMMAP_ALIGN(phys_end);
+	unsigned long pte_base;
+
+	pte_base = (_PAGE_VALID | _PAGE_SZ4MB_4U |
+		    _PAGE_CP_4U | _PAGE_CV_4U |
+		    _PAGE_P_4U | _PAGE_W_4U);
+	if (tlb_type == hypervisor)
+		pte_base = (_PAGE_VALID | _PAGE_SZ4MB_4V |
+			    _PAGE_CP_4V | _PAGE_CV_4V |
+			    _PAGE_P_4V | _PAGE_W_4V);
+
+	for (; addr < end; addr += VMEMMAP_CHUNK) {
+		unsigned long *vmem_pp =
+			vmemmap_table + (addr >> VMEMMAP_CHUNK_SHIFT);
+		void *block;
+
+		if (!(*vmem_pp & _PAGE_VALID)) {
+			block = vmemmap_alloc_block(1UL << 22, flags, node);
+			if (!block)
+				return -ENOMEM;
+
+			*vmem_pp = pte_base | __pa(block);
+		}
+	}
+	return 0;
+}
+
 static void prot_init_common(unsigned long page_none,
 			     unsigned long page_shared,
 			     unsigned long page_copy,
Index: linux-2.6/arch/sparc64/Kconfig
===================================================================
--- linux-2.6.orig/arch/sparc64/Kconfig	2007-11-13 14:12:28.895750307 -0800
+++ linux-2.6/arch/sparc64/Kconfig	2007-11-13 14:18:32.110750142 -0800
@@ -76,6 +76,18 @@ config GENERIC_HARDIRQS_NO__DO_IRQ
 	bool
 	def_bool y
 
+config CPU_AREA_VIRTUAL
+	bool
+	def_bool y
+
+config CPU_AREA_ORDER
+	int
+	default "10"
+
+config CPU_AREA_ALLOC_ORDER
+	int
+	default "0"
+
 choice
 	prompt "Kernel page size"
 	default SPARC64_PAGE_SIZE_8KB
Index: linux-2.6/include/asm-sparc64/pgtable.h
===================================================================
--- linux-2.6.orig/include/asm-sparc64/pgtable.h	2007-11-13 14:11:51.871000018 -0800
+++ linux-2.6/include/asm-sparc64/pgtable.h	2007-11-13 14:12:21.011750241 -0800
@@ -43,6 +43,7 @@
 #define VMALLOC_START		_AC(0x0000000100000000,UL)
 #define VMALLOC_END		_AC(0x0000000200000000,UL)
 #define VMEMMAP_BASE		_AC(0x0000000200000000,UL)
+#define CPU_AREA_BASE		_AC(0x0000000300000000,UL)
 
 #define vmemmap			((struct page *)VMEMMAP_BASE)
 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-13 11:15   ` David Miller
  2007-11-13 21:40     ` Christoph Lameter
  2007-11-13 22:20     ` Christoph Lameter
@ 2007-11-14  1:06     ` Andi Kleen
  2007-11-14  1:52       ` David Miller
  2007-11-14  4:15       ` [patch 01/28] cpu alloc: The allocator Christoph Lameter
  2 siblings, 2 replies; 78+ messages in thread
From: Andi Kleen @ 2007-11-14  1:06 UTC (permalink / raw)
  To: David Miller; +Cc: clameter, akpm, linux-mm, linux-kernel, dada1, schwidefsky

David Miller <davem@davemloft.net> writes:
>
> The problem is that for the non-virtualized case this patch bloats up
> the BSS section to be more than 8MB in size.  Sparc64 kernel images
> cannot be more than 8MB in size total due to various boot loader and
> firmware limitations.

I recently ran into a similar problem with x86-64, where the large BSS from
lockdep conflicted with a 16MB kdump kernel. The solution was to add
another early allocator that runs before bootmem and then move the tables
in there.
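
The shape of such an allocator, very roughly (arch_early_alloc() is the
name Andi mentions later in the thread; the body here is invented):

	#include <linux/kernel.h>
	#include <linux/init.h>

	extern char _end[];	/* end of the kernel image */

	static unsigned long early_next __initdata;

	/* Bump allocator carving memory from directly after the kernel
	 * image, usable before bootmem is up. Ranges handed out must be
	 * reserved with bootmem later. */
	void *__init arch_early_alloc(unsigned long size)
	{
		void *p;

		if (!early_next)
			early_next = (unsigned long)_end;
		p = (void *)ALIGN(early_next, 16);
		early_next = (unsigned long)p + size;
		return p;
	}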

-Andi


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-13 21:40     ` Christoph Lameter
  2007-11-13 21:58       ` Eric Dumazet
@ 2007-11-14  1:30       ` David Miller
  2007-11-14  1:48         ` Christoph Lameter
  1 sibling, 1 reply; 78+ messages in thread
From: David Miller @ 2007-11-14  1:30 UTC (permalink / raw)
  To: clameter; +Cc: akpm, linux-mm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Tue, 13 Nov 2007 13:40:19 -0800 (PST)

> On Tue, 13 Nov 2007, David Miller wrote:
> 
> > One thing you could do is simply use a vmalloc allocation in the
> > non-virtualized case.
> 
> Yuck. Meaning to add more crappy code. The bss limitation to 8M is a bit
> strange though. Do other platforms have the same issues?

sparc32 has the same limit.

I'm surprised this is your reaction instead of "oh damn, sorry
I bloated up the kernel image size by 8mb, I'll find a way to
fix that."

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-13 22:00         ` Christoph Lameter
@ 2007-11-14  1:33           ` David Miller
  0 siblings, 0 replies; 78+ messages in thread
From: David Miller @ 2007-11-14  1:33 UTC (permalink / raw)
  To: clameter; +Cc: dada1, akpm, linux-mm, linux-kernel, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Tue, 13 Nov 2007 14:00:45 -0800 (PST)

> On Tue, 13 Nov 2007, Eric Dumazet wrote:
> 
> > Once you add in mm/vmalloc.c all needed helpers, no need to use BSS Megablob
> > anymore ?
> 
> Well I think all of this can be avoided by simply copying the existing 
> vmemmap helper functions and providing a virtual address for sparc64.

I intend to do that in the end, but you miss my point.
Requiring this is unreasonable.

And nobody is going to do the virt stuff for platforms like sparc32.
And I do mean nobody.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-13 22:20     ` Christoph Lameter
@ 2007-11-14  1:36       ` David Miller
  2007-11-14  1:37       ` David Miller
  1 sibling, 0 replies; 78+ messages in thread
From: David Miller @ 2007-11-14  1:36 UTC (permalink / raw)
  To: clameter; +Cc: akpm, linux-mm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Tue, 13 Nov 2007 14:20:02 -0800 (PST)

> On Tue, 13 Nov 2007, David Miller wrote:
> 
> > Yes, I could add virtualized area support to sparc64, but we cannot
> > impose this on every platform.
> 
> Other platforms do not have the 8MB restriction nor do they have so many 
> processors.

sparc32 has the same limitations, and nobody is going to implement
the virt stuff there.

> Here is the draft of a virtual cpu area implementation for sparc64. Uses 
> the VMEMMAP chunks:

This doesn't avoid the core problem.  Bloating up the BSS like
that is bad, and enforcing a virt implementation to avoid that
is an anti-social way to go about implementing this feature.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-13 22:20     ` Christoph Lameter
  2007-11-14  1:36       ` David Miller
@ 2007-11-14  1:37       ` David Miller
  2007-11-14  1:50         ` Christoph Lameter
  1 sibling, 1 reply; 78+ messages in thread
From: David Miller @ 2007-11-14  1:37 UTC (permalink / raw)
  To: clameter; +Cc: akpm, linux-mm, linux-kernel, dada1, schwidefsky


BTW, I'm going to stop testing your patches on sparc64 for
a while until you start to make me feel like you understand
that ignoring the BSS bloat issue is bad.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  1:30       ` David Miller
@ 2007-11-14  1:48         ` Christoph Lameter
  0 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-14  1:48 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, linux-mm, linux-kernel, dada1, schwidefsky

On Tue, 13 Nov 2007, David Miller wrote:

> I'm surprised this is your reaction instead of "oh damn, sorry
> I bloated up the kernel image size by 8mb, I'll find a way to
> fix that."

Well I found a way to fix that, and it's in the patch...


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  1:37       ` David Miller
@ 2007-11-14  1:50         ` Christoph Lameter
  2007-11-14  2:00           ` David Miller
  0 siblings, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-14  1:50 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, linux-mm, linux-kernel, dada1, schwidefsky

On Tue, 13 Nov 2007, David Miller wrote:

> BTW, I'm going to stop testing your patches on sparc64 for
> a while until you start to make me feel like you understand
> that ignoring the BSS bloat issue is bad.

Well this is just the fallback. How can I avoid this and still keep a
constant base address? Add a new segment to vmlinux.lds.S?
 

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  1:06     ` Andi Kleen
@ 2007-11-14  1:52       ` David Miller
  2007-11-14  1:57         ` Christoph Lameter
  2007-11-14  2:28         ` Andi Kleen
  2007-11-14  4:15       ` [patch 01/28] cpu alloc: The allocator Christoph Lameter
  1 sibling, 2 replies; 78+ messages in thread
From: David Miller @ 2007-11-14  1:52 UTC (permalink / raw)
  To: andi; +Cc: clameter, akpm, linux-mm, linux-kernel, dada1, schwidefsky

From: Andi Kleen <andi@firstfloor.org>
Date: Wed, 14 Nov 2007 02:06:28 +0100

> David Miller <davem@davemloft.net> writes:
> >
> > The problem is that for the non-virtualized case this patch bloats up
> > the BSS section to be more than 8MB in size.  Sparc64 kernel images
> > cannot be more than 8MB in size total due to various boot loader and
> > firmware limitations.
> 
> I recently ran into a similar problem with x86-64, where the large BSS from
> lockdep conflicted with a 16MB kdump kernel. The solution was to add
> another early allocator that runs before bootmem and then move the tables
> in there.

Yes, I've run into similar problems with lockdep as well.
I had to build an ultra minimalized kernel to get it to
boot on my Niagara boxes.

I think I even looked at the same lockdep code, and I'd
appreciate it if you'd submit your fix for this if you
haven't already.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  1:52       ` David Miller
@ 2007-11-14  1:57         ` Christoph Lameter
  2007-11-14  2:01           ` David Miller
  2007-11-14  2:28         ` Andi Kleen
  1 sibling, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-14  1:57 UTC (permalink / raw)
  To: David Miller; +Cc: andi, akpm, linux-mm, linux-kernel, dada1, schwidefsky

On Tue, 13 Nov 2007, David Miller wrote:

> Yes, I've run into similar problems with lockdep as well.
> I had to build an ultra minimalized kernel to get it to
> boot on my Niagara boxes.

Hmmmm. cpu_alloc really does not need zeroed data. Just an address fixed 
by the compiler where stuff can be put. Can the loader do that somehow?



^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  1:50         ` Christoph Lameter
@ 2007-11-14  2:00           ` David Miller
  2007-11-14  2:05             ` Christoph Lameter
  0 siblings, 1 reply; 78+ messages in thread
From: David Miller @ 2007-11-14  2:00 UTC (permalink / raw)
  To: clameter; +Cc: akpm, linux-mm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Tue, 13 Nov 2007 17:50:24 -0800 (PST)

> On Tue, 13 Nov 2007, David Miller wrote:
> 
> > BTW, I'm going to stop testing your patches on sparc64 for
> > a while until you start to make me feel like you understand
> > that ignoring the BSS bloat issue is bad.
> 
> Well this is just the fallback. How can I avoid this and still keep a
> constant base address? Add a new segment to vmlinux.lds.S?

I'm not so sure.

The idea about doling out vmalloc space seemed the most promising.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  1:57         ` Christoph Lameter
@ 2007-11-14  2:01           ` David Miller
  2007-11-14  2:03             ` Christoph Lameter
  0 siblings, 1 reply; 78+ messages in thread
From: David Miller @ 2007-11-14  2:01 UTC (permalink / raw)
  To: clameter; +Cc: andi, akpm, linux-mm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Tue, 13 Nov 2007 17:57:15 -0800 (PST)

> Hmmmm. cpu_alloc really does not need zeroed data. Just an address
> fixed by the compiler where stuff can be put. Can the loader do that
> somehow?

Yes, and I think IA64 uses such a scheme for its 64KB fixed
per-cpu TLB mapping thing, doesn't it?

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  2:01           ` David Miller
@ 2007-11-14  2:03             ` Christoph Lameter
  0 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-14  2:03 UTC (permalink / raw)
  To: David Miller; +Cc: andi, akpm, linux-mm, linux-kernel, dada1, schwidefsky

On Tue, 13 Nov 2007, David Miller wrote:

> From: Christoph Lameter <clameter@sgi.com>
> Date: Tue, 13 Nov 2007 17:57:15 -0800 (PST)
> 
> > Hmmmm. cpu_alloc really does not need zeroed data. Just an address
> > fixed by the compiler where stuff can be put. Can the loader do that
> > somehow?
> 
> Yes, and I think IA64 uses such a scheme for its 64KB fixed
> per-cpu TLB mapping thing, doesn't it?

The per cpu TLB mapping is virtually mapped. The real memory allocation 
behind it occurs dynamically from bootmem.


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  2:00           ` David Miller
@ 2007-11-14  2:05             ` Christoph Lameter
  0 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-14  2:05 UTC (permalink / raw)
  To: David Miller; +Cc: akpm, linux-mm, linux-kernel, dada1, schwidefsky

On Tue, 13 Nov 2007, David Miller wrote:

> The idea about doling out vmalloc space seemed the most promising.

Well that is basically the same as the virtual mode. Just ditch the
fallback mode? Using vmalloc directly does not guarantee a fixed address.


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  1:52       ` David Miller
  2007-11-14  1:57         ` Christoph Lameter
@ 2007-11-14  2:28         ` Andi Kleen
  2007-11-14  3:48           ` David Miller
                             ` (2 more replies)
  1 sibling, 3 replies; 78+ messages in thread
From: Andi Kleen @ 2007-11-14  2:28 UTC (permalink / raw)
  To: David Miller
  Cc: andi, clameter, akpm, linux-mm, linux-kernel, dada1, schwidefsky

On Tue, Nov 13, 2007 at 05:52:08PM -0800, David Miller wrote:
> Yes, I've run into similar problems with lockdep as well.
> I had to build an ultra minimalized kernel to get it to
> boot on my Niagara boxes.
> 
> I think I even looked at the same lockdep code, and I'd
> appreciate it if you'd submit your fix for this if you
> haven't already.

ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/early-reserve
ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/early-alloc 
ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/lockdep-early-alloc

I didn't plan to submit it for .24, just .25. Or do you need it 
urgently?

Also it would require you to write a sparc specific arch_early_alloc()
of course.  I've only done the x86-64 version.

-Andi

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  2:28         ` Andi Kleen
@ 2007-11-14  3:48           ` David Miller
  2007-11-14  3:49           ` Christoph Lameter
  2007-11-16 10:23           ` large lockdep bss (was: Re: [patch 01/28] cpu alloc: The allocator) Peter Zijlstra
  2 siblings, 0 replies; 78+ messages in thread
From: David Miller @ 2007-11-14  3:48 UTC (permalink / raw)
  To: andi; +Cc: clameter, akpm, linux-mm, linux-kernel, dada1, schwidefsky

From: Andi Kleen <andi@firstfloor.org>
Date: Wed, 14 Nov 2007 03:28:32 +0100

> ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/early-reserve
> ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/early-alloc 
> ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/lockdep-early-alloc
> 
> I didn't plan to submit it for .24, just .25. Or do you need it 
> urgently?

I don't need it urgently, 2.6.25 is perfectly fine.

> Also it would require you to write a sparc specific arch_early_alloc()
> of course.  I've only done the x86-64 version.

I'll be sure to take care of that when it hits .25

Thanks Andi.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  2:28         ` Andi Kleen
  2007-11-14  3:48           ` David Miller
@ 2007-11-14  3:49           ` Christoph Lameter
  2007-11-16 10:23           ` large lockdep bss (was: Re: [patch 01/28] cpu alloc: The allocator) Peter Zijlstra
  2 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-14  3:49 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David Miller, akpm, linux-mm, linux-kernel, dada1, schwidefsky

On Wed, 14 Nov 2007, Andi Kleen wrote:

> On Tue, Nov 13, 2007 at 05:52:08PM -0800, David Miller wrote:
> > Yes, I've run into similar problems with lockdep as well.
> > I had to build an ultra minimalized kernel to get it to
> > boot on my Niagara boxes.
> > 
> > I think I even looked at the same lockdep code, and I'd
> > appreciate it if you'd submit your fix for this if you
> > haven't already.
> 
> ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/early-reserve
> ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/early-alloc 
> ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/lockdep-early-alloc
> 
> I didn't plan to submit it for .24, just .25. Or do you need it 
> urgently?
> 
> Also it would require you to write a sparc specific arch_early_alloc()
> of course.  I've only done the x86-64 version.

IA64 also has no lockdep. What arches support lockdep?


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  1:06     ` Andi Kleen
  2007-11-14  1:52       ` David Miller
@ 2007-11-14  4:15       ` Christoph Lameter
  2007-11-14  4:18         ` David Miller
  1 sibling, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-14  4:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David Miller, akpm, linux-mm, linux-kernel, dada1, schwidefsky

Hmmm.. I have v2 in preparation here that puts the pda and the per cpu 
data into the cpu_alloc area. Thus gs: can be used to access all per cpu 
data.

Any ideas how to abstract out the pda operations? Wasn't local_t supposed
to be able to do atomic ops on cpu data? Is there a segment register
version of local_t? I also want cmpxchg, xchg etc. that are all
atomic without requiring any interrupt disable or preempt disable.

cpu_alloc allows pointer arithmetic on cpu area pointers. The segment 
prefix can then be used to select the appropriate area.
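
Presumably something along these lines (an invented x86_64 sketch of a
segment-prefixed add; this is not code from the patchset):

	/*
	 * With the base of this cpu's area in the gs segment register,
	 * a read-modify-write needs neither preempt_disable nor a pointer
	 * lookup: the single instruction always hits the local cpu's copy.
	 */
	#define CPU_ADD(var, val)					\
		asm("addq %1, %%gs:%0"					\
		    : "+m" (var)					\
		    : "er" ((unsigned long)(val)))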

Guess I also need to add an arch configuration guide to V2 so that
the other arches can do similar tricks, and emphasize that the static
default that requires bss is only suitable for small systems.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  4:15       ` [patch 01/28] cpu alloc: The allocator Christoph Lameter
@ 2007-11-14  4:18         ` David Miller
  2007-11-14  4:21           ` David Miller
  2007-11-14  4:26           ` Christoph Lameter
  0 siblings, 2 replies; 78+ messages in thread
From: David Miller @ 2007-11-14  4:18 UTC (permalink / raw)
  To: clameter; +Cc: andi, akpm, linux-mm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Tue, 13 Nov 2007 20:15:55 -0800 (PST)

> Guess I also need to add an arch configuration guide to V2 so that
> the other arches can do similar tricks, and emphasize that the static
> default that requires bss is only suitable for small systems.

I'm going to be against your changes until you implement
a real fix for the BSS bloat problems.

It's worse than the per-cpu allocator we have now, much
worse.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  4:18         ` David Miller
@ 2007-11-14  4:21           ` David Miller
  2007-11-14  4:26           ` Christoph Lameter
  1 sibling, 0 replies; 78+ messages in thread
From: David Miller @ 2007-11-14  4:21 UTC (permalink / raw)
  To: clameter; +Cc: andi, akpm, linux-kernel, dada1, schwidefsky


BTW linux-mm@vger.kernel.org does not exist; please remove
it from the CC: in the future :-)

Thanks.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  4:18         ` David Miller
  2007-11-14  4:21           ` David Miller
@ 2007-11-14  4:26           ` Christoph Lameter
  2007-11-14  5:53             ` David Miller
  1 sibling, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-14  4:26 UTC (permalink / raw)
  To: David Miller; +Cc: andi, akpm, linux-kernel, dada1, schwidefsky


On Tue, 13 Nov 2007, David Miller wrote:

> From: Christoph Lameter <clameter@sgi.com>
> Date: Tue, 13 Nov 2007 20:15:55 -0800 (PST)
> 
> > Guess I also need to add an arch configuration guide to V2 so that
> > the other arches can do similar tricks, and emphasize that the static
> > default that requires bss is only suitable for small systems.
> 
> I'm going to be against your changes until you implement
> a real fix for the BSS bloat problems.
> 
> It's worse than the per-cpu allocator we have now, much
> worse.

You need to configure cpu_alloc for your arch, and so far you do not seem
to have had the time to understand how it works; otherwise you would
not repeat these statements and ask me to implement what the
patch already provides.

The only problem that I see so far is a communication problem. Thus we 
need more documentation.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  4:26           ` Christoph Lameter
@ 2007-11-14  5:53             ` David Miller
  2007-11-15 18:49               ` Christoph Lameter
  0 siblings, 1 reply; 78+ messages in thread
From: David Miller @ 2007-11-14  5:53 UTC (permalink / raw)
  To: clameter; +Cc: andi, akpm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Tue, 13 Nov 2007 20:26:33 -0800 (PST)

> The only problem that I see so far is a communication problem. Thus
> we need more documentation.

Fair enough, I'll look more closely at the next rev of
your patch set.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-14  5:53             ` David Miller
@ 2007-11-15 18:49               ` Christoph Lameter
  2007-11-15 22:03                 ` David Miller
  0 siblings, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-15 18:49 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: andi, David Miller, akpm, linux-kernel, dada1, schwidefsky

Well there is an LWN article now that also claims that the cpu_alloc 
patchset requires a large bss space. Sigh. See

http://lwn.net/Articles/257828/

Not true! 44 bytes is reasonable.

christoph@stapp:~/linux-2.6$ size mm/cpu_alloc.o
   text    data     bss     dec     hex filename
   5625      36      44    5705    1649 mm/cpu_alloc.o

I need to split the virtualization out into a separate patch in V2 to
make clear that it is there.



^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-15 18:49               ` Christoph Lameter
@ 2007-11-15 22:03                 ` David Miller
  2007-11-16  2:19                   ` Christoph Lameter
  0 siblings, 1 reply; 78+ messages in thread
From: David Miller @ 2007-11-15 22:03 UTC (permalink / raw)
  To: clameter; +Cc: corbet, andi, akpm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Thu, 15 Nov 2007 10:49:37 -0800 (PST)

> Well there is an LWN article now that also claims that the cpu_alloc 
> patchset requires a large bss space. Sigh. See
> 
> http://lwn.net/Articles/257828/
> 
> Not true! 44 bytes is reasonable.

Well, the first version of the patch set, the one I tested, did
require a lot of BSS space.  And that's the one they are writing
about.

I don't see how you can even remotely claim that LWN's reporting is
inaccurate here.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-15 22:03                 ` David Miller
@ 2007-11-16  2:19                   ` Christoph Lameter
  2007-11-16  2:50                     ` David Miller
  0 siblings, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-16  2:19 UTC (permalink / raw)
  To: David Miller; +Cc: corbet, andi, akpm, linux-kernel, dada1, schwidefsky

On Thu, 15 Nov 2007, David Miller wrote:

> Well, the first version of the patch set, the one I tested, did
> require a lot of BSS space.  And that's the one they are writing
> about.

I am running the same version that you ran. The problem is that you
did not configure the stuff properly for your box, and I did not include a
configuration for sparc64 since I did not know how it needed to be
configured there. You ignored the patch for sparc64 that I provided
to correct the problem.





^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-16  2:19                   ` Christoph Lameter
@ 2007-11-16  2:50                     ` David Miller
  2007-11-16  2:55                       ` Christoph Lameter
  0 siblings, 1 reply; 78+ messages in thread
From: David Miller @ 2007-11-16  2:50 UTC (permalink / raw)
  To: clameter; +Cc: corbet, andi, akpm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Thu, 15 Nov 2007 18:19:37 -0800 (PST)

> On Thu, 15 Nov 2007, David Miller wrote:
> 
> > Well, the first version of the patch set, the one I tested, did
> > require a lot of BSS space.  And that's the one they are writing
> > about.
> 
> I am running the same version that you ran. The problem is that you
> did not configure the stuff properly for your box, and I did not include a
> configuration for sparc64 since I did not know how it needed to be
> configured there. You ignored the patch for sparc64 that I provided
> to correct the problem.

If you're talking about the VMEMMAP thing, that patch didn't remove
the problem; it simply added optimizations for sparc64 so that you could
sweep the problem under the rug.

Sparc32 is still broken, as just one of several possible examples.
The BSS usage is still there for platforms that don't use VMEMMAP.

So again, the lwn.net report is accurate.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-16  2:50                     ` David Miller
@ 2007-11-16  2:55                       ` Christoph Lameter
  2007-11-16  2:58                         ` David Miller
  0 siblings, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-16  2:55 UTC (permalink / raw)
  To: David Miller; +Cc: corbet, andi, akpm, linux-kernel, dada1, schwidefsky

On Thu, 15 Nov 2007, David Miller wrote:

> If you're talking about the VMEMMAP thing, that patch didn't remove
> the problem; it simply added optimizations for sparc64 so that you could
> sweep the problem under the rug.

The virtual mapping of the cpu areas is used by the patch I 
posted for i386, ia64 and x86_64. All the ones that I have access to here.

> Sparc32 is still broken, as just one of several possible examples.

I have not looked at sparc32, sorry. If you simply set up a couple of
configuration values in arch/sparc32/Kconfig then everything will be fine.

> The BSS usage is still there for platforms that don't use VMEMMAP.

All MMU platforms can use the virtual mappings. The main use of the static 
configuration is for embedded systems.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-16  2:55                       ` Christoph Lameter
@ 2007-11-16  2:58                         ` David Miller
  2007-11-16  3:10                           ` Christoph Lameter
  0 siblings, 1 reply; 78+ messages in thread
From: David Miller @ 2007-11-16  2:58 UTC (permalink / raw)
  To: clameter; +Cc: corbet, andi, akpm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Thu, 15 Nov 2007 18:55:21 -0800 (PST)

> On Thu, 15 Nov 2007, David Miller wrote:
> 
> > Sparc32 is still broken, as just one of several possible examples.
> 
> I have not looked at sparc32, sorry. If you simply set up a couple of
> configuration values in arch/sparc32/Kconfig then everything will be fine.

There is assembler code to write, which as I stated several
times nobody is going to work on or test.

It is unreasonable to add this new VMEMMAP requirement in order for
your patches to work properly.

> > The BSS usage is still there for platforms that don't use VMEMMAP.
> 
> All MMU platforms can use the virtual mappings. The main use of the static 
> configuration is for embedded systems.

Someone has to implement and test VMEMMAP now on all of these
architectures; it is becoming a requirement, unlike in the sparsemem
case, where it was optional.

That's unreasonable.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-16  2:58                         ` David Miller
@ 2007-11-16  3:10                           ` Christoph Lameter
  2007-11-16  3:17                             ` David Miller
  0 siblings, 1 reply; 78+ messages in thread
From: Christoph Lameter @ 2007-11-16  3:10 UTC (permalink / raw)
  To: David Miller; +Cc: corbet, andi, akpm, linux-kernel, dada1, schwidefsky

On Thu, 15 Nov 2007, David Miller wrote:

> > I have not looked at sparc32, sorry. If you simply set up a couple of 
> > configuration values in arch/sparc32/Kconfig then everything will be fine.
> 
> There is assembler code to write, which as I stated several
> times nobody is going to work on or test.

There is no assembly code required. I overdid it in the patch that I sent
you by trying to make sparc64 use large mappings like the x86_64 NUMA code
does. You really do not need that. Look at the IA64 and i386 configurations:
there is no C code required either. The x86_64 code only adds some special C
code for the NUMA case.
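
To make that concrete: with a fixed virtual base and a fixed size reserved
per cpu, the lookup is plain C pointer arithmetic, which is why no
architecture assembly is needed. The constants and the helper below are
illustrative assumptions for this sketch, not the patchset's actual
interface:

/* Illustrative only; the base address and area size are made up. */
#define CPU_AREA_BASE	0xffffe20000000000UL	/* assumed virtual base */
#define CPU_AREA_SIZE	(1UL << 20)		/* assumed size per cpu */

/* Return cpu's instance of a per cpu object allocated at offset off. */
static inline void *cpu_area_ptr(unsigned long off, int cpu)
{
	return (void *)(CPU_AREA_BASE +
			(unsigned long)cpu * CPU_AREA_SIZE + off);
}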

> > All MMU platforms can use the virtual mappings. The main use of the static 
> > configuration is for embedded systems.
> 
> Someone has to implement and test VMEMMAP now on all of these
> architectures; it is becoming a requirement, unlike in the sparsemem
> patches case, where it was optional.
> 
> That's unreasonable.

VMEMMAP is something different from the cpu allocator. All MMU platforms
have vmalloc support, and you even suggested the use of vmalloc.
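
As a rough sketch of that vmalloc-style approach, the fixed virtual range
could be backed with real pages only for cpus that actually exist. The
map_page_at() helper is a hypothetical stand-in for the arch page table
plumbing, and CPU_AREA_BASE/CPU_AREA_SIZE are the made-up constants from
the earlier sketch; none of this is the patchset's actual code:

#include <linux/gfp.h>
#include <linux/mm.h>

/* Hypothetical: install pages behind one cpu's slice of the virtual area. */
static int cpu_area_back(int cpu)
{
	unsigned long start = CPU_AREA_BASE + cpu * CPU_AREA_SIZE;
	unsigned long addr;

	for (addr = start; addr < start + CPU_AREA_SIZE; addr += PAGE_SIZE) {
		struct page *page = alloc_page(GFP_KERNEL | __GFP_ZERO);

		if (!page)
			return -ENOMEM;
		/* map_page_at() is assumed here; vmalloc() does the
		 * equivalent page table work internally. */
		if (map_page_at(addr, page))
			return -ENOMEM;
	}
	return 0;
}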

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-16  3:10                           ` Christoph Lameter
@ 2007-11-16  3:17                             ` David Miller
  2007-11-16  3:19                               ` Christoph Lameter
  0 siblings, 1 reply; 78+ messages in thread
From: David Miller @ 2007-11-16  3:17 UTC (permalink / raw)
  To: clameter; +Cc: corbet, andi, akpm, linux-kernel, dada1, schwidefsky

From: Christoph Lameter <clameter@sgi.com>
Date: Thu, 15 Nov 2007 19:10:15 -0800 (PST)

> On Thu, 15 Nov 2007, David Miller wrote:
> 
> > > I have not looked at sparc32, sorry. If you simply set up a couple of
> > > configuration values in arch/sparc32/Kconfig, then everything will be fine.
> > 
> > There is assembler code to write which, as I stated several
> > times, nobody is going to work on or test.
> 
> There is no assembly code required. I overdid it in the patch that I sent
> you by trying to make sparc64 use large mappings like the x86_64 NUMA code
> does. You really do not need that. Look at the IA64 and i386 configurations:
> there is no C code required either. The x86_64 code only adds some special C
> code for the NUMA case.

Ok, and like I said last time, I'll examine this more closely
when you spin your next version of these patches.

Please post them soon as I'm eager to test this stuff out for
you.

Thanks.

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [patch 01/28] cpu alloc: The allocator
  2007-11-16  3:17                             ` David Miller
@ 2007-11-16  3:19                               ` Christoph Lameter
  0 siblings, 0 replies; 78+ messages in thread
From: Christoph Lameter @ 2007-11-16  3:19 UTC (permalink / raw)
  To: David Miller; +Cc: corbet, andi, akpm, linux-kernel, dada1, schwidefsky

On Thu, 15 Nov 2007, David Miller wrote:

> > There is no assembly code required. I overdid it in the patch that I sent
> > you by trying to make sparc64 use large mappings like the x86_64 NUMA code
> > does. You really do not need that. Look at the IA64 and i386 configurations:
> > there is no C code required either. The x86_64 code only adds some special C
> > code for the NUMA case.
> 
> Ok, and like I said last time, I'll examine this more closely
> when you spin your next version of these patches.

All the above statements are about the version of the patch that I 
thought you had a look at.

> Please post them soon as I'm eager to test this stuff out for
> you.

Ok, thanks. (You could test it with the current version... just edit
Kconfig, try a few things, and I could include the settings in the next
release.)

^ permalink raw reply	[flat|nested] 78+ messages in thread

* large lockdep bss (was: Re: [patch 01/28] cpu alloc: The allocator)
  2007-11-14  2:28         ` Andi Kleen
  2007-11-14  3:48           ` David Miller
  2007-11-14  3:49           ` Christoph Lameter
@ 2007-11-16 10:23           ` Peter Zijlstra
  2007-11-16 11:44             ` Andi Kleen
  2 siblings, 1 reply; 78+ messages in thread
From: Peter Zijlstra @ 2007-11-16 10:23 UTC (permalink / raw)
  To: Andi Kleen
  Cc: David Miller, clameter, akpm, linux-mm, linux-kernel, dada1,
	schwidefsky, Ingo Molnar


On Wed, 2007-11-14 at 03:28 +0100, Andi Kleen wrote:
> On Tue, Nov 13, 2007 at 05:52:08PM -0800, David Miller wrote:
> > Yes, I've run into similar problems with lockdep as well.
> > I had to build an ultra minimalized kernel to get it to
> > boot on my Niagara boxes.
> > 
> > I think I even looked at the same lockdep code, and I'd
> > appreciate it if you'd submit your fix for this if you
> > haven't already.
> 
> ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/early-reserve
> ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/early-alloc 
> ftp://firstfloor.org/pub/ak/x86_64/quilt/patches/lockdep-early-alloc
> 
> I didn't plan to submit it for .24, just .25. Or do you need it 
> urgently?
> 
> Also it would require you to write a sparc-specific arch_early_alloc(),
> of course. I've only done the x86-64 version.

Would've been nice to have heard about this lockdep problem. Anyway,
thanks for tackling it.

How about moving this bit:

+#ifndef ARCH_HAS_EARLY_ALLOC
+#define LARGEVAR(x,y) { static typeof(*x) __ ## x[y];  x = __ ## x; }
+#else
+#define LARGEVAR(x,y) x = arch_early_alloc(sizeof(*x) * y)
+#endif

out of the lockdep code and into the generic early alloc code?
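
For context, here is a minimal sketch of how the macro above is presumably
meant to be used, together with a plausible bootmem-based
arch_early_alloc(). The structure, the array size and the allocator body
are assumptions for illustration, not the actual lockdep arrays or Andi's
implementation:

#include <linux/bootmem.h>

struct large_entry {
	unsigned long key;
};

static struct large_entry *large_entries;	/* pointer, not a BSS array */

static void __init init_large_arrays(void)
{
	/* Without ARCH_HAS_EARLY_ALLOC this still lands in BSS via the
	 * hidden static array; with it, the memory comes from the early
	 * allocator and the kernel image stays small. */
	LARGEVAR(large_entries, 16384);
}

#ifdef ARCH_HAS_EARLY_ALLOC
/* One plausible implementation on top of the bootmem allocator. */
void *arch_early_alloc(unsigned long size)
{
	return alloc_bootmem(size);
}
#endif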


^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: large lockdep bss (was: Re: [patch 01/28] cpu alloc: The allocator)
  2007-11-16 10:23           ` large lockdep bss (was: Re: [patch 01/28] cpu alloc: The allocator) Peter Zijlstra
@ 2007-11-16 11:44             ` Andi Kleen
  0 siblings, 0 replies; 78+ messages in thread
From: Andi Kleen @ 2007-11-16 11:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andi Kleen, David Miller, clameter, akpm, linux-mm, linux-kernel,
	dada1, schwidefsky, Ingo Molnar

> How about moving this bit:
> 
> +#ifndef ARCH_HAS_EARLY_ALLOC
> +#define LARGEVAR(x,y) { static typeof(*x) __ ## x[y];  x = __ ## x; }
> +#else
> +#define LARGEVAR(x,y) x = arch_early_alloc(sizeof(*x) * y)
> +#endif
> 
> out of the lockdep code and into the generic early alloc code?

Will do.

-Andi

^ permalink raw reply	[flat|nested] 78+ messages in thread

end of thread, other threads:[~2007-11-16 11:45 UTC | newest]

Thread overview: 78+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-11-06 19:51 [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Christoph Lameter
2007-11-06 19:51 ` [patch 01/28] cpu alloc: The allocator Christoph Lameter
2007-11-08 12:34   ` Peter Zijlstra
2007-11-08 12:37     ` Peter Zijlstra
2007-11-08 18:33     ` Christoph Lameter
2007-11-08 18:50       ` Christoph Lameter
2007-11-08 20:19         ` Peter Zijlstra
     [not found]   ` <1194522615.6289.136.camel@twins>
     [not found]     ` <Pine.LNX.4.64.0711081030380.7871@schroedinger.engr.sgi.com>
2007-11-08 20:19       ` Peter Zijlstra
2007-11-08 20:24         ` Christoph Lameter
2007-11-08 23:26           ` David Miller
2007-11-08 23:26         ` David Miller
2007-11-13 11:15   ` David Miller
2007-11-13 21:40     ` Christoph Lameter
2007-11-13 21:58       ` Eric Dumazet
2007-11-13 22:00         ` Christoph Lameter
2007-11-14  1:33           ` David Miller
2007-11-13 22:02         ` Christoph Lameter
2007-11-14  1:30       ` David Miller
2007-11-14  1:48         ` Christoph Lameter
2007-11-13 22:20     ` Christoph Lameter
2007-11-14  1:36       ` David Miller
2007-11-14  1:37       ` David Miller
2007-11-14  1:50         ` Christoph Lameter
2007-11-14  2:00           ` David Miller
2007-11-14  2:05             ` Christoph Lameter
2007-11-14  1:06     ` Andi Kleen
2007-11-14  1:52       ` David Miller
2007-11-14  1:57         ` Christoph Lameter
2007-11-14  2:01           ` David Miller
2007-11-14  2:03             ` Christoph Lameter
2007-11-14  2:28         ` Andi Kleen
2007-11-14  3:48           ` David Miller
2007-11-14  3:49           ` Christoph Lameter
2007-11-16 10:23           ` large lockdep bss (was: Re: [patch 01/28] cpu alloc: The allocator) Peter Zijlstra
2007-11-16 11:44             ` Andi Kleen
2007-11-14  4:15       ` [patch 01/28] cpu alloc: The allocator Christoph Lameter
2007-11-14  4:18         ` David Miller
2007-11-14  4:21           ` David Miller
2007-11-14  4:26           ` Christoph Lameter
2007-11-14  5:53             ` David Miller
2007-11-15 18:49               ` Christoph Lameter
2007-11-15 22:03                 ` David Miller
2007-11-16  2:19                   ` Christoph Lameter
2007-11-16  2:50                     ` David Miller
2007-11-16  2:55                       ` Christoph Lameter
2007-11-16  2:58                         ` David Miller
2007-11-16  3:10                           ` Christoph Lameter
2007-11-16  3:17                             ` David Miller
2007-11-16  3:19                               ` Christoph Lameter
2007-11-06 19:51 ` [patch 02/28] cpu alloc: x86_64 support Christoph Lameter
2007-11-06 19:51 ` [patch 03/28] cpu alloc: IA64 support Christoph Lameter
2007-11-06 19:51 ` [patch 04/28] cpu alloc: i386 support Christoph Lameter
2007-11-06 19:51 ` [patch 05/28] cpu alloc: Use in SLUB Christoph Lameter
2007-11-06 19:51 ` [patch 06/28] cpu alloc: Remove SLUB fields Christoph Lameter
2007-11-06 19:51 ` [patch 07/28] cpu alloc: page allocator conversion Christoph Lameter
2007-11-06 19:51 ` [patch 08/28] cpu alloc: percpu_counter conversion Christoph Lameter
2007-11-06 19:51 ` [patch 09/28] cpu alloc: crash_notes conversion Christoph Lameter
2007-11-06 19:51 ` [patch 10/28] cpu alloc: workqueue conversion Christoph Lameter
2007-11-06 19:51 ` [patch 11/28] cpu alloc: ACPI cstate handling conversion Christoph Lameter
2007-11-06 19:51 ` [patch 12/28] cpu alloc: genhd statistics conversion Christoph Lameter
2007-11-06 19:51 ` [patch 13/28] cpu alloc: blktrace conversion Christoph Lameter
2007-11-06 19:51 ` [patch 14/28] cpu alloc: SRCU Christoph Lameter
2007-11-06 19:51 ` [patch 15/28] cpu alloc: XFS counters Christoph Lameter
2007-11-06 19:52 ` [patch 16/28] cpu alloc: NFS statistics Christoph Lameter
2007-11-06 19:52 ` [patch 17/28] cpu alloc: neighbour statistics Christoph Lameter
2007-11-06 19:52 ` [patch 18/28] cpu alloc: tcp statistics Christoph Lameter
2007-11-06 19:52 ` [patch 19/28] cpu alloc: convert scratches Christoph Lameter
2007-11-06 19:52 ` [patch 20/28] cpu alloc: dmaengine conversion Christoph Lameter
2007-11-06 19:52 ` [patch 21/28] cpu alloc: convert loopback statistics Christoph Lameter
2007-11-06 19:52 ` [patch 22/28] cpu alloc: veth conversion Christoph Lameter
2007-11-06 19:52 ` [patch 23/28] cpu alloc: Chelsio statistics conversion Christoph Lameter
2007-11-06 19:52 ` [patch 24/28] cpu alloc: convert mib handling to cpu alloc Christoph Lameter
2007-11-06 19:52 ` [patch 25/28] cpu alloc: Explicitly code allocpercpu calls in iucv Christoph Lameter
2007-11-06 19:52 ` [patch 26/28] cpu alloc: Use for infiniband Christoph Lameter
2007-11-06 19:52 ` [patch 27/28] cpu alloc: Use in the crypto subsystem Christoph Lameter
2007-11-06 19:52 ` [patch 28/28] cpu alloc: Remove the allocpercpu functionality Christoph Lameter
2007-11-07 13:10 ` [patch 00/28] cpu alloc v1: Optimize by removing arrays of pointers to per cpu objects Martin Schwidefsky
2007-11-07 18:05   ` Christoph Lameter

This is a public inbox; see mirroring instructions
for how to clone and mirror all data and code used for this inbox,
as well as URLs for NNTP newsgroup(s).