LKML Archive on lore.kernel.org
* [PATCH] NUMA slab allocator migration bugfix
@ 2008-03-05 22:33 Joe Korty
  2008-03-05 23:02 ` Christoph Lameter
  0 siblings, 1 reply; 2+ messages in thread
From: Joe Korty @ 2008-03-05 22:33 UTC (permalink / raw)
  To: clameter; +Cc: linux-kernel, npiggin, davem

NUMA slab allocator cpu migration bugfix

The NUMA slab allocator (specifically, cache_alloc_refill)
is not refreshing its local copies of what cpu and what
numa node it is on, when it drops and reacquires the irq
block that it inherited from its caller.  As a result
those values become invalid if an attempt to migrate the
process to another numa node occurred while the irq block
had been dropped.

The solution is to make cache_alloc_refill reload these
variables whenever it drops and reacquires the irq block.
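
The pattern can be illustrated outside the kernel with a minimal
user-space sketch. Every name below (sim_numa_node_id, sim_grow_cache,
refill_stale, refill_fresh, sim_reset) is a hypothetical stand-in
invented for illustration, not a kernel function, and the "migration"
is simulated by flipping a global. It only shows why a value read
before the retry label can go stale, assuming the task may move to
another node while the irq block is dropped.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel APIs. */
static int current_node = 0;        /* node the task is currently running on */
static int migrate_on_unlock = 1;   /* simulate one migration while irqs are on */

static int sim_numa_node_id(void)
{
	return current_node;
}

/*
 * Simulates the window in which the irq block is dropped and
 * reacquired: the task may be migrated to another node here.
 */
static void sim_grow_cache(void)
{
	if (migrate_on_unlock) {
		current_node = 1;
		migrate_on_unlock = 0;
	}
}

/* Stale pattern: node is read once, before the retry label. */
static int refill_stale(void)
{
	int node = sim_numa_node_id();
	int attempts = 0;

retry:
	if (attempts++ == 0) {
		sim_grow_cache();	/* irq block dropped and reacquired */
		goto retry;
	}
	return node;			/* still the node seen before the window */
}

/* Reloading pattern: node is re-read after the retry label. */
static int refill_fresh(void)
{
	int node;
	int attempts = 0;

retry:
	node = sim_numa_node_id();
	if (attempts++ == 0) {
		sim_grow_cache();
		goto retry;
	}
	return node;			/* matches the node we are on now */
}

/* Restores the simulated state so each variant starts on node 0. */
static void sim_reset(void)
{
	current_node = 0;
	migrate_on_unlock = 1;
}
```

Calling sim_reset() and then refill_stale() returns the node seen
before the simulated migration, while refill_fresh() returns the node
the task ends up on.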

The error is very difficult to hit.  When it does occur,
one gets the following oops + stack traceback bits in
check_spinlock_acquired:

	kernel BUG at mm/slab.c:2417
	cache_alloc_refill+0xe6
	kmem_cache_alloc+0xd0
	...

This patch was developed against 2.6.23, then ported to and
compile-tested only against 2.6.25-rc4.

Signed-off-by: Joe Korty <joe.korty@ccur.com>

Index: 2.6.25-rc4/mm/slab.c
===================================================================
--- 2.6.25-rc4.orig/mm/slab.c	2008-03-05 16:07:56.000000000 -0500
+++ 2.6.25-rc4/mm/slab.c	2008-03-05 16:17:47.000000000 -0500
@@ -2964,11 +2964,10 @@
 	struct array_cache *ac;
 	int node;
 
-	node = numa_node_id();
-
+retry:
 	check_irq_off();
+	node = numa_node_id();
 	ac = cpu_cache_get(cachep);
-retry:
 	batchcount = ac->batchcount;
 	if (!ac->touched && batchcount > BATCHREFILL_LIMIT) {
 		/*


* Re: [PATCH] NUMA slab allocator migration bugfix
  2008-03-05 22:33 [PATCH] NUMA slab allocator migration bugfix Joe Korty
@ 2008-03-05 23:02 ` Christoph Lameter
  0 siblings, 0 replies; 2+ messages in thread
From: Christoph Lameter @ 2008-03-05 23:02 UTC (permalink / raw)
  To: Joe Korty; +Cc: linux-kernel, npiggin, davem

On Wed, 5 Mar 2008, Joe Korty wrote:

> The NUMA slab allocator (specifically, cache_alloc_refill)
> is not refreshing its local copies of what cpu and what
> numa node it is on, when it drops and reacquires the irq
> block that it inherited from its caller.  As a result
> those values become invalid if an attempt to migrate the
> process to another numa node occurred while the irq block
> had been dropped.

The new slab is allocated for the node that was determined earlier and 
entered into the slab queues for that node. However, during the alloc we
were rescheduled.

Then we find ourselves on another processor and recalculate the ac 
pointer. If we now retry then there is the danger of getting off node 
objects into the per cpu queue. Which may cause the wrong lock to be taken 
when draining queues. Sucks because it can cause data corruption. Same as
the other issues resolved by GFP_THISNODE.

Acked-by: Christoph Lameter <clameter@sgi.com>

Will queue it.

