* Re: Fw: NUMA allocator on Opteron systems does non-local allocation on node0
[not found] <20081015014125.a546fcc4.akpm@linux-foundation.org>
@ 2008-10-16 19:20 ` Christoph Lameter
2008-10-17 8:07 ` Oliver Weihe
0 siblings, 1 reply; 2+ messages in thread
From: Christoph Lameter @ 2008-10-16 19:20 UTC (permalink / raw)
To: Oliver Weihe; +Cc: Andrew Morton, lkml
> I've noticed that on NUMA systems (Opterons) memory is allocated on
> non-local nodes for processes running on node0 even if local memory is
> available. (Kernel 2.6.25 and above)
How much local memory is available? 8GB per node? That means there will be 4GB
on node 0 in ZONE_DMA32 and 4GB in ZONE_NORMAL. Other nodes will have 8GB in
ZONE_NORMAL.
> In my setup I'm allocating an array of ~7GiB memory size in a
> single-threaded application.
> Startup: numactl --cpunodebind=X ./app
> For X=1,2,3 it works as expected, all memory is allocated on the local
> node.
> For X=0 I can see the memory being allocated on node0 as long as ~3GiB
> are "free" on node0. At this point the kernel starts using memory from
> node1 for the app!
NUMA only supports memory policies for the highest zone, which is
ZONE_NORMAL here. Only 4GB of ZONE_NORMAL are available on node 0, so it will
go off node after that memory is exhausted. This is done in order to preserve
the lower 4GB for I/O to 32-bit devices.
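The arithmetic behind that can be sketched as follows. This is only an illustration, not kernel code: the function name and its parameters are made up here, and it assumes the allocation is served from the local ZONE_NORMAL first and then spills to other nodes without touching the local ZONE_DMA32.

```python
def split_allocation(request_gib, local_normal_gib):
    """Split a request into (local, remote) GiB.

    Hypothetical helper: assumes the local node's ZONE_NORMAL is used
    first and the remainder goes off node, leaving ZONE_DMA32 untouched.
    """
    local = min(request_gib, local_normal_gib)
    return local, request_gib - local


# Node 0 with 4 GiB in ZONE_NORMAL, ~7 GiB requested
print(split_allocation(7, 4))
# Nodes 1-3 with 8 GiB in ZONE_NORMAL, ~7 GiB requested
print(split_allocation(7, 8))
```

With the 4GB/4GB split assumed above, a 7GiB array on node 0 would get 4GiB locally and 3GiB from another node, while on the other nodes it fits entirely in local ZONE_NORMAL.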
* Re: Fw: NUMA allocator on Opteron systems does non-local allocation on node0
2008-10-16 19:20 ` Fw: NUMA allocator on Opteron systems does non-local allocation on node0 Christoph Lameter
@ 2008-10-17 8:07 ` Oliver Weihe
0 siblings, 0 replies; 2+ messages in thread
From: Oliver Weihe @ 2008-10-17 8:07 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andrew Morton, lkml
Hi,
this problem/question is already solved for me. Andi suggested posting
it to the linux-mm mailing list, and they helped me. :)
> > I've noticed that on NUMA systems (Opterons) memory is allocated on
> > non-local nodes for processes running on node0 even if local memory
> > is available. (Kernel 2.6.25 and above)
>
> How much local memory is available? 8GB per node? That means there
> will be 4GB on node 0 in ZONE_DMA32 and 4GB in ZONE_NORMAL. Other
> nodes will have 8GB in ZONE_NORMAL.
You're right. This machine has 8GiB per node. Because of the memory hole,
the machine has ~3GiB in ZONE_DMA32, which fits my observations perfectly.
> > In my setup I'm allocating an array of ~7GiB memory size in a
> > single-threaded application.
> > Startup: numactl --cpunodebind=X ./app
> > For X=1,2,3 it works as expected, all memory is allocated on the
> > local node.
> > For X=0 I can see the memory being allocated on node0 as long as
> > ~3GiB are "free" on node0. At this point the kernel starts using
> > memory from node1 for the app!
>
> NUMA only supports memory policies for the highest zone, which is
> ZONE_NORMAL here. Only 4GB of ZONE_NORMAL are available on node 0, so
> it will go off node after that memory is exhausted. This is done in
> order to preserve the lower 4GB for I/O to 32-bit devices.
I've changed the zonelist order from "default" to "node"
(/proc/sys/vm/numa_zonelist_order) and now it works fine for me.
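To check which order is currently set, something like the following could be used. It is a minimal sketch: the helper name is made up, and it assumes the running kernel still exposes this sysctl file (it returns None if the file cannot be read).

```python
from pathlib import Path


def read_zonelist_order(path="/proc/sys/vm/numa_zonelist_order"):
    """Return the current setting as a stripped string, or None.

    Hypothetical helper: None means the file is missing or unreadable,
    e.g. on a kernel that does not provide this sysctl.
    """
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


print(read_zonelist_order())
```

Changing the value means writing the new order to the same file as root.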
The "default" setting automatically selects "node" or "zone" depending on
the machine. When it is set to "default", the kernel (2.6.27) chooses
"node" if any of the following holds:
1. there is no ZONE_DMA32
2. the size of ZONE_DMA32 is greater than 50% of the system memory
3. the size of ZONE_DMA32 is greater than 60% of the nodelocal memory
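The three conditions above can be written down as a small sketch. This follows the description in this mail, not the kernel source: the function name, parameter names and the two threshold parameters are made up for illustration, and sizes are simply given in GiB.

```python
def choose_default_order(dma32_gib, total_gib, node_local_gib,
                         system_frac=0.5, node_frac=0.6):
    """Pick "node" or "zone" per the three conditions listed above.

    Hypothetical helper: thresholds are taken from the mail (50% of
    system memory, 60% of node-local memory), not from kernel code.
    """
    if dma32_gib == 0:                          # 1. no ZONE_DMA32
        return "node"
    if dma32_gib > system_frac * total_gib:     # 2. more than 50% of system memory
        return "node"
    if dma32_gib > node_frac * node_local_gib:  # 3. more than 60% of node-local memory
        return "node"
    return "zone"


# This machine: ~3 GiB ZONE_DMA32, 4 nodes x 8 GiB = 32 GiB total
print(choose_default_order(3, 32, 8))
```

For this machine none of the conditions holds (3GiB is below both 16GiB and 4.8GiB), so "default" ends up as "zone", which matches the behaviour I saw before switching to "node".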
--
Regards,
Oliver Weihe