LKML Archive on lore.kernel.org
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Christoph Lameter <clameter@sgi.com>
Cc: ak@suse.de, linux-kernel@vger.kernel.org, y-goto@jp.fujitsu.com,
	clameter@engr.sgi.com, akpm@osdl.org
Subject: Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node
Date: Thu, 8 Feb 2007 00:28:09 +0900
Message-ID: <20070208002809.c75b2742.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <Pine.LNX.4.64.0702070604090.14056@schroedinger.engr.sgi.com>

On Wed, 7 Feb 2007 06:05:56 -0800 (PST)
Christoph Lameter <clameter@sgi.com> wrote:

> On Wed, 7 Feb 2007, KAMEZAWA Hiroyuki wrote:
> 
> > > IMHO there shouldn't be any memory less nodes. The architecture code
> > > should not create them. The CPU should be assigned to a nearby node instead.
> > > At least x86-64 ensures that.
> > > 
> > AFAIK, ia64 creates nodes just depends on SRAT's possible resource information.
> > Then, ia64 can create cpu-memory-less-node(node with no available resource.).
> > (*)I don't like this.
> 
> I think that is only true for !SN2 platforms? Could we fix this?
> 
AFAIK, some vendor (HP?) has the following configuration:
- node0 .... cpu only node
- node1 .... cpu only node
- node2 .... memory only node.
This is because of their memory-interleave technique.

Our 64cpu socket NUMA system also has a config
- node0 .... cpu+memory node
- node1-7 .... cpu only nodes
to divide the scheduler domain. (Old kernels had problems with a big sched domain.)

To get rid of memory-less nodes, we have to test the performance of a
"very-big-scheduler-domain" and define a rule for cpu-hot-add, such as
"a new cpu will be added to the nearest node".
(node-hot-add will also need some hook.)
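
A rough user-space sketch of that "nearest node" rule, just to make the idea
concrete. This is not kernel code: struct node_info, MAX_NODES and
nearest_memory_node() are made-up names for illustration, and the distance
table is assumed to be SLIT-style (smaller value means closer).

```c
#include <assert.h>
#include <limits.h>

#define MAX_NODES 8   /* hypothetical limit for this sketch */

/* Hypothetical per-node description; not a kernel structure. */
struct node_info {
    int has_memory;   /* non-zero if the node has usable memory */
};

/*
 * Return the node with memory that is closest to 'from' according to
 * a SLIT-style distance table, or -1 if no node has memory.
 * On equal distance the lowest node id wins.
 */
int nearest_memory_node(int from, const struct node_info *nodes,
                        int dist[][MAX_NODES], int nr_nodes)
{
    int best = -1;
    int best_dist = INT_MAX;

    assert(from >= 0 && from < nr_nodes && nr_nodes <= MAX_NODES);

    for (int n = 0; n < nr_nodes; n++) {
        if (!nodes[n].has_memory)
            continue;           /* skip memory-less nodes */
        if (dist[from][n] < best_dist) {
            best_dist = dist[from][n];
            best = n;
        }
    }
    return best;
}
```

With the "node0/node1 cpu only, node2 memory only" layout above, a cpu
reported on node0 would be assigned to node2 under this rule. The real
hot-add path would still need the hook mentioned above.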

I don't know whether others who created memory-less nodes in the past may have other issues.

There may also be systems with a complicated topology and a complicated PXM map.


> > If we don't allow memory-less-node, we may have to add several codes for cpu-hot-add.
> > cpus should be moved to nearby node at hotadd .
> > And node-hot-add have to care that cpus mustn't be added before memory, cpu-driven 
> > node-hot-add will never occur. (ACPI's 'container' device spec can't guaranntee this.)
> 
> Well you could bring down the cpu and bring it up again? This would also 
> assure the best placement of the runtime structures for node?
> 
The cpu-to-node relationship is fixed in the early stage of cpu hotplug.
I'm not sure we can bring a cpu down and up again in a clean way. After a cpu is added,
the kernel currently loses its original PXM value.

About runtime structures:
The runtime structure placement for a hot-added node is another issue here.
Goto-san and I have a plan for optimized placement of structures and will
try it when we can. (We are now assigned to RHEL5 stabilization tasks...)

Moving the per-cpu area at hot-add does not look easy.
IMHO, we may have to use stop_machine_run() to move it.

Anyway, I'll post another *easy* patch just to fix the NULL pointer access.
Please review.

Thanks,
-Kame