LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [patch -mm v2] mempolicy: disallow static or relative flags for local preferred mode
@ 2008-03-08 22:14 David Rientjes
  2008-03-10 16:39 ` Lee Schermerhorn
  2008-03-11 19:21 ` [PATCH -mm v3] " Lee Schermerhorn
  0 siblings, 2 replies; 3+ messages in thread
From: David Rientjes @ 2008-03-08 22:14 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Paul Jackson, Christoph Lameter, Lee Schermerhorn, Andi Kleen,
	Randy Dunlap, linux-kernel

MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES don't mean anything for
MPOL_PREFERRED policies that were created with an empty nodemask (for
purely local allocations).  They'll never be invalidated because the
allowed mems of a task changes or need to be rebound relative to a
cpuset's placement.

Also fixes a bug identified by Lee Schermerhorn that disallowed empty
nodemasks to be passed to MPOL_PREFERRED to specify local allocations.

Cc: Paul Jackson <pj@sgi.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
 Documentation/vm/numa_memory_policy.txt |   16 ++++++++++++++--
 mm/mempolicy.c                          |   24 +++++++++++++++++-------
 2 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/Documentation/vm/numa_memory_policy.txt b/Documentation/vm/numa_memory_policy.txt
--- a/Documentation/vm/numa_memory_policy.txt
+++ b/Documentation/vm/numa_memory_policy.txt
@@ -210,6 +210,12 @@ Components of Memory Policies
 	    local allocation for a specific range of addresses--i.e. for
 	    VMA policies.
 
+	    It is possible for the user to specify that local allocation is
+	    always preferred by passing an empty nodemask with this mode.
+	    If an empty nodemask is passed, the policy cannot use the
+	    MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES flags described
+	    below.
+
 	MPOL_INTERLEAVED:  This mode specifies that page allocations be
 	interleaved, on a page granularity, across the nodes specified in
 	the policy.  This mode also behaves slightly differently, based on
@@ -259,7 +265,10 @@ Components of Memory Policies
 	    occurs over that node.  If no nodes from the user's nodemask are
 	    now allowed, the Default behavior is used.
 
-	    MPOL_F_STATIC_NODES cannot be used with MPOL_F_RELATIVE_NODES.
+	    MPOL_F_STATIC_NODES cannot be combined with the
+	    MPOL_F_RELATIVE_NODES flag.  It also cannot be used for
+	    MPOL_PREFERRED policies that were created with an empty nodemask
+	    (local allocation).
 
 	MPOL_F_RELATIVE_NODES:  This flag specifies that the nodemask passed
 	by the user will be mapped relative to the set of the task or VMA's
@@ -306,7 +315,10 @@ Components of Memory Policies
 	    set of memory nodes allowed by the task's cpuset, as that may
 	    change over time.
 
-	    MPOL_F_RELATIVE_NODES cannot be used with MPOL_F_STATIC_NODES.
+	    MPOL_F_RELATIVE_NODES cannot be combined with the
+	    MPOL_F_STATIC_NODES flag.  It also cannot be used for
+	    MPOL_PREFERRED policies that were created with an empty nodemask
+	    (local allocation).
 
 MEMORY POLICY APIs
 
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -151,9 +151,7 @@ static int mpol_new_interleave(struct mempolicy *pol, const nodemask_t *nodes)
 
 static int mpol_new_preferred(struct mempolicy *pol, const nodemask_t *nodes)
 {
-	if (nodes_empty(*nodes))
-		return -EINVAL;
-	pol->v.preferred_node = first_node(*nodes);
+	pol->v.preferred_node = !nodes_empty(*nodes) ? first_node(*nodes) : -1;
 	return 0;
 }
 
@@ -176,10 +174,22 @@ static struct mempolicy *mpol_new(unsigned short mode, unsigned short flags,
 	pr_debug("setting mode %d flags %d nodes[0] %lx\n",
 		 mode, flags, nodes ? nodes_addr(*nodes)[0] : -1);
 
-	if (nodes && nodes_empty(*nodes) && mode != MPOL_PREFERRED)
-		return ERR_PTR(-EINVAL);
-	if (mode == MPOL_DEFAULT)
+	if (mode == MPOL_DEFAULT) {
+		if (nodes && !nodes_empty(*nodes))
+			return ERR_PTR(-EINVAL);
 		return NULL;
+	}
+	/*
+	 * MPOL_PREFERRED cannot be used with MPOL_F_STATIC_NODES or
+	 * MPOL_F_RELATIVE_NODES if the nodemask is empty (local allocation).
+	 * All other modes require a valid pointer to a non-empty nodemask.
+	 */
+	if (mode == MPOL_PREFERRED) {
+		if (nodes_empty(*nodes) && ((flags & MPOL_F_STATIC_NODES) ||
+					    (flags & MPOL_F_RELATIVE_NODES)))
+			return ERR_PTR(-EINVAL);
+	} else if (nodes_empty(*nodes))
+		return ERR_PTR(-EINVAL);
 	policy = kmem_cache_alloc(policy_cache, GFP_KERNEL);
 	if (!policy)
 		return ERR_PTR(-ENOMEM);
@@ -250,7 +260,7 @@ static void mpol_rebind_preferred(struct mempolicy *pol,
 	} else if (pol->flags & MPOL_F_RELATIVE_NODES) {
 		mpol_relative_nodemask(&tmp, &pol->w.user_nodemask, nodes);
 		pol->v.preferred_node = first_node(tmp);
-	} else {
+	} else if (pol->v.preferred_node != -1) {
 		pol->v.preferred_node = node_remap(pol->v.preferred_node,
 						   pol->w.cpuset_mems_allowed,
 						   *nodes);

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [patch -mm v2] mempolicy: disallow static or relative flags for local preferred mode
  2008-03-08 22:14 [patch -mm v2] mempolicy: disallow static or relative flags for local preferred mode David Rientjes
@ 2008-03-10 16:39 ` Lee Schermerhorn
  2008-03-11 19:21 ` [PATCH -mm v3] " Lee Schermerhorn
  1 sibling, 0 replies; 3+ messages in thread
From: Lee Schermerhorn @ 2008-03-10 16:39 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton
  Cc: Paul Jackson, Christoph Lameter, Andi Kleen, Randy Dunlap, linux-kernel

On Sat, 2008-03-08 at 14:14 -0800, David Rientjes wrote:
> MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES don't mean anything for
> MPOL_PREFERRED policies that were created with an empty nodemask (for
> purely local allocations).  They'll never be invalidated because the
> allowed mems of a task changes or need to be rebound relative to a
> cpuset's placement.

Good.

> 
> Also fixes a bug identified by Lee Schermerhorn that disallowed empty
> nodemasks to be passed to MPOL_PREFERRED to specify local allocations.

I tested this patch atop David's other patches on 2-6.25-rc3-mm1 and I
agree that it fixes the regression with local allocations.  It also
preserves existing behavior for MPOL_DEFAULT.  

However, [and I'm going to sound like the "Nodemask Nazi" again, sorry]
it does change the behavior of MPOL_PREFERRED when one specifies a
non-empty nodemask with preferred/all node[s] outside of the
mems_allowed mask and neither MPOL_F_{STATIC|RELATIVE}_NODES are
specified.

Existing behavior is to return EINVAL in this case.  I had tried to
preserve this behavior with my
"silently-restrict-nodemask-to-allowed-nodes" patch with those ugly
"was_empty" and "is_empty" variables to capture the state of the
nodemask before and after we remove !allowed nodes.  I also tried to
preserve this behavior in the fix I had sent to the local alloc
regression, by passing a NULL nodemask pointer to the create op for
localalloc.

With this patch, we silently fall back to local allocation for
MPOL_PREFERRED.  This occurs because by the time mpol_new_preferred()
sees the nodemask, we may have masked off all nodes with the
current_mems_allowed mask.  I have a proposed patch below to address
this [atop this one].

Another comment in-line below.

Lee
> 
> Cc: Paul Jackson <pj@sgi.com>
> Cc: Christoph Lameter <clameter@sgi.com>
> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
> Cc: Andi Kleen <ak@suse.de>
> Cc: Randy Dunlap <randy.dunlap@oracle.com>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  Documentation/vm/numa_memory_policy.txt |   16 ++++++++++++++--
>  mm/mempolicy.c                          |   24 +++++++++++++++++-------
>  2 files changed, 31 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/vm/numa_memory_policy.txt b/Documentation/vm/numa_memory_policy.txt
> --- a/Documentation/vm/numa_memory_policy.txt
> +++ b/Documentation/vm/numa_memory_policy.txt
> @@ -210,6 +210,12 @@ Components of Memory Policies
>  	    local allocation for a specific range of addresses--i.e. for
>  	    VMA policies.
>  
> +	    It is possible for the user to specify that local allocation is
> +	    always preferred by passing an empty nodemask with this mode.
> +	    If an empty nodemask is passed, the policy cannot use the
> +	    MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES flags described
> +	    below.
> +
>  	MPOL_INTERLEAVED:  This mode specifies that page allocations be
>  	interleaved, on a page granularity, across the nodes specified in
>  	the policy.  This mode also behaves slightly differently, based on
> @@ -259,7 +265,10 @@ Components of Memory Policies
>  	    occurs over that node.  If no nodes from the user's nodemask are
>  	    now allowed, the Default behavior is used.
>  
> -	    MPOL_F_STATIC_NODES cannot be used with MPOL_F_RELATIVE_NODES.
> +	    MPOL_F_STATIC_NODES cannot be combined with the
> +	    MPOL_F_RELATIVE_NODES flag.  It also cannot be used for
> +	    MPOL_PREFERRED policies that were created with an empty nodemask
> +	    (local allocation).
>  
>  	MPOL_F_RELATIVE_NODES:  This flag specifies that the nodemask passed
>  	by the user will be mapped relative to the set of the task or VMA's
> @@ -306,7 +315,10 @@ Components of Memory Policies
>  	    set of memory nodes allowed by the task's cpuset, as that may
>  	    change over time.
>  
> -	    MPOL_F_RELATIVE_NODES cannot be used with MPOL_F_STATIC_NODES.
> +	    MPOL_F_RELATIVE_NODES cannot be combined with the
> +	    MPOL_F_STATIC_NODES flag.  It also cannot be used for
> +	    MPOL_PREFERRED policies that were created with an empty nodemask
> +	    (local allocation).
>  
>  MEMORY POLICY APIs
>  
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -151,9 +151,7 @@ static int mpol_new_interleave(struct mempolicy *pol, const nodemask_t *nodes)
>  
>  static int mpol_new_preferred(struct mempolicy *pol, const nodemask_t *nodes)
>  {
> -	if (nodes_empty(*nodes))
> -		return -EINVAL;
> -	pol->v.preferred_node = first_node(*nodes);
> +	pol->v.preferred_node = !nodes_empty(*nodes) ? first_node(*nodes) : -1;
>  	return 0;
>  }
>  
> @@ -176,10 +174,22 @@ static struct mempolicy *mpol_new(unsigned short mode, unsigned short flags,
>  	pr_debug("setting mode %d flags %d nodes[0] %lx\n",
>  		 mode, flags, nodes ? nodes_addr(*nodes)[0] : -1);
>  
> -	if (nodes && nodes_empty(*nodes) && mode != MPOL_PREFERRED)
> -		return ERR_PTR(-EINVAL);
> -	if (mode == MPOL_DEFAULT)
> +	if (mode == MPOL_DEFAULT) {
> +		if (nodes && !nodes_empty(*nodes))
> +			return ERR_PTR(-EINVAL);
>  		return NULL;
> +	}

So, the only time we get a NULL 'nodes' pointer to mpol_new() is when we
pass MPOL_DEFAULT to mpol_shared_policy_init().  The syscall entries
always pass a non-null nodemask pointer to an on-stack nodemask_t.  I've
added a VM_BUG_ON() to assert this in the patch below, as the rest of
mpol_new() assumes a non-NULL 'nodes'.  Maybe should be just 'BUG_ON()'?

> +	/*
> +	 * MPOL_PREFERRED cannot be used with MPOL_F_STATIC_NODES or
> +	 * MPOL_F_RELATIVE_NODES if the nodemask is empty (local allocation).
> +	 * All other modes require a valid pointer to a non-empty nodemask.
> +	 */
> +	if (mode == MPOL_PREFERRED) {
> +		if (nodes_empty(*nodes) && ((flags & MPOL_F_STATIC_NODES) ||
> +					    (flags & MPOL_F_RELATIVE_NODES)))
> +			return ERR_PTR(-EINVAL);
> +	} else if (nodes_empty(*nodes))
> +		return ERR_PTR(-EINVAL);
>  	policy = kmem_cache_alloc(policy_cache, GFP_KERNEL);
>  	if (!policy)
>  		return ERR_PTR(-ENOMEM);
> @@ -250,7 +260,7 @@ static void mpol_rebind_preferred(struct mempolicy *pol,
>  	} else if (pol->flags & MPOL_F_RELATIVE_NODES) {
>  		mpol_relative_nodemask(&tmp, &pol->w.user_nodemask, nodes);
>  		pol->v.preferred_node = first_node(tmp);
> -	} else {
> +	} else if (pol->v.preferred_node != -1) {
>  		pol->v.preferred_node = node_remap(pol->v.preferred_node,
>  						   pol->w.cpuset_mems_allowed,
>  						   *nodes);

Here's a patch, atop the subject patch, that addresses the issues
discussed above.  Tested with the numactl regression "suite", my little
MPOL_DEFAULT error return test and a couple of ad hoc memtoy sessions in
and out of restricted cpusets.  All passed--once I remembered to exit
the restricted cpuset that didn't include nodes 0 & 1, used by the
regression test...

David:  if you agree with this, I can generate a combined patch with
your signoff and mine to replace the previous regression fix which I
believe Andrew has already added to -mm.  Or,

Andrew:  if you prefer, I can generate a patch atop the one you've
already added to the tree to accomplish the same thing.

Please advise...

Lee

-----------------------------------------------------------------------
Against:  2.6.25-rc3-mm1 atop "mempolicy:  disallow static or relative
flags for local preferred allocation - v2"

Restores error checking for MPOL_PREFERRED that specifies only
dis-allowed nodes.  Skips cpuset related mempolicy setup for 
local allocation.

Also, enforce mpol_new() assumption that 'nodes' arg is non-NULL
except for MPOL_DEFAULT.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>

 mm/mempolicy.c |   47 +++++++++++++++++++++++++++++++----------------
 1 file changed, 31 insertions(+), 16 deletions(-)

Index: linux-2.6.25-rc3-mm1/mm/mempolicy.c
===================================================================
--- linux-2.6.25-rc3-mm1.orig/mm/mempolicy.c	2008-03-10 10:25:58.000000000 -0400
+++ linux-2.6.25-rc3-mm1/mm/mempolicy.c	2008-03-10 12:12:19.000000000 -0400
@@ -158,7 +158,9 @@ static int mpol_new_interleave(struct me
 
 static int mpol_new_preferred(struct mempolicy *pol, const nodemask_t *nodes)
 {
-	pol->v.preferred_node = !nodes_empty(*nodes) ? first_node(*nodes) : -1;
+	if (nodes && nodes_empty(*nodes))
+		return -EINVAL;
+	pol->v.preferred_node = nodes ? first_node(*nodes) : -1;
 	return 0;
 }
 
@@ -176,6 +178,7 @@ static struct mempolicy *mpol_new(unsign
 {
 	struct mempolicy *policy;
 	nodemask_t cpuset_context_nmask;
+	int localalloc = 0;
 	int ret;
 
 	pr_debug("setting mode %d flags %d nodes[0] %lx\n",
@@ -186,36 +189,48 @@ static struct mempolicy *mpol_new(unsign
 			return ERR_PTR(-EINVAL);
 		return NULL;
 	}
+	VM_BUG_ON(!nodes);
+
 	/*
 	 * MPOL_PREFERRED cannot be used with MPOL_F_STATIC_NODES or
 	 * MPOL_F_RELATIVE_NODES if the nodemask is empty (local allocation).
 	 * All other modes require a valid pointer to a non-empty nodemask.
 	 */
 	if (mode == MPOL_PREFERRED) {
-		if (nodes_empty(*nodes) && ((flags & MPOL_F_STATIC_NODES) ||
-					    (flags & MPOL_F_RELATIVE_NODES)))
-			return ERR_PTR(-EINVAL);
+		if (nodes_empty(*nodes)) {
+			if (((flags & MPOL_F_STATIC_NODES) ||
+			     (flags & MPOL_F_RELATIVE_NODES)))
+				return ERR_PTR(-EINVAL);
+			localalloc++;
+		}
 	} else if (nodes_empty(*nodes))
 		return ERR_PTR(-EINVAL);
 	policy = kmem_cache_alloc(policy_cache, GFP_KERNEL);
 	if (!policy)
 		return ERR_PTR(-ENOMEM);
 	atomic_set(&policy->refcnt, 1);
-	cpuset_update_task_memory_state();
-	if (flags & MPOL_F_RELATIVE_NODES)
-		mpol_relative_nodemask(&cpuset_context_nmask, nodes,
-				       &cpuset_current_mems_allowed);
-	else
-		nodes_and(cpuset_context_nmask, *nodes,
-			  cpuset_current_mems_allowed);
 	policy->policy = mode;
 	policy->flags = flags;
-	if (mpol_store_user_nodemask(policy))
-		policy->w.user_nodemask = *nodes;
-	else
-		policy->w.cpuset_mems_allowed = cpuset_mems_allowed(current);
+	if (!localalloc) {
+		/*
+		 * cpuset related setup doesn't apply to local allocation
+		 */
+		cpuset_update_task_memory_state();
+		if (flags & MPOL_F_RELATIVE_NODES)
+			mpol_relative_nodemask(&cpuset_context_nmask, nodes,
+					       &cpuset_current_mems_allowed);
+		else
+			nodes_and(cpuset_context_nmask, *nodes,
+				  cpuset_current_mems_allowed);
+		if (mpol_store_user_nodemask(policy))
+			policy->w.user_nodemask = *nodes;
+		else
+			policy->w.cpuset_mems_allowed =
+						cpuset_mems_allowed(current);
+	}
 
-	ret = mpol_ops[mode].create(policy, &cpuset_context_nmask);
+	ret = mpol_ops[mode].create(policy,
+				    localalloc ? NULL : &cpuset_context_nmask);
 	if (ret < 0) {
 		kmem_cache_free(policy_cache, policy);
 		return ERR_PTR(ret);



^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH -mm v3] mempolicy: disallow static or relative flags for local preferred mode
  2008-03-08 22:14 [patch -mm v2] mempolicy: disallow static or relative flags for local preferred mode David Rientjes
  2008-03-10 16:39 ` Lee Schermerhorn
@ 2008-03-11 19:21 ` Lee Schermerhorn
  1 sibling, 0 replies; 3+ messages in thread
From: Lee Schermerhorn @ 2008-03-11 19:21 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton
  Cc: Paul Jackson, Christoph Lameter, Andi Kleen, Randy Dunlap,
	linux-kernel, linux-mm, Eric Whitney

David, Andrew:  I've rebased David's recent patch--the one with
"significant-looking reject"--atop 25-rc5-mm1 and retested.

Lee

mempolicy: disallow static or relative flags for local preferred mode

Against:  2.6.25-rc5-mm1

V2 -> V3 [lts]:
+ rebased to 25-rc5-mm1
+ preserved behavior of MPOL_PREFERRED w/ non-empty nodemask containing
  only dis-allowed nodes.
+ pulled the setting of policy->flags outside of the "if(nodes)..."
  block as future flags might not be associated only with nodemasks
  as STATIC and RELATIVE are.  This is safe, now that David disallows
  these flags for the case of MPOL_PREFERRED with empty nodemask.
+ dropped the localalloc variable in favor of just NULLing out nodes
  pointer to indicate "local preferred" mode.  

MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES don't mean anything for
MPOL_PREFERRED policies that were created with an empty nodemask (for
purely local allocations).  They'll never be invalidated because the
allowed mems of a task changes or need to be rebound relative to a
cpuset's placement.

Also fixes a bug identified by Lee Schermerhorn that disallowed empty
nodemasks to be passed to MPOL_PREFERRED to specify local allocations.
[A different, somewhat incomplete, patch already existed in 25-rc5-mm1.]

Cc: Paul Jackson <pj@sgi.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: David Rientjes <rientjes@google.com>

 Documentation/vm/numa_memory_policy.txt |   16 ++++++++++--
 mm/mempolicy.c                          |   42 +++++++++++++++++++-------------
 2 files changed, 40 insertions(+), 18 deletions(-)

Index: linux-2.6.25-rc5-mm1/Documentation/vm/numa_memory_policy.txt
===================================================================
--- linux-2.6.25-rc5-mm1.orig/Documentation/vm/numa_memory_policy.txt	2008-03-11 14:22:56.000000000 -0400
+++ linux-2.6.25-rc5-mm1/Documentation/vm/numa_memory_policy.txt	2008-03-11 14:32:40.000000000 -0400
@@ -205,6 +205,12 @@ Components of Memory Policies
 	    local allocation for a specific range of addresses--i.e. for
 	    VMA policies.
 
+	    It is possible for the user to specify that local allocation is
+	    always preferred by passing an empty nodemask with this mode.
+	    If an empty nodemask is passed, the policy cannot use the
+	    MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES flags described
+	    below.
+
 	MPOL_INTERLEAVED:  This mode specifies that page allocations be
 	interleaved, on a page granularity, across the nodes specified in
 	the policy.  This mode also behaves slightly differently, based on
@@ -254,7 +260,10 @@ Components of Memory Policies
 	    occurs over that node.  If no nodes from the user's nodemask are
 	    now allowed, the Default behavior is used.
 
-	    MPOL_F_STATIC_NODES cannot be used with MPOL_F_RELATIVE_NODES.
+	    MPOL_F_STATIC_NODES cannot be combined with the
+	    MPOL_F_RELATIVE_NODES flag.  It also cannot be used for
+	    MPOL_PREFERRED policies that were created with an empty nodemask
+	    (local allocation).
 
 	MPOL_F_RELATIVE_NODES:  This flag specifies that the nodemask passed
 	by the user will be mapped relative to the set of the task or VMA's
@@ -301,7 +310,10 @@ Components of Memory Policies
 	    set of memory nodes allowed by the task's cpuset, as that may
 	    change over time.
 
-	    MPOL_F_RELATIVE_NODES cannot be used with MPOL_F_STATIC_NODES.
+	    MPOL_F_RELATIVE_NODES cannot be combined with the
+	    MPOL_F_STATIC_NODES flag.  It also cannot be used for
+	    MPOL_PREFERRED policies that were created with an empty nodemask
+	    (local allocation).
 
 MEMORY POLICY APIs
 
Index: linux-2.6.25-rc5-mm1/mm/mempolicy.c
===================================================================
--- linux-2.6.25-rc5-mm1.orig/mm/mempolicy.c	2008-03-11 14:32:17.000000000 -0400
+++ linux-2.6.25-rc5-mm1/mm/mempolicy.c	2008-03-11 15:14:49.000000000 -0400
@@ -181,27 +181,43 @@ static struct mempolicy *mpol_new(unsign
 {
 	struct mempolicy *policy;
 	nodemask_t cpuset_context_nmask;
-	int localalloc = 0;
 	int ret;
 
 	pr_debug("setting mode %d flags %d nodes[0] %lx\n",
 		 mode, flags, nodes ? nodes_addr(*nodes)[0] : -1);
 
-	if (mode == MPOL_DEFAULT)
-		return NULL;
-	if (!nodes || nodes_empty(*nodes)) {
-		if (mode != MPOL_PREFERRED)
+	if (mode == MPOL_DEFAULT) {
+		if (nodes && !nodes_empty(*nodes))
 			return ERR_PTR(-EINVAL);
-		localalloc = 1;	/* special case:  no mode flags */
+		return NULL;
 	}
+	VM_BUG_ON(!nodes);
+
+	/*
+	 * MPOL_PREFERRED cannot be used with MPOL_F_STATIC_NODES or
+	 * MPOL_F_RELATIVE_NODES if the nodemask is empty (local allocation).
+	 * All other modes require a valid pointer to a non-empty nodemask.
+	 */
+	if (mode == MPOL_PREFERRED) {
+		if (nodes_empty(*nodes)) {
+			if (((flags & MPOL_F_STATIC_NODES) ||
+			     (flags & MPOL_F_RELATIVE_NODES)))
+				return ERR_PTR(-EINVAL);
+			nodes = NULL;	/* flag local alloc */
+		}
+	} else if (nodes_empty(*nodes))
+		return ERR_PTR(-EINVAL);
 	policy = kmem_cache_alloc(policy_cache, GFP_KERNEL);
 	if (!policy)
 		return ERR_PTR(-ENOMEM);
 	atomic_set(&policy->refcnt, 1);
 	policy->policy = mode;
+	policy->flags = flags;
 
-	if (!localalloc) {
-		policy->flags = flags;
+	if (nodes) {
+		/*
+		 * cpuset related setup doesn't apply to local allocation
+		 */
 		cpuset_update_task_memory_state();
 		if (flags & MPOL_F_RELATIVE_NODES)
 			mpol_relative_nodemask(&cpuset_context_nmask, nodes,
@@ -217,7 +233,7 @@ static struct mempolicy *mpol_new(unsign
 	}
 
 	ret = mpol_ops[mode].create(policy,
-				localalloc ? NULL : &cpuset_context_nmask);
+				nodes ? &cpuset_context_nmask : NULL);
 	if (ret < 0) {
 		kmem_cache_free(policy_cache, policy);
 		return ERR_PTR(ret);
@@ -259,10 +275,6 @@ static void mpol_rebind_preferred(struct
 {
 	nodemask_t tmp;
 
-	/*
-	 * check 'STATIC_NODES first, as preferred_node == -1 may be
-	 * a temporary, "fallback" state for this policy.
-	 */
 	if (pol->flags & MPOL_F_STATIC_NODES) {
 		int node = first_node(pol->w.user_nodemask);
 
@@ -270,12 +282,10 @@ static void mpol_rebind_preferred(struct
 			pol->v.preferred_node = node;
 		else
 			pol->v.preferred_node = -1;
-	} else if (pol->v.preferred_node == -1) {
-		return;	/* no remap required for explicit local alloc */
 	} else if (pol->flags & MPOL_F_RELATIVE_NODES) {
 		mpol_relative_nodemask(&tmp, &pol->w.user_nodemask, nodes);
 		pol->v.preferred_node = first_node(tmp);
-	} else {
+	} else if (pol->v.preferred_node != -1) {
 		pol->v.preferred_node = node_remap(pol->v.preferred_node,
 						   pol->w.cpuset_mems_allowed,
 						   *nodes);



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2008-03-11 19:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-08 22:14 [patch -mm v2] mempolicy: disallow static or relative flags for local preferred mode David Rientjes
2008-03-10 16:39 ` Lee Schermerhorn
2008-03-11 19:21 ` [PATCH -mm v3] " Lee Schermerhorn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).