LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [RFC] bitmap onto and fold operators for mempolicy extensions
@ 2008-02-16 12:33 Paul Jackson
  2008-02-16 13:38 ` KOSAKI Motohiro
  2008-02-19  0:46 ` David Rientjes
  0 siblings, 2 replies; 7+ messages in thread
From: Paul Jackson @ 2008-02-16 12:33 UTC (permalink / raw)
  To: David Rientjes
  Cc: Lee.Schermerhorn, mel, ak, linux-kernel, ray-lk, kosaki.motohiro,
	akpm, Paul Jackson, clameter

[Beware - never tested, never booted.]

From: Paul Jackson <pj@sgi.com>

The following adds two more bitmap operators, bitmap_onto() and
bitmap_fold(), with the usual cpumask and nodemask wrappers.

The bitmap_onto() operator computes one bitmap relative to
another.  If the n-th bit in the origin mask is set, then the
m-th bit of the destination mask will be set, where m is
the position of the n-th set bit in the relative mask.

The bitmap_fold() operator folds a bitmap into a second that
has bit m set iff the input bitmap has some bit n set, where
m == n mod sz, for the specified sz value.

There are two substantive changes between this patch and its
predecessor bitmap_relative:
 1) Renamed bitmap_relative() to be bitmap_onto().
 2) Added bitmap_fold().

The essential motivation for bitmap_onto() is to provide
a mechanism for converting a cpuset-relative CPU or Node
mask to an absolute mask.  Cpuset relative masks are written
as if the current task were in a cpuset whose CPUs or
Nodes were just the consecutive ones numbered 0..N-1, for
some N.  The bitmap_onto() operator is provided in anticipation
of adding support for the first such cpuset relative mask,
by the mbind() and set_mempolicy() system calls, using a
planned flag of MPOL_F_RELATIVE_NODES.  These bitmap operators
(and their nodemask wrappers, in particular) will be used in
code that converts the user specified cpuset relative memory
policy to a specific system node numbered policy, given the
current mems_allowed of the tasks cpuset.

Such cpuset relative mempolicies will address two deficiencies
of the existing interface between cpusets and mempolicies:
 1) A task cannot at present reliably establish a cpuset
    relative mempolicy because there is an essential race
    condition, in that the tasks cpuset may be changed in
    between the time the task can query its cpuset placement,
    and the time the task can issue the applicable mbind or
    set_memplicy system call.
 2) A task cannot at present establish what cpuset relative
    mempolicy it would like to have, if it is in a smaller
    cpuset than it might have mempolicy preferences for,
    because the existing interface only allows specifying
    mempolicies for nodes currently allowed by the cpuset.

Cpuset relative mempolicies are useful for tasks that don't
distinguish particularly between one CPU or Node and another,
but only between how many of each are allowed, and the proper
placement of threads and memory pages on the various CPUs and
Nodes available.

The motivation for the added bitmap_fold() can be seen in
the following example.

Let's say an application has specified some mempolicies
that presume 16 memory nodes, including say a mempolicy that
specified MPOL_F_RELATIVE_NODES (cpuset relative) nodes 12-15.
Then lets say that application is crammed into a cpuset that only
has 8 memory nodes, 0-7.  If one just uses bitmap_onto(), this
mempolicy, mapped to that cpuset, would ignore the requested
relative nodes above 7, leaving it empty of nodes.  That's not
good; better to fold the higher nodes down, so that some nodes
are included in the resulting mapped mempolicy.  In this case,
the mempolicy nodes 12-15 are taken modulo 8 (the weight of the
mems_allowed of the confining cpuset), resulting in a mempolicy
specifying nodes 4-7.

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Lee.Schermerhorn@hp.com
Cc: clameter@sgi.com
Cc: ak@suse.de
Cc: mel@csn.ul.ie
Cc: kosaki.motohiro@jp.fujitsu.com
Cc: ray-lk@madrabbit.org

---
 include/linux/bitmap.h   |    6 +
 include/linux/cpumask.h  |   22 ++++++
 include/linux/nodemask.h |   22 ++++++
 lib/bitmap.c             |  158 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 206 insertions(+), 2 deletions(-)

--- 2.6.24-mm1.orig/include/linux/nodemask.h	2008-02-15 15:15:48.210590885 -0800
+++ 2.6.24-mm1/include/linux/nodemask.h	2008-02-15 17:56:45.454238787 -0800
@@ -14,6 +14,8 @@
  * bitmap_scnlistprintf() and bitmap_parselist(), also in bitmap.c.
  * For details of node_remap(), see bitmap_bitremap in lib/bitmap.c.
  * For details of nodes_remap(), see bitmap_remap in lib/bitmap.c.
+ * For details of nodes_onto(), see bitmap_onto in lib/bitmap.c.
+ * For details of nodes_fold(), see bitmap_fold in lib/bitmap.c.
  *
  * The available nodemask operations are:
  *
@@ -55,7 +57,9 @@
  * int nodelist_scnprintf(buf, len, mask) Format nodemask as list for printing
  * int nodelist_parse(buf, map)		Parse ascii string as nodelist
  * int node_remap(oldbit, old, new)	newbit = map(old, new)(oldbit)
- * int nodes_remap(dst, src, old, new)	*dst = map(old, new)(dst)
+ * void nodes_remap(dst, src, old, new)	*dst = map(old, new)(src)
+ * void nodes_onto(dst, orig, relmap)	*dst = orig relative to relmap
+ * void nodes_fold(dst, orig, sz)	dst bits = orig bits mod sz
  *
  * for_each_node_mask(node, mask)	for-loop node over mask
  *
@@ -326,6 +330,22 @@ static inline void __nodes_remap(nodemas
 	bitmap_remap(dstp->bits, srcp->bits, oldp->bits, newp->bits, nbits);
 }
 
+#define nodes_onto(dst, orig, relmap) \
+		__nodes_onto(&(dst), &(orig), &(relmap), MAX_NUMNODES)
+static inline void __nodes_onto(nodemask_t *dstp, const nodemask_t *origp,
+		const nodemask_t *relmapp, int nbits)
+{
+	bitmap_onto(dstp->bits, origp->bits, relmapp->bits, nbits);
+}
+
+#define nodes_fold(dst, orig, sz) \
+		__nodes_fold(&(dst), &(orig), sz, MAX_NUMNODES)
+static inline void __nodes_fold(nodemask_t *dstp, const nodemask_t *origp,
+		int sz, int nbits)
+{
+	bitmap_fold(dstp->bits, origp->bits, sz, nbits);
+}
+
 #if MAX_NUMNODES > 1
 #define for_each_node_mask(node, mask)			\
 	for ((node) = first_node(mask);			\
--- 2.6.24-mm1.orig/include/linux/bitmap.h	2008-02-15 15:15:48.210590885 -0800
+++ 2.6.24-mm1/include/linux/bitmap.h	2008-02-15 16:55:20.139255707 -0800
@@ -46,6 +46,8 @@
  * bitmap_shift_left(dst, src, n, nbits)	*dst = *src << n
  * bitmap_remap(dst, src, old, new, nbits)	*dst = map(old, new)(src)
  * bitmap_bitremap(oldbit, old, new, nbits)	newbit = map(old, new)(oldbit)
+ * bitmap_onto(dst, orig, relmap, nbits)	*dst = orig relative to relmap
+ * bitmap_fold(dst, orig, sz, nbits)		dst bits = orig bits mod sz
  * bitmap_scnprintf(buf, len, src, nbits)	Print bitmap src to buf
  * bitmap_parse(buf, buflen, dst, nbits)	Parse bitmap dst from kernel buf
  * bitmap_parse_user(ubuf, ulen, dst, nbits)	Parse bitmap dst from user buf
@@ -120,6 +122,10 @@ extern void bitmap_remap(unsigned long *
 		const unsigned long *old, const unsigned long *new, int bits);
 extern int bitmap_bitremap(int oldbit,
 		const unsigned long *old, const unsigned long *new, int bits);
+extern void bitmap_onto(unsigned long *dst, const unsigned long *orig,
+		const unsigned long *relmap, int bits);
+extern void bitmap_fold(unsigned long *dst, const unsigned long *orig,
+		int sz, int bits);
 extern int bitmap_find_free_region(unsigned long *bitmap, int bits, int order);
 extern void bitmap_release_region(unsigned long *bitmap, int pos, int order);
 extern int bitmap_allocate_region(unsigned long *bitmap, int pos, int order);
--- 2.6.24-mm1.orig/include/linux/cpumask.h	2008-02-15 15:15:48.210590885 -0800
+++ 2.6.24-mm1/include/linux/cpumask.h	2008-02-15 17:57:08.202604547 -0800
@@ -14,6 +14,8 @@
  * bitmap_scnlistprintf() and bitmap_parselist(), also in bitmap.c.
  * For details of cpu_remap(), see bitmap_bitremap in lib/bitmap.c
  * For details of cpus_remap(), see bitmap_remap in lib/bitmap.c.
+ * For details of cpus_onto(), see bitmap_onto in lib/bitmap.c.
+ * For details of cpus_fold(), see bitmap_fold in lib/bitmap.c.
  *
  * The available cpumask operations are:
  *
@@ -53,7 +55,9 @@
  * int cpulist_scnprintf(buf, len, mask) Format cpumask as list for printing
  * int cpulist_parse(buf, map)		Parse ascii string as cpulist
  * int cpu_remap(oldbit, old, new)	newbit = map(old, new)(oldbit)
- * int cpus_remap(dst, src, old, new)	*dst = map(old, new)(src)
+ * void cpus_remap(dst, src, old, new)	*dst = map(old, new)(src)
+ * void cpus_onto(dst, orig, relmap)	*dst = orig relative to relmap
+ * void cpus_fold(dst, orig, sz)	dst bits = orig bits mod sz
  *
  * for_each_cpu_mask(cpu, mask)		for-loop cpu over mask
  *
@@ -311,6 +315,22 @@ static inline void __cpus_remap(cpumask_
 	bitmap_remap(dstp->bits, srcp->bits, oldp->bits, newp->bits, nbits);
 }
 
+#define cpus_onto(dst, orig, relmap) \
+		__cpus_onto(&(dst), &(orig), &(relmap), NR_CPUS)
+static inline void __cpus_onto(cpumask_t *dstp, const cpumask_t *origp,
+		const cpumask_t *relmapp, int nbits)
+{
+	bitmap_onto(dstp->bits, origp->bits, relmapp->bits, nbits);
+}
+
+#define cpus_fold(dst, orig, sz) \
+		__cpus_fold(&(dst), &(orig), sz, NR_CPUS)
+static inline void __cpus_fold(cpumask_t *dstp, const cpumask_t *origp,
+		int sz, int nbits)
+{
+	bitmap_fold(dstp->bits, origp->bits, sz, nbits);
+}
+
 #if NR_CPUS > 1
 #define for_each_cpu_mask(cpu, mask)		\
 	for ((cpu) = first_cpu(mask);		\
--- 2.6.24-mm1.orig/lib/bitmap.c	2008-02-15 15:15:48.234591265 -0800
+++ 2.6.24-mm1/lib/bitmap.c	2008-02-16 03:12:58.842876198 -0800
@@ -698,6 +698,164 @@ int bitmap_bitremap(int oldbit, const un
 }
 EXPORT_SYMBOL(bitmap_bitremap);
 
+/**
+ * bitmap_onto - translate one bitmap relative to another
+ *	@dst: resulting translated bitmap
+ * 	@orig: original untranslated bitmap
+ * 	@relmap: bitmap relative to which translated
+ *	@bits: number of bits in each of these bitmaps
+ *
+ * Set the n-th bit of @dst iff there exists some m such that the
+ * n-th bit of @relmap is set, the m-th bit of @orig is set, and
+ * the n-th bit of @relmap is also the m-th _set_ bit of @relmap.
+ * (If you understood the previous sentence the first time your
+ * read it, you're overqualified for your current job.)
+ *
+ * In other words, @orig is mapped onto (surjectively) @dst,
+ * using the the map { <n, m> | the n-th bit of @relmap is the
+ * m-th set bit of @relmap }.
+ *
+ * Any set bits in @orig above bit number W, where W is the
+ * weight of (number of set bits in) @relmap are mapped nowhere.
+ * In particular, if for all bits m set in @orig, m >= W, then
+ * @dst will end up empty.  In situations where the possibility
+ * of such an empty result is not desired, one way to avoid it is
+ * to use the bitmap_fold() operator, below, to first fold the
+ * @orig bitmap over itself so that all its set bits x are in the
+ * range 0 <= x < W.  The bitmap_fold() operator does this by
+ * setting the bit (m % W) in @dst, for each bit (m) set in @orig.
+ *
+ * Example [1] for bitmap_onto():
+ *  Let's say @relmap has bits 30-39 set, and @orig has bits
+ *  1, 3, 5, 7, 9 and 11 set.  Then on return from this routine,
+ *  @dst will have bits 31, 33, 35, 37 and 39 set.
+ *
+ *  When bit 0 is set in @orig, it means turn on the bit in
+ *  @dst corresponding to whatever is the first bit (if any)
+ *  that is turned on in @relmap.  Since bit 0 was off in the
+ *  above example, we leave off that bit (bit 30) in @dst.
+ *
+ *  When bit 1 is set in @orig (as in the above example), it
+ *  means turn on the bit in @dst corresponding to whatever
+ *  is the second bit that is turned on in @relmap.  The second
+ *  bit in @relmap that was turned on in the above example was
+ *  bit 31, so we turned on bit 31 in @dst.
+ *
+ *  Similarly, we turned on bits 33, 35, 37 and 39 in @dst,
+ *  because they were the 4th, 6th, 8th and 10th set bits
+ *  set in @relmap, and the 4th, 6th, 8th and 10th bits of
+ *  @orig (i.e. bits 3, 5, 7 and 9) were also set.
+ *
+ *  When bit 11 is set in @orig, it means turn on the bit in
+ *  @dst corresponding to whatever is the twelth bit that is
+ *  turned on in @relmap.  In the above example, there were
+ *  only ten bits turned on in @relmap (30..39), so that bit
+ *  11 was set in @orig had no affect on @dst.
+ *
+ * Example [2] for bitmap_fold() + bitmap_onto():
+ *  Let's say @relmap has these ten bits set:
+ *		40 41 42 43 45 48 53 61 74 95
+ *  (for the curious, that's 40 plus the first ten terms of the
+ *  Fibonacci sequence.)
+ *
+ *  Further lets say we use the following code, invoking
+ *  bitmap_fold() then bitmap_onto, as suggested above to
+ *  avoid the possitility of an empty @dst result:
+ *
+ *	unsigned long *tmp;	// a temporary bitmap's bits
+ *
+ *	bitmap_fold(tmp, orig, bitmap_weight(relmap, bits), bits);
+ *	bitmap_onto(dst, tmp, relmap, bits);
+ *
+ *  Then this table shows what various values of @dst would be, for
+ *  various @orig's.  I list the zero-based positions of each set bit.
+ *  The tmp column shows the intermediate result, as computed by
+ *  using bitmap_fold() to fold the @orig bitmap modulo ten
+ *  (the weight of @relmap).
+ *
+ *      @orig           tmp            @dst
+ *      0                0             40
+ *      1                1             41
+ *      9                9             95
+ *      10               0             40 (*)
+ *      1 3 5 7          1 3 5 7       41 43 48 61
+ *      0 1 2 3 4        0 1 2 3 4     40 41 42 43 45
+ *      0 9 18 27        0 9 8 7       40 61 74 95
+ *      0 10 20 30       0             40
+ *      0 11 22 33       0 1 2 3       40 41 42 43
+ *      0 12 24 36       0 2 4 6       40 42 45 53
+ *      78 102 211       1 2 8         41 42 74 (*)
+ *
+ * (*) For these marked lines, if we hadn't first done bitmap_fold()
+ *     into tmp, then the @dst result would have been empty.
+ *
+ * If either of @orig or @relmap is empty (no set bits), then @dst
+ * will be returned empty.
+ *
+ * If (as explained above) the only set bits in @orig are in positions
+ * m where m >= W, (where W is the weight of @relmap) then @dst will
+ * once again be returned empty.
+ *
+ * All bits in @dst not set by the above rule are cleared.
+ */
+void bitmap_onto(unsigned long *dst, const unsigned long *orig,
+			const unsigned long *relmap, int bits)
+{
+	int n, m;       	/* same meaning as in above comment */
+
+	if (dst == orig)	/* following doesn't handle inplace mappings */
+		return;
+	bitmap_zero(dst, bits);
+
+	/*
+	 * The following code is a more efficient, but less
+	 * obvious, equivalent to the loop:
+	 *	for (m = 0; m < bitmap_weight(relmap, bits); m++) {
+	 *		n = bitmap_ord_to_pos(orig, m, bits);
+	 *		if (test_bit(m, orig))
+	 *			set_bit(n, dst);
+	 *	}
+	 */
+
+	m = 0;
+	for (n = find_first_bit(relmap, bits);
+	     n < bits;
+	     n = find_next_bit(relmap, bits, n + 1)) {
+		/* m == bitmap_pos_to_ord(relmap, n, bits) */
+		if (test_bit(m, orig))
+			set_bit(n, dst);
+		m++;
+	}
+}
+EXPORT_SYMBOL(bitmap_onto);
+
+/**
+ * bitmap_fold - fold larger bitmap into smaller, modulo specified size
+ *	@dst: resulting smaller bitmap
+ *	@orig: original larger bitmap
+ *	@sz: specified size
+ *	@bits: number of bits in each of these bitmaps
+ *
+ * For each bit oldbit in @orig, set bit oldbit mod @sz in @dst.
+ * Clear all other bits in @dst.  See further the comment and
+ * Example [2] for bitmap_onto() for why and how to use this.
+ */
+void bitmap_fold(unsigned long *dst, const unsigned long *orig,
+			int sz, int bits)
+{
+	int oldbit;
+
+	if (dst == orig)	/* following doesn't handle inplace mappings */
+		return;
+	bitmap_zero(dst, bits);
+
+	for (oldbit = find_first_bit(orig, bits);
+	     oldbit < bits;
+	     oldbit = find_next_bit(orig, bits, oldbit + 1))
+		set_bit(oldbit % sz, dst);
+}
+EXPORT_SYMBOL(bitmap_fold);
+
 /*
  * Common code for bitmap_*_region() routines.
  *	bitmap: array of unsigned longs corresponding to the bitmap

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] bitmap onto and fold operators for mempolicy extensions
  2008-02-16 12:33 [RFC] bitmap onto and fold operators for mempolicy extensions Paul Jackson
@ 2008-02-16 13:38 ` KOSAKI Motohiro
  2008-02-16 16:46   ` Paul Jackson
  2008-02-19  0:46 ` David Rientjes
  1 sibling, 1 reply; 7+ messages in thread
From: KOSAKI Motohiro @ 2008-02-16 13:38 UTC (permalink / raw)
  To: Paul Jackson
  Cc: David Rientjes, Lee.Schermerhorn, mel, ak, linux-kernel, ray-lk,
	akpm, clameter

Hi Paul,

> The bitmap_fold() operator folds a bitmap into a second that
> has bit m set iff the input bitmap has some bit n set, where
> m == n mod sz, for the specified sz value.

iff?
other portions looks good :)

Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] bitmap onto and fold operators for mempolicy extensions
  2008-02-16 13:38 ` KOSAKI Motohiro
@ 2008-02-16 16:46   ` Paul Jackson
  2008-02-17  2:39     ` KOSAKI Motohiro
  0 siblings, 1 reply; 7+ messages in thread
From: Paul Jackson @ 2008-02-16 16:46 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: rientjes, Lee.Schermerhorn, mel, ak, linux-kernel, ray-lk, akpm,
	clameter

KOSAKI-san wrote:
> iff?
> other portions looks good :)

"iff" -- "if and only if"

It is from my days, a long long time ago, as a student of mathematical
logic and set theory.  I was too lazy to spell that one out.  My bad.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] bitmap onto and fold operators for mempolicy extensions
  2008-02-16 16:46   ` Paul Jackson
@ 2008-02-17  2:39     ` KOSAKI Motohiro
  2008-02-17  3:31       ` Paul Jackson
  0 siblings, 1 reply; 7+ messages in thread
From: KOSAKI Motohiro @ 2008-02-17  2:39 UTC (permalink / raw)
  To: Paul Jackson
  Cc: kosaki.motohiro, rientjes, Lee.Schermerhorn, mel, ak,
	linux-kernel, ray-lk, akpm, clameter

Hi Paul,

> > iff?
> > other portions looks good :)
> 
> "iff" -- "if and only if"
> 
> It is from my days, a long long time ago, as a student of mathematical
> logic and set theory.  I was too lazy to spell that one out.  My bad.

not at all.
I did't know the abbreviation, sorry.

I think your patch has no bug.
Thanks.


- kosaki



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] bitmap onto and fold operators for mempolicy extensions
  2008-02-17  2:39     ` KOSAKI Motohiro
@ 2008-02-17  3:31       ` Paul Jackson
  0 siblings, 0 replies; 7+ messages in thread
From: Paul Jackson @ 2008-02-17  3:31 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: kosaki.motohiro, rientjes, Lee.Schermerhorn, mel, ak,
	linux-kernel, ray-lk, akpm, clameter

Thank-you for the kind and prompt review, kosaki-san.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] bitmap onto and fold operators for mempolicy extensions
  2008-02-16 12:33 [RFC] bitmap onto and fold operators for mempolicy extensions Paul Jackson
  2008-02-16 13:38 ` KOSAKI Motohiro
@ 2008-02-19  0:46 ` David Rientjes
  2008-02-19  6:05   ` Paul Jackson
  1 sibling, 1 reply; 7+ messages in thread
From: David Rientjes @ 2008-02-19  0:46 UTC (permalink / raw)
  To: Paul Jackson
  Cc: Lee.Schermerhorn, mel, ak, linux-kernel, ray-lk, kosaki.motohiro,
	akpm, clameter

On Sat, 16 Feb 2008, Paul Jackson wrote:

> Let's say an application has specified some mempolicies
> that presume 16 memory nodes, including say a mempolicy that
> specified MPOL_F_RELATIVE_NODES (cpuset relative) nodes 12-15.
> Then lets say that application is crammed into a cpuset that only
> has 8 memory nodes, 0-7.  If one just uses bitmap_onto(), this
> mempolicy, mapped to that cpuset, would ignore the requested
> relative nodes above 7, leaving it empty of nodes.  That's not
> good; better to fold the higher nodes down, so that some nodes
> are included in the resulting mapped mempolicy.  In this case,
> the mempolicy nodes 12-15 are taken modulo 8 (the weight of the
> mems_allowed of the confining cpuset), resulting in a mempolicy
> specifying nodes 4-7.
> 

So what is the MPOL_F_RELATIVE_NODES behavior?  Is it a combination of 
nodes_onto() and nodes_fold()?

In your example, the only way we know to use nodes_fold() is if the 
resultant of nodes_onto() has a weight of 0.  An MPOL_F_RELATIVE_NODES 
nodemask for 4-6, for example, works fine in your case of a cpuset with 
memory nodes 0-7 and no fold is required.

So it's easy enough to do this:

	case MPOL_INTERLEAVE:
		if (flags & MPOL_F_RELATIVE_NODES) {
			nodes_onto(pol->v.nodes, pol->user_nodemask,
				   cpuset_context_nmask);
			if (nodes_empty(pol->v.nodes))
				nodes_fold(pol->v.nodes,
					   pol->user_nodemask,
					   nodes_weight(cpuset_context_nmask));
		} else {
			...
		}
		break;

But what if we require a combination?  Say the user asked for a policy of 
MPOL_INTERLEAVE | MPOL_F_RELATIVE_NODES over nodes 4-8 in a cpuset 
constrained to mems 0-7?  Should the resultant be 0,4-7 (combination of 
nodes_onto() and nodes_fold()) or simply be 4-7 (just nodes_onto())?

And what if the MPOL_INTERLEAVE | MPOL_F_RELATIVE_NODES nodemask is 0,4-8 
in the same cpuset constrained to mems 0-7?  Should the resultant be

 - 0,4-7 (nodes_onto() and nodes_fold()),

 - 0,4-7 (just nodes_onto()), or

 - 0-1,4-7 (nodes_onto(), nodes_fold(), and shift)?

The last option, 0-1,4-7, is the only one that preserves the same weight 
as the relative nodemask.

		David

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC] bitmap onto and fold operators for mempolicy extensions
  2008-02-19  0:46 ` David Rientjes
@ 2008-02-19  6:05   ` Paul Jackson
  0 siblings, 0 replies; 7+ messages in thread
From: Paul Jackson @ 2008-02-19  6:05 UTC (permalink / raw)
  To: David Rientjes
  Cc: Lee.Schermerhorn, mel, ak, linux-kernel, ray-lk, kosaki.motohiro,
	akpm, clameter

David wrote:
> So what is the MPOL_F_RELATIVE_NODES behavior?  Is it a combination of 
> nodes_onto() and nodes_fold()?

MPOL_F_RELATIVE_NODES should always combination of nodes_onto() and
nodes_fold().

The reason I say that is consistency with the end cases.  That is,
we need fold in the case that no requested nodes have numbers
smaller than the weight of the cpusets mems_allowed.  At the opposite
extreme, when -all- requested node numbers are smaller than the weight
of mems_allowed, fold has no affect, so it's fine to use there.  And
for the in between cases, when some requested nodes are smaller and
some not, fold is one possible reasonable answer.  So use it everywhere
for consistency.

> So it's easy enough to do this:
> 	case MPOL_INTERLEAVE:
> 		if (flags & MPOL_F_RELATIVE_NODES) {
> 			nodes_onto(pol->v.nodes, pol->user_nodemask,
> 				   cpuset_context_nmask);
> 			if (nodes_empty(pol->v.nodes))

I hate any more special cases than I have to have -- so not that way.

> But what if we require a combination?  Say the user asked for a policy of 
> MPOL_INTERLEAVE | MPOL_F_RELATIVE_NODES over nodes 4-8 in a cpuset 
> constrained to mems 0-7?  Should the resultant be 0,4-7 (combination of 
> nodes_onto() and nodes_fold()) or simply be 4-7 (just nodes_onto())?

All fold, all the time ;).  So this one should be 0,4-7.


> And what if the MPOL_INTERLEAVE | MPOL_F_RELATIVE_NODES nodemask is 0,4-8 
> in the same cpuset constrained to mems 0-7?  Should the resultant be
> 
>  - 0,4-7 (nodes_onto() and nodes_fold()),
> 
>  - 0,4-7 (just nodes_onto()), or
> 
>  - 0-1,4-7 (nodes_onto(), nodes_fold(), and shift)?
> 
> The last option, 0-1,4-7, is the only one that preserves the same weight 
> as the relative nodemask.

Again, 0,4-7.

Weight preservation is not a goal here, and would require special cases.

If someone crams a five pound cpuset relative memory policy into a two
poound cpuset, they will likely loose weight.  What matters in that
case is that things plod along without too much distress, and that
when the job is restored to a proper size cpuset, its memory policies
recover their full health.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-02-19  6:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-16 12:33 [RFC] bitmap onto and fold operators for mempolicy extensions Paul Jackson
2008-02-16 13:38 ` KOSAKI Motohiro
2008-02-16 16:46   ` Paul Jackson
2008-02-17  2:39     ` KOSAKI Motohiro
2008-02-17  3:31       ` Paul Jackson
2008-02-19  0:46 ` David Rientjes
2008-02-19  6:05   ` Paul Jackson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).