LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/3] x86: add cpuset_scnprintf function
@ 2008-04-01 22:54 Mike Travis
  2008-04-01 22:54 ` [PATCH 1/3] " Mike Travis
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Mike Travis @ 2008-04-01 22:54 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, linux-kernel, Paul Jackson


*  Add a new cpuset_scnprintf and a sysctl flag to control how cpumask sets
   are printed.  The default is to use the current cpumask_scnprintf.  If
   kernel.compat_cpuset_printf is '0' (default 1), then cpulist_scnprintf
   is used.  In addition, a nodeset_scnprintf is provided for compatibilty.

   This is introduced with a CONFIG_KERN_COMPAT_CPUSET_PRINTF flag which
   currently is only defined for X86_64_SMP architecture.  For all other
   architectures the current cpumask_scnprintf() is used.

*  Modify usages of cpumask_scnprintf to use the new cpuset_scnprintf where
   appropriate.  The list of files affected are:

	arch/x86/kernel/cpu/intel_cacheinfo.c
	drivers/base/node.c
	drivers/base/topology.c
	drivers/pci/pci-sysfs.c
	drivers/pci/probe.c
	kernel/irq/proc.c
	kernel/profile.c
	kernel/sched_stats.h
	kernel/sysctl.c
	kernel/sysctl_check.c
	kernel/trace/trace.c

Note that kernel/sched.c is not in this patchset as it has many other changes,
so the change to use cpuset_scnprintf is in another patchset.

For inclusion in x86/latest.

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   x86/latest          .../x86/linux-2.6-x86.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Cc: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---

The discussion that led up to this...

On Fri, Mar 28, 2008 at 3:40 PM, Mike Travis <travis@sgi.com> wrote:
> >
> > Bert Wesarg wrote:
> >  > On Fri, Mar 28, 2008 at 12:16 AM, Mike Travis <travis@sgi.com> wrote:
...
>  >>  +       /*
> >  >>  +        * cpulist_scnprintf() has the advantage of compressing
> >  >>  +        * consecutive cpu numbers into a single range which seems
> >  >>  +        * appropriate for cpus on a leaf.  This will change what is
> >  >>  +        * output so scripts that process the output will have to change.
> >  > So this breaks user space?
> >  >
> >  > Bert
> >
> >  Potentially, yes.  But I suspect with 4096 cpus, user scripts will have
> >  to change anyways.  Currently it is represented as sets of 32 bit mask
> >  outputs with comma separators, so 1152 characters would be output.
But you can declare it as a programming error on user space side. If
you change the format, the brown-paper-bag is yours.


On Mon, Mar 31, 2008 at 6:35 PM, Mike Travis <travis@sgi.com> wrote:
> > Bert Wesarg wrote:
> >  > On Fri, Mar 28, 2008 at 7:19 PM, Mike Travis <travis@sgi.com> wrote:
> >  >>  > Aren't the most cpumaps (like cpu/cpu*/topology/*_siblings or
> >  >>  > node/node*/cpumap) bitmasks?
> >  >>
> >  >>  I did an informal survey and you are right, the majority of references do use
> >  >>  cpumask_scnprintf instead of cpulist_scnprintf.  Maybe the later function was
> >  >>  added later?
> >  >>
> >  >>  To me though, it would seem that:
> >  >>
> >  >>  240-255
> >  >>
> >  >>  is more readable than:
> >  >>
> >  >>  00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000ffff
> >  >>
> >  >>  And as I mentioned, bitmask_parselist() [libbitmask(3)] does parse the output.
> >  > But libbitmask has a bitmask_parsehex() too. (but thanks for the
> >  > pointer to this code).
> >  >
> >  > Anyway, your above example is wrong, the most significant bits comes first:
> >  >
> >  > ffff0000,00000000,00000000,00000000,00000000,00000000,00000000,00000000
> >  >
> >  > This makes it not more readable, but I think readability isn't in this
> >  > case of that much importance.
> >
> >  The original problem was how to avoid allocating a large stack space to display
> >  cpu ids.  By using cpulist_scnprintf, it accomplishes this without, what I think
> >  is too much pain.  If it's really that much of a problem, I will rework this patch.
> >  But the length of the line with 4096 cpus will be 1152 bytes  Is this really
> >  better?
I ask myself, why is there a temporary buffer allocation in the first
place? In the end it is copied unbounded into the provided buf
argument. Sure your list is mostly shorter than a hex mask, but you
can also not be sure that it fit into the provided buffer. So you can
also use cpumask_scnprintf directly with the buf argument, and provide
a good known upper bound for the size (ie.
cpumask_scnprintf_len(nr_cpu_ids)).

-- 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] x86: add cpuset_scnprintf function
  2008-04-01 22:54 [PATCH 0/3] x86: add cpuset_scnprintf function Mike Travis
@ 2008-04-01 22:54 ` Mike Travis
  2008-04-01 22:54 ` [PATCH 2/3] x86: modify show_shared_cpu_map in intel_cacheinfo Mike Travis
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Mike Travis @ 2008-04-01 22:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, H. Peter Anvin, linux-kernel, Paul Jackson, Bert Wesarg

[-- Attachment #1: add-cpuset_scnprintf --]
[-- Type: text/plain, Size: 8717 bytes --]

Add a new cpuset_scnprintf() function and a sysctl flag to control
how cpumask sets are printed.  The default is to use the current
cpumask_scnprintf().  If kernel.compat_cpuset_printf is '0' (default 1),
then cpulist_scnprintf() is used.  A nodeset_scnprintf() function is
also provided for compatibilty.

This is introduced with a CONFIG_KERN_COMPAT_CPUSET_PRINTF flag which
currently is only defined for X86_64_SMP architecture.

In addition, remove the cpumask_scnprintf_len() function.

This is all needed to accomodate large NR_CPUS count and the usage has
been added to Documentation/sysctl/kernel.txt.

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   x86/latest          .../x86/linux-2.6-x86.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Cc: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
 Documentation/sysctl/kernel.txt |   34 ++++++++++++++++++++++++++++++++++
 arch/x86/Kconfig                |    4 ++++
 arch/x86/kernel/setup.c         |    5 +++++
 include/linux/bitmap.h          |    1 -
 include/linux/cpumask.h         |   22 +++++++++++++++-------
 include/linux/nodemask.h        |   17 +++++++++++++++++
 kernel/sysctl.c                 |   11 +++++++++++
 lib/bitmap.c                    |   16 ----------------
 8 files changed, 86 insertions(+), 24 deletions(-)

--- linux-2.6.x86.orig/Documentation/sysctl/kernel.txt
+++ linux-2.6.x86/Documentation/sysctl/kernel.txt
@@ -18,6 +18,7 @@ Currently, these files might (depending 
 show up in /proc/sys/kernel:
 - acpi_video_flags
 - acct
+- compat_cpuset_printf
 - core_pattern
 - core_uses_pid
 - ctrl-alt-del
@@ -85,6 +86,39 @@ valid for 30 seconds.
 
 ==============================================================
 
+compat_cpuset_printf:
+
+compat_cpuset_printf is used to alter the way cpumask_t cpusets
+are printed.  To maintain compatibility with the current output
+format, the default is '1', which results in a lengthy printout
+when the number of cpus in a system is large.  An example when
+NR_CPUS=4096 and compat_cpuset_printf = '1':
+
+    # cat /sys/devices/system/cpu/cpu3/cache/index0/shared_cpu_map
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
+    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000008
+
+The same example when compat_cpuset_printf = '0':
+
+    # cat /sys/devices/system/cpu/cpu3/cache/index0/shared_cpu_map
+    3
+
+==============================================================
+
 core_pattern:
 
 core_pattern is used to specify a core dumpfile pattern name.
--- linux-2.6.x86.orig/arch/x86/Kconfig
+++ linux-2.6.x86/arch/x86/Kconfig
@@ -189,6 +189,10 @@ config X86_TRAMPOLINE
 	depends on X86_SMP || (X86_VOYAGER && SMP)
 	default y
 
+config KERN_COMPAT_CPUSET_PRINTF
+	bool
+	default X86_64_SMP
+
 config KTIME_SCALAR
 	def_bool X86_32
 source "init/Kconfig"
--- linux-2.6.x86.orig/arch/x86/kernel/setup.c
+++ linux-2.6.x86/arch/x86/kernel/setup.c
@@ -10,6 +10,11 @@
 #include <asm/setup.h>
 #include <asm/topology.h>
 
+#ifdef CONFIG_KERN_COMPAT_CPUSET_PRINTF
+/* select cpuset_scnprintf output */
+int compat_cpuset_printf = 1;
+#endif
+
 #if defined(CONFIG_HAVE_SETUP_PER_CPU_AREA) && defined(CONFIG_SMP)
 /*
  * Copy data used in early init routines from the initial arrays to the
--- linux-2.6.x86.orig/include/linux/bitmap.h
+++ linux-2.6.x86/include/linux/bitmap.h
@@ -108,7 +108,6 @@ extern int __bitmap_weight(const unsigne
 
 extern int bitmap_scnprintf(char *buf, unsigned int len,
 			const unsigned long *src, int nbits);
-extern int bitmap_scnprintf_len(unsigned int len);
 extern int __bitmap_parse(const char *buf, unsigned int buflen, int is_user,
 			unsigned long *dst, int nbits);
 extern int bitmap_parse_user(const char __user *ubuf, unsigned int ulen,
--- linux-2.6.x86.orig/include/linux/cpumask.h
+++ linux-2.6.x86/include/linux/cpumask.h
@@ -273,13 +273,6 @@ static inline int __cpumask_scnprintf(ch
 	return bitmap_scnprintf(buf, len, srcp->bits, nbits);
 }
 
-#define cpumask_scnprintf_len(len) \
-			__cpumask_scnprintf_len((len))
-static inline int __cpumask_scnprintf_len(int len)
-{
-	return bitmap_scnprintf_len(len);
-}
-
 #define cpumask_parse_user(ubuf, ulen, dst) \
 			__cpumask_parse_user((ubuf), (ulen), &(dst), NR_CPUS)
 static inline int __cpumask_parse_user(const char __user *buf, int len,
@@ -302,6 +295,21 @@ static inline int __cpulist_parse(const 
 	return bitmap_parselist(buf, dstp->bits, nbits);
 }
 
+#ifdef	CONFIG_KERN_COMPAT_CPUSET_PRINTF
+#define	cpuset_scnprintf(buf, len, src) __cpuset_scnprintf((buf), (len), &(src))
+static inline int __cpuset_scnprintf(char *buf, int len, const cpumask_t *srcp)
+{
+	extern int compat_cpuset_printf;
+
+	if (compat_cpuset_printf)
+		return __cpumask_scnprintf(buf, len, srcp, NR_CPUS);
+	else
+		return __cpulist_scnprintf(buf, len, srcp, NR_CPUS);
+}
+#else
+#define cpuset_scnprintf(buf, len, src)	cpumask_scnprintf(buf, len, src)
+#endif
+
 #define cpu_remap(oldbit, old, new) \
 		__cpu_remap((oldbit), &(old), &(new), NR_CPUS)
 static inline int __cpu_remap(int oldbit,
--- linux-2.6.x86.orig/include/linux/nodemask.h
+++ linux-2.6.x86/include/linux/nodemask.h
@@ -310,6 +310,23 @@ static inline int __nodelist_parse(const
 	return bitmap_parselist(buf, dstp->bits, nbits);
 }
 
+#ifdef	CONFIG_KERN_COMPAT_CPUSET_PRINTF
+#define	nodeset_scnprintf(buf, len, src) \
+			__nodeset_scnprintf((buf), (len), &(src))
+static inline int __nodeset_scnprintf(char *buf, int len, const nodemask_t *srcp)
+{
+	extern int compat_cpuset_printf;
+
+	if (compat_cpuset_printf)
+		return __nodemask_scnprintf(buf, len, srcp, MAX_NUMNODES);
+	else
+		return __nodelist_scnprintf(buf, len, srcp, MAX_NUMNODES);
+}
+#else
+#define nodeset_scnprintf(buf, len, src) nodemask_scnprintf(buf, len, src)
+#endif
+
+
 #define node_remap(oldbit, old, new) \
 		__node_remap((oldbit), &(old), &(new), MAX_NUMNODES)
 static inline int __node_remap(int oldbit,
--- linux-2.6.x86.orig/kernel/sysctl.c
+++ linux-2.6.x86/kernel/sysctl.c
@@ -82,6 +82,7 @@ extern int compat_log;
 extern int maps_protect;
 extern int sysctl_stat_interval;
 extern int latencytop_enabled;
+extern int compat_cpuset_printf;
 
 /* Constants used for minimum and  maximum */
 #if defined(CONFIG_DETECT_SOFTLOCKUP) || defined(CONFIG_HIGHMEM)
@@ -831,6 +832,16 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= &proc_dointvec,
 	},
 #endif
+#ifdef CONFIG_KERN_COMPAT_CPUSET_PRINTF
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "compat_cpuset_printf",
+		.data		= &compat_cpuset_printf,
+		.maxlen		= sizeof (int),
+	 	.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
+#endif
 
 /*
  * NOTE: do not add new entries to this table unless you have read
--- linux-2.6.x86.orig/lib/bitmap.c
+++ linux-2.6.x86/lib/bitmap.c
@@ -316,22 +316,6 @@ int bitmap_scnprintf(char *buf, unsigned
 EXPORT_SYMBOL(bitmap_scnprintf);
 
 /**
- * bitmap_scnprintf_len - return buffer length needed to convert
- * bitmap to an ASCII hex string.
- * @len: number of bits to be converted
- */
-int bitmap_scnprintf_len(unsigned int len)
-{
-	/* we need 9 chars per word for 32 bit words (8 hexdigits + sep/null) */
-	int bitslen = ALIGN(len, CHUNKSZ);
-	int wordlen = CHUNKSZ / 4;
-	int buflen = (bitslen / wordlen) * (wordlen + 1) * sizeof(char);
-
-	return buflen;
-}
-EXPORT_SYMBOL(bitmap_scnprintf_len);
-
-/**
  * __bitmap_parse - convert an ASCII hex string into a bitmap.
  * @buf: pointer to buffer containing string.
  * @buflen: buffer size in bytes.  If string is smaller than this

-- 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 2/3] x86: modify show_shared_cpu_map in intel_cacheinfo
  2008-04-01 22:54 [PATCH 0/3] x86: add cpuset_scnprintf function Mike Travis
  2008-04-01 22:54 ` [PATCH 1/3] " Mike Travis
@ 2008-04-01 22:54 ` Mike Travis
  2008-04-01 22:54 ` [PATCH 3/3] cpumask: use new cpuset_scnprintf function Mike Travis
  2008-04-02  6:20 ` [PATCH 0/3] x86: add " Paul Jackson
  3 siblings, 0 replies; 9+ messages in thread
From: Mike Travis @ 2008-04-01 22:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, H. Peter Anvin, linux-kernel, Paul Jackson, Bert Wesarg

[-- Attachment #1: mod-intel_cacheinfo --]
[-- Type: text/plain, Size: 3189 bytes --]

Use cpuset_scnprintf to print cpus on a leaf instead of requiring
a new "cpumask_scnprintf_len" function to determine the size of
the temporary buffer.  cpuset_scnprintf can be used to print directly
to the output buffer, eliminating the need for the temporary buffer.

The format that cpuset_scnprintf uses is dependent on the sysctl
variable kernel.compat_cpuset_printf.  The default is '1' and results
in this printout when NR_CPUS = 4096:

    # cat /sys/devices/system/cpu/cpu3/cache/index0/shared_cpu_map
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,\
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000008

Clearing kernel.compat_cpuset_printf changes the output to:

    # sysctl kernel.compat_cpuset_printf=0
    # cat /sys/devices/system/cpu/cpu3/cache/index0/shared_cpu_map
    3


[Depends on: x86: add cpuset_scnprintf function patch]

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   x86/latest          .../x86/linux-2.6-x86.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Cc: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/kernel/cpu/intel_cacheinfo.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

--- linux-2.6.x86.orig/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ linux-2.6.x86/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -593,14 +593,17 @@ static ssize_t show_size(struct _cpuid4_
 
 static ssize_t show_shared_cpu_map(struct _cpuid4_info *this_leaf, char *buf)
 {
+	unsigned long end = ALIGN((unsigned long)buf, PAGE_SIZE);
+	int len = end - (unsigned long)buf;
 	int n = 0;
-	int len = cpumask_scnprintf_len(nr_cpu_ids);
-	char *mask_str = kmalloc(len, GFP_KERNEL);
 
-	if (mask_str) {
-		cpumask_scnprintf(mask_str, len, this_leaf->shared_cpu_map);
-		n = sprintf(buf, "%s\n", mask_str);
-		kfree(mask_str);
+	if (len == 0)
+		len = PAGE_SIZE;
+
+	if (len >= 2) {
+		n = cpuset_scnprintf(buf, len-2, this_leaf->shared_cpu_map);
+		buf[n++] = '\n';
+		buf[n] = '\0';
 	}
 	return n;
 }

-- 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 3/3] cpumask: use new cpuset_scnprintf function
  2008-04-01 22:54 [PATCH 0/3] x86: add cpuset_scnprintf function Mike Travis
  2008-04-01 22:54 ` [PATCH 1/3] " Mike Travis
  2008-04-01 22:54 ` [PATCH 2/3] x86: modify show_shared_cpu_map in intel_cacheinfo Mike Travis
@ 2008-04-01 22:54 ` Mike Travis
  2008-04-02  6:20 ` [PATCH 0/3] x86: add " Paul Jackson
  3 siblings, 0 replies; 9+ messages in thread
From: Mike Travis @ 2008-04-01 22:54 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, linux-kernel, Paul Jackson

[-- Attachment #1: use-cpuset_scnprintf --]
[-- Type: text/plain, Size: 6498 bytes --]

Use cpuset_scnprintf to print cpumask sets where appropriate.  This is used
when the number of cpus present in a system is so large that the current
method of printing cpumask_t sets as mask bits result in a large number
of output lines.  See Documentation/sysctl/kernel.txt(compat_cpuset_printf)
for usage info.

Also some small bugs fixed (or code efficiency improvments) for various uses
of cpuset_scnprintf.

[Depends on: x86: add cpuset_scnprintf function patch]

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   x86/latest          .../x86/linux-2.6-x86.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Signed-off-by: Mike Travis <travis@sgi.com>
---
 drivers/base/node.c     |    5 +++--
 drivers/base/topology.c |   14 +++++++++++---
 drivers/pci/pci-sysfs.c |    7 ++++---
 drivers/pci/probe.c     |    6 +++---
 kernel/cpuset.c         |    4 ++--
 kernel/irq/proc.c       |    2 +-
 kernel/profile.c        |    2 +-
 kernel/sched_stats.h    |   12 +++++++++---
 kernel/trace/trace.c    |    2 +-
 9 files changed, 35 insertions(+), 19 deletions(-)

--- linux-2.6.x86.orig/drivers/base/node.c
+++ linux-2.6.x86/drivers/base/node.c
@@ -28,8 +28,9 @@ static ssize_t node_read_cpumap(struct s
 	/* 2004/06/03: buf currently PAGE_SIZE, need > 1 char per 4 bits. */
 	BUILD_BUG_ON(MAX_NUMNODES/4 > PAGE_SIZE/2);
 
-	len = cpumask_scnprintf(buf, PAGE_SIZE-1, mask);
-	len += sprintf(buf + len, "\n");
+	len = cpuset_scnprintf(buf, PAGE_SIZE-2, mask);
+	buf[len++] = '\n';
+	buf[len] = '\0';
 	return len;
 }
 
--- linux-2.6.x86.orig/drivers/base/topology.c
+++ linux-2.6.x86/drivers/base/topology.c
@@ -43,10 +43,18 @@ static ssize_t show_##name(struct sys_de
 #define define_siblings_show_func(name)					\
 static ssize_t show_##name(struct sys_device *dev, char *buf)		\
 {									\
-	ssize_t len = -1;						\
+	unsigned long end = ALIGN((unsigned long)buf, PAGE_SIZE);	\
+	int len = end - (unsigned long)buf;				\
+	int n = 0;							\
 	unsigned int cpu = dev->id;					\
-	len = cpumask_scnprintf(buf, NR_CPUS+1, topology_##name(cpu));	\
-	return (len + sprintf(buf + len, "\n"));			\
+	if (len == 0)							\
+		len = PAGE_SIZE;					\
+	if (len >= 2) {							\
+		n = cpuset_scnprintf(buf, len-2, topology_##name(cpu));	\
+		buf[n++] = '\n';					\
+		buf[n] = '\0';						\
+	}								\
+	return n;							\
 }
 
 #ifdef	topology_physical_package_id
--- linux-2.6.x86.orig/drivers/pci/pci-sysfs.c
+++ linux-2.6.x86/drivers/pci/pci-sysfs.c
@@ -72,9 +72,10 @@ static ssize_t local_cpus_show(struct de
 	int len;
 
 	mask = pcibus_to_cpumask(to_pci_dev(dev)->bus);
-	len = cpumask_scnprintf(buf, PAGE_SIZE-2, mask);
-	strcat(buf,"\n"); 
-	return 1+len;
+	len = cpuset_scnprintf(buf, PAGE_SIZE-2, mask);
+	buf[len++] = '\n';
+	buf[len] = '\0';
+	return len;
 }
 
 /* show resources */
--- linux-2.6.x86.orig/drivers/pci/probe.c
+++ linux-2.6.x86/drivers/pci/probe.c
@@ -89,9 +89,9 @@ static ssize_t pci_bus_show_cpuaffinity(
 	cpumask_t cpumask;
 
 	cpumask = pcibus_to_cpumask(to_pci_bus(dev));
-	ret = cpumask_scnprintf(buf, PAGE_SIZE, cpumask);
-	if (ret < PAGE_SIZE)
-		buf[ret++] = '\n';
+	ret = cpuset_scnprintf(buf, PAGE_SIZE-2, cpumask);
+	buf[ret++] = '\n';
+	buf[ret] = '\0';
 	return ret;
 }
 DEVICE_ATTR(cpuaffinity, S_IRUGO, pci_bus_show_cpuaffinity, NULL);
--- linux-2.6.x86.orig/kernel/cpuset.c
+++ linux-2.6.x86/kernel/cpuset.c
@@ -2371,11 +2371,11 @@ const struct file_operations proc_cpuset
 void cpuset_task_status_allowed(struct seq_file *m, struct task_struct *task)
 {
 	seq_printf(m, "Cpus_allowed:\t");
-	m->count += cpumask_scnprintf(m->buf + m->count, m->size - m->count,
+	m->count += cpuset_scnprintf(m->buf + m->count, m->size - m->count,
 					task->cpus_allowed);
 	seq_printf(m, "\n");
 	seq_printf(m, "Mems_allowed:\t");
-	m->count += nodemask_scnprintf(m->buf + m->count, m->size - m->count,
+	m->count += nodeset_scnprintf(m->buf + m->count, m->size - m->count,
 					task->mems_allowed);
 	seq_printf(m, "\n");
 }
--- linux-2.6.x86.orig/kernel/irq/proc.c
+++ linux-2.6.x86/kernel/irq/proc.c
@@ -27,7 +27,7 @@ static int irq_affinity_read_proc(char *
 	if (desc->status & IRQ_MOVE_PENDING)
 		mask = &desc->pending_mask;
 #endif
-	len = cpumask_scnprintf(page, count, *mask);
+	len = cpuset_scnprintf(page, count, *mask);
 
 	if (count - len < 2)
 		return -EINVAL;
--- linux-2.6.x86.orig/kernel/profile.c
+++ linux-2.6.x86/kernel/profile.c
@@ -426,7 +426,7 @@ void profile_tick(int type)
 static int prof_cpu_mask_read_proc(char *page, char **start, off_t off,
 			int count, int *eof, void *data)
 {
-	int len = cpumask_scnprintf(page, count, *(cpumask_t *)data);
+	int len = cpuset_scnprintf(page, count, *(cpumask_t *)data);
 	if (count - len < 2)
 		return -EINVAL;
 	len += sprintf(page + len, "\n");
--- linux-2.6.x86.orig/kernel/sched_stats.h
+++ linux-2.6.x86/kernel/sched_stats.h
@@ -4,11 +4,16 @@
  * bump this up when changing the output format or the meaning of an existing
  * format, so that tools can adapt (or abort)
  */
-#define SCHEDSTAT_VERSION 14
+#define SCHEDSTAT_VERSION 15
 
 static int show_schedstat(struct seq_file *seq, void *v)
 {
 	int cpu;
+	int mask_len = NR_CPUS/32 * 9;
+	char *mask_str = kmalloc(mask_len, GFP_KERNEL);
+
+	if (mask_str == NULL)
+		return -ENOMEM;
 
 	seq_printf(seq, "version %d\n", SCHEDSTAT_VERSION);
 	seq_printf(seq, "timestamp %lu\n", jiffies);
@@ -36,9 +41,8 @@ static int show_schedstat(struct seq_fil
 		preempt_disable();
 		for_each_domain(cpu, sd) {
 			enum cpu_idle_type itype;
-			char mask_str[NR_CPUS];
 
-			cpumask_scnprintf(mask_str, NR_CPUS, sd->span);
+			cpuset_scnprintf(mask_str, mask_len, sd->span);
 			seq_printf(seq, "domain%d %s", dcount++, mask_str);
 			for (itype = CPU_IDLE; itype < CPU_MAX_IDLE_TYPES;
 					itype++) {
@@ -63,6 +67,7 @@ static int show_schedstat(struct seq_fil
 		preempt_enable();
 #endif
 	}
+	kfree(mask_str);
 	return 0;
 }
 
@@ -75,6 +80,7 @@ static int schedstat_open(struct inode *
 
 	if (!buf)
 		return -ENOMEM;
+
 	res = single_open(file, show_schedstat, NULL);
 	if (!res) {
 		m = file->private_data;
--- linux-2.6.x86.orig/kernel/trace/trace.c
+++ linux-2.6.x86/kernel/trace/trace.c
@@ -1799,7 +1799,7 @@ tracing_cpumask_read(struct file *filp, 
 
 	mutex_lock(&tracing_cpumask_update_lock);
 
-	len = cpumask_scnprintf(mask_str, count, tracing_cpumask);
+	len = cpuset_scnprintf(mask_str, count, tracing_cpumask);
 	if (count - len < 2) {
 		count = -EINVAL;
 		goto out_err;

-- 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] x86: add cpuset_scnprintf function
  2008-04-01 22:54 [PATCH 0/3] x86: add cpuset_scnprintf function Mike Travis
                   ` (2 preceding siblings ...)
  2008-04-01 22:54 ` [PATCH 3/3] cpumask: use new cpuset_scnprintf function Mike Travis
@ 2008-04-02  6:20 ` Paul Jackson
  2008-04-02  7:47   ` Mike Travis
  2008-04-02 14:36   ` Mike Travis
  3 siblings, 2 replies; 9+ messages in thread
From: Paul Jackson @ 2008-04-02  6:20 UTC (permalink / raw)
  To: Mike Travis; +Cc: mingo, tglx, hpa, linux-kernel

Hmmm ... my apologies, Mike, if I overlooked some earlier discussion
of this patchset on one of the other more limited email threads that
we share ... however a couple of aspects of this patchset don't fit
so well for me.
 1) I'm surprised to see this new routine called 'cpuset_scnprintf'
    (with the "cpuset" prefix), rather than a variant of a trio of
    names with prefixes of bitmap_, cpumask_, and nodemask_, like
    the other print routines in the bitmap family.  This doesn't
    seem to be a cpuset function to me, but a bitmap (and derived
    types cpumask and nodemask) printing function.
 2) The idea of the patch, that being a kernel modal flag that if
    set, changes a few particular details of the kernel API for all,
    seems like something I'd rarely want to do, unless I was really
    short of other options.  It leads to one app breaking another
    as a result of changing this mode, and to head butting between
    apps which cannot agree on how to set the mode.  And it introduces
    the option of breaking an existing API, which is seldom a good
    option.

I tried reading the opening "discussion that led up" comments you
posted, but couldn't find any overwhelming problem that had to be
solved of sufficient magnitude and quandry of sufficient difficulty
to justify the API conflicts and breakage in (2) above.  I did find
this comment, apparently by Bert Wesarg (though that's not clear
from your presentation):

>  If you change the format, the brown-paper-bag is yours.

I don't see a compelling response to the above comment.

Granted, what you've done isn't outright changing the format.  Rather it
is handing user space a means to change the format system-wide.

However doing this is worse in my view than simply breaking the format
outright, unilaterally and irrevocably.  If you just flat out stick a
fork in an API and break it hard on some release, then at least user
space knows that it must adapt or die at that version.  If you hand
user space the means to break that API, then any properly and
defensively written user code has to be prepared to deal with both API
flavors, and the majority of user space code is broken half the time,
when run on a system with the API variant it wasn't expecting.  More
over, you end up with apps having "toilet seat wars" with each other:
you left it up and it should be down; no you left it down and it should
be up.  Not a pretty sight.

Perhaps I totally misunderstand this patchset ?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] x86: add cpuset_scnprintf function
  2008-04-02  6:20 ` [PATCH 0/3] x86: add " Paul Jackson
@ 2008-04-02  7:47   ` Mike Travis
  2008-04-02 10:39     ` Paul Jackson
  2008-04-02 14:36   ` Mike Travis
  1 sibling, 1 reply; 9+ messages in thread
From: Mike Travis @ 2008-04-02  7:47 UTC (permalink / raw)
  To: Paul Jackson; +Cc: mingo, tglx, hpa, linux-kernel


> However doing this is worse in my view than simply breaking the format
> outright, unilaterally and irrevocably.  If you just flat out stick a
> fork in an API and break it hard on some release, then at least user
> space knows that it must adapt or die at that version.  If you hand
> user space the means to break that API, then any properly and
> defensively written user code has to be prepared to deal with both API
> flavors, and the majority of user space code is broken half the time,
> when run on a system with the API variant it wasn't expecting.  More
> over, you end up with apps having "toilet seat wars" with each other:
> you left it up and it should be down; no you left it down and it should
> be up.  Not a pretty sight.
> 
> Perhaps I totally misunderstand this patchset ?
> 

Hi,

I wanted to not break current apps unmercifully, but perhaps I should
default it to the "non-compatible" mode (and adjust the schedstat version
to indicate this)?  [It's the only output that I found that seemed to care.]

And if users have apps that they can't convert, they can revert to the
"old" (compatible) method of outputs.  I know if I'm a user and I'm really
interested in understanding the outputs when there's hundreds and hundreds
of cpus, then the more compact format is much more useful.

I can't believe there hasn't been many changes in all of these outputs.
Like what happened before Hyperthreading, or 3rd level caches, or ?
Even the new Intel announcements for Nehalem may introduce more changes
in what's important in the output information.  Plus I was under the
impression that one of the basic tenets of Linux was that API's can and
will change?

Thanks,
Mike

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] x86: add cpuset_scnprintf function
  2008-04-02  7:47   ` Mike Travis
@ 2008-04-02 10:39     ` Paul Jackson
  0 siblings, 0 replies; 9+ messages in thread
From: Paul Jackson @ 2008-04-02 10:39 UTC (permalink / raw)
  To: Mike Travis; +Cc: mingo, tglx, hpa, linux-kernel

Mike wrote:
> I wanted to not break current apps unmercifully, but perhaps I should
> default it to the "non-compatible" mode (and adjust the schedstat version
> to indicate this)?  [It's the only output that I found that seemed to care.]

It doesn't matter which way you set the default.  My concerns apply
either way.  I don't think your reply addressed my concerns.


> I know if I'm a user and I'm really interested in understanding
> the outputs when there's hundreds and hundreds of cpus, then the
> more compact format is much more useful.

That's not sufficient reason to change an API visible across the
kernel-user boundary.


> Plus I was under the impression that one of the basic tenets of Linux
> was that API's can and will change?

Kernel internals have a relatively lower barrier to API changes.

Kernel API's visible to kernel drivers or loadable modules have a
higher barrier to change.

Kernel API's visible to user space, such as this one, have a much
higher barrier to incompatible change.

I hesitate to NAQ patches because I strike out more often than someone
like Al Viro.  But I'm getting tempted on this one.

Perhaps you could write yourself a user utility that scanned its input
for masks in legacy format, converted them to list format, and passed
all else unscathed?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] x86: add cpuset_scnprintf function
  2008-04-02  6:20 ` [PATCH 0/3] x86: add " Paul Jackson
  2008-04-02  7:47   ` Mike Travis
@ 2008-04-02 14:36   ` Mike Travis
  2008-04-02 15:28     ` Paul Jackson
  1 sibling, 1 reply; 9+ messages in thread
From: Mike Travis @ 2008-04-02 14:36 UTC (permalink / raw)
  To: Paul Jackson; +Cc: mingo, tglx, hpa, linux-kernel


Paul Jackson wrote:
...
>  1) I'm surprised to see this new routine called 'cpuset_scnprintf'
>     (with the "cpuset" prefix), rather than a variant of a trio of
>     names with prefixes of bitmap_, cpumask_, and nodemask_, like
>     the other print routines in the bitmap family.  This doesn't
>     seem to be a cpuset function to me, but a bitmap (and derived
>     types cpumask and nodemask) printing function.

How about "cpus_scnprintf" to avoid confusion with "cpusets"?

>  2) The idea of the patch, that being a kernel modal flag that if
>     set, changes a few particular details of the kernel API for all,
>     seems like something I'd rarely want to do, unless I was really
>     short of other options.  It leads to one app breaking another
>     as a result of changing this mode, and to head butting between
>     apps which cannot agree on how to set the mode.  And it introduces
>     the option of breaking an existing API, which is seldom a good
>     option.

I could preface the cpulist_scnprintf output with '+' so a user app
could then:

        if (buf[0] == '+')
                bitmask_parselist(&buf[1], ...)
        else
                bitmask_parsehex(buf, ...)

or the perl equivalent which is probably more prevalent.

[I'll put this into the Documentation for compat_cpus_scnprintf...]

Does this sufficiently satisfy your concerns?  It stays compatible with
the current method, but provides an avenue for future growth...?

> Kernel API's visible to kernel drivers or loadable modules have a
> higher barrier to change.

> Kernel API's visible to user space, such as this one, have a much
> higher barrier to incompatible change.

But where is the API spec for /proc outputs?  I'd like to see how others
have managed to change it in the past, and what are the rules for doing so.

> I hesitate to NAQ patches because I strike out more often than someone
> like Al Viro.  But I'm getting tempted on this one.
> 
> Perhaps you could write yourself a user utility that scanned its input
> for masks in legacy format, converted them to list format, and passed
> all else unscathed?

Or write a utility to convert the more compact readable format into the
incomprehensible one and output that ... ?  ;-)

Thanks,
Mike

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] x86: add cpuset_scnprintf function
  2008-04-02 14:36   ` Mike Travis
@ 2008-04-02 15:28     ` Paul Jackson
  0 siblings, 0 replies; 9+ messages in thread
From: Paul Jackson @ 2008-04-02 15:28 UTC (permalink / raw)
  To: Mike Travis; +Cc: mingo, tglx, hpa, linux-kernel

Mike wrote:
> But where is the API spec for /proc outputs?  I'd like to see how others
> have managed to change it in the past, and what are the rules for doing so.

The primary spec is the actual behaviour, unless that's clearly
busted.

Changes are usually done by adding files, lines and fields, not
changing the format of existing fields.

Perhaps you should give me a call ... I sense that the lack of
synchrony in our viewpoints is discomforting in this forum.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-04-02 15:28 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-04-01 22:54 [PATCH 0/3] x86: add cpuset_scnprintf function Mike Travis
2008-04-01 22:54 ` [PATCH 1/3] " Mike Travis
2008-04-01 22:54 ` [PATCH 2/3] x86: modify show_shared_cpu_map in intel_cacheinfo Mike Travis
2008-04-01 22:54 ` [PATCH 3/3] cpumask: use new cpuset_scnprintf function Mike Travis
2008-04-02  6:20 ` [PATCH 0/3] x86: add " Paul Jackson
2008-04-02  7:47   ` Mike Travis
2008-04-02 10:39     ` Paul Jackson
2008-04-02 14:36   ` Mike Travis
2008-04-02 15:28     ` Paul Jackson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).