LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files
@ 2008-02-15 20:44 Paul Menage
  2008-02-15 20:44 ` [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file Paul Menage
                   ` (7 more replies)
  0 siblings, 8 replies; 28+ messages in thread
From: Paul Menage @ 2008-02-15 20:44 UTC (permalink / raw)
  To: balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa, akpm
  Cc: containers, linux-kernel


This set of patches makes the Control Groups API more structured and
self-describing.

1) Allows control files to be associated with data types such as
"u64", "string", "map", etc. These types show up in a new cgroup.api
file in each cgroup directory, along with a user-readable
string. Files that use cgroup-provided data accessors have these file
types inferred automatically.

2) Moves various files in cpusets and the memory controller from using
custom-written file handlers to cgroup-defined handlers

3) Adds the "cgroup." prefix for existing cgroup-provided control
files (tasks, release_agent, releasable, notify_on_release). Given
than we've already had 2.6.24 go out without this prefix, I guess this
could be a little contentious - but it seems like a good move to
prevent name clashes in the future. (Note that this doesn't affect
mounting the legacy cpuset filesystem, since the compatibility layer
disables all prefixes when mounted with filesystem type "cpuset"). If
people object too strongly, we could just make this the case for *new*
cgroup API files, but I think this is a case where consistency would
be better than compatibility - I'd be surprised if anyone has written
major legacy apps yet that rely on 2.6.24 cgroup control file names.


There are various motivations for this:

1) We said at Kernel Summit '07 that the cgroup API wouldn't be
allowed to spiral into an arbitrary mess of ad-hoc APIs. Having simple
ways to represent common data types makes this easier. (E.g. one
standard way to report a map of string,u64 pairs to userspace.)

2) People were divided on the issue of binary APIs versus ASCII APIs
for control groups. Compatibility with the existing cpusets system,
and ease of experimentation, were two important reasons for going with
the current. ASCII API. But by having structured control files, we can
open the path towards having more efficient binary APIs for simpler
and more efficient programmatic access too, without any additional
modifications required from the subsystems themselves.

My plans for this potential binary API are a little hazy at this
point, but they might go something like opening a cgroup.bin file in a
cgroup directory, and writing the names of the control files that you
were interested in; then a read on that file handle would return the
contents of the given control files in a single read in a simple
binary format. (Better suggestions are welcome). Regardless, getting a
good typing/structure on the control files is an important first step
if we want to go in that direction.

3) The memory controller currently has files with the "_in_bytes"
suffix, on the grounds that otherwise it's not obvious to a new user
what they represent. By moving the description to a auto-generated API
file, we can remove this (IMO) inelegant suffix.


--

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file
  2008-02-15 20:44 [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files Paul Menage
@ 2008-02-15 20:44 ` Paul Menage
  2008-02-16 10:07   ` Balbir Singh
  2008-02-15 20:44 ` [RFC][PATCH 2/7] CGroup API: Add cgroup map data type Paul Menage
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 28+ messages in thread
From: Paul Menage @ 2008-02-15 20:44 UTC (permalink / raw)
  To: balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa, akpm
  Cc: containers, linux-kernel

[-- Attachment #1: cgroup-api.patch --]
[-- Type: text/plain, Size: 6923 bytes --]

Add a cgroup.api control file in every cgroup directory. This reports
for each control file the type of data represented by that control
file, and a user-friendly description of the contents.

A secondary effect of this patch is to add the "cgroup." prefix in
front of all cgroup-provided control files. This will reduce the
chance of future control files clashing with user-provided names.

Signed-off-by: Paul Menage <menage@google.com>

---
 include/linux/cgroup.h |   21 +++++++
 kernel/cgroup.c        |  133 ++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 148 insertions(+), 6 deletions(-)

Index: cgroupmap-2.6.24-mm1/include/linux/cgroup.h
===================================================================
--- cgroupmap-2.6.24-mm1.orig/include/linux/cgroup.h
+++ cgroupmap-2.6.24-mm1/include/linux/cgroup.h
@@ -179,12 +179,33 @@ struct css_set {
  *	- the 'cftype' of the file is file->f_dentry->d_fsdata
  */
 
+/*
+ * The various types of control file that are reported in the
+ * cgroup.api file. "String" is a catch-all default, but should only
+ * be used for special cases. If you use the appropriate accessors
+ * (such as "read_uint") in your control file, then you can leave this
+ * as 0 (CGROUP_FILE_UNKNOWN) and let cgroup figure out the right type.
+ */
+enum cgroup_file_type {
+	CGROUP_FILE_UNKNOWN = 0,
+	CGROUP_FILE_VOID,
+	CGROUP_FILE_U64,
+	CGROUP_FILE_STRING,
+};
+
 #define MAX_CFTYPE_NAME 64
 struct cftype {
 	/* By convention, the name should begin with the name of the
 	 * subsystem, followed by a period */
 	char name[MAX_CFTYPE_NAME];
 	int private;
+
+	/* The type of a file - reported in the cgroup.api file */
+	enum cgroup_file_type type;
+
+	/* Human-readable description of the file */
+	const char *desc;
+
 	int (*open) (struct inode *inode, struct file *file);
 	ssize_t (*read) (struct cgroup *cont, struct cftype *cft,
 			 struct file *file,
Index: cgroupmap-2.6.24-mm1/kernel/cgroup.c
===================================================================
--- cgroupmap-2.6.24-mm1.orig/kernel/cgroup.c
+++ cgroupmap-2.6.24-mm1/kernel/cgroup.c
@@ -1301,6 +1301,7 @@ enum cgroup_filetype {
 	FILE_NOTIFY_ON_RELEASE,
 	FILE_RELEASABLE,
 	FILE_RELEASE_AGENT,
+	FILE_API,
 };
 
 static ssize_t cgroup_write_uint(struct cgroup *cgrp, struct cftype *cft,
@@ -1611,17 +1612,21 @@ static int cgroup_create_dir(struct cgro
 }
 
 int cgroup_add_file(struct cgroup *cgrp,
-		       struct cgroup_subsys *subsys,
-		       const struct cftype *cft)
+		    struct cgroup_subsys *subsys,
+		    const struct cftype *cft)
 {
 	struct dentry *dir = cgrp->dentry;
 	struct dentry *dentry;
 	int error;
 
 	char name[MAX_CGROUP_TYPE_NAMELEN + MAX_CFTYPE_NAME + 2] = { 0 };
-	if (subsys && !test_bit(ROOT_NOPREFIX, &cgrp->root->flags)) {
-		strcpy(name, subsys->name);
-		strcat(name, ".");
+	if (!test_bit(ROOT_NOPREFIX, &cgrp->root->flags)) {
+		if (subsys) {
+			strcpy(name, subsys->name);
+			strcat(name, ".");
+		} else {
+			strcpy(name, "cgroup.");
+		}
 	}
 	strcat(name, cft->name);
 	BUG_ON(!mutex_is_locked(&dir->d_inode->i_mutex));
@@ -2126,6 +2131,110 @@ static u64 cgroup_read_releasable(struct
 	return test_bit(CGRP_RELEASABLE, &cgrp->flags);
 }
 
+static const struct file_operations cgroup_api_file_operations = {
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = seq_release,
+};
+
+/*
+ * cgroup.api is a file in each cgroup directory that gives the types
+ * and descriptions of the various control files in that directory.
+ */
+
+static struct dentry *cgroup_api_advance(struct dentry *d, int advance)
+{
+	struct dentry *parent = d->d_parent;
+	struct list_head *l = &d->d_u.d_child;
+	while (true) {
+		if (advance)
+			l = l->next;
+		advance = true;
+		/* Did we reach the end of the directory? */
+		if (l == &parent->d_subdirs)
+			return NULL;
+		d = container_of(l, struct dentry, d_u.d_child);
+		/* Skip cgroup subdirectories */
+		if (d->d_inode && S_ISREG(d->d_inode->i_mode))
+			return d;
+	}
+}
+
+static void *cgroup_api_start(struct seq_file *sf, loff_t *pos)
+{
+	struct dentry *parent = sf->private;
+	struct dentry *d;
+	loff_t l = 0;
+	spin_lock(&dcache_lock);
+	if (list_empty(&parent->d_subdirs))
+		return NULL;
+	d = container_of(parent->d_subdirs.next, struct dentry, d_u.d_child);
+	d = cgroup_api_advance(d, 0);
+	while (d && l < *pos) {
+		(*pos)++;
+		d = cgroup_api_advance(d, 1);
+	}
+
+	return d;
+}
+
+static void *cgroup_api_next(struct seq_file *sf, void *v, loff_t *pos)
+{
+	struct dentry *d = v;
+	(*pos)++;
+	return cgroup_api_advance(d, 1);
+}
+
+static void cgroup_api_stop(struct seq_file *sf, void *v)
+{
+	spin_unlock(&dcache_lock);
+}
+
+static const char *cft_type_names[] = {
+	"unknown",
+	"void",
+	"u64",
+	"string",
+	"u64map",
+};
+
+static int cgroup_api_show(struct seq_file *sf, void *v)
+{
+	struct dentry *d = v;
+	struct cftype *cft = __d_cft(d);
+	unsigned type = cft->type;
+	if (type == CGROUP_FILE_UNKNOWN) {
+		if (cft->read_uint)
+			type = CGROUP_FILE_U64;
+		else if (cft->read)
+			type = CGROUP_FILE_STRING;
+		else if (!cft->open)
+			type = CGROUP_FILE_VOID;
+	}
+	if (type >= ARRAY_SIZE(cft_type_names))
+		type = CGROUP_FILE_UNKNOWN;
+	return seq_printf(sf, "%s\t%s\t%s\n", d->d_name.name,
+			  cft_type_names[type], cft->desc ?: "");
+}
+
+static const struct seq_operations cgroup_api_seqop = {
+	.start = cgroup_api_start,
+	.next = cgroup_api_next,
+	.stop = cgroup_api_stop,
+	.show = cgroup_api_show,
+};
+
+static int cgroup_api_open(struct inode *inode, struct file *file)
+{
+	int ret = seq_open(file, &cgroup_api_seqop);
+	if (ret == 0) {
+		struct seq_file *sf = file->private_data;
+		sf->private = file->f_dentry->d_parent;
+		file->f_op = &cgroup_api_file_operations;
+	}
+	return ret;
+}
+
 /*
  * for the common functions, 'private' gives the type of file
  */
@@ -2137,6 +2246,7 @@ static struct cftype files[] = {
 		.write = cgroup_common_file_write,
 		.release = cgroup_tasks_release,
 		.private = FILE_TASKLIST,
+		.desc = "Thread ids of threads in this cgroup",
 	},
 
 	{
@@ -2144,13 +2254,23 @@ static struct cftype files[] = {
 		.read_uint = cgroup_read_notify_on_release,
 		.write = cgroup_common_file_write,
 		.private = FILE_NOTIFY_ON_RELEASE,
+		.desc =
+		"Should the release agent trigger when this cgroup is empty",
 	},
 
 	{
 		.name = "releasable",
 		.read_uint = cgroup_read_releasable,
 		.private = FILE_RELEASABLE,
-	}
+		.desc = "Is this cgroup able to be freed when empty"
+	},
+
+	{
+		.name = "api",
+		.open = cgroup_api_open,
+		.private = FILE_API,
+		.desc = "Control file descriptions",
+	},
 };
 
 static struct cftype cft_release_agent = {
@@ -2158,6 +2278,7 @@ static struct cftype cft_release_agent =
 	.read = cgroup_common_file_read,
 	.write = cgroup_common_file_write,
 	.private = FILE_RELEASE_AGENT,
+	.desc = "Path to release agent binary",
 };
 
 static int cgroup_populate_dir(struct cgroup *cgrp)

--

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH 2/7] CGroup API: Add cgroup map data type
  2008-02-15 20:44 [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files Paul Menage
  2008-02-15 20:44 ` [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file Paul Menage
@ 2008-02-15 20:44 ` Paul Menage
  2008-02-15 20:44 ` [RFC][PATCH 3/7] CGroup API: Use cgroup map for memcontrol stats file Paul Menage
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Paul Menage @ 2008-02-15 20:44 UTC (permalink / raw)
  To: balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa, akpm
  Cc: containers, linux-kernel

[-- Attachment #1: cgroup_read_map.patch --]
[-- Type: text/plain, Size: 4017 bytes --]

Adds a new type of supported control file representation, a map from
strings to u64 values.

Signed-off-by: Paul Menage <menage@google.com>

---
 include/linux/cgroup.h |   19 +++++++++++++++
 kernel/cgroup.c        |   61 ++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 79 insertions(+), 1 deletion(-)

Index: cgroupmap-2.6.24-mm1/include/linux/cgroup.h
===================================================================
--- cgroupmap-2.6.24-mm1.orig/include/linux/cgroup.h
+++ cgroupmap-2.6.24-mm1/include/linux/cgroup.h
@@ -191,6 +191,17 @@ enum cgroup_file_type {
 	CGROUP_FILE_VOID,
 	CGROUP_FILE_U64,
 	CGROUP_FILE_STRING,
+	CGROUP_FILE_MAP,
+};
+
+/*
+ * cgroup_map_cb is an abstract callback API for reporting map-valued
+ * control files
+ */
+
+struct cgroup_map_cb {
+	int (*fill)(struct cgroup_map_cb *cb, const char *key, u64 value);
+	void *state;
 };
 
 #define MAX_CFTYPE_NAME 64
@@ -215,6 +226,14 @@ struct cftype {
 	 * single integer. Use it in place of read()
 	 */
 	u64 (*read_uint) (struct cgroup *cont, struct cftype *cft);
+	/*
+	 * read_map() is used for defining a map of key/value
+	 * pairs. It should call cb->fill(cb, key, value) for each
+	 * entry.
+	 */
+	int (*read_map) (struct cgroup *cont, struct cftype *cft,
+			 struct cgroup_map_cb *cb);
+
 	ssize_t (*write) (struct cgroup *cont, struct cftype *cft,
 			  struct file *file,
 			  const char __user *buf, size_t nbytes, loff_t *ppos);
Index: cgroupmap-2.6.24-mm1/kernel/cgroup.c
===================================================================
--- cgroupmap-2.6.24-mm1.orig/kernel/cgroup.c
+++ cgroupmap-2.6.24-mm1/kernel/cgroup.c
@@ -1488,6 +1488,46 @@ static ssize_t cgroup_file_read(struct f
 	return -EINVAL;
 }
 
+/*
+ * seqfile ops/methods for returning structured data. Currently just
+ * supports string->u64 maps, but can be extended in future.
+ */
+
+struct cgroup_seqfile_state {
+	struct cftype *cft;
+	struct cgroup *cgroup;
+};
+
+static int cgroup_map_add(struct cgroup_map_cb *cb, const char *key, u64 value)
+{
+	struct seq_file *sf = cb->state;
+	return seq_printf(sf, "%s: %llu\n", key, value);
+}
+
+static int cgroup_seqfile_show(struct seq_file *m, void *arg)
+{
+	struct cgroup_seqfile_state *state = m->private;
+	struct cftype *cft = state->cft;
+	struct cgroup_map_cb cb = {
+		.fill = cgroup_map_add,
+		.state = m,
+	};
+	if (cft->read_map) {
+		return cft->read_map(state->cgroup, cft, &cb);
+	} else {
+		BUG();
+	}
+}
+
+int cgroup_seqfile_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *seq = file->private_data;
+	kfree(seq->private);
+	return single_release(inode, file);
+}
+
+static struct file_operations cgroup_seqfile_operations;
+
 static int cgroup_file_open(struct inode *inode, struct file *file)
 {
 	int err;
@@ -1500,7 +1540,18 @@ static int cgroup_file_open(struct inode
 	cft = __d_cft(file->f_dentry);
 	if (!cft)
 		return -ENODEV;
-	if (cft->open)
+	if (cft->read_map) {
+		struct cgroup_seqfile_state *state =
+			kzalloc(sizeof(*state), GFP_USER);
+		if (!state)
+			return -ENOMEM;
+		state->cft = cft;
+		state->cgroup = __d_cgrp(file->f_dentry->d_parent);
+		file->f_op = &cgroup_seqfile_operations;
+		err = single_open(file, cgroup_seqfile_show, state);
+		if (err < 0)
+			kfree(state);
+	} else if (cft->open)
 		err = cft->open(inode, file);
 	else
 		err = 0;
@@ -1539,6 +1590,12 @@ static struct file_operations cgroup_fil
 	.release = cgroup_file_release,
 };
 
+static struct file_operations cgroup_seqfile_operations = {
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = cgroup_seqfile_release,
+};
+
 static struct inode_operations cgroup_dir_inode_operations = {
 	.lookup = simple_lookup,
 	.mkdir = cgroup_mkdir,
@@ -2206,6 +2263,8 @@ static int cgroup_api_show(struct seq_fi
 	if (type == CGROUP_FILE_UNKNOWN) {
 		if (cft->read_uint)
 			type = CGROUP_FILE_U64;
+		else if (cft->read_map)
+			type = CGROUP_FILE_MAP;
 		else if (cft->read)
 			type = CGROUP_FILE_STRING;
 		else if (!cft->open)

--

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH 3/7] CGroup API: Use cgroup map for memcontrol stats file
  2008-02-15 20:44 [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files Paul Menage
  2008-02-15 20:44 ` [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file Paul Menage
  2008-02-15 20:44 ` [RFC][PATCH 2/7] CGroup API: Add cgroup map data type Paul Menage
@ 2008-02-15 20:44 ` Paul Menage
  2008-02-15 20:44 ` [RFC][PATCH 4/7] CGroup API: Add res_counter_read_uint() Paul Menage
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Paul Menage @ 2008-02-15 20:44 UTC (permalink / raw)
  To: balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa, akpm
  Cc: containers, linux-kernel

[-- Attachment #1: memcontrol_use_cgroup_map.patch --]
[-- Type: text/plain, Size: 2370 bytes --]

Remove the seq_file boilerplate used to construct the memcontrol stats
map, and instead use the new map representation for cgroup control
files

Signed-off-by: Paul Menage <menage@google.com>

---
 mm/memcontrol.c |   30 ++++++------------------------
 1 file changed, 6 insertions(+), 24 deletions(-)

Index: cgroupmap-2.6.24-mm1/mm/memcontrol.c
===================================================================
--- cgroupmap-2.6.24-mm1.orig/mm/memcontrol.c
+++ cgroupmap-2.6.24-mm1/mm/memcontrol.c
@@ -974,9 +974,9 @@ static const struct mem_cgroup_stat_desc
 	[MEM_CGROUP_STAT_RSS] = { "rss", PAGE_SIZE, },
 };
 
-static int mem_control_stat_show(struct seq_file *m, void *arg)
+static int mem_control_stat_show(struct cgroup *cont, struct cftype *cft,
+				 struct cgroup_map_cb *cb)
 {
-	struct cgroup *cont = m->private;
 	struct mem_cgroup *mem_cont = mem_cgroup_from_cont(cont);
 	struct mem_cgroup_stat *stat = &mem_cont->stat;
 	int i;
@@ -986,8 +986,7 @@ static int mem_control_stat_show(struct 
 
 		val = mem_cgroup_read_stat(stat, i);
 		val *= mem_cgroup_stat_desc[i].unit;
-		seq_printf(m, "%s %lld\n", mem_cgroup_stat_desc[i].msg,
-				(long long)val);
+		cb->fill(cb, mem_cgroup_stat_desc[i].msg, val);
 	}
 	/* showing # of active pages */
 	{
@@ -997,29 +996,12 @@ static int mem_control_stat_show(struct 
 						MEM_CGROUP_ZSTAT_INACTIVE);
 		active = mem_cgroup_get_all_zonestat(mem_cont,
 						MEM_CGROUP_ZSTAT_ACTIVE);
-		seq_printf(m, "active %ld\n", (active) * PAGE_SIZE);
-		seq_printf(m, "inactive %ld\n", (inactive) * PAGE_SIZE);
+		cb->fill(cb, "active", (active) * PAGE_SIZE);
+		cb->fill(cb, "inactive", (inactive) * PAGE_SIZE);
 	}
 	return 0;
 }
 
-static const struct file_operations mem_control_stat_file_operations = {
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-};
-
-static int mem_control_stat_open(struct inode *unused, struct file *file)
-{
-	/* XXX __d_cont */
-	struct cgroup *cont = file->f_dentry->d_parent->d_fsdata;
-
-	file->f_op = &mem_control_stat_file_operations;
-	return single_open(file, mem_control_stat_show, cont);
-}
-
-
-
 static struct cftype mem_cgroup_files[] = {
 	{
 		.name = "usage_in_bytes",
@@ -1044,7 +1026,7 @@ static struct cftype mem_cgroup_files[] 
 	},
 	{
 		.name = "stat",
-		.open = mem_control_stat_open,
+		.read_map = mem_control_stat_show,
 	},
 };
 

--

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH 4/7] CGroup API: Add res_counter_read_uint()
  2008-02-15 20:44 [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files Paul Menage
                   ` (2 preceding siblings ...)
  2008-02-15 20:44 ` [RFC][PATCH 3/7] CGroup API: Use cgroup map for memcontrol stats file Paul Menage
@ 2008-02-15 20:44 ` Paul Menage
  2008-02-15 20:44 ` [RFC][PATCH 5/7] CGroup API: Use read_uint in memory controller Paul Menage
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Paul Menage @ 2008-02-15 20:44 UTC (permalink / raw)
  To: balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa, akpm
  Cc: containers, linux-kernel

[-- Attachment #1: resource_counter_read_uint.patch --]
[-- Type: text/plain, Size: 1640 bytes --]

Adds a function for returning the value of a resource counter member,
in a form suitable for use in a cgroup read_uint control file method.

Signed-off-by: Paul Menage <menage@google.com>

---
 include/linux/res_counter.h |    1 +
 kernel/res_counter.c        |    5 +++++
 2 files changed, 6 insertions(+)

Index: cgroupmap-2.6.24-mm1/include/linux/res_counter.h
===================================================================
--- cgroupmap-2.6.24-mm1.orig/include/linux/res_counter.h
+++ cgroupmap-2.6.24-mm1/include/linux/res_counter.h
@@ -54,6 +54,7 @@ struct res_counter {
 ssize_t res_counter_read(struct res_counter *counter, int member,
 		const char __user *buf, size_t nbytes, loff_t *pos,
 		int (*read_strategy)(unsigned long long val, char *s));
+u64 res_counter_read_uint(struct res_counter *counter, int member);
 ssize_t res_counter_write(struct res_counter *counter, int member,
 		const char __user *buf, size_t nbytes, loff_t *pos,
 		int (*write_strategy)(char *buf, unsigned long long *val));
Index: cgroupmap-2.6.24-mm1/kernel/res_counter.c
===================================================================
--- cgroupmap-2.6.24-mm1.orig/kernel/res_counter.c
+++ cgroupmap-2.6.24-mm1/kernel/res_counter.c
@@ -92,6 +92,11 @@ ssize_t res_counter_read(struct res_coun
 			pos, buf, s - buf);
 }
 
+u64 res_counter_read_uint(struct res_counter *counter, int member)
+{
+	return *res_counter_member(counter, member);
+}
+
 ssize_t res_counter_write(struct res_counter *counter, int member,
 		const char __user *userbuf, size_t nbytes, loff_t *pos,
 		int (*write_strategy)(char *st_buf, unsigned long long *val))

--

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH 5/7] CGroup API: Use read_uint in memory controller
  2008-02-15 20:44 [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files Paul Menage
                   ` (3 preceding siblings ...)
  2008-02-15 20:44 ` [RFC][PATCH 4/7] CGroup API: Add res_counter_read_uint() Paul Menage
@ 2008-02-15 20:44 ` Paul Menage
  2008-02-15 20:44 ` [RFC][PATCH 6/7] CGroup API: Use descriptions for memory controller API files Paul Menage
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 28+ messages in thread
From: Paul Menage @ 2008-02-15 20:44 UTC (permalink / raw)
  To: balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa, akpm
  Cc: containers, linux-kernel

[-- Attachment #1: memcontrol_use_res_counter_read_uint.patch --]
[-- Type: text/plain, Size: 1646 bytes --]

Update the memory controller to use read_uint for its
limit/usage/failcnt control files, calling the new
res_counter_read_uint() function. This allows the files to show up as
u64 rather than string in the cgroup.api file.

Signed-off-by: Paul Menage <menage@google.com>

---
 mm/memcontrol.c |   15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

Index: cgroupmap-2.6.24-mm1/mm/memcontrol.c
===================================================================
--- cgroupmap-2.6.24-mm1.orig/mm/memcontrol.c
+++ cgroupmap-2.6.24-mm1/mm/memcontrol.c
@@ -922,13 +922,10 @@ int mem_cgroup_write_strategy(char *buf,
 	return 0;
 }
 
-static ssize_t mem_cgroup_read(struct cgroup *cont,
-			struct cftype *cft, struct file *file,
-			char __user *userbuf, size_t nbytes, loff_t *ppos)
+static u64 mem_cgroup_read(struct cgroup *cont, struct cftype *cft)
 {
-	return res_counter_read(&mem_cgroup_from_cont(cont)->res,
-				cft->private, userbuf, nbytes, ppos,
-				NULL);
+	return res_counter_read_uint(&mem_cgroup_from_cont(cont)->res,
+				     cft->private);
 }
 
 static ssize_t mem_cgroup_write(struct cgroup *cont, struct cftype *cft,
@@ -1006,18 +1003,18 @@ static struct cftype mem_cgroup_files[] 
 	{
 		.name = "usage_in_bytes",
 		.private = RES_USAGE,
-		.read = mem_cgroup_read,
+		.read_uint = mem_cgroup_read,
 	},
 	{
 		.name = "limit_in_bytes",
 		.private = RES_LIMIT,
 		.write = mem_cgroup_write,
-		.read = mem_cgroup_read,
+		.read_uint = mem_cgroup_read,
 	},
 	{
 		.name = "failcnt",
 		.private = RES_FAILCNT,
-		.read = mem_cgroup_read,
+		.read_uint = mem_cgroup_read,
 	},
 	{
 		.name = "force_empty",

--

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH 6/7] CGroup API: Use descriptions for memory controller API files
  2008-02-15 20:44 [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files Paul Menage
                   ` (4 preceding siblings ...)
  2008-02-15 20:44 ` [RFC][PATCH 5/7] CGroup API: Use read_uint in memory controller Paul Menage
@ 2008-02-15 20:44 ` Paul Menage
  2008-02-15 20:44 ` [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API Paul Menage
  2008-02-16  4:21 ` [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files KAMEZAWA Hiroyuki
  7 siblings, 0 replies; 28+ messages in thread
From: Paul Menage @ 2008-02-15 20:44 UTC (permalink / raw)
  To: balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa, akpm
  Cc: containers, linux-kernel

[-- Attachment #1: memcontrol-use-api.patch --]
[-- Type: text/plain, Size: 1744 bytes --]

This patch adds descriptions to the memory controller API files to
indicate that the usage/limit are in bytes; the names of the control
files can then be simplified to usage/limit.

Also removes the unnecessary mem_force_empty_read() function

Signed-off-by: Paul Menage <menage@google.com>

---
 mm/memcontrol.c |   21 +++++----------------
 1 file changed, 5 insertions(+), 16 deletions(-)

Index: cgroupmap-2.6.24-mm1/mm/memcontrol.c
===================================================================
--- cgroupmap-2.6.24-mm1.orig/mm/memcontrol.c
+++ cgroupmap-2.6.24-mm1/mm/memcontrol.c
@@ -950,19 +950,6 @@ static ssize_t mem_force_empty_write(str
 	return ret;
 }
 
-/*
- * Note: This should be removed if cgroup supports write-only file.
- */
-
-static ssize_t mem_force_empty_read(struct cgroup *cont,
-				struct cftype *cft,
-				struct file *file, char __user *userbuf,
-				size_t nbytes, loff_t *ppos)
-{
-	return -EINVAL;
-}
-
-
 static const struct mem_cgroup_stat_desc {
 	const char *msg;
 	u64 unit;
@@ -1001,15 +988,17 @@ static int mem_control_stat_show(struct 
 
 static struct cftype mem_cgroup_files[] = {
 	{
-		.name = "usage_in_bytes",
+		.name = "usage",
 		.private = RES_USAGE,
 		.read_uint = mem_cgroup_read,
+		.desc = "Memory usage in bytes",
 	},
 	{
-		.name = "limit_in_bytes",
+		.name = "limit",
 		.private = RES_LIMIT,
 		.write = mem_cgroup_write,
 		.read_uint = mem_cgroup_read,
+		.desc = "Memory limit in bytes",
 	},
 	{
 		.name = "failcnt",
@@ -1019,7 +1008,7 @@ static struct cftype mem_cgroup_files[] 
 	{
 		.name = "force_empty",
 		.write = mem_force_empty_write,
-		.read = mem_force_empty_read,
+		.desc = "Write to this file to forget all memory charges"
 	},
 	{
 		.name = "stat",

--

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API
  2008-02-15 20:44 [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files Paul Menage
                   ` (5 preceding siblings ...)
  2008-02-15 20:44 ` [RFC][PATCH 6/7] CGroup API: Use descriptions for memory controller API files Paul Menage
@ 2008-02-15 20:44 ` Paul Menage
  2008-02-17  3:29   ` Paul Jackson
  2008-02-16  4:21 ` [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files KAMEZAWA Hiroyuki
  7 siblings, 1 reply; 28+ messages in thread
From: Paul Menage @ 2008-02-15 20:44 UTC (permalink / raw)
  To: balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa, akpm
  Cc: containers, linux-kernel

[-- Attachment #1: cpuset_files.patch --]
[-- Type: text/plain, Size: 7756 bytes --]

Many of the cpusets control files are simple integer values, which
don't require the overhead of memory allocations for reads and writes.

Move the handlers for these control files into cpuset_read_uint() and
cpuset_write_uint(). This also has the advantage that the control
files show up as "u64" rather than "string" in the cgroup.api file.

Signed-off-by: Paul Menage <menage@google.com>

---
 kernel/cpuset.c |  158 +++++++++++++++++++++++++++++---------------------------
 1 file changed, 83 insertions(+), 75 deletions(-)

Index: cgroupmap-2.6.24-mm1/kernel/cpuset.c
===================================================================
--- cgroupmap-2.6.24-mm1.orig/kernel/cpuset.c
+++ cgroupmap-2.6.24-mm1/kernel/cpuset.c
@@ -999,19 +999,6 @@ int current_cpuset_is_being_rebound(void
 }
 
 /*
- * Call with cgroup_mutex held.
- */
-
-static int update_memory_pressure_enabled(struct cpuset *cs, char *buf)
-{
-	if (simple_strtoul(buf, NULL, 10) != 0)
-		cpuset_memory_pressure_enabled = 1;
-	else
-		cpuset_memory_pressure_enabled = 0;
-	return 0;
-}
-
-/*
  * update_flag - read a 0 or a 1 in a file and update associated flag
  * bit:	the bit to update (CS_CPU_EXCLUSIVE, CS_MEM_EXCLUSIVE,
  *				CS_SCHED_LOAD_BALANCE,
@@ -1023,15 +1010,13 @@ static int update_memory_pressure_enable
  * Call with cgroup_mutex held.
  */
 
-static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs, char *buf)
+static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
+		       int turning_on)
 {
-	int turning_on;
 	struct cpuset trialcs;
 	int err;
 	int cpus_nonempty, balance_flag_changed;
 
-	turning_on = (simple_strtoul(buf, NULL, 10) != 0);
-
 	trialcs = *cs;
 	if (turning_on)
 		set_bit(bit, &trialcs.flags);
@@ -1247,44 +1232,66 @@ static ssize_t cpuset_common_file_write(
 	case FILE_MEMLIST:
 		retval = update_nodemask(cs, buffer);
 		break;
+	default:
+		retval = -EINVAL;
+		goto out2;
+	}
+
+	if (retval == 0)
+		retval = nbytes;
+out2:
+	cgroup_unlock();
+out1:
+	kfree(buffer);
+	return retval;
+}
+
+static int cpuset_write_uint(struct cgroup *cgrp, struct cftype *cft, u64 val)
+{
+	int retval = 0;
+	struct cpuset *cs = cgroup_cs(cgrp);
+	cpuset_filetype_t type = cft->private;
+
+	cgroup_lock();
+
+	if (cgroup_is_removed(cgrp)) {
+		cgroup_unlock();
+		return -ENODEV;
+	}
+
+	switch (type) {
 	case FILE_CPU_EXCLUSIVE:
-		retval = update_flag(CS_CPU_EXCLUSIVE, cs, buffer);
+		retval = update_flag(CS_CPU_EXCLUSIVE, cs, val);
 		break;
 	case FILE_MEM_EXCLUSIVE:
-		retval = update_flag(CS_MEM_EXCLUSIVE, cs, buffer);
+		retval = update_flag(CS_MEM_EXCLUSIVE, cs, val);
 		break;
 	case FILE_SCHED_LOAD_BALANCE:
-		retval = update_flag(CS_SCHED_LOAD_BALANCE, cs, buffer);
+		retval = update_flag(CS_SCHED_LOAD_BALANCE, cs, val);
 		break;
 	case FILE_MEMORY_MIGRATE:
-		retval = update_flag(CS_MEMORY_MIGRATE, cs, buffer);
+		retval = update_flag(CS_MEMORY_MIGRATE, cs, val);
 		break;
 	case FILE_MEMORY_PRESSURE_ENABLED:
-		retval = update_memory_pressure_enabled(cs, buffer);
+		cpuset_memory_pressure_enabled = val;
 		break;
 	case FILE_MEMORY_PRESSURE:
 		retval = -EACCES;
 		break;
 	case FILE_SPREAD_PAGE:
-		retval = update_flag(CS_SPREAD_PAGE, cs, buffer);
+		retval = update_flag(CS_SPREAD_PAGE, cs, val);
 		cs->mems_generation = cpuset_mems_generation++;
 		break;
 	case FILE_SPREAD_SLAB:
-		retval = update_flag(CS_SPREAD_SLAB, cs, buffer);
+		retval = update_flag(CS_SPREAD_SLAB, cs, val);
 		cs->mems_generation = cpuset_mems_generation++;
 		break;
 	default:
 		retval = -EINVAL;
-		goto out2;
+		break;
 	}
-
-	if (retval == 0)
-		retval = nbytes;
-out2:
 	cgroup_unlock();
-out1:
-	kfree(buffer);
-	return retval;
+	return -EINVAL;
 }
 
 /*
@@ -1345,30 +1352,6 @@ static ssize_t cpuset_common_file_read(s
 	case FILE_MEMLIST:
 		s += cpuset_sprintf_memlist(s, cs);
 		break;
-	case FILE_CPU_EXCLUSIVE:
-		*s++ = is_cpu_exclusive(cs) ? '1' : '0';
-		break;
-	case FILE_MEM_EXCLUSIVE:
-		*s++ = is_mem_exclusive(cs) ? '1' : '0';
-		break;
-	case FILE_SCHED_LOAD_BALANCE:
-		*s++ = is_sched_load_balance(cs) ? '1' : '0';
-		break;
-	case FILE_MEMORY_MIGRATE:
-		*s++ = is_memory_migrate(cs) ? '1' : '0';
-		break;
-	case FILE_MEMORY_PRESSURE_ENABLED:
-		*s++ = cpuset_memory_pressure_enabled ? '1' : '0';
-		break;
-	case FILE_MEMORY_PRESSURE:
-		s += sprintf(s, "%d", fmeter_getrate(&cs->fmeter));
-		break;
-	case FILE_SPREAD_PAGE:
-		*s++ = is_spread_page(cs) ? '1' : '0';
-		break;
-	case FILE_SPREAD_SLAB:
-		*s++ = is_spread_slab(cs) ? '1' : '0';
-		break;
 	default:
 		retval = -EINVAL;
 		goto out;
@@ -1381,8 +1364,32 @@ out:
 	return retval;
 }
 
-
-
+static u64 cpuset_read_uint(struct cgroup *cont, struct cftype *cft)
+{
+	struct cpuset *cs = cgroup_cs(cont);
+	cpuset_filetype_t type = cft->private;
+	switch (type) {
+	case FILE_CPU_EXCLUSIVE:
+		return is_cpu_exclusive(cs);
+	case FILE_MEM_EXCLUSIVE:
+		return is_mem_exclusive(cs);
+	case FILE_SCHED_LOAD_BALANCE:
+		return is_sched_load_balance(cs);
+	case FILE_MEMORY_MIGRATE:
+		return is_memory_migrate(cs);
+	case FILE_MEMORY_PRESSURE_ENABLED:
+		return cpuset_memory_pressure_enabled;
+	case FILE_MEMORY_PRESSURE:
+		return fmeter_getrate(&cs->fmeter);
+		break;
+	case FILE_SPREAD_PAGE:
+		return is_spread_page(cs);
+	case FILE_SPREAD_SLAB:
+		return is_spread_slab(cs);
+	default:
+		BUG();
+	}
+}
 
 
 /*
@@ -1405,57 +1412,58 @@ static struct cftype cft_mems = {
 
 static struct cftype cft_cpu_exclusive = {
 	.name = "cpu_exclusive",
-	.read = cpuset_common_file_read,
-	.write = cpuset_common_file_write,
+	.read_uint = cpuset_read_uint,
+	.write_uint = cpuset_write_uint,
 	.private = FILE_CPU_EXCLUSIVE,
 };
 
 static struct cftype cft_mem_exclusive = {
 	.name = "mem_exclusive",
-	.read = cpuset_common_file_read,
-	.write = cpuset_common_file_write,
+	.read_uint = cpuset_read_uint,
+	.write_uint = cpuset_write_uint,
 	.private = FILE_MEM_EXCLUSIVE,
 };
 
 static struct cftype cft_sched_load_balance = {
 	.name = "sched_load_balance",
-	.read = cpuset_common_file_read,
-	.write = cpuset_common_file_write,
+	.read_uint = cpuset_read_uint,
+	.write_uint = cpuset_write_uint,
 	.private = FILE_SCHED_LOAD_BALANCE,
 };
 
 static struct cftype cft_memory_migrate = {
 	.name = "memory_migrate",
-	.read = cpuset_common_file_read,
-	.write = cpuset_common_file_write,
+	.read_uint = cpuset_read_uint,
+	.read_uint = cpuset_read_uint,
+	.write_uint = cpuset_write_uint,
 	.private = FILE_MEMORY_MIGRATE,
 };
 
 static struct cftype cft_memory_pressure_enabled = {
 	.name = "memory_pressure_enabled",
-	.read = cpuset_common_file_read,
-	.write = cpuset_common_file_write,
+	.read_uint = cpuset_read_uint,
+	.write_uint = cpuset_write_uint,
 	.private = FILE_MEMORY_PRESSURE_ENABLED,
 };
 
 static struct cftype cft_memory_pressure = {
 	.name = "memory_pressure",
-	.read = cpuset_common_file_read,
-	.write = cpuset_common_file_write,
+	.read_uint = cpuset_read_uint,
+	.write_uint = cpuset_write_uint,
 	.private = FILE_MEMORY_PRESSURE,
 };
 
 static struct cftype cft_spread_page = {
 	.name = "memory_spread_page",
-	.read = cpuset_common_file_read,
-	.write = cpuset_common_file_write,
+	.read_uint = cpuset_read_uint,
+	.write_uint = cpuset_write_uint,
 	.private = FILE_SPREAD_PAGE,
 };
 
 static struct cftype cft_spread_slab = {
 	.name = "memory_spread_slab",
-	.read = cpuset_common_file_read,
-	.write = cpuset_common_file_write,
+	.read_uint = cpuset_read_uint,
+	.write_uint = cpuset_write_uint,
 	.private = FILE_SPREAD_SLAB,
 };
 
@@ -1584,7 +1592,7 @@ static void cpuset_destroy(struct cgroup
 	cpuset_update_task_memory_state();
 
 	if (is_sched_load_balance(cs))
-		update_flag(CS_SCHED_LOAD_BALANCE, cs, "0");
+		update_flag(CS_SCHED_LOAD_BALANCE, cs, 0);
 
 	number_of_cpusets--;
 	kfree(cs);

--

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files
  2008-02-15 20:44 [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files Paul Menage
                   ` (6 preceding siblings ...)
  2008-02-15 20:44 ` [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API Paul Menage
@ 2008-02-16  4:21 ` KAMEZAWA Hiroyuki
  2008-02-16  9:31   ` Li Zefan
  7 siblings, 1 reply; 28+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-02-16  4:21 UTC (permalink / raw)
  To: Paul Menage
  Cc: balbir, pj, Pavel Emelianov, vatsa, akpm, containers, linux-kernel

On Fri, 15 Feb 2008 12:44:18 -0800
Paul Menage <menage@google.com> wrote:

> 
> This set of patches makes the Control Groups API more structured and
> self-describing.
> 
> 1) Allows control files to be associated with data types such as
> "u64", "string", "map", etc. These types show up in a new cgroup.api
> file in each cgroup directory, along with a user-readable
> string. Files that use cgroup-provided data accessors have these file
> types inferred automatically.
> 
> 2) Moves various files in cpusets and the memory controller from using
> custom-written file handlers to cgroup-defined handlers
> 
> 3) Adds the "cgroup." prefix for existing cgroup-provided control
> files (tasks, release_agent, releasable, notify_on_release). Given
> than we've already had 2.6.24 go out without this prefix, I guess this
> could be a little contentious - but it seems like a good move to
> prevent name clashes in the future. (Note that this doesn't affect
> mounting the legacy cpuset filesystem, since the compatibility layer
> disables all prefixes when mounted with filesystem type "cpuset"). If
> people object too strongly, we could just make this the case for *new*
> cgroup API files, but I think this is a case where consistency would
> be better than compatibility - I'd be surprised if anyone has written
> major legacy apps yet that rely on 2.6.24 cgroup control file names.
> 


Hi,  I like this direction very much. thank you for your work.
Self-describing cgroup.api file is a good idea!

One request from me is add "mode" bit to cftype for allowing
write-only/read-only files. 

Thanks,
-Kame





^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files
  2008-02-16  4:21 ` [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files KAMEZAWA Hiroyuki
@ 2008-02-16  9:31   ` Li Zefan
  2008-02-16 17:40     ` Paul Menage
  0 siblings, 1 reply; 28+ messages in thread
From: Li Zefan @ 2008-02-16  9:31 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Paul Menage, balbir, pj, Pavel Emelianov, vatsa, akpm,
	containers, linux-kernel

KAMEZAWA Hiroyuki wrote:
> On Fri, 15 Feb 2008 12:44:18 -0800
> Paul Menage <menage@google.com> wrote:
> 
>> This set of patches makes the Control Groups API more structured and
>> self-describing.
>>
>> 1) Allows control files to be associated with data types such as
>> "u64", "string", "map", etc. These types show up in a new cgroup.api
>> file in each cgroup directory, along with a user-readable
>> string. Files that use cgroup-provided data accessors have these file
>> types inferred automatically.
>>
>> 2) Moves various files in cpusets and the memory controller from using
>> custom-written file handlers to cgroup-defined handlers
>>
>> 3) Adds the "cgroup." prefix for existing cgroup-provided control
>> files (tasks, release_agent, releasable, notify_on_release). Given
>> than we've already had 2.6.24 go out without this prefix, I guess this
>> could be a little contentious - but it seems like a good move to
>> prevent name clashes in the future. (Note that this doesn't affect
>> mounting the legacy cpuset filesystem, since the compatibility layer
>> disables all prefixes when mounted with filesystem type "cpuset"). If
>> people object too strongly, we could just make this the case for *new*
>> cgroup API files, but I think this is a case where consistency would
>> be better than compatibility - I'd be surprised if anyone has written
>> major legacy apps yet that rely on 2.6.24 cgroup control file names.
>>
> 
> 
> Hi,  I like this direction very much. thank you for your work.
> Self-describing cgroup.api file is a good idea!
> 
> One request from me is add "mode" bit to cftype for allowing
> write-only/read-only files. 
> 
> Thanks,
> -Kame
> 

I don't quite catch what you mean. Cgoup does support write-only/read-only
files. For a write-only file, just set .write and .write_uint to be NULL,
similar for a read-only file.

Do I miss something?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file
  2008-02-15 20:44 ` [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file Paul Menage
@ 2008-02-16 10:07   ` Balbir Singh
  2008-02-16 17:44     ` Paul Menage
  0 siblings, 1 reply; 28+ messages in thread
From: Balbir Singh @ 2008-02-16 10:07 UTC (permalink / raw)
  To: Paul Menage
  Cc: balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa, akpm,
	containers, linux-kernel

Paul Menage wrote:

Hi, Paul,

Do we need to use a cgroup.api file? Why not keep up to date documentation and
get users to use that. I fear that, cgroup.api will not be kept up-to-date,
leading to confusion.

Why should the kernel carry so much of documentation in the image as strings?

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files
  2008-02-16  9:31   ` Li Zefan
@ 2008-02-16 17:40     ` Paul Menage
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Menage @ 2008-02-16 17:40 UTC (permalink / raw)
  To: Li Zefan
  Cc: KAMEZAWA Hiroyuki, balbir, pj, Pavel Emelianov, vatsa, akpm,
	containers, linux-kernel

On Feb 16, 2008 1:31 AM, Li Zefan <lizf@cn.fujitsu.com> wrote:
>
> I don't quite catch what you mean. Cgoup does support write-only/read-only
> files. For a write-only file, just set .write and .write_uint to be NULL,
> similar for a read-only file.
>
> Do I miss something?
>

I suppose we could infer from the lack of any write handlers that we
should give the file in the filesystem a mode of 444 rather 644.

Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file
  2008-02-16 10:07   ` Balbir Singh
@ 2008-02-16 17:44     ` Paul Menage
  2008-02-18  9:45       ` Li Zefan
  0 siblings, 1 reply; 28+ messages in thread
From: Paul Menage @ 2008-02-16 17:44 UTC (permalink / raw)
  To: balbir
  Cc: balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa, akpm,
	containers, linux-kernel

On Feb 16, 2008 2:07 AM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> Paul Menage wrote:
>
> Hi, Paul,
>
> Do we need to use a cgroup.api file? Why not keep up to date documentation and
> get users to use that. I fear that, cgroup.api will not be kept up-to-date,
> leading to confusion.

The cgroup.api file isn't meant to give complete documentation for a
control file, simply a brief indication of its usage.

The aim is that most bits of the information reported in cgroup.api
are auto-generated, so there shouldn't be problems with it getting
out-of-date.

Is it just the space used by the documentation string that you're
objecting to? The other function of the file is to declare a type for
each variable.

Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API
  2008-02-15 20:44 ` [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API Paul Menage
@ 2008-02-17  3:29   ` Paul Jackson
  2008-02-17 17:18     ` Paul Menage
  2008-02-18  9:55     ` Li Zefan
  0 siblings, 2 replies; 28+ messages in thread
From: Paul Jackson @ 2008-02-17  3:29 UTC (permalink / raw)
  To: Paul Menage
  Cc: balbir, xemul, kamezawa.hiroyu, vatsa, akpm, containers, linux-kernel

Ok ... this would (I suspect, just from code reading, no bytes were
harmed in actual testing of this) have a minor change to how white
space is handled writing integer flags to cpuset files, and a minor
inconstency.

 1) Existing cpuset code lets you set a flag (e.g. cpu_exclusive) by doing:
	echo '1 rumplestiltskin' > cpu_exclusive   # same as: echo 1 > cpu_exclusive
    With this patch, that probably fails, EINVAL.

 2) With this patch, one can write "1" or "1\n" to cpuset integer files, but one
    cannot successfully write "1\r\n" or "1 " or "1 \n".  However, for the cpuset
    control files that take strings, not single integers, one -can- have any mix
    of trailing white space.

So far as I know, I have no requirement to write rumplestiltskin to cpuset files ;).
So I'm content to let the minor change in (1) pass without further comment.

I'd like to recommend consideration of the following patch, to address the
minor inconsistency of (2), and to save a few bytes of kernel text space.

=======

From: Paul Jackson <pj@sgi.com>

Strip all trailing whitespace (such as carriage returns)
when parsing integer writes to cgroup files, not just
one trailing newline if present.

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>

---
 kernel/cgroup.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- 2.6.24-mm1.orig/kernel/cgroup.c	2008-02-16 04:20:33.000000000 -0800
+++ 2.6.24-mm1/kernel/cgroup.c	2008-02-16 19:00:41.207478218 -0800
@@ -1321,10 +1321,7 @@ static ssize_t cgroup_write_uint(struct 
 		return -EFAULT;
 
 	buffer[nbytes] = 0;     /* nul-terminate */
-
-	/* strip newline if necessary */
-	if (nbytes && (buffer[nbytes-1] == '\n'))
-		buffer[nbytes-1] = 0;
+	strstrip(buffer);	/* strip -just- trailing whitespace */
 	val = simple_strtoull(buffer, &end, 0);
 	if (*end)
 		return -EINVAL;


-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API
  2008-02-17  3:29   ` Paul Jackson
@ 2008-02-17 17:18     ` Paul Menage
  2008-02-17 17:28       ` Paul Jackson
  2008-02-18  9:55     ` Li Zefan
  1 sibling, 1 reply; 28+ messages in thread
From: Paul Menage @ 2008-02-17 17:18 UTC (permalink / raw)
  To: Paul Jackson
  Cc: balbir, xemul, kamezawa.hiroyu, vatsa, akpm, containers, linux-kernel

On Feb 16, 2008 7:29 PM, Paul Jackson <pj@sgi.com> wrote:
>
> From: Paul Jackson <pj@sgi.com>
>
> Strip all trailing whitespace (such as carriage returns)
> when parsing integer writes to cgroup files, not just
> one trailing newline if present.

Sounds like a good idea to me. Thanks for this.

>
> Signed-off-by: Paul Jackson <pj@sgi.com>
> Cc: Paul Menage <menage@google.com>

Acked-by: Paul Menage <menage@google.com>

>
> ---
>  kernel/cgroup.c |    5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> --- 2.6.24-mm1.orig/kernel/cgroup.c     2008-02-16 04:20:33.000000000 -0800
> +++ 2.6.24-mm1/kernel/cgroup.c  2008-02-16 19:00:41.207478218 -0800
> @@ -1321,10 +1321,7 @@ static ssize_t cgroup_write_uint(struct
>                 return -EFAULT;
>
>         buffer[nbytes] = 0;     /* nul-terminate */
> -
> -       /* strip newline if necessary */
> -       if (nbytes && (buffer[nbytes-1] == '\n'))
> -               buffer[nbytes-1] = 0;
> +       strstrip(buffer);       /* strip -just- trailing whitespace */
>         val = simple_strtoull(buffer, &end, 0);
>         if (*end)
>                 return -EINVAL;
>
>
> --
>                   I won't rest till it's the best ...
>                   Programmer, Linux Scalability
>                   Paul Jackson <pj@sgi.com> 1.940.382.4214
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API
  2008-02-17 17:18     ` Paul Menage
@ 2008-02-17 17:28       ` Paul Jackson
  2008-02-17 17:48         ` Paul Menage
  0 siblings, 1 reply; 28+ messages in thread
From: Paul Jackson @ 2008-02-17 17:28 UTC (permalink / raw)
  To: Paul Menage
  Cc: balbir, xemul, kamezawa.hiroyu, vatsa, akpm, containers, linux-kernel

> > Strip all trailing whitespace (such as carriage returns)
> > when parsing integer writes to cgroup files, not just
> > one trailing newline if present.
> 
> Sounds like a good idea to me. Thanks for this.

I'm figuring it would be easiest if you just threw this
little change into your hopper for the bigger changes
you're making ... no need to preserve my name as author
of this.

But if you'd rather you or I send this separately to
Andrew Morton, that works too.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API
  2008-02-17 17:28       ` Paul Jackson
@ 2008-02-17 17:48         ` Paul Menage
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Menage @ 2008-02-17 17:48 UTC (permalink / raw)
  To: Paul Jackson
  Cc: balbir, xemul, kamezawa.hiroyu, vatsa, akpm, containers, linux-kernel

On Feb 17, 2008 9:28 AM, Paul Jackson <pj@sgi.com> wrote:
>
> I'm figuring it would be easiest if you just threw this
> little change into your hopper for the bigger changes
> you're making

OK, will do.

Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file
  2008-02-16 17:44     ` Paul Menage
@ 2008-02-18  9:45       ` Li Zefan
  2008-02-18 10:32         ` Balbir Singh
                           ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Li Zefan @ 2008-02-18  9:45 UTC (permalink / raw)
  To: Paul Menage
  Cc: balbir, balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa,
	akpm, containers, linux-kernel

Paul Menage wrote:
> On Feb 16, 2008 2:07 AM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>> Paul Menage wrote:
>>
>> Hi, Paul,
>>
>> Do we need to use a cgroup.api file? Why not keep up to date documentation and
>> get users to use that. I fear that, cgroup.api will not be kept up-to-date,
>> leading to confusion.
> 
> The cgroup.api file isn't meant to give complete documentation for a
> control file, simply a brief indication of its usage.
> 

But we don't have /proc/proc.api or /sys/sysfs.api ...

> The aim is that most bits of the information reported in cgroup.api
> are auto-generated, so there shouldn't be problems with it getting
> out-of-date.
> 
> Is it just the space used by the documentation string that you're
> objecting to? The other function of the file is to declare a type for
> each variable.
> 

[root@localhost mnt]# cat cgroup.api
debug.current_css_set_refcount  u64
debug.current_css_set   u64
debug.taskcount u64
debug.cgroup_refcount   u64
cgroup.release_agent    string  Path to release agent binary
cgroup.api      unknown Control file descriptions
cgroup.releasable       u64     Is this cgroup able to be freed when empty
cgroup.notify_on_release        u64     Should the release agent trigger when this cgroup is empty
cgroup.tasks    string  Thread ids of threads in this cgroup

It seems to me this is a little messy.

And is it better to describe the debug subsystem too?


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API
  2008-02-17  3:29   ` Paul Jackson
  2008-02-17 17:18     ` Paul Menage
@ 2008-02-18  9:55     ` Li Zefan
       [not found]       ` <47B96805.7070002@linux.vnet.ibm.com>
  1 sibling, 1 reply; 28+ messages in thread
From: Li Zefan @ 2008-02-18  9:55 UTC (permalink / raw)
  To: Paul Jackson
  Cc: Paul Menage, balbir, xemul, kamezawa.hiroyu, vatsa, akpm,
	containers, linux-kernel

Paul Jackson wrote:
> Ok ... this would (I suspect, just from code reading, no bytes were
> harmed in actual testing of this) have a minor change to how white
> space is handled writing integer flags to cpuset files, and a minor
> inconstency.
> 
>  1) Existing cpuset code lets you set a flag (e.g. cpu_exclusive) by doing:
> 	echo '1 rumplestiltskin' > cpu_exclusive   # same as: echo 1 > cpu_exclusive
>     With this patch, that probably fails, EINVAL.
> 
>  2) With this patch, one can write "1" or "1\n" to cpuset integer files, but one
>     cannot successfully write "1\r\n" or "1 " or "1 \n".  However, for the cpuset
>     control files that take strings, not single integers, one -can- have any mix
>     of trailing white space.
> 
> So far as I know, I have no requirement to write rumplestiltskin to cpuset files ;).
> So I'm content to let the minor change in (1) pass without further comment.
> 
> I'd like to recommend consideration of the following patch, to address the
> minor inconsistency of (2), and to save a few bytes of kernel text space.
> 

For memory controller, we have to do this:
	echo -n 4m > memory.limit_in_bytes
'-n' is necessary. This is another inconsistency..

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file
  2008-02-18  9:45       ` Li Zefan
@ 2008-02-18 10:32         ` Balbir Singh
  2008-02-19 21:57         ` Paul Jackson
  2008-02-20  2:51         ` Paul Menage
  2 siblings, 0 replies; 28+ messages in thread
From: Balbir Singh @ 2008-02-18 10:32 UTC (permalink / raw)
  To: Paul Menage
  Cc: Li Zefan, balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa,
	akpm, containers, linux-kernel

Li Zefan wrote:
> Paul Menage wrote:
>> On Feb 16, 2008 2:07 AM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>>> Paul Menage wrote:
>>>
>>> Hi, Paul,
>>>
>>> Do we need to use a cgroup.api file? Why not keep up to date documentation and
>>> get users to use that. I fear that, cgroup.api will not be kept up-to-date,
>>> leading to confusion.
>> The cgroup.api file isn't meant to give complete documentation for a
>> control file, simply a brief indication of its usage.
>>
> 
> But we don't have /proc/proc.api or /sys/sysfs.api ...
> 

Yes, doing that would make the size of the kernel go out of control.

>> The aim is that most bits of the information reported in cgroup.api
>> are auto-generated, so there shouldn't be problems with it getting
>> out-of-date.
>>
>> Is it just the space used by the documentation string that you're
>> objecting to? The other function of the file is to declare a type for
>> each variable.
>>

I like the documentation aspect, it is the space that I object to.


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API
       [not found]       ` <47B96805.7070002@linux.vnet.ibm.com>
@ 2008-02-18 11:13         ` Balbir Singh
  2008-02-18 11:52           ` Andreas Schwab
  0 siblings, 1 reply; 28+ messages in thread
From: Balbir Singh @ 2008-02-18 11:13 UTC (permalink / raw)
  To: Li Zefan, akpm
  Cc: Paul Jackson, Paul Menage, xemul, kamezawa.hiroyu, vatsa,
	containers, linux-kernel

* Balbir Singh <balbir@linux.vnet.ibm.com> [2008-02-18 16:42:05]:

> Li Zefan wrote:
> > Paul Jackson wrote:
> >> Ok ... this would (I suspect, just from code reading, no bytes were
> >> harmed in actual testing of this) have a minor change to how white
> >> space is handled writing integer flags to cpuset files, and a minor
> >> inconstency.
> >>
> >>  1) Existing cpuset code lets you set a flag (e.g. cpu_exclusive) by doing:
> >> 	echo '1 rumplestiltskin' > cpu_exclusive   # same as: echo 1 > cpu_exclusive
> >>     With this patch, that probably fails, EINVAL.
> >>
> >>  2) With this patch, one can write "1" or "1\n" to cpuset integer files, but one
> >>     cannot successfully write "1\r\n" or "1 " or "1 \n".  However, for the cpuset
> >>     control files that take strings, not single integers, one -can- have any mix
> >>     of trailing white space.
> >>
> >> So far as I know, I have no requirement to write rumplestiltskin to cpuset files ;).
> >> So I'm content to let the minor change in (1) pass without further comment.
> >>
> >> I'd like to recommend consideration of the following patch, to address the
> >> minor inconsistency of (2), and to save a few bytes of kernel text space.
> >>
> > 
> > For memory controller, we have to do this:
> > 	echo -n 4m > memory.limit_in_bytes
> > '-n' is necessary. This is another inconsistency..
> 
Hi. Li,
 
I have a similar patch that fixes the inconsistency.
It's attached below. Andrew, could we please consider this patch for -mm


The memory controller has a requirement that while writing values, we need
to use echo -n. This patch fixes the problem and makes the UI more consistent.

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---

 Documentation/controllers/memory.txt |    8 ++++----
 kernel/res_counter.c                 |    1 +
 2 files changed, 5 insertions(+), 4 deletions(-)

diff -puN mm/memcontrol.c~memory-controller-fix-crlf-echo-issue mm/memcontrol.c
diff -puN kernel/res_counter.c~memory-controller-fix-crlf-echo-issue kernel/res_counter.c
--- linux-2.6.25-rc2/kernel/res_counter.c~memory-controller-fix-crlf-echo-issue	2008-02-18 16:15:02.000000000 +0530
+++ linux-2.6.25-rc2-balbir/kernel/res_counter.c	2008-02-18 16:16:16.000000000 +0530
@@ -113,6 +113,7 @@ ssize_t res_counter_write(struct res_cou
 
 	ret = -EINVAL;
 
+	strstrip(buf);
 	if (write_strategy) {
 		if (write_strategy(buf, &tmp)) {
 			goto out_free;
diff -puN Documentation/controllers/memory.txt~memory-controller-fix-crlf-echo-issue Documentation/controllers/memory.txt
--- linux-2.6.25-rc2/Documentation/controllers/memory.txt~memory-controller-fix-crlf-echo-issue	2008-02-18 16:18:26.000000000 +0530
+++ linux-2.6.25-rc2-balbir/Documentation/controllers/memory.txt	2008-02-18 16:18:44.000000000 +0530
@@ -164,7 +164,7 @@ c. Enable CONFIG_CGROUP_MEM_CONT
 
 Since now we're in the 0 cgroup,
 We can alter the memory limit:
-# echo -n 4M > /cgroups/0/memory.limit_in_bytes
+# echo 4M > /cgroups/0/memory.limit_in_bytes
 
 NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
 mega or gigabytes.
@@ -185,7 +185,7 @@ number of factors, such as rounding up t
 availability of memory on the system.  The user is required to re-read
 this file after a write to guarantee the value committed by the kernel.
 
-# echo -n 1 > memory.limit_in_bytes
+# echo 1 > memory.limit_in_bytes
 # cat memory.limit_in_bytes
 4096 Bytes
 
@@ -197,7 +197,7 @@ caches, RSS and Active pages/Inactive pa
 
 The memory.force_empty gives an interface to drop *all* charges by force.
 
-# echo -n 1 > memory.force_empty
+# echo 1 > memory.force_empty
 
 will drop all charges in cgroup. Currently, this is maintained for test.
 
@@ -238,7 +238,7 @@ rmdir() if there are no tasks.
 The type of memory accounted by the cgroup can be limited to just
 mapped pages by writing "1" to memory.control_type field
 
-echo -n 1 > memory.control_type
+echo > memory.control_type
 
 5. TODO
 
_

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API
  2008-02-18 11:13         ` Balbir Singh
@ 2008-02-18 11:52           ` Andreas Schwab
       [not found]             ` <47B971C6.4080807@linux.vnet.ibm.com>
  0 siblings, 1 reply; 28+ messages in thread
From: Andreas Schwab @ 2008-02-18 11:52 UTC (permalink / raw)
  To: Li Zefan
  Cc: akpm, Paul Jackson, Paul Menage, xemul, kamezawa.hiroyu, vatsa,
	containers, linux-kernel

Balbir Singh <balbir@linux.vnet.ibm.com> writes:

> @@ -238,7 +238,7 @@ rmdir() if there are no tasks.
>  The type of memory accounted by the cgroup can be limited to just
>  mapped pages by writing "1" to memory.control_type field
>  
> -echo -n 1 > memory.control_type
> +echo > memory.control_type

Looks like you stripped too much here.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API
       [not found]             ` <47B971C6.4080807@linux.vnet.ibm.com>
@ 2008-02-18 11:56               ` Balbir Singh
  0 siblings, 0 replies; 28+ messages in thread
From: Balbir Singh @ 2008-02-18 11:56 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Li Zefan, containers, vatsa, linux-kernel, Paul Menage, akpm, xemul

* Balbir Singh <balbir@linux.vnet.ibm.com> [2008-02-18 17:23:42]:

> Andreas Schwab wrote:
> > Balbir Singh <balbir@linux.vnet.ibm.com> writes:
> > 
> >> @@ -238,7 +238,7 @@ rmdir() if there are no tasks.
> >>  The type of memory accounted by the cgroup can be limited to just
> >>  mapped pages by writing "1" to memory.control_type field
> >>  
> >> -echo -n 1 > memory.control_type
> >> +echo > memory.control_type
> > 
> > Looks like you stripped too much here.
> > 
> > Andreas.
> > 
> 
Yikes,

The control type feature documentation needs to go away and Li has a patch for
it, so I'll not touch that part. Here's the updated patch. Thanks for catching
this Andreas.

The memory controller has a requirement that while writing values, we need
to use echo -n. This patch fixes the problem and makes the UI more consistent.

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---

 Documentation/controllers/memory.txt |    6 +++---
 kernel/res_counter.c                 |    1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

diff -puN mm/memcontrol.c~memory-controller-fix-crlf-echo-issue mm/memcontrol.c
diff -puN kernel/res_counter.c~memory-controller-fix-crlf-echo-issue kernel/res_counter.c
--- linux-2.6.25-rc2/kernel/res_counter.c~memory-controller-fix-crlf-echo-issue	2008-02-18 16:15:02.000000000 +0530
+++ linux-2.6.25-rc2-balbir/kernel/res_counter.c	2008-02-18 16:16:16.000000000 +0530
@@ -113,6 +113,7 @@ ssize_t res_counter_write(struct res_cou
 
 	ret = -EINVAL;
 
+	strstrip(buf);
 	if (write_strategy) {
 		if (write_strategy(buf, &tmp)) {
 			goto out_free;
diff -puN Documentation/controllers/memory.txt~memory-controller-fix-crlf-echo-issue Documentation/controllers/memory.txt
--- linux-2.6.25-rc2/Documentation/controllers/memory.txt~memory-controller-fix-crlf-echo-issue	2008-02-18 16:18:26.000000000 +0530
+++ linux-2.6.25-rc2-balbir/Documentation/controllers/memory.txt	2008-02-18 17:24:48.000000000 +0530
@@ -164,7 +164,7 @@ c. Enable CONFIG_CGROUP_MEM_CONT
 
 Since now we're in the 0 cgroup,
 We can alter the memory limit:
-# echo -n 4M > /cgroups/0/memory.limit_in_bytes
+# echo 4M > /cgroups/0/memory.limit_in_bytes
 
 NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
 mega or gigabytes.
@@ -185,7 +185,7 @@ number of factors, such as rounding up t
 availability of memory on the system.  The user is required to re-read
 this file after a write to guarantee the value committed by the kernel.
 
-# echo -n 1 > memory.limit_in_bytes
+# echo 1 > memory.limit_in_bytes
 # cat memory.limit_in_bytes
 4096 Bytes
 
@@ -197,7 +197,7 @@ caches, RSS and Active pages/Inactive pa
 
 The memory.force_empty gives an interface to drop *all* charges by force.
 
-# echo -n 1 > memory.force_empty
+# echo 1 > memory.force_empty
 
 will drop all charges in cgroup. Currently, this is maintained for test.
 
_

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file
  2008-02-18  9:45       ` Li Zefan
  2008-02-18 10:32         ` Balbir Singh
@ 2008-02-19 21:57         ` Paul Jackson
  2008-02-20  2:51           ` Paul Menage
  2008-02-20  2:51         ` Paul Menage
  2 siblings, 1 reply; 28+ messages in thread
From: Paul Jackson @ 2008-02-19 21:57 UTC (permalink / raw)
  To: Li Zefan
  Cc: menage, balbir, balbir, xemul, kamezawa.hiroyu, vatsa, akpm,
	containers, linux-kernel

Li Zefan wrote:
> It seems to me this is a little messy.

Agreed.  It looks too finicky to base real software on;  that is, I
doubt that any robust application or user level software is going to
depend on it for serious automated self-configuration.

I haven't seen a serious problem with cpuset documentation, which is
the earlier API of this flavor (though, granted, I'd be the last
person on the planet to see such ;).  It seems to me that this style
of API is easy enough for a variety of people to figure out that this
won't help that much.  And certainly the more interesting details that
need documentation are far too verbose for this presentation.

So ... little or no programmatic use, little or no human use.

It also reminds me a bit, in a very loose way, of efforts in sysfs that
have taken us a while, and too many changes, to get right.  One needs to
be -selective- in what one exposes, in order to minimize maintenance costs
and maximize the signal-to-noise ratio, over time.  This feels like it
exposes too much.

Finally, it goes against the one thingie per file (at most, one scalar
vector) that has worked well for us when tried.

As to the motivations Paul M gives:
 1) Avoid "an arbitrary mess of ad-hoc APIs":
	We can still do that, whether or not we "self-document" these
	API's in this manner.
 2) binary APIs versus ASCII APIs:
	Well, I have an ASCII API bias, not surprising.  But I'd
	suggest not doing things "in anticipation" of some future
	fuzzy binary API support.  Wait until that day actually arrives.
 3) The memory controller currently has files with the "_in_bytes":
	The traditional way to handle this is Documentation and man
	pages; good enough for my granddad, good enough for me ;).

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file
  2008-02-19 21:57         ` Paul Jackson
@ 2008-02-20  2:51           ` Paul Menage
  2008-02-20  5:17             ` Paul Jackson
  0 siblings, 1 reply; 28+ messages in thread
From: Paul Menage @ 2008-02-20  2:51 UTC (permalink / raw)
  To: Paul Jackson
  Cc: Li Zefan, balbir, balbir, xemul, kamezawa.hiroyu, vatsa, akpm,
	containers, linux-kernel

On Feb 19, 2008 1:57 PM, Paul Jackson <pj@sgi.com> wrote:
>
> Finally, it goes against the one thingie per file (at most, one scalar
> vector) that has worked well for us when tried.

Right, I like the idea of keeping things simple. But if you're going
to accept that a vector is useful, then it seems reasonable that some
other *simple* structured datatypes can be useful. An N-element
key/value map (a la /proc/meminfo) is, I think, nicer than having to
read values from N separate files.

>
> As to the motivations Paul M gives:
>  1) Avoid "an arbitrary mess of ad-hoc APIs":
>         We can still do that, whether or not we "self-document" these
>         API's in this manner.

We can, but this file makes it more clear what control files have a
well-defined API and which are just returning some ad-hoc string.

I guess it's not essential, I just figured that if we had that
information, it made sense to make it available to userspace. I guess
I'm happy with dropping the actual exposed cgroup.api file for now as
long as we can work towards reducing the number of control files that
just return strings, and make use of the structured output such as
read_uint() miore.

>  2) binary APIs versus ASCII APIs:
>         Well, I have an ASCII API bias, not surprising.  But I'd
>         suggest not doing things "in anticipation" of some future
>         fuzzy binary API support.  Wait until that day actually arrives.

I have a reasonably clear idea of how we can do the binary API.

That's mostly for a separate RFC. But for example, reading a map via
the binary API would be able to just return a list values since the
keys could be parsed once from the ascii map (provided that the
subsystem guaranteed that the map keys and their order wouldn't change
between reboots).


>  3) The memory controller currently has files with the "_in_bytes":
>         The traditional way to handle this is Documentation and man
>         pages; good enough for my granddad, good enough for me ;).

I've tried submitting patches to remove the in_bytes suffix and just
rely on the documentation, and people didn't seem to like it ...

Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file
  2008-02-18  9:45       ` Li Zefan
  2008-02-18 10:32         ` Balbir Singh
  2008-02-19 21:57         ` Paul Jackson
@ 2008-02-20  2:51         ` Paul Menage
  2 siblings, 0 replies; 28+ messages in thread
From: Paul Menage @ 2008-02-20  2:51 UTC (permalink / raw)
  To: Li Zefan
  Cc: balbir, balbir, pj, Pavel Emelianov, KAMEZAWA Hiroyuki, vatsa,
	akpm, containers, linux-kernel

On Feb 18, 2008 1:45 AM, Li Zefan <lizf@cn.fujitsu.com> wrote:
> >
>
> But we don't have /proc/proc.api or /sys/sysfs.api ...

True. And /proc is a bit of a mess. Having a similar API file for
sysfs sounds like a good idea to me.

>
> And is it better to describe the debug subsystem too?
>

Yes, probably, but that would be a separate patch to the debug
subsystem itself, not the main cgroups code.

Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file
  2008-02-20  2:51           ` Paul Menage
@ 2008-02-20  5:17             ` Paul Jackson
  2008-02-20  5:23               ` Paul Menage
  0 siblings, 1 reply; 28+ messages in thread
From: Paul Jackson @ 2008-02-20  5:17 UTC (permalink / raw)
  To: Paul Menage
  Cc: lizf, balbir, balbir, xemul, kamezawa.hiroyu, vatsa, akpm,
	containers, linux-kernel

Paul M wrote:
> I guess it's not essential, I just figured that if we had that
> information, it made sense to make it available to userspace. I guess
> I'm happy with dropping the actual exposed cgroup.api file for now as
> long as we can work towards reducing the number of control files that
> just return strings, and make use of the structured output such as
> read_uint() miore.

I could certainly go along with that ... reducing the proportion of
control files returning untyped strings.

My sense of kernel-user API's is that usually the less said the better.
Identify the essential information that one side requires from the
other via a runtime API, and pass only that.  API's represent a
lifetime commitment, so the less promised the better.

Perhaps my primary concern with these *.api files was that I did not
understand who or what the critical use or user was; who found this
essential, not just nice to have.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file
  2008-02-20  5:17             ` Paul Jackson
@ 2008-02-20  5:23               ` Paul Menage
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Menage @ 2008-02-20  5:23 UTC (permalink / raw)
  To: Paul Jackson
  Cc: lizf, balbir, balbir, xemul, kamezawa.hiroyu, vatsa, akpm,
	containers, linux-kernel

On Feb 19, 2008 9:17 PM, Paul Jackson <pj@sgi.com> wrote:
>
> Perhaps my primary concern with these *.api files was that I did not
> understand who or what the critical use or user was; who found this
> essential, not just nice to have.
>

Right now, no-one would find it essential. If/when a binary API is
added, I guess I'll ressurrect this part of the patchset.

Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2008-02-20  5:24 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-15 20:44 [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files Paul Menage
2008-02-15 20:44 ` [RFC][PATCH 1/7] CGroup API: Add cgroup.api control file Paul Menage
2008-02-16 10:07   ` Balbir Singh
2008-02-16 17:44     ` Paul Menage
2008-02-18  9:45       ` Li Zefan
2008-02-18 10:32         ` Balbir Singh
2008-02-19 21:57         ` Paul Jackson
2008-02-20  2:51           ` Paul Menage
2008-02-20  5:17             ` Paul Jackson
2008-02-20  5:23               ` Paul Menage
2008-02-20  2:51         ` Paul Menage
2008-02-15 20:44 ` [RFC][PATCH 2/7] CGroup API: Add cgroup map data type Paul Menage
2008-02-15 20:44 ` [RFC][PATCH 3/7] CGroup API: Use cgroup map for memcontrol stats file Paul Menage
2008-02-15 20:44 ` [RFC][PATCH 4/7] CGroup API: Add res_counter_read_uint() Paul Menage
2008-02-15 20:44 ` [RFC][PATCH 5/7] CGroup API: Use read_uint in memory controller Paul Menage
2008-02-15 20:44 ` [RFC][PATCH 6/7] CGroup API: Use descriptions for memory controller API files Paul Menage
2008-02-15 20:44 ` [RFC][PATCH 7/7] CGroup API: Update cpusets to use cgroup structured file API Paul Menage
2008-02-17  3:29   ` Paul Jackson
2008-02-17 17:18     ` Paul Menage
2008-02-17 17:28       ` Paul Jackson
2008-02-17 17:48         ` Paul Menage
2008-02-18  9:55     ` Li Zefan
     [not found]       ` <47B96805.7070002@linux.vnet.ibm.com>
2008-02-18 11:13         ` Balbir Singh
2008-02-18 11:52           ` Andreas Schwab
     [not found]             ` <47B971C6.4080807@linux.vnet.ibm.com>
2008-02-18 11:56               ` Balbir Singh
2008-02-16  4:21 ` [RFC][PATCH 0/7] CGroup API: More structured API for CGroups control files KAMEZAWA Hiroyuki
2008-02-16  9:31   ` Li Zefan
2008-02-16 17:40     ` Paul Menage

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).