LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/3]Sysctl: clean the code and prepare for secure use in containers
@ 2008-02-27 13:47 Pavel Emelyanov
  2008-02-27 13:48 ` [PATCH 1/3] Sysctl: merge equal proc_sys_read and proc_sys_write Pavel Emelyanov
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Pavel Emelyanov @ 2008-02-27 13:47 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux Kernel Mailing List

Many (most of) sysctls do not have a per-container sense. E.g. 
kernel.print_fatal_signals, vm.panic_on_oom, net.core.netdev_budget 
and so on and so forth. Besides, tuning then from inside a container
is not even secure. On the other hand, hiding them completely from
the container's tasks sometimes causes user-space to stop working.

When developing net sysctl, the common practice was to duplicate
a table and drop the write bits in table->mode, but this approach
was not very elegant, lead to excessive memory consumption and
was not suitable in general.

Here's the alternative solution. To facilitate the per-container
sysctls ctl_table_root-s were introduced. Each root contains a 
list of ctl_table_header-s that are visible to different namespaces. 
The idea of this set is to add the permissions() callback on the 
ctl_table_root to allow ctl root limit permissions to the same 
ctl_table-s.

The main user of this functionality is the net-namespaces code, 
but later this will (should) be used by more and more namespaces,
containers and control groups.

Actually, this idea's core is in a single hunk in the third patch.
First two patches are cleanups for sysctl code, while the third
one mostly extends the arguments set of some sysctl functions.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/3] Sysctl: merge equal proc_sys_read and proc_sys_write
  2008-02-27 13:47 [PATCH 0/3]Sysctl: clean the code and prepare for secure use in containers Pavel Emelyanov
@ 2008-02-27 13:48 ` Pavel Emelyanov
  2008-02-27 13:51 ` [PATCH 2/3] Sysctl: clean from unneeded extern and forward declarations Pavel Emelyanov
  2008-02-27 13:54 ` [PATCH 3/3] Sysctl: add the ->permissions callback on the ctl_table_root Pavel Emelyanov
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Emelyanov @ 2008-02-27 13:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux Kernel Mailing List

These ->read and ->write callbacks act in a very similar way, so
merge these paths to reduce the number of places to patch later
and shrink the .text size (a bit).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
 fs/proc/proc_sysctl.c |   50 ++++++++++--------------------------------------
 1 files changed, 11 insertions(+), 39 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 614c34b..5e31585 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -165,8 +165,8 @@ out:
 	return err;
 }
 
-static ssize_t proc_sys_read(struct file *filp, char __user *buf,
-				size_t count, loff_t *ppos)
+static ssize_t proc_sys_call_handler(struct file *filp, void __user *buf,
+		size_t count, loff_t *ppos, int write)
 {
 	struct dentry *dentry = filp->f_dentry;
 	struct ctl_table_header *head;
@@ -190,12 +190,12 @@ static ssize_t proc_sys_read(struct file *filp, char __user *buf,
 	 * and won't be until we finish.
 	 */
 	error = -EPERM;
-	if (sysctl_perm(table, MAY_READ))
+	if (sysctl_perm(table, write ? MAY_WRITE : MAY_READ))
 		goto out;
 
 	/* careful: calling conventions are nasty here */
 	res = count;
-	error = table->proc_handler(table, 0, filp, buf, &res, ppos);
+	error = table->proc_handler(table, write, filp, buf, &res, ppos);
 	if (!error)
 		error = res;
 out:
@@ -204,44 +204,16 @@ out:
 	return error;
 }
 
-static ssize_t proc_sys_write(struct file *filp, const char __user *buf,
+static ssize_t proc_sys_read(struct file *filp, char __user *buf,
 				size_t count, loff_t *ppos)
 {
-	struct dentry *dentry = filp->f_dentry;
-	struct ctl_table_header *head;
-	struct ctl_table *table;
-	ssize_t error;
-	size_t res;
-
-	table = do_proc_sys_lookup(dentry->d_parent, &dentry->d_name, &head);
-	/* Has the sysctl entry disappeared on us? */
-	error = -ENOENT;
-	if (!table)
-		goto out;
-
-	/* Has the sysctl entry been replaced by a directory? */
-	error = -EISDIR;
-	if (!table->proc_handler)
-		goto out;
-
-	/*
-	 * At this point we know that the sysctl was not unregistered
-	 * and won't be until we finish.
-	 */
-	error = -EPERM;
-	if (sysctl_perm(table, MAY_WRITE))
-		goto out;
-
-	/* careful: calling conventions are nasty here */
-	res = count;
-	error = table->proc_handler(table, 1, filp, (char __user *)buf,
-				    &res, ppos);
-	if (!error)
-		error = res;
-out:
-	sysctl_head_finish(head);
+	return proc_sys_call_handler(filp, (void __user *)buf, count, ppos, 0);
+}
 
-	return error;
+static ssize_t proc_sys_write(struct file *filp, const char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	return proc_sys_call_handler(filp, (void __user *)buf, count, ppos, 1);
 }
 
 
-- 
1.5.3.4


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 2/3] Sysctl: clean from unneeded extern and forward declarations
  2008-02-27 13:47 [PATCH 0/3]Sysctl: clean the code and prepare for secure use in containers Pavel Emelyanov
  2008-02-27 13:48 ` [PATCH 1/3] Sysctl: merge equal proc_sys_read and proc_sys_write Pavel Emelyanov
@ 2008-02-27 13:51 ` Pavel Emelyanov
  2008-02-27 13:54 ` [PATCH 3/3] Sysctl: add the ->permissions callback on the ctl_table_root Pavel Emelyanov
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Emelyanov @ 2008-02-27 13:51 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux Kernel Mailing List

The do_sysctl_strategy isn't used outside kernel/sysctl.c, so
this can be static and without a prototype in header.

Besides, move this one and parse_table() above their callers and
drop the forward declarations of the latter call.

One more "besides" - fix two checkpatch warnings: space before
a ( and an extra space at the end of a line.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
 include/linux/sysctl.h |    5 --
 kernel/sysctl.c        |  144 +++++++++++++++++++++++-------------------------
 2 files changed, 68 insertions(+), 81 deletions(-)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 571f01d..8e50196 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -981,11 +981,6 @@ extern int do_sysctl (int __user *name, int nlen,
 		      void __user *oldval, size_t __user *oldlenp,
 		      void __user *newval, size_t newlen);
 
-extern int do_sysctl_strategy (struct ctl_table *table,
-			       int __user *name, int nlen,
-			       void __user *oldval, size_t __user *oldlenp,
-			       void __user *newval, size_t newlen);
-
 extern ctl_handler sysctl_data;
 extern ctl_handler sysctl_string;
 extern ctl_handler sysctl_intvec;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 740e144..27aa50a 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -145,12 +145,6 @@ extern int no_unaligned_warning;
 extern int max_lock_depth;
 #endif
 
-#ifdef CONFIG_SYSCTL_SYSCALL
-static int parse_table(int __user *, int, void __user *, size_t __user *,
-		void __user *, size_t, struct ctl_table *);
-#endif
-
-
 #ifdef CONFIG_PROC_SYSCTL
 static int proc_do_cad_pid(struct ctl_table *table, int write, struct file *filp,
 		  void __user *buffer, size_t *lenp, loff_t *ppos);
@@ -1459,6 +1453,74 @@ void register_sysctl_root(struct ctl_table_root *root)
 }
 
 #ifdef CONFIG_SYSCTL_SYSCALL
+/* Perform the actual read/write of a sysctl table entry. */
+static int do_sysctl_strategy(struct ctl_table *table,
+			int __user *name, int nlen,
+			void __user *oldval, size_t __user *oldlenp,
+			void __user *newval, size_t newlen)
+{
+	int op = 0, rc;
+
+	if (oldval)
+		op |= 004;
+	if (newval)
+		op |= 002;
+	if (sysctl_perm(table, op))
+		return -EPERM;
+
+	if (table->strategy) {
+		rc = table->strategy(table, name, nlen, oldval, oldlenp,
+				     newval, newlen);
+		if (rc < 0)
+			return rc;
+		if (rc > 0)
+			return 0;
+	}
+
+	/* If there is no strategy routine, or if the strategy returns
+	 * zero, proceed with automatic r/w */
+	if (table->data && table->maxlen) {
+		rc = sysctl_data(table, name, nlen, oldval, oldlenp,
+				 newval, newlen);
+		if (rc < 0)
+			return rc;
+	}
+	return 0;
+}
+
+static int parse_table(int __user *name, int nlen,
+		       void __user *oldval, size_t __user *oldlenp,
+		       void __user *newval, size_t newlen,
+		       struct ctl_table *table)
+{
+	int n;
+repeat:
+	if (!nlen)
+		return -ENOTDIR;
+	if (get_user(n, name))
+		return -EFAULT;
+	for ( ; table->ctl_name || table->procname; table++) {
+		if (!table->ctl_name)
+			continue;
+		if (n == table->ctl_name) {
+			int error;
+			if (table->child) {
+				if (sysctl_perm(table, 001))
+					return -EPERM;
+				name++;
+				nlen--;
+				table = table->child;
+				goto repeat;
+			}
+			error = do_sysctl_strategy(table, name, nlen,
+						   oldval, oldlenp,
+						   newval, newlen);
+			return error;
+		}
+	}
+	return -ENOTDIR;
+}
+
 int do_sysctl(int __user *name, int nlen, void __user *oldval, size_t __user *oldlenp,
 	       void __user *newval, size_t newlen)
 {
@@ -1531,76 +1593,6 @@ int sysctl_perm(struct ctl_table *table, int op)
 	return test_perm(table->mode, op);
 }
 
-#ifdef CONFIG_SYSCTL_SYSCALL
-static int parse_table(int __user *name, int nlen,
-		       void __user *oldval, size_t __user *oldlenp,
-		       void __user *newval, size_t newlen,
-		       struct ctl_table *table)
-{
-	int n;
-repeat:
-	if (!nlen)
-		return -ENOTDIR;
-	if (get_user(n, name))
-		return -EFAULT;
-	for ( ; table->ctl_name || table->procname; table++) {
-		if (!table->ctl_name)
-			continue;
-		if (n == table->ctl_name) {
-			int error;
-			if (table->child) {
-				if (sysctl_perm(table, 001))
-					return -EPERM;
-				name++;
-				nlen--;
-				table = table->child;
-				goto repeat;
-			}
-			error = do_sysctl_strategy(table, name, nlen,
-						   oldval, oldlenp,
-						   newval, newlen);
-			return error;
-		}
-	}
-	return -ENOTDIR;
-}
-
-/* Perform the actual read/write of a sysctl table entry. */
-int do_sysctl_strategy (struct ctl_table *table,
-			int __user *name, int nlen,
-			void __user *oldval, size_t __user *oldlenp,
-			void __user *newval, size_t newlen)
-{
-	int op = 0, rc;
-
-	if (oldval)
-		op |= 004;
-	if (newval) 
-		op |= 002;
-	if (sysctl_perm(table, op))
-		return -EPERM;
-
-	if (table->strategy) {
-		rc = table->strategy(table, name, nlen, oldval, oldlenp,
-				     newval, newlen);
-		if (rc < 0)
-			return rc;
-		if (rc > 0)
-			return 0;
-	}
-
-	/* If there is no strategy routine, or if the strategy returns
-	 * zero, proceed with automatic r/w */
-	if (table->data && table->maxlen) {
-		rc = sysctl_data(table, name, nlen, oldval, oldlenp,
-				 newval, newlen);
-		if (rc < 0)
-			return rc;
-	}
-	return 0;
-}
-#endif /* CONFIG_SYSCTL_SYSCALL */
-
 static void sysctl_set_parent(struct ctl_table *parent, struct ctl_table *table)
 {
 	for (; table->ctl_name || table->procname; table++) {
-- 
1.5.3.4


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 3/3] Sysctl: add the ->permissions callback on the ctl_table_root
  2008-02-27 13:47 [PATCH 0/3]Sysctl: clean the code and prepare for secure use in containers Pavel Emelyanov
  2008-02-27 13:48 ` [PATCH 1/3] Sysctl: merge equal proc_sys_read and proc_sys_write Pavel Emelyanov
  2008-02-27 13:51 ` [PATCH 2/3] Sysctl: clean from unneeded extern and forward declarations Pavel Emelyanov
@ 2008-02-27 13:54 ` Pavel Emelyanov
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Emelyanov @ 2008-02-27 13:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux Kernel Mailing List

When reading from/writing to some table, a root, which this table 
came from, may affect this table's permissions, depending on who is 
working with the table.

The core hunk is at the bottom of this patch. All the rest is
just pushing the ctl_table_root argument up to the sysctl_perm()
function.

This will be mostly (only?) used in the net sysctls.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
 fs/proc/proc_sysctl.c  |    4 ++--
 include/linux/sysctl.h |    7 ++++++-
 kernel/sysctl.c        |   25 ++++++++++++++++++-------
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 5e31585..5acc001 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -190,7 +190,7 @@ static ssize_t proc_sys_call_handler(struct file *filp, void __user *buf,
 	 * and won't be until we finish.
 	 */
 	error = -EPERM;
-	if (sysctl_perm(table, write ? MAY_WRITE : MAY_READ))
+	if (sysctl_perm(head->root, table, write ? MAY_WRITE : MAY_READ))
 		goto out;
 
 	/* careful: calling conventions are nasty here */
@@ -388,7 +388,7 @@ static int proc_sys_permission(struct inode *inode, int mask, struct nameidata *
 		goto out;
 
 	/* Use the permissions on the sysctl table entry */
-	error = sysctl_perm(table, mask);
+	error = sysctl_perm(head->root, table, mask);
 out:
 	sysctl_head_finish(head);
 	return error;
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 8e50196..3239561 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -945,11 +945,14 @@ enum
 /* For the /proc/sys support */
 struct ctl_table;
 struct nsproxy;
+struct ctl_table_root;
+
 extern struct ctl_table_header *sysctl_head_next(struct ctl_table_header *prev);
 extern struct ctl_table_header *__sysctl_head_next(struct nsproxy *namespaces,
 						struct ctl_table_header *prev);
 extern void sysctl_head_finish(struct ctl_table_header *prev);
-extern int sysctl_perm(struct ctl_table *table, int op);
+extern int sysctl_perm(struct ctl_table_root *root,
+		struct ctl_table *table, int op);
 
 typedef struct ctl_table ctl_table;
 
@@ -1049,6 +1052,8 @@ struct ctl_table_root {
 	struct list_head header_list;
 	struct list_head *(*lookup)(struct ctl_table_root *root,
 					   struct nsproxy *namespaces);
+	int (*permissions)(struct ctl_table_root *root,
+			struct nsproxy *namespaces, struct ctl_table *table);
 };
 
 /* struct ctl_table_header is used to maintain dynamic lists of
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 27aa50a..12b793a 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1454,7 +1454,8 @@ void register_sysctl_root(struct ctl_table_root *root)
 
 #ifdef CONFIG_SYSCTL_SYSCALL
 /* Perform the actual read/write of a sysctl table entry. */
-static int do_sysctl_strategy(struct ctl_table *table,
+static int do_sysctl_strategy(struct ctl_table_root *root,
+			struct ctl_table *table,
 			int __user *name, int nlen,
 			void __user *oldval, size_t __user *oldlenp,
 			void __user *newval, size_t newlen)
@@ -1465,7 +1466,7 @@ static int do_sysctl_strategy(struct ctl_table *table,
 		op |= 004;
 	if (newval)
 		op |= 002;
-	if (sysctl_perm(table, op))
+	if (sysctl_perm(root, table, op))
 		return -EPERM;
 
 	if (table->strategy) {
@@ -1491,6 +1492,7 @@ static int do_sysctl_strategy(struct ctl_table *table,
 static int parse_table(int __user *name, int nlen,
 		       void __user *oldval, size_t __user *oldlenp,
 		       void __user *newval, size_t newlen,
+		       struct ctl_table_root *root,
 		       struct ctl_table *table)
 {
 	int n;
@@ -1505,14 +1507,14 @@ repeat:
 		if (n == table->ctl_name) {
 			int error;
 			if (table->child) {
-				if (sysctl_perm(table, 001))
+				if (sysctl_perm(root, table, 001))
 					return -EPERM;
 				name++;
 				nlen--;
 				table = table->child;
 				goto repeat;
 			}
-			error = do_sysctl_strategy(table, name, nlen,
+			error = do_sysctl_strategy(root, table, name, nlen,
 						   oldval, oldlenp,
 						   newval, newlen);
 			return error;
@@ -1538,7 +1540,8 @@ int do_sysctl(int __user *name, int nlen, void __user *oldval, size_t __user *ol
 	for (head = sysctl_head_next(NULL); head;
 			head = sysctl_head_next(head)) {
 		error = parse_table(name, nlen, oldval, oldlenp, 
-					newval, newlen, head->ctl_table);
+					newval, newlen,
+					head->root, head->ctl_table);
 		if (error != -ENOTDIR) {
 			sysctl_head_finish(head);
 			break;
@@ -1584,13 +1587,21 @@ static int test_perm(int mode, int op)
 	return -EACCES;
 }
 
-int sysctl_perm(struct ctl_table *table, int op)
+int sysctl_perm(struct ctl_table_root *root, struct ctl_table *table, int op)
 {
 	int error;
+	int mode;
+
 	error = security_sysctl(table, op);
 	if (error)
 		return error;
-	return test_perm(table->mode, op);
+
+	if (root->permissions)
+		mode = root->permissions(root, current->nsproxy, table);
+	else
+		mode = table->mode;
+
+	return test_perm(mode, op);
 }
 
 static void sysctl_set_parent(struct ctl_table *parent, struct ctl_table *table)
-- 
1.5.3.4


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-02-27 13:54 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-27 13:47 [PATCH 0/3]Sysctl: clean the code and prepare for secure use in containers Pavel Emelyanov
2008-02-27 13:48 ` [PATCH 1/3] Sysctl: merge equal proc_sys_read and proc_sys_write Pavel Emelyanov
2008-02-27 13:51 ` [PATCH 2/3] Sysctl: clean from unneeded extern and forward declarations Pavel Emelyanov
2008-02-27 13:54 ` [PATCH 3/3] Sysctl: add the ->permissions callback on the ctl_table_root Pavel Emelyanov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).