LKML Archive on lore.kernel.org
* [patch 00/10] mount ownership and unprivileged mount syscall (v8)
@ 2008-02-05 21:36 Miklos Szeredi
  2008-02-05 21:36 ` [patch 01/10] unprivileged mounts: add user mounts to the kernel Miklos Szeredi
                   ` (10 more replies)
  0 siblings, 11 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

Just documentation updates, compared to the previous submission.
Thanks to Serge for the relentless reviews :)

Please consider for -mm, and then for 2.6.26.

Thanks,
Miklos

v7 -> v8:

 - extend documentation of allow_usermount sysctl tunable
 - describe new unprivileged mounting in fuse.txt

v6 -> v7:

 - add '/proc/sys/fs/types/<type>/usermount_safe' tunable (new patch)
 - do not make FUSE safe by default, describe possible problems
   associated with unprivileged FUSE mounts in patch header
 - return EMFILE instead of EPERM, if maximum user mount count is exceeded
 - rename option 'nomnt' -> 'nosubmnt'
 - clean up error propagation in dup_mnt_ns
 - update util-linux-ng patch

v5 -> v6:

 - update to latest -mm
 - preliminary util-linux-ng support (will post right after this series)

v4 -> v5:

 - fold back Andrew's changes
 - fold back my update patch:
    o use fsuid instead of ruid
    o allow forced unpriv. unmounts for "safe" filesystems
    o allow mounting over special files, but not over symlinks
    o set nosuid and nodev based on lack of specific capability
 - patch header updates
 - new patch: on propagation inherit owner from parent
 - new patch: add "no submounts" mount flag

v3 -> v4:

 - simplify interface as much as possible, now only a single option
   ("user=UID") is used to control everything
 - no longer allow/deny mounting based on file/directory permissions,
   that approach does not always make sense

v1 -> v3:

 - add mount flags to set/clear mnt_flags individually
 - add "usermnt" mount flag.  If it is set, then allow unprivileged
   submounts under this mount
 - make max number of user mounts default to 1024, since now the
   usermnt flag will prevent user mounts by default

--


* [patch 01/10] unprivileged mounts: add user mounts to the kernel
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
@ 2008-02-05 21:36 ` Miklos Szeredi
  2008-02-05 21:36 ` [patch 02/10] unprivileged mounts: allow unprivileged umount Miklos Szeredi
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

[-- Attachment #1: unprivileged-mounts-add-user-mounts-to-the-kernel.patch --]
[-- Type: text/plain, Size: 8650 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

This patchset adds support for keeping mount ownership information in the
kernel, and allows unprivileged mount(2) and umount(2) in certain cases.

The mount owner has the following privileges:

  - unmount the owned mount
  - create a submount under the owned mount

The sysadmin can set the owner explicitly on mount and remount.  When an
unprivileged user creates a mount, the owner is automatically set to that
user.

The following use cases are envisioned:

1) Private namespace, with selected mounts owned by user.  E.g.
   /home/$USER is a good candidate for allowing unpriv mounts and unmounts
   within.

2) Private namespace, with all mounts owned by user and having the "nosuid"
   flag.  User can mount and umount anywhere within the namespace, but suid
   programs will not work.

3) Global namespace, with a designated directory, which is a mount owned by
   the user.  E.g.  /mnt/users/$USER is set up so that it is bind mounted onto
   itself, and set to be owned by $USER.  The user can add/remove mounts only
   under this directory.

The following extra security measures are taken for unprivileged mounts:

 - the number of user mounts is limited by a sysctl tunable
 - force "nosuid,nodev" mount options on the created mount

This series increases the size of vmlinux by about 1.5k on x86_64.

For testing unprivileged mounts (and for other purposes) simple
mount/umount utilities are available from:

  http://www.kernel.org/pub/linux/kernel/people/mszeredi/mmount/

A preliminary patch for util-linux-ng to add the same functionality to
mount(8) and umount(8) is available here:

  http://lkml.org/lkml/2008/1/16/103


This patch:

A new mount flag, MS_SETUSER, is used to make a mount owned by a user.  If this
flag is specified, then the owner will be set to the current fsuid and the
mount will be marked with the MNT_USER flag.  On remount, the previous owner is
not preserved, and MS_SETUSER is treated as for a new mount.  The MS_SETUSER
flag is ignored on mount move.

The MNT_USER flag is not copied on any kind of mount cloning: namespace
creation, binding or propagation.  For bind mounts the cloned mount(s) are set
to MNT_USER depending on the MS_SETUSER mount flag.  In all the other cases
MNT_USER is always cleared.
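
The cloning rule above can be sketched in user space.  This is only a
minimal model, not the kernel code: struct cl_mount and model_clone()
are hypothetical names, and the fsuid is passed in as a parameter
instead of being read from the current task.  The MNT_USER and
CL_SETUSER values are the ones added by this patch; MNT_NOSUID is
assumed to be 0x01 as in include/linux/mount.h.

```c
#include <assert.h>
#include <stddef.h>

#define MNT_NOSUID  0x01   /* assumed value, used only to show flag copying */
#define MNT_USER    0x400  /* value added by this patch */
#define CL_SETUSER  0x40   /* value added by this patch */

/* Hypothetical stand-in for the two vfsmount fields that matter here. */
struct cl_mount {
	int flags;          /* models mnt_flags */
	unsigned int uid;   /* models mnt_uid */
};

/*
 * Clone 'old'.  Ownership is never inherited from the source mount;
 * it is set only when the caller asks for it with CL_SETUSER.
 */
static struct cl_mount model_clone(const struct cl_mount *old,
				   int clone_flags, unsigned int fsuid)
{
	struct cl_mount mnt;

	assert(old != NULL);

	/* copy all flags except MNT_USER */
	mnt.flags = old->flags & ~MNT_USER;
	mnt.uid = 0;

	if (clone_flags & CL_SETUSER) {
		mnt.uid = fsuid;
		mnt.flags |= MNT_USER;
	}
	return mnt;
}
```

In this model, cloning an owned mount without CL_SETUSER yields an
unowned copy that keeps its other flags, which matches the "always
cleared" cases described above.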

For MNT_USER mounts a "user=UID" option is added to /proc/PID/mounts.  This is
compatible with how mount ownership is stored in /etc/mtab.

The rationale for using MS_SETUSER and MNT_USER to distinguish "user"
mounts from "non-user" or "legacy" mounts is as follows:

  a) Mount(2) and umount(2) on legacy mounts always need the CAP_SYS_ADMIN
     capability.  User mounts, by contrast, only require that the mount
     owner matches the current fsuid.  So a process with fsuid=0 should
     not be able to mount/umount legacy mounts without the CAP_SYS_ADMIN
     capability.

  b) Legacy userspace programs may set fsuid to nonzero before calling
     mount(2).  In such an unlikely case, this patchset would cause
     an unintended side effect of making the mount owned by the fsuid.

  c) For legacy mounts, no "user=UID" option should be shown in
     /proc/mounts for backwards compatibility.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

Index: linux/fs/namespace.c
===================================================================
--- linux.orig/fs/namespace.c	2008-02-04 23:47:47.000000000 +0100
+++ linux/fs/namespace.c	2008-02-04 23:47:50.000000000 +0100
@@ -511,6 +511,13 @@ static struct vfsmount *skip_mnt_tree(st
 	return p;
 }
 
+static void set_mnt_user(struct vfsmount *mnt)
+{
+	WARN_ON(mnt->mnt_flags & MNT_USER);
+	mnt->mnt_uid = current->fsuid;
+	mnt->mnt_flags |= MNT_USER;
+}
+
 static struct vfsmount *clone_mnt(struct vfsmount *old, struct dentry *root,
 					int flag)
 {
@@ -525,6 +532,11 @@ static struct vfsmount *clone_mnt(struct
 		mnt->mnt_mountpoint = mnt->mnt_root;
 		mnt->mnt_parent = mnt;
 
+		/* don't copy the MNT_USER flag */
+		mnt->mnt_flags &= ~MNT_USER;
+		if (flag & CL_SETUSER)
+			set_mnt_user(mnt);
+
 		if (flag & CL_SLAVE) {
 			list_add(&mnt->mnt_slave, &old->mnt_slave_list);
 			mnt->mnt_master = old;
@@ -712,6 +724,8 @@ static void show_mnt_opts(struct seq_fil
 		if (mnt->mnt_flags & fs_infop->flag)
 			seq_puts(m, fs_infop->str);
 	}
+	if (mnt->mnt_flags & MNT_USER)
+		seq_printf(m, ",user=%i", mnt->mnt_uid);
 }
 
 static void show_type(struct seq_file *m, struct super_block *sb)
@@ -1320,8 +1334,9 @@ static int do_change_type(struct nameida
 /*
  * do loopback mount.
  */
-static int do_loopback(struct nameidata *nd, char *old_name, int recurse)
+static int do_loopback(struct nameidata *nd, char *old_name, int flags)
 {
+	int clone_fl;
 	struct nameidata old_nd;
 	struct vfsmount *mnt = NULL;
 	int err = mount_is_safe(nd);
@@ -1341,11 +1356,12 @@ static int do_loopback(struct nameidata 
 	if (!check_mnt(nd->path.mnt) || !check_mnt(old_nd.path.mnt))
 		goto out;
 
+	clone_fl = (flags & MS_SETUSER) ? CL_SETUSER : 0;
 	err = -ENOMEM;
-	if (recurse)
-		mnt = copy_tree(old_nd.path.mnt, old_nd.path.dentry, 0);
+	if (flags & MS_REC)
+		mnt = copy_tree(old_nd.path.mnt, old_nd.path.dentry, clone_fl);
 	else
-		mnt = clone_mnt(old_nd.path.mnt, old_nd.path.dentry, 0);
+		mnt = clone_mnt(old_nd.path.mnt, old_nd.path.dentry, clone_fl);
 
 	if (!mnt)
 		goto out;
@@ -1407,8 +1423,11 @@ static int do_remount(struct nameidata *
 		err = change_mount_flags(nd->path.mnt, flags);
 	else
 		err = do_remount_sb(sb, flags, data, 0);
-	if (!err)
+	if (!err) {
 		nd->path.mnt->mnt_flags = mnt_flags;
+		if (flags & MS_SETUSER)
+			set_mnt_user(nd->path.mnt);
+	}
 	up_write(&sb->s_umount);
 	if (!err)
 		security_sb_post_remount(nd->path.mnt, flags, data);
@@ -1517,10 +1536,13 @@ static int do_new_mount(struct nameidata
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
 
-	mnt = do_kern_mount(type, flags, name, data);
+	mnt = do_kern_mount(type, flags & ~MS_SETUSER, name, data);
 	if (IS_ERR(mnt))
 		return PTR_ERR(mnt);
 
+	if (flags & MS_SETUSER)
+		set_mnt_user(mnt);
+
 	return do_add_mount(mnt, nd, mnt_flags, NULL);
 }
 
@@ -1552,7 +1574,8 @@ int do_add_mount(struct vfsmount *newmnt
 	if (S_ISLNK(newmnt->mnt_root->d_inode->i_mode))
 		goto unlock;
 
-	newmnt->mnt_flags = mnt_flags;
+	/* MNT_USER was set earlier */
+	newmnt->mnt_flags |= mnt_flags;
 	if ((err = graft_tree(newmnt, nd)))
 		goto unlock;
 
@@ -1874,7 +1897,7 @@ long do_mount(char *dev_name, char *dir_
 		retval = do_remount(&nd, flags & ~MS_REMOUNT, mnt_flags,
 				    data_page);
 	else if (flags & MS_BIND)
-		retval = do_loopback(&nd, dev_name, flags & MS_REC);
+		retval = do_loopback(&nd, dev_name, flags);
 	else if (flags & (MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE))
 		retval = do_change_type(&nd, flags);
 	else if (flags & MS_MOVE)
Index: linux/fs/pnode.h
===================================================================
--- linux.orig/fs/pnode.h	2008-02-04 23:47:47.000000000 +0100
+++ linux/fs/pnode.h	2008-02-04 23:47:50.000000000 +0100
@@ -23,6 +23,7 @@
 #define CL_MAKE_SHARED 		0x08
 #define CL_PROPAGATION 		0x10
 #define CL_PRIVATE 		0x20
+#define CL_SETUSER		0x40
 
 static inline void set_mnt_shared(struct vfsmount *mnt)
 {
Index: linux/include/linux/fs.h
===================================================================
--- linux.orig/include/linux/fs.h	2008-02-04 23:47:47.000000000 +0100
+++ linux/include/linux/fs.h	2008-02-04 23:47:50.000000000 +0100
@@ -125,6 +125,7 @@ extern int dir_notify_enable;
 #define MS_RELATIME	(1<<21)	/* Update atime relative to mtime/ctime. */
 #define MS_KERNMOUNT	(1<<22) /* this is a kern_mount call */
 #define MS_I_VERSION	(1<<23) /* Update inode I_version field */
+#define MS_SETUSER	(1<<24) /* set mnt_uid to current user */
 #define MS_ACTIVE	(1<<30)
 #define MS_NOUSER	(1<<31)
 
Index: linux/include/linux/mount.h
===================================================================
--- linux.orig/include/linux/mount.h	2008-02-04 23:47:47.000000000 +0100
+++ linux/include/linux/mount.h	2008-02-04 23:47:50.000000000 +0100
@@ -33,6 +33,7 @@ struct mnt_namespace;
 
 #define MNT_SHRINKABLE	0x100
 #define MNT_IMBALANCED_WRITE_COUNT	0x200 /* just for debugging */
+#define MNT_USER	0x400
 
 #define MNT_SHARED	0x1000	/* if the vfsmount is a shared mount */
 #define MNT_UNBINDABLE	0x2000	/* if the vfsmount is a unbindable mount */
@@ -70,6 +71,8 @@ struct vfsmount {
 	 * are held, and all mnt_writer[]s on this mount have 0 as their ->count
 	 */
 	atomic_t __mnt_writers;
+
+	uid_t mnt_uid;			/* owner of the mount */
 };
 
 static inline struct vfsmount *mntget(struct vfsmount *mnt)

--


* [patch 02/10] unprivileged mounts: allow unprivileged umount
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
  2008-02-05 21:36 ` [patch 01/10] unprivileged mounts: add user mounts to the kernel Miklos Szeredi
@ 2008-02-05 21:36 ` Miklos Szeredi
  2008-02-05 21:36 ` [patch 03/10] unprivileged mounts: propagate error values from clone_mnt Miklos Szeredi
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

[-- Attachment #1: unprivileged-mounts-allow-unprivileged-umount.patch --]
[-- Type: text/plain, Size: 1523 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

The owner doesn't need sysadmin capabilities to call umount().

This is similar to the behavior of umount(8) on mounts having the "user=UID"
option in /etc/mtab.  The difference is that umount(8) also checks /etc/fstab,
presumably to exclude another mount on the same mountpoint.
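
The permission rule introduced by this patch can be modelled outside
the kernel roughly as follows.  This is a sketch under assumptions:
struct um_mount and model_permit_umount() are hypothetical names, and
the capability check and fsuid are passed in as plain parameters
rather than taken from the calling task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MNT_USER   0x400       /* mount flag value from patch 01 */
#define MNT_FORCE  0x00000001  /* umount flag value from linux/fs.h */

/* Hypothetical stand-in for the two vfsmount fields that matter here. */
struct um_mount {
	int flags;          /* models mnt_flags */
	unsigned int uid;   /* models mnt_uid */
};

/*
 * Model of the check: the sysadmin may always unmount; the owner may
 * unmount only if the unmount is not forced.
 */
static bool model_permit_umount(const struct um_mount *mnt, int umount_flags,
				bool has_sys_admin, unsigned int fsuid)
{
	assert(mnt != NULL);

	if (has_sys_admin)
		return true;

	if (umount_flags & MNT_FORCE)
		return false;

	/* a legacy mount (no MNT_USER) has no owner, whatever mnt_uid holds */
	return (mnt->flags & MNT_USER) && mnt->uid == fsuid;
}
```

Note that in this model a process with fsuid 0 but without the
capability cannot unmount a legacy mount, because the ownership test
requires MNT_USER to be set.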

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

Index: linux/fs/namespace.c
===================================================================
--- linux.orig/fs/namespace.c	2008-02-04 23:47:50.000000000 +0100
+++ linux/fs/namespace.c	2008-02-04 23:47:53.000000000 +0100
@@ -1033,6 +1033,27 @@ static int do_umount(struct vfsmount *mn
 	return retval;
 }
 
+static bool is_mount_owner(struct vfsmount *mnt, uid_t uid)
+{
+	return (mnt->mnt_flags & MNT_USER) && mnt->mnt_uid == uid;
+}
+
+/*
+ * umount is permitted for
+ *  - sysadmin
+ *  - mount owner, if not forced umount
+ */
+static bool permit_umount(struct vfsmount *mnt, int flags)
+{
+	if (capable(CAP_SYS_ADMIN))
+		return true;
+
+	if (flags & MNT_FORCE)
+		return false;
+
+	return is_mount_owner(mnt, current->fsuid);
+}
+
 /*
  * Now umount can handle mount points as well as block devices.
  * This is important for filesystems which use unnamed block devices.
@@ -1056,7 +1077,7 @@ asmlinkage long sys_umount(char __user *
 		goto dput_and_out;
 
 	retval = -EPERM;
-	if (!capable(CAP_SYS_ADMIN))
+	if (!permit_umount(nd.path.mnt, flags))
 		goto dput_and_out;
 
 	retval = do_umount(nd.path.mnt, flags);

--


* [patch 03/10] unprivileged mounts: propagate error values from clone_mnt
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
  2008-02-05 21:36 ` [patch 01/10] unprivileged mounts: add user mounts to the kernel Miklos Szeredi
  2008-02-05 21:36 ` [patch 02/10] unprivileged mounts: allow unprivileged umount Miklos Szeredi
@ 2008-02-05 21:36 ` Miklos Szeredi
  2008-02-05 21:36 ` [patch 04/10] unprivileged mounts: account user mounts Miklos Szeredi
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

[-- Attachment #1: unprivileged-mounts-propagate-error-values-from-clone_mnt.patch --]
[-- Type: text/plain, Size: 5396 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

Allow clone_mnt() to return errors other than ENOMEM.  This will be used for
returning a different error value when the number of user mounts goes over the
limit.

Fix copy_tree() to return EPERM for unbindable mounts.
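
For readers unfamiliar with the error-pointer convention this patch
switches to, here is a user-space sketch of the idea.  The helpers
err_ptr(), ptr_err() and is_err() are lowercase re-implementations
written for illustration, not the kernel's ERR_PTR()/PTR_ERR()/IS_ERR()
macros, and alloc_slot() is a hypothetical function that can fail for
two different reasons.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top MAX_ERRNO addresses. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator: fails with EMFILE over the limit, ENOMEM on OOM. */
static int *alloc_slot(int used, int limit)
{
	int *slot;

	if (used >= limit)
		return err_ptr(-EMFILE);

	slot = malloc(sizeof(*slot));
	if (!slot)
		return err_ptr(-ENOMEM);

	*slot = used;
	return slot;
}
```

A caller tests the result with is_err() instead of comparing against
NULL, and passes ptr_err() up unchanged, so the original reason for
the failure is not collapsed into ENOMEM.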

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

Index: linux/fs/namespace.c
===================================================================
--- linux.orig/fs/namespace.c	2008-02-04 23:47:53.000000000 +0100
+++ linux/fs/namespace.c	2008-02-04 23:47:56.000000000 +0100
@@ -524,41 +524,42 @@ static struct vfsmount *clone_mnt(struct
 	struct super_block *sb = old->mnt_sb;
 	struct vfsmount *mnt = alloc_vfsmnt(old->mnt_devname);
 
-	if (mnt) {
-		mnt->mnt_flags = old->mnt_flags;
-		atomic_inc(&sb->s_active);
-		mnt->mnt_sb = sb;
-		mnt->mnt_root = dget(root);
-		mnt->mnt_mountpoint = mnt->mnt_root;
-		mnt->mnt_parent = mnt;
-
-		/* don't copy the MNT_USER flag */
-		mnt->mnt_flags &= ~MNT_USER;
-		if (flag & CL_SETUSER)
-			set_mnt_user(mnt);
-
-		if (flag & CL_SLAVE) {
-			list_add(&mnt->mnt_slave, &old->mnt_slave_list);
-			mnt->mnt_master = old;
-			CLEAR_MNT_SHARED(mnt);
-		} else if (!(flag & CL_PRIVATE)) {
-			if ((flag & CL_PROPAGATION) || IS_MNT_SHARED(old))
-				list_add(&mnt->mnt_share, &old->mnt_share);
-			if (IS_MNT_SLAVE(old))
-				list_add(&mnt->mnt_slave, &old->mnt_slave);
-			mnt->mnt_master = old->mnt_master;
-		}
-		if (flag & CL_MAKE_SHARED)
-			set_mnt_shared(mnt);
+	if (!mnt)
+		return ERR_PTR(-ENOMEM);
 
-		/* stick the duplicate mount on the same expiry list
-		 * as the original if that was on one */
-		if (flag & CL_EXPIRE) {
-			spin_lock(&vfsmount_lock);
-			if (!list_empty(&old->mnt_expire))
-				list_add(&mnt->mnt_expire, &old->mnt_expire);
-			spin_unlock(&vfsmount_lock);
-		}
+	mnt->mnt_flags = old->mnt_flags;
+	atomic_inc(&sb->s_active);
+	mnt->mnt_sb = sb;
+	mnt->mnt_root = dget(root);
+	mnt->mnt_mountpoint = mnt->mnt_root;
+	mnt->mnt_parent = mnt;
+
+	/* don't copy the MNT_USER flag */
+	mnt->mnt_flags &= ~MNT_USER;
+	if (flag & CL_SETUSER)
+		set_mnt_user(mnt);
+
+	if (flag & CL_SLAVE) {
+		list_add(&mnt->mnt_slave, &old->mnt_slave_list);
+		mnt->mnt_master = old;
+		CLEAR_MNT_SHARED(mnt);
+	} else if (!(flag & CL_PRIVATE)) {
+		if ((flag & CL_PROPAGATION) || IS_MNT_SHARED(old))
+			list_add(&mnt->mnt_share, &old->mnt_share);
+		if (IS_MNT_SLAVE(old))
+			list_add(&mnt->mnt_slave, &old->mnt_slave);
+		mnt->mnt_master = old->mnt_master;
+	}
+	if (flag & CL_MAKE_SHARED)
+		set_mnt_shared(mnt);
+
+	/* stick the duplicate mount on the same expiry list
+	 * as the original if that was on one */
+	if (flag & CL_EXPIRE) {
+		spin_lock(&vfsmount_lock);
+		if (!list_empty(&old->mnt_expire))
+			list_add(&mnt->mnt_expire, &old->mnt_expire);
+		spin_unlock(&vfsmount_lock);
 	}
 	return mnt;
 }
@@ -1137,11 +1138,11 @@ struct vfsmount *copy_tree(struct vfsmou
 	struct nameidata nd;
 
 	if (!(flag & CL_COPY_ALL) && IS_MNT_UNBINDABLE(mnt))
-		return NULL;
+		return ERR_PTR(-EPERM);
 
 	res = q = clone_mnt(mnt, dentry, flag);
-	if (!q)
-		goto Enomem;
+	if (IS_ERR(q))
+		goto error;
 	q->mnt_mountpoint = mnt->mnt_mountpoint;
 
 	p = mnt;
@@ -1162,8 +1163,8 @@ struct vfsmount *copy_tree(struct vfsmou
 			nd.path.mnt = q;
 			nd.path.dentry = p->mnt_mountpoint;
 			q = clone_mnt(p, p->mnt_root, flag);
-			if (!q)
-				goto Enomem;
+			if (IS_ERR(q))
+				goto error;
 			spin_lock(&vfsmount_lock);
 			list_add_tail(&q->mnt_list, &res->mnt_list);
 			attach_mnt(q, &nd);
@@ -1171,7 +1172,7 @@ struct vfsmount *copy_tree(struct vfsmou
 		}
 	}
 	return res;
-Enomem:
+ error:
 	if (res) {
 		LIST_HEAD(umount_list);
 		spin_lock(&vfsmount_lock);
@@ -1179,7 +1180,7 @@ Enomem:
 		spin_unlock(&vfsmount_lock);
 		release_mounts(&umount_list);
 	}
-	return NULL;
+	return q;
 }
 
 struct vfsmount *collect_mounts(struct vfsmount *mnt, struct dentry *dentry)
@@ -1378,13 +1379,13 @@ static int do_loopback(struct nameidata 
 		goto out;
 
 	clone_fl = (flags & MS_SETUSER) ? CL_SETUSER : 0;
-	err = -ENOMEM;
 	if (flags & MS_REC)
 		mnt = copy_tree(old_nd.path.mnt, old_nd.path.dentry, clone_fl);
 	else
 		mnt = clone_mnt(old_nd.path.mnt, old_nd.path.dentry, clone_fl);
 
-	if (!mnt)
+	err = PTR_ERR(mnt);
+	if (IS_ERR(mnt))
 		goto out;
 
 	err = graft_tree(mnt, nd);
@@ -1955,10 +1956,10 @@ static struct mnt_namespace *dup_mnt_ns(
 	/* First pass: copy the tree topology */
 	new_ns->root = copy_tree(mnt_ns->root, mnt_ns->root->mnt_root,
 					CL_COPY_ALL | CL_EXPIRE);
-	if (!new_ns->root) {
+	if (IS_ERR(new_ns->root)) {
 		up_write(&namespace_sem);
 		kfree(new_ns);
-		return ERR_PTR(-ENOMEM);;
+		return ERR_CAST(new_ns->root);
 	}
 	spin_lock(&vfsmount_lock);
 	list_add_tail(&new_ns->list, &new_ns->root->mnt_list);
Index: linux/fs/pnode.c
===================================================================
--- linux.orig/fs/pnode.c	2008-02-04 23:47:47.000000000 +0100
+++ linux/fs/pnode.c	2008-02-04 23:47:56.000000000 +0100
@@ -224,8 +224,9 @@ int propagate_mnt(struct vfsmount *dest_
 
 		source =  get_source(m, prev_dest_mnt, prev_src_mnt, &type);
 
-		if (!(child = copy_tree(source, source->mnt_root, type))) {
-			ret = -ENOMEM;
+		child = copy_tree(source, source->mnt_root, type);
+		if (IS_ERR(child)) {
+			ret = PTR_ERR(child);
 			list_splice(tree_list, tmp_list.prev);
 			goto out;
 		}

--


* [patch 04/10] unprivileged mounts: account user mounts
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
                   ` (2 preceding siblings ...)
  2008-02-05 21:36 ` [patch 03/10] unprivileged mounts: propagate error values from clone_mnt Miklos Szeredi
@ 2008-02-05 21:36 ` Miklos Szeredi
  2008-02-05 21:36 ` [patch 05/10] unprivileged mounts: allow unprivileged bind mounts Miklos Szeredi
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

[-- Attachment #1: unprivileged-mounts-account-user-mounts.patch --]
[-- Type: text/plain, Size: 5802 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

Add sysctl variables for accounting and limiting the number of user
mounts.

The maximum number of user mounts is set to 1024 by default.  This
does not in itself enable user mounts; a mount must first be set to be
owned by a user.
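
The accounting can be pictured with a small user-space model.  It is
a sketch only: struct user_mount_counter, model_reserve() and
model_release() are hypothetical names, the capability is passed in as
a boolean, and the locking the kernel code does around the counter is
left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical counter pair modelling nr_user_mounts / max_user_mounts. */
struct user_mount_counter {
	int nr;   /* current number of user mounts */
	int max;  /* limit applied to unprivileged callers */
};

/*
 * Reserve one user mount.  Unprivileged callers get -EMFILE once the
 * limit is reached; a privileged caller may go over the limit.
 */
static int model_reserve(struct user_mount_counter *c, bool has_sys_admin)
{
	if (c->nr >= c->max && !has_sys_admin)
		return -EMFILE;

	c->nr++;
	return 0;
}

/* Give back one reservation, e.g. when a user mount is destroyed. */
static void model_release(struct user_mount_counter *c)
{
	assert(c->nr > 0);
	c->nr--;
}
```

Because the privileged path skips the limit, the count can end up
above the maximum, which is the situation the proc.txt text in this
patch describes.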

[akpm]
 - don't use enumerated sysctls

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

Index: linux/Documentation/filesystems/proc.txt
===================================================================
--- linux.orig/Documentation/filesystems/proc.txt	2008-02-04 23:47:47.000000000 +0100
+++ linux/Documentation/filesystems/proc.txt	2008-02-04 23:47:58.000000000 +0100
@@ -1052,6 +1052,15 @@ reaches aio-max-nr then io_setup will fa
 raising aio-max-nr does not result in the pre-allocation or re-sizing
 of any kernel data structures.
 
+nr_user_mounts and max_user_mounts
+----------------------------------
+
+These represent the number of "user" mounts and the maximum number of
+"user" mounts respectively.  User mounts may be created by
+unprivileged users.  User mounts may also be created with sysadmin
+privileges on behalf of a user, in which case nr_user_mounts may
+exceed max_user_mounts.
+
 2.2 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats
 -----------------------------------------------------------
 
Index: linux/fs/namespace.c
===================================================================
--- linux.orig/fs/namespace.c	2008-02-04 23:47:56.000000000 +0100
+++ linux/fs/namespace.c	2008-02-04 23:47:58.000000000 +0100
@@ -46,6 +46,9 @@ static struct list_head *mount_hashtable
 static struct kmem_cache *mnt_cache __read_mostly;
 static struct rw_semaphore namespace_sem;
 
+int nr_user_mounts;
+int max_user_mounts = 1024;
+
 /* /sys/fs */
 struct kobject *fs_kobj;
 EXPORT_SYMBOL_GPL(fs_kobj);
@@ -511,21 +514,70 @@ static struct vfsmount *skip_mnt_tree(st
 	return p;
 }
 
-static void set_mnt_user(struct vfsmount *mnt)
+static void dec_nr_user_mounts(void)
+{
+	spin_lock(&vfsmount_lock);
+	nr_user_mounts--;
+	spin_unlock(&vfsmount_lock);
+}
+
+static int reserve_user_mount(void)
+{
+	int err = 0;
+
+	spin_lock(&vfsmount_lock);
+	/*
+	 * EMFILE was error returned by mount(2) in the old days, when
+	 * the mount count was limited.  Reuse this error value to
+	 * mean, that the maximum number of user mounts has been
+	 * exceeded.
+	 */
+	if (nr_user_mounts >= max_user_mounts && !capable(CAP_SYS_ADMIN))
+		err = -EMFILE;
+	else
+		nr_user_mounts++;
+	spin_unlock(&vfsmount_lock);
+	return err;
+}
+
+static void __set_mnt_user(struct vfsmount *mnt)
 {
 	WARN_ON(mnt->mnt_flags & MNT_USER);
 	mnt->mnt_uid = current->fsuid;
 	mnt->mnt_flags |= MNT_USER;
 }
 
+static void set_mnt_user(struct vfsmount *mnt)
+{
+	__set_mnt_user(mnt);
+	spin_lock(&vfsmount_lock);
+	nr_user_mounts++;
+	spin_unlock(&vfsmount_lock);
+}
+
+static void clear_mnt_user(struct vfsmount *mnt)
+{
+	if (mnt->mnt_flags & MNT_USER) {
+		mnt->mnt_uid = 0;
+		mnt->mnt_flags &= ~MNT_USER;
+		dec_nr_user_mounts();
+	}
+}
+
 static struct vfsmount *clone_mnt(struct vfsmount *old, struct dentry *root,
 					int flag)
 {
 	struct super_block *sb = old->mnt_sb;
-	struct vfsmount *mnt = alloc_vfsmnt(old->mnt_devname);
+	struct vfsmount *mnt;
 
+	if (flag & CL_SETUSER) {
+		int err = reserve_user_mount();
+		if (err)
+			return ERR_PTR(err);
+	}
+	mnt = alloc_vfsmnt(old->mnt_devname);
 	if (!mnt)
-		return ERR_PTR(-ENOMEM);
+		goto alloc_failed;
 
 	mnt->mnt_flags = old->mnt_flags;
 	atomic_inc(&sb->s_active);
@@ -537,7 +589,7 @@ static struct vfsmount *clone_mnt(struct
 	/* don't copy the MNT_USER flag */
 	mnt->mnt_flags &= ~MNT_USER;
 	if (flag & CL_SETUSER)
-		set_mnt_user(mnt);
+		__set_mnt_user(mnt);
 
 	if (flag & CL_SLAVE) {
 		list_add(&mnt->mnt_slave, &old->mnt_slave_list);
@@ -562,6 +614,11 @@ static struct vfsmount *clone_mnt(struct
 		spin_unlock(&vfsmount_lock);
 	}
 	return mnt;
+
+ alloc_failed:
+	if (flag & CL_SETUSER)
+		dec_nr_user_mounts();
+	return ERR_PTR(-ENOMEM);
 }
 
 static inline void __mntput(struct vfsmount *mnt)
@@ -577,6 +634,7 @@ static inline void __mntput(struct vfsmo
 	 */
 	WARN_ON(atomic_read(&mnt->__mnt_writers));
 	dput(mnt->mnt_root);
+	clear_mnt_user(mnt);
 	free_vfsmnt(mnt);
 	deactivate_super(sb);
 }
@@ -1446,6 +1504,7 @@ static int do_remount(struct nameidata *
 	else
 		err = do_remount_sb(sb, flags, data, 0);
 	if (!err) {
+		clear_mnt_user(nd->path.mnt);
 		nd->path.mnt->mnt_flags = mnt_flags;
 		if (flags & MS_SETUSER)
 			set_mnt_user(nd->path.mnt);
Index: linux/include/linux/fs.h
===================================================================
--- linux.orig/include/linux/fs.h	2008-02-04 23:47:50.000000000 +0100
+++ linux/include/linux/fs.h	2008-02-04 23:47:58.000000000 +0100
@@ -50,6 +50,9 @@ extern struct inodes_stat_t inodes_stat;
 
 extern int leases_enable, lease_break_time;
 
+extern int nr_user_mounts;
+extern int max_user_mounts;
+
 #ifdef CONFIG_DNOTIFY
 extern int dir_notify_enable;
 #endif
Index: linux/kernel/sysctl.c
===================================================================
--- linux.orig/kernel/sysctl.c	2008-02-04 23:47:47.000000000 +0100
+++ linux/kernel/sysctl.c	2008-02-04 23:47:58.000000000 +0100
@@ -1302,6 +1302,22 @@ static struct ctl_table fs_table[] = {
 #endif	
 #endif
 	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "nr_user_mounts",
+		.data		= &nr_user_mounts,
+		.maxlen		= sizeof(int),
+		.mode		= 0444,
+		.proc_handler	= &proc_dointvec,
+	},
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "max_user_mounts",
+		.data		= &max_user_mounts,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
+	{
 		.ctl_name	= KERN_SETUID_DUMPABLE,
 		.procname	= "suid_dumpable",
 		.data		= &suid_dumpable,

--


* [patch 05/10] unprivileged mounts: allow unprivileged bind mounts
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
                   ` (3 preceding siblings ...)
  2008-02-05 21:36 ` [patch 04/10] unprivileged mounts: account user mounts Miklos Szeredi
@ 2008-02-05 21:36 ` Miklos Szeredi
  2008-02-05 21:36 ` [patch 06/10] unprivileged mounts: allow unprivileged mounts Miklos Szeredi
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

[-- Attachment #1: unprivileged-mounts-allow-unprivileged-bind-mounts.patch --]
[-- Type: text/plain, Size: 2541 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

Allow unprivileged users to create bind mounts if the following conditions
are met:

  - mountpoint is not a symlink
  - parent mount is owned by the user
  - the number of user mounts is below the maximum

Unprivileged mounts imply MS_SETUSER, and will also have the "nosuid" and
"nodev" mount flags set.

In particular, if the mounting process doesn't have the CAP_SETUID capability,
then the "nosuid" flag will be added, and if it doesn't have the CAP_MKNOD
capability, then the "nodev" flag will be added.
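
How the forced flags follow from the missing capabilities can be
sketched as a pure function.  This is an illustration, not the kernel
code: model_user_mount_flags() is a hypothetical name, the two
capability checks are passed in as booleans, and the MNT_NOSUID and
MNT_NODEV values are assumed to be those in include/linux/mount.h.

```c
#include <assert.h>
#include <stdbool.h>

#define MNT_NOSUID  0x01   /* assumed value from include/linux/mount.h */
#define MNT_NODEV   0x02   /* assumed value from include/linux/mount.h */
#define MNT_USER    0x400  /* value added by patch 01 */

/*
 * Return the mount flags after marking a mount as user owned.
 * Each restriction is added only when the matching capability is absent.
 */
static int model_user_mount_flags(int flags, bool has_cap_setuid,
				  bool has_cap_mknod)
{
	/* like the WARN_ON in the patch: the mount must not be owned yet */
	assert((flags & MNT_USER) == 0);

	flags |= MNT_USER;
	if (!has_cap_setuid)
		flags |= MNT_NOSUID;
	if (!has_cap_mknod)
		flags |= MNT_NODEV;
	return flags;
}
```

The two restrictions are independent in this model, so a caller that
holds only one of the two capabilities gets only the other flag forced.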

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

Index: linux/fs/namespace.c
===================================================================
--- linux.orig/fs/namespace.c	2008-02-04 23:47:58.000000000 +0100
+++ linux/fs/namespace.c	2008-02-04 23:48:00.000000000 +0100
@@ -545,6 +545,11 @@ static void __set_mnt_user(struct vfsmou
 	WARN_ON(mnt->mnt_flags & MNT_USER);
 	mnt->mnt_uid = current->fsuid;
 	mnt->mnt_flags |= MNT_USER;
+
+	if (!capable(CAP_SETUID))
+		mnt->mnt_flags |= MNT_NOSUID;
+	if (!capable(CAP_MKNOD))
+		mnt->mnt_flags |= MNT_NODEV;
 }
 
 static void set_mnt_user(struct vfsmount *mnt)
@@ -1160,22 +1165,26 @@ asmlinkage long sys_oldumount(char __use
 
 #endif
 
-static int mount_is_safe(struct nameidata *nd)
+/*
+ * Conditions for unprivileged mounts are:
+ * - mountpoint is not a symlink
+ * - mountpoint is in a mount owned by the user
+ */
+static bool permit_mount(struct nameidata *nd, int *flags)
 {
+	struct inode *inode = nd->path.dentry->d_inode;
+
 	if (capable(CAP_SYS_ADMIN))
-		return 0;
-	return -EPERM;
-#ifdef notyet
-	if (S_ISLNK(nd->path.dentry->d_inode->i_mode))
-		return -EPERM;
-	if (nd->path.dentry->d_inode->i_mode & S_ISVTX) {
-		if (current->uid != nd->path.dentry->d_inode->i_uid)
-			return -EPERM;
-	}
-	if (vfs_permission(nd, MAY_WRITE))
-		return -EPERM;
-	return 0;
-#endif
+		return true;
+
+	if (S_ISLNK(inode->i_mode))
+		return false;
+
+	if (!is_mount_owner(nd->path.mnt, current->fsuid))
+		return false;
+
+	*flags |= MS_SETUSER;
+	return true;
 }
 
 static int lives_below_in_same_fs(struct dentry *d, struct dentry *dentry)
@@ -1419,9 +1428,10 @@ static int do_loopback(struct nameidata 
 	int clone_fl;
 	struct nameidata old_nd;
 	struct vfsmount *mnt = NULL;
-	int err = mount_is_safe(nd);
-	if (err)
-		return err;
+	int err;
+
+	if (!permit_mount(nd, &flags))
+		return -EPERM;
 	if (!old_name || !*old_name)
 		return -EINVAL;
 	err = path_lookup(old_name, LOOKUP_FOLLOW, &old_nd);

--


* [patch 06/10] unprivileged mounts: allow unprivileged mounts
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
                   ` (4 preceding siblings ...)
  2008-02-05 21:36 ` [patch 05/10] unprivileged mounts: allow unprivileged bind mounts Miklos Szeredi
@ 2008-02-05 21:36 ` Miklos Szeredi
  2008-02-05 21:36 ` [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property Miklos Szeredi
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

[-- Attachment #1: unprivileged-mounts-allow-unprivileged-mounts.patch --]
[-- Type: text/plain, Size: 6193 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

For "safe" filesystems allow unprivileged mounting and forced
unmounting.

A filesystem type is considered "safe" if mounting it by an
unprivileged user cannot cause a security problem.  This is somewhat
subjective, so setting this property is left to userspace (implemented
in the next patch).

Since most filesystems haven't been designed with unprivileged
mounting in mind, a thorough audit is recommended before setting this
property.

Make this a separate integer member in 'struct file_system_type'
instead of a flag, since that is easier to handle by sysctl code.

Move subtype handling from do_kern_mount() into do_new_mount().  All
other callers are kernel-internal and do not need subtype support.
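
The subtype handling that moves into do_new_mount() splits a type
string at the first '.'.  The parsing rule can be sketched in user
space as below.  model_subtype() is a hypothetical helper; unlike the
kernel function it does not duplicate the string, it only returns a
pointer into the argument.  The type string "fuse.sshfs" mentioned
afterwards is just an example input.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Find the subtype part of "type.subtype".
 *  - no '.' in the string: the subtype is the empty string
 *  - '.' followed by nothing: rejected with -EINVAL
 *  - otherwise: the subtype is everything after the first '.'
 */
static int model_subtype(const char *fstype, const char **subtype)
{
	const char *dot;

	assert(fstype != NULL && subtype != NULL);

	dot = strchr(fstype, '.');
	if (!dot) {
		*subtype = "";
		return 0;
	}
	if (dot[1] == '\0')
		return -EINVAL;

	*subtype = dot + 1;
	return 0;
}
```

With this rule "fuse.sshfs" gives the subtype "sshfs", a plain type
name gives an empty subtype, and a trailing dot is an error.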

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

Index: linux/fs/namespace.c
===================================================================
--- linux.orig/fs/namespace.c	2008-02-04 23:48:00.000000000 +0100
+++ linux/fs/namespace.c	2008-02-04 23:48:02.000000000 +0100
@@ -1105,14 +1105,16 @@ static bool is_mount_owner(struct vfsmou
 /*
  * umount is permitted for
  *  - sysadmin
- *  - mount owner, if not forced umount
+ *  - mount owner
+ *    o if not forced umount,
+ *    o if forced umount, and filesystem is "safe"
  */
 static bool permit_umount(struct vfsmount *mnt, int flags)
 {
 	if (capable(CAP_SYS_ADMIN))
 		return true;
 
-	if (flags & MNT_FORCE)
+	if ((flags & MNT_FORCE) && !(mnt->mnt_sb->s_type->fs_safe))
 		return false;
 
 	return is_mount_owner(mnt, current->fsuid);
@@ -1170,13 +1172,17 @@ asmlinkage long sys_oldumount(char __use
  * - mountpoint is not a symlink
  * - mountpoint is in a mount owned by the user
  */
-static bool permit_mount(struct nameidata *nd, int *flags)
+static bool permit_mount(struct nameidata *nd, struct file_system_type *type,
+			 int *flags)
 {
 	struct inode *inode = nd->path.dentry->d_inode;
 
 	if (capable(CAP_SYS_ADMIN))
 		return true;
 
+	if (type && !type->fs_safe)
+		return false;
+
 	if (S_ISLNK(inode->i_mode))
 		return false;
 
@@ -1430,7 +1436,7 @@ static int do_loopback(struct nameidata 
 	struct vfsmount *mnt = NULL;
 	int err;
 
-	if (!permit_mount(nd, &flags))
+	if (!permit_mount(nd, NULL, &flags))
 		return -EPERM;
 	if (!old_name || !*old_name)
 		return -EINVAL;
@@ -1611,30 +1617,76 @@ out:
 	return err;
 }
 
+static struct vfsmount *fs_set_subtype(struct vfsmount *mnt, const char *fstype)
+{
+	int err;
+	const char *subtype = strchr(fstype, '.');
+	if (subtype) {
+		subtype++;
+		err = -EINVAL;
+		if (!subtype[0])
+			goto err;
+	} else
+		subtype = "";
+
+	mnt->mnt_sb->s_subtype = kstrdup(subtype, GFP_KERNEL);
+	err = -ENOMEM;
+	if (!mnt->mnt_sb->s_subtype)
+		goto err;
+	return mnt;
+
+ err:
+	mntput(mnt);
+	return ERR_PTR(err);
+}
+
 /*
  * create a new mount for userspace and request it to be added into the
  * namespace's tree
  */
-static int do_new_mount(struct nameidata *nd, char *type, int flags,
+static int do_new_mount(struct nameidata *nd, char *fstype, int flags,
 			int mnt_flags, char *name, void *data)
 {
+	int err;
 	struct vfsmount *mnt;
+	struct file_system_type *type;
 
-	if (!type || !memchr(type, 0, PAGE_SIZE))
+	if (!fstype || !memchr(fstype, 0, PAGE_SIZE))
 		return -EINVAL;
 
-	/* we need capabilities... */
-	if (!capable(CAP_SYS_ADMIN))
-		return -EPERM;
-
-	mnt = do_kern_mount(type, flags & ~MS_SETUSER, name, data);
-	if (IS_ERR(mnt))
+	type = get_fs_type(fstype);
+	if (!type)
+		return -ENODEV;
+
+	err = -EPERM;
+	if (!permit_mount(nd, type, &flags))
+		goto out_put_filesystem;
+
+	if (flags & MS_SETUSER) {
+		err = reserve_user_mount();
+		if (err)
+			goto out_put_filesystem;
+	}
+
+	mnt = vfs_kern_mount(type, flags & ~MS_SETUSER, name, data);
+	if (!IS_ERR(mnt) && (type->fs_flags & FS_HAS_SUBTYPE) &&
+	    !mnt->mnt_sb->s_subtype)
+		mnt = fs_set_subtype(mnt, fstype);
+	put_filesystem(type);
+	if (IS_ERR(mnt)) {
+		if (flags & MS_SETUSER)
+			dec_nr_user_mounts();
 		return PTR_ERR(mnt);
+	}
 
 	if (flags & MS_SETUSER)
-		set_mnt_user(mnt);
+		__set_mnt_user(mnt);
 
 	return do_add_mount(mnt, nd, mnt_flags, NULL);
+
+ out_put_filesystem:
+	put_filesystem(type);
+	return err;
 }
 
 /*
@@ -1665,7 +1717,7 @@ int do_add_mount(struct vfsmount *newmnt
 	if (S_ISLNK(newmnt->mnt_root->d_inode->i_mode))
 		goto unlock;
 
-	/* MNT_USER was set earlier */
+	/* some flags may have been set earlier */
 	newmnt->mnt_flags |= mnt_flags;
 	if ((err = graft_tree(newmnt, nd)))
 		goto unlock;
Index: linux/fs/super.c
===================================================================
--- linux.orig/fs/super.c	2008-02-04 23:47:46.000000000 +0100
+++ linux/fs/super.c	2008-02-04 23:48:02.000000000 +0100
@@ -925,29 +925,6 @@ out:
 
 EXPORT_SYMBOL_GPL(vfs_kern_mount);
 
-static struct vfsmount *fs_set_subtype(struct vfsmount *mnt, const char *fstype)
-{
-	int err;
-	const char *subtype = strchr(fstype, '.');
-	if (subtype) {
-		subtype++;
-		err = -EINVAL;
-		if (!subtype[0])
-			goto err;
-	} else
-		subtype = "";
-
-	mnt->mnt_sb->s_subtype = kstrdup(subtype, GFP_KERNEL);
-	err = -ENOMEM;
-	if (!mnt->mnt_sb->s_subtype)
-		goto err;
-	return mnt;
-
- err:
-	mntput(mnt);
-	return ERR_PTR(err);
-}
-
 struct vfsmount *
 do_kern_mount(const char *fstype, int flags, const char *name, void *data)
 {
@@ -956,9 +933,6 @@ do_kern_mount(const char *fstype, int fl
 	if (!type)
 		return ERR_PTR(-ENODEV);
 	mnt = vfs_kern_mount(type, flags, name, data);
-	if (!IS_ERR(mnt) && (type->fs_flags & FS_HAS_SUBTYPE) &&
-	    !mnt->mnt_sb->s_subtype)
-		mnt = fs_set_subtype(mnt, fstype);
 	put_filesystem(type);
 	return mnt;
 }
Index: linux/include/linux/fs.h
===================================================================
--- linux.orig/include/linux/fs.h	2008-02-04 23:47:58.000000000 +0100
+++ linux/include/linux/fs.h	2008-02-04 23:48:02.000000000 +0100
@@ -1437,6 +1437,7 @@ int sync_inode(struct inode *inode, stru
 struct file_system_type {
 	const char *name;
 	int fs_flags;
+	int fs_safe;
 	int (*get_sb) (struct file_system_type *, int,
 		       const char *, void *, struct vfsmount *);
 	void (*kill_sb) (struct super_block *);

--

* [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
                   ` (5 preceding siblings ...)
  2008-02-05 21:36 ` [patch 06/10] unprivileged mounts: allow unprivileged mounts Miklos Szeredi
@ 2008-02-05 21:36 ` Miklos Szeredi
  2008-02-06 20:21   ` Serge E. Hallyn
  2008-02-07 15:33   ` Aneesh Kumar K.V
  2008-02-05 21:36 ` [patch 08/10] unprivileged mounts: make fuse safe Miklos Szeredi
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC (permalink / raw)
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

[-- Attachment #1: unprivileged-mounts-add-sysctl-tunable-for-safe-property.patch --]
[-- Type: text/plain, Size: 5007 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

Add the following:

  /proc/sys/fs/types/${FS_TYPE}/usermount_safe
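
As an illustration of how userspace might consume the tunable, here is
a small sketch that builds the path for a given type and reads the
value.  The helper names are made up for this example, and the file
only exists on a kernel carrying this patch:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the tunable's path for one filesystem
 * type.  Returns 0 on success, -1 on a bad name or a short buffer.
 */
static int usermount_safe_path(const char *type, char *buf, size_t len)
{
	int n;

	/* A type name must be a single, non-empty path component. */
	if (!type || !*type || strchr(type, '/'))
		return -1;

	n = snprintf(buf, len, "/proc/sys/fs/types/%s/usermount_safe", type);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output error or truncated */
	return 0;
}

/*
 * Hypothetical helper: read the current value into *val.  Returns 0
 * on success, -1 if the file is absent, unreadable or malformed.
 */
static int read_usermount_safe(const char *type, int *val)
{
	char path[256];
	FILE *f;
	int ok;

	if (usermount_safe_path(type, path, sizeof(path)) != 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%d", val) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

On a kernel without this series read_usermount_safe() simply fails,
which a caller could treat the same as "not safe".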

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---

Index: linux/fs/filesystems.c
===================================================================
--- linux.orig/fs/filesystems.c	2008-02-04 23:47:46.000000000 +0100
+++ linux/fs/filesystems.c	2008-02-04 23:48:04.000000000 +0100
@@ -12,6 +12,7 @@
 #include <linux/kmod.h>
 #include <linux/init.h>
 #include <linux/module.h>
+#include <linux/sysctl.h>
 #include <asm/uaccess.h>
 
 /*
@@ -51,6 +52,57 @@ static struct file_system_type **find_fi
 	return p;
 }
 
+#define MAX_FILESYSTEM_VARS 1
+
+struct filesystem_sysctl_table {
+	struct ctl_table_header *header;
+	struct ctl_table table[MAX_FILESYSTEM_VARS + 1];
+};
+
+/*
+ * Create /proc/sys/fs/types/${FSNAME} directory with per fs-type tunables.
+ */
+static int filesystem_sysctl_register(struct file_system_type *fs)
+{
+	struct filesystem_sysctl_table *t;
+	struct ctl_path path[] = {
+		{ .procname = "fs", .ctl_name = CTL_FS },
+		{ .procname = "types", .ctl_name = CTL_UNNUMBERED },
+		{ .procname = fs->name, .ctl_name = CTL_UNNUMBERED },
+		{ }
+	};
+
+	t = kzalloc(sizeof(*t), GFP_KERNEL);
+	if (!t)
+		return -ENOMEM;
+
+
+	t->table[0].ctl_name = CTL_UNNUMBERED;
+	t->table[0].procname = "usermount_safe";
+	t->table[0].maxlen = sizeof(int);
+	t->table[0].data = &fs->fs_safe;
+	t->table[0].mode = 0644;
+	t->table[0].proc_handler = &proc_dointvec;
+
+	t->header = register_sysctl_paths(path, t->table);
+	if (!t->header) {
+		kfree(t);
+		return -ENOMEM;
+	}
+
+	fs->sysctl_table = t;
+
+	return 0;
+}
+
+static void filesystem_sysctl_unregister(struct file_system_type *fs)
+{
+	struct filesystem_sysctl_table *t = fs->sysctl_table;
+
+	unregister_sysctl_table(t->header);
+	kfree(t);
+}
+
 /**
  *	register_filesystem - register a new filesystem
  *	@fs: the file system structure
@@ -80,6 +132,13 @@ int register_filesystem(struct file_syst
 	else
 		*p = fs;
 	write_unlock(&file_systems_lock);
+
+	if (res == 0) {
+		res = filesystem_sysctl_register(fs);
+		if (res != 0)
+			unregister_filesystem(fs);
+	}
+
 	return res;
 }
 
@@ -108,6 +167,7 @@ int unregister_filesystem(struct file_sy
 			*tmp = fs->next;
 			fs->next = NULL;
 			write_unlock(&file_systems_lock);
+			filesystem_sysctl_unregister(fs);
 			return 0;
 		}
 		tmp = &(*tmp)->next;
Index: linux/include/linux/fs.h
===================================================================
--- linux.orig/include/linux/fs.h	2008-02-04 23:48:02.000000000 +0100
+++ linux/include/linux/fs.h	2008-02-04 23:48:04.000000000 +0100
@@ -1444,6 +1444,7 @@ struct file_system_type {
 	struct module *owner;
 	struct file_system_type * next;
 	struct list_head fs_supers;
+	struct filesystem_sysctl_table *sysctl_table;
 
 	struct lock_class_key s_lock_key;
 	struct lock_class_key s_umount_key;
Index: linux/Documentation/filesystems/proc.txt
===================================================================
--- linux.orig/Documentation/filesystems/proc.txt	2008-02-04 23:47:58.000000000 +0100
+++ linux/Documentation/filesystems/proc.txt	2008-02-04 23:48:04.000000000 +0100
@@ -44,6 +44,7 @@ Table of Contents
   2.14	/proc/<pid>/io - Display the IO accounting fields
   2.15	/proc/<pid>/coredump_filter - Core dump filtering settings
   2.16	/proc/<pid>/mountinfo - Information about mounts
+  2.17	/proc/sys/fs/types - File system type specific parameters
 
 ------------------------------------------------------------------------------
 Preface
@@ -2392,4 +2393,34 @@ For more information see:
   Documentation/filesystems/sharedsubtree.txt
 
 
+2.17 /proc/sys/fs/types/ - File system type specific parameters
+----------------------------------------------------------------
+
+There's a separate directory /proc/sys/fs/types/<type>/ for each
+filesystem type, containing the following files:
+
+usermount_safe
+--------------
+
+Setting this to non-zero will allow filesystems of this type to be
+mounted by unprivileged users (note that there are other
+prerequisites as well).
+
+Fuse has been designed to be as safe as possible, and some
+distributions already ship with unprivileged fuse mounts enabled by
+default.  There are still some situations (multi-user systems with
+untrusted users in particular), where enabling this for fuse might not
+be appropriate.  For more details, see Documentation/filesystems/fuse.txt.
+
+Procfs is also safe, but unprivileged mounting of it is not usually
+necessary (bind mounting is equivalent).
+
+Most other filesystems are unsafe.  Here are just some of the issues
+that must be resolved before a filesystem can be declared safe:
+
+ - no strict input checking (buffer overruns, directory loops, etc)
+ - network filesystem deadlocks when mounting from localhost
+ - no permission checking when opening the device
+ - changing mount options when mounting a new instance of a filesystem
+
 ------------------------------------------------------------------------------

--

* [patch 08/10] unprivileged mounts: make fuse safe
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
                   ` (6 preceding siblings ...)
  2008-02-05 21:36 ` [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property Miklos Szeredi
@ 2008-02-05 21:36 ` Miklos Szeredi
  2008-02-05 21:36 ` [patch 09/10] unprivileged mounts: propagation: inherit owner from parent Miklos Szeredi
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC (permalink / raw)
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

[-- Attachment #1: unprivileged-mounts-allow-unprivileged-fuse-mounts.patch --]
[-- Type: text/plain, Size: 6274 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

Don't require the "user_id=" and "group_id=" options for unprivileged mounts,
but if they are present, verify them for sanity.

Disallow the "allow_other" option for unprivileged mounts.

Document new way of enabling unprivileged mounts for fuse.

Document problems with unprivileged mounts.
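
The "user_id="/"group_id=" handling can be modelled in userspace
roughly as follows.  This is a sketch only: the struct, the function
name and the use of strtok()/strtol() are assumptions of the example,
whereas the kernel code uses strsep() and match_int() and parses many
more options:

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the relevant fields of the mount data. */
struct mount_ids {
	long user_id, group_id;
	bool user_id_present, group_id_present;
};

/*
 * Model of the check: unprivileged callers get their own uid/gid
 * preset, and an explicit option is accepted only if it matches the
 * value already recorded.  'opts' is modified in place.
 */
static bool model_parse_ids(char *opts, bool privileged, long uid, long gid,
			    struct mount_ids *d)
{
	char *p;

	memset(d, 0, sizeof(*d));
	if (!privileged) {
		d->user_id = uid;
		d->user_id_present = true;
		d->group_id = gid;
		d->group_id_present = true;
	}

	for (p = strtok(opts, ","); p; p = strtok(NULL, ",")) {
		if (strncmp(p, "user_id=", 8) == 0) {
			long v = strtol(p + 8, NULL, 10);

			if (d->user_id_present && d->user_id != v)
				return false;
			d->user_id = v;
			d->user_id_present = true;
		} else if (strncmp(p, "group_id=", 9) == 0) {
			long v = strtol(p + 9, NULL, 10);

			if (d->group_id_present && d->group_id != v)
				return false;
			d->group_id = v;
			d->group_id_present = true;
		}
		/* all other options are ignored by this model */
	}
	return true;
}
```

An unprivileged caller with uid 1000 may thus pass "user_id=1000" or
omit it, but "user_id=0" is rejected.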

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

Index: linux/fs/fuse/inode.c
===================================================================
--- linux.orig/fs/fuse/inode.c	2008-02-04 23:47:46.000000000 +0100
+++ linux/fs/fuse/inode.c	2008-02-04 23:48:06.000000000 +0100
@@ -359,6 +359,19 @@ static int parse_fuse_opt(char *opt, str
 	d->max_read = ~0;
 	d->blksize = FUSE_DEFAULT_BLKSIZE;
 
+	/*
+	 * For unprivileged mounts use current uid/gid.  Still allow
+	 * "user_id" and "group_id" options for compatibility, but
+	 * only if they match these values.
+	 */
+	if (!capable(CAP_SYS_ADMIN)) {
+		d->user_id = current->uid;
+		d->user_id_present = 1;
+		d->group_id = current->gid;
+		d->group_id_present = 1;
+
+	}
+
 	while ((p = strsep(&opt, ",")) != NULL) {
 		int token;
 		int value;
@@ -387,6 +400,8 @@ static int parse_fuse_opt(char *opt, str
 		case OPT_USER_ID:
 			if (match_int(&args[0], &value))
 				return 0;
+			if (d->user_id_present && d->user_id != value)
+				return 0;
 			d->user_id = value;
 			d->user_id_present = 1;
 			break;
@@ -394,6 +409,8 @@ static int parse_fuse_opt(char *opt, str
 		case OPT_GROUP_ID:
 			if (match_int(&args[0], &value))
 				return 0;
+			if (d->group_id_present && d->group_id != value)
+				return 0;
 			d->group_id = value;
 			d->group_id_present = 1;
 			break;
@@ -603,6 +620,10 @@ static int fuse_fill_super(struct super_
 	if (!parse_fuse_opt((char *) data, &d, is_bdev))
 		return -EINVAL;
 
+	/* This is a privileged option */
+	if ((d.flags & FUSE_ALLOW_OTHER) && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
 	if (is_bdev) {
 #ifdef CONFIG_BLOCK
 		if (!sb_set_blocksize(sb, d.blksize))
Index: linux/Documentation/filesystems/fuse.txt
===================================================================
--- linux.orig/Documentation/filesystems/fuse.txt	2008-01-24 23:58:37.000000000 +0100
+++ linux/Documentation/filesystems/fuse.txt	2008-02-05 19:34:24.000000000 +0100
@@ -215,11 +215,87 @@ the filesystem.  There are several ways 
   - Abort filesystem through the FUSE control filesystem.  Most
     powerful method, always works.
 
-How do non-privileged mounts work?
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Unprivileged fuse mounts
+~~~~~~~~~~~~~~~~~~~~~~~~
 
-Since the mount() system call is a privileged operation, a helper
-program (fusermount) is needed, which is installed setuid root.
+Possible problems with unprivileged fuse mounts
+-----------------------------------------------
+
+FUSE was designed from the beginning to be safe for unprivileged
+users.  This has also been verified in practice over many years, with
+some distributions enabling unprivileged FUSE mounts by default.
+
+However, there are cases when unprivileged mounting of a fuse filesystem
+may be problematic, particularly for multi-user systems with untrusted
+users.  So here are a few words of warning:
+
+Due to the design of the process freezer, a hanging (due to network
+problems, etc) or malicious filesystem may prevent suspend to RAM
+or hibernation from succeeding.  This is not actually unique to FUSE, as
+any hanging network filesystem will have the same effect.
+
+It is not always possible to use kill(2) (not even with SIGKILL) to
+terminate a process using a FUSE filesystem (see section "Interrupting
+filesystem operations" above).  As a special case of the above,
+killing a self-deadlocked FUSE process is not possible, and even
+killall5 will not terminate it.
+
+If the above could pose a threat to the system, it is recommended
+that unprivileged fuse mounts not be enabled.
+
+Ways of enabling user mounts
+----------------------------
+
+Now there are two different ways of allowing unprivileged fuse mounts:
+
+ 1) new way: unprivileged mount syscall
+
+ 2) old way: suid-root fusermount utility
+
+Unprivileged mount syscall
+--------------------------
+
+To enable this do
+
+  echo 1 > /proc/sys/fs/types/fuse/usermount_safe
+
+or add this line to /etc/sysctl.conf:
+
+  fs.types.fuse.usermount_safe = 1
+
+More information can be found in Documentation/filesystems/proc.txt
+under the /proc/sys/fs/types/ heading.  Also see the description of
+nr_user_mounts and max_user_mounts under /proc/sys/fs.
+
+This doesn't in itself allow users to create mounts; first root needs
+to create a mount owned by the user, under which the user can create
+submounts.
+
+For example, to enable submounts under /home/xyz/mnt do:
+
+  mount --bind -ouser=xyz /home/xyz/mnt /home/xyz/mnt
+
+or add this line to /etc/fstab:
+
+  /home/xyz/mnt  /home/xyz/mnt  none  bind,user=xyz  0 0
+
+And finally, make sure that the user has read and write permissions
+on /dev/fuse (installing fuse should have already taken care of this):
+
+  chmod 0666 /dev/fuse
+
+or create a file under /etc/udev/rules.d/ containing:
+
+  KERNEL=="fuse", MODE="0666"
+
+After this, mounting fuse filesystems under ~xyz/mnt should work, even
+if fusermount is not installed setuid-root.
+
+Suid-root fusermount utility
+----------------------------
+
+[Some of the details described here apply to the new, unprivileged
+mount system call as well].
 
 The implication of providing non-privileged mounts is that the mount
 owner must not be able to use this capability to compromise the
@@ -235,7 +311,7 @@ system.  Obvious requirements arising fr
     other users' or the super user's processes
 
 How are requirements fulfilled?
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- - - - - - - - - - - - - - - -
 
  A) The mount owner could gain elevated privileges by either:
 
@@ -300,7 +376,7 @@ How are requirements fulfilled?
 	filesystem, since SIGSTOP can be used to get a similar effect.
 
 I think these limitations are unacceptable?
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- - - - - - - - - - - - - - - - - - - - - -
 
 If a sysadmin trusts the users enough, or can ensure through other
 measures, that system processes will never enter non-privileged

--

* [patch 09/10] unprivileged mounts: propagation: inherit owner from parent
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
                   ` (7 preceding siblings ...)
  2008-02-05 21:36 ` [patch 08/10] unprivileged mounts: make fuse safe Miklos Szeredi
@ 2008-02-05 21:36 ` Miklos Szeredi
  2008-02-05 21:36 ` [patch 10/10] unprivileged mounts: add "no submounts" flag Miklos Szeredi
  2008-02-15  6:21 ` [patch 00/10] mount ownership and unprivileged mount syscall (v8) Andrew Morton
  10 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC (permalink / raw)
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

[-- Attachment #1: unprivileged-mounts-propagation-inherit-owner-from-parent.patch --]
[-- Type: text/plain, Size: 7452 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

On mount propagation, let the owner of the clone be inherited from the
parent into which it has been propagated.

If the parent has the "nosuid" flag, set this flag for the child as
well.  This is needed for the suid-less namespace (use case #2 in the
first patch header), where all mounts are owned by the user and have
the nosuid flag set.  In this case the propagated mount needs to have
nosuid, otherwise a suid executable may be misused by the user.

Similar treatment is not needed for "nodev", because devices can't be
abused this way: the user is not able to gain privileges to devices by
rearranging the mount namespace.
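
The flag computation done in propagate_mnt() can be sketched as a pure
function.  This is a userspace model only: the MODEL_ names and the
struct are invented for the example, and the value used for
MODEL_MNT_USER is a placeholder, since MNT_USER is defined by an
earlier patch in the series:

```c
#define MODEL_CL_SETUSER	0x40	/* CL_SETUSER in fs/pnode.h */
#define MODEL_CL_NOSUID		0x80	/* CL_NOSUID, added by this patch */
#define MODEL_MNT_NOSUID	0x01	/* MNT_NOSUID in linux/mount.h */
#define MODEL_MNT_USER		0x400	/* ASSUMED placeholder value */

/* Hypothetical pair of arguments later handed to copy_tree(). */
struct clone_args {
	int clflags;
	unsigned int owner;
};

/*
 * Model: given the flags and owner of the mount being propagated
 * into, compute the clone flags and owner for the propagated copy.
 */
static struct clone_args model_propagation_args(int parent_flags,
						unsigned int parent_uid,
						int clflags)
{
	struct clone_args a = { clflags, 0 };

	if (parent_flags & MODEL_MNT_USER) {
		/* The copy is owned by the owner of the parent. */
		a.clflags |= MODEL_CL_SETUSER;
		a.owner = parent_uid;

		/* A nosuid user mount must not gain suid children. */
		if (parent_flags & MODEL_MNT_NOSUID)
			a.clflags |= MODEL_CL_NOSUID;
	}
	return a;
}
```

Note that nosuid is only forced when the parent is a user mount; a
root-owned nosuid parent leaves the clone flags untouched.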

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

Index: linux/fs/namespace.c
===================================================================
--- linux.orig/fs/namespace.c	2008-02-04 23:48:02.000000000 +0100
+++ linux/fs/namespace.c	2008-02-04 23:48:08.000000000 +0100
@@ -540,10 +540,10 @@ static int reserve_user_mount(void)
 	return err;
 }
 
-static void __set_mnt_user(struct vfsmount *mnt)
+static void __set_mnt_user(struct vfsmount *mnt, uid_t owner)
 {
 	WARN_ON(mnt->mnt_flags & MNT_USER);
-	mnt->mnt_uid = current->fsuid;
+	mnt->mnt_uid = owner;
 	mnt->mnt_flags |= MNT_USER;
 
 	if (!capable(CAP_SETUID))
@@ -554,7 +554,7 @@ static void __set_mnt_user(struct vfsmou
 
 static void set_mnt_user(struct vfsmount *mnt)
 {
-	__set_mnt_user(mnt);
+	__set_mnt_user(mnt, current->fsuid);
 	spin_lock(&vfsmount_lock);
 	nr_user_mounts++;
 	spin_unlock(&vfsmount_lock);
@@ -570,7 +570,7 @@ static void clear_mnt_user(struct vfsmou
 }
 
 static struct vfsmount *clone_mnt(struct vfsmount *old, struct dentry *root,
-					int flag)
+					int flag, uid_t owner)
 {
 	struct super_block *sb = old->mnt_sb;
 	struct vfsmount *mnt;
@@ -594,7 +594,10 @@ static struct vfsmount *clone_mnt(struct
 	/* don't copy the MNT_USER flag */
 	mnt->mnt_flags &= ~MNT_USER;
 	if (flag & CL_SETUSER)
-		__set_mnt_user(mnt);
+		__set_mnt_user(mnt, owner);
+
+	if (flag & CL_NOSUID)
+		mnt->mnt_flags |= MNT_NOSUID;
 
 	if (flag & CL_SLAVE) {
 		list_add(&mnt->mnt_slave, &old->mnt_slave_list);
@@ -1205,7 +1208,7 @@ static int lives_below_in_same_fs(struct
 }
 
 struct vfsmount *copy_tree(struct vfsmount *mnt, struct dentry *dentry,
-					int flag)
+					int flag, uid_t owner)
 {
 	struct vfsmount *res, *p, *q, *r, *s;
 	struct nameidata nd;
@@ -1213,7 +1216,7 @@ struct vfsmount *copy_tree(struct vfsmou
 	if (!(flag & CL_COPY_ALL) && IS_MNT_UNBINDABLE(mnt))
 		return ERR_PTR(-EPERM);
 
-	res = q = clone_mnt(mnt, dentry, flag);
+	res = q = clone_mnt(mnt, dentry, flag, owner);
 	if (IS_ERR(q))
 		goto error;
 	q->mnt_mountpoint = mnt->mnt_mountpoint;
@@ -1235,7 +1238,7 @@ struct vfsmount *copy_tree(struct vfsmou
 			p = s;
 			nd.path.mnt = q;
 			nd.path.dentry = p->mnt_mountpoint;
-			q = clone_mnt(p, p->mnt_root, flag);
+			q = clone_mnt(p, p->mnt_root, flag, owner);
 			if (IS_ERR(q))
 				goto error;
 			spin_lock(&vfsmount_lock);
@@ -1260,7 +1263,7 @@ struct vfsmount *collect_mounts(struct v
 {
 	struct vfsmount *tree;
 	down_read(&namespace_sem);
-	tree = copy_tree(mnt, dentry, CL_COPY_ALL | CL_PRIVATE);
+	tree = copy_tree(mnt, dentry, CL_COPY_ALL | CL_PRIVATE, 0);
 	up_read(&namespace_sem);
 	return tree;
 }
@@ -1431,7 +1434,8 @@ static int do_change_type(struct nameida
  */
 static int do_loopback(struct nameidata *nd, char *old_name, int flags)
 {
-	int clone_fl;
+	int clone_fl = 0;
+	uid_t owner = 0;
 	struct nameidata old_nd;
 	struct vfsmount *mnt = NULL;
 	int err;
@@ -1452,11 +1456,17 @@ static int do_loopback(struct nameidata 
 	if (!check_mnt(nd->path.mnt) || !check_mnt(old_nd.path.mnt))
 		goto out;
 
-	clone_fl = (flags & MS_SETUSER) ? CL_SETUSER : 0;
+	if (flags & MS_SETUSER) {
+		clone_fl |= CL_SETUSER;
+		owner = current->fsuid;
+	}
+
 	if (flags & MS_REC)
-		mnt = copy_tree(old_nd.path.mnt, old_nd.path.dentry, clone_fl);
+		mnt = copy_tree(old_nd.path.mnt, old_nd.path.dentry, clone_fl,
+				owner);
 	else
-		mnt = clone_mnt(old_nd.path.mnt, old_nd.path.dentry, clone_fl);
+		mnt = clone_mnt(old_nd.path.mnt, old_nd.path.dentry, clone_fl,
+				owner);
 
 	err = PTR_ERR(mnt);
 	if (IS_ERR(mnt))
@@ -1680,7 +1690,7 @@ static int do_new_mount(struct nameidata
 	}
 
 	if (flags & MS_SETUSER)
-		__set_mnt_user(mnt);
+		__set_mnt_user(mnt, current->fsuid);
 
 	return do_add_mount(mnt, nd, mnt_flags, NULL);
 
@@ -2076,7 +2086,7 @@ static struct mnt_namespace *dup_mnt_ns(
 	down_write(&namespace_sem);
 	/* First pass: copy the tree topology */
 	new_ns->root = copy_tree(mnt_ns->root, mnt_ns->root->mnt_root,
-					CL_COPY_ALL | CL_EXPIRE);
+					CL_COPY_ALL | CL_EXPIRE, 0);
 	if (IS_ERR(new_ns->root)) {
 		up_write(&namespace_sem);
 		kfree(new_ns);
Index: linux/fs/pnode.c
===================================================================
--- linux.orig/fs/pnode.c	2008-02-04 23:47:56.000000000 +0100
+++ linux/fs/pnode.c	2008-02-04 23:48:08.000000000 +0100
@@ -216,15 +216,28 @@ int propagate_mnt(struct vfsmount *dest_
 
 	for (m = propagation_next(dest_mnt, dest_mnt); m;
 			m = propagation_next(m, dest_mnt)) {
-		int type;
+		int clflags;
+		uid_t owner = 0;
 		struct vfsmount *source;
 
 		if (IS_MNT_NEW(m))
 			continue;
 
-		source =  get_source(m, prev_dest_mnt, prev_src_mnt, &type);
+		source =  get_source(m, prev_dest_mnt, prev_src_mnt, &clflags);
 
-		child = copy_tree(source, source->mnt_root, type);
+		if (m->mnt_flags & MNT_USER) {
+			clflags |= CL_SETUSER;
+			owner = m->mnt_uid;
+
+			/*
+			 * If propagating into a user mount which doesn't
+			 * allow suid, then make sure, the child(ren) won't
+			 * allow suid either
+			 */
+			if (m->mnt_flags & MNT_NOSUID)
+				clflags |= CL_NOSUID;
+		}
+		child = copy_tree(source, source->mnt_root, clflags, owner);
 		if (IS_ERR(child)) {
 			ret = PTR_ERR(child);
 			list_splice(tree_list, tmp_list.prev);
Index: linux/fs/pnode.h
===================================================================
--- linux.orig/fs/pnode.h	2008-02-04 23:47:50.000000000 +0100
+++ linux/fs/pnode.h	2008-02-04 23:48:08.000000000 +0100
@@ -24,6 +24,7 @@
 #define CL_PROPAGATION 		0x10
 #define CL_PRIVATE 		0x20
 #define CL_SETUSER		0x40
+#define CL_NOSUID		0x80
 
 static inline void set_mnt_shared(struct vfsmount *mnt)
 {
@@ -36,6 +37,7 @@ int propagate_mnt(struct vfsmount *, str
 		struct list_head *);
 int propagate_umount(struct list_head *);
 int propagate_mount_busy(struct vfsmount *, int);
+struct vfsmount *copy_tree(struct vfsmount *, struct dentry *, int, uid_t);
 int get_peer_group_id(struct vfsmount *);
 int get_master_id(struct vfsmount *);
 #endif /* _LINUX_PNODE_H */
Index: linux/include/linux/fs.h
===================================================================
--- linux.orig/include/linux/fs.h	2008-02-04 23:48:04.000000000 +0100
+++ linux/include/linux/fs.h	2008-02-04 23:48:08.000000000 +0100
@@ -1500,7 +1500,6 @@ extern int may_umount(struct vfsmount *)
 extern void umount_tree(struct vfsmount *, int, struct list_head *);
 extern void release_mounts(struct list_head *);
 extern long do_mount(char *, char *, char *, unsigned long, void *);
-extern struct vfsmount *copy_tree(struct vfsmount *, struct dentry *, int);
 extern void mnt_set_mountpoint(struct vfsmount *, struct dentry *,
 				  struct vfsmount *);
 extern struct vfsmount *collect_mounts(struct vfsmount *, struct dentry *);

--

* [patch 10/10] unprivileged mounts: add "no submounts" flag
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
                   ` (8 preceding siblings ...)
  2008-02-05 21:36 ` [patch 09/10] unprivileged mounts: propagation: inherit owner from parent Miklos Szeredi
@ 2008-02-05 21:36 ` Miklos Szeredi
  2008-02-15  6:21 ` [patch 00/10] mount ownership and unprivileged mount syscall (v8) Andrew Morton
  10 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-05 21:36 UTC (permalink / raw)
  To: akpm, hch, serue; +Cc: linux-fsdevel, linux-kernel

[-- Attachment #1: unprivileged-mounts-add-no-submounts-flag.patch --]
[-- Type: text/plain, Size: 2933 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

Add a new mount flag "nosubmnt", which denies submounts for the owner.
This would be useful if we want to support traditional /etc/fstab-based
user mounts.

In this case mount(8) would still have to be suid-root, to check the
mountpoint against the user/users flag in /etc/fstab, but /etc/mtab
would no longer be mandatory for storing the actual owner of the
mount.
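
To illustrate the intended effect, here is a simplified userspace
model of the two places the flag is handled: the translation done in
do_mount() and the new test in permit_mount().  The MODEL_ constants
copy the values from this patch; the function names are invented, and
the other checks in permit_mount() (symlink, "safe" filesystem, flag
adjustments) are left out:

```c
#include <stdbool.h>

#define MODEL_MS_NOSUBMNT	(1UL << 25)	/* MS_NOSUBMNT from this patch */
#define MODEL_MNT_NOSUBMNT	0x80		/* MNT_NOSUBMNT from this patch */

/* Model of the do_mount() translation from MS_* to MNT_* flags. */
static int model_mnt_flags(unsigned long ms_flags)
{
	int mnt_flags = 0;

	if (ms_flags & MODEL_MS_NOSUBMNT)
		mnt_flags |= MODEL_MNT_NOSUBMNT;
	return mnt_flags;
}

/* Model of the submount decision; other permit_mount() checks omitted. */
static bool model_submount_allowed(int parent_mnt_flags,
				   bool has_cap_sys_admin, bool is_owner)
{
	/* The flag does not restrict the sysadmin. */
	if (has_cap_sys_admin)
		return true;

	/* Even the owner may not mount under a nosubmnt mount. */
	if (parent_mnt_flags & MODEL_MNT_NOSUBMNT)
		return false;

	return is_owner;
}
```

In this model the owner of a mount created with "nosubmnt" can still
own it (and unmount it), but cannot create mounts beneath it.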

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

Index: linux/fs/namespace.c
===================================================================
--- linux.orig/fs/namespace.c	2008-02-04 23:48:08.000000000 +0100
+++ linux/fs/namespace.c	2008-02-04 23:48:10.000000000 +0100
@@ -783,6 +783,7 @@ static void show_mnt_opts(struct seq_fil
 		{ MNT_NOATIME, ",noatime" },
 		{ MNT_NODIRATIME, ",nodiratime" },
 		{ MNT_RELATIME, ",relatime" },
+		{ MNT_NOSUBMNT, ",nosubmnt" },
 		{ 0, NULL }
 	};
 	const struct proc_fs_info *fs_infop;
@@ -1189,6 +1190,9 @@ static bool permit_mount(struct nameidat
 	if (S_ISLNK(inode->i_mode))
 		return false;
 
+	if (nd->path.mnt->mnt_flags & MNT_NOSUBMNT)
+		return false;
+
 	if (!is_mount_owner(nd->path.mnt, current->fsuid))
 		return false;
 
@@ -2033,9 +2037,11 @@ long do_mount(char *dev_name, char *dir_
 		mnt_flags |= MNT_RELATIME;
 	if (flags & MS_RDONLY)
 		mnt_flags |= MNT_READONLY;
+	if (flags & MS_NOSUBMNT)
+		mnt_flags |= MNT_NOSUBMNT;
 
-	flags &= ~(MS_NOSUID | MS_NOEXEC | MS_NODEV | MS_ACTIVE |
-		   MS_NOATIME | MS_NODIRATIME | MS_RELATIME| MS_KERNMOUNT);
+	flags &= ~(MS_NOSUID | MS_NOEXEC | MS_NODEV | MS_ACTIVE | MS_NOATIME |
+		   MS_NODIRATIME | MS_RELATIME | MS_KERNMOUNT | MS_NOSUBMNT);
 
 	/* ... and get the mountpoint */
 	retval = path_lookup(dir_name, LOOKUP_FOLLOW, &nd);
Index: linux/include/linux/fs.h
===================================================================
--- linux.orig/include/linux/fs.h	2008-02-04 23:48:08.000000000 +0100
+++ linux/include/linux/fs.h	2008-02-04 23:48:10.000000000 +0100
@@ -129,6 +129,7 @@ extern int dir_notify_enable;
 #define MS_KERNMOUNT	(1<<22) /* this is a kern_mount call */
 #define MS_I_VERSION	(1<<23) /* Update inode I_version field */
 #define MS_SETUSER	(1<<24) /* set mnt_uid to current user */
+#define MS_NOSUBMNT	(1<<25) /* don't allow unprivileged submounts */
 #define MS_ACTIVE	(1<<30)
 #define MS_NOUSER	(1<<31)
 
Index: linux/include/linux/mount.h
===================================================================
--- linux.orig/include/linux/mount.h	2008-02-04 23:47:50.000000000 +0100
+++ linux/include/linux/mount.h	2008-02-04 23:48:10.000000000 +0100
@@ -30,6 +30,7 @@ struct mnt_namespace;
 #define MNT_NODIRATIME	0x10
 #define MNT_RELATIME	0x20
 #define MNT_READONLY	0x40	/* does the user want this to be r/o? */
+#define MNT_NOSUBMNT	0x80
 
 #define MNT_SHRINKABLE	0x100
 #define MNT_IMBALANCED_WRITE_COUNT	0x200 /* just for debugging */

--

* Re: [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property
  2008-02-05 21:36 ` [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property Miklos Szeredi
@ 2008-02-06 20:21   ` Serge E. Hallyn
  2008-02-06 21:11     ` Miklos Szeredi
  2008-02-07 15:33   ` Aneesh Kumar K.V
  1 sibling, 1 reply; 30+ messages in thread
From: Serge E. Hallyn @ 2008-02-06 20:21 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: akpm, hch, serue, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> Add the following:
> 
>   /proc/sys/fs/types/${FS_TYPE}/usermount_safe
> 
> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>

Thanks, Miklos, good explanations in the docs.

Acked-by: Serge Hallyn <serue@us.ibm.com>

One comment inline, but not imo your problem :)

> ---
> 
> Index: linux/fs/filesystems.c
> ===================================================================
> --- linux.orig/fs/filesystems.c	2008-02-04 23:47:46.000000000 +0100
> +++ linux/fs/filesystems.c	2008-02-04 23:48:04.000000000 +0100
> @@ -12,6 +12,7 @@
>  #include <linux/kmod.h>
>  #include <linux/init.h>
>  #include <linux/module.h>
> +#include <linux/sysctl.h>
>  #include <asm/uaccess.h>
> 
>  /*
> @@ -51,6 +52,57 @@ static struct file_system_type **find_fi
>  	return p;
>  }
> 
> +#define MAX_FILESYSTEM_VARS 1
> +
> +struct filesystem_sysctl_table {
> +	struct ctl_table_header *header;
> +	struct ctl_table table[MAX_FILESYSTEM_VARS + 1];
> +};
> +
> +/*
> + * Create /proc/sys/fs/types/${FSNAME} directory with per fs-type tunables.
> + */
> +static int filesystem_sysctl_register(struct file_system_type *fs)
> +{
> +	struct filesystem_sysctl_table *t;
> +	struct ctl_path path[] = {
> +		{ .procname = "fs", .ctl_name = CTL_FS },
> +		{ .procname = "types", .ctl_name = CTL_UNNUMBERED },
> +		{ .procname = fs->name, .ctl_name = CTL_UNNUMBERED },
> +		{ }
> +	};
> +
> +	t = kzalloc(sizeof(*t), GFP_KERNEL);
> +	if (!t)
> +		return -ENOMEM;
> +
> +
> +	t->table[0].ctl_name = CTL_UNNUMBERED;
> +	t->table[0].procname = "usermount_safe";
> +	t->table[0].maxlen = sizeof(int);
> +	t->table[0].data = &fs->fs_safe;
> +	t->table[0].mode = 0644;

Yikes, this could be a problem for containers, as it's simply tied to
uid 0, whereas tying it to a capability would let us solve it with
capability bounds.

This might mean more urgency to get user namespaces working at least
with sysfs, else this is a quick way around having CAP_SYS_ADMIN taken
out of a container's capability bounding set.

> +	t->table[0].proc_handler = &proc_dointvec;
> +
> +	t->header = register_sysctl_paths(path, t->table);
> +	if (!t->header) {
> +		kfree(t);
> +		return -ENOMEM;
> +	}
> +
> +	fs->sysctl_table = t;
> +
> +	return 0;
> +}
> +
> +static void filesystem_sysctl_unregister(struct file_system_type *fs)
> +{
> +	struct filesystem_sysctl_table *t = fs->sysctl_table;
> +
> +	unregister_sysctl_table(t->header);
> +	kfree(t);
> +}
> +
>  /**
>   *	register_filesystem - register a new filesystem
>   *	@fs: the file system structure
> @@ -80,6 +132,13 @@ int register_filesystem(struct file_syst
>  	else
>  		*p = fs;
>  	write_unlock(&file_systems_lock);
> +
> +	if (res == 0) {
> +		res = filesystem_sysctl_register(fs);
> +		if (res != 0)
> +			unregister_filesystem(fs);
> +	}
> +
>  	return res;
>  }
> 
> @@ -108,6 +167,7 @@ int unregister_filesystem(struct file_sy
>  			*tmp = fs->next;
>  			fs->next = NULL;
>  			write_unlock(&file_systems_lock);
> +			filesystem_sysctl_unregister(fs);
>  			return 0;
>  		}
>  		tmp = &(*tmp)->next;
> Index: linux/include/linux/fs.h
> ===================================================================
> --- linux.orig/include/linux/fs.h	2008-02-04 23:48:02.000000000 +0100
> +++ linux/include/linux/fs.h	2008-02-04 23:48:04.000000000 +0100
> @@ -1444,6 +1444,7 @@ struct file_system_type {
>  	struct module *owner;
>  	struct file_system_type * next;
>  	struct list_head fs_supers;
> +	struct filesystem_sysctl_table *sysctl_table;
> 
>  	struct lock_class_key s_lock_key;
>  	struct lock_class_key s_umount_key;
> Index: linux/Documentation/filesystems/proc.txt
> ===================================================================
> --- linux.orig/Documentation/filesystems/proc.txt	2008-02-04 23:47:58.000000000 +0100
> +++ linux/Documentation/filesystems/proc.txt	2008-02-04 23:48:04.000000000 +0100
> @@ -44,6 +44,7 @@ Table of Contents
>    2.14	/proc/<pid>/io - Display the IO accounting fields
>    2.15	/proc/<pid>/coredump_filter - Core dump filtering settings
>    2.16	/proc/<pid>/mountinfo - Information about mounts
> +  2.17	/proc/sys/fs/types - File system type specific parameters
> 
>  ------------------------------------------------------------------------------
>  Preface
> @@ -2392,4 +2393,34 @@ For more information see:
>    Documentation/filesystems/sharedsubtree.txt
> 
> 
> +2.17 /proc/sys/fs/types/ - File system type specific parameters
> +----------------------------------------------------------------
> +
> +There's a separate directory /proc/sys/fs/types/<type>/ for each
> +filesystem type, containing the following files:
> +
> +usermount_safe
> +--------------
> +
> +Setting this to non-zero will allow filesystems of this type to be
> +mounted by unprivileged users (note, that there are other
> +prerequisites as well).
> +
> +Fuse has been designed to be as safe as possible, and some
> +distributions already ship with unprivileged fuse mounts enabled by
> +default.  There are still some situations (multi-user systems with
> +untrusted users in particular), where enabling this for fuse might not
> +be appropriate.  For more details, see Documentation/filesystems/fuse.txt
> +
> +Procfs is also safe, but unprivileged mounting of it is not usually
> +necessary (bind mounting is equivalent).
> +
> +Most other filesystems are unsafe.  Here are just some of the issues,
> +that must be resolved before a filesystem can be declared safe:
> +
> + - no strict input checking (buffer overruns, directory loops, etc)
> + - network filesystem deadlocks when mounting from localhost
> + - no permission checking when opening the device
> + - changing mount options when mounting a new instance of a filesystem
> +
>  ------------------------------------------------------------------------------
> 
> --
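
For illustration, a minimal userspace sketch of how a tool might build
the path of the proposed tunable and read its value.  The path layout is
the one proposed in this patch and does not exist on kernels without it;
the helper names usermount_safe_path() and read_int_tunable() are made up
for this example.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Build the path of the proposed per-filesystem tunable.
 * Returns 0 on success, -ENAMETOOLONG if the buffer is too small.
 */
static int usermount_safe_path(char *buf, size_t len, const char *fstype)
{
	int n = snprintf(buf, len, "/proc/sys/fs/types/%s/usermount_safe",
			 fstype);

	if (n < 0 || (size_t)n >= len)
		return -ENAMETOOLONG;
	return 0;
}

/*
 * Read a single integer from a sysctl-style file.
 * Returns 0 on success, a negative errno value on failure.
 */
static int read_int_tunable(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int matched;

	if (!f)
		return -errno;

	matched = fscanf(f, "%d", value);
	fclose(f);

	return matched == 1 ? 0 : -EINVAL;
}
```

A caller would build the path for, say, "fuse" and treat a non-zero value
as "this type may be mounted by unprivileged users", subject to the other
prerequisites mentioned in the documentation above.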


* Re: [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property
  2008-02-06 20:21   ` Serge E. Hallyn
@ 2008-02-06 21:11     ` Miklos Szeredi
  2008-02-06 22:45       ` Serge E. Hallyn
  0 siblings, 1 reply; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-06 21:11 UTC (permalink / raw)
  To: serue; +Cc: miklos, akpm, hch, serue, linux-fsdevel, linux-kernel

> > +	t->table[0].mode = 0644;
> 
> Yikes, this could be a problem for containers, as it's simply tied to
> uid 0, whereas tying it to a capability would let us solve it with
> capability bounds.
> 
> This might mean more urgency to get user namespaces working at least
> with sysfs, else this is a quick way around having CAP_SYS_ADMIN taken
> out of a container's capability bounding set.

I think I understand the problem, but not the solution.  How are user
namespaces going to help?

Maybe sysctls just need to check capabilities, instead of uids.  I
think that would make a lot of sense anyway.

Thanks,
Miklos


* Re: [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property
  2008-02-06 21:11     ` Miklos Szeredi
@ 2008-02-06 22:45       ` Serge E. Hallyn
  2008-02-07  8:09         ` Miklos Szeredi
  0 siblings, 1 reply; 30+ messages in thread
From: Serge E. Hallyn @ 2008-02-06 22:45 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: serue, akpm, hch, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> > > +	t->table[0].mode = 0644;
> > 
> > Yikes, this could be a problem for containers, as it's simply tied to
> > uid 0, whereas tying it to a capability would let us solve it with
> > capability bounds.
> > 
> > This might mean more urgency to get user namespaces working at least
> > with sysfs, else this is a quick way around having CAP_SYS_ADMIN taken
> > out of a container's capability bounding set.
> 
> I think I understand the problem, but not the solution.  How are user
> namespaces going to help?

Well it somewhat depends on how we implement userns for filesystems
in the first place, and whether we end up splitting sysfs into
sub-filesystems as I think Eric Biederman has been advocating.  My
thoughts had been running along the lines of just tagging vfsmounts
with userns of the mounting process.  A task from outside the mounting
process' namespace would get only the "other" permissions, whether or not
its uid was the owning uid or uid 0 (unless the task had CAP_NS_OVERRIDE).

But really it gets more complicated for sysfs than something like ext2
since we really want to be able to filter files and directories for
different namespaces...  Handling sysfs user namespaces before we sort
out the rest of the sysfs stuff (being hashed out with network
namespaces) seems like jumping the gun a bit.

> Maybe sysctls just need to check capabilities, instead of uids.  I
> think that would make a lot of sense anyway.

Would it be as simple as tagging the inodes with capability sets?  One
set for writing, or one each for reading and writing?

thanks,
-serge


* Re: [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property
  2008-02-06 22:45       ` Serge E. Hallyn
@ 2008-02-07  8:09         ` Miklos Szeredi
  2008-02-07 14:05           ` Serge E. Hallyn
  0 siblings, 1 reply; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-07  8:09 UTC (permalink / raw)
  To: serue; +Cc: miklos, serue, akpm, hch, linux-fsdevel, linux-kernel

> > Maybe sysctls just need to check capabilities, instead of uids.  I
> > think that would make a lot of sense anyway.
> 
> Would it be as simple as tagging the inodes with capability sets?  One
> set for writing, or one each for reading and writing?

Yes, or something even simpler, like mapping the owner permission bits
to CAP_SYS_ADMIN.  There seem to be very few different permissions
under /proc/sys:

--w-------
-r--r--r--
-rw-------
-rw-r--r--

As long as the group and other bits are always the same, and we accept
that the owner bits really mean CAP_SYS_ADMIN and not something else,
then the permission check would not need to look at uids or gids at
all.

Miklos


* Re: [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property
  2008-02-07  8:09         ` Miklos Szeredi
@ 2008-02-07 14:05           ` Serge E. Hallyn
  2008-02-07 14:36             ` Miklos Szeredi
  0 siblings, 1 reply; 30+ messages in thread
From: Serge E. Hallyn @ 2008-02-07 14:05 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: serue, akpm, hch, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> > > Maybe sysctls just need to check capabilities, instead of uids.  I
> > > think that would make a lot of sense anyway.
> > 
> > Would it be as simple as tagging the inodes with capability sets?  One
> > set for writing, or one each for reading and writing?
> 
> Yes, or something even simpler, like mapping the owner permission bits
> to CAP_SYS_ADMIN.  There seem to be very few different permissions
> under /proc/sys:
> 
> --w-------
> -r--r--r--
> -rw-------
> -rw-r--r--
> 
> As long as the group and other bits are always the same, and we accept
> that the owner bits really mean CAP_SYS_ADMIN and not something else,

But I would assume some things under /proc/sys/net/ipv4 or
/proc/sys/net/ath0 require CAP_NET_ADMIN rather than CAP_SYS_ADMIN?

> then the permission check would not need to look at uids or gids at
> all.
> 
> Miklos


* Re: [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property
  2008-02-07 14:05           ` Serge E. Hallyn
@ 2008-02-07 14:36             ` Miklos Szeredi
  2008-02-07 16:57               ` Serge E. Hallyn
  0 siblings, 1 reply; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-07 14:36 UTC (permalink / raw)
  To: serue; +Cc: miklos, serue, akpm, hch, linux-fsdevel, linux-kernel

> > > > Maybe sysctls just need to check capabilities, instead of uids.  I
> > > > think that would make a lot of sense anyway.
> > > 
> > > Would it be as simple as tagging the inodes with capability sets?  One
> > > set for writing, or one each for reading and writing?
> > 
> > Yes, or something even simpler, like mapping the owner permission bits
> > to CAP_SYS_ADMIN.  There seem to be very few different permissions
> > under /proc/sys:
> > 
> > --w-------
> > -r--r--r--
> > -rw-------
> > -rw-r--r--
> > 
> > As long as the group and other bits are always the same, and we accept
> > that the owner bits really mean CAP_SYS_ADMIN and not something else,
> 
> But I would assume some things under /proc/sys/net/ipv4 or
> /proc/sys/net/ath0 require CAP_NET_ADMIN rather than CAP_SYS_ADMIN?

I guess so.  I'm not very familiar with the different capabilities :)

How about this patch then: a hybrid solution between just relying on
permission bits, and specifying separate capability sets for read and
write in addition to the permission bits.

Untested, the 'cap' field obviously still needs to be filled in where
appropriate.

Miklos
----

Index: linux/include/linux/sysctl.h
===================================================================
--- linux.orig/include/linux/sysctl.h	2008-02-04 12:29:01.000000000 +0100
+++ linux/include/linux/sysctl.h	2008-02-07 15:19:06.000000000 +0100
@@ -1041,6 +1041,7 @@ struct ctl_table 
 	void *data;
 	int maxlen;
 	mode_t mode;
+	int cap;			/* Capability needed to read/write */
 	struct ctl_table *child;
 	struct ctl_table *parent;	/* Automatically set */
 	proc_handler *proc_handler;	/* Callback for text formatting */
Index: linux/kernel/sysctl.c
===================================================================
--- linux.orig/kernel/sysctl.c	2008-02-05 22:17:05.000000000 +0100
+++ linux/kernel/sysctl.c	2008-02-07 15:30:45.000000000 +0100
@@ -1527,14 +1527,26 @@ out:
  * some sysctl variables are readonly even to root.
  */
 
-static int test_perm(int mode, int op)
+static int test_perm(struct ctl_table *table, int op)
 {
-	if (!current->euid)
-		mode >>= 6;
-	else if (in_egroup_p(0))
-		mode >>= 3;
+	int cap = table->cap;
+	mode_t mode = table->mode;
+
+	if (!cap)
+		cap = CAP_SYS_ADMIN;
+
+	if ((op & MAY_READ) && !(mode & S_IRUGO))
+		return -EACCES;
+
+	if ((op & MAY_WRITE) && !(mode & S_IWUGO))
+		return -EACCES;
+
+	if (capable(cap))
+		return 0;
+
 	if ((mode & op & 0007) == op)
 		return 0;
+
 	return -EACCES;
 }
 
@@ -1544,7 +1556,7 @@ int sysctl_perm(struct ctl_table *table,
 	error = security_sysctl(table, op);
 	if (error)
 		return error;
-	return test_perm(table->mode, op);
+	return test_perm(table, op);
 }
 
 #ifdef CONFIG_SYSCTL_SYSCALL
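
To make the proposed check easier to follow, here is a minimal userspace
model of the new test_perm() logic.  It is only a sketch: model_capable()
stands in for the kernel's capable() by testing a caller-supplied bitmask,
and all model_* / MODEL_* names are made up for illustration.  The numeric
values mirror the kernel's MAY_READ, MAY_WRITE, CAP_NET_ADMIN and
CAP_SYS_ADMIN definitions.

```c
#include <errno.h>

#define MODEL_MAY_WRITE		2
#define MODEL_MAY_READ		4

#define MODEL_CAP_NET_ADMIN	12
#define MODEL_CAP_SYS_ADMIN	21

struct model_ctl_table {
	unsigned int mode;	/* permission bits, e.g. 0644 */
	int cap;		/* 0 means "default to CAP_SYS_ADMIN" */
};

/* Stand-in for capable(): the caller's capabilities are a plain bitmask. */
static int model_capable(unsigned long caps, int cap)
{
	return (int)((caps >> cap) & 1UL);
}

static int model_test_perm(const struct model_ctl_table *table, int op,
			   unsigned long caps)
{
	int cap = table->cap ? table->cap : MODEL_CAP_SYS_ADMIN;
	unsigned int mode = table->mode;

	/* No read bit at all: unreadable, even for a capable task. */
	if ((op & MODEL_MAY_READ) && !(mode & 0444))
		return -EACCES;

	/* No write bit at all: read-only, even for a capable task. */
	if ((op & MODEL_MAY_WRITE) && !(mode & 0222))
		return -EACCES;

	/* The capability replaces the old euid == 0 test. */
	if (model_capable(caps, cap))
		return 0;

	/* Everybody else is judged by the "other" bits only. */
	if ((mode & op & 0007) == (unsigned int)op)
		return 0;

	return -EACCES;
}
```

With mode 0644 and no 'cap' set, a caller holding only CAP_NET_ADMIN can
read but not write, while a 0444 entry stays read-only even for a caller
holding CAP_SYS_ADMIN.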


* Re: [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property
  2008-02-05 21:36 ` [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property Miklos Szeredi
  2008-02-06 20:21   ` Serge E. Hallyn
@ 2008-02-07 15:33   ` Aneesh Kumar K.V
  2008-02-07 16:24     ` Miklos Szeredi
  1 sibling, 1 reply; 30+ messages in thread
From: Aneesh Kumar K.V @ 2008-02-07 15:33 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: akpm, hch, serue, linux-fsdevel, linux-kernel

On Tue, Feb 05, 2008 at 10:36:23PM +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> Add the following:
> 
>   /proc/sys/fs/types/${FS_TYPE}/usermount_safe
> 


There is /proc/fs/<type>/ already.  Since it is filesystem specific,
shouldn't it go there?

-aneesh


* Re: [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property
  2008-02-07 15:33   ` Aneesh Kumar K.V
@ 2008-02-07 16:24     ` Miklos Szeredi
  0 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-07 16:24 UTC (permalink / raw)
  To: aneesh.kumar; +Cc: miklos, akpm, hch, serue, linux-fsdevel, linux-kernel

> On Tue, Feb 05, 2008 at 10:36:23PM +0100, Miklos Szeredi wrote:
> > From: Miklos Szeredi <mszeredi@suse.cz>
> > 
> > Add the following:
> > 
> >   /proc/sys/fs/types/${FS_TYPE}/usermount_safe
> > 
> 
> 
> There is /proc/fs/<type>/ already.  Since it is filesystem specific,
> shouldn't it go there?

The problem is exactly that it's filesystem specific: each filesystem
creates its own stuff there.

Also we have a rule to not create new things under /proc.  Things
should either go into /sys or into /proc/sys.  And I think a sysctl is
more appropriate for this than something under /sys.

Thanks,
Miklos


* Re: [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property
  2008-02-07 14:36             ` Miklos Szeredi
@ 2008-02-07 16:57               ` Serge E. Hallyn
  0 siblings, 0 replies; 30+ messages in thread
From: Serge E. Hallyn @ 2008-02-07 16:57 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: serue, akpm, hch, linux-fsdevel, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> > > > > Maybe sysctls just need to check capabilities, instead of uids.  I
> > > > > think that would make a lot of sense anyway.
> > > > 
> > > > Would it be as simple as tagging the inodes with capability sets?  One
> > > > set for writing, or one each for reading and writing?
> > > 
> > > Yes, or something even simpler, like mapping the owner permission bits
> > > to CAP_SYS_ADMIN.  There seem to be very few different permissions
> > > under /proc/sys:
> > > 
> > > --w-------
> > > -r--r--r--
> > > -rw-------
> > > -rw-r--r--
> > > 
> > > As long as the group and other bits are always the same, and we accept
> > > that the owner bits really mean CAP_SYS_ADMIN and not something else,
> > 
> > But I would assume some things under /proc/sys/net/ipv4 or
> > /proc/sys/net/ath0 require CAP_NET_ADMIN rather than CAP_SYS_ADMIN?
> 
> I guess so.  I'm not very familiar with the different capabilities :)
> 
> How about this patch then: a hybrid solution between just relying on
> permission bits, and specifying separate capability sets for read and
> write in addition to the permission bits.
> 
> Untested, the 'cap' field obviously still needs to be filled in where
> appropriate.
> 
> Miklos
> ----
> 
> Index: linux/include/linux/sysctl.h
> ===================================================================
> --- linux.orig/include/linux/sysctl.h	2008-02-04 12:29:01.000000000 +0100
> +++ linux/include/linux/sysctl.h	2008-02-07 15:19:06.000000000 +0100
> @@ -1041,6 +1041,7 @@ struct ctl_table 
>  	void *data;
>  	int maxlen;
>  	mode_t mode;
> +	int cap;			/* Capability needed to read/write */
>  	struct ctl_table *child;
>  	struct ctl_table *parent;	/* Automatically set */
>  	proc_handler *proc_handler;	/* Callback for text formatting */
> Index: linux/kernel/sysctl.c
> ===================================================================
> --- linux.orig/kernel/sysctl.c	2008-02-05 22:17:05.000000000 +0100
> +++ linux/kernel/sysctl.c	2008-02-07 15:30:45.000000000 +0100
> @@ -1527,14 +1527,26 @@ out:
>   * some sysctl variables are readonly even to root.
>   */
> 
> -static int test_perm(int mode, int op)
> +static int test_perm(struct ctl_table *table, int op)
>  {
> -	if (!current->euid)
> -		mode >>= 6;
> -	else if (in_egroup_p(0))
> -		mode >>= 3;
> +	int cap = table->cap;
> +	mode_t mode = table->mode;
> +
> +	if (!cap)
> +		cap = CAP_SYS_ADMIN;
> +
> +	if ((op & MAY_READ) && !(mode & S_IRUGO))
> +		return -EACCES;
> +
> +	if ((op & MAY_WRITE) && !(mode & S_IWUGO))
> +		return -EACCES;
> +
> +	if (capable(cap))
> +		return 0;
> +
>  	if ((mode & op & 0007) == op)
>  		return 0;
> +
>  	return -EACCES;

I like how simple it appears to be :)

At first I missed the fact that owning uid is always 0 so I thought the
uid processing wasn't quite enough.  But since it's always 0, the only
question is whether there are any /proc/sys files whose users currently
depend on being setgid 0 and setgid non-0 with no capabilities.

On my laptop, 'find /proc/sys -type f -perm -020' gives me no results,
so that is promising.

So this certainly seems like a good first step.  In fact, combined with
/proc/sys/ being partially remounted per container like /proc/sys/net is
doing, we may not even need to do anything with CAP_NS_OVERRIDE.

thanks,
-serge

>  }
> 
> @@ -1544,7 +1556,7 @@ int sysctl_perm(struct ctl_table *table,
>  	error = security_sysctl(table, op);
>  	if (error)
>  		return error;
> -	return test_perm(table->mode, op);
> +	return test_perm(table, op);
>  }
> 
>  #ifdef CONFIG_SYSCTL_SYSCALL


* Re: [patch 00/10] mount ownership and unprivileged mount syscall (v8)
  2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
                   ` (9 preceding siblings ...)
  2008-02-05 21:36 ` [patch 10/10] unprivileged mounts: add "no submounts" flag Miklos Szeredi
@ 2008-02-15  6:21 ` Andrew Morton
  2008-02-15  9:01   ` Christoph Hellwig
  10 siblings, 1 reply; 30+ messages in thread
From: Andrew Morton @ 2008-02-15  6:21 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: hch, serue, linux-fsdevel, linux-kernel, Dave Hansen

On Tue, 05 Feb 2008 22:36:16 +0100 Miklos Szeredi <miklos@szeredi.hu> wrote:

> Just documentation updates, compared to the previous submission.
> Thanks to Serge for the relentless reviews :)
> 
> Please consider for -mm, and then for 2.6.26.

Linus has just merged all the VFS renaming patches, so the decks
are clear for looking at this work.

However David and Christoph are beavering away on the r-o-bind-mounts
patches and I expect that there will be overlaps with unprivileged mounts.

Could we coordinate things a bit please?  Decide who goes first, review
and maybe even test each other's work, etc?

Thanks.


* Re: [patch 00/10] mount ownership and unprivileged mount syscall (v8)
  2008-02-15  6:21 ` [patch 00/10] mount ownership and unprivileged mount syscall (v8) Andrew Morton
@ 2008-02-15  9:01   ` Christoph Hellwig
  2008-02-15  9:09     ` Andrew Morton
  0 siblings, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2008-02-15  9:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Miklos Szeredi, hch, serue, linux-fsdevel, linux-kernel, Dave Hansen

On Thu, Feb 14, 2008 at 10:21:03PM -0800, Andrew Morton wrote:
> Linus has just merged all the VFS renaming patches, so the decks
> are clear for looking at this work.
> 
> However David and Christoph are beavering away on the r-o-bind-mounts
> patches and I expect that there will be overlaps with unprivileged mounts.
> 
> Could we coordinate things a bit please?  Decide who goes first, review
> and maybe even test each other's work, etc?

Al is setting up a git tree for VFS work.  per-mount r/o will go in
as one of the first things, as well as his rework of the path lookup
logic to fix the intents mess.

> 
> Thanks.
---end quoted text---


* Re: [patch 00/10] mount ownership and unprivileged mount syscall (v8)
  2008-02-15  9:01   ` Christoph Hellwig
@ 2008-02-15  9:09     ` Andrew Morton
  2008-02-15  9:14       ` Christoph Hellwig
  0 siblings, 1 reply; 30+ messages in thread
From: Andrew Morton @ 2008-02-15  9:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Miklos Szeredi, serue, linux-fsdevel, linux-kernel, Dave Hansen

On Fri, 15 Feb 2008 04:01:20 -0500 Christoph Hellwig <hch@infradead.org> wrote:

> On Thu, Feb 14, 2008 at 10:21:03PM -0800, Andrew Morton wrote:
> > Linus has just merged all the VFS renaming patches, so the decks
> > are clear for looking at this work.
> > 
> > However David and Christoph are beavering away on the r-o-bind-mounts
> > patches and I expect that there will be overlaps with unprivileged mounts.
> > 
> > Could we coordinate things a bit please?  Decide who goes first, review
> > and maybe even test each other's work, etc?
> 
> Al is setting up a git tree for VFS work.  per-mount r/o will go in
> as one of the first things, as well as his rework of the path lookup
> logic to fix the intents mess.
> 

That didn't answer my question..


* Re: [patch 00/10] mount ownership and unprivileged mount syscall (v8)
  2008-02-15  9:09     ` Andrew Morton
@ 2008-02-15  9:14       ` Christoph Hellwig
  2008-02-18 11:47         ` Miklos Szeredi
  0 siblings, 1 reply; 30+ messages in thread
From: Christoph Hellwig @ 2008-02-15  9:14 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Hellwig, Miklos Szeredi, serue, linux-fsdevel,
	linux-kernel, Dave Hansen

On Fri, Feb 15, 2008 at 01:09:51AM -0800, Andrew Morton wrote:
> > > However David and Christoph are beavering away on the r-o-bind-mounts
> > > patches and I expect that there will be overlaps with unprivileged mounts.
> > > 
> > > Could we coordinate things a bit please?  Decide who goes first, review
> > > and maybe even test each other's work, etc?
> > 
> > Al is setting up a git tree for VFS work.  per-mount r/o will go in
> > as one of the first things, as well as his rework of the path lookup
> > logic to fix the intents mess.
> > 
> 
> That didn't answer my question..

Well, Al as the de facto VFS maintainer will decide on the ordering.
Reviewing this stuff properly is still on my todo list, but currently
I'm busy with more important things.


* Re: [patch 00/10] mount ownership and unprivileged mount syscall (v8)
  2008-02-15  9:14       ` Christoph Hellwig
@ 2008-02-18 11:47         ` Miklos Szeredi
  2008-02-23 16:09           ` Al Viro
  0 siblings, 1 reply; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-18 11:47 UTC (permalink / raw)
  To: hch; +Cc: akpm, hch, miklos, serue, linux-fsdevel, linux-kernel, haveblue, viro

> > > > However David and Christoph are beavering away on the r-o-bind-mounts
> > > > patches and I expect that there will be overlaps with unprivileged mounts.
> > > > 
> > > > Could we coordinate things a bit please?  Decide who goes first, review
> > > > and maybe even test each other's work, etc?
> > > 
> > > Al is setting up a git tree for VFS work.  per-mount r/o will go in
> > > as one of the first things, as well as his rework of the path lookup
> > > logic to fix the intents mess.
> > > 
> > 
> > That didn't answer my question..
> 
> Well, Al as the de facto VFS maintainer will decide on the ordering.

I think we agreed, that r-o-bind mounts are more important, so they
should go first.  They have also received more attention.  OTOH there
isn't really any fundamental conflict between the two patchsets, so
going in together (if the ro-bind patches miss 2.6.25) should also be
possible.

> Reviewing this stuff properly is still on my todo list, but currently
> I'm busy with more important things.

So what should I do?

Would Al be wanting to merge this into his VFS tree?  (Can't find it
on git.kernel.org yet, BTW.)  I can set up a git tree for these
patches if that makes things easier.

Or should I just wait and resubmit after every kernel release, hoping
that it becomes _the_ most important thing on Christoph's list ;)

Miklos


* Re: [patch 00/10] mount ownership and unprivileged mount syscall (v8)
  2008-02-18 11:47         ` Miklos Szeredi
@ 2008-02-23 16:09           ` Al Viro
  2008-02-23 17:33             ` Miklos Szeredi
  0 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2008-02-23 16:09 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: hch, akpm, serue, linux-fsdevel, linux-kernel, haveblue

On Mon, Feb 18, 2008 at 12:47:59PM +0100, Miklos Szeredi wrote:
> So what should I do?
> 
> Would Al be wanting to merge this into his VFS tree?  (Can't find it
> on git.kernel.org yet, BTW.)

FWIW, it's on hera right now, should propagate to git.kernel.org in a few.

Branches I'd pushed there: vfs-fixes.b0 and ro-bind.b0.  The latter is
on top of the former.  There will be more, but that at least takes care
of the most urgent stuff.  Again, apologies for things being too damn
slow ;-/

As for the unprivileged mounts...
	a) why do we lose them on clone() in new namespace?  Bloody
inconvenient, to put it mildly.
	b) why do we prohibit all kinds of remount?
	c) just what is limited by that sysctl?  AFAICS, rbind is allowed
if mountpoint is on user vfsmount and it seems to create vfsmounts without
eating into that limit just fine...  What's the point of limiting the
number of vfsmounts marked user when you do not limit the number of vfsmounts
one can allocate?


* Re: [patch 00/10] mount ownership and unprivileged mount syscall (v8)
  2008-02-23 16:09           ` Al Viro
@ 2008-02-23 17:33             ` Miklos Szeredi
  2008-02-23 18:57               ` Al Viro
  0 siblings, 1 reply; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-23 17:33 UTC (permalink / raw)
  To: viro; +Cc: miklos, hch, akpm, serue, linux-fsdevel, linux-kernel, haveblue

> On Mon, Feb 18, 2008 at 12:47:59PM +0100, Miklos Szeredi wrote:
> > So what should I do?
> > 
> > Would Al be wanting to merge this into his VFS tree?  (Can't find it
> > on git.kernel.org yet, BTW.)
> 
> FWIW, it's on hera right now, should propagate to git.kernel.org in a few.
> 
> Branches I'd pushed there: vfs-fixes.b0 and ro-bind.b0.  The latter is
> on top of the former.  There will be more, but that at least takes care
> of the most urgent stuff.  Again, apologies for things being too damn
> slow ;-/
> 
> As for the unprivileged mounts...
> 	a) why do we lose them on clone() in new namespace?  Bloody
> inconvenient, to put it mildly.
> 	b) why do we prohibit all kinds of remount?

I wanted to get the basics right, before thinking about these details.
But getting the semantics of a) right before this is merged is a good
idea, of course...  So I'll have to think about that.

The remount stuff can wait (especially if there will be a new mount
API for this kind of thing).

> 	c) just what is limited by that sysctl?  AFAICS, rbind is allowed
> if mountpoint is on user vfsmount and it seems to create vfsmounts without
> eating into that limit just fine...  What's the point of limiting the
> number of vfsmounts marked user when you do not limit the number of vfsmounts
> one can allocate?

The limit is there so that unprivileged users cannot create an insane
number of mounts.  It's just a safety thing, analogous to
/proc/sys/fs/file-max.

Thanks for looking at this.

Miklos


* Re: [patch 00/10] mount ownership and unprivileged mount syscall (v8)
  2008-02-23 17:33             ` Miklos Szeredi
@ 2008-02-23 18:57               ` Al Viro
  2008-02-23 19:48                 ` Miklos Szeredi
  0 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2008-02-23 18:57 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: hch, akpm, serue, linux-fsdevel, linux-kernel, haveblue

On Sat, Feb 23, 2008 at 06:33:13PM +0100, Miklos Szeredi wrote:
> > 	c) just what is limited by that sysctl?  AFAICS, rbind is allowed
> > if mountpoint is on user vfsmount and it seems to create vfsmounts without
> > eating into that limit just fine...  What's the point of limiting the
> > number of vfsmounts marked user when you do not limit the number of vfsmounts
> > one can allocate?
> 
> The limit is there so that unprivileged users cannot create an insane
> number of mounts.  It's just a safety thing, analogous to
> /proc/sys/fs/file-max.

Can't they?  Looks like one can create any number of vfsmounts without
getting more than one marked MNT_USER...

What are you trying to limit - vfsmounts or superblocks?  The former is
not limited in your patchset at all, AFAICS - you can do
	while true; do
		mount --bind /bin ~/my_directory;
		mount --bind /sbin ~/my_directory;
	done
indefinitely and all the bleeding stack of vfsmounts will be !MNT_USER.
Or any number of similar schemes, really, without overmounting if you
wish to avoid it.

If you are trying to limit the number of superblocks (i.e. active instances
of filesystems), then I'd say that vfsmounts make piss-poor proxies for
those and it would be better to count the objects you really want to count...


* Re: [patch 00/10] mount ownership and unprivileged mount syscall (v8)
  2008-02-23 18:57               ` Al Viro
@ 2008-02-23 19:48                 ` Miklos Szeredi
  0 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-02-23 19:48 UTC (permalink / raw)
  To: viro; +Cc: miklos, hch, akpm, serue, linux-fsdevel, linux-kernel, haveblue

> On Sat, Feb 23, 2008 at 06:33:13PM +0100, Miklos Szeredi wrote:
> > > 	c) just what is limited by that sysctl?  AFAICS, rbind is allowed
> > > if mountpoint is on user vfsmount and it seems to create vfsmounts without
> > > eating into that limit just fine...  What's the point of limiting the
> > > amount of vfsmounts marked user when you do not limit the number of vfsmount
> > > one can allocate?
> > 
> > The limit is there, so that unprivileged users cannot create insane
> > number of mounts.  It's just a safety thing, analogous to
> > /proc/sys/fs/file-max.
> 
> Can't they?  Looks like one can create any number of vfsmounts without
> getting more than one marked MNT_USER...

permit_mount() will set MS_SETUSER in flags, and do_loopback() will
set CL_SETUSER based on that flag.

> If you are trying to limit the number of superblocks (i.e. active instances
> of filesystems), then I'd say that vfsmounts make piss-poor proxies for
> those and it would be better to count the objects you really want to count...

I think I really want to limit vfsmounts.  Not because these take
so much memory or anything, just to be safe against stupid users
playing with rbind and propagation, and things like that.

Miklos
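
The flag hand-off described in this reply (permit_mount() setting
MS_SETUSER, do_loopback() turning it into CL_SETUSER) can be sketched as a
small userspace model.  This is not kernel code: every toy_* name and flag
value below is hypothetical, and it assumes, as a simplification, that the
flag is forced on only for unprivileged callers.  Under that assumption each
unprivileged bind mount is counted against the limit, while privileged bind
mounts are not counted at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the values are illustrative, not the kernel's. */
#define TOY_MS_SETUSER 0x1	/* mount-level: result is owned by the caller */
#define TOY_CL_SETUSER 0x2	/* clone-level: mark the new vfsmount as user */

struct toy_state {
	int nr_user_mounts;	/* mounts currently marked as user mounts */
	int max_user_mounts;	/* limit applied to unprivileged callers */
	int nr_vfsmounts;	/* every clone, user-owned or not */
};

/* Models permit_mount(): an unprivileged caller gets the flag forced on. */
static int toy_permit_mount(bool privileged, int ms_flags)
{
	return privileged ? ms_flags : (ms_flags | TOY_MS_SETUSER);
}

/* Models do_loopback(): translate the mount flag into a clone flag. */
static int toy_clone_flags(int ms_flags)
{
	return (ms_flags & TOY_MS_SETUSER) ? TOY_CL_SETUSER : 0;
}

/* Models the accounting in the clone step: a user-owned clone needs a slot. */
static int toy_clone(struct toy_state *s, int cl_flags, bool privileged)
{
	assert(s != NULL);
	if (cl_flags & TOY_CL_SETUSER) {
		if (s->nr_user_mounts >= s->max_user_mounts && !privileged)
			return -EMFILE;
		s->nr_user_mounts++;
	}
	s->nr_vfsmounts++;
	return 0;
}

/* One bind mount attempt, from permission check to clone. */
static int toy_bind_mount(struct toy_state *s, bool privileged)
{
	int ms_flags = toy_permit_mount(privileged, 0);

	return toy_clone(s, toy_clone_flags(ms_flags), privileged);
}

/* Repeat bind mounts until one fails; returns the number that succeeded. */
static int toy_bind_loop(struct toy_state *s, bool privileged, int attempts)
{
	int done = 0;

	while (done < attempts && toy_bind_mount(s, privileged) == 0)
		done++;
	return done;
}
```

In this model a loop of unprivileged bind mounts stops at max_user_mounts
with -EMFILE, because each clone carries the user flag.  Whether the real
patchset covers every path that can create a vfsmount (rbind, propagation)
is the question under discussion and is not settled by this sketch.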


* [patch 04/10] unprivileged mounts: account user mounts
  2008-01-16 12:31 [patch 00/10] mount ownership and unprivileged mount syscall (v7) Miklos Szeredi
@ 2008-01-16 12:31 ` Miklos Szeredi
  0 siblings, 0 replies; 30+ messages in thread
From: Miklos Szeredi @ 2008-01-16 12:31 UTC (permalink / raw)
  To: akpm, hch, serue, viro, kzak
  Cc: linux-fsdevel, linux-kernel, containers, util-linux-ng

[-- Attachment #1: unprivileged-mounts-account-user-mounts.patch --]
[-- Type: text/plain, Size: 5802 bytes --]

From: Miklos Szeredi <mszeredi@suse.cz>

Add sysctl variables for accounting and limiting the number of user
mounts.

The maximum number of user mounts is set to 1024 by default.  This
does not in itself enable user mounts; a mount must first be set to
be owned by a user.

[akpm]
 - don't use enumerated sysctls

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

Index: linux/Documentation/filesystems/proc.txt
===================================================================
--- linux.orig/Documentation/filesystems/proc.txt	2008-01-16 13:24:53.000000000 +0100
+++ linux/Documentation/filesystems/proc.txt	2008-01-16 13:25:07.000000000 +0100
@@ -1012,6 +1012,15 @@ reaches aio-max-nr then io_setup will fa
 raising aio-max-nr does not result in the pre-allocation or re-sizing
 of any kernel data structures.
 
+nr_user_mounts and max_user_mounts
+----------------------------------
+
+These represent the number of "user" mounts and the maximum number of
+"user" mounts respectively.  User mounts may be created by
+unprivileged users.  User mounts may also be created with sysadmin
+privileges on behalf of a user, in which case nr_user_mounts may
+exceed max_user_mounts.
+
 2.2 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats
 -----------------------------------------------------------
 
Index: linux/fs/namespace.c
===================================================================
--- linux.orig/fs/namespace.c	2008-01-16 13:25:07.000000000 +0100
+++ linux/fs/namespace.c	2008-01-16 13:25:07.000000000 +0100
@@ -44,6 +44,9 @@ static struct list_head *mount_hashtable
 static struct kmem_cache *mnt_cache __read_mostly;
 static struct rw_semaphore namespace_sem;
 
+int nr_user_mounts;
+int max_user_mounts = 1024;
+
 /* /sys/fs */
 struct kobject *fs_kobj;
 EXPORT_SYMBOL_GPL(fs_kobj);
@@ -477,21 +480,70 @@ static struct vfsmount *skip_mnt_tree(st
 	return p;
 }
 
-static void set_mnt_user(struct vfsmount *mnt)
+static void dec_nr_user_mounts(void)
+{
+	spin_lock(&vfsmount_lock);
+	nr_user_mounts--;
+	spin_unlock(&vfsmount_lock);
+}
+
+static int reserve_user_mount(void)
+{
+	int err = 0;
+
+	spin_lock(&vfsmount_lock);
+	/*
+	 * EMFILE was the error returned by mount(2) in the old days,
+	 * when the mount count was limited.  Reuse this error value
+	 * to mean that the maximum number of user mounts has been
+	 * exceeded.
+	 */
+	if (nr_user_mounts >= max_user_mounts && !capable(CAP_SYS_ADMIN))
+		err = -EMFILE;
+	else
+		nr_user_mounts++;
+	spin_unlock(&vfsmount_lock);
+	return err;
+}
+
+static void __set_mnt_user(struct vfsmount *mnt)
 {
 	WARN_ON(mnt->mnt_flags & MNT_USER);
 	mnt->mnt_uid = current->fsuid;
 	mnt->mnt_flags |= MNT_USER;
 }
 
+static void set_mnt_user(struct vfsmount *mnt)
+{
+	__set_mnt_user(mnt);
+	spin_lock(&vfsmount_lock);
+	nr_user_mounts++;
+	spin_unlock(&vfsmount_lock);
+}
+
+static void clear_mnt_user(struct vfsmount *mnt)
+{
+	if (mnt->mnt_flags & MNT_USER) {
+		mnt->mnt_uid = 0;
+		mnt->mnt_flags &= ~MNT_USER;
+		dec_nr_user_mounts();
+	}
+}
+
 static struct vfsmount *clone_mnt(struct vfsmount *old, struct dentry *root,
 					int flag)
 {
 	struct super_block *sb = old->mnt_sb;
-	struct vfsmount *mnt = alloc_vfsmnt(old->mnt_devname);
+	struct vfsmount *mnt;
 
+	if (flag & CL_SETUSER) {
+		int err = reserve_user_mount();
+		if (err)
+			return ERR_PTR(err);
+	}
+	mnt = alloc_vfsmnt(old->mnt_devname);
 	if (!mnt)
-		return ERR_PTR(-ENOMEM);
+		goto alloc_failed;
 
 	mnt->mnt_flags = old->mnt_flags;
 	atomic_inc(&sb->s_active);
@@ -503,7 +555,7 @@ static struct vfsmount *clone_mnt(struct
 	/* don't copy the MNT_USER flag */
 	mnt->mnt_flags &= ~MNT_USER;
 	if (flag & CL_SETUSER)
-		set_mnt_user(mnt);
+		__set_mnt_user(mnt);
 
 	if (flag & CL_SLAVE) {
 		list_add(&mnt->mnt_slave, &old->mnt_slave_list);
@@ -528,6 +580,11 @@ static struct vfsmount *clone_mnt(struct
 		spin_unlock(&vfsmount_lock);
 	}
 	return mnt;
+
+ alloc_failed:
+	if (flag & CL_SETUSER)
+		dec_nr_user_mounts();
+	return ERR_PTR(-ENOMEM);
 }
 
 static inline void __mntput(struct vfsmount *mnt)
@@ -543,6 +600,7 @@ static inline void __mntput(struct vfsmo
 	 */
 	WARN_ON(atomic_read(&mnt->__mnt_writers));
 	dput(mnt->mnt_root);
+	clear_mnt_user(mnt);
 	free_vfsmnt(mnt);
 	deactivate_super(sb);
 }
@@ -1307,6 +1365,7 @@ static int do_remount(struct nameidata *
 	else
 		err = do_remount_sb(sb, flags, data, 0);
 	if (!err) {
+		clear_mnt_user(nd->path.mnt);
 		nd->path.mnt->mnt_flags = mnt_flags;
 		if (flags & MS_SETUSER)
 			set_mnt_user(nd->path.mnt);
Index: linux/include/linux/fs.h
===================================================================
--- linux.orig/include/linux/fs.h	2008-01-16 13:25:05.000000000 +0100
+++ linux/include/linux/fs.h	2008-01-16 13:25:07.000000000 +0100
@@ -50,6 +50,9 @@ extern struct inodes_stat_t inodes_stat;
 
 extern int leases_enable, lease_break_time;
 
+extern int nr_user_mounts;
+extern int max_user_mounts;
+
 #ifdef CONFIG_DNOTIFY
 extern int dir_notify_enable;
 #endif
Index: linux/kernel/sysctl.c
===================================================================
--- linux.orig/kernel/sysctl.c	2008-01-16 13:24:53.000000000 +0100
+++ linux/kernel/sysctl.c	2008-01-16 13:25:07.000000000 +0100
@@ -1288,6 +1288,22 @@ static struct ctl_table fs_table[] = {
 #endif	
 #endif
 	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "nr_user_mounts",
+		.data		= &nr_user_mounts,
+		.maxlen		= sizeof(int),
+		.mode		= 0444,
+		.proc_handler	= &proc_dointvec,
+	},
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "max_user_mounts",
+		.data		= &max_user_mounts,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+	},
+	{
 		.ctl_name	= KERN_SETUID_DUMPABLE,
 		.procname	= "suid_dumpable",
 		.data		= &suid_dumpable,

--
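
The accounting pattern in the patch above (reserve a slot under the lock,
allocate, and give the slot back if allocation fails) can be modelled in
userspace roughly as follows.  This is a sketch, not the kernel code: a
pthread mutex stands in for vfsmount_lock, all toy_* names are hypothetical,
and the fail_alloc parameter exists only so the rollback path can be
exercised.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for vfsmount_lock; protects the counter below. */
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static int toy_nr_user_mounts;
static int toy_max_user_mounts = 1024;	/* same default as the patch */

struct toy_mount {
	bool user;	/* models the MNT_USER flag */
};

/* Models reserve_user_mount(): take a slot, or fail with -EMFILE. */
static int toy_reserve(bool privileged)
{
	int err = 0;

	pthread_mutex_lock(&toy_lock);
	if (toy_nr_user_mounts >= toy_max_user_mounts && !privileged)
		err = -EMFILE;
	else
		toy_nr_user_mounts++;
	pthread_mutex_unlock(&toy_lock);
	return err;
}

/* Models dec_nr_user_mounts(): give a slot back. */
static void toy_release(void)
{
	pthread_mutex_lock(&toy_lock);
	assert(toy_nr_user_mounts > 0);
	toy_nr_user_mounts--;
	pthread_mutex_unlock(&toy_lock);
}

/*
 * Reserve first, allocate second, undo the reservation on failure.
 * Returns 0 and stores the new mount in *out, or a negative errno.
 */
static int toy_clone_user_mount(bool privileged, bool fail_alloc,
				struct toy_mount **out)
{
	struct toy_mount *m;
	int err = toy_reserve(privileged);

	if (err)
		return err;
	m = fail_alloc ? NULL : malloc(sizeof(*m));
	if (!m) {
		toy_release();	/* roll back, as the alloc_failed label does */
		return -ENOMEM;
	}
	m->user = true;
	*out = m;
	return 0;
}

/* Models the final put: a user mount releases its slot when freed. */
static void toy_put(struct toy_mount *m)
{
	if (m->user)
		toy_release();
	free(m);
}
```

Reserving before allocating means the limit check and the increment happen
in one critical section, so two racing callers cannot both take the last
slot.  A privileged caller bypasses the check but is still counted, which is
how the counter can end up above the maximum, as the proc.txt text in the
patch describes.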


end of thread, other threads:[~2008-02-23 19:49 UTC | newest]

Thread overview: 30+ messages
2008-02-05 21:36 [patch 00/10] mount ownership and unprivileged mount syscall (v8) Miklos Szeredi
2008-02-05 21:36 ` [patch 01/10] unprivileged mounts: add user mounts to the kernel Miklos Szeredi
2008-02-05 21:36 ` [patch 02/10] unprivileged mounts: allow unprivileged umount Miklos Szeredi
2008-02-05 21:36 ` [patch 03/10] unprivileged mounts: propagate error values from clone_mnt Miklos Szeredi
2008-02-05 21:36 ` [patch 04/10] unprivileged mounts: account user mounts Miklos Szeredi
2008-02-05 21:36 ` [patch 05/10] unprivileged mounts: allow unprivileged bind mounts Miklos Szeredi
2008-02-05 21:36 ` [patch 06/10] unprivileged mounts: allow unprivileged mounts Miklos Szeredi
2008-02-05 21:36 ` [patch 07/10] unprivileged mounts: add sysctl tunable for "safe" property Miklos Szeredi
2008-02-06 20:21   ` Serge E. Hallyn
2008-02-06 21:11     ` Miklos Szeredi
2008-02-06 22:45       ` Serge E. Hallyn
2008-02-07  8:09         ` Miklos Szeredi
2008-02-07 14:05           ` Serge E. Hallyn
2008-02-07 14:36             ` Miklos Szeredi
2008-02-07 16:57               ` Serge E. Hallyn
2008-02-07 15:33   ` Aneesh Kumar K.V
2008-02-07 16:24     ` Miklos Szeredi
2008-02-05 21:36 ` [patch 08/10] unprivileged mounts: make fuse safe Miklos Szeredi
2008-02-05 21:36 ` [patch 09/10] unprivileged mounts: propagation: inherit owner from parent Miklos Szeredi
2008-02-05 21:36 ` [patch 10/10] unprivileged mounts: add "no submounts" flag Miklos Szeredi
2008-02-15  6:21 ` [patch 00/10] mount ownership and unprivileged mount syscall (v8) Andrew Morton
2008-02-15  9:01   ` Christoph Hellwig
2008-02-15  9:09     ` Andrew Morton
2008-02-15  9:14       ` Christoph Hellwig
2008-02-18 11:47         ` Miklos Szeredi
2008-02-23 16:09           ` Al Viro
2008-02-23 17:33             ` Miklos Szeredi
2008-02-23 18:57               ` Al Viro
2008-02-23 19:48                 ` Miklos Szeredi
  -- strict thread matches above, loose matches on Subject: below --
2008-01-16 12:31 [patch 00/10] mount ownership and unprivileged mount syscall (v7) Miklos Szeredi
2008-01-16 12:31 ` [patch 04/10] unprivileged mounts: account user mounts Miklos Szeredi
