LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* patch to make Linux capabilities into something useful (v 0.3.1)
@ 2006-09-05 21:26 David Madore
  2006-09-06  0:27 ` Casey Schaufler
                   ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: David Madore @ 2006-09-05 21:26 UTC (permalink / raw)
  To: Linux Kernel mailing-list

[-- Attachment #1: Type: text/plain, Size: 2499 bytes --]

Hi.

As we all know, capabilities under Linux are currently crippled to the
point of being useless.  Attached is a patch (against 2.6.18-rc6)
which attempts to make them work in a reasonably useful way and at the
same time not break anything.  On top of the "additional" capabilities
that lead up to root, it also adds "regular" capabilities which all
processes have by default and which can be removed from specifically
untrusted programs.

All the gory details as to what it does are explained on this page:
<URL: http://www.madore.org/~david/linux/newcaps/newcaps.html >

In short: currently (i.e., prior to applying this patch), Linux has
capabilities, but they are (deliberately) crippled, and thus,
essentially useless, because nobody could agree on coherent semantics
for them; this patch uncripples them and attempts to give them
reasonable semantics that will, hopefully, neither break legacy Unix
programs nor those that use the current capabilies system
(essentially, Bind9); basically, capabilities are currently useless
because they are never inheritable (=preserved across execve()) and
this patch makes them so (but carefully enough so as not to confuse
existing programs).  Furthermore, whereas the current Linux
capabilities are only "additional" capabilities (meaning that normal,
non-root, processes, have none, and adding capabilities leads up to
root), the patch also suggests (and, to some extent, implements) a new
bunch of "regular" capabilites, which are present on all normal
processes and can be removed so as to provide some measure of
fault-containment for partially untrusted or potentially buggy
programs (thus, these new capabilities can be said to lead down).

Note: Although I believe that this patch will not break anything, it
is still little tested and should be considered alpha quality: it
should on no account be applied on security-critical systems or on a
system were local users are not to be trusted: the security
implications are quite complex and I could quite possibly be wrong in
thinking that it doesn't open any local root hole.

I'd be glad if some people could review this and check my reasoning
attempting to prove that it won't open any security holes (or, on the
contrary, exhibit some).

I'd also be glad if someone had a test suite that could be used to
check that traditional Unix behavior isn't broken after applying the
patch.

Comments are welcome,

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

[-- Attachment #2: caps-0.3.1-linux-2.6.18-rc6.patch --]
[-- Type: text/plain, Size: 15996 bytes --]

diff --git a/fs/exec.c b/fs/exec.c
index 54135df..1cb5e34 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -925,10 +925,13 @@ int prepare_binprm(struct linux_binprm *
 
 	bprm->e_uid = current->euid;
 	bprm->e_gid = current->egid;
+	bprm->is_suid = 0;
+	bprm->is_sgid = 0;
 
 	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
 		/* Set-uid? */
-		if (mode & S_ISUID) {
+		if (mode & S_ISUID && capable(CAP_REG_SXID)) {
+			bprm->is_suid = 1;
 			current->personality &= ~PER_CLEAR_ON_SETID;
 			bprm->e_uid = inode->i_uid;
 		}
@@ -939,7 +942,9 @@ int prepare_binprm(struct linux_binprm *
 		 * is a candidate for mandatory locking, not a setgid
 		 * executable.
 		 */
-		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
+		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)
+		    && capable(CAP_REG_SXID)) {
+			bprm->is_sgid = 1;
 			current->personality &= ~PER_CLEAR_ON_SETID;
 			bprm->e_gid = inode->i_gid;
 		}
@@ -1133,6 +1138,8 @@ int do_execve(char * filename,
 	int retval;
 	int i;
 
+	if (!capable(CAP_REG_EXEC))
+		return -EPERM;
 	retval = -ENOMEM;
 	bprm = kzalloc(sizeof(*bprm), GFP_KERNEL);
 	if (!bprm)
diff --git a/fs/open.c b/fs/open.c
index 303f06d..3d1fc1c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1104,7 +1104,10 @@ asmlinkage long sys_open(const char __us
 	if (force_o_largefile())
 		flags |= O_LARGEFILE;
 
-	ret = do_sys_open(AT_FDCWD, filename, flags, mode);
+	if (capable(CAP_REG_OPEN))
+		ret = do_sys_open(AT_FDCWD, filename, flags, mode);
+	else
+		ret = -EPERM;
 	/* avoid REGPARM breakage on x86: */
 	prevent_tail_call(ret);
 	return ret;
@@ -1119,7 +1122,10 @@ asmlinkage long sys_openat(int dfd, cons
 	if (force_o_largefile())
 		flags |= O_LARGEFILE;
 
-	ret = do_sys_open(dfd, filename, flags, mode);
+	if (capable(CAP_REG_OPEN))
+		ret = do_sys_open(dfd, filename, flags, mode);
+	else
+		ret = -EPERM;
 	/* avoid REGPARM breakage on x86: */
 	prevent_tail_call(ret);
 	return ret;
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 0b615d6..6724fc2 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -285,9 +285,9 @@ static inline char * task_sig(struct tas
 
 static inline char *task_cap(struct task_struct *p, char *buffer)
 {
-    return buffer + sprintf(buffer, "CapInh:\t%016x\n"
-			    "CapPrm:\t%016x\n"
-			    "CapEff:\t%016x\n",
+    return buffer + sprintf(buffer, "CapInh:\t%016llx\n"
+			    "CapPrm:\t%016llx\n"
+			    "CapEff:\t%016llx\n",
 			    cap_t(p->cap_inheritable),
 			    cap_t(p->cap_permitted),
 			    cap_t(p->cap_effective));
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index c1e82c5..c7fb183 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -29,6 +29,7 @@ struct linux_binprm{
 	struct file * file;
 	int e_uid, e_gid;
 	kernel_cap_t cap_inheritable, cap_permitted, cap_effective;
+	char is_suid, is_sgid;
 	void *security;
 	int argc, envc;
 	char * filename;	/* Name of binary as seen by procps */
diff --git a/include/linux/capability.h b/include/linux/capability.h
index 6548b35..b3a3b27 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -27,7 +27,8 @@ #include <linux/compiler.h>
    library since the draft standard requires the use of malloc/free
    etc.. */
  
-#define _LINUX_CAPABILITY_VERSION  0x19980330
+#define _LINUX_CAPABILITY_VERSION  0x20060903
+#define _LINUX_CAPABILITY_OLD_VERSION  0x19980330
 
 typedef struct __user_cap_header_struct {
 	__u32 version;
@@ -35,10 +36,16 @@ typedef struct __user_cap_header_struct 
 } __user *cap_user_header_t;
  
 typedef struct __user_cap_data_struct {
+        __u64 effective;
+        __u64 permitted;
+        __u64 inheritable;
+} __user *cap_user_data_t;
+ 
+typedef struct __user_cap_data_old_struct {
         __u32 effective;
         __u32 permitted;
         __u32 inheritable;
-} __user *cap_user_data_t;
+} __user *cap_user_data_old_t;
   
 #ifdef __KERNEL__
 
@@ -50,12 +57,12 @@ #include <asm/current.h>
 #ifdef STRICT_CAP_T_TYPECHECKS
 
 typedef struct kernel_cap_struct {
-	__u32 cap;
+	__u64 cap;
 } kernel_cap_t;
 
 #else
 
-typedef __u32 kernel_cap_t;
+typedef __u64 kernel_cap_t;
 
 #endif
   
@@ -288,6 +295,23 @@ #define CAP_AUDIT_WRITE      29
 
 #define CAP_AUDIT_CONTROL    30
 
+
+/**
+ ** Regular capabilities (normally possessed by all processes).
+ **/
+
+/* Can fork() */
+#define CAP_REG_FORK         32
+
+/* Can open() */
+#define CAP_REG_OPEN         33
+
+/* Can exec() */
+#define CAP_REG_EXEC         34
+
+/* Might gain permissions on exec() */
+#define CAP_REG_SXID         35
+
 #ifdef __KERNEL__
 /* 
  * Bounding set
@@ -310,12 +334,13 @@ #define cap_t(x) (x)
 
 #endif
 
-#define CAP_EMPTY_SET       to_cap_t(0)
-#define CAP_FULL_SET        to_cap_t(~0)
-#define CAP_INIT_EFF_SET    to_cap_t(~0 & ~CAP_TO_MASK(CAP_SETPCAP))
-#define CAP_INIT_INH_SET    to_cap_t(0)
+#define CAP_EMPTY_SET       to_cap_t(0ULL)
+#define CAP_FULL_SET        to_cap_t(~0ULL)
+#define CAP_REGULAR_SET     to_cap_t(0xffffffff00000000ULL)
+#define CAP_INIT_EFF_SET    to_cap_t(~0ULL)
+#define CAP_INIT_INH_SET    to_cap_t(~0ULL)
 
-#define CAP_TO_MASK(x) (1 << (x))
+#define CAP_TO_MASK(x) (1ULL << (x))
 #define cap_raise(c, flag)   (cap_t(c) |=  CAP_TO_MASK(flag))
 #define cap_lower(c, flag)   (cap_t(c) &= ~CAP_TO_MASK(flag))
 #define cap_raised(c, flag)  (cap_t(c) & CAP_TO_MASK(flag))
@@ -351,8 +376,8 @@ static inline kernel_cap_t cap_invert(ke
 #define cap_isclear(c)       (!cap_t(c))
 #define cap_issubset(a,set)  (!(cap_t(a) & ~cap_t(set)))
 
-#define cap_clear(c)         do { cap_t(c) =  0; } while(0)
-#define cap_set_full(c)      do { cap_t(c) = ~0; } while(0)
+#define cap_clear(c)         do { cap_t(c) =  0ULL; } while(0)
+#define cap_set_full(c)      do { cap_t(c) = ~0ULL; } while(0)
 #define cap_mask(c,mask)     do { cap_t(c) &= cap_t(mask); } while(0)
 
 #define cap_is_fs_cap(c)     (CAP_TO_MASK(c) & CAP_FS_MASK)
diff --git a/kernel/capability.c b/kernel/capability.c
index c7685ad..c090570 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -15,7 +15,7 @@ #include <linux/syscalls.h>
 #include <asm/uaccess.h>
 
 unsigned securebits = SECUREBITS_DEFAULT; /* systemwide security settings */
-kernel_cap_t cap_bset = CAP_INIT_EFF_SET;
+kernel_cap_t cap_bset = CAP_INIT_INH_SET;
 
 EXPORT_SYMBOL(securebits);
 EXPORT_SYMBOL(cap_bset);
@@ -52,7 +52,8 @@ asmlinkage long sys_capget(cap_user_head
      if (get_user(version, &header->version))
 	     return -EFAULT;
 
-     if (version != _LINUX_CAPABILITY_VERSION) {
+     if (version != _LINUX_CAPABILITY_VERSION
+	 && version != _LINUX_CAPABILITY_OLD_VERSION) {
 	     if (put_user(_LINUX_CAPABILITY_VERSION, &header->version))
 		     return -EFAULT; 
              return -EINVAL;
@@ -82,8 +83,18 @@ out:
      read_unlock(&tasklist_lock); 
      spin_unlock(&task_capability_lock);
 
-     if (!ret && copy_to_user(dataptr, &data, sizeof data))
-          return -EFAULT; 
+     if (!ret) {
+	     if (version == _LINUX_CAPABILITY_OLD_VERSION) {
+		     struct __user_cap_data_old_struct data_old;
+		     data_old.effective = data_old.effective & 0xffffffffULL;
+		     data_old.permitted = data_old.permitted & 0xffffffffULL;
+		     data_old.inheritable = data_old.inheritable & 0xffffffffULL;
+		     if (copy_to_user(dataptr, &data_old, sizeof data_old))
+			     return -EFAULT;
+	     } else
+		     if (copy_to_user(dataptr, &data, sizeof data))
+			     return -EFAULT;
+     }
 
      return ret;
 }
@@ -179,7 +190,8 @@ asmlinkage long sys_capset(cap_user_head
      if (get_user(version, &header->version))
 	     return -EFAULT; 
 
-     if (version != _LINUX_CAPABILITY_VERSION) {
+     if (version != _LINUX_CAPABILITY_VERSION
+	 && version != _LINUX_CAPABILITY_OLD_VERSION) {
 	     if (put_user(_LINUX_CAPABILITY_VERSION, &header->version))
 		     return -EFAULT; 
              return -EINVAL;
@@ -191,10 +203,23 @@ asmlinkage long sys_capset(cap_user_head
      if (pid && pid != current->pid && !capable(CAP_SETPCAP))
              return -EPERM;
 
-     if (copy_from_user(&effective, &data->effective, sizeof(effective)) ||
-	 copy_from_user(&inheritable, &data->inheritable, sizeof(inheritable)) ||
-	 copy_from_user(&permitted, &data->permitted, sizeof(permitted)))
-	     return -EFAULT; 
+     if (version == _LINUX_CAPABILITY_OLD_VERSION) {
+	     const cap_user_data_old_t data2 = (void *)data;
+	     __u32 w;
+	     if (copy_from_user(&w, &data2->effective, sizeof(w)))
+		     return -EFAULT;
+	     effective = (__u64)w | 0xffffffff00000000ULL;
+	     if (copy_from_user(&w, &data2->inheritable, sizeof(w)))
+		     return -EFAULT;
+	     inheritable = (__u64)w | 0xffffffff00000000ULL;
+	     if (copy_from_user(&w, &data2->permitted, sizeof(w)))
+		     return -EFAULT;
+	     permitted = (__u64)w | 0xffffffff00000000ULL;
+     } else
+	     if (copy_from_user(&effective, &data->effective, sizeof(effective)) ||
+		 copy_from_user(&inheritable, &data->inheritable, sizeof(inheritable)) ||
+		 copy_from_user(&permitted, &data->permitted, sizeof(permitted)))
+		     return -EFAULT; 
 
      spin_lock(&task_capability_lock);
      read_lock(&tasklist_lock);
diff --git a/kernel/fork.c b/kernel/fork.c
index f9b014e..20f559f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1347,6 +1347,8 @@ long do_fork(unsigned long clone_flags,
 	struct pid *pid = alloc_pid();
 	long nr;
 
+	if (!capable(CAP_REG_FORK))
+		return -EPERM;
 	if (!pid)
 		return -EAGAIN;
 	nr = pid->nr;
diff --git a/security/commoncap.c b/security/commoncap.c
index f50fc29..39596b4 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -97,6 +97,8 @@ int cap_capset_check (struct task_struct
 	if (!cap_issubset (*effective, *permitted)) {
 		return -EPERM;
 	}
+	/* we allow Inheritable not to be a subset of Permitted:
+	 * cap_capset_set will intersect them anyway */
 
 	return 0;
 }
@@ -105,7 +107,7 @@ void cap_capset_set (struct task_struct 
 		     kernel_cap_t *inheritable, kernel_cap_t *permitted)
 {
 	target->cap_effective = *effective;
-	target->cap_inheritable = *inheritable;
+	target->cap_inheritable = cap_intersect (*effective, *inheritable);
 	target->cap_permitted = *permitted;
 }
 
@@ -114,25 +116,20 @@ int cap_bprm_set_security (struct linux_
 	/* Copied from fs/exec.c:prepare_binprm. */
 
 	/* We don't have VFS support for capabilities yet */
-	cap_clear (bprm->cap_inheritable);
+	cap_set_full (bprm->cap_inheritable);
 	cap_clear (bprm->cap_permitted);
-	cap_clear (bprm->cap_effective);
+	cap_set_full (bprm->cap_effective);
 
 	/*  To support inheritance of root-permissions and suid-root
 	 *  executables under compatibility mode, we raise all three
 	 *  capability sets for the file.
-	 *
-	 *  If only the real uid is 0, we only raise the inheritable
-	 *  and permitted sets of the executable file.
 	 */
-
 	if (!issecure (SECURE_NOROOT)) {
-		if (bprm->e_uid == 0 || current->uid == 0) {
+		if (bprm->is_suid && bprm->e_uid == 0) {
 			cap_set_full (bprm->cap_inheritable);
 			cap_set_full (bprm->cap_permitted);
-		}
-		if (bprm->e_uid == 0)
 			cap_set_full (bprm->cap_effective);
+		}
 	}
 	return 0;
 }
@@ -140,13 +137,25 @@ int cap_bprm_set_security (struct linux_
 void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
 {
 	/* Derived from fs/exec.c:compute_creds. */
-	kernel_cap_t new_permitted, working;
+	kernel_cap_t new_permitted, new_effective, working;
+	uid_t old_ruid, old_euid, old_suid;
 
+	/* P'(per) = (P(inh) & F(inh)) | (F(per) & bset) */
 	new_permitted = cap_intersect (bprm->cap_permitted, cap_bset);
 	working = cap_intersect (bprm->cap_inheritable,
 				 current->cap_inheritable);
 	new_permitted = cap_combine (new_permitted, working);
 
+	/* P'(eff) = (P(inh) & P(eff) & F(inh)) | (F(per) & F(eff) & bset) */
+	new_effective = cap_intersect (bprm->cap_permitted, bprm->cap_effective);
+	new_effective = cap_intersect (new_effective, cap_bset);
+	working = cap_intersect (bprm->cap_inheritable,
+				 current->cap_effective);
+	working = cap_intersect (working, current->cap_inheritable);
+	new_effective = cap_combine (new_effective, working);
+
+	/* P'(inh) = P'(per) */
+
 	if (bprm->e_uid != current->uid || bprm->e_gid != current->gid ||
 	    !cap_issubset (new_permitted, current->cap_permitted)) {
 		current->mm->dumpable = suid_dumpable;
@@ -159,25 +168,27 @@ void cap_bprm_apply_creds (struct linux_
 			if (!capable (CAP_SETPCAP)) {
 				new_permitted = cap_intersect (new_permitted,
 							current->cap_permitted);
+				new_effective = cap_intersect (new_permitted,
+							       new_effective);
 			}
 		}
 	}
 
+	old_ruid = current->uid;
+	old_euid = current->euid;
+	old_suid = current->suid;
 	current->suid = current->euid = current->fsuid = bprm->e_uid;
 	current->sgid = current->egid = current->fsgid = bprm->e_gid;
 
-	/* For init, we want to retain the capabilities set
-	 * in the init_task struct. Thus we skip the usual
-	 * capability rules */
-	if (current->pid != 1) {
-		current->cap_permitted = new_permitted;
-		current->cap_effective =
-		    cap_intersect (new_permitted, bprm->cap_effective);
-	}
-
-	/* AUD: Audit candidate if current->cap_effective is set */
+	current->cap_permitted = new_permitted;
+	current->cap_effective = new_effective;
+	current->cap_inheritable = new_permitted;
 
 	current->keep_capabilities = 0;
+	/* Make sure we drop capabilities if required by suid. */
+	cap_task_post_setuid (old_ruid, old_euid, old_suid, LSM_SETID_RES);
+
+	/* AUD: Audit candidate if current->cap_effective is set */
 }
 
 int cap_bprm_secureexec (struct linux_binprm *bprm)
@@ -187,8 +198,8 @@ int cap_bprm_secureexec (struct linux_bi
 	   test between the old and new capability sets.  For now,
 	   it simply preserves the legacy decision algorithm used by
 	   the old userland. */
-	return (current->euid != current->uid ||
-		current->egid != current->gid);
+	return ((bprm->is_suid || bprm->is_sgid)
+		&& !cap_issubset (cap_bset, current->cap_permitted));
 }
 
 int cap_inode_setxattr(struct dentry *dentry, char *name, void *value,
@@ -244,15 +255,24 @@ static inline void cap_emulate_setxuid (
 					int old_suid)
 {
 	if ((old_ruid == 0 || old_euid == 0 || old_suid == 0) &&
-	    (current->uid != 0 && current->euid != 0 && current->suid != 0) &&
-	    !current->keep_capabilities) {
-		cap_clear (current->cap_permitted);
-		cap_clear (current->cap_effective);
+	    (current->uid != 0 && current->euid != 0 && current->suid != 0)) {
+		if (!current->keep_capabilities) {
+			current->cap_permitted
+				= cap_intersect (current->cap_permitted,
+						 CAP_REGULAR_SET);
+			current->cap_effective
+				= cap_intersect (current->cap_effective,
+						 CAP_REGULAR_SET);
+		}
+		current->cap_inheritable
+			= cap_intersect (current->cap_inheritable,
+					 CAP_REGULAR_SET);
 	}
-	if (old_euid == 0 && current->euid != 0) {
-		cap_clear (current->cap_effective);
+	if (old_euid == 0 && current->euid != 0 && !current->keep_capabilities) {
+		current->cap_effective = cap_intersect (current->cap_effective,
+							CAP_REGULAR_SET);
 	}
-	if (old_euid != 0 && current->euid == 0) {
+	if (old_euid != 0 && current->euid == 0 && !current->keep_capabilities) {
 		current->cap_effective = current->cap_permitted;
 	}
 }
diff --git a/security/dummy.c b/security/dummy.c
index 58c6d39..572a15b 100644
--- a/security/dummy.c
+++ b/security/dummy.c
@@ -37,11 +37,11 @@ static int dummy_ptrace (struct task_str
 static int dummy_capget (struct task_struct *target, kernel_cap_t * effective,
 			 kernel_cap_t * inheritable, kernel_cap_t * permitted)
 {
-	*effective = *inheritable = *permitted = 0;
+	*effective = *inheritable = *permitted = CAP_REGULAR_SET;
 	if (!issecure(SECURE_NOROOT)) {
 		if (target->euid == 0) {
-			*permitted |= (~0 & ~CAP_FS_MASK);
-			*effective |= (~0 & ~CAP_TO_MASK(CAP_SETPCAP) & ~CAP_FS_MASK);
+			*permitted |= (CAP_FULL_SET & ~CAP_FS_MASK);
+			*effective |= (CAP_FULL_SET & ~CAP_TO_MASK(CAP_SETPCAP) & ~CAP_FS_MASK);
 		}
 		if (target->fsuid == 0) {
 			*permitted |= CAP_FS_MASK;

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-05 21:26 patch to make Linux capabilities into something useful (v 0.3.1) David Madore
@ 2006-09-06  0:27 ` Casey Schaufler
  2006-09-06 10:06   ` David Madore
  2006-09-06 18:25 ` Serge E. Hallyn
  2006-09-07 18:21 ` James Antill
  2 siblings, 1 reply; 52+ messages in thread
From: Casey Schaufler @ 2006-09-06  0:27 UTC (permalink / raw)
  To: David Madore, Linux Kernel mailing-list


--- David Madore <david.madore@ens.fr> wrote:

> As we all know, capabilities under Linux are
> currently crippled to the
> point of being useless.

The current work in progress to support
capability set on files will address this
longstanding issue.

> ...

> On top of the
> "additional" capabilities
> that lead up to root, it also adds "regular"
> capabilities which all
> processes have by default and which can be removed
> from specifically
> untrusted programs.

Not a bad idea, but the notion of underprivileged
processes has been tried before. The capability
mechanism is explicitly designed to provide for
the seperation and management of privilege and
taking it in the "other" direction requires
a rethinking of the inheritance mechanism.
I understand that today's scheme that does not
do inheritance has problems, but that should
soon be resolved.

> All the gory details as to what it does are
> explained on this page:
> <URL:
>
http://www.madore.org/~david/linux/newcaps/newcaps.html

A fine page, BTW. Thank you.
 
> In short: currently (i.e., prior to applying this
> patch), Linux has
> capabilities, but they are (deliberately) crippled,

The crippling is not deliberate. It is
unfortunate and represents a number of complex
issues that are being resolved. Finally.

> ...  Furthermore, whereas the
> current Linux
> capabilities are only "additional" capabilities
> (meaning that normal,
> non-root, processes, have none, and adding
> capabilities leads up to
> root), the patch also suggests (and, to some extent,
> implements) a new
> bunch of "regular" capabilites, which are present on
> all normal
> processes and can be removed so as to provide some
> measure of
> fault-containment for partially untrusted or
> potentially buggy
> programs (thus, these new capabilities can be said
> to lead down).

Again, the capability scheme is intended to
address the omnipotent userid problem. It pulls
the userid and privilege apart. It also provides
a more granular privilege. But it does not change
what operations require privilege. That is left
to wiser minds.

> ...
> 
> I'd be glad if some people could review this and
> check my reasoning
> attempting to prove that it won't open any security
> holes (or, on the
> contrary, exhibit some).

The fundimental flaw in your reasoning is
the assumption that linux will never have
capability inheritence.

> Comments are welcome,

You did say so!


Casey Schaufler
casey@schaufler-ca.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-06  0:27 ` Casey Schaufler
@ 2006-09-06 10:06   ` David Madore
  2006-09-06 13:26     ` David Madore
  0 siblings, 1 reply; 52+ messages in thread
From: David Madore @ 2006-09-06 10:06 UTC (permalink / raw)
  To: Linux Kernel mailing-list

On Wed, Sep 06, 2006 at 12:27:50AM +0000, Casey Schaufler wrote:
> --- David Madore <david.madore@ens.fr> wrote:
> > As we all know, capabilities under Linux are
> > currently crippled to the
> > point of being useless.
> 
> The current work in progress to support
> capability set on files will address this
> longstanding issue.

It seems to me that the issues of the capability inheritance semantics
and the capability filesystem support are quite orthogonal.  My patch
provides the first, and will quite happily live with a patch such as
<URL: http://lwn.net/Articles/142507/ > providing filesystem support.

Even in the absence of filesystem support, there is no reason for
capabilities not to be inheritable: this is what my patch addresses.
Of course, it is even more interesting in the presence of filesystem
support.  (I could provide a combined patch that would do both, with
xattrs, as a proof of concept.)

> Not a bad idea, but the notion of underprivileged
> processes has been tried before. The capability
> mechanism is explicitly designed to provide for
> the seperation and management of privilege and
> taking it in the "other" direction requires
> a rethinking of the inheritance mechanism.

Yes, it required a slight rethinking, and that is precisely what I am
providing: <URL: http://www.madore.org/~david/linux/newcaps/#semantics >.
Do you see anything specificly wrong with it?

> > In short: currently (i.e., prior to applying this
> > patch), Linux has
> > capabilities, but they are (deliberately) crippled,
> 
> The crippling is not deliberate.

At least the crippling of CAP_SETPCAP was deliberate and unnecessary:
it was done following an incorrect analysis by the sendmail team of a
caps-related sendmail exploit under Linux.

>				   It is
> unfortunate and represents a number of complex
> issues that are being resolved. Finally.

Resolving them is precisely what I proposed to do.  If you are saying
someone else also proposed to do the same, can you point to that work?
Perhaps we could merge usefully and thus go forward faster.

> Again, the capability scheme is intended to
> address the omnipotent userid problem. It pulls
> the userid and privilege apart. It also provides
> a more granular privilege. But it does not change
> what operations require privilege. That is left
> to wiser minds.

I don't quite understand what you're saying here.  Do you see
something wrong with my proposal for doing it?

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-06 10:06   ` David Madore
@ 2006-09-06 13:26     ` David Madore
  2006-09-07  0:11       ` Casey Schaufler
  2006-09-09 23:18       ` Theodore Tso
  0 siblings, 2 replies; 52+ messages in thread
From: David Madore @ 2006-09-06 13:26 UTC (permalink / raw)
  To: Linux Kernel mailing-list

On Wed, Sep 06, 2006 at 10:06:35AM +0000, David Madore wrote:
> On Wed, Sep 06, 2006 at 12:27:50AM +0000, Casey Schaufler wrote:
> > The current work in progress to support
> > capability set on files will address this
> > longstanding issue.
> 
> It seems to me that the issues of the capability inheritance semantics
> and the capability filesystem support are quite orthogonal.  My patch
> provides the first, and will quite happily live with a patch such as
> <URL: http://lwn.net/Articles/142507/ > providing filesystem support.
> 
> Even in the absence of filesystem support, there is no reason for
> capabilities not to be inheritable: this is what my patch addresses.
> Of course, it is even more interesting in the presence of filesystem
> support.  (I could provide a combined patch that would do both, with
> xattrs, as a proof of concept.)

Followup on this: maybe you were referring to the patch in <URL:
http://groups.google.com/group/fa.linux.kernel/msg/61d191383c8b19f9
 > (= <URL: http://lkml.org/lkml/2006/7/29/221 >), by Serge E. Hallyn,
which adds filesystem support for capabilities.  I haven't actually
checked in detail, but reading throught it, it appears to be quite
compatible with my own patch (one merely needs to do something about
the new bunch of capabilities I've introduced, but it should be easy
to hack something which makes sure no programs are surprised or
broken).  I'll try to come up with a combined patch soon, which will
add both the inheritability support I suggest *and* filesystem
support.

I emphasize that the filesystem support patch described above, alone,
will *not* solve the inheritability problem (as my patch does), since
unmarked executables continue to inherit no caps at all.  With my
patch, they behave as though they had a full inheritable set,
something which is required if we want to make something useful of
capabilities on non-caps-aware programs.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-05 21:26 patch to make Linux capabilities into something useful (v 0.3.1) David Madore
  2006-09-06  0:27 ` Casey Schaufler
@ 2006-09-06 18:25 ` Serge E. Hallyn
  2006-09-06 22:27   ` David Madore
  2006-09-07 18:21 ` James Antill
  2 siblings, 1 reply; 52+ messages in thread
From: Serge E. Hallyn @ 2006-09-06 18:25 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list

Hi,

The fact that you're changing the inheritance rules is a bit scary, so
I'm going to (and I hope others will) take some time to look it over.

In the meantime, so long as you're adding some new capabilities, how
about also splitting up a few like CAP_SYS_ADMIN?  Have you looked into
it and decided none are really separable, i.e. any subset leads to the
ability to get any other subset?

I'd recommend you split this patch into at least 3:
	1. move to 64-bit caps
	2. introduce your new caps
		(perhaps even one new cap per patch)
	3. introduce the new inheritance rules

thanks,
-serge

Quoting David Madore (david.madore@ens.fr):
> Hi.
> 
> As we all know, capabilities under Linux are currently crippled to the
> point of being useless.  Attached is a patch (against 2.6.18-rc6)
> which attempts to make them work in a reasonably useful way and at the
> same time not break anything.  On top of the "additional" capabilities
> that lead up to root, it also adds "regular" capabilities which all
> processes have by default and which can be removed from specifically
> untrusted programs.
> 
> All the gory details as to what it does are explained on this page:
> <URL: http://www.madore.org/~david/linux/newcaps/newcaps.html >
> 
> In short: currently (i.e., prior to applying this patch), Linux has
> capabilities, but they are (deliberately) crippled, and thus,
> essentially useless, because nobody could agree on coherent semantics
> for them; this patch uncripples them and attempts to give them
> reasonable semantics that will, hopefully, neither break legacy Unix
> programs nor those that use the current capabilies system
> (essentially, Bind9); basically, capabilities are currently useless
> because they are never inheritable (=preserved across execve()) and
> this patch makes them so (but carefully enough so as not to confuse
> existing programs).  Furthermore, whereas the current Linux
> capabilities are only "additional" capabilities (meaning that normal,
> non-root, processes, have none, and adding capabilities leads up to
> root), the patch also suggests (and, to some extent, implements) a new
> bunch of "regular" capabilites, which are present on all normal
> processes and can be removed so as to provide some measure of
> fault-containment for partially untrusted or potentially buggy
> programs (thus, these new capabilities can be said to lead down).
> 
> Note: Although I believe that this patch will not break anything, it
> is still little tested and should be considered alpha quality: it
> should on no account be applied on security-critical systems or on a
> system were local users are not to be trusted: the security
> implications are quite complex and I could quite possibly be wrong in
> thinking that it doesn't open any local root hole.
> 
> I'd be glad if some people could review this and check my reasoning
> attempting to prove that it won't open any security holes (or, on the
> contrary, exhibit some).
> 
> I'd also be glad if someone had a test suite that could be used to
> check that traditional Unix behavior isn't broken after applying the
> patch.
> 
> Comments are welcome,
> 
> -- 
>      David A. Madore
>     (david.madore@ens.fr,
>      http://www.madore.org/~david/ )

> diff --git a/fs/exec.c b/fs/exec.c
> index 54135df..1cb5e34 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -925,10 +925,13 @@ int prepare_binprm(struct linux_binprm *
>  
>  	bprm->e_uid = current->euid;
>  	bprm->e_gid = current->egid;
> +	bprm->is_suid = 0;
> +	bprm->is_sgid = 0;
>  
>  	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
>  		/* Set-uid? */
> -		if (mode & S_ISUID) {
> +		if (mode & S_ISUID && capable(CAP_REG_SXID)) {
> +			bprm->is_suid = 1;
>  			current->personality &= ~PER_CLEAR_ON_SETID;
>  			bprm->e_uid = inode->i_uid;
>  		}
> @@ -939,7 +942,9 @@ int prepare_binprm(struct linux_binprm *
>  		 * is a candidate for mandatory locking, not a setgid
>  		 * executable.
>  		 */
> -		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
> +		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)
> +		    && capable(CAP_REG_SXID)) {
> +			bprm->is_sgid = 1;
>  			current->personality &= ~PER_CLEAR_ON_SETID;
>  			bprm->e_gid = inode->i_gid;
>  		}
> @@ -1133,6 +1138,8 @@ int do_execve(char * filename,
>  	int retval;
>  	int i;
>  
> +	if (!capable(CAP_REG_EXEC))
> +		return -EPERM;
>  	retval = -ENOMEM;
>  	bprm = kzalloc(sizeof(*bprm), GFP_KERNEL);
>  	if (!bprm)
> diff --git a/fs/open.c b/fs/open.c
> index 303f06d..3d1fc1c 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -1104,7 +1104,10 @@ asmlinkage long sys_open(const char __us
>  	if (force_o_largefile())
>  		flags |= O_LARGEFILE;
>  
> -	ret = do_sys_open(AT_FDCWD, filename, flags, mode);
> +	if (capable(CAP_REG_OPEN))
> +		ret = do_sys_open(AT_FDCWD, filename, flags, mode);
> +	else
> +		ret = -EPERM;
>  	/* avoid REGPARM breakage on x86: */
>  	prevent_tail_call(ret);
>  	return ret;
> @@ -1119,7 +1122,10 @@ asmlinkage long sys_openat(int dfd, cons
>  	if (force_o_largefile())
>  		flags |= O_LARGEFILE;
>  
> -	ret = do_sys_open(dfd, filename, flags, mode);
> +	if (capable(CAP_REG_OPEN))
> +		ret = do_sys_open(dfd, filename, flags, mode);
> +	else
> +		ret = -EPERM;
>  	/* avoid REGPARM breakage on x86: */
>  	prevent_tail_call(ret);
>  	return ret;
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 0b615d6..6724fc2 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -285,9 +285,9 @@ static inline char * task_sig(struct tas
>  
>  static inline char *task_cap(struct task_struct *p, char *buffer)
>  {
> -    return buffer + sprintf(buffer, "CapInh:\t%016x\n"
> -			    "CapPrm:\t%016x\n"
> -			    "CapEff:\t%016x\n",
> +    return buffer + sprintf(buffer, "CapInh:\t%016llx\n"
> +			    "CapPrm:\t%016llx\n"
> +			    "CapEff:\t%016llx\n",
>  			    cap_t(p->cap_inheritable),
>  			    cap_t(p->cap_permitted),
>  			    cap_t(p->cap_effective));
> diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
> index c1e82c5..c7fb183 100644
> --- a/include/linux/binfmts.h
> +++ b/include/linux/binfmts.h
> @@ -29,6 +29,7 @@ struct linux_binprm{
>  	struct file * file;
>  	int e_uid, e_gid;
>  	kernel_cap_t cap_inheritable, cap_permitted, cap_effective;
> +	char is_suid, is_sgid;
>  	void *security;
>  	int argc, envc;
>  	char * filename;	/* Name of binary as seen by procps */
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index 6548b35..b3a3b27 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -27,7 +27,8 @@ #include <linux/compiler.h>
>     library since the draft standard requires the use of malloc/free
>     etc.. */
>   
> -#define _LINUX_CAPABILITY_VERSION  0x19980330
> +#define _LINUX_CAPABILITY_VERSION  0x20060903
> +#define _LINUX_CAPABILITY_OLD_VERSION  0x19980330
>  
>  typedef struct __user_cap_header_struct {
>  	__u32 version;
> @@ -35,10 +36,16 @@ typedef struct __user_cap_header_struct 
>  } __user *cap_user_header_t;
>   
>  typedef struct __user_cap_data_struct {
> +        __u64 effective;
> +        __u64 permitted;
> +        __u64 inheritable;
> +} __user *cap_user_data_t;
> + 
> +typedef struct __user_cap_data_old_struct {
>          __u32 effective;
>          __u32 permitted;
>          __u32 inheritable;
> -} __user *cap_user_data_t;
> +} __user *cap_user_data_old_t;
>    
>  #ifdef __KERNEL__
>  
> @@ -50,12 +57,12 @@ #include <asm/current.h>
>  #ifdef STRICT_CAP_T_TYPECHECKS
>  
>  typedef struct kernel_cap_struct {
> -	__u32 cap;
> +	__u64 cap;
>  } kernel_cap_t;
>  
>  #else
>  
> -typedef __u32 kernel_cap_t;
> +typedef __u64 kernel_cap_t;
>  
>  #endif
>    
> @@ -288,6 +295,23 @@ #define CAP_AUDIT_WRITE      29
>  
>  #define CAP_AUDIT_CONTROL    30
>  
> +
> +/**
> + ** Regular capabilities (normally possessed by all processes).
> + **/
> +
> +/* Can fork() */
> +#define CAP_REG_FORK         32
> +
> +/* Can open() */
> +#define CAP_REG_OPEN         33
> +
> +/* Can exec() */
> +#define CAP_REG_EXEC         34
> +
> +/* Might gain permissions on exec() */
> +#define CAP_REG_SXID         35
> +
>  #ifdef __KERNEL__
>  /* 
>   * Bounding set
> @@ -310,12 +334,13 @@ #define cap_t(x) (x)
>  
>  #endif
>  
> -#define CAP_EMPTY_SET       to_cap_t(0)
> -#define CAP_FULL_SET        to_cap_t(~0)
> -#define CAP_INIT_EFF_SET    to_cap_t(~0 & ~CAP_TO_MASK(CAP_SETPCAP))
> -#define CAP_INIT_INH_SET    to_cap_t(0)
> +#define CAP_EMPTY_SET       to_cap_t(0ULL)
> +#define CAP_FULL_SET        to_cap_t(~0ULL)
> +#define CAP_REGULAR_SET     to_cap_t(0xffffffff00000000ULL)
> +#define CAP_INIT_EFF_SET    to_cap_t(~0ULL)
> +#define CAP_INIT_INH_SET    to_cap_t(~0ULL)
>  
> -#define CAP_TO_MASK(x) (1 << (x))
> +#define CAP_TO_MASK(x) (1ULL << (x))
>  #define cap_raise(c, flag)   (cap_t(c) |=  CAP_TO_MASK(flag))
>  #define cap_lower(c, flag)   (cap_t(c) &= ~CAP_TO_MASK(flag))
>  #define cap_raised(c, flag)  (cap_t(c) & CAP_TO_MASK(flag))
> @@ -351,8 +376,8 @@ static inline kernel_cap_t cap_invert(ke
>  #define cap_isclear(c)       (!cap_t(c))
>  #define cap_issubset(a,set)  (!(cap_t(a) & ~cap_t(set)))
>  
> -#define cap_clear(c)         do { cap_t(c) =  0; } while(0)
> -#define cap_set_full(c)      do { cap_t(c) = ~0; } while(0)
> +#define cap_clear(c)         do { cap_t(c) =  0ULL; } while(0)
> +#define cap_set_full(c)      do { cap_t(c) = ~0ULL; } while(0)
>  #define cap_mask(c,mask)     do { cap_t(c) &= cap_t(mask); } while(0)
>  
>  #define cap_is_fs_cap(c)     (CAP_TO_MASK(c) & CAP_FS_MASK)
> diff --git a/kernel/capability.c b/kernel/capability.c
> index c7685ad..c090570 100644
> --- a/kernel/capability.c
> +++ b/kernel/capability.c
> @@ -15,7 +15,7 @@ #include <linux/syscalls.h>
>  #include <asm/uaccess.h>
>  
>  unsigned securebits = SECUREBITS_DEFAULT; /* systemwide security settings */
> -kernel_cap_t cap_bset = CAP_INIT_EFF_SET;
> +kernel_cap_t cap_bset = CAP_INIT_INH_SET;
>  
>  EXPORT_SYMBOL(securebits);
>  EXPORT_SYMBOL(cap_bset);
> @@ -52,7 +52,8 @@ asmlinkage long sys_capget(cap_user_head
>       if (get_user(version, &header->version))
>  	     return -EFAULT;
>  
> -     if (version != _LINUX_CAPABILITY_VERSION) {
> +     if (version != _LINUX_CAPABILITY_VERSION
> +	 && version != _LINUX_CAPABILITY_OLD_VERSION) {
>  	     if (put_user(_LINUX_CAPABILITY_VERSION, &header->version))
>  		     return -EFAULT; 
>               return -EINVAL;
> @@ -82,8 +83,18 @@ out:
>       read_unlock(&tasklist_lock); 
>       spin_unlock(&task_capability_lock);
>  
> -     if (!ret && copy_to_user(dataptr, &data, sizeof data))
> -          return -EFAULT; 
> +     if (!ret) {
> +	     if (version == _LINUX_CAPABILITY_OLD_VERSION) {
> +		     struct __user_cap_data_old_struct data_old;
> +		     data_old.effective = data_old.effective & 0xffffffffULL;
> +		     data_old.permitted = data_old.permitted & 0xffffffffULL;
> +		     data_old.inheritable = data_old.inheritable & 0xffffffffULL;
> +		     if (copy_to_user(dataptr, &data_old, sizeof data_old))
> +			     return -EFAULT;
> +	     } else
> +		     if (copy_to_user(dataptr, &data, sizeof data))
> +			     return -EFAULT;
> +     }
>  
>       return ret;
>  }
> @@ -179,7 +190,8 @@ asmlinkage long sys_capset(cap_user_head
>       if (get_user(version, &header->version))
>  	     return -EFAULT; 
>  
> -     if (version != _LINUX_CAPABILITY_VERSION) {
> +     if (version != _LINUX_CAPABILITY_VERSION
> +	 && version != _LINUX_CAPABILITY_OLD_VERSION) {
>  	     if (put_user(_LINUX_CAPABILITY_VERSION, &header->version))
>  		     return -EFAULT; 
>               return -EINVAL;
> @@ -191,10 +203,23 @@ asmlinkage long sys_capset(cap_user_head
>       if (pid && pid != current->pid && !capable(CAP_SETPCAP))
>               return -EPERM;
>  
> -     if (copy_from_user(&effective, &data->effective, sizeof(effective)) ||
> -	 copy_from_user(&inheritable, &data->inheritable, sizeof(inheritable)) ||
> -	 copy_from_user(&permitted, &data->permitted, sizeof(permitted)))
> -	     return -EFAULT; 
> +     if (version == _LINUX_CAPABILITY_OLD_VERSION) {
> +	     const cap_user_data_old_t data2 = (void *)data;
> +	     __u32 w;
> +	     if (copy_from_user(&w, &data2->effective, sizeof(w)))
> +		     return -EFAULT;
> +	     effective = (__u64)w | 0xffffffff00000000ULL;
> +	     if (copy_from_user(&w, &data2->inheritable, sizeof(w)))
> +		     return -EFAULT;
> +	     inheritable = (__u64)w | 0xffffffff00000000ULL;
> +	     if (copy_from_user(&w, &data2->permitted, sizeof(w)))
> +		     return -EFAULT;
> +	     permitted = (__u64)w | 0xffffffff00000000ULL;
> +     } else
> +	     if (copy_from_user(&effective, &data->effective, sizeof(effective)) ||
> +		 copy_from_user(&inheritable, &data->inheritable, sizeof(inheritable)) ||
> +		 copy_from_user(&permitted, &data->permitted, sizeof(permitted)))
> +		     return -EFAULT; 
>  
>       spin_lock(&task_capability_lock);
>       read_lock(&tasklist_lock);
> diff --git a/kernel/fork.c b/kernel/fork.c
> index f9b014e..20f559f 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1347,6 +1347,8 @@ long do_fork(unsigned long clone_flags,
>  	struct pid *pid = alloc_pid();
>  	long nr;
>  
> +	if (!capable(CAP_REG_FORK))
> +		return -EPERM;
>  	if (!pid)
>  		return -EAGAIN;
>  	nr = pid->nr;
> diff --git a/security/commoncap.c b/security/commoncap.c
> index f50fc29..39596b4 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -97,6 +97,8 @@ int cap_capset_check (struct task_struct
>  	if (!cap_issubset (*effective, *permitted)) {
>  		return -EPERM;
>  	}
> +	/* we allow Inheritable not to be a subset of Permitted:
> +	 * cap_capset_set will intersect them anyway */
>  
>  	return 0;
>  }
> @@ -105,7 +107,7 @@ void cap_capset_set (struct task_struct 
>  		     kernel_cap_t *inheritable, kernel_cap_t *permitted)
>  {
>  	target->cap_effective = *effective;
> -	target->cap_inheritable = *inheritable;
> +	target->cap_inheritable = cap_intersect (*effective, *inheritable);
>  	target->cap_permitted = *permitted;
>  }
>  
> @@ -114,25 +116,20 @@ int cap_bprm_set_security (struct linux_
>  	/* Copied from fs/exec.c:prepare_binprm. */
>  
>  	/* We don't have VFS support for capabilities yet */
> -	cap_clear (bprm->cap_inheritable);
> +	cap_set_full (bprm->cap_inheritable);
>  	cap_clear (bprm->cap_permitted);
> -	cap_clear (bprm->cap_effective);
> +	cap_set_full (bprm->cap_effective);
>  
>  	/*  To support inheritance of root-permissions and suid-root
>  	 *  executables under compatibility mode, we raise all three
>  	 *  capability sets for the file.
> -	 *
> -	 *  If only the real uid is 0, we only raise the inheritable
> -	 *  and permitted sets of the executable file.
>  	 */
> -
>  	if (!issecure (SECURE_NOROOT)) {
> -		if (bprm->e_uid == 0 || current->uid == 0) {
> +		if (bprm->is_suid && bprm->e_uid == 0) {
>  			cap_set_full (bprm->cap_inheritable);
>  			cap_set_full (bprm->cap_permitted);
> -		}
> -		if (bprm->e_uid == 0)
>  			cap_set_full (bprm->cap_effective);
> +		}
>  	}
>  	return 0;
>  }
> @@ -140,13 +137,25 @@ int cap_bprm_set_security (struct linux_
>  void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
>  {
>  	/* Derived from fs/exec.c:compute_creds. */
> -	kernel_cap_t new_permitted, working;
> +	kernel_cap_t new_permitted, new_effective, working;
> +	uid_t old_ruid, old_euid, old_suid;
>  
> +	/* P'(per) = (P(inh) & F(inh)) | (F(per) & bset) */
>  	new_permitted = cap_intersect (bprm->cap_permitted, cap_bset);
>  	working = cap_intersect (bprm->cap_inheritable,
>  				 current->cap_inheritable);
>  	new_permitted = cap_combine (new_permitted, working);
>  
> +	/* P'(eff) = (P(inh) & P(eff) & F(inh)) | (F(per) & F(eff) & bset) */
> +	new_effective = cap_intersect (bprm->cap_permitted, bprm->cap_effective);
> +	new_effective = cap_intersect (new_effective, cap_bset);
> +	working = cap_intersect (bprm->cap_inheritable,
> +				 current->cap_effective);
> +	working = cap_intersect (working, current->cap_inheritable);
> +	new_effective = cap_combine (new_effective, working);
> +
> +	/* P'(inh) = P'(per) */
> +
>  	if (bprm->e_uid != current->uid || bprm->e_gid != current->gid ||
>  	    !cap_issubset (new_permitted, current->cap_permitted)) {
>  		current->mm->dumpable = suid_dumpable;
> @@ -159,25 +168,27 @@ void cap_bprm_apply_creds (struct linux_
>  			if (!capable (CAP_SETPCAP)) {
>  				new_permitted = cap_intersect (new_permitted,
>  							current->cap_permitted);
> +				new_effective = cap_intersect (new_permitted,
> +							       new_effective);
>  			}
>  		}
>  	}
>  
> +	old_ruid = current->uid;
> +	old_euid = current->euid;
> +	old_suid = current->suid;
>  	current->suid = current->euid = current->fsuid = bprm->e_uid;
>  	current->sgid = current->egid = current->fsgid = bprm->e_gid;
>  
> -	/* For init, we want to retain the capabilities set
> -	 * in the init_task struct. Thus we skip the usual
> -	 * capability rules */
> -	if (current->pid != 1) {
> -		current->cap_permitted = new_permitted;
> -		current->cap_effective =
> -		    cap_intersect (new_permitted, bprm->cap_effective);
> -	}
> -
> -	/* AUD: Audit candidate if current->cap_effective is set */
> +	current->cap_permitted = new_permitted;
> +	current->cap_effective = new_effective;
> +	current->cap_inheritable = new_permitted;
>  
>  	current->keep_capabilities = 0;
> +	/* Make sure we drop capabilities if required by suid. */
> +	cap_task_post_setuid (old_ruid, old_euid, old_suid, LSM_SETID_RES);
> +
> +	/* AUD: Audit candidate if current->cap_effective is set */
>  }
>  
>  int cap_bprm_secureexec (struct linux_binprm *bprm)
> @@ -187,8 +198,8 @@ int cap_bprm_secureexec (struct linux_bi
>  	   test between the old and new capability sets.  For now,
>  	   it simply preserves the legacy decision algorithm used by
>  	   the old userland. */
> -	return (current->euid != current->uid ||
> -		current->egid != current->gid);
> +	return ((bprm->is_suid || bprm->is_sgid)
> +		&& !cap_issubset (cap_bset, current->cap_permitted));
>  }
>  
>  int cap_inode_setxattr(struct dentry *dentry, char *name, void *value,
> @@ -244,15 +255,24 @@ static inline void cap_emulate_setxuid (
>  					int old_suid)
>  {
>  	if ((old_ruid == 0 || old_euid == 0 || old_suid == 0) &&
> -	    (current->uid != 0 && current->euid != 0 && current->suid != 0) &&
> -	    !current->keep_capabilities) {
> -		cap_clear (current->cap_permitted);
> -		cap_clear (current->cap_effective);
> +	    (current->uid != 0 && current->euid != 0 && current->suid != 0)) {
> +		if (!current->keep_capabilities) {
> +			current->cap_permitted
> +				= cap_intersect (current->cap_permitted,
> +						 CAP_REGULAR_SET);
> +			current->cap_effective
> +				= cap_intersect (current->cap_effective,
> +						 CAP_REGULAR_SET);
> +		}
> +		current->cap_inheritable
> +			= cap_intersect (current->cap_inheritable,
> +					 CAP_REGULAR_SET);
>  	}
> -	if (old_euid == 0 && current->euid != 0) {
> -		cap_clear (current->cap_effective);
> +	if (old_euid == 0 && current->euid != 0 && !current->keep_capabilities) {
> +		current->cap_effective = cap_intersect (current->cap_effective,
> +							CAP_REGULAR_SET);
>  	}
> -	if (old_euid != 0 && current->euid == 0) {
> +	if (old_euid != 0 && current->euid == 0 && !current->keep_capabilities) {
>  		current->cap_effective = current->cap_permitted;
>  	}
>  }
> diff --git a/security/dummy.c b/security/dummy.c
> index 58c6d39..572a15b 100644
> --- a/security/dummy.c
> +++ b/security/dummy.c
> @@ -37,11 +37,11 @@ static int dummy_ptrace (struct task_str
>  static int dummy_capget (struct task_struct *target, kernel_cap_t * effective,
>  			 kernel_cap_t * inheritable, kernel_cap_t * permitted)
>  {
> -	*effective = *inheritable = *permitted = 0;
> +	*effective = *inheritable = *permitted = CAP_REGULAR_SET;
>  	if (!issecure(SECURE_NOROOT)) {
>  		if (target->euid == 0) {
> -			*permitted |= (~0 & ~CAP_FS_MASK);
> -			*effective |= (~0 & ~CAP_TO_MASK(CAP_SETPCAP) & ~CAP_FS_MASK);
> +			*permitted |= (CAP_FULL_SET & ~CAP_FS_MASK);
> +			*effective |= (CAP_FULL_SET & ~CAP_TO_MASK(CAP_SETPCAP) & ~CAP_FS_MASK);
>  		}
>  		if (target->fsuid == 0) {
>  			*permitted |= CAP_FS_MASK;


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-06 18:25 ` Serge E. Hallyn
@ 2006-09-06 22:27   ` David Madore
  2006-09-07  0:04     ` David Madore
                       ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: David Madore @ 2006-09-06 22:27 UTC (permalink / raw)
  To: Linux Kernel mailing-list; +Cc: Serge E. Hallyn

On Wed, Sep 06, 2006 at 01:25:31PM -0500, Serge E. Hallyn wrote:
> The fact that you're changing the inheritance rules is a bit scary, so
> I'm going to (and I hope others will) take some time to look it over.

Thanks!  I'd appreciate it.  Don't hesitate to ask me if some
decisions I made are unclear.

I was about to write to you, in fact, since I wrote a version of my
patch which merges with the one you made (an old version, though, I
suppose: I took it from <URL: http://lkml.org/lkml/2006/8/15/294 >,
but I can try merging with more recent versions).  The point being to
show that my patch is not incompatible with yours: they are quite
complementary.  (The merged patch can be found in <URL:
ftp://quatramaran.ens.fr/pub/madore/newcaps/pre-alpha/ >.)

> In the meantime, so long as you're adding some new capabilities, how
> about also splitting up a few like CAP_SYS_ADMIN?  Have you looked into
> it and decided none are really separable, i.e. any subset leads to the
> ability to get any other subset?

I agree that splitting CAP_SYS_ADMIN might be worth while, but it
really looks like opening a worm can, so I didn't feel up to the
challenge there.  It might be a good idea to reserve some bits for
that possibility, however - I'm not sure how best to proceed.

> I'd recommend you split this patch into at least 3:
> 	1. move to 64-bit caps
> 	2. introduce your new caps
> 		(perhaps even one new cap per patch)
> 	3. introduce the new inheritance rules

Yes, that sounds like a good idea.  I'll do that.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-06 22:27   ` David Madore
@ 2006-09-07  0:04     ` David Madore
  2006-09-07 23:06       ` Serge E. Hallyn
  2006-09-07  6:43     ` Jan Engelhardt
  2006-09-07 23:02     ` Serge E. Hallyn
  2 siblings, 1 reply; 52+ messages in thread
From: David Madore @ 2006-09-07  0:04 UTC (permalink / raw)
  To: Linux Kernel mailing-list; +Cc: Serge E. Hallyn

[-- Attachment #1: Type: text/plain, Size: 755 bytes --]

On Thu, Sep 07, 2006 at 12:27:31AM +0200, David Madore wrote:
> On Wed, Sep 06, 2006 at 01:25:31PM -0500, Serge E. Hallyn wrote:
> > I'd recommend you split this patch into at least 3:
> > 	1. move to 64-bit caps
> > 	2. introduce your new caps
> > 		(perhaps even one new cap per patch)
> > 	3. introduce the new inheritance rules
> 
> Yes, that sounds like a good idea.  I'll do that.

Done.  Attached.  Except that the order is

part1: move to 64-bit caps (and also re-enable CAP_SETPCAP),
       where upper 32-bits are "regular" capabilities (but none defined)

part2: introduce the new inheritance rules

part3: introduce new ("regular") capabilities

Cheers,

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

[-- Attachment #2: caps-0.3.1-split-part1-setsize-linux-2.6.18-rc6.patch --]
[-- Type: text/x-patch, Size: 8227 bytes --]

 fs/proc/array.c            |    6 +++---
 include/linux/capability.h |   30 +++++++++++++++++++-----------
 kernel/capability.c        |   41 +++++++++++++++++++++++++++++++++--------
 security/commoncap.c       |   16 +++++++++++-----
 security/dummy.c           |    6 +++---
 5 files changed, 69 insertions(+), 30 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 0b615d6..6724fc2 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -285,9 +285,9 @@ static inline char * task_sig(struct tas
 
 static inline char *task_cap(struct task_struct *p, char *buffer)
 {
-    return buffer + sprintf(buffer, "CapInh:\t%016x\n"
-			    "CapPrm:\t%016x\n"
-			    "CapEff:\t%016x\n",
+    return buffer + sprintf(buffer, "CapInh:\t%016llx\n"
+			    "CapPrm:\t%016llx\n"
+			    "CapEff:\t%016llx\n",
 			    cap_t(p->cap_inheritable),
 			    cap_t(p->cap_permitted),
 			    cap_t(p->cap_effective));
diff --git a/include/linux/capability.h b/include/linux/capability.h
index 6548b35..e4f6065 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -27,7 +27,8 @@ #include <linux/compiler.h>
    library since the draft standard requires the use of malloc/free
    etc.. */
  
-#define _LINUX_CAPABILITY_VERSION  0x19980330
+#define _LINUX_CAPABILITY_VERSION  0x20060903
+#define _LINUX_CAPABILITY_OLD_VERSION  0x19980330
 
 typedef struct __user_cap_header_struct {
 	__u32 version;
@@ -35,10 +36,16 @@ typedef struct __user_cap_header_struct 
 } __user *cap_user_header_t;
  
 typedef struct __user_cap_data_struct {
+        __u64 effective;
+        __u64 permitted;
+        __u64 inheritable;
+} __user *cap_user_data_t;
+ 
+typedef struct __user_cap_data_old_struct {
         __u32 effective;
         __u32 permitted;
         __u32 inheritable;
-} __user *cap_user_data_t;
+} __user *cap_user_data_old_t;
   
 #ifdef __KERNEL__
 
@@ -50,12 +57,12 @@ #include <asm/current.h>
 #ifdef STRICT_CAP_T_TYPECHECKS
 
 typedef struct kernel_cap_struct {
-	__u32 cap;
+	__u64 cap;
 } kernel_cap_t;
 
 #else
 
-typedef __u32 kernel_cap_t;
+typedef __u64 kernel_cap_t;
 
 #endif
   
@@ -310,12 +317,13 @@ #define cap_t(x) (x)
 
 #endif
 
-#define CAP_EMPTY_SET       to_cap_t(0)
-#define CAP_FULL_SET        to_cap_t(~0)
-#define CAP_INIT_EFF_SET    to_cap_t(~0 & ~CAP_TO_MASK(CAP_SETPCAP))
-#define CAP_INIT_INH_SET    to_cap_t(0)
+#define CAP_EMPTY_SET       to_cap_t(0ULL)
+#define CAP_FULL_SET        to_cap_t(~0ULL)
+#define CAP_REGULAR_SET     to_cap_t(0xffffffff00000000ULL)
+#define CAP_INIT_EFF_SET    to_cap_t(~0ULL)
+#define CAP_INIT_INH_SET    to_cap_t(~0ULL)
 
-#define CAP_TO_MASK(x) (1 << (x))
+#define CAP_TO_MASK(x) (1ULL << (x))
 #define cap_raise(c, flag)   (cap_t(c) |=  CAP_TO_MASK(flag))
 #define cap_lower(c, flag)   (cap_t(c) &= ~CAP_TO_MASK(flag))
 #define cap_raised(c, flag)  (cap_t(c) & CAP_TO_MASK(flag))
@@ -351,8 +359,8 @@ static inline kernel_cap_t cap_invert(ke
 #define cap_isclear(c)       (!cap_t(c))
 #define cap_issubset(a,set)  (!(cap_t(a) & ~cap_t(set)))
 
-#define cap_clear(c)         do { cap_t(c) =  0; } while(0)
-#define cap_set_full(c)      do { cap_t(c) = ~0; } while(0)
+#define cap_clear(c)         do { cap_t(c) =  0ULL; } while(0)
+#define cap_set_full(c)      do { cap_t(c) = ~0ULL; } while(0)
 #define cap_mask(c,mask)     do { cap_t(c) &= cap_t(mask); } while(0)
 
 #define cap_is_fs_cap(c)     (CAP_TO_MASK(c) & CAP_FS_MASK)
diff --git a/kernel/capability.c b/kernel/capability.c
index c7685ad..bd003f9 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -52,7 +52,8 @@ asmlinkage long sys_capget(cap_user_head
      if (get_user(version, &header->version))
 	     return -EFAULT;
 
-     if (version != _LINUX_CAPABILITY_VERSION) {
+     if (version != _LINUX_CAPABILITY_VERSION
+	 && version != _LINUX_CAPABILITY_OLD_VERSION) {
 	     if (put_user(_LINUX_CAPABILITY_VERSION, &header->version))
 		     return -EFAULT; 
              return -EINVAL;
@@ -82,8 +83,18 @@ out:
      read_unlock(&tasklist_lock); 
      spin_unlock(&task_capability_lock);
 
-     if (!ret && copy_to_user(dataptr, &data, sizeof data))
-          return -EFAULT; 
+     if (!ret) {
+	     if (version == _LINUX_CAPABILITY_OLD_VERSION) {
+		     struct __user_cap_data_old_struct data_old;
+		     data_old.effective = data_old.effective & 0xffffffffULL;
+		     data_old.permitted = data_old.permitted & 0xffffffffULL;
+		     data_old.inheritable = data_old.inheritable & 0xffffffffULL;
+		     if (copy_to_user(dataptr, &data_old, sizeof data_old))
+			     return -EFAULT;
+	     } else
+		     if (copy_to_user(dataptr, &data, sizeof data))
+			     return -EFAULT;
+     }
 
      return ret;
 }
@@ -179,7 +190,8 @@ asmlinkage long sys_capset(cap_user_head
      if (get_user(version, &header->version))
 	     return -EFAULT; 
 
-     if (version != _LINUX_CAPABILITY_VERSION) {
+     if (version != _LINUX_CAPABILITY_VERSION
+	 && version != _LINUX_CAPABILITY_OLD_VERSION) {
 	     if (put_user(_LINUX_CAPABILITY_VERSION, &header->version))
 		     return -EFAULT; 
              return -EINVAL;
@@ -191,10 +203,23 @@ asmlinkage long sys_capset(cap_user_head
      if (pid && pid != current->pid && !capable(CAP_SETPCAP))
              return -EPERM;
 
-     if (copy_from_user(&effective, &data->effective, sizeof(effective)) ||
-	 copy_from_user(&inheritable, &data->inheritable, sizeof(inheritable)) ||
-	 copy_from_user(&permitted, &data->permitted, sizeof(permitted)))
-	     return -EFAULT; 
+     if (version == _LINUX_CAPABILITY_OLD_VERSION) {
+	     const cap_user_data_old_t data2 = (void *)data;
+	     __u32 w;
+	     if (copy_from_user(&w, &data2->effective, sizeof(w)))
+		     return -EFAULT;
+	     effective = (__u64)w | 0xffffffff00000000ULL;
+	     if (copy_from_user(&w, &data2->inheritable, sizeof(w)))
+		     return -EFAULT;
+	     inheritable = (__u64)w | 0xffffffff00000000ULL;
+	     if (copy_from_user(&w, &data2->permitted, sizeof(w)))
+		     return -EFAULT;
+	     permitted = (__u64)w | 0xffffffff00000000ULL;
+     } else
+	     if (copy_from_user(&effective, &data->effective, sizeof(effective)) ||
+		 copy_from_user(&inheritable, &data->inheritable, sizeof(inheritable)) ||
+		 copy_from_user(&permitted, &data->permitted, sizeof(permitted)))
+		     return -EFAULT; 
 
      spin_lock(&task_capability_lock);
      read_lock(&tasklist_lock);
diff --git a/security/commoncap.c b/security/commoncap.c
index f50fc29..91dc53d 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -244,13 +244,19 @@ static inline void cap_emulate_setxuid (
 					int old_suid)
 {
 	if ((old_ruid == 0 || old_euid == 0 || old_suid == 0) &&
-	    (current->uid != 0 && current->euid != 0 && current->suid != 0) &&
-	    !current->keep_capabilities) {
-		cap_clear (current->cap_permitted);
-		cap_clear (current->cap_effective);
+	    (current->uid != 0 && current->euid != 0 && current->suid != 0)) {
+		if (!current->keep_capabilities) {
+			current->cap_permitted
+				= cap_intersect (current->cap_permitted,
+						 CAP_REGULAR_SET);
+			current->cap_effective
+				= cap_intersect (current->cap_effective,
+						 CAP_REGULAR_SET);
+		}
 	}
 	if (old_euid == 0 && current->euid != 0) {
-		cap_clear (current->cap_effective);
+		current->cap_effective = cap_intersect (current->cap_effective,
+							CAP_REGULAR_SET);
 	}
 	if (old_euid != 0 && current->euid == 0) {
 		current->cap_effective = current->cap_permitted;
diff --git a/security/dummy.c b/security/dummy.c
index 58c6d39..572a15b 100644
--- a/security/dummy.c
+++ b/security/dummy.c
@@ -37,11 +37,11 @@ static int dummy_ptrace (struct task_str
 static int dummy_capget (struct task_struct *target, kernel_cap_t * effective,
 			 kernel_cap_t * inheritable, kernel_cap_t * permitted)
 {
-	*effective = *inheritable = *permitted = 0;
+	*effective = *inheritable = *permitted = CAP_REGULAR_SET;
 	if (!issecure(SECURE_NOROOT)) {
 		if (target->euid == 0) {
-			*permitted |= (~0 & ~CAP_FS_MASK);
-			*effective |= (~0 & ~CAP_TO_MASK(CAP_SETPCAP) & ~CAP_FS_MASK);
+			*permitted |= (CAP_FULL_SET & ~CAP_FS_MASK);
+			*effective |= (CAP_FULL_SET & ~CAP_TO_MASK(CAP_SETPCAP) & ~CAP_FS_MASK);
 		}
 		if (target->fsuid == 0) {
 			*permitted |= CAP_FS_MASK;

[-- Attachment #3: caps-0.3.1-split-part2-semantics-linux-2.6.18-rc6.patch --]
[-- Type: text/x-patch, Size: 7028 bytes --]

 fs/exec.c               |    4 +++
 include/linux/binfmts.h |    1 +
 kernel/capability.c     |    2 +
 security/commoncap.c    |   64 +++++++++++++++++++++++++++++------------------
 4 files changed, 45 insertions(+), 26 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 54135df..e4d0a2c 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -925,10 +925,13 @@ int prepare_binprm(struct linux_binprm *
 
 	bprm->e_uid = current->euid;
 	bprm->e_gid = current->egid;
+	bprm->is_suid = 0;
+	bprm->is_sgid = 0;
 
 	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
 		/* Set-uid? */
 		if (mode & S_ISUID) {
+			bprm->is_suid = 1;
 			current->personality &= ~PER_CLEAR_ON_SETID;
 			bprm->e_uid = inode->i_uid;
 		}
@@ -940,6 +943,7 @@ int prepare_binprm(struct linux_binprm *
 		 * executable.
 		 */
 		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
+			bprm->is_sgid = 1;
 			current->personality &= ~PER_CLEAR_ON_SETID;
 			bprm->e_gid = inode->i_gid;
 		}
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index c1e82c5..c7fb183 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -29,6 +29,7 @@ struct linux_binprm{
 	struct file * file;
 	int e_uid, e_gid;
 	kernel_cap_t cap_inheritable, cap_permitted, cap_effective;
+	char is_suid, is_sgid;
 	void *security;
 	int argc, envc;
 	char * filename;	/* Name of binary as seen by procps */
diff --git a/kernel/capability.c b/kernel/capability.c
index bd003f9..c090570 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -15,7 +15,7 @@ #include <linux/syscalls.h>
 #include <asm/uaccess.h>
 
 unsigned securebits = SECUREBITS_DEFAULT; /* systemwide security settings */
-kernel_cap_t cap_bset = CAP_INIT_EFF_SET;
+kernel_cap_t cap_bset = CAP_INIT_INH_SET;
 
 EXPORT_SYMBOL(securebits);
 EXPORT_SYMBOL(cap_bset);
diff --git a/security/commoncap.c b/security/commoncap.c
index 91dc53d..39596b4 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -97,6 +97,8 @@ int cap_capset_check (struct task_struct
 	if (!cap_issubset (*effective, *permitted)) {
 		return -EPERM;
 	}
+	/* we allow Inheritable not to be a subset of Permitted:
+	 * cap_capset_set will intersect them anyway */
 
 	return 0;
 }
@@ -105,7 +107,7 @@ void cap_capset_set (struct task_struct 
 		     kernel_cap_t *inheritable, kernel_cap_t *permitted)
 {
 	target->cap_effective = *effective;
-	target->cap_inheritable = *inheritable;
+	target->cap_inheritable = cap_intersect (*effective, *inheritable);
 	target->cap_permitted = *permitted;
 }
 
@@ -114,25 +116,20 @@ int cap_bprm_set_security (struct linux_
 	/* Copied from fs/exec.c:prepare_binprm. */
 
 	/* We don't have VFS support for capabilities yet */
-	cap_clear (bprm->cap_inheritable);
+	cap_set_full (bprm->cap_inheritable);
 	cap_clear (bprm->cap_permitted);
-	cap_clear (bprm->cap_effective);
+	cap_set_full (bprm->cap_effective);
 
 	/*  To support inheritance of root-permissions and suid-root
 	 *  executables under compatibility mode, we raise all three
 	 *  capability sets for the file.
-	 *
-	 *  If only the real uid is 0, we only raise the inheritable
-	 *  and permitted sets of the executable file.
 	 */
-
 	if (!issecure (SECURE_NOROOT)) {
-		if (bprm->e_uid == 0 || current->uid == 0) {
+		if (bprm->is_suid && bprm->e_uid == 0) {
 			cap_set_full (bprm->cap_inheritable);
 			cap_set_full (bprm->cap_permitted);
-		}
-		if (bprm->e_uid == 0)
 			cap_set_full (bprm->cap_effective);
+		}
 	}
 	return 0;
 }
@@ -140,13 +137,25 @@ int cap_bprm_set_security (struct linux_
 void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
 {
 	/* Derived from fs/exec.c:compute_creds. */
-	kernel_cap_t new_permitted, working;
+	kernel_cap_t new_permitted, new_effective, working;
+	uid_t old_ruid, old_euid, old_suid;
 
+	/* P'(per) = (P(inh) & F(inh)) | (F(per) & bset) */
 	new_permitted = cap_intersect (bprm->cap_permitted, cap_bset);
 	working = cap_intersect (bprm->cap_inheritable,
 				 current->cap_inheritable);
 	new_permitted = cap_combine (new_permitted, working);
 
+	/* P'(eff) = (P(inh) & P(eff) & F(inh)) | (F(per) & F(eff) & bset) */
+	new_effective = cap_intersect (bprm->cap_permitted, bprm->cap_effective);
+	new_effective = cap_intersect (new_effective, cap_bset);
+	working = cap_intersect (bprm->cap_inheritable,
+				 current->cap_effective);
+	working = cap_intersect (working, current->cap_inheritable);
+	new_effective = cap_combine (new_effective, working);
+
+	/* P'(inh) = P'(per) */
+
 	if (bprm->e_uid != current->uid || bprm->e_gid != current->gid ||
 	    !cap_issubset (new_permitted, current->cap_permitted)) {
 		current->mm->dumpable = suid_dumpable;
@@ -159,25 +168,27 @@ void cap_bprm_apply_creds (struct linux_
 			if (!capable (CAP_SETPCAP)) {
 				new_permitted = cap_intersect (new_permitted,
 							current->cap_permitted);
+				new_effective = cap_intersect (new_permitted,
+							       new_effective);
 			}
 		}
 	}
 
+	old_ruid = current->uid;
+	old_euid = current->euid;
+	old_suid = current->suid;
 	current->suid = current->euid = current->fsuid = bprm->e_uid;
 	current->sgid = current->egid = current->fsgid = bprm->e_gid;
 
-	/* For init, we want to retain the capabilities set
-	 * in the init_task struct. Thus we skip the usual
-	 * capability rules */
-	if (current->pid != 1) {
-		current->cap_permitted = new_permitted;
-		current->cap_effective =
-		    cap_intersect (new_permitted, bprm->cap_effective);
-	}
-
-	/* AUD: Audit candidate if current->cap_effective is set */
+	current->cap_permitted = new_permitted;
+	current->cap_effective = new_effective;
+	current->cap_inheritable = new_permitted;
 
 	current->keep_capabilities = 0;
+	/* Make sure we drop capabilities if required by suid. */
+	cap_task_post_setuid (old_ruid, old_euid, old_suid, LSM_SETID_RES);
+
+	/* AUD: Audit candidate if current->cap_effective is set */
 }
 
 int cap_bprm_secureexec (struct linux_binprm *bprm)
@@ -187,8 +198,8 @@ int cap_bprm_secureexec (struct linux_bi
 	   test between the old and new capability sets.  For now,
 	   it simply preserves the legacy decision algorithm used by
 	   the old userland. */
-	return (current->euid != current->uid ||
-		current->egid != current->gid);
+	return ((bprm->is_suid || bprm->is_sgid)
+		&& !cap_issubset (cap_bset, current->cap_permitted));
 }
 
 int cap_inode_setxattr(struct dentry *dentry, char *name, void *value,
@@ -253,12 +264,15 @@ static inline void cap_emulate_setxuid (
 				= cap_intersect (current->cap_effective,
 						 CAP_REGULAR_SET);
 		}
+		current->cap_inheritable
+			= cap_intersect (current->cap_inheritable,
+					 CAP_REGULAR_SET);
 	}
-	if (old_euid == 0 && current->euid != 0) {
+	if (old_euid == 0 && current->euid != 0 && !current->keep_capabilities) {
 		current->cap_effective = cap_intersect (current->cap_effective,
 							CAP_REGULAR_SET);
 	}
-	if (old_euid != 0 && current->euid == 0) {
+	if (old_euid != 0 && current->euid == 0 && !current->keep_capabilities) {
 		current->cap_effective = current->cap_permitted;
 	}
 }

[-- Attachment #4: caps-0.3.1-split-part3-newcaps-linux-2.6.18-rc6.patch --]
[-- Type: text/x-patch, Size: 2983 bytes --]

 fs/exec.c                  |    7 +++++--
 fs/open.c                  |   10 ++++++++--
 include/linux/capability.h |   17 +++++++++++++++++
 kernel/fork.c              |    2 ++
 4 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index e4d0a2c..1cb5e34 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -930,7 +930,7 @@ int prepare_binprm(struct linux_binprm *
 
 	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
 		/* Set-uid? */
-		if (mode & S_ISUID) {
+		if (mode & S_ISUID && capable(CAP_REG_SXID)) {
 			bprm->is_suid = 1;
 			current->personality &= ~PER_CLEAR_ON_SETID;
 			bprm->e_uid = inode->i_uid;
@@ -942,7 +942,8 @@ int prepare_binprm(struct linux_binprm *
 		 * is a candidate for mandatory locking, not a setgid
 		 * executable.
 		 */
-		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
+		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)
+		    && capable(CAP_REG_SXID)) {
 			bprm->is_sgid = 1;
 			current->personality &= ~PER_CLEAR_ON_SETID;
 			bprm->e_gid = inode->i_gid;
@@ -1137,6 +1138,8 @@ int do_execve(char * filename,
 	int retval;
 	int i;
 
+	if (!capable(CAP_REG_EXEC))
+		return -EPERM;
 	retval = -ENOMEM;
 	bprm = kzalloc(sizeof(*bprm), GFP_KERNEL);
 	if (!bprm)
diff --git a/fs/open.c b/fs/open.c
index 303f06d..3d1fc1c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1104,7 +1104,10 @@ asmlinkage long sys_open(const char __us
 	if (force_o_largefile())
 		flags |= O_LARGEFILE;
 
-	ret = do_sys_open(AT_FDCWD, filename, flags, mode);
+	if (capable(CAP_REG_OPEN))
+		ret = do_sys_open(AT_FDCWD, filename, flags, mode);
+	else
+		ret = -EPERM;
 	/* avoid REGPARM breakage on x86: */
 	prevent_tail_call(ret);
 	return ret;
@@ -1119,7 +1122,10 @@ asmlinkage long sys_openat(int dfd, cons
 	if (force_o_largefile())
 		flags |= O_LARGEFILE;
 
-	ret = do_sys_open(dfd, filename, flags, mode);
+	if (capable(CAP_REG_OPEN))
+		ret = do_sys_open(dfd, filename, flags, mode);
+	else
+		ret = -EPERM;
 	/* avoid REGPARM breakage on x86: */
 	prevent_tail_call(ret);
 	return ret;
diff --git a/include/linux/capability.h b/include/linux/capability.h
index e4f6065..b3a3b27 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -295,6 +295,23 @@ #define CAP_AUDIT_WRITE      29
 
 #define CAP_AUDIT_CONTROL    30
 
+
+/**
+ ** Regular capabilities (normally possessed by all processes).
+ **/
+
+/* Can fork() */
+#define CAP_REG_FORK         32
+
+/* Can open() */
+#define CAP_REG_OPEN         33
+
+/* Can exec() */
+#define CAP_REG_EXEC         34
+
+/* Might gain permissions on exec() */
+#define CAP_REG_SXID         35
+
 #ifdef __KERNEL__
 /* 
  * Bounding set
diff --git a/kernel/fork.c b/kernel/fork.c
index f9b014e..20f559f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1347,6 +1347,8 @@ long do_fork(unsigned long clone_flags,
 	struct pid *pid = alloc_pid();
 	long nr;
 
+	if (!capable(CAP_REG_FORK))
+		return -EPERM;
 	if (!pid)
 		return -EAGAIN;
 	nr = pid->nr;

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-06 13:26     ` David Madore
@ 2006-09-07  0:11       ` Casey Schaufler
  2006-09-07  0:32         ` David Madore
  2006-09-09 23:18       ` Theodore Tso
  1 sibling, 1 reply; 52+ messages in thread
From: Casey Schaufler @ 2006-09-07  0:11 UTC (permalink / raw)
  To: Linux Kernel mailing-list


--- David Madore <david.madore@ens.fr> wrote:

> ... (one merely needs to do
> something about
> the new bunch of capabilities I've introduced, but
> it should be easy
> to hack something which makes sure no programs are
> surprised or
> broken).

You have not introduced new capabilities
so much as you've introduced a new layer of
policy, that being things that unprivileged
processes can do but that "underprivileged"
processes cannot. I personally think that
this would make a spiffy LSM, but I don't
buy it as an extension of the POSIX (draft)
capability mechanism. Why? Because the
capability mechanism deals with providing
controls over the abilty to violate the
traditional Unix security policy, as
implemented in Linux. Adding "negative"
privilege might not be a bad idea, but
it is outside the scope of capabilities
AND there is a mechanism (LSM) explicity
in place for adding such restrictions.



Casey Schaufler
casey@schaufler-ca.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07  0:11       ` Casey Schaufler
@ 2006-09-07  0:32         ` David Madore
  2006-09-07  1:01           ` Casey Schaufler
  0 siblings, 1 reply; 52+ messages in thread
From: David Madore @ 2006-09-07  0:32 UTC (permalink / raw)
  To: Linux Kernel mailing-list; +Cc: Casey Schaufler

On Thu, Sep 07, 2006 at 12:12:15AM +0000, Casey Schaufler wrote:
> You have not introduced new capabilities
> so much as you've introduced a new layer of
> policy, that being things that unprivileged
> processes can do but that "underprivileged"
> processes cannot. I personally think that
> this would make a spiffy LSM, but I don't
> buy it as an extension of the POSIX (draft)
> capability mechanism. Why? Because the
> capability mechanism deals with providing
> controls over the abilty to violate the
> traditional Unix security policy, as
> implemented in Linux. Adding "negative"
> privilege might not be a bad idea, but
> it is outside the scope of capabilities
> AND there is a mechanism (LSM) explicity
> in place for adding such restrictions.

I understand your point.  But if we want these under-privileges to
follow the same inheritance rules as the over-privileges provided by
capabilities (were it only to make things simpler to comprehend),
doesn't it make sense to implement them in the same framework?  Rather
than trying to reproduce the same rules in a different part of the
kernel, causing code reduplication which would eventually, inevitably,
fall out of sync...  I think it's easier for everyone if under- and
over-privileges are treated in a uniform fashion.  Perhaps that's not
what POSIX intended, but I don't think "not what was intended" is a
sufficient reason for backing away from something that might be
useful.  Do you have a specific problem in mind?

However, the suggestion makes sense: if I can't convince the Powers
That Be that implementing under-privileges with capabilities is a Good
Thing (and I can see that it will be a serious problem), I'll try the
LSM approach.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07  0:32         ` David Madore
@ 2006-09-07  1:01           ` Casey Schaufler
  2006-09-07  1:29             ` David Wagner
  2006-09-07 17:34             ` David Madore
  0 siblings, 2 replies; 52+ messages in thread
From: Casey Schaufler @ 2006-09-07  1:01 UTC (permalink / raw)
  To: Linux Kernel mailing-list; +Cc: Casey Schaufler



--- David Madore <david.madore@ens.fr> wrote:


> I understand your point.  But if we want these
> under-privileges to
> follow the same inheritance rules as the
> over-privileges provided by
> capabilities (were it only to make things simpler to
> comprehend),
> doesn't it make sense to implement them in the same
> framework?

I'm certainly not convinced that you'd
want that. Think of all the programs that
would have to be marked with CAP_FORK.

> Rather
> than trying to reproduce the same rules in a
> different part of the
> kernel, causing code reduplication which would
> eventually, inevitably,
> fall out of sync...  I think it's easier for
> everyone if under- and
> over-privileges are treated in a uniform fashion.

This again assumes that you want to require
that in general processes run with some
capabilities.

> Perhaps that's not
> what POSIX intended, but I don't think "not what was
> intended" is a
> sufficient reason for backing away from something
> that might be
> useful.  Do you have a specific problem in mind?

You betcha. (That's Minnisotan for "Yes indeed")

In our TCSEC B1 (Old person speak for LSPP)
evaluation we had to put way too much effort
into explaining why certain operations that
had nothing to do with the system security
policy (e.g. compute resource limitations)
required privilege. These operations had
no security implications at all, but since
they required privilege they were assumed to
have dire consequences should they be abused.

If you introduce an "underprivileged" process,
you immediately relegate what are currently
unprivileged processes to the realm of
privileged processes. All of a sudden any
process that does fork() requires additional
scrutiny because it uses privilege. You
won't convince any evaluator that there is a
difference between "having a capability" and
"not having lack of capability".

On the other hand if you have "additional
restrictions" available (as in an LSM) that
are not part of the policy enforcement
mechanism there should be no problems.

> However, the suggestion makes sense: if I can't
> convince the Powers
> That Be that implementing under-privileges with
> capabilities is a Good
> Thing (and I can see that it will be a serious
> problem), I'll try the
> LSM approach.

Further, you can assign any semantic that
makes sense to you. Heaven knows, the POSIX
calculation (from any of the DRAFTS) has
proven to be contentious. Especially in the
inheritence rules on exec.


Casey Schaufler
casey@schaufler-ca.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07  1:01           ` Casey Schaufler
@ 2006-09-07  1:29             ` David Wagner
  2006-09-07 16:00               ` Casey Schaufler
  2006-09-07 17:34             ` David Madore
  1 sibling, 1 reply; 52+ messages in thread
From: David Wagner @ 2006-09-07  1:29 UTC (permalink / raw)
  To: linux-kernel

Casey Schaufler  wrote:
>In our TCSEC B1 (Old person speak for LSPP)
>evaluation we had to put way too much effort
>into explaining why certain operations that
>had nothing to do with the system security
>policy (e.g. compute resource limitations)
>required privilege. These operations had
>no security implications at all, but since
>they required privilege they were assumed to
>have dire consequences should they be abused.

Well, this is sounding like a pretty weak argument.

It sounds like what you are saying is the evaluators were not thinking
straight when they gave you a hard time about your efforts to do a
better job of implementing the principle of least privilege.  If I
understand the gist of what you are saying correctly, you reduced the
privilege on some processes, and the result was that they complained
more loudly than if you had done nothing.  If so, that's pretty lame.
That doesn't sound to me like the evaluators were thinking very clearly.

And if the evaluators don't really understand how to think clearly about
security, why should we pay any attention to them, anyway?  I see no
reason to design the Linux kernel around the whims of those who don't
understand the technical issues.  Who cares about what the evaluators
think, if they don't have their head screwed on straight?  Personally,
I care more about technical merit than about pleasing folks who don't
understand security.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-06 22:27   ` David Madore
  2006-09-07  0:04     ` David Madore
@ 2006-09-07  6:43     ` Jan Engelhardt
  2006-09-07 23:02     ` Serge E. Hallyn
  2 siblings, 0 replies; 52+ messages in thread
From: Jan Engelhardt @ 2006-09-07  6:43 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list, Serge E. Hallyn


>> In the meantime, so long as you're adding some new capabilities, how
>> about also splitting up a few like CAP_SYS_ADMIN?  Have you looked into
>> it and decided none are really separable, i.e. any subset leads to the
>> ability to get any other subset?
>
>I agree that splitting CAP_SYS_ADMIN might be worth while, but it
>really looks like opening a worm can, so I didn't feel up to the
>challenge there.  It might be a good idea to reserve some bits for
>that possibility, however - I'm not sure how best to proceed.

Split it into read and write options, for a start. This is important in
a sub-"root"-user environment as is currently created with my MultiAdm
LSM, which, due to the broad spectrum CAP_SYS_ADMIN addresses, must
give SYS_ADMIN to the subadmin (to be able to read restricted objects)
and restrict it afterwards in all the LSM hooks (to not write to
restricted objects).

http://lkml.org/lkml/2006/5/1/105
http://lkml.org/lkml/2006/5/1/110 <- important


Jan Engelhardt
-- 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07  1:29             ` David Wagner
@ 2006-09-07 16:00               ` Casey Schaufler
  2006-09-07 18:33                 ` David Wagner
  0 siblings, 1 reply; 52+ messages in thread
From: Casey Schaufler @ 2006-09-07 16:00 UTC (permalink / raw)
  To: David Wagner, linux-kernel



--- David Wagner <daw@cs.berkeley.edu> wrote:

> Casey Schaufler  wrote:
> >In our TCSEC B1 (Old person speak for LSPP)
> >evaluation we had to put way too much effort
> >into explaining why certain operations that
> >had nothing to do with the system security
> >policy (e.g. compute resource limitations)
> >required privilege. These operations had
> >no security implications at all, but since
> >they required privilege they were assumed to
> >have dire consequences should they be abused.
> 
> Well, this is sounding like a pretty weak argument.

That all depends on how important getting
an evaluation complete is to you. And, they
do have a point, which is why does it make
sense to use the same privilege mechanism
for you security policy as you do for your
resource management policy.

> It sounds like what you are saying is the evaluators
> were not thinking
> straight when they gave you a hard time about your
> efforts to do a
> better job of implementing the principle of least
> privilege.

No, they were very clear that they felt that
use of the privelege ought to be an indication
that policy was being violated, and they were
correct. The issue was that the system made
it difficult to distinguish between violations
of the security policy and violations of the
resource management policy. 

> If I
> understand the gist of what you are saying
> correctly, you reduced the
> privilege on some processes, and the result was that
> they complained
> more loudly than if you had done nothing.  If so,
> that's pretty lame.
> That doesn't sound to me like the evaluators were
> thinking very clearly.

No, the issue was that every use of the
privilege mechanism, even when it was not
relevent to the security policy, required
investigation and explaination.

> And if the evaluators don't really understand how to
> think clearly about
> security, why should we pay any attention to them,
> anyway?

Because it's cool to have the hand-calligraphed
certificate hanging on your office wall.

> I see no
> reason to design the Linux kernel around the whims
> of those who don't
> understand the technical issues.

The evaluatee gets to explain the technical
issues to the evaluators. That's the challenge.
The areas that require the most explaination
are typically those that could use refinement.

> Who cares about what the evaluators
> think, if they don't have their head screwed on
> straight?  Personally,
> I care more about technical merit than about
> pleasing folks who don't
> understand security.

Much of the community's understanding of
security has come about because evaluators
asked unpleasant questions that required
hard thinking to answer. The SELinux effort,
for example, represents a directed effort
to "do better" than the Unix B1/CMW systems.
Don't underestimate evaluators.


Casey Schaufler
casey@schaufler-ca.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07  1:01           ` Casey Schaufler
  2006-09-07  1:29             ` David Wagner
@ 2006-09-07 17:34             ` David Madore
  2006-09-07 19:38               ` Bernd Eckenfels
  2006-09-07 22:54               ` Pavel Machek
  1 sibling, 2 replies; 52+ messages in thread
From: David Madore @ 2006-09-07 17:34 UTC (permalink / raw)
  To: Linux Kernel mailing-list

On Thu, Sep 07, 2006 at 01:01:48AM +0000, Casey Schaufler wrote:
> --- David Madore <david.madore@ens.fr> wrote:
> > doesn't it make sense to implement them in the same
> > framework?
> 
> I'm certainly not convinced that you'd
> want that. Think of all the programs that
> would have to be marked with CAP_FORK.

They wouldn't have to be marked: capabilities are inherited by
default, with my patch (as is the Unix tradition: euid=0 or {r,s}uid=0
are preserved upon execve()), normal processes have CAP_FORK and just
pass it on if you don't do something special to remove it.

> > Rather
> > than trying to reproduce the same rules in a
> > different part of the
> > kernel, causing code reduplication which would
> > eventually, inevitably,
> > fall out of sync...  I think it's easier for
> > everyone if under- and
> > over-privileges are treated in a uniform fashion.
> 
> This again assumes that you want to require
> that in general processes run with some
> capabilities.

Yes.  In general, processes have all "regular" capabilities, and they
are inherited normally.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-05 21:26 patch to make Linux capabilities into something useful (v 0.3.1) David Madore
  2006-09-06  0:27 ` Casey Schaufler
  2006-09-06 18:25 ` Serge E. Hallyn
@ 2006-09-07 18:21 ` James Antill
  2006-09-07 18:33   ` Kyle Moffett
  2006-09-08  4:00   ` David Madore
  2 siblings, 2 replies; 52+ messages in thread
From: James Antill @ 2006-09-07 18:21 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list

David Madore <david.madore@ens.fr> writes:

> Hi.
>
> As we all know, capabilities under Linux are currently crippled to the
> point of being useless.  Attached is a patch (against 2.6.18-rc6)
> which attempts to make them work in a reasonably useful way and at the
> same time not break anything.  On top of the "additional" capabilities
> that lead up to root, it also adds "regular" capabilities which all
> processes have by default and which can be removed from specifically
> untrusted programs.

 Just a minor comment, can you break out the OPEN into at least
OPEN_R, OPEN_NONFILE_W and OPEN_W (possibly OPEN_A, but I don't want
that personally).
 The case I'm thinking about are network daemons that need to
open+write to the syslog socket but only have read access elsewhere.


 Also there is much more than just bind9 using capabilities, the
obvious ones that come to mind are NOATIME and AUDIT.

-- 
James Antill -- james@and.org
http://www.and.org/and-httpd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 16:00               ` Casey Schaufler
@ 2006-09-07 18:33                 ` David Wagner
  0 siblings, 0 replies; 52+ messages in thread
From: David Wagner @ 2006-09-07 18:33 UTC (permalink / raw)
  To: linux-kernel

Casey Schaufler  wrote:
>That all depends on how important getting
>an evaluation complete is to you.

Getting an evaluation complete is of approximately zero importance to me.
I do want the system to be secure, but if the evaluators are ignorant of
basic security principles, then I don't much care what they may think.
Like I said, technical merit is a lot more important to me than pleasing
those who can't think clearly about security.  Frankly, I don't give a
fig what such evaluators think.

If you think there is a good technical argument against this patch, then
I encourage you to state that argument for yourself, without reference
to what evaluators may or may not think.  Appeals to authority do not
persuade me -- especially when the so-called "authority" doesn't appear
to know how to think clearly about security.

>And, they
>do have a point, which is why does it make
>sense to use the same privilege mechanism
>for you security policy as you do for your
>resource management policy.

I didn't think this patch had much of anything to do with resource
management.  I thought this patch was about POSIX-like capabilities.
Resource management isn't relevant here.  Can we talk about this patch,
instead of talking about why some other system of yours got hassled by
the evaluators?

>No, they were very clear that they felt that
>use of the privelege ought to be an indication
>that policy was being violated, and they were
>correct.

That's silly.  There's no justification for that view.  What does
"use of privilege" mean?  *Every* process has some privilege or other.
I think what you mean is that "any process which has privilege above some
baseline should be an indication that policy was violated".  But their
mistake was in getting confused over what the right baseline is, for the
purposes of that heuristic.

If the evaluators thought that a system where every application you
run automatically receives privilege to, e.g., delete all your files is
better than a system where only some applications receive privilege to
delete all your files -- then maybe they need to learn a little more
about the principle of least privilege.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 18:21 ` James Antill
@ 2006-09-07 18:33   ` Kyle Moffett
  2006-09-07 20:05     ` James Antill
  2006-09-08  4:00   ` David Madore
  1 sibling, 1 reply; 52+ messages in thread
From: Kyle Moffett @ 2006-09-07 18:33 UTC (permalink / raw)
  To: James Antill; +Cc: David Madore, Linux Kernel mailing-list

On Sep 07, 2006, at 14:21:15, James Antill wrote:
> David Madore <david.madore@ens.fr> writes:
>
>> Hi.
>>
>> As we all know, capabilities under Linux are currently crippled to  
>> the point of being useless.  Attached is a patch (against 2.6.18- 
>> rc6) which attempts to make them work in a reasonably useful way  
>> and at the same time not break anything.  On top of the  
>> "additional" capabilities that lead up to root, it also adds  
>> "regular" capabilities which all processes have by default and  
>> which can be removed from specifically untrusted programs.
>
> Just a minor comment, can you break out the OPEN into at least  
> OPEN_R, OPEN_NONFILE_W and OPEN_W (possibly OPEN_A, but I don't  
> want that personally). The case I'm thinking about are network  
> daemons that need to open+write to the syslog socket but only have  
> read access elsewhere.
>
> Also there is much more than just bind9 using capabilities, the  
> obvious ones that come to mind are NOATIME and AUDIT.

To be honest, once you get to the point of having an OPEN_NONFILE_W  
capability you should really just be using SELinux.  Instead of  
spewing policy all over your filesystem xattrs you put it all into a  
single set of fairly admin-readable policy files, and it includes  
more fine-grained operations than you have space for in 32 bits of  
"un"-capabilities.  It also conveniently supports the existing pseudo- 
POSIX capabilities.  You can effectively say "root, netadmin, and  
netuser users are allowed to run ping programs which automatically  
transition into the ping domain and get CAP_NET_RAW privileges" in  
your binary policy, then just leave ping as chmod "755" and be done  
with it.

Cheers,
Kyle Moffett


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 17:34             ` David Madore
@ 2006-09-07 19:38               ` Bernd Eckenfels
  2006-09-07 23:00                 ` Pavel Machek
  2006-09-07 22:54               ` Pavel Machek
  1 sibling, 1 reply; 52+ messages in thread
From: Bernd Eckenfels @ 2006-09-07 19:38 UTC (permalink / raw)
  To: linux-kernel

In article <20060907173449.GA24013@clipper.ens.fr> you wrote:
> They wouldn't have to be marked: capabilities are inherited by
> default, with my patch (as is the Unix tradition: euid=0 or {r,s}uid=0
> are preserved upon execve()), normal processes have CAP_FORK and just
> pass it on if you don't do something special to remove it.

The Problem with that is, that a program can be started with some priveldges
without knowing it. Traditional programs only check for uid=0 and in that
case refuse to do some things. A program might not expect to be able to do a
priveldged action with not being uid=0.

I do think those programs are broken, but it is a argument, why the
inheritance should be off by default (unless the euid is always 0 if any
root privelddge is enabled)

> Yes.  In general, processes have all "regular" capabilities, and they
> are inherited normally.

I like that, i think it was once called user capabilities. I wonder if we
need to make those distinctions at all.

BTW: some of the priveldges could also be avoided by adding access control
to some kernel objects. I really like the windows approach here, and it
would be possible to have the permissions bound to special files (which
would also allow ACLs). So for example instead of haing CAP_NET_ADMIN the
kernel could check "can the process open /dev/objects/netadmin for write".

Gruss
Bernd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 18:33   ` Kyle Moffett
@ 2006-09-07 20:05     ` James Antill
  0 siblings, 0 replies; 52+ messages in thread
From: James Antill @ 2006-09-07 20:05 UTC (permalink / raw)
  To: Kyle Moffett; +Cc: David Madore, Linux Kernel mailing-list

Kyle Moffett <mrmacman_g4@mac.com> writes:

> On Sep 07, 2006, at 14:21:15, James Antill wrote:
>>
>> Just a minor comment, can you break out the OPEN into at least
>> OPEN_R, OPEN_NONFILE_W and OPEN_W (possibly OPEN_A, but I don't
>> want that personally). The case I'm thinking about are network
>> daemons that need to open+write to the syslog socket but only have
>> read access elsewhere.
>>
>> Also there is much more than just bind9 using capabilities, the
>> obvious ones that come to mind are NOATIME and AUDIT.
>
> To be honest, once you get to the point of having an OPEN_NONFILE_W
> capability you should really just be using SELinux.

 Actually, I got confused ... I forgot that you have to connect() to
/dev/log and can't just open() it[1] ... so just having open_r and
open_w separated would be enough. Which you'll hopefully agree would
be nice to have outside of SELinux policy (noting that SELinux doesn't
let you limit open calls, as such, only read+write -- although it's
mostly the same).


[1] Although, if we ever fix open... :)

-- 
James Antill -- james@and.org
http://www.and.org/and-httpd

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 17:34             ` David Madore
  2006-09-07 19:38               ` Bernd Eckenfels
@ 2006-09-07 22:54               ` Pavel Machek
  2006-09-08  4:10                 ` David Madore
  1 sibling, 1 reply; 52+ messages in thread
From: Pavel Machek @ 2006-09-07 22:54 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list

Hi!

On the webpage, you wrote

* Second, I argue that an attacker (non-root, obviously) cannot take
advantage of the patch. (...) One might argue, however, that the patch makes suid
non-root programs vulnerable, as they could be executed with less
(regular) capabilities than they expect; however, this is not believed
to be a serious problem, because (a) such programs are much rarer than
suid root programs, (b) damage, if any, would be less limited (no
special capabilities are at stake, only access to the filesystem), (c)
removing regular capabilities makes system calls fail with a clean
error code (nothing exotic like the setuid() function which exhibits a
very subtle difference in behavior according as the CAP_SETUID
capability is set or not, which made the sendmail exploit possible),
and (d) system calls can always fail, so adding new causes for failure
is not introducing anything significantly different.

You contradict yourself. Yes, you are decreasing security of suid
non-root programs, and yes, someone will take advantage of that. Plus,
you can easily do away without this risk.

Just add all "usual" capabilities when execing
suid/sgid-anything. Alternatively disallow suid/sgid-anything exec
when all "usual" capabilities are not present.

(And btw I really like your idea of introducing "usual" capabilities
like CAP_REG_FORK/CAP_REG_OPEN/...).
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 19:38               ` Bernd Eckenfels
@ 2006-09-07 23:00                 ` Pavel Machek
  2006-09-08  1:22                   ` Bernd Eckenfels
  0 siblings, 1 reply; 52+ messages in thread
From: Pavel Machek @ 2006-09-07 23:00 UTC (permalink / raw)
  To: Bernd Eckenfels; +Cc: linux-kernel

On Thu 2006-09-07 21:38:43, Bernd Eckenfels wrote:
> In article <20060907173449.GA24013@clipper.ens.fr> you wrote:
> > They wouldn't have to be marked: capabilities are inherited by
> > default, with my patch (as is the Unix tradition: euid=0 or {r,s}uid=0
> > are preserved upon execve()), normal processes have CAP_FORK and just
> > pass it on if you don't do something special to remove it.
> 
> The Problem with that is, that a program can be started with some priveldges
> without knowing it. Traditional programs only check for uid=0 and in that
> case refuse to do some things. A program might not expect to be able to do a
> priveldged action with not being uid=0.

But that is not a problem.

If attacker already has priviledge foo, he can just go use it. He does
not have to exec() poor program not expecting to get priviledge foo,
then abusing it.

...actually...

...what you say is a problem. You are right that capabilities should
be "sanitized" on every exec of suid/sgid program (not only suid
root).

Sanitized here means "all regular capabilities set, all others
cleared".
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-06 22:27   ` David Madore
  2006-09-07  0:04     ` David Madore
  2006-09-07  6:43     ` Jan Engelhardt
@ 2006-09-07 23:02     ` Serge E. Hallyn
  2006-09-08  1:08       ` David Madore
  2 siblings, 1 reply; 52+ messages in thread
From: Serge E. Hallyn @ 2006-09-07 23:02 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list, Serge E. Hallyn

Quoting David Madore (david.madore@ens.fr):
> On Wed, Sep 06, 2006 at 01:25:31PM -0500, Serge E. Hallyn wrote:
> > The fact that you're changing the inheritance rules is a bit scary, so
> > I'm going to (and I hope others will) take some time to look it over.
> 
> Thanks!  I'd appreciate it.  Don't hesitate to ask me if some
> decisions I made are unclear.

Ok, so to be clear, in terms of inheritability of capabilities, your
three main changes are:

	1. When creating a bprm, it's inheritable and effective
	capability sets are set full on, whereas they used to be
	cleared.  The permitted set is treated as before (always
	cleared)

	2. When computing a process' new capabilities, the new
	inheritable come from the new permitted, rather than the old
	inheritable.

	3. You change half the computation of p'E to replace fE by
	pE in one half.

Here is one apparent change in behavior:

If I currently do

	cp /bin/sh /bin/shsetuid
	chmod u+s /bin/shsetuid

then log in as uid 1000 and run

	/bin/shsetuid
	# whoami
	hallyn
	# ls /root
	ls: /root: Permission denied

With your patch I believe it will succeed, since the sh process'
inheritable set will be set to it's permitted set.

Put another way:

	cap_set_proc("=i");
	execve("/bin/shsetuid");

I obviously wanted my inheritable set to be cleared, but running the
setuid binary will end up resetting my inheritable set to a larger
set.  Your goal of allowing the inheritable caps to be truly
inheritable may make sense, but this part of it feels wrong, and
changes current setuid behavior.

So in other words, it may make sense for the process to be able to
say "I want these caps to persist across exec" (*1), but it shouldn't
happen automatically based on the file's attributes.

In any case, perhaps it would be worthwhile making this a part of
a capability_plusplus module?  That would be less controversial,
given that I believe many people use the capability module who
really just want classic setuid behavior. 

thanks,
-serge

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07  0:04     ` David Madore
@ 2006-09-07 23:06       ` Serge E. Hallyn
  2006-09-08  4:16         ` David Madore
  0 siblings, 1 reply; 52+ messages in thread
From: Serge E. Hallyn @ 2006-09-07 23:06 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list, Serge E. Hallyn

Quoting David Madore (david.madore@ens.fr):
> On Thu, Sep 07, 2006 at 12:27:31AM +0200, David Madore wrote:
> > On Wed, Sep 06, 2006 at 01:25:31PM -0500, Serge E. Hallyn wrote:
> > > I'd recommend you split this patch into at least 3:
> > > 	1. move to 64-bit caps
> > > 	2. introduce your new caps
> > > 		(perhaps even one new cap per patch)
> > > 	3. introduce the new inheritance rules
> > 
> > Yes, that sounds like a good idea.  I'll do that.
> 
> Done.  Attached.  Except that the order is
> 
> part1: move to 64-bit caps (and also re-enable CAP_SETPCAP),
>        where upper 32-bits are "regular" capabilities (but none defined)
> 
> part2: introduce the new inheritance rules
> 
> part3: introduce new ("regular") capabilities

Thanks.  This made comparing the inh behavior to your web page and to
the classic code much easier.

I'm not sure reserving all 32 for 'regular' caps is the way
to go, since we're about to overflow the 32 bits of sysadm caps
already.  What about maybe 20 regular caps?

No need to do this now for my sake, but if you repost these, doing so
in 3 separate emails with the patches inline will make it more likely
that people read them.

thanks,
-serge

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 23:02     ` Serge E. Hallyn
@ 2006-09-08  1:08       ` David Madore
  2006-09-08  1:31         ` Serge E. Hallyn
  0 siblings, 1 reply; 52+ messages in thread
From: David Madore @ 2006-09-08  1:08 UTC (permalink / raw)
  To: Linux Kernel mailing-list; +Cc: Serge E. Hallyn

On Thu, Sep 07, 2006 at 06:02:45PM -0500, Serge E. Hallyn wrote:
> Ok, so to be clear, in terms of inheritability of capabilities, your
> three main changes are:

Yes, this is a fair description:

> 	1. When creating a bprm, it's inheritable and effective
> 	capability sets are set full on, whereas they used to be
> 	cleared.  The permitted set is treated as before (always
> 	cleared)

- This is to make capabilities inheritable but don't add any others
except when executing suid root.

> 	2. When computing a process' new capabilities, the new
> 	inheritable come from the new permitted, rather than the old
> 	inheritable.

- The reason for that is the necessity to preserve Unix semantics (see
below).

> 	3. You change half the computation of p'E to replace fE by
> 	pE in one half.

- Again, to preserve Unix semantics (if a process with {r,s}uid=0 and
euid!=0 does an exec(), the resulting process also has euid!=0, that
is, no effective capabilities).

> Here is one apparent change in behavior:
> 
> If I currently do
> 
> 	cp /bin/sh /bin/shsetuid
> 	chmod u+s /bin/shsetuid
> 
> then log in as uid 1000 and run
> 
> 	/bin/shsetuid
> 	# whoami
> 	hallyn
> 	# ls /root
> 	ls: /root: Permission denied

What does "currently" mean"?  On an unpatched Linux, I believe (and
observe) the following:

* if your /bin/sh is bash, it purposely drops privileges (by doing
something like setresuid(getuid(),getuid(),getuid()), I haven't
checked the source), and this is the reason you get "Permission
denied",

* if your /bin/sh is something else, it keeps euid==0 and you have
root privileges all the way, including in children processes - this is
traditional Unix behavior.

My patch doesn't change any of this (I've checked), since it uses
inheritance rules for capabilities which are closely modeled upon
those of {r,s,e}uid (in fact, that's my very reason for "changing"
things), and since the bash method of dropping privileges is also kept
woring.

(I don't know *why* bash tries to drop privileges.  It's probably an
attempt at avoiding certain security problems, but I think it's a
rather bad one.)

> With your patch I believe it will succeed, since the sh process'
> inheritable set will be set to it's permitted set.

My patch doesn't change this behavior.  Evidently, if it did, it would
be very bad...

> Put another way:

I'm not sure why what follows is a restatement of what precedes, so
I'll answer differently.

> 	cap_set_proc("=i");
> 	execve("/bin/shsetuid");
> 
> I obviously wanted my inheritable set to be cleared, but running the
> setuid binary will end up resetting my inheritable set to a larger
> set.  Your goal of allowing the inheritable caps to be truly
> inheritable may make sense, but this part of it feels wrong, and
> changes current setuid behavior.

In the current (unpatched) Linux kernel, the inheritable set is
completely ignored anyway. :-( So certainly any attempt to make
something of it must change the behavior.

I agree that the above code snippet exhibits a difference of my patch
w.r.t. the capabilities(7)-documented behavior (or at least, might,
according to the way suid programs are interpreted), but this
difference is

(a) necessary in order not to break traditional Unix semantics
(children of a program with euid==0 also have euid==0, and the father
process can't avoid that), and

(b) necessary for security reasons (it is imperative that the parent
of a suid root process cannot prevent that process from keeping
privileges, otherwise we get the sendmail bug again).


To summarize my answer: as far as I know, my patch does not change
suid behavior: I've taken great care not to let that happen.  It does
change the documented inheritance behavior of capabilities, but that
is unavoidable.

PS: I should be releasing a new version of my patch, along with a
merged version of yours, very shortly.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 23:00                 ` Pavel Machek
@ 2006-09-08  1:22                   ` Bernd Eckenfels
  2006-09-08 10:45                     ` Pavel Machek
  2006-09-08 14:39                     ` Pavel Machek
  0 siblings, 2 replies; 52+ messages in thread
From: Bernd Eckenfels @ 2006-09-08  1:22 UTC (permalink / raw)
  To: Pavel Machek, linux-kernel

On Fri, Sep 08, 2006 at 01:00:28AM +0200, Pavel Machek wrote:
> If attacker already has priviledge foo, he can just go use it. He does
> not have to exec() poor program not expecting to get priviledge foo,
> then abusing it.

It is not about attackers. It is about normal usage. If you spawn a program,
it might behave wrong since it does not know that it is priveledged. For
example a network daemon might start a child process which interacts with
the user, and forgets to drop priveldges for it.

> Sanitized here means "all regular capabilities set, all others
> cleared".

Yes, however I thought this was exactly what the patch is not doing?

Gruss
Bernd
-- 
  (OO)     -- Bernd_Eckenfels@Mörscher_Strasse_8.76185Karlsruhe.de --
 ( .. )    ecki@{inka.de,linux.de,debian.org}  http://www.eckes.org/
  o--o   1024D/E383CD7E  eckes@IRCNet  v:+497211603874  f:+49721151516129
(O____O)  When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl!

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08  1:08       ` David Madore
@ 2006-09-08  1:31         ` Serge E. Hallyn
  2006-09-08 21:45           ` David Madore
  0 siblings, 1 reply; 52+ messages in thread
From: Serge E. Hallyn @ 2006-09-08  1:31 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list, Serge E. Hallyn

Quoting David Madore (david.madore@ens.fr):
> My patch doesn't change any of this (I've checked), since it uses
> inheritance rules for capabilities which are closely modeled upon
> those of {r,s,e}uid (in fact, that's my very reason for "changing"
> things), and since the bash method of dropping privileges is also kept
> woring.

Ah, ok.  So there is in fact no change in setuid behavior at all then.

Do you have a little testsuite you've run which you could make available
someplace?  Or a few test programs you could toss into a tarball and
call a testsuite?  :)

> (b) necessary for security reasons (it is imperative that the parent
> of a suid root process cannot prevent that process from keeping
> privileges, otherwise we get the sendmail bug again).

Good point.

> To summarize my answer: as far as I know, my patch does not change
> suid behavior: I've taken great care not to let that happen.  It does
> change the documented inheritance behavior of capabilities, but that
> is unavoidable.
> 
> PS: I should be releasing a new version of my patch, along with a
> merged version of yours, very shortly.

Could you cc: the lsm list (linux-security-module@vger.kernel.org)?
I'd particularly have Chris Wright give some comment as he's spent a
lot of time looking at capabilities.

thanks,
-serge

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 18:21 ` James Antill
  2006-09-07 18:33   ` Kyle Moffett
@ 2006-09-08  4:00   ` David Madore
  1 sibling, 0 replies; 52+ messages in thread
From: David Madore @ 2006-09-08  4:00 UTC (permalink / raw)
  To: James Antill; +Cc: Linux Kernel mailing-list

On Thu, Sep 07, 2006 at 02:21:15PM -0400, James Antill wrote:
>  Just a minor comment, can you break out the OPEN into at least
> OPEN_R, OPEN_NONFILE_W and OPEN_W (possibly OPEN_A, but I don't want
> that personally).

Version 0.4.2 of the patch, which I'm about to post to the list, adds
a CAP_REG_WRITE capability that is supposed to prevent any write
operation to the filesystem.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 22:54               ` Pavel Machek
@ 2006-09-08  4:10                 ` David Madore
  2006-09-08 10:52                   ` Pavel Machek
  2006-09-09  0:59                   ` patch to make Linux capabilities into something useful (v 0.3.1) David Wagner
  0 siblings, 2 replies; 52+ messages in thread
From: David Madore @ 2006-09-08  4:10 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux Kernel mailing-list

On Fri, Sep 08, 2006 at 12:54:29AM +0200, Pavel Machek wrote:
> You contradict yourself.

I don't see how that is.  I understand that you could be unconvinced
by my reasoning and by my arguments, but I don't see how they are
contradictory.

The bottom line is that, whereas for root making syscalls fail (or,
worse, in the case of setuid(), behave subtly diffently) is a radical
change, for non-root it is something which should always be expected
(fork() can fail for lack of resources, write() can fail for quota
exhaution, etc.), and not something an attacker should be able to
exploit.

>			   Yes, you are decreasing security of suid
> non-root programs, and yes, someone will take advantage of that. Plus,
> you can easily do away without this risk.

I wish I could offer more assurance, but unfortunately the solutions
which do away with the risk come with a great cost:

> Just add all "usual" capabilities when execing
> suid/sgid-anything.

This makes it trivial to regain capabilities: just create a program
suid yourself and exec it.  OK, we can say that "yourself" won't work,
but you still only need to find another uid to hijack...  Not too
satisfactory.  Perhaps if we create a capability to add the 's' bit to
anything?  That's a bit messy, but perhaps feasible.

>		      Alternatively disallow suid/sgid-anything exec
> when all "usual" capabilities are not present.

This is probably too stringent: remove any trivial capability
whatsoever and you lose a rather important ability.

But certainly this is a matter for some debate...

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-07 23:06       ` Serge E. Hallyn
@ 2006-09-08  4:16         ` David Madore
  0 siblings, 0 replies; 52+ messages in thread
From: David Madore @ 2006-09-08  4:16 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Linux Kernel mailing-list

On Thu, Sep 07, 2006 at 06:06:30PM -0500, Serge E. Hallyn wrote:
> I'm not sure reserving all 32 for 'regular' caps is the way
> to go, since we're about to overflow the 32 bits of sysadm caps
> already.  What about maybe 20 regular caps?

Yes, that could easily be arranged.  I've tried to be careful and
nowhere asssume that the regular caps were precisely numbers 32
through 63.

> No need to do this now for my sake, but if you repost these, doing so
> in 3 separate emails with the patches inline will make it more likely
> that people read them.

I'll do this now for version 0.4.2, which merges with your own
filesystem support.  I made one noteworthy change to your code, by the
way, which is to disable the "permitted" (=forced) set of capabilities
on executables whose filesystem is mounted nosuid: I think this is a
reasonable constraint, to avoid the attack where a luser mounts a usb
key with xattrs in the filesystem or something like that.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08  1:22                   ` Bernd Eckenfels
@ 2006-09-08 10:45                     ` Pavel Machek
  2006-09-08 16:08                       ` Casey Schaufler
  2006-09-08 14:39                     ` Pavel Machek
  1 sibling, 1 reply; 52+ messages in thread
From: Pavel Machek @ 2006-09-08 10:45 UTC (permalink / raw)
  To: Bernd Eckenfels; +Cc: linux-kernel

Hi!

> > If attacker already has priviledge foo, he can just go use it. He does
> > not have to exec() poor program not expecting to get priviledge foo,
> > then abusing it.
> 
> It is not about attackers. It is about normal usage. If you spawn a program,
> it might behave wrong since it does not know that it is priveledged. For
> example a network daemon might start a child process which interacts with
> the user, and forgets to drop priveldges for it.

Ok, something like that is possible, at least it is not a security
problem.

> > Sanitized here means "all regular capabilities set, all others
> > cleared".
> 
> Yes, however I thought this was exactly what the patch is not doing?

Yep, it needs to be fixed ;-).
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08  4:10                 ` David Madore
@ 2006-09-08 10:52                   ` Pavel Machek
  2006-09-08 22:51                     ` David Madore
  2006-09-09  0:59                   ` patch to make Linux capabilities into something useful (v 0.3.1) David Wagner
  1 sibling, 1 reply; 52+ messages in thread
From: Pavel Machek @ 2006-09-08 10:52 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list

Hi!

> > You contradict yourself.
> 
> I don't see how that is.  I understand that you could be unconvinced
> by my reasoning and by my arguments, but I don't see how they are
> contradictory.

Well, you claim it is as safe as possible, and it is not quite. 

> The bottom line is that, whereas for root making syscalls fail (or,
> worse, in the case of setuid(), behave subtly diffently) is a radical
> change, for non-root it is something which should always be expected
> (fork() can fail for lack of resources, write() can fail for quota
> exhaution, etc.), and not something an attacker should be able to
> exploit.

I can bet someone will get the fork() case wrong:

f = fork();
kill(f);

fork will return -1, and kill will kill _all_ the processes.

> >			   Yes, you are decreasing security of suid
> > non-root programs, and yes, someone will take advantage of that. Plus,
> > you can easily do away without this risk.
> 
> I wish I could offer more assurance, but unfortunately the solutions
> which do away with the risk come with a great cost:
> 
> > Just add all "usual" capabilities when execing
> > suid/sgid-anything.
> 
> This makes it trivial to regain capabilities: just create a program
> suid yourself and exec it.  OK, we can say that "yourself" won't work,
> but you still only need to find another uid to hijack...  Not too

If you can find another uid to hijack, that other uid has bad
problems. And I do not think you'll commonly find another uid to
hijack.

And there are easier ways to get out of jail with your proposed
capabilities: you do not restrict ptrace, so you can just ptrace any
other process with same uid, and hijack it.

(You probably want to introduce CAP_REG_PTRACE).

Or just remove CAP_REG_XUID_EXEC when removing any other CAP_REG...?

> >		      Alternatively disallow suid/sgid-anything exec
> > when all "usual" capabilities are not present.
> 
> This is probably too stringent: remove any trivial capability
> whatsoever and you lose a rather important ability.

It is not too bad; you'll usually not want restricted programs to exec
anything setuid... (Do you have example where
restricted-but-should-be-able-to-setuid-exec makes sense?)
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08  1:22                   ` Bernd Eckenfels
  2006-09-08 10:45                     ` Pavel Machek
@ 2006-09-08 14:39                     ` Pavel Machek
  2006-09-08 19:10                       ` Bernd Eckenfels
  1 sibling, 1 reply; 52+ messages in thread
From: Pavel Machek @ 2006-09-08 14:39 UTC (permalink / raw)
  To: Bernd Eckenfels; +Cc: linux-kernel

On Fri 2006-09-08 03:22:01, Bernd Eckenfels wrote:
> On Fri, Sep 08, 2006 at 01:00:28AM +0200, Pavel Machek wrote:
> > If attacker already has priviledge foo, he can just go use it. He does
> > not have to exec() poor program not expecting to get priviledge foo,
> > then abusing it.
> 
> It is not about attackers. It is about normal usage. If you spawn a program,
> it might behave wrong since it does not know that it is priveledged. For
> example a network daemon might start a child process which interacts with
> the user, and forgets to drop priveldges for it.

Well, then mistake was running that daemon with elevated priviledges
in the first place.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08 10:45                     ` Pavel Machek
@ 2006-09-08 16:08                       ` Casey Schaufler
  0 siblings, 0 replies; 52+ messages in thread
From: Casey Schaufler @ 2006-09-08 16:08 UTC (permalink / raw)
  To: linux-kernel




There is lots of talk regarding the interaction
between setuid and capabilities. While the
current semanitics that lack file system
support include (of necessity) interaction
the intent of the POSIX scheme completely
divorces the setuid mechanism from the
capability scheme. The intention on a system
that supports file system capabilities is
that the setuid bit is a ignored in the
capability calculation, allowing for a system
on which root is just another user. The
capability inheritance machanism is
complicated, but that is in support of
the ability to mark a program with a limited
set of privilege so that setuid root need
not be used.

Does anyone need a pointer to the POSIX
draft document? They include extensive if
somewhat disjointed rationale sections.

http://wt.tuxomania.net/publications/posix.1e/download.html



Casey Schaufler
casey@schaufler-ca.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08 14:39                     ` Pavel Machek
@ 2006-09-08 19:10                       ` Bernd Eckenfels
  0 siblings, 0 replies; 52+ messages in thread
From: Bernd Eckenfels @ 2006-09-08 19:10 UTC (permalink / raw)
  To: Pavel Machek, linux-kernel

On Fri, Sep 08, 2006 at 04:39:47PM +0200, Pavel Machek wrote:
> Well, then mistake was running that daemon with elevated priviledges
> in the first place.

there are workers out there which expect to be started priveldged, do
something (bind, suid, ...) and then drop priveledges. If those check if the
drop is needed based on the euid...

Of course this can be solved better, however i remeber that those cases are
the ones where compatibility means any priveledge -> euid = 0.

Anyway, I think there is something like that in the proposed patch, so it
looks good.

Gruss
Bernd
-- 
  (OO)     -- Bernd_Eckenfels@Mörscher_Strasse_8.76185Karlsruhe.de --
 ( .. )    ecki@{inka.de,linux.de,debian.org}  http://www.eckes.org/
  o--o   1024D/E383CD7E  eckes@IRCNet  v:+497211603874  f:+49721151516129
(O____O)  When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl!

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08  1:31         ` Serge E. Hallyn
@ 2006-09-08 21:45           ` David Madore
  0 siblings, 0 replies; 52+ messages in thread
From: David Madore @ 2006-09-08 21:45 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Linux Kernel mailing-list

On Thu, Sep 07, 2006 at 08:31:36PM -0500, Serge E. Hallyn wrote:
> Do you have a little testsuite you've run which you could make available
> someplace?  Or a few test programs you could toss into a tarball and
> call a testsuite?  :)

<URL: ftp://quatramaran.ens.fr/pub/madore/newcaps/testsuite/ >

It doesn't test everything, though, and it's pretty ugly.  It's meant
for version 0.4.2 of the patch.

> Could you cc: the lsm list (linux-security-module@vger.kernel.org)?
> I'd particularly have Chris Wright give some comment as he's spent a
> lot of time looking at capabilities.

Done.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08 10:52                   ` Pavel Machek
@ 2006-09-08 22:51                     ` David Madore
  2006-09-09  0:11                       ` Casey Schaufler
  2006-09-09 11:40                       ` Pavel Machek
  0 siblings, 2 replies; 52+ messages in thread
From: David Madore @ 2006-09-08 22:51 UTC (permalink / raw)
  To: Linux Kernel mailing-list; +Cc: Pavel Machek

On Fri, Sep 08, 2006 at 12:52:39PM +0200, Pavel Machek wrote:
> Well, you claim it is as safe as possible, and it is not quite. 

I claim "safe enough". :-)

> I can bet someone will get the fork() case wrong:
> 
> f = fork();
> kill(f);
> 
> fork will return -1, and kill will kill _all_ the processes.

Someone who writes code like that deserves to get all his processes
killed. :-p fork() can fail for a million reasons, some of which, on
most systems, can be provoked by a malicious attacker (such as filling
all available process slots).

> If you can find another uid to hijack, that other uid has bad
> problems. And I do not think you'll commonly find another uid to
> hijack.

How about another gid, then?  Should we reset all caps on sgid exec?

Ultimately a compromise is to be reached between security and
flexibility...  The problem is, I don't know who should make the
decision.

> And there are easier ways to get out of jail with your proposed
> capabilities: you do not restrict ptrace, so you can just ptrace any
> other process with same uid, and hijack it.

That's true.  The restrictions on process killing (which Serge
introduced) should probably be applied to ptrace() also.

> (You probably want to introduce CAP_REG_PTRACE).

Good idea.  I did, in version 0.4.2.

> Or just remove CAP_REG_XUID_EXEC when removing any other CAP_REG...?

Doable, but ugly (or so I think): there are many paths that set
caps...  A simpler solution would be to remove the test on
CAP_REG_SXID and instead test on all regular caps simultaneously.
Still, I really don't like the idea.

> It is not too bad; you'll usually not want restricted programs to exec
> anything setuid... (Do you have example where
> restricted-but-should-be-able-to-setuid-exec makes sense?)

Well, I could imagine that a paranoid sysadmin might want some users'
processes to run without this or that capability (perhaps
CAP_REG_PTRACE or some other yet-to-be-defined capability).  This
doesn't mean that they shouldn't be able to run a game which runs sgid
in order to write the score file.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08 22:51                     ` David Madore
@ 2006-09-09  0:11                       ` Casey Schaufler
  2006-09-09 11:59                         ` Pavel Machek
  2006-09-09 11:40                       ` Pavel Machek
  1 sibling, 1 reply; 52+ messages in thread
From: Casey Schaufler @ 2006-09-09  0:11 UTC (permalink / raw)
  To: David Madore, Linux Kernel mailing-list; +Cc: Pavel Machek



--- David Madore <david.madore@ens.fr> wrote:

> On Fri, Sep 08, 2006 at 12:52:39PM +0200, Pavel
> Machek wrote:
> > Well, you claim it is as safe as possible, and it
> is not quite. 
> 
> I claim "safe enough". :-)
> 
> > I can bet someone will get the fork() case wrong:
> > 
> > f = fork();
> > kill(f);
> > 
> > fork will return -1, and kill will kill _all_ the
> processes.
> 
> Someone who writes code like that deserves to get
> all his processes killed. :-p

Sure, but usually the person who wrote that
code was never bitten by it and is now a
marketing consultant on another continent.
Sloppy error handling remains rampant throughout
the programming community and there is
nothing more likely to get bitten by it
than a change to the privilege model.
Since the original author no longer gives
a horse's pitute about the issue the problem
is yours.

> fork() can fail for a million reasons,
> some of which, on
> most systems, can be provoked by a malicious
> attacker (such as filling
> all available process slots).

Failures of fork() are sufficiently rare
that there is way too much code with the
aforementioned problem to treat it lightly.
 
> > If you can find another uid to hijack, that other
> uid has bad
> > problems. And I do not think you'll commonly find
> another uid to
> > hijack.
> 
> How about another gid, then?  Should we reset all
> caps on sgid exec?

One advantage of the POSIX model, which
uses the capability set off the file, is
that it ignores setuid/setgid in favor of
getting the capabilities from the CAP
attribute. With the file attribute you keep
the setuid'ness independent. If there's
no capability set, you get no capabilities.

> Ultimately a compromise is to be reached between
> security and
> flexibility...  The problem is, I don't know who
> should make the
> decision.

Me! Me!
 
> > And there are easier ways to get out of jail with
> your proposed
> > capabilities: you do not restrict ptrace, so you
> can just ptrace any
> > other process with same uid, and hijack it.
> 
> That's true.  The restrictions on process killing
> (which Serge
> introduced) should probably be applied to ptrace()
> also.
> 
> > (You probably want to introduce CAP_REG_PTRACE).
> 
> Good idea.  I did, in version 0.4.2.
> 
> > Or just remove CAP_REG_XUID_EXEC when removing any
> other CAP_REG...?
> 
> Doable, but ugly (or so I think): there are many
> paths that set
> caps...  A simpler solution would be to remove the
> test on
> CAP_REG_SXID and instead test on all regular caps
> simultaneously.
> Still, I really don't like the idea.
> 
> > It is not too bad; you'll usually not want
> restricted programs to exec
> > anything setuid... (Do you have example where
> > restricted-but-should-be-able-to-setuid-exec makes
> sense?)
> 
> Well, I could imagine that a paranoid sysadmin might
> want some users'
> processes to run without this or that capability
> (perhaps
> CAP_REG_PTRACE or some other yet-to-be-defined
> capability).  This
> doesn't mean that they shouldn't be able to run a
> game which runs sgid
> in order to write the score file.

A likely scenario might be the 3rd party program
that you really are sure about trusting. You
give it a capability set that has nothing in it
(hence runs without capability regardless of
the capabilities of the parent). That's part
of the rationale behind the POSIX scheme, that
some programs you just don't want to ever run
privileged, period. But POSIX only deals with
going "above" base, which is why I like the
notion of your "underprivileged" scheme as a
seperate addition.


Casey Schaufler
casey@schaufler-ca.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08  4:10                 ` David Madore
  2006-09-08 10:52                   ` Pavel Machek
@ 2006-09-09  0:59                   ` David Wagner
  2006-09-09 12:49                     ` David Madore
  1 sibling, 1 reply; 52+ messages in thread
From: David Wagner @ 2006-09-09  0:59 UTC (permalink / raw)
  To: linux-kernel

David Madore  wrote:
>On Fri, Sep 08, 2006 at 12:54:29AM +0200, Pavel Machek wrote:
>>		      Alternatively disallow suid/sgid-anything exec
>> when all "usual" capabilities are not present.
>
>This is probably too stringent: remove any trivial capability
>whatsoever and you lose a rather important ability.

This might not be so terrible.  At least, I'm not sure I'd rule it
out at this point -- it seems like it might be worth considering.

First of all, if you have intentionally created an underprivileged
process, it seems almost reasonable to think that you might not want
that process to be able to exec anything that is suid/sgid.

Second, there are very few programs out there that are sgid or
suid-to-something-other-than-root.  On my FC5 machine, the ones I can spot
are ssh-agent, locate, screen, sendmail, postfix, a bunch of games, and a
few other random programs that are probably very rarely used.  I'm having
a hard time imagining a use case where I'd want an underprivileged daemon
to be able to exec one of those programs.  Normally, if I'm running
one of those programs, I'm probably doing it from a command-line shell
(which will have full user privileges) or some other environment that
is running with full user privileges (e.g., startup scripts or login
scripts, in the case of ssh-agent).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-08 22:51                     ` David Madore
  2006-09-09  0:11                       ` Casey Schaufler
@ 2006-09-09 11:40                       ` Pavel Machek
  2006-09-10 10:41                         ` David Madore
  1 sibling, 1 reply; 52+ messages in thread
From: Pavel Machek @ 2006-09-09 11:40 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list

Hi!

> > Well, you claim it is as safe as possible, and it is not quite. 
> 
> I claim "safe enough". :-)

Ok, state 'this allows nasty user to induce failures in setgid
programs it execs' in changelog when you submit.

No, I do not think 'safe enough' is good enough for little or no gain.


> > I can bet someone will get the fork() case wrong:
> > 
> > f = fork();
> > kill(f);
> > 
> > fork will return -1, and kill will kill _all_ the processes.
> 
> Someone who writes code like that deserves to get all his processes
> killed. :-p

...as does someone who introduces known-bad security-related changes
withou *very* good reason.

> fork() can fail for a million reasons, some of which, on
> most systems,

'on most systems' is keyword here, and remember that other ways of
inducing fork failures are normally very easy to detect.

Also... fork was first example. There are probably better examples.


> > If you can find another uid to hijack, that other uid has bad
> > problems. And I do not think you'll commonly find another uid to
> > hijack.
> 
> How about another gid, then?  Should we reset all caps on sgid exec?

Yes. Any setuid/setgid exec is a security barrier, and weird (or new)
semantics may not cross that barrier.

> Ultimately a compromise is to be reached between security and
> flexibility...  The problem is, I don't know who should make the
> decision.

Go for security here. (Normally, consensus on the list is needed for
merging the patch).

> > Or just remove CAP_REG_XUID_EXEC when removing any other CAP_REG...?
> 
> Doable, but ugly (or so I think): there are many paths that set

No, I meant 'teach users to remove CAP_REG_XUID_EXEC when removing
others'.

> caps...  A simpler solution would be to remove the test on
> CAP_REG_SXID and instead test on all regular caps simultaneously.
> Still, I really don't like the idea.

Agreed, I'd call that slightly ugly.

> > It is not too bad; you'll usually not want restricted programs to exec
> > anything setuid... (Do you have example where
> > restricted-but-should-be-able-to-setuid-exec makes sense?)
> 
> Well, I could imagine that a paranoid sysadmin might want some users'
> processes to run without this or that capability (perhaps
> CAP_REG_PTRACE or some other yet-to-be-defined capability).  This
> doesn't mean that they shouldn't be able to run a game which runs sgid
> in order to write the score file.

...so you prefer enabling DoS attack on the core file. I bet some
combination of your new capabilities will allow game to lock the core
file, but make it crash without unlocking it.

Or do you volunteer to audit all the games in Debian each time new
capability is added? :-)
							Pavel
-- 
Thanks for all the (sleeping) penguins.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-09  0:11                       ` Casey Schaufler
@ 2006-09-09 11:59                         ` Pavel Machek
  0 siblings, 0 replies; 52+ messages in thread
From: Pavel Machek @ 2006-09-09 11:59 UTC (permalink / raw)
  To: Casey Schaufler; +Cc: David Madore, Linux Kernel mailing-list

Hi!

> > Well, I could imagine that a paranoid sysadmin might
> > want some users'
> > processes to run without this or that capability
> > (perhaps
> > CAP_REG_PTRACE or some other yet-to-be-defined
> > capability).  This
> > doesn't mean that they shouldn't be able to run a
> > game which runs sgid
> > in order to write the score file.
> 
> A likely scenario might be the 3rd party program
> that you really are sure about trusting. You
> give it a capability set that has nothing in it
> (hence runs without capability regardless of
> the capabilities of the parent). That's part
> of the rationale behind the POSIX scheme, that
> some programs you just don't want to ever run
> privileged, period. But POSIX only deals with
> going "above" base, which is why I like the
> notion of your "underprivileged" scheme as a
> seperate addition.

Well, in kernel above-priviledge and below-priviledge makes sense to
be handled by same code. You can always create interface you prefer in
glibc...
						Pavel

-- 
Thanks for all the (sleeping) penguins.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-09  0:59                   ` patch to make Linux capabilities into something useful (v 0.3.1) David Wagner
@ 2006-09-09 12:49                     ` David Madore
  0 siblings, 0 replies; 52+ messages in thread
From: David Madore @ 2006-09-09 12:49 UTC (permalink / raw)
  To: Linux Kernel mailing-list; +Cc: David Wagner

On Sat, Sep 09, 2006 at 01:01:02AM +0000, David Wagner wrote:
> David Madore  wrote:
> >On Fri, Sep 08, 2006 at 12:54:29AM +0200, Pavel Machek wrote:
> >>		      Alternatively disallow suid/sgid-anything exec
> >> when all "usual" capabilities are not present.
> >
> >This is probably too stringent: remove any trivial capability
> >whatsoever and you lose a rather important ability.
> 
> This might not be so terrible.  At least, I'm not sure I'd rule it
> out at this point -- it seems like it might be worth considering.

The following patch (follows version 0.4.3 of my main patch) should
make people happy in this respect: it adds a securebit (off by
default) to enable suid non-root execution by underprivileged
processes.

Signed-off-by: David A. Madore <david.madore@ens.fr>

---
 fs/exec.c                  |   16 ++++++++++++----
 include/linux/securebits.h |    9 +++++++++
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 1cb5e34..adf834b 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -44,6 +44,7 @@ #include <linux/proc_fs.h>
 #include <linux/ptrace.h>
 #include <linux/mount.h>
 #include <linux/security.h>
+#include <linux/securebits.h>
 #include <linux/syscalls.h>
 #include <linux/rmap.h>
 #include <linux/acct.h>
@@ -918,6 +919,7 @@ int prepare_binprm(struct linux_binprm *
 	int mode;
 	struct inode * inode = bprm->file->f_dentry->d_inode;
 	int retval;
+	char ok_to_sxid;
 
 	mode = inode->i_mode;
 	if (bprm->file->f_op == NULL)
@@ -928,9 +930,16 @@ int prepare_binprm(struct linux_binprm *
 	bprm->is_suid = 0;
 	bprm->is_sgid = 0;
 
-	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
+	ok_to_sxid = capable(CAP_REG_SXID)
+	  && !(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID);
+	if (!cap_issubset(CAP_REGULAR_SET, current->cap_permitted)
+	    && !issecure(SECURE_UNDERPRIVILEGED_MAY_SXID)
+	    && (issecure(SECURE_NOROOT) || inode->i_uid != 0))
+		ok_to_sxid = 0;
+
+	if (ok_to_sxid) {
 		/* Set-uid? */
-		if (mode & S_ISUID && capable(CAP_REG_SXID)) {
+		if (mode & S_ISUID) {
 			bprm->is_suid = 1;
 			current->personality &= ~PER_CLEAR_ON_SETID;
 			bprm->e_uid = inode->i_uid;
@@ -942,8 +951,7 @@ int prepare_binprm(struct linux_binprm *
 		 * is a candidate for mandatory locking, not a setgid
 		 * executable.
 		 */
-		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)
-		    && capable(CAP_REG_SXID)) {
+		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
 			bprm->is_sgid = 1;
 			current->personality &= ~PER_CLEAR_ON_SETID;
 			bprm->e_gid = inode->i_gid;
diff --git a/include/linux/securebits.h b/include/linux/securebits.h
index 5b06178..16e8f3e 100644
--- a/include/linux/securebits.h
+++ b/include/linux/securebits.h
@@ -18,6 +18,15 @@ #define SECURE_NOROOT            0
    privileges. When unset, setuid doesn't change privileges. */
 #define SECURE_NO_SETUID_FIXUP   2
 
+/* When set, allow underprivileged processes (= not possessing all
+   "regular" caps) to execute SUID/SGID executables (this is a
+   security issue as such executables might be surprised to run with
+   reduced privileges); if SECURE_NOROOT is _not_ set, this _does not_
+   apply to SUID root processes (they are already made secure by
+   raising all caps).  Removing the (regular) CAP_REG_SXID capability
+   also always inhibits any kind of SUID/SGID. */
+#define SECURE_UNDERPRIVILEGED_MAY_SXID 4
+
 /* Each securesetting is implemented using two bits. One bit specify
    whether the setting is on or off. The other bit specify whether the
    setting is fixed or not. A setting which is fixed cannot be changed

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-06 13:26     ` David Madore
  2006-09-07  0:11       ` Casey Schaufler
@ 2006-09-09 23:18       ` Theodore Tso
  2006-09-10 10:13         ` David Madore
  2006-09-10 12:36         ` Pavel Machek
  1 sibling, 2 replies; 52+ messages in thread
From: Theodore Tso @ 2006-09-09 23:18 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list

On Wed, Sep 06, 2006 at 03:26:23PM +0200, David Madore wrote:
> I emphasize that the filesystem support patch described above, alone,
> will *not* solve the inheritability problem (as my patch does), since
> unmarked executables continue to inherit no caps at all.  With my
> patch, they behave as though they had a full inheritable set,
> something which is required if we want to make something useful of
> capabilities on non-caps-aware programs.

This is what scares me about your proposal.  I consider it a *feature*
that unmarked executables inherit no capabilities, since many programs
were written without consideration about whether or not they might be
safe to run without privileges.  So the default of not allowing an
executable to inherit capabilities is in line of the the classic
security principle of "least privileges".   

I agree it may be less convenient for a system administrator who is
used root, cd'ing to a colleagues source tree, su'ing to root, and who
then types "make" to compile a program, expecting it to work since
root privileges imply the ability to override filesystem discretionary
access control --- and then to be rudely surprised when this doesn't
work in a capabilities-enabled system.  However, I would claim this is
the correct behaviour!  Would you really want some random operator
running random Makefiles for some random program downloaded from the
Internet?  As root?  So as far as I am concerned, forcing make, cc,
et. al. to not inherit capabilities is a Good Thing.

Now, perhaps some system owners have a different idea of how they want
to run, and believe want to trade off more convenience for less
security.  That's fine, but please don't disable the high security
mode for the rest of us.  What I would suggest is that perhaps the
filesystem capabilities patch can be extended to either to allow the
filesystem superblock define (a) what the default inheritance
capability mask should be when creating a new file, and (b) what the
default inheritance capability for that filesystem should be in the
absence of an explicit capability record.  Both of these should be
overrideable by a mount option, but for convenience's sake it would be
convenient to be able to set these values in the superblock.


As far as negative capabilities, I feel rather strongly these should
not be separated into separate capability masks.  They can use the
same framework, sure, but I think the system will be much safer if
they use a different set of masks.  Otherwise, there can be a whole
class of mistakes caused by people and applications getting confused
over which bit positions indicate privileges, and which indicate
negative privileges.  If you use a separate mask, this avoids this
problem.


The other reason why it may not be such a hot idea to mess with the
inheritance formulas is compatibility with other Unix systems that
have implemented capabilities following the last Posix draft.  In
particular, Sun has recently included the Trusted Solaris into the
base Solaris offering and into Open Solaris, and has been plugging
them pretty heavily.  It would be unfortunate if Solaris and Linux had
gratuitously different semantics for how the capabilities API's work.
It could easily cause security problems in both directions --- when
trying to port a program written for Linux to Solaris, and vice versa.

The solution is to _extend_ the capabilities system: for example, by
adding default inheritance masks to cater for system administrators
who value convenience more than security, and to add new bitmasks for
negative privileges/capabilities.

Regards,

					- Ted

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-09 23:18       ` Theodore Tso
@ 2006-09-10 10:13         ` David Madore
  2006-09-10 12:36         ` Pavel Machek
  1 sibling, 0 replies; 52+ messages in thread
From: David Madore @ 2006-09-10 10:13 UTC (permalink / raw)
  To: Theodore Tso, Linux Kernel mailing-list

On Sat, Sep 09, 2006 at 07:18:05PM -0400, Theodore Tso wrote:
> This is what scares me about your proposal.  I consider it a *feature*
> that unmarked executables inherit no capabilities, since many programs
> were written without consideration about whether or not they might be
> safe to run without privileges.  So the default of not allowing an
> executable to inherit capabilities is in line of the the classic
> security principle of "least privileges".   
> 
> I agree it may be less convenient for a system administrator who is
> used root, cd'ing to a colleagues source tree, su'ing to root, and who
> then types "make" to compile a program, expecting it to work since
> root privileges imply the ability to override filesystem discretionary
> access control --- and then to be rudely surprised when this doesn't
> work in a capabilities-enabled system.  However, I would claim this is
> the correct behaviour!  Would you really want some random operator
> running random Makefiles for some random program downloaded from the
> Internet?  As root?  So as far as I am concerned, forcing make, cc,
> et. al. to not inherit capabilities is a Good Thing.

But root privileges *are* inherited under Unix.  Always.  That's a
historic fact and you can't change it.  How would you explain that a
full set of capabilities gets inherited if a subset does not?  This
can only lead to crazy semantics (you need a special hack for root)
and it will mean that capabilities are almost entirely useless (as
they are now: they are almost unused because they are basically
useless) - if it is impossibly difficult to work with a subset of all
capabilities, people will use all of them, i.e., work as root as they
do now, and you have gained nothing.

Maybe I should have emphasized the following fact about my patch: when
you switch-user from root with something like setresuid(uid,uid,uid),
the permitted/effective sets can remain unaltered if the program has
requested it (using prctl(PR_SET_KEEPCAPS)), but the inheritable set
is always cleared.  Hence, if you want capabilities to be inheritable,
you have to request it explicitly.  Someone who asks that knows what
he is doing, and should be given it.

> Now, perhaps some system owners have a different idea of how they want
> to run, and believe want to trade off more convenience for less
> security.  That's fine, but please don't disable the high security
> mode for the rest of us.  What I would suggest is that perhaps the
> filesystem capabilities patch can be extended to either to allow the
> filesystem superblock define (a) what the default inheritance
> capability mask should be when creating a new file, and (b) what the
> default inheritance capability for that filesystem should be in the
> absence of an explicit capability record.  Both of these should be
> overrideable by a mount option, but for convenience's sake it would be
> convenient to be able to set these values in the superblock.

The superblock is not an option, because there are too many filesystem
types out there.  A sysctl or securebit (although changing the latter
is not implemented for now), on the other hand, would be feasible.
But very messy: first, it means putting back the root hack (need to
specially inherit the full set of permitted, resp. effective
capabilities when {r,s,e}uid==0, resp. euid==0), and second, it means
that nobody will understand the whole picture of when and how
capabilities are inherited.  (Having a rule that nobody understands is
a sure way of getting lousy security: one thing that people *do*
understand under Unix is how/that root privileges are inherited - let
capabilities follow that general rule.)

> As far as negative capabilities, I feel rather strongly these should
> not be separated into separate capability masks.  They can use the
> same framework, sure, but I think the system will be much safer if
> they use a different set of masks.  Otherwise, there can be a whole
> class of mistakes caused by people and applications getting confused
> over which bit positions indicate privileges, and which indicate
> negative privileges.  If you use a separate mask, this avoids this
> problem.

That would mean duplicating a lot of code.

> The other reason why it may not be such a hot idea to mess with the
> inheritance formulas is compatibility with other Unix systems that
> have implemented capabilities following the last Posix draft.  In
> particular, Sun has recently included the Trusted Solaris into the
> base Solaris offering and into Open Solaris, and has been plugging
> them pretty heavily.  It would be unfortunate if Solaris and Linux had
> gratuitously different semantics for how the capabilities API's work.
> It could easily cause security problems in both directions --- when
> trying to port a program written for Linux to Solaris, and vice versa.

It is a fact that POSIX got us into a deep mess by their lack of
foresight: because they couldn't agree on a standard, now everyone has
a different idea of how things should be done.  Linux and Solaris
already have different semantics in this respect, and both are
different from any POSIX draft or from Irix.  I don't think we can
"fix" the mess in this respect, now, no matter how.

The value of my proposal is that it makes root inheritance a normal
case of capability inheritance, so the normal rules of Unix apply.

> The solution is to _extend_ the capabilities system: for example, by
> adding default inheritance masks to cater for system administrators
> who value convenience more than security, and to add new bitmasks for
> negative privileges/capabilities.

Unfortunately, I believe this is impossible to do in a way that will
seem even remotely acceptable.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-09 11:40                       ` Pavel Machek
@ 2006-09-10 10:41                         ` David Madore
  2006-09-10 13:06                           ` Pavel Machek
  0 siblings, 1 reply; 52+ messages in thread
From: David Madore @ 2006-09-10 10:41 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux Kernel mailing-list

On Sat, Sep 09, 2006 at 11:40:38AM +0000, Pavel Machek wrote:
> > > If you can find another uid to hijack, that other uid has bad
> > > problems. And I do not think you'll commonly find another uid to
> > > hijack.
> > 
> > How about another gid, then?  Should we reset all caps on sgid exec?
> 
> Yes. Any setuid/setgid exec is a security barrier, and weird (or new)
> semantics may not cross that barrier.

Right, so what I was saying was: if you reset all regular caps on sgid
exec, anyone can trivially reset all regular caps by creating a sgid
program (users are always members of a great many groups so "finding
another gid to hijack" is trivial).  So CAP_REG_SXID needs to be off
all the time, so we lose again.

But I'll make this a securebit ("unsanitized sxid"), with the behavior
you advertise as default (0).

> > Ultimately a compromise is to be reached between security and
> > flexibility...  The problem is, I don't know who should make the
> > decision.
> 
> Go for security here. (Normally, consensus on the list is needed for
> merging the patch).

I am now completely convinced the patch will never be merged. :-(
Linux will have useless caps forever...

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-09 23:18       ` Theodore Tso
  2006-09-10 10:13         ` David Madore
@ 2006-09-10 12:36         ` Pavel Machek
  2006-09-10 23:24           ` Theodore Tso
  1 sibling, 1 reply; 52+ messages in thread
From: Pavel Machek @ 2006-09-10 12:36 UTC (permalink / raw)
  To: Theodore Tso, David Madore, Linux Kernel mailing-list

Hi!

> > I emphasize that the filesystem support patch described above, alone,
> > will *not* solve the inheritability problem (as my patch does), since
> > unmarked executables continue to inherit no caps at all.  With my
> > patch, they behave as though they had a full inheritable set,
> > something which is required if we want to make something useful of
> > capabilities on non-caps-aware programs.
> 
> This is what scares me about your proposal.  I consider it a *feature*
> that unmarked executables inherit no capabilities, since many programs
> were written without consideration about whether or not they might be
> safe to run without privileges.  So the default of not allowing an
> executable to inherit capabilities is in line of the the classic
> security principle of "least privileges".   
> 
> I agree it may be less convenient for a system administrator who is
> used root, cd'ing to a colleagues source tree, su'ing to root, and who
> then types "make" to compile a program, expecting it to work since
> root privileges imply the ability to override filesystem discretionary
> access control --- and then to be rudely surprised when this doesn't
> work in a capabilities-enabled system.  However, I would claim this is
> the correct behaviour!

But this is not how it behaves today, right? I do not think you could
push 'break-make-as-root' as a bugfix to -stable ;-).

> absence of an explicit capability record.  Both of these should be
> overrideable by a mount option, but for convenience's sake it would be
> convenient to be able to set these values in the superblock.

Would per-system default capability masks be enough? ... .... okay,
probably not, because nosuid is per-mount, and this is similar.

> As far as negative capabilities, I feel rather strongly these should
> not be separated into separate capability masks.  They can use the
> same framework, sure, but I think the system will be much safer if
> they use a different set of masks.  Otherwise, there can be a whole
> class of mistakes caused by people and applications getting confused

Can we simply split them at kernel interface layer? Libc could do it,
preventing confusion...

> The solution is to _extend_ the capabilities system: for example, by
> adding default inheritance masks to cater for system administrators
> who value convenience more than security, and to add new bitmasks for
> negative privileges/capabilities.

Agreed.
							Pavel
-- 
Thanks for all the (sleeping) penguins.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-10 10:41                         ` David Madore
@ 2006-09-10 13:06                           ` Pavel Machek
  2006-09-10 14:25                             ` capability inheritance (was: Re: patch to make Linux capabilities into something useful (v 0.3.1)) David Madore
  0 siblings, 1 reply; 52+ messages in thread
From: Pavel Machek @ 2006-09-10 13:06 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list

Hi!

> > > How about another gid, then?  Should we reset all caps on sgid exec?
> > 
> > Yes. Any setuid/setgid exec is a security barrier, and weird (or new)
> > semantics may not cross that barrier.
> 
> Right, so what I was saying was: if you reset all regular caps on sgid
> exec, anyone can trivially reset all regular caps by creating a sgid
> program (users are always members of a great many groups so "finding
> another gid to hijack" is trivial).  So CAP_REG_SXID needs to be off
> all the time, so we lose again.
> 
> But I'll make this a securebit ("unsanitized sxid"), with the behavior
> you advertise as default (0).

I'm not sure if fundamental security semantics should be optional, but
it is certainly better than before.

> > > Ultimately a compromise is to be reached between security and
> > > flexibility...  The problem is, I don't know who should make the
> > > decision.
> > 
> > Go for security here. (Normally, consensus on the list is needed for
> > merging the patch).
> 
> I am now completely convinced the patch will never be merged. :-(
> Linux will have useless caps forever...

Well, merging the patches is not that hard.

tytso actually shown a clever way: add per-filesystem 'default
capability masks'. That should be fairly easy to merge, and
automatically back-compatible.

(And it would get tou semantics you wanted in inheritance area,
right?)
							Pavel
-- 
Thanks for all the (sleeping) penguins.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* capability inheritance (was: Re: patch to make Linux capabilities into something useful (v 0.3.1))
  2006-09-10 13:06                           ` Pavel Machek
@ 2006-09-10 14:25                             ` David Madore
  2006-09-10 22:42                               ` Pavel Machek
  2006-09-11 16:00                               ` Casey Schaufler
  0 siblings, 2 replies; 52+ messages in thread
From: David Madore @ 2006-09-10 14:25 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux Kernel mailing-list

[I wrote:]
> > But I'll make this a securebit ("unsanitized sxid"), with the behavior
> > you advertise as default (0).

Done: I just posted version 0.4.4, where non-root suid/sgid exec() is
sanitized.

On Sun, Sep 10, 2006 at 01:06:24PM +0000, Pavel Machek wrote:
> I'm not sure if fundamental security semantics should be optional, but
> it is certainly better than before.

I think it's a gross hack to protect people who can't write correct
programs, so there *must* be a way to disable it.

> tytso actually shown a clever way: add per-filesystem 'default
> capability masks'. That should be fairly easy to merge, and
> automatically back-compatible.
> 
> (And it would get tou semantics you wanted in inheritance area,
> right?)

I'm not at all sure.  If it's just a matter of adding a mount option
("inhcaps", say) so that the default file inheritable set is full when
the option is turned on *or* when one of current->{r,e,s}uid==0 (else
defaults to regular caps), then that can easily be done: this will
give a reasonable inheritance with the mount option turned on and the
current behavior with the mount option turned off (but still the
possibility of activating inheritance for specific files).

*However*, Theodore Ts'o seemed to say that he wanted POSIX-draft-16
conformance, and I am unable to arrange this: I don't know how to come
up with a set of semantics that are compatible at once with the POSIX
rules and the traditional Unix semantics for root.  So if someone
knows how to do that, he should speak up at that point.

The POSIX-draft-16 rules, if I understand correctly, are something
like this:

  P'(per) <- (P(inh) & F(inh)) | (F(per) & bnd)
  P'(eff) <- (P(inh) & F(eff) & F(inh)) | (F(per) & F(eff) & bnd)
  P'(inh) <- P(inh)

The second rule means that a normal process with euid!=0 (so P(eff)
restricted to regular caps) but uid=0 (so P(per) full) executing a
non-suid/sgid executable will have all P'(eff)-caps raised after exec.
This is in contradiction with normal Unix behavior, with the principle
of least surprise and also with anything I consider as "sane".  As for
the third rule, it means that P(inh) is not raised by F(per), so if I
execute a suid-root program and *that* program executes a non-suid
program, the grandchild will not get full caps, again contradicting
the norma Unix behavior.  This is why I wrote my patch to change the
behavior to

  P'(per) <- (P(inh) & F(inh)) | (F(per) & bnd)
  P'(eff) <- (P(inh) & P(eff) & F(inh)) | (F(per) & F(eff) & bnd)
  P'(inh) <- P'(per)

(see <URL: http://www.madore.org/~david/linux/newcaps/#semantics >)
- which makes perfect sense but which contradicts the POSIX
(non-)spec, so Theodore Ts'o will be unhappy with my patch.

I can see no way of reconciling the POSIX rules with sane Unix
behavior.  Hence I can only give up if someone insists that the POSIX
draft should be adhered to.

(Just in case someone were tempted to get away with a handwaving such
as "just follow the POSIX rules except for suid root...", let that
someone please try to come up with a full description of the rules
which breaks nothing, and he will understand that it's not at all
easy.)

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: capability inheritance (was: Re: patch to make Linux capabilities into something useful (v 0.3.1))
  2006-09-10 14:25                             ` capability inheritance (was: Re: patch to make Linux capabilities into something useful (v 0.3.1)) David Madore
@ 2006-09-10 22:42                               ` Pavel Machek
  2006-09-11 16:00                               ` Casey Schaufler
  1 sibling, 0 replies; 52+ messages in thread
From: Pavel Machek @ 2006-09-10 22:42 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list

Hi!


> > tytso actually shown a clever way: add per-filesystem 'default
> > capability masks'. That should be fairly easy to merge, and
> > automatically back-compatible.
> > 
> > (And it would get tou semantics you wanted in inheritance area,
> > right?)
> 
> I'm not at all sure.  If it's just a matter of adding a mount option
> ("inhcaps", say) so that the default file inheritable set is full when
> the option is turned on *or* when one of current->{r,e,s}uid==0 (else
> defaults to regular caps), then that can easily be done: this will
> give a reasonable inheritance with the mount option turned on and the
> current behavior with the mount option turned off (but still the
> possibility of activating inheritance for specific files).

It needs to be inhcaps=[mask] to be consistent with the rest. (What
other bitmasks are there available in the xattrs?)

> *However*, Theodore Ts'o seemed to say that he wanted POSIX-draft-16
> conformance, and I am unable to arrange this: I don't know how to come
> up with a set of semantics that are compatible at once with the POSIX
> rules and the traditional Unix semantics for root.  So if someone
> knows how to do that, he should speak up at that point.

Well, current system is not posix-draft-16, and I guess someone
interested in posix-draft-16 needs to submit patch.

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-10 12:36         ` Pavel Machek
@ 2006-09-10 23:24           ` Theodore Tso
  2006-09-11  8:09             ` Pavel Machek
  0 siblings, 1 reply; 52+ messages in thread
From: Theodore Tso @ 2006-09-10 23:24 UTC (permalink / raw)
  To: Pavel Machek; +Cc: David Madore, Linux Kernel mailing-list

On Sun, Sep 10, 2006 at 12:36:31PM +0000, Pavel Machek wrote:
> > I agree it may be less convenient for a system administrator who is
> > used root, cd'ing to a colleagues source tree, su'ing to root, and who
> > then types "make" to compile a program, expecting it to work since
> > root privileges imply the ability to override filesystem discretionary
> > access control --- and then to be rudely surprised when this doesn't
> > work in a capabilities-enabled system.  However, I would claim this is
> > the correct behaviour!
> 
> But this is not how it behaves today, right? I do not think you could
> push 'break-make-as-root' as a bugfix to -stable ;-).

No, this would be an alternate model that if enabled would give us the
full Posix Capabilities draft compliance and which would hopefully
make us be mostly compatible with Trusted Solaris and Trusted Irix.
Like SELinux, it's something that would have to be explicitly enabled,
and yes, does break compatibility with standard root privileges ---
but then again, break apart root privs was the whole *point* of the
Posix capabilities design....

> > absence of an explicit capability record.  Both of these should be
> > overrideable by a mount option, but for convenience's sake it would be
> > convenient to be able to set these values in the superblock.
> 
> Would per-system default capability masks be enough? ... .... okay,
> probably not, because nosuid is per-mount, and this is similar.

Per system defaults would also be good, but I believe it would also be
a good idea for their to be per-filesystem defaults.  Yes, not all
filesystems would support per-filesystem defaults, since this requires
extensions in their superblock; for those, they would only have the
per-system defaults.

> > As far as negative capabilities, I feel rather strongly these should
> > not be separated into separate capability masks.  They can use the
> > same framework, sure, but I think the system will be much safer if
> > they use a different set of masks.  Otherwise, there can be a whole
> > class of mistakes caused by people and applications getting confused
> 
> Can we simply split them at kernel interface layer? Libc could do it,
> preventing confusion...

But that means libc would need to know which bit positions were
positive capabilities, and which were negative, and adding new
capabilities would require a matching change with libc.  Not a good
idea, I think....

					- Ted

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: patch to make Linux capabilities into something useful (v 0.3.1)
  2006-09-10 23:24           ` Theodore Tso
@ 2006-09-11  8:09             ` Pavel Machek
  0 siblings, 0 replies; 52+ messages in thread
From: Pavel Machek @ 2006-09-11  8:09 UTC (permalink / raw)
  To: Theodore Tso, David Madore, Linux Kernel mailing-list

Hi!

> > > absence of an explicit capability record.  Both of these should be
> > > overrideable by a mount option, but for convenience's sake it would be
> > > convenient to be able to set these values in the superblock.
> > 
> > Would per-system default capability masks be enough? ... .... okay,
> > probably not, because nosuid is per-mount, and this is similar.
> 
> Per system defaults would also be good, but I believe it would also be
> a good idea for their to be per-filesystem defaults.  Yes, not all
> filesystems would support per-filesystem defaults, since this requires
> extensions in their superblock; for those, they would only have the
> per-system defaults.

Aha, okay... David, doing global defaults should be very easy, as
should be doing ext3 example conversion...

> > > As far as negative capabilities, I feel rather strongly these should
> > > not be separated into separate capability masks.  They can use the
> > > same framework, sure, but I think the system will be much safer if
> > > they use a different set of masks.  Otherwise, there can be a whole
> > > class of mistakes caused by people and applications getting confused
> > 
> > Can we simply split them at kernel interface layer? Libc could do it,
> > preventing confusion...
> 
> But that means libc would need to know which bit positions were
> positive capabilities, and which were negative, and adding new
> capabilities would require a matching change with libc.  Not a good
> idea, I think....

Okay, I guess we could hardcode the mask (top 16 capabilities are
"normal") but yes, I see your point now. We probably want different
syscalls, even if underlying implementation shares the code...

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: capability inheritance (was: Re: patch to make Linux capabilities into something useful (v 0.3.1))
  2006-09-10 14:25                             ` capability inheritance (was: Re: patch to make Linux capabilities into something useful (v 0.3.1)) David Madore
  2006-09-10 22:42                               ` Pavel Machek
@ 2006-09-11 16:00                               ` Casey Schaufler
  2006-09-11 17:39                                 ` David Madore
  1 sibling, 1 reply; 52+ messages in thread
From: Casey Schaufler @ 2006-09-11 16:00 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list



--- David Madore <david.madore@ens.fr> wrote:

> I can see no way of reconciling the POSIX rules with
> sane Unix behavior.

While one strives to maintain the decorum of
friendly debate, "Them's fighting words"*.

Have you read the POSIX DRAFT rationale section?
Have you read any of the DRAFT, for that matter?

Breaking privilege apart from UID==0 and the
setuid mechanism while allowing a system that
could still work without requiring programs
to be rewritten took quite a while. The DRAFT
versions don't differ that greatly after about
DRAFT 12. The scheme has been implemented
several times.

> Hence I can only give up if someone
> insists that the POSIX
> draft should be adhered to.
> 
> (Just in case someone were tempted to get away with
> a handwaving such
> as "just follow the POSIX rules except for suid
> root...", let that
> someone please try to come up with a full
> description of the rules
> which breaks nothing, and he will understand that
> it's not at all easy.)

The relationship between setuid and file based
capabilitiy sets is straitforward. There is
none. If your system supports root or capability
(like Irix) or strictly capability (like Trix)
the calculation is identical. There is a full
descrition of the rules in the DRAFT. If you
have questions about it, I'd be happy to dust
off my copy to help you understand it.

----
* Yosemite Sam in "High Diving Hare", 1949

Casey Schaufler
casey@schaufler-ca.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: capability inheritance (was: Re: patch to make Linux capabilities into something useful (v 0.3.1))
  2006-09-11 16:00                               ` Casey Schaufler
@ 2006-09-11 17:39                                 ` David Madore
  0 siblings, 0 replies; 52+ messages in thread
From: David Madore @ 2006-09-11 17:39 UTC (permalink / raw)
  To: Casey Schaufler; +Cc: Linux Kernel mailing-list

On Mon, Sep 11, 2006 at 09:00:02AM -0700, Casey Schaufler wrote:
> --- David Madore <david.madore@ens.fr> wrote:
> > I can see no way of reconciling the POSIX rules with
> > sane Unix behavior.
> 
> While one strives to maintain the decorum of
> friendly debate, "Them's fighting words"*.

I am sorry if you find this offensive, but, with due respect to the
editor, I find the POSIX draft's description of the capability
inheritance mechanism in general and its rational or its interaction
with norma Unix mechanisms in particular (a) highly unclear and (b)
entirely unconvincing.

> Have you read the POSIX DRAFT rationale section?

Yes.  At least, I have read the version made available, through you,
at <URL: http://wt.tuxomania.net/publications/posix.1e/download.html
 >, which seems to be draft 17.  I don't have access to any earlier
draft (nor do I wish to discuss documents which aren't publicly
available for download).

I fail to see, however, where the rationale section gives a
justification for the exact inheritance rules which were chosen:
section B.25.2.6, for example, is extremely vague and only describes
the rules rather than justifying them.  Setion B.25.6 gives a few
examples of use, but no better.  I see no real explanation of the
rules themselves, let alone any explanation why they are better than
any other system.  Could you state, precisely, what you think is wrong
with the semantics I advocate?

(Not to mention the fact that I find the "principle of least
privilege", as stated in section B.25.1, of rather dubious value.
This sounds exactly like a sort of general principle which sounds nice
in general but which becomes disastrous when you actually try to apply
it systematically.)

> Have you read any of the DRAFT, for that matter?

I can't deny that it is hard to read, being essentially a diff to
another document which is, itself, hard to find and hard to read.  And
I certainly haven't read the sections on auditing, etc.

If there's a specific part you think I missed, please give its
reference rather than just waving a several-hundred-page document at
me.

> Breaking privilege apart from UID==0 and the
> setuid mechanism while allowing a system that
> could still work without requiring programs
> to be rewritten took quite a while. The DRAFT
> versions don't differ that greatly after about
> DRAFT 12. The scheme has been implemented
> several times.

If you have any pointers to documentations describing specific
implementations, I am certainly interested.

> The relationship between setuid and file based
> capabilitiy sets is straitforward. There is
> none.

That won't do: that could mean several things, including at least: (a)
processes are given the ability to do <whatever> when they have
*either* EUID==0 *or* the appropriate capability, (b) processes
having, at any time, EUID==0 are automatically given full capability
sets, (c) processes exec()uted with EUID==0 are automatically given
full capability sets,...  I can think of a million different ways of
having "no" relationship.  Which is it?  It would seem that (a) is the
most likely meaning, but this contradicts the way Linux has
implemented capabilities so far (it only checks the capability sets,
not EUID).

>	If your system supports root or capability
> (like Irix) or strictly capability (like Trix)
> the calculation is identical. There is a full
> descrition of the rules in the DRAFT. If you
> have questions about it, I'd be happy to dust
> off my copy to help you understand it.

I am interested in a description of how to reconcile Unix semantics
with POSIX capabilities, but I find it nowhere in the draft.  So I
would appreciate it if you could point me to the very precise place
where I am supposed to find it or, better, explain it with less
legalese than the draft uses (but just as precisely).  A pointer to a
description of how other implementations solve the problem would also
be welcome.

-- 
     David A. Madore
    (david.madore@ens.fr,
     http://www.madore.org/~david/ )

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2006-09-11 17:39 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-09-05 21:26 patch to make Linux capabilities into something useful (v 0.3.1) David Madore
2006-09-06  0:27 ` Casey Schaufler
2006-09-06 10:06   ` David Madore
2006-09-06 13:26     ` David Madore
2006-09-07  0:11       ` Casey Schaufler
2006-09-07  0:32         ` David Madore
2006-09-07  1:01           ` Casey Schaufler
2006-09-07  1:29             ` David Wagner
2006-09-07 16:00               ` Casey Schaufler
2006-09-07 18:33                 ` David Wagner
2006-09-07 17:34             ` David Madore
2006-09-07 19:38               ` Bernd Eckenfels
2006-09-07 23:00                 ` Pavel Machek
2006-09-08  1:22                   ` Bernd Eckenfels
2006-09-08 10:45                     ` Pavel Machek
2006-09-08 16:08                       ` Casey Schaufler
2006-09-08 14:39                     ` Pavel Machek
2006-09-08 19:10                       ` Bernd Eckenfels
2006-09-07 22:54               ` Pavel Machek
2006-09-08  4:10                 ` David Madore
2006-09-08 10:52                   ` Pavel Machek
2006-09-08 22:51                     ` David Madore
2006-09-09  0:11                       ` Casey Schaufler
2006-09-09 11:59                         ` Pavel Machek
2006-09-09 11:40                       ` Pavel Machek
2006-09-10 10:41                         ` David Madore
2006-09-10 13:06                           ` Pavel Machek
2006-09-10 14:25                             ` capability inheritance (was: Re: patch to make Linux capabilities into something useful (v 0.3.1)) David Madore
2006-09-10 22:42                               ` Pavel Machek
2006-09-11 16:00                               ` Casey Schaufler
2006-09-11 17:39                                 ` David Madore
2006-09-09  0:59                   ` patch to make Linux capabilities into something useful (v 0.3.1) David Wagner
2006-09-09 12:49                     ` David Madore
2006-09-09 23:18       ` Theodore Tso
2006-09-10 10:13         ` David Madore
2006-09-10 12:36         ` Pavel Machek
2006-09-10 23:24           ` Theodore Tso
2006-09-11  8:09             ` Pavel Machek
2006-09-06 18:25 ` Serge E. Hallyn
2006-09-06 22:27   ` David Madore
2006-09-07  0:04     ` David Madore
2006-09-07 23:06       ` Serge E. Hallyn
2006-09-08  4:16         ` David Madore
2006-09-07  6:43     ` Jan Engelhardt
2006-09-07 23:02     ` Serge E. Hallyn
2006-09-08  1:08       ` David Madore
2006-09-08  1:31         ` Serge E. Hallyn
2006-09-08 21:45           ` David Madore
2006-09-07 18:21 ` James Antill
2006-09-07 18:33   ` Kyle Moffett
2006-09-07 20:05     ` James Antill
2006-09-08  4:00   ` David Madore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).