LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Re: [PATCH] capabilites, take 2
       [not found] ` <fa.mu5rj3d.24gtbp@ifi.uio.no>
@ 2004-05-14 15:57   ` Andy Lutomirski
  2004-05-14 16:01     ` Stephen Smalley
  0 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-14 15:57 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Albert Cahalan, linux-kernel mailing list, luto, Chris Wright,
	olaf+list.linux-kernel, Valdis.Kletnieks



Stephen Smalley wrote:

> On Fri, 2004-05-14 at 08:03, Albert Cahalan wrote:
> 
>>This would be an excellent time to reconsider how capabilities
>>are assigned to bits. You're breaking things anyway; you might
>>as well do all the breaking at once. I want local-use bits so
>>that the print queue management access isn't by magic UID/GID.
>>We haven't escaped UID-as-priv if server apps and setuid apps
>>are still making UID-based access control decisions.
> 
> 
> Capabilities are a broken model for privilege management; try RBAC/TE
> instead.  http://www.securecomputing.com/pdf/secureos.pdf has notes
> about the history and comparison of capabilities vs. TE.
> 
> Instead of adding new capability bits, replace capable() calls with LSM
> hook calls that offer you finer granularity both in operation and in
> object-based decisions, and then use a security module like SELinux to
> map that to actual permission checks.  SELinux provides a framework for
> defining security classes and permissions, including both definitions
> used by the kernel and definitions used by userspace policy enforcers
> (ala security-enhanced X).
> 

Thanks -- turning brain back on, SELinux is obviously better than any
fine-grained capability scheme I can imagine.

So unless anyone convinces me you're wrong, I'll stick with just
fixing up capabilities to work without making them finer-grained.

--Andy

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 15:57   ` [PATCH] capabilites, take 2 Andy Lutomirski
@ 2004-05-14 16:01     ` Stephen Smalley
  2004-05-14 16:18       ` Andy Lutomirski
  0 siblings, 1 reply; 43+ messages in thread
From: Stephen Smalley @ 2004-05-14 16:01 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Albert Cahalan, linux-kernel mailing list, Chris Wright,
	olaf+list.linux-kernel, Valdis.Kletnieks

On Fri, 2004-05-14 at 11:57, Andy Lutomirski wrote:
> Thanks -- turning brain back on, SELinux is obviously better than any
> fine-grained capability scheme I can imagine.
> 
> So unless anyone convinces me you're wrong, I'll stick with just
> fixing up capabilities to work without making them finer-grained.

Great, thanks.  Fixing capabilities to work is definitely useful and
desirable.  Significantly expanding them in any manner is a poor use of
limited resources, IMHO; I'd much rather see people work on applying
SELinux to the problem and solving it more effectively for the future.

-- 
Stephen Smalley <sds@epoch.ncsc.mil>
National Security Agency


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 16:01     ` Stephen Smalley
@ 2004-05-14 16:18       ` Andy Lutomirski
  2004-05-14 16:37         ` Stephen Smalley
  2004-05-14 18:07         ` Chris Wright
  0 siblings, 2 replies; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-14 16:18 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Andy Lutomirski, Albert Cahalan, linux-kernel mailing list,
	Chris Wright, olaf+list.linux-kernel, Valdis.Kletnieks

Stephen Smalley wrote:

> On Fri, 2004-05-14 at 11:57, Andy Lutomirski wrote:
> 
>>Thanks -- turning brain back on, SELinux is obviously better than any
>>fine-grained capability scheme I can imagine.
>>
>>So unless anyone convinces me you're wrong, I'll stick with just
>>fixing up capabilities to work without making them finer-grained.
> 
> 
> Great, thanks.  Fixing capabilities to work is definitely useful and
> desirable.  Significantly expanding them in any manner is a poor use of
> limited resources, IMHO; I'd much rather see people work on applying
> SELinux to the problem and solving it more effectively for the future.
> 

Does this mean I should trash my 'maximum' mask?

(I like 'cap -c = sftp-server' so it can't try to run setuid/fP apps.)
OTOH, since SELinux accomplishes this better, it may not be worth the
effort.

--Andy


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 16:18       ` Andy Lutomirski
@ 2004-05-14 16:37         ` Stephen Smalley
  2004-05-14 18:07         ` Chris Wright
  1 sibling, 0 replies; 43+ messages in thread
From: Stephen Smalley @ 2004-05-14 16:37 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andy Lutomirski, Albert Cahalan, linux-kernel mailing list,
	Chris Wright, olaf+list.linux-kernel, Valdis.Kletnieks

On Fri, 2004-05-14 at 12:18, Andy Lutomirski wrote:
> Does this mean I should trash my 'maximum' mask?
> 
> (I like 'cap -c = sftp-server' so it can't try to run setuid/fP apps.)
> OTOH, since SELinux accomplishes this better, it may not be worth the
> effort.

Not my call to make, but I'm not sure it is worthwhile.  Even filesystem
capabilities are questionable IMHO, as you can already authorize
capabilities based on program and call chain (typically just the
immediate caller's domain, but potentially encoding the entire call
chain in the domain transition rules to track any untrusted components
in the call chain) today using SELinux TE.  Today, the program still has
to be setuid as well, as we currently check both ordinary Linux
capability and SELinux permission for each operation to be conservative,
but eventually we would hope to drop the use of the capability module as
a secondary module and let SELinux manage the privileges entirely (once
the TE policy configuration is fully locked down and userland is fully
aware of SELinux).

-- 
Stephen Smalley <sds@epoch.ncsc.mil>
National Security Agency


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 16:18       ` Andy Lutomirski
  2004-05-14 16:37         ` Stephen Smalley
@ 2004-05-14 18:07         ` Chris Wright
  2004-05-14 22:48           ` [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2) Andy Lutomirski
  1 sibling, 1 reply; 43+ messages in thread
From: Chris Wright @ 2004-05-14 18:07 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Stephen Smalley, Andy Lutomirski, Albert Cahalan,
	linux-kernel mailing list, Chris Wright, olaf+list.linux-kernel,
	Valdis.Kletnieks

* Andy Lutomirski (luto@stanford.edu) wrote:
> Stephen Smalley wrote:
> > On Fri, 2004-05-14 at 11:57, Andy Lutomirski wrote:
> > > Thanks -- turning brain back on, SELinux is obviously better than any
> > > fine-grained capability scheme I can imagine.
> > > 
> > > So unless anyone convinces me you're wrong, I'll stick with just
> > > fixing up capabilities to work without making them finer-grained.
> > 
> > Great, thanks.  Fixing capabilities to work is definitely useful and
> > desirable.  Significantly expanding them in any manner is a poor use of
> > limited resources, IMHO; I'd much rather see people work on applying
> > SELinux to the problem and solving it more effectively for the future.
> 
> Does this mean I should trash my 'maximum' mask?
> 
> (I like 'cap -c = sftp-server' so it can't try to run setuid/fP apps.)
> OTOH, since SELinux accomplishes this better, it may not be worth the
> effort.

Let's just get back to the simplest task.  Allow execve() to do smth.
reasonable with capabilities.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] scaled-back caps, take 4
  2004-05-15  0:06             ` [PATCH] scaled-back caps, take 4 Olaf Dietsche
@ 2004-05-14 22:09               ` Albert Cahalan
  2004-05-15  0:27               ` Chris Wright
  1 sibling, 0 replies; 43+ messages in thread
From: Albert Cahalan @ 2004-05-14 22:09 UTC (permalink / raw)
  To: Olaf Dietsche
  Cc: Andy Lutomirski, Chris Wright, Andy Lutomirski, Stephen Smalley,
	Albert Cahalan, linux-kernel mailing list, Valdis.Kletnieks

On Fri, 2004-05-14 at 20:06, Olaf Dietsche wrote:
> Andy Lutomirski <luto@myrealbox.com> writes:
> 
> > cap_2.6.6-mm2_4.patch: New stripped-back capabilities.
> >
> >  fs/exec.c               |   15 ++++-
> >  include/linux/binfmts.h |    9 ++-
> >  security/commoncap.c    |  130 ++++++++++++++++++++++++++++++++++++++++++------
> >  3 files changed, 136 insertions(+), 18 deletions(-)
> [patch]
> 
> Why don't you provide this as a configurable andycap.c module?

**GROAN**

For those that don't know, this is Andy Cap:
http://picosel.free.fr/andycap/andy_billard.jpg



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2)
  2004-05-14 18:07         ` Chris Wright
@ 2004-05-14 22:48           ` Andy Lutomirski
  2004-05-15  0:06             ` [PATCH] scaled-back caps, take 4 Olaf Dietsche
       [not found]             ` <20040517231912.H21045@build.pdx.osdl.net>
  0 siblings, 2 replies; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-14 22:48 UTC (permalink / raw)
  To: Chris Wright
  Cc: Andy Lutomirski, Stephen Smalley, Albert Cahalan,
	linux-kernel mailing list, olaf+list.linux-kernel,
	Valdis.Kletnieks

On Friday 14 May 2004 11:07, Chris Wright wrote:
> * Andy Lutomirski (luto@stanford.edu) wrote:
> > Stephen Smalley wrote:
> > > On Fri, 2004-05-14 at 11:57, Andy Lutomirski wrote:
> > > > Thanks -- turning brain back on, SELinux is obviously better than any
> > > > fine-grained capability scheme I can imagine.
> > > > 
> > > > So unless anyone convinces me you're wrong, I'll stick with just
> > > > fixing up capabilities to work without making them finer-grained.
> > > 
> > > Great, thanks.  Fixing capabilities to work is definitely useful and
> > > desirable.  Significantly expanding them in any manner is a poor use of
> > > limited resources, IMHO; I'd much rather see people work on applying
> > > SELinux to the problem and solving it more effectively for the future.
> > 
> > Does this mean I should trash my 'maximum' mask?
> > 
> > (I like 'cap -c = sftp-server' so it can't try to run setuid/fP apps.)
> > OTOH, since SELinux accomplishes this better, it may not be worth the
> > effort.
> 
> Let's just get back to the simplest task.  Allow execve() to do smth.
> reasonable with capabilities.

Sold.

This version throws out the inheritable mask.  There is no change in
behavior with newcaps=0.

All that's left in newcaps=1 mode is:

fP := permitted (app gains these)
pP := permitted (app can make them effective)
pE := effective

There is no inheritable mask :)  So we can't argue about what it's
supposed to / not supposed to.

It's not too well tested yet.  Enjoy.

--Andy

cap_2.6.6-mm2_4.patch: New stripped-back capabilities.

 fs/exec.c               |   15 ++++-
 include/linux/binfmts.h |    9 ++-
 security/commoncap.c    |  130 ++++++++++++++++++++++++++++++++++++++++++------
 3 files changed, 136 insertions(+), 18 deletions(-)

--- linux-2.6.6-mm2/fs/exec.c~caps	2004-05-13 11:42:26.000000000 -0700
+++ linux-2.6.6-mm2/fs/exec.c	2004-05-14 12:36:17.000000000 -0700
@@ -882,8 +882,10 @@
 
 	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
 		/* Set-uid? */
-		if (mode & S_ISUID)
+		if (mode & S_ISUID) {
 			bprm->e_uid = inode->i_uid;
+			bprm->secflags |= BINPRM_SEC_SETUID;
+		}
 
 		/* Set-gid? */
 		/*
@@ -891,10 +893,18 @@
 		 * is a candidate for mandatory locking, not a setgid
 		 * executable.
 		 */
-		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
+		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
 			bprm->e_gid = inode->i_gid;
+			bprm->secflags |= BINPRM_SEC_SETGID;
+		}
 	}
 
+	/* Pretend we have VFS capabilities */
+	if ((bprm->secflags & BINPRM_SEC_SETUID) && bprm->e_uid == 0)
+		cap_set_full(bprm->cap_permitted);
+	else
+		cap_clear(bprm->cap_permitted);
+
 	/* fill in binprm security blob */
 	retval = security_bprm_set(bprm);
 	if (retval)
@@ -1089,6 +1099,7 @@
 	bprm.loader = 0;
 	bprm.exec = 0;
 	bprm.security = NULL;
+	bprm.secflags = 0;
 	bprm.mm = mm_alloc();
 	retval = -ENOMEM;
 	if (!bprm.mm)
--- linux-2.6.6-mm2/security/commoncap.c~caps	2004-05-13 11:42:26.000000000 -0700
+++ linux-2.6.6-mm2/security/commoncap.c	2004-05-14 13:24:45.000000000 -0700
@@ -24,6 +24,11 @@
 #include <linux/xattr.h>
 #include <linux/hugetlb.h>
 
+static int newcaps = 0;
+
+module_param(newcaps, int, 444);
+MODULE_PARM_DESC(newcaps, "Set newcaps=1 to enable experimental capabilities");
+
 int cap_capable (struct task_struct *tsk, int cap)
 {
 	/* Derived from include/linux/sched.h:capable. */
@@ -48,17 +53,19 @@
 {
 	/* Derived from kernel/capability.c:sys_capget. */
 	*effective = cap_t (target->cap_effective);
-	*inheritable = cap_t (target->cap_inheritable);
 	*permitted = cap_t (target->cap_permitted);
+	if (newcaps)
+		*inheritable = CAP_EMPTY_SET;
+	else
+		*inheritable = cap_t (target->cap_inheritable);
 	return 0;
 }
 
 int cap_capset_check (struct task_struct *target, kernel_cap_t *effective,
 		      kernel_cap_t *inheritable, kernel_cap_t *permitted)
 {
-	/* Derived from kernel/capability.c:sys_capset. */
 	/* verify restrictions on target's new Inheritable set */
-	if (!cap_issubset (*inheritable,
+	if (!newcaps && !cap_issubset (*inheritable,
 			   cap_combine (target->cap_inheritable,
 					current->cap_permitted))) {
 		return -EPERM;
@@ -83,12 +90,16 @@
 		     kernel_cap_t *inheritable, kernel_cap_t *permitted)
 {
 	target->cap_effective = *effective;
-	target->cap_inheritable = *inheritable;
 	target->cap_permitted = *permitted;
+	if (!newcaps)
+		target->cap_inheritable = *inheritable;
 }
 
 int cap_bprm_set_security (struct linux_binprm *bprm)
 {
+	if (newcaps)
+		return 0;
+
 	/* Copied from fs/exec.c:prepare_binprm. */
 
 	/* We don't have VFS support for capabilities yet */
@@ -115,9 +126,9 @@
 	return 0;
 }
 
-void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
+static void cap_bprm_apply_creds_compat (struct linux_binprm *bprm, int unsafe)
 {
-	/* Derived from fs/exec.c:compute_creds. */
+	/* This function will hopefully die in 2.7. */
 	kernel_cap_t new_permitted, working;
 
 	new_permitted = cap_intersect (bprm->cap_permitted, cap_bset);
@@ -158,15 +169,88 @@
 	current->keep_capabilities = 0;
 }
 
+/*
+ * The rules of Linux capabilities (not POSIX!)
+ *
+ * What the masks mean:
+ *  pP = capabilities that this process has
+ *  pE = capabilities that this process has and are enabled
+ * (so pE <= pP)
+ *
+ * The capability evolution rules are:
+ *
+ *  pP' = ((fP & cap_bset) | pP) & Y
+ *  pE' = (pE | fP) & pP'
+ *
+ *  X = cap_bset
+ *  Y is zero if uid!=0, euid==0, and setuid non-root
+ *
+ *  Exception: if setuid-nonroot, then pE' is reset to 0.
+ *
+ * If file capabilities are introduced, programs that are granted
+ * fP > 0 need to be aware of how to deal with it.
+ * Because the effective set is left alone on non-setuid fP>0,
+ * such a program should drop capabilities that were not in its initial
+ * effective set before running untrusted code.
+ *
+ */
+void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
+{
+	kernel_cap_t new_pP, new_pE, fP;
+	int is_setuid, is_setgid;
+
+	if(!newcaps) {
+		cap_bprm_apply_creds_compat(bprm, unsafe);
+		return;
+	}
+
+	fP = bprm->cap_permitted;
+	is_setuid = (bprm->secflags & BINPRM_SEC_SETUID);
+	is_setgid = (bprm->secflags & BINPRM_SEC_SETGID);
+
+	/* Check that it's safe to elevate privileges */
+	if (unsafe & ~LSM_UNSAFE_PTRACE_CAP)
+		bprm->secflags |= BINPRM_SEC_NOELEVATE;
+
+	if (bprm->secflags & BINPRM_SEC_NOELEVATE) {
+		is_setuid = is_setgid = 0;
+		cap_clear(fP);
+	}
+
+	new_pP = cap_intersect(fP, cap_bset);
+	new_pP = cap_combine(new_pP, current->cap_permitted);
+
+	/* setuid-nonroot is special. */
+	if (is_setuid && bprm->e_uid != 0 && current->uid != 0 &&
+	    current->euid == 0)
+		cap_clear(new_pP);
+
+	if (!cap_issubset(new_pP, current->cap_permitted))
+		bprm->secflags |= BINPRM_SEC_SECUREEXEC;
+
+	new_pE = cap_combine(current->cap_effective, fP);
+	cap_mask(new_pE, new_pP);
+	if (is_setuid && bprm->e_uid != 0)
+		cap_clear(new_pE);
+
+	/* Apply new security state */
+	if (is_setuid) {
+	        current->suid = current->euid = current->fsuid = bprm->e_uid;
+	}
+	if (is_setgid)
+	        current->sgid = current->egid = current->fsgid = bprm->e_gid;
+
+	current->cap_permitted = new_pP;
+	current->cap_effective = new_pE;
+
+	current->keep_capabilities = 0;
+}
+
 int cap_bprm_secureexec (struct linux_binprm *bprm)
 {
-	/* If/when this module is enhanced to incorporate capability
-	   bits on files, the test below should be extended to also perform a 
-	   test between the old and new capability sets.  For now,
-	   it simply preserves the legacy decision algorithm used by
-	   the old userland. */
 	return (current->euid != current->uid ||
-		current->egid != current->gid);
+		current->egid != current->gid ||
+		(bprm->secflags & BINPRM_SEC_SECUREEXEC));
 }
 
 int cap_inode_setxattr(struct dentry *dentry, char *name, void *value,
@@ -280,9 +364,14 @@
 
 void cap_task_reparent_to_init (struct task_struct *p)
 {
-	p->cap_effective = CAP_INIT_EFF_SET;
-	p->cap_inheritable = CAP_INIT_INH_SET;
-	p->cap_permitted = CAP_FULL_SET;
+	if (newcaps) {
+		cap_set_full(p->cap_permitted);
+		cap_set_full(p->cap_effective);
+	} else {
+		p->cap_effective = CAP_INIT_EFF_SET;
+		p->cap_inheritable = CAP_INIT_INH_SET;
+		p->cap_permitted = CAP_FULL_SET;
+	}
 	p->keep_capabilities = 0;
 	return;
 }
@@ -367,6 +456,15 @@
 	return -ENOMEM;
 }
 
+static int __init commoncap_init (void)
+{
+	if (newcaps) {
+		printk(KERN_NOTICE "Experimental capability support is on\n");
+	}
+
+	return 0;
+}
+
 EXPORT_SYMBOL(cap_capable);
 EXPORT_SYMBOL(cap_ptrace);
 EXPORT_SYMBOL(cap_capget);
@@ -382,5 +480,7 @@
 EXPORT_SYMBOL(cap_syslog);
 EXPORT_SYMBOL(cap_vm_enough_memory);
 
+module_init(commoncap_init);
+
 MODULE_DESCRIPTION("Standard Linux Common Capabilities Security Module");
 MODULE_LICENSE("GPL");
--- linux-2.6.6-mm2/include/linux/binfmts.h~caps	2004-05-13 11:42:26.000000000 -0700
+++ linux-2.6.6-mm2/include/linux/binfmts.h	2004-05-14 10:28:05.000000000 -0700
@@ -20,6 +20,10 @@
 /*
  * This structure is used to hold the arguments that are used when loading binaries.
  */
+#define BINPRM_SEC_SETUID	1
+#define BINPRM_SEC_SETGID	2
+#define BINPRM_SEC_SECUREEXEC	4
+#define BINPRM_SEC_NOELEVATE	8
 struct linux_binprm{
 	char buf[BINPRM_BUF_SIZE];
 	struct page *page[MAX_ARG_PAGES];
@@ -28,7 +32,10 @@
 	int sh_bang;
 	struct file * file;
 	int e_uid, e_gid;
-	kernel_cap_t cap_inheritable, cap_permitted, cap_effective;
+	int secflags;
+	kernel_cap_t cap_permitted;
+	/* old caps -- do NOT use in new code */
+	kernel_cap_t cap_inheritable, cap_effective;
 	void *security;
 	int argc, envc;
 	char * filename;	/* Name of binary as seen by procps */


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] scaled-back caps, take 4
  2004-05-14 22:48           ` [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2) Andy Lutomirski
@ 2004-05-15  0:06             ` Olaf Dietsche
  2004-05-14 22:09               ` Albert Cahalan
  2004-05-15  0:27               ` Chris Wright
       [not found]             ` <20040517231912.H21045@build.pdx.osdl.net>
  1 sibling, 2 replies; 43+ messages in thread
From: Olaf Dietsche @ 2004-05-15  0:06 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Chris Wright, Andy Lutomirski, Stephen Smalley, Albert Cahalan,
	linux-kernel mailing list, Valdis.Kletnieks

Andy Lutomirski <luto@myrealbox.com> writes:

> cap_2.6.6-mm2_4.patch: New stripped-back capabilities.
>
>  fs/exec.c               |   15 ++++-
>  include/linux/binfmts.h |    9 ++-
>  security/commoncap.c    |  130 ++++++++++++++++++++++++++++++++++++++++++------
>  3 files changed, 136 insertions(+), 18 deletions(-)
[patch]

Why don't you provide this as a configurable andycap.c module?
I think, this is the whole point of LSM.

Regards, Olaf.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] scaled-back caps, take 4
  2004-05-15  0:06             ` [PATCH] scaled-back caps, take 4 Olaf Dietsche
  2004-05-14 22:09               ` Albert Cahalan
@ 2004-05-15  0:27               ` Chris Wright
  1 sibling, 0 replies; 43+ messages in thread
From: Chris Wright @ 2004-05-15  0:27 UTC (permalink / raw)
  To: Olaf Dietsche
  Cc: Andy Lutomirski, Chris Wright, Andy Lutomirski, Stephen Smalley,
	Albert Cahalan, linux-kernel mailing list, Valdis.Kletnieks

* Olaf Dietsche (olaf+list.linux-kernel@olafdietsche.de) wrote:
> Andy Lutomirski <luto@myrealbox.com> writes:
> 
> > cap_2.6.6-mm2_4.patch: New stripped-back capabilities.
> >
> >  fs/exec.c               |   15 ++++-
> >  include/linux/binfmts.h |    9 ++-
> >  security/commoncap.c    |  130 ++++++++++++++++++++++++++++++++++++++++++------
> >  3 files changed, 136 insertions(+), 18 deletions(-)
> [patch]
> 
> Why don't you provide this as a configurable andycap.c module?
> I think, this is the whole point of LSM.

I agree, if we can't find a clean way to do it.  However, note this
includes changes to core.  And it's nice to fix this for the base case.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2)
       [not found]             ` <20040517231912.H21045@build.pdx.osdl.net>
@ 2004-05-18  9:11               ` Andy Lutomirski
  2004-05-19  1:27                 ` Chris Wright
       [not found]               ` <20040517235844.I21045@build.pdx.osdl.net>
  1 sibling, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-18  9:11 UTC (permalink / raw)
  To: Chris Wright
  Cc: Andy Lutomirski, Stephen Smalley, Albert Cahalan,
	linux-kernel mailing list, olaf+list.linux-kernel,
	Valdis.Kletnieks



Chris Wright wrote:
> * Andy Lutomirski (luto@myrealbox.com) wrote:
> 
>>This version throws out the inheritable mask.  There is no change in
>>behavior with newcaps=0.
> 
> 
> Alright, I tried to write up my expectations for all the various modes
> w.r.t dropping privs, keeping them, setuid apps, etc.  I then wrote some
> tests to get a baseline, and well as a way to compare results.  Finally
> I wrote a patch that meets the requirements I laid out, and compared it
> against yours.  With one minor exception, the capabilities bits match
> up.  You drop effective caps when a non-root users execs a non-root
> setuid app, and I the caps alone.  ...

Paranoia.  There are legacy setuid programs out there (presumably even 
setuid-nonroot).  Let's make them behave as closely to the way they 
currently do as possible.


 > ... One side note, you're changes effect
> the user/group saved ids unexpectedly.

Oops.  That's a trivial fix.

> 
> Here's a bunch of test cases:

> # Still w/out changing inheritable and with KEEPCAPS set
> # 10 Root process does setuid(!0), effective caps are dropped
> # 11 Root process does seteuid(!0), effective caps are dropped
> # 12 Nonroot process restores uid 0, effective restored to permitted

I still want some variant on KEEPCAPS that causes setxuid to leave caps 
completely alone.  Oh, well.

> # 18 Non-root execs setuid-nonroot process, new caps bound by inheritable

What new caps?

>>cap_2.6.6-mm2_4.patch: New stripped-back capabilities.
>>
>> fs/exec.c               |   15 ++++-
>> include/linux/binfmts.h |    9 ++-
>> security/commoncap.c    |  130 ++++++++++++++++++++++++++++++++++++++++++------
>> 3 files changed, 136 insertions(+), 18 deletions(-)
>>
>>--- linux-2.6.6-mm2/fs/exec.c~caps	2004-05-13 11:42:26.000000000 -0700
>>+++ linux-2.6.6-mm2/fs/exec.c	2004-05-14 12:36:17.000000000 -0700
>>@@ -882,8 +882,10 @@
>> 
>> 	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
>> 		/* Set-uid? */
>>-		if (mode & S_ISUID)
>>+		if (mode & S_ISUID) {
>> 			bprm->e_uid = inode->i_uid;
>>+			bprm->secflags |= BINPRM_SEC_SETUID;
>>+		}
> 
> 
> No strong objection, but I don't think it's necessary.

I wanted to distinguish between setuid-me and non-setuid in apply_creds. 
That one doesn't matter much, though.

> 
> 
>> 
>> 		/* Set-gid? */
>> 		/*
>>@@ -891,10 +893,18 @@
>> 		 * is a candidate for mandatory locking, not a setgid
>> 		 * executable.
>> 		 */
>>-		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
>>+		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
>> 			bprm->e_gid = inode->i_gid;
>>+			bprm->secflags |= BINPRM_SEC_SETGID;
>>+		}
>> 	}
>> 
>>+	/* Pretend we have VFS capabilities */
>>+	if ((bprm->secflags & BINPRM_SEC_SETUID) && bprm->e_uid == 0)
>>+		cap_set_full(bprm->cap_permitted);
>>+	else
>>+		cap_clear(bprm->cap_permitted);
>>+
> 
> 
> Thus far we've kept all this stuff out of core.  I believe we should
> keep it that way.  This could easily be left in bprm_set().

True.  But as long as linux_binprm has fields for this stuff, intuitively 
it should always be filled in (i.e. not just in commoncap).  That way we 
can say that commoncap doesn't get special treatment :)

Also, this seems like the right place to check for VFS caps.  This way we can.

> 
> 
>>--- linux-2.6.6-mm2/security/commoncap.c~caps	2004-05-13 11:42:26.000000000 -0700
>>+++ linux-2.6.6-mm2/security/commoncap.c	2004-05-14 13:24:45.000000000 -0700
>>@@ -24,6 +24,11 @@
>> #include <linux/xattr.h>
>> #include <linux/hugetlb.h>
>> 
>>+static int newcaps = 0;
>>+
>>+module_param(newcaps, int, 444);
>>+MODULE_PARM_DESC(newcaps, "Set newcaps=1 to enable experimental capabilities");
> 
> 
> It would be really nice to have a one size fits all solution.  Esp.
> since the legacy mode is what one gets when leaving inheritable mask
> untouched.

I agree.  Andrew specifically asked for this, though, at least until this 
stuff is well-tested and everyone likes it.

> 
> 
>> int cap_bprm_set_security (struct linux_binprm *bprm)
>> {
>>+	if (newcaps)
>>+		return 0;
>>+
> 
> 
> That setup could go here instead of in core.
> 
> 

[snip]
>> 
>>+/*
>>+ * The rules of Linux capabilities (not POSIX!)
>>+ *
>>+ * What the masks mean:
>>+ *  pP = capabilities that this process has
>>+ *  pE = capabilities that this process has and are enabled
>>+ * (so pE <= pP)
>>+ *
>>+ * The capability evolution rules are:
>>+ *
>>+ *  pP' = ((fP & cap_bset) | pP) & Y
>>+ *  pE' = (pE | fP) & pP'
>>+ *
>>+ *  X = cap_bset
>>+ *  Y is zero if uid!=0, euid==0, and setuid non-root
>>+ *
>>+ *  Exception: if setuid-nonroot, then pE' is reset to 0.
> 
> 
> While this works fine, I was interested to see what we could do to
> leave the old algorithm.  Seems both work out fine.
> 
> 
>>+	/* setuid-nonroot is special. */
>>+	if (is_setuid && bprm->e_uid != 0 && current->uid != 0 &&
>>+	    current->euid == 0)
>>+		cap_clear(new_pP);
> 
> 
> setuid-nonroot from a non-root user needs to clear effective?

Yes.  Say user 500 runs a setuid-root program, which goes back and runs a 
setuid-500 program.  uid==euid==500 now, so the program expects to be 
unprivileged.  This makes that work.  (I'm assuming you meant permitted. 
Effective gets cleared in cap_mask(new_pE, new_pP)).


The reason I killed the old algorithm is because it's scary (in the sense 
of being complicated and subtle for no good reason) and because I don't see 
the point of inheritable caps.  I doubt anything uses them currently on a 
vanilla kernel because they don't _do_ anything.  So I killed them.


--Andy

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2)
  2004-05-18  9:11               ` [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2) Andy Lutomirski
@ 2004-05-19  1:27                 ` Chris Wright
  2004-05-19  1:54                   ` [PATCH] scaled-back caps, take 4 Andy Lutomirski
  0 siblings, 1 reply; 43+ messages in thread
From: Chris Wright @ 2004-05-19  1:27 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Chris Wright, Andy Lutomirski, Stephen Smalley, Albert Cahalan,
	linux-kernel mailing list, olaf+list.linux-kernel,
	Valdis.Kletnieks

* Andy Lutomirski (luto@stanford.edu) wrote:
> Chris Wright wrote:
> > Alright, I tried to write up my expectations for all the various modes
> > w.r.t dropping privs, keeping them, setuid apps, etc.  I then wrote some
> > tests to get a baseline, and well as a way to compare results.  Finally
> > I wrote a patch that meets the requirements I laid out, and compared it
> > against yours.  With one minor exception, the capabilities bits match
> > up.  You drop effective caps when a non-root users execs a non-root
> > setuid app, and I the caps alone.  ...
> 
> Paranoia.  There are legacy setuid programs out there (presumably even 
> setuid-nonroot).  Let's make them behave as closely to the way they 
> currently do as possible.

Yes, that's the goal I have.  Although, the above scenario, they've
already been limited (IOW, if nothing's been touched, all behaves the
same).  Starting with limited inheritable (as say uid 500), execute
a non-root, setuid (say uid 99) program, is this expected to change
effective set?  Currently it's a transition for 0 caps to 0 caps, not
very interesting.  Given they are both unprivelged uids, I kept
the (limited) effective.

> > # Still w/out changing inheritable and with KEEPCAPS set
> > # 10 Root process does setuid(!0), effective caps are dropped
> > # 11 Root process does seteuid(!0), effective caps are dropped
> > # 12 Nonroot process restores uid 0, effective restored to permitted
> 
> I still want some variant on KEEPCAPS that causes setxuid to leave caps 
> completely alone.  Oh, well.

Yeah, digs hole deeper though.

> > # 18 Non-root execs setuid-nonroot process, new caps bound by inheritable
> 
> What new caps?

The caps in the newly exec'd process.  IOW, effective aren't dropped,
but as you'd expect inheritable provides limit.

> >>+	/* Pretend we have VFS capabilities */
> >>+	if ((bprm->secflags & BINPRM_SEC_SETUID) && bprm->e_uid == 0)
> >>+		cap_set_full(bprm->cap_permitted);
> >>+	else
> >>+		cap_clear(bprm->cap_permitted);
> >>+
> > 
> > 
> > Thus far we've kept all this stuff out of core.  I believe we should
> > keep it that way.  This could easily be left in bprm_set().
> 
> True.  But as long as linux_binprm has fields for this stuff, intuitively 
> it should always be filled in (i.e. not just in commoncap).  That way we 
> can say that commoncap doesn't get special treatment :)
> 
> Also, this seems like the right place to check for VFS caps.  This way we can.

This does change the current notion of layering.  I see your point though, 
likening it to say reading inode and finding S_ISUID bit.

> >>+ * The rules of Linux capabilities (not POSIX!)
> >>+ *
> >>+ * What the masks mean:
> >>+ *  pP = capabilities that this process has
> >>+ *  pE = capabilities that this process has and are enabled
> >>+ * (so pE <= pP)
> >>+ *
> >>+ * The capability evolution rules are:
> >>+ *
> >>+ *  pP' = ((fP & cap_bset) | pP) & Y
> >>+ *  pE' = (pE | fP) & pP'
> >>+ *
> >>+ *  X = cap_bset
> >>+ *  Y is zero if uid!=0, euid==0, and setuid non-root
> >>+ *
> >>+ *  Exception: if setuid-nonroot, then pE' is reset to 0.
> > 
> > While this works fine, I was interested to see what we could do to
> > leave the old algorithm.  Seems both work out fine.
> > 
> >>+	/* setuid-nonroot is special. */
> >>+	if (is_setuid && bprm->e_uid != 0 && current->uid != 0 &&
> >>+	    current->euid == 0)
> >>+		cap_clear(new_pP);
> > 
> > 
> > setuid-nonroot from a non-root user needs to clear effective?
> 
> Yes.  Say user 500 runs a setuid-root program, which goes back and runs a 
> setuid-500 program.  uid==euid==500 now, so the program expects to be 
> unprivileged.  This makes that work.  (I'm assuming you meant permitted. 
> Effective gets cleared in cap_mask(new_pE, new_pP)).

Yup, I see.  This works in my patch as well.  I'll add this test.  Also
added test to check disabling priv escalation during ptrace of setuid
app.

> The reason I killed the old algorithm is because it's scary (in the sense 
> of being complicated and subtle for no good reason) and because I don't see 
> the point of inheritable caps.  I doubt anything uses them currently on a 
> vanilla kernel because they don't _do_ anything.  So I killed them.

This does break all those caps aware apps (yeah, tongue-in-cheek ;-)
that actually have the idea to widen the effective set, yet limit the
inheriable set.  Seriously, I don't know how much this matters.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] support cap inheritable (Re: [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2)
       [not found]               ` <20040517235844.I21045@build.pdx.osdl.net>
@ 2004-05-19  1:34                 ` Andy Lutomirski
  2004-05-19  7:27                   ` Chris Wright
  0 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-19  1:34 UTC (permalink / raw)
  To: Chris Wright
  Cc: Andy Lutomirski, Stephen Smalley, Albert Cahalan,
	linux-kernel mailing list, olaf+list.linux-kernel,
	Valdis.Kletnieks

Chris Wright wrote:

> * Chris Wright (chrisw@osdl.org) wrote:
> 
>>* Andy Lutomirski (luto@myrealbox.com) wrote:
>>
>>>This version throws out the inheritable mask.  There is no change in
>>>behavior with newcaps=0.
>>
>>Alright, I tried to write up my expectations for all the various modes
>>w.r.t dropping privs, keeping them, setuid apps, etc.  I then wrote some
>>tests to get a baseline, and well as a way to compare results.  Finally
>>I wrote a patch that meets the requirements I laid out, and compared it
>>against yours.  With one minor exception, the capabilities bits match
>>up.  You drop effective caps when a non-root users execs a non-root
>>setuid app, and I the caps alone.  One side note, you're changes effect
>>the user/group saved ids unexpectedly.

Disclaimer: I haven't had a chance to boot this version.

> 
> 
> The tests are available at:
>   http://developer.osdl.org/~chrisw/capabilities/testcap.tar.bz2
> 
> patch is there too and is also inline below:
>   http://developer.osdl.org/~chrisw/capabilities/2.6.6-mm2/chris_cap.patch
> 
> --- linux-2.6.6-mm2-cap/include/linux/capability.h.orig	2004-05-09 19:32:26.000000000 -0700
> +++ linux-2.6.6-mm2-cap/include/linux/capability.h	2004-05-17 23:24:08.143860096 -0700
> @@ -309,7 +309,9 @@
>  #define CAP_EMPTY_SET       to_cap_t(0)
>  #define CAP_FULL_SET        to_cap_t(~0)
>  #define CAP_INIT_EFF_SET    to_cap_t(~0 & ~CAP_TO_MASK(CAP_SETPCAP))
> -#define CAP_INIT_INH_SET    to_cap_t(0)
> +#define CAP_INIT_INH_SET    to_cap_t(~0)
> +/* ~0 is legacy inheritable mode and can never be capset by user */
> +#define cap_orig_inh(_cap) (cap_t((_cap)) == ~0)

So how do you say "inherit all caps"?

[...]

> @@ -56,10 +59,13 @@
>  int cap_capset_check (struct task_struct *target, kernel_cap_t *effective,
>  		      kernel_cap_t *inheritable, kernel_cap_t *permitted)
>  {
> +	kernel_cap_t target_inheritable = target->cap_inheritable;
> +	if (cap_orig_inh(target_inheritable))
> +		target_inheritable = 0;
>  	/* Derived from kernel/capability.c:sys_capset. */
>  	/* verify restrictions on target's new Inheritable set */
>  	if (!cap_issubset (*inheritable,
> -			   cap_combine (target->cap_inheritable,
> +			   cap_combine (target_inheritable,
>  					current->cap_permitted))) {
>  		return -EPERM;
>  	}

What stops legacy mode from being reenabled?

[...]

I think you missed the case when root-but-no-caps execs setuid root -- I 
don't see anything that would enable secureexec.

--Andy

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] scaled-back caps, take 4
  2004-05-19  1:27                 ` Chris Wright
@ 2004-05-19  1:54                   ` Andy Lutomirski
  2004-05-19  7:30                     ` Chris Wright
  0 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-19  1:54 UTC (permalink / raw)
  To: Chris Wright
  Cc: Andy Lutomirski, Stephen Smalley, Albert Cahalan,
	linux-kernel mailing list, olaf+list.linux-kernel,
	Valdis.Kletnieks

Chris Wright wrote:

> * Andy Lutomirski (luto@stanford.edu) wrote:
> 
>>Chris Wright wrote:
>>
>>>
>>>
>>>Thus far we've kept all this stuff out of core.  I believe we should
>>>keep it that way.  This could easily be left in bprm_set().
>>
>>True.  But as long as linux_binprm has fields for this stuff, intuitively 
>>it should always be filled in (i.e. not just in commoncap).  That way we 
>>can say that commoncap doesn't get special treatment :)
>>
>>Also, this seems like the right place to check for VFS caps.  This way we can.
> 
> 
> This does change the current notion of layering.  I see your point though, 
> likening it to say reading inode and finding S_ISUID bit.

On the other hand, no one would put reading of a SELinux security label 
here.  But we already have fields in binprm specifically for commoncap.  I 
have no strong preference.


>>The reason I killed the old algorithm is because it's scary (in the sense 
>>of being complicated and subtle for no good reason) and because I don't see 
>>the point of inheritable caps.  I doubt anything uses them currently on a 
>>vanilla kernel because they don't _do_ anything.  So I killed them.
> 
> 
> This does break all those caps aware apps (yeah, tongue-in-cheek ;-)
> that actually have the idea to widen the effective set, yet limit the
> inheriable set.  Seriously, I don't know how much this matters.

Yah, they're broken anyway right now if that's what they're doing.

The reason I didn't go for something like your approach (other than not 
thinking of it) was that, as long as we're changing the semantics, we might 
as well make them as clean as possible.  I also didn't mind ripping out 
lots of old code :).  If the inheritable mask stays, then programs need to 
be audited for what happens if they are run with different inheritable 
masks.  I'd rather just eliminate that complication and the corresponding 
blob of commoncap code.  Obviously my patch fails a lot of your tests as a 
result.

So do we arm-wrestle over whose implementation wins? :)  I'd say mine wins 
on readability (not your fault -- the old code was pretty bad to begin 
with) and some simplicity, but yours has the benefit of being less intrusive.

--Andy

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] support cap inheritable (Re: [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2)
  2004-05-19  1:34                 ` [PATCH] support cap inheritable (Re: [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2) Andy Lutomirski
@ 2004-05-19  7:27                   ` Chris Wright
  0 siblings, 0 replies; 43+ messages in thread
From: Chris Wright @ 2004-05-19  7:27 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Chris Wright, Stephen Smalley, Albert Cahalan,
	linux-kernel mailing list, olaf+list.linux-kernel,
	Valdis.Kletnieks

* Andy Lutomirski (luto@myrealbox.com) wrote:
> Chris Wright wrote:
> 
> > -#define CAP_INIT_INH_SET    to_cap_t(0)
> > +#define CAP_INIT_INH_SET    to_cap_t(~0)
> > +/* ~0 is legacy inheritable mode and can never be capset by user */
> > +#define cap_orig_inh(_cap) (cap_t((_cap)) == ~0)
> 
> So how do you say "inherit all caps"?

Legacy mode requires effectively reserving a bit, which works in this case
since all 32 aren't used.  So inherit all caps is all valid caps (which,
btw,  still exludes CAP_SETPCAP).  Something like "=eip cap_setpcap-eip"
would inherit all valid caps.

> > @@ -56,10 +59,13 @@
> >  int cap_capset_check (struct task_struct *target, kernel_cap_t *effective,
> >  		      kernel_cap_t *inheritable, kernel_cap_t *permitted)
> >  {
> > +	kernel_cap_t target_inheritable = target->cap_inheritable;
> > +	if (cap_orig_inh(target_inheritable))
> > +		target_inheritable = 0;
> >  	/* Derived from kernel/capability.c:sys_capset. */
> >  	/* verify restrictions on target's new Inheritable set */
> >  	if (!cap_issubset (*inheritable,
> > -			   cap_combine (target->cap_inheritable,
> > +			   cap_combine (target_inheritable,
> >  					current->cap_permitted))) {
> >  		return -EPERM;
> >  	}
> 
> What stops legacy mode from being reenabled?

I believe only a kernel thread could do this (and init) since they are the
only ones that could get CAP_SETPCAP.  Same threat as it is currently.

> I think you missed the case when root-but-no-caps execs setuid root -- I 
> don't see anything that would enable secureexec.

Yup you're right.  I liked how you did that in your patch and was just going
to steal that bit ;-)

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] scaled-back caps, take 4
  2004-05-19  1:54                   ` [PATCH] scaled-back caps, take 4 Andy Lutomirski
@ 2004-05-19  7:30                     ` Chris Wright
  2004-05-23  9:28                       ` Andy Lutomirski
  2004-05-24 23:38                       ` [PATCH] caps, compromise version (was Re: [PATCH] scaled-back caps, take 4) Andy Lutomirski
  0 siblings, 2 replies; 43+ messages in thread
From: Chris Wright @ 2004-05-19  7:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Chris Wright, Stephen Smalley, Albert Cahalan,
	linux-kernel mailing list, olaf+list.linux-kernel,
	Valdis.Kletnieks

* Andy Lutomirski (luto@myrealbox.com) wrote:
> Chris Wright wrote:
> > This does change the current notion of layering.  I see your point though, 
> > likening it to say reading inode and finding S_ISUID bit.
> 
> On the other hand, no one would put reading of a SELinux security label 
> here.  But we already have fields in binprm specifically for commoncap.  I 
> have no strong preference.

Yes, I stopped short of that argument only because capabilities fall
into a bit more of a gray zone than other modules.  But I do prefer
leaving it in the module.

> >>The reason I killed the old algorithm is because it's scary (in the sense 
> >>of being complicated and subtle for no good reason) and because I don't see 
> >>the point of inheritable caps.  I doubt anything uses them currently on a 
> >>vanilla kernel because they don't _do_ anything.  So I killed them.
> > 
> > This does break all those caps aware apps (yeah, tongue-in-cheek ;-)
> > that actually have the idea to widen the effective set, yet limit the
> > inheriable set.  Seriously, I don't know how much this matters.
> 
> Yah, they're broken anyway right now if that's what they're doing.

On Linux they are.  On IRIX they aren't.  This is part of the issue as I
see it.

> The reason I didn't go for something like your approach (other than not 
> thinking of it) was that, as long as we're changing the semantics, we might 
> as well make them as clean as possible.  I also didn't mind ripping out 
> lots of old code :).  If the inheritable mask stays, then programs need to 
> be audited for what happens if they are run with different inheritable 
> masks.  I'd rather just eliminate that complication and the corresponding 
> blob of commoncap code.  Obviously my patch fails a lot of your tests as a 
> result.

Actually the only glaring difference (aside from the uid/suid and non-root
execs nonroot-yet-diff-id-setuid-app issue I mentioned earlier) is in
something like "=ep cap_setpcap-ep cap_ipc_lock+i" IIRC.

I have the feeling we both are after the same thing, which is introducing
the ability to keep some caps through exec and still being able to sleep
at night w/ confidence that there isn't some subtle new hole lurking.
This is why I aimed to change as little code as possible.

> So do we arm-wrestle over whose implementation wins? :)  I'd say mine wins 
> on readability (not your fault -- the old code was pretty bad to begin 
> with) and some simplicity, but yours has the benefit of being less intrusive.

Hehe, arm wrestling could be entertaining ;-)  I'm in favor of the most
conservative change, which I feel is in my patch.  But I'm game to
continue to pick on each.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] scaled-back caps, take 4
  2004-05-19  7:30                     ` Chris Wright
@ 2004-05-23  9:28                       ` Andy Lutomirski
  2004-05-23 18:48                         ` Olaf Dietsche
  2004-05-24 23:38                       ` [PATCH] caps, compromise version (was Re: [PATCH] scaled-back caps, take 4) Andy Lutomirski
  1 sibling, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-23  9:28 UTC (permalink / raw)
  To: Chris Wright
  Cc: Stephen Smalley, Albert Cahalan, linux-kernel mailing list,
	olaf+list.linux-kernel, Valdis.Kletnieks

On Wednesday 19 May 2004 00:30, Chris Wright wrote:
> * Andy Lutomirski (luto@myrealbox.com) wrote:
> > Chris Wright wrote:

> > The reason I didn't go for something like your approach (other than not 
> > thinking of it) was that, as long as we're changing the semantics, we might 
> > as well make them as clean as possible.  I also didn't mind ripping out 
> > lots of old code :).  If the inheritable mask stays, then programs need to 
> > be audited for what happens if they are run with different inheritable 
> > masks.  I'd rather just eliminate that complication and the corresponding 
> > blob of commoncap code.  Obviously my patch fails a lot of your tests as a 
> > result.
> 
> Actually the only glaring difference (aside from the uid/suid and non-root
> execs nonroot-yet-diff-id-setuid-app issue I mentioned earlier) is in
> something like "=ep cap_setpcap-ep cap_ipc_lock+i" IIRC.
> 
> I have the feeling we both are after the same thing, which is introducing
> the ability to keep some caps through exec and still being able to sleep
> at night w/ confidence that there isn't some subtle new hole lurking.
> This is why I aimed to change as little code as possible.
> 
> > So do we arm-wrestle over whose implementation wins? :)  I'd say mine wins 
> > on readability (not your fault -- the old code was pretty bad to begin 
> > with) and some simplicity, but yours has the benefit of being less intrusive.
> 
> Hehe, arm wrestling could be entertaining ;-)  I'm in favor of the most
> conservative change, which I feel is in my patch.  But I'm game to
> continue to pick on each.
> 

You don't like my patch because it rips out a bunch of code and it's
not clear it won't break stuff.

I don't like your patch because it takes a bunch of incomprehensible
code that does almost nothing and tweaks it slightly to do something
useful.  (That's not to say it's does the wrong thing; I just don't
think the code is clear.)

So I decided to figure out what was going on with the original code.

First, CAP_SETPCAP is never obtainable (by anything).
Since cap_bset never has this bit set, nothing can inherit it
from fP.  capset_check prevents it from getting set in pI.

Second, cap_bset is broken.  For one thing, there's no way
to remove the caps you want to restrict from already-running
tasks.  So I don't think it matters if we break/change it.


cap_bprm_set_security does:
fP = fI = (new_uid == 0 || new_euid == 0)
fE = (new_euid == 0)

cap_bprm_apply_creds then does:
1. pP' = (fP & X) | (fI & pI)
2. unsafeness stuff
3. fix up uids
4. pE' = fE & pP'

Now for the fun part:

Since fP == fI, pP' = fP & (X | pI)
But X | pI == X (since pI can't have CAP_SETPCAP and X==~CAP_SETPCAP)
So pP' = fP & X

pE' == fE & pP' == fE & fP & X
But fE & fP == fE, so:
pE' = fE & X

The whole result is just
pP' = (uid == 0 || euid==0) & X
pE' = (euid == 0) & X


This patch implements this.  It should be invisible to userspace
(unless userspace (ab)uses cap_bset).  It also adds a secureexec
flag, which we both need.

First, did I get this right?  It seems to work :)

Second, do you have any objection to both of us redoing our
patches against this one?  It should make them nicer-looking
at least.

--Andy

--- 2.6.6-mm4/fs/exec.c~cap_cleanup	2004-05-19 10:37:25.000000000 -0700
+++ 2.6.6-mm4/fs/exec.c	2004-05-21 19:03:16.000000000 -0700
@@ -1093,6 +1093,7 @@
 	bprm.loader = 0;
 	bprm.exec = 0;
 	bprm.security = NULL;
+	bprm.secflags = 0;
 	bprm.mm = mm_alloc();
 	retval = -ENOMEM;
 	if (!bprm.mm)
--- 2.6.6-mm4/security/commoncap.c~cap_cleanup	2004-05-21 18:51:42.000000000 -0700
+++ 2.6.6-mm4/security/commoncap.c	2004-05-23 01:42:51.000000000 -0700
@@ -89,44 +89,20 @@
 
 int cap_bprm_set_security (struct linux_binprm *bprm)
 {
-	/* Copied from fs/exec.c:prepare_binprm. */
-
-	/* We don't have VFS support for capabilities yet */
-	cap_clear (bprm->cap_inheritable);
-	cap_clear (bprm->cap_permitted);
-	cap_clear (bprm->cap_effective);
-
-	/*  To support inheritance of root-permissions and suid-root
-	 *  executables under compatibility mode, we raise all three
-	 *  capability sets for the file.
-	 *
-	 *  If only the real uid is 0, we only raise the inheritable
-	 *  and permitted sets of the executable file.
-	 */
-
-	if (!issecure (SECURE_NOROOT)) {
-		if (bprm->e_uid == 0 || current->uid == 0) {
-			cap_set_full (bprm->cap_inheritable);
-			cap_set_full (bprm->cap_permitted);
-		}
-		if (bprm->e_uid == 0)
-			cap_set_full (bprm->cap_effective);
-	}
 	return 0;
 }
 
 void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
 {
-	/* Derived from fs/exec.c:compute_creds. */
-	kernel_cap_t new_permitted, working;
+	kernel_cap_t new_pP, new_pE;
 
-	new_permitted = cap_intersect (bprm->cap_permitted, cap_bset);
-	working = cap_intersect (bprm->cap_inheritable,
-				 current->cap_inheritable);
-	new_permitted = cap_combine (new_permitted, working);
+	if (bprm->e_uid == 0 || current->uid == 0)
+		new_pP = cap_bset;
+	else
+		cap_clear(new_pP);
 
 	if (bprm->e_uid != current->uid || bprm->e_gid != current->gid ||
-	    !cap_issubset (new_permitted, current->cap_permitted)) {
+	    !cap_issubset (new_pP, current->cap_permitted)) {
 		current->mm->dumpable = 0;
 
 		if (unsafe & ~LSM_UNSAFE_PTRACE_CAP) {
@@ -134,13 +110,16 @@
 				bprm->e_uid = current->uid;
 				bprm->e_gid = current->gid;
 			}
-			if (!capable (CAP_SETPCAP)) {
-				new_permitted = cap_intersect (new_permitted,
-							current->cap_permitted);
-			}
+			new_pP = cap_intersect (new_pP,
+						current->cap_permitted);
 		}
 	}
 
+	if (bprm->e_uid == 0)
+		new_pE = new_pP;
+	else
+		cap_clear(new_pE);
+
 	current->suid = current->euid = current->fsuid = bprm->e_uid;
 	current->sgid = current->egid = current->fsgid = bprm->e_gid;
 
@@ -148,9 +127,8 @@
 	 * in the init_task struct. Thus we skip the usual
 	 * capability rules */
 	if (current->pid != 1) {
-		current->cap_permitted = new_permitted;
-		current->cap_effective =
-		    cap_intersect (new_permitted, bprm->cap_effective);
+		current->cap_permitted = new_pP;
+		current->cap_effective = new_pE;
 	}
 
 	/* AUD: Audit candidate if current->cap_effective is set */
@@ -160,13 +138,9 @@
 
 int cap_bprm_secureexec (struct linux_binprm *bprm)
 {
-	/* If/when this module is enhanced to incorporate capability
-	   bits on files, the test below should be extended to also perform a 
-	   test between the old and new capability sets.  For now,
-	   it simply preserves the legacy decision algorithm used by
-	   the old userland. */
 	return (current->euid != current->uid ||
-		current->egid != current->gid);
+		current->egid != current->gid ||
+		(bprm->secflags & BINPRM_SEC_SECUREEXEC));
 }
 
 int cap_inode_setxattr(struct dentry *dentry, char *name, void *value,
--- 2.6.6-mm4/include/linux/binfmts.h~cap_cleanup	2004-05-19 10:37:25.000000000 -0700
+++ 2.6.6-mm4/include/linux/binfmts.h	2004-05-23 02:24:31.034876220 -0700
@@ -20,6 +20,7 @@
 /*
  * This structure is used to hold the arguments that are used when loading binaries.
  */
+#define BINPRM_SEC_SECUREEXEC	1
 struct linux_binprm{
 	char buf[BINPRM_BUF_SIZE];
 	struct page *page[MAX_ARG_PAGES];
@@ -28,6 +29,7 @@
 	int sh_bang;
 	struct file * file;
 	int e_uid, e_gid;
+	int secflags;
 	kernel_cap_t cap_inheritable, cap_permitted, cap_effective;
 	void *security;
 	int argc, envc;



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] scaled-back caps, take 4
  2004-05-23  9:28                       ` Andy Lutomirski
@ 2004-05-23 18:48                         ` Olaf Dietsche
  0 siblings, 0 replies; 43+ messages in thread
From: Olaf Dietsche @ 2004-05-23 18:48 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Chris Wright, Stephen Smalley, Albert Cahalan,
	linux-kernel mailing list, Valdis.Kletnieks

Andy Lutomirski <luto@myrealbox.com> writes:

> You don't like my patch because it rips out a bunch of code and it's
> not clear it won't break stuff.
>
> I don't like your patch because it takes a bunch of incomprehensible
> code that does almost nothing and tweaks it slightly to do something
> useful.  (That's not to say it's does the wrong thing; I just don't
> think the code is clear.)
>
> So I decided to figure out what was going on with the original code.
>
> First, CAP_SETPCAP is never obtainable (by anything).
> Since cap_bset never has this bit set, nothing can inherit it
> from fP.  capset_check prevents it from getting set in pI.

# mv /sbin/init /sbin/init.bin
# cat >/sbin/init
#! /bin/sh

if test $$ -eq 1; then
        mount /proc
        echo -1 >/proc/sys/kernel/cap-bound
fi

exec /sbin/init.bin "$@"
^D
# chmod 755 /sbin/init
# reboot

> Second, cap_bset is broken.  For one thing, there's no way
> to remove the caps you want to restrict from already-running
> tasks.  So I don't think it matters if we break/change it.

Maybe I don't understand this, but I think this is what sys_capset()
is for.

> cap_bprm_set_security does:
> fP = fI = (new_uid == 0 || new_euid == 0)
> fE = (new_euid == 0)

Only if (!issecure (SECURE_NOROOT))

[...]
> The whole result is just
> pP' = (uid == 0 || euid==0) & X
> pE' = (euid == 0) & X
>
> This patch implements this.  It should be invisible to userspace
> (unless userspace (ab)uses cap_bset).  It also adds a secureexec
> flag, which we both need.
>
> First, did I get this right?  It seems to work :)

With this patch you effectively revert all capable() calls back to
suser() tests. If this is what you intended, your patch looks fine.

> Second, do you have any objection to both of us redoing our
> patches against this one?  It should make them nicer-looking
> at least.

I didn't scrutinize capabilities as thoroughly as you did, but I still
don't see why your patch is necessary, besides the changes in
fs/exec.c and include/binfmts.h, maybe.

$ cp commoncap.c lutocap.c
modify it to your liking
# insmod lutocap

same goes for chriscap.c

Please, don't get me wrong. For me, it's just a matter of maintaining
a slightly bigger fscaps patch. But I don't think capabilities in
Linux are really broken, only because some proponents of SELinux claim
so.

If you want a simpler - setuid like - capabilities model, throw out
the inheritable _and_ permitted set and use the effective set alone.

Regards, Olaf.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH] caps, compromise version (was Re: [PATCH] scaled-back caps, take 4)
  2004-05-19  7:30                     ` Chris Wright
  2004-05-23  9:28                       ` Andy Lutomirski
@ 2004-05-24 23:38                       ` Andy Lutomirski
  2004-05-24 23:56                         ` Chris Wright
  1 sibling, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-24 23:38 UTC (permalink / raw)
  To: Chris Wright
  Cc: Stephen Smalley, Albert Cahalan, linux-kernel mailing list,
	olaf+list.linux-kernel, luto

> 
> Hehe, arm wrestling could be entertaining ;-)  I'm in favor of the most
> conservative change, which I feel is in my patch.  But I'm game to
> continue to pick on each.


I like your legacy mode.  I don't like making processes inherit
non-legacyness.  (With your patch, some daemon might be secure
when started from initscripts but insecure when started from the
command line, if root ended up in non-legacy mode.)

You've also convinced me that an inheritable mask is good, because
it may make some IRIX apps work.  It's also necessary for this patch
to be safe.

I don't like touching the inode in the security module (you
forgot to check nosuid, for one thing).

So here's another shot at it:

"Legacy mode" is controlled by a new bit in task_struct called
keep_all_caps (controlled by PRCTL_SET_KEEPALLCAPS).  This bit turns
off setuid emulation completely (except for setfsuid).

The evolution rules are:
pP' = (fP & X) | (pI & pP) [with the setuid-nonroot fix]
pE' = (pE | fP) & pP'
pI' = full

This time around, I haven't touched the unsafeness rules.

The magic is in the setuid emulation:
	if (current->uid == 0 || current->euid == 0)
		cap_set_full(current->cap_inheritable);
	else
		cap_clear(current->cap_inheritable);

So, unless a program plays with it's inheritable mask,
root will not pick up caps on exec (which is good -- it
means it's safe to chroot somewhere, disable all caps
except CAP_SETUID, and let untrusted code play around.)
But, if you start as root and setuid away, _even with
keepcaps_, you lose the caps on exec.  Which is the broken
behavior we want to preserve.

So, to avoid this, new code can either set keep_all_caps
or just explicitly enable inheritance after setuid, in
which case it just works.

I have pI' = full because otherwise it's just one more
(partially) user-controlled variable that programs need
to worry about.  (And because anything else would break
root.)

As for the rest of the changes:

The code no longer assumes that pI<pP, so I yanked all checks
on the inheritable mask.  On the other hand, it makes no
sense to me for capset when changing lots of processes'
masks to affect the inheritable mask.  So I made it leave
it alone, except when changing current.

keep_all_caps is clearly not entirely necessary.  I can take
it out if anyone objects.

I yanked all capset sanity checks from kernel/capability.c --
they were duplicates anyway.

And I left the old (IMHO pointless) behavior that one needs to hack
init in order to use CAP_SETPCAP.

[Side note: for cap_bset to be useful, I think there needs to be
an operation "atomically remove these caps from all tasks."  I
don't see one.]

This patch also should work fine if VFS capabilities are
introduced (there's an fP mask which defaults to (setuid-
root ? full : 0).

Patch against 2.6.6-mm4 (-mm5 didn't like my filesystem...).
It's not as well tested as it should be.  The old cap.cc
tool still works (but remember to set inheritable).  I
don't have a tool yet to play with keep_all_caps.

This depends on my cleanup patch from Saturday.

--Andy

Signed-off-by: Andy Lutomirski (luto@myrealbox.com)

 fs/exec.c                  |    8 +++++-
 include/linux/binfmts.h    |    2 +
 include/linux/capability.h |    2 -
 include/linux/init_task.h  |    1 
 include/linux/prctl.h      |    4 +++
 include/linux/sched.h      |    1 
 kernel/capability.c        |   13 ----------
 kernel/sys.c               |   11 +++++++++
 security/commoncap.c       |   54 +++++++++++++++++++++++++++++++--------------
 9 files changed, 64 insertions(+), 32 deletions(-)

--- 2.6.6-mm4/fs/exec.c~cap_2	2004-05-23 22:22:42.000000000 -0700
+++ 2.6.6-mm4/fs/exec.c	2004-05-23 22:43:22.000000000 -0700
@@ -886,8 +886,10 @@
 
 	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
 		/* Set-uid? */
-		if (mode & S_ISUID)
+		if (mode & S_ISUID) {
 			bprm->e_uid = inode->i_uid;
+			bprm->secflags |= BINPRM_SEC_SETUID;
+		}
 
 		/* Set-gid? */
 		/*
@@ -895,8 +897,10 @@
 		 * is a candidate for mandatory locking, not a setgid
 		 * executable.
 		 */
-		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
+		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
 			bprm->e_gid = inode->i_gid;
+			bprm->secflags |= BINPRM_SEC_SETGID;
+		}
 	}
 
 	/* fill in binprm security blob */
--- 2.6.6-mm4/security/commoncap.c~cap_2	2004-05-23 22:15:39.000000000 -0700
+++ 2.6.6-mm4/security/commoncap.c	2004-05-24 14:32:43.210834050 -0700
@@ -57,13 +57,6 @@
 		      kernel_cap_t *inheritable, kernel_cap_t *permitted)
 {
 	/* Derived from kernel/capability.c:sys_capset. */
-	/* verify restrictions on target's new Inheritable set */
-	if (!cap_issubset (*inheritable,
-			   cap_combine (target->cap_inheritable,
-					current->cap_permitted))) {
-		return -EPERM;
-	}
-
 	/* verify restrictions on target's new Permitted set */
 	if (!cap_issubset (*permitted,
 			   cap_combine (target->cap_permitted,
@@ -83,12 +76,19 @@
 		     kernel_cap_t *inheritable, kernel_cap_t *permitted)
 {
 	target->cap_effective = *effective;
-	target->cap_inheritable = *inheritable;
 	target->cap_permitted = *permitted;
+	if (current == target)
+		target->cap_inheritable = *inheritable;
 }
 
 int cap_bprm_set_security (struct linux_binprm *bprm)
 {
+	/* Pretend we have VFS capabilities */
+	if ((bprm->secflags & BINPRM_SEC_SETUID) && bprm->e_uid == 0)
+		cap_set_full(bprm->cap_permitted);
+	else
+		cap_clear(bprm->cap_permitted);		
+
 	return 0;
 }
 
@@ -96,10 +96,9 @@
 {
 	kernel_cap_t new_pP, new_pE;
 
-	if (bprm->e_uid == 0 || current->uid == 0)
-		new_pP = cap_bset;
-	else
-		cap_clear(new_pP);
+	new_pP = cap_combine(cap_intersect(bprm->cap_permitted, cap_bset),
+			     cap_intersect(current->cap_permitted,
+					   current->cap_inheritable));
 
 	if (bprm->e_uid != current->uid || bprm->e_gid != current->gid ||
 	    !cap_issubset (new_pP, current->cap_permitted)) {
@@ -115,10 +114,15 @@
 		}
 	}
 
-	if (bprm->e_uid == 0)
-		new_pE = new_pP;
-	else
-		cap_clear(new_pE);
+	/* setuid-nonroot is special. */
+	if (bprm->e_uid != 0 && current->uid != 0 && current->euid == 0)
+		cap_clear(new_pP);
+
+	new_pE = cap_combine(current->cap_effective, bprm->cap_permitted);
+	cap_mask(new_pE, new_pP);
+
+	if (!cap_issubset(new_pP, current->cap_permitted))
+		bprm->secflags |= BINPRM_SEC_SECUREEXEC;
 
 	current->suid = current->euid = current->fsuid = bprm->e_uid;
 	current->sgid = current->egid = current->fsgid = bprm->e_gid;
@@ -129,11 +133,13 @@
 	if (current->pid != 1) {
 		current->cap_permitted = new_pP;
 		current->cap_effective = new_pE;
+		cap_set_full(current->cap_inheritable);
 	}
 
 	/* AUD: Audit candidate if current->cap_effective is set */
 
 	current->keep_capabilities = 0;
+	current->keep_all_caps = 0;
 }
 
 int cap_bprm_secureexec (struct linux_binprm *bprm)
@@ -191,10 +197,18 @@
  * Keeping uid 0 is not an option because uid 0 owns too many vital
  * files..
  * Thanks to Olaf Kirch and Peter Benie for spotting this.
+ *
+ * luto - New behavior, May '04
+ * To emulate old evolution rules, inheritable tracks uid and euid.
+ * prctl(PRCTL_SET_KEEPALLCAPS) disables this.  In fact, keepallcaps
+ * disables this whole function.  BE CAREFUL ABOUT THE EFFECTIVE MASK!!!
  */
 static inline void cap_emulate_setxuid (int old_ruid, int old_euid,
 					int old_suid)
 {
+	if (current->keep_all_caps)
+		return;
+
 	if ((old_ruid == 0 || old_euid == 0 || old_suid == 0) &&
 	    (current->uid != 0 && current->euid != 0 && current->suid != 0) &&
 	    !current->keep_capabilities) {
@@ -207,6 +221,11 @@
 	if (old_euid != 0 && current->euid == 0) {
 		current->cap_effective = current->cap_permitted;
 	}
+
+	if (current->uid == 0 || current->euid == 0)
+		cap_set_full(current->cap_inheritable);
+	else
+		cap_clear(current->cap_inheritable);
 }
 
 int cap_task_post_setuid (uid_t old_ruid, uid_t old_euid, uid_t old_suid,
@@ -230,6 +249,9 @@
 			/*
 			 * FIXME - is fsuser used for all CAP_FS_MASK capabilities?
 			 *          if not, we might be a bit too harsh here.
+			 *
+			 * This explicitly ignores keep_all_caps.  The
+			 * potential for error is just too large.
 			 */
 
 			if (!issecure (SECURE_NO_SETUID_FIXUP)) {
--- 2.6.6-mm4/kernel/sys.c~cap_2	2004-05-23 22:15:24.000000000 -0700
+++ 2.6.6-mm4/kernel/sys.c	2004-05-23 22:57:40.000000000 -0700
@@ -1646,6 +1646,17 @@
 			}
 			current->keep_capabilities = arg2;
 			break;
+		case PR_GET_KEEPALLCAPS:
+			if (current->keep_all_caps)
+				error = 1;
+			break;
+		case PR_SET_KEEPALLCAPS:
+			if (arg2 != 0 && arg2 != 1) {
+				error = -EINVAL;
+				break;
+			}
+			current->keep_all_caps = arg2;
+			break;
 		default:
 			error = -EINVAL;
 			break;
--- 2.6.6-mm4/kernel/capability.c~cap_2	2004-05-23 23:11:49.000000000 -0700
+++ 2.6.6-mm4/kernel/capability.c	2004-05-23 23:13:14.000000000 -0700
@@ -174,19 +174,6 @@
      if (security_capset_check(target, &effective, &inheritable, &permitted))
 	     goto out;
 
-     if (!cap_issubset(inheritable, cap_combine(target->cap_inheritable,
-                       current->cap_permitted)))
-             goto out;
-
-     /* verify restrictions on target's new Permitted set */
-     if (!cap_issubset(permitted, cap_combine(target->cap_permitted,
-                       current->cap_permitted)))
-             goto out;
-
-     /* verify the _new_Effective_ is a subset of the _new_Permitted_ */
-     if (!cap_issubset(effective, permitted))
-             goto out;
-
      ret = 0;
 
      /* having verified that the proposed changes are legal,
--- 2.6.6-mm4/include/linux/capability.h~cap_2	2004-05-23 22:42:22.000000000 -0700
+++ 2.6.6-mm4/include/linux/capability.h	2004-05-23 22:42:34.000000000 -0700
@@ -309,7 +309,7 @@
 #define CAP_EMPTY_SET       to_cap_t(0)
 #define CAP_FULL_SET        to_cap_t(~0)
 #define CAP_INIT_EFF_SET    to_cap_t(~0 & ~CAP_TO_MASK(CAP_SETPCAP))
-#define CAP_INIT_INH_SET    to_cap_t(0)
+#define CAP_INIT_INH_SET    CAP_FULL_SET
 
 #define CAP_TO_MASK(x) (1 << (x))
 #define cap_raise(c, flag)   (cap_t(c) |=  CAP_TO_MASK(flag))
--- 2.6.6-mm4/include/linux/sched.h~cap_2	2004-05-23 22:04:50.000000000 -0700
+++ 2.6.6-mm4/include/linux/sched.h	2004-05-23 22:11:15.000000000 -0700
@@ -462,6 +462,7 @@
 	struct group_info *group_info;
 	kernel_cap_t   cap_effective, cap_inheritable, cap_permitted;
 	int keep_capabilities:1;
+	int keep_all_caps:1;
 	struct user_struct *user;
 /* limits */
 	struct rlimit rlim[RLIM_NLIMITS];
--- 2.6.6-mm4/include/linux/prctl.h~cap_2	2004-05-23 22:12:18.000000000 -0700
+++ 2.6.6-mm4/include/linux/prctl.h	2004-05-23 22:13:35.000000000 -0700
@@ -20,6 +20,10 @@
 #define PR_GET_KEEPCAPS   7
 #define PR_SET_KEEPCAPS   8
 
+/* Get/set whether to emulate old capability behavior */
+#define PR_GET_KEEPALLCAPS 15
+#define PR_SET_KEEPALLCAPS 16
+
 /* Get/set floating-point emulation control bits (if meaningful) */
 #define PR_GET_FPEMU  9
 #define PR_SET_FPEMU 10
--- 2.6.6-mm4/include/linux/binfmts.h~cap_2	2004-05-23 22:27:43.000000000 -0700
+++ 2.6.6-mm4/include/linux/binfmts.h	2004-05-23 22:27:58.000000000 -0700
@@ -21,6 +21,8 @@
  * This structure is used to hold the arguments that are used when loading binaries.
  */
 #define BINPRM_SEC_SECUREEXEC	1
+#define BINPRM_SEC_SETUID	2
+#define BINPRM_SEC_SETGID	4
 struct linux_binprm{
 	char buf[BINPRM_BUF_SIZE];
 	struct page *page[MAX_ARG_PAGES];
--- 2.6.6-mm4/include/linux/init_task.h~cap_2	2004-05-23 22:11:27.000000000 -0700
+++ 2.6.6-mm4/include/linux/init_task.h	2004-05-23 22:11:44.000000000 -0700
@@ -96,6 +96,7 @@
 	.cap_inheritable = CAP_INIT_INH_SET,				\
 	.cap_permitted	= CAP_FULL_SET,					\
 	.keep_capabilities = 0,						\
+	.keep_all_caps  = 0,						\
 	.rlim		= INIT_RLIMITS,					\
 	.user		= INIT_USER,					\
 	.comm		= "swapper",					\




^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] caps, compromise version (was Re: [PATCH] scaled-back caps, take 4)
  2004-05-24 23:38                       ` [PATCH] caps, compromise version (was Re: [PATCH] scaled-back caps, take 4) Andy Lutomirski
@ 2004-05-24 23:56                         ` Chris Wright
  2004-05-25  0:23                           ` Andy Lutomirski
  0 siblings, 1 reply; 43+ messages in thread
From: Chris Wright @ 2004-05-24 23:56 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Chris Wright, Stephen Smalley, Albert Cahalan,
	linux-kernel mailing list, olaf+list.linux-kernel

* Andy Lutomirski (luto@myrealbox.com) wrote:
> > 
> > Hehe, arm wrestling could be entertaining ;-)  I'm in favor of the most
> > conservative change, which I feel is in my patch.  But I'm game to
> > continue to pick on each.
> 
> 
> I like your legacy mode.  I don't like making processes inherit
> non-legacyness.  (With your patch, some daemon might be secure
> when started from initscripts but insecure when started from the
> command line, if root ended up in non-legacy mode.)

Hmm, that was intentional (my very first cut at this thing cleared it,
but that patch had many other broken behaviours).  Specifically because
it goes through pI, which POSIX draft says is untouched through exec.

> You've also convinced me that an inheritable mask is good, because
> it may make some IRIX apps work.  It's also necessary for this patch
> to be safe.
> 
> I don't like touching the inode in the security module (you
> forgot to check nosuid, for one thing).

Yup, I changed that since then, using the secflags approach you did.

> So here's another shot at it:
> 
> "Legacy mode" is controlled by a new bit in task_struct called
> keep_all_caps (controlled by PRCTL_SET_KEEPALLCAPS).  This bit turns
> off setuid emulation completely (except for setfsuid).

I had same idea.  I wished we could hijack keep_capabilities as a
bit vector.

> The evolution rules are:
> pP' = (fP & X) | (pI & pP) [with the setuid-nonroot fix]
> pE' = (pE | fP) & pP'
> pI' = full
> 
> This time around, I haven't touched the unsafeness rules.
> 
> The magic is in the setuid emulation:
> 	if (current->uid == 0 || current->euid == 0)
> 		cap_set_full(current->cap_inheritable);
> 	else
> 		cap_clear(current->cap_inheritable);
> 
> So, unless a program plays with it's inheritable mask,
> root will not pick up caps on exec (which is good -- it
> means it's safe to chroot somewhere, disable all caps
> except CAP_SETUID, and let untrusted code play around.)
> But, if you start as root and setuid away, _even with
> keepcaps_, you lose the caps on exec.  Which is the broken
> behavior we want to preserve.
> 
> So, to avoid this, new code can either set keep_all_caps
> or just explicitly enable inheritance after setuid, in
> which case it just works.
> 
> I have pI' = full because otherwise it's just one more
> (partially) user-controlled variable that programs need
> to worry about.  (And because anything else would break
> root.)

How do you keep passing down the same caps through multiple execs?

> As for the rest of the changes:
> 
> The code no longer assumes that pI<pP, so I yanked all checks
> on the inheritable mask.  On the other hand, it makes no
> sense to me for capset when changing lots of processes'
> masks to affect the inheritable mask.  So I made it leave
> it alone, except when changing current.
> 
> keep_all_caps is clearly not entirely necessary.  I can take
> it out if anyone objects.
> 
> I yanked all capset sanity checks from kernel/capability.c --
> they were duplicates anyway.
> 
> And I left the old (IMHO pointless) behavior that one needs to hack
> init in order to use CAP_SETPCAP.
> 
> [Side note: for cap_bset to be useful, I think there needs to be
> an operation "atomically remove these caps from all tasks."  I
> don't see one.]

Yeah.  It depends on the definition of useful.  Get a couple privileged
tasks running (which may fork/exec from time to time), then clamp down
the machine is one form of useful.  In general, I don't cap_bset is that
useful though.

> This patch also should work fine if VFS capabilities are
> introduced (there's an fP mask which defaults to (setuid-
> root ? full : 0).
> 
> Patch against 2.6.6-mm4 (-mm5 didn't like my filesystem...).
> It's not as well tested as it should be.  The old cap.cc
> tool still works (but remember to set inheritable).  I
> don't have a tool yet to play with keep_all_caps.

I can add this to the test stuff to play with it.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] caps, compromise version (was Re: [PATCH] scaled-back caps, take 4)
  2004-05-24 23:56                         ` Chris Wright
@ 2004-05-25  0:23                           ` Andy Lutomirski
  0 siblings, 0 replies; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-25  0:23 UTC (permalink / raw)
  To: Chris Wright
  Cc: Stephen Smalley, Albert Cahalan, linux-kernel mailing list,
	olaf+list.linux-kernel

Chris Wright wrote:

> * Andy Lutomirski (luto@myrealbox.com) wrote:
> 
>>>Hehe, arm wrestling could be entertaining ;-)  I'm in favor of the most
>>>conservative change, which I feel is in my patch.  But I'm game to
>>>continue to pick on each.
>>
>>
>>I like your legacy mode.  I don't like making processes inherit
>>non-legacyness.  (With your patch, some daemon might be secure
>>when started from initscripts but insecure when started from the
>>command line, if root ended up in non-legacy mode.)
> 
> 
> Hmm, that was intentional (my very first cut at this thing cleared it,
> but that patch had many other broken behaviours).  Specifically because
> it goes through pI, which POSIX draft says is untouched through exec.

Not in IRIX, though.  And I'm afraid of:

cap -c all-i <some setuid binary>
versus
cap -c all+i <some setuid binary>

Suddently the binary's behavior might be different.  This isn't 
inheritantly bad, but it seems like a pointless gotcha.

I like my version of using inheritable for legaciness, but only because my 
inheritable semantics make sense.  Your version would worry me a lot less 
if you just added a new field.  But mine doesn't actually need the new field ;)

>>
>>"Legacy mode" is controlled by a new bit in task_struct called
>>keep_all_caps (controlled by PRCTL_SET_KEEPALLCAPS).  This bit turns
>>off setuid emulation completely (except for setfsuid).
> 
> 
> I had same idea.  I wished we could hijack keep_capabilities as a
> bit vector.

It's a bitfield.  Just add fields -- no cost in memory.  Fairly large cost 
in compile time, though...

> 
> 
>>The evolution rules are:
>>pP' = (fP & X) | (pI & pP) [with the setuid-nonroot fix]
>>pE' = (pE | fP) & pP'
>>pI' = full
>>
>>This time around, I haven't touched the unsafeness rules.
>>
>>The magic is in the setuid emulation:
>>	if (current->uid == 0 || current->euid == 0)
>>		cap_set_full(current->cap_inheritable);
>>	else
>>		cap_clear(current->cap_inheritable);
>>
>>So, unless a program plays with it's inheritable mask,
>>root will not pick up caps on exec (which is good -- it
>>means it's safe to chroot somewhere, disable all caps
>>except CAP_SETUID, and let untrusted code play around.)
>>But, if you start as root and setuid away, _even with
>>keepcaps_, you lose the caps on exec.  Which is the broken
>>behavior we want to preserve.
>>
>>So, to avoid this, new code can either set keep_all_caps
>>or just explicitly enable inheritance after setuid, in
>>which case it just works.
>>
>>I have pI' = full because otherwise it's just one more
>>(partially) user-controlled variable that programs need
>>to worry about.  (And because anything else would break
>>root.)
> 
> 
> How do you keep passing down the same caps through multiple execs?

This only takes effect when set*uid is called successfully.  It bites 
programs that start as non-root with CAP_SETUID and change their uid, but 
these programs either don't exist or don't work at all right now.

[root@luto andy]# cap -c all+i -u andy bash
[andy@luto andy]$ dumpcap [note second exec]
         Real        Eff
User    500         500
Group   500         500

Caps: =ip cap_setpcap-p

> 
> 
>>As for the rest of the changes:
>>
>>The code no longer assumes that pI<pP, so I yanked all checks
>>on the inheritable mask.  On the other hand, it makes no
>>sense to me for capset when changing lots of processes'
>>masks to affect the inheritable mask.  So I made it leave
>>it alone, except when changing current.
>>
>>keep_all_caps is clearly not entirely necessary.  I can take
>>it out if anyone objects.
>>
>>I yanked all capset sanity checks from kernel/capability.c --
>>they were duplicates anyway.
>>
>>And I left the old (IMHO pointless) behavior that one needs to hack
>>init in order to use CAP_SETPCAP.
>>
>>[Side note: for cap_bset to be useful, I think there needs to be
>>an operation "atomically remove these caps from all tasks."  I
>>don't see one.]
> 
> 
> Yeah.  It depends on the definition of useful.  Get a couple privileged
> tasks running (which may fork/exec from time to time), then clamp down
> the machine is one form of useful.  In general, I don't cap_bset is that
> useful though.

Especially with CAP_SYS_ADMIN...  SELinux is clearly the way to go here.

I just discovered a patch 
(http://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.4-fcap/README) 
that claims to implement per-process-tree maximum cap masks (like I did for 
  awhile).  It hasn't been maintained, though.

If one of our patches hits -mm or -linus, I may try and add a feature like 
that.  It'll (rightly) annoy the SELinux folks, though.

> 
> 
>>This patch also should work fine if VFS capabilities are
>>introduced (there's an fP mask which defaults to (setuid-
>>root ? full : 0).
>>
>>Patch against 2.6.6-mm4 (-mm5 didn't like my filesystem...).
>>It's not as well tested as it should be.  The old cap.cc
>>tool still works (but remember to set inheritable).  I
>>don't have a tool yet to play with keep_all_caps.
> 
> 
> I can add this to the test stuff to play with it.

Except that I fail a lot of your tests because of inheritable mask 
differences.  Oh, well.

I may revive my ext3 caps patch sometime.  Is there a way to make that work 
with your patch?

--Andy

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 17:32       ` Albert Cahalan
@ 2004-05-14 21:11         ` Chris Wright
  2004-05-14 19:32           ` Albert Cahalan
  0 siblings, 1 reply; 43+ messages in thread
From: Chris Wright @ 2004-05-14 21:11 UTC (permalink / raw)
  To: Albert Cahalan
  Cc: Stephen Smalley, linux-kernel mailing list, luto, Chris Wright,
	olaf+list.linux-kernel, Valdis.Kletnieks

* Albert Cahalan (albert@users.sourceforge.net) wrote:
> On Fri, 2004-05-14 at 14:06, Stephen Smalley wrote:
> > You missed the point.  Capability scheme maps far too
> > many operations to a single capability; CAP_SYS_ADMIN
> > in Linux is a good example.
> 
> What I said: lovely, but not exactly groundbreaking.
> 
> Suppose we break up CAP_SYS_ADMIN into 41 other bits.
> There you go. It's silly to argue that a system with
> more bits is some kind of major advance over one with
> just a few bits. There is a quality-of-implementation
> issue here, not some fundamental difference.

Needing more bits isn't the only problem.

> > TE model
> > defers organization of operations into equivalence
> > classes to the policy writer.
> 
> I don't see anything special here either. With a
> plain capability-bit system, you could allow for
> user-defined aliases that map to multiple bits.
> In some random /etc config file:
> 
> define ADMIN := FOO | BAR | BAZ

This doesn't effect the inflexibility of a single definition for domain
transistion that's inherent in the capabilty system.

> Lack of granularity is an implementation detail.
> (Blame the SGI folks that wouldn't listen to me.)
> Lack of granularity is not a design flaw.

It's a design flaw.  More bits won't help.  There's an important missing
piece...credentials for both subject and object.  Both of which can be
dynamic, and differ per subject's view of an object.

> What I'm looking for:
> 
> 1. configure the kernel by ...
> 2. ensure that /bin/foo runs early in boot
> 3. put "blah blah blah" in /etc/foo.conf
> 
> That is, is there a small set of simple config files
> and binaries that I could just slap onto an existing
> system to ensure that a particular app is granted an
> extra capability bit?
> 
> If yes, then the ugly old-Oracle hack is not needed.

Nearly.  There's the minor issue that execve() clears that bit more
agressively than desired for non-root processes.  Otherwise pam can do
it with pam_cap.  Which is all we're trying to fix here.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 21:11         ` Chris Wright
@ 2004-05-14 19:32           ` Albert Cahalan
  0 siblings, 0 replies; 43+ messages in thread
From: Albert Cahalan @ 2004-05-14 19:32 UTC (permalink / raw)
  To: Chris Wright
  Cc: Albert Cahalan, Stephen Smalley, linux-kernel mailing list, luto,
	olaf+list.linux-kernel, Valdis.Kletnieks

On Fri, 2004-05-14 at 17:11, Chris Wright wrote:
> * Albert Cahalan (albert@users.sourceforge.net) wrote:
> > On Fri, 2004-05-14 at 14:06, Stephen Smalley wrote:
> > > You missed the point.  Capability scheme maps far too
> > > many operations to a single capability; CAP_SYS_ADMIN
> > > in Linux is a good example.
> > 
> > What I said: lovely, but not exactly groundbreaking.
> > 
> > Suppose we break up CAP_SYS_ADMIN into 41 other bits.
> > There you go. It's silly to argue that a system with
> > more bits is some kind of major advance over one with
> > just a few bits. There is a quality-of-implementation
> > issue here, not some fundamental difference.
> 
> Needing more bits isn't the only problem.

That's what much of the document went on about. The
rest of the document was mostly generic MAC concepts.

> > > TE model
> > > defers organization of operations into equivalence
> > > classes to the policy writer.
> > 
> > I don't see anything special here either. With a
> > plain capability-bit system, you could allow for
> > user-defined aliases that map to multiple bits.
> > In some random /etc config file:
> > 
> > define ADMIN := FOO | BAR | BAZ
> 
> This doesn't effect the inflexibility of a single definition for domain
> transistion that's inherent in the capabilty system.

Sure. I already noted this.

> > Lack of granularity is an implementation detail.
> > (Blame the SGI folks that wouldn't listen to me.)
> > Lack of granularity is not a design flaw.
> 
> It's a design flaw.  More bits won't help.  There's an important missing
> piece...credentials for both subject and object.  Both of which can be
> dynamic, and differ per subject's view of an object.

There is no meaningful object.

subject: process 12345
object: ??????
operation: lock memory

For a few capability bits, there is a meaningful object
and you could use SELinux in place of the capability bits.
For most of the capability bits, this is not the case.

> > What I'm looking for:
> > 
> > 1. configure the kernel by ...
> > 2. ensure that /bin/foo runs early in boot
> > 3. put "blah blah blah" in /etc/foo.conf
> > 
> > That is, is there a small set of simple config files
> > and binaries that I could just slap onto an existing
> > system to ensure that a particular app is granted an
> > extra capability bit?
> > 
> > If yes, then the ugly old-Oracle hack is not needed.
> 
> Nearly.  There's the minor issue that execve() clears that bit more
> agressively than desired for non-root processes.  Otherwise pam can do
> it with pam_cap.  Which is all we're trying to fix here.

Stephen Smalley suggested that SELinux could take the
place of our capability bits. So I'm asking how you do
that, using the most minimal SELinux config.

If he really has a way, then there is no need to change
the execve() behavior in the Linux 2.6.x kernels.

Perhaps we just need an LSM hook to re-add the capability
bit after execve() drops it. That's a tiny change that
doesn't affect any existing system.




^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 15:19   ` Albert Cahalan
@ 2004-05-14 18:06     ` Stephen Smalley
  2004-05-14 17:32       ` Albert Cahalan
  0 siblings, 1 reply; 43+ messages in thread
From: Stephen Smalley @ 2004-05-14 18:06 UTC (permalink / raw)
  To: Albert Cahalan
  Cc: linux-kernel mailing list, luto, Chris Wright,
	olaf+list.linux-kernel, Valdis.Kletnieks

On Fri, 2004-05-14 at 11:19, Albert Cahalan wrote:
> I just read that. It's a very unfair marketing document.
> Among other things, it suggests that a capability system
> is stuck with about 40 bits while their own version of
> capabilities (a duck is a duck...) has 80 bits. Lovely,
> but not exactly groundbreaking.

You missed the point.  Capability scheme maps far too many operations to
a single capability; CAP_SYS_ADMIN in Linux is a good example.  TE model
defers organization of operations into equivalence classes to the policy
writer.

>  There is the bit about
> a 3-argument security call, but a careful reading will
> reveal that one argument is unused (NULL?) when dealing
> with abilities like "can set the clock".

But very useful when dealing with CAP_DAC_OVERRIDE and friends.

> About the only thing of interest is that capability
> transitions can be arbitrary. You're not limited to
> an obscure set of equations that nobody can agree on.

Because there is no one-size-fits-all equation for all transitions.

> The cost: complicated site-specific config files and
> the inability to support capability-aware apps that
> set+clear their own bits.

Complexity pushed to userspace, where it can be analyzed appropriately
via tools and tailored for the transition in question.  Central
management of the capabilities based on equivalence classes (types), as
opposed to having to manage a distributed nightmare of capability bits
on your filesystems.  Applications that can transition to other domains,
but only if so authorized by the policy.

> Eh, why? That's mostly a search-and-replace on the name,
> since capable() makes a perfectly fine LSM hook.

It doesn't offer sufficient granularity, either for operation (which you
_could_ address by introducing new capabilities, but the hooks are more
easily extensible) or for object.

> So what about the old-Oracle problem? You have a
> server that needs the ability to hog and lock memory.
> Is there an almost-empty SELinux policy that would
> provide this while leaving the rest of the system
> acting as UNIX-like systems have always acted?
> If so, we have a winner.

See "relaxed policy" or "targeted policy" in recent discussions on the
Fedora mailing lists; coming soon to a rawhide near you.  Not exactly
the same thing (it is a policy where most processes run essentially
unconfined except for a targeted set that you define like apache, bind,
etc), but you could certainly tweak it to this end (i.e. you put oracle
into a domain that gives it the one capability you desire).

> One still does need to provide apps with a way to
> answer "can I do FOO, BAR, and BAZ?" and "am I
> running with elevated privileges?". Some way to
> dispose of unneeded privileges would be good too.
> Hopefully extra libraries wouldn't be needed.

SELinux provides a policy API already, and a userspace library for
accessing it (and caching decisions from it).  It also provides (via the
AT_SECURE auxv entry) notification that a domain transition has
occurred, and this is already used by glibc secure mode.  Privileges are
dropped via domain transitions.

-- 
Stephen Smalley <sds@epoch.ncsc.mil>
National Security Agency


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 15:21 ` Stephen Smalley
  2004-05-14 15:19   ` Albert Cahalan
@ 2004-05-14 18:00   ` Chris Wright
  1 sibling, 0 replies; 43+ messages in thread
From: Chris Wright @ 2004-05-14 18:00 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Albert Cahalan, linux-kernel mailing list, luto, Chris Wright,
	olaf+list.linux-kernel, Valdis.Kletnieks

* Stephen Smalley (sds@epoch.ncsc.mil) wrote:
> On Fri, 2004-05-14 at 08:03, Albert Cahalan wrote:
> > This would be an excellent time to reconsider how capabilities
> > are assigned to bits. You're breaking things anyway; you might
> > as well do all the breaking at once. I want local-use bits so
> > that the print queue management access isn't by magic UID/GID.
> > We haven't escaped UID-as-priv if server apps and setuid apps
> > are still making UID-based access control decisions.
> 
> Capabilities are a broken model for privilege management; try RBAC/TE
> instead.  http://www.securecomputing.com/pdf/secureos.pdf has notes
> about the history and comparison of capabilities vs. TE.
> 
> Instead of adding new capability bits, replace capable() calls with LSM
> hook calls that offer you finer granularity both in operation and in
> object-based decisions, and then use a security module like SELinux to
> map that to actual permission checks.  SELinux provides a framework for
> defining security classes and permissions, including both definitions
> used by the kernel and definitions used by userspace policy enforcers
> (ala security-enhanced X).

exactly!

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 14:53 ` Andy Lutomirski
@ 2004-05-14 17:58   ` Chris Wright
  0 siblings, 0 replies; 43+ messages in thread
From: Chris Wright @ 2004-05-14 17:58 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Albert Cahalan, linux-kernel mailing list, chrisw,
	olaf+list.linux-kernel, Valdis.Kletnieks

* Andy Lutomirski (luto@myrealbox.com) wrote:
> > This would be an excellent time to reconsider how capabilities
> > are assigned to bits. You're breaking things anyway; you might
> > as well do all the breaking at once. I want local-use bits so
> > that the print queue management access isn't by magic UID/GID.
> > We haven't escaped UID-as-priv if server apps and setuid apps
> > are still making UID-based access control decisions.
> 
> How many bits?  Or should it even be a bitmask?
> 
> I'm thinking either 64 or 128 for kernel-defined caps and either
> a seperate 128 bits or more or just a list for local-defined.

Starts to look like the list of LSM callbacks.  Making it bigger doesn't
help the simple issue, keep one lousy bit across execve().  All this
redesign seems wrong to do in 2.6.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 12:03 [PATCH] capabilites, take 2 Albert Cahalan
  2004-05-14 14:53 ` Andy Lutomirski
  2004-05-14 15:21 ` Stephen Smalley
@ 2004-05-14 17:48 ` Chris Wright
  2 siblings, 0 replies; 43+ messages in thread
From: Chris Wright @ 2004-05-14 17:48 UTC (permalink / raw)
  To: Albert Cahalan
  Cc: linux-kernel mailing list, luto, chrisw, olaf+list.linux-kernel,
	Valdis.Kletnieks

* Albert Cahalan (albert@users.sourceforge.net) wrote:
> This would be an excellent time to reconsider how capabilities
> are assigned to bits. You're breaking things anyway; you might
> as well do all the breaking at once. I want local-use bits so
> that the print queue management access isn't by magic UID/GID.
> We haven't escaped UID-as-priv if server apps and setuid apps
> are still making UID-based access control decisions.

This is too volatile in stable series.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14  5:58       ` Andy Lutomirski
@ 2004-05-14 17:45         ` Chris Wright
  0 siblings, 0 replies; 43+ messages in thread
From: Chris Wright @ 2004-05-14 17:45 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Chris Wright, akpm, linux-kernel, Valdis.Kletnieks

* Andy Lutomirski (luto@myrealbox.com) wrote:
>  > I tested as a module, and this doesn't run AFAICS
> 
> OK -- I give in.  I'll redo it as a kernel (non-module) boot parameter.
> And touch some more files that have no business being capability-aware
> while I'm at it :(

I'm not yet convinced this is necessary to get the behaviour we'd like,
which is simply execve() doesn't wipe caps categorically.

> The inheritable mask:
> > > Can you send me a sample of what breaks?
> > Yes.  It's something like this:
> >
> > keepcaps
> > dropuid
> > drop caps
> > execve()
> >
> > The caps that are left are like this.  (effective == inheritable) which
> > are a subset of permitted.
> 
> Is (eff == inh) what happens or what should happen?  If the former, then
> my patch is broken.  If the latter, either I'm confused, or see below.

eff == inh is what capset is doing, with intention.

> > > My concept of 'inheritable' is that caps that are _not_ inheritable
> > > may never be gained by this task or its children.  So a process
> > > should normally have all caps inheritable.
> >
> > This is the diff with Posix, which allows a process to inherit a
> > capability that it can never exercise.  However, it could pass the
> > capablity on to someone else who could inherit it.
> 
> So here's where I think we disagree:
> 
> Posix (as interpreted by Linux 2.4/2.6) says:
> pP' = (fP & cap_bset) | (fI & pI)
> 
> So (assuming that set_security did the "obvious" (but dangerous) thing):
> 
> pP := "task may enable and use these capabilities"
> pE := "task may use these capabilities _now_"
> pI := "task may gain these on exec from fI"
> 
> fP := "this program gets these caps (module cap_bset)"
> fI := "this program gets these caps if pI says so"
> 
> Which screams "overcomplicated."  I imagine that the use is for a
> trusted daemon to run an untrusted helper (with pI>0) which then
> calls back into trusted land and gets its caps back.  This is
> possibly convenient, but it seems to break (where break = scare me)
> when there are more than one such system on a given box.  Then one's
> untrusted program (with fI>0) can call the other's trusted fI>0
> helper.  I suppose the point is that a random user's program (with fI==0)
> can't try to exploit anything, but, for this to be at all secure, both
> fI>0 programs need to be secure against attack from the other (unrelated)
> system, should it be compromised.  Which means it might as well have set
> fP>0 and been done with it (I don't believe in security by inconvenience
> of exploit).

I agree it's overcomplicated.  Much of the complication is added in the
fs bits where you now have 6 fields to care about per exec.  And you've
got to make sure the fs bits acutally make sense in all cases and
provide useful security.  This is precisely why they haven't been
implemented yet.  They don't mix nicely with the world people are used
to with setuid, etc, and they add many extra knobs which means it's
difficult to admin.

> I see no particular invariants here, except for pE <= pP.
> 
> IRIX (thanks Valdis) says:
> 
> pI' = pI & fI
> pP' = fP | (pI' & fP)
> 
> Which I interpret as
> 
> pP := "task may enable and use these capabilities"
> pE := "task may use these capabilities _now_"
> pI := ~"task _loses_ these on exec"
> 
> fP := "this program gets these caps"
> fI := "this program may keep these caps"
> 
> This seems to want pP <= pI as an invariant.
> 
> This is what I always thought Linux capabilities meant to be.  They
> don't make my brain hurt.
> 
> But I also think they're overengineered.  Instead of:
> 
> drop_caps_from_inheritable()
> exec()
> 
> a program could do:
> 
> drop_caps_from_permitted()
> exec()

Yes.  Although I suppose the first case allows for simple damage control
when needing some privs locally, and calling into libraries which may
generate a fork/exec.

> And I can't imagine what use fI != ~0 has, since it's trivially
> accomplished by a wrapper.  It is also trivially bypassed by
> loading the program manually (with ld.so).

Not so sure (now that we're dreaming of fs caps), ld.so could have fI == 0.
IOW, similar to ld.so being worthless to gain setuid bit.

> So, in my patch, I decided steal the inheritable mask to mean this:
> 
> pI := "this process may gain these caps"
> fI := "this is an upper bound on pI"
> 
> In other words, if a process is extra-untrusted (e.g. it's a daemon
> that never needs a certain capability and has no business trying
> to gain it), it can drop it from pI.  Then it cannot try to abuse
> programs with pP>0.  The setuid override is just added paranoia.
> Another benefit is that it allows a securelevel-like scheme, where
> even root isn't quite trusted.
> 
> I suppose it might be inappropriate to steal this field like this,
> given that IRIX already has a (somewhat reasonable) use for it. I
> have no problem implementing IRIX-style capabilities and (if there
> is enough interest) adding a _fourth_ process field pM (process
> maximum capabilities) that does what my pI does.
> 
> As for the fE mask, I just don't see the point, although I _really_
> don't like the way it's described in the IRIX manpage.
> 
> IRIX has pE = pP & fE.  This breaks Posix non-capabilities
> compatibility for a program that's uid==0, euid!=0.  It should
> have pE==0 and pP>0.  But it calls exec() and gets pE>0.  This
> is bad.
> 
> Assuming there's something else there to fix that case,
> then I still don't see the point.  If a program is capability-
> aware, it can set its pE however it likes.  If not, then it probably
> expects pE==pP.  I guess there could be a trusted but dumb program
> that runs a trusted, cap-aware helper that needs capabilities.
> Then the admin sets fE==0 on the dump program and everything works.
> Seems a bit contrived, though.
> 
> On the other hand, I'm not wedded to my model of pE.  It'll be harder
> to get IRIX's right for uid!=euid.
> 
> CAP_SYS_PTRACE:
> 
> >>>>@@ -36,6 +41,11 @@
> >>>>int cap_ptrace (struct task_struct *parent, struct task_struct *child)
> >>>>{
> >>>>	/* Derived from arch/i386/kernel/ptrace.c:sys_ptrace. */
> >>>>+	/* CAP_SYS_PTRACE still can't bypass inheritable restrictions */
> >>>>+	if (newcaps &&
> >>>>+	    !cap_issubset (child->cap_inheritable, current->cap_inheritable))
> >>>>+		return -EPERM;
> >>>
> >>>
> >>>Why no capable() override?  In fact, is this check really necessary?
> >>
> >>If task A has less inheritable caps than B, then A is somehow less trusted
> >>and has no business tracing B.
> >
> >
> > But it's not less.  It's just not a subset.  Task B could have only one
> > inheritable cap, while A could have all but the one cap that B has.  In
> > fact, that could be CAP_SYS_PTRACE.  So the threat is task A tracing B,
> > and using it to pass on capabilities that it wasn't allowed to pass on.
> > Which is what the permitted test was for before, and what CAP_SYS_PTRACE
> > was used to override.
> >
> >
> >>A concrete example: a system runs with very restricted inheritable caps
> >>on all processes except for a magic daemon.  The magic daemon holds on
> >>to CAP_SYS_ADMIN to umount everything at shutdown.  If the rest of the
> >>system gets rooted, it still shouldn't be possible to trace the daemon.
> >>(Yes, this is currently not workable -- I plan to add a sysctl that sets
> >>what inheritable caps a task must have for setuid to work.  The blanket
> >>requirement that _all_ must be present is to avoid bugs in which a
> >>setuid program assumes it will be fully privileged.)
> >
> > I suppose this eliminates the usefulness of CAP_SYS_PTRACE.
> 
> It lets one uid/gid trace another.  If CAP_SYS_PTRACE allowed a process
> to arbitrarity steal another's capabilities, then the process with
> CAP_SYS_PTRACE might as well have been given those capabilities.  That is,
> this should IMHO be disallowed

Yeah, this is just another example of capabilities being easily used
to leverage greater caps.  Others are:  CAP_SETUID, CAP_DAC_OVERRIDE,
CAP_SYS_MODULE, CAP_SYS_RAWIO...

> drop_all_but_CAP_SYS_PTRACE()
> exec(slightly trusted debugger process)
> ptrace(1) <--- but it was supposed to be "slightly trusted"!
> 
> So:
> 
> Should I redo this to keep IRIX's meaning of fI, should I keep mine,
> or should I do something else.  Chris -- in your examples, you seem
> to have some idea of what should be happening.  What do you think?

I'd like to see smallest possible change to allow exeve() to be more
fine grained rather than 0 or ~0 caps depending on setuid-root.  I think
we can work with existing meaning of Inh, Eff, Perm and find a way to
smarten up the transition rules slightly.  My patch is trivial, although
there's probably some cases where it's still unsafe.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 18:06     ` Stephen Smalley
@ 2004-05-14 17:32       ` Albert Cahalan
  2004-05-14 21:11         ` Chris Wright
  0 siblings, 1 reply; 43+ messages in thread
From: Albert Cahalan @ 2004-05-14 17:32 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Albert Cahalan, linux-kernel mailing list, luto, Chris Wright,
	olaf+list.linux-kernel, Valdis.Kletnieks

On Fri, 2004-05-14 at 14:06, Stephen Smalley wrote:
> On Fri, 2004-05-14 at 11:19, Albert Cahalan wrote:
> > I just read that. It's a very unfair marketing document.
> > Among other things, it suggests that a capability system
> > is stuck with about 40 bits while their own version of
> > capabilities (a duck is a duck...) has 80 bits. Lovely,
> > but not exactly groundbreaking.
> 
> You missed the point.  Capability scheme maps far too
> many operations to a single capability; CAP_SYS_ADMIN
> in Linux is a good example.

What I said: lovely, but not exactly groundbreaking.

Suppose we break up CAP_SYS_ADMIN into 41 other bits.
There you go. It's silly to argue that a system with
more bits is some kind of major advance over one with
just a few bits. There is a quality-of-implementation
issue here, not some fundamental difference.

> TE model
> defers organization of operations into equivalence
> classes to the policy writer.

I don't see anything special here either. With a
plain capability-bit system, you could allow for
user-defined aliases that map to multiple bits.
In some random /etc config file:

define ADMIN := FOO | BAR | BAZ

> >  There is the bit about
> > a 3-argument security call, but a careful reading will
> > reveal that one argument is unused (NULL?) when dealing
> > with abilities like "can set the clock".
> 
> But very useful when dealing with CAP_DAC_OVERRIDE and friends.

I suppose so, but that isn't the interesting case.
We already have SELinux for that kind of thing.

> > About the only thing of interest is that capability
> > transitions can be arbitrary. You're not limited to
> > an obscure set of equations that nobody can agree on.
> 
> Because there is no one-size-fits-all equation for all transitions.
> 
> > The cost: complicated site-specific config files and
> > the inability to support capability-aware apps that
> > set+clear their own bits.
> 
> Complexity pushed to userspace, where it can be analyzed appropriately
> via tools and tailored for the transition in question.  Central
> management of the capabilities based on equivalence classes (types), as
> opposed to having to manage a distributed nightmare of capability bits
> on your filesystems.  Applications that can transition to other domains,
> but only if so authorized by the policy.

Complexity is pushed to admins, most of which are less
clueful than one might hope.

> > Eh, why? That's mostly a search-and-replace on the name,
> > since capable() makes a perfectly fine LSM hook.
> 
> It doesn't offer sufficient granularity, either for operation (which you
> _could_ address by introducing new capabilities, but the hooks are more
> easily extensible) or for object.

SELinux does, I hope, already deal with anything that
would involve an "object". So it is of little concern
what CAP_DAC_OVERRIDE does. What about setting the clock,
hogging memory, using the FIFO scheduler, and so on?

Lack of granularity is an implementation detail.
(Blame the SGI folks that wouldn't listen to me.)
Lack of granularity is not a design flaw.

> > So what about the old-Oracle problem? You have a
> > server that needs the ability to hog and lock memory.
> > Is there an almost-empty SELinux policy that would
> > provide this while leaving the rest of the system
> > acting as UNIX-like systems have always acted?
> > If so, we have a winner.
> 
> See "relaxed policy" or "targeted policy" in recent discussions on the
> Fedora mailing lists; coming soon to a rawhide near you.  Not exactly
> the same thing (it is a policy where most processes run essentially
> unconfined except for a targeted set that you define like apache, bind,
> etc), but you could certainly tweak it to this end (i.e. you put oracle
> into a domain that gives it the one capability you desire).

What I'm looking for:

1. configure the kernel by ...
2. ensure that /bin/foo runs early in boot
3. put "blah blah blah" in /etc/foo.conf

That is, is there a small set of simple config files
and binaries that I could just slap onto an existing
system to ensure that a particular app is granted an
extra capability bit?

If yes, then the ugly old-Oracle hack is not needed.

> > One still does need to provide apps with a way to
> > answer "can I do FOO, BAR, and BAZ?" and "am I
> > running with elevated privileges?". Some way to
> > dispose of unneeded privileges would be good too.
> > Hopefully extra libraries wouldn't be needed.
> 
> SELinux provides a policy API already, and a userspace library for
> accessing it (and caching decisions from it).  It also provides (via the
> AT_SECURE auxv entry) notification that a domain transition has
> occurred, and this is already used by glibc secure mode.  Privileges are
> dropped via domain transitions.

I take that as a "no" then, at least from the viewpoint
of a normal app author.



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 12:03 [PATCH] capabilites, take 2 Albert Cahalan
  2004-05-14 14:53 ` Andy Lutomirski
@ 2004-05-14 15:21 ` Stephen Smalley
  2004-05-14 15:19   ` Albert Cahalan
  2004-05-14 18:00   ` Chris Wright
  2004-05-14 17:48 ` Chris Wright
  2 siblings, 2 replies; 43+ messages in thread
From: Stephen Smalley @ 2004-05-14 15:21 UTC (permalink / raw)
  To: Albert Cahalan
  Cc: linux-kernel mailing list, luto, Chris Wright,
	olaf+list.linux-kernel, Valdis.Kletnieks

On Fri, 2004-05-14 at 08:03, Albert Cahalan wrote:
> This would be an excellent time to reconsider how capabilities
> are assigned to bits. You're breaking things anyway; you might
> as well do all the breaking at once. I want local-use bits so
> that the print queue management access isn't by magic UID/GID.
> We haven't escaped UID-as-priv if server apps and setuid apps
> are still making UID-based access control decisions.

Capabilities are a broken model for privilege management; try RBAC/TE
instead.  http://www.securecomputing.com/pdf/secureos.pdf has notes
about the history and comparison of capabilities vs. TE.

Instead of adding new capability bits, replace capable() calls with LSM
hook calls that offer you finer granularity both in operation and in
object-based decisions, and then use a security module like SELinux to
map that to actual permission checks.  SELinux provides a framework for
defining security classes and permissions, including both definitions
used by the kernel and definitions used by userspace policy enforcers
(ala security-enhanced X).

-- 
Stephen Smalley <sds@epoch.ncsc.mil>
National Security Agency


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 15:21 ` Stephen Smalley
@ 2004-05-14 15:19   ` Albert Cahalan
  2004-05-14 18:06     ` Stephen Smalley
  2004-05-14 18:00   ` Chris Wright
  1 sibling, 1 reply; 43+ messages in thread
From: Albert Cahalan @ 2004-05-14 15:19 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Albert Cahalan, linux-kernel mailing list, luto, Chris Wright,
	olaf+list.linux-kernel, Valdis.Kletnieks

On Fri, 2004-05-14 at 11:21, Stephen Smalley wrote:
> On Fri, 2004-05-14 at 08:03, Albert Cahalan wrote:
> > This would be an excellent time to reconsider how capabilities
> > are assigned to bits. You're breaking things anyway; you might
> > as well do all the breaking at once. I want local-use bits so
> > that the print queue management access isn't by magic UID/GID.
> > We haven't escaped UID-as-priv if server apps and setuid apps
> > are still making UID-based access control decisions.
> 
> Capabilities are a broken model for privilege management; try RBAC/TE
> instead.  http://www.securecomputing.com/pdf/secureos.pdf has notes
> about the history and comparison of capabilities vs. TE.

I just read that. It's a very unfair marketing document.
Among other things, it suggests that a capability system
is stuck with about 40 bits while their own version of
capabilities (a duck is a duck...) has 80 bits. Lovely,
but not exactly groundbreaking. There is the bit about
a 3-argument security call, but a careful reading will
reveal that one argument is unused (NULL?) when dealing
with abilities like "can set the clock".

About the only thing of interest is that capability
transitions can be arbitrary. You're not limited to
an obscure set of equations that nobody can agree on.
The cost: complicated site-specific config files and
the inability to support capability-aware apps that
set+clear their own bits.

There is some value in unifying the handling of
capability bits with the handling of actor-to-actee
security controls. This simplifies the documentation,
and thus reduces the likelyhood of admin mistakes.

> Instead of adding new capability bits, replace capable()
> calls with LSM hook calls that offer you finer granularity
> both in operation and in object-based decisions, and then
> use a security module like SELinux to map that to actual
> permission checks.

Eh, why? That's mostly a search-and-replace on the name,
since capable() makes a perfectly fine LSM hook.

> SELinux provides a framework for
> defining security classes and permissions, including both definitions
> used by the kernel and definitions used by userspace policy enforcers
> (ala security-enhanced X).

Nice.

So what about the old-Oracle problem? You have a
server that needs the ability to hog and lock memory.
Is there an almost-empty SELinux policy that would
provide this while leaving the rest of the system
acting as UNIX-like systems have always acted?
If so, we have a winner.

One still does need to provide apps with a way to
answer "can I do FOO, BAR, and BAZ?" and "am I
running with elevated privileges?". Some way to
dispose of unneeded privileges would be good too.
Hopefully extra libraries wouldn't be needed.





^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14 12:03 [PATCH] capabilites, take 2 Albert Cahalan
@ 2004-05-14 14:53 ` Andy Lutomirski
  2004-05-14 17:58   ` Chris Wright
  2004-05-14 15:21 ` Stephen Smalley
  2004-05-14 17:48 ` Chris Wright
  2 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-14 14:53 UTC (permalink / raw)
  To: Albert Cahalan
  Cc: linux-kernel mailing list, chrisw, olaf+list.linux-kernel,
	Valdis.Kletnieks

Albert Cahalan wrote:

> Andy Lutomirski writes:
> 
> 
>>Posix (as interpreted by Linux 2.4/2.6) says:
>>pP' = (fP & cap_bset) | (fI & pI)
>>
>>So (assuming that set_security did the "obvious" (but dangerous) thing):
>>
>>pP := "task may enable and use these capabilities"
>>pE := "task may use these capabilities _now_"
>>pI := "task may gain these on exec from fI"
>>
>>fP := "this program gets these caps (module cap_bset)"
>>fI := "this program gets these caps if pI says so"
>>
>>Which screams "overcomplicated."
> 
> 
> People will screw this up.
> 
> That you should even have to explain the names as you do is an
> indication that the names are poor.
> 
> Having names that match IRIX names but act differently is
> a sure path to disaster via confused admins and developers.
> 
> 
>>I see no particular invariants here, except for pE <= pP.
>>
>>IRIX (thanks Valdis) says:
>>
>>pI' = pI & fI
>>pP' = fP | (pI' & fP)
>>
>>Which I interpret as
>>
>>pP := "task may enable and use these capabilities"
>>pE := "task may use these capabilities _now_"
>>pI := ~"task _loses_ these on exec"
>>
>>fP := "this program gets these caps"
>>fI := "this program may keep these caps"
>>
>>This seems to want pP <= pI as an invariant.
>>
>>This is what I always thought Linux capabilities meant to be.  They
>>don't make my brain hurt.
>>
>>But I also think they're overengineered.  Instead of:
>>
>>drop_caps_from_inheritable()
>>exec()
>>
>>a program could do:
>>
>>drop_caps_from_permitted()
>>exec()
>>
>>And I can't imagine what use fI != ~0 has, since it's trivially
>>accomplished by a wrapper.  It is also trivially bypassed by
>>loading the program manually (with ld.so).
> 
> 
> Good point. Even if an exec-only executable would block this,
> nearly all admins will fail to mark it as such.
> 
> 
>>So, in my patch, I decided steal the inheritable mask to mean this:
>>
>>pI := "this process may gain these caps"
>>fI := "this is an upper bound on pI"
>>
>>In other words, if a process is extra-untrusted (e.g. it's a daemon
>>that never needs a certain capability and has no business trying
>>to gain it), it can drop it from pI.  Then it cannot try to abuse
>>programs with pP>0.  The setuid override is just added paranoia.
>>Another benefit is that it allows a securelevel-like scheme, where
>>even root isn't quite trusted.
>>
>>I suppose it might be inappropriate to steal this field like this,
>>given that IRIX already has a (somewhat reasonable) use for it. I
>>have no problem implementing IRIX-style capabilities and (if there
>>is enough interest) adding a _fourth_ process field pM (process
>>maximum capabilities) that does what my pI does.
> 
> 
> A few mostly-unrelated thoughts:
> 
> Rather than adding a compile-time option or boot option, simply
> change the syscall numbers and /proc/*/status names. This will
> cause any existing capability-aware tools to act as if being
> run on a pre-capability Linux kernel. This seems to be safer
> than allowing these tools to assume that nothing has changed.

That sounds nasty to get right.  What about just simulating this
for the benefit of old tools:

pP = (what it really is)
pE = (what it really is -- but it will mostly be pP)
pI = 0 (or full -- anything else is confusing)

Then capset for on other pids (which currently needs CAP_SETPCAP)
goes away completely, since, on stock Linux, that code is
unreachable anyway.

I think this emulates the current linux caps quite nicely.  Unless
we add lots more caps, in which case we'd have to make a guess at
what pP is.

Now this code becomes complicated (code-wise, not just conceptually :)

> 
> Allow me to mark an executable with a set of capabilities that
> must be all set or all unset. Default to ~0ull for setuid apps,
> and to 0ull for all other apps.
> Like this:
>    if ((pFOO & fBAR) != fBAR)    pFOO &= ~fBAR;

Once again bypassable by ld.so.  What about
if ((pM' & fBAR) != fBAR) ignore fP and setuid
(BINPRM_SEC_NO_ELEVATE)?

I'll call it fR (for required).  The default would be:
setuid-root: fR = full
setuid-nonroot and setgid: fR = CAP_SETUID

I check pM' not pP' because in the normal case these capabilities
were in fP anyway (except for setgid/setuid-nonroot, in which case
the admin probably doesn't want the low-pM task changing its uid).
If they weren't, then IMHO it's the job of the task with elevated
caps to be careful of what it runs rather than the filesystem.

So we get:

fM: maximum caps for this app and children (default full)
fP: app gains these (subject to fM and pM)
fR: as above

pM: maximum caps for this task and children (full for init)
pP: the usual
pE: the usual

> 
> Before writing the kernel code, how about writing documentation
> for admins, software developers, and Linux vendors? Until clear
> and readable documentation exists, this is all just a hazard.
> You might even make a mistake in the kernel code if it isn't
> easy for a non-kernel hacker to review how things will interact
> with their software. Just listing all the cases that need to
> be reviewed would be an improvement.

Good call.  I'll do that.

> 
> This would be an excellent time to reconsider how capabilities
> are assigned to bits. You're breaking things anyway; you might
> as well do all the breaking at once. I want local-use bits so
> that the print queue management access isn't by magic UID/GID.
> We haven't escaped UID-as-priv if server apps and setuid apps
> are still making UID-based access control decisions.
> 

How many bits?  Or should it even be a bitmask?

I'm thinking either 64 or 128 for kernel-defined caps and either
a seperate 128 bits or more or just a list for local-defined.

Then local caps could live in /etc (i don't like it) or /proc (not so
nice either) for collision-avoidance.

Or even numbered bits for kernel use and _names_ for user use.
You'd do sys_cap_register('printquota') and then that becomes
a legal name of a capability.  Then you need sys_cap_unregister()
to atomically remove it from all tasks.

I'll think about how my maximum mask fits in to all this.


Thanks,
Andy

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
@ 2004-05-14 12:03 Albert Cahalan
  2004-05-14 14:53 ` Andy Lutomirski
                   ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Albert Cahalan @ 2004-05-14 12:03 UTC (permalink / raw)
  To: linux-kernel mailing list
  Cc: luto, chrisw, olaf+list.linux-kernel, Valdis.Kletnieks

Andy Lutomirski writes:

> Posix (as interpreted by Linux 2.4/2.6) says:
> pP' = (fP & cap_bset) | (fI & pI)
>
> So (assuming that set_security did the "obvious" (but dangerous) thing):
>
> pP := "task may enable and use these capabilities"
> pE := "task may use these capabilities _now_"
> pI := "task may gain these on exec from fI"
>
> fP := "this program gets these caps (module cap_bset)"
> fI := "this program gets these caps if pI says so"
>
> Which screams "overcomplicated."

People will screw this up.

That you should even have to explain the names as you do is an
indication that the names are poor.

Having names that match IRIX names but act differently is
a sure path to disaster via confused admins and developers.

> I see no particular invariants here, except for pE <= pP.
>
> IRIX (thanks Valdis) says:
>
> pI' = pI & fI
> pP' = fP | (pI' & fP)
>
> Which I interpret as
>
> pP := "task may enable and use these capabilities"
> pE := "task may use these capabilities _now_"
> pI := ~"task _loses_ these on exec"
>
> fP := "this program gets these caps"
> fI := "this program may keep these caps"
>
> This seems to want pP <= pI as an invariant.
>
> This is what I always thought Linux capabilities meant to be.  They
> don't make my brain hurt.
>
> But I also think they're overengineered.  Instead of:
>
> drop_caps_from_inheritable()
> exec()
>
> a program could do:
>
> drop_caps_from_permitted()
> exec()
>
> And I can't imagine what use fI != ~0 has, since it's trivially
> accomplished by a wrapper.  It is also trivially bypassed by
> loading the program manually (with ld.so).

Good point. Even if an exec-only executable would block this,
nearly all admins will fail to mark it as such.

> So, in my patch, I decided steal the inheritable mask to mean this:
>
> pI := "this process may gain these caps"
> fI := "this is an upper bound on pI"
>
> In other words, if a process is extra-untrusted (e.g. it's a daemon
> that never needs a certain capability and has no business trying
> to gain it), it can drop it from pI.  Then it cannot try to abuse
> programs with pP>0.  The setuid override is just added paranoia.
> Another benefit is that it allows a securelevel-like scheme, where
> even root isn't quite trusted.
>
> I suppose it might be inappropriate to steal this field like this,
> given that IRIX already has a (somewhat reasonable) use for it. I
> have no problem implementing IRIX-style capabilities and (if there
> is enough interest) adding a _fourth_ process field pM (process
> maximum capabilities) that does what my pI does.

A few mostly-unrelated thoughts:

Rather than adding a compile-time option or boot option, simply
change the syscall numbers and /proc/*/status names. This will
cause any existing capability-aware tools to act as if being
run on a pre-capability Linux kernel. This seems to be safer
than allowing these tools to assume that nothing has changed.

Allow me to mark an executable with a set of capabilities that
must be all set or all unset. Default to ~0ull for setuid apps,
and to 0ull for all other apps.
Like this:
   if ((pFOO & fBAR) != fBAR)    pFOO &= ~fBAR;

Before writing the kernel code, how about writing documentation
for admins, software developers, and Linux vendors? Until clear
and readable documentation exists, this is all just a hazard.
You might even make a mistake in the kernel code if it isn't
easy for a non-kernel hacker to review how things will interact
with their software. Just listing all the cases that need to
be reviewed would be an improvement.

This would be an excellent time to reconsider how capabilities
are assigned to bits. You're breaking things anyway; you might
as well do all the breaking at once. I want local-use bits so
that the print queue management access isn't by magic UID/GID.
We haven't escaped UID-as-priv if server apps and setuid apps
are still making UID-based access control decisions.




^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14  6:04       ` Valdis.Kletnieks
@ 2004-05-14  7:09         ` Olaf Dietsche
  0 siblings, 0 replies; 43+ messages in thread
From: Olaf Dietsche @ 2004-05-14  7:09 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Chris Wright, Andy Lutomirski, akpm, linux-kernel

Valdis.Kletnieks@vt.edu writes:

> On Fri, 14 May 2004 07:33:32 +0200, Olaf Dietsche said:
>
>> Seems like you're not aware of:
>> <http://www.olafdietsche.de/linux/capability/>
>> 
>> This supports filesystem capabilities with the current (POSIX?)
>> implementation.

This refers to the linux kernel main line.

> Yes.. I was aware of that.. and I just visited it.. and the VERY TOP it says:
>
> "Filesystem capabilities for linux
>
> This implementation is likely *not* POSIX compatible."

This refers to the tools I provide. I should emphasize this on the
page, thank you. My patch doesn't change the rules, how the capability
bits are mingled.

> Now who should I believe, you or the author of the page? :)

Alright, I deserve this for being imprecise. :)

Regards, Olaf.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14  2:27   ` Andy Lutomirski
  2004-05-14  4:48     ` Chris Wright
@ 2004-05-14  6:39     ` Olaf Dietsche
  1 sibling, 0 replies; 43+ messages in thread
From: Olaf Dietsche @ 2004-05-14  6:39 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Chris Wright, akpm, linux-kernel

Andy Lutomirski <luto@myrealbox.com> writes:

> I'm not convinced that Posix's version makes any sense.  Also, there are
> apparently a number of drafts around which disagree on what the right
> rules are.  (My copy, for example, matches the old rules exactly, but
> the old rules caused the sendmail problem.)

Don't confuse POSIX _semantics_ with implementation _bugs_.

>  And, under Posix, what does
> the inheritable mask mean, anyway?
>
> Also, I don't find the posix rules to be useful (why is there an
> inheritable mask if all it does is to cause caps to be dropped on
> exec, when the user could just manually drop them?).

You can use the inheritable set, if you want to give capabilities to a
process when it's started by an already priviledged parent (e.g. a
root process), but not when it's started by a regular user.

See <http://www.olafdietsche.de/linux/capability/> for an example.

Regards, Olaf.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2 
  2004-05-14  5:33     ` Olaf Dietsche
@ 2004-05-14  6:04       ` Valdis.Kletnieks
  2004-05-14  7:09         ` Olaf Dietsche
  0 siblings, 1 reply; 43+ messages in thread
From: Valdis.Kletnieks @ 2004-05-14  6:04 UTC (permalink / raw)
  To: Olaf Dietsche; +Cc: Chris Wright, Andy Lutomirski, akpm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 535 bytes --]

On Fri, 14 May 2004 07:33:32 +0200, Olaf Dietsche said:

> Seems like you're not aware of:
> <http://www.olafdietsche.de/linux/capability/>
> 
> This supports filesystem capabilities with the current (POSIX?)
> implementation. So, whatever Andy has shown, it has at least one
> counter evidence q.e.d.

Yes.. I was aware of that.. and I just visited it.. and the VERY TOP it says:

"Filesystem capabilities for linux

This implementation is likely *not* POSIX compatible."

Now who should I believe, you or the author of the page? :)


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14  4:48     ` Chris Wright
@ 2004-05-14  5:58       ` Andy Lutomirski
  2004-05-14 17:45         ` Chris Wright
  0 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-14  5:58 UTC (permalink / raw)
  To: Chris Wright; +Cc: akpm, linux-kernel, Valdis.Kletnieks

[Pardon my gross butchery and reordering.]

Compatibility (i.e. newcaps=0):

Chris Wright wrote:
 > * Andy Lutomirski (luto@myrealbox.com) wrote:
 >
 >>Chris Wright wrote:
 >>
 >>
 >>>* Andy Lutomirski (luto@myrealbox.com) wrote:

 >>>I think it still needs more work.  Default behavoiur is changed, like
 >>>Inheritble is full rather than clear, setpcap is enabled, etc.  ...
 >>
 >>In cap_bprm_apply_creds_compat:
 >>
 >>+	} else if (!fixed_init) {
 >>+		/* This is not strictly correct, as it gives linuxrc more
 >>+		 * permissions than it used to have.  It was the only way I
 >>+		 * could think of to keep the resulting disaster contained,
 >>+		 * though.
 >>+		 */
 >>+		current->cap_effective = CAP_OLD_INIT_EFF_SET;
 >>+		current->cap_inheritable = CAP_OLD_INIT_INH_SET;
 >>+		fixed_init = 1;
 >>
 >>So that it gets changed back.  Otherwise linuxrc ran without permissions
 >>and my drives never got mounted.  Yah, it's ugly -- I'm open to
 >>suggestions to avoid this.
 >
 >
 > I tested as a module, and this doesn't run AFAICS

OK -- I give in.  I'll redo it as a kernel (non-module) boot parameter.
And touch some more files that have no business being capability-aware
while I'm at it :(


The inheritable mask:

 >>>Also, it breaks my tests which try to drop privs and keep caps across
 >>>execve() which is really the only issue we're trying to solve ATM.
 >>
 >>Can you send me a sample of what breaks?  I do:
 >
 >
 > Yes.  It's something like this:
 >
 > keepcaps
 > dropuid
 > drop caps
 > execve()
 >
 > The caps that are left are like this.  (effective == inheritable) which
 > are a subset of permitted.
 >

Is (eff == inh) what happens or what should happen?  If the former, then
my patch is broken.  If the latter, either I'm confused, or see below.

 >>>why do you change from Posix the way exec() updates capabilities?  Sure,
 >>>there is no filesystem bits present, so this changes the calculation,
 >>>but I'm not convinced it's as secure this way.  At least with newcaps=0.
 >>
 >>I'm not convinced that Posix's version makes any sense.  Also, there are
 >>apparently a number of drafts around which disagree on what the right
 >>rules are.  (My copy, for example, matches the old rules exactly, but
 >>the old rules caused the sendmail problem.)  And, under Posix, what does
 >>the inheritable mask mean, anyway?
 >>
 >>Also, I don't find the posix rules to be useful (why is there an
 >>inheritable mask if all it does is to cause caps to be dropped on
 >>exec, when the user could just manually drop them?).
 >
 >
 > Not sure if it's defensible, but it allows passing on an inheritable
 > capability through an intermediate process that simply can't inherit
 > that capability.  This is not unlike requiring an unprivileged process
 > to ask a privileged process for it to do something on it's behalf.
 > Certainly it's implicit that you trust the privileged process.
 >

[and:]
 >>>>+	/* Pretend we have VFS capabilities */
 >>>>+	cap_set_full(bprm->cap_inheritable);
 >>>
 >>>
 >>>This looks sketchy.
 >>
 >>My concept of 'inheritable' is that caps that are _not_ inheritable
 >>may never be gained by this task or its children.  So a process
 >>should normally have all caps inheritable.
 >
 >
 > This is the diff with Posix, which allows a process to inherit a
 > capability that it can never exercise.  However, it could pass the
 > capablity on to someone else who could inherit it.
 >
 > <snip>
 >

So here's where I think we disagree:

Posix (as interpreted by Linux 2.4/2.6) says:
pP' = (fP & cap_bset) | (fI & pI)

So (assuming that set_security did the "obvious" (but dangerous) thing):

pP := "task may enable and use these capabilities"
pE := "task may use these capabilities _now_"
pI := "task may gain these on exec from fI"

fP := "this program gets these caps (module cap_bset)"
fI := "this program gets these caps if pI says so"

Which screams "overcomplicated."  I imagine that the use is for a
trusted daemon to run an untrusted helper (with pI>0) which then
calls back into trusted land and gets its caps back.  This is
possibly convenient, but it seems to break (where break = scare me)
when there are more than one such system on a given box.  Then one's
untrusted program (with fI>0) can call the other's trusted fI>0
helper.  I suppose the point is that a random user's program (with fI==0)
can't try to exploit anything, but, for this to be at all secure, both
fI>0 programs need to be secure against attack from the other (unrelated)
system, should it be compromised.  Which means it might as well have set
fP>0 and been done with it (I don't believe in security by inconvenience
of exploit).

I see no particular invariants here, except for pE <= pP.

IRIX (thanks Valdis) says:

pI' = pI & fI
pP' = fP | (pI' & fP)

Which I interpret as

pP := "task may enable and use these capabilities"
pE := "task may use these capabilities _now_"
pI := ~"task _loses_ these on exec"

fP := "this program gets these caps"
fI := "this program may keep these caps"

This seems to want pP <= pI as an invariant.

This is what I always thought Linux capabilities meant to be.  They
don't make my brain hurt.

But I also think they're overengineered.  Instead of:

drop_caps_from_inheritable()
exec()

a program could do:

drop_caps_from_permitted()
exec()

And I can't imagine what use fI != ~0 has, since it's trivially
accomplished by a wrapper.  It is also trivially bypassed by
loading the program manually (with ld.so).

So, in my patch, I decided steal the inheritable mask to mean this:

pI := "this process may gain these caps"
fI := "this is an upper bound on pI"

In other words, if a process is extra-untrusted (e.g. it's a daemon
that never needs a certain capability and has no business trying
to gain it), it can drop it from pI.  Then it cannot try to abuse
programs with pP>0.  The setuid override is just added paranoia.
Another benefit is that it allows a securelevel-like scheme, where
even root isn't quite trusted.

I suppose it might be inappropriate to steal this field like this,
given that IRIX already has a (somewhat reasonable) use for it. I
have no problem implementing IRIX-style capabilities and (if there
is enough interest) adding a _fourth_ process field pM (process
maximum capabilities) that does what my pI does.

As for the fE mask, I just don't see the point, although I _really_
don't like the way it's described in the IRIX manpage.

IRIX has pE = pP & fE.  This breaks Posix non-capabilities
compatibility for a program that's uid==0, euid!=0.  It should
have pE==0 and pP>0.  But it calls exec() and gets pE>0.  This
is bad.

Assuming there's something else there to fix that case,
then I still don't see the point.  If a program is capability-
aware, it can set its pE however it likes.  If not, then it probably
expects pE==pP.  I guess there could be a trusted but dumb program
that runs a trusted, cap-aware helper that needs capabilities.
Then the admin sets fE==0 on the dump program and everything works.
Seems a bit contrived, though.

On the other hand, I'm not wedded to my model of pE.  It'll be harder
to get IRIX's right for uid!=euid.

CAP_SYS_PTRACE:

 >>>>@@ -36,6 +41,11 @@
 >>>>int cap_ptrace (struct task_struct *parent, struct task_struct *child)
 >>>>{
 >>>>	/* Derived from arch/i386/kernel/ptrace.c:sys_ptrace. */
 >>>>+	/* CAP_SYS_PTRACE still can't bypass inheritable restrictions */
 >>>>+	if (newcaps &&
 >>>>+	    !cap_issubset (child->cap_inheritable, current->cap_inheritable))
 >>>>+		return -EPERM;
 >>>
 >>>
 >>>Why no capable() override?  In fact, is this check really necessary?
 >>
 >>If task A has less inheritable caps than B, then A is somehow less trusted
 >>and has no business tracing B.
 >
 >
 > But it's not less.  It's just not a subset.  Task B could have only one
 > inheritable cap, while A could have all but the one cap that B has.  In
 > fact, that could be CAP_SYS_PTRACE.  So the threat is task A tracing B,
 > and using it to pass on capabilities that it wasn't allowed to pass on.
 > Which is what the permitted test was for before, and what CAP_SYS_PTRACE
 > was used to override.
 >
 >
 >>A concrete example: a system runs with very restricted inheritable caps
 >>on all processes except for a magic daemon.  The magic daemon holds on
 >>to CAP_SYS_ADMIN to umount everything at shutdown.  If the rest of the
 >>system gets rooted, it still shouldn't be possible to trace the daemon.
 >>(Yes, this is currently not workable -- I plan to add a sysctl that sets
 >>what inheritable caps a task must have for setuid to work.  The blanket
 >>requirement that _all_ must be present is to avoid bugs in which a
 >>setuid program assumes it will be fully privileged.)
 >
 >
 > I suppose this eliminates the usefulness of CAP_SYS_PTRACE.
 >
 >

It lets one uid/gid trace another.  If CAP_SYS_PTRACE allowed a process
to arbitrarity steal another's capabilities, then the process with
CAP_SYS_PTRACE might as well have been given those capabilities.  That is,
this should IMHO be disallowed

drop_all_but_CAP_SYS_PTRACE()
exec(slightly trusted debugger process)
ptrace(1) <--- but it was supposed to be "slightly trusted"!



So:

Should I redo this to keep IRIX's meaning of fI, should I keep mine,
or should I do something else.  Chris -- in your examples, you seem
to have some idea of what should be happening.  What do you think?

--Andy

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14  1:35   ` Valdis.Kletnieks
  2004-05-14  4:51     ` Chris Wright
@ 2004-05-14  5:33     ` Olaf Dietsche
  2004-05-14  6:04       ` Valdis.Kletnieks
  1 sibling, 1 reply; 43+ messages in thread
From: Olaf Dietsche @ 2004-05-14  5:33 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Chris Wright, Andy Lutomirski, akpm, linux-kernel

Valdis.Kletnieks@vt.edu writes:

> On Thu, 13 May 2004 18:20:10 PDT, Chris Wright said:
>
>> I think it still needs more work.  Default behavoiur is changed, like
>> Inheritble is full rather than clear, setpcap is enabled, etc.  Also,
>> why do you change from Posix the way exec() updates capabilities?  Sure,
>> there is no filesystem bits present, so this changes the calculation,
>> but I'm not convinced it's as secure this way.  At least with newcaps=0.
>
> The last time the "capabilities" thread reared its head a while ago, Andy made
> a posting that pretty conclusively showed that the Posix way was totally b0rken
> if you ever intended to support filesystem bits.  So if you wanted to ever have
> a snowball's chance of supporting something like:
>
> chcap cap_net_raw+ep /bin/ping

Seems like you're not aware of:
<http://www.olafdietsche.de/linux/capability/>

This supports filesystem capabilities with the current (POSIX?)
implementation. So, whatever Andy has shown, it has at least one
counter evidence q.e.d.

> 2) Toss all the filesystems capabilities support out the window.

I agree to disagree ;-)

Regards, Olaf.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14  1:35   ` Valdis.Kletnieks
@ 2004-05-14  4:51     ` Chris Wright
  2004-05-14  5:33     ` Olaf Dietsche
  1 sibling, 0 replies; 43+ messages in thread
From: Chris Wright @ 2004-05-14  4:51 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Chris Wright, Andy Lutomirski, akpm, linux-kernel

* Valdis.Kletnieks@vt.edu (Valdis.Kletnieks@vt.edu) wrote:
> On Thu, 13 May 2004 18:20:10 PDT, Chris Wright said:
> 
> > I think it still needs more work.  Default behavoiur is changed, like
> > Inheritble is full rather than clear, setpcap is enabled, etc.  Also,
> > why do you change from Posix the way exec() updates capabilities?  Sure,
> > there is no filesystem bits present, so this changes the calculation,
> > but I'm not convinced it's as secure this way.  At least with newcaps=0.
> 
> The last time the "capabilities" thread reared its head a while ago, Andy made
> a posting that pretty conclusively showed that the Posix way was totally b0rken
> if you ever intended to support filesystem bits.  So if you wanted to ever have
> a snowball's chance of supporting something like:
> 
> chcap cap_net_raw+ep /bin/ping

It's still not clear that's what we want.  For now, just being able to have
the sucap equiv to sudo would be nice.  And it's very uncomfortable to
change mainline in subtle ways that could break security during stable
series.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14  2:27   ` Andy Lutomirski
@ 2004-05-14  4:48     ` Chris Wright
  2004-05-14  5:58       ` Andy Lutomirski
  2004-05-14  6:39     ` Olaf Dietsche
  1 sibling, 1 reply; 43+ messages in thread
From: Chris Wright @ 2004-05-14  4:48 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Chris Wright, akpm, linux-kernel

* Andy Lutomirski (luto@myrealbox.com) wrote:
> Chris Wright wrote:
> 
> > * Andy Lutomirski (luto@myrealbox.com) wrote:
> > 
> >>Implement optional working capability support.  Try to avoid giving Andrew
> >>a heart attack. ;)
> > 
> > I think it still needs more work.  Default behavoiur is changed, like
> > Inheritble is full rather than clear, setpcap is enabled, etc.  ...
> 
> In cap_bprm_apply_creds_compat:
> 
> +	} else if (!fixed_init) {
> +		/* This is not strictly correct, as it gives linuxrc more
> +		 * permissions than it used to have.  It was the only way I
> +		 * could think of to keep the resulting disaster contained,
> +		 * though.
> +		 */
> +		current->cap_effective = CAP_OLD_INIT_EFF_SET;
> +		current->cap_inheritable = CAP_OLD_INIT_INH_SET;
> +		fixed_init = 1;
> 
> So that it gets changed back.  Otherwise linuxrc ran without permissions
> and my drives never got mounted.  Yah, it's ugly -- I'm open to
> suggestions to avoid this.

I tested as a module, and this doesn't run AFAICS

>  > ... Also,
> > why do you change from Posix the way exec() updates capabilities?  Sure,
> > there is no filesystem bits present, so this changes the calculation,
> > but I'm not convinced it's as secure this way.  At least with newcaps=0.
> 
> I'm not convinced that Posix's version makes any sense.  Also, there are
> apparently a number of drafts around which disagree on what the right
> rules are.  (My copy, for example, matches the old rules exactly, but
> the old rules caused the sendmail problem.)  And, under Posix, what does
> the inheritable mask mean, anyway?
> 
> Also, I don't find the posix rules to be useful (why is there an
> inheritable mask if all it does is to cause caps to be dropped on
> exec, when the user could just manually drop them?).

Not sure if it's defensible, but it allows passing on an inheritable
capability through an intermediate process that simply can't inherit
that capability.  This is not unlike requiring an unprivileged process
to ask a privileged process for it to do something on it's behalf.
Certainly it's implicit that you trust the privileged process.

> > I believe we can get something functional with fewer changes, hence
> > easier to understand the ramifications.  In a nutshell, I'm still not
> > comfortable with this.
> 
> I'll play with it, but I think this is the shortest patch I've come up
> with.  I'll admit that touching this stuff scares me too, but I'd rather
> redo it that try and patch it over again.

Yeah, the subtleties are unnerving.

> > Also, it breaks my tests which try to drop privs and keep caps across
> > execve() which is really the only issue we're trying to solve ATM.
> 
> Can you send me a sample of what breaks?  I do:

Yes.  It's something like this:

keepcaps
dropuid
drop caps
execve()

The caps that are left are like this.  (effective == inheritable) which
are a subset of permitted.

> [root@luto tmp]# cap -c = ls /home/andy
> ls: /home/andy: Permission denied
> [root@luto tmp]# echo test >foo
> [root@luto tmp]# chmod 700 foo
> [root@luto tmp]# su andy -c 'cat foo'
> cat: foo: Permission denied
> [root@luto tmp]# cap -c '= cap_dac_read_search=eip' -u andy cat foo
> test
> [root@luto tmp]# cap -c '= cap_dac_read_search=eip' -u andy bash
> [andy@luto tmp]$ whoami
> andy
> [andy@luto tmp]$ dumpcap
>          Real        Eff
> User    500         500
> Group   500         500
> 
> Caps: = cap_dac_read_search+eip
> [andy@luto tmp]$ cat foo
> test
> 
> Which looks exactly right to me.
> (cap and dumpcap live at www.stanford.edu/~luto/cap/)
> 
> >>--- linux-2.6.6-mm2/fs/exec.c~caps	2004-05-13 11:42:26.000000000 -0700
> >>+++ linux-2.6.6-mm2/fs/exec.c	2004-05-13 12:15:20.000000000 -0700
> >>@@ -882,8 +882,10 @@
> >> 
> >> 	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
> >> 		/* Set-uid? */
> >>-		if (mode & S_ISUID)
> >>+		if (mode & S_ISUID) {
> >> 			bprm->e_uid = inode->i_uid;
> >>+			bprm->secflags |= BINPRM_SEC_SETUID;
> >>+		}
> >> 
> >> 		/* Set-gid? */
> >> 		/*
> >>@@ -891,10 +893,19 @@
> >> 		 * is a candidate for mandatory locking, not a setgid
> >> 		 * executable.
> >> 		 */
> >>-		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
> >>+		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
> >> 			bprm->e_gid = inode->i_gid;
> >>+			bprm->secflags |= BINPRM_SEC_SETGID;
> >>+		}
> >> 	}
> >> 
> >>+	/* Pretend we have VFS capabilities */
> >>+	cap_set_full(bprm->cap_inheritable);
> > 
> > 
> > This looks sketchy.
> 
> My concept of 'inheritable' is that caps that are _not_ inheritable
> may never be gained by this task or its children.  So a process
> should normally have all caps inheritable.

This is the diff with Posix, which allows a process to inherit a
capability that it can never exercise.  However, it could pass the
capablity on to someone else who could inherit it.

<snip>
> >>@@ -36,6 +41,11 @@
> >> int cap_ptrace (struct task_struct *parent, struct task_struct *child)
> >> {
> >> 	/* Derived from arch/i386/kernel/ptrace.c:sys_ptrace. */
> >>+	/* CAP_SYS_PTRACE still can't bypass inheritable restrictions */
> >>+	if (newcaps &&
> >>+	    !cap_issubset (child->cap_inheritable, current->cap_inheritable))
> >>+		return -EPERM;
> > 
> > 
> > Why no capable() override?  In fact, is this check really necessary?
> 
> If task A has less inheritable caps than B, then A is somehow less trusted
> and has no business tracing B.

But it's not less.  It's just not a subset.  Task B could have only one
inheritable cap, while A could have all but the one cap that B has.  In
fact, that could be CAP_SYS_PTRACE.  So the threat is task A tracing B,
and using it to pass on capabilities that it wasn't allowed to pass on.
Which is what the permitted test was for before, and what CAP_SYS_PTRACE
was used to override.

> A concrete example: a system runs with very restricted inheritable caps
> on all processes except for a magic daemon.  The magic daemon holds on
> to CAP_SYS_ADMIN to umount everything at shutdown.  If the rest of the
> system gets rooted, it still shouldn't be possible to trace the daemon.
> (Yes, this is currently not workable -- I plan to add a sysctl that sets
> what inheritable caps a task must have for setuid to work.  The blanket
> requirement that _all_ must be present is to avoid bugs in which a
> setuid program assumes it will be fully privileged.)

I suppose this eliminates the usefulness of CAP_SYS_PTRACE.

> >> 	if (!cap_issubset (child->cap_permitted, current->cap_permitted) &&
> >> 	    !capable (CAP_SYS_PTRACE))
> >> 		return -EPERM;
> >>@@ -76,6 +86,11 @@
> >> 		return -EPERM;
> >> 	}
> >> 
> >>+	/* verify the _new_Permitted_ is a subset of the _new_Inheritable_ */
> >>+	if (newcaps && !cap_issubset (*permitted, *inheritable)) {
> >>+		return -EPERM;
> >>+	}

This is what breaks my code.  I specifcally test permitting the parent to
grab back some caps, but never give them away.

> >> 	return 0;
> >> }
> >> 
> >>@@ -115,10 +132,11 @@
> >> 	return 0;
> >> }
> >> 
> >>-void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
> >>+static void cap_bprm_apply_creds_compat (struct linux_binprm *bprm, int unsafe)
> >> {
> >>-	/* Derived from fs/exec.c:compute_creds. */
> >>+	/* This function will hopefully die in 2.7. */
> >> 	kernel_cap_t new_permitted, working;
> >>+	static int fixed_init = 0;
> >> 
> >> 	new_permitted = cap_intersect (bprm->cap_permitted, cap_bset);
> >> 	working = cap_intersect (bprm->cap_inheritable,
> >>@@ -151,6 +169,15 @@
> >> 		current->cap_permitted = new_permitted;
> >> 		current->cap_effective =
> >> 		    cap_intersect (new_permitted, bprm->cap_effective);
> >>+	} else if (!fixed_init) {
> >>+		/* This is not strictly correct, as it gives linuxrc more
> >>+		 * permissions than it used to have.  It was the only way I
> >>+		 * could think of to keep the resulting disaster contained,
> >>+		 * though.
> >>+		 */
> >>+		current->cap_effective = CAP_OLD_INIT_EFF_SET;
> >>+		current->cap_inheritable = CAP_OLD_INIT_INH_SET;
> >>+		fixed_init = 1;
> > 
> > 
> > Hrm...
> 
> Yup.  It sucks.  I didn't want to touch the system startup code, though,
> and this is the closest LSM comes.  It's too bad there's no LSM hook to do
> this right (although I suppose I could add one, but that would break
> capabilities as a module).

As a module it loosk like.

$ grep ^Cap /proc/1/status
CapInh: 00000000ffffffff
CapPrm: 00000000ffffffff
CapEff: 00000000ffffffff

> Speaking of which, this is a genuine problem if commoncap is a module
> and newcaps=0 -- this code will probably never run.  Does it matter?

Yup, that's the case I tested.  It breaks something that's working
normally now...

> [lots of patch snipped]
> 
> >>--- linux-2.6.6-mm2/include/linux/init_task.h~caps	2004-05-13 11:42:26.000000000 -0700
> >>+++ linux-2.6.6-mm2/include/linux/init_task.h	2004-05-13 11:42:51.000000000 -0700
> >>@@ -92,8 +92,8 @@
> >> 		.function	= it_real_fn				\
> >> 	},								\
> >> 	.group_info	= &init_groups,					\
> >>-	.cap_effective	= CAP_INIT_EFF_SET,				\
> >>-	.cap_inheritable = CAP_INIT_INH_SET,				\
> >>+	.cap_effective	= CAP_FULL_SET,				\
> >>+	.cap_inheritable = CAP_FULL_SET,				\
> > 
> > 
> > This was made unconditional.  And how are you convinced it's safe?
> 
> Same as above.  But even if it were really unconditional (instead
> of just sort-of unconditional), I still think it's safe, because
> (in newcaps=0 mode) only root can get CAP_SETPCAP.  Root
> could always just insert a module to enable it.  So granting it
> costs nothing.
> 
> On the other hand, safety is good, and I can't see any way in the
> old system for anything to get CAP_SETPCAP.  I'll send in a new
> patch that just disables CAP_SETPCAP when newcaps=0.

OK.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-14  1:20 ` Chris Wright
  2004-05-14  1:35   ` Valdis.Kletnieks
@ 2004-05-14  2:27   ` Andy Lutomirski
  2004-05-14  4:48     ` Chris Wright
  2004-05-14  6:39     ` Olaf Dietsche
  1 sibling, 2 replies; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-14  2:27 UTC (permalink / raw)
  To: Chris Wright; +Cc: akpm, linux-kernel

Chris Wright wrote:

> * Andy Lutomirski (luto@myrealbox.com) wrote:
> 
>>Implement optional working capability support.  Try to avoid giving Andrew
>>a heart attack. ;)
> 
> 
> I think it still needs more work.  Default behavoiur is changed, like
> Inheritble is full rather than clear, setpcap is enabled, etc.  ...

In cap_bprm_apply_creds_compat:

+	} else if (!fixed_init) {
+		/* This is not strictly correct, as it gives linuxrc more
+		 * permissions than it used to have.  It was the only way I
+		 * could think of to keep the resulting disaster contained,
+		 * though.
+		 */
+		current->cap_effective = CAP_OLD_INIT_EFF_SET;
+		current->cap_inheritable = CAP_OLD_INIT_INH_SET;
+		fixed_init = 1;

So that it gets changed back.  Otherwise linuxrc ran without permissions
and my drives never got mounted.  Yah, it's ugly -- I'm open to
suggestions to avoid this.

 > ... Also,
> why do you change from Posix the way exec() updates capabilities?  Sure,
> there is no filesystem bits present, so this changes the calculation,
> but I'm not convinced it's as secure this way.  At least with newcaps=0.

I'm not convinced that Posix's version makes any sense.  Also, there are
apparently a number of drafts around which disagree on what the right
rules are.  (My copy, for example, matches the old rules exactly, but
the old rules caused the sendmail problem.)  And, under Posix, what does
the inheritable mask mean, anyway?

Also, I don't find the posix rules to be useful (why is there an
inheritable mask if all it does is to cause caps to be dropped on
exec, when the user could just manually drop them?).

> 
> I believe we can get something functional with fewer changes, hence
> easier to understand the ramifications.  In a nutshell, I'm still not
> comfortable with this.

I'll play with it, but I think this is the shortest patch I've come up
with.  I'll admit that touching this stuff scares me too, but I'd rather
redo it that try and patch it over again.

> 
> Also, it breaks my tests which try to drop privs and keep caps across
> execve() which is really the only issue we're trying to solve ATM.

Can you send me a sample of what breaks?  I do:

[root@luto tmp]# cap -c = ls /home/andy
ls: /home/andy: Permission denied
[root@luto tmp]# echo test >foo
[root@luto tmp]# chmod 700 foo
[root@luto tmp]# su andy -c 'cat foo'
cat: foo: Permission denied
[root@luto tmp]# cap -c '= cap_dac_read_search=eip' -u andy cat foo
test
[root@luto tmp]# cap -c '= cap_dac_read_search=eip' -u andy bash
[andy@luto tmp]$ whoami
andy
[andy@luto tmp]$ dumpcap
         Real        Eff
User    500         500
Group   500         500

Caps: = cap_dac_read_search+eip
[andy@luto tmp]$ cat foo
test

Which looks exactly right to me.
(cap and dumpcap live at www.stanford.edu/~luto/cap/)

>>--- linux-2.6.6-mm2/fs/exec.c~caps	2004-05-13 11:42:26.000000000 -0700
>>+++ linux-2.6.6-mm2/fs/exec.c	2004-05-13 12:15:20.000000000 -0700
>>@@ -882,8 +882,10 @@
>> 
>> 	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
>> 		/* Set-uid? */
>>-		if (mode & S_ISUID)
>>+		if (mode & S_ISUID) {
>> 			bprm->e_uid = inode->i_uid;
>>+			bprm->secflags |= BINPRM_SEC_SETUID;
>>+		}
>> 
>> 		/* Set-gid? */
>> 		/*
>>@@ -891,10 +893,19 @@
>> 		 * is a candidate for mandatory locking, not a setgid
>> 		 * executable.
>> 		 */
>>-		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
>>+		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
>> 			bprm->e_gid = inode->i_gid;
>>+			bprm->secflags |= BINPRM_SEC_SETGID;
>>+		}
>> 	}
>> 
>>+	/* Pretend we have VFS capabilities */
>>+	cap_set_full(bprm->cap_inheritable);
> 
> 
> This looks sketchy.

My concept of 'inheritable' is that caps that are _not_ inheritable
may never be gained by this task or its children.  So a process
should normally have all caps inheritable.

> 
> 
>>+	if((bprm->secflags & BINPRM_SEC_SETUID) && bprm->e_uid == 0)
> 
> 
> CodingStyle:  add space after keyword 'if'

Fixed.

> 	if ((...))
> 
> 
>>+		cap_set_full(bprm->cap_permitted);
>>+	else
>>+		cap_clear(bprm->cap_permitted);
>>+
>> 	/* fill in binprm security blob */
>> 	retval = security_bprm_set(bprm);
>> 	if (retval)
>>@@ -1089,6 +1100,7 @@
>> 	bprm.loader = 0;
>> 	bprm.exec = 0;
>> 	bprm.security = NULL;
>>+	bprm.secflags = 0;
>> 	bprm.mm = mm_alloc();
>> 	retval = -ENOMEM;
>> 	if (!bprm.mm)
>>--- linux-2.6.6-mm2/security/commoncap.c~caps	2004-05-13 11:42:26.000000000 -0700
>>+++ linux-2.6.6-mm2/security/commoncap.c	2004-05-13 12:59:32.934690092 -0700
>>@@ -24,6 +24,11 @@
>> #include <linux/xattr.h>
>> #include <linux/hugetlb.h>
>> 
>>+int newcaps = 0;
> 
> 
> make this:
>   static int newcaps;

Fixed.

> 
> 
>>+
>>+module_param(newcaps, int, 444);
>>+MODULE_PARM_DESC(newcaps, "Set newcaps=1 to enable experimental capabilities");
>>+
>> int cap_capable (struct task_struct *tsk, int cap)
>> {
>> 	/* Derived from include/linux/sched.h:capable. */
>>@@ -36,6 +41,11 @@
>> int cap_ptrace (struct task_struct *parent, struct task_struct *child)
>> {
>> 	/* Derived from arch/i386/kernel/ptrace.c:sys_ptrace. */
>>+	/* CAP_SYS_PTRACE still can't bypass inheritable restrictions */
>>+	if (newcaps &&
>>+	    !cap_issubset (child->cap_inheritable, current->cap_inheritable))
>>+		return -EPERM;
> 
> 
> Why no capable() override?  In fact, is this check really necessary?

If task A has less inheritable caps than B, then A is somehow less trusted
and has no business tracing B.

A concrete example: a system runs with very restricted inheritable caps
on all processes except for a magic daemon.  The magic daemon holds on
to CAP_SYS_ADMIN to umount everything at shutdown.  If the rest of the
system gets rooted, it still shouldn't be possible to trace the daemon.
(Yes, this is currently not workable -- I plan to add a sysctl that sets
what inheritable caps a task must have for setuid to work.  The blanket
requirement that _all_ must be present is to avoid bugs in which a
setuid program assumes it will be fully privileged.)

> 
> 
>>+
>> 	if (!cap_issubset (child->cap_permitted, current->cap_permitted) &&
>> 	    !capable (CAP_SYS_PTRACE))
>> 		return -EPERM;
>>@@ -76,6 +86,11 @@
>> 		return -EPERM;
>> 	}
>> 
>>+	/* verify the _new_Permitted_ is a subset of the _new_Inheritable_ */
>>+	if (newcaps && !cap_issubset (*permitted, *inheritable)) {
>>+		return -EPERM;
>>+	}
>>+
>> 	return 0;
>> }
>> 
>>@@ -89,6 +104,8 @@
>> 
>> int cap_bprm_set_security (struct linux_binprm *bprm)
>> {
>>+	if (newcaps) return 0;
> 
> 
> CodingStyle:
> 	if (newcaps)
> 		return 0;

Fixed.

> 
> 
>>+
>> 	/* Copied from fs/exec.c:prepare_binprm. */
>> 
>> 	/* We don't have VFS support for capabilities yet */
>>@@ -115,10 +132,11 @@
>> 	return 0;
>> }
>> 
>>-void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
>>+static void cap_bprm_apply_creds_compat (struct linux_binprm *bprm, int unsafe)
>> {
>>-	/* Derived from fs/exec.c:compute_creds. */
>>+	/* This function will hopefully die in 2.7. */
>> 	kernel_cap_t new_permitted, working;
>>+	static int fixed_init = 0;
>> 
>> 	new_permitted = cap_intersect (bprm->cap_permitted, cap_bset);
>> 	working = cap_intersect (bprm->cap_inheritable,
>>@@ -151,6 +169,15 @@
>> 		current->cap_permitted = new_permitted;
>> 		current->cap_effective =
>> 		    cap_intersect (new_permitted, bprm->cap_effective);
>>+	} else if (!fixed_init) {
>>+		/* This is not strictly correct, as it gives linuxrc more
>>+		 * permissions than it used to have.  It was the only way I
>>+		 * could think of to keep the resulting disaster contained,
>>+		 * though.
>>+		 */
>>+		current->cap_effective = CAP_OLD_INIT_EFF_SET;
>>+		current->cap_inheritable = CAP_OLD_INIT_INH_SET;
>>+		fixed_init = 1;
> 
> 
> Hrm...

Yup.  It sucks.  I didn't want to touch the system startup code, though,
and this is the closest LSM comes.  It's too bad there's no LSM hook to do
this right (although I suppose I could add one, but that would break
capabilities as a module).

Speaking of which, this is a genuine problem if commoncap is a module
and newcaps=0 -- this code will probably never run.  Does it matter?


[lots of patch snipped]

>>--- linux-2.6.6-mm2/include/linux/init_task.h~caps	2004-05-13 11:42:26.000000000 -0700
>>+++ linux-2.6.6-mm2/include/linux/init_task.h	2004-05-13 11:42:51.000000000 -0700
>>@@ -92,8 +92,8 @@
>> 		.function	= it_real_fn				\
>> 	},								\
>> 	.group_info	= &init_groups,					\
>>-	.cap_effective	= CAP_INIT_EFF_SET,				\
>>-	.cap_inheritable = CAP_INIT_INH_SET,				\
>>+	.cap_effective	= CAP_FULL_SET,				\
>>+	.cap_inheritable = CAP_FULL_SET,				\
> 
> 
> This was made unconditional.  And how are you convinced it's safe?

Same as above.  But even if it were really unconditional (instead
of just sort-of unconditional), I still think it's safe, because
(in newcaps=0 mode) only root can get CAP_SETPCAP.  Root
could always just insert a module to enable it.  So granting it
costs nothing.

On the other hand, safety is good, and I can't see any way in the
old system for anything to get CAP_SETPCAP.  I'll send in a new
patch that just disables CAP_SETPCAP when newcaps=0.


> 
> 
>> 	.cap_permitted	= CAP_FULL_SET,					\
>> 	.keep_capabilities = 0,						\
>> 	.rlim		= INIT_RLIMITS,					\
> 
> 
> 
> thanks,
> -chris


Thanks,
Andy

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2 
  2004-05-14  1:20 ` Chris Wright
@ 2004-05-14  1:35   ` Valdis.Kletnieks
  2004-05-14  4:51     ` Chris Wright
  2004-05-14  5:33     ` Olaf Dietsche
  2004-05-14  2:27   ` Andy Lutomirski
  1 sibling, 2 replies; 43+ messages in thread
From: Valdis.Kletnieks @ 2004-05-14  1:35 UTC (permalink / raw)
  To: Chris Wright; +Cc: Andy Lutomirski, akpm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1157 bytes --]

On Thu, 13 May 2004 18:20:10 PDT, Chris Wright said:

> I think it still needs more work.  Default behavoiur is changed, like
> Inheritble is full rather than clear, setpcap is enabled, etc.  Also,
> why do you change from Posix the way exec() updates capabilities?  Sure,
> there is no filesystem bits present, so this changes the calculation,
> but I'm not convinced it's as secure this way.  At least with newcaps=0.

The last time the "capabilities" thread reared its head a while ago, Andy made
a posting that pretty conclusively showed that the Posix way was totally b0rken
if you ever intended to support filesystem bits.  So if you wanted to ever have
a snowball's chance of supporting something like:

chcap cap_net_raw+ep /bin/ping

so you could get rid of the set-UID bit on 'ping', you had to toss the Posix
propogation rules out the window.  So we need to do either:

1) Toss the Posix out the window
2) Toss all the filesystems capabilities support out the window.

(I'm assuming that a suggestion that we make the choice a Kconfig option will
be met with the sound of many kernel hackers either retching in disgust or
screaming in horror ;)


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH] capabilites, take 2
  2004-05-13 20:08 Andy Lutomirski
@ 2004-05-14  1:20 ` Chris Wright
  2004-05-14  1:35   ` Valdis.Kletnieks
  2004-05-14  2:27   ` Andy Lutomirski
  0 siblings, 2 replies; 43+ messages in thread
From: Chris Wright @ 2004-05-14  1:20 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: akpm, chrisw, linux-kernel

* Andy Lutomirski (luto@myrealbox.com) wrote:
> Implement optional working capability support.  Try to avoid giving Andrew
> a heart attack. ;)

I think it still needs more work.  Default behavoiur is changed, like
Inheritble is full rather than clear, setpcap is enabled, etc.  Also,
why do you change from Posix the way exec() updates capabilities?  Sure,
there is no filesystem bits present, so this changes the calculation,
but I'm not convinced it's as secure this way.  At least with newcaps=0.

I believe we can get something functional with fewer changes, hence
easier to understand the ramifications.  In a nutshell, I'm still not
comfortable with this.

Also, it breaks my tests which try to drop privs and keep caps across
execve() which is really the only issue we're trying to solve ATM.


> --- linux-2.6.6-mm2/fs/exec.c~caps	2004-05-13 11:42:26.000000000 -0700
> +++ linux-2.6.6-mm2/fs/exec.c	2004-05-13 12:15:20.000000000 -0700
> @@ -882,8 +882,10 @@
>  
>  	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
>  		/* Set-uid? */
> -		if (mode & S_ISUID)
> +		if (mode & S_ISUID) {
>  			bprm->e_uid = inode->i_uid;
> +			bprm->secflags |= BINPRM_SEC_SETUID;
> +		}
>  
>  		/* Set-gid? */
>  		/*
> @@ -891,10 +893,19 @@
>  		 * is a candidate for mandatory locking, not a setgid
>  		 * executable.
>  		 */
> -		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
> +		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
>  			bprm->e_gid = inode->i_gid;
> +			bprm->secflags |= BINPRM_SEC_SETGID;
> +		}
>  	}
>  
> +	/* Pretend we have VFS capabilities */
> +	cap_set_full(bprm->cap_inheritable);

This looks sketchy.

> +	if((bprm->secflags & BINPRM_SEC_SETUID) && bprm->e_uid == 0)

CodingStyle:  add space after keyword 'if'
	if ((...))

> +		cap_set_full(bprm->cap_permitted);
> +	else
> +		cap_clear(bprm->cap_permitted);
> +
>  	/* fill in binprm security blob */
>  	retval = security_bprm_set(bprm);
>  	if (retval)
> @@ -1089,6 +1100,7 @@
>  	bprm.loader = 0;
>  	bprm.exec = 0;
>  	bprm.security = NULL;
> +	bprm.secflags = 0;
>  	bprm.mm = mm_alloc();
>  	retval = -ENOMEM;
>  	if (!bprm.mm)
> --- linux-2.6.6-mm2/security/commoncap.c~caps	2004-05-13 11:42:26.000000000 -0700
> +++ linux-2.6.6-mm2/security/commoncap.c	2004-05-13 12:59:32.934690092 -0700
> @@ -24,6 +24,11 @@
>  #include <linux/xattr.h>
>  #include <linux/hugetlb.h>
>  
> +int newcaps = 0;

make this:
  static int newcaps;

> +
> +module_param(newcaps, int, 444);
> +MODULE_PARM_DESC(newcaps, "Set newcaps=1 to enable experimental capabilities");
> +
>  int cap_capable (struct task_struct *tsk, int cap)
>  {
>  	/* Derived from include/linux/sched.h:capable. */
> @@ -36,6 +41,11 @@
>  int cap_ptrace (struct task_struct *parent, struct task_struct *child)
>  {
>  	/* Derived from arch/i386/kernel/ptrace.c:sys_ptrace. */
> +	/* CAP_SYS_PTRACE still can't bypass inheritable restrictions */
> +	if (newcaps &&
> +	    !cap_issubset (child->cap_inheritable, current->cap_inheritable))
> +		return -EPERM;

Why no capable() override?  In fact, is this check really necessary?

> +
>  	if (!cap_issubset (child->cap_permitted, current->cap_permitted) &&
>  	    !capable (CAP_SYS_PTRACE))
>  		return -EPERM;
> @@ -76,6 +86,11 @@
>  		return -EPERM;
>  	}
>  
> +	/* verify the _new_Permitted_ is a subset of the _new_Inheritable_ */
> +	if (newcaps && !cap_issubset (*permitted, *inheritable)) {
> +		return -EPERM;
> +	}
> +
>  	return 0;
>  }
>  
> @@ -89,6 +104,8 @@
>  
>  int cap_bprm_set_security (struct linux_binprm *bprm)
>  {
> +	if (newcaps) return 0;

CodingStyle:
	if (newcaps)
		return 0;

> +
>  	/* Copied from fs/exec.c:prepare_binprm. */
>  
>  	/* We don't have VFS support for capabilities yet */
> @@ -115,10 +132,11 @@
>  	return 0;
>  }
>  
> -void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
> +static void cap_bprm_apply_creds_compat (struct linux_binprm *bprm, int unsafe)
>  {
> -	/* Derived from fs/exec.c:compute_creds. */
> +	/* This function will hopefully die in 2.7. */
>  	kernel_cap_t new_permitted, working;
> +	static int fixed_init = 0;
>  
>  	new_permitted = cap_intersect (bprm->cap_permitted, cap_bset);
>  	working = cap_intersect (bprm->cap_inheritable,
> @@ -151,6 +169,15 @@
>  		current->cap_permitted = new_permitted;
>  		current->cap_effective =
>  		    cap_intersect (new_permitted, bprm->cap_effective);
> +	} else if (!fixed_init) {
> +		/* This is not strictly correct, as it gives linuxrc more
> +		 * permissions than it used to have.  It was the only way I
> +		 * could think of to keep the resulting disaster contained,
> +		 * though.
> +		 */
> +		current->cap_effective = CAP_OLD_INIT_EFF_SET;
> +		current->cap_inheritable = CAP_OLD_INIT_INH_SET;
> +		fixed_init = 1;

Hrm...

>  	}
>  
>  	/* AUD: Audit candidate if current->cap_effective is set */
> @@ -158,15 +185,103 @@
>  	current->keep_capabilities = 0;
>  }
>  
> +/*
> + * The rules of Linux capabilities (not POSIX!)
> + *
> + * What the masks mean:
> + *  pI = capabilities that this process or its children may have
> + *  pP = capabilities that this process has
> + *  pE = capabilities that this process has and are enabled
> + * (so pE <= pP <= pI)
> + *
> + * The capability evolution rules are:
> + *
> + *  pI' = pI & fI
> + *  pP' = ((fP & cap_bset) | pP) & pI' & Y
> + *  pE' = (setuid ? pP' : (pE & pP'))
> + *
> + *  X = cap_bset
> + *  Y is zero if uid!=0, euid==0, and setuid non-root
> + *
> + * Caveat: if (fP & ~pI'), there is no _theoretical_ problem, but
> + * this could introduce exploits in buggy programs.  Since programs
> + * that aren't capability-aware are insecure _anyway_ if pP!=0, this
> + * is OK.
> + *
> + * To allow pI != ~0 to be secure in the presence of buggy programs,
> + * we require full pI for setuid.
> + *
> + * The moral is that, if file capabilities are introduced, programs
> + * that are granted fP > 0 need to be aware of how to deal with it.
> + * Because the effective set is left alone on non-setuid fP>0,
> + * such a program should drop capabilities that were not in its initial
> + * effective set before running untrusted code.
> + *
> + */
> +void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
> +{
> +	kernel_cap_t new_pI, new_pP;
> +	kernel_cap_t fI, fP;
> +	int is_setuid, is_setgid;
> +
> +	if(!newcaps) {
> +		cap_bprm_apply_creds_compat(bprm, unsafe);
> +		return;
> +	}
> +
> +	fI = bprm->cap_inheritable;
> +	fP = bprm->cap_permitted;
> +	is_setuid = (bprm->secflags & BINPRM_SEC_SETUID);
> +	is_setgid = (bprm->secflags & BINPRM_SEC_SETGID);
> +
> +	new_pI = cap_intersect(current->cap_inheritable, fI);
> +
> +	/* Check that it's safe to elevate privileges */
> +	if (unsafe & ~LSM_UNSAFE_PTRACE_CAP)
> +		bprm->secflags |= BINPRM_SEC_NOELEVATE;
> +
> +	/* FIXME: Is this overly harsh on setgid? */
> +	if ((bprm->secflags & (BINPRM_SEC_SETUID | BINPRM_SEC_SETGID)) &&
> +	    new_pI != CAP_FULL_SET)
> +			bprm->secflags |= BINPRM_SEC_NOELEVATE;
> +
> +	if (bprm->secflags & BINPRM_SEC_NOELEVATE) {
> +		is_setuid = is_setgid = 0;
> +		cap_clear(fP);
> +	}
> +
> +	new_pP = cap_intersect(fP, cap_bset);
> +	new_pP = cap_combine(new_pP, current->cap_permitted);
> +	cap_mask(new_pP, new_pI);
> +
> +	/* setuid-nonroot is special. */
> +	if (is_setuid && bprm->e_uid != 0 && current->uid != 0 &&
> +	    current->euid == 0)
> +		cap_clear(new_pP);
> +
> +	if(!cap_issubset(new_pP, current->cap_permitted))
> +		bprm->secflags |= BINPRM_SEC_SECUREEXEC;
> +
> +	/* Apply new security state */
> +	if (is_setuid) {
> +	        current->suid = current->euid = current->fsuid = bprm->e_uid;
> +		current->cap_effective = new_pP;
> +	}
> +	if (is_setgid)
> +	        current->sgid = current->egid = current->fsgid = bprm->e_gid;
> +
> +	current->cap_inheritable = new_pI;
> +	current->cap_permitted = new_pP;
> +	cap_mask(current->cap_effective, new_pP);
> +
> +	current->keep_capabilities = 0;
> +}
> +
>  int cap_bprm_secureexec (struct linux_binprm *bprm)
>  {
> -	/* If/when this module is enhanced to incorporate capability
> -	   bits on files, the test below should be extended to also perform a 
> -	   test between the old and new capability sets.  For now,
> -	   it simply preserves the legacy decision algorithm used by
> -	   the old userland. */
>  	return (current->euid != current->uid ||
> -		current->egid != current->gid);
> +		current->egid != current->gid ||
> +		(bprm->secflags & BINPRM_SEC_SECUREEXEC));
>  }
>  
>  int cap_inode_setxattr(struct dentry *dentry, char *name, void *value,
> @@ -280,9 +395,15 @@
>  
>  void cap_task_reparent_to_init (struct task_struct *p)
>  {
> -	p->cap_effective = CAP_INIT_EFF_SET;
> -	p->cap_inheritable = CAP_INIT_INH_SET;
> -	p->cap_permitted = CAP_FULL_SET;
> +	if (newcaps) {
> +		cap_set_full(p->cap_inheritable);
> +		cap_set_full(p->cap_permitted);
> +		cap_set_full(p->cap_effective);
> +	} else {
> +		p->cap_effective = CAP_OLD_INIT_EFF_SET;
> +		p->cap_inheritable = CAP_OLD_INIT_INH_SET;
> +		p->cap_permitted = CAP_FULL_SET;
> +	}
>  	p->keep_capabilities = 0;
>  	return;
>  }
> @@ -367,6 +488,16 @@
>  	return -ENOMEM;
>  }
>  
> +static int __init commoncap_init (void)
> +{
> +	if (newcaps) {
> +		printk(KERN_NOTICE "Experimental capability support is on\n");
> +		cap_bset = CAP_FULL_SET;
> +	}
> +
> +	return 0;
> +}
> +
>  EXPORT_SYMBOL(cap_capable);
>  EXPORT_SYMBOL(cap_ptrace);
>  EXPORT_SYMBOL(cap_capget);
> @@ -382,5 +513,7 @@
>  EXPORT_SYMBOL(cap_syslog);
>  EXPORT_SYMBOL(cap_vm_enough_memory);
>  
> +module_init(commoncap_init);
> +
>  MODULE_DESCRIPTION("Standard Linux Common Capabilities Security Module");
>  MODULE_LICENSE("GPL");
> --- linux-2.6.6-mm2/kernel/capability.c~caps	2004-05-13 11:42:26.000000000 -0700
> +++ linux-2.6.6-mm2/kernel/capability.c	2004-05-13 11:42:51.000000000 -0700
> @@ -13,7 +13,7 @@
>  #include <asm/uaccess.h>
>  
>  unsigned securebits = SECUREBITS_DEFAULT; /* systemwide security settings */
> -kernel_cap_t cap_bset = CAP_INIT_EFF_SET;
> +kernel_cap_t cap_bset = CAP_OLD_INIT_EFF_SET;
>  int sysctl_mlock_group;
>  
>  EXPORT_SYMBOL(securebits);
> --- linux-2.6.6-mm2/include/linux/capability.h~caps	2004-05-13 11:42:26.000000000 -0700
> +++ linux-2.6.6-mm2/include/linux/capability.h	2004-05-13 11:42:51.000000000 -0700
> @@ -308,8 +308,10 @@
>  
>  #define CAP_EMPTY_SET       to_cap_t(0)
>  #define CAP_FULL_SET        to_cap_t(~0)
> -#define CAP_INIT_EFF_SET    to_cap_t(~0 & ~CAP_TO_MASK(CAP_SETPCAP))
> -#define CAP_INIT_INH_SET    to_cap_t(0)
> +
> +/* For old-style capabilities, we use these. */
> +#define CAP_OLD_INIT_EFF_SET    to_cap_t(~0 & ~CAP_TO_MASK(CAP_SETPCAP))
> +#define CAP_OLD_INIT_INH_SET    to_cap_t(0)
>  
>  #define CAP_TO_MASK(x) (1 << (x))
>  #define cap_raise(c, flag)   (cap_t(c) |=  CAP_TO_MASK(flag))
> --- linux-2.6.6-mm2/include/linux/binfmts.h~caps	2004-05-13 11:42:26.000000000 -0700
> +++ linux-2.6.6-mm2/include/linux/binfmts.h	2004-05-13 11:44:02.000000000 -0700
> @@ -20,6 +20,10 @@
>  /*
>   * This structure is used to hold the arguments that are used when loading binaries.
>   */
> +#define BINPRM_SEC_SETUID	1
> +#define BINPRM_SEC_SETGID	2
> +#define BINPRM_SEC_SECUREEXEC	4
> +#define BINPRM_SEC_NOELEVATE	8
>  struct linux_binprm{
>  	char buf[BINPRM_BUF_SIZE];
>  	struct page *page[MAX_ARG_PAGES];
> @@ -28,7 +32,9 @@
>  	int sh_bang;
>  	struct file * file;
>  	int e_uid, e_gid;
> -	kernel_cap_t cap_inheritable, cap_permitted, cap_effective;
> +	int secflags;
> +	kernel_cap_t cap_inheritable, cap_permitted;
> +	kernel_cap_t cap_effective; /* old caps -- do NOT use in new code */
>  	void *security;
>  	int argc, envc;
>  	char * filename;	/* Name of binary as seen by procps */
> --- linux-2.6.6-mm2/include/linux/init_task.h~caps	2004-05-13 11:42:26.000000000 -0700
> +++ linux-2.6.6-mm2/include/linux/init_task.h	2004-05-13 11:42:51.000000000 -0700
> @@ -92,8 +92,8 @@
>  		.function	= it_real_fn				\
>  	},								\
>  	.group_info	= &init_groups,					\
> -	.cap_effective	= CAP_INIT_EFF_SET,				\
> -	.cap_inheritable = CAP_INIT_INH_SET,				\
> +	.cap_effective	= CAP_FULL_SET,				\
> +	.cap_inheritable = CAP_FULL_SET,				\

This was made unconditional.  And how are you convinced it's safe?

>  	.cap_permitted	= CAP_FULL_SET,					\
>  	.keep_capabilities = 0,						\
>  	.rlim		= INIT_RLIMITS,					\


thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH] capabilites, take 2
@ 2004-05-13 20:08 Andy Lutomirski
  2004-05-14  1:20 ` Chris Wright
  0 siblings, 1 reply; 43+ messages in thread
From: Andy Lutomirski @ 2004-05-13 20:08 UTC (permalink / raw)
  To: akpm, chrisw, linux-kernel; +Cc: luto

This implements working capabilities.

Changes from the previous version:

- It's now just one patch (the split doesn't make sense anymore)
- Cleaned up cap_bprm_apply_creds
- Restrictions on setuid are somewhat stricter now
- Rediffed against 2.6.6-mm2 (didn't change anything)
- bprm is initialized correctly in fs/exec.c
- printk to remind users that the patch is enabled
- LSM builds correctly.  Untested.

The whole thing is now a module parameter -- specify
commoncap.newcaps=1 if capabilities is built-in or newcaps=1 in
modprobe or insmod if not.

Known issues:

1. setxuid emulation is wrong (keepcaps doesn't work for setre(s)uid).
   I haven't fixed this because I don't what to change the semantics
   of PR_SET_KEEPCAPS.  I'll probably add PR_SET_REALLY_KEEPCAPS
   to really keep capabilities.

2. When newcaps=0 (the default), linuxrc gets all capabilities.  This
   is different from the old behavior.  Init and programs started from
   linuxrc still match the old behavior.  I couldn't think of any
   remotely clean way around that without breaking linuxrc for
   newcaps=1

3. I haven't tried it, but I imagine that unloading commoncap and then
   reloading it in a different mode may do unexpected things.  So read
   the code before you try that.  I see no reason to fix this one because
   the whole thing should go away eventually.

When newcaps=1, this extends the LSM interface:
bprm->secflags & BINPRM_SEC_NO_ELEVATE means that privileges should not
be elevated.  So stacking modules (e.g. selinux) should set this before
calling to the lower module and test it again before deciding whether
to elevate privileges.  That way, either all privs are elevated or none
are.

I also added a flag (BINPRM_SEC_SECUREEXEC) with the obvious meaning.
Otherwise cap_bprm_secureexec would have been a mess.

--Andy


Implement optional working capability support.  Try to avoid giving Andrew
a heart attack. ;)

 fs/exec.c                  |   16 ++++
 include/linux/binfmts.h    |    8 ++
 include/linux/capability.h |    6 +
 include/linux/init_task.h  |    4 -
 kernel/capability.c        |    2 
 security/commoncap.c       |  155 +++++++++++++++++++++++++++++++++++++++++----
 6 files changed, 172 insertions(+), 19 deletions(-)

--- linux-2.6.6-mm2/fs/exec.c~caps	2004-05-13 11:42:26.000000000 -0700
+++ linux-2.6.6-mm2/fs/exec.c	2004-05-13 12:15:20.000000000 -0700
@@ -882,8 +882,10 @@
 
 	if(!(bprm->file->f_vfsmnt->mnt_flags & MNT_NOSUID)) {
 		/* Set-uid? */
-		if (mode & S_ISUID)
+		if (mode & S_ISUID) {
 			bprm->e_uid = inode->i_uid;
+			bprm->secflags |= BINPRM_SEC_SETUID;
+		}
 
 		/* Set-gid? */
 		/*
@@ -891,10 +893,19 @@
 		 * is a candidate for mandatory locking, not a setgid
 		 * executable.
 		 */
-		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
+		if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
 			bprm->e_gid = inode->i_gid;
+			bprm->secflags |= BINPRM_SEC_SETGID;
+		}
 	}
 
+	/* Pretend we have VFS capabilities */
+	cap_set_full(bprm->cap_inheritable);
+	if((bprm->secflags & BINPRM_SEC_SETUID) && bprm->e_uid == 0)
+		cap_set_full(bprm->cap_permitted);
+	else
+		cap_clear(bprm->cap_permitted);
+
 	/* fill in binprm security blob */
 	retval = security_bprm_set(bprm);
 	if (retval)
@@ -1089,6 +1100,7 @@
 	bprm.loader = 0;
 	bprm.exec = 0;
 	bprm.security = NULL;
+	bprm.secflags = 0;
 	bprm.mm = mm_alloc();
 	retval = -ENOMEM;
 	if (!bprm.mm)
--- linux-2.6.6-mm2/security/commoncap.c~caps	2004-05-13 11:42:26.000000000 -0700
+++ linux-2.6.6-mm2/security/commoncap.c	2004-05-13 12:59:32.934690092 -0700
@@ -24,6 +24,11 @@
 #include <linux/xattr.h>
 #include <linux/hugetlb.h>
 
+int newcaps = 0;
+
+module_param(newcaps, int, 444);
+MODULE_PARM_DESC(newcaps, "Set newcaps=1 to enable experimental capabilities");
+
 int cap_capable (struct task_struct *tsk, int cap)
 {
 	/* Derived from include/linux/sched.h:capable. */
@@ -36,6 +41,11 @@
 int cap_ptrace (struct task_struct *parent, struct task_struct *child)
 {
 	/* Derived from arch/i386/kernel/ptrace.c:sys_ptrace. */
+	/* CAP_SYS_PTRACE still can't bypass inheritable restrictions */
+	if (newcaps &&
+	    !cap_issubset (child->cap_inheritable, current->cap_inheritable))
+		return -EPERM;
+
 	if (!cap_issubset (child->cap_permitted, current->cap_permitted) &&
 	    !capable (CAP_SYS_PTRACE))
 		return -EPERM;
@@ -76,6 +86,11 @@
 		return -EPERM;
 	}
 
+	/* verify the _new_Permitted_ is a subset of the _new_Inheritable_ */
+	if (newcaps && !cap_issubset (*permitted, *inheritable)) {
+		return -EPERM;
+	}
+
 	return 0;
 }
 
@@ -89,6 +104,8 @@
 
 int cap_bprm_set_security (struct linux_binprm *bprm)
 {
+	if (newcaps) return 0;
+
 	/* Copied from fs/exec.c:prepare_binprm. */
 
 	/* We don't have VFS support for capabilities yet */
@@ -115,10 +132,11 @@
 	return 0;
 }
 
-void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
+static void cap_bprm_apply_creds_compat (struct linux_binprm *bprm, int unsafe)
 {
-	/* Derived from fs/exec.c:compute_creds. */
+	/* This function will hopefully die in 2.7. */
 	kernel_cap_t new_permitted, working;
+	static int fixed_init = 0;
 
 	new_permitted = cap_intersect (bprm->cap_permitted, cap_bset);
 	working = cap_intersect (bprm->cap_inheritable,
@@ -151,6 +169,15 @@
 		current->cap_permitted = new_permitted;
 		current->cap_effective =
 		    cap_intersect (new_permitted, bprm->cap_effective);
+	} else if (!fixed_init) {
+		/* This is not strictly correct, as it gives linuxrc more
+		 * permissions than it used to have.  It was the only way I
+		 * could think of to keep the resulting disaster contained,
+		 * though.
+		 */
+		current->cap_effective = CAP_OLD_INIT_EFF_SET;
+		current->cap_inheritable = CAP_OLD_INIT_INH_SET;
+		fixed_init = 1;
 	}
 
 	/* AUD: Audit candidate if current->cap_effective is set */
@@ -158,15 +185,103 @@
 	current->keep_capabilities = 0;
 }
 
+/*
+ * The rules of Linux capabilities (not POSIX!)
+ *
+ * What the masks mean:
+ *  pI = capabilities that this process or its children may have
+ *  pP = capabilities that this process has
+ *  pE = capabilities that this process has and are enabled
+ * (so pE <= pP <= pI)
+ *
+ * The capability evolution rules are:
+ *
+ *  pI' = pI & fI
+ *  pP' = ((fP & cap_bset) | pP) & pI' & Y
+ *  pE' = (setuid ? pP' : (pE & pP'))
+ *
+ *  X = cap_bset
+ *  Y is zero if uid!=0, euid==0, and setuid non-root
+ *
+ * Caveat: if (fP & ~pI'), there is no _theoretical_ problem, but
+ * this could introduce exploits in buggy programs.  Since programs
+ * that aren't capability-aware are insecure _anyway_ if pP!=0, this
+ * is OK.
+ *
+ * To allow pI != ~0 to be secure in the presence of buggy programs,
+ * we require full pI for setuid.
+ *
+ * The moral is that, if file capabilities are introduced, programs
+ * that are granted fP > 0 need to be aware of how to deal with it.
+ * Because the effective set is left alone on non-setuid fP>0,
+ * such a program should drop capabilities that were not in its initial
+ * effective set before running untrusted code.
+ *
+ */
+void cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe)
+{
+	kernel_cap_t new_pI, new_pP;
+	kernel_cap_t fI, fP;
+	int is_setuid, is_setgid;
+
+	if(!newcaps) {
+		cap_bprm_apply_creds_compat(bprm, unsafe);
+		return;
+	}
+
+	fI = bprm->cap_inheritable;
+	fP = bprm->cap_permitted;
+	is_setuid = (bprm->secflags & BINPRM_SEC_SETUID);
+	is_setgid = (bprm->secflags & BINPRM_SEC_SETGID);
+
+	new_pI = cap_intersect(current->cap_inheritable, fI);
+
+	/* Check that it's safe to elevate privileges */
+	if (unsafe & ~LSM_UNSAFE_PTRACE_CAP)
+		bprm->secflags |= BINPRM_SEC_NOELEVATE;
+
+	/* FIXME: Is this overly harsh on setgid? */
+	if ((bprm->secflags & (BINPRM_SEC_SETUID | BINPRM_SEC_SETGID)) &&
+	    new_pI != CAP_FULL_SET)
+			bprm->secflags |= BINPRM_SEC_NOELEVATE;
+
+	if (bprm->secflags & BINPRM_SEC_NOELEVATE) {
+		is_setuid = is_setgid = 0;
+		cap_clear(fP);
+	}
+
+	new_pP = cap_intersect(fP, cap_bset);
+	new_pP = cap_combine(new_pP, current->cap_permitted);
+	cap_mask(new_pP, new_pI);
+
+	/* setuid-nonroot is special. */
+	if (is_setuid && bprm->e_uid != 0 && current->uid != 0 &&
+	    current->euid == 0)
+		cap_clear(new_pP);
+
+	if(!cap_issubset(new_pP, current->cap_permitted))
+		bprm->secflags |= BINPRM_SEC_SECUREEXEC;
+
+	/* Apply new security state */
+	if (is_setuid) {
+	        current->suid = current->euid = current->fsuid = bprm->e_uid;
+		current->cap_effective = new_pP;
+	}
+	if (is_setgid)
+	        current->sgid = current->egid = current->fsgid = bprm->e_gid;
+
+	current->cap_inheritable = new_pI;
+	current->cap_permitted = new_pP;
+	cap_mask(current->cap_effective, new_pP);
+
+	current->keep_capabilities = 0;
+}
+
 int cap_bprm_secureexec (struct linux_binprm *bprm)
 {
-	/* If/when this module is enhanced to incorporate capability
-	   bits on files, the test below should be extended to also perform a 
-	   test between the old and new capability sets.  For now,
-	   it simply preserves the legacy decision algorithm used by
-	   the old userland. */
 	return (current->euid != current->uid ||
-		current->egid != current->gid);
+		current->egid != current->gid ||
+		(bprm->secflags & BINPRM_SEC_SECUREEXEC));
 }
 
 int cap_inode_setxattr(struct dentry *dentry, char *name, void *value,
@@ -280,9 +395,15 @@
 
 void cap_task_reparent_to_init (struct task_struct *p)
 {
-	p->cap_effective = CAP_INIT_EFF_SET;
-	p->cap_inheritable = CAP_INIT_INH_SET;
-	p->cap_permitted = CAP_FULL_SET;
+	if (newcaps) {
+		cap_set_full(p->cap_inheritable);
+		cap_set_full(p->cap_permitted);
+		cap_set_full(p->cap_effective);
+	} else {
+		p->cap_effective = CAP_OLD_INIT_EFF_SET;
+		p->cap_inheritable = CAP_OLD_INIT_INH_SET;
+		p->cap_permitted = CAP_FULL_SET;
+	}
 	p->keep_capabilities = 0;
 	return;
 }
@@ -367,6 +488,16 @@
 	return -ENOMEM;
 }
 
+static int __init commoncap_init (void)
+{
+	if (newcaps) {
+		printk(KERN_NOTICE "Experimental capability support is on\n");
+		cap_bset = CAP_FULL_SET;
+	}
+
+	return 0;
+}
+
 EXPORT_SYMBOL(cap_capable);
 EXPORT_SYMBOL(cap_ptrace);
 EXPORT_SYMBOL(cap_capget);
@@ -382,5 +513,7 @@
 EXPORT_SYMBOL(cap_syslog);
 EXPORT_SYMBOL(cap_vm_enough_memory);
 
+module_init(commoncap_init);
+
 MODULE_DESCRIPTION("Standard Linux Common Capabilities Security Module");
 MODULE_LICENSE("GPL");
--- linux-2.6.6-mm2/kernel/capability.c~caps	2004-05-13 11:42:26.000000000 -0700
+++ linux-2.6.6-mm2/kernel/capability.c	2004-05-13 11:42:51.000000000 -0700
@@ -13,7 +13,7 @@
 #include <asm/uaccess.h>
 
 unsigned securebits = SECUREBITS_DEFAULT; /* systemwide security settings */
-kernel_cap_t cap_bset = CAP_INIT_EFF_SET;
+kernel_cap_t cap_bset = CAP_OLD_INIT_EFF_SET;
 int sysctl_mlock_group;
 
 EXPORT_SYMBOL(securebits);
--- linux-2.6.6-mm2/include/linux/capability.h~caps	2004-05-13 11:42:26.000000000 -0700
+++ linux-2.6.6-mm2/include/linux/capability.h	2004-05-13 11:42:51.000000000 -0700
@@ -308,8 +308,10 @@
 
 #define CAP_EMPTY_SET       to_cap_t(0)
 #define CAP_FULL_SET        to_cap_t(~0)
-#define CAP_INIT_EFF_SET    to_cap_t(~0 & ~CAP_TO_MASK(CAP_SETPCAP))
-#define CAP_INIT_INH_SET    to_cap_t(0)
+
+/* For old-style capabilities, we use these. */
+#define CAP_OLD_INIT_EFF_SET    to_cap_t(~0 & ~CAP_TO_MASK(CAP_SETPCAP))
+#define CAP_OLD_INIT_INH_SET    to_cap_t(0)
 
 #define CAP_TO_MASK(x) (1 << (x))
 #define cap_raise(c, flag)   (cap_t(c) |=  CAP_TO_MASK(flag))
--- linux-2.6.6-mm2/include/linux/binfmts.h~caps	2004-05-13 11:42:26.000000000 -0700
+++ linux-2.6.6-mm2/include/linux/binfmts.h	2004-05-13 11:44:02.000000000 -0700
@@ -20,6 +20,10 @@
 /*
  * This structure is used to hold the arguments that are used when loading binaries.
  */
+#define BINPRM_SEC_SETUID	1
+#define BINPRM_SEC_SETGID	2
+#define BINPRM_SEC_SECUREEXEC	4
+#define BINPRM_SEC_NOELEVATE	8
 struct linux_binprm{
 	char buf[BINPRM_BUF_SIZE];
 	struct page *page[MAX_ARG_PAGES];
@@ -28,7 +32,9 @@
 	int sh_bang;
 	struct file * file;
 	int e_uid, e_gid;
-	kernel_cap_t cap_inheritable, cap_permitted, cap_effective;
+	int secflags;
+	kernel_cap_t cap_inheritable, cap_permitted;
+	kernel_cap_t cap_effective; /* old caps -- do NOT use in new code */
 	void *security;
 	int argc, envc;
 	char * filename;	/* Name of binary as seen by procps */
--- linux-2.6.6-mm2/include/linux/init_task.h~caps	2004-05-13 11:42:26.000000000 -0700
+++ linux-2.6.6-mm2/include/linux/init_task.h	2004-05-13 11:42:51.000000000 -0700
@@ -92,8 +92,8 @@
 		.function	= it_real_fn				\
 	},								\
 	.group_info	= &init_groups,					\
-	.cap_effective	= CAP_INIT_EFF_SET,				\
-	.cap_inheritable = CAP_INIT_INH_SET,				\
+	.cap_effective	= CAP_FULL_SET,				\
+	.cap_inheritable = CAP_FULL_SET,				\
 	.cap_permitted	= CAP_FULL_SET,					\
 	.keep_capabilities = 0,						\
 	.rlim		= INIT_RLIMITS,					\


^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2004-05-25  0:24 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <fa.dt4cg55.jnqvr5@ifi.uio.no>
     [not found] ` <fa.mu5rj3d.24gtbp@ifi.uio.no>
2004-05-14 15:57   ` [PATCH] capabilites, take 2 Andy Lutomirski
2004-05-14 16:01     ` Stephen Smalley
2004-05-14 16:18       ` Andy Lutomirski
2004-05-14 16:37         ` Stephen Smalley
2004-05-14 18:07         ` Chris Wright
2004-05-14 22:48           ` [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2) Andy Lutomirski
2004-05-15  0:06             ` [PATCH] scaled-back caps, take 4 Olaf Dietsche
2004-05-14 22:09               ` Albert Cahalan
2004-05-15  0:27               ` Chris Wright
     [not found]             ` <20040517231912.H21045@build.pdx.osdl.net>
2004-05-18  9:11               ` [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2) Andy Lutomirski
2004-05-19  1:27                 ` Chris Wright
2004-05-19  1:54                   ` [PATCH] scaled-back caps, take 4 Andy Lutomirski
2004-05-19  7:30                     ` Chris Wright
2004-05-23  9:28                       ` Andy Lutomirski
2004-05-23 18:48                         ` Olaf Dietsche
2004-05-24 23:38                       ` [PATCH] caps, compromise version (was Re: [PATCH] scaled-back caps, take 4) Andy Lutomirski
2004-05-24 23:56                         ` Chris Wright
2004-05-25  0:23                           ` Andy Lutomirski
     [not found]               ` <20040517235844.I21045@build.pdx.osdl.net>
2004-05-19  1:34                 ` [PATCH] support cap inheritable (Re: [PATCH] scaled-back caps, take 4 (was Re: [PATCH] capabilites, take 2) Andy Lutomirski
2004-05-19  7:27                   ` Chris Wright
2004-05-14 12:03 [PATCH] capabilites, take 2 Albert Cahalan
2004-05-14 14:53 ` Andy Lutomirski
2004-05-14 17:58   ` Chris Wright
2004-05-14 15:21 ` Stephen Smalley
2004-05-14 15:19   ` Albert Cahalan
2004-05-14 18:06     ` Stephen Smalley
2004-05-14 17:32       ` Albert Cahalan
2004-05-14 21:11         ` Chris Wright
2004-05-14 19:32           ` Albert Cahalan
2004-05-14 18:00   ` Chris Wright
2004-05-14 17:48 ` Chris Wright
  -- strict thread matches above, loose matches on Subject: below --
2004-05-13 20:08 Andy Lutomirski
2004-05-14  1:20 ` Chris Wright
2004-05-14  1:35   ` Valdis.Kletnieks
2004-05-14  4:51     ` Chris Wright
2004-05-14  5:33     ` Olaf Dietsche
2004-05-14  6:04       ` Valdis.Kletnieks
2004-05-14  7:09         ` Olaf Dietsche
2004-05-14  2:27   ` Andy Lutomirski
2004-05-14  4:48     ` Chris Wright
2004-05-14  5:58       ` Andy Lutomirski
2004-05-14 17:45         ` Chris Wright
2004-05-14  6:39     ` Olaf Dietsche

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).