Linux-Fsdevel Archive on
From: Adrian Reber <>
To: "Christian Brauner" <>,
	"Eric Biederman" <>,
	"Pavel Emelyanov" <>,
	"Oleg Nesterov" <>,
	"Dmitry Safonov" <>,
	"Andrei Vagin" <>,
	"Nicolas Viennot" <>,
	"Michał Cłapiński" <>,
	"Kamil Yurtsever" <>,
	"Dirk Petersen" <>,
	"Christine Flood" <>,
	"Casey Schaufler" <>
Cc: Mike Rapoport <>,
	Radostin Stoyanov <>,
	Adrian Reber <>,
	Cyrill Gorcunov <>,
	Serge Hallyn <>,
	Stephen Smalley <>,
	Sargun Dhillon <>, Arnd Bergmann <>,
	Eric Paris <>, Jann Horn <>
Subject: [PATCH v6 5/7] prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe
Date: Sun, 19 Jul 2020 12:04:15 +0200
Message-ID: <>
In-Reply-To: <>

From: Nicolas Viennot <>

Originally, only a local CAP_SYS_ADMIN could change the exe link,
which made checkpoint/restore without CAP_SYS_ADMIN difficult.
This commit accepts CAP_CHECKPOINT_RESTORE, in addition to CAP_SYS_ADMIN,
as sufficient for changing the exe link.

The following describes the history of the /proc/self/exe permission
checks, as it may be difficult to understand which decisions led to the
current state:

* [1] May 2012: This commit introduces the ability to change
  /proc/self/exe if the user is CAP_SYS_RESOURCE capable.
  In the related discussion [2], no clear threat model is presented for
  what could happen if /proc/self/exe changes multiple times, or why
  the admin would be at the mercy of userspace.

* [3] Oct 2014: This commit introduces a new API to change
  /proc/self/exe. The permission no longer checks for CAP_SYS_RESOURCE,
  but instead checks if the current user is root (uid=0) in its local
  namespace. In the related discussion [4] it is said that "Controlling
  exe_fd without privileges may turn out to be dangerous. At least
  things like tomoyo examine it for making policy decisions (see

* [5] Dec 2016: This commit removes the restriction to change
  /proc/self/exe at most once. The related discussion [6] notes that
  the audit subsystem relies on the exe symlink, presumably via
  audit_log_d_path_exe() in kernel/audit.c.

* [7] May 2017: This commit changed the check from uid==0 to local
  CAP_SYS_ADMIN. No discussion.

* [8] July 2020: A PoC to spoof any program's /proc/self/exe via ptrace
  is demonstrated.

Overall, the concrete points that were made for retaining capability
checks around changing the exe symlink are that tomoyo_manager() and
audit_log_d_path_exe() use the exe_file path.

Christian Brauner said that relying on /proc/<pid>/exe being immutable (or
guarded by caps) for the sake of security is a bit misleading. It can only
be used as a hint, without any guarantees about what code is being executed
once execve() returns to userspace. Christian suggested that in the
future, we could call audit_log() or similar to inform the admin of all
exe link changes, instead of attempting to provide security guarantees
via permission checks. However, this proposed change requires an
understanding of the security implications in the tomoyo/audit subsystems.

[1] b32dfe377102 ("c/r: prctl: add ability to set new mm_struct::exe_file")
[3] f606b77f1a9e ("prctl: PR_SET_MM -- introduce PR_SET_MM_MAP operation")
[5] 3fb4afd9a504 ("prctl: remove one-shot limitation for changing exe link")
[7] 4d28df6152aa ("prctl: Allow local CAP_SYS_ADMIN changing exe_file")

Signed-off-by: Nicolas Viennot <>
Signed-off-by: Adrian Reber <>
---
 kernel/sys.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 00a96746e28a..a3f4ef0bbda3 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2007,11 +2007,14 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
 	if (prctl_map.exe_fd != (u32)-1) {
 		/*
-		 * Make sure the caller has the rights to
-		 * change /proc/pid/exe link: only local sys admin should
-		 * be allowed to.
+		 * Check if the current user is checkpoint/restore capable.
+		 * At the time of this writing, it checks for CAP_SYS_ADMIN
+		 * or CAP_CHECKPOINT_RESTORE.
+		 * Note that a user with access to ptrace can masquerade an
+		 * arbitrary program as any executable, even setuid ones.
+		 * This may have implications in the tomoyo subsystem.
 		 */
-		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
+		if (!checkpoint_restore_ns_capable(current_user_ns()))
 			return -EINVAL;
 
 		error = prctl_set_mm_exe_file(mm, prctl_map.exe_fd);

Thread overview: 14+ messages
2020-07-19 10:04 [PATCH v6 0/7] capabilities: Introduce CAP_CHECKPOINT_RESTORE Adrian Reber
2020-07-19 10:04 ` [PATCH v6 1/7] " Adrian Reber
2020-07-19 10:04 ` [PATCH v6 2/7] pid: use checkpoint_restore_ns_capable() for set_tid Adrian Reber
2020-07-19 10:04 ` [PATCH v6 3/7] pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid Adrian Reber
2020-07-19 10:04 ` [PATCH v6 4/7] proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE Adrian Reber
2020-07-19 16:50   ` Serge E. Hallyn
2020-07-19 10:04 ` Adrian Reber [this message]
2020-07-19 10:04 ` [PATCH v6 6/7] prctl: exe link permission error changed from -EINVAL to -EPERM Adrian Reber
2020-07-19 17:05   ` Serge E. Hallyn
2020-07-19 10:04 ` [PATCH v6 7/7] selftests: add clone3() CAP_CHECKPOINT_RESTORE test Adrian Reber
2020-07-19 18:17 ` [PATCH v6 0/7] capabilities: Introduce CAP_CHECKPOINT_RESTORE Christian Brauner
2020-07-20 11:54   ` Christian Brauner
2020-07-20 12:46     ` Adrian Reber
2020-07-20 12:58       ` Christian Brauner
