From: Aleksa Sarai <>
To: Andy Lutomirski <>
Cc: Al Viro <>,
	"Eric W. Biederman" <>,
	Christian Brauner <>,
	Jeff Layton <>,
	"J. Bruce Fields" <>,
	Arnd Bergmann <>,
	David Howells <>, Jann Horn <>,
	Tycho Andersen <>,
	David Drysdale <>,
	Linux Containers <>,
	Linux FS Devel <>,
	LKML <>,
	linux-arch <>,
	Linux API <>
Subject: Re: [PATCH v2 1/3] namei: implement O_BENEATH-style AT_* flags
Date: Wed, 10 Oct 2018 18:07:47 +1100
Message-ID: <20181010070747.byi2itbi4j42gynq@ryuk>
In-Reply-To: <>

On 2018-10-09, Andy Lutomirski <> wrote:
> On Mon, Oct 8, 2018 at 11:53 PM Aleksa Sarai <> wrote:
> > * AT_NO_PROCLINK: Disallows ->get_link "symlink" jumping. This is a very
> >   specific restriction, and it exists because /proc/$pid/fd/...
> >   "symlinks" allow for access outside nd->root and pose risk to
> >   container runtimes that don't want to be tricked into accessing a host
> >   path (but do want to allow no-funny-business symlink resolution).
> Can you elaborate on the use case?
> If I set up a container namespace and walk it for real (through the
> outside /proc/PID/root or otherwise starting from an fd that points
> into that namespace), and I walk through that namespace's /proc, I'm
> going to see the same thing that the processes in the namespace would
> see.  So what's the issue?
> Similarly, if I somehow manage to walk into the outside /proc, then
> I've pretty much lost regardless of the links.

Well, there are a couple of reasons:

* The original AT_NO_JUMPS patchset similarly disabled "proclinks" but
  it was sort of all contained within AT_NO_JUMPS. In order to have a
  precise 1:1 feature mapping we need this in *some* form (and in v1 the
  only way to get it was to add a separate flag). According to the
  original O_BENEATH changelog, both you and Al pushed for this to be
  part of O_BENEATH. :P

  *However* in v2 of the patchset, proclinks are also disabled by
  AT_BENEATH (because it's not really safe or consistent to allow them
  at the moment -- we'd need to add __d_path checks when jumping through
  them as well if we wanted them to be consistent) -- so the need for
  this flag (purely for AT_NO_JUMPS compatibility) is reduced.

* There were cases in the past where races temporarily caused
  something like /proc/self/exe (or a file descriptor referencing the
  host filesystem) to be exposed into a container -- but because of
  set_dumpable they were blocked. CVE-2016-9962 was an example of this
  (it wasn't blocked by set_dumpable -- but the fix used set_dumpable).

  In those cases, if you can trick a host-side process into opening that
  procfs file through a symlink/bind-mount (which is technically
  "accessible" but not actually usable by the container process), you
  can trick the resolution into resolving to the host filesystem (and
  this might be a file which is unlinked, so there's no way for
  __d_path checking to verify whether it is safe or not).

  I think that AT_BENEATH allowing only proclinks that result in you
  being under the root is something we might want in the future, but I
  think there are some cases where you want to be _very_ sure you don't
  follow a proclink (now or in the future).

* And finally, some containers run with the host's pidns. This is not a
  use case that I'm particularly fond of, but some folks do use this (as
  far as I'm aware this is one of the reasons why the subreaper concept
  exists). In those cases, the procfs mount would be able to see the
  host processes -- and thus /proc/self would resolve (as would the
  host's init and so on).
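
To make the "jumping" behaviour above concrete, here is a rough userspace
sketch in Python (not kernel code, and not part of the patchset). It
assumes a Linux system with procfs mounted at /proc; all of the helper
names are made up for illustration. It shows that a lookup whose path
string stays under /proc lands on whatever file the descriptor refers
to, and that for an unlinked file the link text carries a " (deleted)"
suffix, so the name alone cannot tell you where the file lives.

```python
import os
import tempfile


def reopen_via_proclink(fd):
    """Open /proc/self/fd/<fd>; the lookup jumps to the file behind fd."""
    # The path string never leaves /proc, but the final component is a
    # procfs "magic" link, so the open lands on the file itself.
    return os.open(f"/proc/self/fd/{fd}", os.O_RDONLY)


def same_file(fd_a, fd_b):
    """True if both descriptors refer to the same inode on the same device."""
    a, b = os.fstat(fd_a), os.fstat(fd_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def proclink_target(fd):
    """Return the link text procfs reports for the descriptor."""
    return os.readlink(f"/proc/self/fd/{fd}")


def open_unlinked_file():
    """Create a temporary file, unlink it, and return the still-open fd."""
    fd, path = tempfile.mkstemp()
    os.unlink(path)
    return fd


if __name__ == "__main__":
    root_fd = os.open("/", os.O_RDONLY | os.O_DIRECTORY)
    # Walking "through /proc" ends up at "/", far outside /proc.
    print(same_file(root_fd, reopen_via_proclink(root_fd)))

    gone_fd = open_unlinked_file()
    # The file has no name any more, yet it can still be reopened.
    print(proclink_target(gone_fd))
    print(same_file(gone_fd, reopen_via_proclink(gone_fd)))
```

The unlinked case is the awkward one: the descriptor can be reopened
through procfs even though no path to the file exists, which is roughly
why a path-based "is this under the root?" check has nothing to work
with there.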

I will admit that this flag is more paranoid than the others though.

Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
