Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
* [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
@ 2020-08-29  2:00 Rich Felker
  2020-08-30 15:05 ` Jann Horn
  0 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2020-08-29  2:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel, linux-api; +Cc: Alexander Viro

The pwrite function, originally defined by POSIX (thus the "p"), is
defined to ignore O_APPEND and write at the offset passed as its
argument. However, historically Linux honored O_APPEND if set and
ignored the offset. This cannot be changed due to stability policy,
but is documented in the man page as a bug.

Now that there's a pwritev2 syscall providing a superset of the pwrite
functionality that has a flags argument, the conforming behavior can
be offered to userspace via a new flag. Since pwritev2 checks flag
validity (in kiocb_set_rw_flags) and reports unknown ones with
EOPNOTSUPP, callers will not get wrong behavior on old kernels that
don't support the new flag; the error is reported and the caller can
decide how to handle it.

Signed-off-by: Rich Felker <dalias@libc.org>
---
 include/linux/fs.h      | 4 ++++
 include/uapi/linux/fs.h | 5 ++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index e0d909d35763..3a769a972f79 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3397,6 +3397,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 {
 	if (unlikely(flags & ~RWF_SUPPORTED))
 		return -EOPNOTSUPP;
+	if (unlikely((flags & RWF_APPEND) && (flags & RWF_NOAPPEND)))
+		return -EINVAL;
 
 	if (flags & RWF_NOWAIT) {
 		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
@@ -3411,6 +3413,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
 	if (flags & RWF_APPEND)
 		ki->ki_flags |= IOCB_APPEND;
+	if (flags & RWF_NOAPPEND)
+		ki->ki_flags &= ~IOCB_APPEND;
 	return 0;
 }
 
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 379a612f8f1d..591357d9b3c9 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -299,8 +299,11 @@ typedef int __bitwise __kernel_rwf_t;
 /* per-IO O_APPEND */
 #define RWF_APPEND	((__force __kernel_rwf_t)0x00000010)
 
+/* per-IO negation of O_APPEND */
+#define RWF_NOAPPEND	((__force __kernel_rwf_t)0x00000020)
+
 /* mask of flags supported by the kernel */
 #define RWF_SUPPORTED	(RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
-			 RWF_APPEND)
+			 RWF_APPEND | RWF_NOAPPEND)
 
 #endif /* _UAPI_LINUX_FS_H */
-- 
2.21.0


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-29  2:00 [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 Rich Felker
@ 2020-08-30 15:05 ` Jann Horn
  2020-08-30 16:36   ` Rich Felker
  0 siblings, 1 reply; 12+ messages in thread
From: Jann Horn @ 2020-08-30 15:05 UTC (permalink / raw)
  To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Sat, Aug 29, 2020 at 4:00 AM Rich Felker <dalias@libc.org> wrote:
> The pwrite function, originally defined by POSIX (thus the "p"), is
> defined to ignore O_APPEND and write at the offset passed as its
> argument. However, historically Linux honored O_APPEND if set and
> ignored the offset. This cannot be changed due to stability policy,
> but is documented in the man page as a bug.
>
> Now that there's a pwritev2 syscall providing a superset of the pwrite
> functionality that has a flags argument, the conforming behavior can
> be offered to userspace via a new flag.
[...]
> diff --git a/include/linux/fs.h b/include/linux/fs.h
[...]
> @@ -3411,6 +3413,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>                 ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
>         if (flags & RWF_APPEND)
>                 ki->ki_flags |= IOCB_APPEND;
> +       if (flags & RWF_NOAPPEND)
> +               ki->ki_flags &= ~IOCB_APPEND;
>         return 0;
>  }

Linux enforces the S_APPEND flag (set by "chattr +a") only at open()
time, not at write() time:

# touch testfile
# exec 100>testfile
# echo foo > testfile
# cat testfile
foo
# chattr +a testfile
# echo bar > testfile
bash: testfile: Operation not permitted
# echo bar >&100
# cat testfile
bar
#

At open() time, the kernel enforces that you can't use O_WRONLY/O_RDWR
without also setting O_APPEND if the file is marked as append-only:

static int may_open(const struct path *path, int acc_mode, int flag)
{
[...]
  /*
   * An append-only file must be opened in append mode for writing.
   */
  if (IS_APPEND(inode)) {
    if  ((flag & O_ACCMODE) != O_RDONLY && !(flag & O_APPEND))
      return -EPERM;
    if (flag & O_TRUNC)
      return -EPERM;
  }
[...]
}

It seems to me like your patch will permit bypassing S_APPEND by
opening an append-only file with O_WRONLY|O_APPEND, then calling
pwritev2() with RWF_NOAPPEND? I think you'll have to add an extra
check for IS_APPEND() somewhere.


One could also argue that if an O_APPEND file descriptor is handed
across privilege boundaries, a programmer might reasonably expect that
the recipient will not be able to use the file descriptor for
non-append writes; if that is not actually true, that should probably
be noted in the open.2 manpage, at the end of the description of
O_APPEND.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-30 15:05 ` Jann Horn
@ 2020-08-30 16:36   ` Rich Felker
  2020-08-30 18:31     ` Jann Horn
  0 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2020-08-30 16:36 UTC (permalink / raw)
  To: Jann Horn; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Sun, Aug 30, 2020 at 05:05:45PM +0200, Jann Horn wrote:
> On Sat, Aug 29, 2020 at 4:00 AM Rich Felker <dalias@libc.org> wrote:
> > The pwrite function, originally defined by POSIX (thus the "p"), is
> > defined to ignore O_APPEND and write at the offset passed as its
> > argument. However, historically Linux honored O_APPEND if set and
> > ignored the offset. This cannot be changed due to stability policy,
> > but is documented in the man page as a bug.
> >
> > Now that there's a pwritev2 syscall providing a superset of the pwrite
> > functionality that has a flags argument, the conforming behavior can
> > be offered to userspace via a new flag.
> [...]
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> [...]
> > @@ -3411,6 +3413,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
> >                 ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
> >         if (flags & RWF_APPEND)
> >                 ki->ki_flags |= IOCB_APPEND;
> > +       if (flags & RWF_NOAPPEND)
> > +               ki->ki_flags &= ~IOCB_APPEND;
> >         return 0;
> >  }
> 
> Linux enforces the S_APPEND flag (set by "chattr +a") only at open()
> time, not at write() time:
> 
> # touch testfile
> # exec 100>testfile
> # echo foo > testfile
> # cat testfile
> foo
> # chattr +a testfile
> # echo bar > testfile
> bash: testfile: Operation not permitted
> # echo bar >&100
> # cat testfile
> bar
> #
> 
> At open() time, the kernel enforces that you can't use O_WRONLY/O_RDWR
> without also setting O_APPEND if the file is marked as append-only:
> 
> static int may_open(const struct path *path, int acc_mode, int flag)
> {
> [...]
>   /*
>    * An append-only file must be opened in append mode for writing.
>    */
>   if (IS_APPEND(inode)) {
>     if  ((flag & O_ACCMODE) != O_RDONLY && !(flag & O_APPEND))
>       return -EPERM;
>     if (flag & O_TRUNC)
>       return -EPERM;
>   }
> [...]
> }
> 
> It seems to me like your patch will permit bypassing S_APPEND by
> opening an append-only file with O_WRONLY|O_APPEND, then calling
> pwritev2() with RWF_NOAPPEND? I think you'll have to add an extra
> check for IS_APPEND() somewhere.
> 
> 
> One could also argue that if an O_APPEND file descriptor is handed
> across privilege boundaries, a programmer might reasonably expect that
> the recipient will not be able to use the file descriptor for
> non-append writes; if that is not actually true, that should probably
> be noted in the open.2 manpage, at the end of the description of
> O_APPEND.

fcntl F_SETFL can remove O_APPEND, so it is not a security boundary.
I'm not sure how this interacts with S_APPEND; presumably fcntl
rechecks it. So just checking IS_APPEND in the code paths used by
pwritev2 (and erroring out rather than silently writing output at the
wrong place) should suffice to preserve all existing security
invariants.

Rich

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-30 16:36   ` Rich Felker
@ 2020-08-30 18:31     ` Jann Horn
  2020-08-30 18:43       ` Rich Felker
  0 siblings, 1 reply; 12+ messages in thread
From: Jann Horn @ 2020-08-30 18:31 UTC (permalink / raw)
  To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote:
> On Sun, Aug 30, 2020 at 05:05:45PM +0200, Jann Horn wrote:
> > On Sat, Aug 29, 2020 at 4:00 AM Rich Felker <dalias@libc.org> wrote:
> > > The pwrite function, originally defined by POSIX (thus the "p"), is
> > > defined to ignore O_APPEND and write at the offset passed as its
> > > argument. However, historically Linux honored O_APPEND if set and
> > > ignored the offset. This cannot be changed due to stability policy,
> > > but is documented in the man page as a bug.
> > >
> > > Now that there's a pwritev2 syscall providing a superset of the pwrite
> > > functionality that has a flags argument, the conforming behavior can
> > > be offered to userspace via a new flag.
[...]
> > Linux enforces the S_APPEND flag (set by "chattr +a") only at open()
> > time, not at write() time:
[...]
> > It seems to me like your patch will permit bypassing S_APPEND by
> > opening an append-only file with O_WRONLY|O_APPEND, then calling
> > pwritev2() with RWF_NOAPPEND? I think you'll have to add an extra
> > check for IS_APPEND() somewhere.
> >
> >
> > One could also argue that if an O_APPEND file descriptor is handed
> > across privilege boundaries, a programmer might reasonably expect that
> > the recipient will not be able to use the file descriptor for
> > non-append writes; if that is not actually true, that should probably
> > be noted in the open.2 manpage, at the end of the description of
> > O_APPEND.
>
> fcntl F_SETFL can remove O_APPEND, so it is not a security boundary.
> I'm not sure how this interacts with S_APPEND; presumably fcntl
> rechecks it.

Ah, good point, you're right. In fs/fcntl.c:

  35 static int setfl(int fd, struct file * filp, unsigned long arg)
  36 {
  37    struct inode * inode = file_inode(filp);
  38    int error = 0;
  39
  40    /*
  41     * O_APPEND cannot be cleared if the file is marked as append-only
  42     * and the file is open for write.
  43     */
  44    if (((arg ^ filp->f_flags) & O_APPEND) && IS_APPEND(inode))
  45            return -EPERM;

> So just checking IS_APPEND in the code paths used by
> pwritev2 (and erroring out rather than silently writing output at the
> wrong place) should suffice to preserve all existing security
> invariants.

Makes sense.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-30 18:31     ` Jann Horn
@ 2020-08-30 18:43       ` Rich Felker
  2020-08-30 19:02         ` Jann Horn
  0 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2020-08-30 18:43 UTC (permalink / raw)
  To: Jann Horn; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote:
> On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote:
> > On Sun, Aug 30, 2020 at 05:05:45PM +0200, Jann Horn wrote:
> > > On Sat, Aug 29, 2020 at 4:00 AM Rich Felker <dalias@libc.org> wrote:
> > > > The pwrite function, originally defined by POSIX (thus the "p"), is
> > > > defined to ignore O_APPEND and write at the offset passed as its
> > > > argument. However, historically Linux honored O_APPEND if set and
> > > > ignored the offset. This cannot be changed due to stability policy,
> > > > but is documented in the man page as a bug.
> > > >
> > > > Now that there's a pwritev2 syscall providing a superset of the pwrite
> > > > functionality that has a flags argument, the conforming behavior can
> > > > be offered to userspace via a new flag.
> [...]
> > > Linux enforces the S_APPEND flag (set by "chattr +a") only at open()
> > > time, not at write() time:
> [...]
> > > It seems to me like your patch will permit bypassing S_APPEND by
> > > opening an append-only file with O_WRONLY|O_APPEND, then calling
> > > pwritev2() with RWF_NOAPPEND? I think you'll have to add an extra
> > > check for IS_APPEND() somewhere.
> > >
> > >
> > > One could also argue that if an O_APPEND file descriptor is handed
> > > across privilege boundaries, a programmer might reasonably expect that
> > > the recipient will not be able to use the file descriptor for
> > > non-append writes; if that is not actually true, that should probably
> > > be noted in the open.2 manpage, at the end of the description of
> > > O_APPEND.
> >
> > fcntl F_SETFL can remove O_APPEND, so it is not a security boundary.
> > I'm not sure how this interacts with S_APPEND; presumably fcntl
> > rechecks it.
> 
> Ah, good point, you're right. In fs/fcntl.c:
> 
>   35 static int setfl(int fd, struct file * filp, unsigned long arg)
>   36 {
>   37    struct inode * inode = file_inode(filp);
>   38    int error = 0;
>   39
>   40    /*
>   41     * O_APPEND cannot be cleared if the file is marked as append-only
>   42     * and the file is open for write.
>   43     */
>   44    if (((arg ^ filp->f_flags) & O_APPEND) && IS_APPEND(inode))
>   45            return -EPERM;

FWIW I think this check is mildly wrong; it seems to disallow *adding*
O_APPEND if the file became chattr +a after it was opened. It should
probably be changed to only disallow removal.

> > So just checking IS_APPEND in the code paths used by
> > pwritev2 (and erroring out rather than silently writing output at the
> > wrong place) should suffice to preserve all existing security
> > invariants.
> 
> Makes sense.

There are 3 places where kiocb_set_rw_flags is called with flags that
seem to be controlled by userspace: aio.c, io_uring.c, and
read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if
the underlying inode is S_APPEND. To avoid repeating the same logic in
an error-prone way, should kiocb_set_rw_flags's signature be updated
to take the filp so that it can obtain the inode and check IS_APPEND
before accepting RWF_NOAPPEND? It's inline so this should avoid
actually loading anything except in the codepath where
flags&RWF_NOAPPEND is nonzero.

Rich

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-30 18:43       ` Rich Felker
@ 2020-08-30 19:02         ` Jann Horn
  2020-08-30 20:00           ` Rich Felker
  0 siblings, 1 reply; 12+ messages in thread
From: Jann Horn @ 2020-08-30 19:02 UTC (permalink / raw)
  To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote:
> On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote:
> > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote:
> > > On Sun, Aug 30, 2020 at 05:05:45PM +0200, Jann Horn wrote:
> > > > On Sat, Aug 29, 2020 at 4:00 AM Rich Felker <dalias@libc.org> wrote:
> > > > > The pwrite function, originally defined by POSIX (thus the "p"), is
> > > > > defined to ignore O_APPEND and write at the offset passed as its
> > > > > argument. However, historically Linux honored O_APPEND if set and
> > > > > ignored the offset. This cannot be changed due to stability policy,
> > > > > but is documented in the man page as a bug.
> > > > >
> > > > > Now that there's a pwritev2 syscall providing a superset of the pwrite
> > > > > functionality that has a flags argument, the conforming behavior can
> > > > > be offered to userspace via a new flag.
> > [...]
> > > > Linux enforces the S_APPEND flag (set by "chattr +a") only at open()
> > > > time, not at write() time:
> > [...]
> > > > It seems to me like your patch will permit bypassing S_APPEND by
> > > > opening an append-only file with O_WRONLY|O_APPEND, then calling
> > > > pwritev2() with RWF_NOAPPEND? I think you'll have to add an extra
> > > > check for IS_APPEND() somewhere.
> > > >
> > > >
> > > > One could also argue that if an O_APPEND file descriptor is handed
> > > > across privilege boundaries, a programmer might reasonably expect that
> > > > the recipient will not be able to use the file descriptor for
> > > > non-append writes; if that is not actually true, that should probably
> > > > be noted in the open.2 manpage, at the end of the description of
> > > > O_APPEND.
> > >
> > > fcntl F_SETFL can remove O_APPEND, so it is not a security boundary.
> > > I'm not sure how this interacts with S_APPEND; presumably fcntl
> > > rechecks it.
> >
> > Ah, good point, you're right. In fs/fcntl.c:
> >
> >   35 static int setfl(int fd, struct file * filp, unsigned long arg)
> >   36 {
> >   37    struct inode * inode = file_inode(filp);
> >   38    int error = 0;
> >   39
> >   40    /*
> >   41     * O_APPEND cannot be cleared if the file is marked as append-only
> >   42     * and the file is open for write.
> >   43     */
> >   44    if (((arg ^ filp->f_flags) & O_APPEND) && IS_APPEND(inode))
> >   45            return -EPERM;
>
> FWIW I think this check is mildly wrong; it seems to disallow *adding*
> O_APPEND if the file became chattr +a after it was opened. It should
> probably be changed to only disallow removal.

Yeah...

> > > So just checking IS_APPEND in the code paths used by
> > > pwritev2 (and erroring out rather than silently writing output at the
> > > wrong place) should suffice to preserve all existing security
> > > invariants.
> >
> > Makes sense.
>
> There are 3 places where kiocb_set_rw_flags is called with flags that
> seem to be controlled by userspace: aio.c, io_uring.c, and
> read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if
> the underlying inode is S_APPEND. To avoid repeating the same logic in
> an error-prone way, should kiocb_set_rw_flags's signature be updated
> to take the filp so that it can obtain the inode and check IS_APPEND
> before accepting RWF_NOAPPEND? It's inline so this should avoid
> actually loading anything except in the codepath where
> flags&RWF_NOAPPEND is nonzero.

You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT
branch of kiocb_set_rw_flags().

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-30 19:02         ` Jann Horn
@ 2020-08-30 20:00           ` Rich Felker
  2020-08-31  1:15             ` Jann Horn
  0 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2020-08-30 20:00 UTC (permalink / raw)
  To: Jann Horn; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote:
> On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote:
> > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote:
> > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote:
> > > > So just checking IS_APPEND in the code paths used by
> > > > pwritev2 (and erroring out rather than silently writing output at the
> > > > wrong place) should suffice to preserve all existing security
> > > > invariants.
> > >
> > > Makes sense.
> >
> > There are 3 places where kiocb_set_rw_flags is called with flags that
> > seem to be controlled by userspace: aio.c, io_uring.c, and
> > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if
> > the underlying inode is S_APPEND. To avoid repeating the same logic in
> > an error-prone way, should kiocb_set_rw_flags's signature be updated
> > to take the filp so that it can obtain the inode and check IS_APPEND
> > before accepting RWF_NOAPPEND? It's inline so this should avoid
> > actually loading anything except in the codepath where
> > flags&RWF_NOAPPEND is nonzero.
> 
> You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT
> branch of kiocb_set_rw_flags().

Thanks. I should have looked for that. OK, so a fixup like this on top
of the existing patch?

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 473289bff4c6..674131e8d139 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
 	if (flags & RWF_APPEND)
 		ki->ki_flags |= IOCB_APPEND;
-	if (flags & RWF_NOAPPEND)
+	if (flags & RWF_NOAPPEND) {
+		if (IS_APPEND(file_inode(ki->ki_filp)))
+			return -EPERM;
 		ki->ki_flags &= ~IOCB_APPEND;
+	}
 	return 0;
 }
 
If this is good I'll submit a v2 as the above squashed with the
original patch.

Rich

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-30 20:00           ` Rich Felker
@ 2020-08-31  1:15             ` Jann Horn
  2020-08-31  1:46               ` Rich Felker
  0 siblings, 1 reply; 12+ messages in thread
From: Jann Horn @ 2020-08-31  1:15 UTC (permalink / raw)
  To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Sun, Aug 30, 2020 at 10:00 PM Rich Felker <dalias@libc.org> wrote:
> On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote:
> > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote:
> > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote:
> > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote:
> > > > > So just checking IS_APPEND in the code paths used by
> > > > > pwritev2 (and erroring out rather than silently writing output at the
> > > > > wrong place) should suffice to preserve all existing security
> > > > > invariants.
> > > >
> > > > Makes sense.
> > >
> > > There are 3 places where kiocb_set_rw_flags is called with flags that
> > > seem to be controlled by userspace: aio.c, io_uring.c, and
> > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if
> > > the underlying inode is S_APPEND. To avoid repeating the same logic in
> > > an error-prone way, should kiocb_set_rw_flags's signature be updated
> > > to take the filp so that it can obtain the inode and check IS_APPEND
> > > before accepting RWF_NOAPPEND? It's inline so this should avoid
> > > actually loading anything except in the codepath where
> > > flags&RWF_NOAPPEND is nonzero.
> >
> > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT
> > branch of kiocb_set_rw_flags().
>
> Thanks. I should have looked for that. OK, so a fixup like this on top
> of the existing patch?
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 473289bff4c6..674131e8d139 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>                 ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
>         if (flags & RWF_APPEND)
>                 ki->ki_flags |= IOCB_APPEND;
> -       if (flags & RWF_NOAPPEND)
> +       if (flags & RWF_NOAPPEND) {
> +               if (IS_APPEND(file_inode(ki->ki_filp)))
> +                       return -EPERM;
>                 ki->ki_flags &= ~IOCB_APPEND;
> +       }
>         return 0;
>  }
>
> If this is good I'll submit a v2 as the above squashed with the
> original patch.

Looks good to me.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-31  1:15             ` Jann Horn
@ 2020-08-31  1:46               ` Rich Felker
  2020-08-31  9:15                 ` Jann Horn
  0 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2020-08-31  1:46 UTC (permalink / raw)
  To: Jann Horn; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Mon, Aug 31, 2020 at 03:15:04AM +0200, Jann Horn wrote:
> On Sun, Aug 30, 2020 at 10:00 PM Rich Felker <dalias@libc.org> wrote:
> > On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote:
> > > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote:
> > > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote:
> > > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote:
> > > > > > So just checking IS_APPEND in the code paths used by
> > > > > > pwritev2 (and erroring out rather than silently writing output at the
> > > > > > wrong place) should suffice to preserve all existing security
> > > > > > invariants.
> > > > >
> > > > > Makes sense.
> > > >
> > > > There are 3 places where kiocb_set_rw_flags is called with flags that
> > > > seem to be controlled by userspace: aio.c, io_uring.c, and
> > > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if
> > > > the underlying inode is S_APPEND. To avoid repeating the same logic in
> > > > an error-prone way, should kiocb_set_rw_flags's signature be updated
> > > > to take the filp so that it can obtain the inode and check IS_APPEND
> > > > before accepting RWF_NOAPPEND? It's inline so this should avoid
> > > > actually loading anything except in the codepath where
> > > > flags&RWF_NOAPPEND is nonzero.
> > >
> > > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT
> > > branch of kiocb_set_rw_flags().
> >
> > Thanks. I should have looked for that. OK, so a fixup like this on top
> > of the existing patch?
> >
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 473289bff4c6..674131e8d139 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
> >                 ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
> >         if (flags & RWF_APPEND)
> >                 ki->ki_flags |= IOCB_APPEND;
> > -       if (flags & RWF_NOAPPEND)
> > +       if (flags & RWF_NOAPPEND) {
> > +               if (IS_APPEND(file_inode(ki->ki_filp)))
> > +                       return -EPERM;
> >                 ki->ki_flags &= ~IOCB_APPEND;
> > +       }
> >         return 0;
> >  }
> >
> > If this is good I'll submit a v2 as the above squashed with the
> > original patch.
> 
> Looks good to me.

Actually it's not quite. I think it should be:

	if ((flags & RWF_NOAPPEND) & (ki->ki_flags & IOCB_APPEND)) {
		if (IS_APPEND(file_inode(ki->ki_filp)))
			return -EPERM;
		ki->ki_flags &= ~IOCB_APPEND;
	}

i.e. don't refuse RWF_NOAPPEND on a file that was already successfully
opened without O_APPEND that only subsequently got chattr +a. The
permission check should only be done if it's overriding the default
action for how the file is open.

This is actually related to the fcntl corner case mentioned before.

Are you ok with this change? If so I'll go ahead and prepare a v2.

Rich

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-31  1:46               ` Rich Felker
@ 2020-08-31  9:15                 ` Jann Horn
  2020-08-31 12:57                   ` Rich Felker
  0 siblings, 1 reply; 12+ messages in thread
From: Jann Horn @ 2020-08-31  9:15 UTC (permalink / raw)
  To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Mon, Aug 31, 2020 at 3:46 AM Rich Felker <dalias@libc.org> wrote:
> On Mon, Aug 31, 2020 at 03:15:04AM +0200, Jann Horn wrote:
> > On Sun, Aug 30, 2020 at 10:00 PM Rich Felker <dalias@libc.org> wrote:
> > > On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote:
> > > > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote:
> > > > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote:
> > > > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote:
> > > > > > > So just checking IS_APPEND in the code paths used by
> > > > > > > pwritev2 (and erroring out rather than silently writing output at the
> > > > > > > wrong place) should suffice to preserve all existing security
> > > > > > > invariants.
> > > > > >
> > > > > > Makes sense.
> > > > >
> > > > > There are 3 places where kiocb_set_rw_flags is called with flags that
> > > > > seem to be controlled by userspace: aio.c, io_uring.c, and
> > > > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if
> > > > > the underlying inode is S_APPEND. To avoid repeating the same logic in
> > > > > an error-prone way, should kiocb_set_rw_flags's signature be updated
> > > > > to take the filp so that it can obtain the inode and check IS_APPEND
> > > > > before accepting RWF_NOAPPEND? It's inline so this should avoid
> > > > > actually loading anything except in the codepath where
> > > > > flags&RWF_NOAPPEND is nonzero.
> > > >
> > > > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT
> > > > branch of kiocb_set_rw_flags().
> > >
> > > Thanks. I should have looked for that. OK, so a fixup like this on top
> > > of the existing patch?
> > >
> > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > index 473289bff4c6..674131e8d139 100644
> > > --- a/include/linux/fs.h
> > > +++ b/include/linux/fs.h
> > > @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
> > >                 ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
> > >         if (flags & RWF_APPEND)
> > >                 ki->ki_flags |= IOCB_APPEND;
> > > -       if (flags & RWF_NOAPPEND)
> > > +       if (flags & RWF_NOAPPEND) {
> > > +               if (IS_APPEND(file_inode(ki->ki_filp)))
> > > +                       return -EPERM;
> > >                 ki->ki_flags &= ~IOCB_APPEND;
> > > +       }
> > >         return 0;
> > >  }
> > >
> > > If this is good I'll submit a v2 as the above squashed with the
> > > original patch.
> >
> > Looks good to me.
>
> Actually it's not quite. I think it should be:
>
>         if ((flags & RWF_NOAPPEND) & (ki->ki_flags & IOCB_APPEND)) {
>                 if (IS_APPEND(file_inode(ki->ki_filp)))
>                         return -EPERM;
>                 ki->ki_flags &= ~IOCB_APPEND;
>         }
>
> i.e. don't refuse RWF_NOAPPEND on a file that was already successfully
> opened without O_APPEND that only subsequently got chattr +a. The
> permission check should only be done if it's overriding the default
> action for how the file is open.
>
> This is actually related to the fcntl corner case mentioned before.
>
> Are you ok with this change? If so I'll go ahead and prepare a v2.

Ah, yeah, I guess that makes sense to keep things more consistent.

(You'll have to write that as "(flags & RWF_NOAPPEND) && (ki->ki_flags
& IOCB_APPEND)" though (logical AND, not bitwise AND).)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-31  9:15                 ` Jann Horn
@ 2020-08-31 12:57                   ` Rich Felker
  2020-08-31 13:12                     ` Jann Horn
  0 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2020-08-31 12:57 UTC (permalink / raw)
  To: Jann Horn; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Mon, Aug 31, 2020 at 11:15:57AM +0200, Jann Horn wrote:
> On Mon, Aug 31, 2020 at 3:46 AM Rich Felker <dalias@libc.org> wrote:
> > On Mon, Aug 31, 2020 at 03:15:04AM +0200, Jann Horn wrote:
> > > On Sun, Aug 30, 2020 at 10:00 PM Rich Felker <dalias@libc.org> wrote:
> > > > On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote:
> > > > > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote:
> > > > > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote:
> > > > > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote:
> > > > > > > > So just checking IS_APPEND in the code paths used by
> > > > > > > > pwritev2 (and erroring out rather than silently writing output at the
> > > > > > > > wrong place) should suffice to preserve all existing security
> > > > > > > > invariants.
> > > > > > >
> > > > > > > Makes sense.
> > > > > >
> > > > > > There are 3 places where kiocb_set_rw_flags is called with flags that
> > > > > > seem to be controlled by userspace: aio.c, io_uring.c, and
> > > > > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if
> > > > > > the underlying inode is S_APPEND. To avoid repeating the same logic in
> > > > > > an error-prone way, should kiocb_set_rw_flags's signature be updated
> > > > > > to take the filp so that it can obtain the inode and check IS_APPEND
> > > > > > before accepting RWF_NOAPPEND? It's inline so this should avoid
> > > > > > actually loading anything except in the codepath where
> > > > > > flags&RWF_NOAPPEND is nonzero.
> > > > >
> > > > > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT
> > > > > branch of kiocb_set_rw_flags().
> > > >
> > > > Thanks. I should have looked for that. OK, so a fixup like this on top
> > > > of the existing patch?
> > > >
> > > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > > index 473289bff4c6..674131e8d139 100644
> > > > --- a/include/linux/fs.h
> > > > +++ b/include/linux/fs.h
> > > > @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
> > > >                 ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
> > > >         if (flags & RWF_APPEND)
> > > >                 ki->ki_flags |= IOCB_APPEND;
> > > > -       if (flags & RWF_NOAPPEND)
> > > > +       if (flags & RWF_NOAPPEND) {
> > > > +               if (IS_APPEND(file_inode(ki->ki_filp)))
> > > > +                       return -EPERM;
> > > >                 ki->ki_flags &= ~IOCB_APPEND;
> > > > +       }
> > > >         return 0;
> > > >  }
> > > >
> > > > If this is good I'll submit a v2 as the above squashed with the
> > > > original patch.
> > >
> > > Looks good to me.
> >
> > Actually it's not quite. I think it should be:
> >
> >         if ((flags & RWF_NOAPPEND) & (ki->ki_flags & IOCB_APPEND)) {
> >                 if (IS_APPEND(file_inode(ki->ki_filp)))
> >                         return -EPERM;
> >                 ki->ki_flags &= ~IOCB_APPEND;
> >         }
> >
> > i.e. don't refuse RWF_NOAPPEND on a file that was already successfully
> > opened without O_APPEND that only subsequently got chattr +a. The
> > permission check should only be done if it's overriding the default
> > action for how the file is open.
> >
> > This is actually related to the fcntl corner case mentioned before.
> >
> > Are you ok with this change? If so I'll go ahead and prepare a v2.
> 
> Ah, yeah, I guess that makes sense to keep things more consistent.
> 
> (You'll have to write that as "(flags & RWF_NOAPPEND) && (ki->ki_flags
> & IOCB_APPEND)" though (logical AND, not bitwise AND).)

Thanks, no idea how I made that mistake -- probably typing code in
email.

While reparing this I rebased against Linus's tree, and found
conflicts with 1752f0adea98ef85 which were easy to resolve.
Unfortunately the same improvement made in that commit does not work
for the new RWF_NOAPPEND, since it needs to inspect and mask bits off
the original ki_flags, not the local set of added flags, but the
penalty should be isolated to this branch. I'm not opposed to adding
unlikely() around it if you think that would help codegen for the
common cases.

Alternatively, kiocb_flags could be initialized to ki->ki_flags, with
assignment-back in place of |= at the end of the function. This might
be more elegant but I'm not sure if the emitted code would improve.

Rich

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-31 12:57                   ` Rich Felker
@ 2020-08-31 13:12                     ` Jann Horn
  0 siblings, 0 replies; 12+ messages in thread
From: Jann Horn @ 2020-08-31 13:12 UTC (permalink / raw)
  To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro

On Mon, Aug 31, 2020 at 2:57 PM Rich Felker <dalias@libc.org> wrote:
> On Mon, Aug 31, 2020 at 11:15:57AM +0200, Jann Horn wrote:
> > On Mon, Aug 31, 2020 at 3:46 AM Rich Felker <dalias@libc.org> wrote:
> > > On Mon, Aug 31, 2020 at 03:15:04AM +0200, Jann Horn wrote:
> > > > On Sun, Aug 30, 2020 at 10:00 PM Rich Felker <dalias@libc.org> wrote:
> > > > > On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote:
> > > > > > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote:
> > > > > > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote:
> > > > > > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote:
> > > > > > > > > So just checking IS_APPEND in the code paths used by
> > > > > > > > > pwritev2 (and erroring out rather than silently writing output at the
> > > > > > > > > wrong place) should suffice to preserve all existing security
> > > > > > > > > invariants.
> > > > > > > >
> > > > > > > > Makes sense.
> > > > > > >
> > > > > > > There are 3 places where kiocb_set_rw_flags is called with flags that
> > > > > > > seem to be controlled by userspace: aio.c, io_uring.c, and
> > > > > > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if
> > > > > > > the underlying inode is S_APPEND. To avoid repeating the same logic in
> > > > > > > an error-prone way, should kiocb_set_rw_flags's signature be updated
> > > > > > > to take the filp so that it can obtain the inode and check IS_APPEND
> > > > > > > before accepting RWF_NOAPPEND? It's inline so this should avoid
> > > > > > > actually loading anything except in the codepath where
> > > > > > > flags&RWF_NOAPPEND is nonzero.
> > > > > >
> > > > > > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT
> > > > > > branch of kiocb_set_rw_flags().
> > > > >
> > > > > Thanks. I should have looked for that. OK, so a fixup like this on top
> > > > > of the existing patch?
> > > > >
> > > > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > > > index 473289bff4c6..674131e8d139 100644
> > > > > --- a/include/linux/fs.h
> > > > > +++ b/include/linux/fs.h
> > > > > @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
> > > > >                 ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
> > > > >         if (flags & RWF_APPEND)
> > > > >                 ki->ki_flags |= IOCB_APPEND;
> > > > > -       if (flags & RWF_NOAPPEND)
> > > > > +       if (flags & RWF_NOAPPEND) {
> > > > > +               if (IS_APPEND(file_inode(ki->ki_filp)))
> > > > > +                       return -EPERM;
> > > > >                 ki->ki_flags &= ~IOCB_APPEND;
> > > > > +       }
> > > > >         return 0;
> > > > >  }
> > > > >
> > > > > If this is good I'll submit a v2 as the above squashed with the
> > > > > original patch.
> > > >
> > > > Looks good to me.
> > >
> > > Actually it's not quite. I think it should be:
> > >
> > >         if ((flags & RWF_NOAPPEND) & (ki->ki_flags & IOCB_APPEND)) {
> > >                 if (IS_APPEND(file_inode(ki->ki_filp)))
> > >                         return -EPERM;
> > >                 ki->ki_flags &= ~IOCB_APPEND;
> > >         }
> > >
> > > i.e. don't refuse RWF_NOAPPEND on a file that was already successfully
> > > opened without O_APPEND that only subsequently got chattr +a. The
> > > permission check should only be done if it's overriding the default
> > > action for how the file is open.
> > >
> > > This is actually related to the fcntl corner case mentioned before.
[...]
> While reparing this I rebased against Linus's tree, and found
> conflicts with 1752f0adea98ef85 which were easy to resolve.
> Unfortunately the same improvement made in that commit does not work
> for the new RWF_NOAPPEND, since it needs to inspect and mask bits off
> the original ki_flags, not the local set of added flags, but the
> penalty should be isolated to this branch. I'm not opposed to adding
> unlikely() around it if you think that would help codegen for the
> common cases.
>
> Alternatively, kiocb_flags could be initialized to ki->ki_flags, with
> assignment-back in place of |= at the end of the function. This might
> be more elegant but I'm not sure if the emitted code would improve.

Presumably RWF_NOAPPEND would be somewhat rare, and a simple
comparison and not-taken branch should be really cheap? I'm not really
concerned about it. I guess you can CC the author of that patch on
your v2.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-08-31 13:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-29  2:00 [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 Rich Felker
2020-08-30 15:05 ` Jann Horn
2020-08-30 16:36   ` Rich Felker
2020-08-30 18:31     ` Jann Horn
2020-08-30 18:43       ` Rich Felker
2020-08-30 19:02         ` Jann Horn
2020-08-30 20:00           ` Rich Felker
2020-08-31  1:15             ` Jann Horn
2020-08-31  1:46               ` Rich Felker
2020-08-31  9:15                 ` Jann Horn
2020-08-31 12:57                   ` Rich Felker
2020-08-31 13:12                     ` Jann Horn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).