Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
@ 2020-08-31 15:32 Rich Felker
  2020-08-31 15:46 ` Jann Horn
  0 siblings, 1 reply; 3+ messages in thread
From: Rich Felker @ 2020-08-31 15:32 UTC (permalink / raw)
  To: Jann Horn
  Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro, Pavel Begunkov

The pwrite function, originally defined by POSIX (thus the "p"), is
defined to ignore O_APPEND and write at the offset passed as its
argument. However, historically Linux honored O_APPEND if set and
ignored the offset. This cannot be changed due to stability policy,
but is documented in the man page as a bug.

Now that there's a pwritev2 syscall providing a superset of the pwrite
functionality that has a flags argument, the conforming behavior can
be offered to userspace via a new flag. Since pwritev2 checks flag
validity (in kiocb_set_rw_flags) and reports unknown ones with
EOPNOTSUPP, callers will not get wrong behavior on old kernels that
don't support the new flag; the error is reported and the caller can
decide how to handle it.

Signed-off-by: Rich Felker <dalias@libc.org>
---

Changes in v2: I've added a check to ensure that RWF_NOAPPEND does not
override O_APPEND for S_APPEND (chattr +a) inodes, and fixed conflicts
with 1752f0adea98ef85, which optimized kiocb_set_rw_flags to work with
a local copy of flags. Unfortunately the same optimization does not
work for RWF_NOAPPEND since it needs to remove flags from the original
set at function entry.

If desired, I could further change this so that kiocb_flags is
initialized to ki->ki_flags, with assignment-back in place of |= at
the end of the function. This would allow the same local variable
pattern in the RWF_NOAPPEND code path, which might be more elegant,
but I'm not sure if the emitted code would improve or get worse.


 include/linux/fs.h      | 7 +++++++
 include/uapi/linux/fs.h | 5 ++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7519ae003a08..924e17ac8e7e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3321,6 +3321,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 		return 0;
 	if (unlikely(flags & ~RWF_SUPPORTED))
 		return -EOPNOTSUPP;
+	if (unlikely((flags & RWF_APPEND) && (flags & RWF_NOAPPEND)))
+		return -EINVAL;
 
 	if (flags & RWF_NOWAIT) {
 		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
@@ -3335,6 +3337,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 		kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
 	if (flags & RWF_APPEND)
 		kiocb_flags |= IOCB_APPEND;
+	if ((flags & RWF_NOAPPEND) && (ki->ki_flags & IOCB_APPEND)) {
+		if (IS_APPEND(file_inode(ki->ki_filp)))
+			return -EPERM;
+		ki->ki_flags &= ~IOCB_APPEND;
+	}
 
 	ki->ki_flags |= kiocb_flags;
 	return 0;
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index f44eb0a04afd..d5e54e0742cf 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -300,8 +300,11 @@ typedef int __bitwise __kernel_rwf_t;
 /* per-IO O_APPEND */
 #define RWF_APPEND	((__force __kernel_rwf_t)0x00000010)
 
+/* per-IO negation of O_APPEND */
+#define RWF_NOAPPEND	((__force __kernel_rwf_t)0x00000020)
+
 /* mask of flags supported by the kernel */
 #define RWF_SUPPORTED	(RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
-			 RWF_APPEND)
+			 RWF_APPEND | RWF_NOAPPEND)
 
 #endif /* _UAPI_LINUX_FS_H */
-- 
2.21.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-31 15:32 [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2 Rich Felker
@ 2020-08-31 15:46 ` Jann Horn
  2020-08-31 17:05   ` Jens Axboe
  0 siblings, 1 reply; 3+ messages in thread
From: Jann Horn @ 2020-08-31 15:46 UTC (permalink / raw)
  To: Rich Felker, Alexander Viro, Jens Axboe
  Cc: linux-fsdevel, kernel list, Linux API, Pavel Begunkov

On Mon, Aug 31, 2020 at 5:32 PM Rich Felker <dalias@libc.org> wrote:
> The pwrite function, originally defined by POSIX (thus the "p"), is
> defined to ignore O_APPEND and write at the offset passed as its
> argument. However, historically Linux honored O_APPEND if set and
> ignored the offset. This cannot be changed due to stability policy,
> but is documented in the man page as a bug.
>
> Now that there's a pwritev2 syscall providing a superset of the pwrite
> functionality that has a flags argument, the conforming behavior can
> be offered to userspace via a new flag. Since pwritev2 checks flag
> validity (in kiocb_set_rw_flags) and reports unknown ones with
> EOPNOTSUPP, callers will not get wrong behavior on old kernels that
> don't support the new flag; the error is reported and the caller can
> decide how to handle it.
>
> Signed-off-by: Rich Felker <dalias@libc.org>

Reviewed-by: Jann Horn <jannh@google.com>

Note that if this lands, Michael Kerrisk will probably be happy if you
send a corresponding patch for the manpage man2/readv.2.

Btw, I'm not really sure whose tree this should go through - VFS is
normally Al Viro's turf, but it looks like the most recent
modifications to this function have gone through Jens Axboe's tree?

> ---
>
> Changes in v2: I've added a check to ensure that RWF_NOAPPEND does not
> override O_APPEND for S_APPEND (chattr +a) inodes, and fixed conflicts
> with 1752f0adea98ef85, which optimized kiocb_set_rw_flags to work with
> a local copy of flags. Unfortunately the same optimization does not
> work for RWF_NOAPPEND since it needs to remove flags from the original
> set at function entry.
>
> If desired, I could further change this so that kiocb_flags is
> initialized to ki->ki_flags, with assignment-back in place of |= at
> the end of the function. This would allow the same local variable
> pattern in the RWF_NOAPPEND code path, which might be more elegant,
> but I'm not sure if the emitted code would improve or get worse.
>
>
>  include/linux/fs.h      | 7 +++++++
>  include/uapi/linux/fs.h | 5 ++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 7519ae003a08..924e17ac8e7e 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3321,6 +3321,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>                 return 0;
>         if (unlikely(flags & ~RWF_SUPPORTED))
>                 return -EOPNOTSUPP;
> +       if (unlikely((flags & RWF_APPEND) && (flags & RWF_NOAPPEND)))
> +               return -EINVAL;
>
>         if (flags & RWF_NOWAIT) {
>                 if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
> @@ -3335,6 +3337,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>                 kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
>         if (flags & RWF_APPEND)
>                 kiocb_flags |= IOCB_APPEND;
> +       if ((flags & RWF_NOAPPEND) && (ki->ki_flags & IOCB_APPEND)) {
> +               if (IS_APPEND(file_inode(ki->ki_filp)))
> +                       return -EPERM;
> +               ki->ki_flags &= ~IOCB_APPEND;
> +       }
>
>         ki->ki_flags |= kiocb_flags;
>         return 0;
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index f44eb0a04afd..d5e54e0742cf 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -300,8 +300,11 @@ typedef int __bitwise __kernel_rwf_t;
>  /* per-IO O_APPEND */
>  #define RWF_APPEND     ((__force __kernel_rwf_t)0x00000010)
>
> +/* per-IO negation of O_APPEND */
> +#define RWF_NOAPPEND   ((__force __kernel_rwf_t)0x00000020)
> +
>  /* mask of flags supported by the kernel */
>  #define RWF_SUPPORTED  (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
> -                        RWF_APPEND)
> +                        RWF_APPEND | RWF_NOAPPEND)
>
>  #endif /* _UAPI_LINUX_FS_H */
> --
> 2.21.0
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-31 15:46 ` Jann Horn
@ 2020-08-31 17:05   ` Jens Axboe
  0 siblings, 0 replies; 3+ messages in thread
From: Jens Axboe @ 2020-08-31 17:05 UTC (permalink / raw)
  To: Jann Horn, Rich Felker, Alexander Viro
  Cc: linux-fsdevel, kernel list, Linux API, Pavel Begunkov

On 8/31/20 9:46 AM, Jann Horn wrote:
> On Mon, Aug 31, 2020 at 5:32 PM Rich Felker <dalias@libc.org> wrote:
>> The pwrite function, originally defined by POSIX (thus the "p"), is
>> defined to ignore O_APPEND and write at the offset passed as its
>> argument. However, historically Linux honored O_APPEND if set and
>> ignored the offset. This cannot be changed due to stability policy,
>> but is documented in the man page as a bug.
>>
>> Now that there's a pwritev2 syscall providing a superset of the pwrite
>> functionality that has a flags argument, the conforming behavior can
>> be offered to userspace via a new flag. Since pwritev2 checks flag
>> validity (in kiocb_set_rw_flags) and reports unknown ones with
>> EOPNOTSUPP, callers will not get wrong behavior on old kernels that
>> don't support the new flag; the error is reported and the caller can
>> decide how to handle it.
>>
>> Signed-off-by: Rich Felker <dalias@libc.org>
> 
> Reviewed-by: Jann Horn <jannh@google.com>
> 
> Note that if this lands, Michael Kerrisk will probably be happy if you
> send a corresponding patch for the manpage man2/readv.2.
> 
> Btw, I'm not really sure whose tree this should go through - VFS is
> normally Al Viro's turf, but it looks like the most recent
> modifications to this function have gone through Jens Axboe's tree?

Should probably go through Al's tree, I've only carried them when
they've been associated with io_uring in some shape or form.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-08-31 17:05 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-31 15:32 [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2 Rich Felker
2020-08-31 15:46 ` Jann Horn
2020-08-31 17:05   ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).