Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Alessio Balsini <balsini@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	Akilesh Kailash <akailash@google.com>,
	David Anderson <dvander@google.com>,
	Eric Yan <eric.yan@oneplus.com>, Jann Horn <jannh@google.com>,
	Jens Axboe <axboe@kernel.dk>, Martijn Coenen <maco@android.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Paul Lawrence <paullawrence@google.com>,
	Stefano Duo <stefanoduo@google.com>,
	Zimuzo Ezeozue <zezeozue@google.com>,
	fuse-devel <fuse-devel@lists.sourceforge.net>,
	kernel-team <kernel-team@android.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH V8 1/3] fuse: Definitions and ioctl() for passthrough
Date: Fri, 18 Sep 2020 22:59:16 +0300	[thread overview]
Message-ID: <CAOQ4uxhNddkdZ5TCdg6Gdb9oYqNVUrpk25kGYxZNe-LDsZV_Ag@mail.gmail.com> (raw)
In-Reply-To: <20200918163354.GB3385065@google.com>

On Fri, Sep 18, 2020 at 7:33 PM Alessio Balsini <balsini@android.com> wrote:
>
> Hi Amir,
>
> Thanks again for your feedback.
>
> On Sat, Sep 12, 2020 at 02:06:02PM +0300, Amir Goldstein wrote:
> > On Fri, Sep 11, 2020 at 7:34 PM Alessio Balsini <balsini@android.com> wrote:
> > [...]
> > > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> > > index bba747520e9b..eb223130a917 100644
> > > --- a/fs/fuse/inode.c
> > > +++ b/fs/fuse/inode.c
> > > @@ -965,6 +965,12 @@ static void process_init_reply(struct fuse_conn *fc, struct fuse_args *args,
> > >                                         min_t(unsigned int, FUSE_MAX_MAX_PAGES,
> > >                                         max_t(unsigned int, arg->max_pages, 1));
> > >                         }
> > > +                       if (arg->flags & FUSE_PASSTHROUGH) {
> > > +                               fc->passthrough = 1;
> > > +                               /* Prevent further stacking */
> > > +                               fc->sb->s_stack_depth =
> > > +                                       FILESYSTEM_MAX_STACK_DEPTH;
> > > +                       }
> >
> > That seems a bit limiting.
> > I suppose what you really want to avoid is loops into FUSE fd.
> > There may be a way to do this with forbidding overlay over FUSE passthrough
> > or the other way around.
> >
> > You can set fc->sb->s_stack_depth = FILESYSTEM_MAX_STACK_DEPTH - 1
> > here and in passthrough ioctl you can check for looping into a fuse fs with
> > passthrough enabled on the passed fd (see below) ...
> >
> > [...]
> > > diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c
> > > new file mode 100644
> > > index 000000000000..86ab4eafa7bf
> > > --- /dev/null
> > > +++ b/fs/fuse/passthrough.c
> > > @@ -0,0 +1,55 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +#include "fuse_i.h"
> > > +
> > > +int fuse_passthrough_setup(struct fuse_req *req, unsigned int fd)
> > > +{
> > > +       int ret;
> > > +       int fs_stack_depth;
> > > +       struct file *passthrough_filp;
> > > +       struct inode *passthrough_inode;
> > > +       struct super_block *passthrough_sb;
> > > +
> > > +       /* Passthrough mode can only be enabled at file open/create time */
> > > +       if (req->in.h.opcode != FUSE_OPEN && req->in.h.opcode != FUSE_CREATE) {
> > > +               pr_err("FUSE: invalid OPCODE for request.\n");
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       passthrough_filp = fget(fd);
> > > +       if (!passthrough_filp) {
> > > +               pr_err("FUSE: invalid file descriptor for passthrough.\n");
> > > +               return -EINVAL;
> > > +       }
> > > +
> > > +       ret = -EINVAL;
> > > +       if (!passthrough_filp->f_op->read_iter ||
> > > +           !passthrough_filp->f_op->write_iter) {
> > > +               pr_err("FUSE: passthrough file misses file operations.\n");
> > > +               goto out;
> > > +       }
> > > +
> > > +       passthrough_inode = file_inode(passthrough_filp);
> > > +       passthrough_sb = passthrough_inode->i_sb;
> > > +       fs_stack_depth = passthrough_sb->s_stack_depth + 1;
> >
> > ... for example:
> >
> >        if (fs_stack_depth && passthrough_sb->s_type == fuse_fs_type) {
> >                pr_err("FUSE: stacked passthrough file\n");
> >                goto out;
> >        }
> >
> > But maybe we want to ban passthrough to any lower FUSE at least for start.
>
>
> Yes, what I proposed here is very conservative, and your solution sounds
> good to me. Unfortunately I don't have a clear idea of what could go wrong
> if we relax this constraint. I need some guidance from you experts here.
>

I guess the main concern would be locking order and deadlocks.
With my suggestion I think deadlocks are avoided and I am less sure
but think that lockdep should not have false positives either.

If people do need the 1-level stacking, I can try to think harder
if it is safe and maybe add some more compromise limitations.

> What do you think if we keep this overly strict rule for now to avoid
> unintended behaviors and come back as we find affected use case?
>

I can live with that if other designated users don't mind the limitation.

I happen to be developing a passthrough FUSE fs [1] myself and
I also happen to be using it to pass through to overlayfs.
OTOH, the workloads for my use case are mostly large sequential IO,
and the hardware can handle the few extra syscalls, so the passthrough
fd feature is not urgent for my use case at this point in time.


>
> >
> > > +       ret = -EEXIST;
> >
> > Why EEXIST? Why not EINVAL?
> >
>
>
> Reaching the stacking limit sounded like an error caused by the undesired
> existence of something, thus EEXIST sounded like a good fit.
> No problem in changing that to EINVAL.
>
>
>
> > > +       if (fs_stack_depth > FILESYSTEM_MAX_STACK_DEPTH) {
> > > +               pr_err("FUSE: maximum fs stacking depth exceeded for passthrough\n");
> > > +               goto out;
> > > +       }
> > > +
> > > +       req->args->passthrough_filp = passthrough_filp;
> > > +       return 0;
> > > +out:
> > > +       fput(passthrough_filp);
> > > +       return ret;
> > > +}
> > > +
> >
> > And speaking of overlayfs, I believe you may be able to test your code with
> > fuse-overlayfs (passthrough to upper files).
> >
...
>
> This is indeed a project with several common elements to what we are doing
> in Android,

Are you in liberty to share more information about the Android project?
Is it related to Incremental FS [2]?

Thanks,
Amir.

[1] https://github.com/amir73il/libfuse/commits/cachegwfs
[2] https://lore.kernel.org/linux-fsdevel/20190502040331.81196-1-ezemtsov@google.com/

  reply	other threads:[~2020-09-18 19:59 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-11 16:34 [PATCH V8 0/3] fuse: Add support for passthrough read/write Alessio Balsini
2020-09-11 16:34 ` [PATCH V8 1/3] fuse: Definitions and ioctl() for passthrough Alessio Balsini
2020-09-12 11:06   ` Amir Goldstein
2020-09-18 16:33     ` Alessio Balsini
2020-09-18 19:59       ` Amir Goldstein [this message]
2020-09-22 12:15         ` Alessio Balsini
2020-09-22 16:08           ` Amir Goldstein
2020-09-29 14:30             ` Alessio Balsini
2020-09-11 16:34 ` [PATCH V8 2/3] fuse: Introduce synchronous read and write " Alessio Balsini
2020-09-12  9:55   ` Amir Goldstein
2020-09-21 11:01     ` Alessio Balsini
2020-09-21 13:07       ` Amir Goldstein
2020-09-11 16:34 ` [PATCH V8 3/3] fuse: Handle AIO read and write in passthrough Alessio Balsini
2020-09-11 17:23   ` Jens Axboe
2020-09-21 15:28     ` Alessio Balsini
2020-09-11 18:46 ` [fuse-devel] [PATCH V8 0/3] fuse: Add support for passthrough read/write Antonio SJ Musumeci
2020-09-18 16:03   ` Alessio Balsini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOQ4uxhNddkdZ5TCdg6Gdb9oYqNVUrpk25kGYxZNe-LDsZV_Ag@mail.gmail.com \
    --to=amir73il@gmail.com \
    --cc=akailash@google.com \
    --cc=axboe@kernel.dk \
    --cc=balsini@android.com \
    --cc=dvander@google.com \
    --cc=eric.yan@oneplus.com \
    --cc=fuse-devel@lists.sourceforge.net \
    --cc=jannh@google.com \
    --cc=kernel-team@android.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maco@android.com \
    --cc=miklos@szeredi.hu \
    --cc=palmer@dabbelt.com \
    --cc=paullawrence@google.com \
    --cc=stefanoduo@google.com \
    --cc=zezeozue@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).