Linux-Fsdevel Archive on lore.kernel.org
From: Miklos Szeredi <miklos@szeredi.hu>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Jeff Layton <jlayton@poochiereds.net>,
	"J . Bruce Fields" <bfields@fieldses.org>,
	overlayfs <linux-unionfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v2 04/17] ovl: decode connected upper dir file handles
Date: Mon, 15 Jan 2018 15:56:06 +0100
Message-ID: <CAJfpegvCVX-YEMRDfKTYM2Au=2bNT3RcX=uNyt9KO1EmBJOvuA@mail.gmail.com>
In-Reply-To: <CAOQ4uxijgqiAjKysA-HZFLa+b2nETMhq9aW8rjv=hwCq1B8tfw@mail.gmail.com>

On Mon, Jan 15, 2018 at 1:20 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Mon, Jan 15, 2018 at 1:33 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Thu, Jan 4, 2018 at 6:20 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> Until this change, we decoded upper file handles by instantiating an
>>> overlay dentry from the real upper dentry. This is sufficient to handle
>>> pure upper files, but insufficient to handle merge/impure dirs.
>>>
>>> To that end, if decoded real upper dir is connected and hashed, we
>>> lookup an overlay dentry with the same path as the real upper dir.
>>> If decoded real upper is non-dir, we instantiate a disconnected overlay
>>> dentry as before this change.
>>>
>>> Because ovl_fh_to_dentry() returns connected overlay dir dentries,
>>> exportfs never needs to call get_parent() and get_name() to reconnect an
>>> upper overlay dir. Because connectable non-dir file handles are not
>>> supported, exportfs will not be able to use fh_to_parent() and get_name()
>>> methods to reconnect a disconnected non-dir to its parent. Therefore, the
>>> methods get_parent() and get_name() are implemented just to print out a
>>> sanity warning and the method fh_to_parent() is implemented to warn the
>>> user that using the 'subtree_check' exportfs option is not supported.
>>>
>>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>>> ---
>>>  fs/overlayfs/export.c | 172 +++++++++++++++++++++++++++++++++++++++++++++++++-
>>>  1 file changed, 171 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
>>> index 5c72784a0b4d..48ae02f3acb8 100644
>>> --- a/fs/overlayfs/export.c
>>> +++ b/fs/overlayfs/export.c
>>> @@ -130,6 +130,145 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
>>>         return dentry;
>>>  }
>>>
>>> +/*
>>> + * Lookup a child overlay dentry whose real dentry is @real.
>>> + * If @is_upper is true then we lookup a child overlay dentry with the same
>>> + * name as the real dentry. Otherwise, we need to consult index for lookup.
>>> + */
>>> +static struct dentry *ovl_lookup_real_one(struct dentry *parent,
>>> +                                         struct dentry *real, bool is_upper)
>>> +{
>>> +       struct dentry *this;
>>> +       struct qstr *name = &real->d_name;
>>> +       int err;
>>> +
>>> +       /* TODO: use index when looking up by lower real dentry */
>>> +       if (!is_upper)
>>> +               return ERR_PTR(-EACCES);
>>> +
>>> +       /* Lookup overlay dentry by real name */
>>> +       this = lookup_one_len_unlocked(name->name, parent, name->len);
>>> +       err = PTR_ERR(this);
>>> +       if (IS_ERR(this)) {
>>> +               goto fail;
>>> +       } else if (!this || !this->d_inode) {
>>> +               dput(this);
>>> +               err = -ENOENT;
>>> +               goto fail;
>>> +       } else if (ovl_dentry_upper(this) != real) {
>>> +               dput(this);
>>> +               err = -ESTALE;
>>> +               goto fail;
>>> +       }
>>> +
>>> +       return this;
>>> +
>>> +fail:
>>> +       pr_warn_ratelimited("overlayfs: failed to lookup one by real (%pd2, is_upper=%d, parent=%pd2, err=%i)\n",
>>> +                           real, is_upper, parent, err);
>>> +       return ERR_PTR(err);
>>> +}
>>> +
>>> +/*
>>> + * Lookup an overlay dentry whose real dentry is @real.
>>> + * If @is_upper is true then we lookup an overlay dentry with the same path
>>> + * as the real dentry. Otherwise, we need to consult index for lookup.
>>> + */
>>> +static struct dentry *ovl_lookup_real(struct super_block *sb,
>>> +                                     struct dentry *real, bool is_upper)
>>> +{
>>> +       struct dentry *connected;
>>> +       int err = 0;
>>> +
>>> +       /* TODO: use index when looking up by lower real dentry */
>>> +       if (!is_upper)
>>> +               return ERR_PTR(-EACCES);
>>> +
>>> +       connected = dget(sb->s_root);
>>> +       while (!err) {
>>> +               struct dentry *next, *this;
>>> +               struct dentry *parent = NULL;
>>> +               struct dentry *real_connected = ovl_dentry_upper(connected);
>>> +
>>> +               if (real_connected == real)
>>> +                       break;
>>> +
>>> +               next = dget(real);
>>> +               /* find the topmost dentry not yet connected */
>>> +               for (;;) {
>>> +                       parent = dget_parent(next);
>>> +
>>> +                       if (real_connected == parent)
>>> +                               break;
>>> +
>>> +                       /*
>>> +                        * If real file has been moved out of the layer root
>>> +                        * directory, we will eventually hit the real fs root.
>>> +                        */
>>> +                       if (parent == next) {
>>> +                               err = -EXDEV;
>>> +                               break;
>>> +                       }
>>
>> This seems to assume no cross directory renames of directories in the
>> ancestry of "real", but AFAICS nothing prevents that.
>
> Do you mean online modification of underlying fs? or rename in overlay?

Rename in overlay.

> For online modification of underlying fs, I don't see a reason to make it work.
> -ESTALE would be a perfectly valid result in that case.

Sure.

>>
>> Also why not use the inode cache to find already connected dirs?
>> Seems more efficient than always going up to the root and going down
>> from there.
>
> See patch [14/17] ovl: lookup connected ancestor of dir in inode cache
> Sorry for ordering patches like this, it was more convenient to implement
> the cold cache algorithm and then add hot cache into the mix.

Okay.

>>
>> So, a working algorithm would be going up to the first connected
>> parent or root, lock parent, lookup name and restart.  Not guaranteed
>> to finish, since not protected against always racing with renames.
>> Can we take s_vfs_rename_sem on ovl to prevent that?
>>
>
> Sounds like a simple and good enough solution.
> Do we really need the locking of parent and restart connect if
> we take s_vfs_rename_sem around ovl_lookup_real()?

No, but s_vfs_rename_sem is a really heavyweight solution; we should
do better than that for decoding a file handle.

And we probably don't need anything else, since a rename on an ancestor
means the renamed dir is connected, and hopefully not evicted from the
cache until we repeat the walk up.

So we need to lock the parent, look up the ovl dentry, and verify we got
the same upper; if not, retry the icache lookup.

Not sure we need to worry about that "hopefully".  Hopefully not.
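
To make the walk concrete, here is a rough userspace model of the
"go up to the first connected ancestor, connect one level, restart"
loop discussed above. It is only a sketch: struct node, its "connected"
flag, topmost_unconnected() and connect_path() are made-up names for
illustration, not kernel APIs, and the locking and upper-dentry
verification are reduced to a comment.

```c
#include <stddef.h>

/* Toy stand-in for a real (upper) dentry; NOT the kernel's struct dentry. */
struct node {
	const char *name;
	struct node *parent;	/* points to itself at the fs root */
	int connected;		/* 1 if an overlay dentry is already known */
};

/*
 * Walk up from @real and return the topmost ancestor (or @real itself)
 * whose parent is already connected. Returns NULL if the fs root is
 * reached first, which corresponds to the -EXDEV case in the patch.
 */
static struct node *topmost_unconnected(struct node *real)
{
	struct node *next = real;

	for (;;) {
		struct node *parent = next->parent;

		if (parent->connected)
			return next;
		if (parent == next)
			return NULL;
		next = parent;
	}
}

/*
 * Connect @real one level at a time, restarting the upward walk after
 * each step. Returns the number of lookup steps taken, or -1 if @real
 * is not underneath any connected ancestor.
 */
static int connect_path(struct node *real)
{
	int steps = 0;

	while (!real->connected) {
		struct node *next = topmost_unconnected(real);

		if (!next)
			return -1;
		/*
		 * Hypothetical kernel step: lock the parent, look up the
		 * overlay dentry by name, verify it has the same upper,
		 * and retry the walk if it does not. Here the lookup is
		 * modelled as always succeeding.
		 */
		next->connected = 1;
		steps++;
	}
	return steps;
}
```

With a connected layer root and an unconnected chain a/b/c below it,
connect_path() on c takes three steps; a node whose ancestry never
reaches a connected dir gives -1.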

Thanks,
Miklos
