Linux-Fsdevel Archive on lore.kernel.org
From: Amir Goldstein <amir73il@gmail.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jeff Layton <jlayton@poochiereds.net>,
	"J . Bruce Fields" <bfields@fieldses.org>,
	overlayfs <linux-unionfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v2 04/17] ovl: decode connected upper dir file handles
Date: Wed, 17 Jan 2018 23:36:36 +0200
Message-ID: <CAOQ4uxh+1_EFPLP8k2_-OhAPX74a9s3y9-kFNU56Z_EU1m7Adw@mail.gmail.com>
In-Reply-To: <CAOQ4uxhoR-cs=c6BfLr57X8iw5h1CzUq1YTB3s5HbBYSUSgjag@mail.gmail.com>

On Wed, Jan 17, 2018 at 6:34 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Jan 17, 2018 at 5:42 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> On Wed, Jan 17, 2018 at 12:18 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>>> On Mon, Jan 15, 2018 at 4:56 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>>>>
>>> [...]
>>>> >>
>>>> >> So, a working algorithm would be going up to the first connected
>>>> >> parent or root, lock parent, lookup name and restart.  Not guaranteed
>>>> >> to finish, since not protected against always racing with renames.
>>>> >> Can we take s_vfs_rename_sem on ovl to prevent that?
>>>> >>
>>>> >
>>>> > Sounds like a simple and good enough solution.
>>>> > Do we really need the locking of parent and restart connect if
>>>> > we take s_vfs_rename_sem around ovl_lookup_real()?
>>>>
>>>> No, but s_vfs_rename_sem is a really heavyweight solution, we should
>>>> do better than that for decoding a file handle.
>>>>
>>>> And we probably don't need anything else, since rename on ancestor
>>>> means renamed dir is connected, and hopefully not evicted from the
>>>> cache until we repeat the walk up.
>>>>
>>>> So need to lock parent, lookup ovl dentry, verify we got the same
>>>> upper, if not retry icache lookup.
>>>>
>>>> Not sure we need to worry about that "hopefully".  Hopefully not.
>>>>
>>>
>>> Something like this??
>>>
>>> This is just the raw fix to patch 4/17 without the icache lookup
>>> that is added by later patches.
>>>
>>> I added rename_lock seqlock around backwalk to connected ancestor
>>> and take_dentry_name_snapshot() for the stability of real name
>>> during overlay lookup.
>>>
>>> I considered also storing OVL_I(d_inode(connected))->version
>>> inside seqlock and comparing it to version in case lookup of child
>>> failed. This could help us distinguish between overlay rename and
>>> underlying rename (overlay dir version did not change) and return
>>> ESTALE instead of restarting lookup in the latter case.
>>> Wasn't sure if that was a good idea and what we lose if we leave it out.
>>>

[...]

>>> +       /*
>>> +        * Lookup overlay dentry by real name. The parent mutex protects us
>>> +        * from racing with overlay rename. If the overlay dentry that is
>>> +        * above real has already been moved to a different parent, then this
>>> +        * lookup will fail to find a child dentry whose real dentry is @real
>>> +        * and we will have to restart the lookup of real path from the top.
>>> +        *
>>> +        * We also need to take a snapshot of real dentry name to protect us
>>> +        * from racing with underlying layer rename. In this case, we are not
>>> +        * concerned with returning ESTALE, only with not referencing a freed
>>> +        * name pointer.
>>> +        *
>>> +        * TODO: try to lookup the renamed overlay dentry in inode cache by
>>> +        *       real inode.
>>> +        */
>>> +       inode_lock_nested(d_inode(parent), I_MUTEX_PARENT);
>>> +       take_dentry_name_snapshot(&name, real);
>>
>> No need to snapshot, just check if parent hasn't changed after
>> locking.  If parent is same, then name is guaranteed to be stable.
>>
>
> I don't understand.
> We are not holding a lock on real parent, only on overlay parent.
> What makes the real name stable?
> The snapshot is not to protect from racing with overlay rename.
> The snapshot is for protecting from race with real rename, just to
> make sure we don't dereference a stale name pointer.
>
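
To illustrate the point about the name snapshot, here is a minimal
userspace sketch, not kernel code. It only models the idea of copying a
name under a lock so that a concurrent rename cannot free it while the
caller still uses it. All model_* names are hypothetical, the spinlock
stands in for whatever protects the real name, and the kernel helper
take_dentry_name_snapshot() works differently internally.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a dentry whose name can be replaced by rename. */
struct model_dentry {
	atomic_flag lock;	/* tiny spinlock guarding 'name' */
	char *name;		/* heap string, freed and replaced on rename */
};

/* Caller-owned copy of a name; stays valid until it is released. */
struct model_name_snapshot {
	char *name;
};

void model_lock(struct model_dentry *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

void model_unlock(struct model_dentry *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Portable string copy; returns NULL on allocation failure. */
char *model_copy_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

int model_dentry_init(struct model_dentry *d, const char *name)
{
	atomic_flag_clear(&d->lock);
	d->name = model_copy_string(name);
	return d->name ? 0 : -1;
}

/* Rename: swap in the new name and free the old one under the lock. */
int model_rename(struct model_dentry *d, const char *new_name)
{
	char *copy = model_copy_string(new_name);

	if (!copy)
		return -1;
	model_lock(d);
	free(d->name);
	d->name = copy;
	model_unlock(d);
	return 0;
}

/* Copy the current name under the lock; a later rename cannot free the copy. */
int model_take_snapshot(struct model_name_snapshot *snap,
			struct model_dentry *d)
{
	model_lock(d);
	assert(d->name != NULL);
	snap->name = model_copy_string(d->name);
	model_unlock(d);
	return snap->name ? 0 : -1;
}

void model_release_snapshot(struct model_name_snapshot *snap)
{
	free(snap->name);
	snap->name = NULL;
}
```

In this model, a caller that took a snapshot before model_rename() still
sees the old name, which may no longer match anything on lookup, but the
pointer it holds is never freed underneath it.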

[...]

>>> +
>>> +               /*
>>> +                * Find the topmost dentry not yet connected. Taking rename_lock
>>> +                * so at least we don't race with rename when walking back to
>>> +                * 'real_connected'.
>>> +                */
>>> +               seq = read_seqbegin(&rename_lock);
>>
>> I don't see what we gain with this.
>>
>
> I can't say that I do see it, but perhaps there is something yet
> to be gained by adding this later for lower layers lookup.
> Perhaps when looking on lower real layer, we can store the
> overlay dir cache version of 'connected' (connected in this case
> may be an indexed merge dir).
> After we take 'connected' dir mutex, we cannot check that
> real parent hasn't changed as an indication of no overlay rename
> because overlay rename happens on upper, but we can compare
> the dir cache version of 'connected' dir to the version we stored
> under rename_lock.
> Then we can tell if lower lookup has failed because of some
> permanent error (e.g. middle layer redirect) or because of an
> indexed rename, so we need to restart.
> Maybe that gains us something?
>

OK. I finished reworking the series with these changes on top
of V3 of indexing patches and pushed the work so far to:
https://github.com/amir73il/linux/commits/ovl-nfs-export-wip

There is a new patch at the top:
ovl: retry connect of non-upper dirs on parent rename
It implements the dir cache version compare and does not use the
seqlock.
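
For readers following the thread, the version-compare idea can be
modelled roughly as below. This is a userspace sketch under my own
assumptions, not the code in the WIP branch: struct model_dir, the
model_* functions and the enum are hypothetical, and a single integer
stands in for the directory contents. The version is sampled before the
lookup; if the lookup then fails, an unchanged version is treated as a
permanent failure and a changed version as a reason to restart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a 'connected' overlay dir. */
struct model_dir {
	unsigned long version;	/* bumped on every modelled overlay rename */
	int child_real_id;	/* id of the real object found under this dir */
};

enum model_connect_result {
	MODEL_CONNECT_OK,	/* child with the wanted real object was found */
	MODEL_CONNECT_RETRY,	/* lookup failed, dir changed: restart the walk */
	MODEL_CONNECT_STALE	/* lookup failed, dir unchanged: permanent error */
};

/* Sample the dir version before walking back and taking the dir lock. */
unsigned long model_sample_version(const struct model_dir *dir)
{
	assert(dir != NULL);
	return dir->version;
}

/* Model an overlay rename touching this dir: new content, new version. */
void model_overlay_rename(struct model_dir *dir, int new_child_real_id)
{
	assert(dir != NULL);
	dir->child_real_id = new_child_real_id;
	dir->version++;
}

/*
 * One connect attempt; in the real code this would run with the dir lock
 * held. 'seen_version' is the value sampled before the attempt.
 */
enum model_connect_result model_connect_one(const struct model_dir *dir,
					    unsigned long seen_version,
					    int real_id)
{
	if (dir->child_real_id == real_id)
		return MODEL_CONNECT_OK;
	if (dir->version != seen_version)
		return MODEL_CONNECT_RETRY;
	return MODEL_CONNECT_STALE;
}
```

A caller would loop on MODEL_CONNECT_RETRY, re-sampling the version each
time, and map MODEL_CONNECT_STALE to an ESTALE-style error.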

The modified patch 4/17 mostly affects patches 12/17 and 14/17,
so you may want to continue the review on the modified version.

This WIP is tested with xfstests, including a test with a read-only
overlay that has no upperdir.

As far as I remember, the only remaining item from the V2 review
comments so far that is NOT addressed in the WIP branch is relaxing
the cases of copy up on encode.
As for the lookup in icache by fh, I prefer to defer it to a later time.

Let me know if I forgot anything or if you find anything else that
needs to be addressed.

Thanks,
Amir.
