LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Hou Tao <houtao1@huawei.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Tejun Heo <tj@kernel.org>, Ian Kent <raven@themaw.net>,
Miklos Szeredi <mszeredi@redhat.com>
Cc: <viro@ZenIV.linux.org.uk>, <linux-fsdevel@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <houtao1@huawei.com>
Subject: [PATCH] kernfs: fix the race in the creation of negative dentry
Date: Sat, 11 Sep 2021 10:13:42 +0800 [thread overview]
Message-ID: <20210911021342.3280687-1-houtao1@huawei.com> (raw)
When doing stress test for module insertion and removal,
the following phenomenon was found:
$ lsmod
Module Size Used by
libkmod: kmod_module_get_holders: could not open \
'/sys/module/nbd/holders': No such file or directory
nbd -2 -2
$ cat /proc/modules
nbd 110592 0 - Live 0xffffffffc0298000
$ ls -1 /sys/module |grep nbd
ls: cannot access 'nbd': No such file or directory
nbd
It seems the kernfs node of module has been activated and is returned to
ls command through kernfs_fop_readdir(), but the sysfs dentry is negative.
Further investigation found that there is race between kernfs dir creation
and dentry lookup as shown below:
CPU 0 CPU 1
kernfs_add_one
down_write(&kernfs_rwsem)
// insert nbd into rbtree
// update the parent's revision
kernfs_link_sibling()
up_write(&kernfs_rwsem)
kernfs_iop_lookup
down_read(&kernfs_rwsem)
// find nbd in rbtree, but it is deactivated
kn = kernfs_find_ns()
// return false
kernfs_active()
// a negative is created
d_splice_alias(NULL, dentry)
up_read(&kernfs_rwsem)
// activate after negative dentry is created
kernfs_activate()
// return 0 because parent's
// revision is stable now
kernfs_dop_revalidate()
The race will create a negative dentry for a kernfs node which
is newly-added and activated. To fix it, there are two cases
to be handled:
(1) kernfs root without KERNFS_ROOT_CREATE_DEACTIVATED
kernfs_rwsem can be always hold during kernfs_link_sibling()
and kernfs_activate() in kernfs_add_one(), so kernfs_iop_lookup()
will find an active kernfs node.
(2) kernfs root with KERNFS_ROOT_CREATE_DEACTIVATED
kernfs_activate() is called separatedly, and we can invalidate
the dentry subtree with kn as root by increasing the revision of
its parent. But we can invalidate in a finer granularity by
only invalidating the negative dentry of the newly-activated
kn node.
So factor out a helper kernfs_activate_locked() to activate
kernfs subtree lockless and invalidate the negative dentries
if requested. Creation under kernfs root with CREATED_DEACTIVATED
doesn't need invalidation because kernfs_rwsem is always hold,
and kernfs root w/o CREATED_DEACTIVATED needs to invalidate
the maybe-created negative dentries.
kernfs_inc_rev() in kernfs_link_sibling() is kept because
kernfs_rename_ns() needs it to invalidate the negative dentry
of the target kernfs which is newly created by rename.
Fixes: c7e7c04274b1 ("kernfs: use VFS negative dentry caching")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
fs/kernfs/dir.c | 52 +++++++++++++++++++++++++++++++++++++++----------
1 file changed, 42 insertions(+), 10 deletions(-)
diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index ba581429bf7b..2f1ab8bad575 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -17,6 +17,8 @@
#include "kernfs-internal.h"
+static void kernfs_activate_locked(struct kernfs_node *kn, bool invalidate);
+
DECLARE_RWSEM(kernfs_rwsem);
static DEFINE_SPINLOCK(kernfs_rename_lock); /* kn->parent and ->name */
static char kernfs_pr_cont_buf[PATH_MAX]; /* protected by rename_lock */
@@ -753,8 +755,6 @@ int kernfs_add_one(struct kernfs_node *kn)
ps_iattr->ia_mtime = ps_iattr->ia_ctime;
}
- up_write(&kernfs_rwsem);
-
/*
* Activate the new node unless CREATE_DEACTIVATED is requested.
* If not activated here, the kernfs user is responsible for
@@ -763,8 +763,7 @@ int kernfs_add_one(struct kernfs_node *kn)
* trigger deactivation.
*/
if (!(kernfs_root(kn)->flags & KERNFS_ROOT_CREATE_DEACTIVATED))
- kernfs_activate(kn);
- return 0;
+ kernfs_activate_locked(kn, false);
out_unlock:
up_write(&kernfs_rwsem);
@@ -942,8 +941,11 @@ struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
root->kn = kn;
init_waitqueue_head(&root->deactivate_waitq);
- if (!(root->flags & KERNFS_ROOT_CREATE_DEACTIVATED))
- kernfs_activate(kn);
+ if (!(root->flags & KERNFS_ROOT_CREATE_DEACTIVATED)) {
+ down_write(&kernfs_rwsem);
+ kernfs_activate_locked(kn, false);
+ up_write(&kernfs_rwsem);
+ }
return root;
}
@@ -1262,8 +1264,11 @@ static struct kernfs_node *kernfs_next_descendant_post(struct kernfs_node *pos,
}
/**
- * kernfs_activate - activate a node which started deactivated
+ * kernfs_activate_locked - activate a node which started deactivated
* @kn: kernfs_node whose subtree is to be activated
+ * @invalidate: whether or not to increase the revision of parent node
+ * for each newly-activated child node. The increase will
+ * invalidate negative dentries created under the parent node.
*
* If the root has KERNFS_ROOT_CREATE_DEACTIVATED set, a newly created node
* needs to be explicitly activated. A node which hasn't been activated
@@ -1271,15 +1276,15 @@ static struct kernfs_node *kernfs_next_descendant_post(struct kernfs_node *pos,
* removal. This is useful to construct atomic init sequences where
* creation of multiple nodes should either succeed or fail atomically.
*
+ * The caller must have acquired kernfs_rwsem.
+ *
* The caller is responsible for ensuring that this function is not called
* after kernfs_remove*() is invoked on @kn.
*/
-void kernfs_activate(struct kernfs_node *kn)
+static void kernfs_activate_locked(struct kernfs_node *kn, bool invalidate)
{
struct kernfs_node *pos;
- down_write(&kernfs_rwsem);
-
pos = NULL;
while ((pos = kernfs_next_descendant_post(pos, kn))) {
if (pos->flags & KERNFS_ACTIVATED)
@@ -1290,8 +1295,35 @@ void kernfs_activate(struct kernfs_node *kn)
atomic_sub(KN_DEACTIVATED_BIAS, &pos->active);
pos->flags |= KERNFS_ACTIVATED;
+
+ /*
+ * Invalidate the negative dentry created after pos is
+ * inserted into sibling rbtree but before it is
+ * activated.
+ */
+ if (invalidate && pos->parent)
+ kernfs_inc_rev(pos->parent);
}
+}
+/**
+ * kernfs_activate - activate a node which started deactivated
+ * @kn: kernfs_node whose subtree is to be activated
+ *
+ * Currently it is only used by kernfs root which has
+ * FS_ROOT_CREATE_DEACTIVATED set. Because the addition and the activation
+ * of children nodes are not atomic (not always hold kernfs_rwsem),
+ * negative dentry may be created for one child node after its addition
+ * but before its activation, so passing invalidate as true to
+ * @kernfs_activate_locked() to invalidate these negative dentries.
+ *
+ * The caller is responsible for ensuring that this function is not called
+ * after kernfs_remove*() is invoked on @kn.
+ */
+void kernfs_activate(struct kernfs_node *kn)
+{
+ down_write(&kernfs_rwsem);
+ kernfs_activate_locked(kn, true);
up_write(&kernfs_rwsem);
}
--
2.29.2
next reply other threads:[~2021-09-11 1:59 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-11 2:13 Hou Tao [this message]
2021-09-14 3:05 ` Ian Kent
2021-09-15 1:35 ` Ian Kent
2021-09-15 2:09 ` Ian Kent
2021-09-23 1:52 ` Hou Tao
2021-09-23 2:50 ` Ian Kent
2021-09-23 4:34 ` Hou Tao
2021-09-27 1:51 ` Hou Tao
2021-09-27 5:31 ` Ian Kent
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210911021342.3280687-1-houtao1@huawei.com \
--to=houtao1@huawei.com \
--cc=gregkh@linuxfoundation.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mszeredi@redhat.com \
--cc=raven@themaw.net \
--cc=tj@kernel.org \
--cc=viro@ZenIV.linux.org.uk \
--subject='Re: [PATCH] kernfs: fix the race in the creation of negative dentry' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).