LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Tejun Heo <tj@kernel.org>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] kernfs: switch global kernfs_rwsem lock to per-fs lock
Date: Tue, 30 Nov 2021 09:00:36 +0100	[thread overview]
Message-ID: <061b9b7a-7a2a-d59c-aadb-d5dfdcc8aad1@samsung.com> (raw)
In-Reply-To: <YaUGsn4opaiSO2H8@google.com>


On 29.11.2021 17:58, Minchan Kim wrote:
> On Fri, Nov 26, 2021 at 12:54:45PM +0100, Marek Szyprowski wrote:
>> Hi,
>>
>> On 19.11.2021 00:00, Minchan Kim wrote:
>>> The kernfs implementation has big lock granularity(kernfs_rwsem) so
>>> every kernfs-based(e.g., sysfs, cgroup) fs are able to compete the
>>> lock. It makes trouble for some cases to wait the global lock
>>> for a long time even though they are totally independent contexts
>>> each other.
>>>
>>> A general example is process A goes under direct reclaim with holding
>>> the lock when it accessed the file in sysfs and process B is waiting
>>> the lock with exclusive mode and then process C is waiting the lock
>>> until process B could finish the job after it gets the lock from
>>> process A.
>>>
>>> This patch switches the global kernfs_rwsem to per-fs lock, which
>>> put the rwsem into kernfs_root.
>>>
>>> Suggested-by: Tejun Heo <tj@kernel.org>
>>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>> This patch landed recently in linux-next (20211126) as commit
>> 393c3714081a ("kernfs: switch global kernfs_rwsem lock to per-fs lock").
>> In my tests I've found that it causes the following warning during the
>> system reboot:
>>
>>    =========================
>>    WARNING: held lock freed!
>>    5.16.0-rc2+ #10984 Not tainted
>>    -------------------------
>>    kworker/1:0/18 is freeing memory ffff00004034e200-ffff00004034e3ff,
>> with a lock still held there!
>>    ffff00004034e348 (&root->kernfs_rwsem){++++}-{3:3}, at:
>> __kernfs_remove+0x310/0x37c
>>    3 locks held by kworker/1:0/18:
>>     #0: ffff000040107938 ((wq_completion)cgroup_destroy){+.+.}-{0:0}, at:
>> process_one_work+0x1f0/0x6f0
>>     #1: ffff80000b55bdc0
>> ((work_completion)(&(&css->destroy_rwork)->work)){+.+.}-{0:0}, at:
>> process_one_work+0x1f0/0x6f0
>>     #2: ffff00004034e348 (&root->kernfs_rwsem){++++}-{3:3}, at:
>> __kernfs_remove+0x310/0x37c
>>
>>    stack backtrace:
>>    CPU: 1 PID: 18 Comm: kworker/1:0 Not tainted 5.16.0-rc2+ #10984
>>    Hardware name: Raspberry Pi 4 Model B (DT)
>>    Workqueue: cgroup_destroy css_free_rwork_fn
>>    Call trace:
>>     dump_backtrace+0x0/0x1ac
>>     show_stack+0x18/0x24
>>     dump_stack_lvl+0x8c/0xb8
>>     dump_stack+0x18/0x34
>>     debug_check_no_locks_freed+0x124/0x140
>>     kfree+0xf0/0x3a4
>>     kernfs_put+0x1f8/0x224
>>     __kernfs_remove+0x1b8/0x37c
>>     kernfs_destroy_root+0x38/0x50
>>     css_free_rwork_fn+0x288/0x3d4
>>     process_one_work+0x288/0x6f0
>>     worker_thread+0x74/0x470
>>     kthread+0x188/0x194
>>     ret_from_fork+0x10/0x20
>>
>> Let me know if you need more information or help in reproducing this issue.
> Hi Marek,
>
> Thanks for the report. Could you try this one?
>
> >From 68741aa598ffd4a407d6d5f6b58bc7e7281e6f81 Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@kernel.org>
> Date: Mon, 29 Nov 2021 08:40:15 -0800
> Subject: [PATCH] kernfs: prevent early freeing of root node
>
> Marek reported the warning below.
>
>    =========================
>    WARNING: held lock freed!
>    5.16.0-rc2+ #10984 Not tainted
>    -------------------------
>    kworker/1:0/18 is freeing memory ffff00004034e200-ffff00004034e3ff,
> with a lock still held there!
>    ffff00004034e348 (&root->kernfs_rwsem){++++}-{3:3}, at:
> __kernfs_remove+0x310/0x37c
>    3 locks held by kworker/1:0/18:
>     #0: ffff000040107938 ((wq_completion)cgroup_destroy){+.+.}-{0:0}, at:
> process_one_work+0x1f0/0x6f0
>     #1: ffff80000b55bdc0
> ((work_completion)(&(&css->destroy_rwork)->work)){+.+.}-{0:0}, at:
> process_one_work+0x1f0/0x6f0
>     #2: ffff00004034e348 (&root->kernfs_rwsem){++++}-{3:3}, at:
> __kernfs_remove+0x310/0x37c
>
>    stack backtrace:
>    CPU: 1 PID: 18 Comm: kworker/1:0 Not tainted 5.16.0-rc2+ #10984
>    Hardware name: Raspberry Pi 4 Model B (DT)
>    Workqueue: cgroup_destroy css_free_rwork_fn
>    Call trace:
>     dump_backtrace+0x0/0x1ac
>     show_stack+0x18/0x24
>     dump_stack_lvl+0x8c/0xb8
>     dump_stack+0x18/0x34
>     debug_check_no_locks_freed+0x124/0x140
>     kfree+0xf0/0x3a4
>     kernfs_put+0x1f8/0x224
>     __kernfs_remove+0x1b8/0x37c
>     kernfs_destroy_root+0x38/0x50
>     css_free_rwork_fn+0x288/0x3d4
>     process_one_work+0x288/0x6f0
>     worker_thread+0x74/0x470
>     kthread+0x188/0x194
>     ret_from_fork+0x10/0x20
>
> Since kernfs moves the kernfs_rwsem lock into root, it couldn't hold
> the lock when the root node is tearing down. Thus, get the refcount
> of root node.
>
> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>

Right, this fixes the issue.

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

> ---
>   fs/kernfs/dir.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> index 13cae0ccce74..e6d9772ddb4c 100644
> --- a/fs/kernfs/dir.c
> +++ b/fs/kernfs/dir.c
> @@ -961,7 +961,13 @@ struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
>    */
>   void kernfs_destroy_root(struct kernfs_root *root)
>   {
> -	kernfs_remove(root->kn);	/* will also free @root */
> +	/*
> +	 *  kernfs_remove holds kernfs_rwsem from the root so the root
> +	 *  shouldn't be freed during the operation.
> +	 */
> +	kernfs_get(root->kn);
> +	kernfs_remove(root->kn);
> +	kernfs_put(root->kn); /* will also free @root */
>   }
>   
>   /**

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


      reply	other threads:[~2021-11-30  8:00 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-18 23:00 Minchan Kim
2021-11-18 23:03 ` Tejun Heo
2021-11-22 17:39   ` Minchan Kim
2021-11-23  6:44     ` Greg Kroah-Hartman
     [not found] ` <CGME20211126115446eucas1p2e377cf7718c6da6bfe40a016bc0191a8@eucas1p2.samsung.com>
2021-11-26 11:54   ` Marek Szyprowski
2021-11-29 14:23     ` Nguyen Dinh Phi
2021-11-29 16:58     ` Minchan Kim
2021-11-30  8:00       ` Marek Szyprowski [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=061b9b7a-7a2a-d59c-aadb-d5dfdcc8aad1@samsung.com \
    --to=m.szyprowski@samsung.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=minchan@kernel.org \
    --cc=tj@kernel.org \
    --subject='Re: [PATCH] kernfs: switch global kernfs_rwsem lock to per-fs lock' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).