Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Matthew Wilcox <matthew.wilcox@oracle.com>,
	Srinivas Eeda <SRINIVAS.EEDA@oracle.com>,
	"joe.jin\@oracle.com" <joe.jin@oracle.com>
Subject: [PATCH] proc: Avoid a thundering herd of threads freeing proc dentries
Date: Fri, 19 Jun 2020 09:09:41 -0500	[thread overview]
Message-ID: <87bllf87ve.fsf_-_@x220.int.ebiederm.org> (raw)
In-Reply-To: <2cf6af59-e86b-f6cc-06d3-84309425bd1d@oracle.com> (Junxiao Bi's message of "Thu, 18 Jun 2020 17:27:43 -0700")


Junxiao Bi <junxiao.bi@oracle.com> reported:
> When debugging some performance issue, i found that thousands of threads exit
> around same time could cause a severe spin lock contention on proc dentry
> "/proc/$parent_process_pid/task/", that's because threads needs to clean up
> their pid file from that dir when exit.

Matthew Wilcox <willy@infradead.org> reported:
> We've looked at a few different ways of fixing this problem.

The flushing of the proc dentries from the dcache is an optmization,
and is not necessary for correctness.  Eventually cache pressure will
cause the dentries to be freed even if no flushing happens.  Some
light testing when I refactored the proc flushg[1] indicated that at
least the memory footprint is easily measurable.

An optimization that causes a performance problem due to a thundering
herd of threads is no real optimization.

Modify the code to only flush the /proc/<tgid>/ directory when all
threads in a process are killed at once.  This continues to flush
practically everything when the process is reaped as the threads live
under /proc/<tgid>/task/<tid>.

There is a rare possibility that a debugger will access /proc/<tid>/,
which this change will no longer flush, but I believe such accesses
are sufficiently rare to not be observed in practice.

[1] 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
Link: https://lkml.kernel.org/r/54091fc0-ca46-2186-97a8-d1f3c4f3877b@oracle.com
Reported-by: Masahiro Yamada <masahiroy@kernel.org>
Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---

I am still waiting for word on how this affects performance, but this is
a clean version that should avoid the thundering herd problem in
general.


 kernel/exit.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index cebae77a9664..567354550d62 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -151,8 +151,8 @@ void put_task_struct_rcu_user(struct task_struct *task)
 
 void release_task(struct task_struct *p)
 {
+	struct pid *flush_pid = NULL;
 	struct task_struct *leader;
-	struct pid *thread_pid;
 	int zap_leader;
 repeat:
 	/* don't need to get the RCU readlock here - the process is dead and
@@ -165,7 +165,16 @@ void release_task(struct task_struct *p)
 
 	write_lock_irq(&tasklist_lock);
 	ptrace_release_task(p);
-	thread_pid = get_pid(p->thread_pid);
+
+	/*
+	 * When all of the threads are exiting wait until the end
+	 * and flush everything.
+	 */
+	if (thread_group_leader(p))
+		flush_pid = get_pid(task_tgid(p));
+	else if (!(p->signal->flags & SIGNAL_GROUP_EXIT))
+		flush_pid = get_pid(task_pid(p));
+
 	__exit_signal(p);
 
 	/*
@@ -188,8 +197,10 @@ void release_task(struct task_struct *p)
 	}
 
 	write_unlock_irq(&tasklist_lock);
-	proc_flush_pid(thread_pid);
-	put_pid(thread_pid);
+	if (flush_pid) {
+		proc_flush_pid(flush_pid);
+		put_pid(flush_pid);
+	}
 	release_thread(p);
 	put_task_struct_rcu_user(p);
 
-- 
2.20.1

  parent reply	other threads:[~2020-06-19 14:14 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-18 22:17 severe proc dentry lock contention Junxiao Bi
2020-06-18 23:39 ` Matthew Wilcox
2020-06-19  0:02   ` Eric W. Biederman
2020-06-19  0:27     ` Junxiao Bi
2020-06-19  3:30       ` Eric W. Biederman
2020-06-19 14:09       ` Eric W. Biederman [this message]
2020-06-19 15:56         ` [PATCH] proc: Avoid a thundering herd of threads freeing proc dentries Junxiao Bi
2020-06-19 17:24           ` Eric W. Biederman
2020-06-19 21:56             ` Junxiao Bi
2020-06-19 22:42               ` Eric W. Biederman
2020-06-20 16:27                 ` Matthew Wilcox
2020-06-22  5:15                   ` Junxiao Bi
2020-06-22 15:20                     ` Eric W. Biederman
2020-06-22 15:48                       ` willy
2020-08-17 12:19                         ` Eric W. Biederman
2020-06-22 17:16                       ` Junxiao Bi
2020-06-23  0:47                     ` Matthew Wilcox
2020-06-25 22:11                       ` Junxiao Bi
2020-06-22  5:33         ` Masahiro Yamada
2020-06-22 15:13           ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bllf87ve.fsf_-_@x220.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=SRINIVAS.EEDA@oracle.com \
    --cc=joe.jin@oracle.com \
    --cc=junxiao.bi@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthew.wilcox@oracle.com \
    --cc=willy@infradead.org \
    --subject='Re: [PATCH] proc: Avoid a thundering herd of threads freeing proc dentries' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).