LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] cgroup/cpuset: Avoid memory migration when nodemasks match
@ 2021-08-25 10:54 Nicolas Saenz Julienne
  2021-08-25 16:53 ` Tejun Heo
  2021-08-25 19:18 ` Waiman Long
  0 siblings, 2 replies; 3+ messages in thread
From: Nicolas Saenz Julienne @ 2021-08-25 10:54 UTC (permalink / raw)
  To: cgroups, linux-kernel
  Cc: tj, lizefan.x, hannes, mtosatti, nilal, frederic, longman,
	Nicolas Saenz Julienne

With the introduction of ee9707e8593d ("cgroup/cpuset: Enable memory
migration for cpuset v2") attaching a process to a different cgroup will
trigger a memory migration regardless of whether it's really needed.
Memory migration is an expensive operation, so bypass it if the
nodemasks passed to cpuset_migrate_mm() are equal.

Note that we're not only avoiding the migration work itself, but also a
call to lru_cache_disable(), which triggers and flushes an LRU drain
work on every online CPU.

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>

---

NOTE: This also alleviates hangs I stumbled upon while testing
linux-next on systems with nohz_full CPUs (running latency sensitive
loads). ee9707e8593d's newly imposed memory migration never finishes, as
the LRU drain is never scheduled on isolated CPUs.

I tried to follow the user-space call trace, it's something like this:

  Create new tmux pane, which triggers hostname operation, hangs...
    -> systemd (pid 1) creates new hostnamed process (using clone())
      -> hostnamed process attaches itself to:
  	 "system.slice/systemd-hostnamed.service/cgroup.procs"
        -> hangs... Waiting for LRU drain to finish on nohz_full CPUs.

As far as CPU isolation is concerned, this calls for better
understanding of the underlying issues. For example, should LRU be made
CPU isolation aware or should we deal with it at cgroup/cpuset level? In
the meantime, I figured this small optimization is worthwhile on its
own.

 kernel/cgroup/cpuset.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 44d234b0df5e..d497a65c4f04 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1634,6 +1634,11 @@ static void cpuset_migrate_mm(struct mm_struct *mm, const nodemask_t *from,
 {
 	struct cpuset_migrate_mm_work *mwork;
 
+	if (nodes_equal(*from, *to)) {
+		mmput(mm);
+		return;
+	}
+
 	mwork = kzalloc(sizeof(*mwork), GFP_KERNEL);
 	if (mwork) {
 		mwork->mm = mm;
-- 
2.31.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] cgroup/cpuset: Avoid memory migration when nodemasks match
  2021-08-25 10:54 [PATCH] cgroup/cpuset: Avoid memory migration when nodemasks match Nicolas Saenz Julienne
@ 2021-08-25 16:53 ` Tejun Heo
  2021-08-25 19:18 ` Waiman Long
  1 sibling, 0 replies; 3+ messages in thread
From: Tejun Heo @ 2021-08-25 16:53 UTC (permalink / raw)
  To: Nicolas Saenz Julienne
  Cc: cgroups, linux-kernel, lizefan.x, hannes, mtosatti, nilal,
	frederic, longman

On Wed, Aug 25, 2021 at 12:54:15PM +0200, Nicolas Saenz Julienne wrote:
> With the introduction of ee9707e8593d ("cgroup/cpuset: Enable memory
> migration for cpuset v2") attaching a process to a different cgroup will
> trigger a memory migration regardless of whether it's really needed.
> Memory migration is an expensive operation, so bypass it if the
> nodemasks passed to cpuset_migrate_mm() are equal.
> 
> Note that we're not only avoiding the migration work itself, but also a
> call to lru_cache_disable(), which triggers and flushes an LRU drain
> work on every online CPU.
> 
> Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>

Applied to cgroup/for-5.15.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] cgroup/cpuset: Avoid memory migration when nodemasks match
  2021-08-25 10:54 [PATCH] cgroup/cpuset: Avoid memory migration when nodemasks match Nicolas Saenz Julienne
  2021-08-25 16:53 ` Tejun Heo
@ 2021-08-25 19:18 ` Waiman Long
  1 sibling, 0 replies; 3+ messages in thread
From: Waiman Long @ 2021-08-25 19:18 UTC (permalink / raw)
  To: Nicolas Saenz Julienne, cgroups, linux-kernel
  Cc: tj, lizefan.x, hannes, mtosatti, nilal, frederic

On 8/25/21 6:54 AM, Nicolas Saenz Julienne wrote:
> With the introduction of ee9707e8593d ("cgroup/cpuset: Enable memory
> migration for cpuset v2") attaching a process to a different cgroup will
> trigger a memory migration regardless of whether it's really needed.
> Memory migration is an expensive operation, so bypass it if the
> nodemasks passed to cpuset_migrate_mm() are equal.
>
> Note that we're not only avoiding the migration work itself, but also a
> call to lru_cache_disable(), which triggers and flushes an LRU drain
> work on every online CPU.
>
> Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
>
> ---
>
> NOTE: This also alleviates hangs I stumbled upon while testing
> linux-next on systems with nohz_full CPUs (running latency sensitive
> loads). ee9707e8593d's newly imposed memory migration never finishes, as
> the LRU drain is never scheduled on isolated CPUs.
>
> I tried to follow the user-space call trace, it's something like this:
>
>    Create new tmux pane, which triggers hostname operation, hangs...
>      -> systemd (pid 1) creates new hostnamed process (using clone())
>        -> hostnamed process attaches itself to:
>    	 "system.slice/systemd-hostnamed.service/cgroup.procs"
>          -> hangs... Waiting for LRU drain to finish on nohz_full CPUs.
>
> As far as CPU isolation is concerned, this calls for better
> understanding of the underlying issues. For example, should LRU be made
> CPU isolation aware or should we deal with it at cgroup/cpuset level? In
> the meantime, I figured this small optimization is worthwhile on its
> own.
>
>   kernel/cgroup/cpuset.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 44d234b0df5e..d497a65c4f04 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1634,6 +1634,11 @@ static void cpuset_migrate_mm(struct mm_struct *mm, const nodemask_t *from,
>   {
>   	struct cpuset_migrate_mm_work *mwork;
>   
> +	if (nodes_equal(*from, *to)) {
> +		mmput(mm);
> +		return;
> +	}
> +
>   	mwork = kzalloc(sizeof(*mwork), GFP_KERNEL);
>   	if (mwork) {
>   		mwork->mm = mm;

Thanks for the fix. So cpuset v1 with memory_migrate flag set will have 
the same problem then.

Acked-by: Waiman Long <longman@redhat.com>

Cheers,
Longman


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-08-25 19:19 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-25 10:54 [PATCH] cgroup/cpuset: Avoid memory migration when nodemasks match Nicolas Saenz Julienne
2021-08-25 16:53 ` Tejun Heo
2021-08-25 19:18 ` Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).