LKML Archive on
help / color / mirror / Atom feed
From: Greg Thelen <>
Cc: Johannes Weiner <>,
	Andrew Morton <>,
	Michal Hocko <>,
	Vladimir Davydov <>,
	Tejun Heo <>, Cgroups <>,, Linux MM <>,
	LKML <>
Subject: Re: [RFC PATCH 0/2] memory.low,min reclaim
Date: Tue, 24 Apr 2018 00:56:09 +0000	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

On Mon, Apr 23, 2018 at 3:38 AM Roman Gushchin <> wrote:

> Hi, Greg!

> On Sun, Apr 22, 2018 at 01:26:10PM -0700, Greg Thelen wrote:
> > Roman's previously posted memory.low,min patches add per memcg effective
> > low limit to detect overcommitment of parental limits.  But if we flip
> > low,min reclaim to bail if usage<{low,min} at any level, then we don't
> > an effective low limit, which makes the code simpler.  When parent
> > are overcommited memory.min will oom kill, which is more drastic but
> > the memory.low a simpler concept.  If memcg a/b wants oom kill before
> > reclaim, then give it to them.  It seems a bit strange for
> > behaviour to depend on a/c/memory.low (i.e. a/b.low is strong unless
> > a/b.low+a/c.low exceed a.low).

> It's actually not strange: a/b and a/c are sharing a common resource:
> a/memory.low.

> Exactly as a/b/memory.max and a/c/memory.max are sharing a/memory.max.
> If there are sibling cgroups which are consuming memory, a cgroup can't
> exceed parent's memory.max, even if its memory.max is grater.

> >
> > I think there might be a simpler way (ableit it doesn't yet include
> > Documentation):
> > - memcg: fix memory.low
> > - memcg: add memory.min
> >  3 files changed, 75 insertions(+), 6 deletions(-)
> >
> > The idea of this alternate approach is for memory.low,min to avoid
> > if any portion of under-consideration memcg ancestry is under respective
> > limit.

> This approach has a significant downside: it breaks hierarchical
> for memory.low/min. There are two important outcomes:

> 1) Any leaf's memory.low/min value is respected, even if parent's value
>           is lower or even 0. It's not possible anymore to limit the amount
>           protected memory for a sub-tree.
>           This is especially bad in case of delegation.

As someone who has been using something like memory.min for a while, I have
cases where it needs to be a strong protection.  Such jobs prefer oom kill
to reclaim.  These jobs know they need X MB of memory.  But I guess it's on
me to avoid configuring machines which overcommit memory.min at such cgroup
levels all the way to the root.

> 2) If a cgroup has an ancestor with the usage under its memory.low/min,
>           it becomes protection, even if its memory.low/min is 0. So it
>           impossible to have unprotected cgroups in protected sub-tree.

Fair point.

One use case is where a non trivial job which has several memory accounting
subcontainers.  Is there a way to only set memory.low at the top and have
the offer protection to the job?
The case I'm thinking of is:
% cd /cgroup
% echo +memory > cgroup.subtree_control
% mkdir top
% echo +memory > top/cgroup.subtree_control
% mkdir top/part1 top/part2
% echo 1GB > top/memory.min
% (echo $BASHPID > top/part1/cgroup.procs && part1)
% (echo $BASHPID > top/part2/cgroup.procs && part2)

Empirically it's been measured that the entire workload (/top) needs 1GB to
perform well.  But we don't care how the memory is distributed between
part1,part2.  Is the strategy for that to set /top, /top/part1.min, and
/top/part2.min to 1GB?

What do you think about exposing emin and elow to user space?  I think that
would reduce admin/user confusion in situations where memory.min is
internally discounted.

(tangent) Delegation in v2 isn't something I've been able to fully
internalize yet.
The "no interior processes" rule challenges my notion of subdelegation.
My current model is where a system controller creates a container C with
C.min and then starts client manager process M in C.  Then M can choose
to further divide C's resources (e.g. C/S).  This doesn't seem possible
because v2 doesn't allow for interior processes.  So the system manager
would need to create C, set C.low, create C/sub_manager, create
C/sub_resources, set C/sub_manager.low, set C/sub_resources.low, then start
M in C/sub_manager.  Then sub_manager can create and manage

PS: Thanks for the memory.low and memory.min work.  Regardless of how we
proceed it's better than the upstream memory.soft_limit_in_bytes!

  reply	other threads:[~2018-04-24  0:56 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-20 22:33 [RFC] mm: memory.low heirarchical behavior Roman Gushchin
2018-03-21 18:23 ` Johannes Weiner
2018-03-21 19:08   ` Roman Gushchin
2018-04-04 17:07     ` Johannes Weiner
2018-04-05 13:54       ` Roman Gushchin
2018-04-05 15:00         ` Johannes Weiner
2018-03-23 16:37   ` [PATCH v2] mm: memory.low hierarchical behavior Roman Gushchin
2018-04-22 20:26 ` [RFC PATCH 0/2] memory.low,min reclaim Greg Thelen
2018-04-22 20:26   ` [RFC PATCH 1/2] memcg: fix memory.low Greg Thelen
2018-04-22 20:26   ` [RFC PATCH 2/2] memcg: add memory.min Greg Thelen
2018-04-23 10:38   ` [RFC PATCH 0/2] memory.low,min reclaim Roman Gushchin
2018-04-24  0:56     ` Greg Thelen [this message]
2018-04-24 10:09       ` Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \ \ \ \ \
    --subject='Re: [RFC PATCH 0/2] memory.low,min reclaim' \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).