From: Michal Hocko <>
To: Robert Kolchmeyer <>
Cc: David Rientjes <>,
	Andrew Morton <>,
	Vlastimil Babka <>,
	Ami Fischman <>
Subject: Re: [patch] mm, oom: make a last minute check to prevent unnecessary memcg oom kills
Date: Wed, 18 Mar 2020 10:55:14 +0100
Message-ID: <>
In-Reply-To: <>

On Tue 17-03-20 11:25:52, Robert Kolchmeyer wrote:
> On Tue, Mar 10, 2020 at 3:54 PM David Rientjes <> wrote:
> >
> > Robert, could you elaborate on the user-visible effects of this issue that
> > caused it to initially get reported?
> >
> Ami (now cc'ed) knows more, but here is my understanding. The use case
> involves a Docker container running multiple processes. The container
> has a memory limit set. The container contains two long-lived,
> important processes p1 and p2, and some arbitrary, dynamic number of
> usually ephemeral processes p3,...,pn. These processes are structured
> in a hierarchy that looks like p1->p2->[p3,...,pn]; p1 is a parent of
> p2, and p2 is the parent for all of the ephemeral processes p3,...,pn.
> Since p1 and p2 are long-lived and important, the user does not want
> p1 and p2 to be oom-killed. However, p3,...,pn are expected to use a
> lot of memory, and it's ok for those processes to be oom-killed.
> If the user sets oom_score_adj on p1 and p2 to make them very unlikely
> to be oom-killed, p3,...,pn will inherit the oom_score_adj value,
> which is bad. Additionally, setting oom_score_adj on p3,...,pn is
> tricky, since processes in the Docker container (specifically p1 and
> p2) don't have permissions to set oom_score_adj on p3,...,pn. The
> ephemeral nature of p3,...,pn also makes setting oom_score_adj on them
> tricky after they launch.
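
The inheritance problem described above can be illustrated with a toy
model. This is only a sketch: the Proc class, its fork() method and
build_container() are hypothetical stand-ins, not a kernel or Docker
API, and the value -998 is an arbitrary example. The only behaviour
modelled is that a child starts out with its parent's oom_score_adj.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Proc:
    """Hypothetical stand-in for a task; not a kernel or Docker API."""
    name: str
    oom_score_adj: int = 0  # valid range is -1000..1000
    children: List["Proc"] = field(default_factory=list)

    def fork(self, name: str) -> "Proc":
        # Modelled behaviour: the child starts with the parent's value.
        child = Proc(name, self.oom_score_adj)
        self.children.append(child)
        return child


def build_container(n_ephemeral: int, protect: int = -998) -> Dict[str, Proc]:
    """Build p1 -> p2 -> [p3..pn], with p1 protected before it forks p2."""
    p1 = Proc("p1", protect)
    p2 = p1.fork("p2")
    procs = {"p1": p1, "p2": p2}
    for i in range(3, 3 + n_ephemeral):
        procs[f"p{i}"] = p2.fork(f"p{i}")
    return procs


if __name__ == "__main__":
    # Every process, including the ephemeral ones, ends up protected.
    for name, proc in build_container(3).items():
        print(name, proc.oom_score_adj)
```

In this model the ephemeral children end up just as protected as p1
and p2, which is the unwanted outcome the report describes.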

Thanks for the clarification.

> So, the user hopes that when one of p3,...,pn triggers an oom
> condition in the Docker container, the oom killer will almost always
> kill processes from p3,...,pn (and not kill p1 or p2, which are both
> important and unlikely to trigger an oom condition). The issue of more
> processes being killed than are strictly necessary is resulting in p1
> or p2 being killed much more frequently when one of p3,...,pn triggers
> an oom condition, and p1 or p2 being killed is very disruptive for the
> user (my understanding is that p1 or p2 going down with high frequency
> results in significant unhealthiness in the user's service).

Do you have any logs showing this condition? I am interested because,
from your description, p1/p2 shouldn't usually be the ones that trigger
the oom, right? That suggests it should mostly be p3, ..., pn that are
in the kernel triggering the oom, and therefore they shouldn't vanish.
Michal Hocko

