LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Re: [PATCH] badness() dramatically overcounts memory
       [not found] <1202182480.24634.22.camel@dogma.ljc.laika.com>
@ 2008-02-05  4:13 ` Balbir Singh
  2008-02-05 23:02   ` Jeff Davis
  0 siblings, 1 reply; 6+ messages in thread
From: Balbir Singh @ 2008-02-05  4:13 UTC (permalink / raw)
  To: Jeff Davis; +Cc: linux-kernel, linux-mm

Jeff Davis wrote:
> In oom_kill.c, one of the badness calculations is wildly inaccurate. If
> memory is shared among child processes, that same memory will be counted
> for each child, effectively multiplying the memory penalty by N, where N
> is the number of children.
> 
> This makes it almost certain that the parent will always be chosen as
> the victim of the OOM killer (assuming any substantial amount memory
> shared among the children), even if the parent and children are well
> behaved and have a reasonable and unchanging VM size.
> 
> Usually this does not actually alleviate the memory pressure because the
> truly bad process is completely unrelated; and the OOM killer must later
> kill the truly bad process.
> 
> This trivial patch corrects the calculation so that it does not count a
> child's shared memory against the parent.
> 

Hi, Jeff,

1. grep on the kernel source tells me that shared_vm is incremented only in
   vm_stat_account(), which is a NO-OP if CONFIG_PROC_FS is not defined.
2. How have you tested these patches? One way to do it would be to use the
   memory controller and set a small limit on the control group. A memory
   intensive application will soon see an OOM.

I do need to look at OOM kill sanity, my colleagues using the memory controller
have reported wrong actions taken by the OOM killer, but I am yet to analyze them.

The interesting thing is the use of total_vm and not the RSS which is used as
the basis by the OOM killer. I need to read/understand the code a bit more.

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] badness() dramatically overcounts memory
  2008-02-05  4:13 ` [PATCH] badness() dramatically overcounts memory Balbir Singh
@ 2008-02-05 23:02   ` Jeff Davis
  2008-02-05 23:09     ` David Rientjes
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff Davis @ 2008-02-05 23:02 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm

On Tue, 2008-02-05 at 09:43 +0530, Balbir Singh wrote:
> 1. grep on the kernel source tells me that shared_vm is incremented only in
>    vm_stat_account(), which is a NO-OP if CONFIG_PROC_FS is not defined.

I see, thanks for pointing that out. Is there another way do you think?
Would the penalty be to high to enable vm_stat_account when
CONFIG_PROC_FS is not defined?

Or perhaps my patch would only have an effect when CONFIG_PROC_FS is set
(which is default)?

> 2. How have you tested these patches? One way to do it would be to use the
>    memory controller and set a small limit on the control group. A memory
>    intensive application will soon see an OOM.

I have done a quick test a while back when I first wrote the patch. I
will test more thoroughly now.

> The interesting thing is the use of total_vm and not the RSS which is used as
> the basis by the OOM killer. I need to read/understand the code a bit more.

RSS makes more sense to me as well.

To me, it makes no sense to count shared memory, because killing a
process doesn't free the shared memory.

Regards,
	Jeff Davis


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] badness() dramatically overcounts memory
  2008-02-05 23:02   ` Jeff Davis
@ 2008-02-05 23:09     ` David Rientjes
  2008-02-06  1:54       ` KOSAKI Motohiro
  0 siblings, 1 reply; 6+ messages in thread
From: David Rientjes @ 2008-02-05 23:09 UTC (permalink / raw)
  To: Jeff Davis; +Cc: balbir, linux-kernel, linux-mm, Andrea Arcangeli

On Tue, 5 Feb 2008, Jeff Davis wrote:

> > The interesting thing is the use of total_vm and not the RSS which is used as
> > the basis by the OOM killer. I need to read/understand the code a bit more.
> 
> RSS makes more sense to me as well.
> 
> To me, it makes no sense to count shared memory, because killing a
> process doesn't free the shared memory.
> 

Andrea Arcangeli has patches pending which change this to the RSS.  
Specifically:

	http://marc.info/?l=linux-mm&m=119977937126925

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] badness() dramatically overcounts memory
  2008-02-05 23:09     ` David Rientjes
@ 2008-02-06  1:54       ` KOSAKI Motohiro
  2008-02-06  2:05         ` David Rientjes
  2008-02-06  4:00         ` Balbir Singh
  0 siblings, 2 replies; 6+ messages in thread
From: KOSAKI Motohiro @ 2008-02-06  1:54 UTC (permalink / raw)
  To: David Rientjes
  Cc: kosaki.motohiro, Jeff Davis, balbir, linux-kernel, linux-mm,
	Andrea Arcangeli

Hi

> > > The interesting thing is the use of total_vm and not the RSS which is used as
> > > the basis by the OOM killer. I need to read/understand the code a bit more.
> > 
> > RSS makes more sense to me as well.
> 
> Andrea Arcangeli has patches pending which change this to the RSS.  
> Specifically:
> 
> 	http://marc.info/?l=linux-mm&m=119977937126925

I agreed with you that RSS is better :)



but..
on many node numa, per zone rss is more better..


- kosaki



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] badness() dramatically overcounts memory
  2008-02-06  1:54       ` KOSAKI Motohiro
@ 2008-02-06  2:05         ` David Rientjes
  2008-02-06  4:00         ` Balbir Singh
  1 sibling, 0 replies; 6+ messages in thread
From: David Rientjes @ 2008-02-06  2:05 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Jeff Davis, balbir, linux-kernel, linux-mm, Andrea Arcangeli

On Wed, 6 Feb 2008, KOSAKI Motohiro wrote:

> > Andrea Arcangeli has patches pending which change this to the RSS.  
> > Specifically:
> > 
> > 	http://marc.info/?l=linux-mm&m=119977937126925
> 
> I agreed with you that RSS is better :)
> 
> 
> 
> but..
> on many node numa, per zone rss is more better..
> 

It depends on how your applications are taking advantage of NUMA 
optimizations.  If they're constrained by mempolicies to a subset of nodes 
then the badness scoring isn't even used: the task that triggered the OOM 
condition is the one that is automatically killed.

At this point, I think you're going to need to present an actual case 
study where Andrea's patch isn't sufficient for selecting the appropriate 
task on large NUMA machines.

		David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] badness() dramatically overcounts memory
  2008-02-06  1:54       ` KOSAKI Motohiro
  2008-02-06  2:05         ` David Rientjes
@ 2008-02-06  4:00         ` Balbir Singh
  1 sibling, 0 replies; 6+ messages in thread
From: Balbir Singh @ 2008-02-06  4:00 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: David Rientjes, Jeff Davis, linux-kernel, linux-mm, Andrea Arcangeli

KOSAKI Motohiro wrote:
> Hi
> 
>>>> The interesting thing is the use of total_vm and not the RSS which is used as
>>>> the basis by the OOM killer. I need to read/understand the code a bit more.
>>> RSS makes more sense to me as well.
>> Andrea Arcangeli has patches pending which change this to the RSS.  
>> Specifically:
>>
>> 	http://marc.info/?l=linux-mm&m=119977937126925
> 
> I agreed with you that RSS is better :)
> 
> 
> 
> but..
> on many node numa, per zone rss is more better..

Do we have a per zone RSS per task? I don't remember seeing it.


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-02-06  4:02 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1202182480.24634.22.camel@dogma.ljc.laika.com>
2008-02-05  4:13 ` [PATCH] badness() dramatically overcounts memory Balbir Singh
2008-02-05 23:02   ` Jeff Davis
2008-02-05 23:09     ` David Rientjes
2008-02-06  1:54       ` KOSAKI Motohiro
2008-02-06  2:05         ` David Rientjes
2008-02-06  4:00         ` Balbir Singh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).