LKML Archive on lore.kernel.org
From: Christoph Lameter <cl@linux-foundation.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: peterz@infradead.org, rientjes@google.com, npiggin@suse.de,
	menage@google.com, dfults@sgi.com, linux-kernel@vger.kernel.org,
	containers@lists.osdl.org
Subject: Re: [patch 0/7] cpuset writeback throttling
Date: Wed, 5 Nov 2008 07:52:44 -0600 (CST)
Message-ID: <Pine.LNX.4.64.0811050747040.11867@quilx.com>
In-Reply-To: <20081104190505.769b93ec.akpm@linux-foundation.org>

On Tue, 4 Nov 2008, Andrew Morton wrote:

>> That is one aspect. When performing writeback then we need to figure out
>> which inodes have dirty pages in the memcg and we need to start writeout
>> on those inodes and not on others that have their dirty pages elsewhere.
>> There are two components of this that are in this patch and that would
>> also have to be implemented for a memcg.
>
> Doable.  lru->page->mapping->host is a good start.

The block layer has a list of inodes that are dirty. From that list we need
to select the ones whose writeout will improve the situation for the
cpuset/memcg. How does the LRU come into this?

>> This patch would solve the problem if the calculation of the dirty pages
>> would consider the active memcg and be able to determine the amount of
>> dirty pages (through some sort of additional memcg counters). That is just
>> the first part though. The second part of finding the inodes that have
>> dirty pages for writeback would require an association between memcgs and
>> inodes.
>
> We presently have that via the LRU.  It has holes, but so does this per-cpuset
> scheme.

How do I get to the LRU from the dirtied list of inodes?

> Generally, I worry that this is a specific fix to a specific problem
> encountered on specific machines with specific setups and specific
> workloads, and that it's just all too low-level and myopic.
>
> And now we're back in the usual position where there's existing code and
> everyone says it's terribly wonderful and everyone is reluctant to step
> back and look at the big picture.  Am I wrong?

Well, everyone just seems reluctant to do the work. Thus they fall back
to a solution that I provided when memcg groups were not yet available. It
would be best if someone could find a general scheme or generalize this
patchset.

> Plus: we need per-memcg dirty-memory throttling, and this is more
> important than per-cpuset, I suspect.  How will the (already rather
> buggy) code look once we've stuffed both of them in there?

The basics will still be the same:

1. One needs to establish the dirty ratio of memcgs and monitor them.
2. There needs to be a mechanism to perform writeout on the right inodes.
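
A minimal user-space sketch of these two steps, in C, may make them
concrete. It is only a model under stated assumptions: struct group_stats,
struct model_inode, group_over_dirty_limit() and select_inodes_for_group()
are hypothetical names, not kernel APIs and not code from this patchset.
The per-inode, per-group dirty counters are exactly the association
between groups and inodes that the discussion above says is still missing;
the sketch simply assumes it exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in the kernel. */
#define MAX_GROUPS 4

/* Step 1 state: per-group counters (a group is a cpuset or a memcg). */
struct group_stats {
	unsigned long total_pages;	/* pages that count toward the group */
	unsigned long dirty_pages;	/* how many of those are dirty */
	unsigned int dirty_ratio;	/* dirty limit, in percent */
};

/* Step 2 state: an inode that records, per group, how many dirty pages it holds. */
struct model_inode {
	unsigned long ino;
	unsigned long dirty_in_group[MAX_GROUPS];
};

/* Step 1: report whether the group has exceeded its dirty ratio. */
bool group_over_dirty_limit(const struct group_stats *g)
{
	assert(g->dirty_ratio <= 100);
	/* Compare cross-multiplied values so small groups do not round to zero. */
	return g->dirty_pages * 100 > g->total_pages * g->dirty_ratio;
}

/*
 * Step 2: walk the global list of dirty inodes and collect only those that
 * hold dirty pages for the given group. Returns the number of inodes stored
 * in out[], which is at most out_cap.
 */
size_t select_inodes_for_group(const struct model_inode *dirty_list, size_t n,
			       unsigned int group,
			       const struct model_inode **out, size_t out_cap)
{
	size_t picked = 0;

	assert(group < MAX_GROUPS);
	for (size_t i = 0; i < n && picked < out_cap; i++) {
		if (dirty_list[i].dirty_in_group[group] > 0)
			out[picked++] = &dirty_list[i];
	}
	return picked;
}
```

In this model a throttling path would first call group_over_dirty_limit()
and, only if it returns true, start writeout on the inodes returned by
select_inodes_for_group(), leaving inodes whose dirty pages belong to
other groups alone.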

> I agree that there's a problem here, although given the amount of time
> that it's been there, I suspect that it is a very small problem.

It used to be a problem only for NUMA systems. Now it is also a problem for
memcgs.

> Someone please convince me that in three years time we will agree that
> merging this fix to that problem was a correct decision?

At the minimum: it provides a basis on top of which memcg support
can be developed. Major modifications to the VM statistics are likely
needed to get there for memcg groups.




