LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Andrew Morton <akpm@osdl.org>
To: Christoph Lameter <clameter@sgi.com>
Cc: menage@google.com, linux-kernel@vger.kernel.org,
nickpiggin@yahoo.com.au, linux-mm@kvack.org, ak@suse.de,
pj@sgi.com, dgc@sgi.com
Subject: Re: [RFC 0/8] Cpuset aware writeback
Date: Tue, 16 Jan 2007 23:00:34 -0800 [thread overview]
Message-ID: <20070116230034.b8cb4263.akpm@osdl.org> (raw)
In-Reply-To: <Pine.LNX.4.64.0701162219180.5215@schroedinger.engr.sgi.com>
> On Tue, 16 Jan 2007 22:27:36 -0800 (PST) Christoph Lameter <clameter@sgi.com> wrote:
> On Tue, 16 Jan 2007, Andrew Morton wrote:
>
> > > Yes this is the result of the hierachical nature of cpusets which already
> > > causes issues with the scheduler. It is rather typical that cpusets are
> > > used to partition the memory and cpus. Overlappig cpusets seem to have
> > > mainly an administrative function. Paul?
> >
> > The typical usage scenarios don't matter a lot: the examples I gave show
> > that the core problem remains unsolved. People can still hit the bug.
>
> I agree the overlap issue is a problem and I hope it can be addressed
> somehow for the rare cases in which such nesting takes place.
>
> One easy solution may be to check the dirty ratio before engaging in
> reclaim. If the dirty ratio is sufficiently high then trigger writeout via
> pdflush (we already wakeup pdflush while scanning and you already noted
> that pdflush writeout is not occurring within the context of the current
> cpuset) and pass over any dirty pages during LRU scans until some pages
> have been cleaned up.
>
> This means we allow allocation of additional kernel memory outside of the
> cpuset while triggering writeout of inodes that have pages on the nodes
> of the cpuset. The memory directly used by the application is still
> limited. Just the temporary information needed for writeback is allocated
> outside.
Gad. None of that should be necessary.
> Well sounds somehow still like a hack. Any other ideas out there?
Do what blockdevs do: limit the number of in-flight requests (Peter's
recent patch seems to be doing that for us) (perhaps only when PF_MEMALLOC
is in effect, to keep Trond happy) and implement a mempool for the NFS
request critical store. Additionally:
- we might need to twiddle the NFS gfp_flags so it doesn't call the
oom-killer on failure: just return NULL.
- consider going off-cpuset for critical allocations. It's better than
going oom. A suitable implementation might be to ignore the caller's
cpuset if PF_MEMALLOC. Maybe put a WARN_ON_ONCE in there: we prefer that
it not happen and we want to know when it does.
btw, regarding the per-address_space node mask: I think we should free it
when the inode is clean (!mapping_tagged(PAGECACHE_TAG_DIRTY)). Chances
are, the inode will be dirty for 30 seconds and in-core for hours. We
might as well steal its nodemask storage and give it to the next file which
gets written to. A suitable place to do all this is in
__mark_inode_dirty(I_DIRTY_PAGES), using inode_lock to protect
address_space.dirty_page_nodemask.
next prev parent reply other threads:[~2007-01-17 7:00 UTC|newest]
Thread overview: 110+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-01-16 5:47 Christoph Lameter
2007-01-16 5:47 ` [RFC 1/8] Convert higest_possible_node_id() into nr_node_ids Christoph Lameter
2007-01-16 22:05 ` Andi Kleen
2007-01-17 3:14 ` Christoph Lameter
2007-01-17 4:15 ` Andi Kleen
2007-01-17 4:23 ` Christoph Lameter
2007-01-16 5:47 ` [RFC 2/8] Add a map to inodes to track dirty pages per node Christoph Lameter
2007-01-16 5:47 ` [RFC 3/8] Add a nodemask to pdflush functions Christoph Lameter
2007-01-16 5:48 ` [RFC 4/8] Per cpuset dirty ratio handling and writeout Christoph Lameter
2007-01-16 5:48 ` [RFC 5/8] Make writeout during reclaim cpuset aware Christoph Lameter
2007-01-16 22:07 ` Andi Kleen
2007-01-17 4:20 ` Paul Jackson
2007-01-17 4:28 ` Andi Kleen
2007-01-17 4:36 ` Paul Jackson
2007-01-17 5:59 ` Andi Kleen
2007-01-17 6:19 ` Christoph Lameter
2007-01-17 4:23 ` Christoph Lameter
2007-01-16 5:48 ` [RFC 6/8] Throttle vm writeout per cpuset Christoph Lameter
2007-01-16 5:48 ` [RFC 7/8] Exclude unreclaimable pages from dirty ration calculation Christoph Lameter
2007-01-18 15:48 ` Nikita Danilov
2007-01-18 19:56 ` Christoph Lameter
2007-01-16 5:48 ` [RFC 8/8] Reduce inode memory usage for systems with a high MAX_NUMNODES Christoph Lameter
2007-01-16 19:52 ` Paul Menage
2007-01-16 20:00 ` Christoph Lameter
2007-01-16 20:06 ` Paul Menage
2007-01-16 20:51 ` Christoph Lameter
2007-01-16 7:38 ` [RFC 0/8] Cpuset aware writeback Peter Zijlstra
2007-01-16 20:10 ` Christoph Lameter
2007-01-16 9:25 ` Paul Jackson
2007-01-16 17:13 ` Christoph Lameter
2007-01-16 21:53 ` Andrew Morton
2007-01-16 22:08 ` [PATCH] nfs: fix congestion control Peter Zijlstra
2007-01-16 22:27 ` Trond Myklebust
2007-01-17 2:41 ` Peter Zijlstra
2007-01-17 6:15 ` Trond Myklebust
2007-01-17 8:49 ` Peter Zijlstra
2007-01-17 13:50 ` Trond Myklebust
2007-01-17 14:29 ` Peter Zijlstra
2007-01-17 14:45 ` Trond Myklebust
2007-01-17 20:05 ` Christoph Lameter
2007-01-17 21:52 ` Peter Zijlstra
2007-01-17 21:54 ` Trond Myklebust
2007-01-18 13:27 ` Peter Zijlstra
2007-01-18 15:49 ` Trond Myklebust
2007-01-19 9:33 ` Peter Zijlstra
2007-01-19 13:07 ` Peter Zijlstra
2007-01-19 16:51 ` Trond Myklebust
2007-01-19 17:54 ` Peter Zijlstra
2007-01-19 17:20 ` Christoph Lameter
2007-01-19 17:57 ` Peter Zijlstra
2007-01-19 18:02 ` Christoph Lameter
2007-01-19 18:26 ` Trond Myklebust
2007-01-19 18:27 ` Christoph Lameter
2007-01-20 7:01 ` [PATCH] nfs: fix congestion control -v3 Peter Zijlstra
2007-01-22 16:12 ` Trond Myklebust
2007-01-25 15:32 ` [PATCH] nfs: fix congestion control -v4 Peter Zijlstra
2007-01-26 5:02 ` Andrew Morton
2007-01-26 8:00 ` Peter Zijlstra
2007-01-26 8:50 ` Peter Zijlstra
2007-01-26 5:09 ` Andrew Morton
2007-01-26 5:31 ` Christoph Lameter
2007-01-26 6:04 ` Andrew Morton
2007-01-26 6:53 ` Christoph Lameter
2007-01-26 8:03 ` Peter Zijlstra
2007-01-26 8:51 ` Andrew Morton
2007-01-26 9:01 ` Peter Zijlstra
2007-02-20 12:59 ` Peter Zijlstra
2007-01-22 17:59 ` [PATCH] nfs: fix congestion control -v3 Christoph Lameter
2007-01-17 23:15 ` [PATCH] nfs: fix congestion control Christoph Hellwig
2007-01-16 22:15 ` [RFC 0/8] Cpuset aware writeback Christoph Lameter
2007-01-16 23:40 ` Andrew Morton
2007-01-17 0:16 ` Christoph Lameter
2007-01-17 1:07 ` Andrew Morton
2007-01-17 1:30 ` Christoph Lameter
2007-01-17 2:34 ` Andrew Morton
2007-01-17 3:40 ` Christoph Lameter
2007-01-17 4:02 ` Paul Jackson
2007-01-17 4:05 ` Andrew Morton
2007-01-17 6:27 ` Christoph Lameter
2007-01-17 7:00 ` Andrew Morton [this message]
2007-01-17 8:01 ` Paul Jackson
2007-01-17 9:57 ` Andrew Morton
2007-01-17 19:43 ` Christoph Lameter
2007-01-17 22:10 ` Andrew Morton
2007-01-18 1:10 ` Christoph Lameter
2007-01-18 1:25 ` Andrew Morton
2007-01-18 5:21 ` Christoph Lameter
2007-01-16 23:44 ` David Chinner
2007-01-16 22:01 ` Andi Kleen
2007-01-16 22:18 ` Christoph Lameter
2007-02-02 1:38 ` Ethan Solomita
2007-02-02 2:16 ` Christoph Lameter
2007-02-02 4:03 ` Andrew Morton
2007-02-02 5:29 ` Christoph Lameter
2007-02-02 6:02 ` Neil Brown
2007-02-02 6:17 ` Christoph Lameter
2007-02-02 6:41 ` Neil Brown
2007-02-02 7:12 ` Andrew Morton
2007-03-21 21:11 ` Ethan Solomita
2007-03-21 21:29 ` Christoph Lameter
2007-03-21 21:52 ` Andrew Morton
2007-03-21 21:57 ` Christoph Lameter
2007-04-19 2:07 ` Ethan Solomita
2007-04-19 2:55 ` Christoph Lameter
2007-04-19 7:52 ` Ethan Solomita
2007-04-19 16:03 ` Christoph Lameter
2007-04-21 1:37 ` Ethan Solomita
2007-04-21 1:48 ` Christoph Lameter
2007-04-21 8:15 ` Ethan Solomita
2007-04-21 15:40 ` Christoph Lameter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070116230034.b8cb4263.akpm@osdl.org \
--to=akpm@osdl.org \
--cc=ak@suse.de \
--cc=clameter@sgi.com \
--cc=dgc@sgi.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=menage@google.com \
--cc=nickpiggin@yahoo.com.au \
--cc=pj@sgi.com \
--subject='Re: [RFC 0/8] Cpuset aware writeback' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).