LKML Archive on lore.kernel.org
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Christoph Lameter <clameter@sgi.com>
Cc: akpm@osdl.org, Paul Menage <menage@google.com>,
linux-kernel@vger.kernel.org,
Nick Piggin <nickpiggin@yahoo.com.au>,
linux-mm@kvack.org, Andi Kleen <ak@suse.de>,
Paul Jackson <pj@sgi.com>, Dave Chinner <dgc@sgi.com>
Subject: Re: [RFC 0/8] Cpuset aware writeback
Date: Tue, 16 Jan 2007 08:38:10 +0100
Message-ID: <1168933090.22935.30.camel@twins>
In-Reply-To: <20070116054743.15358.77287.sendpatchset@schroedinger.engr.sgi.com>

On Mon, 2007-01-15 at 21:47 -0800, Christoph Lameter wrote:
> Currently cpusets are not able to do proper writeback since
> dirty ratio calculations and writeback are all done for the system
> as a whole. This may result in a large percentage of a cpuset
> to become dirty without writeout being triggered. Under NFS
> this can lead to OOM conditions.
>
> Writeback will occur during the LRU scans. But such writeout
> is not effective since we write page by page and not in inode page
> order (regular writeback).
>
> In order to fix the problem we first of all introduce a method to
> establish a map of nodes that contain dirty pages for each
> inode mapping.
>
> Secondly we modify the dirty limit calculation to be based
> on the active cpuset.
>
> If we are in a cpuset then we select only inodes for writeback
> that have pages on the nodes of the cpuset.
>
> After we have the cpuset throttling in place we can then make
> further fixups:
>
> A. We can do inode based writeout from direct reclaim
> avoiding single page writes to the filesystem.
>
> B. We add a new counter NR_UNRECLAIMABLE that is subtracted
> from the available pages in a node. This allows us to
> accurately calculate the dirty ratio even if large portions
> of the node have been allocated for huge pages or for
> slab pages.
What about mlock'ed pages?
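
To make point B concrete, here is a minimal user-space sketch of a dirty
limit computed against only the pages that can actually hold dirty page
cache. It is not the code from the patch set: the struct, its field names
and both helper functions are hypothetical, and only the idea of
subtracting an unreclaimable count (NR_UNRECLAIMABLE in the RFC) comes
from the text above. mlock'ed pages are not modelled; whether they belong
in that count is the open question.

```c
#include <stdbool.h>

/* Hypothetical per-node page counts; names are illustrative only. */
struct node_pages {
	unsigned long total;         /* all pages on the node */
	unsigned long unreclaimable; /* e.g. huge pages and slab pages */
	unsigned long dirty;         /* dirty page cache pages */
};

/* Pages that could actually be used for (dirty) page cache. */
static unsigned long dirtyable_pages(const struct node_pages *n)
{
	if (n->unreclaimable >= n->total)
		return 0;
	return n->total - n->unreclaimable;
}

/* True when dirty pages exceed ratio_percent of the dirtyable pages. */
static bool over_dirty_limit(const struct node_pages *n,
			     unsigned int ratio_percent)
{
	/* Widen before multiplying so large page counts cannot overflow. */
	unsigned long long limit =
		(unsigned long long)dirtyable_pages(n) * ratio_percent / 100;

	return n->dirty > limit;
}
```

With 1000 pages, 600 of them unreclaimable, and 100 dirty, a 20% ratio
gives a limit of 80 pages instead of 200, so throttling would start
earlier than a calculation based on the raw node size suggests.
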
> There are a couple of points where some better ideas could be used:
>
> 1. The nodemask expands the inode structure significantly if the
> architecture allows a high number of nodes. This is only an issue
> for IA64. For that platform we expand the inode structure by 128 byte
> (to support 1024 nodes). The last patch attempts to address the issue
> by using the knowledge about the maximum possible number of nodes
> determined on bootup to shrink the nodemask.
Not the prettiest indeed, no ideas though.
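
For reference, the 128 byte figure is just the bitmap arithmetic: one bit
per node, rounded up to whole longs. A small sketch, with a hypothetical
helper name, that shows how much a boot-time node count would save:

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Bytes needed for a node bitmap covering nr_nodes, rounded up to a
 * whole number of unsigned longs. Hypothetical helper for illustration.
 */
static size_t nodemask_bytes(unsigned int nr_nodes)
{
	size_t words = (nr_nodes + BITS_PER_LONG - 1) / BITS_PER_LONG;

	return words * sizeof(unsigned long);
}
```

nodemask_bytes(1024) is 128, matching the number quoted above, while a
machine that turns out to have only 4 possible nodes would need a single
long.
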
> 2. The calculation of the per cpuset limits can require looping
> over a number of nodes which may bring the performance of get_dirty_limits
> near pre 2.6.18 performance (before the introduction of the ZVC counters)
> (only for cpuset based limit calculation). There is no way of keeping these
> counters per cpuset since cpusets may overlap.
Well, you gain functionality, you lose some runtime, sad but probably
worth it.
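
A rough sketch of where that runtime goes, assuming a 64-bit mask as a
stand-in for the cpuset's allowed nodes. Everything here (MAX_NODES, the
structs, sum_cpuset) is made up for illustration and is not the
get_dirty_limits code from the patches:

```c
#include <stdint.h>

#define MAX_NODES 64 /* assumption for this sketch only */

/* Hypothetical per-node counters. */
struct node_stat {
	unsigned long dirtyable;
	unsigned long dirty;
};

struct cpuset_totals {
	unsigned long dirtyable;
	unsigned long dirty;
};

/*
 * Sum the counters of every node set in mems_allowed. The cost is one
 * pass over the nodes per call, since overlapping cpusets rule out a
 * single running total per cpuset.
 */
static struct cpuset_totals sum_cpuset(const struct node_stat *nodes,
				       uint64_t mems_allowed)
{
	struct cpuset_totals t = { 0, 0 };
	unsigned int nid;

	for (nid = 0; nid < MAX_NODES; nid++) {
		if (!(mems_allowed & (UINT64_C(1) << nid)))
			continue;
		t.dirtyable += nodes[nid].dirtyable;
		t.dirty += nodes[nid].dirty;
	}
	return t;
}
```

Two cpusets covering nodes {0,1} and {1,2} both count node 1, which is
why the totals have to be recomputed from the per-node counters each
time.
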
Otherwise it all looks good.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>