LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: David Rientjes <rientjes@google.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Lameter <cl@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Nick Piggin <npiggin@suse.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Paul Menage <menage@google.com>, Derek Fults <dfults@sgi.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [patch 0/7] cpuset writeback throttling
Date: Fri, 31 Oct 2008 09:36:31 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.00.0810310150160.8229@chino.kir.corp.google.com> (raw)
In-Reply-To: <20081030220344.GL4985@disturbed>

On Fri, 31 Oct 2008, Dave Chinner wrote:

> This is a fairly major change in behaviour to the writeback path on
> NUMA systems and so has the potential to introduce subtle new
> issues. Hence I'm asking about the level of testing and exposure
> it has had. It doesn't really sound like it has had much coverage
> to me....
> 

I've regression tested it to ensure that there is no significant 
degradation for unconstrained (non-cpuset) configurations, which is 
identical to single system-wide cpuset configurations with ratios that 
match the global sysctls.

So there should be no noticeable difference for non-cpuset users or users 
who attach all system tasks to the root cpuset with this new feature.

To simulate large NUMA systems, numa=fake=32 was used on my small 4G 
machine for 32 system nodes.  A 4G file was created from /dev/zero and a 
non-iterative version of my aforementioned test program (see the end of 
this message) was used to mmap(), dirty, and msync().  This was done with 
Linus' latest -git both with and without this patchset.

Although only 32 nodes were online, these kernels both had a 
CONFIG_NODES_SHIFT of 8 to use the patchset's slowpath (the kmalloc() for 
a struct address_space's dirty_nodes) since MAX_NUMNODES > BITS_PER_LONG.

Without this patchset (five iterations):

	real	2m43.824s   2m41.842s   2m44.103s   2m42.176s   2m43.692s
	user	0m18.946s   0m18.875s   0m18.747s   0m18.969s   0m19.091s
	sys	0m7.722s    0m7.817s    0m7.618s    0m7.635s    0m7.450s

	Device:        tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
	sda         376.68        18.60        17.90       4269       4108
	sda         367.37        18.13        17.45       4269       4108
	sda         388.59        19.19        18.46       4269       4107
	sda         399.44        19.73        18.98       4269       4106
	sda         407.15        20.08        19.33       4271       4110

With this patchset (five iterations):

	real	2m46.337s   2m44.321s   2m46.406s   2m43.237s   2m47.165s
	user	0m19.014s   0m18.985s   0m19.146s   0m18.778s   0m18.937s
	sys	0m7.008s    0m7.180s    0m7.067s    0m7.116s    0m7.188s

	Device:        tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
	sda         399.10        19.71        18.97       4269       4109
	sda         396.85        19.55        18.78       4272       4111
	sda         404.72        19.96        19.23       4270       4112
	sda         403.51        19.93        19.18       4269       4108
	sda         396.33        19.54        18.80       4276       4114

> I'm concerned right now because Nick and others (including myself)
> have recently found lots of nasty data integrity problems in the
> writeback path that we are currently trying to test fixes for.
> It's not a good time to introduce new behaviours as that will
> definitely perturb any regression testing we are running....
> 

Do we need a development freeze of the writeback path for features that 
would be targeted to 2.6.29 at the earliest?

Nasty data integrity problems seem like something that would be addressed 
during the -rc stage.

		David
---
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
	void *addr;
	unsigned long length;
	unsigned long i;
	int fd;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <filename> <length>\n",
			argv[0]);
		exit(1);
	}

	fd = open(argv[1], O_RDWR, 0644);
	if (fd < 0) {
		fprintf(stderr, "Cannot open file %s\n", argv[1]);
		exit(1);
	}

	length = strtoul(argv[2], NULL, 0);
	if (!length) {
		fprintf(stderr, "Invalid length %s\n", argv[2]);
		exit(1);
	}

	addr = mmap(0, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
		    0);
	if (addr == MAP_FAILED) {
		fprintf(stderr, "mmap() failed\n");
		exit(1);
	}

	for (i = 0; i < length; i++)
		(*(char *)(addr + i))++;
	msync(addr, length, MS_SYNC);
	return 0;
}

  parent reply	other threads:[~2008-10-31 16:38 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-30 19:23 David Rientjes
2008-10-30 19:23 ` [patch 1/7] cpusets: add dirty map to struct address_space David Rientjes
2008-11-04 21:09   ` Andrew Morton
2008-11-04 21:20     ` Christoph Lameter
2008-11-04 21:42       ` Andrew Morton
2008-10-30 19:23 ` [patch 2/7] pdflush: allow the passing of a nodemask parameter David Rientjes
2008-10-30 19:23 ` [patch 3/7] mm: make page writeback obey cpuset constraints David Rientjes
2008-10-30 19:23 ` [patch 4/7] mm: cpuset aware reclaim writeout David Rientjes
2008-10-30 19:23 ` [patch 5/7] mm: throttle writeout with cpuset awareness David Rientjes
2008-10-30 19:23 ` [patch 6/7] cpusets: per cpuset dirty ratios David Rientjes
2008-10-30 19:23 ` [patch 7/7] cpusets: update documentation for writeback throttling David Rientjes
2008-10-30 21:08 ` [patch 0/7] cpuset " Dave Chinner
2008-10-30 21:33   ` Christoph Lameter
2008-10-30 22:03     ` Dave Chinner
2008-10-31 13:47       ` Christoph Lameter
2008-10-31 16:36       ` David Rientjes [this message]
2008-11-04 20:47 ` Andrew Morton
2008-11-04 20:53   ` Peter Zijlstra
2008-11-04 20:58     ` Christoph Lameter
2008-11-04 21:10     ` David Rientjes
2008-11-04 21:16     ` Andrew Morton
2008-11-04 21:21       ` Peter Zijlstra
2008-11-04 21:50         ` Andrew Morton
2008-11-04 22:17           ` Christoph Lameter
2008-11-04 22:35             ` Andrew Morton
2008-11-04 22:52               ` Christoph Lameter
2008-11-04 23:36                 ` Andrew Morton
2008-11-05  1:31                   ` KAMEZAWA Hiroyuki
2008-11-05  3:09                     ` Andrew Morton
2008-11-05  2:45                   ` Christoph Lameter
2008-11-05  3:05                     ` Andrew Morton
2008-11-05  4:31                       ` KAMEZAWA Hiroyuki
2008-11-10  9:02                         ` Andrea Righi
2008-11-10 10:02                           ` David Rientjes
2008-11-05 13:52                       ` Christoph Lameter
2008-11-05 18:41                         ` Andrew Morton
2008-11-05 20:21                           ` Christoph Lameter
2008-11-05 20:31                             ` Andrew Morton
2008-11-05 20:40                               ` Christoph Lameter
2008-11-05 20:56                                 ` Andrew Morton
2008-11-05 21:28                                   ` Christoph Lameter
2008-11-05 21:55                                   ` Paul Menage
2008-11-05 22:04                                   ` David Rientjes
2008-11-06  1:34                                     ` KAMEZAWA Hiroyuki
2008-11-06 20:35                                       ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.00.0810310150160.8229@chino.kir.corp.google.com \
    --to=rientjes@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=dfults@sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=menage@google.com \
    --cc=npiggin@suse.de \
    --cc=peterz@infradead.org \
    --subject='Re: [patch 0/7] cpuset writeback throttling' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).