LKML Archive on lore.kernel.org
From: Nick Piggin <npiggin@suse.de>
To: David Chinner <dgc@sgi.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>,
	"Siddha, Suresh B" <suresh.b.siddha@intel.com>,
	linux-kernel@vger.kernel.org, mingo@elte.hu, ak@suse.de,
	jens.axboe@oracle.com, James.Bottomley@SteelEye.com,
	andrea@suse.de, clameter@sgi.com, akpm@linux-foundation.org,
	andrew.vasquez@qlogic.com, willy@linux.intel.com,
	Zach Brown <zach.brown@oracle.com>
Subject: Re: [rfc] direct IO submission and completion scalability issues
Date: Fri, 8 Feb 2008 08:50:29 +0100
Message-ID: <20080208075029.GF9730@wotan.suse.de>
In-Reply-To: <20080205001419.GG155407@sgi.com>

On Tue, Feb 05, 2008 at 11:14:19AM +1100, David Chinner wrote:
> On Mon, Feb 04, 2008 at 11:09:59AM +0100, Nick Piggin wrote:
> > You get better behaviour in the slab and page allocators and locality
> > and cache hotness of memory. For example, I guess in a filesystem /
> > pagecache heavy workload, you have to touch each struct page, buffer head,
> > fs private state, and also often have to wake the thread for completion.
> > Much of this data has just been touched at submit time, so doing this on
> > the same CPU is nice...
> 
> [....]
> 
> > I'm surprised that the xfs global state bouncing would outweigh the
> > bouncing of all the per-page/block/bio/request/etc data that gets touched
> > during completion. We'll see.
> 
> per-page/block/bio/request/etc is local to a single I/O. The only
> penalty is a cacheline bounce for each of the structures from one
> CPU to another.  That is, there is no global state modified by these
> completions.

Yeah, but it is going from _all_ submitting CPUs to the one completing
CPU. So you could bottleneck the interconnect at the completing CPU
just as much as if you had cachelines being pulled the other way (i.e.
many CPUs trying to pull in a global cacheline).

> The real issue is metadata. The transaction log I/O completion
> funnels through a state machine protected by a single lock, which
> means completions on different CPUs pull that lock to all
> completion CPUs. Given that the same lock is used during transaction
> completion for other state transitions (in task context, not intr),
> the more CPUs that touch it at once, the worse the problem gets.

OK, once you add locking (and not simply cacheline contention), then
the problem gets harder, I agree. But I think that if the submitting
side takes the same locks as log completion (e.g. maybe for starting a
new transaction), then it is not going to be a clear win either way,
and you'd have to measure it in the end.
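
To make "measure it" a bit more concrete, here is a rough userspace
sketch of the two extremes, not kernel code and not anything from XFS:
every thread takes one shared lock (roughly the single-lock log state
machine case), versus every thread touching only its own padded counter
(roughly the per-I/O local data case). All names (shared_worker,
local_worker, compare_contention, the 64-byte line size, the thread and
iteration counts) are assumptions made up for the illustration.

```c
/* Userspace toy only; every name and constant here is hypothetical. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define NTHREADS 4
#define ITERS    200000L
#define LINE     64	/* assumed cacheline size in bytes */

/* One lock and one counter shared by every thread (the contended case). */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static long global_count;

/* Per-thread counters spaced LINE bytes apart so no two share a line. */
struct padded {
	volatile long count;
	char pad[LINE - sizeof(long)];
};
static struct padded local[NTHREADS];

static void *shared_worker(void *arg)
{
	(void)arg;
	for (long i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&global_lock);
		global_count++;
		pthread_mutex_unlock(&global_lock);
	}
	return NULL;
}

static void *local_worker(void *arg)
{
	struct padded *p = arg;

	for (long i = 0; i < ITERS; i++)
		p->count++;
	return NULL;
}

/* Run NTHREADS copies of fn and return the elapsed wall-clock seconds. */
static double run(void *(*fn)(void *), int use_local)
{
	pthread_t tid[NTHREADS];
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < NTHREADS; i++) {
		int rc = pthread_create(&tid[i], NULL, fn,
					use_local ? (void *)&local[i] : NULL);
		assert(rc == 0);
	}
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Sum the per-thread counters after the workers have been joined. */
static long local_total(void)
{
	long sum = 0;

	for (int i = 0; i < NTHREADS; i++)
		sum += local[i].count;
	return sum;
}

/* Time both variants once and print the results. */
static void compare_contention(void)
{
	double shared_s = run(shared_worker, 0);
	double local_s = run(local_worker, 1);

	printf("shared lock: %.4fs  per-thread: %.4fs\n", shared_s, local_s);
}
```

Call compare_contention() from main() and build with something like
"cc -O2 -pthread". The absolute numbers mean little; the point is only
that the single-lock variant is the one whose cost depends on how many
CPUs are hitting it at once, which is why the real answer has to come
from measuring the actual submit and completion paths.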

