From: David Chinner <dgc@sgi.com>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: Nick Piggin <npiggin@suse.de>,
	linux-kernel@vger.kernel.org, Alan.Brunelle@hp.com,
	arjan@linux.intel.com, dgc@sgi.com
Subject: Re: IO queuing and complete affinity with threads (was Re: [PATCH 0/8] IO queuing and complete affinity)
Date: Mon, 11 Feb 2008 16:22:11 +1100	[thread overview]
Message-ID: <20080211052211.GS155407@sgi.com> (raw)
In-Reply-To: <20080208075954.GA15220@kernel.dk>

On Fri, Feb 08, 2008 at 08:59:55AM +0100, Jens Axboe wrote:
> > > > At least they reported it to be the most efficient scheme in their
> > > > testing, and Dave thought that migrating completions out to submitters
> > > > might be a bottleneck in some cases.
> > > 
> > > More so than migrating submitters to completers? The advantage of only
> > > moving submitters is that you get rid of the completion locking. Apart
> > > from that, the cost should be the same, especially for the thread based
> > > solution.
> > 
> > Not specifically for the block layer, but higher layers like xfs.
> 
> True, but that's parallel to the initial statement - that migrating
> completers is more costly than migrating submitters. So I'd like Dave to
> expand on why he thinks that migrating completers is more costly than
> submitters, APART from the locking associated with adding the request to
> a remote CPU list.

What I think Nick is referring to are the comments I made that, at a
higher layer (e.g. in filesystems), migrating completions to the
submitter CPU may be exactly the wrong thing to do. I don't recall
making any comments on migrating submitters - others have already
commented on that, so I'll ignore it for the moment and try to
explain why completion on the submitter CPU /may/ be bad.

In the case of XFS, for example, submitter-CPU completion is fine
for data I/O but wrong for transaction I/O completion. We want to
direct all transaction completions to as few CPUs as possible (one,
ideally) so that all the completion processing happens on the same
CPU, rather than bouncing global cachelines and locks between all
the CPUs taking completion interrupts.

In more detail: the XFS transaction subsystem is asynchronous. We
submit the transaction I/O on the CPU that creates the transaction,
so the I/O can come from any CPU in the system. If we then farm the
completion processing out to the submission CPU, we push completion
work all over the machine and guarantee that we bounce all of the
XFS transaction log structures and locks between CPUs on completion
as well as on submission (right now it's lots of submission CPUs,
few completion CPUs).
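
For contrast, the per-submitter scheme under discussion has roughly
this shape (again a sketch with invented names, not Jens's actual
patches): tag each request with the submitting CPU, then steer the
completion back there.

#include <linux/smp.h>

struct io_req {
	int submit_cpu;		/* CPU that issued the I/O */
	/* ... */
};

static void io_submit(struct io_req *rq)
{
	rq->submit_cpu = raw_smp_processor_id();
	/* ... queue the request to the device ... */
}

static void io_complete_irq(struct io_req *rq)
{
	/*
	 * Completion work is pushed back to rq->submit_cpu. Good for
	 * data I/O (the caller's pages and state are hot there), but
	 * for XFS log I/O it drags the shared log structures to every
	 * CPU that ever submits a transaction. run_completion_on()
	 * is a hypothetical helper, not an existing kernel API.
	 */
	run_completion_on(rq->submit_cpu, rq);
}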

An example of how bad this can get - this patch:

http://oss.sgi.com/archives/xfs/2007-11/msg00217.html

which prevents simultaneous access to the items tracked in the log
during transaction reservation. Having several hundred CPUs trying
to hit this list at once is really bad for performance - the test
app on the 2048p machine that saw this problem went from ~5500s
runtime down to 9s with the above patch.
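
The effect is easy to reproduce in userspace. The toy below is not
kernel code, just a demonstration of the shape of the problem: many
threads hammering one lock-protected shared word. Build with
"cc -O2 -pthread contend.c -o contend" and compare "time ./contend 1"
against "time ./contend <many>":

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define ITERS 1000000

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long shared;	/* stand-in for the shared log list/counters */

static void *hammer(void *arg)
{
	int i;

	(void)arg;
	for (i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&lock);
		shared++;	/* dirty the shared cacheline */
		pthread_mutex_unlock(&lock);
	}
	return NULL;
}

int main(int argc, char **argv)
{
	int i, n = argc > 1 ? atoi(argv[1]) : 4;
	pthread_t *t = malloc(n * sizeof(*t));

	for (i = 0; i < n; i++)
		pthread_create(&t[i], NULL, hammer, NULL);
	for (i = 0; i < n; i++)
		pthread_join(t[i], NULL);
	printf("%d threads: shared=%ld\n", n, shared);
	free(t);
	return 0;
}

On a small box the slowdown is modest; the more CPUs that pile onto
the lock, the more its cacheline ping-pongs and the worse the
runtime scales, which is the shape of the 5500s-vs-9s numbers above.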

I use this example because the transaction I/O completion touches
exactly the same list, locks and structures but is limited in
distribution (and therefore contention) by the number of
simultaneous I/O completion CPUs. Doing completion on the submitter
CPU will cause much wider distribution of completion processing and
introduce exactly the same issues as the transaction reservation
side had.

As it goes, with large, multi-device volumes (e.g. big stripe) we
already see issues with simultaneous completion processing (e.g. the
8p machine mentioned in the above link), so I'd really like to avoid
making these problems worse....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
