LKML Archive on lore.kernel.org
From: Jens Axboe <jens.axboe@oracle.com>
To: Zach Brown <zach.brown@oracle.com>
Cc: David Chinner <dgc@sgi.com>, Nick Piggin <npiggin@suse.de>,
	"Siddha, Suresh B" <suresh.b.siddha@intel.com>,
	linux-kernel@vger.kernel.org, arjan@linux.intel.com,
	mingo@elte.hu, ak@suse.de, James.Bottomley@SteelEye.com,
	andrea@suse.de, clameter@sgi.com, akpm@linux-foundation.org,
	andrew.vasquez@qlogic.com, willy@linux.intel.com
Subject: Re: [rfc] direct IO submission and completion scalability issues
Date: Mon, 4 Feb 2008 21:10:28 +0100
Message-ID: <20080204201027.GJ15220@kernel.dk>
In-Reply-To: <47A7579F.2050809@oracle.com>

On Mon, Feb 04 2008, Zach Brown wrote:
> [ ugh, still jet lagged. ]
> 
> > Hi Nick,
> > 
> > When Matthew was describing this work at an LCA presentation (not
> > sure whether you were at that presentation or not), Zach came up
> > with the idea that allowing the submitting application to control the
> > CPU on which the io completion processing occurs would be a good
> > approach to try.  That is, we submit a "completion cookie" with the
> > bio that indicates where we want completion to run, rather than
> > dictating that completion runs on the submission CPU.
> > 
> > The reasoning is that only the higher level context really knows
> > what is optimal, and that changes from application to application.
> > The "complete on the submission CPU" policy _may_ be more optimal
> > for database workloads, but it is definitely suboptimal for XFS and
> > transaction I/O completion handling because it simply drags a bunch
> > of global filesystem state around between all the CPUs running
> > completions. In that case, we really only want a single CPU to be
> > handling the completions.....
> > 
> > (Zach - please correct me if I've missed anything)
> 
> Yeah, I think Nick's patch (and Jens' approach, presumably) is just the
> sort of thing we were hoping for when discussing this during Matthew's talk.
> 
> I was imagining the patch a little bit differently (per-cpu tasks, do a
> wake_up from the driver instead of cpu nr testing up in blk, work
> queues, whatever), but we know how to iron out these kinds of details ;).

per-cpu tasks/wq's might be better, it's a little awkward to jump
through hoops

> > Looking at your patch - if you turn it around so that the
> > "submission CPU" field can be specified as the "completion cpu" then
> > I think the patch will expose the policy knobs needed to do the
> > above.
> 
> Yeah, that seems pretty straightforward.
> 
> We might need some logic for noticing that the desired cpu has been
> hot-plugged away while the IO was in flight, it occurs to me.

the softirq completion stuff already handles cpus going away, at least
with my patch that stuff works fine (with a dead flag added).

-- 
Jens Axboe
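
[Illustration, not part of the original message.] The idea discussed
above, a request carrying a "completion cpu" hint plus a fallback when
that CPU has been hot-unplugged, can be sketched in plain user-space C.
This is only a minimal model under stated assumptions: struct io_req,
struct cpu_done_list, pick_completion_cpu() and queue_completion() are
hypothetical names invented here, and none of this is taken from Nick's
or Jens' patches or from the kernel block layer.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4   /* assumed small fixed CPU count for the sketch */

/* Hypothetical stand-in for a bio/request carrying a completion hint. */
struct io_req {
    int completion_cpu;      /* "completion cookie": CPU the submitter asked for */
    struct io_req *next;
};

/* Hypothetical per-CPU completion list with a "dead" flag. */
struct cpu_done_list {
    bool dead;               /* set once the CPU has been hot-unplugged */
    struct io_req *head;
    int count;
};

static struct cpu_done_list done_lists[NR_CPUS];

/*
 * Choose where completion runs. local_cpu is the CPU that noticed the
 * completion (e.g. where the interrupt landed). If the requested CPU is
 * out of range or marked dead, fall back to local_cpu.
 */
static int pick_completion_cpu(const struct io_req *req, int local_cpu)
{
    int cpu = req->completion_cpu;

    if (cpu < 0 || cpu >= NR_CPUS || done_lists[cpu].dead)
        return local_cpu;
    return cpu;
}

/* Push the request onto the chosen CPU's list and report which CPU got it. */
static int queue_completion(struct io_req *req, int local_cpu)
{
    int cpu = pick_completion_cpu(req, local_cpu);

    req->next = done_lists[cpu].head;
    done_lists[cpu].head = req;
    done_lists[cpu].count++;
    return cpu;
}
```

In this model the policy knob is just the completion_cpu field: a
database-style submitter might set it to its own CPU, while a filesystem
might point every request at a single CPU. A real implementation would
also need locking or an IPI/softirq to hand the request to the remote
CPU, which the sketch leaves out.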

