LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Nick Piggin <npiggin@suse.de>
To: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Christoph Lameter <clameter@sgi.com>,
	linux-kernel@vger.kernel.org, arjan@linux.intel.com,
	mingo@elte.hu, ak@suse.de, jens.axboe@oracle.com,
	James.Bottomley@SteelEye.com, andrea@suse.de,
	akpm@linux-foundation.org, andrew.vasquez@qlogic.com
Subject: Re: [rfc] direct IO submission and completion scalability issues
Date: Wed, 1 Aug 2007 03:24:21 +0200	[thread overview]
Message-ID: <20070801012421.GF31006@wotan.suse.de> (raw)
In-Reply-To: <20070801005513.GE10033@linux-os.sc.intel.com>

On Tue, Jul 31, 2007 at 05:55:13PM -0700, Suresh B wrote:
> On Wed, Aug 01, 2007 at 02:41:18AM +0200, Nick Piggin wrote:
> > On Tue, Jul 31, 2007 at 10:14:03AM -0700, Suresh B wrote:
> > > task by taking away some of its cpu time. With CFS micro accounting, perhaps
> > > we can track irq, softirq time and avoid penalizing the running task's cpu
> > > time.
> > 
> > But you "penalize" the running task in the completion handler as well
> > anyway.
> 
> Yes.
> 
> Ingo, in general with CFS micro accounting, we should be able to avoid
> penalizing the running task by tracking irq/softirq time. Isn't it?
> 
> > Doing this with a SCHED_FIFO task is sort of like doing interrupt
> > threading which AFAIK has not been accepted (yet).
> 
> I am not recommending SCHED_FIFO. I will take a look at softirq
> infrastructure for this.

I think that would be a fine way to go.


> > > This workload is using direct IO and there is no batching at the block layer
> > > for direct IO. IO is submitted to the HW as it arrives.
> > 
> > So you aren't putting concurrent requests into the queue? Sounds like
> > userspace should be improved.
> 
> Nick remember that there are hundreds of disks in this setup and at
> an instance, there will be max 1 or 2 requests per disk.

Well if there is 2 requests per disk, that's a good thing; you won't
need to unplug. If there is only 1, then as well as the plugging cost,
the hardeware loses some ability to pipeline things effectively.

I'm not saying the kernel shouldn't be improved in the latter case, but
if you're looking for performance, it is nice to ensure you have at
least 2 requests. Presumably you're using some pretty well tuned db
software though, so I guess this is not always possible.

Do you have stats for these things (queue empty vs not empty events,
unplugs, etc)? 


> > > It is applicable for both direct IO and buffered IO. But the implementations
> > > will differ. For example in buffered IO, we can setup in such a way that the
> > > block plug timeout function runs on the IO completion cpu.
> > 
> > It would be nice to be doing that anyway. But unplug via request submission
> > rather than timeout is fairly common in buffered loads too.
> 
> Ok. Currently the patch handles both direct and buffered IO. While making
> improvements to this patch I will make sure that both the paths take
> advantage of this.

Sounds good!


  reply	other threads:[~2007-08-01  1:24 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-28  1:21 Siddha, Suresh B
2007-07-30 18:20 ` Christoph Lameter
2007-07-30 20:35   ` Siddha, Suresh B
2007-07-31  4:19     ` Nick Piggin
2007-07-31 17:14       ` Siddha, Suresh B
2007-08-01  0:41         ` Nick Piggin
2007-08-01  0:55           ` Siddha, Suresh B
2007-08-01  1:24             ` Nick Piggin [this message]
2008-02-03  9:52 ` Nick Piggin
2008-02-03 10:53   ` Pekka Enberg
2008-02-03 11:58     ` Nick Piggin
2008-02-04  2:10   ` David Chinner
2008-02-04  4:14     ` Arjan van de Ven
2008-02-04  4:40       ` David Chinner
2008-02-04 10:09         ` Nick Piggin
2008-02-05  0:14           ` David Chinner
2008-02-08  7:50             ` Nick Piggin
2008-02-04 18:21     ` Zach Brown
2008-02-04 20:10       ` Jens Axboe
2008-02-04 21:45         ` Arjan van de Ven
2008-02-05  8:24           ` Jens Axboe
2008-02-04 10:12   ` Jens Axboe
2008-02-04 10:31     ` Nick Piggin
2008-02-04 10:33       ` Jens Axboe
2008-02-04 22:28         ` James Bottomley
2008-02-04 10:30   ` Andi Kleen
2008-02-04 21:47   ` Siddha, Suresh B

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070801012421.GF31006@wotan.suse.de \
    --to=npiggin@suse.de \
    --cc=James.Bottomley@SteelEye.com \
    --cc=ak@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=andrea@suse.de \
    --cc=andrew.vasquez@qlogic.com \
    --cc=arjan@linux.intel.com \
    --cc=clameter@sgi.com \
    --cc=jens.axboe@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=suresh.b.siddha@intel.com \
    --subject='Re: [rfc] direct IO submission and completion scalability issues' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).