LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Nate Diller <nate@agami.com>
To: Benjamin LaHaise <bcrl@kvack.org>
Cc: Nate Diller <nate.diller@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	Andrew Morton <akpm@osdl.org>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Suparna Bhattacharya <suparna@in.ibm.com>,
	Kenneth W Chen <kenneth.w.chen@intel.com>,
	David Brownell <dbrownell@users.sourceforge.net>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	netdev@vger.kernel.org, ocfs2-devel@oss.oracle.com,
	linux-aio@kvack.org, xfs-masters@oss.sgi.com
Subject: Re: [PATCH -mm 0/10][RFC] aio: make struct kiocb private
Date: Wed, 17 Jan 2007 15:30:25 -0800 (PST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0701171424450.23419@nate-64.agami.com> (raw)
In-Reply-To: <20070117215230.GB28828@kvack.org>

On Wed, 17 Jan 2007, Benjamin LaHaise wrote:

> On Mon, Jan 15, 2007 at 08:25:15PM -0800, Nate Diller wrote:
>> the right thing to do from a design perspective.  Hopefully it enables
>> a new architecture that can reduce context switches in I/O completion,
>> and reduce overhead.  That's the real motive ;)
>
> And it's a broken motive.  Context switches per se are not bad, as they
> make it possible to properly schedule code in a busy system (which is
> *very* important when realtime concerns come into play).  Have a look
> at how things were done in the 2.4 aio code to see how completion would
> get done with a non-retry method, typically in interrupt context.  I had
> code that did direct I/O rather differently by sharing code with the
> read/write code paths at some point, the catch being that it was pretty
> invasive, which meant that it never got merged with the changes to handle
> writeback pressure and other work that happened during 2.5.

I'm having some trouble understanding your concern.  From my perspective,
any unnecessary context switch represents not only performance loss, but
extra complexity in the code.  In this case, I'm not suggesting that the
aio.c code causes problems, quite the opposite.  The code I'd like to change
is FS and md levels, where context switches happen because of timers,
workqueues, and worker threads.  For sync I/O, these layers could be doing
their completion work in process context, but because waiting on sync I/O is
done in layers above, they must resort to other means, even for the common
case.  The dm-crypt module is the most straightforward example.

I took a look at some 2.4.18 aio patches in kernel.org/.../bcrl/aio/, and if
I understand what you did, you were basically operating at the aops level
rather than f_ops.  I actually like that idea, it's nicer than having the
direct-io code do its work seperately from the aio code.  Part of where I'm
going with this patch is a better integration between the block layer
(make_request), page layer (aops), and FS layer (f_ops), particularly in the
completion paths.  The direct-io code is an improvement over the common code
on that point, do_readahead() and friends all wait on individual pages to
become uptodate.  I'd like to bring some improvements from the directIO
architecture into use in the common case, which I hope will help
performance.

I know that might seem somewhat unrelated, but I don't think it is.  This
change goes hand in hand with using completion handlers in the aops.  That
will link together the completion callback in the bio with the aio callback,
so that the whole stack can finish its work in one context.

> That said, you can't make kiocb private without completely removing the
> ability of the rest of the kernel to complete an aio sanely from irq context.
> You need some form of i/o descriptor, and a kiocb is just that.  Adding more
> layering is just going to make things messier and slower for no real gain.

This patchset does not change how or when I/O completion happens,
aio_complete() will still get called from direct-io.c, nfs-direct.c, et al. 
The iocb structure is still passed to aio_complete, just like before.  The
only difference is that the lower level code doesn't know that it's got an
iocb, all it sees is an opaque cookie.  It's more like enforcing a layer
that's already in place, and I think things got simpler rather than messier. 
Whether things are slower or not remains to be seen, but I expect no
measurable changes either way with this patch.

I'm releasing a new version of the patch soon, it will use a new iodesc
structure to keep track of iovec state, which simplifies things further.  It
also will have a new version of the usb gadget code, and some general
cleanups.  I hope you'll take a look at it.

NATE

      reply	other threads:[~2007-01-17 23:32 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-01-16  1:54 Nate Diller
2007-01-16  1:54 ` [PATCH -mm 7/10][RFC] aio: make __blockdev_direct_IO use file_endio_t Nate Diller
2007-01-16  1:54 ` [PATCH -mm 4/10][RFC] aio: convert aio_complete to file_endio_t Nate Diller
2007-01-16  5:53   ` David Brownell
2007-01-16  9:21     ` Nate Diller
2007-01-16  1:54 ` [PATCH -mm 5/10][RFC] aio: make blk_directIO use file_endio_t Nate Diller
2007-01-16  1:54 ` [PATCH -mm 8/10][RFC] aio: make direct_IO aops " Nate Diller
2007-01-16  1:54 ` [PATCH -mm 1/10][RFC] aio: scm remove struct siocb Nate Diller
2007-01-16  1:54 ` [PATCH -mm 6/10][RFC] aio: make nfs_directIO use file_endio_t Nate Diller
2007-01-16  1:54 ` [PATCH -mm 10/10][RFC] aio: convert file aio to file_endio_t Nate Diller
2007-01-16  1:54 ` [PATCH -mm 2/10][RFC] aio: net use struct socket for io Nate Diller
2007-01-16  5:44   ` Stephen Hemminger
2007-01-16 10:24     ` Evgeniy Polyakov
2007-01-16  1:54 ` [PATCH -mm 9/10][RFC] aio: usb gadget remove aio file ops Nate Diller
2007-01-16  6:05   ` David Brownell
2007-01-16  9:13     ` Nate Diller
2007-01-16 18:36       ` David Brownell
2007-01-16  1:54 ` [PATCH -mm 3/10][RFC] aio: use iov_length instead of ki_left Nate Diller
2007-01-16  2:14   ` Christoph Hellwig
2007-01-16  5:37     ` Nate Diller
2007-01-16 23:36       ` Ingo Oeser
2007-01-18  4:27     ` Vectored AIO breakage for sockets and pipes ? Suparna Bhattacharya
2007-01-18 21:40       ` Zach Brown
2007-01-16  3:23 ` [PATCH -mm 0/10][RFC] aio: make struct kiocb private Christoph Hellwig
2007-01-16  4:25   ` Nate Diller
2007-01-16  8:22     ` David Brownell
2007-01-17 21:52     ` Benjamin LaHaise
2007-01-17 23:30       ` Nate Diller [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0701171424450.23419@nate-64.agami.com \
    --to=nate@agami.com \
    --cc=akpm@osdl.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=bcrl@kvack.org \
    --cc=dbrownell@users.sourceforge.net \
    --cc=hch@infradead.org \
    --cc=kenneth.w.chen@intel.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nate.diller@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=ocfs2-devel@oss.oracle.com \
    --cc=suparna@in.ibm.com \
    --cc=trond.myklebust@fys.uio.no \
    --cc=viro@zeniv.linux.org.uk \
    --cc=xfs-masters@oss.sgi.com \
    --subject='Re: [PATCH -mm 0/10][RFC] aio: make struct kiocb private' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).