LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Milton Miller <miltonm@bga.com>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Nick Piggin <npiggin@kernel.dk>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO
Date: Tue, 15 Feb 2011 20:33:16 +0100	[thread overview]
Message-ID: <20110215193316.GN17313@quack.suse.cz> (raw)
In-Reply-To: <20110215191514.GM17313@quack.suse.cz>

  Hello,

On Tue 15-02-11 20:15:14, Jan Kara wrote:
> On Tue 15-02-11 12:50:32, Milton Miller wrote:
> > On Tue, 15 Feb 2011 about 11:16:16 -0600, Jan Kara wrote:
> > > On Tue 15-02-11 12:59:24, Milton Miller wrote:
> > > > > A race can occur when io_submit() races with io_destroy():
> > > > > 
> > > > >  CPU1						CPU2
> > > > > io_submit()
> > > > >   do_io_submit()
> > > > >     ...
> > > > >     ctx = lookup_ioctx(ctx_id);
> > > > > 						io_destroy()
> > > > >     Now do_io_submit() holds the last reference to ctx.
> > > > >     ...
> > > > >     queue new AIO
> > > > >     put_ioctx(ctx) - frees ctx with active AIOs
> > > > > 
> > > > > We solve this issue by checking whether ctx is being destroyed
> > > > > in AIO submission path after adding new AIO to ctx. Then we
> > > > > are guaranteed that either io_destroy() waits for new AIO or
> > > > > we see that ctx is being destroyed and bail out.
> > > > > 
> > > > > Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
> > > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > > CC: Nick Piggin <npiggin@kernel.dk>
> > > > > 
> > > > > ---
> > > > > fs/aio.c |   15 +++++++++++++++
> > > > >  1 files changed, 15 insertions(+), 0 deletions(-)
> > > > > 
> > > > > diff --git a/fs/aio.c b/fs/aio.c
> > > > > index b4dd668..0244c04 100644
> > > > > --- a/fs/aio.c
> > > > > +++ b/fs/aio.c
> > > > > @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
> > > > >  		goto out_put_req;
> > > > >  
> > > > >  	spin_lock_irq(&ctx->ctx_lock);
> > > > > +	/*
> > > > > +	 * We could have raced with io_destroy() and are currently holding a
> > > > > +	 * reference to ctx which should be destroyed. We cannot submit IO
> > > > > +	 * since ctx gets freed as soon as io_submit() puts its reference.
> > > > > +	 * The check here is reliable since io_destroy() sets ctx->dead before
> > > > > +	 * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> > > > > +	 * io_destroy() waits for our IO to finish.
> > > > > +	 * The check is inside ctx->ctx_lock to avoid extra memory barrier
> > > > > +	 * in this fast path...
> > > > > +	 */
> > > > 
> > > > When reading this comment, and with all of the recient discussions I
> > > > had with Paul in the smp ipi thread (especially with resepect to third
> > > > party writes), I looked to see that the spinlock was paired with the
> > > > spinlock to set dead in io_destroy.  It is not.   It took me some time
> > > > to find that the paired lock is actually in wait_for_all_aios.  Also,
> > > > dead is also set in aio_cancel_all which is under the same spinlock.
> > > > 
> > > > Please update this lack of memory barrier comment to reflect the locking.
> > 
> > This locking description is wrong:
> > 
> > >   Hum, sorry but I don't understand. The above message wants to say that
> > > io_destroy() does
> > >   ctx->dead = 1
> > >   barrier (implied by a spin_unlock)
> > 
> > no spin_unlock only does a release barrier.
> > 
> > >   wait for reqs_active to get to 0
> > 
> > This read can move up into the spinlocked region (up to the lock acquire).
> > 
> > > 
> > > while io_submit() does
> > >   increment reqs_active
> > >   barrier (implied by a spin_lock - on a different lock but that does not
> > > matter as we only need the barrier semantics)
> > 
> > No only an acquire barrier, old writes can move into the spinlock region
> > 
> > >   check ctx->dead
> > 
> > the increment can move down past this check to the unlock here.
>   Ah OK, you're right. I was typing too fast and thinking too slow ;).
> 
> > > So if io_submit() gets past ctx->dead check, io_destroy() will certainly
> > > wait for our reference in reqs_active to be released.
> > > 
> > > I don't see any lock pairing needed here... But maybe I miss something.
> > > 
> > > 								Honza
> > 
> > spin lock and unlock are only half barriers as described in 
> > Documentation/memory-barriers.txt
> > 
> > 
> > Now, as I said, the code is ok because the active count is read and
> > written under ctx->ctx_lock, and aio_cancel_all sets dead under
> > that lock.
> > 
> > But the comment needs to reflect that and not just the the code is
> > under in some random spin_lock region instead of a memory barrier,
> > which is not sufficient.   Bad lock descriptions leads to making bad
> > code in the future, either through copying it to another context or
> > though future work removing the additional constraints not mentioned.
> > 
> > So please, comment which locks are being used here, as what
> > you described is not enough.
>   Yep, I'll improve the comment. Thanks for explanation.
  Do you like this comment better?
      /*
       * We could have raced with io_destroy() and are currently holding a
       * reference to ctx which should be destroyed. We cannot submit IO
       * since ctx gets freed as soon as io_submit() puts its reference.  The
       * check here is reliable: io_destroy() sets ctx->dead before waiting
       * for outstanding IO and the barrier between these two is realized by
       * unlock of mm->ioctx_lock and lock of ctx->ctx_lock.  Analogously we
       * increment ctx->reqs_active before checking for ctx->dead and the
       * barrier is realized by unlock and lock of ctx->ctx_lock. Thus if we
       * don't see ctx->dead set here, io_destroy() waits for our IO to
       * finish.
       */


								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2011-02-15 19:33 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-02-15 12:59 [PATCH 0/2] aio: Fix use after free bugs Jan Kara
2011-02-15 12:59 ` [PATCH 1/2] fs: Fix aio rcu ioctx lookup Jan Kara
2011-02-15 12:59 ` [PATCH 2/2] fs: Fix race between io_destroy() and io_submit() in AIO Jan Kara
2011-02-15 12:59   ` [2/2] " Milton Miller
2011-02-15 17:16     ` Jan Kara
2011-02-15 18:50       ` Milton Miller
2011-02-15 19:15         ` Jan Kara
2011-02-15 19:33           ` Jan Kara [this message]
2011-02-15 13:55 ` [PATCH 0/2] aio: Fix use after free bugs Jeff Moyer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110215193316.GN17313@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miltonm@bga.com \
    --cc=npiggin@kernel.dk \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=viro@zeniv.linux.org.uk \
    --subject='Re: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).