LKML Archive on lore.kernel.org
* [PATCH 0/2] aio: Fix use after free bugs
@ 2011-02-15 12:59 Jan Kara
  2011-02-15 12:59 ` [PATCH 1/2] fs: Fix aio rcu ioctx lookup Jan Kara
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Jan Kara @ 2011-02-15 12:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, LKML


  Hi Al,

  The following two patches fix use after free bugs in AIO code.  I'm not
completely sure you're the right one to merge these but hopefully yes ;).
Could you please merge them? Thanks.

								Honza

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] fs: Fix aio rcu ioctx lookup
  2011-02-15 12:59 [PATCH 0/2] aio: Fix use after free bugs Jan Kara
@ 2011-02-15 12:59 ` Jan Kara
  2011-02-15 12:59 ` [PATCH 2/2] fs: Fix race between io_destroy() and io_submit() in AIO Jan Kara
  2011-02-15 13:55 ` [PATCH 0/2] aio: Fix use after free bugs Jeff Moyer
  2 siblings, 0 replies; 9+ messages in thread
From: Jan Kara @ 2011-02-15 12:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, LKML, Nick Piggin, Nick Piggin, Jan Kara

From: Nick Piggin <npiggin@gmail.com>

aio-dio-invalidate-failure GPFs in aio_put_req from io_submit.

lookup_ioctx doesn't implement the RCU lookup pattern properly.  rcu_read_lock
does not prevent the refcount from going to zero, so we might take a reference
on a zero-count ioctx.

Fix the bug by atomically testing for zero refcount before incrementing.

[JK: Added comment into the code]

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/aio.c |   35 ++++++++++++++++++++++++-----------
 1 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index fc557a3..b4dd668 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -239,15 +239,23 @@ static void __put_ioctx(struct kioctx *ctx)
 	call_rcu(&ctx->rcu_head, ctx_rcu_free);
 }
 
-#define get_ioctx(kioctx) do {						\
-	BUG_ON(atomic_read(&(kioctx)->users) <= 0);			\
-	atomic_inc(&(kioctx)->users);					\
-} while (0)
-#define put_ioctx(kioctx) do {						\
-	BUG_ON(atomic_read(&(kioctx)->users) <= 0);			\
-	if (unlikely(atomic_dec_and_test(&(kioctx)->users))) 		\
-		__put_ioctx(kioctx);					\
-} while (0)
+static inline void get_ioctx(struct kioctx *kioctx)
+{
+	BUG_ON(atomic_read(&kioctx->users) <= 0);
+	atomic_inc(&kioctx->users);
+}
+
+static inline int try_get_ioctx(struct kioctx *kioctx)
+{
+	return atomic_inc_not_zero(&kioctx->users);
+}
+
+static inline void put_ioctx(struct kioctx *kioctx)
+{
+	BUG_ON(atomic_read(&kioctx->users) <= 0);
+	if (unlikely(atomic_dec_and_test(&kioctx->users)))
+		__put_ioctx(kioctx);
+}
 
 /* ioctx_alloc
  *	Allocates and initializes an ioctx.  Returns an ERR_PTR if it failed.
@@ -601,8 +609,13 @@ static struct kioctx *lookup_ioctx(unsigned long ctx_id)
 	rcu_read_lock();
 
 	hlist_for_each_entry_rcu(ctx, n, &mm->ioctx_list, list) {
-		if (ctx->user_id == ctx_id && !ctx->dead) {
-			get_ioctx(ctx);
+		/*
+		 * RCU protects us against accessing freed memory but
+		 * we have to be careful not to get a reference when the
+		 * reference count already dropped to 0 (ctx->dead test
+		 * is unreliable because of races).
+		 */
+		if (ctx->user_id == ctx_id && !ctx->dead && try_get_ioctx(ctx)){
 			ret = ctx;
 			break;
 		}
-- 
1.7.1



* [2/2] fs: Fix race between io_destroy() and io_submit() in AIO
  2011-02-15 12:59 ` [PATCH 2/2] fs: Fix race between io_destroy() and io_submit() in AIO Jan Kara
@ 2011-02-15 12:59   ` Milton Miller
  2011-02-15 17:16     ` Jan Kara
  0 siblings, 1 reply; 9+ messages in thread
From: Milton Miller @ 2011-02-15 12:59 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, LKML, Jan Kara, Nick Piggin, Al Viro, Andrew Morton

> A race can occur when io_submit() races with io_destroy():
> 
>  CPU1						CPU2
> io_submit()
>   do_io_submit()
>     ...
>     ctx = lookup_ioctx(ctx_id);
> 						io_destroy()
>     Now do_io_submit() holds the last reference to ctx.
>     ...
>     queue new AIO
>     put_ioctx(ctx) - frees ctx with active AIOs
> 
> We solve this issue by checking whether ctx is being destroyed
> in AIO submission path after adding new AIO to ctx. Then we
> are guaranteed that either io_destroy() waits for new AIO or
> we see that ctx is being destroyed and bail out.
> 
> Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
> CC: Nick Piggin <npiggin@kernel.dk>
> 
> ---
> fs/aio.c |   15 +++++++++++++++
>  1 files changed, 15 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index b4dd668..0244c04 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
>  		goto out_put_req;
>  
>  	spin_lock_irq(&ctx->ctx_lock);
> +	/*
> +	 * We could have raced with io_destroy() and are currently holding a
> +	 * reference to ctx which should be destroyed. We cannot submit IO
> +	 * since ctx gets freed as soon as io_submit() puts its reference.
> +	 * The check here is reliable since io_destroy() sets ctx->dead before
> +	 * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> +	 * io_destroy() waits for our IO to finish.
> +	 * The check is inside ctx->ctx_lock to avoid extra memory barrier
> +	 * in this fast path...
> +	 */

When reading this comment, and with all of the recent discussions I
had with Paul in the smp ipi thread (especially with respect to
third-party writes), I looked to see that this spinlock was paired
with the spinlock under which dead is set in io_destroy.  It is not.
It took me some time to find that the paired lock is actually in
wait_for_all_aios.  Also, dead is set in aio_cancel_all, which is
under the same spinlock.

Please update the comment about the lack of a memory barrier to
reflect the locking.

> +	if (ctx->dead) {
> +		spin_unlock_irq(&ctx->ctx_lock);
> +		ret = -EINVAL;
> +		goto out_put_req;
> +	}
>  	aio_run_iocb(req);
>  	if (!list_empty(&ctx->run_list)) {
>  		/* drain the run list */

thanks,
milton


* [PATCH 2/2] fs: Fix race between io_destroy() and io_submit() in AIO
  2011-02-15 12:59 [PATCH 0/2] aio: Fix use after free bugs Jan Kara
  2011-02-15 12:59 ` [PATCH 1/2] fs: Fix aio rcu ioctx lookup Jan Kara
@ 2011-02-15 12:59 ` Jan Kara
  2011-02-15 12:59   ` [2/2] " Milton Miller
  2011-02-15 13:55 ` [PATCH 0/2] aio: Fix use after free bugs Jeff Moyer
  2 siblings, 1 reply; 9+ messages in thread
From: Jan Kara @ 2011-02-15 12:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, LKML, Jan Kara, Nick Piggin

A race can occur when io_submit() races with io_destroy():

 CPU1						CPU2
io_submit()
  do_io_submit()
    ...
    ctx = lookup_ioctx(ctx_id);
						io_destroy()
    Now do_io_submit() holds the last reference to ctx.
    ...
    queue new AIO
    put_ioctx(ctx) - frees ctx with active AIOs

We solve this issue by checking whether ctx is being destroyed
in AIO submission path after adding new AIO to ctx. Then we
are guaranteed that either io_destroy() waits for new AIO or
we see that ctx is being destroyed and bail out.

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
CC: Nick Piggin <npiggin@kernel.dk>
---
 fs/aio.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index b4dd668..0244c04 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		goto out_put_req;
 
 	spin_lock_irq(&ctx->ctx_lock);
+	/*
+	 * We could have raced with io_destroy() and are currently holding a
+	 * reference to ctx which should be destroyed. We cannot submit IO
+	 * since ctx gets freed as soon as io_submit() puts its reference.
+	 * The check here is reliable since io_destroy() sets ctx->dead before
+	 * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
+	 * io_destroy() waits for our IO to finish.
+	 * The check is inside ctx->ctx_lock to avoid extra memory barrier
+	 * in this fast path...
+	 */
+	if (ctx->dead) {
+		spin_unlock_irq(&ctx->ctx_lock);
+		ret = -EINVAL;
+		goto out_put_req;
+	}
 	aio_run_iocb(req);
 	if (!list_empty(&ctx->run_list)) {
 		/* drain the run list */
-- 
1.7.1



* Re: [PATCH 0/2] aio: Fix use after free bugs
  2011-02-15 12:59 [PATCH 0/2] aio: Fix use after free bugs Jan Kara
  2011-02-15 12:59 ` [PATCH 1/2] fs: Fix aio rcu ioctx lookup Jan Kara
  2011-02-15 12:59 ` [PATCH 2/2] fs: Fix race between io_destroy() and io_submit() in AIO Jan Kara
@ 2011-02-15 13:55 ` Jeff Moyer
  2 siblings, 0 replies; 9+ messages in thread
From: Jeff Moyer @ 2011-02-15 13:55 UTC (permalink / raw)
  To: Jan Kara; +Cc: Al Viro, linux-fsdevel, LKML

Jan Kara <jack@suse.cz> writes:

>   Hi Al,
>
>   The following two patches fix use after free bugs in AIO code.  I'm not
> completely sure you're the right one to merge these but hopefully yes ;).
> Could you please merge them? Thanks.

I've always pushed aio changes through akpm, fwiw.

Cheers,
Jeff


* Re: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO
  2011-02-15 12:59   ` [2/2] " Milton Miller
@ 2011-02-15 17:16     ` Jan Kara
  2011-02-15 18:50       ` Milton Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kara @ 2011-02-15 17:16 UTC (permalink / raw)
  To: Milton Miller
  Cc: Jan Kara, linux-fsdevel, LKML, Nick Piggin, Al Viro, Andrew Morton

On Tue 15-02-11 12:59:24, Milton Miller wrote:
> > A race can occur when io_submit() races with io_destroy():
> > 
> >  CPU1						CPU2
> > io_submit()
> >   do_io_submit()
> >     ...
> >     ctx = lookup_ioctx(ctx_id);
> > 						io_destroy()
> >     Now do_io_submit() holds the last reference to ctx.
> >     ...
> >     queue new AIO
> >     put_ioctx(ctx) - frees ctx with active AIOs
> > 
> > We solve this issue by checking whether ctx is being destroyed
> > in AIO submission path after adding new AIO to ctx. Then we
> > are guaranteed that either io_destroy() waits for new AIO or
> > we see that ctx is being destroyed and bail out.
> > 
> > Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > CC: Nick Piggin <npiggin@kernel.dk>
> > 
> > ---
> > fs/aio.c |   15 +++++++++++++++
> >  1 files changed, 15 insertions(+), 0 deletions(-)
> > 
> > diff --git a/fs/aio.c b/fs/aio.c
> > index b4dd668..0244c04 100644
> > --- a/fs/aio.c
> > +++ b/fs/aio.c
> > @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
> >  		goto out_put_req;
> >  
> >  	spin_lock_irq(&ctx->ctx_lock);
> > +	/*
> > +	 * We could have raced with io_destroy() and are currently holding a
> > +	 * reference to ctx which should be destroyed. We cannot submit IO
> > +	 * since ctx gets freed as soon as io_submit() puts its reference.
> > +	 * The check here is reliable since io_destroy() sets ctx->dead before
> > +	 * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> > +	 * io_destroy() waits for our IO to finish.
> > +	 * The check is inside ctx->ctx_lock to avoid extra memory barrier
> > +	 * in this fast path...
> > +	 */
> 
> When reading this comment, and with all of the recent discussions I
> had with Paul in the smp ipi thread (especially with respect to
> third-party writes), I looked to see that this spinlock was paired
> with the spinlock under which dead is set in io_destroy.  It is not.
> It took me some time to find that the paired lock is actually in
> wait_for_all_aios.  Also, dead is set in aio_cancel_all, which is
> under the same spinlock.
> 
> Please update the comment about the lack of a memory barrier to
> reflect the locking.
  Hum, sorry but I don't understand. The above message wants to say that
io_destroy() does
  ctx->dead = 1
  barrier (implied by a spin_unlock)
  wait for reqs_active to get to 0

while io_submit() does
  increment reqs_active
  barrier (implied by a spin_lock - on a different lock but that does not
matter as we only need the barrier semantics)
  check ctx->dead

So if io_submit() gets past ctx->dead check, io_destroy() will certainly
wait for our reference in reqs_active to be released.

I don't see any lock pairing needed here... But maybe I miss something.

								Honza
> 
> > +	if (ctx->dead) {
> > +		spin_unlock_irq(&ctx->ctx_lock);
> > +		ret = -EINVAL;
> > +		goto out_put_req;
> > +	}
> >  	aio_run_iocb(req);
> >  	if (!list_empty(&ctx->run_list)) {
> >  		/* drain the run list */
> 
> thanks,
> milton
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO
  2011-02-15 17:16     ` Jan Kara
@ 2011-02-15 18:50       ` Milton Miller
  2011-02-15 19:15         ` Jan Kara
  0 siblings, 1 reply; 9+ messages in thread
From: Milton Miller @ 2011-02-15 18:50 UTC (permalink / raw)
  To: Jan Kara
  Cc: Jan Kara, linux-fsdevel, LKML, Nick Piggin, Al Viro,
	Andrew Morton, Paul E. McKenney


Hi.  I chatted with Paul a bit to clarify my thoughts.

On Tue, 15 Feb 2011 about 11:16:16 -0600, Jan Kara wrote:
> On Tue 15-02-11 12:59:24, Milton Miller wrote:
> > > A race can occur when io_submit() races with io_destroy():
> > > 
> > >  CPU1						CPU2
> > > io_submit()
> > >   do_io_submit()
> > >     ...
> > >     ctx = lookup_ioctx(ctx_id);
> > > 						io_destroy()
> > >     Now do_io_submit() holds the last reference to ctx.
> > >     ...
> > >     queue new AIO
> > >     put_ioctx(ctx) - frees ctx with active AIOs
> > > 
> > > We solve this issue by checking whether ctx is being destroyed
> > > in AIO submission path after adding new AIO to ctx. Then we
> > > are guaranteed that either io_destroy() waits for new AIO or
> > > we see that ctx is being destroyed and bail out.
> > > 
> > > Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
> > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > CC: Nick Piggin <npiggin@kernel.dk>
> > > 
> > > ---
> > > fs/aio.c |   15 +++++++++++++++
> > >  1 files changed, 15 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/fs/aio.c b/fs/aio.c
> > > index b4dd668..0244c04 100644
> > > --- a/fs/aio.c
> > > +++ b/fs/aio.c
> > > @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
> > >  		goto out_put_req;
> > >  
> > >  	spin_lock_irq(&ctx->ctx_lock);
> > > +	/*
> > > +	 * We could have raced with io_destroy() and are currently holding a
> > > +	 * reference to ctx which should be destroyed. We cannot submit IO
> > > +	 * since ctx gets freed as soon as io_submit() puts its reference.
> > > +	 * The check here is reliable since io_destroy() sets ctx->dead before
> > > +	 * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> > > +	 * io_destroy() waits for our IO to finish.
> > > +	 * The check is inside ctx->ctx_lock to avoid extra memory barrier
> > > +	 * in this fast path...
> > > +	 */
> > 
> > When reading this comment, and with all of the recent discussions I
> > had with Paul in the smp ipi thread (especially with respect to
> > third-party writes), I looked to see that this spinlock was paired
> > with the spinlock under which dead is set in io_destroy.  It is not.
> > It took me some time to find that the paired lock is actually in
> > wait_for_all_aios.  Also, dead is set in aio_cancel_all, which is
> > under the same spinlock.
> > 
> > Please update the comment about the lack of a memory barrier to
> > reflect the locking.

This locking description is wrong:

>   Hum, sorry but I don't understand. The above message wants to say that
> io_destroy() does
>   ctx->dead = 1
>   barrier (implied by a spin_unlock)

No, spin_unlock only does a release barrier.

>   wait for reqs_active to get to 0

This read can move up into the spinlocked region (up to the lock acquire).

> 
> while io_submit() does
>   increment reqs_active
>   barrier (implied by a spin_lock - on a different lock but that does not
> matter as we only need the barrier semantics)

No, only an acquire barrier; old writes can move into the spinlocked region.

>   check ctx->dead

the increment can move down past this check to the unlock here.

> 
> So if io_submit() gets past ctx->dead check, io_destroy() will certainly
> wait for our reference in reqs_active to be released.
> 
> I don't see any lock pairing needed here... But maybe I miss something.
> 
> 								Honza

Spin lock and unlock are only half barriers, as described in
Documentation/memory-barriers.txt.


Now, as I said, the code is ok because the active count is read and
written under ctx->ctx_lock, and aio_cancel_all sets dead under
that lock.

But the comment needs to reflect that, and not just that the code is
under some random spin_lock region instead of a memory barrier, which
is not sufficient.  Bad lock descriptions lead to bad code in the
future, either through copying the code to another context or through
future work removing the additional constraints not mentioned.

So please, comment which locks are being used here, as what
you described is not enough.


thanks,
milton


* Re: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO
  2011-02-15 18:50       ` Milton Miller
@ 2011-02-15 19:15         ` Jan Kara
  2011-02-15 19:33           ` Jan Kara
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kara @ 2011-02-15 19:15 UTC (permalink / raw)
  To: Milton Miller
  Cc: Jan Kara, linux-fsdevel, LKML, Nick Piggin, Al Viro,
	Andrew Morton, Paul E. McKenney

  Hi,

On Tue 15-02-11 12:50:32, Milton Miller wrote:
> On Tue, 15 Feb 2011 about 11:16:16 -0600, Jan Kara wrote:
> > On Tue 15-02-11 12:59:24, Milton Miller wrote:
> > > > A race can occur when io_submit() races with io_destroy():
> > > > 
> > > >  CPU1						CPU2
> > > > io_submit()
> > > >   do_io_submit()
> > > >     ...
> > > >     ctx = lookup_ioctx(ctx_id);
> > > > 						io_destroy()
> > > >     Now do_io_submit() holds the last reference to ctx.
> > > >     ...
> > > >     queue new AIO
> > > >     put_ioctx(ctx) - frees ctx with active AIOs
> > > > 
> > > > We solve this issue by checking whether ctx is being destroyed
> > > > in AIO submission path after adding new AIO to ctx. Then we
> > > > are guaranteed that either io_destroy() waits for new AIO or
> > > > we see that ctx is being destroyed and bail out.
> > > > 
> > > > Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
> > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > CC: Nick Piggin <npiggin@kernel.dk>
> > > > 
> > > > ---
> > > > fs/aio.c |   15 +++++++++++++++
> > > >  1 files changed, 15 insertions(+), 0 deletions(-)
> > > > 
> > > > diff --git a/fs/aio.c b/fs/aio.c
> > > > index b4dd668..0244c04 100644
> > > > --- a/fs/aio.c
> > > > +++ b/fs/aio.c
> > > > @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
> > > >  		goto out_put_req;
> > > >  
> > > >  	spin_lock_irq(&ctx->ctx_lock);
> > > > +	/*
> > > > +	 * We could have raced with io_destroy() and are currently holding a
> > > > +	 * reference to ctx which should be destroyed. We cannot submit IO
> > > > +	 * since ctx gets freed as soon as io_submit() puts its reference.
> > > > +	 * The check here is reliable since io_destroy() sets ctx->dead before
> > > > +	 * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> > > > +	 * io_destroy() waits for our IO to finish.
> > > > +	 * The check is inside ctx->ctx_lock to avoid extra memory barrier
> > > > +	 * in this fast path...
> > > > +	 */
> > > 
> > > When reading this comment, and with all of the recent discussions I
> > > had with Paul in the smp ipi thread (especially with respect to
> > > third-party writes), I looked to see that this spinlock was paired
> > > with the spinlock under which dead is set in io_destroy.  It is not.
> > > It took me some time to find that the paired lock is actually in
> > > wait_for_all_aios.  Also, dead is set in aio_cancel_all, which is
> > > under the same spinlock.
> > > 
> > > Please update the comment about the lack of a memory barrier to
> > > reflect the locking.
> 
> This locking description is wrong:
> 
> >   Hum, sorry but I don't understand. The above message wants to say that
> > io_destroy() does
> >   ctx->dead = 1
> >   barrier (implied by a spin_unlock)
> 
> No, spin_unlock only does a release barrier.
> 
> >   wait for reqs_active to get to 0
> 
> This read can move up into the spinlocked region (up to the lock acquire).
> 
> > 
> > while io_submit() does
> >   increment reqs_active
> >   barrier (implied by a spin_lock - on a different lock but that does not
> > matter as we only need the barrier semantics)
> 
> No, only an acquire barrier; old writes can move into the spinlocked region.
> 
> >   check ctx->dead
> 
> the increment can move down past this check to the unlock here.
  Ah OK, you're right. I was typing too fast and thinking too slow ;).

> > So if io_submit() gets past ctx->dead check, io_destroy() will certainly
> > wait for our reference in reqs_active to be released.
> > 
> > I don't see any lock pairing needed here... But maybe I miss something.
> > 
> > 								Honza
> 
> Spin lock and unlock are only half barriers, as described in
> Documentation/memory-barriers.txt.
> 
> 
> Now, as I said, the code is ok because the active count is read and
> written under ctx->ctx_lock, and aio_cancel_all sets dead under
> that lock.
> 
> But the comment needs to reflect that, and not just that the code is
> under some random spin_lock region instead of a memory barrier, which
> is not sufficient.  Bad lock descriptions lead to bad code in the
> future, either through copying the code to another context or through
> future work removing the additional constraints not mentioned.
> 
> So please, comment which locks are being used here, as what
> you described is not enough.
  Yep, I'll improve the comment. Thanks for explanation.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO
  2011-02-15 19:15         ` Jan Kara
@ 2011-02-15 19:33           ` Jan Kara
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Kara @ 2011-02-15 19:33 UTC (permalink / raw)
  To: Milton Miller
  Cc: Jan Kara, linux-fsdevel, LKML, Nick Piggin, Al Viro,
	Andrew Morton, Paul E. McKenney

  Hello,

On Tue 15-02-11 20:15:14, Jan Kara wrote:
> On Tue 15-02-11 12:50:32, Milton Miller wrote:
> > On Tue, 15 Feb 2011 about 11:16:16 -0600, Jan Kara wrote:
> > > On Tue 15-02-11 12:59:24, Milton Miller wrote:
> > > > > A race can occur when io_submit() races with io_destroy():
> > > > > 
> > > > >  CPU1						CPU2
> > > > > io_submit()
> > > > >   do_io_submit()
> > > > >     ...
> > > > >     ctx = lookup_ioctx(ctx_id);
> > > > > 						io_destroy()
> > > > >     Now do_io_submit() holds the last reference to ctx.
> > > > >     ...
> > > > >     queue new AIO
> > > > >     put_ioctx(ctx) - frees ctx with active AIOs
> > > > > 
> > > > > We solve this issue by checking whether ctx is being destroyed
> > > > > in AIO submission path after adding new AIO to ctx. Then we
> > > > > are guaranteed that either io_destroy() waits for new AIO or
> > > > > we see that ctx is being destroyed and bail out.
> > > > > 
> > > > > Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
> > > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > > CC: Nick Piggin <npiggin@kernel.dk>
> > > > > 
> > > > > ---
> > > > > fs/aio.c |   15 +++++++++++++++
> > > > >  1 files changed, 15 insertions(+), 0 deletions(-)
> > > > > 
> > > > > diff --git a/fs/aio.c b/fs/aio.c
> > > > > index b4dd668..0244c04 100644
> > > > > --- a/fs/aio.c
> > > > > +++ b/fs/aio.c
> > > > > @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
> > > > >  		goto out_put_req;
> > > > >  
> > > > >  	spin_lock_irq(&ctx->ctx_lock);
> > > > > +	/*
> > > > > +	 * We could have raced with io_destroy() and are currently holding a
> > > > > +	 * reference to ctx which should be destroyed. We cannot submit IO
> > > > > +	 * since ctx gets freed as soon as io_submit() puts its reference.
> > > > > +	 * The check here is reliable since io_destroy() sets ctx->dead before
> > > > > +	 * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> > > > > +	 * io_destroy() waits for our IO to finish.
> > > > > +	 * The check is inside ctx->ctx_lock to avoid extra memory barrier
> > > > > +	 * in this fast path...
> > > > > +	 */
> > > > 
> > > > When reading this comment, and with all of the recent discussions I
> > > > had with Paul in the smp ipi thread (especially with respect to
> > > > third-party writes), I looked to see that this spinlock was paired
> > > > with the spinlock under which dead is set in io_destroy.  It is not.
> > > > It took me some time to find that the paired lock is actually in
> > > > wait_for_all_aios.  Also, dead is set in aio_cancel_all, which is
> > > > under the same spinlock.
> > > > 
> > > > Please update the comment about the lack of a memory barrier to
> > > > reflect the locking.
> > 
> > This locking description is wrong:
> > 
> > >   Hum, sorry but I don't understand. The above message wants to say that
> > > io_destroy() does
> > >   ctx->dead = 1
> > >   barrier (implied by a spin_unlock)
> > 
> > No, spin_unlock only does a release barrier.
> > 
> > >   wait for reqs_active to get to 0
> > 
> > This read can move up into the spinlocked region (up to the lock acquire).
> > 
> > > 
> > > while io_submit() does
> > >   increment reqs_active
> > >   barrier (implied by a spin_lock - on a different lock but that does not
> > > matter as we only need the barrier semantics)
> > 
> > No, only an acquire barrier; old writes can move into the spinlocked region.
> > 
> > >   check ctx->dead
> > 
> > the increment can move down past this check to the unlock here.
>   Ah OK, you're right. I was typing too fast and thinking too slow ;).
> 
> > > So if io_submit() gets past ctx->dead check, io_destroy() will certainly
> > > wait for our reference in reqs_active to be released.
> > > 
> > > I don't see any lock pairing needed here... But maybe I miss something.
> > > 
> > > 								Honza
> > 
> > Spin lock and unlock are only half barriers, as described in
> > Documentation/memory-barriers.txt.
> > 
> > 
> > Now, as I said, the code is ok because the active count is read and
> > written under ctx->ctx_lock, and aio_cancel_all sets dead under
> > that lock.
> > 
> > But the comment needs to reflect that, and not just that the code is
> > under some random spin_lock region instead of a memory barrier, which
> > is not sufficient.  Bad lock descriptions lead to bad code in the
> > future, either through copying the code to another context or through
> > future work removing the additional constraints not mentioned.
> > 
> > So please, comment which locks are being used here, as what
> > you described is not enough.
>   Yep, I'll improve the comment. Thanks for explanation.
  Do you like this comment better?
      /*
       * We could have raced with io_destroy() and are currently holding a
       * reference to ctx which should be destroyed. We cannot submit IO
       * since ctx gets freed as soon as io_submit() puts its reference.  The
       * check here is reliable: io_destroy() sets ctx->dead before waiting
       * for outstanding IO and the barrier between these two is realized by
       * unlock of mm->ioctx_lock and lock of ctx->ctx_lock.  Analogously we
       * increment ctx->reqs_active before checking for ctx->dead and the
       * barrier is realized by unlock and lock of ctx->ctx_lock. Thus if we
       * don't see ctx->dead set here, io_destroy() waits for our IO to
       * finish.
       */


								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

