LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 1/1] (v3) SYSVIPC - Fix the ipc structures initialization
       [not found] <20081027072826.266777838@bull.net>
@ 2008-10-27  7:28 ` Nadia.Derbey
  2008-10-27 10:32   ` cboulte
  0 siblings, 1 reply; 6+ messages in thread
From: Nadia.Derbey @ 2008-10-27  7:28 UTC (permalink / raw)
  To: akpm, cboulte; +Cc: manfred, linux-kernel, Nadia.Derbey

[-- Attachment #1: ipc-fix-sysvipc_structures_initialization.patch --]
[-- Type: text/plain, Size: 2427 bytes --]


This patch is a fix for Bugzilla bug
http://bugzilla.kernel.org/show_bug.cgi?id=11796.

To summarize, a simple testcase is concurrently running an infinite loop on
msgctl(IPC_STAT) and a call to msgget():

while (1)
   msgctl(id, IPC_STAT)    1
                           |
                           |
                           |
                           2 id = msgget(key, IPC_CREAT)
                           |
                           |
                           |

In the interval [1-2], the id doesn't exist yet.

In that test, the problem is the following:
When we are calling ipc_addid() from msgget() the msq structure is not
completely initialized. So idr_get_new() is inserting a pointer into the
idr tree, and the structure which is pointed to has, among other fields,
its lock uninitialized.

Since msgctl(IPC_STAT) is looping while (1), idr_find() returns the
pointer as soon as it is inserted into the IDR tree. And ipc_lock()
calls spin_lock(&mqs->lock), while we have not initialized that lock
yet.

This patch moves the spin_lock_init() before the call to ipc_addid().
It also sets the "deleted" flag to 1 in the window between msg structure
allocation and msg structure locking in ipc_addid().


Regards,
Nadia


Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>

---
 ipc/util.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

Index: linux-2.6.27/ipc/util.c
===================================================================
--- linux-2.6.27.orig/ipc/util.c	2008-10-23 15:20:46.000000000 +0200
+++ linux-2.6.27/ipc/util.c	2008-10-24 17:48:33.000000000 +0200
@@ -266,6 +266,17 @@ int ipc_addid(struct ipc_ids* ids, struc
 	if (ids->in_use >= size)
 		return -ENOSPC;
 
+	spin_lock_init(&new->lock);
+
+	/*
+	 * We have a window between the time new is inserted into the idr
+	 * tree and the time it is actually locked.
+	 * In order to be safe during that window set the new ipc structure
+	 * as deleted: a concurrent ipc_lock() will see it as not present
+	 * until the initialization phase is complete.
+	 */
+	new->deleted = 1;
+
 	err = idr_get_new(&ids->ipcs_idr, new, &id);
 	if (err)
 		return err;
@@ -280,10 +291,11 @@ int ipc_addid(struct ipc_ids* ids, struc
 		ids->seq = 0;
 
 	new->id = ipc_buildid(id, new->seq);
-	spin_lock_init(&new->lock);
-	new->deleted = 0;
 	rcu_read_lock();
 	spin_lock(&new->lock);
+
+	new->deleted = 0;
+
 	return id;
 }
 

-- 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] (v3) SYSVIPC - Fix the ipc structures initialization
  2008-10-27  7:28 ` [PATCH 1/1] (v3) SYSVIPC - Fix the ipc structures initialization Nadia.Derbey
@ 2008-10-27 10:32   ` cboulte
  2008-10-27 11:04     ` Nadia Derbey
  0 siblings, 1 reply; 6+ messages in thread
From: cboulte @ 2008-10-27 10:32 UTC (permalink / raw)
  To: Nadia.Derbey; +Cc: akpm, manfred, linux-kernel

On Mon, Oct 27, 2008 at 8:28 AM,  <Nadia.Derbey@bull.net> wrote:
>
> This patch is a fix for Bugzilla bug
> http://bugzilla.kernel.org/show_bug.cgi?id=11796.
>
> To summarize, a simple testcase is concurrently running an infinite loop on
> msgctl(IPC_STAT) and a call to msgget():
>
> while (1)
>   msgctl(id, IPC_STAT)    1
>                           |
>                           |
>                           |
>                           2 id = msgget(key, IPC_CREAT)
>                           |
>                           |
>                           |
>
> In the interval [1-2], the id doesn't exist yet.
>
> In that test, the problem is the following:
> When we are calling ipc_addid() from msgget() the msq structure is not
> completely initialized. So idr_get_new() is inserting a pointer into the
> idr tree, and the structure which is pointed to has, among other fields,
> its lock uninitialized.
>
> Since msgctl(IPC_STAT) is looping while (1), idr_find() returns the
> pointer as soon as it is inserted into the IDR tree. And ipc_lock()
> calls spin_lock(&mqs->lock), while we have not initialized that lock
> yet.
>
> This patch moves the spin_lock_init() before the call to ipc_addid().
> It also sets the "deleted" flag to 1 in the window between msg structure
> allocation and msg structure locking in ipc_addid().
>
>
> Regards,
> Nadia
>
>
> Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
>
> ---
>  ipc/util.c |   16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> Index: linux-2.6.27/ipc/util.c
> ===================================================================
> --- linux-2.6.27.orig/ipc/util.c        2008-10-23 15:20:46.000000000 +0200
> +++ linux-2.6.27/ipc/util.c     2008-10-24 17:48:33.000000000 +0200
> @@ -266,6 +266,17 @@ int ipc_addid(struct ipc_ids* ids, struc
>        if (ids->in_use >= size)
>                return -ENOSPC;
>
> +       spin_lock_init(&new->lock);
> +
> +       /*
> +        * We have a window between the time new is inserted into the idr
> +        * tree and the time it is actually locked.
> +        * In order to be safe during that window set the new ipc structure
> +        * as deleted: a concurrent ipc_lock() will see it as not present
> +        * until the initialization phase is complete.
> +        */
> +       new->deleted = 1;
> +
>        err = idr_get_new(&ids->ipcs_idr, new, &id);
>        if (err)
>                return err;
> @@ -280,10 +291,11 @@ int ipc_addid(struct ipc_ids* ids, struc
>                ids->seq = 0;
>
>        new->id = ipc_buildid(id, new->seq);
> -       spin_lock_init(&new->lock);
> -       new->deleted = 0;
>        rcu_read_lock();
>        spin_lock(&new->lock);
> +
> +       new->deleted = 0;
> +
>        return id;
>  }
>
>
> --
>

Still got the lock... I'm using a 4 cpus node: Intel Xeon @ 2.8GHz...
don't know if it has an impact.
The only way I found to have no lock, it's to spin_lock the ipc
_before_ inserting it into the idr.

Best regards, c.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] (v3) SYSVIPC - Fix the ipc structures initialization
  2008-10-27 10:32   ` cboulte
@ 2008-10-27 11:04     ` Nadia Derbey
  2008-10-27 15:42       ` Nadia Derbey
  0 siblings, 1 reply; 6+ messages in thread
From: Nadia Derbey @ 2008-10-27 11:04 UTC (permalink / raw)
  To: cboulte; +Cc: akpm, manfred, linux-kernel

On Mon, 2008-10-27 at 11:32 +0100, cboulte@gmail.com wrote:
> On Mon, Oct 27, 2008 at 8:28 AM,  <Nadia.Derbey@bull.net> wrote:
> >
> > This patch is a fix for Bugzilla bug
> > http://bugzilla.kernel.org/show_bug.cgi?id=11796.
> >
> > To summarize, a simple testcase is concurrently running an infinite loop on
> > msgctl(IPC_STAT) and a call to msgget():
> >
> > while (1)
> >   msgctl(id, IPC_STAT)    1
> >                           |
> >                           |
> >                           |
> >                           2 id = msgget(key, IPC_CREAT)
> >                           |
> >                           |
> >                           |
> >
> > In the interval [1-2], the id doesn't exist yet.
> >
> > In that test, the problem is the following:
> > When we are calling ipc_addid() from msgget() the msq structure is not
> > completely initialized. So idr_get_new() is inserting a pointer into the
> > idr tree, and the structure which is pointed to has, among other fields,
> > its lock uninitialized.
> >
> > Since msgctl(IPC_STAT) is looping while (1), idr_find() returns the
> > pointer as soon as it is inserted into the IDR tree. And ipc_lock()
> > calls spin_lock(&mqs->lock), while we have not initialized that lock
> > yet.
> >
> > This patch moves the spin_lock_init() before the call to ipc_addid().
> > It also sets the "deleted" flag to 1 in the window between msg structure
> > allocation and msg structure locking in ipc_addid().
> >
> >
> > Regards,
> > Nadia
> >
> >
> > Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
> >
> > ---
> >  ipc/util.c |   16 ++++++++++++++--
> >  1 file changed, 14 insertions(+), 2 deletions(-)
> >
> > Index: linux-2.6.27/ipc/util.c
> > ===================================================================
> > --- linux-2.6.27.orig/ipc/util.c        2008-10-23 15:20:46.000000000 +0200
> > +++ linux-2.6.27/ipc/util.c     2008-10-24 17:48:33.000000000 +0200
> > @@ -266,6 +266,17 @@ int ipc_addid(struct ipc_ids* ids, struc
> >        if (ids->in_use >= size)
> >                return -ENOSPC;
> >
> > +       spin_lock_init(&new->lock);
> > +
> > +       /*
> > +        * We have a window between the time new is inserted into the idr
> > +        * tree and the time it is actually locked.
> > +        * In order to be safe during that window set the new ipc structure
> > +        * as deleted: a concurrent ipc_lock() will see it as not present
> > +        * until the initialization phase is complete.
> > +        */
> > +       new->deleted = 1;
> > +
> >        err = idr_get_new(&ids->ipcs_idr, new, &id);
> >        if (err)
> >                return err;
> > @@ -280,10 +291,11 @@ int ipc_addid(struct ipc_ids* ids, struc
> >                ids->seq = 0;
> >
> >        new->id = ipc_buildid(id, new->seq);
> > -       spin_lock_init(&new->lock);
> > -       new->deleted = 0;
> >        rcu_read_lock();
> >        spin_lock(&new->lock);
> > +
> > +       new->deleted = 0;
> > +
> >        return id;
> >  }
> >
> >
> > --
> >
> 
> Still got the lock... I'm using a 4 cpus node: Intel Xeon @ 2.8GHz...
> don't know if it has an impact.
???
The bad new, is that it becomes unreprodicible on my side.
For my part, I've got 2 2.8 GHz Xeon CPUs.

Will review the code once more.

Thanks!
Nadia

> The only way I found to have no lock, it's to spin_lock the ipc
> _before_ inserting it into the idr.
> 
> Best regards, c.
> 
-- 
Nadia Derbey <Nadia.Derbey@bull.net>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] (v3) SYSVIPC - Fix the ipc structures initialization
  2008-10-27 11:04     ` Nadia Derbey
@ 2008-10-27 15:42       ` Nadia Derbey
  2008-10-28  9:44         ` cboulte
  0 siblings, 1 reply; 6+ messages in thread
From: Nadia Derbey @ 2008-10-27 15:42 UTC (permalink / raw)
  To: cboulte; +Cc: akpm, manfred, linux-kernel

On Mon, 2008-10-27 at 12:04 +0100, Nadia Derbey wrote:
> On Mon, 2008-10-27 at 11:32 +0100, cboulte@gmail.com wrote:
> > On Mon, Oct 27, 2008 at 8:28 AM,  <Nadia.Derbey@bull.net> wrote:
> > >
> > > This patch is a fix for Bugzilla bug
> > > http://bugzilla.kernel.org/show_bug.cgi?id=11796.
> > >
> > > To summarize, a simple testcase is concurrently running an infinite loop on
> > > msgctl(IPC_STAT) and a call to msgget():
> > >
> > > while (1)
> > >   msgctl(id, IPC_STAT)    1
> > >                           |
> > >                           |
> > >                           |
> > >                           2 id = msgget(key, IPC_CREAT)
> > >                           |
> > >                           |
> > >                           |
> > >
> > > In the interval [1-2], the id doesn't exist yet.
> > >
> > > In that test, the problem is the following:
> > > When we are calling ipc_addid() from msgget() the msq structure is not
> > > completely initialized. So idr_get_new() is inserting a pointer into the
> > > idr tree, and the structure which is pointed to has, among other fields,
> > > its lock uninitialized.
> > >
> > > Since msgctl(IPC_STAT) is looping while (1), idr_find() returns the
> > > pointer as soon as it is inserted into the IDR tree. And ipc_lock()
> > > calls spin_lock(&mqs->lock), while we have not initialized that lock
> > > yet.
> > >
> > > This patch moves the spin_lock_init() before the call to ipc_addid().
> > > It also sets the "deleted" flag to 1 in the window between msg structure
> > > allocation and msg structure locking in ipc_addid().
> > >
> > >
> > > Regards,
> > > Nadia
> > >
> > >
> > > Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
> > >
> > > ---
> > >  ipc/util.c |   16 ++++++++++++++--
> > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > >
> > > Index: linux-2.6.27/ipc/util.c
> > > ===================================================================
> > > --- linux-2.6.27.orig/ipc/util.c        2008-10-23 15:20:46.000000000 +0200
> > > +++ linux-2.6.27/ipc/util.c     2008-10-24 17:48:33.000000000 +0200
> > > @@ -266,6 +266,17 @@ int ipc_addid(struct ipc_ids* ids, struc
> > >        if (ids->in_use >= size)
> > >                return -ENOSPC;
> > >
> > > +       spin_lock_init(&new->lock);
> > > +
> > > +       /*
> > > +        * We have a window between the time new is inserted into the idr
> > > +        * tree and the time it is actually locked.
> > > +        * In order to be safe during that window set the new ipc structure
> > > +        * as deleted: a concurrent ipc_lock() will see it as not present
> > > +        * until the initialization phase is complete.
> > > +        */
> > > +       new->deleted = 1;
> > > +
> > >        err = idr_get_new(&ids->ipcs_idr, new, &id);
> > >        if (err)
> > >                return err;
> > > @@ -280,10 +291,11 @@ int ipc_addid(struct ipc_ids* ids, struc
> > >                ids->seq = 0;
> > >
> > >        new->id = ipc_buildid(id, new->seq);
> > > -       spin_lock_init(&new->lock);
> > > -       new->deleted = 0;
> > >        rcu_read_lock();
> > >        spin_lock(&new->lock);
> > > +
> > > +       new->deleted = 0;
> > > +
> > >        return id;
> > >  }
> > >
> > >
> > > --
> > >
> > 
> > Still got the lock... I'm using a 4 cpus node: Intel Xeon @ 2.8GHz...
> > don't know if it has an impact.
> ???
> The bad new, is that it becomes unreprodicible on my side.
> For my part, I've got 2 2.8 GHz Xeon CPUs.
> 
> Will review the code once more.
> 
> Thanks!
> Nadia
> 
> > The only way I found to have no lock, it's to spin_lock the ipc
> > _before_ inserting it into the idr.
> > 
> > Best regards, c.
> > 

I agree with you that it's more logical and correct to take the lock
before inserting the ipc structure (i.e. making it visible to readers).

But I wanted to understand what's wrong with
1. new->lock init
2. new->deleted = 1
3. insert(new)

I've been looking at the code again and again and the only thing I see
could have happened, is that instructions have been reordered and the
insertion done before the lock actually being initialized.
This means that a memory barrier is missing (this would explain why your
fix works: the spin_lock acts as a barrier).
But this memory barrier is supposed to be invoked by
rcu_assign_pointer() in idr_get_new(). So may be there's a problem with
the idr code.
Before going into a review of this code, I'd like to confirm what I'm
saying, doing the following (I'm sorry to ask you do it, but I can't
reproduce the problem in my side anymore): would you mind adding a
smp_wmb() just before the idr_get_new in ipc_addid() and tell me if this
solves the problem.
(BTW, I didn't ask you before, but I guess you're getting the same call
trace?)

Regards,
Nadia

-- 
Nadia Derbey <Nadia.Derbey@bull.net>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] (v3) SYSVIPC - Fix the ipc structures initialization
  2008-10-27 15:42       ` Nadia Derbey
@ 2008-10-28  9:44         ` cboulte
  2008-10-28 11:05           ` Nadia Derbey
  0 siblings, 1 reply; 6+ messages in thread
From: cboulte @ 2008-10-28  9:44 UTC (permalink / raw)
  To: Nadia Derbey; +Cc: akpm, manfred, linux-kernel

On Mon, Oct 27, 2008 at 4:42 PM, Nadia Derbey <Nadia.Derbey@bull.net> wrote:
> On Mon, 2008-10-27 at 12:04 +0100, Nadia Derbey wrote:
>> On Mon, 2008-10-27 at 11:32 +0100, cboulte@gmail.com wrote:
>> > On Mon, Oct 27, 2008 at 8:28 AM,  <Nadia.Derbey@bull.net> wrote:
>> > >
>> > > This patch is a fix for Bugzilla bug
>> > > http://bugzilla.kernel.org/show_bug.cgi?id=11796.
>> > >
>> > > To summarize, a simple testcase is concurrently running an infinite loop on
>> > > msgctl(IPC_STAT) and a call to msgget():
>> > >
>> > > while (1)
>> > >   msgctl(id, IPC_STAT)    1
>> > >                           |
>> > >                           |
>> > >                           |
>> > >                           2 id = msgget(key, IPC_CREAT)
>> > >                           |
>> > >                           |
>> > >                           |
>> > >
>> > > In the interval [1-2], the id doesn't exist yet.
>> > >
>> > > In that test, the problem is the following:
>> > > When we are calling ipc_addid() from msgget() the msq structure is not
>> > > completely initialized. So idr_get_new() is inserting a pointer into the
>> > > idr tree, and the structure which is pointed to has, among other fields,
>> > > its lock uninitialized.
>> > >
>> > > Since msgctl(IPC_STAT) is looping while (1), idr_find() returns the
>> > > pointer as soon as it is inserted into the IDR tree. And ipc_lock()
>> > > calls spin_lock(&mqs->lock), while we have not initialized that lock
>> > > yet.
>> > >
>> > > This patch moves the spin_lock_init() before the call to ipc_addid().
>> > > It also sets the "deleted" flag to 1 in the window between msg structure
>> > > allocation and msg structure locking in ipc_addid().
>> > >
>> > >
>> > > Regards,
>> > > Nadia
>> > >
>> > >
>> > > Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
>> > >
>> > > ---
>> > >  ipc/util.c |   16 ++++++++++++++--
>> > >  1 file changed, 14 insertions(+), 2 deletions(-)
>> > >
>> > > Index: linux-2.6.27/ipc/util.c
>> > > ===================================================================
>> > > --- linux-2.6.27.orig/ipc/util.c        2008-10-23 15:20:46.000000000 +0200
>> > > +++ linux-2.6.27/ipc/util.c     2008-10-24 17:48:33.000000000 +0200
>> > > @@ -266,6 +266,17 @@ int ipc_addid(struct ipc_ids* ids, struc
>> > >        if (ids->in_use >= size)
>> > >                return -ENOSPC;
>> > >
>> > > +       spin_lock_init(&new->lock);
>> > > +
>> > > +       /*
>> > > +        * We have a window between the time new is inserted into the idr
>> > > +        * tree and the time it is actually locked.
>> > > +        * In order to be safe during that window set the new ipc structure
>> > > +        * as deleted: a concurrent ipc_lock() will see it as not present
>> > > +        * until the initialization phase is complete.
>> > > +        */
>> > > +       new->deleted = 1;
>> > > +
>> > >        err = idr_get_new(&ids->ipcs_idr, new, &id);
>> > >        if (err)
>> > >                return err;
>> > > @@ -280,10 +291,11 @@ int ipc_addid(struct ipc_ids* ids, struc
>> > >                ids->seq = 0;
>> > >
>> > >        new->id = ipc_buildid(id, new->seq);
>> > > -       spin_lock_init(&new->lock);
>> > > -       new->deleted = 0;
>> > >        rcu_read_lock();
>> > >        spin_lock(&new->lock);
>> > > +
>> > > +       new->deleted = 0;
>> > > +
>> > >        return id;
>> > >  }
>> > >
>> > >
>> > > --
>> > >
>> >
>> > Still got the lock... I'm using a 4 cpus node: Intel Xeon @ 2.8GHz...
>> > don't know if it has an impact.
>> ???
>> The bad new, is that it becomes unreprodicible on my side.
>> For my part, I've got 2 2.8 GHz Xeon CPUs.
>>
>> Will review the code once more.
>>
>> Thanks!
>> Nadia
>>
>> > The only way I found to have no lock, it's to spin_lock the ipc
>> > _before_ inserting it into the idr.
>> >
>> > Best regards, c.
>> >
>
> I agree with you that it's more logical and correct to take the lock
> before inserting the ipc structure (i.e. making it visible to readers).
>
> But I wanted to understand what's wrong with
> 1. new->lock init
> 2. new->deleted = 1
> 3. insert(new)
>
> I've been looking at the code again and again and the only thing I see
> could have happened, is that instructions have been reordered and the
> insertion done before the lock actually being initialized.
> This means that a memory barrier is missing (this would explain why your
> fix works: the spin_lock acts as a barrier).
> But this memory barrier is supposed to be invoked by
> rcu_assign_pointer() in idr_get_new(). So may be there's a problem with
> the idr code.
> Before going into a review of this code, I'd like to confirm what I'm
> saying, doing the following (I'm sorry to ask you do it, but I can't
> reproduce the problem in my side anymore): would you mind adding a
> smp_wmb() just before the idr_get_new in ipc_addid() and tell me if this
> solves the problem.
> (BTW, I didn't ask you before, but I guess you're getting the same call
> trace?)
>
> Regards,
> Nadia
>
> --
> Nadia Derbey <Nadia.Derbey@bull.net>
>
>

I tried this patch:
Index: bug-sysv/ipc/util.c
===================================================================
--- bug-sysv.orig/ipc/util.c    2008-10-27 09:21:44.000000000 +0100
+++ bug-sysv/ipc/util.c 2008-10-27 19:04:33.000000000 +0100
@@ -266,6 +266,19 @@ int ipc_addid(struct ipc_ids* ids, struc
        if (ids->in_use >= size)
                return -ENOSPC;

+       spin_lock_init(&new->lock);
+
+       /*
+        * We have a window between the time new is inserted into the idr
+        * tree and the time it is actually locked.
+        * In order to be safe during that window set the new ipc structure
+        * as deleted: a concurrent ipc_lock() will see it as not present
+        * until the initialization phase is complete.
+        */
+       new->deleted = 1;
+
+       smp_wmb();
+
        err = idr_get_new(&ids->ipcs_idr, new, &id);
        if (err)
                return err;
@@ -280,10 +293,11 @@ int ipc_addid(struct ipc_ids* ids, struc
                ids->seq = 0;

        new->id = ipc_buildid(id, new->seq);
-       spin_lock_init(&new->lock);
-       new->deleted = 0;
        rcu_read_lock();
        spin_lock(&new->lock);
+
+       new->deleted = 0;
+
        return id;
 }

And got the lock (the node is still usuable but I guess it's because
only 1 cpu out of 4 is locked):

[  400.393024] INFO: trying to register non-static key.
[  400.397005] the code is fine but needs lockdep annotation.
[  400.397005] turning off the locking correctness validator.
[  400.397005] Pid: 4207, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #1
[  400.397005]
[  400.397005] Call Trace:
[  400.397005]  [<ffffffff80257055>] static_obj+0x60/0x77
[  400.397005]  [<ffffffff8025af59>] __lock_acquire+0x1c8/0x779
[  400.397005]  [<ffffffff8025b59f>] lock_acquire+0x95/0xc2
[  400.397005]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
[  400.397005]  [<ffffffff8045117d>] _spin_lock+0x2d/0x5a
[  400.397005]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
[  400.397005]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
[  400.397005]  [<ffffffff802feaa5>] ipc_lock+0x0/0x99
[  400.397005]  [<ffffffff802feb46>] ipc_lock_check+0x8/0x53
[  400.397005]  [<ffffffff803002c3>] sys_msgctl+0x188/0x461
[  400.397005]  [<ffffffff80259ac7>] trace_hardirqs_on_caller+0x100/0x12a
[  400.397005]  [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
[  400.397005]  [<ffffffff80259ac7>] trace_hardirqs_on_caller+0x100/0x12a
[  400.397005]  [<ffffffff80212e09>] sched_clock+0x5/0x7
[  400.397005]  [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
[  400.397005]  [<ffffffff80213021>] native_sched_clock+0x8c/0xa5
[  400.397005]  [<ffffffff80212e09>] sched_clock+0x5/0x7
[  400.397005]  [<ffffffff8020bf7a>] system_call_fastpath+0x16/0x1b
[  400.397005]
[  464.933003] BUG: soft lockup - CPU#2 stuck for 61s! [sysv_test2:4207]
[  464.933006] Modules linked in: ipv6 nfs lockd nfs_acl sunrpc button
battery ac loop dm_mod md_mod usbkbd usbhid hid ff_memless mptctl
evdev tg3 libphy iTCO_wdt e752x_edac edac_core uhci_hcd rng_core
shpchp i2c_i801 pci_hotplug i2c_core ehci_hcd reiserfs edd fan thermal
processor thermal_sys mptspi mptscsih sg mptbase scsi_transport_spi
sr_mod cdrom ata_piix libata dock sd_mod scsi_mod [last unloaded:
freq_table]
[  464.933006] irq event stamp: 2136363
[  464.933006] hardirqs last  enabled at (2136363):
[<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
[  464.933006] hardirqs last disabled at (2136361):
[<ffffffff8023ea01>] __do_softirq+0xa3/0xf7
[  464.933006] softirqs last  enabled at (2136362):
[<ffffffff8020d9bc>] call_softirq+0x1c/0x28
[  464.933006] softirqs last disabled at (2136357):
[<ffffffff8020d9bc>] call_softirq+0x1c/0x28
[  464.933006] CPU 2:
[  464.933006] Modules linked in: ipv6 nfs lockd nfs_acl sunrpc button
battery ac loop dm_mod md_mod usbkbd usbhid hid ff_memless mptctl
evdev tg3 libphy iTCO_wdt e752x_edac edac_core uhci_hcd rng_core
shpchp i2c_i801 pci_hotplug i2c_core ehci_hcd reiserfs edd fan thermal
processor thermal_sys mptspi mptscsih sg mptbase scsi_transport_spi
sr_mod cdrom ata_piix libata dock sd_mod scsi_mod [last unloaded:
freq_table]
[  464.933006] Pid: 4207, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #1
[  464.933006] RIP: 0010:[<ffffffff8033dc6b>]  [<ffffffff8033dc6b>]
_raw_spin_lock+0x98/0x100
[  464.933006] RSP: 0018:ffff880145473e48  EFLAGS: 00000206
[  464.933006] RAX: 00000000000000cb RBX: 000000001830d3f9 RCX:
00000000ffffffff[  464.933006] RDX: 0000018500000000 RSI:
ffffffff8053d176 RDI: 0000000000000001[  464.933006] RBP:
0000000000000000 R08: 0000000000000002 R09: 0000000000000000[
464.933006] R10: 0000000000000000 R11: ffffffff8033a6fe R12:
0000000000000000[  464.933006] R13: ffffffff8033a6fe R14:
ffffffff8020c7ee R15: 0000000000000002[  464.933006] FS:
00007f40899b86d0(0000) GS:ffff88014707f508(0000)
knlGS:0000000000000000
[  464.933006] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  464.933006] CR2: 00007f408974aae0 CR3: 0000000143003000 CR4:
00000000000006e0[  464.933006] DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000[  464.933006] DR3:
0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[
464.933006]
[  464.933006] Call Trace:
[  464.933006]  [<ffffffff8033dc6b>] _raw_spin_lock+0x98/0x100
[  464.933006]  [<ffffffff8045119e>] _spin_lock+0x4e/0x5a
[  464.933006]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
[  464.933006]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
[  464.933006]  [<ffffffff802feaa5>] ipc_lock+0x0/0x99
[  464.933006]  [<ffffffff802feb46>] ipc_lock_check+0x8/0x53
[  464.933006]  [<ffffffff803002c3>] sys_msgctl+0x188/0x461
[  464.933006]  [<ffffffff80259ac7>] trace_hardirqs_on_caller+0x100/0x12a
[  464.933006]  [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
[  464.933006]  [<ffffffff80259ac7>] trace_hardirqs_on_caller+0x100/0x12a
[  464.933006]  [<ffffffff80212e09>] sched_clock+0x5/0x7
[  464.933006]  [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
[  464.933006]  [<ffffffff80213021>] native_sched_clock+0x8c/0xa5
[  464.933006]  [<ffffffff80212e09>] sched_clock+0x5/0x7
[  464.933006]  [<ffffffff8020bf7a>] system_call_fastpath+0x16/0x1b
[  464.933006]

I checked it with two different distributions: Debian Lenny and Sles 10 SP 1.

Regards, c.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/1] (v3) SYSVIPC - Fix the ipc structures initialization
  2008-10-28  9:44         ` cboulte
@ 2008-10-28 11:05           ` Nadia Derbey
  0 siblings, 0 replies; 6+ messages in thread
From: Nadia Derbey @ 2008-10-28 11:05 UTC (permalink / raw)
  To: cboulte; +Cc: akpm, manfred, linux-kernel

On Tue, 2008-10-28 at 10:44 +0100, cboulte@gmail.com wrote:
> On Mon, Oct 27, 2008 at 4:42 PM, Nadia Derbey <Nadia.Derbey@bull.net> wrote:
> > On Mon, 2008-10-27 at 12:04 +0100, Nadia Derbey wrote:
> >> On Mon, 2008-10-27 at 11:32 +0100, cboulte@gmail.com wrote:
> >> > On Mon, Oct 27, 2008 at 8:28 AM,  <Nadia.Derbey@bull.net> wrote:
> >> > >
> >> > > This patch is a fix for Bugzilla bug
> >> > > http://bugzilla.kernel.org/show_bug.cgi?id=11796.
> >> > >
> >> > > To summarize, a simple testcase is concurrently running an infinite loop on
> >> > > msgctl(IPC_STAT) and a call to msgget():
> >> > >
> >> > > while (1)
> >> > >   msgctl(id, IPC_STAT)    1
> >> > >                           |
> >> > >                           |
> >> > >                           |
> >> > >                           2 id = msgget(key, IPC_CREAT)
> >> > >                           |
> >> > >                           |
> >> > >                           |
> >> > >
> >> > > In the interval [1-2], the id doesn't exist yet.
> >> > >
> >> > > In that test, the problem is the following:
> >> > > When we are calling ipc_addid() from msgget() the msq structure is not
> >> > > completely initialized. So idr_get_new() is inserting a pointer into the
> >> > > idr tree, and the structure which is pointed to has, among other fields,
> >> > > its lock uninitialized.
> >> > >
> >> > > Since msgctl(IPC_STAT) is looping while (1), idr_find() returns the
> >> > > pointer as soon as it is inserted into the IDR tree. And ipc_lock()
> >> > > calls spin_lock(&mqs->lock), while we have not initialized that lock
> >> > > yet.
> >> > >
> >> > > This patch moves the spin_lock_init() before the call to ipc_addid().
> >> > > It also sets the "deleted" flag to 1 in the window between msg structure
> >> > > allocation and msg structure locking in ipc_addid().
> >> > >
> >> > >
> >> > > Regards,
> >> > > Nadia
> >> > >
> >> > >
> >> > > Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
> >> > >
> >> > > ---
> >> > >  ipc/util.c |   16 ++++++++++++++--
> >> > >  1 file changed, 14 insertions(+), 2 deletions(-)
> >> > >
> >> > > Index: linux-2.6.27/ipc/util.c
> >> > > ===================================================================
> >> > > --- linux-2.6.27.orig/ipc/util.c        2008-10-23 15:20:46.000000000 +0200
> >> > > +++ linux-2.6.27/ipc/util.c     2008-10-24 17:48:33.000000000 +0200
> >> > > @@ -266,6 +266,17 @@ int ipc_addid(struct ipc_ids* ids, struc
> >> > >        if (ids->in_use >= size)
> >> > >                return -ENOSPC;
> >> > >
> >> > > +       spin_lock_init(&new->lock);
> >> > > +
> >> > > +       /*
> >> > > +        * We have a window between the time new is inserted into the idr
> >> > > +        * tree and the time it is actually locked.
> >> > > +        * In order to be safe during that window set the new ipc structure
> >> > > +        * as deleted: a concurrent ipc_lock() will see it as not present
> >> > > +        * until the initialization phase is complete.
> >> > > +        */
> >> > > +       new->deleted = 1;
> >> > > +
> >> > >        err = idr_get_new(&ids->ipcs_idr, new, &id);
> >> > >        if (err)
> >> > >                return err;
> >> > > @@ -280,10 +291,11 @@ int ipc_addid(struct ipc_ids* ids, struc
> >> > >                ids->seq = 0;
> >> > >
> >> > >        new->id = ipc_buildid(id, new->seq);
> >> > > -       spin_lock_init(&new->lock);
> >> > > -       new->deleted = 0;
> >> > >        rcu_read_lock();
> >> > >        spin_lock(&new->lock);
> >> > > +
> >> > > +       new->deleted = 0;
> >> > > +
> >> > >        return id;
> >> > >  }
> >> > >
> >> > >
> >> > > --
> >> > >
> >> >
> >> > Still got the lock... I'm using a 4 cpus node: Intel Xeon @ 2.8GHz...
> >> > don't know if it has an impact.
> >> ???
> >> The bad new, is that it becomes unreprodicible on my side.
> >> For my part, I've got 2 2.8 GHz Xeon CPUs.
> >>
> >> Will review the code once more.
> >>
> >> Thanks!
> >> Nadia
> >>
> >> > The only way I found to have no lock, it's to spin_lock the ipc
> >> > _before_ inserting it into the idr.
> >> >
> >> > Best regards, c.
> >> >
> >
> > I agree with you that it's more logical and correct to take the lock
> > before inserting the ipc structure (i.e. making it visible to readers).
> >
> > But I wanted to understand what's wrong with
> > 1. new->lock init
> > 2. new->deleted = 1
> > 3. insert(new)
> >
> > I've been looking at the code again and again and the only thing I see
> > could have happened, is that instructions have been reordered and the
> > insertion done before the lock actually being initialized.
> > This means that a memory barrier is missing (this would explain why your
> > fix works: the spin_lock acts as a barrier).
> > But this memory barrier is supposed to be invoked by
> > rcu_assign_pointer() in idr_get_new(). So may be there's a problem with
> > the idr code.
> > Before going into a review of this code, I'd like to confirm what I'm
> > saying, doing the following (I'm sorry to ask you do it, but I can't
> > reproduce the problem in my side anymore): would you mind adding a
> > smp_wmb() just before the idr_get_new in ipc_addid() and tell me if this
> > solves the problem.
> > (BTW, I didn't ask you before, but I guess you're getting the same call
> > trace?)
> >
> > Regards,
> > Nadia
> >
> > --
> > Nadia Derbey <Nadia.Derbey@bull.net>
> >
> >
> 
> I tried this patch:
> Index: bug-sysv/ipc/util.c
> ===================================================================
> --- bug-sysv.orig/ipc/util.c    2008-10-27 09:21:44.000000000 +0100
> +++ bug-sysv/ipc/util.c 2008-10-27 19:04:33.000000000 +0100
> @@ -266,6 +266,19 @@ int ipc_addid(struct ipc_ids* ids, struc
>         if (ids->in_use >= size)
>                 return -ENOSPC;
> 
> +       spin_lock_init(&new->lock);
> +
> +       /*
> +        * We have a window between the time new is inserted into the idr
> +        * tree and the time it is actually locked.
> +        * In order to be safe during that window set the new ipc structure
> +        * as deleted: a concurrent ipc_lock() will see it as not present
> +        * until the initialization phase is complete.
> +        */
> +       new->deleted = 1;
> +
> +       smp_wmb();
> +
>         err = idr_get_new(&ids->ipcs_idr, new, &id);
>         if (err)
>                 return err;
> @@ -280,10 +293,11 @@ int ipc_addid(struct ipc_ids* ids, struc
>                 ids->seq = 0;
> 
>         new->id = ipc_buildid(id, new->seq);
> -       spin_lock_init(&new->lock);
> -       new->deleted = 0;
>         rcu_read_lock();
>         spin_lock(&new->lock);
> +
> +       new->deleted = 0;
> +
>         return id;
>  }
> 
> And got the lock (the node is still usuable but I guess it's because
> only 1 cpu out of 4 is locked):
> 
> [  400.393024] INFO: trying to register non-static key.
> [  400.397005] the code is fine but needs lockdep annotation.
> [  400.397005] turning off the locking correctness validator.
> [  400.397005] Pid: 4207, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #1
> [  400.397005]
> [  400.397005] Call Trace:
> [  400.397005]  [<ffffffff80257055>] static_obj+0x60/0x77
> [  400.397005]  [<ffffffff8025af59>] __lock_acquire+0x1c8/0x779
> [  400.397005]  [<ffffffff8025b59f>] lock_acquire+0x95/0xc2
> [  400.397005]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
> [  400.397005]  [<ffffffff8045117d>] _spin_lock+0x2d/0x5a
> [  400.397005]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
> [  400.397005]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
> [  400.397005]  [<ffffffff802feaa5>] ipc_lock+0x0/0x99
> [  400.397005]  [<ffffffff802feb46>] ipc_lock_check+0x8/0x53
> [  400.397005]  [<ffffffff803002c3>] sys_msgctl+0x188/0x461
> [  400.397005]  [<ffffffff80259ac7>] trace_hardirqs_on_caller+0x100/0x12a
> [  400.397005]  [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
> [  400.397005]  [<ffffffff80259ac7>] trace_hardirqs_on_caller+0x100/0x12a
> [  400.397005]  [<ffffffff80212e09>] sched_clock+0x5/0x7
> [  400.397005]  [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
> [  400.397005]  [<ffffffff80213021>] native_sched_clock+0x8c/0xa5
> [  400.397005]  [<ffffffff80212e09>] sched_clock+0x5/0x7
> [  400.397005]  [<ffffffff8020bf7a>] system_call_fastpath+0x16/0x1b
> [  400.397005]
> [  464.933003] BUG: soft lockup - CPU#2 stuck for 61s! [sysv_test2:4207]
> [  464.933006] Modules linked in: ipv6 nfs lockd nfs_acl sunrpc button
> battery ac loop dm_mod md_mod usbkbd usbhid hid ff_memless mptctl
> evdev tg3 libphy iTCO_wdt e752x_edac edac_core uhci_hcd rng_core
> shpchp i2c_i801 pci_hotplug i2c_core ehci_hcd reiserfs edd fan thermal
> processor thermal_sys mptspi mptscsih sg mptbase scsi_transport_spi
> sr_mod cdrom ata_piix libata dock sd_mod scsi_mod [last unloaded:
> freq_table]
> [  464.933006] irq event stamp: 2136363
> [  464.933006] hardirqs last  enabled at (2136363):
> [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
> [  464.933006] hardirqs last disabled at (2136361):
> [<ffffffff8023ea01>] __do_softirq+0xa3/0xf7
> [  464.933006] softirqs last  enabled at (2136362):
> [<ffffffff8020d9bc>] call_softirq+0x1c/0x28
> [  464.933006] softirqs last disabled at (2136357):
> [<ffffffff8020d9bc>] call_softirq+0x1c/0x28
> [  464.933006] CPU 2:
> [  464.933006] Modules linked in: ipv6 nfs lockd nfs_acl sunrpc button
> battery ac loop dm_mod md_mod usbkbd usbhid hid ff_memless mptctl
> evdev tg3 libphy iTCO_wdt e752x_edac edac_core uhci_hcd rng_core
> shpchp i2c_i801 pci_hotplug i2c_core ehci_hcd reiserfs edd fan thermal
> processor thermal_sys mptspi mptscsih sg mptbase scsi_transport_spi
> sr_mod cdrom ata_piix libata dock sd_mod scsi_mod [last unloaded:
> freq_table]
> [  464.933006] Pid: 4207, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #1
> [  464.933006] RIP: 0010:[<ffffffff8033dc6b>]  [<ffffffff8033dc6b>]
> _raw_spin_lock+0x98/0x100
> [  464.933006] RSP: 0018:ffff880145473e48  EFLAGS: 00000206
> [  464.933006] RAX: 00000000000000cb RBX: 000000001830d3f9 RCX:
> 00000000ffffffff[  464.933006] RDX: 0000018500000000 RSI:
> ffffffff8053d176 RDI: 0000000000000001[  464.933006] RBP:
> 0000000000000000 R08: 0000000000000002 R09: 0000000000000000[
> 464.933006] R10: 0000000000000000 R11: ffffffff8033a6fe R12:
> 0000000000000000[  464.933006] R13: ffffffff8033a6fe R14:
> ffffffff8020c7ee R15: 0000000000000002[  464.933006] FS:
> 00007f40899b86d0(0000) GS:ffff88014707f508(0000)
> knlGS:0000000000000000
> [  464.933006] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  464.933006] CR2: 00007f408974aae0 CR3: 0000000143003000 CR4:
> 00000000000006e0[  464.933006] DR0: 0000000000000000 DR1:
> 0000000000000000 DR2: 0000000000000000[  464.933006] DR3:
> 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[
> 464.933006]
> [  464.933006] Call Trace:
> [  464.933006]  [<ffffffff8033dc6b>] _raw_spin_lock+0x98/0x100
> [  464.933006]  [<ffffffff8045119e>] _spin_lock+0x4e/0x5a
> [  464.933006]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
> [  464.933006]  [<ffffffff802feb07>] ipc_lock+0x62/0x99
> [  464.933006]  [<ffffffff802feaa5>] ipc_lock+0x0/0x99
> [  464.933006]  [<ffffffff802feb46>] ipc_lock_check+0x8/0x53
> [  464.933006]  [<ffffffff803002c3>] sys_msgctl+0x188/0x461
> [  464.933006]  [<ffffffff80259ac7>] trace_hardirqs_on_caller+0x100/0x12a
> [  464.933006]  [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
> [  464.933006]  [<ffffffff80259ac7>] trace_hardirqs_on_caller+0x100/0x12a
> [  464.933006]  [<ffffffff80212e09>] sched_clock+0x5/0x7
> [  464.933006]  [<ffffffff80450d49>] trace_hardirqs_on_thunk+0x3a/0x3f
> [  464.933006]  [<ffffffff80213021>] native_sched_clock+0x8c/0xa5
> [  464.933006]  [<ffffffff80212e09>] sched_clock+0x5/0x7
> [  464.933006]  [<ffffffff8020bf7a>] system_call_fastpath+0x16/0x1b
> [  464.933006]
> 
> I checked it with two different distributions: Debian Lenny and Sles 10 SP 1.
> 
> Regards, c.

Thanks a lot! Going on checking the code.


Regards,
Nadia

> 
-- 
Nadia Derbey <Nadia.Derbey@bull.net>


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-10-28 11:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20081027072826.266777838@bull.net>
2008-10-27  7:28 ` [PATCH 1/1] (v3) SYSVIPC - Fix the ipc structures initialization Nadia.Derbey
2008-10-27 10:32   ` cboulte
2008-10-27 11:04     ` Nadia Derbey
2008-10-27 15:42       ` Nadia Derbey
2008-10-28  9:44         ` cboulte
2008-10-28 11:05           ` Nadia Derbey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).