LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] sigqueue_free: fix the race with collect_signal()
@ 2007-08-23 13:45 Oleg Nesterov
  2007-08-23 21:36 ` Sukadev Bhattiprolu
  2007-08-24 14:26 ` taoyue
  0 siblings, 2 replies; 13+ messages in thread
From: Oleg Nesterov @ 2007-08-23 13:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexey Dobriyan, Ingo Molnar, Jeremy Katz, taoyue,
	Thomas Gleixner, Roland McGrath, linux-kernel, stable

Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz <jeremy.katz@windriver.com>.

collect_signal:				sigqueue_free:

	list_del_init(&first->list);
						if (!list_empty(&q->list)) {
							// not taken
						}
						q->flags &= ~SIGQUEUE_PREALLOC;

	__sigqueue_free(first);			__sigqueue_free(q);

Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the
obviously bad implications.

In particular, this double free breaks the array_cache->avail logic, so the
same sigqueue could be "allocated" twice, and the bug can manifest itself via
the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.

Hopefully this can explain these mysterious bug-reports, see

	http://marc.info/?t=118766926500003
	http://marc.info/?t=118466273000005

Alexey Dobriyan reports this patch makes the difference for the testcase, but
nobody has an access to the application which opened the problems originally.

Also, this patch removes tasklist lock/unlock, ->siglock is enough.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

--- t/kernel/signal.c~SQFREE	2007-08-22 20:06:31.000000000 +0400
+++ t/kernel/signal.c	2007-08-23 16:02:57.000000000 +0400
@@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void)
 void sigqueue_free(struct sigqueue *q)
 {
 	unsigned long flags;
+	spinlock_t *lock = &current->sighand->siglock;
+
 	BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
 	/*
 	 * If the signal is still pending remove it from the
-	 * pending queue.
+	 * pending queue. We must hold ->siglock while testing
+	 * q->list to serialize with collect_signal().
 	 */
-	if (unlikely(!list_empty(&q->list))) {
-		spinlock_t *lock = &current->sighand->siglock;
-		read_lock(&tasklist_lock);
-		spin_lock_irqsave(lock, flags);
-		if (!list_empty(&q->list))
-			list_del_init(&q->list);
-		spin_unlock_irqrestore(lock, flags);
-		read_unlock(&tasklist_lock);
-	}
+	spin_lock_irqsave(lock, flags);
+	if (!list_empty(&q->list))
+		list_del_init(&q->list);
+	spin_unlock_irqrestore(lock, flags);
+
 	q->flags &= ~SIGQUEUE_PREALLOC;
 	__sigqueue_free(q);
 }


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-23 13:45 [PATCH] sigqueue_free: fix the race with collect_signal() Oleg Nesterov
@ 2007-08-23 21:36 ` Sukadev Bhattiprolu
  2007-08-23 22:05   ` Oleg Nesterov
  2007-08-24 14:26 ` taoyue
  1 sibling, 1 reply; 13+ messages in thread
From: Sukadev Bhattiprolu @ 2007-08-23 21:36 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Jeremy Katz, taoyue,
	Thomas Gleixner, Roland McGrath, linux-kernel, stable

Oleg Nesterov wrote:
> Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz <jeremy.katz@windriver.com>.
>
> collect_signal:				sigqueue_free:
>
> 	list_del_init(&first->list);
> 						if (!list_empty(&q->list)) {
> 							// not taken
> 						}
> 						q->flags &= ~SIGQUEUE_PREALLOC;
>
> 	__sigqueue_free(first);			__sigqueue_free(q);
>
> Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the
> obviously bad implications.
>
> In particular, this double free breaks the array_cache->avail logic, so the
> same sigqueue could be "allocated" twice, and the bug can manifest itself via
> the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.
>
> Hopefully this can explain these mysterious bug-reports, see
>
> 	http://marc.info/?t=118766926500003
> 	http://marc.info/?t=118466273000005
>
> Alexey Dobriyan reports this patch makes the difference for the testcase, but
> nobody has an access to the application which opened the problems originally.
>
> Also, this patch removes tasklist lock/unlock, ->siglock is enough.
>
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
>
> --- t/kernel/signal.c~SQFREE	2007-08-22 20:06:31.000000000 +0400
> +++ t/kernel/signal.c	2007-08-23 16:02:57.000000000 +0400
> @@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void)
>  void sigqueue_free(struct sigqueue *q)
>  {
>  	unsigned long flags;
> +	spinlock_t *lock = &current->sighand->siglock;
> +
>  	BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
>  	/*
>  	 * If the signal is still pending remove it from the
> -	 * pending queue.
> +	 * pending queue. We must hold ->siglock while testing
> +	 * q->list to serialize with collect_signal().
>  	 */
> -	if (unlikely(!list_empty(&q->list))) {
> -		spinlock_t *lock = &current->sighand->siglock;
> -		read_lock(&tasklist_lock);
> -		spin_lock_irqsave(lock, flags);
>   
Hmm, but the existing code _does_ take the siglock here. Is that not 
sufficient ?
Isn't the first list_empty() check without lock only an optimization for 
the common
case ?

> -		if (!list_empty(&q->list))
> -			list_del_init(&q->list);
> -		spin_unlock_irqrestore(lock, flags);
> -		read_unlock(&tasklist_lock);
> -	}
> +	spin_lock_irqsave(lock, flags);
> +	if (!list_empty(&q->list))
> +		list_del_init(&q->list);
> +	spin_unlock_irqrestore(lock, flags);
> +
>  	q->flags &= ~SIGQUEUE_PREALLOC;
>  	__sigqueue_free(q);
>  }
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>
>   



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-23 21:36 ` Sukadev Bhattiprolu
@ 2007-08-23 22:05   ` Oleg Nesterov
  0 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2007-08-23 22:05 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Jeremy Katz, taoyue,
	Thomas Gleixner, Roland McGrath, linux-kernel, stable

On 08/23, Sukadev Bhattiprolu wrote:
>
> Oleg Nesterov wrote:
> >Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz 
> ><jeremy.katz@windriver.com>.
> >
> >collect_signal:				sigqueue_free:
> >
> >	list_del_init(&first->list);
> >						if (!list_empty(&q->list)) {
> >							// not taken
> >						}
> >						q->flags &= 
> >						~SIGQUEUE_PREALLOC;
> >
> >	__sigqueue_free(first);			__sigqueue_free(q);
> >
> >Now, __sigqueue_free() is called twice on the same "struct sigqueue" with 
> >the
> >obviously bad implications.
> >
> >--- t/kernel/signal.c~SQFREE	2007-08-22 20:06:31.000000000 +0400
> >+++ t/kernel/signal.c	2007-08-23 16:02:57.000000000 +0400
> >@@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void)
> > void sigqueue_free(struct sigqueue *q)
> > {
> > 	unsigned long flags;
> >+	spinlock_t *lock = &current->sighand->siglock;
> >+
> > 	BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
> > 	/*
> > 	 * If the signal is still pending remove it from the
> >-	 * pending queue.
> >+	 * pending queue. We must hold ->siglock while testing
> >+	 * q->list to serialize with collect_signal().
> > 	 */
> >-	if (unlikely(!list_empty(&q->list))) {
> >-		spinlock_t *lock = &current->sighand->siglock;
> >-		read_lock(&tasklist_lock);
> >-		spin_lock_irqsave(lock, flags);
> >  
> Hmm, but the existing code _does_ take the siglock here. Is that not 
> sufficient ?

Yes, it does, and this is sufficient, so the patch removes tasklist_lock.

> Isn't the first list_empty() check without lock only an optimization for 
> the common
> case ?

Yes, this is optimization, but I strongly believe it is wrong. Please look
at the race description above.

!list_empty(&q->list) does _not_ necessary mean that q is not used and we can
free it. It is possible that collect_signal() just removed this sigqueue from
list (list_empty(&q->list) becomes true) and going to free it.

The whole point is: we can't check list_empty() without ->siglock, this is
racy, and leads to double-free.

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-24 14:26 ` taoyue
@ 2007-08-24  7:45   ` Oleg Nesterov
  2007-08-24 21:29     ` taoyue
  0 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2007-08-24  7:45 UTC (permalink / raw)
  To: taoyue
  Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner,
	Roland McGrath, linux-kernel, stable

On 08/24, taoyue wrote:
>
> Oleg Nesterov wrote:
> >
> >--- t/kernel/signal.c~SQFREE	2007-08-22 20:06:31.000000000 +0400
> >+++ t/kernel/signal.c	2007-08-23 16:02:57.000000000 +0400
> >@@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void)
> > void sigqueue_free(struct sigqueue *q)
> > {
> > 	unsigned long flags;
> >+	spinlock_t *lock = &current->sighand->siglock;
> >+
> > 	BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
> > 	/*
> > 	 * If the signal is still pending remove it from the
> >-	 * pending queue.
> >+	 * pending queue. We must hold ->siglock while testing
> >+	 * q->list to serialize with collect_signal().
> > 	 */
> >-	if (unlikely(!list_empty(&q->list))) {
> >-		spinlock_t *lock = &current->sighand->siglock;
> >-		read_lock(&tasklist_lock);
> >-		spin_lock_irqsave(lock, flags);
> >-		if (!list_empty(&q->list))
> >-			list_del_init(&q->list);
> >-		spin_unlock_irqrestore(lock, flags);
> >-		read_unlock(&tasklist_lock);
> >-	}
> >+	spin_lock_irqsave(lock, flags);
> >+	if (!list_empty(&q->list))
> >+		list_del_init(&q->list);
> >+	spin_unlock_irqrestore(lock, flags);
> >+
> > 	q->flags &= ~SIGQUEUE_PREALLOC;
> > 	__sigqueue_free(q);
> > }
> >
> >
> >  
>    Applying previous patch???it seems likely that the __sigqueue_free() is 
>    also called twice.
> 
> collect_signal:				sigqueue_free:
> 
> 	list_del_init(&first->list);
>                                        spin_lock_irqsave(lock, flags);
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                                        if (!list_empty(&q->list))
>                                              list_del_init(&q->list);
>                                        spin_unlock_irqrestore(lock, flags);
>                                        q->flags &= ~SIGQUEUE_PREALLOC;
> 
>        __sigqueue_free(first);		__sigqueue_free(q);

collect_signal() is always called under ->siglock which is also taken by
sigqueue_free(), so this is not possible.

Basically, this patch is the same one-liner I sent you before

	http://marc.info/?l=linux-kernel&m=118772206603453&w=2

(Thanks for the additional testing and report, btw).

P.S. It would be nice to know if this patch solves the problems reported
by Jeremy, but his email is disabled.

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-24 21:29     ` taoyue
@ 2007-08-24 11:08       ` Oleg Nesterov
  2007-08-24 20:03         ` Sukadev Bhattiprolu
  0 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2007-08-24 11:08 UTC (permalink / raw)
  To: taoyue
  Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner,
	Roland McGrath, linux-kernel, stable

On 08/24, taoyue wrote:
>
> Oleg Nesterov wrote:
> >>
> >>collect_signal:				sigqueue_free:
> >>
> >>	list_del_init(&first->list);
> >>                                       spin_lock_irqsave(lock, flags);
> >>    
> >                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >  
> >>                                       if (!list_empty(&q->list))
> >>                                             list_del_init(&q->list);
> >>                                       spin_unlock_irqrestore(lock, 
> >>                                       flags);
> >>                                       q->flags &= ~SIGQUEUE_PREALLOC;
> >>
> >>       __sigqueue_free(first);		__sigqueue_free(q);
> >>    
> >
> >collect_signal() is always called under ->siglock which is also taken by
> >sigqueue_free(), so this is not possible.
> >
> >Basically, this patch is the same one-liner I sent you before
> >
> >	http://marc.info/?l=linux-kernel&m=118772206603453&w=2
> >
> >(Thanks for the additional testing and report, btw).
> >
> >P.S. It would be nice to know if this patch solves the problems reported
> >by Jeremy, but his email is disabled.
> >
> >Oleg.
> >
> >  
> I know, using current->sighand->siglock to prevent one sigqueue
> is free twice. I want to know whether it is possible that the two
> function is called in different thread. If that, the spin_lock is useless.

Not sure I understand. Yes, it is possible they are called by 2 different
threads, that is why we had a race. But all threads in the same thread
group have the same ->sighand, and thus the same ->sighand->siglock.

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-23 13:45 [PATCH] sigqueue_free: fix the race with collect_signal() Oleg Nesterov
  2007-08-23 21:36 ` Sukadev Bhattiprolu
@ 2007-08-24 14:26 ` taoyue
  2007-08-24  7:45   ` Oleg Nesterov
  1 sibling, 1 reply; 13+ messages in thread
From: taoyue @ 2007-08-24 14:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Jeremy Katz,
	Thomas Gleixner, Roland McGrath, linux-kernel, stable

Oleg Nesterov wrote:
> Spotted by taoyue <yue.tao@windriver.com> and Jeremy Katz <jeremy.katz@windriver.com>.
>
> collect_signal:				sigqueue_free:
>
> 	list_del_init(&first->list);
> 						if (!list_empty(&q->list)) {
> 							// not taken
> 						}
> 						q->flags &= ~SIGQUEUE_PREALLOC;
>
> 	__sigqueue_free(first);			__sigqueue_free(q);
>
> Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the
> obviously bad implications.
>
> In particular, this double free breaks the array_cache->avail logic, so the
> same sigqueue could be "allocated" twice, and the bug can manifest itself via
> the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.
>
> Hopefully this can explain these mysterious bug-reports, see
>
> 	http://marc.info/?t=118766926500003
> 	http://marc.info/?t=118466273000005
>
> Alexey Dobriyan reports this patch makes the difference for the testcase, but
> nobody has an access to the application which opened the problems originally.
>
> Also, this patch removes tasklist lock/unlock, ->siglock is enough.
>
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
>
> --- t/kernel/signal.c~SQFREE	2007-08-22 20:06:31.000000000 +0400
> +++ t/kernel/signal.c	2007-08-23 16:02:57.000000000 +0400
> @@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void)
>  void sigqueue_free(struct sigqueue *q)
>  {
>  	unsigned long flags;
> +	spinlock_t *lock = &current->sighand->siglock;
> +
>  	BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
>  	/*
>  	 * If the signal is still pending remove it from the
> -	 * pending queue.
> +	 * pending queue. We must hold ->siglock while testing
> +	 * q->list to serialize with collect_signal().
>  	 */
> -	if (unlikely(!list_empty(&q->list))) {
> -		spinlock_t *lock = &current->sighand->siglock;
> -		read_lock(&tasklist_lock);
> -		spin_lock_irqsave(lock, flags);
> -		if (!list_empty(&q->list))
> -			list_del_init(&q->list);
> -		spin_unlock_irqrestore(lock, flags);
> -		read_unlock(&tasklist_lock);
> -	}
> +	spin_lock_irqsave(lock, flags);
> +	if (!list_empty(&q->list))
> +		list_del_init(&q->list);
> +	spin_unlock_irqrestore(lock, flags);
> +
>  	q->flags &= ~SIGQUEUE_PREALLOC;
>  	__sigqueue_free(q);
>  }
>
>
>   
    Applying previous patch,it seems likely that the __sigqueue_free() is also called twice.

collect_signal:				sigqueue_free:

	list_del_init(&first->list);
                                        spin_lock_irqsave(lock, flags);
                                        if (!list_empty(&q->list))
                                              list_del_init(&q->list);
                                        spin_unlock_irqrestore(lock, flags);
                                        q->flags &= ~SIGQUEUE_PREALLOC;

        __sigqueue_free(first);		__sigqueue_free(q);



yue.tao

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-24 11:08       ` Oleg Nesterov
@ 2007-08-24 20:03         ` Sukadev Bhattiprolu
  2007-08-24 20:23           ` Oleg Nesterov
  0 siblings, 1 reply; 13+ messages in thread
From: Sukadev Bhattiprolu @ 2007-08-24 20:03 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: taoyue, Andrew Morton, Alexey Dobriyan, Ingo Molnar,
	Thomas Gleixner, Roland McGrath, linux-kernel, stable

Oleg Nesterov wrote:
> On 08/24, taoyue wrote:
>   
>> Oleg Nesterov wrote:
>>     
>>>> collect_signal:				sigqueue_free:
>>>>
>>>> 	list_del_init(&first->list);
>>>>                                       spin_lock_irqsave(lock, flags);
>>>>    
>>>>         
>>>                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>  
>>>       
>>>>                                       if (!list_empty(&q->list))
>>>>                                             list_del_init(&q->list);
>>>>                                       spin_unlock_irqrestore(lock, 
>>>>                                       flags);
>>>>                                       q->flags &= ~SIGQUEUE_PREALLOC;
>>>>
>>>>       __sigqueue_free(first);		__sigqueue_free(q);
>>>>    
>>>>         
>>> collect_signal() is always called under ->siglock which is also taken by
>>> sigqueue_free(), so this is not possible.
>>>
>>> Basically, this patch is the same one-liner I sent you before
>>>
>>> 	http://marc.info/?l=linux-kernel&m=118772206603453&w=2
>>>
>>> (Thanks for the additional testing and report, btw).
>>>
>>> P.S. It would be nice to know if this patch solves the problems reported
>>> by Jeremy, but his email is disabled.
>>>
>>> Oleg.
>>>
>>>  
>>>       
>> I know, using current->sighand->siglock to prevent one sigqueue
>> is free twice. I want to know whether it is possible that the two
>> function is called in different thread. If that, the spin_lock is useless.
>>     
>
> Not sure I understand. Yes, it is possible they are called by 2 different
> threads, that is why we had a race. But all threads in the same thread
> group have the same ->sighand, and thus the same ->sighand->siglock.
>   

Oleg, if one thread can be in collect_signal() and another in 
sigqueue_free() and both operate on the exact same sigqueue object, its 
not clear how we prevent two calls to __sigqueue_free() to
the same object. In that case the lock (or some lock) should be around 
__sigqueue_free() - no ?

 i.e if we enter sigqueue_free(), we will call __sigqueue_free() 
regardless of the state.

> Oleg.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>
>   



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-24 20:03         ` Sukadev Bhattiprolu
@ 2007-08-24 20:23           ` Oleg Nesterov
  2007-08-25 17:24             ` Sukadev Bhattiprolu
  2007-08-27 13:45             ` taoyue
  0 siblings, 2 replies; 13+ messages in thread
From: Oleg Nesterov @ 2007-08-24 20:23 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: taoyue, Andrew Morton, Alexey Dobriyan, Ingo Molnar,
	Thomas Gleixner, Roland McGrath, linux-kernel, stable

On 08/24, Sukadev Bhattiprolu wrote:
>
> Oleg Nesterov wrote:
> >On 08/24, taoyue wrote:
> >  
> >>Oleg Nesterov wrote:
> >>    
> >>>>collect_signal:				sigqueue_free:
> >>>>
> >>>>	list_del_init(&first->list);
> >>>>                                      spin_lock_irqsave(lock, flags);
> >>>>   
> >>>>        
> >>>                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>> 
> >>>      
> >>>>                                      if (!list_empty(&q->list))
> >>>>                                            list_del_init(&q->list);
> >>>>                                      spin_unlock_irqrestore(lock, 
> >>>>                                      flags);
> >>>>                                      q->flags &= ~SIGQUEUE_PREALLOC;
> >>>>
> >>>>      __sigqueue_free(first);		__sigqueue_free(q);
> >>>>   
> >>>>        
> >>>collect_signal() is always called under ->siglock which is also taken by
> >>>sigqueue_free(), so this is not possible.
> >>>
> >>> 
> >>>      
> >>I know, using current->sighand->siglock to prevent one sigqueue
> >>is free twice. I want to know whether it is possible that the two
> >>function is called in different thread. If that, the spin_lock is useless.
> >>    
> >
> >Not sure I understand. Yes, it is possible they are called by 2 different
> >threads, that is why we had a race. But all threads in the same thread
> >group have the same ->sighand, and thus the same ->sighand->siglock.
> >  
> 
> Oleg, if one thread can be in collect_signal() and another in 
> sigqueue_free() and both operate on the exact same sigqueue object, its 
> not clear how we prevent two calls to __sigqueue_free() to
> the same object. In that case the lock (or some lock) should be around 
> __sigqueue_free() - no ?
> 
> i.e if we enter sigqueue_free(), we will call __sigqueue_free() 
> regardless of the state.

Yes. They both will call __sigqueue_free(). But please note that __sigqueue_free()
checks SIGQUEUE_PREALLOC, which is cleared by sigqueue_free().

IOW, when sigqueue_free() unlocks ->siglock, we know that it can't be used
by collect_signal() from another thread. So we can clear SIGQUEUE_PREALLOC
and free sigqueue. We don't need this lock around sigqueue_free() to prevent
the race. collect_signal() can "see" only those sigqueues which are on list.

IOW, when sigqueue_free() takes ->siglock, colect_signal() can't run, because
it needs the same lock. Now we delete this sigqueue from list, nobody can
see it, it can't have other references. So we can unlock ->siglock, mark
sigqueue as freeable (clear SIGQUEUE_PREALLOC), and free it.

Do you agree?

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-24  7:45   ` Oleg Nesterov
@ 2007-08-24 21:29     ` taoyue
  2007-08-24 11:08       ` Oleg Nesterov
  0 siblings, 1 reply; 13+ messages in thread
From: taoyue @ 2007-08-24 21:29 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Alexey Dobriyan, Ingo Molnar, Thomas Gleixner,
	Roland McGrath, linux-kernel, stable

Oleg Nesterov wrote:
> On 08/24, taoyue wrote:
>   
>> Oleg Nesterov wrote:
>>     
>>> --- t/kernel/signal.c~SQFREE	2007-08-22 20:06:31.000000000 +0400
>>> +++ t/kernel/signal.c	2007-08-23 16:02:57.000000000 +0400
>>> @@ -1297,20 +1297,19 @@ struct sigqueue *sigqueue_alloc(void)
>>> void sigqueue_free(struct sigqueue *q)
>>> {
>>> 	unsigned long flags;
>>> +	spinlock_t *lock = &current->sighand->siglock;
>>> +
>>> 	BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
>>> 	/*
>>> 	 * If the signal is still pending remove it from the
>>> -	 * pending queue.
>>> +	 * pending queue. We must hold ->siglock while testing
>>> +	 * q->list to serialize with collect_signal().
>>> 	 */
>>> -	if (unlikely(!list_empty(&q->list))) {
>>> -		spinlock_t *lock = &current->sighand->siglock;
>>> -		read_lock(&tasklist_lock);
>>> -		spin_lock_irqsave(lock, flags);
>>> -		if (!list_empty(&q->list))
>>> -			list_del_init(&q->list);
>>> -		spin_unlock_irqrestore(lock, flags);
>>> -		read_unlock(&tasklist_lock);
>>> -	}
>>> +	spin_lock_irqsave(lock, flags);
>>> +	if (!list_empty(&q->list))
>>> +		list_del_init(&q->list);
>>> +	spin_unlock_irqrestore(lock, flags);
>>> +
>>> 	q->flags &= ~SIGQUEUE_PREALLOC;
>>> 	__sigqueue_free(q);
>>> }
>>>
>>>
>>>  
>>>       
>>    Applying previous patch???it seems likely that the __sigqueue_free() is 
>>    also called twice.
>>
>> collect_signal:				sigqueue_free:
>>
>> 	list_del_init(&first->list);
>>                                        spin_lock_irqsave(lock, flags);
>>     
>                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   
>>                                        if (!list_empty(&q->list))
>>                                              list_del_init(&q->list);
>>                                        spin_unlock_irqrestore(lock, flags);
>>                                        q->flags &= ~SIGQUEUE_PREALLOC;
>>
>>        __sigqueue_free(first);		__sigqueue_free(q);
>>     
>
> collect_signal() is always called under ->siglock which is also taken by
> sigqueue_free(), so this is not possible.
>
> Basically, this patch is the same one-liner I sent you before
>
> 	http://marc.info/?l=linux-kernel&m=118772206603453&w=2
>
> (Thanks for the additional testing and report, btw).
>
> P.S. It would be nice to know if this patch solves the problems reported
> by Jeremy, but his email is disabled.
>
> Oleg.
>
>   
I know, using current->sighand->siglock to prevent one sigqueue
is free twice. I want to know whether it is possible that the two
function is called in different thread. If that, the spin_lock is useless.

yue.tao

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-24 20:23           ` Oleg Nesterov
@ 2007-08-25 17:24             ` Sukadev Bhattiprolu
  2007-08-25 17:34               ` Oleg Nesterov
  2007-08-27 13:45             ` taoyue
  1 sibling, 1 reply; 13+ messages in thread
From: Sukadev Bhattiprolu @ 2007-08-25 17:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: taoyue, Andrew Morton, Alexey Dobriyan, Ingo Molnar,
	Thomas Gleixner, Roland McGrath, linux-kernel, stable

Oleg Nesterov wrote:
> On 08/24, Sukadev Bhattiprolu wrote:
>   
>> Oleg Nesterov wrote:
>>     
>>> On 08/24, taoyue wrote:
>>>  
>>>       
>>>> Oleg Nesterov wrote:
>>>>    
>>>>         
>>>>>> collect_signal:				sigqueue_free:
>>>>>>
>>>>>> 	list_del_init(&first->list);
>>>>>>                                      spin_lock_irqsave(lock, flags);
>>>>>>   
>>>>>>        
>>>>>>             
>>>>>                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>>      
>>>>>           
>>>>>>                                      if (!list_empty(&q->list))
>>>>>>                                            list_del_init(&q->list);
>>>>>>                                      spin_unlock_irqrestore(lock, 
>>>>>>                                      flags);
>>>>>>                                      q->flags &= ~SIGQUEUE_PREALLOC;
>>>>>>
>>>>>>      __sigqueue_free(first);		__sigqueue_free(q);
>>>>>>   
>>>>>>        
>>>>>>             
>>>>> collect_signal() is always called under ->siglock which is also taken by
>>>>> sigqueue_free(), so this is not possible.
>>>>>
>>>>>
>>>>>      
>>>>>           
>>>> I know, using current->sighand->siglock to prevent one sigqueue
>>>> is free twice. I want to know whether it is possible that the two
>>>> function is called in different thread. If that, the spin_lock is useless.
>>>>    
>>>>         
>>> Not sure I understand. Yes, it is possible they are called by 2 different
>>> threads, that is why we had a race. But all threads in the same thread
>>> group have the same ->sighand, and thus the same ->sighand->siglock.
>>>  
>>>       
>> Oleg, if one thread can be in collect_signal() and another in 
>> sigqueue_free() and both operate on the exact same sigqueue object, its 
>> not clear how we prevent two calls to __sigqueue_free() to
>> the same object. In that case the lock (or some lock) should be around 
>> __sigqueue_free() - no ?
>>
>> i.e if we enter sigqueue_free(), we will call __sigqueue_free() 
>> regardless of the state.
>>     
>
> Yes. They both will call __sigqueue_free(). But please note that __sigqueue_free()
> checks SIGQUEUE_PREALLOC, which is cleared by sigqueue_free().
>
> IOW, when sigqueue_free() unlocks ->siglock, we know that it can't be used
> by collect_signal() from another thread. So we can clear SIGQUEUE_PREALLOC
> and free sigqueue. We don't need this lock around sigqueue_free() to prevent
> the race. collect_signal() can "see" only those sigqueues which are on list.
>
> IOW, when sigqueue_free() takes ->siglock, colect_signal() can't run, because
> it needs the same lock. Now we delete this sigqueue from list, nobody can
> see it, it can't have other references. So we can unlock ->siglock, mark
> sigqueue as freeable (clear SIGQUEUE_PREALLOC), and free it.
>
> Do you agree?
>   

Yes. I see it now. I had missed the SIGQUEUE_PREALLOC in __sigqueue_free().

Thanks for clarifying

Suka



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-25 17:24             ` Sukadev Bhattiprolu
@ 2007-08-25 17:34               ` Oleg Nesterov
  0 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2007-08-25 17:34 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: taoyue, Andrew Morton, Alexey Dobriyan, Ingo Molnar,
	Thomas Gleixner, Roland McGrath, linux-kernel, stable

On 08/25, Sukadev Bhattiprolu wrote:
> 
> Yes. I see it now. I had missed the SIGQUEUE_PREALLOC in __sigqueue_free().

Thanks for looking at this. This code looks simple, but it is not obvious, at
least for me.

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-27 13:45             ` taoyue
@ 2007-08-27  5:57               ` Oleg Nesterov
  0 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2007-08-27  5:57 UTC (permalink / raw)
  To: taoyue
  Cc: Sukadev Bhattiprolu, Andrew Morton, Alexey Dobriyan, Ingo Molnar,
	Thomas Gleixner, Roland McGrath, linux-kernel, stable, Mark Zhan,
	Ashfield, Bruce

On 08/27, taoyue wrote:
>
> Oleg Nesterov wrote:
> >On 08/24, Sukadev Bhattiprolu wrote:
> >>>>>          
> >>>>I know, using current->sighand->siglock to prevent one sigqueue
> >>>>is free twice. I want to know whether it is possible that the two
> >>>>function is called in different thread. If that, the spin_lock is 
> >>>>useless.
> >>>>   
> >>>>        
> >>>Not sure I understand. Yes, it is possible they are called by 2 different
> >>>threads, that is why we had a race. But all threads in the same thread
> >>>group have the same ->sighand, and thus the same ->sighand->siglock.
> >>>      
> I has applied the new patch, and start test program last Friday.
> So far, the test program has run for three days.

Great, thanks.

> In my test program, all threads is created in one process, so they
> are in the same thread group. If two thread is not in the same thread
> group, they have the different ->sighand->siglock. In this case, the
> lock can't prevent the sigqueue object from race condition.

If two thread are not in the same thread, they can't access the same
SIGQUEUE_PREALLOC sigqueue. That is why sigqueue_free() uses current->sighand.
Otherwise, this lock doesn't make any sense, and can't protect list_del(q->list).

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] sigqueue_free: fix the race with collect_signal()
  2007-08-24 20:23           ` Oleg Nesterov
  2007-08-25 17:24             ` Sukadev Bhattiprolu
@ 2007-08-27 13:45             ` taoyue
  2007-08-27  5:57               ` Oleg Nesterov
  1 sibling, 1 reply; 13+ messages in thread
From: taoyue @ 2007-08-27 13:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Sukadev Bhattiprolu, Andrew Morton, Alexey Dobriyan, Ingo Molnar,
	Thomas Gleixner, Roland McGrath, linux-kernel, stable, Mark Zhan,
	Ashfield, Bruce

Oleg Nesterov wrote:
> On 08/24, Sukadev Bhattiprolu wrote:
>   
>> Oleg Nesterov wrote:
>>     
>>> On 08/24, taoyue wrote:
>>>  
>>>       
>>>> Oleg Nesterov wrote:
>>>>    
>>>>         
>>>>>> collect_signal:				sigqueue_free:
>>>>>>
>>>>>> 	list_del_init(&first->list);
>>>>>>                                      spin_lock_irqsave(lock, flags);
>>>>>>   
>>>>>>        
>>>>>>             
>>>>>                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>>      
>>>>>           
>>>>>>                                      if (!list_empty(&q->list))
>>>>>>                                            list_del_init(&q->list);
>>>>>>                                      spin_unlock_irqrestore(lock, 
>>>>>>                                      flags);
>>>>>>                                      q->flags &= ~SIGQUEUE_PREALLOC;
>>>>>>
>>>>>>      __sigqueue_free(first);		__sigqueue_free(q);
>>>>>>   
>>>>>>        
>>>>>>             
>>>>> collect_signal() is always called under ->siglock which is also taken by
>>>>> sigqueue_free(), so this is not possible.
>>>>>
>>>>>
>>>>>      
>>>>>           
>>>> I know, using current->sighand->siglock to prevent one sigqueue
>>>> is free twice. I want to know whether it is possible that the two
>>>> function is called in different thread. If that, the spin_lock is useless.
>>>>    
>>>>         
>>> Not sure I understand. Yes, it is possible they are called by 2 different
>>> threads, that is why we had a race. But all threads in the same thread
>>> group have the same ->sighand, and thus the same ->sighand->siglock.
>>>       
I has applied the new patch, and start test program last Friday.
So far, the test program has run for three days.
In my test program, all threads is created in one process, so they
are in the same thread group. If two thread is not in the same thread
group, they have the different ->sighand->siglock. In this case, the
lock can't prevent the sigqueue object from race condition.
>>>  
>>>       
>> Oleg, if one thread can be in collect_signal() and another in 
>> sigqueue_free() and both operate on the exact same sigqueue object, its 
>> not clear how we prevent two calls to __sigqueue_free() to
>> the same object. In that case the lock (or some lock) should be around 
>> __sigqueue_free() - no ?
>>
>> i.e if we enter sigqueue_free(), we will call __sigqueue_free() 
>> regardless of the state.
>>     
>
> Yes. They both will call __sigqueue_free(). But please note that __sigqueue_free()
> checks SIGQUEUE_PREALLOC, which is cleared by sigqueue_free().
>
> IOW, when sigqueue_free() unlocks ->siglock, we know that it can't be used
> by collect_signal() from another thread. So we can clear SIGQUEUE_PREALLOC
> and free sigqueue. We don't need this lock around sigqueue_free() to prevent
> the race. collect_signal() can "see" only those sigqueues which are on list.
>
> IOW, when sigqueue_free() takes ->siglock, colect_signal() can't run, because
> it needs the same lock. Now we delete this sigqueue from list, nobody can
> see it, it can't have other references. So we can unlock ->siglock, mark
> sigqueue as freeable (clear SIGQUEUE_PREALLOC), and free it.
>
> Do you agree?
>
> Oleg.
>
>
>   


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2007-08-27  5:59 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-23 13:45 [PATCH] sigqueue_free: fix the race with collect_signal() Oleg Nesterov
2007-08-23 21:36 ` Sukadev Bhattiprolu
2007-08-23 22:05   ` Oleg Nesterov
2007-08-24 14:26 ` taoyue
2007-08-24  7:45   ` Oleg Nesterov
2007-08-24 21:29     ` taoyue
2007-08-24 11:08       ` Oleg Nesterov
2007-08-24 20:03         ` Sukadev Bhattiprolu
2007-08-24 20:23           ` Oleg Nesterov
2007-08-25 17:24             ` Sukadev Bhattiprolu
2007-08-25 17:34               ` Oleg Nesterov
2007-08-27 13:45             ` taoyue
2007-08-27  5:57               ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).