LKML Archive on lore.kernel.org
* [RFC/PATCH] debug workqueue deadlocks with lockdep
@ 2007-06-27 18:40 Johannes Berg
  2007-06-28 16:33 ` Johannes Berg
  0 siblings, 1 reply; 16+ messages in thread
From: Johannes Berg @ 2007-06-27 18:40 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Arjan van de Ven, Linux Kernel list, linux-wireless

This patch adds a fake lock to each workqueue in order to debug things
where you have something like

  my_function() -> lock(); ...; flush_workqueue(); ...
vs
  run_workqueue() -> my_work() -> ...; lock(); ...

which can obviously deadlock if my_work() is already scheduled when
my_function() is invoked but hasn't taken the lock yet.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
---
I'm not sure whether this is correct on SMP: might it then complain
that the fake lock is taken at the same time from multiple CPUs? On my
quad G5, however, I haven't had it print out anything but the trace
below, which is legitimate.

This is sort of a regression test for an actual deadlock in mac80211
that I found by hitting it; apparently it's really hard to trigger. That
made me want to see whether I could add lockdep support to debug cases
like it.
Here's the output with my patch:

[  171.522236] =======================================================
[  171.522248] [ INFO: possible circular locking dependency detected ]
[  171.522257] 2.6.22-rc6 #163
[  171.522266] -------------------------------------------------------
[  171.522275] khubd/1922 is trying to acquire lock:
[  171.522283]  (khelper){--..}, at: [<c000000000060c8c>] .flush_workqueue+0x48/0xf4
[  171.522318] 
[  171.522320] but task is already holding lock:
[  171.522330]  (rtnl_mutex){--..}, at: [<c0000000003a4784>] .mutex_lock+0x3c/0x58
[  171.522366] 
[  171.522368] which lock already depends on the new lock.
[  171.522371] 
[  171.522383] 
[  171.522386] the existing dependency chain (in reverse order) is:
[  171.522396] 
[  171.522398] -> #1 (rtnl_mutex){--..}:
[  171.522424]        [<c000000000072434>] .__lock_acquire+0xb8c/0xd68
[  171.522476]        [<c0000000000726b0>] .lock_acquire+0xa0/0xe8
[  171.522527]        [<c0000000003a44b0>] .__mutex_lock_slowpath+0x138/0x3d0
[  171.522579]        [<c0000000003a4784>] .mutex_lock+0x3c/0x58
[  171.522629]        [<c000000000327208>] .rtnl_lock+0x24/0x40
[  171.522677]        [<c000000000328fb4>] .linkwatch_event+0x20/0x70
[  171.522725]        [<c00000000005fe38>] .run_workqueue+0x114/0x22c
[  171.522774]        [<c0000000000612f8>] .worker_thread+0x11c/0x140
[  171.522822]        [<c000000000066230>] .kthread+0x84/0xd4
[  171.522872]        [<c0000000000235ec>] .kernel_thread+0x4c/0x68
[  171.522922] 
[  171.522924] -> #0 (khelper){--..}:
[  171.522947]        [<c000000000072324>] .__lock_acquire+0xa7c/0xd68
[  171.523000]        [<c0000000000726b0>] .lock_acquire+0xa0/0xe8
[  171.523051]        [<c000000000060cbc>] .flush_workqueue+0x78/0xf4
[  171.523102]        [<d000000000498374>] .ieee80211_stop+0x1fc/0x3a4 [mac80211]
[  171.523176]        [<c00000000031b8bc>] .dev_close+0xb8/0xfc
[  171.523222]        [<c00000000031b9c0>] .unregister_netdevice+0xc0/0x254
[  171.523275]        [<d0000000004a828c>] .__ieee80211_if_del+0x34/0x50 [mac80211]
[  171.523352]        [<d000000000497654>] .ieee80211_unregister_hw+0xf8/0x2d8 [mac80211]
[  171.523421]        [<d00000000045cfa8>] .disconnect+0x3c/0x98 [zd1211rw_mac80211]
[  171.523483]        [<d0000000000ff650>] .usb_unbind_interface+0x6c/0xcc [usbcore]
[  171.523555]        [<c00000000027bcdc>] .__device_release_driver+0xcc/0x110
[  171.523605]        [<c00000000027c430>] .device_release_driver+0x70/0xbc
[  171.523655]        [<c00000000027b270>] .bus_remove_device+0xa0/0xcc
[  171.523708]        [<c0000000002782f0>] .device_del+0x2f4/0x3d4
[  171.523758]        [<d0000000000fbd68>] .usb_disable_device+0xa0/0x150 [usbcore]
[  171.523833]        [<d0000000000f6eac>] .usb_disconnect+0xfc/0x1a4 [usbcore]
[  171.523907]        [<d0000000000f77d4>] .hub_thread+0x3d8/0xc70 [usbcore]
[  171.523984]        [<c000000000066230>] .kthread+0x84/0xd4
[  171.524039]        [<c0000000000235ec>] .kernel_thread+0x4c/0x68
[  171.524093] 
[  171.524096] other info that might help us debug this:
[  171.524099] 
[  171.524109] 1 lock held by khubd/1922:
[  171.524117]  #0:  (rtnl_mutex){--..}, at: [<c0000000003a4784>] .mutex_lock+0x3c/0x58
[  171.524155] 
[  171.524157] stack backtrace:
[  171.524165] Call Trace:
[  171.524174] [c0000000086af130] [c00000000000f624] .show_stack+0x70/0x1bc (unreliable)
[  171.524203] [c0000000086af1e0] [c00000000000f790] .dump_stack+0x20/0x34
[  171.524228] [c0000000086af260] [c000000000070310] .print_circular_bug_tail+0x84/0xa8
[  171.524256] [c0000000086af330] [c000000000072324] .__lock_acquire+0xa7c/0xd68
[  171.524281] [c0000000086af420] [c0000000000726b0] .lock_acquire+0xa0/0xe8
[  171.524307] [c0000000086af4e0] [c000000000060cbc] .flush_workqueue+0x78/0xf4
[  171.524332] [c0000000086af580] [d000000000498374] .ieee80211_stop+0x1fc/0x3a4 [mac80211]
[  171.524376] [c0000000086af640] [c00000000031b8bc] .dev_close+0xb8/0xfc
[  171.524401] [c0000000086af6d0] [c00000000031b9c0] .unregister_netdevice+0xc0/0x254
[  171.524425] [c0000000086af760] [d0000000004a828c] .__ieee80211_if_del+0x34/0x50 [mac80211]
[  171.524475] [c0000000086af7f0] [d000000000497654] .ieee80211_unregister_hw+0xf8/0x2d8 [mac80211]
[  171.524522] [c0000000086af8c0] [d00000000045cfa8] .disconnect+0x3c/0x98 [zd1211rw_mac80211]
[  171.524558] [c0000000086af960] [d0000000000ff650] .usb_unbind_interface+0x6c/0xcc [usbcore]
[  171.524611] [c0000000086afa00] [c00000000027bcdc] .__device_release_driver+0xcc/0x110
[  171.524635] [c0000000086afa90] [c00000000027c430] .device_release_driver+0x70/0xbc
[  171.524662] [c0000000086afb20] [c00000000027b270] .bus_remove_device+0xa0/0xcc
[  171.524687] [c0000000086afbb0] [c0000000002782f0] .device_del+0x2f4/0x3d4
[  171.524713] [c0000000086afc50] [d0000000000fbd68] .usb_disable_device+0xa0/0x150 [usbcore]
[  171.524763] [c0000000086afcf0] [d0000000000f6eac] .usb_disconnect+0xfc/0x1a4 [usbcore]
[  171.524811] [c0000000086afdb0] [d0000000000f77d4] .hub_thread+0x3d8/0xc70 [usbcore]
[  171.524860] [c0000000086aff00] [c000000000066230] .kthread+0x84/0xd4
[  171.524887] [c0000000086aff90] [c0000000000235ec] .kernel_thread+0x4c/0x68

[the trace above is actually from a previous version of the patch]

In the case I actually experienced, #1 wasn't from linkwatch_event()
but from something else in mac80211; both of them, however, do
rtnl_lock().

And finally, the patch:

---
 kernel/workqueue.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

--- linux-2.6-git.orig/kernel/workqueue.c	2007-06-27 19:10:35.183199046 +0200
+++ linux-2.6-git/kernel/workqueue.c	2007-06-27 20:30:00.896667168 +0200
@@ -61,8 +61,15 @@ struct workqueue_struct {
 	const char *name;
 	int singlethread;
 	int freezeable;		/* Freeze threads during suspend */
+#ifdef CONFIG_LOCKDEP
+	struct lockdep_map lockdep_map;
+#endif
 };
 
+#ifdef CONFIG_LOCKDEP
+static struct lock_class_key wq_lockdep_key;
+#endif
+
 /* All the per-cpu workqueues on the system, for hotplug cpu to add/remove
    threads to each one as cpus come/go. */
 static DEFINE_MUTEX(workqueue_mutex);
@@ -257,7 +264,9 @@ static void run_workqueue(struct cpu_wor
 
 		BUG_ON(get_wq_data(work) != cwq);
 		work_clear_pending(work);
+		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
 		f(work);
+		lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
 
 		if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
 			printk(KERN_ERR "BUG: workqueue leaked lock or atomic: "
@@ -342,6 +351,9 @@ static int flush_cpu_workqueue(struct cp
 	} else {
 		struct wq_barrier barr;
 
+		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+		lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
+
 		active = 0;
 		spin_lock_irq(&cwq->lock);
 		if (!list_empty(&cwq->worklist) || cwq->current_work != NULL) {
@@ -376,6 +388,8 @@ void fastcall flush_workqueue(struct wor
 	int cpu;
 
 	might_sleep();
+	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_release(&wq->lockdep_map, 0, _THIS_IP_);
 	for_each_cpu_mask(cpu, *cpu_map)
 		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
 }
@@ -700,6 +714,9 @@ struct workqueue_struct *__create_workqu
 	}
 
 	wq->name = name;
+#ifdef CONFIG_LOCKDEP
+	lockdep_init_map(&wq->lockdep_map, name, &wq_lockdep_key, 0);
+#endif
 	wq->singlethread = singlethread;
 	wq->freezeable = freezeable;
 	INIT_LIST_HEAD(&wq->list);



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-06-27 18:40 [RFC/PATCH] debug workqueue deadlocks with lockdep Johannes Berg
@ 2007-06-28 16:33 ` Johannes Berg
  2007-06-28 17:33   ` Johannes Berg
  0 siblings, 1 reply; 16+ messages in thread
From: Johannes Berg @ 2007-06-28 16:33 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Arjan van de Ven, Linux Kernel list, linux-wireless


On Wed, 2007-06-27 at 20:40 +0200, Johannes Berg wrote:

> --- linux-2.6-git.orig/kernel/workqueue.c	2007-06-27 19:10:35.183199046 +0200
> +++ linux-2.6-git/kernel/workqueue.c	2007-06-27 20:30:00.896667168 +0200
> @@ -61,8 +61,15 @@ struct workqueue_struct {
>  	const char *name;
>  	int singlethread;
>  	int freezeable;		/* Freeze threads during suspend */
> +#ifdef CONFIG_LOCKDEP
> +	struct lockdep_map lockdep_map;
> +#endif
>  };
>  
> +#ifdef CONFIG_LOCKDEP
> +static struct lock_class_key wq_lockdep_key;
> +#endif

I think I misunderstood how lockdep works, because the output was
actually wrong. Should the key be part of the workqueue_struct as well?

johannes



* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-06-28 16:33 ` Johannes Berg
@ 2007-06-28 17:33   ` Johannes Berg
  2007-06-30  8:05     ` Ingo Molnar
  0 siblings, 1 reply; 16+ messages in thread
From: Johannes Berg @ 2007-06-28 17:33 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Arjan van de Ven, Linux Kernel list, linux-wireless


On Thu, 2007-06-28 at 18:33 +0200, Johannes Berg wrote:

> I think I misunderstood how lockdep works because the output was
> actually wrong, should the key be part of the workqueue_struct as well?

No, that's not right either, but Arjan just helped me a bit with how
lockdep works and I think I have the right idea now. Ignore this for
now, I'll send a new patch in a few days.

johannes



* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-06-28 17:33   ` Johannes Berg
@ 2007-06-30  8:05     ` Ingo Molnar
  2007-06-30 11:46       ` Oleg Nesterov
  0 siblings, 1 reply; 16+ messages in thread
From: Ingo Molnar @ 2007-06-30  8:05 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Arjan van de Ven, Linux Kernel list, linux-wireless,
	Peter Zijlstra, Oleg Nesterov, mingo, Thomas Sattler

On Thu, 2007-06-28 at 19:33 +0200, Johannes Berg wrote:
> No, that's not right either, but Arjan just helped me a bit with how
> lockdep works and I think I have the right idea now. Ignore this for
> now, I'll send a new patch in a few days.

ok. But in general, this is a very nice idea!

i've Cc:-ed Oleg. Oleg, what do you think? I think we should keep all
the workqueue APIs specified in a form that makes them lockdep coverable
like Johannes did. This debug mechanism could have helped with the
recent DVB lockup that Thomas Sattler reported.

	Ingo



* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-06-30  8:05     ` Ingo Molnar
@ 2007-06-30 11:46       ` Oleg Nesterov
  2007-07-02  8:37         ` Johannes Berg
  2007-07-02 13:03         ` Johannes Berg
  0 siblings, 2 replies; 16+ messages in thread
From: Oleg Nesterov @ 2007-06-30 11:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Johannes Berg, Arjan van de Ven, Linux Kernel list,
	linux-wireless, Peter Zijlstra, mingo, Thomas Sattler

On 06/30, Ingo Molnar wrote:
>
> On Thu, 2007-06-28 at 19:33 +0200, Johannes Berg wrote:
> > No, that's not right either, but Arjan just helped me a bit with how
> > lockdep works and I think I have the right idea now. Ignore this for
> > now, I'll send a new patch in a few days.
> 
> ok. But in general, this is a very nice idea!
> 
> i've Cc:-ed Oleg. Oleg, what do you think? I think we should keep all
> the workqueue APIs specified in a form that makes them lockdep coverable
> like Johannes did. This debug mechanism could have helped with the
> recent DVB lockup that Thomas Sattler reported.

I think this idea is great!

Johannes, could you change wait_on_work() as well? Most users of
flush_workqueue() should be converted to use it.

> @@ -342,6 +351,9 @@ static int flush_cpu_workqueue(struct cp
>         } else {
>                 struct wq_barrier barr;
>
> +               lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +               lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
> +
>                 active = 0;
>                 spin_lock_irq(&cwq->lock);

I am not sure why you skip the "if (cwq->thread == current)" case; it
can deadlock in the same way.

But, perhaps we should not change flush_cpu_workqueue(). If we detect the
deadlock, we will have num_online_cpus() reports, yes?

And,

>                 if (!list_empty(&cwq->worklist) || cwq->current_work != NULL) {
> @@ -376,6 +388,8 @@ void fastcall flush_workqueue(struct wor
>         int cpu;
>
>         might_sleep();
> +       lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +       lock_release(&wq->lockdep_map, 0, _THIS_IP_);

one of the 2 callers was already modified. Perhaps it is better to add
lock_acquire() into the second caller, cleanup_workqueue_thread(), but
skip flush_cpu_workqueue() ?

Oleg.



* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-06-30 11:46       ` Oleg Nesterov
@ 2007-07-02  8:37         ` Johannes Berg
  2007-07-02 13:03         ` Johannes Berg
  1 sibling, 0 replies; 16+ messages in thread
From: Johannes Berg @ 2007-07-02  8:37 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Arjan van de Ven, Linux Kernel list, linux-wireless,
	Peter Zijlstra, mingo, Thomas Sattler


On Sat, 2007-06-30 at 15:46 +0400, Oleg Nesterov wrote:
> On 06/30, Ingo Molnar wrote:
> >
> > On Thu, 2007-06-28 at 19:33 +0200, Johannes Berg wrote:
> > > No, that's not right either, but Arjan just helped me a bit with how
> > > lockdep works and I think I have the right idea now. Ignore this for
> > > now, I'll send a new patch in a few days.
> > 
> > ok. But in general, this is a very nice idea!
> > 
> > i've Cc:-ed Oleg. Oleg, what do you think? I think we should keep all
> > the workqueue APIs specified in a form that makes them lockdep coverable
> > like Johannes did. This debug mechanism could have helped with the
> > recent DVB lockup that Thomas Sattler reported.
> 
> I think this idea is great!

:)

> Johannes, could you change wait_on_work() as well? Most users of
> flush_workqueue() should be converted to use it.

Sure. The case I hit used flush_workqueue(), but I'll look into
wait_on_work() and maybe convert the place where I had this.

> > @@ -342,6 +351,9 @@ static int flush_cpu_workqueue(struct cp
> >         } else {
> >                 struct wq_barrier barr;
> >
> > +               lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> > +               lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
> > +
> >                 active = 0;
> >                 spin_lock_irq(&cwq->lock);
> 
> I am not sure why you skip "if (cwq->thread == current)" case, it can
> deadlock in the same way.

I wasn't sure what would happen with recursion.

> But, perhaps we should not change flush_cpu_workqueue(). If we detect the
> deadlock, we will have num_online_cpus() reports, yes?

Not sure what you're thinking of here. I initially had it in
flush_workqueue() but then put it into flush_cpu_workqueue(); I have to
admit I've already forgotten why.

> And,
> 
> >                 if (!list_empty(&cwq->worklist) || cwq->current_work != NULL) {
> > @@ -376,6 +388,8 @@ void fastcall flush_workqueue(struct wor
> >         int cpu;
> >
> >         might_sleep();
> > +       lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> > +       lock_release(&wq->lockdep_map, 0, _THIS_IP_);
> 
> one of the 2 callers was already modified. Perhaps it is better to add
> lock_acquire() into the second caller, cleanup_workqueue_thread(), but
> skip flush_cpu_workqueue() ?

I'll have to look at the code.

The problem I discussed with Arjan is that with this patch all
workqueues are in the same lock class, which is wrong. I'll have to
modify the workqueue creation API, introducing some macros for the
lockdep case that pass the creation call site as the workqueue's class.
I'll try to do that later today.

johannes



* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-06-30 11:46       ` Oleg Nesterov
  2007-07-02  8:37         ` Johannes Berg
@ 2007-07-02 13:03         ` Johannes Berg
  2007-07-03 17:31           ` Oleg Nesterov
  1 sibling, 1 reply; 16+ messages in thread
From: Johannes Berg @ 2007-07-02 13:03 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Arjan van de Ven, Linux Kernel list, linux-wireless,
	Peter Zijlstra, mingo, Thomas Sattler

On Sat, 2007-06-30 at 15:46 +0400, Oleg Nesterov wrote:

[reordered a bit]

> And,
> 
> >                 if (!list_empty(&cwq->worklist) || cwq->current_work != NULL) {
> > @@ -376,6 +388,8 @@ void fastcall flush_workqueue(struct wor
> >         int cpu;
> >
> >         might_sleep();
> > +       lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> > +       lock_release(&wq->lockdep_map, 0, _THIS_IP_);
> 
> one of the 2 callers was already modified. Perhaps it is better to add
> lock_acquire() into the second caller, cleanup_workqueue_thread(), but
> skip flush_cpu_workqueue() ?

I see what you mean now: that hunk wasn't supposed to be in my patch; I
wanted to move the code, not copy it.

> Johannes, could you change wait_on_work() as well? Most users of
> flush_workqueue() should be converted to use it.

I think this could lead to false positives, but then I think we
shouldn't care about those. Let me explain. With this change it can
happen that the code using the workqueue somehow obeys an ordering in
the work it queues: as far as I can tell it would be fine to have two
work items A and B where only B takes a lock L1, and then do
wait_on_work(A) under L1, if and only if B is always queued right after
A (or something like that). However, since such an invariant is rather
likely to be broken by subsequent changes to the code, such code should
probably use two workqueues instead; and I doubt this ever happens
anyway, since in most cases people could just put both works into one
function. Hence I included wait_on_work() in my patch below.

> But, perhaps we should not change flush_cpu_workqueue(). If we detect the
> deadlock, we will have num_online_cpus() reports, yes?

The idea here was that it would cover cleanup_workqueue_thread() as
well, but I can equally well just put it there, as below.

Arjan, thanks for your help; this patch seems to work as expected,
without the false positive that my previous dump showed.

Here's my example rechecked with this new patch:

[  174.639892] =======================================================
[  174.639909] [ INFO: possible circular locking dependency detected ]
[  174.639916] 2.6.22-rc6 #168
[  174.639924] -------------------------------------------------------
[  174.639933] khubd/1927 is trying to acquire lock:
[  174.639940]  (zd1211rw_mac80211#2){--..}, at: [<c000000000060d04>] .flush_workqueue+0x48/0xf4
[  174.639971] 
[  174.639974] but task is already holding lock:
[  174.639982]  (rtnl_mutex){--..}, at: [<c0000000003a47fc>] .mutex_lock+0x3c/0x58
[  174.640016] 
[  174.640018] which lock already depends on the new lock.
[  174.640021] 
[  174.640030] 
[  174.640032] the existing dependency chain (in reverse order) is:
[  174.640040] 
[  174.640042] -> #1 (rtnl_mutex){--..}:
[  174.640065]        [<c0000000000724ac>] .__lock_acquire+0xb8c/0xd68
[  174.640121]        [<c000000000072728>] .lock_acquire+0xa0/0xe8
[  174.640169]        [<c0000000003a4528>] .__mutex_lock_slowpath+0x138/0x3d0
[  174.640216]        [<c0000000003a47fc>] .mutex_lock+0x3c/0x58
[  174.640262]        [<c000000000327284>] .rtnl_lock+0x24/0x40
[  174.640314]        [<d0000000004a91b8>] .ieee80211_sta_work+0xab4/0x1028 [mac80211]
[  174.640401]        [<c00000000005fe20>] .run_workqueue+0x114/0x22c
[  174.640452]        [<c000000000061370>] .worker_thread+0x11c/0x140
[  174.640500]        [<c0000000000662a8>] .kthread+0x84/0xd4
[  174.640549]        [<c0000000000235e8>] .kernel_thread+0x4c/0x68
[  174.640602] 
[  174.640604] -> #0 (zd1211rw_mac80211#2){--..}:
[  174.640634]        [<c00000000007239c>] .__lock_acquire+0xa7c/0xd68
[  174.640683]        [<c000000000072728>] .lock_acquire+0xa0/0xe8
[  174.640732]        [<c000000000060d34>] .flush_workqueue+0x78/0xf4
[  174.640769]        [<d00000000049a330>] .ieee80211_stop+0x1b4/0x35c [mac80211]
[  174.640839]        [<c00000000031b938>] .dev_close+0xb8/0xfc
[  174.640891]        [<c00000000031ba3c>] .unregister_netdevice+0xc0/0x254
[  174.640941]        [<d0000000004aa234>] .__ieee80211_if_del+0x34/0x50 [mac80211]
[  174.641018]        [<d000000000499658>] .ieee80211_unregister_hw+0xf8/0x2d8 [mac80211]
[  174.641089]        [<d00000000045efb0>] .disconnect+0x3c/0x98 [zd1211rw_mac80211]
[  174.641154]        [<d0000000000a9650>] .usb_unbind_interface+0x6c/0xcc [usbcore]
[  174.641248]        [<c00000000027bd54>] .__device_release_driver+0xcc/0x110
[  174.641299]        [<c00000000027c4a8>] .device_release_driver+0x70/0xbc
[  174.641351]        [<c00000000027b2e8>] .bus_remove_device+0xa0/0xcc
[  174.641400]        [<c000000000278368>] .device_del+0x2f4/0x3d4
[  174.641451]        [<d0000000000a5d68>] .usb_disable_device+0xa0/0x150 [usbcore]
[  174.641525]        [<d0000000000a0eac>] .usb_disconnect+0xfc/0x1a4 [usbcore]
[  174.641599]        [<d0000000000a17d4>] .hub_thread+0x3d8/0xc70 [usbcore]
[  174.641673]        [<c0000000000662a8>] .kthread+0x84/0xd4
[  174.641721]        [<c0000000000235e8>] .kernel_thread+0x4c/0x68
[  174.641767] 
[  174.641769] other info that might help us debug this:
[  174.641772] 
[  174.641782] 1 lock held by khubd/1927:
[  174.641791]  #0:  (rtnl_mutex){--..}, at: [<c0000000003a47fc>] .mutex_lock+0x3c/0x58
[  174.641827] 
[  174.641830] stack backtrace:
[  174.641838] Call Trace:
[  174.641848] [c00000007ebab130] [c00000000000f620] .show_stack+0x70/0x1bc (unreliable)
[  174.641879] [c00000007ebab1e0] [c00000000000f78c] .dump_stack+0x20/0x34
[  174.641903] [c00000007ebab260] [c000000000070388] .print_circular_bug_tail+0x84/0xa8
[  174.641930] [c00000007ebab330] [c00000000007239c] .__lock_acquire+0xa7c/0xd68
[  174.641955] [c00000007ebab420] [c000000000072728] .lock_acquire+0xa0/0xe8
[  174.641979] [c00000007ebab4e0] [c000000000060d34] .flush_workqueue+0x78/0xf4
[  174.642002] [c00000007ebab580] [d00000000049a330] .ieee80211_stop+0x1b4/0x35c [mac80211]
[  174.642047] [c00000007ebab640] [c00000000031b938] .dev_close+0xb8/0xfc
[  174.642071] [c00000007ebab6d0] [c00000000031ba3c] .unregister_netdevice+0xc0/0x254
[  174.642096] [c00000007ebab760] [d0000000004aa234] .__ieee80211_if_del+0x34/0x50 [mac80211]
[  174.642147] [c00000007ebab7f0] [d000000000499658] .ieee80211_unregister_hw+0xf8/0x2d8 [mac80211]
[  174.642193] [c00000007ebab8c0] [d00000000045efb0] .disconnect+0x3c/0x98 [zd1211rw_mac80211]
[  174.642227] [c00000007ebab960] [d0000000000a9650] .usb_unbind_interface+0x6c/0xcc [usbcore]
[  174.642277] [c00000007ebaba00] [c00000000027bd54] .__device_release_driver+0xcc/0x110
[  174.642302] [c00000007ebaba90] [c00000000027c4a8] .device_release_driver+0x70/0xbc
[  174.642327] [c00000007ebabb20] [c00000000027b2e8] .bus_remove_device+0xa0/0xcc
[  174.642351] [c00000007ebabbb0] [c000000000278368] .device_del+0x2f4/0x3d4
[  174.642375] [c00000007ebabc50] [d0000000000a5d68] .usb_disable_device+0xa0/0x150 [usbcore]
[  174.642422] [c00000007ebabcf0] [d0000000000a0eac] .usb_disconnect+0xfc/0x1a4 [usbcore]
[  174.642468] [c00000007ebabdb0] [d0000000000a17d4] .hub_thread+0x3d8/0xc70 [usbcore]
[  174.642517] [c00000007ebabf00] [c0000000000662a8] .kthread+0x84/0xd4
[  174.642541] [c00000007ebabf90] [c0000000000235e8] .kernel_thread+0x4c/0x68


I haven't had it report anything else, but I don't hook up many devices
to that particular machine.

---
 include/linux/workqueue.h |   41 ++++++++++++++++++++++++++++++++++++-----
 kernel/workqueue.c        |   17 ++++++++++++++++-
 2 files changed, 52 insertions(+), 6 deletions(-)

--- linux-2.6-git.orig/kernel/workqueue.c	2007-07-02 14:18:44.331427320 +0200
+++ linux-2.6-git/kernel/workqueue.c	2007-07-02 14:37:41.441427320 +0200
@@ -61,6 +61,9 @@ struct workqueue_struct {
 	const char *name;
 	int singlethread;
 	int freezeable;		/* Freeze threads during suspend */
+#ifdef CONFIG_LOCKDEP
+	struct lockdep_map lockdep_map;
+#endif
 };
 
 /* All the per-cpu workqueues on the system, for hotplug cpu to add/remove
@@ -257,7 +260,9 @@ static void run_workqueue(struct cpu_wor
 
 		BUG_ON(get_wq_data(work) != cwq);
 		work_clear_pending(work);
+		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
 		f(work);
+		lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
 
 		if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
 			printk(KERN_ERR "BUG: workqueue leaked lock or atomic: "
@@ -376,6 +381,8 @@ void fastcall flush_workqueue(struct wor
 	int cpu;
 
 	might_sleep();
+	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_release(&wq->lockdep_map, 0, _THIS_IP_);
 	for_each_cpu_mask(cpu, *cpu_map)
 		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
 }
@@ -453,6 +460,9 @@ static void wait_on_work(struct work_str
 	wq = cwq->wq;
 	cpu_map = wq_cpu_map(wq);
 
+	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_release(&wq->lockdep_map, 0, _THIS_IP_);
+
 	for_each_cpu_mask(cpu, *cpu_map)
 		wait_on_cpu_work(per_cpu_ptr(wq->cpu_wq, cpu), work);
 }
@@ -683,7 +693,8 @@ static void start_workqueue_thread(struc
 }
 
 struct workqueue_struct *__create_workqueue(const char *name,
-					    int singlethread, int freezeable)
+					    int singlethread, int freezeable,
+					    struct lock_class_key *key)
 {
 	struct workqueue_struct *wq;
 	struct cpu_workqueue_struct *cwq;
@@ -700,6 +711,7 @@ struct workqueue_struct *__create_workqu
 	}
 
 	wq->name = name;
+	lockdep_init_map(&wq->lockdep_map, name, key, 0);
 	wq->singlethread = singlethread;
 	wq->freezeable = freezeable;
 	INIT_LIST_HEAD(&wq->list);
@@ -739,6 +751,9 @@ static void cleanup_workqueue_thread(str
 	if (cwq->thread == NULL)
 		return;
 
+	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
+
 	/*
 	 * If the caller is CPU_DEAD the single flush_cpu_workqueue()
 	 * is not enough, a concurrent flush_workqueue() can insert a
--- linux-2.6-git.orig/include/linux/workqueue.h	2007-07-02 14:20:28.207427320 +0200
+++ linux-2.6-git/include/linux/workqueue.h	2007-07-02 14:25:35.158427320 +0200
@@ -119,11 +119,42 @@ struct execute_work {
 
 
 extern struct workqueue_struct *__create_workqueue(const char *name,
-						    int singlethread,
-						    int freezeable);
-#define create_workqueue(name) __create_workqueue((name), 0, 0)
-#define create_freezeable_workqueue(name) __create_workqueue((name), 1, 1)
-#define create_singlethread_workqueue(name) __create_workqueue((name), 1, 0)
+						   int singlethread,
+						   int freezeable,
+						   struct lock_class_key *key);
+#ifdef CONFIG_LOCKDEP
+#define create_workqueue(name) \
+({								\
+	static struct lock_class_key __key;			\
+	struct workqueue_struct *__wq;				\
+								\
+	__wq = __create_workqueue((name), 0, 0, &__key);	\
+	__wq;							\
+})
+#define create_freezeable_workqueue(name) \
+({								\
+	static struct lock_class_key __key;			\
+	struct workqueue_struct *__wq;				\
+								\
+	__wq = __create_workqueue((name), 1, 1, &__key);	\
+	__wq;							\
+})
+#define create_singlethread_workqueue(name) \
+({								\
+	static struct lock_class_key __key;			\
+	struct workqueue_struct *__wq;				\
+								\
+	__wq = __create_workqueue((name), 1, 0, &__key);	\
+	__wq;							\
+})
+#else
+#define create_workqueue(name) \
+	__create_workqueue((name), 0, 0, NULL)
+#define create_freezeable_workqueue(name) \
+	__create_workqueue((name), 1, 1, NULL)
+#define create_singlethread_workqueue(name) \
+	__create_workqueue((name), 1, 0, NULL)
+#endif
 
 extern void destroy_workqueue(struct workqueue_struct *wq);
 




* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-07-02 13:03         ` Johannes Berg
@ 2007-07-03 17:31           ` Oleg Nesterov
  2007-07-03 19:42             ` Ingo Molnar
  2007-07-04 11:49             ` Johannes Berg
  0 siblings, 2 replies; 16+ messages in thread
From: Oleg Nesterov @ 2007-07-03 17:31 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Ingo Molnar, Arjan van de Ven, Linux Kernel list, linux-wireless,
	Peter Zijlstra, mingo, Thomas Sattler

On 07/02, Johannes Berg wrote:
>
> On Sat, 2007-06-30 at 15:46 +0400, Oleg Nesterov wrote:
>
> > Johannes, could you change wait_on_work() as well? Most users of
> > flush_workqueue() should be converted to use it.

(to avoid a possible confusion: I meant, most users of flush_workqueue()
 should be converted to use cancel_work_sync/cancel_delayed_work_sync
 which in turn use wait_on_work()).

> I think this could lead to false positives, but then I think we
> shouldn't care about those. Let me explain. The thing is that with this
> it can happen that the code using the workqueue somehow obeys an
> ordering in the work it queues, so as far as I can tell it'd be fine to
> have two work items A and B where only B takes a lock L1, and then have
> a wait_on_work(A) under L1, if and only if B was always queued right
> after A (or something like that).

Not sure I understand. Yes, we can have false positives, but I think the
ordering in the workqueue doesn't matter.

If A does NOT take a lock L1, then it is OK to do cancel_work_sync(A)
under L1, regardless of which other work_structs this workqueue has,
before or after A.

Now we get a false positive if at some point we queue B into that
workqueue, and this is not good.

We can avoid this problem if we put lockdep_map into work_struct, so
that wait_on_work() "locks" work->lockdep_map, while flush_workqueue()
takes wq->lockdep_map.

But probably we don't need this right now, at least until we really
have a lot of false positives while converting from flush_workqueue()
to cancel_work_sync/cancel_delayed_work_sync.

> However, since this invariant is
> rather likely to be broken by subsequent changes to the code, I think
> such code should probably use two workqueues instead, and I doubt it
> ever happens anyway since then people could in most cases just put both
> works into one function.

Well, I am not sure; think about shared workqueues like keventd_wq...

> Hence I included it in my patch below.

a couple of minor nits below,

> @@ -257,7 +260,9 @@ static void run_workqueue(struct cpu_wor
>
>  		BUG_ON(get_wq_data(work) != cwq);
>  		work_clear_pending(work);
> +		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
>  		f(work);
> +		lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
                                                   ^^^
Isn't it better to call lock_release() with nested == 1 ?

> +#define create_workqueue(name) \
> +({								\
> +	static struct lock_class_key __key;			\
> +	struct workqueue_struct *__wq;				\
> +								\
> +	__wq = __create_workqueue((name), 0, 0, &__key);	\
> +	__wq;							\
> +})

Why do we need __wq ?

	+#define create_workqueue(name) \
	({							\
		static struct lock_class_key __key;		\
		__create_workqueue((name), 0, 0, &__key);	\
	})

Actually, I'd suggest renaming __create_workqueue() to __create_workqueue_key();
then you need only one change in linux/workqueue.h:

	- extern struct workqueue_struct *__create_workqueue(...);
	+ extern struct workqueue_struct *__create_workqueue_key(..., key);
	+ #define __create_workqueue(...)	\
	+	static struct lock_class_key __key;	\
	+	__create_workqueue_key(..., key);	\

but this is a matter of taste.

Btw, I think your patch found a real bug in net/mac80211/, cool!

Oleg.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-07-03 17:31           ` Oleg Nesterov
@ 2007-07-03 19:42             ` Ingo Molnar
  2007-07-04 11:49             ` Johannes Berg
  1 sibling, 0 replies; 16+ messages in thread
From: Ingo Molnar @ 2007-07-03 19:42 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Johannes Berg, Ingo Molnar, Arjan van de Ven, Linux Kernel list,
	linux-wireless, Peter Zijlstra, Thomas Sattler


* Oleg Nesterov <oleg@tv-sign.ru> wrote:

> Now we have a false positive if some time we queue B into that 
> workqueue, and this is not good.
> 
> We can avoid this problem if we put lockdep_map into work_struct, so 
> that wait_on_work() "locks" work->lockdep_map, while flush_workqueue() 
> takes wq->lockdep_map.
> 
> But probably we don't need this right now, at least until we really 
> have a lot of false positives while converting from flush_workqueue() 
> to cancel_work_sync/cancel_delayed_work_sync.

would be nice to map the real dependencies. Adding another lockdep_map 
is really low-cost and the lack of false positives is reassuring.

	Ingo


* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-07-03 17:31           ` Oleg Nesterov
  2007-07-03 19:42             ` Ingo Molnar
@ 2007-07-04 11:49             ` Johannes Berg
  2007-07-04 12:21               ` Ingo Molnar
                                 ` (2 more replies)
  1 sibling, 3 replies; 16+ messages in thread
From: Johannes Berg @ 2007-07-04 11:49 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Arjan van de Ven, Linux Kernel list, linux-wireless,
	Peter Zijlstra, mingo, Thomas Sattler

Oleg,

Thanks for your comments. Shows how little I really understand the
workqueue API :)

On Tue, 2007-07-03 at 21:31 +0400, Oleg Nesterov wrote:

> > I think this could lead to false positives, but then I think we
> > shouldn't care about those. Let me explain. The thing is that with this
> > it can happen that the code using the workqueue somehow obeys an
> > ordering in the work it queues, so as far as I can tell it'd be fine to
> > have two work items A and B where only B takes a lock L1, and then have
> > a wait_on_work(A) under L1, if and only if B was always queued right
> > after A (or something like that).
> 
> Not sure I understand. Yes, we can have false positives, but I think the
> ordering in the workqueue doesn't matter.
> 
> If A does NOT take a lock L1, then it is OK to do cancel_work_sync(A)
> under L1, regardless of which other work_structs this workqueue has,
> before or after A.

Ah, so cancel_work_sync() waits for A only if it is currently running?

> Now we have a false positive if some time we queue B into that workqueue,
> and this is not good.

Right. I was thinking of the flush_workqueue() case, where either A or B
matters.

> We can avoid this problem if we put lockdep_map into work_struct, so
> that wait_on_work() "locks" work->lockdep_map, while flush_workqueue()
> takes wq->lockdep_map.

Yeah, and then we'll take both wq->lockdep_map and the
work_struct->lockdep_map when running that work. That should work, I'll
give it a go later.

> But probably we don't need this right now, at least until we really
> have a lot of false positives while converting from flush_workqueue()
> to cancel_work_sync/cancel_delayed_work_sync.

I didn't really think about those yet, but I think you're right.

> > @@ -257,7 +260,9 @@ static void run_workqueue(struct cpu_wor
> >
> >  		BUG_ON(get_wq_data(work) != cwq);
> >  		work_clear_pending(work);
> > +		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> >  		f(work);
> > +		lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
>                                                    ^^^
> Isn't it better to call lock_release() with nested == 1 ?

Not sure, Ingo?

> > +#define create_workqueue(name) \
> > +({								\
> > +	static struct lock_class_key __key;			\
> > +	struct workqueue_struct *__wq;				\
> > +								\
> > +	__wq = __create_workqueue((name), 0, 0, &__key);	\
> > +	__wq;							\
> > +})
> 
> Why do we need __wq ?

No particular reason I think, I copied some other code doing it that
way.

> 	+#define create_workqueue(name) \
> 	({							\
> 		static struct lock_class_key __key;		\
> 		__create_workqueue((name), 0, 0, &__key);	\
> 	})
> 
> Actually, I'd suggest renaming __create_workqueue() to __create_workqueue_key();
> then you need only one change in linux/workqueue.h:
> 
> 	- extern struct workqueue_struct *__create_workqueue(...);
> 	+ extern struct workqueue_struct *__create_workqueue_key(..., key);
> 	+ #define __create_workqueue(...)	\
> 	+	static struct lock_class_key __key;	\
> 	+	__create_workqueue_key(..., key);	\
> 
> but this is a matter of taste.

Sure, that works. I thought about compiling out the argument completely
for the no-lockdep case; would you prefer that?

> Btw, I think your patch found a real bug in net/mac80211/, cool!

Actually, I found the bug by experiencing it and analysing the stack
traces; then I thought that it should be possible to add lockdep support
for that to avoid regressing :)

johannes


* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-07-04 11:49             ` Johannes Berg
@ 2007-07-04 12:21               ` Ingo Molnar
  2007-07-04 13:59                 ` Johannes Berg
  2007-07-04 12:25               ` Ingo Molnar
  2007-07-04 12:52               ` Oleg Nesterov
  2 siblings, 1 reply; 16+ messages in thread
From: Ingo Molnar @ 2007-07-04 12:21 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Oleg Nesterov, Ingo Molnar, Arjan van de Ven, Linux Kernel list,
	linux-wireless, Peter Zijlstra, Thomas Sattler


* Johannes Berg <johannes@sipsolutions.net> wrote:

> > > @@ -257,7 +260,9 @@ static void run_workqueue(struct cpu_wor
> > >
> > >  		BUG_ON(get_wq_data(work) != cwq);
> > >  		work_clear_pending(work);
> > > +		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> > >  		f(work);
> > > +		lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
> >                                                    ^^^
> > Isn't it better to call lock_release() with nested == 1 ?
> 
> Not sure, Ingo?

well, in this case the lock/unlock should nest perfectly (i.e. it should 
always be balanced perfectly), so indeed calling with nested==1 leads to 
stricter checking.

non-nested unlocks occur when people do stuff like:

	spin_lock(&lock1);
	spin_lock(&lock2);
	spin_unlock(&lock1);
	spin_unlock(&lock2);

the first unlock is not 'nested perfectly'. Now for the workqueue 
dep_map this shouldn't be a legal combination, right?
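
Applied to the patch hunk quoted above, the stricter variant would read
(untested sketch; only the nested argument changes):

```c
		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
		f(work);
		lock_release(&cwq->wq->lockdep_map, 1, _THIS_IP_);	/* nested == 1 */
```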

	Ingo


* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-07-04 11:49             ` Johannes Berg
  2007-07-04 12:21               ` Ingo Molnar
@ 2007-07-04 12:25               ` Ingo Molnar
  2007-07-04 12:52               ` Oleg Nesterov
  2 siblings, 0 replies; 16+ messages in thread
From: Ingo Molnar @ 2007-07-04 12:25 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Oleg Nesterov, Ingo Molnar, Arjan van de Ven, Linux Kernel list,
	linux-wireless, Peter Zijlstra, Thomas Sattler


* Johannes Berg <johannes@sipsolutions.net> wrote:

> > > +#define create_workqueue(name) \
> > > +({								\
> > > +	static struct lock_class_key __key;			\
> > > +	struct workqueue_struct *__wq;				\
> > > +								\
> > > +	__wq = __create_workqueue((name), 0, 0, &__key);	\
> > > +	__wq;							\
> > > +})
> > 
> > Why do we need __wq ?
> 
> No particular reason I think, I copied some other code doing it that 
> way.

yep, should be fine doing this:

 #define create_workqueue(name)				\
 ({							\
	static struct lock_class_key __key;		\
							\
	__create_workqueue((name), 0, 0, &__key);	\
 })

(and the return value of __create_workqueue() will be the 'return value' 
of the macro as well.)

> > 	+ extern struct workqueue_struct *__create_workqueue_key(..., key);
> > 	+ #define __create_workqueue(...)	\
> > 	+	static struct lock_class_key __key;	\
> > 	+	__create_workqueue_key(..., key);	\
> > 
> > but this is a matter of taste.

the above macro should at minimum be encapsulated with
do { ... } while (0) so that __create_workqueue() is a single C 
statement.

	Ingo


* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-07-04 11:49             ` Johannes Berg
  2007-07-04 12:21               ` Ingo Molnar
  2007-07-04 12:25               ` Ingo Molnar
@ 2007-07-04 12:52               ` Oleg Nesterov
  2007-07-04 13:57                 ` Johannes Berg
  2007-07-05  8:43                 ` Ingo Molnar
  2 siblings, 2 replies; 16+ messages in thread
From: Oleg Nesterov @ 2007-07-04 12:52 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Ingo Molnar, Arjan van de Ven, Linux Kernel list, linux-wireless,
	Peter Zijlstra, mingo, Thomas Sattler

On 07/04, Johannes Berg wrote:
> 
> On Tue, 2007-07-03 at 21:31 +0400, Oleg Nesterov wrote:
> 
> > If A does NOT take a lock L1, then it is OK to do cancel_work_sync(A)
> > under L1, regardless of which other work_structs this workqueue has,
> > before or after A.
> 
> Ah, cancel_work_sync() waits only for it if A is currently running?

Yes. And no other work (except a barrier) can run before the caller of
wait_on_work() is woken.

> > Now we have a false positive if some time we queue B into that workqueue,
> > and this is not good.
> 
> Right. I was thinking of the flush_workqueue case where any of A or B
> matters.

Aha, now I see where I was confused. Yes, we can't avoid the false positives
with flush_workqueue().

I hope this won't be a problem, because almost every usage of flush_workqueue()
is pointless nowadays. So even if we have a false positive, it probably
means the code needs cleanups anyway.

But see below,

> > We can avoid this problem if we put lockdep_map into work_struct, so
> > that wait_on_work() "locks" work->lockdep_map, while flush_workqueue()
> > takes wq->lockdep_map.
> 
> Yeah, and then we'll take both wq->lockdep_map and the
> work_struct->lockdep_map when running that work. That should work, I'll
> give it a go later.

If you are going to do this, may I suggest you make two separate patches?
Exactly because we can't avoid the false positives with flush_workqueue(),
it would be nice to have the option to revert the second patch if there are
too many false positives (I hope this won't happen).

(please ignore if this is not suitable for you).

> > > @@ -257,7 +260,9 @@ static void run_workqueue(struct cpu_wor
> > >
> > >  		BUG_ON(get_wq_data(work) != cwq);
> > >  		work_clear_pending(work);
> > > +		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> > >  		f(work);
> > > +		lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
> >                                                    ^^^
> > Isn't it better to call lock_release() with nested == 1 ?
> 
> Not sure, Ingo?

Ingo, could you also explain the meaning of the "nested" parameter? It
looks like it is just unneeded: lock_release_nested() does a quick check
and uses lock_release_non_nested() when hlock is not on top of the stack.

Oleg.



* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-07-04 12:52               ` Oleg Nesterov
@ 2007-07-04 13:57                 ` Johannes Berg
  2007-07-05  8:43                 ` Ingo Molnar
  1 sibling, 0 replies; 16+ messages in thread
From: Johannes Berg @ 2007-07-04 13:57 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Arjan van de Ven, Linux Kernel list, linux-wireless,
	Peter Zijlstra, mingo, Thomas Sattler

On Wed, 2007-07-04 at 16:52 +0400, Oleg Nesterov wrote:

> Yes. And no other work (except a barrier) can run before the caller of
> wait_on_work() is woken.

Alright, thanks. Yes, then what you proposed makes a lot of sense, I'll
implement it.

> Aha, now I see where I was confused. Yes, we can't avoid the false positives
> with flush_workqueue().
> 
> I hope this won't be a problem, because almost every usage of flush_workqueue()
> is pointless nowadays. So even if we have a false positive, it probably
> means the code needs cleanups anyway.
> 
> But see below,

[...]

> If you are going to do this, may I suggest you make two separate patches?
> Exactly because we can't avoid the false positives with flush_workqueue(),
> it would be nice to have the option to revert the second patch if there are
> too many false positives (I hope this won't happen).

I've run this patch on my system for a few days now and have seen
exactly one warning; however, it's *not* actually a *false* positive.
It is a real one: it's perfectly possible to deadlock if the system is
loaded enough that more than one work item is stuck on the workqueue for
some reason. Say A takes L1 and B runs without locks, and then you flush
the workqueue under L1; you'll get a real deadlock when both A and B are
actually scheduled to run!

johannes


* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-07-04 12:21               ` Ingo Molnar
@ 2007-07-04 13:59                 ` Johannes Berg
  0 siblings, 0 replies; 16+ messages in thread
From: Johannes Berg @ 2007-07-04 13:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Oleg Nesterov, Ingo Molnar, Arjan van de Ven, Linux Kernel list,
	linux-wireless, Peter Zijlstra, Thomas Sattler

On Wed, 2007-07-04 at 14:21 +0200, Ingo Molnar wrote:

> well, in this case the lock/unlock should nest perfectly (i.e. it should 
> always be balanced perfectly), so indeed calling with nested==1 leads to 
> stricter checking.
> 
> non-nested unlocks occur when people do stuff like:
> 
> 	spin_lock(&lock1);
> 	spin_lock(&lock2);
> 	spin_unlock(&lock1);
> 	spin_unlock(&lock2);
> 
> the first unlock is not 'nested perfectly'. Now for the workqueue 
> dep_map this shouldnt be a legal combination, right?

I don't think so, will change to use nested==1.

johannes


* Re: [RFC/PATCH] debug workqueue deadlocks with lockdep
  2007-07-04 12:52               ` Oleg Nesterov
  2007-07-04 13:57                 ` Johannes Berg
@ 2007-07-05  8:43                 ` Ingo Molnar
  1 sibling, 0 replies; 16+ messages in thread
From: Ingo Molnar @ 2007-07-05  8:43 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Johannes Berg, Ingo Molnar, Arjan van de Ven, Linux Kernel list,
	linux-wireless, Peter Zijlstra, Thomas Sattler


* Oleg Nesterov <oleg@tv-sign.ru> wrote:

> > > Isn't it better to call lock_release() with nested == 1 ?
> > 
> > Not sure, Ingo?
> 
> Ingo, could you also explain the meaning of "nested" parameter? Looks
> like it is just unneeded, lock_release_nested() does a quick check and
> uses lock_release_non_nested() when hlock is not on top of stack.

hm, i forgot about that. We basically turned off all bad-nesting 
warnings in lockdep due to false positives.

the workqueue dependencies should all nest perfectly so please just use 
nested==1, even though it's a dummy right now. We might want to turn 
unlock-nest checking back on in the future. Or we'll remove that 
parameter altogether.

	Ingo

