LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* 2.6.24-rc7 lockdep warning when poweroff
@ 2008-01-14  9:04 Dave Young
  2008-01-14  9:24 ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Young @ 2008-01-14  9:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Ingo Molnar

Hi,

When I "halt -p", lockdep warnings triggered as following (hand copy):

WARNING : at kernel/lockdep.c: 700 lookup_lock_class()

<snip>
lock_acquire
cleanup_workqueue_thread
workqueue_cpu_callback
notifier_call_chain
__raw_notifier_call_chain
raw_notifier_call_chain
__cpu_down
disable_nonboot_cpus
kernel_power_off
sys_reboot
<snip>

Regards
dave

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-14  9:04 2.6.24-rc7 lockdep warning when poweroff Dave Young
@ 2008-01-14  9:24 ` Peter Zijlstra
  2008-01-14 10:35   ` Johannes Berg
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2008-01-14  9:24 UTC (permalink / raw)
  To: Dave Young
  Cc: Linus Torvalds, Linux Kernel Mailing List, Ingo Molnar, Johannes Berg


On Mon, 2008-01-14 at 17:04 +0800, Dave Young wrote:
> Hi,
> 
> When I "halt -p", lockdep warnings triggered as following (hand copy):
> 
> WARNING : at kernel/lockdep.c: 700 lookup_lock_class()
> 
> <snip>
> lock_acquire
> cleanup_workqueue_thread
> workqueue_cpu_callback
> notifier_call_chain
> __raw_notifier_call_chain
> raw_notifier_call_chain
> __cpu_down
> disable_nonboot_cpus
> kernel_power_off
> sys_reboot
> <snip>

My first guess would be a __create_workqueue_key() where a key is
re-used with a different workqueue name.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-14  9:24 ` Peter Zijlstra
@ 2008-01-14 10:35   ` Johannes Berg
  2008-01-14 10:41     ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Johannes Berg @ 2008-01-14 10:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Young, Linus Torvalds, Linux Kernel Mailing List, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 639 bytes --]


> > When I "halt -p", lockdep warnings triggered as following (hand copy):
> > 
> > WARNING : at kernel/lockdep.c: 700 lookup_lock_class()
> > 
> > <snip>
> > lock_acquire
> > cleanup_workqueue_thread
> > workqueue_cpu_callback
> > notifier_call_chain
> > __raw_notifier_call_chain
> > raw_notifier_call_chain
> > __cpu_down
> > disable_nonboot_cpus
> > kernel_power_off
> > sys_reboot
> > <snip>
> 
> My first guess would be a __create_workqueue_key() where a key is
> re-used with a different workqueue name.

Hm, can you elaborate how you got there from the trace above? I don't
think I'm following.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-14 10:35   ` Johannes Berg
@ 2008-01-14 10:41     ` Peter Zijlstra
  2008-01-14 10:51       ` Johannes Berg
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2008-01-14 10:41 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Dave Young, Linus Torvalds, Linux Kernel Mailing List, Ingo Molnar


On Mon, 2008-01-14 at 11:35 +0100, Johannes Berg wrote:
> > > When I "halt -p", lockdep warnings triggered as following (hand copy):
> > > 
> > > WARNING : at kernel/lockdep.c: 700 lookup_lock_class()
> > > 
> > > <snip>
> > > lock_acquire
> > > cleanup_workqueue_thread
> > > workqueue_cpu_callback
> > > notifier_call_chain
> > > __raw_notifier_call_chain
> > > raw_notifier_call_chain
> > > __cpu_down
> > > disable_nonboot_cpus
> > > kernel_power_off
> > > sys_reboot
> > > <snip>
> > 
> > My first guess would be a __create_workqueue_key() where a key is
> > re-used with a different workqueue name.
> 
> Hm, can you elaborate how you got there from the trace above? I don't
> think I'm following.

The warning that triggered (lockdep.c:700) means that one class (key)
was used with more than one name.

Looking at cleanup_workqueue_thread(), the lock_acquire() there works on
wq->lockdep_map, and that is only initialized at one spot:
__create_workqueue_key(), thus it stands to reason that that was
mis-used.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-14 10:41     ` Peter Zijlstra
@ 2008-01-14 10:51       ` Johannes Berg
  2008-01-15  0:31         ` Dave Young
  2008-01-15 10:41         ` Johannes Berg
  0 siblings, 2 replies; 16+ messages in thread
From: Johannes Berg @ 2008-01-14 10:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Young, Linus Torvalds, Linux Kernel Mailing List, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 520 bytes --]


> The warning that triggered (lockdep.c:700) means that one class (key)
> was used with more than one name.

Right.

> Looking at cleanup_workqueue_thread(), the lock_acquire() there works on
> wq->lockdep_map, and that is only initialized at one spot:
> __create_workqueue_key(), thus it stands to reason that that was
> mis-used.

Oh ok, yes, makes sense. Maybe something is generating a workqueue with
a name that's passed in but the key is statically from that place. I'll
try to find it.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-14 10:51       ` Johannes Berg
@ 2008-01-15  0:31         ` Dave Young
  2008-01-15  1:24           ` Dave Young
  2008-01-15 10:41         ` Johannes Berg
  1 sibling, 1 reply; 16+ messages in thread
From: Dave Young @ 2008-01-15  0:31 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Peter Zijlstra, Linus Torvalds, Linux Kernel Mailing List, Ingo Molnar

On Jan 14, 2008 6:51 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
>
> > The warning that triggered (lockdep.c:700) means that one class (key)
> > was used with more than one name.
>
> Right.
>
> > Looking at cleanup_workqueue_thread(), the lock_acquire() there works on
> > wq->lockdep_map, and that is only initialized at one spot:
> > __create_workqueue_key(), thus it stands to reason that that was
> > mis-used.
>
> Oh ok, yes, makes sense. Maybe something is generating a workqueue with
> a name that's passed in but the key is statically from that place. I'll
> try to find it.

I add some debug printk and found the names :

block_osm/exec_osm

in drivers/message/i2o

maybe this helps.

Regards
dave

>
> johannes
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-15  0:31         ` Dave Young
@ 2008-01-15  1:24           ` Dave Young
  2008-01-15  8:42             ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Young @ 2008-01-15  1:24 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Peter Zijlstra, Linus Torvalds, Linux Kernel Mailing List, Ingo Molnar

On Tue, Jan 15, 2008 at 08:31:48AM +0800, Dave Young wrote:
> On Jan 14, 2008 6:51 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> >
> > > The warning that triggered (lockdep.c:700) means that one class (key)
> > > was used with more than one name.
> >
> > Right.
> >
> > > Looking at cleanup_workqueue_thread(), the lock_acquire() there works on
> > > wq->lockdep_map, and that is only initialized at one spot:
> > > __create_workqueue_key(), thus it stands to reason that that was
> > > mis-used.
> >
> > Oh ok, yes, makes sense. Maybe something is generating a workqueue with
> > a name that's passed in but the key is statically from that place. I'll
> > try to find it.
> 
> I add some debug printk and found the names :
> 
> block_osm/exec_osm
> 
> in drivers/message/i2o
> 
> maybe this helps.

Not sure right or not, the following patch fixed the problem:

diff -upr linux/include/linux/workqueue.h linux.new/include/linux/workqueue.h
--- linux/include/linux/workqueue.h	2008-01-15 08:49:08.000000000 +0800
+++ linux.new/include/linux/workqueue.h	2008-01-15 08:49:42.000000000 +0800
@@ -154,7 +154,7 @@ __create_workqueue_key(const char *name,
 #ifdef CONFIG_LOCKDEP
 #define __create_workqueue(name, singlethread, freezeable)	\
 ({								\
-	static struct lock_class_key __key;			\
+	struct lock_class_key __key;			\
 								\
 	__create_workqueue_key((name), (singlethread),		\
 			       (freezeable), &__key);		\
> 
> Regards
> dave
> 
> >
> > johannes
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-15  1:24           ` Dave Young
@ 2008-01-15  8:42             ` Peter Zijlstra
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2008-01-15  8:42 UTC (permalink / raw)
  To: Dave Young
  Cc: Johannes Berg, Linus Torvalds, Linux Kernel Mailing List, Ingo Molnar


On Tue, 2008-01-15 at 09:24 +0800, Dave Young wrote:
> On Tue, Jan 15, 2008 at 08:31:48AM +0800, Dave Young wrote:
> > On Jan 14, 2008 6:51 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> > >
> > > > The warning that triggered (lockdep.c:700) means that one class (key)
> > > > was used with more than one name.
> > >
> > > Right.
> > >
> > > > Looking at cleanup_workqueue_thread(), the lock_acquire() there works on
> > > > wq->lockdep_map, and that is only initialized at one spot:
> > > > __create_workqueue_key(), thus it stands to reason that that was
> > > > mis-used.
> > >
> > > Oh ok, yes, makes sense. Maybe something is generating a workqueue with
> > > a name that's passed in but the key is statically from that place. I'll
> > > try to find it.
> > 
> > I add some debug printk and found the names :
> > 
> > block_osm/exec_osm
> > 
> > in drivers/message/i2o
> > 
> > maybe this helps.
> 
> Not sure right or not, the following patch fixed the problem:
> 
> diff -upr linux/include/linux/workqueue.h linux.new/include/linux/workqueue.h
> --- linux/include/linux/workqueue.h	2008-01-15 08:49:08.000000000 +0800
> +++ linux.new/include/linux/workqueue.h	2008-01-15 08:49:42.000000000 +0800
> @@ -154,7 +154,7 @@ __create_workqueue_key(const char *name,
>  #ifdef CONFIG_LOCKDEP
>  #define __create_workqueue(name, singlethread, freezeable)	\
>  ({								\
> -	static struct lock_class_key __key;			\
> +	struct lock_class_key __key;			\
>  								\
>  	__create_workqueue_key((name), (singlethread),		\
>  			       (freezeable), &__key);		\

That didn't get you:

  INFO: trying to register non-static key.

Msgs?

But it sure looks like __create_workqueue() is asking for trouble, if
there is a __create_workqueue() instance that takes a non constant name
we're in trouble.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-14 10:51       ` Johannes Berg
  2008-01-15  0:31         ` Dave Young
@ 2008-01-15 10:41         ` Johannes Berg
  2008-01-15 12:21           ` Peter Zijlstra
  1 sibling, 1 reply; 16+ messages in thread
From: Johannes Berg @ 2008-01-15 10:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Young, Linus Torvalds, Linux Kernel Mailing List,
	Ingo Molnar, Oleg Nesterov


Ok, I checked all users of the create*workqueue() API now.

Turns out that there are many users that give a dynamic string as the
workqueue name (only the first three are relevant for the problem at
hand because the others are single-threaded):

drivers/connector/cn_queue.c
drivers/media/video/ivtv/ivtv-driver.c
drivers/message/i2o/driver.c

drivers/i2c/chips/m41t00.c drivers/infiniband/core/mad.c
drivers/message/fusion/mptfc.c drivers/net/qla3xxx.c
drivers/scsi/hosts.c drivers/scsi/qla4xxx/ql4_os.c
drivers/scsi/scsi_transport_fc.c drivers/spi/mpc52xx_psc_spi.c
drivers/spi/omap2_mcspi.c drivers/spi/spi_bitbang.c
drivers/spi/spi_txx9.c drivers/spi/spi_imx.c drivers/spi/pxa2xx_spi.c
drivers/spi/spi_bfin5xx.c drivers/power/ds2760_battery.c
net/mac80211/ieee80211.c


That's not really the problem though.

TBH, when writing the workqueue deadlock detection code I wasn't aware
that it is not allowed to use the same key with different names.

To make sure now:
	same key - different name	- BAD
	same key - same name		- OK
	different key - same name	- OK
	different key - different name	- OK

Correct?

The root problem here seems to be that I use the same name as for the
workqueue for the lockdep_map and other code uses a non-static workqueue
name. Using the workqueue name for the lock is good for knowing which
workqueue ran into trouble though.

mac80211 for example wants to allocate a (single-threaded) workqueue for
each hardware that is plugged in and wants to call it by the hardware
name.

Anyway, the patch below should help. I hope the patch compiles, I don't
have a lockdep-enabled system at hand right now (irqtrace is still not
supported on powerpc and my 64-bit powerpc isn't running a kernel with
my irqtrace support patch at the moment).

If you think the patch is a correct way to solve the problem I'll submit
it formally and it should then be included in 2.6.24 to avoid
regressions with the workqueue API (the workqueue lockup detection was
merged early in 2.6.24.) Who should I send it to in that case?

Dave, do you know if you had connector, ivtv or i2o in the kernel (just
to make sure my analysis was correct)? And can you reproduce the
problem, and if so, can you try if this patch helps?

johannes


---
 include/linux/workqueue.h |   14 +++++++++++---
 kernel/workqueue.c        |    5 +++--
 2 files changed, 14 insertions(+), 5 deletions(-)

--- everything.orig/include/linux/workqueue.h	2008-01-15 02:10:55.098131131 +0100
+++ everything/include/linux/workqueue.h	2008-01-15 02:26:37.428130426 +0100
@@ -149,19 +149,27 @@ struct execute_work {
 
 extern struct workqueue_struct *
 __create_workqueue_key(const char *name, int singlethread,
-		       int freezeable, struct lock_class_key *key);
+		       int freezeable, struct lock_class_key *key,
+		       const char *lock_name);
 
 #ifdef CONFIG_LOCKDEP
 #define __create_workqueue(name, singlethread, freezeable)	\
 ({								\
 	static struct lock_class_key __key;			\
+	const char *__lock_name;				\
+								\
+	if (__builtin_constant_p(name))				\
+		__lock_name = (name);				\
+	else							\
+		__lock_name = #name;				\
 								\
 	__create_workqueue_key((name), (singlethread),		\
-			       (freezeable), &__key);		\
+			       (freezeable), &__key,		\
+			       __lock_name);			\
 })
 #else
 #define __create_workqueue(name, singlethread, freezeable)	\
-	__create_workqueue_key((name), (singlethread), (freezeable), NULL)
+	__create_workqueue_key((name), (singlethread), (freezeable), NULL, NULL)
 #endif
 
 #define create_workqueue(name) __create_workqueue((name), 0, 0)
--- everything.orig/kernel/workqueue.c	2008-01-15 02:15:13.578132867 +0100
+++ everything/kernel/workqueue.c	2008-01-15 02:18:40.518138455 +0100
@@ -722,7 +722,8 @@ static void start_workqueue_thread(struc
 struct workqueue_struct *__create_workqueue_key(const char *name,
 						int singlethread,
 						int freezeable,
-						struct lock_class_key *key)
+						struct lock_class_key *key,
+						const char *lock_name)
 {
 	struct workqueue_struct *wq;
 	struct cpu_workqueue_struct *cwq;
@@ -739,7 +740,7 @@ struct workqueue_struct *__create_workqu
 	}
 
 	wq->name = name;
-	lockdep_init_map(&wq->lockdep_map, name, key, 0);
+	lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
 	wq->singlethread = singlethread;
 	wq->freezeable = freezeable;
 	INIT_LIST_HEAD(&wq->list);



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-15 10:41         ` Johannes Berg
@ 2008-01-15 12:21           ` Peter Zijlstra
  2008-01-15 12:39             ` Johannes Berg
  2008-01-15 23:54             ` Johannes Berg
  0 siblings, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2008-01-15 12:21 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Dave Young, Linus Torvalds, Linux Kernel Mailing List,
	Ingo Molnar, Oleg Nesterov


On Tue, 2008-01-15 at 11:41 +0100, Johannes Berg wrote:
> Ok, I checked all users of the create*workqueue() API now.
> 
> Turns out that there are many users that give a dynamic string as the
> workqueue name (only the first three are relevant for the problem at
> hand because the others are single-threaded):

I'm not sure the single threadedness of a workqueue matters here.

> drivers/connector/cn_queue.c
> drivers/media/video/ivtv/ivtv-driver.c
> drivers/message/i2o/driver.c
> 
> drivers/i2c/chips/m41t00.c drivers/infiniband/core/mad.c
> drivers/message/fusion/mptfc.c drivers/net/qla3xxx.c
> drivers/scsi/hosts.c drivers/scsi/qla4xxx/ql4_os.c
> drivers/scsi/scsi_transport_fc.c drivers/spi/mpc52xx_psc_spi.c
> drivers/spi/omap2_mcspi.c drivers/spi/spi_bitbang.c
> drivers/spi/spi_txx9.c drivers/spi/spi_imx.c drivers/spi/pxa2xx_spi.c
> drivers/spi/spi_bfin5xx.c drivers/power/ds2760_battery.c
> net/mac80211/ieee80211.c
> 
> 
> That's not really the problem though.
> 
> TBH, when writing the workqueue deadlock detection code I wasn't aware
> that it is not allowed to use the same key with different names.
> 
> To make sure now:
> 	same key - different name	- BAD
> 	same key - same name		- OK
> 	different key - same name	- OK

Strictly speaking one can do that, although I would recommend against it
- it leads to confusion as to which lock got into trouble when looking
at lockdep/stat output.

> 	different key - different name	- OK
> 
> Correct?

Yeah.

> The root problem here seems to be that I use the same name as for the
> workqueue for the lockdep_map and other code uses a non-static workqueue
> name. Using the workqueue name for the lock is good for knowing which
> workqueue ran into trouble though.

Indeed, and also using a different key allows the workqueue to have
different lock dependencies as well. The trouble is, lockdep works at
the class level, a class with multiple names just doesn't make sense,
and reporting will get it wrong (although it may appear to work
correctly in the trivial cases).

> mac80211 for example wants to allocate a (single-threaded) workqueue for
> each hardware that is plugged in and wants to call it by the hardware
> name.

Right, that would require a new key for each instance.

> Anyway, the patch below should help. I hope the patch compiles, I don't
> have a lockdep-enabled system at hand right now (irqtrace is still not
> supported on powerpc and my 64-bit powerpc isn't running a kernel with
> my irqtrace support patch at the moment).

Tssk :-)

> If you think the patch is a correct way to solve the problem I'll submit
> it formally and it should then be included in 2.6.24 to avoid
> regressions with the workqueue API (the workqueue lockup detection was
> merged early in 2.6.24.)

The patch looks ok, one important thing to note is that it means that
all workqueues instantiated by the same __create_workqueue() call-site
share lock dependency chains - I'm unsure if that might get us into
trouble or not.

> Who should I send it to in that case?

Me and Ingo :-)

> Dave, do you know if you had connector, ivtv or i2o in the kernel (just
> to make sure my analysis was correct)? And can you reproduce the
> problem, and if so, can you try if this patch helps?
> 
> johannes
> 
> 
> ---
>  include/linux/workqueue.h |   14 +++++++++++---
>  kernel/workqueue.c        |    5 +++--
>  2 files changed, 14 insertions(+), 5 deletions(-)
> 
> --- everything.orig/include/linux/workqueue.h	2008-01-15 02:10:55.098131131 +0100
> +++ everything/include/linux/workqueue.h	2008-01-15 02:26:37.428130426 +0100
> @@ -149,19 +149,27 @@ struct execute_work {
>  
>  extern struct workqueue_struct *
>  __create_workqueue_key(const char *name, int singlethread,
> -		       int freezeable, struct lock_class_key *key);
> +		       int freezeable, struct lock_class_key *key,
> +		       const char *lock_name);
>  
>  #ifdef CONFIG_LOCKDEP
>  #define __create_workqueue(name, singlethread, freezeable)	\
>  ({								\
>  	static struct lock_class_key __key;			\
> +	const char *__lock_name;				\
> +								\
> +	if (__builtin_constant_p(name))				\
> +		__lock_name = (name);				\
> +	else							\
> +		__lock_name = #name;				\
>  								\
>  	__create_workqueue_key((name), (singlethread),		\
> -			       (freezeable), &__key);		\
> +			       (freezeable), &__key,		\
> +			       __lock_name);			\
>  })
>  #else
>  #define __create_workqueue(name, singlethread, freezeable)	\
> -	__create_workqueue_key((name), (singlethread), (freezeable), NULL)
> +	__create_workqueue_key((name), (singlethread), (freezeable), NULL, NULL)
>  #endif
>  
>  #define create_workqueue(name) __create_workqueue((name), 0, 0)
> --- everything.orig/kernel/workqueue.c	2008-01-15 02:15:13.578132867 +0100
> +++ everything/kernel/workqueue.c	2008-01-15 02:18:40.518138455 +0100
> @@ -722,7 +722,8 @@ static void start_workqueue_thread(struc
>  struct workqueue_struct *__create_workqueue_key(const char *name,
>  						int singlethread,
>  						int freezeable,
> -						struct lock_class_key *key)
> +						struct lock_class_key *key,
> +						const char *lock_name)
>  {
>  	struct workqueue_struct *wq;
>  	struct cpu_workqueue_struct *cwq;
> @@ -739,7 +740,7 @@ struct workqueue_struct *__create_workqu
>  	}
>  
>  	wq->name = name;
> -	lockdep_init_map(&wq->lockdep_map, name, key, 0);
> +	lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
>  	wq->singlethread = singlethread;
>  	wq->freezeable = freezeable;
>  	INIT_LIST_HEAD(&wq->list);
> 
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-15 12:21           ` Peter Zijlstra
@ 2008-01-15 12:39             ` Johannes Berg
  2008-01-15 12:47               ` Peter Zijlstra
  2008-01-15 23:54             ` Johannes Berg
  1 sibling, 1 reply; 16+ messages in thread
From: Johannes Berg @ 2008-01-15 12:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Young, Linus Torvalds, Linux Kernel Mailing List,
	Ingo Molnar, Oleg Nesterov

[-- Attachment #1: Type: text/plain, Size: 2730 bytes --]


> > To make sure now:
> > 	same key - different name	- BAD
> > 	same key - same name		- OK
> > 	different key - same name	- OK
> 
> Strictly speaking one can do that, although I would recommend against it
> - it leads to confusion as to which lock got into trouble when looking
> at lockdep/stat output.

True, but I don't see a good way to avoid that. Similar things also
happen with

	mutex_init(&priv->mtx);

for example, no?

> > The root problem here seems to be that I use the same name as for the
> > workqueue for the lockdep_map and other code uses a non-static workqueue
> > name. Using the workqueue name for the lock is good for knowing which
> > workqueue ran into trouble though.
> 
> Indeed, and also using a different key allows the workqueue to have
> different lock dependencies as well. The trouble is, lockdep works at
> the class level, a class with multiple names just doesn't make sense,
> and reporting will get it wrong (although it may appear to work
> correctly in the trivial cases).

Right.

> > mac80211 for example wants to allocate a (single-threaded) workqueue for
> > each hardware that is plugged in and wants to call it by the hardware
> > name.
> 
> Right, that would require a new key for each instance.

Except, how could I do that though? Keys are required to be static, so I
can't have the object as the key. In any case, I don't think it matters
much because the workqueues are per-hardware but all have similar users,
I think that the other users here probably behave similarly.

> > If you think the patch is a correct way to solve the problem I'll submit
> > it formally and it should then be included in 2.6.24 to avoid
> > regressions with the workqueue API (the workqueue lockup detection was
> > merged early in 2.6.24.)
> 
> The patch looks ok, one important thing to note is that it means that
> all workqueues instantiated by the same __create_workqueue() call-site
> share lock dependency chains - I'm unsure if that might get us into
> trouble or not.

It doesn't seem to have so far ;) I don't think it should. If some code
allocates a per-instance workqueue that's much like having an inode lock
or so.

The scenario to get into trouble with this would require having a
per-instance lock and a per-instance workqueue and flushing the
workqueue (that can contain functions taking the lock of instance A) of
instance A under the lock of instance B, but unless that is nested in a
way that it cannot happen in order BA as well it's actually a possible
ABBA deadlock.

> > Who should I send it to in that case?
> 
> Me and Ingo :-)

Alright, I'll write a patch description and send it in a minute.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-15 12:39             ` Johannes Berg
@ 2008-01-15 12:47               ` Peter Zijlstra
  2008-01-15 13:04                 ` [PATCH for 2.6.24] fix workqueue creation API lockdep interaction Johannes Berg
  2008-01-16  8:11                 ` 2.6.24-rc7 lockdep warning when poweroff Ingo Molnar
  0 siblings, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2008-01-15 12:47 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Dave Young, Linus Torvalds, Linux Kernel Mailing List,
	Ingo Molnar, Oleg Nesterov


On Tue, 2008-01-15 at 13:39 +0100, Johannes Berg wrote:
> > > To make sure now:
> > > 	same key - different name	- BAD
> > > 	same key - same name		- OK
> > > 	different key - same name	- OK
> > 
> > Strictly speaking one can do that, although I would recommend against it
> > - it leads to confusion as to which lock got into trouble when looking
> > at lockdep/stat output.
> 
> True, but I don't see a good way to avoid that. Similar things also
> happen with
> 
> 	mutex_init(&priv->mtx);
> 
> for example, no?

Yeah, it happens, I tend to 'fix' them when I encounter it though,
sometimes by just slightly altering the expression. It helps when
grepping the tree.

> > > mac80211 for example wants to allocate a (single-threaded) workqueue for
> > > each hardware that is plugged in and wants to call it by the hardware
> > > name.
> > 
> > Right, that would require a new key for each instance.
> 
> Except, how could I do that though? Keys are required to be static, so I
> can't have the object as the key. In any case, I don't think it matters
> much because the workqueues are per-hardware but all have similar users,
> I think that the other users here probably behave similarly.

Yeah, I think so too, but never underestimate the creativity of driver
authors:-)

> > > If you think the patch is a correct way to solve the problem I'll submit
> > > it formally and it should then be included in 2.6.24 to avoid
> > > regressions with the workqueue API (the workqueue lockup detection was
> > > merged early in 2.6.24.)
> > 
> > The patch looks ok, one important thing to note is that it means that
> > all workqueues instantiated by the same __create_workqueue() call-site
> > share lock dependency chains - I'm unsure if that might get us into
> > trouble or not.
> 
> It doesn't seem to have so far ;) I don't think it should. If some code
> allocates a per-instance workqueue that's much like having an inode lock
> or so.

We had to split up the inode lock to per filesystem classes, just
because the lock chains were conflicting between them...

> > Me and Ingo :-)
> 
> Alright, I'll write a patch description and send it in a minute.

Great, thanks for the effort.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH for 2.6.24] fix workqueue creation API lockdep interaction
  2008-01-15 12:47               ` Peter Zijlstra
@ 2008-01-15 13:04                 ` Johannes Berg
  2008-01-16  4:41                   ` Dave Young
  2008-01-16  8:11                 ` 2.6.24-rc7 lockdep warning when poweroff Ingo Molnar
  1 sibling, 1 reply; 16+ messages in thread
From: Johannes Berg @ 2008-01-15 13:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Young, Linus Torvalds, Linux Kernel Mailing List,
	Ingo Molnar, Oleg Nesterov

Dave Young reported warnings from lockdep that the workqueue API
can sometimes try to register lockdep classes with the same key
but different names. This is not permitted in lockdep.

Unfortunately, I was unaware of that restriction when I wrote
the code to debug workqueue problems with lockdep and used the
workqueue name as the lockdep class name. This can obviously
lead to the problem if the workqueue name is dynamic.

This patch solves the problem by always using a constant name
for the workqueue's lockdep class, namely either the constant
name that was passed in or a string consisting of the variable
name.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
---
Please be careful with this patch, I haven't been able to test it so far
because my powerbook doesn't have lockdep.

 include/linux/workqueue.h |   14 +++++++++++---
 kernel/workqueue.c        |    5 +++--
 2 files changed, 14 insertions(+), 5 deletions(-)

--- everything.orig/include/linux/workqueue.h	2008-01-15 02:10:55.098131131 +0100
+++ everything/include/linux/workqueue.h	2008-01-15 02:26:37.428130426 +0100
@@ -149,19 +149,27 @@ struct execute_work {
 
 extern struct workqueue_struct *
 __create_workqueue_key(const char *name, int singlethread,
-		       int freezeable, struct lock_class_key *key);
+		       int freezeable, struct lock_class_key *key,
+		       const char *lock_name);
 
 #ifdef CONFIG_LOCKDEP
 #define __create_workqueue(name, singlethread, freezeable)	\
 ({								\
 	static struct lock_class_key __key;			\
+	const char *__lock_name;				\
+								\
+	if (__builtin_constant_p(name))				\
+		__lock_name = (name);				\
+	else							\
+		__lock_name = #name;				\
 								\
 	__create_workqueue_key((name), (singlethread),		\
-			       (freezeable), &__key);		\
+			       (freezeable), &__key,		\
+			       __lock_name);			\
 })
 #else
 #define __create_workqueue(name, singlethread, freezeable)	\
-	__create_workqueue_key((name), (singlethread), (freezeable), NULL)
+	__create_workqueue_key((name), (singlethread), (freezeable), NULL, NULL)
 #endif
 
 #define create_workqueue(name) __create_workqueue((name), 0, 0)
--- everything.orig/kernel/workqueue.c	2008-01-15 02:15:13.578132867 +0100
+++ everything/kernel/workqueue.c	2008-01-15 02:18:40.518138455 +0100
@@ -722,7 +722,8 @@ static void start_workqueue_thread(struc
 struct workqueue_struct *__create_workqueue_key(const char *name,
 						int singlethread,
 						int freezeable,
-						struct lock_class_key *key)
+						struct lock_class_key *key,
+						const char *lock_name)
 {
 	struct workqueue_struct *wq;
 	struct cpu_workqueue_struct *cwq;
@@ -739,7 +740,7 @@ struct workqueue_struct *__create_workqu
 	}
 
 	wq->name = name;
-	lockdep_init_map(&wq->lockdep_map, name, key, 0);
+	lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
 	wq->singlethread = singlethread;
 	wq->freezeable = freezeable;
 	INIT_LIST_HEAD(&wq->list);



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-15 12:21           ` Peter Zijlstra
  2008-01-15 12:39             ` Johannes Berg
@ 2008-01-15 23:54             ` Johannes Berg
  1 sibling, 0 replies; 16+ messages in thread
From: Johannes Berg @ 2008-01-15 23:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Young, Linus Torvalds, Linux Kernel Mailing List,
	Ingo Molnar, Oleg Nesterov

[-- Attachment #1: Type: text/plain, Size: 607 bytes --]


> > Turns out that there are many users that give a dynamic string as the
> > workqueue name (only the first three are relevant for the problem at
> > hand because the others are single-threaded):
> 
> I'm not sure the single threadedness of a workqueue matters here.

Oh I should have explained this, missed it earlier. No, the
single-threadedness doesn't really matter for the problem, but the
warning Dave got couldn't possibly have triggered for a single-threaded
workqueue because it was from the CPU up/down code that spawns/stops the
new workqueue thread on a new/dying CPU.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH for 2.6.24] fix workqueue creation API lockdep interaction
  2008-01-15 13:04                 ` [PATCH for 2.6.24] fix workqueue creation API lockdep interaction Johannes Berg
@ 2008-01-16  4:41                   ` Dave Young
  0 siblings, 0 replies; 16+ messages in thread
From: Dave Young @ 2008-01-16  4:41 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Peter Zijlstra, Linus Torvalds, Linux Kernel Mailing List,
	Ingo Molnar, Oleg Nesterov

On Jan 15, 2008 9:04 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> Dave Young reported warnings from lockdep that the workqueue API
> can sometimes try to register lockdep classes with the same key
> but different names. This is not permitted in lockdep.
>
> Unfortunately, I was unaware of that restriction when I wrote
> the code to debug workqueue problems with lockdep and used the
> workqueue name as the lockdep class name. This can obviously
> lead to the problem if the workqueue name is dynamic.
>
> This patch solves the problem by always using a constant name
> for the workqueue's lockdep class, namely either the constant
> name that was passed in or a string consisting of the variable
> name.
>
> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
> ---
> Please be careful with this patch, I haven't been able to test it so far
> because my powerbook doesn't have lockdep.

Hi,
Just for confirm, the warnings didn't trigger after applied your patch, thanks.

Regards
dave

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: 2.6.24-rc7 lockdep warning when poweroff
  2008-01-15 12:47               ` Peter Zijlstra
  2008-01-15 13:04                 ` [PATCH for 2.6.24] fix workqueue creation API lockdep interaction Johannes Berg
@ 2008-01-16  8:11                 ` Ingo Molnar
  1 sibling, 0 replies; 16+ messages in thread
From: Ingo Molnar @ 2008-01-16  8:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Johannes Berg, Dave Young, Linus Torvalds,
	Linux Kernel Mailing List, Oleg Nesterov


* Peter Zijlstra <peterz@infradead.org> wrote:

> > > The patch looks ok, one important thing to note is that it means 
> > > that all workqueues instantiated by the same __create_workqueue() 
> > > call-site share lock dependency chains - I'm unsure if that might 
> > > get us into trouble or not.
> > 
> > It doesn't seem to have so far ;) I don't think it should. If some 
> > code allocates a per-instance workqueue that's much like having an 
> > inode lock or so.
> 
> We had to split up the inode lock to per filesystem classes, just 
> because the lock chains were conflicting between them...

i.e. filesystems can legally have different locking rules wrt. i_lock. I 
dont really like it (we should have as simple locking rules as possible) 
but it is the VFS status quo :)

	Ingo

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2008-01-16 15:46 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-01-14  9:04 2.6.24-rc7 lockdep warning when poweroff Dave Young
2008-01-14  9:24 ` Peter Zijlstra
2008-01-14 10:35   ` Johannes Berg
2008-01-14 10:41     ` Peter Zijlstra
2008-01-14 10:51       ` Johannes Berg
2008-01-15  0:31         ` Dave Young
2008-01-15  1:24           ` Dave Young
2008-01-15  8:42             ` Peter Zijlstra
2008-01-15 10:41         ` Johannes Berg
2008-01-15 12:21           ` Peter Zijlstra
2008-01-15 12:39             ` Johannes Berg
2008-01-15 12:47               ` Peter Zijlstra
2008-01-15 13:04                 ` [PATCH for 2.6.24] fix workqueue creation API lockdep interaction Johannes Berg
2008-01-16  4:41                   ` Dave Young
2008-01-16  8:11                 ` 2.6.24-rc7 lockdep warning when poweroff Ingo Molnar
2008-01-15 23:54             ` Johannes Berg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).