LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Problem with freezable workqueues
@ 2007-02-27 21:51 Rafael J. Wysocki
  2007-02-27 23:28 ` Oleg Nesterov
                   ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-27 21:51 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Gautham R Shenoy, Johannes Berg, LKML, Oleg Nesterov, Srivatsa Vaddagiri

Hi,

We have a problem with freezable workqueues in 2.6.21-rc1 and in -mm
(there are only two of them, in XFS, but still).  Namely, their worker threads
deadlock with workqueue_cpu_callback() that gets called during the CPU hotplug,
becuase workqueue_cpu_callback() tries to stop these threads while they are
frozen (disable_nonboot_cpus() happens after we've frozen tasks).

For 2.6.21-rc1 I've invented the appended workaround (works for me, waiting for
Johannes to confirm it works for him too), but I think we need something better
for -mm and future kernels.

Greetings,
Rafael


 kernel/workqueue.c |   18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

Index: linux-2.6.21-rc1/kernel/workqueue.c
===================================================================
--- linux-2.6.21-rc1.orig/kernel/workqueue.c	2007-02-24 10:17:57.000000000 +0100
+++ linux-2.6.21-rc1/kernel/workqueue.c	2007-02-24 20:00:22.000000000 +0100
@@ -376,8 +376,19 @@ static int worker_thread(void *__cwq)
 
 	set_current_state(TASK_INTERRUPTIBLE);
 	while (!kthread_should_stop()) {
-		if (cwq->freezeable)
-			try_to_freeze();
+		if (try_to_freeze()) {
+			/* We've just left the refrigerator.  If our CPU is
+			 * a nonboot one, we might have been replaced.
+			 * The lock is taken to prevent the race with
+			 * cleanup_workqueue_thread() from happening
+			 */
+			spin_lock_irq(&cwq->lock);
+			if (cwq->thread != current) {
+				spin_unlock_irq(&cwq->lock);
+				break;
+			}
+			spin_unlock_irq(&cwq->lock);
+		}
 
 		add_wait_queue(&cwq->more_work, &wait);
 		if (list_empty(&cwq->worklist))
@@ -539,6 +550,9 @@ static void cleanup_workqueue_thread(str
 	cwq = per_cpu_ptr(wq->cpu_wq, cpu);
 	spin_lock_irqsave(&cwq->lock, flags);
 	p = cwq->thread;
+	if (p && frozen(p))
+		p = NULL;
+
 	cwq->thread = NULL;
 	spin_unlock_irqrestore(&cwq->lock, flags);
 	if (p)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-27 21:51 Problem with freezable workqueues Rafael J. Wysocki
@ 2007-02-27 23:28 ` Oleg Nesterov
  2007-02-27 23:36   ` Johannes Berg
  2007-02-27 23:57   ` Rafael J. Wysocki
  2007-02-28  3:01 ` Srivatsa Vaddagiri
  2007-03-06  0:30 ` Johannes Berg
  2 siblings, 2 replies; 45+ messages in thread
From: Oleg Nesterov @ 2007-02-27 23:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML, Srivatsa Vaddagiri

On 02/27, Rafael J. Wysocki wrote:
>
> We have a problem with freezable workqueues in 2.6.21-rc1 and in -mm
> (there are only two of them, in XFS, but still).  Namely, their worker threads
> deadlock with workqueue_cpu_callback() that gets called during the CPU hotplug,
> becuase workqueue_cpu_callback() tries to stop these threads while they are
> frozen (disable_nonboot_cpus() happens after we've frozen tasks).

Ugh. I know nothing, nothing, nothing about suspend. I'll try to guess.

   Commit: ed746e3b18f4df18afa3763155972c5835f284c5

   [PATCH] swsusp: Change code ordering in disk.c

   Change the ordering of code in kernel/power/disk.c so that device_suspend() is
   called before disable_nonboot_cpus() and platform_finish() is called after
   enable_nonboot_cpus() and before device_resume(), as indicated by the recent
   discussion on Linux-PM (cf.
   http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html).

   The changes here only affect the built-in swsusp.

Yes? with the patch above, _cpu_down() called _after_ freeze_processes() ???
Honestly, I can't understand this (yes, I know nothing, nothing, nothing...).

> For 2.6.21-rc1 I've invented the appended workaround (works for me, waiting for
> Johannes to confirm it works for him too), but I think we need something better
> for -mm and future kernels.

How about other kthread_stop()s ? For example, kernel/softirq.c:cpu_callback() ?

I think we need a general "cpu_down() after freeze" implementation, this is what
Gautham and Srivatsa are working on, right?

> --- linux-2.6.21-rc1.orig/kernel/workqueue.c	2007-02-24 10:17:57.000000000 +0100
> +++ linux-2.6.21-rc1/kernel/workqueue.c	2007-02-24 20:00:22.000000000 +0100
> @@ -376,8 +376,19 @@ static int worker_thread(void *__cwq)
>  
>  	set_current_state(TASK_INTERRUPTIBLE);
>  	while (!kthread_should_stop()) {
> -		if (cwq->freezeable)
> -			try_to_freeze();
> +		if (try_to_freeze()) {
> +			/* We've just left the refrigerator.  If our CPU is
> +			 * a nonboot one, we might have been replaced.
> +			 * The lock is taken to prevent the race with
> +			 * cleanup_workqueue_thread() from happening
> +			 */
> +			spin_lock_irq(&cwq->lock);

I'm afraid this is racy. We can't touch *cwq, it may be freed. Suppose
that another thread does destroy_workqueue(), and we thaw that thread
before cwq->thread.

Oleg.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-27 23:28 ` Oleg Nesterov
@ 2007-02-27 23:36   ` Johannes Berg
  2007-02-28  0:00     ` Rafael J. Wysocki
  2007-02-28 18:06     ` Gautham R Shenoy
  2007-02-27 23:57   ` Rafael J. Wysocki
  1 sibling, 2 replies; 45+ messages in thread
From: Johannes Berg @ 2007-02-27 23:36 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Rafael J. Wysocki, Pavel Machek, Gautham R Shenoy, LKML,
	Srivatsa Vaddagiri

[-- Attachment #1: Type: text/plain, Size: 1022 bytes --]

On Wed, 2007-02-28 at 02:28 +0300, Oleg Nesterov wrote:

> Ugh. I know nothing, nothing, nothing about suspend. I'll try to guess.
> 
>    Commit: ed746e3b18f4df18afa3763155972c5835f284c5

> Yes? with the patch above, _cpu_down() called _after_ freeze_processes() ???

perfect :)
See also my original mail and the thread:
http://thread.gmane.org/gmane.linux.power-management.general/4314

> How about other kthread_stop()s ? For example, kernel/softirq.c:cpu_callback() ?

I'd they should be affected as well. For me, I guess I haven't run into
them because xfs is enough to freeze the box ;) As for Rafael, I don't
know why he isn't seeing that... I haven't been able to test this patch
on my quad powermac so far unfortunately, been sick and away from the
machine. I'll probably be able to test tomorrow.

> I think we need a general "cpu_down() after freeze" implementation,

Yeah, I'd think so.

>  this is what
> Gautham and Srivatsa are working on, right?

That I don't know.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-27 23:28 ` Oleg Nesterov
  2007-02-27 23:36   ` Johannes Berg
@ 2007-02-27 23:57   ` Rafael J. Wysocki
  2007-02-28  0:01     ` Johannes Berg
                       ` (2 more replies)
  1 sibling, 3 replies; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-27 23:57 UTC (permalink / raw)
  To: Oleg Nesterov, Pavel Machek
  Cc: Gautham R Shenoy, Johannes Berg, LKML, Srivatsa Vaddagiri

On Wednesday, 28 February 2007 00:28, Oleg Nesterov wrote:
> On 02/27, Rafael J. Wysocki wrote:
> >
> > We have a problem with freezable workqueues in 2.6.21-rc1 and in -mm
> > (there are only two of them, in XFS, but still).  Namely, their worker threads
> > deadlock with workqueue_cpu_callback() that gets called during the CPU hotplug,
> > becuase workqueue_cpu_callback() tries to stop these threads while they are
> > frozen (disable_nonboot_cpus() happens after we've frozen tasks).
> 
> Ugh. I know nothing, nothing, nothing about suspend. I'll try to guess.
> 
>    Commit: ed746e3b18f4df18afa3763155972c5835f284c5

Yes, but not only this one.  Please see:

http://kernel.org/git/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e3c7db621bed4afb8e231cb005057f2feb5db557

>    [PATCH] swsusp: Change code ordering in disk.c
> 
>    Change the ordering of code in kernel/power/disk.c so that device_suspend() is
>    called before disable_nonboot_cpus() and platform_finish() is called after
>    enable_nonboot_cpus() and before device_resume(), as indicated by the recent
>    discussion on Linux-PM (cf.
>    http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html).
> 
>    The changes here only affect the built-in swsusp.
> 
> Yes? with the patch above, _cpu_down() called _after_ freeze_processes() ???

Yes, that's what we want to do.

> Honestly, I can't understand this (yes, I know nothing, nothing, nothing...).

Well, we used the CPU hotplug for disabling nonboot CPUs before
freeze_processes() was called, because the freezer used to be totally unsafe
on SMP.  However, what we wanted from the beginning was to freeze tasks with
all CPUs on line (this way, for example, userland tasks should not notice that
we have played with the CPUs under them, but there are other reasons).

> > For 2.6.21-rc1 I've invented the appended workaround (works for me, waiting for
> > Johannes to confirm it works for him too), but I think we need something better
> > for -mm and future kernels.
> 
> How about other kthread_stop()s ? For example, kernel/softirq.c:cpu_callback() ?

They all are PF_NOFREEZE, I suppose.  If we make all workqueues nonfreezable
(as they were before), the problem won't appear.

> I think we need a general "cpu_down() after freeze" implementation, this is what
> Gautham and Srivatsa are working on, right?

Yes, certainly.

> > --- linux-2.6.21-rc1.orig/kernel/workqueue.c	2007-02-24 10:17:57.000000000 +0100
> > +++ linux-2.6.21-rc1/kernel/workqueue.c	2007-02-24 20:00:22.000000000 +0100
> > @@ -376,8 +376,19 @@ static int worker_thread(void *__cwq)
> >  
> >  	set_current_state(TASK_INTERRUPTIBLE);
> >  	while (!kthread_should_stop()) {
> > -		if (cwq->freezeable)
> > -			try_to_freeze();
> > +		if (try_to_freeze()) {
> > +			/* We've just left the refrigerator.  If our CPU is
> > +			 * a nonboot one, we might have been replaced.
> > +			 * The lock is taken to prevent the race with
> > +			 * cleanup_workqueue_thread() from happening
> > +			 */
> > +			spin_lock_irq(&cwq->lock);
> 
> I'm afraid this is racy. We can't touch *cwq, it may be freed. Suppose
> that another thread does destroy_workqueue(), and we thaw that thread
> before cwq->thread.

Okay, in that case I'd suggest removing create_freezeable_workqueue() and
make all workqueues nonfreezable once again for 2.6.21 (as far as I know, only
the two XFS workqueues are affected).

Pavel, would that be acceptable?

Rafael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-27 23:36   ` Johannes Berg
@ 2007-02-28  0:00     ` Rafael J. Wysocki
  2007-02-28  0:00       ` Johannes Berg
  2007-02-28 18:06     ` Gautham R Shenoy
  1 sibling, 1 reply; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28  0:00 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Oleg Nesterov, Pavel Machek, Gautham R Shenoy, LKML, Srivatsa Vaddagiri

On Wednesday, 28 February 2007 00:36, Johannes Berg wrote:
> On Wed, 2007-02-28 at 02:28 +0300, Oleg Nesterov wrote:
> 
> > Ugh. I know nothing, nothing, nothing about suspend. I'll try to guess.
> > 
> >    Commit: ed746e3b18f4df18afa3763155972c5835f284c5
> 
> > Yes? with the patch above, _cpu_down() called _after_ freeze_processes() ???
> 
> perfect :)
> See also my original mail and the thread:
> http://thread.gmane.org/gmane.linux.power-management.general/4314
> 
> > How about other kthread_stop()s ? For example, kernel/softirq.c:cpu_callback() ?
> 
> I'd they should be affected as well.

They won't be, if they have PF_NOFREEZE set.

> For me, I guess I haven't run into them because xfs is enough to freeze the
> box ;)

I think you've hit the only thing that could have triggered the issue. ;-)

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  0:00     ` Rafael J. Wysocki
@ 2007-02-28  0:00       ` Johannes Berg
  0 siblings, 0 replies; 45+ messages in thread
From: Johannes Berg @ 2007-02-28  0:00 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oleg Nesterov, Pavel Machek, Gautham R Shenoy, LKML, Srivatsa Vaddagiri

[-- Attachment #1: Type: text/plain, Size: 286 bytes --]

On Wed, 2007-02-28 at 01:00 +0100, Rafael J. Wysocki wrote:

> > > How about other kthread_stop()s ? For example, kernel/softirq.c:cpu_callback() ?
> > 
> > I'd they should be affected as well.
> 
> They won't be, if they have PF_NOFREEZE set.

Yup, I missed that.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-27 23:57   ` Rafael J. Wysocki
@ 2007-02-28  0:01     ` Johannes Berg
  2007-02-28  0:08       ` Rafael J. Wysocki
  2007-02-28  3:07     ` Srivatsa Vaddagiri
  2007-02-28  8:54     ` Pavel Machek
  2 siblings, 1 reply; 45+ messages in thread
From: Johannes Berg @ 2007-02-28  0:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oleg Nesterov, Pavel Machek, Gautham R Shenoy, LKML,
	Srivatsa Vaddagiri, Nigel Cunningham

[-- Attachment #1: Type: text/plain, Size: 362 bytes --]

On Wed, 2007-02-28 at 00:57 +0100, Rafael J. Wysocki wrote:

> Okay, in that case I'd suggest removing create_freezeable_workqueue() and
> make all workqueues nonfreezable once again for 2.6.21 (as far as I know, only
> the two XFS workqueues are affected).

I think Nigel might object but I forgot what specific trouble XFS was
causing him.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  0:01     ` Johannes Berg
@ 2007-02-28  0:08       ` Rafael J. Wysocki
  2007-02-28  1:14         ` Nigel Cunningham
  0 siblings, 1 reply; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28  0:08 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Oleg Nesterov, Pavel Machek, Gautham R Shenoy, LKML,
	Srivatsa Vaddagiri, Nigel Cunningham

On Wednesday, 28 February 2007 01:01, Johannes Berg wrote:
> On Wed, 2007-02-28 at 00:57 +0100, Rafael J. Wysocki wrote:
> 
> > Okay, in that case I'd suggest removing create_freezeable_workqueue() and
> > make all workqueues nonfreezable once again for 2.6.21 (as far as I know, only
> > the two XFS workqueues are affected).
> 
> I think Nigel might object but I forgot what specific trouble XFS was
> causing him.

We suspected that the XFS' worker threads might commit I/O after
freeze_processes() has returned, but that hasn't been supported by evidence,
as far as I can recall.

Also, making them freezable was controversial ...

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  0:08       ` Rafael J. Wysocki
@ 2007-02-28  1:14         ` Nigel Cunningham
  2007-02-28 10:59           ` Rafael J. Wysocki
  2007-02-28 20:36           ` Johannes Berg
  0 siblings, 2 replies; 45+ messages in thread
From: Nigel Cunningham @ 2007-02-28  1:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Johannes Berg, Oleg Nesterov, Pavel Machek, Gautham R Shenoy,
	LKML, Srivatsa Vaddagiri

Hi.

On Wed, 2007-02-28 at 01:08 +0100, Rafael J. Wysocki wrote:
> On Wednesday, 28 February 2007 01:01, Johannes Berg wrote:
> > On Wed, 2007-02-28 at 00:57 +0100, Rafael J. Wysocki wrote:
> > 
> > > Okay, in that case I'd suggest removing create_freezeable_workqueue() and
> > > make all workqueues nonfreezable once again for 2.6.21 (as far as I know, only
> > > the two XFS workqueues are affected).
> > 
> > I think Nigel might object but I forgot what specific trouble XFS was
> > causing him.
> 
> We suspected that the XFS' worker threads might commit I/O after
> freeze_processes() has returned, but that hasn't been supported by evidence,
> as far as I can recall.
> 
> Also, making them freezable was controversial ...

Controversy is no reason to give in! Nevertheless, I think you're right
- I believe the XFS guys said they fixed the issue that had caused I/O
to be submitted post-freeze. Well, we'll see if it appears again, won't
we?

Regards,

Nigel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-27 21:51 Problem with freezable workqueues Rafael J. Wysocki
  2007-02-27 23:28 ` Oleg Nesterov
@ 2007-02-28  3:01 ` Srivatsa Vaddagiri
  2007-02-28  3:51   ` Srivatsa Vaddagiri
  2007-03-06  0:30 ` Johannes Berg
  2 siblings, 1 reply; 45+ messages in thread
From: Srivatsa Vaddagiri @ 2007-02-28  3:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML, Oleg Nesterov

On Tue, Feb 27, 2007 at 10:51:27PM +0100, Rafael J. Wysocki wrote:
> We have a problem with freezable workqueues in 2.6.21-rc1 and in -mm
> (there are only two of them, in XFS, but still).  Namely, their worker threads
> deadlock with workqueue_cpu_callback() that gets called during the CPU hotplug,
> becuase workqueue_cpu_callback() tries to stop these threads while they are
> frozen (disable_nonboot_cpus() happens after we've frozen tasks).

This problem (of kthread_stopping a frozen thread) was there when we
implemented freezer-based cpu hotplug. We worked around that in the
callbacks by thawing the worker thread first before kthread_stopping it,
which is working pretty neatly.

Should that fix the issue? 

-- 
Regards,
vatsa

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-27 23:57   ` Rafael J. Wysocki
  2007-02-28  0:01     ` Johannes Berg
@ 2007-02-28  3:07     ` Srivatsa Vaddagiri
  2007-02-28  8:48       ` Oleg Nesterov
  2007-02-28 18:17       ` Gautham R Shenoy
  2007-02-28  8:54     ` Pavel Machek
  2 siblings, 2 replies; 45+ messages in thread
From: Srivatsa Vaddagiri @ 2007-02-28  3:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oleg Nesterov, Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML

On Wed, Feb 28, 2007 at 12:57:35AM +0100, Rafael J. Wysocki wrote:
> > How about other kthread_stop()s ? For example, kernel/softirq.c:cpu_callback() ?
> 
> They all are PF_NOFREEZE, I suppose.  If we make all workqueues nonfreezable
> (as they were before), the problem won't appear.

We can just thaw the worker thread selectively before kthread_stopping
them. This will let us freeze all worker threads (which we want to for
hotplug anyway).

> > I think we need a general "cpu_down() after freeze" implementation, this is what
> > Gautham and Srivatsa are working on, right?
> 
> Yes, certainly.

Hmm ..good point. So can we assume that disable/enable_nonboot_cpus() are called
with processes frozen already?

Gautham, you need to take this into account in your patchset!

> > I'm afraid this is racy. We can't touch *cwq, it may be freed. Suppose
> > that another thread does destroy_workqueue(), and we thaw that thread
> > before cwq->thread.
> 
> Okay, in that case I'd suggest removing create_freezeable_workqueue() and
> make all workqueues nonfreezable once again for 2.6.21 (as far as I know, only
> the two XFS workqueues are affected).

See above suggestion of thawing worker thread before kthread_stopping
it.

-- 
Regards,
vatsa

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  3:01 ` Srivatsa Vaddagiri
@ 2007-02-28  3:51   ` Srivatsa Vaddagiri
  2007-02-28 11:11     ` Rafael J. Wysocki
  0 siblings, 1 reply; 45+ messages in thread
From: Srivatsa Vaddagiri @ 2007-02-28  3:51 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML, Oleg Nesterov

On Wed, Feb 28, 2007 at 08:31:13AM +0530, Srivatsa Vaddagiri wrote:
> This problem (of kthread_stopping a frozen thread) was there when we
> implemented freezer-based cpu hotplug. We worked around that in the
> callbacks by thawing the worker thread first before kthread_stopping it,
> which is working pretty neatly.
> 
> Should that fix the issue? 

In addition to thawing worker thread before kthread_stopping it, there
are minor changes required in worker threads, to check for
is_cpu_offline(bind_cpu) when they come out of refrigerator and jump to
wait_to_die if so (ex: softirq.c).

I guess you would need these changes before freezer-based hotplug is
merged, in which case Gautham can send those patches out first.

-- 
Regards,
vatsa

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  3:07     ` Srivatsa Vaddagiri
@ 2007-02-28  8:48       ` Oleg Nesterov
  2007-02-28  9:10         ` Srivatsa Vaddagiri
  2007-02-28 18:17       ` Gautham R Shenoy
  1 sibling, 1 reply; 45+ messages in thread
From: Oleg Nesterov @ 2007-02-28  8:48 UTC (permalink / raw)
  To: Srivatsa Vaddagiri
  Cc: Rafael J. Wysocki, Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML

On 02/28, Srivatsa Vaddagiri wrote:
>
> On Wed, Feb 28, 2007 at 12:57:35AM +0100, Rafael J. Wysocki wrote:
> > > How about other kthread_stop()s ? For example, kernel/softirq.c:cpu_callback() ?
> > 
> > They all are PF_NOFREEZE, I suppose.  If we make all workqueues nonfreezable
> > (as they were before), the problem won't appear.
> 
> We can just thaw the worker thread selectively before kthread_stopping
> them. This will let us freeze all worker threads (which we want to for
> hotplug anyway).

I am not sure this is a good change for 2.6.21.

I strongly believe it is better to change XFS so that it doesn't use
create_freezeable_workqueue() as Rafael suggested. Besides, freezeable
workqueues are buggy anyway in 2.6.21-rc,

	http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755

This means that workqueues become non-freezeable after suspend/resume
anyway (if I understand disable_nonboot_cpus() correctly).

Oleg.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-27 23:57   ` Rafael J. Wysocki
  2007-02-28  0:01     ` Johannes Berg
  2007-02-28  3:07     ` Srivatsa Vaddagiri
@ 2007-02-28  8:54     ` Pavel Machek
  2 siblings, 0 replies; 45+ messages in thread
From: Pavel Machek @ 2007-02-28  8:54 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oleg Nesterov, Gautham R Shenoy, Johannes Berg, LKML, Srivatsa Vaddagiri

Hi!

> > I'm afraid this is racy. We can't touch *cwq, it may be freed. Suppose
> > that another thread does destroy_workqueue(), and we thaw that thread
> > before cwq->thread.
> 
> Okay, in that case I'd suggest removing create_freezeable_workqueue() and
> make all workqueues nonfreezable once again for 2.6.21 (as far as I know, only
> the two XFS workqueues are affected).
> 
> Pavel, would that be acceptable?

Not sure... I really dislike XFS running while we are doing
swsusp. I'd like to move in direction of freezeable workqueues in the
future.

Anyway, if it gets us out of current trouble... yes I guess we can do
it.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  8:48       ` Oleg Nesterov
@ 2007-02-28  9:10         ` Srivatsa Vaddagiri
  2007-02-28  9:43           ` Oleg Nesterov
  2007-02-28 11:09           ` Rafael J. Wysocki
  0 siblings, 2 replies; 45+ messages in thread
From: Srivatsa Vaddagiri @ 2007-02-28  9:10 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Rafael J. Wysocki, Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML

On Wed, Feb 28, 2007 at 11:48:59AM +0300, Oleg Nesterov wrote:
> On 02/28, Srivatsa Vaddagiri wrote:
> > We can just thaw the worker thread selectively before kthread_stopping
> > them. This will let us freeze all worker threads (which we want to for
> > hotplug anyway).
> 
> I am not sure this is a good change for 2.6.21.

So we make that change when merging the freezer-based hotplug patchset?

> I strongly believe it is better to change XFS so that it doesn't use
> create_freezeable_workqueue() as Rafael suggested. 

Ok no issues. But when we enable freezer-based hotplug, we expect all
non-singlethreaded worker threads to be frozen (for hotplug atleast).

> Besides, freezeable workqueues are buggy anyway in 2.6.21-rc,
> 
> 	http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755
>
> This means that workqueues become non-freezeable after suspend/resume
> anyway (if I understand disable_nonboot_cpus() correctly).

Ah ok. When is the above patch expected to be merged?

-- 
Regards,
vatsa

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  9:10         ` Srivatsa Vaddagiri
@ 2007-02-28  9:43           ` Oleg Nesterov
  2007-02-28 11:09           ` Rafael J. Wysocki
  1 sibling, 0 replies; 45+ messages in thread
From: Oleg Nesterov @ 2007-02-28  9:43 UTC (permalink / raw)
  To: Srivatsa Vaddagiri
  Cc: Rafael J. Wysocki, Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML

On 02/28, Srivatsa Vaddagiri wrote:
>
> On Wed, Feb 28, 2007 at 11:48:59AM +0300, Oleg Nesterov wrote:
> > On 02/28, Srivatsa Vaddagiri wrote:
> > > We can just thaw the worker thread selectively before kthread_stopping
> > > them. This will let us freeze all worker threads (which we want to for
> > > hotplug anyway).
> > 
> > I am not sure this is a good change for 2.6.21.
> 
> So we make that change when merging the freezer-based hotplug patchset?
> 
> > I strongly believe it is better to change XFS so that it doesn't use
> > create_freezeable_workqueue() as Rafael suggested. 
> 
> Ok no issues. But when we enable freezer-based hotplug, we expect all
> non-singlethreaded worker threads to be frozen (for hotplug atleast).

Yes, we already discussed this :) This is btw another indication we
should kill create_freezeable_workqueue() eventually, so it would be
nice to change the only one user not to use it.

> > Besides, freezeable workqueues are buggy anyway in 2.6.21-rc,
> > 
> > 	http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755
> >
> > This means that workqueues become non-freezeable after suspend/resume
> > anyway (if I understand disable_nonboot_cpus() correctly).
> 
> Ah ok. When is the above patch expected to be merged?

Oh, I don't know. Note that this particular patch makes little sense if
we change XFS now (because we have no other users), but we can't drop it,
this will break subsequent patches.

Oleg.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  1:14         ` Nigel Cunningham
@ 2007-02-28 10:59           ` Rafael J. Wysocki
  2007-02-28 20:36           ` Johannes Berg
  1 sibling, 0 replies; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 10:59 UTC (permalink / raw)
  To: nigel
  Cc: Johannes Berg, Oleg Nesterov, Pavel Machek, Gautham R Shenoy,
	LKML, Srivatsa Vaddagiri

On Wednesday, 28 February 2007 02:14, Nigel Cunningham wrote:
> Hi.
> 
> On Wed, 2007-02-28 at 01:08 +0100, Rafael J. Wysocki wrote:
> > On Wednesday, 28 February 2007 01:01, Johannes Berg wrote:
> > > On Wed, 2007-02-28 at 00:57 +0100, Rafael J. Wysocki wrote:
> > > 
> > > > Okay, in that case I'd suggest removing create_freezeable_workqueue() and
> > > > make all workqueues nonfreezable once again for 2.6.21 (as far as I know, only
> > > > the two XFS workqueues are affected).
> > > 
> > > I think Nigel might object but I forgot what specific trouble XFS was
> > > causing him.
> > 
> > We suspected that the XFS' worker threads might commit I/O after
> > freeze_processes() has returned, but that hasn't been supported by evidence,
> > as far as I can recall.
> > 
> > Also, making them freezable was controversial ...
> 
> Controversy is no reason to give in! Nevertheless, I think you're right
> - I believe the XFS guys said they fixed the issue that had caused I/O
> to be submitted post-freeze. Well, we'll see if it appears again, won't
> we?

Sure, we will. :-)

Rafael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  9:10         ` Srivatsa Vaddagiri
  2007-02-28  9:43           ` Oleg Nesterov
@ 2007-02-28 11:09           ` Rafael J. Wysocki
  1 sibling, 0 replies; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 11:09 UTC (permalink / raw)
  To: vatsa; +Cc: Oleg Nesterov, Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML

On Wednesday, 28 February 2007 10:10, Srivatsa Vaddagiri wrote:
> On Wed, Feb 28, 2007 at 11:48:59AM +0300, Oleg Nesterov wrote:
> > On 02/28, Srivatsa Vaddagiri wrote:
> > > We can just thaw the worker thread selectively before kthread_stopping
> > > them. This will let us freeze all worker threads (which we want to for
> > > hotplug anyway).
> > 
> > I am not sure this is a good change for 2.6.21.
> 
> So we make that change when merging the freezer-based hotplug patchset?

Well, if we want to have freezable workqueues eventually, and I think we do,
they should be taken into consideration in designing this patchset, as well as
the suspend code ordering (ie. freeze_processes(SUSPEND) before
freeze_processes(CPU_HOTPLUG)).

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  3:51   ` Srivatsa Vaddagiri
@ 2007-02-28 11:11     ` Rafael J. Wysocki
  2007-02-28 13:17       ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 11:11 UTC (permalink / raw)
  To: vatsa; +Cc: Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML, Oleg Nesterov

On Wednesday, 28 February 2007 04:51, Srivatsa Vaddagiri wrote:
> On Wed, Feb 28, 2007 at 08:31:13AM +0530, Srivatsa Vaddagiri wrote:
> > This problem (of kthread_stopping a frozen thread) was there when we
> > implemented freezer-based cpu hotplug. We worked around that in the
> > callbacks by thawing the worker thread first before kthread_stopping it,
> > which is working pretty neatly.
> > 
> > Should that fix the issue? 
> 
> In addition to thawing worker thread before kthread_stopping it, there
> are minor changes required in worker threads, to check for
> is_cpu_offline(bind_cpu) when they come out of refrigerator and jump to
> wait_to_die if so (ex: softirq.c).
> 
> I guess you would need these changes before freezer-based hotplug is
> merged, in which case Gautham can send those patches out first.

Yes, please, if that's possible.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 11:11     ` Rafael J. Wysocki
@ 2007-02-28 13:17       ` Srivatsa Vaddagiri
  2007-02-28 13:27         ` Srivatsa Vaddagiri
                           ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Srivatsa Vaddagiri @ 2007-02-28 13:17 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML, Oleg Nesterov

On Wed, Feb 28, 2007 at 12:11:03PM +0100, Rafael J. Wysocki wrote:
> > In addition to thawing worker thread before kthread_stopping it, there
> > are minor changes required in worker threads, to check for
> > is_cpu_offline(bind_cpu) when they come out of refrigerator and jump to
> > wait_to_die if so (ex: softirq.c).
> > 
> > I guess you would need these changes before freezer-based hotplug is
> > merged, in which case Gautham can send those patches out first.
> 
> Yes, please, if that's possible.

After looking at the current workqueue code, the above minor change I
suggested is not required.

So you should be able to fix your "kthread_stop on a frozen worker
thread hangs" problem by just a simple patch like this (against
2.6.20-mm2):


--- workqueue.c.org	2007-02-28 18:32:48.000000000 +0530
+++ workqueue.c	2007-02-28 18:44:23.000000000 +0530
@@ -718,6 +718,8 @@ static void cleanup_workqueue_thread(str
 		insert_wq_barrier(cwq, &barr, 1);
 		cwq->should_stop = 1;
 		alive = 1;
+		if (frozen(cwq->thread))
+			thaw(cwq->thread);
 	}
 	spin_unlock_irq(&cwq->lock);
 

Can you test with this?

Note that as Oleg commented, freezable workqueues are broken w/o his
patch here:

	http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755	

So you need to have Andrew/Linux pick up the above patch first to have
correctly functioning freezable workqueues.

-- 
Regards,
vatsa

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 13:17       ` Srivatsa Vaddagiri
@ 2007-02-28 13:27         ` Srivatsa Vaddagiri
  2007-02-28 17:41           ` Rafael J. Wysocki
  2007-02-28 17:40         ` Rafael J. Wysocki
  2007-02-28 19:17         ` Rafael J. Wysocki
  2 siblings, 1 reply; 45+ messages in thread
From: Srivatsa Vaddagiri @ 2007-02-28 13:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML, Oleg Nesterov

On Wed, Feb 28, 2007 at 06:47:21PM +0530, Srivatsa Vaddagiri wrote:
> --- workqueue.c.org	2007-02-28 18:32:48.000000000 +0530
> +++ workqueue.c	2007-02-28 18:44:23.000000000 +0530
> @@ -718,6 +718,8 @@ static void cleanup_workqueue_thread(str
>  		insert_wq_barrier(cwq, &barr, 1);
>  		cwq->should_stop = 1;
>  		alive = 1;
> +		if (frozen(cwq->thread))
> +			thaw(cwq->thread);

I meant thaw_process(cwq->thread) 

>  	}
>  	spin_unlock_irq(&cwq->lock);

-- 
Regards,
vatsa

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 13:17       ` Srivatsa Vaddagiri
  2007-02-28 13:27         ` Srivatsa Vaddagiri
@ 2007-02-28 17:40         ` Rafael J. Wysocki
  2007-02-28 19:17         ` Rafael J. Wysocki
  2 siblings, 0 replies; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 17:40 UTC (permalink / raw)
  To: vatsa; +Cc: Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML, Oleg Nesterov

On Wednesday, 28 February 2007 14:17, Srivatsa Vaddagiri wrote:
> On Wed, Feb 28, 2007 at 12:11:03PM +0100, Rafael J. Wysocki wrote:
> > > In addition to thawing worker thread before kthread_stopping it, there
> > > are minor changes required in worker threads, to check for
> > > is_cpu_offline(bind_cpu) when they come out of refrigerator and jump to
> > > wait_to_die if so (ex: softirq.c).
> > > 
> > > I guess you would need these changes before freezer-based hotplug is
> > > merged, in which case Gautham can send those patches out first.
> > 
> > Yes, please, if that's possible.
> 
> After looking at the current workqueue code, the above minor change I
> suggested is not required.
> 
> So you should be able to fix your "kthread_stop on a frozen worker
> thread hangs" problem by just a simple patch like this (against
> 2.6.20-mm2):
> 
> 
> --- workqueue.c.org	2007-02-28 18:32:48.000000000 +0530
> +++ workqueue.c	2007-02-28 18:44:23.000000000 +0530
> @@ -718,6 +718,8 @@ static void cleanup_workqueue_thread(str
>  		insert_wq_barrier(cwq, &barr, 1);
>  		cwq->should_stop = 1;
>  		alive = 1;
> +		if (frozen(cwq->thread))
> +			thaw(cwq->thread);
>  	}
>  	spin_unlock_irq(&cwq->lock);
>  
> 
> Can you test with this?

Sure, I will.

> Note that as Oleg commented, freezable workqueues are broken w/o his
> patch here:
> 
> 	http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755	
> 
> So you need to have Andrew/Linux pick up the above patch first to have
> correctly functioning freezable workqueues.

This one already is in -mm.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 13:27         ` Srivatsa Vaddagiri
@ 2007-02-28 17:41           ` Rafael J. Wysocki
  0 siblings, 0 replies; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 17:41 UTC (permalink / raw)
  To: vatsa; +Cc: Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML, Oleg Nesterov

On Wednesday, 28 February 2007 14:27, Srivatsa Vaddagiri wrote:
> On Wed, Feb 28, 2007 at 06:47:21PM +0530, Srivatsa Vaddagiri wrote:
> > --- workqueue.c.org	2007-02-28 18:32:48.000000000 +0530
> > +++ workqueue.c	2007-02-28 18:44:23.000000000 +0530
> > @@ -718,6 +718,8 @@ static void cleanup_workqueue_thread(str
> >  		insert_wq_barrier(cwq, &barr, 1);
> >  		cwq->should_stop = 1;
> >  		alive = 1;
> > +		if (frozen(cwq->thread))
> > +			thaw(cwq->thread);
> 
> I meant thaw_process(cwq->thread) 

I thought so, but thanks anyway. ;-)

> 
> >  	}
> >  	spin_unlock_irq(&cwq->lock);
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-27 23:36   ` Johannes Berg
  2007-02-28  0:00     ` Rafael J. Wysocki
@ 2007-02-28 18:06     ` Gautham R Shenoy
  1 sibling, 0 replies; 45+ messages in thread
From: Gautham R Shenoy @ 2007-02-28 18:06 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Oleg Nesterov, Rafael J. Wysocki, Pavel Machek, LKML, Srivatsa Vaddagiri

On Wed, Feb 28, 2007 at 12:36:52AM +0100, Johannes Berg wrote:
> On Wed, 2007-02-28 at 02:28 +0300, Oleg Nesterov wrote:
> 
> > Ugh. I know nothing, nothing, nothing about suspend. I'll try to guess.
> > 
> >    Commit: ed746e3b18f4df18afa3763155972c5835f284c5
> 
> > Yes? with the patch above, _cpu_down() called _after_ freeze_processes() ???
> 
> perfect :)
> See also my original mail and the thread:
> http://thread.gmane.org/gmane.linux.power-management.general/4314
> 
> > How about other kthread_stop()s ? For example, kernel/softirq.c:cpu_callback() ?
> 
> I'd they should be affected as well. For me, I guess I haven't run into
> them because xfs is enough to freeze the box ;) As for Rafael, I don't
> know why he isn't seeing that... I haven't been able to test this patch
> on my quad powermac so far unfortunately, been sick and away from the
> machine. I'll probably be able to test tomorrow.

They will be affected. In our first implemetentation of 
general "cpu_down after freeze"(http://lkml.org/lkml/2007/2/14/107)
we had a new state known CPU_DEAD_KILL_THREADS, 
the notifications for which were being sent
*after* we thawed the processes in __cpu_down.

However, in the version which we are working on, we are thawing processes
individually in CPU_DEAD before cleaning/stopping them.

I haven't observed any bad lockups/freeze chills with this approach. 
But I need to test it a bit more before posting them.

regards
gautham.
-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  3:07     ` Srivatsa Vaddagiri
  2007-02-28  8:48       ` Oleg Nesterov
@ 2007-02-28 18:17       ` Gautham R Shenoy
  2007-02-28 18:41         ` Rafael J. Wysocki
  1 sibling, 1 reply; 45+ messages in thread
From: Gautham R Shenoy @ 2007-02-28 18:17 UTC (permalink / raw)
  To: Srivatsa Vaddagiri
  Cc: Rafael J. Wysocki, Oleg Nesterov, Pavel Machek, Johannes Berg, LKML

On Wed, Feb 28, 2007 at 08:37:26AM +0530, Srivatsa Vaddagiri wrote:
> 
> Hmm ..good point. So can we assume that disable/enable_nonboot_cpus() are called
> with processes frozen already?
> 
> Gautham, you need to take this into account in your patchset!

Yup. That would mean making the freezer reentrant since we will
be freezing twice (once for suspend and later on for hotplug). This is
ok since the api in my patches looks like
freeze_processes(int freeze_event);

But thaw will be interesting. If we are thawing for hotplug, we gotta
only thaw processes which were frozen *only* for hotplug.

Rafael, does that mean more status flags?!

Thanks and Regards
gautham.

-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 18:17       ` Gautham R Shenoy
@ 2007-02-28 18:41         ` Rafael J. Wysocki
  0 siblings, 0 replies; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 18:41 UTC (permalink / raw)
  To: ego; +Cc: Srivatsa Vaddagiri, Oleg Nesterov, Pavel Machek, Johannes Berg, LKML

On Wednesday, 28 February 2007 19:17, Gautham R Shenoy wrote:
> On Wed, Feb 28, 2007 at 08:37:26AM +0530, Srivatsa Vaddagiri wrote:
> > 
> > Hmm ..good point. So can we assume that disable/enable_nonboot_cpus() are called
> > with processes frozen already?
> > 
> > Gautham, you need to take this into account in your patchset!
> 
> Yup. That would mean making the freezer reentrant since we will
> be freezing twice (once for suspend and later on for hotplug). This is
> ok since the api in my patches looks like
> freeze_processes(int freeze_event);
> 
> But thaw will be interesting. If we are thawing for hotplug, we gotta
> only thaw processes which were frozen *only* for hotplug.
> 
> Rafael, does that mean more status flags?!

Well, I don't really think so, but we need to store some information in the
freezer (eg. in a status variable).  Namely, we can define a variable, say
tasks_frozen, the value of which will be the bitwise or of the flags
SPE_SUSPEND, SPE_HOTPLUG etc.  In a fully functional system, tasks_frozen
is equal to zero.  If freeze_processes(SPE_SUSPEND) is run, it does
tasks_frozen |= SPE_SUSPEND and analogously for SPE_HOTPLUG etc.
If tasks_frozen is equal to SPE_SUSPEND|SPE_HOTPLUG,  for example, and
thaw_tasks(SPE_HOTPLUG) runs, it only thaws the tasks that need not stay frozen
for the suspend and does tasks_frozen &= ~SPE_SUSPEND etc.

I think something like this should work.

Greetings,
Rafael 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 13:17       ` Srivatsa Vaddagiri
  2007-02-28 13:27         ` Srivatsa Vaddagiri
  2007-02-28 17:40         ` Rafael J. Wysocki
@ 2007-02-28 19:17         ` Rafael J. Wysocki
  2007-02-28 19:32           ` Oleg Nesterov
  2 siblings, 1 reply; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 19:17 UTC (permalink / raw)
  To: vatsa; +Cc: Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML, Oleg Nesterov

On Wednesday, 28 February 2007 14:17, Srivatsa Vaddagiri wrote:
> On Wed, Feb 28, 2007 at 12:11:03PM +0100, Rafael J. Wysocki wrote:
> > > In addition to thawing worker thread before kthread_stopping it, there
> > > are minor changes required in worker threads, to check for
> > > is_cpu_offline(bind_cpu) when they come out of refrigerator and jump to
> > > wait_to_die if so (ex: softirq.c).
> > > 
> > > I guess you would need these changes before freezer-based hotplug is
> > > merged, in which case Gautham can send those patches out first.
> > 
> > Yes, please, if that's possible.
> 
> After looking at the current workqueue code, the above minor change I
> suggested is not required.
> 
> So you should be able to fix your "kthread_stop on a frozen worker
> thread hangs" problem by just a simple patch like this (against
> 2.6.20-mm2):
> 
> 
> --- workqueue.c.org	2007-02-28 18:32:48.000000000 +0530
> +++ workqueue.c	2007-02-28 18:44:23.000000000 +0530
> @@ -718,6 +718,8 @@ static void cleanup_workqueue_thread(str
>  		insert_wq_barrier(cwq, &barr, 1);
>  		cwq->should_stop = 1;
>  		alive = 1;
> +		if (frozen(cwq->thread))
> +			thaw(cwq->thread);
>  	}
>  	spin_unlock_irq(&cwq->lock);
>  
> 
> Can you test with this?

Unfortunately, the above code is mm-only.  Is the analogous fix for 2.6.21-rc2
viable?

Rafael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 19:17         ` Rafael J. Wysocki
@ 2007-02-28 19:32           ` Oleg Nesterov
  2007-02-28 19:43             ` Rafael J. Wysocki
  0 siblings, 1 reply; 45+ messages in thread
From: Oleg Nesterov @ 2007-02-28 19:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: vatsa, Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML

On 02/28, Rafael J. Wysocki wrote:
>
> > --- workqueue.c.org	2007-02-28 18:32:48.000000000 +0530
> > +++ workqueue.c	2007-02-28 18:44:23.000000000 +0530
> > @@ -718,6 +718,8 @@ static void cleanup_workqueue_thread(str
> >  		insert_wq_barrier(cwq, &barr, 1);
> >  		cwq->should_stop = 1;
> >  		alive = 1;
> > +		if (frozen(cwq->thread))
> > +			thaw(cwq->thread);
> >  	}
> >  	spin_unlock_irq(&cwq->lock);
>
> Unfortunately, the above code is mm-only.  Is the analogous fix for 2.6.21-rc2
> viable?

I am sorry, I lost track of this problem. As for 2.6.21, create_freezeable_workqueue
doesn't work and conflict with suspend. Why can't we remove it from XFS as you
suggested before?

Iirc,
	On 02/28, Nigel Cunningham wrote:
	>
	> On Wed, 2007-02-28 at 01:08 +0100, Rafael J. Wysocki wrote:
	> > On Wednesday, 28 February 2007 01:01, Johannes Berg wrote:
	> > > On Wed, 2007-02-28 at 00:57 +0100, Rafael J. Wysocki wrote:
	> > >
	> > > > Okay, in that case I'd suggest removing create_freezeable_workqueue() and
	> > > > make all workqueues nonfreezable once again for 2.6.21 (as far as I know, only
	> > > > the two XFS workqueues are affected).
	> > >
	> > > I think Nigel might object but I forgot what specific trouble XFS was
	> > > causing him.
	> >
	> > We suspected that the XFS' worker threads might commit I/O after
	> > freeze_processes() has returned, but that hasn't been supported by evidence,
	> > as far as I can recall.
	> >
	> > Also, making them freezable was controversial ...
	>
	> Controversy is no reason to give in! Nevertheless, I think you're right
	> - I believe the XFS guys said they fixed the issue that had caused I/O
	> to be submitted post-freeze. Well, we'll see if it appears again, won't
	> we?

Oleg.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 19:32           ` Oleg Nesterov
@ 2007-02-28 19:43             ` Rafael J. Wysocki
  2007-02-28 20:08               ` Oleg Nesterov
  0 siblings, 1 reply; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 19:43 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: vatsa, Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML

On Wednesday, 28 February 2007 20:32, Oleg Nesterov wrote:
> On 02/28, Rafael J. Wysocki wrote:
> >
> > > --- workqueue.c.org	2007-02-28 18:32:48.000000000 +0530
> > > +++ workqueue.c	2007-02-28 18:44:23.000000000 +0530
> > > @@ -718,6 +718,8 @@ static void cleanup_workqueue_thread(str
> > >  		insert_wq_barrier(cwq, &barr, 1);
> > >  		cwq->should_stop = 1;
> > >  		alive = 1;
> > > +		if (frozen(cwq->thread))
> > > +			thaw(cwq->thread);
> > >  	}
> > >  	spin_unlock_irq(&cwq->lock);
> >
> > Unfortunately, the above code is mm-only.  Is the analogous fix for 2.6.21-rc2
> > viable?
> 
> I am sorry, I lost track of this problem. As for 2.6.21, create_freezeable_workqueue
> doesn't work and conflict with suspend. Why can't we remove it from XFS as you
> suggested before?

Yes, we can (preparing a patch).  I was just curious. :-)

Rafael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 19:43             ` Rafael J. Wysocki
@ 2007-02-28 20:08               ` Oleg Nesterov
  2007-02-28 20:25                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 45+ messages in thread
From: Oleg Nesterov @ 2007-02-28 20:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: vatsa, Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML

On 02/28, Rafael J. Wysocki wrote:
>
> On Wednesday, 28 February 2007 20:32, Oleg Nesterov wrote:
> > 
> > I am sorry, I lost track of this problem. As for 2.6.21, create_freezeable_workqueue
> > doesn't work and conflict with suspend. Why can't we remove it from XFS as you
> > suggested before?
> 
> Yes, we can (preparing a patch).  I was just curious. :-)

OK, thanks.

We can (I think) do pretty much the same with some additional complications
in worker_thread() (check !cpu_online() after try_to_freeze() and break).

Oleg.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 20:08               ` Oleg Nesterov
@ 2007-02-28 20:25                 ` Rafael J. Wysocki
  2007-02-28 20:35                   ` Oleg Nesterov
  2007-02-28 21:16                   ` Problem with freezable workqueues Pavel Machek
  0 siblings, 2 replies; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 20:25 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: vatsa, Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML

On Wednesday, 28 February 2007 21:08, Oleg Nesterov wrote:
> On 02/28, Rafael J. Wysocki wrote:
> >
> > On Wednesday, 28 February 2007 20:32, Oleg Nesterov wrote:
> > > 
> > > I am sorry, I lost track of this problem. As for 2.6.21, create_freezeable_workqueue
> > > doesn't work and conflict with suspend. Why can't we remove it from XFS as you
> > > suggested before?
> > 
> > Yes, we can (preparing a patch).  I was just curious. :-)
> 
> OK, thanks.
> 
> We can (I think) do pretty much the same with some additional complications
> in worker_thread() (check !cpu_online() after try_to_freeze() and break).

Okay, but I've just finished the patch that removes the freezability of
workqueues (appended), so can we please do this in a separate one?

Rafael

---
Since freezable workqueues are broken in 2.6.21-rc
(cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
it's better to remove them altogether for 2.6.21 and change the only user of
them (XFS) accordingly.

---
 fs/xfs/linux-2.6/xfs_buf.c |    4 ++--
 include/linux/workqueue.h  |    8 +++-----
 kernel/workqueue.c         |   21 +++++++--------------
 3 files changed, 12 insertions(+), 21 deletions(-)

Index: linux-2.6.21-rc2/kernel/workqueue.c
===================================================================
--- linux-2.6.21-rc2.orig/kernel/workqueue.c
+++ linux-2.6.21-rc2/kernel/workqueue.c
@@ -59,7 +59,6 @@ struct cpu_workqueue_struct {
 
 	int run_depth;		/* Detect run_workqueue() recursion depth */
 
-	int freezeable;		/* Freeze the thread during suspend */
 } ____cacheline_aligned;
 
 /*
@@ -352,8 +351,7 @@ static int worker_thread(void *__cwq)
 	struct k_sigaction sa;
 	sigset_t blocked;
 
-	if (!cwq->freezeable)
-		current->flags |= PF_NOFREEZE;
+	current->flags |= PF_NOFREEZE;
 
 	set_user_nice(current, -5);
 
@@ -376,9 +374,6 @@ static int worker_thread(void *__cwq)
 
 	set_current_state(TASK_INTERRUPTIBLE);
 	while (!kthread_should_stop()) {
-		if (cwq->freezeable)
-			try_to_freeze();
-
 		add_wait_queue(&cwq->more_work, &wait);
 		if (list_empty(&cwq->worklist))
 			schedule();
@@ -454,8 +449,8 @@ void fastcall flush_workqueue(struct wor
 }
 EXPORT_SYMBOL_GPL(flush_workqueue);
 
-static struct task_struct *create_workqueue_thread(struct workqueue_struct *wq,
-						   int cpu, int freezeable)
+static struct task_struct
+*create_workqueue_thread(struct workqueue_struct *wq, int cpu)
 {
 	struct cpu_workqueue_struct *cwq = per_cpu_ptr(wq->cpu_wq, cpu);
 	struct task_struct *p;
@@ -465,7 +460,6 @@ static struct task_struct *create_workqu
 	cwq->thread = NULL;
 	cwq->insert_sequence = 0;
 	cwq->remove_sequence = 0;
-	cwq->freezeable = freezeable;
 	INIT_LIST_HEAD(&cwq->worklist);
 	init_waitqueue_head(&cwq->more_work);
 	init_waitqueue_head(&cwq->work_done);
@@ -480,8 +474,7 @@ static struct task_struct *create_workqu
 	return p;
 }
 
-struct workqueue_struct *__create_workqueue(const char *name,
-					    int singlethread, int freezeable)
+struct workqueue_struct *__create_workqueue(const char *name, int singlethread)
 {
 	int cpu, destroy = 0;
 	struct workqueue_struct *wq;
@@ -501,7 +494,7 @@ struct workqueue_struct *__create_workqu
 	mutex_lock(&workqueue_mutex);
 	if (singlethread) {
 		INIT_LIST_HEAD(&wq->list);
-		p = create_workqueue_thread(wq, singlethread_cpu, freezeable);
+		p = create_workqueue_thread(wq, singlethread_cpu);
 		if (!p)
 			destroy = 1;
 		else
@@ -509,7 +502,7 @@ struct workqueue_struct *__create_workqu
 	} else {
 		list_add(&wq->list, &workqueues);
 		for_each_online_cpu(cpu) {
-			p = create_workqueue_thread(wq, cpu, freezeable);
+			p = create_workqueue_thread(wq, cpu);
 			if (p) {
 				kthread_bind(p, cpu);
 				wake_up_process(p);
@@ -760,7 +753,7 @@ static int __devinit workqueue_cpu_callb
 		mutex_lock(&workqueue_mutex);
 		/* Create a new workqueue thread for it. */
 		list_for_each_entry(wq, &workqueues, list) {
-			if (!create_workqueue_thread(wq, hotcpu, 0)) {
+			if (!create_workqueue_thread(wq, hotcpu)) {
 				printk("workqueue for %i failed\n", hotcpu);
 				return NOTIFY_BAD;
 			}
Index: linux-2.6.21-rc2/include/linux/workqueue.h
===================================================================
--- linux-2.6.21-rc2.orig/include/linux/workqueue.h
+++ linux-2.6.21-rc2/include/linux/workqueue.h
@@ -159,11 +159,9 @@ struct execute_work {
 
 
 extern struct workqueue_struct *__create_workqueue(const char *name,
-						    int singlethread,
-						    int freezeable);
-#define create_workqueue(name) __create_workqueue((name), 0, 0)
-#define create_freezeable_workqueue(name) __create_workqueue((name), 0, 1)
-#define create_singlethread_workqueue(name) __create_workqueue((name), 1, 0)
+						    int singlethread);
+#define create_workqueue(name) __create_workqueue((name), 0)
+#define create_singlethread_workqueue(name) __create_workqueue((name), 1)
 
 extern void destroy_workqueue(struct workqueue_struct *wq);
 
Index: linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- linux-2.6.21-rc2.orig/fs/xfs/linux-2.6/xfs_buf.c
+++ linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
@@ -1829,11 +1829,11 @@ xfs_buf_init(void)
 	if (!xfs_buf_zone)
 		goto out_free_trace_buf;
 
-	xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
+	xfslogd_workqueue = create_workqueue("xfslogd");
 	if (!xfslogd_workqueue)
 		goto out_free_buf_zone;
 
-	xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
+	xfsdatad_workqueue = create_workqueue("xfsdatad");
 	if (!xfsdatad_workqueue)
 		goto out_destroy_xfslogd_workqueue;
 


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 20:25                 ` Rafael J. Wysocki
@ 2007-02-28 20:35                   ` Oleg Nesterov
  2007-02-28 22:39                     ` Rafael J. Wysocki
  2007-02-28 21:16                   ` Problem with freezable workqueues Pavel Machek
  1 sibling, 1 reply; 45+ messages in thread
From: Oleg Nesterov @ 2007-02-28 20:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: vatsa, Pavel Machek, Gautham R Shenoy, Johannes Berg, LKML

On 02/28, Rafael J. Wysocki wrote:
> 
> Okay, but I've just finished the patch that removes the freezability of
> workqueues (appended), so can we please do this in a separate one?

Please, please, no. This patch is of course correct, but it breaks _a lot_
of patches in -mm tree.

May I ask you to send just

> ===================================================================
> --- linux-2.6.21-rc2.orig/fs/xfs/linux-2.6/xfs_buf.c
> +++ linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
> @@ -1829,11 +1829,11 @@ xfs_buf_init(void)
>  	if (!xfs_buf_zone)
>  		goto out_free_trace_buf;
>  
> -	xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
> +	xfslogd_workqueue = create_workqueue("xfslogd");
>  	if (!xfslogd_workqueue)
>  		goto out_free_buf_zone;
>  
> -	xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
> +	xfsdatad_workqueue = create_workqueue("xfsdatad");
>  	if (!xfsdatad_workqueue)
>  		goto out_destroy_xfslogd_workqueue;
>  
> 

this bit?

After that, we can do the "removes the freezability of workqueues" patch
against -mm tree.

Oleg.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28  1:14         ` Nigel Cunningham
  2007-02-28 10:59           ` Rafael J. Wysocki
@ 2007-02-28 20:36           ` Johannes Berg
  1 sibling, 0 replies; 45+ messages in thread
From: Johannes Berg @ 2007-02-28 20:36 UTC (permalink / raw)
  To: nigel
  Cc: Rafael J. Wysocki, Oleg Nesterov, Pavel Machek, Gautham R Shenoy,
	LKML, Srivatsa Vaddagiri

[-- Attachment #1: Type: text/plain, Size: 464 bytes --]

On Wed, 2007-02-28 at 12:14 +1100, Nigel Cunningham wrote:

> Controversy is no reason to give in! Nevertheless, I think you're right
> - I believe the XFS guys said they fixed the issue that had caused I/O
> to be submitted post-freeze. Well, we'll see if it appears again, won't
> we?

I get to be the guinea pig, right? :P Unfortunately I was sick for the
better part of the past few days and can only test all this stuff early
next week.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 20:25                 ` Rafael J. Wysocki
  2007-02-28 20:35                   ` Oleg Nesterov
@ 2007-02-28 21:16                   ` Pavel Machek
  1 sibling, 0 replies; 45+ messages in thread
From: Pavel Machek @ 2007-02-28 21:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oleg Nesterov, vatsa, Gautham R Shenoy, Johannes Berg, LKML

Hi!

> > OK, thanks.
> > 
> > We can (I think) do pretty much the same with some additional complications
> > in worker_thread() (check !cpu_online() after try_to_freeze() and break).
> 
> Okay, but I've just finished the patch that removes the freezability of
> workqueues (appended), so can we please do this in a separate one?

Hmm, nothing obviously wrong with the patch (ACK), but xfs people
should ack this one, too: 'is it okay to let xfs run while suspending'
is not a trivial question.

> Since freezable workqueues are broken in 2.6.21-rc
> (cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
> http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
> it's better to remove them altogether for 2.6.21 and change the only user of
> them (XFS) accordingly.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 20:35                   ` Oleg Nesterov
@ 2007-02-28 22:39                     ` Rafael J. Wysocki
  2007-02-28 22:44                       ` Pavel Machek
  0 siblings, 1 reply; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 22:39 UTC (permalink / raw)
  To: Oleg Nesterov, Pavel Machek; +Cc: vatsa, Gautham R Shenoy, Johannes Berg, LKML

On Wednesday, 28 February 2007 21:35, Oleg Nesterov wrote:
> On 02/28, Rafael J. Wysocki wrote:
> > 
> > Okay, but I've just finished the patch that removes the freezability of
> > workqueues (appended), so can we please do this in a separate one?
> 
> Please, please, no. This patch is of course correct, but it breaks _a lot_
> of patches in -mm tree.
> 
> May I ask you to send just
> 
> > ===================================================================
> > --- linux-2.6.21-rc2.orig/fs/xfs/linux-2.6/xfs_buf.c
> > +++ linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
> > @@ -1829,11 +1829,11 @@ xfs_buf_init(void)
> >  	if (!xfs_buf_zone)
> >  		goto out_free_trace_buf;
> >  
> > -	xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
> > +	xfslogd_workqueue = create_workqueue("xfslogd");
> >  	if (!xfslogd_workqueue)
> >  		goto out_free_buf_zone;
> >  
> > -	xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
> > +	xfsdatad_workqueue = create_workqueue("xfsdatad");
> >  	if (!xfsdatad_workqueue)
> >  		goto out_destroy_xfslogd_workqueue;
> >  
> > 
> 
> this bit?
> 
> After that, we can do the "removes the freezability of workqueues" patch
> against -mm tree.

Okay, if that's better.

Pavel, is that acceptable to you?

Rafael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-28 22:39                     ` Rafael J. Wysocki
@ 2007-02-28 22:44                       ` Pavel Machek
  2007-02-28 23:54                         ` [PATCH] Make XFS workqueues nonfreezable Rafael J. Wysocki
  0 siblings, 1 reply; 45+ messages in thread
From: Pavel Machek @ 2007-02-28 22:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oleg Nesterov, vatsa, Gautham R Shenoy, Johannes Berg, LKML

On Wed 2007-02-28 23:39:30, Rafael J. Wysocki wrote:
> On Wednesday, 28 February 2007 21:35, Oleg Nesterov wrote:
> > On 02/28, Rafael J. Wysocki wrote:
> > > 
> > > Okay, but I've just finished the patch that removes the freezability of
> > > workqueues (appended), so can we please do this in a separate one?
> > 
> > Please, please, no. This patch is of course correct, but it breaks _a lot_
> > of patches in -mm tree.
> > 
> > May I ask you to send just
> > 
> > > ===================================================================
> > > --- linux-2.6.21-rc2.orig/fs/xfs/linux-2.6/xfs_buf.c
> > > +++ linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
> > > @@ -1829,11 +1829,11 @@ xfs_buf_init(void)
> > >  	if (!xfs_buf_zone)
> > >  		goto out_free_trace_buf;
> > >  
> > > -	xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
> > > +	xfslogd_workqueue = create_workqueue("xfslogd");
> > >  	if (!xfslogd_workqueue)
> > >  		goto out_free_buf_zone;
> > >  
> > > -	xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
> > > +	xfsdatad_workqueue = create_workqueue("xfsdatad");
> > >  	if (!xfsdatad_workqueue)
> > >  		goto out_destroy_xfslogd_workqueue;
> > >  
> > > 
> > 
> > this bit?
> > 
> > After that, we can do the "removes the freezability of workqueues" patch
> > against -mm tree.
> 
> Okay, if that's better.
> 
> Pavel, is that acceptable to you?

No problem, but get that acked-by: from XFS people ;-).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH] Make XFS workqueues nonfreezable
  2007-02-28 22:44                       ` Pavel Machek
@ 2007-02-28 23:54                         ` Rafael J. Wysocki
  2007-03-01  8:03                           ` Andrew Morton
  0 siblings, 1 reply; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-02-28 23:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Pavel Machek, Oleg Nesterov, vatsa, Gautham R Shenoy,
	Johannes Berg, LKML, David Chinner, Nigel Cunningham

Since freezable workqueues are broken in 2.6.21-rc
(cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
it's better to change the only user of them, which is XFS, to use "normal"
nonfreezable workqueues.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 fs/xfs/linux-2.6/xfs_buf.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- linux-2.6.21-rc2.orig/fs/xfs/linux-2.6/xfs_buf.c
+++ linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
@@ -1829,11 +1829,11 @@ xfs_buf_init(void)
 	if (!xfs_buf_zone)
 		goto out_free_trace_buf;
 
-	xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
+	xfslogd_workqueue = create_workqueue("xfslogd");
 	if (!xfslogd_workqueue)
 		goto out_free_buf_zone;
 
-	xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
+	xfsdatad_workqueue = create_workqueue("xfsdatad");
 	if (!xfsdatad_workqueue)
 		goto out_destroy_xfslogd_workqueue;
 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Make XFS workqueues nonfreezable
  2007-02-28 23:54                         ` [PATCH] Make XFS workqueues nonfreezable Rafael J. Wysocki
@ 2007-03-01  8:03                           ` Andrew Morton
  2007-03-01  9:15                             ` Pavel Machek
  0 siblings, 1 reply; 45+ messages in thread
From: Andrew Morton @ 2007-03-01  8:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Oleg Nesterov, vatsa, Gautham R Shenoy,
	Johannes Berg, LKML, David Chinner, Nigel Cunningham

On Thu, 1 Mar 2007 00:54:32 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> Since freezable workqueues are broken in 2.6.21-rc
> (cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
> http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
> it's better to change the only user of them, which is XFS, to use "normal"
> nonfreezable workqueues.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  fs/xfs/linux-2.6/xfs_buf.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
> ===================================================================
> --- linux-2.6.21-rc2.orig/fs/xfs/linux-2.6/xfs_buf.c
> +++ linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
> @@ -1829,11 +1829,11 @@ xfs_buf_init(void)
>  	if (!xfs_buf_zone)
>  		goto out_free_trace_buf;
>  
> -	xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
> +	xfslogd_workqueue = create_workqueue("xfslogd");
>  	if (!xfslogd_workqueue)
>  		goto out_free_buf_zone;
>  
> -	xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
> +	xfsdatad_workqueue = create_workqueue("xfsdatad");
>  	if (!xfsdatad_workqueue)
>  		goto out_destroy_xfslogd_workqueue;
>  

Won't this break suspend+XFS?

If so, and given that nobody seems to be reporting this deadlock, perhaps
we'd be better off leaving things as-is for the while?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Make XFS workqueues nonfreezable
  2007-03-01  8:03                           ` Andrew Morton
@ 2007-03-01  9:15                             ` Pavel Machek
  2007-03-01  9:25                               ` Andrew Morton
  0 siblings, 1 reply; 45+ messages in thread
From: Pavel Machek @ 2007-03-01  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rafael J. Wysocki, Oleg Nesterov, vatsa, Gautham R Shenoy,
	Johannes Berg, LKML, David Chinner, Nigel Cunningham

Hi!

> > Since freezable workqueues are broken in 2.6.21-rc
> > (cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
> > it's better to change the only user of them, which is XFS, to use "normal"
> > nonfreezable workqueues.
> > 
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > ---
> >  fs/xfs/linux-2.6/xfs_buf.c |    4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > Index: linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
> > ===================================================================
> > --- linux-2.6.21-rc2.orig/fs/xfs/linux-2.6/xfs_buf.c
> > +++ linux-2.6.21-rc2/fs/xfs/linux-2.6/xfs_buf.c
> > @@ -1829,11 +1829,11 @@ xfs_buf_init(void)
> >  	if (!xfs_buf_zone)
> >  		goto out_free_trace_buf;
> >  
> > -	xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
> > +	xfslogd_workqueue = create_workqueue("xfslogd");
> >  	if (!xfslogd_workqueue)
> >  		goto out_free_buf_zone;
> >  
> > -	xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
> > +	xfsdatad_workqueue = create_workqueue("xfsdatad");
> >  	if (!xfsdatad_workqueue)
> >  		goto out_destroy_xfslogd_workqueue;
> >  
> 
> Won't this break suspend+XFS?
> 
> If so, and given that nobody seems to be reporting this deadlock, perhaps
> we'd be better off leaving things as-is for the while?

Worst case is not breaking suspend+XFS, worst case is XFS writing to
disk after freeze(), leading to subtle fs corruption.

(But noone could reproduce corruption before, and I was told XFS will
not do those writes these days).
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH] Make XFS workqueues nonfreezable
  2007-03-01  9:15                             ` Pavel Machek
@ 2007-03-01  9:25                               ` Andrew Morton
  0 siblings, 0 replies; 45+ messages in thread
From: Andrew Morton @ 2007-03-01  9:25 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Rafael J. Wysocki, Oleg Nesterov, vatsa, Gautham R Shenoy,
	Johannes Berg, LKML, David Chinner, Nigel Cunningham

On Thu, 1 Mar 2007 10:15:21 +0100 Pavel Machek <pavel@ucw.cz> wrote:

> > > -	xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
> > > +	xfslogd_workqueue = create_workqueue("xfslogd");
> > >  	if (!xfslogd_workqueue)
> > >  		goto out_free_buf_zone;
> > >  
> > > -	xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
> > > +	xfsdatad_workqueue = create_workqueue("xfsdatad");
> > >  	if (!xfsdatad_workqueue)
> > >  		goto out_destroy_xfslogd_workqueue;
> > >  
> > 
> > Won't this break suspend+XFS?
> > 
> > If so, and given that nobody seems to be reporting this deadlock, perhaps
> > we'd be better off leaving things as-is for the while?
> 
> Worst case is not breaking suspend+XFS, worst case is XFS writing to
> disk after freeze(), leading to subtle fs corruption.
> 
> (But noone could reproduce corruption before, and I was told XFS will
> not do those writes these days).

hm, OK.   To avoid making a decision I sent the patch to David ;)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-02-27 21:51 Problem with freezable workqueues Rafael J. Wysocki
  2007-02-27 23:28 ` Oleg Nesterov
  2007-02-28  3:01 ` Srivatsa Vaddagiri
@ 2007-03-06  0:30 ` Johannes Berg
  2007-03-06 20:31   ` Rafael J. Wysocki
  2 siblings, 1 reply; 45+ messages in thread
From: Johannes Berg @ 2007-03-06  0:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Gautham R Shenoy, LKML, Oleg Nesterov, Srivatsa Vaddagiri

[-- Attachment #1: Type: text/plain, Size: 801 bytes --]

On Tue, 2007-02-27 at 22:51 +0100, Rafael J. Wysocki wrote:

> For 2.6.21-rc1 I've invented the appended workaround (works for me, waiting for
> Johannes to confirm it works for him too), but I think we need something better
> for -mm and future kernels.

Finally I could get back to this but after reading the thread I figured
it might not be necessary to test this. Please let me know ASAP if you
want this patch tested as well or it'll take quite a long time (going
skiing for a week on Saturday)

In any case, I made the two xfs workqueues non-freezable and everything
on my quad powermac works again, I also couldn't detect any filesystem
correction. I wanted to adapt the BUG_ON(block IO not from suspend code)
patch from suspend2 but haven't gotten around to it yet.

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-03-06  0:30 ` Johannes Berg
@ 2007-03-06 20:31   ` Rafael J. Wysocki
  2007-03-06 22:25     ` Nigel Cunningham
  2007-03-07 23:10     ` Johannes Berg
  0 siblings, 2 replies; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-03-06 20:31 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Pavel Machek, Gautham R Shenoy, LKML, Oleg Nesterov, Srivatsa Vaddagiri

Hi,

On Tuesday, 6 March 2007 01:30, Johannes Berg wrote:
> On Tue, 2007-02-27 at 22:51 +0100, Rafael J. Wysocki wrote:
> 
> > For 2.6.21-rc1 I've invented the appended workaround (works for me, waiting for
> > Johannes to confirm it works for him too), but I think we need something better
> > for -mm and future kernels.
> 
> Finally I could get back to this but after reading the thread I figured
> it might not be necessary to test this. Please let me know ASAP if you
> want this patch tested as well or it'll take quite a long time (going
> skiing for a week on Saturday)

I think it won't be necessary.

For now, we have decided to make the workqueues nonfreezable (the patch for
that has already been merged, AFAICT).

> In any case, I made the two xfs workqueues non-freezable and everything
> on my quad powermac works again, I also couldn't detect any filesystem
> correction.

Good, thanks for the confirmation.

> I wanted to adapt the BUG_ON(block IO not from suspend code) 
> patch from suspend2 but haven't gotten around to it yet.

That might be a good idea for other reasons too, but I'd prefer WARN_ON()
instead of BUG_ON() when you're at it. ;-)

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-03-06 20:31   ` Rafael J. Wysocki
@ 2007-03-06 22:25     ` Nigel Cunningham
  2007-03-06 22:57       ` Rafael J. Wysocki
  2007-03-07 23:10     ` Johannes Berg
  1 sibling, 1 reply; 45+ messages in thread
From: Nigel Cunningham @ 2007-03-06 22:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Johannes Berg, Pavel Machek, Gautham R Shenoy, LKML,
	Oleg Nesterov, Srivatsa Vaddagiri

Hi.

On Tue, 2007-03-06 at 21:31 +0100, Rafael J. Wysocki wrote:
> Hi,
> 
> On Tuesday, 6 March 2007 01:30, Johannes Berg wrote:
> > On Tue, 2007-02-27 at 22:51 +0100, Rafael J. Wysocki wrote:
> > 
> > > For 2.6.21-rc1 I've invented the appended workaround (works for me, waiting for
> > > Johannes to confirm it works for him too), but I think we need something better
> > > for -mm and future kernels.
> > 
> > Finally I could get back to this but after reading the thread I figured
> > it might not be necessary to test this. Please let me know ASAP if you
> > want this patch tested as well or it'll take quite a long time (going
> > skiing for a week on Saturday)
> 
> I think it won't be necessary.
> 
> For now, we have decided to make the workqueues nonfreezable (the patch for
> that has already been merged, AFAICT).
> 
> > In any case, I made the two xfs workqueues non-freezable and everything
> > on my quad powermac works again, I also couldn't detect any filesystem
> > correction.
> 
> Good, thanks for the confirmation.
> 
> > I wanted to adapt the BUG_ON(block IO not from suspend code) 
> > patch from suspend2 but haven't gotten around to it yet.
> 
> That might be a good idea for other reasons too, but I'd prefer WARN_ON()
> instead of BUG_ON() when you're at it. ;-)

I made it BUG_ON() because if Suspend2 is running any I/O coming from
another source besides Suspend2 may be I/O on a page that's been used
for the atomic copy, and in that case it would definitely be bad to
write it to disk. If swsusp is running, the BUG_ON() won't trigger IIRC.

Regards,

Nigel


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-03-06 22:25     ` Nigel Cunningham
@ 2007-03-06 22:57       ` Rafael J. Wysocki
  0 siblings, 0 replies; 45+ messages in thread
From: Rafael J. Wysocki @ 2007-03-06 22:57 UTC (permalink / raw)
  To: nigel
  Cc: Johannes Berg, Pavel Machek, Gautham R Shenoy, LKML,
	Oleg Nesterov, Srivatsa Vaddagiri

Hi,

On Tuesday, 6 March 2007 23:25, Nigel Cunningham wrote:
> On Tue, 2007-03-06 at 21:31 +0100, Rafael J. Wysocki wrote:
> > On Tuesday, 6 March 2007 01:30, Johannes Berg wrote:
> > > On Tue, 2007-02-27 at 22:51 +0100, Rafael J. Wysocki wrote:
> > > 
> > > > For 2.6.21-rc1 I've invented the appended workaround (works for me, waiting for
> > > > Johannes to confirm it works for him too), but I think we need something better
> > > > for -mm and future kernels.
> > > 
> > > Finally I could get back to this but after reading the thread I figured
> > > it might not be necessary to test this. Please let me know ASAP if you
> > > want this patch tested as well or it'll take quite a long time (going
> > > skiing for a week on Saturday)
> > 
> > I think it won't be necessary.
> > 
> > For now, we have decided to make the workqueues nonfreezable (the patch for
> > that has already been merged, AFAICT).
> > 
> > > In any case, I made the two xfs workqueues non-freezable and everything
> > > on my quad powermac works again, I also couldn't detect any filesystem
> > > correction.
> > 
> > Good, thanks for the confirmation.
> > 
> > > I wanted to adapt the BUG_ON(block IO not from suspend code) 
> > > patch from suspend2 but haven't gotten around to it yet.
> > 
> > That might be a good idea for other reasons too, but I'd prefer WARN_ON()
> > instead of BUG_ON() when you're at it. ;-)
> 
> I made it BUG_ON() because if Suspend2 is running any I/O coming from
> another source besides Suspend2 may be I/O on a page that's been used
> for the atomic copy, and in that case it would definitely be bad to
> write it to disk. If swsusp is running, the BUG_ON() won't trigger IIRC.

Okay, but I thought the idea was to use something similar that would trigger
for swsusp too (we would have to skip the IO from the userland suspend somehow,
BTW), and that I'd prefer to be WARN_ON() rather than BUG_ON().

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Problem with freezable workqueues
  2007-03-06 20:31   ` Rafael J. Wysocki
  2007-03-06 22:25     ` Nigel Cunningham
@ 2007-03-07 23:10     ` Johannes Berg
  1 sibling, 0 replies; 45+ messages in thread
From: Johannes Berg @ 2007-03-07 23:10 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Gautham R Shenoy, LKML, Oleg Nesterov, Srivatsa Vaddagiri

[-- Attachment #1: Type: text/plain, Size: 594 bytes --]

On Tue, 2007-03-06 at 21:31 +0100, Rafael J. Wysocki wrote:

> For now, we have decided to make the workqueues nonfreezable (the patch for
> that has already been merged, AFAICT).

It isn't in 2.6.21-rc3.

> > I wanted to adapt the BUG_ON(block IO not from suspend code) 
> > patch from suspend2 but haven't gotten around to it yet.
> 
> That might be a good idea for other reasons too, but I'd prefer WARN_ON()
> instead of BUG_ON() when you're at it. ;-)

I won't be able to do that for quite a while... It's just on my "future
projects that would be nice" list :P

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2007-03-08 15:32 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-27 21:51 Problem with freezable workqueues Rafael J. Wysocki
2007-02-27 23:28 ` Oleg Nesterov
2007-02-27 23:36   ` Johannes Berg
2007-02-28  0:00     ` Rafael J. Wysocki
2007-02-28  0:00       ` Johannes Berg
2007-02-28 18:06     ` Gautham R Shenoy
2007-02-27 23:57   ` Rafael J. Wysocki
2007-02-28  0:01     ` Johannes Berg
2007-02-28  0:08       ` Rafael J. Wysocki
2007-02-28  1:14         ` Nigel Cunningham
2007-02-28 10:59           ` Rafael J. Wysocki
2007-02-28 20:36           ` Johannes Berg
2007-02-28  3:07     ` Srivatsa Vaddagiri
2007-02-28  8:48       ` Oleg Nesterov
2007-02-28  9:10         ` Srivatsa Vaddagiri
2007-02-28  9:43           ` Oleg Nesterov
2007-02-28 11:09           ` Rafael J. Wysocki
2007-02-28 18:17       ` Gautham R Shenoy
2007-02-28 18:41         ` Rafael J. Wysocki
2007-02-28  8:54     ` Pavel Machek
2007-02-28  3:01 ` Srivatsa Vaddagiri
2007-02-28  3:51   ` Srivatsa Vaddagiri
2007-02-28 11:11     ` Rafael J. Wysocki
2007-02-28 13:17       ` Srivatsa Vaddagiri
2007-02-28 13:27         ` Srivatsa Vaddagiri
2007-02-28 17:41           ` Rafael J. Wysocki
2007-02-28 17:40         ` Rafael J. Wysocki
2007-02-28 19:17         ` Rafael J. Wysocki
2007-02-28 19:32           ` Oleg Nesterov
2007-02-28 19:43             ` Rafael J. Wysocki
2007-02-28 20:08               ` Oleg Nesterov
2007-02-28 20:25                 ` Rafael J. Wysocki
2007-02-28 20:35                   ` Oleg Nesterov
2007-02-28 22:39                     ` Rafael J. Wysocki
2007-02-28 22:44                       ` Pavel Machek
2007-02-28 23:54                         ` [PATCH] Make XFS workqueues nonfreezable Rafael J. Wysocki
2007-03-01  8:03                           ` Andrew Morton
2007-03-01  9:15                             ` Pavel Machek
2007-03-01  9:25                               ` Andrew Morton
2007-02-28 21:16                   ` Problem with freezable workqueues Pavel Machek
2007-03-06  0:30 ` Johannes Berg
2007-03-06 20:31   ` Rafael J. Wysocki
2007-03-06 22:25     ` Nigel Cunningham
2007-03-06 22:57       ` Rafael J. Wysocki
2007-03-07 23:10     ` Johannes Berg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).