LKML Archive on lore.kernel.org
* [PATCH] Remove process freezer from suspend to RAM pathway
@ 2007-07-03  4:29 Matthew Garrett
  2007-07-03  4:54 ` Nigel Cunningham
                   ` (7 more replies)
  0 siblings, 8 replies; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03  4:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-pm

Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
a screaming nightmare - either the suspend fails because syslog (for 
instance) can't be frozen, or the machine deadlocks for some other 
reason I haven't tracked down. We could "fix" fuse, or alternatively we 
could do what we do for suspend to RAM on other platforms (PPC and APM) 
and just not use the freezer.

Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>

diff --git a/kernel/power/main.c b/kernel/power/main.c
index 8812985..5f109d5 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -19,7 +19,6 @@
 #include <linux/console.h>
 #include <linux/cpu.h>
 #include <linux/resume-trace.h>
-#include <linux/freezer.h>
 #include <linux/vmstat.h>
 
 #include "power.h"
@@ -81,11 +80,6 @@ static int suspend_prepare(suspend_state_t state)
 
 	pm_prepare_console();
 
-	if (freeze_processes()) {
-		error = -EAGAIN;
-		goto Thaw;
-	}
-
 	if ((free_pages = global_page_state(NR_FREE_PAGES))
 			< FREE_PAGE_NUMBER) {
 		pr_debug("PM: free some memory\n");
@@ -93,7 +87,7 @@ static int suspend_prepare(suspend_state_t state)
 		if (nr_free_pages() < FREE_PAGE_NUMBER) {
 			error = -ENOMEM;
 			printk(KERN_ERR "PM: No enough memory\n");
-			goto Thaw;
+			goto Restore_console;
 		}
 	}
 
@@ -118,8 +112,7 @@ static int suspend_prepare(suspend_state_t state)
 	device_resume();
  Resume_console:
 	resume_console();
- Thaw:
-	thaw_processes();
+ Restore_console:
 	pm_restore_console();
 	return error;
 }
@@ -170,7 +163,6 @@ static void suspend_finish(suspend_state_t state)
 	pm_finish(state);
 	device_resume();
 	resume_console();
-	thaw_processes();
 	pm_restore_console();
 }

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  4:29 [PATCH] Remove process freezer from suspend to RAM pathway Matthew Garrett
@ 2007-07-03  4:54 ` Nigel Cunningham
  2007-07-03  5:21   ` Matthew Garrett
  2007-07-03  5:48   ` Benjamin Herrenschmidt
  2007-07-03  5:49 ` [linux-pm] " Benjamin Herrenschmidt
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-03  4:54 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tuesday 03 July 2007 14:29:18 Matthew Garrett wrote:
> Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> a screaming nightmare - either the suspend fails because syslog (for 
> instance) can't be frozen, or the machine deadlocks for some other 
> reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> could do what we do for suspend to RAM on other platforms (PPC and APM) 
> and just not use the freezer.
> 
> Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>

Note, though, that this won't help at all when people use the "suspend-to-ram 
instead of powering down after writing a hibernation image" feature in 
(uswsusp | tuxonice). Fuse is just a broken idea in the first place, but 
given that it exists, we still need to find the underlying cause.

Regards,

Nigel


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  4:54 ` Nigel Cunningham
@ 2007-07-03  5:21   ` Matthew Garrett
  2007-07-03  5:24     ` Nigel Cunningham
  2007-07-03  5:48   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03  5:21 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: linux-kernel, linux-pm

On Tue, Jul 03, 2007 at 02:54:41PM +1000, Nigel Cunningham wrote:

> Note, though, that this won't help at all when people use the "suspend-to-ram 
> instead of powering down after writing a hibernation image" feature in 
> (uswsusp | tuxonice). Fuse is just a broken idea in the first place, but 
> given that it exists, we still need to find the underlying cause.

If / is on fuse it's unlikely that hibernation is high on your list of 
priorities right now.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  5:21   ` Matthew Garrett
@ 2007-07-03  5:24     ` Nigel Cunningham
  0 siblings, 0 replies; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-03  5:24 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

Hi.

On Tuesday 03 July 2007 15:21:30 Matthew Garrett wrote:
> On Tue, Jul 03, 2007 at 02:54:41PM +1000, Nigel Cunningham wrote:
> 
> > Note, though, that this won't help at all when people use the "suspend-to-ram 
> > instead of powering down after writing a hibernation image" feature in 
> > (uswsusp | tuxonice). Fuse is just a broken idea in the first place, but 
> > given that it exists, we still need to find the underlying cause.
> 
> If / is on fuse it's unlikely that hibernation is high on your list of 
> priorities right now.

Yeah, well... what can you say to that? :)

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  4:54 ` Nigel Cunningham
  2007-07-03  5:21   ` Matthew Garrett
@ 2007-07-03  5:48   ` Benjamin Herrenschmidt
  2007-07-03  6:08     ` Nigel Cunningham
  2007-07-05  0:03     ` Pavel Machek
  1 sibling, 2 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03  5:48 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: Matthew Garrett, linux-kernel, linux-pm


> Note, though, that this won't help at all when people use the "suspend-to-ram 
> instead of powering down after writing a hibernation image" feature in 
> (uswsusp | tuxonice). Fuse is just a broken idea in the first place, but 
> given that it exists, we still need to find the underlying cause.

No, Fuse is not a broken idea in the first place. It's the freezer that
is a totally broken idea. It has proven many times to be racy by design
and cannot be made right. The usermode helper mess is just part of
that, fuse is another example, etc etc ...

So I think Matthew is totally right. In fact, the presence of the
freezer is the main reason why Paulus so far NACKed Johannes' attempts at
merging the PPC PM code with the generic code in kernel/power.c

We've been doing fine without it so far and intend to continue to do so.

As for suspend-to-disk, I refer you to the discussions we had in the
past with Linus, where he explains I think quite clearly how wrong the
current implementation of STR is :-)

Thing is, if you're going to do snapshots, you should probably not sync
after you have "frozen" anyway.

Cheers,
Ben.




* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  4:29 [PATCH] Remove process freezer from suspend to RAM pathway Matthew Garrett
  2007-07-03  4:54 ` Nigel Cunningham
@ 2007-07-03  5:49 ` Benjamin Herrenschmidt
  2007-07-03 13:07   ` Rafael J. Wysocki
  2007-07-03  5:51 ` Benjamin Herrenschmidt
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03  5:49 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> a screaming nightmare - either the suspend fails because syslog (for 
> instance) can't be frozen, or the machine deadlocks for some other 
> reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> could do what we do for suspend to RAM on other platforms (PPC and APM) 
> and just not use the freezer.
> 
> Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

(And with much pleasure :-)

Cheers,
Ben.




* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  4:29 [PATCH] Remove process freezer from suspend to RAM pathway Matthew Garrett
  2007-07-03  4:54 ` Nigel Cunningham
  2007-07-03  5:49 ` [linux-pm] " Benjamin Herrenschmidt
@ 2007-07-03  5:51 ` Benjamin Herrenschmidt
  2007-07-03 13:08   ` Rafael J. Wysocki
  2007-07-03  6:13 ` Oliver Neukum
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03  5:51 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> a screaming nightmare - either the suspend fails because syslog (for 
> instance) can't be frozen, or the machine deadlocks for some other 
> reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> could do what we do for suspend to RAM on other platforms (PPC and APM) 
> and just not use the freezer.

The main reason for the deadlocks is that we do a sys_sync() after the
freeze, which we shouldn't do.

Ben.




* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  5:48   ` Benjamin Herrenschmidt
@ 2007-07-03  6:08     ` Nigel Cunningham
  2007-07-03  7:19       ` Benjamin Herrenschmidt
  2007-07-05  0:03     ` Pavel Machek
  1 sibling, 1 reply; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-03  6:08 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Matthew Garrett, linux-kernel, linux-pm

Hi.

On Tuesday 03 July 2007 15:48:26 Benjamin Herrenschmidt wrote:
> 
> > Note, though, that this won't help at all when people use the "suspend-to-ram 
> > instead of powering down after writing a hibernation image" feature in 
> > (uswsusp | tuxonice). Fuse is just a broken idea in the first place, but 
> > given that it exists, we still need to find the underlying cause.
> 
> No, Fuse is not a broken idea in the first place. It's the freezer that
> is a totally broken idea. It has proven many times to be racy by design
> and cannot be made right. The usermode helper mess is just part of
> that, fuse is another example, etc etc ...

To some extent, I agree. I think the ideal solution would be to simply not 
schedule processes that are supposed to be frozen. But who wants to play with 
scheduler code? Not me!

> So I think Matthew is totally right. In fact, the presence of the
> freezer is the main reason why Paulus so far NACKed Johannes' attempts at
> merging the PPC PM code with the generic code in kernel/power.c
> 
> We've been doing fine without it so far and intend to continue to do so.

Fuse depends on !PPC?
 
> As for suspend-to-disk, I refer you to the discussions we had in the
> past with Linus, where he explains I think quite clearly how wrong the
> current implementation of STR is :-)

I assume you mean STD. The problem there is that Linus doesn't care about STD. 
If he did, I dare say he'd think through the issues more thoroughly than he 
apparently has.
 
> Thing is, if you're going to do snapshots, you should probably not sync
> after you have "frozen" anyway.

Fully agree. But how do you stop things syncing while you're writing the image 
if you don't have a freezer or equivalent? (scheduler based, kexec.. they're 
all workarounds for this issue).

Regards,

Nigel
-- 
Nigel, Michelle and Alisdair Cunningham
5 Mitchell Street
Cobden 3266
Victoria, Australia


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  4:29 [PATCH] Remove process freezer from suspend to RAM pathway Matthew Garrett
                   ` (2 preceding siblings ...)
  2007-07-03  5:51 ` Benjamin Herrenschmidt
@ 2007-07-03  6:13 ` Oliver Neukum
  2007-07-03  6:51   ` Miklos Szeredi
  2007-07-03 12:13   ` Matthew Garrett
  2007-07-03  7:37 ` Romano Giannetti
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03  6:13 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tuesday, 3 July 2007, Matthew Garrett wrote:
> Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> a screaming nightmare - either the suspend fails because syslog (for 
> instance) can't be frozen, or the machine deadlocks for some other 
> reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> could do what we do for suspend to RAM on other platforms (PPC and APM) 
> and just not use the freezer.

Only if you want to audit all character devices' read() and write()
methods for races against suspend().
/ on fuse is a bad idea.

	Regards
		Oliver



* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  6:13 ` Oliver Neukum
@ 2007-07-03  6:51   ` Miklos Szeredi
  2007-07-03 12:13   ` Matthew Garrett
  1 sibling, 0 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-03  6:51 UTC (permalink / raw)
  To: oliver; +Cc: mjg59, linux-kernel, linux-pm

> > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > a screaming nightmare - either the suspend fails because syslog (for 
> > instance) can't be frozen, or the machine deadlocks for some other 
> > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > and just not use the freezer.
> 
> Only if you want to audit all character devices' read() and write()
> methods for races against suspend().
> / on fuse is a bad idea.

What makes / special?  Why aren't all fuse filesystems affected?
Suspend isn't trying to do I/O on the root fs, is it?

Miklos


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  6:08     ` Nigel Cunningham
@ 2007-07-03  7:19       ` Benjamin Herrenschmidt
  2007-07-03  7:44         ` [linux-pm] " Oliver Neukum
  2007-07-03 12:56         ` Rafael J. Wysocki
  0 siblings, 2 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03  7:19 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: Matthew Garrett, linux-kernel, linux-pm

On Tue, 2007-07-03 at 16:08 +1000, Nigel Cunningham wrote:
> 
> > So I think Matthew is totally right. In fact, the presence of the
> > freezer is the main reason why Paulus so far NACKed Johannes' attempts at
> > merging the PPC PM code with the generic code in kernel/power.c
> > 
> > We've been doing fine without it so far and intend to continue to do so.
> 
> Fuse depends on !PPC?

No, that's not what I'm saying. I'm saying we've been doing STR without
the freezer and that's the way to go imho.

> > As for suspend-to-disk, I refer you to the discussions we had in the
> > past with Linus, where he explains I think quite clearly how wrong the
> > current implementation of STR is :-)
> 
> I assume you mean STD.

Oops, yeah, sorry.

> The problem there is that Linus doesn't care about STD. 
> If he did, I dare say he'd think through the issues more thoroughly than he 
> apparently has.

Heh, that might be the case :-)
 
> > Thing is, if you're going to do snapshots, you should probably not sync
> > after you have "frozen" anyway.
> 
> Fully agree. But how do you stop things syncing while you're writing the image 
> if you don't have a freezer or equivalent? (scheduler based, kexec.. they're 
> all workarounds for this issue).

Well, I was saying that in the context of the -current- snapshotting
mechanism which is based on the freezer, then you should not
sys_sync(). 

Some random user or kernel thread doing a sync is not a problem. It will
stop in the middle of sync and resume on wakeup.

The problem is currently because STD -itself- attempts to sync after it
has frozen things.

I think that should be changed. If you want to sync for whatever reason,
(mostly to save RAM?) do it before the freeze. That means you may get new
dirty data in memory that isn't written out by the sync before you
freeze, but that's all right, that data will be in the suspend image
anyway. If you fail to wakeup, that's akin to a normal crash, the user
will only lose the last data written at the time of the suspend and
journaling fs'es should take care of fs metadata integrity.

So to summarize, the plan that makes things work with fuse is:

 - For STR, don't do the freezer thing.

 - For STD, don't sys_sync() after you froze

There might be -other- issues, but that should get you through some of
them at least. Of course, you'll be in trouble if you try to do things
like STD-to-a-file which sits on a fuse FS but there's a limit to
insanity :-)
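
The ordering argument above can be modelled in a few lines of userspace
C. This is only a toy sketch under stated assumptions: every name in it
(toy_sys, toy_sync, toy_freeze, ...) is hypothetical and none of it is
kernel code. It assumes that a sync issued after the freeze can never
finish when the filesystem is served by a frozen userspace process, and
that dirty data still in RAM at snapshot time ends up in the image.

```c
#include <stdbool.h>

/* Hypothetical model of the system state; not a kernel structure. */
struct toy_sys {
	bool fs_daemon_frozen;  /* userspace fs server stopped by the freezer */
	int dirty_pages;        /* data in RAM not yet written back */
	int pages_in_image;     /* dirty data captured by the snapshot */
};

/* New writes simply add dirty pages. */
void toy_dirty(struct toy_sys *s, int n)
{
	s->dirty_pages += n;
}

/*
 * Flush dirty data. Returns 0 on success, -1 if the flush could never
 * complete because the server that must handle it is frozen (assumed
 * deadlock mechanism).
 */
int toy_sync(struct toy_sys *s)
{
	if (s->fs_daemon_frozen && s->dirty_pages > 0)
		return -1;
	s->dirty_pages = 0;
	return 0;
}

void toy_freeze(struct toy_sys *s)
{
	s->fs_daemon_frozen = true;
}

/* Whatever is still dirty in RAM is part of the memory image. */
void toy_snapshot(struct toy_sys *s)
{
	s->pages_in_image = s->dirty_pages;
}
```

Calling toy_sync(), then toy_dirty(), toy_freeze() and toy_snapshot()
leaves the late dirty pages in the image, while a toy_sync() issued
after toy_freeze() reports that it could not complete.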

Cheers,
Ben.



* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  4:29 [PATCH] Remove process freezer from suspend to RAM pathway Matthew Garrett
                   ` (3 preceding siblings ...)
  2007-07-03  6:13 ` Oliver Neukum
@ 2007-07-03  7:37 ` Romano Giannetti
  2007-07-03  8:20   ` Oliver Neukum
  2007-07-03 13:12   ` Rafael J. Wysocki
  2007-07-03 12:56 ` Rafael J. Wysocki
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 388+ messages in thread
From: Romano Giannetti @ 2007-07-03  7:37 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> or alternatively we  could do what we do for suspend to RAM on other
> platforms (PPC and APM) and just not use the freezer.

As a data point, I have been running with this patch on top of 2.6.21.2 for
the last 3+ weeks, with an average of 5-6 STR cycles a day, and had no
problems at all. (Sony vaio pcg-fx701). Just normal work, I didn't try
to stress the thing, but I have quite a few times suspended/resumed over
a big compile without a glitch.

What are the risks of this patch supposed to be?

Romano






* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  7:19       ` Benjamin Herrenschmidt
@ 2007-07-03  7:44         ` Oliver Neukum
  2007-07-03 10:47           ` Miklos Szeredi
                             ` (2 more replies)
  2007-07-03 12:56         ` Rafael J. Wysocki
  1 sibling, 3 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03  7:44 UTC (permalink / raw)
  To: linux-pm
  Cc: Benjamin Herrenschmidt, Nigel Cunningham, Matthew Garrett, linux-kernel

On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> So to summarize, the plan that makes things work with fuse is:
> 
>  - For STR, don't do the freezer thing.
> 
>  - For STD, don't sys_sync() after you froze
> 
> There might be -other- issues, but that should get you through some of

At the risk of repeating myself. Character device drivers are written
with the assumption that normal io and suspend/resume do not race
with each other due to the freezer.
What do you intend to do about that?

	Regards
		Oliver



* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  7:37 ` Romano Giannetti
@ 2007-07-03  8:20   ` Oliver Neukum
  2007-07-03 13:12   ` Rafael J. Wysocki
  1 sibling, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03  8:20 UTC (permalink / raw)
  To: Romano Giannetti; +Cc: Matthew Garrett, linux-kernel, linux-pm

On Tuesday, 3 July 2007, Romano Giannetti wrote:
> On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> > or alternatively we  could do what we do for suspend to RAM on other
> > platforms (PPC and APM) and just not use the freezer.
> 
> As a data point, I have been running with this patch on top of 2.6.21.2 for
> the last 3+ weeks, with an average of 5-6 STR cycles a day, and had no
> problems at all. (Sony vaio pcg-fx701). Just normal work, I didn't try
> to stress the thing, but I have quite a few times suspended/resumed over
> a big compile without a glitch.
> 
> What are the risks of this patch supposed to be?

You did this test only while stressing normal block devices. Try
provoking races in character devices.

	Regards
		Oliver



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  7:44         ` [linux-pm] " Oliver Neukum
@ 2007-07-03 10:47           ` Miklos Szeredi
  2007-07-03 11:07             ` Oliver Neukum
  2007-07-03 11:23           ` Paul Mackerras
  2007-07-03 11:40           ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-03 10:47 UTC (permalink / raw)
  To: oliver; +Cc: linux-pm, benh, nigel, mjg59, linux-kernel

> > So to summarize, the plan that makes things work with fuse is:
> > 
> >  - For STR, don't do the freezer thing.
> > 
> >  - For STD, don't sys_sync() after you froze
> > 
> > There might be -other- issues, but that should get you through some of
> 
> At the risk of repeating myself. Character device drivers are written
> with the assumption that normal io and suspend/resume do not race
> with each other due to the freezer.
> What do you intend to do about that?

Oliver, can you please explain your worries in a bit more detail?

I don't claim to know anything about how STR or hibernate works, but
neither seem to have any problem with I/O on the fuse device "racing"
with them.

And conceptually I can't see anything that would cause trouble either.
The fuse kernel module just provides a specialized IPC mechanism,
where one userspace process communicates with another using file
operations and a char dev.

Miklos


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 10:47           ` Miklos Szeredi
@ 2007-07-03 11:07             ` Oliver Neukum
  2007-07-03 11:22               ` Miklos Szeredi
                                 ` (3 more replies)
  0 siblings, 4 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 11:07 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-pm, benh, nigel, mjg59, linux-kernel

On Tuesday, 3 July 2007, Miklos Szeredi wrote:
> > > So to summarize, the plan that makes things work with fuse is:
> > > 
> > >  - For STR, don't do the freezer thing.
> > > 
> > >  - For STD, don't sys_sync() after you froze
> > > 
> > > There might be -other- issues, but that should get you through some of
> > 
> > At the risk of repeating myself. Character device drivers are written
> > with the assumption that normal io and suspend/resume do not race
> > with each other due to the freezer.
> > What do you intend to do about that?
> 
> Oliver, can you please explain your worries in a bit more detail?
> 
> I don't claim to know anything about how STR or hibernate works, but
> neither seem to have any problem with I/O on the fuse device "racing"
> with them.

The problem is not with fuse. The problem is generic in nature.

If you remove the freezer, user space remains active until the last CPU
goes into suspend. It can do syscalls. Or do you know a clean way to exempt
only the tasks fuse might use?

Now device drivers have a guaranteed temporal sequence:

last io -> suspend() -> resume() [or disconnect()] -> new io

This is because suspend() is called after the freezer goes into action. If
you remove the freezer, you need to deal with

1. io to suspended devices
2. resume() assuming that the device is in the state suspend() left it
3. io changing a device's state while suspend is saving it

and you need to fix this for all device drivers, not just those fuse is
involved with. Removing the freezer means doing a more or less full
audit of every driver and additional locking in many drivers.
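
As a rough illustration of the extra locking this implies, here is a
userspace C sketch, assuming a made-up device model. Nothing in it is a
kernel API: toy_dev, toy_write, toy_suspend and toy_resume are
hypothetical names, and a pthread mutex stands in for whatever lock a
real driver would use. It shows one way the three cases listed above
could be handled once I/O and suspend may run concurrently.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical device model; none of these names come from the kernel. */
struct toy_dev {
	pthread_mutex_t lock;   /* serialises I/O against suspend/resume */
	bool suspended;
	int hw_state;           /* stands in for live hardware registers */
	int saved_state;        /* what suspend captured */
};

#define TOY_DEV_INIT { PTHREAD_MUTEX_INITIALIZER, false, 0, 0 }

/* Case 1: the I/O path refuses to touch a suspended device. */
int toy_write(struct toy_dev *d, int value)
{
	int ret = 0;

	pthread_mutex_lock(&d->lock);
	if (d->suspended)
		ret = -EBUSY;
	else
		d->hw_state = value;
	pthread_mutex_unlock(&d->lock);
	return ret;
}

/* Case 3: holding the lock keeps I/O from changing state mid-save. */
void toy_suspend(struct toy_dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->saved_state = d->hw_state;
	d->suspended = true;
	pthread_mutex_unlock(&d->lock);
}

/* Case 2: writes were rejected while suspended, so saved_state is valid. */
void toy_resume(struct toy_dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->hw_state = d->saved_state;
	d->suspended = false;
	pthread_mutex_unlock(&d->lock);
}
```

With the freezer in place none of the checks in toy_write() would be
needed, since no task could reach the I/O path between suspend and
resume; that is the per-driver work being described.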

	Regards
		Oliver


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:07             ` Oliver Neukum
@ 2007-07-03 11:22               ` Miklos Szeredi
  2007-07-03 11:27                 ` Oliver Neukum
  2007-07-05  0:02                 ` Pavel Machek
  2007-07-03 11:44               ` Benjamin Herrenschmidt
                                 ` (2 subsequent siblings)
  3 siblings, 2 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-03 11:22 UTC (permalink / raw)
  To: oliver; +Cc: miklos, linux-pm, benh, nigel, mjg59, linux-kernel

> > > > So to summarize, the plan that makes things work with fuse is:
> > > > 
> > > >  - For STR, don't do the freezer thing.
> > > > 
> > > >  - For STD, don't sys_sync() after you froze
> > > > 
> > > > There might be -other- issues, but that should get you through some of
> > > 
> > > At the risk of repeating myself. Character device drivers are written
> > > with the assumption that normal io and suspend/resume do not race
> > > with each other due to the freezer.
> > > What do you intend to do about that?
> > 
> > Oliver, can you please explain your worries in a bit more detail?
> > 
> > I don't claim to know anything about how STR or hibernate works, but
> > neither seem to have any problem with I/O on the fuse device "racing"
> > with them.
> 
> The problem is not with fuse. The problem is generic in nature.
> 
> If you remove the freezer, user space remains active until the last CPU
> goes into suspend. It can do syscalls. Or do you know a clean way to exempt
> only the tasks fuse might use?

You are talking about hibernate, right?  Suspending (to ram) is
instantaneous, in that _after_ suspend no CPU is active obviously.

> Now device drivers have a guaranteed temporal sequence:
> 
> last io -> suspend() -> resume() [or disconnect()] -> new io
> 
> This is because suspend() is called after the freezer goes into action. If
> you remove the freezer, you need to deal with
> 
> 1. io to suspended devices
> 2. resume() assuming that the device is in the state suspend() left it
> 3. io changing a device's state while suspend is saving it
> 
> and you need to fix this for all device drivers, not just those fuse is
> involved with. Removing the freezer means doing a more or less full
> audit of every driver and additional locking in many drivers.

OK, this has _nothing_ at all to do with fuse then, and everything to
do with disk I/O.

Just because fuse is a filesystem, it doesn't have to have anything to do
with block devices, just like procfs doesn't either.

So removing the freezer from the hibernate path would be problematic,
but as I understand this is not what has been proposed, only removing
the freezer from the STR path, which should be OK.

Miklos


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  7:44         ` [linux-pm] " Oliver Neukum
  2007-07-03 10:47           ` Miklos Szeredi
@ 2007-07-03 11:23           ` Paul Mackerras
  2007-07-03 11:42             ` Oliver Neukum
  2007-07-03 15:58             ` Alan Stern
  2007-07-03 11:40           ` Benjamin Herrenschmidt
  2 siblings, 2 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-03 11:23 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel, Matthew Garrett

Oliver Neukum writes:

> At the risk of repeating myself. Character device drivers are written
> with the assumption that normal io and suspend/resume do not race
> with each other due to the freezer.
> What do you intend to do about that?

Going back to the old powerbook sleep code, we had a two-phase
suspend: drivers got notified once when userspace is still running,
with interrupts enabled, in process context; and then a second time
with interrupts disabled and with only one CPU up, so the process
that is initiating the suspend is the only process running (since
interrupts are disabled and nothing it does can sleep, no other
process can get to run).

I still believe that is the right way to go, although we currently
only have a single-phase suspend.

Most drivers suspended their hardware in the second call.  If they are
in the middle of a conversation with their device that *has* to be
completed, they can do that by polling.  If it's a character device, a
better approach would be to set a flag or whatever in the first
suspend call to make sure that no new conversations get started with
the device, sleeping if necessary.
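
A minimal userspace sketch of that two-phase scheme follows, assuming
a made-up port model. The names (toy_port, toy_port_suspend_early,
toy_port_suspend_late, ...) are hypothetical and are not the powerbook
code or any kernel interface. For brevity the sketch rejects new
conversations after the first phase instead of sleeping.

```c
#include <stdbool.h>

/* Hypothetical character-device model; not a kernel structure. */
struct toy_port {
	bool quiescing;  /* set in phase 1: no new conversations */
	int in_flight;   /* units of work still owed to the hardware */
	bool powered;
};

/* Start a conversation with the device; -1 means it was refused. */
int toy_port_start_io(struct toy_port *p, int units)
{
	if (p->quiescing || !p->powered)
		return -1;
	p->in_flight += units;
	return 0;
}

/* Phase 1: userspace still runs, so only stop new work from starting. */
void toy_port_suspend_early(struct toy_port *p)
{
	p->quiescing = true;
}

/*
 * Phase 2: modelled as "nothing else can run", so any conversation
 * that has to complete is finished by polling, then power goes off.
 */
void toy_port_suspend_late(struct toy_port *p)
{
	while (p->in_flight > 0)
		p->in_flight--;  /* one polled step of the conversation */
	p->powered = false;
}

void toy_port_resume(struct toy_port *p)
{
	p->powered = true;
	p->quiescing = false;
}
```

The point of the split is that the flag set in the first call bounds
the amount of work the second call may have to poll for.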

I'm actually having a hard time thinking of how to test your assertion
since there are so few things on a typical computer that are plain
character devices driving real hardware.  A serial port would be about
the only one; keyboards and mice (and serial ports :) are USB these
days, or ADB on older powerbooks.

Or did you mean to include drivers for pseudo-devices (e.g. ptys)?  I
don't see why they would have a suspend method at all.

Paul.


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:22               ` Miklos Szeredi
@ 2007-07-03 11:27                 ` Oliver Neukum
  2007-07-03 11:45                   ` Benjamin Herrenschmidt
  2007-07-05  0:02                 ` Pavel Machek
  1 sibling, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 11:27 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-pm, benh, nigel, mjg59, linux-kernel

On Tuesday, 3 July 2007, Miklos Szeredi wrote:
> > > > > So to summarize, the plan that makes things work with fuse is:
> > > > > 
> > > > >  - For STR, don't do the freezer thing.
> > > > > 
> > > > >  - For STD, don't sys_sync() after you froze
> > > > > 
> > > > > There might be -other- issues, but that should get you through some of
> > > > 
> > > > At the risk of repeating myself. Character device drivers are written
> > > > with the assumption that normal io and suspend/resume do not race
> > > > with each other due to the freezer.
> > > > What do you intend to do about that?
> > > 
> > > Oliver, can you please explain your worries in a bit more detail?
> > > 
> > > I don't claim to know anything about how STR or hibernate works, but
> > > neither seem to have any problem with I/O on the fuse device "racing"
> > > with them.
> > 
> > The problem is not with fuse. The problem is generic in nature.
> > 
> > If you remove the freezer, user space remains active until the last CPU
> > goes into suspend. It can do syscalls. Or do you know a clean way to exempt
> > only the tasks fuse might use?
> 
> You are talking about hibernate, right?  Suspending (to ram) is
> instantaneous, in that _after_ suspend no CPU is active obviously.

If that is so, why do you care? If it is really atomic, fuse has no chance
to call out to its component in user space either. Removing the freezer
cannot make a difference.

Something is fishy here.

	Regards
		Oliver



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  7:44         ` [linux-pm] " Oliver Neukum
  2007-07-03 10:47           ` Miklos Szeredi
  2007-07-03 11:23           ` Paul Mackerras
@ 2007-07-03 11:40           ` Benjamin Herrenschmidt
  2007-07-03 11:46             ` Oliver Neukum
  2 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03 11:40 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, Nigel Cunningham, Matthew Garrett, linux-kernel

On Tue, 2007-07-03 at 09:44 +0200, Oliver Neukum wrote:
> On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> > So to summarize, the plan that makes things work with fuse is:
> > 
> >  - For STR, don't do the freezer thing.
> > 
> >  - For STD, don't sys_sync() after you froze
> > 
> > There might be -other- issues, but that should get you through some of
> 
> At the risk of repeating myself. Character device drivers are written
> with the assumption that normal io and suspend/resume do not race
> with each other due to the freezer.
> What do you intend to do about that?

Ugh ... "character devices" ... that's a pretty wide statement...
there are lots of those, and they differ a lot from one another...

Any sane device-driver will have to cope with being suspended in a
"live" system. I've demonstrated multiple times in the past why this is
necessary anyway, for things like dynamic power management, among
others.

The whole freezer thing is a hack job to avoid fixing drivers that need
fixing. Unfortunately, I believe in that area, it's simply not
sustainable. Besides, getting drivers to behave properly isn't very hard
in most cases.

Ben.



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:23           ` Paul Mackerras
@ 2007-07-03 11:42             ` Oliver Neukum
  2007-07-03 23:11               ` Paul Mackerras
  2007-07-03 15:58             ` Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 11:42 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linux-pm, linux-kernel, Matthew Garrett

On Tuesday, 3 July 2007, Paul Mackerras wrote:
> I'm actually having a hard time thinking of how to test your assertion
> since there are so few things on a typical computer that are plain
> character devices driving real hardware.  A serial port would be about
> the only one; keyboards and mice (and serial ports :) are USB these
> days, or ADB on older powerbooks.

USB devices certainly have suspend methods.

	Regards
		Oliver



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:07             ` Oliver Neukum
  2007-07-03 11:22               ` Miklos Szeredi
@ 2007-07-03 11:44               ` Benjamin Herrenschmidt
  2007-07-03 11:55                 ` Oliver Neukum
  2007-07-03 12:58               ` Rafael J. Wysocki
  2007-07-03 15:46               ` Miklos Szeredi
  3 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03 11:44 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Miklos Szeredi, linux-pm, nigel, mjg59, linux-kernel


> The problem is not with fuse. The problem is generic in nature.
> 
> If you remove the freezer, user space remains active until the last CPU
> goes into suspend. It can do syscalls. Or do you know a clean way to exempt
> only the tasks fuse might use?
>
> Now device drivers have a guaranteed temporal sequence:
> 
> last io -> suspend() -> resume() [or disconnect()] -> new io

No, that's always been bullshit. You can have IOs emitted by kernel
threads (think knfsd, and that's just one among many others). Besides,
relying on having userland frozen means that your driver will be unable
to be "live" suspended/resumed for more ambitious dynamic power
management schemes.

So it's always been wrong, imho, to rely on that. I've had powermac STR
work fine without the freezer for years, and few drivers have been a
problem, and we just fixed them.

The freezer thingy, at best, hides problems, causing them not to be
fixed.

> This is because suspend() is called after the freezer goes into action. If
> you remove the freezer, you need to deal with
> 
> 1. io to suspended devices
> 2. resume() assuming that the device is in the state suspend() left it
> 3. io changing a device's state while suspend is saving it
> 
> and you need to fix this for all device drivers, not just those fuse is
> involved with. Removing the freezer means doing a more or less full
> audit of every driver and additional locking in many drivers.

Yes, more or less.

The good news is that a whole lot of drivers don't really care much, and
in some cases, things can be done trivially with a bit of help from the
upper layers. But yeah, as I've been explaining over and over again, the
lazy approach here doesn't work.
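
As one illustration of "a bit of help from the upper layers" for case 1
above (io to suspended devices), here is a minimal userspace sketch in C
of a layer that parks requests while the device is suspended and replays
them on resume.  All names (demoq, demoq_submit, ...) are hypothetical,
the fixed-size array is an arbitrary simplification, and no locking is
shown, so it does not address cases 2 and 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMOQ_MAX 8

/* Hypothetical upper layer sitting between callers and the hardware. */
struct demoq {
	bool   suspended;
	int    pending[DEMOQ_MAX];	/* requests parked during suspend */
	size_t npending;
	int    last_sent;		/* last request handed to the hardware */
	int    sent_count;
};

/* Stand-in for touching the hardware; must never run while suspended. */
static void demoq_hw_send(struct demoq *q, int req)
{
	assert(!q->suspended);
	q->last_sent = req;
	q->sent_count++;
}

/* Submit a request: send it now, or park it if the device is suspended. */
static int demoq_submit(struct demoq *q, int req)
{
	if (!q->suspended) {
		demoq_hw_send(q, req);
		return 0;
	}
	if (q->npending == DEMOQ_MAX)
		return -1;		/* queue full: caller must retry */
	q->pending[q->npending++] = req;
	return 0;
}

static void demoq_suspend(struct demoq *q)
{
	q->suspended = true;
}

/* Resume: replay parked requests in the order they arrived. */
static void demoq_resume(struct demoq *q)
{
	q->suspended = false;
	for (size_t i = 0; i < q->npending; i++)
		demoq_hw_send(q, q->pending[i]);
	q->npending = 0;
}
```

With something like this in a shared layer, the individual driver's
suspend() and resume() never see a request arrive in between.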

Ben.




* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:27                 ` Oliver Neukum
@ 2007-07-03 11:45                   ` Benjamin Herrenschmidt
  2007-07-03 11:50                     ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03 11:45 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Miklos Szeredi, linux-pm, nigel, mjg59, linux-kernel

On Tue, 2007-07-03 at 13:27 +0200, Oliver Neukum wrote:
> > You are talking about hibernate, right?  Suspending (to ram) is
> > instantaneous, in that _after_ suspend no CPU is active obviously.
> 
> If that is so, why do you care? If it is really atomic, fuse has no
> chance
> to call out to its component in user space either. Removing the
> freezer
> cannot make a difference.

It's not atomic. You will get called after suspend() in drivers. The
thing is ... you just have to deal with it :-)

Ben.




* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:40           ` Benjamin Herrenschmidt
@ 2007-07-03 11:46             ` Oliver Neukum
  2007-07-03 13:07               ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 11:46 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: linux-pm, Nigel Cunningham, Matthew Garrett, linux-kernel

On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> On Tue, 2007-07-03 at 09:44 +0200, Oliver Neukum wrote:
> > On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> > > So to summarize, the plan that makes things work with fuse is:
> > > 
> > >  - For STR, don't do the freezer thing.
> > > 
> > >  - For STD, don't sys_sync() after you froze
> > > 
> > > There might be -other- issues, but that should get you through some of
> > 
> > At the risk of repeating myself. Character device drivers are written
> > with the assumption that normal io and suspend/resume do not race
> > with each other due to the freezer.
> > What do you intend to do about that?
> 
> Ugh ... "character devices" ... that's a pretty wide statement...
> there are lots of those, and they differ a lot from one another...

That is a good summary of the problem ;-(

> Any sane device-driver will have to cope with being suspended in a
> "live" system. I've demonstrated multiple times in the past why this is
> necessary anyway, for things like dynamic power management, among
> others.

That is an interesting notion. I'd rather see device drivers reporting
their devices idle and requesting to be suspended.
But in any case it doesn't solve the problem.

	Regards
		Oliver



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:45                   ` Benjamin Herrenschmidt
@ 2007-07-03 11:50                     ` Oliver Neukum
  0 siblings, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 11:50 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Miklos Szeredi, linux-pm, nigel, mjg59, linux-kernel

On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> On Tue, 2007-07-03 at 13:27 +0200, Oliver Neukum wrote:
> > > You are talking about hibernate, right?  Suspending (to ram) is
> > > instantaneous, in that _after_ suspend no CPU is active obviously.
> > 
> > If that is so, why do you care? If it is really atomic, fuse has no
> > chance
> > to call out to its component in user space either. Removing the
> > freezer
> > cannot make a difference.
> 
> It's not atomic. You will get called after suspend() in drivers. The
> thing is ... you just have to deal with it :-)

So you are volunteering to go through all drivers >:-> ?

	Regards
		Oliver



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:44               ` Benjamin Herrenschmidt
@ 2007-07-03 11:55                 ` Oliver Neukum
  2007-07-03 23:40                   ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 11:55 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Miklos Szeredi, linux-pm, nigel, mjg59, linux-kernel

On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> > Now device drivers have a guaranteed temporal sequence:
> > 
> > last io -> suspend() -> resume() [or disconnect()] -> new io
> 
> No, that's always been bullshit. You can have IOs emitted by kernel
> threads (think knfsd, and that's just one among many others). Besides,

That's why we have the problem of freezing the kernel threads or not.
Short of knfsd very few kernel threads really operate on their own and
those can use the new notifier chain.

> relying on having userland frozen means that your driver will be unable
> to be "live" suspended/resumed for more ambitious dynamic power
> management schemes.

Only if they work without cooperation and idle detection in the drivers.

> So it's always been wrong, imho, to rely on that. I've had powermac STR
> work fine without the freezer for years, and few drivers have been a
> problem, and we just fixed them.

Powermacs are somewhat limited in the hardware used with them (OK, powerbooks
have PCMCIA, but how often is that used?)

You want to have all that pain for fuse?

	Regards
		Oliver


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  6:13 ` Oliver Neukum
  2007-07-03  6:51   ` Miklos Szeredi
@ 2007-07-03 12:13   ` Matthew Garrett
  2007-07-03 13:09     ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03 12:13 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-kernel, linux-pm

On Tue, Jul 03, 2007 at 08:13:53AM +0200, Oliver Neukum wrote:

> Only if you want to audit all character devices' read() and write()
> methods for races against suspend().
> / on fuse is a bad idea.

Any driver that assumes that userspace will be frozen during suspend has 
been broken forever. That behaviour has never been guaranteed.
-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  4:29 [PATCH] Remove process freezer from suspend to RAM pathway Matthew Garrett
                   ` (4 preceding siblings ...)
  2007-07-03  7:37 ` Romano Giannetti
@ 2007-07-03 12:56 ` Rafael J. Wysocki
  2007-07-09 13:29   ` sysrq-t dumps of s2ram/fuse deadlock (was Re: [PATCH] Remove process freezer from suspend to RAM pathway) Pavel Machek
  2007-07-03 16:03 ` [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway Alan Stern
  2007-07-04 23:33 ` Pavel Machek
  7 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 12:56 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm, Alan Stern

On Tuesday, 3 July 2007 06:29, Matthew Garrett wrote:
> Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> a screaming nightmare - either the suspend fails because syslog (for 
> instance) can't be frozen, or the machine deadlocks for some other 
> reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> could do what we do for suspend to RAM on other platforms (PPC and APM) 
> and just not use the freezer.
> 
> Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>

Could you please rediff against the current -mm tree?

There are some patches in there that this will clash with.

I still think that this is a mistake, BTW.  Please see Alan's post at

https://lists.linux-foundation.org/pipermail/linux-pm/2007-June/012847.html

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  7:19       ` Benjamin Herrenschmidt
  2007-07-03  7:44         ` [linux-pm] " Oliver Neukum
@ 2007-07-03 12:56         ` Rafael J. Wysocki
  2007-07-03 14:21           ` [linux-pm] " Johannes Berg
  2007-07-03 21:14           ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 12:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Nigel Cunningham, Matthew Garrett, linux-kernel, linux-pm,
	Alan Stern, Pavel Machek

On Tuesday, 3 July 2007 09:19, Benjamin Herrenschmidt wrote:
> On Tue, 2007-07-03 at 16:08 +1000, Nigel Cunningham wrote:
> > 
> > > So I think Matthew is totally right. In fact, the presence of the
> > > freezer is the main reason why Paulus so far NACKed Johannes attempts at
> > > merging the PPC PM code with the generic code in kernel/power.c
> > > 
> > > We've been doing fine without it so far and intend to continue to do so.
> > 
> > Fuse depends on !PPC?
> 
> No, that's not what I'm saying. I'm saying we've been doing STR without
> the freezer and that's the way to go imho.
> 
> > > As for suspend-to-disk, I refer you to the discussions we had in the
> > > past with Linus, where he explains I think quite clearly how wrong the
> > > current implementation of STR is :-)
> > 
> > I assume you mean STD.
> 
> Oops, yeah, sorry.
> 
> > The problem there is that Linus doesn't care about STD. 
> > If he did, I dare say he'd think through the issues more thoroughly than he 
> > apparently has.
> 
> Heh, that might be the case :-)
>  
> > > Thing is, if you're going to do snapshots, you should probably not sync
> > > after you have "frozen" anyway.
> > 
> > Fully agree. But how do you stop things syncing while you're writing the image 
> > if you don't have a freezer or equivalent? (scheduler based, kexec.. they're 
> > all workarounds for this issue).
> 
> Well, I was saying that in the context of the -current- snapshotting
> mechanism which is based on the freezer, then you should not
> sys_sync(). 
> 
> Some random user or kernel thread doing a sync is not a problem. It will
> stop in the middle of sync and resume on wakeup.
> 
> The problem is currently because STD -itself- attempts to sync after it
> has frozen things.

To be precise, it tries to sync after it has frozen the user land.
 
> I think that should be changed. If you want to sync for whatever reason,
> (mostly save RAM ?) do it before the freeze. That means you may get new
> dirty data in memory that isn't written out by the sync before you
> freeze, but that's all right, that data will be in the suspend image
> anyway. If you fail to wakeup, that's akin to a normal crash, the user
> will only lose the last data written at the time of the suspend and
> journaling fs'es should take care of fs metadata integrity.
> 
> So to summarize, the plan that makes things work with fuse is:
> 
>  - For STR, don't do the freezer thing.

In the long run, I agree.

Still, can you please read this post from Alan Stern:

https://lists.linux-foundation.org/pipermail/linux-pm/2007-June/012847.html

?  I don't think I'm able to repeat the arguments given in there in a
convincing way.

>  - For STD, don't sys_sync() after you froze

Yeah, I think we can move the syncing before the freezing, so to speak.

And it need not be called from within the freezer, BTW.

> There might be -other- issues, but that should get you through some of
> them at least. Of course, you'll be in trouble if you try to do things
> like STD-to-a-file which sits on a fuse FS but there's a limit to
> insanity :-)

Yes. :-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:07             ` Oliver Neukum
  2007-07-03 11:22               ` Miklos Szeredi
  2007-07-03 11:44               ` Benjamin Herrenschmidt
@ 2007-07-03 12:58               ` Rafael J. Wysocki
  2007-07-03 15:46               ` Miklos Szeredi
  3 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 12:58 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Miklos Szeredi, linux-pm, benh, nigel, mjg59, linux-kernel

On Tuesday, 3 July 2007 13:07, Oliver Neukum wrote:
> On Tuesday, 3 July 2007, Miklos Szeredi wrote:
> > > > So to summarize, the plan that makes things work with fuse is:
> > > > 
> > > >  - For STR, don't do the freezer thing.
> > > > 
> > > >  - For STD, don't sys_sync() after you froze
> > > > 
> > > > There might be -other- issues, but that should get you through some of
> > > 
> > > At the risk of repeating myself. Character device drivers are written
> > > with the assumption that normal io and suspend/resume do not race
> > > with each other due to the freezer.
> > > What do you intend to do about that?
> > 
> > Oliver, can you please explain your worries in a bit more detail?
> > 
> > I don't claim to know anything about how STR or hibernate works, but
> > neither seem to have any problem with I/O on the fuse device "racing"
> > with them.
> 
> The problem is not with fuse. The problem is generic in nature.
> 
> If you remove the freezer, user space remains active until the last CPU
> goes into suspend. It can do syscalls. Or do you know a clean way to exempt
> only the tasks fuse might use?
> 
> Now device drivers have a guaranteed temporal sequence:
> 
> last io -> suspend() -> resume() [or disconnect()] -> new io
> 
> This is because suspend() is called after the freezer goes into action. If
> you remove the freezer, you need to deal with
> 
> 1. io to suspended devices
> 2. resume() assuming that the device is in the state suspend() left it
> 3. io changing a device's state while suspend is saving it
> 
> and you need to fix this for all device drivers, not just those fuse is
> involved with. Removing the freezer means doing a more or less full
> audit of every driver and additional locking in many drivers.

Agreed.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:46             ` Oliver Neukum
@ 2007-07-03 13:07               ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 13:07 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Benjamin Herrenschmidt, linux-pm, Nigel Cunningham,
	Matthew Garrett, linux-kernel

On Tuesday, 3 July 2007 13:46, Oliver Neukum wrote:
> On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> > On Tue, 2007-07-03 at 09:44 +0200, Oliver Neukum wrote:
> > > On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> > > > So to summarize, the plan that makes things work with fuse is:
> > > > 
> > > >  - For STR, don't do the freezer thing.
> > > > 
> > > >  - For STD, don't sys_sync() after you froze
> > > > 
> > > > There might be -other- issues, but that should get you through some of
> > > 
> > > At the risk of repeating myself. Character device drivers are written
> > > with the assumption that normal io and suspend/resume do not race
> > > with each other due to the freezer.
> > > What do you intend to do about that?
> > 
> > Ugh ... "character devices" ... that's a pretty wide statement...
> > there are lots of those, and they differ a lot from one another...
> 
> That is a good summary of the problem ;-(
> 
> > Any sane device-driver will have to cope with being suspended in a
> > "live" system. I've demonstrated multiple times in the past why this is
> > necessary anyway, for things like dynamic power management, among
> > others.
> 
> That is an interesting notion. I'd rather see device drivers reporting
> their devices idle and requesting to be suspended.
> But in any case it doesn't solve the problem.

Agreed.

What I think will solve the problem in the long run is to:

1) Separate the hibernation code from the suspend code (ie. hibernation-related
callbacks should generally be different from suspend-related callbacks for each
driver).
2) Remove the freezing of kernel threads from each of them (in the hibernation
case, if possible) and fix the things that get broken.
3) Remove the freezing of user space from the suspend code path and fix the
things that get broken.

Going to step 3) before doing 1) and 2) doesn't seem to be the right thing to
me.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  5:49 ` [linux-pm] " Benjamin Herrenschmidt
@ 2007-07-03 13:07   ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 13:07 UTC (permalink / raw)
  To: linux-pm; +Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel

On Tuesday, 3 July 2007 07:49, Benjamin Herrenschmidt wrote:
> On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > a screaming nightmare - either the suspend fails because syslog (for 
> > instance) can't be frozen, or the machine deadlocks for some other 
> > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > and just not use the freezer.
> > 
> > Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
> 
> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> 
> (And with much pleasure :-)

Clashes with some code already in -mm.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  5:51 ` Benjamin Herrenschmidt
@ 2007-07-03 13:08   ` Rafael J. Wysocki
  2007-07-03 15:09     ` Rafael J. Wysocki
  2007-07-03 21:16     ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 13:08 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Matthew Garrett, linux-kernel, linux-pm

On Tuesday, 3 July 2007 07:51, Benjamin Herrenschmidt wrote:
> On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > a screaming nightmare - either the suspend fails because syslog (for 
> > instance) can't be frozen, or the machine deadlocks for some other 
> > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > and just not use the freezer.
> 
> The main reason for deadlocks is because we do a sys_sync() after the
> freeze, which we shouldn't do.

So why don't we remove the sys_sync() from freeze_processes() instead?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 12:13   ` Matthew Garrett
@ 2007-07-03 13:09     ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 13:09 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Oliver Neukum, linux-kernel, linux-pm

On Tuesday, 3 July 2007 14:13, Matthew Garrett wrote:
> On Tue, Jul 03, 2007 at 08:13:53AM +0200, Oliver Neukum wrote:
> 
> > Only if you want to audit all character devices' read() and write()
> > methods for races against suspend().
> > / on fuse is a bad idea.
> 
> Any driver that assumes that userspace will be frozen during suspend has 
> been broken forever. That behaviour has never been guaranteed.

Can we please fix those drivers _first_, then?

Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  7:37 ` Romano Giannetti
  2007-07-03  8:20   ` Oliver Neukum
@ 2007-07-03 13:12   ` Rafael J. Wysocki
  1 sibling, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 13:12 UTC (permalink / raw)
  To: Romano Giannetti; +Cc: Matthew Garrett, linux-kernel, linux-pm

On Tuesday, 3 July 2007 09:37, Romano Giannetti wrote:
> On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> > or alternatively we  could do what we do for suspend to RAM on other
> > platforms (PPC and APM) and just not use the freezer.
> 
> As a data point, I am running with this patch on top of 2.6.21.2 the
> last 3+ weeks, with an average of 5/6 STR cycles a day, and had no
> problems at all. (Sony vaio pcg-fx701). Just normal work, I didn't try
> to stress the thing, but I have quite a few times suspended/resumed over
> a big compile without a glitch.
> 
> What are the risks of this patch supposed to be?

See https://lists.linux-foundation.org/pipermail/linux-pm/2007-June/012847.html

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 12:56         ` Rafael J. Wysocki
@ 2007-07-03 14:21           ` Johannes Berg
  2007-07-03 14:50             ` Alan Stern
  2007-07-03 14:51             ` Rafael J. Wysocki
  2007-07-03 21:14           ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 388+ messages in thread
From: Johannes Berg @ 2007-07-03 14:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm


On Tue, 2007-07-03 at 14:56 +0200, Rafael J. Wysocki wrote:

> Still, can you please read this post from Alan Stern:
> 
> https://lists.linux-foundation.org/pipermail/linux-pm/2007-June/012847.html
> 
> ?  I don't think I'm able to repeat the arguments given in there in a
> convincing way.

As I read it, Alan basically has two objections:
 (1) drivers shouldn't need to worry about this
 (2) suspend should be transparent to userspace

His proposed solution (freezing tasks when they cross the kernel
boundary) helps for the s-t-r case, but in fact doesn't solve (1)
because devices can be suspended at runtime and then you certainly do
not want to freeze tasks that try to access the device.

(2) is related but not identical, what if you have a device suspended at
runtime and some task tries to access it; should the task block until
you wake up that device?

I think the core of the discussion isn't appreciated by everybody here
yet---we need to solve both run-time and suspend-to-ram-time device
suspend, not just one of them.

johannes



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 14:51             ` Rafael J. Wysocki
@ 2007-07-03 14:48               ` Johannes Berg
  0 siblings, 0 replies; 388+ messages in thread
From: Johannes Berg @ 2007-07-03 14:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm


On Tue, 2007-07-03 at 16:51 +0200, Rafael J. Wysocki wrote:

> > His proposed solution (freezing tasks when they cross the kernel
> > boundary) helps for the s-t-r case, but in fact doesn't solve (1)
> > because devices can be suspended at runtime
> 
> This is a different thing and a different infrastructure is needed for it (not
> present at the moment).

Yeah I should've said "any of his proposed solutions"

> > and then you certainly do not want to freeze tasks that try to access the
> > device. 
> > 
> > (2) is related but not identical, what if you have a device suspended at
> > runtime and some task tries to access it; should the task block until
> > you wake up that device?
> 
> I think the device should be woken up in that case.

Ah, but that also means the device has to actually know about it.

> > I think the core of the discussion isn't appreciated by everybody here
> > yet---we need to solve both run-time and suspend-to-ram-time device
> > suspend, not just one of them.
> 
> For now, we're discussing the suspend-to-ram-time suspend only, for which
> we have (some) infrastructure (and which should be supported by all drivers,
> IMO).

Right but if we solve the run-time suspend case in favour of having the
device driver know about it then the suspend-to-ram-time suspend case
solves itself.

johannes



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 14:21           ` [linux-pm] " Johannes Berg
@ 2007-07-03 14:50             ` Alan Stern
  2007-07-03 14:59               ` Johannes Berg
  2007-07-04  3:55               ` Paul Mackerras
  2007-07-03 14:51             ` Rafael J. Wysocki
  1 sibling, 2 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-03 14:50 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett,
	Paul Mackerras

On Tue, 3 Jul 2007, Johannes Berg wrote:

> On Tue, 2007-07-03 at 14:56 +0200, Rafael J. Wysocki wrote:
> 
> > Still, can you please read this post from Alan Stern:
> > 
> > https://lists.linux-foundation.org/pipermail/linux-pm/2007-June/012847.html
> > 
> > ?  I don't think I'm able to repeat the arguments given in there in a
> > convincing way.
> 
> As I read it, Alan basically has two objections:
>  (1) drivers shouldn't need to worry about this
>  (2) suspend should be transparent to userspace
> 
> His proposed solution (freezing tasks when they cross the kernel
> boundary) helps for the s-t-r case, but in fact doesn't solve (1)
> because devices can be suspended at runtime and then you certainly do
> not want to freeze tasks that try to access the device.
> 
> (2) is related but not identical, what if you have a device suspended at
> runtime and some task tries to access it; should the task block until
> you wake up that device?

Time for me to jump in.

USB already implements runtime PM.  If a device is suspended at runtime
and a task tries to access it, the device is automatically resumed.  
No problem there.

The problem comes when the system is doing a STR.  Right now the code
doesn't keep track of the difference between a runtime suspend and a
system suspend -- once the device is suspended, it's suspended, period.  
Consequently, a non-frozen user task trying to do I/O to a suspended
device during STR will cause that device to resume, thereby forcing the
system suspend to abort.  Something much like this has actually
happened and been reported as a bug on LKML (I don't have a URL handy,
and it was actually a non-frozen kernel thread interfering with
hibernate rather than a non-frozen user task interfering with STR, but
the principle is the same).

Yes, the code could be changed to keep track of the reason for a device
suspend.  But that just raises the old problem of what to do when
there's an I/O request for a suspended device during STR.

> I think the core of the discussion isn't appreciated by everybody here
> yet---we need to solve both run-time and suspend-to-ram-time device
> suspend, not just one of them.

Runtime suspend isn't a problem.  Only STR.

Consider a particularly troublesome case: During STR, a non-frozen task
writes to /sys/bus/BBB/drivers/DDD/bind.  The sysfs core grabs the
device semaphore and calls the driver's probe routine.  If the driver
isn't PM-aware it simply tries to initialize the device and fails
because the device is already suspended.  That's no good; it isn't
transparent.

So assume the driver is PM-aware.  It tries to resume the device, which
fails because STR is underway.  Now what can it do?  There's only one 
possibility: It must block until the resume call can succeed.  But when 
is that?

It has to be before the PM core tries to resume the device, because the 
core will try to acquire the device semaphore and will block waiting 
for the probe call to complete.  But it has to be after the PM core 
resumes the device's parent, because obviously the device can't resume 
until its parent is awake.

As you can see, this is a very difficult problem to solve.

Alan Stern
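
For illustration only, the bookkeeping mentioned above (recording why a
device is suspended) might look like the userspace toy below.  It is not
USB core code; the type and function names are invented for this example,
and returning -EAGAIN in the system-suspend case is just a placeholder for
the open question of what to do with such a request.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical states: why is the device not running? */
enum reason_state {
	REASON_AWAKE,
	REASON_RUNTIME,	/* suspended by runtime PM */
	REASON_SYSTEM,	/* suspended as part of STR */
};

struct reason_dev {
	enum reason_state state;
};

/* Model of an I/O request arriving at the device. */
static int reason_io(struct reason_dev *d)
{
	assert(d != NULL);

	switch (d->state) {
	case REASON_RUNTIME:
		/* Runtime suspend: wake the device and carry on. */
		d->state = REASON_AWAKE;
		return 0;
	case REASON_SYSTEM:
		/*
		 * System suspend: resuming here would abort the STR.
		 * Failing the request is only one possible policy;
		 * blocking the caller is another.
		 */
		return -EAGAIN;
	default:
		return 0;
	}
}
```

A request against a REASON_RUNTIME device returns 0 and leaves it awake; the
same request against a REASON_SYSTEM device returns -EAGAIN and leaves the
state untouched.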


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 14:21           ` [linux-pm] " Johannes Berg
  2007-07-03 14:50             ` Alan Stern
@ 2007-07-03 14:51             ` Rafael J. Wysocki
  2007-07-03 14:48               ` Johannes Berg
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 14:51 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm

On Tuesday, 3 July 2007 16:21, Johannes Berg wrote:
> On Tue, 2007-07-03 at 14:56 +0200, Rafael J. Wysocki wrote:
> 
> > Still, can you please read this post from Alan Stern:
> > 
> > https://lists.linux-foundation.org/pipermail/linux-pm/2007-June/012847.html
> > 
> > ?  I don't think I'm able to repeat the arguments given in there in a
> > convincing way.
> 
> As I read it, Alan basically has two objections:
>  (1) drivers shouldn't need to worry about this
>  (2) suspend should be transparent to userspace
> 
> His proposed solution (freezing tasks when they cross the kernel
> boundary) helps for the s-t-r case, but in fact doesn't solve (1)
> because devices can be suspended at runtime

This is a different thing and a different infrastructure is needed for it (not
present at the moment).

> and then you certainly do not want to freeze tasks that try to access the
> device. 
> 
> (2) is related but not identical, what if you have a device suspended at
> runtime and some task tries to access it; should the task block until
> you wake up that device?

I think the device should be woken up in that case.

> I think the core of the discussion isn't appreciated by everybody here
> yet---we need to solve both run-time and suspend-to-ram-time device
> suspend, not just one of them.

For now, we're discussing the suspend-to-ram-time suspend only, for which
we have (some) infrastructure (and which should be supported by all drivers,
IMO).

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 14:50             ` Alan Stern
@ 2007-07-03 14:59               ` Johannes Berg
  2007-07-03 15:22                 ` Rafael J. Wysocki
  2007-07-03 20:21                 ` Alan Stern
  2007-07-04  3:55               ` Paul Mackerras
  1 sibling, 2 replies; 388+ messages in thread
From: Johannes Berg @ 2007-07-03 14:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett,
	Paul Mackerras

On Tue, 2007-07-03 at 10:50 -0400, Alan Stern wrote:

> Time for me to jump in.

:)

> USB already implements runtime PM.  If a device is suspended at runtime
> and a task tries to access it, the device is automatically resumed.  
> No problem there.

Right.

> The problem comes when the system is doing a STR.  Right now the code
> doesn't keep track of the difference between a runtime suspend and a
> system suspend -- once the device is suspended, it's suspended, period.  
> Consequently, a non-frozen user task trying to do I/O to a suspended
> device during STR will cause that device to resume, thereby forcing the
> system suspend to abort.  Something much like this has actually
> happened and been reported as a bug on LKML (I don't have a URL handy,
> and it was actually a non-frozen kernel thread interfering with
> hibernate rather than a non-frozen user task interfering with STR, but
> the principle is the same).

Yeah, I can see that happen.

> Yes, the code could be changed to keep track of the reason for a device
> suspend.  But that just raises the old problem of what to do when
> there's an I/O request for a suspended device during STR.
> 
> > I think the core of the discussion isn't appreciated by everybody here
> > yet---we need to solve both run-time and suspend-to-ram-time device
> > suspend, not just one of them.
> 
> Runtime suspend isn't a problem.  Only STR.

Ah but for all those character devices people were saying are the
problem we haven't even solved runtime suspend as far as I can tell from
the discussion.

> Consider a particularly troublesome case: During STR, a non-frozen task
> writes to /sys/bus/BBB/drivers/DDD/bind.  The sysfs core grabs the
> device semaphore and calls the driver's probe routine.  If the driver
> isn't PM-aware it simply tries to initialize the device and fails
> because the device is already suspended.  That's no good; it isn't
> transparent.
> 
> So assume the driver is PM-aware.  It tries to resume the device, which
> fails because STR is underway.  Now what can it do?  There's only one 
> possibility: It must block until the resume call can succeed.  But when 
> is that?
> 
> It has to be before the PM core tries to resume the device, because the 
> core will try to acquire the device semaphore and will block waiting 
> for the probe call to complete.  But it has to be after the PM core 
> resumes the device's parent, because obviously the device can't resume 
> until its parent is awake.
> 
> As you can see, this is a very difficult problem to solve.

Indeed. Actually, one could argue that it's impossible to solve the
problem as long as we try to call out to userspace during suspend and
need to wait until that's finished, like in the case of sys_sync() and
fuse filesystems, and probably other cases. Maybe we should make *those*
calls return a failure so that the suspend isn't transparent inside the
kernel but is transparent to userspace.

johannes

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 13:08   ` Rafael J. Wysocki
@ 2007-07-03 15:09     ` Rafael J. Wysocki
  2007-07-03 17:20       ` Oliver Neukum
                         ` (2 more replies)
  2007-07-03 21:16     ` Benjamin Herrenschmidt
  1 sibling, 3 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 15:09 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Matthew Garrett, linux-kernel, linux-pm, Pavel Machek, Nigel Cunningham

On Tuesday, 3 July 2007 15:08, Rafael J. Wysocki wrote:
> On Tuesday, 3 July 2007 07:51, Benjamin Herrenschmidt wrote:
> > On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> > > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > > a screaming nightmare - either the suspend fails because syslog (for 
> > > instance) can't be frozen, or the machine deadlocks for some other 
> > > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > > and just not use the freezer.
> > 
> > The main reason for deadlocks is because we do a sys_sync() after the
> > freeze, which we shouldn't do.
> 
> So why don't we remove the sys_sync() from freeze_processes() instead?

The patch follows (untested).

Greetings,
Rafael


---
From: Rafael J. Wysocki <rjw@sisk.pl>

We shouldn't sync filesystems from within the freezer, because it's not needed
for suspend to RAM and leads to problems with FUSE.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c    |    4 ++++
 kernel/power/process.c |    1 -
 2 files changed, 4 insertions(+), 1 deletion(-)

Index: linux-2.6.22-rc7/kernel/power/disk.c
===================================================================
--- linux-2.6.22-rc7.orig/kernel/power/disk.c
+++ linux-2.6.22-rc7/kernel/power/disk.c
@@ -296,6 +296,10 @@ int hibernate(void)
 {
 	int error;
 
+	printk("Syncing filesystems ... \n");
+	sys_sync();
+	printk("done.\n");
+
 	mutex_lock(&pm_mutex);
 	/* The snapshot device should not be opened while we're running */
 	if (!atomic_add_unless(&snapshot_device_available, -1, 0)) {
Index: linux-2.6.22-rc7/kernel/power/process.c
===================================================================
--- linux-2.6.22-rc7.orig/kernel/power/process.c
+++ linux-2.6.22-rc7/kernel/power/process.c
@@ -190,7 +190,6 @@ int freeze_processes(void)
 	if (error)
 		return error;
 
-	sys_sync();
 	error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
 	if (error)
 		return error;

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 14:59               ` Johannes Berg
@ 2007-07-03 15:22                 ` Rafael J. Wysocki
  2007-07-03 17:38                   ` Miklos Szeredi
  2007-07-03 20:21                 ` Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 15:22 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Alan Stern, Linux-pm mailing list, Kernel development list,
	Pavel Machek, Matthew Garrett, Paul Mackerras

On Tuesday, 3 July 2007 16:59, Johannes Berg wrote:
> On Tue, 2007-07-03 at 10:50 -0400, Alan Stern wrote:
> 
> > Time for me to jump in.
> 
> :)
> 
> > USB already implements runtime PM.  If a device is suspended at runtime
> > and a task tries to access it, the device is automatically resumed.  
> > No problem there.
> 
> Right.
> 
> > The problem comes when the system is doing a STR.  Right now the code
> > doesn't keep track of the difference between a runtime suspend and a
> > system suspend -- once the device is suspended, it's suspended, period.  
> > Consequently, a non-frozen user task trying to do I/O to a suspended
> > device during STR will cause that device to resume, thereby forcing the
> > system suspend to abort.  Something much like this has actually
> > happened and been reported as a bug on LKML (I don't have a URL handy,
> > and it was actually a non-frozen kernel thread interfering with
> > hibernate rather than a non-frozen user task interfering with STR, but
> > the principle is the same).
> 
> Yeah, I can see that happen.
> 
> > Yes, the code could be changed to keep track of the reason for a device
> > suspend.  But that just raises the old problem of what to do when
> > there's an I/O request for a suspended device during STR.
> > 
> > > I think the core of the discussion isn't appreciated by everybody here
> > > yet---we need to solve both run-time and suspend-to-ram-time device
> > > suspend, not just one of them.
> > 
> > Runtime suspend isn't a problem.  Only STR.
> 
> Ah but for all those character devices people were saying are the
> problem we haven't even solved runtime suspend as far as I can tell from
> the discussion.
> 
> > Consider a particularly troublesome case: During STR, a non-frozen task
> > writes to /sys/bus/BBB/drivers/DDD/bind.  The sysfs core grabs the
> > device semaphore and calls the driver's probe routine.  If the driver
> > isn't PM-aware it simply tries to initialize the device and fails
> > because the device is already suspended.  That's no good; it isn't
> > transparent.
> > 
> > So assume the driver is PM-aware.  It tries to resume the device, which
> > fails because STR is underway.  Now what can it do?  There's only one 
> > possibility: It must block until the resume call can succeed.  But when 
> > is that?
> > 
> > It has to be before the PM core tries to resume the device, because the 
> > core will try to acquire the device semaphore and will block waiting 
> > for the probe call to complete.  But it has to be after the PM core 
> > resumes the device's parent, because obviously the device can't resume 
> > until its parent is awake.
> > 
> > As you can see, this is a very difficult problem to solve.
> 
> Indeed. Actually, one could argue that it's impossible to solve the
> problem as long as we try to call out to userspace during suspend and
> need to wait until that's finished, like in the case of sys_sync() and
> fuse filesystems, and probably other cases. Maybe we should make *those*
> calls return a failure so that the suspend isn't transparent inside the
> kernel but is transparent to userspace.

Well, it generally needs more consideration. :-)

I think that we should introduce mechanisms that will allow us to notify all
kernel subsystems, including FUSE and similar, that the system is going to
enter a sleep state (one of those is the notifier chain introduced recently).

Then, they may react to such a notification by entering a "suspend" mode
of operation in which they will return errors from some callbacks that
otherwise should have succeeded etc.  That depends on the subsystem in
question.

Greetings,
Rafael
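
As a rough sketch of the idea above: a userspace model of a notification
chain in which a subsystem switches to a "suspend" mode and fails requests
until it is told the sleep is over.  This is not the kernel's notifier
interface; all names (chain_*, CHAIN_*) are hypothetical and chosen for the
example, and -EBUSY is an arbitrary choice of error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum chain_event { CHAIN_SUSPEND_PREPARE, CHAIN_POST_SUSPEND };

struct chain_notifier {
	void (*call)(struct chain_notifier *nb, enum chain_event ev);
	struct chain_notifier *next;
};

static struct chain_notifier *chain_head;

/* Add a notifier to the front of the (singly linked) chain. */
static void chain_register(struct chain_notifier *nb)
{
	assert(nb != NULL && nb->call != NULL);
	nb->next = chain_head;
	chain_head = nb;
}

/* Deliver an event to every registered notifier. */
static void chain_notify(enum chain_event ev)
{
	struct chain_notifier *nb;

	for (nb = chain_head; nb != NULL; nb = nb->next)
		nb->call(nb, ev);
}

/* A made-up subsystem that refuses work while the system sleeps. */
struct chain_subsys {
	struct chain_notifier nb;	/* first member, so the cast below is valid */
	int suspended;
};

static void chain_subsys_event(struct chain_notifier *nb, enum chain_event ev)
{
	struct chain_subsys *s = (struct chain_subsys *)nb;

	s->suspended = (ev == CHAIN_SUSPEND_PREPARE);
}

/* A callback that would normally succeed fails in "suspend" mode. */
static int chain_subsys_request(struct chain_subsys *s)
{
	return s->suspended ? -EBUSY : 0;
}
```

After chain_notify(CHAIN_SUSPEND_PREPARE), chain_subsys_request() returns
-EBUSY; after chain_notify(CHAIN_POST_SUSPEND) it returns 0 again.  Which
callbacks should fail, and with what error, would depend on the subsystem.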


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:07             ` Oliver Neukum
                                 ` (2 preceding siblings ...)
  2007-07-03 12:58               ` Rafael J. Wysocki
@ 2007-07-03 15:46               ` Miklos Szeredi
  3 siblings, 0 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-03 15:46 UTC (permalink / raw)
  To: oliver; +Cc: linux-pm, benh, nigel, mjg59, linux-kernel

> > > > So to summarize, the plan that makes things work with fuse is:
> > > > 
> > > >  - For STR, don't do the freezer thing.
> > > > 
> > > >  - For STD, don't sys_sync() after you froze
> > > > 
> > > > There might be -other- issues, but that should get you through some of
> > > 
> > > At the risk of repeating myself. Character device drivers are written
> > > with the assumption that normal io and suspend/resume do not race
> > > with each other due to the freezer.
> > > What do you intend to do about that?
> > 
> > Oliver, can you please explain your worries in a bit more detail?
> > 
> > I don't claim to know anything about how STR or hibernate works, but
> > neither seem to have any problem with I/O on the fuse device "racing"
> > with them.
> 
> The problem is not with fuse. The problem is generic in nature.
> 
> If you remove the freezer, user space remains active until the last CPU
> goes into suspend. It can do syscalls. Or do you know a clean way to exempt
> only the tasks fuse might use?
> 
> Now device drivers have a guaranteed temporal sequence:
> 
> last io -> suspend() -> resume() [or disconnect()] -> new io
> 
> This is because suspend() is called after the freezer goes into action. If
> you remove the freezer, you need to deal with
> 
> 1. io to suspended devices
> 2. resume() assuming that the device is in the state suspend() left it
> 3. io changing a device's state while suspend is saving it
> 
> and you need to fix this for all device drivers, not just those fuse is
> involved with.

Fuse is not involved with _any_ device drivers.  It is fully unaware
of suspend issues and I think that's how it should stay ;)

> Removing the freezer means doing a more or less full
> audit of every driver and additional locking in many drivers.

How about a "CONFIG_NOFREEZE (experimental): only turn this on if you
want to fix buggy drivers that can fail during suspend with the
freezer turned off"?

I'm guessing quite a few kernel developers would be willing to turn on
such an option.

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:23           ` Paul Mackerras
  2007-07-03 11:42             ` Oliver Neukum
@ 2007-07-03 15:58             ` Alan Stern
  2007-07-04  4:02               ` Paul Mackerras
  1 sibling, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-03 15:58 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Oliver Neukum, Matthew Garrett, linux-pm, linux-kernel

On Tue, 3 Jul 2007, Paul Mackerras wrote:

> Going back to the old powerbook sleep code, we had a two-phase
> suspend: drivers got notified once when userspace is still running,
> with interrupts enabled, in process context; and then a second time
> with interrupts disabled and with only one CPU up, so the process
> that is initiating the suspend is the only process running (since
> interrupts are disabled and nothing it does can sleep, no other
> process can get to run).
> 
> I still believe that is the right way to go, although we currently
> only have a single-phase suspend.
> 
> Most drivers suspended their hardware in the second call.  If they are
> in the middle of a conversation with their device that *has* to be
> completed, they can do that by polling.

Ugh.  That will cause problems when you try to integrate runtime 
suspend.  In fact this whole approach is unsuitable for runtime PM and 
it obscures the similarities between runtime PM and STR.

>  If it's a character device, a
> better approach would be to set a flag or whatever in the first
> suspend call to make sure that no new conversations get started with
> the device, sleeping if necessary.
> 
> I'm actually having a hard time thinking of how to test your assertion
> since there are so few things on a typical computer that are plain
> character devices driving real hardware.  A serial port would be about
> the only one; keyboards and mice (and serial ports :) are USB these
> days, or ADB on older powerbooks.

You don't have to restrict yourself to character devices driving real 
hardware.  The same issues apply to USB and other buses.

Alan Stern
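
The flag approach quoted above (stop new conversations in the first suspend
call, then power down once the device is idle) can be sketched as a small
single-threaded model.  This is an assumption-laden toy, not code from any
driver: the gate_* names are invented, a real driver would need locking
around these fields, and the model returns -EAGAIN where the quoted text
suggests sleeping.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state. */
struct gate_dev {
	bool suspending;	/* set by the first suspend phase */
	int in_flight;		/* conversations currently in progress */
};

/* Start a conversation; refuse once suspend has begun. */
static int gate_io_begin(struct gate_dev *d)
{
	if (d->suspending)
		return -EAGAIN;
	d->in_flight++;
	return 0;
}

/* Finish a conversation started with gate_io_begin(). */
static void gate_io_end(struct gate_dev *d)
{
	assert(d->in_flight > 0);
	d->in_flight--;
}

/* Phase 1: from now on no new I/O is admitted. */
static void gate_suspend_prepare(struct gate_dev *d)
{
	d->suspending = true;
}

/* Phase 2: only succeeds after phase 1 and once the device is idle. */
static int gate_suspend(struct gate_dev *d)
{
	if (!d->suspending)
		return -EINVAL;
	if (d->in_flight > 0)
		return -EBUSY;
	return 0;
}

/* Resume: admit I/O again. */
static void gate_resume(struct gate_dev *d)
{
	d->suspending = false;
}
```

With one conversation open, gate_suspend_prepare() makes further
gate_io_begin() calls fail with -EAGAIN and gate_suspend() fail with -EBUSY
until gate_io_end() is called; that ordering is the "last io -> suspend() ->
resume() -> new io" sequence enforced by the driver itself instead of the
freezer.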


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  4:29 [PATCH] Remove process freezer from suspend to RAM pathway Matthew Garrett
                   ` (5 preceding siblings ...)
  2007-07-03 12:56 ` Rafael J. Wysocki
@ 2007-07-03 16:03 ` Alan Stern
  2007-07-03 16:05   ` Matthew Garrett
  2007-07-04 23:33 ` Pavel Machek
  7 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-03 16:03 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 3 Jul 2007, Matthew Garrett wrote:

> Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> a screaming nightmare - either the suspend fails because syslog (for 
> instance) can't be frozen, or the machine deadlocks for some other 
> reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> could do what we do for suspend to RAM on other platforms (PPC and APM) 
> and just not use the freezer.

Quite apart from the sync() matter, _any_ synchronous call to a FUSE 
filesystem during STR will cause trouble.  Even if the user task 
implementing the filesystem isn't frozen, when it tries to carry out 
some I/O to a suspended device it will either:

	block until the system wakes up, or

	cause the suspend to abort.

Neither outcome is desirable.

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 16:03 ` [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway Alan Stern
@ 2007-07-03 16:05   ` Matthew Garrett
  2007-07-03 16:57     ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03 16:05 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-kernel, linux-pm

On Tue, Jul 03, 2007 at 12:03:33PM -0400, Alan Stern wrote:
> On Tue, 3 Jul 2007, Matthew Garrett wrote:
> 
> > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > a screaming nightmare - either the suspend fails because syslog (for 
> > instance) can't be frozen, or the machine deadlocks for some other 
> > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > and just not use the freezer.
> 
> Quite apart from the sync() matter, _any_ synchronous call to a FUSE 
> filesystem during STR will cause trouble.  Even if the user task 
> implementing the filesystem isn't frozen, when it tries to carry out 
> some I/O to a suspended device it will either:
> 
> 	block until the system wakes up, or

For the suspend to RAM case, that sounds absolutely fine.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 16:05   ` Matthew Garrett
@ 2007-07-03 16:57     ` Alan Stern
  2007-07-03 17:02       ` Matthew Garrett
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-03 16:57 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 3 Jul 2007, Matthew Garrett wrote:

> On Tue, Jul 03, 2007 at 12:03:33PM -0400, Alan Stern wrote:
> > On Tue, 3 Jul 2007, Matthew Garrett wrote:
> > 
> > > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > > a screaming nightmare - either the suspend fails because syslog (for 
> > > instance) can't be frozen, or the machine deadlocks for some other 
> > > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > > and just not use the freezer.
> > 
> > Quite apart from the sync() matter, _any_ synchronous call to a FUSE 
> > filesystem during STR will cause trouble.  Even if the user task 
> > implementing the filesystem isn't frozen, when it tries to carry out 
> > some I/O to a suspended device it will either:
> > 
> > 	block until the system wakes up, or
> 
> For the suspend to RAM case, that sounds absolutely fine.

It's not so good when your suspend process has to wait for the call to 
complete!

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 16:57     ` Alan Stern
@ 2007-07-03 17:02       ` Matthew Garrett
  2007-07-03 19:33         ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03 17:02 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-kernel, linux-pm

On Tue, Jul 03, 2007 at 12:57:17PM -0400, Alan Stern wrote:
> On Tue, 3 Jul 2007, Matthew Garrett wrote:
> > On Tue, Jul 03, 2007 at 12:03:33PM -0400, Alan Stern wrote:
> > > Quite apart from the sync() matter, _any_ synchronous call to a FUSE 
> > > filesystem during STR will cause trouble.  Even if the user task 
> > > implementing the filesystem isn't frozen, when it tries to carry out 
> > > some I/O to a suspended device it will either:
> > > 
> > > 	block until the system wakes up, or
> > 
> > For the suspend to RAM case, that sounds absolutely fine.
> 
> It's not so good when your suspend process has to wait for the call to 
> complete!

Why would it have to? Sorry, I suspect I'm missing something obvious 
here.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 15:09     ` Rafael J. Wysocki
@ 2007-07-03 17:20       ` Oliver Neukum
  2007-07-03 20:59         ` Rafael J. Wysocki
  2007-07-04 23:39         ` Pavel Machek
  2007-07-03 18:26       ` Oliver Neukum
  2007-07-03 19:27       ` Pavel Machek
  2 siblings, 2 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 17:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel, linux-pm,
	Pavel Machek, Nigel Cunningham

On Tuesday, 3 July 2007, Rafael J. Wysocki wrote:
> On Tuesday, 3 July 2007 15:08, Rafael J. Wysocki wrote:
> > On Tuesday, 3 July 2007 07:51, Benjamin Herrenschmidt wrote:
> > > On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> > > > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > > > a screaming nightmare - either the suspend fails because syslog (for 
> > > > instance) can't be frozen, or the machine deadlocks for some other 
> > > > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > > > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > > > and just not use the freezer.
> > > 
> > > The main reason for deadlocks is because we do a sys_sync() after the
> > > freeze, which we shouldn't do.
> > 
> > So why don't we remove the sys_sync() from freeze_processes() instead?
> 
> The patch follows (untested).
> 
> Greetings,
> Rafael
> 
> 
> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> We shouldn't sync filesystems from within the freezer, because it's not needed
> for suspend to RAM and leads to problems with FUSE.

This seems fishy. Swsusp needs enough clean memory to make enough
room for the image. If you sync before you freeze, the running tasks can
redirty memory.
What makes you sure that you don't die as shrink_all_memory() writes out
pages?

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 15:22                 ` Rafael J. Wysocki
@ 2007-07-03 17:38                   ` Miklos Szeredi
  2007-07-03 20:54                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-03 17:38 UTC (permalink / raw)
  To: rjw; +Cc: johannes, stern, linux-pm, linux-kernel, pavel, mjg59, paulus

> > Indeed. Actually, one could argue that it's impossible to solve the
> > problem as long as we try to call out to userspace during suspend and
> > need to wait until that's finished, like in the case of sys_sync() and
> > fuse filesystems, and probably other cases. Maybe we should make *those*
> > calls return a failure so that the suspend isn't transparent inside the
> > kernel but is transparent to userspace.
> 
> Well, it generally needs more consideration. :-)
> 
> I think that we should introduce mechanisms that will allow us to notify all
> kernel subsystems, including FUSE and similar, that the system is going to
> enter a sleep state (one of those is the notifier chain introduced recently).

Ugh, please no.

Believe me, fuse is doing _nothing_ out of the ordinary, and should
not need special treatment during suspend/resume.  If suspend itself
is doing something that triggers fuse activity, then that's a bug,
such as the sync() thing that started this thread.

> Then, they may react to such a notification by entering a "suspend" mode
> of operation in which they will return errors from some callbacks that
> otherwise should have succeeded etc.  That depends on the subsystem in
> question.

Sounds horrible.

Why do we need to deal with subsystem interdependencies during
suspend?  Isn't it about saving device state to ram?  That definitely
_should not_ need to trigger anything that touches filesystems or
other subsystems.

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 15:09     ` Rafael J. Wysocki
  2007-07-03 17:20       ` Oliver Neukum
@ 2007-07-03 18:26       ` Oliver Neukum
  2007-07-03 19:13         ` Miklos Szeredi
  2007-07-03 21:09         ` Rafael J. Wysocki
  2007-07-03 19:27       ` Pavel Machek
  2 siblings, 2 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 18:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel, linux-pm,
	Pavel Machek, Nigel Cunningham

On Tuesday, 3 July 2007, Rafael J. Wysocki wrote:
> > > The main reason for deadlocks is because we do a sys_sync() after the
> > > freeze, which we shouldn't do.
> > 
> > So why don't we remove the sys_sync() from freeze_processes() instead?
> 
> The patch follows (untested).

And a further question. The freezer is not atomic. What do you do
if a task not yet frozen calls sys_sync(), but fuse is already frozen?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 18:26       ` Oliver Neukum
@ 2007-07-03 19:13         ` Miklos Szeredi
  2007-07-03 19:32           ` Oliver Neukum
  2007-07-03 21:09         ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-03 19:13 UTC (permalink / raw)
  To: oliver; +Cc: rjw, benh, mjg59, linux-kernel, linux-pm, pavel, nigel

> And a further question. The freezer is not atomic. What do you do
> if a task not yet frozen calls sys_sync(), but fuse is already frozen?

What do you do if a task not yet frozen writes to a pipe, on the other
end of which is a task already frozen?

It doesn't matter.  The only thing that should matter during suspend
(not hibernate) is saving the state of devices to ram, and putting the
devices to sleep.

I'm not sure why this can't be made atomic, but assuming that it
can't, fuse should still not need to be implicated.  If it is, that's
an indication about something wrong in the suspend procedure.

Miklos


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 15:09     ` Rafael J. Wysocki
  2007-07-03 17:20       ` Oliver Neukum
  2007-07-03 18:26       ` Oliver Neukum
@ 2007-07-03 19:27       ` Pavel Machek
  2007-07-03 21:25         ` Rafael J. Wysocki
  2 siblings, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-03 19:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel, linux-pm,
	Nigel Cunningham

Hi!

> > > The main reason for deadlocks is because we do a sys_sync() after the
> > > freeze, which we shouldn't do.
> > 
> > So why don't we remove the sys_sync() from freeze_processes() instead?
> 
> The patch follows (untested).
> 
> Greetings,
> Rafael
> 
> 
> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> We shouldn't sync filesystems from within the freezer, because it's not needed
> for suspend to RAM and leads to problems with FUSE.

Actually... It is not _needed_ for suspend to disk, either. Snapshot is
atomic, so it should be okay to suspend with filesystems dirty.

_But_, if anything goes wrong, we'd prefer to have at least
filesystems synced. The battery running out during s2ram is not
uncommon, so perhaps we should do a sync somewhere there. (But we can do
it before the freezer just fine).
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 19:13         ` Miklos Szeredi
@ 2007-07-03 19:32           ` Oliver Neukum
  2007-07-03 19:47             ` Miklos Szeredi
                               ` (3 more replies)
  0 siblings, 4 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 19:32 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: rjw, benh, mjg59, linux-kernel, linux-pm, pavel, nigel

On Tuesday, 3 July 2007, Miklos Szeredi wrote:
> > And a further question. The freezer is not atomic. What do you do
> > if a task not yet frozen calls sys_sync(), but fuse is already frozen?
> 
> What do you do if a task not yet frozen writes to a pipe, on the other
> end of which is a task already frozen?

The same as you do with a pipe when the reader is not ready.

> It doesn't matter.  The only thing that should matter during suspend
> (not hibernate) is saving the state of devices to ram, and putting the
> devices to sleep.

Well, but you did remove sys_sync() from the freezer, which is
and must be called in the hibernate path.
 
> I'm not sure why this can't be made atomic, but assuming that it
> can't, fuse should still not need to be implicated.  If it is, that's
> an indication about something wrong in the suspend procedure.

Nope, something's wrong in fuse. You must be able to deal with sync
until every task is frozen.

	Regards
		Oliver



* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 17:02       ` Matthew Garrett
@ 2007-07-03 19:33         ` Alan Stern
  2007-07-03 19:42           ` Matthew Garrett
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-03 19:33 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 3 Jul 2007, Matthew Garrett wrote:

> On Tue, Jul 03, 2007 at 12:57:17PM -0400, Alan Stern wrote:
> > On Tue, 3 Jul 2007, Matthew Garrett wrote:
> > > On Tue, Jul 03, 2007 at 12:03:33PM -0400, Alan Stern wrote:
> > > > Quite apart from the sync() matter, _any_ synchronous call to a FUSE 
> > > > filesystem during STR will cause trouble.  Even if the user task 
> > > > implementing the filesystem isn't frozen, when it tries to carry out 
> > > > some I/O to a suspended device it will either:
> > > > 
> > > > 	block until the system wakes up, or
> > > 
> > > For the suspend to RAM case, that sounds absolutely fine.
> > 
> > It's not so good when your suspend process has to wait for the call to 
> > complete!
> 
> Why would it have to? Sorry, I suspect I'm missing something obvious 
> here.

Well, the sys_sync() that caused your original problem did exactly 
that.  It's the reason you get deadlocks, right?

I agree that in general the suspend process should not have to wait for 
a userspace callback to complete.  Indeed, there's no particular 
reason that anything running during STR should have to wait for 
something in userspace to complete.  Given that fact, I don't see 
anything wrong with freezing userspace when doing STR.

Alan Stern



* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 19:33         ` Alan Stern
@ 2007-07-03 19:42           ` Matthew Garrett
  2007-07-03 19:54             ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03 19:42 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-kernel, linux-pm

On Tue, Jul 03, 2007 at 03:33:40PM -0400, Alan Stern wrote:
> On Tue, 3 Jul 2007, Matthew Garrett wrote:
> 
> > On Tue, Jul 03, 2007 at 12:57:17PM -0400, Alan Stern wrote:
> > > On Tue, 3 Jul 2007, Matthew Garrett wrote:
> > > > For the suspend to RAM case, that sounds absolutely fine.
> > > 
> > > It's not so good when your suspend process has to wait for the call to 
> > > complete!
> > 
> > Why would it have to? Sorry, I suspect I'm missing something obvious 
> > here.
> 
> Well, the sys_sync() that caused your original problem did exactly 
> that.  It's the reason you get deadlocks, right?

The sys_sync is unnecessary in the first place. There shouldn't be 
anything in the suspend path that's going to require userspace access to 
a device after that device has been suspended.

> I agree that in general the suspend process should not have to wait for 
> a userspace callback to complete.  Indeed, there's no particular 
> reason that anything running during STR should have to wait for 
> something in userspace to complete.  Given that fact, I don't see 
> anything wrong with freezing userspace when doing STR.

There's nothing wrong with it as such, it's just that our implementation 
appears to suck in a myriad of small ways that keep cropping up and 
biting people. Even without the sys_sync(), freezing processes results 
in the suspend failing because syslog is stuck in D state and won't go 
into the refrigerator.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 19:32           ` Oliver Neukum
@ 2007-07-03 19:47             ` Miklos Szeredi
  2007-07-03 20:02             ` [linux-pm] " Alan Stern
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-03 19:47 UTC (permalink / raw)
  To: oliver; +Cc: rjw, benh, mjg59, linux-kernel, linux-pm, pavel, nigel

> On Tuesday, 3 July 2007, Miklos Szeredi wrote:
> > > And a further question. The freezer is not atomic. What do you do
> > > if a task not yet frozen calls sys_sync(), but fuse is already frozen?
> > 
> > What do you do if a task not yet frozen writes to a pipe, on the other
> > end of which is a task already frozen?
> 
> The same as you do with a pipe when the reader is not ready.

Yes.  And if the fuse process is not ready, what will sys_sync() do?

(As a side note, sys_sync() actually does nothing in fuse, so these
examples are slightly silly, but the concept applies to any filesystem
operation).

> > It doesn't matter.  The only thing that should matter during suspend
> > (not hibernate) is saving the state of devices to ram, and putting the
> > devices to sleep.
> 
> Well, but you did remove sys_sync() from the freezer, which is
> and must be called in the hibernate path.

Since I don't know why sync() needs to be called from hibernate, I
can't argue about this.  But for _suspend_ it definitely should not be
needed.

> > I'm not sure why this can't be made atomic, but assuming that it
> > can't, fuse should still not need to be implicated.  If it is, that's
> > an indication about something wrong in the suspend procedure.
> 
> Nope, something's wrong in fuse. You must be able to deal with sync
> until every task is frozen.

Why exactly?  That's like saying the pipe must be able to deal with
writes until every task is frozen.

As I've said, fuse is just a special kind of IPC.  The task calling into
fuse can assume nothing about when its request will be served.  Just
like a task writing into a pipe can assume nothing about when that
write will return.

Miklos


* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 19:42           ` Matthew Garrett
@ 2007-07-03 19:54             ` Alan Stern
  2007-07-03 20:23               ` Matthew Garrett
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-03 19:54 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 3 Jul 2007, Matthew Garrett wrote:

> > I agree that in general the suspend process should not have to wait for 
> > a userspace callback to complete.  Indeed, there's no particular 
> > reason that anything running during STR should have to wait for 
> > something in userspace to complete.  Given that fact, I don't see 
> > anything wrong with freezing userspace when doing STR.
> 
> There's nothing wrong with it as such, it's just that our implementation 
> appears to suck in a myriad of small ways that keep cropping up and 
> biting people. Even without the sys_sync(), freezing processes results 
> in the suspend failing because syslog is stuck in D state and won't go 
> into the refrigerator.

Okay, I can believe that.  The proper response then is to fix the 
freezer, not eliminate it.  Has the syslog problem been reported on 
linux-pm?  I don't recall hearing of it before.

Alan Stern



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 19:32           ` Oliver Neukum
  2007-07-03 19:47             ` Miklos Szeredi
@ 2007-07-03 20:02             ` Alan Stern
  2007-07-03 20:19               ` Miklos Szeredi
  2007-07-03 20:45               ` Oliver Neukum
  2007-07-03 21:20             ` Benjamin Herrenschmidt
  2007-07-04 23:45             ` Pavel Machek
  3 siblings, 2 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-03 20:02 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Miklos Szeredi, mjg59, linux-kernel, pavel, linux-pm

On Tue, 3 Jul 2007, Oliver Neukum wrote:

> Well, but you did remove sys_sync() from the freezer, which is
> and must be called in the hibernate path.

That's not really true.  We _want_ to call sys_sync() in both the 
hibernate and suspend paths (in case the batteries run down), to help 
avoid filesystem problems if something goes wrong with the resume.  But 
it isn't a hard requirement.

> > I'm not sure why this can't be made atomic, but assuming that it
> > can't, fuse should still not need to be implicated.  If it is, that's
> > an indication about something wrong in the suspend procedure.
> 
> Nope, something's wrong in fuse. You must be able to deal with sync
> until every task is frozen.

That's ridiculous.  FUSE itself runs partially as a user task.  How can
you expect it to carry out a sync or anything else when it is frozen?

I suppose you could "deal" with it by having the kernel portion return
an error if the userspace part is frozen.  If the hibernate/suspend 
code bothered to check the return value, it would immediately abort 
the suspend.
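
A rough userspace sketch of that idea, not kernel code. The names
sim_fuse_conn, sim_fuse_request() and sim_suspend_sync() are hypothetical,
and returning -EBUSY is an assumption made only for illustration; nothing
here reflects how the real FUSE module or the suspend core behaves.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one FUSE connection (not the kernel's struct). */
struct sim_fuse_conn {
    bool daemon_frozen;  /* userspace half is in the refrigerator */
    int  dirty;          /* pending writeback items */
};

/* "Kernel portion": fail fast instead of waiting on a frozen daemon. */
static int sim_fuse_request(struct sim_fuse_conn *fc)
{
    assert(fc);
    if (fc->daemon_frozen)
        return -EBUSY;   /* assumed error code, chosen for the sketch */
    fc->dirty = 0;       /* daemon is running, so the request completes */
    return 0;
}

/* Suspend path that checks the return value and aborts on error. */
static int sim_suspend_sync(struct sim_fuse_conn *fc)
{
    int err = sim_fuse_request(fc);

    if (err)
        return err;      /* abort the suspend rather than deadlock */
    return 0;
}
```

With a running daemon the sync completes and the suspend can proceed; with
a frozen daemon the caller sees the error at once and can back out.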

Alan Stern



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 20:02             ` [linux-pm] " Alan Stern
@ 2007-07-03 20:19               ` Miklos Szeredi
  2007-07-03 21:20                 ` Rafael J. Wysocki
  2007-07-03 20:45               ` Oliver Neukum
  1 sibling, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-03 20:19 UTC (permalink / raw)
  To: stern; +Cc: oliver, miklos, mjg59, linux-kernel, pavel, linux-pm

> > Well, but you did remove sys_sync() from the freezer, which is
> > and must be called in the hibernate path.
> 
> That's not really true.  We _want_ to call sys_sync() in both the 
> hibernate and suspend paths (in case the batteries run down), to help 
> avoid filesystem problems if something goes wrong with the resume.  But 
> it isn't a hard requirement.
> 
> > > I'm not sure why this can't be made atomic, but assuming that it
> > > can't, fuse should still not need to be implicated.  If it is, that's
> > > an indication about something wrong in the suspend procedure.
> > 
> > Nope, something's wrong in fuse. You must be able to deal with sync
> > until every task is frozen.
> 
> That's ridiculous.  FUSE itself runs partially as a user task.  How can
> you expect it to carry out a sync or anything else when it is frozen?
> 
> I suppose you could "deal" with it by having the kernel portion return
> an error if the userspace part is frozen.  If the hibernate/suspend 
> code bothered to check the return value, it would immediately abort 
> the suspend.

I strongly believe that we don't want to deal with it.  If we want to
call sync(), do it while the system is fully operational.  It's a best
effort thing anyway, and you can lose data in other ways if resume
fails.

Miklos


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 14:59               ` Johannes Berg
  2007-07-03 15:22                 ` Rafael J. Wysocki
@ 2007-07-03 20:21                 ` Alan Stern
  2007-07-04  4:59                   ` Paul Mackerras
  1 sibling, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-03 20:21 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett,
	Paul Mackerras, Benjamin Herrenschmidt

On Tue, 3 Jul 2007, Johannes Berg wrote:

> > Runtime suspend isn't a problem.  Only STR.
> 
> Ah but for all those character devices people were saying are the
> problem we haven't even solved runtime suspend as far as I can tell from
> the discussion.

The technique used by USB for runtime PM should work fine with other
devices.  They may not have implemented it yet, but I consider the
matter more-or-less solved.


> > As you can see, this is a very difficult problem to solve.
> 
> Indeed. Actually, one could argue that it's impossible to solve the
> problem as long as we try to call out to userspace during suspend and
> need to wait until that's finished, like in the case of sys_sync() and
> fuse filesystems, and probably other cases. Maybe we should make *those*
> calls return a failure so that the suspend isn't transparent inside the
> kernel but is transparent to userspace.

I disagree.  The problem isn't the kernel calling userspace; it's
userspace trying to do I/O at a time when everything is supposed to be
quiescing.  Detecting that and blocking it in drivers is hard and
error-prone; preventing it by freezing userspace is easy and cheap.

The reasons why the PPC people dislike the whole idea aren't clear to
me.  If it were necessary to have some user task running in order to
carry out the STR then their objection would make sense -- obviously
that task couldn't do its job if it were frozen.  But it isn't
necessary, or at least it should not be.

Userspace will be effectively "frozen" while the system as a whole is 
suspended.  So what's wrong with freezing it a little early?  Despite 
Ben's comments, it seems to me that the freezer doesn't hide problems 
-- it prevents them.

Now people may claim that the freezer implementation itself is buggy.  
I wouldn't dispute it.  But the bugs should be fixable; nobody has 
pointed out anything fundamentally wrong with the idea AFAICT.

Alan Stern



* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 19:54             ` Alan Stern
@ 2007-07-03 20:23               ` Matthew Garrett
  2007-07-03 21:10                 ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03 20:23 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-kernel, linux-pm

On Tue, Jul 03, 2007 at 03:54:55PM -0400, Alan Stern wrote:
> On Tue, 3 Jul 2007, Matthew Garrett wrote:
> > There's nothing wrong with it as such, it's just that our implementation 
> > appears to suck in a myriad of small ways that keep cropping up and 
> > biting people. Even without the sys_sync(), freezing processes results 
> > in the suspend failing because syslog is stuck in D state and won't go 
> > into the refrigerator.
> 
> Okay, I can believe that.  The proper response then is to fix the 
> freezer, not eliminate it.  Has the syslog problem been reported on 
> linux-pm?  I don't recall hearing of it before.

See the start of this thread. It's just not clear what the freezer buys 
us - removing it gets rid of a load of subtle issues and complexity, and 
turns system suspend into something that looks more like runtime 
suspend (which might then encourage people to get runtime suspend 
right...)

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 20:02             ` [linux-pm] " Alan Stern
  2007-07-03 20:19               ` Miklos Szeredi
@ 2007-07-03 20:45               ` Oliver Neukum
  1 sibling, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 20:45 UTC (permalink / raw)
  To: Alan Stern; +Cc: Miklos Szeredi, mjg59, linux-kernel, pavel, linux-pm

On Tuesday, 3 July 2007, you wrote:
> On Tue, 3 Jul 2007, Oliver Neukum wrote:
> 
> > Well, but you did remove sys_sync() from the freezer, which is
> > and must be called in the hibernate path.
> 
> That's not really true.  We _want_ to call sys_sync() in both the 
> hibernate and suspend paths (in case the batteries run down), to help 
> avoid filesystem problems if something goes wrong with the resume.  But 
> it isn't a hard requirement.

But the ability to launder pages is needed. During hibernation we need
to shrink memory. I don't see how this would be fundamentally different
from calling sync.

> > > I'm not sure why this can't be made atomic, but assuming that it
> > > can't, fuse should still not need to be implicated.  If it is, that's
> > > an indication about something wrong in the suspend procedure.
> > 
> > Nope, something's wrong in fuse. You must be able to deal with sync
> > until every task is frozen.
> 
> That's ridiculous.  FUSE itself runs partially as a user task.  How can
> you expect it to carry out a sync or anything else when it is frozen?

I don't and it might point to a fundamental problem.
But I cannot help but notice that syscalls may happen while the system
is partially frozen. It must be dealt with.

> I suppose you could "deal" with it by having the kernel portion return
> an error if the userspace part is frozen.  If the hibernate/suspend 
> code bothered to check the return value, it would immediately abort 
> the suspend.

Where exactly would that code notice the errors?

	Regards
		Oliver



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 17:38                   ` Miklos Szeredi
@ 2007-07-03 20:54                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 20:54 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: johannes, stern, linux-pm, linux-kernel, pavel, mjg59, paulus

On Tuesday, 3 July 2007 19:38, Miklos Szeredi wrote:
> > > Indeed. Actually, one could argue that it's impossible to solve the
> > > problem as long as we try to call out to userspace during suspend and
> > > need to wait until that's finished, like in the case of sys_sync() and
> > > fuse filesystems, and probably other cases. Maybe we should make *those*
> > > calls return a failure so that the suspend isn't transparent inside the
> > > kernel but is transparent to userspace.
> > 
> > Well, it generally needs more consideration. :-)
> > 
> > I think that we should introduce mechanisms that will allow us to notify all
> > kernel subsystems, including FUSE and similar, that the system is going to
> > enter a sleep state (one of those is the notifier chain introduced recently).
> 
> Ugh, please no.
> 
> Believe me, fuse is doing _nothing_ out of the ordinary, and should
> not need special treatment during suspend/resume.  If suspend itself
> is doing something that triggers fuse activity, then that's a bug,
> such as the sync() thing that started this thread.

Apart from the sync, it shouldn't trigger any fs activity.  Still, some other
task running concurrently with the suspend code may do that and _if_
we are going to allow that to happen (and we do, if we remove the freezer
from the suspend code path), we will have to take that into consideration.

> > Then, they may react to such a notification by entering a "suspend" mode
> > of operation in which they will return errors from some callbacks that
> > otherwise should have succeeded etc.  That depends on the subsystem in
> > question.
> 
> Sounds horrible.
> 
> Why do we need to deal with subsystem interdependencies during
> suspend?  Isn't it about saving device state to ram?

No.  In addition, the devices should be prevented from generating interrupts
or initiating DMA transfers and put into low power states before we actually
suspend the system (platform).  This is a bit difficult to do while all user
space is running.

> That definitely _should not_ need to trigger anything that touches
> filesystems or other subsystems.

Right, but the subsystems may do something that affects devices.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 17:20       ` Oliver Neukum
@ 2007-07-03 20:59         ` Rafael J. Wysocki
  2007-07-03 21:35           ` Benjamin Herrenschmidt
  2007-07-04 23:39         ` Pavel Machek
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 20:59 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel, linux-pm,
	Pavel Machek, Nigel Cunningham

On Tuesday, 3 July 2007 19:20, Oliver Neukum wrote:
> On Tuesday, 3 July 2007, Rafael J. Wysocki wrote:
> > On Tuesday, 3 July 2007 15:08, Rafael J. Wysocki wrote:
> > > On Tuesday, 3 July 2007 07:51, Benjamin Herrenschmidt wrote:
> > > > On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> > > > > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > > > > a screaming nightmare - either the suspend fails because syslog (for 
> > > > > instance) can't be frozen, or the machine deadlocks for some other 
> > > > > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > > > > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > > > > and just not use the freezer.
> > > > 
> > > > The main reason for deadlocks is because we do a sys_sync() after the
> > > > freeze, which we shouldn't do.
> > > 
> > > So why don't we remove the sys_sync() from freeze_processes() instead?
> > 
> > The patch follows (untested).
> > 
> > Greetings,
> > Rafael
> > 
> > 
> > ---
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > We shouldn't sync filesystems from within the freezer, because it's not needed
> > for suspend to RAM and leads to problems with FUSE.
> 
> This seems fishy. Swsusp needs enough clean memory to make enough
> room for the image. If you sync before you freeze, the running tasks can
> redirty memory.
> What makes you sure that you don't die as shrink_all_memory() writes out
> pages?

I don't think that would matter.

Still, I can remove the sync from the suspend code path only, leaving it in
the hibernation code path.  The patch will be bigger, but well.

Any objection to that?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 18:26       ` Oliver Neukum
  2007-07-03 19:13         ` Miklos Szeredi
@ 2007-07-03 21:09         ` Rafael J. Wysocki
  1 sibling, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 21:09 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel, linux-pm,
	Pavel Machek, Nigel Cunningham

On Tuesday, 3 July 2007 20:26, Oliver Neukum wrote:
> On Tuesday, 3 July 2007, Rafael J. Wysocki wrote:
> > > > The main reason for deadlocks is because we do a sys_sync() after the
> > > > freeze, which we shouldn't do.
> > > 
> > > So why don't we remove the sys_sync() from freeze_processes() instead?
> > 
> > The patch follows (untested).
> 
> And a further question. The freezer is not atomic. What do you do
> if a task not yet frozen calls sys_sync(), but fuse is already frozen?

Hmm, if the sync is interruptible (I'm not sure), the task should be frozen while
waiting for it to complete.

Otherwise, the freezing of tasks will fail (no deadlock).

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 20:23               ` Matthew Garrett
@ 2007-07-03 21:10                 ` Alan Stern
  2007-07-03 21:12                   ` Matthew Garrett
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-03 21:10 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 3 Jul 2007, Matthew Garrett wrote:

> See the start of this thread. It's just not clear what the freezer buys 
> us - removing it gets rid of a load of subtle issues and complexity, and 
> turns system suspend into something that looks more like runtime 
> suspend (which might then encourage people to get runtime suspend 
> right...)

No, no -- you have it exactly backwards.  Removing the freezer turns 
STR into something _less_ like runtime suspend, because it adds the 
requirement that devices must not automatically be resumed when an I/O 
request arrives.
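
To illustrate the difference, a small userspace sketch (not kernel code).
struct sim_dev, sim_submit_io() and the system_suspending parameter are
hypothetical names; real drivers have no such shared helper, and holding
the request with -EAGAIN is just one assumed policy among several.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device model used only for this sketch. */
struct sim_dev {
    bool suspended;  /* device is in a low-power state */
    int  queued;     /* requests held back during system suspend */
};

/*
 * I/O submission path. Under runtime PM an incoming request resumes the
 * device automatically. Without the freezer, the same path must also
 * handle a system-wide suspend, where auto-resume is not allowed.
 */
static int sim_submit_io(struct sim_dev *dev, bool system_suspending)
{
    assert(dev);
    if (dev->suspended) {
        if (system_suspending) {
            dev->queued++;       /* hold the request, keep the device down */
            return -EAGAIN;      /* assumed policy for the sketch */
        }
        dev->suspended = false;  /* runtime PM case: resume on demand */
    }
    return 0;
}
```

The extra branch is the per-driver requirement in question: every I/O
entry point would need an equivalent check and a policy for what to do
with the request.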

Alan Stern



* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:10                 ` Alan Stern
@ 2007-07-03 21:12                   ` Matthew Garrett
  2007-07-03 21:16                     ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03 21:12 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-kernel, linux-pm

On Tue, Jul 03, 2007 at 05:10:08PM -0400, Alan Stern wrote:

> No, no -- you have it exactly backwards.  Removing the freezer turns 
> STR into something _less_ like runtime suspend, because it adds the 
> requirement that devices must not automatically be resumed when an I/O 
> request arrives.

But that's fine - "Are we undergoing a systemwide suspend" is an easy 
question to ask. Freezing processes instead means that most of those 
paths will never be tested.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 12:56         ` Rafael J. Wysocki
  2007-07-03 14:21           ` [linux-pm] " Johannes Berg
@ 2007-07-03 21:14           ` Benjamin Herrenschmidt
  2007-07-03 21:32             ` Rafael J. Wysocki
  2007-07-05  9:30             ` [PATCH] Remove process freezer from suspend to RAM pathway Pavel Machek
  1 sibling, 2 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03 21:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Matthew Garrett, linux-kernel, linux-pm,
	Alan Stern, Pavel Machek


> >  - For STR, don't do the freezer thing.
> 
> In the long run, I agree.
> 
> Still, can you please read this post from Alan Stern:
> 
> https://lists.linux-foundation.org/pipermail/linux-pm/2007-June/012847.html
> 
> ?  I don't think I'm able to repeat the arguments given in there in a
> convincing way.

That's the same crackpot I've been hearing for the past 3 years or
so ...

Both Paulus and I think the freezer is just a way to try to put your
head in the sand and ignore the problem. It causes as many problems as
it solves on its own, and is just not a solution that will be of any use
once you start implementing dynamic PM schemes etc...

In many cases, having proper support for "live" suspend of devices is
just a matter of having a couple of helpers in whatever subsystem those
drivers hook up with. In the case of network, for example, it's mostly
trivial (stop the queue). For block, it's not terribly hard either,
though you want to have some ordering/atomicity between the blocking of
the incoming request queue and the sending of things like spindown &
flush commands to the disk. For old-style IDE, that was fairly easily
solved by piping suspend/resume commands down the request queue itself
and having the queue block/unblock itself after processing them. Some of
that logic could maybe be moved to the block layer for all block drivers
to benefit.
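
The queue-based ordering can be sketched in userspace roughly as follows.
This is not the IDE or block layer code: struct sim_queue and the sim_*
functions are hypothetical, and for brevity the resume is modelled as a
direct call rather than as a second command sent down the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sim_req_type { SIM_REQ_IO, SIM_REQ_SUSPEND };

#define SIM_QUEUE_MAX 16

/* Hypothetical FIFO request queue for one device. */
struct sim_queue {
    enum sim_req_type reqs[SIM_QUEUE_MAX];
    size_t head;       /* next request to process */
    size_t tail;       /* next free slot */
    bool blocked;      /* set once the suspend request has been processed */
    int completed_io;  /* ordinary requests that reached the device */
};

static bool sim_enqueue(struct sim_queue *q, enum sim_req_type type)
{
    assert(q);
    if (q->tail == SIM_QUEUE_MAX)
        return false;
    q->reqs[q->tail++] = type;
    return true;
}

/* Process requests strictly in order until the queue empties or blocks. */
static void sim_run_queue(struct sim_queue *q)
{
    assert(q);
    while (!q->blocked && q->head < q->tail) {
        enum sim_req_type type = q->reqs[q->head++];

        if (type == SIM_REQ_SUSPEND)
            q->blocked = true;   /* flush/spindown would be issued here */
        else
            q->completed_io++;
    }
}

/* Unblock the queue and let the held requests proceed. */
static void sim_resume_queue(struct sim_queue *q)
{
    assert(q);
    q->blocked = false;
    sim_run_queue(q);
}
```

Because the suspend marker travels in the same FIFO as ordinary I/O,
everything queued before it completes first and everything queued after
it waits until resume, which gives the ordering without a separate lock.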

But yes, overall, there is work to do on drivers and I'm doing the ones
I hit on the platforms I use. I don't think the freezer is any kind of
remotely good solution, just a way to continue avoiding the problem.

Ben.




* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 13:08   ` Rafael J. Wysocki
  2007-07-03 15:09     ` Rafael J. Wysocki
@ 2007-07-03 21:16     ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03 21:16 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, linux-kernel, linux-pm

On Tue, 2007-07-03 at 15:08 +0200, Rafael J. Wysocki wrote:
> On Tuesday, 3 July 2007 07:51, Benjamin Herrenschmidt wrote:
> > On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> > > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > > a screaming nightmare - either the suspend fails because syslog (for 
> > > instance) can't be frozen, or the machine deadlocks for some other 
> > > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > > and just not use the freezer.
> > 
> > The main reason for deadlocks is because we do a sys_sync() after the
> > freeze, which we shouldn't do.
> 
> So why don't we remove the sys_sync() from freeze_processes() instead?

Or rather, move it to before freeze_processes() so that you get at least
some level of sync'ing just in case you never wake up. Not perfect
(things may still get dirty) but it will help the immediate issue.
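
As a userspace illustration of the two orderings (not the actual
kernel/power/main.c code): struct sim_system and every sim_* function are
hypothetical stand-ins, and the "would block" return value replaces what
would be a hang in the real kernel.

```c
#include <assert.h>
#include <stdbool.h>

enum { SIM_SYNC_OK = 0, SIM_SYNC_WOULD_BLOCK = -1 };

/* Hypothetical system state for the sketch. */
struct sim_system {
    bool fuse_daemon_frozen;  /* userspace half of a FUSE filesystem */
    int  dirty_pages;         /* data waiting to be written back */
};

/* Writeback through FUSE needs the userspace daemon to be runnable. */
static int sim_sync(struct sim_system *s)
{
    assert(s);
    if (s->dirty_pages > 0 && s->fuse_daemon_frozen)
        return SIM_SYNC_WOULD_BLOCK;  /* the real kernel would wait here */
    s->dirty_pages = 0;
    return SIM_SYNC_OK;
}

static void sim_freeze_processes(struct sim_system *s)
{
    assert(s);
    s->fuse_daemon_frozen = true;
}

/* Suggested ordering: sync while userspace still runs, then freeze. */
static int sim_prepare_sync_first(struct sim_system *s)
{
    int ret = sim_sync(s);

    if (ret)
        return ret;
    sim_freeze_processes(s);
    return 0;
}

/* Problematic ordering: freeze first, then try to sync. */
static int sim_prepare_freeze_first(struct sim_system *s)
{
    sim_freeze_processes(s);
    return sim_sync(s);
}
```

The sketch does not model tasks dirtying pages again between the sync and
the freeze, which is the "not perfect" part of the suggestion.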

Ben.




* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:12                   ` Matthew Garrett
@ 2007-07-03 21:16                     ` Alan Stern
  2007-07-03 21:20                       ` Matthew Garrett
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-03 21:16 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 3 Jul 2007, Matthew Garrett wrote:

> On Tue, Jul 03, 2007 at 05:10:08PM -0400, Alan Stern wrote:
> 
> > No, no -- you have it exactly backwards.  Removing the freezer turns 
> > STR into something _less_ like runtime suspend, because it adds the 
> > requirement that devices must not automatically be resumed when an I/O 
> > request arrives.
> 
> But that's fine - "Are we undergoing a systemwide suspend" is an easy 
> question to ask. Freezing processes instead means that most of those 
> paths will never be tested.

The question is easy to ask, but it's not so easy to figure out what
you should do if the answer is Yes.  Freezing processes instead means
that those "untested" paths -- in many, many drivers -- won't have to 
exist at all.

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:16                     ` Alan Stern
@ 2007-07-03 21:20                       ` Matthew Garrett
  2007-07-03 21:37                         ` Rafael J. Wysocki
  2007-07-03 22:21                         ` Alan Stern
  0 siblings, 2 replies; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03 21:20 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-kernel, linux-pm

On Tue, Jul 03, 2007 at 05:16:37PM -0400, Alan Stern wrote:
> On Tue, 3 Jul 2007, Matthew Garrett wrote:
> > But that's fine - "Are we undergoing a systemwide suspend" is an easy 
> > question to ask. Freezing processes instead means that most of those 
> > paths will never be tested.
> 
> The question is easy to ask, but it's not so easy to figure out what
> you should do if the answer is Yes.  Freezing processes instead means
> that those "untested" paths -- in many, many drivers -- won't have to 
> exist at all.

We're used to the idea of applications blocking when a resource they're 
using goes away - NFS has done it forever. 

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 19:32           ` Oliver Neukum
  2007-07-03 19:47             ` Miklos Szeredi
  2007-07-03 20:02             ` [linux-pm] " Alan Stern
@ 2007-07-03 21:20             ` Benjamin Herrenschmidt
  2007-07-03 21:48               ` Oliver Neukum
  2007-07-04 23:45             ` Pavel Machek
  3 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03 21:20 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Miklos Szeredi, rjw, mjg59, linux-kernel, linux-pm, pavel, nigel

On Tue, 2007-07-03 at 21:32 +0200, Oliver Neukum wrote:
> > I'm not sure why this can't be made atomic, but assuming, that it
> > can't, fuse should still not need to be implicated.  If it is, that's
> > an indication about something wrong in the suspend procedure.
> 
> Nope, something's wrong in fuse. You must be able to deal with sync
> until every task is frozen. 

Pipe dream

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 20:19               ` Miklos Szeredi
@ 2007-07-03 21:20                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 21:20 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: stern, oliver, mjg59, linux-kernel, pavel, linux-pm

On Tuesday, 3 July 2007 22:19, Miklos Szeredi wrote:
> > > Well, but you did remove sys_sync() from the freezer, which is
> > > and must be called in the hibernate path.
> > 
> > That's not really true.  We _want_ to call sys_sync() in both the 
> > hibernate and suspend paths (in case the batteries run down), to help 
> > avoid filesystem problems if something goes wrong with the resume.  But 
> > it isn't a hard requirement.
> > 
> > > > I'm not sure why this can't be made atomic, but assuming, that it
> > > > can't, fuse should still not need to be implicated.  If it is, that's
> > > > an indication about something wrong in the suspend procedure.
> > > 
> > > Nope, something's wrong in fuse. You must be able to deal with sync
> > > until every task is frozen.
> > 
> > That's ridiculous.  FUSE itself runs partially as a user task.  How can
> > you expect it to carry out a sync or anything else when it is frozen?
> > 
> > I suppose you could "deal" with it by having the kernel portion return
> > an error if the userspace part is frozen.  If the hibernate/suspend 
> > code bothered to check the return value, it would immediately abort 
> > the suspend.

Er, do_sync() doesn't return a result.

> I strongly believe, that we don't want to deal with it.  If we want to
> call sync(), do it while the system is fully operational.  It's a best
> effort thing anyway, and you can lose data in other ways if resume
> fails.

The requirement of syncing when the system (including the user space) is fully
operational is FUSE-specific.  Thus we'd rather like to sync FUSE filesystems
before freezing the user space and sync the other filesystems after freezing
the user space.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 19:27       ` Pavel Machek
@ 2007-07-03 21:25         ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 21:25 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel, linux-pm,
	Nigel Cunningham

On Tuesday, 3 July 2007 21:27, Pavel Machek wrote:
> Hi!
> 
> > > > The main reason for deadlocks is because we do a sys_sync() after the
> > > > freeze, which we shouldn't do.
> > > 
> > > So why don't we remove the sys_sync() from freeze_processes() instead?
> > 
> > The patch follows (untested).
> > 
> > Greetings,
> > Rafael
> > 
> > 
> > ---
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > We shouldn't sync filesystems from within the freezer, because it's not needed
> > for suspend to RAM and leads to problems with FUSE.
> 
> Actually... It is not _needed_ for suspend to disk, either. Snapshot is
> atomic, so it should be okay to suspend with filesystems dirty.
> 
> _But_, if anything goes wrong, we'd prefer to have at least
> filesystems synced. Battery running out during s2ram is not quite
> uncommon, so we perhaps should do sync somewhere there. (But we can do
> it before freezer just fine).

OK

So, should I add the sync() to suspend_prepare(), before freeze_processes()
(in analogy with hibernate())?

Greetings
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:14           ` Benjamin Herrenschmidt
@ 2007-07-03 21:32             ` Rafael J. Wysocki
  2007-07-03 21:35               ` Benjamin Herrenschmidt
  2007-07-04  3:29               ` [linux-pm] " Paul Mackerras
  2007-07-05  9:30             ` [PATCH] Remove process freezer from suspend to RAM pathway Pavel Machek
  1 sibling, 2 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 21:32 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Nigel Cunningham, Matthew Garrett, linux-kernel, linux-pm,
	Alan Stern, Pavel Machek

On Tuesday, 3 July 2007 23:14, Benjamin Herrenschmidt wrote:
> 
> > >  - For STR, don't do the freezer thing.
> > 
> > In the long run, I agree.
> > 
> > Still, can you please read this post from Alan Stern:
> > 
> > https://lists.linux-foundation.org/pipermail/linux-pm/2007-June/012847.html
> > 
> > ?  I don't think I'm able to repeat the arguments given in there in a
> > convincing way.
> 
> That's the same crackpot I've been hearing for the past 3 years or
> so ...
> 
> Both Paulus and I think the freezer is just a way to try to put your
> head in the sand and ignore the problem. It causes as many problems as
> it solves on its own, and is just not a solution that will be of any use
> once you start implementing dynamic PM schemes etc...
> 
> In many cases, having proper support for "live" suspend of devices is
> just a matter of having a couple of helpers in whatever subsystem those
> drivers hookup with. In the case of network, for example, it's mostly
> trivial (stop the queue). For block, it's not terribly hard either,
> though you want to have some ordering/atomicity between the blocking of
> the incoming request queue and the sending of things like spindown &
> flush commands to the disk. For old-style IDE, that was fairly easily
> solved by piping suspend/resume commands down the request queue itself
> and having the queue block/unblock itself after processing them. Some of
> that logic could maybe be moved to the block layer for all block drivers
> to benefit.
> 
> But yes, overall, there is work to do on drivers and I'm doing the ones
> I hit on the platforms I use. I don't think the freezer is any kind of
> remotely good solution, just a way to continue avoiding the problem.

Still, do you really think that we're ready to drop it _right_ _now_ (I'm
referring to suspend only) and if so then on what basis (except that you
don't like it, which falls short of being a technical argument)?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 20:59         ` Rafael J. Wysocki
@ 2007-07-03 21:35           ` Benjamin Herrenschmidt
  2007-07-03 22:33             ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03 21:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oliver Neukum, Matthew Garrett, linux-kernel, linux-pm,
	Pavel Machek, Nigel Cunningham


> I don't think that would matter.
> 
> Still, I can remove the sync from the suspend code path only, leaving it in
> the hibernation code path.  The patch will be bigger, but well.
> 
> Any objection to that?

Makes sense to sync before suspend tho, to limit the amount of dirty
non-written pages in case things go wrong.

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:32             ` Rafael J. Wysocki
@ 2007-07-03 21:35               ` Benjamin Herrenschmidt
  2007-07-03 22:43                 ` Rafael J. Wysocki
  2007-07-04  3:29               ` [linux-pm] " Paul Mackerras
  1 sibling, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03 21:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Matthew Garrett, linux-kernel, linux-pm,
	Alan Stern, Pavel Machek

On Tue, 2007-07-03 at 23:32 +0200, Rafael J. Wysocki wrote:
> 
> Still, do you really think that we're ready to drop it _right_ _now_
> (I'm referring to suspend only) and if so then on what basis (except that
> you don't like it, which falls short of being a technical argument)?

Works fine for me without it ;-)

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:37                         ` Rafael J. Wysocki
@ 2007-07-03 21:36                           ` Matthew Garrett
  2007-07-03 21:47                             ` Oliver Neukum
  2007-07-03 22:46                             ` Rafael J. Wysocki
  2007-07-04  3:38                           ` Paul Mackerras
  1 sibling, 2 replies; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03 21:36 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Alan Stern, linux-kernel, linux-pm

On Tue, Jul 03, 2007 at 11:37:51PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, 3 July 2007 23:20, Matthew Garrett wrote:
> > We're used to the idea of applications blocking when a resource they're 
> > using goes away - NFS has done it forever. 
> 
> Now, please tell me how many driver writers even thought that something
> might try to access their devices after .suspend() had been executed (or
> even while it was being executed)?

Every single driver that fails under those conditions is already broken, 
and has been forever. It's likely that they're broken under run-time 
suspend, too.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:20                       ` Matthew Garrett
@ 2007-07-03 21:37                         ` Rafael J. Wysocki
  2007-07-03 21:36                           ` Matthew Garrett
  2007-07-04  3:38                           ` Paul Mackerras
  2007-07-03 22:21                         ` Alan Stern
  1 sibling, 2 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 21:37 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Alan Stern, linux-kernel, linux-pm

On Tuesday, 3 July 2007 23:20, Matthew Garrett wrote:
> On Tue, Jul 03, 2007 at 05:16:37PM -0400, Alan Stern wrote:
> > On Tue, 3 Jul 2007, Matthew Garrett wrote:
> > > But that's fine - "Are we undergoing a systemwide suspend" is an easy 
> > > question to ask. Freezing processes instead means that most of those 
> > > paths will never be tested.
> > 
> > The question is easy to ask, but it's not so easy to figure out what
> > you should do if the answer is Yes.  Freezing processes instead means
> > that those "untested" paths -- in many, many drivers -- won't have to 
> > exist at all.
> 
> We're used to the idea of applications blocking when a resource they're 
> using goes away - NFS has done it forever. 

Now, please tell me how many driver writers even thought that something
might try to access their devices after .suspend() had been executed (or
even while it was being executed)?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:36                           ` Matthew Garrett
@ 2007-07-03 21:47                             ` Oliver Neukum
  2007-07-03 22:46                             ` Rafael J. Wysocki
  1 sibling, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 21:47 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Rafael J. Wysocki, Alan Stern, linux-kernel, linux-pm

On Tuesday, 3 July 2007, Matthew Garrett wrote:
> On Tue, Jul 03, 2007 at 11:37:51PM +0200, Rafael J. Wysocki wrote:
> > On Tuesday, 3 July 2007 23:20, Matthew Garrett wrote:
> > > We're used to the idea of applications blocking when a resource they're 
> > > using goes away - NFS has done it forever. 
> > 
> > Now, please tell me how many driver writers even thought that something
> > might try to access their devices after .suspend() had been executed (or
> > even while it was being executed)?
> 
> Every single driver that fails under those conditions is already broken, 
> and has been forever. It's likely that they're broken under run-time 
> suspend, too.

There _is_ no runtime suspend in working condition. The interface
to do that has been scheduled for removal.

	Oliver


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:20             ` Benjamin Herrenschmidt
@ 2007-07-03 21:48               ` Oliver Neukum
  2007-07-03 21:56                 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 21:48 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Miklos Szeredi, rjw, mjg59, linux-kernel, linux-pm, pavel, nigel

On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> On Tue, 2007-07-03 at 21:32 +0200, Oliver Neukum wrote:
> > > I'm not sure why this can't be made atomic, but assuming, that it
> > > can't, fuse should still not need to be implicated.  If it is, that's
> > > an indication about something wrong in the suspend procedure.
> > 
> > Nope, something's wrong in fuse. You must be able to deal with sync
> > until every task is frozen. 
> 
> Pipe dream

Then tell me how you want to avoid that condition.

	Oliver


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:48               ` Oliver Neukum
@ 2007-07-03 21:56                 ` Benjamin Herrenschmidt
  2007-07-03 22:04                   ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03 21:56 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Miklos Szeredi, rjw, mjg59, linux-kernel, linux-pm, pavel, nigel

On Tue, 2007-07-03 at 23:48 +0200, Oliver Neukum wrote:
> On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> > On Tue, 2007-07-03 at 21:32 +0200, Oliver Neukum wrote:
> > > > I'm not sure why this can't be made atomic, but assuming, that it
> > > > can't, fuse should still not need to be implicated.  If it is, that's
> > > > an indication about something wrong in the suspend procedure.
> > > 
> > > Nope, something's wrong in fuse. You must be able to deal with sync
> > > until every task is frozen. 
> > 
> > Pipe dream
> 
> Then tell me how you want to avoid that condition.

Don't freeze :-)

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:56                 ` Benjamin Herrenschmidt
@ 2007-07-03 22:04                   ` Oliver Neukum
  2007-07-03 23:08                     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-03 22:04 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Miklos Szeredi, rjw, mjg59, linux-kernel, linux-pm, pavel, nigel

On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> On Tue, 2007-07-03 at 23:48 +0200, Oliver Neukum wrote:
> > On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> > > On Tue, 2007-07-03 at 21:32 +0200, Oliver Neukum wrote:
> > > > > I'm not sure why this can't be made atomic, but assuming, that it
> > > > > can't, fuse should still not need to be implicated.  If it is, that's
> > > > > an indication about something wrong in the suspend procedure.
> > > > 
> > > > Nope, something's wrong in fuse. You must be able to deal with sync
> > > > until every task is frozen. 
> > > 
> > > Pipe dream
> > 
> > Then tell me how you want to avoid that condition.
> 
> Don't freeze :-)

Then you will have to deal with all syscalls unfrozen tasks can make.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:20                       ` Matthew Garrett
  2007-07-03 21:37                         ` Rafael J. Wysocki
@ 2007-07-03 22:21                         ` Alan Stern
  2007-07-03 22:42                           ` Matthew Garrett
  1 sibling, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-03 22:21 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 3 Jul 2007, Matthew Garrett wrote:

> On Tue, Jul 03, 2007 at 05:16:37PM -0400, Alan Stern wrote:
> > On Tue, 3 Jul 2007, Matthew Garrett wrote:
> > > But that's fine - "Are we undergoing a systemwide suspend" is an easy 
> > > question to ask. Freezing processes instead means that most of those 
> > > paths will never be tested.
> > 
> > The question is easy to ask, but it's not so easy to figure out what
> > you should do if the answer is Yes.  Freezing processes instead means
> > that those "untested" paths -- in many, many drivers -- won't have to 
> > exist at all.
> 
> We're used to the idea of applications blocking when a resource they're 
> using goes away - NFS has done it forever. 

You persist in evading my point.  I'm not worried about applications;  
I'm worried about drivers.

Let me put it explicitly: You're writing a driver.  You're working on
the read, write, or probe method.  You add code to check if a system
sleep is underway.  Suppose the answer is Yes -- what does your driver
do next?

Make your answer as detailed as you reasonably can.  And be careful to 
arrange things so that an ongoing I/O operation doesn't get messed up 
when your suspend method is called.

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:35           ` Benjamin Herrenschmidt
@ 2007-07-03 22:33             ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 22:33 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Oliver Neukum, Matthew Garrett, linux-kernel, linux-pm,
	Pavel Machek, Nigel Cunningham

On Tuesday, 3 July 2007 23:35, Benjamin Herrenschmidt wrote:
> 
> > I don't think that would matter.
> > 
> > Still, I can remove the sync from the suspend code path only, leaving it in
> > the hibernation code path.  The patch will be bigger, but well.
> > 
> > Any objection to that?
> 
> Makes sense to sync before suspend tho, to limit the amount of dirty
> non-written pages in case things go wrong.

OK, below is the updated patch.

Greetings,
Rafael


---
From: Rafael J. Wysocki <rjw@sisk.pl>

The syncing of filesystems from within the freezer is not needed for suspend to
RAM and leads to problems with FUSE.  Change freeze_processes() so that it
doesn't execute sys_sync() and introduce the "syncing" version of it to be
called from the hibernation code paths.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 include/linux/freezer.h |   14 ++++++++++++--
 kernel/power/disk.c     |    2 +-
 kernel/power/main.c     |    6 ++++++
 kernel/power/process.c  |    8 +++++---
 kernel/power/user.c     |    2 +-
 5 files changed, 25 insertions(+), 7 deletions(-)

Index: linux-2.6.22-rc7/include/linux/freezer.h
===================================================================
--- linux-2.6.22-rc7.orig/include/linux/freezer.h	2007-07-04 00:21:26.000000000 +0200
+++ linux-2.6.22-rc7/include/linux/freezer.h	2007-07-04 00:22:37.000000000 +0200
@@ -62,7 +62,7 @@ static inline int thaw_process(struct ta
 }
 
 extern void refrigerator(void);
-extern int freeze_processes(void);
+extern int __freeze_processes(int sync_filesystems);
 extern void thaw_processes(void);
 
 static inline int try_to_freeze(void)
@@ -134,7 +134,7 @@ static inline void clear_freeze_flag(str
 static inline int thaw_process(struct task_struct *p) { return 1; }
 
 static inline void refrigerator(void) {}
-static inline int freeze_processes(void) { BUG(); return 0; }
+static inline int __freeze_processes(int s) { BUG(); return 0; }
 static inline void thaw_processes(void) {}
 
 static inline int try_to_freeze(void) { return 0; }
@@ -145,4 +145,14 @@ static inline int freezer_should_skip(st
 static inline void set_freezable(void) {}
 #endif
 
+static inline int freeze_processes(void)
+{
+	return __freeze_processes(0);
+}
+
+static inline int freeze_processes_with_sync(void)
+{
+	return __freeze_processes(1);
+}
+
 #endif	/* FREEZER_H_INCLUDED */
Index: linux-2.6.22-rc7/kernel/power/disk.c
===================================================================
--- linux-2.6.22-rc7.orig/kernel/power/disk.c	2007-07-04 00:21:26.000000000 +0200
+++ linux-2.6.22-rc7/kernel/power/disk.c	2007-07-04 00:21:44.000000000 +0200
@@ -281,7 +281,7 @@ static int prepare_processes(void)
 	int error = 0;
 
 	pm_prepare_console();
-	if (freeze_processes()) {
+	if (freeze_processes_with_sync()) {
 		error = -EBUSY;
 		unprepare_processes();
 	}
Index: linux-2.6.22-rc7/kernel/power/main.c
===================================================================
--- linux-2.6.22-rc7.orig/kernel/power/main.c	2007-07-04 00:21:26.000000000 +0200
+++ linux-2.6.22-rc7/kernel/power/main.c	2007-07-04 00:23:40.000000000 +0200
@@ -20,6 +20,7 @@
 #include <linux/resume-trace.h>
 #include <linux/freezer.h>
 #include <linux/vmstat.h>
+#include <linux/syscalls.h>
 
 #include "power.h"
 
@@ -231,6 +232,11 @@ static int enter_state(suspend_state_t s
 
 	if (!valid_state(state))
 		return -ENODEV;
+
+	printk("Syncing filesystems ... ");
+	sys_sync();
+	printk("done.\n");
+
 	if (!mutex_trylock(&pm_mutex))
 		return -EBUSY;
 
Index: linux-2.6.22-rc7/kernel/power/process.c
===================================================================
--- linux-2.6.22-rc7.orig/kernel/power/process.c	2007-07-04 00:21:26.000000000 +0200
+++ linux-2.6.22-rc7/kernel/power/process.c	2007-07-04 00:21:44.000000000 +0200
@@ -179,9 +179,9 @@ static int try_to_freeze_tasks(int freez
 }
 
 /**
- *	freeze_processes - tell processes to enter the refrigerator
+ *	__freeze_processes - tell processes to enter the refrigerator
  */
-int freeze_processes(void)
+int __freeze_processes(int sync_filesystems)
 {
 	int error;
 
@@ -190,7 +190,9 @@ int freeze_processes(void)
 	if (error)
 		return error;
 
-	sys_sync();
+	if (sync_filesystems)
+		sys_sync();
+
 	error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
 	if (error)
 		return error;
Index: linux-2.6.22-rc7/kernel/power/user.c
===================================================================
--- linux-2.6.22-rc7.orig/kernel/power/user.c	2007-07-04 00:21:26.000000000 +0200
+++ linux-2.6.22-rc7/kernel/power/user.c	2007-07-04 00:21:44.000000000 +0200
@@ -153,7 +153,7 @@ static int snapshot_ioctl(struct inode *
 		mutex_lock(&pm_mutex);
 		error = pm_notifier_call_chain(PM_HIBERNATION_PREPARE);
 		if (!error) {
-			error = freeze_processes();
+			error = freeze_processes_with_sync();
 			if (error)
 				thaw_processes();
 		}

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 22:21                         ` Alan Stern
@ 2007-07-03 22:42                           ` Matthew Garrett
  2007-07-04 14:38                             ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-03 22:42 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-kernel, linux-pm

On Tue, Jul 03, 2007 at 06:21:42PM -0400, Alan Stern wrote:
> On Tue, 3 Jul 2007, Matthew Garrett wrote:
> > We're used to the idea of applications blocking when a resource they're 
> > using goes away - NFS has done it forever. 
> 
> You persist in evading my point.  I'm not worried about applications;  
> I'm worried about drivers.
> 
> Let me put it explicitly: You're writing a driver.  You're working on
> the read, write, or probe method.  You add code to check if a system
> sleep is underway.  Suppose the answer is Yes -- what does your driver
> do next?

Leave the process blocked and defer any i/o until after resume. Why does 
it need to be any more complicated than that?
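
A rough userspace sketch of that idea, using pthreads instead of kernel primitives. The struct and function names are hypothetical. Callers entering the I/O path sleep while a suspend is in progress, and the suspend side waits for I/O that is already in flight, so an ongoing operation is not cut off.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-device gate; nothing here is a kernel API. */
struct dev_gate {
	pthread_mutex_t lock;
	pthread_cond_t resumed;	/* signalled when suspended becomes false */
	pthread_cond_t idle;	/* signalled when in_flight drops to 0 */
	bool suspended;
	int in_flight;
};

#define DEV_GATE_INIT { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, \
			PTHREAD_COND_INITIALIZER, false, 0 }

/* Called at the top of read/write: block until the device is usable. */
void gate_enter(struct dev_gate *g)
{
	pthread_mutex_lock(&g->lock);
	while (g->suspended)
		pthread_cond_wait(&g->resumed, &g->lock);
	g->in_flight++;
	pthread_mutex_unlock(&g->lock);
}

/* Called when the I/O operation has finished. */
void gate_exit(struct dev_gate *g)
{
	pthread_mutex_lock(&g->lock);
	assert(g->in_flight > 0);
	if (--g->in_flight == 0)
		pthread_cond_broadcast(&g->idle);
	pthread_mutex_unlock(&g->lock);
}

/* Suspend: stop new I/O from starting, then wait for ongoing I/O. */
void gate_suspend(struct dev_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->suspended = true;
	while (g->in_flight > 0)
		pthread_cond_wait(&g->idle, &g->lock);
	pthread_mutex_unlock(&g->lock);
}

/* Resume: wake every caller that was left blocked in gate_enter(). */
void gate_resume(struct dev_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->suspended = false;
	pthread_cond_broadcast(&g->resumed);
	pthread_mutex_unlock(&g->lock);
}
```

This only covers paths that are allowed to sleep. Interrupt handlers and anything the suspend path itself depends on would need separate treatment.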

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:35               ` Benjamin Herrenschmidt
@ 2007-07-03 22:43                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 22:43 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Nigel Cunningham, Matthew Garrett, linux-kernel, linux-pm,
	Alan Stern, Pavel Machek

On Tuesday, 3 July 2007 23:35, Benjamin Herrenschmidt wrote:
> On Tue, 2007-07-03 at 23:32 +0200, Rafael J. Wysocki wrote:
> > 
> > Still, do you really think that we're ready to drop it _right_ _now_
> > (I'm referring to suspend only) and if so then on what basis (except that
> > you don't like it, which falls short of being a technical argument)?
> 
> Works fine for me without it ;-)

Yeah, that makes sense. ;-)

Still, someone needs to take care of bug reports from unlucky people and I
wouldn't like to increase the number of these by doing such a high-level change
without any preparations ...

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:36                           ` Matthew Garrett
  2007-07-03 21:47                             ` Oliver Neukum
@ 2007-07-03 22:46                             ` Rafael J. Wysocki
  1 sibling, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-03 22:46 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Alan Stern, linux-kernel, linux-pm

On Tuesday, 3 July 2007 23:36, Matthew Garrett wrote:
> On Tue, Jul 03, 2007 at 11:37:51PM +0200, Rafael J. Wysocki wrote:
> > On Tuesday, 3 July 2007 23:20, Matthew Garrett wrote:
> > > We're used to the idea of applications blocking when a resource they're 
> > > using goes away - NFS has done it forever. 
> > 
> > Now, please tell me how many driver writers even thought that something
> > might try to access their devices after .suspend() had been executed (or
> > even while it was being executed)?
> 
> Every single driver that fails under those conditions is already broken, 
> and has been forever. It's likely that they're broken under run-time 
> suspend, too.

Well, I won't argue with that, but do you actually know how many drivers are
broken this way?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 22:04                   ` Oliver Neukum
@ 2007-07-03 23:08                     ` Benjamin Herrenschmidt
  2007-07-04  8:10                       ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-03 23:08 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Miklos Szeredi, rjw, mjg59, linux-kernel, linux-pm, pavel, nigel

On Wed, 2007-07-04 at 00:04 +0200, Oliver Neukum wrote:
> On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> > On Tue, 2007-07-03 at 23:48 +0200, Oliver Neukum wrote:
> > > On Tuesday, 3 July 2007, Benjamin Herrenschmidt wrote:
> > > > On Tue, 2007-07-03 at 21:32 +0200, Oliver Neukum wrote:
> > > > > > I'm not sure why this can't be made atomic, but assuming, that it
> > > > > > can't, fuse should still not need to be implicated.  If it is, that's
> > > > > > an indication about something wrong in the suspend procedure.
> > > > > 
> > > > > Nope, something's wrong in fuse. You must be able to deal with sync
> > > > > until every task is frozen. 
> > > > 
> > > > Pipe dream
> > > 
> > > Then tell me how you want to avoid that condition.
> > 
> > Don't freeze :-)
> 
> Then you will have to deal with all syscalls unfrozen tasks can make.

Yup, and the majority of them is totally harmless. Looks like people
around here have a problem with the idea of writing robust drivers ...

Ben.


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:42             ` Oliver Neukum
@ 2007-07-03 23:11               ` Paul Mackerras
  2007-07-04  8:11                 ` Oliver Neukum
  2007-07-04 14:44                 ` Alan Stern
  0 siblings, 2 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-03 23:11 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel, Matthew Garrett

Oliver Neukum writes:

> USB devices certainly have suspend methods.

Indeed, and the USB framework has code to know when the host
controller is suspended and avoid trying to send out urbs in that
case.  Or at least it did last time I looked at it in any detail; it's
been "just working" - including suspending and resuming, without the
freezer - for quite a while now.

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:55                 ` Oliver Neukum
@ 2007-07-03 23:40                   ` Paul Mackerras
  2007-07-04  7:02                     ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-03 23:40 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Benjamin Herrenschmidt, mjg59, linux-pm, linux-kernel, Miklos Szeredi

Oliver Neukum writes:

> That's why we have the problem of freezing the kernel threads or not.

That problem is a symptom of the deeper conceptual problem, as is the
problem with FUSE.

> You want to have all that pain for fuse?

I'd certainly rather get the drivers right, and maybe have an
occasional deadlock if I miss something, than have a GUARANTEED
deadlock every time I suspend with a FUSE filesystem mounted (which is
pretty much every time, since I use encfs regularly).

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:32             ` Rafael J. Wysocki
  2007-07-03 21:35               ` Benjamin Herrenschmidt
@ 2007-07-04  3:29               ` Paul Mackerras
  2007-07-04 10:33                 ` Rafael J. Wysocki
  2007-07-04 22:19                 ` The big suspend mess Adrian Bunk
  1 sibling, 2 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04  3:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm

Rafael J. Wysocki writes:

> Still, do you really think that we're ready to drop it _right_ _now_ (I'm
> referring to suspend only) and if so then on what basis (except that you
> don't like it, which falls short of being a technical argument)?

The basis is that it (the freezer) causes more deadlocks and other
problems than it avoids, so it's a net win to remove it.

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:37                         ` Rafael J. Wysocki
  2007-07-03 21:36                           ` Matthew Garrett
@ 2007-07-04  3:38                           ` Paul Mackerras
  2007-07-04 10:42                             ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04  3:38 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, linux-pm, linux-kernel

Rafael J. Wysocki writes:

> Now, please tell me how many driver writers even thought that something
> might try to access their devices after .suspend() had been executed (or
> even while it was being executed)?

Well, I believe that the USB framework copes with this, except
possibly for some corner cases like the example that Alan Stern
posted.  The fact that powerbooks suspend and resume without the
freezer implies that the IDE framework, the console code and the
framebuffer code cope correctly (though possibly not all chipset
drivers).

So I think that a lot of the frameworks already get it right.  Of
course the quality of the low-level chipset drivers has always been
pretty variable. :)

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 14:50             ` Alan Stern
  2007-07-03 14:59               ` Johannes Berg
@ 2007-07-04  3:55               ` Paul Mackerras
  2007-07-04 15:12                 ` Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04  3:55 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Berg, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett

Alan Stern writes:

> USB already implements runtime PM.  If a device is suspended at runtime
> and a task tries to access it, the device is automatically resumed.  
> No problem there.
> 
> The problem comes when the system is doing a STR.  Right now the code
> doesn't keep track of the difference between a runtime suspend and a
> system suspend -- once the device is suspended, it's suspended, period.  

Whether or not to resume a suspended device when an I/O request comes
in is a policy decision, and there could be cases where the user wants
I/O requests to be blocked, or to fail, or to be dropped while the
device is suspended, even for runtime power management.  For example,
a sound card could be suspended due to a low-battery condition, and in
that case you would want the driver to just drop any data that
userspace tries to write to the soundcard.

> Yes, the code could be changed to keep track of the reason for a device
> suspend.  But that just raises the old problem of what to do when
> there's an I/O request for a suspended device during STR.

Is this actually a real problem?  I would think the policy would be
"block" for block devices (pun not intended :), "drop" for network
devices, etc.
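
The per-class policy described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: the policy names, struct model_device, policy_submit() and policy_resume() are hypothetical and do not correspond to any existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy names; not taken from any kernel header. */
enum suspended_io_policy {
    IO_POLICY_BLOCK,  /* hold the request until the device resumes */
    IO_POLICY_DROP,   /* silently discard the request */
    IO_POLICY_FAIL    /* report an error to the caller */
};

enum io_result { IO_SENT, IO_QUEUED, IO_DROPPED, IO_FAILED };

#define MAX_HELD 8

struct model_device {
    int suspended;
    enum suspended_io_policy policy;
    int held[MAX_HELD];  /* requests held back under IO_POLICY_BLOCK */
    size_t n_held;
    size_t n_sent;       /* requests that actually reached the hardware */
};

/* Handle one request according to the device's suspended-I/O policy. */
enum io_result policy_submit(struct model_device *dev, int req)
{
    assert(dev != NULL);
    if (!dev->suspended) {
        dev->n_sent++;
        return IO_SENT;
    }
    switch (dev->policy) {
    case IO_POLICY_BLOCK:
        if (dev->n_held == MAX_HELD)
            return IO_FAILED;  /* hold queue is full */
        dev->held[dev->n_held++] = req;
        return IO_QUEUED;
    case IO_POLICY_DROP:
        return IO_DROPPED;
    case IO_POLICY_FAIL:
    default:
        return IO_FAILED;
    }
}

/* Wake the device and send everything that was held; returns the count. */
size_t policy_resume(struct model_device *dev)
{
    size_t flushed = dev->n_held;

    dev->suspended = 0;
    dev->n_sent += flushed;
    dev->n_held = 0;
    return flushed;
}
```

In this model a block device would be set up with IO_POLICY_BLOCK and a network device with IO_POLICY_DROP, so the choice lives in one place per device class rather than in every caller.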

> Consider a particularly troublesome case: During STR, a non-frozen task
> writes to /sys/bus/BBB/drivers/DDD/bind.  The sysfs core grabs the
> device semaphore and calls the driver's probe routine.  If the driver
> isn't PM-aware it simply tries to initialize the device and fails
> because the device is already suspended.  That's no good; it isn't
> transparent.

How did the device get suspended if it didn't have a driver?  If it
did have a driver, why didn't the bind attempt fail?

> So assume the driver is PM-aware.  It tries to resume the device, which
> fails because STR is underway.  Now what can it do?  There's only one 
> possibility: It must block until the resume call can succeed.  But when 
> is that?
> 
> It has to be before the PM core tries to resume the device, because the 
> core will try to acquire the device semaphore and will block waiting 
> for the probe call to complete.  But it has to be after the PM core 
> resumes the device's parent, because obviously the device can't resume 
> until its parent is awake.

Suppose the device-model core code simply blocked all bind and unbind
requests while suspend is under way, until resume is finished.
Wouldn't that solve the problem?

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 15:58             ` Alan Stern
@ 2007-07-04  4:02               ` Paul Mackerras
  2007-07-04 15:04                 ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04  4:02 UTC (permalink / raw)
  To: Alan Stern; +Cc: Oliver Neukum, Matthew Garrett, linux-pm, linux-kernel

Alan Stern writes:

> > Most drivers suspended their hardware in the second call.  If they are
> > in the middle of a conversation with their device that *has* to be
> > completed, they can do that by polling.
> 
> Ugh.  That will cause problems when you try to integrate runtime 
> suspend.  In fact this whole approach is unsuitable for runtime PM and 
> it obscures the similarities between runtime PM and STR.

Yes there are similarities, but it would be a big mistake to say that
a requirement for STR is that all drivers do runtime PM.

If a driver does runtime PM, that's great, and it is useful for
implementing STR.  However, there is a class of devices for which
runtime PM is not possible or not useful, but which can suspend/resume
just fine as part of suspending/resuming the complete system, and for
which all that is needed is some small amount of simple hardware
poking just before the system as a whole is put into suspend.  For
those a late-suspend call with interrupts off is the simplest and best
way to go.

Think of a serial port on a motherboard for instance, where the only
power control is the overall power control for the system.  All that
is needed is to poll for the transmitter being empty (with timeout, of
course) in the late-suspend call (and possibly also turn off output
drivers), and to reinitialize some registers in an
early-resume call.
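
A minimal sketch of that serial-port case, written as a userspace model rather than real driver code. Everything here is an assumption made for illustration: struct model_uart, the MODEL_LSR_TX_EMPTY bit, and the rule that the simulated FIFO drains one byte per status read are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_LSR_TX_EMPTY 0x40u  /* hypothetical "transmitter empty" bit */

/* Stand-in for a memory-mapped UART. */
struct model_uart {
    unsigned tx_fifo_level;  /* bytes still waiting to go out */
    uint8_t ctrl;            /* live control register */
    uint8_t saved_ctrl;      /* copy kept across suspend */
};

/* Simulated status read: the FIFO drains one byte per read. */
static uint8_t model_read_status(struct model_uart *u)
{
    if (u->tx_fifo_level > 0)
        u->tx_fifo_level--;
    return u->tx_fifo_level == 0 ? MODEL_LSR_TX_EMPTY : 0;
}

/*
 * Late-suspend step: poll until the transmitter is empty, giving up after
 * max_polls reads. Returns 0 on success, -1 on timeout (state untouched).
 */
int model_late_suspend(struct model_uart *u, unsigned max_polls)
{
    assert(u != NULL);
    for (unsigned i = 0; i < max_polls; i++) {
        if (model_read_status(u) & MODEL_LSR_TX_EMPTY) {
            u->saved_ctrl = u->ctrl;
            u->ctrl = 0;  /* turn off the output drivers */
            return 0;
        }
    }
    return -1;
}

/* Early-resume step: reinitialize the registers saved at suspend time. */
void model_early_resume(struct model_uart *u)
{
    assert(u != NULL);
    u->ctrl = u->saved_ctrl;
}
```

The bounded loop is the point of the sketch: with interrupts off there is nothing else to wake the caller, so the poll needs a hard limit and a failure path.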

The main attraction of the late-suspend call is that it really does,
reliably, guarantee that the driver's I/O request methods won't get
called between the late-suspend call and the early-resume call.

Paul.


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 20:21                 ` Alan Stern
@ 2007-07-04  4:59                   ` Paul Mackerras
  2007-07-04 14:57                     ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04  4:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Berg, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett,
	Benjamin Herrenschmidt

Alan Stern writes:

> I disagree.  The problem isn't the kernel calling userspace; it's
> userspace trying to do I/O at a time when everything is supposed to be
> quiescing.  Detecting that and blocking it in drivers is hard and
> error-prone; preventing it by freezing userspace is easy and cheap.

And unreliable, and prone to deadlocks, and invasive - requiring
changes to kernel threads that have nothing to do with drivers or
suspend/resume.

> The reasons why the PPC people dislike the whole idea aren't clear to
> me. 

Our experience is that it isn't necessary.  It's extra code that in
practice causes deadlocks and added maintenance burden for no
discernable benefit.

> If it were necessary to have some user task running in order to
> carry out the STR then their objection would make sense -- obviously
> that task couldn't do its job if it were frozen.  But it isn't
> necessary, or at least it should not be.

The freezer doesn't achieve its stated goal of preventing drivers from
getting I/O requests after suspend, since kernel threads can (and do)
initiate I/O.  So then we say that some kernel threads need to be
frozen and others don't, but making that decision is difficult and
error-prone.

Besides, any kernel thread that does I/O is potentially doing that in
order to complete some other I/O request.  So we want to freeze it in
order to prevent new I/O requests from being initiated, but we don't
want to freeze it so that existing I/O requests can be completed.
Thus we have a fundamental conflict in the notion of the freezer.

In fact I believe that making a distinction between user and kernel
threads is wrong and likely to lead to problems, since userspace can
be involved in doing I/O (e.g. FUSE or the user-space driver
framework).  So the argument of the previous paragraph also applies to
some userspace processes.

> Userspace will be effectively "frozen" while the system as a whole is 
> suspended.  So what's wrong with freezing it a little early?  Despite 
> Ben's comments, it seems to me that the freezer doesn't hide problems 
> -- it prevents them.

No, it appears to prevent them, but doesn't in fact.

I remain convinced that the right approach is to fix the drivers to do
one of two things; either do something in the suspend call to block
further requests to the device, or use a late-suspend call to put
their device into a low-power state.  Of course, correctly-written
frameworks can do a lot to help the chipset drivers here.

> Now people may claim that the freezer implementation itself is buggy.  
> I wouldn't dispute it.  But the bugs should be fixable; nobody has 
> pointed out anything fundamentally wrong with the idea AFAICT.

The fundamental problem is the kernel threads and user processes that
we need running to complete existing I/O requests, but which may
initiate new I/O requests in doing so.

The right way to solve the problem is to do the request blocking in
the drivers.

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 23:40                   ` Paul Mackerras
@ 2007-07-04  7:02                     ` Miklos Szeredi
  2007-07-04  8:02                       ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-04  7:02 UTC (permalink / raw)
  To: paulus; +Cc: oliver, benh, mjg59, linux-pm, linux-kernel, miklos

> > That's why we have the problem of freezing the kernel threads or not.
> 
> That problem is a symptom of the deeper conceptual problem, as is the
> problem with FUSE.
> 
> > You want to have all that pain for fuse?
> 
> I'd certainly rather get the drivers right, and maybe have an
> occasional deadlock if I miss something, than have a GUARANTEED
> deadlock every time I suspend with a FUSE filesystem mounted (which is
> pretty much every time, since I use encfs regularly).

That's weird, I never had a suspend problem due to a fuse mount,
though I have them all the time.  And I suspect, that even the sync()
thing that suspend does is not the real cause, because sync() actually
does nothing in fuse filesystems.

So there's something else going on, which obviously has to do with
freezing user processes, but it's not clear what.

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  7:02                     ` Miklos Szeredi
@ 2007-07-04  8:02                       ` Paul Mackerras
  2007-07-04  8:26                         ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04  8:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: oliver, benh, mjg59, linux-pm, linux-kernel

Miklos Szeredi writes:

> That's weird, I never had a suspend problem due to a fuse mount,
> though I have them all the time.  And I suspect, that even the sync()

Well, I don't either, because we don't freeze processes on
powerbooks.  But I have heard that other people have problems with
suspending with a fuse filesystem mounted.  Maybe the difference is
whether or not the filesystem is writable?

> thing that suspend does is not the real cause, because sync() actually
> does nothing in fuse filesystems.

It's not the filesystem sync method, as I understand it, it's that if
there are dirty pages in the page cache for files on the fuse
filesystem, the system will initiate a write-out on them and wait for
it to finish.  But if the fuse userspace is frozen, the write-out will
never complete.

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 23:08                     ` Benjamin Herrenschmidt
@ 2007-07-04  8:10                       ` Oliver Neukum
  0 siblings, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-04  8:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Miklos Szeredi, rjw, mjg59, linux-kernel, linux-pm, pavel, nigel

On Wednesday, 4 July 2007, Benjamin Herrenschmidt wrote:
> > > > > > Nope, something's wrong in fuse. You must be able to deal with sync
> > > > > > until every task is frozen. 
> > > > > 
> > > > > Pipe dream
> > > > 
> > > > Then tell me how you want to avoid that condition.
> > > 
> > > Don't freeze :-)
> > 
> > Then you will have to deal with all syscalls unfrozen tasks can make.
> 
> Yup, and the majority of them are totally harmless. Looks like people
> around here have a problem with the idea of writing robust drivers ...

The majority is meaningless here. The subsystem either works or it does not.
Reliably failing to work is better than working most of the time.

What I am having a problem with is the rest of the system changing its
behavior and expecting the drivers to cope with that out of the blue.

	Oliver

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 23:11               ` Paul Mackerras
@ 2007-07-04  8:11                 ` Oliver Neukum
  2007-07-04  8:27                   ` Paul Mackerras
  2007-07-04 14:44                 ` Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-04  8:11 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linux-pm, linux-kernel, Matthew Garrett

On Wednesday, 4 July 2007, Paul Mackerras wrote:
> Oliver Neukum writes:
> 
> > USB devices certainly have suspend methods.
> 
> Indeed, and the USB framework has code to know when the host
> controller is suspended and avoid trying to send out urbs in that
> case.  Or at least it did last time I looked at it in any detail; it's
> been "just working" - including suspending and resuming, without the
> freezer - for quite a while now.

And what happens to that IO? "I suspended and lost data" is not
acceptable behavior.

	Oliver



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  8:02                       ` Paul Mackerras
@ 2007-07-04  8:26                         ` Miklos Szeredi
  2007-07-04 10:26                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-04  8:26 UTC (permalink / raw)
  To: paulus; +Cc: miklos, oliver, benh, mjg59, linux-pm, linux-kernel

> > That's weird, I never had a suspend problem due to a fuse mount,
> > though I have them all the time.  And I suspect, that even the sync()
> 
> Well, I don't either, because we don't freeze processes on
> powerbooks.  But I have heard that other people have problems with
> suspending with a fuse filesystem mounted.  Maybe the difference is
> whether or not the filesystem is writable?
> 
> > thing that suspend does is not the real cause, because sync() actually
> > does nothing in fuse filesystems.
> 
> It's not the filesystem sync method, as I understand it, it's that if
> there are dirty pages in the page cache for files on the fuse
> filesystem,

Currently fuse doesn't produce dirty pages.  Normal writes are done
synchronously, and writable mmap is not supported.  So sync() should
really be a no-op for fuse.

> the system will initiate a write-out on them and wait for it to
> finish.  But if the fuse userspace is frozen, the write-out will
> never complete.

Maybe there is some other fs operation being done, possibly not
directly, but by waiting for a kernel thread, that does that.

It would be nice, if someone who can reproduce the deadlock could
debug it.  Does sysrq still work during suspend?

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  8:11                 ` Oliver Neukum
@ 2007-07-04  8:27                   ` Paul Mackerras
  2007-07-04  8:39                     ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04  8:27 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel, Matthew Garrett

Oliver Neukum writes:

> > Indeed, and the USB framework has code to know when the host
> > controller is suspended and avoid trying to send out urbs in that
> > case.  Or at least it did last time I looked at it in any detail; it's
> > been "just working" - including suspending and resuming, without the
> > freezer - for quite a while now.
> 
> And what happens to that IO? "I suspended and lost data" is not
> acceptable behavior.

It's not lost, it's sitting in RAM, and will be sent out when you
resume.

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  8:27                   ` Paul Mackerras
@ 2007-07-04  8:39                     ` Oliver Neukum
  2007-07-04  9:21                       ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-04  8:39 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linux-pm, linux-kernel, Matthew Garrett

On Wednesday, 4 July 2007, Paul Mackerras wrote:
> Oliver Neukum writes:
> 
> > > Indeed, and the USB framework has code to know when the host
> > > controller is suspended and avoid trying to send out urbs in that
> > > case.  Or at least it did last time I looked at it in any detail; it's
> > > been "just working" - including suspending and resuming, without the
> > > freezer - for quite a while now.
> > 
> > And what happens to that IO? "I suspended and lost data" is not
> > acceptable behavior.
> 
> It's not lost, it's sitting in RAM, and will be sent out when you
> resume.

Unfortunately this is not the case. The URB will error out.

	Oliver


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  8:39                     ` Oliver Neukum
@ 2007-07-04  9:21                       ` Paul Mackerras
  2007-07-04 10:08                         ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04  9:21 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel, Matthew Garrett

Oliver Neukum writes:

> > It's not lost, it's sitting in RAM, and will be sent out when you
> > resume.
> 
> Unfortunately this is not the case. The URB will error out.

So the higher-level driver needs to do the sensible thing, i.e.,
resubmit the URB after resume.  It's not rocket science.  The data is
not lost, it's sitting in RAM, and the higher-level driver will send
it out when you resume.  If not, then we fix the higher-level driver.

Of course with USB there is the interesting question of whether the
device is still there when we resume.  But if it isn't, the situation
is no different to the user asynchronously unplugging the device
during operation, and if we lose data in that situation, we can only
blame the user. :)
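
The resubmit-after-resume idea, including the unplugged-device case, can be sketched roughly as follows. This is a userspace model under loudly stated assumptions, not the USB core's API: struct model_usb_dev, struct model_xfer and the xfer_*() helpers are hypothetical, and the "renew settings" step is reduced to setting one flag.

```c
#include <assert.h>
#include <stddef.h>

enum xfer_state { XFER_IDLE, XFER_PENDING_RESUME, XFER_DONE, XFER_LOST };

struct model_usb_dev {
    int present;     /* still attached when the system wakes up? */
    int configured;  /* device settings currently valid? */
    int suspended;
};

struct model_xfer {
    const void *buf;  /* the data stays in RAM while the system sleeps */
    size_t len;
    enum xfer_state state;
};

void xfer_suspend(struct model_usb_dev *d)
{
    d->suspended = 1;
    d->configured = 0;  /* assumption: settings do not survive suspend */
}

/* Completes at once on an awake device; otherwise errors out but keeps buf. */
int xfer_submit(struct model_usb_dev *d, struct model_xfer *x)
{
    assert(x->buf != NULL);
    if (d->suspended || !d->configured) {
        x->state = XFER_PENDING_RESUME;
        return -1;
    }
    x->state = XFER_DONE;
    return 0;
}

/* Re-check the device, renew its settings, then retry the held transfer. */
int xfer_resume(struct model_usb_dev *d, struct model_xfer *x)
{
    d->suspended = 0;
    if (!d->present) {
        /* Same outcome as the user unplugging the device mid-operation. */
        if (x->state == XFER_PENDING_RESUME)
            x->state = XFER_LOST;
        return -1;
    }
    d->configured = 1;  /* stand-in for device-specific re-initialization */
    if (x->state == XFER_PENDING_RESUME)
        return xfer_submit(d, x);
    return 0;
}
```

The re-initialization step is where device-specific protocol knowledge would have to go, which is why a real driver could not resubmit blindly.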

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  9:21                       ` Paul Mackerras
@ 2007-07-04 10:08                         ` Oliver Neukum
  2007-07-04 10:46                           ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-04 10:08 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linux-pm, linux-kernel, Matthew Garrett

On Wednesday, 4 July 2007, Paul Mackerras wrote:
> Oliver Neukum writes:
> 
> > > It's not lost, it's sitting in RAM, and will be sent out when you
> > > resume.
> > 
> > Unfortunately this is not the case. The URB will error out.
> 
> So the higher-level driver needs to do the sensible thing, i.e.,
> resubmit the URB after resume.  It's not rocket science.  The data is
> not lost, it's sitting in RAM, and the higher-level driver will send
> it out when you resume.  If not, then we fix the higher-level driver.

You cannot simply restart the URB without thinking.
The device after resumption may or may not be in the stage
you left it. It needs to be rechecked and some settings must be
renewed. You cannot simply throw a URB from an arbitrary
stage of the protocol at it.
Suspension of devices can only happen at some points
in the protocol.

	Oliver


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  8:26                         ` Miklos Szeredi
@ 2007-07-04 10:26                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-04 10:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: paulus, oliver, benh, mjg59, linux-pm, linux-kernel

On Wednesday, 4 July 2007 10:26, Miklos Szeredi wrote:
> > > That's weird, I never had a suspend problem due to a fuse mount,
> > > though I have them all the time.  And I suspect, that even the sync()
> > 
> > Well, I don't either, because we don't freeze processes on
> > powerbooks.  But I have heard that other people have problems with
> > suspending with a fuse filesystem mounted.  Maybe the difference is
> > whether or not the filesystem is writable?
> > 
> > > thing that suspend does is not the real cause, because sync() actually
> > > does nothing in fuse filesystems.
> > 
> > It's not the filesystem sync method, as I understand it, it's that if
> > there are dirty pages in the page cache for files on the fuse
> > filesystem,
> 
> Currently fuse doesn't produce dirty pages.  Normal writes are done
> synchronously, and writable mmap is not supported.  So sync() should
> really be a no-op for fuse.
> 
> > the system will initiate a write-out on them and wait for it to
> > finish.  But if the fuse userspace is frozen, the write-out will
> > never complete.
> 
> Maybe there is some other fs operation being done, possibly not
> directly, but by waiting for a kernel thread, that does that.

We're going to limit the freezing of kernel threads to the ones that explicitly
want to be frozen, so if that's the case, then I think it'll be fixed soon.

> It would be nice, if someone who can reproduce the deadlock could
> debug it.

Agreed.

> Does sysrq still work during suspend? 

Yes, it should work.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  3:29               ` [linux-pm] " Paul Mackerras
@ 2007-07-04 10:33                 ` Rafael J. Wysocki
  2007-07-04 10:48                   ` Paul Mackerras
  2007-07-04 22:19                 ` The big suspend mess Adrian Bunk
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-04 10:33 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm

On Wednesday, 4 July 2007 05:29, Paul Mackerras wrote:
> Rafael J. Wysocki writes:
> 
> > Still, do you really think that we're ready to drop it _right_ _now_ (I'm
> > referring to suspend only) and if so then on what basis (except that you
> > don't like it, which falls short of being a technical argument)?
> 
> The basis is that it (the freezer) causes more deadlocks and other
> problems than it avoids, so it's a net win to remove it.

So, I gather, you're volunteering to handle suspend-related bug reports
from the point at which we drop the freezer from the suspend code path?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  3:38                           ` Paul Mackerras
@ 2007-07-04 10:42                             ` Rafael J. Wysocki
  2007-07-04 10:58                               ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-04 10:42 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Matthew Garrett, linux-pm, linux-kernel

On Wednesday, 4 July 2007 05:38, Paul Mackerras wrote:
> Rafael J. Wysocki writes:
> 
> > Now, please tell me how many driver writers even thought that something
> > might try to access their devices after .suspend() had been executed (or
> > even while it was being executed)?
> 
> Well, I believe that the USB framework copes with this, except
> possibly for some corner cases like the example that Alan Stern
> posted.  The fact that powerbooks suspend and resume without the
> freezer implies that the IDE framework, the console code and the
> framebuffer code cope correctly (though possibly not all chipset
> drivers).

Okay, so in fact you don't know.

And that's my point in this thread.

I won't fight for the freezer for what it's worth, but let's do things in the
_right_ _order_.  For example, let's make sure that by making the $subject
change we won't introduce (too many) regressions and fix the frameworks
that don't get it right.

Using the problems with FUSE as an argument for making this change immediately
doesn't seem to be right to me.

> So I think that a lot of the frameworks already get it right.  Of
> course the quality of the low-level chipset drivers has always been
> pretty variable. :)

Yes, but that's another issue.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 10:08                         ` Oliver Neukum
@ 2007-07-04 10:46                           ` Paul Mackerras
  2007-07-04 10:53                             ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04 10:46 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel, Matthew Garrett

Oliver Neukum writes:

> You cannot simply restart the URB without thinking.
> The device after resumption may or may not be in the stage
> you left it. It needs to be rechecked and some settings must be
> renewed. You cannot simply throw a URB from an arbitrary
> stage of the protocol at it.
> Suspension of devices can only happen at some points
> in the protocol.

Yeah, and?

I said "the higher-level driver needs to do the sensible thing".

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 10:33                 ` Rafael J. Wysocki
@ 2007-07-04 10:48                   ` Paul Mackerras
  2007-07-04 11:10                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04 10:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm

Rafael J. Wysocki writes:

> So, I gather, you're volunteering to handle suspend-related bug reports
> from the point at which we drop the freezer from the suspend code path?

Ben and I are happy to handle all the ones for the platform we
maintain, which currently does suspend without freezing processes. :)

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 10:46                           ` Paul Mackerras
@ 2007-07-04 10:53                             ` Oliver Neukum
  2007-07-04 10:59                               ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-04 10:53 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linux-pm, linux-kernel, Matthew Garrett

On Wednesday, 4 July 2007, Paul Mackerras wrote:
> Oliver Neukum writes:
> 
> > You cannot simply restart the URB without thinking.
> > The device after resumption may or may not be in the stage
> > you left it. It needs to be rechecked and some settings must be
> > renewed. You cannot simply throw a URB from an arbitrary
> > stage of the protocol at it.
> > Suspension of devices can only happen at some points
> > in the protocol.
> 
> Yeah, and?
> 
> I said "the higher-level driver needs to do the sensible thing".

They can't. Device-specific protocols are known to the drivers only.
The fact remains, remove the freezer and you need to go through
all drivers.

	Oliver


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 10:42                             ` Rafael J. Wysocki
@ 2007-07-04 10:58                               ` Paul Mackerras
  2007-07-04 11:25                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04 10:58 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, linux-pm, linux-kernel

Rafael J. Wysocki writes:

> Okay, so in fact you don't know.

Don't know what exactly?

It has been a while since I had my head in the USB code.  I assume
it's being maintained by competent people. :)

> And that's my point in this thread.

Well, I'd be interested in hearing from Matthew whether he has
actually been using his patch in Ubuntu, and if so, what bug reports
he has been receiving related to it?

> I won't fight for the freezer for what it's worth, but let's do things in the
> _right_ _order_.  For example, let's make sure that by making the $subject
> change we won't introduce (too many) regressions and fix the frameworks
> that don't get it right.
> 
> Using the problems with FUSE as an argument for making this change immediately
> doesn't seem to be right to me.

I can see your point, but I won't be moving powermac over to use the
generic suspend path until the freezer is gone, since I am pretty
confident that the drivers we care about behave sensibly, and I have
seen a lot of traffic on linux-pm and lkml about problems caused by
the freezer.

Also, no-one has yet answered my fundamental objection to the freezer,
which is that the very kernel threads we would want to freeze are
often the same ones that we must not freeze, namely the threads that
issue I/O requests in order to satisfy incoming I/O requests.

If there was an automatic way to construct the graph of dependencies
(including data flows) between tasks, and derive an ordering for
freezing that guarantees that all I/Os will get completed without
deadlocks, then I could accept the freezer.  But we don't have
anything like that.
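
To make the idea concrete, here is a minimal sketch of what deriving a freeze order from such a dependency graph could look like, assuming the graph were somehow already known. Everything in it (struct dep_graph, dep_graph_add(), freeze_order(), the MAX_TASKS limit) is a hypothetical illustration and not an existing kernel interface. It is a plain topological sort (Kahn's algorithm): a task is frozen only once no unfrozen task still needs it, and a cycle means no safe order exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 16	/* hypothetical limit, for illustration only */

/*
 * needs[a][b] is true when task a cannot complete its I/O unless task b
 * is still running, so a has to be frozen before b.
 */
struct dep_graph {
	size_t ntasks;
	bool needs[MAX_TASKS][MAX_TASKS];
};

static void dep_graph_init(struct dep_graph *g, size_t ntasks)
{
	assert(ntasks <= MAX_TASKS);
	g->ntasks = ntasks;
	for (size_t a = 0; a < MAX_TASKS; a++)
		for (size_t b = 0; b < MAX_TASKS; b++)
			g->needs[a][b] = false;
}

static void dep_graph_add(struct dep_graph *g, size_t a, size_t b)
{
	assert(a < g->ntasks && b < g->ntasks);
	g->needs[a][b] = true;
}

/*
 * Kahn's algorithm.  On success, writes a safe freeze order into order[]
 * and returns 0.  Returns -1 if the dependencies form a cycle, in which
 * case no order can guarantee that all I/O completes.
 */
static int freeze_order(const struct dep_graph *g, size_t *order)
{
	size_t waiters[MAX_TASKS] = {0};	/* unfrozen tasks that need task b */
	bool frozen[MAX_TASKS] = {false};
	size_t count = 0;

	for (size_t a = 0; a < g->ntasks; a++)
		for (size_t b = 0; b < g->ntasks; b++)
			if (g->needs[a][b])
				waiters[b]++;

	while (count < g->ntasks) {
		bool progress = false;

		for (size_t t = 0; t < g->ntasks; t++) {
			if (frozen[t] || waiters[t] != 0)
				continue;
			/* Nobody unfrozen needs t any more: freeze it. */
			frozen[t] = true;
			order[count++] = t;
			for (size_t b = 0; b < g->ntasks; b++)
				if (g->needs[t][b])
					waiters[b]--;
			progress = true;
		}
		if (!progress)
			return -1;	/* cycle */
	}
	return 0;
}
```

For example, with task 0 an application, task 1 a FUSE daemon serving it, and task 2 a thread that the daemon's I/O depends on, the order comes out as 0, 1, 2. The hard part is not this sort but obtaining the needs[][] matrix in the first place, which is exactly what is missing.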

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 10:53                             ` Oliver Neukum
@ 2007-07-04 10:59                               ` Paul Mackerras
  2007-07-04 11:02                                 ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04 10:59 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: linux-pm, linux-kernel, Matthew Garrett

Oliver Neukum writes:

> They can't. Device specific protocols are known to the drivers only.
> The fact remains, remove the freezer and you need to go through
> all drivers.

The freezer does not actually mean that you don't have to get the
drivers right, because kernel threads can issue I/O requests.

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 10:59                               ` Paul Mackerras
@ 2007-07-04 11:02                                 ` Oliver Neukum
  0 siblings, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-04 11:02 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linux-pm, linux-kernel, Matthew Garrett

On Wednesday, 4 July 2007, Paul Mackerras wrote:
> Oliver Neukum writes:
> 
> > They can't. Device specific protocols are known to the drivers only.
> > The fact remains, remove the freezer and you need to go through
> > all drivers.
> 
> The freezer does not actually mean that you don't have to get the
> drivers right, because kernel threads can issue I/O requests.

Kernel threads do not issue requests for the hell of it. And yes,
kernel threads must be aware of suspension. Threads are few,
drivers many.

	Oliver

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 10:48                   ` Paul Mackerras
@ 2007-07-04 11:10                     ` Rafael J. Wysocki
  2007-07-04 11:24                       ` Paul Mackerras
  2007-07-04 11:25                       ` Paul Mackerras
  0 siblings, 2 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-04 11:10 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm

On Wednesday, 4 July 2007 12:48, Paul Mackerras wrote:
> Rafael J. Wysocki writes:
> 
> > So, I gather, you're volunteering to handle suspend-related bug reports
> > from the point in which we drop the freezer from the suspend code path?
> 
> Ben and I are happy to handle all the ones for the platform we
> maintain, which currently does suspend without freezing processes. :)

I mean all platforms.  After all, the $subject change won't affect yours.

BTW, does your platform's suspend work on SMP systems?

Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 11:10                     ` Rafael J. Wysocki
@ 2007-07-04 11:24                       ` Paul Mackerras
  2007-07-04 14:30                         ` Rafael J. Wysocki
  2007-07-04 11:25                       ` Paul Mackerras
  1 sibling, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04 11:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm

Rafael J. Wysocki writes:

> BTW, does your platform's suspend work on SMP systems?

Yes; currently we require userspace to offline all cpus other than the
boot cpu before initiating the suspend.

The main difficulty is actually that SMP powermacs that can suspend
tend to have video cards that get powered off in suspend.  We know how
to re-initialize one (the Radeon RV100 QW) but not others.  That's an
orthogonal issue to the issues we have been discussing, though.

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 11:10                     ` Rafael J. Wysocki
  2007-07-04 11:24                       ` Paul Mackerras
@ 2007-07-04 11:25                       ` Paul Mackerras
  1 sibling, 0 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04 11:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm

Rafael J. Wysocki writes:

> > > So, I gather, you're volunteering to handle suspend-related bug reports
> > > from the point in which we drop the freezer from the suspend code path?
> > 
> > Ben and I are happy to handle all the ones for the platform we
> > maintain, which currently does suspend without freezing processes. :)
> 
> I mean all platforms.  After all, the $subject change won't affect yours.

Well, I can't commit to handling all bug reports, but I am happy to
help out where I can...

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 10:58                               ` Paul Mackerras
@ 2007-07-04 11:25                                 ` Rafael J. Wysocki
  2007-07-04 11:34                                   ` Paul Mackerras
                                                     ` (2 more replies)
  0 siblings, 3 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-04 11:25 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Matthew Garrett, linux-pm, linux-kernel

On Wednesday, 4 July 2007 12:58, Paul Mackerras wrote:
> Rafael J. Wysocki writes:
> 
> > Okay, so in fact you don't know.
> 
> Don't know what exactly?

How many drivers will be adversely affected by the $subject change.

> It has been a while since I had my head in the USB code.  I assume
> it's being maintained by competent people. :)

I'm not talking about USB.
 
> > And that's my point in this thread.
> 
> Well, I'd be interested in hearing from Matthew whether he has
> actually been using his patch in Ubuntu, and if so, what bug reports
> he has been receiving related to it?

Me too.

> > I won't fight for the freezer for what it's worth, but let's do things in the
> > _right_ _order_.  For example, let's make sure that by making the $subject
> > change we won't introduce (too many) regressions and fix the frameworks
> > that don't get it right.
> > 
> > Using the problems with FUSE as an argument for making this change immediately
> > doesn't seem to be right to me.
> 
> I can see your point, but I won't be moving powermac over to use the
> generic suspend path until the freezer is gone, since I am pretty
> confident that the drivers we care about behave sensibly, and I have
> seen a lot of traffic on linux-pm and lkml about problems caused by
> the freezer.

They are mostly related to kernel threads, that we've already agreed not to
freeze (except for the ones that want that, but they will be responsible for
getting everything right).  The initial patches for that are in -mm and more
will come.

> Also, no-one has yet answered my fundamental objection to the freezer,
> which is that the very kernel threads we would want to freeze are
> often the same ones that we must not freeze, namely the threads that
> issue I/O requests in order to satisfy incoming I/O requests.

See above.  We're moving away from freezing kernel threads.

> If there was an automatic way to construct the graph of dependencies
> (including data flows) between tasks, and derive an ordering for
> freezing that guarantees that all I/Os will get completed without
> deadlocks, then I could accept the freezer.  But we don't have
> anything like that.

No we don't.

Still, my position is this:

1) The freezer (in the modified form, with the freezing of kernel threads
limited to the ones that want to be frozen) is needed for hibernation.

2) The freezer is generally not needed for suspend, _but_ there are drivers
in the tree that rely on it being used.  Thus, at some point in time we can
remove the freezer from the suspend code path, _but_ no sooner than we are
sure that the majority of drivers is prepared for that.

3) In the meantime, if there are freezer-related problems, they should be
fixed rather than used as arguments for immediate removal of it, because of 2).

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 11:25                                 ` Rafael J. Wysocki
@ 2007-07-04 11:34                                   ` Paul Mackerras
  2007-07-04 14:12                                     ` Dmitry Torokhov
  2007-07-04 15:38                                     ` Alan Stern
  2007-07-04 11:51                                   ` Miklos Szeredi
  2007-07-04 12:41                                   ` Theodore Tso
  2 siblings, 2 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-04 11:34 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, linux-pm, linux-kernel

Rafael J. Wysocki writes:

> They are mostly related to kernel threads, that we've already agreed not to
> freeze (except for the ones that want that, but they will be responsible for
> getting everything right).  The initial patches for that are in -mm and more
> will come.

Serious question: which kernel threads would actually want to be
frozen?

Threads that do no I/O at all don't care about suspend/resume and
don't need to be frozen in any case.  Threads that issue I/O requests
in order to service incoming I/O requests can't be frozen because of
the possibility of deadlock.  Which leaves threads that do I/O just
for the fun of it. :)

What am I missing?

> > Also, no-one has yet answered my fundamental objection to the freezer,
> > which is that the very kernel threads we would want to freeze are
> > often the same ones that we must not freeze, namely the threads that
> > issue I/O requests in order to satisfy incoming I/O requests.
> 
> See above.  We're moving away from freezing kernel threads.

I believe the distinction between threads and user processes is a
false one, because user processes can now do things that were formerly
only doable by kernel threads.

> > If there was an automatic way to construct the graph of dependencies
> > (including data flows) between tasks, and derive an ordering for
> > freezing that guarantees that all I/Os will get completed without
> > deadlocks, then I could accept the freezer.  But we don't have
> > anything like that.
> 
> No we don't.
> 
> Still, my position is this:
> 
> 1) The freezer (in the modified form, with the freezing of kernel threads
> limited to the ones that want to be frozen) is needed for hibernation.
> 
> 2) The freezer is generally not needed for suspend, _but_ there are drivers
> in the tree that rely on it being used.  Thus, at some point in time we can

Do you know which drivers they are?  I'm happy to help hack things
into shape.

> remove the freezer from the suspend code path, _but_ no sooner than we are
> sure that the majority of drivers is prepared for that.
> 
> 3) In the meantime, if there are freezer-related problems, they should be
> fixed rather than used as arguments for immediate removal of it, because of 2).

I don't know how you can make the freezer completely deadlock-free
while still providing the guarantee that some drivers currently need,
without constructing the dependency graph I mentioned.

Paul.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 11:25                                 ` Rafael J. Wysocki
  2007-07-04 11:34                                   ` Paul Mackerras
@ 2007-07-04 11:51                                   ` Miklos Szeredi
  2007-07-04 14:41                                     ` Rafael J. Wysocki
  2007-07-04 15:42                                     ` Alan Stern
  2007-07-04 12:41                                   ` Theodore Tso
  2 siblings, 2 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-04 11:51 UTC (permalink / raw)
  To: rjw; +Cc: paulus, mjg59, linux-pm, linux-kernel

> Still, my position is this:
> 
> 1) The freezer (in the modified form, with the freezing of kernel threads
> limited to the ones that want to be frozen) is needed for hibernation.
> 
> 2) The freezer is generally not needed for suspend, _but_ there are drivers
> in the tree that rely on it being used.  Thus, at some point in time we can
> remove the freezer from the suspend code path, _but_ no sooner than we are
> sure that the majority of drivers is prepared for that.

And we won't know if drivers are OK until we remove the freezer,
catch-22.

So I think we need to disable the freezer at least in -mm and/or
optionally in -linus.

I applied Matthew's patch, and suspend did in fact stop working
(thinkpad t60), but there was nothing catastrophic.  Here's the dmesg
if somebody is interested:

Suspending console(s)
usb_endpoint usbdev5.3_ep83: PM: suspend 0->2, parent 5-2:1.0 already 2
usb_endpoint usbdev5.3_ep02: PM: suspend 0->2, parent 5-2:1.0 already 2
usb_endpoint usbdev5.3_ep81: PM: suspend 0->2, parent 5-2:1.0 already 2
hub 2-0:1.0: suspend error -16
suspend_device(): usb_suspend+0x0/0x1c() returns -16
Could not suspend device usb2: error -16
usb_endpoint usbdev5.3_ep81: PM: resume from 0, parent 5-2:1.0 still 2
usb_endpoint usbdev5.3_ep02: PM: resume from 0, parent 5-2:1.0 still 2
usb_endpoint usbdev5.3_ep83: PM: resume from 0, parent 5-2:1.0 still 2
Some devices failed to suspend

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 11:25                                 ` Rafael J. Wysocki
  2007-07-04 11:34                                   ` Paul Mackerras
  2007-07-04 11:51                                   ` Miklos Szeredi
@ 2007-07-04 12:41                                   ` Theodore Tso
  2007-07-04 14:40                                     ` Rafael J. Wysocki
  2 siblings, 1 reply; 388+ messages in thread
From: Theodore Tso @ 2007-07-04 12:41 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Paul Mackerras, Matthew Garrett, linux-pm, linux-kernel

On Wed, Jul 04, 2007 at 01:25:55PM +0200, Rafael J. Wysocki wrote:
> > Don't know what exactly?
> 
> How many drivers will be adversely affected by the $subject change.

Ok, so how about a CONFIG option which removes the freezer, so we can
find out experimentally how many people have problems without it?  We can make it be
experimental at first, or (my preference) make it be the default
initially, and if people complain that their laptop's suspend feature
is broken, we can tell them how to turn back on the
CONFIG_FREEZER_DEPRECATED option.

							- Ted

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 11:34                                   ` Paul Mackerras
@ 2007-07-04 14:12                                     ` Dmitry Torokhov
  2007-07-04 15:38                                     ` Alan Stern
  1 sibling, 0 replies; 388+ messages in thread
From: Dmitry Torokhov @ 2007-07-04 14:12 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Rafael J. Wysocki, Matthew Garrett, linux-pm, linux-kernel

On 7/4/07, Paul Mackerras <paulus@samba.org> wrote:
> Rafael J. Wysocki writes:
>
> > They are mostly related to kernel threads, that we've already agreed not to
> > freeze (except for the ones that want that, but they will be responsible for
> > getting everything right).  The initial patches for that are in -mm and more
> > will come.
>
> Serious question: which kernel threads would actually want to be
> frozen?
>
> Threads that do no I/O at all don't care about suspend/resume and
> don't need to be frozen in any case.  Threads that issue I/O requests
> in order to service incoming I/O requests can't be frozen because of
> the possibility of deadlock.  Which leaves threads that do I/O just
> for the fun of it. :)
>
> What am I missing?
>

I like kseriod and kgameportd to be frozen. If the fast resume of a serio
port fails, we hand the task of a full resume to kseriod. But when we
hibernate and all the devices are resumed again so that the snapshot can be
written, I don't care if the touchpad failed to resume - it can be dealt
with later, when the system actually resumes. So I like that by
freezing I do not delay hibernation, and all such requests simply get
discarded as they are not in the image. This mostly affects
hibernation, though.

Having these threads frozen also guarantees that the serio/gameport device
tree is stable during suspend/resume, and I can process outstanding
requests later, when the system is fully up.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 11:24                       ` Paul Mackerras
@ 2007-07-04 14:30                         ` Rafael J. Wysocki
  2007-07-05  0:15                           ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-04 14:30 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm

On Wednesday, 4 July 2007 13:24, Paul Mackerras wrote:
> Rafael J. Wysocki writes:
> 
> > BTW, does your platform's suspend work on SMP systems?
> 
> Yes; currently we require userspace to offline all cpus other than the
> boot cpu before initiating the suspend.

This is incompatible with the code in kernel/power/main.c, since we only
disable the nonboot CPUs after devices have been suspended.  Do you think that
your framework can be modified to work without disabling the nonboot CPUs
by the user space?

> The main difficulty is actually that SMP powermacs that can suspend
> tend to have video cards that get powered off in suspend.  We know how
> to re-initialize one (the Radeon RV100 QW) but not others.  That's an
> orthogonal issue to the issues we have been discussing, though.

Sure.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 22:42                           ` Matthew Garrett
@ 2007-07-04 14:38                             ` Alan Stern
  2007-07-04 14:58                               ` Matthew Garrett
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-04 14:38 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Tue, 3 Jul 2007, Matthew Garrett wrote:

> On Tue, Jul 03, 2007 at 06:21:42PM -0400, Alan Stern wrote:
> > On Tue, 3 Jul 2007, Matthew Garrett wrote:
> > > We're used to the idea of applications blocking when a resource they're 
> > > using goes away - NFS has done it forever. 
> > 
> > You persist in evading my point.  I'm not worried about applications;  
> > I'm worried about drivers.
> > 
> > Let me put it explicitly: You're writing a driver.  You're working on
> > the read, write, or probe method.  You add code to check if a system
> > sleep is underway.  Suppose the answer is Yes -- what does your driver
> > do next?
> 
> Leave the process blocked and defer any i/o until after resume. Why does 
> it need to be any more complicated than that?

(1) The driver will undoubtedly hold some mutex or semaphore at the 
time it checks whether a system sleep is underway.  You will have to 
drop it before blocking and then reacquire it afterward.

(2) The driver may have been called by some other routine which holds a 
mutex needed for resuming the device.  In this case the driver _can't_ 
drop the mutex and so the resume will deadlock.


Okay, I agree that (1) can be handled without too much effort.  But 
doing it adds an extra test to _every_ driver's I/O pathway.  Freezing 
userspace does not incur all this additional overhead.

(2) shouldn't arise during normal read and write operations, but it
certainly _will_ arise during probe.
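
As a rough illustration of the extra test that (1) adds to every I/O path, here is a minimal single-threaded sketch of a driver that defers writes while a system sleep is underway and replays them on resume. The toy_dev structure and the toy_*() functions are hypothetical and exist only for this example. Note what it leaves out: there is no locking at all, so it does not model the drop-and-reacquire dance from (1), and it says nothing about the probe deadlock in (2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_LEN 8	/* hypothetical deferral queue depth */

struct toy_dev {
	bool sleeping;			/* set by suspend, cleared by resume */
	int hw_total;			/* stands in for data that reached the hardware */
	int pending[TOY_QUEUE_LEN];	/* requests deferred while sleeping */
	size_t npending;
};

static void toy_init(struct toy_dev *d)
{
	assert(d != NULL);
	d->sleeping = false;
	d->hw_total = 0;
	d->npending = 0;
}

/*
 * Returns 0 if the request was accepted (sent or queued), or -1 if the
 * device is sleeping and the deferral queue is full.
 */
static int toy_write(struct toy_dev *d, int value)
{
	if (d->sleeping) {	/* the extra test on every I/O path */
		if (d->npending == TOY_QUEUE_LEN)
			return -1;
		d->pending[d->npending++] = value;
		return 0;
	}
	d->hw_total += value;
	return 0;
}

static void toy_suspend(struct toy_dev *d)
{
	d->sleeping = true;
}

/* Wake the device and replay everything that was deferred, in order. */
static void toy_resume(struct toy_dev *d)
{
	d->sleeping = false;
	for (size_t i = 0; i < d->npending; i++)
		d->hw_total += d->pending[i];
	d->npending = 0;
}
```

Even in this toy form the check sits on the hot path of every write, whether or not a suspend ever happens, which is the overhead being weighed against freezing userspace.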

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 12:41                                   ` Theodore Tso
@ 2007-07-04 14:40                                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-04 14:40 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Paul Mackerras, Matthew Garrett, linux-pm, linux-kernel

On Wednesday, 4 July 2007 14:41, Theodore Tso wrote:
> On Wed, Jul 04, 2007 at 01:25:55PM +0200, Rafael J. Wysocki wrote:
> > > Don't know what exactly?
> > 
> > How many drivers will be adversely affected by the $subject change.
> 
> Ok, so how about a CONFIG option which removes the freezer, so we can
> find out experimentally how many people have problems without it?  We can make it be
> experimental at first, or (my preference) make it be the default
> initially, and if people complain that their laptop's suspend feature
> is broken, we can tell them how to turn back on the
> CONFIG_FREEZER_DEPRECATED option.

The freezer is generally still necessary for hibernation, so I won't call
it "DEPRECATED".

Moreover, I'd prefer to make drivers use different callbacks for hibernation
and suspend before we do that.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 11:51                                   ` Miklos Szeredi
@ 2007-07-04 14:41                                     ` Rafael J. Wysocki
  2007-07-04 14:45                                       ` Miklos Szeredi
  2007-07-04 15:42                                     ` Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-04 14:41 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: paulus, mjg59, linux-pm, linux-kernel

On Wednesday, 4 July 2007 13:51, Miklos Szeredi wrote:
> > Still, my position is this:
> > 
> > 1) The freezer (in the modified form, with the freezing of kernel threads
> > limited to the ones that want to be frozen) is needed for hibernation.
> > 
> > 2) The freezer is generally not needed for suspend, _but_ there are drivers
> > in the tree that rely on it being used.  Thus, at some point in time we can
> > remove the freezer from the suspend code path, _but_ no sooner than we are
> > sure that the majority of drivers is prepared for that.
> 
> And we won't know if drivers are OK until we remove the freezer,
> catch-22.

I disagree.  We can learn that by auditing the drivers.

> So I think we need to disable the freezer at least in -mm and/or
> optionally in -linus.
> 
> I applied Matthew's patch, and suspend did in fact stop working
> (thinkpad t60), but there was nothing catastrophic.  Here's the dmesg
> if somebody is interested:
> 
> Suspending console(s)
> usb_endpoint usbdev5.3_ep83: PM: suspend 0->2, parent 5-2:1.0 already 2
> usb_endpoint usbdev5.3_ep02: PM: suspend 0->2, parent 5-2:1.0 already 2
> usb_endpoint usbdev5.3_ep81: PM: suspend 0->2, parent 5-2:1.0 already 2
> hub 2-0:1.0: suspend error -16
> suspend_device(): usb_suspend+0x0/0x1c() returns -16
> Could not suspend device usb2: error -16
> usb_endpoint usbdev5.3_ep81: PM: resume from 0, parent 5-2:1.0 still 2
> usb_endpoint usbdev5.3_ep02: PM: resume from 0, parent 5-2:1.0 still 2
> usb_endpoint usbdev5.3_ep83: PM: resume from 0, parent 5-2:1.0 still 2
> Some devices failed to suspend

No, it's not catastrophic, but something like this will result in a bug report
with "regression" in the subject.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 23:11               ` Paul Mackerras
  2007-07-04  8:11                 ` Oliver Neukum
@ 2007-07-04 14:44                 ` Alan Stern
  1 sibling, 0 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-04 14:44 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Oliver Neukum, Matthew Garrett, linux-pm, linux-kernel

On Wed, 4 Jul 2007, Paul Mackerras wrote:

> Oliver Neukum writes:
> 
> > USB devices certainly have suspend methods.
> 
> Indeed, and the USB framework has code to know when the host
> controller is suspended and avoid trying to send out urbs in that
> case.  Or at least it did last time I looked at it in any detail; it's
> been "just working" - including suspending and resuming, without the
> freezer - for quite a while now.

Evidently you haven't been stressing it.  You might try suspending
while printing to a USB printer.

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 14:41                                     ` Rafael J. Wysocki
@ 2007-07-04 14:45                                       ` Miklos Szeredi
  2007-07-04 15:03                                         ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-04 14:45 UTC (permalink / raw)
  To: rjw; +Cc: paulus, mjg59, linux-pm, linux-kernel

> On Wednesday, 4 July 2007 13:51, Miklos Szeredi wrote:
> > > Still, my position is this:
> > > 
> > > 1) The freezer (in the modified form, with the freezing of kernel threads
> > > limited to the ones that want to be frozen) is needed for hibernation.
> > > 
> > > 2) The freezer is generally not needed for suspend, _but_ there are drivers
> > > in the tree that rely on it being used.  Thus, at some point in time we can
> > > remove the freezer from the suspend code path, _but_ no sooner than we are
> > > sure that the majority of drivers is prepared for that.
> > 
> > And we won't know if drivers are OK until we remove the freezer,
> > catch-22.
> 
> I disagree.  We can learn that by auditing the drivers.

In theory, yes.  But it scales far worse than letting everyone
experiment/report/fix problems as they crop up.

> > So I think we need to disable the freezer at least in -mm and/or
> > optionally in -linus.
> > 
> > I applied Matthew's patch, and suspend did in fact stop working
> > (thinkpad t60), but there was nothing catastrophic.  Here's the dmesg
> > if somebody is interested:
> > 
> > Suspending console(s)
> > usb_endpoint usbdev5.3_ep83: PM: suspend 0->2, parent 5-2:1.0 already 2
> > usb_endpoint usbdev5.3_ep02: PM: suspend 0->2, parent 5-2:1.0 already 2
> > usb_endpoint usbdev5.3_ep81: PM: suspend 0->2, parent 5-2:1.0 already 2
> > hub 2-0:1.0: suspend error -16
> > suspend_device(): usb_suspend+0x0/0x1c() returns -16
> > Could not suspend device usb2: error -16
> > usb_endpoint usbdev5.3_ep81: PM: resume from 0, parent 5-2:1.0 still 2
> > usb_endpoint usbdev5.3_ep02: PM: resume from 0, parent 5-2:1.0 still 2
> > usb_endpoint usbdev5.3_ep83: PM: resume from 0, parent 5-2:1.0 still 2
> > Some devices failed to suspend
> 
> No, it's not catastrophic, but something like this will result in a
> bug report with "regression" in the subject.

If it was due to a config option marked experimental, then it's not a
regression.

It's a bug, and it needs looking at, but while the freezer is not
completely removed, it's not a serious problem.

So I agree with Ted, a config option (or maybe a runtime sysctl
tunable) to turn off the freezer for suspend should only have positive
effects.
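
A minimal sketch of what such a switch could look like, loosely modelled on the freeze/thaw handling in suspend_prepare() from the patch at the top of this thread. The suspend_use_freezer flag and all toy_*() names are hypothetical; the freeze and thaw functions are stand-ins that only count calls, not the real freeze_processes() and thaw_processes().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical tunable: false means "suspend without the freezer". */
static bool suspend_use_freezer = true;

/* Stand-ins for the real freezer, instrumented for the example. */
static bool toy_freeze_fails;
static int toy_freeze_calls;
static int toy_thaw_calls;

static int toy_freeze_processes(void)
{
	toy_freeze_calls++;
	return toy_freeze_fails ? 1 : 0;	/* nonzero means failure */
}

static void toy_thaw_processes(void)
{
	/* Thawing only makes sense after a freeze attempt. */
	assert(toy_freeze_calls > 0);
	toy_thaw_calls++;
}

/*
 * Mirrors the error path of the original code: a failed freeze is undone
 * by thawing and reported as -EAGAIN.  With the tunable off, the freezer
 * is never touched.
 */
static int toy_suspend_prepare(void)
{
	if (suspend_use_freezer && toy_freeze_processes()) {
		toy_thaw_processes();
		return -EAGAIN;
	}
	return 0;
}
```

With the flag left on, behaviour is unchanged, so anyone whose suspend breaks with it off has an immediate way back while the report gets looked at.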

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  4:59                   ` Paul Mackerras
@ 2007-07-04 14:57                     ` Alan Stern
  2007-07-05  0:23                       ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-04 14:57 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Johannes Berg, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett,
	Benjamin Herrenschmidt

On Wed, 4 Jul 2007, Paul Mackerras wrote:

> Alan Stern writes:
> 
> > I disagree.  The problem isn't the kernel calling userspace; it's
> > userspace trying to do I/O at a time when everything is supposed to be
> > quiescing.  Detecting that and blocking it in drivers is hard and
> > error-prone; preventing it by freezing userspace is easy and cheap.
> 
> And unreliable, and prone to deadlocks, and invasive - requiring
> changes to kernel threads that have nothing to do with drivers or
> suspend/resume.

Let's agree the kernel threads and the freezer are a separate issue.  
In the most recent kernels, the freezer does not suspend kernel threads 
by default.

I agree the kernel threads which try to do I/O during a suspend will 
need extra attention.  However if these threads are necessary for the 
suspend procedure, then blocking them (which is how people on this 
thread have been saying drivers should treat I/O requests during a 
suspend) will cause additional problems.  There's no way around it; 
these threads _will_ require more work.

> > The reasons why the PPC people dislike the whole idea aren't clear to
> > me. 
> 
> Our experience is that it isn't necessary.  It's extra code that in
> practice causes deadlocks and added maintenance burden for no
> discernable benefit.

I have discussed the benefits elsewhere.  As for the deadlocks -- do 
you still observe them if you use the version of the freezer which 
doesn't freeze kernel threads?

> The freezer doesn't achieve its stated goal of preventing drivers from
> getting I/O requests after suspend, since kernel threads can (and do)
> initiate I/O.  So then we say that some kernel threads need to be
> frozen and others don't, but making that decision is difficult and
> error-prone.

No -- we say that the kernel threads which generate I/O requests during 
suspend need to be changed.

> In fact I believe that making a distinction between user and kernel
> threads is wrong and likely to lead to problems, since userspace can
> be involved in doing I/O (e.g. FUSE or the user-space driver
> framework).  So the argument of the previous paragraph also applies to
> some userspace processes.

Userspace cannot do I/O directly on its own, apart from some
exceptional situations where a privileged task directly twiddles some
I/O ports or the equivalent.

There remains the problem of user tasks whose assistance is required to 
carry out some I/O (as with FUSE).  If the I/O can be deferred until 
after the resume, then there's no problem.  If the I/O can be carried 
out before the suspend, then it should be.  And finally, if the I/O 
must be done during the suspend, you're in real trouble -- how do you 
do I/O to a suspended device?

> I remain convinced that the right approach is to fix the drivers to do
> one of two things; either do something in the suspend call to block
> further requests to the device, or use a late-suspend call to put
> their device into a low-power state.  Of course, correctly-written
> frameworks can do a lot to help the chipset drivers here.

The first alternative is a possibility.  My argument all along has been 
that it is difficult and error-prone, and it adds more overhead to 
system operation (even when not suspending!) than simply freezing 
userspace.

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 14:38                             ` Alan Stern
@ 2007-07-04 14:58                               ` Matthew Garrett
  2007-07-04 15:02                                 ` Oliver Neukum
  2007-07-04 15:57                                 ` Alan Stern
  0 siblings, 2 replies; 388+ messages in thread
From: Matthew Garrett @ 2007-07-04 14:58 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-kernel, linux-pm

On Wed, Jul 04, 2007 at 10:38:47AM -0400, Alan Stern wrote:

> Okay, I agree that (1) can be handled without too much effort.  But 
> doing it adds an extra test to _every_ driver's I/O pathway.  Freezing 
> userspace does not incur all this additional overhead.

For runtime PM to work it's already necessary to have a test in that 
path to check if the device is suspended. I can't see how this adds any 
overhead to the common case.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 14:58                               ` Matthew Garrett
@ 2007-07-04 15:02                                 ` Oliver Neukum
  2007-07-04 15:57                                 ` Alan Stern
  1 sibling, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-04 15:02 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Alan Stern, linux-kernel, linux-pm

On Wednesday, 4 July 2007, Matthew Garrett wrote:
> On Wed, Jul 04, 2007 at 10:38:47AM -0400, Alan Stern wrote:
> 
> > Okay, I agree that (1) can be handled without too much effort.  But 
> > doing it adds an extra test to _every_ driver's I/O pathway.  Freezing 
> > userspace does not incur all this additional overhead.
> 
> For runtime PM to work it's already necessary to have a test in that 
> path to check if the device is suspended. I can't see how this adds any 
> overhead to the common case.

No,

you just make sure the device reports to upper layers when it might
be busy. The USB layer manages this quite well without burdening
the common case.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 14:45                                       ` Miklos Szeredi
@ 2007-07-04 15:03                                         ` Oliver Neukum
  2007-07-04 15:17                                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-04 15:03 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: rjw, paulus, mjg59, linux-pm, linux-kernel

On Wednesday, 4 July 2007, Miklos Szeredi wrote:
> > > And we won't know if drivers are OK until we remove the freezer,
> > > catch-22.
> > 
> > I disagree.  We can learn that by auditing the drivers.
> 
> In theory, yes.  But it scales far worse than letting everyone
> experiment/report/fix problems as they crop up.

You will open many small races. These would be very painful
to debug.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  4:02               ` Paul Mackerras
@ 2007-07-04 15:04                 ` Alan Stern
  2007-07-05  0:28                   ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-04 15:04 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Oliver Neukum, Matthew Garrett, linux-pm, linux-kernel

On Wed, 4 Jul 2007, Paul Mackerras wrote:

> Alan Stern writes:
> 
> > > Most drivers suspended their hardware in the second call.  If they are
> > > in the middle of a conversation with their device that *has* to be
> > > completed, they can do that by polling.
> > 
> > Ugh.  That will cause problems when you try to integrate runtime 
> > suspend.  In fact this whole approach is unsuitable for runtime PM and 
> > it obscures the similarities between runtime PM and STR.
> 
> Yes there are similarities, but it would be a big mistake to say that
> a requirement for STR is that all drivers do runtime PM.

That's not what I'm saying.  What I'm saying is that it would be a big 
mistake to force all drivers which implement runtime PM to do it using 
a separate code path from system PM.

> The main attraction of the late-suspend call is that it really does,
> reliably, guarantee that the driver's I/O request methods won't get
> called between the late-suspend call and the early-resume call.

For some drivers (like USB), carrying out an actual suspend requires a
delay.  Right now we implement those delays using wait_event(),
wait_for_completion(), and so on.  Would you have us check at runtime
whether or not a system suspend is underway and in each case use a
busy-loop instead if it is?

What happens if, in order to carry out the late-suspend, a driver needs
to acquire a mutex which happens to be held by some other task?  That
other task won't be able to run and release the mutex, so you will
deadlock.

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04  3:55               ` Paul Mackerras
@ 2007-07-04 15:12                 ` Alan Stern
  2007-07-05  0:35                   ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-04 15:12 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Johannes Berg, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett

On Wed, 4 Jul 2007, Paul Mackerras wrote:

> Whether or not to resume a suspended device when an I/O request comes
> in is a policy decision, and there could be cases where the user wants
> I/O requests to be blocked, or to fail, or to be dropped while the
> device is suspended, even for runtime power management.  For example,
> a sound card could be suspended due to a low-battery condition, and in
> that case you would want the driver to just drop any data that
> userspace tries to write to the soundcard.

We have provisions for that (my earlier description was somewhat 
incomplete).

> > Yes, the code could be changed to keep track of the reason for a device
> > suspend.  But that just raises the old problem of what to do when
> > there's an I/O request for a suspended device during STR.
> 
> Is this actually a real problem?  I would think the policy would be
> "block" for block devices (pun not intended :), "drop" for network
> devices, etc.

It is indeed a real problem, or at least, it can be.

> > Consider a particularly troublesome case: During STR, a non-frozen task
> > writes to /sys/bus/BBB/drivers/DDD/bind.  The sysfs core grabs the
> > device semaphore and calls the driver's probe routine.  If the driver
> > isn't PM-aware it simply tries to initialize the device and fails
> > because the device is already suspended.  That's no good; it isn't
> > transparent.
> 
> How did the device get suspended if it didn't have a driver?  If it
> did have a driver, why didn't the bind attempt fail?

Bus subsystems can suspend devices with no drivers.

> Suppose the device-model core code simply blocked all bind and unbind
> requests while suspend is under way, until resume is finished.
> Wouldn't that solve the problem?

It would help.  It would help even more if the sysfs core also blocked
all I/O while suspend is under way.  (Although this might be tricky, 
considering that the suspend is initiated by a sysfs write...)

The fact remains that lots of drivers would still need to be changed.  
In the read and write methods someone would have to add code amounting
to this:

	if (suspend_is_under_way()) {
		mutex_unlock(...);
		block_until_resume();
		goto restart;
	}

Freezing userspace is a small amount of code by comparison.
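
For illustration only, the check above might look roughly like this as a
self-contained userspace sketch.  It uses pthreads in place of kernel
primitives, and every name in it (demo_write, demo_total, begin_suspend,
finish_resume, and the two helpers) is hypothetical; none of them is a
kernel API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for PM core state; not kernel APIs. */
static pthread_mutex_t pm_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pm_resumed = PTHREAD_COND_INITIALIZER;
static bool suspend_under_way;

/* Hypothetical per-device lock and state. */
static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;
static int bytes_written;

/* Simulated PM core: mark the start of a system suspend. */
void begin_suspend(void)
{
	pthread_mutex_lock(&pm_lock);
	suspend_under_way = true;
	pthread_mutex_unlock(&pm_lock);
}

/* Simulated PM core: mark the end of resume and wake blocked writers. */
void finish_resume(void)
{
	pthread_mutex_lock(&pm_lock);
	suspend_under_way = false;
	pthread_cond_broadcast(&pm_resumed);
	pthread_mutex_unlock(&pm_lock);
}

static bool suspend_is_under_way(void)
{
	bool ret;

	pthread_mutex_lock(&pm_lock);
	ret = suspend_under_way;
	pthread_mutex_unlock(&pm_lock);
	return ret;
}

static void block_until_resume(void)
{
	pthread_mutex_lock(&pm_lock);
	while (suspend_under_way)
		pthread_cond_wait(&pm_resumed, &pm_lock);
	pthread_mutex_unlock(&pm_lock);
}

/* A driver write method carrying the extra check. */
int demo_write(int len)
{
	assert(len >= 0);
restart:
	pthread_mutex_lock(&dev_lock);
	if (suspend_is_under_way()) {
		/* Drop the device lock before sleeping, then start over. */
		pthread_mutex_unlock(&dev_lock);
		block_until_resume();
		goto restart;
	}
	bytes_written += len;
	pthread_mutex_unlock(&dev_lock);
	return len;
}

/* Total bytes accepted so far. */
int demo_total(void)
{
	int ret;

	pthread_mutex_lock(&dev_lock);
	ret = bytes_written;
	pthread_mutex_unlock(&dev_lock);
	return ret;
}
```

The device lock is released before blocking so that the suspend path is
never left waiting on a sleeping writer.  Each read and write method in
each driver would need its own copy of this pattern.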

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 15:03                                         ` Oliver Neukum
@ 2007-07-04 15:17                                           ` Rafael J. Wysocki
  2007-07-05  0:29                                             ` Paul Mackerras
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-04 15:17 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Miklos Szeredi, paulus, mjg59, linux-pm, linux-kernel

On Wednesday, 4 July 2007 17:03, Oliver Neukum wrote:
> On Wednesday, 4 July 2007, Miklos Szeredi wrote:
> > > > And we won't know if drivers are OK until we remove the freezer,
> > > > catch-22.
> > > 
> > > I disagree.  We can learn that by auditing the drivers.
> > 
> > In theory, yes.  But it scales far worse than letting everyone
> > experiment/report/fix problems as they crop up.
> 
> You will open many small races. These would be very painful
> to debug.

Agreed.

They will not trigger 100% of the time, but sporadically and generally at
random.

At least the freezer problems are reproducible. ;-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 11:34                                   ` Paul Mackerras
  2007-07-04 14:12                                     ` Dmitry Torokhov
@ 2007-07-04 15:38                                     ` Alan Stern
  2007-07-04 19:07                                       ` Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-04 15:38 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Rafael J. Wysocki, Matthew Garrett, linux-pm, linux-kernel

On Wed, 4 Jul 2007, Paul Mackerras wrote:

> Rafael J. Wysocki writes:
> 
> > They are mostly related to kernel threads, that we've already agreed not to
> > freeze (except for the ones that want that, but they will be responsible for
> > getting everything right).  The initial patches for that are in -mm and more
> > will come.
> 
> Serious question: which kernel threads would actually want to be
> frozen?

khubd and ksuspend_usbd.

> Threads that do no I/O at all don't care about suspend/resume and
> don't need to be frozen in any case.  Threads that issue I/O requests
> in order to service incoming I/O requests can't be frozen because of
> the possibility of deadlock.  Which leaves threads that do I/O just
> for the fun of it. :)
> 
> What am I missing?

Those two threads will try to resume USB devices in response to wakeup
requests.  Such requests arrive during a suspend or resume transition
more often than one would expect.

If the resume attempt occurs before the host controller has been 
suspended, it will abort the system suspend.  If it occurs after the 
host controller is suspended (and before the controller resumes) it 
will fail and try to unregister the USB device -- something else we 
don't like happening while the system is only partially up (not to 
mention the annoyance caused by the unregistration of a perfectly 
functional device).
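
The cases described above can be written down as a small decision
function.  This is only a sketch: the enum and function names are
hypothetical, and the "held until thaw" outcome for a frozen thread is
an assumption about what freezing buys, not something taken from the
USB code.

```c
#include <assert.h>
#include <stdbool.h>

/* State of the USB host controller at the time of the wakeup request. */
enum hc_state { HC_RUNNING, HC_SUSPENDED };

/* What happens to a wakeup request that arrives during the transition. */
enum wakeup_outcome {
	WAKEUP_HELD_UNTIL_THAW,		/* assumed: frozen thread acts later */
	WAKEUP_ABORTS_SUSPEND,		/* resume races with system suspend */
	WAKEUP_FAILS_UNREGISTERS,	/* resume fails, device unregistered */
};

enum wakeup_outcome demo_wakeup_outcome(bool thread_frozen, enum hc_state hc)
{
	assert(hc == HC_RUNNING || hc == HC_SUSPENDED);

	if (thread_frozen)
		return WAKEUP_HELD_UNTIL_THAW;
	if (hc == HC_RUNNING)
		return WAKEUP_ABORTS_SUSPEND;
	return WAKEUP_FAILS_UNREGISTERS;
}
```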

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 11:51                                   ` Miklos Szeredi
  2007-07-04 14:41                                     ` Rafael J. Wysocki
@ 2007-07-04 15:42                                     ` Alan Stern
  2007-07-04 19:25                                       ` Miklos Szeredi
  2007-07-05  0:36                                       ` Paul Mackerras
  1 sibling, 2 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-04 15:42 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Miklos Szeredi, rjw, mjg59, Linux-pm mailing list,
	Kernel development list

On Wed, 4 Jul 2007, Miklos Szeredi wrote:

> And we won't know if drivers are OK until we remove the freezer,
> catch-22.
> 
> So I think we need to disable the freezer at least in -mm and/or
> optionally in -linus.
> 
> I applied Matthew's patch, and suspend did in fact stop working
> (thinkpad t60), but there was nothing catastrophic.  Here's the dmesg
> if somebody is interested:
> 
> Suspending console(s)
> usb_endpoint usbdev5.3_ep83: PM: suspend 0->2, parent 5-2:1.0 already 2
> usb_endpoint usbdev5.3_ep02: PM: suspend 0->2, parent 5-2:1.0 already 2
> usb_endpoint usbdev5.3_ep81: PM: suspend 0->2, parent 5-2:1.0 already 2
> hub 2-0:1.0: suspend error -16
> suspend_device(): usb_suspend+0x0/0x1c() returns -16
> Could not suspend device usb2: error -16
> usb_endpoint usbdev5.3_ep81: PM: resume from 0, parent 5-2:1.0 still 2
> usb_endpoint usbdev5.3_ep02: PM: resume from 0, parent 5-2:1.0 still 2
> usb_endpoint usbdev5.3_ep83: PM: resume from 0, parent 5-2:1.0 still 2
> Some devices failed to suspend

Remember what I wrote a few minutes ago about khubd and ksuspend_usbd
wanting to resume devices during a system suspend transition?  This is
exactly what happens when those threads aren't frozen.

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 14:58                               ` Matthew Garrett
  2007-07-04 15:02                                 ` Oliver Neukum
@ 2007-07-04 15:57                                 ` Alan Stern
  1 sibling, 0 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-04 15:57 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

On Wed, 4 Jul 2007, Matthew Garrett wrote:

> On Wed, Jul 04, 2007 at 10:38:47AM -0400, Alan Stern wrote:
> 
> > Okay, I agree that (1) can be handled without too much effort.  But 
> > doing it adds an extra test to _every_ driver's I/O pathway.  Freezing 
> > userspace does not incur all this additional overhead.
> 
> For runtime PM to work it's already necessary to have a test in that 
> path to check if the device is suspended. I can't see how this adds any 
> overhead to the common case.

Actually it isn't necessary to have a test to check if the device is 
suspended.  We simply call the autoresume routine; if the device isn't 
suspended that routine doesn't have to do anything.

I agree that the extra test (for system-wide suspend underway) is
needed only if the autoresume fails, which isn't part of the main
pathway.  So it doesn't add runtime overhead -- but handling it does
add code overhead.
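
A minimal sketch of that shape, with hypothetical names throughout
(struct demo_dev, demo_autoresume, demo_io are illustrations, not the
USB core's actual interface): the I/O path calls the autoresume routine
unconditionally, and the system-suspend test sits only on the failure
branch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state for the illustration. */
struct demo_dev {
	bool suspended;		/* device is runtime-suspended */
	bool can_resume;	/* false models a parent that is powered down */
	int resume_calls;	/* how many real resumes were performed */
};

/* Hypothetical global flag set while a system-wide suspend is underway. */
static bool system_suspend_under_way;

void demo_set_system_suspend(bool on)
{
	system_suspend_under_way = on;
}

/* Does nothing if the device is already active. */
int demo_autoresume(struct demo_dev *d)
{
	assert(d != NULL);

	if (!d->suspended)
		return 0;
	if (!d->can_resume)
		return -EBUSY;
	d->resume_calls++;
	d->suspended = false;
	return 0;
}

enum io_result { IO_DONE, IO_DEFERRED, IO_FAILED };

/* The I/O path itself carries no "is it suspended?" test. */
enum io_result demo_io(struct demo_dev *d)
{
	if (demo_autoresume(d) == 0)
		return IO_DONE;

	/* Failure path only: this is where the extra test lives. */
	return system_suspend_under_way ? IO_DEFERRED : IO_FAILED;
}
```

The common case pays for one function call that returns immediately;
the cost of the extra branch is in the code that has to be written and
maintained, not in the time spent on the main path.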

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 15:38                                     ` Alan Stern
@ 2007-07-04 19:07                                       ` Alan Stern
  0 siblings, 0 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-04 19:07 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Matthew Garrett, linux-pm, linux-kernel

On Wed, 4 Jul 2007, Alan Stern wrote:

> > Threads that do no I/O at all don't care about suspend/resume and
> > don't need to be frozen in any case.  Threads that issue I/O requests
> > in order to service incoming I/O requests can't be frozen because of
> > the possibility of deadlock.  Which leaves threads that do I/O just
> > for the fun of it. :)
> > 
> > What am I missing?
> 
> Those two threads will try to resume USB devices in response to wakeup
> requests.  Such requests arrive during a suspend or resume transition
> more often than one would expect.
> 
> If the resume attempt occurs before the host controller has been 
> suspended, it will abort the system suspend.  If it occurs after the 
> host controller is suspended (and before the controller resumes) it 
> will fail and try to unregister the USB device -- something else we 
> don't like happening while the system is only partially up (not to 
> mention the annoyance caused by the unregistration of a perfectly 
> functional device).

Actually the situation may not be quite this bad any more.  It's been a 
while since I tried suspending a system without freezing khubd and 
ksuspend_usbd.  But Miklos's mail shows that problems can and will 
occur.

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 15:42                                     ` Alan Stern
@ 2007-07-04 19:25                                       ` Miklos Szeredi
  2007-07-04 21:36                                         ` Rafael J. Wysocki
  2007-07-05  0:43                                         ` Paul Mackerras
  2007-07-05  0:36                                       ` Paul Mackerras
  1 sibling, 2 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-04 19:25 UTC (permalink / raw)
  To: stern; +Cc: oliver, paulus, rjw, mjg59, linux-pm, linux-kernel

> Remember what I wrote a few minutes ago about khubd and ksuspend_usbd
> wanting to resume devices during a system suspend transition?  This is
> exactly what happens when those threads aren't frozen.

OK, let me summarize the situation as I see it now: there are two
camps, the pro-freezers and the anti-freezers.

Pro-freezers say:

  - don't remove the freezer, otherwise we'll have to deal with
    numerous problems in drivers

Anti-freezers say:

  - let's remove the freezer, which causes numerous problems

Alan summarized the pro-freezer arguments well I think.  What are the
anti-freezer arguments then?

After having looked at the freezer and done some experiments with it,
the most obvious problem appears to be that it can get stuck on a
process in uninterruptible sleep.  And yes, this can happen if a
fuse filesystem daemon is frozen before a filesystem user is.  And
this is not something that can be fixed in fuse, some filesystem calls
(rename(2) for example) are simply not restartable.

This doesn't explain the deadlocks, but it could cause failure to
suspend which would be pretty annoying.

Does this affect other things than fuse?

Can this be fixed?

It seems to be a fundamental problem with the freezer: while it does
make sure that user processes are not calling into drivers during
suspend, it also disallows perfectly harmless non-driver calls as
well.
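
The ordering problem can be shown with a toy model.  This is a sketch
under simplifying assumptions: the struct and function are hypothetical,
the freezer is modelled as a single pass over the tasks in array order,
and a task whose server is still running is assumed to get its reply and
reach a freezing point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical task model for the illustration. */
struct demo_task {
	const char *name;
	int waits_on;	/* index of the task it is blocked on in
			 * uninterruptible sleep, or -1 for none */
	bool frozen;
};

/*
 * Try to freeze every task in array order.  Returns the number of tasks
 * that could not be frozen because they are waiting on a frozen task.
 */
int demo_freeze_all(struct demo_task *t, int n)
{
	int stuck = 0;

	for (int i = 0; i < n; i++) {
		int srv = t[i].waits_on;

		assert(srv < n);
		if (srv >= 0 && t[srv].frozen) {
			/* The reply will never come, so this task never
			 * reaches a point where it can be frozen. */
			stuck++;
			continue;
		}
		t[i].frozen = true;
	}
	return stuck;
}
```

With the filesystem user first and the daemon second, the function
returns 0.  With the daemon first, the user is left stuck and the
function returns 1, which corresponds to a failed suspend.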

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 19:25                                       ` Miklos Szeredi
@ 2007-07-04 21:36                                         ` Rafael J. Wysocki
  2007-07-05  8:37                                           ` Miklos Szeredi
  2007-07-05  0:43                                         ` Paul Mackerras
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-04 21:36 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: stern, oliver, paulus, mjg59, linux-pm, linux-kernel

On Wednesday, 4 July 2007 21:25, Miklos Szeredi wrote:
> > Remember what I wrote a few minutes ago about khubd and ksuspend_usbd
> > wanting to resume devices during a system suspend transition?  This is
> > exactly what happens when those threads aren't frozen.
> 
> OK, let me summarize the situation as I see it now: there are two
> camps, the pro-freezers and the anti-freezers.
> 
> Pro-freezers say:
> 
>   - don't remove the freezer, otherwise we'll have to deal with
>     numerous problems in drivers

And these problems will generally be difficult to reproduce reliably and debug.

> Anti-freezers say:
> 
>   - let's remove the freezer, which causes numerous problems
> 
> Alan summarized the pro-freezer arguments well I think.  What are the
> anti-freezer arguments then?
> 
> After having looked at the freezer and done some experiments with it,
> the most obvious problem appears to be that it can get stuck on a
> process in uninterruptible sleep.

That's correct.

> And yes, this can happen if a fuse filesystem daemon is frozen before a
> filesystem user is.  And this is not something that can be fixed in fuse,
> some filesystem calls (rename(2) for example) are simply not restartable.
>
> This doesn't explain the deadlocks, but it could cause failure to
> suspend which would be pretty annoying.

I think the only thing that can deadlock in that context is the sync.  At
least, I don't see anything else.

> Does this affect other things than fuse?

Not that I know of.  It may affect user space drivers, but there's no data on
that.

> Can this be fixed?
> 
> It seems to be a fundamental problem with the freezer: while it does
> make sure that user processes are not calling into drivers during
> suspend, it also disallows perfectly harmless non-driver calls as
> well.

The problem is that when the freezer was designed (I didn't do that, BTW),
there was no FUSE and similar things, so it's not prepared to cope with
such interdependencies between user space tasks.

We had an analogous problem with vfork() and it was solved by using the
PF_FREEZER_SKIP flag.  Perhaps we can do a similar thing with FUSE.
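
To make the idea concrete, here is a toy model of a skip flag.  It is a
sketch, not the kernel's implementation: DEMO_FREEZER_SKIP is a
hypothetical constant modelled on PF_FREEZER_SKIP, and the struct and
functions are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag, modelled on PF_FREEZER_SKIP. */
#define DEMO_FREEZER_SKIP 0x1u

struct skip_task {
	unsigned int flags;
	bool in_uninterruptible_wait;
	bool frozen;
};

/* Called by a task just before it blocks on a userspace helper. */
void skip_begin_wait(struct skip_task *t)
{
	t->flags |= DEMO_FREEZER_SKIP;
	t->in_uninterruptible_wait = true;
}

/* Called when the wait is over; the task is counted again from here on. */
void skip_end_wait(struct skip_task *t)
{
	t->in_uninterruptible_wait = false;
	t->flags &= ~DEMO_FREEZER_SKIP;
}

/*
 * One freezer pass.  Returns the number of tasks the freezer would still
 * have to wait for; tasks carrying the skip flag are not counted.
 */
int skip_try_freeze_all(struct skip_task *t, size_t n)
{
	int pending = 0;

	assert(t != NULL);
	for (size_t i = 0; i < n; i++) {
		if (t[i].flags & DEMO_FREEZER_SKIP)
			continue;
		if (t[i].in_uninterruptible_wait) {
			pending++;
			continue;
		}
		t[i].frozen = true;
	}
	return pending;
}
```

A real version would also have to make sure that a skipped task freezes
as soon as its wait ends, before it can touch any driver; the model
above leaves that step out.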

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* The big suspend mess
  2007-07-04  3:29               ` [linux-pm] " Paul Mackerras
  2007-07-04 10:33                 ` Rafael J. Wysocki
@ 2007-07-04 22:19                 ` Adrian Bunk
  2007-07-05  0:27                   ` Pavel Machek
  2007-07-05 14:14                   ` [linux-pm] " Alan Stern
  1 sibling, 2 replies; 388+ messages in thread
From: Adrian Bunk @ 2007-07-04 22:19 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Rafael J. Wysocki, Benjamin Herrenschmidt, Matthew Garrett,
	linux-kernel, Pavel Machek, linux-pm

On Wed, Jul 04, 2007 at 01:29:49PM +1000, Paul Mackerras wrote:
> Rafael J. Wysocki writes:
> 
> > Still, do you really think that we're ready to drop it _right_ _now_ (I'm
> > referring to suspend only) and if so then on what basis (except that you
> > don't like it, which falls short of being a technical argument)?
> 
> The basis is that it (the freezer) causes more deadlocks and other
> problems than it avoids, so it's a net win to remove it.

You forget one important point:
Regressions are much worse than normal bugs.

It's not about the question whether the driver was "correct" or
"buggy" - for a user the important question is whether it worked in 
practice, and if yes, whether it continues to work.

And my impression is that every time someone touches some part of the 
suspend code some drivers break.

During 2.6.21-rc, it wasn't unusual for users to run into up to three 
distinct suspend regressions with one kernel on one machine.

And no, it wasn't always the same set of distinct regressions.

IMHO the suspend code is currently way too much of a moving target which 
results in this mess.

The correct order seems to be:
1. agree on what the suspend code as a whole should look like
2. implement this
3. fix ALL drivers to work at least as well as they do today
4. get it tested in -mm
5. fix all bugs people run into
6. submit it for inclusion in Linus' tree
7. quickly work through the likely large number of bug reports

Step 1 is the most important one - evolving code is often something 
good, but in this case with different people trying to evolve the 
suspend code in different directions it simply results in a big mess.

> Paul.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  4:29 [PATCH] Remove process freezer from suspend to RAM pathway Matthew Garrett
                   ` (6 preceding siblings ...)
  2007-07-03 16:03 ` [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway Alan Stern
@ 2007-07-04 23:33 ` Pavel Machek
  7 siblings, 0 replies; 388+ messages in thread
From: Pavel Machek @ 2007-07-04 23:33 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: linux-kernel, linux-pm

Hi!

> Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> a screaming nightmare - either the suspend fails because syslog (for 
> instance) can't be frozen, or the machine deadlocks for some other 
> reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> could do what we do for suspend to RAM on other platforms (PPC and APM) 
> and just not use the freezer.
> 
> Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>

Sorry, no.

* this needs audit of all drivers. Or we can just merge it and then
fix all the problems it causes. If you are willing to become suspend
maintainer and handle all that mess, perhaps we can do this.

* it does not solve FUSE vs. hibernation

* it does not solve FUSE vs. suspend-to-both

* userspace will now see CPUs going up and down at minimum

Now, we want to do something like this long-term, but I do not think
we can just remove the freezer like this.


								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 17:20       ` Oliver Neukum
  2007-07-03 20:59         ` Rafael J. Wysocki
@ 2007-07-04 23:39         ` Pavel Machek
  2007-07-05  6:53           ` Oliver Neukum
  1 sibling, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-04 23:39 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Rafael J. Wysocki, Benjamin Herrenschmidt, Matthew Garrett,
	linux-kernel, linux-pm, Nigel Cunningham

On Tue 2007-07-03 19:20:59, Oliver Neukum wrote:
> On Tuesday, 3 July 2007, Rafael J. Wysocki wrote:
> > On Tuesday, 3 July 2007 15:08, Rafael J. Wysocki wrote:
> > > On Tuesday, 3 July 2007 07:51, Benjamin Herrenschmidt wrote:
> > > > On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> > > > > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > > > > a screaming nightmare - either the suspend fails because syslog (for 
> > > > > instance) can't be frozen, or the machine deadlocks for some other 
> > > > > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > > > > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > > > > and just not use the freezer.
> > > > 
> > > > The main reason for deadlocks is because we do a sys_sync() after the
> > > > freeze, which we shouldn't do.
> > > 
> > > So why don't we remove the sys_sync() from freeze_processes() instead?
> > 
> > The patch follows (untested).
> > 
> > Greetings,
> > Rafael
> > 
> > 
> > ---
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > We shouldn't sync filesystems from within the freezer, because it's not needed
> > for suspend to RAM and leads to problems with FUSE.
> 
> This seems fishy. Swsusp needs enough clean memory to make enough
> room for the image. If you sync before you freeze, the running tasks can
> redirty memory.
> What makes you sure that you don't die as shrink_all_memory() writes out
> pages?

Shrink_all_memory should just free enough memory, what's the problem?
Yes, we can have dirty memory, shrink_all_memory() can write that out
just fine.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 19:32           ` Oliver Neukum
                               ` (2 preceding siblings ...)
  2007-07-03 21:20             ` Benjamin Herrenschmidt
@ 2007-07-04 23:45             ` Pavel Machek
  2007-07-05 12:25               ` Rafael J. Wysocki
  3 siblings, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-04 23:45 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Miklos Szeredi, rjw, benh, mjg59, linux-kernel, linux-pm, nigel

On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> On Tuesday, 3 July 2007, Miklos Szeredi wrote:
> > > And a further question. The freezer is not atomic. What do you do
> > > if a task not yet frozen calls sys_sync(), but fuse is already frozen?
> > 
> > What do you do if a task not yet frozen writes to a pipe, on the other
> > end of which is a task already frozen?

There's some difference between uninterruptible and interruptible
sleep I'd say.

> > It doesn't matter.  The only thing that should matter during suspend
> > (not hibernate) is saving the state of devices to ram, and putting the
> > devices to sleep.
> 
> Well, but you did remove sys_sync() from the freezer, which is
> and must be called in the hibernate path.

Not "must". In fact, hibernation should be safe without sys_sync(). It
is just user-unfriendly.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 11:22               ` Miklos Szeredi
  2007-07-03 11:27                 ` Oliver Neukum
@ 2007-07-05  0:02                 ` Pavel Machek
  1 sibling, 0 replies; 388+ messages in thread
From: Pavel Machek @ 2007-07-05  0:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: oliver, linux-pm, benh, nigel, mjg59, linux-kernel

Hi!

> > > I don't claim to know anything about how STR or hibernate works, but
> > > neither seem to have any problem with I/O on the fuse device "racing"
> > > with them.
> > 
> > The problem is not with fuse. The problem is generic in nature.
> > 
> > If you remove the freezer, user space remains active until the last CPU
> > goes into suspend. It can do syscalls. Or do you know a clean way to exempt
> > only the tasks fuse might use?
> 
> You are talking about hibernate, right?  Suspending (to ram) is
> instantaneous, in that _after_ suspend no CPU is active obviously.

No, suspend to ram is not instantaneous.

We may have 16 cpus, and we may have 250 disks that need to be spun
down. That takes time, and is really not an atomic operation.
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03  5:48   ` Benjamin Herrenschmidt
  2007-07-03  6:08     ` Nigel Cunningham
@ 2007-07-05  0:03     ` Pavel Machek
  2007-07-05  0:46       ` [linux-pm] " Paul Mackerras
  1 sibling, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-05  0:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Nigel Cunningham, Matthew Garrett, linux-kernel, linux-pm

Hi!

> So I think Matthew is totally right. In fact, the presence of the
> freezer is the main reason why Paulus so far NACKed Johannes' attempts at
> merging the PPC PM code with the generic code in kernel/power.c
> 
> We've been doing fine without it so far and intend to continue to do
> so.

How well does it work on SMP PPC?
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 14:30                         ` Rafael J. Wysocki
@ 2007-07-05  0:15                           ` Paul Mackerras
  2007-07-05 11:54                             ` Rafael J. Wysocki
  2007-07-07 12:09                             ` Pavel Machek
  0 siblings, 2 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-05  0:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm

Rafael J. Wysocki writes:

> This is incompatible with the code in kernel/power/main.c, since we only
> disable the nonboot CPUs after devices have been suspended.  Do you think that
> your framework can be modified to work without disabling the nonboot CPUs
> by the user space?

Sure.  It was an "if it can be done in userspace, do it in userspace"
kind of decision, but I'm not wedded to it.

I actually do want to converge to using the generic suspend-to-ram
code on powerbooks.  I just want to avoid causing regressions for
powerbook users, including myself. :)

Paul.


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 14:57                     ` Alan Stern
@ 2007-07-05  0:23                       ` Paul Mackerras
  2007-07-05  6:58                         ` Oliver Neukum
  2007-07-05 14:23                         ` Alan Stern
  0 siblings, 2 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-05  0:23 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Berg, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett,
	Benjamin Herrenschmidt

Alan Stern writes:

> Let's agree the kernel threads and the freezer are a separate issue.  

No, I don't think they are a separate issue, because I think the
distinction the freezer makes between kernel threads and user threads
is a false and misleading distinction.

> In the most recent kernels, the freezer does not suspend kernel threads 
> by default.

And therefore doesn't guarantee that drivers won't get I/O requests
after being suspended, as far as I can see...

> I agree the kernel threads which try to do I/O during a suspend will 
> need extra attention.  However if these threads are necessary for the 
> suspend procedure, then blocking them (which is how people on this 
> thread have been saying drivers should treat I/O requests during a 
> suspend) will cause additional problems.  There's no way around it; 
> these threads _will_ require more work.

There is a way around it; do the request blocking in the drivers,
where it belongs.

> > > The reasons why the PPC people dislike the whole idea aren't clear to
> > > me. 
> > 
> > Our experience is that it isn't necessary.  It's extra code that in
> > practice causes deadlocks and added maintenance burden for no
> > discernable benefit.
> 
> I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> you still observe them if you use the version of the freezer which 
> doesn't freeze kernel threads?

In general the only way to guarantee there are no deadlocks is to
construct the graph of dependencies between tasks.  Those dependencies
are not in practice observable from outside the tasks, so it is
virtually impossible to construct the graph.

The "don't freeze kernel threads" thing is an attempt to make a crude
approximation to the dependency graph (by saying kernel threads only
depend on other kernel threads), but the approximation breaks down
when you have FUSE or user-level device drivers.
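
As an illustration only (a userspace sketch in plain C, not kernel code
and not part of any posted patch): suppose the dependency graph were
somehow available as a table. That availability is exactly the
assumption that does not hold in practice. The names wait_graph,
waits_on and safe_freeze_order() are hypothetical. With such a table, a
safe order freezes every task before the tasks it waits on, and a
dependency cycle means no safe order exists.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TASKS 8

/* Hypothetical model: waits_on[i][j] is true when task i cannot make
 * progress until task j runs. The kernel has no such table. */
struct wait_graph {
	int ntasks;
	bool waits_on[MAX_TASKS][MAX_TASKS];
};

/* Fill order[] so that each task is frozen before the tasks it waits on.
 * Returns false if the dependencies form a cycle (no safe order). */
bool safe_freeze_order(const struct wait_graph *g, int *order)
{
	bool frozen[MAX_TASKS] = { false };
	int count = 0;

	assert(g->ntasks >= 0 && g->ntasks <= MAX_TASKS);
	while (count < g->ntasks) {
		int pick = -1;

		for (int t = 0; t < g->ntasks && pick < 0; t++) {
			bool needed = false;

			if (frozen[t])
				continue;
			/* t is still needed while an unfrozen task waits on it */
			for (int w = 0; w < g->ntasks; w++)
				if (w != t && !frozen[w] && g->waits_on[w][t])
					needed = true;
			if (!needed)
				pick = t;
		}
		if (pick < 0)
			return false;	/* every remaining task is still needed */
		frozen[pick] = true;
		order[count++] = pick;
	}
	return true;
}
```

With task 1 standing in for a FUSE server that tasks 0 and 2 wait on,
the function freezes 0 and 2 first and the server last. The hard part
is not this loop but filling in waits_on, which is the point made above.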

> Userspace cannot do I/O directly on its own, apart from some
> exceptional situations where a privileged task directly twiddles some
> I/O ports or the equivalent.

Userspace can be involved in servicing I/O requests; not just FUSE,
but also user-level nfsd and user-level PPP demonstrate that.

> There remains the problem of user tasks whose assistance is required to 
> carry out some I/O (as with FUSE).  If the I/O can be deferred until 
> after the resume, then there's no problem.  If the I/O can be carried 
> out before the suspend, then it should be.  And finally, if the I/O 
> must be done during the suspend, you're in real trouble -- how do you 
> do I/O to a suspended device?

So why doesn't that argument apply to kernel threads? :)

> > I remain convinced that the right approach is to fix the drivers to do
> > one of two things; either do something in the suspend call to block
> > further requests to the device, or use a late-suspend call to put
> > their device into a low-power state.  Of course, correctly-written
> > frameworks can do a lot to help the chipset drivers here.
> 
> The first alternative is a possibility.  My argument all along has been 
> that it is difficult and error-prone, and it adds more overhead to 
> system operation (even when not suspending!) than simply freezing 
> userspace.

It does actually provably solve the problem though, which is more than
the freezer does.

Paul.


* Re: The big suspend mess
  2007-07-04 22:19                 ` The big suspend mess Adrian Bunk
@ 2007-07-05  0:27                   ` Pavel Machek
  2007-07-05  0:53                     ` Paul Mackerras
  2007-07-05  1:22                     ` Adrian Bunk
  2007-07-05 14:14                   ` [linux-pm] " Alan Stern
  1 sibling, 2 replies; 388+ messages in thread
From: Pavel Machek @ 2007-07-05  0:27 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Paul Mackerras, Rafael J. Wysocki, Benjamin Herrenschmidt,
	Matthew Garrett, linux-kernel, linux-pm


> IMHO the suspend code is currently way too much of a moving target which 
> results in this mess.
> 
> The correct order seems to be:

0. Get someone to sign up as a maintainer for suspend, so we have
someone to blame for the mess? :-)

> 1. agree on what the suspend code as a whole should look like
> 2. implement this
> 3. fix ALL drivers to work at least as well as they do today
> 4. get it tested in -mm
> 5. fix all bugs people run into
> 6. submit it for inclusion in Linus' tree
> 7. quickly work through the probably large number of bug reports
> 
> Step 1 is the most important one - evolving code is often something 
> good, but in this case with different people trying to evolve the 
> suspend code in different directions it simply results in a big mess.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 15:04                 ` Alan Stern
@ 2007-07-05  0:28                   ` Paul Mackerras
  0 siblings, 0 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-05  0:28 UTC (permalink / raw)
  To: Alan Stern; +Cc: Oliver Neukum, Matthew Garrett, linux-pm, linux-kernel

Alan Stern writes:

> That's not what I'm saying.  What I'm saying is that it would be a big 
> mistake to force all drivers which implement runtime PM to do it using 
> a separate code path from system PM.

OK; I can accept that provided there is a way to change the "what to
do with an I/O request" policy from auto-resume to something else
while we're suspending the system (and presumably restore the old
policy on system resume if the device was runtime-suspended at the
point where the system was suspended).

> > The main attraction of the late-suspend call is that it really does,
> > reliably, guarantee that the driver's I/O request methods won't get
> > called between the late-suspend call and the early-resume call.
> 
> For some drivers (like USB), carrying out an actual suspend requires a
> delay.  Right now we implement those delays using wait_event(),
> wait_for_completion(), and so on.  Would you have us check at runtime
> whether or not a system suspend is underway and in each case use a
> busy-loop instead if it is?

No; the late suspend call isn't appropriate for all drivers.  It is a
simple and safe way to do the suspend for some drivers, mostly the
simpler ones.  Things that are complex enough to have a subsystem
(e.g. USB) would want to use the early suspend call.

> What happens if, in order to carry out the late-suspend, a driver needs
> to acquire a mutex which happens to be held by some other task?  That
> other task won't be able to run and release the mutex, so you will
> deadlock.

Then late-suspend is not appropriate for that driver, and it needs to
use the early-suspend call, and do something such as setting a flag
that the I/O request function tests.

Paul.


* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 15:17                                           ` Rafael J. Wysocki
@ 2007-07-05  0:29                                             ` Paul Mackerras
  2007-07-05 12:29                                               ` Rafael J. Wysocki
  2007-07-12 15:13                                               ` Pavel Machek
  0 siblings, 2 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-05  0:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oliver Neukum, Miklos Szeredi, mjg59, linux-pm, linux-kernel

Rafael J. Wysocki writes:

> They will not trigger 100% of the time, but sporadically and generally at
> random.
> 
> At least the freezer problems are reproducible. ;-)

Our experience with powermacs has been that it isn't actually all that
hard to get it right for the drivers you care about.

Paul.


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 15:12                 ` Alan Stern
@ 2007-07-05  0:35                   ` Paul Mackerras
  2007-07-05  9:15                     ` removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway) Pavel Machek
  2007-07-05 14:42                     ` [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway Alan Stern
  0 siblings, 2 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-05  0:35 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Berg, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett

Alan Stern writes:

> > > Yes, the code could be changed to keep track of the reason for a device
> > > suspend.  But that just raises the old problem of what to do when
> > > there's an I/O request for a suspended device during STR.
> > 
> > Is this actually a real problem?  I would think the policy would be
> > "block" for block devices (pun not intended :), "drop" for network
> > devices, etc.
> 
> It is indeed a real problem, or at least, it can be.

How so?  Can you give me an example?

> > How did the device get suspended if it didn't have a driver?  If it
> > did have a driver, why didn't the bind attempt fail?
> 
> Bus subsystems can suspend devices with no drivers.

Interesting.  I assume this is for buses for which there is a
bus-specific but device-independent suspend procedure defined.

It would seem sensible to me that the PM core should get the bus to
resume such a device before calling a driver probe routine.  The
resume should be blocked or deferred while a system suspend is
underway.  In fact I think that all driver bind/unbind and probe
operations should be deferred while the system is suspending (i.e. put
on a list to be done after the system resumes).

> It would help.  It would help even more if the sysfs core also blocked
> all I/O while suspend is under way.  (Although this might be tricky, 
> considering that the suspend is initiated by a sysfs write...)

I didn't think sysfs got involved at all in normal read and write
requests, so I don't know how it would block them...

> The fact remains that lots of drivers would still need to be changed.  
> In the read and write methods someone would have to add code amounting
> to this:
> 
> 	if (suspend_is_under_way()) {
> 		mutex_unlock(...);
> 		block_until_resume();
> 		goto restart;
> 	}
> 
> Freezing userspace is a small amount of code by comparison.

Normally devices have some sort of queue of pending operations.  So
all that is required on suspend is to stop processing the queue and
wait for any currently-underway operations to complete.  The blocking
then happens naturally using the normal I/O wait mechanisms.

Paul.


* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 15:42                                     ` Alan Stern
  2007-07-04 19:25                                       ` Miklos Szeredi
@ 2007-07-05  0:36                                       ` Paul Mackerras
  2007-07-05 12:51                                         ` Rafael J. Wysocki
  2007-07-05 14:25                                         ` Alan Stern
  1 sibling, 2 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-05  0:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: Miklos Szeredi, rjw, mjg59, Linux-pm mailing list,
	Kernel development list

Alan Stern writes:

> Remember what I wrote a few minutes ago about khubd and ksuspend_usbd
> wanting to resume devices during a system suspend transition?  This is
> exactly what happens when those threads aren't frozen.

So, I wonder why I don't see that error on my powerbook?

Paul.


* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 19:25                                       ` Miklos Szeredi
  2007-07-04 21:36                                         ` Rafael J. Wysocki
@ 2007-07-05  0:43                                         ` Paul Mackerras
  2007-07-05 12:49                                           ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-05  0:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: stern, oliver, rjw, mjg59, linux-pm, linux-kernel

Miklos Szeredi writes:

> OK, let me summarize the situation as I see it now: there are two
> camps, the pro-freezers and the anti-freezers.
> 
> Pro-freezers say:
> 
>   - don't remove the freezer, otherwise we'll have to deal with
>     numerous problems in drivers
> 
> Anti-freezers say:
> 
>   - let's remove the freezer, which causes numerous problems
> 
> Alan summarized the pro-freezer arguments well I think.  What are the
> anti-freezer arguments then?

1. The freezer cannot be guaranteed deadlock-free without constructing
   a dependency graph between tasks (both user and kernel), which is
   virtually impossible since the dependencies are not externally
   observable.

2. As a consequence of (1), we try to make a crude approximation of
   the graph by saying "only kernel threads that want to be frozen
   will be frozen" or some other similar statement.

3. However, (2) means that we can no longer guarantee that drivers
   will not get any I/O requests after their suspend method has been
   called, and therefore the freezer fails in its main objective.

4. We have an existence proof that reliable suspend can be achieved
   without the freezer.

To summarize, the argument is that the freezer is deadlock-prone and
ineffective.

Paul.


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  0:03     ` Pavel Machek
@ 2007-07-05  0:46       ` Paul Mackerras
  0 siblings, 0 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-05  0:46 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-pm, linux-kernel

Pavel Machek writes:

> How well does it work on SMP PPC?

Just fine, on those machines where we know how to reinitialize the
video card.  We currently require userspace to offline all except the
boot cpu before suspending, but that could be moved into the kernel.
I have no particular attachment to that way of doing it; it was just a
"don't do things in the kernel that can reasonably be done in
userspace" kind of thing.

Paul.


* Re: The big suspend mess
  2007-07-05  0:27                   ` Pavel Machek
@ 2007-07-05  0:53                     ` Paul Mackerras
  2007-07-05  9:32                       ` Pavel Machek
  2007-07-05  1:22                     ` Adrian Bunk
  1 sibling, 1 reply; 388+ messages in thread
From: Paul Mackerras @ 2007-07-05  0:53 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Adrian Bunk, Rafael J. Wysocki, Benjamin Herrenschmidt,
	Matthew Garrett, linux-kernel, linux-pm

Pavel Machek writes:

> 0. Get someone to sign up as a maintainer for suspend, so we have
> someone to blame for the mess? :-)

I thought that was Rafael?

Paul.


* Re: The big suspend mess
  2007-07-05  0:27                   ` Pavel Machek
  2007-07-05  0:53                     ` Paul Mackerras
@ 2007-07-05  1:22                     ` Adrian Bunk
  2007-07-05 12:18                       ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Adrian Bunk @ 2007-07-05  1:22 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Paul Mackerras, Rafael J. Wysocki, Benjamin Herrenschmidt,
	Matthew Garrett, linux-kernel, linux-pm

On Thu, Jul 05, 2007 at 02:27:47AM +0200, Pavel Machek wrote:
> 
> > IMHO the suspend code is currently way too much of a moving target which 
> > results in this mess.
> > 
> > The correct order seems to be:
> 
> 0. Get someone to sign up as a maintainer for suspend, so we have
> someone to blame for the mess? :-)

The "SOFTWARE SUSPEND" entry in MAINTAINERS already contains a victim...  ;-)

My impression is that suspend is an area of the kernel that does not 
lack maintainers - you and Rafael are doing a good job, and there's e.g. 
also the maintained code formerly known as suspend2.

But some basic questions like e.g.
- What should be done in the kernel and what in userspace?
- How should this be implemented?
- What must subsystems and drivers do?
- What must subsystems and drivers not do?
seem to be in a constant flux because the big picture everyone agrees 
upon seems to be missing.

> > 1. agree on what the suspend code as a whole should look like
> > 2. implement this
> > 3. fix ALL drivers to work at least as well as they do today
> > 4. get it tested in -mm
> > 5. fix all bugs people run into
> > 6. submit it for inclusion in Linus' tree
> > 7. quickly work through the probably large number of bug reports
> > 
> > Step 1 is the most important one - evolving code is often something 
> > good, but in this case with different people trying to evolve the 
> > suspend code in different directions it simply results in a big mess.
> 
> 									Pavel

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 23:39         ` Pavel Machek
@ 2007-07-05  6:53           ` Oliver Neukum
  0 siblings, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05  6:53 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Rafael J. Wysocki, Benjamin Herrenschmidt, Matthew Garrett,
	linux-kernel, linux-pm, Nigel Cunningham

On Thursday, 5 July 2007, Pavel Machek wrote:
> On Tue 2007-07-03 19:20:59, Oliver Neukum wrote:
> > On Tuesday, 3 July 2007, Rafael J. Wysocki wrote:
> > > On Tuesday, 3 July 2007 15:08, Rafael J. Wysocki wrote:
> > > > On Tuesday, 3 July 2007 07:51, Benjamin Herrenschmidt wrote:
> > > > > On Tue, 2007-07-03 at 05:29 +0100, Matthew Garrett wrote:
> > > > > > Suspend to RAM on a machine with / on a fuse filesystem turns out to be 
> > > > > > a screaming nightmare - either the suspend fails because syslog (for 
> > > > > > instance) can't be frozen, or the machine deadlocks for some other 
> > > > > > reason I haven't tracked down. We could "fix" fuse, or alternatively we 
> > > > > > could do what we do for suspend to RAM on other platforms (PPC and APM) 
> > > > > > and just not use the freezer.
> > > > > 
> > > > > The main reason for deadlocks is because we do a sys_sync() after the
> > > > > freeze, which we shouldn't do.
> > > > 
> > > > So why don't we remove the sys_sync() from freeze_processes() instead?
> > > 
> > > The patch follows (untested).
> > > 
> > > Greetings,
> > > Rafael
> > > 
> > > 
> > > ---
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > We shouldn't sync filesystems from within the freezer, because it's not needed
> > > for suspend to RAM and leads to problems with FUSE.
> > 
> > This seems fishy. Swsusp needs enough clean memory to make enough
> > room for the image. If you sync before you freeze, the running tasks can
> > redirty memory.
> > What makes you sure that you don't die as shrink_all_memory() writes out
> > pages?
> 
> Shrink_all_memory should just free enough memory, what's the problem?
> Yes, we can have dirty memory, shrink_all_memory() can write that out
> just fine.

If there's any dirty memory to be written out through fuse, this will not
work, as user space is already frozen. Now I am told that with fuse, writes
are synchronous. Therefore I don't understand why having a call to sys_sync()
can make a difference. IMHO removing it to make fuse work papers over
a symptom and hides the bug.

	Regards
		Oliver



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  0:23                       ` Paul Mackerras
@ 2007-07-05  6:58                         ` Oliver Neukum
  2007-07-05  8:17                           ` Miklos Szeredi
  2007-07-05 14:23                         ` Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05  6:58 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Alan Stern, Johannes Berg, Rafael J. Wysocki,
	Linux-pm mailing list, Kernel development list, Pavel Machek,
	Matthew Garrett, Benjamin Herrenschmidt

On Thursday, 5 July 2007, Paul Mackerras wrote:
> > I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> > you still observe them if you use the version of the freezer which 
> > doesn't freeze kernel threads?
> 
> In general the only way to guarantee there are no deadlocks is to
> construct the graph of dependencies between tasks.  Those dependencies
> are not in practice observable from outside the tasks, so it is
> virtually impossible to construct the graph.

In which way can user space tasks depend on each other in a way that
allows the members of that cycle to be in uninterruptible sleep?

	Regards
		Oliver



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  6:58                         ` Oliver Neukum
@ 2007-07-05  8:17                           ` Miklos Szeredi
  2007-07-05  8:24                             ` Oliver Neukum
  2007-07-05  9:18                             ` Pavel Machek
  0 siblings, 2 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05  8:17 UTC (permalink / raw)
  To: oliver
  Cc: paulus, stern, johannes, rjw, linux-pm, linux-kernel, pavel, mjg59, benh

> > > I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> > > you still observe them if you use the version of the freezer which 
> > > doesn't freeze kernel threads?
> > 
> > In general the only way to guarantee there are no deadlocks is to
> > construct the graph of dependencies between tasks.  Those dependencies
> > are not in practice observable from outside the tasks, so it is
> > virtually impossible to construct the graph.
> 
> In which way can user space tasks depend on each other in a way that
> allows the members of that cycle to be in uninterruptible sleep?

 - process A calls rename() on a fuse fs
 - process B, the fuse server, starts to process the rename request
 - process B is frozen before it can reply

Now process A is unfreezable.  We cannot make rename() restartable,
hence it cannot be interruptible.

Miklos


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  8:17                           ` Miklos Szeredi
@ 2007-07-05  8:24                             ` Oliver Neukum
  2007-07-05  8:41                               ` Miklos Szeredi
  2007-07-05  9:18                             ` Pavel Machek
  1 sibling, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05  8:24 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: paulus, stern, johannes, rjw, linux-pm, linux-kernel, pavel, mjg59, benh

On Thursday, 5 July 2007, Miklos Szeredi wrote:
> > > > I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> > > > you still observe them if you use the version of the freezer which 
> > > > doesn't freeze kernel threads?
> > > 
> > > In general the only way to guarantee there are no deadlocks is to
> > > construct the graph of dependencies between tasks.  Those dependencies
> > > are not in practice observable from outside the tasks, so it is
> > > virtually impossible to construct the graph.
> > 
> > In which way can user space tasks depend on each other in a way that
> > allows the members of that cycle to be in uninterruptible sleep?
> 
>  - process A calls rename() on a fuse fs
>  - process B, the fuse server, starts to process the rename request
>  - process B is frozen before it can reply
> 
> Now process A is unfreezable.  We cannot make rename() restartable,
> hence it cannot be interruptible.

Then this is a problem specific to fuse. You should teach fuse to block
suspension while such operations are being performed.

	Regards
		Oliver



* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 21:36                                         ` Rafael J. Wysocki
@ 2007-07-05  8:37                                           ` Miklos Szeredi
  2007-07-05 12:39                                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05  8:37 UTC (permalink / raw)
  To: rjw; +Cc: miklos, stern, oliver, paulus, mjg59, linux-pm, linux-kernel

> > Pro-freezers say:
> > 
> >   - don't remove the freezer, otherwise we'll have to deal with
> >     numerous problems in drivers
> 
> And these problems will generally be difficult to reproduce reliably
> and debug.

I see exactly the opposite.

With the freezer I can have very rarely occurring failures, due to
freeze ordering effects.

And without the freezer I have a 100% reproducible problem that is
not hard to fix according to Alan Stern.  OK, I don't know what the
next problem would be, but the powermac experience shows that it's
not nearly as bad as you and Oliver try to make it out.

> > Can this be fixed?
> > 
> > It seems to be a fundamental problem with the freezer: while it does
> > make sure that user processes are not calling into drivers during
> > suspend, it also disallows perfectly harmless non-driver calls as
> > well.
> 
> The problem is that when the freezer was designed (I didn't do that, BTW),
> there was no FUSE and similar things, so it's not prepared to cope with
> such interdependencies between user space tasks.
> 
> We had an analogous problem with vfork() and it was solved by using the
> PF_FREEZER_SKIP flag.  Perhaps we can do a similar thing with FUSE.

It cannot be just worked around in fuse, as a task might be sleeping
on a number of VFS mutexes as well (i_mutex, s_vfs_rename_mutex, etc).
It would be a gigantic hack, if possible at all.

Miklos


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  8:24                             ` Oliver Neukum
@ 2007-07-05  8:41                               ` Miklos Szeredi
  2007-07-05  8:48                                 ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05  8:41 UTC (permalink / raw)
  To: oliver
  Cc: miklos, paulus, stern, johannes, rjw, linux-pm, linux-kernel,
	pavel, mjg59, benh

> > > > > I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> > > > > you still observe them if you use the version of the freezer which 
> > > > > doesn't freeze kernel threads?
> > > > 
> > > > In general the only way to guarantee there are no deadlocks is to
> > > > construct the graph of dependencies between tasks.  Those dependencies
> > > > are not in practice observable from outside the tasks, so it is
> > > > virtually impossible to construct the graph.
> > > 
> > > In which way can user space tasks depend on each other in a way that
> > > allows the members of that cycle to be in uninterruptible sleep?
> > 
> >  - process A calls rename() on a fuse fs
> >  - process B, the fuse server, starts to process the rename request
> >  - process B is frozen before it can reply
> > 
> > Now process A is unfreezable.  We cannot make rename() restartable,
> > hence it cannot be interruptible.
> 
> Then this is a problem specific to fuse. You should teach fuse to block
> suspension while such operations are being performed.

And teach VFS to block suspension, while waiting on a mutex held by
another process performing a fuse operation.

I can already hear the beautiful praise from Al Viro at the sight of
that ;)

Miklos


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  8:41                               ` Miklos Szeredi
@ 2007-07-05  8:48                                 ` Oliver Neukum
  2007-07-05  8:58                                   ` Miklos Szeredi
  2007-07-05 22:38                                   ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05  8:48 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: paulus, stern, johannes, rjw, linux-pm, linux-kernel, pavel, mjg59, benh

On Thursday, 5 July 2007, Miklos Szeredi wrote:
> > > > > > I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> > > > > > you still observe them if you use the version of the freezer which 
> > > > > > doesn't freeze kernel threads?
> > > > > 
> > > > > In general the only way to guarantee there are no deadlocks is to
> > > > > construct the graph of dependencies between tasks.  Those dependencies
> > > > > are not in practice observable from outside the tasks, so it is
> > > > > virtually impossible to construct the graph.
> > > > 
> > > > In which way can user space tasks depend on each other in a way that
> > > > allows the members of that cycle to be in uninterruptible sleep?
> > > 
> > >  - process A calls rename() on a fuse fs
> > >  - process B, the fuse server, starts to process the rename request
> > >  - process B is frozen before it can reply
> > > 
> > > Now process A is unfreezable.  We cannot make rename() restartable,
> > > hence it cannot be interruptible.
> > 
> > Then this is a problem specific to fuse. You should teach fuse to block
> > suspension while such operations are being performed.
> 
> And teach VFS to block suspension, while waiting on a mutex held by
> another process performing a fuse operation.
> 
> I can already hear the beautiful praise from Al Viro at the sight of
> that ;)

There is that.

OK, bite the bullet. Tasks involved in fuse are special. Give them a flag
and teach the freezer to put them on ice only after all other tasks are
frozen. In a way they are kernel, there's no use denying that.

	Regards
		Oliver
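
A minimal sketch of the "flag them and freeze them last" idea, assuming a hypothetical per-task `freeze_late` flag. Neither the flag nor `freeze_two_pass()` exists in the kernel; the names are made up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task record; freeze_late is an invented flag. */
struct task {
	const char *name;
	bool freeze_late;	/* set for tasks serving a filesystem */
	bool frozen;
};

/*
 * Freeze all unflagged tasks in a first pass and the flagged ones in a
 * second pass.  order[] receives the task indices in the sequence in
 * which they were frozen.  Returns the number of tasks frozen.
 */
size_t freeze_two_pass(struct task *t, size_t n, size_t *order)
{
	size_t k = 0;

	for (int pass = 0; pass < 2; pass++) {
		for (size_t i = 0; i < n; i++) {
			/* pass 0 handles unflagged tasks, pass 1 flagged ones */
			if (t[i].freeze_late != (pass == 1))
				continue;
			t[i].frozen = true;
			order[k++] = i;
		}
	}
	assert(k == n);
	return k;
}
```

The sketch only works if every task the flagged server depends on is also flagged, which is exactly the objection raised in the follow-up to this message.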



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  8:48                                 ` Oliver Neukum
@ 2007-07-05  8:58                                   ` Miklos Szeredi
  2007-07-05 10:02                                     ` Oliver Neukum
  2007-07-05 22:38                                   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05  8:58 UTC (permalink / raw)
  To: oliver
  Cc: miklos, paulus, stern, johannes, rjw, linux-pm, linux-kernel,
	pavel, mjg59, benh

> > And teach VFS to block suspension, while waiting on a mutex held by
> > another process performing a fuse operation.
> > 
> > I can already hear the beautiful praise from Al Viro at the sight of
> > that ;)
> 
> There is that.
> 
> OK, bite the bullet. Tasks involved in fuse are special. Give them a flag
> and teach the freezer to put them on ice only after all other tasks are
> frozen. In a way they are kernel, there's no use denying that.

And flag every other process, that the flagged process is
communicating with?  How are you proposing to do that?

Quoting Paul:

"1. The freezer cannot be guaranteed deadlock-free without constructing
   a dependency graph between tasks (both user and kernel), which is
   virtually impossible since the dependencies are not externally
   observable."

Miklos
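
To make the objection concrete: if the "who talks to whom" relation were known, the flag would have to spread along it transitively. The sketch below assumes such a relation is available as a matrix, which is precisely what the quoted statement says is not observable in practice. `propagate_flag()` and `MAX_TASKS` are illustrative names, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8

/*
 * talks_to[i][j] is true when task i may block on task j.  Starting
 * from the tasks already flagged, flag everything reachable through
 * talks_to[][] (a simple fixed-point iteration).  Returns the number of
 * tasks flagged at the end.
 */
size_t propagate_flag(bool talks_to[MAX_TASKS][MAX_TASKS], size_t n,
		      bool flagged[MAX_TASKS])
{
	bool changed = true;
	size_t count = 0;

	assert(n <= MAX_TASKS);
	while (changed) {
		changed = false;
		for (size_t i = 0; i < n; i++) {
			if (!flagged[i])
				continue;
			for (size_t j = 0; j < n; j++) {
				if (talks_to[i][j] && !flagged[j]) {
					flagged[j] = true;
					changed = true;
				}
			}
		}
	}
	for (size_t i = 0; i < n; i++)
		if (flagged[i])
			count++;
	return count;
}
```

Even in this toy form, flagging one server drags in every task on its dependency chain. With a real filesystem daemon that may talk to arbitrary local or remote services, the input matrix simply cannot be filled in.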


* removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05  0:35                   ` Paul Mackerras
@ 2007-07-05  9:15                     ` Pavel Machek
  2007-07-05 13:57                       ` Matthew Garrett
  2007-07-05 14:42                     ` [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-05  9:15 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Alan Stern, Johannes Berg, Rafael J. Wysocki,
	Linux-pm mailing list, Kernel development list, Matthew Garrett

Hi!

> > The fact remains that lots of drivers would still need to be changed.  
> > In the read and write methods someone would have to add code amounting
> > to this:
> > 
> > 	if (suspend_is_under_way()) {
> > 		mutex_unlock(...);
> > 		block_until_resume();
> > 		goto restart;
> > 	}
> > 
> > Freezing userspace is a small amount of code by comparison.
> 
> Normally devices have some sort of queue of pending operations.  So
> all that is required on suspend is to stop processing the queue and
> wait for any currently-underway operations to complete.  The blocking
> then happens naturally using the normal I/O wait mechanisms.

So... instead of one big freezer (which we know is problematic), you
have 100 small freezers that are problematic in the same way :-(.

Let's take the current FUSE problems and see whether they also occur
on PPC, ok?

Let's say a FUSE thread touches one of those "blocking" devices. It is
now in D state somewhere in the kernel... exactly the way the
refrigerator works.

Now, if the kernel needs FUSE services for some reason (that's the
problem we hit in the s2ram case, right?), we have a deadlock.

So the main problem still seems to be "the kernel should not depend on
userland services during suspend", refrigerator or not.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
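
The queue-based approach described in the quoted text, and the "small freezer" effect pointed out in the reply, can be sketched in user space as follows. `struct dev_queue` and the `dev_*()` helpers are hypothetical and stand in for whatever a real driver or subsystem would provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_LEN 16

/* Hypothetical per-device request queue (a simple ring buffer). */
struct dev_queue {
	int pending[QUEUE_LEN];
	size_t head, count;
	bool suspended;
	size_t completed;
};

/* Queue a request; returns false if the queue is full. */
bool dev_submit(struct dev_queue *q, int req)
{
	assert(q != NULL);
	if (q->count == QUEUE_LEN)
		return false;
	q->pending[(q->head + q->count) % QUEUE_LEN] = req;
	q->count++;
	return true;
}

/* Process queued requests unless suspended; returns how many completed. */
size_t dev_run(struct dev_queue *q)
{
	size_t done = 0;

	while (!q->suspended && q->count > 0) {
		q->head = (q->head + 1) % QUEUE_LEN;
		q->count--;
		q->completed++;
		done++;
	}
	return done;
}

/* Suspend: stop processing; submitters keep queueing and would block. */
void dev_suspend(struct dev_queue *q) { q->suspended = true; }

/* Resume: processing may continue on the next dev_run(). */
void dev_resume(struct dev_queue *q) { q->suspended = false; }
```

While the queue is suspended, every request stays pending, so whoever submitted it waits until resume. If one of those submitters is a task that the suspend path itself depends on, the wait never ends, which is the deadlock described in this message.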


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  8:17                           ` Miklos Szeredi
  2007-07-05  8:24                             ` Oliver Neukum
@ 2007-07-05  9:18                             ` Pavel Machek
  2007-07-05  9:31                               ` Miklos Szeredi
  1 sibling, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-05  9:18 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, paulus, stern, johannes, rjw, linux-pm, linux-kernel,
	mjg59, benh

On Thu 2007-07-05 10:17:17, Miklos Szeredi wrote:
> > > > I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> > > > you still observe them if you use the version of the freezer which 
> > > > doesn't freeze kernel threads?
> > > 
> > > In general the only way to guarantee there are no deadlocks is to
> > > construct the graph of dependencies between tasks.  Those dependencies
> > > are not in practice observable from outside the tasks, so it is
> > > virtually impossible to construct the graph.
> > 
> > In which way can user space tasks depend on each other in a way that
> > allows the members of that cycle to be in uninterruptible sleep?
> 
>  - process A calls rename() on a fuse fs
>  - process B, the fuse server, starts to process the rename request
>  - process B is frozen before it can reply
> 
> Now process A is unfreezable.  We cannot make rename() restartable,
> hence it cannot be interruptible.

Yes, we are claiming fuse is very special in this regard, and perhaps
even broken.

Let's see. If I SIGSTOP the fuse server, I can get unrelated tasks
unkillable (even for SIGKILL!) forever. That's very special, and maybe
even a FUSE bug. And that is also what makes FUSE special
w.r.t. s2ram.

So no, you can't claim "FUSE is just IPC". It is very special IPC.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-03 21:14           ` Benjamin Herrenschmidt
  2007-07-03 21:32             ` Rafael J. Wysocki
@ 2007-07-05  9:30             ` Pavel Machek
  2007-07-05 22:46               ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-05  9:30 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Rafael J. Wysocki, Nigel Cunningham, Matthew Garrett,
	linux-kernel, linux-pm, Alan Stern

Hi!

> > >  - For STR, don't do the freezer thing.
> > 
> > In the long run, I agree.
> > 
> > Still, can you please read this post from Alan Stern:
> > 
> > https://lists.linux-foundation.org/pipermail/linux-pm/2007-June/012847.html
> > 
> > ?  I don't think I'm able to repeat the arguments given in there in a
> > convincing way.
> 
> That's the same crackpot I've been hearing for the past 3 years or
> so ...
> 
> Both Paulus and I think the freezer is just a way to try to put your
> head in the sand and ignore the problem. It causes as many problems as
> it solves on its own, and is just not a solution that will be of any use
> once you start implementing dynamic PM schemes etc...

Well, yep, you can view freezer as a head in the sand...

> In many cases, having proper support for "live" suspend of devices is
> just a matter of having a couple of helpers in whatever subsystem those
> drivers hookup with. In the case of network, for example, it's mostly
> trivial (stop the queue). For block, it's not terribly hard either,
> though you want to have some ordering/atomicity between the blocking of
> the incoming request queue and the sending of things like spindown &
> flush commands to the disk. For old-style IDE, that was fairly easily
> solved by piping suspend/resume command down the request queue itself
> and have the queue block/unblock itself after processing them. Some of
> that logic could maybe be moved to the block layer for all block drivers
> to benefit.

...but the moment you start blocking tasks that have made a driver
request, you _do_ have a mini-freezer of your own, with pretty much the
same problems.

In another message I showed that removing the freezer will not help with
FUSE in the general case.

It probably does not help with firmware either; as soon as udev attempts
to do something with your wireless card, it is blocked, and if the
wireless card needs the firmware from udev, you are deadlocked.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  9:18                             ` Pavel Machek
@ 2007-07-05  9:31                               ` Miklos Szeredi
  2007-07-05 11:54                                 ` Pavel Machek
                                                   ` (2 more replies)
  0 siblings, 3 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05  9:31 UTC (permalink / raw)
  To: pavel
  Cc: miklos, oliver, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

> > > > > I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> > > > > you still observe them if you use the version of the freezer which 
> > > > > doesn't freeze kernel threads?
> > > > 
> > > > In general the only way to guarantee there are no deadlocks is to
> > > > construct the graph of dependencies between tasks.  Those dependencies
> > > > are not in practice observable from outside the tasks, so it is
> > > > virtually impossible to construct the graph.
> > > 
> > > In which way can user space tasks depend on each other in a way that
> > > allows the members of that cycle to be in uninterruptible sleep?
> > 
> >  - process A calls rename() on a fuse fs
> >  - process B, the fuse server, starts to process the rename request
> >  - process B is frozen before it can reply
> > 
> > Now process A is unfreezable.  We cannot make rename() restartable,
> > hence it cannot be interruptible.
> 
> Yes, we are claiming fuse is very special in this regard, and perhaps
> even broken.
> 
> Let's see. If I SIGSTOP the fuse server, I can get unrelated tasks
> unkillable (even for SIGKILL!) forever.

Actually fuse allows SIGKILL, because it's always fatal, and the
syscall may not be restarted.

> That's very special, and maybe even a FUSE bug. And that is also
> what makes FUSE special w.r.t. s2ram.

What makes fuse special is that some file operations are synchronous
and non-restartable.  That's just how the UNIX filesystem API works
and is hardly a bug in fuse.

> So no, you can't claim "FUSE is just IPC". It is very special IPC.

I did say it's special.  Sure, it has some "interesting" properties,
and with a bit of malice you can do very ugly things with it.  If you
are interested, read Documentation/filesystems/fuse.txt, especially
the "Tricky deadlock" section ;)

Miklos


* Re: The big suspend mess
  2007-07-05  0:53                     ` Paul Mackerras
@ 2007-07-05  9:32                       ` Pavel Machek
  2007-07-05 10:29                         ` Gabriel C
  0 siblings, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-05  9:32 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Adrian Bunk, Rafael J. Wysocki, Benjamin Herrenschmidt,
	Matthew Garrett, linux-kernel, linux-pm

On Thu 2007-07-05 10:53:37, Paul Mackerras wrote:
> Pavel Machek writes:
> 
> > 0. Get someone to sign up as a maintainer for suspend, so we have
> > someone to blame for the mess? :-)
> 
> I thought that was Rafael?

Rafael is doing a good job trying to fix suspend when he can, but we do
not actually have a "SUSPEND" entry in the MAINTAINERS file.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  8:58                                   ` Miklos Szeredi
@ 2007-07-05 10:02                                     ` Oliver Neukum
  2007-07-05 10:14                                       ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05 10:02 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: paulus, stern, johannes, rjw, linux-pm, linux-kernel, pavel, mjg59, benh

On Thursday, 5 July 2007, Miklos Szeredi wrote:
> > > And teach VFS to block suspension, while waiting on a mutex held by
> > > another process performing a fuse operation.
> > > 
> > > I can already hear the beautiful praise from Al Viro at the sight of
> > > that ;)
> > 
> > There is that.
> > 
> > OK, bite the bullet. Tasks involved in fuse are special. Give them a flag
> > and teach the freezer to put them on ice only after all other tasks are
> > frozen. In a way they are kernel, there's no use denying that.
> 
> And flag every other process, that the flagged process is
> communicating with?  How are you proposing to do that?
> 
> Quoting Paul:
> 
> "1. The freezer cannot be guaranteed deadlock-free without constructing
>    a dependency graph between tasks (both user and kernel), which is
>    virtually impossible since the dependencies are not externally
>    observable."

A deadlock requires that the circular wait is uninterruptible. Normal IPC
isn't.

What are you doing in the userland portions of fuse? Some kind of IPC
with other tasks? There is a limit to which you can push kernel functionality
into user space.

	Regards
		Oliver
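
The condition stated above (a circular wait in which every link is uninterruptible) can be checked mechanically on a toy wait-for graph. This is a sketch only: `struct wait_edge` and `has_uninterruptible_cycle()` are invented names, and each task is assumed to wait on at most one other task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wait-for edge: what a task is blocked on, and how. */
struct wait_edge {
	int target;		/* index waited on, or -1 if not waiting */
	bool uninterruptible;	/* true if the wait cannot be interrupted */
};

/*
 * Returns true if some task can reach itself by following only
 * uninterruptible waits.  Each task has at most one outgoing edge, so
 * walking at most n steps from every start node is sufficient.
 */
bool has_uninterruptible_cycle(const struct wait_edge *w, size_t n)
{
	for (size_t start = 0; start < n; start++) {
		size_t cur = start;

		for (size_t step = 0; step < n; step++) {
			if (w[cur].target < 0 || !w[cur].uninterruptible)
				break;
			assert((size_t)w[cur].target < n);
			cur = (size_t)w[cur].target;
			if (cur == start)
				return true;
		}
	}
	return false;
}
```

One way to read the rename() case in these terms, as a modelling assumption rather than something stated in the thread: the client waits uninterruptibly on the server, the frozen server waits on the freezer to thaw it, and the freezer waits for the client to freeze. The cycle closes through the freezer itself, even though the two user-space tasks alone do not form a cycle.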



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 10:02                                     ` Oliver Neukum
@ 2007-07-05 10:14                                       ` Miklos Szeredi
  2007-07-05 11:40                                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 10:14 UTC (permalink / raw)
  To: oliver
  Cc: miklos, paulus, stern, johannes, rjw, linux-pm, linux-kernel,
	pavel, mjg59, benh

> > > > And teach VFS to block suspension, while waiting on a mutex held by
> > > > another process performing a fuse operation.
> > > > 
> > > > I can already hear the beautiful praise from Al Viro at the sight of
> > > > that ;)
> > > 
> > > There is that.
> > > 
> > > OK, bite the bullet. Tasks involved in fuse are special. Give them a flag
> > > and teach the freezer to put them on ice only after all other tasks are
> > > frozen. In a way they are kernel, there's no use denying that.
> > 
> > And flag every other process, that the flagged process is
> > communicating with?  How are you proposing to do that?
> > 
> > Quoting Paul:
> > 
> > "1. The freezer cannot be guaranteed deadlock-free without constructing
> >    a dependency graph between tasks (both user and kernel), which is
> >    virtually impossible since the dependencies are not externally
> >    observable."
> 
> A deadlock requires that the circular wait is uninterruptible. Normal IPC
> isn't.
> 
> What are you doing in the userland portions of fuse? Some kind of IPC
> with other tasks?

Anything, writing to a file, writing to shared memory, sending things
over the network.  There's no limit to what a filesystem daemon may
do.  It's a perfectly ordinary unprivileged userspace process.  And
this is a feature not a bug.

> There is a limit to which you can push kernel functionality into
> user space.

Limiting what a userspace filesystem can do would defeat the whole
purpose of the bloody thing.  This is not negotiable ;)

Miklos


* Re: The big suspend mess
  2007-07-05  9:32                       ` Pavel Machek
@ 2007-07-05 10:29                         ` Gabriel C
  2007-07-05 10:32                           ` Fabio Comolli
  0 siblings, 1 reply; 388+ messages in thread
From: Gabriel C @ 2007-07-05 10:29 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Paul Mackerras, Adrian Bunk, Rafael J. Wysocki,
	Benjamin Herrenschmidt, Matthew Garrett, linux-kernel, linux-pm

Pavel Machek wrote:
> On Thu 2007-07-05 10:53:37, Paul Mackerras wrote:
>   
>> Pavel Machek writes:
>>
>>     
>>> 0. Get someone to sign up as a maintainer for suspend, so we have
>>> someone to blame for the mess? :-)
>>>       
>> I thought that was Rafael?
>>     
>
> Rafael is doing a good job trying to fix suspend when he can, but we do
> not actually have a "SUSPEND" entry in the MAINTAINERS file.
> 								Pavel
>   

Huch ?;)

We do :p

...

SOFTWARE SUSPEND:
P: Pavel Machek
M: pavel@suse.cz
L: linux-pm@lists.linux-foundation.org
S: Maintained


...


* Re: The big suspend mess
  2007-07-05 10:29                         ` Gabriel C
@ 2007-07-05 10:32                           ` Fabio Comolli
  0 siblings, 0 replies; 388+ messages in thread
From: Fabio Comolli @ 2007-07-05 10:32 UTC (permalink / raw)
  To: Gabriel C
  Cc: Pavel Machek, Paul Mackerras, Adrian Bunk, Rafael J. Wysocki,
	Benjamin Herrenschmidt, Matthew Garrett, linux-kernel, linux-pm

Hi.

On 7/5/07, Gabriel C <nix.or.die@googlemail.com> wrote:
> Pavel Machek wrote:
> > On Thu 2007-07-05 10:53:37, Paul Mackerras wrote:
> >
> >> Pavel Machek writes:
> >>
> >>
> >>> 0. Get someone to sign up as a maintainer for suspend, so we have
> >>> someone to blame for the mess? :-)
> >>>
> >> I thought that was Rafael?
> >>
> >
> > Rafael is doing a good job trying to fix suspend when he can, but we do
> > not actually have a "SUSPEND" entry in the MAINTAINERS file.
> >                                                               Pavel
> >
>
> Huch ?;)
>
> We do :p
>
> ...
>
> SOFTWARE SUSPEND:
> P: Pavel Machek
> M: pavel@suse.cz
> L: linux-pm@lists.linux-foundation.org
> S: Maintained
>

I think he means SUSPEND (aka suspend-to-ram), not "SOFTWARE SUSPEND"
(aka suspend-to-disk).


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 10:14                                       ` Miklos Szeredi
@ 2007-07-05 11:40                                         ` Rafael J. Wysocki
  2007-07-05 11:54                                           ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 11:40 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, paulus, stern, johannes, linux-pm, linux-kernel, pavel,
	mjg59, benh

On Thursday, 5 July 2007 12:14, Miklos Szeredi wrote:
> > > > > And teach VFS to block suspension, while waiting on a mutex held by
> > > > > another process performing a fuse operation.
> > > > > 
> > > > > I can already hear the beautiful praise from Al Viro at the sight of
> > > > > that ;)
> > > > 
> > > > There is that.
> > > > 
> > > > OK, bite the bullet. Tasks involved in fuse are special. Give them a flag
> > > > and teach the freezer to put them on ice only after all other tasks are
> > > > frozen. In a way they are kernel, there's no use denying that.
> > > 
> > > And flag every other process, that the flagged process is
> > > communicating with?  How are you proposing to do that?
> > > 
> > > Quoting Paul:
> > > 
> > > "1. The freezer cannot be guaranteed deadlock-free without constructing
> > >    a dependency graph between tasks (both user and kernel), which is
> > >    virtually impossible since the dependencies are not externally
> > >    observable."

This statement is generally false.

The freezer has one limitation: it cannot handle uninterruptible
tasks, and that's all.

Now, it is not usual for user space tasks to make other user space
tasks uninterruptible.  I'd say this is a little strange.

> > A deadlock requires that the circular wait is uninterruptible. Normal IPC
> > isn't.
> > 
> > What are you doing in the userland portions of fuse? Some kind of IPC
> > with other tasks?
> 
> Anything, writing to a file, writing to shared memory, sending things
> over the network.  There's no limit to what a filesystem daemon may
> do.  It's a perfectly ordinary unprivileged userspace process.  And
> this is a feature not a bug.
> 
> > There is a limit to which you can push kernel functionality into
> > user space.
> 
> Limiting what a userspace filesystem can do would defeat the whole
> purpose of the bloody thing.  This is not negotiable ;)

Which doesn't change the fact that FUSE _is_ special, because it adds
dependencies between processes that were not present before.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  9:31                               ` Miklos Szeredi
@ 2007-07-05 11:54                                 ` Pavel Machek
  2007-07-05 12:07                                   ` Miklos Szeredi
  2007-07-05 11:58                                 ` [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway Rafael J. Wysocki
  2007-07-05 22:04                                 ` Pavel Machek
  2 siblings, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-05 11:54 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, paulus, stern, johannes, rjw, linux-pm, linux-kernel,
	mjg59, benh

Hi!

> > > > In which way can user space tasks depend on each other in a way that
> > > > allows the members of that cycle to be in uninterruptible sleep?
> > > 
> > >  - process A calls rename() on a fuse fs
> > >  - process B, the fuse server, starts to process the rename request
> > >  - process B is frozen before it can reply
> > > 
> > > Now process A is unfreezable.  We cannot make rename() restartable,
> > > hence it cannot be interruptible.
> > 
> > Yes, we are claiming fuse is very special in this regard, and perhaps
> > even broken.
> > 
> > Let's see. If I SIGSTOP the fuse server, I can get unrelated tasks
> > unkillable (even for SIGKILL!) forever.
> 
> Actually fuse allows SIGKILL, because it's always fatal, and the
> syscall may not be restarted.

I think you want to stick try_to_freeze() at the same places where you
do SIGKILL handling. That should solve the 'syslogd is unfreezeable'
problem.

Plus, it would be nice to find out where suspend/hibernation is
triggering fuse activity. We can then decide where to fix it -- in
fuse or in suspend parts. You said sys_sync is not implemented... so
where is the problem?

> > That's very special, and maybe even a FUSE bug. And that is also
> > what makes FUSE special w.r.t. s2ram.
> 
> What makes fuse special is that some file operations are synchronous
> and non-restartable.  That's just how the UNIX filesystem API works
> and is hardly a bug in fuse.

Well, unix is not plan9, and maybe userland filesystems are impossible
in unix. But that is hardly a bug in unix :-).

							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
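
The try_to_freeze() suggestion above amounts to giving the request wait loop a second exit-like point next to the fatal-signal check. The following user-space model is a sketch under assumptions: the event and result enums, `struct waiter` and `wait_for_reply()` are invented for illustration and do not reflect the actual fuse request code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical events a sleeping requester may observe. */
enum wait_event { EV_NONE, EV_FREEZE_REQUEST, EV_SIGKILL, EV_REPLY };

enum wait_result { WAIT_REPLIED, WAIT_KILLED, WAIT_TIMED_OUT };

struct waiter {
	unsigned times_frozen;	/* how often the task parked itself */
};

/*
 * Wait for the server's reply.  A fatal signal aborts the wait, since
 * the syscall never has to be restarted.  A freeze request is honoured
 * at the same place (the stand-in for try_to_freeze()), after which the
 * task simply keeps waiting for the reply.
 */
enum wait_result wait_for_reply(struct waiter *w, const enum wait_event *ev,
				size_t n)
{
	assert(w != NULL);
	for (size_t i = 0; i < n; i++) {
		switch (ev[i]) {
		case EV_REPLY:
			return WAIT_REPLIED;
		case EV_SIGKILL:
			return WAIT_KILLED;
		case EV_FREEZE_REQUEST:
			/* Park in the refrigerator, then resume waiting. */
			w->times_frozen++;
			break;
		case EV_NONE:
			break;
		}
	}
	return WAIT_TIMED_OUT;
}
```

This makes the waiting task itself freezable. It does not address tasks blocked on a lock that the waiter holds, which is the concern raised elsewhere in the thread.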


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 11:40                                         ` Rafael J. Wysocki
@ 2007-07-05 11:54                                           ` Miklos Szeredi
  2007-07-05 13:23                                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 11:54 UTC (permalink / raw)
  To: rjw
  Cc: miklos, oliver, paulus, stern, johannes, linux-pm, linux-kernel,
	pavel, mjg59, benh

> > Limiting what a userspace filesystem can do would defeat the whole
> > purpose of the bloody thing.  This is not negotiable ;)
> 
> Which doesn't change the fact that FUSE _is_ special, because it adds
> dependencies between processes that were not present before.

OK, fuse is special.  So is the userspace driver framework (UIO)
proposed by Greg KH and co.  Now what can be done about these?

 - making them not-special is not an option due to the established
   interfaces, which don't allow restartability.

 - fixing the freezer is pretty much impossible because the
   dependencies between the tasks cannot be known.

 - removing the freezer and fixing the drivers seems workable, we
   already have a prototype in the form of the powermac architecture.

It seems pretty clear cut.  Whining about how many problems this will
cause won't get us nearer to a solution.

Miklos


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  0:15                           ` Paul Mackerras
@ 2007-07-05 11:54                             ` Rafael J. Wysocki
  2007-07-07 12:09                             ` Pavel Machek
  1 sibling, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 11:54 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Benjamin Herrenschmidt, Matthew Garrett, linux-kernel,
	Pavel Machek, linux-pm

On Thursday, 5 July 2007 02:15, Paul Mackerras wrote:
> Rafael J. Wysocki writes:
> 
> > This is incompatible with the code in kernel/power/main.c, since we only
> > disable the nonboot CPUs after devices have been suspended.  Do you think that
> > your framework can be modified to work without disabling the nonboot CPUs
> > by the user space?
> 
> Sure.  It was a "if it can be done in userspace, do it in userspace"
> kind of decision, but I'm not wedded to it.
> 
> I actually do want to converge to using the generic suspend-to-ram
> code on powerbooks.  I just want to avoid causing regressions for
> powerbook users, including myself. :)

Okay, but my question is this: Would it be possible, within your framework,
to disable the nonboot CPUs _after_ suspending devices?

Can you please point me to your high-level suspend code?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  9:31                               ` Miklos Szeredi
  2007-07-05 11:54                                 ` Pavel Machek
@ 2007-07-05 11:58                                 ` Rafael J. Wysocki
  2007-07-05 12:24                                   ` Miklos Szeredi
  2007-07-05 22:04                                 ` Pavel Machek
  2 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 11:58 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: pavel, oliver, paulus, stern, johannes, linux-pm, linux-kernel,
	mjg59, benh

On Thursday, 5 July 2007 11:31, Miklos Szeredi wrote:
> > > > > > I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> > > > > > you still observe them if you use the version of the freezer which 
> > > > > > doesn't freeze kernel threads?
> > > > > 
> > > > > In general the only way to guarantee there are no deadlocks is to
> > > > > construct the graph of dependencies between tasks.  Those dependencies
> > > > > are not in practice observable from outside the tasks, so it is
> > > > > virtually impossible to construct the graph.
> > > > 
> > > > In which way can user space tasks depend on each other in a way that
> > > > allows the members of that cycle to be in uninterruptible sleep?
> > > 
> > >  - process A calls rename() on a fuse fs
> > >  - process B, the fuse server, starts to process the rename request
> > >  - process B is frozen before it can reply
> > > 
> > > Now process A is unfreezable.  We cannot make rename() restartable,
> > > hence it cannot be interruptible.
> > 
> > Yes, we are claiming fuse is very special in this regard, and perhaps
> > even broken.
> > 
> > Let's see. If I SIGSTOP the fuse server, I can get unrelated tasks
> > unkillable (even for SIGKILL!) forever.
> 
> Actually fuse allows SIGKILL, because it's always fatal, and the
> syscall may not be restarted.
> 
> > That's very special, and maybe even a FUSE bug. And that is also
> > what makes FUSE special w.r.t. s2ram.
> 
> What makes fuse special is that some file operations are synchronous
> and non-restartable.  That's just how the UNIX filesystem API works
> and is hardly a bug in fuse.
> 
> > So no, you can't claim "FUSE is just IPC". It is very special IPC.
> 
> I did say it's special.  Sure, it has some "interesting" properties,
> and with a bit of malice you can do very ugly things with it.  If you
> are interested, read Documentation/filesystems/fuse.txt, especially
> the "Tricky deadlock" section ;)

Very well.

Don't you think, however, that it can be modified a little to play well,
for example, with the freezer?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 11:54                                 ` Pavel Machek
@ 2007-07-05 12:07                                   ` Miklos Szeredi
  2007-07-05 13:28                                     ` Rafael J. Wysocki
                                                       ` (2 more replies)
  0 siblings, 3 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 12:07 UTC (permalink / raw)
  To: pavel
  Cc: miklos, oliver, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

> > Actually fuse allows SIGKILL, because it's always fatal, and the
> > syscall may not be restarted.
> 
> I think you want to stick try_to_freeze() at the same places where you
> do SIGKILL handling. That should solve the 'syslogd is unfreezeable'
> problem.

I could, but it would not solve the general problem.  Namely, that the
presence of fuse imposes a certain ordering in which userspace tasks
have to be frozen.  And it is not possible to know this ordering.

And even if the ordering were solved, the freezer would still not work
if the filesystem is not responding due to external events, such as a
lost network (this affects NFS, CIFS, whatever just the same as fuse).
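
The ordering problem can be sketched with a small user-space model. This is only an illustration under stated assumptions: the names model_task, model_freeze_tasks and model_stuck_tasks are hypothetical and are not kernel interfaces, and the model only captures one client blocked uninterruptibly on one filesystem daemon.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_task {
	int  waits_on;	/* index of the task this one is blocked on, or -1 */
	bool frozen;
};

/*
 * Try to freeze tasks in the given order.  A task blocked uninterruptibly
 * on an already-frozen task can never reach a freezing point.
 * Returns the number of tasks that could not be frozen.
 */
int model_freeze_tasks(struct model_task *tasks, const int *order, size_t n)
{
	int stuck = 0;

	for (size_t i = 0; i < n; i++) {
		struct model_task *t = &tasks[order[i]];

		if (t->waits_on >= 0 && tasks[t->waits_on].frozen)
			stuck++;	/* the frozen server will never answer */
		else
			t->frozen = true;
	}
	return stuck;
}

/* Task 0 is the filesystem daemon, task 1 a client blocked in a request. */
int model_stuck_tasks(bool daemon_first)
{
	struct model_task tasks[2] = {
		{ .waits_on = -1, .frozen = false },
		{ .waits_on = 0,  .frozen = false },
	};
	const int daemon_then_client[2] = { 0, 1 };
	const int client_then_daemon[2] = { 1, 0 };

	return model_freeze_tasks(tasks,
				  daemon_first ? daemon_then_client
					       : client_then_daemon,
				  2);
}
```

In this model, freezing the daemon first leaves the client stuck, while the opposite order works. The freezer has no way of knowing which order is the right one.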

> Plus, it would be nice to find out where suspend/hibernation is
> triggering fuse activity. We can then decide where to fix it -- in
> fuse or in suspend parts. You said sys_sync is not implemented... so
> where is the problem?

I cannot say without having a sysrq-t of the situation.

> > > That's very special, and maybe even a FUSE bug. And that is also
> > > what makes FUSE special w.r.t. s2ram.
> > 
> > What makes fuse special is that some file operations are synchronous
> > and non-restartable.  That's just how the UNIX filesystem API works
> > and is hardly a bug in fuse.
> 
> Well, unix is not plan9, and maybe userland filesystems are impossible
> in unix. But that is hardly a bug in unix :-).

I'd rather say, reliable suspend to ram is impossible in the presence
of userspace filesystems, iff people are too lazy to fix the suspend
framework and the drivers to work without the freezer.

This has nothing to do with unix: if plan9 needed to support STR
or hibernate, it would face exactly the same problems.

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: The big suspend mess
  2007-07-05  1:22                     ` Adrian Bunk
@ 2007-07-05 12:18                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 12:18 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Pavel Machek, Paul Mackerras, Benjamin Herrenschmidt,
	Matthew Garrett, linux-kernel, linux-pm

On Thursday, 5 July 2007 03:22, Adrian Bunk wrote:
> On Thu, Jul 05, 2007 at 02:27:47AM +0200, Pavel Machek wrote:
> > 
> > > IMHO the suspend code is currently way too much of a moving target which 
> > > results in this mess.
> > > 
> > > The correct order seems to be:
> > 
> > 0. Get someone to sign up as a maintainer for suspend, so we have
> > someone to blame for the mess? :-)
> 
> The "SOFTWARE SUSPEND" entry in MAINTAINERS already contains a victim...  ;-)
> 
> My impression is that suspend is an area of the kernel that does not 
> lack maintainers - you and Rafael are doing a good job, and there's e.g. 
> also the maintained code formerly known as suspend2.
> 
> But some basic questions like e.g.
> - What should be done in the kernel and what in userspace?
> - How should this be implemented?
> - What must subsystems and drivers do?
> - What must subsystems and drivers not do?
> seem to be in a constant flux because the big picture everyone agrees 
> upon seems to be missing.

Well, to some extent I agree, but that's because it's difficult to reach such
an agreement and if we have one, new people come with new ideas and we need to
start all over again. :-)

In the meantime, problems are reported that need to be fixed and the fixes
often break other systems and we learn about that too late etc.

As far as the big picture is concerned, we have already agreed, or at least
this is my impression, that we need to separate the drivers' suspend callbacks
(as they exist today) from their hibernation callbacks (not present yet).

This seems to be needed for a couple of reasons, the first of them being that
the desired functionality of the hibernation code may be substantially
different from the desired functionality of the suspend code.  For instance,
it generally is not necessary to put devices into low power states, or to power
them off, in order to create the hibernation image, while it is necessary to do
that for a suspend.
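
A minimal sketch of what such a split could look like, as a user-space model only. The types and names here (model_dev, model_pm_ops, the freeze callback) are hypothetical and are not the existing driver model interface.

```c
#include <stdbool.h>

/* Hypothetical types for illustration; not the kernel's driver model. */
enum model_power { MODEL_ON, MODEL_LOW_POWER };

struct model_dev {
	enum model_power power;
	bool quiesced;		/* no DMA or interrupts in flight */
};

struct model_pm_ops {
	int (*suspend)(struct model_dev *dev);	/* suspend to RAM */
	int (*freeze)(struct model_dev *dev);	/* before the hibernation image */
};

/* Suspend: stop the device and put it into a low power state. */
int model_suspend(struct model_dev *dev)
{
	dev->quiesced = true;
	dev->power = MODEL_LOW_POWER;
	return 0;
}

/* Hibernation: only quiesce; the device stays powered to write the image. */
int model_freeze_dev(struct model_dev *dev)
{
	dev->quiesced = true;
	return 0;
}

const struct model_pm_ops model_ops = {
	.suspend = model_suspend,
	.freeze  = model_freeze_dev,
};
```

With two separate callbacks, a driver would not have to guess from the target state which of the two behaviours is wanted.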

It will require us to redesign the core quite a bit and I've already started to
prepare for that.  I really would like to do that first and _then_
to think of improving both the suspend and hibernation code paths
_separately_.  This way, we can do things more easily and in a much cleaner
way.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 11:58                                 ` [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway Rafael J. Wysocki
@ 2007-07-05 12:24                                   ` Miklos Szeredi
  2007-07-05 13:31                                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 12:24 UTC (permalink / raw)
  To: rjw
  Cc: miklos, pavel, oliver, paulus, stern, johannes, linux-pm,
	linux-kernel, mjg59, benh

> Don't you think, however, that it can be modified a little to play well,
> for example, with the freezer?

I could stick a couple of try_to_freeze()s into fuse, and that would
make suspend failure less likely.  But making problems less easy to
reproduce is not a good thing.

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-04 23:45             ` Pavel Machek
@ 2007-07-05 12:25               ` Rafael J. Wysocki
  2007-07-05 12:38                 ` Nigel Cunningham
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 12:25 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Oliver Neukum, Miklos Szeredi, benh, mjg59, linux-kernel,
	linux-pm, nigel

On Thursday, 5 July 2007 01:45, Pavel Machek wrote:
> On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> > Am Dienstag, 3. Juli 2007 schrieb Miklos Szeredi:
> > > > And a further question. The freezer is not atomic. What do you do
> > > > if a task not yet frozen calls sys_sync(), but fuse is already frozen?
> > > 
> > > What do you do if a task not yet frozen writes to a pipe, on the other
> > > end of which is a task already frozen?
> 
> There's some difference between uninterruptible and interruptible
> sleep I'd say.
> 
> > > It doesn't matter.  The only thing that should matter during suspend
> > > (not hibernate) is saving the state of devices to ram, and putting the
> > > devices to sleep.
> > 
> > Well, but you did remove sys_sync() from the freezer, which is
> > and must be called in the hibernate path.
> 
> Not "must". In fact, hibernation should be safe without sys_sync(). It
> is just user un-friendly.

In fact, I'd like to remove the sys_sync() from the freezer entirely, because
it just doesn't belong in there.

The only advantage of having sys_sync() in freeze_processes() is that we
have a chance to write out everything when applications cannot produce more
data to write, but there are filesystems which don't do that anyway (eg. XFS),
so generally there's no reason to bother.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  0:29                                             ` Paul Mackerras
@ 2007-07-05 12:29                                               ` Rafael J. Wysocki
  2007-07-12 15:13                                               ` Pavel Machek
  1 sibling, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 12:29 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Oliver Neukum, Miklos Szeredi, mjg59, linux-pm, linux-kernel

On Thursday, 5 July 2007 02:29, Paul Mackerras wrote:
> Rafael J. Wysocki writes:
> 
> > They will not trigger 100% of the time, but sporadically and generally at
> > random.
> > 
> > At least the freezer problems are reproducible. ;-)
> 
> Our experience with powermacs has been that it isn't actually all that
> hard to get it right for the drivers you care about.

Well, I'm a bit suspicious about that. ;-)

Namely, if you run your suspend code on one CPU and (do I remember that
correctly?) with kernel preemption disabled, then you practically prevent user
land processes from being scheduled when your suspend code is running.

If that is the case (of which I'm not sure), the freezer is obviously
unnecessary, because you've taken processes out of the equation in a different
way ...

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 12:25               ` Rafael J. Wysocki
@ 2007-07-05 12:38                 ` Nigel Cunningham
  2007-07-05 13:35                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-05 12:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Oliver Neukum, Miklos Szeredi, benh, mjg59,
	linux-kernel, linux-pm

[-- Attachment #1: Type: text/plain, Size: 2199 bytes --]

Hi.

On Thursday 05 July 2007 22:25:06 Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 01:45, Pavel Machek wrote:
> > On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> > > Am Dienstag, 3. Juli 2007 schrieb Miklos Szeredi:
> > > > > And a further question. The freezer is not atomic. What do you do
> > > > > if a task not yet frozen calls sys_sync(), but fuse is already frozen?
> > > > 
> > > > What do you do if a task not yet frozen writes to a pipe, on the other
> > > > end of which is a task already frozen?
> > 
> > There's some difference between uninterruptible and interruptible
> > sleep I'd say.
> > 
> > > > It doesn't matter.  The only thing that should matter during suspend
> > > > (not hibernate) is saving the state of devices to ram, and putting the
> > > > devices to sleep.
> > > 
> > > Well, but you did remove sys_sync() from the freezer, which is
> > > and must be called in the hibernate path.
> > 
> > Not "must". In fact, hibernation should be safe without sys_sync(). It
> > is just user un-friendly.
> 
> In fact, I'd like to remove the sys_sync() from the freezer entirely, because
> it just doesn't belong in there.
> 
> The only advantage of having sys_sync() in freeze_processes() is that we
> have a chance to write out everything when applications cannot produce more
> data to write, but there are filesystems which don't do that anyway (eg. XFS),
> so generally there's no reason to bother.

Shouldn't XFS - and fuse - be considered to be broken? Sync should sync data 
and if XFS isn't doing that, it's wrong.

In the case of fuse, we should have a mechanism by which fuse processes can be 
made to sync if they do have any pending I/O, and by which they can be frozen 
later than other userspace processes.

I'd like to see the sync stay, because it improves reliability and data 
integrity in the fail-to-resume case. Calling scripts would probably invoke 
sync themselves if they don't already, but that's racy. As it is at the 
moment, we know userspace is stopped, so syncing isn't racy.

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  8:37                                           ` Miklos Szeredi
@ 2007-07-05 12:39                                             ` Rafael J. Wysocki
  2007-07-05 12:39                                               ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 12:39 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: stern, oliver, paulus, mjg59, linux-pm, linux-kernel

On Thursday, 5 July 2007 10:37, Miklos Szeredi wrote:
> > > Pro-freezers say:
> > > 
> > >   - don't remove the freezer, otherwise we'll have to deal with
> > >     numerous problems in drivers
> > 
> > And these problems will generally be difficult to reproduce reliably
> > and debug.
> 
> I see exactly the opposite.
> 
> With the freezer I can have very rarely occurring failures, due to
> freeze ordering effects.

The "freezer ordering effects" only affect uninterruptible tasks.
 
> And without the freezer I have a 100% reproducible problem, that is
> not hard to fix according to Alan Stern.  OK, I don't know what the
> next problem would be, but the powermac experience shows, that it's
> not nearly as bad as you and Oliver try to make it out.

The powermac need not be a good example due to the different code ordering
(it's run on one CPU only, which makes the probability of triggering a race
quite low).

> > > Can this be fixed?
> > > 
> > > It seems to be a fundamental problem with the freezer: while it does
> > > make sure that user processes are not calling into drivers during
> > > suspend, it also disallows perfectly harmless non-driver calls as
> > > well.
> > 
> > The problem is that when the freezer was designed (I didn't do that, BTW),
> > there was no FUSE and similar things, so it's not prepared to cope with
> > such interdependencies between user space tasks.
> > 
> > We had an analogous problem with vfork() and it was solved by using the
> > PF_FREEZER_SKIP flag.  Perhaps we can do similar thing with FUSE.
> 
> It cannot be just worked around in fuse, as a task might be sleeping
> on a number of VFS mutexes as well (i_mutex, s_vfs_rename_mutex, etc).
> It would be a gigantic hack, if possible at all.

Well, obviously FUSE and the freezer don't play well together, but it's FUSE
that's late to the game (the freezer was here before). ;-)

If you give me some time, I'll see what can be done.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 12:39                                             ` Rafael J. Wysocki
@ 2007-07-05 12:39                                               ` Miklos Szeredi
  2007-07-05 16:10                                                 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 12:39 UTC (permalink / raw)
  To: rjw; +Cc: miklos, stern, oliver, paulus, mjg59, linux-pm, linux-kernel

> > > PF_FREEZER_SKIP flag.  Perhaps we can do similar thing with FUSE.
> > 
> > It cannot be just worked around in fuse, as a task might be sleeping
> > on a number of VFS mutexes as well (i_mutex, s_vfs_rename_mutex, etc).
> > It would be a gigantic hack, if possible at all.
> 
> Well, obviously FUSE and the freezer don't play well together, but
> that's FUSE who's late in the game (the freezer was here
> before). ;-)

Umm, and CODA which is _very_ similar to fuse was there long before
fuse or the freezer ;)

> If you give me some time, I'll see what can be done.

I give you all the time in the world.  I'd also be willing to help
with drivers for which I have the hardware.

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  0:43                                         ` Paul Mackerras
@ 2007-07-05 12:49                                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 12:49 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Miklos Szeredi, stern, oliver, mjg59, linux-pm, linux-kernel

On Thursday, 5 July 2007 02:43, Paul Mackerras wrote:
> Miklos Szeredi writes:
> 
> > OK, let me summarize the situation as I see it now: there are two
> > camps, the pro-freezers and the anti-freezers.
> > 
> > Pro-freezers say:
> > 
> >   - don't remove the freezer, otherwise we'll have to deal with
> >     numerous problems in drivers
> > 
> > Anti-freezers say:
> > 
> >   - let's remove the freezer, which causes numerous problems
> > 
> > Alan summarized the pro-freezer arguments well I think.  What are the
> > anti-freezer arguments then?
> 
> 1. The freezer cannot be guaranteed deadlock-free without constructing
>    a dependency graph between tasks (both user and kernel), which is
>    virtually impossible since the dependencies are not externally
>    observable.

I don't agree with that.

The freezer only fails to handle uninterruptible tasks, so we need to take
the situations in which an uninterruptible task waits for a frozen task into
consideration.  Now, if both tasks are from the user land, this is highly
unusual.

> 2. As a consequence of (1), we try to make a crude approximation of
>    the graph by saying "only kernel threads that want to be frozen
>    will be frozen" or some other similar statement.

No.  The rule is that kernel threads should not be freezable, but there are
some for which that is useful.

> 3. However, (2) means that we can no longer guarantee that drivers
>    will not get any I/O requests after their suspend method has been
>    called, and therefore the freezer fails in its main objective.

This is a very general statement.  Can you please give some examples?

> 4. We have an existence proof that reliable suspend can be achieved
>    without the freezer.

No.  We only know that it might work if the nonboot CPUs are disabled before
suspending devices, which is not the case in the generic suspend code.

> To summarize, the argument is that the freezer is deadlock-prone and
> ineffective.

I remain unconvinced. ;-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 12:51                                         ` Rafael J. Wysocki
@ 2007-07-05 12:50                                           ` Johannes Berg
  2007-07-05 13:47                                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Johannes Berg @ 2007-07-05 12:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Paul Mackerras, mjg59, Linux-pm mailing list,
	Kernel development list, Miklos Szeredi

[-- Attachment #1: Type: text/plain, Size: 736 bytes --]

On Thu, 2007-07-05 at 14:51 +0200, Rafael J. Wysocki wrote:

> > > Remember what I wrote a few minutes ago about khubd and ksuspend_usbd
> > > wanting to resume devices during a system suspend transition?  This is
> > > exactly what happens when those threads aren't frozen.
> > 
> > So, I wonder why I don't see that error on my powerbook?
> 
> Because you have only one CPU running while your suspend code is being
> executed?

If that's really all the problem then what's wrong with just unplugging
the other CPUs earlier? Sure, that makes suspend no longer perfectly
transparent since userspace might notice the CPUs being unplugged, but
it has to cope with that anyway since a user can do it manually...

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  0:36                                       ` Paul Mackerras
@ 2007-07-05 12:51                                         ` Rafael J. Wysocki
  2007-07-05 12:50                                           ` Johannes Berg
  2007-07-05 14:25                                         ` Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 12:51 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Alan Stern, Miklos Szeredi, mjg59, Linux-pm mailing list,
	Kernel development list

On Thursday, 5 July 2007 02:36, Paul Mackerras wrote:
> Alan Stern writes:
> 
> > Remember what I wrote a few minutes ago about khubd and ksuspend_usbd
> > wanting to resume devices during a system suspend transition?  This is
> > exactly what happens when those threads aren't frozen.
> 
> So, I wonder why I don't see that error on my powerbook?

Because you have only one CPU running while your suspend code is being
executed?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 11:54                                           ` Miklos Szeredi
@ 2007-07-05 13:23                                             ` Rafael J. Wysocki
  2007-07-05 13:28                                               ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 13:23 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, paulus, stern, johannes, linux-pm, linux-kernel, pavel,
	mjg59, benh

On Thursday, 5 July 2007 13:54, Miklos Szeredi wrote:
> > > Limiting what a userspace filesystem can do would defeat the whole
> > > purpose of the bloody thing.  This is not negotiable ;)
> > 
> > Which doesn't change the fact that FUSE _is_ special, because it adds
> > dependencies between processes that were not present before.
> 
> OK, fuse is special.  So is the userspace driver framework (UIO)
> proposed by Greg KH and co.  Now what can be done about these?
> 
>  - making them not-special is not an option due to the established
>    interfaces, which don't allow restartability.
> 
>  - fixing the freezer is pretty much impossible because the
>    dependencies between the tasks cannot be known.
> 
>  - removing the freezer and fixing the drivers seems workable, we
>    already have a prototype in the form of the powermac architecture.
> 
> It seems pretty clear cut.  Whining about how many problems this will
> cause won't get us nearer to a solution.

Yes, that's pretty clear cut, but we should start from fixing the drivers. :-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 12:07                                   ` Miklos Szeredi
@ 2007-07-05 13:28                                     ` Rafael J. Wysocki
  2007-07-05 19:38                                     ` Oliver Neukum
  2007-07-07 12:17                                     ` Pavel Machek
  2 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 13:28 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: pavel, oliver, paulus, stern, johannes, linux-pm, linux-kernel,
	mjg59, benh

On Thursday, 5 July 2007 14:07, Miklos Szeredi wrote:
> > > Actually fuse allows SIGKILL, because it's always fatal, and the
> > > syscall may not be restarted.
> > 
> > I think you want to stick try_to_freeze() at the same places where you
> > do SIGKILL handling. That should solve the 'syslogd is unfreezeable'
> > problem.
> 
> I could, but it would not solve the general problem.  Namely, that the
> presence of fuse imposes a certain ordering in which userspace tasks
> have to be frozen.  And it is not possible to know this ordering.
> 
> And even if the ordering were solved, the freezer would still not work
> if the filesystem is not responding due to external events, such as a
> lost network (this affects NFS, CIFS, whatever just the same as fuse).
> 
> > Plus, it would be nice to find out where suspend/hibernation is
> > triggering fuse activity. We can then decide where to fix it -- in
> > fuse or in suspend parts. You said sys_sync is not implemented... so
> > where is the problem?
> 
> I cannot say without having a sysrq-t of the situation.
> 
> > > > That's very special, and maybe even a FUSE bug. And that is also
> > > > what makes FUSE special w.r.t. s2ram.
> > > 
> > > What makes fuse special is that some file operations are synchronous
> > > and non-restartable.  That's just how the UNIX filesystem API works
> > > and is hardly a bug in fuse.
> > 
> > Well, unix is not plan9, and maybe userland filesystems are impossible
> > in unix. But that is hardly a bug in unix :-).
> 
> I'd rather say, reliable suspend to ram is impossible in the presence
> of userspace filesystems, iff people are too lazy to fix the suspend
> framework and the drivers to work without the freezer.

Well, I don't think it has anything to do with laziness or things like that.
Rather, people have limited time and this requires some knowledge about
drivers you're modifying, so not many people really can do that.

Also, you've already assumed that there's no other solution, but I'm not
convinced about that yet.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 13:23                                             ` Rafael J. Wysocki
@ 2007-07-05 13:28                                               ` Oliver Neukum
  2007-07-05 13:46                                                 ` Matthew Garrett
  2007-07-05 14:02                                                 ` Rafael J. Wysocki
  0 siblings, 2 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05 13:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Miklos Szeredi, paulus, stern, johannes, linux-pm, linux-kernel,
	pavel, mjg59, benh

Am Donnerstag, 5. Juli 2007 schrieb Rafael J. Wysocki:
> > It seems pretty clear cut.  Whining about how many problems this will
> > cause won't get us nearer to a solution.
> 
> Yes, that's pretty clear cut, but we should start from fixing the drivers. :-)

Only if we can, at a minimum, determine that we can do STD without a freezer.
It makes no sense to invest a lot of work to face the same problem
again with STD.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 12:24                                   ` Miklos Szeredi
@ 2007-07-05 13:31                                     ` Rafael J. Wysocki
  2007-07-05 13:50                                       ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 13:31 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: pavel, oliver, paulus, stern, johannes, linux-pm, linux-kernel,
	mjg59, benh

On Thursday, 5 July 2007 14:24, Miklos Szeredi wrote:
> > Don't you think, however, that it can be modified a little to play well,
> > for example, with the freezer?
> 
> I could stick a couple of try_to_freeze()s into fuse, and that would
> make suspend failure less likely.  But making problems less easy to
> reproduce is not a good thing.

So, how about eliminating them?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 12:38                 ` Nigel Cunningham
@ 2007-07-05 13:35                   ` Rafael J. Wysocki
  2007-07-05 13:36                     ` Nigel Cunningham
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 13:35 UTC (permalink / raw)
  To: nigel
  Cc: Pavel Machek, Oliver Neukum, Miklos Szeredi, benh, mjg59,
	linux-kernel, linux-pm

On Thursday, 5 July 2007 14:38, Nigel Cunningham wrote:
> Hi.
> 
> On Thursday 05 July 2007 22:25:06 Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 01:45, Pavel Machek wrote:
> > > On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> > > > Am Dienstag, 3. Juli 2007 schrieb Miklos Szeredi:
> > > > > > And a further question. The freezer is not atomic. What do you do
> > > > > > if a task not yet frozen calls sys_sync(), but fuse is already frozen?
> > > > > 
> > > > > What do you do if a task not yet frozen writes to a pipe, on the other
> > > > > end of which is a task already frozen?
> > > 
> > > There's some difference between uninterruptible and interruptible
> > > sleep I'd say.
> > > 
> > > > > It doesn't matter.  The only thing that should matter during suspend
> > > > > (not hibernate) is saving the state of devices to ram, and putting the
> > > > > devices to sleep.
> > > > 
> > > > Well, but you did remove sys_sync() from the freezer, which is
> > > > and must be called in the hibernate path.
> > > 
> > > Not "must". In fact, hibernation should be safe without sys_sync(). It
> > > is just user un-friendly.
> > 
> > In fact, I'd like to remove the sys_sync() from the freezer entirely, because
> > it just doesn't belong in there.
> > 
> > The only advantage of having sys_sync() in freeze_processes() is that we
> > have a chance to write out everything when applications cannot produce more
> > data to write, but there are filesystems which don't do that anyway (eg. XFS),
> > so generally there's no reason to bother.
> 
> Shouldn't XFS - and fuse - be considered to be broken? Sync should sync data 
> and if XFS isn't doing that, it's wrong.
> 
> In the case of fuse, we should have a mechanism by which fuse processes can be 
> made to sync if they do have any pending I/O, and by which they can be frozen 
> later than other userspace processes.
> 
> I'd like to see the sync stay, because it improves reliability and data 
> integrity in the fail-to-resume case. Calling scripts would probably invoke 
> sync themselves if they don't already, but that's racy. As it is at the 
> moment, we know userspace is stopped, so syncing isn't racy.

I'd like to move the sync out of the freezer, but to call it from the
suspend/hibernation code, so that we do

sys_sync();
error = freeze_processes();

etc.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 13:35                   ` Rafael J. Wysocki
@ 2007-07-05 13:36                     ` Nigel Cunningham
  2007-07-05 13:59                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-05 13:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: nigel, Pavel Machek, Oliver Neukum, Miklos Szeredi, benh, mjg59,
	linux-kernel, linux-pm

[-- Attachment #1: Type: text/plain, Size: 3253 bytes --]

Hi.

On Thursday 05 July 2007 23:35:45 Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 14:38, Nigel Cunningham wrote:
> > On Thursday 05 July 2007 22:25:06 Rafael J. Wysocki wrote:
> > > On Thursday, 5 July 2007 01:45, Pavel Machek wrote:
> > > > On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> > > > > Am Dienstag, 3. Juli 2007 schrieb Miklos Szeredi:
> > > > > > > And a further question. The freezer is not atomic. What do you do
> > > > > > > if a task not yet frozen calls sys_sync(), but fuse is already frozen?
> > > > > > 
> > > > > > What do you do if a task not yet frozen writes to a pipe, on the other
> > > > > > end of which is a task already frozen?
> > > > 
> > > > There's some difference between uninterruptible and interruptible
> > > > sleep I'd say.
> > > > 
> > > > > > It doesn't matter.  The only thing that should matter during suspend
> > > > > > (not hibernate) is saving the state of devices to ram, and putting the
> > > > > > devices to sleep.
> > > > > 
> > > > > Well, but you did remove sys_sync() from the freezer, which is
> > > > > and must be called in the hibernate path.
> > > > 
> > > > Not "must". In fact, hibernation should be safe without sys_sync(). It
> > > > is just user un-friendly.
> > > 
> > > In fact, I'd like to remove the sys_sync() from the freezer entirely, because
> > > it just doesn't belong in there.
> > > 
> > > The only advantage of having sys_sync() in freeze_processes() is that we
> > > have a chance to write out everything when applications cannot produce more
> > > data to write, but there are filesystems which don't do that anyway (eg. XFS),
> > > so generally there's no reason to bother.
> > 
> > Shouldn't XFS - and fuse - be considered to be broken? Sync should sync data
> > and if XFS isn't doing that, it's wrong.
> > 
> > In the case of fuse, we should have a mechanism by which fuse processes can be
> > made to sync if they do have any pending I/O, and by which they can be frozen
> > later than other userspace processes.
> > 
> > I'd like to see the sync stay, because it improves reliability and data
> > integrity in the fail-to-resume case. Calling scripts would probably invoke
> > sync themselves if they don't already, but that's racy. As it is at the
> > moment, we know userspace is stopped, so syncing isn't racy.
> 
> I'd like to move the sync out of the freezer, but to call it from the
> suspend/hibernation code, so that we do
> 
> sys_sync();
> error = freeze_processes();

Yeah, I understand that. The problem then is that you're racing against 
userspace. That's not usually a problem, but that doesn't mean it's never a 
problem. Try running the stress suite while testing hibernation and you'll 
see what I mean. If something is submitting lots of I/O when you try to 
suspend, your sync call will race against that process if it's not yet 
frozen, and its continued activity will make your sync pointless (there'll be 
more unsynced data when your sys_sync() call finishes). Stopping userspace 
before syncing removes that race.
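
A toy userspace model of the ordering argument above may help. It is not
kernel code, and every name in it (toy_sync(), toy_freeze() and so on) is made
up for illustration. It assumes a single writer that dirties pages whenever it
is allowed to run, and compares the two orderings:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one userspace writer that dirties pages whenever it runs. */
struct toy_system {
	unsigned int dirty_pages;	/* data not yet written back */
	bool writer_frozen;		/* has the writer been frozen? */
};

/* Hypothetical stand-in for the writer getting a timeslice. */
static void toy_writer_runs(struct toy_system *s, unsigned int pages)
{
	assert(s != NULL);
	if (!s->writer_frozen)
		s->dirty_pages += pages;
}

/* Hypothetical stand-in for sys_sync(): writes back what is dirty right now. */
static void toy_sync(struct toy_system *s)
{
	assert(s != NULL);
	s->dirty_pages = 0;
}

/* Hypothetical stand-in for freeze_processes(). */
static void toy_freeze(struct toy_system *s)
{
	assert(s != NULL);
	s->writer_frozen = true;
}

/* Sync first, then freeze: the writer can still run in between. */
static unsigned int sync_then_freeze(unsigned int pages_in_window)
{
	struct toy_system s = { .dirty_pages = 100, .writer_frozen = false };

	toy_sync(&s);
	toy_writer_runs(&s, pages_in_window);	/* the race window */
	toy_freeze(&s);
	return s.dirty_pages;			/* unsynced data at suspend */
}

/* Freeze first, then sync: the writer cannot add data behind the sync. */
static unsigned int freeze_then_sync(unsigned int pages_in_window)
{
	struct toy_system s = { .dirty_pages = 100, .writer_frozen = false };

	toy_freeze(&s);
	toy_writer_runs(&s, pages_in_window);	/* no effect: already frozen */
	toy_sync(&s);
	return s.dirty_pages;			/* unsynced data at suspend */
}
```

With any non-zero amount of I/O in the window, sync_then_freeze() leaves dirty
data behind while freeze_then_sync() does not. The model deliberately ignores
the fuse case raised earlier in the thread, where the filesystem daemon is
already frozen when the sync is issued.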

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 13:28                                               ` Oliver Neukum
@ 2007-07-05 13:46                                                 ` Matthew Garrett
  2007-07-05 14:09                                                   ` Rafael J. Wysocki
  2007-07-05 14:02                                                 ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-05 13:46 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Rafael J. Wysocki, Miklos Szeredi, paulus, stern, johannes,
	linux-pm, linux-kernel, pavel, benh

On Thu, Jul 05, 2007 at 03:28:33PM +0200, Oliver Neukum wrote:

> If, at a minimum, we can determine that we can STD without a freezer.
> It makes no sense to invest a lot of work to face the same problem
> again with STD.

I have a model for STD that avoids the need to freeze the entirety of 
userspace, but I need to find some more time to flesh it out.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 12:50                                           ` Johannes Berg
@ 2007-07-05 13:47                                             ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 13:47 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Paul Mackerras, mjg59, Linux-pm mailing list,
	Kernel development list, Miklos Szeredi

On Thursday, 5 July 2007 14:50, Johannes Berg wrote:
> On Thu, 2007-07-05 at 14:51 +0200, Rafael J. Wysocki wrote:
> 
> > > > Remember what I wrote a few minutes ago about khubd and ksuspend_usbd
> > > > wanting to resume devices during a system suspend transition?  This is
> > > > exactly what happens when those threads aren't frozen.
> > > 
> > > So, I wonder why I don't see that error on my powerbook?
> > 
> > Because you have only one CPU running while your suspend code is being
> > executed?
> 
> If that's really all the problem then what's wrong with just unplugging
> the other CPUs earlier? Sure, that makes suspend no longer perfectly
> transparent since userspace might notice the CPUs being unplugged, but
> it has to cope with that anyway since a user can do it manually...

This is a bit complicated, but I'll try to explain.

We used to disable the nonboot CPUs before suspending devices and enable
them after resuming devices, but that turned out to lead to resume problems on
some ACPI systems.  Namely, it turned out that this code ordering was not in
line with the ACPI spec that assumed specific ordering of events during a
suspend.  For this reason, we changed the code ordering and now it is more or
less in agreement with ACPI (for the first time, AFAICS).  I don't think that
anyone would like to revert that change right now.

Moreover, in the meantime we learned that the CPU hotplug code that we use
for disabling the nonboot CPUs is generally problematic on x86 and only works
for us because we have the majority of interrupt sources disabled when it's
invoked.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 13:31                                     ` Rafael J. Wysocki
@ 2007-07-05 13:50                                       ` Miklos Szeredi
  2007-07-05 14:14                                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 13:50 UTC (permalink / raw)
  To: rjw
  Cc: miklos, pavel, oliver, paulus, stern, johannes, linux-pm,
	linux-kernel, mjg59, benh

> > > Don't you think, however, that it can be modified a little to play well,
> > > for example, with the freezer?
> > 
> > I could stick a couple of try_to_freeze()s into fuse, and that would
> > make suspend failure less likely.  But making problems less easy to
> > reproduce is not a good thing.
> 
> So, how about eliminating them?

That can't be done just within fuse: a process might be sleeping on a
VFS mutex.  Do we want to hack VFS as well?

I guess I know your answer.  But it ain't gonna work.  Suspend code
really doesn't belong in VFS, and I'm pretty sure the maintainers of
that little piece of code would agree with me on this.

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05  9:15                     ` removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway) Pavel Machek
@ 2007-07-05 13:57                       ` Matthew Garrett
  2007-07-05 14:28                         ` Rafael J. Wysocki
  2007-07-07 12:08                         ` problem 1 (was Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)) Pavel Machek
  0 siblings, 2 replies; 388+ messages in thread
From: Matthew Garrett @ 2007-07-05 13:57 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Paul Mackerras, Alan Stern, Johannes Berg, Rafael J. Wysocki,
	Linux-pm mailing list, Kernel development list

On Thu, Jul 05, 2007 at 11:15:26AM +0200, Pavel Machek wrote:

> Now, if kernel needs FUSE services for some reason (that's the problem
> we hit in s2ram case, right?), we have a deadlock.
> 
> So main problem still seems to be "kernel should not depend on
> userland services during suspend", refrigerator or not.

And also "Userland should not depend on userland services", which is 
rather more of a problem.
-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 13:36                     ` Nigel Cunningham
@ 2007-07-05 13:59                       ` Rafael J. Wysocki
  2007-07-05 21:49                         ` Nigel Cunningham
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 13:59 UTC (permalink / raw)
  To: nigel
  Cc: Pavel Machek, Oliver Neukum, Miklos Szeredi, benh, mjg59,
	linux-kernel, linux-pm

Hi,

On Thursday, 5 July 2007 15:36, Nigel Cunningham wrote:
> Hi.
> 
> On Thursday 05 July 2007 23:35:45 Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 14:38, Nigel Cunningham wrote:
> > > On Thursday 05 July 2007 22:25:06 Rafael J. Wysocki wrote:
> > > > On Thursday, 5 July 2007 01:45, Pavel Machek wrote:
> > > > > On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> > > > > > On Tuesday, 3 July 2007, Miklos Szeredi wrote:
> > > > > > > > And a further question. The freezer is not atomic. What do you do
> > > > > > > > if a task not yet frozen calls sys_sync(), but fuse is already frozen?
> > > > > > > 
> > > > > > > What do you do if a task not yet frozen writes to a pipe, on the other
> > > > > > > end of which is a task already frozen?
> > > > > 
> > > > > There's some difference between uninterruptible and interruptible
> > > > > sleep I'd say.
> > > > > 
> > > > > > > It doesn't matter.  The only thing that should matter during suspend
> > > > > > > (not hibernate) is saving the state of devices to ram, and putting the
> > > > > > > devices to sleep.
> > > > > > 
> > > > > > Well, but you did remove sys_sync() from the freezer, which is
> > > > > > and must be called in the hibernate path.
> > > > > 
> > > > > Not "must". In fact, hibernation should be safe without sys_sync(). It
> > > > > is just user un-friendly.
> > > > 
> > > > In fact, I'd like to remove the sys_sync() from the freezer entirely, because
> > > > it just doesn't belong in there.
> > > > 
> > > > The only advantage of having sys_sync() in freeze_processes() is that we
> > > > have a chance to write out everything when applications cannot produce more
> > > > data to write, but there are filesystems which don't do that anyway (eg. XFS),
> > > > so generally there's no reason to bother.
> > > 
> > > Shouldn't XFS - and fuse - be considered to be broken? Sync should sync data
> > > and if XFS isn't doing that, it's wrong.
> > > 
> > > In the case of fuse, we should have a mechanism by which fuse processes can be
> > > made to sync if they do have any pending I/O, and by which they can be frozen
> > > later than other userspace processes.
> > > 
> > > I'd like to see the sync stay, because it improves reliability and data
> > > integrity in the fail-to-resume case. Calling scripts would probably invoke
> > > sync themselves if they don't already, but that's racy. As it is at the
> > > moment, we know userspace is stopped, so syncing isn't racy.
> > 
> > I'd like to move the sync out of the freezer, but to call it from the
> > suspend/hibernation code, so that we do
> > 
> > sys_sync();
> > error = freeze_processes();
> 
> Yeah, I understand that. The problem then is that you're racing against 
> userspace. That's not usually a problem, but that doesn't mean it's never a 
> problem. Try running the stress suite while testing hibernation and you'll 
> see what I mean. If something is submitting lots of I/O when you try to 
> suspend, your sync call will race against that process if it's not yet 
> frozen, and its continued activity will make your sync pointless (there'll be 
> more unsynced data when your sys_sync() call finishes). Stopping userspace 
> before syncing removes that race.

Yes, that will make the suspend/hibernation less reliable in case the resume
fails (some data, written after the sync, may be lost).  However, the sync done
from within the freezer doesn't guarantee that no data is lost either,
so we don't lose much by not doing it.

The remaining question is how much data could be lost if we do the sync
before the freezer, and I don't think that's a lot.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 13:28                                               ` Oliver Neukum
  2007-07-05 13:46                                                 ` Matthew Garrett
@ 2007-07-05 14:02                                                 ` Rafael J. Wysocki
  1 sibling, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 14:02 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Miklos Szeredi, paulus, stern, johannes, linux-pm, linux-kernel,
	pavel, mjg59, benh

On Thursday, 5 July 2007 15:28, Oliver Neukum wrote:
> On Thursday, 5 July 2007, Rafael J. Wysocki wrote:
> > > It seems pretty clear cut.  Whining about how many problems this will
> > > cause won't get us nearer to a solution.
> > 
> > Yes, that's pretty clear cut, but we should start from fixing the drivers. :-)
> 
> If, at a minimum, we can determine that we can STD without a freezer.

No, we can't.

> It makes no sense to invest a lot of work to face the same problem
> again with STD.

Arguably, it does make sense, because for many platforms hibernation
is irrelevant and we're going to separate the suspend and hibernation
frameworks anyway.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 13:46                                                 ` Matthew Garrett
@ 2007-07-05 14:09                                                   ` Rafael J. Wysocki
  2007-07-05 14:23                                                     ` Matthew Garrett
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 14:09 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Oliver Neukum, Miklos Szeredi, paulus, stern, johannes, linux-pm,
	linux-kernel, pavel, benh

On Thursday, 5 July 2007 15:46, Matthew Garrett wrote:
> On Thu, Jul 05, 2007 at 03:28:33PM +0200, Oliver Neukum wrote:
> 
> > If, at a minimum, we can determine that we can STD without a freezer.
> > It makes no sense to invest a lot of work to face the same problem
> > again with STD.
> 
> I have a model for STD that avoids the need to freeze the entirety of 
> userspace, but I need to find some more time to flesh it out.

You can just describe it, as far as I'm concerned. :-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] The big suspend mess
  2007-07-04 22:19                 ` The big suspend mess Adrian Bunk
  2007-07-05  0:27                   ` Pavel Machek
@ 2007-07-05 14:14                   ` Alan Stern
  1 sibling, 0 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-05 14:14 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: Paul Mackerras, Matthew Garrett, linux-kernel, Pavel Machek, linux-pm

On Thu, 5 Jul 2007, Adrian Bunk wrote:

> IMHO the suspend code is currently way too much of a moving target which 
> results in this mess.
> 
> The correct order seems to be:
> 1. agree on what the suspend code as a whole should look like
> 2. implement this
> 3. fix ALL drivers to work at least as good as they do today
> 4. get it tested in -mm
> 5. fix all bugs people run into
> 6. submit it for inclusion in Linus' tree
> 7. quickly work on the most likely big amount of bug reports
> 
> Step 1 is the most important one - evolving code is often something 
> good, but in this case with different people trying to evolve the 
> suspend code in different directions it simply results in a big mess.

Isn't Step 1 exactly what we are in the midst of right now?

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 13:50                                       ` Miklos Szeredi
@ 2007-07-05 14:14                                         ` Rafael J. Wysocki
  2007-07-05 14:14                                           ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 14:14 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: pavel, oliver, paulus, stern, johannes, linux-pm, linux-kernel,
	mjg59, benh

On Thursday, 5 July 2007 15:50, Miklos Szeredi wrote:
> > > > Don't you think, however, that it can be modified a little to play well,
> > > > for example, with the freezer?
> > > 
> > > I could stick a couple of try_to_freeze()s into fuse, and that would
> > > make suspend failure less likely.  But making problems less easy to
> > > reproduce is not a good thing.
> > 
> > So, how about eliminating them?
> 
> That can't be done just within fuse, a process might be sleeping on a
> VFS mutex.  Do we want to hack VFS as well?

No.

> I guess I know your answer.  But it ain't gonna work.  Suspend code
> really doesn't belong in VFS, and I'm pretty sure the maintainers of
> that little piece of code would agree with me on this.

Surprise, surprise.  Not that I'm scared of the VFS maintainers, though. ;-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 14:14                                         ` Rafael J. Wysocki
@ 2007-07-05 14:14                                           ` Miklos Szeredi
  0 siblings, 0 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 14:14 UTC (permalink / raw)
  To: rjw
  Cc: miklos, pavel, oliver, paulus, stern, johannes, linux-pm,
	linux-kernel, mjg59, benh


> > I guess I know your answer.  But it ain't gonna work.  Suspend code
> > really doesn't belong in VFS, and I'm pretty sure the maintainers of
> > that little piece of code would agree with me on this.
> 
> Surprise, surprise.  Not that I'm scared of the VFS maintainers, though. ;-)

Probably you haven't had a close encounter with them yet.  They are
pretty much the _most_ dangerous people in kernel land ;)

Miklos

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 14:09                                                   ` Rafael J. Wysocki
@ 2007-07-05 14:23                                                     ` Matthew Garrett
  2007-07-05 14:46                                                       ` Ray Lee
                                                                         ` (3 more replies)
  0 siblings, 4 replies; 388+ messages in thread
From: Matthew Garrett @ 2007-07-05 14:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oliver Neukum, Miklos Szeredi, paulus, stern, johannes, linux-pm,
	linux-kernel, pavel, benh

On Thu, Jul 05, 2007 at 04:09:24PM +0200, Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 15:46, Matthew Garrett wrote:
> > I have a model for STD that avoids the need to freeze the entirety of 
> > userspace, but I need to find some more time to flesh it out.
> 
> You can just describe it, as far as I'm concerned. :-)

The basic model is that nobody's really described a use-case where we 
actually care about restoring system state. What people want is to be 
able to restore application state. So, arguably, what we want isn't to 
save the entire kernel state and application state in one go because we 
can reconstruct a huge amount of that afterwards.

This isn't too much of a problem. All we actually need to be able to do 
is to atomically dump process state (which requires the freezer, but 
doesn't require freezing the entire system), shut down, get the system 
back into approximately the correct state (remount filesystems, start X, 
whatever) and then restore the processes.

Now, obviously, there's actually quite a lot of complexity here that I'm 
neatly eliding :) The biggest issue is restoring hardware state. We'd 
require quite a different model to the existing one, but I think there 
are arguments there for it being helpful anyway. Keeping state in the 
midlevels rather than the low-level drivers would give us much more 
ability to deal with hardware issues, and potentially allow the 
replacement of faulty hardware without userspace caring (freeze your 
mission-critical application, hotplug the network card, let the kernel 
restore state and resume it).

There are other advantages to this. As long as the kernel hasn't changed 
too much it would be possible to restore userspace across kernel 
security upgrades. You end up saving less to disk so performance should 
be better. Touching filesystems between suspend and resume doesn't 
result in the entire world ending.

I've mocked up a basic implementation using cryopid, but it's somewhat 
limited by the lack of support for sockets. I'd like to move more of the 
smarts into the kernel (Hurray, checkpointing!) and then see how much 
hardware support ends up horrifically broken.
-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  0:23                       ` Paul Mackerras
  2007-07-05  6:58                         ` Oliver Neukum
@ 2007-07-05 14:23                         ` Alan Stern
  2007-07-05 22:59                           ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-05 14:23 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Johannes Berg, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett,
	Benjamin Herrenschmidt

On Thu, 5 Jul 2007, Paul Mackerras wrote:

> Alan Stern writes:
> 
> > Let's agree the kernel threads and the freezer are a separate issue.  
> 
> No, I don't think they are a separate issue, because I think the
> distinction the freezer makes between kernel threads and user threads
> is a false and misleading distinction.

That's a little strong.  "Misleading" I could understand, but "false"?  
Isn't the distinction between a kernel thread and a user task pretty 
clear-cut (except for a few borderline cases which aren't at issue just 
now)?

> > I agree the kernel threads which try to do I/O during a suspend will 
> > need extra attention.  However if these threads are necessary for the 
> > suspend procedure, then blocking them (which is how people on this 
> > thread have been saying driver should treat I/O requests during a 
> > suspend) will cause additional problems.  There's no way around it; 
> > these threads _will_ require more work.
> 
> There is a way around it; do the request blocking in the drivers,
> where it belongs.

How will that help?  Block the kernel thread in the freezer or block it 
in the driver -- either way it is blocked.  So how do your deadlocks 
get resolved?

> In general the only way to guarantee there are no deadlocks is to
> construct the graph of dependencies between tasks.  Those dependencies
> are not in practice observable from outside the tasks, so it is
> virtually impossible to construct the graph.
> 
> The "don't freeze kernel threads" thing is an attempt to make a crude
> approximation to the dependency graph (by saying kernel threads only
> depend on other kernel threads), but the approximation breaks down
> when you have FUSE or user-level device drivers.

I disagree with your analysis -- not that it's completely wrong, but it 
points out an existing basic problem in the kernel.  The kernel should 
never depend on userspace!  More correctly, a task executing in the 
kernel should never block with any sort of mutex or other lock held (in 
a way that would preclude it from being frozen, let's say) while 
waiting for a response from userspace.

Then the dependency graph would be easy to construct: User tasks can
depend on whatever they want, and kernel threads never depend on a user
task.

If this contradicts the existing implementations and APIs for userspace 
filesystems, then so be it.  My conclusion would be that the 
implementations and APIs should be changed.
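
The rule proposed above can be written down as a check over dependency edges.
This is only a sketch with hypothetical names (toy_dep, toy_rule_holds); the
kernel keeps no such table, and the sketch reduces the rule to "no kernel
thread waits on a user task", leaving out the part about locks held while
waiting:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of a task for the purposes of this sketch. */
enum toy_kind { TOY_USER_TASK, TOY_KERNEL_THREAD };

/* One edge of the dependency graph: "waiter" blocks until "provider" acts. */
struct toy_dep {
	enum toy_kind waiter;
	enum toy_kind provider;
};

/*
 * Returns true if no kernel thread depends on a user task.  When this holds,
 * freezing every user task first cannot leave a kernel thread waiting on a
 * frozen task.
 */
static bool toy_rule_holds(const struct toy_dep *deps, size_t n)
{
	assert(deps != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (deps[i].waiter == TOY_KERNEL_THREAD &&
		    deps[i].provider == TOY_USER_TASK)
			return false;	/* kernel waits on userspace */
	}
	return true;
}
```

An edge in which a kernel-side waiter depends on a userspace filesystem daemon
is the case this check rejects.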

> > There remains the problem of user tasks whose assistance is required to 
> > carry out some I/O (as with FUSE).  If the I/O can be deferred until 
> > after the resume, then there's no problem.  If the I/O can be carried 
> > out before the suspend, then it should be.  And finally, if the I/O 
> > must be done during the suspend, you're in real trouble -- how do you 
> > do I/O to a suspended device?
> 
> So why doesn't that argument apply to kernel threads? :)

It _does_ apply to kernel threads.  That's exactly why I wrote above 
that kernel threads which try to do I/O during a suspend will need 
extra attention.

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  0:36                                       ` Paul Mackerras
  2007-07-05 12:51                                         ` Rafael J. Wysocki
@ 2007-07-05 14:25                                         ` Alan Stern
  2007-07-05 17:42                                           ` Miklos Szeredi
  1 sibling, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-05 14:25 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Miklos Szeredi, rjw, mjg59, Linux-pm mailing list,
	Kernel development list

On Thu, 5 Jul 2007, Paul Mackerras wrote:

> Alan Stern writes:
> 
> > Remember what I wrote a few minutes ago about khubd and ksuspend_usbd
> > wanting to resume devices during a system suspend transition?  This is
> > exactly what happens when those threads aren't frozen.
> 
> So, I wonder why I don't see that error on my powerbook?

That's a good question.  Miklos, can you please reproduce the suspend 
error using a kernel built with CONFIG_USB_DEBUG turned on?

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05 14:28                         ` Rafael J. Wysocki
@ 2007-07-05 14:26                           ` Matthew Garrett
  2007-07-05 14:41                             ` Rafael J. Wysocki
  2007-07-07 11:49                             ` Pavel Machek
  0 siblings, 2 replies; 388+ messages in thread
From: Matthew Garrett @ 2007-07-05 14:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Paul Mackerras, Alan Stern, Johannes Berg,
	Linux-pm mailing list, Kernel development list

On Thu, Jul 05, 2007 at 04:28:11PM +0200, Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 15:57, Matthew Garrett wrote:
> > And also "Userland should not depend on userland services", which is 
> > rather more of a problem.
> 
> I think you're oversimplifying it, as far as FUSE is concerned.
> 
> Namely, if there are two userland tasks, A and B, and B is uninterruptible,
> because A is blocked, then this is not a usual situation.

Fuse is one case of it occurring, and if we end up with more userspace 
drivers then the problem is only going to get worse.
-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05 13:57                       ` Matthew Garrett
@ 2007-07-05 14:28                         ` Rafael J. Wysocki
  2007-07-05 14:26                           ` Matthew Garrett
  2007-07-07 12:08                         ` problem 1 (was Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)) Pavel Machek
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 14:28 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Pavel Machek, Paul Mackerras, Alan Stern, Johannes Berg,
	Linux-pm mailing list, Kernel development list

On Thursday, 5 July 2007 15:57, Matthew Garrett wrote:
> On Thu, Jul 05, 2007 at 11:15:26AM +0200, Pavel Machek wrote:
> 
> > Now, if kernel needs FUSE services for some reason (that's the problem
> > we hit in s2ram case, right?), we have a deadlock.
> > 
> > So main problem still seems to be "kernel should not depend on
> > userland services during suspend", refrigerator or not.
> 
> And also "Userland should not depend on userland services", which is 
> rather more of a problem.

I think you're oversimplifying it, as far as FUSE is concerned.

Namely, if there are two userland tasks, A and B, and B is uninterruptible,
because A is blocked, then this is not a usual situation.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05 14:41                             ` Rafael J. Wysocki
@ 2007-07-05 14:39                               ` Matthew Garrett
  2007-07-05 15:04                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-05 14:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Paul Mackerras, Alan Stern, Johannes Berg,
	Linux-pm mailing list, Kernel development list

On Thu, Jul 05, 2007 at 04:41:39PM +0200, Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 16:26, Matthew Garrett wrote:
> > On Thu, Jul 05, 2007 at 04:28:11PM +0200, Rafael J. Wysocki wrote:
> > > On Thursday, 5 July 2007 15:57, Matthew Garrett wrote:
> > > > And also "Userland should not depend on userland services", which is 
> > > > rather more of a problem.
> > > 
> > > I think you're oversimplifying it, as far as FUSE is concerned.
> > > 
> > > Namely, if there are two userland tasks, A and B, and B is uninterruptible,
> > > because A is blocked, then this is not a usual situation.
> > 
> > > Fuse is one case of it occurring, and if we end up with more userspace 
> > drivers then the problem is only going to get worse.
> 
> But this is a problem by itself, regardless of the freezer etc., no?

Why?
-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05 14:26                           ` Matthew Garrett
@ 2007-07-05 14:41                             ` Rafael J. Wysocki
  2007-07-05 14:39                               ` Matthew Garrett
  2007-07-07 11:49                             ` Pavel Machek
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 14:41 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Pavel Machek, Paul Mackerras, Alan Stern, Johannes Berg,
	Linux-pm mailing list, Kernel development list

On Thursday, 5 July 2007 16:26, Matthew Garrett wrote:
> On Thu, Jul 05, 2007 at 04:28:11PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 15:57, Matthew Garrett wrote:
> > > And also "Userland should not depend on userland services", which is 
> > > rather more of a problem.
> > 
> > I think you're oversimplifying it, as far as FUSE is concerned.
> > 
> > Namely, if there are two userland tasks, A and B, and B is uninterruptible,
> > because A is blocked, then this is not a usual situation.
> 
> > Fuse is one case of it occurring, and if we end up with more userspace 
> drivers then the problem is only going to get worse.

But this is a problem by itself, regardless of the freezer etc., no?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  0:35                   ` Paul Mackerras
  2007-07-05  9:15                     ` removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway) Pavel Machek
@ 2007-07-05 14:42                     ` Alan Stern
  1 sibling, 0 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-05 14:42 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Johannes Berg, Rafael J. Wysocki, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett

On Thu, 5 Jul 2007, Paul Mackerras wrote:

> Alan Stern writes:
> 
> > > > Yes, the code could be changed to keep track of the reason for a device
> > > > suspend.  But that just raises the old problem of what to do when
> > > > there's an I/O request for a suspended device during STR.
> > > 
> > > Is this actually a real problem?  I would think the policy would be
> > > "block" for block devices (pun not intended :), "drop" for network
> > > devices, etc.
> > 
> > It is indeed a real problem, or at least, it can be.
> 
> How so?  Can you give me an example?

The example I quoted earlier about binding during a suspend will do.  I 
agree that we can and should try to prevent it from ever occurring.

Read and write are a problem only in that fixing them would potentially
involve changing lots of drivers; I don't think they pose a serious
theoretical obstacle.  (Lord knows what will happen with async I/O!)

Any other entry points to drivers are also potential problems, but it's 
hard to say anything definite about them since they are so varied.

> > Bus subsystems can suspend devices with no drivers.
> 
> Interesting.  I assume this is for buses for which there is a
> bus-specific but device-independent suspend procedure defined.

Yes.

> It would seem sensible to me that the PM core should get the bus to
> resume such a device before calling a driver probe routine.  The
> resume should be blocked or deferred while a system suspend is
> underway.  In fact I think that all driver bind/unbind and probe
> operations should be deferred while the system is suspending (i.e. put
> on a list to be done after the system resumes).

Getting the PM core to resume a device before probing could be 
difficult; in general it doesn't know enough about specific device 
behaviors to do something like that.  But the subsystem certainly ought 
to take care of it.  USB does.

Yes, bind/unbind/etc. should be deferred during a system suspend.  But
it has to be done carefully, because these operations generally involve
locks that can't be released.  They need to be prevented at their
source, not in the driver core.  That's one reason why khubd needs to
be frozen (being part of the USB hub driver, it is the task responsible
for binding and unbinding drivers to USB devices).

Another thing to look out for is registration and unregistration of 
drivers.  These activities also cause bind/unbind operations.  Note 
that if userspace is frozen then neither insmod nor rmmod can run.  :-)
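The deferral idea can be modelled outside the kernel.  What follows is a
minimal userspace sketch, not kernel code: every name in it (bind_source,
request_bind_op, begin_suspend, finish_resume) is hypothetical and chosen
only for illustration.  It assumes the gate sits at the place where requests
originate (a stand-in for a thread such as khubd), so that a request arriving
during a suspend is recorded on a list and replayed once the system resumes.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum bind_op { OP_BIND, OP_UNBIND };

struct deferred_op {
	enum bind_op op;
	int device_id;
};

#define MAX_DEFERRED 16

/* Hypothetical stand-in for the task that issues bind/unbind requests. */
struct bind_source {
	bool suspending;                          /* set while a system suspend is under way */
	struct deferred_op pending[MAX_DEFERRED]; /* requests held back during suspend */
	size_t npending;
	int applied;                              /* number of operations actually carried out */
};

/* Returns 0 if carried out at once, 1 if deferred, -1 if the list is full. */
static int request_bind_op(struct bind_source *src, enum bind_op op, int device_id)
{
	if (!src->suspending) {
		src->applied++;
		return 0;
	}
	if (src->npending == MAX_DEFERRED)
		return -1;
	src->pending[src->npending].op = op;
	src->pending[src->npending].device_id = device_id;
	src->npending++;
	return 1;
}

static void begin_suspend(struct bind_source *src)
{
	src->suspending = true;
}

/* Replays deferred requests in arrival order; returns how many were replayed. */
static size_t finish_resume(struct bind_source *src)
{
	size_t replayed = src->npending;

	src->suspending = false;
	for (size_t i = 0; i < replayed; i++)
		src->applied++;   /* a real implementation would bind or unbind here */
	src->npending = 0;
	return replayed;
}
```

The model takes no locks at all, and locking is exactly the hard part in the
real driver core, so treat this as a picture of the ordering and nothing more.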

> > It would help.  It would help even more if the sysfs core also blocked
> > all I/O while suspend is under way.  (Although this might be tricky, 
> > considering that the suspend is initiated by a sysfs write...)
> 
> I didn't think sysfs got involved at all in normal read and write
> requests, so I don't know how it would block them...

All I/O to sysfs attributes passes through the routines in fs/sysfs/*.  
It could be blocked there.  (But if userspace is frozen it won't need 
to be.)

> Normally devices have some sort of queue of pending operations.

That's certainly true of block devices, whose drivers use the block 
subsystem.  It's not true for lots of other devices, though.

>  So
> all that is required on suspend is to stop processing the queue and
> wait for any currently-underway operations to complete.  The blocking
> then happens naturally using the normal I/O wait mechanisms.

In my experience, most non-block drivers do not have any queue of 
pending I/O operations.  They simply carry out requests as they arrive.
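For the drivers that do have a queue, the quoted scheme can be sketched as
below.  This is a userspace model under stated assumptions, not driver code:
request_queue_model, submit_request, process_one, queue_suspend and
queue_resume are made-up names, and a full queue simply rejects the request
where a real driver would put the submitter to sleep.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_DEPTH 8

struct request_queue_model {
	int slots[QUEUE_DEPTH]; /* ring buffer of pending request ids */
	size_t head;            /* index of the oldest pending request */
	size_t count;           /* number of pending requests */
	bool stopped;           /* true while the device is suspended */
	int completed;          /* number of requests processed so far */
};

/* Queue a request; accepted even while stopped, rejected only when full. */
static bool submit_request(struct request_queue_model *q, int req)
{
	if (q->count == QUEUE_DEPTH)
		return false;
	q->slots[(q->head + q->count) % QUEUE_DEPTH] = req;
	q->count++;
	return true;
}

/* Process the oldest request unless the queue is stopped or empty. */
static bool process_one(struct request_queue_model *q)
{
	if (q->stopped || q->count == 0)
		return false;
	q->head = (q->head + 1) % QUEUE_DEPTH;
	q->count--;
	q->completed++;
	return true;
}

/* Suspend: stop taking requests off the queue. */
static void queue_suspend(struct request_queue_model *q)
{
	q->stopped = true;
}

/* Resume: restart processing and drain the backlog; returns requests drained. */
static size_t queue_resume(struct request_queue_model *q)
{
	size_t drained = 0;

	q->stopped = false;
	while (process_one(q))
		drained++;
	return drained;
}
```

A driver that carries out each request as it arrives has nothing like
process_one to stop, which is why the scheme does not transfer to it directly.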

Alan Stern


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 14:23                                                     ` Matthew Garrett
@ 2007-07-05 14:46                                                       ` Ray Lee
  2007-07-05 15:00                                                         ` Matthew Garrett
  2007-07-05 14:59                                                       ` Rafael J. Wysocki
                                                                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 388+ messages in thread
From: Ray Lee @ 2007-07-05 14:46 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Rafael J. Wysocki, Oliver Neukum, Miklos Szeredi, paulus, stern,
	johannes, linux-pm, linux-kernel, pavel, benh

On 7/5/07, Matthew Garrett <mjg59@srcf.ucam.org> wrote:
> On Thu, Jul 05, 2007 at 04:09:24PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 15:46, Matthew Garrett wrote:
> > > I have a model for STD that avoids the need to freeze the entirety of
> > > userspace, but I need to find some more time to flesh it out.
> >
> > You can just describe it, as far as I'm concerned. :-)
>
> The basic model is that nobody's really described a use-case where we
> actually care about restoring system state. What people want is to be
> able to restore application state.

Hmm, careful. There are a bunch of people who use suspend2 exactly
because it saves and restores the page cache, leaving the system in a
usable state without waiting for the universe to swap back in from
disk. It makes a big difference on older laptops with slow drives.
While the other advantages you list for process cryogenics are pretty
neat, let's remember that the 99% use case for STD is laptops.

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 14:23                                                     ` Matthew Garrett
  2007-07-05 14:46                                                       ` Ray Lee
@ 2007-07-05 14:59                                                       ` Rafael J. Wysocki
  2007-07-05 16:06                                                       ` Jeremy Maitin-Shepard
  2007-07-06  5:45                                                       ` Daniel Pittman
  3 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 14:59 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Oliver Neukum, Miklos Szeredi, paulus, stern, johannes, linux-pm,
	linux-kernel, pavel, benh

On Thursday, 5 July 2007 16:23, Matthew Garrett wrote:
> On Thu, Jul 05, 2007 at 04:09:24PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 15:46, Matthew Garrett wrote:
> > > I have a model for STD that avoids the need to freeze the entirety of 
> > > userspace, but I need to find some more time to flesh it out.
> > 
> > You can just describe it, as far as I'm concerned. :-)
> 
> The basic model is that nobody's really described a use-case where we 
> actually care about restoring system state. What people want is to be 
> able to restore application state. So, arguably, what we want isn't to 
> save the entire kernel state and application state in one go because we 
> can reconstruct a huge amount of that afterwards.

Hmm, I think that will take more time than just restoring the entire system
state.

> This isn't too much of a problem. All we actually need to be able to do 
> is to atomically dump process state (which requires the freezer, but 
> doesn't require freezing the entire system),

Once you've frozen processes, the freezing of the rest of the system is
pretty straightforward.  Currently, we do a bit too much for that, because we
suspend devices instead of just quiescing them before creating the image,
but that's going to change (I hope).

> shut down, get the system back into approximately the correct state (remount
> filesystems, start X, whatever) and then restore the processes.
>
> Now, obviously, there's actually quite a lot of complexity here that I'm 
> neatly eliding :) The biggest issue is restoring hardware state. We'd 
> require quite a different model to the existing one, but I think there 
> are arguments there for it being helpful anyway. Keeping state in the 
> midlevels rather than the low-level drivers would give us much more 
> ability to deal with hardware issues, and potentially allow the 
> replacement of faulty hardware without userspace caring (freeze your 
> mission-critical application, hotplug the network card, let the kernel 
> restore state and resume it)

Sounds neat, but what about the processes that depend on the hardware
(like hal)?

> There's other advantages to this. As long as the kernel hasn't changed 
> too much it would be possible to restore userspace across kernel 
> security upgrades. You end up saving less to disk so performance should 
> be better.

Well, not that much less. :-)

> Touching filesystems between suspend and resume doesn't result in the entire
> world ending.

Yes, that would be an advantage.

> I've mocked up a basic implementation using cryopid, but it's somewhat 
> limited by the lack of support for sockets. I'd like to move more of the 
> smarts into the kernel (Hurray, checkpointing!) and then see how much 
> hardware support ends up horrifically broken.

There's one more thing: I'm not sure it's possible to separate the kernel
state from the process state entirely (think shared memory, LRU lists,
situations in which the application has been frozen while waiting for an
event in kernel space, etc.).

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 14:46                                                       ` Ray Lee
@ 2007-07-05 15:00                                                         ` Matthew Garrett
  0 siblings, 0 replies; 388+ messages in thread
From: Matthew Garrett @ 2007-07-05 15:00 UTC (permalink / raw)
  To: Ray Lee
  Cc: Rafael J. Wysocki, Oliver Neukum, Miklos Szeredi, paulus, stern,
	johannes, linux-pm, linux-kernel, pavel, benh

On Thu, Jul 05, 2007 at 07:46:01AM -0700, Ray Lee wrote:

> Hmm, careful. There are a bunch of people who use suspend2 exactly
> because it saves and restores the page cache, leaving the system in a
> usable state without waiting for the universe to swap back in from
> disk. It makes a big difference on older laptops with slow drives.
> While the other advantages you list for process cryogenics are pretty
> neat, let's remember that the 99% use case for STD is laptops.

Saving the processes means that you're implicitly saving the interesting 
chunks of the page cache. Removing the need for the atomic copy saves 
you from pushing a pile of stuff out to swap in the first place.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05 15:04                                 ` Rafael J. Wysocki
@ 2007-07-05 15:03                                   ` Matthew Garrett
  2007-07-05 15:27                                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Matthew Garrett @ 2007-07-05 15:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Paul Mackerras, Alan Stern, Johannes Berg,
	Linux-pm mailing list, Kernel development list

On Thu, Jul 05, 2007 at 05:04:47PM +0200, Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 16:39, Matthew Garrett wrote:
> > Why?
> 
> You have processes that don't react to signals, because some other user land
> task is misbehaving.  I'd call that ugly at the very least.

It already happens with, say, NFS. Don't think about it in terms of a 
userland task misbehaving - think of it in terms of a resource becoming 
unavailable.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05 14:39                               ` Matthew Garrett
@ 2007-07-05 15:04                                 ` Rafael J. Wysocki
  2007-07-05 15:03                                   ` Matthew Garrett
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 15:04 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Pavel Machek, Paul Mackerras, Alan Stern, Johannes Berg,
	Linux-pm mailing list, Kernel development list

On Thursday, 5 July 2007 16:39, Matthew Garrett wrote:
> On Thu, Jul 05, 2007 at 04:41:39PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 16:26, Matthew Garrett wrote:
> > > On Thu, Jul 05, 2007 at 04:28:11PM +0200, Rafael J. Wysocki wrote:
> > > > On Thursday, 5 July 2007 15:57, Matthew Garrett wrote:
> > > > > And also "Userland should not depend on userland services", which is 
> > > > > rather more of a problem.
> > > > 
> > > > I think you're oversimplifying it, as far as FUSE is concerned.
> > > > 
> > > > Namely, if there are two userland tasks, A and B, and B is uninterruptible,
> > > > because A is blocked, then this is not a usual situation.
> > > 
> > > > Fuse is one case of it occurring, and if we end up with more userspace 
> > > drivers then the problem is only going to get worse.
> > 
> > But this is a problem by itself, regardless of the freezer etc., no?
> 
> Why?

You have processes that don't react to signals, because some other user land
task is misbehaving.  I'd call that ugly at the very least.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05 15:03                                   ` Matthew Garrett
@ 2007-07-05 15:27                                     ` Rafael J. Wysocki
  2007-07-05 15:32                                       ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 15:27 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Pavel Machek, Paul Mackerras, Alan Stern, Johannes Berg,
	Linux-pm mailing list, Kernel development list

On Thursday, 5 July 2007 17:03, Matthew Garrett wrote:
> On Thu, Jul 05, 2007 at 05:04:47PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 16:39, Matthew Garrett wrote:
> > > Why?
> > 
> > You have processes that don't react to signals, because some other user land
> > task is misbehaving.  I'd call that ugly at the very least.
> 
> It already happens with, say, NFS. Don't think about it in terms of a 
> userland task misbehaving - think of it in terms of a resource becoming 
> unavailable.

I think there's a difference between a userland task playing the role of a
resource and a "real" external resource the kernel doesn't control.

IMO, userland tasks should not have the power to affect each other as though
they were parts of the kernel.

Greetings,
Rafael
 

-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05 15:27                                     ` Rafael J. Wysocki
@ 2007-07-05 15:32                                       ` Miklos Szeredi
  2007-07-07 11:50                                         ` Pavel Machek
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 15:32 UTC (permalink / raw)
  To: rjw; +Cc: mjg59, pavel, paulus, stern, johannes, linux-pm, linux-kernel

> > > You have processes that don't react to signals, because some
> > > other user land task is misbehaving.  I'd call that ugly at the
> > > very least.
> > 
> > It already happens with, say, NFS. Don't think about it in terms of a 
> > userland task misbehaving - think of it in terms of a resource becoming 
> > unavailable.
> 
> I think there's a difference between a userland task playing the role of a
> resource and a "real" external resource the kernel doesn't control.
> 
> IMO, userland tasks should not have the power to affect each other as though
> they were parts of the kernel.

One task doing ptrace() can basically do whatever it wants with the
task being traced.  This is not an exact analogy to what fuse does,
but close.

And for this reason the security model for allowing access to a fuse
filesystem is similar to that for allowing tracing.

The gory details can be found in Documentation/filesystems/fuse.txt.

Miklos

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 14:23                                                     ` Matthew Garrett
  2007-07-05 14:46                                                       ` Ray Lee
  2007-07-05 14:59                                                       ` Rafael J. Wysocki
@ 2007-07-05 16:06                                                       ` Jeremy Maitin-Shepard
  2007-07-06  5:45                                                       ` Daniel Pittman
  3 siblings, 0 replies; 388+ messages in thread
From: Jeremy Maitin-Shepard @ 2007-07-05 16:06 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Rafael J. Wysocki, Oliver Neukum, Miklos Szeredi, paulus, stern,
	johannes, linux-pm, linux-kernel, pavel, benh

Matthew Garrett <mjg59@srcf.ucam.org> writes:

> On Thu, Jul 05, 2007 at 04:09:24PM +0200, Rafael J. Wysocki wrote:
>> On Thursday, 5 July 2007 15:46, Matthew Garrett wrote:
>> > I have a model for STD that avoids the need to freeze the entirety of 
>> > userspace, but I need to find some more time to flesh it out.
>> 
>> You can just describe it, as far as I'm concerned. :-)

[snip: new hibernate idea]

I think my kexec-based hibernate idea is simpler and more feasible than
this approach, and also avoids the freezer.

-- 
Jeremy Maitin-Shepard

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 12:39                                               ` Miklos Szeredi
@ 2007-07-05 16:10                                                 ` Jeremy Fitzhardinge
  2007-07-05 17:45                                                   ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Jeremy Fitzhardinge @ 2007-07-05 16:10 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: rjw, stern, oliver, paulus, mjg59, linux-pm, linux-kernel

Miklos Szeredi wrote:
> Umm, and CODA which is _very_ similar to fuse was there long before
> fuse or the freezer ;)
>   

I did userfs around 1994-5.

    J

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 14:25                                         ` Alan Stern
@ 2007-07-05 17:42                                           ` Miklos Szeredi
  2007-07-05 20:43                                             ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 17:42 UTC (permalink / raw)
  To: stern; +Cc: paulus, miklos, rjw, mjg59, linux-pm, linux-kernel

> > Alan Stern writes:
> > 
> > > Remember what I wrote a few minutes ago about khubd and ksuspend_usbd
> > > wanting to resume devices during a system suspend transition?  This is
> > > exactly what happens when those threads aren't frozen.
> > 
> > So, I wonder why I don't see that error on my powerbook?
> 
> That's a good question.  Miklos, can you please reproduce the suspend 
> error using a kernel built with CONFIG_USB_DEBUG turned on?

Here's the full dmesg.  First there was a successful suspend/resume,
then a couple of unsuccessful ones:

Linux version 2.6.22-rc6 (mszeredi@tucsk) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #19 SMP Thu Jul 5 18:07:34 CEST 2007
Command line: vga=775 root=/dev/sda2
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000d2000 - 00000000000d4000 (reserved)
 BIOS-e820: 00000000000dc000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003fed0000 (usable)
 BIOS-e820: 000000003fed0000 - 000000003fedf000 (ACPI data)
 BIOS-e820: 000000003fedf000 - 000000003ff00000 (ACPI NVS)
 BIOS-e820: 000000003ff00000 - 0000000040000000 (reserved)
 BIOS-e820: 00000000f0000000 - 00000000f4000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fed00000 - 00000000fed00400 (reserved)
 BIOS-e820: 00000000fed14000 - 00000000fed1a000 (reserved)
 BIOS-e820: 00000000fed1c000 - 00000000fed90000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
Entering add_active_range(0, 0, 159) 0 entries of 256 used
Entering add_active_range(0, 256, 261840) 1 entries of 256 used
end_pfn_map = 1048576
DMI present.
ACPI: RSDP 000F67D0, 0024 (r2 LENOVO)
ACPI: XSDT 3FED1308, 008C (r1 LENOVO TP-79        2110  LTP        0)
ACPI: FACP 3FED1400, 00F4 (r3 LENOVO TP-79        2110 LNVO        1)
ACPI Warning (tbfadt-0434): Optional field "Gpe1Block" has zero address or length: 000000000000102C/0 [20070126]
ACPI: DSDT 3FED175E, D481 (r1 LENOVO TP-79        2110 MSFT  100000E)
ACPI: FACS 3FEF4000, 0040
ACPI: SSDT 3FED15B4, 01AA (r1 LENOVO TP-79        2110 MSFT  100000E)
ACPI: ECDT 3FEDEBDF, 0052 (r1 LENOVO TP-79        2110 LNVO        1)
ACPI: TCPA 3FEDEC31, 0032 (r2 LENOVO TP-79        2110 LNVO        1)
ACPI: APIC 3FEDEC63, 0068 (r1 LENOVO TP-79        2110 LNVO        1)
ACPI: MCFG 3FEDECCB, 003C (r1 LENOVO TP-79        2110 LNVO        1)
ACPI: HPET 3FEDED07, 0038 (r1 LENOVO TP-79        2110 LNVO        1)
ACPI: SLIC 3FEDEE62, 0176 (r1 LENOVO TP-79        2110  LTP        0)
ACPI: BOOT 3FEDEFD8, 0028 (r1 LENOVO TP-79        2110  LTP        1)
ACPI: SSDT 3FEF2655, 025F (r1 LENOVO TP-79        2110 INTL 20050513)
ACPI: SSDT 3FEF28B4, 00A6 (r1 LENOVO TP-79        2110 INTL 20050513)
ACPI: SSDT 3FEF295A, 04F7 (r1 LENOVO TP-79        2110 INTL 20050513)
ACPI: SSDT 3FEF2E51, 01D8 (r1 LENOVO TP-79        2110 INTL 20050513)
Entering add_active_range(0, 0, 159) 0 entries of 256 used
Entering add_active_range(0, 256, 261840) 1 entries of 256 used
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  1048576
early_node_map[2] active PFN ranges
    0:        0 ->      159
    0:      256 ->   261840
On node 0 totalpages: 261743
  DMA zone: 56 pages used for memmap
  DMA zone: 1144 pages reserved
  DMA zone: 2799 pages, LIFO batch:0
  DMA32 zone: 3523 pages used for memmap
  DMA32 zone: 254221 pages, LIFO batch:31
  Normal zone: 0 pages used for memmap
ACPI: PM-Timer IO Port: 0x1008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 1, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
ACPI: HPET id: 0x8086a201 base: 0xfed00000
Using ACPI (MADT) for SMP configuration information
swsusp: Registered nosave memory region: 000000000009f000 - 00000000000a0000
swsusp: Registered nosave memory region: 00000000000a0000 - 00000000000d2000
swsusp: Registered nosave memory region: 00000000000d2000 - 00000000000d4000
swsusp: Registered nosave memory region: 00000000000d4000 - 00000000000dc000
swsusp: Registered nosave memory region: 00000000000dc000 - 0000000000100000
Allocating PCI resources starting at 50000000 (gap: 40000000:b0000000)
SMP: Allowing 2 CPUs, 0 hotplug CPUs
PERCPU: Allocating 32616 bytes of per cpu data
Built 1 zonelists.  Total pages: 257020
Kernel command line: vga=775 root=/dev/sda2
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Extended CMOS year: 2000
Marking TSC unstable due to TSCs unsynchronized
time.c: Detected 1828.744 MHz processor.
Console: colour dummy device 80x25
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Checking aperture...
Memory: 1026200k/1047360k available (2316k kernel code, 20600k reserved, 1434k data, 188k init)
Calibrating delay using timer specific routine.. 3661.49 BogoMIPS (lpj=7322983)
Mount-cache hash table entries: 256
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
using mwait in idle threads.
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM2)
SMP alternatives: switching to UP code
ACPI: Core revision 20070126
Using local APIC timer interrupts.
result 10390586
Detected 10.390 MHz APIC timer.
SMP alternatives: switching to SMP code
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 3657.60 BogoMIPS (lpj=7315219)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU1: Thermal monitoring enabled (TM2)
Intel(R) Core(TM)2 CPU         T5600  @ 1.83GHz stepping 06
Brought up 2 CPUs
migration_cost=39
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
PCI quirk: region 1000-107f claimed by ICH6 ACPI/GPIO/TCO
PCI quirk: region 1180-11bf claimed by ICH6 GPIO
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGP_._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.EXP0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.EXP1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.EXP2._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.EXP3._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: Power Resource [PUBS] (on)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 12 devices
ACPI: ACPI bus type pnp unregistered
SCSI subsystem initialized
libata version 2.21 loaded.
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
PCI-GART: No AMD northbridge found.
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
pnp: 00:00: iomem range 0x0-0x9ffff could not be reserved
pnp: 00:00: iomem range 0xc0000-0xc3fff has been reserved
pnp: 00:00: iomem range 0xc4000-0xc7fff has been reserved
pnp: 00:00: iomem range 0xc8000-0xcbfff has been reserved
pnp: 00:02: iomem range 0xf0000000-0xf3ffffff could not be reserved
pnp: 00:02: iomem range 0xfed1c000-0xfed1ffff could not be reserved
pnp: 00:02: iomem range 0xfed14000-0xfed17fff could not be reserved
pnp: 00:02: iomem range 0xfed18000-0xfed18fff could not be reserved
PCI: Bridge: 0000:00:01.0
  IO window: 2000-2fff
  MEM window: ee100000-ee1fffff
  PREFETCH window: d8000000-dfffffff
PCI: Bridge: 0000:00:1c.0
  IO window: 3000-3fff
  MEM window: ee000000-ee0fffff
  PREFETCH window: disabled.
PCI: Bridge: 0000:00:1c.1
  IO window: 4000-5fff
  MEM window: ec000000-edffffff
  PREFETCH window: e4000000-e40fffff
PCI: Bridge: 0000:00:1c.2
  IO window: 6000-7fff
  MEM window: e8000000-e9ffffff
  PREFETCH window: e4100000-e41fffff
PCI: Bridge: 0000:00:1c.3
  IO window: 8000-9fff
  MEM window: ea000000-ebffffff
  PREFETCH window: e4200000-e42fffff
PCI: Bus 22, cardbus bridge: 0000:15:00.0
  IO window: 0000a000-0000a0ff
  IO window: 0000a400-0000a4ff
  PREFETCH window: e0000000-e3ffffff
  MEM window: 50000000-53ffffff
PCI: Bridge: 0000:00:1e.0
  IO window: a000-dfff
  MEM window: e4300000-e7ffffff
  PREFETCH window: e0000000-e3ffffff
ACPI: PCI Interrupt 0000:00:01.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:01.0 to 64
ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 20 (level, low) -> IRQ 20
PCI: Setting latency timer of device 0000:00:1c.0 to 64
ACPI: PCI Interrupt 0000:00:1c.1[B] -> GSI 21 (level, low) -> IRQ 21
PCI: Setting latency timer of device 0000:00:1c.1 to 64
ACPI: PCI Interrupt 0000:00:1c.2[C] -> GSI 22 (level, low) -> IRQ 22
PCI: Setting latency timer of device 0000:00:1c.2 to 64
ACPI: PCI Interrupt 0000:00:1c.3[D] -> GSI 23 (level, low) -> IRQ 23
PCI: Setting latency timer of device 0000:00:1c.3 to 64
PCI: Enabling device 0000:00:1e.0 (0005 -> 0007)
PCI: Setting latency timer of device 0000:00:1e.0 to 64
ACPI: PCI Interrupt 0000:15:00.0[A] -> GSI 16 (level, low) -> IRQ 16
NET: Registered protocol family 2
Time: hpet clocksource has been installed.
IP route cache hash table entries: 32768 (order: 6, 262144 bytes)
TCP established hash table entries: 131072 (order: 9, 3145728 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
Simple Boot Flag at 0x35 set to 0x1
io scheduler noop registered
io scheduler cfq registered (default)
pci 0000:00:1d.0: uhci_check_and_reset_hc: legsup = 0x2000
pci 0000:00:1d.0: Performing full reset
pci 0000:00:1d.1: uhci_check_and_reset_hc: legsup = 0x2000
pci 0000:00:1d.1: Performing full reset
pci 0000:00:1d.2: uhci_check_and_reset_hc: legsup = 0x2000
pci 0000:00:1d.2: Performing full reset
pci 0000:00:1d.3: uhci_check_and_reset_hc: legsup = 0x2000
pci 0000:00:1d.3: Performing full reset
Boot video device is 0000:01:00.0
PCI: Setting latency timer of device 0000:00:01.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:01.0:pcie00]
PCI: Setting latency timer of device 0000:00:1c.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.0:pcie00]
Allocate Port Service[0000:00:1c.0:pcie02]
PCI: Setting latency timer of device 0000:00:1c.1 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.1:pcie00]
Allocate Port Service[0000:00:1c.1:pcie02]
PCI: Setting latency timer of device 0000:00:1c.2 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.2:pcie00]
Allocate Port Service[0000:00:1c.2:pcie02]
PCI: Setting latency timer of device 0000:00:1c.3 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.3:pcie00]
Allocate Port Service[0000:00:1c.3:pcie02]
vesafb: framebuffer at 0xd8000000, mapped to 0xffffc20000080000, using 2560k, total 16384k
vesafb: mode is 1280x1024x8, linelength=1280, pages=11
vesafb: scrolling: redraw
vesafb: Pseudocolor: size=8:8:8:8, shift=0:0:0:0
Console: switching to colour frame buffer device 160x64
fb0: VESA VGA frame buffer device
ACPI: AC Adapter [AC] (on-line)
ACPI: Battery Slot [BAT0] (battery absent)
input: Power Button (FF) as /class/input/input0
ACPI: Power Button (FF) [PWRF]
input: Lid Switch as /class/input/input1
ACPI: Lid Switch [LID]
input: Sleep Button (CM) as /class/input/input2
ACPI: Sleep Button (CM) [SLPB]
ACPI: SSDT 3FEF1D36, 0240 (r1  PmRef  Cpu0Ist      100 INTL 20050513)
ACPI: SSDT 3FEF1FFB, 065A (r1  PmRef  Cpu0Cst      100 INTL 20050513)
Monitor-Mwait will be used to enter C-1 state
Monitor-Mwait will be used to enter C-2 state
Monitor-Mwait will be used to enter C-3 state
ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
ACPI: Processor [CPU0] (supports 8 throttling states)
ACPI: SSDT 3FEF1C6E, 00C8 (r1  PmRef  Cpu1Ist      100 INTL 20050513)
ACPI: SSDT 3FEF1F76, 0085 (r1  PmRef  Cpu1Cst      100 INTL 20050513)
ACPI: CPU1 (power states: C1[C1] C2[C2] C3[C3])
ACPI: Processor [CPU1] (supports 8 throttling states)
ACPI: Thermal Zone [THM0] (40 C)
ACPI: Thermal Zone [THM1] (41 C)
Real Time Clock Driver v1.12ac
hpet_resources: 0xfed00000 is busy
Linux agpgart interface v0.102 (c) Dave Jones
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
loop: module loaded
thinkpad_acpi: ThinkPad ACPI Extras v0.14
thinkpad_acpi: http://ibm-acpi.sf.net/
thinkpad_acpi: ThinkPad EC firmware 79HT50WW-1.07
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH7: IDE controller at PCI slot 0000:00:1f.1
ACPI: PCI Interrupt 0000:00:1f.1[C] -> GSI 16 (level, low) -> IRQ 16
ICH7: chipset revision 2
ICH7: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x1880-0x1887, BIOS settings: hda:DMA, hdb:pio
Probing IDE interface ide0...
hda: MATSHITADVD-RAM UJ-842, ATAPI CD/DVD-ROM drive
hda: selected mode 0x42
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hda: ATAPI 24X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
ahci 0000:00:1f.2: version 2.2
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 16 (level, low) -> IRQ 16
ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 1.5 Gbps 0xf impl SATA mode
ahci 0000:00:1f.2: flags: 64bit ncq pm led clo pio slum part 
PCI: Setting latency timer of device 0000:00:1f.2 to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
ata1: SATA max UDMA/133 cmd 0xffffc2000003a500 ctl 0x0000000000000000 bmdma 0x0000000000000000 irq 0
ata2: SATA max UDMA/133 cmd 0xffffc2000003a580 ctl 0x0000000000000000 bmdma 0x0000000000000000 irq 0
ata3: SATA max UDMA/133 cmd 0xffffc2000003a600 ctl 0x0000000000000000 bmdma 0x0000000000000000 irq 0
ata4: SATA max UDMA/133 cmd 0xffffc2000003a680 ctl 0x0000000000000000 bmdma 0x0000000000000000 irq 0
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 156301488, hpa_sectors = 156301488
ata1.00: ATA-7: HTS541080G9SA00, MB4IC60R, max UDMA/100
ata1.00: 156301488 sectors, multi 16: LBA48 
ata1.00: ata_hpa_resize 1: sectors = 156301488, hpa_sectors = 156301488
ata1.00: configured for UDMA/100
ata2: SATA link down (SStatus 0 SControl 0)
ata3: SATA link down (SStatus 0 SControl 0)
ata4: SATA link down (SStatus 0 SControl 0)
scsi 0:0:0:0: Direct-Access     ATA      HTS541080G9SA00  MB4I PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors (80026 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors (80026 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3 < sda5 >
sd 0:0:0:0: [sda] Attached SCSI disk
sd 0:0:0:0: Attached scsi generic sg0 type 0
usbmon: debugfs is not available
ehci_hcd: block sizes: qh 160 qtd 96 itd 192 sitd 96
ACPI: PCI Interrupt 0000:00:1d.7[D] -> GSI 19 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1d.7: reset hcs_params 0x104208 dbg=1 cc=4 pcc=2 ordered !ppc ports=8
ehci_hcd 0000:00:1d.7: reset hcc_params 6871 thresh 7 uframes 1024 64 bit addr
ehci_hcd 0000:00:1d.7: debug port 1
PCI: cache line size of 32 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: supports USB remote wakeup
ehci_hcd 0000:00:1d.7: irq 19, io mem 0xee404000
ehci_hcd 0000:00:1d.7: reset command 080022 (park)=0 ithresh=8 Async period=1024 Reset HALT
ehci_hcd 0000:00:1d.7: init command 010001 (park)=0 ithresh=1 period=1024 RUN
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: default language 0x0409
usb usb1: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: EHCI Host Controller
usb usb1: Manufacturer: Linux 2.6.22-rc6 ehci_hcd
usb usb1: SerialNumber: 0000:00:1d.7
usb usb1: uevent
usb usb1: usb_probe_device
usb usb1: configuration #1 chosen from 1 choice
usb usb1: adding 1-0:1.0 (config #1, interface 0)
usb 1-0:1.0: uevent
usb 1-0:1.0: uevent
hub 1-0:1.0: usb_probe_interface
hub 1-0:1.0: usb_probe_interface - got id
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 8 ports detected
hub 1-0:1.0: standalone hub
hub 1-0:1.0: no power switching (usb 1.0)
hub 1-0:1.0: individual port over-current protection
hub 1-0:1.0: Single TT
hub 1-0:1.0: TT requires at most 8 FS bit times (666 ns)
hub 1-0:1.0: power on to power good time: 20ms
hub 1-0:1.0: local power source is good
hub 1-0:1.0: trying to enable port power on non-switchable hub
hub 1-0:1.0: state 7 ports 8 chg 0000 evt 0000
ehci_hcd 0000:00:1d.7: GetStatus port 1 status 001403 POWER sig=k CSC CONNECT
USB Universal Host Controller Interface driver v3.0
hub 1-0:1.0: port 1, status 0501, change 0001, 480 Mb/s
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.0: detected 2 ports
uhci_hcd 0000:00:1d.0: uhci_check_and_reset_hc: cmd = 0x0000
uhci_hcd 0000:00:1d.0: Performing full reset
uhci_hcd 0000:00:1d.0: irq 16, io base 0x00001800
usb usb2: default language 0x0409
usb usb2: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: UHCI Host Controller
usb usb2: Manufacturer: Linux 2.6.22-rc6 uhci_hcd
usb usb2: SerialNumber: 0000:00:1d.0
usb usb2: uevent
usb usb2: usb_probe_device
usb usb2: configuration #1 chosen from 1 choice
usb usb2: adding 2-0:1.0 (config #1, interface 0)
usb 2-0:1.0: uevent
usb 2-0:1.0: uevent
hub 2-0:1.0: usb_probe_interface
hub 2-0:1.0: usb_probe_interface - got id
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
hub 2-0:1.0: standalone hub
hub 2-0:1.0: no power switching (usb 1.0)
hub 2-0:1.0: individual port over-current protection
hub 2-0:1.0: power on to power good time: 2ms
hub 2-0:1.0: local power source is good
hub 2-0:1.0: trying to enable port power on non-switchable hub
hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x501
ehci_hcd 0000:00:1d.7: port 1 low speed --> companion
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.1: detected 2 ports
uhci_hcd 0000:00:1d.1: uhci_check_and_reset_hc: cmd = 0x0000
uhci_hcd 0000:00:1d.1: Performing full reset
uhci_hcd 0000:00:1d.1: irq 17, io base 0x00001820
usb usb3: default language 0x0409
usb usb3: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb3: Product: UHCI Host Controller
usb usb3: Manufacturer: Linux 2.6.22-rc6 uhci_hcd
usb usb3: SerialNumber: 0000:00:1d.1
usb usb3: uevent
usb usb3: usb_probe_device
usb usb3: configuration #1 chosen from 1 choice
usb usb3: adding 3-0:1.0 (config #1, interface 0)
ehci_hcd 0000:00:1d.7: GetStatus port 1 status 003002 POWER OWNER sig=se0 CSC
usb 3-0:1.0: uevent
usb 3-0:1.0: uevent
ehci_hcd 0000:00:1d.7: GetStatus port 8 status 001803 POWER sig=j CSC CONNECT
hub 1-0:1.0: port 8, status 0501, change 0001, 480 Mb/s
hub 3-0:1.0: usb_probe_interface
hub 3-0:1.0: usb_probe_interface - got id
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
hub 3-0:1.0: standalone hub
hub 3-0:1.0: no power switching (usb 1.0)
hub 3-0:1.0: individual port over-current protection
hub 3-0:1.0: power on to power good time: 2ms
hub 3-0:1.0: local power source is good
hub 3-0:1.0: trying to enable port power on non-switchable hub
ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 18
PCI: Setting latency timer of device 0000:00:1d.2 to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1d.2: detected 2 ports
uhci_hcd 0000:00:1d.2: uhci_check_and_reset_hc: cmd = 0x0000
uhci_hcd 0000:00:1d.2: Performing full reset
uhci_hcd 0000:00:1d.2: irq 18, io base 0x00001840
usb usb4: default language 0x0409
usb usb4: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb4: Product: UHCI Host Controller
usb usb4: Manufacturer: Linux 2.6.22-rc6 uhci_hcd
usb usb4: SerialNumber: 0000:00:1d.2
usb usb4: uevent
usb usb4: usb_probe_device
usb usb4: configuration #1 chosen from 1 choice
usb usb4: adding 4-0:1.0 (config #1, interface 0)
usb 4-0:1.0: uevent
usb 4-0:1.0: uevent
hub 4-0:1.0: usb_probe_interface
hub 4-0:1.0: usb_probe_interface - got id
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
hub 1-0:1.0: debounce: port 8: total 100ms stable 100ms status 0x501
hub 4-0:1.0: standalone hub
hub 4-0:1.0: no power switching (usb 1.0)
hub 4-0:1.0: individual port over-current protection
hub 4-0:1.0: power on to power good time: 2ms
hub 4-0:1.0: local power source is good
hub 4-0:1.0: trying to enable port power on non-switchable hub
ehci_hcd 0000:00:1d.7: port 8 full speed --> companion
ehci_hcd 0000:00:1d.7: GetStatus port 8 status 003801 POWER OWNER sig=j CONNECT
hub 1-0:1.0: port 8 not reset yet, waiting 50ms
ACPI: PCI Interrupt 0000:00:1d.3[D] -> GSI 19 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1d.3 to 64
uhci_hcd 0000:00:1d.3: UHCI Host Controller
ehci_hcd 0000:00:1d.7: GetStatus port 8 status 003002 POWER OWNER sig=se0 CSC
hub 1-0:1.0: state 7 ports 8 chg 0000 evt 0100
hub 2-0:1.0: state 7 ports 2 chg 0000 evt 0002
uhci_hcd 0000:00:1d.0: port 1 portsc 01ab,00
hub 2-0:1.0: port 1, status 0301, change 0003, 1.5 Mb/s
uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5
uhci_hcd 0000:00:1d.3: detected 2 ports
uhci_hcd 0000:00:1d.3: uhci_check_and_reset_hc: cmd = 0x0000
uhci_hcd 0000:00:1d.3: Performing full reset
uhci_hcd 0000:00:1d.3: irq 19, io base 0x00001860
usb usb5: default language 0x0409
usb usb5: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb5: Product: UHCI Host Controller
usb usb5: Manufacturer: Linux 2.6.22-rc6 uhci_hcd
usb usb5: SerialNumber: 0000:00:1d.3
usb usb5: uevent
usb usb5: usb_probe_device
usb usb5: configuration #1 chosen from 1 choice
usb usb5: adding 5-0:1.0 (config #1, interface 0)
usb 5-0:1.0: uevent
usb 5-0:1.0: uevent
hub 5-0:1.0: usb_probe_interface
hub 5-0:1.0: usb_probe_interface - got id
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
hub 5-0:1.0: standalone hub
hub 5-0:1.0: no power switching (usb 1.0)
hub 5-0:1.0: individual port over-current protection
hub 5-0:1.0: power on to power good time: 2ms
hub 5-0:1.0: local power source is good
hub 5-0:1.0: trying to enable port power on non-switchable hub
hub 2-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x301
PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard as /class/input/input3
usb 2-1: new low speed USB device using uhci_hcd and address 2
usb 2-1: skipped 1 descriptor after interface
usb 2-1: default language 0x0409
usb 2-1: new device strings: Mfr=1, Product=2, SerialNumber=0
usb 2-1: Product: Basic Optical Mouse
usb 2-1: Manufacturer: Microsoft
usb 2-1: uevent
usb 2-1: usb_probe_device
usb 2-1: configuration #1 chosen from 1 choice
usb 2-1: adding 2-1:1.0 (config #1, interface 0)
usb 2-1:1.0: uevent
usb 2-1:1.0: uevent
hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0000
hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0000
hub 5-0:1.0: state 7 ports 2 chg 0000 evt 0004
uhci_hcd 0000:00:1d.3: port 2 portsc 009b,00
hub 5-0:1.0: port 2, status 0101, change 0003, 12 Mb/s
hub 5-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x101
usb 5-2: new full speed USB device using uhci_hcd and address 2
Synaptics Touchpad, model: 1, fw: 6.2, id: 0x81a0b1, caps: 0xa04793/0x300000
serio: Synaptics pass-through port at isa0060/serio1/input0
usb 5-2: ep0 maxpacket = 8
input: SynPS/2 Synaptics TouchPad as /class/input/input4
usb 5-2: default language 0x0409
usb 5-2: new device strings: Mfr=1, Product=2, SerialNumber=0
usb 5-2: Product: Biometric Coprocessor
usb 5-2: Manufacturer: STMicroelectronics
usb usb3: suspend_rh (auto-stop)
usb 5-2: uevent
usb 5-2: usb_probe_device
usb 5-2: configuration #1 chosen from 1 choice
usb 5-2: adding 5-2:1.0 (config #1, interface 0)
usb 5-2:1.0: uevent
usb 5-2:1.0: uevent
usbhid 2-1:1.0: usb_probe_interface
usbhid 2-1:1.0: usb_probe_interface - got id
input: Microsoft Basic Optical Mouse as /class/input/input5
input: USB HID v1.10 Mouse [Microsoft Basic Optical Mouse] on usb-0000:00:1d.0-1
usbcore: registered new interface driver usbhid
drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
Advanced Linux Sound Architecture Driver Version 1.0.14 (Thu May 31 09:03:25 2007 UTC).
ACPI: PCI Interrupt 0000:00:1b.0[B] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1b.0 to 64
usb usb4: suspend_rh (auto-stop)
ALSA device list:
  #0: HDA Intel at 0xee400000 irq 17
Netfilter messages via NETLINK v0.30.
nf_conntrack version 0.5.0 (4091 buckets, 32728 max)
ip_tables: (C) 2000-2006 Netfilter Core Team
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
IBM TrackPoint firmware: 0x0e, buttons: 3/3
input: TPPS/2 IBM TrackPoint as /class/input/input6
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 188k freed
usb usb2: uevent
usb 2-0:1.0: uevent
usb 2-0:1.0: uevent
usb 2-1: uevent
usb 2-1:1.0: uevent
usb 2-1:1.0: uevent
usb usb3: uevent
usb 3-0:1.0: uevent
usb 3-0:1.0: uevent
usb usb4: uevent
usb 4-0:1.0: uevent
usb 4-0:1.0: uevent
usb usb5: uevent
usb 5-0:1.0: uevent
usb 5-0:1.0: uevent
usb 5-2: uevent
usb 5-2:1.0: uevent
usb 5-2:1.0: uevent
usb usb1: uevent
usb 1-0:1.0: uevent
usb 1-0:1.0: uevent
Intel(R) PRO/1000 Network Driver - version 7.3.20-k2
Copyright (c) 1999-2006 Intel Corporation.
ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:02:00.0 to 64
e1000: 0000:02:00.0: e1000_validate_option: Receive Interrupt Delay set to 32
e1000: 0000:02:00.0: e1000_probe: (PCI Express:2.5Gb/s:Width x1) 00:16:41:e3:2c:76
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
Adding 1542196k swap on /dev/sda1.  Priority:-1 extents:1 across:1542196k
EXT3 FS on sda2, internal journal
fuse init (API version 7.8)
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
uhci_hcd 0000:00:1d.0: reserve dev 2 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: release dev 2 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: reserve dev 2 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: release dev 2 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: reserve dev 2 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: release dev 2 ep81-INT, period 8, phase 4, 93 us
IA-32 Microcode Update Driver: v1.14a <tigran@aivazian.fsnet.co.uk>
uhci_hcd 0000:00:1d.0: reserve dev 2 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: release dev 2 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: reserve dev 2 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: release dev 2 ep81-INT, period 8, phase 4, 93 us
Suspending console(s)
hub 5-0:1.0: hub_suspend
usb usb5: suspend_rh
hub 4-0:1.0: hub_suspend
usb usb4: suspend_rh
hub 3-0:1.0: hub_suspend
usb usb3: suspend_rh
hub 2-0:1.0: hub_suspend
usb usb2: suspend_rh
hub 1-0:1.0: hub_suspend
ehci_hcd 0000:00:1d.7: suspend root hub
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
ACPI: PCI interrupt for device 0000:02:00.0 disabled
ACPI: PCI interrupt for device 0000:00:1d.7 disabled
ehci_hcd 0000:00:1d.7: --> PCI D3/wakeup
uhci_hcd 0000:00:1d.3: uhci_suspend
ACPI: PCI interrupt for device 0000:00:1d.3 disabled
uhci_hcd 0000:00:1d.3: --> PCI D0/legacy
uhci_hcd 0000:00:1d.2: uhci_suspend
ACPI: PCI interrupt for device 0000:00:1d.2 disabled
uhci_hcd 0000:00:1d.2: --> PCI D0/legacy
uhci_hcd 0000:00:1d.1: uhci_suspend
ACPI: PCI interrupt for device 0000:00:1d.1 disabled
uhci_hcd 0000:00:1d.1: --> PCI D0/legacy
uhci_hcd 0000:00:1d.0: uhci_suspend
ACPI: PCI interrupt for device 0000:00:1d.0 disabled
uhci_hcd 0000:00:1d.0: --> PCI D0/legacy
ACPI: PCI interrupt for device 0000:00:1b.0 disabled
Disabling non-boot CPUs ...
CPU 1 is now offline
SMP alternatives: switching to UP code
CPU1 is down
Extended CMOS year: 2000
Back to C!
Extended CMOS year: 2000
Enabling non-boot CPUs ...
SMP alternatives: switching to SMP code
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 3657.62 BogoMIPS (lpj=7315250)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
Intel(R) Core(TM)2 CPU         T5600  @ 1.83GHz stepping 06
CPU1 is up
PM: Writing back config space on device 0000:00:01.0 at offset 7 (was 20002020, writing 2020)
PM: Writing back config space on device 0000:00:01.0 at offset 1 (was 100107, writing 100507)
PCI: Setting latency timer of device 0000:00:01.0 to 64
PM: Writing back config space on device 0000:00:1b.0 at offset 1 (was 100106, writing 100102)
ACPI: PCI Interrupt 0000:00:1b.0[B] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1b.0 to 64
PM: Writing back config space on device 0000:00:1c.0 at offset 1 (was 180107, writing 100507)
PCI: Setting latency timer of device 0000:00:1c.0 to 64
PM: Writing back config space on device 0000:00:1c.1 at offset 1 (was 100107, writing 100507)
PCI: Setting latency timer of device 0000:00:1c.1 to 64
PM: Writing back config space on device 0000:00:1c.2 at offset 7 (was 20007060, writing 7060)
PM: Writing back config space on device 0000:00:1c.2 at offset 1 (was 100107, writing 100507)
PCI: Setting latency timer of device 0000:00:1c.2 to 64
PM: Writing back config space on device 0000:00:1c.3 at offset f (was 40400, writing 4040b)
PM: Writing back config space on device 0000:00:1c.3 at offset 9 (was 10001, writing e421e421)
PM: Writing back config space on device 0000:00:1c.3 at offset 8 (was 0, writing ebf0ea00)
PM: Writing back config space on device 0000:00:1c.3 at offset 7 (was 20000000, writing 9080)
PM: Writing back config space on device 0000:00:1c.3 at offset 3 (was 810000, writing 810010)
PM: Writing back config space on device 0000:00:1c.3 at offset 1 (was 100000, writing 100507)
PCI: Setting latency timer of device 0000:00:1c.3 to 64
uhci_hcd 0000:00:1d.0: PCI legacy resume
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: uhci_resume
uhci_hcd 0000:00:1d.0: uhci_check_and_reset_hc: legsup = 0x2f00
uhci_hcd 0000:00:1d.0: Performing full reset
usb usb2: root hub lost power or was reset
usb usb2: suspend_rh
hub 2-0:1.0: hub_resume
usb usb2: wakeup_rh
uhci_hcd 0000:00:1d.1: PCI legacy resume
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: uhci_resume
uhci_hcd 0000:00:1d.1: uhci_check_and_reset_hc: legsup = 0x2000
uhci_hcd 0000:00:1d.1: Performing full reset
usb usb3: root hub lost power or was reset
usb usb3: suspend_rh
uhci_hcd 0000:00:1d.2: PCI legacy resume
ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 18
PCI: Setting latency timer of device 0000:00:1d.2 to 64
uhci_hcd 0000:00:1d.2: uhci_resume
uhci_hcd 0000:00:1d.2: uhci_check_and_reset_hc: legsup = 0x2000
uhci_hcd 0000:00:1d.2: Performing full reset
usb usb4: root hub lost power or was reset
usb usb4: suspend_rh
uhci_hcd 0000:00:1d.3: PCI legacy resume
ACPI: PCI Interrupt 0000:00:1d.3[D] -> GSI 19 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1d.3 to 64
uhci_hcd 0000:00:1d.3: uhci_resume
uhci_hcd 0000:00:1d.3: uhci_check_and_reset_hc: legsup = 0x2000
uhci_hcd 0000:00:1d.3: Performing full reset
usb usb5: root hub lost power or was reset
usb usb5: suspend_rh
ehci_hcd 0000:00:1d.7: PCI D0, from previous PCI D3
ACPI: PCI Interrupt 0000:00:1d.7[D] -> GSI 19 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1d.7 to 64
PM: Writing back config space on device 0000:00:1e.0 at offset 1 (was 100005, writing 100007)
PCI: Setting latency timer of device 0000:00:1e.0 to 64
ACPI: PCI Interrupt 0000:00:1f.1[C] -> GSI 16 (level, low) -> IRQ 16
PM: Writing back config space on device 0000:00:1f.2 at offset 1 (was 2b00007, writing 2b00407)
PCI: Setting latency timer of device 0000:00:1f.2 to 64
hub 5-0:1.0: hub_resume
usb usb5: wakeup_rh
hub 2-0:1.0: state 7 ports 2 chg 0002 evt 0000
uhci_hcd 0000:00:1d.0: port 1 portsc 008a,00
hub 2-0:1.0: port 1, status 0100, change 0003, 12 Mb/s
usb 2-1: USB disconnect, address 2
usb 2-1: unregistering device
usb 2-1: usb_disable_device nuking all URBs
usb 2-1: unregistering interface 2-1:1.0
usb_endpoint usbdev2.2_ep81: ep_device_release called for usbdev2.2_ep81
usb 2-1:1.0: uevent
usb 2-1:1.0: uevent
usb_endpoint usbdev2.2_ep00: ep_device_release called for usbdev2.2_ep00
usb 2-1: uevent
hub 1-0:1.0: hub_resume
ehci_hcd 0000:00:1d.7: resume root hub
hub 2-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x100
hub 5-0:1.0: state 7 ports 2 chg 0004 evt 0004
uhci_hcd 0000:00:1d.3: port 2 portsc 008a,00
hub 5-0:1.0: port 2, status 0100, change 0003, 12 Mb/s
usb 5-2: USB disconnect, address 2
usb 5-2: unregistering device
usb 5-2: usb_disable_device nuking all URBs
usb 5-2: unregistering interface 5-2:1.0
usb_endpoint usbdev5.2_ep81: ep_device_release called for usbdev5.2_ep81
usb_endpoint usbdev5.2_ep02: ep_device_release called for usbdev5.2_ep02
usb_endpoint usbdev5.2_ep83: ep_device_release called for usbdev5.2_ep83
usb 5-2:1.0: uevent
usb 5-2:1.0: uevent
usb_endpoint usbdev5.2_ep00: ep_device_release called for usbdev5.2_ep00
usb 5-2: uevent
hub 5-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x100
hub 1-0:1.0: state 7 ports 8 chg 0000 evt 0102
ehci_hcd 0000:00:1d.7: GetStatus port 1 status 001403 POWER sig=k CSC CONNECT
hub 1-0:1.0: port 1, status 0501, change 0001, 480 Mb/s
hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x501
ehci_hcd 0000:00:1d.7: port 1 low speed --> companion
ehci_hcd 0000:00:1d.7: GetStatus port 1 status 003002 POWER OWNER sig=se0 CSC
ehci_hcd 0000:00:1d.7: GetStatus port 8 status 001803 POWER sig=j CSC CONNECT
hub 1-0:1.0: port 8, status 0501, change 0001, 480 Mb/s
hub 1-0:1.0: debounce: port 8: total 100ms stable 100ms status 0x501
ehci_hcd 0000:00:1d.7: port 8 full speed --> companion
ehci_hcd 0000:00:1d.7: GetStatus port 8 status 003801 POWER OWNER sig=j CONNECT
hub 1-0:1.0: port 8 not reset yet, waiting 50ms
ehci_hcd 0000:00:1d.7: GetStatus port 8 status 003002 POWER OWNER sig=se0 CSC
hub 1-0:1.0: state 7 ports 8 chg 0000 evt 0100
hub 2-0:1.0: state 7 ports 2 chg 0000 evt 0002
uhci_hcd 0000:00:1d.0: port 1 portsc 01a3,00
hub 2-0:1.0: port 1, status 0301, change 0001, 1.5 Mb/s
hub 2-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x301
usb 2-1: new low speed USB device using uhci_hcd and address 3
PM: Writing back config space on device 0000:01:00.0 at offset 1 (was 100103, writing 100107)
PM: Writing back config space on device 0000:02:00.0 at offset 1 (was 100107, writing 100507)
ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:02:00.0 to 64
PM: Writing back config space on device 0000:15:00.0 at offset f (was 34001ff, writing 3c0010b)
PM: Writing back config space on device 0000:15:00.0 at offset e (was 0, writing a4fc)
PM: Writing back config space on device 0000:15:00.0 at offset d (was 0, writing a400)
PM: Writing back config space on device 0000:15:00.0 at offset c (was 0, writing a0fc)
PM: Writing back config space on device 0000:15:00.0 at offset b (was 0, writing a000)
PM: Writing back config space on device 0000:15:00.0 at offset a (was 0, writing 53fff000)
PM: Writing back config space on device 0000:15:00.0 at offset 9 (was 0, writing 50000000)
PM: Writing back config space on device 0000:15:00.0 at offset 8 (was 0, writing e3fff000)
PM: Writing back config space on device 0000:15:00.0 at offset 7 (was 0, writing e0000000)
PM: Writing back config space on device 0000:15:00.0 at offset 6 (was 0, writing b0171615)
PM: Writing back config space on device 0000:15:00.0 at offset 4 (was 0, writing e4300000)
PM: Writing back config space on device 0000:15:00.0 at offset 3 (was 20000, writing 2a808)
PM: Writing back config space on device 0000:15:00.0 at offset 1 (was 2100000, writing 2100007)
ACPI: PCI Interrupt 0000:15:00.0[A] -> GSI 16 (level, low) -> IRQ 16
usb 2-1: skipped 1 descriptor after interface
hda: selected mode 0x42
usb 2-1: default language 0x0409
sd 0:0:0:0: [sda] Starting disk
usb 2-1: new device strings: Mfr=1, Product=2, SerialNumber=0
usb 2-1: Product: Basic Optical Mouse
usb 2-1: Manufacturer: Microsoft
usb 2-1: uevent
usb 2-1: usb_probe_device
usb 2-1: configuration #1 chosen from 1 choice
usb 2-1: adding 2-1:1.0 (config #1, interface 0)
usb 2-1:1.0: uevent
usb 2-1:1.0: uevent
usbhid 2-1:1.0: usb_probe_interface
usbhid 2-1:1.0: usb_probe_interface - got id
input: Microsoft Basic Optical Mouse as /class/input/input7
input: USB HID v1.10 Mouse [Microsoft Basic Optical Mouse] on usb-0000:00:1d.0-1
hub 5-0:1.0: state 7 ports 2 chg 0000 evt 0004
uhci_hcd 0000:00:1d.3: port 2 portsc 0093,00
hub 5-0:1.0: port 2, status 0101, change 0001, 12 Mb/s
uhci_hcd 0000:00:1d.0: reserve dev 3 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: release dev 3 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: reserve dev 3 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: release dev 3 ep81-INT, period 8, phase 4, 93 us
hub 5-0:1.0: debounce: port 2: total 100ms stable 100ms status 0x101
ata2: SATA link down (SStatus 0 SControl 0)
ata3: SATA link down (SStatus 0 SControl 0)
ata4: SATA link down (SStatus 0 SControl 0)
usb 5-2: new full speed USB device using uhci_hcd and address 3
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
usb 5-2: ep0 maxpacket = 8
ata1.00: ata_hpa_resize 1: sectors = 156301488, hpa_sectors = 156301488
ata1.00: ata_hpa_resize 1: sectors = 156301488, hpa_sectors = 156301488
ata1.00: configured for UDMA/100
usb 5-2: default language 0x0409
sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors (80026 MB)
hub 3-0:1.0: hub_resume
usb usb3: wakeup_rh
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
usb 5-2: new device strings: Mfr=1, Product=2, SerialNumber=0
usb 5-2: Product: Biometric Coprocessor
usb 5-2: Manufacturer: STMicroelectronics
usb 5-2: uevent
usb 5-2: usb_probe_device
usb 5-2: configuration #1 chosen from 1 choice
usb 5-2: adding 5-2:1.0 (config #1, interface 0)
usb 5-2:1.0: uevent
usb 5-2:1.0: uevent
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
hub 3-0:1.0: state 7 ports 2 chg 0000 evt 0000
hub 4-0:1.0: hub_resume
usb usb4: wakeup_rh
hub 4-0:1.0: state 7 ports 2 chg 0000 evt 0000
usb usb3: suspend_rh (auto-stop)
usb usb4: suspend_rh (auto-stop)
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
ata4: soft resetting port
ata4: SATA link down (SStatus 0 SControl 0)
ata4: EH complete
ata3: soft resetting port
ata3: SATA link down (SStatus 0 SControl 0)
ata3: EH complete
ata2: soft resetting port
ata2: SATA link down (SStatus 0 SControl 0)
ata2: EH complete
Uhhuh. NMI received for unknown reason a0.
You have some hardware problem, likely on the PCI bus.
Dazed and confused, but trying to continue
uhci_hcd 0000:00:1d.0: reserve dev 3 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: release dev 3 ep81-INT, period 8, phase 4, 93 us
Suspending console(s)
hub 5-0:1.0: port 2 nyet suspended
hub 5-0:1.0: suspend error -16
suspend_device(): usb_suspend+0x0/0x1c() returns -16
Could not suspend device usb5: error -16
Some devices failed to suspend
ata4: soft resetting port
ata4: SATA link down (SStatus 0 SControl 0)
ata4: EH complete
ata3: soft resetting port
ata3: SATA link down (SStatus 0 SControl 0)
ata3: EH complete
ata2: soft resetting port
ata2: SATA link down (SStatus 0 SControl 0)
ata2: EH complete
uhci_hcd 0000:00:1d.0: reserve dev 3 ep81-INT, period 8, phase 4, 93 us
uhci_hcd 0000:00:1d.0: release dev 3 ep81-INT, period 8, phase 4, 93 us
Suspending console(s)
hub 5-0:1.0: port 2 nyet suspended
hub 5-0:1.0: suspend error -16
suspend_device(): usb_suspend+0x0/0x1c() returns -16
Could not suspend device usb5: error -16
Some devices failed to suspend
ata4: soft resetting port
ata4: SATA link down (SStatus 0 SControl 0)
ata4: EH complete
ata3: soft resetting port
ata3: SATA link down (SStatus 0 SControl 0)
ata3: EH complete
ata2: soft resetting port
ata2: SATA link down (SStatus 0 SControl 0)
ata2: EH complete
uhci_hcd 0000:00:1d.0: reserve dev 3 ep81-INT, period 8, phase 4, 93 us

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 16:10                                                 ` Jeremy Fitzhardinge
@ 2007-07-05 17:45                                                   ` Miklos Szeredi
  0 siblings, 0 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 17:45 UTC (permalink / raw)
  To: jeremy; +Cc: miklos, rjw, stern, oliver, paulus, mjg59, linux-pm, linux-kernel

> > Umm, and CODA which is _very_ similar to fuse was there long before
> > fuse or the freezer ;)
> >   
> 
> I did userfs around 1994-5.

Yes, fuse didn't in fact have many original ideas in it.  It was
just a matter of putting all the pieces together to make a stable and
useful userspace filesystem framework.

Miklos

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 12:07                                   ` Miklos Szeredi
  2007-07-05 13:28                                     ` Rafael J. Wysocki
@ 2007-07-05 19:38                                     ` Oliver Neukum
  2007-07-05 19:44                                       ` Miklos Szeredi
  2007-07-07 12:17                                     ` Pavel Machek
  2 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05 19:38 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: pavel, paulus, stern, johannes, rjw, linux-pm, linux-kernel, mjg59, benh

On Thursday, 5 July 2007, Miklos Szeredi wrote:
> > > Actually fuse allows SIGKILL, because it's always fatal, and the
> > > syscall may not be restarted.
> > 
> > I think you want to stick try_to_freeze() at the same places where you
> > do SIGKILL handling. That should solve the 'syslogd is unfreezeable'
> > problem.
> 
> I could, but it would not solve the general problem.  Namely, that the
> presence of fuse imposes a certain ordering in which userspace tasks
> have to be frozen.  And it is not possible to know this ordering.

Actually, why do you need this? There is no absolute need that you
finish the request. You must either finish the request or let yourself
be frozen.

A quick look through fuse reveals principally request_wait_answer(),
and maybe a few other places. Is there some hidden reason you cannot
handle being frozen here?

	Regards
		Oliver

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 19:38                                     ` Oliver Neukum
@ 2007-07-05 19:44                                       ` Miklos Szeredi
  2007-07-05 20:19                                         ` Rafael J. Wysocki
                                                           ` (2 more replies)
  0 siblings, 3 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 19:44 UTC (permalink / raw)
  To: oliver
  Cc: miklos, pavel, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

> On Thursday, 5 July 2007, Miklos Szeredi wrote:
> > > > Actually fuse allows SIGKILL, because it's always fatal, and the
> > > > syscall may not be restarted.
> > > 
> > > I think you want to stick try_to_freeze() at the same places where you
> > > do SIGKILL handling. That should solve the 'syslogd is unfreezeable'
> > > problem.
> > 
> > I could, but it would not solve the general problem.  Namely, that the
> > presence of fuse imposes a certain ordering in which userspace tasks
> > have to be frozen.  And it is not possible to know this ordering.
> 
> Actually, why do you need this? There is no absolute need that you
> finish the request. You must either finish the request or let yourself
> be frozen.
> 
> A quick look through fuse reveals principally request_wait_answer()
> And maybe a few other places. Is there some hidden reason you cannot
> handle being frozen here?

Yes, fuse could handle being frozen there.  However that would only
solve part of the problem: an operation waiting for a reply could be
holding a VFS mutex and some other task may be blocked on that mutex.

How would you solve freezing those tasks?

Miklos

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 19:44                                       ` Miklos Szeredi
@ 2007-07-05 20:19                                         ` Rafael J. Wysocki
  2007-07-05 20:38                                           ` Miklos Szeredi
  2007-07-05 20:34                                         ` Oliver Neukum
  2007-07-05 23:05                                         ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 20:19 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, pavel, paulus, stern, johannes, linux-pm, linux-kernel,
	mjg59, benh

On Thursday, 5 July 2007 21:44, Miklos Szeredi wrote:
> > On Thursday, 5 July 2007, Miklos Szeredi wrote:
> > > > > Actually fuse allows SIGKILL, because it's always fatal, and the
> > > > > syscall may not be restarted.
> > > > 
> > > > I think you want to stick try_to_freeze() at the same places where you
> > > > do SIGKILL handling. That should solve the 'syslogd is unfreezeable'
> > > > problem.
> > > 
> > > I could, but it would not solve the general problem.  Namely, that the
> > > presence of fuse imposes a certain ordering in which userspace tasks
> > > have to be frozen.  And it is not possible to know this ordering.
> > 
> > Actually, why do you need this? There is no absolute need that you
> > finish the request. You must either finish the request or let yourself
> > be frozen.
> > 
> > A quick look through fuse reveals principally request_wait_answer()
> > And maybe a few other places. Is there some hidden reason you cannot
> > handle being frozen here?
> 
> Yes, fuse could handle being frozen there.  However that would only
> solve part of the problem: an operation waiting for a reply could be
> holding a VFS mutex and some other task may be blocked on that mutex.
> 
> How would you solve freezing those tasks?

How probable is this situation?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 19:44                                       ` Miklos Szeredi
  2007-07-05 20:19                                         ` Rafael J. Wysocki
@ 2007-07-05 20:34                                         ` Oliver Neukum
  2007-07-05 20:46                                           ` Miklos Szeredi
  2007-07-05 23:05                                         ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05 20:34 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: pavel, paulus, stern, johannes, rjw, linux-pm, linux-kernel, mjg59, benh

On Thursday, 5 July 2007, Miklos Szeredi wrote:
> > On Thursday, 5 July 2007, Miklos Szeredi wrote:
> > > > > Actually fuse allows SIGKILL, because it's always fatal, and the
> > > > > syscall may not be restarted.
> > > > 
> > > > I think you want to stick try_to_freeze() at the same places where you
> > > > do SIGKILL handling. That should solve the 'syslogd is unfreezeable'
> > > > problem.
> > > 
> > > I could, but it would not solve the general problem.  Namely, that the
> > > presence of fuse imposes a certain ordering in which userspace tasks
> > > have to be frozen.  And it is not possible to know this ordering.
> > 
> > Actually, why do you need this? There is no absolute need that you
> > finish the request. You must either finish the request or let yourself
> > be frozen.
> > 
> > A quick look through fuse reveals principally request_wait_answer()
> > And maybe a few other places. Is there some hidden reason you cannot
> > handle being frozen here?
> 
> Yes, fuse could handle being frozen there.  However that would only
> solve part of the problem: an operation waiting for a reply could be
> holding a VFS mutex and some other task may be blocked on that mutex.
> 
> How would you solve freezing those tasks?

OK, you made me reach for literature on theoretical computer science.

IMHO the range of actions a fuse server can take is inherently limited.
You must never ever block on a lock one of your clients is holding. In
this case the limitation is not influenced by the freezer.

The freezer introduces a further limitation in that the server can freeze
before the client, which must not happen. You can prevent that by freezing
the servers last.

In principle you might have dependencies between servers and you won't
catch that, true. You won't catch servers blocking on IPC, but you are
balancing on the edge of deadlock with fuse anyway.
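
The "freeze the servers last" ordering can be pictured with a small
userspace model. This is only a sketch under stated assumptions: the
task_class field, struct task and freeze_all() are hypothetical names
made up for illustration, and the real freezer has no such
classification. It also does nothing about dependencies between
servers, which is the gap conceded above.

```c
#include <stddef.h>

/* Hypothetical task classes; not a field the real freezer has. */
enum task_class { TASK_CLIENT, TASK_SERVER };

struct task {
	enum task_class cls;
	int frozen;      /* nonzero once the task has been frozen */
	int freeze_seq;  /* position in the freezing order, for inspection */
};

/* Freeze every not-yet-frozen task of one class, numbering from *seq. */
static void freeze_class(struct task *tasks, size_t n,
			 enum task_class cls, int *seq)
{
	for (size_t i = 0; i < n; i++) {
		if (tasks[i].cls == cls && !tasks[i].frozen) {
			tasks[i].frozen = 1;
			tasks[i].freeze_seq = (*seq)++;
		}
	}
}

/* Two fixed passes: all clients first, then all servers. */
void freeze_all(struct task *tasks, size_t n)
{
	int seq = 0;

	freeze_class(tasks, n, TASK_CLIENT, &seq);
	freeze_class(tasks, n, TASK_SERVER, &seq);
}
```

With one server and two clients in any array order, the server always
receives the highest freeze_seq, so no client is left waiting on a
server that has already been frozen.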

	Regards
		Oliver

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 20:19                                         ` Rafael J. Wysocki
@ 2007-07-05 20:38                                           ` Miklos Szeredi
  2007-07-05 21:01                                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 20:38 UTC (permalink / raw)
  To: rjw
  Cc: miklos, oliver, pavel, paulus, stern, johannes, linux-pm,
	linux-kernel, mjg59, benh

> > > > > > Actually fuse allows SIGKILL, because it's always fatal, and the
> > > > > > syscall may not be restarted.
> > > > > 
> > > > > I think you want to stick try_to_freeze() at the same places where you
> > > > > do SIGKILL handling. That should solve the 'syslogd is unfreezeable'
> > > > > problem.
> > > > 
> > > > I could, but it would not solve the general problem.  Namely, that the
> > > > presence of fuse imposes a certain ordering in which userspace tasks
> > > > have to be frozen.  And it is not possible to know this ordering.
> > > 
> > > Actually, why do you need this? There is no absolute need that you
> > > finish the request. You must either finish the request or let yourself
> > > be frozen.
> > > 
> > > A quick look through fuse reveals principally request_wait_answer()
> > > And maybe a few other places. Is there some hidden reason you cannot
> > > handle being frozen here?
> > 
> > Yes, fuse could handle being frozen there.  However that would only
> > solve part of the problem: an operation waiting for a reply could be
> > holding a VFS mutex and some other task may be blocked on that mutex.
> > 
> > How would you solve freezing those tasks?
> 
> How probable is this situation?

I guess it depends on usage patterns.

I don't remember seeing any such cases, even though I have a permanent
fuse mount, and I suspend regularly.  But the fs is probably totally
idle during suspend in my case.

But some people use fuse as a home or root filesystem and in those
cases it can become quite likely to cause problems.

Miklos

* Re: [linux-pm] [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 17:42                                           ` Miklos Szeredi
@ 2007-07-05 20:43                                             ` Alan Stern
  0 siblings, 0 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-05 20:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: paulus, rjw, mjg59, linux-pm, linux-kernel

On Thu, 5 Jul 2007, Miklos Szeredi wrote:

> > > Alan Stern writes:
> > > 
> > > > Remember what I wrote a few minutes ago about khubd and ksuspend_usbd
> > > > wanting to resume devices during a system suspend transition?  This is
> > > > exactly what happens when those threads aren't frozen.
> > > 
> > > So, I wonder why I don't see that error on my powerbook?
> > 
> > That's a good question.  Miklos, can you please reproduce the suspend 
> > error using a kernel built with CONFIG_USB_DEBUG turned on?
> 
> Here's the full dmesg.  First there was a successful suspend/resume,
> then a couple of unsuccessful ones:

Okay, good.  It's not a coincidence that the first one worked and later 
ones didn't.  Here's what happened:

You suspended the system.  Upon resuming, the USB host controller 
drivers detected that their device sessions had been interrupted and 
consequently marked your two USB devices (the optical mouse and the 
biometric thingy) for removal.  Normally that removal would take place 
when khubd wakes up, i.e., after the resume is complete and the thread 
is taken out of the freezer.

But in this case khubd was already awake.  It resumed the two root hubs
in response to their remote wakeup requests and saw that the devices
were gone.  So it unregistered the device structures immediately --
even before the PM core tried to resume them -- and then registered new
data structures to embody the new device connections.  As far as I can
tell, the fact that these new device structures were created while the
suspend-list was in flux must have caused one of them (the biometric
coprocessor) not to be added to the appropriate list.

Thus the next time you tried to suspend, the PM core didn't call the
device's suspend method.  It was left awake, and as a result its parent 
hub refused to suspend (since it had an unsuspended child).  This 
caused the overall system suspend to fail.

This illustrates the folly of allowing other threads to perform 
driver-core-type operations in the middle of a system sleep transition.
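
The failure chain above (device missing from the suspend list, so its
suspend method is never called, so its parent hub refuses with -16,
i.e. -EBUSY) can be reproduced in a toy userspace model. This is a
sketch only: struct dev, dev_suspend() and suspend_all() are
hypothetical names, not the driver core's data structures, and the
array is assumed to list children before their parents.

```c
#include <errno.h>
#include <stddef.h>

struct dev {
	struct dev *parent;   /* NULL for a root hub */
	int on_suspend_list;  /* nonzero if the PM core knows about it */
	int suspended;
};

/* A device refuses to suspend while any of its children is awake. */
static int dev_suspend(struct dev *d, const struct dev *all, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (all[i].parent == d && !all[i].suspended)
			return -EBUSY;  /* shows up as "error -16" in the log */
	d->suspended = 1;
	return 0;
}

/* Walk the list; devices that are not on it are silently skipped. */
int suspend_all(struct dev *all, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int err;

		if (!all[i].on_suspend_list)
			continue;  /* its suspend method is never called */
		err = dev_suspend(&all[i], all, n);
		if (err)
			return err;
	}
	return 0;
}
```

A child that was registered while the list was in flux, and so has
on_suspend_list == 0, is enough to make the whole transition fail at
its parent.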

Alan Stern


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 20:34                                         ` Oliver Neukum
@ 2007-07-05 20:46                                           ` Miklos Szeredi
  2007-07-05 20:49                                             ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 20:46 UTC (permalink / raw)
  To: oliver
  Cc: miklos, pavel, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

> > Yes, fuse could handle being frozen there.  However that would only
> > solve part of the problem: an operation waiting for a reply could be
> > holding a VFS mutex and some other task may be blocked on that mutex.
> > 
> > How would you solve freezing those tasks?
> 
> OK, you made me reach for literature on theoretical computer science.
> 
> IMHO the range of actions a fuse server can take is inherently limited.
> You must never ever block on a lock one of your clients is holding. In
> this case the limitation is not influenced by the freezer.

Obviously.  But I wasn't talking about the server trying to acquire a lock
held by a client.  I was talking about a client trying to acquire a
lock held by _another_ client.

If this coincides with the server (or some other task which the server
is depending on) being frozen before the clients, the freezer has a
problem.

> The freezer introduces a further limitation in that the server can freeze
> before the client, which must not be. You can prevent that by freezing
> the servers last.
> 
> In principle you might have dependencies between servers and you won't
> catch that, true. You won't catch servers blocking on IPC, but you are
> balancing on the edge of deadlock with fuse anyway.

Huh?

Miklos

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 20:46                                           ` Miklos Szeredi
@ 2007-07-05 20:49                                             ` Oliver Neukum
  2007-07-05 20:53                                               ` Oliver Neukum
                                                                 ` (2 more replies)
  0 siblings, 3 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05 20:49 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: pavel, paulus, stern, johannes, rjw, linux-pm, linux-kernel, mjg59, benh

On Thursday, 5 July 2007, Miklos Szeredi wrote:
> > > Yes, fuse could handle being frozen there.  However that would only
> > > solve part of the problem: an operation waiting for a reply could be
> > > holding a VFS mutex and some other task may be blocked on that mutex.
> > > 
> > > How would you solve freezing those tasks?
> > 
> > OK, you made me reach for literature on theoretical computer science.
> > 
> > IMHO the range of actions a fuse server can take is inherently limited.
> > You must never ever block on a lock one of your clients is holding. In
> > this case the limitation is not influenced by the freezer.
> 
> Obviously.  But I wasn't talking about the server trying to acquire a lock
> held by a client.  I was talking about a client trying to acquire a
> lock held by _another_ client.
> 
> If this coincides with the server (or some other task which the server
> is depending on) being frozen before the clients, the freezer has a
> problem.

True, but that case can only happen if servers are frozen before clients.
You don't need a full dependency graph. A simple set sequence of two
classes of tasks will do.

	Regards
		Oliver


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 20:49                                             ` Oliver Neukum
@ 2007-07-05 20:53                                               ` Oliver Neukum
  2007-07-05 21:06                                               ` Alan Stern
  2007-07-05 21:07                                               ` Miklos Szeredi
  2 siblings, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05 20:53 UTC (permalink / raw)
  To: linux-pm; +Cc: Miklos Szeredi, mjg59, linux-kernel, pavel, johannes

On Thursday, 5 July 2007, Oliver Neukum wrote:
> On Thursday, 5 July 2007, Miklos Szeredi wrote:
> > > > Yes, fuse could handle being frozen there.  However that would only
> > > > solve part of the problem: an operation waiting for a reply could be
> > > > holding a VFS mutex and some other task may be blocked on that mutex.
> > > > 
> > > > How would you solve freezing those tasks?
> > > 
> > > OK, you made me reach for literature on theoretical computer science.
> > > 
> > > IMHO the range of actions a fuse server can take is inherently limited.
> > > You must never ever block on a lock one of your clients is holding. In
> > > this case the limitation is not influenced by the freezer.
> > 
> > Obviously.  But I wasn't talking about the server trying to acquire a lock
> > held by a client.  I was talking about a client trying to acquire a
> > lock held by _another_ client.
> > 
> > If this coincides with the server (or some other task which the server
> > is depending on) being frozen before the clients, the freezer has a
> > problem.
> 
> True, but that case can only happen if servers are frozen before clients.
> You don't need a full dependency graph. A simple set sequence of two
> classes of tasks will do.

And replying to myself: a deadlock here is not fatal. You can and will
time out in the freezer and can try again.
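
The "time out and try again" behaviour can be sketched as a bounded
polling loop wrapped in a bounded retry loop. Everything here is an
illustrative assumption: poll_fn, freeze_attempt(), freeze_with_retry()
and countdown_poll() are hypothetical names, and a poll count stands in
for a real timeout so that the example stays deterministic.

```c
/* Returns nonzero once every task has reached the refrigerator. */
typedef int (*poll_fn)(void *ctx);

/* One attempt: poll up to max_polls times, then give up ("timeout"). */
static int freeze_attempt(poll_fn all_frozen, void *ctx, int max_polls)
{
	for (int i = 0; i < max_polls; i++)
		if (all_frozen(ctx))
			return 0;
	return -1;  /* timed out; the caller would thaw and may retry */
}

/* Returns the number of attempts used, or -1 if all of them timed out. */
int freeze_with_retry(poll_fn all_frozen, void *ctx,
		      int max_polls, int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++)
		if (freeze_attempt(all_frozen, ctx, max_polls) == 0)
			return attempt;
	return -1;
}

/* Stand-in poller: reports "not frozen" for the first *ctx polls. */
int countdown_poll(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return 0;
	}
	return 1;
}
```

The sketch assumes that whatever blocked the first attempt has cleared
by the time of a later one. If the blocking dependency persists, every
attempt times out and the suspend still fails.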

	Regards
		Oliver


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 20:38                                           ` Miklos Szeredi
@ 2007-07-05 21:01                                             ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-05 21:01 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, pavel, paulus, stern, johannes, linux-pm, linux-kernel,
	mjg59, benh

On Thursday, 5 July 2007 22:38, Miklos Szeredi wrote:
> > > > > > > Actually fuse allows SIGKILL, because it's always fatal, and the
> > > > > > > syscall may not be restarted.
> > > > > > 
> > > > > > I think you want to stick try_to_freeze() at the same places where you
> > > > > > do SIGKILL handling. That should solve the 'syslogd is unfreezeable'
> > > > > > problem.
> > > > > 
> > > > > I could, but it would not solve the general problem.  Namely, that the
> > > > > presence of fuse imposes a certain ordering in which userspace tasks
> > > > > have to be frozen.  And it is not possible to know this ordering.
> > > > 
> > > > Actually, why do you need this? There is no absolute need that you
> > > > finish the request. You must either finish the request or let yourself
> > > > be frozen.
> > > > 
> > > > A quick look through fuse reveals principally request_wait_answer()
> > > > And maybe a few other places. Is there some hidden reason you cannot
> > > > handle being frozen here?
> > > 
> > > Yes, fuse could handle being frozen there.  However that would only
> > > solve part of the problem: an operation waiting for a reply could be
> > > holding a VFS mutex and some other task may be blocked on that mutex.
> > > 
> > > How would you solve freezing those tasks?
> > 
> > How probable is this situation?
> 
> I guess it depends on usage patterns.
> 
> I don't remember seeing any such cases, even though I have a permanent
> fuse mount, and I suspend regularly.  But the fs is probably totally
> idle during suspend in my case.
> 
> But some people use fuse as a home or root filesystem and in those
> cases it can become quite likely to cause problems.

That means we need to prevent the freezer from sending freeze requests to FUSE's
filesystem servers too early.

What about this (more or less):
1) We add a list of userland processes to the freezer that should not be frozen
in the first phase (i.e. during the freezing of user space).  They will be frozen
anyway along with the freezable kernel threads.
2) FUSE adds its filesystem servers to this list upon connecting to them.
3) FUSE removes its filesystem servers from this list upon disconnecting.
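
A rough userspace model of the three steps above, to make the proposal
concrete. It is only a sketch: freezer_defer_add(), freezer_defer_del(),
freezer_is_deferred() and freeze_phase_for() are hypothetical names, a
fixed-size PID array stands in for whatever list the freezer would
really keep, and locking is left out entirely.

```c
#include <stddef.h>

#define MAX_DEFERRED 16

/* Hypothetical registry of PIDs to skip in the first freezing phase. */
static int deferred[MAX_DEFERRED];
static size_t nr_deferred;

/* Step 2: called when FUSE connects to a filesystem server. */
int freezer_defer_add(int pid)
{
	if (nr_deferred == MAX_DEFERRED)
		return -1;
	deferred[nr_deferred++] = pid;
	return 0;
}

/* Step 3: called when FUSE disconnects from the server. */
void freezer_defer_del(int pid)
{
	for (size_t i = 0; i < nr_deferred; i++) {
		if (deferred[i] == pid) {
			/* Order does not matter: fill the hole with the last entry. */
			deferred[i] = deferred[--nr_deferred];
			return;
		}
	}
}

int freezer_is_deferred(int pid)
{
	for (size_t i = 0; i < nr_deferred; i++)
		if (deferred[i] == pid)
			return 1;
	return 0;
}

/*
 * Step 1: ordinary user tasks are frozen in phase 1; deferred tasks
 * are frozen in phase 2, together with the freezable kernel threads.
 */
int freeze_phase_for(int pid, int is_kernel_thread)
{
	if (is_kernel_thread || freezer_is_deferred(pid))
		return 2;
	return 1;
}
```

Only processes that FUSE itself registers end up in phase 2. Helper
processes that a server depends on but that never open the fuse device
would still be frozen in phase 1.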

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 20:49                                             ` Oliver Neukum
  2007-07-05 20:53                                               ` Oliver Neukum
@ 2007-07-05 21:06                                               ` Alan Stern
  2007-07-05 21:15                                                 ` Oliver Neukum
  2007-07-05 21:07                                               ` Miklos Szeredi
  2 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-05 21:06 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Miklos Szeredi, pavel, paulus, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

On Thu, 5 Jul 2007, Oliver Neukum wrote:

> > Obviously.  But I wasn't talking about the server trying to acquire a lock
> > held by a client.  I was talking about a client trying to acquire a
> > lock held by _another_ client.
> > 
> > If this coincides with the server (or some other task which the server
> > is depending on) being frozen before the clients, the freezer has a
> > problem.
> 
> True, but that case can only happen if servers are frozen before clients.
> You don't need a full dependency graph. A simple set sequence of two
> classes of tasks will do.

Just to make things more complicated...  Since a server isn't
restricted in what it can do, what happens when one server depends on
another server?

Alan Stern


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 20:49                                             ` Oliver Neukum
  2007-07-05 20:53                                               ` Oliver Neukum
  2007-07-05 21:06                                               ` Alan Stern
@ 2007-07-05 21:07                                               ` Miklos Szeredi
  2007-07-05 21:15                                                 ` Alan Stern
  2 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 21:07 UTC (permalink / raw)
  To: oliver
  Cc: miklos, pavel, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

> > > > Yes, fuse could handle being frozen there.  However that would only
> > > > solve part of the problem: an operation waiting for a reply could be
> > > > holding a VFS mutex and some other task may be blocked on that mutex.
> > > > 
> > > > How would you solve freezing those tasks?
> > > 
> > > OK, you made me reach for literature on theoretical computer science.
> > > 
> > > IMHO the range of actions a fuse server can take is inherently limited.
> > > You must never ever block on a lock one of your clients is holding. In
> > > this case the limitation is not influenced by the freezer.
> > 
> > Obviously.  But I wasn't talking about the server trying to acquire a lock
> > held by a client.  I was talking about a client trying to acquire a
> > lock held by _another_ client.
> > 
> > If this coincides with the server (or some other task which the server
> > is depending on) being frozen before the clients, the freezer has a
> > problem.
> 
> True, but that case can only happen if servers are frozen before clients.
> You don't need a full dependency graph. A simple set sequence of two
> classes of tasks will do.

Umm, let's take sshfs, it has a separate "reply processing thread",
that reads replies from the sftp server and then wakes up the relevant
request thread.  The reply processing thread has no direct contact
with the fuse kernel module.  How is freezer supposed to know that
that task in fact also belongs to the server and needs to be kept from
freezing?

Granted, it's in the same thread group, but using that is a rather weak
heuristic, as it could easily be a separate process, and likely in
some filesystems it is.

I fear that your efforts to "save" the freezer are in vain.  It is
already moderately hackish with that PF_FREEZER_SKIP and the kernel
dotted randomly with try_to_freeze() calls, but adding bandaids to try
to freeze userspace processes in the right order would just
make it a horrible mess.

Miklos

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 21:07                                               ` Miklos Szeredi
@ 2007-07-05 21:15                                                 ` Alan Stern
  2007-07-05 21:26                                                   ` Miklos Szeredi
  2007-07-05 21:37                                                   ` Oliver Neukum
  0 siblings, 2 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-05 21:15 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, pavel, paulus, johannes, rjw, linux-pm, linux-kernel,
	mjg59, benh

On Thu, 5 Jul 2007, Miklos Szeredi wrote:

> I fear, that your efforts to "save" the freezer are in vain.  It is
> already moderately hackish with that PF_FREEZER_SKIP and the kernel
> dotted randomly with try_to_freeze() calls, but adding bandaids to try
> to order freezing userspace processes in the right order would just
> make it a horrible mess.

I agree that bandaids won't work.  What's needed is something more 
radical.  Things like FUSE must be written so that the kernel parts 
_can_ freeze even while they are waiting for a response from a user 
thread.

Alan Stern


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 21:06                                               ` Alan Stern
@ 2007-07-05 21:15                                                 ` Oliver Neukum
  2007-07-05 21:31                                                   ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05 21:15 UTC (permalink / raw)
  To: Alan Stern
  Cc: Miklos Szeredi, pavel, paulus, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

On Thursday, 5 July 2007, Alan Stern wrote:
> On Thu, 5 Jul 2007, Oliver Neukum wrote:
> 
> > > Obviously.  But I wasn't talking about the server trying to acquire a lock
> > > held by a client.  I was talking about a client trying to acquire a
> > > lock held by _another_ client.
> > > 
> > > If this coincides with the server (or some other task which the server
> > > is depending on) being frozen before the clients, the freezer has a
> > > problem.
> > 
> > True, but that case can only happen if servers are frozen before clients.
> > You don't need a full dependency graph. A simple set sequence of two
> > classes of tasks will do.
> 
> Just to make things more complicated...  Since a server isn't
> restricted in what it can do, what happens when one server depends on
> another server?

The same principle applies. If you really want that, you can solve this
by freezing servers in the reverse of the order in which they were started.

The main point remains. If you have a circular dependency anywhere
among the servers you can deadlock independent of the freezer.

	Regards
		Oliver

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 21:15                                                 ` Alan Stern
@ 2007-07-05 21:26                                                   ` Miklos Szeredi
  2007-07-05 21:37                                                   ` Oliver Neukum
  1 sibling, 0 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 21:26 UTC (permalink / raw)
  To: stern
  Cc: miklos, oliver, pavel, paulus, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

> > I fear, that your efforts to "save" the freezer are in vain.  It is
> > already moderately hackish with that PF_FREEZER_SKIP and the kernel
> > dotted randomly with try_to_freeze() calls, but adding bandaids to try
> > to order freezing userspace processes in the right order would just
> > make it a horrible mess.
> 
> I agree that bandaids won't work.  What's needed is something more 
> radical.  Things like FUSE must be written so that the kernel parts 
> _can_ freeze even while they are waiting for a response from a user 
> thread.

This has already been discussed, with the conclusion that it can't be
done without hacking VFS internals.

The basic problem is that the freezer tries to get every user process
out of the kernel even when those processes have _nothing_ to do with
drivers and could happily stay in kernel land across a suspend or even
hibernate.

If we could have a good grip on when a request is entering a driver,
it would be easy to take care of this.  I guess network and block
devices are easy.  For others there's no obvious common place where
such barriers could be placed so it's more work, but nothing
conceptually problematic.  Is this about right?

Miklos

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 21:15                                                 ` Oliver Neukum
@ 2007-07-05 21:31                                                   ` Miklos Szeredi
  0 siblings, 0 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-05 21:31 UTC (permalink / raw)
  To: oliver
  Cc: stern, miklos, pavel, paulus, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

> > On Thu, 5 Jul 2007, Oliver Neukum wrote:
> > 
> > > > Obviously.  But I wasn't talking about the server trying to acquire a lock
> > > > held by a client.  I was talking about a client trying to acquire a
> > > > lock held by _another_ client.
> > > > 
> > > > If this coincides with the server (or some other task which the server
> > > > is depending on) being frozen before the clients, the freezer has a
> > > > problem.
> > > 
> > > True, but that case can only happen if servers are frozen before clients.
> > > You don't need a full dependency graph. A simple set sequence of two
> > > classes of tasks will do.
> > 
> > Just to make things more complicated...  Since a server isn't
> > restricted in what it can do, what happens when one server depends on
> > another server?
> 
> The same principle applies. If you really want that you can solve this
> by freezing servers in the reverse sequence they were started.

You just can't know what constitutes a "server".  Processes which read
from the fuse device are candidates, but so are all tasks which
communicate with them in some way.  That basically makes every
userspace task in the system a candidate server, and you are no further
along than before.

> The main point remains. If you have a circular dependency anywhere
> among the servers you can deadlock independent of the freezer.

Well, Duh.

Miklos

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 21:15                                                 ` Alan Stern
  2007-07-05 21:26                                                   ` Miklos Szeredi
@ 2007-07-05 21:37                                                   ` Oliver Neukum
  2007-07-06  7:13                                                     ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-05 21:37 UTC (permalink / raw)
  To: Alan Stern
  Cc: Miklos Szeredi, pavel, paulus, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

Am Donnerstag, 5. Juli 2007 schrieb Alan Stern:
> On Thu, 5 Jul 2007, Miklos Szeredi wrote:
> 
> > I fear, that your efforts to "save" the freezer are in vain.  It is
> > already moderately hackish with that PF_FREEZER_SKIP and the kernel
> > dotted randomly with try_to_freeze() calls, but adding bandaids to try
> > to order freezing userspace processes in the right order would just
> > make it a horrible mess.
> 
> I agree that bandaids won't work.  What's needed is something more 
> radical.  Things like FUSE must be written so that the kernel parts 
> _can_ freeze even while they are waiting for a response from a user 
> thread.

OK, some radical ideas.

In principle we want a deadlock here. Tasks that are frozen due to fuse
are as good as frozen, there's no need to formally freeze them.
The bad uninterruptible tasks are those waiting for hardware.

Can we detect the difference? Perhaps we can label those crucial locks
and when only tasks in the refrigerator and tasks waiting on these locks
are left we are content and go on to suspending the device tree.

	Flame away
		Oliver
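
The check proposed above could be sketched as follows. This is a user-space
C model under stated assumptions, not kernel code: the task states and the
function name are hypothetical, and "labeled lock" stands for whatever
marking the crucial locks would get.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task states, used by this model only. */
enum sk_task_state {
	SK_RUNNABLE,		/* still executing; must be frozen first */
	SK_IN_REFRIGERATOR,	/* formally frozen */
	SK_WAIT_LABELED_LOCK,	/* blocked on a lock marked "as good as frozen" */
	SK_WAIT_HARDWARE,	/* uninterruptible sleep waiting for a device */
};

/*
 * Return true when every task is either in the refrigerator or blocked on
 * a labeled lock, i.e. when suspending the device tree may begin.
 */
static bool sk_ready_for_device_suspend(const enum sk_task_state *tasks,
					size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (tasks[i] != SK_IN_REFRIGERATOR &&
		    tasks[i] != SK_WAIT_LABELED_LOCK)
			return false;
	}
	return true;
}
```

A task waiting for hardware keeps the predicate false, which matches the
distinction drawn above between "bad" uninterruptible sleepers and tasks
that are merely stuck behind fuse.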

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 13:59                       ` Rafael J. Wysocki
@ 2007-07-05 21:49                         ` Nigel Cunningham
  2007-07-06  7:40                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-05 21:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: nigel, Pavel Machek, Oliver Neukum, Miklos Szeredi, benh, mjg59,
	linux-kernel, linux-pm

[-- Attachment #1: Type: text/plain, Size: 4382 bytes --]

Good morning!

On Thursday 05 July 2007 23:59:57 Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 15:36, Nigel Cunningham wrote:
> > On Thursday 05 July 2007 23:35:45 Rafael J. Wysocki wrote:
> > > On Thursday, 5 July 2007 14:38, Nigel Cunningham wrote:
> > > > On Thursday 05 July 2007 22:25:06 Rafael J. Wysocki wrote:
> > > > > On Thursday, 5 July 2007 01:45, Pavel Machek wrote:
> > > > > > On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> > > > > > > Am Dienstag, 3. Juli 2007 schrieb Miklos Szeredi:
> > > > > > > > > And a further question. The freezer is not atomic. What do you do
> > > > > > > > > if a task not yet frozen calls sys_sync(), but fuse is already
> > > > > > > > > frozen?
> > > > > > > > 
> > > > > > > > What do you do if a task not yet frozen writes to a pipe, on the other
> > > > > > > > end of which is a task already frozen?
> > > > > > 
> > > > > > There's some difference between uninterruptible and interruptible
> > > > > > sleep I'd say.
> > > > > > 
> > > > > > > > It doesn't matter.  The only thing that should matter during suspend
> > > > > > > > (not hibernate) is saving the state of devices to ram, and putting the
> > > > > > > > devices to sleep.
> > > > > > > 
> > > > > > > Well, but you did remove sys_sync() from the freezer, which is
> > > > > > > and must be called in the hibernate path.
> > > > > > 
> > > > > > Not "must". In fact, hibernation should be safe without sys_sync(). It
> > > > > > is just user un-friendly.
> > > > > 
> > > > > In fact, I'd like to remove the sys_sync() from the freezer entirely, because
> > > > > it just doesn't belong in there.
> > > > > 
> > > > > The only advantage of having sys_sync() in freeze_processes() is that we
> > > > > have a chance to write out everything when applications cannot produce more
> > > > > data to write, but there are filesystems which don't do that anyway (eg. XFS),
> > > > > so generally there's no reason to bother.
> > > > 
> > > > Shouldn't XFS - and fuse - be considered to be broken? Sync should sync data
> > > > and if XFS isn't doing that, it's wrong.
> > > > 
> > > > In the case of fuse, we should have a mechanism by which fuse processes can be
> > > > made to sync if they do have any pending I/O, and by which they can be frozen
> > > > later than other userspace processes.
> > > > 
> > > > I'd like to see the sync stay, because it improves reliability and data
> > > > integrity in the fail-to-resume case. Calling scripts would probably invoke
> > > > sync themselves if they don't already, but that's racy. As it is at the
> > > > moment, we know userspace is stopped, so syncing isn't racy.
> > > 
> > > I'd like to move the sync out of the freezer, but to call it from the
> > > suspend/hibernation code, so that we do
> > > 
> > > sys_sync();
> > > error = freeze_processes();
> > 
> > Yeah, I understand that. The problem then is that you're racing against
> > userspace. That's not usually a problem, but that doesn't mean it's never a
> > problem. Try running the stress suite while testing hibernating and you'll
> > see what I mean. If something is submitting lots of I/O when you try to
> > suspend, your sync call will race against that process if it's not yet
> > frozen, and its continued activity will make your sync pointless (there'll be
> > more unsynced data when your sys_sync call finishes). Stopping userspace
> > before syncing removes that race.
> 
> Yes, that will make the suspend/hibernation less reliable in case the resume
> fails (some data, written after the sync, may be lost).  However, the sync done
> from within the freezer doesn't guarantee that there are no data lost anyway,
> so we don't lose much by not doing it.
> 
> Now, there's a question how much data may be lost, potentially, if we do the
> sync before the freezer and I don't think that's a lot.

You're missing the point. I'm arguing that a sync from within the freezer 
should guarantee that there is no data loss. As I said above, XFS should be 
fixed to properly sync its data, and something should be done about fuse 
filesystems too.
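
The ordering question can be illustrated with a toy model. This is only a
sketch: it assumes a single writer task, assumes that sync really does write
out every dirty page (which, as noted above, is not true of every
filesystem), and uses hypothetical names throughout.

```c
#include <stdbool.h>

/* Toy model: one writer task and a count of dirty (unsynced) pages. */
struct sk_model {
	unsigned int dirty_pages;
	bool writer_frozen;
};

/* The writer dirties pages only while it is not frozen. */
static void sk_writer_runs(struct sk_model *s, unsigned int pages)
{
	if (!s->writer_frozen)
		s->dirty_pages += pages;
}

static void sk_sync(struct sk_model *s)   { s->dirty_pages = 0; }
static void sk_freeze(struct sk_model *s) { s->writer_frozen = true; }

/* sync first, then freeze: the writer can slip in between the two steps. */
static unsigned int sk_sync_then_freeze(unsigned int initial,
					unsigned int racing)
{
	struct sk_model s = { .dirty_pages = initial, .writer_frozen = false };

	sk_sync(&s);
	sk_writer_runs(&s, racing);	/* window between sync and freeze */
	sk_freeze(&s);
	return s.dirty_pages;		/* pages left unsynced at suspend */
}

/* freeze first, then sync: nothing can dirty pages after the sync. */
static unsigned int sk_freeze_then_sync(unsigned int initial,
					unsigned int racing)
{
	struct sk_model s = { .dirty_pages = initial, .writer_frozen = false };

	sk_freeze(&s);
	sk_writer_runs(&s, racing);	/* no effect: the writer is frozen */
	sk_sync(&s);
	return s.dirty_pages;
}
```

In this model sk_sync_then_freeze(10, 3) leaves 3 dirty pages behind while
sk_freeze_then_sync(10, 3) leaves none. The model says nothing about the
fuse case, where the sync itself may need a task that is already frozen.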

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  9:31                               ` Miklos Szeredi
  2007-07-05 11:54                                 ` Pavel Machek
  2007-07-05 11:58                                 ` [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway Rafael J. Wysocki
@ 2007-07-05 22:04                                 ` Pavel Machek
  2007-07-06  7:07                                   ` Miklos Szeredi
  2007-07-06  7:16                                   ` Rafael J. Wysocki
  2 siblings, 2 replies; 388+ messages in thread
From: Pavel Machek @ 2007-07-05 22:04 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, paulus, stern, johannes, rjw, linux-pm, linux-kernel,
	mjg59, benh

Hi!

> > > > > > I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> > > > > > you still observe them if you use the version of the freezer which 
> > > > > > doesn't freeze kernel threads?
> > > > > 
> > > > > In general the only way to guarantee there are no deadlocks is to
> > > > > construct the graph of dependencies between tasks.  Those dependencies
> > > > > are not in practice observable from outside the tasks, so it is
> > > > > virtually impossible to construct the graph.
> > > > 
> > > > In which way can user space tasks depend on each other in a way that
> > > > allows the members of that cycle to be in uninterruptible sleep?
> > > 
> > >  - process A calls rename() on a fuse fs
> > >  - process B, the fuse server, starts to process the rename request
> > >  - process B is frozen before it can reply
> > > 
> > > Now process A is unfreezable.  We cannot make rename() restartable,
> > > hence it cannot be interruptible.
> > 
> > Yes, we are claiming fuse is very special in this regard, and perhaps
> > even broken.
> > 
> > Let's see. If I SIGSTOP the fuse server, I can get unrelated tasks
> > unkillable (even for SIGKILL!) forever.
> 
> Actually fuse allows SIGKILL, because it's always fatal, and the
> syscall may not be restarted.

Okay, and you should handle refrigerator in the same paths where you
handle SIGKILL. Just add try_to_freeze() there...
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
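
A rough sketch of the suggestion above: a wait loop that handles a freeze
request at the same point where it handles SIGKILL. This is a user-space
model, not the fuse code; the event and result names are hypothetical, and
the SK_EV_FREEZE branch only marks the place where try_to_freeze() would be
called.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical events seen by a task waiting for a reply from the server. */
enum sk_event { SK_EV_FREEZE, SK_EV_THAW, SK_EV_SIGKILL, SK_EV_REPLY };
enum sk_result { SK_REPLIED, SK_KILLED, SK_STILL_WAITING };

struct sk_waiter {
	bool frozen;		/* parked in the modelled refrigerator */
	bool reply_pending;	/* reply arrived while parked */
	unsigned int freezes;	/* number of times the waiter was parked */
};

static enum sk_result sk_wait_for_reply(struct sk_waiter *w,
					const enum sk_event *ev, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		switch (ev[i]) {
		case SK_EV_SIGKILL:
			return SK_KILLED;	/* fatal: abandon the request */
		case SK_EV_FREEZE:
			w->frozen = true;	/* try_to_freeze() would go here */
			w->freezes++;
			break;
		case SK_EV_THAW:
			w->frozen = false;	/* go back to waiting */
			if (w->reply_pending)
				return SK_REPLIED;
			break;
		case SK_EV_REPLY:
			if (!w->frozen)
				return SK_REPLIED;
			w->reply_pending = true;
			break;
		}
	}
	return SK_STILL_WAITING;
}
```

The model covers only the waiting task itself. It does not address other
tasks blocked on a mutex that the waiter holds, which is the objection
raised elsewhere in this thread.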

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  8:48                                 ` Oliver Neukum
  2007-07-05  8:58                                   ` Miklos Szeredi
@ 2007-07-05 22:38                                   ` Benjamin Herrenschmidt
  2007-07-06  7:04                                     ` Rafael J. Wysocki
  2007-07-06  7:30                                     ` Oliver Neukum
  1 sibling, 2 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-05 22:38 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Miklos Szeredi, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, pavel, mjg59


> There is that.
> 
> OK, bite the bullet. Tasks involved in fuse are special. Give them a flag
> and teach the freezer to put them on ice only after all other tasks are
> frozen. In a way they are kernel, there's no use denying that.

Yet another ugly hack to work around the fact that the freezer cannot
work reliably ... yuck

Why not spend that energy fixing drivers to properly block requests
instead ?

Ben.


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  9:30             ` [PATCH] Remove process freezer from suspend to RAM pathway Pavel Machek
@ 2007-07-05 22:46               ` Benjamin Herrenschmidt
  2007-07-05 23:13                 ` Nigel Cunningham
  0 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-05 22:46 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Rafael J. Wysocki, Nigel Cunningham, Matthew Garrett,
	linux-kernel, linux-pm, Alan Stern

On Thu, 2007-07-05 at 11:30 +0200, Pavel Machek wrote:
> 
> ...but the moment you start blocking tasks that made a driver request,
> you _do_ have mini-freezer of your own, with pretty much the same
> problems.

No, not at all the same problems. Those tasks will block, but that will
be harmless because we won't have some "freezer" things waiting for all
tasks to reach a "stable" point (calling try_to_freeze()). We just let
them block wherever we want, as long as it doesn't prevent a -driver-
from suspending, which should be all right, we have no problem.

> In another message I showed that removing the freezer will not help with
> FUSE in the general case.

I disagree.

> It probably does not help with firmware, too; as soon as udev attempts
> to do something with your wireless card, it is blocked, and if the
> wireless card needs the firmware from udev, you are deadlocked.

Firmware load has been a problem since day 1, I've talked about it
multiple times, it's broken with or without the freezer, and so far, the
reaction of pretty much everybody has been to dig their head deeper in
the mud and ignore the problem.

There are other issues (again, with or without freezer) that should be
dealt with. For example, drivers that haven't yet got their suspend()
callback or already have got their resume() may rely on services of the
kernel that are still blocked, that's where things may go hairy.
request_firmware() within resume() is a typical example of that.

There are a few things we should do in that area. For example, once we
start to call driver suspend's, we should probably set a system wide
flag that will do things such as:

 - block usermode helpers (either make call_usermodehelper return
something like -EBUSY or have it queue up the calls and issue them later
when things are resuming, we need to look closely at what semantics we
want here).

 - Silently add GFP_NOIO to all allocations, to avoid having things
blocking in kmalloc() with a mutex held that will deadlock with
suspend() in a driver for example. Or set some way to have all GFP
waiters wakeup and fail rather than wait for IOs. It's hard/bizarre but
necessary, again, with or without a freezer.

 - Deal with the firmware problem. The best way is probably to have an
async request_firmware() interface. Another thing is, drivers may want
to cache their firmware in main memory, that sort of thing...

And that's just a small list off the top of my mind, of known problems
that will cause deadlocks or misbehaviours today, with or without the
freezer, and that need to be addressed.

Ben.
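
The first two items in the list above could look roughly like this. It is a
user-space C model under stated assumptions, not a patch: the SK_GFP_* bits
are made-up stand-ins for the kernel's allocation flags, and the flag and
function names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical allocation-flag bits; the kernel's real GFP bits differ. */
#define SK_GFP_WAIT	0x1u	/* allocation may sleep */
#define SK_GFP_IO	0x2u	/* allocation may start block I/O */
#define SK_GFP_FS	0x4u	/* allocation may call into filesystems */
#define SK_GFP_KERNEL	(SK_GFP_WAIT | SK_GFP_IO | SK_GFP_FS)
#define SK_GFP_NOIO	(SK_GFP_WAIT)

/* Hypothetical system-wide flag, set once driver suspend callbacks start. */
static bool sk_devices_suspending;

static void sk_begin_device_suspend(void) { sk_devices_suspending = true; }
static void sk_end_device_resume(void)    { sk_devices_suspending = false; }

/* Refuse to spawn user-mode helpers while devices are being suspended. */
static int sk_call_usermodehelper(const char *path)
{
	(void)path;
	if (sk_devices_suspending)
		return -EBUSY;
	return 0;	/* the real helper would be launched here */
}

/* Silently degrade every allocation to NOIO behaviour during suspend. */
static unsigned int sk_effective_gfp(unsigned int flags)
{
	if (sk_devices_suspending)
		flags &= ~(SK_GFP_IO | SK_GFP_FS);
	return flags;
}
```

Returning -EBUSY is only one of the two options mentioned; queueing the
helper calls and replaying them on resume would need a list and is left out
of the sketch.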


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 14:23                         ` Alan Stern
@ 2007-07-05 22:59                           ` Benjamin Herrenschmidt
  2007-07-06  7:20                             ` Rafael J. Wysocki
                                               ` (2 more replies)
  0 siblings, 3 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-05 22:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: Paul Mackerras, Johannes Berg, Rafael J. Wysocki,
	Linux-pm mailing list, Kernel development list, Pavel Machek,
	Matthew Garrett

On Thu, 2007-07-05 at 10:23 -0400, Alan Stern wrote:
> 
> How will that help?  Block the kernel thread in the freezer or block it 
> in the driver -- either way it is blocked.  So how do your deadlocks 
> get resolved?

Because nobody is waiting on that kernel thread anyway without a freezer
so there is no deadlock anymore.

> I disagree with your analysis -- not that it's completely wrong, but it 
> points out an existing basic problem in the kernel.  The kernel should 
> never depend on userspace!  More correctly, a task executing in the 
> kernel should never block with any sort of mutex or other lock held (in 
> a way that would preclude it from being frozen, let's say) while 
> waiting for a response from userspace.
> 
> Then the dependency graph would be easy to construct: User tasks can
> depend on whatever they want, and kernel threads never depend on a user
> task.

In an ideal world, there would be no hunger...

> If this contradicts the existing implementations and APIs for userspace 
> filesystems, then so be it.  My conclusion would be that the 
> implementations and APIs should be changed.

Why you guys are working so hard and spending so much energy trying to
avoid doing the right thing is beyond my understanding...

> It _does_ apply to kernel threads.  That's exactly why I wrote above 
> that kernel threads which try to do I/O during a suspend will need 
> extra attention.

Ok none at all if you don't have a freezer.

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 19:44                                       ` Miklos Szeredi
  2007-07-05 20:19                                         ` Rafael J. Wysocki
  2007-07-05 20:34                                         ` Oliver Neukum
@ 2007-07-05 23:05                                         ` Benjamin Herrenschmidt
  2007-07-06  3:59                                           ` Jeremy Maitin-Shepard
  2007-07-06  7:32                                           ` Oliver Neukum
  2 siblings, 2 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-05 23:05 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, pavel, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, mjg59


> Yes, fuse could handle being frozen there.  However that would only
> solve part of the problem: an operation waiting for a reply could be
> holding a VFS mutex and some other task may be blocked on that mutex.
> 
> How would you solve freezing those tasks?

That task is implicitly frozen... but the kernel doesn't know it and
thus the freezer times out or fails or deadlocks or whatever.

The freezer could be made to ignore tasks that are sleeping in the
kernel assuming that if they go out of it, they'll ultimately reach
do_signal and freeze, but that means they can potentially still issue
IOs which is what the freezer tries to avoid ...

Or the kernel could start tracking dependencies, but then, good luck
implementing that crap.

At the end of the day, I stand my ground, the freezer cannot be made
reliable without massive infrastructure changes or giving up on very
useful features such as fuse among others. Besides, it only partially
"hides" the problem of requests going to drivers, thus it's a bad
solution.

We would be much better off spending time fixing the drivers to properly
block requests once suspended, and that also gives us for free the ability
to do dynamic runtime suspend on them.

And for "trivial" drivers where we don't care, using late_suspend to
power the chip off later when IRQs are off is an easy enough way to
solve it with very little code (though it won't help with dynamic PM, but
that's not necessarily an issue). No need for a freezer either way.

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 22:46               ` Benjamin Herrenschmidt
@ 2007-07-05 23:13                 ` Nigel Cunningham
  2007-07-05 23:20                   ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-05 23:13 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Pavel Machek, Rafael J. Wysocki, Matthew Garrett, linux-kernel,
	linux-pm, Alan Stern

[-- Attachment #1: Type: text/plain, Size: 3426 bytes --]

Hi.

On Friday 06 July 2007 08:46:54 Benjamin Herrenschmidt wrote:
> On Thu, 2007-07-05 at 11:30 +0200, Pavel Machek wrote:
> > 
> > ...but the moment you start blocking tasks that made a driver request,
> > you _do_ have mini-freezer of your own, with pretty much the same
> > problems.
> 
> No, not at all the same problems. Those tasks will block, but that will
> be harmless because we won't have some "freezer" things waiting for all
> tasks to reach a "stable" point (calling try_to_freeze()). We just let
> them block wherever we want, as long as it doesn't prevent a -driver-
> from suspending, which should be all right, we have no problem.

Will you be able to guarantee that every place where a task can/will block 
will be a harmless place? If so, how will you guarantee that? How will you 
debug issues where a task occasionally doesn't block in the right place, 
particularly instances where it is some less than obvious interaction with 
other tasks?

This is the whole point to having the freezer. It makes things more 
predictable and testable. It shows us, clearly, when process X is the one 
that is causing problems.

> > In another message I showed that removing the freezer will not help with
> > FUSE in the general case.
> 
> I disagree.

Why?
 
> > It probably does not help with firmware, too; as soon as udev attempts
> > to do something with your wireless card, it is blocked, and if the
> > wireless card needs the firmware from udev, you are deadlocked.
> 
> Firmware load has been a problem since day 1, I've talked about it
> multiple times, it's broken with or without the freezer, and so far, the
> reaction of pretty much everybody has been to dig their head deeper in
> the mud and ignore the problem.
> 
> There are other issues (again, with or without freezer) that should be
> dealt with. For example, drivers that haven't yet got their suspend()
> callback or already have got their resume() may rely on services of the
> kernel that are still blocked, that's where things may go hairy.
> request_firmware() within resume() is a typical example of that.
> 
> There are a few things we should do in that area. For example, once we
> start to call driver suspend's, we should probably set a system wide
> flag that will do things such as:
> 
>  - block usermode helpers (either make call_usermodehelper return
> something like -EBUSY or have it queue up the calls and issue them later
> when things are resuming, we need to look closely at what semantics we
> want here).
> 
>  - Silently add GFP_NOIO to all allocations, to avoid having things
> blocking in kmalloc() with a mutex held that will deadlock with
> suspend() in a driver for example. Or set some way to have all GFP
> waiters wakeup and fail rather than wait for IOs. It's hard/bizarre but
> necessary, again, with or without a freezer.

GFP_ATOMIC? (In driver suspend, they shouldn't be sleeping either, right?)
 
>  - Deal with the firmware problem. The best way is probably to have an
> async request_firmware() interface. Another thing is, drivers may want
> to cache their firmware in main memory, that sort of thing...
> 
> And that's just a small list off the top of my mind, of known problems
> that will cause deadlocks or misbehaviours today, with or without the
> freezer, and that need to be addressed.

Userspace device drivers too?

Regards,

Nigel

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 23:13                 ` Nigel Cunningham
@ 2007-07-05 23:20                   ` Benjamin Herrenschmidt
  2007-07-05 23:35                     ` Nigel Cunningham
  0 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-05 23:20 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Pavel Machek, Rafael J. Wysocki, Matthew Garrett, linux-kernel,
	linux-pm, Alan Stern


> Will you be able to guarantee that every place where a task can/will block 
> will be a harmless place? If so, how will you guarantee that? How will you 
> debug issues where a task occasionally doesn't block in the right place, 
> particularly instances where it is some less than obvious interaction with 
> other tasks?

Which places aren't harmless if you don't have a freezer ?

> This is the whole point to having the freezer. It makes things more 
> predictable and testable. It shows us, clearly, when process X is the one 
> that is causing problems.

No, the freezer creates all those places that are harmful for a task to
block because they will break the freezer :-)

> >  - Silently add GFP_NOIO to all allocations, to avoid having things
> > blocking in kmalloc() with a mutex held that will deadlock with
> > suspend() in a driver for example. Or set some way to have all GFP
> > waiters wakeup and fail rather than wait for IOs. It's hard/bizarre but
> > necessary, again, with or without a freezer.
> 
> GFP_ATOMIC? (In driver suspend, they shouldn't be sleeping either, right?)

NOIO should be enough I think, but ATOMIC would do.
 
That's one of the reasons why I used to have the pre-suspend and
post-resume hooks in my original powermac implementation, for those few
drivers complicated enough to require some pre-allocations.
 
> >  - Deal with the firmware problem. The best way is probably to have an
> > async request_firmware() interface. Another thing is, drivers may want
> > to cache their firmware in main memory, that sort of thing...
> >

Note that the above firmware problem could also be dealt with using the
pre-suspend/post-resume hooks. They allow a driver to pre-request firmware
etc... and keep it around until after resume, because we know we will need
it. That gives drivers a chance to do things while the system is still
live, filesystems still working, etc... (big memory allocations for
example).
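
A minimal sketch of that idea, as a user-space C model: the driver fetches
its firmware in a pre-suspend hook while userspace is still available, and
resume only touches the cached copy. The hook names, the sk_driver struct
and the four-byte "image" are all hypothetical; none of this is an existing
kernel interface.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SK_FW_MAX 64

struct sk_driver {
	unsigned char fw_cache[SK_FW_MAX];	/* firmware image kept in RAM */
	size_t fw_len;				/* 0 means nothing is cached */
};

/* False between suspend and resume, when no helper can be run. */
static bool sk_userspace_available = true;

/* Stand-in for a firmware request that needs a live userspace. */
static int sk_load_firmware(unsigned char *buf, size_t cap, size_t *len)
{
	static const unsigned char image[] = { 0xde, 0xad, 0xbe, 0xef };

	if (!sk_userspace_available)
		return -EBUSY;
	if (cap < sizeof(image))
		return -ENOMEM;
	memcpy(buf, image, sizeof(image));
	*len = sizeof(image);
	return 0;
}

/* pre-suspend hook: runs while the system is still fully live. */
static int sk_pre_suspend(struct sk_driver *d)
{
	return sk_load_firmware(d->fw_cache, sizeof(d->fw_cache), &d->fw_len);
}

/* resume: reprogram the device from the cached copy only. */
static int sk_resume(struct sk_driver *d)
{
	if (d->fw_len == 0)
		return -ENOENT;	/* nothing cached; would need userspace */
	/* uploading d->fw_cache to the device would happen here */
	return 0;
}

/* post-resume hook: userspace is back, so the cache can be dropped. */
static void sk_post_resume(struct sk_driver *d)
{
	d->fw_len = 0;
}
```

The cost is that the image stays in memory across the whole suspend, which
is the trade-off implied by keeping it around "because we know we will need
it".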

> > And that's just a small list off the top of my mind, of known problems
> > that will cause deadlocks or misbehaviours today, with or without the
> > freezer, and that need to be addressed.
> 
> Userspace device drivers too?

Maybe but they are less of an issue, most of the time, they don't do DMA
or whatever harmful things. If they are USB drivers, for example, they
are a non-issue at that level.


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 23:20                   ` Benjamin Herrenschmidt
@ 2007-07-05 23:35                     ` Nigel Cunningham
  2007-07-06  1:19                       ` Kyle Moffett
  2007-07-06  3:54                       ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-05 23:35 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Pavel Machek, Rafael J. Wysocki, Matthew Garrett, linux-kernel,
	linux-pm, Alan Stern

[-- Attachment #1: Type: text/plain, Size: 3217 bytes --]

Hi.

On Friday 06 July 2007 09:20:43 Benjamin Herrenschmidt wrote:
> 
> > Will you be able to guarantee that every place where a task can/will block 
> > will be a harmless place? If so, how will you guarantee that? How will you 
> > debug issues where a task occasionally doesn't block in the right place, 
> > particularly instances where it is some less than obvious interaction with 
> > other tasks?
> 
> Which places aren't harmless if you don't have a freezer ?

If I knew that, I wouldn't be asking the question.

> > This is the whole point to having the freezer. It makes things more 
> > predictable and testable. It shows us, clearly, when process X is the one 
> > that is causing problems.
> 
> No, the freezer creates all those places that are harmful for a task to
> block because they will break the freezer :-)

Nice try :) Okay then, you remove the freezer, try hibernating, then get back 
to me after you've fixed your filesystem because some process that wasn't 
frozen started writing things after the atomic copy (making the on disk 
filesystem inconsistent with the snapshot).

As Pavel rightly said, you can get rid of the freezer, but you're only going 
to have to implement another one that does essentially the same thing, 
even if it is at some other level.
 
> > >  - Silently add GFP_NOIO to all allocations, to avoid having things
> > > blocking in kmalloc() with a mutex held that will deadlock with
> > > suspend() in a driver for example. Or set some way to have all GFP
> > > waiters wakeup and fail rather than wait for IOs. It's hard/bizarre but
> > > necessary, again, with or without a freezer.
> > 
> > GFP_ATOMIC? (In driver suspend, they shouldn't be sleeping either, right?)
> 
> NOIO should be enough I think, but ATOMIC would do.
>  
> That's one of the reasons why I used to have the pre-suspend and
> post-resume hooks in my original powermac implementation, for those few
> drivers complicated enough to require some pre-allocations.
>  
> > >  - Deal with the firmware problem. The best way is probably to have an
> > > async request_firmware() interface. Another thing is, drivers may want
> > > to cache their firmware in main memory, that sort of thing...
> > >
> 
> Note that the above firmware problem could also be dealt with using the
> pre-suspend/post-resume hooks. They allow a driver to pre-request firmware
> etc... and keep it around until after resume, because we know we will need
> it. That gives drivers a chance to do things while the system is still
> live, filesystems still working, etc... (big memory allocations for
> example).
> 
> > > And that's just a small list off the top of my mind, of known problems
> > > that will cause deadlocks or misbehaviours today, with or without the
> > > freezer, and that need to be addressed.
> > 
> > Userspace device drivers too?
> 
> Maybe but they are less of an issue, most of the time, they don't do DMA
> or whatever harmful things. If they are USB drivers, for example, they
> are a non-issue at that level.

(Leaving the rest of the message intact so we don't have to fragment the 
discussion into a million subthreads).

Regards,

Nigel

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 23:35                     ` Nigel Cunningham
@ 2007-07-06  1:19                       ` Kyle Moffett
  2007-07-06  1:37                         ` Nigel Cunningham
                                           ` (2 more replies)
  2007-07-06  3:54                       ` Benjamin Herrenschmidt
  1 sibling, 3 replies; 388+ messages in thread
From: Kyle Moffett @ 2007-07-06  1:19 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Benjamin Herrenschmidt, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm, Alan Stern

On Jul 05, 2007, at 19:35:11, Nigel Cunningham wrote:
> On Friday 06 July 2007 09:20:43 Benjamin Herrenschmidt wrote:
>> No, the freezer creates all those places that are harmful for a  
>> task to block because they will break the freezer :-)
>
> Nice try :) Okay then, you remove the freezer, try hibernating,  
> then get back to me after you've fixed your filesystem because some  
> process that wasn't frozen started writing things after the atomic  
> copy (making the on disk filesystem inconsistent with the snapshot).

Umm, this thread is NOT ABOUT HIBERNATING!!!  Please go back and read  
the subject, specifically the "suspend to RAM" parts :-D.  When your  
hardware can put itself to sleep and atomically preserve memory as it  
does so, you don't need an atomic copy.  For Real Suspend(TM) (IE:  
Suspend-to-RAM), the list of things to do is short and simple:

1)  Stop DMA and put most hardware into low-power states (stops all  
interrupt sources)
2)  Ensure that the other CPUs have finished any trailing interrupt  
handlers and put them to sleep
3)  Put the interrupt-controllers into low-power state
4)  Go to sleep

> As Pavel rightly said, you can get rid of the freezer, but you're  
> only going to have to implement another one that does  
> essentially the same thing, even if it is at some other level.

How about a freezer whose job it is to "wait for pending hard  
interrupts to complete when we have already guaranteed that we won't  
get any more"?  That part should be really *REALLY* easy.  You don't  
need to care about either userspace processes or kernel threads at  
all.  Specifically, Step 1 consists of:

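suspend_device(dev)
{
	set_no_bind_flag(dev);		/* no new subdevices can bind */
	for_each (subdev in dev->subdevices)
		suspend_device(subdev);	/* suspend the children first */
	set_no_io_flag(dev);		/* no new I/O submissions */
	wait_for_in_progress_dma(dev);
	turn_off_interrupts(dev);
	go_to_low_power_state(dev);
}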

After you've set the "no_bind" flag, you won't get any *new*  
subdevices trying to bind, therefore it's safe to iterate over the  
list of present sub-devices and suspend them.  Once those are  
suspended and in low-power states you can set a "no_io" flag to  
prevent the driver from submitting more IO.  At that point you can  
lazily wait for existing DMA/IO/interrupts to finish on the device,  
since *NOBODY* will be submitting them anymore, and we certainly  
aren't probing for new devices.  Then you can just turn off the power  
to the device.  When all the leaf devices are off, the parent device  
can be turned off because everything waiting on the leaf devices is  
blocked on them and won't unblock until the parent device *AND* the  
leaf device are turned on again, in that order.

Scheduling and userspace are all still fully enabled in this  
scenario.  Once all your devices are turned off, the only remaining  
running threads will be those which haven't done IO since the  
beginning of the suspend.  We can then disable preemption, turn off  
the timer interrupts, and tell the other CPUs to park all their  
remaining threads in schedule() and sleep.  Then we put the IRQ  
controller to sleep and go to sleep ourselves.  If our driver model  
locking is sufficient to handle putting a parent device to sleep  
while threads are sleeping on a child device then there are exactly 0  
problems.

Resuming is basically running the whole process in reverse.  Runtime- 
suspend is achieved by not setting the 'no_io' or 'no_bind' flags and  
putting selective device-subtrees to sleep without doing anything to  
the rest of the system.

Cheers,
Kyle Moffett


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  1:19                       ` Kyle Moffett
@ 2007-07-06  1:37                         ` Nigel Cunningham
  2007-07-06  3:59                         ` Benjamin Herrenschmidt
  2007-07-06 15:42                         ` Alan Stern
  2 siblings, 0 replies; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-06  1:37 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Benjamin Herrenschmidt, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm, Alan Stern

[-- Attachment #1: Type: text/plain, Size: 4877 bytes --]

Hi.

On Friday 06 July 2007 11:19:32 Kyle Moffett wrote:
> On Jul 05, 2007, at 19:35:11, Nigel Cunningham wrote:
> > On Friday 06 July 2007 09:20:43 Benjamin Herrenschmidt wrote:
> >> No, the freezer creates all those places that are harmful for a
> >> task to block because they will break the freezer :-)
> >
> > Nice try :) Okay then, you remove the freezer, try hibernating,  
> > then get back to me after you've fixed your filesystem because some  
> > process that wasn't frozen started writing things after the atomic  
> > copy (making the on disk filesystem inconsistent with the snapshot).
> 
> Umm, this thread is NOT ABOUT HIBERNATING!!!  Please go back and read  
> the subject, specifically the "suspend to RAM" parts :-D.  When your  
> hardware can put itself to sleep and atomically preserve memory as it  
> does so, you don't need an atomic copy.  For Real Suspend(TM) (IE:  
> Suspend-to-RAM), the list of things to do is short and simple:

We agreed a while back that you don't need the freezer for suspend to ram. As 
far as I was aware, we went off-topic, so the topic is out of date.

> 1)  Stop DMA and put most hardware into low-power states (stops all  
> interrupt sources)
> 2)  Ensure that the other CPUs have finished any trailing interrupt  
> handlers and put them to sleep
> 3)  Put the interrupt-controllers into low-power state
> 4)  Go to sleep
> 
> > As Pavel rightly said, you can get rid of the freezer, but you're  
> > only going to have to implement another one that does essentially
> > the same thing, even if it is at some other level.
> 
> How about a freezer whose job it is to "wait for pending hard  
> interrupts to complete when we have already guaranteed that we won't  
> get any more"?  That part should be really *REALLY* easy.  You don't  
> need to care about either userspace processes or kernel threads at  
> all.  Specifically, Step 1 consists of:
> 
> suspend_device(dev)
> {
> 	set_no_bind_flag(dev);
> 	for_each(child in dev->subdevices)
> 		suspend_device(child);
> 	set_no_io_flag(dev);
> 	wait_for_in_progress_dma(dev);
> 	turn_off_interrupts(dev);
> 	go_to_low_power_state(dev);
> }
> 
> After you've set the "no_bind" flag, you won't get any *new*  
> subdevices trying to bind, therefore it's safe to iterate over the  
> list of present sub-devices and suspend them.  Once those are  
> suspended and in low-power states you can set a "no_io" flag to  
> prevent the driver from submitting more IO.  At that point you can  
> lazily wait for existing DMA/IO/interrupts to finish on the device,  
> since *NOBODY* will be submitting them anymore, and we certainly  
> aren't probing for new devices.  Then you can just turn off the power  
> to the device.  When all the leaf devices are off, the parent device  
> can be turned off because everything waiting on the leaf devices is  
> blocked on them and won't unblock until the parent device *AND* the  
> leaf device are turned on again, in that order.

For suspending, yes. For hibernating, that's not enough, because other 
processes can still be happily allocating and freeing memory, and will only 
get stopped when they try to do i/o or such like. If you're trying to make 
hibernation reliable, you need to be able to reliably check whether you're 
going to have enough storage for the image you're preparing, and enough 
memory for the atomic copy and so on. That's why the freezer is needed for 
hibernation. If you don't have it, any hibernation implementation you make is 
going to be only as reliable as the extent to which the system is otherwise 
idle.
 
> Scheduling and userspace are all still fully enabled in this  
> scenario.  Once all your devices are turned off, the only remaining  
> running threads will be those which haven't done IO since the  
> beginning of the suspend.  We can then disable preemption, turn off  
> the timer interrupts, and tell the other CPUs to park all their  
> remaining threads in schedule() and sleep.  Then we put the IRQ  
> controller to sleep and go to sleep ourselves.  If our driver model  
> locking is sufficient to handle putting a parent device to sleep  
> while threads are sleeping on a child device then there are exactly 0  
> problems.
> 
> Resuming is basically running the whole process in reverse.  Runtime- 
> suspend is achieved by not setting the 'no_io' or 'no_bind' flags and  
> putting selective device-subtrees to sleep without doing anything to  
> the rest of the system.

Fully agree when it comes to suspend to ram. There, this process should work, 
so far as I can see. But as I said, we went - to one degree or another - off 
topic, and did discuss hibernation too.

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 23:35                     ` Nigel Cunningham
  2007-07-06  1:19                       ` Kyle Moffett
@ 2007-07-06  3:54                       ` Benjamin Herrenschmidt
  2007-07-06  4:03                         ` Nigel Cunningham
  1 sibling, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-06  3:54 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Pavel Machek, Rafael J. Wysocki, Matthew Garrett, linux-kernel,
	linux-pm, Alan Stern

On Fri, 2007-07-06 at 09:35 +1000, Nigel Cunningham wrote:
> 
> Nice try :) Okay then, you remove the freezer, try hibernating, then get back 
> to me after you've fixed your filesystem because some process that wasn't 
> frozen started writing things after the atomic copy (making the on disk 
> filesystem inconsistent with the snapshot).
> 
> As Pavel rightly said, you can get rid of the freezer, but you're only going 
> to have to implement another one that does essentially the same thing, 
> even if it is at some other level.

I was mostly talking about STR... Regarding STD, we have a different
problem and we all know it. The freezer is one somewhat horrible way to
get it working for now. I would prefer something more along the lines of
blocking the page cache from writing out new dirty pages, except
those specifically flagged by the snapshot.

That is, some kind of proper snapshotting facility, as Linus was
describing some time ago.

Ben.

> > > >  - Silently add GFP_NOIO to all allocations, to avoid having things
> > > > blocking in kmalloc() with a mutex held that will deadlock with
> > > > suspend() in a driver for example. Or set some way to have all GFP
> > > > waiters wakeup and fail rather than wait for IOs. It's hard/bizarre but
> > > > necessary, again, with or without a freezer.
> > > 
> > > GFP_ATOMIC? (In driver suspend, they shouldn't be sleeping either, right?)
> > 
> > NOIO should be enough I think (but ATOMIC would do).
> >  
> > That's one of the reasons why I used to have the pre-suspend and
> > post-resume hooks in my original powermac implementation, for those few
> > drivers complicated enough to require some pre-allocations.
> >  
> > > >  - Deal with the firmware problem. The best way is probably to have an
> > > > async request_firmware interface(). Another thing is, drivers may want
> > > > to cache their firmware in main memory, that sort of thing...
> > > >
> > 
> > Note that the above firmware problem could be dealt with also with the
> > pre-suspend/post-resume. Allowing to pre-request firmware etc... and
> > keep it around until after resume, because we know we will need it.
> > Gives a chance to drivers to perform things while the system is still
> > live, filesystems still working, etc... (big memory allocations for
> > example).
> > 
> > > > And that's just a small list off the top of my mind, of known problems
> > > > that will cause deadlocks or misbehaviours today, with or without the
> > > > freezer, and that need to be addressed.
> > > 
> > > Userspace device drivers too?
> > 
> > Maybe but they are less of an issue; most of the time, they don't do DMA
> > or whatever harmful things. If they are USB drivers, for example, they
> > are a non-issue at that level.
> 
> (Leaving the rest of the message intact so we don't have to fragment the 
> discussion into a million subthreads).
> 
> Regards,
> 
> Nigel


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 23:05                                         ` Benjamin Herrenschmidt
@ 2007-07-06  3:59                                           ` Jeremy Maitin-Shepard
  2007-07-06  7:32                                           ` Oliver Neukum
  1 sibling, 0 replies; 388+ messages in thread
From: Jeremy Maitin-Shepard @ 2007-07-06  3:59 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Miklos Szeredi, oliver, pavel, paulus, stern, johannes, rjw,
	linux-pm, linux-kernel, mjg59

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

[snip]

> At the end of the day, I stand my ground, the freezer cannot be made
> reliable without massive infrastructure changes or giving up on very
> useful features such as fuse among others. Besides, it only partially
> "hides" the problem of requests going to drivers, thus it's a bad
> solution.

I agree that the freezer absolutely should not be used for suspend to
ram ("suspend"), since it is unnecessary with properly written drivers,
which are important to have anyway.  It seems that it is indeed the
consensus that it will be phased out sooner or later.

It does seem that the current device suspend interface does not tell the
drivers enough, since as discussed, they need to know whether to merely
block if they receive a request while suspended (as should be done while
initiating a suspend to ram), or if they should wake up the device (as
should be done if a suspend to ram is not in progress).  Clearly these
two cases need to be addressed by every driver supporting suspend/resume
(but possibly indirectly if the subsystem handles it for them).

The current hibernate approach used by all of the existing
implementations for Linux seems to depend fundamentally on the freezer,
though, in order to actually save the system state.  Thus, it will still
be necessary to fix all of the issues with the freezer, or adopt an
alternate hibernate approach (which is unlikely).  Unfortunately, even
leaving kernel threads and certain drivers running after the snapshot is
taken means that the saved image isn't completely correct, and the
freezer cannot help with these issues.

[snip]

-- 
Jeremy Maitin-Shepard

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  1:19                       ` Kyle Moffett
  2007-07-06  1:37                         ` Nigel Cunningham
@ 2007-07-06  3:59                         ` Benjamin Herrenschmidt
  2007-07-06  7:35                           ` Rafael J. Wysocki
  2007-07-06 14:38                           ` Alan Stern
  2007-07-06 15:42                         ` Alan Stern
  2 siblings, 2 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-06  3:59 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm, Alan Stern


> How about a freezer whose job it is to "wait for pending hard  
> interrupts to complete when we have already guaranteed that we won't  
> get any more"?  That part should be really *REALLY* easy.  You don't  
> need to care about either userspace processes or kernel threads at  
> all.  Specifically, Step 1 consists of:

Well, waiting for pending DMA and making sure to not trigger more
activity is what driver suspend() is supposed to do. With the ability
for simple drivers that can cope with it to just basically use a
late_suspend(), called after IRQs are off, that basically does what you
describe: wait for pending HW tasks to complete (polling) and turn the
damn thing off.

Note that the latter is really a shortcut for somewhat dumb and directly
accessible devices (PCI comes to mind). Things like USB have to use the
"normal" mechanism of blocking IOs etc... at suspend(); at least, USB
devices have to, since the USB HC will not issue any new URBs. (And will
return them with a nice error code, which is a perfect way to deal with
it in the driver. Been there, it works fine, once we fixed the races in the
USB host code itself, which I think we pretty much did by now).
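
To make the "return them with a nice error code" idea concrete, here is
a minimal sketch under stated assumptions.  struct toy_hc, toy_submit()
and toy_hc_suspend() are hypothetical names, not the real USB core
interface, and -ESHUTDOWN is only an assumed choice of error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical host controller state; NOT the real USB core API. */
struct toy_hc {
	bool suspended;  /* set once suspend() has run */
	int in_flight;   /* requests accepted but not yet completed */
};

/* Refuse new work once suspended; otherwise count it as in flight. */
static int toy_submit(struct toy_hc *hc)
{
	assert(hc);
	if (hc->suspended)
		return -ESHUTDOWN;  /* assumed error code, see lead-in */
	hc->in_flight++;
	return 0;
}

/* Mark the controller suspended; callers then drain in_flight. */
static void toy_hc_suspend(struct toy_hc *hc)
{
	assert(hc);
	hc->suspended = true;
}
```

A driver sitting on top of such a controller only has to check the
return value of its submit call and stop queueing, rather than depend on
its callers having been frozen beforehand.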

> Scheduling and userspace are all still fully enabled in this  
> scenario.  Once all your devices are turned off, the only remaining  
> running threads will be those which haven't done IO since the  
> beginning of the suspend.  We can then disable preemption, turn off  
> the timer interrupts, and tell the other CPUs to park all their  
> remaining threads in schedule() and sleep.  Then we put the IRQ  
> controller to sleep and go to sleep ourselves.  If our driver model  
> locking is sufficient to handle putting a parent device to sleep  
> while threads are sleeping on a child device then there are exactly 0  
> problems.

What you propose is basically a slightly over-simplistic version of what
I think (and Paulus thinks) should be done. We do need to do it via
driver callbacks down the tree since only drivers can know how to deal
with their DMA etc... and ordering needs to be respected, but that's
basically it.

And guess what ? It's what we do on powerbooks, and it works fine,
without a freezer :-)

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  3:54                       ` Benjamin Herrenschmidt
@ 2007-07-06  4:03                         ` Nigel Cunningham
  2007-07-06  4:41                           ` Benjamin Herrenschmidt
  2007-07-06  5:01                           ` Kyle Moffett
  0 siblings, 2 replies; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-06  4:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Pavel Machek, Rafael J. Wysocki, Matthew Garrett, linux-kernel,
	linux-pm, Alan Stern

[-- Attachment #1: Type: text/plain, Size: 2087 bytes --]

Hi.

On Friday 06 July 2007 13:54:15 Benjamin Herrenschmidt wrote:
> On Fri, 2007-07-06 at 09:35 +1000, Nigel Cunningham wrote:
> > 
> > Nice try :) Okay then, you remove the freezer, try hibernating, then get back
> > to me after you've fixed your filesystem because some process that wasn't 
> > frozen started writing things after the atomic copy (making the on disk 
> > filesystem inconsistent with the snapshot).
> > 
> > As Pavel rightly said, you can get rid of the freezer, but you're only going
> > to have to implement another one that does essentially the same thing, 
> > even if it is at some other level.
> 
> I was mostly talking about STR... Regarding STD, we have a different
> problem and we all know it. The freezer is one somewhat horrible way to
> get it working for now. I would prefer something more along the lines of
> blocking the page cache from writing out new dirty pages, except
> those specifically flagged by the snapshot.
> 
> That is, some kind of proper snapshotting facility, as Linus was
> describing some time ago.

The kind of thing Linus was talking about would limit you (as swsusp and 
uswsusp do now) to only half the amount of memory. I suppose you could lzf 
compress as you did the snapshot. That would generally get you up to 2/3rds, 
but then again you can't know what compression ratio you'll get until you 
try, so reliability would suffer or it would take longer because of retrying.
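
The half and 2/3rds figures follow from simple arithmetic if the copy
has to sit in RAM next to the data being copied.  A small sketch,
assuming (this is an assumption, not something measured here) that
"2/3rds" corresponds to roughly 2:1 compression; max_image_pages() is a
made-up helper name:

```c
#include <assert.h>

/*
 * Largest number of pages that can be snapshotted when the copy must
 * also live in RAM.  The copy is assumed to take copy_num/copy_den of
 * the original's size (1/1 = uncompressed, 1/2 = 2:1 compression):
 *
 *   image + image * copy_num / copy_den <= total
 *   image <= total * copy_den / (copy_den + copy_num)
 */
static unsigned long max_image_pages(unsigned long total,
				     unsigned long copy_num,
				     unsigned long copy_den)
{
	assert(copy_den > 0);
	return total * copy_den / (copy_den + copy_num);
}
```

With an uncompressed copy (1/1) the limit is total/2; with a copy half
the size of the original (1/2) it is 2*total/3.  The real ratio is only
known after compressing, which is the reliability problem noted above.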

I/O from swsusp and suspend2 use bios directly, so the page cache isn't an 
issue for them (apart from the fact that Suspend2 saves the page cache 
separately so it can get a full image). Not sure about uswsusp.

Only having half the amount of memory doesn't sound like a big limitation for 
modern desktops & laptops, but don't forget that there are embedded guys 
wanting to hibernate too :)

Regards,

Nigel
-- 
Nigel Cunningham
Christian Reformed Church of Cobden
103 Curdie Street, Cobden 3266, Victoria, Australia
Ph. +61 3 5595 1185 / +61 417 100 574
Communal Worship: 11 am Sunday.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  4:03                         ` Nigel Cunningham
@ 2007-07-06  4:41                           ` Benjamin Herrenschmidt
  2007-07-06  5:25                             ` Nigel Cunningham
  2007-07-06  5:01                           ` Kyle Moffett
  1 sibling, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-06  4:41 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Pavel Machek, Rafael J. Wysocki, Matthew Garrett, linux-kernel,
	linux-pm, Alan Stern


> I/O from swsusp and suspend2 use bios directly, so the page cache isn't an 
> issue for them (apart from the fact that Suspend2 saves the page cache 
> separately so it can get a full image). Not sure about uswsusp.
> 
> Only having half the amount of memory doesn't sound like a big limitation for 
> modern desktops & laptops, but don't forget that there are embedded guys 
> wanting to hibernate too :)

Wait wait wait ... uses the BIOS ? what do you mean ?

I know that for example, things like MacOS X use a separate polled path to
the storage driver for suspend (works fine for the built-in IDE, but more
complicated in large scale). If you can use BIOS calls to write your
suspend image, that is, if you don't need any of the normal block
infrastructure, then you don't need a freezer ! not at all !

You just do like STR ... and at the end of the day, once you have stopped
all your drivers, you shut interrupts off and do the BIOS thing....

I fail to see how processes could dirty pages while/after the BIOS thingy :-)

But then, the problem with that approach is that of course, you need a BIOS
capable of doing that (or a special sideband path to the "blessed" block
driver that will be used for suspend ... not necessarily a hard thing to do,
would be trivial to add support to drivers/ide or libata for that sort of
thing I suppose).

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  4:03                         ` Nigel Cunningham
  2007-07-06  4:41                           ` Benjamin Herrenschmidt
@ 2007-07-06  5:01                           ` Kyle Moffett
  2007-07-06  5:53                             ` Nigel Cunningham
  2007-07-10  2:07                             ` Nigel Cunningham
  1 sibling, 2 replies; 388+ messages in thread
From: Kyle Moffett @ 2007-07-06  5:01 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Benjamin Herrenschmidt, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm, Alan Stern

On Jul 06, 2007, at 00:03:15, Nigel Cunningham wrote:
> On Friday 06 July 2007 13:54:15 Benjamin Herrenschmidt wrote:
>> On Fri, 2007-07-06 at 09:35 +1000, Nigel Cunningham wrote:
>>>
>>> Nice try :) Okay then, you remove the freezer, try hibernating,  
>>> then get back to me after you've fixed your filesystem because  
>>> some process that wasn't frozen started writing things after the  
>>> atomic copy (making the on disk filesystem inconsistent with the  
>>> snapshot).
>>>
>>> As Pavel rightly said, you can get rid of the freezer, but you're  
> >>> only going to have to implement another one that does
> >>> essentially the same thing, even if it is at some other level.
>>
>> I was mostly talking about STR... Regarding STD, we have a  
>> different problem and we all know it. The freezer is one somewhat  
> >> horrible way to get it working for now. I would prefer something
> >> more along the lines of blocking the page cache from writing out new
> >> dirty pages, except those specifically flagged by the
> >> snapshot.
> >>
> >> That is, some kind of proper snapshotting facility, as Linus was
> >> describing some time ago.
>
> The kind of thing Linus was talking about would limit you (as  
> swsusp and uswsusp do now) to only half the amount of memory.

How so?  Suppose hibernate is implemented like this:

(1) Userspace program calls sys_freeze_processes()
   (a) Pokes all CPUs with IPIs and tells them to finish the
currently running timeslice then stop
   (b) Atomically sends SIGSTOP to all userspace processes in a
non-trappable way, except the calling process and any process which is
ptracing it.
   (c) Returns to the calling process.

(2) Userspace process sends SIGCONT to only those processes which are
necessary for sync and a device-mapper snapshot.

(3) Userspace calls sys_snapshot_kernel(snapshot_overhead_pages)
   (a) Kernel starts freeing memory and swapping stuff out to make
room for a copy of *kernel* memory (not pagecache, not process RAM).
It does the same for at least snapshot_overhead_pages extra (used by
userspace later).  It then allocates this memory to keep it from
going away.  Since most processes are stopped we won't have much else
competing with us for the RAM.
   (b) Kernel uses the device-mapper up-call-into-filesystem
machinery to get all mounted filesystems synced and ready for a DM
snapshot.  This may include sending data via the userspace processes
resumed in (2).  Any deadlocks here are userspace's fault (see (2)).
Will need some modification to handle doing multiple blockdevs at a
time.  Anything using FUSE is basically perma-synced anyways (no
dep-handling needed), and anything using loop should already be handled
by DM.  This includes allocating memory for the basic snapshot
datastructures.
   (c) At this point all blockdev operations should be halted and
disk caches flushed; that's all we care about.
   (d) Go through the device tree and quiesce DMA and shut off
interrupts.  Since all the disks are synced this is easy.
   (e) Use IPIs again to get all the CPUs together, which should be
easy as most processes are sleeping in IO or SIGSTOPed, and we're
getting no interrupts.
   (f) One CPU turns off all interrupts on itself and takes an atomic
snapshot of kernel memory into the previously allocated storage.
Once again, does not include pagecache.  The kernel also records a
list of what pages *are* included in the pagecache.  It then marks
all userspace pages as copy-on-write.
   (g) That CPU finalizes the modified DM snapshot using the
previously-allocated memory.
   (h) That CPU frees up the snapshot_overhead_pages memory allocated
during step (a) for userspace to use.
   (i) The CPU does the equivalent of a "swapoff -a" without
overwriting any data already on any swap device(s).
   (j) The CPU then IPI-signals the other CPUs to wake them up
   (k) The kernel returns a FD-reference to the snapshot and the
read-only halves of the CoW pagecache to the process which called
sys_snapshot_kernel().

(4) The userspace process now has a reference to the copy of the  
kernel pages and the unmodified pagecache pages.  Since 99% of the  
processes aren't running, we aren't going to be having to CoW many of  
the pagecache pages.

(5) The userspace process uses read() or other syscalls to get data  
out of the kernel-snapshot FD in small chunks, within its  
snapshot_overhead_pages limit.  It compresses these and writes them  
out to the snapshot-storage blockdev (must not be mounted during  
snapshot), or to any network server.

(6) The userspace process syncs the disks and halts the system.  Any  
changed filesystem pages after the pseudo-DM-snapshot should have  
been stored in semi-volatile storage somewhere and will be discarded  
on the next reboot.

So basically your hibernate-overhead would consist of:
   (1) The pages necessary for the atomic snapshot of kernel memory  
and the list of pagecache pages at that time
   (2) A little memory necessary for the kernel non-persistent DM  
snapshot datastructures.
   (3) The snapshot_overhead_pages needed by userspace.

If you're using swap devices then you can save 99% of the state of  
the running kernel with an initial swapout overhead of virtually  
nothing beyond the size of the unswappable kernel memory.
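
Step (5) above, reading the snapshot in small chunks within a fixed
memory budget, can be sketched with ordinary POSIX read() and write().
This is a minimal sketch under stated assumptions: no snapshot FD or
sys_snapshot_kernel() exists, so any readable descriptor stands in for
it, the compression step is left out, and copy_in_chunks() and
write_all() are made-up helper names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write all of buf, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Copy everything readable from in_fd to out_fd through one fixed
 * buffer, so the working set never exceeds buf_len bytes.
 * Returns the total number of bytes copied, or -1 on error.
 */
static long copy_in_chunks(int in_fd, int out_fd, char *buf, size_t buf_len)
{
	long total = 0;

	assert(buf_len > 0);
	for (;;) {
		ssize_t n = read(in_fd, buf, buf_len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return total;  /* end of the snapshot data */
		/* A real tool would compress the chunk here. */
		if (write_all(out_fd, buf, (size_t)n) < 0)
			return -1;
		total += n;
	}
}
```

The caller picks buf_len to fit inside its snapshot_overhead_pages
budget; out_fd could be the snapshot-storage blockdev or a network
socket.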

Cheers,
Kyle Moffett


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  4:41                           ` Benjamin Herrenschmidt
@ 2007-07-06  5:25                             ` Nigel Cunningham
  0 siblings, 0 replies; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-06  5:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Pavel Machek, Rafael J. Wysocki, Matthew Garrett, linux-kernel,
	linux-pm, Alan Stern

[-- Attachment #1: Type: text/plain, Size: 754 bytes --]

Hi.

On Friday 06 July 2007 14:41:40 Benjamin Herrenschmidt wrote:
> 
> > I/O from swsusp and suspend2 use bios directly, so the page cache isn't an 
> > issue for them (apart from the fact that Suspend2 saves the page cache 
> > separately so it can get a full image). Not sure about uswsusp.
> > 
> > Only having half the amount of memory doesn't sound like a big limitation for
> > modern desktops & laptops, but don't forget that there are embedded guys 
> > wanting to hibernate too :)
> 
> Wait wait wait ... uses the BIOS ? what do you mean ?

You misread me, Ben. Sorry for not being clearer. bios as in struct bio.

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 14:23                                                     ` Matthew Garrett
                                                                         ` (2 preceding siblings ...)
  2007-07-05 16:06                                                       ` Jeremy Maitin-Shepard
@ 2007-07-06  5:45                                                       ` Daniel Pittman
  3 siblings, 0 replies; 388+ messages in thread
From: Daniel Pittman @ 2007-07-06  5:45 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Rafael J. Wysocki, Oliver Neukum, Miklos Szeredi, paulus, stern,
	johannes, linux-pm, linux-kernel, pavel, benh

Matthew Garrett <mjg59@srcf.ucam.org> writes:
> On Thu, Jul 05, 2007 at 04:09:24PM +0200, Rafael J. Wysocki wrote:
>> On Thursday, 5 July 2007 15:46, Matthew Garrett wrote:
>> > I have a model for STD that avoids the need to freeze the entirety of 
>> > userspace, but I need to find some more time to flesh it out.
>> 
>> You can just describe it, as far as I'm concerned. :-)
>
> The basic model is that nobody's really described a use-case where we 
> actually care about restoring system state. What people want is to be 
> able to restore application state. So, arguably, what we want isn't to 
> save the entire kernel state and application state in one go because we 
> can reconstruct a huge amount of that afterwards.

[...]

> I've mocked up a basic implementation using cryopid, but it's somewhat
> limited by the lack of support for sockets. I'd like to move more of
> the smarts into the kernel (Hurray, checkpointing!) and then see how
> much hardware support ends up horrifically broken.

You might want to look at the checkpoint / migration support in the
OpenVZ kernel in relation to this.  That does work to dump the state of
a running "virtual environment" complete with applications to disk, move
it to another running kernel and restore the content.

That might, perhaps, help with the prototype of this?

Regards,
	Daniel

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  5:01                           ` Kyle Moffett
@ 2007-07-06  5:53                             ` Nigel Cunningham
  2007-07-10  2:07                             ` Nigel Cunningham
  1 sibling, 0 replies; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-06  5:53 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Benjamin Herrenschmidt, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm, Alan Stern

[-- Attachment #1: Type: text/plain, Size: 762 bytes --]

Hi Kyle.

On Friday 06 July 2007 15:01:48 Kyle Moffett wrote:
> On Jul 06, 2007, at 00:03:15, Nigel Cunningham wrote:
> > The kind of thing Linus was talking about would limit you (as  
> > swsusp and uswsusp do now) to only half the amount of memory.
> 
> How so?  Suppose hibernate is implemented like this:

You're not talking about the same thing Linus was suggesting. He was just 
wanting a result = sys_snapshot() sort of call. That would limit us to half 
the amount of memory.

I've looked over what you've written below and want to consider it in detail. 
Right now though, I don't have the time. I'll try to get back to you 
promptly.

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 22:38                                   ` Benjamin Herrenschmidt
@ 2007-07-06  7:04                                     ` Rafael J. Wysocki
  2007-07-06  7:30                                     ` Oliver Neukum
  1 sibling, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-06  7:04 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Oliver Neukum, Miklos Szeredi, paulus, stern, johannes, linux-pm,
	linux-kernel, pavel, mjg59

On Friday, 6 July 2007 00:38, Benjamin Herrenschmidt wrote:
> 
> > There is that.
> > 
> > OK, bite the bullet. Tasks involved in fuse are special. Give them a flag
> > and teach the freezer to put them on ice only after all other tasks are
> > frozen. In a way they are kernel, there's no use denying that.
> 
> Yet another ugly hack to work around the fact that the freezer cannot
> work reliably ... yuck
> 
> Why not spend that energy fixing drivers to properly block requests
> instead ?

Because it's more difficult? :-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 22:04                                 ` Pavel Machek
@ 2007-07-06  7:07                                   ` Miklos Szeredi
  2007-07-07 12:19                                     ` Pavel Machek
  2007-07-06  7:16                                   ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-06  7:07 UTC (permalink / raw)
  To: pavel
  Cc: miklos, oliver, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

> > Actually fuse allows SIGKILL, because it's always fatal, and the
> > syscall may not be restarted.
> 
> Okay, and you should handle refrigerator in the same paths where you
> handle SIGKILL. Just add try_to_freeze() there...

It's the fourth time I'm repeating this in this thread:

Yes, adding try_to_freeze() there would partially solve the problem.

But another task can be sleeping on a mutex held by the task waiting
for the reply.  And the freezer won't be able to handle that one.

Generally, calling try_to_freeze() with mutexes held is not a good
idea.
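
The dependency described above can be sketched as a small userspace
model. This is not kernel code and not how the freezer is implemented;
the names task_state, struct task and can_freeze() are hypothetical and
exist only for illustration. A task counts as freezable if it is
runnable, or if the chain of tasks it waits on ends in a runnable task.
A chain that ends in an already frozen task can never make progress.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not kernel code. */
enum task_state {
    TASK_RUNNABLE,    /* can reach a freezing point on its own */
    TASK_FROZEN,      /* already in the refrigerator */
    TASK_WAIT_REPLY,  /* uninterruptible wait for a reply from another task */
    TASK_WAIT_MUTEX   /* uninterruptible wait for a mutex held by another task */
};

struct task {
    enum task_state state;
    int waits_on;     /* index of the task being waited on, or -1 */
};

/*
 * Follow the wait chain starting at 'start'. The task is freezable if
 * the chain ends in a runnable task. It is stuck if the chain ends in
 * a frozen task other than itself, is broken, or loops.
 */
bool can_freeze(const struct task *tasks, size_t n, size_t start)
{
    size_t i = start;

    for (size_t hops = 0; hops <= n; hops++) {
        switch (tasks[i].state) {
        case TASK_RUNNABLE:
            return true;
        case TASK_FROZEN:
            /* Frozen itself is fine; waiting on a frozen task is not. */
            return i == start;
        default:
            if (tasks[i].waits_on < 0 || (size_t)tasks[i].waits_on >= n)
                return false;
            i = (size_t)tasks[i].waits_on;
        }
    }
    return false; /* more hops than tasks: the chain loops */
}
```

With a frozen fuse server at index 0, a task waiting for its reply at
index 1, and a third task waiting on a mutex held by task 1, the model
reports both waiters as unfreezable, which is the case try_to_freeze()
in the reply path alone does not cover.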

Miklos

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 21:37                                                   ` Oliver Neukum
@ 2007-07-06  7:13                                                     ` Rafael J. Wysocki
  2007-07-06  8:59                                                       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-06  7:13 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Alan Stern, Miklos Szeredi, pavel, paulus, johannes, linux-pm,
	linux-kernel, mjg59, benh

On Thursday, 5 July 2007 23:37, Oliver Neukum wrote:
> Am Donnerstag, 5. Juli 2007 schrieb Alan Stern:
> > On Thu, 5 Jul 2007, Miklos Szeredi wrote:
> > 
> > > I fear, that your efforts to "save" the freezer are in vain.  It is
> > > already moderately hackish with that PF_FREEZER_SKIP and the kernel
> > > dotted randomly with try_to_freeze() calls, but adding bandaids to try
> > > to order freezing userspace processes in the right order would just
> > > make it a horrible mess.
> > 
> > I agree that bandaids won't work.  What's needed is something more 
> > radical.  Things like FUSE must be written so that the kernel parts 
> > _can_ freeze even while they are waiting for a response from a user 
> > thread.
> 
> OK, some radical ideas.
> 
> In principle we want a deadlock here. Tasks that are frozen due to fuse
> are as good as frozen, there's no need to formally freeze them.
> The bad uninterruptible tasks are those waiting for hardware.
> 
> Can we detect the difference? Perhaps we can label those crucial locks
> and when only tasks in the refrigerator and tasks waiting on these locks
> are left we are content and go on to suspending the device tree.

Well, I think we can just make the freezer handle uninterruptible tasks.

If an uninterruptible task is stuck holding a lock that's needed for suspend,
the freezerless suspend will deadlock anyway. :-)

The only reason (I know of) why we don't handle uninterruptible tasks in the
freezer is that we're afraid of the suspend process deadlocking with an
uninterruptible task holding a lock, but AFAICS the probability of such an
event is extremely small.

Also, the powermac suspend will deadlock in such cases, so the fact that
it doesn't deadlock means that they don't occur very often (if at all).

I have a patch for that, will post it in a separate thread in a while.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 22:04                                 ` Pavel Machek
  2007-07-06  7:07                                   ` Miklos Szeredi
@ 2007-07-06  7:16                                   ` Rafael J. Wysocki
  1 sibling, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-06  7:16 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Miklos Szeredi, oliver, paulus, stern, johannes, linux-pm,
	linux-kernel, mjg59, benh

On Friday, 6 July 2007 00:04, Pavel Machek wrote:
> Hi!
> 
> > > > > > > I have discussed the benefits elsewhere.  As for the deadlocks -- do 
> > > > > > > you still observe them if you use the version of the freezer which 
> > > > > > > doesn't freeze kernel threads?
> > > > > > 
> > > > > > In general the only way to guarantee there are no deadlocks is to
> > > > > > construct the graph of dependencies between tasks.  Those dependencies
> > > > > > are not in practice observable from outside the tasks, so it is
> > > > > > virtually impossible to construct the graph.
> > > > > 
> > > > > In which way can user space tasks depend on each other in a way that
> > > > > allows the members of that cycle to be in uninterruptible sleep?
> > > > 
> > > >  - process A calls rename() on a fuse fs
> > > >  - process B, the fuse server, starts to process the rename request
> > > >  - process B is frozen before it can reply
> > > > 
> > > > Now process A is unfreezable.  We cannot make rename() restartable,
> > > > hence it cannot be interruptible.
> > > 
> > > Yes, we are claiming fuse is very special in this regard, and perhaps
> > > even broken.
> > > 
> > > Let's see. If I SIGSTOP the fuse server, I can get unrelated tasks
> > > unkillable (even for SIGKILL!) forever.
> > 
> > Actually fuse allows SIGKILL, because it's always fatal, and the
> > syscall may not be restarted.
> 
> Okay, and you should handle refrigerator in the same paths where you
> handle SIGKILL. Just add try_to_freeze() there...

In fact the problem is more complicated than that, because some tasks may
be waiting on VFS locks related to FUSE.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 22:59                           ` Benjamin Herrenschmidt
@ 2007-07-06  7:20                             ` Rafael J. Wysocki
  2007-07-06 15:13                             ` Alan Stern
  2007-07-07  7:56                             ` [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway Pavel Machek
  2 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-06  7:20 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alan Stern, Paul Mackerras, Johannes Berg, Linux-pm mailing list,
	Kernel development list, Pavel Machek, Matthew Garrett

On Friday, 6 July 2007 00:59, Benjamin Herrenschmidt wrote:
> On Thu, 2007-07-05 at 10:23 -0400, Alan Stern wrote:
> > 
> > How will that help?  Block the kernel thread in the freezer or block it 
> > in the driver -- either way it is blocked.  So how do your deadlocks 
> > get resolved?
> 
> Because nobody is waiting on that kernel thread anyway without a freezer
> so there is no deadlock anymore.

I'm not sure what you mean.  The freezer doesn't wait for threads that are
already frozen ...

> > I disagree with your analysis -- not that it's completely wrong, but it 
> > points out an existing basic problem in the kernel.  The kernel should 
> > never depend on userspace!  More correctly, a task executing in the 
> > kernel should never block with any sort of mutex or other lock held (in 
> > a way that would preclude it from being frozen, let's say) while 
> > waiting for a response from userspace.
> > 
> > Then the dependency graph would be easy to construct: User tasks can
> > depend on whatever they want, and kernel threads never depend on a user
> > task.
> 
> In an ideal world, there would be no hunger...
> 
> > If this contradicts the existing implementations and APIs for userspace 
> > filesystems, then so be it.  My conclusion would be that the 
> > implementations and APIs should be changed.
> 
> Why are you guys working so hard and spending so much energy to try to
> avoid doing the right thing is beyond my understanding...
> 
> > It _does_ apply to kernel threads.  That's exactly why I wrote above 
> > that kernel threads which try to do I/O during a suspend will need 
> > extra attention.
> 
> Ok none at all if you don't have a freezer.

Provided that drivers can handle that.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 22:38                                   ` Benjamin Herrenschmidt
  2007-07-06  7:04                                     ` Rafael J. Wysocki
@ 2007-07-06  7:30                                     ` Oliver Neukum
  2007-07-06 12:35                                       ` Benny Amorsen
  1 sibling, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-06  7:30 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Miklos Szeredi, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, pavel, mjg59

Am Freitag, 6. Juli 2007 schrieb Benjamin Herrenschmidt:
> 
> > There is that.
> > 
> > OK, bite the bullet. Tasks involved in fuse are special. Give them a flag
> > and teach the freezer to put them on ice only after all other tasks are
> > frozen. In a way they are kernel, there's no use denying that.
> 
> Yet another ugly hack to work around the fact that the freezer cannot
> work reliably ... yuck
> 
> Why not spend that energy fixing drivers to properly block requests
> instead ?

Because we will be unable to escape that job. Let's assume that
we remove the freezer from the STR path. The next complaint would
be that we cannot do STD with fuse. "Then don't do that" would not
be taken kindly as an answer.

So we would be under pressure to remove the freezer from STD, too,
or find a way to make the freezer work. If we have to make the freezer
work anyway, we may as well do it right now.

	Regards
		Oliver

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 23:05                                         ` Benjamin Herrenschmidt
  2007-07-06  3:59                                           ` Jeremy Maitin-Shepard
@ 2007-07-06  7:32                                           ` Oliver Neukum
  1 sibling, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-06  7:32 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Miklos Szeredi, pavel, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, mjg59

Am Freitag, 6. Juli 2007 schrieb Benjamin Herrenschmidt:
> > Yes, fuse could handle being frozen there.  However that would only
> > solve part of the problem: an operation waiting for a reply could be
> > holding a VFS mutex and some other task may be blocked on that mutex.
> > 
> > How would you solve freezing those tasks?
> 
> That task is implicitly frozen... but the kernel doesn't know it and
> thus the freezer times out or fails or deadlocks or whatever.
> 
> The freezer could be made to ignore tasks that are sleeping in the
> kernel assuming that if they go out of it, they'll ultimately reach
> do_signal and freeze, but that means they can potentially still issue
> IOs which is what the freezer tries to avoid ...
> 
> Or the kernel could start tracking dependencies, but then, good luck
> implementing that crap.

Do we need dependencies? Don't we know that fuse can deadlock only
on a limited number of locks in VFS?

	Regards
		Oliver


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  3:59                         ` Benjamin Herrenschmidt
@ 2007-07-06  7:35                           ` Rafael J. Wysocki
  2007-07-06  9:03                             ` Benjamin Herrenschmidt
  2007-07-06 14:38                           ` Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-06  7:35 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Matthew Garrett,
	linux-kernel, linux-pm, Alan Stern

On Friday, 6 July 2007 05:59, Benjamin Herrenschmidt wrote:
> 
> > How about a freezer whose job it is to "wait for pending hard  
> > interrupts to complete when we have already guaranteed that we won't  
> > get any more"?  That part should be really *REALLY* easy.  You don't  
> > need to care about either userspace processes or kernel threads at  
> > all.  Specifically, Step 1 consists of:
> 
> Well, waiting for pending DMA and making sure to not trigger more
> activity is what driver suspend() is supposed to do. With the ability
> for simple drivers that can cope with it to just basically use a
> late_suspend(), called after IRQs are off, that basically does what you
> describe: wait for pending HW tasks to complete (polling) and turn the
> damn thing off.
> 
> Note that the latter is really a shortcut for somewhat dumb and directly
> accessible devices (PCI comes to mind). Things like USB have to use the
> "normal" mechanism of blocking IOs etc... at suspend(), at least, USB
> devices have to since the USB HC will not issue any new URBs. (And will
> return them with a nice error code which is a perfect way to deal with
> it in driver, been there, it works fine, once we fixed the races in the
> USB host code itself, which I think we pretty much did by now).
> 
> > Scheduling and userspace are all still fully enabled in this  
> > scenario.  Once all your devices are turned off, the only remaining  
> > running threads will be those which haven't done IO since the  
> > beginning of the suspend.  We can then disable preemption, turn off  
> > the timer interrupts, and tell the other CPUs to park all their  
> > remaining threads in schedule() and sleep.  Then we put the IRQ  
> > controller to sleep and go to sleep ourselves.  If our driver model  
> > locking is sufficient to handle putting a parent device to sleep  
> > while threads are sleeping on a child device then there are exactly 0  
> > problems.
> 
> What you propose is basically a slightly over-simplistic version of what
> I think (and Paulus thinks) should be done. We do need to do it via
> driver callbacks down the tree since only drivers can know how to deal
> with their DMA etc... and ordering needs to be respected, but that's
> basically it.
> 
> And guess what ? It's what we do on powerbooks, and it works fine,
> without a freezer :-)

On powerbooks you disable the nonboot CPUs before suspending devices, which
simplifies things _a_ _lot_ in comparison with the general case.

If you additionally disable kernel preemption, then you don't need anything
like the freezer anyway in that case (except for detecting situations in which
the suspend process can deadlock with a task stuck in the D state holding a
lock, but that's overkill).

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  7:40                           ` Rafael J. Wysocki
@ 2007-07-06  7:39                             ` Miklos Szeredi
  2007-07-06  7:51                               ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-06  7:39 UTC (permalink / raw)
  To: rjw; +Cc: nigel, pavel, oliver, miklos, benh, mjg59, linux-kernel, linux-pm

> > You're missing the point. I'm arguing that a sync from within the freezer 
> > should guarantee that there is no data loss.
> 
> Well, it should, but it doesn't ...
> 
> Moreover, if FUSE implements syncing, then the sync from within the freezer
> will almost certainly deadlock.

Rafael, think positively: by the time fuse implements sync(), the
freezer will be long gone ;)

Miklos

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 21:49                         ` Nigel Cunningham
@ 2007-07-06  7:40                           ` Rafael J. Wysocki
  2007-07-06  7:39                             ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-06  7:40 UTC (permalink / raw)
  To: nigel
  Cc: Pavel Machek, Oliver Neukum, Miklos Szeredi, benh, mjg59,
	linux-kernel, linux-pm

Hi,

On Thursday, 5 July 2007 23:49, Nigel Cunningham wrote:
> Good morning!
> 
> On Thursday 05 July 2007 23:59:57 Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 15:36, Nigel Cunningham wrote:
> > > On Thursday 05 July 2007 23:35:45 Rafael J. Wysocki wrote:
> > > > On Thursday, 5 July 2007 14:38, Nigel Cunningham wrote:
> > > > > On Thursday 05 July 2007 22:25:06 Rafael J. Wysocki wrote:
> > > > > > On Thursday, 5 July 2007 01:45, Pavel Machek wrote:
> > > > > > > On Tue 2007-07-03 21:32:20, Oliver Neukum wrote:
> > > > > > > > Am Dienstag, 3. Juli 2007 schrieb Miklos Szeredi:
> > > > > > > > > > And a further question. The freezer is not atomic. What do you do
> > > > > > > > > > if a task not yet frozen calls sys_sync(), but fuse is already frozen?
> > > > > > > > > 
> > > > > > > > > What do you do if a task not yet frozen writes to a pipe, on the other
> > > > > > > > > end of which is a task already frozen?
> > > > > > > 
> > > > > > > There's some difference between uninterruptible and interruptible
> > > > > > > sleep I'd say.
> > > > > > > 
> > > > > > > > > It doesn't matter.  The only thing that should matter during suspend
> > > > > > > > > (not hibernate) is saving the state of devices to ram, and putting the
> > > > > > > > > devices to sleep.
> > > > > > > > 
> > > > > > > > Well, but you did remove sys_sync() from the freezer, which is
> > > > > > > > and must be called in the hibernate path.
> > > > > > > 
> > > > > > > Not "must". In fact, hibernation should be safe without sys_sync(). It
> > > > > > > is just user un-friendly.
> > > > > > 
> > > > > > In fact, I'd like to remove the sys_sync() from the freezer entirely, because
> > > > > > it just doesn't belong in there.
> > > > > > 
> > > > > > The only advantage of having sys_sync() in freeze_processes() is that we
> > > > > > have a chance to write out everything when applications cannot produce more
> > > > > > data to write, but there are filesystems which don't do that anyway (eg. XFS),
> > > > > > so generally there's no reason to bother.
> > > > > 
> > > > > Shouldn't XFS - and fuse - be considered to be broken? Sync should sync data
> > > > > and if XFS isn't doing that, it's wrong.
> > > > > 
> > > > > In the case of fuse, we should have a mechanism by which fuse processes can be
> > > > > made to sync if they do have any pending I/O, and by which they can be frozen
> > > > > later than other userspace processes.
> > > > > 
> > > > > I'd like to see the sync stay, because it improves reliability and data
> > > > > integrity in the fail-to-resume case. Calling scripts would probably invoke
> > > > > sync themselves if they don't already, but that's racy. As it is at the
> > > > > moment, we know userspace is stopped, so syncing isn't racy.
> > > > 
> > > > I'd like to move the sync out of the freezer, but to call it from the
> > > > suspend/hibernation code, so that we do
> > > > 
> > > > sys_sync();
> > > > error = freeze_processes();
> > > 
> > > Yeah, I understand that. The problem then is that you're racing against
> > > userspace. That's not usually a problem, but that doesn't mean it's never a
> > > problem. Try running the stress suite while testing hibernating and you'll
> > > see what I mean. If something is submitting lots of I/O when you try to
> > > suspend, your sync call will race against that process if it's not yet
> > > frozen, and its continued activity will make your sync pointless (there'll be
> > > more unsynced data when your sys_sync call finishes). Stopping userspace
> > > before syncing removes that race.
> > 
> > Yes, that will make the suspend/hibernation less reliable in case the resume
> > fails (some data, written after the sync, may be lost).  However, the sync done
> > from within the freezer doesn't guarantee that there are no data lost anyway,
> > so we don't lose much by not doing it.
> > 
> > Now, there's a question how much data may be lost, potentially, if we do the
> > sync before the freezer and I don't think that's a lot.
> 
> You're missing the point. I'm arguing that a sync from within the freezer 
> should guarantee that there is no data loss.

Well, it should, but it doesn't ...

Moreover, if FUSE implements syncing, then the sync from within the freezer
will almost certainly deadlock.

> As I said above, XFS should be fixed to properly sync its data, and
> something should be done about fuse filesystems too.

I think that we can't do anything about it, so we should just live with it.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  7:39                             ` Miklos Szeredi
@ 2007-07-06  7:51                               ` Oliver Neukum
  2007-07-06  9:09                                 ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Oliver Neukum @ 2007-07-06  7:51 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: rjw, nigel, pavel, benh, mjg59, linux-kernel, linux-pm

Am Freitag, 6. Juli 2007 schrieb Miklos Szeredi:
> > > You're missing the point. I'm arguing that a sync from within the freezer 
> > > should guarantee that there is no data loss.
> > 
> > Well, it should, but it doesn't ...
> > 
> > Moreover, if FUSE implements syncing, then the sync from within the freezer
> > will almost certainly deadlock.
> 
> Rafael, think positively: by the time fuse implements sync(), the
> freezer will be long gone ;)

Now you are entering really dangerous territory.
If you can implement a meaningful sync method, you must have dirty
pages in the page cache. That means you are in the page freeing path
of the vm. Then we are in real trouble. Don't even think about it.

As far as suspend/hibernate is concerned, get yourself on the new
notifier chain and revert to synchronous operations when notified.
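
A rough illustration of "revert to synchronous operations when
notified", written as a userspace model rather than kernel code. In the
kernel this would presumably hang off the PM notifier chain (see
register_pm_notifier()); everything below, including pm_event,
struct fs_model and fs_pm_notify(), is a hypothetical stand-in. On the
"prepare" event the model flushes what is buffered and switches to
write-through; after resume it goes back to buffering.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical events, modelled loosely on suspend notifications. */
enum pm_event { EV_SUSPEND_PREPARE, EV_POST_SUSPEND };

struct fs_model {
    bool write_back;  /* true: buffer writes; false: write through */
    size_t dirty;     /* bytes buffered but not yet written out */
    size_t stable;    /* bytes that have reached stable storage */
};

/* Write out everything that is still buffered. */
static void fs_flush(struct fs_model *fs)
{
    fs->stable += fs->dirty;
    fs->dirty = 0;
}

/* Buffer the write, or push it straight through in synchronous mode. */
void fs_write(struct fs_model *fs, size_t len)
{
    if (fs->write_back)
        fs->dirty += len;
    else
        fs->stable += len;
}

/* Notifier callback: go synchronous before suspend, relax afterwards. */
void fs_pm_notify(struct fs_model *fs, enum pm_event ev)
{
    switch (ev) {
    case EV_SUSPEND_PREPARE:
        fs->write_back = false;  /* stop creating new dirty data first */
        fs_flush(fs);
        break;
    case EV_POST_SUSPEND:
        fs->write_back = true;
        break;
    }
}
```

The point of the ordering in the "prepare" case is that no dirty data
can accumulate between the flush and the actual suspend, so nothing
depends on a later sync from inside the freezer.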

	Regards
		Oliver


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  7:13                                                     ` Rafael J. Wysocki
@ 2007-07-06  8:59                                                       ` Benjamin Herrenschmidt
  2007-07-06  9:31                                                         ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-06  8:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oliver Neukum, Alan Stern, Miklos Szeredi, pavel, paulus,
	johannes, linux-pm, linux-kernel, mjg59

On Fri, 2007-07-06 at 09:13 +0200, Rafael J. Wysocki wrote:
> 
> The only reason (I know of) why we don't handle uninterruptible tasks in the
> freezer is that we're afraid of the suspend process deadlocking with an
> uninterruptible task holding a lock, but AFAICS the probability of such an
> event is extremely small.

What would deadlock specifically ? One of the drivers trying to acquire
that lock ? It would be a driver bug then.

> Also, the powermac suspend will deadlock in such cases, so the fact that
> it doesn't deadlock means that they don't occur very often (if at all).

Ben.



* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  7:35                           ` Rafael J. Wysocki
@ 2007-07-06  9:03                             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-06  9:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Matthew Garrett,
	linux-kernel, linux-pm, Alan Stern


> On powerbooks you disable the nonboot CPUs before suspending devices, which
> simplifies things _a_ _lot_ in comparison with the general case.

Not that much actually. It avoids having to gather them all in the last
stage at the low level but that isn't really related to what we are
talking about right now.

> If you additionally disable kernel preemption, then you don't need anything
> like the freezer anyway in that case (except for detecting situations in which
> the suspend process can deadlock with a task stuck in the D state holding a
> lock, but that's overkill).

Kernel preemption and SMP have nothing to do with it. You can still
schedule because one of the driver suspend() callbacks is scheduling,
maybe waiting for some requests to complete, etc... Happens typically
with the disk suspend callbacks.

I agree that it does lower the likelihood of having a userspace process
pound on the wrong driver at the wrong time though.

Ben.



* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  7:51                               ` Oliver Neukum
@ 2007-07-06  9:09                                 ` Miklos Szeredi
  2007-07-06  9:16                                   ` Nigel Cunningham
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-06  9:09 UTC (permalink / raw)
  To: oliver; +Cc: miklos, rjw, nigel, pavel, benh, mjg59, linux-kernel, linux-pm

> > > Moreover, if FUSE implements syncing, then the sync from within the freezer
> > > will almost certainly deadlock.
> > 
> > Rafael, think positively: by the time fuse implements sync(), the
> > freezer will be long gone ;)
> 
> Now you are entering really dangerous territory.
> If you can implement a meaningful sync method, you must have dirty
> pages in the page cache. That means you are in the page freeing path
> of the vm. Then we are in real trouble. Don't even think about it.

VM induced deadlocks are real nasty.  I have thought about them a lot
already.  Suspend shouldn't introduce any big surprises.

Miklos

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  9:09                                 ` Miklos Szeredi
@ 2007-07-06  9:16                                   ` Nigel Cunningham
  2007-07-06  9:33                                     ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Nigel Cunningham @ 2007-07-06  9:16 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, rjw, nigel, pavel, benh, mjg59, linux-kernel, linux-pm

Hi.

On Friday 06 July 2007 19:09:43 Miklos Szeredi wrote:
> > > > Moreover, if FUSE implements syncing, then the sync from within the freezer
> > > > will almost certainly deadlock.
> > > 
> > > Rafael, think positively: by the time fuse implements sync(), the
> > > freezer will be long gone ;)
> > 
> > Now you are entering really dangerous territory.
> > If you can implement a meaningful sync method, you must have dirty
> > pages in the page cache. That means you are in the page freeing path
> > of the vm. Then we are in real trouble. Don't even think about it.
> 
> VM induced deadlocks are real nasty.  I have thought about them a lot
> already.  Suspend shouldn't introduce any big surprises.

Suspend isn't introducing the surprises. Fuse is. It creates the potential 
deadlocks simply by existing (this isn't suspend or hibernate specific).

Nigel

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  8:59                                                       ` Benjamin Herrenschmidt
@ 2007-07-06  9:31                                                         ` Oliver Neukum
  2007-07-06  9:53                                                           ` Rafael J. Wysocki
  2007-07-07  2:44                                                           ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-06  9:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Rafael J. Wysocki, Alan Stern, Miklos Szeredi, pavel, paulus,
	johannes, linux-pm, linux-kernel, mjg59

Am Freitag, 6. Juli 2007 schrieb Benjamin Herrenschmidt:
> On Fri, 2007-07-06 at 09:13 +0200, Rafael J. Wysocki wrote:
> > 
> > The only reason (I know of) why we don't handle uninterruptible tasks in the
> > freezer is that we're afraid of the suspend process deadlocking with an
> > uninterruptible task holding a lock, but AFAICS the probability of such an
> > event is extremely small.
> 
> What would deadlock specifically ? One of the drivers trying to acquire
> that lock ? It would be a driver bug then.

Your driver's write method looks like:

mutex_lock();
poke_some_hardware();
wait_event_uninterruptible(); //for result
res = evaluate_result();
mutex_unlock();
return res;

If you put a task into the refrigerator at wait_event_interruptible()
you will deadlock if you need this lock for the driver to go to suspend.
The suspend method then must not take the lock _and_ it must be
aware that there may be an ongoing operation.
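
The two suspend variants contrasted above can be modelled in a few
lines of single-threaded userspace C. This is only a sketch under
stated assumptions, not a real driver: struct dev_model and the dev_*()
functions are hypothetical, a boolean stands in for the mutex, and a
"false" return from the locking variant stands in for what would be a
permanent block in the kernel.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model of a driver, not kernel code. */
struct dev_model {
    bool io_locked;  /* stands in for the driver's I/O mutex */
    bool suspended;  /* device has been put to sleep */
    int aborted;     /* operations that were in flight at suspend time */
};

/* Start a write: take the lock unless it is held or the device sleeps. */
bool dev_write_begin(struct dev_model *d)
{
    if (d->suspended || d->io_locked)
        return false;
    d->io_locked = true;
    return true;
}

/* Finish a write and drop the lock. */
void dev_write_end(struct dev_model *d)
{
    d->io_locked = false;
}

/*
 * Suspend variant that needs the I/O lock. If a frozen writer holds
 * the lock, the real thing would block forever; the model reports
 * failure instead and leaves the device awake.
 */
bool dev_suspend_locking(struct dev_model *d)
{
    if (d->io_locked)
        return false;
    d->suspended = true;
    return true;
}

/*
 * Suspend variant that does not take the lock. It has to notice an
 * operation in flight and account for it, here by counting it as
 * aborted so it could be failed or retried after resume.
 */
bool dev_suspend_lockless(struct dev_model *d)
{
    if (d->io_locked)
        d->aborted++;
    d->suspended = true;
    return true;
}
```

In this model a writer parked between dev_write_begin() and
dev_write_end() makes dev_suspend_locking() fail, while
dev_suspend_lockless() succeeds but must carry the extra bookkeeping
for the interrupted operation.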

	Regards
		Oliver

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  9:16                                   ` Nigel Cunningham
@ 2007-07-06  9:33                                     ` Miklos Szeredi
  0 siblings, 0 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-06  9:33 UTC (permalink / raw)
  To: nigel
  Cc: miklos, oliver, rjw, nigel, pavel, benh, mjg59, linux-kernel, linux-pm

> > VM induced deadlocks are real nasty.  I have thought about them a lot
> > already.  Suspend shouldn't introduce any big surprises.
> 
> Suspend isn't introducing the surprises. Fuse is. It creates the potential 
> deadlocks simply by existing (this isn't suspend or hibernate specific).

Yes.  And fuse is not going away any time soon.  Get over it ;)

Miklos

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  9:31                                                         ` Oliver Neukum
@ 2007-07-06  9:53                                                           ` Rafael J. Wysocki
  2007-07-07  2:46                                                             ` Benjamin Herrenschmidt
  2007-07-07  2:44                                                           ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-06  9:53 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Benjamin Herrenschmidt, Alan Stern, Miklos Szeredi, pavel,
	paulus, johannes, linux-pm, linux-kernel, mjg59

On Friday, 6 July 2007 11:31, Oliver Neukum wrote:
> Am Freitag, 6. Juli 2007 schrieb Benjamin Herrenschmidt:
> > On Fri, 2007-07-06 at 09:13 +0200, Rafael J. Wysocki wrote:
> > > 
> > > The only reason (I know of) why we don't handle uninterruptible tasks in the
> > > freezer is that we're afraid of the suspend process deadlocking with an
> > > uninterruptible task holding a lock, but AFAICS the probability of such an
> > > event is extremely small.
> > 
> > What would deadlock specifically ? One of the drivers trying to acquire
> > that lock ? It would be a driver bug then.
> 
> Your driver's write method looks like:
> 
> mutex_lock();
> poke_some_hardware();
> wait_event_uninterruptible(); //for result
> res = evaluate_result();
> mutex_unlock();
> return res;
> 
> If you put a task into the refrigerator at wait_event_interruptible()
> you will deadlock if you need this lock for the driver to go to suspend.
> The suspend method then must not take the lock _and_ it must be
> aware that there may be an ongoing operation.

s/interruptible/uninterruptible/

> you will deadlock if you need this lock for the driver to go to suspend.
> The suspend method then must not take the lock _and_ it must be
> aware that there may be an ongoing operation.

Well, is there any driver in the tree that works like that _and_ has a
.suspend() method requiring the same lock?

Besides, I'm not going to put the task into the refrigerator at that point.

Please read http://lkml.org/lkml/2007/7/6/71

Moreover, I claim that, in the context of your example, _if_ the task is stuck
at the wait_event_uninterruptible(), _then_ the freezerless suspend will
deadlock with the task.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  7:30                                     ` Oliver Neukum
@ 2007-07-06 12:35                                       ` Benny Amorsen
  2007-07-06 12:45                                         ` Oliver Neukum
  0 siblings, 1 reply; 388+ messages in thread
From: Benny Amorsen @ 2007-07-06 12:35 UTC (permalink / raw)
  To: linux-kernel

>>>>> "ON" == Oliver Neukum <oliver@neukum.org> writes:

ON> Because we will be unable to escape that job. Let's assume that we
ON> remove the freezer from the STR path. The next complaint would be
ON> that we cannot do STD with fuse. "Then don't do that" would not be
ON> taken kindly as answer.

Ah, we are back to keeping STR broken in order to maybe get STD
working. STR is much more interesting than STD.


/Benny



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06 12:35                                       ` Benny Amorsen
@ 2007-07-06 12:45                                         ` Oliver Neukum
  0 siblings, 0 replies; 388+ messages in thread
From: Oliver Neukum @ 2007-07-06 12:45 UTC (permalink / raw)
  To: Benny Amorsen; +Cc: linux-kernel

Am Freitag, 6. Juli 2007 schrieb Benny Amorsen:
> >>>>> "ON" == Oliver Neukum <oliver@neukum.org> writes:
> 
> ON> Because we will be unable to escape that job. Let's assume that we
> ON> remove the freezer from the STR path. The next complaint would be
> ON> that we cannot do STD with fuse. "Then don't do that" would not be
> ON> taken kindly as answer.
> 
> Ah, we are back to keeping STR broken in order to maybe get STD
> working. STR is much more interesting than STD.

I am sorry, but we cannot live with only fixing parts of the kernel because
you don't think the other parts are interesting.

	Oliver

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  3:59                         ` Benjamin Herrenschmidt
  2007-07-06  7:35                           ` Rafael J. Wysocki
@ 2007-07-06 14:38                           ` Alan Stern
  2007-07-07  3:44                             ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-06 14:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm

On Fri, 6 Jul 2007, Benjamin Herrenschmidt wrote:

> What you propose is basically a slightly over-simplistic version of what
> I think (and Paulus thinks) should be done. We do need to do it via
> driver callbacks down the tree since only drivers can know how to deal
> with their DMA etc... and ordering needs to be respected, but that's
> basically it.
> 
> And guess what ? It's what we do on powerbooks, and it works fine,
> without a freezer :-)

I wish you'd stop saying that.  Have you ever done any serious testing?

Here's something to try:  Add a time delay to the end of hub_suspend in
drivers/usb/core/hub.c, so you can provoke a race manually.  Then while
one of your root hubs is being suspended and the system is waiting in
that delay, either plug in a new USB device to that hub or unplug an
existing device.

Be sure that CONFIG_USB_DEBUG is on so that we can figure out what 
happened after the fact.

Alan Stern


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 22:59                           ` Benjamin Herrenschmidt
  2007-07-06  7:20                             ` Rafael J. Wysocki
@ 2007-07-06 15:13                             ` Alan Stern
  2007-07-08  7:19                               ` Paul Mackerras
  2007-07-08  7:35                               ` [PATCH] Remove process freezer from suspend to RAM pathway (philosophical) Oleg Verych
  2007-07-07  7:56                             ` [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway Pavel Machek
  2 siblings, 2 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-06 15:13 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Paul Mackerras, Johannes Berg, Rafael J. Wysocki,
	Linux-pm mailing list, Kernel development list, Pavel Machek,
	Matthew Garrett

On Fri, 6 Jul 2007, Benjamin Herrenschmidt wrote:

> Why you guys are working so hard and spending so much energy trying to
> avoid doing the right thing is beyond my understanding...
> 
> > It _does_ apply to kernel threads.  That's exactly why I wrote above 
> > that kernel threads which try to do I/O during a suspend will need 
> > extra attention.
> 
> Ok none at all if you don't have a freezer.

In answer to both questions: We need the freezer in order to implement 
hibernate.  Even if we take your advice and stop using the freezer 
during suspend, these issues would still remain and would need to be 
solved.

Alan Stern


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  1:19                       ` Kyle Moffett
  2007-07-06  1:37                         ` Nigel Cunningham
  2007-07-06  3:59                         ` Benjamin Herrenschmidt
@ 2007-07-06 15:42                         ` Alan Stern
  2007-07-07  0:43                           ` Kyle Moffett
  2 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-06 15:42 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Nigel Cunningham, Benjamin Herrenschmidt, Pavel Machek,
	Rafael J. Wysocki, Matthew Garrett, linux-kernel, linux-pm

On Thu, 5 Jul 2007, Kyle Moffett wrote:

> Umm, this thread is NOT ABOUT HIBERNATING!!!  Please go back and read  
> the subject, specifically the "suspend to RAM" parts :-D.

But it _is_ about the freezer; see the "Remove process freezer" part.  
:-)  Since the freezer is used during hibernation, hibernation is a
legitimate topic.

>  When your  
> hardware can put itself to sleep and atomically preserve memory as it  
> does so, you don't need an atomic copy.  For Real Suspend(TM) (IE:  
> Suspend-to-RAM), the list of things to do is short and simple:
> 
> 1)  Stop DMA and put most hardware into low-power states (stops all  
> interrupt sources)
> 2)  Ensure that the other CPUs have finished any trailing interrupt  
> handlers and put them to sleep
> 3)  Put the interrupt-controllers into low-power state
> 4)  Go to sleep

Your short and simple list omits a few crucial items:

A)  Decide what to do about remote wakeup requests.
B)  Prevent I/O requests from resuming devices that have been 
suspended.
C)  Prevent devices and drivers from being registered or unregistered; 
in particular decide what to do about hot-plug or hot-unplug events.
D)  Block driver bind or unbind calls.

Any of these things is capable of screwing up the course of events.
(In fact A _should_ be allowed to abort a suspend.)

> > As Pavel rightly said, you can get rid of the freezer, but you're  
> > only going to have to implement another one that does  
> > essentially the same thing, even if it is at some other level.
> 
> How about a freezer whose job it is to "wait for pending hard  
> interrupts to complete when we have already guaranteed that we won't  
> get any more"?  That part should be really *REALLY* easy.  You don't  
> need to care about either userspace processes or kernel threads at  
> all.  Specifically, Step 1 consists of:
> 
> suspend_device(dev)
> {
> 	set_no_bind_flag(dev);
> 	for (child in dev->subdevices)
> 		suspend_device(child);
> 	set_no_io_flag(dev);
> 	wait_for_in_progress_dma(dev);
> 	turn_off_interrupts(dev);
> 	go_to_low_power_state(dev);
> }
> 
> After you've set the "no_bind" flag, you won't get any *new*  
> subdevices trying to bind, 

So what happens if a new subdevice arrives at the wrong time?  Do you 
block instead of binding it?  While holding a mutex needed to suspend 
the parent device?

What about drivers trying to bind to existing devices?

What happens to I/O requests submitted after the "no_io" flag is set?  
The driver will have to block them, effectively creating its own little 
"freezer".

> therefore it's safe to iterate over the  
> list of present sub-devices and suspend them.  Once those are  
> suspended and in low-power states you can set a "no_io" flag to  
> prevent the driver from submitting more IO.  At that point you can  
> lazily wait for existing DMA/IO/interrupts to finish on the device,  
> since *NOBODY* will be submitting them anymore, and we certainly  
> aren't probing for new devices.  Then you can just turn off the power  
> to the device.  When all the leaf devices are off, the parent device  
> can be turned off because everything waiting on the leaf devices is  
> blocked on them and won't unblock until the parent device *AND* the  
> leaf device are turned on again, in that order.

This is a lot like what we already do.  The differences are:

	There is nothing corresponding to your "no-bind" flag.

	Most drivers don't have anything like your "no_io" flag;
	they assume that nobody will be around to submit an I/O
	request.
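
The child-before-parent ordering in the quoted pseudocode can be sketched in plain userspace C. This is only an illustrative model under stated assumptions, not kernel code: struct toy_dev and the toy_* helpers are hypothetical names, and the waits for DMA and interrupts are reduced to a comment.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 4

/* Hypothetical stand-in for a device node; not the kernel's struct device. */
struct toy_dev {
	const char *name;
	bool no_bind;		/* new children may not bind */
	bool no_io;		/* new I/O must wait */
	bool suspended;
	int suspend_seq;	/* 1-based position in the suspend order */
	int resume_seq;		/* 1-based position in the resume order */
	struct toy_dev *children[TOY_MAX_CHILDREN];
	size_t nr_children;
};

static int toy_suspend_counter;
static int toy_resume_counter;

/* Suspend the whole subtree below dev, children before the parent. */
static void toy_suspend_device(struct toy_dev *dev)
{
	dev->no_bind = true;
	for (size_t i = 0; i < dev->nr_children; i++)
		toy_suspend_device(dev->children[i]);
	dev->no_io = true;
	/* A real driver would wait for in-flight DMA and mask interrupts here. */
	dev->suspended = true;
	dev->suspend_seq = ++toy_suspend_counter;
}

/* Resume in the reverse order: the parent first, then its children. */
static void toy_resume_device(struct toy_dev *dev)
{
	dev->suspended = false;
	dev->resume_seq = ++toy_resume_counter;
	dev->no_io = false;
	for (size_t i = 0; i < dev->nr_children; i++)
		toy_resume_device(dev->children[i]);
	dev->no_bind = false;
}
```

In this model, suspending a hub with two children marks both children suspended before the hub, and resume walks the tree parent first. What the model leaves out is exactly the open question here: what a bind or I/O request does when it finds one of the flags set.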

> Scheduling and userspace are all still fully enabled in this  
> scenario.  Once all your devices are turned off, the only remaining  
> running threads will be those which haven't done IO since the  
> beginning of the suspend.  We can then disable preemption, turn off  
> the timer interrupts, and tell the other CPUs to park all their  
> remaining threads in schedule() and sleep.  Then we put the IRQ  
> controller to sleep and go to sleep ourselves.  If our driver model  
> locking is sufficient to handle putting a parent device to sleep  
> while threads are sleeping on a child device then there are exactly 0  
> problems.
> 
> Resuming is basically running the whole process in reverse.  Runtime- 
> suspend is achieved by not setting the 'no_io' or 'no_bind' flags and  
> putting selective device-subtrees to sleep without doing anything to  
> the rest of the system.

Nobody doubts that suspend can be made to work without the freezer.  
The point is that doing it this way dumps a bunch of extra 
responsibility on drivers.

Alan Stern


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06 15:42                         ` Alan Stern
@ 2007-07-07  0:43                           ` Kyle Moffett
  2007-07-07  2:59                             ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Kyle Moffett @ 2007-07-07  0:43 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Benjamin Herrenschmidt, Pavel Machek,
	Rafael J. Wysocki, Matthew Garrett, linux-kernel, linux-pm

On Jul 06, 2007, at 11:42:33, Alan Stern wrote:
> On Thu, 5 Jul 2007, Kyle Moffett wrote:
>> Umm, this thread is NOT ABOUT HIBERNATING!!!  Please go back and  
>> read the subject, specifically the "suspend to RAM" parts :-D.
>
> But it _is_ about the freezer; see the "Remove process freezer"  
> part. :-)  Since the freezer is used during hibernation,  
> hibernation is a legitimate topic.

Except Linus already decreed (and I heartily agree) that hibernation  
and suspend-to-RAM were fundamentally completely different operations  
and therefore any attempts to share code were basically just making a  
big muddy mess of things.  Would a thread "Remove phase-of-the-moon  
calculations from network-recv code" be relevant to lunar observation  
just because the two had to do with the phases of the moon?  No!

>> When your hardware can put itself to sleep and atomically preserve  
>> memory as it does so, you don't need an atomic copy.  For Real  
>> Suspend(TM) (IE: Suspend-to-RAM), the list of things to do is  
>> short and simple:
>>
>> 1)  Stop DMA and put most hardware into low-power states (stops  
>> all interrupt sources)
>> 2)  Ensure that the other CPUs have finished any trailing  
>> interrupt handlers and put them to sleep
>> 3)  Put the interrupt-controllers into low-power state
>> 4)  Go to sleep
>
> Your short and simple list omits a few crucial items:
>
> A)  Decide what to do about remote wakeup requests.

Why do we care?  If the wakeup request arrives before we go to sleep,  
we obviously aren't asleep and so can't wake up.  If it arrives after  
we go to sleep then it will wake us up.  Anything that depends on a  
wakeup arriving mid-sequence is 100% masochistic race condition.

> B)  Prevent I/O requests from resuming devices that have been  
> suspended.

(1a) As I describe below, step (1) includes setting NO_BIND and NO_IO  
flags on devices as they are processed.  Anybody who wants to do IO  
while those flags are set should just go sleep on a waitqueue.

> C)  Prevent devices and drivers from being registered or  
> unregistered; in particular decide what to do about hot-plug or  
> hot-unplug events.

(1b) Again, that's where the NO_BIND flag comes in.  If it's set then  
any device probe events must sleep, otherwise they can go through.

> D)  Block driver bind or unbind calls.

See points (1a) and (1b) above.

> Any of these things is capable of screwing up the course of  
> events.  (In fact A _should_ be allowed to abort a suspend.)

If any of those things screw up suspend-to-RAM then it is 100% the  
driver's fault and no "process freezer" is going to fix it, end of  
story.  And "A" cannot be made reliable.  At some point you shut off  
interrupts right before going to sleep, and at that point any remote  
wakeup event is just going to get dropped until you actually enter  
sleep mode and the hardware takes over again.  If you miss a wakeup  
event then whatever sent it should just retry, just as with *every*  
other kind of network packet.

>> How about a freezer whose job it is to "wait for pending hard  
>> interrupts to complete when we have already guaranteed that we  
>> won't get any more"?  That part should be really *REALLY* easy.   
>> You don't need to care about either userspace processes or kernel  
>> threads at all.  Specifically, Step 1 consists of:
>>
>> suspend_device(dev)
>> {
>> 	set_no_bind_flag(dev);
>> 	for (child in dev->subdevices)
>> 		suspend_device(child);
>> 	set_no_io_flag(dev);
>> 	wait_for_in_progress_dma(dev);
>> 	turn_off_interrupts(dev);
>> 	go_to_low_power_state(dev);
>> }
>>
>> After you've set the "no_bind" flag, you won't get any *new*  
>> subdevices trying to bind,
>
> So what happens if a new subdevice arrives at the wrong time?  Do  
> you block instead of binding it?  While holding a mutex needed to  
> suspend the parent device?

That would be a driver bug.  If you have asynchronous probing then  
proper suspend handling includes being able to postpone driver probe  
events until after resume.  If you have synchronous probing then the  
problem doesn't exist because "set_no_bind_flag" is just telling the  
device not to raise any more device probe interrupts.

> What about drivers trying to bind to existing devices?

While binding it will clearly be holding a mutex/spinlock on the  
parent device, so the suspend process will wait for it.  When binding  
is done the suspend_device() code will take the device lock and tell  
everything else to postpone further bind requests as above.

> What happens to I/O requests submitted after the "no_io" flag is  
> set?  The driver will have to block them, effectively creating its  
> own little "freezer".

Oh, so you're calling every waitqueue in the kernel a "freezer" now?   
We do these things at the driver level *all* *the* *time*.  For  
instance, you can't submit new IOs to an ATA controller while it's  
renegotiating the bus speed, but that's never been a problem before.

>> When all the leaf devices are off, the parent device can be turned  
>> off because everything waiting on the leaf devices is blocked on  
>> them and won't unblock until the parent device *AND* the leaf  
>> device are turned on again, in that order.
>
> This is a lot like what we already do.  The differences are:
>
> There is nothing corresponding to your "no-bind" flag.
>
> Most drivers don't have anything like your "no_io" flag; they  
> assume that nobody will be around to submit an I/O request.

Most drivers have an implicit NO_BIND flag:  The device's interrupts  
are off and/or it's in a low-power state.  USB is already terribly  
buggy with regards to suspend:  If you hotplug a device during  
suspend (like the touchpad in my powerbook powering down/up), then  
the USB stack will basically hang that controller.  The device is off  
and the hotplug triggers interrupts and IO, *EVEN* *WITHOUT*  
*USERSPACE*.

So if your driver doesn't already have a proper way of blocking IO  
during suspend then it probably doesn't suspend 50% of the time  
anyways.  A bug which bites *every* *time* is easy to fix, one which  
only bites when things hit a race condition is much harder.

>> Resuming is basically running the whole process in reverse.   
>> Runtime-suspend is achieved by not setting the 'no_io' or  
>> 'no_bind' flags and putting selective device-subtrees to sleep  
>> without doing anything to the rest of the system.
>
> Nobody doubts that suspend can be made to work without the freezer.  
> The point is that doing it this way dumps a bunch of extra  
> responsibility on drivers.

That responsibility has been there ever since suspend-to-RAM support  
was added.  Nobody ever denied that writing a proper driver was  
tricky.  You have to simultaneously be able to handle hot-unplug,  
IO errors, interrupts, IO requests, suspend-to-RAM, and  
hibernation.  If your driver mutual-exclusion is buggy then it  
probably already bites you during hotplug or other similar  
scenarios.  Let's at least make the problems much more reproducible  
so we can fix the drivers properly instead of continuing to kludge  
around it for all eternity.

Cheers,
Kyle Moffett


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  9:31                                                         ` Oliver Neukum
  2007-07-06  9:53                                                           ` Rafael J. Wysocki
@ 2007-07-07  2:44                                                           ` Benjamin Herrenschmidt
  2007-07-07 20:48                                                             ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-07  2:44 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Rafael J. Wysocki, Alan Stern, Miklos Szeredi, pavel, paulus,
	johannes, linux-pm, linux-kernel, mjg59

On Fri, 2007-07-06 at 11:31 +0200, Oliver Neukum wrote:
> Am Freitag, 6. Juli 2007 schrieb Benjamin Herrenschmidt:
> > On Fri, 2007-07-06 at 09:13 +0200, Rafael J. Wysocki wrote:
> > > 
> > > The only reason (I know of) why we don't handle uninterruptible tasks in the
> > > freezer is that we're afraid of the suspend process deadlocking with an
> > > uninterruptible task holding a lock, but AFAICS the probability of such an
> > > event is extremely small.
> > 
> > What would deadlock specifically ? One of the drivers trying to acquire
> > that lock ? It would be a driver bug then.
> 
> Your driver's write method looks like:
> 
> mutex_lock();
> poke_some_hardware();
> wait_event_uninterruptible(); //for result
> res = evaluate_result();
> mutex_unlock();
> return res;
> 
> If you put a task into the refrigerator at wait_event_uninterruptible()
> you will deadlock if you need this lock for the driver to go to suspend.
> The suspend method then must not take the lock _and_ it must be
> aware that there may be an ongoing operation.

Well... 2 things here. Either you have a freezer in which case the
chances of the above scenario are increased, or you don't, in which case
your suspend method will just sleep on the lock until outstanding HW
accesses that have that lock are completed, and everything is fine.
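
The two cases above can be modelled with a tiny state machine. This is a sketch with hypothetical names (struct toy_driver, toy_*), not driver code, and it assumes the hardware eventually completes the request; a request that never completes would block the freezerless suspend on the lock as well.

```c
#include <stdbool.h>

/* Hypothetical states of the task sitting in the driver's write method. */
enum toy_task_state { TOY_IDLE, TOY_WAITING_FOR_HW, TOY_FROZEN };

struct toy_driver {
	bool lock_held;			/* models the driver mutex */
	enum toy_task_state writer;
};

/* write(): take the lock, poke the hardware, then wait for the result. */
static void toy_write_begin(struct toy_driver *d)
{
	d->lock_held = true;
	d->writer = TOY_WAITING_FOR_HW;
}

/* The hardware answers; a frozen writer cannot run, so it keeps the lock. */
static void toy_hw_complete(struct toy_driver *d)
{
	if (d->writer == TOY_WAITING_FOR_HW) {
		d->writer = TOY_IDLE;
		d->lock_held = false;
	}
}

/* A freezer that is (hypothetically) allowed to freeze uninterruptible tasks. */
static void toy_freeze(struct toy_driver *d)
{
	if (d->writer == TOY_WAITING_FOR_HW)
		d->writer = TOY_FROZEN;
}

/*
 * suspend() needs the same lock. Returns 0 once the lock was obtained,
 * -1 if the lock owner can never release it (deadlock).
 */
static int toy_suspend(struct toy_driver *d)
{
	if (d->lock_held && d->writer == TOY_FROZEN)
		return -1;
	toy_hw_complete(d);	/* sleeping on the lock lets the writer finish */
	d->lock_held = true;	/* suspend now owns the lock */
	return 0;
}
```

With the writer frozen while it holds the lock, toy_suspend() reports a deadlock; without freezing, the writer finishes and suspend gets the lock.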

You need to be careful with one thing though, whether you have a freezer
or not. If your driver, in some code path, whatever it is (ioctl, kernel
thread, workqueue, ...) does something like:

mutex_lock();
kmalloc(..., GFP_KERNEL);
mutex_unlock();

And its suspend callback then does:

mutex_lock();

The problem here is that the disks might already have been suspended
prior to your driver being called. Thus, any attempt at pushing things
out to swap or dirty mmap'ings back to storage will hang, thus kmalloc
can potentially hang (afaik), and you will deadlock.

That's what I've been talking about earlier when I said that we should
have some security in SLAB/SLUB/Buddy allocators, to silently turn
GFP_KERNEL to at least GFP_NOIO or even ATOMIC before we start
suspending drivers.
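
As a rough sketch of that idea: keep a global mask of allowed allocation flags and strip the I/O-related bits from every request while a suspend is in progress. The TOY_GFP_* values and the toy_* function names are made up for illustration and do not match the kernel's definitions.

```c
/* Toy flag bits; the values are illustrative, not the kernel's. */
#define TOY_GFP_WAIT	0x1u	/* may sleep */
#define TOY_GFP_IO	0x2u	/* may start disk I/O */
#define TOY_GFP_FS	0x4u	/* may call into filesystems */

#define TOY_GFP_KERNEL	(TOY_GFP_WAIT | TOY_GFP_IO | TOY_GFP_FS)
#define TOY_GFP_NOIO	(TOY_GFP_WAIT)

/* Bits that allocations are currently allowed to use. */
static unsigned int toy_gfp_allowed_mask = TOY_GFP_KERNEL;

/* Called before the first driver is suspended. */
static void toy_pm_restrict_gfp(void)
{
	toy_gfp_allowed_mask &= ~(TOY_GFP_IO | TOY_GFP_FS);
}

/* Called after the last driver has been resumed. */
static void toy_pm_restore_gfp(void)
{
	toy_gfp_allowed_mask = TOY_GFP_KERNEL;
}

/* What the allocator would actually honour for a given request. */
static unsigned int toy_effective_gfp(unsigned int flags)
{
	return flags & toy_gfp_allowed_mask;
}
```

While the mask is restricted, a TOY_GFP_KERNEL request behaves like TOY_GFP_NOIO, so the allocation may still sleep but cannot wait on a disk that is already suspended.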

Now, another way to deal with that would be to use
pre-suspend/post-resume notifications, and have drivers avoid doing the
above between those, but that's much harder. (Essentially, drivers would
have to either make sure they don't do things like blocking allocations,
even implicitly, or possibly fall back to a degraded synchronous mode
or that sort of thing).

I think it's much simpler to tweak slab/slub/buddy instead :-)

Note that the above issue is orthogonal to our freezer discussion, it's
just one of the potential deadlock causes we have with suspend that need
to be fixed.

Cheers,
Ben.



* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  9:53                                                           ` Rafael J. Wysocki
@ 2007-07-07  2:46                                                             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-07  2:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oliver Neukum, Alan Stern, Miklos Szeredi, pavel, paulus,
	johannes, linux-pm, linux-kernel, mjg59

On Fri, 2007-07-06 at 11:53 +0200, Rafael J. Wysocki wrote:
> 
> Moreover, I claim that, in the context of your example, _if_ the task
> is stuck
> at the wait_event_uninterruptible(), _then_ the freezerless suspend
> will
> deadlock with the task.

Why would the task be stuck there if it's not because of a freezer? The
only reason I see would be a dependency on something like kmalloc trying
to push things to disk, or a lock owned by another user process, but the
former is a generic issue I've discussed in a separate mail and the
latter is a driver bug imho.

Ben.



* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07  0:43                           ` Kyle Moffett
@ 2007-07-07  2:59                             ` Alan Stern
  2007-07-07  4:06                               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-07  2:59 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Nigel Cunningham, Benjamin Herrenschmidt, Pavel Machek,
	Rafael J. Wysocki, Matthew Garrett, linux-kernel, linux-pm

On Fri, 6 Jul 2007, Kyle Moffett wrote:

> Except Linus already decreed (and I heartily agree) that hibernation  
> and suspend-to-RAM were fundamentally completely different operations  
> and therefore any attempts to share code were basically just making a  
> big muddy mess of things.  Would a thread "Remove phase-of-the-moon  
> calculations from network-recv code" be relevant to lunar observation  
> just because the two had to do with the phases of the moon?  No!

For that matter, shutting down a CPU and hibernation are fundamentally
different operations -- but they both use the freezer.  Is that a big
muddy mess?  But never mind; I won't discuss hibernation.

> > A)  Decide what to do about remote wakeup requests.
> 
> Why do we care?  If the wakeup request arrives before we go to sleep,  
> we obviously aren't asleep and so can't wake up.  If it arrives after  
> we go to sleep then it will wake us up.  Anything that depends on a  
> wakeup arriving mid-sequence is 100% masochistic race condition.

You don't understand my point.  If a wakeup request arrives before the
system goes to sleep, and it is serviced, then a device which ought to
have been suspended will in fact be awake.  This will (if the parent's
driver is written correctly) cause the sleep transition to abort.

Not that there's necessarily anything wrong with that.  I just wanted 
to be sure you were aware of the potential problems.

> > C)  Prevent devices and drivers from being registered or  
> > unregistered; in particular decide what to do about hot-plug or  
> > hot-unplug events.
> 
> (1b) Again, that's where the NO_BIND flag comes in.  If it's set then  
> any device probe events must sleep, otherwise they can go through.

I didn't say "bind", I said "registered".  Admittedly, they are rather
similar.

Still, there are difficulties.  Let's say a driver has set the NO_BIND 
flag for one of its devices.  A bind request comes in, and the driver 
puts it on a waitqueue.  Note that the binding thread holds the device 
semaphore; this is always true when a driver's probe routine is called.

Later on it comes time for the PM core to resume the device, which will
start up the threads on the waitqueue.  Before doing so it must acquire
the device semaphore.  Deadlock!

> If any of those things screw up suspend-to-RAM then it is 100% the  
> driver's fault and no "process freezer" is going to fix it, end of  
> story.

Why do you say that?  A "process freezer" can prevent bind and
registration calls from occurring, since these calls have to run in
process context.  Ergo a freezer _can_ fix some of these problems.

>  And "A" cannot be made reliable.  At some point you shut off  
> interrupts right before going to sleep, and at that point any remote  
> wakeup event is just going to get dropped until you actually enter  
> sleep mode and the hardware takes over again.  If you miss a wakeup  
> event then whatever sent it should just retry, just as with *every*  
> other kind of network packet.

Who mentioned network packets?  And who says a remote wakeup event will 
get dropped once interrupts are disabled?  More likely it will set a 
bit somewhere that causes the system to wake up immediately after it 
has gone to sleep.
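
A minimal model of that behaviour, assuming the platform latches a wakeup event that arrives while interrupts are disabled (whether real hardware does so depends on the platform; struct toy_pm and the toy_* helpers are hypothetical names):

```c
#include <stdbool.h>

struct toy_pm {
	bool irqs_enabled;
	bool asleep;
	bool wakeup_pending;	/* latched by hardware while IRQs are off */
	int handled;		/* wakeups serviced by a driver */
};

/* A remote wakeup request arrives at an arbitrary point in time. */
static void toy_wakeup_event(struct toy_pm *pm)
{
	if (pm->asleep)
		pm->asleep = false;		/* hardware wakes the system */
	else if (pm->irqs_enabled)
		pm->handled++;			/* serviced like any interrupt */
	else
		pm->wakeup_pending = true;	/* not dropped, just latched */
}

/* Final step of suspend: IRQs are already off when this runs. */
static void toy_enter_sleep(struct toy_pm *pm)
{
	pm->asleep = true;
	if (pm->wakeup_pending) {
		pm->wakeup_pending = false;
		pm->asleep = false;	/* wakes up right after going to sleep */
	}
}
```

In this model the request is never lost in the window between disabling interrupts and entering sleep; it just turns the suspend into a very short one.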

> > So what happens if a new subdevice arrives at the wrong time?  Do  
> > you block instead of binding it?  While holding a mutex needed to  
> > suspend the parent device?
> 
> That would be a driver bug.  If you have asynchronous probing then  
> proper suspend handling includes being able to postpone driver probe  
> events until after resume.

One of your conditions (embodied in the pseudocode you posted earlier) 
was that drivers should be told to prevent binding and registration 
before the child devices are suspended.  Currently the PM core doesn't 
do anything like that.  You can't blame the drivers for this lack.

Of course it could be added.  Or perhaps more easily, the drivers that 
support asynchronous probing could be notified when a suspend is about 
to start so they could begin blocking bindings/registrations then.

> > What about drivers trying to bind to existing devices?
> 
> While binding it will clearly be holding a mutex/spinlock on the  
> parent device,

("Parent device"?  Do you mean the device being bound?  If so then I
agree.  Or do you mean the device's parent?  If so then your statement
is not clear at all.  There is special-case code in the driver core to
make sure it is true for USB devices, and it looks ugly as can be.)

> so the suspend process will wait for it.  When binding  
> is done the suspend_device() code will take the device lock and tell  
> everything else to postpone further bind requests as above.

My question referred to drivers trying to bind or unbind a device
_after_ the device has been suspended.  I suppose you'll say that's
covered by the NO_BIND flag.  But now we have the locking problem
mentioned above: The thread trying to bind is holding a lock which is
needed for resuming.

> Most drivers have an implicit NO_BIND flag:  The device's interrupts  
> are off and/or it's in a low-power state.

That won't cause binding to block or be postponed; it will cause it to
fail.  Not the same thing at all.

>  USB is already terribly  
> buggy with regards to suspend:  If you hotplug a device during  
> suspend (like the touchpad in my powerbook powering down/up), then  
> the USB stack will basically hang that controller.  The device is off  
> and the hotplug triggers interrupts and IO, *EVEN* *WITHOUT*  
> *USERSPACE*.

As one of the people responsible for the USB power management 
implementation, I would appreciate more details about this.  For 
example, a dmesg log with CONFIG_USB_DEBUG turned on together with a
complete description of the actions you took to provoke the bug.

(I wonder how much of this "bugginess" is caused by the lack of the 
freezer in PPC.)

> So if your driver doesn't already have a proper way of blocking IO  
> during suspend then it probably doesn't suspend 50% of the time  
> anyways.  A bug which bites *every* *time* is easy to fix, one which  
> only bites when things hit a race condition is much harder.

The USB drivers (at least, the ones with runtime PM support) rely on 
the freezer to block I/O during suspend.  As far as I know, they do 
suspend properly, on systems where the freezer is used.

> That responsibility has been there ever since suspend-to-RAM support  
> was added.  Nobody ever denied that writing a proper driver was  
> tricky.  You have to simultaneously be able to handle hot-unplug,  
> IO errors, interrupts, IO requests, suspend-to-RAM, and  
> hibernation.  If your driver mutual-exclusion is buggy then it  
> probably already bites you during hotplug or other similar  
> scenarios.  Let's at least make the problems much more reproducible  
> so we can fix the drivers properly instead of continuing to kludge  
> around it for all eternity.

Is reproducibility really a problem at this stage?  A bug which bites 
50% of the time might not be quite as easy to fix as one which occurs 
every time, but it isn't terribly bad either.

Alan Stern


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06 14:38                           ` Alan Stern
@ 2007-07-07  3:44                             ` Benjamin Herrenschmidt
  2007-07-07 11:49                               ` Pavel Machek
  2007-07-07 16:17                               ` Alan Stern
  0 siblings, 2 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-07  3:44 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm

On Fri, 2007-07-06 at 10:38 -0400, Alan Stern wrote:
> On Fri, 6 Jul 2007, Benjamin Herrenschmidt wrote:
> 
> > What you propose is basically a slightly over-simplistic version of what
> > I think (and Paulus thinks) should be done. We do need to do it via
> > driver callbacks down the tree since only drivers can know how to deal
> > with their DMA etc... and ordering needs to be respected, but that's
> > basically it.
> > 
> > And guess what ? It's what we do on powerbooks, and it works fine,
> > without a freezer :-)
> 
> I wish you'd stop saying that.  Have you ever done any serious testing?
> 
> Here's something to try:  Add a time delay to the end of hub_suspend in
> drivers/usb/core/hub.c, so you can provoke a race manually.  Then while
> one of your root hubs is being suspended and the system is waiting in
> that delay, either plug in a new USB device to that hub or unplug an
> existing device.
> 
> Be sure that CONFIG_USB_DEBUG is on so that we can figure out what 
> happened after the fact.

If you remember, one of the things I've been advocating has always been
that we should put on hold all plug activity (unplug might be alright as
long as the user events are just delayed) when we start suspending. No
new devices, no new bindings. "hub" type devices are responsible for
bringing in the new stuff after resume.
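
To make that concrete, here is a rough user-space sketch of the idea, not
actual USB core code: while a suspend is in progress, plug events are only
recorded, and the hub replays them on resume. The names hub_sim,
hub_plug_event, hub_suspend and hub_resume, and the fixed-size queue, are
invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types; not taken from the USB core. */
enum plug_event { PLUG_ADD, PLUG_REMOVE };

#define MAX_PENDING 16

struct hub_sim {
	bool suspending;                      /* set once suspend has started */
	enum plug_event pending[MAX_PENDING]; /* events held back during suspend */
	size_t npending;
	int ndevices;                         /* devices currently bound */
};

/* Apply one event immediately: bind or unbind a device. */
static void hub_apply(struct hub_sim *hub, enum plug_event ev)
{
	if (ev == PLUG_ADD)
		hub->ndevices++;
	else if (hub->ndevices > 0)
		hub->ndevices--;
}

/*
 * Returns true if the event was applied now, false if it was held back.
 * Simplification: events beyond MAX_PENDING are dropped; a real hub would
 * more likely rescan its ports on resume than keep a queue.
 */
static bool hub_plug_event(struct hub_sim *hub, enum plug_event ev)
{
	assert(hub != NULL);
	if (hub->suspending) {
		if (hub->npending < MAX_PENDING)
			hub->pending[hub->npending++] = ev;
		return false;
	}
	hub_apply(hub, ev);
	return true;
}

static void hub_suspend(struct hub_sim *hub)
{
	hub->suspending = true;
}

/* On resume the hub replays the held-back events in arrival order. */
static void hub_resume(struct hub_sim *hub)
{
	size_t i;

	hub->suspending = false;
	for (i = 0; i < hub->npending; i++)
		hub_apply(hub, hub->pending[i]);
	hub->npending = 0;
}
```

No binding changes between hub_suspend() and hub_resume(); anything that
happened in between only takes effect when the hub resumes.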

Ben.



* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07  2:59                             ` Alan Stern
@ 2007-07-07  4:06                               ` Benjamin Herrenschmidt
  2007-07-07 17:19                                 ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-07  4:06 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm


> > so the suspend process will wait for it.  When binding  
> > is done the suspend_device() code will take the device lock and tell  
> > everything else to postpone further bind requests as above.
> 
> My question referred to drivers trying to bind or unbind a device
> _after_ the device has been suspended.  I suppose you'll say that's
> covered by the NO_BIND flag.  But now we have the locking problem
> mentioned above: The thread trying to bind is holding a lock which is
> needed for resuming.

Why would it ? Just make it fail, maybe with some kind of -ERETRY... Or
it can spin with the lock not held if it wants. That's a detail really.
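
A minimal sketch of the fail-and-retry variant, as a user-space model
rather than driver core code. There is no ERETRY errno, so -EAGAIN stands
in for it here; device_sim and the device_* helpers are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct device_sim {
	bool no_bind;        /* set by the suspend path, cleared on resume */
	const char *driver;  /* name of the bound driver, or NULL */
};

static void device_suspend(struct device_sim *dev)
{
	dev->no_bind = true;
}

static void device_resume(struct device_sim *dev)
{
	dev->no_bind = false;
}

/*
 * Returns 0 on success, -EAGAIN (stand-in for the hypothetical -ERETRY)
 * while the device is suspended, or -EBUSY if a driver is already bound.
 * The no_bind check comes first, i.e. before the point where a real
 * implementation would take any device lock.
 */
static int device_bind(struct device_sim *dev, const char *driver)
{
	assert(dev != NULL);
	if (dev->no_bind)
		return -EAGAIN;
	if (dev->driver != NULL)
		return -EBUSY;
	dev->driver = driver;
	return 0;
}
```

The caller is expected to retry after resume; nothing here blocks.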

> As one of the people responsible for the USB power management 
> implementation, I would appreciate more details about this.  For 
> example, a dmesg log with CONFIG_USB_DEBUG turned on together with a
> complete description of the actions you took to provoke the bug.
> 
> (I wonder how much of this "bugginess" is caused by the lack of the 
> freezer in PPC.)

No. The freezer will hide some of those problems under the carpet, but
not solve the basic issue which is the driver should be solid. Period.

The freezer is a flawed concept in the first place. If you go back to
square one, what is the basic idea of it ? I'll basically expose the
idea and go down all of the path I have in mind where it stops working
and becomes an incredibly difficult thing that in the end doesn't even
solve all the problems it's supposed to.

So first thing first...

I want a quiescent system with no new "IO requests" (whatever that means
in the context of drivers) issued to avoid races during suspend/resume.

That sounds like a nice idea. Yeah. Sounds... only. Problem is. How do
you define that quiescent system ? First idea is ... let's stop
userland. There are various ways of doing that, but the freezer hooking
into the signal code is not necessarily a bad one.

Now, I'm purposefully putting aside all the cases where the above doesn't
work (user process in the kernel in some uninterruptible wait, etc...),
which are the first big setback imho... our simple idea is suddenly not
so simple anymore, but we can bring those back later.

Now, there is still a problem... kernel threads. In fact, there is no
fundamental distinction between a kernel thread and a user process...
one has an MM and the other doesn't but as far as we are concerned, it's
the same. Kernel threads can issue IOs, or like khubd, detect devices,
plug/unplug them, etc etc.... all over the place.

Easy answer that comes to mind -> freeze them too. Heh, but kernel
threads don't do signals, so we end up with all those try_to_freeze().
Then what about the fact that drivers may need those kernel threads to
proceed ? Some drivers queue up their IO requests to a kernel thread to
process them and suspend() might need to flush those down, issue a
couple more such as "spin down disks" before that kernel thread can
actually be frozen... Hrm.. maybe not all of them then.
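
As a rough illustration of that ordering problem, here is a
single-threaded user-space model (not kernel code). sim_try_to_freeze()
stands in for try_to_freeze(), and kthread_sim, kthread_iteration and
flush_queue are made-up names. It shows that once the thread has parked
at its freeze point, a suspend path that still needs it to flush requests
makes no progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct kthread_sim {
	bool freezing;  /* set by the suspend path */
	bool frozen;    /* set by the thread when it parks at a freeze point */
	int queued;     /* requests waiting for the thread */
	int issued;     /* requests that reached the "hardware" */
};

/* Stand-in for try_to_freeze(): returns true if the thread parked itself. */
static bool sim_try_to_freeze(struct kthread_sim *t)
{
	t->frozen = t->freezing;
	return t->frozen;
}

/* One pass of the thread's main loop, with an explicit freeze point. */
static void kthread_iteration(struct kthread_sim *t)
{
	assert(t != NULL);
	if (sim_try_to_freeze(t))
		return;          /* parked: nothing gets issued */
	if (t->queued > 0) {
		t->queued--;
		t->issued++;
	}
}

/*
 * What a driver's suspend() might want: let the thread drain its queue.
 * Returns true if the queue emptied within the given number of passes.
 */
static bool flush_queue(struct kthread_sim *t, int budget)
{
	while (t->queued > 0 && budget-- > 0)
		kthread_iteration(t);
	return t->queued == 0;
}
```

Freezing first and flushing second leaves the requests stuck in the
queue; flushing first and freezing afterwards works.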

But how do you decide ? What defines that a kernel thread can issue an IO ? In
fact, if you look closely, anything doing kmalloc(...,GFP_KERNEL) for
example can trigger an IO... implicitly, via the VM pushing things out.
And that's just one example.

In some case, those same threads that may need to be kept non-frozen are
-also- the ones that will potentially submit new IOs or bring in new
devices.

And then, there is keventd ... what do you do about work queues ? You
have everybody pouring things at workqueues... some of these things may
well hit your driver, some may not. Same goes in some cases with
interrupt time stuff, such as timers or tasklets.. think about
networking.

In the end, the nice idea that "threads/tasks cause requests, so we just
stop them" basically falls apart. Half of the kernel can cause a driver
to be hit somewhere at a given time, it can be from a thread context,
directly caused by userland, or from some timer due to some subsystem
having a keepalive thing ticking in or whatever else.

Now, we go back to the previous issue of what do we do about
uninterruptible sleep... You want to abort suspend because, for example,
something called a driver that does an msleep(200) or so ? Are you aware
that 99% of laptop users close their laptops and shove them in the bag not
even waiting for the disk to spin down ? And you want suspend to abort
because some random "happens all the time" event such as a process being
somewhere temporarily in uninterruptible state in the kernel ?

So let's say we freeze them from within the scheduler even when they are
uninterruptible.. ouch... you just caused the deadlocks we talked about
before. Without a freezer, suspend() can at least rely on the fact that
processes holding such locks will ultimately wake up, and wait for them
(or even explicitly wake them); it can't if they've been frozen. So what
was a perfectly solvable moderate driver synchronisation issue becomes a
deadlock nightmare.

And those are just examples. During this discussion, we also brought up the
example of FUSE which is a big stab at the whole freezer concept. And
I'm sure we can find more every day.

Face it, we should seriously look into doing suspend/resume without a
freezer. I even tend to think that we could do STD that way too, in
fact, while Linus is right saying it's a different problem than STR, we
could even probably re-use some of the STR infrastructure in some
hackish way, still without a freezer. We could have ways to block page
cache writeout, for example, to prevent new post-snapshot dirty data
from hitting the platter, and use direct BIOs for writeout. That's just
an example.

Ben.




* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 22:59                           ` Benjamin Herrenschmidt
  2007-07-06  7:20                             ` Rafael J. Wysocki
  2007-07-06 15:13                             ` Alan Stern
@ 2007-07-07  7:56                             ` Pavel Machek
  2 siblings, 0 replies; 388+ messages in thread
From: Pavel Machek @ 2007-07-07  7:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alan Stern, Paul Mackerras, Johannes Berg, Rafael J. Wysocki,
	Linux-pm mailing list, Kernel development list, Matthew Garrett

Hi!

> > How will that help?  Block the kernel thread in the freezer or block it 
> > in the driver -- either way it is blocked.  So how do your deadlocks 
> > get resolved?
> 
> Because nobody is waiting on that kernel thread anyway without a freezer
> so there is no deadlock anymore.

In the deadlock we are seeing, _someone_ is waiting on userspace
thread, that leads to deadlock with freezer. We don't know who,
because we have not seen the sysrq-t dumps.

The "unknown who" will deadlock on fused frozen by driver, too. We
really need to fix the "unknown who" here.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07  3:44                             ` Benjamin Herrenschmidt
@ 2007-07-07 11:49                               ` Pavel Machek
  2007-07-08  0:40                                 ` Benjamin Herrenschmidt
  2007-07-07 16:17                               ` Alan Stern
  1 sibling, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-07 11:49 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Alan Stern, Kyle Moffett, Nigel Cunningham, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm

Hi!

> > > And guess what ? It's what we do on powerbooks, and it works fine,
> > > without a freezer :-)

Well, issue is, you should stop claiming it works fine until issue
below is fixed... please?

And anyway I believe that current issue (fuse deadlocks with s2ram)
should be present on powerbooks, too... it is just way harder to
trigger. All that is necessary is for fused (or one of its helpers) to
get frozen by accessing a suspended device.


									Pavel

> > I wish you'd stop saying that.  Have you ever done any serious testing?
> > 
> > Here's something to try:  Add a time delay to the end of hub_suspend in
> > drivers/usb/core/hub.c, so you can provoke a race manually.  Then while
> > one of your root hubs is being suspended and the system is waiting in
> > that delay, either plug in a new USB device to that hub or unplug an
> > existing device.
> > 
> > Be sure that CONFIG_USB_DEBUG is on so that we can figure out what 
> > happened after the fact.
> 
> If you remember, one of the things I've been advocating has always been
> that we should put on hold all plug activity (unplug might be alright as
> long as the user events are just delayed) when we start suspending. No
> new devices, no new bindings. "hub" type devices are responsible for
> bringing in the new stuff after resume.



-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05 14:26                           ` Matthew Garrett
  2007-07-05 14:41                             ` Rafael J. Wysocki
@ 2007-07-07 11:49                             ` Pavel Machek
  1 sibling, 0 replies; 388+ messages in thread
From: Pavel Machek @ 2007-07-07 11:49 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Rafael J. Wysocki, Paul Mackerras, Alan Stern, Johannes Berg,
	Linux-pm mailing list, Kernel development list

Hi!

> > > And also "Userland should not depend on userland services", which is 
> > > rather more of a problem.
> > 
> > I think you're oversimplifying it, as far as FUSE is concerned.
> > 
> > Namely, if there are two userland tasks, A and B, and B is uninterruptible,
> > because A is blocked, then this is not a usual situation.
> 
> Fuse is one case of it occurring, and if we end up with more userspace 
> drivers then the problem is only going to get worse.

We'll have to solve them as they come.

Face it, hardware drivers _have_ to know about suspend/resume. Even
userspace drivers will have to know about suspend/resume, because they
need to reinit the hw during resume.

Now... most parts of kernel need to know (a bit) about suspend/resume
-- at least enough to play nicely with refrigerator. In retrospect it
is pretty obvious that this covers fused, too; unfortunately no one
noticed that when fuse was designed.

Can we try to solve the suspend vs. fuse problem now? "Just removing
the refrigerator" is not the answer. First, refrigerator is impossible
to remove in a few months' timeframe, and second, it does not solve the
problem anyway.

(Actually, there are two separate problems with suspend vs. fuse.)
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-05 15:32                                       ` Miklos Szeredi
@ 2007-07-07 11:50                                         ` Pavel Machek
  2007-07-07 20:14                                           ` Miklos Szeredi
  0 siblings, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-07 11:50 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: rjw, mjg59, paulus, stern, johannes, linux-pm, linux-kernel

Hi!

> > > > You have processes that don't react to signals, because some
> > > > other user land task is misbehaving.  I'd call that ugly at the
> > > > very least.
> > > 
> > > It already happens with, say, NFS. Don't think about it in terms of a 
> > > userland task misbehaving - think of it in terms of a resource becoming 
> > > unavailable.
> > 
> > I think there's a difference between a userland task playing the role of a
> > resource and a "real" external resource the kernel doesn't control.
> > 
> > IMO, userland tasks should not have the power to affect each other as though
> > they were parts of the kernel.
> 
> One task doing ptrace() can basically do whatever it wants with the
> task being traced.  This is not an exact analogy to what fuse does,
> but close.

Well, IMO userland tasks should not have the power to grab VFS mutexes
for an indefinite amount of time. ("fused is allowed to deadlock kernel, in
a way only write to special file helps" is ugly). Unfortunately, I
don't think there's a way to work around that deadlock within fuse
design limits... (coda was able to get around it by working on whole
files granularity, AFAICT), so we'll have to live with that.

I think we have two separate problems here, and both are
solvable, without major changes to fuse or suspend framework.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* problem 1 (was Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway))
  2007-07-05 13:57                       ` Matthew Garrett
  2007-07-05 14:28                         ` Rafael J. Wysocki
@ 2007-07-07 12:08                         ` Pavel Machek
  2007-07-07 20:55                           ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-07 12:08 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Paul Mackerras, Alan Stern, Johannes Berg, Rafael J. Wysocki,
	Linux-pm mailing list, Kernel development list

Hi!

> > Now, if kernel needs FUSE services for some reason (that's the problem
> > we hit in s2ram case, right?), we have a deadlock.
> > 
> > So main problem still seems to be "kernel should not depend on
> > userland services during suspend", refrigerator or not.
> 
> And also "Userland should not depend on userland services", which is 
> rather more of a problem.

No, that's not a problem. Or rather, that's a different problem, called
"problem 2" (fuse causes freezer to fail to stop processes).

But we still have "problem 1" here: after devices are suspended,
kernel tries to use fuse's services. That is not going to work, one
way or another, because devices are suspended and userland can't work
reliably.

(Aha, it _may_ be that the kernel tries to use fuse's services after
freezing userland but before freezing devices. I don't think it is).

To solve "problem 1", we need to know which part of kernel asks for
fuse services. sysrq-t trace is likely to tell us. Can someone repeat
the "problem 1" scenario (freezer succeeds but then it deadlocks), and
produce sysrq-t trace? That way we can solve "problem 1".
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05  0:15                           ` Paul Mackerras
  2007-07-05 11:54                             ` Rafael J. Wysocki
@ 2007-07-07 12:09                             ` Pavel Machek
  1 sibling, 0 replies; 388+ messages in thread
From: Pavel Machek @ 2007-07-07 12:09 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Rafael J. Wysocki, Benjamin Herrenschmidt, Matthew Garrett,
	linux-kernel, linux-pm

On Thu 2007-07-05 10:15:01, Paul Mackerras wrote:
> Rafael J. Wysocki writes:
> 
> > This is incompatible with the code in kernel/power/main.c, since we only
> > disable the nonboot CPUs after devices have been suspended.  Do you think that
> > your framework can be modified to work without disabling the nonboot CPUs
> > by the user space?
> 
> Sure.  It was a "if it can be done in userspace, do it in userspace"
> kind of decision, but I'm not wedded to it.
> 
> I actually do want to converge to using the generic suspend-to-ram
> code on powerbooks.  I just want to avoid causing regressions for
> powerbook users, including myself. :)

Curious, do you actually use fuse? Can you try it _with_ freezer and
produce sysrq-t trace of deadlock?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-05 12:07                                   ` Miklos Szeredi
  2007-07-05 13:28                                     ` Rafael J. Wysocki
  2007-07-05 19:38                                     ` Oliver Neukum
@ 2007-07-07 12:17                                     ` Pavel Machek
  2007-07-07 20:42                                       ` Miklos Szeredi
  2 siblings, 1 reply; 388+ messages in thread
From: Pavel Machek @ 2007-07-07 12:17 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, paulus, stern, johannes, rjw, linux-pm, linux-kernel,
	mjg59, benh

Hi!

> > > Actually fuse allows SIGKILL, because it's always fatal, and the
> > > syscall may not be restarted.
> > 
> > I think you want to stick try_to_freeze() at the same places where you
> > do SIGKILL handling. That should solve the 'syslogd is unfreezeable'
> > problem.
> 
> I could, but it would not solve the general problem.  Namely, that the
> presence of fuse imposes a certain ordering in which userspace tasks
> have to be frozen.  And it is not possible to know this ordering.

We can just wait for all fuse requests to be serviced before
proceeding further with freeze, right?

> And even if the ordering were solved, the freezer would still not work
> if the filesystem is not responding due to external events, such as a
> lost network (this affects NFS, CIFS, whatever just the same as
> fuse).

That's ok, you can't suspend if your hdd is dead, and in the same way
you can't suspend if your NFS server is dead. I agree it is ugly, but
we seem to live ok with that.

We could (and should?) handle that, probably by realizing that NFS is
not a disk and using interruptible sleep, but...

> > Plus, it would be nice to find out where suspend/hibernation is
> > triggering fuse activity. We can then decide where to fix it -- in
> > fuse or in suspend parts. You said sys_sync is not implemented... so
> > where is the problem?
> 
> I cannot say without having a sysrq-t of the situation.

Yes please. Can someone affected please produce sysrq-t?
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06  7:07                                   ` Miklos Szeredi
@ 2007-07-07 12:19                                     ` Pavel Machek
  0 siblings, 0 replies; 388+ messages in thread
From: Pavel Machek @ 2007-07-07 12:19 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, paulus, stern, johannes, rjw, linux-pm, linux-kernel,
	mjg59, benh

On Fri 2007-07-06 09:07:38, Miklos Szeredi wrote:
> > > Actually fuse allows SIGKILL, because it's always fatal, and the
> > > syscall may not be restarted.
> > 
> > Okay, and you should handle refrigerator in the same paths where you
> > handle SIGKILL. Just add try_to_freeze() there...
> 
> It's the fourth time I'm repeating this in this thread:
> 
> Yes adding try_to_freeze() there would partially solve the problem.
> 
> But another task can be sleeping on a mutex held by the task waiting
> for the reply.  And the freezer won't be able to handle that one.
> 
> Generally, calling try_to_freeze() with mutexes held is not a good
> idea.

Agreed, calling try_to_freeze() with a mutex held is a no-no, and it is
even documented somewhere.
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07  3:44                             ` Benjamin Herrenschmidt
  2007-07-07 11:49                               ` Pavel Machek
@ 2007-07-07 16:17                               ` Alan Stern
  2007-07-08  0:42                                 ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-07 16:17 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm

On Sat, 7 Jul 2007, Benjamin Herrenschmidt wrote:

> > > And guess what ? It's what we do on powerbooks, and it works fine,
> > > without a freezer :-)

> If you remember, one of the things I've been advocating has always been
> that we should put on hold all plug activity (unplug might be alright as
> long as the user events are just delayed) when we start suspending. No
> new devices, no new bindings. "hub" type devices are responsible for
> bringing in the new stuff after resume.

Which is exactly my point.  It _doesn't_ work fine without a freezer, 
because the USB stack currently relies on the freezer to prevent plug 
activity.

Alan Stern



* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07  4:06                               ` Benjamin Herrenschmidt
@ 2007-07-07 17:19                                 ` Alan Stern
  2007-07-08  0:48                                   ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-07 17:19 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm

On Sat, 7 Jul 2007, Benjamin Herrenschmidt wrote:

> > > so the suspend process will wait for it.  When binding  
> > > is done the suspend_device() code will take the device lock and tell  
> > > everything else to postpone further bind requests as above.
> > 
> > My question referred to drivers trying to bind or unbind a device
> > _after_ the device has been suspended.  I suppose you'll say that's
> > covered by the NO_BIND flag.  But now we have the locking problem
> > mentioned above: The thread trying to bind is holding a lock which is
> > needed for resuming.
> 
> Why would it ? Just make it fail, maybe with some kind of -ERETRY... Or
> it can spin with the lock not held if it wants. That's a detail really.

Spinning in the driver with the lock not held is impossible, since the 
driver is called with the lock already acquired.

Failing with -ERETRY is non-transparent.  I would prefer to block such 
requests at their source, before the lock is acquired.  Perhaps in the 
driver core, perhaps even earlier.

(And rather than trying to manage a waitqueue or struct completion, it
would be easiest to jump directly into the freezer!  The driver or the 
core wouldn't have to worry about waking up all these blocked threads.)

> > As one of the people responsible for the USB power management 
> > implementation, I would appreciate more details about this.  For 
> > example, a dmesg log with CONFIG_USB_DEBUG turned on together with a
> > complete description of the actions you took to provoke the bug.
> > 
> > (I wonder how much of this "bugginess" is caused by the lack of the 
> > freezer in PPC.)
> 
> No. The freezer will hide some of those problems under the carpet, but
> not solve the basic issue which is the driver should be solid. Period.

You're missing the point.  If the driver and the freezer are both 
solid, there's no reason they can't share the work.  If many drivers 
can pass off part of their workload to the single freezer, it's a net 
win.

So it isn't a question of how solid the drivers are; it's a question 
of how solid the freezer is.  And bear in mind that if you convince 
people the freezer is not solid enough to be used, then you will have 
to find an alternative for purposes of hibernation.

> The freezer is a flawed concept in the first place.

<... Long and cogent argument which I will skip over for now ...>

> Face it, we should seriously look into doing suspend/resume without a
> freezer.

I'm willing to try, although I think it will be a tremendous amount of 
work to verify that every driver does the right thing.  There's lots of 
support missing.  For example, don't you think we should block all 
sysfs I/O during suspend?  And likewise for insmod/rmmod?

> I even tend to think that we could do STD that way too, in
> fact, while Linus is right saying it's a different problem than STR, we
> could even probably re-use some of the STR infrastructure in some
> hackish way, still without a freezer. We could have ways to block page
> cache writeout, for example, to prevent new post-snapshot dirty data
> from hitting the platter, and use direct BIOs for writeout. That's just
> an example.

What about systems with no BIOS?  I think this would be very hard or 
even impossible to make work.

Alan Stern



* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-07 11:50                                         ` Pavel Machek
@ 2007-07-07 20:14                                           ` Miklos Szeredi
  0 siblings, 0 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-07 20:14 UTC (permalink / raw)
  To: pavel; +Cc: miklos, rjw, mjg59, paulus, stern, johannes, linux-pm, linux-kernel

> > One task doing ptrace() can basically do whatever it wants with the
> > task being traced.  This is not an exact analogy to what fuse does,
> > but close.
> 
> Well, IMO userland tasks should not have the power to grab VFS mutexes
> for an indefinite amount of time. ("fused is allowed to deadlock kernel, in
> a way only write to special file helps" is ugly). Unfortunately, I
> don't think there's a way to work around that deadlock within fuse
> design limits... (coda was able to get around it by working on whole
> files granularity, AFAICT), so we'll have to live with that.

That's just file I/O.  You can easily deadlock coda with any other
file operation.  In fact coda is _less_ robust wrt a misbehaving
userspace server than fuse by a big margin.

Miklos


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07 12:17                                     ` Pavel Machek
@ 2007-07-07 20:42                                       ` Miklos Szeredi
  2007-07-07 23:33                                         ` malicious filesystems (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway) Pavel Machek
  0 siblings, 1 reply; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-07 20:42 UTC (permalink / raw)
  To: pavel
  Cc: miklos, oliver, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

> We can just wait for all fuse requests to be serviced before
> proceeding further with freeze, right?

Right.  Nice way to slow down or stop the suspend with an unprivileged
process.  Avoiding that sort of DoS is one of the design goals of
fuse.

Look at it this way: the task of the freezer is to stop new I/O
hitting the hardware.  But it is totally indiscriminate about what it
stops, it tries to stop _everything_ even things which have nothing to
do with hardware.

Not nice.

Miklos


* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07  2:44                                                           ` Benjamin Herrenschmidt
@ 2007-07-07 20:48                                                             ` Rafael J. Wysocki
  2007-07-08  0:50                                                               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-07 20:48 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Oliver Neukum, Alan Stern, Miklos Szeredi, pavel, paulus,
	johannes, linux-pm, linux-kernel, mjg59

On Saturday, 7 July 2007 04:44, Benjamin Herrenschmidt wrote:
> On Fri, 2007-07-06 at 11:31 +0200, Oliver Neukum wrote:
> > Am Freitag, 6. Juli 2007 schrieb Benjamin Herrenschmidt:
> > > On Fri, 2007-07-06 at 09:13 +0200, Rafael J. Wysocki wrote:
> > > > 
> > > > The only reason (I know of) why we don't handle uninterruptible tasks in the
> > > > freezer is that we're afraid of the suspend process deadlocking with an
> > > > uninterruptible task holding a lock, but AFAICS the probability of such an
> > > > event is extremely small.
> > > 
> > > What would deadlock specifically ? One of the drivers trying to acquire
> > > that lock ? It would be a driver bug then.
> > 
> > Your driver's write method looks like:
> > 
> > mutex_lock();
> > poke_some_hardware();
> > wait_event_uninterruptible(); //for result
> > res = evaluate_result();
> > mutex_unlock();
> > return res;
> > 
> > If you put a task into the refrigerator at wait_event_uninterruptible()
> > you will deadlock if you need this lock for the driver to go to suspend.
> > The suspend method then must not take the lock _and_ it must be
> > aware that there may be an ongoing operation.
> 
> Well... 2 things here. Either you have a freezer in which case the
> chances of the above scenario are increased,

How so? :-)

> or you don't, in which case 
> your suspend method will just sleep on the lock until outstanding HW
> accesses that have that lock are completed, and everything is fine.
> 
> You need to be careful with one thing though, whether you have a freezer
> or not. If your driver, in some code path, whatever it is (ioctl, kernel
> thread, workqueue, ...) does something like:
> 
> mutex_lock
> kmalloc(...,GFP_KERNEL);
> mutex_unlock
> 
> And its suspend callback then does:
> 
> mutex_lock
> 
> The problem here is that the disks might already have been suspended
> prior to your driver being called. Thus, any attempt at pushing things
> out to swap or dirty mmap'ings back to storage will hang, thus kmalloc
> can potentially hang (afaik), and you will deadlock.
> 
> That's what I've been talking about earlier when I said that we should
> have some security in SLAB/SLUB/Buddy allocators, to silently turn
> GFP_KERNEL to at least GFP_NOIO or even ATOMIC before we start
> suspending drivers.
> 
> Now, another way to deal with that would be to use
> pre-suspend/post-resume notifications, and have drivers avoid doing the
> above between those, but that's much harder. (Essentially, drivers would
> have to either make sure they don't do things like blocking allocations,
> even implicitly, or possibly fall back to a degraded synchronous mode
> or that sort of thing).
> 
> I think it's much simpler to tweak slab/slub/buddy instead :-)
> 
> Note that the above issue is orthogonal to our freezer discussion, it's
> just one of the potential deadlock causes we have with suspend that needs
> to be fixed.

Agreed.
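
The allocator tweak quoted above can be sketched in a few lines. This is
only a toy userspace model under stated assumptions: the TOY_GFP_* values
are made-up bit assignments, and toy_pm_restrict_gfp(),
toy_pm_restore_gfp() and toy_effective_gfp() are hypothetical names, not
existing allocator interfaces. The idea is a global mask that strips the
"may do I/O" bits from every request while devices are suspended.

```c
/* Toy flag bits; NOT the kernel's real values. */
#define TOY_GFP_WAIT	0x1u	/* allocation may sleep */
#define TOY_GFP_IO	0x2u	/* allocation may start block I/O */
#define TOY_GFP_FS	0x4u	/* allocation may call into filesystems */

#define TOY_GFP_KERNEL	(TOY_GFP_WAIT | TOY_GFP_IO | TOY_GFP_FS)
#define TOY_GFP_NOIO	(TOY_GFP_WAIT)

/* Bits the allocator currently honours; everything else is dropped. */
static unsigned int toy_gfp_allowed = TOY_GFP_KERNEL;

/* Hypothetical hook: called before the first driver is suspended. */
void toy_pm_restrict_gfp(void)
{
	toy_gfp_allowed &= ~(TOY_GFP_IO | TOY_GFP_FS);
}

/* Hypothetical hook: called after the last driver has been resumed. */
void toy_pm_restore_gfp(void)
{
	toy_gfp_allowed = TOY_GFP_KERNEL;
}

/* What the allocator would actually use for a caller's request. */
unsigned int toy_effective_gfp(unsigned int requested)
{
	return requested & toy_gfp_allowed;
}
```

With the mask restricted, a driver that asks for TOY_GFP_KERNEL under a
mutex silently gets TOY_GFP_NOIO behaviour, so the allocation can no
longer wait on a disk that has already been suspended.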

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: problem 1 (was Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway))
  2007-07-07 12:08                         ` problem 1 (was Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)) Pavel Machek
@ 2007-07-07 20:55                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 388+ messages in thread
From: Rafael J. Wysocki @ 2007-07-07 20:55 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Matthew Garrett, Paul Mackerras, Alan Stern, Johannes Berg,
	Linux-pm mailing list, Kernel development list

On Saturday, 7 July 2007 14:08, Pavel Machek wrote:
> Hi!
> 
> > > Now, if kernel needs FUSE services for some reason (that's the problem
> > > we hit in s2ram case, right?), we have a deadlock.
> > > 
> > > So main problem still seems to be "kernel should not depend on
> > > userland services during suspend", refrigerator or not.
> > 
> > And also "Userland should not depend on userland services", which is 
> > rather more of a problem.
> 
> No, that's not a problem. Or rather, that's different problem, called
> "problem 2" (fuse causes freezer to fail to stop processes).
> 
> But we still have "problem 1" here: after devices are suspended,
> kernel tries to use fuse's services. That is not going to work, one
> way or another, because devices are suspended and userland can't work
> reliably.
> 
> (Aha, it _may_ be that the kernel tries to use fuse's services after
> freezing userland but before freezing devices. I don't think it is).
> 
> To solve "problem 1", we need to know which part of kernel asks for
> fuse services. sysrq-t trace is likely to tell us. Can someone repeat
> the "problem 1" scenario (freezer succeeds but then it deadlocks), and
> produce sysrq-t trace? That way we can solve "problem 1".

Well, such a trace would be helpful in any case.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 388+ messages in thread

* malicious filesystems (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-07 20:42                                       ` Miklos Szeredi
@ 2007-07-07 23:33                                         ` Pavel Machek
  2007-07-08  7:21                                           ` Miklos Szeredi
  2007-07-09 16:19                                           ` Miklos Szeredi
  0 siblings, 2 replies; 388+ messages in thread
From: Pavel Machek @ 2007-07-07 23:33 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: oliver, paulus, stern, johannes, rjw, linux-pm, linux-kernel,
	mjg59, benh

Hi!

> > We can just wait for all fuse requests to be serviced before
> > proceeding further with freeze, right?
> 
> Right.  Nice way to slow down or stop the suspend with an unprivileged
> process.  Avoiding that sort of DoS is one of the design goals of
> fuse.

So you want me to handle _malicious_ filesystems now?

That should be easy... :-). You already have nasty deadlocks in FUSE,
and you solve them by "root can echo 1 > abort"... so allow me the
same possibility.

We can tell fused we are freezing, and if all the requests are not
serviced within, say, 30 seconds, we call the filesystem malicious and
do echo 1 > abort.

Not ideal, but neither is allowing malicious filesystems in the first
place...
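
A rough sketch of that policy, as a toy userspace model rather than
anything in fuse itself. The struct toy_fuse_conn type, its fields, and
the toy_* functions are all hypothetical; the passage of time is
simulated by a callback so the example stays deterministic. Only the
30 second figure and the "abort if not drained" rule come from the
proposal above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one fuse connection. */
struct toy_fuse_conn {
	int pending;	/* requests sent to the daemon, not yet answered */
	bool aborted;	/* set once we gave up on the daemon */
};

/* Simulated second in which a well-behaved daemon answers one request. */
void toy_daemon_answers_one(struct toy_fuse_conn *c)
{
	if (c->pending > 0)
		c->pending--;
}

/* Simulated second in which a stuck or malicious daemon does nothing. */
void toy_daemon_stalls(struct toy_fuse_conn *c)
{
	(void)c;
}

/*
 * Wait up to timeout_secs for outstanding requests to drain.
 * Returns true if they drained, false if the connection was aborted.
 */
bool toy_drain_or_abort(struct toy_fuse_conn *c, int timeout_secs,
			void (*wait_one_sec)(struct toy_fuse_conn *c))
{
	int waited;

	assert(timeout_secs >= 0);
	for (waited = 0; c->pending > 0 && waited < timeout_secs; waited++)
		wait_one_sec(c);

	if (c->pending > 0) {
		/* Same effect as "echo 1 > abort": fail what is left. */
		c->pending = 0;
		c->aborted = true;
		return false;
	}
	return true;
}
```

Calling toy_drain_or_abort(&conn, 30, toy_daemon_stalls) on a connection
with pending requests ends in an abort, which is the cost of the scheme:
an unprivileged daemon can still delay suspend by the full timeout.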

> Look at it this way: the task of the freezer is to stop new I/O
> hitting the hardware.  But it is totally indiscriminate about what it
> stops, it tries to stop _everything_ even things which have nothing to
> do with hardware.
> 
> Not nice.

Not nice, but we don't know any better for now. "Just fix all the
drivers" basically means "just fix 90% of kernel".
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07 11:49                               ` Pavel Machek
@ 2007-07-08  0:40                                 ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-08  0:40 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Alan Stern, Kyle Moffett, Nigel Cunningham, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm

On Sat, 2007-07-07 at 13:49 +0200, Pavel Machek wrote:

> And anyway I believe that current issue (fuse deadlocks with s2ram)
> should be present on powerbooks, too... it is just way harder to
> trigger. All that is necessary is fused (or one of its helpers) to
> get frozen by accessing suspended device.

I don't see any fuse specific issue without a freezer, but I see generic
issues such as kmalloc blocking in a driver with a mutex held etc... but
I've talked about these already. They have nothing to do with the
freezer though.

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07 16:17                               ` Alan Stern
@ 2007-07-08  0:42                                 ` Benjamin Herrenschmidt
  2007-07-08  2:24                                   ` Alan Stern
  2007-07-08 18:20                                   ` Rafael J. Wysocki
  0 siblings, 2 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-08  0:42 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm

On Sat, 2007-07-07 at 12:17 -0400, Alan Stern wrote:
> On Sat, 7 Jul 2007, Benjamin Herrenschmidt wrote:
> 
> > > > And guess what ? It's what we do on powerbooks, and it works fine,
> > > > without a freezer :-)
> 
> > If you remember, one of the things I've been advocating has always been
> > that we should put on hold all plug activity (unplug might be alright as
> > long as the user events are just delayed) when we start suspending. No
> > new devices, no new bindings. "hub" type devices are responsible for
> > bringing in the new stuff after resume.
> 
> Which is exactly my point.  It _doesn't_ work fine without a freezer, 
> because the USB stack currently relies on the freezer to prevent plug 
> activity.

Putting on hold plug activity has nothing, NOTHING, to do with the half
assed piece of deadlocking crap we have now we call a freezer.

As long as you guys keep mixing up all the issues and coming up with
totally bogus solutions that cannot work, we won't have a useful suspend
(either to RAM or to disk) in linux.

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07 17:19                                 ` Alan Stern
@ 2007-07-08  0:48                                   ` Benjamin Herrenschmidt
  2007-07-08  2:53                                     ` Alan Stern
  0 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-08  0:48 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm


> Spinning in the driver with the lock not held is impossible, since the 
> driver is called with the lock already acquired.
> 
> Failing with -ERETRY is non-transparent.  I would prefer to block such 
> requests at their source, before the lock is acquired.  Perhaps in the 
> driver core, perhaps even earlier.
> 
> (And rather than trying to manage a waitqueue or struct completion, it
> would be easiest to jump directly into the freezer!  The driver or the 
> core wouldn't have to worry about waking up all these blocked threads.)

That's wrong. The freezer is NOT a solution for that sort of thing. Just
because you guys can't get your locking right.

> You're missing the point.  If the driver and the freezer are both 
> solid, there's no reason they can't share the work.  If many drivers 
> can pass off part of their workload to the single freezer, it's a net 
> win.

I've explained already multiple times that the freezer will not do what
you guys expect it to do. IOs can be submitted at non-task time and there
is no clear distinction between IO generating threads that must and
those that must not be frozen.

I really can't understand why you guys work so hard at trying to avoid
the right solutions systematically.

> So it isn't a question of how solid the drivers are; it's a question 
> of how solid the freezer is.  And bear in mind that if you convince 
> people the freezer is not solid enough to be used, then you will have 
> to find an alternative for purposes of hibernation.

Because I'm intimately convinced that the freezer is a wrong approach
that cannot be made solid enough.

> > The freezer is a flawed concept in the first place.
> 
> <... Long and cogent argument which I will skip over for now ...>

Too bad, that's where the interesting points that show that the freezer
cannot work are..

> > Face it, we should seriously look into doing suspend/resume without a
> > freezer.
> 
> I'm willing to try, although I think it will be a tremendous amount of 
> work to verify that every driver does the right thing.  There's lots of 
> support missing.  For example, don't you think we should block all 
> sysfs I/O during suspend?  And likewise for insmod/rmmod?

sysfs is a matter of driver. If a sysfs read/write callback in a driver
is hitting the HW, it most certainly already has some kind of locking.
That locking can/should be extended to deal with blocking when the HW is
suspended.

However, since it seems that people universally consider it very hard to
get right (I don't but heh), Linus and Paul have come up with a solution
for most simple enough directly-mapped drivers such as PCI (ok, that
doesn't include USB) which is to simply do the HW suspend in a late
callback after IRQs are off, and not bother with the rest.

> > I even tend to think that we could do STD that way too, in
> > fact, while Linus is right saying it's a different problem than STR, we
> > could even probably re-use some of the STR infrastructure in some
> > hackish way, still without a freezer. We could have ways to block page
> > cache writeout, for example, to prevent new post-snapshot dirty data
> > from hitting the platter, and use direct BIOs for writeout. That's just
> > an example.
> 
> What about systems with no BIOS?  I think this would be very hard or 
> even impossible to make work.

You made the same mistake I did when reading Nigel's mail ... BIOs ->
Block IO requests, not BIOS :-)

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-07 20:48                                                             ` Rafael J. Wysocki
@ 2007-07-08  0:50                                                               ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-08  0:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Oliver Neukum, Alan Stern, Miklos Szeredi, pavel, paulus,
	johannes, linux-pm, linux-kernel, mjg59


> > Well... 2 things here. Either you have a freezer in which case the
> > chances of the above scenario are increased,
> 
> How so? :-)

I meant you have a freezer that freezes uninterruptible tasks.

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-08  0:42                                 ` Benjamin Herrenschmidt
@ 2007-07-08  2:24                                   ` Alan Stern
  2007-07-08  4:39                                     ` Benjamin Herrenschmidt
  2007-07-08 18:26                                     ` Rafael J. Wysocki
  2007-07-08 18:20                                   ` Rafael J. Wysocki
  1 sibling, 2 replies; 388+ messages in thread
From: Alan Stern @ 2007-07-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm

On Sun, 8 Jul 2007, Benjamin Herrenschmidt wrote:

> > Which is exactly my point.  It _doesn't_ work fine without a freezer, 
> > because the USB stack currently relies on the freezer to prevent plug 
> > activity.
> 
> Putting on hold plug activity has nothing, NOTHING, to do with the half
> assed piece of deadlocking crap we have now we call a freezer.

You're wrong; it _does_ have something to do with the freezer.  The 
connection is that the code uses the freezer to put plug activity on 
hold.

You probably meant to say that it _should_ have nothing to do with the 
freezer.  That's a different matter.  But in any case you should write 
what you actually mean, rather than just putting down something as 
inflammatory as possible.

> As long as you guys keep mixing up all the issues and coming up with
> totally bogus solutions that cannot work, we won't have a useful suspend
> (either to RAM or to disk) in linux.

In my defense, you should realize that until Rafael's notifier chain
was added (just a few weeks ago, still not in mainline I believe) there
was no other way to do it.  Plug activity needs to be stopped before
the child devices are suspended, and the PM core does not send any
notification to drivers at that time.  All it does is activate the 
freezer.

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-08  0:48                                   ` Benjamin Herrenschmidt
@ 2007-07-08  2:53                                     ` Alan Stern
  2007-07-08  5:14                                       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 388+ messages in thread
From: Alan Stern @ 2007-07-08  2:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm

On Sun, 8 Jul 2007, Benjamin Herrenschmidt wrote:

> > (And rather than trying to manage a waitqueue or struct completion, it
> > would be easiest to jump directly into the freezer!  The driver or the 
> > core wouldn't have to worry about waking up all these blocked threads.)
> 
> That's wrong. The freezer is NOT a solution for that sort of thing. Just
> because you guys can't get your locking right.

You misunderstood.  Stop trying to incite riot, calm down, and pay
attention to what I actually wrote.  I'll explain it again in more
explicit terms:

You agree that drivers need to block various activities during suspend.  
Principally I/O requests, but other things as well.  So when one of 
these requests arrives, the driver has to make it wait somehow and then 
has to allow it to proceed at the appropriate time.

Normally a waitqueue or a struct completion would be used for this
purpose.  But either one puts the burden on the driver of defining a
data structure and signalling it at the right time.  That time is
generally when the device is resumed, but there's nothing wrong with
delaying it slightly, to after all the devices have been resumed (i.e.,
the time when the current PM code takes everything out of the freezer).  
In fact, we definitely don't want to unblock plug events until this
later time.

So instead, why not have the PM core take care of all this?  There
could be a block_task_until_suspend_is_over() routine available for all
drivers to use.  Its effect would be exactly the same as sending the
current task into the freezer, but it wouldn't be the freezer that
exists now.  It would just be some routine that blocks until the system 
suspend is over.  We could call it "the icebox" instead of "the 
freezer".  :-)

Does that make you happier?


> > > Face it, we should seriously look into doing suspend/resume without a
> > > freezer.
> > 
> > I'm willing to try, although I think it will be a tremendous amount of 
> > work to verify that every driver does the right thing.  There's lots of 
> > support missing.  For example, don't you think we should block all 
> > sysfs I/O during suspend?  And likewise for insmod/rmmod?
> 
> sysfs is a matter of driver. If a sysfs read/write callback in a driver
> is hitting the HW, it most certainly already has some kind of locking.
> That locking can/should be extended to deal with blocking when the HW is
> suspended.

User tasks can cause driver binding by writing to sysfs.  Binding 
_can't_ be blocked in the driver; by then it's already too late.  If 
it is going to be blocked at all, it has to be blocked earlier.  One 
possibility is in the sysfs attribute code; another is to block all 
sysfs access.

Of course, another possibility is simply to fail the bind.  But that's 
not very satisfying, since suspends should be transparent.

> However, since it seems that people universally consider it very hard to
> get right (I don't but heh), Linus and Paul have come up with a solution
> for most simple enough directly-mapped drivers such as PCI (ok, that
> doesn't include USB) which is to simply do the HW suspend in a late
> callback after IRQs are off, and not bother with the rest.

Ben, you haven't given enough thought to the work needed to avoid 
locking problems.

For instance, you agree that during suspend we must not allow device or
driver registration or unregistration, right?  And we must not allow
driver binding or unbinding.  But these events generally involve
acquiring a device semaphore, in the driver core and quite often in the
core's caller.  Since that semaphore is also needed for calling the
suspend and resume methods, we have to be very careful about blocking
binding/unbinding/registration/unregistration.  It has to be done at a 
time when no device semaphores are held.

You also agree that kernel threads and workqueues must be allowed to 
operate during suspend.  But consider this: By writing the appropriate 
sysfs attribute, a user task can cause a workqueue item to be queued to 
keventd that tries to unregister a device.  That really puts you on the 
spot: Unregistration can't be allowed to fail, it can't be allowed to 
succeed during a suspend, and keventd can't be blocked!  So what should 
we do?

Alan Stern


^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-08  2:24                                   ` Alan Stern
@ 2007-07-08  4:39                                     ` Benjamin Herrenschmidt
  2007-07-08 18:46                                       ` Rafael J. Wysocki
  2007-07-08 18:26                                     ` Rafael J. Wysocki
  1 sibling, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-08  4:39 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm


> In my defense, you should realize that until Rafael's notifier chain
> was added (just a few weeks ago, still not in mainline I believe) there
> was no other way to do it.  Plug activity needs to be stopped before
> the child devices are suspended, and the PM core does not send any
> notification to drivers at that time.  All it does is activate the 
> freezer.

That's true. That was one of the reasons I've always wanted the
pre-suspend and post-resume hooks. (I prefer keeping the ordering there
too, rather than a notifier, but a notifier is fine I suppose).

Among the clients, we want the firmware stuff, the allocators,
etc... to put themselves in a state that won't deadlock during the
suspend cycle.

I think that's a much bigger issue overall than freezer vs. no
freezer :-)

Ben.



^ permalink raw reply	[flat|nested] 388+ messages in thread

* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-08  2:53                                     ` Alan Stern
@ 2007-07-08  5:14                                       ` Benjamin Herrenschmidt
  2007-07-08  5:19                                         ` Benjamin Herrenschmidt
                                                           ` (3 more replies)
  0 siblings, 4 replies; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-08  5:14 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm


> You agree that drivers need to block various activities during suspend.  
> Principally I/O requests, but other things as well.  So when one of 
> these requests arrives, the driver has to make it wait somehow and then 
> has to allow it to proceed at the appropriate time.

Yes. The main thing I see here is that there is nothing common among
drivers in what a "request" is and how it's processed.

For example, for block drivers, it's actually fairly simple to just stop
processing the request queue and wait for pending ones to complete.

For network, in a similar vein, we can mostly just tell the network
stack to stop sending us packets.

That's what I call the "main path". This is often the trivial part to
deal with, mostly because for a whole lot of drivers, it can be done via
a couple of helpers in the subsystem that the driver provides a service
to, via a helper, asking that subsystem to stop calling into the said
driver (the asking should be done by the driver itself of course, for
ordering reasons).

We have some helpers, but I think not enough, and that's where we should
focus imho. For example, I added fb_set_suspend() so that fbdev's can
request fbcon to stop accessing them (it doesn't solve the problem of
userland mmap's, that will have to be done, if we want to do it, in a
more sneaky way, using VM tricks, but the DRM nowadays has the
infrastructure to do it).

But that's only the "main" path. Aside for that, almost all drivers also
have sideband "request" input and some drivers don't actually live behind
a subsystem. That ranges from ioctl, to direct read/write on a char dev
from userland.

I think many of those cases can fairly well deal with just taking a PM
semaphore, that's how I did for a couple of things in the past, provided
that the request path isn't deadlocking with the semaphore held because
of the system suspending of course.  

But in a whole lot of cases, it's, I believe, perfectly kosher to just
return an error. You're trying to capture frame from your camera while
the machine is suspended ? error. At worst, your capture app will be
unhappy when you wakeup, nothing terrible and totally fixable in
userland if it's a problem.

In some cases, we could use a little bit more help from the subsystem.
Network for example, could have some explicit knowledge of the suspend
state, and in addition to stopping the queue would also stop calling
into things like change_mtu or set_multicast, provided it's agreed that
the driver will account for those changes on resume (the actual MTU
values or multicast lists are still updated in the netdev).
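
As a toy illustration of "account for those changes on resume", assuming
a made-up struct toy_netdev and toy_* helpers (none of this is the real
netdev API): the software value is always updated, and the hardware is
only touched when the device is awake.

```c
#include <stdbool.h>

/* Hypothetical, much simplified network device. */
struct toy_netdev {
	int mtu;	/* value the stack sees; always kept up to date */
	int hw_mtu;	/* value last programmed into the hardware */
	bool suspended;
};

/* Stand-in for the register writes a real driver would do. */
static void toy_program_hw(struct toy_netdev *dev)
{
	dev->hw_mtu = dev->mtu;
}

/* MTU change request: record it, but leave suspended hardware alone. */
void toy_change_mtu(struct toy_netdev *dev, int new_mtu)
{
	dev->mtu = new_mtu;
	if (!dev->suspended)
		toy_program_hw(dev);
}

void toy_suspend(struct toy_netdev *dev)
{
	dev->suspended = true;
}

/* On resume, replay whatever changed while the hardware was off. */
void toy_resume(struct toy_netdev *dev)
{
	dev->suspended = false;
	toy_program_hw(dev);
}
```

The caller never blocks and never sees an error; the only visible effect
is that the hardware catches up at resume time.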

> Normally a waitqueue or a struct completion would be used for this
> purpose.  

I think there's no "normal" scenario, each driver or family of drivers
will do things very differently.

> But either one puts the burden on the driver of defining a
> data structure and signalling it at the right time.

Yes.

> That time is generally when the device is resumed, but there's nothing
> wrong with delaying it slightly, to after all the devices have been resumed (i.e.,
> the time when the current PM code takes everything out of the freezer).

In fact, the best is to have parallel suspend/resume of drivers and
asynchronous resume, but that's off topic :-) (For the record, I did
some bits, like ADB resume, that way, i.e. asynchronously, to speed up
wakeup time).
>   
> In fact, we definitely don't want to unblock plug events until this
> later time.

There are two things, I believe. There's a generic issue with usermode
helpers, which it makes no sense to call between pre-suspend and
post-resume, and there's the specific issue of adding/removing devices.

I believe that "bus" drivers such as USB should indeed get a first round
of notifications to tell them to stop performing bus plug/unplug
operations (it's debatable whether we want to keep unplug going provided
we can stack up the usermode events and re-send them later though, but
let's say no for the sake of simplicity).

> So instead, why not have the PM core take care of all this?  There
> could be a block_task_until_suspend_is_over() routine available for all
> drivers to use.  Its effect would be exactly the same as sending the
> current task into the freezer, but it wouldn't be the freezer that
> exists now.  It would just be some routine that blocks until the system 
> suspend is over.  We could call it "the icebox" instead of "the 
> freezer".  :-)

I'm not totally sure about that. I like some of it, but I think it's
fairly different conceptually from the freezer (and the implementation
could be as trivial as a single system wide wait queue). 

Basically it has a very big difference to the current freezer, and I
like that, which is that we don't have some 3rd party trying to find out
what to freeze and what not (the freezer), but instead, we have
explicitly drivers or kernel threads sending -themselves- to the
"icebox" when they think it's a good idea. Think of it as lazy freezing
-> you only freeze lazy tasks that are trying to do something that
cannot be done because of suspend.
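
A minimal userspace sketch of such an icebox, assuming the "single
system wide wait queue" implementation mentioned above. The wait queue
is modelled with a pthread mutex and condition variable.
block_task_until_suspend_is_over() is the name from the quoted proposal;
icebox_begin_suspend(), icebox_end_suspend() and icebox_suspending() are
hypothetical names invented for the example, not existing PM core calls.

```c
#include <pthread.h>
#include <stdbool.h>

/* One system-wide "wait queue", modelled with a mutex and a condvar. */
static pthread_mutex_t icebox_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t icebox_wake = PTHREAD_COND_INITIALIZER;
static bool suspend_in_progress;

/* PM core side: called before the first device is suspended. */
void icebox_begin_suspend(void)
{
	pthread_mutex_lock(&icebox_lock);
	suspend_in_progress = true;
	pthread_mutex_unlock(&icebox_lock);
}

/* PM core side: called after the last device has been resumed. */
void icebox_end_suspend(void)
{
	pthread_mutex_lock(&icebox_lock);
	suspend_in_progress = false;
	pthread_cond_broadcast(&icebox_wake);	/* release every waiter */
	pthread_mutex_unlock(&icebox_lock);
}

bool icebox_suspending(void)
{
	bool ret;

	pthread_mutex_lock(&icebox_lock);
	ret = suspend_in_progress;
	pthread_mutex_unlock(&icebox_lock);
	return ret;
}

/*
 * Driver side: a task sends -itself- here when it is about to do
 * something that cannot be done during suspend.  Returns at once if
 * no suspend is in progress.
 */
void block_task_until_suspend_is_over(void)
{
	pthread_mutex_lock(&icebox_lock);
	while (suspend_in_progress)
		pthread_cond_wait(&icebox_wake, &icebox_lock);
	pthread_mutex_unlock(&icebox_lock);
}
```

A caller has to drop any lock that its own suspend method needs before
entering the icebox; otherwise the mutex deadlock discussed earlier in
the thread comes straight back.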

> Does that make you happier?

I think it's a fairly significant change from the current freezer and I
also think it's a very good idea. The more I think about it, the more I
like it, in the sense that it's a simple drop-in that you could put in a
lot of the ioctl path of drivers to just block tasks that are trying to
call in while suspending, and could be used selectively by things like
the USB hub threads.

> User tasks can cause driver binding by writing to sysfs.  Binding 
> _can't_ be blocked in the driver; by then it's already too late.  If 
> it is going to be blocked at all, it has to be blocked earlier.  One 
> possibility is in the sysfs attribute code; another is to block all 
> sysfs access.
> 
> Of course, another possibility is simply to fail the bind.  But that's 
> not very satisfying, since suspends should be transparent.

I don't think suspend has to be -that- transparent (though there is some
debate on whether it should be if we're gonna do some kind of fast
"light" suspend for things like OLPC) but overall, I agree that a bind
operation on sysfs should probably block until resume, and it does make
sense to have the logic to do that in sysfs itself. It could perfectly
use the above icebox thingy you came up with.

> Ben, you haven't given enough thought to the work needed to avoid 
> locking problems.
> 
> For instance, you agree that during suspend we must not allow device or
> driver registration or unregistration, right? 

Registration is fine, binding/unbinding is not (no problem putting the
driver on/off a list, though of course, you can't unregister a bound
driver without unbinding, it's perfectly fine to unregister an unbound
driver).

> And we must not allow driver binding or unbinding. 

That's where the meat is.

> But these events generally involve acquiring a device semaphore, in the
> driver core and quite often in the core's caller.  Since that semaphore
> is also needed for calling the suspend and resume methods, we have to be
> very careful about blocking binding/unbinding/registration/unregistration.

> It has to be done at a time when no device semaphores are held.

I think that should indeed be handled within the driver core. Though I
tend to think our driver core is a bit of a locking mess at the
moment :-) (Who says it could use more RCU-like constructs ?).

Regarding unbinding, that's debatable, it might be perfectly all right to
unbind while suspended, though then, there is the question of a driver
being later on bound to a piece of HW that is suspended (though I
suppose that could happen today... some machines have their firmware
leave some devices off at boot).

As a general matter, If we have those pre-suspend/post-resume notifiers
and we adapt the few (there isn't that much) bus drivers so that they
stop all probe/unprobe operations before the suspend dance starts, we
avoid a lot of that problem. At this point, it becomes fair enough for
bind/unbind, in the core, to return an error (and maybe even stack trace
in dmesg to catch the culprits) while suspend is in progress.

The only remaining annoying case is manual bind/unbind via sysfs which
is a good candidate for the above described icebox.

> You also agree that kernel threads and workqueues must be allowed to 
> operate during suspend.

Yes, unless kernel threads explicitly decide to stop themselves (for
example, khubd is a good candidate for that). Again, not a 3rd party
trying to decide what to freeze and what not, but the drivers or kernel
threads themselves deciding it.
 
> But consider this: By writing the appropriate 
> sysfs attribute, a user task can cause a workqueue item to be queued to 
> keventd that tries to unregister a device.  That really puts you on the 
> spot: Unregistration can't be allowed to fail, it can't be allowed to 
> succeed during a suspend, and keventd can't be blocked!  So what should 
> we do?

We can either stop it at the sysfs write level, or we can have the
workqueue task reschedule itself later until we are resumed. In fact,
workqueue items being what they are (queue items), we could imagine
having a special list where they enqueue themselves to be rescheduled after
resume.
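
A small sketch of that "special list", written as a single-threaded
userspace model. struct work_item, run_or_defer(), pm_mark_suspended(),
pm_resume_and_flush() and the demo_* names are all hypothetical and are
not the kernel workqueue API; a real version would also need locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical work item, loosely modelled on a workqueue entry. */
struct work_item {
	void (*func)(struct work_item *w);
	struct work_item *next;
};

static bool suspended;
static struct work_item *deferred_head;
static struct work_item **deferred_tail = &deferred_head;

/* Example payload: counts how often the "unregister" work really ran. */
int demo_runs;

void demo_unregister(struct work_item *w)
{
	(void)w;
	demo_runs++;
}

/* Run the item now, or park it on the deferred list during suspend. */
void run_or_defer(struct work_item *w)
{
	assert(w && w->func);
	if (!suspended) {
		w->func(w);
		return;
	}
	w->next = NULL;
	*deferred_tail = w;
	deferred_tail = &w->next;
}

void pm_mark_suspended(void)
{
	suspended = true;
}

/* After resume: run everything that was parked, in submission order. */
void pm_resume_and_flush(void)
{
	struct work_item *w = deferred_head;

	suspended = false;
	deferred_head = NULL;
	deferred_tail = &deferred_head;
	while (w) {
		struct work_item *next = w->next;

		w->func(w);
		w = next;
	}
}
```

The worker thread itself never blocks here, which matches the
constraint that keventd has to keep running; only the individual item
is postponed.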

Don't get me wrong, I never said we don't need generic infrastructure
and utilities, such as your proposed icebox scheme, or some of those
workqueue bits, helpers in subsystems, etc...

I just think that the freezer approach, as it is, is backward. We can't
have a 3rd party try to discriminate what to freeze and what not, it
will always get something wrong, and in some cases with the wrong timing
or ordering.

Cheers,
Ben.




* Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-08  5:14                                       ` Benjamin Herrenschmidt
@ 2007-07-08  5:19                                         ` Benjamin Herrenschmidt
  2007-07-08 20:17                                           ` Alan Stern
  2007-07-08 19:15                                         ` Rafael J. Wysocki
                                                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 388+ messages in thread
From: Benjamin Herrenschmidt @ 2007-07-08  5:19 UTC (permalink / raw)
  To: Alan Stern
  Cc: Kyle Moffett, Nigel Cunningham, Pavel Machek, Rafael J. Wysocki,
	Matthew Garrett, linux-kernel, linux-pm


> I think it's a fairly significant change from the current freezer and I
> also think it's a very good idea. The more I think about it, the more I
> like it, in the sense that it's a simple drop-in that you could put in a
> lot of the ioctl path of drivers to just block tasks that are trying to
> call in while suspending, and could be used selectively by things like
> the USB hub threads.

Note that we could also have a per-device "icebox" (just a waitqueue).
Might be nice for a device to resume whatever it froze just after it
resumed itself, might even be necessary in case whatever thread got
frozen is necessary for handling child devices.
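
To make "just a waitqueue" concrete, here is a small userspace model
using a pthread mutex and condition variable in place of a kernel wait
queue. The struct and function names (icebox, icebox_freeze,
icebox_thaw, icebox_enter, task_main) are hypothetical, not an existing
interface:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device icebox: a flag plus a wait queue (condvar). */
struct icebox {
    pthread_mutex_t lock;
    pthread_cond_t  thawed;
    bool frozen;
    int  waiters;          /* tasks currently parked in the icebox */
};

static void icebox_init(struct icebox *box)
{
    assert(box != NULL);
    pthread_mutex_init(&box->lock, NULL);
    pthread_cond_init(&box->thawed, NULL);
    box->frozen = false;
    box->waiters = 0;
}

/* Device suspend path: start parking callers. */
static void icebox_freeze(struct icebox *box)
{
    pthread_mutex_lock(&box->lock);
    box->frozen = true;
    pthread_mutex_unlock(&box->lock);
}

/* Device resume path: release everything this device parked. */
static void icebox_thaw(struct icebox *box)
{
    pthread_mutex_lock(&box->lock);
    box->frozen = false;
    pthread_cond_broadcast(&box->thawed);
    pthread_mutex_unlock(&box->lock);
}

/* Called on entry to e.g. an ioctl path; blocks while frozen. */
static void icebox_enter(struct icebox *box)
{
    pthread_mutex_lock(&box->lock);
    box->waiters++;
    while (box->frozen)
        pthread_cond_wait(&box->thawed, &box->lock);
    box->waiters--;
    pthread_mutex_unlock(&box->lock);
}

static int icebox_waiters(struct icebox *box)
{
    pthread_mutex_lock(&box->lock);
    int n = box->waiters;
    pthread_mutex_unlock(&box->lock);
    return n;
}

/* Example task: parks in the icebox, then records that it got through. */
struct task_arg { struct icebox *box; int done; };

static void *task_main(void *p)
{
    struct task_arg *arg = p;
    icebox_enter(arg->box);
    arg->done = 1;
    return NULL;
}
```

Because each device owns its own icebox, it can thaw its tasks right
after its own resume, before any child devices are resumed.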

Cheers,
Ben.




* Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway
  2007-07-06 15:13                             ` Alan Stern
@ 2007-07-08  7:19                               ` Paul Mackerras
  2007-07-08  7:35                               ` [PATCH] Remove process freezer from suspend to RAM pathway (philosophical) Oleg Verych
  1 sibling, 0 replies; 388+ messages in thread
From: Paul Mackerras @ 2007-07-08  7:19 UTC (permalink / raw)
  To: Alan Stern
  Cc: Benjamin Herrenschmidt, Johannes Berg, Rafael J. Wysocki,
	Linux-pm mailing list, Kernel development list, Pavel Machek,
	Matthew Garrett

Alan Stern writes:

> In answer to both questions: We need the freezer in order to implement 
> hibernate.  Even if we take your advice and stop using the freezer 
> during suspend, these issues would still remain and would need to be 
> solved.

Stepping back for a minute, let's think about what the freezer is
trying to achieve.  I think that currently there are three basic
design goals:

A. Ensure that device drivers don't get I/O requests after being
   suspended.

B. Ensure that driver suspend routines don't end up blocking forever
   on mutexes or semaphores held by frozen tasks.

C. Provide a way to get an atomic snapshot of memory for hibernation.

Now, it's easy enough to freeze all processes (or all except one), if
you don't have goal B.  Just offline non-boot cpus and disable
interrupts, or use stop_machine().  But goal B implies that you can't
necessarily just stop all tasks wherever they are.  In fact it means
there are points where it is safe to stop a given task, and there may
be points where it isn't safe to stop it - and there is no practical
way to determine those points reliably.[1]

That implies to me that we can have a freezer as long as we do nothing
that can sleep, while tasks are frozen.  In other words, I think the
freezer is a viable option for hibernation as long as we restrict
driver hibernate routines to doing only things which don't sleep.  I
_think_ that should be doable since hibernate routines only need to
wait for outstanding DMAs to complete, as I understand it, which can
be done by polling.
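
As an illustration of "wait by polling, never sleep", here is a small
userspace C model. The device, its pending counter, and the function
names (fake_dev, dma_idle, quiesce_by_polling) are all made up for the
sketch; a real driver would read a hardware status register instead:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model: 'pending' stands in for a DMA status register. */
struct fake_dev {
    int pending;           /* outstanding DMA transfers */
};

/* Simulates one status read; one transfer completes per read. */
static bool dma_idle(struct fake_dev *dev)
{
    if (dev->pending > 0)
        dev->pending--;
    return dev->pending == 0;
}

/*
 * Busy-wait for outstanding DMA, at most max_polls status reads.
 * Returns 0 once the device is idle, -1 if the poll budget runs out.
 * It takes no locks and never sleeps, so it cannot block on a mutex
 * held by a frozen task.
 */
static int quiesce_by_polling(struct fake_dev *dev, int max_polls)
{
    assert(dev != NULL);
    for (int i = 0; i < max_polls; i++) {
        if (dma_idle(dev))
            return 0;
    }
    return -1;
}
```

The bounded poll count is a choice made for the sketch so that a wedged
device produces an error instead of a hang.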

Paul.

[1] For a start, there's no way to determine which mutexes a task
holds.  Even if there were, we would then also have to know which
mutexes it is going to try to acquire before it releases the one we
want, which is pretty much unknowable.

One can say "we'll only freeze kernel tasks that ask to be frozen" but
then one has no way to guarantee A, and we still don't reliably
guarantee B if we have user-level filesystems or device drivers.


* Re: malicious filesystems (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-07 23:33                                         ` malicious filesystems (was Re: [linux-pm] Re: [PATCH] Remove process freezer from suspend to RAM pathway) Pavel Machek
@ 2007-07-08  7:21                                           ` Miklos Szeredi
  2007-07-08  8:15                                             ` Off topically about Re: malicious filesystems (was " Oleg Verych
                                                               ` (2 more replies)
  2007-07-09 16:19                                           ` Miklos Szeredi
  1 sibling, 3 replies; 388+ messages in thread
From: Miklos Szeredi @ 2007-07-08  7:21 UTC (permalink / raw)
  To: pavel
  Cc: miklos, oliver, paulus, stern, johannes, rjw, linux-pm,
	linux-kernel, mjg59, benh

> > > We can just wait for all fuse requests to be serviced before
> > > proceeding further with freeze, right?
> > 
> > Right.  Nice way to slow down or stop the suspend with an unprivileged
> > process.  Avoiding that sort of DoS is one of the design goals of
> > fuse.
> 
> So you want me to handle _malicious_ filesystems now?

What I'd like, is a suspend, that works reliably, regardless of the
state of any userspace filesystem, network servers and such.

> That should be easy... :-). You already have nasty deadlocks in FUSE,
> and you solve them by "root can echo 1 > abort"... so allow me the
> same possibility.
> 
> We can tell fused we are freezing, and if all the requests are not
> serviced within, say, 30 seconds, we call the filesystem malicious and
> do echo 1 > abort.

Arbitrary time limits, nice.  Not.

This freezer is like an old house that's close to collapsing, and you
are basically just thinking of where to prop it up further.  To
continue this brilliant analogy, Rafael's patch at least demolishes
the worst part of the house, where bricks are already falling on our
head ;)

> Not ideal, but neither is allowing malicious filesystems in the first
> place...

Malicious programs are not something specific to fuse.  A lot of the
multiuser/multitasking OS design is about isolating things, so such a
program is limited in the damage it can do.

> > Look at it this way: the task of the freezer is to stop new I/O
> > hitting the hardware.  But it is totally indiscriminate about what it
> > stops, it tries to stop _everything_ even things which have nothing to
> > do with hardware.
> > 
> > Not nice.
> 
> Not nice, but we don't know any better for now. "Just fix all the
> drivers" basically means "just fix 90% of kernel".

And how much of that 90% currently has any power management?

Miklos


* Re: [PATCH] Remove process freezer from suspend to RAM pathway (philosophical)
  2007-07-06 15:13                             ` Alan Stern
  2007-07-08  7:19                               ` Paul Mackerras
@ 2007-07-08  7:35                               ` Oleg Verych
  1 sibling, 0 replies; 388+ messages in thread
From: Oleg Verych @ 2007-07-08  7:35 UTC (permalink / raw)
  To: linux-kernel

* Alan Stern <Date: Fri, 6 Jul 2007 11:13:14 -0400 (EDT)>

>> Why are you guys working so hard and spending so much energy to try to
>> avoid doing the right thing is beyond my understanding...
>> 
>> > It _does_ apply to kernel threads.  That's exactly why I wrote above 
>> > that kernel threads which try to do I/O during a suspend will need 
>> > extra attention.
>> 
>> Ok none at all if you don't have a freezer.
>
> In answer to both questions: We need the freezer in order to

None of the *complex* living creatures survive motionless freezing (with
ice).

That's why stupid me just doesn't care about all that stuff in
Linux. It's nice to see somebody approach this from a technical, rather
than off-topic, POV :).
____



* Off topically about Re: malicious filesystems (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
  2007-07-08  7:21                                           ` Miklos Szeredi
@ 2007-07-08  8:15                                             ` Oleg Verych
  2007-07-08 12:37                                             ` malicious filesystems (was Re: [linux-pm] " Pavel Machek
  2007-07-08 14:06                                             ` Rafael J. Wysocki
  2 siblings, 0 replies; 388+ messages in thread
From: Oleg Verych @ 2007-07-08  8:15 UTC (permalink / raw)
  To: linux-kernel

>> > > We can just wait for all fuse requests to be serviced before
>> > > proceeding further with freeze, right?
>> > 
>> > Right.  Nice way to slow down or stop the suspend with an unprivileged
>> > process.  Avoiding that sort of DoS is one of the design goals of
>> > fuse.
>> 
>> So you want me to handle _malicious_ filesystems now?
>
> What I'd like, is a suspend, that works reliably, regardless of the
> state of any userspace filesystem, network