LKML Archive on lore.kernel.org
* Suspend2 merge preparation: Rationale behind the freezer changes.
@ 2004-05-17  6:49 Nigel Cunningham
  2004-05-21  9:33 ` Pavel Machek
  0 siblings, 1 reply; 14+ messages in thread
From: Nigel Cunningham @ 2004-05-17  6:49 UTC (permalink / raw)
  To: Linux Kernel Mailing List

Hi all.

In merging suspend2, one of the biggest changes is in the area of 
freezing activity. I'm writing this email in an effort to improve 
understanding of why I've implemented the freezer differently, and 
perhaps get some ideas as to how I could better achieve the desired results.

First of all, let me explain that although swsusp and suspend2 work at a 
very fundamental level in the same way, there are also some important 
differences. Of particular relevance to this conversation is the fact 
that swsusp makes what is as close to an atomic copy of the entire image 
to be saved as we can get and then saves it. In contrast, suspend2 saves 
one portion of the memory (lru pages), makes an atomic copy of the rest 
and then saves the atomic copy of the second part.

In order, then, for suspend2 to get the equivalent of an atomic copy of 
memory, pages must not be added to or removed from the LRU list once we 
start saving the image. (There are other issues and measures taken, but 
they're not relevant here).

One of the problems we face in achieving this goal is the fact that 
timers & timeouts can still fire during this period. These can of course 
be used to start new processes and to cause others (eg sleep) to exit, 
with the result that we can end up with changes to the LRU lists and 
therefore an inconsistent image.

Secondly, we have a more basic problem with the existing freezer 
implementation. A fundamental assumption made by it is that the order in 
which processes are signalled does not matter; that there will be no 
deadlocks caused by freezing one process before another. This simply 
isn't true.

Thirdly, the existing implementation does not allow us to quickly stop 
activity. Under heavy load, particularly heavy I/O (assuming the freezer 
does work), it may take quite a while for processes to respond to the
pseudo-signal and enter the refrigerator. New processes may also be 
spawned, further complicating matters. The busier the system is, the 
more hit-and-miss freezing becomes.

The implementation of the freezer that I have developed addresses these 
concerns by adding an atomic count of the number of processes in
critical paths. The first part of the freezer simply waits for the 
number of processes in critical paths to reach zero.

A critical path is defined as one in which a process takes locks or 
carries out other activities which could deadlock with another process 
or make the process not respond to a freezer signal. When a process 
enters a critical path, the ACTIVITY_START macro causes it to be marked 
PF_FRIDGE_WAIT and the count of processes in critical paths is 
atomically incremented. When it returns, a matching ACTIVITY_END macro
reverses these effects. Use of a local variable makes it safe for
processes to pass through multiple ACTIVITY_START calls; only the
matching ACTIVITY_END will reverse the initial ACTIVITY_START. It may be
that in the middle of a critical path, the process sleeps at a point
where we could safely suspend. This can be indicated by surrounding the
sleep with ACTIVITY_PAUSING and ACTIVITY_RESTARTING calls. The thread is
thus temporarily marked as safely suspendable.
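
(Editorial sketch, not the suspend2 source.) The pairing rule described above can be illustrated in user-space C. The macro names are rendered here as functions, and `struct task`, `busy_paths`, `busy_count()` and the flag value are all assumptions made for the example; the real macros and their arguments are not shown in this mail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PF_FRIDGE_WAIT 0x1u          /* flag value is made up */

/* Hypothetical stand-in for the kernel's task structure. */
struct task {
    unsigned int flags;
};

/* Number of tasks currently inside a critical path. */
static atomic_int busy_paths;

/* Returns true only if this call marked the task. The caller keeps the
 * result in a local variable and passes it to activity_end(). */
bool activity_start(struct task *t)
{
    if (t->flags & PF_FRIDGE_WAIT)
        return false;                /* nested: the outer START owns the mark */
    t->flags |= PF_FRIDGE_WAIT;
    atomic_fetch_add(&busy_paths, 1);
    return true;
}

/* Only the END that matches the marking START undoes it. */
void activity_end(struct task *t, bool owner)
{
    if (!owner)
        return;
    assert(t->flags & PF_FRIDGE_WAIT);
    t->flags &= ~PF_FRIDGE_WAIT;
    atomic_fetch_sub(&busy_paths, 1);
}

/* Before a sleep at a safe point: leave the count temporarily. */
bool activity_pausing(struct task *t)
{
    if (!(t->flags & PF_FRIDGE_WAIT))
        return false;
    t->flags &= ~PF_FRIDGE_WAIT;
    atomic_fetch_sub(&busy_paths, 1);
    return true;
}

/* After the sleep: rejoin the count if we left it. */
void activity_restarting(struct task *t, bool paused)
{
    if (!paused)
        return;
    t->flags |= PF_FRIDGE_WAIT;
    atomic_fetch_add(&busy_paths, 1);
}

int busy_count(void)
{
    return atomic_load(&busy_paths);
}
```

The boolean returned by `activity_start()` plays the role of the local variable mentioned above: a nested START returns false, so its END is a no-op and the count changes only once per task.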

These four macros play a further role. When we begin to wait for the 
activity counter to reach zero, a flag is set to record this fact. Macro 
calls check this flag, and a process reaching a START or RESTARTING 
activity macro while the flag is set will be refrigerated at that point 
until after the suspend cycle is completed. This helps us quiesce the 
system more quickly.
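
(Editorial sketch, not the suspend2 source.) The gating behaviour described above might look roughly like this in user-space C. `freezer_begin()`, `activity_start_gate()`, `PF_FROZEN` and the flag values are assumptions for illustration; a real implementation would put the task to sleep in the refrigerator rather than just setting a flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define PF_FRIDGE_WAIT 0x1u   /* task is inside a critical path */
#define PF_FROZEN      0x2u   /* hypothetical: task is parked in the refrigerator */

struct task {
    unsigned int flags;
};

static atomic_bool freeze_new_activity;   /* set once the freezer starts waiting */
static atomic_int  busy_paths;            /* tasks inside critical paths */

/* The freezer records that it is now waiting for the count to reach zero. */
void freezer_begin(void)
{
    atomic_store(&freeze_new_activity, true);
}

/* Hook at the start of a critical path.
 * Returns true if the task may enter the path, false if it was held. */
bool activity_start_gate(struct task *t)
{
    assert(t != NULL);
    if (atomic_load(&freeze_new_activity)) {
        t->flags |= PF_FROZEN;            /* real code would sleep here */
        return false;
    }
    t->flags |= PF_FRIDGE_WAIT;
    atomic_fetch_add(&busy_paths, 1);
    return true;
}

int busy_paths_now(void)
{
    return atomic_load(&busy_paths);
}
```

Because a held task never increments the counter, the count can only fall once the flag is set, which is what lets the system quiesce.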

Some processes receive special treatment during this period.

A process marked PF_NOFREEZE is never refrigerated or counted in 
measuring activity.

A process may instead be marked PF_SYNCTHREAD. It is good for us to sync 
all dirty data to disc prior to suspending, just in case something goes
wrong or the user uses noresume. By doing this, we maximise the 
filesystem integrity as far as is possible. PF_SYNCTHREAD is used for 
processes such as journalling threads that are used in doing this, and 
for processes which begin a filesystem sync. These processes are allowed 
to continue operation during the initial phase, and are frozen later.

The freezing process is thus:

1) Set FREEZE_NEW_ACTIVITY flag and wait for activity count to reach 
zero. New activity is held, existing activity completes critical paths 
or pauses at a safe place and syncing runs to completion.
2) Do our own sys_sync, just in case none were already running.
3) Set FREEZE_UNREFRIGERATED flag. Syncthreads will now enter the 
refrigerator of their own accord or by being signalled.
4) Signal remaining processes to be frozen. Deadlocking is avoided 
because those that would start critical paths are held at the 
ACTIVITY_START/RESTARTING calls, prior to taking the locks that would 
cause the deadlocks.
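
(Editorial sketch, not the suspend2 source.) The way the two state flags and the per-process flags interact in the steps above can be modelled as a small decision function. The state and PF_ names come from the description; the enum layout, `PF_FROZEN`, `should_freeze()` and `freeze_pass()` are assumptions, and the model treats every ordinary task as if it had reached a hook or been signalled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag names follow the description above; the values are made up. */
#define PF_NOFREEZE   0x1u
#define PF_SYNCTHREAD 0x2u
#define PF_FROZEN     0x4u   /* hypothetical: task is in the refrigerator */

enum freezer_state {
    FREEZER_OFF,
    FREEZE_NEW_ACTIVITY,     /* step 1: hold new activity, let syncs finish */
    FREEZE_UNREFRIGERATED    /* steps 3 and 4: everything still running goes in */
};

struct task {
    unsigned int flags;
};

/* Would a task that reaches a hook (or is signalled) be refrigerated now? */
bool should_freeze(const struct task *t, enum freezer_state s)
{
    assert(t != NULL);
    if (s == FREEZER_OFF || (t->flags & PF_NOFREEZE))
        return false;                        /* never frozen, never counted */
    if (t->flags & PF_SYNCTHREAD)
        return s == FREEZE_UNREFRIGERATED;   /* sync helpers are frozen later */
    return true;
}

/* Apply the decision to every task and return how many are frozen. */
size_t freeze_pass(struct task *tasks, size_t n, enum freezer_state s)
{
    size_t frozen = 0;

    for (size_t i = 0; i < n; i++) {
        if (should_freeze(&tasks[i], s))
            tasks[i].flags |= PF_FROZEN;
        if (tasks[i].flags & PF_FROZEN)
            frozen++;
    }
    return frozen;
}
```

With one ordinary task, one PF_SYNCTHREAD task and one PF_NOFREEZE task, a pass in the first state freezes only the ordinary task, and a pass in the second state adds the sync thread.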

Regards,

Nigel
-- 
Nigel & Michelle Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6254 0216 (home)

Evolution (n): A hypothetical process whereby infinitely improbable 
events occur
with alarming frequency, order arises from chaos, and no one is given 
credit.


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-17  6:49 Suspend2 merge preparation: Rationale behind the freezer changes Nigel Cunningham
@ 2004-05-21  9:33 ` Pavel Machek
  2004-05-21 12:28   ` Nigel Cunningham
  0 siblings, 1 reply; 14+ messages in thread
From: Pavel Machek @ 2004-05-21  9:33 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: Linux Kernel Mailing List

Hi!

> In merging suspend2, one of the biggest changes is in the area of 
> freezing activity. I'm writing this email in an effort to improve 
> understanding of why I've implemented the freezer differently, and 
> perhaps get some ideas as to how I could better achieve the desired
> results.

Thanks for this. (Btw you might want to Cc me on such mails, I read
both list and personal mails, but you'll get way better response time.)

> First of all, let me explain that although swsusp and suspend2 work at a 
> very fundamental level in the same way, there are also some important 
> differences. Of particular relevance to this conversation is the fact 
> that swsusp makes what is as close to an atomic copy of the entire image 
> to be saved as we can get and then saves it. In contrast, suspend2 saves 
> one portion of the memory (lru pages), makes an atomic copy of the rest 
> and then saves the atomic copy of the second part.

Hmm, I did not realize this difference. Doing these hacks with LRU
seems pretty crazy to me...

Would it be possible to stop processes when they try to manipulate
LRU, instead?

> Secondly, we have a more basic problem with the existing freezer 
> implementation. A fundamental assumption made by it is that the order in 
> which processes are signalled does not matter; that there will be no 
> deadlocks caused by freezing one process before another. This simply 
> isn't true.

It better should be. If it is not true, then kill -STOP -1 does not
work, and that would be a kernel bug, right?

When user thread is stopped, it should better not hold any lock,
because otherwise we have problem anyway.

Kernel threads are different, and each must be handled separately,
maybe even with some ordering. But there's relatively small number of
kernel threads... 

> Thirdly, the existing implementation does not allow us to quickly stop 
> activity. Under heavy load, particularly heavy I/O (assuming the freezer 
> does work), it may take quite a while for processes to respond to the
> pseudo-signal and enter the refrigerator. New processes may also be 
> spawned, further complicating matters. The busier the system is, the 
> more hit-and-miss freezing becomes.

I agree it can take longer, but modulo bugs, it should be always possible.

> The implementation of the freezer that I have developed addresses these 
> concerns by adding an atomic count of the number of processes in
> critical paths. The first part of the freezer simply waits for the 
> number of processes in critical paths to reach zero.

Exactly, you slowed down critical paths of kernel... This makes patch
big, ugly, and is bad idea.

> These four macros play a further role. When we begin to wait for the 
> activity counter to reach zero, a flag is set to record this fact. Macro 
> calls check this flag, and a process reaching a START or RESTARTING 
> activity macro while the flag is set will be refrigerated at that point 
> until after the suspend cycle is completed. This helps us quiesce the 
> system more quickly.

Adding hooks to "fast" stuff like read()/write()/open is no-no. Adding
small number of hooks to slower stuff like exec()/exit() might be
acceptable. Could you get away with that?

								Pavel
-- 
934a471f20d6580d5aad759bf0d97ddc


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21  9:33 ` Pavel Machek
@ 2004-05-21 12:28   ` Nigel Cunningham
  2004-05-21 13:34     ` Pavel Machek
  2004-05-21 13:42     ` Oliver Neukum
  0 siblings, 2 replies; 14+ messages in thread
From: Nigel Cunningham @ 2004-05-21 12:28 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux Kernel Mailing List

Hi.

Pavel Machek wrote:
> Thanks for this. (Btw you might want to Cc me on such mails, I read
> both list and personal mails, but you'll get way better response time.)

Humble apologies! :>

>>First of all, let me explain that although swsusp and suspend2 work at a 
>>very fundamental level in the same way, there are also some important 
>>differences. Of particular relevance to this conversation is the fact 
>>that swsusp makes what is as close to an atomic copy of the entire image 
>>to be saved as we can get and then saves it. In contrast, suspend2 saves 
>>one portion of the memory (lru pages), makes an atomic copy of the rest 
>>and then saves the atomic copy of the second part.
> 
> 
> Hmm, I did not realize this difference. Doing these hacks with LRU
> seems pretty crazy to me...

No... this is what you already know, just described differently. You 
mentioned in your documentation that suspend2 overcomes the half of 
memory limitation by saving the image in two parts: the second part is 
LRU (unless I have my terminology confused: I'm talking about pages on 
the inactive and active lists). By the time this is done, all other 
processes are stopped, so there's no danger of corruption anyway. 
Suspend2 itself doesn't affect LRU because I switched from using the 
'normal' swap read/write calls ages ago, as part of adding swapfile 
support. For 2.6 we're directly using BIOs.

> Would it be possible to stop processes when they try to manipulate
> LRU, instead?

They're already stopped.

>>Secondly, we have a more basic problem with the existing freezer 
>>implementation. A fundamental assumption made by it is that the order in 
>>which processes are signalled does not matter; that there will be no 
>>deadlocks caused by freezing one process before another. This simply 
>>isn't true.
> 
> 
> It better should be. If it is not true, then kill -STOP -1 does not
> work, and that would be a kernel bug, right?

We already discussed the example of trying to do an ls on an NFS share 
and the NFS threads being frozen first. I can come up with more examples 
if you'd like. I guess the simplest one (off the top of my head) would 
be freezing kjournald while processes are submitting and waiting on I/O.

> When user thread is stopped, it should better not hold any lock,
> because otherwise we have problem anyway.

Yes, but we're not just talking about user threads. We could 
differentiate kernel threads and user threads (presumably using another 
PF_ flag?) and attempt to freeze the user threads first.

> Kernel threads are different, and each must be handled separately,
> maybe even with some ordering. But there's relatively small number of
> kernel threads... 

Yes, but what order? I played with that problem for ages. Perhaps I just
didn't find the right combination.

>>Thirdly, the existing implementation does not allow us to quickly stop 
>>activity. Under heavy load, particularly heavy I/O (assuming the freezer 
>>does work), it may take quite a while for processes to respond to the
>>pseudo-signal and enter the refrigerator. New processes may also be 
>>spawned, further complicating matters. The busier the system is, the 
>>more hit-and-miss freezing becomes.
> 
> I agree it can take longer, but modulo bugs, it should be always possible.

Should... I'll find some time to roll a freezer implementation that does 
what you're suggesting (try user space threads first, seek an order for 
kernel space threads). If we can do it that way, it will be less 
invasive. I'll see...

>>The implementation of the freezer that I have developed addresses these 
>>concerns by adding an atomic count of the number of processes in
>>critical paths. The first part of the freezer simply waits for the 
>>number of processes in critical paths to reach zero.
> 
> Exactly, you slowed down critical paths of kernel... This makes patch
> big, ugly, and is bad idea.

Maybe I wasn't clear enough. When we're not suspending, all that is 
added to the paths that are modified is:

- 9 tests, possibly resulting in refrigerator entry or immediately 
dropping through, setting the PF_FRIDGE_WAIT flag and incrementing the 
atomic_t at the start of a busy path.
- 2 tests, possibly resetting the flag & decrementing the counter at the 
end.
- 3 tests, setting a local variable, resetting the FRIDGE_WAIT flag and
decrementing the atomic_t when dropping locks and sleeping in kernel.
- 10 tests, possibly resulting in refrigerator entry or immediately 
dropping through, restoring the PF_FRIDGE_WAIT flag and reincrementing 
the atomic_t after such sleeps.

I've been using this approach for months, and my Celeron 933 doesn't 
feel slow at all. I've had no complaints from users either.

We really need to ask how critical these paths really are: some of them 
are certainly more commonly used, such as sys_read & sys_write. The vast 
majority, however, are less commonly used. I wonder if it's worth 
getting a benchmarking program. I'll try your suggestion above first.

>>These four macros play a further role. When we begin to wait for the 
>>activity counter to reach zero, a flag is set to record this fact. Macro 
>>calls check this flag, and a process reaching a START or RESTARTING 
>>activity macro while the flag is set will be refrigerated at that point 
>>until after the suspend cycle is completed. This helps us quiesce the 
>>system more quickly.
> 
> 
> Adding hooks to "fast" stuff like read()/write()/open is no-no. Adding
> small number of hooks to slower stuff like exec()/exit() might be
> acceptable. Could you get away with that?

No. Reading and writing is exactly what we want to be able to pause. 
Otherwise we get processes stuck waiting on pages.

Summary:
- I'll try your user space first, kernel space afterwards suggestion.
- I'll also look into benchmarking the system with and without suspend2 
compiled in (ie with and without the hooks, since they compile away to 
nothing without CONFIG_SOFTWARE_SUSPEND2).

Regards,

Nigel


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 12:28   ` Nigel Cunningham
@ 2004-05-21 13:34     ` Pavel Machek
  2004-05-22  2:43       ` Nigel Cunningham
  2004-05-21 13:42     ` Oliver Neukum
  1 sibling, 1 reply; 14+ messages in thread
From: Pavel Machek @ 2004-05-21 13:34 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: Linux Kernel Mailing List

Hi!

> >>First of all, let me explain that although swsusp and suspend2 work at a 
> >>very fundamental level in the same way, there are also some important 
> >>differences. Of particular relevance to this conversation is the fact 
> >>that swsusp makes what is as close to an atomic copy of the entire image 
> >>to be saved as we can get and then saves it. In contrast, suspend2 saves 
> >>one portion of the memory (lru pages), makes an atomic copy of the rest 
> >>and then saves the atomic copy of the second part.
> >
> >
> >Hmm, I did not realize this difference. Doing these hacks with LRU
> >seems pretty crazy to me...
> 
> No... this is what you already know, just described differently. You 
> mentioned in your documentation that suspend2 overcomes the half of 
> memory limitation by saving the image in two parts: the second part is 
> LRU (unless I have my terminology confused: I'm talking about pages on 

Yes, I just did not realize that it means changes for freezer.

> >>Secondly, we have a more basic problem with the existing freezer 
> >>implementation. A fundamental assumption made by it is that the order in 
> >>which processes are signalled does not matter; that there will be no 
> >>deadlocks caused by freezing one process before another. This simply 
> >>isn't true.
> >
> >
> >It better should be. If it is not true, then kill -STOP -1 does not
> >work, and that would be a kernel bug, right?
> 
> We already discussed the example of trying to do an ls on an NFS share 
> and the NFS threads being frozen first. I can come up with more examples 
> if you'd like. I guess the simplest one (off the top of my head) would 
> be freezing kjournald while processes are submitting and waiting on
> I/O.

Agreed, kernel processes need to go last.

> >When user thread is stopped, it should better not hold any lock,
> >because otherwise we have problem anyway.
> 
> Yes, but we're not just talking about user threads. We could 
> differentiate kernel threads and user threads (presumably using another 
> PF_ flag?) and attempt to freeze the user threads first.

There should be some other way to see kernel threads... their mm is
init_mm or something like that.
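
(Editorial sketch, not code from either implementation.) A two-pass freeze that handles user tasks before kernel threads could be structured as below. In mainline Linux a kernel thread's `mm` pointer is NULL, so no extra PF_ flag is needed to tell the two apart; the structures and `freeze_user_then_kernel()` here are hypothetical user-space stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for mm_struct and task_struct. */
struct mm_stub {
    int unused;
};

struct task {
    const char     *name;
    struct mm_stub *mm;       /* NULL for a kernel thread */
    bool            frozen;
};

bool is_kernel_thread(const struct task *t)
{
    assert(t != NULL);
    return t->mm == NULL;
}

/* Freeze user tasks in a first pass and kernel threads in a second.
 * Stores the task names in freezing order and returns how many were frozen. */
size_t freeze_user_then_kernel(struct task *tasks, size_t n,
                               const char **order)
{
    size_t done = 0;

    for (int pass = 0; pass < 2; pass++) {
        bool want_kernel = (pass == 1);

        for (size_t i = 0; i < n; i++) {
            if (tasks[i].frozen ||
                is_kernel_thread(&tasks[i]) != want_kernel)
                continue;
            tasks[i].frozen = true;   /* real code would signal and wait */
            order[done++] = tasks[i].name;
        }
    }
    return done;
}
```

This only fixes the user-versus-kernel ordering; it says nothing about the order among kernel threads, which is the open question in this thread.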

> >Kernel threads are different, and each must be handled separately,
> >maybe even with some ordering. But there's relatively small number of
> >kernel threads... 
> 
> Yes, but what order? I played with that problem for ages. Perhaps I just 
>  didn't find the right combination.

... ... hmm. Not sure.

Did you add hooks into sys_read() to deal with
kernel-thread-vs-kernel-thread ordering?

> >>The implementation of the freezer that I have developed addresses these 
> >>concerns by adding an atomic count of the number of processes in
> >>critical paths. The first part of the freezer simply waits for the 
> >>number of processes in critical paths to reach zero.
> >
> >Exactly, you slowed down critical paths of kernel... This makes patch
> >big, ugly, and is bad idea.
> 
> Maybe I wasn't clear enough. When we're not suspending, all that is 
> added to the paths that are modified is:
> 
> - 9 tests, possibly resulting in refrigerator entry or immediately 
> dropping through, setting the PF_FRIDGE_WAIT flag and incrementing the 
> atomic_t at the start of a busy path.
> - 2 tests, possibly resetting the flag & decrementing the counter at the 
> end.
> - 3 tests, setting a local variable, resetting the FRIDGE_WAIT flag and
> decrementing the atomic_t when dropping locks and sleeping in kernel.
> - 10 tests, possibly resulting in refrigerator entry or immediately 
> dropping through, restoring the PF_FRIDGE_WAIT flag and reincrementing 
> the atomic_t after such sleeps.
> 
> I've been using this approach for months, and my Celeron 933 doesn't 
> feel slow at all. I've had no complaints from users either.

Well, slowness is likely to be something like 1% at
microbenchmark. (Try lm_bench, test "lat-read" or something like
that). Of course you don't feel any slowness; but you'd probably not
feel any slowness if kernel was compiled -O0 either. 

> Summary:
> - I'll try your user space first, kernel space afterwards suggestion.
> - I'll also look into benchmarking the system with and without suspend2 
> compiled in (ie with and without the hooks, since they compile away to 
> nothing without CONFIG_SOFTWARE_SUSPEND2).

Don't spend too much time benchmarking... But you might want to ask Al
Viro what he thinks about another test in sys_read ;-).
								Pavel
-- 
934a471f20d6580d5aad759bf0d97ddc


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 12:28   ` Nigel Cunningham
  2004-05-21 13:34     ` Pavel Machek
@ 2004-05-21 13:42     ` Oliver Neukum
  2004-05-21 17:08       ` Pavel Machek
  2004-05-21 22:32       ` Nigel Cunningham
  1 sibling, 2 replies; 14+ messages in thread
From: Oliver Neukum @ 2004-05-21 13:42 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: Pavel Machek, Linux Kernel Mailing List

On Friday, 21 May 2004 14:28, Nigel Cunningham wrote:
> > Kernel threads are different, and each must be handled separately,
> > maybe even with some ordering. But there's relatively small number of
> > kernel threads... 
> 
> Yes, but what order? I played with that problem for ages. Perhaps I just 
>   didn't find the right combination.

How about recording the order of creation and do it in opposite order?

	Regards
		Oliver



* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 13:42     ` Oliver Neukum
@ 2004-05-21 17:08       ` Pavel Machek
  2004-05-21 17:12         ` Oliver Neukum
  2004-05-21 22:32       ` Nigel Cunningham
  1 sibling, 1 reply; 14+ messages in thread
From: Pavel Machek @ 2004-05-21 17:08 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Nigel Cunningham, Linux Kernel Mailing List

Hi!

> > > Kernel threads are different, and each must be handled separately,
> > > maybe even with some ordering. But there's relatively small number of
> > > kernel threads... 
> > 
> > Yes, but what order? I played with that problem for ages. Perhaps I just 
> >   didn't find the right combination.
> 
> How about recording the order of creation and do it in opposite order?

Order of creation is pretty much hidden in pid, but I do not think
that will work.
								Pavel

-- 
934a471f20d6580d5aad759bf0d97ddc


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 17:08       ` Pavel Machek
@ 2004-05-21 17:12         ` Oliver Neukum
  2004-05-21 17:15           ` Pavel Machek
  0 siblings, 1 reply; 14+ messages in thread
From: Oliver Neukum @ 2004-05-21 17:12 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Nigel Cunningham, Linux Kernel Mailing List

On Friday, 21 May 2004 19:08, you wrote:
> Hi!
> 
> > > > Kernel threads are different, and each must be handled separately,
> > > > maybe even with some ordering. But there's relatively small number of
> > > > kernel threads... 
> > > 
> > > Yes, but what order? I played with that problem for ages. Perhaps I just 
> > >   didn't find the right combination.
> > 
> > How about recording the order of creation and do it in opposite order?
> 
> Order of creation is pretty much hidden in pid, but I do not think
> that will work.

Why? Build a list during kernel thread creation. It is not a hot code path.

	Regards
		Oliver


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 17:12         ` Oliver Neukum
@ 2004-05-21 17:15           ` Pavel Machek
  2004-05-21 17:20             ` Oliver Neukum
  0 siblings, 1 reply; 14+ messages in thread
From: Pavel Machek @ 2004-05-21 17:15 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Nigel Cunningham, Linux Kernel Mailing List

Hi!

> > > > > Kernel threads are different, and each must be handled separately,
> > > > > maybe even with some ordering. But there's relatively small number of
> > > > > kernel threads... 
> > > > 
> > > > Yes, but what order? I played with that problem for ages. Perhaps I just 
> > > >   didn't find the right combination.
> > > 
> > > How about recording the order of creation and do it in opposite order?
> > 
> > Order of creation is pretty much hidden in pid, but I do not think
> > that will work.
> 
> Why? Build a list during kernel thread creation. It is not a hot code path.

Maybe the order in which kernel threads were created is not the same
as the order how they need to be frozen?
								Pavel

-- 
934a471f20d6580d5aad759bf0d97ddc


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 17:15           ` Pavel Machek
@ 2004-05-21 17:20             ` Oliver Neukum
  2004-05-21 23:35               ` Herbert Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Oliver Neukum @ 2004-05-21 17:20 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Nigel Cunningham, Linux Kernel Mailing List

On Friday, 21 May 2004 19:15, Pavel Machek wrote:
> Hi!
> 
> > > > > > Kernel threads are different, and each must be handled separately,
> > > > > > maybe even with some ordering. But there's relatively small number of
> > > > > > kernel threads... 
> > > > > 
> > > > > Yes, but what order? I played with that problem for ages. Perhaps I just 
> > > > >   didn't find the right combination.
> > > > 
> > > > How about recording the order of creation and do it in opposite order?
> > > 
> > > Order of creation is pretty much hidden in pid, but I do not think
> > > that will work.
> > 
> > Why? Build a list during kernel thread creation. It is not a hot code path.
> 
> Maybe the order in which kernel threads were created is not the same
> as the order how they need to be frozen?

Possible, but unlikely. If there can be a deadlock if they are frozen in
reverse order, the same problem existed during creation and needed
to be specially handled.

	Regards
		Oliver


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 13:42     ` Oliver Neukum
  2004-05-21 17:08       ` Pavel Machek
@ 2004-05-21 22:32       ` Nigel Cunningham
  2004-05-22 14:11         ` Bill Davidsen
  1 sibling, 1 reply; 14+ messages in thread
From: Nigel Cunningham @ 2004-05-21 22:32 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Pavel Machek, Linux Kernel Mailing List

Hi.

Oliver Neukum wrote:
> On Friday, 21 May 2004 14:28, Nigel Cunningham wrote:
>>Yes, but what order? I played with that problem for ages. Perhaps I just 
>>  didn't find the right combination.
> How about recording the order of creation and do it in opposite order?

We could add a field to the process struct to record that. (Since PIDs 
can wrap, they can't be relied upon for this).
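
(Editorial sketch, not a patch.) Recording creation order in a dedicated field and walking it newest-first might look like this. The `create_seq` field, `record_creation()` and `order_for_freezing()` are hypothetical names, and whether reverse creation order is actually a safe freezing order is disputed elsewhere in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct kthread_info {
    int      pid;          /* can wrap, so it is not used for ordering */
    uint64_t create_seq;   /* hypothetical field, assigned once at creation */
};

/* Monotonic counter; 64 bits will not wrap in practice. */
static uint64_t next_seq;

void record_creation(struct kthread_info *k, int pid)
{
    assert(k != NULL);
    k->pid = pid;
    k->create_seq = next_seq++;
}

/* qsort comparator: larger sequence number (newer thread) sorts first. */
static int newest_first(const void *a, const void *b)
{
    const struct kthread_info *x = a;
    const struct kthread_info *y = b;

    if (x->create_seq == y->create_seq)
        return 0;
    return x->create_seq > y->create_seq ? -1 : 1;
}

/* Arrange threads in the order they would be frozen: newest first. */
void order_for_freezing(struct kthread_info *v, size_t n)
{
    qsort(v, n, sizeof *v, newest_first);
}
```

In the kernel the counter would also need protection against concurrent forks, which is the race mentioned in the next paragraph.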

One potential problem is that we'd race with processes that were 
forking, but that's a problem with the existing implementation anyway.

I can see that the only way I'm going to convince people that we need 
the method I settled on is by showing the deficiencies of the current one :<

Nigel


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 17:20             ` Oliver Neukum
@ 2004-05-21 23:35               ` Herbert Xu
  2004-05-22  2:47                 ` Nigel Cunningham
  0 siblings, 1 reply; 14+ messages in thread
From: Herbert Xu @ 2004-05-21 23:35 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: Pavel Machek, Nigel Cunningham, linux-kernel

Oliver Neukum <oliver@neukum.org> wrote:
> 
> Possible, but unlikely. If there can be a deadlock if they are frozen in
> reverse order, the same problem existed during creation and needed
> to be specially handled.

So exactly which kernel threads will dead lock when frozen in the wrong
order? So far I've only seen user process vs. kernel thread examples.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email:  Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 13:34     ` Pavel Machek
@ 2004-05-22  2:43       ` Nigel Cunningham
  0 siblings, 0 replies; 14+ messages in thread
From: Nigel Cunningham @ 2004-05-22  2:43 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux Kernel Mailing List

Hi.

Pavel Machek wrote:
>>No... this is what you already know, just described differently. You 
>>mentioned in your documentation that suspend2 overcomes the half of 
>>memory limitation by saving the image in two parts: the second part is 
>>LRU (unless I have my terminology confused: I'm talking about pages on 
> 
> 
> Yes, I just did not realize that it means changes for freezer.

It means that we need to be more careful to ensure that all processes
are frozen than we do with your implementation: we can't have
processes forking or exiting because a timer fires while we're writing 
the first part of the image. (I saw that happen with bad effects).

>>Yes, but we're not just talking about user threads. We could 
>>differentiate kernel threads and user threads (presumably using another 
>>PF_ flag?) and attempt to freeze the user threads first.
> 
> There should be some other way to see kernel threads... their mm is
> init_mm or something like that.

Yes.

> Did you add hooks into sys_read() to deal with
> kernel-thread-vs-kernel-thread ordering?

With the hooks, I don't have to worry about ordering at all. Once the 
atomic_t comes down to zero, every process has either entered the 
refrigerator because it hit a start/restart hook or wasn't doing 
anything to start with. I don't have to care about ordering because 
anything that does start activity while I'm signalling hits a hook and 
gets refrigerated anyway, even if it's not in response to the pseudo-signal.

> Well, slowness is likely to be something like 1% at
> microbenchmark. (Try lm_bench, test "lat-read" or something like
> that). Of course you don't feel any slowness; but you'd probably not
> feel any slowness if kernel was compiled -O0 either. 

:>. I'll put benchmarking on my todo list...

>>Summary:
>>- I'll try your user space first, kernel space afterwards suggestion.
>>- I'll also look into benchmarking the system with and without suspend2 
>>compiled in (ie with and without the hooks, since they compile away to 
>>nothing without CONFIG_SOFTWARE_SUSPEND2).
> 
> 
> Don't spend too much time benchmarking... But you might want to ask Al
> Viro what he thinks about another test in sys_read ;-).

... and won't spend too long on it :>. I think I can guess.

Nigel



* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 23:35               ` Herbert Xu
@ 2004-05-22  2:47                 ` Nigel Cunningham
  0 siblings, 0 replies; 14+ messages in thread
From: Nigel Cunningham @ 2004-05-22  2:47 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Oliver Neukum, Pavel Machek, linux-kernel

Hi.

Herbert Xu wrote:
> So exactly which kernel threads will dead lock when frozen in the wrong
> order? So far I've only seen user process vs. kernel thread examples.

It's been so long since I've seen it happen that I have to confess I've 
forgotten. I'll take a look in the list archives.

Nigel


* Re: Suspend2 merge preparation: Rationale behind the freezer changes.
  2004-05-21 22:32       ` Nigel Cunningham
@ 2004-05-22 14:11         ` Bill Davidsen
  0 siblings, 0 replies; 14+ messages in thread
From: Bill Davidsen @ 2004-05-22 14:11 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: Oliver Neukum, Pavel Machek, Linux Kernel Mailing List

Nigel Cunningham wrote:
> Hi.
> 
> Oliver Neukum wrote:
> 
>> On Friday, 21 May 2004 14:28, Nigel Cunningham wrote:
>>
>>> Yes, but what order? I played with that problem for ages. Perhaps I 
>>> just  didn't find the right combination.
>>
>> How about recording the order of creation and do it in opposite order?
> 
> 
> We could add a field to the process struct to record that. (Since PIDs 
> can wrap, they can't be relied upon for this).

I would never suggest keeping a process creation date for something 
trivial, but since you seem to be proposing one for something major, the 
process creation date could be available in the readdir of the /proc 
directory. Assuming you intend to keep date at all and not just some 
counter, of course.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

