LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] Hibernation: Fix mark_nosave_pages()
@ 2008-03-11 23:34 Rafael J. Wysocki
2008-03-12 3:21 ` Len Brown
2008-03-13 7:46 ` Pavel Machek
0 siblings, 2 replies; 3+ messages in thread
From: Rafael J. Wysocki @ 2008-03-11 23:34 UTC (permalink / raw)
To: Len Brown; +Cc: Andrew Morton, LKML, Pavel Machek, pm list
Hi Len,
The following patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=9966 .
Although the problem has been there for quite some time (since before 2.6.22,
apparently), it would be good to have in 2.6.25, because it may fix booting
problems on the affected systems.
Please note that it doesn't modify the code behavior on the systems which are
not affected by bug #9966 .
Thanks,
Rafael
---
From: Rafael J. Wysocki <rjw@sisk.pl>
There is a problem in the hibernation code that triggers on some NUMA
systems on which pfn_valid() returns 'true' for some PFNs that don't
belong to any zone. Namely, there is a BUG_ON() in
memory_bm_find_bit() that triggers for PFNs not belonging to any
zone and passing the pfn_valid() test. On the affected systems it
triggers when we mark PFNs reported by the platform as not saveable,
because the PFNs in question belong to a region mapped directly using
iorepam() (i.e. the ACPI data area) and they pass the pfn_valid()
test.
Modify memory_bm_find_bit() so that it returns an error if given PFN
doesn't belong to any zone instead of crashing the kernel and ignore
the result returned by it in mark_nosave_pages(), while marking the
"nosave" memory regions.
This doesn't affect the hibernation functionality, as we won't touch
the PFNs in question anyway.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
kernel/power/snapshot.c | 41 ++++++++++++++++++++++++++++++++++-------
1 file changed, 34 insertions(+), 7 deletions(-)
Index: linux-2.6/kernel/power/snapshot.c
===================================================================
--- linux-2.6.orig/kernel/power/snapshot.c
+++ linux-2.6/kernel/power/snapshot.c
@@ -447,7 +447,7 @@ static void memory_bm_free(struct memory
* of @bm->cur_zone_bm are updated.
*/
-static void memory_bm_find_bit(struct memory_bitmap *bm, unsigned long pfn,
+static int memory_bm_find_bit(struct memory_bitmap *bm, unsigned long pfn,
void **addr, unsigned int *bit_nr)
{
struct zone_bitmap *zone_bm;
@@ -461,7 +461,8 @@ static void memory_bm_find_bit(struct me
while (pfn < zone_bm->start_pfn || pfn >= zone_bm->end_pfn) {
zone_bm = zone_bm->next;
- BUG_ON(!zone_bm);
+ if (!zone_bm)
+ return -EFAULT;
}
bm->cur.zone_bm = zone_bm;
}
@@ -479,23 +480,40 @@ static void memory_bm_find_bit(struct me
pfn -= bb->start_pfn;
*bit_nr = pfn % BM_BITS_PER_CHUNK;
*addr = bb->data + pfn / BM_BITS_PER_CHUNK;
+ return 0;
}
static void memory_bm_set_bit(struct memory_bitmap *bm, unsigned long pfn)
{
void *addr;
unsigned int bit;
+ int error;
- memory_bm_find_bit(bm, pfn, &addr, &bit);
+ error = memory_bm_find_bit(bm, pfn, &addr, &bit);
+ BUG_ON(error);
set_bit(bit, addr);
}
+static int mem_bm_set_bit_check(struct memory_bitmap *bm, unsigned long pfn)
+{
+ void *addr;
+ unsigned int bit;
+ int error;
+
+ error = memory_bm_find_bit(bm, pfn, &addr, &bit);
+ if (!error)
+ set_bit(bit, addr);
+ return error;
+}
+
static void memory_bm_clear_bit(struct memory_bitmap *bm, unsigned long pfn)
{
void *addr;
unsigned int bit;
+ int error;
- memory_bm_find_bit(bm, pfn, &addr, &bit);
+ error = memory_bm_find_bit(bm, pfn, &addr, &bit);
+ BUG_ON(error);
clear_bit(bit, addr);
}
@@ -503,8 +521,10 @@ static int memory_bm_test_bit(struct mem
{
void *addr;
unsigned int bit;
+ int error;
- memory_bm_find_bit(bm, pfn, &addr, &bit);
+ error = memory_bm_find_bit(bm, pfn, &addr, &bit);
+ BUG_ON(error);
return test_bit(bit, addr);
}
@@ -709,8 +729,15 @@ static void mark_nosave_pages(struct mem
region->end_pfn << PAGE_SHIFT);
for (pfn = region->start_pfn; pfn < region->end_pfn; pfn++)
- if (pfn_valid(pfn))
- memory_bm_set_bit(bm, pfn);
+ if (pfn_valid(pfn)) {
+ /*
+ * It is safe to ignore the result of
+ * mem_bm_set_bit_check() here, since we won't
+ * touch the PFNs for which the error is
+ * returned anyway.
+ */
+ mem_bm_set_bit_check(bm, pfn);
+ }
}
}
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Hibernation: Fix mark_nosave_pages()
2008-03-11 23:34 [PATCH] Hibernation: Fix mark_nosave_pages() Rafael J. Wysocki
@ 2008-03-12 3:21 ` Len Brown
2008-03-13 7:46 ` Pavel Machek
1 sibling, 0 replies; 3+ messages in thread
From: Len Brown @ 2008-03-12 3:21 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Andrew Morton, LKML, Pavel Machek, pm list
applied to acpi test tree.
thanks,
-len
On Tuesday 11 March 2008, Rafael J. Wysocki wrote:
> Hi Len,
>
> The following patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=9966 .
>
> Although the problem has been there for quite some time (since before 2.6.22,
> apparently), it would be good to have in 2.6.25, because it may fix booting
> problems on the affected systems.
>
> Please note that it doesn't modify the code behavior on the systems which are
> not affected by bug #9966 .
>
> Thanks,
> Rafael
>
>
> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
>
> There is a problem in the hibernation code that triggers on some NUMA
> systems on which pfn_valid() returns 'true' for some PFNs that don't
> belong to any zone. Namely, there is a BUG_ON() in
> memory_bm_find_bit() that triggers for PFNs not belonging to any
> zone and passing the pfn_valid() test. On the affected systems it
> triggers when we mark PFNs reported by the platform as not saveable,
> because the PFNs in question belong to a region mapped directly using
> iorepam() (i.e. the ACPI data area) and they pass the pfn_valid()
> test.
>
> Modify memory_bm_find_bit() so that it returns an error if given PFN
> doesn't belong to any zone instead of crashing the kernel and ignore
> the result returned by it in mark_nosave_pages(), while marking the
> "nosave" memory regions.
>
> This doesn't affect the hibernation functionality, as we won't touch
> the PFNs in question anyway.
>
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
> kernel/power/snapshot.c | 41 ++++++++++++++++++++++++++++++++++-------
> 1 file changed, 34 insertions(+), 7 deletions(-)
>
> Index: linux-2.6/kernel/power/snapshot.c
> ===================================================================
> --- linux-2.6.orig/kernel/power/snapshot.c
> +++ linux-2.6/kernel/power/snapshot.c
> @@ -447,7 +447,7 @@ static void memory_bm_free(struct memory
> * of @bm->cur_zone_bm are updated.
> */
>
> -static void memory_bm_find_bit(struct memory_bitmap *bm, unsigned long pfn,
> +static int memory_bm_find_bit(struct memory_bitmap *bm, unsigned long pfn,
> void **addr, unsigned int *bit_nr)
> {
> struct zone_bitmap *zone_bm;
> @@ -461,7 +461,8 @@ static void memory_bm_find_bit(struct me
> while (pfn < zone_bm->start_pfn || pfn >= zone_bm->end_pfn) {
> zone_bm = zone_bm->next;
>
> - BUG_ON(!zone_bm);
> + if (!zone_bm)
> + return -EFAULT;
> }
> bm->cur.zone_bm = zone_bm;
> }
> @@ -479,23 +480,40 @@ static void memory_bm_find_bit(struct me
> pfn -= bb->start_pfn;
> *bit_nr = pfn % BM_BITS_PER_CHUNK;
> *addr = bb->data + pfn / BM_BITS_PER_CHUNK;
> + return 0;
> }
>
> static void memory_bm_set_bit(struct memory_bitmap *bm, unsigned long pfn)
> {
> void *addr;
> unsigned int bit;
> + int error;
>
> - memory_bm_find_bit(bm, pfn, &addr, &bit);
> + error = memory_bm_find_bit(bm, pfn, &addr, &bit);
> + BUG_ON(error);
> set_bit(bit, addr);
> }
>
> +static int mem_bm_set_bit_check(struct memory_bitmap *bm, unsigned long pfn)
> +{
> + void *addr;
> + unsigned int bit;
> + int error;
> +
> + error = memory_bm_find_bit(bm, pfn, &addr, &bit);
> + if (!error)
> + set_bit(bit, addr);
> + return error;
> +}
> +
> static void memory_bm_clear_bit(struct memory_bitmap *bm, unsigned long pfn)
> {
> void *addr;
> unsigned int bit;
> + int error;
>
> - memory_bm_find_bit(bm, pfn, &addr, &bit);
> + error = memory_bm_find_bit(bm, pfn, &addr, &bit);
> + BUG_ON(error);
> clear_bit(bit, addr);
> }
>
> @@ -503,8 +521,10 @@ static int memory_bm_test_bit(struct mem
> {
> void *addr;
> unsigned int bit;
> + int error;
>
> - memory_bm_find_bit(bm, pfn, &addr, &bit);
> + error = memory_bm_find_bit(bm, pfn, &addr, &bit);
> + BUG_ON(error);
> return test_bit(bit, addr);
> }
>
> @@ -709,8 +729,15 @@ static void mark_nosave_pages(struct mem
> region->end_pfn << PAGE_SHIFT);
>
> for (pfn = region->start_pfn; pfn < region->end_pfn; pfn++)
> - if (pfn_valid(pfn))
> - memory_bm_set_bit(bm, pfn);
> + if (pfn_valid(pfn)) {
> + /*
> + * It is safe to ignore the result of
> + * mem_bm_set_bit_check() here, since we won't
> + * touch the PFNs for which the error is
> + * returned anyway.
> + */
> + mem_bm_set_bit_check(bm, pfn);
> + }
> }
> }
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Hibernation: Fix mark_nosave_pages()
2008-03-11 23:34 [PATCH] Hibernation: Fix mark_nosave_pages() Rafael J. Wysocki
2008-03-12 3:21 ` Len Brown
@ 2008-03-13 7:46 ` Pavel Machek
1 sibling, 0 replies; 3+ messages in thread
From: Pavel Machek @ 2008-03-13 7:46 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Len Brown, Andrew Morton, LKML, pm list
On Wed 2008-03-12 00:34:57, Rafael J. Wysocki wrote:
> Hi Len,
>
> The following patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=9966 .
>
> Although the problem has been there for quite some time (since before 2.6.22,
> apparently), it would be good to have in 2.6.25, because it may fix booting
> problems on the affected systems.
>
> Please note that it doesn't modify the code behavior on the systems which are
> not affected by bug #9966 .
>
> Thanks,
> Rafael
>
>
> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
>
> There is a problem in the hibernation code that triggers on some NUMA
> systems on which pfn_valid() returns 'true' for some PFNs that don't
> belong to any zone. Namely, there is a BUG_ON() in
> memory_bm_find_bit() that triggers for PFNs not belonging to any
> zone and passing the pfn_valid() test. On the affected systems it
> triggers when we mark PFNs reported by the platform as not saveable,
> because the PFNs in question belong to a region mapped directly using
> iorepam() (i.e. the ACPI data area) and they pass the pfn_valid()
> test.
>
> Modify memory_bm_find_bit() so that it returns an error if given PFN
> doesn't belong to any zone instead of crashing the kernel and ignore
> the result returned by it in mark_nosave_pages(), while marking the
> "nosave" memory regions.
>
> This doesn't affect the hibernation functionality, as we won't touch
> the PFNs in question anyway.
>
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
ACK. I hate how it adds more code to swsusp, but I could not find a
simple way to avoid that.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2008-03-13 10:11 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-11 23:34 [PATCH] Hibernation: Fix mark_nosave_pages() Rafael J. Wysocki
2008-03-12 3:21 ` Len Brown
2008-03-13 7:46 ` Pavel Machek
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).