LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
@ 2008-10-27 16:49 Gerald Schaefer
2008-10-27 17:17 ` Gerald Schaefer
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-27 16:49 UTC (permalink / raw)
To: akpm
Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens,
kamezawa.hiroyu, y-goto
From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
My last bugfix here (adding zone->lock) introduced a new problem: Using
pfn_to_page(pfn) to get the zone after the for() loop is wrong. pfn then
points to the first pfn after end_pfn, which may be in a different zone
or not present at all. This may lead to an addressing exception in
page_zone() or spin_lock_irqsave().
Using the last valid page that was found inside the for() loop, instead
of pfn_to_page(), should fix this.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
---
mm/page_isolation.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
Index: linux-2.6/mm/page_isolation.c
===================================================================
--- linux-2.6.orig/mm/page_isolation.c
+++ linux-2.6/mm/page_isolation.c
@@ -115,7 +115,7 @@ __test_page_isolated_in_pageblock(unsign
int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
{
unsigned long pfn, flags;
- struct page *page;
+ struct page *page = NULL;
struct zone *zone;
int ret;
@@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
break;
}
- if (pfn < end_pfn)
+ if ((pfn < end_pfn) || !page)
return -EBUSY;
/* Check all pages are free or Marked as ISOLATED */
- zone = page_zone(pfn_to_page(pfn));
+ zone = page_zone(page);
spin_lock_irqsave(&zone->lock, flags);
ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn);
spin_unlock_irqrestore(&zone->lock, flags);
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
2008-10-27 16:49 [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated() Gerald Schaefer
@ 2008-10-27 17:17 ` Gerald Schaefer
2008-10-27 17:19 ` Gerald Schaefer
2008-10-27 17:25 ` Dave Hansen
2 siblings, 0 replies; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-27 17:17 UTC (permalink / raw)
To: akpm
Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens,
kamezawa.hiroyu, y-goto
Ouch, stupid Thunderbird broke the patch (or stupid me used Thunderbird...),
will send a new one.
Gerald
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
2008-10-27 16:49 [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated() Gerald Schaefer
2008-10-27 17:17 ` Gerald Schaefer
@ 2008-10-27 17:19 ` Gerald Schaefer
2008-10-27 17:25 ` Dave Hansen
2 siblings, 0 replies; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-27 17:19 UTC (permalink / raw)
To: akpm
Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens,
kamezawa.hiroyu, y-goto
From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
My last bugfix here (adding zone->lock) introduced a new problem: Using
pfn_to_page(pfn) to get the zone after the for() loop is wrong. pfn then
points to the first pfn after end_pfn, which may be in a different zone
or not present at all. This may lead to an addressing exception in
page_zone() or spin_lock_irqsave().
Using the last valid page that was found inside the for() loop, instead
of pfn_to_page(), should fix this.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
---
mm/page_isolation.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
Index: linux-2.6/mm/page_isolation.c
===================================================================
--- linux-2.6.orig/mm/page_isolation.c
+++ linux-2.6/mm/page_isolation.c
@@ -115,7 +115,7 @@ __test_page_isolated_in_pageblock(unsign
int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
{
unsigned long pfn, flags;
- struct page *page;
+ struct page *page = NULL;
struct zone *zone;
int ret;
@@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
break;
}
- if (pfn < end_pfn)
+ if ((pfn < end_pfn) || !page)
return -EBUSY;
/* Check all pages are free or Marked as ISOLATED */
- zone = page_zone(pfn_to_page(pfn));
+ zone = page_zone(page);
spin_lock_irqsave(&zone->lock, flags);
ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn);
spin_unlock_irqrestore(&zone->lock, flags);
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
2008-10-27 16:49 [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated() Gerald Schaefer
2008-10-27 17:17 ` Gerald Schaefer
2008-10-27 17:19 ` Gerald Schaefer
@ 2008-10-27 17:25 ` Dave Hansen
2008-10-27 17:59 ` Gerald Schaefer
2 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2008-10-27 17:25 UTC (permalink / raw)
To: gerald.schaefer
Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens,
kamezawa.hiroyu, y-goto
On Mon, 2008-10-27 at 17:49 +0100, Gerald Schaefer wrote:
> My last bugfix here (adding zone->lock) introduced a new problem: Using
> pfn_to_page(pfn) to get the zone after the for() loop is wrong. pfn then
> points to the first pfn after end_pfn, which may be in a different zone
> or not present at all. This may lead to an addressing exception in
> page_zone() or spin_lock_irqsave().
I'm not sure I follow. Let's look at the code, pre-patch:
> for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
> page = __first_valid_page(pfn, pageblock_nr_pages);
> if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> break;
> }
> if (pfn < end_pfn)
> return -EBUSY;
We have two ways out of the loop:
1. 'page' is valid, and not isolated, so we did a 'break'
2. No page hit (1) in the range and we broke out of the loop because
of the for() condition: (pfn < end_pfn).
So, when the condition happens that you mentioned in your changelog
above: "pfn then points to the first pfn after end_pfn", we jump out at
the 'return -EBUSY;'. We don't ever do pfn_to_page() in that case since
we've returned befoer.
Either 'page' is valid *OR* you return -EBUSY. I don't think you need
to check both.
> Using the last valid page that was found inside the for() loop, instead
> of pfn_to_page(), should fix this.
> @@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
> if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> break;
> }
> - if (pfn < end_pfn)
> + if ((pfn < end_pfn) || !page)
> return -EBUSY;
> /* Check all pages are free or Marked as ISOLATED */
> - zone = page_zone(pfn_to_page(pfn));
> + zone = page_zone(page);
I think this patch fixes the bug, but for reasons other than what you
said. :)
The trouble here is that the 'pfn' could have been in the middle of a
hole somewhere, which __first_valid_page() worked around. Since you
saved off the result of __first_valid_page(), it ends up being OK with
your patch.
Instead of using pfn_to_page() you could also have just called
__first_valid_page() again. But, that would have duplicated a bit of
work, even though not much in practice because the caches are still hot.
Technically, you wouldn't even need to check the return from
__first_valid_page() since you know it has a valid result because you
made the exact same call a moment before.
Anyway, can you remove the !page check, fix up the changelog and resend?
-- Dave
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
2008-10-27 17:25 ` Dave Hansen
@ 2008-10-27 17:59 ` Gerald Schaefer
2008-10-28 0:32 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-27 17:59 UTC (permalink / raw)
To: Dave Hansen
Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens,
kamezawa.hiroyu, y-goto
On Mon, 2008-10-27 at 10:25 -0700, Dave Hansen wrote:
> I'm not sure I follow. Let's look at the code, pre-patch:
>
> > for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
> > page = __first_valid_page(pfn, pageblock_nr_pages);
> > if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> > break;
> > }
> > if (pfn < end_pfn)
> > return -EBUSY;
>
> We have two ways out of the loop:
> 1. 'page' is valid, and not isolated, so we did a 'break'
> 2. No page hit (1) in the range and we broke out of the loop because
> of the for() condition: (pfn < end_pfn).
>
> So, when the condition happens that you mentioned in your changelog
> above: "pfn then points to the first pfn after end_pfn", we jump out at
> the 'return -EBUSY;'. We don't ever do pfn_to_page() in that case since
> we've returned befoer.
>
> Either 'page' is valid *OR* you return -EBUSY. I don't think you need
> to check both.
We only return -EBUSY if pfn < end_pfn, but after completing the loop w/o
a break pfn will be > end_pfn. Also, the last call to __first_valid_page()
may return NULL w/o causing a break, so page may also be invalid after the
loop.
> > Using the last valid page that was found inside the for() loop, instead
> > of pfn_to_page(), should fix this.
> > @@ -130,10 +130,10 @@ int test_pages_isolated(unsigned long st
> > if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> > break;
> > }
> > - if (pfn < end_pfn)
> > + if ((pfn < end_pfn) || !page)
> > return -EBUSY;
> > /* Check all pages are free or Marked as ISOLATED */
> > - zone = page_zone(pfn_to_page(pfn));
> > + zone = page_zone(page);
>
> I think this patch fixes the bug, but for reasons other than what you
> said. :)
>
> The trouble here is that the 'pfn' could have been in the middle of a
> hole somewhere, which __first_valid_page() worked around. Since you
> saved off the result of __first_valid_page(), it ends up being OK with
> your patch.
I think pfn will always be > end_pfn if we complete the loop. And breaking
out of the loop earlier will always return -EBUSY.
> Instead of using pfn_to_page() you could also have just called
> __first_valid_page() again. But, that would have duplicated a bit of
> work, even though not much in practice because the caches are still hot.
>
> Technically, you wouldn't even need to check the return from
> __first_valid_page() since you know it has a valid result because you
> made the exact same call a moment before.
>
> Anyway, can you remove the !page check, fix up the changelog and resend?
Calling __first_valid_page() again might be a good idea. Thinking about it
now, I guess there is still a problem left with my patch, but for reasons
other than what you said :) If the loop is completed with page == NULL,
we will return -EBUSY with the new patch. But there may have been valid
pages before, and only some memory hole at the end. In this case, returning
-EBUSY would probably be wrong.
Kamezawa, this loop/function was added by you, what do you think?
--
Thanks,
Gerald
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
2008-10-27 17:59 ` Gerald Schaefer
@ 2008-10-28 0:32 ` KAMEZAWA Hiroyuki
2008-10-28 13:00 ` Gerald Schaefer
0 siblings, 1 reply; 10+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-10-28 0:32 UTC (permalink / raw)
To: Gerald Schaefer
Cc: Dave Hansen, akpm, linux-kernel, linux-mm, schwidefsky,
heiko.carstens, y-goto
On Mon, 27 Oct 2008 18:59:29 +0100
Gerald Schaefer <gerald.schaefer@de.ibm.com> wrote:
> Instead of using pfn_to_page() you could also have just called
> > __first_valid_page() again. But, that would have duplicated a bit of
> > work, even though not much in practice because the caches are still hot.
> >
> > Technically, you wouldn't even need to check the return from
> > __first_valid_page() since you know it has a valid result because you
> > made the exact same call a moment before.
> >
> > Anyway, can you remove the !page check, fix up the changelog and resend?
>
> Calling __first_valid_page() again might be a good idea. Thinking about it
> now, I guess there is still a problem left with my patch, but for reasons
> other than what you said :) If the loop is completed with page == NULL,
> we will return -EBUSY with the new patch. But there may have been valid
> pages before, and only some memory hole at the end. In this case, returning
> -EBUSY would probably be wrong.
>
> Kamezawa, this loop/function was added by you, what do you think?
>
I think there is a bug, as you wrote.
But
- "pfn" and "end_pfn" (and pfn in the middle of them) can be in different zone on strange machine.
Now: test_pages_isolated() is called in following sequence.
check_page_isolated()
walk_memory_resource() # read resource range and get start/end of pfn
-> chcek_page_isolated_cb()
-> test_page_isolated().
I think all pages within [start, end) passed to test_pages_isolated() should be in the same zone.
please change this to
check_page_isolated()
walk_memory_resource()
-> check_page_isolated_cb()
-> walk_page_range_in_same_zone() # get page range in the same zone.
-> test_page_isolated().
Could you try ?
Thanks,
-Kame
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
2008-10-28 0:32 ` KAMEZAWA Hiroyuki
@ 2008-10-28 13:00 ` Gerald Schaefer
0 siblings, 0 replies; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-28 13:00 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Dave Hansen, akpm, linux-kernel, linux-mm, schwidefsky,
heiko.carstens, y-goto
On Tue, 2008-10-28 at 09:32 +0900, KAMEZAWA Hiroyuki wrote:
> But
> - "pfn" and "end_pfn" (and pfn in the middle of them) can be in different zone on strange machine.
>
> Now: test_pages_isolated() is called in following sequence.
>
> check_page_isolated()
> walk_memory_resource() # read resource range and get start/end of pfn
> -> chcek_page_isolated_cb()
> -> test_page_isolated().
>
> I think all pages within [start, end) passed to test_pages_isolated() should be in the same zone.
>
> please change this to
> check_page_isolated()
> walk_memory_resource()
> -> check_page_isolated_cb()
> -> walk_page_range_in_same_zone() # get page range in the same zone.
> -> test_page_isolated().
>
> Could you try ?
There is already a "same zone" check at the beginning of offline_pages():
> if (!test_pages_in_a_zone(start_pfn, end_pfn))
> return -EINVAL;
So we should be safe here, the only problem that I see is that my
zone->lock patch in test_pages_isolated() is broken. As explained,
the pfn used in my page_zone(pfn_to_page(pfn)) is >= end_pfn.
I'll send a new patch to fix this, using __first_valid_page() again,
as described in my reply to Daves mail. The only other solution that
I see would be to remember the first/last !NULL page that was found
inside the for() loop. Not sure which is better, but I think I like
the first one more. Any other ideas?
Thanks,
Gerald
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
2008-10-29 14:25 Gerald Schaefer
2008-10-29 18:00 ` Nathan Fontenot
@ 2008-10-30 0:09 ` KAMEZAWA Hiroyuki
1 sibling, 0 replies; 10+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-10-30 0:09 UTC (permalink / raw)
To: gerald.schaefer
Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens, y-goto, dave
On Wed, 29 Oct 2008 15:25:30 +0100
Gerald Schaefer <gerald.schaefer@de.ibm.com> wrote:
> From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
>
> My last bugfix here (adding zone->lock) introduced a new problem: Using
> page_zone(pfn_to_page(pfn)) to get the zone after the for() loop is wrong.
> pfn will then be >= end_pfn, which may be in a different zone or not
> present at all. This may lead to an addressing exception in page_zone()
> or spin_lock_irqsave().
>
> Now I use __first_valid_page() again after the loop to find a valid page
> for page_zone().
>
> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
>
Thanks.
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> mm/page_isolation.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/mm/page_isolation.c
> ===================================================================
> --- linux-2.6.orig/mm/page_isolation.c
> +++ linux-2.6/mm/page_isolation.c
> @@ -130,10 +130,11 @@ int test_pages_isolated(unsigned long st
> if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
> break;
> }
> - if (pfn < end_pfn)
> + page = __first_valid_page(start_pfn, end_pfn - start_pfn);
> + if ((pfn < end_pfn) || !page)
> return -EBUSY;
> /* Check all pages are free or Marked as ISOLATED */
> - zone = page_zone(pfn_to_page(pfn));
> + zone = page_zone(page);
> spin_lock_irqsave(&zone->lock, flags);
> ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn);
> spin_unlock_irqrestore(&zone->lock, flags);
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
2008-10-29 14:25 Gerald Schaefer
@ 2008-10-29 18:00 ` Nathan Fontenot
2008-10-30 0:09 ` KAMEZAWA Hiroyuki
1 sibling, 0 replies; 10+ messages in thread
From: Nathan Fontenot @ 2008-10-29 18:00 UTC (permalink / raw)
To: gerald.schaefer
Cc: akpm, linux-kernel, linux-mm, schwidefsky, heiko.carstens,
kamezawa.hiroyu, y-goto, dave
Looks like you beat me to it, and with a nicer fix too.
Gerald Schaefer wrote:
> From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
>
> My last bugfix here (adding zone->lock) introduced a new problem: Using
> page_zone(pfn_to_page(pfn)) to get the zone after the for() loop is wrong.
> pfn will then be >= end_pfn, which may be in a different zone or not
> present at all. This may lead to an addressing exception in page_zone()
> or spin_lock_irqsave().
>
> Now I use __first_valid_page() again after the loop to find a valid page
> for page_zone().
>
> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Acked-by: Nathan Fontenot <nfont@austin.ibm.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated()
@ 2008-10-29 14:25 Gerald Schaefer
2008-10-29 18:00 ` Nathan Fontenot
2008-10-30 0:09 ` KAMEZAWA Hiroyuki
0 siblings, 2 replies; 10+ messages in thread
From: Gerald Schaefer @ 2008-10-29 14:25 UTC (permalink / raw)
To: akpm
Cc: linux-kernel, linux-mm, schwidefsky, heiko.carstens,
kamezawa.hiroyu, y-goto, dave
From: Gerald Schaefer <gerald.schaefer@de.ibm.com>
My last bugfix here (adding zone->lock) introduced a new problem: Using
page_zone(pfn_to_page(pfn)) to get the zone after the for() loop is wrong.
pfn will then be >= end_pfn, which may be in a different zone or not
present at all. This may lead to an addressing exception in page_zone()
or spin_lock_irqsave().
Now I use __first_valid_page() again after the loop to find a valid page
for page_zone().
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
---
mm/page_isolation.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
Index: linux-2.6/mm/page_isolation.c
===================================================================
--- linux-2.6.orig/mm/page_isolation.c
+++ linux-2.6/mm/page_isolation.c
@@ -130,10 +130,11 @@ int test_pages_isolated(unsigned long st
if (page && get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
break;
}
- if (pfn < end_pfn)
+ page = __first_valid_page(start_pfn, end_pfn - start_pfn);
+ if ((pfn < end_pfn) || !page)
return -EBUSY;
/* Check all pages are free or Marked as ISOLATED */
- zone = page_zone(pfn_to_page(pfn));
+ zone = page_zone(page);
spin_lock_irqsave(&zone->lock, flags);
ret = __test_page_isolated_in_pageblock(start_pfn, end_pfn);
spin_unlock_irqrestore(&zone->lock, flags);
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2008-10-30 0:10 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-27 16:49 [PATCH] memory hotplug: fix page_zone() calculation in test_pages_isolated() Gerald Schaefer
2008-10-27 17:17 ` Gerald Schaefer
2008-10-27 17:19 ` Gerald Schaefer
2008-10-27 17:25 ` Dave Hansen
2008-10-27 17:59 ` Gerald Schaefer
2008-10-28 0:32 ` KAMEZAWA Hiroyuki
2008-10-28 13:00 ` Gerald Schaefer
2008-10-29 14:25 Gerald Schaefer
2008-10-29 18:00 ` Nathan Fontenot
2008-10-30 0:09 ` KAMEZAWA Hiroyuki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).