LKML Archive on lore.kernel.org
* 2.6.25-rc2-mm1 - boot hangs on ia64
@ 2008-02-25 15:56 Lee Schermerhorn
2008-02-26 11:25 ` KOSAKI Motohiro
0 siblings, 1 reply; 12+ messages in thread
From: Lee Schermerhorn @ 2008-02-25 15:56 UTC (permalink / raw)
To: linux-ia64, linux-kernel, Andrew Morton, Tony Luck, Ingo Molnar
Cc: Bob Picco, Eric Whitney
2.6.25-rc2-mm1 is hanging early in boot on my HP ia64 numa platform. I saw
the "Strange hang on ia64 with CONFIG_PRINTK_TIME=y" thread on lkml:
http://marc.info/?t=120288396800001&r=1&w=4
However, my config does not include PRINTK_TIME=y. In fact, hang occurs
with ia64 defconfig as well--right after the "Loading...initrd...done"
message. 2.6.25-rc2 boots OK.
Bisecting the broken-out series appears to indict 'git-sched.patch'. I
went ahead and added Ingo's patch, discussed in the "strange hang"
thread, even tho' I hadn't enabled printk timestamps. No effect.
Anyone else seeing this?
Regards,
Lee
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-25 15:56 2.6.25-rc2-mm1 - boot hangs on ia64 Lee Schermerhorn
@ 2008-02-26 11:25 ` KOSAKI Motohiro
2008-02-26 11:31 ` Ingo Molnar
0 siblings, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2008-02-26 11:25 UTC (permalink / raw)
To: Lee Schermerhorn
Cc: kosaki.motohiro, linux-ia64, linux-kernel, Andrew Morton,
Tony Luck, Ingo Molnar, Bob Picco, Eric Whitney
Hi
A Fujitsu machine fails to boot as well.
My bisect also points to git-sched.patch as the cause of the regression.
Thanks.
> 2.6.25-rc2-mm1 is hanging early in boot on my HP ia64 numa platform. I saw
> the "Strange hang on ia64 with CONFIG_PRINTK_TIME=y" thread on lkml:
>
> http://marc.info/?t=120288396800001&r=1&w=4
>
> However, my config does not include PRINTK_TIME=y. In fact, hang occurs
> with ia64 defconfig as well--right after the "Loading...initrd...done"
> message. 2.6.25-rc2 boots OK.
>
> Bisecting the broken-out series appears to indict 'git-sched.patch'. I
> went ahead and added Ingo's patch, discussed in the "strange hang"
> thread, even tho' I hadn't enabled printk timestamps. No effect.
>
> Anyone else seeing this?
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-26 11:25 ` KOSAKI Motohiro
@ 2008-02-26 11:31 ` Ingo Molnar
2008-02-27 1:42 ` KOSAKI Motohiro
0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2008-02-26 11:31 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Lee Schermerhorn, linux-ia64, linux-kernel, Andrew Morton,
Tony Luck, Ingo Molnar, Bob Picco, Eric Whitney
* KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> A Fujitsu machine fails to boot as well. My bisect also points to
> git-sched.patch as the cause of the regression.
hm, that's a bit weird - nothing really should have broken it. Could you
try to do a specific bisection of sched-devel.git:
http://people.redhat.com/mingo/sched-devel.git/README
it's just a handful of commits so it should be relatively quick to
figure out. My only guess would be:
Subject: sched: make early bootup sched_clock() use safer
but i think this has been ruled out before ...
Ingo
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-26 11:31 ` Ingo Molnar
@ 2008-02-27 1:42 ` KOSAKI Motohiro
2008-02-27 7:11 ` Ingo Molnar
0 siblings, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2008-02-27 1:42 UTC (permalink / raw)
To: Ingo Molnar
Cc: kosaki.motohiro, Lee Schermerhorn, linux-ia64, linux-kernel,
Andrew Morton, Tony Luck, Ingo Molnar, Bob Picco, Eric Whitney
Hi Ingo
> > A Fujitsu machine fails to boot as well. My bisect also points to
> > git-sched.patch as the cause of the regression.
>
> hm, that's a bit weird - nothing really should have broken it. Could you
> try to do a specific bisection of sched-devel.git:
>
> http://people.redhat.com/mingo/sched-devel.git/README
How do I find out which revision of git-sched.patch is in 2.6.25-rc2-mm1?
Should I bisect from the HEAD of sched-devel.git?
> it's just a handful of commits so it should be relatively quick to
> figure out. My only guess would be:
>
> Subject: sched: make early bootup sched_clock() use safer
>
> but i think this has been ruled out before ...
rc2-mm1 + that patch doesn't boot either.
It stops at the same point ;)
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-27 1:42 ` KOSAKI Motohiro
@ 2008-02-27 7:11 ` Ingo Molnar
2008-02-28 10:38 ` KOSAKI Motohiro
0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2008-02-27 7:11 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Lee Schermerhorn, linux-ia64, linux-kernel, Andrew Morton,
Tony Luck, Ingo Molnar, Bob Picco, Eric Whitney
* KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> > hm, that's a bit weird - nothing really should have broken it. Could you
> > try to do a specific bisection of sched-devel.git:
> >
> > http://people.redhat.com/mingo/sched-devel.git/README
>
> How do I find out which revision of git-sched.patch is in 2.6.25-rc2-mm1?
> Should I bisect from the HEAD of sched-devel.git?
yeah, please. If it's caused by sched-devel.git then you should see the
hang there too.
Ingo
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-27 7:11 ` Ingo Molnar
@ 2008-02-28 10:38 ` KOSAKI Motohiro
2008-02-28 11:50 ` Ingo Molnar
0 siblings, 1 reply; 12+ messages in thread
From: KOSAKI Motohiro @ 2008-02-28 10:38 UTC (permalink / raw)
To: Ingo Molnar, Steven Rostedt
Cc: kosaki.motohiro, Lee Schermerhorn, linux-ia64, linux-kernel,
Andrew Morton, Tony Luck, Ingo Molnar, Bob Picco, Eric Whitney,
akpm
Hi Ingo,
CC'ed Steven Rostedt
By bisecting, I found that the following patch causes the regression.
2.6.25-rc2-mm1: doesn't boot
2.6.25-rc2-mm1 with the following patch reverted: works well
But I think it is very strange.
runqueue_is_locked() looks simple and doesn't seem to have a bug. ;)
What do you think about this problem?
-------------------------------------------------------------------
commit 033394c2c097215ac556a446154af24fbf18b064
Author: Steven Rostedt <srostedt@redhat.com>
Date: Mon Feb 25 21:15:44 2008 +0100
printk: dont wake up klogd with the rq locked
It is not wise to place a printk where the runqueue lock is held.
>
> * KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>
> > > hm, that's a bit weird - nothing really should have broken it. Could you
> > > try to do a specific bisection of sched-devel.git:
> > >
> > > http://people.redhat.com/mingo/sched-devel.git/README
> >
> > How do I find out which revision of git-sched.patch is in 2.6.25-rc2-mm1?
> > Should I bisect from the HEAD of sched-devel.git?
>
> yeah, please. If it's caused by sched-devel.git then you should see the
> hang there too.
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-28 10:38 ` KOSAKI Motohiro
@ 2008-02-28 11:50 ` Ingo Molnar
2008-02-28 12:13 ` KOSAKI Motohiro
2008-02-28 18:13 ` Andrew Morton
0 siblings, 2 replies; 12+ messages in thread
From: Ingo Molnar @ 2008-02-28 11:50 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Steven Rostedt, Lee Schermerhorn, linux-ia64, linux-kernel,
Andrew Morton, Tony Luck, Ingo Molnar, Bob Picco, Eric Whitney
* KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> Hi Ingo,
> CC'ed Steven Rostedt
>
> By bisecting, I found that the following patch causes the regression.
>
> 2.6.25-rc2-mm1: doesn't boot
> 2.6.25-rc2-mm1 with the following patch reverted: works well
>
> But I think it is very strange. runqueue_is_locked() looks simple and
> doesn't seem to have a bug. ;)
>
> What do you think about this problem?
thanks for bisecting it down! Could ia64 have trouble accessing the
percpu data structures of the scheduler?
does the patch below resolve the hang?
Ingo
------------------------->
Subject: sched: fix wake_up_klogd()
From: Ingo Molnar <mingo@elte.hu>
Date: Thu Feb 28 12:42:45 CET 2008
on some platforms if we printk too early it might not be safe to call
into the scheduler data structures.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/printk.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
Index: linux/kernel/printk.c
===================================================================
--- linux.orig/kernel/printk.c
+++ linux/kernel/printk.c
@@ -948,7 +948,8 @@ int is_console_locked(void)
 void wake_up_klogd(void)
 {
-	if (!oops_in_progress && waitqueue_active(&log_wait))
+	if (!oops_in_progress && waitqueue_active(&log_wait) &&
+	    !runqueue_is_locked())
 		wake_up_interruptible(&log_wait);
 }
@@ -1000,7 +1001,7 @@ void release_console_sem(void)
 	 * If we try to wake up klogd while printing with the runqueue lock
 	 * held, this will deadlock.
 	 */
-	if (wake_klogd && !runqueue_is_locked())
+	if (wake_klogd)
 		wake_up_klogd();
 }
 EXPORT_SYMBOL(release_console_sem);
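
One possible reading of why moving the check helps, sketched as a user-space model rather than kernel code: early in boot nothing is waiting on log_wait yet, so once the runqueue check is the last operand of the && chain, short-circuit evaluation never reaches it and the per-CPU scheduler data is left untouched. The struct, the model_* names and the early-boot scenario are assumptions made for illustration only.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the two check orderings; not kernel code. */
struct model_state {
	bool oops_in_progress;
	bool waiter_present;	/* stands in for waitqueue_active(&log_wait) */
	bool rq_locked;		/* stands in for the runqueue lock state */
	int  rq_probes;		/* how often the runqueue was inspected */
	int  wakeups;		/* how often the log reader was woken */
};

/* Stand-in for runqueue_is_locked(); counts every inspection. */
static bool model_runqueue_is_locked(struct model_state *s)
{
	s->rq_probes++;		/* in the kernel this would touch per-CPU data */
	return s->rq_locked;
}

/* Ordering before the patch: the caller probes the runqueue first. */
static void model_release_old(struct model_state *s, bool wake_klogd)
{
	if (wake_klogd && !model_runqueue_is_locked(s)) {
		if (!s->oops_in_progress && s->waiter_present)
			s->wakeups++;
	}
}

/* Ordering after the patch: the probe is evaluated last, if at all. */
static void model_release_new(struct model_state *s, bool wake_klogd)
{
	if (wake_klogd) {
		if (!s->oops_in_progress && s->waiter_present &&
		    !model_runqueue_is_locked(s))
			s->wakeups++;
	}
}

/* Assumed early-boot case: a message was logged, but nobody waits on the log. */
static int model_early_boot_probes(bool patched)
{
	struct model_state s = { .waiter_present = false };

	if (patched)
		model_release_new(&s, true);
	else
		model_release_old(&s, true);
	return s.rq_probes;
}
```

Calling model_early_boot_probes(false) and model_early_boot_probes(true) compares how often the runqueue is inspected under each ordering in that assumed scenario.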
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-28 11:50 ` Ingo Molnar
@ 2008-02-28 12:13 ` KOSAKI Motohiro
2008-02-28 18:13 ` Andrew Morton
1 sibling, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2008-02-28 12:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: kosaki.motohiro, Steven Rostedt, Lee Schermerhorn, linux-ia64,
linux-kernel, Andrew Morton, Tony Luck, Ingo Molnar, Bob Picco,
Eric Whitney
Hi Ingo,
> thanks for bisecting it down! Could ia64 have trouble accessing the
> percpu data structures of the scheduler?
>
> does the patch below resolve the hang?
Thanks!
That patch works well in my test environment.
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
BTW: Your work is extremely fast. It's wonderful :)
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-28 11:50 ` Ingo Molnar
2008-02-28 12:13 ` KOSAKI Motohiro
@ 2008-02-28 18:13 ` Andrew Morton
2008-02-28 19:12 ` Ingo Molnar
1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2008-02-28 18:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: KOSAKI Motohiro, Steven Rostedt, Lee Schermerhorn, linux-ia64,
linux-kernel, Tony Luck, Ingo Molnar, Bob Picco, Eric Whitney
On Thu, 28 Feb 2008 12:50:41 +0100 Ingo Molnar <mingo@elte.hu> wrote:
> @@ -1000,7 +1001,7 @@ void release_console_sem(void)
>  	 * If we try to wake up klogd while printing with the runqueue lock
>  	 * held, this will deadlock.
>  	 */
> -	if (wake_klogd && !runqueue_is_locked())
> +	if (wake_klogd)
>  		wake_up_klogd();
>  }
I don't think we should have added that hack in the first place. It solves a
problem which about three developers hit four times in five years but it
has made kernel logging less reliable for everyone.
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-28 18:13 ` Andrew Morton
@ 2008-02-28 19:12 ` Ingo Molnar
2008-02-28 19:24 ` Andrew Morton
0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2008-02-28 19:12 UTC (permalink / raw)
To: Andrew Morton
Cc: KOSAKI Motohiro, Steven Rostedt, Lee Schermerhorn, linux-ia64,
linux-kernel, Tony Luck, Ingo Molnar, Bob Picco, Eric Whitney
* Andrew Morton <akpm@linux-foundation.org> wrote:
> On Thu, 28 Feb 2008 12:50:41 +0100 Ingo Molnar <mingo@elte.hu> wrote:
>
> > @@ -1000,7 +1001,7 @@ void release_console_sem(void)
> >  	 * If we try to wake up klogd while printing with the runqueue lock
> >  	 * held, this will deadlock.
> >  	 */
> > -	if (wake_klogd && !runqueue_is_locked())
> > +	if (wake_klogd)
> >  		wake_up_klogd();
> >  }
>
> I don't think we should have added that hack in the first place. It
> solves a problem which about three developers hit four times in five
> years but it has made kernel logging less reliable for everyone.
well, the problem was ia64, not a problem on x86 or other platforms. The
problem here is ia64 not setting up percpu data structures soon enough.
It has blown up in the past in other areas, and it will likely blow up
in the future in other areas as well. It's just not robust to have init
dependencies on such basic data structures like percpu areas like that.
Ingo
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-28 19:12 ` Ingo Molnar
@ 2008-02-28 19:24 ` Andrew Morton
2008-02-28 19:32 ` Ingo Molnar
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2008-02-28 19:24 UTC (permalink / raw)
To: Ingo Molnar
Cc: kosaki.motohiro, srostedt, Lee.Schermerhorn, linux-ia64,
linux-kernel, tony.luck, mingo, bob.picco, eric.whitney
On Thu, 28 Feb 2008 20:12:14 +0100
Ingo Molnar <mingo@elte.hu> wrote:
>
> * Andrew Morton <akpm@linux-foundation.org> wrote:
>
> > On Thu, 28 Feb 2008 12:50:41 +0100 Ingo Molnar <mingo@elte.hu> wrote:
> >
> > > @@ -1000,7 +1001,7 @@ void release_console_sem(void)
> > >  	 * If we try to wake up klogd while printing with the runqueue lock
> > >  	 * held, this will deadlock.
> > >  	 */
> > > -	if (wake_klogd && !runqueue_is_locked())
> > > +	if (wake_klogd)
> > >  		wake_up_klogd();
> > >  }
> >
> > I don't think we should have added that hack in the first place. It
> > solves a problem which about three developers hit four times in five
> > years but it has made kernel logging less reliable for everyone.
>
> well, the problem was ia64, not a problem on x86 or other platforms.
I am referring to the original change which made klogd wakeups unreliable.
* Re: 2.6.25-rc2-mm1 - boot hangs on ia64
2008-02-28 19:24 ` Andrew Morton
@ 2008-02-28 19:32 ` Ingo Molnar
0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2008-02-28 19:32 UTC (permalink / raw)
To: Andrew Morton
Cc: kosaki.motohiro, srostedt, Lee.Schermerhorn, linux-ia64,
linux-kernel, tony.luck, mingo, bob.picco, eric.whitney
* Andrew Morton <akpm@linux-foundation.org> wrote:
> > > I don't think we should have added that hack in the first place.
> > > It solves a problem which about three developers hit four times in
> > > five years but it has made kernel logging less reliable for
> > > everyone.
> >
> > well, the problem was ia64, not a problem on x86 or other platforms.
>
> I am referring to the original change which made klogd wakeups
> unreliable.
oh, indeed - agreed - i missed the fact that the is_locked check is sporadic
and can cause other CPUs to prevent the wakeup of klogd. I've zapped the
change.
Ingo
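
The lost-wakeup concern can be illustrated with a small user-space model (not kernel code). It assumes the check only reports whether the local runqueue lock is held, not which CPU holds it, so a remote CPU that briefly takes the lock makes the printing CPU skip the klogd wakeup even though no deadlock was possible. All model_* names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical user-space model; not kernel code. */
struct model_rq {
	bool locked;	/* the lock is held by someone */
	int  owner_cpu;	/* who holds it; an is-locked probe cannot see this */
};

struct model_log {
	bool reader_waiting;	/* klogd is asleep, waiting for log data */
	int  wakeups_sent;
	int  wakeups_skipped;
};

/* Reports only whether the lock is held, not by whom. */
static bool model_rq_is_locked(const struct model_rq *rq)
{
	return rq->locked;
}

/* Wake the log reader unless the local runqueue lock appears to be held. */
static void model_wake_up_klogd(struct model_log *log,
				const struct model_rq *local_rq)
{
	if (!log->reader_waiting)
		return;
	if (model_rq_is_locked(local_rq)) {
		log->wakeups_skipped++;	/* the reader stays asleep */
		return;
	}
	log->reader_waiting = false;
	log->wakeups_sent++;
}

/*
 * Scenario: CPU 1 briefly holds CPU 0's runqueue lock while CPU 0 prints.
 * CPU 0 could not deadlock on its own lock here, yet the wakeup is skipped.
 */
static int model_remote_holder_skips_wakeup(void)
{
	struct model_rq rq0 = { .locked = true, .owner_cpu = 1 };
	struct model_log log = { .reader_waiting = true };

	model_wake_up_klogd(&log, &rq0);	/* called "on CPU 0" */
	return log.wakeups_skipped;
}

/* Control case: with the lock free, the same call delivers the wakeup. */
static int model_free_lock_sends_wakeup(void)
{
	struct model_rq rq0 = { .locked = false, .owner_cpu = -1 };
	struct model_log log = { .reader_waiting = true };

	model_wake_up_klogd(&log, &rq0);
	return log.wakeups_sent;
}
```

In this model nothing retries a skipped wakeup, so the reader sleeps until some later message happens to be printed while the lock is free.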