LKML Archive on lore.kernel.org
* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
@ 2008-01-11  2:22 Ed Tomlinson
  2008-01-14 16:16 ` Ingo Molnar
  0 siblings, 1 reply; 29+ messages in thread
From: Ed Tomlinson @ 2008-01-11  2:22 UTC (permalink / raw)
  To: Ingo Molnar, Matthew, H. Peter Anvin, linux-kernel, Thomas Gleixner

>> - if yes, does booting with "nmi_watchdog=2 idle=poll" give you a 
>>   working NMI watchdog? (working NMI watchdog means the NMI counts 
>>   increase for all cores in /proc/interrupts).

> booting with the above gives me an incrementing NMI counter in /proc/interrupts

Ingo,

Is there anything else that needs to be set in the kernel config for the nmi watchdog to trigger?

I ask because I just had a hang but nothing showed on the _serial_ console - I waited a couple
of minutes before rebooting....  Is there any other way to verify the watchdog is working?

I seem to need X active with a mix of 32 and 64 bit applications running to get hung here.  A massively
threaded 64 bit java app along with 32 bit firefox and wine active will eventually trigger things here.
If I had to guess I would say that it is the switch from 32 to 64 (or vice versa) that triggers the issue.

TIA & test/debug patches welcome,
Ed Tomlinson



* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-11  2:22 Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo) Ed Tomlinson
@ 2008-01-14 16:16 ` Ingo Molnar
  0 siblings, 0 replies; 29+ messages in thread
From: Ingo Molnar @ 2008-01-14 16:16 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: Matthew, H. Peter Anvin, linux-kernel, Thomas Gleixner


* Ed Tomlinson <edt@aei.ca> wrote:

> >> - if yes, does booting with "nmi_watchdog=2 idle=poll" give you a
> >>   working NMI watchdog? (working NMI watchdog means the NMI counts 
> >>   increase for all cores in /proc/interrupts).
> 
> > booting with the above gives me an incrementing NMI counter in 
> > /proc/interrupts
> 
> Ingo,
> 
> Is there anything else that needs to be set in the kernel config for 
> the nmi watchdog to trigger?
> 
> I ask because I just had a hang but nothing showed on the _serial_ 
> console - I waited a couple of minutes before rebooting....  Is there 
> any other way to verify the watchdog is working?

if you cause a hard lockup intentionally via an infinite irqs-off loop:
  
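# cat > lockupcli.c
#include <sys/io.h>	/* iopl() */

int main(void)
{
	/* raise the I/O privilege level so userspace may execute cli (needs root) */
	if (iopl(3))
		return 1;
	/* spin forever with interrupts off: a deliberate hard lockup */
	for (;;)
		asm volatile("cli");
}
Ctrl-D
make lockupcli
./lockupcli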

does the NMI watchdog properly trigger? If not, does booting with 
idle=poll change the situation?

	Ingo
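
The check discussed above (NMI counts increasing for all cores in
/proc/interrupts) can be sketched in C. This is only an illustration, not
code from the thread: the names nmi_counts() and nmi_watchdog_ticking() are
made up here, and the row layout (an "NMI:" label followed by one counter
per CPU) is assumed from typical /proc/interrupts output.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find the "NMI:" row in a /proc/interrupts snapshot held in `text`.
 * Stores up to `max` per-CPU counters in `counts` and returns how many
 * were found (0 if there is no NMI row at all).
 */
int nmi_counts(const char *text, unsigned long *counts, int max)
{
	const char *p = text;
	int n = 0;

	assert(text && counts && max > 0);

	while (p && *p) {
		const char *line = p;

		while (*line == ' ' || *line == '\t')
			line++;
		if (strncmp(line, "NMI:", 4) == 0) {
			const char *q = line + 4;

			while (n < max) {
				char *end;

				while (*q == ' ' || *q == '\t')
					q++;
				/* stop at the end of line or at a text description */
				if (!isdigit((unsigned char)*q))
					break;
				counts[n++] = strtoul(q, &end, 10);
				q = end;
			}
			return n;
		}
		p = strchr(p, '\n');
		if (p)
			p++;
	}
	return 0;
}

/*
 * Compare two snapshots taken a few seconds apart. Returns 1 only if both
 * have an NMI row with the same number of CPUs and every counter went up.
 */
int nmi_watchdog_ticking(const char *before, const char *after)
{
	unsigned long a[64], b[64];
	int na = nmi_counts(before, a, 64);
	int nb = nmi_counts(after, b, 64);

	if (na == 0 || na != nb)
		return 0;
	for (int i = 0; i < na; i++)
		if (b[i] <= a[i])
			return 0;
	return 1;
}
```

Read /proc/interrupts twice, a few seconds apart, and pass both texts to
nmi_watchdog_ticking(). A positive result only shows that NMIs are being
delivered; whether a backtrace is printed on a real lockup is what the
lockupcli test above exercises.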


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-15 22:09                                   ` Ingo Molnar
@ 2008-01-15 23:41                                     ` Ed Tomlinson
  0 siblings, 0 replies; 29+ messages in thread
From: Ed Tomlinson @ 2008-01-15 23:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Matthew, H. Peter Anvin, linux-kernel, Ingo Molnar,
	Thomas Gleixner, Roland McGrath

On January 15, 2008, Ingo Molnar wrote:
> 
> * Ed Tomlinson <edt@aei.ca> wrote:
> 
> > This is _not_ a regression.  This has been occurring for ages here.  A 
> > backport of this fix to 2.6.23 would be a very good thing - IMHO it's 
> > something that should go into stable asap.
> 
> the problem is that this bug was only present in x86.git. I.e. neither 
> 2.6.24 nor 2.6.23 has this particular bug.
> 
> perhaps something else in x86.git fixed your box, but this 
> x86.git-specific hang 'took over its place', and now that it got fixed, 
> you've got a working box? In any case, please monitor your box, it might 
> still lock up the same way it did previously ...

I am now testing with a .24-rc7+fix  kernel.  So far so good.  Running gentoo's 32 bit
firefox with flash 9 is a good way to trigger the problem here as is running Delftship (freeship)
under wine.  The problem is usually worst with a fully preemptive kernel.  I have been using both on 
a kernel with preempt and have an uptime of 22 hours - this is really good.   I have rarely been able 
to get this much uptime using these apps.  If it manages to run for a few more days without a lockup 
it would really be worth trying to figure out what in .24 fixes the problem...

THANKS!
Ed Tomlinson


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-14 22:16                                 ` Ed Tomlinson
  2008-01-15 17:11                                   ` Matthew
@ 2008-01-15 22:09                                   ` Ingo Molnar
  2008-01-15 23:41                                     ` Ed Tomlinson
  1 sibling, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2008-01-15 22:09 UTC (permalink / raw)
  To: Ed Tomlinson
  Cc: Matthew, H. Peter Anvin, linux-kernel, Ingo Molnar,
	Thomas Gleixner, Roland McGrath


* Ed Tomlinson <edt@aei.ca> wrote:

> This is _not_ a regression.  This has been occurring for ages here.  A 
> backport of this fix to 2.6.23 would be a very good thing - IMHO it's 
> something that should go into stable asap.

the problem is that this bug was only present in x86.git. I.e. neither 
2.6.24 nor 2.6.23 has this particular bug.

perhaps something else in x86.git fixed your box, but this 
x86.git-specific hang 'took over its place', and now that it got fixed, 
you've got a working box? In any case, please monitor your box, it might 
still lock up the same way it did previously ...

	Ingo


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-14 22:16                                 ` Ed Tomlinson
@ 2008-01-15 17:11                                   ` Matthew
  2008-01-15 22:09                                   ` Ingo Molnar
  1 sibling, 0 replies; 29+ messages in thread
From: Matthew @ 2008-01-15 17:11 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: H. Peter Anvin, linux-kernel, Ed Tomlinson, Thomas Gleixner

> Ingo,
>
> This is _not_ a regression.  This has been occurring for ages here.  A backport of this fix to 2.6.23 would be a
> very good thing - IMHO it's something that should go into stable asap.
>
> Thanks,
> Ed Tomlinson
>
>
>

++

Ingo,
this probably has something to do with the random unmotivated
hardlocks which I suffered from with 2.6.23

when I used 2.6.23-based kernels my rig from time to time (sometimes 2
times a day) just locked and wouldn't react to keyboard-input or magic
sysrq key anymore
if I have a more precise look at my memory (in my head) it probably
happened more often around usage with realplayer (<-- at least that
app; 32bit on amd64) [perhaps also with 32bit thunderbird]
the next suspect is nvidia-drivers: with earlier versions this
happened more often for me (that's at least the "feeling" I have)

so you guys might want to start with those 3 points (thunderbird,
realplayer, nvidia-drivers);
considering use / test with latest cfs-backports would also be a good
idea, with early backports I had some problems, whereas it's now
perfectly stable (fair group scheduling not enabled)

unfortunately I can't / couldn't reproduce it
in addition to that I'm pretty busy right now so I can't investigate
any further ...

hope you also find the culprit for that bug ;)

Regards
Mat


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-14 17:00                               ` Ingo Molnar
@ 2008-01-14 22:16                                 ` Ed Tomlinson
  2008-01-15 17:11                                   ` Matthew
  2008-01-15 22:09                                   ` Ingo Molnar
  0 siblings, 2 replies; 29+ messages in thread
From: Ed Tomlinson @ 2008-01-14 22:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Matthew, H. Peter Anvin, linux-kernel, Ingo Molnar, Thomas Gleixner

On January 14, 2008, Ingo Molnar wrote:
> 
> * Matthew <jackdachef@gmail.com> wrote:
> 
> > > FYI, latest x86.git should have this fix included. So if your box 
> > > still hangs there must be some other bug lurking as well.
> > 
> > 
> > the fix from Roland ?: http://lkml.org/lkml/2008/1/11/108
> > http://forums.gentoo.org/viewtopic-p-4719206.html#4719206 (+ following posts)
> > 
> > works like a charm :)
> > wine-problems should be solved,
> > 
> > 64bit firefox & 32bit flash, 32bit firefox, 32bit thunderbird, 
> > realplayer work fine again without hardlocking so far (at least for 
> > me)
> 
> great - thanks for following through with this, this was an important 
> regression to get fixed! I've added:
> 
> Tested-by: Matthew <jackdachef@gmail.com>

Ingo,

This is _not_ a regression.  This has been occurring for ages here.  A backport of this fix to 2.6.23 would be a 
very good thing - IMHO it's something that should go into stable asap.

Thanks,
Ed Tomlinson




* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-14 16:47                             ` Matthew
@ 2008-01-14 17:00                               ` Ingo Molnar
  2008-01-14 22:16                                 ` Ed Tomlinson
  0 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2008-01-14 17:00 UTC (permalink / raw)
  To: Matthew
  Cc: H. Peter Anvin, linux-kernel, Ingo Molnar, Thomas Gleixner, Ed Tomlinson


* Matthew <jackdachef@gmail.com> wrote:

> > FYI, latest x86.git should have this fix included. So if your box 
> > still hangs there must be some other bug lurking as well.
> 
> 
> the fix from Roland ?: http://lkml.org/lkml/2008/1/11/108
> http://forums.gentoo.org/viewtopic-p-4719206.html#4719206 (+ following posts)
> 
> works like a charm :)
> wine-problems should be solved,
> 
> 64bit firefox & 32bit flash, 32bit firefox, 32bit thunderbird, 
> realplayer work fine again without hardlocking so far (at least for 
> me)

great - thanks for following through with this, this was an important 
regression to get fixed! I've added:

Tested-by: Matthew <jackdachef@gmail.com>

	Ingo


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-14 16:13                           ` Ingo Molnar
@ 2008-01-14 16:47                             ` Matthew
  2008-01-14 17:00                               ` Ingo Molnar
  0 siblings, 1 reply; 29+ messages in thread
From: Matthew @ 2008-01-14 16:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: H. Peter Anvin, linux-kernel, Ingo Molnar, Thomas Gleixner, Ed Tomlinson

>
> FYI, latest x86.git should have this fix included. So if your box still
> hangs there must be some other bug lurking as well.


the fix from Roland ?: http://lkml.org/lkml/2008/1/11/108
http://forums.gentoo.org/viewtopic-p-4719206.html#4719206 (+ following posts)

works like a charm :)
wine-problems should be solved,

64bit firefox & 32bit flash, 32bit firefox, 32bit thunderbird,
realplayer work fine again without hardlocking so far  (at least for
me)

Thanks to everyone involved

>
>         Ingo
>

Regards
Mat


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 22:53                         ` Matthew
@ 2008-01-14 16:13                           ` Ingo Molnar
  2008-01-14 16:47                             ` Matthew
  0 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2008-01-14 16:13 UTC (permalink / raw)
  To: Matthew
  Cc: H. Peter Anvin, linux-kernel, Ingo Molnar, Thomas Gleixner, Ed Tomlinson


* Matthew <jackdachef@gmail.com> wrote:

> > I just managed to reproduce the bug in simulation.  I believe we should
> > be able to resolve this.
> 
> That's great news! Keep up the good work :)

FYI, latest x86.git should have this fix included. So if your box still 
hangs there must be some other bug lurking as well.

	Ingo


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 12:43                 ` Matthew
  2008-01-10 12:48                   ` Ingo Molnar
@ 2008-01-10 23:35                   ` Zan Lynx
  1 sibling, 0 replies; 29+ messages in thread
From: Zan Lynx @ 2008-01-10 23:35 UTC (permalink / raw)
  To: Matthew; +Cc: Ingo Molnar, H. Peter Anvin, linux-kernel, Thomas Gleixner

[-- Attachment #1: Type: text/plain, Size: 1342 bytes --]


On Thu, 2008-01-10 at 13:43 +0100, Matthew wrote:

> it's a little tricky to reproduce it:
> I tried it with root-account: firefox-bin, thunderbird-bin wouldn't trigger
> user-account (with used account-directory of both apps):
> thunderbird-bin triggers it more reliably
> probably it has to do with the x86 compatibility apps of gentoo ?
> gentoo amd64-users with 32bit firefox & thunderbird - anyone able to
> reproduce it ?

I believe that I *have* seen this happen to me.  I'm using a Compaq
R3000 laptop with AMD-64 CPU and Gentoo with kernel 2.6.24-rc6-mm1.

A few days ago I crashed it twice in a row by trying to load my Comics
bookmarks as tabs (87 entries) in 32-bit Firefox.

I only got 1 and 3/4ths lines output by netconsole and it mentioned a
function with 32-something in the name.

Later tonight I can try loading the tabs in Firefox again to see if it
will reproduce for a 3rd time.

I also have to say that the NMI watchdog, supposedly Non-Maskable,
hardly *ever* works for me.  I don't believe that whatever events reset
the dog actually matter to the end user.  Perhaps it's still processing
interrupts and running a timer loop but if nothing can read or write
disk, net, netlink or other device IO, I don't believe the system can
actually claim to be working.
-- 
Zan Lynx <zlynx@acm.org>

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 22:30                       ` H. Peter Anvin
@ 2008-01-10 22:53                         ` Matthew
  2008-01-14 16:13                           ` Ingo Molnar
  0 siblings, 1 reply; 29+ messages in thread
From: Matthew @ 2008-01-10 22:53 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Ed Tomlinson

> I just managed to reproduce the bug in simulation.  I believe we should
> be able to resolve this.

That's great news! Keep up the good work :)

I hope that you guys'll be able to do so since it (indirectly) more or
less leads to data corruption (at least with thunderbird-bin &
firefox-bin -> both profile directories didn't work after the crash
anymore == data loss)
if I didn't have a backup aside they would have been lost ...

>         -hpa
>

Mat


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 22:28                     ` Matthew
@ 2008-01-10 22:30                       ` H. Peter Anvin
  2008-01-10 22:53                         ` Matthew
  0 siblings, 1 reply; 29+ messages in thread
From: H. Peter Anvin @ 2008-01-10 22:30 UTC (permalink / raw)
  To: Matthew; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Ed Tomlinson

Matthew wrote:
>> That's fine, but that was collected with the vmlinux image you sent me,
>> right?
>>
> 
> no, but now it is:
> http://kerneloftruth.neucode.org/other/crash_ia32_64/not_tainted/latest/
> (the other one was taken before I added/selected the demanded features)
> 
> what puzzles me is that it doesn't say "tainted" (not tainted) in the
> upper part & "tainted" in the lower part
> I apologize for the bad quality of the pictures ...
> 
> luckily there was also an additional call trace this time (syslog-ng
> Tainted: G   D) (don't know what it means), nvidia-module wasn't
> loaded and no other additional proprietary modules were loaded AFAIK
> 

I just managed to reproduce the bug in simulation.  I believe we should 
be able to resolve this.

	-hpa


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 21:57                   ` H. Peter Anvin
@ 2008-01-10 22:28                     ` Matthew
  2008-01-10 22:30                       ` H. Peter Anvin
  0 siblings, 1 reply; 29+ messages in thread
From: Matthew @ 2008-01-10 22:28 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Ed Tomlinson

>
> That's fine, but that was collected with the vmlinux image you sent me,
> right?
>

no, but now it is:
http://kerneloftruth.neucode.org/other/crash_ia32_64/not_tainted/latest/
(the other one was taken before I added/selected the demanded features)

what puzzles me is that it doesn't say "tainted" (not tainted) in the
upper part & "tainted" in the lower part
I apologize for the bad quality of the pictures ...

luckily there was also an additional call trace this time (syslog-ng
Tainted: G   D) (don't know what it means), nvidia-module wasn't
loaded and no other additional proprietary modules were loaded AFAIK

>         -hpa
>

Mat
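
On the "Tainted: G   D" question above: going by the kernel's oops-tracing
documentation, each letter is a flag. 'G' in the first position means no
proprietary module has been loaded ('P' would appear there otherwise), and
'D' means the kernel has already died once (an earlier OOPS or BUG). That
would be consistent with the first trace saying "Not tainted" and a later
one showing 'D', though this is an interpretation and not something
confirmed in the thread. A small C sketch of such a decoder follows; the
function names are hypothetical and only a subset of the flags is listed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Meaning of a single taint letter, or NULL if the letter is not in this
 * (deliberately partial) table.
 */
const char *taint_flag_meaning(char flag)
{
	switch (flag) {
	case 'P': return "a proprietary module has been loaded";
	case 'G': return "no proprietary module has been loaded";
	case 'F': return "a module was force-loaded";
	case 'R': return "a module was force-unloaded";
	case 'M': return "a machine check exception was reported";
	case 'B': return "a bad page state was hit";
	case 'D': return "the kernel has died recently (earlier OOPS or BUG)";
	default:  return NULL;
	}
}

/* Returns 1 if the flag string from the oops header contains `flag`. */
int taint_has(const char *flags, char flag)
{
	assert(flags != NULL);
	return flag != '\0' && strchr(flags, flag) != NULL;
}
```

For the string quoted above, taint_has("G   D", 'P') is 0, which matches
the report that the nvidia module was not loaded, while
taint_has("G   D", 'D') is 1.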


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 21:52                 ` Matthew
@ 2008-01-10 21:57                   ` H. Peter Anvin
  2008-01-10 22:28                     ` Matthew
  0 siblings, 1 reply; 29+ messages in thread
From: H. Peter Anvin @ 2008-01-10 21:57 UTC (permalink / raw)
  To: Matthew; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Ed Tomlinson

Matthew wrote:
>> Do you have the error dump output to go along with this, too?
>>
> 
> no, unfortunately no kernel crash dump on disk ;( (I hope I understood
> it right, I'm pretty noobish concerning collection of error data ;) )
> 
> I only have the console-output of the hardlock in:
> http://kerneloftruth.neucode.org/other/crash_ia32_64/not_tainted/

That's fine, but that was collected with the vmlinux image you sent me, 
right?

	-hpa


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 21:17               ` H. Peter Anvin
@ 2008-01-10 21:52                 ` Matthew
  2008-01-10 21:57                   ` H. Peter Anvin
  0 siblings, 1 reply; 29+ messages in thread
From: Matthew @ 2008-01-10 21:52 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Ed Tomlinson

> Do you have the error dump output to go along with this, too?
>

no, unfortunately no kernel crash dump on disk ;( (I hope I understood
it right, I'm pretty noobish concerning collection of error data ;) )

I only have the console-output of the hardlock in:
http://kerneloftruth.neucode.org/other/crash_ia32_64/not_tainted/

and a call trace from an earlier crash but I don't know if it's worth anything:
http://kerneloftruth.neucode.org/other/crash_ia32_64/moto_0040.jpg

just FYI: the crash also occurs with preemptible rcu disabled (classic
rcu) (just saw that I had it enabled in that kernel ...)


> Huge thanks,
>

:)

>         -hpa
>

Mat


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 21:10             ` Matthew
@ 2008-01-10 21:17               ` H. Peter Anvin
  2008-01-10 21:52                 ` Matthew
  0 siblings, 1 reply; 29+ messages in thread
From: H. Peter Anvin @ 2008-01-10 21:17 UTC (permalink / raw)
  To: Matthew; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Ed Tomlinson

Matthew wrote:
>> If you *do* reproduce the problem that way, it would be extremely
>> helpful if you could enable CONFIG_DEBUG_INFO and provide the vmlinux
>> (not vmlinuz/bzImage) file that goes with the crash dump screenshot.
> 
> I *did* reproduce it that way and enabled the above mentioned option
> and the CONFIG_DEBUG_BUGVERBOSE=y thingy and CONFIG_FRAME_POINTER=y,
> too
> hope that's enough
> 
> here you go: http://kerneloftruth.neucode.org/other/crash_ia32_64/not_tainted/vmlinux
> (I hope it was uploaded completely):
> 
> md5sum: 0be124557bafebaebd69be2138329ef6
> sha256sum: 638d7a0dc36caa8eedd77e2ebeae6e8b54db74466f9d28f769c9cacf2ace0e0e
> 
> updated kernel-config:
> http://omploader.org/vYWhw/2.6.24-rc6-zen0_bisect%20(latest_config)
> 

Great!!  Downloading now.

Do you have the error dump output to go along with this, too?

Huge thanks,

	-hpa


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-09  0:58           ` H. Peter Anvin
  2008-01-10  9:05             ` Matthew
@ 2008-01-10 21:10             ` Matthew
  2008-01-10 21:17               ` H. Peter Anvin
  1 sibling, 1 reply; 29+ messages in thread
From: Matthew @ 2008-01-10 21:10 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Ed Tomlinson

> If you *do* reproduce the problem that way, it would be extremely
> helpful if you could enable CONFIG_DEBUG_INFO and provide the vmlinux
> (not vmlinuz/bzImage) file that goes with the crash dump screenshot.

I *did* reproduce it that way and enabled the above mentioned option
and the CONFIG_DEBUG_BUGVERBOSE=y thingy and CONFIG_FRAME_POINTER=y,
too
hope that's enough

here you go: http://kerneloftruth.neucode.org/other/crash_ia32_64/not_tainted/vmlinux
(I hope it was uploaded completely):

md5sum: 0be124557bafebaebd69be2138329ef6
sha256sum: 638d7a0dc36caa8eedd77e2ebeae6e8b54db74466f9d28f769c9cacf2ace0e0e

updated kernel-config:
http://omploader.org/vYWhw/2.6.24-rc6-zen0_bisect%20(latest_config)

>          -hpa
>

Mat
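
Whether a kernel .config really has the debug options named above
(CONFIG_DEBUG_INFO, CONFIG_DEBUG_BUGVERBOSE, CONFIG_FRAME_POINTER) built in
can be checked mechanically. The following C sketch is illustrative only:
config_has_y() is a made-up helper, and it assumes the usual .config layout
of one "OPTION=y" entry per line, with disabled options written as
"# OPTION is not set".

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 if the .config text contains a line that is exactly
 * "<option>=y", and 0 otherwise (including "=m" and "is not set").
 */
int config_has_y(const char *config, const char *option)
{
	const char *p = config;
	size_t len;

	assert(config && option);
	len = strlen(option);

	while (*p) {
		/* the option name must start the line and be followed by "=y" */
		if (strncmp(p, option, len) == 0 &&
		    strncmp(p + len, "=y", 2) == 0 &&
		    (p[len + 2] == '\n' || p[len + 2] == '\r' || p[len + 2] == '\0'))
			return 1;
		p = strchr(p, '\n');
		if (!p)
			break;
		p++;
	}
	return 0;
}
```

Matching the whole line keeps a query for CONFIG_DEBUG from being satisfied
by CONFIG_DEBUG_INFO=y, and commented-out entries never match because they
start with '#'.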


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 13:32                       ` Ingo Molnar
  2008-01-10 15:56                         ` Matthew
@ 2008-01-10 16:38                         ` Ed Tomlinson
  1 sibling, 0 replies; 29+ messages in thread
From: Ed Tomlinson @ 2008-01-10 16:38 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Matthew, H. Peter Anvin, linux-kernel, Thomas Gleixner

On January 10, 2008, Ingo Molnar wrote:
> 
> * Ed Tomlinson <edt@aei.ca> wrote:
> 
> > Matthew is not alone with this problem.  I have it too.  It's not new 
> > here.  It's been happening as long as I have had gentoo amd64 
> > installed.  It can be hard to reproduce but eventually, when 32 bit 
> > apps are used, my box bricks.  There is nothing in the logs (nor on a 
> > serial console) - the box just freezes.
> > 
> > My kernel is _not_ tainted. [...]
> 
> ok, good. A series of questions:
> 
> - can you reproduce it from the VGA console?

No - though I do have a serial console to see logs.

> - if yes, does booting with "nmi_watchdog=2 idle=poll" give you a 
>   working NMI watchdog? (working NMI watchdog means the NMI counts 
>   increase for all cores in /proc/interrupts).

booting with the above gives me an incrementing NMI counter in /proc/interrupts

> if still 'yes', then try to reproduce the hard hang on the VGA text 
> console - do you perhaps get an NMI backtrace printed within 1-2 minutes 
> after the hard hang happens? If yes then take a photo of that or write 
> it down.

I am booted with the NMI watchdog and serial consoles active running apps that
eventually will trigger a hang...

Ed Tomlinson




* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 13:32                       ` Ingo Molnar
@ 2008-01-10 15:56                         ` Matthew
  2008-01-10 16:38                         ` Ed Tomlinson
  1 sibling, 0 replies; 29+ messages in thread
From: Matthew @ 2008-01-10 15:56 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Ed Tomlinson, H. Peter Anvin, linux-kernel, Thomas Gleixner

> > My kernel is _not_ tainted. [...]
>

this time my kernel isn't tainted either (comm: thunderbird-bin Not
tainted; http://kerneloftruth.neucode.org/other/crash_ia32_64/not_tainted/moto_0041.jpg)
 but still hardlocks:
http://kerneloftruth.neucode.org/other/crash_ia32_64/not_tainted/


> ok, good. A series of questions:
>
> - can you reproduce it from the VGA console?
>

yes

> - if yes, does booting with "nmi_watchdog=2 idle=poll" give you a
>   working NMI watchdog? (working NMI watchdog means the NMI counts
>   increase for all cores in /proc/interrupts).
>

no, I get err=-16
(watchdog is broken: cat /proc/interrupts | grep NMI reveals nothing)

> if still 'yes', then try to reproduce the hard hang on the VGA text
> console

yes, despite broken watchdog

steps:
1) startx
2) change to tty2, log in; DISPLAY=:0 thunderbird-bin
3) wait until it hardlocks

known apps to trigger that locking:
- realplayer
- thunderbird-bin (2.0.0.9)
- mozilla-firefox-bin (2.0.0.11)
(all included in portage-tree)

apps not triggering:
- skype (not tested that thoroughly (yet))
- ...

> - do you perhaps get an NMI backtrace printed within 1-2 minutes
> after the hard hang happens? If yes then take a photo of that or write
> it down.
>

only backtrace so far (once):
http://kerneloftruth.neucode.org/other/crash_ia32_64/moto_0040.jpg

I'll tar the whole kernel-directory & modules so that you'll be able
to reproduce it more easily (if wanted), is there a place where I
could upload it (it weighs around 300-400 MBs so that'll take some
time ;) )
I got work to do so that'll be all for now, I hope you'll be able to
find the culprit soon ...

>         Ingo
>

Mat


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 13:08                     ` Ed Tomlinson
@ 2008-01-10 13:32                       ` Ingo Molnar
  2008-01-10 15:56                         ` Matthew
  2008-01-10 16:38                         ` Ed Tomlinson
  0 siblings, 2 replies; 29+ messages in thread
From: Ingo Molnar @ 2008-01-10 13:32 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: Matthew, H. Peter Anvin, linux-kernel, Thomas Gleixner


* Ed Tomlinson <edt@aei.ca> wrote:

> Matthew is not alone with this problem.  I have it too.  It's not new 
> here.  It's been happening as long as I have had gentoo amd64 
> installed.  It can be hard to reproduce but eventually, when 32 bit 
> apps are used, my box bricks.  There is nothing in the logs (nor on a 
> serial console) - the box just freezes.
> 
> My kernel is _not_ tainted. [...]

ok, good. A series of questions:

- can you reproduce it from the VGA console?

- if yes, does booting with "nmi_watchdog=2 idle=poll" give you a 
  working NMI watchdog? (working NMI watchdog means the NMI counts 
  increase for all cores in /proc/interrupts).

if still 'yes', then try to reproduce the hard hang on the VGA text 
console - do you perhaps get an NMI backtrace printed within 1-2 minutes 
after the hard hang happens? If yes then take a photo of that or write 
it down.

	Ingo


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 12:59                     ` Matthew
@ 2008-01-10 13:24                       ` Ingo Molnar
  0 siblings, 0 replies; 29+ messages in thread
From: Ingo Molnar @ 2008-01-10 13:24 UTC (permalink / raw)
  To: Matthew; +Cc: H. Peter Anvin, linux-kernel, Thomas Gleixner


* Matthew <jackdachef@gmail.com> wrote:

> > > this also happens with rc7-based kernels, btw
> >
> > hm, exactly what rc7 based kernel? Vanilla 2.6.24-rc7, built by you? 
> > Or any patches on top of it? (x86.git perhaps?)
> 
> see first post / mail (there are a few additional patches / trees 
> included: badram, wireless, alsa, tuxonice, madwifi, reiser4, 
> sched-devel, realtime-lsm, powertop, mactel)

problem being, that the bad patch that was identified in the first post:

 Author: Roland McGrath <roland@redhat.com>
 Date:   Sun Dec 23 12:47:41 2007 +0100

    x86 user_regset math_emu

    This converts the ptrace/signal accessors for i387 math_emu
    state to the user_regset interface style, and calls these
    from the old interfaces.

is only included in x86.git AFAIK. Maybe this commit is not really the 
culprit?

	Ingo


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 12:48                   ` Ingo Molnar
  2008-01-10 12:59                     ` Matthew
@ 2008-01-10 13:08                     ` Ed Tomlinson
  2008-01-10 13:32                       ` Ingo Molnar
  1 sibling, 1 reply; 29+ messages in thread
From: Ed Tomlinson @ 2008-01-10 13:08 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Matthew, H. Peter Anvin, linux-kernel, Thomas Gleixner

On January 10, 2008, Ingo Molnar wrote:
> 
> * Matthew <jackdachef@gmail.com> wrote:
> 
> > this also happens with rc7-based kernels, btw
> 
> hm, exactly what rc7 based kernel? Vanilla 2.6.24-rc7, built by you? Or 
> any patches on top of it? (x86.git perhaps?)

Matthew is not alone with this problem.  I have it too.  It's not new here.  It's
been happening as long as I have had gentoo amd64 installed.  It can be hard
to reproduce but eventually, when 32 bit apps are used, my box bricks.  There is
nothing in the logs (nor on a serial console) - the box just freezes.  

My kernel is _not_ tainted.  The kernel is currently 2.6.23-gentoo-r5-crc with 
the latest cfs backport applied; it does not seem to be critical though as it has
happened with all kernels I have tried (mm, linux and gentoo variants).  

The processor is:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 4
model name      : AMD Athlon(tm) 64 Processor 2800+
stepping        : 10
cpu MHz         : 1808.802
cache size      : 512 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow rep_good
bogomips        : 3620.77
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

I asked about this on lkml before and was told it was probably a cpu/hardware issue...  It's 
interesting that Matthew is also running gentoo.

Thanks,
Ed Tomlinson


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 12:48                   ` Ingo Molnar
@ 2008-01-10 12:59                     ` Matthew
  2008-01-10 13:24                       ` Ingo Molnar
  2008-01-10 13:08                     ` Ed Tomlinson
  1 sibling, 1 reply; 29+ messages in thread
From: Matthew @ 2008-01-10 12:59 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: H. Peter Anvin, linux-kernel, Thomas Gleixner

>
> > this also happens with rc7-based kernels, btw
>
> hm, exactly what rc7 based kernel? Vanilla 2.6.24-rc7, built by you? Or
> any patches on top of it? (x86.git perhaps?)

see first post / mail (there are a few additional patches / trees
included:  badram, wireless, alsa, tuxonice, madwifi, reiser4,
sched-devel, realtime-lsm, powertop, mactel)

>since yesterday my laptop kept on hard-locking when launching 32bit
>binaries / apps
>I didn't know what to do but

>miguel botón was the one pointing me in the right direction, namely bisect :)

>kudos to him & the others involved in his zen-sources project:
>http://repo.or.cz/w/linux-2.6/zen-sources.git

>bisect said the following is the causer:

so I guess I need to counter-check it against your realtime-tree:
is it the following ?
http://git.eu.kernel.org/?p=linux/kernel/git/cloos/rt-2.6.git;a=summary
(it's currently at rc5 ?)

or is hardirq / softirq also included in your sched-devel tree ?
http://git.eu.kernel.org/?p=linux/kernel/git/mingo/linux-2.6-sched-devel.git;a=summary

>         Ingo

Mat


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10 12:43                 ` Matthew
@ 2008-01-10 12:48                   ` Ingo Molnar
  2008-01-10 12:59                     ` Matthew
  2008-01-10 13:08                     ` Ed Tomlinson
  2008-01-10 23:35                   ` Zan Lynx
  1 sibling, 2 replies; 29+ messages in thread
From: Ingo Molnar @ 2008-01-10 12:48 UTC (permalink / raw)
  To: Matthew; +Cc: H. Peter Anvin, linux-kernel, Thomas Gleixner


* Matthew <jackdachef@gmail.com> wrote:

> this also happens with rc7-based kernels, btw

hm, exactly what rc7 based kernel? Vanilla 2.6.24-rc7, built by you? Or 
any patches on top of it? (x86.git perhaps?)

	Ingo


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10  9:42               ` Ingo Molnar
@ 2008-01-10 12:43                 ` Matthew
  2008-01-10 12:48                   ` Ingo Molnar
  2008-01-10 23:35                   ` Zan Lynx
  0 siblings, 2 replies; 29+ messages in thread
From: Matthew @ 2008-01-10 12:43 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: H. Peter Anvin, linux-kernel, Thomas Gleixner

> really, that module does all sorts of nasty stuff when inserted (and
> then removed), so just to make sure (because you are about to crash your
> box again to take a picture), could you try to boot up without ever
> loading the nvidia module, even once?

and it still happens ;(
I un-emerged nvidia-drivers & checked via dmesg | grep nv -> it wasn't
loaded, but the box still hung
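
Grepping dmesg only finds what is still in the kernel log buffer, so a second check against /proc/modules (which lists the currently loaded modules, one per line, module name first) may be useful. A minimal sketch, assuming the file contents are passed in as text; module_loaded is a hypothetical helper name and the sample line in the usage is invented for illustration.

```python
def module_loaded(proc_modules_text: str, name: str) -> bool:
    """Return True if `name` is the first field of any line in /proc/modules."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The module name is the first whitespace-separated field.
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    # Invented sample content; on a real system read /proc/modules instead:
    #   text = open("/proc/modules").read()
    sample = "snd_pcm 81672 2 - Live 0xffffffff88100000\n"
    print(module_loaded(sample, "nvidia"))
```

This only shows what is loaded right now; it does not prove the module was never loaded earlier in the same boot.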

it's a little tricky to reproduce:
with the root account: firefox-bin and thunderbird-bin wouldn't trigger it
with a user account (with an already used profile directory for both apps):
thunderbird-bin triggers it more reliably
probably it has to do with the x86 compatibility apps of gentoo ?
gentoo amd64-users with 32bit firefox & thunderbird - anyone able to
reproduce it ?
it seemingly is being caused by softirq (see pictures; the zen-sources
is also using parts of rt-kernel); approx 1 minute later there also
was a spinlock lockup by syslog-ng (?)

I'll recompile the newest git-sources and see if it's still triggered
with hardirq & softirq disabled ...

http://www.kerneloftruth.neucode.org/other/crash_ia32_64/ (<--
omploader is down so I'll host the picture somewhere else)
hope there's everything relevant to see / read ...

I'll recompile the kernel in question with debug-info, probably this
evening if I find some time - do you guys also need frame-pointers set ?

this also happens with rc7-based kernels, btw

>         Ingo
>

Mat


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-10  9:05             ` Matthew
@ 2008-01-10  9:42               ` Ingo Molnar
  2008-01-10 12:43                 ` Matthew
  0 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2008-01-10  9:42 UTC (permalink / raw)
  To: Matthew; +Cc: H. Peter Anvin, linux-kernel, Ingo Molnar, Thomas Gleixner


* Matthew <jackdachef@gmail.com> wrote:

> > I have been unable to reproduce your problem here, and I notice you have
> > the proprietary, highly invasive and closed-source Nvidia driver
> > installed in your kernel.
> >
> > Can you try using the "nv" or "vesa" (unaccelerated) Xorg drivers and
> > reproduce the problem that way?
> >
> > If you *do* reproduce the problem that way, it would be extremely
> > helpful if you could enable CONFIG_DEBUG_INFO and provide the vmlinux
> > (not vmlinuz/bzImage) file that goes with the crash dump screenshot.
> >
> > Thanks!
> 
> I was able to reproduce it with removed nvidia module (rmmod nvidia) & 
> nv driver, and will post the pictures later if I find some time (it 
> was the same function if I recall right) do you also need: 
> CONFIG_DEBUG_BUGVERBOSE enabled ?

really, that module does all sorts of nasty stuff when inserted (and 
then removed), so just to make sure (because you are about to crash your 
box again to take a picture), could you try to boot up without ever
loading the nvidia module, even once?

	Ingo


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-09  0:58           ` H. Peter Anvin
@ 2008-01-10  9:05             ` Matthew
  2008-01-10  9:42               ` Ingo Molnar
  2008-01-10 21:10             ` Matthew
  1 sibling, 1 reply; 29+ messages in thread
From: Matthew @ 2008-01-10  9:05 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner

> I have been unable to reproduce your problem here, and I notice you have
> the proprietary, highly invasive and closed-source Nvidia driver
> installed in your kernel.
>
> Can you try using the "nv" or "vesa" (unaccelerated) Xorg drivers and
> reproduce the problem that way?
>
> If you *do* reproduce the problem that way, it would be extremely
> helpful if you could enable CONFIG_DEBUG_INFO and provide the vmlinux
> (not vmlinuz/bzImage) file that goes with the crash dump screenshot.
>
> Thanks!

I was able to reproduce it with removed nvidia module (rmmod nvidia) &
nv driver, and will post the pictures later if I find some time (it
was the same function if I recall right)
do you also need: CONFIG_DEBUG_BUGVERBOSE enabled ?

>
>          -hpa
>

Mat


* Re: Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
  2008-01-08 11:40         ` Fwd: " Matthew
@ 2008-01-09  0:58           ` H. Peter Anvin
  2008-01-10  9:05             ` Matthew
  2008-01-10 21:10             ` Matthew
  0 siblings, 2 replies; 29+ messages in thread
From: H. Peter Anvin @ 2008-01-09  0:58 UTC (permalink / raw)
  To: Matthew; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner

Matthew wrote:
> Hi everyone,
> 
> sorry for the long delay
> 
> - I first had to get home & set up my rig to reproduce this hardlock
> (repeatedly hardlocking / shutting down the laptop doesn't do too good
> to the new hdd ;) )
> 
> and fortunately I was successful :)
> 
> sorry for the bad quality of the pics (they were taken with my phone):
> 
> http://omploader.org/vYWU1/moto_0025.jpg
> http://omploader.org/vYWU2/moto_0026.jpg
> 
> steps to reproduce:
> 1.) log on
> 2.) startx
> 3.) opening some pure 64bit apps == working, no locks
> 4.) opening 32bit-apps (such as firefox-bin, thunderbird-bin) == hard
> lock, only pulling power cord (on laptop) or reset button (rig) works,
> magic sysrq key doesn't (keyboard & mouse == dead)
> 
> I'm currently writing from my "rescue system" (winxp ;) )
> so if you need my kernel-config or some more info of the system please tell
> 

I have been unable to reproduce your problem here, and I notice you have
the proprietary, highly invasive and closed-source Nvidia driver
installed in your kernel.

Can you try using the "nv" or "vesa" (unaccelerated) Xorg drivers and
reproduce the problem that way?

If you *do* reproduce the problem that way, it would be extremely 
helpful if you could enable CONFIG_DEBUG_INFO and provide the vmlinux 
(not vmlinuz/bzImage) file that goes with the crash dump screenshot.
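
Whether a given kernel was built with CONFIG_DEBUG_INFO can be checked from its config file (.config in the build tree, or /proc/config.gz when the kernel was built with that feature). A minimal sketch that parses the config text; enabled_options is a hypothetical helper name, and the sample config in the usage is invented.

```python
def enabled_options(config_text: str, wanted: list) -> dict:
    """Map each wanted CONFIG_* name to True if it is set to y or m."""
    values = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key] = value
    return {name: values.get(name) in ("y", "m") for name in wanted}


if __name__ == "__main__":
    # Invented sample; on a real tree use: open(".config").read()
    sample = "CONFIG_DEBUG_INFO=y\n# CONFIG_FRAME_POINTER is not set\n"
    print(enabled_options(sample, ["CONFIG_DEBUG_INFO", "CONFIG_FRAME_POINTER"]))
```

The same call works for any other option name, since the list of names is just a parameter.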

Thanks!

         -hpa


* Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo)
       [not found]       ` <e85b9d30801080333y3d6668ccgf20f9666d0326884@mail.gmail.com>
@ 2008-01-08 11:40         ` Matthew
  2008-01-09  0:58           ` H. Peter Anvin
  0 siblings, 1 reply; 29+ messages in thread
From: Matthew @ 2008-01-08 11:40 UTC (permalink / raw)
  To: linux-kernel

Hi everyone,

sorry for the long delay

- I first had to get home & set up my rig to reproduce this hardlock
(repeatedly hardlocking / shutting down the laptop doesn't do too good
to the new hdd ;) )

and fortunately I was successful :)

sorry for the bad quality of the pics (they were taken with my phone):

http://omploader.org/vYWU1/moto_0025.jpg
http://omploader.org/vYWU2/moto_0026.jpg

steps to reproduce:
1.) log on
2.) startx
3.) opening some pure 64bit apps == working, no locks
4.) opening 32bit-apps (such as firefox-bin, thunderbird-bin) == hard
lock, only pulling power cord (on laptop) or reset button (rig) works,
magic sysrq key doesn't (keyboard & mouse == dead)
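
The 64bit / 32bit distinction in steps 3 and 4 can be checked from the binary itself: an ELF file starts with the magic bytes 0x7f "ELF", followed by a class byte (EI_CLASS) that is 1 for 32-bit and 2 for 64-bit. A minimal sketch; elf_class is a made-up helper name and the path in the usage is a placeholder, not a path taken from this report.

```python
def elf_class(path: str) -> str:
    """Classify a file as a 32-bit ELF, a 64-bit ELF, or not an ELF file."""
    with open(path, "rb") as f:
        header = f.read(5)
    # Bytes 0-3 are the ELF magic; byte 4 (EI_CLASS) holds the word size.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return "not ELF"
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown")


if __name__ == "__main__":
    # Placeholder path; point this at the binary you want to inspect.
    print(elf_class("/bin/sh"))
```

Launchers such as firefox-bin are sometimes shell wrapper scripts, in which case this reports "not ELF" and the real binary they start has to be checked instead.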

I'm currently writing from my "rescue system" (winxp ;) )
so if you need my kernel-config or some more info of the system please tell

Cheers

Mat


end of thread, other threads:[~2008-01-15 23:41 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
2008-01-11  2:22 Fwd: Fwd: laptop / computer hardlocks during execution of 32bit applications(binaries) on 64bit system (Gentoo) Ed Tomlinson
2008-01-14 16:16 ` Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2007-12-29 19:57 Matthew
2007-12-29 23:04 ` Fwd: " Matthew
2007-12-30  0:28   ` Miguel Botón
2008-01-02  8:40     ` Ingo Molnar
     [not found]       ` <e85b9d30801080333y3d6668ccgf20f9666d0326884@mail.gmail.com>
2008-01-08 11:40         ` Fwd: " Matthew
2008-01-09  0:58           ` H. Peter Anvin
2008-01-10  9:05             ` Matthew
2008-01-10  9:42               ` Ingo Molnar
2008-01-10 12:43                 ` Matthew
2008-01-10 12:48                   ` Ingo Molnar
2008-01-10 12:59                     ` Matthew
2008-01-10 13:24                       ` Ingo Molnar
2008-01-10 13:08                     ` Ed Tomlinson
2008-01-10 13:32                       ` Ingo Molnar
2008-01-10 15:56                         ` Matthew
2008-01-10 16:38                         ` Ed Tomlinson
2008-01-10 23:35                   ` Zan Lynx
2008-01-10 21:10             ` Matthew
2008-01-10 21:17               ` H. Peter Anvin
2008-01-10 21:52                 ` Matthew
2008-01-10 21:57                   ` H. Peter Anvin
2008-01-10 22:28                     ` Matthew
2008-01-10 22:30                       ` H. Peter Anvin
2008-01-10 22:53                         ` Matthew
2008-01-14 16:13                           ` Ingo Molnar
2008-01-14 16:47                             ` Matthew
2008-01-14 17:00                               ` Ingo Molnar
2008-01-14 22:16                                 ` Ed Tomlinson
2008-01-15 17:11                                   ` Matthew
2008-01-15 22:09                                   ` Ingo Molnar
2008-01-15 23:41                                     ` Ed Tomlinson
