LKML Archive on lore.kernel.org
* [patch] disable NMI watchdog by default
@ 2007-01-14  9:29 Ingo Molnar
  2007-01-14 14:45 ` Henrique de Moraes Holschuh
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Ingo Molnar @ 2007-01-14  9:29 UTC
  To: Andrew Morton, Linus Torvalds; +Cc: Andi Kleen, linux-kernel

From: Ingo Molnar <mingo@elte.hu>
Subject: [patch] disable NMI watchdog by default

there's a new NMI watchdog related problem: KVM crashes on certain 
bzImages because ... we enable the NMI watchdog by default (even if the 
user does not ask for it), and no other OS on this planet does that so
KVM doesn't have emulation for that yet. So KVM injects a #GP, which
crashes the Linux guest:

 general protection fault: 0000 [#1]
 PREEMPT SMP
 Modules linked in:
 CPU:    0
 EIP:    0060:[<c011a8ae>]    Not tainted VLI
 EFLAGS: 00000246   (2.6.20-rc5-rt0 #3)
 EIP is at setup_apic_nmi_watchdog+0x26d/0x3d3

and no, i did /not/ request an nmi_watchdog on the boot command line!

Solution: turn off that darn thing! It's a debug tool, not a 'make life 
harder' tool!!

with this patch the KVM guest boots up just fine.

And with this my laptop (Lenovo T60) also stops its sporadic hard 
hanging (sometimes in acpi_init(), sometimes later during bootup, 
sometimes much later during actual use) as well. It hung with both 
nmi_watchdog=1 and nmi_watchdog=2, so it's generally the fact of NMI 
injection that is causing problems, not the NMI watchdog variant, nor 
any particular bootup code.

The patch is unintrusive.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

Index: linux/include/asm-i386/nmi.h
===================================================================
--- linux.orig/include/asm-i386/nmi.h
+++ linux/include/asm-i386/nmi.h
@@ -33,7 +33,7 @@ extern int nmi_watchdog_tick (struct pt_
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT     -1
+#define NMI_DEFAULT     0
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2
Index: linux/include/asm-x86_64/nmi.h
===================================================================
--- linux.orig/include/asm-x86_64/nmi.h
+++ linux/include/asm-x86_64/nmi.h
@@ -63,7 +63,7 @@ extern int setup_nmi_watchdog(char *);
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT	-1
+#define NMI_DEFAULT	0
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2


* Re: [patch] disable NMI watchdog by default
  2007-01-14  9:29 [patch] disable NMI watchdog by default Ingo Molnar
@ 2007-01-14 14:45 ` Henrique de Moraes Holschuh
  2007-01-14 16:45 ` Arjan van de Ven
  2007-03-05 16:02 ` Bill Davidsen
  2 siblings, 0 replies; 13+ messages in thread
From: Henrique de Moraes Holschuh @ 2007-01-14 14:45 UTC
  To: Ingo Molnar; +Cc: linux-kernel

On Sun, 14 Jan 2007, Ingo Molnar wrote:
> And with this my laptop (Lenovo T60) also stops its sporadic hard 
> hanging (sometimes in acpi_init(), sometimes later during bootup, 
> sometimes much later during actual use) as well. It hung with both 
> nmi_watchdog=1 and nmi_watchdog=2, so it's generally the fact of NMI 
> injection that is causing problems, not the NMI watchdog variant, nor 
> any particular bootup code.

I seem to recall a patch sent to lkml that removed nmi_watchdog
functionality from ThinkPads exactly because of this.  Something to do with
SMBIOS code calling int 10h under SMM, and that it would hang hard if NMIs
happened at that time.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


* Re: [patch] disable NMI watchdog by default
  2007-01-14  9:29 [patch] disable NMI watchdog by default Ingo Molnar
  2007-01-14 14:45 ` Henrique de Moraes Holschuh
@ 2007-01-14 16:45 ` Arjan van de Ven
  2007-03-05 16:02 ` Bill Davidsen
  2 siblings, 0 replies; 13+ messages in thread
From: Arjan van de Ven @ 2007-01-14 16:45 UTC
  To: Ingo Molnar; +Cc: Andrew Morton, Linus Torvalds, Andi Kleen, linux-kernel


> Signed-off-by: Ingo Molnar <mingo@elte.hu>

Fully agree!
NMI watchdog is high risk in terms of interacting with firmware and
other things (and the code is sort of broken anyway)

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>




* Re: [patch] disable NMI watchdog by default
  2007-01-14  9:29 [patch] disable NMI watchdog by default Ingo Molnar
  2007-01-14 14:45 ` Henrique de Moraes Holschuh
  2007-01-14 16:45 ` Arjan van de Ven
@ 2007-03-05 16:02 ` Bill Davidsen
  2007-03-08 19:44   ` Avi Kivity
  2 siblings, 1 reply; 13+ messages in thread
From: Bill Davidsen @ 2007-03-05 16:02 UTC
  To: Ingo Molnar; +Cc: Andi Kleen, linux-kernel

Ingo Molnar wrote:
> From: Ingo Molnar <mingo@elte.hu>
> Subject: [patch] disable NMI watchdog by default
> 
> there's a new NMI watchdog related problem: KVM crashes on certain 
> bzImages because ... we enable the NMI watchdog by default (even if the 
> user does not ask for it), and no other OS on this planet does that so
> KVM doesn't have emulation for that yet. So KVM injects a #GP, which
> crashes the Linux guest:
> 
>  general protection fault: 0000 [#1]
>  PREEMPT SMP
>  Modules linked in:
>  CPU:    0
>  EIP:    0060:[<c011a8ae>]    Not tainted VLI
>  EFLAGS: 00000246   (2.6.20-rc5-rt0 #3)
>  EIP is at setup_apic_nmi_watchdog+0x26d/0x3d3
> 
> and no, i did /not/ request an nmi_watchdog on the boot command line!
> 
> Solution: turn off that darn thing! It's a debug tool, not a 'make life 
> harder' tool!!
> 
> with this patch the KVM guest boots up just fine.
> 
> And with this my laptop (Lenovo T60) also stops its sporadic hard 
> hanging (sometimes in acpi_init(), sometimes later during bootup, 
> sometimes much later during actual use) as well. It hung with both 
> nmi_watchdog=1 and nmi_watchdog=2, so it's generally the fact of NMI 
> injection that is causing problems, not the NMI watchdog variant, nor 
> any particular bootup code.
> 
> The patch is unintrusive.

I'm missing something, what limits this to systems running under kvm?

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


* Re: [patch] disable NMI watchdog by default
  2007-03-05 16:02 ` Bill Davidsen
@ 2007-03-08 19:44   ` Avi Kivity
  0 siblings, 0 replies; 13+ messages in thread
From: Avi Kivity @ 2007-03-08 19:44 UTC
  To: Bill Davidsen; +Cc: Ingo Molnar, Andi Kleen, linux-kernel

Bill Davidsen wrote:
>>
>> there's a new NMI watchdog related problem: KVM crashes on certain 
>> bzImages because ... we enable the NMI watchdog by default (even if 
>> the user does not ask for it), and no other OS on this planet does
>> that so KVM doesn't have emulation for that yet. So KVM injects a #GP,
>> which crashes the Linux guest:
>>
> I'm missing something, what limits this to systems running under kvm?
>

Most likely kvm doesn't implement the msrs which drive the nmi 
watchdog.  That makes it a kvm bug, not a problem with the nmi 
watchdog.  Emulating it correctly is fairly difficult, though, 
especially if we want to migrate virtual machines between different 
processor models, so I hope this goes in.
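
A toy illustration of the failure mode guessed at above, assuming a
hypervisor that emulates only the MSRs it knows about and answers any
other guest WRMSR with a #GP. toy_handle_wrmsr() and its result enum are
hypothetical and do not correspond to KVM's real code; the MSR indices
are the usual P6-family performance counter ones, which is only one of
the register sets the watchdog may program, depending on CPU model.

```c
#include <stdint.h>

/* MSR indices as commonly defined for P6-family CPUs; used here only
 * as examples of registers a perfctr-based NMI watchdog might write. */
#define MSR_IA32_APIC_BASE   0x0000001bu
#define MSR_P6_PERFCTR0      0x000000c1u
#define MSR_P6_EVNTSEL0      0x00000186u

enum wrmsr_result { WRMSR_OK, WRMSR_INJECT_GP };

/* Hypothetical hypervisor-side handler (not KVM code): accept writes to
 * MSRs that are emulated, request a #GP in the guest for anything else. */
static enum wrmsr_result toy_handle_wrmsr(uint32_t msr, uint64_t value)
{
	(void)value;	/* a real handler would store or validate this */

	switch (msr) {
	case MSR_IA32_APIC_BASE:
		return WRMSR_OK;
	default:
		/* unknown MSR: the guest sees a general protection fault */
		return WRMSR_INJECT_GP;
	}
}
```

With a handler shaped like this, a guest that programs the performance
counter MSRs during watchdog setup would take a #GP at that point, which
would be consistent with the oops in setup_apic_nmi_watchdog() quoted
earlier in the thread.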

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: [patch] disable NMI watchdog by default
  2007-03-07  3:06 ` Roland Dreier
@ 2007-03-07 14:56   ` Andi Kleen
  0 siblings, 0 replies; 13+ messages in thread
From: Andi Kleen @ 2007-03-07 14:56 UTC
  To: Roland Dreier
  Cc: Ingo Molnar, Linus Torvalds, Andrew Morton, linux-kernel,
	Arjan van de Ven, Adrian Bunk, Alan Cox

On Wednesday 07 March 2007 04:06, Roland Dreier wrote:
>  > --- linux.orig/include/asm-x86_64/nmi.h
>  > +++ linux/include/asm-x86_64/nmi.h
>  > @@ -63,7 +63,7 @@ extern int setup_nmi_watchdog(char *);
>  >  
>  >  extern atomic_t nmi_active;
>  >  extern unsigned int nmi_watchdog;
>  > -#define NMI_DEFAULT	-1
>  > +#define NMI_DEFAULT	0
> 
> Maybe I'm missing something obvious, but this patch doesn't seem
> correct to me.  The sentiment of disabling the NMI watchdog by default
> is fine, and I agree with it, but I don't think this patch does what
> it says.  First of all, I have a system running a kernel with this
> patch applied (v2.6.21-rc2-gc3442e2), and I see NMIs in
> /proc/interrupts and "testing NMI watchdog ... OK." in the log.

Yes the patch looks quite broken.

-Andi


* Re: [patch] disable NMI watchdog by default
  2007-03-05 12:20 Ingo Molnar
  2007-03-05 14:49 ` Arjan van de Ven
  2007-03-05 17:42 ` Len Brown
@ 2007-03-07  3:06 ` Roland Dreier
  2007-03-07 14:56   ` Andi Kleen
  2 siblings, 1 reply; 13+ messages in thread
From: Roland Dreier @ 2007-03-07  3:06 UTC
  To: Ingo Molnar
  Cc: Linus Torvalds, Andi Kleen, Andrew Morton, linux-kernel,
	Arjan van de Ven, Adrian Bunk, Alan Cox

 > --- linux.orig/include/asm-x86_64/nmi.h
 > +++ linux/include/asm-x86_64/nmi.h
 > @@ -63,7 +63,7 @@ extern int setup_nmi_watchdog(char *);
 >  
 >  extern atomic_t nmi_active;
 >  extern unsigned int nmi_watchdog;
 > -#define NMI_DEFAULT	-1
 > +#define NMI_DEFAULT	0

Maybe I'm missing something obvious, but this patch doesn't seem
correct to me.  The sentiment of disabling the NMI watchdog by default
is fine, and I agree with it, but I don't think this patch does what
it says.  First of all, I have a system running a kernel with this
patch applied (v2.6.21-rc2-gc3442e2), and I see NMIs in
/proc/interrupts and "testing NMI watchdog ... OK." in the log.

And second, looking at the NMI code, it seems that this change
actually makes it impossible to turn off the NMI watchdog!  In
arch/x86_64/kernel/nmi.c, we have:

void nmi_watchdog_default(void)
{
	if (nmi_watchdog != NMI_DEFAULT)
		return;
	if (nmi_known_cpu())
		nmi_watchdog = NMI_LOCAL_APIC;
	else
		nmi_watchdog = NMI_IO_APIC;
}

so it seems changing the value of NMI_DEFAULT has no effect on this
logic, really: if nmi_watchdog is left at the default, then the kernel
chooses LAPIC or IO-APIC.  And if someone passes "nmi_watchdog=0" on
the command line, nmi_watchdog is still NMI_DEFAULT and so the same
logic triggers.
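
As a rough user-space sketch of that point (not kernel code): the
function below mirrors the nmi_watchdog_default() logic quoted above,
using the post-patch values from nmi.h. The names resolve_watchdog() and
fake_known_cpu are made up for the illustration, and fake_known_cpu
merely stands in for whatever nmi_known_cpu() would return on a given
machine.

```c
/* Values as in nmi.h after the patch quoted in this thread. */
#define NMI_DEFAULT     0	/* was -1 before the patch */
#define NMI_NONE        0
#define NMI_IO_APIC     1
#define NMI_LOCAL_APIC  2

/* Hypothetical stand-in for nmi_known_cpu(); the real check is
 * CPU-specific and is not reproduced here. */
static int fake_known_cpu = 1;

/* Same decision as nmi_watchdog_default(), written as a pure function
 * of the requested mode so that it can be exercised in isolation. */
static unsigned int resolve_watchdog(unsigned int requested)
{
	if (requested != NMI_DEFAULT)
		return requested;	/* explicit choice is kept */
	if (fake_known_cpu)
		return NMI_LOCAL_APIC;
	return NMI_IO_APIC;
}
```

Under these assumptions resolve_watchdog(NMI_NONE) returns
NMI_LOCAL_APIC (or NMI_IO_APIC if fake_known_cpu is 0) rather than 0,
because a request for "none" can no longer be told apart from "left at
the default". An explicit NMI_IO_APIC or NMI_LOCAL_APIC is still passed
through unchanged.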

Ingo, I assume you tested this, so what am I missing?

 - R.


* Re: [patch] disable NMI watchdog by default
  2007-03-05 20:26     ` Ingo Molnar
@ 2007-03-05 20:40       ` Linus Torvalds
  0 siblings, 0 replies; 13+ messages in thread
From: Linus Torvalds @ 2007-03-05 20:40 UTC
  To: Ingo Molnar
  Cc: Andi Kleen, Len Brown, Andrew Morton, linux-kernel,
	Arjan van de Ven, Adrian Bunk, Alan Cox



On Mon, 5 Mar 2007, Ingo Molnar wrote:
> 
> Maybe we could take only the 32-bit side of my patch, because that's 
> what is most affected by legacies. Although i suspect Windows still 
> doesn't inject NMIs in 64-bit mode either, so i don't think there's any
> fundamental difference in terms of breakage in the future, it's just 
> that 64-bit systems and 64-bit testing is 1:5 - 1:10 rarer than 32-bit 
> testing.

I really don't see the point of having NMI on by default. Debugging that 
can cause problems should be disabled. I thought we disabled this a long 
time ago -  but anyway, now it *really* is (ie I applied your patch).

Anybody who thinks that debugging code is always good is deluded. 
Debugging is only good if it has zero downsides and doesn't introduce 
problems of its own. And the NMI watchdog clearly isn't in that camp.

If you *actively* debug stuff and/or have actually seen lockups, use the 
NMI watchdog. But last I saw, if the machine was running X, the NMI 
watchdog wouldn't help *anyway*, so enabling it by default

 - has almost zero upsides in the wild _anyway_

 - clearly has risks.

so this was a no-brainer. I don't understand why it's even discussed, or 
why I hadn't gotten the patch the last time this was discussed.

		Linus


* Re: [patch] disable NMI watchdog by default
  2007-03-05 19:54   ` Andi Kleen
@ 2007-03-05 20:26     ` Ingo Molnar
  2007-03-05 20:40       ` Linus Torvalds
  0 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2007-03-05 20:26 UTC
  To: Andi Kleen
  Cc: Len Brown, Linus Torvalds, Andrew Morton, linux-kernel,
	Arjan van de Ven, Adrian Bunk, Alan Cox


* Andi Kleen <ak@suse.de> wrote:

> > There are multiple machines not booting because of nmi_watchdog.
> > Some of them are documented here:
> > http://bugzilla.kernel.org/show_bug.cgi?id=7839
> > 
> > We used to think this was the "nolapic" bug,
> > but it is actually the "nmi_watchdog=0" bug.
> 
> I thought that one was worked around by Ingo's patch to not do nmi 
> watchdog during ACPI methods, wasn't it?

unfortunately that only made the lockups on my laptop rarer, it didn't
totally solve it. My workaround was only done for init acpi methods 
(bootup) - it was getting really ugly when i tried to extend it to all 
ACPI execution. I'd guess the situation on those other systems is 
similar.

Maybe we could take only the 32-bit side of my patch, because that's 
what is most affected by legacies. Although i suspect Windows still 
doesn't inject NMIs in 64-bit mode either, so i don't think there's any
fundamental difference in terms of breakage in the future, it's just 
that 64-bit systems and 64-bit testing is 1:5 - 1:10 rarer than 32-bit 
testing.

dunno. A distro can still patch the NMI watchdog on, easily. If you 
think it's a better approach i can make this a .config option - just 
like CONFIG_DETECT_SOFTLOCKUP: CONFIG_DETECT_HARDLOCKUP, which would 
default to off?

	Ingo


* Re: [patch] disable NMI watchdog by default
  2007-03-05 17:42 ` Len Brown
@ 2007-03-05 19:54   ` Andi Kleen
  2007-03-05 20:26     ` Ingo Molnar
  0 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2007-03-05 19:54 UTC
  To: Len Brown
  Cc: Ingo Molnar, Linus Torvalds, Andrew Morton, linux-kernel,
	Arjan van de Ven, Adrian Bunk, Alan Cox

On Monday 05 March 2007 18:42, Len Brown wrote:
> On Monday 05 March 2007 07:20, Ingo Molnar wrote:
> > 
> > Linus,
> > 
> > Andrew sent the patch below (which is now months old and has been in -mm 
> > for some time) towards Andi's tree 4 weeks ago, but apparently it fell 
> > into a black hole there - the patch is still not upstream!
> > 
> > This is a must-have for v2.6.21
> 
> I agree.
> There are multiple machines not booting because of nmi_watchdog.
> Some of them are documented here:
> http://bugzilla.kernel.org/show_bug.cgi?id=7839
> 
> We used to think this was the "nolapic" bug,
> but it is actually the "nmi_watchdog=0" bug.

I thought that one was worked around by Ingo's patch to not do nmi watchdog 
during ACPI methods, wasn't it? 

-Andi


* Re: [patch] disable NMI watchdog by default
  2007-03-05 12:20 Ingo Molnar
  2007-03-05 14:49 ` Arjan van de Ven
@ 2007-03-05 17:42 ` Len Brown
  2007-03-05 19:54   ` Andi Kleen
  2007-03-07  3:06 ` Roland Dreier
  2 siblings, 1 reply; 13+ messages in thread
From: Len Brown @ 2007-03-05 17:42 UTC
  To: Ingo Molnar
  Cc: Linus Torvalds, Andi Kleen, Andrew Morton, linux-kernel,
	Arjan van de Ven, Adrian Bunk, Alan Cox

On Monday 05 March 2007 07:20, Ingo Molnar wrote:
> 
> Linus,
> 
> Andrew sent the patch below (which is now months old and has been in -mm 
> for some time) towards Andi's tree 4 weeks ago, but apparently it fell 
> into a black hole there - the patch is still not upstream!
> 
> This is a must-have for v2.6.21

I agree.
There are multiple machines not booting because of nmi_watchdog.
Some of them are documented here:
http://bugzilla.kernel.org/show_bug.cgi?id=7839

We used to think this was the "nolapic" bug,
but it is actually the "nmi_watchdog=0" bug.

-Len

> Frankly, i find it ridiculous that i had to write more than 10 emails 
> about this stupid topic already, while i'm the original author of this 
> feature to begin with. The "do not enable by default debug features that 
> break certain systems" concept is obvious to me.
> 
> This category of regressions has been introduced by Andi when he made 
> the NMI watchdog the default in certain scenarios at around v2.6.18 - 
> over my repeated objections. Andi Cc:-ed :-)
> 
> 	Ingo
> 
> ----- Forwarded message from Ingo Molnar <mingo@elte.hu> -----
> 
> From: Ingo Molnar <mingo@elte.hu>
> Subject: [patch] disable NMI watchdog by default
> 
> there's a new NMI watchdog related problem: KVM crashes on certain 
> bzImages because ... we enable the NMI watchdog by default (even if the 
> user does not ask for it), and no other OS on this planet does that so
> KVM doesn't have emulation for that yet. So KVM injects a #GP, which
> crashes the Linux guest:
> 
>  general protection fault: 0000 [#1]
>  PREEMPT SMP
>  Modules linked in:
>  CPU:    0
>  EIP:    0060:[<c011a8ae>]    Not tainted VLI
>  EFLAGS: 00000246   (2.6.20-rc5-rt0 #3)
>  EIP is at setup_apic_nmi_watchdog+0x26d/0x3d3
> 
> and no, i did /not/ request an nmi_watchdog on the boot command line!
> 
> Solution: turn off that darn thing! It's a debug tool, not a 'make life 
> harder' tool!!
> 
> with this patch the KVM guest boots up just fine.
> 
> And with this my laptop (Lenovo T60) also stopped its sporadic hard 
> hanging (sometimes in acpi_init(), sometimes later during bootup, 
> sometimes much later during actual use) as well. It hung with both 
> nmi_watchdog=1 and nmi_watchdog=2, so it's generally the fact of NMI 
> injection that is causing problems, not the NMI watchdog variant, nor 
> any particular bootup code.
> 
> The patch is unintrusive.
> 
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> 
> Index: linux/include/asm-i386/nmi.h
> ===================================================================
> --- linux.orig/include/asm-i386/nmi.h
> +++ linux/include/asm-i386/nmi.h
> @@ -33,7 +33,7 @@ extern int nmi_watchdog_tick (struct pt_
>  
>  extern atomic_t nmi_active;
>  extern unsigned int nmi_watchdog;
> -#define NMI_DEFAULT     -1
> +#define NMI_DEFAULT     0
>  #define NMI_NONE	0
>  #define NMI_IO_APIC	1
>  #define NMI_LOCAL_APIC	2
> Index: linux/include/asm-x86_64/nmi.h
> ===================================================================
> --- linux.orig/include/asm-x86_64/nmi.h
> +++ linux/include/asm-x86_64/nmi.h
> @@ -63,7 +63,7 @@ extern int setup_nmi_watchdog(char *);
>  
>  extern atomic_t nmi_active;
>  extern unsigned int nmi_watchdog;
> -#define NMI_DEFAULT	-1
> +#define NMI_DEFAULT	0
>  #define NMI_NONE	0
>  #define NMI_IO_APIC	1
>  #define NMI_LOCAL_APIC	2
> 
> 


* Re: [patch] disable NMI watchdog by default
  2007-03-05 12:20 Ingo Molnar
@ 2007-03-05 14:49 ` Arjan van de Ven
  2007-03-05 17:42 ` Len Brown
  2007-03-07  3:06 ` Roland Dreier
  2 siblings, 0 replies; 13+ messages in thread
From: Arjan van de Ven @ 2007-03-05 14:49 UTC
  To: Ingo Molnar
  Cc: Linus Torvalds, Andi Kleen, Andrew Morton, linux-kernel,
	Adrian Bunk, Alan Cox


> Frankly, i find it ridiculous that i had to write more than 10 emails 
> about this stupid topic already, while i'm the original author of this 
> feature to begin with. The "do not enable by default debug features that 
> break certain systems" concept is obvious to me.


NMI breaks on some systems, esp in combination with SMM so

Acked-by: Arjan van de Ven <arjan@linux.intel.com> for both 32 bit and
64 bit


* [patch] disable NMI watchdog by default
@ 2007-03-05 12:20 Ingo Molnar
  2007-03-05 14:49 ` Arjan van de Ven
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Ingo Molnar @ 2007-03-05 12:20 UTC
  To: Linus Torvalds
  Cc: Andi Kleen, Andrew Morton, linux-kernel, Arjan van de Ven,
	Adrian Bunk, Alan Cox


Linus,

Andrew sent the patch below (which is now months old and has been in -mm 
for some time) towards Andi's tree 4 weeks ago, but apparently it fell 
into a black hole there - the patch is still not upstream!

This is a must-have for v2.6.21, because the problem still triggers even 
with the latest upstream tree, if i boot a KVM guest.

Frankly, i find it ridiculous that i had to write more than 10 emails 
about this stupid topic already, while i'm the original author of this 
feature to begin with. The "do not enable by default debug features that 
break certain systems" concept is obvious to me.

This category of regressions has been introduced by Andi when he made 
the NMI watchdog the default in certain scenarios at around v2.6.18 - 
over my repeated objections. Andi Cc:-ed :-)

	Ingo

----- Forwarded message from Ingo Molnar <mingo@elte.hu> -----

From: Ingo Molnar <mingo@elte.hu>
Subject: [patch] disable NMI watchdog by default

there's a new NMI watchdog related problem: KVM crashes on certain 
bzImages because ... we enable the NMI watchdog by default (even if the 
user does not ask for it), and no other OS on this planet does that so
KVM doesn't have emulation for that yet. So KVM injects a #GP, which
crashes the Linux guest:

 general protection fault: 0000 [#1]
 PREEMPT SMP
 Modules linked in:
 CPU:    0
 EIP:    0060:[<c011a8ae>]    Not tainted VLI
 EFLAGS: 00000246   (2.6.20-rc5-rt0 #3)
 EIP is at setup_apic_nmi_watchdog+0x26d/0x3d3

and no, i did /not/ request an nmi_watchdog on the boot command line!

Solution: turn off that darn thing! It's a debug tool, not a 'make life 
harder' tool!!

with this patch the KVM guest boots up just fine.

And with this my laptop (Lenovo T60) also stopped its sporadic hard 
hanging (sometimes in acpi_init(), sometimes later during bootup, 
sometimes much later during actual use) as well. It hung with both 
nmi_watchdog=1 and nmi_watchdog=2, so it's generally the fact of NMI 
injection that is causing problems, not the NMI watchdog variant, nor 
any particular bootup code.

The patch is unintrusive.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

Index: linux/include/asm-i386/nmi.h
===================================================================
--- linux.orig/include/asm-i386/nmi.h
+++ linux/include/asm-i386/nmi.h
@@ -33,7 +33,7 @@ extern int nmi_watchdog_tick (struct pt_
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT     -1
+#define NMI_DEFAULT     0
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2
Index: linux/include/asm-x86_64/nmi.h
===================================================================
--- linux.orig/include/asm-x86_64/nmi.h
+++ linux/include/asm-x86_64/nmi.h
@@ -63,7 +63,7 @@ extern int setup_nmi_watchdog(char *);
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;
-#define NMI_DEFAULT	-1
+#define NMI_DEFAULT	0
 #define NMI_NONE	0
 #define NMI_IO_APIC	1
 #define NMI_LOCAL_APIC	2



end of thread, other threads:[~2007-03-08 19:54 UTC | newest]

Thread overview: 13+ messages
2007-01-14  9:29 [patch] disable NMI watchdog by default Ingo Molnar
2007-01-14 14:45 ` Henrique de Moraes Holschuh
2007-01-14 16:45 ` Arjan van de Ven
2007-03-05 16:02 ` Bill Davidsen
2007-03-08 19:44   ` Avi Kivity
2007-03-05 12:20 Ingo Molnar
2007-03-05 14:49 ` Arjan van de Ven
2007-03-05 17:42 ` Len Brown
2007-03-05 19:54   ` Andi Kleen
2007-03-05 20:26     ` Ingo Molnar
2007-03-05 20:40       ` Linus Torvalds
2007-03-07  3:06 ` Roland Dreier
2007-03-07 14:56   ` Andi Kleen
