LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH] [WATCHDOG] Fix kdump when using hpwdt
@ 2008-10-26 14:59 Bernhard Walle
  2008-10-27 19:30 ` Wim Van Sebroeck
  0 siblings, 1 reply; 6+ messages in thread
From: Bernhard Walle @ 2008-10-26 14:59 UTC (permalink / raw)
  To: kexec
  Cc: linux-kernel, Bernhard Walle, Wim Van Sebroeck,
	Thomas Mingarelli, Vivek Goyal

When the "hpwdt" module is loaded (even if the /dev/watchdog device is not
opened), then kdump does not work. The panic kernel either does not start at
all or crash in various places.

The problem is that hpwdt_pretimeout is registered with register_die_notifier()
with the highest possible priority. Because it returns NOTIFY_STOP, the
crash_nmi_callback which is also registered with register_die_notifier()
is never executed. This causes the shutdown of other CPUs to fail.

Reverting the order is no option: The crash_nmi_callback executes HLT
and so never returns normally. Because of that, it must be executed as
last notifier, which currently is done.

So, that patch returns NOTIFY_OK in case allow_kdump is set as module parameter
in the hpwdt module. Also, it changes the default of allow_kdump to 1. Kdump is
quite common and should be working as default.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: Thomas Mingarelli <thomas.mingarelli@hp.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 drivers/watchdog/hpwdt.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index a3765e0..65e7102 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -116,7 +116,7 @@ static unsigned int reload;			/* the computed soft_margin */
 static int nowayout = WATCHDOG_NOWAYOUT;
 static char expect_release;
 static unsigned long hpwdt_is_open;
-static unsigned int allow_kdump;
+static unsigned int allow_kdump = 1;
 
 static void __iomem *pci_mem_addr;		/* the PCI-memory address */
 static unsigned long __iomem *hpwdt_timer_reg;
@@ -482,7 +482,11 @@ static int hpwdt_pretimeout(struct notifier_block *nb, unsigned long ulReason,
 			"Management Log for details.\n");
 	}
 
-	return NOTIFY_STOP;
+	/*
+	 * for kdump, we must return NOTIFY_OK here to execute the
+	 * crash_nmi_callback afterwards, see arch/x86/kernel/crash.c
+	 */
+	return allow_kdump ? NOTIFY_OK : NOTIFY_STOP;
 }
 
 /*
@@ -759,7 +763,7 @@ MODULE_ALIAS_MISCDEV(WATCHDOG_MINOR);
 module_param(soft_margin, int, 0);
 MODULE_PARM_DESC(soft_margin, "Watchdog timeout in seconds");
 
-module_param(allow_kdump, int, 0);
+module_param(allow_kdump, int, 1);
 MODULE_PARM_DESC(allow_kdump, "Start a kernel dump after NMI occurs");
 
 module_param(nowayout, int, 0);
-- 
1.6.0.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] [WATCHDOG] Fix kdump when using hpwdt
  2008-10-26 14:59 [PATCH] [WATCHDOG] Fix kdump when using hpwdt Bernhard Walle
@ 2008-10-27 19:30 ` Wim Van Sebroeck
  2008-10-27 21:52   ` Bernhard Walle
  0 siblings, 1 reply; 6+ messages in thread
From: Wim Van Sebroeck @ 2008-10-27 19:30 UTC (permalink / raw)
  To: Bernhard Walle; +Cc: kexec, linux-kernel, Thomas Mingarelli, Vivek Goyal

Hi Bernard,

Tom will have a look at the fix but the below code is wrong:

> -module_param(allow_kdump, int, 0);
> +module_param(allow_kdump, int, 1);
>  MODULE_PARM_DESC(allow_kdump, "Start a kernel dump after NMI occurs");

the syntax is: #define module_param(name, type, perm)
perm sets the visibility in sysfs: 000 means it's not there,
read bits mean it's readable, write bits mean it's writable.

perm is not the default value for the integer but a file permission attribute.

Kind regards,
Wim.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] [WATCHDOG] Fix kdump when using hpwdt
  2008-10-27 19:30 ` Wim Van Sebroeck
@ 2008-10-27 21:52   ` Bernhard Walle
  0 siblings, 0 replies; 6+ messages in thread
From: Bernhard Walle @ 2008-10-27 21:52 UTC (permalink / raw)
  To: Wim Van Sebroeck; +Cc: kexec, linux-kernel, Thomas Mingarelli, Vivek Goyal

Hi Wim,

* Wim Van Sebroeck [2008-10-27 20:30]:

> > -module_param(allow_kdump, int, 0);
> > +module_param(allow_kdump, int, 1);
> >  MODULE_PARM_DESC(allow_kdump, "Start a kernel dump after NMI occurs");
> 
> the syntax is: #define module_param(name, type, perm)
> perm sets the visibility in sysfs: 000 means it's not there,
> read bits mean it's readable, write bits mean it's writable.
> 
> perm is not the default value for the integer but a file permission attribute.

Yeah, thanks for spotting that. I was a bit in hurry while creating the
patch ...

I'll come up with an updated patch with the suggestions from Vivek.
Probably tomorrow.


Regards,
Bernhard
-- 
Bernhard Walle, SUSE LINUX Products GmbH, Architecture Development

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] [WATCHDOG] Fix kdump when using hpwdt
  2008-11-25 14:27 ` Vivek Goyal
@ 2008-11-25 14:32   ` Bernhard Walle
  0 siblings, 0 replies; 6+ messages in thread
From: Bernhard Walle @ 2008-11-25 14:32 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: akpm, linux-kernel, wim, stable, Thomas.Mingarelli

* Vivek Goyal [2008-11-25 09:27]:
>
> On Sun, Nov 23, 2008 at 02:15:24PM +0100, Bernhard Walle wrote:
> > When the "hpwdt" module is loaded (even if the /dev/watchdog device is not
> > opened), then kdump does not work. The panic kernel either does not start at
> > all or crash in various places.
> > 
> > The problem is that hpwdt_pretimeout is registered with register_die_notifier()
> > with the highest possible priority. Because it returns NOTIFY_STOP, the
> > crash_nmi_callback which is also registered with register_die_notifier() is
> > never executed. This causes the shutdown of other CPUs to fail.
> > 
> > Reverting the order is no option: The crash_nmi_callback executes HLT and so
> > never returns normally. Because of that, it must be executed as last notifier,
> > which currently is done.
> > 
> > So, that patch returns NOTIFY_OK to keep the crash_nmi_callback executed.
> 
> Hi Bernhard,
> 
> Why does this handler need to run after a crash? IOW, even if kdump NMI
> handler halts the cpu, and this handler never gets a chance to run, is 
> that an issue.

Hi Vivek,

Because otherwise the crashkernel receives NMIs and crashes ... it just
doesn't work. The watchdog guys should be able to provide technical
details here.


Regards,
Bernhard


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] [WATCHDOG] Fix kdump when using hpwdt
  2008-11-23 13:15 Bernhard Walle
@ 2008-11-25 14:27 ` Vivek Goyal
  2008-11-25 14:32   ` Bernhard Walle
  0 siblings, 1 reply; 6+ messages in thread
From: Vivek Goyal @ 2008-11-25 14:27 UTC (permalink / raw)
  To: Bernhard Walle; +Cc: akpm, linux-kernel, wim, stable, Thomas.Mingarelli

On Sun, Nov 23, 2008 at 02:15:24PM +0100, Bernhard Walle wrote:
> When the "hpwdt" module is loaded (even if the /dev/watchdog device is not
> opened), then kdump does not work. The panic kernel either does not start at
> all or crash in various places.
> 
> The problem is that hpwdt_pretimeout is registered with register_die_notifier()
> with the highest possible priority. Because it returns NOTIFY_STOP, the
> crash_nmi_callback which is also registered with register_die_notifier() is
> never executed. This causes the shutdown of other CPUs to fail.
> 
> Reverting the order is no option: The crash_nmi_callback executes HLT and so
> never returns normally. Because of that, it must be executed as last notifier,
> which currently is done.
> 
> So, that patch returns NOTIFY_OK to keep the crash_nmi_callback executed.

Hi Bernhard,

Why does this handler need to run after a crash? IOW, even if kdump NMI
handler halts the cpu, and this handler never gets a chance to run, is 
that an issue.

I am getting back to previous discussion of dropping the priority of this
hpwdt. You mentioned that dropping priority will not work as kdump handler
hlts the cpus.  But my point is that kdump handler is registered
dynamically only after a system crash. Does hpwdt need to run then?

Above patch as such should fix the kdump issue (assuming the handler of
this driver will always return back), but I don't understand why does 
it need to run after a crash?

Thanks
Vivek

> 
 > 
> Signed-off-by: Bernhard Walle <bwalle@suse.de>
> Cc: Wim Van Sebroeck <wim@iguana.be>
> Cc: Thomas Mingarelli <thomas.mingarelli@hp.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> ---
>  drivers/watchdog/hpwdt.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
> index a3765e0..21fe202 100644
> --- a/drivers/watchdog/hpwdt.c
> +++ b/drivers/watchdog/hpwdt.c
> @@ -482,7 +482,7 @@ static int hpwdt_pretimeout(struct notifier_block *nb, unsigned long ulReason,
>  			"Management Log for details.\n");
>  	}
>  
> -	return NOTIFY_STOP;
> +	return NOTIFY_OK;
>  }
>  
>  /*
> -- 
> 1.6.0.2

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] [WATCHDOG] Fix kdump when using hpwdt
@ 2008-11-23 13:15 Bernhard Walle
  2008-11-25 14:27 ` Vivek Goyal
  0 siblings, 1 reply; 6+ messages in thread
From: Bernhard Walle @ 2008-11-23 13:15 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, wim, stable, Thomas.Mingarelli, Bernhard Walle,
	Thomas Mingarelli, Vivek Goyal

When the "hpwdt" module is loaded (even if the /dev/watchdog device is not
opened), then kdump does not work. The panic kernel either does not start at
all or crash in various places.

The problem is that hpwdt_pretimeout is registered with register_die_notifier()
with the highest possible priority. Because it returns NOTIFY_STOP, the
crash_nmi_callback which is also registered with register_die_notifier() is
never executed. This causes the shutdown of other CPUs to fail.

Reverting the order is no option: The crash_nmi_callback executes HLT and so
never returns normally. Because of that, it must be executed as last notifier,
which currently is done.

So, that patch returns NOTIFY_OK to keep the crash_nmi_callback executed.


Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: Thomas Mingarelli <thomas.mingarelli@hp.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
---
 drivers/watchdog/hpwdt.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index a3765e0..21fe202 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -482,7 +482,7 @@ static int hpwdt_pretimeout(struct notifier_block *nb, unsigned long ulReason,
 			"Management Log for details.\n");
 	}
 
-	return NOTIFY_STOP;
+	return NOTIFY_OK;
 }
 
 /*
-- 
1.6.0.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-11-25 14:32 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-26 14:59 [PATCH] [WATCHDOG] Fix kdump when using hpwdt Bernhard Walle
2008-10-27 19:30 ` Wim Van Sebroeck
2008-10-27 21:52   ` Bernhard Walle
2008-11-23 13:15 Bernhard Walle
2008-11-25 14:27 ` Vivek Goyal
2008-11-25 14:32   ` Bernhard Walle

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).