LKML Archive on lore.kernel.org
From: Michael Kelley <mikelley@microsoft.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Dave Young <dyoung@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
Andrew Morton <akpm@linux-foundation.org>,
"bhe@redhat.com" <bhe@redhat.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
Eric DeVolder <eric.devolder@oracle.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Tianyu Lan <Tianyu.Lan@microsoft.com>,
Wei Liu <wei.liu@kernel.org>,
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Subject: RE: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
Date: Thu, 24 Sep 2020 16:15:03 +0000
Message-ID: <MW2PR2101MB10521373DD95F5AF014254DDD7390@MW2PR2101MB1052.namprd21.prod.outlook.com>
In-Reply-To: <20200923154825.GC7635@char.us.oracle.com>
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Sent: Wednesday, September 23, 2020 8:48 AM
>
> On Wed, Sep 23, 2020 at 10:43:29AM +0800, Dave Young wrote:
> > + more people who may care about this param
>
> Paarty time!!
>
> (See below, didn't snip any comments)
> > On 09/21/20 at 08:45pm, Eric W. Biederman wrote:
> > > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
> > >
> > > > On Fri, Sep 18, 2020 at 05:47:43PM -0700, Andrew Morton wrote:
> > > >> On Fri, 18 Sep 2020 11:25:46 +0800 Dave Young <dyoung@redhat.com> wrote:
> > > >>
> > > >> > crash_kexec_post_notifiers enables running various panic notifier
> > > >> > before kdump kernel booting. This increases risks of kdump failure.
> > > >> > It is well documented in kernel-parameters.txt. We do not suggest
> > > >> > people to enable it together with kdump unless he/she is really sure.
> > > >> > This is also not suggested to be enabled by default when users are
> > > >> > not aware in distributions.
> > > >> >
> > > >> > But unfortunately it is enabled by default in systemd, see below
> > > >> > discussions in a systemd report, we can not convince systemd to change
> > > >> > it:
> > > >> >
> > > > >> > https://github.com/systemd/systemd/issues/16661
> > > >> >
> > > >> > Actually we have got reports about kdump kernel hangs in both s390x
> > > >> > and powerpcle cases caused by the systemd change, also some x86 cases
> > > >> > could also be caused by the same (although that is in Hyper-V code
> > > >> > instead of systemd, that need to be addressed separately).
> > > >
> > > > Perhaps it may be better to fix the issues on s390x and PowerPC as well?
> > > >
> > > >> >
> > > >> > Thus to avoid the auto enablement here just disable the param writable
> > > >> > permission in sysfs.
> > > >> >
> > > >>
> > > >> Well. I don't think this is at all a desirable way of resolving a
> > > >> disagreement with the systemd developers
> > > >>
> > > >> At the above github address I'm seeing "ryncsn added a commit to
> > > >> ryncsn/systemd that referenced this issue 9 days ago", "pstore: don't
> > > >> enable crash_kexec_post_notifiers by default". So didn't that address
> > > >> the issue?
> > > >
> > > > It does in systemd, but there is a strong interest in making this on
> > > > by default.
> > >
> > > There is also a strong interest in removing this code entirely from the
> > > kernel.
> >
> > Added Hyper-V people and people who created the param, it is below
> > commit, I also want to remove it if possible, let's see how people
> > think, but the least way should be to disable the auto setting in both systemd
> > and kernel:
Hyper-V uses a notifier to inform the host system that a Linux VM has
panic'ed. Informing the host is particularly important in a public cloud
such as Azure so that the cloud software can alert the customer, and can
track cloud-wide reliability statistics. Whether a kdump is taken is controlled
entirely by the customer and how he configures the VM, and we want
the host to be informed either way.
Michael
> >
> > commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45
> > Author: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > Date: Fri Jun 6 14:37:07 2014 -0700
> >
> > kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers
> >
> > Add a "crash_kexec_post_notifiers" boot option to run kdump after
> > running panic_notifiers and dump kmsg. This can help rare situations
> > where kdump fails because of unstable crashed kernel or hardware failure
> > (memory corruption on critical data/code), or the 2nd kernel is already
> > broken by the 1st kernel (it's a broken behavior, but who can guarantee
> > that the "crashed" kernel works correctly?).
> >
> > Usage: add "crash_kexec_post_notifiers" to kernel boot option.
> >
> > Note that this actually increases risks of the failure of kdump. This
> > option should be set only if you worry about the rare case of kdump
> > failure rather than increasing the chance of success.
>
>
> If this is such risky knob that leads to bugs where folks are backing away
> from with disgust in their faces - then perhaps the only way to go about
> this is - limit the exposure to known working situations on firmware
> that we can control?
>
> That is enable only a subset of post notifiers which determine if they
> are OK running if the conditions are blessed?
>
> I think that would satisfy the conditions where you have to deal with unsavory
> bugs that end up on your plate - and aren't fun because there is no
> way of fixing it - but at the same time allowing multiple ways to save the crash?
>
> Please don't take away something that is quite useful in the field. Can we
> hammer out something that will remove your pain points?
> >
> > >
> > > This failure is a case in point.
> > >
> > > I think I am at my I told you so point. This is what all of the testing
> > > over all the years has said. Leaving functionality to the peculiarities
> > > of firmware when you don't have to, and can actually control what is
> > > going on doesn't work.
> > >
> > > Eric
> > >
> > >
> >
> > Thanks
> > Dave
> >