From: "Luck, Tony" <tony.luck@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org, linux-edac@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Sumanth Kamatala <skamatala@juniper.net>
Subject: Re: [PATCH] x86/mce/dev-mcelog: Call mce_register_decode_chain() much earlier
Date: Fri, 20 Aug 2021 07:43:14 -0700
Message-ID: <20210820144314.GA1622759@agluck-desk2.amr.corp.intel.com>
In-Reply-To: <YR+f/fdGIxWcLTP2@zn.tnic>

On Fri, Aug 20, 2021 at 02:28:45PM +0200, Borislav Petkov wrote:
> On Thu, Aug 19, 2021 at 03:44:52PM -0700, Tony Luck wrote:
> > which made sure that the logs were not lost completely by printing
> > to the console. But parsing console logs is error prone. Users
> > of /dev/mcelog should expect to find any early errors logged to
> > standard places.
> 
> Yes, and for that matter, *all* consumers which register on the decoding
> chain should get a chance to look at those records...
> 
> > Split the initialization code in dev-mcelog.c into:
> > 1) an "early" part that registers for mce notifications. Call this
> > directly from mcheck_init() because early_initcall() is still too late.
> > This allocation is too early for kzalloc() so use memblock_alloc().
> > 2) "late" part that registers the /dev/mcelog character device.
> 
> ... but this looks like a hack to me: why aren't we adding those early
> records to the gen_pool and kick the work to consume them *only* *after*
> all consumers have been registered properly and everything is up and
> running?

How can the kernel tell that all consumers have registered? Is there
some new kernel crystal ball functionality that can predict that an
EDAC driver module is going to be loaded at some point in the future
when user space is up and running :-)

I think the best we could do would be to set a timer for some point
far enough out (one minute? two minutes?) to give modules a chance
to load. But this seems even more hacky ... I have no idea how much
time is enough. In this particular case we know that the system
crashed before ... maybe the file systems are going to need
a fsck(8) before modules are loaded?

-Tony
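
A minimal user-space sketch of the alternative discussed above: early
records are only queued, and a later step delivers them to whichever
consumers have registered by then. Every name here (struct record,
log_early, register_consumer, drain_pending, and the two demo consumers)
is hypothetical and chosen for illustration; this is not the kernel's
gen_pool or decode-chain API, only a model of the ordering question.

```c
#include <stddef.h>

#define MAX_PENDING   8
#define MAX_CONSUMERS 4

/* Hypothetical stand-in for an error record; not the kernel's struct mce. */
struct record {
	int bank;
	unsigned long long status;
};

typedef void (*consumer_fn)(const struct record *r);

static struct record pending[MAX_PENDING];
static size_t n_pending;
static consumer_fn consumers[MAX_CONSUMERS];
static size_t n_consumers;

/* Early path: store the record only; consumers are never called from here. */
static int log_early(const struct record *r)
{
	if (n_pending == MAX_PENDING)
		return -1;
	pending[n_pending++] = *r;
	return 0;
}

/* Add a consumer to the list; returns -1 if the list is full. */
static int register_consumer(consumer_fn fn)
{
	if (n_consumers == MAX_CONSUMERS)
		return -1;
	consumers[n_consumers++] = fn;
	return 0;
}

/*
 * Late path: hand every queued record to every consumer registered so far,
 * then empty the queue. Returns the number of records that were drained.
 */
static size_t drain_pending(void)
{
	size_t drained = n_pending;

	for (size_t i = 0; i < n_pending; i++)
		for (size_t j = 0; j < n_consumers; j++)
			consumers[j](&pending[i]);
	n_pending = 0;
	return drained;
}

/* Demo consumers: one registered before the drain, one after it. */
static int builtin_seen;
static int late_seen;

static void builtin_consumer(const struct record *r)
{
	(void)r;
	builtin_seen++;
}

static void late_consumer(const struct record *r)
{
	(void)r;
	late_seen++;
}
```

In this model, a consumer that registers after drain_pending() has run
never sees the early record, which is the concern raised in the reply
above: the drain point has to be chosen without knowing which consumers
are still to come.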

Thread overview: 10+ messages
2021-08-19 22:44 Tony Luck
2021-08-20 12:28 ` Borislav Petkov
2021-08-20 14:43   ` Luck, Tony [this message]
2021-08-20 15:48     ` Borislav Petkov
2021-08-23 18:45       ` Luck, Tony
2021-08-23 20:41         ` [PATCH v2] x86/mce: Defer processing early errors until mcheck_late_init() Luck, Tony
2021-08-23 20:51           ` Borislav Petkov
2021-08-23 21:41             ` Luck, Tony
2021-08-24  0:31             ` [PATCH v3] x86/mce: Defer processing of early errors Luck, Tony
2021-08-24  8:44               ` [tip: ras/core] " tip-bot2 for Borislav Petkov
