LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: "Luck, Tony" <tony.luck@intel.com>
To: Jue Wang <juew@google.com>
Cc: "Borislav Petkov" <bp@alien8.de>,
	dinghui@sangfor.com.cn, huangcun@sangfor.com.cn,
	linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
	"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
	"Oscar Salvador" <osalvador@suse.de>, x86 <x86@kernel.org>,
	"Song, Youquan" <youquan.song@intel.com>
Subject: Re: [PATCH 2/3] x86/mce: Avoid infinite loop for copy from user recovery
Date: Thu, 22 Jul 2021 08:19:30 -0700	[thread overview]
Message-ID: <20210722151930.GA1453521@agluck-desk2.amr.corp.intel.com> (raw)
In-Reply-To: <CAPcxDJ7YsnYtyzSmgfBj-rmALkjigKx2ODB=SCYCzY8FJYg4iA@mail.gmail.com>

On Thu, Jul 22, 2021 at 06:54:37AM -0700, Jue Wang wrote:
> This patch assumes the UC error consumed in kernel is always the same UC.
> 
> Yet it's possible two UCs on different pages are consumed in a row.
> The patch below will panic on the 2nd MCE. How can we make the code works
> on multiple UC errors?
> 
> 
> > + int count = ++current->mce_count;
> > +
> > + /* First call, save all the details */
> > + if (count == 1) {
> > + current->mce_addr = m->addr;
> > + current->mce_kflags = m->kflags;
> > + current->mce_ripv = !!(m->mcgstatus & MCG_STATUS_RIPV);
> > + current->mce_whole_page = whole_page(m);
> > + current->mce_kill_me.func = func;
> > + }
> > ......
> > + /* Second or later call, make sure page address matches the one from first call */
> > + if (count > 1 && (current->mce_addr >> PAGE_SHIFT) != (m->addr >> PAGE_SHIFT))
> > + mce_panic("Machine checks to different user pages", m, msg);

The issue is getting the information about the location
of the error from the machine check handler to the "task_work"
function that processes it. Currently there is a single place
to store the address of the error in the task structure:

	current->mce_addr = m->addr;

Plausibly that could be made into an array, indexed by
current->mce_count to save mutiple addresses (perhaps
also need mce_kflags, mce_ripv, etc. to also be arrays).

But I don't want to pre-emptively make such a change without
some data to show that situations arise with multiple errors
to different addresses:
1) Actually occur
2) Would be recovered if we made the change.

The first would be indicated by seeing the:

	"Machine checks to different user pages"

panic. You'd have to code up the change to have arrays
to confirm that would fix the problem.

-Tony

  reply	other threads:[~2021-07-22 15:21 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-22 13:54 Jue Wang
2021-07-22 15:19 ` Luck, Tony [this message]
2021-07-22 23:30   ` Jue Wang
2021-07-23  0:14     ` Luck, Tony
2021-07-23  3:47       ` Jue Wang
2021-07-23  4:01         ` Luck, Tony
2021-07-23  4:16           ` Jue Wang
2021-07-23 14:47             ` Luck, Tony
  -- strict thread matches above, loose matches on Subject: below --
2021-07-31  6:30 Jue Wang
2021-07-31 20:43 ` Luck, Tony
2021-08-02 15:29   ` Jue Wang
2021-07-06 19:06 [PATCH 0/3] More machine check recovery fixes Tony Luck
2021-07-06 19:06 ` [PATCH 2/3] x86/mce: Avoid infinite loop for copy from user recovery Tony Luck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210722151930.GA1453521@agluck-desk2.amr.corp.intel.com \
    --to=tony.luck@intel.com \
    --cc=bp@alien8.de \
    --cc=dinghui@sangfor.com.cn \
    --cc=huangcun@sangfor.com.cn \
    --cc=juew@google.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=osalvador@suse.de \
    --cc=x86@kernel.org \
    --cc=youquan.song@intel.com \
    --subject='Re: [PATCH 2/3] x86/mce: Avoid infinite loop for copy from user recovery' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).