LKML Archive on lore.kernel.org
From: Jue Wang <juew@google.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: "Borislav Petkov" <bp@alien8.de>,
	"dinghui@sangfor.com.cn" <dinghui@sangfor.com.cn>,
	"huangcun@sangfor.com.cn" <huangcun@sangfor.com.cn>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
	"Oscar Salvador" <osalvador@suse.de>, x86 <x86@kernel.org>,
	"Song, Youquan" <youquan.song@intel.com>
Subject: Re: [PATCH 2/3] x86/mce: Avoid infinite loop for copy from user recovery
Date: Thu, 22 Jul 2021 21:16:40 -0700
Message-ID: <CAPcxDJ7=UsAkDwVuoQcTt2B2UA4RWjs_o_=Fnk4Hfuqj+V8hAA@mail.gmail.com>
In-Reply-To: <0e39ef0e1b6d4532a09ad2d6e0b28310@intel.com>

On Thu, Jul 22, 2021 at 9:01 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> >> I'm not aware of, nor expecting to find, places where the kernel
> >> tries to access user address A and hits poison, and then tries to
> >> access user address B (without returning to user between access
> >> A and access B).
> > This seems a reasonably easy scenario.
> >
> > A user space app allocates a buffer of xyz KB/MB/GB.
> >
> > Unfortunately the dimms are bad and multiple cache lines have
> > uncorrectable errors in them on different pages.
> >
> > Then the user space app tries to write the content of the buffer into some
> > file via write(2) from the entire buffer in one go.
>
> Before this patch Linux gets into an infinite loop taking machine
> checks on the first of the poison addresses in the buffer.
>
> With this patch (and also patch 3/3 in this series), there are
> a few machine checks on the first poison address (I think the number
> depends on the alignment of the poison within a page ... but I'm
> not sure). My test code shows 4 machine checks at the same
> address. Then Linux returns a short byte count to the user
> showing how many bytes were actually written to the file.
>
> The fact that there are many more poison lines in the buffer
> beyond the place where the write stopped on the first one is
> irrelevant.
In our test, the application memory was anon.
With 1 UC error injected, the test always passes with the error
recovered and a SIGBUS delivered to user space.

When there is more than one UC error in the buffer, we see an indefinite MCE loop.
>
> [Well, if the second poisoned line is immediately after the first
> you may hit h/w prefetch issues and h/w may signal a fatal
> machine check ... but that's a different problem that s/w could
> only solve with painful LFENCE operations between each 64-bytes
> of the copy]
>
> -Tony
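
For reference, the user-space view of the "short byte count" behaviour described above can be sketched as a write loop that reports how many bytes actually reached the file. This is only an illustration, not code from the patch series: the helper name write_all and its writer parameter are hypothetical, and the assumption that a retry after the short write fails with an error (rather than the task receiving SIGBUS) is not confirmed anywhere in this thread.

```python
import os


def write_all(fd, buf, writer=os.write):
    """Write buf to fd and return the number of bytes actually written.

    write(2) may return fewer bytes than requested. In the scenario
    discussed in this thread, the kernel stops copying at the first
    poisoned cache line and returns the count written so far.

    `writer` is a hypothetical hook so the short-write path can be
    exercised without real memory errors; it defaults to os.write.
    """
    view = memoryview(buf)
    total = 0
    while total < len(view):
        try:
            n = writer(fd, view[total:])
        except OSError:
            # Assumption: the retry past the short write fails with an
            # error such as EFAULT. Report what got through so far.
            break
        if n == 0:
            break
        total += n
    return total
```

A caller that compares the return value against len(buf) can tell that the write stopped early, which is the only signal the short count gives; it says nothing about further poisoned lines later in the buffer.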
