LKML Archive on
help / color / mirror / Atom feed
From: "Luck, Tony" <>
To: Jue Wang <>
Cc: "Borislav Petkov" <>,
	"" <>,
	"" <>,
	"" <>,
	"" <>,
	"HORIGUCHI NAOYA(堀口 直也)" <>,
	"Oscar Salvador" <>, x86 <>,
	"Song, Youquan" <>
Subject: RE: [PATCH 2/3] x86/mce: Avoid infinite loop for copy from user recovery
Date: Fri, 23 Jul 2021 04:01:44 +0000	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

>> I'm not aware of, nor expecting to find, places where the kernel
>> tries to access user address A and hits poison, and then tries to
>> access user address B (without returrning to user between access
>> A and access B).
>This seems a reasonablely easy scenario.
> A user space app allocates a buffer of xyz KB/MB/GB.
> Unfortunately the dimms are bad and multiple cache lines have
> uncorrectable errors in them on different pages.
> Then the user space app tries to write the content of the buffer into some
> file via write(2) from the entire buffer in one go.

Before this patch Linux gets into an infinite loop taking machine
checks on the first of the poison addresses in the buffer.

With this patch (and also patch 3/3 in this series). There are
a few machine checks on the first poison address (I think the number
depends on the alignment of the poison within a page ... but I'm
not sure). My test code shows 4 machine checks at the same
address. Then Linux returns a short byte count to the user
showing how many bytes were actually written to the file.

The fast that there are many more poison lines in the buffer
beyond the place where the write stopped on the first one is

[Well, if the second poisoned line is immediately after the first
you may hit h/w prefetch issues and h/w may signal a fatal
machine check ... but that's a different problem that s/w could
only solve with painful LFENCE operations between each 64-bytes
of the copy]


  reply	other threads:[~2021-07-23  4:03 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-22 13:54 Jue Wang
2021-07-22 15:19 ` Luck, Tony
2021-07-22 23:30   ` Jue Wang
2021-07-23  0:14     ` Luck, Tony
2021-07-23  3:47       ` Jue Wang
2021-07-23  4:01         ` Luck, Tony [this message]
2021-07-23  4:16           ` Jue Wang
2021-07-23 14:47             ` Luck, Tony
  -- strict thread matches above, loose matches on Subject: below --
2021-07-31  6:30 Jue Wang
2021-07-31 20:43 ` Luck, Tony
2021-08-02 15:29   ` Jue Wang
2021-07-06 19:06 [PATCH 0/3] More machine check recovery fixes Tony Luck
2021-07-06 19:06 ` [PATCH 2/3] x86/mce: Avoid infinite loop for copy from user recovery Tony Luck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \ \ \ \ \
    --subject='RE: [PATCH 2/3] x86/mce: Avoid infinite loop for copy from user recovery' \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).