LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/3] x86/mce fixes
@ 2018-05-25 21:40 Tony Luck
  2018-05-25 21:41 ` [PATCH 1/3] x86/mce: Improve error message when kernel cannot recover Tony Luck
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Tony Luck @ 2018-05-25 21:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tony Luck, Dan Williams, Qiuxu Zhuo, Ashok Raj, x86, linux-kernel

The first two just get better error messages when we refuse to recover
from an error that the h/w says is recoverable because the error occurred
in a non-recoverable part of the kernel.

The third improves identification of Skylake SKUs that support
recovery.

Tony Luck (3):
  x86/mce: Improve error message when kernel cannot recover.
  x86/mce: Fix incorrect "Machine check from unknown source" message
  x86/mce: Check for alternate indication of machine check recovery on
    Skylake

 arch/x86/kernel/cpu/mcheck/mce-severity.c |  5 +++++
 arch/x86/kernel/cpu/mcheck/mce.c          | 26 ++++++++++++++++-------
 arch/x86/kernel/quirks.c                  | 11 ++++++++--
 3 files changed, 32 insertions(+), 10 deletions(-)

-- 
2.17.0

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/3] x86/mce: Improve error message when kernel cannot recover.
  2018-05-25 21:40 [PATCH 0/3] x86/mce fixes Tony Luck
@ 2018-05-25 21:41 ` Tony Luck
  2018-06-07 20:24   ` [tip:ras/urgent] " tip-bot for Tony Luck
  2018-05-25 21:41 ` [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message Tony Luck
  2018-05-25 21:42 ` [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake Tony Luck
  2 siblings, 1 reply; 28+ messages in thread
From: Tony Luck @ 2018-05-25 21:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tony Luck, Dan Williams, Qiuxu Zhuo, Ashok Raj, x86, linux-kernel

Since we added support to add recovery from some errors inside the kernel in:

commit b2f9d678e28c ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")

we have done a less than stellar job at reporting the cause of recoverable
machine checks that occur in other parts of the kernel. The user just gets
the unhelpful message:

	mce: [Hardware Error]: Machine check: Action required: unknown MCACOD

doubly unhelpful when they check the manual for the reported IA32_MSR_STATUS.MCACOD
and see that it is listed as one of the standard recoverable values.

Add an extra rule to the MCE severity table to catch this case and report it
as:

	mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel

Cc: stable@vger.kernel.org # 4.6+
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 5bbd06f38ff6..f34d89c01edc 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -160,6 +160,11 @@ static struct severity {
 		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_INSTR),
 		USER
 		),
+	MCESEV(
+		PANIC, "Data load in unrecoverable area of kernel",
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
+		KERNEL
+		),
 #endif
 	MCESEV(
 		PANIC, "Action required: unknown MCACOD",
-- 
2.17.0

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-25 21:40 [PATCH 0/3] x86/mce fixes Tony Luck
  2018-05-25 21:41 ` [PATCH 1/3] x86/mce: Improve error message when kernel cannot recover Tony Luck
@ 2018-05-25 21:41 ` Tony Luck
  2018-05-28 20:49   ` Borislav Petkov
                     ` (2 more replies)
  2018-05-25 21:42 ` [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake Tony Luck
  2 siblings, 3 replies; 28+ messages in thread
From: Tony Luck @ 2018-05-25 21:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tony Luck, Dan Williams, Qiuxu Zhuo, Ashok Raj, x86, linux-kernel

Some injection testing resulted in the following console log:

 mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
 mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
 mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
 mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
 mce: [Hardware Error]: Run the above through 'mcelog --ascii'
 Kernel panic - not syncing: Machine check from unknown source

This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".

The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.

It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.

We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a slightly different message (so I can
see if it actually every happens).

Cc: stable@vger.kernel.org # 4.2
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 42cf2880d0ed..f013d97abd5f 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1205,13 +1205,18 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 		lmce = m.mcgstatus & MCG_STATUS_LMCES;
 
 	/*
+	 * Local machine check may already know that we have to panic.
+	 * Broadcast machine check begins rendezvous in mce_start()
 	 * Go through all banks in exclusion of the other CPUs. This way we
 	 * don't report duplicated events on shared banks because the first one
-	 * to see it will clear it. If this is a Local MCE, then no need to
-	 * perform rendezvous.
+	 * to see it will clear it.
 	 */
-	if (!lmce)
+	if (lmce) {
+		if (no_way_out)
+			mce_panic("Fatal local machine check", &m, msg);
+	} else {
 		order = mce_start(&no_way_out);
+	}
 
 	for (i = 0; i < cfg->banks; i++) {
 		__clear_bit(i, toclear);
@@ -1287,12 +1292,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 			no_way_out = worst >= MCE_PANIC_SEVERITY;
 	} else {
 		/*
-		 * Local MCE skipped calling mce_reign()
-		 * If we found a fatal error, we need to panic here.
+		 * If there was a fatal machine check we should have
+		 * already called mce_panic earlier in this function.
+		 * Since we re-read the banks, we might have found
+		 * something new. Check again to see if we found a
+		 * fatal error. We call "mce_severity()" again to
+		 * make sure we have the right "msg".
 		 */
-		 if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
-			mce_panic("Machine check from unknown source",
-				NULL, NULL);
+		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
+			severity = mce_severity(&m, cfg->tolerant, &msg, true);
+			mce_panic("Local fatal machine check!", &m, msg);
+		}
 	}
 
 	/*
-- 
2.17.0

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-05-25 21:40 [PATCH 0/3] x86/mce fixes Tony Luck
  2018-05-25 21:41 ` [PATCH 1/3] x86/mce: Improve error message when kernel cannot recover Tony Luck
  2018-05-25 21:41 ` [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message Tony Luck
@ 2018-05-25 21:42 ` Tony Luck
  2018-06-07 17:43   ` Luck, Tony
  2018-06-07 20:25   ` [tip:ras/urgent] " tip-bot for Tony Luck
  2 siblings, 2 replies; 28+ messages in thread
From: Tony Luck @ 2018-05-25 21:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tony Luck, Dan Williams, Qiuxu Zhuo, Ashok Raj, x86, linux-kernel

Currently we just check the "CAPID0" register to see whether the CPU
can recover from machine checks.

But there are also some special SKUs which do not have all advanced
RAS features, but do enable machine check recovery for use with NVDIMMs.

Add a check for any of bits {8:5} in the "CAPID5" register (each
reports some NVDIMM mode available, if any of them are set, then
the system supports memory machine check recovery).

Cc: stable@vger.kernel.org # 4.9
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/quirks.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 697a4ce04308..736348ead421 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -645,12 +645,19 @@ static void quirk_intel_brickland_xeon_ras_cap(struct pci_dev *pdev)
 /* Skylake */
 static void quirk_intel_purley_xeon_ras_cap(struct pci_dev *pdev)
 {
-	u32 capid0;
+	u32 capid0, capid5;
 
 	pci_read_config_dword(pdev, 0x84, &capid0);
+	pci_read_config_dword(pdev, 0x98, &capid5);
 
-	if ((capid0 & 0xc0) == 0xc0)
+	/*
+	 * CAPID0{7:6} indicate whether this is an advanced RAS SKU
+	 * CAPID5{8:5} indicate that various NVDIMM usage modes are
+	 * enabled, so memory machine check recovery is also enabled.
+	 */
+	if ((capid0 & 0xc0) == 0xc0 || (capid5 & 0x1e0))
 		static_branch_inc(&mcsafe_key);
+
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x0ec3, quirk_intel_brickland_xeon_ras_cap);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2fc0, quirk_intel_brickland_xeon_ras_cap);
-- 
2.17.0

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-25 21:41 ` [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message Tony Luck
@ 2018-05-28 20:49   ` Borislav Petkov
  2018-05-29 16:15     ` [PATCH 2/3 V2] " Luck, Tony
  2018-05-29 18:22     ` [PATCH 2/3] " Raj, Ashok
  2018-05-29 10:42   ` Borislav Petkov
  2018-06-22 12:40   ` [tip:ras/core] " tip-bot for Tony Luck
  2 siblings, 2 replies; 28+ messages in thread
From: Borislav Petkov @ 2018-05-28 20:49 UTC (permalink / raw)
  To: Tony Luck; +Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, x86, linux-kernel

On Fri, May 25, 2018 at 02:41:55PM -0700, Tony Luck wrote:
> @@ -1287,12 +1292,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
>  			no_way_out = worst >= MCE_PANIC_SEVERITY;
>  	} else {
>  		/*
> -		 * Local MCE skipped calling mce_reign()
> -		 * If we found a fatal error, we need to panic here.
> +		 * If there was a fatal machine check we should have
> +		 * already called mce_panic earlier in this function.
> +		 * Since we re-read the banks, we might have found
> +		 * something new. Check again to see if we found a
> +		 * fatal error. We call "mce_severity()" again to
> +		 * make sure we have the right "msg".
>  		 */
> -		 if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
> -			mce_panic("Machine check from unknown source",
> -				NULL, NULL);
> +		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
> +			severity = mce_severity(&m, cfg->tolerant, &msg, true);
> +			mce_panic("Local fatal machine check!", &m, msg);

Haha, this would still make you look at the code to remember was it
"fatal local" or "local fatal" the second one. Yeah, there's the "!" but
still.

How about:

	"Fatal local machine check after banks scan"

or so.

Btw, the code in do_machine_check() has become one helluva spaghetti
mess. It could use some clean up a bit... :)

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-25 21:41 ` [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message Tony Luck
  2018-05-28 20:49   ` Borislav Petkov
@ 2018-05-29 10:42   ` Borislav Petkov
  2018-05-29 16:13     ` Luck, Tony
  2018-06-22 12:40   ` [tip:ras/core] " tip-bot for Tony Luck
  2 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2018-05-29 10:42 UTC (permalink / raw)
  To: Tony Luck; +Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, x86, linux-kernel

On Fri, May 25, 2018 at 02:41:55PM -0700, Tony Luck wrote:
> @@ -1287,12 +1292,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
>  			no_way_out = worst >= MCE_PANIC_SEVERITY;
>  	} else {
>  		/*
> -		 * Local MCE skipped calling mce_reign()
> -		 * If we found a fatal error, we need to panic here.
> +		 * If there was a fatal machine check we should have
> +		 * already called mce_panic earlier in this function.
> +		 * Since we re-read the banks, we might have found
> +		 * something new. Check again to see if we found a
> +		 * fatal error. We call "mce_severity()" again to
> +		 * make sure we have the right "msg".
>  		 */
> -		 if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
> -			mce_panic("Machine check from unknown source",
> -				NULL, NULL);
> +		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
> +			severity = mce_severity(&m, cfg->tolerant, &msg, true);

Looking at this more while cleaning the whole thing up, that severity
doesn't get read anywhere past this line, AFAICT...

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-29 10:42   ` Borislav Petkov
@ 2018-05-29 16:13     ` Luck, Tony
  0 siblings, 0 replies; 28+ messages in thread
From: Luck, Tony @ 2018-05-29 16:13 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, x86, linux-kernel

On Tue, May 29, 2018 at 12:42:22PM +0200, Borislav Petkov wrote:
> > +		 * fatal error. We call "mce_severity()" again to
> > +		 * make sure we have the right "msg".
> >  		 */
> > -		 if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
> > -			mce_panic("Machine check from unknown source",
> > -				NULL, NULL);
> > +		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
> > +			severity = mce_severity(&m, cfg->tolerant, &msg, true);
> 
> Looking at this more while cleaning the whole thing up, that severity
> doesn't get read anywhere past this line, AFAICT...

Just making the call to update "msg" (see comment). But you are right
that we don't need to update the severity variable. I'll fix that in
the re-spin to make the messages more than slightly different.

-Tony

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-28 20:49   ` Borislav Petkov
@ 2018-05-29 16:15     ` Luck, Tony
  2018-05-29 17:41       ` Borislav Petkov
  2018-05-29 18:22     ` [PATCH 2/3] " Raj, Ashok
  1 sibling, 1 reply; 28+ messages in thread
From: Luck, Tony @ 2018-05-29 16:15 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, x86, linux-kernel

Some injection testing resulted in the following console log:

 mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
 mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
 mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
 mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
 mce: [Hardware Error]: Run the above through 'mcelog --ascii'
 Kernel panic - not syncing: Machine check from unknown source

This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".

The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.

It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.

We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a different message (so I can see if it
actually every happens).

Cc: stable@vger.kernel.org # 4.2
Signed-off-by: Tony Luck <tony.luck@intel.com>
---

V2: Boris:
	1) make the "different message" more descriptive. 
	2) don't bother reassigning "severity" variable.

 arch/x86/kernel/cpu/mcheck/mce.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 42cf2880d0ed..f013d97abd5f 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1205,13 +1205,18 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 		lmce = m.mcgstatus & MCG_STATUS_LMCES;
 
 	/*
+	 * Local machine check may already know that we have to panic.
+	 * Broadcast machine check begins rendezvous in mce_start()
 	 * Go through all banks in exclusion of the other CPUs. This way we
 	 * don't report duplicated events on shared banks because the first one
-	 * to see it will clear it. If this is a Local MCE, then no need to
-	 * perform rendezvous.
+	 * to see it will clear it.
 	 */
-	if (!lmce)
+	if (lmce) {
+		if (no_way_out)
+			mce_panic("Fatal local machine check", &m, msg);
+	} else {
 		order = mce_start(&no_way_out);
+	}
 
 	for (i = 0; i < cfg->banks; i++) {
 		__clear_bit(i, toclear);
@@ -1287,12 +1292,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 			no_way_out = worst >= MCE_PANIC_SEVERITY;
 	} else {
 		/*
-		 * Local MCE skipped calling mce_reign()
-		 * If we found a fatal error, we need to panic here.
+		 * If there was a fatal machine check we should have
+		 * already called mce_panic earlier in this function.
+		 * Since we re-read the banks, we might have found
+		 * something new. Check again to see if we found a
+		 * fatal error. We call "mce_severity()" again to
+		 * make sure we have the right "msg".
 		 */
-		 if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
-			mce_panic("Machine check from unknown source",
-				NULL, NULL);
+		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
+			severity = mce_severity(&m, cfg->tolerant, &msg, true);
+			mce_panic("Local fatal machine check!", &m, msg);
+		}
 	}
 
 	/*
-- 
2.17.0

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-29 16:15     ` [PATCH 2/3 V2] " Luck, Tony
@ 2018-05-29 17:41       ` Borislav Petkov
  2018-05-29 17:50         ` Luck, Tony
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2018-05-29 17:41 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, x86, linux-kernel

On Tue, May 29, 2018 at 09:15:49AM -0700, Luck, Tony wrote:
> @@ -1287,12 +1292,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
>  			no_way_out = worst >= MCE_PANIC_SEVERITY;
>  	} else {
>  		/*
> -		 * Local MCE skipped calling mce_reign()
> -		 * If we found a fatal error, we need to panic here.
> +		 * If there was a fatal machine check we should have
> +		 * already called mce_panic earlier in this function.
> +		 * Since we re-read the banks, we might have found
> +		 * something new. Check again to see if we found a
> +		 * fatal error. We call "mce_severity()" again to
> +		 * make sure we have the right "msg".
>  		 */
> -		 if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
> -			mce_panic("Machine check from unknown source",
> -				NULL, NULL);
> +		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
> +			severity = mce_severity(&m, cfg->tolerant, &msg, true);
> +			mce_panic("Local fatal machine check!", &m, msg);
> +		}
>  	}

It is still assigning. I'll simply do:

       if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
               mce_severity(&m, cfg->tolerant, &msg, true);
               mce_panic("Local fatal machine check!", &m, msg);
       }

when committing.

Btw, I did some cleaning up ontop. Please have a look and let me know if
it is too ugly to live... :)

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=rc7%2b0-ras

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-29 17:41       ` Borislav Petkov
@ 2018-05-29 17:50         ` Luck, Tony
  2018-05-29 17:53           ` Borislav Petkov
  0 siblings, 1 reply; 28+ messages in thread
From: Luck, Tony @ 2018-05-29 17:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Williams, Dan J, Zhuo, Qiuxu, Raj, Ashok, x86, linux-kernel

> It is still assigning.

Ah. That would be because I forgot to "git add" before "git commit --amend" :-(

> I'll simply do:
>
>       if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
>               mce_severity(&m, cfg->tolerant, &msg, true);
>               mce_panic("Local fatal machine check!", &m, msg);
>       }
>
> when committing.

I had put:

		(void) mce_severity(&m, cfg->tolerant, &msg, true);

but either works.

> Btw, I did some cleaning up ontop. Please have a look and let me know if
> it is too ugly to live... :)
>
> https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=rc7%2b0-ras

Will look. I hope you didn't do too much to make the stable backport more difficult.

-Tony
 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-29 17:50         ` Luck, Tony
@ 2018-05-29 17:53           ` Borislav Petkov
  2018-05-29 18:54             ` Luck, Tony
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2018-05-29 17:53 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Williams, Dan J, Zhuo, Qiuxu, Raj, Ashok, x86, linux-kernel

On Tue, May 29, 2018 at 05:50:48PM +0000, Luck, Tony wrote:
> Ah. That would be because I forgot to "git add" before "git commit --amend" :-(

Oh, I know the situation very well. :)

> I had put:
> 
> 		(void) mce_severity(&m, cfg->tolerant, &msg, true);
> 
> but either works.

Right.

> Will look. I hope you didn't do too much to make the stable backport more difficult.

Nah, the cleanups will all go ontop. This is just a dirty branch to show
my intention but yours go first and then the cleanup.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-28 20:49   ` Borislav Petkov
  2018-05-29 16:15     ` [PATCH 2/3 V2] " Luck, Tony
@ 2018-05-29 18:22     ` Raj, Ashok
  1 sibling, 0 replies; 28+ messages in thread
From: Raj, Ashok @ 2018-05-29 18:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tony Luck, Dan Williams, Qiuxu Zhuo, x86, linux-kernel, Ashok Raj

On Mon, May 28, 2018 at 10:49:23PM +0200, Borislav Petkov wrote:
> On Fri, May 25, 2018 at 02:41:55PM -0700, Tony Luck wrote:
> > @@ -1287,12 +1292,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
> >  			no_way_out = worst >= MCE_PANIC_SEVERITY;
> >  	} else {
> >  		/*
> > -		 * Local MCE skipped calling mce_reign()
> > -		 * If we found a fatal error, we need to panic here.
> > +		 * If there was a fatal machine check we should have
> > +		 * already called mce_panic earlier in this function.
> > +		 * Since we re-read the banks, we might have found
> > +		 * something new. Check again to see if we found a
> > +		 * fatal error. We call "mce_severity()" again to
> > +		 * make sure we have the right "msg".
> >  		 */
> > -		 if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
> > -			mce_panic("Machine check from unknown source",
> > -				NULL, NULL);
> > +		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
> > +			severity = mce_severity(&m, cfg->tolerant, &msg, true);
> > +			mce_panic("Local fatal machine check!", &m, msg);

If this doesn't affect mcelog parsing, would it make sense to change this from
"fatal" -> "Unrecoverable".. Fatal typically screams PCC=1 for x86, but
some of these cases are its Software recoverable, but just that Kernel 
isn't able to perform recovery.


> 
> Haha, this would still make you look at the code to remember was it
> "fatal local" or "local fatal" the second one. Yeah, there's the "!" but
> still.
> 
> How about:
> 
> 	"Fatal local machine check after banks scan"
> 
> or so.
> 
> Btw, the code in do_machine_check() has become one helluva spaghetti
> mess. It could use some clean up a bit... :)
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
> -- 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-29 17:53           ` Borislav Petkov
@ 2018-05-29 18:54             ` Luck, Tony
  2018-05-29 20:17               ` Dan Williams
  2018-05-30  9:26               ` Borislav Petkov
  0 siblings, 2 replies; 28+ messages in thread
From: Luck, Tony @ 2018-05-29 18:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Williams, Dan J, Zhuo, Qiuxu, Raj, Ashok, x86, linux-kernel

On Tue, May 29, 2018 at 07:53:14PM +0200, Borislav Petkov wrote:
> Nah, the cleanups will all go ontop. This is just a dirty branch to show
> my intention but yours go first and then the cleanup.


Couple of thoughts:

In "x86/mce: Carve out bank scanning code" you drop the extra
call to mce_severity() that I just added:

@@ -1310,10 +1318,8 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 		 * fatal error. We call "mce_severity()" again to
 		 * make sure we have the right "msg".
 		 */
-		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
-			severity = mce_severity(&m, cfg->tolerant, &msg, true);
+		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
 			mce_panic("Local fatal machine check!", &m, msg);
-		}
 	}

But "msg" won't have been filled in with the right message to match
the error in "m" (__mc_scan_banks() doesn't update "msg").



In "x86/mce: Exit properly when no banks to poll" you
leap right to the end.  I'm wondering whether this can
ever happen? I mean, if there are no machine check banks,
then how did we get a machine check?

Both the original, and your new code, skip the:

	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);

which seems bad. That leaves MCG_STATUS.MCIP set ... so a second
machine check would just reset the machine.

-Tony

P.S. What happened to my "part 3/3" (updating the Skylake quirk)
... does that belong in somebody else's tree?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-29 18:54             ` Luck, Tony
@ 2018-05-29 20:17               ` Dan Williams
  2018-05-30  9:26               ` Borislav Petkov
  1 sibling, 0 replies; 28+ messages in thread
From: Dan Williams @ 2018-05-29 20:17 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Borislav Petkov, Zhuo, Qiuxu, Raj, Ashok, x86, linux-kernel

On Tue, May 29, 2018 at 11:54 AM, Luck, Tony <tony.luck@intel.com> wrote:
> On Tue, May 29, 2018 at 07:53:14PM +0200, Borislav Petkov wrote:
[..]
> P.S. What happened to my "part 3/3" (updating the Skylake quirk)
> ... does that belong in somebody else's tree?

I have no qualms taking it through the nvdimm tree with the rest of
the pending memcpy_mcsafe() updates. Just need an x86 maintainer ack.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-29 18:54             ` Luck, Tony
  2018-05-29 20:17               ` Dan Williams
@ 2018-05-30  9:26               ` Borislav Petkov
  2018-06-19 10:30                 ` Borislav Petkov
  1 sibling, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2018-05-30  9:26 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Williams, Dan J, Zhuo, Qiuxu, Raj, Ashok, x86, linux-kernel

On Tue, May 29, 2018 at 11:54:25AM -0700, Luck, Tony wrote:
> Couple of thoughts:

Thanks for looking.

> In "x86/mce: Carve out bank scanning code" you drop the extra
> call to mce_severity() that I just added:

Yeah, did that before we talked about it.

> In "x86/mce: Exit properly when no banks to poll" you
> leap right to the end.  I'm wondering whether this can
> ever happen? I mean, if there are no machine check banks,
> then how did we get a machine check?

Right, so this looks like some remnant from old times, lemme do some
archeology...

/me goes and dusts off the full history linux repo...

I found this:

commit 7dd1e1d805d15ca63d05badf40026629ba75cbc8
Author: Andi Kleen <ak@suse.de>
Date:   Tue Feb 24 17:58:41 2004 -0800

    [PATCH] New machine check handler for x86-64

and there's no mention why the !banks check is there.

I'm wondering if we should simply remove it. I mean, as you say, if
there are no MCA banks, we won't be running in here in the first
place...

> Both the original, and your new code, skip the:
> 
> 	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
> 
> which seems bad. That leaves MCG_STATUS.MCIP set ... so a second
> machine check would just reset the machine.

That's a good point. It goes away as an issue if we simply drop the
check.

> P.S. What happened to my "part 3/3" (updating the Skylake quirk)
> ... does that belong in somebody else's tree?

Simply hadn't reached it yet. I will take it too, eventually.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-05-25 21:42 ` [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake Tony Luck
@ 2018-06-07 17:43   ` Luck, Tony
  2018-06-07 20:18     ` Dan Williams
  2018-06-07 20:25   ` [tip:ras/urgent] " tip-bot for Tony Luck
  1 sibling, 1 reply; 28+ messages in thread
From: Luck, Tony @ 2018-06-07 17:43 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, x86, linux-kernel

On Fri, May 25, 2018 at 02:42:09PM -0700, Tony Luck wrote:
> Currently we just check the "CAPID0" register to see whether the CPU
> can recover from machine checks.
> 
> But there are also some special SKUs which do not have all advanced
> RAS features, but do enable machine check recovery for use with NVDIMMs.
> 
> Add a check for any of bits {8:5} in the "CAPID5" register (each
> reports some NVDIMM mode available, if any of them are set, then
> the system supports memory machine check recovery).
> 
> Cc: stable@vger.kernel.org # 4.9
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---

Has this stalled somewhere?  I'd like to see this one go into the
4.18 merge because it unbreaks some real hardware.

Parts 1 & 2 are nice-to-have, but they just make for better error
messages so aren't as critical.

>  arch/x86/kernel/quirks.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
> index 697a4ce04308..736348ead421 100644
> --- a/arch/x86/kernel/quirks.c
> +++ b/arch/x86/kernel/quirks.c
> @@ -645,12 +645,19 @@ static void quirk_intel_brickland_xeon_ras_cap(struct pci_dev *pdev)
>  /* Skylake */
>  static void quirk_intel_purley_xeon_ras_cap(struct pci_dev *pdev)
>  {
> -	u32 capid0;
> +	u32 capid0, capid5;
>  
>  	pci_read_config_dword(pdev, 0x84, &capid0);
> +	pci_read_config_dword(pdev, 0x98, &capid5);
>  
> -	if ((capid0 & 0xc0) == 0xc0)
> +	/*
> +	 * CAPID0{7:6} indicate whether this is an advanced RAS SKU
> +	 * CAPID5{8:5} indicate that various NVDIMM usage modes are
> +	 * enabled, so memory machine check recovery is also enabled.
> +	 */
> +	if ((capid0 & 0xc0) == 0xc0 || (capid5 & 0x1e0))
>  		static_branch_inc(&mcsafe_key);
> +
>  }
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x0ec3, quirk_intel_brickland_xeon_ras_cap);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2fc0, quirk_intel_brickland_xeon_ras_cap);
> -- 
> 2.17.0
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-06-07 17:43   ` Luck, Tony
@ 2018-06-07 20:18     ` Dan Williams
  2018-06-07 20:24       ` Borislav Petkov
  2018-06-07 20:24       ` Thomas Gleixner
  0 siblings, 2 replies; 28+ messages in thread
From: Dan Williams @ 2018-06-07 20:18 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, Qiuxu Zhuo, Ashok Raj, X86 ML,
	Linux Kernel Mailing List

On Thu, Jun 7, 2018 at 10:43 AM, Luck, Tony <tony.luck@intel.com> wrote:
> On Fri, May 25, 2018 at 02:42:09PM -0700, Tony Luck wrote:
>> Currently we just check the "CAPID0" register to see whether the CPU
>> can recover from machine checks.
>>
>> But there are also some special SKUs which do not have all advanced
>> RAS features, but do enable machine check recovery for use with NVDIMMs.
>>
>> Add a check for any of bits {8:5} in the "CAPID5" register (each
>> reports some NVDIMM mode available, if any of them are set, then
>> the system supports memory machine check recovery).
>>
>> Cc: stable@vger.kernel.org # 4.9
>> Signed-off-by: Tony Luck <tony.luck@intel.com>
>> ---
>
> Has this stalled somewhere?  I'd like to see this one go into the
> 4.18 merge because it unbreaks some real hardware.
>
> Parts 1 & 2 are nice-to-have, but they just make for better error
> messages so aren't as critical.

I'm making an effort to get all persistent memory error handling holes
covered this cycle, so I think it makes sense for this to go through
the nvdimm tree. This looks sufficiently non-controversial that I
could justify sending it to Linus along with the other pmem updates.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-06-07 20:18     ` Dan Williams
@ 2018-06-07 20:24       ` Borislav Petkov
  2018-06-07 22:26         ` Luck, Tony
  2018-06-14 21:57         ` Luck, Tony
  2018-06-07 20:24       ` Thomas Gleixner
  1 sibling, 2 replies; 28+ messages in thread
From: Borislav Petkov @ 2018-06-07 20:24 UTC (permalink / raw)
  To: Dan Williams
  Cc: Luck, Tony, Qiuxu Zhuo, Ashok Raj, X86 ML, Linux Kernel Mailing List

On Thu, Jun 07, 2018 at 01:18:31PM -0700, Dan Williams wrote:
> I'm making an effort to get all persistent memory error handling holes
> covered this cycle, so I think it makes sense for this to go through
> the nvdimm tree. This looks sufficiently non-controversial that I
> could justify sending it to Linus along with the other pmem updates.

tglx just took 1 and 3, 2/3 had a minor issue but the merge window
happened so I'll send it later. It is nice to have anyway.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [tip:ras/urgent] x86/mce: Improve error message when kernel cannot recover
  2018-05-25 21:41 ` [PATCH 1/3] x86/mce: Improve error message when kernel cannot recover Tony Luck
@ 2018-06-07 20:24   ` tip-bot for Tony Luck
  0 siblings, 0 replies; 28+ messages in thread
From: tip-bot for Tony Luck @ 2018-06-07 20:24 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tony.luck, hpa, mingo, linux-kernel, ashok.raj, dan.j.williams,
	bp, tglx, qiuxu.zhuo

Commit-ID:  c7d606f560e4c698884697fef503e4abacdd8c25
Gitweb:     https://git.kernel.org/tip/c7d606f560e4c698884697fef503e4abacdd8c25
Author:     Tony Luck <tony.luck@intel.com>
AuthorDate: Fri, 25 May 2018 14:41:39 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 7 Jun 2018 22:22:12 +0200

x86/mce: Improve error message when kernel cannot recover

Since we added support to add recovery from some errors inside the kernel in:

commit b2f9d678e28c ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")

we have done a less than stellar job at reporting the cause of recoverable
machine checks that occur in other parts of the kernel. The user just gets
the unhelpful message:

	mce: [Hardware Error]: Machine check: Action required: unknown MCACOD

doubly unhelpful when they check the manual for the reported IA32_MSR_STATUS.MCACOD
and see that it is listed as one of the standard recoverable values.

Add an extra rule to the MCE severity table to catch this case and report it
as:

	mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel

Fixes: b2f9d678e28c ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org # 4.6+
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/4cc7c465150a9a48b8b9f45d0b840278e77eb9b5.1527283897.git.tony.luck@intel.com

---
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 5bbd06f38ff6..f34d89c01edc 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -160,6 +160,11 @@ static struct severity {
 		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_INSTR),
 		USER
 		),
+	MCESEV(
+		PANIC, "Data load in unrecoverable area of kernel",
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
+		KERNEL
+		),
 #endif
 	MCESEV(
 		PANIC, "Action required: unknown MCACOD",

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-06-07 20:18     ` Dan Williams
  2018-06-07 20:24       ` Borislav Petkov
@ 2018-06-07 20:24       ` Thomas Gleixner
  1 sibling, 0 replies; 28+ messages in thread
From: Thomas Gleixner @ 2018-06-07 20:24 UTC (permalink / raw)
  To: Dan Williams
  Cc: Luck, Tony, Borislav Petkov, Qiuxu Zhuo, Ashok Raj, X86 ML,
	Linux Kernel Mailing List


On Thu, 7 Jun 2018, Dan Williams wrote:

> On Thu, Jun 7, 2018 at 10:43 AM, Luck, Tony <tony.luck@intel.com> wrote:
> > On Fri, May 25, 2018 at 02:42:09PM -0700, Tony Luck wrote:
> >> Currently we just check the "CAPID0" register to see whether the CPU
> >> can recover from machine checks.
> >>
> >> But there are also some special SKUs which do not have all advanced
> >> RAS features, but do enable machine check recovery for use with NVDIMMs.
> >>
> >> Add a check for any of bits {8:5} in the "CAPID5" register (each
> >> reports some NVDIMM mode available, if any of them are set, then
> >> the system supports memory machine check recovery).
> >>
> >> Cc: stable@vger.kernel.org # 4.9
> >> Signed-off-by: Tony Luck <tony.luck@intel.com>
> >> ---
> >
> > Has this stalled somewhere?  I'd like to see this one go into the
> > 4.18 merge because it unbreaks some real hardware.
> >
> > Parts 1 & 2 are nice-to-have, but they just make for better error
> > messages so aren't as critical.
> 
> I'm making an effort to get all persistent memory error handling holes
> covered this cycle, so I think it makes sense for this to go through
> the nvdimm tree. This looks sufficiently non-controversial that I
> could justify sending it to Linus along with the other pmem updates.

I've picked it up already and please can we let stuff go through the right
trees? The worlds does not stop turning if a fix goes in 2 days later.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [tip:ras/urgent] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-05-25 21:42 ` [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake Tony Luck
  2018-06-07 17:43   ` Luck, Tony
@ 2018-06-07 20:25   ` tip-bot for Tony Luck
  1 sibling, 0 replies; 28+ messages in thread
From: tip-bot for Tony Luck @ 2018-06-07 20:25 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, dan.j.williams, bp, mingo, ashok.raj, linux-kernel,
	qiuxu.zhuo, tony.luck, hpa

Commit-ID:  4c5717da1d021cf368eabb3cb1adcaead56c0d1e
Gitweb:     https://git.kernel.org/tip/4c5717da1d021cf368eabb3cb1adcaead56c0d1e
Author:     Tony Luck <tony.luck@intel.com>
AuthorDate: Fri, 25 May 2018 14:42:09 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 7 Jun 2018 22:22:12 +0200

x86/mce: Check for alternate indication of machine check recovery on Skylake

Currently we just check the "CAPID0" register to see whether the CPU
can recover from machine checks.

But there are also some special SKUs which do not have all advanced
RAS features, but do enable machine check recovery for use with NVDIMMs.

Add a check for any of bits {8:5} in the "CAPID5" register (each
reports some NVDIMM mode available, if any of them are set, then
the system supports memory machine check recovery).

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org # 4.9
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/03cbed6e99ddafb51c2eadf9a3b7c8d7a0cc204e.1527283897.git.tony.luck@intel.com

---
 arch/x86/kernel/quirks.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 697a4ce04308..736348ead421 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -645,12 +645,19 @@ static void quirk_intel_brickland_xeon_ras_cap(struct pci_dev *pdev)
 /* Skylake */
 static void quirk_intel_purley_xeon_ras_cap(struct pci_dev *pdev)
 {
-	u32 capid0;
+	u32 capid0, capid5;
 
 	pci_read_config_dword(pdev, 0x84, &capid0);
+	pci_read_config_dword(pdev, 0x98, &capid5);
 
-	if ((capid0 & 0xc0) == 0xc0)
+	/*
+	 * CAPID0{7:6} indicate whether this is an advanced RAS SKU
+	 * CAPID5{8:5} indicate that various NVDIMM usage modes are
+	 * enabled, so memory machine check recovery is also enabled.
+	 */
+	if ((capid0 & 0xc0) == 0xc0 || (capid5 & 0x1e0))
 		static_branch_inc(&mcsafe_key);
+
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x0ec3, quirk_intel_brickland_xeon_ras_cap);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2fc0, quirk_intel_brickland_xeon_ras_cap);

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-06-07 20:24       ` Borislav Petkov
@ 2018-06-07 22:26         ` Luck, Tony
  2018-06-14 21:57         ` Luck, Tony
  1 sibling, 0 replies; 28+ messages in thread
From: Luck, Tony @ 2018-06-07 22:26 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, X86 ML, Linux Kernel Mailing List

On Thu, Jun 07, 2018 at 10:24:46PM +0200, Borislav Petkov wrote:
> On Thu, Jun 07, 2018 at 01:18:31PM -0700, Dan Williams wrote:
> > I'm making an effort to get all persistent memory error handling holes
> > covered this cycle, so I think it makes sense for this to go through
> > the nvdimm tree. This looks sufficiently non-controversial that I
> > could justify sending it to Linus along with the other pmem updates.
> 
> tglx just took 1 and 3, 2/3 had a minor issue but the merge window
> happened so I'll send it later. It is nice to have anyway.

Thanks Boris.

-Tony

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-06-07 20:24       ` Borislav Petkov
  2018-06-07 22:26         ` Luck, Tony
@ 2018-06-14 21:57         ` Luck, Tony
  2018-06-15 11:45           ` Borislav Petkov
  1 sibling, 1 reply; 28+ messages in thread
From: Luck, Tony @ 2018-06-14 21:57 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, X86 ML, Linux Kernel Mailing List

On Thu, Jun 07, 2018 at 10:24:46PM +0200, Borislav Petkov wrote:
> tglx just took 1 and 3, 2/3 had a minor issue but the merge window
> happened so I'll send it later. It is nice to have anyway.

Did you fix up part 2/3?  I see 1 & 3 were staged by Thomas in
TIP ras/urgent and ras-urgent-for-linus but haven't gone into
Linus' tree yet.

-Tony

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-06-14 21:57         ` Luck, Tony
@ 2018-06-15 11:45           ` Borislav Petkov
  2018-06-15 16:34             ` Luck, Tony
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2018-06-15 11:45 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, X86 ML, Linux Kernel Mailing List

On Thu, Jun 14, 2018 at 02:57:54PM -0700, Luck, Tony wrote:
> On Thu, Jun 07, 2018 at 10:24:46PM +0200, Borislav Petkov wrote:
> > tglx just took 1 and 3, 2/3 had a minor issue but the merge window
> > happened so I'll send it later. It is nice to have anyway.
> 
> Did you fix up part 2/3?

You said "Parts 1 & 2 are nice-to-have" and we have merge window now. So
what exactly do you mean?

> I see 1 & 3 were staged by Thomas in TIP ras/urgent and
> ras-urgent-for-linus but haven't gone into Linus' tree yet.

Merge window ain't over yet :)

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-06-15 11:45           ` Borislav Petkov
@ 2018-06-15 16:34             ` Luck, Tony
  2018-06-15 17:16               ` Borislav Petkov
  0 siblings, 1 reply; 28+ messages in thread
From: Luck, Tony @ 2018-06-15 16:34 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, X86 ML, Linux Kernel Mailing List

On Fri, Jun 15, 2018 at 01:45:18PM +0200, Borislav Petkov wrote:
> On Thu, Jun 14, 2018 at 02:57:54PM -0700, Luck, Tony wrote:
> > On Thu, Jun 07, 2018 at 10:24:46PM +0200, Borislav Petkov wrote:
> > > tglx just took 1 and 3, 2/3 had a minor issue but the merge window
> > > happened so I'll send it later. It is nice to have anyway.
> > 
> > Did you fix up part 2/3?
> 
> You said "Parts 1 & 2 are nice-to-have" and we have merge window now. So
> what exactly do you mean?
> 
> > I see 1 & 3 were staged by Thomas in TIP ras/urgent and
> > ras-urgent-for-linus but haven't gone into Linus' tree yet.
> 
> Merge window ain't over yet :)

"send it later" could be read a couple of ways.

I was just worried that Thomas is holding off asking Linus to pull
because he's waiting for part 2/3, while you might be planning to
hold onto part 2/3 for the next merge (and add the other cleanups
you RFC'd).

I'm OK either way on part 2/3 (it just makes for better error messages,
it doesn't fix any bugs).

-Tony

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake
  2018-06-15 16:34             ` Luck, Tony
@ 2018-06-15 17:16               ` Borislav Petkov
  0 siblings, 0 replies; 28+ messages in thread
From: Borislav Petkov @ 2018-06-15 17:16 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Dan Williams, Qiuxu Zhuo, Ashok Raj, X86 ML, Linux Kernel Mailing List

On Fri, Jun 15, 2018 at 09:34:06AM -0700, Luck, Tony wrote:
> I was just worried that Thomas is holding off asking Linus to pull
> because he's waiting for part 2/3, while you might be planning to
> hold onto part 2/3 for the next merge (and add the other cleanups
> you RFC'd).
> 
> I'm OK either way on part 2/3 (it just makes for better error messages,
> it doesn't fix any bugs).

Yes, and this is my understanding too - 2/3 is not urgent material so no
need to hurry that particular one.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-30  9:26               ` Borislav Petkov
@ 2018-06-19 10:30                 ` Borislav Petkov
  0 siblings, 0 replies; 28+ messages in thread
From: Borislav Petkov @ 2018-06-19 10:30 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Williams, Dan J, Zhuo, Qiuxu, Raj, Ashok, x86, linux-kernel

On Wed, May 30, 2018 at 11:26:32AM +0200, Borislav Petkov wrote:
> > In "x86/mce: Exit properly when no banks to poll" you
> > leap right to the end.  I'm wondering whether this can
> > ever happen? I mean, if there are no machine check banks,
> > then how did we get a machine check?
> 
> Right, so this looks like some remnant from old times, lemme do some
> archeology...
> 
> /me goes and dusts off the full history linux repo...
> 
> I found this:
> 
> commit 7dd1e1d805d15ca63d05badf40026629ba75cbc8
> Author: Andi Kleen <ak@suse.de>
> Date:   Tue Feb 24 17:58:41 2004 -0800
> 
>     [PATCH] New machine check handler for x86-64
> 
> and there's no mention why the !banks check is there.
> 
> I'm wondering if we should simply remove it. I mean, as you say, if
> there are no MCA banks, we won't be running in here in the first
> place...
> 
> > Both the original, and your new code, skip the:
> > 
> > 	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
> > 
> > which seems bad. That leaves MCG_STATUS.MCIP set ... so a second
> > machine check would just reset the machine.
> 
> That's a good point. It goes away as an issue if we simply drop the
> check.

...and gone it is.

---
From: Borislav Petkov <bp@suse.de>
Date: Tue, 19 Jun 2018 12:27:02 +0200
Subject: [PATCH] x86/mce: Remove !banks check

If we don't have MCA banks, we won't see machine checks anyway. Drop the
check.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mcheck/mce.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index ea1521ec7e5b..05d2d15a95bd 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1185,9 +1185,6 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 
 	this_cpu_inc(mce_exception_count);
 
-	if (!cfg->banks)
-		goto out;
-
 	mce_gather_info(&m, regs);
 	m.tsc = rdtsc();
 
@@ -1327,7 +1324,7 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	if (worst > 0)
 		mce_report_event(regs);
 	mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
-out:
+
 	sync_core();
 
 	if (worst != MCE_AR_SEVERITY && !kill_it)
-- 
2.17.0.582.gccdcbd54c

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [tip:ras/core] x86/mce: Fix incorrect "Machine check from unknown source" message
  2018-05-25 21:41 ` [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message Tony Luck
  2018-05-28 20:49   ` Borislav Petkov
  2018-05-29 10:42   ` Borislav Petkov
@ 2018-06-22 12:40   ` tip-bot for Tony Luck
  2 siblings, 0 replies; 28+ messages in thread
From: tip-bot for Tony Luck @ 2018-06-22 12:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, linux-edac, dan.j.williams, ashok.raj, tony.luck,
	qiuxu.zhuo, bp, mingo, hpa, linux-kernel

Commit-ID:  40c36e2741d7fe1e66d6ec55477ba5fd19c9c5d2
Gitweb:     https://git.kernel.org/tip/40c36e2741d7fe1e66d6ec55477ba5fd19c9c5d2
Author:     Tony Luck <tony.luck@intel.com>
AuthorDate: Fri, 22 Jun 2018 11:54:23 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 22 Jun 2018 14:35:50 +0200

x86/mce: Fix incorrect "Machine check from unknown source" message

Some injection testing resulted in the following console log:

  mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
  mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
  mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
  mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
  mce: [Hardware Error]: Run the above through 'mcelog --ascii'
  Kernel panic - not syncing: Machine check from unknown source

This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".

The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.

It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.

We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a slightly different message (so I can
see if it actually every happens).

[ bp: Remove unneeded severity assignment. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org # 4.2
Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com

---
 arch/x86/kernel/cpu/mcheck/mce.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 7e6f51a9d917..e93670d736a6 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1207,13 +1207,18 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 		lmce = m.mcgstatus & MCG_STATUS_LMCES;
 
 	/*
+	 * Local machine check may already know that we have to panic.
+	 * Broadcast machine check begins rendezvous in mce_start()
 	 * Go through all banks in exclusion of the other CPUs. This way we
 	 * don't report duplicated events on shared banks because the first one
-	 * to see it will clear it. If this is a Local MCE, then no need to
-	 * perform rendezvous.
+	 * to see it will clear it.
 	 */
-	if (!lmce)
+	if (lmce) {
+		if (no_way_out)
+			mce_panic("Fatal local machine check", &m, msg);
+	} else {
 		order = mce_start(&no_way_out);
+	}
 
 	for (i = 0; i < cfg->banks; i++) {
 		__clear_bit(i, toclear);
@@ -1289,12 +1294,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 			no_way_out = worst >= MCE_PANIC_SEVERITY;
 	} else {
 		/*
-		 * Local MCE skipped calling mce_reign()
-		 * If we found a fatal error, we need to panic here.
+		 * If there was a fatal machine check we should have
+		 * already called mce_panic earlier in this function.
+		 * Since we re-read the banks, we might have found
+		 * something new. Check again to see if we found a
+		 * fatal error. We call "mce_severity()" again to
+		 * make sure we have the right "msg".
 		 */
-		 if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3)
-			mce_panic("Machine check from unknown source",
-				NULL, NULL);
+		if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) {
+			mce_severity(&m, cfg->tolerant, &msg, true);
+			mce_panic("Local fatal machine check!", &m, msg);
+		}
 	}
 
 	/*

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2018-06-22 12:40 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-25 21:40 [PATCH 0/3] x86/mce fixes Tony Luck
2018-05-25 21:41 ` [PATCH 1/3] x86/mce: Improve error message when kernel cannot recover Tony Luck
2018-06-07 20:24   ` [tip:ras/urgent] " tip-bot for Tony Luck
2018-05-25 21:41 ` [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message Tony Luck
2018-05-28 20:49   ` Borislav Petkov
2018-05-29 16:15     ` [PATCH 2/3 V2] " Luck, Tony
2018-05-29 17:41       ` Borislav Petkov
2018-05-29 17:50         ` Luck, Tony
2018-05-29 17:53           ` Borislav Petkov
2018-05-29 18:54             ` Luck, Tony
2018-05-29 20:17               ` Dan Williams
2018-05-30  9:26               ` Borislav Petkov
2018-06-19 10:30                 ` Borislav Petkov
2018-05-29 18:22     ` [PATCH 2/3] " Raj, Ashok
2018-05-29 10:42   ` Borislav Petkov
2018-05-29 16:13     ` Luck, Tony
2018-06-22 12:40   ` [tip:ras/core] " tip-bot for Tony Luck
2018-05-25 21:42 ` [PATCH 3/3] x86/mce: Check for alternate indication of machine check recovery on Skylake Tony Luck
2018-06-07 17:43   ` Luck, Tony
2018-06-07 20:18     ` Dan Williams
2018-06-07 20:24       ` Borislav Petkov
2018-06-07 22:26         ` Luck, Tony
2018-06-14 21:57         ` Luck, Tony
2018-06-15 11:45           ` Borislav Petkov
2018-06-15 16:34             ` Luck, Tony
2018-06-15 17:16               ` Borislav Petkov
2018-06-07 20:24       ` Thomas Gleixner
2018-06-07 20:25   ` [tip:ras/urgent] " tip-bot for Tony Luck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).