LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v2 0/2] Introduce the pkill_on_warn parameter
@ 2021-10-27 23:32 Alexander Popov
  2021-10-27 23:32 ` [PATCH v2 1/2] bug: do refactoring allowing to add a warning handling action Alexander Popov
                   ` (3 more replies)
  0 siblings, 4 replies; 31+ messages in thread
From: Alexander Popov @ 2021-10-27 23:32 UTC (permalink / raw)
  To: Jonathan Corbet, Linus Torvalds, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Alexander Popov, kernel-hardening,
	linux-hardening, linux-doc, linux-arch, linux-kernel,
	linux-fsdevel
  Cc: notify

Hello! This is the v2 of pkill_on_warn.
Changes from v1 and tricks for testing are described below.

Rationale
=========

Currently, the Linux kernel provides two types of reaction to kernel
warnings:
 1. Do nothing (by default),
 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
    so panic_on_warn is usually disabled on production systems.

From a safety point of view, the Linux kernel misses a middle way of
handling kernel warnings:
 - The kernel should stop the activity that provokes a warning,
 - But the kernel should avoid complete denial of service.

From a security point of view, kernel warning messages provide a lot of
useful information for attackers. Many GNU/Linux distributions allow
unprivileged users to read the kernel log, so attackers use kernel
warning infoleak in vulnerability exploits. See the examples:
https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html

Let's introduce the pkill_on_warn sysctl.
If this parameter is set, the kernel kills all threads in a process that
provoked a kernel warning. This behavior is reasonable from a safety point of
view described above. It is also useful for kernel security hardening because
the system kills an exploit process that hits a kernel warning.

Moreover, bugs usually don't come alone, and a kernel warning may be
followed by memory corruption or other bad effects. So pkill_on_warn allows
the kernel to stop the process when the first signs of wrong behavior
are detected.


Changes from v1
===============

1) Introduce do_pkill_on_warn() and call it in all warning handling paths.

2) Do refactoring without functional changes in a separate patch.

3) Avoid killing init and kthreads.

4) Use do_send_sig_info() instead of do_group_exit().

5) Introduce sysctl instead of using core_param().


Tricks for testing
==================

1) This patch series was tested on x86_64 using CONFIG_LKDTM.
The kernel kills a process that performs this:
  echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT

2) The warn_slowpath_fmt() path was tested using this trick:
diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index 84b87538a15d..3106c203ebb6 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -73,7 +73,7 @@ do {                                                          \
  * were to trigger, we'd rather wreck the machine in an attempt to get the
  * message out than not know about it.
  */
-#define __WARN_FLAGS(flags)                                    \
+#define ___WARN_FLAGS(flags)                                   \
 do {                                                           \
        instrumentation_begin();                                \
        _BUG_FLAGS(ASM_UD2, BUGFLAG_WARNING|(flags));           \

3) Testing pkill_on_warn with kthreads was done using this trick:
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index bce848e50512..13c56f472681 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2133,6 +2133,8 @@ static int __noreturn rcu_gp_kthread(void *unused)
                WRITE_ONCE(rcu_state.gp_state, RCU_GP_CLEANUP);
                rcu_gp_cleanup();
                WRITE_ONCE(rcu_state.gp_state, RCU_GP_CLEANED);
+
+               WARN_ONCE(1, "hello from kthread\n");
        }
 }

4) Changing drivers/misc/lkdtm/bugs.c:lkdtm_WARNING() allowed me
to test all warning flavours:
 - WARN_ON()
 - WARN()
 - WARN_TAINT()
 - WARN_ON_ONCE()
 - WARN_ONCE()
 - WARN_TAINT_ONCE()

Thanks!

Alexander Popov (2):
  bug: do refactoring allowing to add a warning handling action
  sysctl: introduce kernel.pkill_on_warn

 Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++++
 include/asm-generic/bug.h                   | 37 +++++++++++++++------
 include/linux/panic.h                       |  3 ++
 kernel/panic.c                              | 22 +++++++++++-
 kernel/sysctl.c                             |  9 +++++
 lib/bug.c                                   | 22 ++++++++----
 6 files changed, 90 insertions(+), 17 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 1/2] bug: do refactoring allowing to add a warning handling action
  2021-10-27 23:32 [PATCH v2 0/2] Introduce the pkill_on_warn parameter Alexander Popov
@ 2021-10-27 23:32 ` Alexander Popov
  2021-10-27 23:32 ` [PATCH v2 2/2] sysctl: introduce kernel.pkill_on_warn Alexander Popov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 31+ messages in thread
From: Alexander Popov @ 2021-10-27 23:32 UTC (permalink / raw)
  To: Jonathan Corbet, Linus Torvalds, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Alexander Popov, kernel-hardening,
	linux-hardening, linux-doc, linux-arch, linux-kernel,
	linux-fsdevel
  Cc: notify

Do refactoring that allows adding a warning handling action,
in particular, pkill_on_warn. No functional changes intended.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 include/asm-generic/bug.h | 31 +++++++++++++++++++++----------
 lib/bug.c                 | 19 +++++++++++++------
 2 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index edb0e2a602a8..881aeaf5a2d5 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -91,7 +91,15 @@ void warn_slowpath_fmt(const char *file, const int line, unsigned taint,
 		warn_slowpath_fmt(__FILE__, __LINE__, taint, arg);	\
 		instrumentation_end();					\
 	} while (0)
-#else
+#ifndef WARN_ON_ONCE
+#define WARN_ON_ONCE(condition) ({					\
+	int __ret_warn_on = !!(condition);				\
+	if (unlikely(__ret_warn_on))					\
+		DO_ONCE_LITE(__WARN_printf, TAINT_WARN, NULL);		\
+	unlikely(__ret_warn_on);					\
+})
+#endif
+#else /* __WARN_FLAGS */
 extern __printf(1, 2) void __warn_printk(const char *fmt, ...);
 #define __WARN()		__WARN_FLAGS(BUGFLAG_TAINT(TAINT_WARN))
 #define __WARN_printf(taint, arg...) do {				\
@@ -141,16 +149,19 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
 	unlikely(__ret_warn_on);					\
 })
 
-#ifndef WARN_ON_ONCE
-#define WARN_ON_ONCE(condition)					\
-	DO_ONCE_LITE_IF(condition, WARN_ON, 1)
-#endif
-
-#define WARN_ONCE(condition, format...)				\
-	DO_ONCE_LITE_IF(condition, WARN, 1, format)
+#define WARN_ONCE(condition, format...) ({				\
+	int __ret_warn_on = !!(condition);				\
+	if (unlikely(__ret_warn_on))					\
+		DO_ONCE_LITE(__WARN_printf, TAINT_WARN, format);	\
+	unlikely(__ret_warn_on);					\
+})
 
-#define WARN_TAINT_ONCE(condition, taint, format...)		\
-	DO_ONCE_LITE_IF(condition, WARN_TAINT, 1, taint, format)
+#define WARN_TAINT_ONCE(condition, taint, format...) ({			\
+	int __ret_warn_on = !!(condition);				\
+	if (unlikely(__ret_warn_on))					\
+		DO_ONCE_LITE(__WARN_printf, taint, format);		\
+	unlikely(__ret_warn_on);					\
+})
 
 #else /* !CONFIG_BUG */
 #ifndef HAVE_ARCH_BUG
diff --git a/lib/bug.c b/lib/bug.c
index 45a0584f6541..1a91f01412b8 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -156,16 +156,17 @@ struct bug_entry *find_bug(unsigned long bugaddr)
 
 enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
 {
+	enum bug_trap_type bug_type = BUG_TRAP_TYPE_NONE;
 	struct bug_entry *bug;
 	const char *file;
 	unsigned line, warning, once, done;
 
 	if (!is_valid_bugaddr(bugaddr))
-		return BUG_TRAP_TYPE_NONE;
+		goto out;
 
 	bug = find_bug(bugaddr);
 	if (!bug)
-		return BUG_TRAP_TYPE_NONE;
+		goto out;
 
 	disable_trace_on_warning();
 
@@ -176,8 +177,10 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
 	done = (bug->flags & BUGFLAG_DONE) != 0;
 
 	if (warning && once) {
-		if (done)
-			return BUG_TRAP_TYPE_WARN;
+		if (done) {
+			bug_type = BUG_TRAP_TYPE_WARN;
+			goto out;
+		}
 
 		/*
 		 * Since this is the only store, concurrency is not an issue.
@@ -198,7 +201,8 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
 		/* this is a WARN_ON rather than BUG/BUG_ON */
 		__warn(file, line, (void *)bugaddr, BUG_GET_TAINT(bug), regs,
 		       NULL);
-		return BUG_TRAP_TYPE_WARN;
+		bug_type = BUG_TRAP_TYPE_WARN;
+		goto out;
 	}
 
 	if (file)
@@ -207,7 +211,10 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
 		pr_crit("Kernel BUG at %pB [verbose debug info unavailable]\n",
 			(void *)bugaddr);
 
-	return BUG_TRAP_TYPE_BUG;
+	bug_type = BUG_TRAP_TYPE_BUG;
+
+out:
+	return bug_type;
 }
 
 static void clear_once_table(struct bug_entry *start, struct bug_entry *end)
-- 
2.31.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 2/2] sysctl: introduce kernel.pkill_on_warn
  2021-10-27 23:32 [PATCH v2 0/2] Introduce the pkill_on_warn parameter Alexander Popov
  2021-10-27 23:32 ` [PATCH v2 1/2] bug: do refactoring allowing to add a warning handling action Alexander Popov
@ 2021-10-27 23:32 ` Alexander Popov
  2021-11-12 21:24   ` Steven Rostedt
  2021-11-12 18:52 ` [PATCH v2 0/2] Introduce the pkill_on_warn parameter Alexander Popov
  2021-11-15  8:12 ` Kaiwan N Billimoria
  3 siblings, 1 reply; 31+ messages in thread
From: Alexander Popov @ 2021-10-27 23:32 UTC (permalink / raw)
  To: Jonathan Corbet, Linus Torvalds, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Alexander Popov, kernel-hardening,
	linux-hardening, linux-doc, linux-arch, linux-kernel,
	linux-fsdevel
  Cc: notify

Currently, the Linux kernel provides two types of reaction to kernel
warnings:
 1. Do nothing (by default),
 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
    so panic_on_warn is usually disabled on production systems.

From a safety point of view, the Linux kernel misses a middle way of
handling kernel warnings:
 - The kernel should stop the activity that provokes a warning,
 - But the kernel should avoid complete denial of service.

From a security point of view, kernel warning messages provide a lot of
useful information for attackers. Many GNU/Linux distributions allow
unprivileged users to read the kernel log, so attackers use kernel
warning infoleak in vulnerability exploits. See the examples:
https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html

Let's introduce the pkill_on_warn sysctl.
If this parameter is set, the kernel kills all threads in a process that
provoked a kernel warning. This behavior is reasonable from a safety point of
view described above. It is also useful for kernel security hardening because
the system kills an exploit process that hits a kernel warning.

Moreover, bugs usually don't come alone, and a kernel warning may be
followed by memory corruption or other bad effects. So pkill_on_warn allows
the kernel to stop the process when the first signs of wrong behavior
are detected.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 Documentation/admin-guide/sysctl/kernel.rst | 14 +++++++++++++
 include/asm-generic/bug.h                   | 12 ++++++++---
 include/linux/panic.h                       |  3 +++
 kernel/panic.c                              | 22 ++++++++++++++++++++-
 kernel/sysctl.c                             |  9 +++++++++
 lib/bug.c                                   |  3 +++
 6 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 426162009ce9..5faf395fdf8f 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -921,6 +921,20 @@ lives in) pid namespace. When selecting a pid for a next task on fork
 kernel tries to allocate a number starting from this one.
 
 
+pkill_on_warn
+=============
+
+Kills all threads in a process that provoked a kernel warning.
+That allows the kernel to stop the process when the first signs
+of wrong behavior are detected.
+
+= =====================================================================
+0 Allows a process to proceed execution after hitting a kernel warning,
+  this is the default behavior.
+1 Kills all threads in a process that provoked a kernel warning.
+= =====================================================================
+
+
 powersave-nap (PPC only)
 ========================
 
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 881aeaf5a2d5..959000b5856a 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -94,8 +94,10 @@ void warn_slowpath_fmt(const char *file, const int line, unsigned taint,
 #ifndef WARN_ON_ONCE
 #define WARN_ON_ONCE(condition) ({					\
 	int __ret_warn_on = !!(condition);				\
-	if (unlikely(__ret_warn_on))					\
+	if (unlikely(__ret_warn_on)) {					\
 		DO_ONCE_LITE(__WARN_printf, TAINT_WARN, NULL);		\
+		do_pkill_on_warn();					\
+	}								\
 	unlikely(__ret_warn_on);					\
 })
 #endif
@@ -151,15 +153,19 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
 
 #define WARN_ONCE(condition, format...) ({				\
 	int __ret_warn_on = !!(condition);				\
-	if (unlikely(__ret_warn_on))					\
+	if (unlikely(__ret_warn_on)) {					\
 		DO_ONCE_LITE(__WARN_printf, TAINT_WARN, format);	\
+		do_pkill_on_warn();					\
+	}								\
 	unlikely(__ret_warn_on);					\
 })
 
 #define WARN_TAINT_ONCE(condition, taint, format...) ({			\
 	int __ret_warn_on = !!(condition);				\
-	if (unlikely(__ret_warn_on))					\
+	if (unlikely(__ret_warn_on)) {					\
 		DO_ONCE_LITE(__WARN_printf, taint, format);		\
+		do_pkill_on_warn();					\
+	}								\
 	unlikely(__ret_warn_on);					\
 })
 
diff --git a/include/linux/panic.h b/include/linux/panic.h
index f5844908a089..f79c69279859 100644
--- a/include/linux/panic.h
+++ b/include/linux/panic.h
@@ -27,6 +27,9 @@ extern int panic_on_oops;
 extern int panic_on_unrecovered_nmi;
 extern int panic_on_io_nmi;
 extern int panic_on_warn;
+extern int pkill_on_warn;
+
+extern void do_pkill_on_warn(void);
 
 extern unsigned long panic_on_taint;
 extern bool panic_on_taint_nousertaint;
diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..1323c9e2630f 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -53,6 +53,7 @@ static int pause_on_oops_flag;
 static DEFINE_SPINLOCK(pause_on_oops_lock);
 bool crash_kexec_post_notifiers;
 int panic_on_warn __read_mostly;
+int pkill_on_warn __read_mostly;
 unsigned long panic_on_taint;
 bool panic_on_taint_nousertaint = false;
 
@@ -625,13 +626,16 @@ void warn_slowpath_fmt(const char *file, int line, unsigned taint,
 	if (!fmt) {
 		__warn(file, line, __builtin_return_address(0), taint,
 		       NULL, NULL);
-		return;
+		goto out;
 	}
 
 	args.fmt = fmt;
 	va_start(args.args, fmt);
 	__warn(file, line, __builtin_return_address(0), taint, NULL, &args);
 	va_end(args.args);
+
+out:
+	do_pkill_on_warn();
 }
 EXPORT_SYMBOL(warn_slowpath_fmt);
 #else
@@ -732,3 +736,19 @@ static int __init panic_on_taint_setup(char *s)
 	return 0;
 }
 early_param("panic_on_taint", panic_on_taint_setup);
+
+void do_pkill_on_warn(void)
+{
+	if (!pkill_on_warn)
+		return;
+
+	if (is_global_init(current))
+		return;
+
+	if (current->flags & PF_KTHREAD)
+		return;
+
+	if (system_state >= SYSTEM_RUNNING)
+		do_send_sig_info(SIGKILL, SEND_SIG_PRIV, current, PIDTYPE_TGID);
+}
+EXPORT_SYMBOL(do_pkill_on_warn);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 083be6af29d7..7fe6f0aaad2b 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2656,6 +2656,15 @@ static struct ctl_table kern_table[] = {
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= SYSCTL_ONE,
 	},
+		{
+		.procname	= "pkill_on_warn",
+		.data		= &pkill_on_warn,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
 #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
 	{
 		.procname	= "timer_migration",
diff --git a/lib/bug.c b/lib/bug.c
index 1a91f01412b8..28cc8a5b2ee0 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -214,6 +214,9 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
 	bug_type = BUG_TRAP_TYPE_BUG;
 
 out:
+	if (bug_type == BUG_TRAP_TYPE_WARN)
+		do_pkill_on_warn();
+
 	return bug_type;
 }
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-10-27 23:32 [PATCH v2 0/2] Introduce the pkill_on_warn parameter Alexander Popov
  2021-10-27 23:32 ` [PATCH v2 1/2] bug: do refactoring allowing to add a warning handling action Alexander Popov
  2021-10-27 23:32 ` [PATCH v2 2/2] sysctl: introduce kernel.pkill_on_warn Alexander Popov
@ 2021-11-12 18:52 ` Alexander Popov
  2021-11-12 21:26   ` Linus Torvalds
  2021-11-15  8:12 ` Kaiwan N Billimoria
  3 siblings, 1 reply; 31+ messages in thread
From: Alexander Popov @ 2021-11-12 18:52 UTC (permalink / raw)
  To: Jonathan Corbet, Linus Torvalds, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, kernel-hardening, linux-hardening, linux-doc,
	linux-arch, linux-kernel, linux-fsdevel
  Cc: notify

On 28.10.2021 02:32, Alexander Popov wrote:
> Hello! This is the v2 of pkill_on_warn.
> Changes from v1 and tricks for testing are described below.

Hello everyone!
Friendly ping for your feedback.

Thanks.
Alexander

> Rationale
> =========
> 
> Currently, the Linux kernel provides two types of reaction to kernel
> warnings:
>   1. Do nothing (by default),
>   2. Call panic() if panic_on_warn is set. That's a very strong reaction,
>      so panic_on_warn is usually disabled on production systems.
> 
>  From a safety point of view, the Linux kernel misses a middle way of
> handling kernel warnings:
>   - The kernel should stop the activity that provokes a warning,
>   - But the kernel should avoid complete denial of service.
> 
>  From a security point of view, kernel warning messages provide a lot of
> useful information for attackers. Many GNU/Linux distributions allow
> unprivileged users to read the kernel log, so attackers use kernel
> warning infoleak in vulnerability exploits. See the examples:
> https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
> https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html
> 
> Let's introduce the pkill_on_warn sysctl.
> If this parameter is set, the kernel kills all threads in a process that
> provoked a kernel warning. This behavior is reasonable from a safety point of
> view described above. It is also useful for kernel security hardening because
> the system kills an exploit process that hits a kernel warning.
> 
> Moreover, bugs usually don't come alone, and a kernel warning may be
> followed by memory corruption or other bad effects. So pkill_on_warn allows
> the kernel to stop the process when the first signs of wrong behavior
> are detected.
> 
> 
> Changes from v1
> ===============
> 
> 1) Introduce do_pkill_on_warn() and call it in all warning handling paths.
> 
> 2) Do refactoring without functional changes in a separate patch.
> 
> 3) Avoid killing init and kthreads.
> 
> 4) Use do_send_sig_info() instead of do_group_exit().
> 
> 5) Introduce sysctl instead of using core_param().
> 
> 
> Tricks for testing
> ==================
> 
> 1) This patch series was tested on x86_64 using CONFIG_LKDTM.
> The kernel kills a process that performs this:
>    echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT
> 
> 2) The warn_slowpath_fmt() path was tested using this trick:
> diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
> index 84b87538a15d..3106c203ebb6 100644
> --- a/arch/x86/include/asm/bug.h
> +++ b/arch/x86/include/asm/bug.h
> @@ -73,7 +73,7 @@ do {                                                          \
>    * were to trigger, we'd rather wreck the machine in an attempt to get the
>    * message out than not know about it.
>    */
> -#define __WARN_FLAGS(flags)                                    \
> +#define ___WARN_FLAGS(flags)                                   \
>   do {                                                           \
>          instrumentation_begin();                                \
>          _BUG_FLAGS(ASM_UD2, BUGFLAG_WARNING|(flags));           \
> 
> 3) Testing pkill_on_warn with kthreads was done using this trick:
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index bce848e50512..13c56f472681 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2133,6 +2133,8 @@ static int __noreturn rcu_gp_kthread(void *unused)
>                  WRITE_ONCE(rcu_state.gp_state, RCU_GP_CLEANUP);
>                  rcu_gp_cleanup();
>                  WRITE_ONCE(rcu_state.gp_state, RCU_GP_CLEANED);
> +
> +               WARN_ONCE(1, "hello from kthread\n");
>          }
>   }
> 
> 4) Changing drivers/misc/lkdtm/bugs.c:lkdtm_WARNING() allowed me
> to test all warning flavours:
>   - WARN_ON()
>   - WARN()
>   - WARN_TAINT()
>   - WARN_ON_ONCE()
>   - WARN_ONCE()
>   - WARN_TAINT_ONCE()
> 
> Thanks!
> 
> Alexander Popov (2):
>    bug: do refactoring allowing to add a warning handling action
>    sysctl: introduce kernel.pkill_on_warn
> 
>   Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++++
>   include/asm-generic/bug.h                   | 37 +++++++++++++++------
>   include/linux/panic.h                       |  3 ++
>   kernel/panic.c                              | 22 +++++++++++-
>   kernel/sysctl.c                             |  9 +++++
>   lib/bug.c                                   | 22 ++++++++----
>   6 files changed, 90 insertions(+), 17 deletions(-)
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/2] sysctl: introduce kernel.pkill_on_warn
  2021-10-27 23:32 ` [PATCH v2 2/2] sysctl: introduce kernel.pkill_on_warn Alexander Popov
@ 2021-11-12 21:24   ` Steven Rostedt
  0 siblings, 0 replies; 31+ messages in thread
From: Steven Rostedt @ 2021-11-12 21:24 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Jonathan Corbet, Linus Torvalds, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Will Deacon, Ard Biesheuvel, Laura Abbott,
	David S Miller, Borislav Petkov, Arnd Bergmann, Andrew Scull,
	Marc Zyngier, Jessica Yu, Iurii Zaikin, Rasmus Villemoes,
	Wang Qing, Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, kernel-hardening, linux-hardening, linux-doc,
	linux-arch, linux-kernel, linux-fsdevel, notify

On Thu, 28 Oct 2021 02:32:15 +0300
Alexander Popov <alex.popov@linux.com> wrote:

> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> ---
>  Documentation/admin-guide/sysctl/kernel.rst | 14 +++++++++++++
>  include/asm-generic/bug.h                   | 12 ++++++++---
>  include/linux/panic.h                       |  3 +++
>  kernel/panic.c                              | 22 ++++++++++++++++++++-
>  kernel/sysctl.c                             |  9 +++++++++
>  lib/bug.c                                   |  3 +++
>  6 files changed, 59 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index 426162009ce9..5faf395fdf8f 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -921,6 +921,20 @@ lives in) pid namespace. When selecting a pid for a next task on fork
>  kernel tries to allocate a number starting from this one.
>  
>  
> +pkill_on_warn
> +=============
> +
> +Kills all threads in a process that provoked a kernel warning.
> +That allows the kernel to stop the process when the first signs
> +of wrong behavior are detected.
> +
> += =====================================================================
> +0 Allows a process to proceed execution after hitting a kernel warning,
> +  this is the default behavior.
> +1 Kills all threads in a process that provoked a kernel warning.
> += =====================================================================
> +
> +
>  powersave-nap (PPC only)
>  ========================
>  
> diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
> index 881aeaf5a2d5..959000b5856a 100644
> --- a/include/asm-generic/bug.h
> +++ b/include/asm-generic/bug.h
> @@ -94,8 +94,10 @@ void warn_slowpath_fmt(const char *file, const int line, unsigned taint,
>  #ifndef WARN_ON_ONCE
>  #define WARN_ON_ONCE(condition) ({					\
>  	int __ret_warn_on = !!(condition);				\
> -	if (unlikely(__ret_warn_on))					\
> +	if (unlikely(__ret_warn_on)) {					\
>  		DO_ONCE_LITE(__WARN_printf, TAINT_WARN, NULL);		\
> +		do_pkill_on_warn();					\

Should this be a config option so that those that do not want this do not
need to have the added overhead of a function call embedded at every
WARN*() in their code?

That is, do_pkill_on_warn() should be defined as do { } while (0), and not
add any I$ overhead when compiled out?

> +	}								\
>  	unlikely(__ret_warn_on);					\
>  })
>  #endif
> @@ -151,15 +153,19 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
>  
>  #define WARN_ONCE(condition, format...) ({				\
>  	int __ret_warn_on = !!(condition);				\
> -	if (unlikely(__ret_warn_on))					\
> +	if (unlikely(__ret_warn_on)) {					\
>  		DO_ONCE_LITE(__WARN_printf, TAINT_WARN, format);	\
> +		do_pkill_on_warn();					\
> +	}								\
>  	unlikely(__ret_warn_on);					\
>  })
>  
>  #define WARN_TAINT_ONCE(condition, taint, format...) ({			\
>  	int __ret_warn_on = !!(condition);				\
> -	if (unlikely(__ret_warn_on))					\
> +	if (unlikely(__ret_warn_on)) {					\
>  		DO_ONCE_LITE(__WARN_printf, taint, format);		\
> +		do_pkill_on_warn();					\
> +	}								\
>  	unlikely(__ret_warn_on);					\
>  })
>  
> diff --git a/include/linux/panic.h b/include/linux/panic.h
> index f5844908a089..f79c69279859 100644
> --- a/include/linux/panic.h
> +++ b/include/linux/panic.h
> @@ -27,6 +27,9 @@ extern int panic_on_oops;
>  extern int panic_on_unrecovered_nmi;
>  extern int panic_on_io_nmi;
>  extern int panic_on_warn;
> +extern int pkill_on_warn;
> +
> +extern void do_pkill_on_warn(void);
>  
>  extern unsigned long panic_on_taint;
>  extern bool panic_on_taint_nousertaint;
> diff --git a/kernel/panic.c b/kernel/panic.c
> index cefd7d82366f..1323c9e2630f 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -53,6 +53,7 @@ static int pause_on_oops_flag;
>  static DEFINE_SPINLOCK(pause_on_oops_lock);
>  bool crash_kexec_post_notifiers;
>  int panic_on_warn __read_mostly;
> +int pkill_on_warn __read_mostly;
>  unsigned long panic_on_taint;
>  bool panic_on_taint_nousertaint = false;
>  
> @@ -625,13 +626,16 @@ void warn_slowpath_fmt(const char *file, int line, unsigned taint,
>  	if (!fmt) {
>  		__warn(file, line, __builtin_return_address(0), taint,
>  		       NULL, NULL);
> -		return;
> +		goto out;
>  	}
>  
>  	args.fmt = fmt;
>  	va_start(args.args, fmt);
>  	__warn(file, line, __builtin_return_address(0), taint, NULL, &args);
>  	va_end(args.args);
> +
> +out:
> +	do_pkill_on_warn();
>  }
>  EXPORT_SYMBOL(warn_slowpath_fmt);
>  #else
> @@ -732,3 +736,19 @@ static int __init panic_on_taint_setup(char *s)
>  	return 0;
>  }
>  early_param("panic_on_taint", panic_on_taint_setup);
> +
> +void do_pkill_on_warn(void)
> +{
> +	if (!pkill_on_warn)
> +		return;
> +
> +	if (is_global_init(current))
> +		return;
> +
> +	if (current->flags & PF_KTHREAD)
> +		return;
> +
> +	if (system_state >= SYSTEM_RUNNING)
> +		do_send_sig_info(SIGKILL, SEND_SIG_PRIV, current, PIDTYPE_TGID);

I believe this was mentioned before. I'm not sure how safe it is to call
do_send_sig_info() in random areas of the kernel, as that could possibly
cause a deadlock with the locking that is taken.

Best to use irq_work() and have a irq_work interrupt handle the
do_sig_info, because then you know you are safe to call this. Of course,
then you need some way to pass the task to the handler.

Could do some kind of clever trick, where we add a PF_KILL_ME flag to the
task, and the irq worker just loops over all the tasks kill all those with
it set?


> +}
> +EXPORT_SYMBOL(do_pkill_on_warn);
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 083be6af29d7..7fe6f0aaad2b 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -2656,6 +2656,15 @@ static struct ctl_table kern_table[] = {
>  		.extra1		= SYSCTL_ZERO,
>  		.extra2		= SYSCTL_ONE,
>  	},
> +		{

extra tab?

-- Steve

> +		.procname	= "pkill_on_warn",
> +		.data		= &pkill_on_warn,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= SYSCTL_ZERO,
> +		.extra2		= SYSCTL_ONE,
> +	},
>  #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
>  	{
>  		.procname	= "timer_migration",
> diff --git a/lib/bug.c b/lib/bug.c
> index 1a91f01412b8..28cc8a5b2ee0 100644
> --- a/lib/bug.c
> +++ b/lib/bug.c
> @@ -214,6 +214,9 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
>  	bug_type = BUG_TRAP_TYPE_BUG;
>  
>  out:
> +	if (bug_type == BUG_TRAP_TYPE_WARN)
> +		do_pkill_on_warn();
> +
>  	return bug_type;
>  }
>  


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-12 18:52 ` [PATCH v2 0/2] Introduce the pkill_on_warn parameter Alexander Popov
@ 2021-11-12 21:26   ` Linus Torvalds
  2021-11-13 18:14     ` Alexander Popov
  0 siblings, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2021-11-12 21:26 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify

On Fri, Nov 12, 2021 at 10:52 AM Alexander Popov <alex.popov@linux.com> wrote:
>
> Hello everyone!
> Friendly ping for your feedback.

I still haven't heard a compelling _reason_ for this all, and why
anybody should ever use this or care?

               Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-12 21:26   ` Linus Torvalds
@ 2021-11-13 18:14     ` Alexander Popov
  2021-11-13 19:58       ` Linus Torvalds
  2021-11-15 13:59       ` Lukas Bulwahn
  0 siblings, 2 replies; 31+ messages in thread
From: Alexander Popov @ 2021-11-13 18:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan, Lukas Bulwahn

On 13.11.2021 00:26, Linus Torvalds wrote:
> On Fri, Nov 12, 2021 at 10:52 AM Alexander Popov <alex.popov@linux.com> wrote:
>>
>> Hello everyone!
>> Friendly ping for your feedback.
> 
> I still haven't heard a compelling _reason_ for this all, and why
> anybody should ever use this or care?

Ok, to sum up:

Killing the process that hit a kernel warning complies with the Fail-Fast 
principle [1]. pkill_on_warn sysctl allows the kernel to stop the process when 
the **first signs** of wrong behavior are detected.

By default, the Linux kernel ignores a warning and proceeds the execution from 
the flawed state. That is opposite to the Fail-Fast principle.
A kernel warning may be followed by memory corruption or other negative effects, 
like in CVE-2019-18683 exploit [2] or many other cases detected by the SyzScope 
project [3]. pkill_on_warn would prevent the system from the errors going after 
a warning in the process context.

At the same time, pkill_on_warn does not kill the entire system like 
panic_on_warn. That is the middle way of handling kernel warnings.
Linus, it's similar to your BUG_ON() policy [4]. The process hitting BUG_ON() is 
killed, and the system proceeds to work. pkill_on_warn just brings a similar 
policy to WARN_ON() handling.

I believe that many Linux distros (which don't hit WARN_ON() here and there) 
will enable pkill_on_warn because it's reasonable from the safety and security 
points of view.

And I'm sure that the ELISA project by the Linux Foundation (Enabling Linux In 
Safety Applications [5]) would support the pkill_on_warn sysctl.
[Adding people from this project to CC]

I hope that I managed to show the rationale.

Best regards,
Alexander


[1]: https://en.wikipedia.org/wiki/Fail-fast
[2]: https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
[3]: https://www.usenix.org/system/files/sec22summer_zou.pdf
[4]: http://lkml.iu.edu/hypermail/linux/kernel/1610.0/01217.html
[5]: https://elisa.tech/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-13 18:14     ` Alexander Popov
@ 2021-11-13 19:58       ` Linus Torvalds
  2021-11-14 14:21         ` Marco Elver
  2021-11-15 13:59       ` Lukas Bulwahn
  1 sibling, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2021-11-13 19:58 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan, Lukas Bulwahn

On Sat, Nov 13, 2021 at 10:14 AM Alexander Popov <alex.popov@linux.com> wrote:
>
> Killing the process that hit a kernel warning complies with the Fail-Fast
> principle [1].

The thing is a WARNING.

It's not even clear that the warning has anything to do with the
process that triggered it. It could happen in an interrupt, or in some
async context (kernel threads, whatever), or the warning could just be
something that is detected by a different user than the thing that
actually caused the warning to become an issue.

If you want to reboot the machine on a kernel warning, you get that
fail-fast thing you want. There are two situations:

 - kernel testing (pretty much universally done in a virtual machine,
or simply just checking 'dmesg' afterwards)

 - hyperscalers like google etc that just want to take any suspect
machines offline asap

But sending a signal to a random process is just voodoo programming,
and as likely to cause other very odd failures as anything else.

I really don't see the point of that signal.

I'm happy to be proven wrong, but that will require some major
installation actually using it first and having a lot of strong
arguments to counter-act the above.

Seriously, WARN_ON() can happen in situations where sending a signal
may be a REALLY BAD idea, never mind the issue that it's not even
clear who the signal should be sent to.

Yes, yes, your patches have some random "safety guards", in that it
won't send the signal to a PF_KTHREAD or the global init process. But
those safety guards literally make my argument for me: sending a
signal to whoever randomly triggered a warning is simply _wrong_.
Adding random "don't do it in this case" doesn't make it right, it
only shows that "yes, it happens to the wrong person, and here's a
hack to avoid generating obvious problems".

Honestly, if the intent is to not have to parse the dmesg output, then
I think it would be much better to introduce a new /proc file to read
the kernel tainting state, and then some test manager process could be
able to poll() that file or something. Not sending a signal to random
targets, but have a much more explicit model.

That said, I'm not convinced that "just read the kernel message log"
is in any way wrong either.

                  Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-13 19:58       ` Linus Torvalds
@ 2021-11-14 14:21         ` Marco Elver
  0 siblings, 0 replies; 31+ messages in thread
From: Marco Elver @ 2021-11-14 14:21 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Linus Torvalds, Jonathan Corbet, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan, Lukas Bulwahn, glider

On Sat, Nov 13, 2021 at 11:58AM -0800, Linus Torvalds wrote:
> On Sat, Nov 13, 2021 at 10:14 AM Alexander Popov <alex.popov@linux.com> wrote:
[...]
> Honestly, if the intent is to not have to parse the dmesg output, then
> I think it would be much better to introduce a new /proc file to read
> the kernel tainting state, and then some test manager process could be
> able to poll() that file or something. Not sending a signal to random
> targets, but have a much more explicit model.
> 
> That said, I'm not convinced that "just read the kernel message log"
> is in any way wrong either.

We had this problem of "need to get errors/warnings that appear in the
kernel log" without actually polling the kernel log all the time. Since
5.12 there's the 'error_report' tracepoint for exactly this purpose [1].

Right now it only generates events on KASAN and KFENCE reports, but we
imagined it's easy enough to extend with more types. Like WARN, should
the need arise (you'd have to add it if you decide to go down that
route).

So you could implement a close-enough variant of the whole thing in
userspace using what tracepoints give you by just monitoring the trace
pipe. It'd be much easier to experiment with different policies as well.

[1] https://git.kernel.org/torvalds/c/9c0dee54eb91d48cca048bd7bd2c1f4a166e0252 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-10-27 23:32 [PATCH v2 0/2] Introduce the pkill_on_warn parameter Alexander Popov
                   ` (2 preceding siblings ...)
  2021-11-12 18:52 ` [PATCH v2 0/2] Introduce the pkill_on_warn parameter Alexander Popov
@ 2021-11-15  8:12 ` Kaiwan N Billimoria
  3 siblings, 0 replies; 31+ messages in thread
From: Kaiwan N Billimoria @ 2021-11-15  8:12 UTC (permalink / raw)
  To: Alexander Popov, Jonathan Corbet, Linus Torvalds, Paul McKenney,
	Andrew Morton, Thomas Gleixner, Peter Zijlstra, Joerg Roedel,
	Maciej Rozycki, Muchun Song, Viresh Kumar, Robin Murphy,
	Randy Dunlap, Lu Baolu, Petr Mladek, Kees Cook, Luis Chamberlain,
	Wei Liu, John Ogness, Andy Shevchenko, Alexey Kardashevskiy,
	Christophe Leroy, Jann Horn, Greg Kroah-Hartman, Mark Rutland,
	Andy Lutomirski, Dave Hansen, Steven Rostedt, Will Deacon,
	Ard Biesheuvel, Laura Abbott, David S Miller, Borislav Petkov,
	Arnd Bergmann, Andrew Scull, Marc Zyngier, Jessica Yu,
	Iurii Zaikin, Rasmus Villemoes, Wang Qing, Mel Gorman,
	Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, kernel-hardening, linux-hardening, linux-doc,
	linux-arch, linux-kernel, linux-fsdevel
  Cc: notify

On Thu, 2021-10-28 at 02:32 +0300, Alexander Popov wrote:
> [...]
> 
> From a security point of view, kernel warning messages provide a lot of
> useful information for attackers. Many GNU/Linux distributions allow
> unprivileged users to read the kernel log, so attackers use kernel
> warning infoleak in vulnerability exploits. 
At the risk of being too simplistic, if the intention is to cut down infoleaks,
why not simply have a config (and/or sysctl) to toggle it - both at kernel build
as well as at runtime via a sysctl.

A minimal starting attempt at this, definitely incomplete (i've not actually written
the config anywhere, sorry, I'd just like to propose this as an idea for now) could
be something like this? (Am calling the kconfig CONFIG_TERSE_DIAGS_ONWARN):

---
 kernel/panic.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..bbf00b0a8110 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -587,10 +587,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
    if (args)
        vprintk(args->fmt, args->args);
 
-   print_modules();
-
-   if (regs)
-       show_regs(regs);
+   if (IS_ENABLED(CONFIG_TERSE_DIAGS_ONWARN))
+       return;
 
    if (panic_on_warn) {
        /*
@@ -603,6 +601,11 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
        panic("panic_on_warn set ...\n");
    }   
 
+   print_modules();
+
+   if (regs)
+       show_regs(regs);
+
    if (!regs)
        dump_stack();
 
-- 
2.25.1


Further, am unsure precisely which portions of diagnostic output would be useful
to retain when the config's on. Of course, this "patch" is very premature. Of course,
am open to suggestions on all of this,
Regards


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-13 18:14     ` Alexander Popov
  2021-11-13 19:58       ` Linus Torvalds
@ 2021-11-15 13:59       ` Lukas Bulwahn
  2021-11-15 15:51         ` [ELISA Safety Architecture WG] " Gabriele Paoloni
  2021-11-15 16:06         ` Steven Rostedt
  1 sibling, 2 replies; 31+ messages in thread
From: Lukas Bulwahn @ 2021-11-15 13:59 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Linus Torvalds, Jonathan Corbet, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Sat, Nov 13, 2021 at 7:14 PM Alexander Popov <alex.popov@linux.com> wrote:
>
> On 13.11.2021 00:26, Linus Torvalds wrote:
> > On Fri, Nov 12, 2021 at 10:52 AM Alexander Popov <alex.popov@linux.com> wrote:
> >>
> >> Hello everyone!
> >> Friendly ping for your feedback.
> >
> > I still haven't heard a compelling _reason_ for this all, and why
> > anybody should ever use this or care?
>
> Ok, to sum up:
>
> Killing the process that hit a kernel warning complies with the Fail-Fast
> principle [1]. pkill_on_warn sysctl allows the kernel to stop the process when
> the **first signs** of wrong behavior are detected.
>
> By default, the Linux kernel ignores a warning and proceeds the execution from
> the flawed state. That is opposite to the Fail-Fast principle.
> A kernel warning may be followed by memory corruption or other negative effects,
> like in CVE-2019-18683 exploit [2] or many other cases detected by the SyzScope
> project [3]. pkill_on_warn would prevent the system from the errors going after
> a warning in the process context.
>
> At the same time, pkill_on_warn does not kill the entire system like
> panic_on_warn. That is the middle way of handling kernel warnings.
> Linus, it's similar to your BUG_ON() policy [4]. The process hitting BUG_ON() is
> killed, and the system proceeds to work. pkill_on_warn just brings a similar
> policy to WARN_ON() handling.
>
> I believe that many Linux distros (which don't hit WARN_ON() here and there)
> will enable pkill_on_warn because it's reasonable from the safety and security
> points of view.
>
> And I'm sure that the ELISA project by the Linux Foundation (Enabling Linux In
> Safety Applications [5]) would support the pkill_on_warn sysctl.
> [Adding people from this project to CC]
>
> I hope that I managed to show the rationale.
>

Alex, officially and formally, I cannot talk for the ELISA project
(Enabling Linux In Safety Applications) by the Linux Foundation and I
do not think there is anyone that can confidently do so on such a
detailed technical aspect that you are raising here, and as the
various participants in the ELISA Project have not really agreed on
such a technical aspect being one way or the other and I would not see
that happening quickly. However, I have spent quite some years on the
topic on "what is the right and important topics for using Linux in
safety applications"; so here are my five cents:

One of the general assumptions about safety applications and safety
systems is that the malfunction of a function within a system is more
critical, i.e., more likely to cause harm to people, directly or
indirectly, than the unavailability of the system. So, before
"something potentially unexpected happens"---which can have arbitrary
effects and hence effects difficult to foresee and control---, it is
better to just shutdown/silence the system, i.e., design a fail-safe
or fail-silent system, as the effect of shutdown is pretty easily
foreseeable during the overall system design and you could think about
what the overall system does, when the kernel crashes the usual way.

So, that brings us to what a user would expect from the kernel in a
safety-critical system: Shutdown on any event that is unexpected.

Here, I currently see panic_on_warn as the closest existing feature to
indicate any event that is unexpected and to shutdown the system. That
requires two things for the kernel development:

1. Allow a reasonably configured kernel to boot and run with
panic_on_warn set. Warnings should only be raised when something is
not configured as the developers expect it or the kernel is put into a
state that generally is _unexpected_ and has been exposed little to
the critical thought of the developer, to testing efforts and use in
other systems in the wild. Warnings should not be used for something
informative, which still allows the kernel to continue running in a
proper way in a generally expected environment. Up to my knowledge,
there are some kernels in production that run with panic_on_warn; so,
IMHO, this requirement is generally accepted (we might of course
discuss the one or other use of warn) and is not too much to ask for.

2. Really ensure that the system shuts down when it hits warn and
panic. That requires that the execution path for warn() and panic() is
not overly complicated (stuffed with various bells and whistles).
Otherwise, warn() and panic() could fail in various complex ways and
potentially keep the system running, although it should be shut down.
Some people in the ELISA Project looked a bit into why they believe
panic() shuts down a system but I have not seen a good system analysis
and argument why any third person could be convinced that panic()
works under all circumstances where it is invoked or that at least,
the circumstances under which panic really works is properly
documented. That is a central aspect for using Linux in a
reasonably-designed safety-critical system. That is possibly also
relevant for security, as you might see an attacker obtain information
because it was possible to "block" the kernel shutting down after
invoking panic() and hence, the attacker could obtain certain
information that was only possible because 1. the system got into an
inconsistent state, 2. it was detected by some check leading to warn()
or panic(), and 3. the system's security engineers assumed that the
system must have been shutting down at that point, as panic() was
invoked, and hence, this would be disallowing a lot of further
operations or some specific operations that the attacker would need to
trigger in that inconsistent state to obtain information.

To your feature, Alex, I do not see the need to have any refined
handling of killing a specific process when the kernel warns; stopping
the whole system is the better and more predictable thing to do. I
would prefer if systems, which have those high-integrity requirements,
e.g., in a highly secure---where stopping any unintended information
flow matters more than availability---or in fail-silent environments
in safety systems, can use panic_on_warn. That should address your
concern above of handling certain CVEs as well.

In summary, I am not supporting pkill_on_warn. I would support the
other points I mentioned above, i.e., a good enforced policy for use
of warn() and any investigation to understand the complexity of
panic() and reducing its complexity if triggered by such an
investigation.

Of course, the listeners and participants in the ELISA Project are
very, very diverse and still on a steep learning curve, i.e., what
does the kernel do, how complex are certain aspects in the kernel, and
what are reasonable system designs that are in reach for
certification. So, there might be some stakeholders in the ELISA
Project that consider availability of a Linux system safety-critical,
i.e., if the system with a Linux kernel is not available, something
really bad (harmful to people) happens. I personally do not know how
these stakeholders could (ever) argue the availability of a complex
system with a Linux kernel, with the availability criteria and the
needed confidence (evidence and required methods) for exposing anyone
to such system under our current societal expectations on technical
systems (you would to need show sufficient investigation of the
kernel's availability for a certification), but that does not stop
anyone looking into it... Those stakeholders should really speak for
themselves, if they see any need for such a refined control of
"something unexpected happens" (an invocation of warn) and more
generally what features from the kernel are needed for such systems.


Lukas

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [ELISA Safety Architecture WG] [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-15 13:59       ` Lukas Bulwahn
@ 2021-11-15 15:51         ` Gabriele Paoloni
  2021-11-16  7:52           ` Alexander Popov
  2021-11-15 16:06         ` Steven Rostedt
  1 sibling, 1 reply; 31+ messages in thread
From: Gabriele Paoloni @ 2021-11-15 15:51 UTC (permalink / raw)
  To: Lukas Bulwahn, Alexander Popov
  Cc: Linus Torvalds, Jonathan Corbet, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan



On 15/11/2021 14:59, Lukas Bulwahn wrote:
> On Sat, Nov 13, 2021 at 7:14 PM Alexander Popov <alex.popov@linux.com> wrote:
>>
>> On 13.11.2021 00:26, Linus Torvalds wrote:
>>> On Fri, Nov 12, 2021 at 10:52 AM Alexander Popov <alex.popov@linux.com> wrote:
>>>>
>>>> Hello everyone!
>>>> Friendly ping for your feedback.
>>>
>>> I still haven't heard a compelling _reason_ for this all, and why
>>> anybody should ever use this or care?
>>
>> Ok, to sum up:
>>
>> Killing the process that hit a kernel warning complies with the Fail-Fast
>> principle [1]. pkill_on_warn sysctl allows the kernel to stop the process when
>> the **first signs** of wrong behavior are detected.
>>
>> By default, the Linux kernel ignores a warning and proceeds the execution from
>> the flawed state. That is opposite to the Fail-Fast principle.
>> A kernel warning may be followed by memory corruption or other negative effects,
>> like in CVE-2019-18683 exploit [2] or many other cases detected by the SyzScope
>> project [3]. pkill_on_warn would prevent the system from the errors going after
>> a warning in the process context.
>>
>> At the same time, pkill_on_warn does not kill the entire system like
>> panic_on_warn. That is the middle way of handling kernel warnings.
>> Linus, it's similar to your BUG_ON() policy [4]. The process hitting BUG_ON() is
>> killed, and the system proceeds to work. pkill_on_warn just brings a similar
>> policy to WARN_ON() handling.
>>
>> I believe that many Linux distros (which don't hit WARN_ON() here and there)
>> will enable pkill_on_warn because it's reasonable from the safety and security
>> points of view.
>>
>> And I'm sure that the ELISA project by the Linux Foundation (Enabling Linux In
>> Safety Applications [5]) would support the pkill_on_warn sysctl.
>> [Adding people from this project to CC]
>>
>> I hope that I managed to show the rationale.
>>
> 
> Alex, officially and formally, I cannot talk for the ELISA project
> (Enabling Linux In Safety Applications) by the Linux Foundation and I
> do not think there is anyone that can confidently do so on such a
> detailed technical aspect that you are raising here, and as the
> various participants in the ELISA Project have not really agreed on
> such a technical aspect being one way or the other and I would not see
> that happening quickly. However, I have spent quite some years on the
> topic on "what is the right and important topics for using Linux in
> safety applications"; so here are my five cents:
> 
> One of the general assumptions about safety applications and safety
> systems is that the malfunction of a function within a system is more
> critical, i.e., more likely to cause harm to people, directly or
> indirectly, than the unavailability of the system. So, before
> "something potentially unexpected happens"---which can have arbitrary
> effects and hence effects difficult to foresee and control---, it is
> better to just shutdown/silence the system, i.e., design a fail-safe
> or fail-silent system, as the effect of shutdown is pretty easily
> foreseeable during the overall system design and you could think about
> what the overall system does, when the kernel crashes the usual way.
> 
> So, that brings us to what a user would expect from the kernel in a
> safety-critical system: Shutdown on any event that is unexpected.
> 
> Here, I currently see panic_on_warn as the closest existing feature to
> indicate any event that is unexpected and to shutdown the system. That
> requires two things for the kernel development:
> 
> 1. Allow a reasonably configured kernel to boot and run with
> panic_on_warn set. Warnings should only be raised when something is
> not configured as the developers expect it or the kernel is put into a
> state that generally is _unexpected_ and has been exposed little to
> the critical thought of the developer, to testing efforts and use in
> other systems in the wild. Warnings should not be used for something
> informative, which still allows the kernel to continue running in a
> proper way in a generally expected environment. Up to my knowledge,
> there are some kernels in production that run with panic_on_warn; so,
> IMHO, this requirement is generally accepted (we might of course
> discuss the one or other use of warn) and is not too much to ask for.
> 
> 2. Really ensure that the system shuts down when it hits warn and
> panic. That requires that the execution path for warn() and panic() is
> not overly complicated (stuffed with various bells and whistles).
> Otherwise, warn() and panic() could fail in various complex ways and
> potentially keep the system running, although it should be shut down.
> Some people in the ELISA Project looked a bit into why they believe
> panic() shuts down a system but I have not seen a good system analysis
> and argument why any third person could be convinced that panic()
> works under all circumstances where it is invoked or that at least,
> the circumstances under which panic really works is properly
> documented. That is a central aspect for using Linux in a
> reasonably-designed safety-critical system. That is possibly also
> relevant for security, as you might see an attacker obtain information
> because it was possible to "block" the kernel shutting down after
> invoking panic() and hence, the attacker could obtain certain
> information that was only possible because 1. the system got into an
> inconsistent state, 2. it was detected by some check leading to warn()
> or panic(), and 3. the system's security engineers assumed that the
> system must have been shutting down at that point, as panic() was
> invoked, and hence, this would be disallowing a lot of further
> operations or some specific operations that the attacker would need to
> trigger in that inconsistent state to obtain information.
> 
> To your feature, Alex, I do not see the need to have any refined
> handling of killing a specific process when the kernel warns; stopping
> the whole system is the better and more predictable thing to do. I
> would prefer if systems, which have those high-integrity requirements,
> e.g., in a highly secure---where stopping any unintended information
> flow matters more than availability---or in fail-silent environments
> in safety systems, can use panic_on_warn. That should address your
> concern above of handling certain CVEs as well.
> 
> In summary, I am not supporting pkill_on_warn. I would support the
> other points I mentioned above, i.e., a good enforced policy for use
> of warn() and any investigation to understand the complexity of
> panic() and reducing its complexity if triggered by such an
> investigation.

Hi Alex

I also agree with the summary that Lukas gave here. From my experience
the safety system are always guarded by an external flow monitor (e.g. a
watchdog) that triggers in case the safety relevant workloads slows down
or block (for any reason); given this condition of use, a system that
goes into the panic state is always safe, since the watchdog would
trigger and drive the system automatically into safe state.
So I also don't see a clear advantage of having pkill_on_warn();
actually on the flip side it seems to me that such feature could
introduce more risk, as it kills only the threads of the process that
caused the kernel warning whereas the other processes are trusted to
run on a weaker Kernel (does killing the threads of the process that
caused the kernel warning always fix the Kernel condition that lead to
the warning?)

Thanks
Gab

> 
> Of course, the listeners and participants in the ELISA Project are
> very, very diverse and still on a steep learning curve, i.e., what
> does the kernel do, how complex are certain aspects in the kernel, and
> what are reasonable system designs that are in reach for
> certification. So, there might be some stakeholders in the ELISA
> Project that consider availability of a Linux system safety-critical,
> i.e., if the system with a Linux kernel is not available, something
> really bad (harmful to people) happens. I personally do not know how
> these stakeholders could (ever) argue the availability of a complex
> system with a Linux kernel, with the availability criteria and the
> needed confidence (evidence and required methods) for exposing anyone
> to such system under our current societal expectations on technical
> systems (you would to need show sufficient investigation of the
> kernel's availability for a certification), but that does not stop
> anyone looking into it... Those stakeholders should really speak for
> themselves, if they see any need for such a refined control of
> "something unexpected happens" (an invocation of warn) and more
> generally what features from the kernel are needed for such systems.
> 
> 
> Lukas
> 
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#629): https://lists.elisa.tech/g/safety-architecture/message/629
> Mute This Topic: https://lists.elisa.tech/mt/87069369/6239706
> Group Owner: safety-architecture+owner@lists.elisa.tech
> Unsubscribe: https://lists.elisa.tech/g/safety-architecture/unsub [gpaoloni@redhat.com]
> -=-=-=-=-=-=-=-=-=-=-=-
> 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-15 13:59       ` Lukas Bulwahn
  2021-11-15 15:51         ` [ELISA Safety Architecture WG] " Gabriele Paoloni
@ 2021-11-15 16:06         ` Steven Rostedt
  2021-11-15 22:06           ` Kees Cook
  2021-11-16  6:37           ` Christophe Leroy
  1 sibling, 2 replies; 31+ messages in thread
From: Steven Rostedt @ 2021-11-15 16:06 UTC (permalink / raw)
  To: Lukas Bulwahn
  Cc: Alexander Popov, Linus Torvalds, Jonathan Corbet, Paul McKenney,
	Andrew Morton, Thomas Gleixner, Peter Zijlstra, Joerg Roedel,
	Maciej Rozycki, Muchun Song, Viresh Kumar, Robin Murphy,
	Randy Dunlap, Lu Baolu, Petr Mladek, Kees Cook, Luis Chamberlain,
	Wei Liu, John Ogness, Andy Shevchenko, Alexey Kardashevskiy,
	Christophe Leroy, Jann Horn, Greg Kroah-Hartman, Mark Rutland,
	Andy Lutomirski, Dave Hansen, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Mon, 15 Nov 2021 14:59:57 +0100
Lukas Bulwahn <lukas.bulwahn@gmail.com> wrote:

> 1. Allow a reasonably configured kernel to boot and run with
> panic_on_warn set. Warnings should only be raised when something is
> not configured as the developers expect it or the kernel is put into a
> state that generally is _unexpected_ and has been exposed little to
> the critical thought of the developer, to testing efforts and use in
> other systems in the wild. Warnings should not be used for something
> informative, which still allows the kernel to continue running in a
> proper way in a generally expected environment. Up to my knowledge,
> there are some kernels in production that run with panic_on_warn; so,
> IMHO, this requirement is generally accepted (we might of course

To me, WARN*() is the same as BUG*(). If it gets hit, it's a bug in the
kernel and needs to be fixed. I have several WARN*() calls in my code, and
it's all because the algorithms used is expected to prevent the condition
in the warning from happening. If the warning triggers, it means either that
the algorithm is wrong or my assumption about the algorithm is wrong. In
either case, the kernel needs to be updated. All my tests fail if a WARN*()
gets hit (anywhere in the kernel, not just my own).

After reading all the replies and thinking about this more, I find the
pkill_on_warning actually worse than not doing anything. If you are
concerned about exploits from warnings, the only real solution is a
panic_on_warning. Yes, it brings down the system, but really, it has to be
brought down anyway, because it is in need of a kernel update.

-- Steve

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-15 16:06         ` Steven Rostedt
@ 2021-11-15 22:06           ` Kees Cook
  2021-11-16  9:12             ` Alexander Popov
                               ` (2 more replies)
  2021-11-16  6:37           ` Christophe Leroy
  1 sibling, 3 replies; 31+ messages in thread
From: Kees Cook @ 2021-11-15 22:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Lukas Bulwahn, Alexander Popov, Linus Torvalds, Jonathan Corbet,
	Paul McKenney, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Joerg Roedel, Maciej Rozycki, Muchun Song, Viresh Kumar,
	Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
	Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier,
	Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
	Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Mon, Nov 15, 2021 at 11:06:49AM -0500, Steven Rostedt wrote:
> On Mon, 15 Nov 2021 14:59:57 +0100
> Lukas Bulwahn <lukas.bulwahn@gmail.com> wrote:
> 
> > 1. Allow a reasonably configured kernel to boot and run with
> > panic_on_warn set. Warnings should only be raised when something is
> > not configured as the developers expect it or the kernel is put into a
> > state that generally is _unexpected_ and has been exposed little to
> > the critical thought of the developer, to testing efforts and use in
> > other systems in the wild. Warnings should not be used for something
> > informative, which still allows the kernel to continue running in a
> > proper way in a generally expected environment. Up to my knowledge,
> > there are some kernels in production that run with panic_on_warn; so,
> > IMHO, this requirement is generally accepted (we might of course
> 
> To me, WARN*() is the same as BUG*(). If it gets hit, it's a bug in the
> kernel and needs to be fixed. I have several WARN*() calls in my code, and
> it's all because the algorithms used is expected to prevent the condition
> in the warning from happening. If the warning triggers, it means either that
> the algorithm is wrong or my assumption about the algorithm is wrong. In
> either case, the kernel needs to be updated. All my tests fail if a WARN*()
> gets hit (anywhere in the kernel, not just my own).
> 
> After reading all the replies and thinking about this more, I find the
> pkill_on_warning actually worse than not doing anything. If you are
> concerned about exploits from warnings, the only real solution is a
> panic_on_warning. Yes, it brings down the system, but really, it has to be
> brought down anyway, because it is in need of a kernel update.

Hmm, yes. What it originally boiled down to, which is why Linus first
objected to BUG(), was that we don't know what other parts of the system
have been disrupted. The best example is just that of locking: if we
BUG() or do_exit() in the middle of holding a lock, we'll wreck whatever
subsystem that was attached to. Without a deterministic system state
unwinder, there really isn't a "safe" way to just stop a kernel thread.

With this pkill_on_warn, we avoid the BUG problem (since the thread of
execution continues and stops at an 'expected' place: the signal
handler).

However, now we have the newer objection from Linus, which is one of
attribution: the WARN might be hit during an "unrelated" thread of
execution and "current" gets blamed, etc. And beyond that, if we take
down a portion of userspace, what in userspace may be destabilized? In
theory, we get a case where any required daemons would be restarted by
init, but that's not "known".

The safest version of this I can think of is for processes to opt into
this mitigation. That would also cover the "special cases" we've seen
exposed too. i.e. init and kthreads would not opt in.

However, that's a lot to implement when Marco's tracing suggestion might
be sufficient and policy could be entirely implemented in userspace. It
could be as simple as this (totally untested):


diff --git a/include/trace/events/error_report.h b/include/trace/events/error_report.h
index 96f64bf218b2..129d22eb8b6e 100644
--- a/include/trace/events/error_report.h
+++ b/include/trace/events/error_report.h
@@ -16,6 +16,8 @@
 #define __ERROR_REPORT_DECLARE_TRACE_ENUMS_ONCE_ONLY
 
 enum error_detector {
+	ERROR_DETECTOR_WARN,
+	ERROR_DETECTOR_BUG,
 	ERROR_DETECTOR_KFENCE,
 	ERROR_DETECTOR_KASAN
 };
@@ -23,6 +25,8 @@ enum error_detector {
 #endif /* __ERROR_REPORT_DECLARE_TRACE_ENUMS_ONCE_ONLY */
 
 #define error_detector_list	\
+	EM(ERROR_DETECTOR_WARN, "warn")	\
+	EM(ERROR_DETECTOR_BUG, "bug")	\
 	EM(ERROR_DETECTOR_KFENCE, "kfence")	\
 	EMe(ERROR_DETECTOR_KASAN, "kasan")
 /* Always end the list with an EMe. */
diff --git a/lib/bug.c b/lib/bug.c
index 45a0584f6541..201b4070bbbc 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -48,6 +48,7 @@
 #include <linux/sched.h>
 #include <linux/rculist.h>
 #include <linux/ftrace.h>
+#include <trace/events/error_report.h>
 
 extern struct bug_entry __start___bug_table[], __stop___bug_table[];
 
@@ -198,6 +199,7 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
 		/* this is a WARN_ON rather than BUG/BUG_ON */
 		__warn(file, line, (void *)bugaddr, BUG_GET_TAINT(bug), regs,
 		       NULL);
+		trace_error_report_end(ERROR_DETECTOR_WARN, bugaddr);
 		return BUG_TRAP_TYPE_WARN;
 	}
 
@@ -206,6 +208,7 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs)
 	else
 		pr_crit("Kernel BUG at %pB [verbose debug info unavailable]\n",
 			(void *)bugaddr);
+	trace_error_report_end(ERROR_DETECTOR_BUG, bugaddr);
 
 	return BUG_TRAP_TYPE_BUG;
 }


Marco, is this the full version of monitoring this from the userspace
side?

	perf record -e error_report:error_report_end

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-15 16:06         ` Steven Rostedt
  2021-11-15 22:06           ` Kees Cook
@ 2021-11-16  6:37           ` Christophe Leroy
  2021-11-16  8:34             ` Alexander Popov
  2021-11-16  8:57             ` Lukas Bulwahn
  1 sibling, 2 replies; 31+ messages in thread
From: Christophe Leroy @ 2021-11-16  6:37 UTC (permalink / raw)
  To: Steven Rostedt, Lukas Bulwahn
  Cc: Alexander Popov, Linus Torvalds, Jonathan Corbet, Paul McKenney,
	Andrew Morton, Thomas Gleixner, Peter Zijlstra, Joerg Roedel,
	Maciej Rozycki, Muchun Song, Viresh Kumar, Robin Murphy,
	Randy Dunlap, Lu Baolu, Petr Mladek, Kees Cook, Luis Chamberlain,
	Wei Liu, John Ogness, Andy Shevchenko, Alexey Kardashevskiy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Will Deacon, Ard Biesheuvel, Laura Abbott,
	David S Miller, Borislav Petkov, Arnd Bergmann, Andrew Scull,
	Marc Zyngier, Jessica Yu, Iurii Zaikin, Rasmus Villemoes,
	Wang Qing, Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan



Le 15/11/2021 à 17:06, Steven Rostedt a écrit :
> On Mon, 15 Nov 2021 14:59:57 +0100
> Lukas Bulwahn <lukas.bulwahn@gmail.com> wrote:
> 
>> 1. Allow a reasonably configured kernel to boot and run with
>> panic_on_warn set. Warnings should only be raised when something is
>> not configured as the developers expect it or the kernel is put into a
>> state that generally is _unexpected_ and has been exposed little to
>> the critical thought of the developer, to testing efforts and use in
>> other systems in the wild. Warnings should not be used for something
>> informative, which still allows the kernel to continue running in a
>> proper way in a generally expected environment. Up to my knowledge,
>> there are some kernels in production that run with panic_on_warn; so,
>> IMHO, this requirement is generally accepted (we might of course
> 
> To me, WARN*() is the same as BUG*(). If it gets hit, it's a bug in the
> kernel and needs to be fixed. I have several WARN*() calls in my code, and
> it's all because the algorithms used is expected to prevent the condition
> in the warning from happening. If the warning triggers, it means either that
> the algorithm is wrong or my assumption about the algorithm is wrong. In
> either case, the kernel needs to be updated. All my tests fail if a WARN*()
> gets hit (anywhere in the kernel, not just my own).
> 
> After reading all the replies and thinking about this more, I find the
> pkill_on_warning actually worse than not doing anything. If you are
> concerned about exploits from warnings, the only real solution is a
> panic_on_warning. Yes, it brings down the system, but really, it has to be
> brought down anyway, because it is in need of a kernel update.
> 

We also have LIVEPATCH to avoid bringing down the system for a kernel 
update, don't we ? So I wouldn't expect bringing down a vital system 
just for a WARN.

As far as I understand from 
https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on, 
WARN() and WARN_ON() are meant to deal with those situations as 
gracefull as possible, allowing the system to continue running the best 
it can until a human controled action is taken.

So I'd expect the WARN/WARN_ON to be handled and I agree that that 
pkill_on_warning seems dangerous and unrelevant, probably more dangerous 
than doing nothing, especially as the WARN may trigger for a reason 
which has nothing to do with the running thread.

Christophe

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [ELISA Safety Architecture WG] [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-15 15:51         ` [ELISA Safety Architecture WG] " Gabriele Paoloni
@ 2021-11-16  7:52           ` Alexander Popov
  2021-11-16  8:01             ` Lukas Bulwahn
  2021-11-16  8:41             ` Petr Mladek
  0 siblings, 2 replies; 31+ messages in thread
From: Alexander Popov @ 2021-11-16  7:52 UTC (permalink / raw)
  To: Gabriele Paoloni, Lukas Bulwahn, Robert Krutsch
  Cc: Linus Torvalds, Jonathan Corbet, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On 15.11.2021 18:51, Gabriele Paoloni wrote:
> 
> 
> On 15/11/2021 14:59, Lukas Bulwahn wrote:
>> On Sat, Nov 13, 2021 at 7:14 PM Alexander Popov <alex.popov@linux.com> wrote:
>>>
>>> On 13.11.2021 00:26, Linus Torvalds wrote:
>>>> On Fri, Nov 12, 2021 at 10:52 AM Alexander Popov <alex.popov@linux.com> wrote:
>>>>>
>>>>> Hello everyone!
>>>>> Friendly ping for your feedback.
>>>>
>>>> I still haven't heard a compelling _reason_ for this all, and why
>>>> anybody should ever use this or care?
>>>
>>> Ok, to sum up:
>>>
>>> Killing the process that hit a kernel warning complies with the Fail-Fast
>>> principle [1]. pkill_on_warn sysctl allows the kernel to stop the process when
>>> the **first signs** of wrong behavior are detected.
>>>
>>> By default, the Linux kernel ignores a warning and proceeds the execution from
>>> the flawed state. That is opposite to the Fail-Fast principle.
>>> A kernel warning may be followed by memory corruption or other negative effects,
>>> like in CVE-2019-18683 exploit [2] or many other cases detected by the SyzScope
>>> project [3]. pkill_on_warn would prevent the system from the errors going after
>>> a warning in the process context.
>>>
>>> At the same time, pkill_on_warn does not kill the entire system like
>>> panic_on_warn. That is the middle way of handling kernel warnings.
>>> Linus, it's similar to your BUG_ON() policy [4]. The process hitting BUG_ON() is
>>> killed, and the system proceeds to work. pkill_on_warn just brings a similar
>>> policy to WARN_ON() handling.
>>>
>>> I believe that many Linux distros (which don't hit WARN_ON() here and there)
>>> will enable pkill_on_warn because it's reasonable from the safety and security
>>> points of view.
>>>
>>> And I'm sure that the ELISA project by the Linux Foundation (Enabling Linux In
>>> Safety Applications [5]) would support the pkill_on_warn sysctl.
>>> [Adding people from this project to CC]
>>>
>>> I hope that I managed to show the rationale.
>>>
>>
>> Alex, officially and formally, I cannot talk for the ELISA project
>> (Enabling Linux In Safety Applications) by the Linux Foundation and I
>> do not think there is anyone that can confidently do so on such a
>> detailed technical aspect that you are raising here, and as the
>> various participants in the ELISA Project have not really agreed on
>> such a technical aspect being one way or the other and I would not see
>> that happening quickly. However, I have spent quite some years on the
>> topic on "what is the right and important topics for using Linux in
>> safety applications"; so here are my five cents:
>>
>> One of the general assumptions about safety applications and safety
>> systems is that the malfunction of a function within a system is more
>> critical, i.e., more likely to cause harm to people, directly or
>> indirectly, than the unavailability of the system. So, before
>> "something potentially unexpected happens"---which can have arbitrary
>> effects and hence effects difficult to foresee and control---, it is
>> better to just shutdown/silence the system, i.e., design a fail-safe
>> or fail-silent system, as the effect of shutdown is pretty easily
>> foreseeable during the overall system design and you could think about
>> what the overall system does, when the kernel crashes the usual way.
>>
>> So, that brings us to what a user would expect from the kernel in a
>> safety-critical system: Shutdown on any event that is unexpected.
>>
>> Here, I currently see panic_on_warn as the closest existing feature to
>> indicate any event that is unexpected and to shutdown the system. That
>> requires two things for the kernel development:
>>
>> 1. Allow a reasonably configured kernel to boot and run with
>> panic_on_warn set. Warnings should only be raised when something is
>> not configured as the developers expect it or the kernel is put into a
>> state that generally is _unexpected_ and has been exposed little to
>> the critical thought of the developer, to testing efforts and use in
>> other systems in the wild. Warnings should not be used for something
>> informative, which still allows the kernel to continue running in a
>> proper way in a generally expected environment. Up to my knowledge,
>> there are some kernels in production that run with panic_on_warn; so,
>> IMHO, this requirement is generally accepted (we might of course
>> discuss the one or other use of warn) and is not too much to ask for.
>>
>> 2. Really ensure that the system shuts down when it hits warn and
>> panic. That requires that the execution path for warn() and panic() is
>> not overly complicated (stuffed with various bells and whistles).
>> Otherwise, warn() and panic() could fail in various complex ways and
>> potentially keep the system running, although it should be shut down.
>> Some people in the ELISA Project looked a bit into why they believe
>> panic() shuts down a system but I have not seen a good system analysis
>> and argument why any third person could be convinced that panic()
>> works under all circumstances where it is invoked or that at least,
>> the circumstances under which panic really works is properly
>> documented. That is a central aspect for using Linux in a
>> reasonably-designed safety-critical system. That is possibly also
>> relevant for security, as you might see an attacker obtain information
>> because it was possible to "block" the kernel shutting down after
>> invoking panic() and hence, the attacker could obtain certain
>> information that was only possible because 1. the system got into an
>> inconsistent state, 2. it was detected by some check leading to warn()
>> or panic(), and 3. the system's security engineers assumed that the
>> system must have been shutting down at that point, as panic() was
>> invoked, and hence, this would be disallowing a lot of further
>> operations or some specific operations that the attacker would need to
>> trigger in that inconsistent state to obtain information.
>>
>> To your feature, Alex, I do not see the need to have any refined
>> handling of killing a specific process when the kernel warns; stopping
>> the whole system is the better and more predictable thing to do. I
>> would prefer if systems, which have those high-integrity requirements,
>> e.g., in a highly secure---where stopping any unintended information
>> flow matters more than availability---or in fail-silent environments
>> in safety systems, can use panic_on_warn. That should address your
>> concern above of handling certain CVEs as well.
>>
>> In summary, I am not supporting pkill_on_warn. I would support the
>> other points I mentioned above, i.e., a good enforced policy for use
>> of warn() and any investigation to understand the complexity of
>> panic() and reducing its complexity if triggered by such an
>> investigation.
> 
> Hi Alex
> 
> I also agree with the summary that Lukas gave here. From my experience
> the safety system are always guarded by an external flow monitor (e.g. a
> watchdog) that triggers in case the safety relevant workloads slows down
> or block (for any reason); given this condition of use, a system that
> goes into the panic state is always safe, since the watchdog would
> trigger and drive the system automatically into safe state.
> So I also don't see a clear advantage of having pkill_on_warn();
> actually on the flip side it seems to me that such feature could
> introduce more risk, as it kills only the threads of the process that
> caused the kernel warning whereas the other processes are trusted to
> run on a weaker Kernel (does killing the threads of the process that
> caused the kernel warning always fix the Kernel condition that lead to
> the warning?)

Lukas, Gabriele, Robert,
Thanks for showing this from the safety point of view.

The part about believing in panic() functionality is amazing :)
Yes, safety critical systems depend on the robust ability to restart.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [ELISA Safety Architecture WG] [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-16  7:52           ` Alexander Popov
@ 2021-11-16  8:01             ` Lukas Bulwahn
  2021-11-16  8:41             ` Petr Mladek
  1 sibling, 0 replies; 31+ messages in thread
From: Lukas Bulwahn @ 2021-11-16  8:01 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Gabriele Paoloni, Robert Krutsch, Linus Torvalds,
	Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Tue, Nov 16, 2021 at 8:52 AM Alexander Popov <alex.popov@linux.com> wrote:
>
> On 15.11.2021 18:51, Gabriele Paoloni wrote:
> >
> >
> > On 15/11/2021 14:59, Lukas Bulwahn wrote:
> >> On Sat, Nov 13, 2021 at 7:14 PM Alexander Popov <alex.popov@linux.com> wrote:
> >>>
> >>> On 13.11.2021 00:26, Linus Torvalds wrote:
> >>>> On Fri, Nov 12, 2021 at 10:52 AM Alexander Popov <alex.popov@linux.com> wrote:
> >>>>>
> >>>>> Hello everyone!
> >>>>> Friendly ping for your feedback.
> >>>>
> >>>> I still haven't heard a compelling _reason_ for this all, and why
> >>>> anybody should ever use this or care?
> >>>
> >>> Ok, to sum up:
> >>>
> >>> Killing the process that hit a kernel warning complies with the Fail-Fast
> >>> principle [1]. pkill_on_warn sysctl allows the kernel to stop the process when
> >>> the **first signs** of wrong behavior are detected.
> >>>
> >>> By default, the Linux kernel ignores a warning and proceeds the execution from
> >>> the flawed state. That is opposite to the Fail-Fast principle.
> >>> A kernel warning may be followed by memory corruption or other negative effects,
> >>> like in CVE-2019-18683 exploit [2] or many other cases detected by the SyzScope
> >>> project [3]. pkill_on_warn would prevent the system from the errors going after
> >>> a warning in the process context.
> >>>
> >>> At the same time, pkill_on_warn does not kill the entire system like
> >>> panic_on_warn. That is the middle way of handling kernel warnings.
> >>> Linus, it's similar to your BUG_ON() policy [4]. The process hitting BUG_ON() is
> >>> killed, and the system proceeds to work. pkill_on_warn just brings a similar
> >>> policy to WARN_ON() handling.
> >>>
> >>> I believe that many Linux distros (which don't hit WARN_ON() here and there)
> >>> will enable pkill_on_warn because it's reasonable from the safety and security
> >>> points of view.
> >>>
> >>> And I'm sure that the ELISA project by the Linux Foundation (Enabling Linux In
> >>> Safety Applications [5]) would support the pkill_on_warn sysctl.
> >>> [Adding people from this project to CC]
> >>>
> >>> I hope that I managed to show the rationale.
> >>>
> >>
> >> Alex, officially and formally, I cannot talk for the ELISA project
> >> (Enabling Linux In Safety Applications) by the Linux Foundation and I
> >> do not think there is anyone that can confidently do so on such a
> >> detailed technical aspect that you are raising here, and as the
> >> various participants in the ELISA Project have not really agreed on
> >> such a technical aspect being one way or the other and I would not see
> >> that happening quickly. However, I have spent quite some years on the
> >> topic on "what is the right and important topics for using Linux in
> >> safety applications"; so here are my five cents:
> >>
> >> One of the general assumptions about safety applications and safety
> >> systems is that the malfunction of a function within a system is more
> >> critical, i.e., more likely to cause harm to people, directly or
> >> indirectly, than the unavailability of the system. So, before
> >> "something potentially unexpected happens"---which can have arbitrary
> >> effects and hence effects difficult to foresee and control---, it is
> >> better to just shutdown/silence the system, i.e., design a fail-safe
> >> or fail-silent system, as the effect of shutdown is pretty easily
> >> foreseeable during the overall system design and you could think about
> >> what the overall system does, when the kernel crashes the usual way.
> >>
> >> So, that brings us to what a user would expect from the kernel in a
> >> safety-critical system: Shutdown on any event that is unexpected.
> >>
> >> Here, I currently see panic_on_warn as the closest existing feature to
> >> indicate any event that is unexpected and to shutdown the system. That
> >> requires two things for the kernel development:
> >>
> >> 1. Allow a reasonably configured kernel to boot and run with
> >> panic_on_warn set. Warnings should only be raised when something is
> >> not configured as the developers expect it or the kernel is put into a
> >> state that generally is _unexpected_ and has been exposed little to
> >> the critical thought of the developer, to testing efforts and use in
> >> other systems in the wild. Warnings should not be used for something
> >> informative, which still allows the kernel to continue running in a
> >> proper way in a generally expected environment. Up to my knowledge,
> >> there are some kernels in production that run with panic_on_warn; so,
> >> IMHO, this requirement is generally accepted (we might of course
> >> discuss the one or other use of warn) and is not too much to ask for.
> >>
> >> 2. Really ensure that the system shuts down when it hits warn and
> >> panic. That requires that the execution path for warn() and panic() is
> >> not overly complicated (stuffed with various bells and whistles).
> >> Otherwise, warn() and panic() could fail in various complex ways and
> >> potentially keep the system running, although it should be shut down.
> >> Some people in the ELISA Project looked a bit into why they believe
> >> panic() shuts down a system but I have not seen a good system analysis
> >> and argument why any third person could be convinced that panic()
> >> works under all circumstances where it is invoked or that at least,
> >> the circumstances under which panic really works is properly
> >> documented. That is a central aspect for using Linux in a
> >> reasonably-designed safety-critical system. That is possibly also
> >> relevant for security, as you might see an attacker obtain information
> >> because it was possible to "block" the kernel shutting down after
> >> invoking panic() and hence, the attacker could obtain certain
> >> information that was only possible because 1. the system got into an
> >> inconsistent state, 2. it was detected by some check leading to warn()
> >> or panic(), and 3. the system's security engineers assumed that the
> >> system must have been shutting down at that point, as panic() was
> >> invoked, and hence, this would be disallowing a lot of further
> >> operations or some specific operations that the attacker would need to
> >> trigger in that inconsistent state to obtain information.
> >>
> >> To your feature, Alex, I do not see the need to have any refined
> >> handling of killing a specific process when the kernel warns; stopping
> >> the whole system is the better and more predictable thing to do. I
> >> would prefer if systems, which have those high-integrity requirements,
> >> e.g., in a highly secure---where stopping any unintended information
> >> flow matters more than availability---or in fail-silent environments
> >> in safety systems, can use panic_on_warn. That should address your
> >> concern above of handling certain CVEs as well.
> >>
> >> In summary, I am not supporting pkill_on_warn. I would support the
> >> other points I mentioned above, i.e., a good enforced policy for use
> >> of warn() and any investigation to understand the complexity of
> >> panic() and reducing its complexity if triggered by such an
> >> investigation.
> >
> > Hi Alex
> >
> > I also agree with the summary that Lukas gave here. From my experience
> > the safety system are always guarded by an external flow monitor (e.g. a
> > watchdog) that triggers in case the safety relevant workloads slows down
> > or block (for any reason); given this condition of use, a system that
> > goes into the panic state is always safe, since the watchdog would
> > trigger and drive the system automatically into safe state.
> > So I also don't see a clear advantage of having pkill_on_warn();
> > actually on the flip side it seems to me that such feature could
> > introduce more risk, as it kills only the threads of the process that
> > caused the kernel warning whereas the other processes are trusted to
> > run on a weaker Kernel (does killing the threads of the process that
> > caused the kernel warning always fix the Kernel condition that lead to
> > the warning?)
>
> Lukas, Gabriele, Robert,
> Thanks for showing this from the safety point of view.
>
> The part about believing in panic() functionality is amazing :)
> Yes, safety critical systems depend on the robust ability to restart.
>

Well, there is really a lot of thought and willingness for engineering
effort to address the fact there must be high confidence that the
shutdown with panic() really works.

The proper start and restart of the kernel is actually another
issue... but there various sanity checks are possible before the
system switches into a mode that could potentially harm people (cause
physical damage, directly or indirectly).

Lukas

> Best regards,
> Alexander

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-16  6:37           ` Christophe Leroy
@ 2021-11-16  8:34             ` Alexander Popov
  2021-11-16  8:57             ` Lukas Bulwahn
  1 sibling, 0 replies; 31+ messages in thread
From: Alexander Popov @ 2021-11-16  8:34 UTC (permalink / raw)
  To: Christophe Leroy, Steven Rostedt, Lukas Bulwahn
  Cc: Linus Torvalds, Jonathan Corbet, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
	Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier,
	Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
	Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan, Gabriele Paoloni, Robert Krutsch

On 16.11.2021 09:37, Christophe Leroy wrote:
> Le 15/11/2021 à 17:06, Steven Rostedt a écrit :
>> On Mon, 15 Nov 2021 14:59:57 +0100
>> Lukas Bulwahn <lukas.bulwahn@gmail.com> wrote:
>>
>>> 1. Allow a reasonably configured kernel to boot and run with
>>> panic_on_warn set. Warnings should only be raised when something is
>>> not configured as the developers expect it or the kernel is put into a
>>> state that generally is _unexpected_ and has been exposed little to
>>> the critical thought of the developer, to testing efforts and use in
>>> other systems in the wild. Warnings should not be used for something
>>> informative, which still allows the kernel to continue running in a
>>> proper way in a generally expected environment. Up to my knowledge,
>>> there are some kernels in production that run with panic_on_warn; so,
>>> IMHO, this requirement is generally accepted (we might of course
>>
>> To me, WARN*() is the same as BUG*(). If it gets hit, it's a bug in the
>> kernel and needs to be fixed. I have several WARN*() calls in my code, and
>> it's all because the algorithms used is expected to prevent the condition
>> in the warning from happening. If the warning triggers, it means either that
>> the algorithm is wrong or my assumption about the algorithm is wrong. In
>> either case, the kernel needs to be updated. All my tests fail if a WARN*()
>> gets hit (anywhere in the kernel, not just my own).
>>
>> After reading all the replies and thinking about this more, I find the
>> pkill_on_warning actually worse than not doing anything. If you are
>> concerned about exploits from warnings, the only real solution is a
>> panic_on_warning. Yes, it brings down the system, but really, it has to be
>> brought down anyway, because it is in need of a kernel update.
>>
> 
> We also have LIVEPATCH to avoid bringing down the system for a kernel
> update, don't we ? So I wouldn't expect bringing down a vital system
> just for a WARN.

Hello Christophe,

I would say that different systems have different requirements.
Not every Linux-based system needs live patching (it also has own limitations).

That's why I proposed a sysctl and didn't change the default kernel behavior.

> As far as I understand from
> https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on,
> WARN() and WARN_ON() are meant to deal with those situations as
> gracefull as possible, allowing the system to continue running the best
> it can until a human controled action is taken.

I can't agree here. There is a very strong push against adding BUG*() to the 
kernel source code. So there are a lot of cases when WARN*() is used for severe 
problems because kernel developers just don't have other options.

Currently, it looks like there is no consistent error handling policy in the kernel.

> So I'd expect the WARN/WARN_ON to be handled and I agree that that
> pkill_on_warning seems dangerous and unrelevant, probably more dangerous
> than doing nothing, especially as the WARN may trigger for a reason
> which has nothing to do with the running thread.

Sorry, I see a contradiction.
If killing a process hitting a kernel warning is "dangerous and unrelevant",
why killing a process on a kernel oops is fine? That's strange.

Linus calls that behavior "fairly benign" here: 
http://lkml.iu.edu/hypermail/linux/kernel/1610.0/01217.html

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [ELISA Safety Architecture WG] [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-16  7:52           ` Alexander Popov
  2021-11-16  8:01             ` Lukas Bulwahn
@ 2021-11-16  8:41             ` Petr Mladek
  2021-11-16  9:19               ` Lukas Bulwahn
  2021-11-16 13:20               ` James Bottomley
  1 sibling, 2 replies; 31+ messages in thread
From: Petr Mladek @ 2021-11-16  8:41 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Gabriele Paoloni, Lukas Bulwahn, Robert Krutsch, Linus Torvalds,
	Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Kees Cook,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Steven Rostedt, Will Deacon, Ard Biesheuvel, Laura Abbott,
	David S Miller, Borislav Petkov, Arnd Bergmann, Andrew Scull,
	Marc Zyngier, Jessica Yu, Iurii Zaikin, Rasmus Villemoes,
	Wang Qing, Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Tue 2021-11-16 10:52:39, Alexander Popov wrote:
> On 15.11.2021 18:51, Gabriele Paoloni wrote:
> > On 15/11/2021 14:59, Lukas Bulwahn wrote:
> > > On Sat, Nov 13, 2021 at 7:14 PM Alexander Popov <alex.popov@linux.com> wrote:
> > > > On 13.11.2021 00:26, Linus Torvalds wrote:
> > > > > On Fri, Nov 12, 2021 at 10:52 AM Alexander Popov <alex.popov@linux.com> wrote:
> > > > Killing the process that hit a kernel warning complies with the Fail-Fast
> > > > principle [1]. pkill_on_warn sysctl allows the kernel to stop the process when
> > > > the **first signs** of wrong behavior are detected.
> > > > 
> > > In summary, I am not supporting pkill_on_warn. I would support the
> > > other points I mentioned above, i.e., a good enforced policy for use
> > > of warn() and any investigation to understand the complexity of
> > > panic() and reducing its complexity if triggered by such an
> > > investigation.
> > 
> > Hi Alex
> > 
> > I also agree with the summary that Lukas gave here. From my experience
> > the safety system are always guarded by an external flow monitor (e.g. a
> > watchdog) that triggers in case the safety relevant workloads slows down
> > or block (for any reason); given this condition of use, a system that
> > goes into the panic state is always safe, since the watchdog would
> > trigger and drive the system automatically into safe state.
> > So I also don't see a clear advantage of having pkill_on_warn();
> > actually on the flip side it seems to me that such feature could
> > introduce more risk, as it kills only the threads of the process that
> > caused the kernel warning whereas the other processes are trusted to
> > run on a weaker Kernel (does killing the threads of the process that
> > caused the kernel warning always fix the Kernel condition that lead to
> > the warning?)
> 
> Lukas, Gabriele, Robert,
> Thanks for showing this from the safety point of view.
> 
> The part about believing in panic() functionality is amazing :)

Nothing is 100% reliable.

With printk() maintainer hat on, the current panic() implementation
is less reliable because it tries hard to provide some debugging
information, for example, error message, backtrace, registry,
flush pending messages on console, crashdump.

See panic() implementation, the reboot is done by emergency_restart().
The rest is about duping the information.

Well, the information is important. Otherwise, it is really hard to
fix the problem.

From my experience, especially the access to consoles is not fully
safe. The reliability might improve a lot when a lockless console
is used. I guess that using non-volatile memory for the log buffer
might be even more reliable.

I am not familiar with the code under emergency_restart(). I am not
sure how reliable it is.

> Yes, safety critical systems depend on the robust ability to restart.

If I wanted to implement a super-reliable panic() I would
use some external device that would cause power-reset when
the watched device is not responding.

Best Regards,
Petr


PS: I do not believe much into the pkill approach as well.

    It is similar to OOM killer. And I always had to restart the
    system when it was triggered.

    Also kernel is not prepared for the situation that an external
    code kills a kthread. And kthreads are used by many subsystems
    to handle work that has to be done asynchronously and/or in
    process context. And I guess that kthreads are non-trivial
    source of WARN().

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-16  6:37           ` Christophe Leroy
  2021-11-16  8:34             ` Alexander Popov
@ 2021-11-16  8:57             ` Lukas Bulwahn
  1 sibling, 0 replies; 31+ messages in thread
From: Lukas Bulwahn @ 2021-11-16  8:57 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Steven Rostedt, Alexander Popov, Linus Torvalds, Jonathan Corbet,
	Paul McKenney, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Joerg Roedel, Maciej Rozycki, Muchun Song, Viresh Kumar,
	Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek, Kees Cook,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Jann Horn, Greg Kroah-Hartman,
	Mark Rutland, Andy Lutomirski, Dave Hansen, Will Deacon,
	Ard Biesheuvel, Laura Abbott, David S Miller, Borislav Petkov,
	Arnd Bergmann, Andrew Scull, Marc Zyngier, Jessica Yu,
	Iurii Zaikin, Rasmus Villemoes, Wang Qing, Mel Gorman,
	Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Tue, Nov 16, 2021 at 7:37 AM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 15/11/2021 à 17:06, Steven Rostedt a écrit :
> > On Mon, 15 Nov 2021 14:59:57 +0100
> > Lukas Bulwahn <lukas.bulwahn@gmail.com> wrote:
> >
> >> 1. Allow a reasonably configured kernel to boot and run with
> >> panic_on_warn set. Warnings should only be raised when something is
> >> not configured as the developers expect it or the kernel is put into a
> >> state that generally is _unexpected_ and has been exposed little to
> >> the critical thought of the developer, to testing efforts and use in
> >> other systems in the wild. Warnings should not be used for something
> >> informative, which still allows the kernel to continue running in a
> >> proper way in a generally expected environment. Up to my knowledge,
> >> there are some kernels in production that run with panic_on_warn; so,
> >> IMHO, this requirement is generally accepted (we might of course
> >
> > To me, WARN*() is the same as BUG*(). If it gets hit, it's a bug in the
> > kernel and needs to be fixed. I have several WARN*() calls in my code, and
> > it's all because the algorithms used is expected to prevent the condition
> > in the warning from happening. If the warning triggers, it means either that
> > the algorithm is wrong or my assumption about the algorithm is wrong. In
> > either case, the kernel needs to be updated. All my tests fail if a WARN*()
> > gets hit (anywhere in the kernel, not just my own).
> >
> > After reading all the replies and thinking about this more, I find the
> > pkill_on_warning actually worse than not doing anything. If you are
> > concerned about exploits from warnings, the only real solution is a
> > panic_on_warning. Yes, it brings down the system, but really, it has to be
> > brought down anyway, because it is in need of a kernel update.
> >
>
> We also have LIVEPATCH to avoid bringing down the system for a kernel
> update, don't we ? So I wouldn't expect bringing down a vital system
> just for a WARN.
>
> As far as I understand from
> https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on,
> WARN() and WARN_ON() are meant to deal with those situations as
> gracefull as possible, allowing the system to continue running the best
> it can until a human controled action is taken.
>
> So I'd expect the WARN/WARN_ON to be handled and I agree that that
> pkill_on_warning seems dangerous and unrelevant, probably more dangerous
> than doing nothing, especially as the WARN may trigger for a reason
> which has nothing to do with the running thread.
>

Christophe,

I agree with a reasonable goal that WARN() should allow users "to deal
with those situations as gracefull as possible, allowing the system to
continue running the best it can until a human controled action is
taken."

However, that makes me wonder even more: what does the system after a
WARN() invocation still need to provide as properly working
functionality, so that the human can take action, and how can the
kernel indicate to the whole user applications that a certain
functionality is not working anymore and how adaptive can those user
application really be here? Making that explicit for every WARN()
invocation seems to be tricky and probably also quite error-prone. So,
in the end, after a WARN(), you end up running a system where you have
this uncomfortable feeling of a running system where some things work
and some things do not and it might be insecure (the whole system
security concept is invalidated, because security features do not
work, security holes are opened etc.) or other surprises happen.

The panic_on_warn implements a simple policy of that "run as graceful
as possible": We assume stopping the kernel is _graceful_, and we just
assume that the functionality "panic shuts down the system" still
works properly after any WARN() invocation. Once the system is shut
down, the human can take action and switch it into some (remote)
diagnostic mode for further analysis and repair.

I am wondering if that policy and that assumption holds for all WARN()
invocations in the kernel? I would hope that we can answer this
question, which is much simpler than getting the precise answer on
"what as graceful as possible actually means".

Lukas

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-15 22:06           ` Kees Cook
@ 2021-11-16  9:12             ` Alexander Popov
  2021-11-16 18:41               ` Kees Cook
  2021-11-16 13:07             ` James Bottomley
  2021-11-20 12:17             ` Marco Elver
  2 siblings, 1 reply; 31+ messages in thread
From: Alexander Popov @ 2021-11-16  9:12 UTC (permalink / raw)
  To: Kees Cook, Steven Rostedt, Linus Torvalds
  Cc: Lukas Bulwahn, Linus Torvalds, Jonathan Corbet, Paul McKenney,
	Andrew Morton, Thomas Gleixner, Peter Zijlstra, Joerg Roedel,
	Maciej Rozycki, Muchun Song, Viresh Kumar, Robin Murphy,
	Randy Dunlap, Lu Baolu, Petr Mladek, Luis Chamberlain, Wei Liu,
	John Ogness, Andy Shevchenko, Alexey Kardashevskiy,
	Christophe Leroy, Jann Horn, Greg Kroah-Hartman, Mark Rutland,
	Andy Lutomirski, Dave Hansen, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On 16.11.2021 01:06, Kees Cook wrote:
> Hmm, yes. What it originally boiled down to, which is why Linus first
> objected to BUG(), was that we don't know what other parts of the system
> have been disrupted. The best example is just that of locking: if we
> BUG() or do_exit() in the middle of holding a lock, we'll wreck whatever
> subsystem that was attached to. Without a deterministic system state
> unwinder, there really isn't a "safe" way to just stop a kernel thread.
> 
> With this pkill_on_warn, we avoid the BUG problem (since the thread of
> execution continues and stops at an 'expected' place: the signal
> handler).
> 
> However, now we have the newer objection from Linus, which is one of
> attribution: the WARN might be hit during an "unrelated" thread of
> execution and "current" gets blamed, etc. And beyond that, if we take
> down a portion of userspace, what in userspace may be destabilized? In
> theory, we get a case where any required daemons would be restarted by
> init, but that's not "known".
> 
> The safest version of this I can think of is for processes to opt into
> this mitigation. That would also cover the "special cases" we've seen
> exposed too. i.e. init and kthreads would not opt in.
> 
> However, that's a lot to implement when Marco's tracing suggestion might
> be sufficient and policy could be entirely implemented in userspace. It
> could be as simple as this (totally untested):

I don't think that this userspace warning handling can work as pkill_on_warn.

1. The kernel code execution continues after WARN_ON(), it will not wait some 
userspace daemon that is polling trace events. That's not different from 
ignoring and having all negative effects after WARN_ON().

2. This userspace policy will miss WARN_ON_ONCE(), WARN_ONCE() and 
WARN_TAINT_ONCE() after the first hit.


Oh, wait...
I got a crazy idea that may bring more consistency in the error handling mess.

What if the Linux kernel had a LSM module responsible for error handling policy?
That would require adding LSM hooks to BUG*(), WARN*(), KERN_EMERG, etc.
In such LSM policy we can decide immediately how to react on the kernel error.
We can even decide depending on the subsystem and things like that.

(idea for brainstorming)

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [ELISA Safety Architecture WG] [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-16  8:41             ` Petr Mladek
@ 2021-11-16  9:19               ` Lukas Bulwahn
  2021-11-16 13:20               ` James Bottomley
  1 sibling, 0 replies; 31+ messages in thread
From: Lukas Bulwahn @ 2021-11-16  9:19 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Alexander Popov, Gabriele Paoloni, Robert Krutsch,
	Linus Torvalds, Jonathan Corbet, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Tue, Nov 16, 2021 at 9:41 AM Petr Mladek <pmladek@suse.com> wrote:
>
> On Tue 2021-11-16 10:52:39, Alexander Popov wrote:
> > On 15.11.2021 18:51, Gabriele Paoloni wrote:
> > > On 15/11/2021 14:59, Lukas Bulwahn wrote:
> > > > On Sat, Nov 13, 2021 at 7:14 PM Alexander Popov <alex.popov@linux.com> wrote:
> > > > > On 13.11.2021 00:26, Linus Torvalds wrote:
> > > > > > On Fri, Nov 12, 2021 at 10:52 AM Alexander Popov <alex.popov@linux.com> wrote:
> > > > > Killing the process that hit a kernel warning complies with the Fail-Fast
> > > > > principle [1]. pkill_on_warn sysctl allows the kernel to stop the process when
> > > > > the **first signs** of wrong behavior are detected.
> > > > >
> > > > In summary, I am not supporting pkill_on_warn. I would support the
> > > > other points I mentioned above, i.e., a good enforced policy for use
> > > > of warn() and any investigation to understand the complexity of
> > > > panic() and reducing its complexity if triggered by such an
> > > > investigation.
> > >
> > > Hi Alex
> > >
> > > I also agree with the summary that Lukas gave here. From my experience
> > > the safety system are always guarded by an external flow monitor (e.g. a
> > > watchdog) that triggers in case the safety relevant workloads slows down
> > > or block (for any reason); given this condition of use, a system that
> > > goes into the panic state is always safe, since the watchdog would
> > > trigger and drive the system automatically into safe state.
> > > So I also don't see a clear advantage of having pkill_on_warn();
> > > actually on the flip side it seems to me that such feature could
> > > introduce more risk, as it kills only the threads of the process that
> > > caused the kernel warning whereas the other processes are trusted to
> > > run on a weaker Kernel (does killing the threads of the process that
> > > caused the kernel warning always fix the Kernel condition that lead to
> > > the warning?)
> >
> > Lukas, Gabriele, Robert,
> > Thanks for showing this from the safety point of view.
> >
> > The part about believing in panic() functionality is amazing :)
>
> Nothing is 100% reliable.
>
> With printk() maintainer hat on, the current panic() implementation
> is less reliable because it tries hard to provide some debugging
> information, for example, error message, backtrace, registry,
> flush pending messages on console, crashdump.
>
> See panic() implementation, the reboot is done by emergency_restart().
> The rest is about duping the information.
>
> Well, the information is important. Otherwise, it is really hard to
> fix the problem.
>
> From my experience, especially the access to consoles is not fully
> safe. The reliability might improve a lot when a lockless console
> is used. I guess that using non-volatile memory for the log buffer
> might be even more reliable.
>
> I am not familiar with the code under emergency_restart(). I am not
> sure how reliable it is.
>
> > Yes, safety critical systems depend on the robust ability to restart.
>
> If I wanted to implement a super-reliable panic() I would
> use some external device that would cause power-reset when
> the watched device is not responding.
>

Petr, that is basically the common system design taken.

The whole challenge then remains to show that:

Once panic() was invoked, the watched device does not signal being
alive unintentionally, while the panic() is stuck in its shutdown
routines. That requires having a panic() or other shutdown routine
that still reliably can do something that the kernel routine that
makes the watched device signal does not signal anymore.


Lukas

> Best Regards,
> Petr
>
>
> PS: I do not believe much into the pkill approach as well.
>
>     It is similar to OOM killer. And I always had to restart the
>     system when it was triggered.
>
>     Also kernel is not prepared for the situation that an external
>     code kills a kthread. And kthreads are used by many subsystems
>     to handle work that has to be done asynchronously and/or in
>     process context. And I guess that kthreads are non-trivial
>     source of WARN().

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-15 22:06           ` Kees Cook
  2021-11-16  9:12             ` Alexander Popov
@ 2021-11-16 13:07             ` James Bottomley
  2021-11-20 12:17             ` Marco Elver
  2 siblings, 0 replies; 31+ messages in thread
From: James Bottomley @ 2021-11-16 13:07 UTC (permalink / raw)
  To: Kees Cook, Steven Rostedt
  Cc: Lukas Bulwahn, Alexander Popov, Linus Torvalds, Jonathan Corbet,
	Paul McKenney, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Joerg Roedel, Maciej Rozycki, Muchun Song, Viresh Kumar,
	Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
	Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier,
	Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
	Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Mon, 2021-11-15 at 14:06 -0800, Kees Cook wrote:
> On Mon, Nov 15, 2021 at 11:06:49AM -0500, Steven Rostedt wrote:
> > On Mon, 15 Nov 2021 14:59:57 +0100
> > Lukas Bulwahn <lukas.bulwahn@gmail.com> wrote:
> > 
> > > 1. Allow a reasonably configured kernel to boot and run with
> > > panic_on_warn set. Warnings should only be raised when something
> > > is not configured as the developers expect it or the kernel is
> > > put into a state that generally is _unexpected_ and has been
> > > exposed little to the critical thought of the developer, to
> > > testing efforts and use in other systems in the wild. Warnings
> > > should not be used for something informative, which still allows
> > > the kernel to continue running in a proper way in a generally
> > > expected environment. Up to my knowledge, there are some kernels
> > > in production that run with panic_on_warn; so, IMHO, this
> > > requirement is generally accepted (we might of course
> > 
> > To me, WARN*() is the same as BUG*(). If it gets hit, it's a bug in
> > the kernel and needs to be fixed. I have several WARN*() calls in
> > my code, and it's all because the algorithms used is expected to
> > prevent the condition in the warning from happening. If the warning
> > triggers, it means either that the algorithm is wrong or my
> > assumption about the algorithm is wrong. In either case, the kernel
> > needs to be updated. All my tests fail if a WARN*() gets hit
> > (anywhere in the kernel, not just my own).
> > 
> > After reading all the replies and thinking about this more, I find
> > the pkill_on_warning actually worse than not doing anything. If you
> > are concerned about exploits from warnings, the only real solution
> > is a panic_on_warning. Yes, it brings down the system, but really,
> > it has to be brought down anyway, because it is in need of a kernel
> > update.
> 
> Hmm, yes. What it originally boiled down to, which is why Linus first
> objected to BUG(), was that we don't know what other parts of the
> system have been disrupted. The best example is just that of locking:
> if we BUG() or do_exit() in the middle of holding a lock, we'll wreck
> whatever subsystem that was attached to. Without a deterministic
> system state unwinder, there really isn't a "safe" way to just stop a
> kernel thread.

But this misses the real point: the majority of WARN conditions are in
device drivers checking expected device state against an internal state
model.  If this triggers it's a problem with the device not the thread,
so killing the thread is blaming the wrong party and making the
situation worse because it didn't do anything to address the actual
problem.

> With this pkill_on_warn, we avoid the BUG problem (since the thread
> of execution continues and stops at an 'expected' place: the signal
> handler).

And what about the unexpected state?

> However, now we have the newer objection from Linus, which is one of
> attribution: the WARN might be hit during an "unrelated" thread of
> execution and "current" gets blamed, etc. And beyond that, if we take
> down a portion of userspace, what in userspace may be destabilized?
> In theory, we get a case where any required daemons would be
> restarted by init, but that's not "known".
> 
> The safest version of this I can think of is for processes to opt
> into this mitigation. That would also cover the "special cases" we've
> seen exposed too. i.e. init and kthreads would not opt in.
> 
> However, that's a lot to implement when Marco's tracing suggestion
> might be sufficient and policy could be entirely implemented in
> userspace. It could be as simple as this (totally untested):

Really, no, this is precisely wrong thinking.  If the condition were
recoverable it wouldn't result in a WARN.  There are some WARNs where
we think the condition is unexpected enough not to bother adding error
handling (we need these reporting so we know that the assumption was
wrong), but for most if there were a way to handle it we'd have built
it into the usual error flow.  What WARN means is that an unexpected
condition occurred which means the kernel itself is in an unknown
state.  You can't recover from that by killing and restarting random
stuff, you have to reinitialize to a known state (i.e. reset the
system).  Some of the reason we do WARN instead of BUG is that we
believe the state contamination is limited and if you're careful the
system can continue in a degraded state if the user wants to accept the
risk.  Thinking the user can handle the state reset locally by some
preset policy is pure fantasy: if we didn't know how to fix it at the
point it occurred, why would something far away from the action when
most of the information has been lost have a better chance?

Your only policy choices when hitting WARN are

   1. Accept the risk and continue degraded operation, or
   2. reset the system to a known good state.

James




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [ELISA Safety Architecture WG] [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-16  8:41             ` Petr Mladek
  2021-11-16  9:19               ` Lukas Bulwahn
@ 2021-11-16 13:20               ` James Bottomley
  1 sibling, 0 replies; 31+ messages in thread
From: James Bottomley @ 2021-11-16 13:20 UTC (permalink / raw)
  To: Petr Mladek, Alexander Popov
  Cc: Gabriele Paoloni, Lukas Bulwahn, Robert Krutsch, Linus Torvalds,
	Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Kees Cook,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Steven Rostedt, Will Deacon, Ard Biesheuvel, Laura Abbott,
	David S Miller, Borislav Petkov, Arnd Bergmann, Andrew Scull,
	Marc Zyngier, Jessica Yu, Iurii Zaikin, Rasmus Villemoes,
	Wang Qing, Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Tue, 2021-11-16 at 09:41 +0100, Petr Mladek wrote:
[...]
> If I wanted to implement a super-reliable panic() I would
> use some external device that would cause power-reset when
> the watched device is not responding.

They're called watchdog timers.  We have a whole subsystem full of
them:

drivers/watchdog

We used them in old cluster HA systems to guarantee successful recovery
of shared state from contaminated cluster members, but I think they'd
serve the reliable panic need equally well.  Most server class systems
today have them built in (on the BMC if they don't have a separate
mechanism), they're just not usually activated.

James



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-16  9:12             ` Alexander Popov
@ 2021-11-16 18:41               ` Kees Cook
  2021-11-16 19:00                 ` Casey Schaufler
  0 siblings, 1 reply; 31+ messages in thread
From: Kees Cook @ 2021-11-16 18:41 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Steven Rostedt, Linus Torvalds, Lukas Bulwahn, Jonathan Corbet,
	Paul McKenney, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Joerg Roedel, Maciej Rozycki, Muchun Song, Viresh Kumar,
	Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
	Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier,
	Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
	Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Tue, Nov 16, 2021 at 12:12:16PM +0300, Alexander Popov wrote:
> What if the Linux kernel had a LSM module responsible for error handling policy?
> That would require adding LSM hooks to BUG*(), WARN*(), KERN_EMERG, etc.
> In such LSM policy we can decide immediately how to react on the kernel error.
> We can even decide depending on the subsystem and things like that.

That would solve the "atomicity" issue the WARN tracepoint solution has,
and it would allow for very flexible userspace policy.

I actually wonder if the existing panic_on_* sites should serve as a
guide for where to put the hooks. The current sysctls could be replaced
by the hooks and a simple LSM.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-16 18:41               ` Kees Cook
@ 2021-11-16 19:00                 ` Casey Schaufler
  2021-11-18 17:32                   ` Kees Cook
  0 siblings, 1 reply; 31+ messages in thread
From: Casey Schaufler @ 2021-11-16 19:00 UTC (permalink / raw)
  To: Kees Cook, Alexander Popov
  Cc: Steven Rostedt, Linus Torvalds, Lukas Bulwahn, Jonathan Corbet,
	Paul McKenney, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Joerg Roedel, Maciej Rozycki, Muchun Song, Viresh Kumar,
	Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
	Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier,
	Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
	Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan, Casey Schaufler

On 11/16/2021 10:41 AM, Kees Cook wrote:
> On Tue, Nov 16, 2021 at 12:12:16PM +0300, Alexander Popov wrote:
>> What if the Linux kernel had a LSM module responsible for error handling policy?
>> That would require adding LSM hooks to BUG*(), WARN*(), KERN_EMERG, etc.
>> In such LSM policy we can decide immediately how to react on the kernel error.
>> We can even decide depending on the subsystem and things like that.
> That would solve the "atomicity" issue the WARN tracepoint solution has,
> and it would allow for very flexible userspace policy.
>
> I actually wonder if the existing panic_on_* sites should serve as a
> guide for where to put the hooks. The current sysctls could be replaced
> by the hooks and a simple LSM.

Do you really want to make error handling a "security" issue?
If you add security_bug(), security_warn_on() and the like
you're begging that they be included in SELinux (AppArmor) policy.
BPF, too, come to think of it. Is that what you want?


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-16 19:00                 ` Casey Schaufler
@ 2021-11-18 17:32                   ` Kees Cook
  2021-11-18 18:30                     ` Casey Schaufler
  0 siblings, 1 reply; 31+ messages in thread
From: Kees Cook @ 2021-11-18 17:32 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Alexander Popov, Steven Rostedt, Linus Torvalds, Lukas Bulwahn,
	Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
	Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier,
	Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
	Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Tue, Nov 16, 2021 at 11:00:23AM -0800, Casey Schaufler wrote:
> On 11/16/2021 10:41 AM, Kees Cook wrote:
> > On Tue, Nov 16, 2021 at 12:12:16PM +0300, Alexander Popov wrote:
> > > What if the Linux kernel had a LSM module responsible for error handling policy?
> > > That would require adding LSM hooks to BUG*(), WARN*(), KERN_EMERG, etc.
> > > In such LSM policy we can decide immediately how to react on the kernel error.
> > > We can even decide depending on the subsystem and things like that.
> > That would solve the "atomicity" issue the WARN tracepoint solution has,
> > and it would allow for very flexible userspace policy.
> > 
> > I actually wonder if the existing panic_on_* sites should serve as a
> > guide for where to put the hooks. The current sysctls could be replaced
> > by the hooks and a simple LSM.
> 
> Do you really want to make error handling a "security" issue?
> If you add security_bug(), security_warn_on() and the like
> you're begging that they be included in SELinux (AppArmor) policy.
> BPF, too, come to think of it. Is that what you want?

Yeah, that is what I was thinking. This would give the LSM a view into
kernel state, which seems a reasonable thing to do. If system integrity
is compromised, an LSM may want to stop trusting things.

A dedicated error-handling LSM could be added for those hooks that
implemented the existing default panic_on_* sysctls, and could expand on
that logic for other actions.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-18 17:32                   ` Kees Cook
@ 2021-11-18 18:30                     ` Casey Schaufler
  2021-11-18 20:29                       ` Kees Cook
  0 siblings, 1 reply; 31+ messages in thread
From: Casey Schaufler @ 2021-11-18 18:30 UTC (permalink / raw)
  To: Kees Cook
  Cc: Alexander Popov, Steven Rostedt, Linus Torvalds, Lukas Bulwahn,
	Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
	Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier,
	Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
	Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan, Casey Schaufler

On 11/18/2021 9:32 AM, Kees Cook wrote:
> On Tue, Nov 16, 2021 at 11:00:23AM -0800, Casey Schaufler wrote:
>> On 11/16/2021 10:41 AM, Kees Cook wrote:
>>> On Tue, Nov 16, 2021 at 12:12:16PM +0300, Alexander Popov wrote:
>>>> What if the Linux kernel had a LSM module responsible for error handling policy?
>>>> That would require adding LSM hooks to BUG*(), WARN*(), KERN_EMERG, etc.
>>>> In such LSM policy we can decide immediately how to react on the kernel error.
>>>> We can even decide depending on the subsystem and things like that.
>>> That would solve the "atomicity" issue the WARN tracepoint solution has,
>>> and it would allow for very flexible userspace policy.
>>>
>>> I actually wonder if the existing panic_on_* sites should serve as a
>>> guide for where to put the hooks. The current sysctls could be replaced
>>> by the hooks and a simple LSM.
>> Do you really want to make error handling a "security" issue?
>> If you add security_bug(), security_warn_on() and the like
>> you're begging that they be included in SELinux (AppArmor) policy.
>> BPF, too, come to think of it. Is that what you want?
> Yeah, that is what I was thinking. This would give the LSM a view into
> kernel state, which seems a reasonable thing to do. If system integrity
> is compromised, an LSM may want to stop trusting things.

How are you planning to communicate the security relevance of the
warning to the LSM? I don't think that __FILE__, __LINE__ or __func__
is great information to base security policy on. Nor is a backtrace.

> A dedicated error-handling LSM could be added for those hooks that
> implemented the existing default panic_on_* sysctls, and could expand on
> that logic for other actions.

I can see having an interface like LSM for choosing a bug/warn policy.
I worry about expanding the LSM hook list for a case where I would
hope no existing LSM would use them, and the new LSM doesn't use any
of the existing hooks.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-18 18:30                     ` Casey Schaufler
@ 2021-11-18 20:29                       ` Kees Cook
  0 siblings, 0 replies; 31+ messages in thread
From: Kees Cook @ 2021-11-18 20:29 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Alexander Popov, Steven Rostedt, Linus Torvalds, Lukas Bulwahn,
	Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
	Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier,
	Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
	Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Thu, Nov 18, 2021 at 10:30:32AM -0800, Casey Schaufler wrote:
> On 11/18/2021 9:32 AM, Kees Cook wrote:
> > On Tue, Nov 16, 2021 at 11:00:23AM -0800, Casey Schaufler wrote:
> > > On 11/16/2021 10:41 AM, Kees Cook wrote:
> > > > On Tue, Nov 16, 2021 at 12:12:16PM +0300, Alexander Popov wrote:
> > > > > What if the Linux kernel had a LSM module responsible for error handling policy?
> > > > > That would require adding LSM hooks to BUG*(), WARN*(), KERN_EMERG, etc.
> > > > > In such LSM policy we can decide immediately how to react on the kernel error.
> > > > > We can even decide depending on the subsystem and things like that.
> > > > That would solve the "atomicity" issue the WARN tracepoint solution has,
> > > > and it would allow for very flexible userspace policy.
> > > > 
> > > > I actually wonder if the existing panic_on_* sites should serve as a
> > > > guide for where to put the hooks. The current sysctls could be replaced
> > > > by the hooks and a simple LSM.
> > > Do you really want to make error handling a "security" issue?
> > > If you add security_bug(), security_warn_on() and the like
> > > you're begging that they be included in SELinux (AppArmor) policy.
> > > BPF, too, come to think of it. Is that what you want?
> > Yeah, that is what I was thinking. This would give the LSM a view into
> > kernel state, which seems a reasonable thing to do. If system integrity
> > is compromised, an LSM may want to stop trusting things.
> 
> How are you planning to communicate the security relevance of the
> warning to the LSM? I don't think that __FILE__, __LINE__ or __func__
> is great information to base security policy on. Nor is a backtrace.

I think that would be part of the design proposal. Initially, the known
parts are "warn or bug" and "pid".

> > A dedicated error-handling LSM could be added for those hooks that
> > implemented the existing default panic_on_* sysctls, and could expand on
> > that logic for other actions.
> 
> I can see having an interface like LSM for choosing a bug/warn policy.
> I worry about expanding the LSM hook list for a case where I would
> hope no existing LSM would use them, and the new LSM doesn't use any
> of the existing hooks.

Yeah, I can see that, though we've got a history of the "specialized"
hooks getting used by other LSMs. (e.g. loadpin's stuff got hooked up to
other LSMs, etc.)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-15 22:06           ` Kees Cook
  2021-11-16  9:12             ` Alexander Popov
  2021-11-16 13:07             ` James Bottomley
@ 2021-11-20 12:17             ` Marco Elver
  2021-11-22 16:21               ` Steven Rostedt
  2 siblings, 1 reply; 31+ messages in thread
From: Marco Elver @ 2021-11-20 12:17 UTC (permalink / raw)
  To: Kees Cook
  Cc: Steven Rostedt, Lukas Bulwahn, Alexander Popov, Linus Torvalds,
	Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
	Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier,
	Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
	Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Mon, Nov 15, 2021 at 02:06PM -0800, Kees Cook wrote:
[...]
> However, that's a lot to implement when Marco's tracing suggestion might
> be sufficient and policy could be entirely implemented in userspace. It
> could be as simple as this (totally untested):
[...]
> 
> Marco, is this the full version of monitoring this from the userspace
> side?

Sorry I completely missed this email (I somehow wasn't Cc'd... I just
saw it by chance re-reading this thread).

I've sent a patch to add WARN:

	https://lkml.kernel.org/r/20211115085630.1756817-1-elver@google.com

Not sure how useful BUG is, but I have no objection to it also being
traced if you think it's useful.

(I added it to kernel/panic.c, because lib/bug.c requires
CONFIG_GENERIC_BUG.)

> 	perf record -e error_report:error_report_end

I think userspace would want something other than perf tool to handle it
of course.  There are several options:

	1. Open trace pipe to be notified (/sys/kernel/tracing/trace_pipe).
	   This already includes the pid.

	2. As you suggest, use perf events globally (but the handling
	   would be done by some system process).

	3. As of 5.13 there's actually a new perf feature to
	   synchronously SIGTRAP the exact task where an event occurred
	   (see perf_event_attr::sigtrap). This would very closely mimic
	   pkill_on_warn (because the SIGTRAP is synchronous), but lets the
	   process being SIGTRAP'd decide what to do. Not sure how to
	   deploy this though, because a) only root user can create this
	   perf event (because exclude_kernel=0), and b) sigtrap perf
	   events deliberately won't propagate beyond an exec
	   (must remove_on_exec=1 if sigtrap=1) because who knows if
	   the exec'd process has the right SIGTRAP handler.

I think #3 is hard to deploy right, but below is an example program I
played with.

Thanks,
-- Marco

------ >8 ------

#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <linux/perf_event.h>
#include <signal.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

static void sigtrap_handler(int signum, siginfo_t *info, void *ucontext)
{
	// FIXME: check event is error_report_end
	printf("Kernel error in this task!\n");
}
static void generate_warning(void)
{
	... do something to generate a warning ...
}
int main()
{
	struct perf_event_attr attr = {
		.type		= PERF_TYPE_TRACEPOINT,
		.size		= sizeof(attr),
		.config		= 189, // FIXME: error_report_end
		.sample_period	= 1,
		.inherit	= 1, /* Children inherit events ... */
		.remove_on_exec = 1, /* Required by sigtrap. */
		.sigtrap	= 1, /* Request synchronous SIGTRAP on event. */
		.sig_data	= 189, /* FIXME: use to identify error_report_end */
	};
	struct sigaction action = {};
	struct sigaction oldact;
	int fd;
	action.sa_flags = SA_SIGINFO | SA_NODEFER;
	action.sa_sigaction = sigtrap_handler;
	sigemptyset(&action.sa_mask);
	assert(sigaction(SIGTRAP, &action, &oldact) == 0);
	fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, PERF_FLAG_FD_CLOEXEC);
	assert(fd != -1);
	sleep(5); /* Try to generate a warning from elsewhere, nothing will be printed. */
	generate_warning(); /* Warning from this process. */
	sigaction(SIGTRAP, &oldact, NULL);
	close(fd);
	return 0;
}

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter
  2021-11-20 12:17             ` Marco Elver
@ 2021-11-22 16:21               ` Steven Rostedt
  0 siblings, 0 replies; 31+ messages in thread
From: Steven Rostedt @ 2021-11-22 16:21 UTC (permalink / raw)
  To: Marco Elver
  Cc: Kees Cook, Lukas Bulwahn, Alexander Popov, Linus Torvalds,
	Jonathan Corbet, Paul McKenney, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Joerg Roedel, Maciej Rozycki, Muchun Song,
	Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu, Petr Mladek,
	Luis Chamberlain, Wei Liu, John Ogness, Andy Shevchenko,
	Alexey Kardashevskiy, Christophe Leroy, Jann Horn,
	Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski, Dave Hansen,
	Will Deacon, Ard Biesheuvel, Laura Abbott, David S Miller,
	Borislav Petkov, Arnd Bergmann, Andrew Scull, Marc Zyngier,
	Jessica Yu, Iurii Zaikin, Rasmus Villemoes, Wang Qing,
	Mel Gorman, Mauro Carvalho Chehab, Andrew Klychkov,
	Mathieu Chouquet-Stringer, Daniel Borkmann, Stephen Kitt,
	Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Kernel Hardening, linux-hardening,
	open list:DOCUMENTATION, linux-arch, Linux Kernel Mailing List,
	linux-fsdevel, notify, main, safety-architecture, devel,
	Shuah Khan

On Sat, 20 Nov 2021 13:17:08 +0100
Marco Elver <elver@google.com> wrote:


> I think userspace would want something other than perf tool to handle it
> of course.  There are several options:
> 
> 	1. Open trace pipe to be notified (/sys/kernel/tracing/trace_pipe).
> 	   This already includes the pid.

I would suggest using /sys/kernel/tracing/per_cpu/cpu*/trace_pipe_raw

and use libtracefs[1] to read it.

-- Steve

[1] https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2021-11-22 16:21 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-27 23:32 [PATCH v2 0/2] Introduce the pkill_on_warn parameter Alexander Popov
2021-10-27 23:32 ` [PATCH v2 1/2] bug: do refactoring allowing to add a warning handling action Alexander Popov
2021-10-27 23:32 ` [PATCH v2 2/2] sysctl: introduce kernel.pkill_on_warn Alexander Popov
2021-11-12 21:24   ` Steven Rostedt
2021-11-12 18:52 ` [PATCH v2 0/2] Introduce the pkill_on_warn parameter Alexander Popov
2021-11-12 21:26   ` Linus Torvalds
2021-11-13 18:14     ` Alexander Popov
2021-11-13 19:58       ` Linus Torvalds
2021-11-14 14:21         ` Marco Elver
2021-11-15 13:59       ` Lukas Bulwahn
2021-11-15 15:51         ` [ELISA Safety Architecture WG] " Gabriele Paoloni
2021-11-16  7:52           ` Alexander Popov
2021-11-16  8:01             ` Lukas Bulwahn
2021-11-16  8:41             ` Petr Mladek
2021-11-16  9:19               ` Lukas Bulwahn
2021-11-16 13:20               ` James Bottomley
2021-11-15 16:06         ` Steven Rostedt
2021-11-15 22:06           ` Kees Cook
2021-11-16  9:12             ` Alexander Popov
2021-11-16 18:41               ` Kees Cook
2021-11-16 19:00                 ` Casey Schaufler
2021-11-18 17:32                   ` Kees Cook
2021-11-18 18:30                     ` Casey Schaufler
2021-11-18 20:29                       ` Kees Cook
2021-11-16 13:07             ` James Bottomley
2021-11-20 12:17             ` Marco Elver
2021-11-22 16:21               ` Steven Rostedt
2021-11-16  6:37           ` Christophe Leroy
2021-11-16  8:34             ` Alexander Popov
2021-11-16  8:57             ` Lukas Bulwahn
2021-11-15  8:12 ` Kaiwan N Billimoria

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).