LKML Archive on lore.kernel.org help / color / mirror / Atom feed
* [PATCH v3] kcsan: Add option for verbose reporting @ 2020-02-21 23:10 Marco Elver 2020-02-22 1:36 ` Paul E. McKenney 2020-04-06 12:59 ` Qian Cai 0 siblings, 2 replies; 10+ messages in thread From: Marco Elver @ 2020-02-21 23:10 UTC (permalink / raw) To: elver Cc: paulmck, andreyknvl, glider, dvyukov, kasan-dev, linux-kernel, Qian Cai Adds CONFIG_KCSAN_VERBOSE to optionally enable more verbose reports. Currently information about the reporting task's held locks and IRQ trace events are shown, if they are enabled. Signed-off-by: Marco Elver <elver@google.com> Suggested-by: Qian Cai <cai@lca.pw> --- v3: * Typos v2: * Rework obtaining 'current' for the "other thread" -- it now passes 'current' and ensures that we stall until the report was printed, so that the lockdep information contained in 'current' is accurate. This was non-trivial but testing so far leads me to conclude this now reliably prints the held locks for the "other thread" (please test more!). --- kernel/kcsan/core.c | 4 +- kernel/kcsan/kcsan.h | 3 ++ kernel/kcsan/report.c | 103 +++++++++++++++++++++++++++++++++++++++++- lib/Kconfig.kcsan | 13 ++++++ 4 files changed, 120 insertions(+), 3 deletions(-) diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c index e7387fec66795..065615df88eaa 100644 --- a/kernel/kcsan/core.c +++ b/kernel/kcsan/core.c @@ -18,8 +18,8 @@ #include "kcsan.h" static bool kcsan_early_enable = IS_ENABLED(CONFIG_KCSAN_EARLY_ENABLE); -static unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; -static unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; +unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; +unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; static long kcsan_skip_watch = CONFIG_KCSAN_SKIP_WATCH; static bool kcsan_interrupt_watcher = IS_ENABLED(CONFIG_KCSAN_INTERRUPT_WATCHER); diff --git a/kernel/kcsan/kcsan.h b/kernel/kcsan/kcsan.h index 892de5120c1b6..e282f8b5749e9 100644 --- a/kernel/kcsan/kcsan.h +++ b/kernel/kcsan/kcsan.h @@ -13,6 +13,9 @@ /* The number of adjacent watchpoints to check. */ #define KCSAN_CHECK_ADJACENT 1 +extern unsigned int kcsan_udelay_task; +extern unsigned int kcsan_udelay_interrupt; + /* * Globally enable and disable KCSAN. */ diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c index 11c791b886f3c..7bdb515e3662f 100644 --- a/kernel/kcsan/report.c +++ b/kernel/kcsan/report.c @@ -1,5 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 +#include <linux/debug_locks.h> +#include <linux/delay.h> #include <linux/jiffies.h> #include <linux/kernel.h> #include <linux/lockdep.h> @@ -31,7 +33,26 @@ static struct { int cpu_id; unsigned long stack_entries[NUM_STACK_ENTRIES]; int num_stack_entries; -} other_info = { .ptr = NULL }; + + /* + * Optionally pass @current. Typically we do not need to pass @current + * via @other_info since just @task_pid is sufficient. Passing @current + * has additional overhead. + * + * To safely pass @current, we must either use get_task_struct/ + * put_task_struct, or stall the thread that populated @other_info. + * + * We cannot rely on get_task_struct/put_task_struct in case + * release_report() races with a task being released, and would have to + * free it in release_report(). This may result in deadlock if we want + * to use KCSAN on the allocators. + * + * Since we also want to reliably print held locks for + * CONFIG_KCSAN_VERBOSE, the current implementation stalls the thread + * that populated @other_info until it has been consumed. + */ + struct task_struct *task; +} other_info; /* * Information about reported races; used to rate limit reporting. @@ -245,6 +266,16 @@ static int sym_strcmp(void *addr1, void *addr2) return strncmp(buf1, buf2, sizeof(buf1)); } +static void print_verbose_info(struct task_struct *task) +{ + if (!task) + return; + + pr_err("\n"); + debug_show_held_locks(task); + print_irqtrace_events(task); +} + /* * Returns true if a report was generated, false otherwise. */ @@ -319,6 +350,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, other_info.num_stack_entries - other_skipnr, 0); + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) + print_verbose_info(other_info.task); + pr_err("\n"); pr_err("%s to 0x%px of %zu bytes by %s on cpu %i:\n", get_access_type(access_type), ptr, size, @@ -340,6 +374,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, stack_trace_print(stack_entries + skipnr, num_stack_entries - skipnr, 0); + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) + print_verbose_info(current); + /* Print report footer. */ pr_err("\n"); pr_err("Reported by Kernel Concurrency Sanitizer on:\n"); @@ -357,6 +394,67 @@ static void release_report(unsigned long *flags, enum kcsan_report_type type) spin_unlock_irqrestore(&report_lock, *flags); } +/* + * Sets @other_info.task and awaits consumption of @other_info. + * + * Precondition: report_lock is held. + * Postcondition: report_lock is held. + */ +static void +set_other_info_task_blocking(unsigned long *flags, const volatile void *ptr) +{ + /* + * We may be instrumenting a code-path where current->state is already + * something other than TASK_RUNNING. + */ + const bool is_running = current->state == TASK_RUNNING; + /* + * To avoid deadlock in case we are in an interrupt here and this is a + * race with a task on the same CPU (KCSAN_INTERRUPT_WATCHER), provide a + * timeout to ensure this works in all contexts. + * + * Await approximately the worst case delay of the reporting thread (if + * we are not interrupted). + */ + int timeout = max(kcsan_udelay_task, kcsan_udelay_interrupt); + + other_info.task = current; + do { + if (is_running) { + /* + * Let lockdep know the real task is sleeping, to print + * the held locks (recall we turned lockdep off, so + * locking/unlocking @report_lock won't be recorded). + */ + set_current_state(TASK_UNINTERRUPTIBLE); + } + spin_unlock_irqrestore(&report_lock, *flags); + /* + * We cannot call schedule() since we also cannot reliably + * determine if sleeping here is permitted -- see in_atomic(). + */ + + udelay(1); + spin_lock_irqsave(&report_lock, *flags); + if (timeout-- < 0) { + /* + * Abort. Reset other_info.task to NULL, since it + * appears the other thread is still going to consume + * it. It will result in no verbose info printed for + * this task. + */ + other_info.task = NULL; + break; + } + /* + * If @ptr nor @current matches, then our information has been + * consumed and we may continue. If not, retry. + */ + } while (other_info.ptr == ptr && other_info.task == current); + if (is_running) + set_current_state(TASK_RUNNING); +} + /* * Depending on the report type either sets other_info and returns false, or * acquires the matching other_info and returns true. If other_info is not @@ -388,6 +486,9 @@ static bool prepare_report(unsigned long *flags, const volatile void *ptr, other_info.cpu_id = cpu_id; other_info.num_stack_entries = stack_trace_save(other_info.stack_entries, NUM_STACK_ENTRIES, 1); + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) + set_other_info_task_blocking(flags, ptr); + spin_unlock_irqrestore(&report_lock, *flags); /* diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan index 081ed2e1bf7b1..0f1447ff8f558 100644 --- a/lib/Kconfig.kcsan +++ b/lib/Kconfig.kcsan @@ -20,6 +20,19 @@ menuconfig KCSAN if KCSAN +config KCSAN_VERBOSE + bool "Show verbose reports with more information about system state" + depends on PROVE_LOCKING + help + If enabled, reports show more information about the system state that + may help better analyze and debug races. This includes held locks and + IRQ trace events. + + While this option should generally be benign, we call into more + external functions on report generation; if a race report is + generated from any one of them, system stability may suffer due to + deadlocks or recursion. If in doubt, say N. + config KCSAN_DEBUG bool "Debugging of KCSAN internals" -- 2.25.0.265.gbab2e86ba0-goog ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3] kcsan: Add option for verbose reporting 2020-02-21 23:10 [PATCH v3] kcsan: Add option for verbose reporting Marco Elver @ 2020-02-22 1:36 ` Paul E. McKenney 2020-02-24 19:16 ` Marco Elver 2020-04-06 12:59 ` Qian Cai 1 sibling, 1 reply; 10+ messages in thread From: Paul E. McKenney @ 2020-02-22 1:36 UTC (permalink / raw) To: Marco Elver Cc: andreyknvl, glider, dvyukov, kasan-dev, linux-kernel, Qian Cai On Sat, Feb 22, 2020 at 12:10:27AM +0100, Marco Elver wrote: > Adds CONFIG_KCSAN_VERBOSE to optionally enable more verbose reports. > Currently information about the reporting task's held locks and IRQ > trace events are shown, if they are enabled. > > Signed-off-by: Marco Elver <elver@google.com> > Suggested-by: Qian Cai <cai@lca.pw> Applied in place of v1, thank you! Please check -rcu's "dev" branch to make sure that I have correct ordering and versions. Thanx, Paul > --- > v3: > * Typos > v2: > * Rework obtaining 'current' for the "other thread" -- it now passes > 'current' and ensures that we stall until the report was printed, so > that the lockdep information contained in 'current' is accurate. This > was non-trivial but testing so far leads me to conclude this now > reliably prints the held locks for the "other thread" (please test > more!). > --- > kernel/kcsan/core.c | 4 +- > kernel/kcsan/kcsan.h | 3 ++ > kernel/kcsan/report.c | 103 +++++++++++++++++++++++++++++++++++++++++- > lib/Kconfig.kcsan | 13 ++++++ > 4 files changed, 120 insertions(+), 3 deletions(-) > > diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c > index e7387fec66795..065615df88eaa 100644 > --- a/kernel/kcsan/core.c > +++ b/kernel/kcsan/core.c > @@ -18,8 +18,8 @@ > #include "kcsan.h" > > static bool kcsan_early_enable = IS_ENABLED(CONFIG_KCSAN_EARLY_ENABLE); > -static unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; > -static unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; > +unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; > +unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; > static long kcsan_skip_watch = CONFIG_KCSAN_SKIP_WATCH; > static bool kcsan_interrupt_watcher = IS_ENABLED(CONFIG_KCSAN_INTERRUPT_WATCHER); > > diff --git a/kernel/kcsan/kcsan.h b/kernel/kcsan/kcsan.h > index 892de5120c1b6..e282f8b5749e9 100644 > --- a/kernel/kcsan/kcsan.h > +++ b/kernel/kcsan/kcsan.h > @@ -13,6 +13,9 @@ > /* The number of adjacent watchpoints to check. */ > #define KCSAN_CHECK_ADJACENT 1 > > +extern unsigned int kcsan_udelay_task; > +extern unsigned int kcsan_udelay_interrupt; > + > /* > * Globally enable and disable KCSAN. > */ > diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c > index 11c791b886f3c..7bdb515e3662f 100644 > --- a/kernel/kcsan/report.c > +++ b/kernel/kcsan/report.c > @@ -1,5 +1,7 @@ > // SPDX-License-Identifier: GPL-2.0 > > +#include <linux/debug_locks.h> > +#include <linux/delay.h> > #include <linux/jiffies.h> > #include <linux/kernel.h> > #include <linux/lockdep.h> > @@ -31,7 +33,26 @@ static struct { > int cpu_id; > unsigned long stack_entries[NUM_STACK_ENTRIES]; > int num_stack_entries; > -} other_info = { .ptr = NULL }; > + > + /* > + * Optionally pass @current. Typically we do not need to pass @current > + * via @other_info since just @task_pid is sufficient. Passing @current > + * has additional overhead. > + * > + * To safely pass @current, we must either use get_task_struct/ > + * put_task_struct, or stall the thread that populated @other_info. > + * > + * We cannot rely on get_task_struct/put_task_struct in case > + * release_report() races with a task being released, and would have to > + * free it in release_report(). This may result in deadlock if we want > + * to use KCSAN on the allocators. > + * > + * Since we also want to reliably print held locks for > + * CONFIG_KCSAN_VERBOSE, the current implementation stalls the thread > + * that populated @other_info until it has been consumed. > + */ > + struct task_struct *task; > +} other_info; > > /* > * Information about reported races; used to rate limit reporting. > @@ -245,6 +266,16 @@ static int sym_strcmp(void *addr1, void *addr2) > return strncmp(buf1, buf2, sizeof(buf1)); > } > > +static void print_verbose_info(struct task_struct *task) > +{ > + if (!task) > + return; > + > + pr_err("\n"); > + debug_show_held_locks(task); > + print_irqtrace_events(task); > +} > + > /* > * Returns true if a report was generated, false otherwise. > */ > @@ -319,6 +350,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, > other_info.num_stack_entries - other_skipnr, > 0); > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > + print_verbose_info(other_info.task); > + > pr_err("\n"); > pr_err("%s to 0x%px of %zu bytes by %s on cpu %i:\n", > get_access_type(access_type), ptr, size, > @@ -340,6 +374,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, > stack_trace_print(stack_entries + skipnr, num_stack_entries - skipnr, > 0); > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > + print_verbose_info(current); > + > /* Print report footer. */ > pr_err("\n"); > pr_err("Reported by Kernel Concurrency Sanitizer on:\n"); > @@ -357,6 +394,67 @@ static void release_report(unsigned long *flags, enum kcsan_report_type type) > spin_unlock_irqrestore(&report_lock, *flags); > } > > +/* > + * Sets @other_info.task and awaits consumption of @other_info. > + * > + * Precondition: report_lock is held. > + * Postcondition: report_lock is held. > + */ > +static void > +set_other_info_task_blocking(unsigned long *flags, const volatile void *ptr) > +{ > + /* > + * We may be instrumenting a code-path where current->state is already > + * something other than TASK_RUNNING. > + */ > + const bool is_running = current->state == TASK_RUNNING; > + /* > + * To avoid deadlock in case we are in an interrupt here and this is a > + * race with a task on the same CPU (KCSAN_INTERRUPT_WATCHER), provide a > + * timeout to ensure this works in all contexts. > + * > + * Await approximately the worst case delay of the reporting thread (if > + * we are not interrupted). > + */ > + int timeout = max(kcsan_udelay_task, kcsan_udelay_interrupt); > + > + other_info.task = current; > + do { > + if (is_running) { > + /* > + * Let lockdep know the real task is sleeping, to print > + * the held locks (recall we turned lockdep off, so > + * locking/unlocking @report_lock won't be recorded). > + */ > + set_current_state(TASK_UNINTERRUPTIBLE); > + } > + spin_unlock_irqrestore(&report_lock, *flags); > + /* > + * We cannot call schedule() since we also cannot reliably > + * determine if sleeping here is permitted -- see in_atomic(). > + */ > + > + udelay(1); > + spin_lock_irqsave(&report_lock, *flags); > + if (timeout-- < 0) { > + /* > + * Abort. Reset other_info.task to NULL, since it > + * appears the other thread is still going to consume > + * it. It will result in no verbose info printed for > + * this task. > + */ > + other_info.task = NULL; > + break; > + } > + /* > + * If @ptr nor @current matches, then our information has been > + * consumed and we may continue. If not, retry. > + */ > + } while (other_info.ptr == ptr && other_info.task == current); > + if (is_running) > + set_current_state(TASK_RUNNING); > +} > + > /* > * Depending on the report type either sets other_info and returns false, or > * acquires the matching other_info and returns true. If other_info is not > @@ -388,6 +486,9 @@ static bool prepare_report(unsigned long *flags, const volatile void *ptr, > other_info.cpu_id = cpu_id; > other_info.num_stack_entries = stack_trace_save(other_info.stack_entries, NUM_STACK_ENTRIES, 1); > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > + set_other_info_task_blocking(flags, ptr); > + > spin_unlock_irqrestore(&report_lock, *flags); > > /* > diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan > index 081ed2e1bf7b1..0f1447ff8f558 100644 > --- a/lib/Kconfig.kcsan > +++ b/lib/Kconfig.kcsan > @@ -20,6 +20,19 @@ menuconfig KCSAN > > if KCSAN > > +config KCSAN_VERBOSE > + bool "Show verbose reports with more information about system state" > + depends on PROVE_LOCKING > + help > + If enabled, reports show more information about the system state that > + may help better analyze and debug races. This includes held locks and > + IRQ trace events. > + > + While this option should generally be benign, we call into more > + external functions on report generation; if a race report is > + generated from any one of them, system stability may suffer due to > + deadlocks or recursion. If in doubt, say N. > + > config KCSAN_DEBUG > bool "Debugging of KCSAN internals" > > -- > 2.25.0.265.gbab2e86ba0-goog > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3] kcsan: Add option for verbose reporting 2020-02-22 1:36 ` Paul E. McKenney @ 2020-02-24 19:16 ` Marco Elver 2020-03-01 18:58 ` Paul E. McKenney 0 siblings, 1 reply; 10+ messages in thread From: Marco Elver @ 2020-02-24 19:16 UTC (permalink / raw) To: Paul E. McKenney Cc: Andrey Konovalov, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Qian Cai On Sat, 22 Feb 2020 at 02:36, Paul E. McKenney <paulmck@kernel.org> wrote: > > On Sat, Feb 22, 2020 at 12:10:27AM +0100, Marco Elver wrote: > > Adds CONFIG_KCSAN_VERBOSE to optionally enable more verbose reports. > > Currently information about the reporting task's held locks and IRQ > > trace events are shown, if they are enabled. > > > > Signed-off-by: Marco Elver <elver@google.com> > > Suggested-by: Qian Cai <cai@lca.pw> > > Applied in place of v1, thank you! Please check -rcu's "dev" branch > to make sure that I have correct ordering and versions. (Missed this.) Checked, and all looks good. Thank you! I hope the new version of this patch now does what you'd expect. Thanks, -- Marco > > > --- > > v3: > > * Typos > > v2: > > * Rework obtaining 'current' for the "other thread" -- it now passes > > 'current' and ensures that we stall until the report was printed, so > > that the lockdep information contained in 'current' is accurate. This > > was non-trivial but testing so far leads me to conclude this now > > reliably prints the held locks for the "other thread" (please test > > more!). > > --- > > kernel/kcsan/core.c | 4 +- > > kernel/kcsan/kcsan.h | 3 ++ > > kernel/kcsan/report.c | 103 +++++++++++++++++++++++++++++++++++++++++- > > lib/Kconfig.kcsan | 13 ++++++ > > 4 files changed, 120 insertions(+), 3 deletions(-) > > > > diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c > > index e7387fec66795..065615df88eaa 100644 > > --- a/kernel/kcsan/core.c > > +++ b/kernel/kcsan/core.c > > @@ -18,8 +18,8 @@ > > #include "kcsan.h" > > > > static bool kcsan_early_enable = IS_ENABLED(CONFIG_KCSAN_EARLY_ENABLE); > > -static unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; > > -static unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; > > +unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; > > +unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; > > static long kcsan_skip_watch = CONFIG_KCSAN_SKIP_WATCH; > > static bool kcsan_interrupt_watcher = IS_ENABLED(CONFIG_KCSAN_INTERRUPT_WATCHER); > > > > diff --git a/kernel/kcsan/kcsan.h b/kernel/kcsan/kcsan.h > > index 892de5120c1b6..e282f8b5749e9 100644 > > --- a/kernel/kcsan/kcsan.h > > +++ b/kernel/kcsan/kcsan.h > > @@ -13,6 +13,9 @@ > > /* The number of adjacent watchpoints to check. */ > > #define KCSAN_CHECK_ADJACENT 1 > > > > +extern unsigned int kcsan_udelay_task; > > +extern unsigned int kcsan_udelay_interrupt; > > + > > /* > > * Globally enable and disable KCSAN. > > */ > > diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c > > index 11c791b886f3c..7bdb515e3662f 100644 > > --- a/kernel/kcsan/report.c > > +++ b/kernel/kcsan/report.c > > @@ -1,5 +1,7 @@ > > // SPDX-License-Identifier: GPL-2.0 > > > > +#include <linux/debug_locks.h> > > +#include <linux/delay.h> > > #include <linux/jiffies.h> > > #include <linux/kernel.h> > > #include <linux/lockdep.h> > > @@ -31,7 +33,26 @@ static struct { > > int cpu_id; > > unsigned long stack_entries[NUM_STACK_ENTRIES]; > > int num_stack_entries; > > -} other_info = { .ptr = NULL }; > > + > > + /* > > + * Optionally pass @current. Typically we do not need to pass @current > > + * via @other_info since just @task_pid is sufficient. Passing @current > > + * has additional overhead. > > + * > > + * To safely pass @current, we must either use get_task_struct/ > > + * put_task_struct, or stall the thread that populated @other_info. > > + * > > + * We cannot rely on get_task_struct/put_task_struct in case > > + * release_report() races with a task being released, and would have to > > + * free it in release_report(). This may result in deadlock if we want > > + * to use KCSAN on the allocators. > > + * > > + * Since we also want to reliably print held locks for > > + * CONFIG_KCSAN_VERBOSE, the current implementation stalls the thread > > + * that populated @other_info until it has been consumed. > > + */ > > + struct task_struct *task; > > +} other_info; > > > > /* > > * Information about reported races; used to rate limit reporting. > > @@ -245,6 +266,16 @@ static int sym_strcmp(void *addr1, void *addr2) > > return strncmp(buf1, buf2, sizeof(buf1)); > > } > > > > +static void print_verbose_info(struct task_struct *task) > > +{ > > + if (!task) > > + return; > > + > > + pr_err("\n"); > > + debug_show_held_locks(task); > > + print_irqtrace_events(task); > > +} > > + > > /* > > * Returns true if a report was generated, false otherwise. > > */ > > @@ -319,6 +350,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, > > other_info.num_stack_entries - other_skipnr, > > 0); > > > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > > + print_verbose_info(other_info.task); > > + > > pr_err("\n"); > > pr_err("%s to 0x%px of %zu bytes by %s on cpu %i:\n", > > get_access_type(access_type), ptr, size, > > @@ -340,6 +374,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, > > stack_trace_print(stack_entries + skipnr, num_stack_entries - skipnr, > > 0); > > > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > > + print_verbose_info(current); > > + > > /* Print report footer. */ > > pr_err("\n"); > > pr_err("Reported by Kernel Concurrency Sanitizer on:\n"); > > @@ -357,6 +394,67 @@ static void release_report(unsigned long *flags, enum kcsan_report_type type) > > spin_unlock_irqrestore(&report_lock, *flags); > > } > > > > +/* > > + * Sets @other_info.task and awaits consumption of @other_info. > > + * > > + * Precondition: report_lock is held. > > + * Postcondition: report_lock is held. > > + */ > > +static void > > +set_other_info_task_blocking(unsigned long *flags, const volatile void *ptr) > > +{ > > + /* > > + * We may be instrumenting a code-path where current->state is already > > + * something other than TASK_RUNNING. > > + */ > > + const bool is_running = current->state == TASK_RUNNING; > > + /* > > + * To avoid deadlock in case we are in an interrupt here and this is a > > + * race with a task on the same CPU (KCSAN_INTERRUPT_WATCHER), provide a > > + * timeout to ensure this works in all contexts. > > + * > > + * Await approximately the worst case delay of the reporting thread (if > > + * we are not interrupted). > > + */ > > + int timeout = max(kcsan_udelay_task, kcsan_udelay_interrupt); > > + > > + other_info.task = current; > > + do { > > + if (is_running) { > > + /* > > + * Let lockdep know the real task is sleeping, to print > > + * the held locks (recall we turned lockdep off, so > > + * locking/unlocking @report_lock won't be recorded). > > + */ > > + set_current_state(TASK_UNINTERRUPTIBLE); > > + } > > + spin_unlock_irqrestore(&report_lock, *flags); > > + /* > > + * We cannot call schedule() since we also cannot reliably > > + * determine if sleeping here is permitted -- see in_atomic(). > > + */ > > + > > + udelay(1); > > + spin_lock_irqsave(&report_lock, *flags); > > + if (timeout-- < 0) { > > + /* > > + * Abort. Reset other_info.task to NULL, since it > > + * appears the other thread is still going to consume > > + * it. It will result in no verbose info printed for > > + * this task. > > + */ > > + other_info.task = NULL; > > + break; > > + } > > + /* > > + * If @ptr nor @current matches, then our information has been > > + * consumed and we may continue. If not, retry. > > + */ > > + } while (other_info.ptr == ptr && other_info.task == current); > > + if (is_running) > > + set_current_state(TASK_RUNNING); > > +} > > + > > /* > > * Depending on the report type either sets other_info and returns false, or > > * acquires the matching other_info and returns true. If other_info is not > > @@ -388,6 +486,9 @@ static bool prepare_report(unsigned long *flags, const volatile void *ptr, > > other_info.cpu_id = cpu_id; > > other_info.num_stack_entries = stack_trace_save(other_info.stack_entries, NUM_STACK_ENTRIES, 1); > > > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > > + set_other_info_task_blocking(flags, ptr); > > + > > spin_unlock_irqrestore(&report_lock, *flags); > > > > /* > > diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan > > index 081ed2e1bf7b1..0f1447ff8f558 100644 > > --- a/lib/Kconfig.kcsan > > +++ b/lib/Kconfig.kcsan > > @@ -20,6 +20,19 @@ menuconfig KCSAN > > > > if KCSAN > > > > +config KCSAN_VERBOSE > > + bool "Show verbose reports with more information about system state" > > + depends on PROVE_LOCKING > > + help > > + If enabled, reports show more information about the system state that > > + may help better analyze and debug races. This includes held locks and > > + IRQ trace events. > > + > > + While this option should generally be benign, we call into more > > + external functions on report generation; if a race report is > > + generated from any one of them, system stability may suffer due to > > + deadlocks or recursion. If in doubt, say N. > > + > > config KCSAN_DEBUG > > bool "Debugging of KCSAN internals" > > > > -- > > 2.25.0.265.gbab2e86ba0-goog > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3] kcsan: Add option for verbose reporting 2020-02-24 19:16 ` Marco Elver @ 2020-03-01 18:58 ` Paul E. McKenney 0 siblings, 0 replies; 10+ messages in thread From: Paul E. McKenney @ 2020-03-01 18:58 UTC (permalink / raw) To: Marco Elver Cc: Andrey Konovalov, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Qian Cai On Mon, Feb 24, 2020 at 08:16:47PM +0100, Marco Elver wrote: > On Sat, 22 Feb 2020 at 02:36, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Sat, Feb 22, 2020 at 12:10:27AM +0100, Marco Elver wrote: > > > Adds CONFIG_KCSAN_VERBOSE to optionally enable more verbose reports. > > > Currently information about the reporting task's held locks and IRQ > > > trace events are shown, if they are enabled. > > > > > > Signed-off-by: Marco Elver <elver@google.com> > > > Suggested-by: Qian Cai <cai@lca.pw> > > > > Applied in place of v1, thank you! Please check -rcu's "dev" branch > > to make sure that I have correct ordering and versions. > > (Missed this.) Checked, and all looks good. Thank you! > > I hope the new version of this patch now does what you'd expect. Indeed it does! Please see below for one example from an rcutorture run. Thanx, Paul ------------------------------------------------------------------------ [ 3.162466] ================================================================== [ 3.162989] BUG: KCSAN: data-race in mutex_spin_on_owner+0xc6/0x2b0 [ 3.162989] [ 3.162989] race at unknown origin, with read to 0xffff8e91dde0d838 of 4 bytes by task 156 on cpu 3: [ 3.162989] mutex_spin_on_owner+0xc6/0x2b0 [ 3.162989] __mutex_lock+0x252/0xc70 [ 3.162989] mutex_lock_nested+0x27/0x30 [ 3.162989] ata_eh_acquire+0x32/0x80 [ 3.162989] ata_msleep+0x72/0xa0 [ 3.162989] sata_link_debounce+0xed/0x1e0 [ 3.162989] sata_link_resume+0x146/0x1b0 [ 3.162989] sata_link_hardreset+0x16c/0x290 [ 3.162989] ahci_do_hardreset+0x19b/0x220 [ 3.162989] ahci_hardreset+0x3e/0x70 [ 3.168849] ata5: SATA link down (SStatus 0 SControl 300) [ 3.162989] ata_do_reset+0x35/0xa0 [ 3.162989] ata_eh_reset+0x77b/0x1300 [ 3.162989] ata_eh_recover+0x433/0x2090 [ 3.162989] sata_pmp_error_handler+0x86a/0xef0 [ 3.162989] ahci_error_handler+0x7c/0xd0 [ 3.162989] ata_scsi_port_error_handler+0x3ef/0xb90 [ 3.162989] ata_scsi_error+0x185/0x1d0 [ 3.162989] scsi_error_handler+0x13f/0x710 [ 3.172310] ata3.00: ATAPI: QEMU DVD-ROM, 2.5+, max UDMA/100 [ 3.162989] kthread+0x1c3/0x1e0 [ 3.162989] ret_from_fork+0x3a/0x50 [ 3.162989] [ 3.162989] 2 locks held by scsi_eh_3/156: [ 3.162989] #0: ffff8e91ddef90d0 (&host->eh_mutex){+.+.}, at: ata_eh_acquire+0x32/0x80 [ 3.162989] #1: ffffffff86067ba0 (rcu_read_lock){....}, at: mutex_spin_on_owner+0x0/0x2b0 > Thanks, > -- Marco > > > > > > --- > > > v3: > > > * Typos > > > v2: > > > * Rework obtaining 'current' for the "other thread" -- it now passes > > > 'current' and ensures that we stall until the report was printed, so > > > that the lockdep information contained in 'current' is accurate. This > > > was non-trivial but testing so far leads me to conclude this now > > > reliably prints the held locks for the "other thread" (please test > > > more!). > > > --- > > > kernel/kcsan/core.c | 4 +- > > > kernel/kcsan/kcsan.h | 3 ++ > > > kernel/kcsan/report.c | 103 +++++++++++++++++++++++++++++++++++++++++- > > > lib/Kconfig.kcsan | 13 ++++++ > > > 4 files changed, 120 insertions(+), 3 deletions(-) > > > > > > diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c > > > index e7387fec66795..065615df88eaa 100644 > > > --- a/kernel/kcsan/core.c > > > +++ b/kernel/kcsan/core.c > > > @@ -18,8 +18,8 @@ > > > #include "kcsan.h" > > > > > > static bool kcsan_early_enable = IS_ENABLED(CONFIG_KCSAN_EARLY_ENABLE); > > > -static unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; > > > -static unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; > > > +unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; > > > +unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; > > > static long kcsan_skip_watch = CONFIG_KCSAN_SKIP_WATCH; > > > static bool kcsan_interrupt_watcher = IS_ENABLED(CONFIG_KCSAN_INTERRUPT_WATCHER); > > > > > > diff --git a/kernel/kcsan/kcsan.h b/kernel/kcsan/kcsan.h > > > index 892de5120c1b6..e282f8b5749e9 100644 > > > --- a/kernel/kcsan/kcsan.h > > > +++ b/kernel/kcsan/kcsan.h > > > @@ -13,6 +13,9 @@ > > > /* The number of adjacent watchpoints to check. */ > > > #define KCSAN_CHECK_ADJACENT 1 > > > > > > +extern unsigned int kcsan_udelay_task; > > > +extern unsigned int kcsan_udelay_interrupt; > > > + > > > /* > > > * Globally enable and disable KCSAN. > > > */ > > > diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c > > > index 11c791b886f3c..7bdb515e3662f 100644 > > > --- a/kernel/kcsan/report.c > > > +++ b/kernel/kcsan/report.c > > > @@ -1,5 +1,7 @@ > > > // SPDX-License-Identifier: GPL-2.0 > > > > > > +#include <linux/debug_locks.h> > > > +#include <linux/delay.h> > > > #include <linux/jiffies.h> > > > #include <linux/kernel.h> > > > #include <linux/lockdep.h> > > > @@ -31,7 +33,26 @@ static struct { > > > int cpu_id; > > > unsigned long stack_entries[NUM_STACK_ENTRIES]; > > > int num_stack_entries; > > > -} other_info = { .ptr = NULL }; > > > + > > > + /* > > > + * Optionally pass @current. Typically we do not need to pass @current > > > + * via @other_info since just @task_pid is sufficient. Passing @current > > > + * has additional overhead. > > > + * > > > + * To safely pass @current, we must either use get_task_struct/ > > > + * put_task_struct, or stall the thread that populated @other_info. > > > + * > > > + * We cannot rely on get_task_struct/put_task_struct in case > > > + * release_report() races with a task being released, and would have to > > > + * free it in release_report(). This may result in deadlock if we want > > > + * to use KCSAN on the allocators. > > > + * > > > + * Since we also want to reliably print held locks for > > > + * CONFIG_KCSAN_VERBOSE, the current implementation stalls the thread > > > + * that populated @other_info until it has been consumed. > > > + */ > > > + struct task_struct *task; > > > +} other_info; > > > > > > /* > > > * Information about reported races; used to rate limit reporting. > > > @@ -245,6 +266,16 @@ static int sym_strcmp(void *addr1, void *addr2) > > > return strncmp(buf1, buf2, sizeof(buf1)); > > > } > > > > > > +static void print_verbose_info(struct task_struct *task) > > > +{ > > > + if (!task) > > > + return; > > > + > > > + pr_err("\n"); > > > + debug_show_held_locks(task); > > > + print_irqtrace_events(task); > > > +} > > > + > > > /* > > > * Returns true if a report was generated, false otherwise. > > > */ > > > @@ -319,6 +350,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, > > > other_info.num_stack_entries - other_skipnr, > > > 0); > > > > > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > > > + print_verbose_info(other_info.task); > > > + > > > pr_err("\n"); > > > pr_err("%s to 0x%px of %zu bytes by %s on cpu %i:\n", > > > get_access_type(access_type), ptr, size, > > > @@ -340,6 +374,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, > > > stack_trace_print(stack_entries + skipnr, num_stack_entries - skipnr, > > > 0); > > > > > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > > > + print_verbose_info(current); > > > + > > > /* Print report footer. */ > > > pr_err("\n"); > > > pr_err("Reported by Kernel Concurrency Sanitizer on:\n"); > > > @@ -357,6 +394,67 @@ static void release_report(unsigned long *flags, enum kcsan_report_type type) > > > spin_unlock_irqrestore(&report_lock, *flags); > > > } > > > > > > +/* > > > + * Sets @other_info.task and awaits consumption of @other_info. > > > + * > > > + * Precondition: report_lock is held. > > > + * Postcondition: report_lock is held. > > > + */ > > > +static void > > > +set_other_info_task_blocking(unsigned long *flags, const volatile void *ptr) > > > +{ > > > + /* > > > + * We may be instrumenting a code-path where current->state is already > > > + * something other than TASK_RUNNING. > > > + */ > > > + const bool is_running = current->state == TASK_RUNNING; > > > + /* > > > + * To avoid deadlock in case we are in an interrupt here and this is a > > > + * race with a task on the same CPU (KCSAN_INTERRUPT_WATCHER), provide a > > > + * timeout to ensure this works in all contexts. > > > + * > > > + * Await approximately the worst case delay of the reporting thread (if > > > + * we are not interrupted). > > > + */ > > > + int timeout = max(kcsan_udelay_task, kcsan_udelay_interrupt); > > > + > > > + other_info.task = current; > > > + do { > > > + if (is_running) { > > > + /* > > > + * Let lockdep know the real task is sleeping, to print > > > + * the held locks (recall we turned lockdep off, so > > > + * locking/unlocking @report_lock won't be recorded). > > > + */ > > > + set_current_state(TASK_UNINTERRUPTIBLE); > > > + } > > > + spin_unlock_irqrestore(&report_lock, *flags); > > > + /* > > > + * We cannot call schedule() since we also cannot reliably > > > + * determine if sleeping here is permitted -- see in_atomic(). > > > + */ > > > + > > > + udelay(1); > > > + spin_lock_irqsave(&report_lock, *flags); > > > + if (timeout-- < 0) { > > > + /* > > > + * Abort. Reset other_info.task to NULL, since it > > > + * appears the other thread is still going to consume > > > + * it. It will result in no verbose info printed for > > > + * this task. > > > + */ > > > + other_info.task = NULL; > > > + break; > > > + } > > > + /* > > > + * If @ptr nor @current matches, then our information has been > > > + * consumed and we may continue. If not, retry. > > > + */ > > > + } while (other_info.ptr == ptr && other_info.task == current); > > > + if (is_running) > > > + set_current_state(TASK_RUNNING); > > > +} > > > + > > > /* > > > * Depending on the report type either sets other_info and returns false, or > > > * acquires the matching other_info and returns true. If other_info is not > > > @@ -388,6 +486,9 @@ static bool prepare_report(unsigned long *flags, const volatile void *ptr, > > > other_info.cpu_id = cpu_id; > > > other_info.num_stack_entries = stack_trace_save(other_info.stack_entries, NUM_STACK_ENTRIES, 1); > > > > > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > > > + set_other_info_task_blocking(flags, ptr); > > > + > > > spin_unlock_irqrestore(&report_lock, *flags); > > > > > > /* > > > diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan > > > index 081ed2e1bf7b1..0f1447ff8f558 100644 > > > --- a/lib/Kconfig.kcsan > > > +++ b/lib/Kconfig.kcsan > > > @@ -20,6 +20,19 @@ menuconfig KCSAN > > > > > > if KCSAN > > > > > > +config KCSAN_VERBOSE > > > + bool "Show verbose reports with more information about system state" > > > + depends on PROVE_LOCKING > > > + help > > > + If enabled, reports show more information about the system state that > > > + may help better analyze and debug races. This includes held locks and > > > + IRQ trace events. > > > + > > > + While this option should generally be benign, we call into more > > > + external functions on report generation; if a race report is > > > + generated from any one of them, system stability may suffer due to > > > + deadlocks or recursion. If in doubt, say N. > > > + > > > config KCSAN_DEBUG > > > bool "Debugging of KCSAN internals" > > > > > > -- > > > 2.25.0.265.gbab2e86ba0-goog > > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3] kcsan: Add option for verbose reporting 2020-02-21 23:10 [PATCH v3] kcsan: Add option for verbose reporting Marco Elver 2020-02-22 1:36 ` Paul E. McKenney @ 2020-04-06 12:59 ` Qian Cai 2020-04-06 13:35 ` Paul E. McKenney 1 sibling, 1 reply; 10+ messages in thread From: Qian Cai @ 2020-04-06 12:59 UTC (permalink / raw) To: Paul E. McKenney Cc: Andrey Konovalov, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Marco Elver > On Feb 21, 2020, at 6:10 PM, Marco Elver <elver@google.com> wrote: > > Adds CONFIG_KCSAN_VERBOSE to optionally enable more verbose reports. > Currently information about the reporting task's held locks and IRQ > trace events are shown, if they are enabled. This patch is no longer in today’s linux-next. I suppose that it is because Paul had sent the initial pull request without this one that I had missed dearly. Is there a way to get it back there? > > Signed-off-by: Marco Elver <elver@google.com> > Suggested-by: Qian Cai <cai@lca.pw> > --- > v3: > * Typos > v2: > * Rework obtaining 'current' for the "other thread" -- it now passes > 'current' and ensures that we stall until the report was printed, so > that the lockdep information contained in 'current' is accurate. This > was non-trivial but testing so far leads me to conclude this now > reliably prints the held locks for the "other thread" (please test > more!). > --- > kernel/kcsan/core.c | 4 +- > kernel/kcsan/kcsan.h | 3 ++ > kernel/kcsan/report.c | 103 +++++++++++++++++++++++++++++++++++++++++- > lib/Kconfig.kcsan | 13 ++++++ > 4 files changed, 120 insertions(+), 3 deletions(-) > > diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c > index e7387fec66795..065615df88eaa 100644 > --- a/kernel/kcsan/core.c > +++ b/kernel/kcsan/core.c > @@ -18,8 +18,8 @@ > #include "kcsan.h" > > static bool kcsan_early_enable = IS_ENABLED(CONFIG_KCSAN_EARLY_ENABLE); > -static unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; > -static unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; > +unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; > +unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; > static long kcsan_skip_watch = CONFIG_KCSAN_SKIP_WATCH; > static bool kcsan_interrupt_watcher = IS_ENABLED(CONFIG_KCSAN_INTERRUPT_WATCHER); > > diff --git a/kernel/kcsan/kcsan.h b/kernel/kcsan/kcsan.h > index 892de5120c1b6..e282f8b5749e9 100644 > --- a/kernel/kcsan/kcsan.h > +++ b/kernel/kcsan/kcsan.h > @@ -13,6 +13,9 @@ > /* The number of adjacent watchpoints to check. */ > #define KCSAN_CHECK_ADJACENT 1 > > +extern unsigned int kcsan_udelay_task; > +extern unsigned int kcsan_udelay_interrupt; > + > /* > * Globally enable and disable KCSAN. > */ > diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c > index 11c791b886f3c..7bdb515e3662f 100644 > --- a/kernel/kcsan/report.c > +++ b/kernel/kcsan/report.c > @@ -1,5 +1,7 @@ > // SPDX-License-Identifier: GPL-2.0 > > +#include <linux/debug_locks.h> > +#include <linux/delay.h> > #include <linux/jiffies.h> > #include <linux/kernel.h> > #include <linux/lockdep.h> > @@ -31,7 +33,26 @@ static struct { > int cpu_id; > unsigned long stack_entries[NUM_STACK_ENTRIES]; > int num_stack_entries; > -} other_info = { .ptr = NULL }; > + > + /* > + * Optionally pass @current. Typically we do not need to pass @current > + * via @other_info since just @task_pid is sufficient. Passing @current > + * has additional overhead. > + * > + * To safely pass @current, we must either use get_task_struct/ > + * put_task_struct, or stall the thread that populated @other_info. > + * > + * We cannot rely on get_task_struct/put_task_struct in case > + * release_report() races with a task being released, and would have to > + * free it in release_report(). This may result in deadlock if we want > + * to use KCSAN on the allocators. > + * > + * Since we also want to reliably print held locks for > + * CONFIG_KCSAN_VERBOSE, the current implementation stalls the thread > + * that populated @other_info until it has been consumed. > + */ > + struct task_struct *task; > +} other_info; > > /* > * Information about reported races; used to rate limit reporting. > @@ -245,6 +266,16 @@ static int sym_strcmp(void *addr1, void *addr2) > return strncmp(buf1, buf2, sizeof(buf1)); > } > > +static void print_verbose_info(struct task_struct *task) > +{ > + if (!task) > + return; > + > + pr_err("\n"); > + debug_show_held_locks(task); > + print_irqtrace_events(task); > +} > + > /* > * Returns true if a report was generated, false otherwise. > */ > @@ -319,6 +350,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, > other_info.num_stack_entries - other_skipnr, > 0); > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > + print_verbose_info(other_info.task); > + > pr_err("\n"); > pr_err("%s to 0x%px of %zu bytes by %s on cpu %i:\n", > get_access_type(access_type), ptr, size, > @@ -340,6 +374,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, > stack_trace_print(stack_entries + skipnr, num_stack_entries - skipnr, > 0); > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > + print_verbose_info(current); > + > /* Print report footer. */ > pr_err("\n"); > pr_err("Reported by Kernel Concurrency Sanitizer on:\n"); > @@ -357,6 +394,67 @@ static void release_report(unsigned long *flags, enum kcsan_report_type type) > spin_unlock_irqrestore(&report_lock, *flags); > } > > +/* > + * Sets @other_info.task and awaits consumption of @other_info. > + * > + * Precondition: report_lock is held. > + * Postcondition: report_lock is held. > + */ > +static void > +set_other_info_task_blocking(unsigned long *flags, const volatile void *ptr) > +{ > + /* > + * We may be instrumenting a code-path where current->state is already > + * something other than TASK_RUNNING. > + */ > + const bool is_running = current->state == TASK_RUNNING; > + /* > + * To avoid deadlock in case we are in an interrupt here and this is a > + * race with a task on the same CPU (KCSAN_INTERRUPT_WATCHER), provide a > + * timeout to ensure this works in all contexts. > + * > + * Await approximately the worst case delay of the reporting thread (if > + * we are not interrupted). > + */ > + int timeout = max(kcsan_udelay_task, kcsan_udelay_interrupt); > + > + other_info.task = current; > + do { > + if (is_running) { > + /* > + * Let lockdep know the real task is sleeping, to print > + * the held locks (recall we turned lockdep off, so > + * locking/unlocking @report_lock won't be recorded). > + */ > + set_current_state(TASK_UNINTERRUPTIBLE); > + } > + spin_unlock_irqrestore(&report_lock, *flags); > + /* > + * We cannot call schedule() since we also cannot reliably > + * determine if sleeping here is permitted -- see in_atomic(). > + */ > + > + udelay(1); > + spin_lock_irqsave(&report_lock, *flags); > + if (timeout-- < 0) { > + /* > + * Abort. Reset other_info.task to NULL, since it > + * appears the other thread is still going to consume > + * it. It will result in no verbose info printed for > + * this task. > + */ > + other_info.task = NULL; > + break; > + } > + /* > + * If @ptr nor @current matches, then our information has been > + * consumed and we may continue. If not, retry. > + */ > + } while (other_info.ptr == ptr && other_info.task == current); > + if (is_running) > + set_current_state(TASK_RUNNING); > +} > + > /* > * Depending on the report type either sets other_info and returns false, or > * acquires the matching other_info and returns true. If other_info is not > @@ -388,6 +486,9 @@ static bool prepare_report(unsigned long *flags, const volatile void *ptr, > other_info.cpu_id = cpu_id; > other_info.num_stack_entries = stack_trace_save(other_info.stack_entries, NUM_STACK_ENTRIES, 1); > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > + set_other_info_task_blocking(flags, ptr); > + > spin_unlock_irqrestore(&report_lock, *flags); > > /* > diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan > index 081ed2e1bf7b1..0f1447ff8f558 100644 > --- a/lib/Kconfig.kcsan > +++ b/lib/Kconfig.kcsan > @@ -20,6 +20,19 @@ menuconfig KCSAN > > if KCSAN > > +config KCSAN_VERBOSE > + bool "Show verbose reports with more information about system state" > + depends on PROVE_LOCKING > + help > + If enabled, reports show more information about the system state that > + may help better analyze and debug races. This includes held locks and > + IRQ trace events. > + > + While this option should generally be benign, we call into more > + external functions on report generation; if a race report is > + generated from any one of them, system stability may suffer due to > + deadlocks or recursion. If in doubt, say N. > + > config KCSAN_DEBUG > bool "Debugging of KCSAN internals" > > -- > 2.25.0.265.gbab2e86ba0-goog > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3] kcsan: Add option for verbose reporting 2020-04-06 12:59 ` Qian Cai @ 2020-04-06 13:35 ` Paul E. McKenney 2020-04-06 13:45 ` Qian Cai 0 siblings, 1 reply; 10+ messages in thread From: Paul E. McKenney @ 2020-04-06 13:35 UTC (permalink / raw) To: Qian Cai Cc: Andrey Konovalov, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Marco Elver On Mon, Apr 06, 2020 at 08:59:58AM -0400, Qian Cai wrote: > > > > On Feb 21, 2020, at 6:10 PM, Marco Elver <elver@google.com> wrote: > > > > Adds CONFIG_KCSAN_VERBOSE to optionally enable more verbose reports. > > Currently information about the reporting task's held locks and IRQ > > trace events are shown, if they are enabled. > > This patch is no longer in today’s linux-next. I suppose that it is because Paul had sent > the initial pull request without this one that I had missed dearly. > > Is there a way to get it back there? It goes back in in seven days, after -rc1 is released. The fact that it was there last week was a mistake on my part, and I did eventually get my hand slapped for it. ;-) In the meantime, if it would help, I could group the KCSAN commits on top of those in -tip to allow you to get them with one "git pull" command. Thanx, Paul > > Signed-off-by: Marco Elver <elver@google.com> > > Suggested-by: Qian Cai <cai@lca.pw> > > --- > > v3: > > * Typos > > v2: > > * Rework obtaining 'current' for the "other thread" -- it now passes > > 'current' and ensures that we stall until the report was printed, so > > that the lockdep information contained in 'current' is accurate. This > > was non-trivial but testing so far leads me to conclude this now > > reliably prints the held locks for the "other thread" (please test > > more!). > > --- > > kernel/kcsan/core.c | 4 +- > > kernel/kcsan/kcsan.h | 3 ++ > > kernel/kcsan/report.c | 103 +++++++++++++++++++++++++++++++++++++++++- > > lib/Kconfig.kcsan | 13 ++++++ > > 4 files changed, 120 insertions(+), 3 deletions(-) > > > > diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c > > index e7387fec66795..065615df88eaa 100644 > > --- a/kernel/kcsan/core.c > > +++ b/kernel/kcsan/core.c > > @@ -18,8 +18,8 @@ > > #include "kcsan.h" > > > > static bool kcsan_early_enable = IS_ENABLED(CONFIG_KCSAN_EARLY_ENABLE); > > -static unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; > > -static unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; > > +unsigned int kcsan_udelay_task = CONFIG_KCSAN_UDELAY_TASK; > > +unsigned int kcsan_udelay_interrupt = CONFIG_KCSAN_UDELAY_INTERRUPT; > > static long kcsan_skip_watch = CONFIG_KCSAN_SKIP_WATCH; > > static bool kcsan_interrupt_watcher = IS_ENABLED(CONFIG_KCSAN_INTERRUPT_WATCHER); > > > > diff --git a/kernel/kcsan/kcsan.h b/kernel/kcsan/kcsan.h > > index 892de5120c1b6..e282f8b5749e9 100644 > > --- a/kernel/kcsan/kcsan.h > > +++ b/kernel/kcsan/kcsan.h > > @@ -13,6 +13,9 @@ > > /* The number of adjacent watchpoints to check. */ > > #define KCSAN_CHECK_ADJACENT 1 > > > > +extern unsigned int kcsan_udelay_task; > > +extern unsigned int kcsan_udelay_interrupt; > > + > > /* > > * Globally enable and disable KCSAN. > > */ > > diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c > > index 11c791b886f3c..7bdb515e3662f 100644 > > --- a/kernel/kcsan/report.c > > +++ b/kernel/kcsan/report.c > > @@ -1,5 +1,7 @@ > > // SPDX-License-Identifier: GPL-2.0 > > > > +#include <linux/debug_locks.h> > > +#include <linux/delay.h> > > #include <linux/jiffies.h> > > #include <linux/kernel.h> > > #include <linux/lockdep.h> > > @@ -31,7 +33,26 @@ static struct { > > int cpu_id; > > unsigned long stack_entries[NUM_STACK_ENTRIES]; > > int num_stack_entries; > > -} other_info = { .ptr = NULL }; > > + > > + /* > > + * Optionally pass @current. Typically we do not need to pass @current > > + * via @other_info since just @task_pid is sufficient. Passing @current > > + * has additional overhead. > > + * > > + * To safely pass @current, we must either use get_task_struct/ > > + * put_task_struct, or stall the thread that populated @other_info. > > + * > > + * We cannot rely on get_task_struct/put_task_struct in case > > + * release_report() races with a task being released, and would have to > > + * free it in release_report(). This may result in deadlock if we want > > + * to use KCSAN on the allocators. > > + * > > + * Since we also want to reliably print held locks for > > + * CONFIG_KCSAN_VERBOSE, the current implementation stalls the thread > > + * that populated @other_info until it has been consumed. > > + */ > > + struct task_struct *task; > > +} other_info; > > > > /* > > * Information about reported races; used to rate limit reporting. > > @@ -245,6 +266,16 @@ static int sym_strcmp(void *addr1, void *addr2) > > return strncmp(buf1, buf2, sizeof(buf1)); > > } > > > > +static void print_verbose_info(struct task_struct *task) > > +{ > > + if (!task) > > + return; > > + > > + pr_err("\n"); > > + debug_show_held_locks(task); > > + print_irqtrace_events(task); > > +} > > + > > /* > > * Returns true if a report was generated, false otherwise. > > */ > > @@ -319,6 +350,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, > > other_info.num_stack_entries - other_skipnr, > > 0); > > > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > > + print_verbose_info(other_info.task); > > + > > pr_err("\n"); > > pr_err("%s to 0x%px of %zu bytes by %s on cpu %i:\n", > > get_access_type(access_type), ptr, size, > > @@ -340,6 +374,9 @@ static bool print_report(const volatile void *ptr, size_t size, int access_type, > > stack_trace_print(stack_entries + skipnr, num_stack_entries - skipnr, > > 0); > > > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > > + print_verbose_info(current); > > + > > /* Print report footer. */ > > pr_err("\n"); > > pr_err("Reported by Kernel Concurrency Sanitizer on:\n"); > > @@ -357,6 +394,67 @@ static void release_report(unsigned long *flags, enum kcsan_report_type type) > > spin_unlock_irqrestore(&report_lock, *flags); > > } > > > > +/* > > + * Sets @other_info.task and awaits consumption of @other_info. > > + * > > + * Precondition: report_lock is held. > > + * Postcondition: report_lock is held. > > + */ > > +static void > > +set_other_info_task_blocking(unsigned long *flags, const volatile void *ptr) > > +{ > > + /* > > + * We may be instrumenting a code-path where current->state is already > > + * something other than TASK_RUNNING. > > + */ > > + const bool is_running = current->state == TASK_RUNNING; > > + /* > > + * To avoid deadlock in case we are in an interrupt here and this is a > > + * race with a task on the same CPU (KCSAN_INTERRUPT_WATCHER), provide a > > + * timeout to ensure this works in all contexts. > > + * > > + * Await approximately the worst case delay of the reporting thread (if > > + * we are not interrupted). > > + */ > > + int timeout = max(kcsan_udelay_task, kcsan_udelay_interrupt); > > + > > + other_info.task = current; > > + do { > > + if (is_running) { > > + /* > > + * Let lockdep know the real task is sleeping, to print > > + * the held locks (recall we turned lockdep off, so > > + * locking/unlocking @report_lock won't be recorded). > > + */ > > + set_current_state(TASK_UNINTERRUPTIBLE); > > + } > > + spin_unlock_irqrestore(&report_lock, *flags); > > + /* > > + * We cannot call schedule() since we also cannot reliably > > + * determine if sleeping here is permitted -- see in_atomic(). > > + */ > > + > > + udelay(1); > > + spin_lock_irqsave(&report_lock, *flags); > > + if (timeout-- < 0) { > > + /* > > + * Abort. Reset other_info.task to NULL, since it > > + * appears the other thread is still going to consume > > + * it. It will result in no verbose info printed for > > + * this task. > > + */ > > + other_info.task = NULL; > > + break; > > + } > > + /* > > + * If @ptr nor @current matches, then our information has been > > + * consumed and we may continue. If not, retry. > > + */ > > + } while (other_info.ptr == ptr && other_info.task == current); > > + if (is_running) > > + set_current_state(TASK_RUNNING); > > +} > > + > > /* > > * Depending on the report type either sets other_info and returns false, or > > * acquires the matching other_info and returns true. If other_info is not > > @@ -388,6 +486,9 @@ static bool prepare_report(unsigned long *flags, const volatile void *ptr, > > other_info.cpu_id = cpu_id; > > other_info.num_stack_entries = stack_trace_save(other_info.stack_entries, NUM_STACK_ENTRIES, 1); > > > > + if (IS_ENABLED(CONFIG_KCSAN_VERBOSE)) > > + set_other_info_task_blocking(flags, ptr); > > + > > spin_unlock_irqrestore(&report_lock, *flags); > > > > /* > > diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan > > index 081ed2e1bf7b1..0f1447ff8f558 100644 > > --- a/lib/Kconfig.kcsan > > +++ b/lib/Kconfig.kcsan > > @@ -20,6 +20,19 @@ menuconfig KCSAN > > > > if KCSAN > > > > +config KCSAN_VERBOSE > > + bool "Show verbose reports with more information about system state" > > + depends on PROVE_LOCKING > > + help > > + If enabled, reports show more information about the system state that > > + may help better analyze and debug races. This includes held locks and > > + IRQ trace events. > > + > > + While this option should generally be benign, we call into more > > + external functions on report generation; if a race report is > > + generated from any one of them, system stability may suffer due to > > + deadlocks or recursion. If in doubt, say N. > > + > > config KCSAN_DEBUG > > bool "Debugging of KCSAN internals" > > > > -- > > 2.25.0.265.gbab2e86ba0-goog > > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3] kcsan: Add option for verbose reporting 2020-04-06 13:35 ` Paul E. McKenney @ 2020-04-06 13:45 ` Qian Cai 2020-04-06 19:51 ` Paul E. McKenney 0 siblings, 1 reply; 10+ messages in thread From: Qian Cai @ 2020-04-06 13:45 UTC (permalink / raw) To: paulmck Cc: Andrey Konovalov, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Marco Elver > On Apr 6, 2020, at 9:35 AM, Paul E. McKenney <paulmck@kernel.org> wrote: > > It goes back in in seven days, after -rc1 is released. The fact that > it was there last week was a mistake on my part, and I did eventually > get my hand slapped for it. ;-) > > In the meantime, if it would help, I could group the KCSAN commits > on top of those in -tip to allow you to get them with one "git pull" > command. Testing Linux-next for a week without that commit with KCSAN is a torture, so please do that if that is not much work. Otherwise, I could manually cherry-pick the commit myself after fixing all the offsets. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3] kcsan: Add option for verbose reporting 2020-04-06 13:45 ` Qian Cai @ 2020-04-06 19:51 ` Paul E. McKenney 2020-04-06 20:33 ` Qian Cai 0 siblings, 1 reply; 10+ messages in thread From: Paul E. McKenney @ 2020-04-06 19:51 UTC (permalink / raw) To: Qian Cai Cc: Andrey Konovalov, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Marco Elver On Mon, Apr 06, 2020 at 09:45:44AM -0400, Qian Cai wrote: > > > > On Apr 6, 2020, at 9:35 AM, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > It goes back in in seven days, after -rc1 is released. The fact that > > it was there last week was a mistake on my part, and I did eventually > > get my hand slapped for it. ;-) > > > > In the meantime, if it would help, I could group the KCSAN commits > > on top of those in -tip to allow you to get them with one "git pull" > > command. > > Testing Linux-next for a week without that commit with KCSAN is a torture, so please do that if that is not much work. Otherwise, I could manually cherry-pick the commit myself after fixing all the offsets. Just to confirm, you are interested in this -rcu commit, correct? 2402d0eae589 ("kcsan: Add option for verbose reporting") This one and the following are directly on top of the KCSAN stack that is in -tip and thus -next: 48b1fc1 kcsan: Add option to allow watcher interruptions 2402d0e kcsan: Add option for verbose reporting 44656d3 kcsan: Add current->state to implicitly atomic accesses e7b3410 kcsan: Fix a typo in a comment e7325b7 kcsan: Update Documentation/dev-tools/kcsan.rst 1443b8c kcsan: Update API documentation in kcsan-checks.h These are on top of this -tip commit: f5d2313bd3c5 ("kcsan, trace: Make KCSAN compatible with tracing") You can pull them in via the kcsan-dev.2020.03.25a branch if you wish. Thanx, Paul ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3] kcsan: Add option for verbose reporting 2020-04-06 19:51 ` Paul E. McKenney @ 2020-04-06 20:33 ` Qian Cai 2020-04-15 22:42 ` Paul E. McKenney 0 siblings, 1 reply; 10+ messages in thread From: Qian Cai @ 2020-04-06 20:33 UTC (permalink / raw) To: paulmck Cc: Andrey Konovalov, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Marco Elver > On Apr 6, 2020, at 3:51 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > On Mon, Apr 06, 2020 at 09:45:44AM -0400, Qian Cai wrote: >> >> >>> On Apr 6, 2020, at 9:35 AM, Paul E. McKenney <paulmck@kernel.org> wrote: >>> >>> It goes back in in seven days, after -rc1 is released. The fact that >>> it was there last week was a mistake on my part, and I did eventually >>> get my hand slapped for it. ;-) >>> >>> In the meantime, if it would help, I could group the KCSAN commits >>> on top of those in -tip to allow you to get them with one "git pull" >>> command. >> >> Testing Linux-next for a week without that commit with KCSAN is a torture, so please do that if that is not much work. Otherwise, I could manually cherry-pick the commit myself after fixing all the offsets. > > Just to confirm, you are interested in this -rcu commit, correct? > > 2402d0eae589 ("kcsan: Add option for verbose reporting") > > This one and the following are directly on top of the KCSAN stack > that is in -tip and thus -next: > > 48b1fc1 kcsan: Add option to allow watcher interruptions > 2402d0e kcsan: Add option for verbose reporting > 44656d3 kcsan: Add current->state to implicitly atomic accesses > e7b3410 kcsan: Fix a typo in a comment > e7325b7 kcsan: Update Documentation/dev-tools/kcsan.rst > 1443b8c kcsan: Update API documentation in kcsan-checks.h > > These are on top of this -tip commit: > > f5d2313bd3c5 ("kcsan, trace: Make KCSAN compatible with tracing") > > You can pull them in via the kcsan-dev.2020.03.25a branch if you wish. Great! That should be enough food for me to survive for this week. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3] kcsan: Add option for verbose reporting 2020-04-06 20:33 ` Qian Cai @ 2020-04-15 22:42 ` Paul E. McKenney 0 siblings, 0 replies; 10+ messages in thread From: Paul E. McKenney @ 2020-04-15 22:42 UTC (permalink / raw) To: Qian Cai Cc: Andrey Konovalov, Alexander Potapenko, Dmitry Vyukov, kasan-dev, LKML, Marco Elver On Mon, Apr 06, 2020 at 04:33:27PM -0400, Qian Cai wrote: > > > > On Apr 6, 2020, at 3:51 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Mon, Apr 06, 2020 at 09:45:44AM -0400, Qian Cai wrote: > >> > >> > >>> On Apr 6, 2020, at 9:35 AM, Paul E. McKenney <paulmck@kernel.org> wrote: > >>> > >>> It goes back in in seven days, after -rc1 is released. The fact that > >>> it was there last week was a mistake on my part, and I did eventually > >>> get my hand slapped for it. ;-) > >>> > >>> In the meantime, if it would help, I could group the KCSAN commits > >>> on top of those in -tip to allow you to get them with one "git pull" > >>> command. > >> > >> Testing Linux-next for a week without that commit with KCSAN is a torture, so please do that if that is not much work. Otherwise, I could manually cherry-pick the commit myself after fixing all the offsets. > > > > Just to confirm, you are interested in this -rcu commit, correct? > > > > 2402d0eae589 ("kcsan: Add option for verbose reporting") > > > > This one and the following are directly on top of the KCSAN stack > > that is in -tip and thus -next: > > > > 48b1fc1 kcsan: Add option to allow watcher interruptions > > 2402d0e kcsan: Add option for verbose reporting > > 44656d3 kcsan: Add current->state to implicitly atomic accesses > > e7b3410 kcsan: Fix a typo in a comment > > e7325b7 kcsan: Update Documentation/dev-tools/kcsan.rst > > 1443b8c kcsan: Update API documentation in kcsan-checks.h > > > > These are on top of this -tip commit: > > > > f5d2313bd3c5 ("kcsan, trace: Make KCSAN compatible with tracing") > > > > You can pull them in via the kcsan-dev.2020.03.25a branch if you wish. > > Great! That should be enough food for me to survive for this week. And I just put it back. Please accept my apologies for the delay, but the branching process fought a bit harder than usual. I probably missed today's -next, but hopefully tomorrow! Thanx, Paul ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2020-04-15 22:42 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-02-21 23:10 [PATCH v3] kcsan: Add option for verbose reporting Marco Elver 2020-02-22 1:36 ` Paul E. McKenney 2020-02-24 19:16 ` Marco Elver 2020-03-01 18:58 ` Paul E. McKenney 2020-04-06 12:59 ` Qian Cai 2020-04-06 13:35 ` Paul E. McKenney 2020-04-06 13:45 ` Qian Cai 2020-04-06 19:51 ` Paul E. McKenney 2020-04-06 20:33 ` Qian Cai 2020-04-15 22:42 ` Paul E. McKenney
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).