LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCHv4 0/2] tracing/rwmmio/arm64: Add support to trace register reads/writes
@ 2021-11-15 11:33 Sai Prakash Ranjan
  2021-11-15 11:33 ` [PATCHv4 1/2] tracing: Add register read/write tracing support Sai Prakash Ranjan
  2021-11-15 11:33 ` [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation Sai Prakash Ranjan
  0 siblings, 2 replies; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-15 11:33 UTC (permalink / raw)
  To: Will Deacon, rostedt, Catalin Marinas, quic_psodagud
  Cc: Marc Zyngier, gregkh, arnd, linux-arm-kernel, linux-kernel,
	linux-arm-msm, mingo, Sai Prakash Ranjan

Generic MMIO read/write i.e., __raw_{read,write}{b,l,w,q} accessors
are typically used to read/write from/to memory mapped registers
and can cause hangs or some undefined behaviour in following cases,

* If the access to the register space is unclocked, for example: if
  there is an access to multimedia(MM) block registers without MM
  clocks.

* If the register space is protected and not set to be accessible from
  non-secure world, for example: only EL3 (EL: Exception level) access
  is allowed and any EL2/EL1 access is forbidden.

* If xPU(memory/register protection units) is controlling access to
  certain memory/register space for specific clients.

and more...

Such cases usually results in instant reboot/SErrors/NOC or interconnect
hangs and tracing these register accesses can be very helpful to debug
such issues during initial development stages and also in later stages.

So use ftrace trace events to log such MMIO register accesses which
provides rich feature set such as early enablement of trace events,
filtering capability, dumping ftrace logs on console and many more.

Sample output:

rwmmio_read: gic_peek_irq+0xd0/0xd8 readl addr=0xffff800010040104
rwmmio_write: gic_poke_irq+0xe4/0xf0 writel addr=0xffff800010040184 
rwmmio_read: gic_do_wait_for_rwp+0x54/0x90 readl addr=0xffff800010040000
rwmmio_write: gic_set_affinity+0x1bc/0x1e8 writeq addr=0xffff800010046130

This series is a follow-up for the series [1] and a recent series [2] making use
of both.

[1] https://lore.kernel.org/lkml/cover.1536430404.git.saiprakash.ranjan@codeaurora.org/
[2] https://lore.kernel.org/lkml/1604631386-178312-1-git-send-email-psodagud@codeaurora.org/

Changes in v4:
 * Drop dynamic debug based filter support since that will be developed later with
   the help from Steven (Ftrace maintainer).
 * Drop value passed to writel as it is causing hangs when tracing is enabled.
 * Code cleanup for trace event as suggested by Steven for earlier version.
 * Fixed some build errors reported by 0-day bot.

Changes in v3:
 * Create a generic mmio header for instrumented version (Earlier suggested in [1]
   by Will Deacon and recently [2] by Greg to have a generic version first).
 * Add dynamic debug support to filter out traces which can be very useful for targeted
   debugging specific to subsystems or drivers.
 * Few modifications to the rwmmio trace event fields to include the mmio width and print
   addresses in hex.
 * Rewrote commit msg to explain some more about usecases.

Prasad Sodagudi (1):
  tracing: Add register read/write tracing support

Sai Prakash Ranjan (1):
  arm64/io: Add a header for mmio access instrumentation

 arch/arm64/include/asm/io.h       | 25 ++++-------
 arch/arm64/kvm/hyp/nvhe/Makefile  |  2 +-
 include/linux/mmio-instrumented.h | 70 +++++++++++++++++++++++++++++++
 include/trace/events/rwmmio.h     | 59 ++++++++++++++++++++++++++
 kernel/trace/Kconfig              |  7 ++++
 kernel/trace/Makefile             |  1 +
 kernel/trace/trace_readwrite.c    | 29 +++++++++++++
 7 files changed, 176 insertions(+), 17 deletions(-)
 create mode 100644 include/linux/mmio-instrumented.h
 create mode 100644 include/trace/events/rwmmio.h
 create mode 100644 kernel/trace/trace_readwrite.c

-- 
2.33.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCHv4 1/2] tracing: Add register read/write tracing support
  2021-11-15 11:33 [PATCHv4 0/2] tracing/rwmmio/arm64: Add support to trace register reads/writes Sai Prakash Ranjan
@ 2021-11-15 11:33 ` Sai Prakash Ranjan
  2021-11-19 13:43   ` Marc Zyngier
  2021-11-15 11:33 ` [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation Sai Prakash Ranjan
  1 sibling, 1 reply; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-15 11:33 UTC (permalink / raw)
  To: Will Deacon, rostedt, Catalin Marinas, quic_psodagud
  Cc: Marc Zyngier, gregkh, arnd, linux-arm-kernel, linux-kernel,
	linux-arm-msm, mingo, Prasad Sodagudi, Sai Prakash Ranjan

From: Prasad Sodagudi <psodagud@codeaurora.org>

Generic MMIO read/write i.e., __raw_{read,write}{b,l,w,q} accessors
are typically used to read/write from/to memory mapped registers
and can cause hangs or some undefined behaviour in following few
cases,

* If the access to the register space is unclocked, for example: if
  there is an access to multimedia(MM) block registers without MM
  clocks.

* If the register space is protected and not set to be accessible from
  non-secure world, for example: only EL3 (EL: Exception level) access
  is allowed and any EL2/EL1 access is forbidden.

* If xPU(memory/register protection units) is controlling access to
  certain memory/register space for specific clients.

and more...

Such cases usually results in instant reboot/SErrors/NOC or interconnect
hangs and tracing these register accesses can be very helpful to debug
such issues during initial development stages and also in later stages.

So use ftrace trace events to log such MMIO register accesses which
provides rich feature set such as early enablement of trace events,
filtering capability, dumping ftrace logs on console and many more.

Sample output:

rwmmio_read: gic_peek_irq+0xd0/0xd8 readl addr=0xffff800010040104
rwmmio_write: gic_poke_irq+0xe4/0xf0 writel addr=0xffff800010040184
rwmmio_read: gic_do_wait_for_rwp+0x54/0x90 readl addr=0xffff800010040000
rwmmio_write: gic_set_affinity+0x1bc/0x1e8 writeq addr=0xffff800010046130

Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
[saiprakash: Rewrote commit msg and trace event field edits]
Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
---

Have dropped value parameter for mmio write trace event as that
was causing hangs in strange ways, i.e., if we pass any other
64bit value, it works fine but passing value would just hang.
Not just using the log apis, even simple trace_printk with value
printed would cause hang. It wasn't noticed in early version
because dyndbg would filter the logging in my system (I had
set it to trace only specific qcom directory) but once this
version starts recording all the reads/writes with value passed,
it just hangs system when rwmmio write event tracing is enabled.

Reason why we wouldn't need value along with mmio write log is
that value can be easily deduced from the caller_name+offset which is
printed already by the rwmmio trace events which gives the exact
location of mmio writes and the value is easily known from the driver.
So trace event fields will be only the required ones and rest which
can be deduced from userspace with GDB like application can be skipped.

---
 include/trace/events/rwmmio.h  | 59 ++++++++++++++++++++++++++++++++++
 kernel/trace/Kconfig           |  7 ++++
 kernel/trace/Makefile          |  1 +
 kernel/trace/trace_readwrite.c | 29 +++++++++++++++++
 4 files changed, 96 insertions(+)
 create mode 100644 include/trace/events/rwmmio.h
 create mode 100644 kernel/trace/trace_readwrite.c

diff --git a/include/trace/events/rwmmio.h b/include/trace/events/rwmmio.h
new file mode 100644
index 000000000000..1687abf8e7a3
--- /dev/null
+++ b/include/trace/events/rwmmio.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2021 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM rwmmio
+
+#if !defined(_TRACE_RWMMIO_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_RWMMIO_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(rwmmio_write,
+
+	TP_PROTO(unsigned long fn, const char *width, volatile void __iomem *addr),
+
+	TP_ARGS(fn, width, addr),
+
+	TP_STRUCT__entry(
+		__field(u64, fn)
+		__field(u64, addr)
+		__string(width, width)
+	),
+
+	TP_fast_assign(
+		__entry->fn = fn;
+		__entry->addr = (u64)(void *)addr;
+		__assign_str(width, width);
+	),
+
+	TP_printk("%pS %s addr=%#llx",
+		(void *)(unsigned long)__entry->fn, __get_str(width), __entry->addr)
+);
+
+TRACE_EVENT(rwmmio_read,
+
+	TP_PROTO(unsigned long fn, const char *width, const volatile void __iomem *addr),
+
+	TP_ARGS(fn, width, addr),
+
+	TP_STRUCT__entry(
+		__field(u64, fn)
+		__field(u64, addr)
+		__string(width, width)
+	),
+
+	TP_fast_assign(
+		__entry->fn = fn;
+		__entry->addr = (u64)(void *)addr;
+		__assign_str(width, width);
+	),
+
+	TP_printk("%pS %s addr=%#llx",
+		 (void *)(unsigned long)__entry->fn, __get_str(width), __entry->addr)
+);
+
+#endif /* _TRACE_RWMMIO_H */
+
+#include <trace/define_trace.h>
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 420ff4bc67fd..9f55bcc51de1 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -95,6 +95,13 @@ config RING_BUFFER_ALLOW_SWAP
 	 Allow the use of ring_buffer_swap_cpu.
 	 Adds a very slight overhead to tracing when enabled.
 
+config TRACE_MMIO_ACCESS
+	bool "Register read/write tracing"
+	depends on TRACING
+	help
+	  Create tracepoints for MMIO read/write operations. These trace events
+	  can be used for logging all MMIO read/write operations.
+
 config PREEMPTIRQ_TRACEPOINTS
 	bool
 	depends on TRACE_PREEMPT_TOGGLE || TRACE_IRQFLAGS
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index bedc5caceec7..a3d16e1a5abd 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -99,5 +99,6 @@ obj-$(CONFIG_BOOTTIME_TRACING) += trace_boot.o
 obj-$(CONFIG_FTRACE_RECORD_RECURSION) += trace_recursion_record.o
 
 obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
+obj-$(CONFIG_TRACE_MMIO_ACCESS) += trace_readwrite.o
 
 libftrace-y := ftrace.o
diff --git a/kernel/trace/trace_readwrite.c b/kernel/trace/trace_readwrite.c
new file mode 100644
index 000000000000..334e21f70c62
--- /dev/null
+++ b/kernel/trace/trace_readwrite.c
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Register read and write tracepoints
+ *
+ * Copyright (c) 2021 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/ftrace.h>
+#include <linux/mmio-instrumented.h>
+#include <linux/module.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/rwmmio.h>
+
+#ifdef CONFIG_TRACE_MMIO_ACCESS
+void log_write_mmio(const char *width, volatile void __iomem *addr)
+{
+	trace_rwmmio_write(CALLER_ADDR0, width, addr);
+}
+EXPORT_SYMBOL_GPL(log_write_mmio);
+EXPORT_TRACEPOINT_SYMBOL_GPL(rwmmio_write);
+
+void log_read_mmio(const char *width, const volatile void __iomem *addr)
+{
+	trace_rwmmio_read(CALLER_ADDR0, width, addr);
+}
+EXPORT_SYMBOL_GPL(log_read_mmio);
+EXPORT_TRACEPOINT_SYMBOL_GPL(rwmmio_read);
+#endif /* CONFIG_TRACE_MMIO_ACCESS */
-- 
2.33.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-15 11:33 [PATCHv4 0/2] tracing/rwmmio/arm64: Add support to trace register reads/writes Sai Prakash Ranjan
  2021-11-15 11:33 ` [PATCHv4 1/2] tracing: Add register read/write tracing support Sai Prakash Ranjan
@ 2021-11-15 11:33 ` Sai Prakash Ranjan
  2021-11-16 22:40   ` Steven Rostedt
                     ` (3 more replies)
  1 sibling, 4 replies; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-15 11:33 UTC (permalink / raw)
  To: Will Deacon, rostedt, Catalin Marinas, quic_psodagud
  Cc: Marc Zyngier, gregkh, arnd, linux-arm-kernel, linux-kernel,
	linux-arm-msm, mingo, Sai Prakash Ranjan

The new generic header mmio-instrumented.h will keep arch code clean
and separate from instrumented version which traces mmio register
accesses. This instrumented header is generic and can be used by other
architectures as well. Also add a generic flag (__DISABLE_TRACE_MMIO__)
which is used to disable MMIO tracing in nVHE and if required can be
used to disable tracing for specific drivers.

Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
---
 arch/arm64/include/asm/io.h       | 25 ++++-------
 arch/arm64/kvm/hyp/nvhe/Makefile  |  2 +-
 include/linux/mmio-instrumented.h | 70 +++++++++++++++++++++++++++++++
 3 files changed, 80 insertions(+), 17 deletions(-)
 create mode 100644 include/linux/mmio-instrumented.h

diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 7fd836bea7eb..a635aaaf81b9 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -10,6 +10,7 @@
 
 #include <linux/types.h>
 #include <linux/pgtable.h>
+#include <linux/mmio-instrumented.h>
 
 #include <asm/byteorder.h>
 #include <asm/barrier.h>
@@ -21,32 +22,27 @@
 /*
  * Generic IO read/write.  These perform native-endian accesses.
  */
-#define __raw_writeb __raw_writeb
-static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
+static inline void arch_raw_writeb(u8 val, volatile void __iomem *addr)
 {
 	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
 }
 
-#define __raw_writew __raw_writew
-static inline void __raw_writew(u16 val, volatile void __iomem *addr)
+static inline void arch_raw_writew(u16 val, volatile void __iomem *addr)
 {
 	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
 }
 
-#define __raw_writel __raw_writel
-static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
+static __always_inline void arch_raw_writel(u32 val, volatile void __iomem *addr)
 {
 	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
 }
 
-#define __raw_writeq __raw_writeq
-static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
+static inline void arch_raw_writeq(u64 val, volatile void __iomem *addr)
 {
 	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
 }
 
-#define __raw_readb __raw_readb
-static inline u8 __raw_readb(const volatile void __iomem *addr)
+static inline u8 arch_raw_readb(const volatile void __iomem *addr)
 {
 	u8 val;
 	asm volatile(ALTERNATIVE("ldrb %w0, [%1]",
@@ -56,8 +52,7 @@ static inline u8 __raw_readb(const volatile void __iomem *addr)
 	return val;
 }
 
-#define __raw_readw __raw_readw
-static inline u16 __raw_readw(const volatile void __iomem *addr)
+static inline u16 arch_raw_readw(const volatile void __iomem *addr)
 {
 	u16 val;
 
@@ -68,8 +63,7 @@ static inline u16 __raw_readw(const volatile void __iomem *addr)
 	return val;
 }
 
-#define __raw_readl __raw_readl
-static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
+static __always_inline u32 arch_raw_readl(const volatile void __iomem *addr)
 {
 	u32 val;
 	asm volatile(ALTERNATIVE("ldr %w0, [%1]",
@@ -79,8 +73,7 @@ static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
 	return val;
 }
 
-#define __raw_readq __raw_readq
-static inline u64 __raw_readq(const volatile void __iomem *addr)
+static inline u64 arch_raw_readq(const volatile void __iomem *addr)
 {
 	u64 val;
 	asm volatile(ALTERNATIVE("ldr %0, [%1]",
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index c3c11974fa3b..ff56d2165ea9 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -4,7 +4,7 @@
 #
 
 asflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
-ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
+ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS -D__DISABLE_TRACE_MMIO__
 
 hostprogs := gen-hyprel
 HOST_EXTRACFLAGS += -I$(objtree)/include
diff --git a/include/linux/mmio-instrumented.h b/include/linux/mmio-instrumented.h
new file mode 100644
index 000000000000..99979c025cc1
--- /dev/null
+++ b/include/linux/mmio-instrumented.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2021 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _LINUX_MMIO_INSTRUMENTED_H
+#define _LINUX_MMIO_INSTRUMENTED_H
+
+#include <linux/tracepoint-defs.h>
+
+/*
+ * Tracepoint and MMIO logging symbols should not be visible at EL2(HYP) as
+ * there is no way to execute them and any such MMIO access from EL2 will
+ * explode instantly (Words of Marc Zyngier). So introduce a generic flag
+ * __DISABLE_TRACE_MMIO__ to disable MMIO tracing in nVHE and other drivers
+ * if required.
+ */
+#if IS_ENABLED(CONFIG_TRACE_MMIO_ACCESS) && !(defined(__DISABLE_TRACE_MMIO__))
+DECLARE_TRACEPOINT(rwmmio_write);
+DECLARE_TRACEPOINT(rwmmio_read);
+
+void log_write_mmio(const char *width, volatile void __iomem *addr);
+void log_read_mmio(const char *width, const volatile void __iomem *addr);
+
+#define __raw_write(v, a, _l)	({				\
+	volatile void __iomem *_a = (a);			\
+	if (tracepoint_enabled(rwmmio_write))			\
+		log_write_mmio(__stringify(write##_l), _a);	\
+	arch_raw_write##_l((v), _a);				\
+	})
+
+#define __raw_writeb(v, a)	__raw_write((v), a, b)
+#define __raw_writew(v, a)	__raw_write((v), a, w)
+#define __raw_writel(v, a)	__raw_write((v), a, l)
+#define __raw_writeq(v, a)	__raw_write((v), a, q)
+
+#define __raw_read(a, _l, _t)    ({				\
+	_t __a;							\
+	const volatile void __iomem *_a = (a);			\
+	if (tracepoint_enabled(rwmmio_read))			\
+		log_read_mmio(__stringify(read##_l), _a);	\
+	__a = arch_raw_read##_l(_a);				\
+	__a;							\
+	})
+
+#define __raw_readb(a)		__raw_read((a), b, u8)
+#define __raw_readw(a)		__raw_read((a), w, u16)
+#define __raw_readl(a)		__raw_read((a), l, u32)
+#define __raw_readq(a)		__raw_read((a), q, u64)
+
+#else
+
+#define __raw_writeb(v, a)	arch_raw_writeb(v, a)
+#define __raw_writew(v, a)	arch_raw_writew(v, a)
+#define __raw_writel(v, a)	arch_raw_writel(v, a)
+#define __raw_writeq(v, a)	arch_raw_writeq(v, a)
+
+#define __raw_readb(a)		arch_raw_readb(a)
+#define __raw_readw(a)		arch_raw_readw(a)
+#define __raw_readl(a)		arch_raw_readl(a)
+#define __raw_readq(a)		arch_raw_readq(a)
+
+static inline void log_write_mmio(const char *width,
+				  volatile void __iomem *addr) {}
+static inline void log_read_mmio(const char *width,
+				 const volatile void __iomem *addr) {}
+
+#endif /* CONFIG_TRACE_MMIO_ACCESS */
+
+#endif /* _LINUX_MMIO_INSTRUMENTED_H */
-- 
2.33.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-15 11:33 ` [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation Sai Prakash Ranjan
@ 2021-11-16 22:40   ` Steven Rostedt
  2021-11-17  3:53     ` Sai Prakash Ranjan
  2021-11-18 14:58   ` kernel test robot
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 22+ messages in thread
From: Steven Rostedt @ 2021-11-16 22:40 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Will Deacon, Catalin Marinas, quic_psodagud, Marc Zyngier,
	gregkh, arnd, linux-arm-kernel, linux-kernel, linux-arm-msm,
	mingo

On Mon, 15 Nov 2021 17:03:30 +0530
Sai Prakash Ranjan <quic_saipraka@quicinc.com> wrote:

> The new generic header mmio-instrumented.h will keep arch code clean
> and separate from instrumented version which traces mmio register
> accesses. This instrumented header is generic and can be used by other
> architectures as well. Also add a generic flag (__DISABLE_TRACE_MMIO__)
> which is used to disable MMIO tracing in nVHE and if required can be
> used to disable tracing for specific drivers.
> 
> Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
> ---
>  arch/arm64/include/asm/io.h       | 25 ++++-------
>  arch/arm64/kvm/hyp/nvhe/Makefile  |  2 +-
>  include/linux/mmio-instrumented.h | 70 +++++++++++++++++++++++++++++++
>  3 files changed, 80 insertions(+), 17 deletions(-)
>  create mode 100644 include/linux/mmio-instrumented.h
> 
> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
> index 7fd836bea7eb..a635aaaf81b9 100644
> --- a/arch/arm64/include/asm/io.h
> +++ b/arch/arm64/include/asm/io.h
> @@ -10,6 +10,7 @@
>  
>  #include <linux/types.h>
>  #include <linux/pgtable.h>
> +#include <linux/mmio-instrumented.h>
>  
>  #include <asm/byteorder.h>
>  #include <asm/barrier.h>
> @@ -21,32 +22,27 @@
>  /*
>   * Generic IO read/write.  These perform native-endian accesses.
>   */
> -#define __raw_writeb __raw_writeb
> -static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
> +static inline void arch_raw_writeb(u8 val, volatile void __iomem *addr)
>  {
>  	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
>  }
>  
> -#define __raw_writew __raw_writew
> -static inline void __raw_writew(u16 val, volatile void __iomem *addr)
> +static inline void arch_raw_writew(u16 val, volatile void __iomem *addr)
>  {
>  	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
>  }
>  
> -#define __raw_writel __raw_writel
> -static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
> +static __always_inline void arch_raw_writel(u32 val, volatile void __iomem *addr)
>  {
>  	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
>  }
>  
> -#define __raw_writeq __raw_writeq
> -static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
> +static inline void arch_raw_writeq(u64 val, volatile void __iomem *addr)
>  {
>  	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
>  }
>  
> -#define __raw_readb __raw_readb
> -static inline u8 __raw_readb(const volatile void __iomem *addr)
> +static inline u8 arch_raw_readb(const volatile void __iomem *addr)
>  {
>  	u8 val;
>  	asm volatile(ALTERNATIVE("ldrb %w0, [%1]",
> @@ -56,8 +52,7 @@ static inline u8 __raw_readb(const volatile void __iomem *addr)
>  	return val;
>  }
>  
> -#define __raw_readw __raw_readw
> -static inline u16 __raw_readw(const volatile void __iomem *addr)
> +static inline u16 arch_raw_readw(const volatile void __iomem *addr)
>  {
>  	u16 val;
>  
> @@ -68,8 +63,7 @@ static inline u16 __raw_readw(const volatile void __iomem *addr)
>  	return val;
>  }
>  
> -#define __raw_readl __raw_readl
> -static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
> +static __always_inline u32 arch_raw_readl(const volatile void __iomem *addr)
>  {
>  	u32 val;
>  	asm volatile(ALTERNATIVE("ldr %w0, [%1]",
> @@ -79,8 +73,7 @@ static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
>  	return val;
>  }
>  
> -#define __raw_readq __raw_readq
> -static inline u64 __raw_readq(const volatile void __iomem *addr)
> +static inline u64 arch_raw_readq(const volatile void __iomem *addr)

Shouldn't the above be done as a separate patch and handle other
architectures that implement __raw_read/write*()?


>  {
>  	u64 val;
>  	asm volatile(ALTERNATIVE("ldr %0, [%1]",
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index c3c11974fa3b..ff56d2165ea9 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -4,7 +4,7 @@
>  #
>  
>  asflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
> -ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
> +ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS -D__DISABLE_TRACE_MMIO__
>  
>  hostprogs := gen-hyprel
>  HOST_EXTRACFLAGS += -I$(objtree)/include
> diff --git a/include/linux/mmio-instrumented.h b/include/linux/mmio-instrumented.h
> new file mode 100644
> index 000000000000..99979c025cc1
> --- /dev/null
> +++ b/include/linux/mmio-instrumented.h
> @@ -0,0 +1,70 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2021 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _LINUX_MMIO_INSTRUMENTED_H
> +#define _LINUX_MMIO_INSTRUMENTED_H
> +
> +#include <linux/tracepoint-defs.h>
> +
> +/*
> + * Tracepoint and MMIO logging symbols should not be visible at EL2(HYP) as
> + * there is no way to execute them and any such MMIO access from EL2 will
> + * explode instantly (Words of Marc Zyngier). So introduce a generic flag
> + * __DISABLE_TRACE_MMIO__ to disable MMIO tracing in nVHE and other drivers
> + * if required.
> + */
> +#if IS_ENABLED(CONFIG_TRACE_MMIO_ACCESS) && !(defined(__DISABLE_TRACE_MMIO__))
> +DECLARE_TRACEPOINT(rwmmio_write);
> +DECLARE_TRACEPOINT(rwmmio_read);
> +
> +void log_write_mmio(const char *width, volatile void __iomem *addr);
> +void log_read_mmio(const char *width, const volatile void __iomem *addr);
> +
> +#define __raw_write(v, a, _l)	({				\
> +	volatile void __iomem *_a = (a);			\
> +	if (tracepoint_enabled(rwmmio_write))			\
> +		log_write_mmio(__stringify(write##_l), _a);	\
> +	arch_raw_write##_l((v), _a);				\
> +	})
> +
> +#define __raw_writeb(v, a)	__raw_write((v), a, b)
> +#define __raw_writew(v, a)	__raw_write((v), a, w)
> +#define __raw_writel(v, a)	__raw_write((v), a, l)
> +#define __raw_writeq(v, a)	__raw_write((v), a, q)
> +
> +#define __raw_read(a, _l, _t)    ({				\
> +	_t __a;							\
> +	const volatile void __iomem *_a = (a);			\
> +	if (tracepoint_enabled(rwmmio_read))			\
> +		log_read_mmio(__stringify(read##_l), _a);	\
> +	__a = arch_raw_read##_l(_a);				\
> +	__a;							\
> +	})
> +
> +#define __raw_readb(a)		__raw_read((a), b, u8)
> +#define __raw_readw(a)		__raw_read((a), w, u16)
> +#define __raw_readl(a)		__raw_read((a), l, u32)
> +#define __raw_readq(a)		__raw_read((a), q, u64)
> +
> +#else
> +
> +#define __raw_writeb(v, a)	arch_raw_writeb(v, a)
> +#define __raw_writew(v, a)	arch_raw_writew(v, a)
> +#define __raw_writel(v, a)	arch_raw_writel(v, a)
> +#define __raw_writeq(v, a)	arch_raw_writeq(v, a)
> +
> +#define __raw_readb(a)		arch_raw_readb(a)
> +#define __raw_readw(a)		arch_raw_readw(a)
> +#define __raw_readl(a)		arch_raw_readl(a)
> +#define __raw_readq(a)		arch_raw_readq(a)
> +
> +static inline void log_write_mmio(const char *width,
> +				  volatile void __iomem *addr) {}
> +static inline void log_read_mmio(const char *width,
> +				 const volatile void __iomem *addr) {}

The rest from a tracing point of view looks fine, and for that part:

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve


> +
> +#endif /* CONFIG_TRACE_MMIO_ACCESS */
> +
> +#endif /* _LINUX_MMIO_INSTRUMENTED_H */


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-16 22:40   ` Steven Rostedt
@ 2021-11-17  3:53     ` Sai Prakash Ranjan
  0 siblings, 0 replies; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-17  3:53 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Will Deacon, Catalin Marinas, quic_psodagud, Marc Zyngier,
	gregkh, arnd, linux-arm-kernel, linux-kernel, linux-arm-msm,
	mingo

Hi Steve,

On 11/17/2021 4:10 AM, Steven Rostedt wrote:
> On Mon, 15 Nov 2021 17:03:30 +0530
> Sai Prakash Ranjan <quic_saipraka@quicinc.com> wrote:
>
>> The new generic header mmio-instrumented.h will keep arch code clean
>> and separate from instrumented version which traces mmio register
>> accesses. This instrumented header is generic and can be used by other
>> architectures as well. Also add a generic flag (__DISABLE_TRACE_MMIO__)
>> which is used to disable MMIO tracing in nVHE and if required can be
>> used to disable tracing for specific drivers.
>>
>> Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
>> ---
>>   arch/arm64/include/asm/io.h       | 25 ++++-------
>>   arch/arm64/kvm/hyp/nvhe/Makefile  |  2 +-
>>   include/linux/mmio-instrumented.h | 70 +++++++++++++++++++++++++++++++
>>   3 files changed, 80 insertions(+), 17 deletions(-)
>>   create mode 100644 include/linux/mmio-instrumented.h
>>
>> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
>> index 7fd836bea7eb..a635aaaf81b9 100644
>> --- a/arch/arm64/include/asm/io.h
>> +++ b/arch/arm64/include/asm/io.h
>> @@ -10,6 +10,7 @@
>>   
>>   #include <linux/types.h>
>>   #include <linux/pgtable.h>
>> +#include <linux/mmio-instrumented.h>
>>   
>>   #include <asm/byteorder.h>
>>   #include <asm/barrier.h>
>> @@ -21,32 +22,27 @@
>>   /*
>>    * Generic IO read/write.  These perform native-endian accesses.
>>    */
>> -#define __raw_writeb __raw_writeb
>> -static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
>> +static inline void arch_raw_writeb(u8 val, volatile void __iomem *addr)
>>   {
>>   	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
>>   }
>>   
>> -#define __raw_writew __raw_writew
>> -static inline void __raw_writew(u16 val, volatile void __iomem *addr)
>> +static inline void arch_raw_writew(u16 val, volatile void __iomem *addr)
>>   {
>>   	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
>>   }
>>   
>> -#define __raw_writel __raw_writel
>> -static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
>> +static __always_inline void arch_raw_writel(u32 val, volatile void __iomem *addr)
>>   {
>>   	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
>>   }
>>   
>> -#define __raw_writeq __raw_writeq
>> -static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
>> +static inline void arch_raw_writeq(u64 val, volatile void __iomem *addr)
>>   {
>>   	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
>>   }
>>   
>> -#define __raw_readb __raw_readb
>> -static inline u8 __raw_readb(const volatile void __iomem *addr)
>> +static inline u8 arch_raw_readb(const volatile void __iomem *addr)
>>   {
>>   	u8 val;
>>   	asm volatile(ALTERNATIVE("ldrb %w0, [%1]",
>> @@ -56,8 +52,7 @@ static inline u8 __raw_readb(const volatile void __iomem *addr)
>>   	return val;
>>   }
>>   
>> -#define __raw_readw __raw_readw
>> -static inline u16 __raw_readw(const volatile void __iomem *addr)
>> +static inline u16 arch_raw_readw(const volatile void __iomem *addr)
>>   {
>>   	u16 val;
>>   
>> @@ -68,8 +63,7 @@ static inline u16 __raw_readw(const volatile void __iomem *addr)
>>   	return val;
>>   }
>>   
>> -#define __raw_readl __raw_readl
>> -static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
>> +static __always_inline u32 arch_raw_readl(const volatile void __iomem *addr)
>>   {
>>   	u32 val;
>>   	asm volatile(ALTERNATIVE("ldr %w0, [%1]",
>> @@ -79,8 +73,7 @@ static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
>>   	return val;
>>   }
>>   
>> -#define __raw_readq __raw_readq
>> -static inline u64 __raw_readq(const volatile void __iomem *addr)
>> +static inline u64 arch_raw_readq(const volatile void __iomem *addr)
> Shouldn't the above be done as a separate patch and handle other
> architectures that implement __raw_read/write*()?

Will do it as a separate patch in the next version, but can't add for 
other architectures yet
as I have no way to test it except compile test and are better left 
since there are certain things
which are specific to certain archs, for example on arm64, we cannot 
enable MMIO tracing for
arm64 NVHE EL2 (HYP) mode, these things are better known by the 
corresponding arch experts.
So the idea of this patch series is to provide a generic instrumentation 
facility which can be easily
adopted by other architectures as well with initial support for ARM64 
just like any other new
feature starting with support for 1/2 archs.

>
>>   {
>>   	u64 val;
>>   	asm volatile(ALTERNATIVE("ldr %0, [%1]",
>> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
>> index c3c11974fa3b..ff56d2165ea9 100644
>> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
>> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
>> @@ -4,7 +4,7 @@
>>   #
>>   
>>   asflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
>> -ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
>> +ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS -D__DISABLE_TRACE_MMIO__
>>   
>>   hostprogs := gen-hyprel
>>   HOST_EXTRACFLAGS += -I$(objtree)/include
>> diff --git a/include/linux/mmio-instrumented.h b/include/linux/mmio-instrumented.h
>> new file mode 100644
>> index 000000000000..99979c025cc1
>> --- /dev/null
>> +++ b/include/linux/mmio-instrumented.h
>> @@ -0,0 +1,70 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2021 Qualcomm Innovation Center, Inc. All rights reserved.
>> + */
>> +
>> +#ifndef _LINUX_MMIO_INSTRUMENTED_H
>> +#define _LINUX_MMIO_INSTRUMENTED_H
>> +
>> +#include <linux/tracepoint-defs.h>
>> +
>> +/*
>> + * Tracepoint and MMIO logging symbols should not be visible at EL2(HYP) as
>> + * there is no way to execute them and any such MMIO access from EL2 will
>> + * explode instantly (Words of Marc Zyngier). So introduce a generic flag
>> + * __DISABLE_TRACE_MMIO__ to disable MMIO tracing in nVHE and other drivers
>> + * if required.
>> + */
>> +#if IS_ENABLED(CONFIG_TRACE_MMIO_ACCESS) && !(defined(__DISABLE_TRACE_MMIO__))
>> +DECLARE_TRACEPOINT(rwmmio_write);
>> +DECLARE_TRACEPOINT(rwmmio_read);
>> +
>> +void log_write_mmio(const char *width, volatile void __iomem *addr);
>> +void log_read_mmio(const char *width, const volatile void __iomem *addr);
>> +
>> +#define __raw_write(v, a, _l)	({				\
>> +	volatile void __iomem *_a = (a);			\
>> +	if (tracepoint_enabled(rwmmio_write))			\
>> +		log_write_mmio(__stringify(write##_l), _a);	\
>> +	arch_raw_write##_l((v), _a);				\
>> +	})
>> +
>> +#define __raw_writeb(v, a)	__raw_write((v), a, b)
>> +#define __raw_writew(v, a)	__raw_write((v), a, w)
>> +#define __raw_writel(v, a)	__raw_write((v), a, l)
>> +#define __raw_writeq(v, a)	__raw_write((v), a, q)
>> +
>> +#define __raw_read(a, _l, _t)    ({				\
>> +	_t __a;							\
>> +	const volatile void __iomem *_a = (a);			\
>> +	if (tracepoint_enabled(rwmmio_read))			\
>> +		log_read_mmio(__stringify(read##_l), _a);	\
>> +	__a = arch_raw_read##_l(_a);				\
>> +	__a;							\
>> +	})
>> +
>> +#define __raw_readb(a)		__raw_read((a), b, u8)
>> +#define __raw_readw(a)		__raw_read((a), w, u16)
>> +#define __raw_readl(a)		__raw_read((a), l, u32)
>> +#define __raw_readq(a)		__raw_read((a), q, u64)
>> +
>> +#else
>> +
>> +#define __raw_writeb(v, a)	arch_raw_writeb(v, a)
>> +#define __raw_writew(v, a)	arch_raw_writew(v, a)
>> +#define __raw_writel(v, a)	arch_raw_writel(v, a)
>> +#define __raw_writeq(v, a)	arch_raw_writeq(v, a)
>> +
>> +#define __raw_readb(a)		arch_raw_readb(a)
>> +#define __raw_readw(a)		arch_raw_readw(a)
>> +#define __raw_readl(a)		arch_raw_readl(a)
>> +#define __raw_readq(a)		arch_raw_readq(a)
>> +
>> +static inline void log_write_mmio(const char *width,
>> +				  volatile void __iomem *addr) {}
>> +static inline void log_read_mmio(const char *width,
>> +				 const volatile void __iomem *addr) {}
> The rest from a tracing point of view looks fine, and for that part:
>
> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
>
> -- Steve
>
Thanks, would you be able to review Patch1 touching trace directory in 
case it has missed your queue?

-Sai

>> +
>> +#endif /* CONFIG_TRACE_MMIO_ACCESS */
>> +
>> +#endif /* _LINUX_MMIO_INSTRUMENTED_H */



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-15 11:33 ` [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation Sai Prakash Ranjan
  2021-11-16 22:40   ` Steven Rostedt
@ 2021-11-18 14:58   ` kernel test robot
  2021-11-18 15:24   ` Arnd Bergmann
  2021-11-19 13:48   ` Marc Zyngier
  3 siblings, 0 replies; 22+ messages in thread
From: kernel test robot @ 2021-11-18 14:58 UTC (permalink / raw)
  To: Sai Prakash Ranjan, Will Deacon, rostedt, Catalin Marinas, quic_psodagud
  Cc: kbuild-all, Marc Zyngier, gregkh, arnd, linux-arm-kernel,
	linux-kernel, linux-arm-msm

[-- Attachment #1: Type: text/plain, Size: 11555 bytes --]

Hi Sai,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on rostedt-trace/for-next]
[also build test WARNING on arm64/for-next/core arm-perf/for-next/perf v5.16-rc1 next-20211118]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Sai-Prakash-Ranjan/tracing-rwmmio-arm64-Add-support-to-trace-register-reads-writes/20211115-193645
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git for-next
config: csky-randconfig-r005-20211118 (attached as .config)
compiler: csky-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/b3765baa5dcf19d695332a310cf29d7abd39ad73
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Sai-Prakash-Ranjan/tracing-rwmmio-arm64-Add-support-to-trace-register-reads-writes/20211115-193645
        git checkout b3765baa5dcf19d695332a310cf29d7abd39ad73
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=csky SHELL=/bin/bash kernel/trace/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from include/linux/bitops.h:33,
                    from include/linux/kernel.h:12,
                    from include/linux/interrupt.h:6,
                    from include/linux/trace_recursion.h:5,
                    from include/linux/ftrace.h:10,
                    from kernel/trace/trace_readwrite.c:8:
   arch/csky/include/asm/bitops.h:77: warning: "__clear_bit" redefined
      77 | #define __clear_bit(nr, vaddr) clear_bit(nr, vaddr)
         | 
   In file included from arch/csky/include/asm/bitops.h:76,
                    from include/linux/bitops.h:33,
                    from include/linux/kernel.h:12,
                    from include/linux/interrupt.h:6,
                    from include/linux/trace_recursion.h:5,
                    from include/linux/ftrace.h:10,
                    from kernel/trace/trace_readwrite.c:8:
   include/asm-generic/bitops/non-atomic.h:34: note: this is the location of the previous definition
      34 | #define __clear_bit arch___clear_bit
         | 
   In file included from kernel/trace/trace_readwrite.c:9:
>> include/linux/mmio-instrumented.h:32: warning: "__raw_writeb" redefined
      32 | #define __raw_writeb(v, a)      __raw_write((v), a, b)
         | 
   In file included from arch/csky/include/asm/io.h:42,
                    from include/linux/io.h:13,
                    from include/linux/irq.h:20,
                    from include/asm-generic/hardirq.h:17,
                    from ./arch/csky/include/generated/asm/hardirq.h:1,
                    from include/linux/hardirq.h:11,
                    from include/linux/interrupt.h:11,
                    from include/linux/trace_recursion.h:5,
                    from include/linux/ftrace.h:10,
                    from kernel/trace/trace_readwrite.c:8:
   include/asm-generic/io.h:108: note: this is the location of the previous definition
     108 | #define __raw_writeb __raw_writeb
         | 
   In file included from kernel/trace/trace_readwrite.c:9:
>> include/linux/mmio-instrumented.h:33: warning: "__raw_writew" redefined
      33 | #define __raw_writew(v, a)      __raw_write((v), a, w)
         | 
   In file included from arch/csky/include/asm/io.h:42,
                    from include/linux/io.h:13,
                    from include/linux/irq.h:20,
                    from include/asm-generic/hardirq.h:17,
                    from ./arch/csky/include/generated/asm/hardirq.h:1,
                    from include/linux/hardirq.h:11,
                    from include/linux/interrupt.h:11,
                    from include/linux/trace_recursion.h:5,
                    from include/linux/ftrace.h:10,
                    from kernel/trace/trace_readwrite.c:8:
   include/asm-generic/io.h:116: note: this is the location of the previous definition
     116 | #define __raw_writew __raw_writew
         | 
   In file included from kernel/trace/trace_readwrite.c:9:
>> include/linux/mmio-instrumented.h:34: warning: "__raw_writel" redefined
      34 | #define __raw_writel(v, a)      __raw_write((v), a, l)
         | 
   In file included from arch/csky/include/asm/io.h:42,
                    from include/linux/io.h:13,
                    from include/linux/irq.h:20,
                    from include/asm-generic/hardirq.h:17,
                    from ./arch/csky/include/generated/asm/hardirq.h:1,
                    from include/linux/hardirq.h:11,
                    from include/linux/interrupt.h:11,
                    from include/linux/trace_recursion.h:5,
                    from include/linux/ftrace.h:10,
                    from kernel/trace/trace_readwrite.c:8:
   include/asm-generic/io.h:124: note: this is the location of the previous definition
     124 | #define __raw_writel __raw_writel
         | 
   In file included from kernel/trace/trace_readwrite.c:9:
>> include/linux/mmio-instrumented.h:46: warning: "__raw_readb" redefined
      46 | #define __raw_readb(a)          __raw_read((a), b, u8)
         | 
   In file included from arch/csky/include/asm/io.h:42,
                    from include/linux/io.h:13,
                    from include/linux/irq.h:20,
                    from include/asm-generic/hardirq.h:17,
                    from ./arch/csky/include/generated/asm/hardirq.h:1,
                    from include/linux/hardirq.h:11,
                    from include/linux/interrupt.h:11,
                    from include/linux/trace_recursion.h:5,
                    from include/linux/ftrace.h:10,
                    from kernel/trace/trace_readwrite.c:8:
   include/asm-generic/io.h:74: note: this is the location of the previous definition
      74 | #define __raw_readb __raw_readb
         | 
   In file included from kernel/trace/trace_readwrite.c:9:
>> include/linux/mmio-instrumented.h:47: warning: "__raw_readw" redefined
      47 | #define __raw_readw(a)          __raw_read((a), w, u16)
         | 
   In file included from arch/csky/include/asm/io.h:42,
                    from include/linux/io.h:13,
                    from include/linux/irq.h:20,
                    from include/asm-generic/hardirq.h:17,
                    from ./arch/csky/include/generated/asm/hardirq.h:1,
                    from include/linux/hardirq.h:11,
                    from include/linux/interrupt.h:11,
                    from include/linux/trace_recursion.h:5,
                    from include/linux/ftrace.h:10,
                    from kernel/trace/trace_readwrite.c:8:
   include/asm-generic/io.h:82: note: this is the location of the previous definition
      82 | #define __raw_readw __raw_readw
         | 
   In file included from kernel/trace/trace_readwrite.c:9:
>> include/linux/mmio-instrumented.h:48: warning: "__raw_readl" redefined
      48 | #define __raw_readl(a)          __raw_read((a), l, u32)
         | 
   In file included from arch/csky/include/asm/io.h:42,
                    from include/linux/io.h:13,
                    from include/linux/irq.h:20,
                    from include/asm-generic/hardirq.h:17,
                    from ./arch/csky/include/generated/asm/hardirq.h:1,
                    from include/linux/hardirq.h:11,
                    from include/linux/interrupt.h:11,
                    from include/linux/trace_recursion.h:5,
                    from include/linux/ftrace.h:10,
                    from kernel/trace/trace_readwrite.c:8:
   include/asm-generic/io.h:90: note: this is the location of the previous definition
      90 | #define __raw_readl __raw_readl
         | 
   In file included from include/trace/define_trace.h:102,
                    from include/trace/events/rwmmio.h:59,
                    from kernel/trace/trace_readwrite.c:13:
   include/trace/events/rwmmio.h: In function 'trace_event_raw_event_rwmmio_write':
>> include/trace/events/rwmmio.h:27:33: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      27 |                 __entry->addr = (u64)(void *)addr;
         |                                 ^
   include/trace/trace_events.h:743:11: note: in definition of macro 'DECLARE_EVENT_CLASS'
     743 |         { assign; }                                                     \
         |           ^~~~~~
   include/trace/trace_events.h:79:30: note: in expansion of macro 'PARAMS'
      79 |                              PARAMS(assign),                   \
         |                              ^~~~~~
   include/trace/events/rwmmio.h:13:1: note: in expansion of macro 'TRACE_EVENT'
      13 | TRACE_EVENT(rwmmio_write,
         | ^~~~~~~~~~~
   include/trace/events/rwmmio.h:25:9: note: in expansion of macro 'TP_fast_assign'
      25 |         TP_fast_assign(
         |         ^~~~~~~~~~~~~~
   include/trace/events/rwmmio.h: In function 'trace_event_raw_event_rwmmio_read':
   include/trace/events/rwmmio.h:49:33: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      49 |                 __entry->addr = (u64)(void *)addr;
         |                                 ^
   include/trace/trace_events.h:743:11: note: in definition of macro 'DECLARE_EVENT_CLASS'
     743 |         { assign; }                                                     \
         |           ^~~~~~
   include/trace/trace_events.h:79:30: note: in expansion of macro 'PARAMS'
      79 |                              PARAMS(assign),                   \
         |                              ^~~~~~
   include/trace/events/rwmmio.h:35:1: note: in expansion of macro 'TRACE_EVENT'
      35 | TRACE_EVENT(rwmmio_read,
         | ^~~~~~~~~~~
   include/trace/events/rwmmio.h:47:9: note: in expansion of macro 'TP_fast_assign'
      47 |         TP_fast_assign(
         |         ^~~~~~~~~~~~~~


vim +/__raw_writeb +32 include/linux/mmio-instrumented.h

    24	
    25	#define __raw_write(v, a, _l)	({				\
    26		volatile void __iomem *_a = (a);			\
    27		if (tracepoint_enabled(rwmmio_write))			\
    28			log_write_mmio(__stringify(write##_l), _a);	\
    29		arch_raw_write##_l((v), _a);				\
    30		})
    31	
  > 32	#define __raw_writeb(v, a)	__raw_write((v), a, b)
  > 33	#define __raw_writew(v, a)	__raw_write((v), a, w)
  > 34	#define __raw_writel(v, a)	__raw_write((v), a, l)
    35	#define __raw_writeq(v, a)	__raw_write((v), a, q)
    36	
    37	#define __raw_read(a, _l, _t)    ({				\
    38		_t __a;							\
    39		const volatile void __iomem *_a = (a);			\
    40		if (tracepoint_enabled(rwmmio_read))			\
    41			log_read_mmio(__stringify(read##_l), _a);	\
    42		__a = arch_raw_read##_l(_a);				\
    43		__a;							\
    44		})
    45	
  > 46	#define __raw_readb(a)		__raw_read((a), b, u8)
  > 47	#define __raw_readw(a)		__raw_read((a), w, u16)
  > 48	#define __raw_readl(a)		__raw_read((a), l, u32)
    49	#define __raw_readq(a)		__raw_read((a), q, u64)
    50	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 32912 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-15 11:33 ` [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation Sai Prakash Ranjan
  2021-11-16 22:40   ` Steven Rostedt
  2021-11-18 14:58   ` kernel test robot
@ 2021-11-18 15:24   ` Arnd Bergmann
  2021-11-19  4:06     ` Sai Prakash Ranjan
  2021-11-19 13:48   ` Marc Zyngier
  3 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2021-11-18 15:24 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Will Deacon, Steven Rostedt, Catalin Marinas, quic_psodagud,
	Marc Zyngier, gregkh, Arnd Bergmann, Linux ARM,
	Linux Kernel Mailing List, linux-arm-msm, Ingo Molnar

On Mon, Nov 15, 2021 at 12:33 PM Sai Prakash Ranjan
<quic_saipraka@quicinc.com> wrote:
>  /*
>   * Generic IO read/write.  These perform native-endian accesses.
>   */
> -#define __raw_writeb __raw_writeb
> -static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
> +static inline void arch_raw_writeb(u8 val, volatile void __iomem *addr)
>  {
>         asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
>  }

Woundn't removing the #define here will break the logic in
include/asm-generic/io.h,
making it fall back to the pointer-dereference version for the actual access?

> +#if IS_ENABLED(CONFIG_TRACE_MMIO_ACCESS) && !(defined(__DISABLE_TRACE_MMIO__))
> +DECLARE_TRACEPOINT(rwmmio_write);
> +DECLARE_TRACEPOINT(rwmmio_read);
> +
> +void log_write_mmio(const char *width, volatile void __iomem *addr);
> +void log_read_mmio(const char *width, const volatile void __iomem *addr);
> +
> +#define __raw_write(v, a, _l)  ({                              \
> +       volatile void __iomem *_a = (a);                        \
> +       if (tracepoint_enabled(rwmmio_write))                   \
> +               log_write_mmio(__stringify(write##_l), _a);     \
> +       arch_raw_write##_l((v), _a);                            \
> +       })

This feels like it's getting too big to be inlined. Have you considered
integrating this with the lib/logic_iomem.c infrastructure instead?

That already provides a way to override MMIO areas, and it lets you do
the logging from a single place rather than having it duplicated in every
single caller. It also provides a way of filtering it based on the ioremap()
call.

        Arnd

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-18 15:24   ` Arnd Bergmann
@ 2021-11-19  4:06     ` Sai Prakash Ranjan
  2021-11-22 13:35       ` Sai Prakash Ranjan
  0 siblings, 1 reply; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-19  4:06 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Will Deacon, Steven Rostedt, Catalin Marinas, quic_psodagud,
	Marc Zyngier, gregkh, Linux ARM, Linux Kernel Mailing List,
	linux-arm-msm, Ingo Molnar

Hi Arnd,

On 11/18/2021 8:54 PM, Arnd Bergmann wrote:
> On Mon, Nov 15, 2021 at 12:33 PM Sai Prakash Ranjan
> <quic_saipraka@quicinc.com> wrote:
>>   /*
>>    * Generic IO read/write.  These perform native-endian accesses.
>>    */
>> -#define __raw_writeb __raw_writeb
>> -static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
>> +static inline void arch_raw_writeb(u8 val, volatile void __iomem *addr)
>>   {
>>          asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
>>   }
> Woundn't removing the #define here will break the logic in
> include/asm-generic/io.h,
> making it fall back to the pointer-dereference version for the actual access?

#defines for these are added in mmio-instrumented.h header which is 
included in
arm64/asm/io.h, so it won't break the logic by falling back to 
pointer-dereference.

>> +#if IS_ENABLED(CONFIG_TRACE_MMIO_ACCESS) && !(defined(__DISABLE_TRACE_MMIO__))
>> +DECLARE_TRACEPOINT(rwmmio_write);
>> +DECLARE_TRACEPOINT(rwmmio_read);
>> +
>> +void log_write_mmio(const char *width, volatile void __iomem *addr);
>> +void log_read_mmio(const char *width, const volatile void __iomem *addr);
>> +
>> +#define __raw_write(v, a, _l)  ({                              \
>> +       volatile void __iomem *_a = (a);                        \
>> +       if (tracepoint_enabled(rwmmio_write))                   \
>> +               log_write_mmio(__stringify(write##_l), _a);     \
>> +       arch_raw_write##_l((v), _a);                            \
>> +       })
> This feels like it's getting too big to be inlined. Have you considered
> integrating this with the lib/logic_iomem.c infrastructure instead?
>
> That already provides a way to override MMIO areas, and it lets you do
> the logging from a single place rather than having it duplicated in every
> single caller. It also provides a way of filtering it based on the ioremap()
> call.
>

Thanks for the suggestion, will look at the logic_iomem.c and see if it 
fits our
usecase.

Thanks,
Sai



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 1/2] tracing: Add register read/write tracing support
  2021-11-15 11:33 ` [PATCHv4 1/2] tracing: Add register read/write tracing support Sai Prakash Ranjan
@ 2021-11-19 13:43   ` Marc Zyngier
  2021-11-19 14:07     ` Sai Prakash Ranjan
  0 siblings, 1 reply; 22+ messages in thread
From: Marc Zyngier @ 2021-11-19 13:43 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Will Deacon, rostedt, Catalin Marinas, quic_psodagud, gregkh,
	arnd, linux-arm-kernel, linux-kernel, linux-arm-msm, mingo,
	Prasad Sodagudi

On Mon, 15 Nov 2021 11:33:29 +0000,
Sai Prakash Ranjan <quic_saipraka@quicinc.com> wrote:
> 
> From: Prasad Sodagudi <psodagud@codeaurora.org>
> 
> Generic MMIO read/write i.e., __raw_{read,write}{b,l,w,q} accessors
> are typically used to read/write from/to memory mapped registers
> and can cause hangs or some undefined behaviour in following few
> cases,
> 
> * If the access to the register space is unclocked, for example: if
>   there is an access to multimedia(MM) block registers without MM
>   clocks.
> 
> * If the register space is protected and not set to be accessible from
>   non-secure world, for example: only EL3 (EL: Exception level) access
>   is allowed and any EL2/EL1 access is forbidden.
> 
> * If xPU(memory/register protection units) is controlling access to
>   certain memory/register space for specific clients.
> 
> and more...
> 
> Such cases usually results in instant reboot/SErrors/NOC or interconnect
> hangs and tracing these register accesses can be very helpful to debug
> such issues during initial development stages and also in later stages.
> 
> So use ftrace trace events to log such MMIO register accesses which
> provides rich feature set such as early enablement of trace events,
> filtering capability, dumping ftrace logs on console and many more.
> 
> Sample output:
> 
> rwmmio_read: gic_peek_irq+0xd0/0xd8 readl addr=0xffff800010040104
> rwmmio_write: gic_poke_irq+0xe4/0xf0 writel addr=0xffff800010040184
> rwmmio_read: gic_do_wait_for_rwp+0x54/0x90 readl addr=0xffff800010040000
> rwmmio_write: gic_set_affinity+0x1bc/0x1e8 writeq addr=0xffff800010046130
> 
> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
> [saiprakash: Rewrote commit msg and trace event field edits]
> Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
> ---
> 
> Have dropped value parameter for mmio write trace event as that
> was causing hangs in strange ways, i.e., if we pass any other
> 64bit value, it works fine but passing value would just hang.
> Not just using the log apis, even simple trace_printk with value
> printed would cause hang. It wasn't noticed in early version
> because dyndbg would filter the logging in my system (I had
> set it to trace only specific qcom directory) but once this
> version starts recording all the reads/writes with value passed,
> it just hangs system when rwmmio write event tracing is enabled.

Why is that so? Not being able to track the value written out makes
the feature pretty useless if you're not writing fixed values.

> Reason why we wouldn't need value along with mmio write log is
> that value can be easily deduced from the caller_name+offset which is
> printed already by the rwmmio trace events which gives the exact
> location of mmio writes and the value is easily known from the driver.

That's a very narrow view of what can be written in an MMIO
registers. We write dynamic values at all times, and if we are able to
trace MMIO writes, then the value written out must be part of the trace.

I'd rather you try and get to the bottom of this issue rather than
paper over it.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-15 11:33 ` [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation Sai Prakash Ranjan
                     ` (2 preceding siblings ...)
  2021-11-18 15:24   ` Arnd Bergmann
@ 2021-11-19 13:48   ` Marc Zyngier
  2021-11-19 14:09     ` Sai Prakash Ranjan
  3 siblings, 1 reply; 22+ messages in thread
From: Marc Zyngier @ 2021-11-19 13:48 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Will Deacon, rostedt, Catalin Marinas, quic_psodagud, gregkh,
	arnd, linux-arm-kernel, linux-kernel, linux-arm-msm, mingo

On Mon, 15 Nov 2021 11:33:30 +0000,
Sai Prakash Ranjan <quic_saipraka@quicinc.com> wrote:
> 
> The new generic header mmio-instrumented.h will keep arch code clean
> and separate from instrumented version which traces mmio register
> accesses. This instrumented header is generic and can be used by other
> architectures as well. Also add a generic flag (__DISABLE_TRACE_MMIO__)
> which is used to disable MMIO tracing in nVHE and if required can be
> used to disable tracing for specific drivers.
> 
> Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
> ---
>  arch/arm64/include/asm/io.h       | 25 ++++-------
>  arch/arm64/kvm/hyp/nvhe/Makefile  |  2 +-
>  include/linux/mmio-instrumented.h | 70 +++++++++++++++++++++++++++++++
>  3 files changed, 80 insertions(+), 17 deletions(-)
>  create mode 100644 include/linux/mmio-instrumented.h
> 
> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
> index 7fd836bea7eb..a635aaaf81b9 100644
> --- a/arch/arm64/include/asm/io.h
> +++ b/arch/arm64/include/asm/io.h
> @@ -10,6 +10,7 @@
>  
>  #include <linux/types.h>
>  #include <linux/pgtable.h>
> +#include <linux/mmio-instrumented.h>
>  
>  #include <asm/byteorder.h>
>  #include <asm/barrier.h>
> @@ -21,32 +22,27 @@
>  /*
>   * Generic IO read/write.  These perform native-endian accesses.
>   */
> -#define __raw_writeb __raw_writeb
> -static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
> +static inline void arch_raw_writeb(u8 val, volatile void __iomem *addr)
>  {
>  	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
>  }
>  
> -#define __raw_writew __raw_writew
> -static inline void __raw_writew(u16 val, volatile void __iomem *addr)
> +static inline void arch_raw_writew(u16 val, volatile void __iomem *addr)
>  {
>  	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
>  }
>  
> -#define __raw_writel __raw_writel
> -static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
> +static __always_inline void arch_raw_writel(u32 val, volatile void __iomem *addr)
>  {
>  	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
>  }
>  
> -#define __raw_writeq __raw_writeq
> -static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
> +static inline void arch_raw_writeq(u64 val, volatile void __iomem *addr)
>  {
>  	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
>  }
>  
> -#define __raw_readb __raw_readb
> -static inline u8 __raw_readb(const volatile void __iomem *addr)
> +static inline u8 arch_raw_readb(const volatile void __iomem *addr)
>  {
>  	u8 val;
>  	asm volatile(ALTERNATIVE("ldrb %w0, [%1]",
> @@ -56,8 +52,7 @@ static inline u8 __raw_readb(const volatile void __iomem *addr)
>  	return val;
>  }
>  
> -#define __raw_readw __raw_readw
> -static inline u16 __raw_readw(const volatile void __iomem *addr)
> +static inline u16 arch_raw_readw(const volatile void __iomem *addr)
>  {
>  	u16 val;
>  
> @@ -68,8 +63,7 @@ static inline u16 __raw_readw(const volatile void __iomem *addr)
>  	return val;
>  }
>  
> -#define __raw_readl __raw_readl
> -static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
> +static __always_inline u32 arch_raw_readl(const volatile void __iomem *addr)
>  {
>  	u32 val;
>  	asm volatile(ALTERNATIVE("ldr %w0, [%1]",
> @@ -79,8 +73,7 @@ static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
>  	return val;
>  }
>  
> -#define __raw_readq __raw_readq
> -static inline u64 __raw_readq(const volatile void __iomem *addr)
> +static inline u64 arch_raw_readq(const volatile void __iomem *addr)
>  {
>  	u64 val;
>  	asm volatile(ALTERNATIVE("ldr %0, [%1]",
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index c3c11974fa3b..ff56d2165ea9 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -4,7 +4,7 @@
>  #
>  
>  asflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
> -ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
> +ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS -D__DISABLE_TRACE_MMIO__
>  
>  hostprogs := gen-hyprel
>  HOST_EXTRACFLAGS += -I$(objtree)/include
> diff --git a/include/linux/mmio-instrumented.h b/include/linux/mmio-instrumented.h
> new file mode 100644
> index 000000000000..99979c025cc1
> --- /dev/null
> +++ b/include/linux/mmio-instrumented.h
> @@ -0,0 +1,70 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2021 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _LINUX_MMIO_INSTRUMENTED_H
> +#define _LINUX_MMIO_INSTRUMENTED_H
> +
> +#include <linux/tracepoint-defs.h>
> +
> +/*
> + * Tracepoint and MMIO logging symbols should not be visible at EL2(HYP) as
> + * there is no way to execute them and any such MMIO access from EL2 will
> + * explode instantly (Words of Marc Zyngier). So introduce a generic flag
> + * __DISABLE_TRACE_MMIO__ to disable MMIO tracing in nVHE and other drivers
> + * if required.
> + */

This Gospel would be better placed next to the code that defines the
macro, given that this is an arch-independent include file, and hardly
anyone understands the quirks of a nVHE KVM (and only nVHE, something
that the comment fails to capture).

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 1/2] tracing: Add register read/write tracing support
  2021-11-19 13:43   ` Marc Zyngier
@ 2021-11-19 14:07     ` Sai Prakash Ranjan
  2021-11-19 14:17       ` Marc Zyngier
  0 siblings, 1 reply; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-19 14:07 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Will Deacon, rostedt, Catalin Marinas, quic_psodagud, gregkh,
	arnd, linux-arm-kernel, linux-kernel, linux-arm-msm, mingo,
	Prasad Sodagudi

On 11/19/2021 7:13 PM, Marc Zyngier wrote:
> On Mon, 15 Nov 2021 11:33:29 +0000,
> Sai Prakash Ranjan <quic_saipraka@quicinc.com> wrote:
>> From: Prasad Sodagudi <psodagud@codeaurora.org>
>>
>> Generic MMIO read/write i.e., __raw_{read,write}{b,l,w,q} accessors
>> are typically used to read/write from/to memory mapped registers
>> and can cause hangs or some undefined behaviour in following few
>> cases,
>>
>> * If the access to the register space is unclocked, for example: if
>>    there is an access to multimedia(MM) block registers without MM
>>    clocks.
>>
>> * If the register space is protected and not set to be accessible from
>>    non-secure world, for example: only EL3 (EL: Exception level) access
>>    is allowed and any EL2/EL1 access is forbidden.
>>
>> * If xPU(memory/register protection units) is controlling access to
>>    certain memory/register space for specific clients.
>>
>> and more...
>>
>> Such cases usually results in instant reboot/SErrors/NOC or interconnect
>> hangs and tracing these register accesses can be very helpful to debug
>> such issues during initial development stages and also in later stages.
>>
>> So use ftrace trace events to log such MMIO register accesses which
>> provides rich feature set such as early enablement of trace events,
>> filtering capability, dumping ftrace logs on console and many more.
>>
>> Sample output:
>>
>> rwmmio_read: gic_peek_irq+0xd0/0xd8 readl addr=0xffff800010040104
>> rwmmio_write: gic_poke_irq+0xe4/0xf0 writel addr=0xffff800010040184
>> rwmmio_read: gic_do_wait_for_rwp+0x54/0x90 readl addr=0xffff800010040000
>> rwmmio_write: gic_set_affinity+0x1bc/0x1e8 writeq addr=0xffff800010046130
>>
>> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
>> [saiprakash: Rewrote commit msg and trace event field edits]
>> Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
>> ---
>>
>> Have dropped value parameter for mmio write trace event as that
>> was causing hangs in strange ways, i.e., if we pass any other
>> 64bit value, it works fine but passing value would just hang.
>> Not just using the log apis, even simple trace_printk with value
>> printed would cause hang. It wasn't noticed in early version
>> because dyndbg would filter the logging in my system (I had
>> set it to trace only specific qcom directory) but once this
>> version starts recording all the reads/writes with value passed,
>> it just hangs system when rwmmio write event tracing is enabled.
> Why is that so? Not being able to track the value written out makes
> the feature pretty useless if you're not writing fixed values.
>
>> Reason why we wouldn't need value along with mmio write log is
>> that value can be easily deduced from the caller_name+offset which is
>> printed already by the rwmmio trace events which gives the exact
>> location of mmio writes and the value is easily known from the driver.
> That's a very narrow view of what can be written in an MMIO
> registers. We write dynamic values at all times, and if we are able to
> trace MMIO writes, then the value written out must be part of the trace.
>
> I'd rather you try and get to the bottom of this issue rather than
> paper over it.
>
> Thanks,
>
> 	M.
>

Sure, idea was to put it out in the open if anyone has any idea as to 
what might be happening
there since the version where directly instrumenting the raw read/write 
accessors in arm64/asm/io.h
was working fine casting doubts if this has to do something with 
inlining as Arnd mentioned before.

Thanks,
Sai

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-19 13:48   ` Marc Zyngier
@ 2021-11-19 14:09     ` Sai Prakash Ranjan
  0 siblings, 0 replies; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-19 14:09 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Will Deacon, rostedt, Catalin Marinas, quic_psodagud, gregkh,
	arnd, linux-arm-kernel, linux-kernel, linux-arm-msm, mingo

On 11/19/2021 7:18 PM, Marc Zyngier wrote:
> On Mon, 15 Nov 2021 11:33:30 +0000,
> Sai Prakash Ranjan <quic_saipraka@quicinc.com> wrote:
>> The new generic header mmio-instrumented.h will keep arch code clean
>> and separate from instrumented version which traces mmio register
>> accesses. This instrumented header is generic and can be used by other
>> architectures as well. Also add a generic flag (__DISABLE_TRACE_MMIO__)
>> which is used to disable MMIO tracing in nVHE and if required can be
>> used to disable tracing for specific drivers.
>>
>> Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
>> ---
>>   arch/arm64/include/asm/io.h       | 25 ++++-------
>>   arch/arm64/kvm/hyp/nvhe/Makefile  |  2 +-
>>   include/linux/mmio-instrumented.h | 70 +++++++++++++++++++++++++++++++
>>   3 files changed, 80 insertions(+), 17 deletions(-)
>>   create mode 100644 include/linux/mmio-instrumented.h
>>
>> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
>> index 7fd836bea7eb..a635aaaf81b9 100644
>> --- a/arch/arm64/include/asm/io.h
>> +++ b/arch/arm64/include/asm/io.h
>> @@ -10,6 +10,7 @@
>>   
>>   #include <linux/types.h>
>>   #include <linux/pgtable.h>
>> +#include <linux/mmio-instrumented.h>
>>   
>>   #include <asm/byteorder.h>
>>   #include <asm/barrier.h>
>> @@ -21,32 +22,27 @@
>>   /*
>>    * Generic IO read/write.  These perform native-endian accesses.
>>    */
>> -#define __raw_writeb __raw_writeb
>> -static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
>> +static inline void arch_raw_writeb(u8 val, volatile void __iomem *addr)
>>   {
>>   	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
>>   }
>>   
>> -#define __raw_writew __raw_writew
>> -static inline void __raw_writew(u16 val, volatile void __iomem *addr)
>> +static inline void arch_raw_writew(u16 val, volatile void __iomem *addr)
>>   {
>>   	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
>>   }
>>   
>> -#define __raw_writel __raw_writel
>> -static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
>> +static __always_inline void arch_raw_writel(u32 val, volatile void __iomem *addr)
>>   {
>>   	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
>>   }
>>   
>> -#define __raw_writeq __raw_writeq
>> -static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
>> +static inline void arch_raw_writeq(u64 val, volatile void __iomem *addr)
>>   {
>>   	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
>>   }
>>   
>> -#define __raw_readb __raw_readb
>> -static inline u8 __raw_readb(const volatile void __iomem *addr)
>> +static inline u8 arch_raw_readb(const volatile void __iomem *addr)
>>   {
>>   	u8 val;
>>   	asm volatile(ALTERNATIVE("ldrb %w0, [%1]",
>> @@ -56,8 +52,7 @@ static inline u8 __raw_readb(const volatile void __iomem *addr)
>>   	return val;
>>   }
>>   
>> -#define __raw_readw __raw_readw
>> -static inline u16 __raw_readw(const volatile void __iomem *addr)
>> +static inline u16 arch_raw_readw(const volatile void __iomem *addr)
>>   {
>>   	u16 val;
>>   
>> @@ -68,8 +63,7 @@ static inline u16 __raw_readw(const volatile void __iomem *addr)
>>   	return val;
>>   }
>>   
>> -#define __raw_readl __raw_readl
>> -static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
>> +static __always_inline u32 arch_raw_readl(const volatile void __iomem *addr)
>>   {
>>   	u32 val;
>>   	asm volatile(ALTERNATIVE("ldr %w0, [%1]",
>> @@ -79,8 +73,7 @@ static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
>>   	return val;
>>   }
>>   
>> -#define __raw_readq __raw_readq
>> -static inline u64 __raw_readq(const volatile void __iomem *addr)
>> +static inline u64 arch_raw_readq(const volatile void __iomem *addr)
>>   {
>>   	u64 val;
>>   	asm volatile(ALTERNATIVE("ldr %0, [%1]",
>> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
>> index c3c11974fa3b..ff56d2165ea9 100644
>> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
>> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
>> @@ -4,7 +4,7 @@
>>   #
>>   
>>   asflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
>> -ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
>> +ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS -D__DISABLE_TRACE_MMIO__
>>   
>>   hostprogs := gen-hyprel
>>   HOST_EXTRACFLAGS += -I$(objtree)/include
>> diff --git a/include/linux/mmio-instrumented.h b/include/linux/mmio-instrumented.h
>> new file mode 100644
>> index 000000000000..99979c025cc1
>> --- /dev/null
>> +++ b/include/linux/mmio-instrumented.h
>> @@ -0,0 +1,70 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2021 Qualcomm Innovation Center, Inc. All rights reserved.
>> + */
>> +
>> +#ifndef _LINUX_MMIO_INSTRUMENTED_H
>> +#define _LINUX_MMIO_INSTRUMENTED_H
>> +
>> +#include <linux/tracepoint-defs.h>
>> +
>> +/*
>> + * Tracepoint and MMIO logging symbols should not be visible at EL2(HYP) as
>> + * there is no way to execute them and any such MMIO access from EL2 will
>> + * explode instantly (Words of Marc Zyngier). So introduce a generic flag
>> + * __DISABLE_TRACE_MMIO__ to disable MMIO tracing in nVHE and other drivers
>> + * if required.
>> + */
> This Gospel would be better placed next to the code that defines the
> macro, given that this is an arch-independent include file, and hardly
> anyone understands the quirks of a nVHE KVM (and only nVHE, something
> that the comment fails to capture).
>
> 	M.
>

I'll move the comment and include nVHE in the comment as well.

Thanks,
Sai

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 1/2] tracing: Add register read/write tracing support
  2021-11-19 14:07     ` Sai Prakash Ranjan
@ 2021-11-19 14:17       ` Marc Zyngier
  2021-11-19 15:19         ` Sai Prakash Ranjan
  0 siblings, 1 reply; 22+ messages in thread
From: Marc Zyngier @ 2021-11-19 14:17 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Will Deacon, rostedt, Catalin Marinas, quic_psodagud, gregkh,
	arnd, linux-arm-kernel, linux-kernel, linux-arm-msm, mingo,
	Prasad Sodagudi

On Fri, 19 Nov 2021 14:07:09 +0000,
Sai Prakash Ranjan <quic_saipraka@quicinc.com> wrote:
> 
> On 11/19/2021 7:13 PM, Marc Zyngier wrote:
> > On Mon, 15 Nov 2021 11:33:29 +0000,
> > Sai Prakash Ranjan <quic_saipraka@quicinc.com> wrote:
> >> From: Prasad Sodagudi <psodagud@codeaurora.org>
> >> 

[...]

> >> Reason why we wouldn't need value along with mmio write log is
> >> that value can be easily deduced from the caller_name+offset which is
> >> printed already by the rwmmio trace events which gives the exact
> >> location of mmio writes and the value is easily known from the driver.
> > That's a very narrow view of what can be written in an MMIO
> > registers. We write dynamic values at all times, and if we are able to
> > trace MMIO writes, then the value written out must be part of the trace.
> > 
> > I'd rather you try and get to the bottom of this issue rather than
> > paper over it.
> > 
> > Thanks,
> > 
> > 	M.
> > 
> 
> Sure, idea was to put it out in the open if anyone has any idea as
> to what might be happening there since the version where directly
> instrumenting the raw read/write accessors in arm64/asm/io.h was
> working fine casting doubts if this has to do something with
> inlining as Arnd mentioned before.

Yup. I wouldn't be surprised if MMIO accessors were getting directly
inlined at the wrong location and creating havoc. For example:

	writel(readl(addr1) | 1, addr2);

If you're not careful about capturing the result of the read rather
than the read itself, you can end-up with something really funky. No
idea if that's what is happening, but a disassembly of the generated
code could tell you.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 1/2] tracing: Add register read/write tracing support
  2021-11-19 14:17       ` Marc Zyngier
@ 2021-11-19 15:19         ` Sai Prakash Ranjan
  0 siblings, 0 replies; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-19 15:19 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Will Deacon, rostedt, Catalin Marinas, quic_psodagud, gregkh,
	arnd, linux-arm-kernel, linux-kernel, linux-arm-msm, mingo,
	Prasad Sodagudi

On 11/19/2021 7:47 PM, Marc Zyngier wrote:
> On Fri, 19 Nov 2021 14:07:09 +0000,
> Sai Prakash Ranjan <quic_saipraka@quicinc.com> wrote:
>> On 11/19/2021 7:13 PM, Marc Zyngier wrote:
>>> On Mon, 15 Nov 2021 11:33:29 +0000,
>>> Sai Prakash Ranjan <quic_saipraka@quicinc.com> wrote:
>>>> From: Prasad Sodagudi <psodagud@codeaurora.org>
>>>>
> [...]
>
>>>> Reason why we wouldn't need value along with mmio write log is
>>>> that value can be easily deduced from the caller_name+offset which is
>>>> printed already by the rwmmio trace events which gives the exact
>>>> location of mmio writes and the value is easily known from the driver.
>>> That's a very narrow view of what can be written in an MMIO
>>> registers. We write dynamic values at all times, and if we are able to
>>> trace MMIO writes, then the value written out must be part of the trace.
>>>
>>> I'd rather you try and get to the bottom of this issue rather than
>>> paper over it.
>>>
>>> Thanks,
>>>
>>> 	M.
>>>
>> Sure, idea was to put it out in the open if anyone has any idea as
>> to what might be happening there since the version where directly
>> instrumenting the raw read/write accessors in arm64/asm/io.h was
>> working fine casting doubts if this has to do something with
>> inlining as Arnd mentioned before.
> Yup. I wouldn't be surprised if MMIO accessors were getting directly
> inlined at the wrong location and creating havoc. For example:
>
> 	writel(readl(addr1) | 1, addr2);
>
> If you're not careful about capturing the result of the read rather
> than the read itself, you can end-up with something really funky. No
> idea if that's what is happening, but a disassembly of the generated
> code could tell you.
>
> 	M.
>

I did that initially (compare the disassembly in working and non-working 
case) but didn't find
anything noticeable, maybe I need to look some more. Thanks for the 
suggestion.

Thanks,
Sai

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-19  4:06     ` Sai Prakash Ranjan
@ 2021-11-22 13:35       ` Sai Prakash Ranjan
  2021-11-22 13:59         ` Arnd Bergmann
  0 siblings, 1 reply; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-22 13:35 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Will Deacon, Steven Rostedt, Catalin Marinas, quic_psodagud,
	Marc Zyngier, gregkh, Linux ARM, Linux Kernel Mailing List,
	linux-arm-msm, Ingo Molnar

On 11/19/2021 9:36 AM, Sai Prakash Ranjan wrote:
> Hi Arnd,
>
> On 11/18/2021 8:54 PM, Arnd Bergmann wrote:
>> On Mon, Nov 15, 2021 at 12:33 PM Sai Prakash Ranjan
>> <quic_saipraka@quicinc.com> wrote:
>>>   /*
>>>    * Generic IO read/write.  These perform native-endian accesses.
>>>    */
>>> -#define __raw_writeb __raw_writeb
>>> -static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
>>> +static inline void arch_raw_writeb(u8 val, volatile void __iomem 
>>> *addr)
>>>   {
>>>          asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
>>>   }
>> Woundn't removing the #define here will break the logic in
>> include/asm-generic/io.h,
>> making it fall back to the pointer-dereference version for the actual 
>> access?
>
> #defines for these are added in mmio-instrumented.h header which is 
> included in
> arm64/asm/io.h, so it won't break the logic by falling back to 
> pointer-dereference.
>
>>> +#if IS_ENABLED(CONFIG_TRACE_MMIO_ACCESS) && 
>>> !(defined(__DISABLE_TRACE_MMIO__))
>>> +DECLARE_TRACEPOINT(rwmmio_write);
>>> +DECLARE_TRACEPOINT(rwmmio_read);
>>> +
>>> +void log_write_mmio(const char *width, volatile void __iomem *addr);
>>> +void log_read_mmio(const char *width, const volatile void __iomem 
>>> *addr);
>>> +
>>> +#define __raw_write(v, a, _l) ({                              \
>>> +       volatile void __iomem *_a = (a);                        \
>>> +       if (tracepoint_enabled(rwmmio_write))                   \
>>> +               log_write_mmio(__stringify(write##_l), _a);     \
>>> +       arch_raw_write##_l((v), _a);                            \
>>> +       })
>> This feels like it's getting too big to be inlined. Have you considered
>> integrating this with the lib/logic_iomem.c infrastructure instead?
>>
>> That already provides a way to override MMIO areas, and it lets you do
>> the logging from a single place rather than having it duplicated in 
>> every
>> single caller. It also provides a way of filtering it based on the 
>> ioremap()
>> call.
>>
>
> Thanks for the suggestion, will look at the logic_iomem.c and see if 
> it fits our
> usecase.
>
>

So I looked at logic_iomem.c which seems to be useful for emulated IO 
for virtio drivers
but our usecase just needs to log the mmio operations and no additional 
stuff, similar to
the logging access of x86 msr registers via tracepoint 
(arch/x86/include/asm/msr-trace.h).
Also raw read/write macros in logic_iomem.c have the callbacks which 
seems to be pretty costly
than inlining or direct function call given it has to be called for 
every register read and write
which are going to be thousands in our case. In their usecase, read and 
write callbacks are just
pci cfgspace reads and writes which may not be that frequently called 
and the latency might not
be visible but in our case, I think it would be visible if we have a 
callback as such. I know this is a
debug feature and perf isn't expected much but that wouldn't mean we 
should not have a debug
feature which performs better right.

On the second point, filtering by ioremap isn't much useful for our 
usecase since ioremapped
region can have 100s of registers and we are interested in the exact 
register read/write which
would cause any of the issues mentioned in the description of this patchset.

So I feel like the current way where we consolidate the instrumentation 
in mmio-instrumented.h
seems like the better way than adding tracing to an emulated iomem library.

Thanks,
Sai

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-22 13:35       ` Sai Prakash Ranjan
@ 2021-11-22 13:59         ` Arnd Bergmann
  2021-11-22 14:19           ` Sai Prakash Ranjan
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2021-11-22 13:59 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Arnd Bergmann, Will Deacon, Steven Rostedt, Catalin Marinas,
	quic_psodagud, Marc Zyngier, gregkh, Linux ARM,
	Linux Kernel Mailing List, linux-arm-msm, Ingo Molnar

On Mon, Nov 22, 2021 at 2:35 PM Sai Prakash Ranjan
<quic_saipraka@quicinc.com> wrote:
> On 11/19/2021 9:36 AM, Sai Prakash Ranjan wrote:
>
> So I looked at logic_iomem.c which seems to be useful for emulated IO
> for virtio drivers
> but our usecase just needs to log the mmio operations and no additional
> stuff, similar to
> the logging access of x86 msr registers via tracepoint
> (arch/x86/include/asm/msr-trace.h).

I think it depends on whether one wants to filter the MMIO access based
on the device, or based on the caller.

> Also raw read/write macros in logic_iomem.c have the callbacks which
> seems to be pretty costly
> than inlining or direct function call given it has to be called for
> every register read and write
> which are going to be thousands in our case. In their usecase, read and
> write callbacks are just
> pci cfgspace reads and writes which may not be that frequently called
> and the latency might not
> be visible but in our case, I think it would be visible if we have a
> callback as such. I know this is a
> debug feature and perf isn't expected much but that wouldn't mean we
> should not have a debug
> feature which performs better right.

I would expect the cost of a bus access to always dwarf the cost of
indirect function calls and instrumentation. On the other hand,
the cost of an inline trace call is nontrivial in terms of code size,
which may lead to wasting significant amounts of both RAM and
instruction cache on small machines. If you want to continue with
your approach, it would help to include code size numbers before/after
for a defconfig kernel, and maybe some performance numbers to
show what this does when you enable tracing for all registers of
a device with a lot of accesses.

> On the second point, filtering by ioremap isn't much useful for our
> usecase since ioremapped
> region can have 100s of registers and we are interested in the exact
> register read/write which
> would cause any of the issues mentioned in the description of this patchset.
>
> So I feel like the current way where we consolidate the instrumentation
> in mmio-instrumented.h
> seems like the better way than adding tracing to an emulated iomem
> library.

There is another point that I don't like in the implementation, which is
the extra indirection. If we end up with your approach of doing it
inline per caller, I would prefer having the instrumentation in
include/asm-generic/io.h, like

#ifndef readl
#define readl readl
static inline u32 readl(const volatile void __iomem *addr)
{
        u32 val;

        __io_br();
        val = __le32_to_cpu((__le32 __force)__raw_readl(addr));
        __io_ar(val);
        if (tracepoint_enabled(rwmmio_read))
               log_read_mmio("readl", addr, val);
        return val;
}
#endif

I think this would be a lot less confusing to readers, as it is implemented
exactly in the place that has the normal definition, and it can also have
somewhat more logical semantics by only instrumenting the
normal/relaxed/ioport accessors but not the __raw_* versions that
are meant to be little more than a pointer dereference.

         Arnd

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-22 13:59         ` Arnd Bergmann
@ 2021-11-22 14:19           ` Sai Prakash Ranjan
  2021-11-22 14:30             ` Arnd Bergmann
  0 siblings, 1 reply; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-22 14:19 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Will Deacon, Steven Rostedt, Catalin Marinas, quic_psodagud,
	Marc Zyngier, gregkh, Linux ARM, Linux Kernel Mailing List,
	linux-arm-msm, Ingo Molnar

On 11/22/2021 7:29 PM, Arnd Bergmann wrote:
> On Mon, Nov 22, 2021 at 2:35 PM Sai Prakash Ranjan
> <quic_saipraka@quicinc.com> wrote:
>> On 11/19/2021 9:36 AM, Sai Prakash Ranjan wrote:
>>
>> So I looked at logic_iomem.c which seems to be useful for emulated IO
>> for virtio drivers
>> but our usecase just needs to log the mmio operations and no additional
>> stuff, similar to
>> the logging access of x86 msr registers via tracepoint
>> (arch/x86/include/asm/msr-trace.h).
> I think it depends on whether one wants to filter the MMIO access based
> on the device, or based on the caller.
>
>> Also raw read/write macros in logic_iomem.c have the callbacks which
>> seems to be pretty costly
>> than inlining or direct function call given it has to be called for
>> every register read and write
>> which are going to be thousands in our case. In their usecase, read and
>> write callbacks are just
>> pci cfgspace reads and writes which may not be that frequently called
>> and the latency might not
>> be visible but in our case, I think it would be visible if we have a
>> callback as such. I know this is a
>> debug feature and perf isn't expected much but that wouldn't mean we
>> should not have a debug
>> feature which performs better right.
> I would expect the cost of a bus access to always dwarf the cost of
> indirect function calls and instrumentation. On the other hand,
> the cost of an inline trace call is nontrivial in terms of code size,
> which may lead to wasting significant amounts of both RAM and
> instruction cache on small machines. If you want to continue with
> your approach, it would help to include code size numbers before/after
> for a defconfig kernel, and maybe some performance numbers to
> show what this does when you enable tracing for all registers of
> a device with a lot of accesses.

Sure, I will get the numbers for both cases(inline and indirect calls) 
and run some
benchmark tests with register tracing enabled for both cases.


>> On the second point, filtering by ioremap isn't much useful for our
>> usecase since ioremapped
>> region can have 100s of registers and we are interested in the exact
>> register read/write which
>> would cause any of the issues mentioned in the description of this patchset.
>>
>> So I feel like the current way where we consolidate the instrumentation
>> in mmio-instrumented.h
>> seems like the better way than adding tracing to an emulated iomem
>> library.
> There is another point that I don't like in the implementation, which is
> the extra indirection. If we end up with your approach of doing it
> inline per caller, I would prefer having the instrumentation in
> include/asm-generic/io.h, like
>
> #ifndef readl
> #define readl readl
> static inline u32 readl(const volatile void __iomem *addr)
> {
>          u32 val;
>
>          __io_br();
>          val = __le32_to_cpu((__le32 __force)__raw_readl(addr));
>          __io_ar(val);
>          if (tracepoint_enabled(rwmmio_read))
>                 log_read_mmio("readl", addr, val);
>          return val;
> }
> #endif
>
> I think this would be a lot less confusing to readers, as it is implemented
> exactly in the place that has the normal definition, and it can also have
> somewhat more logical semantics by only instrumenting the
> normal/relaxed/ioport accessors but not the __raw_* versions that
> are meant to be little more than a pointer dereference.
>
>           Arnd

But how is this different from logic in atomic-instrumented.h which also 
has asm-generic version?
Initial review few years back mentioned about having something similar 
to atomic instrumentation
and hence it was implemented with the similar approach keeping 
instrumentation out of arch specific
details.
And if we do move this instrumentation to asm-generic/io.h, how will 
that be executed since
the arch specifc read{b,w,l,q} overrides this generic version?

Thanks,
Sai

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-22 14:19           ` Sai Prakash Ranjan
@ 2021-11-22 14:30             ` Arnd Bergmann
  2021-11-22 14:59               ` Sai Prakash Ranjan
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2021-11-22 14:30 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Arnd Bergmann, Will Deacon, Steven Rostedt, Catalin Marinas,
	quic_psodagud, Marc Zyngier, gregkh, Linux ARM,
	Linux Kernel Mailing List, linux-arm-msm, Ingo Molnar

On Mon, Nov 22, 2021 at 3:19 PM Sai Prakash Ranjan
<quic_saipraka@quicinc.com> wrote:
> On 11/22/2021 7:29 PM, Arnd Bergmann wrote:
> >
> > I think this would be a lot less confusing to readers, as it is implemented
> > exactly in the place that has the normal definition, and it can also have
> > somewhat more logical semantics by only instrumenting the
> > normal/relaxed/ioport accessors but not the __raw_* versions that
> > are meant to be little more than a pointer dereference.
>
> But how is this different from logic in atomic-instrumented.h which also
> has asm-generic version?
> Initial review few years back mentioned about having something similar
> to atomic instrumentation
> and hence it was implemented with the similar approach keeping
> instrumentation out of arch specific details.

This is only a cosmetic difference. I usually prefer fewer indirections,
and I like the way that include/asm-generic/io.h only has all the
normal 'static inline' definitions spelled out, and calling the __raw_*
versions. Your version adds an extra layer with the arch_raw_readl(),
which I'd prefer to avoid.

> And if we do move this instrumentation to asm-generic/io.h, how will
> that be executed since
> the arch specifc read{b,w,l,q} overrides this generic version?

As I understand it, your version also requires architecture specific
changes, so that would be the same: it only works for architectures
that get the definition of readl()/readl_relaxed()/inl()/... from
include/asm-generic/io.h and only override the __raw version.

      Arnd

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-22 14:30             ` Arnd Bergmann
@ 2021-11-22 14:59               ` Sai Prakash Ranjan
  2021-11-22 15:35                 ` Arnd Bergmann
  0 siblings, 1 reply; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-22 14:59 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Will Deacon, Steven Rostedt, Catalin Marinas, quic_psodagud,
	Marc Zyngier, gregkh, Linux ARM, Linux Kernel Mailing List,
	linux-arm-msm, Ingo Molnar

On 11/22/2021 8:00 PM, Arnd Bergmann wrote:
> On Mon, Nov 22, 2021 at 3:19 PM Sai Prakash Ranjan
> <quic_saipraka@quicinc.com> wrote:
>> On 11/22/2021 7:29 PM, Arnd Bergmann wrote:
>>> I think this would be a lot less confusing to readers, as it is implemented
>>> exactly in the place that has the normal definition, and it can also have
>>> somewhat more logical semantics by only instrumenting the
>>> normal/relaxed/ioport accessors but not the __raw_* versions that
>>> are meant to be little more than a pointer dereference.
>> But how is this different from logic in atomic-instrumented.h which also
>> has asm-generic version?
>> Initial review few years back mentioned about having something similar
>> to atomic instrumentation
>> and hence it was implemented with the similar approach keeping
>> instrumentation out of arch specific details.
> This is only a cosmetic difference. I usually prefer fewer indirections,
> and I like the way that include/asm-generic/io.h only has all the
> normal 'static inline' definitions spelled out, and calling the __raw_*
> versions. Your version adds an extra layer with the arch_raw_readl(),
> which I'd prefer to avoid.

I'm ok with your preference as long as we have some way to log these 
MMIO accesses.

>> And if we do move this instrumentation to asm-generic/io.h, how will
>> that be executed since
>> the arch specifc read{b,w,l,q} overrides this generic version?
> As I understand it, your version also requires architecture specific 
> changes, so that would be the same: it only works for architectures 
> that get the definition of readl()/readl_relaxed()/inl()/... from 
> include/asm-generic/io.h and only override the __raw version. Arnd

Sorry, I didn't get this part, so  I am trying this on ARM64:

arm64/include/asm/io.h has read{b,l,w,q} defined.
include/asm-generic/io.h has below:
   #ifndef readl
   #define readl readl
   static inline u32 readl(const volatile void __iomem *addr)

and we include asm-generic/io.h in arm64/include/asm/io.h at the end 
after the definitions for arm64 mmio accesors.
So arch implementation here overrides generic ones as I see it, am I 
missing something? I even confirmed this
with some trace_printk to generic and arch specific definitions of readl 
and I see arch specific ones being called.

Thanks,
Sai

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-22 14:59               ` Sai Prakash Ranjan
@ 2021-11-22 15:35                 ` Arnd Bergmann
  2021-11-22 15:43                   ` Sai Prakash Ranjan
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2021-11-22 15:35 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Arnd Bergmann, Will Deacon, Steven Rostedt, Catalin Marinas,
	quic_psodagud, Marc Zyngier, gregkh, Linux ARM,
	Linux Kernel Mailing List, linux-arm-msm, Ingo Molnar

On Mon, Nov 22, 2021 at 3:59 PM Sai Prakash Ranjan
<quic_saipraka@quicinc.com> wrote:
> >> And if we do move this instrumentation to asm-generic/io.h, how will
> >> that be executed since
> >> the arch specifc read{b,w,l,q} overrides this generic version?
> > As I understand it, your version also requires architecture specific
> > changes, so that would be the same: it only works for architectures
> > that get the definition of readl()/readl_relaxed()/inl()/... from
> > include/asm-generic/io.h and only override the __raw version. Arnd
>
> Sorry, I didn't get this part, so  I am trying this on ARM64:
>
> arm64/include/asm/io.h has read{b,l,w,q} defined.
> include/asm-generic/io.h has below:
>    #ifndef readl
>    #define readl readl
>    static inline u32 readl(const volatile void __iomem *addr)
>
> and we include asm-generic/io.h in arm64/include/asm/io.h at the end
> after the definitions for arm64 mmio accesors.
> So arch implementation here overrides generic ones as I see it, am I
> missing something? I even confirmed this
> with some trace_printk to generic and arch specific definitions of readl
> and I see arch specific ones being called.

Ah, you are right that the arm64 version currently has custom definitions
of the high-level interfaces. These predate the introduction of the
__io_{p,}{b,a}{r,w} macros and are currently only used on risc-v.

I think in this case you should start by changing arm64 to use the
generic readl() etc definitions, by removing the extra definitions and
using

#define __io_ar(v) __iormb(__v)
#define __io_bw() dma_wmb()

      Arnd

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-22 15:35                 ` Arnd Bergmann
@ 2021-11-22 15:43                   ` Sai Prakash Ranjan
  2021-11-29 13:49                     ` Sai Prakash Ranjan
  0 siblings, 1 reply; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-22 15:43 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Will Deacon, Steven Rostedt, Catalin Marinas, quic_psodagud,
	Marc Zyngier, gregkh, Linux ARM, Linux Kernel Mailing List,
	linux-arm-msm, Ingo Molnar

On 11/22/2021 9:05 PM, Arnd Bergmann wrote:
> On Mon, Nov 22, 2021 at 3:59 PM Sai Prakash Ranjan
> <quic_saipraka@quicinc.com> wrote:
>>>> And if we do move this instrumentation to asm-generic/io.h, how will
>>>> that be executed since
>>>> the arch specifc read{b,w,l,q} overrides this generic version?
>>> As I understand it, your version also requires architecture specific
>>> changes, so that would be the same: it only works for architectures
>>> that get the definition of readl()/readl_relaxed()/inl()/... from
>>> include/asm-generic/io.h and only override the __raw version. Arnd
>> Sorry, I didn't get this part, so  I am trying this on ARM64:
>>
>> arm64/include/asm/io.h has read{b,l,w,q} defined.
>> include/asm-generic/io.h has below:
>>     #ifndef readl
>>     #define readl readl
>>     static inline u32 readl(const volatile void __iomem *addr)
>>
>> and we include asm-generic/io.h in arm64/include/asm/io.h at the end
>> after the definitions for arm64 mmio accesors.
>> So arch implementation here overrides generic ones as I see it, am I
>> missing something? I even confirmed this
>> with some trace_printk to generic and arch specific definitions of readl
>> and I see arch specific ones being called.
> Ah, you are right that the arm64 version currently has custom definitions
> of the high-level interfaces. These predate the introduction of the
> __io_{p,}{b,a}{r,w} macros and are currently only used on risc-v.
>
> I think in this case you should start by changing arm64 to use the
> generic readl() etc definitions, by removing the extra definitions and
> using
>
> #define __io_ar(v) __iormb(__v)
> #define __io_bw() dma_wmb()
>
>

Sure, will do that.

Thanks,
Sai

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation
  2021-11-22 15:43                   ` Sai Prakash Ranjan
@ 2021-11-29 13:49                     ` Sai Prakash Ranjan
  0 siblings, 0 replies; 22+ messages in thread
From: Sai Prakash Ranjan @ 2021-11-29 13:49 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Will Deacon, Steven Rostedt, Catalin Marinas, quic_psodagud,
	Marc Zyngier, gregkh, Linux ARM, Linux Kernel Mailing List,
	linux-arm-msm, Ingo Molnar

Hi Arnd,

On 11/22/2021 9:13 PM, Sai Prakash Ranjan wrote:
> On 11/22/2021 9:05 PM, Arnd Bergmann wrote:
>> On Mon, Nov 22, 2021 at 3:59 PM Sai Prakash Ranjan
>> <quic_saipraka@quicinc.com> wrote:
>>>>> And if we do move this instrumentation to asm-generic/io.h, how will
>>>>> that be executed since
>>>>> the arch specifc read{b,w,l,q} overrides this generic version?
>>>> As I understand it, your version also requires architecture specific
>>>> changes, so that would be the same: it only works for architectures
>>>> that get the definition of readl()/readl_relaxed()/inl()/... from
>>>> include/asm-generic/io.h and only override the __raw version. Arnd
>>> Sorry, I didn't get this part, so  I am trying this on ARM64:
>>>
>>> arm64/include/asm/io.h has read{b,l,w,q} defined.
>>> include/asm-generic/io.h has below:
>>>     #ifndef readl
>>>     #define readl readl
>>>     static inline u32 readl(const volatile void __iomem *addr)
>>>
>>> and we include asm-generic/io.h in arm64/include/asm/io.h at the end
>>> after the definitions for arm64 mmio accesors.
>>> So arch implementation here overrides generic ones as I see it, am I
>>> missing something? I even confirmed this
>>> with some trace_printk to generic and arch specific definitions of 
>>> readl
>>> and I see arch specific ones being called.
>> Ah, you are right that the arm64 version currently has custom 
>> definitions
>> of the high-level interfaces. These predate the introduction of the
>> __io_{p,}{b,a}{r,w} macros and are currently only used on risc-v.
>>
>> I think in this case you should start by changing arm64 to use the
>> generic readl() etc definitions, by removing the extra definitions and
>> using
>>
>> #define __io_ar(v) __iormb(__v)
>> #define __io_bw() dma_wmb()
>>
>>
>
> Sure, will do that.

I got the callback version implemented as suggested by you to compare 
the overall size and performance
with the inline version and apparently the size increased more in case 
of callback version when compared to
inline version. As for the performance, I ran some basic dd tests and 
sysbench and didn't see any noticeable
difference between the two implementations.

**Inline version with CONFIG_FTRACE=y and CONFIG_TRACE_MMIO_ACCESS=y**
$ size vmlinux
    text                   data                    bss dec             
hex         filename
23884219        14284468         532568 38701255        24e88c7 vmlinux

**Callback version with CONFIG_FTRACE=y and CONFIG_TRACE_MMIO_ACCESS=y**
$ size vmlinux
    text                  data                     bss dec               
        hex        filename
24108179        14279596         532568 38920343        251e097 vmlinux

$ ./scripts/bloat-o-meter inline-vmlinux callback-vmlinux
add/remove: 8/3 grow/shrink: 4889/89 up/down: 242244/-11564 (230680)
Total: Before=25812612, After=26043292, chg +0.89%

Note: I had arm64 move to asm-generic high level accessors in both 
versions which I plan to post together but
not included in below links,

For your reference, here are the 2 versions of the patches,
  Inline version - 
https://github.com/saiprakash-ranjan/RTB-Patches/blob/main/0001-asm-generic-io-Add-logging-support-for-MMIO-accessor.patch
  Callback version - 
https://github.com/saiprakash-ranjan/RTB-Patches/blob/main/0001-asm-generic-io-Add-callback-based-MMIO-logging-suppo.patch

Couple of things noted in callback version which didn't look quite good 
was that it needed some way to register the
callbacks and need to use some initcall (core_initcall used) as 
implemented in above patch but that would probably
mean we would lose some register logging(if there is some) in between 
early_initcall(which is when trace events get
enabled) and this trace_readwrite core_initcall. Another thing I noticed 
that since we now move to callback based
implementations, the caller info is somewhat inaccurate when compared to 
inline version.

Also regarding your earlier comment on inline versions being possibly 
problematic on lower memory systems, enabling
Ftrace itself is making a considerable size difference and in such 
systems ftrace wouldn't be enabled which implicitly means
MMIO logging which are based on trace events will be disabled anyways.

Here is the size delta with FTRACE enabled and disabled with arm64 
defconfig without MMIO logging support:
$ ./scripts/bloat-o-meter ftrace-disabled-vmlinux ftrace-enabled-vmlinux
add/remove: 8/3 grow/shrink: 4889/89 up/down: 242244/-11564 (230680)
Total: Before=22865837, After=25215198, chg +10.27%

Given all this, I would prefer inline versions in asm-generic high level 
accessors, let me know what you think.

Thanks,
Sai

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-11-29 13:52 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-15 11:33 [PATCHv4 0/2] tracing/rwmmio/arm64: Add support to trace register reads/writes Sai Prakash Ranjan
2021-11-15 11:33 ` [PATCHv4 1/2] tracing: Add register read/write tracing support Sai Prakash Ranjan
2021-11-19 13:43   ` Marc Zyngier
2021-11-19 14:07     ` Sai Prakash Ranjan
2021-11-19 14:17       ` Marc Zyngier
2021-11-19 15:19         ` Sai Prakash Ranjan
2021-11-15 11:33 ` [PATCHv4 2/2] arm64/io: Add a header for mmio access instrumentation Sai Prakash Ranjan
2021-11-16 22:40   ` Steven Rostedt
2021-11-17  3:53     ` Sai Prakash Ranjan
2021-11-18 14:58   ` kernel test robot
2021-11-18 15:24   ` Arnd Bergmann
2021-11-19  4:06     ` Sai Prakash Ranjan
2021-11-22 13:35       ` Sai Prakash Ranjan
2021-11-22 13:59         ` Arnd Bergmann
2021-11-22 14:19           ` Sai Prakash Ranjan
2021-11-22 14:30             ` Arnd Bergmann
2021-11-22 14:59               ` Sai Prakash Ranjan
2021-11-22 15:35                 ` Arnd Bergmann
2021-11-22 15:43                   ` Sai Prakash Ranjan
2021-11-29 13:49                     ` Sai Prakash Ranjan
2021-11-19 13:48   ` Marc Zyngier
2021-11-19 14:09     ` Sai Prakash Ranjan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).