LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* PowerPC, ftrace: Add PPC raw syscall tracepoints & ftrace fixes (mimimal subset only) v3
@ 2011-02-02  7:11 Ian Munsie
  2011-02-02  7:11 ` [PATCH 1/6] ftrace syscalls: don't add events for unmapped syscalls Ian Munsie
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Ian Munsie @ 2011-02-02  7:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andreas Dilger, Dave Kleikamp, Andrew Morton, Jiri Kosina,
	Jason Baron, linuxppc-dev, Steven Rostedt, Alexander Graf,
	Ingo Molnar, Paul Mackerras, KOSAKI Motohiro,
	Frederic Weisbecker, Scott Wood, Nathan Lynch, Avantika Mathur,
	David Gibson, Andreas Schwab, Namhyung Kim

git send-email just failed to save or send this message, so retyping...


Hi All,

This is a partial version of my 'ftrace syscalls, PowerPC: Various fixes,
Compat Syscall support and PowerPC implementation'.

I finally found some time this week to go back and work on this patchset, but
now I've been hit with another more pressing work item, so this will once again
have to go on the backburner :-(. Even though I haven't finished fixing up the
complete 40 patch series I would still like to get this smaller minimal subset
in. I've had a request for the raw syscall tracepoints on PowerPC that this
subset implements, and these patches have proven to be pretty stable.

This subset also fixes ftrace syscalls to ensure that events will only be
created for syscalls that successfully map their metadata to a syscall number,
so that non-working phantom events are not created. Patches #2 and #6 in this
series are not strictly necessary for this, they just optimise ftrace syscalls
a bit.

What's missing from this series that was in the v2 is the conversion of
all the syscalls implemented under /arch/powerpc, Jason Baron's compat
syscall support and the conversion of the remaining native and compat
syscalls to this infrastructure.


This is based on tip/master. The PowerPC specific patch will need Ben's ACK
before it goes in.

Changes from the last subset I posted:
- Rather than removing the redundant syscall_nr checks completely, I have
  turned them into WARN_ON_ONCE to catch possible future regressions, from
  Steven Rostedt's suggestion.
- From Mike Frysinger's suggestion, arch_syscall_addr is now a macro rather
  than a weak function to minimise the overhead at boot. Archs with special
  requirements (such as ppc64) can define their own macro in asm/ftrace.h.
  Steven Rostedt suggested this be made a static inline function, but I don't
  see how this would be possible (at least without #defines and #ifndefs) given
  that it has to be weak to allow archs to override it (Unless I misunderstood
  something?  Steven?).


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/6] ftrace syscalls: don't add events for unmapped syscalls
  2011-02-02  7:11 PowerPC, ftrace: Add PPC raw syscall tracepoints & ftrace fixes (mimimal subset only) v3 Ian Munsie
@ 2011-02-02  7:11 ` Ian Munsie
  2011-02-02  7:11 ` [PATCH 2/6] trace syscalls: Convert redundant syscall_nr checks into WARN_ON Ian Munsie
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Ian Munsie @ 2011-02-02  7:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andreas Dilger, Dave Kleikamp, Andrew Morton, Jiri Kosina,
	Jason Baron, linuxppc-dev, Steven Rostedt, Alexander Graf,
	Ingo Molnar, Paul Mackerras, KOSAKI Motohiro,
	Frederic Weisbecker, Scott Wood, Nathan Lynch, Avantika Mathur,
	David Gibson, Andreas Schwab, Namhyung Kim, Ian Munsie

From: Ian Munsie <imunsie@au1.ibm.com>

FTRACE_SYSCALLS would create events for each and every system call, even
if it had failed to map the system call's name with it's number. This
resulted in a number of events being created that would not behave as
expected.

This could happen, for example, on architectures who's symbol names are
unusual and will not match the system call name. It could also happen
with system calls which were mapped to sys_ni_syscall.

This patch changes the default system call number in the metadata to -1.
If the system call name from the metadata is not successfully mapped to
a system call number during boot, than the event initialisation routine
will now return an error, preventing the event from being created.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 include/linux/syscalls.h      |    2 ++
 kernel/trace/trace_syscalls.c |    8 ++++++++
 2 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 18cd068..2e5a68d 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -160,6 +160,7 @@ extern struct trace_event_functions exit_syscall_print_funcs;
 	  __attribute__((section("__syscalls_metadata")))	\
 	  __syscall_meta_##sname = {				\
 		.name 		= "sys"#sname,			\
+		.syscall_nr	= -1,	/* Filled in at boot */	\
 		.nb_args 	= nb,				\
 		.types		= types_##sname,		\
 		.args		= args_##sname,			\
@@ -176,6 +177,7 @@ extern struct trace_event_functions exit_syscall_print_funcs;
 	  __attribute__((section("__syscalls_metadata")))	\
 	  __syscall_meta__##sname = {				\
 		.name 		= "sys_"#sname,			\
+		.syscall_nr	= -1,	/* Filled in at boot */	\
 		.nb_args 	= 0,				\
 		.enter_event	= &event_enter__##sname,	\
 		.exit_event	= &event_exit__##sname,		\
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index b706529..a66bc13 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -423,6 +423,14 @@ void unreg_event_syscall_exit(struct ftrace_event_call *call)
 int init_syscall_trace(struct ftrace_event_call *call)
 {
 	int id;
+	int num;
+
+	num = ((struct syscall_metadata *)call->data)->syscall_nr;
+	if (num < 0 || num >= NR_syscalls) {
+		pr_debug("syscall %s metadata not mapped, disabling ftrace event\n",
+				((struct syscall_metadata *)call->data)->name);
+		return -ENOSYS;
+	}
 
 	if (set_syscall_print_fmt(call) < 0)
 		return -ENOMEM;
-- 
1.7.2.3


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/6] trace syscalls: Convert redundant syscall_nr checks into WARN_ON
  2011-02-02  7:11 PowerPC, ftrace: Add PPC raw syscall tracepoints & ftrace fixes (mimimal subset only) v3 Ian Munsie
  2011-02-02  7:11 ` [PATCH 1/6] ftrace syscalls: don't add events for unmapped syscalls Ian Munsie
@ 2011-02-02  7:11 ` Ian Munsie
  2011-02-02  7:11 ` [PATCH 3/6] ftrace syscalls: Make arch_syscall_addr weak Ian Munsie
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Ian Munsie @ 2011-02-02  7:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andreas Dilger, Dave Kleikamp, Andrew Morton, Jiri Kosina,
	Jason Baron, linuxppc-dev, Steven Rostedt, Alexander Graf,
	Ingo Molnar, Paul Mackerras, KOSAKI Motohiro,
	Frederic Weisbecker, Scott Wood, Nathan Lynch, Avantika Mathur,
	David Gibson, Andreas Schwab, Namhyung Kim, Ian Munsie

From: Ian Munsie <imunsie@au1.ibm.com>

With the ftrace events now checking if the syscall_nr is valid upon
initialisation it should no longer be possible to register or unregister
a syscall event without a valid syscall_nr since they should not be
created. This adds a WARN_ON_ONCE in the register and unregister
functions to locate potential regressions in the future.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 kernel/trace/trace_syscalls.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index a66bc13..1a6e8dd 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -358,7 +358,7 @@ int reg_event_syscall_enter(struct ftrace_event_call *call)
 	int num;
 
 	num = ((struct syscall_metadata *)call->data)->syscall_nr;
-	if (num < 0 || num >= NR_syscalls)
+	if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
 		return -ENOSYS;
 	mutex_lock(&syscall_trace_lock);
 	if (!sys_refcount_enter)
@@ -376,7 +376,7 @@ void unreg_event_syscall_enter(struct ftrace_event_call *call)
 	int num;
 
 	num = ((struct syscall_metadata *)call->data)->syscall_nr;
-	if (num < 0 || num >= NR_syscalls)
+	if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
 		return;
 	mutex_lock(&syscall_trace_lock);
 	sys_refcount_enter--;
@@ -392,7 +392,7 @@ int reg_event_syscall_exit(struct ftrace_event_call *call)
 	int num;
 
 	num = ((struct syscall_metadata *)call->data)->syscall_nr;
-	if (num < 0 || num >= NR_syscalls)
+	if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
 		return -ENOSYS;
 	mutex_lock(&syscall_trace_lock);
 	if (!sys_refcount_exit)
@@ -410,7 +410,7 @@ void unreg_event_syscall_exit(struct ftrace_event_call *call)
 	int num;
 
 	num = ((struct syscall_metadata *)call->data)->syscall_nr;
-	if (num < 0 || num >= NR_syscalls)
+	if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
 		return;
 	mutex_lock(&syscall_trace_lock);
 	sys_refcount_exit--;
-- 
1.7.2.3


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 3/6] ftrace syscalls: Make arch_syscall_addr weak
  2011-02-02  7:11 PowerPC, ftrace: Add PPC raw syscall tracepoints & ftrace fixes (mimimal subset only) v3 Ian Munsie
  2011-02-02  7:11 ` [PATCH 1/6] ftrace syscalls: don't add events for unmapped syscalls Ian Munsie
  2011-02-02  7:11 ` [PATCH 2/6] trace syscalls: Convert redundant syscall_nr checks into WARN_ON Ian Munsie
@ 2011-02-02  7:11 ` Ian Munsie
  2011-02-02  7:11 ` [PATCH 4/6] ftrace syscalls: Allow arch specific syscall symbol matching Ian Munsie
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Ian Munsie @ 2011-02-02  7:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andreas Dilger, Dave Kleikamp, Andrew Morton, Jiri Kosina,
	Jason Baron, linuxppc-dev, Steven Rostedt, Alexander Graf,
	Ingo Molnar, Paul Mackerras, KOSAKI Motohiro,
	Frederic Weisbecker, Scott Wood, Nathan Lynch, Avantika Mathur,
	David Gibson, Andreas Schwab, Namhyung Kim, Ian Munsie,
	open list:DOCUMENTATION

From: Ian Munsie <imunsie@au1.ibm.com>

Some architectures use non-trivial system call tables and will not work
with the generic arch_syscall_addr code. For example, PowerPC64 uses a
table of twin long longs.

This patch makes the generic arch_syscall_addr weak to allow
architectures with non-trivial system call tables to override it.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 Documentation/trace/ftrace-design.txt |    3 +++
 kernel/trace/trace_syscalls.c         |    2 +-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/Documentation/trace/ftrace-design.txt b/Documentation/trace/ftrace-design.txt
index dc52bd4..6fca17b 100644
--- a/Documentation/trace/ftrace-design.txt
+++ b/Documentation/trace/ftrace-design.txt
@@ -247,6 +247,9 @@ You need very few things to get the syscalls tracing in an arch.
 - Support the TIF_SYSCALL_TRACEPOINT thread flags.
 - Put the trace_sys_enter() and trace_sys_exit() tracepoints calls from ptrace
   in the ptrace syscalls tracing path.
+- If the system call table on this arch is more complicated than a simple array
+  of addresses of the system calls, implement an arch_syscall_addr to return
+  the address of a given system call.
 - Tag this arch as HAVE_SYSCALL_TRACEPOINTS.
 
 
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 1a6e8dd..33360b9 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -445,7 +445,7 @@ int init_syscall_trace(struct ftrace_event_call *call)
 	return id;
 }
 
-unsigned long __init arch_syscall_addr(int nr)
+unsigned long __init __weak arch_syscall_addr(int nr)
 {
 	return (unsigned long)sys_call_table[nr];
 }
-- 
1.7.2.3


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 4/6] ftrace syscalls: Allow arch specific syscall symbol matching
  2011-02-02  7:11 PowerPC, ftrace: Add PPC raw syscall tracepoints & ftrace fixes (mimimal subset only) v3 Ian Munsie
                   ` (2 preceding siblings ...)
  2011-02-02  7:11 ` [PATCH 3/6] ftrace syscalls: Make arch_syscall_addr weak Ian Munsie
@ 2011-02-02  7:11 ` Ian Munsie
  2011-02-02 14:04   ` Steven Rostedt
  2011-02-02  7:11 ` [PATCH 5/6] trace, powerpc: Implement raw syscall tracepoints on PowerPC Ian Munsie
  2011-02-02  7:11 ` [PATCH 6/6] trace syscalls: Early terminate search for sys_ni_syscall Ian Munsie
  5 siblings, 1 reply; 12+ messages in thread
From: Ian Munsie @ 2011-02-02  7:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andreas Dilger, Dave Kleikamp, Andrew Morton, Jiri Kosina,
	Jason Baron, linuxppc-dev, Steven Rostedt, Alexander Graf,
	Ingo Molnar, Paul Mackerras, KOSAKI Motohiro,
	Frederic Weisbecker, Scott Wood, Nathan Lynch, Avantika Mathur,
	David Gibson, Andreas Schwab, Namhyung Kim, Ian Munsie,
	open list:DOCUMENTATION

From: Ian Munsie <imunsie@au1.ibm.com>

Some architectures have unusual symbol names and the generic code to
match the symbol name with the function name for the syscall metadata
will fail. For example, symbols on PPC64 start with a period and the
generic code will fail to match them.

This patch moves the match logic out into a macro which can be
overridden by defining arch_syscall_match_sym_name in an archs
asm/ftrace.h if needed.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 Documentation/trace/ftrace-design.txt |    4 ++++
 include/linux/ftrace.h                |    9 +++++++++
 kernel/trace/trace_syscalls.c         |    8 +-------
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/Documentation/trace/ftrace-design.txt b/Documentation/trace/ftrace-design.txt
index 6fca17b..e1eaeb1 100644
--- a/Documentation/trace/ftrace-design.txt
+++ b/Documentation/trace/ftrace-design.txt
@@ -250,6 +250,10 @@ You need very few things to get the syscalls tracing in an arch.
 - If the system call table on this arch is more complicated than a simple array
   of addresses of the system calls, implement an arch_syscall_addr to return
   the address of a given system call.
+- If the symbol names of the system calls do not match the function names on
+  this arch, define the arch_syscall_match_sym_name macro in asm/ftrace.h with
+  the appropriate logic to return non zero if the function name corresponds
+  with the symbol name.
 - Tag this arch as HAVE_SYSCALL_TRACEPOINTS.
 
 
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index dcd6a7c..0d0e109 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -527,6 +527,15 @@ extern enum ftrace_dump_mode ftrace_dump_on_oops;
 #ifdef CONFIG_FTRACE_SYSCALLS
 
 unsigned long arch_syscall_addr(int nr);
+#ifndef arch_syscall_match_sym_name
+/*
+ * Only compare after the "sys" prefix. Archs that use
+ * syscall wrappers may have syscalls symbols aliases prefixed
+ * with "SyS" instead of "sys", leading to an unwanted
+ * mismatch.
+ */
+#define arch_syscall_match_sym_name(sym, name) !strcmp(sym + 3, name + 3)
+#endif
 
 #endif /* CONFIG_FTRACE_SYSCALLS */
 
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 33360b9..76bffba 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -72,13 +72,7 @@ static struct syscall_metadata *find_syscall_meta(unsigned long syscall)
 	kallsyms_lookup(syscall, NULL, NULL, NULL, str);
 
 	for ( ; start < stop; start++) {
-		/*
-		 * Only compare after the "sys" prefix. Archs that use
-		 * syscall wrappers may have syscalls symbols aliases prefixed
-		 * with "SyS" instead of "sys", leading to an unwanted
-		 * mismatch.
-		 */
-		if (start->name && !strcmp(start->name + 3, str + 3))
+		if (start->name && arch_syscall_match_sym_name(str, start->name))
 			return start;
 	}
 	return NULL;
-- 
1.7.2.3


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 5/6] trace, powerpc: Implement raw syscall tracepoints on PowerPC
  2011-02-02  7:11 PowerPC, ftrace: Add PPC raw syscall tracepoints & ftrace fixes (mimimal subset only) v3 Ian Munsie
                   ` (3 preceding siblings ...)
  2011-02-02  7:11 ` [PATCH 4/6] ftrace syscalls: Allow arch specific syscall symbol matching Ian Munsie
@ 2011-02-02  7:11 ` Ian Munsie
  2011-02-02  7:11 ` [PATCH 6/6] trace syscalls: Early terminate search for sys_ni_syscall Ian Munsie
  5 siblings, 0 replies; 12+ messages in thread
From: Ian Munsie @ 2011-02-02  7:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andreas Dilger, Dave Kleikamp, Andrew Morton, Jiri Kosina,
	Jason Baron, linuxppc-dev, Steven Rostedt, Alexander Graf,
	Ingo Molnar, Paul Mackerras, KOSAKI Motohiro,
	Frederic Weisbecker, Scott Wood, Nathan Lynch, Avantika Mathur,
	David Gibson, Andreas Schwab, Namhyung Kim, Ian Munsie,
	Ian Munsie

From: Ian Munsie <imunsie@au.ibm.com>

This patch implements the raw syscall tracepoints on PowerPC and exports
them for ftrace syscalls to use.

To minimise reworking existing code, I slightly re-ordered the thread
info flags such that the new TIF_SYSCALL_TRACEPOINT bit would still fit
within the 16 bits of the andi. instruction's UI field. The instructions
in question are in /arch/powerpc/kernel/entry_{32,64}.S to and the
_TIF_SYSCALL_T_OR_A with the thread flags to see if system call tracing
is enabled.

In the case of 64bit PowerPC, arch_syscall_addr and
arch_syscall_match_sym_name are overridden to allow ftrace syscalls to
work given the unusual system call table structure and symbol names that
start with a period.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 arch/powerpc/Kconfig                   |    1 +
 arch/powerpc/include/asm/ftrace.h      |   10 ++++++++++
 arch/powerpc/include/asm/syscall.h     |    5 +++++
 arch/powerpc/include/asm/thread_info.h |    7 +++++--
 arch/powerpc/kernel/Makefile           |    1 +
 arch/powerpc/kernel/ftrace.c           |    8 ++++++++
 arch/powerpc/kernel/ptrace.c           |   10 ++++++++++
 7 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 7d69e9b..a0e8e02 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -134,6 +134,7 @@ config PPC
 	select HAVE_GENERIC_HARDIRQS
 	select HAVE_SPARSE_IRQ
 	select IRQ_PER_CPU
+	select HAVE_SYSCALL_TRACEPOINTS
 
 config EARLY_PRINTK
 	bool
diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index dde1296..0a08902 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -60,4 +60,14 @@ struct dyn_arch_ftrace {
 
 #endif
 
+#if defined(CONFIG_FTRACE_SYSCALLS) && defined(CONFIG_PPC64)
+/*
+ * Compare the symbol name with the system call name. Skip the .sys or .SyS
+ * prefix from the symbol name and the sys prefix from the system call name and
+ * just match the rest. This is only needed on ppc64 since symbol names on
+ * 32bit do not start with a period so the generic function will work.
+ */
+#define arch_syscall_match_sym_name(sym, name) !strcmp(sym + 4, name + 3)
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_PPC64 */
+
 #endif /* _ASM_POWERPC_FTRACE */
diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
index 23913e9..b54b2ad 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -15,6 +15,11 @@
 
 #include <linux/sched.h>
 
+/* ftrace syscalls requires exporting the sys_call_table */
+#ifdef CONFIG_FTRACE_SYSCALLS
+extern const unsigned long *sys_call_table;
+#endif /* CONFIG_FTRACE_SYSCALLS */
+
 static inline long syscall_get_nr(struct task_struct *task,
 				  struct pt_regs *regs)
 {
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index 65eb859..4403f09 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -110,7 +110,8 @@ static inline struct thread_info *current_thread_info(void)
 #define TIF_NOERROR		12	/* Force successful syscall return */
 #define TIF_NOTIFY_RESUME	13	/* callback before returning to user */
 #define TIF_FREEZE		14	/* Freezing for suspend */
-#define TIF_RUNLATCH		15	/* Is the runlatch enabled? */
+#define TIF_SYSCALL_TRACEPOINT	15	/* syscall tracepoint instrumentation */
+#define TIF_RUNLATCH		16	/* Is the runlatch enabled? */
 
 /* as above, but as bit values */
 #define _TIF_SYSCALL_TRACE	(1<<TIF_SYSCALL_TRACE)
@@ -127,8 +128,10 @@ static inline struct thread_info *current_thread_info(void)
 #define _TIF_NOERROR		(1<<TIF_NOERROR)
 #define _TIF_NOTIFY_RESUME	(1<<TIF_NOTIFY_RESUME)
 #define _TIF_FREEZE		(1<<TIF_FREEZE)
+#define _TIF_SYSCALL_TRACEPOINT	(1<<TIF_SYSCALL_TRACEPOINT)
 #define _TIF_RUNLATCH		(1<<TIF_RUNLATCH)
-#define _TIF_SYSCALL_T_OR_A	(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP)
+#define _TIF_SYSCALL_T_OR_A	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
+				 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT)
 
 #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
 				 _TIF_NOTIFY_RESUME)
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 3bb2a3e..fe1ac47 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -105,6 +105,7 @@ obj64-$(CONFIG_AUDIT)		+= compat_audit.o
 
 obj-$(CONFIG_DYNAMIC_FTRACE)	+= ftrace.o
 obj-$(CONFIG_FUNCTION_GRAPH_TRACER)	+= ftrace.o
+obj-$(CONFIG_FTRACE_SYSCALLS)	+= ftrace.o
 obj-$(CONFIG_PERF_EVENTS)	+= perf_callchain.o
 
 obj-$(CONFIG_PPC_PERF_CTRS)	+= perf_event.o
diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index ce1f3e4..bf99cfa 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -22,6 +22,7 @@
 #include <asm/cacheflush.h>
 #include <asm/code-patching.h>
 #include <asm/ftrace.h>
+#include <asm/syscall.h>
 
 
 #ifdef CONFIG_DYNAMIC_FTRACE
@@ -600,3 +601,10 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
 	}
 }
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
+#if defined(CONFIG_FTRACE_SYSCALLS) && defined(CONFIG_PPC64)
+unsigned long __init arch_syscall_addr(int nr)
+{
+	return sys_call_table[nr*2];
+}
+#endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_PPC64 */
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 9065369..b6ff0cb 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -29,6 +29,7 @@
 #include <linux/signal.h>
 #include <linux/seccomp.h>
 #include <linux/audit.h>
+#include <trace/syscall.h>
 #ifdef CONFIG_PPC32
 #include <linux/module.h>
 #endif
@@ -40,6 +41,9 @@
 #include <asm/pgtable.h>
 #include <asm/system.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/syscalls.h>
+
 /*
  * The parameter save area on the stack is used to store arguments being passed
  * to callee function and is located at fixed offset from stack pointer.
@@ -1691,6 +1695,9 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 		 */
 		ret = -1L;
 
+	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
+		trace_sys_enter(regs, regs->gpr[0]);
+
 	if (unlikely(current->audit_context)) {
 #ifdef CONFIG_PPC64
 		if (!is_32bit_task())
@@ -1719,6 +1726,9 @@ void do_syscall_trace_leave(struct pt_regs *regs)
 		audit_syscall_exit((regs->ccr&0x10000000)?AUDITSC_FAILURE:AUDITSC_SUCCESS,
 				   regs->result);
 
+	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
+		trace_sys_exit(regs, regs->result);
+
 	step = test_thread_flag(TIF_SINGLESTEP);
 	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall_exit(regs, step);
-- 
1.7.2.3


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 6/6] trace syscalls: Early terminate search for sys_ni_syscall
  2011-02-02  7:11 PowerPC, ftrace: Add PPC raw syscall tracepoints & ftrace fixes (mimimal subset only) v3 Ian Munsie
                   ` (4 preceding siblings ...)
  2011-02-02  7:11 ` [PATCH 5/6] trace, powerpc: Implement raw syscall tracepoints on PowerPC Ian Munsie
@ 2011-02-02  7:11 ` Ian Munsie
  5 siblings, 0 replies; 12+ messages in thread
From: Ian Munsie @ 2011-02-02  7:11 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andreas Dilger, Dave Kleikamp, Andrew Morton, Jiri Kosina,
	Jason Baron, linuxppc-dev, Steven Rostedt, Alexander Graf,
	Ingo Molnar, Paul Mackerras, KOSAKI Motohiro,
	Frederic Weisbecker, Scott Wood, Nathan Lynch, Avantika Mathur,
	David Gibson, Andreas Schwab, Namhyung Kim, Ian Munsie

From: Ian Munsie <imunsie@au1.ibm.com>

Many system calls are unimplemented and mapped to sys_ni_syscall, but at
boot ftrace would still search through every syscall metadata entry for
a match which wouldn't be there.

This patch adds causes the search to terminate early if the system call
is not mapped.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 kernel/trace/trace_syscalls.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 76bffba..ff070aa 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -71,6 +71,9 @@ static struct syscall_metadata *find_syscall_meta(unsigned long syscall)
 	stop = (struct syscall_metadata *)__stop_syscalls_metadata;
 	kallsyms_lookup(syscall, NULL, NULL, NULL, str);
 
+	if (arch_syscall_match_sym_name(str, "sys_ni_syscall"))
+		return NULL;
+
 	for ( ; start < stop; start++) {
 		if (start->name && arch_syscall_match_sym_name(str, start->name))
 			return start;
-- 
1.7.2.3


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/6] ftrace syscalls: Allow arch specific syscall symbol matching
  2011-02-02  7:11 ` [PATCH 4/6] ftrace syscalls: Allow arch specific syscall symbol matching Ian Munsie
@ 2011-02-02 14:04   ` Steven Rostedt
  2011-02-02 14:15     ` [PATCH 4/6] ftrace syscalls: Allow arch specific syscallsymbol matching David Laight
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2011-02-02 14:04 UTC (permalink / raw)
  To: Ian Munsie
  Cc: linux-kernel, Andreas Dilger, Dave Kleikamp, Andrew Morton,
	Jiri Kosina, Jason Baron, linuxppc-dev, Alexander Graf,
	Ingo Molnar, Paul Mackerras, KOSAKI Motohiro,
	Frederic Weisbecker, Scott Wood, Nathan Lynch, Avantika Mathur,
	David Gibson, Andreas Schwab, Namhyung Kim,
	open list:DOCUMENTATION

On Wed, 2011-02-02 at 18:11 +1100, Ian Munsie wrote:

I'll answer your question here.

> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index dcd6a7c..0d0e109 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -527,6 +527,15 @@ extern enum ftrace_dump_mode ftrace_dump_on_oops;
>  #ifdef CONFIG_FTRACE_SYSCALLS
>  
>  unsigned long arch_syscall_addr(int nr);
> +#ifndef arch_syscall_match_sym_name
> +/*
> + * Only compare after the "sys" prefix. Archs that use
> + * syscall wrappers may have syscalls symbols aliases prefixed
> + * with "SyS" instead of "sys", leading to an unwanted
> + * mismatch.
> + */
> +#define arch_syscall_match_sym_name(sym, name) !strcmp(sym + 3, name + 3)

Instead, you could have:

#ifndef ARCH_HAS_SYSCALL_MATCH_SYM_NAME

static inline arch_syscall_match_sym_name(const char *sym, const char *name)
{
	return strcmp(sym + 3, name + 3) != 0;
}


If an arch needs to make its own, then it can simply override it by
creating its own version and defining:

#define ARCH_HAS_SYSCALL_MATCH_SYM_NAME

Just like they do when an arch has its own strcmp.

This just keeps things cleaner. I like to avoid macros as they can have
nasty side effects (never would have guess that if you look at what I've
done in trace/ftrace.h ;-)

For example, if we use your call as:

arch_syscall_match_sym_name(str = sym[5], name);

That str would end up being something unexpected. I'm not condoning such
side-effect code, but it is something to think about when using macros.

-- Steve


> +#endif
>  
>  #endif /* CONFIG_FTRACE_SYSCALLS */
>  
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index 33360b9..76bffba 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -72,13 +72,7 @@ static struct syscall_metadata *find_syscall_meta(unsigned long syscall)
>  	kallsyms_lookup(syscall, NULL, NULL, NULL, str);
>  
>  	for ( ; start < stop; start++) {
> -		/*
> -		 * Only compare after the "sys" prefix. Archs that use
> -		 * syscall wrappers may have syscalls symbols aliases prefixed
> -		 * with "SyS" instead of "sys", leading to an unwanted
> -		 * mismatch.
> -		 */
> -		if (start->name && !strcmp(start->name + 3, str + 3))
> +		if (start->name && arch_syscall_match_sym_name(str, start->name))
>  			return start;
>  	}
>  	return NULL;



^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH 4/6] ftrace syscalls: Allow arch specific syscallsymbol matching
  2011-02-02 14:04   ` Steven Rostedt
@ 2011-02-02 14:15     ` David Laight
  2011-02-02 14:28       ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: David Laight @ 2011-02-02 14:15 UTC (permalink / raw)
  To: Steven Rostedt, Ian Munsie
  Cc: Andreas Dilger, Andreas Schwab, Dave Kleikamp,
	open list:DOCUMENTATION, Namhyung Kim, Jiri Kosina, Jason Baron,
	linux-kernel, Alexander Graf, Ingo Molnar, Paul Mackerras,
	KOSAKI Motohiro, Frederic Weisbecker, Scott Wood, Nathan Lynch,
	Andrew Morton, David Gibson, linuxppc-dev, Avantika Mathur

 
> +#define arch_syscall_match_sym_name(sym, name) !strcmp(sym + 3, name
+ 3)

Whenever you use a #define macro arg, you should enclose it in ().
About the only time you don't need to is when it is being
passed as an argument to another function
(ie when it's use is also ',' separated).

So the above ought to be:
#define arch_syscall_match_sym_name(sym, name) (!strcmp((sym) + 3,
(name) + 3))

Whether an inline function is better or worse is much more subtle!
For instance I've used:
   asm volatile ( "# line " STR(__LINE__) :: )
to stop gcc merging the tails of conditionals.
Useful when the conditional is at the end of a loop (etc),
it might increase code size slightly, but removes a branch.

If I put one of those in an 'inline' function separate copies
of the function end up sharing code.
With a #define __LINE__ differs so they don't.

(I had some code to get below 190 clocks, these changes
were significant!)

	David



^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH 4/6] ftrace syscalls: Allow arch specific syscallsymbol matching
  2011-02-02 14:15     ` [PATCH 4/6] ftrace syscalls: Allow arch specific syscallsymbol matching David Laight
@ 2011-02-02 14:28       ` Steven Rostedt
  0 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2011-02-02 14:28 UTC (permalink / raw)
  To: David Laight
  Cc: Ian Munsie, Andreas Dilger, Andreas Schwab, Dave Kleikamp,
	open list:DOCUMENTATION, Namhyung Kim, Jiri Kosina, Jason Baron,
	linux-kernel, Alexander Graf, Ingo Molnar, Paul Mackerras,
	KOSAKI Motohiro, Frederic Weisbecker, Scott Wood, Nathan Lynch,
	Andrew Morton, David Gibson, linuxppc-dev, Avantika Mathur

On Wed, 2011-02-02 at 14:15 +0000, David Laight wrote:
> > +#define arch_syscall_match_sym_name(sym, name) !strcmp(sym + 3, name
> + 3)
> 
> Whenever you use a #define macro arg, you should enclose it in ().
> About the only time you don't need to is when it is being
> passed as an argument to another function
> (ie when it's use is also ',' separated).
> 
> So the above ought to be:
> #define arch_syscall_match_sym_name(sym, name) (!strcmp((sym) + 3,
> (name) + 3))

I would have mentioned this if I wanted it to stay a macro ;)

> 
> Whether an inline function is better or worse is much more subtle!
> For instance I've used:
>    asm volatile ( "# line " STR(__LINE__) :: )
> to stop gcc merging the tails of conditionals.
> Useful when the conditional is at the end of a loop (etc),
> it might increase code size slightly, but removes a branch.
> 
> If I put one of those in an 'inline' function separate copies
> of the function end up sharing code.
> With a #define __LINE__ differs so they don't.
> 
> (I had some code to get below 190 clocks, these changes
> were significant!)

For what you were doing, this may have helped. But the code in question
is the "default" version of the function. I much more prefer it to be a
static inline. The issues you experience could change from gcc to gcc.
But static inlined functions are much cleaner and easier to read than
macros.

Using a macro for this purpose is just too messy.

Again, look at include/trace/ftrace.h. If I'm saying using a macro is
ugly, then don't use it! Listen to me, because I'm Mr. Ugly Macro Man.

-- Steve



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 3/6] ftrace syscalls: Make arch_syscall_addr weak
  2011-02-03  3:27 PowerPC, ftrace: Add PPC raw syscall tracepoints & ftrace fixes (mimimal subset only) v4 Ian Munsie
@ 2011-02-03  3:27 ` Ian Munsie
  0 siblings, 0 replies; 12+ messages in thread
From: Ian Munsie @ 2011-02-03  3:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andreas Dilger, Dave Kleikamp, Andrew Morton, Jiri Kosina,
	Jason Baron, linuxppc-dev, Steven Rostedt, Alexander Graf,
	Ingo Molnar, Paul Mackerras, KOSAKI Motohiro,
	Frederic Weisbecker, Scott Wood, Nathan Lynch, Avantika Mathur,
	David Gibson, Andreas Schwab, Namhyung Kim, Ian Munsie,
	open list:DOCUMENTATION

From: Ian Munsie <imunsie@au1.ibm.com>

Some architectures use non-trivial system call tables and will not work
with the generic arch_syscall_addr code. For example, PowerPC64 uses a
table of twin long longs.

This patch makes the generic arch_syscall_addr weak to allow
architectures with non-trivial system call tables to override it.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 Documentation/trace/ftrace-design.txt |    3 +++
 kernel/trace/trace_syscalls.c         |    2 +-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/Documentation/trace/ftrace-design.txt b/Documentation/trace/ftrace-design.txt
index dc52bd4..6fca17b 100644
--- a/Documentation/trace/ftrace-design.txt
+++ b/Documentation/trace/ftrace-design.txt
@@ -247,6 +247,9 @@ You need very few things to get the syscalls tracing in an arch.
 - Support the TIF_SYSCALL_TRACEPOINT thread flags.
 - Put the trace_sys_enter() and trace_sys_exit() tracepoints calls from ptrace
   in the ptrace syscalls tracing path.
+- If the system call table on this arch is more complicated than a simple array
+  of addresses of the system calls, implement an arch_syscall_addr to return
+  the address of a given system call.
 - Tag this arch as HAVE_SYSCALL_TRACEPOINTS.
 
 
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 1a6e8dd..33360b9 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -445,7 +445,7 @@ int init_syscall_trace(struct ftrace_event_call *call)
 	return id;
 }
 
-unsigned long __init arch_syscall_addr(int nr)
+unsigned long __init __weak arch_syscall_addr(int nr)
 {
 	return (unsigned long)sys_call_table[nr];
 }
-- 
1.7.2.3


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 3/6] ftrace syscalls: Make arch_syscall_addr weak
       [not found] <1291696151-4336-1-git-send-email-imunsie@au1.ibm.com>
@ 2010-12-07  4:29 ` Ian Munsie
  0 siblings, 0 replies; 12+ messages in thread
From: Ian Munsie @ 2010-12-07  4:29 UTC (permalink / raw)
  To: Avantika Mathur
  Cc: Jason Baron, Ian Munsie, Randy Dunlap, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Mike Frysinger, Heiko Carstens,
	linux-doc, linux-kernel

From: Ian Munsie <imunsie@au1.ibm.com>

Some architectures use non-trivial system call tables and will not work
with the generic arch_syscall_addr code. For example, PowerPC64 uses a
table of twin long longs.

This patch makes the generic arch_syscall_addr weak to allow
architectures with non-trivial system call tables to override it.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
---
 Documentation/trace/ftrace-design.txt |    3 +++
 kernel/trace/trace_syscalls.c         |    2 +-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/Documentation/trace/ftrace-design.txt b/Documentation/trace/ftrace-design.txt
index dc52bd4..6fca17b 100644
--- a/Documentation/trace/ftrace-design.txt
+++ b/Documentation/trace/ftrace-design.txt
@@ -247,6 +247,9 @@ You need very few things to get the syscalls tracing in an arch.
 - Support the TIF_SYSCALL_TRACEPOINT thread flags.
 - Put the trace_sys_enter() and trace_sys_exit() tracepoints calls from ptrace
   in the ptrace syscalls tracing path.
+- If the system call table on this arch is more complicated than a simple array
+  of addresses of the system calls, implement an arch_syscall_addr to return
+  the address of a given system call.
 - Tag this arch as HAVE_SYSCALL_TRACEPOINTS.
 
 
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 5e34b49..b31ea2d 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -447,7 +447,7 @@ int init_syscall_trace(struct ftrace_event_call *call)
 	return id;
 }
 
-unsigned long __init arch_syscall_addr(int nr)
+unsigned long __init __weak arch_syscall_addr(int nr)
 {
 	return (unsigned long)sys_call_table[nr];
 }
-- 
1.7.2.3


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-02-03  3:27 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-02  7:11 PowerPC, ftrace: Add PPC raw syscall tracepoints & ftrace fixes (mimimal subset only) v3 Ian Munsie
2011-02-02  7:11 ` [PATCH 1/6] ftrace syscalls: don't add events for unmapped syscalls Ian Munsie
2011-02-02  7:11 ` [PATCH 2/6] trace syscalls: Convert redundant syscall_nr checks into WARN_ON Ian Munsie
2011-02-02  7:11 ` [PATCH 3/6] ftrace syscalls: Make arch_syscall_addr weak Ian Munsie
2011-02-02  7:11 ` [PATCH 4/6] ftrace syscalls: Allow arch specific syscall symbol matching Ian Munsie
2011-02-02 14:04   ` Steven Rostedt
2011-02-02 14:15     ` [PATCH 4/6] ftrace syscalls: Allow arch specific syscallsymbol matching David Laight
2011-02-02 14:28       ` Steven Rostedt
2011-02-02  7:11 ` [PATCH 5/6] trace, powerpc: Implement raw syscall tracepoints on PowerPC Ian Munsie
2011-02-02  7:11 ` [PATCH 6/6] trace syscalls: Early terminate search for sys_ni_syscall Ian Munsie
  -- strict thread matches above, loose matches on Subject: below --
2011-02-03  3:27 PowerPC, ftrace: Add PPC raw syscall tracepoints & ftrace fixes (mimimal subset only) v4 Ian Munsie
2011-02-03  3:27 ` [PATCH 3/6] ftrace syscalls: Make arch_syscall_addr weak Ian Munsie
     [not found] <1291696151-4336-1-git-send-email-imunsie@au1.ibm.com>
2010-12-07  4:29 ` Ian Munsie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).