LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Using ftrace/perf as a basis for generic seccomp
@ 2011-01-12 21:28 Eric Paris
  2011-02-01 14:58 ` Eric Paris
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Paris @ 2011-01-12 21:28 UTC (permalink / raw)
  To: linux-kernel; +Cc: mingo, agl, fweisbec, tzanussi

Some time ago Adam posted a patch to allow for a generic seccomp
implementation (unlike the current seccomp where your choice is all
syscalls or only read, write, sigreturn, and exit) which got little
traction and it was suggested he instead do the same thing somehow using
the tracing code:
http://thread.gmane.org/gmane.linux.kernel/833556
The actual method that this could be achieved was apparently left as an
exercise for the reader.  Since I'd like to do something similar (and
actually basically reimplemented Adam's code before I found this thread)
I guess that makes me the reader.  I've never touched
perf/ftrace/whatever so I'm not even knowledgeably enough to ask good
questions so please, try to talk to me like a 2 year old.

I started playing a bit having no idea where to start decided to see
where something like:
perf stat -e syscalls:sys_enter_read -e syscalls:sys_enter_write -- ./seccomp_test
Ended up in the kernel.  It ended up I saw in perf_syscall_enter().  So
I decided to do a little hacking and added this little patch segment:

diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index bac752f..6653995 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -495,8 +495,12 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
 	int size;
 
 	syscall_nr = syscall_get_nr(current, regs);
-	if (!test_bit(syscall_nr, enabled_perf_enter_syscalls))
+
+	if (!test_bit(syscall_nr, enabled_perf_enter_syscalls)) {
+		if (current->seccomp.mode == 2)
+			do_exit(SIGKILL);
 		return;
+	}
 
 	sys_data = syscall_nr_to_meta(syscall_nr);
 	if (!sys_data)

Which appears to be a necessary, but not sufficient, requirement, since
another unrelated task could also have a 'watch?' on other syscalls.  So
I hacked in this little PoS into the filter code.

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index eac7e33..d8c1c8f 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4780,15 +4780,19 @@ void perf_tp_event(u64 addr, u64 count, void *record, int entry_size,
 		.size = entry_size,
 		.data = record,
 	};
+	int found = 0;
 
 	perf_sample_data_init(&data, addr);
 	data.raw = &raw;
 
 	hlist_for_each_entry_rcu(event, node, head, hlist_entry) {
-		if (perf_tp_event_match(event, &data, regs))
+		if (perf_tp_event_match(event, &data, regs)) {
+			found = 1;
 			perf_swevent_event(event, count, 1, &data, regs);
+		}
 	}
-
+	if (current->seccomp.mode == 2 && !found)
+		do_exit(SIGKILL);
 	perf_swevent_put_recursion_context(rctx);
 }
 EXPORT_SYMBOL_GPL(perf_tp_event);

Which seems to get me a 'working' version of generic seccomp on top of
ftrace.  Problem is it makes me feel dirty, I'm logging a bunch of trace
stuff I don't care about, and I'm sure its being done wrong 1001 ways.
I know that do_exit(SIGKILL) is actually really wrong since it ends up
giving me this crap (but i don't know how to do it better from there)

note: seccomp-test[2485] exited with preempt_count 1
BUG: sleeping function called from invalid context at kernel/rwsem.c:21

So, finally, onto the question.  How would you guys do it?  The tracing
code seems to me to be built on the idea of recording information on a
very small limited set of events, not blocking access on the complement
of a small limited set of events.

I'm not seeing how the tracing code is better than the generic seccomp
code that Adam wrote, but hopefully someone can enlighten me as to how
this can be done reasonably.  I need all the guidance you can offer
because I don't really see what next steps should be!

-Eric


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2011-02-07 12:25 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-12 21:28 Using ftrace/perf as a basis for generic seccomp Eric Paris
2011-02-01 14:58 ` Eric Paris
2011-02-02 12:14   ` Masami Hiramatsu
2011-02-02 12:26     ` Ingo Molnar
2011-02-02 16:45       ` Eric Paris
2011-02-02 17:55         ` Ingo Molnar
2011-02-02 18:17           ` Steven Rostedt
2011-02-03 19:06         ` Frederic Weisbecker
2011-02-03 19:18           ` Frederic Weisbecker
2011-02-03 22:06           ` Stefan Fritsch
2011-02-03 23:10             ` Frederic Weisbecker
2011-02-04  1:50               ` Eric Paris
2011-02-04 14:31                 ` Peter Zijlstra
2011-02-04 16:29                   ` Eric Paris
2011-02-04 17:04                     ` Frederic Weisbecker
2011-02-05 11:51                       ` Stefan Fritsch
2011-02-07 12:26                         ` Peter Zijlstra
2011-02-04 16:36             ` Eric Paris
2011-02-05 11:42               ` Stefan Fritsch
2011-02-06 16:51                 ` Eric Paris

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).