From: Frederic Weisbecker
To: Stefan Fritsch
Cc: Eric Paris, Ingo Molnar, Masami Hiramatsu, Jason Baron,
	Mathieu Desnoyers, Steven Rostedt, Arnaldo Carvalho de Melo,
	Peter Zijlstra, Thomas Gleixner
Subject: Re: Using ftrace/perf as a basis for generic seccomp
Date: Fri, 4 Feb 2011 00:10:54 +0100
Message-ID: <20110203231051.GA1840@nowhere>
In-Reply-To: <>

On Thu, Feb 03, 2011 at 11:06:33PM +0100, Stefan Fritsch wrote:
> Hi,
> On Thursday 03 February 2011, Frederic Weisbecker wrote:
> > I think you won't work with trace events, so you need to make the
> > filtering code more tracing-agnostic.
> > 
> > But I think it's quite workable and shouldn't be too hard to split
> > that into a filtering backend. Many parts are already pretty
> > standalone.
> > 
> > Also I suspect the tracepoints are not what you need. Or maybe
> > they are. But as Masami said, the syscall tracepoint is called
> > late. It's workable though. The other problem is that preemption
> > is disabled when tracepoints are called, which is probably not
> > what you want. One day I think we'll need to unify the tracepoints
> > and notifier code but until then, better keep tracepoints for
> > tracing.
> > 
> > Now once you have the filtering code more generic, you still
> > need an arch backend to map register contents and layout into
> > syscall argument names and types. On top of which you can finally
> > use the filtering code. For that you can use, again, some code we
> > use for tracing, which is syscall metadata: information
> > generated at build time that has the syscall fields and types.
> > And that also needs to be split up, but it's more trivial
> > than the filtering part.
> AFAICS the infrastructure for tracing and metadata of compat syscalls 
> is also still missing. That would need to be added, too. Jason Baron 
> and Ian Munsie have worked on this in mid 2010, but I don't know about 
> the current status.

Oh, we have had these features for a while now. Jason and others have
done great work on them.
For example, you can record only read() calls where fd == 7 with perf:

./perf record -a -e syscalls:sys_enter_read --filter "fd==7"
./perf script

            perf-2956  [000]  6427.757791: sys_enter_read: fd: 0x00000007, buf: 0x7fdd2c095000, count: 0x00000400
            perf-2956  [000]  6427.757858: sys_enter_read: fd: 0x00000007, buf: 0x7fdd2c095000, count: 0x00000400
            perf-2956  [000]  6427.757877: sys_enter_read: fd: 0x00000007, buf: 0x7fdd2c095000, count: 0x00000400
            perf-2956  [000]  6427.757949: sys_enter_read: fd: 0x00000007, buf: 0x7fdd2c095000, count: 0x00000400
            perf-2956  [000]  6427.757968: sys_enter_read: fd: 0x00000007, buf: 0x7fdd2c095000, count: 0x00000400

Or you can do it with ftrace.
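
To make the semantics of such a filter concrete, here is a minimal C
sketch of evaluating a predicate like fd==7 against raw syscall
arguments. It is only an illustration under stated assumptions:
struct syscall_meta, struct pred and pred_match() are hypothetical
names, not the actual ftrace filter or syscall metadata code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ARGS 6

/* Hypothetical metadata: the names of one syscall's raw arguments. */
struct syscall_meta {
	const char *name;
	int nr_args;
	const char *arg_names[MAX_ARGS];
};

enum pred_op { PRED_EQ, PRED_NE, PRED_LT, PRED_GT };

/* One predicate such as "fd == 7": field name, operator, constant. */
struct pred {
	const char *field;
	enum pred_op op;
	unsigned long val;
};

/* Return the index of a named argument, or -1 if it is unknown. */
static int arg_index(const struct syscall_meta *meta, const char *field)
{
	assert(meta->nr_args <= MAX_ARGS);
	for (int i = 0; i < meta->nr_args; i++)
		if (strcmp(meta->arg_names[i], field) == 0)
			return i;
	return -1;
}

/*
 * Evaluate the predicate on raw argument values only.
 * Pointer arguments are compared as numbers, never dereferenced.
 */
static bool pred_match(const struct syscall_meta *meta,
		       const struct pred *p, const unsigned long *args)
{
	int idx = arg_index(meta, p->field);

	if (idx < 0)
		return false;	/* unknown field: no match */

	switch (p->op) {
	case PRED_EQ: return args[idx] == p->val;
	case PRED_NE: return args[idx] != p->val;
	case PRED_LT: return args[idx] <  p->val;
	case PRED_GT: return args[idx] >  p->val;
	}
	return false;
}
```

With metadata { "read", 3, { "fd", "buf", "count" } } and a predicate
{ "fd", PRED_EQ, 7 }, pred_match() accepts a call whose first raw
argument is 7 and rejects any other fd.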

What Jason and Ian have worked on lately was rather compat syscall
support and PowerPC support, among other things.

> Considering that all this is still quite a bit of work and that the 
> initial suggestion by Adam Langley happened nearly two years ago, 
> maybe a two step approach would be better:
> Integrate a seccomp mode 2 now, which only supports a bitmask of 
> bitmaps and no filtering.
> Then, when the infrastructure for the filtering is finished, add a 
> seccomp mode 3 with support for filtering.

Infrastructure for filtering will never be finished. IMO there will
always be many ways to make it better, simply because using such
conditional expressions offers tons of possibilities and is extensible
by nature.

Actually, having seccomp as a user of filter expressions is an opportunity
to improve them and bring in new ideas.

> This would give something in the very near future that is way more 
> usable than seccomp mode 1. I think only the following adjustments 
> would need to be made to Adam Langley's patch:
> - only allow syscalls in the mode (non-compat/compat) that the prctl 
> call was made in
> - deny exec of setuid/setgid binaries
> - deny exec of binaries with filesystem capabilities
> What do you think of this proposal? I have a patch lying around 
> somewhere that already does the first two of these.

IMHO this is an unnecessary intermediate state. It will actually make
things worse by introducing a new, inflexible ABI that we'll need to
maintain forever.

I'm no expert in security, but I don't think it is flexible enough.
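
To illustrate the flexibility concern, here is a minimal C sketch of
what a bitmask-only mode can decide on. It is a userspace illustration,
not Adam Langley's patch: struct syscall_mask, mask_allow(),
mask_permits() and the MAX_SYSCALLS bound are all hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_SYSCALLS	512	/* hypothetical upper bound */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* One bit per syscall number: set means "allowed". */
struct syscall_mask {
	unsigned long bits[(MAX_SYSCALLS + BITS_PER_WORD - 1) / BITS_PER_WORD];
};

/* Mark syscall number nr as allowed. */
static void mask_allow(struct syscall_mask *m, unsigned int nr)
{
	assert(nr < MAX_SYSCALLS);
	m->bits[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

/*
 * The whole policy decision: it sees the syscall number only.
 * Arguments are not available here, so "read() only when fd == 7"
 * cannot be expressed.
 */
static bool mask_permits(const struct syscall_mask *m, unsigned int nr)
{
	if (nr >= MAX_SYSCALLS)
		return false;	/* out of range: deny */
	return ((m->bits[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL) != 0;
}
```

The check is cheap, but the only input is the syscall number. Anything
that depends on argument values needs the filter expressions.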

> > of raw arguments value. Syscalls metadata don't know much
> > about type semantics and won't help you to dereference
> > syscall argument pointers. Only raw syscall parameter values.
> > Similarly, the filtering code can't evaluate pointer dereferencing
> > expression evaluation, only direct values comprehension.
> Pointer dereferencing at syscall entry must be avoided for seccomp 
> anyway, or there would be race conditions. Of course if the filtering 
> points could be put after the final copy_from_user, it would be ok.

Makes sense yeah.
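
The race can be pictured with a small userspace C sketch: a check made
on memory the caller still controls can be invalidated afterwards,
while a check made on a private copy cannot. The names snapshot_path()
and path_allowed() and the "/tmp/" policy are hypothetical; the copy
loop merely stands in for what copy_from_user does in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PATH_SNAPSHOT_MAX 64

/*
 * Copy the caller-controlled string exactly once into private storage.
 * Returns false if it does not fit, so that an over-long string is
 * rejected rather than silently truncated.
 */
static bool snapshot_path(char *dst, size_t dst_len, const char *untrusted)
{
	assert(dst_len > 0);
	for (size_t i = 0; i < dst_len; i++) {
		dst[i] = untrusted[i];
		if (dst[i] == '\0')
			return true;
	}
	dst[dst_len - 1] = '\0';
	return false;
}

/* Hypothetical policy: only paths under /tmp/ are allowed. */
static bool path_allowed(const char *snapshot)
{
	return strncmp(snapshot, "/tmp/", 5) == 0;
}
```

If the check and the later use both operate on the snapshot, a
concurrent write to the original buffer no longer changes the outcome.
Checking the original buffer at syscall entry gives no such guarantee.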

Thread overview: 20+ messages
2011-01-12 21:28 Eric Paris
2011-02-01 14:58 ` Eric Paris
2011-02-02 12:14   ` Masami Hiramatsu
2011-02-02 12:26     ` Ingo Molnar
2011-02-02 16:45       ` Eric Paris
2011-02-02 17:55         ` Ingo Molnar
2011-02-02 18:17           ` Steven Rostedt
2011-02-03 19:06         ` Frederic Weisbecker
2011-02-03 19:18           ` Frederic Weisbecker
2011-02-03 22:06           ` Stefan Fritsch
2011-02-03 23:10             ` Frederic Weisbecker [this message]
2011-02-04  1:50               ` Eric Paris
2011-02-04 14:31                 ` Peter Zijlstra
2011-02-04 16:29                   ` Eric Paris
2011-02-04 17:04                     ` Frederic Weisbecker
2011-02-05 11:51                       ` Stefan Fritsch
2011-02-07 12:26                         ` Peter Zijlstra
2011-02-04 16:36             ` Eric Paris
2011-02-05 11:42               ` Stefan Fritsch
2011-02-06 16:51                 ` Eric Paris
