From: Alexei Starovoitov <ast@plumgrid.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Arnaldo Carvalho de Melo <acme@infradead.org>,
	Jiri Olsa <jolsa@redhat.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
	"David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linux API <linux-api@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v5 tip 0/7] tracing: attach eBPF programs to kprobes
Date: Wed, 4 Mar 2015 09:23:54 -0800
Message-ID: <CAMEtUuzOQqyXgZaytwOPxAfn5ucu08HjAMrFUgeSaLRVg9E0rg@mail.gmail.com>
In-Reply-To: <1425252465-27527-1-git-send-email-ast@plumgrid.com>

On Sun, Mar 1, 2015 at 3:27 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
> Peter, Steven,
> I think this set addresses everything we've discussed.
> Please review/ack. Thanks!

icmp echo request

> V4->V5:
> - switched to ktime_get_mono_fast_ns() as suggested by Peter
> - in libbpf.c fixed zero init of 'union bpf_attr' padding
> - fresh rebase on tip/master
>
> Hi All,
>
> This targets the 'tip' tree, since most of the changes are perf_event related.
> There will be a small conflict between net-next and tip, since each adds
> a new bpf_prog_type (BPF_PROG_TYPE_SCHED_CLS and BPF_PROG_TYPE_KPROBE).
>
> V3 discussion:
> https://lkml.org/lkml/2015/2/9/738
>
> V3->V4:
> - since the boundary of stable ABI in bpf+tracepoints is not clear yet,
>   I've dropped them for now.
> - bpf+syscalls are ok from stable ABI point of view, but bpf+seccomp
>   would want to do very similar analysis of syscalls, so I've dropped
>   them as well to take time and define common bpf+syscalls and bpf+seccomp
>   infra in the future.
> - so only bpf+kprobes is left. kprobes by definition are not a stable ABI,
>   so bpf+kprobe is not a stable ABI either. To stress that point, I added a
>   kernel version attribute that user space must pass along with the program,
>   and the kernel will reject programs when the version code doesn't match.
>   So bpf+kprobe is very similar to kernel modules, but unlike modules, the
>   version check is used not for safety but to enforce 'non-ABI-ness'.
>   (the version check doesn't apply to bpf+sockets, which are stable)
>
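
The version check described above can be sketched in plain C. This is a
minimal illustration, not the patch's code: kern_version_code() and
version_matches() are hypothetical names, and the encoding assumed here is
the traditional KERNEL_VERSION(a, b, c) layout from linux/version.h.

```c
#include <assert.h>
#include <stdint.h>

/* Pack major.minor.patch the way KERNEL_VERSION() traditionally does.
 * Hypothetical helper; the real macro lives in <linux/version.h>. */
uint32_t kern_version_code(unsigned int major, unsigned int minor,
                           unsigned int patch)
{
    /* Each of minor and patch must fit in one byte in this layout. */
    assert(minor < 256 && patch < 256);
    return (major << 16) + (minor << 8) + patch;
}

/* Model of the load-time check: only an exact match is accepted.
 * Returns 1 if the program would be accepted, 0 if rejected. */
int version_matches(uint32_t prog_version, uint32_t running_version)
{
    return prog_version == running_version;
}
```

A loader would typically pass the version code of the headers it was built
against, so a program built for one kernel is rejected on another.
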
> Patch 1 is in net-next and needs to be in tip too, since patch 2 depends on it.
>
> Patch 2 actually adds bpf+kprobe infra:
> programs receive 'struct pt_regs' on input and can walk data structures
> using bpf_probe_read() helper which is a wrapper of probe_kernel_read()
>
> Programs are attached to kprobe events via API:
>
> prog_fd = bpf_prog_load(...);
> struct perf_event_attr attr = {
>   .type = PERF_TYPE_TRACEPOINT,
>   .config = event_id, /* ID of just created kprobe event */
> };
> event_fd = perf_event_open(&attr,...);
> ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
>
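
The attach sequence above, expanded into a compilable user-space sketch.
Assumptions not taken from the cover letter: the helper names
kprobe_event_attr() and attach_prog_to_kprobe() are hypothetical, the event
is opened for all tasks on CPU 0 (pid = -1, cpu = 0), prog_fd comes from an
earlier program load, and event_id was read from the kprobe event's 'id'
file in tracefs. It needs kernel headers that define PERF_EVENT_IOC_SET_BPF.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill a perf_event_attr that refers to an existing kprobe event. */
void kprobe_event_attr(struct perf_event_attr *attr, uint64_t event_id)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_TRACEPOINT; /* kprobe events are exposed as tracepoints */
    attr->config = event_id;           /* ID of the kprobe event */
}

/* Open the event and attach the BPF program to it.
 * Returns the event fd on success, -1 on failure (errno is set). */
int attach_prog_to_kprobe(int prog_fd, uint64_t event_id)
{
    struct perf_event_attr attr;
    int event_fd;

    assert(prog_fd >= 0);
    kprobe_event_attr(&attr, event_id);

    /* glibc has no perf_event_open() wrapper, so use syscall(2).
     * Arguments: attr, pid, cpu, group_fd, flags. */
    event_fd = (int)syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0UL);
    if (event_fd < 0)
        return -1;

    if (ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd) < 0) {
        close(event_fd);
        return -1;
    }
    return event_fd;
}
```

Closing the returned fd is expected to detach the program along with the
event; a real loader would keep it open for as long as tracing should run.
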
> Patch 3 adds bpf_ktime_get_ns() helper function, so that bpf programs can
> measure time delta between events to compute disk io latency, etc.
>
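
The time-delta pattern can be modelled in user space: record a timestamp
when a request starts, then look it up and subtract when it completes. This
is only a sketch of the bookkeeping. In a BPF program the table would be a
bpf map and the timestamps would come from bpf_ktime_get_ns(); here both
are stand-ins, and on_request_start()/on_request_done() are hypothetical
names.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_INFLIGHT 64

struct start_entry {
    uint64_t key;      /* request identifier, e.g. a pointer value */
    uint64_t start_ns; /* timestamp taken when the request was issued */
    int used;
};

static struct start_entry inflight[MAX_INFLIGHT];

/* Remember when request 'key' started. Returns 0, or -1 if the table is full. */
int on_request_start(uint64_t key, uint64_t now_ns)
{
    for (int i = 0; i < MAX_INFLIGHT; i++) {
        if (!inflight[i].used) {
            inflight[i].key = key;
            inflight[i].start_ns = now_ns;
            inflight[i].used = 1;
            return 0;
        }
    }
    return -1;
}

/* Compute the latency of request 'key' and forget it.
 * Returns 0 and stores the delta, or -1 if no start was recorded. */
int on_request_done(uint64_t key, uint64_t now_ns, uint64_t *delta_ns)
{
    for (int i = 0; i < MAX_INFLIGHT; i++) {
        if (inflight[i].used && inflight[i].key == key) {
            /* A monotonic clock never runs backwards. */
            assert(now_ns >= inflight[i].start_ns);
            *delta_ns = now_ns - inflight[i].start_ns;
            inflight[i].used = 0;
            return 0;
        }
    }
    return -1;
}
```
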
> Patch 4 adds bpf_trace_printk() helper that is used to debug programs.
> When the bpf verifier sees that a program calls bpf_trace_printk(), it
> initializes the trace_printk buffers, which emits a nasty 'this is debug
> only' banner.
> That's exactly what we want. bpf_trace_printk() is for debugging only.
>
> Patch 5 is sample code that shows how to use
> bpf_probe_read()/bpf_trace_printk().
>
> Patch 6 is sample code that combines kfree_skb and sys_write tracing.
>
> Patch 7 is sample code that computes disk io latency and prints it as a
> 'heatmap'.
>
> An interesting bit is that patch 6 has a log2() function implemented in C
> and patch 7 has another log2() function, using a different algorithm, also
> in C. In the future, if 'log2' usage becomes common, we can add it as an
> in-kernel helper function, but for now bpf programs can implement it on the
> bpf side.
>
> Another interesting bit from patch 7 is that it approximates the
> floating point log10(X)*10 using integer arithmetic, which demonstrates
> the power of C->BPF vs traditional tracing language alternatives,
> where one would need to introduce new helper functions to add functionality,
> whereas bpf can just implement such things in C as part of the program.
>
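
The integer-only approach can be illustrated with a small sketch. The
algorithms below are assumptions chosen for brevity (a shift loop for log2,
and linear interpolation between powers of two for the fractional part);
they are not the ones used in patches 6 and 7. ilog2_u64() and
log10_times10() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(x)) for x > 0, using only shifts. */
unsigned int ilog2_u64(uint64_t x)
{
    unsigned int k = 0;

    assert(x > 0);
    while (x >>= 1)
        k++;
    return k;
}

/* Approximate round(10 * log10(x)) for x > 0 without floating point.
 * log2(x) is estimated in Q8 fixed point as k + (x - 2^k) / 2^k, then
 * scaled by 10 * log10(2) ~= 3.0103. */
unsigned int log10_times10(uint64_t x)
{
    unsigned int k = ilog2_u64(x);
    uint64_t rem = x - ((uint64_t)1 << k);
    /* Fractional part of log2(x), linearly interpolated, 8 fraction bits. */
    uint64_t frac_q8 = k >= 8 ? rem >> (k - 8) : rem << (8 - k);
    uint64_t log2_q8 = ((uint64_t)k << 8) + frac_q8;

    /* Divide by 256 (Q8) and 10000 (scale of 30103), rounding to nearest. */
    return (unsigned int)((log2_q8 * 30103 + 1280000) / 2560000);
}
```

The linear interpolation overestimates log2 between powers of two, so the
result is a coarse bucket index suitable for a histogram rather than an
accurate logarithm.
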
> The next step is to prototype TCP stack instrumentation (like web10g) using
> bpf+kprobe, but without adding any new code to the tcp stack.
> Though kprobes are slow compared to tracepoints, they are good enough
> for prototyping, and the trace_marker/debug_tracepoint ideas can accelerate
> them in the future.
>
> Alexei Starovoitov (6):
>   tracing: attach BPF programs to kprobes
>   tracing: allow BPF programs to call bpf_ktime_get_ns()
>   tracing: allow BPF programs to call bpf_trace_printk()
>   samples: bpf: simple non-portable kprobe filter example
>   samples: bpf: counting example for kfree_skb and write syscall
>   samples: bpf: IO latency analysis (iosnoop/heatmap)
>
> Daniel Borkmann (1):
>   bpf: make internal bpf API independent of CONFIG_BPF_SYSCALL ifdefs
>
>  include/linux/bpf.h             |   20 ++++-
>  include/linux/ftrace_event.h    |   14 +++
>  include/uapi/linux/bpf.h        |    5 ++
>  include/uapi/linux/perf_event.h |    1 +
>  kernel/bpf/syscall.c            |    7 +-
>  kernel/events/core.c            |   59 +++++++++++++
>  kernel/trace/Makefile           |    1 +
>  kernel/trace/bpf_trace.c        |  178 +++++++++++++++++++++++++++++++++++++++
>  kernel/trace/trace_kprobe.c     |   10 ++-
>  samples/bpf/Makefile            |   12 +++
>  samples/bpf/bpf_helpers.h       |    6 ++
>  samples/bpf/bpf_load.c          |  112 ++++++++++++++++++++++--
>  samples/bpf/bpf_load.h          |    3 +
>  samples/bpf/libbpf.c            |   14 ++-
>  samples/bpf/libbpf.h            |    5 +-
>  samples/bpf/sock_example.c      |    2 +-
>  samples/bpf/test_verifier.c     |    2 +-
>  samples/bpf/tracex1_kern.c      |   50 +++++++++++
>  samples/bpf/tracex1_user.c      |   25 ++++++
>  samples/bpf/tracex2_kern.c      |   86 +++++++++++++++++++
>  samples/bpf/tracex2_user.c      |   95 +++++++++++++++++++++
>  samples/bpf/tracex3_kern.c      |   89 ++++++++++++++++++++
>  samples/bpf/tracex3_user.c      |  150 +++++++++++++++++++++++++++++++++
>  23 files changed, 930 insertions(+), 16 deletions(-)
>  create mode 100644 kernel/trace/bpf_trace.c
>  create mode 100644 samples/bpf/tracex1_kern.c
>  create mode 100644 samples/bpf/tracex1_user.c
>  create mode 100644 samples/bpf/tracex2_kern.c
>  create mode 100644 samples/bpf/tracex2_user.c
>  create mode 100644 samples/bpf/tracex3_kern.c
>  create mode 100644 samples/bpf/tracex3_user.c
>
> --
> 1.7.9.5
>
