Netdev Archive on lore.kernel.org help / color / mirror / Atom feed
From: Jiri Olsa <jolsa@redhat.com> To: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Andrii Nakryiko <andriin@fb.com>, "Steven Rostedt (VMware)" <rostedt@goodmis.org>, Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>, Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>, John Fastabend <john.fastabend@gmail.com>, KP Singh <kpsingh@chromium.org>, Daniel Xu <dxu@dxuuu.xyz>, Viktor Malik <vmalik@redhat.com> Subject: Re: [PATCH bpf-next v4 18/27] bpf, x64: Store properly return value for trampoline with multi func programs Date: Thu, 2 Sep 2021 14:57:11 +0200 [thread overview] Message-ID: <YTDKJ2E1fN0hPDZj@krava> (raw) In-Reply-To: <CAEf4BzY1XhuZ5huinfTmUZGhrT=wgACOgKbbdEPmnek=nN6YgA@mail.gmail.com> On Wed, Sep 01, 2021 at 08:56:19PM -0700, Andrii Nakryiko wrote: > On Wed, Sep 1, 2021 at 8:15 AM Jiri Olsa <jolsa@redhat.com> wrote: > > > > On Tue, Aug 31, 2021 at 04:51:18PM -0700, Andrii Nakryiko wrote: > > > On Thu, Aug 26, 2021 at 12:41 PM Jiri Olsa <jolsa@redhat.com> wrote: > > > > > > > > When we have multi func program attached, the trampoline > > > > switched to the function model of the multi func program. > > > > > > > > This breaks already attached standard programs, for example > > > > when we attach following program: > > > > > > > > SEC("fexit/bpf_fentry_test2") > > > > int BPF_PROG(test1, int a, __u64 b, int ret) > > > > > > > > the trampoline pushes on stack args 'a' and 'b' and return > > > > value 'ret'. > > > > > > > > When following multi func program is attached to bpf_fentry_test2: > > > > > > > > SEC("fexit.multi/bpf_fentry_test*") > > > > int BPF_PROG(test2, __u64 a, __u64 b, __u64 c, __u64 d, > > > > __u64 e, __u64 f, int ret) > > > > > > > > the trampoline takes this program model and pushes all 6 args > > > > and return value on stack. > > > > > > > > But we still have the original 'test1' program attached, that > > > > expects 'ret' value where there's 'c' argument now: > > > > > > > > test1(a, b, c) > > > > > > > > To fix that we simply overwrite 'c' argument with 'ret' value, > > > > so test1 is called as expected and test2 gets called as: > > > > > > > > test2(a, b, ret, d, e, f, ret) > > > > > > > > which is ok, because 'c' is not defined for bpf_fentry_test2 > > > > anyway. > > > > > > > > > > What if we change the order on the stack to be the return value first, > > > followed by input arguments. That would get us a bit closer to > > > unifying multi-trampoline and the normal one, right? BPF verifier > > > should be able to rewrite access to the last argument (i.e., return > > > value) for fexit programs to actually be at offset 0, and shift all > > > other arguments by 8 bytes. For fentry, if that helps to keep things > > > more aligned, we'd just skip the first 8 bytes on the stack and store > > > all the input arguments in the same offsets. So BPF verifier rewriting > > > logic stays consistent (except offset 0 will be disallowed). > > > > nice idea, with this in place we could cut that args re-arranging code > > > > > > > > Basically, I'm thinking how we can make normal and multi trampolines > > > more interoperable to remove those limitations that two > > > multi-trampolines can't be attached to the same function, which seems > > > like a pretty annoying limitation which will be easy to hit in > > > practice. Alexei previously proposed (as an optimization) to group all > > > to-be-attached functions into groups by number of arguments, so that > > > we can have up to 6 different trampolines tailored to actual functions > > > being attached. So that we don't save unnecessary extra input > > > arguments saving, which will be even more important once we allow more > > > than 6 arguments in the future. > > > > > > With such logic, we should be able to split all the functions into > > > multiple underlying trampolines, so it seems like it should be > > > possible to also allow multiple multi-fentry programs to be attached > > > to the same function by having a separate bpf_trampoline just for > > > those functions. It will be just an extension of the above "just 6 > > > trampolines" strategy to "as much as we need trampolines". > > > > I'm probably missing something here.. say we have 2 functions with single > > argument: > > > > foo1(int a) > > foo2(int b) > > > > then having 2 programs: > > > > A - attaching to foo1 > > B - attaching to foo2 > > > > then you need to have 2 different trampolines instead of single 'generic-1-argument-trampoline' > > right, you have two different BPF progs attached to two different > functions. You have to have 2 trampolines, not sure what's > confusing?.. I misunderstood the statement above: > > > practice. Alexei previously proposed (as an optimization) to group all > > > to-be-attached functions into groups by number of arguments, so that > > > we can have up to 6 different trampolines tailored to actual functions > > > being attached. So that we don't save unnecessary extra input you meant just functions to be attached at that moment, not all, ok > > > > > > > > > It's just a vague idea, sorry, I don't understand all the code yet. > > > But the limitation outlined in one of the previous patches seems very > > > limiting and unpleasant. I can totally see that some 24/7 running BPF > > > tracing app uses multi-fentry for tracing a small subset of kernel > > > functions non-stop, and then someone is trying to use bpftrace or > > > retsnoop to trace overlapping set of functions. And it immediately > > > fails. Very frustrating. > > > > so the current approach is to some extent driven by the direct ftrace > > batch API: > > > > you have ftrace_ops object and set it up with functions you want > > to change with calling: > > > > ftrace_set_filter_ip(ops, ip1); > > ftrace_set_filter_ip(ops, ip2); > > ... > > > > and then register trampoline with those functions: > > > > register_ftrace_direct_multi(ops, tramp_addr); > > > > and with this call being the expensive one (it does the actual work > > and sync waiting), my objective was to call it just once for update > > > > now with 2 intersecting multi trampolines we end up with 3 functions > > sets: > > > > A - functions for first multi trampoline > > B - functions for second multi trampoline > > C - intersection of them > > > > each set needs different trampoline: > > > > tramp A - calls program for first multi trampoline > > tramp B - calls program for second multi trampoline > > tramp C - calls both programs > > > > so we need to call register_ftrace_direct_multi 3 times > > Yes, that's the minimal amount of trampolines you need. Calling > register_ftrace_direct_multi() three times is not that bad at all, > compared to calling it 1000s of times. If you are worried about 1 vs 3 > calls, I think you are over-optimizing here. I'd rather take no > restrictions on what can be attached to what and in which sequences > but taking 3ms vs having obscure (for uninitiated users) restrictions, > but in some cases allowing attachment to happen in 1ms. > > The goal with multi-attach is to make it decently fast when attaching > to a lot functions, but if attachment speed is fast enough, then such > small performance differences don't matter anymore. true, I might have been focused on the worst possible case here ;-) > > > > > if we allow also standard trampolines being attached, it makes > > it even more complicated and ultimatelly gets broken to > > 1-function/1-trampoline pairs, ending up with attach speed > > that we have now > > > > So let's make sure that we are on the same page. Let me write out an example. > > Let's say we have 5 kernel functions: a, b, c, d, e. Say a, b, c all > have 1 input args, and d and e have 2. > > Now let's say we attach just normal fentry program A to function a. > Also we attach normal fexit program E to func e. > > We'll have A attached to a with trampoline T1. We'll also have E > attached to e with trampoline T2. Right? > > And now we try to attach generic fentry (fentry.multi in your > terminology) prog X to all 5 of them. If A and E weren't attached, > we'd need two generic trampolines, one for a, b, c (because 1 input > argument) and another for d,e (because 2 input arguments). But because > we already have A and B attached, we'll end up needing 4: > > T1 (1 arg) for func a calling progs A and X > T2 (2 args) for func e calling progs E and X > T3 (1 arg) for func b and c calling X > T4 (2 args) for func d calling X so current code would group T3/T4 together, but if we keep them separated, then we won't need to use new model and cut off some of the code, ok together with that args shifting we could endup with almost untouched trampoline generation code ;-) > > We can't have less than that and satisfy all the constraints. But 4 is > not that bad. If the example has 1000s of functions, you'd still need > between 4 and 8 trampolines (if we had 3, 4, 5, and 6 input args for > kernel functions). That's way less than 1000s of trampolines needed > today. And it's still fast enough on the attachment. > > The good thing with what we discussed with making current trampoline > co-exist with generic (multi) fentry/fexit, is that we'll still have > just one trampoline, saving exactly as many input arguments as > attached function(s) have. So at least we don't have to maintain two > separate pieces of logic for that. Then the only added complexity > would be breaking up all to-be-attached kernel functions into groups, > as described in the example. > > It sounds a bit more complicated in writing than it will be in > practice, probably. I think the critical part is unification of > trampoline to work with fentry/fexit and fentry.multi/fexit.multi > simultaneously, which seems like you agreed above is achievable. ok, I haven't considered this way, but I think it's doable thanks, jirka
next prev parent reply other threads:[~2021-09-02 12:57 UTC|newest] Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-08-26 19:38 [PATCH bpf-next v4 00/27] x86/ftrace/bpf: Add batch support for direct/tracing attach Jiri Olsa 2021-08-26 19:38 ` [PATCH bpf-next v4 01/27] x86/ftrace: Remove extra orig rax move Jiri Olsa 2021-08-26 19:38 ` [PATCH bpf-next v4 02/27] x86/ftrace: Remove fault protection code in prepare_ftrace_return Jiri Olsa 2021-08-26 19:38 ` [PATCH bpf-next v4 03/27] x86/ftrace: Make function graph use ftrace directly Jiri Olsa 2021-08-26 19:38 ` [PATCH bpf-next v4 04/27] tracing: Add trampoline/graph selftest Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 05/27] ftrace: Add ftrace_add_rec_direct function Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 06/27] ftrace: Add multi direct register/unregister interface Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 07/27] ftrace: Add multi direct modify interface Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 08/27] ftrace/samples: Add multi direct interface test module Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 09/27] bpf: Add support to load multi func tracing program Jiri Olsa 2021-08-31 23:17 ` Andrii Nakryiko 2021-09-01 11:32 ` Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 10/27] bpf: Add struct bpf_tramp_node layer Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 11/27] bpf: Factor out bpf_trampoline_init function Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 12/27] bpf: Factor out __bpf_trampoline_lookup function Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 13/27] bpf: Factor out __bpf_trampoline_put function Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 14/27] bpf: Change bpf_trampoline_get to return error pointer Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 15/27] bpf, x64: Allow to use caller address from stack Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 16/27] bpf: Add bpf_trampoline_multi_get/put functions Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 17/27] bpf: Add multi trampoline attach support Jiri Olsa 2021-08-31 23:36 ` Andrii Nakryiko 2021-09-01 0:02 ` Andrii Nakryiko 2021-09-01 11:39 ` Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 18/27] bpf, x64: Store properly return value for trampoline with multi func programs Jiri Olsa 2021-08-31 23:51 ` Andrii Nakryiko 2021-09-01 15:15 ` Jiri Olsa 2021-09-02 3:56 ` Andrii Nakryiko 2021-09-02 12:57 ` Jiri Olsa [this message] 2021-09-02 16:54 ` Andrii Nakryiko 2021-09-02 21:55 ` Alexei Starovoitov 2021-09-03 9:50 ` Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 19/27] bpf: Attach multi trampoline with ftrace_ops Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 20/27] libbpf: Add btf__find_by_glob_kind function Jiri Olsa 2021-09-01 0:10 ` Andrii Nakryiko 2021-09-01 11:33 ` Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 21/27] libbpf: Add support to link multi func tracing program Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 22/27] selftests/bpf: Add fentry multi func test Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 23/27] selftests/bpf: Add fexit " Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 24/27] selftests/bpf: Add fentry/fexit " Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 25/27] selftests/bpf: Add mixed " Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 26/27] selftests/bpf: Add attach " Jiri Olsa 2021-08-26 19:39 ` [PATCH bpf-next v4 27/27] selftests/bpf: Add ret_mod " Jiri Olsa 2021-08-29 17:04 ` [PATCH bpf-next v4 00/27] x86/ftrace/bpf: Add batch support for direct/tracing attach Alexei Starovoitov 2021-08-30 8:02 ` Jiri Olsa
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=YTDKJ2E1fN0hPDZj@krava \ --to=jolsa@redhat.com \ --cc=andrii.nakryiko@gmail.com \ --cc=andriin@fb.com \ --cc=ast@kernel.org \ --cc=bpf@vger.kernel.org \ --cc=daniel@iogearbox.net \ --cc=dxu@dxuuu.xyz \ --cc=john.fastabend@gmail.com \ --cc=kafai@fb.com \ --cc=kpsingh@chromium.org \ --cc=netdev@vger.kernel.org \ --cc=rostedt@goodmis.org \ --cc=songliubraving@fb.com \ --cc=vmalik@redhat.com \ --cc=yhs@fb.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).