Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Hao Luo <haoluo@google.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>,
	"open list:KERNEL SELFTEST FRAMEWORK" 
	<linux-kselftest@vger.kernel.org>, Shuah Khan <shuah@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andriin@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@chromium.org>,
	Quentin Monnet <quentin@isovalent.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>, Andrey Ignatov <rdna@fb.com>,
	Jakub Sitnicki <jakub@cloudflare.com>
Subject: Re: [PATCH bpf-next v1 4/8] bpf/libbpf: BTF support for typed ksyms
Date: Tue, 1 Sep 2020 13:35:17 -0700	[thread overview]
Message-ID: <CA+khW7jnzZim6h9O+JH2AnXmvtU19-FxJDZBXfHZH9Xniq8zeg@mail.gmail.com> (raw)
In-Reply-To: <CAEf4BzaO_P1LiWDvFcZ3u1f2eaUEpqb+KXg0FqLMGYDLdRNBJQ@mail.gmail.com>

On Tue, Sep 1, 2020 at 11:11 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Aug 27, 2020 at 3:29 PM Hao Luo <haoluo@google.com> wrote:
> >
> > On Fri, Aug 21, 2020 at 3:37 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Wed, Aug 19, 2020 at 3:42 PM Hao Luo <haoluo@google.com> wrote:
> > > >
> > > > If a ksym is defined with a type, libbpf will try to find the ksym's btf
> > > > information from kernel btf. If a valid btf entry for the ksym is found,
> > > > libbpf can pass in the found btf id to the verifier, which validates the
> > > > ksym's type and value.
> > > >
> > > > Typeless ksyms (i.e. those defined as 'void') will not have such btf_id,
> > > > but it has the symbol's address (read from kallsyms) and its value is
> > > > treated as a raw pointer.
> > > >
> > > > Signed-off-by: Hao Luo <haoluo@google.com>
> > > > ---
> > > >  tools/lib/bpf/libbpf.c | 130 ++++++++++++++++++++++++++++++++++++-----
> > > >  1 file changed, 114 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > > > index 4a81c6b2d21b..94eff612c7c2 100644
> > > > --- a/tools/lib/bpf/libbpf.c
> > > > +++ b/tools/lib/bpf/libbpf.c
> > > > @@ -357,7 +357,16 @@ struct extern_desc {
> > > >                         bool is_signed;
> > > >                 } kcfg;
> > > >                 struct {
> > > > -                       unsigned long long addr;
> > > > +                       /*
> > > > +                        *  1. If ksym is typeless, the field 'addr' is valid.
> > > > +                        *  2. If ksym is typed, the field 'vmlinux_btf_id' is
> > > > +                        *     valid.
> > > > +                        */
> > > > +                       bool is_typeless;
> > > > +                       union {
> > > > +                               unsigned long long addr;
> > > > +                               int vmlinux_btf_id;
> > > > +                       };
> > >
> > > ksym is 16 bytes anyways, union doesn't help to save space. I propose
> > > to encode all this with just two fields: vmlinux_btf_id and addr. If
> > > btf_id == 0, then extern is typeless.
> >
> > Ack on expanding the union. But I slightly preferred keeping
> > is_typeless. IIUC, btf_id points a VAR_KIND, we need the following
> > pointer chasing every time
> >
> > t = btf__type_by_id(obj->btf, ext->btf_id);
> > t->type;
> >
> > which I felt is worse than keeping a is_typeless flag.
>
> Sorry, I'm not following. In all places where you would check
> sym->is_typeless, you'd now just do:
>
> if (ext->ksym.vmlinux_btf_id) {
>   /* typed, use ext->ksym.vmlinux_btf_id */
> } else {
>   /* typeless */
> }
>

My apologies, I should be more specific.

'vmlinux_btf_id' gets its value in bpf_object__resolve_ksyms_btf_id().
Before we call this function, there are three places that need to tell
whether a ksym is typed, currently in v1. Specifically,

 - in bpf_object__collect_externs(), typeless ksyms are rewritten as
'int', in contrast, typed ones are left untouched (though this may
change in v2).
 - bpf_object__load_vmlinux_btf() now is called before
bpf_object__resolve_ksyms_btf_id(). In v1's design, if there is no
typed ksym, we could skip loading vmlinux_btf potentially.
 - even bpf_object__resolve_ksyms_btf_id() itself is conditionally
called, depending on whether there is any typed ksym.

At the time when these places are called, vmlinux_btf_id is
unavailable and we can't use it for the purpose of telling whether a
ksym is typed.

However, rather than vmlinux_btf_id, there may be an alternative. We
can record the ksym extern's type's btf_id and use that as
'is_typeless' flag. This also solves the problem below.

[...]

> > > >                 } else {
> > > >                         pr_warn("unrecognized extern section '%s'\n", sec_name);
> > > >                         return -ENOTSUP;
> > > > @@ -2992,9 +3006,9 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
> > > >         /* sort externs by type, for kcfg ones also by (align, size, name) */
> > > >         qsort(obj->externs, obj->nr_extern, sizeof(*ext), cmp_externs);
> > > >
> > > > -       /* for .ksyms section, we need to turn all externs into allocated
> > > > -        * variables in BTF to pass kernel verification; we do this by
> > > > -        * pretending that each extern is a 8-byte variable
> > > > +       /* for .ksyms section, we need to turn all typeless externs into
> > > > +        * allocated variables in BTF to pass kernel verification; we do
> > > > +        * this by pretending that each typeless extern is a 8-byte variable
> > > >          */
> > > >         if (ksym_sec) {
> > > >                 /* find existing 4-byte integer type in BTF to use for fake
> > > > @@ -3012,7 +3026,7 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
> > > >
> > > >                 sec = ksym_sec;
> > > >                 n = btf_vlen(sec);
> > > > -               for (i = 0, off = 0; i < n; i++, off += sizeof(int)) {
> > > > +               for (i = 0, off = 0; i < n; i++) {
> > > >                         struct btf_var_secinfo *vs = btf_var_secinfos(sec) + i;
> > > >                         struct btf_type *vt;
> > > >
> > > > @@ -3025,9 +3039,14 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
> > > >                                 return -ESRCH;
> > > >                         }
> > > >                         btf_var(vt)->linkage = BTF_VAR_GLOBAL_ALLOCATED;
> > > > -                       vt->type = int_btf_id;
> > > > +                       if (ext->ksym.is_typeless) {
> > > > +                               vt->type = int_btf_id;
> > > > +                               vs->size = sizeof(int);
> > > > +                       }
> > > >                         vs->offset = off;
> > > > -                       vs->size = sizeof(int);
> > > > +                       off += vs->size;
> > > > +                       pr_debug("ksym var_secinfo: var '%s', type #%d, size %d, offset %d\n",
> > > > +                                ext->name, vt->type, vs->size, vs->offset);
> > >
> > > It's a bit of a waste that we still allocate memory for those typed
> > > ksym externs, as they don't really need space. But modifying BTF is a
> > > pain right now, so I think we'll have to do it, until we have a better
> > > BTF API. But let's make them integers for now to take a fixed and
> > > small amount of space.
> > >
> >
> > Do you mean making typed ksym externs of type integer? If so, we can't
> > do that, I think. After collect_externs, we later need to compare the
> > declared extern's type against the type defined in kernel. Better not
> > rewrite their types in BTf.
>
> Then maybe we need to make btf_id to point to the actual type of the
> variable, not BTF_KIND_VAR? Or just additionally record type's btf_id,
> not sure which one makes more sense at the moment.
>
> >
> > I am generally against modifying BTF. I initially didn't notice that
> > all the ksym externs' types are modified to 'int' and the type
> > comparison I mentioned above always failed. I dumped the btf in
> > vmlinux and the btf in object file, checked the kernel variable's
> > source code, printed out everything I could. The experience was very
> > bad.
> >
>
> It might be confusing, I agree, but the alternative is just a waste of
> memory just to match the BTF definition of a DATASEC, which describes
> externs. It seems sloppy to allocate a bunch of unused memory just to
> match the kernel's variable size, while in reality we either use 8
> bytes used (for typeless externs, storing ksym address) or none (for
> typed externs).
>
> Another alternative is to not specify BTF ID for .ksyms map, but it's
> not great for typeless externs case, as we are losing all type info
> completely. Trade-offs...
>

I see. It looks like rewriting all ksym externs' type to integers is
the most straightforward solution here, though I felt a bit hacky.

I can record the btf_id of the var's type before rewriting, so
bpf_core_type_are_compat() can find the true type for comparison. One
good thing about recording the type's btf_id is that it can be used to
tell whether the ksym extern is typed or not, before vmlinux_btf_id
gets its value. I will think about this and try the alternatives a bit
more and follow up if I come up with a better solution.

Thanks!

[...]

  reply	other threads:[~2020-09-01 20:35 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-19 22:40 [PATCH bpf-next v1 0/8] bpf: BTF support for ksyms Hao Luo
2020-08-19 22:40 ` [PATCH bpf-next v1 1/8] bpf: Introduce pseudo_btf_id Hao Luo
2020-08-20 15:22   ` Yonghong Song
2020-08-20 17:04     ` Hao Luo
2020-08-25  0:05     ` Hao Luo
2020-08-25  0:43       ` Yonghong Song
2020-08-20 16:43   ` Yonghong Song
2020-08-20 21:53   ` Alexei Starovoitov
2020-08-21  2:22     ` Hao Luo
2020-08-19 22:40 ` [PATCH bpf-next v1 2/8] bpf: Propagate BPF_PSEUDO_BTF_ID to uapi headers in /tools Hao Luo
2020-08-20 16:44   ` Yonghong Song
2020-08-19 22:40 ` [PATCH bpf-next v1 3/8] bpf: Introduce help function to validate ksym's type Hao Luo
2020-08-20 17:20   ` Yonghong Song
2020-08-21 21:50     ` Andrii Nakryiko
2020-08-22  0:43       ` Hao Luo
2020-08-22  2:43         ` Andrii Nakryiko
2020-08-22  7:04           ` Hao Luo
2020-08-19 22:40 ` [PATCH bpf-next v1 4/8] bpf/libbpf: BTF support for typed ksyms Hao Luo
2020-08-21 22:37   ` Andrii Nakryiko
2020-08-27 22:29     ` Hao Luo
2020-09-01 18:11       ` Andrii Nakryiko
2020-09-01 20:35         ` Hao Luo [this message]
2020-09-01 23:54           ` Andrii Nakryiko
2020-09-02  0:46             ` Hao Luo
2020-08-19 22:40 ` [PATCH bpf-next v1 5/8] bpf/selftests: ksyms_btf to test " Hao Luo
2020-08-20 17:28   ` Yonghong Song
2020-08-21 23:03     ` Andrii Nakryiko
2020-08-22  7:26       ` Hao Luo
2020-08-22  7:35         ` Andrii Nakryiko
2020-08-19 22:40 ` [PATCH bpf-next v1 6/8] bpf: Introduce bpf_per_cpu_ptr() Hao Luo
2020-08-22  3:26   ` Andrii Nakryiko
2020-08-22  3:31     ` Andrii Nakryiko
2020-08-22  7:49       ` Hao Luo
2020-08-22  7:55         ` Andrii Nakryiko
2020-08-25  1:03           ` Hao Luo
2020-08-19 22:40 ` [PATCH bpf-next v1 7/8] bpf: Propagate bpf_per_cpu_ptr() to /tools Hao Luo
2020-08-20 17:30   ` Yonghong Song
2020-08-19 22:40 ` [PATCH bpf-next v1 8/8] bpf/selftests: Test for bpf_per_cpu_ptr() Hao Luo
2020-08-22  3:30   ` Andrii Nakryiko
2020-08-28  3:42     ` Hao Luo
2020-09-01 18:12       ` Andrii Nakryiko
2020-09-01 19:47         ` Hao Luo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+khW7jnzZim6h9O+JH2AnXmvtU19-FxJDZBXfHZH9Xniq8zeg@mail.gmail.com \
    --to=haoluo@google.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andriin@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jakub@cloudflare.com \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=quentin@isovalent.com \
    --cc=rdna@fb.com \
    --cc=rostedt@goodmis.org \
    --cc=shuah@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    --subject='Re: [PATCH bpf-next v1 4/8] bpf/libbpf: BTF support for typed ksyms' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).