Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH bpf-next v3 0/8] Allow storage of flexible metadata information for eBPF programs
@ 2020-08-28 19:35 Stanislav Fomichev
  2020-08-28 19:35 ` [PATCH bpf-next v3 1/8] bpf: Mutex protect used_maps array and count Stanislav Fomichev
                   ` (7 more replies)
  0 siblings, 8 replies; 31+ messages in thread
From: Stanislav Fomichev @ 2020-08-28 19:35 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, YiFei Zhu

Currently, if a user wants to store arbitrary metadata for an eBPF
program, for example, the program build commit hash or version, they
could store it in a map, and conveniently libbpf uses .data section to
populate an internal map. However, if the program does not actually
reference the map, then the map would be de-refcounted and freed.

This patch set introduces a new syscall BPF_PROG_BIND_MAP to add a map
to a program's used_maps, even if the program instructions does not
reference the map. libbpf is extended to recognize the .metadata section
and load it as an internal map, and use the new syscall to ensure the
map is bound. bpftool is also extended to have a new flag to prog
subcommand, "--metadata" to dump the contents of the metadata section
without a separate map dump call.

An example use of this would be BPF C file declaring:

  char commit_hash[] SEC(".metadata") = "abcdef123456";

and bpftool would emit:

  $ bpftool prog --metadata
  [...]
        metadata:
                commit_hash = "abcdef123456"

Patch 1 protects the used_maps array and count with a mutex.

Patch 2 implements the new syscall.

Patch 3 extends libbpf to have a wrapper around the syscall, probe the
kernel for support of this new syscall, and use it on .metadata section
if supported and the section exists.

Patch 4 extends bpftool so that it is able to dump metadata from prog
show.

Patch 5 extends bpftool gen skeleton to treat the metadata section like
an rodata section so that it mmaps the map read-only at load time.

Patch 6 adds a test to check the metadata loading and dumping.

Changes since RFC:
* Fixed a few missing unlocks, and missing close while iterating map fds.
* Move mutex initialization to right after prog aux allocation, and mutex
  destroy to right after prog aux free.
* s/ADD_MAP/BIND_MAP/
* Use mutex only instead of RCU to protect the used_map array & count.

Changes since v1:
* Made struct bpf_prog_bind_opts in libbpf so flags is optional.
* Deduped probe_kern_global_data and probe_prog_bind_map into a common
  helper.
* Added comment regarding why EEXIST is ignored in libbpf bind map.
* Froze all LIBBPF_MAP_METADATA internal maps.
* Moved bpf_prog_bind_map into new LIBBPF_0.1.1 in libbpf.map.
* Added p_err() calls on error cases in bpftool show_prog_metadata.
* Reverse christmas tree coding style in bpftool show_prog_metadata.
* Made bpftool gen skeleton recognize .metadata as an internal map and
  generate datasec definition in skeleton.
* Added C test using skeleton to see asset that the metadata is what we
  expect and rebinding causes EEXIST.

Cc: YiFei Zhu <zhuyifei1999@gmail.com>

Stanislav Fomichev (2):
  libbpf: implement bpf_prog_find_metadata
  bpftool: mention --metadata in the documentation

YiFei Zhu (6):
  bpf: Mutex protect used_maps array and count
  bpf: Add BPF_PROG_BIND_MAP syscall
  libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  bpftool: support dumping metadata
  bpftool: support metadata internal map in gen skeleton
  selftests/bpf: Test load and dump metadata with btftool and skel

 .../net/ethernet/netronome/nfp/bpf/offload.c  |  18 ++-
 include/linux/bpf.h                           |   1 +
 include/uapi/linux/bpf.h                      |   7 +
 kernel/bpf/core.c                             |  15 +-
 kernel/bpf/syscall.c                          |  81 ++++++++++-
 net/core/dev.c                                |  11 +-
 .../bpftool/Documentation/bpftool-prog.rst    |   5 +-
 tools/bpf/bpftool/gen.c                       |   5 +
 tools/bpf/bpftool/json_writer.c               |   6 +
 tools/bpf/bpftool/json_writer.h               |   3 +
 tools/bpf/bpftool/main.c                      |  10 ++
 tools/bpf/bpftool/main.h                      |   1 +
 tools/bpf/bpftool/prog.c                      | 132 +++++++++++++++++-
 tools/include/uapi/linux/bpf.h                |   7 +
 tools/lib/bpf/bpf.c                           |  87 ++++++++++++
 tools/lib/bpf/bpf.h                           |   9 ++
 tools/lib/bpf/libbpf.c                        | 130 ++++++++++++++---
 tools/lib/bpf/libbpf.map                      |   2 +
 tools/testing/selftests/bpf/Makefile          |   3 +-
 .../selftests/bpf/prog_tests/metadata.c       |  83 +++++++++++
 .../selftests/bpf/progs/metadata_unused.c     |  15 ++
 .../selftests/bpf/progs/metadata_used.c       |  15 ++
 .../selftests/bpf/test_bpftool_metadata.sh    |  82 +++++++++++
 23 files changed, 687 insertions(+), 41 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/metadata.c
 create mode 100644 tools/testing/selftests/bpf/progs/metadata_unused.c
 create mode 100644 tools/testing/selftests/bpf/progs/metadata_used.c
 create mode 100755 tools/testing/selftests/bpf/test_bpftool_metadata.sh

-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH bpf-next v3 1/8] bpf: Mutex protect used_maps array and count
  2020-08-28 19:35 [PATCH bpf-next v3 0/8] Allow storage of flexible metadata information for eBPF programs Stanislav Fomichev
@ 2020-08-28 19:35 ` Stanislav Fomichev
  2020-08-28 19:35 ` [PATCH bpf-next v3 2/8] bpf: Add BPF_PROG_BIND_MAP syscall Stanislav Fomichev
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 31+ messages in thread
From: Stanislav Fomichev @ 2020-08-28 19:35 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, YiFei Zhu, YiFei Zhu, Stanislav Fomichev

From: YiFei Zhu <zhuyifei@google.com>

To support modifying the used_maps array, we use a mutex to protect
the use of the counter and the array. The mutex is initialized right
after the prog aux is allocated, and destroyed right before prog
aux is freed. This way we guarantee it's initialized for both cBPF
and eBPF.

Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../net/ethernet/netronome/nfp/bpf/offload.c   | 18 ++++++++++++------
 include/linux/bpf.h                            |  1 +
 kernel/bpf/core.c                              | 15 +++++++++++----
 kernel/bpf/syscall.c                           | 16 ++++++++++++----
 net/core/dev.c                                 | 11 ++++++++---
 5 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/offload.c b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
index ac02369174a9..53851853562c 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
@@ -111,7 +111,9 @@ static int
 nfp_map_ptrs_record(struct nfp_app_bpf *bpf, struct nfp_prog *nfp_prog,
 		    struct bpf_prog *prog)
 {
-	int i, cnt, err;
+	int i, cnt, err = 0;
+
+	mutex_lock(&prog->aux->used_maps_mutex);
 
 	/* Quickly count the maps we will have to remember */
 	cnt = 0;
@@ -119,13 +121,15 @@ nfp_map_ptrs_record(struct nfp_app_bpf *bpf, struct nfp_prog *nfp_prog,
 		if (bpf_map_offload_neutral(prog->aux->used_maps[i]))
 			cnt++;
 	if (!cnt)
-		return 0;
+		goto out;
 
 	nfp_prog->map_records = kmalloc_array(cnt,
 					      sizeof(nfp_prog->map_records[0]),
 					      GFP_KERNEL);
-	if (!nfp_prog->map_records)
-		return -ENOMEM;
+	if (!nfp_prog->map_records) {
+		err = -ENOMEM;
+		goto out;
+	}
 
 	for (i = 0; i < prog->aux->used_map_cnt; i++)
 		if (bpf_map_offload_neutral(prog->aux->used_maps[i])) {
@@ -133,12 +137,14 @@ nfp_map_ptrs_record(struct nfp_app_bpf *bpf, struct nfp_prog *nfp_prog,
 						 prog->aux->used_maps[i]);
 			if (err) {
 				nfp_map_ptrs_forget(bpf, nfp_prog);
-				return err;
+				goto out;
 			}
 		}
 	WARN_ON(cnt != nfp_prog->map_records_cnt);
 
-	return 0;
+out:
+	mutex_unlock(&prog->aux->used_maps_mutex);
+	return err;
 }
 
 static int
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index dbba82a80087..1b404b034775 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -748,6 +748,7 @@ struct bpf_prog_aux {
 	struct bpf_ksym ksym;
 	const struct bpf_prog_ops *ops;
 	struct bpf_map **used_maps;
+	struct mutex used_maps_mutex; /* mutex for used_maps and used_map_cnt */
 	struct bpf_prog *prog;
 	struct user_struct *user;
 	u64 load_time; /* ns since boottime */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index ed0b3578867c..2a20c2833996 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -98,6 +98,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
 	fp->jit_requested = ebpf_jit_enabled();
 
 	INIT_LIST_HEAD_RCU(&fp->aux->ksym.lnode);
+	mutex_init(&fp->aux->used_maps_mutex);
 
 	return fp;
 }
@@ -253,6 +254,7 @@ struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,
 void __bpf_prog_free(struct bpf_prog *fp)
 {
 	if (fp->aux) {
+		mutex_destroy(&fp->aux->used_maps_mutex);
 		free_percpu(fp->aux->stats);
 		kfree(fp->aux->poke_tab);
 		kfree(fp->aux);
@@ -1747,8 +1749,9 @@ bool bpf_prog_array_compatible(struct bpf_array *array,
 static int bpf_check_tail_call(const struct bpf_prog *fp)
 {
 	struct bpf_prog_aux *aux = fp->aux;
-	int i;
+	int i, ret = 0;
 
+	mutex_lock(&aux->used_maps_mutex);
 	for (i = 0; i < aux->used_map_cnt; i++) {
 		struct bpf_map *map = aux->used_maps[i];
 		struct bpf_array *array;
@@ -1757,11 +1760,15 @@ static int bpf_check_tail_call(const struct bpf_prog *fp)
 			continue;
 
 		array = container_of(map, struct bpf_array, map);
-		if (!bpf_prog_array_compatible(array, fp))
-			return -EINVAL;
+		if (!bpf_prog_array_compatible(array, fp)) {
+			ret = -EINVAL;
+			goto out;
+		}
 	}
 
-	return 0;
+out:
+	mutex_unlock(&aux->used_maps_mutex);
+	return ret;
 }
 
 static void bpf_prog_select_func(struct bpf_prog *fp)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b86b1155b748..c9b8a97fbbdf 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3155,21 +3155,25 @@ static const struct bpf_map *bpf_map_from_imm(const struct bpf_prog *prog,
 	const struct bpf_map *map;
 	int i;
 
+	mutex_lock(&prog->aux->used_maps_mutex);
 	for (i = 0, *off = 0; i < prog->aux->used_map_cnt; i++) {
 		map = prog->aux->used_maps[i];
 		if (map == (void *)addr) {
 			*type = BPF_PSEUDO_MAP_FD;
-			return map;
+			goto out;
 		}
 		if (!map->ops->map_direct_value_meta)
 			continue;
 		if (!map->ops->map_direct_value_meta(map, addr, off)) {
 			*type = BPF_PSEUDO_MAP_VALUE;
-			return map;
+			goto out;
 		}
 	}
+	map = NULL;
 
-	return NULL;
+out:
+	mutex_unlock(&prog->aux->used_maps_mutex);
+	return map;
 }
 
 static struct bpf_insn *bpf_insn_prepare_dump(const struct bpf_prog *prog,
@@ -3287,6 +3291,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
 	memcpy(info.tag, prog->tag, sizeof(prog->tag));
 	memcpy(info.name, prog->aux->name, sizeof(prog->aux->name));
 
+	mutex_lock(&prog->aux->used_maps_mutex);
 	ulen = info.nr_map_ids;
 	info.nr_map_ids = prog->aux->used_map_cnt;
 	ulen = min_t(u32, info.nr_map_ids, ulen);
@@ -3296,9 +3301,12 @@ static int bpf_prog_get_info_by_fd(struct file *file,
 
 		for (i = 0; i < ulen; i++)
 			if (put_user(prog->aux->used_maps[i]->id,
-				     &user_map_ids[i]))
+				     &user_map_ids[i])) {
+				mutex_unlock(&prog->aux->used_maps_mutex);
 				return -EFAULT;
+			}
 	}
+	mutex_unlock(&prog->aux->used_maps_mutex);
 
 	err = set_info_rec_size(&info);
 	if (err)
diff --git a/net/core/dev.c b/net/core/dev.c
index b5d1129d8310..6957b31127d9 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5441,15 +5441,20 @@ static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp)
 	if (new) {
 		u32 i;
 
+		mutex_lock(&new->aux->used_maps_mutex);
+
 		/* generic XDP does not work with DEVMAPs that can
 		 * have a bpf_prog installed on an entry
 		 */
 		for (i = 0; i < new->aux->used_map_cnt; i++) {
-			if (dev_map_can_have_prog(new->aux->used_maps[i]))
-				return -EINVAL;
-			if (cpu_map_prog_allowed(new->aux->used_maps[i]))
+			if (dev_map_can_have_prog(new->aux->used_maps[i]) ||
+			    cpu_map_prog_allowed(new->aux->used_maps[i])) {
+				mutex_unlock(&new->aux->used_maps_mutex);
 				return -EINVAL;
+			}
 		}
+
+		mutex_unlock(&new->aux->used_maps_mutex);
 	}
 
 	switch (xdp->command) {
-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH bpf-next v3 2/8] bpf: Add BPF_PROG_BIND_MAP syscall
  2020-08-28 19:35 [PATCH bpf-next v3 0/8] Allow storage of flexible metadata information for eBPF programs Stanislav Fomichev
  2020-08-28 19:35 ` [PATCH bpf-next v3 1/8] bpf: Mutex protect used_maps array and count Stanislav Fomichev
@ 2020-08-28 19:35 ` Stanislav Fomichev
  2020-09-03  2:15   ` Andrii Nakryiko
  2020-08-28 19:35 ` [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section Stanislav Fomichev
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Stanislav Fomichev @ 2020-08-28 19:35 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, YiFei Zhu, YiFei Zhu, Stanislav Fomichev

From: YiFei Zhu <zhuyifei@google.com>

This syscall binds a map to a program. -EEXIST if the map is
already bound to the program.

Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 include/uapi/linux/bpf.h       |  7 ++++
 kernel/bpf/syscall.c           | 65 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  7 ++++
 3 files changed, 79 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index ef7af384f5ee..d232c71c4560 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -124,6 +124,7 @@ enum bpf_cmd {
 	BPF_ENABLE_STATS,
 	BPF_ITER_CREATE,
 	BPF_LINK_DETACH,
+	BPF_PROG_BIND_MAP,
 };
 
 enum bpf_map_type {
@@ -650,6 +651,12 @@ union bpf_attr {
 		__u32		flags;
 	} iter_create;
 
+	struct { /* struct used by BPF_PROG_BIND_MAP command */
+		__u32		prog_fd;
+		__u32		map_fd;
+		__u32		flags;		/* extra flags */
+	} prog_bind_map;
+
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index c9b8a97fbbdf..8c1fad2826d1 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4154,6 +4154,68 @@ static int bpf_iter_create(union bpf_attr *attr)
 	return err;
 }
 
+#define BPF_PROG_BIND_MAP_LAST_FIELD prog_bind_map.flags
+
+static int bpf_prog_bind_map(union bpf_attr *attr)
+{
+	struct bpf_prog *prog;
+	struct bpf_map *map;
+	struct bpf_map **used_maps_old, **used_maps_new;
+	int i, ret = 0;
+
+	if (CHECK_ATTR(BPF_PROG_BIND_MAP))
+		return -EINVAL;
+
+	if (attr->prog_bind_map.flags)
+		return -EINVAL;
+
+	prog = bpf_prog_get(attr->prog_bind_map.prog_fd);
+	if (IS_ERR(prog))
+		return PTR_ERR(prog);
+
+	map = bpf_map_get(attr->prog_bind_map.map_fd);
+	if (IS_ERR(map)) {
+		ret = PTR_ERR(map);
+		goto out_prog_put;
+	}
+
+	mutex_lock(&prog->aux->used_maps_mutex);
+
+	used_maps_old = prog->aux->used_maps;
+
+	for (i = 0; i < prog->aux->used_map_cnt; i++)
+		if (used_maps_old[i] == map) {
+			ret = -EEXIST;
+			goto out_unlock;
+		}
+
+	used_maps_new = kmalloc_array(prog->aux->used_map_cnt + 1,
+				      sizeof(used_maps_new[0]),
+				      GFP_KERNEL);
+	if (!used_maps_new) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	memcpy(used_maps_new, used_maps_old,
+	       sizeof(used_maps_old[0]) * prog->aux->used_map_cnt);
+	used_maps_new[prog->aux->used_map_cnt] = map;
+
+	prog->aux->used_map_cnt++;
+	prog->aux->used_maps = used_maps_new;
+
+	kfree(used_maps_old);
+
+out_unlock:
+	mutex_unlock(&prog->aux->used_maps_mutex);
+
+	if (ret)
+		bpf_map_put(map);
+out_prog_put:
+	bpf_prog_put(prog);
+	return ret;
+}
+
 SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
 {
 	union bpf_attr attr;
@@ -4287,6 +4349,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_LINK_DETACH:
 		err = link_detach(&attr);
 		break;
+	case BPF_PROG_BIND_MAP:
+		err = bpf_prog_bind_map(&attr);
+		break;
 	default:
 		err = -EINVAL;
 		break;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index ef7af384f5ee..d232c71c4560 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -124,6 +124,7 @@ enum bpf_cmd {
 	BPF_ENABLE_STATS,
 	BPF_ITER_CREATE,
 	BPF_LINK_DETACH,
+	BPF_PROG_BIND_MAP,
 };
 
 enum bpf_map_type {
@@ -650,6 +651,12 @@ union bpf_attr {
 		__u32		flags;
 	} iter_create;
 
+	struct { /* struct used by BPF_PROG_BIND_MAP command */
+		__u32		prog_fd;
+		__u32		map_fd;
+		__u32		flags;		/* extra flags */
+	} prog_bind_map;
+
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-08-28 19:35 [PATCH bpf-next v3 0/8] Allow storage of flexible metadata information for eBPF programs Stanislav Fomichev
  2020-08-28 19:35 ` [PATCH bpf-next v3 1/8] bpf: Mutex protect used_maps array and count Stanislav Fomichev
  2020-08-28 19:35 ` [PATCH bpf-next v3 2/8] bpf: Add BPF_PROG_BIND_MAP syscall Stanislav Fomichev
@ 2020-08-28 19:35 ` Stanislav Fomichev
  2020-09-03  2:31   ` Andrii Nakryiko
  2020-08-28 19:35 ` [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata Stanislav Fomichev
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Stanislav Fomichev @ 2020-08-28 19:35 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, YiFei Zhu, YiFei Zhu, Stanislav Fomichev

From: YiFei Zhu <zhuyifei@google.com>

The patch adds a simple wrapper bpf_prog_bind_map around the syscall.
And when using libbpf to load a program, it will probe the kernel for
the support of this syscall, and scan for the .metadata ELF section
and load it as an internal map like a .data section.

In the case that kernel supports the BPF_PROG_BIND_MAP syscall and
a .metadata section exists, the map will be explicitly bound to
the program via the syscall immediately after program is loaded.
-EEXIST is ignored for this syscall.

Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/lib/bpf/bpf.c      |  13 ++++
 tools/lib/bpf/bpf.h      |   8 +++
 tools/lib/bpf/libbpf.c   | 130 ++++++++++++++++++++++++++++++++-------
 tools/lib/bpf/libbpf.map |   1 +
 4 files changed, 131 insertions(+), 21 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 82b983ff6569..5f6c5676cc45 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -872,3 +872,16 @@ int bpf_enable_stats(enum bpf_stats_type type)
 
 	return sys_bpf(BPF_ENABLE_STATS, &attr, sizeof(attr));
 }
+
+int bpf_prog_bind_map(int prog_fd, int map_fd,
+		      const struct bpf_prog_bind_opts *opts)
+{
+	union bpf_attr attr;
+
+	memset(&attr, 0, sizeof(attr));
+	attr.prog_bind_map.prog_fd = prog_fd;
+	attr.prog_bind_map.map_fd = map_fd;
+	attr.prog_bind_map.flags = OPTS_GET(opts, flags, 0);
+
+	return sys_bpf(BPF_PROG_BIND_MAP, &attr, sizeof(attr));
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 015d13f25fcc..8c1ac4b42f90 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -243,6 +243,14 @@ LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
 enum bpf_stats_type; /* defined in up-to-date linux/bpf.h */
 LIBBPF_API int bpf_enable_stats(enum bpf_stats_type type);
 
+struct bpf_prog_bind_opts {
+	size_t sz; /* size of this struct for forward/backward compatibility */
+	__u32 flags;
+};
+#define bpf_prog_bind_opts__last_field flags
+
+LIBBPF_API int bpf_prog_bind_map(int prog_fd, int map_fd,
+				 const struct bpf_prog_bind_opts *opts);
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 8cdb2528482e..2b21021b66bb 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -176,6 +176,8 @@ enum kern_feature_id {
 	FEAT_EXP_ATTACH_TYPE,
 	/* bpf_probe_read_{kernel,user}[_str] helpers */
 	FEAT_PROBE_READ_KERN,
+	/* BPF_PROG_BIND_MAP is supported */
+	FEAT_PROG_BIND_MAP,
 	__FEAT_CNT,
 };
 
@@ -285,6 +287,7 @@ struct bpf_struct_ops {
 #define KCONFIG_SEC ".kconfig"
 #define KSYMS_SEC ".ksyms"
 #define STRUCT_OPS_SEC ".struct_ops"
+#define METADATA_SEC ".metadata"
 
 enum libbpf_map_type {
 	LIBBPF_MAP_UNSPEC,
@@ -292,6 +295,7 @@ enum libbpf_map_type {
 	LIBBPF_MAP_BSS,
 	LIBBPF_MAP_RODATA,
 	LIBBPF_MAP_KCONFIG,
+	LIBBPF_MAP_METADATA,
 };
 
 static const char * const libbpf_type_to_btf_name[] = {
@@ -299,6 +303,7 @@ static const char * const libbpf_type_to_btf_name[] = {
 	[LIBBPF_MAP_BSS]	= BSS_SEC,
 	[LIBBPF_MAP_RODATA]	= RODATA_SEC,
 	[LIBBPF_MAP_KCONFIG]	= KCONFIG_SEC,
+	[LIBBPF_MAP_METADATA]	= METADATA_SEC,
 };
 
 struct bpf_map {
@@ -381,6 +386,7 @@ struct bpf_object {
 	struct extern_desc *externs;
 	int nr_extern;
 	int kconfig_map_idx;
+	int metadata_map_idx;
 
 	bool loaded;
 	bool has_pseudo_calls;
@@ -400,6 +406,7 @@ struct bpf_object {
 		Elf_Data *rodata;
 		Elf_Data *bss;
 		Elf_Data *st_ops_data;
+		Elf_Data *metadata;
 		size_t shstrndx; /* section index for section name strings */
 		size_t strtabidx;
 		struct {
@@ -416,6 +423,7 @@ struct bpf_object {
 		int rodata_shndx;
 		int bss_shndx;
 		int st_ops_shndx;
+		int metadata_shndx;
 	} efile;
 	/*
 	 * All loaded bpf_object is linked in a list, which is
@@ -1027,11 +1035,13 @@ static struct bpf_object *bpf_object__new(const char *path,
 	obj->efile.obj_buf_sz = obj_buf_sz;
 	obj->efile.maps_shndx = -1;
 	obj->efile.btf_maps_shndx = -1;
+	obj->efile.metadata_shndx = -1;
 	obj->efile.data_shndx = -1;
 	obj->efile.rodata_shndx = -1;
 	obj->efile.bss_shndx = -1;
 	obj->efile.st_ops_shndx = -1;
 	obj->kconfig_map_idx = -1;
+	obj->metadata_map_idx = -1;
 
 	obj->kern_version = get_kernel_version();
 	obj->loaded = false;
@@ -1343,7 +1353,8 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
 	def->key_size = sizeof(int);
 	def->value_size = data_sz;
 	def->max_entries = 1;
-	def->map_flags = type == LIBBPF_MAP_RODATA || type == LIBBPF_MAP_KCONFIG
+	def->map_flags = type == LIBBPF_MAP_RODATA || type == LIBBPF_MAP_KCONFIG ||
+			 type == LIBBPF_MAP_METADATA
 			 ? BPF_F_RDONLY_PROG : 0;
 	def->map_flags |= BPF_F_MMAPABLE;
 
@@ -1399,6 +1410,16 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj)
 		if (err)
 			return err;
 	}
+	if (obj->efile.metadata_shndx >= 0) {
+		err = bpf_object__init_internal_map(obj, LIBBPF_MAP_METADATA,
+						    obj->efile.metadata_shndx,
+						    obj->efile.metadata->d_buf,
+						    obj->efile.metadata->d_size);
+		if (err)
+			return err;
+
+		obj->metadata_map_idx = obj->nr_maps - 1;
+	}
 	return 0;
 }
 
@@ -2790,6 +2811,9 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 			} else if (strcmp(name, STRUCT_OPS_SEC) == 0) {
 				obj->efile.st_ops_data = data;
 				obj->efile.st_ops_shndx = idx;
+			} else if (strcmp(name, METADATA_SEC) == 0) {
+				obj->efile.metadata = data;
+				obj->efile.metadata_shndx = idx;
 			} else {
 				pr_info("elf: skipping unrecognized data section(%d) %s\n",
 					idx, name);
@@ -3201,7 +3225,8 @@ static bool bpf_object__shndx_is_data(const struct bpf_object *obj,
 {
 	return shndx == obj->efile.data_shndx ||
 	       shndx == obj->efile.bss_shndx ||
-	       shndx == obj->efile.rodata_shndx;
+	       shndx == obj->efile.rodata_shndx ||
+	       shndx == obj->efile.metadata_shndx;
 }
 
 static bool bpf_object__shndx_is_maps(const struct bpf_object *obj,
@@ -3222,6 +3247,8 @@ bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx)
 		return LIBBPF_MAP_RODATA;
 	else if (shndx == obj->efile.symbols_shndx)
 		return LIBBPF_MAP_KCONFIG;
+	else if (shndx == obj->efile.metadata_shndx)
+		return LIBBPF_MAP_METADATA;
 	else
 		return LIBBPF_MAP_UNSPEC;
 }
@@ -3592,18 +3619,13 @@ static int probe_kern_prog_name(void)
 	return probe_fd(ret);
 }
 
-static int probe_kern_global_data(void)
+static void __probe_create_global_data(int *prog, int *map,
+				       struct bpf_insn *insns, size_t insns_cnt)
 {
 	struct bpf_load_program_attr prg_attr;
 	struct bpf_create_map_attr map_attr;
 	char *cp, errmsg[STRERR_BUFSIZE];
-	struct bpf_insn insns[] = {
-		BPF_LD_MAP_VALUE(BPF_REG_1, 0, 16),
-		BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 42),
-		BPF_MOV64_IMM(BPF_REG_0, 0),
-		BPF_EXIT_INSN(),
-	};
-	int ret, map;
+	int err;
 
 	memset(&map_attr, 0, sizeof(map_attr));
 	map_attr.map_type = BPF_MAP_TYPE_ARRAY;
@@ -3611,26 +3633,40 @@ static int probe_kern_global_data(void)
 	map_attr.value_size = 32;
 	map_attr.max_entries = 1;
 
-	map = bpf_create_map_xattr(&map_attr);
-	if (map < 0) {
-		ret = -errno;
-		cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
+	*map = bpf_create_map_xattr(&map_attr);
+	if (*map < 0) {
+		err = errno;
+		cp = libbpf_strerror_r(err, errmsg, sizeof(errmsg));
 		pr_warn("Error in %s():%s(%d). Couldn't create simple array map.\n",
-			__func__, cp, -ret);
-		return ret;
+			__func__, cp, -err);
+		return;
 	}
 
-	insns[0].imm = map;
+	insns[0].imm = *map;
 
 	memset(&prg_attr, 0, sizeof(prg_attr));
 	prg_attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
 	prg_attr.insns = insns;
-	prg_attr.insns_cnt = ARRAY_SIZE(insns);
+	prg_attr.insns_cnt = insns_cnt;
 	prg_attr.license = "GPL";
 
-	ret = bpf_load_program_xattr(&prg_attr, NULL, 0);
+	*prog = bpf_load_program_xattr(&prg_attr, NULL, 0);
+}
+
+static int probe_kern_global_data(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_LD_MAP_VALUE(BPF_REG_1, 0, 16),
+		BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 42),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog = -1, map = -1;
+
+	__probe_create_global_data(&prog, &map, insns, ARRAY_SIZE(insns));
+
 	close(map);
-	return probe_fd(ret);
+	return probe_fd(prog);
 }
 
 static int probe_kern_btf(void)
@@ -3757,6 +3793,32 @@ static int probe_kern_probe_read_kernel(void)
 	return probe_fd(bpf_load_program_xattr(&attr, NULL, 0));
 }
 
+static int probe_prog_bind_map(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog = -1, map = -1, ret = 0;
+
+	if (!kernel_supports(FEAT_GLOBAL_DATA))
+		return 0;
+
+	__probe_create_global_data(&prog, &map, insns, ARRAY_SIZE(insns));
+
+	if (map >= 0 && prog < 0) {
+		close(map);
+		return 0;
+	}
+
+	if (!bpf_prog_bind_map(prog, map, NULL))
+		ret = 1;
+
+	close(map);
+	close(prog);
+	return ret;
+}
+
 enum kern_feature_result {
 	FEAT_UNKNOWN = 0,
 	FEAT_SUPPORTED = 1,
@@ -3797,6 +3859,9 @@ static struct kern_feature_desc {
 	},
 	[FEAT_PROBE_READ_KERN] = {
 		"bpf_probe_read_kernel() helper", probe_kern_probe_read_kernel,
+	},
+	[FEAT_PROG_BIND_MAP] = {
+		"BPF_PROG_BIND_MAP support", probe_prog_bind_map,
 	}
 };
 
@@ -3897,7 +3962,8 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 	}
 
 	/* Freeze .rodata and .kconfig map as read-only from syscall side. */
-	if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_KCONFIG) {
+	if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_KCONFIG ||
+	    map_type == LIBBPF_MAP_METADATA) {
 		err = bpf_map_freeze(map->fd);
 		if (err) {
 			err = -errno;
@@ -6057,6 +6123,28 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	if (ret >= 0) {
 		if (log_buf && load_attr.log_level)
 			pr_debug("verifier log:\n%s", log_buf);
+
+		if (prog->obj->metadata_map_idx >= 0 &&
+		    kernel_supports(FEAT_PROG_BIND_MAP)) {
+			struct bpf_map *metadata_map =
+				&prog->obj->maps[prog->obj->metadata_map_idx];
+
+			/* EEXIST to bpf_prog_bind_map means the map is already
+			 * bound to the program. This can happen if the program
+			 * refers to the map in its code. Since all we are doing
+			 * is to make sure when we iterate the program's maps
+			 * metadata map is always inside, EXIST is okay; we
+			 * ignore this errno
+			 */
+			if (bpf_prog_bind_map(ret, bpf_map__fd(metadata_map), NULL) &&
+			    errno != EEXIST) {
+				cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg));
+				pr_warn("prog '%s': failed to bind .metadata map: %s\n",
+					prog->name, cp);
+				/* Don't fail hard if can't load metadata. */
+			}
+		}
+
 		*pfd = ret;
 		ret = 0;
 		goto out;
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 66a6286d0716..529b99c0c2c3 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -302,6 +302,7 @@ LIBBPF_0.1.0 {
 
 LIBBPF_0.2.0 {
 	global:
+		bpf_prog_bind_map;
 		perf_buffer__buffer_cnt;
 		perf_buffer__buffer_fd;
 		perf_buffer__epoll_fd;
-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata
  2020-08-28 19:35 [PATCH bpf-next v3 0/8] Allow storage of flexible metadata information for eBPF programs Stanislav Fomichev
                   ` (2 preceding siblings ...)
  2020-08-28 19:35 ` [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section Stanislav Fomichev
@ 2020-08-28 19:35 ` Stanislav Fomichev
  2020-08-28 21:10   ` Toke Høiland-Jørgensen
  2020-08-28 19:36 ` [PATCH bpf-next v3 5/8] bpftool: support dumping metadata Stanislav Fomichev
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Stanislav Fomichev @ 2020-08-28 19:35 UTC (permalink / raw)
  To: netdev, bpf
  Cc: davem, ast, daniel, Stanislav Fomichev,
	Toke Høiland-Jørgensen, YiFei Zhu

This is a low-level function (hence in bpf.c) to find out the metadata
map id for the provided program fd.
It will be used in the next commits from bpftool.

Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/lib/bpf/bpf.c      | 74 ++++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/bpf.h      |  1 +
 tools/lib/bpf/libbpf.map |  1 +
 3 files changed, 76 insertions(+)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 5f6c5676cc45..01c0ede1625d 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -885,3 +885,77 @@ int bpf_prog_bind_map(int prog_fd, int map_fd,
 
 	return sys_bpf(BPF_PROG_BIND_MAP, &attr, sizeof(attr));
 }
+
+int bpf_prog_find_metadata(int prog_fd)
+{
+	struct bpf_prog_info prog_info = {};
+	struct bpf_map_info map_info;
+	__u32 prog_info_len;
+	__u32 map_info_len;
+	int saved_errno;
+	__u32 *map_ids;
+	int nr_maps;
+	int map_fd;
+	int ret;
+	int i;
+
+	prog_info_len = sizeof(prog_info);
+
+	ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len);
+	if (ret)
+		return ret;
+
+	if (!prog_info.nr_map_ids)
+		return -1;
+
+	map_ids = calloc(prog_info.nr_map_ids, sizeof(__u32));
+	if (!map_ids)
+		return -1;
+
+	nr_maps = prog_info.nr_map_ids;
+	memset(&prog_info, 0, sizeof(prog_info));
+	prog_info.nr_map_ids = nr_maps;
+	prog_info.map_ids = ptr_to_u64(map_ids);
+	prog_info_len = sizeof(prog_info);
+
+	ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len);
+	if (ret)
+		goto free_map_ids;
+
+	ret = -1;
+	for (i = 0; i < prog_info.nr_map_ids; i++) {
+		map_fd = bpf_map_get_fd_by_id(map_ids[i]);
+		if (map_fd < 0) {
+			ret = -1;
+			goto free_map_ids;
+		}
+
+		memset(&map_info, 0, sizeof(map_info));
+		map_info_len = sizeof(map_info);
+		ret = bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len);
+		saved_errno = errno;
+		close(map_fd);
+		errno = saved_errno;
+		if (ret)
+			goto free_map_ids;
+
+		if (map_info.type != BPF_MAP_TYPE_ARRAY)
+			continue;
+		if (map_info.key_size != sizeof(int))
+			continue;
+		if (map_info.max_entries != 1)
+			continue;
+		if (!map_info.btf_value_type_id)
+			continue;
+		if (!strstr(map_info.name, ".metadata"))
+			continue;
+
+		ret = map_ids[i];
+		break;
+	}
+
+
+free_map_ids:
+	free(map_ids);
+	return ret;
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 8c1ac4b42f90..8982ffa7cfd2 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -251,6 +251,7 @@ struct bpf_prog_bind_opts {
 
 LIBBPF_API int bpf_prog_bind_map(int prog_fd, int map_fd,
 				 const struct bpf_prog_bind_opts *opts);
+LIBBPF_API int bpf_prog_find_metadata(int prog_fd);
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 529b99c0c2c3..b7a40f543b2b 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -307,4 +307,5 @@ LIBBPF_0.2.0 {
 		perf_buffer__buffer_fd;
 		perf_buffer__epoll_fd;
 		perf_buffer__consume_buffer;
+		bpf_prog_find_metadata;
 } LIBBPF_0.1.0;
-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH bpf-next v3 5/8] bpftool: support dumping metadata
  2020-08-28 19:35 [PATCH bpf-next v3 0/8] Allow storage of flexible metadata information for eBPF programs Stanislav Fomichev
                   ` (3 preceding siblings ...)
  2020-08-28 19:35 ` [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata Stanislav Fomichev
@ 2020-08-28 19:36 ` Stanislav Fomichev
  2020-09-03  5:00   ` Andrii Nakryiko
  2020-08-28 19:36 ` [PATCH bpf-next v3 6/8] bpftool: support metadata internal map in gen skeleton Stanislav Fomichev
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 31+ messages in thread
From: Stanislav Fomichev @ 2020-08-28 19:36 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, YiFei Zhu, YiFei Zhu, Stanislav Fomichev

From: YiFei Zhu <zhuyifei@google.com>

Added a flag "--metadata" to `bpftool prog list` to dump the metadata
contents. For some formatting some BTF code is put directly in the
metadata dumping. Sanity checks on the map and the kind of the btf_type
to make sure we are actually dumping what we are expecting.

A helper jsonw_reset is added to json writer so we can reuse the same
json writer without having extraneous commas.

Sample output:

  $ bpftool prog --metadata
  6: cgroup_skb  name prog  tag bcf7977d3b93787c  gpl
  [...]
  	btf_id 4
  	metadata:
  		metadata_a = "foo"
  		metadata_b = 1

  $ bpftool prog --metadata --json --pretty
  [{
          "id": 6,
  [...]
          "btf_id": 4,
          "metadata": {
              "metadata_a": "foo",
              "metadata_b": 1
          }
      }
  ]

Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/bpf/bpftool/json_writer.c |   6 ++
 tools/bpf/bpftool/json_writer.h |   3 +
 tools/bpf/bpftool/main.c        |  10 +++
 tools/bpf/bpftool/main.h        |   1 +
 tools/bpf/bpftool/prog.c        | 130 ++++++++++++++++++++++++++++++++
 5 files changed, 150 insertions(+)

diff --git a/tools/bpf/bpftool/json_writer.c b/tools/bpf/bpftool/json_writer.c
index 86501cd3c763..7fea83bedf48 100644
--- a/tools/bpf/bpftool/json_writer.c
+++ b/tools/bpf/bpftool/json_writer.c
@@ -119,6 +119,12 @@ void jsonw_pretty(json_writer_t *self, bool on)
 	self->pretty = on;
 }
 
+void jsonw_reset(json_writer_t *self)
+{
+	assert(self->depth == 0);
+	self->sep = '\0';
+}
+
 /* Basic blocks */
 static void jsonw_begin(json_writer_t *self, int c)
 {
diff --git a/tools/bpf/bpftool/json_writer.h b/tools/bpf/bpftool/json_writer.h
index 35cf1f00f96c..8ace65cdb92f 100644
--- a/tools/bpf/bpftool/json_writer.h
+++ b/tools/bpf/bpftool/json_writer.h
@@ -27,6 +27,9 @@ void jsonw_destroy(json_writer_t **self_p);
 /* Cause output to have pretty whitespace */
 void jsonw_pretty(json_writer_t *self, bool on);
 
+/* Reset separator to create new JSON */
+void jsonw_reset(json_writer_t *self);
+
 /* Add property name */
 void jsonw_name(json_writer_t *self, const char *name);
 
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index 4a191fcbeb82..a681d568cfa7 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -28,6 +28,7 @@ bool show_pinned;
 bool block_mount;
 bool verifier_logs;
 bool relaxed_maps;
+bool dump_metadata;
 struct pinned_obj_table prog_table;
 struct pinned_obj_table map_table;
 struct pinned_obj_table link_table;
@@ -351,6 +352,10 @@ static int do_batch(int argc, char **argv)
 	return err;
 }
 
+enum bpftool_longonly_opts {
+	OPT_METADATA = 256,
+};
+
 int main(int argc, char **argv)
 {
 	static const struct option options[] = {
@@ -362,6 +367,7 @@ int main(int argc, char **argv)
 		{ "mapcompat",	no_argument,	NULL,	'm' },
 		{ "nomount",	no_argument,	NULL,	'n' },
 		{ "debug",	no_argument,	NULL,	'd' },
+		{ "metadata",	no_argument,	NULL,	OPT_METADATA },
 		{ 0 }
 	};
 	int opt, ret;
@@ -371,6 +377,7 @@ int main(int argc, char **argv)
 	json_output = false;
 	show_pinned = false;
 	block_mount = false;
+	dump_metadata = false;
 	bin_name = argv[0];
 
 	hash_init(prog_table.table);
@@ -412,6 +419,9 @@ int main(int argc, char **argv)
 			libbpf_set_print(print_all_levels);
 			verifier_logs = true;
 			break;
+		case OPT_METADATA:
+			dump_metadata = true;
+			break;
 		default:
 			p_err("unrecognized option '%s'", argv[optind - 1]);
 			if (json_output)
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index c46e52137b87..8750758e9150 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -90,6 +90,7 @@ extern bool show_pids;
 extern bool block_mount;
 extern bool verifier_logs;
 extern bool relaxed_maps;
+extern bool dump_metadata;
 extern struct pinned_obj_table prog_table;
 extern struct pinned_obj_table map_table;
 extern struct pinned_obj_table link_table;
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index d393eb8263a6..5d626c134e7d 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -151,6 +151,130 @@ static void show_prog_maps(int fd, __u32 num_maps)
 	}
 }
 
+static void show_prog_metadata(int fd, __u32 num_maps)
+{
+	const struct btf_type *t_datasec, *t_var;
+	struct bpf_map_info map_info = {};
+	struct btf_var_secinfo *vsi;
+	struct btf *btf = NULL;
+	unsigned int i, vlen;
+	__u32 map_info_len;
+	void *value = NULL;
+	int key = 0;
+	int map_id;
+	int map_fd;
+	int err;
+
+	if (!num_maps)
+		return;
+
+	map_id = bpf_prog_find_metadata(fd);
+	if (map_id < 0)
+		return;
+
+	map_fd = bpf_map_get_fd_by_id(map_id);
+	if (map_fd < 0) {
+		p_err("can't get map by id (%u): %s", map_id, strerror(errno));
+		return;
+	}
+
+	map_info_len = sizeof(map_info);
+	err = bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len);
+	if (err) {
+		p_err("can't get map info of id (%u): %s", map_id,
+		      strerror(errno));
+		goto out_close;
+	}
+
+	value = malloc(map_info.value_size);
+	if (!value) {
+		p_err("mem alloc failed");
+		goto out_close;
+	}
+
+	if (bpf_map_lookup_elem(map_fd, &key, value)) {
+		p_err("metadata map lookup failed: %s", strerror(errno));
+		goto out_free;
+	}
+
+	err = btf__get_from_id(map_info.btf_id, &btf);
+	if (err || !btf) {
+		p_err("metadata BTF get failed: %s", strerror(-err));
+		goto out_free;
+	}
+
+	t_datasec = btf__type_by_id(btf, map_info.btf_value_type_id);
+	if (BTF_INFO_KIND(t_datasec->info) != BTF_KIND_DATASEC) {
+		p_err("bad metadata BTF");
+		goto out_free;
+	}
+
+	vlen = BTF_INFO_VLEN(t_datasec->info);
+	vsi = (struct btf_var_secinfo *)(t_datasec + 1);
+
+	/* We don't proceed to check the kinds of the elements of the DATASEC.
+	 * The verifier enforce then to be BTF_KIND_VAR.
+	 */
+
+	if (json_output) {
+		struct btf_dumper d = {
+			.btf = btf,
+			.jw = json_wtr,
+			.is_plain_text = false,
+		};
+
+		jsonw_name(json_wtr, "metadata");
+
+		jsonw_start_object(json_wtr);
+		for (i = 0; i < vlen; i++) {
+			t_var = btf__type_by_id(btf, vsi[i].type);
+
+			jsonw_name(json_wtr, btf__name_by_offset(btf, t_var->name_off));
+			err = btf_dumper_type(&d, t_var->type, value + vsi[i].offset);
+			if (err) {
+				p_err("btf dump failed");
+				break;
+			}
+		}
+		jsonw_end_object(json_wtr);
+	} else {
+		json_writer_t *btf_wtr = jsonw_new(stdout);
+		struct btf_dumper d = {
+			.btf = btf,
+			.jw = btf_wtr,
+			.is_plain_text = true,
+		};
+		if (!btf_wtr) {
+			p_err("jsonw alloc failed");
+			goto out_free;
+		}
+
+		printf("\tmetadata:");
+
+		for (i = 0; i < vlen; i++) {
+			t_var = btf__type_by_id(btf, vsi[i].type);
+
+			printf("\n\t\t%s = ", btf__name_by_offset(btf, t_var->name_off));
+
+			jsonw_reset(btf_wtr);
+			err = btf_dumper_type(&d, t_var->type, value + vsi[i].offset);
+			if (err) {
+				p_err("btf dump failed");
+				break;
+			}
+		}
+
+		jsonw_destroy(&btf_wtr);
+	}
+
+out_free:
+	btf__free(btf);
+	free(value);
+
+out_close:
+	close(map_fd);
+}
+
 static void print_prog_header_json(struct bpf_prog_info *info)
 {
 	jsonw_uint_field(json_wtr, "id", info->id);
@@ -228,6 +352,9 @@ static void print_prog_json(struct bpf_prog_info *info, int fd)
 
 	emit_obj_refs_json(&refs_table, info->id, json_wtr);
 
+	if (dump_metadata)
+		show_prog_metadata(fd, info->nr_map_ids);
+
 	jsonw_end_object(json_wtr);
 }
 
@@ -297,6 +424,9 @@ static void print_prog_plain(struct bpf_prog_info *info, int fd)
 	emit_obj_refs_plain(&refs_table, info->id, "\n\tpids ");
 
 	printf("\n");
+
+	if (dump_metadata)
+		show_prog_metadata(fd, info->nr_map_ids);
 }
 
 static int show_prog(int fd)
-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH bpf-next v3 6/8] bpftool: support metadata internal map in gen skeleton
  2020-08-28 19:35 [PATCH bpf-next v3 0/8] Allow storage of flexible metadata information for eBPF programs Stanislav Fomichev
                   ` (4 preceding siblings ...)
  2020-08-28 19:36 ` [PATCH bpf-next v3 5/8] bpftool: support dumping metadata Stanislav Fomichev
@ 2020-08-28 19:36 ` Stanislav Fomichev
  2020-08-28 19:36 ` [PATCH bpf-next v3 7/8] bpftool: mention --metadata in the documentation Stanislav Fomichev
  2020-08-28 19:36 ` [PATCH bpf-next v3 8/8] selftests/bpf: Test load and dump metadata with btftool and skel Stanislav Fomichev
  7 siblings, 0 replies; 31+ messages in thread
From: Stanislav Fomichev @ 2020-08-28 19:36 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, YiFei Zhu, YiFei Zhu, Stanislav Fomichev

From: YiFei Zhu <zhuyifei@google.com>

The treatment of this section .metadata is exactly like .rodata,
the type modifiers are stripped. The resulting skeleton looks like:

  struct skel_with_metadata {
  	struct bpf_object_skeleton *skeleton;
  	struct bpf_object *obj;
  	struct {
  		struct bpf_map *metadata;
  	} maps;
  	[...]
  	struct skel_with_metadata__metadata {
  		char metadata_a[4];
  		int metadata_b;
  	} *metadata;
  };

Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/bpf/bpftool/gen.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
index 4033c46d83e7..9c316ea23ca1 100644
--- a/tools/bpf/bpftool/gen.c
+++ b/tools/bpf/bpftool/gen.c
@@ -82,6 +82,8 @@ static const char *get_map_ident(const struct bpf_map *map)
 		return "bss";
 	else if (str_has_suffix(name, ".kconfig"))
 		return "kconfig";
+	else if (str_has_suffix(name, ".metadata"))
+		return "metadata";
 	else
 		return NULL;
 }
@@ -113,6 +115,9 @@ static int codegen_datasec_def(struct bpf_object *obj,
 		strip_mods = true;
 	} else if (strcmp(sec_name, ".kconfig") == 0) {
 		sec_ident = "kconfig";
+	} else if (strcmp(sec_name, ".metadata") == 0) {
+		sec_ident = "metadata";
+		strip_mods = true;
 	} else {
 		return 0;
 	}
-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH bpf-next v3 7/8] bpftool: mention --metadata in the documentation
  2020-08-28 19:35 [PATCH bpf-next v3 0/8] Allow storage of flexible metadata information for eBPF programs Stanislav Fomichev
                   ` (5 preceding siblings ...)
  2020-08-28 19:36 ` [PATCH bpf-next v3 6/8] bpftool: support metadata internal map in gen skeleton Stanislav Fomichev
@ 2020-08-28 19:36 ` Stanislav Fomichev
  2020-08-28 19:36 ` [PATCH bpf-next v3 8/8] selftests/bpf: Test load and dump metadata with btftool and skel Stanislav Fomichev
  7 siblings, 0 replies; 31+ messages in thread
From: Stanislav Fomichev @ 2020-08-28 19:36 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, Stanislav Fomichev, YiFei Zhu

Mention --metadata in the rst documentation and in the prog.c
help.

Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/bpf/bpftool/Documentation/bpftool-prog.rst | 5 ++++-
 tools/bpf/bpftool/prog.c                         | 2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-prog.rst b/tools/bpf/bpftool/Documentation/bpftool-prog.rst
index 82e356b664e8..84dc47e18016 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-prog.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-prog.rst
@@ -12,7 +12,7 @@ SYNOPSIS
 
 	**bpftool** [*OPTIONS*] **prog** *COMMAND*
 
-	*OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-f** | **--bpffs** } }
+	*OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-f** | **--bpffs** } | {**--metadata**} }
 
 	*COMMANDS* :=
 	{ **show** | **list** | **dump xlated** | **dump jited** | **pin** | **load**
@@ -80,6 +80,9 @@ DESCRIPTION
 		  programs. On such kernels bpftool will automatically emit this
 		  information as well.
 
+		  You can specify **--metadata** option to pretty-print
+		  read-only data from the associated *.metadata* section.
+
 	**bpftool prog dump xlated** *PROG* [{ **file** *FILE* | **opcodes** | **visual** | **linum** }]
 		  Dump eBPF instructions of the programs from the kernel. By
 		  default, eBPF will be disassembled and printed to standard
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 5d626c134e7d..4c129d6d2a0c 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -2005,7 +2005,7 @@ static int do_help(int argc, char **argv)
 	}
 
 	fprintf(stderr,
-		"Usage: %1$s %2$s { show | list } [PROG]\n"
+		"Usage: %1$s %2$s { show | list } [PROG] [--metadata]\n"
 		"       %1$s %2$s dump xlated PROG [{ file FILE | opcodes | visual | linum }]\n"
 		"       %1$s %2$s dump jited  PROG [{ file FILE | opcodes | linum }]\n"
 		"       %1$s %2$s pin   PROG FILE\n"
-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH bpf-next v3 8/8] selftests/bpf: Test load and dump metadata with btftool and skel
  2020-08-28 19:35 [PATCH bpf-next v3 0/8] Allow storage of flexible metadata information for eBPF programs Stanislav Fomichev
                   ` (6 preceding siblings ...)
  2020-08-28 19:36 ` [PATCH bpf-next v3 7/8] bpftool: mention --metadata in the documentation Stanislav Fomichev
@ 2020-08-28 19:36 ` Stanislav Fomichev
  7 siblings, 0 replies; 31+ messages in thread
From: Stanislav Fomichev @ 2020-08-28 19:36 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, YiFei Zhu, YiFei Zhu, Stanislav Fomichev

From: YiFei Zhu <zhuyifei@google.com>

This is a simple test to check that loading and dumping metadata
in btftool works, whether or not metadata contents are used by the
program.

A C test is also added to make sure the skeleton code can read the
metadata values, and we also check that trying to re-bind the map
causes EEXIST, so we are sure the map is already bound by libbpf
when loading skeleton.

Cc: YiFei Zhu <zhuyifei1999@gmail.com>
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/Makefile          |  3 +-
 .../selftests/bpf/prog_tests/metadata.c       | 83 +++++++++++++++++++
 .../selftests/bpf/progs/metadata_unused.c     | 15 ++++
 .../selftests/bpf/progs/metadata_used.c       | 15 ++++
 .../selftests/bpf/test_bpftool_metadata.sh    | 82 ++++++++++++++++++
 5 files changed, 197 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/metadata.c
 create mode 100644 tools/testing/selftests/bpf/progs/metadata_unused.c
 create mode 100644 tools/testing/selftests/bpf/progs/metadata_used.c
 create mode 100755 tools/testing/selftests/bpf/test_bpftool_metadata.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 09657d0afb5c..f7fd1503d210 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -68,7 +68,8 @@ TEST_PROGS := test_kmod.sh \
 	test_tc_edt.sh \
 	test_xdping.sh \
 	test_bpftool_build.sh \
-	test_bpftool.sh
+	test_bpftool.sh \
+	test_bpftool_metadata.sh \
 
 TEST_PROGS_EXTENDED := with_addr.sh \
 	with_tunnels.sh \
diff --git a/tools/testing/selftests/bpf/prog_tests/metadata.c b/tools/testing/selftests/bpf/prog_tests/metadata.c
new file mode 100644
index 000000000000..086a601a3f74
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/metadata.c
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright 2020 Google LLC.
+ */
+
+#include <test_progs.h>
+#include <cgroup_helpers.h>
+#include <network_helpers.h>
+
+#include "metadata_unused.skel.h"
+#include "metadata_used.skel.h"
+
+static int duration;
+
+static void test_metadata_unused(void)
+{
+	struct metadata_unused *obj;
+	int err;
+
+	obj = metadata_unused__open_and_load();
+	if (CHECK(!obj, "skel-load", "errno %d", errno))
+		return;
+
+	/* Assert that we can access the metadata in skel and the values are
+	 * what we expect.
+	 */
+	if (CHECK(strncmp(obj->metadata->metadata_a, "foo",
+			  sizeof(obj->metadata->metadata_a)),
+		  "metadata_a", "expected \"foo\", value differ"))
+		goto close_bpf_object;
+	if (CHECK(obj->metadata->metadata_b != 1, "metadata_b",
+		  "expected 1, got %d", obj->metadata->metadata_b))
+		goto close_bpf_object;
+
+	/* Assert that binding metadata map to prog again results in EEXIST. */
+	err = bpf_prog_bind_map(bpf_program__fd(obj->progs.prog),
+				bpf_map__fd(obj->maps.metadata), NULL);
+	CHECK(!err || errno != EEXIST, "rebind_map",
+	      "errno %d, expected EEXIST", errno);
+
+close_bpf_object:
+	metadata_unused__destroy(obj);
+}
+
+static void test_metadata_used(void)
+{
+	struct metadata_used *obj;
+	int err;
+
+	obj = metadata_used__open_and_load();
+	if (CHECK(!obj, "skel-load", "errno %d", errno))
+		return;
+
+	/* Assert that we can access the metadata in skel and the values are
+	 * what we expect.
+	 */
+	if (CHECK(strncmp(obj->metadata->metadata_a, "bar",
+			  sizeof(obj->metadata->metadata_a)),
+		  "metadata_a", "expected \"bar\", value differ"))
+		goto close_bpf_object;
+	if (CHECK(obj->metadata->metadata_b != 2, "metadata_b",
+		  "expected 2, got %d", obj->metadata->metadata_b))
+		goto close_bpf_object;
+
+	/* Assert that binding metadata map to prog again results in EEXIST. */
+	err = bpf_prog_bind_map(bpf_program__fd(obj->progs.prog),
+				bpf_map__fd(obj->maps.metadata), NULL);
+	CHECK(!err || errno != EEXIST, "rebind_map",
+	      "errno %d, expected EEXIST", errno);
+
+close_bpf_object:
+	metadata_used__destroy(obj);
+}
+
+void test_metadata(void)
+{
+	if (test__start_subtest("unused"))
+		test_metadata_unused();
+
+	if (test__start_subtest("used"))
+		test_metadata_used();
+}
diff --git a/tools/testing/selftests/bpf/progs/metadata_unused.c b/tools/testing/selftests/bpf/progs/metadata_unused.c
new file mode 100644
index 000000000000..523b3c332426
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/metadata_unused.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+char metadata_a[] SEC(".metadata") = "foo";
+int metadata_b SEC(".metadata") = 1;
+
+SEC("cgroup_skb/egress")
+int prog(struct xdp_md *ctx)
+{
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/metadata_used.c b/tools/testing/selftests/bpf/progs/metadata_used.c
new file mode 100644
index 000000000000..59785404f7bb
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/metadata_used.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+char metadata_a[] SEC(".metadata") = "bar";
+int metadata_b SEC(".metadata") = 2;
+
+SEC("cgroup_skb/egress")
+int prog(struct xdp_md *ctx)
+{
+	return metadata_b ? 1 : 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_bpftool_metadata.sh b/tools/testing/selftests/bpf/test_bpftool_metadata.sh
new file mode 100755
index 000000000000..a7515c09dc2d
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_bpftool_metadata.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+TESTNAME=bpftool_metadata
+BPF_FS=$(awk '$3 == "bpf" {print $2; exit}' /proc/mounts)
+BPF_DIR=$BPF_FS/test_$TESTNAME
+
+_cleanup()
+{
+	set +e
+	rm -rf $BPF_DIR 2> /dev/null
+}
+
+cleanup_skip()
+{
+	echo "selftests: $TESTNAME [SKIP]"
+	_cleanup
+
+	exit $ksft_skip
+}
+
+cleanup()
+{
+	if [ "$?" = 0 ]; then
+		echo "selftests: $TESTNAME [PASS]"
+	else
+		echo "selftests: $TESTNAME [FAILED]"
+	fi
+	_cleanup
+}
+
+if [ $(id -u) -ne 0 ]; then
+	echo "selftests: $TESTNAME [SKIP] Need root privileges"
+	exit $ksft_skip
+fi
+
+if [ -z "$BPF_FS" ]; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without bpffs mounted"
+	exit $ksft_skip
+fi
+
+if ! bpftool version > /dev/null 2>&1; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without bpftool"
+	exit $ksft_skip
+fi
+
+set -e
+
+trap cleanup_skip EXIT
+
+mkdir $BPF_DIR
+
+trap cleanup EXIT
+
+bpftool prog load metadata_unused.o $BPF_DIR/unused
+
+METADATA_PLAIN="$(bpftool prog --metadata)"
+echo "$METADATA_PLAIN" | grep 'metadata_a = "foo"' > /dev/null
+echo "$METADATA_PLAIN" | grep 'metadata_b = 1' > /dev/null
+
+bpftool prog --metadata --json | grep '"metadata":{"metadata_a":"foo","metadata_b":1}' > /dev/null
+
+bpftool map | grep 'metada.metadata' > /dev/null
+
+rm $BPF_DIR/unused
+
+bpftool prog load metadata_used.o $BPF_DIR/used
+
+METADATA_PLAIN="$(bpftool prog --metadata)"
+echo "$METADATA_PLAIN" | grep 'metadata_a = "bar"' > /dev/null
+echo "$METADATA_PLAIN" | grep 'metadata_b = 2' > /dev/null
+
+bpftool prog --metadata --json | grep '"metadata":{"metadata_a":"bar","metadata_b":2}' > /dev/null
+
+bpftool map | grep 'metada.metadata' > /dev/null
+
+rm $BPF_DIR/used
+
+exit 0
-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata
  2020-08-28 19:35 ` [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata Stanislav Fomichev
@ 2020-08-28 21:10   ` Toke Høiland-Jørgensen
  2020-08-31 15:40     ` sdf
  0 siblings, 1 reply; 31+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-08-28 21:10 UTC (permalink / raw)
  To: Stanislav Fomichev, netdev, bpf
  Cc: davem, ast, daniel, Stanislav Fomichev, YiFei Zhu

Stanislav Fomichev <sdf@google.com> writes:

> This is a low-level function (hence in bpf.c) to find out the metadata
> map id for the provided program fd.
> It will be used in the next commits from bpftool.
>
> Cc: Toke Høiland-Jørgensen <toke@redhat.com>
> Cc: YiFei Zhu <zhuyifei1999@gmail.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  tools/lib/bpf/bpf.c      | 74 ++++++++++++++++++++++++++++++++++++++++
>  tools/lib/bpf/bpf.h      |  1 +
>  tools/lib/bpf/libbpf.map |  1 +
>  3 files changed, 76 insertions(+)
>
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 5f6c5676cc45..01c0ede1625d 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -885,3 +885,77 @@ int bpf_prog_bind_map(int prog_fd, int map_fd,
>  
>  	return sys_bpf(BPF_PROG_BIND_MAP, &attr, sizeof(attr));
>  }
> +
> +int bpf_prog_find_metadata(int prog_fd)
> +{
> +	struct bpf_prog_info prog_info = {};
> +	struct bpf_map_info map_info;
> +	__u32 prog_info_len;
> +	__u32 map_info_len;
> +	int saved_errno;
> +	__u32 *map_ids;
> +	int nr_maps;
> +	int map_fd;
> +	int ret;
> +	int i;
> +
> +	prog_info_len = sizeof(prog_info);
> +
> +	ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len);
> +	if (ret)
> +		return ret;
> +
> +	if (!prog_info.nr_map_ids)
> +		return -1;
> +
> +	map_ids = calloc(prog_info.nr_map_ids, sizeof(__u32));
> +	if (!map_ids)
> +		return -1;
> +
> +	nr_maps = prog_info.nr_map_ids;
> +	memset(&prog_info, 0, sizeof(prog_info));
> +	prog_info.nr_map_ids = nr_maps;
> +	prog_info.map_ids = ptr_to_u64(map_ids);
> +	prog_info_len = sizeof(prog_info);
> +
> +	ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len);
> +	if (ret)
> +		goto free_map_ids;
> +
> +	ret = -1;
> +	for (i = 0; i < prog_info.nr_map_ids; i++) {
> +		map_fd = bpf_map_get_fd_by_id(map_ids[i]);
> +		if (map_fd < 0) {
> +			ret = -1;
> +			goto free_map_ids;
> +		}
> +
> +		memset(&map_info, 0, sizeof(map_info));
> +		map_info_len = sizeof(map_info);
> +		ret = bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len);
> +		saved_errno = errno;
> +		close(map_fd);
> +		errno = saved_errno;
> +		if (ret)
> +			goto free_map_ids;

If you get to this point on the last entry in the loop, ret will be 0,
and any of the continue statements below will end the loop, causing the
whole function to return 0. While this is not technically a valid ID, it
still seems odd that the function returns -1 on all error conditions
except this one.

Also, it would be good to be able to unambiguously distinguish between
"this program has no metadata associated" and "something went wrong
while querying the kernel for metadata (e.g., permission error)". So
something that amounts to a -ENOENT return; I guess turning all return
values into negative error codes would do that (and also do away with
the need for the saved_errno dance above), but id does clash a bit with
the convention in the rest of the file (where all the other functions
just return -1 and set errno)...

-Toke


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata
  2020-08-28 21:10   ` Toke Høiland-Jørgensen
@ 2020-08-31 15:40     ` sdf
  2020-09-01 22:58       ` Alexei Starovoitov
  0 siblings, 1 reply; 31+ messages in thread
From: sdf @ 2020-08-31 15:40 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: netdev, bpf, davem, ast, daniel, YiFei Zhu

On 08/28, Toke H�iland-J�rgensen wrote:
> Stanislav Fomichev <sdf@google.com> writes:

> > This is a low-level function (hence in bpf.c) to find out the metadata
> > map id for the provided program fd.
> > It will be used in the next commits from bpftool.
> >
> > Cc: Toke H�iland-J�rgensen <toke@redhat.com>
> > Cc: YiFei Zhu <zhuyifei1999@gmail.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  tools/lib/bpf/bpf.c      | 74 ++++++++++++++++++++++++++++++++++++++++
> >  tools/lib/bpf/bpf.h      |  1 +
> >  tools/lib/bpf/libbpf.map |  1 +
> >  3 files changed, 76 insertions(+)
> >
> > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > index 5f6c5676cc45..01c0ede1625d 100644
> > --- a/tools/lib/bpf/bpf.c
> > +++ b/tools/lib/bpf/bpf.c
> > @@ -885,3 +885,77 @@ int bpf_prog_bind_map(int prog_fd, int map_fd,
> >
> >  	return sys_bpf(BPF_PROG_BIND_MAP, &attr, sizeof(attr));
> >  }
> > +
> > +int bpf_prog_find_metadata(int prog_fd)
> > +{
> > +	struct bpf_prog_info prog_info = {};
> > +	struct bpf_map_info map_info;
> > +	__u32 prog_info_len;
> > +	__u32 map_info_len;
> > +	int saved_errno;
> > +	__u32 *map_ids;
> > +	int nr_maps;
> > +	int map_fd;
> > +	int ret;
> > +	int i;
> > +
> > +	prog_info_len = sizeof(prog_info);
> > +
> > +	ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (!prog_info.nr_map_ids)
> > +		return -1;
> > +
> > +	map_ids = calloc(prog_info.nr_map_ids, sizeof(__u32));
> > +	if (!map_ids)
> > +		return -1;
> > +
> > +	nr_maps = prog_info.nr_map_ids;
> > +	memset(&prog_info, 0, sizeof(prog_info));
> > +	prog_info.nr_map_ids = nr_maps;
> > +	prog_info.map_ids = ptr_to_u64(map_ids);
> > +	prog_info_len = sizeof(prog_info);
> > +
> > +	ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len);
> > +	if (ret)
> > +		goto free_map_ids;
> > +
> > +	ret = -1;
> > +	for (i = 0; i < prog_info.nr_map_ids; i++) {
> > +		map_fd = bpf_map_get_fd_by_id(map_ids[i]);
> > +		if (map_fd < 0) {
> > +			ret = -1;
> > +			goto free_map_ids;
> > +		}
> > +
> > +		memset(&map_info, 0, sizeof(map_info));
> > +		map_info_len = sizeof(map_info);
> > +		ret = bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len);
> > +		saved_errno = errno;
> > +		close(map_fd);
> > +		errno = saved_errno;
> > +		if (ret)
> > +			goto free_map_ids;

> If you get to this point on the last entry in the loop, ret will be 0,
> and any of the continue statements below will end the loop, causing the
> whole function to return 0. While this is not technically a valid ID, it
> still seems odd that the function returns -1 on all error conditions
> except this one.

> Also, it would be good to be able to unambiguously distinguish between
> "this program has no metadata associated" and "something went wrong
> while querying the kernel for metadata (e.g., permission error)". So
> something that amounts to a -ENOENT return; I guess turning all return
> values into negative error codes would do that (and also do away with
> the need for the saved_errno dance above), but id does clash a bit with
> the convention in the rest of the file (where all the other functions
> just return -1 and set errno)...
Good point. I think I can change the function signature to:

	int bpf_prog_find_metadata(int prog_fd, int *map_id)

And explicitly return map_id via argument. Then the ret can be used as
-1/0 error and I can set errno appropriately where it makes sense.
This will better match the convention we have in this file.

Will respin v4 later this week to give people opportunity to look at the
other patches in the series. Thanks!

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata
  2020-08-31 15:40     ` sdf
@ 2020-09-01 22:58       ` Alexei Starovoitov
  2020-09-02  9:43         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 31+ messages in thread
From: Alexei Starovoitov @ 2020-09-01 22:58 UTC (permalink / raw)
  To: sdf
  Cc: Toke Høiland-Jørgensen, netdev, bpf, davem, ast,
	daniel, YiFei Zhu, andriin

On Mon, Aug 31, 2020 at 08:40:01AM -0700, sdf@google.com wrote:
> On 08/28, Toke H�iland-J�rgensen wrote:
> > Stanislav Fomichev <sdf@google.com> writes:
> 
> > > This is a low-level function (hence in bpf.c) to find out the metadata
> > > map id for the provided program fd.
> > > It will be used in the next commits from bpftool.
> > >
> > > Cc: Toke H�iland-J�rgensen <toke@redhat.com>
> > > Cc: YiFei Zhu <zhuyifei1999@gmail.com>
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  tools/lib/bpf/bpf.c      | 74 ++++++++++++++++++++++++++++++++++++++++
> > >  tools/lib/bpf/bpf.h      |  1 +
> > >  tools/lib/bpf/libbpf.map |  1 +
> > >  3 files changed, 76 insertions(+)
> > >
> > > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > > index 5f6c5676cc45..01c0ede1625d 100644
> > > --- a/tools/lib/bpf/bpf.c
> > > +++ b/tools/lib/bpf/bpf.c
> > > @@ -885,3 +885,77 @@ int bpf_prog_bind_map(int prog_fd, int map_fd,
> > >
> > >  	return sys_bpf(BPF_PROG_BIND_MAP, &attr, sizeof(attr));
> > >  }
> > > +
> > > +int bpf_prog_find_metadata(int prog_fd)
> > > +{
> > > +	struct bpf_prog_info prog_info = {};
> > > +	struct bpf_map_info map_info;
> > > +	__u32 prog_info_len;
> > > +	__u32 map_info_len;
> > > +	int saved_errno;
> > > +	__u32 *map_ids;
> > > +	int nr_maps;
> > > +	int map_fd;
> > > +	int ret;
> > > +	int i;
> > > +
> > > +	prog_info_len = sizeof(prog_info);
> > > +
> > > +	ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	if (!prog_info.nr_map_ids)
> > > +		return -1;
> > > +
> > > +	map_ids = calloc(prog_info.nr_map_ids, sizeof(__u32));
> > > +	if (!map_ids)
> > > +		return -1;
> > > +
> > > +	nr_maps = prog_info.nr_map_ids;
> > > +	memset(&prog_info, 0, sizeof(prog_info));
> > > +	prog_info.nr_map_ids = nr_maps;
> > > +	prog_info.map_ids = ptr_to_u64(map_ids);
> > > +	prog_info_len = sizeof(prog_info);
> > > +
> > > +	ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len);
> > > +	if (ret)
> > > +		goto free_map_ids;
> > > +
> > > +	ret = -1;
> > > +	for (i = 0; i < prog_info.nr_map_ids; i++) {
> > > +		map_fd = bpf_map_get_fd_by_id(map_ids[i]);
> > > +		if (map_fd < 0) {
> > > +			ret = -1;
> > > +			goto free_map_ids;
> > > +		}
> > > +
> > > +		memset(&map_info, 0, sizeof(map_info));
> > > +		map_info_len = sizeof(map_info);
> > > +		ret = bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len);
> > > +		saved_errno = errno;
> > > +		close(map_fd);
> > > +		errno = saved_errno;
> > > +		if (ret)
> > > +			goto free_map_ids;
> 
> > If you get to this point on the last entry in the loop, ret will be 0,
> > and any of the continue statements below will end the loop, causing the
> > whole function to return 0. While this is not technically a valid ID, it
> > still seems odd that the function returns -1 on all error conditions
> > except this one.
> 
> > Also, it would be good to be able to unambiguously distinguish between
> > "this program has no metadata associated" and "something went wrong
> > while querying the kernel for metadata (e.g., permission error)". So
> > something that amounts to a -ENOENT return; I guess turning all return
> > values into negative error codes would do that (and also do away with
> > the need for the saved_errno dance above), but id does clash a bit with
> > the convention in the rest of the file (where all the other functions
> > just return -1 and set errno)...
> Good point. I think I can change the function signature to:
> 
> 	int bpf_prog_find_metadata(int prog_fd, int *map_id)
> 
> And explicitly return map_id via argument. Then the ret can be used as
> -1/0 error and I can set errno appropriately where it makes sense.
> This will better match the convention we have in this file.

I don't feel great about this libbpf api. bpftool already does
bpf_obj_get_info_by_fd() for progs and for maps.
This extra step and extra set of syscalls is redundant work.
I think it's better to be done as part of bpftool.
It doesn't quite fit as generic api.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata
  2020-09-01 22:58       ` Alexei Starovoitov
@ 2020-09-02  9:43         ` Toke Høiland-Jørgensen
  2020-09-02 21:08           ` Alexei Starovoitov
  0 siblings, 1 reply; 31+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-09-02  9:43 UTC (permalink / raw)
  To: Alexei Starovoitov, sdf
  Cc: netdev, bpf, davem, ast, daniel, YiFei Zhu, andriin

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Mon, Aug 31, 2020 at 08:40:01AM -0700, sdf@google.com wrote:
>> On 08/28, Toke H�iland-J�rgensen wrote:
>> > Stanislav Fomichev <sdf@google.com> writes:
>> 
>> > > This is a low-level function (hence in bpf.c) to find out the metadata
>> > > map id for the provided program fd.
>> > > It will be used in the next commits from bpftool.
>> > >
>> > > Cc: Toke H�iland-J�rgensen <toke@redhat.com>
>> > > Cc: YiFei Zhu <zhuyifei1999@gmail.com>
>> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
>> > > ---
>> > >  tools/lib/bpf/bpf.c      | 74 ++++++++++++++++++++++++++++++++++++++++
>> > >  tools/lib/bpf/bpf.h      |  1 +
>> > >  tools/lib/bpf/libbpf.map |  1 +
>> > >  3 files changed, 76 insertions(+)
>> > >
>> > > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
>> > > index 5f6c5676cc45..01c0ede1625d 100644
>> > > --- a/tools/lib/bpf/bpf.c
>> > > +++ b/tools/lib/bpf/bpf.c
>> > > @@ -885,3 +885,77 @@ int bpf_prog_bind_map(int prog_fd, int map_fd,
>> > >
>> > >  	return sys_bpf(BPF_PROG_BIND_MAP, &attr, sizeof(attr));
>> > >  }
>> > > +
>> > > +int bpf_prog_find_metadata(int prog_fd)
>> > > +{
>> > > +	struct bpf_prog_info prog_info = {};
>> > > +	struct bpf_map_info map_info;
>> > > +	__u32 prog_info_len;
>> > > +	__u32 map_info_len;
>> > > +	int saved_errno;
>> > > +	__u32 *map_ids;
>> > > +	int nr_maps;
>> > > +	int map_fd;
>> > > +	int ret;
>> > > +	int i;
>> > > +
>> > > +	prog_info_len = sizeof(prog_info);
>> > > +
>> > > +	ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len);
>> > > +	if (ret)
>> > > +		return ret;
>> > > +
>> > > +	if (!prog_info.nr_map_ids)
>> > > +		return -1;
>> > > +
>> > > +	map_ids = calloc(prog_info.nr_map_ids, sizeof(__u32));
>> > > +	if (!map_ids)
>> > > +		return -1;
>> > > +
>> > > +	nr_maps = prog_info.nr_map_ids;
>> > > +	memset(&prog_info, 0, sizeof(prog_info));
>> > > +	prog_info.nr_map_ids = nr_maps;
>> > > +	prog_info.map_ids = ptr_to_u64(map_ids);
>> > > +	prog_info_len = sizeof(prog_info);
>> > > +
>> > > +	ret = bpf_obj_get_info_by_fd(prog_fd, &prog_info, &prog_info_len);
>> > > +	if (ret)
>> > > +		goto free_map_ids;
>> > > +
>> > > +	ret = -1;
>> > > +	for (i = 0; i < prog_info.nr_map_ids; i++) {
>> > > +		map_fd = bpf_map_get_fd_by_id(map_ids[i]);
>> > > +		if (map_fd < 0) {
>> > > +			ret = -1;
>> > > +			goto free_map_ids;
>> > > +		}
>> > > +
>> > > +		memset(&map_info, 0, sizeof(map_info));
>> > > +		map_info_len = sizeof(map_info);
>> > > +		ret = bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len);
>> > > +		saved_errno = errno;
>> > > +		close(map_fd);
>> > > +		errno = saved_errno;
>> > > +		if (ret)
>> > > +			goto free_map_ids;
>> 
>> > If you get to this point on the last entry in the loop, ret will be 0,
>> > and any of the continue statements below will end the loop, causing the
>> > whole function to return 0. While this is not technically a valid ID, it
>> > still seems odd that the function returns -1 on all error conditions
>> > except this one.
>> 
>> > Also, it would be good to be able to unambiguously distinguish between
>> > "this program has no metadata associated" and "something went wrong
>> > while querying the kernel for metadata (e.g., permission error)". So
>> > something that amounts to a -ENOENT return; I guess turning all return
>> > values into negative error codes would do that (and also do away with
>> > the need for the saved_errno dance above), but id does clash a bit with
>> > the convention in the rest of the file (where all the other functions
>> > just return -1 and set errno)...
>> Good point. I think I can change the function signature to:
>> 
>> 	int bpf_prog_find_metadata(int prog_fd, int *map_id)
>> 
>> And explicitly return map_id via argument. Then the ret can be used as
>> -1/0 error and I can set errno appropriately where it makes sense.
>> This will better match the convention we have in this file.
>
> I don't feel great about this libbpf api. bpftool already does
> bpf_obj_get_info_by_fd() for progs and for maps.
> This extra step and extra set of syscalls is redundant work.
> I think it's better to be done as part of bpftool.
> It doesn't quite fit as generic api.

Why not? We are establishing a convention for how to store (and read)
metadata from a program; by having an API to get this, we make sure that
every application that wants to access this metadata agrees on how to do
so. If we don't have it, people will have to go look at bpftool code,
and we'll end up with copied code snippets, which seems less than ideal.

-Toke


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata
  2020-09-02  9:43         ` Toke Høiland-Jørgensen
@ 2020-09-02 21:08           ` Alexei Starovoitov
  2020-09-02 21:33             ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 31+ messages in thread
From: Alexei Starovoitov @ 2020-09-02 21:08 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: sdf, netdev, bpf, davem, ast, daniel, YiFei Zhu, andriin

On Wed, Sep 02, 2020 at 11:43:26AM +0200, Toke Høiland-Jørgensen wrote:
> >
> > I don't feel great about this libbpf api. bpftool already does
> > bpf_obj_get_info_by_fd() for progs and for maps.
> > This extra step and extra set of syscalls is redundant work.
> > I think it's better to be done as part of bpftool.
> > It doesn't quite fit as generic api.
> 
> Why not? 

It's a helper function on top of already provided api and implemented
in the most brute force and inefficient way.
bpftool implementation of the same will be more efficient.

> so. If we don't have it, people will have to go look at bpftool code,
> and we'll end up with copied code snippets, which seems less than ideal.

I'd like to see the real use case first before hypothesising.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata
  2020-09-02 21:08           ` Alexei Starovoitov
@ 2020-09-02 21:33             ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 31+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-09-02 21:33 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: sdf, netdev, bpf, davem, ast, daniel, YiFei Zhu, andriin

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Wed, Sep 02, 2020 at 11:43:26AM +0200, Toke Høiland-Jørgensen wrote:
>> >
>> > I don't feel great about this libbpf api. bpftool already does
>> > bpf_obj_get_info_by_fd() for progs and for maps.
>> > This extra step and extra set of syscalls is redundant work.
>> > I think it's better to be done as part of bpftool.
>> > It doesn't quite fit as generic api.
>> 
>> Why not? 
>
> It's a helper function on top of already provided api and implemented
> in the most brute force and inefficient way.
> bpftool implementation of the same will be more efficient.

Right, certainly wouldn't mind something more efficient. But to me, the
inefficiency is outweighed by convenience of having this canonical
reference for 'this is the metadata map'.

>> so. If we don't have it, people will have to go look at bpftool code,
>> and we'll end up with copied code snippets, which seems less than ideal.
>
> I'd like to see the real use case first before hypothesising.

For me, that would be incorporating support for this into
libxdp/xdp-tools; which was the reason I asked for this to be split into
a separate API in the first place. But okay, not going to keep arguing
about this, I can copy-paste code as well as the next person.

-Toke


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 2/8] bpf: Add BPF_PROG_BIND_MAP syscall
  2020-08-28 19:35 ` [PATCH bpf-next v3 2/8] bpf: Add BPF_PROG_BIND_MAP syscall Stanislav Fomichev
@ 2020-09-03  2:15   ` Andrii Nakryiko
  0 siblings, 0 replies; 31+ messages in thread
From: Andrii Nakryiko @ 2020-09-03  2:15 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, YiFei Zhu, YiFei Zhu

On Fri, Aug 28, 2020 at 12:37 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> From: YiFei Zhu <zhuyifei@google.com>
>
> This syscall binds a map to a program. -EEXIST if the map is
> already bound to the program.
>
> Cc: YiFei Zhu <zhuyifei1999@gmail.com>
> Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/uapi/linux/bpf.h       |  7 ++++
>  kernel/bpf/syscall.c           | 65 ++++++++++++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h |  7 ++++
>  3 files changed, 79 insertions(+)
>

[...]

> +
> +       mutex_lock(&prog->aux->used_maps_mutex);
> +
> +       used_maps_old = prog->aux->used_maps;
> +
> +       for (i = 0; i < prog->aux->used_map_cnt; i++)
> +               if (used_maps_old[i] == map) {
> +                       ret = -EEXIST;
> +                       goto out_unlock;

Do we need to return any error in this case? The intent of this
command is to make sure the map is bound to the prog, right? If it's
already bound, good, it's a success, just no extra steps were
performed internally. What's the use for this EEXIST if it's always
going to be ignored?

> +               }
> +
> +       used_maps_new = kmalloc_array(prog->aux->used_map_cnt + 1,
> +                                     sizeof(used_maps_new[0]),
> +                                     GFP_KERNEL);
> +       if (!used_maps_new) {
> +               ret = -ENOMEM;
> +               goto out_unlock;
> +       }
> +

[...]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-08-28 19:35 ` [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section Stanislav Fomichev
@ 2020-09-03  2:31   ` Andrii Nakryiko
  2020-09-04  1:29     ` Alexei Starovoitov
  0 siblings, 1 reply; 31+ messages in thread
From: Andrii Nakryiko @ 2020-09-03  2:31 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, YiFei Zhu, YiFei Zhu

On Fri, Aug 28, 2020 at 12:37 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> From: YiFei Zhu <zhuyifei@google.com>
>
> The patch adds a simple wrapper bpf_prog_bind_map around the syscall.
> And when using libbpf to load a program, it will probe the kernel for
> the support of this syscall, and scan for the .metadata ELF section
> and load it as an internal map like a .data section.
>
> In the case that kernel supports the BPF_PROG_BIND_MAP syscall and
> a .metadata section exists, the map will be explicitly bound to
> the program via the syscall immediately after program is loaded.
> -EEXIST is ignored for this syscall.

Here is the question I have. How important is it that all this
metadata is in a separate map? What if libbpf just PROG_BIND_MAP all
the maps inside a single BPF .o file to all BPF programs in that file?
Including ARRAY maps created for .data, .rodata and .bss, even if the
BPF program doesn't use any of the global variables? If it's too
extreme, we could do it only for global data maps, leaving explicit
map definitions in SEC(".maps") alone. Would that be terrible?
Conceptually it makes sense, because when you program in user-space,
you expect global variables to be there, even if you don't reference
it directly, right? The only downside is that you won't have a special
".metadata" map, rather it will be part of ".rodata" one.


>
> Cc: YiFei Zhu <zhuyifei1999@gmail.com>
> Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  tools/lib/bpf/bpf.c      |  13 ++++
>  tools/lib/bpf/bpf.h      |   8 +++
>  tools/lib/bpf/libbpf.c   | 130 ++++++++++++++++++++++++++++++++-------
>  tools/lib/bpf/libbpf.map |   1 +
>  4 files changed, 131 insertions(+), 21 deletions(-)
>

[...]

> @@ -3592,18 +3619,13 @@ static int probe_kern_prog_name(void)
>         return probe_fd(ret);
>  }
>
> -static int probe_kern_global_data(void)
> +static void __probe_create_global_data(int *prog, int *map,
> +                                      struct bpf_insn *insns, size_t insns_cnt)

all those static functions are internal, no need for double underscore in names

>  {
>         struct bpf_load_program_attr prg_attr;
>         struct bpf_create_map_attr map_attr;
>         char *cp, errmsg[STRERR_BUFSIZE];
> -       struct bpf_insn insns[] = {
> -               BPF_LD_MAP_VALUE(BPF_REG_1, 0, 16),
> -               BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 42),
> -               BPF_MOV64_IMM(BPF_REG_0, 0),
> -               BPF_EXIT_INSN(),
> -       };
> -       int ret, map;
> +       int err;
>
>         memset(&map_attr, 0, sizeof(map_attr));
>         map_attr.map_type = BPF_MAP_TYPE_ARRAY;
> @@ -3611,26 +3633,40 @@ static int probe_kern_global_data(void)
>         map_attr.value_size = 32;
>         map_attr.max_entries = 1;
>
> -       map = bpf_create_map_xattr(&map_attr);
> -       if (map < 0) {
> -               ret = -errno;
> -               cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
> +       *map = bpf_create_map_xattr(&map_attr);
> +       if (*map < 0) {
> +               err = errno;
> +               cp = libbpf_strerror_r(err, errmsg, sizeof(errmsg));
>                 pr_warn("Error in %s():%s(%d). Couldn't create simple array map.\n",
> -                       __func__, cp, -ret);
> -               return ret;
> +                       __func__, cp, -err);
> +               return;
>         }
>
> -       insns[0].imm = map;
> +       insns[0].imm = *map;

you are making confusing and error prone assumptions about the first
instruction passed in, I really-really don't like this, super easy to
miss and super easy to screw up.

>
>         memset(&prg_attr, 0, sizeof(prg_attr));
>         prg_attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
>         prg_attr.insns = insns;
> -       prg_attr.insns_cnt = ARRAY_SIZE(insns);
> +       prg_attr.insns_cnt = insns_cnt;
>         prg_attr.license = "GPL";
>
> -       ret = bpf_load_program_xattr(&prg_attr, NULL, 0);
> +       *prog = bpf_load_program_xattr(&prg_attr, NULL, 0);
> +}
> +

[...]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 5/8] bpftool: support dumping metadata
  2020-08-28 19:36 ` [PATCH bpf-next v3 5/8] bpftool: support dumping metadata Stanislav Fomichev
@ 2020-09-03  5:00   ` Andrii Nakryiko
  2020-09-08 20:53     ` Stanislav Fomichev
  0 siblings, 1 reply; 31+ messages in thread
From: Andrii Nakryiko @ 2020-09-03  5:00 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, YiFei Zhu, YiFei Zhu

On Fri, Aug 28, 2020 at 12:37 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> From: YiFei Zhu <zhuyifei@google.com>
>
> Added a flag "--metadata" to `bpftool prog list` to dump the metadata
> contents. For some formatting some BTF code is put directly in the
> metadata dumping. Sanity checks on the map and the kind of the btf_type
> to make sure we are actually dumping what we are expecting.
>
> A helper jsonw_reset is added to json writer so we can reuse the same
> json writer without having extraneous commas.
>
> Sample output:
>
>   $ bpftool prog --metadata
>   6: cgroup_skb  name prog  tag bcf7977d3b93787c  gpl
>   [...]
>         btf_id 4
>         metadata:
>                 metadata_a = "foo"
>                 metadata_b = 1
>
>   $ bpftool prog --metadata --json --pretty
>   [{
>           "id": 6,
>   [...]
>           "btf_id": 4,
>           "metadata": {
>               "metadata_a": "foo",
>               "metadata_b": 1
>           }
>       }
>   ]
>
> Cc: YiFei Zhu <zhuyifei1999@gmail.com>
> Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  tools/bpf/bpftool/json_writer.c |   6 ++
>  tools/bpf/bpftool/json_writer.h |   3 +
>  tools/bpf/bpftool/main.c        |  10 +++
>  tools/bpf/bpftool/main.h        |   1 +
>  tools/bpf/bpftool/prog.c        | 130 ++++++++++++++++++++++++++++++++
>  5 files changed, 150 insertions(+)
>

[...]

> +
> +       if (bpf_map_lookup_elem(map_fd, &key, value)) {
> +               p_err("metadata map lookup failed: %s", strerror(errno));
> +               goto out_free;
> +       }
> +
> +       err = btf__get_from_id(map_info.btf_id, &btf);

what if the map has no btf_id associated (e.g., because of an old
kernel?); why fail in this case?

> +       if (err || !btf) {
> +               p_err("metadata BTF get failed: %s", strerror(-err));
> +               goto out_free;
> +       }
> +
> +       t_datasec = btf__type_by_id(btf, map_info.btf_value_type_id);
> +       if (BTF_INFO_KIND(t_datasec->info) != BTF_KIND_DATASEC) {

btf_is_datasec(t_datasec)

> +               p_err("bad metadata BTF");
> +               goto out_free;
> +       }
> +
> +       vlen = BTF_INFO_VLEN(t_datasec->info);

btf_vlen(t_datasec)

> +       vsi = (struct btf_var_secinfo *)(t_datasec + 1);

btf_var_secinfos(t_datasec)

> +
> +       /* We don't proceed to check the kinds of the elements of the DATASEC.
> +        * The verifier enforce then to be BTF_KIND_VAR.

typo: then -> them

> +        */
> +
> +       if (json_output) {
> +               struct btf_dumper d = {
> +                       .btf = btf,
> +                       .jw = json_wtr,
> +                       .is_plain_text = false,
> +               };
> +
> +               jsonw_name(json_wtr, "metadata");
> +
> +               jsonw_start_object(json_wtr);
> +               for (i = 0; i < vlen; i++) {

nit: doing ++vsi here


> +                       t_var = btf__type_by_id(btf, vsi[i].type);

and vsi->type here and below would look a bit cleaner

> +
> +                       jsonw_name(json_wtr, btf__name_by_offset(btf, t_var->name_off));
> +                       err = btf_dumper_type(&d, t_var->type, value + vsi[i].offset);
> +                       if (err) {
> +                               p_err("btf dump failed");
> +                               break;
> +                       }
> +               }
> +               jsonw_end_object(json_wtr);

[...]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-09-03  2:31   ` Andrii Nakryiko
@ 2020-09-04  1:29     ` Alexei Starovoitov
  2020-09-04 23:18       ` Andrii Nakryiko
  0 siblings, 1 reply; 31+ messages in thread
From: Alexei Starovoitov @ 2020-09-04  1:29 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, YiFei Zhu, YiFei Zhu

On Wed, Sep 02, 2020 at 07:31:33PM -0700, Andrii Nakryiko wrote:
> On Fri, Aug 28, 2020 at 12:37 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > From: YiFei Zhu <zhuyifei@google.com>
> >
> > The patch adds a simple wrapper bpf_prog_bind_map around the syscall.
> > And when using libbpf to load a program, it will probe the kernel for
> > the support of this syscall, and scan for the .metadata ELF section
> > and load it as an internal map like a .data section.
> >
> > In the case that kernel supports the BPF_PROG_BIND_MAP syscall and
> > a .metadata section exists, the map will be explicitly bound to
> > the program via the syscall immediately after program is loaded.
> > -EEXIST is ignored for this syscall.
> 
> Here is the question I have. How important is it that all this
> metadata is in a separate map? What if libbpf just PROG_BIND_MAP all
> the maps inside a single BPF .o file to all BPF programs in that file?
> Including ARRAY maps created for .data, .rodata and .bss, even if the
> BPF program doesn't use any of the global variables? If it's too
> extreme, we could do it only for global data maps, leaving explicit
> map definitions in SEC(".maps") alone. Would that be terrible?
> Conceptually it makes sense, because when you program in user-space,
> you expect global variables to be there, even if you don't reference
> it directly, right? The only downside is that you won't have a special
> ".metadata" map, rather it will be part of ".rodata" one.

That's an interesting idea.
Indeed. If we have BPF_PROG_BIND_MAP command why do we need to create
another map that behaves exactly like .rodata but has a different name?
Wouldn't it be better to identify metadata elements some other way?
Like by common prefix/suffix name of the variables or
via grouping them under one structure with standard prefix?
Like:
struct bpf_prog_metadata_blahblah {
  char compiler_name[];
  int my_internal_prog_version;
} = { .compiler_name[] = "clang v.12", ...};

In the past we did this hack for 'version' and for 'license',
but we did it because we didn't have BTF and there was no other way
to determine the boundaries.
I think libbpf can and should support multiple rodata sections with
arbitrary names, but hardcoding one specific ".metadata" name?
Hmm. Let's think through the implications.
Multiple .o support and static linking is coming soon.
When two .o-s with multiple bpf progs are statically linked libbpf
won't have any choice but to merge them together under single
".metadata" section and single map that will be BPF_PROG_BIND_MAP-ed
to different progs. Meaning that metadata applies to final elf file
after linking. It's _not_ per program metadata.
May be we should talk about problem statement and goals.
Do we actually need metadata per program or metadata per single .o
or metadata per final .o with multiple .o linked together?
What is this metadata?
If it's just unreferenced by program read only data then no special names or
prefixes are needed. We can introduce BPF_PROG_BIND_MAP to bind any map to any
program and it would be up to tooling to decide the meaning of the data in the
map. For example, bpftool can choose to print all variables from all read only
maps that match "bpf_metadata_" prefix, but it will be bpftool convention only
and not hard coded in libbpf.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-09-04  1:29     ` Alexei Starovoitov
@ 2020-09-04 23:18       ` Andrii Nakryiko
  2020-09-07  8:49         ` Toke Høiland-Jørgensen
  2020-09-08 17:44         ` Andrey Ignatov
  0 siblings, 2 replies; 31+ messages in thread
From: Andrii Nakryiko @ 2020-09-04 23:18 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, YiFei Zhu, YiFei Zhu,
	Andrey Ignatov

On Thu, Sep 3, 2020 at 6:29 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Sep 02, 2020 at 07:31:33PM -0700, Andrii Nakryiko wrote:
> > On Fri, Aug 28, 2020 at 12:37 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > From: YiFei Zhu <zhuyifei@google.com>
> > >
> > > The patch adds a simple wrapper bpf_prog_bind_map around the syscall.
> > > And when using libbpf to load a program, it will probe the kernel for
> > > the support of this syscall, and scan for the .metadata ELF section
> > > and load it as an internal map like a .data section.
> > >
> > > In the case that kernel supports the BPF_PROG_BIND_MAP syscall and
> > > a .metadata section exists, the map will be explicitly bound to
> > > the program via the syscall immediately after program is loaded.
> > > -EEXIST is ignored for this syscall.
> >
> > Here is the question I have. How important is it that all this
> > metadata is in a separate map? What if libbpf just PROG_BIND_MAP all
> > the maps inside a single BPF .o file to all BPF programs in that file?
> > Including ARRAY maps created for .data, .rodata and .bss, even if the
> > BPF program doesn't use any of the global variables? If it's too
> > extreme, we could do it only for global data maps, leaving explicit
> > map definitions in SEC(".maps") alone. Would that be terrible?
> > Conceptually it makes sense, because when you program in user-space,
> > you expect global variables to be there, even if you don't reference
> > it directly, right? The only downside is that you won't have a special
> > ".metadata" map, rather it will be part of ".rodata" one.
>
> That's an interesting idea.
> Indeed. If we have BPF_PROG_BIND_MAP command why do we need to create
> another map that behaves exactly like .rodata but has a different name?

That was exactly my thought when I re-read this patch set :)

> Wouldn't it be better to identify metadata elements some other way?
> Like by common prefix/suffix name of the variables or
> via grouping them under one structure with standard prefix?
> Like:
> struct bpf_prog_metadata_blahblah {
>   char compiler_name[];
>   int my_internal_prog_version;
> } = { .compiler_name[] = "clang v.12", ...};
>
> In the past we did this hack for 'version' and for 'license',
> but we did it because we didn't have BTF and there was no other way
> to determine the boundaries.
> I think libbpf can and should support multiple rodata sections with

Yep, that's coming, we already have a pretty common .rodata.str1.1
section emitted by Clang for some cases, which libbpf currently
ignores, but that should change. Creating a separate map for all such
small sections seems excessive, so my plan is to combine them and
their BTFs into one, as you assumed below.

> arbitrary names, but hardcoding one specific ".metadata" name?
> Hmm. Let's think through the implications.
> Multiple .o support and static linking is coming soon.
> When two .o-s with multiple bpf progs are statically linked libbpf
> won't have any choice but to merge them together under single
> ".metadata" section and single map that will be BPF_PROG_BIND_MAP-ed
> to different progs. Meaning that metadata applies to final elf file
> after linking. It's _not_ per program metadata.

Right, exactly.

> May be we should talk about problem statement and goals.
> Do we actually need metadata per program or metadata per single .o
> or metadata per final .o with multiple .o linked together?
> What is this metadata?

Yep, that's a very valid question. I've also CC'ed Andrey.

> If it's just unreferenced by program read only data then no special names or
> prefixes are needed. We can introduce BPF_PROG_BIND_MAP to bind any map to any
> program and it would be up to tooling to decide the meaning of the data in the
> map. For example, bpftool can choose to print all variables from all read only
> maps that match "bpf_metadata_" prefix, but it will be bpftool convention only
> and not hard coded in libbpf.

Agree as well. It feels a bit odd for libbpf to handle ".metadata"
specially, given libbpf itself doesn't care about its contents at all.

So thanks for bringing this up, I think this is an important discussion to have.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-09-04 23:18       ` Andrii Nakryiko
@ 2020-09-07  8:49         ` Toke Høiland-Jørgensen
  2020-09-08 15:19           ` Stanislav Fomichev
  2020-09-08 18:10           ` Andrii Nakryiko
  2020-09-08 17:44         ` Andrey Ignatov
  1 sibling, 2 replies; 31+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-09-07  8:49 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov
  Cc: Stanislav Fomichev, Networking, bpf, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, YiFei Zhu, YiFei Zhu,
	Andrey Ignatov

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

>> May be we should talk about problem statement and goals.
>> Do we actually need metadata per program or metadata per single .o
>> or metadata per final .o with multiple .o linked together?
>> What is this metadata?
>
> Yep, that's a very valid question. I've also CC'ed Andrey.

For the libxdp use case, I need metadata per program. But I'm already
sticking that in a single section and disambiguating by struct name
(just prefixing the function name with a _ ), so I think it's fine to
have this kind of "concatenated metadata" per elf file and parse out the
per-program information from that. This is similar to the BTF-encoded
"metadata" we can do today.

>> If it's just unreferenced by program read only data then no special names or
>> prefixes are needed. We can introduce BPF_PROG_BIND_MAP to bind any map to any
>> program and it would be up to tooling to decide the meaning of the data in the
>> map. For example, bpftool can choose to print all variables from all read only
>> maps that match "bpf_metadata_" prefix, but it will be bpftool convention only
>> and not hard coded in libbpf.
>
> Agree as well. It feels a bit odd for libbpf to handle ".metadata"
> specially, given libbpf itself doesn't care about its contents at all.
>
> So thanks for bringing this up, I think this is an important
> discussion to have.

I'm fine with having this be part of .rodata. One drawback, though, is
that if any metadata is defined, it becomes a bit more complicated to
use bpf_map__set_initial_value() because that now also has to include
the metadata. Any way we can improve upon that?

-Toke


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-09-07  8:49         ` Toke Høiland-Jørgensen
@ 2020-09-08 15:19           ` Stanislav Fomichev
  2020-09-08 18:20             ` Andrii Nakryiko
  2020-09-08 18:10           ` Andrii Nakryiko
  1 sibling, 1 reply; 31+ messages in thread
From: Stanislav Fomichev @ 2020-09-08 15:19 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Andrii Nakryiko, Alexei Starovoitov, Networking, bpf,
	David S. Miller, Alexei Starovoitov, Daniel Borkmann, YiFei Zhu,
	YiFei Zhu, Andrey Ignatov

On Mon, Sep 7, 2020 at 1:49 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> >> May be we should talk about problem statement and goals.
> >> Do we actually need metadata per program or metadata per single .o
> >> or metadata per final .o with multiple .o linked together?
> >> What is this metadata?
> >
> > Yep, that's a very valid question. I've also CC'ed Andrey.
>
> For the libxdp use case, I need metadata per program. But I'm already
> sticking that in a single section and disambiguating by struct name
> (just prefixing the function name with a _ ), so I think it's fine to
> have this kind of "concatenated metadata" per elf file and parse out the
> per-program information from that. This is similar to the BTF-encoded
> "metadata" we can do today.
We've come full circle :-)
I think we discussed that approach originally - to stick everything
into existing global .data/.rodata and use some variable prefix for
the metadata. I'm fine with that approach. The only thing I don't
understand is - why bother with the additional .rodata.metadata
section and merging?
Can we unconditionally do BPF_PROG_BIND_MAP(.rodata) from libbpf (and
ignore the error) and be done?

Sticking to the original question: for our use-case, the metadata is
per .o file. I'm not sure how it would work in the 'multiple .o linked
together' use case. Ideally, we'd need to preserve all metadata?

> >> If it's just unreferenced by program read only data then no special names or
> >> prefixes are needed. We can introduce BPF_PROG_BIND_MAP to bind any map to any
> >> program and it would be up to tooling to decide the meaning of the data in the
> >> map. For example, bpftool can choose to print all variables from all read only
> >> maps that match "bpf_metadata_" prefix, but it will be bpftool convention only
> >> and not hard coded in libbpf.
> >
> > Agree as well. It feels a bit odd for libbpf to handle ".metadata"
> > specially, given libbpf itself doesn't care about its contents at all.
> >
> > So thanks for bringing this up, I think this is an important
> > discussion to have.
>
> I'm fine with having this be part of .rodata. One drawback, though, is
> that if any metadata is defined, it becomes a bit more complicated to
> use bpf_map__set_initial_value() because that now also has to include
> the metadata. Any way we can improve upon that?
Right. One additional thing we wanted this metadata to have is the
comm of the process who loaded this bpf program (to be filled/added by
libbpf).
I suppose .rodata.metadata section can help with that?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-09-04 23:18       ` Andrii Nakryiko
  2020-09-07  8:49         ` Toke Høiland-Jørgensen
@ 2020-09-08 17:44         ` Andrey Ignatov
  2020-09-08 18:24           ` Andrii Nakryiko
  1 sibling, 1 reply; 31+ messages in thread
From: Andrey Ignatov @ 2020-09-08 17:44 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Stanislav Fomichev, Networking, bpf,
	David S. Miller, Alexei Starovoitov, Daniel Borkmann, YiFei Zhu,
	YiFei Zhu

Andrii Nakryiko <andrii.nakryiko@gmail.com> [Fri, 2020-09-04 16:19 -0700]:
> On Thu, Sep 3, 2020 at 6:29 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Sep 02, 2020 at 07:31:33PM -0700, Andrii Nakryiko wrote:
> > > On Fri, Aug 28, 2020 at 12:37 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > From: YiFei Zhu <zhuyifei@google.com>
> > > >
> > > > The patch adds a simple wrapper bpf_prog_bind_map around the syscall.
> > > > And when using libbpf to load a program, it will probe the kernel for
> > > > the support of this syscall, and scan for the .metadata ELF section
> > > > and load it as an internal map like a .data section.
> > > >
> > > > In the case that kernel supports the BPF_PROG_BIND_MAP syscall and
> > > > a .metadata section exists, the map will be explicitly bound to
> > > > the program via the syscall immediately after program is loaded.
> > > > -EEXIST is ignored for this syscall.
> > >
> > > Here is the question I have. How important is it that all this
> > > metadata is in a separate map? What if libbpf just PROG_BIND_MAP all
> > > the maps inside a single BPF .o file to all BPF programs in that file?
> > > Including ARRAY maps created for .data, .rodata and .bss, even if the
> > > BPF program doesn't use any of the global variables? If it's too
> > > extreme, we could do it only for global data maps, leaving explicit
> > > map definitions in SEC(".maps") alone. Would that be terrible?
> > > Conceptually it makes sense, because when you program in user-space,
> > > you expect global variables to be there, even if you don't reference
> > > it directly, right? The only downside is that you won't have a special
> > > ".metadata" map, rather it will be part of ".rodata" one.
> >
> > That's an interesting idea.
> > Indeed. If we have BPF_PROG_BIND_MAP command why do we need to create
> > another map that behaves exactly like .rodata but has a different name?
> 
> That was exactly my thought when I re-read this patch set :)
> 
> > Wouldn't it be better to identify metadata elements some other way?
> > Like by common prefix/suffix name of the variables or
> > via grouping them under one structure with standard prefix?
> > Like:
> > struct bpf_prog_metadata_blahblah {
> >   char compiler_name[];
> >   int my_internal_prog_version;
> > } = { .compiler_name[] = "clang v.12", ...};
> >
> > In the past we did this hack for 'version' and for 'license',
> > but we did it because we didn't have BTF and there was no other way
> > to determine the boundaries.
> > I think libbpf can and should support multiple rodata sections with
> 
> Yep, that's coming, we already have a pretty common .rodata.str1.1
> section emitted by Clang for some cases, which libbpf currently
> ignores, but that should change. Creating a separate map for all such
> small sections seems excessive, so my plan is to combine them and
> their BTFs into one, as you assumed below.
> 
> > arbitrary names, but hardcoding one specific ".metadata" name?
> > Hmm. Let's think through the implications.
> > Multiple .o support and static linking is coming soon.
> > When two .o-s with multiple bpf progs are statically linked libbpf
> > won't have any choice but to merge them together under single
> > ".metadata" section and single map that will be BPF_PROG_BIND_MAP-ed
> > to different progs. Meaning that metadata applies to final elf file
> > after linking. It's _not_ per program metadata.
> 
> Right, exactly.
> 
> > May be we should talk about problem statement and goals.
> > Do we actually need metadata per program or metadata per single .o
> > or metadata per final .o with multiple .o linked together?
> > What is this metadata?
> 
> Yep, that's a very valid question. I've also CC'ed Andrey.

From my side the problem statement is to be able to save a bunch of
metadata fields per BPF object file (I don't distinguish "final .o" and
"multiple .o linked together" since we have only the former in prod).

Specifically things like oncall team who owns the programs in the object
(the most important info), build info (repository revision, build commit
time, build time), etc. The plan is to integrate it with build system
and be able to quickly identify source code and point of contact for any
particular program.

All these things are always the same for all programs in one object. It
may change in the future, but at the moment I'm not aware of any
use-case where these things can be different for different programs in
the same object.

I don't have strong preferences on the implementation side as long as it
covers the use-case, e.g. the one in the patch set would work FWIW.

> > If it's just unreferenced by program read only data then no special names or
> > prefixes are needed. We can introduce BPF_PROG_BIND_MAP to bind any map to any
> > program and it would be up to tooling to decide the meaning of the data in the
> > map. For example, bpftool can choose to print all variables from all read only
> > maps that match "bpf_metadata_" prefix, but it will be bpftool convention only
> > and not hard coded in libbpf.
> 
> Agree as well. It feels a bit odd for libbpf to handle ".metadata"
> specially, given libbpf itself doesn't care about its contents at all.
> 
> So thanks for bringing this up, I think this is an important discussion to have.

-- 
Andrey Ignatov

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-09-07  8:49         ` Toke Høiland-Jørgensen
  2020-09-08 15:19           ` Stanislav Fomichev
@ 2020-09-08 18:10           ` Andrii Nakryiko
  2020-09-09 10:58             ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 31+ messages in thread
From: Andrii Nakryiko @ 2020-09-08 18:10 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Alexei Starovoitov, Stanislav Fomichev, Networking, bpf,
	David S. Miller, Alexei Starovoitov, Daniel Borkmann, YiFei Zhu,
	YiFei Zhu, Andrey Ignatov

On Mon, Sep 7, 2020 at 1:49 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> >> May be we should talk about problem statement and goals.
> >> Do we actually need metadata per program or metadata per single .o
> >> or metadata per final .o with multiple .o linked together?
> >> What is this metadata?
> >
> > Yep, that's a very valid question. I've also CC'ed Andrey.
>
> For the libxdp use case, I need metadata per program. But I'm already
> sticking that in a single section and disambiguating by struct name
> (just prefixing the function name with a _ ), so I think it's fine to
> have this kind of "concatenated metadata" per elf file and parse out the
> per-program information from that. This is similar to the BTF-encoded
> "metadata" we can do today.
>
> >> If it's just unreferenced by program read only data then no special names or
> >> prefixes are needed. We can introduce BPF_PROG_BIND_MAP to bind any map to any
> >> program and it would be up to tooling to decide the meaning of the data in the
> >> map. For example, bpftool can choose to print all variables from all read only
> >> maps that match "bpf_metadata_" prefix, but it will be bpftool convention only
> >> and not hard coded in libbpf.
> >
> > Agree as well. It feels a bit odd for libbpf to handle ".metadata"
> > specially, given libbpf itself doesn't care about its contents at all.
> >
> > So thanks for bringing this up, I think this is an important
> > discussion to have.
>
> I'm fine with having this be part of .rodata. One drawback, though, is
> that if any metadata is defined, it becomes a bit more complicated to
> use bpf_map__set_initial_value() because that now also has to include
> the metadata. Any way we can improve upon that?

I know that skeleton is not an answer for you, so you'll have to find
DATASEC and corresponding variable offset and size (libbpf provides
APIs for all those operations, but you'll need to combine them
together). Then mmap() map and then you can do partial updates. There
is no other way to update only portions of an ARRAY map, except
through memory-mapping.

>
> -Toke
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-09-08 15:19           ` Stanislav Fomichev
@ 2020-09-08 18:20             ` Andrii Nakryiko
  0 siblings, 0 replies; 31+ messages in thread
From: Andrii Nakryiko @ 2020-09-08 18:20 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Toke Høiland-Jørgensen, Alexei Starovoitov, Networking,
	bpf, David S. Miller, Alexei Starovoitov, Daniel Borkmann,
	YiFei Zhu, Andrey Ignatov

On Tue, Sep 8, 2020 at 8:20 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Mon, Sep 7, 2020 at 1:49 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> >
> > >> May be we should talk about problem statement and goals.
> > >> Do we actually need metadata per program or metadata per single .o
> > >> or metadata per final .o with multiple .o linked together?
> > >> What is this metadata?
> > >
> > > Yep, that's a very valid question. I've also CC'ed Andrey.
> >
> > For the libxdp use case, I need metadata per program. But I'm already
> > sticking that in a single section and disambiguating by struct name
> > (just prefixing the function name with a _ ), so I think it's fine to
> > have this kind of "concatenated metadata" per elf file and parse out the
> > per-program information from that. This is similar to the BTF-encoded
> > "metadata" we can do today.
> We've come full circle :-)
> I think we discussed that approach originally - to stick everything
> into existing global .data/.rodata and use some variable prefix for
> the metadata. I'm fine with that approach. The only thing I don't

That's what we wanted all along, but the problem was with keeping
reference to bpf_map from bpf_prog. We eventually gave up and
concluded that extra BPF command is necessary. But somewhere along the
road we somehow concluded we need an entire new special map/section,
and I didn't realize at first (and it seems it wasn't just me) that
the latter part is unnecessary.

> understand is - why bother with the additional .rodata.metadata
> section and merging?
> Can we unconditionally do BPF_PROG_BIND_MAP(.rodata) from libbpf (and
> ignore the error) and be done?

That's exactly what we are proposing, to stick to .rodata, instead of
having extra .metadata section. Multiple .rodata/.data sections are
orthogonal concerns, which we need to solve as well, because the
compiler does emit many of them in some cases. So in that context,
once we support multiple .rodata's, it would be possible to have
metadata-only "sub-sections". But we don't have to do that, keeping
everything simple and put into .rodata works just fine.

>
> Sticking to the original question: for our use-case, the metadata is
> per .o file. I'm not sure how it would work in the 'multiple .o linked
> together' use case. Ideally, we'd need to preserve all metadata?

Just like in user-space, when you have multiple .c files compiled into
.o files and later linked into a final library or binary, all the
.data and .rodata sections are combined. That's what will happen with
BPF .o files as well. So it will be automatically preserved, as you
seem to want.

>
> > >> If it's just unreferenced by program read only data then no special names or
> > >> prefixes are needed. We can introduce BPF_PROG_BIND_MAP to bind any map to any
> > >> program and it would be up to tooling to decide the meaning of the data in the
> > >> map. For example, bpftool can choose to print all variables from all read only
> > >> maps that match "bpf_metadata_" prefix, but it will be bpftool convention only
> > >> and not hard coded in libbpf.
> > >
> > > Agree as well. It feels a bit odd for libbpf to handle ".metadata"
> > > specially, given libbpf itself doesn't care about its contents at all.
> > >
> > > So thanks for bringing this up, I think this is an important
> > > discussion to have.
> >
> > I'm fine with having this be part of .rodata. One drawback, though, is
> > that if any metadata is defined, it becomes a bit more complicated to
> > use bpf_map__set_initial_value() because that now also has to include
> > the metadata. Any way we can improve upon that?
> Right. One additional thing we wanted this metadata to have is the
> comm of the process who loaded this bpf program (to be filled/added by
> libbpf).
> I suppose .rodata.metadata section can help with that?

.rodata.metadata has nothing to do with this. I'm also not sure
whether it's a responsibility of libbpf to provide process's comm as a
metadata, to be honest. Next thing it will be user name/user id, then
cgroup name, then some other application-level concept and so on. I'd
prefer to keep it simple and let applications handle that for
themselves. Luckily, using a BPF skeleton this is **extremely** easy.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-09-08 17:44         ` Andrey Ignatov
@ 2020-09-08 18:24           ` Andrii Nakryiko
  0 siblings, 0 replies; 31+ messages in thread
From: Andrii Nakryiko @ 2020-09-08 18:24 UTC (permalink / raw)
  To: Andrey Ignatov
  Cc: Alexei Starovoitov, Stanislav Fomichev, Networking, bpf,
	David S. Miller, Alexei Starovoitov, Daniel Borkmann, YiFei Zhu

On Tue, Sep 8, 2020 at 10:45 AM Andrey Ignatov <rdna@fb.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> [Fri, 2020-09-04 16:19 -0700]:
> > On Thu, Sep 3, 2020 at 6:29 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Wed, Sep 02, 2020 at 07:31:33PM -0700, Andrii Nakryiko wrote:
> > > > On Fri, Aug 28, 2020 at 12:37 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > > >
> > > > > From: YiFei Zhu <zhuyifei@google.com>
> > > > >
> > > > > The patch adds a simple wrapper bpf_prog_bind_map around the syscall.
> > > > > And when using libbpf to load a program, it will probe the kernel for
> > > > > the support of this syscall, and scan for the .metadata ELF section
> > > > > and load it as an internal map like a .data section.
> > > > >
> > > > > In the case that kernel supports the BPF_PROG_BIND_MAP syscall and
> > > > > a .metadata section exists, the map will be explicitly bound to
> > > > > the program via the syscall immediately after program is loaded.
> > > > > -EEXIST is ignored for this syscall.
> > > >
> > > > Here is the question I have. How important is it that all this
> > > > metadata is in a separate map? What if libbpf just PROG_BIND_MAP all
> > > > the maps inside a single BPF .o file to all BPF programs in that file?
> > > > Including ARRAY maps created for .data, .rodata and .bss, even if the
> > > > BPF program doesn't use any of the global variables? If it's too
> > > > extreme, we could do it only for global data maps, leaving explicit
> > > > map definitions in SEC(".maps") alone. Would that be terrible?
> > > > Conceptually it makes sense, because when you program in user-space,
> > > > you expect global variables to be there, even if you don't reference
> > > > it directly, right? The only downside is that you won't have a special
> > > > ".metadata" map, rather it will be part of ".rodata" one.
> > >
> > > That's an interesting idea.
> > > Indeed. If we have BPF_PROG_BIND_MAP command why do we need to create
> > > another map that behaves exactly like .rodata but has a different name?
> >
> > That was exactly my thought when I re-read this patch set :)
> >
> > > Wouldn't it be better to identify metadata elements some other way?
> > > Like by common prefix/suffix name of the variables or
> > > via grouping them under one structure with standard prefix?
> > > Like:
> > > struct bpf_prog_metadata_blahblah {
> > >   char compiler_name[];
> > >   int my_internal_prog_version;
> > > } = { .compiler_name[] = "clang v.12", ...};
> > >
> > > In the past we did this hack for 'version' and for 'license',
> > > but we did it because we didn't have BTF and there was no other way
> > > to determine the boundaries.
> > > I think libbpf can and should support multiple rodata sections with
> >
> > Yep, that's coming, we already have a pretty common .rodata.str1.1
> > section emitted by Clang for some cases, which libbpf currently
> > ignores, but that should change. Creating a separate map for all such
> > small sections seems excessive, so my plan is to combine them and
> > their BTFs into one, as you assumed below.
> >
> > > arbitrary names, but hardcoding one specific ".metadata" name?
> > > Hmm. Let's think through the implications.
> > > Multiple .o support and static linking is coming soon.
> > > When two .o-s with multiple bpf progs are statically linked libbpf
> > > won't have any choice but to merge them together under single
> > > ".metadata" section and single map that will be BPF_PROG_BIND_MAP-ed
> > > to different progs. Meaning that metadata applies to final elf file
> > > after linking. It's _not_ per program metadata.
> >
> > Right, exactly.
> >
> > > May be we should talk about problem statement and goals.
> > > Do we actually need metadata per program or metadata per single .o
> > > or metadata per final .o with multiple .o linked together?
> > > What is this metadata?
> >
> > Yep, that's a very valid question. I've also CC'ed Andrey.
>
> From my side the problem statement is to be able to save a bunch of
> metadata fields per BPF object file (I don't distinguish "final .o" and
> "multiple .o linked together" since we have only the former in prod).

We don't *yet*. But reading below, you have the entire BPF application
(i.e., a collection of maps, variables and programs) in mind, not
specifically single .c file compiled into a single .o file. So
everything works out, I think.

>
> Specifically things like oncall team who owns the programs in the object
> (the most important info), build info (repository revision, build commit
> time, build time), etc. The plan is to integrate it with build system
> and be able to quickly identify source code and point of contact for any
> particular program.
>
> All these things are always the same for all programs in one object. It
> may change in the future, but at the moment I'm not aware of any
> use-case where these things can be different for different programs in
> the same object.
>
> I don't have strong preferences on the implementation side as long as it
> covers the use-case, e.g. the one in the patch set would work FWIW.
>
> > > If it's just unreferenced by program read only data then no special names or
> > > prefixes are needed. We can introduce BPF_PROG_BIND_MAP to bind any map to any
> > > program and it would be up to tooling to decide the meaning of the data in the
> > > map. For example, bpftool can choose to print all variables from all read only
> > > maps that match "bpf_metadata_" prefix, but it will be bpftool convention only
> > > and not hard coded in libbpf.
> >
> > Agree as well. It feels a bit odd for libbpf to handle ".metadata"
> > specially, given libbpf itself doesn't care about its contents at all.
> >
> > So thanks for bringing this up, I think this is an important discussion to have.
>
> --
> Andrey Ignatov

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 5/8] bpftool: support dumping metadata
  2020-09-03  5:00   ` Andrii Nakryiko
@ 2020-09-08 20:53     ` Stanislav Fomichev
  2020-09-08 22:35       ` Andrii Nakryiko
  0 siblings, 1 reply; 31+ messages in thread
From: Stanislav Fomichev @ 2020-09-08 20:53 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, YiFei Zhu, YiFei Zhu

On Wed, Sep 2, 2020 at 10:00 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Aug 28, 2020 at 12:37 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > From: YiFei Zhu <zhuyifei@google.com>
> >
> > Added a flag "--metadata" to `bpftool prog list` to dump the metadata
> > contents. For some formatting some BTF code is put directly in the
> > metadata dumping. Sanity checks on the map and the kind of the btf_type
> > to make sure we are actually dumping what we are expecting.
> >
> > A helper jsonw_reset is added to json writer so we can reuse the same
> > json writer without having extraneous commas.
> >
> > Sample output:
> >
> >   $ bpftool prog --metadata
> >   6: cgroup_skb  name prog  tag bcf7977d3b93787c  gpl
> >   [...]
> >         btf_id 4
> >         metadata:
> >                 metadata_a = "foo"
> >                 metadata_b = 1
> >
> >   $ bpftool prog --metadata --json --pretty
> >   [{
> >           "id": 6,
> >   [...]
> >           "btf_id": 4,
> >           "metadata": {
> >               "metadata_a": "foo",
> >               "metadata_b": 1
> >           }
> >       }
> >   ]
> >
> > Cc: YiFei Zhu <zhuyifei1999@gmail.com>
> > Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  tools/bpf/bpftool/json_writer.c |   6 ++
> >  tools/bpf/bpftool/json_writer.h |   3 +
> >  tools/bpf/bpftool/main.c        |  10 +++
> >  tools/bpf/bpftool/main.h        |   1 +
> >  tools/bpf/bpftool/prog.c        | 130 ++++++++++++++++++++++++++++++++
> >  5 files changed, 150 insertions(+)
> >
>
> [...]
>
> > +
> > +       if (bpf_map_lookup_elem(map_fd, &key, value)) {
> > +               p_err("metadata map lookup failed: %s", strerror(errno));
> > +               goto out_free;
> > +       }
> > +
> > +       err = btf__get_from_id(map_info.btf_id, &btf);
>
> what if the map has no btf_id associated (e.g., because of an old
> kernel?); why fail in this case?
Thank you for the review, coming back at it a bit late :-(

This functionality is guarded by --metadata bpftool flag (off by default).
In case of no btf_id, it might be helpful to show why we don't have
the metadata rather than just quietly failing.
WDYT?

> > +       if (err || !btf) {
> > +               p_err("metadata BTF get failed: %s", strerror(-err));
> > +               goto out_free;
> > +       }
> > +
> > +       t_datasec = btf__type_by_id(btf, map_info.btf_value_type_id);
> > +       if (BTF_INFO_KIND(t_datasec->info) != BTF_KIND_DATASEC) {
>
> btf_is_datasec(t_datasec)
>
> > +               p_err("bad metadata BTF");
> > +               goto out_free;
> > +       }
> > +
> > +       vlen = BTF_INFO_VLEN(t_datasec->info);
>
> btf_vlen(t_datasec)
>
> > +       vsi = (struct btf_var_secinfo *)(t_datasec + 1);
>
> btf_var_secinfos(t_datasec)
>
> > +
> > +       /* We don't proceed to check the kinds of the elements of the DATASEC.
> > +        * The verifier enforce then to be BTF_KIND_VAR.
>
> typo: then -> them
>
> > +        */
> > +
> > +       if (json_output) {
> > +               struct btf_dumper d = {
> > +                       .btf = btf,
> > +                       .jw = json_wtr,
> > +                       .is_plain_text = false,
> > +               };
> > +
> > +               jsonw_name(json_wtr, "metadata");
> > +
> > +               jsonw_start_object(json_wtr);
> > +               for (i = 0; i < vlen; i++) {
>
> nit: doing ++vsi here
Agreed with all the above, except this one.
It feels like it's safer to do [i] in case somebody adds a 'continue'
clause later and we miss that '++vsi'.
Let me know if you feel strongly about it.

> > +                       t_var = btf__type_by_id(btf, vsi[i].type);
>
> and vsi->type here and below would look a bit cleaner
>
> > +
> > +                       jsonw_name(json_wtr, btf__name_by_offset(btf, t_var->name_off));
> > +                       err = btf_dumper_type(&d, t_var->type, value + vsi[i].offset);
> > +                       if (err) {
> > +                               p_err("btf dump failed");
> > +                               break;
> > +                       }
> > +               }
> > +               jsonw_end_object(json_wtr);
>
> [...]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 5/8] bpftool: support dumping metadata
  2020-09-08 20:53     ` Stanislav Fomichev
@ 2020-09-08 22:35       ` Andrii Nakryiko
  2020-09-08 22:49         ` Stanislav Fomichev
  0 siblings, 1 reply; 31+ messages in thread
From: Andrii Nakryiko @ 2020-09-08 22:35 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, YiFei Zhu, YiFei Zhu

On Tue, Sep 8, 2020 at 1:53 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Wed, Sep 2, 2020 at 10:00 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Fri, Aug 28, 2020 at 12:37 PM Stanislav Fomichev <sdf@google.com> wrote:
> > >
> > > From: YiFei Zhu <zhuyifei@google.com>
> > >
> > > Added a flag "--metadata" to `bpftool prog list` to dump the metadata
> > > contents. For some formatting some BTF code is put directly in the
> > > metadata dumping. Sanity checks on the map and the kind of the btf_type
> > > to make sure we are actually dumping what we are expecting.
> > >
> > > A helper jsonw_reset is added to json writer so we can reuse the same
> > > json writer without having extraneous commas.
> > >
> > > Sample output:
> > >
> > >   $ bpftool prog --metadata
> > >   6: cgroup_skb  name prog  tag bcf7977d3b93787c  gpl
> > >   [...]
> > >         btf_id 4
> > >         metadata:
> > >                 metadata_a = "foo"
> > >                 metadata_b = 1
> > >
> > >   $ bpftool prog --metadata --json --pretty
> > >   [{
> > >           "id": 6,
> > >   [...]
> > >           "btf_id": 4,
> > >           "metadata": {
> > >               "metadata_a": "foo",
> > >               "metadata_b": 1
> > >           }
> > >       }
> > >   ]
> > >
> > > Cc: YiFei Zhu <zhuyifei1999@gmail.com>
> > > Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > ---
> > >  tools/bpf/bpftool/json_writer.c |   6 ++
> > >  tools/bpf/bpftool/json_writer.h |   3 +
> > >  tools/bpf/bpftool/main.c        |  10 +++
> > >  tools/bpf/bpftool/main.h        |   1 +
> > >  tools/bpf/bpftool/prog.c        | 130 ++++++++++++++++++++++++++++++++
> > >  5 files changed, 150 insertions(+)
> > >
> >
> > [...]
> >
> > > +
> > > +       if (bpf_map_lookup_elem(map_fd, &key, value)) {
> > > +               p_err("metadata map lookup failed: %s", strerror(errno));
> > > +               goto out_free;
> > > +       }
> > > +
> > > +       err = btf__get_from_id(map_info.btf_id, &btf);
> >
> > what if the map has no btf_id associated (e.g., because of an old
> > kernel?); why fail in this case?
> Thank you for the review, coming back at it a bit late :-(
>
> This functionality is guarded by --metadata bpftool flag (off by default).
> In case of no btf_id, it might be helpful to show why we don't have
> the metadata rather than just quietly failing.
> WDYT?

we might do it similarly to PID info I added with bpf_iter: if it's
supported -- emit it, if not -- skip and still succeed. So maybe we
don't really need extra --metadata flag and should do all this always?

>
> > > +       if (err || !btf) {
> > > +               p_err("metadata BTF get failed: %s", strerror(-err));
> > > +               goto out_free;
> > > +       }
> > > +
> > > +       t_datasec = btf__type_by_id(btf, map_info.btf_value_type_id);
> > > +       if (BTF_INFO_KIND(t_datasec->info) != BTF_KIND_DATASEC) {
> >
> > btf_is_datasec(t_datasec)
> >
> > > +               p_err("bad metadata BTF");
> > > +               goto out_free;
> > > +       }
> > > +
> > > +       vlen = BTF_INFO_VLEN(t_datasec->info);
> >
> > btf_vlen(t_datasec)
> >
> > > +       vsi = (struct btf_var_secinfo *)(t_datasec + 1);
> >
> > btf_var_secinfos(t_datasec)
> >
> > > +
> > > +       /* We don't proceed to check the kinds of the elements of the DATASEC.
> > > +        * The verifier enforce then to be BTF_KIND_VAR.
> >
> > typo: then -> them
> >
> > > +        */
> > > +
> > > +       if (json_output) {
> > > +               struct btf_dumper d = {
> > > +                       .btf = btf,
> > > +                       .jw = json_wtr,
> > > +                       .is_plain_text = false,
> > > +               };
> > > +
> > > +               jsonw_name(json_wtr, "metadata");
> > > +
> > > +               jsonw_start_object(json_wtr);
> > > +               for (i = 0; i < vlen; i++) {
> >
> > nit: doing ++vsi here
> Agreed with all the above, except this one.
> It feels like it's safer to do [i] in case somebody adds a 'continue'
> clause later and we miss that '++vsi'.
> Let me know if you feel strongly about it.

I meant to add vsi++ inside the for clause, no way to miss it:

for (i = 0; i < vlen, i++, vsi++) {
  continue/break/whatever you want, except extra i++ or vsi++
}

it's the safest way, imo

>
> > > +                       t_var = btf__type_by_id(btf, vsi[i].type);
> >
> > and vsi->type here and below would look a bit cleaner
> >
> > > +
> > > +                       jsonw_name(json_wtr, btf__name_by_offset(btf, t_var->name_off));
> > > +                       err = btf_dumper_type(&d, t_var->type, value + vsi[i].offset);
> > > +                       if (err) {
> > > +                               p_err("btf dump failed");
> > > +                               break;
> > > +                       }
> > > +               }
> > > +               jsonw_end_object(json_wtr);
> >
> > [...]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 5/8] bpftool: support dumping metadata
  2020-09-08 22:35       ` Andrii Nakryiko
@ 2020-09-08 22:49         ` Stanislav Fomichev
  0 siblings, 0 replies; 31+ messages in thread
From: Stanislav Fomichev @ 2020-09-08 22:49 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, David S. Miller, Alexei Starovoitov,
	Daniel Borkmann, YiFei Zhu, YiFei Zhu

On Tue, Sep 8, 2020 at 3:35 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Sep 8, 2020 at 1:53 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > On Wed, Sep 2, 2020 at 10:00 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Fri, Aug 28, 2020 at 12:37 PM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > From: YiFei Zhu <zhuyifei@google.com>
> > > >
> > > > Added a flag "--metadata" to `bpftool prog list` to dump the metadata
> > > > contents. For some formatting some BTF code is put directly in the
> > > > metadata dumping. Sanity checks on the map and the kind of the btf_type
> > > > to make sure we are actually dumping what we are expecting.
> > > >
> > > > A helper jsonw_reset is added to json writer so we can reuse the same
> > > > json writer without having extraneous commas.
> > > >
> > > > Sample output:
> > > >
> > > >   $ bpftool prog --metadata
> > > >   6: cgroup_skb  name prog  tag bcf7977d3b93787c  gpl
> > > >   [...]
> > > >         btf_id 4
> > > >         metadata:
> > > >                 metadata_a = "foo"
> > > >                 metadata_b = 1
> > > >
> > > >   $ bpftool prog --metadata --json --pretty
> > > >   [{
> > > >           "id": 6,
> > > >   [...]
> > > >           "btf_id": 4,
> > > >           "metadata": {
> > > >               "metadata_a": "foo",
> > > >               "metadata_b": 1
> > > >           }
> > > >       }
> > > >   ]
> > > >
> > > > Cc: YiFei Zhu <zhuyifei1999@gmail.com>
> > > > Signed-off-by: YiFei Zhu <zhuyifei@google.com>
> > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > ---
> > > >  tools/bpf/bpftool/json_writer.c |   6 ++
> > > >  tools/bpf/bpftool/json_writer.h |   3 +
> > > >  tools/bpf/bpftool/main.c        |  10 +++
> > > >  tools/bpf/bpftool/main.h        |   1 +
> > > >  tools/bpf/bpftool/prog.c        | 130 ++++++++++++++++++++++++++++++++
> > > >  5 files changed, 150 insertions(+)
> > > >
> > >
> > > [...]
> > >
> > > > +
> > > > +       if (bpf_map_lookup_elem(map_fd, &key, value)) {
> > > > +               p_err("metadata map lookup failed: %s", strerror(errno));
> > > > +               goto out_free;
> > > > +       }
> > > > +
> > > > +       err = btf__get_from_id(map_info.btf_id, &btf);
> > >
> > > what if the map has no btf_id associated (e.g., because of an old
> > > kernel?); why fail in this case?
> > Thank you for the review, coming back at it a bit late :-(
> >
> > This functionality is guarded by --metadata bpftool flag (off by default).
> > In case of no btf_id, it might be helpful to show why we don't have
> > the metadata rather than just quietly failing.
> > WDYT?
>
> we might do it similarly to PID info I added with bpf_iter: if it's
> supported -- emit it, if not -- skip and still succeed. So maybe we
> don't really need extra --metadata flag and should do all this always?
Sounds reasonable, especially if there is an existing precedent.
Let me explore that option.

> > > > +       if (err || !btf) {
> > > > +               p_err("metadata BTF get failed: %s", strerror(-err));
> > > > +               goto out_free;
> > > > +       }
> > > > +
> > > > +       t_datasec = btf__type_by_id(btf, map_info.btf_value_type_id);
> > > > +       if (BTF_INFO_KIND(t_datasec->info) != BTF_KIND_DATASEC) {
> > >
> > > btf_is_datasec(t_datasec)
> > >
> > > > +               p_err("bad metadata BTF");
> > > > +               goto out_free;
> > > > +       }
> > > > +
> > > > +       vlen = BTF_INFO_VLEN(t_datasec->info);
> > >
> > > btf_vlen(t_datasec)
> > >
> > > > +       vsi = (struct btf_var_secinfo *)(t_datasec + 1);
> > >
> > > btf_var_secinfos(t_datasec)
> > >
> > > > +
> > > > +       /* We don't proceed to check the kinds of the elements of the DATASEC.
> > > > +        * The verifier enforce then to be BTF_KIND_VAR.
> > >
> > > typo: then -> them
> > >
> > > > +        */
> > > > +
> > > > +       if (json_output) {
> > > > +               struct btf_dumper d = {
> > > > +                       .btf = btf,
> > > > +                       .jw = json_wtr,
> > > > +                       .is_plain_text = false,
> > > > +               };
> > > > +
> > > > +               jsonw_name(json_wtr, "metadata");
> > > > +
> > > > +               jsonw_start_object(json_wtr);
> > > > +               for (i = 0; i < vlen; i++) {
> > >
> > > nit: doing ++vsi here
> > Agreed with all the above, except this one.
> > It feels like it's safer to do [i] in case somebody adds a 'continue'
> > clause later and we miss that '++vsi'.
> > Let me know if you feel strongly about it.
>
> I meant to add vsi++ inside the for clause, no way to miss it:
>
> for (i = 0; i < vlen, i++, vsi++) {
>   continue/break/whatever you want, except extra i++ or vsi++
> }
>
> it's the safest way, imo
Ack, I can do that, thanks!

> > > > +                       t_var = btf__type_by_id(btf, vsi[i].type);
> > >
> > > and vsi->type here and below would look a bit cleaner
> > >
> > > > +
> > > > +                       jsonw_name(json_wtr, btf__name_by_offset(btf, t_var->name_off));
> > > > +                       err = btf_dumper_type(&d, t_var->type, value + vsi[i].offset);
> > > > +                       if (err) {
> > > > +                               p_err("btf dump failed");
> > > > +                               break;
> > > > +                       }
> > > > +               }
> > > > +               jsonw_end_object(json_wtr);
> > >
> > > [...]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-09-08 18:10           ` Andrii Nakryiko
@ 2020-09-09 10:58             ` Toke Høiland-Jørgensen
  2020-09-09 16:34               ` Andrii Nakryiko
  0 siblings, 1 reply; 31+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-09-09 10:58 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Stanislav Fomichev, Networking, bpf,
	David S. Miller, Alexei Starovoitov, Daniel Borkmann, YiFei Zhu,
	YiFei Zhu, Andrey Ignatov

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Mon, Sep 7, 2020 at 1:49 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>
>> >> May be we should talk about problem statement and goals.
>> >> Do we actually need metadata per program or metadata per single .o
>> >> or metadata per final .o with multiple .o linked together?
>> >> What is this metadata?
>> >
>> > Yep, that's a very valid question. I've also CC'ed Andrey.
>>
>> For the libxdp use case, I need metadata per program. But I'm already
>> sticking that in a single section and disambiguating by struct name
>> (just prefixing the function name with a _ ), so I think it's fine to
>> have this kind of "concatenated metadata" per elf file and parse out the
>> per-program information from that. This is similar to the BTF-encoded
>> "metadata" we can do today.
>>
>> >> If it's just unreferenced by program read only data then no special names or
>> >> prefixes are needed. We can introduce BPF_PROG_BIND_MAP to bind any map to any
>> >> program and it would be up to tooling to decide the meaning of the data in the
>> >> map. For example, bpftool can choose to print all variables from all read only
>> >> maps that match "bpf_metadata_" prefix, but it will be bpftool convention only
>> >> and not hard coded in libbpf.
>> >
>> > Agree as well. It feels a bit odd for libbpf to handle ".metadata"
>> > specially, given libbpf itself doesn't care about its contents at all.
>> >
>> > So thanks for bringing this up, I think this is an important
>> > discussion to have.
>>
>> I'm fine with having this be part of .rodata. One drawback, though, is
>> that if any metadata is defined, it becomes a bit more complicated to
>> use bpf_map__set_initial_value() because that now also has to include
>> the metadata. Any way we can improve upon that?
>
> I know that skeleton is not an answer for you, so you'll have to find
> DATASEC and corresponding variable offset and size (libbpf provides
> APIs for all those operations, but you'll need to combine them
> together). Then mmap() map and then you can do partial updates. There
> is no other way to update only portions of an ARRAY map, except
> through memory-mapping.

Well, I wouldn't mind having to go digging through the section. But is
it really possible to pick out and modify parts of it my mmap() before
the object is loaded (and the map frozen)? How? I seem to recall we
added bpf_map__set_initial_value() because this was *not* possible with
the public API?

Also, for this, a bpf_map__get_initial_value() could be a simple way to
allow partial modifications. The caller could just get the whole map
value, modify it, and set it again afterwards with
__set_initial_value(). Any objections to adding that?

-Toke


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section
  2020-09-09 10:58             ` Toke Høiland-Jørgensen
@ 2020-09-09 16:34               ` Andrii Nakryiko
  0 siblings, 0 replies; 31+ messages in thread
From: Andrii Nakryiko @ 2020-09-09 16:34 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Alexei Starovoitov, Stanislav Fomichev, Networking, bpf,
	David S. Miller, Alexei Starovoitov, Daniel Borkmann, YiFei Zhu,
	YiFei Zhu, Andrey Ignatov

On Wed, Sep 9, 2020 at 3:58 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Mon, Sep 7, 2020 at 1:49 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> >>
> >> >> May be we should talk about problem statement and goals.
> >> >> Do we actually need metadata per program or metadata per single .o
> >> >> or metadata per final .o with multiple .o linked together?
> >> >> What is this metadata?
> >> >
> >> > Yep, that's a very valid question. I've also CC'ed Andrey.
> >>
> >> For the libxdp use case, I need metadata per program. But I'm already
> >> sticking that in a single section and disambiguating by struct name
> >> (just prefixing the function name with a _ ), so I think it's fine to
> >> have this kind of "concatenated metadata" per elf file and parse out the
> >> per-program information from that. This is similar to the BTF-encoded
> >> "metadata" we can do today.
> >>
> >> >> If it's just unreferenced by program read only data then no special names or
> >> >> prefixes are needed. We can introduce BPF_PROG_BIND_MAP to bind any map to any
> >> >> program and it would be up to tooling to decide the meaning of the data in the
> >> >> map. For example, bpftool can choose to print all variables from all read only
> >> >> maps that match "bpf_metadata_" prefix, but it will be bpftool convention only
> >> >> and not hard coded in libbpf.
> >> >
> >> > Agree as well. It feels a bit odd for libbpf to handle ".metadata"
> >> > specially, given libbpf itself doesn't care about its contents at all.
> >> >
> >> > So thanks for bringing this up, I think this is an important
> >> > discussion to have.
> >>
> >> I'm fine with having this be part of .rodata. One drawback, though, is
> >> that if any metadata is defined, it becomes a bit more complicated to
> >> use bpf_map__set_initial_value() because that now also has to include
> >> the metadata. Any way we can improve upon that?
> >
> > I know that skeleton is not an answer for you, so you'll have to find
> > DATASEC and corresponding variable offset and size (libbpf provides
> > APIs for all those operations, but you'll need to combine them
> > together). Then mmap() map and then you can do partial updates. There
> > is no other way to update only portions of an ARRAY map, except
> > through memory-mapping.
>
> Well, I wouldn't mind having to go digging through the section. But is
> it really possible to pick out and modify parts of it my mmap() before
> the object is loaded (and the map frozen)? How? I seem to recall we
> added bpf_map__set_initial_value() because this was *not* possible with
> the public API?
>

Ah, right, .rodata is frozen on load, forgot we are talking about .rodata here.


> Also, for this, a bpf_map__get_initial_value() could be a simple way to
> allow partial modifications. The caller could just get the whole map
> value, modify it, and set it again afterwards with
> __set_initial_value(). Any objections to adding that?

Yeah, I think having an API for getting initial map value makes sense.
But please follow the naming convention for getters and call it
bpf_map__initial_value().

>
> -Toke
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2020-09-09 16:35 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-28 19:35 [PATCH bpf-next v3 0/8] Allow storage of flexible metadata information for eBPF programs Stanislav Fomichev
2020-08-28 19:35 ` [PATCH bpf-next v3 1/8] bpf: Mutex protect used_maps array and count Stanislav Fomichev
2020-08-28 19:35 ` [PATCH bpf-next v3 2/8] bpf: Add BPF_PROG_BIND_MAP syscall Stanislav Fomichev
2020-09-03  2:15   ` Andrii Nakryiko
2020-08-28 19:35 ` [PATCH bpf-next v3 3/8] libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .metadata section Stanislav Fomichev
2020-09-03  2:31   ` Andrii Nakryiko
2020-09-04  1:29     ` Alexei Starovoitov
2020-09-04 23:18       ` Andrii Nakryiko
2020-09-07  8:49         ` Toke Høiland-Jørgensen
2020-09-08 15:19           ` Stanislav Fomichev
2020-09-08 18:20             ` Andrii Nakryiko
2020-09-08 18:10           ` Andrii Nakryiko
2020-09-09 10:58             ` Toke Høiland-Jørgensen
2020-09-09 16:34               ` Andrii Nakryiko
2020-09-08 17:44         ` Andrey Ignatov
2020-09-08 18:24           ` Andrii Nakryiko
2020-08-28 19:35 ` [PATCH bpf-next v3 4/8] libbpf: implement bpf_prog_find_metadata Stanislav Fomichev
2020-08-28 21:10   ` Toke Høiland-Jørgensen
2020-08-31 15:40     ` sdf
2020-09-01 22:58       ` Alexei Starovoitov
2020-09-02  9:43         ` Toke Høiland-Jørgensen
2020-09-02 21:08           ` Alexei Starovoitov
2020-09-02 21:33             ` Toke Høiland-Jørgensen
2020-08-28 19:36 ` [PATCH bpf-next v3 5/8] bpftool: support dumping metadata Stanislav Fomichev
2020-09-03  5:00   ` Andrii Nakryiko
2020-09-08 20:53     ` Stanislav Fomichev
2020-09-08 22:35       ` Andrii Nakryiko
2020-09-08 22:49         ` Stanislav Fomichev
2020-08-28 19:36 ` [PATCH bpf-next v3 6/8] bpftool: support metadata internal map in gen skeleton Stanislav Fomichev
2020-08-28 19:36 ` [PATCH bpf-next v3 7/8] bpftool: mention --metadata in the documentation Stanislav Fomichev
2020-08-28 19:36 ` [PATCH bpf-next v3 8/8] selftests/bpf: Test load and dump metadata with btftool and skel Stanislav Fomichev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).