* [PATCH 0/8] bpf: Add fprobe link
@ 2022-02-02 13:53 Jiri Olsa
  2022-02-02 13:53 ` [PATCH 1/8] bpf: Add support to attach kprobe program with fprobe Jiri Olsa
                   ` (8 more replies)
  0 siblings, 9 replies; 50+ messages in thread
From: Jiri Olsa @ 2022-02-02 13:53 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt, Jiri Olsa

hi,
this patchset adds a new link type, BPF_LINK_TYPE_FPROBE, that attaches a
kprobe program through the fprobe API [1] introduced by Masami.

The fprobe API allows attaching a probe to multiple functions at once very
quickly, because it works on top of ftrace. On the other hand, this limits
the probe point to the function entry or return.

With bpftrace support I see the following attach speed:

  # perf stat --null -r 5 ./src/bpftrace -e 'kprobe:x* { } i:ms:1 { exit(); } '
  Attaching 2 probes...
  Attaching 3342 functions
  ...

  1.4960 +- 0.0285 seconds time elapsed  ( +-  1.91% )

Also available at:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  bpf/fprobe_link

thanks,
jirka


[1] https://lore.kernel.org/bpf/20220202162925.bd74e7970fc35cb4236eef48@kernel.org/T/#t
---
Jiri Olsa (8):
      bpf: Add support to attach kprobe program with fprobe
      bpf: Add bpf_get_func_ip kprobe helper for fprobe link
      bpf: Add bpf_cookie support to fprobe
      libbpf: Add libbpf__kallsyms_parse function
      libbpf: Add bpf_link_create support for multi kprobes
      libbpf: Add bpf_program__attach_kprobe_opts for multi kprobes
      selftest/bpf: Add fprobe attach test
      selftest/bpf: Add fprobe test for bpf_cookie values

 include/linux/bpf.h                                   |   2 +
 include/linux/bpf_types.h                             |   1 +
 include/uapi/linux/bpf.h                              |  14 +++++
 kernel/bpf/syscall.c                                  | 327 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/bpf/verifier.c                                 |  19 +++++-
 kernel/trace/bpf_trace.c                              |  32 +++++++++-
 tools/include/uapi/linux/bpf.h                        |  14 +++++
 tools/lib/bpf/bpf.c                                   |   7 +++
 tools/lib/bpf/bpf.h                                   |   9 ++-
 tools/lib/bpf/libbpf.c                                | 198 +++++++++++++++++++++++++++++++++++++++++++++++++++--------
 tools/lib/bpf/libbpf_internal.h                       |   5 ++
 tools/testing/selftests/bpf/prog_tests/bpf_cookie.c   |  73 ++++++++++++++++++++++
 tools/testing/selftests/bpf/prog_tests/fprobe_test.c  | 117 +++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/progs/fprobe.c            |  58 ++++++++++++++++++
 tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c |  62 +++++++++++++++++++
 15 files changed, 902 insertions(+), 36 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/fprobe_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/fprobe.c
 create mode 100644 tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c



* [PATCH 1/8] bpf: Add support to attach kprobe program with fprobe
  2022-02-02 13:53 [PATCH 0/8] bpf: Add fprobe link Jiri Olsa
@ 2022-02-02 13:53 ` Jiri Olsa
  2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-02 13:53 ` [PATCH 2/8] bpf: Add bpf_get_func_ip kprobe helper for fprobe link Jiri Olsa
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-02 13:53 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt, Jiri Olsa

Adding new link type BPF_LINK_TYPE_FPROBE that attaches a kprobe program
through the fprobe API.

The fprobe API allows attaching a probe to multiple functions at once very
quickly, because it works on top of ftrace. On the other hand, this limits
the probe point to the function entry or return.

The kprobe program gets the same pt_regs input ctx as when it's attached
through the perf API.

Adding new attach type BPF_TRACE_FPROBE that enables such a link for a
kprobe program.

The user provides either an array of addresses or an array of symbols (not
both), together with a count, to attach the kprobe program to. The new
link_create uapi interface looks like:

  struct {
          __aligned_u64   syms;
          __aligned_u64   addrs;
          __u32           cnt;
          __u32           flags;
  } fprobe;

The flags field accepts a single bit, BPF_F_FPROBE_RETURN, which creates a
return fprobe instead of an entry fprobe.
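
As a rough illustration of the argument rules described above, here is a
minimal userspace sketch. It is not the kernel code: struct fprobe_args,
FPROBE_RETURN and fprobe_args_check() are hypothetical names that only
mirror the proposed link_create.fprobe fields and the checks done before
any allocation in bpf_fprobe_link_attach().

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the proposed link_create.fprobe fields. */
struct fprobe_args {
	uint64_t syms;   /* user pointer to an array of symbol-name pointers */
	uint64_t addrs;  /* user pointer to an array of function addresses */
	uint32_t cnt;    /* number of entries in whichever array is set */
	uint32_t flags;  /* 0 for entry probe, FPROBE_RETURN for return probe */
};

/* Stands in for BPF_F_FPROBE_RETURN in this sketch. */
#define FPROBE_RETURN (1U << 0)

/* Returns 0 if the arguments would pass the checks, -EINVAL otherwise. */
static int fprobe_args_check(const struct fprobe_args *a)
{
	/* only the return bit is a known flag */
	if (a->flags & ~FPROBE_RETURN)
		return -EINVAL;

	/* exactly one of syms/addrs must be provided */
	if ((!a->addrs && !a->syms) || (a->addrs && a->syms))
		return -EINVAL;

	/* an empty list makes no sense */
	if (!a->cnt)
		return -EINVAL;

	return 0;
}
```

A caller would fill either addrs or syms, set cnt to the number of entries,
and set FPROBE_RETURN only when a return probe is wanted.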

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/bpf_types.h      |   1 +
 include/uapi/linux/bpf.h       |  13 ++
 kernel/bpf/syscall.c           | 248 ++++++++++++++++++++++++++++++++-
 tools/include/uapi/linux/bpf.h |  13 ++
 4 files changed, 270 insertions(+), 5 deletions(-)

diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 48a91c51c015..e279cea46653 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -140,3 +140,4 @@ BPF_LINK_TYPE(BPF_LINK_TYPE_XDP, xdp)
 #ifdef CONFIG_PERF_EVENTS
 BPF_LINK_TYPE(BPF_LINK_TYPE_PERF_EVENT, perf)
 #endif
+BPF_LINK_TYPE(BPF_LINK_TYPE_FPROBE, fprobe)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index a7f0ddedac1f..c0912f0a3dfe 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -997,6 +997,7 @@ enum bpf_attach_type {
 	BPF_SK_REUSEPORT_SELECT,
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
+	BPF_TRACE_FPROBE,
 	__MAX_BPF_ATTACH_TYPE
 };
 
@@ -1011,6 +1012,7 @@ enum bpf_link_type {
 	BPF_LINK_TYPE_NETNS = 5,
 	BPF_LINK_TYPE_XDP = 6,
 	BPF_LINK_TYPE_PERF_EVENT = 7,
+	BPF_LINK_TYPE_FPROBE = 8,
 
 	MAX_BPF_LINK_TYPE,
 };
@@ -1118,6 +1120,11 @@ enum bpf_link_type {
  */
 #define BPF_F_XDP_HAS_FRAGS	(1U << 5)
 
+/* link_create.fprobe.flags used in LINK_CREATE command for
+ * BPF_TRACE_FPROBE attach type to create return probe.
+ */
+#define BPF_F_FPROBE_RETURN	(1U << 0)
+
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
@@ -1472,6 +1479,12 @@ union bpf_attr {
 				 */
 				__u64		bpf_cookie;
 			} perf_event;
+			struct {
+				__aligned_u64	syms;
+				__aligned_u64	addrs;
+				__u32		cnt;
+				__u32		flags;
+			} fprobe;
 		};
 	} link_create;
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 72ce1edde950..0cfbb112c8e1 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -32,6 +32,7 @@
 #include <linux/bpf-netns.h>
 #include <linux/rcupdate_trace.h>
 #include <linux/memcontrol.h>
+#include <linux/fprobe.h>
 
 #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
 			  (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
@@ -3015,8 +3016,235 @@ static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *pro
 	fput(perf_file);
 	return err;
 }
+#else
+static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+{
+	return -EOPNOTSUPP;
+}
 #endif /* CONFIG_PERF_EVENTS */
 
+#ifdef CONFIG_FPROBE
+
+struct bpf_fprobe_link {
+	struct bpf_link link;
+	struct fprobe fp;
+	unsigned long *addrs;
+};
+
+static void bpf_fprobe_link_release(struct bpf_link *link)
+{
+	struct bpf_fprobe_link *fprobe_link;
+
+	fprobe_link = container_of(link, struct bpf_fprobe_link, link);
+	unregister_fprobe(&fprobe_link->fp);
+}
+
+static void bpf_fprobe_link_dealloc(struct bpf_link *link)
+{
+	struct bpf_fprobe_link *fprobe_link;
+
+	fprobe_link = container_of(link, struct bpf_fprobe_link, link);
+	kfree(fprobe_link->addrs);
+	kfree(fprobe_link);
+}
+
+static const struct bpf_link_ops bpf_fprobe_link_lops = {
+	.release = bpf_fprobe_link_release,
+	.dealloc = bpf_fprobe_link_dealloc,
+};
+
+static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
+				struct pt_regs *regs)
+{
+	int err;
+
+	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
+		err = 0;
+		goto out;
+	}
+
+	rcu_read_lock();
+	migrate_disable();
+	err = bpf_prog_run(fprobe_link->link.prog, regs);
+	migrate_enable();
+	rcu_read_unlock();
+
+ out:
+	__this_cpu_dec(bpf_prog_active);
+	return err;
+}
+
+static void fprobe_link_entry_handler(struct fprobe *fp, unsigned long entry_ip,
+				      struct pt_regs *regs)
+{
+	unsigned long saved_ip = instruction_pointer(regs);
+	struct bpf_fprobe_link *fprobe_link;
+
+	/*
+	 * Because fprobe's regs->ip is set to the next instruction of
+	 * dynamic-ftrace instruction, correct entry ip must be set, so
+	 * that the bpf program can access entry address via regs as same
+	 * as kprobes.
+	 */
+	instruction_pointer_set(regs, entry_ip);
+
+	fprobe_link = container_of(fp, struct bpf_fprobe_link, fp);
+	fprobe_link_prog_run(fprobe_link, regs);
+
+	instruction_pointer_set(regs, saved_ip);
+}
+
+static void fprobe_link_exit_handler(struct fprobe *fp, unsigned long entry_ip,
+				     struct pt_regs *regs)
+{
+	unsigned long saved_ip = instruction_pointer(regs);
+	struct bpf_fprobe_link *fprobe_link;
+
+	instruction_pointer_set(regs, entry_ip);
+
+	fprobe_link = container_of(fp, struct bpf_fprobe_link, fp);
+	fprobe_link_prog_run(fprobe_link, regs);
+
+	instruction_pointer_set(regs, saved_ip);
+}
+
+static int fprobe_resolve_syms(const void *usyms, u32 cnt,
+			       unsigned long *addrs)
+{
+	unsigned long addr, size;
+	const char **syms;
+	int err = -ENOMEM;
+	unsigned int i;
+	char *func;
+
+	size = cnt * sizeof(*syms);
+	syms = kzalloc(size, GFP_KERNEL);
+	if (!syms)
+		return -ENOMEM;
+
+	func = kzalloc(KSYM_NAME_LEN, GFP_KERNEL);
+	if (!func)
+		goto error;
+
+	if (copy_from_user(syms, usyms, size)) {
+		err = -EFAULT;
+		goto error;
+	}
+
+	for (i = 0; i < cnt; i++) {
+		err = strncpy_from_user(func, syms[i], KSYM_NAME_LEN);
+		if (err == KSYM_NAME_LEN)
+			err = -E2BIG;
+		if (err < 0)
+			goto error;
+
+		err = -EINVAL;
+		if (func[0] == '\0')
+			goto error;
+		addr = kallsyms_lookup_name(func);
+		if (!addr)
+			goto error;
+		if (!kallsyms_lookup_size_offset(addr, &size, NULL))
+			size = MCOUNT_INSN_SIZE;
+		addr = ftrace_location_range(addr, addr + size - 1);
+		if (!addr)
+			goto error;
+		addrs[i] = addr;
+	}
+
+	err = 0;
+error:
+	kfree(syms);
+	kfree(func);
+	return err;
+}
+
+static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+{
+	struct bpf_fprobe_link *link = NULL;
+	struct bpf_link_primer link_primer;
+	unsigned long *addrs;
+	u32 flags, cnt, size;
+	void __user *uaddrs;
+	void __user *usyms;
+	int err;
+
+	/* no support for 32bit archs yet */
+	if (sizeof(u64) != sizeof(void *))
+		return -EINVAL;
+
+	if (prog->expected_attach_type != BPF_TRACE_FPROBE)
+		return -EINVAL;
+
+	flags = attr->link_create.fprobe.flags;
+	if (flags & ~BPF_F_FPROBE_RETURN)
+		return -EINVAL;
+
+	uaddrs = u64_to_user_ptr(attr->link_create.fprobe.addrs);
+	usyms = u64_to_user_ptr(attr->link_create.fprobe.syms);
+	if ((!uaddrs && !usyms) || (uaddrs && usyms))
+		return -EINVAL;
+
+	cnt = attr->link_create.fprobe.cnt;
+	if (!cnt)
+		return -EINVAL;
+
+	size = cnt * sizeof(*addrs);
+	addrs = kzalloc(size, GFP_KERNEL);
+	if (!addrs)
+		return -ENOMEM;
+
+	if (uaddrs) {
+		if (copy_from_user(addrs, uaddrs, size)) {
+			err = -EFAULT;
+			goto error;
+		}
+	} else {
+		err = fprobe_resolve_syms(usyms, cnt, addrs);
+		if (err)
+			goto error;
+	}
+
+	link = kzalloc(sizeof(*link), GFP_KERNEL);
+	if (!link) {
+		err = -ENOMEM;
+		goto error;
+	}
+
+	bpf_link_init(&link->link, BPF_LINK_TYPE_FPROBE,
+		      &bpf_fprobe_link_lops, prog);
+
+	err = bpf_link_prime(&link->link, &link_primer);
+	if (err)
+		goto error;
+
+	if (flags & BPF_F_FPROBE_RETURN)
+		link->fp.exit_handler = fprobe_link_exit_handler;
+	else
+		link->fp.entry_handler = fprobe_link_entry_handler;
+
+	link->addrs = addrs;
+
+	err = register_fprobe_ips(&link->fp, addrs, cnt);
+	if (err) {
+		bpf_link_cleanup(&link_primer);
+		return err;
+	}
+
+	return bpf_link_settle(&link_primer);
+
+error:
+	kfree(link);
+	kfree(addrs);
+	return err;
+}
+#else /* !CONFIG_FPROBE */
+static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+{
+	return -EOPNOTSUPP;
+}
+#endif
+
 #define BPF_RAW_TRACEPOINT_OPEN_LAST_FIELD raw_tracepoint.prog_fd
 
 static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
@@ -4248,7 +4476,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
 	return -EINVAL;
 }
 
-#define BPF_LINK_CREATE_LAST_FIELD link_create.iter_info_len
+#define BPF_LINK_CREATE_LAST_FIELD link_create.fprobe.flags
 static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 {
 	enum bpf_prog_type ptype;
@@ -4272,7 +4500,6 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 		ret = tracing_bpf_link_attach(attr, uattr, prog);
 		goto out;
 	case BPF_PROG_TYPE_PERF_EVENT:
-	case BPF_PROG_TYPE_KPROBE:
 	case BPF_PROG_TYPE_TRACEPOINT:
 		if (attr->link_create.attach_type != BPF_PERF_EVENT) {
 			ret = -EINVAL;
@@ -4280,6 +4507,14 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 		}
 		ptype = prog->type;
 		break;
+	case BPF_PROG_TYPE_KPROBE:
+		if (attr->link_create.attach_type != BPF_PERF_EVENT &&
+		    attr->link_create.attach_type != BPF_TRACE_FPROBE) {
+			ret = -EINVAL;
+			goto out;
+		}
+		ptype = prog->type;
+		break;
 	default:
 		ptype = attach_type_to_prog_type(attr->link_create.attach_type);
 		if (ptype == BPF_PROG_TYPE_UNSPEC || ptype != prog->type) {
@@ -4311,13 +4546,16 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 		ret = bpf_xdp_link_attach(attr, prog);
 		break;
 #endif
-#ifdef CONFIG_PERF_EVENTS
 	case BPF_PROG_TYPE_PERF_EVENT:
 	case BPF_PROG_TYPE_TRACEPOINT:
-	case BPF_PROG_TYPE_KPROBE:
 		ret = bpf_perf_link_attach(attr, prog);
 		break;
-#endif
+	case BPF_PROG_TYPE_KPROBE:
+		if (attr->link_create.attach_type == BPF_PERF_EVENT)
+			ret = bpf_perf_link_attach(attr, prog);
+		else
+			ret = bpf_fprobe_link_attach(attr, prog);
+		break;
 	default:
 		ret = -EINVAL;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index a7f0ddedac1f..c0912f0a3dfe 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -997,6 +997,7 @@ enum bpf_attach_type {
 	BPF_SK_REUSEPORT_SELECT,
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
+	BPF_TRACE_FPROBE,
 	__MAX_BPF_ATTACH_TYPE
 };
 
@@ -1011,6 +1012,7 @@ enum bpf_link_type {
 	BPF_LINK_TYPE_NETNS = 5,
 	BPF_LINK_TYPE_XDP = 6,
 	BPF_LINK_TYPE_PERF_EVENT = 7,
+	BPF_LINK_TYPE_FPROBE = 8,
 
 	MAX_BPF_LINK_TYPE,
 };
@@ -1118,6 +1120,11 @@ enum bpf_link_type {
  */
 #define BPF_F_XDP_HAS_FRAGS	(1U << 5)
 
+/* link_create.fprobe.flags used in LINK_CREATE command for
+ * BPF_TRACE_FPROBE attach type to create return probe.
+ */
+#define BPF_F_FPROBE_RETURN	(1U << 0)
+
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
@@ -1472,6 +1479,12 @@ union bpf_attr {
 				 */
 				__u64		bpf_cookie;
 			} perf_event;
+			struct {
+				__aligned_u64	syms;
+				__aligned_u64	addrs;
+				__u32		cnt;
+				__u32		flags;
+			} fprobe;
 		};
 	} link_create;
 
-- 
2.34.1



* [PATCH 2/8] bpf: Add bpf_get_func_ip kprobe helper for fprobe link
  2022-02-02 13:53 [PATCH 0/8] bpf: Add fprobe link Jiri Olsa
  2022-02-02 13:53 ` [PATCH 1/8] bpf: Add support to attach kprobe program with fprobe Jiri Olsa
@ 2022-02-02 13:53 ` Jiri Olsa
  2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-02 13:53 ` [PATCH 3/8] bpf: Add bpf_cookie support to fprobe Jiri Olsa
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-02 13:53 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt, Jiri Olsa

Adding support to call the bpf_get_func_ip helper from kprobe
programs attached by fprobe link.

Also adding support to inline it, because it's a single load
instruction.
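
To show why a single load is enough, here is a small userspace model. It
assumes the behaviour of the fprobe link handlers from patch 1, which set
regs->ip to the function entry address while the program runs and restore
it afterwards. struct mock_pt_regs and the mock_* functions are made up for
this sketch and are not kernel APIs.

```c
#include <stdint.h>

/* Mock of the one pt_regs field this sketch needs (x86-64 naming). */
struct mock_pt_regs {
	uint64_t ip;
};

/* Stands in for the inlined helper: one load of ip from the context. */
static uint64_t mock_get_func_ip(const struct mock_pt_regs *regs)
{
	return regs->ip;
}

typedef uint64_t (*mock_prog_t)(const struct mock_pt_regs *regs);

/*
 * Models the link handler: expose entry_ip through regs while the program
 * runs, then restore the ip that ftrace handed in. Returns what the
 * program observed.
 */
static uint64_t mock_entry_handler(struct mock_pt_regs *regs,
				   uint64_t entry_ip, mock_prog_t prog)
{
	uint64_t saved_ip = regs->ip;
	uint64_t seen;

	regs->ip = entry_ip;
	seen = prog(regs);
	regs->ip = saved_ip;

	return seen;
}
```

Because the handler has already placed the entry address in regs->ip, the
helper does not need any per-probe lookup.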

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/bpf/verifier.c    | 19 ++++++++++++++++++-
 kernel/trace/bpf_trace.c | 16 +++++++++++++++-
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1ae41d0cf96c..a745ded00635 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13625,7 +13625,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			continue;
 		}
 
-		/* Implement bpf_get_func_ip inline. */
+		/* Implement tracing bpf_get_func_ip inline. */
 		if (prog_type == BPF_PROG_TYPE_TRACING &&
 		    insn->imm == BPF_FUNC_get_func_ip) {
 			/* Load IP address from ctx - 16 */
@@ -13640,6 +13640,23 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			continue;
 		}
 
+		/* Implement kprobe/fprobe bpf_get_func_ip inline. */
+		if (prog_type == BPF_PROG_TYPE_KPROBE &&
+		    eatype == BPF_TRACE_FPROBE &&
+		    insn->imm == BPF_FUNC_get_func_ip) {
+			/* Load IP address from ctx (struct pt_regs) ip */
+			insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1,
+						  offsetof(struct pt_regs, ip));
+
+			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 1);
+			if (!new_prog)
+				return -ENOMEM;
+
+			env->prog = prog = new_prog;
+			insn      = new_prog->insnsi + i + delta;
+			continue;
+		}
+
 patch_call_imm:
 		fn = env->ops->get_func_proto(insn->imm, env->prog);
 		/* all functions that have prototype and verifier allowed
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index a2024ba32a20..28e59e31e3db 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1036,6 +1036,19 @@ static const struct bpf_func_proto bpf_get_func_ip_proto_kprobe = {
 	.arg1_type	= ARG_PTR_TO_CTX,
 };
 
+BPF_CALL_1(bpf_get_func_ip_fprobe, struct pt_regs *, regs)
+{
+	/* This helper call is inlined by verifier. */
+	return regs->ip;
+}
+
+static const struct bpf_func_proto bpf_get_func_ip_proto_fprobe = {
+	.func		= bpf_get_func_ip_fprobe,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+};
+
 BPF_CALL_1(bpf_get_attach_cookie_trace, void *, ctx)
 {
 	struct bpf_trace_run_ctx *run_ctx;
@@ -1279,7 +1292,8 @@ kprobe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_override_return_proto;
 #endif
 	case BPF_FUNC_get_func_ip:
-		return &bpf_get_func_ip_proto_kprobe;
+		return prog->expected_attach_type == BPF_TRACE_FPROBE ?
+			&bpf_get_func_ip_proto_fprobe : &bpf_get_func_ip_proto_kprobe;
 	case BPF_FUNC_get_attach_cookie:
 		return &bpf_get_attach_cookie_proto_trace;
 	default:
-- 
2.34.1



* [PATCH 3/8] bpf: Add bpf_cookie support to fprobe
  2022-02-02 13:53 [PATCH 0/8] bpf: Add fprobe link Jiri Olsa
  2022-02-02 13:53 ` [PATCH 1/8] bpf: Add support to attach kprobe program with fprobe Jiri Olsa
  2022-02-02 13:53 ` [PATCH 2/8] bpf: Add bpf_get_func_ip kprobe helper for fprobe link Jiri Olsa
@ 2022-02-02 13:53 ` Jiri Olsa
  2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-02 13:53 ` [PATCH 4/8] libbpf: Add libbpf__kallsyms_parse function Jiri Olsa
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-02 13:53 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt, Jiri Olsa

Adding support to call the bpf_get_attach_cookie helper from
a kprobe program attached by fprobe link.

The bpf_cookie values are provided as an array of u64 values,
where each value is paired with the function address at the
same array index.
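
The pairing and lookup can be modelled in userspace roughly as follows.
This is only a sketch: it uses libc qsort()/bsearch() in place of the
kernel's sort()/bsearch(), and struct cookie_entry, cookie_table_build()
and cookie_lookup() are hypothetical names chosen for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* One (address, cookie) pair, analogous to struct bpf_fprobe_cookie. */
struct cookie_entry {
	unsigned long addr;
	uint64_t cookie;
};

/* Order entries by address so the table can be binary searched. */
static int cookie_cmp(const void *_a, const void *_b)
{
	const struct cookie_entry *a = _a;
	const struct cookie_entry *b = _b;

	if (a->addr == b->addr)
		return 0;
	return a->addr < b->addr ? -1 : 1;
}

/* Pair addrs[i] with cookies[i], then sort by address. Caller frees. */
static struct cookie_entry *cookie_table_build(const unsigned long *addrs,
					       const uint64_t *cookies,
					       size_t cnt)
{
	struct cookie_entry *tab = calloc(cnt, sizeof(*tab));
	size_t i;

	if (!tab)
		return NULL;

	for (i = 0; i < cnt; i++) {
		tab[i].addr = addrs[i];
		tab[i].cookie = cookies[i];
	}

	qsort(tab, cnt, sizeof(*tab), cookie_cmp);
	return tab;
}

/* Return the cookie for ip, or 0 when there is no table or no match. */
static uint64_t cookie_lookup(const struct cookie_entry *tab, size_t cnt,
			      unsigned long ip)
{
	struct cookie_entry key = { .addr = ip };
	const struct cookie_entry *val;

	if (!tab)
		return 0;

	val = bsearch(&key, tab, cnt, sizeof(*tab), cookie_cmp);
	return val ? val->cookie : 0;
}
```

Sorting once at attach time keeps the per-hit cost at O(log cnt), which
matters when one link covers thousands of functions.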

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/bpf.h            |  2 +
 include/uapi/linux/bpf.h       |  1 +
 kernel/bpf/syscall.c           | 83 +++++++++++++++++++++++++++++++++-
 kernel/trace/bpf_trace.c       | 16 ++++++-
 tools/include/uapi/linux/bpf.h |  1 +
 5 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6eb0b180d33b..7b65f05c0487 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1301,6 +1301,8 @@ static inline void bpf_reset_run_ctx(struct bpf_run_ctx *old_ctx)
 #endif
 }
 
+u64 bpf_fprobe_cookie(struct bpf_run_ctx *ctx, u64 ip);
+
 /* BPF program asks to bypass CAP_NET_BIND_SERVICE in bind. */
 #define BPF_RET_BIND_NO_CAP_NET_BIND_SERVICE			(1 << 0)
 /* BPF program asks to set CN on the packet. */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c0912f0a3dfe..0dc6aa4f9683 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1484,6 +1484,7 @@ union bpf_attr {
 				__aligned_u64	addrs;
 				__u32		cnt;
 				__u32		flags;
+				__aligned_u64	bpf_cookies;
 			} fprobe;
 		};
 	} link_create;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0cfbb112c8e1..6c5e74bc43b6 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -33,6 +33,8 @@
 #include <linux/rcupdate_trace.h>
 #include <linux/memcontrol.h>
 #include <linux/fprobe.h>
+#include <linux/bsearch.h>
+#include <linux/sort.h>
 
 #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
 			  (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
@@ -3025,10 +3027,18 @@ static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *pro
 
 #ifdef CONFIG_FPROBE
 
+struct bpf_fprobe_cookie {
+	unsigned long addr;
+	u64 bpf_cookie;
+};
+
 struct bpf_fprobe_link {
 	struct bpf_link link;
 	struct fprobe fp;
 	unsigned long *addrs;
+	struct bpf_run_ctx run_ctx;
+	struct bpf_fprobe_cookie *bpf_cookies;
+	u32 cnt;
 };
 
 static void bpf_fprobe_link_release(struct bpf_link *link)
@@ -3045,6 +3055,7 @@ static void bpf_fprobe_link_dealloc(struct bpf_link *link)
 
 	fprobe_link = container_of(link, struct bpf_fprobe_link, link);
 	kfree(fprobe_link->addrs);
+	kfree(fprobe_link->bpf_cookies);
 	kfree(fprobe_link);
 }
 
@@ -3053,9 +3064,37 @@ static const struct bpf_link_ops bpf_fprobe_link_lops = {
 	.dealloc = bpf_fprobe_link_dealloc,
 };
 
+static int bpf_fprobe_cookie_cmp(const void *_a, const void *_b)
+{
+	const struct bpf_fprobe_cookie *a = _a;
+	const struct bpf_fprobe_cookie *b = _b;
+
+	if (a->addr == b->addr)
+		return 0;
+	return a->addr < b->addr ? -1 : 1;
+}
+
+u64 bpf_fprobe_cookie(struct bpf_run_ctx *ctx, u64 ip)
+{
+	struct bpf_fprobe_link *fprobe_link;
+	struct bpf_fprobe_cookie *val, key = {
+		.addr = (unsigned long) ip,
+	};
+
+	if (!ctx)
+		return 0;
+	fprobe_link = container_of(ctx, struct bpf_fprobe_link, run_ctx);
+	if (!fprobe_link->bpf_cookies)
+		return 0;
+	val = bsearch(&key, fprobe_link->bpf_cookies, fprobe_link->cnt,
+		      sizeof(key), bpf_fprobe_cookie_cmp);
+	return val ? val->bpf_cookie : 0;
+}
+
 static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
 				struct pt_regs *regs)
 {
+	struct bpf_run_ctx *old_run_ctx;
 	int err;
 
 	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
@@ -3063,12 +3102,16 @@ static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
 		goto out;
 	}
 
+	old_run_ctx = bpf_set_run_ctx(&fprobe_link->run_ctx);
+
 	rcu_read_lock();
 	migrate_disable();
 	err = bpf_prog_run(fprobe_link->link.prog, regs);
 	migrate_enable();
 	rcu_read_unlock();
 
+	bpf_reset_run_ctx(old_run_ctx);
+
  out:
 	__this_cpu_dec(bpf_prog_active);
 	return err;
@@ -3161,10 +3204,12 @@ static int fprobe_resolve_syms(const void *usyms, u32 cnt,
 
 static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
 {
+	struct bpf_fprobe_cookie *bpf_cookies = NULL;
 	struct bpf_fprobe_link *link = NULL;
 	struct bpf_link_primer link_primer;
+	void __user *ubpf_cookies;
+	u32 flags, cnt, i, size;
 	unsigned long *addrs;
-	u32 flags, cnt, size;
 	void __user *uaddrs;
 	void __user *usyms;
 	int err;
@@ -3205,6 +3250,37 @@ static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *p
 			goto error;
 	}
 
+	ubpf_cookies = u64_to_user_ptr(attr->link_create.fprobe.bpf_cookies);
+	if (ubpf_cookies) {
+		u64 *tmp;
+
+		err = -ENOMEM;
+		tmp = kzalloc(size, GFP_KERNEL);
+		if (!tmp)
+			goto error;
+
+		if (copy_from_user(tmp, ubpf_cookies, size)) {
+			kfree(tmp);
+			err = -EFAULT;
+			goto error;
+		}
+
+		size = cnt * sizeof(*bpf_cookies);
+		bpf_cookies = kzalloc(size, GFP_KERNEL);
+		if (!bpf_cookies) {
+			kfree(tmp);
+			goto error;
+		}
+
+		for (i = 0; i < cnt; i++) {
+			bpf_cookies[i].addr = addrs[i];
+			bpf_cookies[i].bpf_cookie = tmp[i];
+		}
+
+		sort(bpf_cookies, cnt, sizeof(*bpf_cookies), bpf_fprobe_cookie_cmp, NULL);
+		kfree(tmp);
+	}
+
 	link = kzalloc(sizeof(*link), GFP_KERNEL);
 	if (!link) {
 		err = -ENOMEM;
@@ -3224,6 +3300,8 @@ static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *p
 		link->fp.entry_handler = fprobe_link_entry_handler;
 
 	link->addrs = addrs;
+	link->bpf_cookies = bpf_cookies;
+	link->cnt = cnt;
 
 	err = register_fprobe_ips(&link->fp, addrs, cnt);
 	if (err) {
@@ -3236,6 +3314,7 @@ static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *p
 error:
 	kfree(link);
 	kfree(addrs);
+	kfree(bpf_cookies);
 	return err;
 }
 #else /* !CONFIG_FPROBE */
@@ -4476,7 +4555,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
 	return -EINVAL;
 }
 
-#define BPF_LINK_CREATE_LAST_FIELD link_create.fprobe.flags
+#define BPF_LINK_CREATE_LAST_FIELD link_create.fprobe.bpf_cookies
 static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 {
 	enum bpf_prog_type ptype;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 28e59e31e3db..b54b2ef93928 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1049,6 +1049,18 @@ static const struct bpf_func_proto bpf_get_func_ip_proto_fprobe = {
 	.arg1_type	= ARG_PTR_TO_CTX,
 };
 
+BPF_CALL_1(bpf_get_attach_cookie_fprobe, struct pt_regs *, regs)
+{
+	return bpf_fprobe_cookie(current->bpf_ctx, regs->ip);
+}
+
+static const struct bpf_func_proto bpf_get_attach_cookie_proto_fprobe = {
+	.func		= bpf_get_attach_cookie_fprobe,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+};
+
 BPF_CALL_1(bpf_get_attach_cookie_trace, void *, ctx)
 {
 	struct bpf_trace_run_ctx *run_ctx;
@@ -1295,7 +1307,9 @@ kprobe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return prog->expected_attach_type == BPF_TRACE_FPROBE ?
 			&bpf_get_func_ip_proto_fprobe : &bpf_get_func_ip_proto_kprobe;
 	case BPF_FUNC_get_attach_cookie:
-		return &bpf_get_attach_cookie_proto_trace;
+		return prog->expected_attach_type == BPF_TRACE_FPROBE ?
+			&bpf_get_attach_cookie_proto_fprobe :
+			&bpf_get_attach_cookie_proto_trace;
 	default:
 		return bpf_tracing_func_proto(func_id, prog);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c0912f0a3dfe..0dc6aa4f9683 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1484,6 +1484,7 @@ union bpf_attr {
 				__aligned_u64	addrs;
 				__u32		cnt;
 				__u32		flags;
+				__aligned_u64	bpf_cookies;
 			} fprobe;
 		};
 	} link_create;
-- 
2.34.1



* [PATCH 4/8] libbpf: Add libbpf__kallsyms_parse function
  2022-02-02 13:53 [PATCH 0/8] bpf: Add fprobe link Jiri Olsa
                   ` (2 preceding siblings ...)
  2022-02-02 13:53 ` [PATCH 3/8] bpf: Add bpf_cookie support to fprobe Jiri Olsa
@ 2022-02-02 13:53 ` Jiri Olsa
  2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-02 13:53 ` [PATCH 5/8] libbpf: Add bpf_link_create support for multi kprobes Jiri Olsa
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-02 13:53 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt, Jiri Olsa

Move the kallsyms parsing into the internal libbpf__kallsyms_parse
function, so it can be used from other places.

It will be used in the following changes.
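
For illustration, the callback pattern looks roughly like the sketch below.
To stay self-contained it parses "address type name" lines from an
in-memory string instead of /proc/kallsyms. parse_kallsyms_text(),
struct sym_query and find_sym_cb() are hypothetical names and not part of
libbpf; only the callback signature follows kallsyms_cb_t from this patch.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Same shape as kallsyms_cb_t in this patch. */
typedef int (*sym_cb_t)(void *arg, unsigned long long sym_addr,
			char sym_type, const char *sym_name);

/*
 * Call cb for every "addr type name" line in text. Stops at the first
 * malformed line (-EINVAL) or at the first non-zero callback result.
 */
static int parse_kallsyms_text(const char *text, void *arg, sym_cb_t cb)
{
	char sym_type, sym_name[500], line[600];
	unsigned long long sym_addr;
	int err;

	while (*text) {
		const char *eol = strchr(text, '\n');
		size_t len = eol ? (size_t)(eol - text) : strlen(text);

		if (len >= sizeof(line))
			return -EINVAL;
		memcpy(line, text, len);
		line[len] = '\0';
		text += len + (eol ? 1 : 0);

		if (len == 0)
			continue; /* skip empty lines */

		/* anything after the name (e.g. a module tag) is ignored */
		if (sscanf(line, "%llx %c %499s",
			   &sym_addr, &sym_type, sym_name) != 3)
			return -EINVAL;

		err = cb(arg, sym_addr, sym_type, sym_name);
		if (err)
			return err;
	}

	return 0;
}

/* Example callback state: look up one symbol by name. */
struct sym_query {
	const char *name;
	unsigned long long addr;
};

static int find_sym_cb(void *arg, unsigned long long sym_addr,
		       char sym_type, const char *sym_name)
{
	struct sym_query *q = arg;

	(void)sym_type;
	if (strcmp(sym_name, q->name) == 0)
		q->addr = sym_addr;
	return 0;
}
```

Keeping the per-symbol logic in the callback is what lets the same parser
serve both the existing extern ksym resolution and new users.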

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/lib/bpf/libbpf.c          | 62 ++++++++++++++++++++-------------
 tools/lib/bpf/libbpf_internal.h |  5 +++
 2 files changed, 43 insertions(+), 24 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 1b0936b016d9..7d595cfd03bc 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -7165,12 +7165,10 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj)
 	return 0;
 }
 
-static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
+int libbpf__kallsyms_parse(void *arg, kallsyms_cb_t cb)
 {
 	char sym_type, sym_name[500];
 	unsigned long long sym_addr;
-	const struct btf_type *t;
-	struct extern_desc *ext;
 	int ret, err = 0;
 	FILE *f;
 
@@ -7189,35 +7187,51 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
 		if (ret != 3) {
 			pr_warn("failed to read kallsyms entry: %d\n", ret);
 			err = -EINVAL;
-			goto out;
+			break;
 		}
 
-		ext = find_extern_by_name(obj, sym_name);
-		if (!ext || ext->type != EXT_KSYM)
-			continue;
-
-		t = btf__type_by_id(obj->btf, ext->btf_id);
-		if (!btf_is_var(t))
-			continue;
-
-		if (ext->is_set && ext->ksym.addr != sym_addr) {
-			pr_warn("extern (ksym) '%s' resolution is ambiguous: 0x%llx or 0x%llx\n",
-				sym_name, ext->ksym.addr, sym_addr);
-			err = -EINVAL;
-			goto out;
-		}
-		if (!ext->is_set) {
-			ext->is_set = true;
-			ext->ksym.addr = sym_addr;
-			pr_debug("extern (ksym) %s=0x%llx\n", sym_name, sym_addr);
-		}
+		err = cb(arg, sym_addr, sym_type, sym_name);
+		if (err)
+			break;
 	}
 
-out:
 	fclose(f);
 	return err;
 }
 
+static int kallsyms_cb(void *arg, unsigned long long sym_addr,
+		       char sym_type, const char *sym_name)
+{
+	struct bpf_object *obj = arg;
+	const struct btf_type *t;
+	struct extern_desc *ext;
+
+	ext = find_extern_by_name(obj, sym_name);
+	if (!ext || ext->type != EXT_KSYM)
+		return 0;
+
+	t = btf__type_by_id(obj->btf, ext->btf_id);
+	if (!btf_is_var(t))
+		return 0;
+
+	if (ext->is_set && ext->ksym.addr != sym_addr) {
+		pr_warn("extern (ksym) '%s' resolution is ambiguous: 0x%llx or 0x%llx\n",
+			sym_name, ext->ksym.addr, sym_addr);
+		return -EINVAL;
+	}
+	if (!ext->is_set) {
+		ext->is_set = true;
+		ext->ksym.addr = sym_addr;
+		pr_debug("extern (ksym) %s=0x%llx\n", sym_name, sym_addr);
+	}
+	return 0;
+}
+
+static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
+{
+	return libbpf__kallsyms_parse(obj, kallsyms_cb);
+}
+
 static int find_ksym_btf_id(struct bpf_object *obj, const char *ksym_name,
 			    __u16 kind, struct btf **res_btf,
 			    struct module_btf **res_mod_btf)
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index bc86b82e90d1..fb3b07d401df 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -449,6 +449,11 @@ __s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,
 
 extern enum libbpf_strict_mode libbpf_mode;
 
+typedef int (*kallsyms_cb_t)(void *arg, unsigned long long sym_addr,
+			     char sym_type, const char *sym_name);
+
+int libbpf__kallsyms_parse(void *arg, kallsyms_cb_t cb);
+
 /* handle direct returned errors */
 static inline int libbpf_err(int ret)
 {
-- 
2.34.1



* [PATCH 5/8] libbpf: Add bpf_link_create support for multi kprobes
  2022-02-02 13:53 [PATCH 0/8] bpf: Add fprobe link Jiri Olsa
                   ` (3 preceding siblings ...)
  2022-02-02 13:53 ` [PATCH 4/8] libbpf: Add libbpf__kallsyms_parse function Jiri Olsa
@ 2022-02-02 13:53 ` Jiri Olsa
  2022-02-02 13:53 ` [PATCH 6/8] libbpf: Add bpf_program__attach_kprobe_opts " Jiri Olsa
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 50+ messages in thread
From: Jiri Olsa @ 2022-02-02 13:53 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt, Jiri Olsa

Adding new fprobe struct to the bpf_link_create_opts object
to pass multi kprobe data (symbols, addresses, count, flags
and bpf_cookies) to the link_create attr API.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/lib/bpf/bpf.c | 7 +++++++
 tools/lib/bpf/bpf.h | 9 ++++++++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 418b259166f8..98156709a96c 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -853,6 +853,13 @@ int bpf_link_create(int prog_fd, int target_fd,
 		if (!OPTS_ZEROED(opts, perf_event))
 			return libbpf_err(-EINVAL);
 		break;
+	case BPF_TRACE_FPROBE:
+		attr.link_create.fprobe.syms = OPTS_GET(opts, fprobe.syms, 0);
+		attr.link_create.fprobe.addrs = OPTS_GET(opts, fprobe.addrs, 0);
+		attr.link_create.fprobe.cnt = OPTS_GET(opts, fprobe.cnt, 0);
+		attr.link_create.fprobe.flags = OPTS_GET(opts, fprobe.flags, 0);
+		attr.link_create.fprobe.bpf_cookies = OPTS_GET(opts, fprobe.bpf_cookies, 0);
+		break;
 	default:
 		if (!OPTS_ZEROED(opts, flags))
 			return libbpf_err(-EINVAL);
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index c2e8327010f9..114e828ae027 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -413,10 +413,17 @@ struct bpf_link_create_opts {
 		struct {
 			__u64 bpf_cookie;
 		} perf_event;
+		struct {
+			__u64 syms;
+			__u64 addrs;
+			__u32 cnt;
+			__u32 flags;
+			__u64 bpf_cookies;
+		} fprobe;
 	};
 	size_t :0;
 };
-#define bpf_link_create_opts__last_field perf_event
+#define bpf_link_create_opts__last_field fprobe.bpf_cookies
 
 LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
 			       enum bpf_attach_type attach_type,
-- 
2.34.1
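
The new opts member carries user-space arrays as plain 64-bit values, so callers have to cast pointers when filling it in. The sketch below illustrates that packing with a local mirror of the struct. fprobe_opts_sketch, ptr_to_u64() and fprobe_opts_fill() are hypothetical names, not libbpf API, and the rule that exactly one of 'syms' or 'addrs' is given is an assumption about what the kernel side would accept, not something stated in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the 'fprobe' member added to bpf_link_create_opts. */
struct fprobe_opts_sketch {
	uint64_t syms;		/* pointer to array of const char * */
	uint64_t addrs;		/* pointer to array of u64 addresses */
	uint32_t cnt;		/* number of entries in the arrays */
	uint32_t flags;
	uint64_t bpf_cookies;	/* pointer to array of u64 cookies, may be 0 */
};

/* Going through uintptr_t keeps the cast well defined on 32-bit builds. */
static inline uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Fill the opts from caller-owned arrays.
 * Assumption: exactly one of syms/addrs is set and cnt is non-zero.
 */
static int fprobe_opts_fill(struct fprobe_opts_sketch *o,
			    const char **syms, const uint64_t *addrs,
			    const uint64_t *cookies, uint32_t cnt,
			    uint32_t flags)
{
	assert(o);
	if (!cnt || (!syms == !addrs))
		return -1;

	memset(o, 0, sizeof(*o));
	o->syms = ptr_to_u64(syms);
	o->addrs = ptr_to_u64(addrs);
	o->bpf_cookies = ptr_to_u64(cookies);
	o->cnt = cnt;
	o->flags = flags;
	return 0;
}
```

The arrays are only referenced, not copied, so they have to stay valid until the link_create call returns.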



* [PATCH 6/8] libbpf: Add bpf_program__attach_kprobe_opts for multi kprobes
  2022-02-02 13:53 [PATCH 0/8] bpf: Add fprobe link Jiri Olsa
                   ` (4 preceding siblings ...)
  2022-02-02 13:53 ` [PATCH 5/8] libbpf: Add bpf_link_create support for multi kprobes Jiri Olsa
@ 2022-02-02 13:53 ` Jiri Olsa
  2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-02 13:53 ` [PATCH 7/8] selftest/bpf: Add fprobe attach test Jiri Olsa
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-02 13:53 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt, Jiri Olsa

Adding support to bpf_program__attach_kprobe_opts to attach a kprobe
program to multiple functions.

If the kprobe program has BPF_TRACE_FPROBE as its expected_attach_type,
it will use the new fprobe link to attach the program. In this case
'func_name' is used as a pattern for the functions to attach to.

Also adding support for the '*' wildcard in 'kprobe/kretprobe' section
names defined by the SEC macro, like:

  SEC("kprobe/bpf_fentry_test*")
  SEC("kretprobe/bpf_fentry_test*")

This sets the program's expected_attach_type to BPF_TRACE_FPROBE
and attaches it to all functions matching the provided pattern.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/lib/bpf/libbpf.c | 136 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 133 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 7d595cfd03bc..6b343ef77ed8 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -8607,13 +8607,15 @@ static struct bpf_link *attach_trace(const struct bpf_program *prog, long cookie
 static struct bpf_link *attach_lsm(const struct bpf_program *prog, long cookie);
 static struct bpf_link *attach_iter(const struct bpf_program *prog, long cookie);
 
+static int init_kprobe(struct bpf_program *prog, long cookie);
+
 static const struct bpf_sec_def section_defs[] = {
 	SEC_DEF("socket",		SOCKET_FILTER, 0, SEC_NONE | SEC_SLOPPY_PFX),
 	SEC_DEF("sk_reuseport/migrate",	SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT_OR_MIGRATE, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
 	SEC_DEF("sk_reuseport",		SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
-	SEC_DEF("kprobe/",		KPROBE,	0, SEC_NONE, attach_kprobe),
+	SEC_DEF("kprobe/",		KPROBE,	0, SEC_NONE, attach_kprobe, .init_fn = init_kprobe),
 	SEC_DEF("uprobe/",		KPROBE,	0, SEC_NONE),
-	SEC_DEF("kretprobe/",		KPROBE, 0, SEC_NONE, attach_kprobe),
+	SEC_DEF("kretprobe/",		KPROBE, 0, SEC_NONE, attach_kprobe, .init_fn = init_kprobe),
 	SEC_DEF("uretprobe/",		KPROBE, 0, SEC_NONE),
 	SEC_DEF("tc",			SCHED_CLS, 0, SEC_NONE),
 	SEC_DEF("classifier",		SCHED_CLS, 0, SEC_NONE | SEC_SLOPPY_PFX | SEC_DEPRECATED),
@@ -10031,6 +10033,123 @@ static int perf_event_kprobe_open_legacy(const char *probe_name, bool retprobe,
 	return pfd;
 }
 
+struct fprobe_resolve {
+	const char *name;
+	__u64 *addrs;
+	__u32 alloc;
+	__u32 cnt;
+};
+
+static bool glob_matches(const char *glob, const char *s)
+{
+	int n = strlen(glob);
+
+	if (n == 1 && glob[0] == '*')
+		return true;
+
+	if (glob[0] == '*' && glob[n - 1] == '*') {
+		const char *subs;
+		/* substring match */
+
+		/* this is hacky, but we don't want to allocate
+		 * for no good reason
+		 */
+		((char *)glob)[n - 1] = '\0';
+		subs = strstr(s, glob + 1);
+		((char *)glob)[n - 1] = '*';
+
+		return subs != NULL;
+	} else if (glob[0] == '*') {
+		size_t nn = strlen(s);
+		/* suffix match */
+
+		/* too short for a given suffix */
+		if (nn < n - 1)
+			return false;
+		return strcmp(s + nn - (n - 1), glob + 1) == 0;
+	} else if (glob[n - 1] == '*') {
+		/* prefix match */
+		return strncmp(s, glob, n - 1) == 0;
+	} else {
+		/* exact match */
+		return strcmp(glob, s) == 0;
+	}
+}
+
+static int resolve_fprobe_cb(void *arg, unsigned long long sym_addr,
+			     char sym_type, const char *sym_name)
+{
+	struct fprobe_resolve *res = arg;
+	__u64 *p;
+
+	if (!glob_matches(res->name, sym_name))
+		return 0;
+
+	if (res->cnt == res->alloc) {
+		res->alloc = max((__u32) 16, res->alloc * 3 / 2);
+		p = libbpf_reallocarray(res->addrs, res->alloc, sizeof(__u64));
+		if (!p)
+			return -ENOMEM;
+		res->addrs = p;
+	}
+	res->addrs[res->cnt++] = sym_addr;
+	return 0;
+}
+
+static struct bpf_link *
+attach_fprobe_opts(const struct bpf_program *prog,
+		   const char *func_name,
+		   const struct bpf_kprobe_opts *kopts)
+{
+	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
+	struct fprobe_resolve res = {
+		.name = func_name,
+	};
+	struct bpf_link *link = NULL;
+	char errmsg[STRERR_BUFSIZE];
+	int err, link_fd, prog_fd;
+	bool retprobe;
+
+	err = libbpf__kallsyms_parse(&res, resolve_fprobe_cb);
+	if (err)
+		goto error;
+	if (!res.cnt) {
+		err = -ENOENT;
+		goto error;
+	}
+
+	retprobe = OPTS_GET(kopts, retprobe, false);
+
+	opts.fprobe.addrs = (__u64) res.addrs;
+	opts.fprobe.cnt = res.cnt;
+	opts.fprobe.flags = retprobe ? BPF_F_FPROBE_RETURN : 0;
+
+	link = calloc(1, sizeof(*link));
+	if (!link) {
+		err = -ENOMEM;
+		goto error;
+	}
+	link->detach = &bpf_link__detach_fd;
+
+	prog_fd = bpf_program__fd(prog);
+	link_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
+	if (link_fd < 0) {
+		err = -errno;
+		pr_warn("prog '%s': failed to attach to %s: %s\n",
+			prog->name, res.name,
+			libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
+		goto error;
+	}
+	link->fd = link_fd;
+	free(res.addrs);
+	return link;
+
+error:
+	free(link);
+	free(res.addrs);
+	return libbpf_err_ptr(err);
+}
+
 struct bpf_link *
 bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
 				const char *func_name,
@@ -10047,6 +10166,9 @@ bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
 	if (!OPTS_VALID(opts, bpf_kprobe_opts))
 		return libbpf_err_ptr(-EINVAL);
 
+	if (prog->expected_attach_type == BPF_TRACE_FPROBE)
+		return attach_fprobe_opts(prog, func_name, opts);
+
 	retprobe = OPTS_GET(opts, retprobe, false);
 	offset = OPTS_GET(opts, offset, 0);
 	pe_opts.bpf_cookie = OPTS_GET(opts, bpf_cookie, 0);
@@ -10112,6 +10234,14 @@ struct bpf_link *bpf_program__attach_kprobe(const struct bpf_program *prog,
 	return bpf_program__attach_kprobe_opts(prog, func_name, &opts);
 }
 
+static int init_kprobe(struct bpf_program *prog, long cookie)
+{
+	/* If we have wildcard, switch to fprobe link. */
+	if (strchr(prog->sec_name, '*'))
+		bpf_program__set_expected_attach_type(prog, BPF_TRACE_FPROBE);
+	return 0;
+}
+
 static struct bpf_link *attach_kprobe(const struct bpf_program *prog, long cookie)
 {
 	DECLARE_LIBBPF_OPTS(bpf_kprobe_opts, opts);
@@ -10127,7 +10257,7 @@ static struct bpf_link *attach_kprobe(const struct bpf_program *prog, long cooki
 	else
 		func_name = prog->sec_name + sizeof("kprobe/") - 1;
 
-	n = sscanf(func_name, "%m[a-zA-Z0-9_.]+%li", &func, &offset);
+	n = sscanf(func_name, "%m[a-zA-Z0-9_.*]+%li", &func, &offset);
 	if (n < 1) {
 		err = -EINVAL;
 		pr_warn("kprobe name is invalid: %s\n", func_name);
-- 
2.34.1



* [PATCH 7/8] selftest/bpf: Add fprobe attach test
  2022-02-02 13:53 [PATCH 0/8] bpf: Add fprobe link Jiri Olsa
                   ` (5 preceding siblings ...)
  2022-02-02 13:53 ` [PATCH 6/8] libbpf: Add bpf_program__attach_kprobe_opts " Jiri Olsa
@ 2022-02-02 13:53 ` Jiri Olsa
  2022-02-02 13:53 ` [PATCH 8/8] selftest/bpf: Add fprobe test for bpf_cookie values Jiri Olsa
  2022-02-02 17:09 ` [PATCH 0/8] bpf: Add fprobe link Alexei Starovoitov
  8 siblings, 0 replies; 50+ messages in thread
From: Jiri Olsa @ 2022-02-02 13:53 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt, Jiri Olsa

Adding kprobe attach test that uses the new fprobe interface
to attach a kprobe program to multiple functions.

The test attaches to the bpf_fentry_test* functions and uses
a single trampoline program, run through bpf_prog_test_run,
to trigger them.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 .../selftests/bpf/prog_tests/fprobe_test.c    | 117 ++++++++++++++++++
 tools/testing/selftests/bpf/progs/fprobe.c    |  58 +++++++++
 2 files changed, 175 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/fprobe_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/fprobe.c

diff --git a/tools/testing/selftests/bpf/prog_tests/fprobe_test.c b/tools/testing/selftests/bpf/prog_tests/fprobe_test.c
new file mode 100644
index 000000000000..dcbde37ec369
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/fprobe_test.c
@@ -0,0 +1,117 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include "fprobe.skel.h"
+#include "trace_helpers.h"
+
+static void test_skel_api(void)
+{
+	struct fprobe *skel = NULL;
+	__u32 duration = 0, retval;
+	int err, prog_fd;
+
+	skel = fprobe__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "fprobe__open_and_load"))
+		goto cleanup;
+
+	err = fprobe__attach(skel);
+	if (!ASSERT_OK(err, "fprobe__attach"))
+		goto cleanup;
+
+	prog_fd = bpf_program__fd(skel->progs.test1);
+	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+				NULL, NULL, &retval, &duration);
+	ASSERT_OK(err, "test_run");
+	ASSERT_EQ(retval, 0, "test_run");
+
+	ASSERT_EQ(skel->bss->test2_result, 8, "test2_result");
+	ASSERT_EQ(skel->bss->test3_result, 8, "test3_result");
+
+cleanup:
+	fprobe__destroy(skel);
+}
+
+static void test_link_api(struct bpf_link_create_opts *opts)
+{
+	int err, prog_fd, link1_fd = -1, link2_fd = -1;
+	struct fprobe *skel = NULL;
+	__u32 duration = 0, retval;
+
+	skel = fprobe__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "fprobe__open_and_load"))
+		goto cleanup;
+
+	prog_fd = bpf_program__fd(skel->progs.test2);
+	link1_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, opts);
+	if (!ASSERT_GE(link1_fd, 0, "link_fd"))
+		goto cleanup;
+
+	opts->fprobe.flags = BPF_F_FPROBE_RETURN;
+	prog_fd = bpf_program__fd(skel->progs.test3);
+	link2_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, opts);
+	if (!ASSERT_GE(link2_fd, 0, "link_fd"))
+		goto cleanup;
+
+	skel->bss->test2_result = 0;
+	skel->bss->test3_result = 0;
+
+	prog_fd = bpf_program__fd(skel->progs.test1);
+	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+				NULL, NULL, &retval, &duration);
+	ASSERT_OK(err, "test_run");
+	ASSERT_EQ(retval, 0, "test_run");
+
+	ASSERT_EQ(skel->bss->test2_result, 8, "test2_result");
+	ASSERT_EQ(skel->bss->test3_result, 8, "test3_result");
+
+cleanup:
+	if (link1_fd != -1)
+		close(link1_fd);
+	if (link2_fd != -1)
+		close(link2_fd);
+	fprobe__destroy(skel);
+}
+
+static void test_link_api_addrs(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
+	__u64 addrs[8];
+
+	kallsyms_find("bpf_fentry_test1", &addrs[0]);
+	kallsyms_find("bpf_fentry_test2", &addrs[1]);
+	kallsyms_find("bpf_fentry_test3", &addrs[2]);
+	kallsyms_find("bpf_fentry_test4", &addrs[3]);
+	kallsyms_find("bpf_fentry_test5", &addrs[4]);
+	kallsyms_find("bpf_fentry_test6", &addrs[5]);
+	kallsyms_find("bpf_fentry_test7", &addrs[6]);
+	kallsyms_find("bpf_fentry_test8", &addrs[7]);
+
+	opts.fprobe.addrs = (__u64) addrs;
+	opts.fprobe.cnt = 8;
+	test_link_api(&opts);
+}
+
+static void test_link_api_syms(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
+	const char *syms[8] = {
+		"bpf_fentry_test1",
+		"bpf_fentry_test2",
+		"bpf_fentry_test3",
+		"bpf_fentry_test4",
+		"bpf_fentry_test5",
+		"bpf_fentry_test6",
+		"bpf_fentry_test7",
+		"bpf_fentry_test8",
+	};
+
+	opts.fprobe.syms = (__u64) syms;
+	opts.fprobe.cnt = 8;
+	test_link_api(&opts);
+}
+
+void test_fprobe_test(void)
+{
+	test_skel_api();
+	test_link_api_syms();
+	test_link_api_addrs();
+}
diff --git a/tools/testing/selftests/bpf/progs/fprobe.c b/tools/testing/selftests/bpf/progs/fprobe.c
new file mode 100644
index 000000000000..baf7086203f9
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/fprobe.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+extern const void bpf_fentry_test1 __ksym;
+extern const void bpf_fentry_test2 __ksym;
+extern const void bpf_fentry_test3 __ksym;
+extern const void bpf_fentry_test4 __ksym;
+extern const void bpf_fentry_test5 __ksym;
+extern const void bpf_fentry_test6 __ksym;
+extern const void bpf_fentry_test7 __ksym;
+extern const void bpf_fentry_test8 __ksym;
+
+/* No tests, just to trigger bpf_fentry_test* through tracing test_run */
+SEC("fentry/bpf_modify_return_test")
+int BPF_PROG(test1)
+{
+	return 0;
+}
+
+__u64 test2_result = 0;
+
+SEC("kprobe/bpf_fentry_test*")
+int test2(struct pt_regs *ctx)
+{
+	__u64 addr = bpf_get_func_ip(ctx);
+
+	test2_result += (const void *) addr == &bpf_fentry_test1 ||
+			(const void *) addr == &bpf_fentry_test2 ||
+			(const void *) addr == &bpf_fentry_test3 ||
+			(const void *) addr == &bpf_fentry_test4 ||
+			(const void *) addr == &bpf_fentry_test5 ||
+			(const void *) addr == &bpf_fentry_test6 ||
+			(const void *) addr == &bpf_fentry_test7 ||
+			(const void *) addr == &bpf_fentry_test8;
+	return 0;
+}
+
+__u64 test3_result = 0;
+
+SEC("kretprobe/bpf_fentry_test*")
+int test3(struct pt_regs *ctx)
+{
+	__u64 addr = bpf_get_func_ip(ctx);
+
+	test3_result += (const void *) addr == &bpf_fentry_test1 ||
+			(const void *) addr == &bpf_fentry_test2 ||
+			(const void *) addr == &bpf_fentry_test3 ||
+			(const void *) addr == &bpf_fentry_test4 ||
+			(const void *) addr == &bpf_fentry_test5 ||
+			(const void *) addr == &bpf_fentry_test6 ||
+			(const void *) addr == &bpf_fentry_test7 ||
+			(const void *) addr == &bpf_fentry_test8;
+	return 0;
+}
-- 
2.34.1



* [PATCH 8/8] selftest/bpf: Add fprobe test for bpf_cookie values
  2022-02-02 13:53 [PATCH 0/8] bpf: Add fprobe link Jiri Olsa
                   ` (6 preceding siblings ...)
  2022-02-02 13:53 ` [PATCH 7/8] selftest/bpf: Add fprobe attach test Jiri Olsa
@ 2022-02-02 13:53 ` Jiri Olsa
  2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-02 17:09 ` [PATCH 0/8] bpf: Add fprobe link Alexei Starovoitov
  8 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-02 13:53 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Steven Rostedt, Jiri Olsa

Adding bpf_cookie test for kprobe attached by fprobe link.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 .../selftests/bpf/prog_tests/bpf_cookie.c     | 73 +++++++++++++++++++
 .../selftests/bpf/progs/fprobe_bpf_cookie.c   | 62 ++++++++++++++++
 2 files changed, 135 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
index cd10df6cd0fc..bf70d859c598 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
@@ -7,6 +7,7 @@
 #include <unistd.h>
 #include <test_progs.h>
 #include "test_bpf_cookie.skel.h"
+#include "fprobe_bpf_cookie.skel.h"
 
 /* uprobe attach point */
 static void trigger_func(void)
@@ -63,6 +64,76 @@ static void kprobe_subtest(struct test_bpf_cookie *skel)
 	bpf_link__destroy(retlink2);
 }
 
+static void fprobe_subtest(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
+	int err, prog_fd, link1_fd = -1, link2_fd = -1;
+	struct fprobe_bpf_cookie *skel = NULL;
+	__u32 duration = 0, retval;
+	__u64 addrs[8], cookies[8];
+
+	skel = fprobe_bpf_cookie__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "fprobe_bpf_cookie__open_and_load"))
+		goto cleanup;
+
+	kallsyms_find("bpf_fentry_test1", &addrs[0]);
+	kallsyms_find("bpf_fentry_test2", &addrs[1]);
+	kallsyms_find("bpf_fentry_test3", &addrs[2]);
+	kallsyms_find("bpf_fentry_test4", &addrs[3]);
+	kallsyms_find("bpf_fentry_test5", &addrs[4]);
+	kallsyms_find("bpf_fentry_test6", &addrs[5]);
+	kallsyms_find("bpf_fentry_test7", &addrs[6]);
+	kallsyms_find("bpf_fentry_test8", &addrs[7]);
+
+	cookies[0] = 1;
+	cookies[1] = 2;
+	cookies[2] = 3;
+	cookies[3] = 4;
+	cookies[4] = 5;
+	cookies[5] = 6;
+	cookies[6] = 7;
+	cookies[7] = 8;
+
+	opts.fprobe.addrs = (__u64) &addrs;
+	opts.fprobe.cnt = 8;
+	opts.fprobe.bpf_cookies = (__u64) &cookies;
+	prog_fd = bpf_program__fd(skel->progs.test2);
+
+	link1_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
+	if (!ASSERT_GE(link1_fd, 0, "link1_fd"))
+		goto cleanup;
+
+	cookies[0] = 8;
+	cookies[1] = 7;
+	cookies[2] = 6;
+	cookies[3] = 5;
+	cookies[4] = 4;
+	cookies[5] = 3;
+	cookies[6] = 2;
+	cookies[7] = 1;
+
+	opts.fprobe.flags = BPF_F_FPROBE_RETURN;
+	prog_fd = bpf_program__fd(skel->progs.test3);
+
+	link2_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
+	if (!ASSERT_GE(link2_fd, 0, "link2_fd"))
+		goto cleanup;
+
+	prog_fd = bpf_program__fd(skel->progs.test1);
+	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+				NULL, NULL, &retval, &duration);
+	ASSERT_OK(err, "test_run");
+	ASSERT_EQ(retval, 0, "test_run");
+
+	ASSERT_EQ(skel->bss->test2_result, 8, "test2_result");
+	ASSERT_EQ(skel->bss->test3_result, 8, "test3_result");
+
+cleanup:
+	close(link1_fd);
+	close(link2_fd);
+	fprobe_bpf_cookie__destroy(skel);
+}
+
 static void uprobe_subtest(struct test_bpf_cookie *skel)
 {
 	DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, opts);
@@ -249,6 +320,8 @@ void test_bpf_cookie(void)
 
 	if (test__start_subtest("kprobe"))
 		kprobe_subtest(skel);
+	if (test__start_subtest("fprobe"))
+		fprobe_subtest();
 	if (test__start_subtest("uprobe"))
 		uprobe_subtest(skel);
 	if (test__start_subtest("tracepoint"))
diff --git a/tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c b/tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c
new file mode 100644
index 000000000000..42cb109e5a30
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+extern const void bpf_fentry_test1 __ksym;
+extern const void bpf_fentry_test2 __ksym;
+extern const void bpf_fentry_test3 __ksym;
+extern const void bpf_fentry_test4 __ksym;
+extern const void bpf_fentry_test5 __ksym;
+extern const void bpf_fentry_test6 __ksym;
+extern const void bpf_fentry_test7 __ksym;
+extern const void bpf_fentry_test8 __ksym;
+
+/* No tests, just to trigger bpf_fentry_test* through tracing test_run */
+SEC("fentry/bpf_modify_return_test")
+int BPF_PROG(test1)
+{
+	return 0;
+}
+
+__u64 test2_result = 0;
+
+SEC("kprobe/bpf_fentry_test*")
+int test2(struct pt_regs *ctx)
+{
+	__u64 cookie = bpf_get_attach_cookie(ctx);
+	__u64 addr = bpf_get_func_ip(ctx);
+
+	test2_result += (const void *) addr == &bpf_fentry_test1 && cookie == 1;
+	test2_result += (const void *) addr == &bpf_fentry_test2 && cookie == 2;
+	test2_result += (const void *) addr == &bpf_fentry_test3 && cookie == 3;
+	test2_result += (const void *) addr == &bpf_fentry_test4 && cookie == 4;
+	test2_result += (const void *) addr == &bpf_fentry_test5 && cookie == 5;
+	test2_result += (const void *) addr == &bpf_fentry_test6 && cookie == 6;
+	test2_result += (const void *) addr == &bpf_fentry_test7 && cookie == 7;
+	test2_result += (const void *) addr == &bpf_fentry_test8 && cookie == 8;
+
+	return 0;
+}
+
+__u64 test3_result = 0;
+
+SEC("kretprobe/bpf_fentry_test*")
+int test3(struct pt_regs *ctx)
+{
+	__u64 cookie = bpf_get_attach_cookie(ctx);
+	__u64 addr = bpf_get_func_ip(ctx);
+
+	test3_result += (const void *) addr == &bpf_fentry_test1 && cookie == 8;
+	test3_result += (const void *) addr == &bpf_fentry_test2 && cookie == 7;
+	test3_result += (const void *) addr == &bpf_fentry_test3 && cookie == 6;
+	test3_result += (const void *) addr == &bpf_fentry_test4 && cookie == 5;
+	test3_result += (const void *) addr == &bpf_fentry_test5 && cookie == 4;
+	test3_result += (const void *) addr == &bpf_fentry_test6 && cookie == 3;
+	test3_result += (const void *) addr == &bpf_fentry_test7 && cookie == 2;
+	test3_result += (const void *) addr == &bpf_fentry_test8 && cookie == 1;
+
+	return 0;
+}
-- 
2.34.1
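
The test above depends on the cookie array being parallel to the address array: entry i of 'bpf_cookies' belongs to entry i of 'addrs'. The user-space sketch below models that pairing and the expected counts. How the kernel actually looks up the cookie is not shown in this part of the thread, so the linear scan here is purely illustrative, and cookie_for_ip() and count_expected() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the cookie paired with 'ip', or 0 if 'ip' was not attached. */
static uint64_t cookie_for_ip(const uint64_t *addrs, const uint64_t *cookies,
			      size_t cnt, uint64_t ip)
{
	size_t i;

	assert(addrs && cookies);
	for (i = 0; i < cnt; i++)
		if (addrs[i] == ip)
			return cookies[i];
	return 0;
}

/* Model of the BPF-side check: every probed function fires once and the
 * handler adds 1 when the cookie it sees matches the expected value.
 */
static int count_expected(const uint64_t *addrs, const uint64_t *cookies,
			  const uint64_t *expected, size_t cnt)
{
	int hits = 0;
	size_t i;

	for (i = 0; i < cnt; i++)
		hits += cookie_for_ip(addrs, cookies, cnt, addrs[i]) == expected[i];
	return hits;
}
```

With eight functions, matching cookies give a result of 8, which is the value both test2_result and test3_result are compared against.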



* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-02 13:53 [PATCH 0/8] bpf: Add fprobe link Jiri Olsa
                   ` (7 preceding siblings ...)
  2022-02-02 13:53 ` [PATCH 8/8] selftest/bpf: Add fprobe test for bpf_cookie values Jiri Olsa
@ 2022-02-02 17:09 ` Alexei Starovoitov
  2022-02-02 17:24   ` Jiri Olsa
  8 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-02-02 17:09 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Network Development, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt, Jiri Olsa

On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> hi,
> this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
> program through fprobe API [1] introduced by Masami.

No new prog type please.
I thought I made my reasons clear earlier.
It's a multi kprobe. Not a fprobe or any other name.
The kernel internal names should not leak into uapi.


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-02 17:09 ` [PATCH 0/8] bpf: Add fprobe link Alexei Starovoitov
@ 2022-02-02 17:24   ` Jiri Olsa
  2022-02-02 17:30     ` Alexei Starovoitov
  0 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-02 17:24 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Network Development, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt, Jiri Olsa

On Wed, Feb 02, 2022 at 09:09:53AM -0800, Alexei Starovoitov wrote:
> On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > hi,
> > this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
> > program through fprobe API [1] introduced by Masami.
> 
> No new prog type please.
> I thought I made my reasons clear earlier.
> It's a multi kprobe. Not a fprobe or any other name.
> The kernel internal names should not leak into uapi.
> 

well it's not new prog type, it's new link type that allows
to attach kprobe program to multiple functions

the original change used BPF_LINK_TYPE_RAW_KPROBE, which did not
seem to fit anymore, so I moved to FPROBE, because that's what
it is ;-)

but if you don't want new name in uapi we could make this more
obvious with link name:
  BPF_LINK_TYPE_MULTI_KPROBE

and bpf_attach_type:
  BPF_TRACE_MULTI_KPROBE

jirka



* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-02 17:24   ` Jiri Olsa
@ 2022-02-02 17:30     ` Alexei Starovoitov
  2022-02-03 15:06       ` Jiri Olsa
  0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-02-02 17:30 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Network Development, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt, Jiri Olsa

On Wed, Feb 2, 2022 at 9:24 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, Feb 02, 2022 at 09:09:53AM -0800, Alexei Starovoitov wrote:
> > On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > hi,
> > > this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
> > > program through fprobe API [1] introduced by Masami.
> >
> > No new prog type please.
> > I thought I made my reasons clear earlier.
> > It's a multi kprobe. Not a fprobe or any other name.
> > The kernel internal names should not leak into uapi.
> >
>
> well it's not new prog type, it's new link type that allows
> to attach kprobe program to multiple functions
>
> the original change used BPF_LINK_TYPE_RAW_KPROBE, which did not
> seem to fit anymore, so I moved to FPROBE, because that's what
> it is ;-)

Now I don't like the fprobe name even more.
Why invent new names? It's an ftrace interface.

> but if you don't want new name in uapi we could make this more
> obvious with link name:
>   BPF_LINK_TYPE_MULTI_KPROBE
>
> and bpf_attach_type:
>   BPF_TRACE_MULTI_KPROBE

I'd rather get rid of fprobe name first.


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-02 17:30     ` Alexei Starovoitov
@ 2022-02-03 15:06       ` Jiri Olsa
  2022-02-04  0:46         ` Masami Hiramatsu
  0 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-03 15:06 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Network Development, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt, Jiri Olsa

On Wed, Feb 02, 2022 at 09:30:21AM -0800, Alexei Starovoitov wrote:
> On Wed, Feb 2, 2022 at 9:24 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Wed, Feb 02, 2022 at 09:09:53AM -0800, Alexei Starovoitov wrote:
> > > On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > hi,
> > > > this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
> > > > program through fprobe API [1] introduced by Masami.
> > >
> > > No new prog type please.
> > > I thought I made my reasons clear earlier.
> > > It's a multi kprobe. Not a fprobe or any other name.
> > > The kernel internal names should not leak into uapi.
> > >
> >
> > well it's not new prog type, it's new link type that allows
> > to attach kprobe program to multiple functions
> >
> > the original change used BPF_LINK_TYPE_RAW_KPROBE, which did not
> > seem to fit anymore, so I moved to FPROBE, because that's what
> > it is ;-)
> 
> Now I don't like the fprobe name even more.
> Why invent new names? It's an ftrace interface.

how about ftrace_probe ?

> 
> > but if you don't want new name in uapi we could make this more
> > obvious with link name:
> >   BPF_LINK_TYPE_MULTI_KPROBE
> >
> > and bpf_attach_type:
> >   BPF_TRACE_MULTI_KPROBE
> 
> I'd rather get rid of fprobe name first.
>

Masami, any idea?

thanks,
jirka



* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-03 15:06       ` Jiri Olsa
@ 2022-02-04  0:46         ` Masami Hiramatsu
  2022-02-04  1:34           ` Alexei Starovoitov
  0 siblings, 1 reply; 50+ messages in thread
From: Masami Hiramatsu @ 2022-02-04  0:46 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Network Development, bpf,
	lkml, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Steven Rostedt, Jiri Olsa

On Thu, 3 Feb 2022 16:06:36 +0100
Jiri Olsa <jolsa@redhat.com> wrote:

> On Wed, Feb 02, 2022 at 09:30:21AM -0800, Alexei Starovoitov wrote:
> > On Wed, Feb 2, 2022 at 9:24 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > On Wed, Feb 02, 2022 at 09:09:53AM -0800, Alexei Starovoitov wrote:
> > > > On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > >
> > > > > hi,
> > > > > this patchset adds new link type BPF_LINK_TYPE_FPROBE that attaches kprobe
> > > > > program through fprobe API [1] introduced by Masami.
> > > >
> > > > No new prog type please.
> > > > I thought I made my reasons clear earlier.
> > > > It's a multi kprobe. Not a fprobe or any other name.
> > > > The kernel internal names should not leak into uapi.
> > > >
> > >
> > > well it's not new prog type, it's new link type that allows
> > > to attach kprobe program to multiple functions
> > >
> > > the original change used BPF_LINK_TYPE_RAW_KPROBE, which did not
> > > seem to fit anymore, so I moved to FPROBE, because that's what
> > > it is ;-)
> > 
> > Now I don't like the fprobe name even more.
> > Why invent new names? It's an ftrace interface.
> 
> how about ftrace_probe ?

I thought what Alexei pointed out was that we should not expose the
FPROBE name to user space. If so, I agree with that. We can continue to
use KPROBE for user space. Using fprobe is just for the kernel
implementation.

It means that we had better keep a simple mental model (there are only
static events or dynamic kprobe events).


> > > but if you don't want new name in uapi we could make this more
> > > obvious with link name:
> > >   BPF_LINK_TYPE_MULTI_KPROBE
> > >
> > > and bpf_attach_type:
> > >   BPF_TRACE_MULTI_KPROBE
> > 
> > I'd rather get rid of fprobe name first.
> >
> 
> Masami, any idea?

Can't we continue to use kprobe prog type for user interface
and internally, if there are multiple kprobes or kretprobes
required, switch to use fprobe?

Thank you,

> 
> thanks,
> jirka
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-04  0:46         ` Masami Hiramatsu
@ 2022-02-04  1:34           ` Alexei Starovoitov
  2022-02-04  2:07             ` Masami Hiramatsu
  0 siblings, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-02-04  1:34 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Steven Rostedt,
	Jiri Olsa

On Thu, Feb 3, 2022 at 4:46 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> I thought what Alexei pointed out was that we shouldn't expose the FPROBE
> name to user space. If so, I agree with that. We can continue to use
> KPROBE for user space. The fprobe name is just for the kernel implementation.

Clearly that intent is not working.
The "fprobe" name is already leaking outside of the kernel internals.
The module interface is being proposed.
You'd need to document it, etc.
I think it's only causing confusion to users.
The new name serves no additional purpose other than
being new and unheard of.
fprobe is kprobe on ftrace. That's it.
Just call it kprobe on ftrace in api and everywhere.
Please?


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-04  1:34           ` Alexei Starovoitov
@ 2022-02-04  2:07             ` Masami Hiramatsu
  2022-02-04  2:12               ` Alexei Starovoitov
  0 siblings, 1 reply; 50+ messages in thread
From: Masami Hiramatsu @ 2022-02-04  2:07 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Steven Rostedt,
	Jiri Olsa

On Thu, 3 Feb 2022 17:34:54 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Feb 3, 2022 at 4:46 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > I thought what Alexei pointed out was that we shouldn't expose the FPROBE
> > name to user space. If so, I agree with that. We can continue to use
> > KPROBE for user space. The fprobe name is just for the kernel implementation.
> 
> Clearly that intent is not working.

Thanks for confirmation :-)

> The "fprobe" name is already leaking outside of the kernel internals.
> The module interface is being proposed.

Yes, but that is only for making the example module.
It is easy for me to keep it inside the kernel. I'm preparing KUnit
selftest code for the next version. Once that is integrated, we won't
need that example module anymore.

> You'd need to document it, etc.

Yes, I've added a document of the APIs for the series.  :-)

> I think it's only causing confusion to users.
> The new name serves no additional purpose other than
> being new and unheard of.
> fprobe is kprobe on ftrace. That's it.

No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
transparently.

> Just call it kprobe on ftrace in api and everywhere.
> Please?

Hmm, no, I think that's the job of whoever provides the user interface, isn't it?
Inside the kernel, IMHO, interfaces are named from the programming viewpoint, and
from that viewpoint, fprobe and kprobe interfaces are similar but different.

I can make kprobe-event (of ftrace) accept "func*" (yeah, that's actually
a good idea), but the ftrace interface will not export it as fprobe. Even if
it internally uses fprobe, I won't call it fprobe. It's kprobes from the
viewpoint of the ftrace user. (Yeah, I think it should be called
"dynamic-probe-event-for-kernel", but historically it has been called kprobe-event.)

Thank you, 

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-04  2:07             ` Masami Hiramatsu
@ 2022-02-04  2:12               ` Alexei Starovoitov
  2022-02-04  2:19                 ` Steven Rostedt
  2022-02-04  3:14                 ` Masami Hiramatsu
  0 siblings, 2 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2022-02-04  2:12 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Steven Rostedt,
	Jiri Olsa

On Thu, Feb 3, 2022 at 6:07 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 3 Feb 2022 17:34:54 -0800
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> > On Thu, Feb 3, 2022 at 4:46 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > I thought what Alexei pointed out was that we shouldn't expose the FPROBE
> > > name to user space. If so, I agree with that. We can continue to use
> > > KPROBE for user space. The fprobe name is just for the kernel implementation.
> >
> > Clearly that intent is not working.
>
> Thanks for confirmation :-)
>
> > The "fprobe" name is already leaking outside of the kernel internals.
> > The module interface is being proposed.
>
> Yes, but that is only for making the example module.
> It is easy for me to keep it inside the kernel. I'm preparing KUnit
> selftest code for the next version. Once that is integrated, we won't
> need that example module anymore.
>
> > You'd need to document it, etc.
>
> Yes, I've added a document of the APIs for the series.  :-)
>
> > I think it's only causing confusion to users.
> > The new name serves no additional purpose other than
> > being new and unheard of.
> > fprobe is kprobe on ftrace. That's it.
>
> No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> transparently.

Not true.
fprobe is nothing but _explicit_ kprobe on ftrace.
There was an implicit optimization for kprobe when ftrace
could be used.
All this new interface is doing is making it explicit.
So a new name is not warranted here.

> from that viewpoint, fprobe and kprobe interface are similar but different.

What is the difference?
I don't see it.


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-04  2:12               ` Alexei Starovoitov
@ 2022-02-04  2:19                 ` Steven Rostedt
  2022-02-04  2:42                   ` Alexei Starovoitov
  2022-02-04  3:14                 ` Masami Hiramatsu
  1 sibling, 1 reply; 50+ messages in thread
From: Steven Rostedt @ 2022-02-04  2:19 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Masami Hiramatsu, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Network Development, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Jiri Olsa

On Thu, 3 Feb 2022 18:12:11 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > transparently.  
> 
> Not true.
> fprobe is nothing but _explicit_ kprobe on ftrace.
> There was an implicit optimization for kprobe when ftrace
> could be used.
> All this new interface is doing is making it explicit.
> So a new name is not warranted here.
> 
> > from that viewpoint, fprobe and kprobe interface are similar but different.  
> 
> What is the difference?
> I don't see it.

IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
abilities that a normal kprobe does not. Namely, "what are the function
parameters?"

You can only reliably get the parameters at function entry. Hence, having
a probe that is unique to functions, as opposed to the middle of a
function, makes sense to me.

That is, the API can change. "Give me parameter X". That along with some
BTF reading, could figure out how to get parameter X, and record that.

-- Steve


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-04  2:19                 ` Steven Rostedt
@ 2022-02-04  2:42                   ` Alexei Starovoitov
  2022-02-04  3:17                     ` Masami Hiramatsu
  2022-02-04  3:59                     ` Masami Hiramatsu
  0 siblings, 2 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2022-02-04  2:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Network Development, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Jiri Olsa

On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Thu, 3 Feb 2022 18:12:11 -0800
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > transparently.
> >
> > Not true.
> > fprobe is nothing but _explicit_ kprobe on ftrace.
> > There was an implicit optimization for kprobe when ftrace
> > could be used.
> > All this new interface is doing is making it explicit.
> > So a new name is not warranted here.
> >
> > > from that viewpoint, fprobe and kprobe interface are similar but different.
> >
> > What is the difference?
> > I don't see it.
>
> IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> abilities that a normal kprobe does not. Namely, "what are the function
> parameters?"
>
> You can only reliably get the parameters at function entry. Hence, having
> a probe that is unique to functions, as opposed to the middle of a
> function, makes sense to me.
>
> That is, the API can change. "Give me parameter X". That along with some
> BTF reading, could figure out how to get parameter X, and record that.

This is more or less a description of kprobe on ftrace :)
The bpf+kprobe users were relying on that for a long time.
See PT_REGS_PARM1() macros in bpf_tracing.h
They're meaningful only with kprobe on ftrace.
So, no, fprobe is not inventing anything new here.

No one is using kprobe in the middle of the function.
It's too difficult to make anything useful out of it,
so no one bothers.
When people say "kprobe" 99 out of 100 they mean
kprobe on ftrace/fentry.


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-04  2:12               ` Alexei Starovoitov
  2022-02-04  2:19                 ` Steven Rostedt
@ 2022-02-04  3:14                 ` Masami Hiramatsu
  1 sibling, 0 replies; 50+ messages in thread
From: Masami Hiramatsu @ 2022-02-04  3:14 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Steven Rostedt,
	Jiri Olsa

On Thu, 3 Feb 2022 18:12:11 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Feb 3, 2022 at 6:07 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 3 Feb 2022 17:34:54 -0800
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > > On Thu, Feb 3, 2022 at 4:46 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > >
> > > > I thought what Alexei pointed out was that we shouldn't expose the FPROBE
> > > > name to user space. If so, I agree with that. We can continue to use
> > > > KPROBE for user space. The fprobe name is just for the kernel implementation.
> > >
> > > Clearly that intent is not working.
> >
> > Thanks for confirmation :-)
> >
> > > The "fprobe" name is already leaking outside of the kernel internals.
> > > The module interface is being proposed.
> >
> > Yes, but that is only for making the example module.
> > It is easy for me to keep it inside the kernel. I'm preparing KUnit
> > selftest code for the next version. Once that is integrated, we won't
> > need that example module anymore.
> >
> > > You'd need to document it, etc.
> >
> > Yes, I've added a document of the APIs for the series.  :-)
> >
> > > I think it's only causing confusion to users.
> > > The new name serves no additional purpose other than
> > > being new and unheard of.
> > > fprobe is kprobe on ftrace. That's it.
> >
> > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > transparently.
> 
> Not true.
> fprobe is nothing but _explicit_ kprobe on ftrace.
> There was an implicit optimization for kprobe when ftrace
> could be used.
> All this new interface is doing is making it explicit.
> So a new name is not warranted here.
> 
> > from that viewpoint, fprobe and kprobe interface are similar but different.
> 
> What is the difference?
> I don't see it.

From the raw kernel programmer's viewpoint, here are the differences.

kprobes focuses on probing a single probe point, and it can probe
anywhere, including the function body. With this characteristic, the user
can write callback logic that is specialized for a specific address.

typedef int (*kprobe_pre_handler_t) (struct kprobe *, struct pt_regs *);


On the other hand, fprobe focuses on the entry and exit of multiple functions.
It is just a wrapper around ftrace. So callbacks need to check the
function IP and change their behavior according to that IP.

        void (*entry_handler)(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs);
        void (*exit_handler)(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs);

This is why the fprobe handler gets @entry_ip for the handlers.
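
To make the dispatch concrete, here is a rough userspace sketch (not kernel
code). The two structs and the two addresses are stand-ins invented for
illustration; only the shape of the handler mirrors the fprobe callback
above. One handler serves every attached function, so it has to branch on
entry_ip:

```c
/* Userspace stand-ins for the kernel types; only the shape matters here. */
struct pt_regs_model { unsigned long ip; };
struct fprobe_model { unsigned long hits_a, hits_b, hits_other; };

/* Hypothetical entry addresses of two probed functions. */
#define FUNC_A_IP 0x1000UL
#define FUNC_B_IP 0x2000UL

/* A single entry handler shared by all attached functions. */
static void model_entry_handler(struct fprobe_model *fp, unsigned long entry_ip,
				struct pt_regs_model *regs)
{
	(void)regs; /* unused in this sketch */

	/* The handler only learns which function fired from entry_ip. */
	switch (entry_ip) {
	case FUNC_A_IP:
		fp->hits_a++;
		break;
	case FUNC_B_IP:
		fp->hits_b++;
		break;
	default:
		fp->hits_other++;
		break;
	}
}
```

A kprobe pre-handler would not need the switch, since each struct kprobe is
registered for exactly one address.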

However, from the viewpoint of higher-level users, those may look the same
because both interrupt kernel execution and call back into their program,
such as BPF. BPF can select the correct program according to the
instruction_pointer of @regs in both cases.

In that case, I think it is natural that the BPF layer hides those differences
from the user by abstracting them as a generic "kprobe", meaning the general
idea of kernel instrumentation.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-04  2:42                   ` Alexei Starovoitov
@ 2022-02-04  3:17                     ` Masami Hiramatsu
  2022-02-04  3:59                     ` Masami Hiramatsu
  1 sibling, 0 replies; 50+ messages in thread
From: Masami Hiramatsu @ 2022-02-04  3:17 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Steven Rostedt, Masami Hiramatsu, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Network Development, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Jiri Olsa

On Thu, 3 Feb 2022 18:42:22 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Thu, 3 Feb 2022 18:12:11 -0800
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > transparently.
> > >
> > > Not true.
> > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > There was an implicit optimization for kprobe when ftrace
> > > could be used.
> > > All this new interface is doing is making it explicit.
> > > So a new name is not warranted here.
> > >
> > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > >
> > > What is the difference?
> > > I don't see it.
> >
> > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > abilities that a normal kprobe does not. Namely, "what are the function
> > parameters?"
> >
> > You can only reliably get the parameters at function entry. Hence, having
> > a probe that is unique to functions, as opposed to the middle of a
> > function, makes sense to me.
> >
> > That is, the API can change. "Give me parameter X". That along with some
> > BTF reading, could figure out how to get parameter X, and record that.
> 
> This is more or less a description of kprobe on ftrace :)
> The bpf+kprobe users were relying on that for a long time.
> See PT_REGS_PARM1() macros in bpf_tracing.h
> They're meaningful only with kprobe on ftrace.
> So, no, fprobe is not inventing anything new here.
> 
> No one is using kprobe in the middle of the function.
> It's too difficult to make anything useful out of it,
> so no one bothers.

Perf-probe makes it very easy, as easy as gdb does. :-)

Thank you,

> When people say "kprobe" 99 out of 100 they mean
> kprobe on ftrace/fentry.


-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-04  2:42                   ` Alexei Starovoitov
  2022-02-04  3:17                     ` Masami Hiramatsu
@ 2022-02-04  3:59                     ` Masami Hiramatsu
  2022-02-15 13:21                       ` Jiri Olsa
  1 sibling, 1 reply; 50+ messages in thread
From: Masami Hiramatsu @ 2022-02-04  3:59 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Steven Rostedt, Masami Hiramatsu, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Network Development, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Jiri Olsa

Hi Alexei,

On Thu, 3 Feb 2022 18:42:22 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Thu, 3 Feb 2022 18:12:11 -0800
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > transparently.
> > >
> > > Not true.
> > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > There was an implicit optimization for kprobe when ftrace
> > > could be used.
> > > All this new interface is doing is making it explicit.
> > > So a new name is not warranted here.
> > >
> > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > >
> > > What is the difference?
> > > I don't see it.
> >
> > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > abilities that a normal kprobe does not. Namely, "what are the function
> > parameters?"
> >
> > You can only reliably get the parameters at function entry. Hence, having
> > a probe that is unique to functions, as opposed to the middle of a
> > function, makes sense to me.
> >
> > That is, the API can change. "Give me parameter X". That along with some
> > BTF reading, could figure out how to get parameter X, and record that.
> 
> This is more or less a description of kprobe on ftrace :)
> The bpf+kprobe users were relying on that for a long time.
> See PT_REGS_PARM1() macros in bpf_tracing.h
> They're meaningful only with kprobe on ftrace.
> So, no, fprobe is not inventing anything new here.

Hmm, you may be misunderstanding why the PT_REGS_PARM1() macro works. You can
use it even with CONFIG_FUNCTION_TRACER=n, as long as your kernel is built
with CONFIG_KPROBES=y. It is valid unless you put the probe somewhere other
than the function entry.

> No one is using kprobe in the middle of the function.
> It's too difficult to make anything useful out of it,
> so no one bothers.
> When people say "kprobe" 99 out of 100 they mean
> kprobe on ftrace/fentry.

I see. But a kprobe is a kprobe. It is not designed to support multiple
probe points. If I'm forced to, I can rename struct fprobe to
struct multi_kprobe, but that doesn't change the essence. You may need
to use both kprobes and the so-called multi_kprobe, each where appropriate.
(Someone needs to do that.)

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCH 1/8] bpf: Add support to attach kprobe program with fprobe
  2022-02-02 13:53 ` [PATCH 1/8] bpf: Add support to attach kprobe program with fprobe Jiri Olsa
@ 2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-08  8:56     ` Jiri Olsa
  0 siblings, 1 reply; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-07 18:59 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> Adding new link type BPF_LINK_TYPE_FPROBE that attaches kprobe program
> through fprobe API.
>
> The fprobe API allows to attach probe on multiple functions at once very
> fast, because it works on top of ftrace. On the other hand this limits
> the probe point to the function entry or return.
>
> The kprobe program gets the same pt_regs input ctx as when it's attached
> through the perf API.
>
> Adding new attach type BPF_TRACE_FPROBE that enables such link for kprobe
> program.
>
> User provides array of addresses or symbols with count to attach the kprobe
> program to. The new link_create uapi interface looks like:
>
>   struct {
>           __aligned_u64   syms;
>           __aligned_u64   addrs;
>           __u32           cnt;
>           __u32           flags;
>   } fprobe;
>
> The flags field allows single BPF_F_FPROBE_RETURN bit to create return fprobe.
>
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  include/linux/bpf_types.h      |   1 +
>  include/uapi/linux/bpf.h       |  13 ++
>  kernel/bpf/syscall.c           | 248 ++++++++++++++++++++++++++++++++-
>  tools/include/uapi/linux/bpf.h |  13 ++
>  4 files changed, 270 insertions(+), 5 deletions(-)
>

[...]

>
> +#ifdef CONFIG_FPROBE
> +
> +struct bpf_fprobe_link {
> +       struct bpf_link link;
> +       struct fprobe fp;
> +       unsigned long *addrs;
> +};
> +
> +static void bpf_fprobe_link_release(struct bpf_link *link)
> +{
> +       struct bpf_fprobe_link *fprobe_link;
> +
> +       fprobe_link = container_of(link, struct bpf_fprobe_link, link);
> +       unregister_fprobe(&fprobe_link->fp);
> +}
> +
> +static void bpf_fprobe_link_dealloc(struct bpf_link *link)
> +{
> +       struct bpf_fprobe_link *fprobe_link;
> +
> +       fprobe_link = container_of(link, struct bpf_fprobe_link, link);
> +       kfree(fprobe_link->addrs);
> +       kfree(fprobe_link);
> +}
> +
> +static const struct bpf_link_ops bpf_fprobe_link_lops = {
> +       .release = bpf_fprobe_link_release,
> +       .dealloc = bpf_fprobe_link_dealloc,
> +};
> +

should this whole new link implementation (including
fprobe_link_prog_run() below) maybe live in kernel/trace/bpf_trace.c?
Seems a bit more fitting than kernel/bpf/syscall.c

> +static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
> +                               struct pt_regs *regs)
> +{
> +       int err;
> +
> +       if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> +               err = 0;
> +               goto out;
> +       }
> +
> +       rcu_read_lock();
> +       migrate_disable();
> +       err = bpf_prog_run(fprobe_link->link.prog, regs);
> +       migrate_enable();
> +       rcu_read_unlock();
> +
> + out:
> +       __this_cpu_dec(bpf_prog_active);
> +       return err;
> +}
> +
> +static void fprobe_link_entry_handler(struct fprobe *fp, unsigned long entry_ip,
> +                                     struct pt_regs *regs)
> +{
> +       unsigned long saved_ip = instruction_pointer(regs);
> +       struct bpf_fprobe_link *fprobe_link;
> +
> +       /*
> +        * Because fprobe's regs->ip is set to the next instruction of
> +        * dynamic-ftrace instruction, correct entry ip must be set, so
> +        * that the bpf program can access entry address via regs as same
> +        * as kprobes.
> +        */
> +       instruction_pointer_set(regs, entry_ip);
> +
> +       fprobe_link = container_of(fp, struct bpf_fprobe_link, fp);
> +       fprobe_link_prog_run(fprobe_link, regs);
> +
> +       instruction_pointer_set(regs, saved_ip);
> +}
> +
> +static void fprobe_link_exit_handler(struct fprobe *fp, unsigned long entry_ip,
> +                                    struct pt_regs *regs)

isn't it identical to fprobe_link_entry_handler? Maybe use one callback
for both entry and exit?
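
For what it's worth, a userspace sketch of the shared-callback idea. All
names here are hypothetical and run_prog_model() merely records the ip it
sees; the point is that the save/set/run/restore sequence does not depend
on whether the probe fired on entry or on return:

```c
/* Userspace stand-in for struct pt_regs; only the ip field is modeled. */
struct regs_model { unsigned long ip; };

/* Records what the "program" observed, in place of a real bpf_prog_run(). */
static unsigned long seen_ip;

static void run_prog_model(struct regs_model *regs)
{
	seen_ip = regs->ip;
}

/* One handler that could be installed as both entry and exit callback. */
static void link_handler_model(unsigned long entry_ip, struct regs_model *regs)
{
	unsigned long saved_ip = regs->ip;

	regs->ip = entry_ip;	/* expose the function entry address */
	run_prog_model(regs);
	regs->ip = saved_ip;	/* put the real ip back before returning */
}
```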

> +{
> +       unsigned long saved_ip = instruction_pointer(regs);
> +       struct bpf_fprobe_link *fprobe_link;
> +
> +       instruction_pointer_set(regs, entry_ip);
> +
> +       fprobe_link = container_of(fp, struct bpf_fprobe_link, fp);
> +       fprobe_link_prog_run(fprobe_link, regs);
> +
> +       instruction_pointer_set(regs, saved_ip);
> +}
> +
> +static int fprobe_resolve_syms(const void *usyms, u32 cnt,
> +                              unsigned long *addrs)
> +{
> +       unsigned long addr, size;
> +       const char **syms;
> +       int err = -ENOMEM;
> +       unsigned int i;
> +       char *func;
> +
> +       size = cnt * sizeof(*syms);
> +       syms = kzalloc(size, GFP_KERNEL);

any reason not to use kvzalloc() here?

> +       if (!syms)
> +               return -ENOMEM;
> +

[...]

> +
> +static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> +{
> +       struct bpf_fprobe_link *link = NULL;
> +       struct bpf_link_primer link_primer;
> +       unsigned long *addrs;
> +       u32 flags, cnt, size;
> +       void __user *uaddrs;
> +       void __user *usyms;
> +       int err;
> +
> +       /* no support for 32bit archs yet */
> +       if (sizeof(u64) != sizeof(void *))
> +               return -EINVAL;

-EOPNOTSUPP?

> +
> +       if (prog->expected_attach_type != BPF_TRACE_FPROBE)
> +               return -EINVAL;
> +
> +       flags = attr->link_create.fprobe.flags;
> +       if (flags & ~BPF_F_FPROBE_RETURN)
> +               return -EINVAL;
> +
> +       uaddrs = u64_to_user_ptr(attr->link_create.fprobe.addrs);
> +       usyms = u64_to_user_ptr(attr->link_create.fprobe.syms);
> +       if ((!uaddrs && !usyms) || (uaddrs && usyms))
> +               return -EINVAL;

!!uaddrs == !!usyms ?
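
The two forms should be equivalent, which a small userspace sketch can
confirm (function names made up for illustration): both reject the call
when the pointers are either both NULL or both set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Form used in the patch: reject when both are NULL or both are set. */
static bool reject_verbose(const void *uaddrs, const void *usyms)
{
	return (!uaddrs && !usyms) || (uaddrs && usyms);
}

/* Short form: normalize each pointer to 0/1 and compare. */
static bool reject_short(const void *uaddrs, const void *usyms)
{
	return !!uaddrs == !!usyms;
}
```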

> +
> +       cnt = attr->link_create.fprobe.cnt;
> +       if (!cnt)
> +               return -EINVAL;
> +
> +       size = cnt * sizeof(*addrs);
> +       addrs = kzalloc(size, GFP_KERNEL);

same, why not kvzalloc? Also, aren't you overwriting each addrs entry
anyway, so "z" is not necessary, right?

> +       if (!addrs)
> +               return -ENOMEM;
> +

[...]


* Re: [PATCH 2/8] bpf: Add bpf_get_func_ip kprobe helper for fprobe link
  2022-02-02 13:53 ` [PATCH 2/8] bpf: Add bpf_get_func_ip kprobe helper for fprobe link Jiri Olsa
@ 2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-07 21:01     ` Alexei Starovoitov
  2022-02-09 15:01     ` Jiri Olsa
  0 siblings, 2 replies; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-07 18:59 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> Adding support to call get_func_ip_fprobe helper from kprobe
> programs attached by fprobe link.
>
> Also adding support to inline it, because it's single load
> instruction.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  kernel/bpf/verifier.c    | 19 ++++++++++++++++++-
>  kernel/trace/bpf_trace.c | 16 +++++++++++++++-
>  2 files changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 1ae41d0cf96c..a745ded00635 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -13625,7 +13625,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>                         continue;
>                 }
>
> -               /* Implement bpf_get_func_ip inline. */
> +               /* Implement tracing bpf_get_func_ip inline. */
>                 if (prog_type == BPF_PROG_TYPE_TRACING &&
>                     insn->imm == BPF_FUNC_get_func_ip) {
>                         /* Load IP address from ctx - 16 */
> @@ -13640,6 +13640,23 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>                         continue;
>                 }
>
> +               /* Implement kprobe/fprobe bpf_get_func_ip inline. */
> +               if (prog_type == BPF_PROG_TYPE_KPROBE &&
> +                   eatype == BPF_TRACE_FPROBE &&
> +                   insn->imm == BPF_FUNC_get_func_ip) {
> +                       /* Load IP address from ctx (struct pt_regs) ip */
> +                       insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1,
> +                                                 offsetof(struct pt_regs, ip));

Isn't this architecture-specific? I'm starting to dislike this whole
inlining more and more. It's just a complication in the verifier
without clear real-world benefits. We are clearly prematurely
optimizing here. In practice you'll just call bpf_get_func_ip() once
and that's it. Function call overhead will be negligible compared to
the other *useful* work you'll be doing in your BPF program.


> +
> +                       new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 1);
> +                       if (!new_prog)
> +                               return -ENOMEM;
> +
> +                       env->prog = prog = new_prog;
> +                       insn      = new_prog->insnsi + i + delta;
> +                       continue;
> +               }
> +
>  patch_call_imm:
>                 fn = env->ops->get_func_proto(insn->imm, env->prog);
>                 /* all functions that have prototype and verifier allowed
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index a2024ba32a20..28e59e31e3db 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1036,6 +1036,19 @@ static const struct bpf_func_proto bpf_get_func_ip_proto_kprobe = {
>         .arg1_type      = ARG_PTR_TO_CTX,
>  };
>
> +BPF_CALL_1(bpf_get_func_ip_fprobe, struct pt_regs *, regs)
> +{
> +       /* This helper call is inlined by verifier. */
> +       return regs->ip;
> +}
> +
> +static const struct bpf_func_proto bpf_get_func_ip_proto_fprobe = {
> +       .func           = bpf_get_func_ip_fprobe,
> +       .gpl_only       = false,
> +       .ret_type       = RET_INTEGER,
> +       .arg1_type      = ARG_PTR_TO_CTX,
> +};
> +
>  BPF_CALL_1(bpf_get_attach_cookie_trace, void *, ctx)
>  {
>         struct bpf_trace_run_ctx *run_ctx;
> @@ -1279,7 +1292,8 @@ kprobe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>                 return &bpf_override_return_proto;
>  #endif
>         case BPF_FUNC_get_func_ip:
> -               return &bpf_get_func_ip_proto_kprobe;
> +               return prog->expected_attach_type == BPF_TRACE_FPROBE ?
> +                       &bpf_get_func_ip_proto_fprobe : &bpf_get_func_ip_proto_kprobe;
>         case BPF_FUNC_get_attach_cookie:
>                 return &bpf_get_attach_cookie_proto_trace;
>         default:
> --
> 2.34.1
>


* Re: [PATCH 3/8] bpf: Add bpf_cookie support to fprobe
  2022-02-02 13:53 ` [PATCH 3/8] bpf: Add bpf_cookie support to fprobe Jiri Olsa
@ 2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-08  9:07     ` Jiri Olsa
  0 siblings, 1 reply; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-07 18:59 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> Adding support to call bpf_get_attach_cookie helper from
> kprobe program attached by fprobe link.
>
> The bpf_cookie is provided by array of u64 values, where
> each value is paired with provided function address with
> the same array index.
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  include/linux/bpf.h            |  2 +
>  include/uapi/linux/bpf.h       |  1 +
>  kernel/bpf/syscall.c           | 83 +++++++++++++++++++++++++++++++++-
>  kernel/trace/bpf_trace.c       | 16 ++++++-
>  tools/include/uapi/linux/bpf.h |  1 +
>  5 files changed, 100 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 6eb0b180d33b..7b65f05c0487 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1301,6 +1301,8 @@ static inline void bpf_reset_run_ctx(struct bpf_run_ctx *old_ctx)
>  #endif
>  }
>
> +u64 bpf_fprobe_cookie(struct bpf_run_ctx *ctx, u64 ip);
> +
>  /* BPF program asks to bypass CAP_NET_BIND_SERVICE in bind. */
>  #define BPF_RET_BIND_NO_CAP_NET_BIND_SERVICE                   (1 << 0)
>  /* BPF program asks to set CN on the packet. */
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c0912f0a3dfe..0dc6aa4f9683 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1484,6 +1484,7 @@ union bpf_attr {
>                                 __aligned_u64   addrs;
>                                 __u32           cnt;
>                                 __u32           flags;
> +                               __aligned_u64   bpf_cookies;

maybe put it right after addrs, they are closely related and cnt
describes all of syms/addrs/cookies.

>                         } fprobe;
>                 };
>         } link_create;
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 0cfbb112c8e1..6c5e74bc43b6 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -33,6 +33,8 @@
>  #include <linux/rcupdate_trace.h>
>  #include <linux/memcontrol.h>
>  #include <linux/fprobe.h>
> +#include <linux/bsearch.h>
> +#include <linux/sort.h>
>
>  #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
>                           (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
> @@ -3025,10 +3027,18 @@ static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *pro
>
>  #ifdef CONFIG_FPROBE
>
> +struct bpf_fprobe_cookie {
> +       unsigned long addr;
> +       u64 bpf_cookie;
> +};
> +
>  struct bpf_fprobe_link {
>         struct bpf_link link;
>         struct fprobe fp;
>         unsigned long *addrs;
> +       struct bpf_run_ctx run_ctx;
> +       struct bpf_fprobe_cookie *bpf_cookies;

you already have all the addrs above, so why keep a second copy of
each addr in bpf_fprobe_cookie? Let's have two arrays: addrs
(unsigned long) and cookies (u64), and make sure that they are sorted
together. Then look up the addr, calculate its index, and use that
index to fetch the cookie.

Seems like sort_r() provides exactly the interface you'd need to do
this very easily. Having addrs separate from cookies is also a bit
advantageous in terms of TLB misses (if you need any more persuasion
;)
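
To make the two-array idea concrete, here is a rough userspace sketch
(plain C, not kernel code). The struct and function names are made up
for illustration, and a simple insertion sort stands in for sort_r()
with a swap callback that moves both arrays together:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the link: two parallel arrays, same index. */
struct fprobe_tbl {
	unsigned long *addrs;	/* sorted ascending by tbl_sort() */
	uint64_t *cookies;	/* cookies[i] belongs to addrs[i] */
	size_t cnt;
};

/* Sort addrs and move every cookie along with its address. */
static void tbl_sort(struct fprobe_tbl *t)
{
	for (size_t i = 1; i < t->cnt; i++) {
		unsigned long a = t->addrs[i];
		uint64_t c = t->cookies[i];
		size_t j = i;

		while (j > 0 && t->addrs[j - 1] > a) {
			t->addrs[j] = t->addrs[j - 1];
			t->cookies[j] = t->cookies[j - 1];
			j--;
		}
		t->addrs[j] = a;
		t->cookies[j] = c;
	}
}

static int addr_cmp(const void *key, const void *elem)
{
	unsigned long a = *(const unsigned long *)key;
	unsigned long b = *(const unsigned long *)elem;

	if (a == b)
		return 0;
	return a < b ? -1 : 1;
}

/* Look up ip in addrs, turn the hit into an index, fetch the cookie. */
static uint64_t tbl_cookie(const struct fprobe_tbl *t, unsigned long ip)
{
	unsigned long *hit;

	if (!t->cookies)
		return 0;
	hit = bsearch(&ip, t->addrs, t->cnt, sizeof(*t->addrs), addr_cmp);
	return hit ? t->cookies[hit - t->addrs] : 0;
}
```

The lookup only ever touches the addrs array until it has a hit, which
is where the locality argument comes from.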

> +       u32 cnt;
>  };
>
>  static void bpf_fprobe_link_release(struct bpf_link *link)
> @@ -3045,6 +3055,7 @@ static void bpf_fprobe_link_dealloc(struct bpf_link *link)
>
>         fprobe_link = container_of(link, struct bpf_fprobe_link, link);
>         kfree(fprobe_link->addrs);
> +       kfree(fprobe_link->bpf_cookies);
>         kfree(fprobe_link);
>  }
>
> @@ -3053,9 +3064,37 @@ static const struct bpf_link_ops bpf_fprobe_link_lops = {
>         .dealloc = bpf_fprobe_link_dealloc,
>  };
>
> +static int bpf_fprobe_cookie_cmp(const void *_a, const void *_b)
> +{
> +       const struct bpf_fprobe_cookie *a = _a;
> +       const struct bpf_fprobe_cookie *b = _b;
> +
> +       if (a->addr == b->addr)
> +               return 0;
> +       return a->addr < b->addr ? -1 : 1;
> +}
> +
> +u64 bpf_fprobe_cookie(struct bpf_run_ctx *ctx, u64 ip)
> +{
> +       struct bpf_fprobe_link *fprobe_link;
> +       struct bpf_fprobe_cookie *val, key = {
> +               .addr = (unsigned long) ip,
> +       };
> +
> +       if (!ctx)
> +               return 0;

is it allowed to have ctx == NULL?

> +       fprobe_link = container_of(ctx, struct bpf_fprobe_link, run_ctx);
> +       if (!fprobe_link->bpf_cookies)
> +               return 0;
> +       val = bsearch(&key, fprobe_link->bpf_cookies, fprobe_link->cnt,
> +                     sizeof(key), bpf_fprobe_cookie_cmp);
> +       return val ? val->bpf_cookie : 0;
> +}
> +
>  static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
>                                 struct pt_regs *regs)
>  {
> +       struct bpf_run_ctx *old_run_ctx;
>         int err;
>
>         if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> @@ -3063,12 +3102,16 @@ static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
>                 goto out;
>         }
>
> +       old_run_ctx = bpf_set_run_ctx(&fprobe_link->run_ctx);
> +
>         rcu_read_lock();
>         migrate_disable();
>         err = bpf_prog_run(fprobe_link->link.prog, regs);
>         migrate_enable();
>         rcu_read_unlock();
>
> +       bpf_reset_run_ctx(old_run_ctx);
> +
>   out:
>         __this_cpu_dec(bpf_prog_active);
>         return err;
> @@ -3161,10 +3204,12 @@ static int fprobe_resolve_syms(const void *usyms, u32 cnt,
>
>  static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
>  {
> +       struct bpf_fprobe_cookie *bpf_cookies = NULL;
>         struct bpf_fprobe_link *link = NULL;
>         struct bpf_link_primer link_primer;
> +       void __user *ubpf_cookies;
> +       u32 flags, cnt, i, size;
>         unsigned long *addrs;
> -       u32 flags, cnt, size;
>         void __user *uaddrs;
>         void __user *usyms;
>         int err;
> @@ -3205,6 +3250,37 @@ static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *p
>                         goto error;
>         }
>
> +       ubpf_cookies = u64_to_user_ptr(attr->link_create.fprobe.bpf_cookies);

nit: let's call all this "cookies", this bpf_ prefix feels a bit
redundant (I know about perf_event.bpf_cookie, but still).

> +       if (ubpf_cookies) {
> +               u64 *tmp;
> +
> +               err = -ENOMEM;
> +               tmp = kzalloc(size, GFP_KERNEL);

kvmalloc?

> +               if (!tmp)
> +                       goto error;
> +
> +               if (copy_from_user(tmp, ubpf_cookies, size)) {
> +                       kfree(tmp);
> +                       err = -EFAULT;
> +                       goto error;
> +               }
> +

[...]

* Re: [PATCH 4/8] libbpf: Add libbpf__kallsyms_parse function
  2022-02-02 13:53 ` [PATCH 4/8] libbpf: Add libbpf__kallsyms_parse function Jiri Olsa
@ 2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-08  9:08     ` Jiri Olsa
  0 siblings, 1 reply; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-07 18:59 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> Move the kallsyms parsing in internal libbpf__kallsyms_parse
> function, so it can be used from other places.
>
> It will be used in following changes.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/lib/bpf/libbpf.c          | 62 ++++++++++++++++++++-------------
>  tools/lib/bpf/libbpf_internal.h |  5 +++
>  2 files changed, 43 insertions(+), 24 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 1b0936b016d9..7d595cfd03bc 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -7165,12 +7165,10 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj)
>         return 0;
>  }
>
> -static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
> +int libbpf__kallsyms_parse(void *arg, kallsyms_cb_t cb)

please call it libbpf_kallsyms_parse(), internal APIs don't use the
"object oriented" double underscore separator

also this "arg" is normally called "ctx" in similar APIs in libbpf and
is passed last, can you please adjust all that for consistency?
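
For illustration only, a userspace sketch of that calling convention:
the callback and the parser both take "ctx" as the last argument. The
name kallsyms_parse_file() and the FILE * parameter are assumptions
made so the sketch can run on any input; they are not the libbpf
internal API.

```c
#include <stdio.h>

typedef int (*kallsyms_cb_t)(unsigned long long sym_addr, char sym_type,
			     const char *sym_name, void *ctx);

/* Parse "addr type name [module]" lines from f; ctx is passed last. */
static int kallsyms_parse_file(FILE *f, kallsyms_cb_t cb, void *ctx)
{
	char sym_type, sym_name[500];
	unsigned long long sym_addr;
	int ret, err = 0;

	while (1) {
		ret = fscanf(f, "%llx %c %499s%*[^\n]\n",
			     &sym_addr, &sym_type, sym_name);
		if (ret == EOF && feof(f))
			break;
		if (ret != 3)
			return -1;	/* malformed line */
		err = cb(sym_addr, sym_type, sym_name, ctx);
		if (err)
			break;
	}
	return err;
}

/* Example callback: count text symbols; ctx points at an int counter. */
static int count_text_cb(unsigned long long sym_addr, char sym_type,
			 const char *sym_name, void *ctx)
{
	int *cnt = ctx;

	(void)sym_addr;
	(void)sym_name;
	if (sym_type == 't' || sym_type == 'T')
		(*cnt)++;
	return 0;
}
```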

>  {
>         char sym_type, sym_name[500];
>         unsigned long long sym_addr;
> -       const struct btf_type *t;
> -       struct extern_desc *ext;
>         int ret, err = 0;
>         FILE *f;
>

[...]

* Re: [PATCH 6/8] libbpf: Add bpf_program__attach_kprobe_opts for multi kprobes
  2022-02-02 13:53 ` [PATCH 6/8] libbpf: Add bpf_program__attach_kprobe_opts " Jiri Olsa
@ 2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-08  9:12     ` Jiri Olsa
  0 siblings, 1 reply; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-07 18:59 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> Adding support to bpf_program__attach_kprobe_opts to load kprobes
> to multiple functions.
>
> If the kprobe program has BPF_TRACE_FPROBE as expected_attach_type
> it will use the new fprobe link to attach the program. In this case
> it will use 'func_name' as pattern for functions to attach.
>
> Adding also support to use '*' wildcard in 'kprobe/kretprobe' section
> name by SEC macro, like:
>
>   SEC("kprobe/bpf_fentry_test*")
>   SEC("kretprobe/bpf_fentry_test*")
>
> This will set kprobe's expected_attach_type to BPF_TRACE_FPROBE,
> and attach it to provided functions pattern.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/lib/bpf/libbpf.c | 136 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 133 insertions(+), 3 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 7d595cfd03bc..6b343ef77ed8 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -8607,13 +8607,15 @@ static struct bpf_link *attach_trace(const struct bpf_program *prog, long cookie
>  static struct bpf_link *attach_lsm(const struct bpf_program *prog, long cookie);
>  static struct bpf_link *attach_iter(const struct bpf_program *prog, long cookie);
>
> +static int init_kprobe(struct bpf_program *prog, long cookie);
> +
>  static const struct bpf_sec_def section_defs[] = {
>         SEC_DEF("socket",               SOCKET_FILTER, 0, SEC_NONE | SEC_SLOPPY_PFX),
>         SEC_DEF("sk_reuseport/migrate", SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT_OR_MIGRATE, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
>         SEC_DEF("sk_reuseport",         SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
> -       SEC_DEF("kprobe/",              KPROBE, 0, SEC_NONE, attach_kprobe),
> +       SEC_DEF("kprobe/",              KPROBE, 0, SEC_NONE, attach_kprobe, .init_fn = init_kprobe),
>         SEC_DEF("uprobe/",              KPROBE, 0, SEC_NONE),
> -       SEC_DEF("kretprobe/",           KPROBE, 0, SEC_NONE, attach_kprobe),
> +       SEC_DEF("kretprobe/",           KPROBE, 0, SEC_NONE, attach_kprobe, .init_fn = init_kprobe),
>         SEC_DEF("uretprobe/",           KPROBE, 0, SEC_NONE),
>         SEC_DEF("tc",                   SCHED_CLS, 0, SEC_NONE),
>         SEC_DEF("classifier",           SCHED_CLS, 0, SEC_NONE | SEC_SLOPPY_PFX | SEC_DEPRECATED),
> @@ -10031,6 +10033,123 @@ static int perf_event_kprobe_open_legacy(const char *probe_name, bool retprobe,
>         return pfd;
>  }
>
> +struct fprobe_resolve {
> +       const char *name;
> +       __u64 *addrs;
> +       __u32 alloc;
> +       __u32 cnt;
> +};
> +
> +static bool glob_matches(const char *glob, const char *s)

we've since added a more generic glob_match() implementation (see
test_progs.c), let's copy/paste that one (it's actually shorter and
doesn't do hacky modification of its input args). Let's maybe also add
'?' handling (it's trivial). Both the original code in perf and the one
in test_progs.c are GPL-2.0-only, so let's also get acks from the
original authors.
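
A minimal sketch of what such a matcher could look like, written from
scratch for illustration (it is not the test_progs.c code, and the
argument order is an assumption). '*' matches any run of characters,
'?' matches exactly one, and neither input string is modified:

```c
#include <stdbool.h>

static bool glob_match(const char *str, const char *pat)
{
	while (*pat) {
		if (*pat == '*') {
			/* collapse consecutive stars */
			while (*pat == '*')
				pat++;
			/* trailing star matches whatever is left */
			if (!*pat)
				return true;
			/* try the rest of the pattern at every position */
			for (; *str; str++)
				if (glob_match(str, pat))
					return true;
			return false;
		}
		/* literal or '?': need one character to consume */
		if (!*str || (*pat != '?' && *pat != *str))
			return false;
		str++;
		pat++;
	}
	/* pattern exhausted: match only if the string is too */
	return !*str;
}
```

With this, prefix, suffix, substring and exact matches all fall out of
the same loop instead of needing one branch each.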

> +{
> +       int n = strlen(glob);
> +
> +       if (n == 1 && glob[0] == '*')
> +               return true;
> +
> +       if (glob[0] == '*' && glob[n - 1] == '*') {
> +               const char *subs;
> +               /* substring match */
> +
> +               /* this is hacky, but we don't want to allocate
> +                * for no good reason
> +                */
> +               ((char *)glob)[n - 1] = '\0';
> +               subs = strstr(s, glob + 1);
> +               ((char *)glob)[n - 1] = '*';
> +
> +               return subs != NULL;
> +       } else if (glob[0] == '*') {
> +               size_t nn = strlen(s);
> +               /* suffix match */
> +
> +               /* too short for a given suffix */
> +               if (nn < n - 1)
> +                       return false;
> +               return strcmp(s + nn - (n - 1), glob + 1) == 0;
> +       } else if (glob[n - 1] == '*') {
> +               /* prefix match */
> +               return strncmp(s, glob, n - 1) == 0;
> +       } else {
> +               /* exact match */
> +               return strcmp(glob, s) == 0;
> +       }
> +}
> +
> +static int resolve_fprobe_cb(void *arg, unsigned long long sym_addr,
> +                            char sym_type, const char *sym_name)
> +{
> +       struct fprobe_resolve *res = arg;
> +       __u64 *p;
> +
> +       if (!glob_matches(res->name, sym_name))
> +               return 0;
> +
> +       if (res->cnt == res->alloc) {
> +               res->alloc = max((__u32) 16, res->alloc * 3 / 2);
> +               p = libbpf_reallocarray(res->addrs, res->alloc, sizeof(__u32));
> +               if (!p)
> +                       return -ENOMEM;
> +               res->addrs = p;
> +       }

please use libbpf_ensure_mem() instead
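
For readers without the libbpf tree at hand, a rough userspace sketch
of the grow-on-demand pattern such a helper wraps. ensure_mem() and
struct addr_list below are hypothetical stand-ins written for this
example; the growth policy is an assumption, not libbpf's. Note that
the element size is taken from the array's own element type:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in: make sure *data has room for need_cnt elements. */
static int ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz,
		      size_t need_cnt)
{
	size_t new_cnt;
	void *p;

	if (need_cnt <= *cap_cnt)
		return 0;

	new_cnt = *cap_cnt + *cap_cnt / 4;	/* grow by 25% */
	if (new_cnt < 16)
		new_cnt = 16;
	if (new_cnt < need_cnt)
		new_cnt = need_cnt;
	if (new_cnt > SIZE_MAX / elem_sz)	/* size would overflow */
		return -ENOMEM;

	p = realloc(*data, new_cnt * elem_sz);
	if (!p)
		return -ENOMEM;
	*data = p;
	*cap_cnt = new_cnt;
	return 0;
}

struct addr_list {
	uint64_t *addrs;
	size_t cap;
	size_t cnt;
};

/* Append one address, growing the array only when it is full. */
static int addr_list_push(struct addr_list *l, uint64_t addr)
{
	int err = ensure_mem((void **)&l->addrs, &l->cap,
			     sizeof(*l->addrs), l->cnt + 1);

	if (err)
		return err;
	l->addrs[l->cnt++] = addr;
	return 0;
}
```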


> +       res->addrs[res->cnt++] = sym_addr;
> +       return 0;
> +}
> +
> +static struct bpf_link *
> +attach_fprobe_opts(const struct bpf_program *prog,
> +                  const char *func_name,

func_glob or func_pattern?

> +                  const struct bpf_kprobe_opts *kopts)
> +{
> +       DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> +       struct fprobe_resolve res = {
> +               .name = func_name,
> +       };
> +       struct bpf_link *link = NULL;
> +       char errmsg[STRERR_BUFSIZE];
> +       int err, link_fd, prog_fd;
> +       bool retprobe;
> +
> +       err = libbpf__kallsyms_parse(&res, resolve_fprobe_cb);
> +       if (err)
> +               goto error;
> +       if (!res.cnt) {
> +               err = -ENOENT;
> +               goto error;
> +       }
> +
> +       retprobe = OPTS_GET(kopts, retprobe, false);
> +
> +       opts.fprobe.addrs = (__u64) res.addrs;

ptr_to_u64()

> +       opts.fprobe.cnt = res.cnt;
> +       opts.flags = retprobe ? BPF_F_FPROBE_RETURN : 0;
> +
> +       link = calloc(1, sizeof(*link));
> +       if (!link) {
> +               err = -ENOMEM;
> +               goto error;
> +       }
> +       link->detach = &bpf_link__detach_fd;
> +
> +       prog_fd = bpf_program__fd(prog);
> +       link_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
> +       if (link_fd < 0) {
> +               err = -errno;
> +               pr_warn("prog '%s': failed to attach to %s: %s\n",
> +                       prog->name, res.name,
> +                       libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
> +               goto error;
> +       }
> +       link->fd = link_fd;
> +       free(res.addrs);
> +       return link;
> +
> +error:
> +       free(link);
> +       free(res.addrs);
> +       return libbpf_err_ptr(err);
> +}
> +
>  struct bpf_link *
>  bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
>                                 const char *func_name,
> @@ -10047,6 +10166,9 @@ bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
>         if (!OPTS_VALID(opts, bpf_kprobe_opts))
>                 return libbpf_err_ptr(-EINVAL);
>
> +       if (prog->expected_attach_type == BPF_TRACE_FPROBE)
> +               return attach_fprobe_opts(prog, func_name, opts);
> +
>         retprobe = OPTS_GET(opts, retprobe, false);
>         offset = OPTS_GET(opts, offset, 0);
>         pe_opts.bpf_cookie = OPTS_GET(opts, bpf_cookie, 0);
> @@ -10112,6 +10234,14 @@ struct bpf_link *bpf_program__attach_kprobe(const struct bpf_program *prog,
>         return bpf_program__attach_kprobe_opts(prog, func_name, &opts);
>  }
>
> +static int init_kprobe(struct bpf_program *prog, long cookie)
> +{
> +       /* If we have wildcard, switch to fprobe link. */
> +       if (strchr(prog->sec_name, '*'))

ugh... :( maybe let's have a separate SEC("kprobe.multi/<glob>") and
same for kretprobe?


> +               bpf_program__set_expected_attach_type(prog, BPF_TRACE_FPROBE);
> +       return 0;
> +}
> +
>  static struct bpf_link *attach_kprobe(const struct bpf_program *prog, long cookie)
>  {
>         DECLARE_LIBBPF_OPTS(bpf_kprobe_opts, opts);
> @@ -10127,7 +10257,7 @@ static struct bpf_link *attach_kprobe(const struct bpf_program *prog, long cooki
>         else
>                 func_name = prog->sec_name + sizeof("kprobe/") - 1;
>
> -       n = sscanf(func_name, "%m[a-zA-Z0-9_.]+%li", &func, &offset);
> +       n = sscanf(func_name, "%m[a-zA-Z0-9_.*]+%li", &func, &offset);
>         if (n < 1) {
>                 err = -EINVAL;
>                 pr_warn("kprobe name is invalid: %s\n", func_name);
> --
> 2.34.1
>

* Re: [PATCH 8/8] selftest/bpf: Add fprobe test for bpf_cookie values
  2022-02-02 13:53 ` [PATCH 8/8] selftest/bpf: Add fprobe test for bpf_cookie values Jiri Olsa
@ 2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-08  9:15     ` Jiri Olsa
  0 siblings, 1 reply; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-07 18:59 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> Adding bpf_cookie test for kprobe attached by fprobe link.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  .../selftests/bpf/prog_tests/bpf_cookie.c     | 73 +++++++++++++++++++
>  .../selftests/bpf/progs/fprobe_bpf_cookie.c   | 62 ++++++++++++++++
>  2 files changed, 135 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
> index cd10df6cd0fc..bf70d859c598 100644
> --- a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
> +++ b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
> @@ -7,6 +7,7 @@
>  #include <unistd.h>
>  #include <test_progs.h>
>  #include "test_bpf_cookie.skel.h"
> +#include "fprobe_bpf_cookie.skel.h"
>
>  /* uprobe attach point */
>  static void trigger_func(void)
> @@ -63,6 +64,76 @@ static void kprobe_subtest(struct test_bpf_cookie *skel)
>         bpf_link__destroy(retlink2);
>  }
>
> +static void fprobe_subtest(void)
> +{
> +       DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> +       int err, prog_fd, link1_fd = -1, link2_fd = -1;
> +       struct fprobe_bpf_cookie *skel = NULL;
> +       __u32 duration = 0, retval;
> +       __u64 addrs[8], cookies[8];
> +
> +       skel = fprobe_bpf_cookie__open_and_load();
> +       if (!ASSERT_OK_PTR(skel, "fentry_raw_skel_load"))
> +               goto cleanup;
> +
> +       kallsyms_find("bpf_fentry_test1", &addrs[0]);
> +       kallsyms_find("bpf_fentry_test2", &addrs[1]);
> +       kallsyms_find("bpf_fentry_test3", &addrs[2]);
> +       kallsyms_find("bpf_fentry_test4", &addrs[3]);
> +       kallsyms_find("bpf_fentry_test5", &addrs[4]);
> +       kallsyms_find("bpf_fentry_test6", &addrs[5]);
> +       kallsyms_find("bpf_fentry_test7", &addrs[6]);
> +       kallsyms_find("bpf_fentry_test8", &addrs[7]);
> +
> +       cookies[0] = 1;
> +       cookies[1] = 2;
> +       cookies[2] = 3;
> +       cookies[3] = 4;
> +       cookies[4] = 5;
> +       cookies[5] = 6;
> +       cookies[6] = 7;
> +       cookies[7] = 8;
> +
> +       opts.fprobe.addrs = (__u64) &addrs;

we should have ptr_to_u64() for test_progs, but if not, let's either
add it or it should be (__u64)(uintptr_t)&addrs. Otherwise we'll be
getting compilation warnings on some architectures.
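
In case it's not there, a sketch of the helper using standard C types
(the selftests would use __u64; uint64_t is used here only to keep the
example free of kernel headers):

```c
#include <stddef.h>
#include <stdint.h>

/* Go through uintptr_t first so that the pointer-to-integer conversion
 * has matching width; only then widen to 64 bits. This avoids the
 * "cast from pointer to integer of different size" warning on 32-bit
 * architectures.
 */
static inline uint64_t ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}
```

The assignments would then read opts.fprobe.addrs = ptr_to_u64(addrs)
and likewise for the cookies array.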

> +       opts.fprobe.cnt = 8;
> +       opts.fprobe.bpf_cookies = (__u64) &cookies;
> +       prog_fd = bpf_program__fd(skel->progs.test2);
> +
> +       link1_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
> +       if (!ASSERT_GE(link1_fd, 0, "link1_fd"))
> +               return;
> +
> +       cookies[0] = 8;
> +       cookies[1] = 7;
> +       cookies[2] = 6;
> +       cookies[3] = 5;
> +       cookies[4] = 4;
> +       cookies[5] = 3;
> +       cookies[6] = 2;
> +       cookies[7] = 1;
> +
> +       opts.flags = BPF_F_FPROBE_RETURN;
> +       prog_fd = bpf_program__fd(skel->progs.test3);
> +
> +       link2_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
> +       if (!ASSERT_GE(link2_fd, 0, "link2_fd"))
> +               goto cleanup;
> +
> +       prog_fd = bpf_program__fd(skel->progs.test1);
> +       err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
> +                               NULL, NULL, &retval, &duration);
> +       ASSERT_OK(err, "test_run");
> +       ASSERT_EQ(retval, 0, "test_run");
> +
> +       ASSERT_EQ(skel->bss->test2_result, 8, "test2_result");
> +       ASSERT_EQ(skel->bss->test3_result, 8, "test3_result");
> +
> +cleanup:
> +       close(link1_fd);
> +       close(link2_fd);
> +       fprobe_bpf_cookie__destroy(skel);
> +}
> +
>  static void uprobe_subtest(struct test_bpf_cookie *skel)
>  {
>         DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, opts);
> @@ -249,6 +320,8 @@ void test_bpf_cookie(void)
>
>         if (test__start_subtest("kprobe"))
>                 kprobe_subtest(skel);
> +       if (test__start_subtest("rawkprobe"))

kprobe.multi?

> +               fprobe_subtest();
>         if (test__start_subtest("uprobe"))
>                 uprobe_subtest(skel);
>         if (test__start_subtest("tracepoint"))

[...]

* Re: [PATCH 2/8] bpf: Add bpf_get_func_ip kprobe helper for fprobe link
  2022-02-07 18:59   ` Andrii Nakryiko
@ 2022-02-07 21:01     ` Alexei Starovoitov
  2022-02-09 15:01     ` Jiri Olsa
  1 sibling, 0 replies; 50+ messages in thread
From: Alexei Starovoitov @ 2022-02-07 21:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Mon, Feb 7, 2022 at 10:59 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > Adding support to call get_func_ip_fprobe helper from kprobe
> > programs attached by fprobe link.
> >
> > Also adding support to inline it, because it's single load
> > instruction.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  kernel/bpf/verifier.c    | 19 ++++++++++++++++++-
> >  kernel/trace/bpf_trace.c | 16 +++++++++++++++-
> >  2 files changed, 33 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 1ae41d0cf96c..a745ded00635 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -13625,7 +13625,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> >                         continue;
> >                 }
> >
> > -               /* Implement bpf_get_func_ip inline. */
> > +               /* Implement tracing bpf_get_func_ip inline. */
> >                 if (prog_type == BPF_PROG_TYPE_TRACING &&
> >                     insn->imm == BPF_FUNC_get_func_ip) {
> >                         /* Load IP address from ctx - 16 */
> > @@ -13640,6 +13640,23 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> >                         continue;
> >                 }
> >
> > +               /* Implement kprobe/fprobe bpf_get_func_ip inline. */
> > +               if (prog_type == BPF_PROG_TYPE_KPROBE &&
> > +                   eatype == BPF_TRACE_FPROBE &&
> > +                   insn->imm == BPF_FUNC_get_func_ip) {
> > +                       /* Load IP address from ctx (struct pt_regs) ip */
> > +                       insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1,
> > +                                                 offsetof(struct pt_regs, ip));
>
> Isn't this architecture-specific? I'm starting to dislike this whole
> inlining more and more. It's just a complication in the verifier
> without clear real-world benefits. We are clearly prematurely
> optimizing here. In practice you'll just call bpf_get_func_ip() once
> and that's it. Function call overhead will be negligible compared to
> other *useful* work you'll be doing in your BPF program.

We should be doing inlining when we can.
Every bit of performance matters.

* Re: [PATCH 1/8] bpf: Add support to attach kprobe program with fprobe
  2022-02-07 18:59   ` Andrii Nakryiko
@ 2022-02-08  8:56     ` Jiri Olsa
  0 siblings, 0 replies; 50+ messages in thread
From: Jiri Olsa @ 2022-02-08  8:56 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Mon, Feb 07, 2022 at 10:59:14AM -0800, Andrii Nakryiko wrote:
> On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > Adding new link type BPF_LINK_TYPE_FPROBE that attaches kprobe program
> > through fprobe API.
> >
> > The fprobe API allows to attach probe on multiple functions at once very
> > fast, because it works on top of ftrace. On the other hand this limits
> > the probe point to the function entry or return.
> >
> > The kprobe program gets the same pt_regs input ctx as when it's attached
> > through the perf API.
> >
> > Adding new attach type BPF_TRACE_FPROBE that enables such link for kprobe
> > program.
> >
> > User provides array of addresses or symbols with count to attach the kprobe
> > program to. The new link_create uapi interface looks like:
> >
> >   struct {
> >           __aligned_u64   syms;
> >           __aligned_u64   addrs;
> >           __u32           cnt;
> >           __u32           flags;
> >   } fprobe;
> >
> > The flags field allows single BPF_F_FPROBE_RETURN bit to create return fprobe.
> >
> > Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  include/linux/bpf_types.h      |   1 +
> >  include/uapi/linux/bpf.h       |  13 ++
> >  kernel/bpf/syscall.c           | 248 ++++++++++++++++++++++++++++++++-
> >  tools/include/uapi/linux/bpf.h |  13 ++
> >  4 files changed, 270 insertions(+), 5 deletions(-)
> >
> 
> [...]
> 
> >
> > +#ifdef CONFIG_FPROBE
> > +
> > +struct bpf_fprobe_link {
> > +       struct bpf_link link;
> > +       struct fprobe fp;
> > +       unsigned long *addrs;
> > +};
> > +
> > +static void bpf_fprobe_link_release(struct bpf_link *link)
> > +{
> > +       struct bpf_fprobe_link *fprobe_link;
> > +
> > +       fprobe_link = container_of(link, struct bpf_fprobe_link, link);
> > +       unregister_fprobe(&fprobe_link->fp);
> > +}
> > +
> > +static void bpf_fprobe_link_dealloc(struct bpf_link *link)
> > +{
> > +       struct bpf_fprobe_link *fprobe_link;
> > +
> > +       fprobe_link = container_of(link, struct bpf_fprobe_link, link);
> > +       kfree(fprobe_link->addrs);
> > +       kfree(fprobe_link);
> > +}
> > +
> > +static const struct bpf_link_ops bpf_fprobe_link_lops = {
> > +       .release = bpf_fprobe_link_release,
> > +       .dealloc = bpf_fprobe_link_dealloc,
> > +};
> > +
> 
> should this whole new link implementation (including
> fprobe_link_prog_run() below) maybe live in kernel/trace/bpf_trace.c?
> Seems a bit more fitting than kernel/bpf/syscall.c

right, it's trace related

> 
> > +static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
> > +                               struct pt_regs *regs)
> > +{
> > +       int err;
> > +
> > +       if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > +               err = 0;
> > +               goto out;
> > +       }
> > +
> > +       rcu_read_lock();
> > +       migrate_disable();
> > +       err = bpf_prog_run(fprobe_link->link.prog, regs);
> > +       migrate_enable();
> > +       rcu_read_unlock();
> > +
> > + out:
> > +       __this_cpu_dec(bpf_prog_active);
> > +       return err;
> > +}
> > +
> > +static void fprobe_link_entry_handler(struct fprobe *fp, unsigned long entry_ip,
> > +                                     struct pt_regs *regs)
> > +{
> > +       unsigned long saved_ip = instruction_pointer(regs);
> > +       struct bpf_fprobe_link *fprobe_link;
> > +
> > +       /*
> > +        * Because fprobe's regs->ip is set to the next instruction of
> > > +        * dynamic-ftrace instruction, correct entry ip must be set, so
> > +        * that the bpf program can access entry address via regs as same
> > +        * as kprobes.
> > +        */
> > +       instruction_pointer_set(regs, entry_ip);
> > +
> > +       fprobe_link = container_of(fp, struct bpf_fprobe_link, fp);
> > +       fprobe_link_prog_run(fprobe_link, regs);
> > +
> > +       instruction_pointer_set(regs, saved_ip);
> > +}
> > +
> > +static void fprobe_link_exit_handler(struct fprobe *fp, unsigned long entry_ip,
> > +                                    struct pt_regs *regs)
> 
> > isn't it identical to fprobe_link_entry_handler? Maybe use one callback
> for both entry and exit?

heh, did not notice that :) yep, looks that way, will check

> 
> > +{
> > +       unsigned long saved_ip = instruction_pointer(regs);
> > +       struct bpf_fprobe_link *fprobe_link;
> > +
> > +       instruction_pointer_set(regs, entry_ip);
> > +
> > +       fprobe_link = container_of(fp, struct bpf_fprobe_link, fp);
> > +       fprobe_link_prog_run(fprobe_link, regs);
> > +
> > +       instruction_pointer_set(regs, saved_ip);
> > +}
> > +
> > +static int fprobe_resolve_syms(const void *usyms, u32 cnt,
> > +                              unsigned long *addrs)
> > +{
> > +       unsigned long addr, size;
> > +       const char **syms;
> > +       int err = -ENOMEM;
> > +       unsigned int i;
> > +       char *func;
> > +
> > +       size = cnt * sizeof(*syms);
> > +       syms = kzalloc(size, GFP_KERNEL);
> 
> any reason not to use kvzalloc() here?

probably just my ignorance ;-) will check

> 
> > +       if (!syms)
> > +               return -ENOMEM;
> > +
> 
> [...]
> 
> > +
> > +static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> > +{
> > +       struct bpf_fprobe_link *link = NULL;
> > +       struct bpf_link_primer link_primer;
> > +       unsigned long *addrs;
> > +       u32 flags, cnt, size;
> > +       void __user *uaddrs;
> > +       void __user *usyms;
> > +       int err;
> > +
> > +       /* no support for 32bit archs yet */
> > +       if (sizeof(u64) != sizeof(void *))
> > +               return -EINVAL;
> 
> -EOPNOTSUPP?

ok

> 
> > +
> > +       if (prog->expected_attach_type != BPF_TRACE_FPROBE)
> > +               return -EINVAL;
> > +
> > +       flags = attr->link_create.fprobe.flags;
> > +       if (flags & ~BPF_F_FPROBE_RETURN)
> > +               return -EINVAL;
> > +
> > +       uaddrs = u64_to_user_ptr(attr->link_create.fprobe.addrs);
> > +       usyms = u64_to_user_ptr(attr->link_create.fprobe.syms);
> > +       if ((!uaddrs && !usyms) || (uaddrs && usyms))
> > +               return -EINVAL;
> 
> !!uaddrs == !!usyms ?

ah right, will change
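
For the record, a tiny standalone sketch of why the two forms agree.
The function name is made up for this example; it returns true exactly
when the attach request should be rejected:

```c
#include <stdbool.h>
#include <stddef.h>

/* !!p normalizes a pointer to 0 or 1, so the comparison is true when
 * both pointers are set or both are NULL, i.e. the same condition as
 * (!uaddrs && !usyms) || (uaddrs && usyms).
 */
static bool addrs_syms_invalid(const void *uaddrs, const void *usyms)
{
	return !!uaddrs == !!usyms;
}
```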

> 
> > +
> > +       cnt = attr->link_create.fprobe.cnt;
> > +       if (!cnt)
> > +               return -EINVAL;
> > +
> > +       size = cnt * sizeof(*addrs);
> > +       addrs = kzalloc(size, GFP_KERNEL);
> 
> same, why not kvzalloc? Also, aren't you overwriting each addrs entry
> anyway, so "z" is not necessary, right?

true, no need for zeroing

thanks,
jirka


* Re: [PATCH 3/8] bpf: Add bpf_cookie support to fprobe
  2022-02-07 18:59   ` Andrii Nakryiko
@ 2022-02-08  9:07     ` Jiri Olsa
  2022-02-08 23:35       ` Andrii Nakryiko
  0 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-08  9:07 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Mon, Feb 07, 2022 at 10:59:21AM -0800, Andrii Nakryiko wrote:
> On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > Adding support to call bpf_get_attach_cookie helper from
> > kprobe program attached by fprobe link.
> >
> > The bpf_cookie is provided by array of u64 values, where
> > each value is paired with provided function address with
> > the same array index.
> >
> > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  include/linux/bpf.h            |  2 +
> >  include/uapi/linux/bpf.h       |  1 +
> >  kernel/bpf/syscall.c           | 83 +++++++++++++++++++++++++++++++++-
> >  kernel/trace/bpf_trace.c       | 16 ++++++-
> >  tools/include/uapi/linux/bpf.h |  1 +
> >  5 files changed, 100 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 6eb0b180d33b..7b65f05c0487 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1301,6 +1301,8 @@ static inline void bpf_reset_run_ctx(struct bpf_run_ctx *old_ctx)
> >  #endif
> >  }
> >
> > +u64 bpf_fprobe_cookie(struct bpf_run_ctx *ctx, u64 ip);
> > +
> >  /* BPF program asks to bypass CAP_NET_BIND_SERVICE in bind. */
> >  #define BPF_RET_BIND_NO_CAP_NET_BIND_SERVICE                   (1 << 0)
> >  /* BPF program asks to set CN on the packet. */
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index c0912f0a3dfe..0dc6aa4f9683 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1484,6 +1484,7 @@ union bpf_attr {
> >                                 __aligned_u64   addrs;
> >                                 __u32           cnt;
> >                                 __u32           flags;
> > +                               __aligned_u64   bpf_cookies;
> 
> maybe put it right after addrs, they are closely related and cnt
> describes all of syms/addrs/cookies.

ok

> 
> >                         } fprobe;
> >                 };
> >         } link_create;
> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index 0cfbb112c8e1..6c5e74bc43b6 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -33,6 +33,8 @@
> >  #include <linux/rcupdate_trace.h>
> >  #include <linux/memcontrol.h>
> >  #include <linux/fprobe.h>
> > +#include <linux/bsearch.h>
> > +#include <linux/sort.h>
> >
> >  #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
> >                           (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
> > @@ -3025,10 +3027,18 @@ static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *pro
> >
> >  #ifdef CONFIG_FPROBE
> >
> > +struct bpf_fprobe_cookie {
> > +       unsigned long addr;
> > +       u64 bpf_cookie;
> > +};
> > +
> >  struct bpf_fprobe_link {
> >         struct bpf_link link;
> >         struct fprobe fp;
> >         unsigned long *addrs;
> > +       struct bpf_run_ctx run_ctx;
> > +       struct bpf_fprobe_cookie *bpf_cookies;
> 
> you already have all the addrs above, why keeping a second copy of
> each addrs in bpf_fprobe_cookie. Let's have two arrays: addrs
> (unsigned long) and cookies (u64) and make sure that they are sorted
> together. Then lookup addrs, calculate index, use that index to fetch
> cookie.
> 
> Seems like sort_r() provides exactly the interface you'd need to do
> this very easily. Having addrs separate from cookies also a bit
> advantageous in terms of TLB misses (if you need any more persuasion
> ;)

no persuasion needed, I actually tried that, but it turned out sort_r
is not ready for it yet ;-)

the swap callback does not get the priv pointer, so we can't swap the
other array.. I made a change to allow that, but it's not trivial and
will need more testing/review, because the original sort calls sort_r,
and of course there are many 'sort' users ;-)
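
The two-array layout discussed here (sort addrs, keep cookies in step, look up the address and reuse its index) can be sketched in user space as below. This is only an illustration under stated assumptions: sort_addrs_with_cookies() and lookup_cookie() are hypothetical names, and a plain insertion sort stands in for a sort_r() whose swap callback could reach the second array.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sort addrs ascending and apply every move to cookies as well, so that
 * cookies[i] keeps belonging to addrs[i]. Insertion sort is used only to
 * keep the sketch short.
 */
static void sort_addrs_with_cookies(unsigned long *addrs, uint64_t *cookies,
				    size_t cnt)
{
	for (size_t i = 1; i < cnt; i++) {
		unsigned long addr = addrs[i];
		uint64_t cookie = cookies[i];
		size_t j = i;

		while (j > 0 && addrs[j - 1] > addr) {
			addrs[j] = addrs[j - 1];
			cookies[j] = cookies[j - 1];
			j--;
		}
		addrs[j] = addr;
		cookies[j] = cookie;
	}
}

static int addr_cmp(const void *key, const void *elem)
{
	unsigned long a = *(const unsigned long *)key;
	unsigned long b = *(const unsigned long *)elem;

	if (a == b)
		return 0;
	return a < b ? -1 : 1;
}

/* Binary-search ip in the sorted addrs array and use the index of the hit
 * to fetch the cookie. Returns 0 when ip is not in the array.
 */
static uint64_t lookup_cookie(const unsigned long *addrs,
			      const uint64_t *cookies, size_t cnt,
			      unsigned long ip)
{
	const unsigned long *hit;

	if (!cnt)
		return 0;
	hit = bsearch(&ip, addrs, cnt, sizeof(*addrs), addr_cmp);
	return hit ? cookies[hit - addrs] : 0;
}
```

With this layout no address is stored twice, and the lookup only touches the addrs array until a match is found.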

> 
> > +       u32 cnt;
> >  };
> >
> >  static void bpf_fprobe_link_release(struct bpf_link *link)
> > @@ -3045,6 +3055,7 @@ static void bpf_fprobe_link_dealloc(struct bpf_link *link)
> >
> >         fprobe_link = container_of(link, struct bpf_fprobe_link, link);
> >         kfree(fprobe_link->addrs);
> > +       kfree(fprobe_link->bpf_cookies);
> >         kfree(fprobe_link);
> >  }
> >
> > @@ -3053,9 +3064,37 @@ static const struct bpf_link_ops bpf_fprobe_link_lops = {
> >         .dealloc = bpf_fprobe_link_dealloc,
> >  };
> >
> > +static int bpf_fprobe_cookie_cmp(const void *_a, const void *_b)
> > +{
> > +       const struct bpf_fprobe_cookie *a = _a;
> > +       const struct bpf_fprobe_cookie *b = _b;
> > +
> > +       if (a->addr == b->addr)
> > +               return 0;
> > +       return a->addr < b->addr ? -1 : 1;
> > +}
> > +
> > +u64 bpf_fprobe_cookie(struct bpf_run_ctx *ctx, u64 ip)
> > +{
> > +       struct bpf_fprobe_link *fprobe_link;
> > +       struct bpf_fprobe_cookie *val, key = {
> > +               .addr = (unsigned long) ip,
> > +       };
> > +
> > +       if (!ctx)
> > +               return 0;
> 
> is it allowed to have ctx == NULL?

nope, I was also thinking this is more 'WARN_ON[_ONCE]' check

> 
> > +       fprobe_link = container_of(ctx, struct bpf_fprobe_link, run_ctx);
> > +       if (!fprobe_link->bpf_cookies)
> > +               return 0;
> > +       val = bsearch(&key, fprobe_link->bpf_cookies, fprobe_link->cnt,
> > +                     sizeof(key), bpf_fprobe_cookie_cmp);
> > +       return val ? val->bpf_cookie : 0;
> > +}
> > +
> >  static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
> >                                 struct pt_regs *regs)
> >  {
> > +       struct bpf_run_ctx *old_run_ctx;
> >         int err;
> >
> >         if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > @@ -3063,12 +3102,16 @@ static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
> >                 goto out;
> >         }
> >
> > +       old_run_ctx = bpf_set_run_ctx(&fprobe_link->run_ctx);
> > +
> >         rcu_read_lock();
> >         migrate_disable();
> >         err = bpf_prog_run(fprobe_link->link.prog, regs);
> >         migrate_enable();
> >         rcu_read_unlock();
> >
> > +       bpf_reset_run_ctx(old_run_ctx);
> > +
> >   out:
> >         __this_cpu_dec(bpf_prog_active);
> >         return err;
> > @@ -3161,10 +3204,12 @@ static int fprobe_resolve_syms(const void *usyms, u32 cnt,
> >
> >  static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> >  {
> > +       struct bpf_fprobe_cookie *bpf_cookies = NULL;
> >         struct bpf_fprobe_link *link = NULL;
> >         struct bpf_link_primer link_primer;
> > +       void __user *ubpf_cookies;
> > +       u32 flags, cnt, i, size;
> >         unsigned long *addrs;
> > -       u32 flags, cnt, size;
> >         void __user *uaddrs;
> >         void __user *usyms;
> >         int err;
> > @@ -3205,6 +3250,37 @@ static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *p
> >                         goto error;
> >         }
> >
> > +       ubpf_cookies = u64_to_user_ptr(attr->link_create.fprobe.bpf_cookies);
> 
> nit: let's call all this "cookies", this bpf_ prefix feels a bit
> redundant (I know about perf_event.bpf_cookie, but still).

ok

> 
> > +       if (ubpf_cookies) {
> > +               u64 *tmp;
> > +
> > +               err = -ENOMEM;
> > +               tmp = kzalloc(size, GFP_KERNEL);
> 
> kvmalloc?

ok

thanks,
jirka



* Re: [PATCH 4/8] libbpf: Add libbpf__kallsyms_parse function
  2022-02-07 18:59   ` Andrii Nakryiko
@ 2022-02-08  9:08     ` Jiri Olsa
  0 siblings, 0 replies; 50+ messages in thread
From: Jiri Olsa @ 2022-02-08  9:08 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Mon, Feb 07, 2022 at 10:59:24AM -0800, Andrii Nakryiko wrote:
> On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > Move the kallsyms parsing in internal libbpf__kallsyms_parse
> > function, so it can be used from other places.
> >
> > It will be used in following changes.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  tools/lib/bpf/libbpf.c          | 62 ++++++++++++++++++++-------------
> >  tools/lib/bpf/libbpf_internal.h |  5 +++
> >  2 files changed, 43 insertions(+), 24 deletions(-)
> >
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index 1b0936b016d9..7d595cfd03bc 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -7165,12 +7165,10 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj)
> >         return 0;
> >  }
> >
> > -static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
> > +int libbpf__kallsyms_parse(void *arg, kallsyms_cb_t cb)
> 
> please call it libbpf_kallsyms_parse(), internal APIs don't use
> "object oriented" double underscore separator
> 
> also this "arg" is normally called "ctx" in similar APIs in libbpf and
> is passed the last, can you please adjust all that for consistency?

ok, thanks

jirka

> 
> >  {
> >         char sym_type, sym_name[500];
> >         unsigned long long sym_addr;
> > -       const struct btf_type *t;
> > -       struct extern_desc *ext;
> >         int ret, err = 0;
> >         FILE *f;
> >
> 
> [...]
> 



* Re: [PATCH 6/8] libbpf: Add bpf_program__attach_kprobe_opts for multi kprobes
  2022-02-07 18:59   ` Andrii Nakryiko
@ 2022-02-08  9:12     ` Jiri Olsa
  0 siblings, 0 replies; 50+ messages in thread
From: Jiri Olsa @ 2022-02-08  9:12 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Mon, Feb 07, 2022 at 10:59:29AM -0800, Andrii Nakryiko wrote:

SNIP

> > +struct fprobe_resolve {
> > +       const char *name;
> > +       __u64 *addrs;
> > +       __u32 alloc;
> > +       __u32 cnt;
> > +};
> > +
> > +static bool glob_matches(const char *glob, const char *s)
> 
> we've since added more generic glob_match() implementation (see
> test_progs.c), let's copy/paste that one (it's actually shorter and
> doesn't do hacky input args modification). Let's maybe also add '?'
> handling (it's trivial). Both original code in perf and the one in
> test_progs.c are GPL-2.0-only, so let's also get acks from original
> authors.

ok, will check
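
For reference, a glob matcher that handles both '*' and '?' without modifying its input can be quite small. The sketch below is written from scratch for illustration; it is not the test_progs.c implementation mentioned above, and the name glob_match_sketch() is hypothetical.

```c
#include <stdbool.h>

/* Match str against pat, where '*' matches any run of characters
 * (including an empty one) and '?' matches exactly one character.
 * Simple recursive version; worst-case cost grows quickly with the
 * number of '*' in the pattern.
 */
static bool glob_match_sketch(const char *pat, const char *str)
{
	while (*pat) {
		if (*pat == '*') {
			/* a run of '*' behaves like a single '*' */
			while (*pat == '*')
				pat++;
			/* trailing '*' matches the rest of str */
			if (!*pat)
				return true;
			/* try the rest of the pattern at every position */
			for (; *str; str++) {
				if (glob_match_sketch(pat, str))
					return true;
			}
			return false;
		}
		/* pattern still needs a character, but str is exhausted */
		if (!*str)
			return false;
		if (*pat != '?' && *pat != *str)
			return false;
		pat++;
		str++;
	}
	/* pattern consumed: match only if str is consumed too */
	return *str == '\0';
}
```

Prefix ("x*"), suffix ("*x"), substring ("*x*") and exact patterns from the quoted glob_matches() all fall out of the same loop, and the const input is never written to.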

> 
> > +{
> > +       int n = strlen(glob);
> > +
> > +       if (n == 1 && glob[0] == '*')
> > +               return true;
> > +
> > +       if (glob[0] == '*' && glob[n - 1] == '*') {
> > +               const char *subs;
> > +               /* substring match */
> > +
> > +               /* this is hacky, but we don't want to allocate
> > +                * for no good reason
> > +                */
> > +               ((char *)glob)[n - 1] = '\0';
> > +               subs = strstr(s, glob + 1);
> > +               ((char *)glob)[n - 1] = '*';
> > +
> > +               return subs != NULL;
> > +       } else if (glob[0] == '*') {
> > +               size_t nn = strlen(s);
> > +               /* suffix match */
> > +
> > +               /* too short for a given suffix */
> > +               if (nn < n - 1)
> > +                       return false;
> > +               return strcmp(s + nn - (n - 1), glob + 1) == 0;
> > +       } else if (glob[n - 1] == '*') {
> > +               /* prefix match */
> > +               return strncmp(s, glob, n - 1) == 0;
> > +       } else {
> > +               /* exact match */
> > +               return strcmp(glob, s) == 0;
> > +       }
> > +}
> > +
> > +static int resolve_fprobe_cb(void *arg, unsigned long long sym_addr,
> > +                            char sym_type, const char *sym_name)
> > +{
> > +       struct fprobe_resolve *res = arg;
> > +       __u64 *p;
> > +
> > +       if (!glob_matches(res->name, sym_name))
> > +               return 0;
> > +
> > +       if (res->cnt == res->alloc) {
> > +               res->alloc = max((__u32) 16, res->alloc * 3 / 2);
> > +               p = libbpf_reallocarray(res->addrs, res->alloc, sizeof(__u32));
> > +               if (!p)
> > +                       return -ENOMEM;
> > +               res->addrs = p;
> > +       }
> 
> please use libbpf_ensure_mem() instead

ok
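
The grow-on-demand pattern in the quoted callback can be sketched in plain C as follows. This is a user-space illustration with realloc(), not libbpf_ensure_mem() itself; struct addr_list and addr_list_add() are hypothetical names. The sketch sizes each element with sizeof(*res->addrs); the quoted hunk passes sizeof(__u32) for a __u64 array, which looks too small.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct addr_list {
	uint64_t *addrs;	/* resolved addresses */
	size_t alloc;		/* number of elements allocated */
	size_t cnt;		/* number of elements in use */
};

/* Append addr, growing the array by roughly 1.5x (at least 16 slots)
 * when it is full. The bookkeeping is updated only after realloc()
 * succeeded, so a failure leaves the list unchanged.
 */
static int addr_list_add(struct addr_list *res, uint64_t addr)
{
	if (res->cnt == res->alloc) {
		size_t new_alloc = res->alloc * 3 / 2;
		uint64_t *p;

		if (new_alloc < 16)
			new_alloc = 16;
		p = realloc(res->addrs, new_alloc * sizeof(*res->addrs));
		if (!p)
			return -ENOMEM;
		res->addrs = p;
		res->alloc = new_alloc;
	}
	res->addrs[res->cnt++] = addr;
	return 0;
}
```

A caller starts from a zero-initialized struct addr_list and frees res.addrs when done.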

> 
> 
> > +       res->addrs[res->cnt++] = sym_addr;
> > +       return 0;
> > +}
> > +
> > +static struct bpf_link *
> > +attach_fprobe_opts(const struct bpf_program *prog,
> > +                  const char *func_name,
> 
> func_glob or func_pattern?

ok

> 
> > +                  const struct bpf_kprobe_opts *kopts)
> > +{
> > +       DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> > +       struct fprobe_resolve res = {
> > +               .name = func_name,
> > +       };
> > +       struct bpf_link *link = NULL;
> > +       char errmsg[STRERR_BUFSIZE];
> > +       int err, link_fd, prog_fd;
> > +       bool retprobe;
> > +
> > +       err = libbpf__kallsyms_parse(&res, resolve_fprobe_cb);
> > +       if (err)
> > +               goto error;
> > +       if (!res.cnt) {
> > +               err = -ENOENT;
> > +               goto error;
> > +       }
> > +
> > +       retprobe = OPTS_GET(kopts, retprobe, false);
> > +
> > +       opts.fprobe.addrs = (__u64) res.addrs;
> 
> ptr_to_u64()

ok

> 
> > +       opts.fprobe.cnt = res.cnt;
> > +       opts.flags = retprobe ? BPF_F_FPROBE_RETURN : 0;
> > +
> > +       link = calloc(1, sizeof(*link));
> > +       if (!link) {
> > +               err = -ENOMEM;
> > +               goto error;
> > +       }
> > +       link->detach = &bpf_link__detach_fd;
> > +
> > +       prog_fd = bpf_program__fd(prog);
> > +       link_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
> > +       if (link_fd < 0) {
> > +               err = -errno;
> > +               pr_warn("prog '%s': failed to attach to %s: %s\n",
> > +                       prog->name, res.name,
> > +                       libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
> > +               goto error;
> > +       }
> > +       link->fd = link_fd;
> > +       free(res.addrs);
> > +       return link;
> > +
> > +error:
> > +       free(link);
> > +       free(res.addrs);
> > +       return libbpf_err_ptr(err);
> > +}
> > +
> >  struct bpf_link *
> >  bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
> >                                 const char *func_name,
> > @@ -10047,6 +10166,9 @@ bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
> >         if (!OPTS_VALID(opts, bpf_kprobe_opts))
> >                 return libbpf_err_ptr(-EINVAL);
> >
> > +       if (prog->expected_attach_type == BPF_TRACE_FPROBE)
> > +               return attach_fprobe_opts(prog, func_name, opts);
> > +
> >         retprobe = OPTS_GET(opts, retprobe, false);
> >         offset = OPTS_GET(opts, offset, 0);
> >         pe_opts.bpf_cookie = OPTS_GET(opts, bpf_cookie, 0);
> > @@ -10112,6 +10234,14 @@ struct bpf_link *bpf_program__attach_kprobe(const struct bpf_program *prog,
> >         return bpf_program__attach_kprobe_opts(prog, func_name, &opts);
> >  }
> >
> > +static int init_kprobe(struct bpf_program *prog, long cookie)
> > +{
> > +       /* If we have wildcard, switch to fprobe link. */
> > +       if (strchr(prog->sec_name, '*'))
> 
> ugh... :( maybe let's have a separate SEC("kprobe.multi/<glob>") and
> same for kretprobe?

I agree new SEC type is more clear ;-) ok

thanks,
jirka



* Re: [PATCH 8/8] selftest/bpf: Add fprobe test for bpf_cookie values
  2022-02-07 18:59   ` Andrii Nakryiko
@ 2022-02-08  9:15     ` Jiri Olsa
  2022-02-08 23:24       ` Andrii Nakryiko
  0 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-08  9:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Mon, Feb 07, 2022 at 10:59:32AM -0800, Andrii Nakryiko wrote:
> On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > Adding bpf_cookie test for kprobe attached by fprobe link.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  .../selftests/bpf/prog_tests/bpf_cookie.c     | 73 +++++++++++++++++++
> >  .../selftests/bpf/progs/fprobe_bpf_cookie.c   | 62 ++++++++++++++++
> >  2 files changed, 135 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
> > index cd10df6cd0fc..bf70d859c598 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
> > @@ -7,6 +7,7 @@
> >  #include <unistd.h>
> >  #include <test_progs.h>
> >  #include "test_bpf_cookie.skel.h"
> > +#include "fprobe_bpf_cookie.skel.h"
> >
> >  /* uprobe attach point */
> >  static void trigger_func(void)
> > @@ -63,6 +64,76 @@ static void kprobe_subtest(struct test_bpf_cookie *skel)
> >         bpf_link__destroy(retlink2);
> >  }
> >
> > +static void fprobe_subtest(void)
> > +{
> > +       DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> > +       int err, prog_fd, link1_fd = -1, link2_fd = -1;
> > +       struct fprobe_bpf_cookie *skel = NULL;
> > +       __u32 duration = 0, retval;
> > +       __u64 addrs[8], cookies[8];
> > +
> > +       skel = fprobe_bpf_cookie__open_and_load();
> > +       if (!ASSERT_OK_PTR(skel, "fentry_raw_skel_load"))
> > +               goto cleanup;
> > +
> > +       kallsyms_find("bpf_fentry_test1", &addrs[0]);
> > +       kallsyms_find("bpf_fentry_test2", &addrs[1]);
> > +       kallsyms_find("bpf_fentry_test3", &addrs[2]);
> > +       kallsyms_find("bpf_fentry_test4", &addrs[3]);
> > +       kallsyms_find("bpf_fentry_test5", &addrs[4]);
> > +       kallsyms_find("bpf_fentry_test6", &addrs[5]);
> > +       kallsyms_find("bpf_fentry_test7", &addrs[6]);
> > +       kallsyms_find("bpf_fentry_test8", &addrs[7]);
> > +
> > +       cookies[0] = 1;
> > +       cookies[1] = 2;
> > +       cookies[2] = 3;
> > +       cookies[3] = 4;
> > +       cookies[4] = 5;
> > +       cookies[5] = 6;
> > +       cookies[6] = 7;
> > +       cookies[7] = 8;
> > +
> > +       opts.fprobe.addrs = (__u64) &addrs;
> 
> we should have ptr_to_u64() for test_progs, but if not, let's either
> add it or it should be (__u64)(uintptr_t)&addrs. Otherwise we'll be
> getting compilation warnings on some architectures.

there's one in btf.c, bpf.c and libbpf.c ;-) so I guess it could go to bpf.h

> 
> > +       opts.fprobe.cnt = 8;
> > +       opts.fprobe.bpf_cookies = (__u64) &cookies;
> > +       prog_fd = bpf_program__fd(skel->progs.test2);
> > +
> > +       link1_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
> > +       if (!ASSERT_GE(link1_fd, 0, "link1_fd"))
> > +               return;
> > +
> > +       cookies[0] = 8;
> > +       cookies[1] = 7;
> > +       cookies[2] = 6;
> > +       cookies[3] = 5;
> > +       cookies[4] = 4;
> > +       cookies[5] = 3;
> > +       cookies[6] = 2;
> > +       cookies[7] = 1;
> > +
> > +       opts.flags = BPF_F_FPROBE_RETURN;
> > +       prog_fd = bpf_program__fd(skel->progs.test3);
> > +
> > +       link2_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
> > +       if (!ASSERT_GE(link2_fd, 0, "link2_fd"))
> > +               goto cleanup;
> > +
> > +       prog_fd = bpf_program__fd(skel->progs.test1);
> > +       err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
> > +                               NULL, NULL, &retval, &duration);
> > +       ASSERT_OK(err, "test_run");
> > +       ASSERT_EQ(retval, 0, "test_run");
> > +
> > +       ASSERT_EQ(skel->bss->test2_result, 8, "test2_result");
> > +       ASSERT_EQ(skel->bss->test3_result, 8, "test3_result");
> > +
> > +cleanup:
> > +       close(link1_fd);
> > +       close(link2_fd);
> > +       fprobe_bpf_cookie__destroy(skel);
> > +}
> > +
> >  static void uprobe_subtest(struct test_bpf_cookie *skel)
> >  {
> >         DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, opts);
> > @@ -249,6 +320,8 @@ void test_bpf_cookie(void)
> >
> >         if (test__start_subtest("kprobe"))
> >                 kprobe_subtest(skel);
> > +       if (test__start_subtest("rawkprobe"))
> 
> kprobe.multi?

yes

thanks,
jirka



* Re: [PATCH 8/8] selftest/bpf: Add fprobe test for bpf_cookie values
  2022-02-08  9:15     ` Jiri Olsa
@ 2022-02-08 23:24       ` Andrii Nakryiko
  0 siblings, 0 replies; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-08 23:24 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Tue, Feb 8, 2022 at 1:16 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Mon, Feb 07, 2022 at 10:59:32AM -0800, Andrii Nakryiko wrote:
> > On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > Adding bpf_cookie test for kprobe attached by fprobe link.
> > >
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >  .../selftests/bpf/prog_tests/bpf_cookie.c     | 73 +++++++++++++++++++
> > >  .../selftests/bpf/progs/fprobe_bpf_cookie.c   | 62 ++++++++++++++++
> > >  2 files changed, 135 insertions(+)
> > >  create mode 100644 tools/testing/selftests/bpf/progs/fprobe_bpf_cookie.c
> > >
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
> > > index cd10df6cd0fc..bf70d859c598 100644
> > > --- a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
> > > +++ b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
> > > @@ -7,6 +7,7 @@
> > >  #include <unistd.h>
> > >  #include <test_progs.h>
> > >  #include "test_bpf_cookie.skel.h"
> > > +#include "fprobe_bpf_cookie.skel.h"
> > >
> > >  /* uprobe attach point */
> > >  static void trigger_func(void)
> > > @@ -63,6 +64,76 @@ static void kprobe_subtest(struct test_bpf_cookie *skel)
> > >         bpf_link__destroy(retlink2);
> > >  }
> > >
> > > +static void fprobe_subtest(void)
> > > +{
> > > +       DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> > > +       int err, prog_fd, link1_fd = -1, link2_fd = -1;
> > > +       struct fprobe_bpf_cookie *skel = NULL;
> > > +       __u32 duration = 0, retval;
> > > +       __u64 addrs[8], cookies[8];
> > > +
> > > +       skel = fprobe_bpf_cookie__open_and_load();
> > > +       if (!ASSERT_OK_PTR(skel, "fentry_raw_skel_load"))
> > > +               goto cleanup;
> > > +
> > > +       kallsyms_find("bpf_fentry_test1", &addrs[0]);
> > > +       kallsyms_find("bpf_fentry_test2", &addrs[1]);
> > > +       kallsyms_find("bpf_fentry_test3", &addrs[2]);
> > > +       kallsyms_find("bpf_fentry_test4", &addrs[3]);
> > > +       kallsyms_find("bpf_fentry_test5", &addrs[4]);
> > > +       kallsyms_find("bpf_fentry_test6", &addrs[5]);
> > > +       kallsyms_find("bpf_fentry_test7", &addrs[6]);
> > > +       kallsyms_find("bpf_fentry_test8", &addrs[7]);
> > > +
> > > +       cookies[0] = 1;
> > > +       cookies[1] = 2;
> > > +       cookies[2] = 3;
> > > +       cookies[3] = 4;
> > > +       cookies[4] = 5;
> > > +       cookies[5] = 6;
> > > +       cookies[6] = 7;
> > > +       cookies[7] = 8;
> > > +
> > > +       opts.fprobe.addrs = (__u64) &addrs;
> >
> > we should have ptr_to_u64() for test_progs, but if not, let's either
> > add it or it should be (__u64)(uintptr_t)&addrs. Otherwise we'll be
> > getting compilation warnings on some architectures.
>
> there's one in btf.c, bpf.c and libbpf.c ;-) so I guess it could go to bpf.h

No, it shouldn't, bpf.h is a public API header. Let's keep internal
helpers internal. Just copy/paste.
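
A local copy of such a helper is tiny. The sketch below uses the (uintptr_t) intermediate cast suggested earlier in the thread, so the pointer is first converted to an integer of pointer width and only then widened to 64 bits; uint64_t stands in for __u64 to keep the example free of kernel headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert a pointer to a 64-bit integer without a direct
 * pointer-to-wider-integer cast, which can trigger compiler
 * warnings on 32-bit targets.
 */
static inline uint64_t ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}
```

In the test this would replace casts such as (__u64) &addrs with ptr_to_u64(addrs).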

>
> >
> > > +       opts.fprobe.cnt = 8;
> > > +       opts.fprobe.bpf_cookies = (__u64) &cookies;
> > > +       prog_fd = bpf_program__fd(skel->progs.test2);
> > > +
> > > +       link1_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
> > > +       if (!ASSERT_GE(link1_fd, 0, "link1_fd"))
> > > +               return;
> > > +
> > > +       cookies[0] = 8;
> > > +       cookies[1] = 7;
> > > +       cookies[2] = 6;
> > > +       cookies[3] = 5;
> > > +       cookies[4] = 4;
> > > +       cookies[5] = 3;
> > > +       cookies[6] = 2;
> > > +       cookies[7] = 1;
> > > +
> > > +       opts.flags = BPF_F_FPROBE_RETURN;
> > > +       prog_fd = bpf_program__fd(skel->progs.test3);
> > > +
> > > +       link2_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_FPROBE, &opts);
> > > +       if (!ASSERT_GE(link2_fd, 0, "link2_fd"))
> > > +               goto cleanup;
> > > +
> > > +       prog_fd = bpf_program__fd(skel->progs.test1);
> > > +       err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
> > > +                               NULL, NULL, &retval, &duration);
> > > +       ASSERT_OK(err, "test_run");
> > > +       ASSERT_EQ(retval, 0, "test_run");
> > > +
> > > +       ASSERT_EQ(skel->bss->test2_result, 8, "test2_result");
> > > +       ASSERT_EQ(skel->bss->test3_result, 8, "test3_result");
> > > +
> > > +cleanup:
> > > +       close(link1_fd);
> > > +       close(link2_fd);
> > > +       fprobe_bpf_cookie__destroy(skel);
> > > +}
> > > +
> > >  static void uprobe_subtest(struct test_bpf_cookie *skel)
> > >  {
> > >         DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, opts);
> > > @@ -249,6 +320,8 @@ void test_bpf_cookie(void)
> > >
> > >         if (test__start_subtest("kprobe"))
> > >                 kprobe_subtest(skel);
> > > +       if (test__start_subtest("rawkprobe"))
> >
> > kprobe.multi?
>
> yes
>
> thanks,
> jirka
>


* Re: [PATCH 3/8] bpf: Add bpf_cookie support to fprobe
  2022-02-08  9:07     ` Jiri Olsa
@ 2022-02-08 23:35       ` Andrii Nakryiko
  2022-02-08 23:46         ` Jiri Olsa
  0 siblings, 1 reply; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-08 23:35 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt, Jiri Olsa

On Tue, Feb 8, 2022 at 1:07 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Mon, Feb 07, 2022 at 10:59:21AM -0800, Andrii Nakryiko wrote:
> > On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > Adding support to call bpf_get_attach_cookie helper from
> > > kprobe program attached by fprobe link.
> > >
> > > The bpf_cookie is provided by array of u64 values, where
> > > each value is paired with provided function address with
> > > the same array index.
> > >
> > > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >  include/linux/bpf.h            |  2 +
> > >  include/uapi/linux/bpf.h       |  1 +
> > >  kernel/bpf/syscall.c           | 83 +++++++++++++++++++++++++++++++++-
> > >  kernel/trace/bpf_trace.c       | 16 ++++++-
> > >  tools/include/uapi/linux/bpf.h |  1 +
> > >  5 files changed, 100 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index 6eb0b180d33b..7b65f05c0487 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -1301,6 +1301,8 @@ static inline void bpf_reset_run_ctx(struct bpf_run_ctx *old_ctx)
> > >  #endif
> > >  }
> > >
> > > +u64 bpf_fprobe_cookie(struct bpf_run_ctx *ctx, u64 ip);
> > > +
> > >  /* BPF program asks to bypass CAP_NET_BIND_SERVICE in bind. */
> > >  #define BPF_RET_BIND_NO_CAP_NET_BIND_SERVICE                   (1 << 0)
> > >  /* BPF program asks to set CN on the packet. */
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index c0912f0a3dfe..0dc6aa4f9683 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -1484,6 +1484,7 @@ union bpf_attr {
> > >                                 __aligned_u64   addrs;
> > >                                 __u32           cnt;
> > >                                 __u32           flags;
> > > +                               __aligned_u64   bpf_cookies;
> >
> > maybe put it right after addrs, they are closely related and cnt
> > describes all of syms/addrs/cookies.
>
> ok
>
> >
> > >                         } fprobe;
> > >                 };
> > >         } link_create;
> > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > index 0cfbb112c8e1..6c5e74bc43b6 100644
> > > --- a/kernel/bpf/syscall.c
> > > +++ b/kernel/bpf/syscall.c
> > > @@ -33,6 +33,8 @@
> > >  #include <linux/rcupdate_trace.h>
> > >  #include <linux/memcontrol.h>
> > >  #include <linux/fprobe.h>
> > > +#include <linux/bsearch.h>
> > > +#include <linux/sort.h>
> > >
> > >  #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
> > >                           (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
> > > @@ -3025,10 +3027,18 @@ static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *pro
> > >
> > >  #ifdef CONFIG_FPROBE
> > >
> > > +struct bpf_fprobe_cookie {
> > > +       unsigned long addr;
> > > +       u64 bpf_cookie;
> > > +};
> > > +
> > >  struct bpf_fprobe_link {
> > >         struct bpf_link link;
> > >         struct fprobe fp;
> > >         unsigned long *addrs;
> > > +       struct bpf_run_ctx run_ctx;
> > > +       struct bpf_fprobe_cookie *bpf_cookies;
> >
> > you already have all the addrs above, why keeping a second copy of
> > each addrs in bpf_fprobe_cookie. Let's have two arrays: addrs
> > (unsigned long) and cookies (u64) and make sure that they are sorted
> > together. Then lookup addrs, calculate index, use that index to fetch
> > cookie.
> >
> > Seems like sort_r() provides exactly the interface you'd need to do
> > this very easily. Having addrs separate from cookies also a bit
> > advantageous in terms of TLB misses (if you need any more persuasion
> > ;)
>
> > no persuasion needed, I actually tried that but it turned out sort_r
> is not ready yet ;-)
>
> because you can't pass priv pointer to the swap callback, so we can't
> swap the other array.. I did a change to allow that, but it's not trivial
> and will need some bigger testing/review because the original sort
> calls sort_r, and of course there are many 'sort' users ;-)

Big sigh... :( Did you do something similar to _CMP_WRAPPER? You don't
need to change the interface of sort(), so it shouldn't require
extensive code refactoring. You'll just need to adjust priv to be not
just cmp_func, but cmp_func + swap_func (need a small struct on the
stack in sort, probably). Or did you do something else?
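
The idea of carrying cmp_func + swap_func through priv can be sketched outside the kernel as below. All names are hypothetical: toy_sort_r() and toy_sort() are simplified stand-ins for sort_r() and sort(), a bubble sort replaces the real algorithm, and dedicated wrapper functions are assumed instead of a _CMP_WRAPPER-style sentinel.

```c
#include <stddef.h>

/* Callback types without and with a priv pointer. */
typedef int (*cmp_fn)(const void *a, const void *b);
typedef void (*swap_fn)(void *a, void *b, size_t size);
typedef int (*cmp_r_fn)(const void *a, const void *b, const void *priv);
typedef void (*swap_r_fn)(void *a, void *b, size_t size, const void *priv);

/* Stand-in for sort_r(): bubble sort that hands priv to both callbacks. */
static void toy_sort_r(void *base, size_t num, size_t size,
		       cmp_r_fn cmp, swap_r_fn swap, const void *priv)
{
	char *p = base;

	for (size_t i = 0; i + 1 < num; i++)
		for (size_t j = 0; j + 1 < num - i; j++)
			if (cmp(p + j * size, p + (j + 1) * size, priv) > 0)
				swap(p + j * size, p + (j + 1) * size, size, priv);
}

/* Small struct that carries both priv-less callbacks through priv. */
struct sort_wrapper {
	cmp_fn cmp;
	swap_fn swap;
};

static int cmp_wrapper(const void *a, const void *b, const void *priv)
{
	const struct sort_wrapper *w = priv;

	return w->cmp(a, b);
}

static void swap_wrapper(void *a, void *b, size_t size, const void *priv)
{
	const struct sort_wrapper *w = priv;

	w->swap(a, b, size);
}

/* Stand-in for sort(): unchanged interface, wrapper lives on the stack. */
static void toy_sort(void *base, size_t num, size_t size,
		     cmp_fn cmp, swap_fn swap)
{
	struct sort_wrapper w = { .cmp = cmp, .swap = swap };

	toy_sort_r(base, num, size, cmp_wrapper, swap_wrapper, &w);
}

/* Example priv-less callbacks. */
static int cmp_ulong(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return x == y ? 0 : (x < y ? -1 : 1);
}

static void swap_bytes(void *a, void *b, size_t size)
{
	char *x = a, *y = b;

	while (size--) {
		char t = *x;

		*x++ = *y;
		*y++ = t;
	}
}
```

Existing callers keep calling toy_sort() with priv-less callbacks, while a caller that needs to move a second array in step can call toy_sort_r() directly with a swap callback that reads that array from priv.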

>
> >
> > > +       u32 cnt;
> > >  };
> > >
> > >  static void bpf_fprobe_link_release(struct bpf_link *link)
> > > @@ -3045,6 +3055,7 @@ static void bpf_fprobe_link_dealloc(struct bpf_link *link)
> > >
> > >         fprobe_link = container_of(link, struct bpf_fprobe_link, link);
> > >         kfree(fprobe_link->addrs);
> > > +       kfree(fprobe_link->bpf_cookies);
> > >         kfree(fprobe_link);
> > >  }
> > >
> > > @@ -3053,9 +3064,37 @@ static const struct bpf_link_ops bpf_fprobe_link_lops = {
> > >         .dealloc = bpf_fprobe_link_dealloc,
> > >  };
> > >
> > > +static int bpf_fprobe_cookie_cmp(const void *_a, const void *_b)
> > > +{
> > > +       const struct bpf_fprobe_cookie *a = _a;
> > > +       const struct bpf_fprobe_cookie *b = _b;
> > > +
> > > +       if (a->addr == b->addr)
> > > +               return 0;
> > > +       return a->addr < b->addr ? -1 : 1;
> > > +}
> > > +
> > > +u64 bpf_fprobe_cookie(struct bpf_run_ctx *ctx, u64 ip)
> > > +{
> > > +       struct bpf_fprobe_link *fprobe_link;
> > > +       struct bpf_fprobe_cookie *val, key = {
> > > +               .addr = (unsigned long) ip,
> > > +       };
> > > +
> > > +       if (!ctx)
> > > +               return 0;
> >
> > is it allowed to have ctx == NULL?
>
> nope, I was also thinking this is more 'WARN_ON[_ONCE]' check
>
> >
> > > +       fprobe_link = container_of(ctx, struct bpf_fprobe_link, run_ctx);
> > > +       if (!fprobe_link->bpf_cookies)
> > > +               return 0;
> > > +       val = bsearch(&key, fprobe_link->bpf_cookies, fprobe_link->cnt,
> > > +                     sizeof(key), bpf_fprobe_cookie_cmp);
> > > +       return val ? val->bpf_cookie : 0;
> > > +}
> > > +
> > >  static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
> > >                                 struct pt_regs *regs)
> > >  {
> > > +       struct bpf_run_ctx *old_run_ctx;
> > >         int err;
> > >
> > >         if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > > @@ -3063,12 +3102,16 @@ static int fprobe_link_prog_run(struct bpf_fprobe_link *fprobe_link,
> > >                 goto out;
> > >         }
> > >
> > > +       old_run_ctx = bpf_set_run_ctx(&fprobe_link->run_ctx);
> > > +
> > >         rcu_read_lock();
> > >         migrate_disable();
> > >         err = bpf_prog_run(fprobe_link->link.prog, regs);
> > >         migrate_enable();
> > >         rcu_read_unlock();
> > >
> > > +       bpf_reset_run_ctx(old_run_ctx);
> > > +
> > >   out:
> > >         __this_cpu_dec(bpf_prog_active);
> > >         return err;
> > > @@ -3161,10 +3204,12 @@ static int fprobe_resolve_syms(const void *usyms, u32 cnt,
> > >
> > >  static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> > >  {
> > > +       struct bpf_fprobe_cookie *bpf_cookies = NULL;
> > >         struct bpf_fprobe_link *link = NULL;
> > >         struct bpf_link_primer link_primer;
> > > +       void __user *ubpf_cookies;
> > > +       u32 flags, cnt, i, size;
> > >         unsigned long *addrs;
> > > -       u32 flags, cnt, size;
> > >         void __user *uaddrs;
> > >         void __user *usyms;
> > >         int err;
> > > @@ -3205,6 +3250,37 @@ static int bpf_fprobe_link_attach(const union bpf_attr *attr, struct bpf_prog *p
> > >                         goto error;
> > >         }
> > >
> > > +       ubpf_cookies = u64_to_user_ptr(attr->link_create.fprobe.bpf_cookies);
> >
> > nit: let's call all this "cookies", this bpf_ prefix feels a bit
> > redundant (I know about perf_event.bpf_cookie, but still).
>
> ok
>
> >
> > > +       if (ubpf_cookies) {
> > > +               u64 *tmp;
> > > +
> > > +               err = -ENOMEM;
> > > +               tmp = kzalloc(size, GFP_KERNEL);
> >
> > kvmalloc?
>
> ok
>
> thanks,
> jirka
>


* Re: [PATCH 3/8] bpf: Add bpf_cookie support to fprobe
  2022-02-08 23:35       ` Andrii Nakryiko
@ 2022-02-08 23:46         ` Jiri Olsa
  2022-02-08 23:53           ` Andrii Nakryiko
  0 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-08 23:46 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt

On Tue, Feb 08, 2022 at 03:35:24PM -0800, Andrii Nakryiko wrote:
> On Tue, Feb 8, 2022 at 1:07 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Mon, Feb 07, 2022 at 10:59:21AM -0800, Andrii Nakryiko wrote:
> > > On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > Adding support to call bpf_get_attach_cookie helper from
> > > > kprobe program attached by fprobe link.
> > > >
> > > > The bpf_cookie is provided by array of u64 values, where
> > > > each value is paired with provided function address with
> > > > the same array index.
> > > >
> > > > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > ---
> > > >  include/linux/bpf.h            |  2 +
> > > >  include/uapi/linux/bpf.h       |  1 +
> > > >  kernel/bpf/syscall.c           | 83 +++++++++++++++++++++++++++++++++-
> > > >  kernel/trace/bpf_trace.c       | 16 ++++++-
> > > >  tools/include/uapi/linux/bpf.h |  1 +
> > > >  5 files changed, 100 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > index 6eb0b180d33b..7b65f05c0487 100644
> > > > --- a/include/linux/bpf.h
> > > > +++ b/include/linux/bpf.h
> > > > @@ -1301,6 +1301,8 @@ static inline void bpf_reset_run_ctx(struct bpf_run_ctx *old_ctx)
> > > >  #endif
> > > >  }
> > > >
> > > > +u64 bpf_fprobe_cookie(struct bpf_run_ctx *ctx, u64 ip);
> > > > +
> > > >  /* BPF program asks to bypass CAP_NET_BIND_SERVICE in bind. */
> > > >  #define BPF_RET_BIND_NO_CAP_NET_BIND_SERVICE                   (1 << 0)
> > > >  /* BPF program asks to set CN on the packet. */
> > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > index c0912f0a3dfe..0dc6aa4f9683 100644
> > > > --- a/include/uapi/linux/bpf.h
> > > > +++ b/include/uapi/linux/bpf.h
> > > > @@ -1484,6 +1484,7 @@ union bpf_attr {
> > > >                                 __aligned_u64   addrs;
> > > >                                 __u32           cnt;
> > > >                                 __u32           flags;
> > > > +                               __aligned_u64   bpf_cookies;
> > >
> > > maybe put it right after addrs, they are closely related and cnt
> > > describes all of syms/addrs/cookies.
> >
> > ok
> >
> > >
> > > >                         } fprobe;
> > > >                 };
> > > >         } link_create;
> > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > index 0cfbb112c8e1..6c5e74bc43b6 100644
> > > > --- a/kernel/bpf/syscall.c
> > > > +++ b/kernel/bpf/syscall.c
> > > > @@ -33,6 +33,8 @@
> > > >  #include <linux/rcupdate_trace.h>
> > > >  #include <linux/memcontrol.h>
> > > >  #include <linux/fprobe.h>
> > > > +#include <linux/bsearch.h>
> > > > +#include <linux/sort.h>
> > > >
> > > >  #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
> > > >                           (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
> > > > @@ -3025,10 +3027,18 @@ static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *pro
> > > >
> > > >  #ifdef CONFIG_FPROBE
> > > >
> > > > +struct bpf_fprobe_cookie {
> > > > +       unsigned long addr;
> > > > +       u64 bpf_cookie;
> > > > +};
> > > > +
> > > >  struct bpf_fprobe_link {
> > > >         struct bpf_link link;
> > > >         struct fprobe fp;
> > > >         unsigned long *addrs;
> > > > +       struct bpf_run_ctx run_ctx;
> > > > +       struct bpf_fprobe_cookie *bpf_cookies;
> > >
> > > you already have all the addrs above, why keeping a second copy of
> > > each addrs in bpf_fprobe_cookie. Let's have two arrays: addrs
> > > (unsigned long) and cookies (u64) and make sure that they are sorted
> > > together. Then lookup addrs, calculate index, use that index to fetch
> > > cookie.
> > >
> > > Seems like sort_r() provides exactly the interface you'd need to do
> > > this very easily. Having addrs separate from cookies also a bit
> > > advantageous in terms of TLB misses (if you need any more persuasion
> > > ;)
> >
> > no persuation needed, I actually tried that but it turned out sort_r
> > is not ready yet ;-)
> >
> > because you can't pass priv pointer to the swap callback, so we can't
> > swap the other array.. I did a change to allow that, but it's not trivial
> > and will need some bigger testing/review because the original sort
> > calls sort_r, and of course there are many 'sort' users ;-)
> 
> Big sigh... :( Did you do something similar to _CMP_WRAPPER? You don't
> need to change the interface of sort(), so it shouldn't require
> extensive code refactoring. You'll just need to adjust priv to be not
> just cmp_func, but cmp_func + swap_fun (need a small struct on the
> stack in sort, probably). Or you did something else?

I ended up with the change below

jirka


---
 include/linux/sort.h  |  2 +-
 include/linux/types.h |  1 +
 lib/sort.c            | 44 +++++++++++++++++++++++++++++++++----------
 3 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/include/linux/sort.h b/include/linux/sort.h
index b5898725fe9d..e163287ac6c1 100644
--- a/include/linux/sort.h
+++ b/include/linux/sort.h
@@ -6,7 +6,7 @@
 
 void sort_r(void *base, size_t num, size_t size,
 	    cmp_r_func_t cmp_func,
-	    swap_func_t swap_func,
+	    swap_r_func_t swap_func,
 	    const void *priv);
 
 void sort(void *base, size_t num, size_t size,
diff --git a/include/linux/types.h b/include/linux/types.h
index ac825ad90e44..ea8cf60a8a79 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -226,6 +226,7 @@ struct callback_head {
 typedef void (*rcu_callback_t)(struct rcu_head *head);
 typedef void (*call_rcu_func_t)(struct rcu_head *head, rcu_callback_t func);
 
+typedef void (*swap_r_func_t)(void *a, void *b, int size, const void *priv);
 typedef void (*swap_func_t)(void *a, void *b, int size);
 
 typedef int (*cmp_r_func_t)(const void *a, const void *b, const void *priv);
diff --git a/lib/sort.c b/lib/sort.c
index aa18153864d2..f65078608c16 100644
--- a/lib/sort.c
+++ b/lib/sort.c
@@ -122,16 +122,29 @@ static void swap_bytes(void *a, void *b, size_t n)
  * a pointer, but small integers make for the smallest compare
  * instructions.
  */
-#define SWAP_WORDS_64 (swap_func_t)0
-#define SWAP_WORDS_32 (swap_func_t)1
-#define SWAP_BYTES    (swap_func_t)2
+#define SWAP_WORDS_64 (swap_r_func_t)0
+#define SWAP_WORDS_32 (swap_r_func_t)1
+#define SWAP_BYTES    (swap_r_func_t)2
+#define SWAP_WRAPPER  (swap_r_func_t)3
+
+struct wrapper {
+	cmp_func_t cmp;
+	swap_func_t swap;
+};
 
 /*
  * The function pointer is last to make tail calls most efficient if the
  * compiler decides not to inline this function.
  */
-static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
+static void do_swap(void *a, void *b, size_t size, swap_r_func_t swap_func, const void *priv)
 {
+	const struct wrapper *w = priv;
+
+	if (swap_func == SWAP_WRAPPER) {
+		w->swap(a, b, (int)size);
+		return;
+	}
+
 	if (swap_func == SWAP_WORDS_64)
 		swap_words_64(a, b, size);
 	else if (swap_func == SWAP_WORDS_32)
@@ -139,15 +152,17 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
 	else if (swap_func == SWAP_BYTES)
 		swap_bytes(a, b, size);
 	else
-		swap_func(a, b, (int)size);
+		swap_func(a, b, (int)size, priv);
 }
 
 #define _CMP_WRAPPER ((cmp_r_func_t)0L)
 
 static int do_cmp(const void *a, const void *b, cmp_r_func_t cmp, const void *priv)
 {
+	const struct wrapper *w = priv;
+
 	if (cmp == _CMP_WRAPPER)
-		return ((cmp_func_t)(priv))(a, b);
+		return w->cmp(a, b);
 	return cmp(a, b, priv);
 }
 
@@ -198,16 +213,20 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
  */
 void sort_r(void *base, size_t num, size_t size,
 	    cmp_r_func_t cmp_func,
-	    swap_func_t swap_func,
+	    swap_r_func_t swap_func,
 	    const void *priv)
 {
 	/* pre-scale counters for performance */
 	size_t n = num * size, a = (num/2) * size;
 	const unsigned int lsbit = size & -size;  /* Used to find parent */
+	const struct wrapper *w = priv;
 
 	if (!a)		/* num < 2 || size == 0 */
 		return;
 
+	if (swap_func == SWAP_WRAPPER && !w->swap)
+		swap_func = NULL;
+
 	if (!swap_func) {
 		if (is_aligned(base, size, 8))
 			swap_func = SWAP_WORDS_64;
@@ -230,7 +249,7 @@ void sort_r(void *base, size_t num, size_t size,
 		if (a)			/* Building heap: sift down --a */
 			a -= size;
 		else if (n -= size)	/* Sorting: Extract root to --n */
-			do_swap(base, base + n, size, swap_func);
+			do_swap(base, base + n, size, swap_func, priv);
 		else			/* Sort complete */
 			break;
 
@@ -257,7 +276,7 @@ void sort_r(void *base, size_t num, size_t size,
 		c = b;			/* Where "a" belongs */
 		while (b != a) {	/* Shift it into place */
 			b = parent(b, lsbit, size);
-			do_swap(base + b, base + c, size, swap_func);
+			do_swap(base + b, base + c, size, swap_func, priv);
 		}
 	}
 }
@@ -267,6 +286,11 @@ void sort(void *base, size_t num, size_t size,
 	  cmp_func_t cmp_func,
 	  swap_func_t swap_func)
 {
-	return sort_r(base, num, size, _CMP_WRAPPER, swap_func, cmp_func);
+	struct wrapper w = {
+		.cmp  = cmp_func,
+		.swap = swap_func,
+	};
+
+	return sort_r(base, num, size, _CMP_WRAPPER, SWAP_WRAPPER, &w);
 }
 EXPORT_SYMBOL(sort);
-- 
2.34.1
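
Once addrs and cookies are sorted together, the lookup suggested earlier
in the thread (search addrs, compute the index, fetch the cookie) could
look roughly like the userspace sketch below. This is an assumption
about the eventual shape, not code from the series; cookie_lookup and
addr_cmp are hypothetical names and libc bsearch() stands in for the
kernel one:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static int addr_cmp(const void *a, const void *b)
{
	const unsigned long *x = a, *y = b;

	if (*x == *y)
		return 0;
	return *x < *y ? -1 : 1;
}

/*
 * Hypothetical lookup: addrs must be sorted ascending and cookies[i]
 * must belong to addrs[i]. Returns 0 when no cookie is available.
 */
static uint64_t cookie_lookup(const unsigned long *addrs,
			      const uint64_t *cookies,
			      size_t cnt, unsigned long ip)
{
	const unsigned long *hit;

	if (!cookies || !cnt)
		return 0;
	assert(addrs);	/* a non-empty link always has its addrs array */

	hit = bsearch(&ip, addrs, cnt, sizeof(*addrs), addr_cmp);
	/* The element's offset in addrs is also its index in cookies. */
	return hit ? cookies[hit - addrs] : 0;
}
```

The second copy of each address in a per-cookie struct is no longer
needed, because the index found in addrs doubles as the index into
cookies.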



* Re: [PATCH 3/8] bpf: Add bpf_cookie support to fprobe
  2022-02-08 23:46         ` Jiri Olsa
@ 2022-02-08 23:53           ` Andrii Nakryiko
  0 siblings, 0 replies; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-08 23:53 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt

On Tue, Feb 8, 2022 at 3:46 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Tue, Feb 08, 2022 at 03:35:24PM -0800, Andrii Nakryiko wrote:
> > On Tue, Feb 8, 2022 at 1:07 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > On Mon, Feb 07, 2022 at 10:59:21AM -0800, Andrii Nakryiko wrote:
> > > > On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > >
> > > > > Adding support to call bpf_get_attach_cookie helper from
> > > > > kprobe program attached by fprobe link.
> > > > >
> > > > > The bpf_cookie is provided by array of u64 values, where
> > > > > each value is paired with provided function address with
> > > > > the same array index.
> > > > >
> > > > > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > ---
> > > > >  include/linux/bpf.h            |  2 +
> > > > >  include/uapi/linux/bpf.h       |  1 +
> > > > >  kernel/bpf/syscall.c           | 83 +++++++++++++++++++++++++++++++++-
> > > > >  kernel/trace/bpf_trace.c       | 16 ++++++-
> > > > >  tools/include/uapi/linux/bpf.h |  1 +
> > > > >  5 files changed, 100 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > index 6eb0b180d33b..7b65f05c0487 100644
> > > > > --- a/include/linux/bpf.h
> > > > > +++ b/include/linux/bpf.h
> > > > > @@ -1301,6 +1301,8 @@ static inline void bpf_reset_run_ctx(struct bpf_run_ctx *old_ctx)
> > > > >  #endif
> > > > >  }
> > > > >
> > > > > +u64 bpf_fprobe_cookie(struct bpf_run_ctx *ctx, u64 ip);
> > > > > +
> > > > >  /* BPF program asks to bypass CAP_NET_BIND_SERVICE in bind. */
> > > > >  #define BPF_RET_BIND_NO_CAP_NET_BIND_SERVICE                   (1 << 0)
> > > > >  /* BPF program asks to set CN on the packet. */
> > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > > index c0912f0a3dfe..0dc6aa4f9683 100644
> > > > > --- a/include/uapi/linux/bpf.h
> > > > > +++ b/include/uapi/linux/bpf.h
> > > > > @@ -1484,6 +1484,7 @@ union bpf_attr {
> > > > >                                 __aligned_u64   addrs;
> > > > >                                 __u32           cnt;
> > > > >                                 __u32           flags;
> > > > > +                               __aligned_u64   bpf_cookies;
> > > >
> > > > maybe put it right after addrs, they are closely related and cnt
> > > > describes all of syms/addrs/cookies.
> > >
> > > ok
> > >
> > > >
> > > > >                         } fprobe;
> > > > >                 };
> > > > >         } link_create;
> > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > index 0cfbb112c8e1..6c5e74bc43b6 100644
> > > > > --- a/kernel/bpf/syscall.c
> > > > > +++ b/kernel/bpf/syscall.c
> > > > > @@ -33,6 +33,8 @@
> > > > >  #include <linux/rcupdate_trace.h>
> > > > >  #include <linux/memcontrol.h>
> > > > >  #include <linux/fprobe.h>
> > > > > +#include <linux/bsearch.h>
> > > > > +#include <linux/sort.h>
> > > > >
> > > > >  #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
> > > > >                           (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
> > > > > @@ -3025,10 +3027,18 @@ static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *pro
> > > > >
> > > > >  #ifdef CONFIG_FPROBE
> > > > >
> > > > > +struct bpf_fprobe_cookie {
> > > > > +       unsigned long addr;
> > > > > +       u64 bpf_cookie;
> > > > > +};
> > > > > +
> > > > >  struct bpf_fprobe_link {
> > > > >         struct bpf_link link;
> > > > >         struct fprobe fp;
> > > > >         unsigned long *addrs;
> > > > > +       struct bpf_run_ctx run_ctx;
> > > > > +       struct bpf_fprobe_cookie *bpf_cookies;
> > > >
> > > > you already have all the addrs above, why keeping a second copy of
> > > > each addrs in bpf_fprobe_cookie. Let's have two arrays: addrs
> > > > (unsigned long) and cookies (u64) and make sure that they are sorted
> > > > together. Then lookup addrs, calculate index, use that index to fetch
> > > > cookie.
> > > >
> > > > Seems like sort_r() provides exactly the interface you'd need to do
> > > > this very easily. Having addrs separate from cookies also a bit
> > > > advantageous in terms of TLB misses (if you need any more persuasion
> > > > ;)
> > >
> > > no persuation needed, I actually tried that but it turned out sort_r
> > > is not ready yet ;-)
> > >
> > > because you can't pass priv pointer to the swap callback, so we can't
> > > swap the other array.. I did a change to allow that, but it's not trivial
> > > and will need some bigger testing/review because the original sort
> > > calls sort_r, and of course there are many 'sort' users ;-)
> >
> > Big sigh... :( Did you do something similar to _CMP_WRAPPER? You don't
> > need to change the interface of sort(), so it shouldn't require
> > extensive code refactoring. You'll just need to adjust priv to be not
> > just cmp_func, but cmp_func + swap_fun (need a small struct on the
> > stack in sort, probably). Or you did something else?
>
> I ended up with change below
>

exactly what I had in mind

> jirka
>
>
> ---
>  include/linux/sort.h  |  2 +-
>  include/linux/types.h |  1 +
>  lib/sort.c            | 44 +++++++++++++++++++++++++++++++++----------
>  3 files changed, 36 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/sort.h b/include/linux/sort.h
> index b5898725fe9d..e163287ac6c1 100644
> --- a/include/linux/sort.h
> +++ b/include/linux/sort.h
> @@ -6,7 +6,7 @@
>
>  void sort_r(void *base, size_t num, size_t size,
>             cmp_r_func_t cmp_func,
> -           swap_func_t swap_func,
> +           swap_r_func_t swap_func,
>             const void *priv);
>
>  void sort(void *base, size_t num, size_t size,
> diff --git a/include/linux/types.h b/include/linux/types.h
> index ac825ad90e44..ea8cf60a8a79 100644
> --- a/include/linux/types.h
> +++ b/include/linux/types.h
> @@ -226,6 +226,7 @@ struct callback_head {
>  typedef void (*rcu_callback_t)(struct rcu_head *head);
>  typedef void (*call_rcu_func_t)(struct rcu_head *head, rcu_callback_t func);
>
> +typedef void (*swap_r_func_t)(void *a, void *b, int size, const void *priv);
>  typedef void (*swap_func_t)(void *a, void *b, int size);
>
>  typedef int (*cmp_r_func_t)(const void *a, const void *b, const void *priv);
> diff --git a/lib/sort.c b/lib/sort.c
> index aa18153864d2..f65078608c16 100644
> --- a/lib/sort.c
> +++ b/lib/sort.c
> @@ -122,16 +122,29 @@ static void swap_bytes(void *a, void *b, size_t n)
>   * a pointer, but small integers make for the smallest compare
>   * instructions.
>   */
> -#define SWAP_WORDS_64 (swap_func_t)0
> -#define SWAP_WORDS_32 (swap_func_t)1
> -#define SWAP_BYTES    (swap_func_t)2
> +#define SWAP_WORDS_64 (swap_r_func_t)0
> +#define SWAP_WORDS_32 (swap_r_func_t)1
> +#define SWAP_BYTES    (swap_r_func_t)2
> +#define SWAP_WRAPPER  (swap_r_func_t)3
> +
> +struct wrapper {
> +       cmp_func_t cmp;
> +       swap_func_t swap;
> +};
>
>  /*
>   * The function pointer is last to make tail calls most efficient if the
>   * compiler decides not to inline this function.
>   */
> -static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
> +static void do_swap(void *a, void *b, size_t size, swap_r_func_t swap_func, const void *priv)
>  {
> +       const struct wrapper *w = priv;

I'd just move this under if

> +
> +       if (swap_func == SWAP_WRAPPER) {

const struct wrapper *w = priv; here

> +               w->swap(a, b, (int)size);
> +               return;
> +       }
> +
>         if (swap_func == SWAP_WORDS_64)
>                 swap_words_64(a, b, size);
>         else if (swap_func == SWAP_WORDS_32)
> @@ -139,15 +152,17 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
>         else if (swap_func == SWAP_BYTES)
>                 swap_bytes(a, b, size);
>         else
> -               swap_func(a, b, (int)size);
> +               swap_func(a, b, (int)size, priv);
>  }
>
>  #define _CMP_WRAPPER ((cmp_r_func_t)0L)
>
>  static int do_cmp(const void *a, const void *b, cmp_r_func_t cmp, const void *priv)
>  {
> +       const struct wrapper *w = priv;
> +
>         if (cmp == _CMP_WRAPPER)
> -               return ((cmp_func_t)(priv))(a, b);
> +               return w->cmp(a, b);

same here, or just stick to the previous style with

return ((const struct wrapper *)priv)->cmp(a, b);

>         return cmp(a, b, priv);
>  }
>
> @@ -198,16 +213,20 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>   */
>  void sort_r(void *base, size_t num, size_t size,
>             cmp_r_func_t cmp_func,
> -           swap_func_t swap_func,
> +           swap_r_func_t swap_func,
>             const void *priv)
>  {
>         /* pre-scale counters for performance */
>         size_t n = num * size, a = (num/2) * size;
>         const unsigned int lsbit = size & -size;  /* Used to find parent */
> +       const struct wrapper *w = priv;
>
>         if (!a)         /* num < 2 || size == 0 */
>                 return;
>
> +       if (swap_func == SWAP_WRAPPER && !w->swap)

same here, I'd probably do the cast right here to keep this wrapper
stuff as local as possible

> +               swap_func = NULL;
> +
>         if (!swap_func) {
>                 if (is_aligned(base, size, 8))
>                         swap_func = SWAP_WORDS_64;
> @@ -230,7 +249,7 @@ void sort_r(void *base, size_t num, size_t size,
>                 if (a)                  /* Building heap: sift down --a */
>                         a -= size;
>                 else if (n -= size)     /* Sorting: Extract root to --n */
> -                       do_swap(base, base + n, size, swap_func);
> +                       do_swap(base, base + n, size, swap_func, priv);
>                 else                    /* Sort complete */
>                         break;
>
> @@ -257,7 +276,7 @@ void sort_r(void *base, size_t num, size_t size,
>                 c = b;                  /* Where "a" belongs */
>                 while (b != a) {        /* Shift it into place */
>                         b = parent(b, lsbit, size);
> -                       do_swap(base + b, base + c, size, swap_func);
> +                       do_swap(base + b, base + c, size, swap_func, priv);
>                 }
>         }
>  }
> @@ -267,6 +286,11 @@ void sort(void *base, size_t num, size_t size,
>           cmp_func_t cmp_func,
>           swap_func_t swap_func)
>  {
> -       return sort_r(base, num, size, _CMP_WRAPPER, swap_func, cmp_func);
> +       struct wrapper w = {
> +               .cmp  = cmp_func,
> +               .swap = swap_func,
> +       };
> +
> +       return sort_r(base, num, size, _CMP_WRAPPER, SWAP_WRAPPER, &w);
>  }
>  EXPORT_SYMBOL(sort);
> --
> 2.34.1
>
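
To make the "keep the cast local" comments concrete, here is a small
userspace sketch of the swap dispatch with the wrapper cast done only in
the branch that needs it. It is a simplified illustration, not a
replacement for the patch: the typedef names, swap_int and
swap_int_counted are made up, and the sentinel is built through
uintptr_t only to keep userspace compilers quiet:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*cmp_fn)(const void *a, const void *b);
typedef void (*swap_fn)(void *a, void *b, int size);
typedef void (*swap_r_fn)(void *a, void *b, int size, const void *priv);

/* Sentinel meaning "priv is a struct wrapper holding a legacy swap". */
#define SWAP_WRAPPER ((swap_r_fn)(uintptr_t)3)

struct wrapper {
	cmp_fn cmp;
	swap_fn swap;
};

static void do_swap(void *a, void *b, size_t size, swap_r_fn swap_func,
		    const void *priv)
{
	if (swap_func == SWAP_WRAPPER) {
		/* priv is only known to be a wrapper inside this branch. */
		((const struct wrapper *)priv)->swap(a, b, (int)size);
		return;
	}
	swap_func(a, b, (int)size, priv);
}

/* Example legacy swap without priv. */
static void swap_int(void *a, void *b, int size)
{
	int *x = a, *y = b, tmp;

	assert(size == (int)sizeof(int));
	tmp = *x;
	*x = *y;
	*y = tmp;
}

/* Example priv-aware swap: priv points at a call counter holder. */
struct swap_stats {
	unsigned int *calls;
};

static void swap_int_counted(void *a, void *b, int size, const void *priv)
{
	const struct swap_stats *st = priv;

	swap_int(a, b, size);
	(*st->calls)++;
}
```

With the cast kept inside the SWAP_WRAPPER branch, the other paths never
treat priv as a struct wrapper, which is the point of the review
comments above.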


* Re: [PATCH 2/8] bpf: Add bpf_get_func_ip kprobe helper for fprobe link
  2022-02-07 18:59   ` Andrii Nakryiko
  2022-02-07 21:01     ` Alexei Starovoitov
@ 2022-02-09 15:01     ` Jiri Olsa
  2022-02-09 16:05       ` Andrii Nakryiko
  1 sibling, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-09 15:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt

On Mon, Feb 07, 2022 at 10:59:18AM -0800, Andrii Nakryiko wrote:
> On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > Adding support to call get_func_ip_fprobe helper from kprobe
> > programs attached by fprobe link.
> >
> > Also adding support to inline it, because it's single load
> > instruction.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  kernel/bpf/verifier.c    | 19 ++++++++++++++++++-
> >  kernel/trace/bpf_trace.c | 16 +++++++++++++++-
> >  2 files changed, 33 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 1ae41d0cf96c..a745ded00635 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -13625,7 +13625,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> >                         continue;
> >                 }
> >
> > -               /* Implement bpf_get_func_ip inline. */
> > +               /* Implement tracing bpf_get_func_ip inline. */
> >                 if (prog_type == BPF_PROG_TYPE_TRACING &&
> >                     insn->imm == BPF_FUNC_get_func_ip) {
> >                         /* Load IP address from ctx - 16 */
> > @@ -13640,6 +13640,23 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> >                         continue;
> >                 }
> >
> > +               /* Implement kprobe/fprobe bpf_get_func_ip inline. */
> > +               if (prog_type == BPF_PROG_TYPE_KPROBE &&
> > +                   eatype == BPF_TRACE_FPROBE &&
> > +                   insn->imm == BPF_FUNC_get_func_ip) {
> > +                       /* Load IP address from ctx (struct pt_regs) ip */
> > +                       insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1,
> > +                                                 offsetof(struct pt_regs, ip));
> 
> Isn't this architecture-specific? I'm starting to dislike this

ugh, it is.. I'm not sure whether we want #ifdef CONFIG_X86 in here,
or some arch_* specific function instead?

jirka

> inlining whole more and more. It's just a complication in verifier
> without clear real-world benefits. We are clearly prematurely
> optimizing here. In practice you'll just call bpf_get_func_ip() once
> and that's it. Function call overhead will be negligible compare to
> other *userful* work you'll be doing in your BPF program.
> 
> 
> > +
> > +                       new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 1);
> > +                       if (!new_prog)
> > +                               return -ENOMEM;
> > +
> > +                       env->prog = prog = new_prog;
> > +                       insn      = new_prog->insnsi + i + delta;
> > +                       continue;
> > +               }
> > +
> >  patch_call_imm:
> >                 fn = env->ops->get_func_proto(insn->imm, env->prog);
> >                 /* all functions that have prototype and verifier allowed
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index a2024ba32a20..28e59e31e3db 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1036,6 +1036,19 @@ static const struct bpf_func_proto bpf_get_func_ip_proto_kprobe = {
> >         .arg1_type      = ARG_PTR_TO_CTX,
> >  };
> >
> > +BPF_CALL_1(bpf_get_func_ip_fprobe, struct pt_regs *, regs)
> > +{
> > +       /* This helper call is inlined by verifier. */
> > +       return regs->ip;
> > +}
> > +
> > +static const struct bpf_func_proto bpf_get_func_ip_proto_fprobe = {
> > +       .func           = bpf_get_func_ip_fprobe,
> > +       .gpl_only       = false,
> > +       .ret_type       = RET_INTEGER,
> > +       .arg1_type      = ARG_PTR_TO_CTX,
> > +};
> > +
> >  BPF_CALL_1(bpf_get_attach_cookie_trace, void *, ctx)
> >  {
> >         struct bpf_trace_run_ctx *run_ctx;
> > @@ -1279,7 +1292,8 @@ kprobe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >                 return &bpf_override_return_proto;
> >  #endif
> >         case BPF_FUNC_get_func_ip:
> > -               return &bpf_get_func_ip_proto_kprobe;
> > +               return prog->expected_attach_type == BPF_TRACE_FPROBE ?
> > +                       &bpf_get_func_ip_proto_fprobe : &bpf_get_func_ip_proto_kprobe;
> >         case BPF_FUNC_get_attach_cookie:
> >                 return &bpf_get_attach_cookie_proto_trace;
> >         default:
> > --
> > 2.34.1
> >


* Re: [PATCH 2/8] bpf: Add bpf_get_func_ip kprobe helper for fprobe link
  2022-02-09 15:01     ` Jiri Olsa
@ 2022-02-09 16:05       ` Andrii Nakryiko
  2022-02-09 19:14         ` Jiri Olsa
  0 siblings, 1 reply; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-09 16:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt

On Wed, Feb 9, 2022 at 7:01 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Mon, Feb 07, 2022 at 10:59:18AM -0800, Andrii Nakryiko wrote:
> > On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > Adding support to call get_func_ip_fprobe helper from kprobe
> > > programs attached by fprobe link.
> > >
> > > Also adding support to inline it, because it's single load
> > > instruction.
> > >
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >  kernel/bpf/verifier.c    | 19 ++++++++++++++++++-
> > >  kernel/trace/bpf_trace.c | 16 +++++++++++++++-
> > >  2 files changed, 33 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 1ae41d0cf96c..a745ded00635 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -13625,7 +13625,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > >                         continue;
> > >                 }
> > >
> > > -               /* Implement bpf_get_func_ip inline. */
> > > +               /* Implement tracing bpf_get_func_ip inline. */
> > >                 if (prog_type == BPF_PROG_TYPE_TRACING &&
> > >                     insn->imm == BPF_FUNC_get_func_ip) {
> > >                         /* Load IP address from ctx - 16 */
> > > @@ -13640,6 +13640,23 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > >                         continue;
> > >                 }
> > >
> > > +               /* Implement kprobe/fprobe bpf_get_func_ip inline. */
> > > +               if (prog_type == BPF_PROG_TYPE_KPROBE &&
> > > +                   eatype == BPF_TRACE_FPROBE &&
> > > +                   insn->imm == BPF_FUNC_get_func_ip) {
> > > +                       /* Load IP address from ctx (struct pt_regs) ip */
> > > +                       insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1,
> > > +                                                 offsetof(struct pt_regs, ip));
> >
> > Isn't this architecture-specific? I'm starting to dislike this
>
> ugh, it is.. I'm not sure we want #ifdef CONFIG_X86 in here,
> or some arch_* specific function?


So not inlining it isn't even considered? This function will be called
once or at most a few times per BPF program invocation. Anyone calling
it in a tight loop is using it very suboptimally (and even then the
useful program logic will dominate). There is no point in inlining it.

>
> jirka
>
> > inlining whole more and more. It's just a complication in verifier
> > without clear real-world benefits. We are clearly prematurely
> > optimizing here. In practice you'll just call bpf_get_func_ip() once
> > > and that's it. Function call overhead will be negligible compared to
> > > other *useful* work you'll be doing in your BPF program.
> >
> >
> > > +
> > > +                       new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 1);
> > > +                       if (!new_prog)
> > > +                               return -ENOMEM;
> > > +
> > > +                       env->prog = prog = new_prog;
> > > +                       insn      = new_prog->insnsi + i + delta;
> > > +                       continue;
> > > +               }
> > > +
> > >  patch_call_imm:
> > >                 fn = env->ops->get_func_proto(insn->imm, env->prog);
> > >                 /* all functions that have prototype and verifier allowed
> > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > > index a2024ba32a20..28e59e31e3db 100644
> > > --- a/kernel/trace/bpf_trace.c
> > > +++ b/kernel/trace/bpf_trace.c
> > > @@ -1036,6 +1036,19 @@ static const struct bpf_func_proto bpf_get_func_ip_proto_kprobe = {
> > >         .arg1_type      = ARG_PTR_TO_CTX,
> > >  };
> > >
> > > +BPF_CALL_1(bpf_get_func_ip_fprobe, struct pt_regs *, regs)
> > > +{
> > > +       /* This helper call is inlined by verifier. */
> > > +       return regs->ip;
> > > +}
> > > +
> > > +static const struct bpf_func_proto bpf_get_func_ip_proto_fprobe = {
> > > +       .func           = bpf_get_func_ip_fprobe,
> > > +       .gpl_only       = false,
> > > +       .ret_type       = RET_INTEGER,
> > > +       .arg1_type      = ARG_PTR_TO_CTX,
> > > +};
> > > +
> > >  BPF_CALL_1(bpf_get_attach_cookie_trace, void *, ctx)
> > >  {
> > >         struct bpf_trace_run_ctx *run_ctx;
> > > @@ -1279,7 +1292,8 @@ kprobe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> > >                 return &bpf_override_return_proto;
> > >  #endif
> > >         case BPF_FUNC_get_func_ip:
> > > -               return &bpf_get_func_ip_proto_kprobe;
> > > +               return prog->expected_attach_type == BPF_TRACE_FPROBE ?
> > > +                       &bpf_get_func_ip_proto_fprobe : &bpf_get_func_ip_proto_kprobe;
> > >         case BPF_FUNC_get_attach_cookie:
> > >                 return &bpf_get_attach_cookie_proto_trace;
> > >         default:
> > > --
> > > 2.34.1
> > >

* Re: [PATCH 2/8] bpf: Add bpf_get_func_ip kprobe helper for fprobe link
  2022-02-09 16:05       ` Andrii Nakryiko
@ 2022-02-09 19:14         ` Jiri Olsa
  0 siblings, 0 replies; 50+ messages in thread
From: Jiri Olsa @ 2022-02-09 19:14 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Steven Rostedt

On Wed, Feb 09, 2022 at 08:05:05AM -0800, Andrii Nakryiko wrote:
> On Wed, Feb 9, 2022 at 7:01 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Mon, Feb 07, 2022 at 10:59:18AM -0800, Andrii Nakryiko wrote:
> > > On Wed, Feb 2, 2022 at 5:53 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > Adding support to call get_func_ip_fprobe helper from kprobe
> > > > programs attached by fprobe link.
> > > >
> > > > Also adding support to inline it, because it's single load
> > > > instruction.
> > > >
> > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > ---
> > > >  kernel/bpf/verifier.c    | 19 ++++++++++++++++++-
> > > >  kernel/trace/bpf_trace.c | 16 +++++++++++++++-
> > > >  2 files changed, 33 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index 1ae41d0cf96c..a745ded00635 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -13625,7 +13625,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > > >                         continue;
> > > >                 }
> > > >
> > > > -               /* Implement bpf_get_func_ip inline. */
> > > > +               /* Implement tracing bpf_get_func_ip inline. */
> > > >                 if (prog_type == BPF_PROG_TYPE_TRACING &&
> > > >                     insn->imm == BPF_FUNC_get_func_ip) {
> > > >                         /* Load IP address from ctx - 16 */
> > > > @@ -13640,6 +13640,23 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > > >                         continue;
> > > >                 }
> > > >
> > > > +               /* Implement kprobe/fprobe bpf_get_func_ip inline. */
> > > > +               if (prog_type == BPF_PROG_TYPE_KPROBE &&
> > > > +                   eatype == BPF_TRACE_FPROBE &&
> > > > +                   insn->imm == BPF_FUNC_get_func_ip) {
> > > > +                       /* Load IP address from ctx (struct pt_regs) ip */
> > > > +                       insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1,
> > > > +                                                 offsetof(struct pt_regs, ip));
> > >
> > > Isn't this architecture-specific? I'm starting to dislike this
> >
> > ugh, it is.. I'm not sure we want #ifdef CONFIG_X86 in here,
> > or some arch_* specific function?
> 
> 
> So not inlining it isn't even considered? this function will be called
> once or at most a few times per BPF program invocation. Anyone calling
> it in a tight loop is going to use it very-very suboptimally (and even
> then useful program logic will dominate). There is no point in
> inlining it.

I agree that given its usage pattern there won't be too much gain,
on the other hand it's simple verifier code changing call/load/ret
into simple load, so I thought why not.. also there are just few
helpers we can inline so easily

but yea.. I can't think of any sane usage of this helper that inlining
would matter for.. which doesn't mean there isn't one ;-)

jirka
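
To make the trade-off above concrete, here is a minimal user-space sketch
of what the inlining replaces. Everything in it is an illustrative
assumption: struct mock_pt_regs is a simplified stand-in rather than any
architecture's real struct pt_regs, and both function names are invented
for the example. It shows that the inlined form is one load at a fixed
offset from ctx, and that the offset (and the field name) is exactly the
architecture-specific part.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an x86-64 style register frame. */
struct mock_pt_regs {
	uint64_t bx;
	uint64_t ax;
	uint64_t ip;	/* instruction pointer; other arches name this differently */
	uint64_t sp;
};

/* Out-of-line helper: an ordinary call that reads one field. */
static uint64_t get_func_ip_call(const struct mock_pt_regs *regs)
{
	return regs->ip;
}

/*
 * Inlined form: a single 8-byte load from ctx + offsetof(..., ip).
 * The offset is baked in at patch time, which is what ties it to one
 * architecture's register layout.
 */
static uint64_t get_func_ip_inlined(const void *ctx)
{
	uint64_t ip;

	memcpy(&ip, (const char *)ctx + offsetof(struct mock_pt_regs, ip),
	       sizeof(ip));
	return ip;
}
```

Both forms return the same value; the call only adds call/ret overhead
around that one load.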

> 
> >
> > jirka
> >
> > > inlining whole more and more. It's just a complication in verifier
> > > without clear real-world benefits. We are clearly prematurely
> > > optimizing here. In practice you'll just call bpf_get_func_ip() once
> > > > and that's it. Function call overhead will be negligible compared to
> > > > other *useful* work you'll be doing in your BPF program.
> > >
> > >
> > > > +
> > > > +                       new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 1);
> > > > +                       if (!new_prog)
> > > > +                               return -ENOMEM;
> > > > +
> > > > +                       env->prog = prog = new_prog;
> > > > +                       insn      = new_prog->insnsi + i + delta;
> > > > +                       continue;
> > > > +               }
> > > > +
> > > >  patch_call_imm:
> > > >                 fn = env->ops->get_func_proto(insn->imm, env->prog);
> > > >                 /* all functions that have prototype and verifier allowed
> > > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > > > index a2024ba32a20..28e59e31e3db 100644
> > > > --- a/kernel/trace/bpf_trace.c
> > > > +++ b/kernel/trace/bpf_trace.c
> > > > @@ -1036,6 +1036,19 @@ static const struct bpf_func_proto bpf_get_func_ip_proto_kprobe = {
> > > >         .arg1_type      = ARG_PTR_TO_CTX,
> > > >  };
> > > >
> > > > +BPF_CALL_1(bpf_get_func_ip_fprobe, struct pt_regs *, regs)
> > > > +{
> > > > +       /* This helper call is inlined by verifier. */
> > > > +       return regs->ip;
> > > > +}
> > > > +
> > > > +static const struct bpf_func_proto bpf_get_func_ip_proto_fprobe = {
> > > > +       .func           = bpf_get_func_ip_fprobe,
> > > > +       .gpl_only       = false,
> > > > +       .ret_type       = RET_INTEGER,
> > > > +       .arg1_type      = ARG_PTR_TO_CTX,
> > > > +};
> > > > +
> > > >  BPF_CALL_1(bpf_get_attach_cookie_trace, void *, ctx)
> > > >  {
> > > >         struct bpf_trace_run_ctx *run_ctx;
> > > > @@ -1279,7 +1292,8 @@ kprobe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> > > >                 return &bpf_override_return_proto;
> > > >  #endif
> > > >         case BPF_FUNC_get_func_ip:
> > > > -               return &bpf_get_func_ip_proto_kprobe;
> > > > +               return prog->expected_attach_type == BPF_TRACE_FPROBE ?
> > > > +                       &bpf_get_func_ip_proto_fprobe : &bpf_get_func_ip_proto_kprobe;
> > > >         case BPF_FUNC_get_attach_cookie:
> > > >                 return &bpf_get_attach_cookie_proto_trace;
> > > >         default:
> > > > --
> > > > 2.34.1
> > > >

* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-04  3:59                     ` Masami Hiramatsu
@ 2022-02-15 13:21                       ` Jiri Olsa
  2022-02-16 18:27                         ` Andrii Nakryiko
  0 siblings, 1 reply; 50+ messages in thread
From: Jiri Olsa @ 2022-02-15 13:21 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Steven Rostedt, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> Hi Alexei,
> 
> On Thu, 3 Feb 2022 18:42:22 -0800
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > >
> > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > >
> > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > transparently.
> > > >
> > > > Not true.
> > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > There was an implicit optimization for kprobe when ftrace
> > > > could be used.
> > > > All this new interface is doing is making it explicit.
> > > > So a new name is not warranted here.
> > > >
> > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > >
> > > > What is the difference?
> > > > I don't see it.
> > >
> > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > abilities that a normal kprobe does not. Namely, "what are the function
> > > parameters?"
> > >
> > > You can only reliably get the parameters at function entry. Hence,
> > > having a probe that is unique to functions as opposed to the middle of a
> > > function, makes sense to me.
> > >
> > > That is, the API can change. "Give me parameter X". That along with some
> > > BTF reading, could figure out how to get parameter X, and record that.
> > 
> > This is more or less a description of kprobe on ftrace :)
> > The bpf+kprobe users were relying on that for a long time.
> > See PT_REGS_PARM1() macros in bpf_tracing.h
> > They're meaningful only with kprobe on ftrace.
> > So, no, fprobe is not inventing anything new here.
> 
> Hmm, you may be misunderstanding why the PT_REGS_PARM1() macro works. You can
> use it even if CONFIG_FUNCTION_TRACER=n if your kernel is built with
> CONFIG_KPROBES=y. It is valid unless you put a probe outside of the function
> entry.
> 
> > No one is using kprobe in the middle of the function.
> > It's too difficult to make anything useful out of it,
> > so no one bothers.
> > When people say "kprobe" 99 out of 100 they mean
> > kprobe on ftrace/fentry.
> 
> I see. But a kprobe is a kprobe. It is not designed to support multiple
> probe points. If I'm forced to, I can rename the struct fprobe to
> struct multi_kprobe, but that doesn't change the essence. You may need
> to use both kprobes and the so-called multi_kprobe properly. (Someone
> needs to do that.)

hi,
trying to kick things further ;-) I was thinking about the bpf side of this
and we could use the following interface:

  enum bpf_attach_type {
    ...
    BPF_TRACE_KPROBE_MULTI
  };

  enum bpf_link_type {
    ...
    BPF_LINK_TYPE_KPROBE_MULTI
  };

  union bpf_attr {

    struct {
      ...
      struct {
        __aligned_u64   syms;
        __aligned_u64   addrs;
        __aligned_u64   cookies;
        __u32           cnt;
        __u32           flags;
      } kprobe_multi;
    } link_create;
  }

because from the bpf user's POV it's a new link for attaching multiple kprobes,
and I agree that a new 'fprobe' type name here brings more confusion; using
kprobe_multi is straightforward

thoughts?

thanks,
jirka
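
As a rough illustration of how user space might fill the proposed
kprobe_multi fields, here is a self-contained sketch. It does not use the
real uapi header: struct mock_kprobe_multi only mirrors the field list
proposed above, and fill_by_syms() is a hypothetical helper. The field
semantics are assumptions too: syms is taken to point to an array of cnt
C strings, cookies to an array of cnt u64 values, and addrs is left zero
when attaching by symbol name.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the proposed fields; NOT the real union bpf_attr. */
struct mock_kprobe_multi {
	uint64_t syms;		/* assumed: user pointer to const char *[cnt] */
	uint64_t addrs;		/* assumed: user pointer to u64[cnt], or 0 */
	uint64_t cookies;	/* assumed: user pointer to u64[cnt], or 0 */
	uint32_t cnt;
	uint32_t flags;
};

/* Hypothetical helper: describe an attach-by-symbol request. */
static int fill_by_syms(struct mock_kprobe_multi *km,
			const char *const *syms,
			const uint64_t *cookies, uint32_t cnt)
{
	if (!km || !syms || cnt == 0)
		return -1;

	/* Pointers travel as 64-bit integers, like __aligned_u64 fields. */
	km->syms = (uint64_t)(uintptr_t)syms;
	km->addrs = 0;
	km->cookies = (uint64_t)(uintptr_t)cookies;
	km->cnt = cnt;
	km->flags = 0;
	return 0;
}
```

Carrying pointers in 64-bit fields keeps the layout identical for 32-bit
and 64-bit user space, which is presumably why the proposal uses
__aligned_u64 for them.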

* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-15 13:21                       ` Jiri Olsa
@ 2022-02-16 18:27                         ` Andrii Nakryiko
  2022-02-17 14:03                           ` Masami Hiramatsu
  2022-02-22 12:42                           ` Jiri Olsa
  0 siblings, 2 replies; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-16 18:27 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Masami Hiramatsu, Alexei Starovoitov, Steven Rostedt, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Tue, Feb 15, 2022 at 5:21 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> > Hi Alexei,
> >
> > On Thu, 3 Feb 2022 18:42:22 -0800
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > > >
> > > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > > transparently.
> > > > >
> > > > > Not true.
> > > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > > There was an implicit optimization for kprobe when ftrace
> > > > > could be used.
> > > > > All this new interface is doing is making it explicit.
> > > > > So a new name is not warranted here.
> > > > >
> > > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > > >
> > > > > What is the difference?
> > > > > I don't see it.
> > > >
> > > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > > abilities that a normal kprobe does not. Namely, "what are the function
> > > > parameters?"
> > > >
> > > > You can only reliably get the parameters at function entry. Hence,
> > > > having a probe that is unique to functions as opposed to the middle of a
> > > > function, makes sense to me.
> > > >
> > > > That is, the API can change. "Give me parameter X". That along with some
> > > > BTF reading, could figure out how to get parameter X, and record that.
> > >
> > > This is more or less a description of kprobe on ftrace :)
> > > The bpf+kprobe users were relying on that for a long time.
> > > See PT_REGS_PARM1() macros in bpf_tracing.h
> > > They're meaningful only with kprobe on ftrace.
> > > So, no, fprobe is not inventing anything new here.
> >
> > Hmm, you may be misunderstanding why the PT_REGS_PARM1() macro works. You can
> > use it even if CONFIG_FUNCTION_TRACER=n if your kernel is built with
> > CONFIG_KPROBES=y. It is valid unless you put a probe outside of the function
> > entry.
> >
> > > No one is using kprobe in the middle of the function.
> > > It's too difficult to make anything useful out of it,
> > > so no one bothers.
> > > When people say "kprobe" 99 out of 100 they mean
> > > kprobe on ftrace/fentry.
> >
> > I see. But a kprobe is a kprobe. It is not designed to support multiple
> > probe points. If I'm forced to, I can rename the struct fprobe to
> > struct multi_kprobe, but that doesn't change the essence. You may need
> > to use both kprobes and the so-called multi_kprobe properly. (Someone
> > needs to do that.)
>
> hi,
> trying to kick things further ;-) I was thinking about the bpf side of this
> and we could use the following interface:
>
>   enum bpf_attach_type {
>     ...
>     BPF_TRACE_KPROBE_MULTI
>   };
>
>   enum bpf_link_type {
>     ...
>     BPF_LINK_TYPE_KPROBE_MULTI
>   };
>
>   union bpf_attr {
>
>     struct {
>       ...
>       struct {
>         __aligned_u64   syms;
>         __aligned_u64   addrs;
>         __aligned_u64   cookies;
>         __u32           cnt;
>         __u32           flags;
>       } kprobe_multi;
>     } link_create;
>   }
>
> because from the bpf user's POV it's a new link for attaching multiple kprobes,
> and I agree that a new 'fprobe' type name here brings more confusion; using
> kprobe_multi is straightforward
>
> thoughts?

I think this makes sense. We do need new type of link to store ip ->
cookie mapping anyways.
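
One way to picture the ip -> cookie mapping is a table sorted by address
and searched with bsearch(). This is only a sketch under assumptions:
struct ip_cookie and the two functions are hypothetical names, and
nothing here claims to match how the link stores the mapping in the
kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pairing of a probed function address with its cookie. */
struct ip_cookie {
	uint64_t ip;
	uint64_t cookie;
};

/* Order entries by address; avoids overflow from plain subtraction. */
static int cmp_ip(const void *a, const void *b)
{
	const struct ip_cookie *x = a, *y = b;

	return (x->ip > y->ip) - (x->ip < y->ip);
}

/* Sort once at attach time so lookups at probe time are O(log n). */
static void ip_cookie_sort(struct ip_cookie *tbl, size_t cnt)
{
	qsort(tbl, cnt, sizeof(*tbl), cmp_ip);
}

/* Return the cookie for ip, or 0 when ip is not in the table. */
static uint64_t ip_cookie_lookup(const struct ip_cookie *tbl, size_t cnt,
				 uint64_t ip)
{
	struct ip_cookie key = { .ip = ip, .cookie = 0 };
	const struct ip_cookie *hit;

	hit = bsearch(&key, tbl, cnt, sizeof(*tbl), cmp_ip);
	return hit ? hit->cookie : 0;
}
```

With thousands of attached functions, a sorted table keeps the per-hit
lookup cheap without a separate hash structure per link.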

Is there any chance to support this fast multi-attach for uprobe? If
yes, we might want to reuse the same link for both (so should we name
it more generically? on the other hand BPF program type for uprobe is
BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
consistent with what we have today).

But yeah, the main question is whether there is something preventing
us from supporting multi-attach uprobe as well? It would be really
great for USDT use case.

>
> thanks,
> jirka

* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-16 18:27                         ` Andrii Nakryiko
@ 2022-02-17 14:03                           ` Masami Hiramatsu
  2022-02-17 22:01                             ` Andrii Nakryiko
  2022-02-22 12:42                           ` Jiri Olsa
  1 sibling, 1 reply; 50+ messages in thread
From: Masami Hiramatsu @ 2022-02-17 14:03 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Masami Hiramatsu, Alexei Starovoitov, Steven Rostedt,
	Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Oleg Nesterov

On Wed, 16 Feb 2022 10:27:19 -0800
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Tue, Feb 15, 2022 at 5:21 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> > > Hi Alexei,
> > >
> > > On Thu, 3 Feb 2022 18:42:22 -0800
> > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > >
> > > > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > > > >
> > > > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > > > transparently.
> > > > > >
> > > > > > Not true.
> > > > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > > > There was an implicit optimization for kprobe when ftrace
> > > > > > could be used.
> > > > > > All this new interface is doing is making it explicit.
> > > > > > So a new name is not warranted here.
> > > > > >
> > > > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > > > >
> > > > > > What is the difference?
> > > > > > I don't see it.
> > > > >
> > > > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > > > abilities that a normal kprobe does not. Namely, "what are the function
> > > > > parameters?"
> > > > >
> > > > > You can only reliably get the parameters at function entry. Hence,
> > > > > having a probe that is unique to functions as opposed to the middle of a
> > > > > function, makes sense to me.
> > > > >
> > > > > That is, the API can change. "Give me parameter X". That along with some
> > > > > BTF reading, could figure out how to get parameter X, and record that.
> > > >
> > > > This is more or less a description of kprobe on ftrace :)
> > > > The bpf+kprobe users were relying on that for a long time.
> > > > See PT_REGS_PARM1() macros in bpf_tracing.h
> > > > They're meaningful only with kprobe on ftrace.
> > > > So, no, fprobe is not inventing anything new here.
> > >
> > > Hmm, you may be misunderstanding why the PT_REGS_PARM1() macro works. You can
> > > use it even if CONFIG_FUNCTION_TRACER=n if your kernel is built with
> > > CONFIG_KPROBES=y. It is valid unless you put a probe outside of the function
> > > entry.
> > >
> > > > No one is using kprobe in the middle of the function.
> > > > It's too difficult to make anything useful out of it,
> > > > so no one bothers.
> > > > When people say "kprobe" 99 out of 100 they mean
> > > > kprobe on ftrace/fentry.
> > >
> > > I see. But a kprobe is a kprobe. It is not designed to support multiple
> > > probe points. If I'm forced to, I can rename the struct fprobe to
> > > struct multi_kprobe, but that doesn't change the essence. You may need
> > > to use both kprobes and the so-called multi_kprobe properly. (Someone
> > > needs to do that.)
> >
> > hi,
> > trying to kick things further ;-) I was thinking about the bpf side of this
> > and we could use the following interface:
> >
> >   enum bpf_attach_type {
> >     ...
> >     BPF_TRACE_KPROBE_MULTI
> >   };
> >
> >   enum bpf_link_type {
> >     ...
> >     BPF_LINK_TYPE_KPROBE_MULTI
> >   };
> >
> >   union bpf_attr {
> >
> >     struct {
> >       ...
> >       struct {
> >         __aligned_u64   syms;
> >         __aligned_u64   addrs;
> >         __aligned_u64   cookies;
> >         __u32           cnt;
> >         __u32           flags;
> >       } kprobe_multi;
> >     } link_create;
> >   }
> >
> > because from the bpf user's POV it's a new link for attaching multiple kprobes,
> > and I agree that a new 'fprobe' type name here brings more confusion; using
> > kprobe_multi is straightforward
> >
> > thoughts?
> 
> I think this makes sense. We do need new type of link to store ip ->
> cookie mapping anyways.

This looks good to me too.

> 
> Is there any chance to support this fast multi-attach for uprobe? If
> yes, we might want to reuse the same link for both (so should we name
> it more generically?

There is no interface to do that, but there is also no limitation that
prevents expanding uprobes. For kprobes, there are some limitations at
function entry because it needs to share the space with ftrace. So
I introduced fprobe to make it easier to use.

> on the other hand BPF program type for uprobe is
> BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> consistent with what we have today).

Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
the target program.)


> But yeah, the main question is whether there is something preventing
> us from supporting multi-attach uprobe as well? It would be really
> great for USDT use case.

Ah, for USDT, it will be useful. But since we will now have "user-event",
which is faster than uprobes, it may be better to consider using it.

I'm not so sure how uprobes probes the target process, but maybe it has
to manage some memory pages and task-related things. If we can split
the task-related part from the software-breakpoint part of struct uprobe,
it may be easy to support multiple probes (one task-related part + multiple
software-breakpoint parts).

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-17 14:03                           ` Masami Hiramatsu
@ 2022-02-17 22:01                             ` Andrii Nakryiko
  2022-02-18  4:07                               ` Masami Hiramatsu
  0 siblings, 1 reply; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-17 22:01 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jiri Olsa, Alexei Starovoitov, Steven Rostedt, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Oleg Nesterov

On Thu, Feb 17, 2022 at 6:04 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Wed, 16 Feb 2022 10:27:19 -0800
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Tue, Feb 15, 2022 at 5:21 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> > > > Hi Alexei,
> > > >
> > > > On Thu, 3 Feb 2022 18:42:22 -0800
> > > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > > > > >
> > > > > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > > > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > > > >
> > > > > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > > > > transparently.
> > > > > > >
> > > > > > > Not true.
> > > > > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > > > > There was an implicit optimization for kprobe when ftrace
> > > > > > > could be used.
> > > > > > > All this new interface is doing is making it explicit.
> > > > > > > So a new name is not warranted here.
> > > > > > >
> > > > > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > > > > >
> > > > > > > What is the difference?
> > > > > > > I don't see it.
> > > > > >
> > > > > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > > > > abilities that a normal kprobe does not. Namely, "what are the function
> > > > > > parameters?"
> > > > > >
> > > > > > You can only reliably get the parameters at function entry. Hence,
> > > > > > having a probe that is unique to functions as opposed to the middle of a
> > > > > > function, makes sense to me.
> > > > > >
> > > > > > That is, the API can change. "Give me parameter X". That along with some
> > > > > > BTF reading, could figure out how to get parameter X, and record that.
> > > > >
> > > > > This is more or less a description of kprobe on ftrace :)
> > > > > The bpf+kprobe users were relying on that for a long time.
> > > > > See PT_REGS_PARM1() macros in bpf_tracing.h
> > > > > They're meaningful only with kprobe on ftrace.
> > > > > So, no, fprobe is not inventing anything new here.
> > > >
> > > > Hmm, you may be misunderstanding why the PT_REGS_PARM1() macro works. You can
> > > > use it even if CONFIG_FUNCTION_TRACER=n if your kernel is built with
> > > > CONFIG_KPROBES=y. It is valid unless you put a probe outside of the function
> > > > entry.
> > > >
> > > > > No one is using kprobe in the middle of the function.
> > > > > It's too difficult to make anything useful out of it,
> > > > > so no one bothers.
> > > > > When people say "kprobe" 99 out of 100 they mean
> > > > > kprobe on ftrace/fentry.
> > > >
> > > > I see. But a kprobe is a kprobe. It is not designed to support multiple
> > > > probe points. If I'm forced to, I can rename the struct fprobe to
> > > > struct multi_kprobe, but that doesn't change the essence. You may need
> > > > to use both kprobes and the so-called multi_kprobe properly. (Someone
> > > > needs to do that.)
> > >
> > > hi,
> > > trying to kick things further ;-) I was thinking about the bpf side of this
> > > and we could use the following interface:
> > >
> > >   enum bpf_attach_type {
> > >     ...
> > >     BPF_TRACE_KPROBE_MULTI
> > >   };
> > >
> > >   enum bpf_link_type {
> > >     ...
> > >     BPF_LINK_TYPE_KPROBE_MULTI
> > >   };
> > >
> > >   union bpf_attr {
> > >
> > >     struct {
> > >       ...
> > >       struct {
> > >         __aligned_u64   syms;
> > >         __aligned_u64   addrs;
> > >         __aligned_u64   cookies;
> > >         __u32           cnt;
> > >         __u32           flags;
> > >       } kprobe_multi;
> > >     } link_create;
> > >   }
> > >
> > > because from the bpf user's POV it's a new link for attaching multiple kprobes,
> > > and I agree that a new 'fprobe' type name here brings more confusion; using
> > > kprobe_multi is straightforward
> > >
> > > thoughts?
> >
> > I think this makes sense. We do need new type of link to store ip ->
> > cookie mapping anyways.
>
> This looks good to me too.
>
> >
> > Is there any chance to support this fast multi-attach for uprobe? If
> > yes, we might want to reuse the same link for both (so should we name
> > it more generically?
>
> There is no interface to do that, but there is also no limitation that
> prevents expanding uprobes. For kprobes, there are some limitations at
> function entry because it needs to share the space with ftrace. So
> I introduced fprobe to make it easier to use.
>
> > on the other hand BPF program type for uprobe is
> > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > consistent with what we have today).
>
> Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> the target program.)
>

We've been talking about sleepable uprobe programs, so we might need
to add uprobe-specific program type, probably. But historically, from
BPF point of view there was no difference between kprobe and uprobe
programs (in terms of how they are run and what's available to them).
From BPF point of view, it was just attaching BPF program to a
perf_event.

>
> > But yeah, the main question is whether there is something preventing
> > us from supporting multi-attach uprobe as well? It would be really
> > great for USDT use case.
>
> Ah, for USDT, it will be useful. But since we will now have "user-event",
> which is faster than uprobes, it may be better to consider using it.

Any pointers? I'm not sure what "user-event" refers to.

>
> I'm not so sure how uprobes probes the target process, but maybe it has
> to manage some memory pages and task-related things. If we can split
> the task-related part from the software-breakpoint part of struct uprobe,
> it may be easy to support multiple probes (one task-related part + multiple
> software-breakpoint parts).
>
> Thank you,
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>

* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-17 22:01                             ` Andrii Nakryiko
@ 2022-02-18  4:07                               ` Masami Hiramatsu
  2022-02-18 19:46                                 ` Andrii Nakryiko
  2022-02-19  2:10                                 ` Alexei Starovoitov
  0 siblings, 2 replies; 50+ messages in thread
From: Masami Hiramatsu @ 2022-02-18  4:07 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Steven Rostedt, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Oleg Nesterov

On Thu, 17 Feb 2022 14:01:30 -0800
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:


> > > Is there any chance to support this fast multi-attach for uprobe? If
> > > yes, we might want to reuse the same link for both (so should we name
> > > it more generically?
> >
> > There is no interface to do that, but there is also no limitation that
> > prevents expanding uprobes. For kprobes, there are some limitations at
> > function entry because it needs to share the space with ftrace. So
> > I introduced fprobe to make it easier to use.
> >
> > > on the other hand BPF program type for uprobe is
> > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > consistent with what we have today).
> >
> > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > the target program.)
> >
> 
> We've been talking about sleepable uprobe programs, so we might need
> to add uprobe-specific program type, probably. But historically, from
> BPF point of view there was no difference between kprobe and uprobe
> programs (in terms of how they are run and what's available to them).
> From BPF point of view, it was just attaching BPF program to a
> perf_event.

Got it, so that will reuse the uprobe_events in ftrace. But I think
the uprobe requires a "path" to the attached binary, how is it
specified?

> > > But yeah, the main question is whether there is something preventing
> > > us from supporting multi-attach uprobe as well? It would be really
> > > great for USDT use case.
> >
> > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > which is faster than uprobes, we may be better to consider to use it.
> 
> Any pointers? I'm not sure what "user-event" refers to.

Here is the user-events series, which allows a user program to define
raw dynamic events and write raw event data directly from
user space.

https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-18  4:07                               ` Masami Hiramatsu
@ 2022-02-18 19:46                                 ` Andrii Nakryiko
  2022-02-19  2:10                                 ` Alexei Starovoitov
  1 sibling, 0 replies; 50+ messages in thread
From: Andrii Nakryiko @ 2022-02-18 19:46 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jiri Olsa, Alexei Starovoitov, Steven Rostedt, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Oleg Nesterov

On Thu, Feb 17, 2022 at 8:07 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 17 Feb 2022 14:01:30 -0800
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
>
> > > > Is there any chance to support this fast multi-attach for uprobe? If
> > > > yes, we might want to reuse the same link for both (so should we name
> > > > it more generically?
> > >
> > > There is no interface to do that but also there is no limitation to
> > > expand uprobes. For the kprobes, there are some limitations for the
> > > function entry because it needs to share the space with ftrace. So
> > > I introduced fprobe for easier to use.
> > >
> > > > on the other hand BPF program type for uprobe is
> > > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > > consistent with what we have today).
> > >
> > > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > > the target program.)
> > >
> >
> > We've been talking about sleepable uprobe programs, so we might need
> > to add uprobe-specific program type, probably. But historically, from
> > BPF point of view there was no difference between kprobe and uprobe
> > programs (in terms of how they are run and what's available to them).
> > From BPF point of view, it was just attaching BPF program to a
> > perf_event.
>
> Got it, so that will reuse the uprobe_events in ftrace. But I think
> the uprobe requires a "path" to the attached binary, how is it
> specified?

It's passed as a string to the perf subsystem during the perf_event_open() syscall.
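
For illustration, a minimal sketch of how the binary path can be handed to
perf_event_open(). It assumes a kernel new enough to expose the "uprobe"
dynamic PMU and the uprobe_path/probe_offset fields of struct
perf_event_attr. The helper names (read_uprobe_pmu_type, fill_uprobe_attr,
open_uprobe) are made up for this example and are not taken from the patchset.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Read the dynamic PMU type id for uprobes; returns -1 if it is unavailable. */
static int read_uprobe_pmu_type(void)
{
	FILE *f = fopen("/sys/bus/event_source/devices/uprobe/type", "r");
	int type = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &type) != 1)
		type = -1;
	fclose(f);
	return type;
}

/* Describe a uprobe at binary_path + offset; the path travels as a plain string. */
static void fill_uprobe_attr(struct perf_event_attr *attr, int pmu_type,
			     const char *binary_path, uint64_t offset)
{
	assert(attr && binary_path);
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = pmu_type;
	/* uprobe_path shares storage with config1: a user pointer to the path. */
	attr->uprobe_path = (uint64_t)(uintptr_t)binary_path;
	/* probe_offset shares storage with config2: file offset of the probe. */
	attr->probe_offset = offset;
}

/* Open the event; usually needs root or CAP_PERFMON. Returns an fd or -1. */
static int open_uprobe(const char *binary_path, uint64_t offset, pid_t pid)
{
	struct perf_event_attr attr;
	int type = read_uprobe_pmu_type();

	if (type < 0)
		return -1;
	fill_uprobe_attr(&attr, type, binary_path, offset);
	/* pid == -1: all processes on CPU 0; otherwise that pid on any CPU. */
	return (int)syscall(SYS_perf_event_open, &attr, pid,
			    pid == -1 ? 0 : -1, -1, PERF_FLAG_FD_CLOEXEC);
}
```

A BPF program would then be attached to the returned fd, which is why kprobe
and uprobe programs look the same from the BPF side.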

>
> > > > But yeah, the main question is whether there is something preventing
> > > > us from supporting multi-attach uprobe as well? It would be really
> > > > great for USDT use case.
> > >
> > > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > > which is faster than uprobes, we may be better to consider to use it.
> >
> > Any pointers? I'm not sure what "user-event" refers to.
>
> Here is the user-events series, which allows user program to define
> raw dynamic events and it can write raw event data directly from
> user space.
>
> https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/
>

Thanks for the link! I'll check it out.

> Thank you,
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-18  4:07                               ` Masami Hiramatsu
  2022-02-18 19:46                                 ` Andrii Nakryiko
@ 2022-02-19  2:10                                 ` Alexei Starovoitov
  2022-02-21  7:18                                   ` Masami Hiramatsu
  1 sibling, 1 reply; 50+ messages in thread
From: Alexei Starovoitov @ 2022-02-19  2:10 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Andrii Nakryiko, Jiri Olsa, Steven Rostedt, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Oleg Nesterov

On Thu, Feb 17, 2022 at 8:07 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 17 Feb 2022 14:01:30 -0800
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
>
> > > > Is there any chance to support this fast multi-attach for uprobe? If
> > > > yes, we might want to reuse the same link for both (so should we name
> > > > it more generically?
> > >
> > > There is no interface to do that but also there is no limitation to
> > > expand uprobes. For the kprobes, there are some limitations for the
> > > function entry because it needs to share the space with ftrace. So
> > > I introduced fprobe for easier to use.
> > >
> > > > on the other hand BPF program type for uprobe is
> > > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > > consistent with what we have today).
> > >
> > > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > > the target program.)
> > >
> >
> > We've been talking about sleepable uprobe programs, so we might need
> > to add uprobe-specific program type, probably. But historically, from
> > BPF point of view there was no difference between kprobe and uprobe
> > programs (in terms of how they are run and what's available to them).
> > From BPF point of view, it was just attaching BPF program to a
> > perf_event.
>
> Got it, so that will reuse the uprobe_events in ftrace. But I think
> the uprobe requires a "path" to the attached binary, how is it
> specified?
>
> > > > But yeah, the main question is whether there is something preventing
> > > > us from supporting multi-attach uprobe as well? It would be really
> > > > great for USDT use case.
> > >
> > > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > > which is faster than uprobes, we may be better to consider to use it.
> >
> > Any pointers? I'm not sure what "user-event" refers to.
>
> Here is the user-events series, which allows user program to define
> raw dynamic events and it can write raw event data directly from
> user space.
>
> https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/

Is this a way for user space to inject user bytes into kernel events?
What is the use case?


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-19  2:10                                 ` Alexei Starovoitov
@ 2022-02-21  7:18                                   ` Masami Hiramatsu
  0 siblings, 0 replies; 50+ messages in thread
From: Masami Hiramatsu @ 2022-02-21  7:18 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Jiri Olsa, Steven Rostedt, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Oleg Nesterov

On Fri, 18 Feb 2022 18:10:08 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Thu, Feb 17, 2022 at 8:07 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 17 Feb 2022 14:01:30 -0800
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> >
> > > > > Is there any chance to support this fast multi-attach for uprobe? If
> > > > > yes, we might want to reuse the same link for both (so should we name
> > > > > it more generically?
> > > >
> > > > There is no interface to do that but also there is no limitation to
> > > > expand uprobes. For the kprobes, there are some limitations for the
> > > > function entry because it needs to share the space with ftrace. So
> > > > I introduced fprobe for easier to use.
> > > >
> > > > > on the other hand BPF program type for uprobe is
> > > > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > > > consistent with what we have today).
> > > >
> > > > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > > > the target program.)
> > > >
> > >
> > > We've been talking about sleepable uprobe programs, so we might need
> > > to add uprobe-specific program type, probably. But historically, from
> > > BPF point of view there was no difference between kprobe and uprobe
> > > programs (in terms of how they are run and what's available to them).
> > > From BPF point of view, it was just attaching BPF program to a
> > > perf_event.
> >
> > Got it, so that will reuse the uprobe_events in ftrace. But I think
> > the uprobe requires a "path" to the attached binary, how is it
> > specified?
> >
> > > > > But yeah, the main question is whether there is something preventing
> > > > > us from supporting multi-attach uprobe as well? It would be really
> > > > > great for USDT use case.
> > > >
> > > > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > > > which is faster than uprobes, we may be better to consider to use it.
> > >
> > > Any pointers? I'm not sure what "user-event" refers to.
> >
> > Here is the user-events series, which allows user program to define
> > raw dynamic events and it can write raw event data directly from
> > user space.
> >
> > https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/
> 
> Is this a way for user space to inject user bytes into kernel events?

Yes, it is.

> What is the use case?

This is like trace_marker, but a more ftrace/perf-friendly version. The trace_marker
can only send a user string, and the kernel cannot parse it. Thus, the traced
data will be shown in the trace buffer, but the event filter, event trigger,
histogram, etc. do not work with trace_marker.

On the other hand, user-events allows user space to define new events with
various typed arguments, and the application can send the formatted raw
data to the kernel. Thus the kernel can apply event filters, event triggers and
histograms to those events, the same as for other kernel-defined events.

This will be helpful for users to push their own data as events of ftrace
and perf (and eBPF I think) so that they can use those tracing tools to analyze
both their own events and kernel events. :-)
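
To make the trace_marker side of that comparison concrete, here is a minimal
sketch of writing a free-form string into it. It assumes tracefs is mounted
at /sys/kernel/tracing and that the caller may write there; write_marker is
a name made up for this example. The user-events side is left out on
purpose, since its ABI is defined by the series linked above.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write one free-form string to a trace_marker-style file.
 * The kernel records the bytes as-is and does not parse them into fields,
 * which is why filters, triggers and histograms have nothing to work on.
 * Returns 0 on success, -1 on failure.
 */
static int write_marker(const char *marker_path, const char *msg)
{
	size_t len;
	ssize_t n;
	int fd;

	assert(marker_path && msg);
	fd = open(marker_path, O_WRONLY);
	if (fd < 0)
		return -1;
	len = strlen(msg);
	n = write(fd, msg, len);
	close(fd);
	return n == (ssize_t)len ? 0 : -1;
}
```

Typical use would be write_marker("/sys/kernel/tracing/trace_marker",
"request 42 done"), after which the string shows up in the trace buffer
as opaque text.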

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [PATCH 0/8] bpf: Add fprobe link
  2022-02-16 18:27                         ` Andrii Nakryiko
  2022-02-17 14:03                           ` Masami Hiramatsu
@ 2022-02-22 12:42                           ` Jiri Olsa
  1 sibling, 0 replies; 50+ messages in thread
From: Jiri Olsa @ 2022-02-22 12:42 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Masami Hiramatsu, Alexei Starovoitov, Steven Rostedt, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Wed, Feb 16, 2022 at 10:27:19AM -0800, Andrii Nakryiko wrote:

SNIP

> >
> > hi,
> > trying to kick things further ;-) I was thinking about bpf side of this
> > and we could use following interface:
> >
> >   enum bpf_attach_type {
> >     ...
> >     BPF_TRACE_KPROBE_MULTI
> >   };
> >
> >   enum bpf_link_type {
> >     ...
> >     BPF_LINK_TYPE_KPROBE_MULTI
> >   };
> >
> >   union bpf_attr {
> >
> >     struct {
> >       ...
> >       struct {
> >         __aligned_u64   syms;
> >         __aligned_u64   addrs;
> >         __aligned_u64   cookies;
> >         __u32           cnt;
> >         __u32           flags;
> >       } kprobe_multi;
> >     } link_create;
> >   };
> >
> > because from bpf user POV it's new link for attaching multiple kprobes
> > and I agree new 'fprobe' type name in here brings more confusion, using
> > kprobe_multi is straightforward
> >
> > thoughts?
> 
> I think this makes sense. We do need new type of link to store ip ->
> cookie mapping anyways.
> 
> Is there any chance to support this fast multi-attach for uprobe? If
> yes, we might want to reuse the same link for both (so should we name
> it more generically? on the other hand BPF program type for uprobe is
> BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> consistent with what we have today).
> 
> But yeah, the main question is whether there is something preventing
> us from supporting multi-attach uprobe as well? It would be really
> great for USDT use case.

I need to check with uprobes, my understanding ends at perf/trace
code calling uprobe_register ;-)

maybe I should first check whether uprobes suffer from the same performance issue

I'll send another version with the above interface, because there are
tons of other fixes, and by the time of the next version we might
have an answer for the interface change
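
As a rough sketch of how user space might fill the kprobe_multi fields
proposed above: the struct below is a local, hypothetical mirror of the
proposed layout (it is not the kernel uapi header), and the helper name is
made up. It also assumes that syms and cookies are parallel arrays of cnt
entries and that addrs stays zero when symbols are used; the thread does not
spell out those details.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace mirror of the proposed link_create.kprobe_multi. */
struct kprobe_multi_args {
	uint64_t syms;    /* user pointer to an array of cnt symbol names */
	uint64_t addrs;   /* user pointer to an array of cnt addresses */
	uint64_t cookies; /* user pointer to an array of cnt u64 cookies, or 0 */
	uint32_t cnt;     /* number of functions to attach to */
	uint32_t flags;
};

/* Pointers cross the syscall boundary as 64-bit integers. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/*
 * Fill args for attaching by symbol name (assumption: addrs is left at 0
 * in this mode). cookies may be NULL. Returns 0 on success, -1 on bad input.
 */
static int kprobe_multi_args_from_syms(struct kprobe_multi_args *args,
				       const char *const *syms,
				       const uint64_t *cookies, uint32_t cnt)
{
	assert(args != NULL);
	if (!syms || cnt == 0)
		return -1;
	memset(args, 0, sizeof(*args));
	args->syms = ptr_to_u64(syms);
	args->cookies = ptr_to_u64(cookies);
	args->cnt = cnt;
	return 0;
}
```

The single cnt shared by all three arrays is what would let one link carry
the ip -> cookie mapping mentioned earlier in the thread.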

jirka


Thread overview: 50+ messages
2022-02-02 13:53 [PATCH 0/8] bpf: Add fprobe link Jiri Olsa
2022-02-02 13:53 ` [PATCH 1/8] bpf: Add support to attach kprobe program with fprobe Jiri Olsa
2022-02-07 18:59   ` Andrii Nakryiko
2022-02-08  8:56     ` Jiri Olsa
2022-02-02 13:53 ` [PATCH 2/8] bpf: Add bpf_get_func_ip kprobe helper for fprobe link Jiri Olsa
2022-02-07 18:59   ` Andrii Nakryiko
2022-02-07 21:01     ` Alexei Starovoitov
2022-02-09 15:01     ` Jiri Olsa
2022-02-09 16:05       ` Andrii Nakryiko
2022-02-09 19:14         ` Jiri Olsa
2022-02-02 13:53 ` [PATCH 3/8] bpf: Add bpf_cookie support to fprobe Jiri Olsa
2022-02-07 18:59   ` Andrii Nakryiko
2022-02-08  9:07     ` Jiri Olsa
2022-02-08 23:35       ` Andrii Nakryiko
2022-02-08 23:46         ` Jiri Olsa
2022-02-08 23:53           ` Andrii Nakryiko
2022-02-02 13:53 ` [PATCH 4/8] libbpf: Add libbpf__kallsyms_parse function Jiri Olsa
2022-02-07 18:59   ` Andrii Nakryiko
2022-02-08  9:08     ` Jiri Olsa
2022-02-02 13:53 ` [PATCH 5/8] libbpf: Add bpf_link_create support for multi kprobes Jiri Olsa
2022-02-02 13:53 ` [PATCH 6/8] libbpf: Add bpf_program__attach_kprobe_opts " Jiri Olsa
2022-02-07 18:59   ` Andrii Nakryiko
2022-02-08  9:12     ` Jiri Olsa
2022-02-02 13:53 ` [PATCH 7/8] selftest/bpf: Add fprobe attach test Jiri Olsa
2022-02-02 13:53 ` [PATCH 8/8] selftest/bpf: Add fprobe test for bpf_cookie values Jiri Olsa
2022-02-07 18:59   ` Andrii Nakryiko
2022-02-08  9:15     ` Jiri Olsa
2022-02-08 23:24       ` Andrii Nakryiko
2022-02-02 17:09 ` [PATCH 0/8] bpf: Add fprobe link Alexei Starovoitov
2022-02-02 17:24   ` Jiri Olsa
2022-02-02 17:30     ` Alexei Starovoitov
2022-02-03 15:06       ` Jiri Olsa
2022-02-04  0:46         ` Masami Hiramatsu
2022-02-04  1:34           ` Alexei Starovoitov
2022-02-04  2:07             ` Masami Hiramatsu
2022-02-04  2:12               ` Alexei Starovoitov
2022-02-04  2:19                 ` Steven Rostedt
2022-02-04  2:42                   ` Alexei Starovoitov
2022-02-04  3:17                     ` Masami Hiramatsu
2022-02-04  3:59                     ` Masami Hiramatsu
2022-02-15 13:21                       ` Jiri Olsa
2022-02-16 18:27                         ` Andrii Nakryiko
2022-02-17 14:03                           ` Masami Hiramatsu
2022-02-17 22:01                             ` Andrii Nakryiko
2022-02-18  4:07                               ` Masami Hiramatsu
2022-02-18 19:46                                 ` Andrii Nakryiko
2022-02-19  2:10                                 ` Alexei Starovoitov
2022-02-21  7:18                                   ` Masami Hiramatsu
2022-02-22 12:42                           ` Jiri Olsa
2022-02-04  3:14                 ` Masami Hiramatsu
