Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
* Actual tail call count limits in x86 JITs and interpreter
@ 2021-09-09 22:15 Johan Almbladh
  2021-09-09 23:20 ` Daniel Borkmann
  0 siblings, 1 reply; 2+ messages in thread
From: Johan Almbladh @ 2021-09-09 22:15 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov, Andrii Nakryiko,
	Paul Chaignon, Tiezhu Yang
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, bpf, Networking

I have done some investigation into this matter, and this is what I
have found. To recap, the situation is as follows.

MAX_TAIL_CALL_CNT is defined to be 32. Since the interpreter has used
33 instead, we agree to use that limit across all JIT implementations
to not break any user space program. To make sure everything uses the
same limit, we must first understand what the current state actually
is so we know what to fix.

Me: according to test_bpf.ko the tail call limit is 33 for the
interpreter, and 32 for the x86-64 JIT.
Paul: according to selftests the tail call limit is 33 for both the
interpreter and the x86-64 JIT.

Link: https://lore.kernel.org/bpf/20210809093437.876558-1-johan.almbladh@anyfinetworks.com/

I have been able to reproduce the above selftests results using
vmtest.sh. Digging deeper into this, I found that there are actually
two different code paths where the tail call count is checked in the
x86-64 JIT, corresponding to direct and indirect tail calls. By
setting different limits in those two places, I found that selftests
tailcall_3 hits the limit in emit_bpf_tail_call_direct(), whereas the
test_bpf.ko is limited by emit_bpf_tail_call_indirect().

I am not 100% sure that this is the correct explanation, but it sounds
very reasonable. However, the asm generated in the two cases look very
similar to me, so by looking at that alone I cannot really see that
the limits would be different. Perhaps someone more versed in x86 asm
could take a closer look.

What are your thoughts?

Johan

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Actual tail call count limits in x86 JITs and interpreter
  2021-09-09 22:15 Actual tail call count limits in x86 JITs and interpreter Johan Almbladh
@ 2021-09-09 23:20 ` Daniel Borkmann
  0 siblings, 0 replies; 2+ messages in thread
From: Daniel Borkmann @ 2021-09-09 23:20 UTC (permalink / raw)
  To: Johan Almbladh, Alexei Starovoitov, Andrii Nakryiko,
	Paul Chaignon, Tiezhu Yang
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, bpf, Networking

[-- Attachment #1: Type: text/plain, Size: 1897 bytes --]

On 9/10/21 12:15 AM, Johan Almbladh wrote:
> I have done some investigation into this matter, and this is what I
> have found. To recap, the situation is as follows.
> 
> MAX_TAIL_CALL_CNT is defined to be 32. Since the interpreter has used
> 33 instead, we agree to use that limit across all JIT implementations
> to not break any user space program. To make sure everything uses the
> same limit, we must first understand what the current state actually
> is so we know what to fix.
> 
> Me: according to test_bpf.ko the tail call limit is 33 for the
> interpreter, and 32 for the x86-64 JIT.
> Paul: according to selftests the tail call limit is 33 for both the
> interpreter and the x86-64 JIT.
> 
> Link: https://lore.kernel.org/bpf/20210809093437.876558-1-johan.almbladh@anyfinetworks.com/
> 
> I have been able to reproduce the above selftests results using
> vmtest.sh. Digging deeper into this, I found that there are actually
> two different code paths where the tail call count is checked in the
> x86-64 JIT, corresponding to direct and indirect tail calls. By
> setting different limits in those two places, I found that selftests
> tailcall_3 hits the limit in emit_bpf_tail_call_direct(), whereas the
> test_bpf.ko is limited by emit_bpf_tail_call_indirect().

I hacked a quick test case so that both are covered, see attached. Both
have the same behavior from my testing with and without the x86-64 JIT.
Bit late over here, will check more tomorrow, but in both cases we also
emit the same JIT code wrt counter update..

> I am not 100% sure that this is the correct explanation, but it sounds
> very reasonable. However, the asm generated in the two cases look very
> similar to me, so by looking at that alone I cannot really see that
> the limits would be different. Perhaps someone more versed in x86 asm
> could take a closer look.
> 
> What are your thoughts?
> 
> Johan
> 


[-- Attachment #2: 0001-bpf-selftests-Replicate-tailcall_3-limit-test-for-in.patch --]
[-- Type: text/x-patch, Size: 4635 bytes --]

From c68f8e699a639392f24244068c1b49389f67f302 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu, 9 Sep 2021 22:55:13 +0000
Subject: [PATCH bpf-next] bpf, selftests: Replicate tailcall_3 limit test for indirect call case

The tailcall_3 test program uses bpf_tail_call_static() where the JIT
would patch a direct jump. Add a new tailcall_6 test program replicating
exactly the same test just ensuring that bpf_tail_call() uses a map index
where the verifier cannot make assumptions this time. In other words this
will now cover both on x86-64 JIT, emit_bpf_tail_call_direct() emission
as well as emit_bpf_tail_call_indirect() emission.

  # ./test_progs -t tailcalls
  #136/1 tailcalls/tailcall_1:OK
  #136/2 tailcalls/tailcall_2:OK
  #136/3 tailcalls/tailcall_3:OK
  #136/4 tailcalls/tailcall_4:OK
  #136/5 tailcalls/tailcall_5:OK
  #136/6 tailcalls/tailcall_6:OK
  #136/7 tailcalls/tailcall_bpf2bpf_1:OK
  #136/8 tailcalls/tailcall_bpf2bpf_2:OK
  #136/9 tailcalls/tailcall_bpf2bpf_3:OK
  #136/10 tailcalls/tailcall_bpf2bpf_4:OK
  #136/11 tailcalls/tailcall_bpf2bpf_5:OK
  #136 tailcalls:OK
  Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Cc: Paul Chaignon <paul@cilium.io>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
---
 .../selftests/bpf/prog_tests/tailcalls.c      | 25 ++++++++++---
 tools/testing/selftests/bpf/progs/tailcall6.c | 36 +++++++++++++++++++
 2 files changed, 56 insertions(+), 5 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall6.c

diff --git a/tools/testing/selftests/bpf/prog_tests/tailcalls.c b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
index b5940e6ca67c..7bf3a7a97d7b 100644
--- a/tools/testing/selftests/bpf/prog_tests/tailcalls.c
+++ b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
@@ -219,10 +219,7 @@ static void test_tailcall_2(void)
 	bpf_object__close(obj);
 }
 
-/* test_tailcall_3 checks that the count value of the tail call limit
- * enforcement matches with expectations.
- */
-static void test_tailcall_3(void)
+static void test_tailcall_count(const char *which)
 {
 	int err, map_fd, prog_fd, main_fd, data_fd, i, val;
 	struct bpf_map *prog_array, *data_map;
@@ -231,7 +228,7 @@ static void test_tailcall_3(void)
 	__u32 retval, duration;
 	char buff[128] = {};
 
-	err = bpf_prog_load("tailcall3.o", BPF_PROG_TYPE_SCHED_CLS, &obj,
+	err = bpf_prog_load(which, BPF_PROG_TYPE_SCHED_CLS, &obj,
 			    &prog_fd);
 	if (CHECK_FAIL(err))
 		return;
@@ -296,6 +293,22 @@ static void test_tailcall_3(void)
 	bpf_object__close(obj);
 }
 
+/* test_tailcall_3 checks that the count value of the tail call limit
+ * enforcement matches with expectations. JIT uses direct jump.
+ */
+static void test_tailcall_3(void)
+{
+	test_tailcall_count("tailcall3.o");
+}
+
+/* test_tailcall_6 checks that the count value of the tail call limit
+ * enforcement matches with expectations. JIT uses indirect jump.
+ */
+static void test_tailcall_6(void)
+{
+	test_tailcall_count("tailcall6.o");
+}
+
 /* test_tailcall_4 checks that the kernel properly selects indirect jump
  * for the case where the key is not known. Latter is passed via global
  * data to select different targets we can compare return value of.
@@ -822,6 +835,8 @@ void test_tailcalls(void)
 		test_tailcall_4();
 	if (test__start_subtest("tailcall_5"))
 		test_tailcall_5();
+	if (test__start_subtest("tailcall_6"))
+		test_tailcall_6();
 	if (test__start_subtest("tailcall_bpf2bpf_1"))
 		test_tailcall_bpf2bpf_1();
 	if (test__start_subtest("tailcall_bpf2bpf_2"))
diff --git a/tools/testing/selftests/bpf/progs/tailcall6.c b/tools/testing/selftests/bpf/progs/tailcall6.c
new file mode 100644
index 000000000000..9c298d908693
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tailcall6.c
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+
+#include <bpf/bpf_helpers.h>
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(max_entries, 1);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(__u32));
+} jmp_table SEC(".maps");
+
+int count = 0;
+volatile int which;
+
+SEC("classifier/0")
+int bpf_func_0(struct __sk_buff *skb)
+{
+	count++;
+	if (__builtin_constant_p(which))
+		__bpf_unreachable();
+	bpf_tail_call(skb, &jmp_table, which);
+	return 1;
+}
+
+SEC("classifier")
+int entry(struct __sk_buff *skb)
+{
+	if (__builtin_constant_p(which))
+		__bpf_unreachable();
+	bpf_tail_call(skb, &jmp_table, which);
+	return 0;
+}
+
+char __license[] SEC("license") = "GPL";
+int _version SEC("version") = 1;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-09-09 23:20 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-09 22:15 Actual tail call count limits in x86 JITs and interpreter Johan Almbladh
2021-09-09 23:20 ` Daniel Borkmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).