Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>,
syzbot+bed360704c521841c85d@syzkaller.appspotmail.com,
Kurt Manucredo <fuzzybritches0@gmail.com>,
Eric Biggers <ebiggers@kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Edward Cree <ecree.xilinx@gmail.com>,
Sasha Levin <sashal@kernel.org>,
netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH AUTOSEL 5.10 087/137] bpf: Fix up register-based shifts in interpreter to silence KUBSAN
Date: Tue, 6 Jul 2021 07:21:13 -0400 [thread overview]
Message-ID: <20210706112203.2062605-87-sashal@kernel.org> (raw)
In-Reply-To: <20210706112203.2062605-1-sashal@kernel.org>
From: Daniel Borkmann <daniel@iogearbox.net>
[ Upstream commit 28131e9d933339a92f78e7ab6429f4aaaa07061c ]
syzbot reported a shift-out-of-bounds that KUBSAN observed in the
interpreter:
[...]
UBSAN: shift-out-of-bounds in kernel/bpf/core.c:1420:2
shift exponent 255 is too large for 64-bit type 'long long unsigned int'
CPU: 1 PID: 11097 Comm: syz-executor.4 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x141/0x1d7 lib/dump_stack.c:120
ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
__ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327
___bpf_prog_run.cold+0x19/0x56c kernel/bpf/core.c:1420
__bpf_prog_run32+0x8f/0xd0 kernel/bpf/core.c:1735
bpf_dispatcher_nop_func include/linux/bpf.h:644 [inline]
bpf_prog_run_pin_on_cpu include/linux/filter.h:624 [inline]
bpf_prog_run_clear_cb include/linux/filter.h:755 [inline]
run_filter+0x1a1/0x470 net/packet/af_packet.c:2031
packet_rcv+0x313/0x13e0 net/packet/af_packet.c:2104
dev_queue_xmit_nit+0x7c2/0xa90 net/core/dev.c:2387
xmit_one net/core/dev.c:3588 [inline]
dev_hard_start_xmit+0xad/0x920 net/core/dev.c:3609
__dev_queue_xmit+0x2121/0x2e00 net/core/dev.c:4182
__bpf_tx_skb net/core/filter.c:2116 [inline]
__bpf_redirect_no_mac net/core/filter.c:2141 [inline]
__bpf_redirect+0x548/0xc80 net/core/filter.c:2164
____bpf_clone_redirect net/core/filter.c:2448 [inline]
bpf_clone_redirect+0x2ae/0x420 net/core/filter.c:2420
___bpf_prog_run+0x34e1/0x77d0 kernel/bpf/core.c:1523
__bpf_prog_run512+0x99/0xe0 kernel/bpf/core.c:1737
bpf_dispatcher_nop_func include/linux/bpf.h:644 [inline]
bpf_test_run+0x3ed/0xc50 net/bpf/test_run.c:50
bpf_prog_test_run_skb+0xabc/0x1c50 net/bpf/test_run.c:582
bpf_prog_test_run kernel/bpf/syscall.c:3127 [inline]
__do_sys_bpf+0x1ea9/0x4f00 kernel/bpf/syscall.c:4406
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
[...]
Generally speaking, KUBSAN reports from the kernel should be fixed.
However, in case of BPF, this particular report caused concerns since
the large shift is not wrong from BPF point of view, just undefined.
In the verifier, K-based shifts that are >= {64,32} (depending on the
bitwidth of the instruction) are already rejected. The register-based
cases were not given their content might not be known at verification
time. Ideas such as verifier instruction rewrite with an additional
AND instruction for the source register were brought up, but regularly
rejected due to the additional runtime overhead they incur.
As Edward Cree rightly put it:
Shifts by more than insn bitness are legal in the BPF ISA; they are
implementation-defined behaviour [of the underlying architecture],
rather than UB, and have been made legal for performance reasons.
Each of the JIT backends compiles the BPF shift operations to machine
instructions which produce implementation-defined results in such a
case; the resulting contents of the register may be arbitrary but
program behaviour as a whole remains defined.
Guard checks in the fast path (i.e. affecting JITted code) will thus
not be accepted.
The case of division by zero is not truly analogous here, as division
instructions on many of the JIT-targeted architectures will raise a
machine exception / fault on division by zero, whereas (to the best
of my knowledge) none will do so on an out-of-bounds shift.
Given the KUBSAN report only affects the BPF interpreter, but not JITs,
one solution is to add the ANDs with 63 or 31 into ___bpf_prog_run().
That would make the shifts defined, and thus shuts up KUBSAN, and the
compiler would optimize out the AND on any CPU that interprets the shift
amounts modulo the width anyway (e.g., confirmed from disassembly that
on x86-64 and arm64 the generated interpreter code is the same before
and after this fix).
The BPF interpreter is slow path, and most likely compiled out anyway
as distros select BPF_JIT_ALWAYS_ON to avoid speculative execution of
BPF instructions by the interpreter. Given the main argument was to
avoid sacrificing performance, the fact that the AND is optimized away
from compiler for mainstream archs helps as well as a solution moving
forward. Also add a comment on LSH/RSH/ARSH translation for JIT authors
to provide guidance when they see the ___bpf_prog_run() interpreter
code and use it as a model for a new JIT backend.
Reported-by: syzbot+bed360704c521841c85d@syzkaller.appspotmail.com
Reported-by: Kurt Manucredo <fuzzybritches0@gmail.com>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Co-developed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: syzbot+bed360704c521841c85d@syzkaller.appspotmail.com
Cc: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/bpf/0000000000008f912605bd30d5d7@google.com
Link: https://lore.kernel.org/bpf/bac16d8d-c174-bdc4-91bd-bfa62b410190@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/bpf/core.c | 61 +++++++++++++++++++++++++++++++++--------------
1 file changed, 43 insertions(+), 18 deletions(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 182e162f8fd0..239c6b3b5993 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1395,29 +1395,54 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
select_insn:
goto *jumptable[insn->code];
- /* ALU */
-#define ALU(OPCODE, OP) \
- ALU64_##OPCODE##_X: \
- DST = DST OP SRC; \
- CONT; \
- ALU_##OPCODE##_X: \
- DST = (u32) DST OP (u32) SRC; \
- CONT; \
- ALU64_##OPCODE##_K: \
- DST = DST OP IMM; \
- CONT; \
- ALU_##OPCODE##_K: \
- DST = (u32) DST OP (u32) IMM; \
+ /* Explicitly mask the register-based shift amounts with 63 or 31
+ * to avoid undefined behavior. Normally this won't affect the
+ * generated code, for example, in case of native 64 bit archs such
+ * as x86-64 or arm64, the compiler is optimizing the AND away for
+ * the interpreter. In case of JITs, each of the JIT backends compiles
+ * the BPF shift operations to machine instructions which produce
+ * implementation-defined results in such a case; the resulting
+ * contents of the register may be arbitrary, but program behaviour
+ * as a whole remains defined. In other words, in case of JIT backends,
+ * the AND must /not/ be added to the emitted LSH/RSH/ARSH translation.
+ */
+ /* ALU (shifts) */
+#define SHT(OPCODE, OP) \
+ ALU64_##OPCODE##_X: \
+ DST = DST OP (SRC & 63); \
+ CONT; \
+ ALU_##OPCODE##_X: \
+ DST = (u32) DST OP ((u32) SRC & 31); \
+ CONT; \
+ ALU64_##OPCODE##_K: \
+ DST = DST OP IMM; \
+ CONT; \
+ ALU_##OPCODE##_K: \
+ DST = (u32) DST OP (u32) IMM; \
+ CONT;
+ /* ALU (rest) */
+#define ALU(OPCODE, OP) \
+ ALU64_##OPCODE##_X: \
+ DST = DST OP SRC; \
+ CONT; \
+ ALU_##OPCODE##_X: \
+ DST = (u32) DST OP (u32) SRC; \
+ CONT; \
+ ALU64_##OPCODE##_K: \
+ DST = DST OP IMM; \
+ CONT; \
+ ALU_##OPCODE##_K: \
+ DST = (u32) DST OP (u32) IMM; \
CONT;
-
ALU(ADD, +)
ALU(SUB, -)
ALU(AND, &)
ALU(OR, |)
- ALU(LSH, <<)
- ALU(RSH, >>)
ALU(XOR, ^)
ALU(MUL, *)
+ SHT(LSH, <<)
+ SHT(RSH, >>)
+#undef SHT
#undef ALU
ALU_NEG:
DST = (u32) -DST;
@@ -1442,13 +1467,13 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
insn++;
CONT;
ALU_ARSH_X:
- DST = (u64) (u32) (((s32) DST) >> SRC);
+ DST = (u64) (u32) (((s32) DST) >> (SRC & 31));
CONT;
ALU_ARSH_K:
DST = (u64) (u32) (((s32) DST) >> IMM);
CONT;
ALU64_ARSH_X:
- (*(s64 *) &DST) >>= SRC;
+ (*(s64 *) &DST) >>= (SRC & 63);
CONT;
ALU64_ARSH_K:
(*(s64 *) &DST) >>= IMM;
--
2.30.2
next prev parent reply other threads:[~2021-07-06 11:39 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20210706112203.2062605-1-sashal@kernel.org>
2021-07-06 11:19 ` [PATCH AUTOSEL 5.10 009/137] net: pch_gbe: Use proper accessors to BE data in pch_ptp_match() Sasha Levin
2021-07-06 11:19 ` [PATCH AUTOSEL 5.10 013/137] atm: iphase: fix possible use-after-free in ia_module_exit() Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 014/137] mISDN: fix possible use-after-free in HFC_cleanup() Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 015/137] atm: nicstar: Fix possible use-after-free in nicstar_cleanup() Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 016/137] net: Treat __napi_schedule_irqoff() as __napi_schedule() on PREEMPT_RT Sasha Levin
2021-07-12 21:52 ` Pavel Machek
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 018/137] net: mdio: ipq8064: add regmap config to disable REGCACHE Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 023/137] bpf: Check for BPF_F_ADJ_ROOM_FIXED_GSO when bpf_skb_change_proto Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 024/137] net: mdio: provide shim implementation of devm_of_mdiobus_register Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 025/137] net/sched: cls_api: increase max_reclassify_loop Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 032/137] e100: handle eeprom as little endian Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 033/137] igb: handle vlan types with checker enabled Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 034/137] igb: fix assignment on big endian machines Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 037/137] net/mlx5e: IPsec/rep_tc: Fix rep_tc_update_skb drops IPsec packet Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 038/137] net/mlx5: Fix lag port remapping logic Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 041/137] net: stmmac: the XPCS obscures a potential "PHY not found" error Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 046/137] virtio-net: Add validation for used length Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 047/137] ipv6: use prandom_u32() for ID generation Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 052/137] net: tcp better handling of reordering then loss cases Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 057/137] net: bridge: mrp: Update ring transitions Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 059/137] ice: set the value of global config lock timeout longer Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 060/137] ice: fix clang warning regarding deadcode.DeadStores Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 061/137] virtio_net: Remove BUG() to avoid machine dead Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 062/137] net: mscc: ocelot: check return value after calling platform_get_resource() Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 063/137] net: bcmgenet: " Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 064/137] net: mvpp2: " Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 065/137] net: micrel: " Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 066/137] net: moxa: Use devm_platform_get_and_ioremap_resource() Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 072/137] net: phy: realtek: add delay to fix RXC generation issue Sasha Levin
2021-07-06 11:20 ` [PATCH AUTOSEL 5.10 073/137] selftests: Clean forgotten resources as part of cleanup() Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 074/137] net: sgi: ioc3-eth: check return value after calling platform_get_resource() Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 076/137] fjes: " Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 078/137] r8169: avoid link-up interrupt issue on RTL8106e if user enables ASPM Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 080/137] xfrm: Fix error reporting in xfrm_state_construct Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 082/137] wlcore/wl12xx: Fix wl12xx get_mac error if device is in ELP Sasha Levin
2021-07-12 22:03 ` Pavel Machek
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 083/137] wl1251: Fix possible buffer overflow in wl1251_cmd_scan Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 084/137] cw1200: add missing MODULE_DEVICE_TABLE Sasha Levin
2021-07-06 11:21 ` Sasha Levin [this message]
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 088/137] ice: fix incorrect payload indicator on PTYPE Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 089/137] ice: mark PTYPE 2 as reserved Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 090/137] mt76: mt7615: fix fixed-rate tx status reporting Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 092/137] net: ipa: Add missing of_node_put() in ipa_firmware_load() Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 093/137] net: sched: fix error return code in tcf_del_walker() Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 097/137] mt76: mt7915: fix IEEE80211_HE_PHY_CAP7_MAX_NC for station mode Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 098/137] rtl8xxxu: Fix device info for RTL8192EU devices Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 100/137] net: fec: add ndo_select_queue to fix TX bandwidth fluctuations Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 101/137] atm: nicstar: use 'dma_free_coherent' instead of 'kfree' Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 102/137] atm: nicstar: register the interrupt handler in the right place Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 103/137] vsock: notify server to shutdown when client has pending signal Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 105/137] iwlwifi: mvm: don't change band on bound PHY contexts Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 106/137] iwlwifi: mvm: fix error print when session protection ends Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 107/137] iwlwifi: mvm: support LONG_GROUP for WOWLAN_GET_STATUSES version Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 108/137] iwlwifi: pcie: free IML DMA memory allocation Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 109/137] iwlwifi: pcie: fix context info freeing Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 110/137] sfc: avoid double pci_remove of VFs Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 111/137] sfc: error code if SRIOV cannot be disabled Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 112/137] net: dsa: b53: Create default VLAN entry explicitly Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 113/137] wireless: wext-spy: Fix out-of-bounds warning Sasha Levin
2021-07-06 14:08 ` Johannes Berg
2021-07-07 10:45 ` Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 114/137] cfg80211: fix default HE tx bitrate mask in 2G band Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 115/137] mac80211: consider per-CPU statistics if present Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 116/137] mac80211_hwsim: add concurrent channels scanning support over virtio Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 118/137] media, bpf: Do not copy more entries than user space requested Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 119/137] net: ip: avoid OOM kills with large UDP sends over loopback Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 122/137] Bluetooth: Fix the HCI to MGMT status conversion table Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 123/137] Bluetooth: Fix alt settings for incoming SCO with transparent coding format Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 124/137] Bluetooth: Shutdown controller after workqueues are flushed or cancelled Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 126/137] Bluetooth: L2CAP: Fix invalid access if ECRED Reconfigure fails Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 127/137] Bluetooth: L2CAP: Fix invalid access on ECRED Connection response Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 129/137] Bluetooth: mgmt: Fix the command returns garbage parameter value Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 132/137] bpf: Fix false positive kmemleak report in bpf_ringbuf_area_alloc() Sasha Levin
2021-07-06 11:21 ` [PATCH AUTOSEL 5.10 133/137] flow_offload: action should not be NULL when it is referenced Sasha Levin
2021-07-06 11:22 ` [PATCH AUTOSEL 5.10 134/137] sctp: validate from_addr_param return Sasha Levin
2021-07-06 11:22 ` [PATCH AUTOSEL 5.10 135/137] sctp: add size validation when walking chunks Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210706112203.2062605-87-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=ebiggers@kernel.org \
--cc=ecree.xilinx@gmail.com \
--cc=fuzzybritches0@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=syzbot+bed360704c521841c85d@syzkaller.appspotmail.com \
--subject='Re: [PATCH AUTOSEL 5.10 087/137] bpf: Fix up register-based shifts in interpreter to silence KUBSAN' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).