Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Gilad Naaman <gnaaman@drivenets.com>
To: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: Possible GRE-TAP/ECT bug
Date: Wed, 21 Jul 2021 20:18:56 +0000	[thread overview]
Message-ID: <5CE212DA-0C5E-47B7-9612-4BE6EA224488@drivenets.com> (raw)
In-Reply-To: <60DDAD73-D59E-449E-B85C-2A79F41C687D@drivenets.com>

This can be reproduced by using this:

    $ cat setup_bug.sh
    #!/bin/bash
    ip netns add A
    ip netns add B
    ip -n A link add _v0 type veth peer name _v1 netns B
    
    ip -n A link set _v0 up
    ip -n A addr add dev _v0 10.254.3.1/24
    ip -n A route add default dev _v0 scope global
    
    ip -n B link set _v1 up
    ip -n B addr add dev _v1 10.254.1.6/24
    ip -n B route add default dev _v1 scope global
    
    ip -n B link add florp type gretap local 10.254.1.6 remote 10.254.3.1 key 0x49000000
    ip -n B link set florp up
    
    $ cat send_pkt.py
    #!/usr/bin/env python3
    from scapy.all import *
    
    pkt = IP(b'E\x01\x00\xa7\x00\x00\x00\x00@/`%\n\xfe\x03\x01\n\xfe\x01\x06 \x00eXI\x00\x00\x00\x18\xbe\x92\xa0\xee&\x18\xb0\x92\xa0l&\x08\x00E\x00\x00}\x8b\x85@\x00\x01\x01\xe4\xf2\x82\x82\x82\x01\x82\x82\x82\x02\x08\x00d\x11\xa6\xeb3\x1e\x1e\\xf3\\xf7`\x00\x00\x00\x00ZN\x00\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ')
    
    send(pkt)


And then

    $ sudo ip netns exec B tcpdump -neqllvi florp icmp &

    $ sudo ip netns exec A python3 send_pkt.py

> On 21 Jul 2021, at 20:36, Gilad Naaman <gnaaman@drivenets.com> wrote:
> 
> Hey,
> 
> We’ve encountered a bug with handling of the ECT bits in GRE-tap encapsulated IPv4 traffic on Kernel 5.4.
> 
> This bug manifests when a packet in the following form enters the system:
> 
> +-----------+
> |Inner IPv4 |    ECT=0b00
> +-----------+
> |Ethernet   |
> +-----------+
> |GRE        |
> +-----------+
> |Outer IPv4 |    ECT=0b01  |  ECT=0b10
> +-----------+
> 
> When inspected by tcpdump-ing the outer interface, the packet is fine;
> when inspected by tcpdump-ing the GRE interface, a small amount of bits is changed in the source/destination addresses of the inner Ethernet header.
> 
> Upon further inspection, we found out that the bits changed in the ethernet headers correspond to the position of the ECT/Checksum fields in the IPv4 header.
> In other words, some code path tries to copy the ECT bits from outer to the inner IPv4 header and correct the checksum, but it accidentally treats the ethernet header as an IP header.
> 
> Unfortunately this results in the loss of all IPv4 traffic.
> 
> This doesn’t happen if the internal protocol is IPv6 or IS-IS, or if the external ECT is 0.
> 
> We’ve tried to search for places that modify the `tos` field of an skb, and found just enough to say “yes, some code wants to sync the ECT bits”,
> but not to find the right one.
> 
> Most prominently, xfrm does just that, but as far as I can tell it is not relevant to these packets.
> (It is done by calling `ipv4_copy_dscp` from `xfrm4_remove_tunnel_encap`)
> 
> We’ve gone over the ip_tunnel.c code dmesg shows that `ip_tunnel: non-ECT from 10.254.3.1 with TOS=0x1` is printed for the packet mentioned below.
> 
> Would love if someone can direct me at where to look for this ECT copying code for further analysis.
> 
> Thank you!
> 
> 
> =================================================
> 
> As an example to a modified packet, here’s an ICMP ping.
> The sent packet is
> 
> ```
> <IP  version=4 ihl=5 tos=0x1 id=0 flags= frag=0 ttl=64 proto=gre src=10.254.3.1 dst=10.254.1.6 |<GRE  chksum_present=0 routing_present=0 key_present=1 seqnum_present=0 strict_route_source=0 recursion_control=0 flags=0 version=0 proto=TEB key=0x49000000 |<Ether  dst=18:be:92:a0:ee:26 src=18:b0:92:a0:6c:26 type=IPv4 |<IP  version=4 ihl=5 tos=0x0 id=35717 flags=DF frag=0 ttl=1 proto=icmp src=130.130.130.1 dst=130.130.130.2 |<ICMP  type=echo-request code=0 id=0xa6eb seq=0x331e unused='' |<Raw  load='\x1e\\xf3\\xf7`\x00\x00\x00\x00ZN\x00\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ' |>>>>>>
> ```
> or in hex:
> 450100a700000000402f60250afe03010afe0106200065584900000018be92a0ee2618b092a06c2608004500007d8b8540000101e4f2828282018282820208006411a6eb331e1e5c7866335c78663760000000005a4e000000000000101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a
> 
> 
> 
> The received packet is
> ```
> <Ether  dst=18:bd:92:a0:ee:26 src=18:b0:92:a0:6c:27 type=IPv4 |<IP  version=4 ihl=5 tos=0x0 len=125 id=35717 flags=DF frag=0 ttl=1 proto=icmp chksum=0xe4f2 src=130.130.130.1 dst=130.130.130.2 |<ICMP  type=echo-request code=0 chksum=0x6411 id=0xa6eb seq=0x331e unused='' |<Raw  load='\x1e\\xf3\\xf7`\x00\x00\x00\x00ZN\x00\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ' |>>>>
> ```
> or
> 18bd92a0ee2618b092a06c2708004500007d8b8540000101e4f2828282018282820208006411a6eb331e1e5c7866335c78663760000000005a4e000000000000101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a
> 
> 
> Note how the destination address changes from `18:be:92:a0:ee:26` to `18:bd:92:a0:ee:26`.
> Where the lower 2 bits of the second octet are overwritten with the ECT `0b01`.
> 
> The source address is then changed from `18:b0:92:a0:6c:26` to `18:b0:92:a0:6c:27`, probably as an attempt to fix the checksum.
> In practice it’s always incremented by 1, even though I’m not sure it results in a correct checksum.
> 
> If the MAC address is changed on the interface to match the ECT, or the other ECT is used in the packet, nothing changes and the packet is received correctly.


  reply	other threads:[~2021-07-21 20:19 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-21 17:36 Gilad Naaman
2021-07-21 20:18 ` Gilad Naaman [this message]
2021-07-22 15:05   ` Gilad Naaman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5CE212DA-0C5E-47B7-9612-4BE6EA224488@drivenets.com \
    --to=gnaaman@drivenets.com \
    --cc=netdev@vger.kernel.org \
    --subject='Re: Possible GRE-TAP/ECT bug' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).