Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH net 0/3] Fix PMTU for ESP-in-UDP encapsulation
@ 2021-07-12  0:55 Vadim Fedorenko
  2021-07-12  0:55 ` [PATCH net 1/3] udp: check for encap using encap_enable Vadim Fedorenko
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Vadim Fedorenko @ 2021-07-12  0:55 UTC (permalink / raw)
  To: David Ahern, Willem de Bruijn, Paolo Abeni, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev, Vadim Fedorenko

Bug 213669 uncovered regression in PMTU discovery for UDP-encapsulated
routes and some incorrect usage in udp tunnel fields. This series fixes
problems and also adds such case for selftests

Vadim Fedorenko (3):
  udp: check for encap using encap_enable
  udp: check encap socket in __udp_lib_err
  selftests: net: add ESP-in-UDP PMTU test

 drivers/infiniband/sw/rxe/rxe_net.c   |   1 -
 drivers/net/bareudp.c                 |   1 -
 drivers/net/geneve.c                  |   1 -
 drivers/net/vxlan.c                   |   1 -
 drivers/net/wireguard/socket.c        |   1 -
 net/ipv4/fou.c                        |   1 -
 net/ipv4/udp.c                        |  31 ++++++--
 net/ipv6/udp.c                        |  30 ++++++--
 net/sctp/protocol.c                   |   2 -
 net/tipc/udp_media.c                  |   1 -
 tools/testing/selftests/net/nettest.c |  55 +++++++++++++-
 tools/testing/selftests/net/pmtu.sh   | 104 +++++++++++++++++++++++++-
 12 files changed, 205 insertions(+), 24 deletions(-)

-- 
2.18.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net 1/3] udp: check for encap using encap_enable
  2021-07-12  0:55 [PATCH net 0/3] Fix PMTU for ESP-in-UDP encapsulation Vadim Fedorenko
@ 2021-07-12  0:55 ` Vadim Fedorenko
  2021-07-12  8:37   ` Paolo Abeni
  2021-07-12  0:55 ` [PATCH net 2/3] udp: check encap socket in __udp_lib_err Vadim Fedorenko
  2021-07-12  0:55 ` [PATCH net 3/3] selftests: net: add ESP-in-UDP PMTU test Vadim Fedorenko
  2 siblings, 1 reply; 18+ messages in thread
From: Vadim Fedorenko @ 2021-07-12  0:55 UTC (permalink / raw)
  To: David Ahern, Willem de Bruijn, Paolo Abeni, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev, Vadim Fedorenko

Usage of encap_type as a flag to determine encapsulation of udp
socket is not correct because there is special encap_enable field
for this check. Many network drivers use this field as a flag
instead of correctly indicate type of encapsulation. Remove such
usage and update all checks to use encap_enable field

Fixes: 60fb9567bf30 ("udp: implement complete book-keeping for encap_needed")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
---
 drivers/infiniband/sw/rxe/rxe_net.c | 1 -
 drivers/net/bareudp.c               | 1 -
 drivers/net/geneve.c                | 1 -
 drivers/net/vxlan.c                 | 1 -
 drivers/net/wireguard/socket.c      | 1 -
 net/ipv4/fou.c                      | 1 -
 net/ipv4/udp.c                      | 9 ++++-----
 net/ipv6/udp.c                      | 8 +++-----
 net/sctp/protocol.c                 | 2 --
 net/tipc/udp_media.c                | 1 -
 10 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
index 01662727dca0..f6515208e5af 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -212,7 +212,6 @@ static struct socket *rxe_setup_udp_tunnel(struct net *net, __be16 port,
 		return ERR_PTR(err);
 	}
 
-	tnl_cfg.encap_type = 1;
 	tnl_cfg.encap_rcv = rxe_udp_encap_recv;
 
 	/* Setup UDP tunnel */
diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c
index a7ee0af1af90..4c84b327bba0 100644
--- a/drivers/net/bareudp.c
+++ b/drivers/net/bareudp.c
@@ -236,7 +236,6 @@ static int bareudp_socket_create(struct bareudp_dev *bareudp, __be16 port)
 	/* Mark socket as an encapsulation socket */
 	memset(&tunnel_cfg, 0, sizeof(tunnel_cfg));
 	tunnel_cfg.sk_user_data = bareudp;
-	tunnel_cfg.encap_type = 1;
 	tunnel_cfg.encap_rcv = bareudp_udp_encap_recv;
 	tunnel_cfg.encap_err_lookup = bareudp_err_lookup;
 	tunnel_cfg.encap_destroy = NULL;
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 1ab94b5f9bbf..953a9306af98 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -590,7 +590,6 @@ static struct geneve_sock *geneve_socket_create(struct net *net, __be16 port,
 	/* Mark socket as an encapsulation socket */
 	memset(&tunnel_cfg, 0, sizeof(tunnel_cfg));
 	tunnel_cfg.sk_user_data = gs;
-	tunnel_cfg.encap_type = 1;
 	tunnel_cfg.gro_receive = geneve_gro_receive;
 	tunnel_cfg.gro_complete = geneve_gro_complete;
 	tunnel_cfg.encap_rcv = geneve_udp_encap_recv;
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 5a8df5a195cb..4eba44b1120e 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -3539,7 +3539,6 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, bool ipv6,
 	/* Mark socket as an encapsulation socket. */
 	memset(&tunnel_cfg, 0, sizeof(tunnel_cfg));
 	tunnel_cfg.sk_user_data = vs;
-	tunnel_cfg.encap_type = 1;
 	tunnel_cfg.encap_rcv = vxlan_rcv;
 	tunnel_cfg.encap_err_lookup = vxlan_err_lookup;
 	tunnel_cfg.encap_destroy = NULL;
diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c
index 8c496b747108..81669dd0e6cd 100644
--- a/drivers/net/wireguard/socket.c
+++ b/drivers/net/wireguard/socket.c
@@ -351,7 +351,6 @@ int wg_socket_init(struct wg_device *wg, u16 port)
 	int ret;
 	struct udp_tunnel_sock_cfg cfg = {
 		.sk_user_data = wg,
-		.encap_type = 1,
 		.encap_rcv = wg_receive
 	};
 	struct socket *new4 = NULL, *new6 = NULL;
diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
index e5f69b0bf3df..b80f56594e0a 100644
--- a/net/ipv4/fou.c
+++ b/net/ipv4/fou.c
@@ -590,7 +590,6 @@ static int fou_create(struct net *net, struct fou_cfg *cfg,
 	fou->sock = sock;
 
 	memset(&tunnel_cfg, 0, sizeof(tunnel_cfg));
-	tunnel_cfg.encap_type = 1;
 	tunnel_cfg.sk_user_data = fou;
 	tunnel_cfg.encap_destroy = NULL;
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 62cd4cd52e84..e5cb7fedfbcd 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
 	sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
 			       iph->saddr, uh->source, skb->dev->ifindex,
 			       inet_sdif(skb), udptable, NULL);
-	if (!sk || udp_sk(sk)->encap_type) {
+	if (!sk || udp_sk(sk)->encap_enabled) {
 		/* No socket for error: try tunnels before discarding */
 		sk = ERR_PTR(-ENOENT);
 		if (static_branch_unlikely(&udp_encap_needed_key)) {
@@ -2105,7 +2105,7 @@ static int udp_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
 		goto drop;
 	nf_reset_ct(skb);
 
-	if (static_branch_unlikely(&udp_encap_needed_key) && up->encap_type) {
+	if (static_branch_unlikely(&udp_encap_needed_key) && up->encap_enabled) {
 		int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
 
 		/*
@@ -2615,14 +2615,13 @@ void udp_destroy_sock(struct sock *sk)
 	udp_flush_pending_frames(sk);
 	unlock_sock_fast(sk, slow);
 	if (static_branch_unlikely(&udp_encap_needed_key)) {
-		if (up->encap_type) {
+		if (up->encap_enabled) {
 			void (*encap_destroy)(struct sock *sk);
 			encap_destroy = READ_ONCE(up->encap_destroy);
 			if (encap_destroy)
 				encap_destroy(sk);
-		}
-		if (up->encap_enabled)
 			static_branch_dec(&udp_encap_needed_key);
+		}
 	}
 }
 
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 0cc7ba531b34..798916d2e722 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -558,7 +558,7 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 
 	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
 			       inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
-	if (!sk || udp_sk(sk)->encap_type) {
+	if (!sk || udp_sk(sk)->encap_enabled) {
 		/* No socket for error: try tunnels before discarding */
 		sk = ERR_PTR(-ENOENT);
 		if (static_branch_unlikely(&udpv6_encap_needed_key)) {
@@ -661,7 +661,7 @@ static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
 	if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb))
 		goto drop;
 
-	if (static_branch_unlikely(&udpv6_encap_needed_key) && up->encap_type) {
+	if (static_branch_unlikely(&udpv6_encap_needed_key) && up->encap_enabled) {
 		int (*encap_rcv)(struct sock *sk, struct sk_buff *skb);
 
 		/*
@@ -1605,13 +1605,11 @@ void udpv6_destroy_sock(struct sock *sk)
 	release_sock(sk);
 
 	if (static_branch_unlikely(&udpv6_encap_needed_key)) {
-		if (up->encap_type) {
+		if (up->encap_enabled) {
 			void (*encap_destroy)(struct sock *sk);
 			encap_destroy = READ_ONCE(up->encap_destroy);
 			if (encap_destroy)
 				encap_destroy(sk);
-		}
-		if (up->encap_enabled) {
 			static_branch_dec(&udpv6_encap_needed_key);
 			udp_encap_disable();
 		}
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index ec0f52567c16..2eccb3f9122b 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -872,7 +872,6 @@ int sctp_udp_sock_start(struct net *net)
 		return err;
 	}
 
-	tuncfg.encap_type = 1;
 	tuncfg.encap_rcv = sctp_udp_rcv;
 	tuncfg.encap_err_lookup = sctp_udp_v4_err;
 	setup_udp_tunnel_sock(net, sock, &tuncfg);
@@ -894,7 +893,6 @@ int sctp_udp_sock_start(struct net *net)
 		return err;
 	}
 
-	tuncfg.encap_type = 1;
 	tuncfg.encap_rcv = sctp_udp_rcv;
 	tuncfg.encap_err_lookup = sctp_udp_v6_err;
 	setup_udp_tunnel_sock(net, sock, &tuncfg);
diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
index c2bb818704c8..da0c679dae37 100644
--- a/net/tipc/udp_media.c
+++ b/net/tipc/udp_media.c
@@ -771,7 +771,6 @@ static int tipc_udp_enable(struct net *net, struct tipc_bearer *b,
 	if (err)
 		goto err;
 	tuncfg.sk_user_data = ub;
-	tuncfg.encap_type = 1;
 	tuncfg.encap_rcv = tipc_udp_recv;
 	tuncfg.encap_destroy = NULL;
 	setup_udp_tunnel_sock(net, ub->ubsock, &tuncfg);
-- 
2.18.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net 2/3] udp: check encap socket in __udp_lib_err
  2021-07-12  0:55 [PATCH net 0/3] Fix PMTU for ESP-in-UDP encapsulation Vadim Fedorenko
  2021-07-12  0:55 ` [PATCH net 1/3] udp: check for encap using encap_enable Vadim Fedorenko
@ 2021-07-12  0:55 ` Vadim Fedorenko
  2021-07-12  7:59   ` Willem de Bruijn
  2021-07-12  9:07   ` Paolo Abeni
  2021-07-12  0:55 ` [PATCH net 3/3] selftests: net: add ESP-in-UDP PMTU test Vadim Fedorenko
  2 siblings, 2 replies; 18+ messages in thread
From: Vadim Fedorenko @ 2021-07-12  0:55 UTC (permalink / raw)
  To: David Ahern, Willem de Bruijn, Paolo Abeni, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev, Vadim Fedorenko

Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
added checks for encapsulated sockets but it broke cases when there is
no implementation of encap_err_lookup for encapsulation, i.e. ESP in
UDP encapsulation. Fix it by calling encap_err_lookup only if socket
implements this method otherwise treat it as legal socket.

Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
---
 net/ipv4/udp.c | 24 +++++++++++++++++++++++-
 net/ipv6/udp.c | 22 ++++++++++++++++++++++
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e5cb7fedfbcd..4980e0f19990 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -707,7 +707,29 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
 	sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
 			       iph->saddr, uh->source, skb->dev->ifindex,
 			       inet_sdif(skb), udptable, NULL);
-	if (!sk || udp_sk(sk)->encap_enabled) {
+	if (sk && udp_sk(sk)->encap_enabled) {
+		int (*lookup)(struct sock *sk, struct sk_buff *skb);
+
+		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
+		if (lookup) {
+			int network_offset, transport_offset;
+
+			network_offset = skb_network_offset(skb);
+			transport_offset = skb_transport_offset(skb);
+
+			/* Network header needs to point to the outer IPv4 header inside ICMP */
+			skb_reset_network_header(skb);
+
+			/* Transport header needs to point to the UDP header */
+			skb_set_transport_header(skb, iph->ihl << 2);
+			if (lookup(sk, skb))
+				sk = NULL;
+			skb_set_transport_header(skb, transport_offset);
+			skb_set_network_header(skb, network_offset);
+		}
+	}
+
+	if (!sk) {
 		/* No socket for error: try tunnels before discarding */
 		sk = ERR_PTR(-ENOENT);
 		if (static_branch_unlikely(&udp_encap_needed_key)) {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 798916d2e722..ed49a8589d9f 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -558,6 +558,28 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 
 	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
 			       inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
+	if (sk && udp_sk(sk)->encap_enabled) {
+		int (*lookup)(struct sock *sk, struct sk_buff *skb);
+
+		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
+		if (lookup) {
+			int network_offset, transport_offset;
+
+			network_offset = skb_network_offset(skb);
+			transport_offset = skb_transport_offset(skb);
+
+			/* Network header needs to point to the outer IPv6 header inside ICMP */
+			skb_reset_network_header(skb);
+
+			/* Transport header needs to point to the UDP header */
+			skb_set_transport_header(skb, offset);
+			if (lookup(sk, skb))
+				sk = NULL;
+			skb_set_transport_header(skb, transport_offset);
+			skb_set_network_header(skb, network_offset);
+		}
+	}
+
 	if (!sk || udp_sk(sk)->encap_enabled) {
 		/* No socket for error: try tunnels before discarding */
 		sk = ERR_PTR(-ENOENT);
-- 
2.18.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net 3/3] selftests: net: add ESP-in-UDP PMTU test
  2021-07-12  0:55 [PATCH net 0/3] Fix PMTU for ESP-in-UDP encapsulation Vadim Fedorenko
  2021-07-12  0:55 ` [PATCH net 1/3] udp: check for encap using encap_enable Vadim Fedorenko
  2021-07-12  0:55 ` [PATCH net 2/3] udp: check encap socket in __udp_lib_err Vadim Fedorenko
@ 2021-07-12  0:55 ` Vadim Fedorenko
  2 siblings, 0 replies; 18+ messages in thread
From: Vadim Fedorenko @ 2021-07-12  0:55 UTC (permalink / raw)
  To: David Ahern, Willem de Bruijn, Paolo Abeni, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev, Vadim Fedorenko

The case of ESP in UDP encapsulation was not covered before

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
---
 tools/testing/selftests/net/nettest.c |  55 +++++++++++++-
 tools/testing/selftests/net/pmtu.sh   | 104 +++++++++++++++++++++++++-
 2 files changed, 154 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/net/nettest.c b/tools/testing/selftests/net/nettest.c
index 6365c7fd1262..bd6288302094 100644
--- a/tools/testing/selftests/net/nettest.c
+++ b/tools/testing/selftests/net/nettest.c
@@ -11,9 +11,11 @@
 #include <sys/socket.h>
 #include <sys/wait.h>
 #include <linux/tcp.h>
+#include <linux/udp.h>
 #include <arpa/inet.h>
 #include <net/if.h>
 #include <netinet/in.h>
+#include <netinet/ip.h>
 #include <netdb.h>
 #include <fcntl.h>
 #include <libgen.h>
@@ -27,6 +29,10 @@
 #include <time.h>
 #include <errno.h>
 
+#include <linux/xfrm.h>
+#include <linux/ipsec.h>
+#include <linux/pfkeyv2.h>
+
 #ifndef IPV6_UNICAST_IF
 #define IPV6_UNICAST_IF         76
 #endif
@@ -114,6 +120,9 @@ struct sock_args {
 		struct in_addr  in;
 		struct in6_addr in6;
 	} expected_raddr;
+
+	/* ESP in UDP encap test */
+	int use_xfrm;
 };
 
 static int server_mode;
@@ -1346,6 +1355,41 @@ static int bind_socket(int sd, struct sock_args *args)
 	return 0;
 }
 
+static int config_xfrm_policy(int sd, struct sock_args *args)
+{
+	struct xfrm_userpolicy_info policy = {};
+	int type = UDP_ENCAP_ESPINUDP;
+	int xfrm_af = IP_XFRM_POLICY;
+	int level = SOL_IP;
+
+	if (args->type != SOCK_DGRAM) {
+		log_error("Invalid socket type. Only DGRAM could be used for XFRM\n");
+		return 1;
+	}
+
+	policy.action = XFRM_POLICY_ALLOW;
+	policy.sel.family = args->version;
+	if (args->version == AF_INET6) {
+		xfrm_af = IPV6_XFRM_POLICY;
+		level = SOL_IPV6;
+	}
+
+	policy.dir = XFRM_POLICY_OUT;
+	if (setsockopt(sd, level, xfrm_af, &policy, sizeof(policy)) < 0)
+		return 1;
+
+	policy.dir = XFRM_POLICY_IN;
+	if (setsockopt(sd, level, xfrm_af, &policy, sizeof(policy)) < 0)
+		return 1;
+
+	if (setsockopt(sd, IPPROTO_UDP, UDP_ENCAP, &type, sizeof(type)) < 0) {
+		log_err_errno("Failed to set xfrm encap");
+		return 1;
+	}
+
+	return 0;
+}
+
 static int lsock_init(struct sock_args *args)
 {
 	long flags;
@@ -1389,6 +1433,11 @@ static int lsock_init(struct sock_args *args)
 	if (fcntl(sd, F_SETFD, FD_CLOEXEC) < 0)
 		log_err_errno("Failed to set close-on-exec flag");
 
+	if (args->use_xfrm && config_xfrm_policy(sd, args)) {
+		log_err_errno("Failed to set xfrm policy");
+		goto err;
+	}
+
 out:
 	return sd;
 
@@ -1772,7 +1821,7 @@ static int ipc_parent(int cpid, int fd, struct sock_args *args)
 	return client_status;
 }
 
-#define GETOPT_STR  "sr:l:c:p:t:g:P:DRn:M:X:m:d:I:BN:O:SCi6L:0:1:2:3:Fbq"
+#define GETOPT_STR  "sr:l:c:p:t:g:P:DRn:M:X:m:d:I:BN:O:SCi6xL:0:1:2:3:Fbq"
 
 static void print_usage(char *prog)
 {
@@ -1795,6 +1844,7 @@ static void print_usage(char *prog)
 	"    -D|R          datagram (D) / raw (R) socket (default stream)\n"
 	"    -l addr       local address to bind to in server mode\n"
 	"    -c addr       local address to bind to in client mode\n"
+	"    -x            configure XFRM policy on socket\n"
 	"\n"
 	"    -d dev        bind socket to given device name\n"
 	"    -I dev        bind socket to given device name - server mode\n"
@@ -1966,6 +2016,9 @@ int main(int argc, char *argv[])
 		case 'q':
 			quiet = 1;
 			break;
+		case 'x':
+			args.use_xfrm = 1;
+			break;
 		default:
 			print_usage(argv[0]);
 			return 1;
diff --git a/tools/testing/selftests/net/pmtu.sh b/tools/testing/selftests/net/pmtu.sh
index 64cd2e23c568..e33152a55508 100755
--- a/tools/testing/selftests/net/pmtu.sh
+++ b/tools/testing/selftests/net/pmtu.sh
@@ -118,6 +118,9 @@
 #	below for IPv6 doesn't apply here, because, on IPv4, administrative MTU
 #	changes alone won't affect PMTU
 #
+# - pmtu_vti4_udp_exception
+#       Same as pmtu_vti4_exception, but using ESP-in-UDP
+#
 # - pmtu_vti6_exception
 #	Set up vti6 tunnel on top of veth, with xfrm states and policies, in two
 #	namespaces with matching endpoints. Check that route exception is
@@ -125,6 +128,9 @@
 #	decrease and increase MTU of tunnel, checking that route exception PMTU
 #	changes accordingly
 #
+# - pmtu_vti6_udp_exception
+#       Same as pmtu_vti6_exception, but using ESP-in-UDP
+#
 # - pmtu_vti4_default_mtu
 #	Set up vti4 tunnel on top of veth, in two namespaces with matching
 #	endpoints. Check that MTU assigned to vti interface is the MTU of the
@@ -224,6 +230,8 @@ tests="
 	pmtu_ipv6_ipv6_exception	IPv6 over IPv6: PMTU exceptions		1
 	pmtu_vti6_exception		vti6: PMTU exceptions			0
 	pmtu_vti4_exception		vti4: PMTU exceptions			0
+	pmtu_vti6_udp_exception		vti6: PMTU exceptions (ESP-in-UDP)	0
+	pmtu_vti4_udp_exception		vti4: PMTU exceptions (ESP-in-UDP)	0
 	pmtu_vti4_default_mtu		vti4: default MTU assignment		0
 	pmtu_vti6_default_mtu		vti6: default MTU assignment		0
 	pmtu_vti4_link_add_mtu		vti4: MTU setting on link creation	0
@@ -326,6 +334,7 @@ dummy6_mask="64"
 
 err_buf=
 tcpdump_pids=
+nettest_pids=
 
 err() {
 	err_buf="${err_buf}${1}
@@ -619,18 +628,30 @@ setup_xfrm() {
 	proto=${1}
 	veth_a_addr="${2}"
 	veth_b_addr="${3}"
+	encap=${4}
 
-	run_cmd ${ns_a} ip -${proto} xfrm state add src ${veth_a_addr} dst ${veth_b_addr} spi 0x1000 proto esp aead 'rfc4106(gcm(aes))' 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128 mode tunnel || return 1
-	run_cmd ${ns_a} ip -${proto} xfrm state add src ${veth_b_addr} dst ${veth_a_addr} spi 0x1001 proto esp aead 'rfc4106(gcm(aes))' 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128 mode tunnel
+	run_cmd ${ns_a} ip -${proto} xfrm state add src ${veth_a_addr} dst ${veth_b_addr} spi 0x1000 proto esp aead 'rfc4106(gcm(aes))' 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128 mode tunnel ${encap} || return 1
+	run_cmd ${ns_a} ip -${proto} xfrm state add src ${veth_b_addr} dst ${veth_a_addr} spi 0x1001 proto esp aead 'rfc4106(gcm(aes))' 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128 mode tunnel ${encap}
 	run_cmd ${ns_a} ip -${proto} xfrm policy add dir out mark 10 tmpl src ${veth_a_addr} dst ${veth_b_addr} proto esp mode tunnel
 	run_cmd ${ns_a} ip -${proto} xfrm policy add dir in mark 10 tmpl src ${veth_b_addr} dst ${veth_a_addr} proto esp mode tunnel
 
-	run_cmd ${ns_b} ip -${proto} xfrm state add src ${veth_a_addr} dst ${veth_b_addr} spi 0x1000 proto esp aead 'rfc4106(gcm(aes))' 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128 mode tunnel
-	run_cmd ${ns_b} ip -${proto} xfrm state add src ${veth_b_addr} dst ${veth_a_addr} spi 0x1001 proto esp aead 'rfc4106(gcm(aes))' 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128 mode tunnel
+	run_cmd ${ns_b} ip -${proto} xfrm state add src ${veth_a_addr} dst ${veth_b_addr} spi 0x1000 proto esp aead 'rfc4106(gcm(aes))' 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128 mode tunnel ${encap}
+	run_cmd ${ns_b} ip -${proto} xfrm state add src ${veth_b_addr} dst ${veth_a_addr} spi 0x1001 proto esp aead 'rfc4106(gcm(aes))' 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f 128 mode tunnel ${encap}
 	run_cmd ${ns_b} ip -${proto} xfrm policy add dir out mark 10 tmpl src ${veth_b_addr} dst ${veth_a_addr} proto esp mode tunnel
 	run_cmd ${ns_b} ip -${proto} xfrm policy add dir in mark 10 tmpl src ${veth_a_addr} dst ${veth_b_addr} proto esp mode tunnel
 }
 
+setup_nettest_xfrm() {
+	proto=${1}
+	port=${2}
+
+	run_cmd ${ns_a} nettest -${proto} -D -s -i -p ${port} 2> /dev/null &
+	nettest_pids="${nettest_pids} $!"
+
+	run_cmd ${ns_b} nettest -${proto} -D -s -i -p ${port} 2> /dev/null &
+	nettest_pids="${nettest_pids} $!"
+}
+
 setup_xfrm4() {
 	setup_xfrm 4 ${veth4_a_addr} ${veth4_b_addr}
 }
@@ -639,6 +660,16 @@ setup_xfrm6() {
 	setup_xfrm 6 ${veth6_a_addr} ${veth6_b_addr}
 }
 
+setup_xfrm4udp() {
+	setup_xfrm 4 ${veth4_a_addr} ${veth4_b_addr} "encap espinudp 4500 4500 0.0.0.0"
+	setup_nettest_xfrm 4 4500
+}
+
+setup_xfrm6udp() {
+	setup_xfrm 6 ${veth6_a_addr} ${veth6_b_addr} "encap espinudp 4500 4500 0.0.0.0"
+	setup_nettest_xfrm 6 4500
+}
+
 setup_routing_old() {
 	for i in ${routes}; do
 		[ "${ns}" = "" ]	&& ns="${i}"		&& continue
@@ -823,6 +854,11 @@ cleanup() {
 	done
 	tcpdump_pids=
 
+	for pid in ${nettest_pids}; do
+		kill ${pid}
+	done
+	nettest_pids=
+
 	for n in ${NS_A} ${NS_B} ${NS_C} ${NS_R1} ${NS_R2}; do
 		ip netns del ${n} 2> /dev/null
 	done
@@ -1432,6 +1468,66 @@ test_pmtu_vti6_exception() {
 	return ${fail}
 }
 
+test_pmtu_vti4_udp_exception() {
+	setup namespaces veth vti4 xfrm4udp || return $ksft_skip
+	trace "${ns_a}" veth_a    "${ns_b}" veth_b \
+	      "${ns_a}" vti4_a    "${ns_b}" vti4_b
+
+	veth_mtu=1500
+	vti_mtu=$((veth_mtu - 20))
+
+	#                                UDP   SPI   SN   IV  ICV   pad length   next header
+	esp_payload_rfc4106=$((vti_mtu - 8   - 4   - 4  - 8 - 16  - 1          - 1))
+	ping_payload=$((esp_payload_rfc4106 - 28))
+
+	mtu "${ns_a}" veth_a ${veth_mtu}
+	mtu "${ns_b}" veth_b ${veth_mtu}
+	mtu "${ns_a}" vti4_a ${vti_mtu}
+	mtu "${ns_b}" vti4_b ${vti_mtu}
+
+	# Send DF packet without exceeding link layer MTU, check that no
+	# exception is created
+	run_cmd ${ns_a} ping -q -M want -i 0.1 -w 1 -s ${ping_payload} ${tunnel4_b_addr}
+	pmtu="$(route_get_dst_pmtu_from_exception "${ns_a}" ${tunnel4_b_addr})"
+	check_pmtu_value "" "${pmtu}" "sending packet smaller than PMTU (IP payload length ${esp_payload_rfc4106})" || return 1
+
+	# Now exceed link layer MTU by one byte, check that exception is created
+	# with the right PMTU value
+	run_cmd ${ns_a} ping -q -M want -i 0.1 -w 1 -s $((ping_payload + 1)) ${tunnel4_b_addr}
+	pmtu="$(route_get_dst_pmtu_from_exception "${ns_a}" ${tunnel4_b_addr})"
+	check_pmtu_value "${esp_payload_rfc4106}" "${pmtu}" "exceeding PMTU (IP payload length $((esp_payload_rfc4106 + 1)))"
+}
+
+test_pmtu_vti6_udp_exception() {
+	setup namespaces veth vti6 xfrm6udp || return $ksft_skip
+	trace "${ns_a}" veth_a    "${ns_b}" veth_b \
+	      "${ns_a}" vti6_a    "${ns_b}" vti6_b
+	fail=0
+
+	# Create route exception by exceeding link layer MTU
+	mtu "${ns_a}" veth_a 4000
+	mtu "${ns_b}" veth_b 4000
+	mtu "${ns_a}" vti6_a 5000
+	mtu "${ns_b}" vti6_b 5000
+	run_cmd ${ns_a} ${ping6} -q -i 0.1 -w 1 -s 60000 ${tunnel6_b_addr}
+
+	# Check that exception was created
+	pmtu="$(route_get_dst_pmtu_from_exception "${ns_a}" ${tunnel6_b_addr})"
+	check_pmtu_value any "${pmtu}" "creating tunnel exceeding link layer MTU" || return 1
+
+	# Decrease tunnel MTU, check for PMTU decrease in route exception
+	mtu "${ns_a}" vti6_a 3000
+	pmtu="$(route_get_dst_pmtu_from_exception "${ns_a}" ${tunnel6_b_addr})"
+	check_pmtu_value "3000" "${pmtu}" "decreasing tunnel MTU" || fail=1
+
+	# Increase tunnel MTU, check for PMTU increase in route exception
+	mtu "${ns_a}" vti6_a 9000
+	pmtu="$(route_get_dst_pmtu_from_exception "${ns_a}" ${tunnel6_b_addr})"
+	check_pmtu_value "9000" "${pmtu}" "increasing tunnel MTU" || fail=1
+
+	return ${fail}
+}
+
 test_pmtu_vti4_default_mtu() {
 	setup namespaces veth vti4 || return $ksft_skip
 
-- 
2.18.4


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 2/3] udp: check encap socket in __udp_lib_err
  2021-07-12  0:55 ` [PATCH net 2/3] udp: check encap socket in __udp_lib_err Vadim Fedorenko
@ 2021-07-12  7:59   ` Willem de Bruijn
  2021-07-12 12:09     ` Vadim Fedorenko
  2021-07-12  9:07   ` Paolo Abeni
  1 sibling, 1 reply; 18+ messages in thread
From: Willem de Bruijn @ 2021-07-12  7:59 UTC (permalink / raw)
  To: Vadim Fedorenko
  Cc: David Ahern, Paolo Abeni, Xin Long, Jakub Kicinski,
	David S. Miller, netdev

On Mon, Jul 12, 2021 at 2:56 AM Vadim Fedorenko <vfedorenko@novek.ru> wrote:
>
> Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
> added checks for encapsulated sockets but it broke cases when there is
> no implementation of encap_err_lookup for encapsulation, i.e. ESP in
> UDP encapsulation. Fix it by calling encap_err_lookup only if socket
> implements this method otherwise treat it as legal socket.
>
> Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
> ---
>  net/ipv4/udp.c | 24 +++++++++++++++++++++++-
>  net/ipv6/udp.c | 22 ++++++++++++++++++++++
>  2 files changed, 45 insertions(+), 1 deletion(-)

This duplicates __udp4_lib_err_encap and __udp6_lib_err_encap.

Can we avoid open-coding that logic multiple times?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 1/3] udp: check for encap using encap_enable
  2021-07-12  0:55 ` [PATCH net 1/3] udp: check for encap using encap_enable Vadim Fedorenko
@ 2021-07-12  8:37   ` Paolo Abeni
  2021-07-12 12:32     ` Vadim Fedorenko
  0 siblings, 1 reply; 18+ messages in thread
From: Paolo Abeni @ 2021-07-12  8:37 UTC (permalink / raw)
  To: Vadim Fedorenko, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

Hello,

On Mon, 2021-07-12 at 03:55 +0300, Vadim Fedorenko wrote:
> Usage of encap_type as a flag to determine encapsulation of udp
> socket is not correct because there is special encap_enable field
> for this check. Many network drivers use this field as a flag
> instead of correctly indicate type of encapsulation. Remove such
> usage and update all checks to use encap_enable field

Uhmm... this looks quite dangerous to me. Apparently l2tp and gtp clear
'encap_type' to prevent the rx path pushing pkts into the encap on
shutdown. Will such tunnel's shutdown be safe with the above?

> Fixes: 60fb9567bf30 ("udp: implement complete book-keeping for encap_needed")

IMHO this not fix. Which bug are you observing that is addressed here?

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 2/3] udp: check encap socket in __udp_lib_err
  2021-07-12  0:55 ` [PATCH net 2/3] udp: check encap socket in __udp_lib_err Vadim Fedorenko
  2021-07-12  7:59   ` Willem de Bruijn
@ 2021-07-12  9:07   ` Paolo Abeni
  2021-07-12 12:45     ` Vadim Fedorenko
  1 sibling, 1 reply; 18+ messages in thread
From: Paolo Abeni @ 2021-07-12  9:07 UTC (permalink / raw)
  To: Vadim Fedorenko, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

Hello,

On Mon, 2021-07-12 at 03:55 +0300, Vadim Fedorenko wrote:
> Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
> added checks for encapsulated sockets but it broke cases when there is
> no implementation of encap_err_lookup for encapsulation, i.e. ESP in
> UDP encapsulation. Fix it by calling encap_err_lookup only if socket
> implements this method otherwise treat it as legal socket.
> 
> Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
> ---
>  net/ipv4/udp.c | 24 +++++++++++++++++++++++-
>  net/ipv6/udp.c | 22 ++++++++++++++++++++++
>  2 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index e5cb7fedfbcd..4980e0f19990 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -707,7 +707,29 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
>  	sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
>  			       iph->saddr, uh->source, skb->dev->ifindex,
>  			       inet_sdif(skb), udptable, NULL);
> -	if (!sk || udp_sk(sk)->encap_enabled) {
> +	if (sk && udp_sk(sk)->encap_enabled) {
> +		int (*lookup)(struct sock *sk, struct sk_buff *skb);
> +
> +		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
> +		if (lookup) {
> +			int network_offset, transport_offset;
> +
> +			network_offset = skb_network_offset(skb);
> +			transport_offset = skb_transport_offset(skb);
> +
> +			/* Network header needs to point to the outer IPv4 header inside ICMP */
> +			skb_reset_network_header(skb);
> +
> +			/* Transport header needs to point to the UDP header */
> +			skb_set_transport_header(skb, iph->ihl << 2);
> +			if (lookup(sk, skb))
> +				sk = NULL;
> +			skb_set_transport_header(skb, transport_offset);
> +			skb_set_network_header(skb, network_offset);
> +		}
> +	}
> +
> +	if (!sk) {
>  		/* No socket for error: try tunnels before discarding */
>  		sk = ERR_PTR(-ENOENT);
>  		if (static_branch_unlikely(&udp_encap_needed_key)) {
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index 798916d2e722..ed49a8589d9f 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -558,6 +558,28 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>  
>  	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
>  			       inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
> +	if (sk && udp_sk(sk)->encap_enabled) {
> +		int (*lookup)(struct sock *sk, struct sk_buff *skb);
> +
> +		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
> +		if (lookup) {
> +			int network_offset, transport_offset;
> +
> +			network_offset = skb_network_offset(skb);
> +			transport_offset = skb_transport_offset(skb);
> +
> +			/* Network header needs to point to the outer IPv6 header inside ICMP */
> +			skb_reset_network_header(skb);
> +
> +			/* Transport header needs to point to the UDP header */
> +			skb_set_transport_header(skb, offset);
> +			if (lookup(sk, skb))
> +				sk = NULL;
> +			skb_set_transport_header(skb, transport_offset);
> +			skb_set_network_header(skb, network_offset);
> +		}
> +	}

I can't follow this code. I guess that before d26796ae5894,
__udp6_lib_err() used to invoke ICMP processing on the ESP in UDP
socket, and after d26796ae5894 'sk' was cleared
by __udp4_lib_err_encap(), is that correct?

After this patch, the above chunk will not clear 'sk' for packets
targeting ESP in UDP sockets, but AFAICS we will still enter the
following conditional, preserving the current behavior - no ICMP
processing. 

Can you please clarify?

Why can't you use something alike the following instead?

---
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index c0f9f3260051..96a3b640e4da 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
        sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
                               iph->saddr, uh->source, skb->dev->ifindex,
                               inet_sdif(skb), udptable, NULL);
-       if (!sk || udp_sk(sk)->encap_type) {
+       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
                /* No socket for error: try tunnels before discarding */
                sk = ERR_PTR(-ENOENT);
                if (static_branch_unlikely(&udp_encap_needed_key)) {

---

Thanks!

/P


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 2/3] udp: check encap socket in __udp_lib_err
  2021-07-12  7:59   ` Willem de Bruijn
@ 2021-07-12 12:09     ` Vadim Fedorenko
  0 siblings, 0 replies; 18+ messages in thread
From: Vadim Fedorenko @ 2021-07-12 12:09 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Ahern, Paolo Abeni, Xin Long, Jakub Kicinski,
	David S. Miller, netdev

On 12.07.2021 08:59, Willem de Bruijn wrote:
> On Mon, Jul 12, 2021 at 2:56 AM Vadim Fedorenko <vfedorenko@novek.ru> wrote:
>>
>> Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
>> added checks for encapsulated sockets but it broke cases when there is
>> no implementation of encap_err_lookup for encapsulation, i.e. ESP in
>> UDP encapsulation. Fix it by calling encap_err_lookup only if socket
>> implements this method otherwise treat it as legal socket.
>>
>> Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
>> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
>> ---
>>   net/ipv4/udp.c | 24 +++++++++++++++++++++++-
>>   net/ipv6/udp.c | 22 ++++++++++++++++++++++
>>   2 files changed, 45 insertions(+), 1 deletion(-)
> 
> This duplicates __udp4_lib_err_encap and __udp6_lib_err_encap.
> 
> Can we avoid open-coding that logic multiple times?
> 
Yes, sure. I was thinking about the same but wanted to get a feedback
on approach itself. I will try to implement parts of that duplicates
as helpers next round.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 1/3] udp: check for encap using encap_enable
  2021-07-12  8:37   ` Paolo Abeni
@ 2021-07-12 12:32     ` Vadim Fedorenko
  2021-07-12 14:05       ` Paolo Abeni
  0 siblings, 1 reply; 18+ messages in thread
From: Vadim Fedorenko @ 2021-07-12 12:32 UTC (permalink / raw)
  To: Paolo Abeni, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

On 12.07.2021 09:37, Paolo Abeni wrote:
> Hello,
> 
> On Mon, 2021-07-12 at 03:55 +0300, Vadim Fedorenko wrote:
>> Usage of encap_type as a flag to determine encapsulation of udp
>> socket is not correct because there is special encap_enable field
>> for this check. Many network drivers use this field as a flag
>> instead of correctly indicate type of encapsulation. Remove such
>> usage and update all checks to use encap_enable field
> 
> Uhmm... this looks quite dangerous to me. Apparently l2tp and gtp clear
> 'encap_type' to prevent the rx path pushing pkts into the encap on
> shutdown. Will such tunnel's shutdown be safe with the above?
>
I think it's safe because all the code that checks for encap_enabled checks for
encap_rcv before invoking it and l2tp clears this handler. A bit different
situation with gtp where no clearing is done while encap destroy, so I think
the same approach could be done to clear the receive handler.

I also realised that there could be some imbalance on udp_encap_needed_key in
case of l2tp and gtp. I will try to investigate it a bit more.

>> Fixes: 60fb9567bf30 ("udp: implement complete book-keeping for encap_needed")
> 
> IMHO this not fix. Which bug are you observing that is addressed here?
> 
I thought that introduction of encap_enabled should go further to switch the
code to check this particular flag and leave encap_type as a description of
specific type (or subtype) of used encapsulation. That's why I added Fixes tag.
Correct me if I'm wrong on this assumption


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 2/3] udp: check encap socket in __udp_lib_err
  2021-07-12  9:07   ` Paolo Abeni
@ 2021-07-12 12:45     ` Vadim Fedorenko
  2021-07-12 13:37       ` Paolo Abeni
  0 siblings, 1 reply; 18+ messages in thread
From: Vadim Fedorenko @ 2021-07-12 12:45 UTC (permalink / raw)
  To: Paolo Abeni, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

On 12.07.2021 10:07, Paolo Abeni wrote:
> Hello,
> 
> On Mon, 2021-07-12 at 03:55 +0300, Vadim Fedorenko wrote:
>> Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
>> added checks for encapsulated sockets but it broke cases when there is
>> no implementation of encap_err_lookup for encapsulation, i.e. ESP in
>> UDP encapsulation. Fix it by calling encap_err_lookup only if socket
>> implements this method otherwise treat it as legal socket.
>>
>> Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
>> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
>> ---
>>   net/ipv4/udp.c | 24 +++++++++++++++++++++++-
>>   net/ipv6/udp.c | 22 ++++++++++++++++++++++
>>   2 files changed, 45 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
>> index e5cb7fedfbcd..4980e0f19990 100644
>> --- a/net/ipv4/udp.c
>> +++ b/net/ipv4/udp.c
>> @@ -707,7 +707,29 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
>>   	sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
>>   			       iph->saddr, uh->source, skb->dev->ifindex,
>>   			       inet_sdif(skb), udptable, NULL);
>> -	if (!sk || udp_sk(sk)->encap_enabled) {
>> +	if (sk && udp_sk(sk)->encap_enabled) {
>> +		int (*lookup)(struct sock *sk, struct sk_buff *skb);
>> +
>> +		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
>> +		if (lookup) {
>> +			int network_offset, transport_offset;
>> +
>> +			network_offset = skb_network_offset(skb);
>> +			transport_offset = skb_transport_offset(skb);
>> +
>> +			/* Network header needs to point to the outer IPv4 header inside ICMP */
>> +			skb_reset_network_header(skb);
>> +
>> +			/* Transport header needs to point to the UDP header */
>> +			skb_set_transport_header(skb, iph->ihl << 2);
>> +			if (lookup(sk, skb))
>> +				sk = NULL;
>> +			skb_set_transport_header(skb, transport_offset);
>> +			skb_set_network_header(skb, network_offset);
>> +		}
>> +	}
>> +
>> +	if (!sk) {
>>   		/* No socket for error: try tunnels before discarding */
>>   		sk = ERR_PTR(-ENOENT);
>>   		if (static_branch_unlikely(&udp_encap_needed_key)) {
>> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
>> index 798916d2e722..ed49a8589d9f 100644
>> --- a/net/ipv6/udp.c
>> +++ b/net/ipv6/udp.c
>> @@ -558,6 +558,28 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
>>   
>>   	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
>>   			       inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
>> +	if (sk && udp_sk(sk)->encap_enabled) {
>> +		int (*lookup)(struct sock *sk, struct sk_buff *skb);
>> +
>> +		lookup = READ_ONCE(udp_sk(sk)->encap_err_lookup);
>> +		if (lookup) {
>> +			int network_offset, transport_offset;
>> +
>> +			network_offset = skb_network_offset(skb);
>> +			transport_offset = skb_transport_offset(skb);
>> +
>> +			/* Network header needs to point to the outer IPv6 header inside ICMP */
>> +			skb_reset_network_header(skb);
>> +
>> +			/* Transport header needs to point to the UDP header */
>> +			skb_set_transport_header(skb, offset);
>> +			if (lookup(sk, skb))
>> +				sk = NULL;
>> +			skb_set_transport_header(skb, transport_offset);
>> +			skb_set_network_header(skb, network_offset);
>> +		}
>> +	}
> 
> I can't follow this code. I guess that before d26796ae5894,
> __udp6_lib_err() used to invoke ICMP processing on the ESP in UDP
> socket, and after d26796ae5894 'sk' was cleared
> by __udp4_lib_err_encap(), is that correct?

Actually it was cleared just before __udp4_lib_err_encap() and after
it we totally loose the information of socket found by __udp4_lib_lookup()
because __udp4_lib_err_encap() uses different combination of ports
(source and destination ports are exchanged) and could find different
socket.

> 
> After this patch, the above chunk will not clear 'sk' for packets
> targeting ESP in UDP sockets, but AFAICS we will still enter the
> following conditional, preserving the current behavior - no ICMP
> processing.

We will not enter following conditional for ESP in UDP case because
there is no more check for encap_type or encap_enabled. Just for
case of no udp socket as it was before d26796ae5894. But we still
have to check if the socket found by __udp4_lib_lookup() is correct
for received ICMP packet that's why I added code about encap_err_lookup.

I maybe missing something but d26796ae5894 doesn't actually explain
which particular situation should be avoided by this additional check
and no tests were added to simply reproduce the problem. If you can
explain it a bit more it would greatly help me to improve the fix.

Thanks
> 
> Can you please clarify?
> 
> Why can't you use something alike the following instead?
> 
> ---
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index c0f9f3260051..96a3b640e4da 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
>          sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
>                                 iph->saddr, uh->source, skb->dev->ifindex,
>                                 inet_sdif(skb), udptable, NULL);
> -       if (!sk || udp_sk(sk)->encap_type) {
> +       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
>                  /* No socket for error: try tunnels before discarding */
>                  sk = ERR_PTR(-ENOENT);
>                  if (static_branch_unlikely(&udp_encap_needed_key)) {
> 
> ---
> 
> Thanks!
> 
> /P
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 2/3] udp: check encap socket in __udp_lib_err
  2021-07-12 12:45     ` Vadim Fedorenko
@ 2021-07-12 13:37       ` Paolo Abeni
  2021-07-12 14:05         ` Vadim Fedorenko
  2021-07-16 17:50         ` Xin Long
  0 siblings, 2 replies; 18+ messages in thread
From: Paolo Abeni @ 2021-07-12 13:37 UTC (permalink / raw)
  To: Vadim Fedorenko, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

On Mon, 2021-07-12 at 13:45 +0100, Vadim Fedorenko wrote:
> 
> > After this patch, the above chunk will not clear 'sk' for packets
> > targeting ESP in UDP sockets, but AFAICS we will still enter the
> > following conditional, preserving the current behavior - no ICMP
> > processing.
> 
> We will not enter following conditional for ESP in UDP case because
> there is no more check for encap_type or encap_enabled. 

I see. You have a bug in the ipv6 code-path. With your patch applied:

---
 	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
                               inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
        if (sk && udp_sk(sk)->encap_enabled) {
		//...
        }

        if (!sk || udp_sk(sk)->encap_enabled) {
	// can still enter here...
---	

> I maybe missing something but d26796ae5894 doesn't actually explain
> which particular situation should be avoided by this additional check
> and no tests were added to simply reproduce the problem. If you can
> explain it a bit more it would greatly help me to improve the fix.

Xin knows better, but AFAICS it used to cover the situation you
explicitly tests in patch 3/3 - incoming packet with src-port == dst-
port == tunnel port - for e.g. vxlan tunnels.

> > Why can't you use something alike the following instead?
> > 
> > ---
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index c0f9f3260051..96a3b640e4da 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
> >          sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
> >                                 iph->saddr, uh->source, skb->dev->ifindex,
> >                                 inet_sdif(skb), udptable, NULL);
> > -       if (!sk || udp_sk(sk)->encap_type) {
> > +       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
> >                  /* No socket for error: try tunnels before discarding */
> >                  sk = ERR_PTR(-ENOENT);
> >                  if (static_branch_unlikely(&udp_encap_needed_key)) {
> > 
> > ---

Could you please have a look at the above ?

Thanks!

/P


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 1/3] udp: check for encap using encap_enable
  2021-07-12 12:32     ` Vadim Fedorenko
@ 2021-07-12 14:05       ` Paolo Abeni
  2021-07-12 14:13         ` Vadim Fedorenko
  0 siblings, 1 reply; 18+ messages in thread
From: Paolo Abeni @ 2021-07-12 14:05 UTC (permalink / raw)
  To: Vadim Fedorenko, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

On Mon, 2021-07-12 at 13:32 +0100, Vadim Fedorenko wrote:
> On 12.07.2021 09:37, Paolo Abeni wrote:
> > > Fixes: 60fb9567bf30 ("udp: implement complete book-keeping for encap_needed")
> > 
> > IMHO this not fix. Which bug are you observing that is addressed here?
> > 
> I thought that introduction of encap_enabled should go further to switch the
> code to check this particular flag and leave encap_type as a description of
> specific type (or subtype) of used encapsulation.

Than to me it looks more like a refactor than a fix. Is this strictly
needed by the following patch? if not, I suggest to consider net-next
as a target for this patch, or even better, drop it altogether. 

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 2/3] udp: check encap socket in __udp_lib_err
  2021-07-12 13:37       ` Paolo Abeni
@ 2021-07-12 14:05         ` Vadim Fedorenko
  2021-07-12 14:09           ` Paolo Abeni
  2021-07-16 17:50         ` Xin Long
  1 sibling, 1 reply; 18+ messages in thread
From: Vadim Fedorenko @ 2021-07-12 14:05 UTC (permalink / raw)
  To: Paolo Abeni, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

On 12.07.2021 14:37, Paolo Abeni wrote:
> On Mon, 2021-07-12 at 13:45 +0100, Vadim Fedorenko wrote:
>>
>>> After this patch, the above chunk will not clear 'sk' for packets
>>> targeting ESP in UDP sockets, but AFAICS we will still enter the
>>> following conditional, preserving the current behavior - no ICMP
>>> processing.
>>
>> We will not enter following conditional for ESP in UDP case because
>> there is no more check for encap_type or encap_enabled.
> 
> I see. You have a bug in the ipv6 code-path. With your patch applied:
> 
> ---
>   	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
>                                 inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
>          if (sk && udp_sk(sk)->encap_enabled) {
> 		//...
>          }
> 
>          if (!sk || udp_sk(sk)->encap_enabled) {
> 	// can still enter here...
> ---	
> 

Oh, my bad, thanks for catching this!

>> I maybe missing something but d26796ae5894 doesn't actually explain
>> which particular situation should be avoided by this additional check
>> and no tests were added to simply reproduce the problem. If you can
>> explain it a bit more it would greatly help me to improve the fix.
> 
> Xin knows better, but AFAICS it used to cover the situation you
> explicitly tests in patch 3/3 - incoming packet with src-port == dst-
> port == tunnel port - for e.g. vxlan tunnels.
>

Ok, so my assumption was like yours, that's good.

>>> Why can't you use something alike the following instead?
>>>
>>> ---
>>> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
>>> index c0f9f3260051..96a3b640e4da 100644
>>> --- a/net/ipv4/udp.c
>>> +++ b/net/ipv4/udp.c
>>> @@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
>>>           sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
>>>                                  iph->saddr, uh->source, skb->dev->ifindex,
>>>                                  inet_sdif(skb), udptable, NULL);
>>> -       if (!sk || udp_sk(sk)->encap_type) {
>>> +       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
>>>                   /* No socket for error: try tunnels before discarding */
>>>                   sk = ERR_PTR(-ENOENT);
>>>                   if (static_branch_unlikely(&udp_encap_needed_key)) {
>>>
>>> ---
> 
> Could you please have a look at the above ?
> 
Sure. The main problem I see here is that udp4_lib_lookup in udp_lib_err_encap
could return different socket because of different source and destination port
and in this case we will never check for correctness of originally found socket,
i.e. encap_err_lookup will never be called and the ICMP notification will never
be applied to that socket even if it passes checks.
My point is that it's simplier to explicitly check socket that was found than
rely on the result of udp4_lib_lookup with different inputs and leave the case
of no socket as it was before d26796ae5894.

If it's ok, I will unify the code for check as Willem suggested and resend v2.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 2/3] udp: check encap socket in __udp_lib_err
  2021-07-12 14:05         ` Vadim Fedorenko
@ 2021-07-12 14:09           ` Paolo Abeni
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Abeni @ 2021-07-12 14:09 UTC (permalink / raw)
  To: Vadim Fedorenko, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

On Mon, 2021-07-12 at 15:05 +0100, Vadim Fedorenko wrote:
> On 12.07.2021 14:37, Paolo Abeni wrote:
> > On Mon, 2021-07-12 at 13:45 +0100, Vadim Fedorenko wrote:
> > > > After this patch, the above chunk will not clear 'sk' for packets
> > > > targeting ESP in UDP sockets, but AFAICS we will still enter the
> > > > following conditional, preserving the current behavior - no ICMP
> > > > processing.
> > > 
> > > We will not enter following conditional for ESP in UDP case because
> > > there is no more check for encap_type or encap_enabled.
> > 
> > I see. You have a bug in the ipv6 code-path. With your patch applied:
> > 
> > ---
> >   	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
> >                                 inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
> >          if (sk && udp_sk(sk)->encap_enabled) {
> > 		//...
> >          }
> > 
> >          if (!sk || udp_sk(sk)->encap_enabled) {
> > 	// can still enter here...
> > ---	
> > 
> 
> Oh, my bad, thanks for catching this!
> 
> > > I maybe missing something but d26796ae5894 doesn't actually explain
> > > which particular situation should be avoided by this additional check
> > > and no tests were added to simply reproduce the problem. If you can
> > > explain it a bit more it would greatly help me to improve the fix.
> > 
> > Xin knows better, but AFAICS it used to cover the situation you
> > explicitly tests in patch 3/3 - incoming packet with src-port == dst-
> > port == tunnel port - for e.g. vxlan tunnels.
> > 
> 
> Ok, so my assumption was like yours, that's good.
> 
> > > > Why can't you use something alike the following instead?
> > > > 
> > > > ---
> > > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > > > index c0f9f3260051..96a3b640e4da 100644
> > > > --- a/net/ipv4/udp.c
> > > > +++ b/net/ipv4/udp.c
> > > > @@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
> > > >           sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
> > > >                                  iph->saddr, uh->source, skb->dev->ifindex,
> > > >                                  inet_sdif(skb), udptable, NULL);
> > > > -       if (!sk || udp_sk(sk)->encap_type) {
> > > > +       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
> > > >                   /* No socket for error: try tunnels before discarding */
> > > >                   sk = ERR_PTR(-ENOENT);
> > > >                   if (static_branch_unlikely(&udp_encap_needed_key)) {
> > > > 
> > > > ---
> > 
> > Could you please have a look at the above ?
> > 
> Sure. The main problem I see here is that udp4_lib_lookup in udp_lib_err_encap
> could return different socket because of different source and destination port
> and in this case we will never check for correctness of originally found socket,
> i.e. encap_err_lookup will never be called and the ICMP notification will never
> be applied to that socket even if it passes checks.
> My point is that it's simplier to explicitly check socket that was found than
> rely on the result of udp4_lib_lookup with different inputs and leave the case
> of no socket as it was before d26796ae5894.
> 
> If it's ok, I will unify the code for check as Willem suggested and resend v2.

If the final code is small enough, please go ahead with that.

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 1/3] udp: check for encap using encap_enable
  2021-07-12 14:05       ` Paolo Abeni
@ 2021-07-12 14:13         ` Vadim Fedorenko
  2021-07-12 14:33           ` Paolo Abeni
  0 siblings, 1 reply; 18+ messages in thread
From: Vadim Fedorenko @ 2021-07-12 14:13 UTC (permalink / raw)
  To: Paolo Abeni, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

On 12.07.2021 15:05, Paolo Abeni wrote:
> On Mon, 2021-07-12 at 13:32 +0100, Vadim Fedorenko wrote:
>> On 12.07.2021 09:37, Paolo Abeni wrote:
>>>> Fixes: 60fb9567bf30 ("udp: implement complete book-keeping for encap_needed")
>>>
>>> IMHO this not fix. Which bug are you observing that is addressed here?
>>>
>> I thought that introduction of encap_enabled should go further to switch the
>> code to check this particular flag and leave encap_type as a description of
>> specific type (or subtype) of used encapsulation.
> 
> Than to me it looks more like a refactor than a fix. Is this strictly
> needed by the following patch? if not, I suggest to consider net-next
> as a target for this patch, or even better, drop it altogether.
> 
Looks like it isn't strictly needed for the following patch. Do you think that
such refactor would lead to more harm than benefits provided by clearness of
usage of encap_enable and encap_type fields? I mean why do you think that it's
better to drop it?
Thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 1/3] udp: check for encap using encap_enable
  2021-07-12 14:13         ` Vadim Fedorenko
@ 2021-07-12 14:33           ` Paolo Abeni
  2021-07-12 16:27             ` Vadim Fedorenko
  0 siblings, 1 reply; 18+ messages in thread
From: Paolo Abeni @ 2021-07-12 14:33 UTC (permalink / raw)
  To: Vadim Fedorenko, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

On Mon, 2021-07-12 at 15:13 +0100, Vadim Fedorenko wrote:
> On 12.07.2021 15:05, Paolo Abeni wrote:
> > On Mon, 2021-07-12 at 13:32 +0100, Vadim Fedorenko wrote:
> > > On 12.07.2021 09:37, Paolo Abeni wrote:
> > > > > Fixes: 60fb9567bf30 ("udp: implement complete book-keeping for encap_needed")
> > > > 
> > > > IMHO this not fix. Which bug are you observing that is addressed here?
> > > > 
> > > I thought that introduction of encap_enabled should go further to switch the
> > > code to check this particular flag and leave encap_type as a description of
> > > specific type (or subtype) of used encapsulation.
> > 
> > Than to me it looks more like a refactor than a fix. Is this strictly
> > needed by the following patch? if not, I suggest to consider net-next
> > as a target for this patch, or even better, drop it altogether.
> > 
> Looks like it isn't strictly needed for the following patch. Do you think that
> such refactor would lead to more harm than benefits provided by clearness of
> usage of encap_enable and encap_type fields? 

Yes. That patch is invasive and the clarification is quite subjective
IMHO.

Cheers,

Paolo


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 1/3] udp: check for encap using encap_enable
  2021-07-12 14:33           ` Paolo Abeni
@ 2021-07-12 16:27             ` Vadim Fedorenko
  0 siblings, 0 replies; 18+ messages in thread
From: Vadim Fedorenko @ 2021-07-12 16:27 UTC (permalink / raw)
  To: Paolo Abeni, David Ahern, Willem de Bruijn, Xin Long
  Cc: Jakub Kicinski, David S. Miller, netdev

On 12.07.2021 15:33, Paolo Abeni wrote:
> On Mon, 2021-07-12 at 15:13 +0100, Vadim Fedorenko wrote:
>> On 12.07.2021 15:05, Paolo Abeni wrote:
>>> On Mon, 2021-07-12 at 13:32 +0100, Vadim Fedorenko wrote:
>>>> On 12.07.2021 09:37, Paolo Abeni wrote:
>>>>>> Fixes: 60fb9567bf30 ("udp: implement complete book-keeping for encap_needed")
>>>>>
>>>>> IMHO this not fix. Which bug are you observing that is addressed here?
>>>>>
>>>> I thought that introduction of encap_enabled should go further to switch the
>>>> code to check this particular flag and leave encap_type as a description of
>>>> specific type (or subtype) of used encapsulation.
>>>
>>> Than to me it looks more like a refactor than a fix. Is this strictly
>>> needed by the following patch? if not, I suggest to consider net-next
>>> as a target for this patch, or even better, drop it altogether.
>>>
>> Looks like it isn't strictly needed for the following patch. Do you think that
>> such refactor would lead to more harm than benefits provided by clearness of
>> usage of encap_enable and encap_type fields?
> 
> Yes. That patch is invasive and the clarification is quite subjective
> IMHO.
>
Ok, no problem, will drop it in next iteration.
Thanks


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net 2/3] udp: check encap socket in __udp_lib_err
  2021-07-12 13:37       ` Paolo Abeni
  2021-07-12 14:05         ` Vadim Fedorenko
@ 2021-07-16 17:50         ` Xin Long
  1 sibling, 0 replies; 18+ messages in thread
From: Xin Long @ 2021-07-16 17:50 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Vadim Fedorenko, David Ahern, Willem de Bruijn, Jakub Kicinski,
	David S. Miller, network dev

On Mon, Jul 12, 2021 at 9:37 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Mon, 2021-07-12 at 13:45 +0100, Vadim Fedorenko wrote:
> >
> > > After this patch, the above chunk will not clear 'sk' for packets
> > > targeting ESP in UDP sockets, but AFAICS we will still enter the
> > > following conditional, preserving the current behavior - no ICMP
> > > processing.
> >
> > We will not enter following conditional for ESP in UDP case because
> > there is no more check for encap_type or encap_enabled.
>
> I see. You have a bug in the ipv6 code-path. With your patch applied:
>
> ---
>         sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
>                                inet6_iif(skb), inet6_sdif(skb), udptable, NULL);
>         if (sk && udp_sk(sk)->encap_enabled) {
>                 //...
>         }
>
>         if (!sk || udp_sk(sk)->encap_enabled) {
>         // can still enter here...
> ---
>
> > I maybe missing something but d26796ae5894 doesn't actually explain
> > which particular situation should be avoided by this additional check
> > and no tests were added to simply reproduce the problem. If you can
> > explain it a bit more it would greatly help me to improve the fix.
>
> Xin knows better, but AFAICS it used to cover the situation you
> explicitly tests in patch 3/3 - incoming packet with src-port == dst-
> port == tunnel port - for e.g. vxlan tunnels.
Thanks Paolo and sorry for late.

Right, __udp4/6_lib_err_encap() was introduced to process the ICMP error
packets for UDP tunnels. But it will only work when there's no socket
found with src + dst port, as when the src == dst port a socket might
be found(if the bind addr is ANY) and the code will be called.



>
> > > Why can't you use something alike the following instead?
> > >
> > > ---
> > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > > index c0f9f3260051..96a3b640e4da 100644
> > > --- a/net/ipv4/udp.c
> > > +++ b/net/ipv4/udp.c
> > > @@ -707,7 +707,7 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable)
> > >          sk = __udp4_lib_lookup(net, iph->daddr, uh->dest,
> > >                                 iph->saddr, uh->source, skb->dev->ifindex,
> > >                                 inet_sdif(skb), udptable, NULL);
> > > -       if (!sk || udp_sk(sk)->encap_type) {
> > > +       if (!sk || READ_ONCE(udp_sk(sk)->encap_err_lookup)) {
> > >                  /* No socket for error: try tunnels before discarding */
> > >                  sk = ERR_PTR(-ENOENT);
> > >                  if (static_branch_unlikely(&udp_encap_needed_key)) {
> > >
> > > ---
>
> Could you please have a look at the above ?
If not all udp tunnels want to do further validation for ICMP error packet,
This looks good to me.

>
> Thanks!
>
> /P
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-07-16 17:50 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-12  0:55 [PATCH net 0/3] Fix PMTU for ESP-in-UDP encapsulation Vadim Fedorenko
2021-07-12  0:55 ` [PATCH net 1/3] udp: check for encap using encap_enable Vadim Fedorenko
2021-07-12  8:37   ` Paolo Abeni
2021-07-12 12:32     ` Vadim Fedorenko
2021-07-12 14:05       ` Paolo Abeni
2021-07-12 14:13         ` Vadim Fedorenko
2021-07-12 14:33           ` Paolo Abeni
2021-07-12 16:27             ` Vadim Fedorenko
2021-07-12  0:55 ` [PATCH net 2/3] udp: check encap socket in __udp_lib_err Vadim Fedorenko
2021-07-12  7:59   ` Willem de Bruijn
2021-07-12 12:09     ` Vadim Fedorenko
2021-07-12  9:07   ` Paolo Abeni
2021-07-12 12:45     ` Vadim Fedorenko
2021-07-12 13:37       ` Paolo Abeni
2021-07-12 14:05         ` Vadim Fedorenko
2021-07-12 14:09           ` Paolo Abeni
2021-07-16 17:50         ` Xin Long
2021-07-12  0:55 ` [PATCH net 3/3] selftests: net: add ESP-in-UDP PMTU test Vadim Fedorenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).