Netdev Archive on lore.kernel.org
* [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket
@ 2021-07-27  0:12 Jiang Wang
  2021-07-27  0:12 ` [PATCH bpf-next v1 1/5] af_unix: add read_sock for stream socket types Jiang Wang
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Jiang Wang @ 2021-07-27  0:12 UTC (permalink / raw)
  To: netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

This patch series adds sockmap support for the unix stream
socket type. Sockmap already supports TCP, UDP, and unix
dgram sockets. The unix stream support is similar to the
unix dgram support.

Also add selftests for the unix stream type to the sockmap tests.


Jiang Wang (5):
  af_unix: add read_sock for stream socket types
  af_unix: add unix_stream_proto for sockmap
  selftest/bpf: add tests for sockmap with unix stream type.
  selftest/bpf: change udp to inet in some function names
  selftest/bpf: add new tests in sockmap for unix stream to tcp.

 include/net/af_unix.h                         |  8 +-
 net/core/sock_map.c                           |  8 +-
 net/unix/af_unix.c                            | 89 ++++++++++++++++--
 net/unix/unix_bpf.c                           | 93 ++++++++++++++-----
 .../selftests/bpf/prog_tests/sockmap_listen.c | 48 ++++++----
 5 files changed, 194 insertions(+), 52 deletions(-)

-- 
2.20.1



* [PATCH bpf-next v1 1/5] af_unix: add read_sock for stream socket types
  2021-07-27  0:12 [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket Jiang Wang
@ 2021-07-27  0:12 ` Jiang Wang
  2021-07-29  8:37   ` Jakub Sitnicki
  2021-07-27  0:12 ` [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap Jiang Wang
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Jiang Wang @ 2021-07-27  0:12 UTC (permalink / raw)
  To: netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

To support sockmap for af_unix stream type, implement
read_sock, which is similar to the read_sock for unix
dgram sockets.

Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
---
 net/unix/af_unix.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 89927678c..32eeb4a6a 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -672,6 +672,8 @@ static int unix_dgram_sendmsg(struct socket *, struct msghdr *, size_t);
 static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int);
 static int unix_read_sock(struct sock *sk, read_descriptor_t *desc,
 			  sk_read_actor_t recv_actor);
+static int unix_stream_read_sock(struct sock *sk, read_descriptor_t *desc,
+				 sk_read_actor_t recv_actor);
 static int unix_dgram_connect(struct socket *, struct sockaddr *,
 			      int, int);
 static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t);
@@ -725,6 +727,7 @@ static const struct proto_ops unix_stream_ops = {
 	.shutdown =	unix_shutdown,
 	.sendmsg =	unix_stream_sendmsg,
 	.recvmsg =	unix_stream_recvmsg,
+	.read_sock =	unix_stream_read_sock,
 	.mmap =		sock_no_mmap,
 	.sendpage =	unix_stream_sendpage,
 	.splice_read =	unix_stream_splice_read,
@@ -2311,6 +2314,15 @@ struct unix_stream_read_state {
 	unsigned int splice_flags;
 };
 
+static int unix_stream_read_sock(struct sock *sk, read_descriptor_t *desc,
+				 sk_read_actor_t recv_actor)
+{
+	if (unlikely(sk->sk_state != TCP_ESTABLISHED))
+		return -EINVAL;
+
+	return unix_read_sock(sk, desc, recv_actor);
+}
+
 static int unix_stream_read_generic(struct unix_stream_read_state *state,
 				    bool freezable)
 {
-- 
2.20.1
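
The guard added by unix_stream_read_sock() can be pictured with a small
userspace model. This is only a sketch: struct model_sock, model_actor_t and
the MODEL_* states are hypothetical stand-ins rather than kernel definitions,
and model_read_sock() greatly simplifies what unix_read_sock() does. It shows
the one behavior the patch adds: a stream socket that is not connected is
rejected with -EINVAL before the shared read path runs.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
enum model_state { MODEL_ESTABLISHED = 1, MODEL_CLOSE = 7 };

struct model_sock {
	enum model_state state;
	int queued;	/* bytes waiting in the receive queue */
};

typedef int (*model_actor_t)(struct model_sock *sk, int len);

/* Shared reader: offers the queued bytes to the actor (simplified). */
static int model_read_sock(struct model_sock *sk, model_actor_t actor)
{
	int used = actor(sk, sk->queued);

	if (used > 0)
		sk->queued -= used;
	return used;
}

/* Stream wrapper: refuse to read unless the socket is connected. */
static int model_stream_read_sock(struct model_sock *sk, model_actor_t actor)
{
	if (sk->state != MODEL_ESTABLISHED)
		return -EINVAL;

	return model_read_sock(sk, actor);
}

/* Example actor that consumes everything it is offered. */
static int model_consume_all(struct model_sock *sk, int len)
{
	(void)sk;
	return len;
}
```

With a connected model socket the actor sees the queued bytes; with any other
state the queue is left untouched and the caller gets -EINVAL.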



* [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap
  2021-07-27  0:12 [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket Jiang Wang
  2021-07-27  0:12 ` [PATCH bpf-next v1 1/5] af_unix: add read_sock for stream socket types Jiang Wang
@ 2021-07-27  0:12 ` Jiang Wang
  2021-07-27 16:37   ` John Fastabend
  2021-07-27  0:12 ` [PATCH bpf-next v1 3/5] selftest/bpf: add tests for sockmap with unix stream type Jiang Wang
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Jiang Wang @ 2021-07-27  0:12 UTC (permalink / raw)
  To: netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

Add unix_stream_proto, which is similar to unix_dgram_proto,
and implement the stream-related sockmap functions.

Add the dgram keyword to the names of the dgram-specific functions.

Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
---
 include/net/af_unix.h |  8 +++-
 net/core/sock_map.c   |  8 +++-
 net/unix/af_unix.c    | 77 ++++++++++++++++++++++++++++++-----
 net/unix/unix_bpf.c   | 93 +++++++++++++++++++++++++++++++++----------
 4 files changed, 151 insertions(+), 35 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index 435a2c3d5..5d04fbf8a 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -84,6 +84,8 @@ long unix_outq_len(struct sock *sk);
 
 int __unix_dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t size,
 			 int flags);
+int __unix_stream_recvmsg(struct sock *sk, struct msghdr *msg, size_t size,
+			  int flags);
 #ifdef CONFIG_SYSCTL
 int unix_sysctl_register(struct net *net);
 void unix_sysctl_unregister(struct net *net);
@@ -93,9 +95,11 @@ static inline void unix_sysctl_unregister(struct net *net) {}
 #endif
 
 #ifdef CONFIG_BPF_SYSCALL
-extern struct proto unix_proto;
+extern struct proto unix_dgram_proto;
+extern struct proto unix_stream_proto;
 
-int unix_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore);
+int unix_dgram_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore);
+int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore);
 void __init unix_bpf_build_proto(void);
 #else
 static inline void __init unix_bpf_build_proto(void)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index ae5fa4338..42f50ea7a 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -517,9 +517,15 @@ static bool sk_is_tcp(const struct sock *sk)
 	       sk->sk_protocol == IPPROTO_TCP;
 }
 
+static bool sk_is_unix_stream(const struct sock *sk)
+{
+	return sk->sk_type == SOCK_STREAM &&
+	       sk->sk_protocol == PF_UNIX;
+}
+
 static bool sock_map_redirect_allowed(const struct sock *sk)
 {
-	if (sk_is_tcp(sk))
+	if (sk_is_tcp(sk) || sk_is_unix_stream(sk))
 		return sk->sk_state != TCP_LISTEN;
 	else
 		return sk->sk_state == TCP_ESTABLISHED;
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 32eeb4a6a..c68d13f61 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -791,17 +791,35 @@ static void unix_close(struct sock *sk, long timeout)
 	 */
 }
 
-struct proto unix_proto = {
-	.name			= "UNIX",
+static void unix_unhash(struct sock *sk)
+{
+	/* Nothing to do here, unix socket does not need a ->unhash().
+	 * This is merely for sockmap.
+	 */
+}
+
+struct proto unix_dgram_proto = {
+	.name			= "UNIX-DGRAM",
+	.owner			= THIS_MODULE,
+	.obj_size		= sizeof(struct unix_sock),
+	.close			= unix_close,
+#ifdef CONFIG_BPF_SYSCALL
+	.psock_update_sk_prot	= unix_dgram_bpf_update_proto,
+#endif
+};
+
+struct proto unix_stream_proto = {
+	.name			= "UNIX-STREAM",
 	.owner			= THIS_MODULE,
 	.obj_size		= sizeof(struct unix_sock),
 	.close			= unix_close,
+	.unhash			= unix_unhash,
 #ifdef CONFIG_BPF_SYSCALL
-	.psock_update_sk_prot	= unix_bpf_update_proto,
+	.psock_update_sk_prot	= unix_stream_bpf_update_proto,
 #endif
 };
 
-static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
+static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, int type)
 {
 	struct sock *sk = NULL;
 	struct unix_sock *u;
@@ -810,7 +828,17 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
 	if (atomic_long_read(&unix_nr_socks) > 2 * get_max_files())
 		goto out;
 
-	sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_proto, kern);
+	if (type != 0) {
+		if (type == SOCK_STREAM)
+			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_stream_proto, kern);
+		else /*for seqpacket */
+			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_dgram_proto, kern);
+	} else {
+		if (sock->type == SOCK_STREAM)
+			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_stream_proto, kern);
+		else
+			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_dgram_proto, kern);
+	}
 	if (!sk)
 		goto out;
 
@@ -872,7 +900,7 @@ static int unix_create(struct net *net, struct socket *sock, int protocol,
 		return -ESOCKTNOSUPPORT;
 	}
 
-	return unix_create1(net, sock, kern) ? 0 : -ENOMEM;
+	return unix_create1(net, sock, kern, 0) ? 0 : -ENOMEM;
 }
 
 static int unix_release(struct socket *sock)
@@ -1286,7 +1314,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
 	err = -ENOMEM;
 
 	/* create new sock for complete connection */
-	newsk = unix_create1(sock_net(sk), NULL, 0);
+	newsk = unix_create1(sock_net(sk), NULL, 0, sock->type);
 	if (newsk == NULL)
 		goto out;
 
@@ -2214,7 +2242,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
 	struct sock *sk = sock->sk;
 
 #ifdef CONFIG_BPF_SYSCALL
-	if (sk->sk_prot != &unix_proto)
+	if (sk->sk_prot != &unix_dgram_proto)
 		return sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
 					    flags & ~MSG_DONTWAIT, NULL);
 #endif
@@ -2533,6 +2561,21 @@ static int unix_stream_read_actor(struct sk_buff *skb,
 	return ret ?: chunk;
 }
 
+int __unix_stream_recvmsg(struct sock *sk, struct msghdr *msg,
+			  size_t size, int flags)
+{
+	struct socket *sock = sk->sk_socket;
+	struct unix_stream_read_state state = {
+		.recv_actor = unix_stream_read_actor,
+		.socket = sock,
+		.msg = msg,
+		.size = size,
+		.flags = flags
+	};
+
+	return unix_stream_read_generic(&state, true);
+}
+
 static int unix_stream_recvmsg(struct socket *sock, struct msghdr *msg,
 			       size_t size, int flags)
 {
@@ -2544,6 +2587,13 @@ static int unix_stream_recvmsg(struct socket *sock, struct msghdr *msg,
 		.flags = flags
 	};
 
+	struct sock *sk = sock->sk;
+
+#ifdef CONFIG_BPF_SYSCALL
+	if (sk->sk_prot != &unix_stream_proto)
+		return sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
+					    flags & ~MSG_DONTWAIT, NULL);
+#endif
 	return unix_stream_read_generic(&state, true);
 }
 
@@ -2993,7 +3043,13 @@ static int __init af_unix_init(void)
 
 	BUILD_BUG_ON(sizeof(struct unix_skb_parms) > sizeof_field(struct sk_buff, cb));
 
-	rc = proto_register(&unix_proto, 1);
+	rc = proto_register(&unix_dgram_proto, 1);
+	if (rc != 0) {
+		pr_crit("%s: Cannot create unix_sock SLAB cache!\n", __func__);
+		goto out;
+	}
+
+	rc = proto_register(&unix_stream_proto, 1);
 	if (rc != 0) {
 		pr_crit("%s: Cannot create unix_sock SLAB cache!\n", __func__);
 		goto out;
@@ -3009,7 +3065,8 @@ static int __init af_unix_init(void)
 static void __exit af_unix_exit(void)
 {
 	sock_unregister(PF_UNIX);
-	proto_unregister(&unix_proto);
+	proto_unregister(&unix_dgram_proto);
+	proto_unregister(&unix_stream_proto);
 	unregister_pernet_subsys(&unix_net_ops);
 }
 
diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
index db0cda29f..9067210d3 100644
--- a/net/unix/unix_bpf.c
+++ b/net/unix/unix_bpf.c
@@ -38,9 +38,18 @@ static int unix_msg_wait_data(struct sock *sk, struct sk_psock *psock,
 	return ret;
 }
 
-static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
-				  size_t len, int nonblock, int flags,
-				  int *addr_len)
+static int __unix_recvmsg(struct sock *sk, struct msghdr *msg,
+			   size_t len, int flags)
+{
+	if (sk->sk_type == SOCK_DGRAM)
+		return __unix_dgram_recvmsg(sk, msg, len, flags);
+	else
+		return __unix_stream_recvmsg(sk, msg, len, flags);
+}
+
+static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
+			    size_t len, int nonblock, int flags,
+			    int *addr_len)
 {
 	struct unix_sock *u = unix_sk(sk);
 	struct sk_psock *psock;
@@ -48,12 +57,12 @@ static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
 
 	psock = sk_psock_get(sk);
 	if (unlikely(!psock))
-		return __unix_dgram_recvmsg(sk, msg, len, flags);
+		return __unix_recvmsg(sk, msg, len, flags);
 
 	mutex_lock(&u->iolock);
 	if (!skb_queue_empty(&sk->sk_receive_queue) &&
 	    sk_psock_queue_empty(psock)) {
-		ret = __unix_dgram_recvmsg(sk, msg, len, flags);
+		ret = __unix_recvmsg(sk, msg, len, flags);
 		goto out;
 	}
 
@@ -68,7 +77,7 @@ static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
 		if (data) {
 			if (!sk_psock_queue_empty(psock))
 				goto msg_bytes_ready;
-			ret = __unix_dgram_recvmsg(sk, msg, len, flags);
+			ret = __unix_recvmsg(sk, msg, len, flags);
 			goto out;
 		}
 		copied = -EAGAIN;
@@ -80,30 +89,55 @@ static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
 	return ret;
 }
 
-static struct proto *unix_prot_saved __read_mostly;
-static DEFINE_SPINLOCK(unix_prot_lock);
-static struct proto unix_bpf_prot;
+static struct proto *unix_dgram_prot_saved __read_mostly;
+static DEFINE_SPINLOCK(unix_dgram_prot_lock);
+static struct proto unix_dgram_bpf_prot;
+
+static struct proto *unix_stream_prot_saved __read_mostly;
+static DEFINE_SPINLOCK(unix_stream_prot_lock);
+static struct proto unix_stream_bpf_prot;
+
+static void unix_dgram_bpf_rebuild_protos(struct proto *prot, const struct proto *base)
+{
+	*prot        = *base;
+	prot->close  = sock_map_close;
+	prot->recvmsg = unix_bpf_recvmsg;
+}
 
-static void unix_bpf_rebuild_protos(struct proto *prot, const struct proto *base)
+static void unix_stream_bpf_rebuild_protos(struct proto *prot,
+					   const struct proto *base)
 {
 	*prot        = *base;
 	prot->close  = sock_map_close;
-	prot->recvmsg = unix_dgram_bpf_recvmsg;
+	prot->recvmsg = unix_bpf_recvmsg;
+	prot->unhash  = sock_map_unhash;
 }
 
-static void unix_bpf_check_needs_rebuild(struct proto *ops)
+static void unix_dgram_bpf_check_needs_rebuild(struct proto *ops)
 {
-	if (unlikely(ops != smp_load_acquire(&unix_prot_saved))) {
-		spin_lock_bh(&unix_prot_lock);
-		if (likely(ops != unix_prot_saved)) {
-			unix_bpf_rebuild_protos(&unix_bpf_prot, ops);
-			smp_store_release(&unix_prot_saved, ops);
+	if (unlikely(ops != smp_load_acquire(&unix_dgram_prot_saved))) {
+		spin_lock_bh(&unix_dgram_prot_lock);
+		if (likely(ops != unix_dgram_prot_saved)) {
+			unix_dgram_bpf_rebuild_protos(&unix_dgram_bpf_prot, ops);
+			smp_store_release(&unix_dgram_prot_saved, ops);
 		}
-		spin_unlock_bh(&unix_prot_lock);
+		spin_unlock_bh(&unix_dgram_prot_lock);
 	}
 }
 
-int unix_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
+static void unix_stream_bpf_check_needs_rebuild(struct proto *ops)
+{
+	if (unlikely(ops != smp_load_acquire(&unix_stream_prot_saved))) {
+		spin_lock_bh(&unix_stream_prot_lock);
+		if (likely(ops != unix_stream_prot_saved)) {
+			unix_stream_bpf_rebuild_protos(&unix_stream_bpf_prot, ops);
+			smp_store_release(&unix_stream_prot_saved, ops);
+		}
+		spin_unlock_bh(&unix_stream_prot_lock);
+	}
+}
+
+int unix_dgram_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
 {
 	if (restore) {
 		sk->sk_write_space = psock->saved_write_space;
@@ -111,12 +145,27 @@ int unix_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
 		return 0;
 	}
 
-	unix_bpf_check_needs_rebuild(psock->sk_proto);
-	WRITE_ONCE(sk->sk_prot, &unix_bpf_prot);
+	unix_dgram_bpf_check_needs_rebuild(psock->sk_proto);
+	WRITE_ONCE(sk->sk_prot, &unix_dgram_bpf_prot);
+	return 0;
+}
+
+int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
+{
+	if (restore) {
+		sk->sk_write_space = psock->saved_write_space;
+		WRITE_ONCE(sk->sk_prot, psock->sk_proto);
+		return 0;
+	}
+
+	unix_stream_bpf_check_needs_rebuild(psock->sk_proto);
+	WRITE_ONCE(sk->sk_prot, &unix_stream_bpf_prot);
 	return 0;
 }
 
 void __init unix_bpf_build_proto(void)
 {
-	unix_bpf_rebuild_protos(&unix_bpf_prot, &unix_proto);
+	unix_dgram_bpf_rebuild_protos(&unix_dgram_bpf_prot, &unix_dgram_proto);
+	unix_stream_bpf_rebuild_protos(&unix_stream_bpf_prot, &unix_stream_proto);
+
 }
-- 
2.20.1
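
The *_bpf_check_needs_rebuild() helpers in this patch follow a double-checked
pattern: an acquire load on the fast path, then a re-check under a lock before
rebuilding and publishing with a release store. The sketch below models that
pattern in userspace with C11 atomics. Everything in it is an assumption made
for illustration: struct model_proto, the model_* names, and the atomic_flag
spinlock standing in for spin_lock_bh(). It is not the kernel code, and it
only demonstrates the publication order, not full thread safety of readers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical miniature of struct proto: one overridable callback. */
struct model_proto {
	const char *name;
	int (*recvmsg)(void);
};

static int model_bpf_recvmsg(void) { return 1; }

static _Atomic(const struct model_proto *) model_saved;	/* base last cloned */
static atomic_flag model_lock = ATOMIC_FLAG_INIT;	/* tiny spinlock */
static struct model_proto model_bpf_prot;		/* cached clone */
static int model_rebuild_count;			/* for illustration only */

/* Copy the base callbacks, then override the ones sockmap cares about. */
static void model_rebuild(struct model_proto *prot,
			  const struct model_proto *base)
{
	*prot = *base;
	prot->recvmsg = model_bpf_recvmsg;
	model_rebuild_count++;
}

static void model_check_needs_rebuild(const struct model_proto *ops)
{
	/* Fast path: the acquire load pairs with the release store below. */
	if (ops == atomic_load_explicit(&model_saved, memory_order_acquire))
		return;

	while (atomic_flag_test_and_set_explicit(&model_lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */

	/* Re-check under the lock: another thread may have rebuilt already. */
	if (ops != atomic_load_explicit(&model_saved, memory_order_relaxed)) {
		model_rebuild(&model_bpf_prot, ops);
		atomic_store_explicit(&model_saved, ops, memory_order_release);
	}

	atomic_flag_clear_explicit(&model_lock, memory_order_release);
}
```

Calling model_check_needs_rebuild() twice with the same base pointer rebuilds
the clone once; the second call returns on the fast path without taking the
lock.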



* [PATCH bpf-next v1 3/5] selftest/bpf: add tests for sockmap with unix stream type.
  2021-07-27  0:12 [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket Jiang Wang
  2021-07-27  0:12 ` [PATCH bpf-next v1 1/5] af_unix: add read_sock for stream socket types Jiang Wang
  2021-07-27  0:12 ` [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap Jiang Wang
@ 2021-07-27  0:12 ` Jiang Wang
  2021-07-27 16:38   ` John Fastabend
  2021-07-27  0:12 ` [PATCH bpf-next v1 4/5] selftest/bpf: change udp to inet in some function names Jiang Wang
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Jiang Wang @ 2021-07-27  0:12 UTC (permalink / raw)
  To: netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

Add two tests for unix stream to unix stream redirection
in sockmap tests.

Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
---
 tools/testing/selftests/bpf/prog_tests/sockmap_listen.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
index a9f1bf9d5..7a976d432 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
@@ -2020,11 +2020,13 @@ void test_sockmap_listen(void)
 	run_tests(skel, skel->maps.sock_map, AF_INET);
 	run_tests(skel, skel->maps.sock_map, AF_INET6);
 	test_unix_redir(skel, skel->maps.sock_map, SOCK_DGRAM);
+	test_unix_redir(skel, skel->maps.sock_map, SOCK_STREAM);
 
 	skel->bss->test_sockmap = false;
 	run_tests(skel, skel->maps.sock_hash, AF_INET);
 	run_tests(skel, skel->maps.sock_hash, AF_INET6);
 	test_unix_redir(skel, skel->maps.sock_hash, SOCK_DGRAM);
+	test_unix_redir(skel, skel->maps.sock_hash, SOCK_STREAM);
 
 	test_sockmap_listen__destroy(skel);
 }
-- 
2.20.1
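
For readers unfamiliar with the socket setup these tests rely on, the sketch
below creates a connected AF_UNIX pair of a given type and pushes a few bytes
through it. It deliberately leaves out the BPF side (no sockmap, no verdict
program), so it only shows the plain data path that a redirect test would
start from. The helper name unix_pair_roundtrip() is hypothetical and is not
part of the selftests.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send len bytes through a connected AF_UNIX pair of the given type
 * (SOCK_STREAM or SOCK_DGRAM) and read them back from the peer.
 * Returns the number of bytes received, or -1 on error.
 */
static ssize_t unix_pair_roundtrip(int type, const void *buf, size_t len,
				   void *out, size_t outlen)
{
	int sfd[2];
	ssize_t n;

	if (socketpair(AF_UNIX, type, 0, sfd) < 0)
		return -1;

	n = send(sfd[0], buf, len, 0);
	if (n == (ssize_t)len)
		n = recv(sfd[1], out, outlen, 0);
	else
		n = -1;

	close(sfd[0]);
	close(sfd[1]);
	return n;
}
```

Passing SOCK_STREAM or SOCK_DGRAM as the type mirrors the way the test entry
point is called once per socket type in the hunk above.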



* [PATCH bpf-next v1 4/5] selftest/bpf: change udp to inet in some function names
  2021-07-27  0:12 [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket Jiang Wang
                   ` (2 preceding siblings ...)
  2021-07-27  0:12 ` [PATCH bpf-next v1 3/5] selftest/bpf: add tests for sockmap with unix stream type Jiang Wang
@ 2021-07-27  0:12 ` Jiang Wang
  2021-07-27 16:40   ` John Fastabend
  2021-07-27  0:12 ` [PATCH bpf-next v1 5/5] selftest/bpf: add new tests in sockmap for unix stream to tcp Jiang Wang
  2021-07-27 16:44 ` [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket John Fastabend
  5 siblings, 1 reply; 17+ messages in thread
From: Jiang Wang @ 2021-07-27  0:12 UTC (permalink / raw)
  To: netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

Prepare for adding new unix stream tests. The changes are mostly
renames; the socket type is also passed as an argument.

Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
---
 .../selftests/bpf/prog_tests/sockmap_listen.c | 30 +++++++++++--------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
index 7a976d432..07ed8081f 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
@@ -1692,14 +1692,14 @@ static void test_reuseport(struct test_sockmap_listen *skel,
 	}
 }
 
-static int udp_socketpair(int family, int *s, int *c)
+static int inet_socketpair(int family, int type, int *s, int *c)
 {
 	struct sockaddr_storage addr;
 	socklen_t len;
 	int p0, c0;
 	int err;
 
-	p0 = socket_loopback(family, SOCK_DGRAM | SOCK_NONBLOCK);
+	p0 = socket_loopback(family, type | SOCK_NONBLOCK);
 	if (p0 < 0)
 		return p0;
 
@@ -1708,7 +1708,7 @@ static int udp_socketpair(int family, int *s, int *c)
 	if (err)
 		goto close_peer0;
 
-	c0 = xsocket(family, SOCK_DGRAM | SOCK_NONBLOCK, 0);
+	c0 = xsocket(family, type | SOCK_NONBLOCK, 0);
 	if (c0 < 0) {
 		err = c0;
 		goto close_peer0;
@@ -1747,10 +1747,10 @@ static void udp_redir_to_connected(int family, int sock_mapfd, int verd_mapfd,
 
 	zero_verdict_count(verd_mapfd);
 
-	err = udp_socketpair(family, &p0, &c0);
+	err = inet_socketpair(family, SOCK_DGRAM, &p0, &c0);
 	if (err)
 		return;
-	err = udp_socketpair(family, &p1, &c1);
+	err = inet_socketpair(family, SOCK_DGRAM, &p1, &c1);
 	if (err)
 		goto close_cli0;
 
@@ -1825,7 +1825,7 @@ static void test_udp_redir(struct test_sockmap_listen *skel, struct bpf_map *map
 	udp_skb_redir_to_connected(skel, map, family);
 }
 
-static void udp_unix_redir_to_connected(int family, int sock_mapfd,
+static void inet_unix_redir_to_connected(int family, int type, int sock_mapfd,
 					int verd_mapfd, enum redir_mode mode)
 {
 	const char *log_prefix = redir_mode_str(mode);
@@ -1843,7 +1843,7 @@ static void udp_unix_redir_to_connected(int family, int sock_mapfd,
 		return;
 	c0 = sfd[0], p0 = sfd[1];
 
-	err = udp_socketpair(family, &p1, &c1);
+	err = inet_socketpair(family, SOCK_DGRAM, &p1, &c1);
 	if (err)
 		goto close;
 
@@ -1897,14 +1897,16 @@ static void udp_unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
 		return;
 
 	skel->bss->test_ingress = false;
-	udp_unix_redir_to_connected(family, sock_map, verdict_map, REDIR_EGRESS);
+	inet_unix_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
+				    REDIR_EGRESS);
 	skel->bss->test_ingress = true;
-	udp_unix_redir_to_connected(family, sock_map, verdict_map, REDIR_INGRESS);
+	inet_unix_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
+				    REDIR_INGRESS);
 
 	xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
 }
 
-static void unix_udp_redir_to_connected(int family, int sock_mapfd,
+static void unix_inet_redir_to_connected(int family, int type, int sock_mapfd,
 					int verd_mapfd, enum redir_mode mode)
 {
 	const char *log_prefix = redir_mode_str(mode);
@@ -1917,7 +1919,7 @@ static void unix_udp_redir_to_connected(int family, int sock_mapfd,
 
 	zero_verdict_count(verd_mapfd);
 
-	err = udp_socketpair(family, &p0, &c0);
+	err = inet_socketpair(family, SOCK_DGRAM, &p0, &c0);
 	if (err)
 		return;
 
@@ -1972,9 +1974,11 @@ static void unix_udp_skb_redir_to_connected(struct test_sockmap_listen *skel,
 		return;
 
 	skel->bss->test_ingress = false;
-	unix_udp_redir_to_connected(family, sock_map, verdict_map, REDIR_EGRESS);
+	unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
+				     REDIR_EGRESS);
 	skel->bss->test_ingress = true;
-	unix_udp_redir_to_connected(family, sock_map, verdict_map, REDIR_INGRESS);
+	unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
+				     REDIR_INGRESS);
 
 	xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
 }
-- 
2.20.1



* [PATCH bpf-next v1 5/5] selftest/bpf: add new tests in sockmap for unix stream to tcp.
  2021-07-27  0:12 [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket Jiang Wang
                   ` (3 preceding siblings ...)
  2021-07-27  0:12 ` [PATCH bpf-next v1 4/5] selftest/bpf: change udp to inet in some function names Jiang Wang
@ 2021-07-27  0:12 ` Jiang Wang
  2021-07-27 16:42   ` John Fastabend
  2021-07-27 16:44 ` [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket John Fastabend
  5 siblings, 1 reply; 17+ messages in thread
From: Jiang Wang @ 2021-07-27  0:12 UTC (permalink / raw)
  To: netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

Add two new test cases in sockmap tests, where unix stream is
redirected to tcp and vice versa.

Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
---
 .../selftests/bpf/prog_tests/sockmap_listen.c    | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
index 07ed8081f..afa14fb66 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
@@ -1884,7 +1884,7 @@ static void inet_unix_redir_to_connected(int family, int type, int sock_mapfd,
 	xclose(p0);
 }
 
-static void udp_unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
+static void inet_unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
 					    struct bpf_map *inner_map, int family)
 {
 	int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
@@ -1899,9 +1899,13 @@ static void udp_unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
 	skel->bss->test_ingress = false;
 	inet_unix_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
 				    REDIR_EGRESS);
+	inet_unix_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
+				    REDIR_EGRESS);
 	skel->bss->test_ingress = true;
 	inet_unix_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
 				    REDIR_INGRESS);
+	inet_unix_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
+				    REDIR_INGRESS);
 
 	xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
 }
@@ -1961,7 +1965,7 @@ static void unix_inet_redir_to_connected(int family, int type, int sock_mapfd,
 
 }
 
-static void unix_udp_skb_redir_to_connected(struct test_sockmap_listen *skel,
+static void unix_inet_skb_redir_to_connected(struct test_sockmap_listen *skel,
 					    struct bpf_map *inner_map, int family)
 {
 	int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
@@ -1976,9 +1980,13 @@ static void unix_udp_skb_redir_to_connected(struct test_sockmap_listen *skel,
 	skel->bss->test_ingress = false;
 	unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
 				     REDIR_EGRESS);
+	unix_inet_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
+				     REDIR_EGRESS);
 	skel->bss->test_ingress = true;
 	unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
 				     REDIR_INGRESS);
+	unix_inet_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
+				     REDIR_INGRESS);
 
 	xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
 }
@@ -1994,8 +2002,8 @@ static void test_udp_unix_redir(struct test_sockmap_listen *skel, struct bpf_map
 	snprintf(s, sizeof(s), "%s %s %s", map_name, family_name, __func__);
 	if (!test__start_subtest(s))
 		return;
-	udp_unix_skb_redir_to_connected(skel, map, family);
-	unix_udp_skb_redir_to_connected(skel, map, family);
+	inet_unix_skb_redir_to_connected(skel, map, family);
+	unix_inet_skb_redir_to_connected(skel, map, family);
 }
 
 static void run_tests(struct test_sockmap_listen *skel, struct bpf_map *map,
-- 
2.20.1



* RE: [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap
  2021-07-27  0:12 ` [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap Jiang Wang
@ 2021-07-27 16:37   ` John Fastabend
  2021-07-28  2:07     ` Cong Wang
  0 siblings, 1 reply; 17+ messages in thread
From: John Fastabend @ 2021-07-27 16:37 UTC (permalink / raw)
  To: Jiang Wang, netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

Jiang Wang wrote:
> unix_stream_proto is similar to unix_dgram_proto.
> Also implement stream related sockmap functions.
> 
> Add dgram key words to those dgram specific functions.
> 
> Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
> Reviewed-by: Cong Wang <cong.wang@bytedance.com>.
> ---

Overall LGTM, a few small questions/comments below.

>  include/net/af_unix.h |  8 +++-
>  net/core/sock_map.c   |  8 +++-
>  net/unix/af_unix.c    | 77 ++++++++++++++++++++++++++++++-----
>  net/unix/unix_bpf.c   | 93 +++++++++++++++++++++++++++++++++----------
>  4 files changed, 151 insertions(+), 35 deletions(-)
> 
> diff --git a/include/net/af_unix.h b/include/net/af_unix.h
> index 435a2c3d5..5d04fbf8a 100644
> --- a/include/net/af_unix.h
> +++ b/include/net/af_unix.h
> @@ -84,6 +84,8 @@ long unix_outq_len(struct sock *sk);
>  
>  int __unix_dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t size,
>  			 int flags);
> +int __unix_stream_recvmsg(struct sock *sk, struct msghdr *msg, size_t size,
> +			  int flags);
>  #ifdef CONFIG_SYSCTL
>  int unix_sysctl_register(struct net *net);
>  void unix_sysctl_unregister(struct net *net);
> @@ -93,9 +95,11 @@ static inline void unix_sysctl_unregister(struct net *net) {}
>  #endif
>  
>  #ifdef CONFIG_BPF_SYSCALL
> -extern struct proto unix_proto;
> +extern struct proto unix_dgram_proto;
> +extern struct proto unix_stream_proto;
>  
> -int unix_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore);
> +int unix_dgram_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore);
> +int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore);
>  void __init unix_bpf_build_proto(void);
>  #else
>  static inline void __init unix_bpf_build_proto(void)
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index ae5fa4338..42f50ea7a 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -517,9 +517,15 @@ static bool sk_is_tcp(const struct sock *sk)
>  	       sk->sk_protocol == IPPROTO_TCP;
>  }
>  
> +static bool sk_is_unix_stream(const struct sock *sk)
> +{
> +	return sk->sk_type == SOCK_STREAM &&
> +	       sk->sk_protocol == PF_UNIX;
> +}
> +
>  static bool sock_map_redirect_allowed(const struct sock *sk)
>  {
> -	if (sk_is_tcp(sk))
> +	if (sk_is_tcp(sk) || sk_is_unix_stream(sk))
>  		return sk->sk_state != TCP_LISTEN;
>  	else
>  		return sk->sk_state == TCP_ESTABLISHED;
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 32eeb4a6a..c68d13f61 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -791,17 +791,35 @@ static void unix_close(struct sock *sk, long timeout)
>  	 */
>  }
>  
> -struct proto unix_proto = {
> -	.name			= "UNIX",
> +static void unix_unhash(struct sock *sk)
> +{
> +	/* Nothing to do here, unix socket does not need a ->unhash().
> +	 * This is merely for sockmap.
> +	 */
> +}

Do we really need an unhash hook for unix_stream? I'm doing some testing
now to pull it out of TCP side as well. It seems to be an artifact of old
code that is no longer necessary. On TCP side at least just using close()
looks to be enough now.

> +
> +struct proto unix_dgram_proto = {
> +	.name			= "UNIX-DGRAM",
> +	.owner			= THIS_MODULE,
> +	.obj_size		= sizeof(struct unix_sock),
> +	.close			= unix_close,
> +#ifdef CONFIG_BPF_SYSCALL
> +	.psock_update_sk_prot	= unix_dgram_bpf_update_proto,
> +#endif
> +};
> +
> +struct proto unix_stream_proto = {
> +	.name			= "UNIX-STREAM",
>  	.owner			= THIS_MODULE,
>  	.obj_size		= sizeof(struct unix_sock),
>  	.close			= unix_close,
> +	.unhash			= unix_unhash,
>  #ifdef CONFIG_BPF_SYSCALL
> -	.psock_update_sk_prot	= unix_bpf_update_proto,
> +	.psock_update_sk_prot	= unix_stream_bpf_update_proto,
>  #endif
>  };
>  
> -static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
> +static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, int type)
>  {
>  	struct sock *sk = NULL;
>  	struct unix_sock *u;
> @@ -810,7 +828,17 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
>  	if (atomic_long_read(&unix_nr_socks) > 2 * get_max_files())
>  		goto out;
>  
> -	sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_proto, kern);
> +	if (type != 0) {
> +		if (type == SOCK_STREAM)
> +			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_stream_proto, kern);
> +		else /*for seqpacket */
> +			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_dgram_proto, kern);
> +	} else {
> +		if (sock->type == SOCK_STREAM)
> +			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_stream_proto, kern);
> +		else
> +			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_dgram_proto, kern);
> +	}
>  	if (!sk)
>  		goto out;
>  
> @@ -872,7 +900,7 @@ static int unix_create(struct net *net, struct socket *sock, int protocol,
>  		return -ESOCKTNOSUPPORT;
>  	}
>  
> -	return unix_create1(net, sock, kern) ? 0 : -ENOMEM;
> +	return unix_create1(net, sock, kern, 0) ? 0 : -ENOMEM;
>  }
>  
>  static int unix_release(struct socket *sock)
> @@ -1286,7 +1314,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
>  	err = -ENOMEM;
>  
>  	/* create new sock for complete connection */
> -	newsk = unix_create1(sock_net(sk), NULL, 0);
> +	newsk = unix_create1(sock_net(sk), NULL, 0, sock->type);
>  	if (newsk == NULL)
>  		goto out;
>  
> @@ -2214,7 +2242,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
>  	struct sock *sk = sock->sk;
>  
>  #ifdef CONFIG_BPF_SYSCALL
> -	if (sk->sk_prot != &unix_proto)
> +	if (sk->sk_prot != &unix_dgram_proto)
>  		return sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
>  					    flags & ~MSG_DONTWAIT, NULL);
>  #endif
> @@ -2533,6 +2561,21 @@ static int unix_stream_read_actor(struct sk_buff *skb,
>  	return ret ?: chunk;
>  }
>  
> +int __unix_stream_recvmsg(struct sock *sk, struct msghdr *msg,
> +			  size_t size, int flags)
> +{
> +	struct socket *sock = sk->sk_socket;
> +	struct unix_stream_read_state state = {
> +		.recv_actor = unix_stream_read_actor,
> +		.socket = sock,
> +		.msg = msg,
> +		.size = size,
> +		.flags = flags
> +	};
> +
> +	return unix_stream_read_generic(&state, true);
> +}
> +
>  static int unix_stream_recvmsg(struct socket *sock, struct msghdr *msg,
>  			       size_t size, int flags)
>  {
> @@ -2544,6 +2587,13 @@ static int unix_stream_recvmsg(struct socket *sock, struct msghdr *msg,
>  		.flags = flags
>  	};
>  
> +	struct sock *sk = sock->sk;
> +
> +#ifdef CONFIG_BPF_SYSCALL
> +	if (sk->sk_prot != &unix_stream_proto)
> +		return sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
> +					    flags & ~MSG_DONTWAIT, NULL);
> +#endif
>  	return unix_stream_read_generic(&state, true);
>  }
>  
> @@ -2993,7 +3043,13 @@ static int __init af_unix_init(void)
>  
>  	BUILD_BUG_ON(sizeof(struct unix_skb_parms) > sizeof_field(struct sk_buff, cb));
>  
> -	rc = proto_register(&unix_proto, 1);
> +	rc = proto_register(&unix_dgram_proto, 1);

Can you add a note in the commit message on why the proto_register is
needed? I think it might be helpful later.

> +	if (rc != 0) {
> +		pr_crit("%s: Cannot create unix_sock SLAB cache!\n", __func__);
> +		goto out;
> +	}
> +
> +	rc = proto_register(&unix_stream_proto, 1);
>  	if (rc != 0) {
>  		pr_crit("%s: Cannot create unix_sock SLAB cache!\n", __func__);
>  		goto out;
> @@ -3009,7 +3065,8 @@ static int __init af_unix_init(void)
>  static void __exit af_unix_exit(void)
>  {
>  	sock_unregister(PF_UNIX);
> -	proto_unregister(&unix_proto);
> +	proto_unregister(&unix_dgram_proto);
> +	proto_unregister(&unix_stream_proto);
>  	unregister_pernet_subsys(&unix_net_ops);
>  }
>  
> diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
> index db0cda29f..9067210d3 100644
> --- a/net/unix/unix_bpf.c
> +++ b/net/unix/unix_bpf.c
> @@ -38,9 +38,18 @@ static int unix_msg_wait_data(struct sock *sk, struct sk_psock *psock,
>  	return ret;
>  }
>  
> -static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
> -				  size_t len, int nonblock, int flags,
> -				  int *addr_len)
> +static int __unix_recvmsg(struct sock *sk, struct msghdr *msg,
> +			   size_t len, int flags)
> +{
> +	if (sk->sk_type == SOCK_DGRAM)
> +		return __unix_dgram_recvmsg(sk, msg, len, flags);
> +	else
> +		return __unix_stream_recvmsg(sk, msg, len, flags);
> +}
> +
> +static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
> +			    size_t len, int nonblock, int flags,
> +			    int *addr_len)
>  {
>  	struct unix_sock *u = unix_sk(sk);
>  	struct sk_psock *psock;
> @@ -48,12 +57,12 @@ static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
>  
>  	psock = sk_psock_get(sk);
>  	if (unlikely(!psock))
> -		return __unix_dgram_recvmsg(sk, msg, len, flags);
> +		return __unix_recvmsg(sk, msg, len, flags);
>  
>  	mutex_lock(&u->iolock);
>  	if (!skb_queue_empty(&sk->sk_receive_queue) &&
>  	    sk_psock_queue_empty(psock)) {
> -		ret = __unix_dgram_recvmsg(sk, msg, len, flags);
> +		ret = __unix_recvmsg(sk, msg, len, flags);

Will need rebase after Cong's fix for iolock goes in.

>  		goto out;
>  	}
>  
> @@ -68,7 +77,7 @@ static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
>  		if (data) {
>  			if (!sk_psock_queue_empty(psock))
>  				goto msg_bytes_ready;
> -			ret = __unix_dgram_recvmsg(sk, msg, len, flags);
> +			ret = __unix_recvmsg(sk, msg, len, flags);
>  			goto out;
>  		}
>  		copied = -EAGAIN;
> @@ -80,30 +89,55 @@ static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
>  	return ret;
>  }
>  
> -static struct proto *unix_prot_saved __read_mostly;
> -static DEFINE_SPINLOCK(unix_prot_lock);
> -static struct proto unix_bpf_prot;
> +static struct proto *unix_dgram_prot_saved __read_mostly;
> +static DEFINE_SPINLOCK(unix_dgram_prot_lock);
> +static struct proto unix_dgram_bpf_prot;
> +
> +static struct proto *unix_stream_prot_saved __read_mostly;
> +static DEFINE_SPINLOCK(unix_stream_prot_lock);
> +static struct proto unix_stream_bpf_prot;
> +
> +static void unix_dgram_bpf_rebuild_protos(struct proto *prot, const struct proto *base)
> +{
> +	*prot        = *base;
> +	prot->close  = sock_map_close;
> +	prot->recvmsg = unix_bpf_recvmsg;
> +}
>  
> -static void unix_bpf_rebuild_protos(struct proto *prot, const struct proto *base)
> +static void unix_stream_bpf_rebuild_protos(struct proto *prot,
> +					   const struct proto *base)
>  {
>  	*prot        = *base;
>  	prot->close  = sock_map_close;
> -	prot->recvmsg = unix_dgram_bpf_recvmsg;
> +	prot->recvmsg = unix_bpf_recvmsg;
> +	prot->unhash  = sock_map_unhash;

Still unsure what's different between stream and dgram that means we now
need the unhash hook.

>  }
>  
> -static void unix_bpf_check_needs_rebuild(struct proto *ops)
> +static void unix_dgram_bpf_check_needs_rebuild(struct proto *ops)
>  {
> -	if (unlikely(ops != smp_load_acquire(&unix_prot_saved))) {
> -		spin_lock_bh(&unix_prot_lock);
> -		if (likely(ops != unix_prot_saved)) {
> -			unix_bpf_rebuild_protos(&unix_bpf_prot, ops);
> -			smp_store_release(&unix_prot_saved, ops);
> +	if (unlikely(ops != smp_load_acquire(&unix_dgram_prot_saved))) {
> +		spin_lock_bh(&unix_dgram_prot_lock);
> +		if (likely(ops != unix_dgram_prot_saved)) {
> +			unix_dgram_bpf_rebuild_protos(&unix_dgram_bpf_prot, ops);
> +			smp_store_release(&unix_dgram_prot_saved, ops);
>  		}
> -		spin_unlock_bh(&unix_prot_lock);
> +		spin_unlock_bh(&unix_dgram_prot_lock);
>  	}
>  }
>  
> -int unix_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
> +static void unix_stream_bpf_check_needs_rebuild(struct proto *ops)
> +{
> +	if (unlikely(ops != smp_load_acquire(&unix_stream_prot_saved))) {
> +		spin_lock_bh(&unix_stream_prot_lock);
> +		if (likely(ops != unix_stream_prot_saved)) {
> +			unix_stream_bpf_rebuild_protos(&unix_stream_bpf_prot, ops);
> +			smp_store_release(&unix_stream_prot_saved, ops);
> +		}
> +		spin_unlock_bh(&unix_stream_prot_lock);
> +	}
> +}
> +
> +int unix_dgram_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
>  {
>  	if (restore) {
>  		sk->sk_write_space = psock->saved_write_space;
> @@ -111,12 +145,27 @@ int unix_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
>  		return 0;
>  	}
>  
> -	unix_bpf_check_needs_rebuild(psock->sk_proto);
> -	WRITE_ONCE(sk->sk_prot, &unix_bpf_prot);
> +	unix_dgram_bpf_check_needs_rebuild(psock->sk_proto);
> +	WRITE_ONCE(sk->sk_prot, &unix_dgram_bpf_prot);
> +	return 0;
> +}
> +
> +int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
> +{
> +	if (restore) {
> +		sk->sk_write_space = psock->saved_write_space;
> +		WRITE_ONCE(sk->sk_prot, psock->sk_proto);
> +		return 0;
> +	}
> +
> +	unix_stream_bpf_check_needs_rebuild(psock->sk_proto);
> +	WRITE_ONCE(sk->sk_prot, &unix_stream_bpf_prot);
>  	return 0;
>  }
>  
>  void __init unix_bpf_build_proto(void)
>  {
> -	unix_bpf_rebuild_protos(&unix_bpf_prot, &unix_proto);
> +	unix_dgram_bpf_rebuild_protos(&unix_dgram_bpf_prot, &unix_dgram_proto);
> +	unix_stream_bpf_rebuild_protos(&unix_stream_bpf_prot, &unix_stream_proto);
> +
>  }
> -- 
> 2.20.1
> 



^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH bpf-next v1 3/5] selftest/bpf: add tests for sockmap with unix stream type.
  2021-07-27  0:12 ` [PATCH bpf-next v1 3/5] selftest/bpf: add tests for sockmap with unix stream type Jiang Wang
@ 2021-07-27 16:38   ` John Fastabend
  0 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2021-07-27 16:38 UTC (permalink / raw)
  To: Jiang Wang, netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

Jiang Wang wrote:
> Add two tests for unix stream to unix stream redirection
> in sockmap tests.
> 
> Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
> Reviewed-by: Cong Wang <cong.wang@bytedance.com>.
> ---
>  tools/testing/selftests/bpf/prog_tests/sockmap_listen.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> index a9f1bf9d5..7a976d432 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> @@ -2020,11 +2020,13 @@ void test_sockmap_listen(void)
>  	run_tests(skel, skel->maps.sock_map, AF_INET);
>  	run_tests(skel, skel->maps.sock_map, AF_INET6);
>  	test_unix_redir(skel, skel->maps.sock_map, SOCK_DGRAM);
> +	test_unix_redir(skel, skel->maps.sock_map, SOCK_STREAM);
>  
>  	skel->bss->test_sockmap = false;
>  	run_tests(skel, skel->maps.sock_hash, AF_INET);
>  	run_tests(skel, skel->maps.sock_hash, AF_INET6);
>  	test_unix_redir(skel, skel->maps.sock_hash, SOCK_DGRAM);
> +	test_unix_redir(skel, skel->maps.sock_hash, SOCK_STREAM);
>  
>  	test_sockmap_listen__destroy(skel);
>  }
> -- 
> 2.20.1
> 

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH bpf-next v1 4/5] selftest/bpf: change udp to inet in some function names
  2021-07-27  0:12 ` [PATCH bpf-next v1 4/5] selftest/bpf: change udp to inet in some function names Jiang Wang
@ 2021-07-27 16:40   ` John Fastabend
  0 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2021-07-27 16:40 UTC (permalink / raw)
  To: Jiang Wang, netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

Jiang Wang wrote:
> This is to prepare for adding new unix stream tests.
> Mostly renames, also pass the socket types as an argument.
> 
> Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
> Reviewed-by: Cong Wang <cong.wang@bytedance.com>.
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH bpf-next v1 5/5] selftest/bpf: add new tests in sockmap for unix stream to tcp.
  2021-07-27  0:12 ` [PATCH bpf-next v1 5/5] selftest/bpf: add new tests in sockmap for unix stream to tcp Jiang Wang
@ 2021-07-27 16:42   ` John Fastabend
  0 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2021-07-27 16:42 UTC (permalink / raw)
  To: Jiang Wang, netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

Jiang Wang wrote:
> Add two new test cases in sockmap tests, where unix stream is
> redirected to tcp and vice versa.
> 
> Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
> Reviewed-by: Cong Wang <cong.wang@bytedance.com>.
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket
  2021-07-27  0:12 [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket Jiang Wang
                   ` (4 preceding siblings ...)
  2021-07-27  0:12 ` [PATCH bpf-next v1 5/5] selftest/bpf: add new tests in sockmap for unix stream to tcp Jiang Wang
@ 2021-07-27 16:44 ` John Fastabend
  5 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2021-07-27 16:44 UTC (permalink / raw)
  To: Jiang Wang, netdev
  Cc: cong.wang, duanxiongchun, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, John Fastabend, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

Jiang Wang wrote:
> This patch series add support for unix stream type
> for sockmap. Sockmap already supports TCP, UDP,
> unix dgram types. The unix stream support is similar
> to unix dgram.
> 
> Also add selftests for unix stream type in sockmap tests.

Overall looks good to me. A couple of comments on 2/5, and we should get
Cong's fix in before merging this. It's nice this fell in without requiring
larger changes in ./net/core/*.c code.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap
  2021-07-27 16:37   ` John Fastabend
@ 2021-07-28  2:07     ` Cong Wang
  2021-07-28 18:43       ` John Fastabend
  0 siblings, 1 reply; 17+ messages in thread
From: Cong Wang @ 2021-07-28  2:07 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jiang Wang, Linux Kernel Network Developers, Cong Wang .,
	Xiongchun Duan, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer,
	Alexei Starovoitov, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Shuah Khan, Johan Almbladh,
	linux-kernel, bpf, linux-kselftest

On Tue, Jul 27, 2021 at 9:37 AM John Fastabend <john.fastabend@gmail.com> wrote:
> Do we really need an unhash hook for unix_stream? I'm doing some testing
> now to pull it out of TCP side as well. It seems to be an artifact of old
> code that is no longer necessary. On TCP side at least just using close()
> looks to be enough now.

How do you handle the disconnection from remote without ->unhash()?

For all stream sockets, we still only allow established sockets to stay
in sockmap, which means we have to remove it if it is disconnected
or closed.

But it seems Jiang forgot to call ->unhash() when disconnecting.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap
  2021-07-28  2:07     ` Cong Wang
@ 2021-07-28 18:43       ` John Fastabend
  2021-07-29 21:27         ` Cong Wang
  0 siblings, 1 reply; 17+ messages in thread
From: John Fastabend @ 2021-07-28 18:43 UTC (permalink / raw)
  To: Cong Wang, John Fastabend
  Cc: Jiang Wang, Linux Kernel Network Developers, Cong Wang .,
	Xiongchun Duan, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer,
	Alexei Starovoitov, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Shuah Khan, Johan Almbladh,
	linux-kernel, bpf, linux-kselftest

Cong Wang wrote:
> On Tue, Jul 27, 2021 at 9:37 AM John Fastabend <john.fastabend@gmail.com> wrote:
> > Do we really need an unhash hook for unix_stream? I'm doing some testing
> > now to pull it out of TCP side as well. It seems to be an artifact of old
> > code that is no longer necessary. On TCP side at least just using close()
> > looks to be enough now.
> 
> How do you handle the disconnection from remote without ->unhash()?

Would close() not work for stream/dgram sockets?

> 
> For all stream sockets, we still only allow established sockets to stay
> in sockmap, which means we have to remove it if it is disconnected
> or closed.

+1.

> 
> But it seems Jiang forgot to call ->unhash() when disconnecting.

Aha, so we need to add it in the af_unix code, I guess. Anyway, looking
forward to v2.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v1 1/5] af_unix: add read_sock for stream socket types
  2021-07-27  0:12 ` [PATCH bpf-next v1 1/5] af_unix: add read_sock for stream socket types Jiang Wang
@ 2021-07-29  8:37   ` Jakub Sitnicki
  0 siblings, 0 replies; 17+ messages in thread
From: Jakub Sitnicki @ 2021-07-29  8:37 UTC (permalink / raw)
  To: Jiang Wang
  Cc: netdev, cong.wang, duanxiongchun, xieyongji, chaiwen.cc,
	David S. Miller, Jakub Kicinski, John Fastabend, Daniel Borkmann,
	Lorenz Bauer, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Shuah Khan,
	Johan Almbladh, linux-kernel, bpf, linux-kselftest

On Tue, Jul 27, 2021 at 02:12 AM CEST, Jiang Wang wrote:
> To support sockmap for af_unix stream type, implement
> read_sock, which is similar to the read_sock for unix
> dgram sockets.
>
> Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
> Reviewed-by: Cong Wang <cong.wang@bytedance.com>.
> ---
>  net/unix/af_unix.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 89927678c..32eeb4a6a 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -672,6 +672,8 @@ static int unix_dgram_sendmsg(struct socket *, struct msghdr *, size_t);
>  static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int);
>  static int unix_read_sock(struct sock *sk, read_descriptor_t *desc,
>  			  sk_read_actor_t recv_actor);
> +static int unix_stream_read_sock(struct sock *sk, read_descriptor_t *desc,
> +				 sk_read_actor_t recv_actor);
>  static int unix_dgram_connect(struct socket *, struct sockaddr *,
>  			      int, int);
>  static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t);
> @@ -725,6 +727,7 @@ static const struct proto_ops unix_stream_ops = {
>  	.shutdown =	unix_shutdown,
>  	.sendmsg =	unix_stream_sendmsg,
>  	.recvmsg =	unix_stream_recvmsg,
> +	.read_sock =	unix_stream_read_sock,
>  	.mmap =		sock_no_mmap,
>  	.sendpage =	unix_stream_sendpage,
>  	.splice_read =	unix_stream_splice_read,
> @@ -2311,6 +2314,15 @@ struct unix_stream_read_state {
>  	unsigned int splice_flags;
>  };
>
> +static int unix_stream_read_sock(struct sock *sk, read_descriptor_t *desc,
> +				 sk_read_actor_t recv_actor)
> +{
> +	if (unlikely(sk->sk_state != TCP_ESTABLISHED))
> +		return -EINVAL;

tcp_read_sock() returns -ENOTCONN if the socket is not connected.

For the sake of being consistent, and in case we start propagating the
error up the call chain, I'd use the same error code.
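
A minimal userspace sketch of the suggested shape, not kernel code. The
demo_* names and the two-state enum are hypothetical stand-ins for
struct sock, sk_state and unix_read_sock(); only the choice of -ENOTCONN
over -EINVAL comes from the comment above.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel's socket states; not the real values. */
enum demo_state {
	DEMO_ESTABLISHED,
	DEMO_UNCONNECTED,
};

struct demo_sock {
	enum demo_state state;
};

/* Hypothetical dgram-style reader that the stream wrapper delegates to. */
static int demo_read_sock(struct demo_sock *sk)
{
	(void)sk;
	return 0;	/* pretend nothing was queued */
}

/* Same shape as the stream wrapper in the patch, with -ENOTCONN for -EINVAL. */
static int demo_stream_read_sock(struct demo_sock *sk)
{
	if (sk->state != DEMO_ESTABLISHED)
		return -ENOTCONN;

	return demo_read_sock(sk);
}
```

With this shape, a caller that propagates the error would see the same
code for an unconnected socket whether it is TCP or unix stream.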

> +
> +	return unix_read_sock(sk, desc, recv_actor);
> +}
> +
>  static int unix_stream_read_generic(struct unix_stream_read_state *state,
>  				    bool freezable)
>  {

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap
  2021-07-28 18:43       ` John Fastabend
@ 2021-07-29 21:27         ` Cong Wang
  2021-08-10 17:04           ` John Fastabend
  0 siblings, 1 reply; 17+ messages in thread
From: Cong Wang @ 2021-07-29 21:27 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jiang Wang, Linux Kernel Network Developers, Cong Wang .,
	Xiongchun Duan, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer,
	Alexei Starovoitov, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Shuah Khan, Johan Almbladh, LKML, bpf,
	open list:KERNEL SELFTEST FRAMEWORK

On Wed, Jul 28, 2021 at 11:44 AM John Fastabend
<john.fastabend@gmail.com> wrote:
>
> Cong Wang wrote:
> > On Tue, Jul 27, 2021 at 9:37 AM John Fastabend <john.fastabend@gmail.com> wrote:
> > > Do we really need an unhash hook for unix_stream? I'm doing some testing
> > > now to pull it out of TCP side as well. It seems to be an artifact of old
> > > code that is no longer necessary. On TCP side at least just using close()
> > > looks to be enough now.
> >
> > How do you handle the disconnection from remote without ->unhash()?
>
> Would close() not work for stream/dgram sockets?

close() is called when the local closes the sockets, but when the remote
closes or disconnects it, unhash() is called. This is why TCP calls unhash()
to remove the socket from established socket hash table. unhash() itself
might not make much sense for AF_UNIX as it probably does not need a
hash table to track established ones, however, the idea is the same, that
is, we have to handle remote disconnections here.
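
To illustrate the two removal paths, here is a toy userspace model. It is
only a sketch under stated assumptions: the demo_* names, the fixed-size
map and the boolean state are all hypothetical and do not correspond to
the real sockmap or af_unix data structures.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAP_SLOTS 4

struct demo_sock {
	bool established;
};

struct demo_map {
	struct demo_sock *slots[DEMO_MAP_SLOTS];
};

/* Only established sockets are accepted, as for stream sockets in sockmap. */
static int demo_map_add(struct demo_map *map, struct demo_sock *sk)
{
	size_t i;

	if (!sk->established)
		return -1;
	for (i = 0; i < DEMO_MAP_SLOTS; i++) {
		if (!map->slots[i]) {
			map->slots[i] = sk;
			return 0;
		}
	}
	return -1;	/* map is full */
}

static void demo_map_del(struct demo_map *map, struct demo_sock *sk)
{
	size_t i;

	for (i = 0; i < DEMO_MAP_SLOTS; i++)
		if (map->slots[i] == sk)
			map->slots[i] = NULL;
}

static size_t demo_map_count(const struct demo_map *map)
{
	size_t i, n = 0;

	for (i = 0; i < DEMO_MAP_SLOTS; i++)
		if (map->slots[i])
			n++;
	return n;
}

/* Local path: the owner of the socket closes it. */
static void demo_close(struct demo_map *map, struct demo_sock *sk)
{
	sk->established = false;
	demo_map_del(map, sk);
}

/* Remote path: the peer closed or disconnected; nobody called close() locally. */
static void demo_unhash(struct demo_map *map, struct demo_sock *sk)
{
	sk->established = false;
	demo_map_del(map, sk);
}
```

The two hooks do the same cleanup in this model; the point is that they
are triggered by different events, so dropping demo_unhash() would leave
a disconnected socket in the map until its owner eventually closes it.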

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap
  2021-07-29 21:27         ` Cong Wang
@ 2021-08-10 17:04           ` John Fastabend
  0 siblings, 0 replies; 17+ messages in thread
From: John Fastabend @ 2021-08-10 17:04 UTC (permalink / raw)
  To: Cong Wang, John Fastabend
  Cc: Jiang Wang, Linux Kernel Network Developers, Cong Wang .,
	Xiongchun Duan, xieyongji, chaiwen.cc, David S. Miller,
	Jakub Kicinski, Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer,
	Alexei Starovoitov, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Shuah Khan, Johan Almbladh, LKML, bpf,
	open list:KERNEL SELFTEST FRAMEWORK

Cong Wang wrote:
> On Wed, Jul 28, 2021 at 11:44 AM John Fastabend
> <john.fastabend@gmail.com> wrote:
> >
> > Cong Wang wrote:
> > > On Tue, Jul 27, 2021 at 9:37 AM John Fastabend <john.fastabend@gmail.com> wrote:
> > > > Do we really need an unhash hook for unix_stream? I'm doing some testing
> > > > now to pull it out of TCP side as well. It seems to be an artifact of old
> > > > code that is no longer necessary. On TCP side at least just using close()
> > > > looks to be enough now.
> > >
> > > How do you handle the disconnection from remote without ->unhash()?
> >
> > Would close() not work for stream/dgram sockets?
> 
> close() is called when the local closes the sockets, but when the remote
> closes or disconnects it, unhash() is called. This is why TCP calls unhash()
> to remove the socket from established socket hash table. unhash() itself
> might not make much sense for AF_UNIX as it probably does not need a
> hash table to track established ones, however, the idea is the same, that
> is, we have to handle remote disconnections here.

Following up on this series. Leaving a socket in the sockmap until close()
happens is not particularly problematic, but it does consume space in the
map, so unhash() is slightly better I guess. Thanks.

> 
> Thanks.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap
       [not found] <202107280009.ualwfjZP-lkp@intel.com>
@ 2021-07-28  0:38 ` kernel test robot
  0 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2021-07-28  0:38 UTC (permalink / raw)
  To: Jiang Wang, netdev
  Cc: clang-built-linux, kbuild-all, cong.wang, duanxiongchun,
	xieyongji, chaiwen.cc, Jakub Kicinski, John Fastabend,
	Daniel Borkmann, Jakub Sitnicki, Lorenz Bauer

[-- Attachment #1: Type: text/plain, Size: 17374 bytes --]


Hi Jiang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Jiang-Wang/sockmap-add-sockmap-support-for-unix-stream-socket/20210727-081531
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
:::::: branch date: 16 hours ago
:::::: commit date: 16 hours ago
config: x86_64-randconfig-c001-20210726 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project c658b472f3e61e1818e1909bf02f3d65470018a5)
reproduce (this is a W=1 build):
         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
         chmod +x ~/bin/make.cross
         # install x86_64 cross compiling tool for clang build
         # apt-get install binutils-x86-64-linux-gnu
         # https://github.com/0day-ci/linux/commit/607ed02e3232aa57995e87230faad770b810a64a
         git remote add linux-review https://github.com/0day-ci/linux
         git fetch --no-tags linux-review Jiang-Wang/sockmap-add-sockmap-support-for-unix-stream-socket/20210727-081531
         git checkout 607ed02e3232aa57995e87230faad770b810a64a
         # save the attached .config to linux build tree
         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 clang-analyzer
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


clang-analyzer warnings: (new ones prefixed by >>)
            BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) 
&&   \
                                                                       ^
    net/bridge/br_multicast.c:970:3: note: Taking false branch
                    hlist_for_each_entry(ent, &pg->src_list, node) {
                    ^
    include/linux/list.h:993:13: note: expanded from macro 
'hlist_for_each_entry'
            for (pos = hlist_entry_safe((head)->first, typeof(*(pos)), 
member);\
                       ^
    include/linux/list.h:983:15: note: expanded from macro 
'hlist_entry_safe'
               ____ptr ? hlist_entry(____ptr, type, member) : NULL; \
                         ^
    include/linux/list.h:972:40: note: expanded from macro 'hlist_entry'
    #define hlist_entry(ptr, type, member) container_of(ptr,type,member)
                                           ^
    note: (skipping 2 expansions in backtrace; use 
-fmacro-backtrace-limit=0 to see all)
    include/linux/compiler_types.h:328:2: note: expanded from macro 
'compiletime_assert'
            _compiletime_assert(condition, msg, __compiletime_assert_, 
__COUNTER__)
            ^
    include/linux/compiler_types.h:316:2: note: expanded from macro 
'_compiletime_assert'
            __compiletime_assert(condition, msg, prefix, suffix)
            ^
    include/linux/compiler_types.h:308:3: note: expanded from macro 
'__compiletime_assert'
                    if (!(condition)) 
     \
                    ^
    net/bridge/br_multicast.c:970:3: note: Loop condition is false. 
Exiting loop
                    hlist_for_each_entry(ent, &pg->src_list, node) {
                    ^
    include/linux/list.h:993:13: note: expanded from macro 
'hlist_for_each_entry'
            for (pos = hlist_entry_safe((head)->first, typeof(*(pos)), 
member);\
                       ^
    include/linux/list.h:983:15: note: expanded from macro 
'hlist_entry_safe'
               ____ptr ? hlist_entry(____ptr, type, member) : NULL; \
                         ^
    include/linux/list.h:972:40: note: expanded from macro 'hlist_entry'
    #define hlist_entry(ptr, type, member) container_of(ptr,type,member)
                                           ^
    note: (skipping 2 expansions in backtrace; use 
-fmacro-backtrace-limit=0 to see all)
    include/linux/compiler_types.h:328:2: note: expanded from macro 
'compiletime_assert'
            _compiletime_assert(condition, msg, __compiletime_assert_, 
__COUNTER__)
            ^
    include/linux/compiler_types.h:316:2: note: expanded from macro 
'_compiletime_assert'
            __compiletime_assert(condition, msg, prefix, suffix)
            ^
    include/linux/compiler_types.h:306:2: note: expanded from macro 
'__compiletime_assert'
            do { 
     \
            ^
    net/bridge/br_multicast.c:970:3: note: Loop condition is true. 
Entering loop body
                    hlist_for_each_entry(ent, &pg->src_list, node) {
                    ^
    include/linux/list.h:993:2: note: expanded from macro 
'hlist_for_each_entry'
            for (pos = hlist_entry_safe((head)->first, typeof(*(pos)), 
member);\
            ^
    net/bridge/br_multicast.c:971:21: note: Left side of '&&' is true
                            if (over_llqt == time_after(ent->timer.expires,
                                             ^
    include/linux/jiffies.h:105:3: note: expanded from macro 'time_after'
            (typecheck(unsigned long, a) && \
             ^
    include/linux/typecheck.h:9:27: note: expanded from macro 'typecheck'
    #define typecheck(type,x) \
                              ^
    net/bridge/br_multicast.c:971:21: note: Left side of '&&' is true
                            if (over_llqt == time_after(ent->timer.expires,
                                             ^
    include/linux/jiffies.h:105:3: note: expanded from macro 'time_after'
            (typecheck(unsigned long, a) && \
             ^
    include/linux/typecheck.h:9:27: note: expanded from macro 'typecheck'
    #define typecheck(type,x) \
                              ^
    net/bridge/br_multicast.c:971:21: note: The left operand of '-' is a 
garbage value
                            if (over_llqt == time_after(ent->timer.expires,
                                             ^
    include/linux/jiffies.h:107:15: note: expanded from macro 'time_after'
             ((long)((b) - (a)) < 0))
                      ~  ^
    Suppressed 4 warnings (4 in non-user code).
    Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
    3 warnings generated.
    Suppressed 3 warnings (3 in non-user code).
    Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
    3 warnings generated.
    Suppressed 3 warnings (3 in non-user code).
    Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
    3 warnings generated.
    Suppressed 3 warnings (3 in non-user code).
    Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
    3 warnings generated.
    Suppressed 3 warnings (3 in non-user code).
    Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
    3 warnings generated.
    Suppressed 3 warnings (3 in non-user code).
    Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
    4 warnings generated.
    Suppressed 4 warnings (4 in non-user code).
    Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
    4 warnings generated.
    Suppressed 4 warnings (4 in non-user code).
    Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
    7 warnings generated.
>> net/unix/af_unix.c:837:7: warning: Access to field 'type' results in a dereference of a null pointer (loaded from variable 'sock') [clang-analyzer-core.NullDereference]
                    if (sock->type == SOCK_STREAM)
                        ^
    net/unix/af_unix.c:1299:6: note: 'err' is >= 0
            if (err < 0)
                ^~~
    net/unix/af_unix.c:1299:2: note: Taking false branch
            if (err < 0)
            ^
    net/unix/af_unix.c:1303:6: note: Assuming the condition is false
            if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr &&
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    net/unix/af_unix.c:1303:44: note: Left side of '&&' is false
            if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr &&
                                                      ^
    net/unix/af_unix.c:1317:37: note: Passing null pointer value via 2nd parameter 'sock'
            newsk = unix_create1(sock_net(sk), NULL, 0, sock->type);
                                               ^
    include/linux/stddef.h:8:14: note: expanded from macro 'NULL'
    #define NULL ((void *)0)
                 ^~~~~~~~~~~
    net/unix/af_unix.c:1317:10: note: Calling 'unix_create1'
            newsk = unix_create1(sock_net(sk), NULL, 0, sock->type);
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    net/unix/af_unix.c:828:6: note: Assuming the condition is false
            if (atomic_long_read(&unix_nr_socks) > 2 * get_max_files())
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    net/unix/af_unix.c:828:2: note: Taking false branch
            if (atomic_long_read(&unix_nr_socks) > 2 * get_max_files())
            ^
    net/unix/af_unix.c:831:6: note: Assuming 'type' is equal to 0
            if (type != 0) {
                ^~~~~~~~~
    net/unix/af_unix.c:831:2: note: Taking false branch
            if (type != 0) {
            ^
    net/unix/af_unix.c:837:7: note: Access to field 'type' results in a dereference of a null pointer (loaded from variable 'sock')
                    if (sock->type == SOCK_STREAM)
                        ^~~~
    net/unix/af_unix.c:1251:34: warning: Dereference of null pointer [clang-analyzer-core.NullDereference]
                    sk->sk_state = other->sk_state = TCP_ESTABLISHED;
                                   ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
    net/unix/af_unix.c:1189:6: note: Assuming the condition is false
            if (alen < offsetofend(struct sockaddr, sa_family))
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    net/unix/af_unix.c:1189:2: note: Taking false branch
            if (alen < offsetofend(struct sockaddr, sa_family))
            ^
    net/unix/af_unix.c:1192:6: note: Assuming field 'sa_family' is equal to AF_UNSPEC
            if (addr->sa_family != AF_UNSPEC) {
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
    net/unix/af_unix.c:1192:2: note: Taking false branch
            if (addr->sa_family != AF_UNSPEC) {
            ^
    net/unix/af_unix.c:1228:3: note: Null pointer value stored to 'other'
                    other = NULL;
                    ^~~~~~~~~~~~
    net/unix/af_unix.c:1235:6: note: Assuming field 'peer' is null
            if (unix_peer(sk)) {
                ^
    net/unix/af_unix.c:180:23: note: expanded from macro 'unix_peer'
    #define unix_peer(sk) (unix_sk(sk)->peer)
                          ^~~~~~~~~~~~~~~~~~~
    net/unix/af_unix.c:1235:2: note: Taking false branch
            if (unix_peer(sk)) {
            ^
    net/unix/af_unix.c:1247:3: note: Calling 'unix_state_double_unlock'
                    unix_state_double_unlock(sk, other);
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    net/unix/af_unix.c:1170:15: note: 'sk1' is not equal to 'sk2'
            if (unlikely(sk1 == sk2) || !sk2) {
                         ^
    include/linux/compiler.h:78:42: note: expanded from macro 'unlikely'
    # define unlikely(x)    __builtin_expect(!!(x), 0)
                                                ^
    net/unix/af_unix.c:1170:6: note: Left side of '||' is false
            if (unlikely(sk1 == sk2) || !sk2) {
                ^
    include/linux/compiler.h:78:22: note: expanded from macro 'unlikely'
    # define unlikely(x)    __builtin_expect(!!(x), 0)
                            ^
    net/unix/af_unix.c:1170:31: note: 'sk2' is null
            if (unlikely(sk1 == sk2) || !sk2) {
                                         ^~~
    net/unix/af_unix.c:1170:2: note: Taking true branch
            if (unlikely(sk1 == sk2) || !sk2) {
            ^
    net/unix/af_unix.c:1171:3: note: Calling 'spin_unlock'
                    unix_state_unlock(sk1);
                    ^
    include/net/af_unix.h:51:30: note: expanded from macro 'unix_state_unlock'
    #define unix_state_unlock(s)    spin_unlock(&unix_sk(s)->lock)
                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    include/linux/spinlock.h:394:2: note: Value assigned to field 'peer', which participates in a condition later
            raw_spin_unlock(&lock->rlock);
            ^
    include/linux/spinlock.h:284:32: note: expanded from macro 'raw_spin_unlock'
    #define raw_spin_unlock(lock)           _raw_spin_unlock(lock)
                                            ^~~~~~~~~~~~~~~~~~~~~~
    net/unix/af_unix.c:1171:3: note: Returning from 'spin_unlock'
                    unix_state_unlock(sk1);

vim +837 net/unix/af_unix.c

^1da177e4c3f41 Linus Torvalds   2005-04-16  821  
607ed02e3232aa Jiang Wang       2021-07-27  822  static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, int type)
^1da177e4c3f41 Linus Torvalds   2005-04-16  823  {
^1da177e4c3f41 Linus Torvalds   2005-04-16  824  	struct sock *sk = NULL;
^1da177e4c3f41 Linus Torvalds   2005-04-16  825  	struct unix_sock *u;
^1da177e4c3f41 Linus Torvalds   2005-04-16  826  
518de9b39e8545 Eric Dumazet     2010-10-26  827  	atomic_long_inc(&unix_nr_socks);
518de9b39e8545 Eric Dumazet     2010-10-26  828  	if (atomic_long_read(&unix_nr_socks) > 2 * get_max_files())
^1da177e4c3f41 Linus Torvalds   2005-04-16  829  		goto out;
^1da177e4c3f41 Linus Torvalds   2005-04-16  830  
607ed02e3232aa Jiang Wang       2021-07-27  831  	if (type != 0) {
607ed02e3232aa Jiang Wang       2021-07-27  832  		if (type == SOCK_STREAM)
607ed02e3232aa Jiang Wang       2021-07-27  833  			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_stream_proto, kern);
607ed02e3232aa Jiang Wang       2021-07-27  834  		else /*for seqpacket */
607ed02e3232aa Jiang Wang       2021-07-27  835  			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_dgram_proto, kern);
607ed02e3232aa Jiang Wang       2021-07-27  836  	} else {
607ed02e3232aa Jiang Wang       2021-07-27 @837  		if (sock->type == SOCK_STREAM)
607ed02e3232aa Jiang Wang       2021-07-27  838  			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_stream_proto, kern);
607ed02e3232aa Jiang Wang       2021-07-27  839  		else
607ed02e3232aa Jiang Wang       2021-07-27  840  			sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_dgram_proto, kern);
607ed02e3232aa Jiang Wang       2021-07-27  841  	}
^1da177e4c3f41 Linus Torvalds   2005-04-16  842  	if (!sk)
^1da177e4c3f41 Linus Torvalds   2005-04-16  843  		goto out;
^1da177e4c3f41 Linus Torvalds   2005-04-16  844  
^1da177e4c3f41 Linus Torvalds   2005-04-16  845  	sock_init_data(sock, sk);
^1da177e4c3f41 Linus Torvalds   2005-04-16  846  
3aa9799e13645f Vladimir Davydov 2016-07-26  847  	sk->sk_allocation	= GFP_KERNEL_ACCOUNT;
^1da177e4c3f41 Linus Torvalds   2005-04-16  848  	sk->sk_write_space	= unix_write_space;
a0a53c8ba95451 Denis V. Lunev   2007-12-11  849  	sk->sk_max_ack_backlog	= net->unx.sysctl_max_dgram_qlen;
^1da177e4c3f41 Linus Torvalds   2005-04-16  850  	sk->sk_destruct		= unix_sock_destructor;
^1da177e4c3f41 Linus Torvalds   2005-04-16  851  	u	  = unix_sk(sk);
40ffe67d2e89c7 Al Viro          2012-03-14  852  	u->path.dentry = NULL;
40ffe67d2e89c7 Al Viro          2012-03-14  853  	u->path.mnt = NULL;
fd19f329a32bdc Benjamin LaHaise 2006-01-03  854  	spin_lock_init(&u->lock);
516e0cc5646f37 Al Viro          2008-07-26  855  	atomic_long_set(&u->inflight, 0);
1fd05ba5a2f2aa Miklos Szeredi   2007-07-11  856  	INIT_LIST_HEAD(&u->link);
6e1ce3c3451291 Linus Torvalds   2016-09-01  857  	mutex_init(&u->iolock); /* single task reading lock */
6e1ce3c3451291 Linus Torvalds   2016-09-01  858  	mutex_init(&u->bindlock); /* single task binding lock */
^1da177e4c3f41 Linus Torvalds   2005-04-16  859  	init_waitqueue_head(&u->peer_wait);
7d267278a9ece9 Rainer Weikusat  2015-11-20  860  	init_waitqueue_func_entry(&u->peer_wake, unix_dgram_peer_wake_relay);
3c32da19a858fb Kirill Tkhai     2019-12-09  861  	memset(&u->scm_stat, 0, sizeof(struct scm_stat));
7123aaa3a14165 Eric Dumazet     2012-06-08  862  	unix_insert_socket(unix_sockets_unbound(sk), sk);
^1da177e4c3f41 Linus Torvalds   2005-04-16  863  out:
284b327be2f86c Pavel Emelyanov  2007-11-10  864  	if (sk == NULL)
518de9b39e8545 Eric Dumazet     2010-10-26  865  		atomic_long_dec(&unix_nr_socks);
920de804bca61f Eric Dumazet     2008-11-24  866  	else {
920de804bca61f Eric Dumazet     2008-11-24  867  		local_bh_disable();
a8076d8db98de6 Eric Dumazet     2008-11-17  868  		sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
920de804bca61f Eric Dumazet     2008-11-24  869  		local_bh_enable();
920de804bca61f Eric Dumazet     2008-11-24  870  	}
^1da177e4c3f41 Linus Torvalds   2005-04-16  871  	return sk;
^1da177e4c3f41 Linus Torvalds   2005-04-16  872  }
^1da177e4c3f41 Linus Torvalds   2005-04-16  873
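
The warning at line 837 comes from the `else` branch reading `sock->type` while the caller at line 1317 passes NULL for `sock`; the analyzer cannot prove that `type` is non-zero on that path. Below is a minimal user-space sketch of one way to guard the selection. It is not the fix adopted in this series. Every `sketch_`/`SKETCH_` name is a hypothetical stand-in for the kernel types and constants, and returning NULL when neither a type nor a socket is available is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for SOCK_STREAM / SOCK_DGRAM / SOCK_SEQPACKET. */
enum sketch_sock_type {
	SKETCH_SOCK_STREAM = 1,
	SKETCH_SOCK_DGRAM = 2,
	SKETCH_SOCK_SEQPACKET = 5,
};

/* Hypothetical stand-ins for struct socket and struct proto. */
struct sketch_socket { int type; };
struct sketch_proto { const char *name; };

static const struct sketch_proto sketch_stream_proto = { "UNIX-STREAM" };
static const struct sketch_proto sketch_dgram_proto = { "UNIX" };

/*
 * Pick a proto from an explicit type, falling back to sock->type only
 * when sock is non-NULL. Returns NULL if neither source is usable, so
 * the caller can bail out instead of dereferencing a NULL pointer.
 */
static const struct sketch_proto *
sketch_pick_proto(const struct sketch_socket *sock, int type)
{
	if (type == 0) {
		if (!sock)
			return NULL;
		type = sock->type;
	}

	return type == SKETCH_SOCK_STREAM ? &sketch_stream_proto
					  : &sketch_dgram_proto;
}

/* Small usage check covering the paths the analyzer walked. */
int sketch_demo(void)
{
	struct sketch_socket stream = { SKETCH_SOCK_STREAM };
	struct sketch_socket seqpkt = { SKETCH_SOCK_SEQPACKET };

	/* Caller passes NULL sock with a real type (the line 1317 case). */
	assert(sketch_pick_proto(NULL, SKETCH_SOCK_STREAM) == &sketch_stream_proto);
	/* Caller passes a socket and type 0: fall back to sock->type. */
	assert(sketch_pick_proto(&stream, 0) == &sketch_stream_proto);
	assert(sketch_pick_proto(&seqpkt, 0) == &sketch_dgram_proto);
	/* NULL sock and type 0: refuse instead of dereferencing. */
	assert(sketch_pick_proto(NULL, 0) == NULL);
	return 0;
}
```

Resolving the type once up front also removes the duplicated stream/dgram branches seen at lines 831-841.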
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org


[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33620 bytes --]

[-- Attachment #3: Attached Message Part --]
[-- Type: text/plain, Size: 150 bytes --]



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-08-10 17:04 UTC | newest]

Thread overview: 17+ messages
2021-07-27  0:12 [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket Jiang Wang
2021-07-27  0:12 ` [PATCH bpf-next v1 1/5] af_unix: add read_sock for stream socket types Jiang Wang
2021-07-29  8:37   ` Jakub Sitnicki
2021-07-27  0:12 ` [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap Jiang Wang
2021-07-27 16:37   ` John Fastabend
2021-07-28  2:07     ` Cong Wang
2021-07-28 18:43       ` John Fastabend
2021-07-29 21:27         ` Cong Wang
2021-08-10 17:04           ` John Fastabend
2021-07-27  0:12 ` [PATCH bpf-next v1 3/5] selftest/bpf: add tests for sockmap with unix stream type Jiang Wang
2021-07-27 16:38   ` John Fastabend
2021-07-27  0:12 ` [PATCH bpf-next v1 4/5] selftest/bpf: change udp to inet in some function names Jiang Wang
2021-07-27 16:40   ` John Fastabend
2021-07-27  0:12 ` [PATCH bpf-next v1 5/5] selftest/bpf: add new tests in sockmap for unix stream to tcp Jiang Wang
2021-07-27 16:42   ` John Fastabend
2021-07-27 16:44 ` [PATCH bpf-next v1 0/5] sockmap: add sockmap support for unix stream socket John Fastabend
     [not found] <202107280009.ualwfjZP-lkp@intel.com>
2021-07-28  0:38 ` [PATCH bpf-next v1 2/5] af_unix: add unix_stream_proto for sockmap kernel test robot
