Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH bpf v3 0/2] potential sockmap memleak and proc stats fix
@ 2021-07-06 16:31 John Fastabend
  2021-07-06 16:31 ` [PATCH bpf v3 1/2] bpf, sockmap: fix potential memory leak on unlikely error case John Fastabend
  2021-07-06 16:31 ` [PATCH bpf v3 2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats John Fastabend
  0 siblings, 2 replies; 8+ messages in thread
From: John Fastabend @ 2021-07-06 16:31 UTC (permalink / raw)
  To: ast, daniel, andriin, xiyou.wangcong; +Cc: john.fastabend, bpf, netdev

While investigating a memleak in sockmap I found these two issues. Patch
1 found doing code review, I wasn't able to get KASAN to trigger a
memleak here, but should be necessary. Patch 2 fixes proc stats so when
we use sockstats for debugging we get correct values.

The fix for observered memleak will come after these, but requires some
more discussion and potentially patch revert so I'll try to get the set
here going now.

John Fastabend (2):
  bpf, sockmap: fix potential memory leak on unlikely error case
  bpf, sockmap: sk_prot needs inuse_idx set for proc stats

 net/core/skmsg.c    | 10 ++++++----
 net/core/sock_map.c | 11 ++++++++++-
 2 files changed, 16 insertions(+), 5 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH bpf v3 1/2] bpf, sockmap: fix potential memory leak on unlikely error case
  2021-07-06 16:31 [PATCH bpf v3 0/2] potential sockmap memleak and proc stats fix John Fastabend
@ 2021-07-06 16:31 ` John Fastabend
  2021-07-08 19:38   ` Cong Wang
  2021-07-06 16:31 ` [PATCH bpf v3 2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats John Fastabend
  1 sibling, 1 reply; 8+ messages in thread
From: John Fastabend @ 2021-07-06 16:31 UTC (permalink / raw)
  To: ast, daniel, andriin, xiyou.wangcong; +Cc: john.fastabend, bpf, netdev

If skb_linearize is needed and fails we could leak a msg on the error
handling. To fix ensure we kfree the msg block before returning error.
Found during code review.

Fixes: 4363023d2668e ("bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_list")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/skmsg.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 9b6160a191f8..1a9059e5b3b3 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -508,10 +508,8 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
 	if (skb_linearize(skb))
 		return -EAGAIN;
 	num_sge = skb_to_sgvec(skb, msg->sg.data, 0, skb->len);
-	if (unlikely(num_sge < 0)) {
-		kfree(msg);
+	if (unlikely(num_sge < 0))
 		return num_sge;
-	}
 
 	copied = skb->len;
 	msg->sg.start = 0;
@@ -530,6 +528,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb)
 {
 	struct sock *sk = psock->sk;
 	struct sk_msg *msg;
+	int err;
 
 	/* If we are receiving on the same sock skb->sk is already assigned,
 	 * skip memory accounting and owner transition seeing it already set
@@ -548,7 +547,10 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb)
 	 * into user buffers.
 	 */
 	skb_set_owner_r(skb, sk);
-	return sk_psock_skb_ingress_enqueue(skb, psock, sk, msg);
+	err = sk_psock_skb_ingress_enqueue(skb, psock, sk, msg);
+	if (err < 0)
+		kfree(msg);
+	return err;
 }
 
 /* Puts an skb on the ingress queue of the socket already assigned to the
-- 
2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH bpf v3 2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats
  2021-07-06 16:31 [PATCH bpf v3 0/2] potential sockmap memleak and proc stats fix John Fastabend
  2021-07-06 16:31 ` [PATCH bpf v3 1/2] bpf, sockmap: fix potential memory leak on unlikely error case John Fastabend
@ 2021-07-06 16:31 ` John Fastabend
  2021-07-08 19:42   ` Cong Wang
  2021-07-12  7:22   ` Jakub Sitnicki
  1 sibling, 2 replies; 8+ messages in thread
From: John Fastabend @ 2021-07-06 16:31 UTC (permalink / raw)
  To: ast, daniel, andriin, xiyou.wangcong; +Cc: john.fastabend, bpf, netdev

Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
We currently do not set this correctly from sockmap side. The result is
reading sock stats '/proc/net/sockstat' gives incorrect values. The
socket counter is incremented correctly, but because we don't set the
counter correctly when we replace sk_prot we may omit the decrement.

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/sock_map.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 60decd6420ca..27bdf768aa8c 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -185,10 +185,19 @@ static void sock_map_unref(struct sock *sk, void *link_raw)
 
 static int sock_map_init_proto(struct sock *sk, struct sk_psock *psock)
 {
+	int err;
+#ifdef CONFIG_PROC_FS
+	int idx = sk->sk_prot->inuse_idx;
+#endif
 	if (!sk->sk_prot->psock_update_sk_prot)
 		return -EINVAL;
 	psock->psock_update_sk_prot = sk->sk_prot->psock_update_sk_prot;
-	return sk->sk_prot->psock_update_sk_prot(sk, psock, false);
+	err = sk->sk_prot->psock_update_sk_prot(sk, psock, false);
+#ifdef CONFIG_PROC_FS
+	if (!err)
+		sk->sk_prot->inuse_idx = idx;
+#endif
+	return err;
 }
 
 static struct sk_psock *sock_map_psock_get_checked(struct sock *sk)
-- 
2.25.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf v3 1/2] bpf, sockmap: fix potential memory leak on unlikely error case
  2021-07-06 16:31 ` [PATCH bpf v3 1/2] bpf, sockmap: fix potential memory leak on unlikely error case John Fastabend
@ 2021-07-08 19:38   ` Cong Wang
  2021-07-08 20:39     ` John Fastabend
  0 siblings, 1 reply; 8+ messages in thread
From: Cong Wang @ 2021-07-08 19:38 UTC (permalink / raw)
  To: John Fastabend
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Linux Kernel Network Developers

On Tue, Jul 6, 2021 at 9:31 AM John Fastabend <john.fastabend@gmail.com> wrote:
>
> If skb_linearize is needed and fails we could leak a msg on the error
> handling. To fix ensure we kfree the msg block before returning error.
> Found during code review.

sk_psock_skb_ingress_self() also needs the same fix, right?
Other than this, it looks good to me.

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf v3 2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats
  2021-07-06 16:31 ` [PATCH bpf v3 2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats John Fastabend
@ 2021-07-08 19:42   ` Cong Wang
  2021-07-12  7:22   ` Jakub Sitnicki
  1 sibling, 0 replies; 8+ messages in thread
From: Cong Wang @ 2021-07-08 19:42 UTC (permalink / raw)
  To: John Fastabend
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Linux Kernel Network Developers

On Tue, Jul 6, 2021 at 9:31 AM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
> We currently do not set this correctly from sockmap side. The result is
> reading sock stats '/proc/net/sockstat' gives incorrect values. The
> socket counter is incremented correctly, but because we don't set the
> counter correctly when we replace sk_prot we may omit the decrement.
>
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  net/core/sock_map.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 60decd6420ca..27bdf768aa8c 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -185,10 +185,19 @@ static void sock_map_unref(struct sock *sk, void *link_raw)
>
>  static int sock_map_init_proto(struct sock *sk, struct sk_psock *psock)
>  {
> +       int err;
> +#ifdef CONFIG_PROC_FS
> +       int idx = sk->sk_prot->inuse_idx;
> +#endif

A nit: Reverse XMAS tree declaration style is preferred for networking
subsystem.

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf v3 1/2] bpf, sockmap: fix potential memory leak on unlikely error case
  2021-07-08 19:38   ` Cong Wang
@ 2021-07-08 20:39     ` John Fastabend
  0 siblings, 0 replies; 8+ messages in thread
From: John Fastabend @ 2021-07-08 20:39 UTC (permalink / raw)
  To: Cong Wang, John Fastabend
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Linux Kernel Network Developers

Cong Wang wrote:
> On Tue, Jul 6, 2021 at 9:31 AM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > If skb_linearize is needed and fails we could leak a msg on the error
> > handling. To fix ensure we kfree the msg block before returning error.
> > Found during code review.
> 
> sk_psock_skb_ingress_self() also needs the same fix, right?

Yep.

> Other than this, it looks good to me.

I'll do another spin to get the other one as well. Mind as well
fix both cases at once.

> 
> Thanks.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf v3 2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats
  2021-07-06 16:31 ` [PATCH bpf v3 2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats John Fastabend
  2021-07-08 19:42   ` Cong Wang
@ 2021-07-12  7:22   ` Jakub Sitnicki
  2021-07-12 17:17     ` John Fastabend
  1 sibling, 1 reply; 8+ messages in thread
From: Jakub Sitnicki @ 2021-07-12  7:22 UTC (permalink / raw)
  To: John Fastabend; +Cc: ast, daniel, andriin, xiyou.wangcong, bpf, netdev

On Tue, Jul 06, 2021 at 06:31 PM CEST, John Fastabend wrote:
> Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
> We currently do not set this correctly from sockmap side. The result is
> reading sock stats '/proc/net/sockstat' gives incorrect values. The
> socket counter is incremented correctly, but because we don't set the
> counter correctly when we replace sk_prot we may omit the decrement.
>
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  net/core/sock_map.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 60decd6420ca..27bdf768aa8c 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -185,10 +185,19 @@ static void sock_map_unref(struct sock *sk, void *link_raw)
>
>  static int sock_map_init_proto(struct sock *sk, struct sk_psock *psock)
>  {
> +	int err;
> +#ifdef CONFIG_PROC_FS
> +	int idx = sk->sk_prot->inuse_idx;
> +#endif
>  	if (!sk->sk_prot->psock_update_sk_prot)
>  		return -EINVAL;
>  	psock->psock_update_sk_prot = sk->sk_prot->psock_update_sk_prot;
> -	return sk->sk_prot->psock_update_sk_prot(sk, psock, false);
> +	err = sk->sk_prot->psock_update_sk_prot(sk, psock, false);
> +#ifdef CONFIG_PROC_FS
> +	if (!err)
> +		sk->sk_prot->inuse_idx = idx;
> +#endif
> +	return err;
>  }
>
>  static struct sk_psock *sock_map_psock_get_checked(struct sock *sk)

We could initialize inuse_idx just once in {tcp,udp}_bpf_rebuild_protos,
if we changed {tcp,udp}_bpf_v4_build_proto to be a late_initcall, so
that it runs after inet_init when {tcp,udp}_prot and udp_prot are
already registered and have inuse_idx assigned.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf v3 2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats
  2021-07-12  7:22   ` Jakub Sitnicki
@ 2021-07-12 17:17     ` John Fastabend
  0 siblings, 0 replies; 8+ messages in thread
From: John Fastabend @ 2021-07-12 17:17 UTC (permalink / raw)
  To: Jakub Sitnicki, John Fastabend
  Cc: ast, daniel, andriin, xiyou.wangcong, bpf, netdev

Jakub Sitnicki wrote:
> On Tue, Jul 06, 2021 at 06:31 PM CEST, John Fastabend wrote:
> > Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
> > We currently do not set this correctly from sockmap side. The result is
> > reading sock stats '/proc/net/sockstat' gives incorrect values. The
> > socket counter is incremented correctly, but because we don't set the
> > counter correctly when we replace sk_prot we may omit the decrement.
> >
> > Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> > ---
> >  net/core/sock_map.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> > index 60decd6420ca..27bdf768aa8c 100644
> > --- a/net/core/sock_map.c
> > +++ b/net/core/sock_map.c
> > @@ -185,10 +185,19 @@ static void sock_map_unref(struct sock *sk, void *link_raw)
> >
> >  static int sock_map_init_proto(struct sock *sk, struct sk_psock *psock)
> >  {
> > +	int err;
> > +#ifdef CONFIG_PROC_FS
> > +	int idx = sk->sk_prot->inuse_idx;
> > +#endif
> >  	if (!sk->sk_prot->psock_update_sk_prot)
> >  		return -EINVAL;
> >  	psock->psock_update_sk_prot = sk->sk_prot->psock_update_sk_prot;
> > -	return sk->sk_prot->psock_update_sk_prot(sk, psock, false);
> > +	err = sk->sk_prot->psock_update_sk_prot(sk, psock, false);
> > +#ifdef CONFIG_PROC_FS
> > +	if (!err)
> > +		sk->sk_prot->inuse_idx = idx;
> > +#endif
> > +	return err;
> >  }
> >
> >  static struct sk_psock *sock_map_psock_get_checked(struct sock *sk)
> 
> We could initialize inuse_idx just once in {tcp,udp}_bpf_rebuild_protos,
> if we changed {tcp,udp}_bpf_v4_build_proto to be a late_initcall, so
> that it runs after inet_init when {tcp,udp}_prot and udp_prot are
> already registered and have inuse_idx assigned.

OK does seem slightly nicer. Then I guess the diff is just,

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index f26916a62f25..d3e9386b493e 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -503,7 +503,7 @@ static int __init tcp_bpf_v4_build_proto(void)
        tcp_bpf_rebuild_protos(tcp_bpf_prots[TCP_BPF_IPV4], &tcp_prot);
        return 0;
 }
-core_initcall(tcp_bpf_v4_build_proto);
+late_initcall(tcp_bpf_v4_build_proto);
 
 static int tcp_bpf_assert_proto_ops(struct proto *ops)
 {

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-07-12 17:18 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-06 16:31 [PATCH bpf v3 0/2] potential sockmap memleak and proc stats fix John Fastabend
2021-07-06 16:31 ` [PATCH bpf v3 1/2] bpf, sockmap: fix potential memory leak on unlikely error case John Fastabend
2021-07-08 19:38   ` Cong Wang
2021-07-08 20:39     ` John Fastabend
2021-07-06 16:31 ` [PATCH bpf v3 2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats John Fastabend
2021-07-08 19:42   ` Cong Wang
2021-07-12  7:22   ` Jakub Sitnicki
2021-07-12 17:17     ` John Fastabend

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).