LKML Archive on lore.kernel.org
* [PATCH] net: memcg: late association of sock to memcg
@ 2020-02-22  1:04 Shakeel Butt
  2020-02-22  1:48 ` Roman Gushchin
  2020-02-24  7:29 ` Eric Dumazet
  0 siblings, 2 replies; 5+ messages in thread
From: Shakeel Butt @ 2020-02-22  1:04 UTC (permalink / raw)
  To: Eric Dumazet, Roman Gushchin
  Cc: Johannes Weiner, Michal Hocko, Andrew Morton, David S . Miller,
	Alexey Kuznetsov, netdev, Hideaki YOSHIFUJI, linux-mm, cgroups,
	linux-kernel, Shakeel Butt

If a TCP socket is allocated in IRQ context, or cloned in IRQ context
from an unassociated socket (i.e. one not associated with a memcg), it
will remain unassociated for its whole life. Almost half of the TCP
sockets created on the system are created in IRQ context, so memory
used by such sockets will not be accounted by the memcg.

This issue is more widespread in cgroup v1 where network memory
accounting is opt-in but it can happen in cgroup v2 if the source socket
for the cloning was created in root memcg.

To fix the issue, just do the late association of the unassociated
sockets at accept() time in the process context and then force charge
the memory buffer already reserved by the socket.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
 net/ipv4/inet_connection_sock.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index a4db79b1b643..df9c8ef024a2 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -482,6 +482,13 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
 		}
 		spin_unlock_bh(&queue->fastopenq.lock);
 	}
+
+	if (mem_cgroup_sockets_enabled && !newsk->sk_memcg) {
+		mem_cgroup_sk_alloc(newsk);
+		if (newsk->sk_memcg)
+			mem_cgroup_charge_skmem(newsk->sk_memcg,
+					sk_mem_pages(newsk->sk_forward_alloc));
+	}
 out:
 	release_sock(sk);
 	if (req)
-- 
2.25.0.265.gbab2e86ba0-goog



* Re: [PATCH] net: memcg: late association of sock to memcg
  2020-02-22  1:04 [PATCH] net: memcg: late association of sock to memcg Shakeel Butt
@ 2020-02-22  1:48 ` Roman Gushchin
  2020-02-22  1:54   ` Shakeel Butt
  2020-02-24  7:29 ` Eric Dumazet
  1 sibling, 1 reply; 5+ messages in thread
From: Roman Gushchin @ 2020-02-22  1:48 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Eric Dumazet, Johannes Weiner, Michal Hocko, Andrew Morton,
	David S . Miller, Alexey Kuznetsov, netdev, Hideaki YOSHIFUJI,
	linux-mm, cgroups, linux-kernel

On Fri, Feb 21, 2020 at 05:04:56PM -0800, Shakeel Butt wrote:
> If a TCP socket is allocated in IRQ context or cloned from unassociated
> (i.e. not associated to a memcg) in IRQ context then it will remain
> unassociated for its whole life. Almost half of the TCPs created on the
> system are created in IRQ context, so, memory used by such sockets will
> not be accounted by the memcg.
> 
> This issue is more widespread in cgroup v1 where network memory
> accounting is opt-in but it can happen in cgroup v2 if the source socket
> for the cloning was created in root memcg.
> 
> To fix the issue, just do the late association of the unassociated
> sockets at accept() time in the process context and then force charge
> the memory buffer already reserved by the socket.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>

Hello, Shakeel!

> ---
>  net/ipv4/inet_connection_sock.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index a4db79b1b643..df9c8ef024a2 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -482,6 +482,13 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
>  		}
>  		spin_unlock_bh(&queue->fastopenq.lock);
>  	}
> +
> +	if (mem_cgroup_sockets_enabled && !newsk->sk_memcg) {
> +		mem_cgroup_sk_alloc(newsk);
> +		if (newsk->sk_memcg)
> +			mem_cgroup_charge_skmem(newsk->sk_memcg,
> +					sk_mem_pages(newsk->sk_forward_alloc));
> +	}

Looks good to me from the memcg side. Let's see what networking people will say...

Btw, do you plan to make a separate patch for associating the socket with the default
cgroup on the unified hierarchy? I mean cgroup_sk_alloc().

Thank you for working on it!

Roman


* Re: [PATCH] net: memcg: late association of sock to memcg
  2020-02-22  1:48 ` Roman Gushchin
@ 2020-02-22  1:54   ` Shakeel Butt
  0 siblings, 0 replies; 5+ messages in thread
From: Shakeel Butt @ 2020-02-22  1:54 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Eric Dumazet, Johannes Weiner, Michal Hocko, Andrew Morton,
	David S . Miller, Alexey Kuznetsov, netdev, Hideaki YOSHIFUJI,
	Linux MM, Cgroups, LKML

On Fri, Feb 21, 2020 at 5:49 PM Roman Gushchin <guro@fb.com> wrote:
>
> On Fri, Feb 21, 2020 at 05:04:56PM -0800, Shakeel Butt wrote:
> > If a TCP socket is allocated in IRQ context or cloned from unassociated
> > (i.e. not associated to a memcg) in IRQ context then it will remain
> > unassociated for its whole life. Almost half of the TCPs created on the
> > system are created in IRQ context, so, memory used by such sockets will
> > not be accounted by the memcg.
> >
> > This issue is more widespread in cgroup v1 where network memory
> > accounting is opt-in but it can happen in cgroup v2 if the source socket
> > for the cloning was created in root memcg.
> >
> > To fix the issue, just do the late association of the unassociated
> > sockets at accept() time in the process context and then force charge
> > the memory buffer already reserved by the socket.
> >
> > Signed-off-by: Shakeel Butt <shakeelb@google.com>
>
> Hello, Shakeel!
>
> > ---
> >  net/ipv4/inet_connection_sock.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> > index a4db79b1b643..df9c8ef024a2 100644
> > --- a/net/ipv4/inet_connection_sock.c
> > +++ b/net/ipv4/inet_connection_sock.c
> > @@ -482,6 +482,13 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
> >               }
> >               spin_unlock_bh(&queue->fastopenq.lock);
> >       }
> > +
> > +     if (mem_cgroup_sockets_enabled && !newsk->sk_memcg) {
> > +             mem_cgroup_sk_alloc(newsk);
> > +             if (newsk->sk_memcg)
> > +                     mem_cgroup_charge_skmem(newsk->sk_memcg,
> > +                                     sk_mem_pages(newsk->sk_forward_alloc));
> > +     }
>
> Looks good to me from the memcg side. Let's see what networking people will say...
>
> Btw, do you plan to make a separate patch for associating the socket with the default
> cgroup on the unified hierarchy? I mean cgroup_sk_alloc().
>

Yes. I tried to do that here but was not able to do so without adding
the (newsk->sk_cgrp_data.val) check, which I cannot do in this file
because sk_cgrp_data might not be compiled in. I will send a separate
patch.

Shakeel


* Re: [PATCH] net: memcg: late association of sock to memcg
  2020-02-22  1:04 [PATCH] net: memcg: late association of sock to memcg Shakeel Butt
  2020-02-22  1:48 ` Roman Gushchin
@ 2020-02-24  7:29 ` Eric Dumazet
  2020-02-24 16:38   ` Shakeel Butt
  1 sibling, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2020-02-24  7:29 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Roman Gushchin, Johannes Weiner, Michal Hocko, Andrew Morton,
	David S . Miller, Alexey Kuznetsov, netdev, Hideaki YOSHIFUJI,
	linux-mm, Cgroups, LKML

On Fri, Feb 21, 2020 at 5:05 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> If a TCP socket is allocated in IRQ context or cloned from unassociated
> (i.e. not associated to a memcg) in IRQ context then it will remain
> unassociated for its whole life. Almost half of the TCPs created on the
> system are created in IRQ context, so, memory used by such sockets will
> not be accounted by the memcg.
>
> This issue is more widespread in cgroup v1 where network memory
> accounting is opt-in but it can happen in cgroup v2 if the source socket
> for the cloning was created in root memcg.
>
> To fix the issue, just do the late association of the unassociated
> sockets at accept() time in the process context and then force charge
> the memory buffer already reserved by the socket.
>
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---
>  net/ipv4/inet_connection_sock.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index a4db79b1b643..df9c8ef024a2 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -482,6 +482,13 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
>                 }
>                 spin_unlock_bh(&queue->fastopenq.lock);
>         }
> +
> +       if (mem_cgroup_sockets_enabled && !newsk->sk_memcg) {
> +               mem_cgroup_sk_alloc(newsk);
> +               if (newsk->sk_memcg)
> +                       mem_cgroup_charge_skmem(newsk->sk_memcg,
> +                                       sk_mem_pages(newsk->sk_forward_alloc));

I am not sure what you are trying to do here.

sk->sk_forward_alloc is not the total amount of memory used by a TCP socket.
It is only some part that has been reserved, but not yet consumed.

For example, every skb that has been stored in the TCP receive queue or
out-of-order queue might have used memory.

I guess that if we assume that a not yet accepted socket can not have
any outstanding data in its transmit queue, you need to use
sk->sk_rmem_alloc as well.
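
The arithmetic behind this point can be illustrated with a small userspace sketch. It is not kernel code: struct mock_sock, mock_mem_pages() and pages_to_charge() are hypothetical names, and the 4096-byte accounting quantum is an assumption chosen to mirror a page-sized unit. It only shows how rounding the sum of the reserved bytes and the receive-queue bytes up to whole pages differs from looking at the reserved bytes alone.

```c
#include <assert.h>

/* Assumed accounting unit in bytes (hypothetical, one 4 KiB page). */
#define MOCK_MEM_QUANTUM 4096

/* Hypothetical stand-in for the two struct sock fields discussed above. */
struct mock_sock {
	int forward_alloc;	/* reserved but not yet consumed, in bytes */
	int rmem_alloc;		/* held by skbs in the receive queues, in bytes */
};

/* Round a byte count up to whole accounting units. */
static int mock_mem_pages(int amt)
{
	assert(amt >= 0);
	return (amt + MOCK_MEM_QUANTUM - 1) / MOCK_MEM_QUANTUM;
}

/*
 * Pages a late association might need to charge: the reservation plus
 * whatever was already queued for receive before accept(). Transmit
 * memory is ignored on the assumption that an unaccepted socket has
 * sent nothing.
 */
static int pages_to_charge(const struct mock_sock *sk)
{
	return mock_mem_pages(sk->forward_alloc + sk->rmem_alloc);
}
```

With 2 MiB already sitting in the receive queue and nothing left in the reservation, pages_to_charge() yields 512 pages, whereas mock_mem_pages() on the reservation alone would yield 0.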

To test this patch, make sure to add a delay before accept(), so that
2MB worth of data can be queued before accept() happens.
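
One possible way to set up such a test from userspace is sketched below. Everything here is an assumption for illustration: queue_then_accept() is a hypothetical helper, it uses the loopback interface with a kernel-chosen port, and it does not inspect memcg state itself. It connects a client, pushes data with non-blocking sends until the requested amount is written or the buffers fill, sleeps, and only then calls accept().

```c
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: queue up to `want` bytes on a connection that has
 * not been accepted yet, wait `delay_sec` seconds, then accept() and
 * drain it. Returns the number of bytes read, or -1 on setup failure.
 */
static long queue_then_accept(unsigned int delay_sec, size_t want)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	char buf[4096];
	size_t sent = 0;
	long got = -1;
	int lfd, cfd, afd = -1;
	ssize_t n;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);	/* port 0: kernel picks */

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0 || cfd < 0)
		goto out;
	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lfd, 1) < 0 ||
	    getsockname(lfd, (struct sockaddr *)&addr, &len) < 0 ||
	    connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	/* Non-blocking sends so a full buffer stops the loop, not the test. */
	if (fcntl(cfd, F_SETFL, fcntl(cfd, F_GETFL) | O_NONBLOCK) < 0)
		goto out;
	memset(buf, 'x', sizeof(buf));
	while (sent < want) {
		size_t chunk = want - sent < sizeof(buf) ? want - sent : sizeof(buf);

		n = send(cfd, buf, chunk, MSG_NOSIGNAL);
		if (n <= 0)
			break;
		sent += (size_t)n;
	}

	sleep(delay_sec);	/* data sits on the unaccepted socket meanwhile */

	afd = accept(lfd, NULL, NULL);
	if (afd < 0)
		goto out;
	shutdown(cfd, SHUT_WR);	/* lets the reader below see end of stream */
	got = 0;
	while ((n = read(afd, buf, sizeof(buf))) > 0)
		got += n;
out:
	if (afd >= 0)
		close(afd);
	if (cfd >= 0)
		close(cfd);
	if (lfd >= 0)
		close(lfd);
	return got;
}
```

Running something like queue_then_accept(2, 2 * 1024 * 1024) from a process inside the memcg under test, and comparing that cgroup's socket memory counters before and after the accept(), would be one way to check whether the queued data gets charged. How much data actually fits before accept() depends on the socket buffer limits of the system.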

Thanks.


* Re: [PATCH] net: memcg: late association of sock to memcg
  2020-02-24  7:29 ` Eric Dumazet
@ 2020-02-24 16:38   ` Shakeel Butt
  0 siblings, 0 replies; 5+ messages in thread
From: Shakeel Butt @ 2020-02-24 16:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Roman Gushchin, Johannes Weiner, Michal Hocko, Andrew Morton,
	David S . Miller, Alexey Kuznetsov, netdev, Hideaki YOSHIFUJI,
	linux-mm, Cgroups, LKML

On Sun, Feb 23, 2020 at 11:29 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Fri, Feb 21, 2020 at 5:05 PM Shakeel Butt <shakeelb@google.com> wrote:
> >
> > If a TCP socket is allocated in IRQ context or cloned from unassociated
> > (i.e. not associated to a memcg) in IRQ context then it will remain
> > unassociated for its whole life. Almost half of the TCPs created on the
> > system are created in IRQ context, so, memory used by such sockets will
> > not be accounted by the memcg.
> >
> > This issue is more widespread in cgroup v1 where network memory
> > accounting is opt-in but it can happen in cgroup v2 if the source socket
> > for the cloning was created in root memcg.
> >
> > To fix the issue, just do the late association of the unassociated
> > sockets at accept() time in the process context and then force charge
> > the memory buffer already reserved by the socket.
> >
> > Signed-off-by: Shakeel Butt <shakeelb@google.com>
> > ---
> >  net/ipv4/inet_connection_sock.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> > index a4db79b1b643..df9c8ef024a2 100644
> > --- a/net/ipv4/inet_connection_sock.c
> > +++ b/net/ipv4/inet_connection_sock.c
> > @@ -482,6 +482,13 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
> >                 }
> >                 spin_unlock_bh(&queue->fastopenq.lock);
> >         }
> > +
> > +       if (mem_cgroup_sockets_enabled && !newsk->sk_memcg) {
> > +               mem_cgroup_sk_alloc(newsk);
> > +               if (newsk->sk_memcg)
> > +                       mem_cgroup_charge_skmem(newsk->sk_memcg,
> > +                                       sk_mem_pages(newsk->sk_forward_alloc));
>
> I am not sure what you are trying to do here.
>
> sk->sk_forward_alloc is not the total amount of memory used by a TCP socket.
> It is only some part that has been reserved, but not yet consumed.
>
> For example, every skb that has been stored in the TCP receive queue or
> out-of-order queue might have used memory.
>
> I guess that if we assume that a not yet accepted socket can not have
> any outstanding data in its transmit queue, you need to use
> sk->sk_rmem_alloc as well.

Thanks a lot. I will add that with a comment. BTW, for my own
understanding, which field represents the transmit queue size?

>
> To test this patch, make sure to add a delay before accept(), so that
> 2MB worth of data can be queued before accept() happens.

Yes, I will test this with a delay.

thanks,
Shakeel

