LKML Archive on lore.kernel.org
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Eric Dumazet <edumazet@google.com>
Cc: David Miller <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	Russell King <linux@armlinux.org.uk>,
	Marcin Wojtas <mw@semihalf.com>, <linuxarm@openeuler.org>,
	Yisen Zhuang <yisen.zhuang@huawei.com>,
	Salil Mehta <salil.mehta@huawei.com>,
	Thomas Petazzoni <thomas.petazzoni@bootlin.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	Alexei Starovoitov <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	John Fastabend <john.fastabend@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Will Deacon <will@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Fenghua Yu <fenghua.yu@intel.com>, Roman Gushchin <guro@fb.com>,
	Peter Xu <peterx@redhat.com>, "Tang, Feng" <feng.tang@intel.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, <mcroce@microsoft.com>,
	Hugh Dickins <hughd@google.com>,
	Jonathan Lemon <jonathan.lemon@gmail.com>,
	Alexander Lobakin <alobakin@pm.me>,
	Willem de Bruijn <willemb@google.com>, wenxu <wenxu@ucloud.cn>,
	Cong Wang <cong.wang@bytedance.com>,
	Kevin Hao <haokexin@gmail.com>,
	Aleksandr Nogikh <nogikh@google.com>,
	Marco Elver <elver@google.com>, Yonghong Song <yhs@fb.com>,
	<kpsingh@kernel.org>, "Andrii Nakryiko" <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	netdev <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	<chenhao288@hisilicon.com>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	David Ahern <dsahern@kernel.org>, <memxor@gmail.com>,
	<linux@rempel-privat.de>, Antoine Tenart <atenart@kernel.org>,
	Wei Wang <weiwan@google.com>, Taehee Yoo <ap420073@gmail.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Mat Martineau <mathew.j.martineau@linux.intel.com>,
	<aahringo@redhat.com>, <ceggers@arri.de>, <yangbo.lu@nxp.com>,
	"Florian Westphal" <fw@strlen.de>, <xiangxia.m.yue@gmail.com>,
	linmiaohe <linmiaohe@huawei.com>, <hch@lst.de>
Subject: Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support
Date: Wed, 18 Aug 2021 17:36:06 +0800
Message-ID: <2cf4b672-d7dc-db3d-ce90-15b4e91c4005@huawei.com>
In-Reply-To: <CANn89iJDf9uzSdqLEBeTeGB1uAxvmruKfK5HbeZWp+Cdc+qggQ@mail.gmail.com>

On 2021/8/18 16:57, Eric Dumazet wrote:
> On Wed, Aug 18, 2021 at 5:33 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>
>> This patchset adds the socket to netdev page frag recycling
>> support based on the busy polling and page pool infrastructure.
> 
> I really do not see how this can scale to thousands of sockets.
> 
> tcp_mem[] defaults to ~ 9 % of physical memory.
> 
> If you now run tests with thousands of sockets, their skbs will
> consume Gigabytes of memory on typical servers, now backed by order-0
> pages (instead of current order-3 pages)
> So IOMMU costs will actually be much bigger.

The page allocator supports bulk allocation now, see:
https://elixir.bootlin.com/linux/latest/source/net/core/page_pool.c#L252

If DMA also supported batched mapping/unmapping, maybe having a
small-sized page pool for thousands of sockets would not be a problem?
Christoph Hellwig mentioned batched DMA operation support in the
thread below:
https://www.spinics.net/lists/netdev/msg666715.html

If batched DMA operations are supported, maybe the page pool would
mainly benefit the case of a small number of sockets?
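
For a rough sense of the numbers behind the order-0 versus order-3 concern, here is a minimal userspace sketch. It assumes 4 KiB base pages and, as a simplification, one DMA/IOMMU mapping per allocated page chunk; real drivers and any batched DMA API may behave differently. The helper name sketch_mappings_needed() is hypothetical and is not part of this patchset or of the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): number of page chunks, and
 * therefore mappings under the one-mapping-per-chunk assumption, needed
 * to back `bytes` of frag memory with allocations of the given order.
 */
static uint64_t sketch_mappings_needed(uint64_t bytes, unsigned int order,
                                       unsigned int page_shift)
{
    uint64_t chunk;

    /* Keep the shift below the width of uint64_t. */
    assert(page_shift + order < 64);

    chunk = (uint64_t)1 << (page_shift + order);

    /* Round up so a partially used chunk still counts as one mapping. */
    return (bytes + chunk - 1) / chunk;
}
```

With page_shift = 12, backing 1 GiB needs 262144 mappings at order 0 and 32768 at order 3, an 8x difference. That gap is what batched mapping/unmapping would have to amortize.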

> 
> Are we planning to use Gigabyte sized page pools for NIC ?
> 
> Have you tried instead to make TCP frags twice bigger ?

Not yet.

> This would require less IOMMU mappings.
> (Note: This could require some mm help, since PAGE_ALLOC_COSTLY_ORDER
> is currently 3, not 4)

I am not familiar with mm yet, but I will take a look at that :)

> 
> diff --git a/net/core/sock.c b/net/core/sock.c
> index a3eea6e0b30a7d43793f567ffa526092c03e3546..6b66b51b61be9f198f6f1c4a3d81b57fa327986a 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -2560,7 +2560,7 @@ static void sk_leave_memory_pressure(struct sock *sk)
>         }
>  }
> 
> -#define SKB_FRAG_PAGE_ORDER    get_order(32768)
> +#define SKB_FRAG_PAGE_ORDER    get_order(65536)
>  DEFINE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);
> 
>  /**
> 
> 
> 
>>
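
To make the order arithmetic in the quoted diff concrete, here is a minimal userspace sketch of what get_order() computes. It assumes 4 KiB base pages (PAGE_SHIFT of 12) and takes the PAGE_ALLOC_COSTLY_ORDER value of 3 from the note above. The sketch_* names are hypothetical stand-ins, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB base pages; other configurations use larger pages. */
#define SKETCH_PAGE_SHIFT              12
/* Value quoted in the thread for PAGE_ALLOC_COSTLY_ORDER. */
#define SKETCH_PAGE_ALLOC_COSTLY_ORDER 3

/*
 * Userspace approximation of get_order(): the smallest order such that
 * (page size << order) is at least `size` bytes.
 */
static int sketch_get_order(size_t size)
{
    size_t span = (size_t)1 << SKETCH_PAGE_SHIFT;
    int order = 0;

    assert(size > 0);

    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Non-zero when an allocation of `size` bytes exceeds the costly order. */
static int sketch_is_costly(size_t size)
{
    return sketch_get_order(size) > SKETCH_PAGE_ALLOC_COSTLY_ORDER;
}
```

Under these assumptions, 32768 bytes maps to order 3 and 65536 bytes maps to order 4, so the doubled frag size crosses the costly-order threshold mentioned in the note.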


Thread overview: 23+ messages
2021-08-18  3:32 Yunsheng Lin
2021-08-18  3:32 ` [PATCH RFC 1/7] page_pool: refactor the page pool to support multi alloc context Yunsheng Lin
2021-08-18  3:32 ` [PATCH RFC 2/7] skbuff: add interface to manipulate frag count for tx recycling Yunsheng Lin
2021-08-18  3:32 ` [PATCH RFC 3/7] net: add NAPI api to register and retrieve the page pool ptr Yunsheng Lin
2021-08-18  3:32 ` [PATCH RFC 4/7] net: pfrag_pool: add pfrag pool support based on page pool Yunsheng Lin
2021-08-18  3:32 ` [PATCH RFC 5/7] sock: support refilling pfrag from pfrag_pool Yunsheng Lin
2021-08-18  3:32 ` [PATCH RFC 6/7] net: hns3: support tx recycling in the hns3 driver Yunsheng Lin
2021-08-18  8:57 ` [PATCH RFC 0/7] add socket to netdev page frag recycling support Eric Dumazet
2021-08-18  9:36   ` Yunsheng Lin [this message]
2021-08-23  9:25     ` [Linuxarm] " Yunsheng Lin
2021-08-23 15:04       ` Eric Dumazet
2021-08-24  8:03         ` Yunsheng Lin
2021-08-25 16:29         ` David Ahern
2021-08-25 16:32           ` Eric Dumazet
2021-08-25 16:38             ` David Ahern
2021-08-25 17:24               ` Eric Dumazet
2021-08-26  4:05                 ` David Ahern
2021-08-18 22:05 ` David Ahern
2021-08-19  8:18   ` Yunsheng Lin
2021-08-20 14:35     ` David Ahern
2021-08-23  3:32       ` Yunsheng Lin
2021-08-24  3:34         ` David Ahern
2021-08-24  8:41           ` Yunsheng Lin
