From: David Ahern <dsahern@gmail.com>
To: Eric Dumazet <edumazet@google.com>
Cc: Yunsheng Lin <linyunsheng@huawei.com>,
David Miller <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Alexander Duyck <alexander.duyck@gmail.com>,
Russell King <linux@armlinux.org.uk>,
Marcin Wojtas <mw@semihalf.com>,
linuxarm@openeuler.org, Yisen Zhuang <yisen.zhuang@huawei.com>,
Salil Mehta <salil.mehta@huawei.com>,
Thomas Petazzoni <thomas.petazzoni@bootlin.com>,
Jesper Dangaard Brouer <hawk@kernel.org>,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
John Fastabend <john.fastabend@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Will Deacon <will@kernel.org>,
Matthew Wilcox <willy@infradead.org>,
Vlastimil Babka <vbabka@suse.cz>,
Fenghua Yu <fenghua.yu@intel.com>, Roman Gushchin <guro@fb.com>,
Peter Xu <peterx@redhat.com>, "Tang, Feng" <feng.tang@intel.com>,
Jason Gunthorpe <jgg@ziepe.ca>,
mcroce@microsoft.com, Hugh Dickins <hughd@google.com>,
Jonathan Lemon <jonathan.lemon@gmail.com>,
Alexander Lobakin <alobakin@pm.me>,
Willem de Bruijn <willemb@google.com>, wenxu <wenxu@ucloud.cn>,
Cong Wang <cong.wang@bytedance.com>,
Kevin Hao <haokexin@gmail.com>,
Aleksandr Nogikh <nogikh@google.com>,
Marco Elver <elver@google.com>, Yonghong Song <yhs@fb.com>,
kpsingh@kernel.org, Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
netdev <netdev@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
chenhao288@hisilicon.com,
Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
David Ahern <dsahern@kernel.org>,
memxor@gmail.com, linux@rempel-privat.de,
Antoine Tenart <atenart@kernel.org>, Wei Wang <weiwan@google.com>,
Taehee Yoo <ap420073@gmail.com>, Arnd Bergmann <arnd@arndb.de>,
Mat Martineau <mathew.j.martineau@linux.intel.com>,
aahringo@redhat.com, ceggers@arri.de, yangbo.lu@nxp.com,
Florian Westphal <fw@strlen.de>,
xiangxia.m.yue@gmail.com, linmiaohe <linmiaohe@huawei.com>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [Linuxarm] Re: [PATCH RFC 0/7] add socket to netdev page frag recycling support
Date: Wed, 25 Aug 2021 09:38:55 -0700
Message-ID: <2d2154f4-c735-a9b3-7940-f8830fee6229@gmail.com>
In-Reply-To: <CANn89iKqijGU_0dQMeyMJ2h2MJE3=fLm8qb456G3ZD_7TrLt_A@mail.gmail.com>
On 8/25/21 9:32 AM, Eric Dumazet wrote:
> On Wed, Aug 25, 2021 at 9:29 AM David Ahern <dsahern@gmail.com> wrote:
>>
>> On 8/23/21 8:04 AM, Eric Dumazet wrote:
>>>>
>>>>
>>>> It seems PAGE_ALLOC_COSTLY_ORDER is mostly related to pcp page, OOM, memory
>>>> compact and memory isolation, as the test system has a lot of memory installed
>>>> (about 500G, only 3-4G is used), so I used the below patch to test the max
>>>> possible performance improvement when making TCP frags twice bigger, and
>>>> the performance improvement went from about 30Gbit to 32Gbit for one thread
>>>> iperf tcp flow in IOMMU strict mode,
>>>
>>> This is encouraging, and means we can do much better.
>>>
>>> Even with SKB_FRAG_PAGE_ORDER set to 4, typical skbs will need 3 mappings
>>>
>>> 1) One for the headers (in skb->head)
>>> 2) Two page frags, because one TSO packet payload is not a nice power-of-two.
>>
>> interesting observation. I have noticed 17 with the ZC API. That might
>> explain the less than expected performance bump with iommu strict mode.
>
> Note that if application is using huge pages, things get better after
>
> commit 394fcd8a813456b3306c423ec4227ed874dfc08b
> Author: Eric Dumazet <edumazet@google.com>
> Date: Thu Aug 20 08:43:59 2020 -0700
>
> net: zerocopy: combine pages in zerocopy_sg_from_iter()
>
> Currently, tcp sendmsg(MSG_ZEROCOPY) is building skbs with order-0
> fragments. Compared to standard sendmsg(), these skbs usually contain
> up to 16 fragments on arches with 4KB page sizes, instead of two.
>
> This adds considerable costs on various ndo_start_xmit() handlers,
> especially when IOMMU is in the picture.
>
> As high performance applications are often using huge pages,
> we can try to combine adjacent pages belonging to same
> compound page.
>
> Tested on AMD Rome platform, with IOMMU, nominal single TCP flow speed
> is roughly doubled (~55Gbit -> ~100Gbit), when user application
> is using hugepages.
>
> For reference, nominal single TCP flow speed on this platform
> without MSG_ZEROCOPY is ~65Gbit.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
>
> Ideally the gup stuff should really directly deal with hugepages, so
> that we avoid all these crazy refcounting games on the per-huge-page
> central refcount.
>
Thanks for the pointer. I need to revisit my past attempt to get iperf3
working with hugepages.
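
The fragment counts discussed above (3 mappings for a copied skb, 16 or 17 fragments with MSG_ZEROCOPY, far fewer with hugepages) can be reproduced with simple arithmetic. What follows is a rough sketch, not kernel code: the helper name `chunks_spanned`, the 45 x 1448 byte TSO payload, the 100-byte buffer misalignment and the 2MB hugepage size are all assumptions chosen for illustration. It only counts how many aligned chunks a byte range touches, which is one plausible explanation for the 17 seen with the ZC API.

```python
def chunks_spanned(offset: int, length: int, chunk_size: int) -> int:
    """Count the chunk_size-aligned chunks touched by [offset, offset + length)."""
    if length <= 0:
        return 0
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return last - first + 1


PAGE_4K = 4096                 # order-0 page on arches with 4KB pages
FRAG_64K = PAGE_4K << 4        # order-4 frag page, as in the SKB_FRAG_PAGE_ORDER = 4 case
HUGE_2M = 2 * 1024 * 1024      # assumed hugepage size
PAYLOAD = 45 * 1448            # assumed TSO payload (65160 bytes), not a power of two

# Copy path: the second skb carved from the frag pages starts where the
# first one ended, so its payload straddles a 64KB boundary.
HEAD_MAPPINGS = 1              # skb->head
copy_mappings = HEAD_MAPPINGS + chunks_spanned(PAYLOAD, PAYLOAD, FRAG_64K)

# Zerocopy path with order-0 fragments: 64KB of user memory.
zc_aligned = chunks_spanned(0, 65536, PAGE_4K)
zc_unaligned = chunks_spanned(100, 65536, PAGE_4K)   # assumed 100-byte misalignment

# Zerocopy path when adjacent pages of one compound page are combined.
zc_hugepage = chunks_spanned(100, 65536, HUGE_2M)

if __name__ == "__main__":
    print("copy path mappings:", copy_mappings)
    print("zerocopy, page-aligned buffer:", zc_aligned)
    print("zerocopy, unaligned buffer:", zc_unaligned)
    print("zerocopy, inside one hugepage:", zc_hugepage)
```

Under these assumptions the copy path needs 3 mappings, the zerocopy path needs 16 fragments for a page-aligned buffer and 17 for an unaligned one, and a buffer that sits inside a single hugepage collapses to 1 fragment. Since each fragment costs one DMA mapping, this is where IOMMU strict mode would be expected to hurt.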
Thread overview: 23+ messages
2021-08-18 3:32 Yunsheng Lin
2021-08-18 3:32 ` [PATCH RFC 1/7] page_pool: refactor the page pool to support multi alloc context Yunsheng Lin
2021-08-18 3:32 ` [PATCH RFC 2/7] skbuff: add interface to manipulate frag count for tx recycling Yunsheng Lin
2021-08-18 3:32 ` [PATCH RFC 3/7] net: add NAPI api to register and retrieve the page pool ptr Yunsheng Lin
2021-08-18 3:32 ` [PATCH RFC 4/7] net: pfrag_pool: add pfrag pool support based on page pool Yunsheng Lin
2021-08-18 3:32 ` [PATCH RFC 5/7] sock: support refilling pfrag from pfrag_pool Yunsheng Lin
2021-08-18 3:32 ` [PATCH RFC 6/7] net: hns3: support tx recycling in the hns3 driver Yunsheng Lin
2021-08-18 8:57 ` [PATCH RFC 0/7] add socket to netdev page frag recycling support Eric Dumazet
2021-08-18 9:36 ` Yunsheng Lin
2021-08-23 9:25 ` [Linuxarm] " Yunsheng Lin
2021-08-23 15:04 ` Eric Dumazet
2021-08-24 8:03 ` Yunsheng Lin
2021-08-25 16:29 ` David Ahern
2021-08-25 16:32 ` Eric Dumazet
2021-08-25 16:38 ` David Ahern [this message]
2021-08-25 17:24 ` Eric Dumazet
2021-08-26 4:05 ` David Ahern
2021-08-18 22:05 ` David Ahern
2021-08-19 8:18 ` Yunsheng Lin
2021-08-20 14:35 ` David Ahern
2021-08-23 3:32 ` Yunsheng Lin
2021-08-24 3:34 ` David Ahern
2021-08-24 8:41 ` Yunsheng Lin