Netdev Archive on lore.kernel.org
From: Eric Dumazet <edumazet@google.com>
To: Wei Wang <weiwan@google.com>
Cc: "David S . Miller" <davem@davemloft.net>,
netdev <netdev@vger.kernel.org>, Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
Hannes Frederic Sowa <hannes@stressinduktion.org>,
Felix Fietkau <nbd@nbd.name>
Subject: Re: [RFC PATCH net-next 0/6] implement kthread based napi poll
Date: Mon, 28 Sep 2020 19:43:36 +0200
Message-ID: <CANn89iJDM97U15Znrx4k4bOFKunQp7dwJ9mtPwvMmB4S+rSSbA@mail.gmail.com>
In-Reply-To: <20200914172453.1833883-1-weiwan@google.com>
On Mon, Sep 14, 2020 at 7:26 PM Wei Wang <weiwan@google.com> wrote:
>
> The idea of moving the napi poll process out of softirq context to a
> kernel thread based context is not new.
> Paolo Abeni and Hannes Frederic Sowa proposed patches to move napi
> poll to kthread back in 2016. And Felix Fietkau has also proposed
> patches along similar lines, using a workqueue to process napi poll,
> just a few weeks ago.
>
> The main reason we'd like to push forward with this idea is that the
> scheduler has poor visibility into cpu cycles spent in softirq context,
> and is not able to make optimal scheduling decisions of the user threads.
> For example, in one of our application benchmarks where network
> load is high, the CPUs handling network softirqs have ~80% cpu util. And
> user threads are still scheduled on those CPUs, despite other, more idle
> cpus being available in the system. And we see very high tail latencies. In this
> case, we have to explicitly pin away user threads from the CPUs handling
> network softirqs to ensure good performance.
> With napi poll moved to kthread, scheduler is in charge of scheduling both
> the kthreads handling network load, and the user threads, and is able to
> make better decisions. In the previous benchmark, if we do this and we
> pin the kthreads processing napi poll to specific CPUs, scheduler is
> able to schedule user threads away from these CPUs automatically.
>
> And the reason we prefer 1 kthread per napi, instead of 1 workqueue
> entity per host, is that kthread is more configurable than workqueue,
> and we could leverage existing tuning tools for threads, like taskset,
> chrt, etc to tune scheduling class and cpu set, etc. Another reason is
> if we eventually want to provide busy poll feature using kernel threads
> for napi poll, kthread seems to be more suitable than workqueue.
>
> In this patch series, I revived Paolo and Hannes's patches from 2016 and
> left them as the first 2 patches. Then there are changes proposed by
> Felix, Jakub, Paolo and myself on top of those, with suggestions from
> Eric Dumazet.
>
> In terms of performance, I ran tcp_rr tests with 1000 flows with
> various request/response sizes, with RFS/RPS disabled, and compared
> performance between softirq vs kthread. Host has 56 hyper threads and
> 100Gbps nic.
>
>            req/resp   QPS      50%tile   90%tile   99%tile   99.9%tile
> softirq    1B/1B      2.19M    284us     987us     1.1ms     1.56ms
> kthread    1B/1B      2.14M    295us     987us     1.0ms     1.17ms
>
> softirq    5KB/5KB    1.31M    869us     1.06ms    1.28ms    2.38ms
> kthread    5KB/5KB    1.32M    878us     1.06ms    1.26ms    1.66ms
>
> softirq    1MB/1MB    10.78K   84ms      166ms     234ms     294ms
> kthread    1MB/1MB    10.83K   82ms      173ms     262ms     320ms
>
> I also ran one application benchmark where the user threads have more
> work to do. We do see a good amount of tail latency reduction with the
> kthread model.
Wei, this is very nice work.
Please re-send it without the RFC tag, so that we can hopefully merge it ASAP.
Thanks !
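
Note for readers of the archive: the quoted cover letter argues that a kthread can be tuned with existing tools such as taskset and chrt. As a rough illustration only, the same two operations can be sketched with Python's standard os module on Linux. The helper names pin_to_cpus and set_fifo_priority are hypothetical, the PID of a napi kthread would have to be looked up separately (this message does not say how the threads are named), and the demo below touches only the calling process.

```python
import os


def pin_to_cpus(pid, cpus):
    """Restrict `pid` to the CPU set `cpus` (roughly what `taskset -pc` does).

    Returns the affinity mask the kernel reports afterwards.
    """
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)


def set_fifo_priority(pid, priority):
    """Move `pid` to SCHED_FIFO at `priority` (roughly what `chrt -f -p` does).

    Usually requires root or CAP_SYS_NICE; raises PermissionError otherwise.
    Returns the scheduling policy the kernel reports afterwards.
    """
    os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
    return os.sched_getscheduler(pid)


if __name__ == "__main__":
    # Demo on the calling process only (pid 0 means "the caller").
    # For a napi kthread, pass that thread's PID instead (assumption:
    # the PID has been found by other means).
    original = os.sched_getaffinity(0)
    one_cpu = {min(original)}
    print("pinned to:", pin_to_cpus(0, one_cpu))
    os.sched_setaffinity(0, original)  # restore the original mask
```

Whether a real-time class is appropriate for a napi kthread is a tuning decision the cover letter leaves open; the sketch only shows that both knobs are ordinary per-thread settings.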