Netdev Archive on lore.kernel.org
From: Eric Dumazet <eric.dumazet@gmail.com>
To: "Íñigo Huguet" <ihuguet@redhat.com>,
	"Edward Cree" <ecree.xilinx@gmail.com>,
	habetsm.xilinx@gmail.com
Cc: netdev@vger.kernel.org, Dinan Gunawardena <dinang@xilinx.com>,
	Pablo Cascon <pabloc@xilinx.com>
Subject: Re: Bad performance in RX with sfc 40G
Date: Thu, 18 Nov 2021 09:19:39 -0800	[thread overview]
Message-ID: <beef3b28-6818-df7b-eaad-8569cac5d79b@gmail.com> (raw)
In-Reply-To: <CACT4oudChHDKecLfDdA7R8jpQv2Nmz5xBS3hH_jFWeS37CnQGg@mail.gmail.com>



On 11/18/21 7:14 AM, Íñigo Huguet wrote:
> Hello,
> 
> Doing some tests a few weeks ago I noticed very low RX performance
> using 40G Solarflare NICs. Doing tests with iperf3 I got more than
> 30Gbps in TX, but just around 15Gbps in RX. Other NICs from other
> vendors could send and receive over 30Gbps.
> 
> I was doing the tests with multiple threads in iperf3 (-P 8).
> 
> The models used are SFC9140 and SFC9220.
> 
> Perf showed that most of the time was being spent in
> `native_queued_spin_lock_slowpath`. Tracing the calls to it with
> bpftrace I got that most of the calls were from __napi_poll > efx_poll
>> efx_fast_push_rx_descriptors > __alloc_pages >
> get_page_from_freelist > ...
> 
> Please can you help me investigate the issue? At first sight, it seems
> like a suboptimal memory allocation strategy, or maybe a failure in
> the page recycling strategy...
> 
> This is the output of bpftrace, the 2 call chains that repeat the
> most times, both from sfc:
> 
> @[
>     native_queued_spin_lock_slowpath+1
>     _raw_spin_lock+26
>     rmqueue_bulk+76
>     get_page_from_freelist+2295
>     __alloc_pages+214
>     efx_fast_push_rx_descriptors+640
>     efx_poll+660
>     __napi_poll+42
>     net_rx_action+547
>     __softirqentry_text_start+208
>     __irq_exit_rcu+179
>     common_interrupt+131
>     asm_common_interrupt+30
>     cpuidle_enter_state+199
>     cpuidle_enter+41
>     do_idle+462
>     cpu_startup_entry+25
>     start_kernel+2465
>     secondary_startup_64_no_verify+194
> ]: 2650
> @[
>     native_queued_spin_lock_slowpath+1
>     _raw_spin_lock+26
>     rmqueue_bulk+76
>     get_page_from_freelist+2295
>     __alloc_pages+214
>     efx_fast_push_rx_descriptors+640
>     efx_poll+660
>     __napi_poll+42
>     net_rx_action+547
>     __softirqentry_text_start+208
>     __irq_exit_rcu+179
>     common_interrupt+131
>     asm_common_interrupt+30
>     cpuidle_enter_state+199
>     cpuidle_enter+41
>     do_idle+462
>     cpu_startup_entry+25
>     secondary_startup_64_no_verify+194
> ]: 17119
> 
> --
> Íñigo Huguet
> 


You could try to:

Make the RX ring buffers bigger (ethtool -G eth0 rx 8192)
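Before resizing, it is worth checking what the NIC actually supports; the driver will reject values above its reported maximum. A minimal sketch (the interface name eth0 is a placeholder, substitute your own):

```shell
# Show the current and maximum supported ring sizes for the interface
ethtool -g eth0

# Grow the RX ring, staying at or below the "Pre-set maximums" value
# reported above (8192 here is the example from this thread)
ethtool -G eth0 rx 8192
```

Note that changing ring sizes typically resets the interface briefly, so avoid doing it on a link carrying live traffic.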

and/or

Make sure your TCP socket receive buffer is smaller than the number of frames in the ring buffer:

echo "4096 131072 2097152" >/proc/sys/net/ipv4/tcp_rmem
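The same values can be applied via sysctl (equivalent to the echo above; the triple is min / default / max receive buffer size in bytes):

```shell
# Cap the TCP receive buffer at 2 MB so the receive window cannot
# outgrow what the RX ring can hold
sysctl -w net.ipv4.tcp_rmem="4096 131072 2097152"
```

As a rough back-of-envelope (not from the original message): a 2 MB rmem max with ~1500-byte frames corresponds to at most ~1400 in-flight frames, comfortably below an 8192-entry RX ring.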

You can also try the latest net-next, as TCP recently got a change that helps this case:

f35f821935d8df76f9c92e2431a225bdff938169 tcp: defer skb freeing after socket lock is released


Thread overview: 15+ messages
2021-11-18 15:14 Íñigo Huguet
2021-11-18 17:19 ` Eric Dumazet [this message]
2021-12-02 14:26   ` Íñigo Huguet
2021-11-20  8:31 ` Martin Habets
2021-12-09 12:06   ` Íñigo Huguet
2021-12-23 13:18     ` Íñigo Huguet
2022-01-02  9:22       ` Martin Habets
2022-01-10  8:58         ` [PATCH net-next] sfc: The size of the RX recycle ring should be more flexible Martin Habets
2022-01-10  9:31           ` Íñigo Huguet
2022-01-12  9:05             ` Martin Habets
2022-01-31 11:08             ` Martin Habets
2022-01-10 17:22           ` Jakub Kicinski
2022-01-12  9:08             ` Martin Habets
2022-01-31 11:10           ` [PATCH V2 " Martin Habets
2022-02-02  5:10             ` patchwork-bot+netdevbpf
