Netdev Archive on lore.kernel.org
From: "Li,Rongqing" <lirongqing@baidu.com>
To: "Björn Töpel" <bjorn.topel@gmail.com>
Cc: Netdev <netdev@vger.kernel.org>,
	intel-wired-lan <intel-wired-lan@lists.osuosl.org>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Björn Töpel" <bjorn.topel@intel.com>, bpf <bpf@vger.kernel.org>,
	"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
	Piotr <piotr.raczynski@intel.com>,
	Maciej <maciej.machnikowski@intel.com>
Subject: Re: [Intel-wired-lan] [PATCH 0/2] intel/xdp fixes for fliping rx buffer
Date: Wed, 19 Aug 2020 01:37:58 +0000
Message-ID: <4268316b200049d58b9973ec4dc4725c@baidu.com>
In-Reply-To: <CAJ+HfNi2B+2KYP9A7yCfFUhfUBd=sFPeuGbNZMjhNSdq3GEpMg@mail.gmail.com>



> -----Original Message-----
> From: Björn Töpel [mailto:bjorn.topel@gmail.com]
> Sent: 18 August 2020 22:05
> To: Li,Rongqing <lirongqing@baidu.com>
> Cc: Netdev <netdev@vger.kernel.org>; intel-wired-lan
> <intel-wired-lan@lists.osuosl.org>; Karlsson, Magnus
> <magnus.karlsson@intel.com>; Björn Töpel <bjorn.topel@intel.com>; bpf
> <bpf@vger.kernel.org>; Maciej Fijalkowski <maciej.fijalkowski@intel.com>;
> Piotr <piotr.raczynski@intel.com>; Maciej <maciej.machnikowski@intel.com>
> Subject: Re: [Intel-wired-lan] [PATCH 0/2] intel/xdp fixes for fliping rx buffer
> 
> On Fri, 17 Jul 2020 at 08:24, Li RongQing <lirongqing@baidu.com> wrote:
> >
> > This fixes ice/i40e/ixgbe/ixgbevf_rx_buffer_flip in copy mode xdp that
> > can lead to data corruption.
> >
> > I split the fix into two patches: i40e/ixgbe/ixgbevf have supported xsk
> > receiving since 4.18, so their fixes go in one patch.
> >
> 
> Li, sorry for the looong latency. I took a looong vacation. :-P
> 
> Thanks for taking a look at this, but I believe this is not a bug.
> 
> The Intel Ethernet drivers (obviously non-zerocopy AF_XDP -- "good ol'
> XDP") use a page reuse algorithm.
> 
> The basic idea is that a page is allocated from the page allocator
> (i40e_alloc_mapped_page()). The refcount is increased to USHRT_MAX. The
> page is split into two chunks (simplified). If there's one user of the page, the
> page can be reused (flipped). If not, a new page needs to be allocated (with the
> large refcount).
> 
> So, the idea is that usually the page can be reused (flipped), and the page only
> needs to be "put" not "get" since the refcount was initially bumped to a large
> value.
> 
> All frames (except XDP_DROP which can be reused directly) "die" via
> page_frag_free() which decreases the page refcount, and frees the page if the
> refcount is zero.
> 
> Let's take some scenarios as examples:
> 
> 1. A frame is received in "vanilla" XDP (MEM_TYPE_PAGE_SHARED), and
>    the XDP program verdict is XDP_TX. The frame will be placed on the
>    HW Tx ring, and freed* (async) in i40e_clean_tx_irq:
>         /* free the skb/XDP data */
>         if (ring_is_xdp(tx_ring))
>             xdp_return_frame(tx_buf->xdpf); // calls page_frag_free()
> 
> 2. A frame is passed to the stack, eventually it's freed* via
>    skb_free_frag().
> 
> 3. A frame is passed to an AF_XDP socket. The data is copied to the
>    socket data area, and the frame is directly freed*.
> 
> Note the * by the freed. Actually freeing here means calling page_frag_free(),
> which means decreasing the refcount. The page reuse algorithm makes sure
> that the buffers are not stale.
> 
> The only difference between XDP_TX and XDP_REDIRECT to dev/cpumaps on the
> one hand, and AF_XDP sockets on the other, is that the latter calls
> page_frag_free() directly, whereas the former do it asynchronously from the
> Tx clean-up phase.
> 
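
A minimal user-space sketch of the page reuse scheme described in the quoted text, for readers who want to follow the counting. It is a toy model, not driver code: every toy_* name is hypothetical, and the rule "flip if at most one frame is still in flight" is an assumption standing in for the driver's real reuse check.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are kernel types. */
struct toy_page {
	unsigned int refcount;		/* models the page reference count */
};

struct toy_rx_buffer {
	struct toy_page *page;
	unsigned int pagecnt_bias;	/* references still owned by the driver */
	unsigned int page_offset;	/* half of the page written next */
};

enum { TOY_HALF_PAGE = 2048 };

/* Allocation: take all references up front so the hot path never "gets". */
static void toy_alloc(struct toy_rx_buffer *buf, struct toy_page *page)
{
	page->refcount = USHRT_MAX;
	buf->page = page;
	buf->pagecnt_bias = USHRT_MAX;
	buf->page_offset = 0;
}

/* The driver gives one pre-taken reference away together with the frame. */
static void toy_hand_off(struct toy_rx_buffer *buf)
{
	assert(buf->pagecnt_bias > 0);
	buf->pagecnt_bias--;
}

/* Consumer side: models page_frag_free() dropping one reference. */
static void toy_frag_free(struct toy_page *page)
{
	page->refcount--;
}

/* Number of frames handed off and not yet freed. */
static unsigned int toy_in_flight(const struct toy_rx_buffer *buf)
{
	return buf->page->refcount - buf->pagecnt_bias;
}

/* Assumed reuse rule: flip only if at most one frame is still in flight. */
static bool toy_try_flip(struct toy_rx_buffer *buf)
{
	if (toy_in_flight(buf) > 1)
		return false;
	buf->page_offset ^= TOY_HALF_PAGE;
	return true;
}
```

In this model a hand-off followed by toy_try_flip() succeeds, while a second hand-off before the first frame is freed makes toy_try_flip() fail, which is the case where a new page would have to be allocated.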

Hi,

Thanks for your explanation.

But we can reproduce this bug.

We use eBPF to redirect only VXLAN packets to non-zerocopy AF_XDP. First we saw a panic in the TCP stack, in tcp_collapse: BUG_ON(offset < 0); it is very hard to reproduce.

Then we tested with scp while a lot of VXLAN packets were arriving at the same time; scp broke frequently.

With these fixes, scp no longer breaks and the kernel no longer panics.

Your explanation does not seem to address my analysis:

       1. the first skb is not for xsk, and is forwarded to another device
          or socket queue
       2. the second skb is for xsk; its data is copied to xsk memory, and
          the page of skb->data is released
       3. rx_buff is reusable since only the first skb is in it, but
          *_rx_buffer_flip will set page_offset to the first skb's data
       4. the rx buffer is then reused, and the first skb, which is still
          alive, will be corrupted.
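
The four steps above can be replayed in a small user-space C model. This is only a sketch under stated assumptions: the sim_* names are hypothetical, the reuse check is assumed to be "at most one frame in flight", and the copy-mode free is assumed to happen before that check runs. It illustrates the sequence as described here and does not prove how the real drivers behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { SIM_HALF = 2048, SIM_PAGE = 4096 };

/* Hypothetical model of one rx buffer backed by a two-half page. */
struct sim_rx_buffer {
	unsigned char page[SIM_PAGE];
	unsigned int refcount;		/* models the page refcount */
	unsigned int pagecnt_bias;	/* references still owned by the driver */
	unsigned int page_offset;	/* half the next frame is written to */
};

/* "Hardware" fills the current half; the driver hands the frame off. */
static unsigned char *sim_receive(struct sim_rx_buffer *b, unsigned char fill)
{
	unsigned char *data = b->page + b->page_offset;

	memset(data, fill, SIM_HALF);
	b->pagecnt_bias--;
	return data;
}

/* Assumed reuse check: flip if at most one frame is still in flight. */
static bool sim_flip_if_reusable(struct sim_rx_buffer *b)
{
	if (b->refcount - b->pagecnt_bias > 1)
		return false;
	b->page_offset ^= SIM_HALF;
	return true;
}

/*
 * Returns true if the first frame's bytes are overwritten while the
 * frame is still in flight. free_before_check selects whether the
 * second frame is freed before the reuse check (copy-mode xsk in this
 * model) or later (e.g. from Tx clean-up).
 */
static bool sim_scenario(bool free_before_check)
{
	struct sim_rx_buffer b = { .refcount = 1000, .pagecnt_bias = 1000 };
	unsigned char *first;

	/* 1. first frame is forwarded elsewhere and stays alive */
	first = sim_receive(&b, 0xAA);
	if (!sim_flip_if_reusable(&b))
		return false;

	/* 2. second frame goes to the socket; data copied, page released */
	sim_receive(&b, 0xBB);
	if (free_before_check)
		b.refcount--;

	/* 3. the reuse check sees one frame in flight and flips back */
	if (!sim_flip_if_reusable(&b))
		return false;	/* buffer not reused, nothing overwritten */

	/* 4. the next frame lands on the first frame's half */
	sim_receive(&b, 0xCC);
	return first[0] != 0xAA;
}
```

In this model sim_scenario(true) reports an overwrite and sim_scenario(false) does not, since in the second case two frames are still counted as in flight and the buffer is not flipped.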


The root cause is the difference you described above, so I only fix non-zerocopy AF_XDP.

-Li
> Let me know if it's still not clear, but the bottom line is that none of these
> patches are needed.
> 
> 
> Thanks!
> Björn
> 
> 
> > Li RongQing (2):
> >   xdp: i40e: ixgbe: ixgbevf: not flip rx buffer for copy mode xdp
> >   ice/xdp: not adjust rx buffer for copy mode xdp
> >
> >  drivers/net/ethernet/intel/i40e/i40e_txrx.c       | 5 ++++-
> >  drivers/net/ethernet/intel/ice/ice_txrx.c         | 5 ++++-
> >  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     | 5 ++++-
> >  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 5 ++++-
> >  include/net/xdp.h                                 | 3 +++
> >  net/xdp/xsk.c                                     | 4 +++-
> >  6 files changed, 22 insertions(+), 5 deletions(-)
> >
> > --
> > 2.16.2


