Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Martin KaFai Lau <kafai@fb.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: <netdev@vger.kernel.org>, <bpf@vger.kernel.org>,
	<toke@redhat.com>, Cong Wang <cong.wang@bytedance.com>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Jiri Pirko <jiri@resnulli.us>
Subject: Re: [RFC Patch net-next] net_sched: introduce eBPF based Qdisc
Date: Tue, 24 Aug 2021 16:47:00 -0700	[thread overview]
Message-ID: <20210824234700.qlteie6al3cldcu5@kafai-mbp> (raw)
In-Reply-To: <20210821010240.10373-1-xiyou.wangcong@gmail.com>

On Fri, Aug 20, 2021 at 06:02:40PM -0700, Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> This *incomplete* patch introduces a programmable Qdisc with
> eBPF.  The goal is to make Qdisc as programmable as possible,
> that is, to replace as many existing Qdisc's as we can. ;)
> 
> The design was discussed during last LPC:
> https://linuxplumbersconf.org/event/7/contributions/679/attachments/520/1188/sch_bpf.pdf 
> 
> Here is a summary of design decisions I made:
> 
> 1. Avoid eBPF struct_ops, as it would be really hard to program
>    a Qdisc with this approach.
Please explain more on this.  What is currently missing
to make qdisc in struct_ops possible?

> 2. Avoid exposing skb's to user-space, which means we can't introduce
>    a map to store skb's. Instead, store them in kernel without exposure
>    to user-space.
> 
> So I choose to use priority queues to store skb's inside a
> flow and to store flows inside a Qdisc, and let eBPF programs
> decide the *relative* position of the skb within the flow and the
> *relative* order of the flows too, upon each enqueue and dequeue.
> Each flow is also exposed to user as a TC class, like many other
> classful Qdisc's.
> 
> Although the biggest limitation is obviously that users can
> not traverse the packets or flows inside the Qdisc, I think
> at least they could store those global information of interest
> inside their own map and map can be shared between enqueue and
> dequeue. For example, users could use skb pointer as key and
> rank as a value to find out the absolute order.
> 
> One of the challeges is how to interact with existing TC infra,
> for instance, if users install TC filters on this Qdisc, should
> we respect this by ignoring or rejecting eBPF enqueue program
> attached or vice versa? Should we allow users to replace each
> priority queue of a class with a regular Qdisc?
> 
> Any high-level feedbacks are welcome. Please do not review any
> coding details until RFC tag is removed.
> 
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: Jiri Pirko <jiri@resnulli.us>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>

  reply	other threads:[~2021-08-24 23:47 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-21  1:02 Cong Wang
2021-08-24 23:47 ` Martin KaFai Lau [this message]
2021-09-01  4:39   ` Cong Wang
2021-09-01  5:45     ` John Fastabend
2021-09-01 10:42       ` Toke Høiland-Jørgensen
2021-09-01 17:45         ` Martin KaFai Lau
2021-09-01 18:03           ` Alexei Starovoitov
2021-09-02 16:57           ` Toke Høiland-Jørgensen
2021-09-02 20:40             ` John Fastabend
2021-09-02 22:27               ` Toke Høiland-Jørgensen
2021-09-02 23:35                 ` Martin KaFai Lau
2021-09-03 14:44                   ` Toke Høiland-Jørgensen
2021-09-03 15:33                     ` Jamal Hadi Salim
2021-09-10  6:55                     ` Martin KaFai Lau
2021-09-10 11:31                       ` Toke Høiland-Jørgensen
2021-09-04  1:09           ` Cong Wang
2021-09-17  4:19             ` Martin KaFai Lau
2021-09-04  1:30         ` Cong Wang
2021-09-06 11:45           ` Toke Høiland-Jørgensen
2021-09-04  1:05       ` Cong Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210824234700.qlteie6al3cldcu5@kafai-mbp \
    --to=kafai@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=cong.wang@bytedance.com \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=netdev@vger.kernel.org \
    --cc=toke@redhat.com \
    --cc=xiyou.wangcong@gmail.com \
    --subject='Re: [RFC Patch net-next] net_sched: introduce eBPF based Qdisc' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).