From: Jamal Hadi Salim <>
To: "Toke Høiland-Jørgensen" <>,
	"Martin KaFai Lau" <>
Cc: John Fastabend <>,
	Cong Wang <>,
	Linux Kernel Network Developers <>,
	bpf <>, Cong Wang <>,
	Jiri Pirko <>
Subject: Re: [RFC Patch net-next] net_sched: introduce eBPF based Qdisc
Date: Fri, 3 Sep 2021 11:33:26 -0400
Message-ID: <>
In-Reply-To: <>

On 2021-09-03 10:44 a.m., Toke Høiland-Jørgensen wrote:
> Martin KaFai Lau <> writes:
>> On Fri, Sep 03, 2021 at 12:27:52AM +0200, Toke Høiland-Jørgensen wrote:
>>>>> The question is if it's useful to provide the full struct_ops for
>>>>> qdiscs? Having it would allow a BPF program to implement that interface
>>>>> towards userspace (things like statistics, classes etc), but the
>>>>> question is if anyone is going to bother with that given the wealth of
>>>>> BPF-specific introspection tools already available?
>> Instead of bpftool only being able to introspect bpf qdiscs and the
>> existing tc only being able to introspect kernel qdiscs, it would be
>> nice to have a bpf qdisc work like other qdiscs and show its details
>> together with the others in tc, e.g. a bpf qdisc exports its
>> data/stats with its btf-id to tc and tc prints it out in a generic way?
> I'm not opposed to the idea, certainly. I just wonder if people who go
> to the trouble of writing a custom qdisc in BPF will feel it's worth it
> to do the extra work to make this available via a second API. We could
> certainly encourage it, and some things are easy (drop and pkt counters,
> etc), but other things (like class stats) will depend on the semantics
> of the qdisc being implemented, so will require extra work from the BPF
> qdisc developer...

The idea of using BTF to overcome the domain difference is _very_
appealing, but it sounds like a lot of work? I haven't delved enough
into BTF, but I wonder if the same could be said for filters
and actions...

Note: aside from the existing tooling being well understood,
one challenge you will face is reinventing all the
infrastructure that tc qdiscs have taken care of over the years:
the proper integration with softirqs, multiprocessor protections,
irqs, timers, etc., which take care of smooth triggering of
enqueue/dequeue, deferring things when the target
device/hw is busy, hierarchies, and so on.
I am not saying it is perfect or the most performant, but it is one
of those 'day 3' deployments, i.e. a lot of corner cases are already
taken care of.
I noticed you mentioned some of those things in one of your emails.

For this reason, Cong's approach looks appealing because it
reuses said infra. The main thing that needs extensibility is
the enqueue/dequeue ops, as eBPF progs. Allowing enq/deq to be
eBPF-specific sounds like it will allow one scheme that works for
both tc and XDP (with enq/deq taking care of the buffer contextual
differences).
I admit XDP is a little harder than plain tc...
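
To make the split above concrete, here is a minimal user-space sketch,
under stated assumptions: the fixed infrastructure owns the queue and
the drop counter, and only the enqueue/dequeue policy is pluggable.
Plain C function pointers stand in for eBPF programs, and every name
here (sketch_qdisc, sketch_xmit, sketch_run, prio_enqueue,
prio_dequeue) is hypothetical. This is not the API from the RFC patch
and not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet: only the fields this sketch needs. */
struct pkt {
	unsigned int len;
	unsigned int prio;	/* lower value is dequeued earlier */
};

#define QLEN 8

/* State owned by the fixed infrastructure (assumed layout). */
struct sketch_qdisc {
	struct pkt *slots[QLEN];
	size_t count;
	unsigned long drops;	/* accounted by the infra, not the policy */
	/* Pluggable hooks; function pointers stand in for eBPF progs. */
	int (*enqueue)(struct sketch_qdisc *q, struct pkt *p);
	struct pkt *(*dequeue)(struct sketch_qdisc *q);
};

/* Example policy: tail-drop when the queue is full. */
static int prio_enqueue(struct sketch_qdisc *q, struct pkt *p)
{
	if (q->count == QLEN)
		return -1;
	q->slots[q->count++] = p;
	return 0;
}

/* Example policy: pick the lowest prio value, FIFO among equals. */
static struct pkt *prio_dequeue(struct sketch_qdisc *q)
{
	size_t best = 0, i;
	struct pkt *p;

	if (q->count == 0)
		return NULL;
	for (i = 1; i < q->count; i++)
		if (q->slots[i]->prio < q->slots[best]->prio)
			best = i;
	p = q->slots[best];
	/* Close the gap so arrival order is preserved. */
	for (i = best; i + 1 < q->count; i++)
		q->slots[i] = q->slots[i + 1];
	q->count--;
	return p;
}

/* Infra entry point: runs the enqueue hook and counts drops. */
static int sketch_xmit(struct sketch_qdisc *q, struct pkt *p)
{
	int ret;

	assert(q->enqueue != NULL);
	ret = q->enqueue(q, p);
	if (ret < 0)
		q->drops++;
	return ret;
}

/* Infra drain loop: pulls up to 'budget' packets via the dequeue hook. */
static size_t sketch_run(struct sketch_qdisc *q, struct pkt **out,
			 size_t budget)
{
	size_t n = 0;
	struct pkt *p;

	assert(q->dequeue != NULL);
	while (n < budget && (p = q->dequeue(q)) != NULL)
		out[n++] = p;
	return n;
}
```

Swapping prio_enqueue/prio_dequeue for another pair changes the
scheduling policy without touching sketch_xmit or sketch_run, which is
the kind of reuse argued for above. Locking, timers and device-busy
handling are deliberately left out of the sketch.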


