From: Jamal Hadi Salim <>
To: Cong Wang <>
Cc: David Miller <>,
	Linux Kernel Network Developers <>,
	Jiri Pirko <>,
	Ariel Levkovich <>
Subject: Re: [PATCH net-next 1/1] net/sched: Introduce skb hash classifier
Date: Thu, 13 Aug 2020 08:52:10 -0400
Message-ID: <>
In-Reply-To: <>

On 2020-08-11 7:25 p.m., Cong Wang wrote:
> On Sun, Aug 9, 2020 at 4:41 PM Jamal Hadi Salim <> wrote:

> Not sure if I get you correctly, but with a combined implementation
> you can do above too, right? Something like:
> (AND case)
> $TC filter add dev $DEV1 parent ffff: protocol ip prio 3 handle 1
> skb hash Y mark X flowid 1:12 action ok
> (OR case)
> $TC filter add dev $DEV1 parent ffff: protocol ip prio 3 handle 1
> skb hash Y flowid 1:12 action ok
> $TC filter add dev $DEV1 parent ffff: protocol ip prio 4 handle 2
> skb mark X flowid 1:12 action ok

It will work, but what I was trying to say is that it is tricky to
implement. More below.

> Side note: you don't have to use handle as the value of hash/mark,
> which gives people freedom to choose different handles.

Same comment here as above. More below.

>> Then the question is how to implement? is it one hash table for
>> both or two(one for mark and one for hash), etc.
> Good question. I am not sure, maybe no hash table at all?
> Unless there are a lot of filters, we do not have to organize
> them in a hash table, do we?

The _main_ requirement is to scale to a large number of filters
(a million is a good handwave number). Scale means
1) fast datapath lookup time + 2) fast insertion/deletion/get/dump
from control/user space.
fwmark is good at all of the #2 operations today. It is good for #1 up
to maybe 1K rules (the limitation is the 256 buckets, constrained by rcu
trickery). Beyond that you start having collisions in a bucket and your
lookup requires long linked list walks.
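
As a rough illustration of the bucket limit described above, here is a small
userspace model (not kernel code) of a chained hash table with a fixed 256
buckets. The modulo bucket function and the sequential mark values are
assumptions chosen for illustration; the real classifier may distribute
entries differently.

```python
from collections import Counter

# Assumption: a fixed table of 256 buckets, as mentioned in the text.
NUM_BUCKETS = 256


def bucket_of(mark: int) -> int:
    """Illustrative bucket function: plain modulo over the table size."""
    return mark % NUM_BUCKETS


def chain_lengths(num_filters: int) -> Counter:
    """Count how many filters land in each bucket for marks 0..num_filters-1."""
    return Counter(bucket_of(mark) for mark in range(num_filters))


def longest_chain(num_filters: int) -> int:
    """Length of the longest per-bucket list a lookup might have to walk."""
    return max(chain_lengths(num_filters).values())


if __name__ == "__main__":
    for n in (256, 1_000, 1_000_000):
        print(n, "filters -> longest chain:", longest_chain(n))
```

Under these assumptions, 1K filters give chains of about 4 entries, while a
million filters give chains of roughly 3,900 entries per bucket, which is the
long list walk the datapath would pay for.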

Generally something like a hash table with a sufficient number of buckets
will work out ok.
There may be other approaches (IDR in the kernel looks interesting,
but I didn't look closely).

So to the implementation issue:
Major issue is removing ambiguity while at the same time trying
to get good performance.

Let's say we decide to classify on skb mark and skb hash at this point.
For a hash table, one simple approach is to set
lookupkey = (hash << 32) | mark

The key is used as input to the hash algo to find the bucket.
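
A minimal sketch of that combined key, in userspace Python rather than kernel
C. The table size and the folding function in bucket_of() are hypothetical;
only the key layout (hash in the upper 32 bits, mark in the lower 32 bits)
comes from the text above.

```python
MASK32 = 0xFFFFFFFF
NUM_BUCKETS = 1 << 16  # hypothetical table size, a power of two


def make_lookup_key(skb_hash: int, skb_mark: int) -> int:
    """Pack two 32-bit fields into one 64-bit key: hash high, mark low.

    In C the hash would have to be widened to 64 bits before the shift.
    """
    return ((skb_hash & MASK32) << 32) | (skb_mark & MASK32)


def bucket_of(key: int) -> int:
    """Hypothetical bucket function: fold the key to 32 bits, then mask."""
    folded = (key ^ (key >> 32)) & MASK32
    return folded & (NUM_BUCKETS - 1)


if __name__ == "__main__":
    key = make_lookup_key(0xDEADBEEF, 0x2A)
    print(hex(key), "-> bucket", hex(bucket_of(key)))
```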

There are two outstanding challenges in my mind:

1) Take the policy you describe above as an example:

$TC filter add dev $DEV1 parent ffff: protocol ip prio 3 handle 1
skb hash Y flowid 1:12 action ok

and say you receive a packet with both skb->hash and skb->mark set.
Then there is ambiguity:

How do you know whether to use hash or mark or both
for that specific key?
You can probably do some trick, but I can't think of a cheap way to
achieve this goal. Of course this issue doesn't exist if you have
separate classifiers.
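
The ambiguity can be shown with a toy model. This is a sketch under stated
assumptions, not a proposed implementation: a Python dict stands in for the
hash table, an unspecified field is stored as 0, and lookup_multi() is a
hypothetical workaround that probes one key per field combination.

```python
MASK32 = 0xFFFFFFFF


def make_lookup_key(skb_hash: int, skb_mark: int) -> int:
    """Pack hash (upper 32 bits) and mark (lower 32 bits) into one key."""
    return ((skb_hash & MASK32) << 32) | (skb_mark & MASK32)


class CombinedTable:
    """Toy filter table keyed on the combined 64-bit value."""

    def __init__(self):
        self.filters = {}

    def add_filter(self, flowid: str, skb_hash: int = 0, skb_mark: int = 0):
        # Assumption: a field the user did not specify is stored as 0,
        # which cannot be told apart from "match the value 0".
        self.filters[make_lookup_key(skb_hash, skb_mark)] = flowid

    def lookup_single(self, skb_hash: int, skb_mark: int):
        """One probe using both packet fields."""
        return self.filters.get(make_lookup_key(skb_hash, skb_mark))

    def lookup_multi(self, skb_hash: int, skb_mark: int):
        """Hypothetical workaround: probe both, hash-only, then mark-only."""
        for h, m in ((skb_hash, skb_mark), (skb_hash, 0), (0, skb_mark)):
            flowid = self.filters.get(make_lookup_key(h, m))
            if flowid is not None:
                return flowid
        return None


if __name__ == "__main__":
    Y, X = 0x1234, 0x99          # placeholder hash and mark values
    table = CombinedTable()
    table.add_filter("1:12", skb_hash=Y)   # filter on hash only
    print(table.lookup_single(Y, X))       # packet has both fields set
    print(table.lookup_multi(Y, X))
```

In this model the single probe misses the hash-only filter because the packet
also carries a mark, while the multi-probe variant finds it at the cost of up
to three lookups per packet. The probe order is itself a policy decision, and
the number of probes grows as 2^n - 1 for n fields.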

2) If you decide tomorrow to add tcindex/prio etc, you will have to
rework this as well.

#2 is not as big a deal as #1.

