Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
From: "Nambiar, Amritha" <amritha.nambiar@intel.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Netdev <netdev@vger.kernel.org>, Jakub Kicinski <kuba@kernel.org>,
"Jamal Hadi Salim" <jhs@mojatatu.com>,
Jiri Pirko <jiri@resnulli.us>,
Cong Wang <xiyou.wangcong@gmail.com>,
"Samudrala, Sridhar" <sridhar.samudrala@intel.com>
Subject: RE: [net-next PATCH] net: act_skbedit: Fix tc action skbedit queue_mapping
Date: Wed, 18 Aug 2021 00:16:18 +0000 [thread overview]
Message-ID: <PH0PR11MB5207CCFF0DDB54055AE50801F1FF9@PH0PR11MB5207.namprd11.prod.outlook.com> (raw)
In-Reply-To: <CAKgT0Ueqv55Qw=yYyqvtv9Sq=QTDeaoX=8z68H6KWZdEa4vwrA@mail.gmail.com>
> -----Original Message-----
> From: Alexander Duyck <alexander.duyck@gmail.com>
> Sent: Tuesday, August 10, 2021 7:04 AM
> To: Nambiar, Amritha <amritha.nambiar@intel.com>
> Cc: Netdev <netdev@vger.kernel.org>; Jakub Kicinski <kuba@kernel.org>;
> Jamal Hadi Salim <jhs@mojatatu.com>; Jiri Pirko <jiri@resnulli.us>; Cong
> Wang <xiyou.wangcong@gmail.com>; Samudrala, Sridhar
> <sridhar.samudrala@intel.com>
> Subject: Re: [net-next PATCH] net: act_skbedit: Fix tc action skbedit
> queue_mapping
>
> On Mon, Aug 9, 2021 at 6:20 PM Nambiar, Amritha
> <amritha.nambiar@intel.com> wrote:
> >
> > > -----Original Message-----
> > > From: Alexander Duyck <alexander.duyck@gmail.com>
> > > Sent: Monday, August 9, 2021 5:51 PM
> > > To: Nambiar, Amritha <amritha.nambiar@intel.com>
> > > Cc: Netdev <netdev@vger.kernel.org>; Jakub Kicinski <kuba@kernel.org>;
> > > Jamal Hadi Salim <jhs@mojatatu.com>; Jiri Pirko <jiri@resnulli.us>; Cong
> > > Wang <xiyou.wangcong@gmail.com>; Samudrala, Sridhar
> > > <sridhar.samudrala@intel.com>
> > > Subject: Re: [net-next PATCH] net: act_skbedit: Fix tc action skbedit
> > > queue_mapping
> > >
> > > On Mon, Aug 9, 2021 at 4:36 PM Amritha Nambiar
> > > <amritha.nambiar@intel.com> wrote:
> > > >
> > > > For skbedit action queue_mapping to select the transmit queue,
> > > > queue_mapping takes the value N for tx-N (where N is the actual
> > > > queue number). However, current behavior is the following:
> > > > 1. Queue selection is off by 1, tx queue N-1 is selected for
> > > > action skbedit queue_mapping N. (If the general syntax for queue
> > > > index is 1 based, i.e., action skbedit queue_mapping N would
> > > > transmit to tx queue N-1, where N >=1, then the last queue cannot
> > > > be used for transmit as this fails the upper bound check.)
> > > > 2. Transmit to first queue of TCs other than TC0 selects the
> > > > next queue.
> > > > 3. It is not possible to transmit to the first queue (tx-0) as
> > > > this fails the bounds check, in this case the fallback
> > > > mechanism for hash calculation is used.
> > > >
> > > > Fix the call to skb_set_queue_mapping(), the code retrieving the
> > > > transmit queue uses skb_get_rx_queue() which subtracts the queue
> > > > index by 1. This makes it so that "action skbedit queue_mapping N"
> > > > will transmit to tx-N (including the first and last queue).
> > > >
> > > > Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
> > > > Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> > > > ---
> > > > net/sched/act_skbedit.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/sched/act_skbedit.c b/net/sched/act_skbedit.c
> > > > index e5f3fb8b00e3..a7bba15c74c4 100644
> > > > --- a/net/sched/act_skbedit.c
> > > > +++ b/net/sched/act_skbedit.c
> > > > @@ -59,7 +59,7 @@ static int tcf_skbedit_act(struct sk_buff *skb, const
> > > struct tc_action *a,
> > > > }
> > > > if (params->flags & SKBEDIT_F_QUEUE_MAPPING &&
> > > > skb->dev->real_num_tx_queues > params->queue_mapping)
> > > > - skb_set_queue_mapping(skb, params->queue_mapping);
> > > > + skb_set_queue_mapping(skb, params->queue_mapping + 1);
> > > > if (params->flags & SKBEDIT_F_MARK) {
> > > > skb->mark &= ~params->mask;
> > > > skb->mark |= params->mark & params->mask;
> > > >
> > >
> > > I don't think this is correct. It is conflating the rx_queue_mapping
> > > versus the Tx queue mapping. This is supposed to be setting the Tx
> > > queue mapping which applies after we have dropped the Rx queue
> > > mapping, not before. Specifically this is run at the qdisc enqueue
> > > stage with a single locked qdisc, after netdev_pick_tx and skb_tx_hash
> > > have already run. It is something that existed before mq and is meant
> > > to work with the mutliq qdisc.
> > >
> > > If you are wanting to add a seperate override to add support for
> > > programming the Rx queue mapping you may want to submit that as a
> > > different patch rather than trying to change the existing Tx queue
> > > mapping feature. Either that or you would need to change this so that
> > > it has a different behavior depending on where the hook is added since
> > > the behavior would be different if this is called before skb_tx_hash.
> >
> > Hi Alex,
> > Thanks for the review. The goal is to select a transmit queue using tc egress
> rule
> > and the action skbedit (that will go through netdev_pick_tx and
> skb_tx_hash).
> > I am not sure of the correct syntax for the queue-mapping value in the
> > action (tx-N or tx-N+1). As per the man page
> > (https://man7.org/linux/man-pages/man8/tc-skbedit.8.html), I interpreted
> > it as "action skbedit queue_mapping N" will transmit to tx-N. But, the
> > 3 observations I listed don't quite seem to be following the tc rule.
> > Hence, tried to fix this in the action module.
> >
> > -Amritha
>
> Hi Amritha,
>
> As I mentioned before the problem is where the hook is being inserted.
>
> If you follow the example in the documentation found at:
> https://elixir.bootlin.com/linux/v5.14-
> rc5/source/Documentation/networking/multiqueue.rst
>
> What you should find is that the Tx queue hook works as expected
> because it will occur when the packet is enqueued to the qdisc, not
> before. The problem is for the mq qdiscs this is occuring before the
> Tx queue selection where the Rx queue is still active and the Tx queue
> has not yet been selected.
>
> You might take a look at the spots where tcf_classify is called. The
> problem you are experiencing is because this action is expecting to
> edit the packet inside the qdisc, whereas what you are doing is
> calling this in sch_handle_egress which occurs before the Tx queue has
> been selected.
>
> What we likely need is some way of identifying when we are attaching
> the classifier/action to either the ingress or egress classifier
> rather than a qdisc and in that case then we use the
> skb_record_rx_queue functionality instead of the skb_set_queue_mapping
> functionality. One possible way to address this would be to expand
> tc_at_ingress to actually be a 2 bit value and have 0 indicate Qdisc,
> 1 indicate Ingress, and 2 indicate Egress. Then you could filter out
> the Ingress/Egress cases and have them use skb_record_rx_queue while
> the other cases use skb_set_queue_mapping.
>
> Hope that helps to clear it up a bit.
>
> - Alex
Thanks for clarifying. Yes, it works as expected with the multiq qdisc. So, IIUC,
with multiq qdisc, this is hooked such that, netdev_pick_tx and skb_tx_hash
first picks a Tx queue. Then multiq_classify overwrites the queue_mapping
with the value from the skbedit action and calls qdisc_enqueue.
Yes, as you explained, I was hooking this up at the classifier level. I was adding an
egress filter to the clsact qdisc, and expecting the Tx queue to be picked from the
action when netdev_pick_tx executes from sch_handle_egress. This was not getting
into qdisc enqueue stage.
What I actually hoped to do was to add similar functionality (as with multiq) to
other qdiscs such as mqprio. So, to support this action with mqprio qdisc,
I would need to implement the enqueue callback in mqprio and set the
queue_mapping from the action, then run qdisc_enqueue. Is this correct ?
-Amritha
next prev parent reply other threads:[~2021-08-18 0:16 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-09 23:41 Amritha Nambiar
2021-08-10 0:51 ` Alexander Duyck
2021-08-10 1:20 ` Nambiar, Amritha
2021-08-10 14:03 ` Alexander Duyck
2021-08-18 0:16 ` Nambiar, Amritha [this message]
2021-08-18 0:52 ` Alexander Duyck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=PH0PR11MB5207CCFF0DDB54055AE50801F1FF9@PH0PR11MB5207.namprd11.prod.outlook.com \
--to=amritha.nambiar@intel.com \
--cc=alexander.duyck@gmail.com \
--cc=jhs@mojatatu.com \
--cc=jiri@resnulli.us \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=sridhar.samudrala@intel.com \
--cc=xiyou.wangcong@gmail.com \
--subject='RE: [net-next PATCH] net: act_skbedit: Fix tc action skbedit queue_mapping' \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).