Netdev Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Vladimir Oltean <vladimir.oltean@nxp.com>
To: netdev@vger.kernel.org, Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>
Cc: Andrew Lunn <andrew@lunn.ch>,
	Florian Fainelli <f.fainelli@gmail.com>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Vladimir Oltean <olteanv@gmail.com>
Subject: [PATCH v3 net-next 7/8] net: dsa: sja1105: suppress TX packets from looping back in "H" topologies
Date: Wed,  4 Aug 2021 16:54:35 +0300	[thread overview]
Message-ID: <20210804135436.1741856-8-vladimir.oltean@nxp.com> (raw)
In-Reply-To: <20210804135436.1741856-1-vladimir.oltean@nxp.com>

H topologies like this one have a problem:

         eth0                                                     eth1
          |                                                        |
       CPU port                                                CPU port
          |                        DSA link                        |
 sw0p0  sw0p1  sw0p2  sw0p3  sw0p4 -------- sw1p4  sw1p3  sw1p2  sw1p1  sw1p0
   |             |      |                            |      |             |
 user          user   user                         user   user          user
 port          port   port                         port   port          port

Basically any packet sent by the eth0 DSA master can be flooded on the
interconnecting DSA link sw0p4 <-> sw1p4 and it will be received by the
eth1 DSA master too. Basically we are talking to ourselves.

In VLAN-unaware mode, these packets are encoded using a tag_8021q TX
VLAN, which dsa_8021q_rcv() rightfully cannot decode and complains.
Whereas in VLAN-aware mode, the packets are encoded with a bridge VLAN
which _can_ be decoded by the tagger running on eth1, so it will attempt
to reinject that packet into the network stack (the bridge, if there is
any port under eth1 that is under a bridge). In the case where the ports
under eth1 are under the same cross-chip bridge as the ports under eth0,
the TX packets will even be learned as RX packets. The only thing that
will prevent loops with the software bridging path, and therefore
disaster, is that the source port and the destination port are in the
same hardware domain, and the bridge will receive packets from the
driver with skb->offload_fwd_mark = true and will not forward between
the two.

The proper solution to this problem is to detect H topologies and
enforce that all packets are received through the local switch and we do
not attempt to receive packets on our CPU port from switches that have
their own. This is a viable solution which works thanks to the fact that
MAC addresses which should be filtered towards the host are installed by
DSA as static MAC addresses towards the CPU port of each switch.

TX from a CPU port towards the DSA port continues to be allowed, this is
because sja1105 supports bridge TX forwarding offload, and the skb->dev
used initially for xmit does not have any direct correlation with where
the station that will respond to that packet is connected. It may very
well happen that when we send a ping through a br0 interface that spans
all switch ports, the xmit packet will exit the system through a DSA
switch interface under eth1 (say sw1p2), but the destination station is
connected to a switch port under eth0, like sw0p0. So the switch under
eth1 needs to communicate on TX with the switch under eth0. The
response, however, will not follow the same path, but instead, this
patch enforces that the response is sent by the first switch directly to
its DSA master which is eth0.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/dsa/sja1105/sja1105_main.c | 29 ++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index fffcaef6b148..b3b5ae3ef408 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -474,7 +474,9 @@ static int sja1105_init_l2_forwarding(struct sja1105_private *priv)
 {
 	struct sja1105_l2_forwarding_entry *l2fwd;
 	struct dsa_switch *ds = priv->ds;
+	struct dsa_switch_tree *dst;
 	struct sja1105_table *table;
+	struct dsa_link *dl;
 	int port, tc;
 	int from, to;
 
@@ -547,6 +549,33 @@ static int sja1105_init_l2_forwarding(struct sja1105_private *priv)
 		}
 	}
 
+	/* In odd topologies ("H" connections where there is a DSA link to
+	 * another switch which also has its own CPU port), TX packets can loop
+	 * back into the system (they are flooded from CPU port 1 to the DSA
+	 * link, and from there to CPU port 2). Prevent this from happening by
+	 * cutting RX from DSA links towards our CPU port, if the remote switch
+	 * has its own CPU port and therefore doesn't need ours for network
+	 * stack termination.
+	 */
+	dst = ds->dst;
+
+	list_for_each_entry(dl, &dst->rtable, list) {
+		if (dl->dp->ds != ds || dl->link_dp->cpu_dp == dl->dp->cpu_dp)
+			continue;
+
+		from = dl->dp->index;
+		to = dsa_upstream_port(ds, from);
+
+		dev_warn(ds->dev,
+			 "H topology detected, cutting RX from DSA link %d to CPU port %d to prevent TX packet loops\n",
+			 from, to);
+
+		sja1105_port_allow_traffic(l2fwd, from, to, false);
+
+		l2fwd[from].bc_domain &= ~BIT(to);
+		l2fwd[from].fl_domain &= ~BIT(to);
+	}
+
 	/* Finally, manage the egress flooding domain. All ports start up with
 	 * flooding enabled, including the CPU port and DSA links.
 	 */
-- 
2.25.1


  parent reply	other threads:[~2021-08-04 13:55 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-04 13:54 [PATCH v3 net-next 0/8] NXP SJA1105 driver support for "H" switch topologies Vladimir Oltean
2021-08-04 13:54 ` [PATCH v3 net-next 1/8] net: dsa: rename teardown_default_cpu to teardown_cpu_ports Vladimir Oltean
2021-08-04 13:54 ` [PATCH v3 net-next 2/8] net: dsa: give preference to local CPU ports Vladimir Oltean
2021-08-04 13:54 ` [PATCH v3 net-next 3/8] net: dsa: sja1105: configure the cascade ports based on topology Vladimir Oltean
2021-08-04 13:54 ` [PATCH v3 net-next 4/8] net: dsa: sja1105: manage the forwarding domain towards DSA ports Vladimir Oltean
2021-08-04 13:54 ` [PATCH v3 net-next 5/8] net: dsa: sja1105: manage VLANs on cascade ports Vladimir Oltean
2021-08-04 13:54 ` [PATCH v3 net-next 6/8] net: dsa: sja1105: increase MTU to account for VLAN header on DSA ports Vladimir Oltean
2021-08-04 13:54 ` Vladimir Oltean [this message]
2021-08-04 13:54 ` [PATCH v3 net-next 8/8] net: dsa: sja1105: enable address learning on cascade ports Vladimir Oltean
2021-08-05 10:10 ` [PATCH v3 net-next 0/8] NXP SJA1105 driver support for "H" switch topologies patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210804135436.1741856-8-vladimir.oltean@nxp.com \
    --to=vladimir.oltean@nxp.com \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=vivien.didelot@gmail.com \
    --subject='Re: [PATCH v3 net-next 7/8] net: dsa: sja1105: suppress TX packets from looping back in "H" topologies' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).