Ido Schimmel
2018-Dec-12 09:02 UTC
[Bridge] Correct PVID behavior with bridge's VLAN filtering on/off?
On Tue, Dec 11, 2018 at 11:48:21AM -0800, Florian Fainelli wrote:> Hi Nikolay, Roopa, Jiri, Ido, > > When a bridge has vlan_filtering=0 and notifies a switch driver through > HOST_OBJ_MDB about MC addresses that the CPU/management port is > interested in getting MC traffic for, I am seeing that the mdb->vid is > set to 0 because br_allowed_ingress() checks for BROPT_VLAN_ENABLED > which is now disabled and so we never populated *vid to anything but 0 > because the caller: br_handle_frame_finish() zeroed it out.s/br_handle_frame_finish()/br_dev_xmit()/ ? Since you're talking about HOST_OBJ_MDB> This creates a problem with the b53 DSA switch driver because in order > to match the bridge's default_pvid, we did program the switch's "default > tag" to be 1, which gets used for all untagged frames that ingress the > switch (which AFAICT is correct behavior for PVID).Not sure I'm following. If bridge is not VLAN-aware, then where do you see 'default_pvid' being used?> Despite having turned off VLAN filtering in the switch such that it does > allow ingress of packets with a VID that is not present in the VLAN > table (violation), Multicast addresses do behave differently and we > really must be strictly matching the programmed PVID in order for MC > frames to ingress the switch even with VLAN filtering turned off. > > So with all that being written, should the bridge still be sending MDB > notifications and use the bridge's default_pvid even with > vlan_filtering=0? And if we did that, what use case could we be possibly > breaking? > > Let me know if this is not clear so I can provide mode details.I think you need to provide more details about the device you're working with. I can explain what we're doing in mlxsw for reference. When you use a VLAN-unaware bridge w/o VLAN devices, we make sure all untagged packets get tagged with some arbitrary VLAN (now 1, soon 4095). You never see this VLAN on the wire, since we remove it before sending the packets. It is only used because all packets in the ASIC must be tagged. After we have a VLAN we classify the packet to a FID (bridge) and it does {FID,DMAC} lookup in the FDB (MDB). IIUC, your problem is that you also need to tag all the packets (you used '1', can be something else), but then you program the MDB entry according to the VLAN passed in the notification ('0') and not use ('1'). We completely ignore the VID in this case and use the FID which we lookup based on the ifindex of the bridge.
Florian Fainelli
2018-Dec-12 19:52 UTC
[Bridge] Correct PVID behavior with bridge's VLAN filtering on/off?
On 12/12/18 1:02 AM, Ido Schimmel wrote:> On Tue, Dec 11, 2018 at 11:48:21AM -0800, Florian Fainelli wrote: >> Hi Nikolay, Roopa, Jiri, Ido, >> >> When a bridge has vlan_filtering=0 and notifies a switch driver through >> HOST_OBJ_MDB about MC addresses that the CPU/management port is >> interested in getting MC traffic for, I am seeing that the mdb->vid is >> set to 0 because br_allowed_ingress() checks for BROPT_VLAN_ENABLED >> which is now disabled and so we never populated *vid to anything but 0 >> because the caller: br_handle_frame_finish() zeroed it out. > > s/br_handle_frame_finish()/br_dev_xmit()/ ? Since you're talking about > HOST_OBJ_MDBThis affects the bridge ingress path as well, since I use HOST_OBJ_MDB to indicate whether the CPU port wants to receive multicast, transmitting multicast from the CPU port is almost never a problem. Is that a correct use of HOST_OBJ_MDB?> >> This creates a problem with the b53 DSA switch driver because in order >> to match the bridge's default_pvid, we did program the switch's "default >> tag" to be 1, which gets used for all untagged frames that ingress the >> switch (which AFAICT is correct behavior for PVID). > > Not sure I'm following. If bridge is not VLAN-aware, then where do you > see 'default_pvid' being used?A key detail I missed is that this is done with 4.9 (for now, in the process of forward porting fixes to net-next right now) which does not have this commit from Andrew: 2ea7a679ca2abd251c1ec03f20508619707e1749 ("net: dsa: Don't add vlans when vlan filtering is disabled") so the switch is actually VLAN aware, just it does not do strict VID violation enforcement policy, I like that behavior, but Jiri corrected me that this is not quite how it is defined.m.> >> Despite having turned off VLAN filtering in the switch such that it does >> allow ingress of packets with a VID that is not present in the VLAN >> table (violation), Multicast addresses do behave differently and we >> really must be strictly matching the programmed PVID in order for MC >> frames to ingress the switch even with VLAN filtering turned off. >> >> So with all that being written, should the bridge still be sending MDB >> notifications and use the bridge's default_pvid even with >> vlan_filtering=0? And if we did that, what use case could we be possibly >> breaking? >> >> Let me know if this is not clear so I can provide mode details. > > I think you need to provide more details about the device you're working > with. I can explain what we're doing in mlxsw for reference. > > When you use a VLAN-unaware bridge w/o VLAN devices, we make sure all > untagged packets get tagged with some arbitrary VLAN (now 1, soon 4095). > You never see this VLAN on the wire, since we remove it before sending > the packets. It is only used because all packets in the ASIC must be > tagged. > > After we have a VLAN we classify the packet to a FID (bridge) and it > does {FID,DMAC} lookup in the FDB (MDB). > > IIUC, your problem is that you also need to tag all the packets (you > used '1', can be something else), but then you program the MDB entry > according to the VLAN passed in the notification ('0') and not use > ('1'). We completely ignore the VID in this case and use the FID which > we lookup based on the ifindex of the bridge. >We do not have a concept of a FID with Broadcom switches, so we can either use a reserved VLAN ID to emulate that behavior and do individual MAC address learning which hashes into VID,MAC. The switch has a "default 802.1Q tag" which gets used for untagged packets. Internally the switch normalizes all incoming frames (when 802.1Q is enabled) to have a double VLAN tag, and untagged frames get mapped to that "default 802.1Q/PVID tag" in the processing pipeline, and then when they ingress the destination switch port, they can get untagged again using the ingress port's "default 802.1Q/PVID tag" again. The CPU port remains in the "default 802.1Q tag" set to 0, because that is also the configuration for non-bridge ports, and I need that to continue getting non-bridge ports to function normally and not be subjected to vlan_filtering = 1 being applied on the bridge (I will send a documentation patch that hopefully clarifies what the correct port behavior is and request feedback on that). So here are essentially 3 things that could be fixed/tackled more or less independently: - because vlan_filtering = 0, we have the HOST_OBJ_MDB request coming with mdb->vid = 0, which is expected with the current bridge code - because the switch is made 802.1Q/VLAN aware, we have the ports that are bridge member configured with PVID = default_pvid (old kernel behavior, prior to Andrew's change), such that untagged frames show up untagged correctly at the network device level - CPU/management port's default untagged VID is 0 which matches mdb->vid, but the bridged port, which is the ingress port for MC traffic is in VID 1 and where MC ingress filter checking is done. So there is a VID mismatch, and despite filtering being turned off, MC traffic does not evade that restriction (could be a misconfiguration on my side, could not find something that would allow it to just pass through). With this one liner change both vlan_filtering states now work correctly with respect to MC: diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c index b6de4f457161..fe446e971456 100644 --- a/net/bridge/br_vlan.c +++ b/net/bridge/br_vlan.c @@ -482,6 +482,7 @@ bool br_allowed_ingress(const struct net_bridge *br, */ if (!br->vlan_enabled) { BR_INPUT_SKB_CB(skb)->vlan_filtered = false; + *vid = br->default_pvid; return true; } Or I suppose that I could just backport Andrew's patch and that would remove all VLAN awareness in the bridge, which would likely solve the problem as well, and/or find a way to make sure that MC flows do bypass VLAN filtering after all when vlan_filtering = 0. Thanks for your patience. -- Florian
Florian Fainelli
2018-Dec-15 18:10 UTC
[Bridge] Correct PVID behavior with bridge's VLAN filtering on/off?
Le 12/12/18 ? 1:02 AM, Ido Schimmel a ?crit?:> On Tue, Dec 11, 2018 at 11:48:21AM -0800, Florian Fainelli wrote: >> Hi Nikolay, Roopa, Jiri, Ido, >> >> When a bridge has vlan_filtering=0 and notifies a switch driver through >> HOST_OBJ_MDB about MC addresses that the CPU/management port is >> interested in getting MC traffic for, I am seeing that the mdb->vid is >> set to 0 because br_allowed_ingress() checks for BROPT_VLAN_ENABLED >> which is now disabled and so we never populated *vid to anything but 0 >> because the caller: br_handle_frame_finish() zeroed it out. > > s/br_handle_frame_finish()/br_dev_xmit()/ ? Since you're talking about > HOST_OBJ_MDB > >> This creates a problem with the b53 DSA switch driver because in order >> to match the bridge's default_pvid, we did program the switch's "default >> tag" to be 1, which gets used for all untagged frames that ingress the >> switch (which AFAICT is correct behavior for PVID). > > Not sure I'm following. If bridge is not VLAN-aware, then where do you > see 'default_pvid' being used? > >> Despite having turned off VLAN filtering in the switch such that it does >> allow ingress of packets with a VID that is not present in the VLAN >> table (violation), Multicast addresses do behave differently and we >> really must be strictly matching the programmed PVID in order for MC >> frames to ingress the switch even with VLAN filtering turned off. >> >> So with all that being written, should the bridge still be sending MDB >> notifications and use the bridge's default_pvid even with >> vlan_filtering=0? And if we did that, what use case could we be possibly >> breaking? >> >> Let me know if this is not clear so I can provide mode details. > > I think you need to provide more details about the device you're working > with. I can explain what we're doing in mlxsw for reference. > > When you use a VLAN-unaware bridge w/o VLAN devices, we make sure all > untagged packets get tagged with some arbitrary VLAN (now 1, soon 4095). > You never see this VLAN on the wire, since we remove it before sending > the packets. It is only used because all packets in the ASIC must be > tagged. > > After we have a VLAN we classify the packet to a FID (bridge) and it > does {FID,DMAC} lookup in the FDB (MDB). > > IIUC, your problem is that you also need to tag all the packets (you > used '1', can be something else), but then you program the MDB entry > according to the VLAN passed in the notification ('0') and not use > ('1'). We completely ignore the VID in this case and use the FID which > we lookup based on the ifindex of the bridge. >Another thing that seems inconsistent or rather possibly problematic to deal with is the following: - create the bridge with VLAN filtering set, this leads to programming VLAN entries into the switch through switchdev notifications, that is expected and working - turn off VLAN filtering on that bridge, this trickles down through attributes notification to the switch driver which now disables ingress VID checking (egress directed in Broadcom B53 is not easily enforceable), we still have VLAN entries programmed into the switch, in particular, the port's default VLAN/PVID, which is contributing to my problem mentioned above - bridge is now requesting MDB programming to be done with VID=0 since the bridge is now VLAN-unaware One might expect that when turning off VLAN filtering, the bridge layer should also remove any programmed VLAN entries? In spectrum_switchdev.c an error is issued to indicate that changing VLAN filtering is not possible once the bridge has been created with VLAN filtering on initially. This is not necessarily something I want to restrict within B53, because we ought to support dynamically turning on/off VLAN filtering on the switch device and driver. I am not aware of an use case for that expect my own tests so far, but clearly we can support it, so why not. Instead of the patch I copied in my previous response where I would change br_allowed_ingress(), I am considering modifying the port's default PVID to match whatever the default VLAN is when not using a bridge (0 in my case) when the currently configured PVID is not that default PVID, conversely when re-enabling VLAN filtering, putting the port back on the bridge's default_pvid. Ido, does that make sense to you or would you advocate not to bother at all with that use case and do what spectrum_switchdev.c does? I would also appreciate if you could take a look at this patch since it would answer a lot of my questions, thank you very much! http://patchwork.ozlabs.org/patch/1012404/ -- Florian