Linus Lüssing
2018-Dec-13 16:10 UTC
[Bridge] [PATCH net] net: bridge: remove ipv6 zero address check in mcast queries
Even though RFC4541 recommends this, I'm not quite sure whether this works... even for IGMP. I think this would lead to multicast packet loss in a scenario like this: ---------- [Switch A] -------------- [Switch B] / / / / / / (Host A) (Host B) - Snooping Switches: Switch A + Switch B - Selected Querier: Switch A, with 0.0.0.0 query source - Multicast Listener: Host A - Multicast Data Sender: Host B 1) Host A sends IGMP report to Switch A 2) Switch A refrains from forwarding it to Switch B (reports are only forwarded to multicast routers according to RFC4541) => Switch B does not learn about listeners on Host A Now, with this patch and recommendation in RFC4541 to not add queries with a 0.0.0.0 source address to the multicast router port list: 3) Host B sends multicast data to Switch B => Switch B does not forward it to Switch A as it neither detected a multicast listener nor multicast router on the according port. => Host A does not receive the multicast data it signed up for (Or with colors: https://metameute.de/~tux/linux/bridge/query-zero-source-no-mcrouter-port.png) ---------- Alternatively we would need to ignore 0.0.0.0 for the querier election and "querier present" detection. And by that disable multicast snooping if there are no queries from a non-zero source address. But I'm a little hesitant whether ignoring is a reliable way as IGMPv3 (RFC3376) and IGMPv2 (RFC2236) make no such restrictions regarding the query source address. With no such restrictions according to RFC3376/RFC2236 a 0.0.0.0 would always win the querier election. Meaning any potential querier with a non-zero source address would remain silent. Meaning we would always disable multicast snooping then? Adding queriers with a 0.0.0.0 source address to the multicast router list, too, seems like a less harmful way then disabling multicast snooping completely? ---------- However, one of the two options seems to be necessary. Either reverting the patch for the IGMP part, too. Or Ignoring 0.0.0.0 sources for querier eletcion and presence detection. The current state seems broken to me unless I'm missing something.
Ying Xu
2018-Dec-14 02:32 UTC
[Bridge] [PATCH net] net: bridge: remove ipv6 zero address check in mcast queries
I think the scenario mentioned above is abnormal. According to rfc 4541, multicast router port means this port is attached to a real router. The source of query indicats that is a real router or only a switch.(0.0.0.0 means switch,non-zero means router). In the scenario above,the switch A was selected to be a querier that means A performs as a router, so switch A should config its query source address to non-zero,and then Host A can recieve the traffic from B. On Fri, Dec 14, 2018 at 12:10 AM Linus L?ssing <linus.luessing at c0d3.blue> wrote:> Even though RFC4541 recommends this, I'm not quite sure whether > this works... even for IGMP. > > I think this would lead to multicast packet loss in a scenario > like this: > > ---------- > > [Switch A] -------------- [Switch B] > / / > / / > / / > (Host A) (Host B) > > > - Snooping Switches: Switch A + Switch B > - Selected Querier: Switch A, with 0.0.0.0 query source > - Multicast Listener: Host A > - Multicast Data Sender: Host B > > 1) Host A sends IGMP report to Switch A > 2) Switch A refrains from forwarding it to Switch B > (reports are only forwarded to multicast routers according to > RFC4541) > => Switch B does not learn about listeners on Host A > > Now, with this patch and recommendation in RFC4541 to not add queries > with a 0.0.0.0 source address to the multicast router port list: > > 3) Host B sends multicast data to Switch B > => Switch B does not forward it to Switch A as it neither > detected a multicast listener nor multicast router on > the according port. > => Host A does not receive the multicast data it signed up for > > (Or with colors: > > https://metameute.de/~tux/linux/bridge/query-zero-source-no-mcrouter-port.png > ) > > ---------- > > Alternatively we would need to ignore 0.0.0.0 for the querier > election and "querier present" detection. And by that disable > multicast snooping if there are no queries from a non-zero source > address. > > But I'm a little hesitant whether ignoring is a reliable way as > IGMPv3 (RFC3376) and IGMPv2 (RFC2236) make no such restrictions > regarding the query source address. > > With no such restrictions according to RFC3376/RFC2236 a 0.0.0.0 > would always win the querier election. Meaning any potential > querier with a non-zero source address would remain silent. > Meaning we would always disable multicast snooping then? > > > Adding queriers with a 0.0.0.0 source address to the multicast > router list, too, seems like a less harmful way then disabling multicast > snooping completely? > > ---------- > > However, one of the two options seems to be necessary. Either > reverting the patch for the IGMP part, too. Or Ignoring 0.0.0.0 > sources for querier eletcion and presence detection. > > The current state seems broken to me unless I'm missing something. >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.linuxfoundation.org/pipermail/bridge/attachments/20181214/3b464880/attachment.html>
Linus Lüssing
2018-Dec-17 13:15 UTC
[Bridge] [PATCH net] net: bridge: remove ipv6 zero address check in mcast queries
Hi and thanks for your reply! On Fri, Dec 14, 2018 at 10:32:16AM +0800, Ying Xu wrote:> ?I think the scenario mentioned above is abnormal.Can we agree, that this scenario, if switch A and B were using the current bridge code, has issues right now which it did not have before that patch? I also do not quite understand what you mean with "abnormal". Do you think that it is unlikely to have two snooping switches and general queries with a 0.0.0.0 source? Note that with the current bridge code and according to RFC3376 and RFC2236, as soon as a query with a 0.0.0.0 source is sent somewhere in the broadcast domain, it will become the selected querier [*].> The source of query indicats that is a real router or only a switch.(0.0.0.0 > means switch,non-zero means router). > In the scenario above,the switch A was selected to be a querier that means A > performs as a router, > so switch A should config its query source address to non-zero,and then Host A > can recieve the traffic from B.Even if in the described scenario switch A were configured to use a a non-zero source address to become a router, so that switch B would mark the port to switch A as a multicast router port, switch A would still loose in the querier election right now, as 0.0.0.0 is lower (RFC3376, RFC2236). So switch B would then become the selected querier with its 0.0.0.0 source and switch A would become silent even though it had a non-zero source address. And then we would have the same issue again, only swapped between host+switch A and host+switch B. Would you agree, does that make sense? Regards, Linus [*]: Looking at br_ip4_multicast_select_querier(): If previously selected querier were 0.0.0.0, it would accept any source as a new querier ("!br->ip4_querier.addr.u.ip4"). However, if the previously selected querier were non-zero, a query with 0.0.0.0 would win, too ("ntohl(saddr) <= ntohl(br->ip4_querier.addr.u.ip4)").
Hangbin Liu
2019-Feb-21 08:01 UTC
[Bridge] [PATCH net] net: bridge: remove ipv6 zero address check in mcast queries
Hi Linus, Sorry, my mail client somehow droped your message and I didn't see your reply. I find this mail after Nikolay pointed out yesterday.> Hi and thanks for your reply! > > On Fri, Dec 14, 2018 at 10:32:16AM +0800, Ying Xu wrote: > > I think the scenario mentioned above is abnormal. > > Can we agree, that this scenario, if switch A and B were using the > current bridge code, has issues right now which it did > not have before that patch?Yes, I agree. But this "regression" could be fixed by setting up correct switch configuration. See more explains below.> > I also do not quite understand what you mean with "abnormal". Do > you think that it is unlikely to have two snooping switches and > general queries with a 0.0.0.0 source?The "abnormal" means we shouldn't use this kind of configuration on snooping switch. We could have general queries with a 0.0.0.0 source, but these queries are only from proxy Querier, not from real Querier. Based on RFC 4541, 2.1.1. IGMP Forwarding Rules b) The arrival port for IGMP Queries (sent by multicast routers) where the source address is not 0.0.0.0. The 0.0.0.0 address represents a special case where the switch is proxying IGMP Queries for faster network convergence, but is not itself the Querier. The switch does not use its own IP address (even if it has one), because this would cause the Queries to be seen as coming from a newly elected Querier. The 0.0.0.0 address is used to indicate that the Query packets are NOT from a multicast router. This paragraph specified "0.0.0.0" is only for switch proxying queries. It should not be used as a IGMP Querier. "because this would cause the Queries to be seen as coming from a newly elected Querier" means other address could be elected as a Querier but "0.0.0.0" should not. AFAIK, most verdors(Cisco, HW, etc. expect H3C) use none-zero address as default address on proxying switches. But H3C also respect the b) section.> > Note that with the current bridge code and according to RFC3376 > and RFC2236, as soon as a query with a 0.0.0.0 source is sent somewhere > in the broadcast domain, it will become the selected querier [*].Yes, In RFC 3376 6.6.2. Querier Election IGMPv3 elects a single querier per subnet using the same querier election mechanism as IGMPv2, namely by IP address. When a router receives a query with a lower IP address, it sets the Other-Querier- Present timer to Other Querier Present Interval and ceases to send queries on the network if it was the previously elected querier. After its Other-Querier Present timer expires, it should begin sending General Queries. It only said we should select lower IP address as Querier, but didn't specify the "0.0.0.0" address. While in the later RFC 4541 Abstract This memo describes the recommendations for Internet Group Management Protocol (IGMP) and Multicast Listener Discovery (MLD) snooping switches. These are based on best current practices for IGMPv2, with further considerations for IGMPv3- and MLDv2-snooping. And section 2.1.1. IGMP Forwarding Rules(I just pasted above). It did specify the "0.0.0.0" address issue. I think we should respect the later and accurater standard, shouldn't we?> > > > The source of query indicats that is a real router or only a switch.(0.0.0.0 > > means switch,non-zero means router). > > In the scenario above,the switch A was selected to be a querier that means A > > performs as a router, > > so switch A should config its query source address to non-zero,and then Host A > > can recieve the traffic from B. > > Even if in the described scenario switch A were configured to use a a non-zero > source address to become a router, so that switch B would mark the port > to switch A as a multicast router port, switch A would still loose > in the querier election right now, as 0.0.0.0 is lower (RFC3376, RFC2236). So > switch B would then become the selected querier with its 0.0.0.0 source > and switch A would become silent even though it had a non-zero > source address. > > And then we would have the same issue again, only swapped between > host+switch A and host+switch B. > > Would you agree, does that make sense? >Yes, that's why Ying Xu said the topology is "abnormal", or illegal. Because in a correct configured topology, there should has a Querier with none-zero IP address.> Regards, Linus > > > [*]: Looking at br_ip4_multicast_select_querier(): > > If previously selected querier were 0.0.0.0, it would accept any > source as a new querier ("!br->ip4_querier.addr.u.ip4"). However, > if the previously selected querier were non-zero, a query with > 0.0.0.0 would win, too > ("ntohl(saddr) <= ntohl(br->ip4_querier.addr.u.ip4)").In RFC 3810 7.6.2. Querier Election When a router starts operating on a subnet, by default it considers itself as being the Querier. When a router receives a query with a lower IPv6 address than its own, it sets the Other Querier Present timer to Other Querier Present Timeout; if it was previously in Querier state, it switches to Non- Querier state and ceases to send queries on the link. All MLDv2 queries MUST be sent with the FE80::/64 link-local source address prefix. MLDv2 also sepcified the default Querier is itself and electe a lower adddress as new Querier. And the source address should be link-loacl source address instead of 0("::"). This is another evidence that we should not use "0.0.0.0" as a Querier. So I think we should fix the IPv4 election in function br_ip4_multicast_select_querier(). This looks like a little extreme and may have "regression", but the "regression" should be fixed by setting up correct router/switch configuration. What do you think? Regards Hangbin