bugzilla-daemon at netfilter.org
2024-Aug-26 09:08 UTC
[Bug 1766] New: nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

            Bug ID: 1766
           Summary: nfqueue randomly drops packets with same tuple
           Product: netfilter/iptables
           Version: unspecified
          Hardware: x86_64
                OS: All
            Status: NEW
          Severity: major
          Priority: P5
         Component: netfilter hooks
          Assignee: netfilter-buglog at lists.netfilter.org
          Reporter: antonio.ojea.garcia at gmail.com

I was puzzled by this problem for a long time. It was first reported in
https://github.com/kubernetes-sigs/kube-network-policies/issues/12 and now in
https://github.com/kubernetes-sigs/kind/issues/3713

It looks like the same symptom described in
https://www.spinics.net/lists/netfilter/msg58296.html, but that one seems to
have been fixed back in the day.

I was able to narrow down the scenario. I will translate the Kubernetes
constructs to namespaces and nodes to describe it better.

2 nodes: N1 and N2
N1 contains two containers:
- client C1 (10.244.1.3)
- DNS server D1 (10.244.1.5)
N2 contains the second DNS server D2 (10.244.2.4)

One rule sends the packets to nfqueue in postrouting (it also happened in
other hooks before). We can assume the set matches the packet and that the
nfqueue userspace program always accepts it:

> chain postrouting {
>     type filter hook postrouting priority srcnat - 5; policy accept;
>     icmpv6 type { nd-neighbor-solicit, nd-neighbor-advert } accept
>     meta skuid 0 accept
>     ct state established,related accept
>     ip saddr @podips-v4 queue flags bypass to 100 comment "process IPv4 traffic with network policy enforcement"
>     ip daddr @podips-v4 queue flags bypass to 100 comment "process IPv4 traffic with network policy enforcement"
>     ip6 saddr @podips-v6 queue flags bypass to 100 comment "process IPv6 traffic with network policy enforcement"
>     ip6 daddr @podips-v6 queue flags bypass to 100 comment "process IPv6 traffic with network policy enforcement"
> }

The containerized DNS servers are abstracted via DNAT behind the virtual IP
10.96.0.10:

> meta l4proto udp ip daddr 10.96.0.10 udp dport 53 counter packets 0 bytes 0 jump KUBE-SVC-TCOU7JCQXEZGVUNU

> chain KUBE-SVC-TCOU7JCQXEZGVUNU {
>     meta l4proto udp ip saddr != 10.244.0.0/16 ip daddr 10.96.0.10 udp dport 53 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
>     meta random & 2147483647 < 1073741824 counter packets 38 bytes 2280 jump KUBE-SEP-CEYPGFB7VCORONY3
>     counter packets 32 bytes 1920 jump KUBE-SEP-RJHMR3QLYGJVBWVL
> }

> chain KUBE-SEP-CEYPGFB7VCORONY3 {
>     ip saddr 10.244.1.5 counter packets 0 bytes 0 jump KUBE-MARK-MASQ
>     meta l4proto udp counter packets 38 bytes 2280 dnat to 10.244.1.5:53
> }

C1 sends a DNS request to the virtual IP 10.96.0.10. Because of the
happy-eyeballs behaviour it sends two packets with the same tuple, one for
the A record and one for the AAAA record.

The symptom is that one of the packets does not come back. See the tcpdump
trace: the packets go out at 22:49:07 but only the A answer comes back; the
client retries the AAAA at 22:49:10 and this time it does come back.

22:49:07.632846 vetha5c90841 In IP (tos 0x0, ttl 64, id 52468, offset 0, flags [DF], proto UDP (17), length 60) 10.244.1.3.48199 > 10.96.0.10.53: 60169+ A? www.google.com. (32)
22:49:07.632909 vetha5c90841 In IP (tos 0x0, ttl 64, id 52469, offset 0, flags [DF], proto UDP (17), length 60) 10.244.1.3.48199 > 10.96.0.10.53: 60459+ AAAA? www.google.com. (32)
22:49:07.633080 veth271ea3e0 Out IP (tos 0x0, ttl 63, id 52468, offset 0, flags [DF], proto UDP (17), length 60) 10.244.1.3.48199 > 10.244.1.5.53: 60169+ A? www.google.com. (32)
22:49:07.633210 eth0 Out IP (tos 0x0, ttl 63, id 52469, offset 0, flags [DF], proto UDP (17), length 60) 10.244.1.3.48199 > 10.244.1.5.53: 60459+ AAAA? www.google.com. (32)
22:49:07.633352 eth0 In IP (tos 0x0, ttl 62, id 52469, offset 0, flags [DF], proto UDP (17), length 60) 10.244.1.3.48199 > 10.244.1.5.53: 60459+ AAAA? www.google.com. (32)
22:49:07.653981 veth271ea3e0 In IP (tos 0x0, ttl 64, id 28750, offset 0, flags [DF], proto UDP (17), length 240) 10.244.1.5.53 > 10.244.1.3.48199: 60169 6/0/0 www.google.com. A 172.217.218.104, www.google.com. A 172.217.218.99, www.google.com. A 172.217.218.106, www.google.com. A 172.217.218.147, www.google.com. A 172.217.218.105, www.google.com. A 172.217.218.103 (212)
22:49:07.654012 vetha5c90841 Out IP (tos 0x0, ttl 63, id 28750, offset 0, flags [DF], proto UDP (17), length 240) 10.96.0.10.53 > 10.244.1.3.48199: 60169 6/0/0 www.google.com. A 172.217.218.104, www.google.com. A 172.217.218.99, www.google.com. A 172.217.218.106, www.google.com. A 172.217.218.147, www.google.com. A 172.217.218.105, www.google.com. A 172.217.218.103 (212)
22:49:10.135710 vetha5c90841 In IP (tos 0x0, ttl 64, id 52470, offset 0, flags [DF], proto UDP (17), length 60) 10.244.1.3.48199 > 10.96.0.10.53: 60459+ AAAA? www.google.com. (32)
22:49:10.135740 veth271ea3e0 Out IP (tos 0x0, ttl 63, id 52470, offset 0, flags [DF], proto UDP (17), length 60) 10.244.1.3.48199 > 10.244.1.5.53: 60459+ AAAA? www.google.com. (32)
22:49:10.136635 veth271ea3e0 In IP (tos 0x0, ttl 64, id 28842, offset 0, flags [DF], proto UDP (17), length 228) 10.244.1.5.53 > 10.244.1.3.48199: 60459 4/0/0 www.google.com. AAAA 2a00:1450:4013:c08::6a, www.google.com. AAAA 2a00:1450:4013:c08::67, www.google.com. AAAA 2a00:1450:4013:c08::63, www.google.com. AAAA 2a00:1450:4013:c08::68 (200)
22:49:10.136669 vetha5c90841 Out IP (tos 0x0, ttl 63, id 28842, offset 0, flags [DF], proto UDP (17), length 228) 10.96.0.10.53 > 10.244.1.3.48199: 60459 4/0/0 www.google.com. AAAA 2a00:1450:4013:c08::6a, www.google.com. AAAA 2a00:1450:4013:c08::67, www.google.com. AAAA 2a00:1450:4013:c08::63, www.google.com. AAAA 2a00:1450:4013:c08::68 (200)
^C
23 packets captured

When tracing the packets I could observe two different drop reasons,
depending on the destination of the DNAT rule: if it is local, the packet is
dropped with SKB_DROP_REASON_IP_RPFILTER; if it is on the other node, it is
dropped with SKB_DROP_REASON_NEIGH_FAILED.

0xffff9527290acb00 3 [<empty>(3178406)] kfree_skb_reason(SKB_DROP_REASON_IP_RPFILTER) 1289 netns=4026533244 mark=0x0 iface=52(eth0) proto=0x0800 mtu=1500 len=60 10.244.1.3:48199->10.244.1.5:53(udp)

and

3:24:37.411 0xffff9534c19a3d00 7 [<empty>(0)] kfree_skb_reason(SKB_DROP_REASON_NEIGH_FAILED) 1583194220087332 netns=4026533244 mark=0x0 iface=5(veth271ea3e0) proto=0x0800 mtu=1500 len=60 10.244.1.3:58611->10.244.2.4:53(udp)

If I enable martian logging (net.ipv4.conf.all.log_martians=1) it also
reports these packets as martians when the destination is on the same node:

[1581593.716839] IPv4: martian source 10.244.1.5 from 10.244.1.3, on dev eth0
[1581593.723848] ll header: 00000000: 02 42 c0 a8 08 05 02 42 c0 a8 08 03 08 00

An interesting detail is that this only seems to happen with DNS (2 packets
with the same tuple) and when there is more than one replica behind the
virtual IP (more than one DNAT rule). When there is only one DNAT rule it
does not happen; this is a fact.

Since the behavior is not deterministic but is reproducible, it makes me
think there is some kind of race where the nfqueue path is not able to
correctly handle the two packets with the same tuple, and on the return path
they get dropped.

I would like some help on two fronts:
- advice on next steps to debug further, or on how I can provide more
  information that would help the maintainers
- advice on how to work around this problem temporarily
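The trigger can be reproduced outside of Kubernetes: two UDP datagrams sent
back to back from the same socket (hence the same 5-tuple) towards the
DNATed virtual IP, which is what the stub resolver's parallel A/AAAA lookup
produces. Below is a minimal client sketch in Python, assuming the
10.96.0.10 VIP from this report is reachable from where it runs; the
addresses and query name are only placeholders taken from the report.

#!/usr/bin/env python3
# Sketch: send an A and an AAAA query back to back from the same UDP socket,
# mimicking the parallel lookup that produces two packets with the same
# 5-tuple. Addresses are the ones from this report (adjust to your setup).
import socket
import struct

VIP = "10.96.0.10"      # DNATed service address (assumption)
NAME = "www.google.com"

def dns_query(qid: int, qtype: int) -> bytes:
    # Header: ID, flags (RD), QDCOUNT=1, AN/NS/AR=0
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(l)]) + l.encode() for l in NAME.split(".")) + b"\x00"
    question = qname + struct.pack(">HH", qtype, 1)  # qtype, class IN
    return header + question

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.settimeout(3)
s.sendto(dns_query(0x1111, 1), (VIP, 53))    # A
s.sendto(dns_query(0x2222, 28), (VIP, 53))   # AAAA, same source port -> same tuple

got = set()
try:
    while len(got) < 2:
        data, _ = s.recvfrom(4096)
        got.add(struct.unpack(">H", data[:2])[0])
except socket.timeout:
    pass

print("answers received for query ids:", [hex(i) for i in got])
# With the bug present, one of the two ids is frequently missing.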
bugzilla-daemon at netfilter.org
2024-Aug-26 16:17 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

Pablo Neira Ayuso <pablo at netfilter.org> changed:

           What    |Removed                     |Added
 ----------------------------------------------------------------------------
                 CC|                            |pablo at netfilter.org

--- Comment #1 from Pablo Neira Ayuso <pablo at netfilter.org> ---
Could you have a look at

# conntrack -S

to check if clash_resolve= gets bumped? I suspect the martians show up
because packets go through without being mangled by NAT.
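For the check suggested above, a small helper like the following can make a
counter bump stand out: it snapshots `conntrack -S` before and after
reproducing the timeout and prints the counters that changed, summed over
CPUs. This is just a convenience sketch; it assumes the conntrack userspace
tool is installed and that it runs with enough privileges.

#!/usr/bin/env python3
# Sketch: diff `conntrack -S` counters around a reproduction attempt.
import re
import subprocess

def snapshot() -> dict:
    out = subprocess.run(["conntrack", "-S"], capture_output=True,
                         text=True, check=True).stdout
    totals = {}
    for key, val in re.findall(r"(\w+)=(\d+)", out):
        if key != "cpu":
            totals[key] = totals.get(key, 0) + int(val)  # sum over CPUs
    return totals

before = snapshot()
input("reproduce the DNS timeout now, then press Enter...")
after = snapshot()

for key in sorted(after):
    delta = after[key] - before.get(key, 0)
    if delta:
        print(f"{key}: +{delta}")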
bugzilla-daemon at netfilter.org
2024-Aug-27 17:15 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

--- Comment #2 from Antonio Ojea <antonio.ojea.garcia at gmail.com> ---
These are the only entries that get bumped when a DNS timeout happens:

# conntrack -S > cs6.log
# diff cs5.log cs6.log
45c45
< cpu=44 found=0 invalid=0 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=0 clash_resolve=0 chaintoolong=0
---
> cpu=44 found=3 invalid=0 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=0 clash_resolve=0 chaintoolong=0

No other stats are bumped if there are no DNS timeouts.
bugzilla-daemon at netfilter.org
2024-Aug-28 06:31 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

--- Comment #3 from Antonio Ojea <antonio.ojea.garcia at gmail.com> ---
I was tracing the packets with this tool from the Cilium folks:
https://github.com/cilium/pwru/tree/main

I think the problem is here. Bear in mind that 10.244.1.5 and 10.244.2.4 are
the DNATed addresses for 10.96.0.10.

SKB CPU PROCESS NETNS MARK/x IFACE PROTO MTU LEN TUPLE FUNC
0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.96.0.10:53(udp) skb_ensure_writable
0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.96.0.10:53(udp) inet_proto_csum_replace4
0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.96.0.10:53(udp) inet_proto_csum_replace4

SKB 0xffff9523b3207080 is DNATed from 10.96.0.10 to 10.244.1.5 on CPU 1:

0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) udp_v4_early_demux
0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) ip_route_input_noref
0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) ip_route_input_slow
0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) fib_validate_source
0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) ip_forward
0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) nf_hook_slow
0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) ip_forward_finish
0xffff9523b3207080 1 <empty>:913286 4026533244 0 vetha5c90841:3 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) ip_output
0xffff9523b3207080 1 <empty>:913286 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) nf_hook_slow
0xffff9523b3207080 1 <empty>:913286 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) apparmor_ip_postroute
0xffff9523b3207080 1 <empty>:913286 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) nf_queue
0xffff9523b3207080 1 <empty>:913286 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) __nf_queue

SEND 0xffff9523b3207080 to the queue.

0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) skb_ensure_writable

0xffff9523b3207080 is returned on CPU 16:

0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) skb_ensure_writable
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) inet_proto_csum_replace4
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) inet_proto_csum_replace4

... and it is DNATed again?? This time to 10.244.2.4 (we drop 10.244.1.5 here):

0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) nf_reroute
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) ip_finish_output
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) __ip_finish_output
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) ip_finish_output2
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) neigh_resolve_output
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) __neigh_event_send
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) skb_clone
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) arp_solicit
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) consume_skb
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) skb_release_head_state
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) skb_release_data
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) kfree_skbmem
bugzilla-daemon at netfilter.org
2024-Sep-01 20:13 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

--- Comment #4 from Antonio Ojea <antonio.ojea.garcia at gmail.com> ---
An interesting observation: the problem only seems to happen when at least
one of the DNAT destinations is in the same namespace where the nfqueue
program runs. I imagine this causes the packet to follow a different
codepath than if the packet is sent out of the node.

What puzzles me is why the packet gets DNATed twice, after __nf_queue and
before nf_reroute:

... 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) __nf_queue
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) skb_ensure_writable
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) skb_ensure_writable
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) inet_proto_csum_replace4
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.1.5:53(udp) inet_proto_csum_replace4
0xffff9523b3207080 16 <empty>:3178406 4026533244 0 veth271ea3e0:5 0x0800 1500 60 10.244.1.3:45957->10.244.2.4:53(udp) nf_reroute
bugzilla-daemon at netfilter.org
2024-Sep-01 20:22 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

Antonio Ojea <antonio.ojea.garcia at gmail.com> changed:

           What    |Removed                     |Added
 ----------------------------------------------------------------------------
           Priority|P5                          |P2
bugzilla-daemon at netfilter.org
2024-Sep-01 20:40 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

--- Comment #5 from Antonio Ojea <antonio.ojea.garcia at gmail.com> ---
The nftables rule does not treat the two packets with the same tuple as the
same connection:

> ct state established,related accept

So it seems the problem is that the same tuple gets DNATed to a different
address for each packet, but there is only one conntrack entry, so the
return packet cannot be handled and is discarded.
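To make the reasoning above concrete, here is a deliberately simplified toy
model in Python. It is an illustration of the statement in this comment, not
of the kernel's actual clash-resolution code: two packets with the same
tuple are each mangled towards the backend their rule evaluation picked, but
only one translation ends up in the single confirmed conntrack entry, so the
reply from the other backend has nothing to reverse-translate it.

#!/usr/bin/env python3
# Toy illustration of the clash described above -- not kernel code.

conntrack = {}  # tuple -> backend recorded in the single confirmed entry

def translate(tuple_, backend_picked):
    """DNAT the packet to whichever backend the rule picked, but only the
    first translation for a tuple gets a confirmed conntrack entry."""
    conntrack.setdefault(tuple_, backend_picked)
    return backend_picked  # the packet itself goes where it was mangled to

def reverse_translate(tuple_, backend):
    """A reply is only recognised if it comes from the recorded backend."""
    return conntrack.get(tuple_) == backend

tuple_ = ("10.244.1.3:48199", "10.96.0.10:53")
a_went_to = translate(tuple_, "10.244.1.5:53")     # A query, rule picked D1
aaaa_went_to = translate(tuple_, "10.244.2.4:53")  # AAAA query, rule picked D2

print("confirmed entry points to:", conntrack[tuple_])
print("A reply handled:   ", reverse_translate(tuple_, a_went_to))     # True
print("AAAA reply handled:", reverse_translate(tuple_, aaaa_went_to))  # False -> dropped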
bugzilla-daemon at netfilter.org
2024-Sep-01 22:03 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

--- Comment #6 from Antonio Ojea <antonio.ojea.garcia at gmail.com> ---
Test case:
https://patchwork.ozlabs.org/project/netfilter-devel/patch/20240901220228.4157482-1-aojea at google.com/
bugzilla-daemon at netfilter.org
2024-Sep-02 07:54 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

--- Comment #7 from Pablo Neira Ayuso <pablo at netfilter.org> ---
Thanks for your test case.

The issue is related to 368982cd7d1b ("netfilter: nfnetlink_queue: resolve
clash for unconfirmed conntracks"), which collides with the new approach to
deal with clash resolution. Let me get back to you with a remedy for this
situation.
bugzilla-daemon at netfilter.org
2024-Sep-12 19:41 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

Pablo Neira Ayuso <pablo at netfilter.org> changed:

           What    |Removed                     |Added
 ----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #8 from Pablo Neira Ayuso <pablo at netfilter.org> ---
Patch attempt to fix this:
https://patchwork.ozlabs.org/project/netfilter-devel/patch/20240912185832.11962-1-pablo at netfilter.org/
bugzilla-daemon at netfilter.org
2024-Sep-18 08:41 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

--- Comment #9 from Antonio Ojea <antonio.ojea.garcia at gmail.com> ---
Still failing. I captured two traces to show the differences.

GOOD ONE

0xffff8dd044679180 0 <empty>:3760 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.3:53(udp) nf_queue nf_hook_slow
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) nf_conntrack_update nfqnl_reinject
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) nf_nat_ipv4_out nfqnl_reinject
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) nf_nat_inet_fn nf_nat_ipv4_out
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) nft_nat_do_chain nf_nat_inet_fn
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) selinux_ip_postroute nfqnl_reinject
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) selinux_ip_postroute_compat selinux_ip_postroute
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) nf_confirm nfqnl_reinject
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) __nf_conntrack_confirm nf_confirm
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) ip_finish_output nfqnl_reinject
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) __ip_finish_output nfqnl_reinject
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) ip_finish_output2 nfqnl_reinject
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) __dev_queue_xmit ip_finish_output2
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) netdev_core_pick_tx __dev_queue_xmit
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) validate_xmit_skb __dev_queue_xmit
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) netif_skb_features validate_xmit_skb
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) passthru_features_check netif_skb_features
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) skb_network_protocol netif_skb_features
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) skb_csum_hwoffload_help validate_xmit_skb
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) validate_xmit_xfrm __dev_queue_xmit
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) dev_hard_start_xmit __dev_queue_xmit
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) veth_xmit dev_hard_start_xmit
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) __dev_forward_skb veth_xmit
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) __dev_forward_skb2 veth_xmit
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) skb_scrub_packet __dev_forward_skb2
0xffff8dd044679680 0 <empty>:1635 4026532316 0 veth60925af6:7 0x0800 1500 73 10.244.0.2:32942->10.244.0.4:53(udp) eth_type_trans __dev_forward_skb2

BAD ONE

0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.3:53(udp) nf_conntrack_update nfqnl_reinject
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.3:53(udp) nf_nat_manip_pkt nf_conntrack_update
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.3:53(udp) nf_nat_ipv4_manip_pkt nf_nat_manip_pkt
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.3:53(udp) skb_ensure_writable nf_nat_ipv4_manip_pkt
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.3:53(udp) l4proto_manip_pkt nf_nat_ipv4_manip_pkt
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.3:53(udp) skb_ensure_writable l4proto_manip_pkt
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.3:53(udp) nf_csum_update l4proto_manip_pkt
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.3:53(udp) inet_proto_csum_replace4 l4proto_manip_pkt
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.3:53(udp) inet_proto_csum_replace4 l4proto_manip_pkt
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) nf_nat_ipv4_out nfqnl_reinject
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) nf_nat_inet_fn nf_nat_ipv4_out
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) selinux_ip_postroute nfqnl_reinject
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) selinux_ip_postroute_compat selinux_ip_postroute
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) nf_confirm nfqnl_reinject
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) ip_finish_output nfqnl_reinject
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) __ip_finish_output nfqnl_reinject
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) ip_finish_output2 nfqnl_reinject
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) neigh_resolve_output ip_finish_output2
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) __neigh_event_send neigh_resolve_output
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) skb_clone neigh_probe
0xffff8dd044679180 0 <empty>:1635 4026532316 0 veth539c3d56:6 0x0800 1500 59 10.244.0.2:32942->10.244.0.4:53(udp) arp_solicit neigh_probe
bugzilla-daemon at netfilter.org
2024-Sep-18 09:33 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

--- Comment #10 from Antonio Ojea <antonio.ojea.garcia at gmail.com> ---
@pablo I'm rereading 368982cd7d1bd41cd39049c794990aca3770db44, and the
problem comes with:

> NAT mangling for the packet losing race is corrected by using the
> conntrack information that won race.

I don't have enough knowledge of the codebase to fully understand all the
logic, but I think the problem comes from the packet being enqueued in
postrouting and the NAT being redone ... if I understand it correctly, it
does not consider the hook the function is called from, so it redoes all of
the NAT, in this case the PREROUTING NAT.

What if it only redid the part of the NAT that belongs to the hook it is
called from? Is that possible? Does it make sense?
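Purely to illustrate the shape of the suggestion above, here is a small
sketch. It is not the kernel's NAT code; every name and data structure in it
is invented, and it only assumes the usual split that destination NAT is
owned by prerouting/output while source NAT is owned by postrouting/input.

#!/usr/bin/env python3
# Hypothetical sketch of "redo only the NAT manipulation owned by the hook
# the packet was queued from" -- not kernel code, all names are invented.

HOOK_TO_MANIP = {
    "prerouting": "dst",   # DNAT is applied here
    "output": "dst",
    "postrouting": "src",  # SNAT/masquerade is applied here
    "input": "src",
}

def renat_on_reinject(pkt: dict, winning_entry: dict, queued_hook: str) -> dict:
    """Re-apply only the manipulation that belongs to the queueing hook."""
    if HOOK_TO_MANIP[queued_hook] == "src":
        pkt["saddr"] = winning_entry["src_translation"]   # redo SNAT only
    else:
        pkt["daddr"] = winning_entry["dst_translation"]   # redo DNAT only
    return pkt

# A packet queued from postrouting would keep the DNAT decision it was
# already routed with, even if the winning conntrack entry points elsewhere:
pkt = {"saddr": "10.244.1.3", "daddr": "10.244.1.5"}
entry = {"src_translation": "10.244.1.3", "dst_translation": "10.244.2.4"}
print(renat_on_reinject(pkt, entry, "postrouting"))  # daddr stays 10.244.1.5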
bugzilla-daemon at netfilter.org
2024-Oct-03 10:43 UTC
[Bug 1766] nfqueue randomly drops packets with same tuple
https://bugzilla.netfilter.org/show_bug.cgi?id=1766

Antonio Ojea <antonio.ojea.garcia at gmail.com> changed:

           What    |Removed                     |Added
 ----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #12 from Antonio Ojea <antonio.ojea.garcia at gmail.com> ---
Fixed by
https://github.com/torvalds/linux/commit/8af79d3edb5fd2dce35ea0a71595b6d4f9962350