bugzilla-daemon at netfilter.org
2017-Mar-08 13:21 UTC
[Bug 1127] New: running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

Bug ID: 1127
Summary: running nft command creates lag for forwarded packets
Product: nftables
Version: unspecified
Hardware: x86_64
OS: Gentoo
Status: NEW
Severity: major
Priority: P5
Component: nft
Assignee: pablo at netfilter.org
Reporter: karel at unitednetworks.cz

We have several routers running Gentoo x86-64 with kernel 4.9.9 or 4.10.1 and about 150 nftables rules (the nftables used is commit da3f503, date 2017-1-03). The hardware is Xeon or i7-7700K with 10GbE Intel or Solarflare NICs. There are a few sets, maps and flows.

Any nft command, be it listing sets, maps or flows, or writing rules, creates lag for forwarded packets. In our test case, ICMP packets going through the boxes normally have about 5ms latency. When nft is running (regardless of whether the command lists a set with a few items or with several thousand items), latencies go up to 30-100ms. This is observed when router throughput is between 600Mbps and 2Gbps. When throughput is about 300Mbps, latencies go up too, but only to 8-12ms.

The routers use multiple NIC queues with affinity to CPU cores; maximum load when the tests were performed was about 10-20%. Userspace processes (including nft) are pinned to different cores than the NIC queues.

Older boxes with iptables but a similar ruleset show no such behaviour. Running "iptables -t mangle -L -v -n", which lists tens of thousands of rules, has almost no effect on the latency of forwarded packets (we observed 0 - 1ms).

It looks like nft takes some kind of global lock in the netfilter kernel code, which stops packet forwarding while nft is running, regardless of the actual nft command performed.

Can anyone explain this behaviour to me? Can it be fixed? It's quite a problem for us because we run hundreds of nft commands per day, and such spikes in the latency of forwarded packets are a deal breaker.
bugzilla-daemon at netfilter.org
2017-Mar-08 13:30 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

Pablo Neira Ayuso <pablo at netfilter.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #1 from Pablo Neira Ayuso <pablo at netfilter.org> ---
Are you using a large set with intervals / netmasks?
bugzilla-daemon at netfilter.org
2017-Mar-08 13:45 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

--- Comment #2 from Karel Rericha <karel at unitednetworks.cz> ---
Yes, we are using several sets with type ipv4_addr, flags interval. But whether we list these sets (even empty ones), sets without the interval flag, or flows, the lag spikes when running nft are still the same 30-100ms. Tested right now.
bugzilla-daemon at netfilter.org
2017-Mar-08 13:50 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

--- Comment #3 from Karel Rericha <karel at unitednetworks.cz> ---
To be precise, we have about 20 sets with the interval flag, which hold from tens of items to tens of thousands of items. We also have 4 maps with almost 20 000 items each (and a few others that are much smaller).
bugzilla-daemon at netfilter.org
2017-Mar-08 14:44 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

--- Comment #4 from Pablo Neira Ayuso <pablo at netfilter.org> ---
Created attachment 495
  --> https://bugzilla.netfilter.org/attachment.cgi?id=495&action=edit
remove central spinlock for rbtree

Could you give this kernel patch a try?
bugzilla-daemon at netfilter.org
2017-Mar-08 16:13 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

--- Comment #5 from Karel Rericha <karel at unitednetworks.cz> ---
Compiled fine on kernel 4.10.1, will test at night, report tomorrow.
bugzilla-daemon at netfilter.org
2017-Mar-09 12:33 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

--- Comment #6 from Karel Rericha <karel at unitednetworks.cz> ---
The patch didn't help, Pablo. Listing this small set each second from the patched box:

testbox ~ # nft list set filter private
table ip filter {
        set private {
                type ipv4_addr
                flags interval
                elements = { 10.0.0.0/8, 100.64.0.0/10, 172.16.0.0/12, 192.168.0.0/16}
        }
}

testbox ~ # watch -n 1 nft list set filter private > /dev/null

and testing latencies of packets going through this box (actual traffic is 1.2Gbps, the real address is replaced by a.b.c.d):

karel2 karel # ping a.b.c.d -i 0.1
PING a.b.c.d (a.b.c.d) 56(84) bytes of data.
64 bytes from a.b.c.d: icmp_seq=1 ttl=60 time=5.58 ms
64 bytes from a.b.c.d: icmp_seq=2 ttl=60 time=5.44 ms
64 bytes from a.b.c.d: icmp_seq=3 ttl=60 time=14.8 ms
64 bytes from a.b.c.d: icmp_seq=4 ttl=60 time=83.1 ms
64 bytes from a.b.c.d: icmp_seq=5 ttl=60 time=5.57 ms
64 bytes from a.b.c.d: icmp_seq=6 ttl=60 time=5.55 ms
64 bytes from a.b.c.d: icmp_seq=7 ttl=60 time=5.54 ms
64 bytes from a.b.c.d: icmp_seq=8 ttl=60 time=5.55 ms
64 bytes from a.b.c.d: icmp_seq=9 ttl=60 time=5.54 ms
64 bytes from a.b.c.d: icmp_seq=10 ttl=60 time=5.55 ms
64 bytes from a.b.c.d: icmp_seq=11 ttl=60 time=5.48 ms
64 bytes from a.b.c.d: icmp_seq=12 ttl=60 time=5.62 ms
64 bytes from a.b.c.d: icmp_seq=13 ttl=60 time=5.44 ms
64 bytes from a.b.c.d: icmp_seq=14 ttl=60 time=5.47 ms
64 bytes from a.b.c.d: icmp_seq=15 ttl=60 time=5.48 ms
64 bytes from a.b.c.d: icmp_seq=16 ttl=60 time=5.49 ms
64 bytes from a.b.c.d: icmp_seq=17 ttl=60 time=5.70 ms
64 bytes from a.b.c.d: icmp_seq=18 ttl=60 time=109 ms
64 bytes from a.b.c.d: icmp_seq=19 ttl=60 time=18.4 ms
64 bytes from a.b.c.d: icmp_seq=20 ttl=60 time=5.56 ms
64 bytes from a.b.c.d: icmp_seq=21 ttl=60 time=5.50 ms
64 bytes from a.b.c.d: icmp_seq=22 ttl=60 time=5.55 ms
64 bytes from a.b.c.d: icmp_seq=23 ttl=60 time=5.51 ms
64 bytes from a.b.c.d: icmp_seq=24 ttl=60 time=5.51 ms
64 bytes from a.b.c.d: icmp_seq=25 ttl=60 time=5.47 ms
64 bytes from a.b.c.d: icmp_seq=26 ttl=60 time=5.55 ms
64 bytes from a.b.c.d: icmp_seq=27 ttl=60 time=5.45 ms
64 bytes from a.b.c.d: icmp_seq=28 ttl=60 time=5.55 ms
64 bytes from a.b.c.d: icmp_seq=29 ttl=60 time=5.48 ms
64 bytes from a.b.c.d: icmp_seq=30 ttl=60 time=5.53 ms
64 bytes from a.b.c.d: icmp_seq=31 ttl=60 time=5.51 ms
64 bytes from a.b.c.d: icmp_seq=32 ttl=60 time=28.1 ms
64 bytes from a.b.c.d: icmp_seq=33 ttl=60 time=49.2 ms
64 bytes from a.b.c.d: icmp_seq=34 ttl=60 time=5.47 ms
64 bytes from a.b.c.d: icmp_seq=35 ttl=60 time=5.57 ms
64 bytes from a.b.c.d: icmp_seq=36 ttl=60 time=5.54 ms
64 bytes from a.b.c.d: icmp_seq=37 ttl=60 time=5.51 ms
64 bytes from a.b.c.d: icmp_seq=38 ttl=60 time=5.45 ms
64 bytes from a.b.c.d: icmp_seq=39 ttl=60 time=5.54 ms
64 bytes from a.b.c.d: icmp_seq=40 ttl=60 time=5.54 ms
64 bytes from a.b.c.d: icmp_seq=41 ttl=60 time=5.54 ms
64 bytes from a.b.c.d: icmp_seq=42 ttl=60 time=5.59 ms
64 bytes from a.b.c.d: icmp_seq=43 ttl=60 time=5.49 ms
64 bytes from a.b.c.d: icmp_seq=44 ttl=60 time=5.49 ms
64 bytes from a.b.c.d: icmp_seq=45 ttl=60 time=5.53 ms
64 bytes from a.b.c.d: icmp_seq=46 ttl=60 time=26.3 ms
64 bytes from a.b.c.d: icmp_seq=47 ttl=60 time=84.3 ms
64 bytes from a.b.c.d: icmp_seq=48 ttl=60 time=5.46 ms
64 bytes from a.b.c.d: icmp_seq=49 ttl=60 time=5.54 ms
64 bytes from a.b.c.d: icmp_seq=50 ttl=60 time=5.47 ms
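For anyone repeating this measurement, the spikes in a ping log like the one in comment #6 can be picked out mechanically. The following is only a minimal sketch in Python: the function name `find_spikes` and the 10 ms threshold are assumptions chosen for illustration, not something used in the original test, and the sample is an excerpt of the log from that comment.

```python
import re

# Matches the per-reply lines printed by ping, e.g.
# "64 bytes from a.b.c.d: icmp_seq=4 ttl=60 time=83.1 ms"
PING_LINE = re.compile(r"icmp_seq=(\d+)\s.*?time=([\d.]+)\s*ms")


def find_spikes(ping_output, threshold_ms=10.0):
    """Return (icmp_seq, rtt_ms) pairs whose RTT exceeds threshold_ms.

    threshold_ms is an illustrative assumption; pick a value that
    suits the baseline latency of the path under test.
    """
    spikes = []
    for line in ping_output.splitlines():
        match = PING_LINE.search(line)
        if match:
            seq, rtt = int(match.group(1)), float(match.group(2))
            if rtt > threshold_ms:
                spikes.append((seq, rtt))
    return spikes


# Excerpt of the ping log from comment #6.
SAMPLE = """\
64 bytes from a.b.c.d: icmp_seq=1 ttl=60 time=5.58 ms
64 bytes from a.b.c.d: icmp_seq=2 ttl=60 time=5.44 ms
64 bytes from a.b.c.d: icmp_seq=3 ttl=60 time=14.8 ms
64 bytes from a.b.c.d: icmp_seq=4 ttl=60 time=83.1 ms
64 bytes from a.b.c.d: icmp_seq=5 ttl=60 time=5.57 ms
"""

if __name__ == "__main__":
    for seq, rtt in find_spikes(SAMPLE):
        print(f"icmp_seq={seq}: {rtt} ms")
```

Feeding it the full log while `watch -n 1 nft list set ...` runs on the router should show the spikes roughly once per second, which matches the ping interval of 0.1 s used in the test.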
bugzilla-daemon at netfilter.org
2017-Mar-09 13:01 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

--- Comment #7 from Pablo Neira Ayuso <pablo at netfilter.org> ---
(In reply to Karel Rericha from comment #6)
> Patch didnt help Pablo. Listing this small set each second from patched box:
>
> testbox ~ # nft list set filter private
> table ip filter {
> set private {
> type ipv4_addr
> flags interval
> elements = { 10.0.0.0/8, 100.64.0.0/10, 172.16.0.0/12,
> 192.168.0.0/16}
> }
> }
>
> testbox ~ # watch -n 1 nft list set filter private > /dev/null

I guess you don't hit this problem if you just use non-interval sets.

The good solution is to provide a replacement for the interval set implementation that we have now, which doesn't scale up, as you're noticing.

I can work around this by providing a lockless path for listing. Still, if you have dynamic insertions on that interval tree, you would hit problems. So the definitive solution is to provide a replacement implementation for interval sets.
bugzilla-daemon at netfilter.org
2017-Mar-09 14:32 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

Liping Zhang <zlpnobody at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zlpnobody at gmail.com

--- Comment #8 from Liping Zhang <zlpnobody at gmail.com> ---
Created attachment 496
  --> https://bugzilla.netfilter.org/attachment.cgi?id=496&action=edit
use rwlock for rbtree

Hi Pablo,

Based on your patch, I think we can convert the spinlock to an rwlock to improve scalability, since "listing sets" and "forwarding packets" are both readers. So after switching to an rwlock, they will no longer contend with each other for the lock.

@Karel, I think the attached patch can help in your situation, can you try it?
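The idea behind comment #8 (readers share the lock, writers take it exclusively) can be illustrated with a toy model. The sketch below is userspace Python, not the kernel's rwlock and not the attached patch; the class `ReadWriteLock`, its method names and the `demo` function are hypothetical names invented for this illustration.

```python
import threading


class ReadWriteLock:
    """Toy readers-writer lock: many readers or one writer at a time.

    Illustrative only; it does not model fairness or writer starvation.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of readers currently holding the lock
        self._writer = False   # True while a writer holds the lock

    def acquire_read(self, timeout=None):
        with self._cond:
            # Readers only have to wait for an active writer.
            ok = self._cond.wait_for(lambda: not self._writer, timeout)
            if ok:
                self._readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self, timeout=None):
        with self._cond:
            # A writer needs the lock entirely to itself.
            ok = self._cond.wait_for(
                lambda: not self._writer and self._readers == 0, timeout)
            if ok:
                self._writer = True
            return ok

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def demo():
    lock = ReadWriteLock()
    # The packet path and "nft list" both only read the set.
    packet_path = lock.acquire_read(timeout=0)
    nft_list = lock.acquire_read(timeout=0)        # not blocked by the first reader
    # A modification (e.g. inserting an element) must wait for the readers.
    writer_while_reading = lock.acquire_write(timeout=0)
    lock.release_read()
    lock.release_read()
    writer_after = lock.acquire_write(timeout=0)   # succeeds once readers are gone
    lock.release_write()
    return packet_path, nft_list, writer_while_reading, writer_after


if __name__ == "__main__":
    print(demo())
```

With a plain exclusive lock, the second `acquire_read` would have had to wait for the first holder, which is the contention between listing and forwarding described in this thread. Writers still exclude everyone, which is consistent with Pablo's remark in comment #7 that dynamic insertions into the interval tree would remain a problem.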
bugzilla-daemon at netfilter.org
2017-Mar-09 15:00 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

--- Comment #9 from Karel Rericha <karel at unitednetworks.cz> ---
We are blocking "known attackers" in the firewall with a set compiled from several sources, which currently holds about 90 000 IP addresses. This set has the interval flag and accounts for about 90% of the items in our sets with the interval flag. It can be safely converted to a set without the interval flag, because it contains only IP addresses, no subnets.

I will separately try the second patch and converting this set to a set without the interval flag. Report tomorrow.
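Before dropping the interval flag from a large set, it may be worth confirming that every element really is a single address. The following Python sketch uses the standard `ipaddress` module; the function name `elements_needing_interval` is hypothetical, and it assumes elements are written either as bare addresses or as CIDR prefixes (range notation such as `a-b` is not handled).

```python
import ipaddress


def elements_needing_interval(elements):
    """Return the elements that cover more than one address.

    Assumes each element is a bare address or a CIDR prefix.
    An empty result suggests the set holds only single hosts.
    """
    ranges = []
    for element in elements:
        # strict=False tolerates prefixes with host bits set, e.g. 10.1.2.3/8.
        network = ipaddress.ip_network(element, strict=False)
        if network.num_addresses > 1:
            ranges.append(element)
    return ranges


if __name__ == "__main__":
    # Documentation addresses used as stand-ins for an attacker list.
    attackers = ["192.0.2.1", "198.51.100.7", "203.0.113.20/32"]
    print(elements_needing_interval(attackers))

    # The "private" set from comment #6 does need intervals.
    private = ["10.0.0.0/8", "100.64.0.0/10", "172.16.0.0/12", "192.168.0.0/16"]
    print(elements_needing_interval(private))
```

A set like "private" from comment #6 contains real prefixes and therefore has to keep the interval flag, while a list of individual attacker addresses does not.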
bugzilla-daemon at netfilter.org
2017-Mar-09 15:35 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

--- Comment #10 from Karel Rericha <karel at unitednetworks.cz> ---
First result:

Converting the "known attackers" set to a set without the interval flag DID help. Pablo was right, no visible latency problems now.

I will try the second patch at night.
bugzilla-daemon at netfilter.org
2017-Mar-10 08:16 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

--- Comment #11 from Karel Rericha <karel at unitednetworks.cz> ---
Fell asleep last night :) Will test second patch next week. Sry :|
bugzilla-daemon at netfilter.org
2017-Mar-12 08:37 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

--- Comment #12 from Karel Rericha <karel at unitednetworks.cz> ---
Second result:

Zhang, your patch DID help. When listing a small set as in the previous test, there were no visible latency problems. When listing the big "known attackers" set, there were latency spikes of around 15ms. I waited until actual traffic was 1.2Gbps, so both tests are more or less comparable.

Summary:

Getting rid of interval sets obviously solves the "big sets with interval flag" latency problem for good :) But when big sets with the interval flag are needed, Zhang's patch improves the situation a lot: listing small sets no longer has any latency impact, and listing big sets with the interval flag raises latency several times less than the previous code did.

Pablo, Zhang, thanks for your help.
bugzilla-daemon at netfilter.org
2017-Mar-14 15:17 UTC
[Bug 1127] running nft command creates lag for forwarded packets
https://bugzilla.netfilter.org/show_bug.cgi?id=1127

Liping Zhang <zlpnobody at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #13 from Liping Zhang <zlpnobody at gmail.com> ---
Karel, thanks very much for your report and testing. The patch was accepted, and you can see it in Linux 4.12:
https://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git/commit/?id=03e5fd0e9bcc1f34b7a542786b34b8f771e7c260