bugzilla-daemon at netfilter.org
2024-Nov-07 11:43 UTC
[Bug 1778] New: Skipping garbage collection in nf_conncount.c stops working when jiffies wrap around
https://bugzilla.netfilter.org/show_bug.cgi?id=1778

            Bug ID: 1778
           Summary: Skipping garbage collection in nf_conncount.c stops
                    working when jiffies wrap around
           Product: netfilter/iptables
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P5
         Component: nf_conntrack
          Assignee: netfilter-buglog at lists.netfilter.org
          Reporter: njensen at akamai.com

This previous patch skips garbage collection for nf_conncount if garbage
collection already ran in the same jiffy:
https://github.com/torvalds/linux/commit/d265929930e2ffafc744c0ae05fb70acd53be1ee

In our testing this patch stops working when jiffies wrap around. This happens
once the kernel has been up for 5 minutes, because INITIAL_JIFFIES is set to
-300*HZ, so the low 32 bits of jiffies wrap at that point. When this happens we
observed a massive slowdown for ct counts.

To reproduce, add a simple ruleset with ct counts such as:

table inet filter {
        chain input {
                type filter hook input priority 0;
                ct state { established, related } accept
                reject
        }
        chain OUTPUT {
                type filter hook output priority 0;
                ct count over 100000 drop
                accept
        }
}

To better show the effect I have modified the kernel with a debugging print:

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 4890af4dc263..b39fb3c10c06 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -134,6 +134,8 @@ static int __nf_conncount_add(struct net *net,

        if (time_is_after_eq_jiffies((unsigned long)list->last_gc))
                goto add_new_node;

+       if ((u32)jiffies == list->last_gc)
+               printk(KERN_INFO "Already did GC this jiffy, but not skipping. (u32)jiffies=%d, (unsigned long)list->last_gc=%lu, jiffies=%lu", (u32)jiffies, (unsigned long)list->last_gc, jiffies);
        /* check the saved connections */
        list_for_each_entry_safe(conn, conn_n, &list->head, node) {

After the kernel has run for 5 minutes, the following is logged when quickly
sending a few SYNs:

    Already did GC this jiffy, but not skipping. (u32)jiffies=2541, (unsigned long)list->last_gc=2541, jiffies=4294969837

The problem seems to be that last_gc in struct nf_conncount_list is a u32, but
jiffies is an unsigned long, which is 8 bytes on my systems. Comparing the two
only works until the low 32 bits of jiffies wrap around: after that, last_gc
cast to unsigned long is always about 2^32 smaller than jiffies, so the check
never reports that GC already ran in the current jiffy. The problematic check
is here:
https://github.com/torvalds/linux/blob/master/net/netfilter/nf_conncount.c#L135

One fix could be to check whether last_gc matches the current (u32)jiffies,
like below:

--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -132,7 +132,7 @@
        struct nf_conn *found_ct;
        unsigned int collect = 0;

-       if (time_is_after_eq_jiffies((unsigned long)list->last_gc))
+       if ((u32)jiffies == list->last_gc)
                goto add_new_node;

        /* check the saved connections */
@@ -234,7 +234,7 @@
        bool ret = false;

        /* don't bother if we just did GC */
-       if (time_is_after_eq_jiffies((unsigned long)READ_ONCE(list->last_gc)))
+       if ((u32)jiffies == READ_ONCE(list->last_gc))
                return false;

        /* don't bother if other cpu is already doing GC */
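
For anyone who wants to check the arithmetic without rebuilding a kernel, the
comparison can be modelled in userspace. The following is only a sketch under
stated assumptions: it assumes a 64-bit unsigned long, takes jiffies as a
function argument instead of reading the kernel variable, and relies on
time_is_after_eq_jiffies(a) expanding to (long)((a) - jiffies) >= 0. The names
skip_gc_current(), skip_gc_proposed(), show() and demo() are hypothetical and
do not exist in nf_conncount.c. The "after wrap" value is the jiffies value
from the log line in this report; the "before wrap" value is an arbitrary
value below 2^32.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Model of the existing check on a 64-bit kernel (assumption: unsigned long
 * is 64 bits). last_gc is widened the same way the (unsigned long) cast
 * does, then compared against the full 64-bit jiffies value.
 */
static bool skip_gc_current(uint64_t jiffies, uint32_t last_gc)
{
	uint64_t a = (uint64_t)last_gc;

	return (int64_t)(a - jiffies) >= 0;
}

/* Model of the check proposed in the report: compare the low 32 bits only. */
static bool skip_gc_proposed(uint64_t jiffies, uint32_t last_gc)
{
	return (uint32_t)jiffies == last_gc;
}

/* Hypothetical helper: GC "just ran", so last_gc holds (u32)jiffies. */
static void show(const char *label, uint64_t jiffies)
{
	uint32_t last_gc = (uint32_t)jiffies;

	printf("%s: jiffies=%" PRIu64 " last_gc=%" PRIu32
	       " current=%d proposed=%d\n",
	       label, jiffies, last_gc,
	       skip_gc_current(jiffies, last_gc),
	       skip_gc_proposed(jiffies, last_gc));
}

/* Hypothetical demo entry point; call it from main(). */
void demo(void)
{
	/* Low 32 bits have not wrapped yet: both checks skip GC. */
	assert(skip_gc_current(4294967000ULL, (uint32_t)4294967000ULL));
	show("before wrap", 4294967000ULL);

	/* Value from the log above (2^32 + 2541): only the proposed check skips. */
	assert(!skip_gc_current(4294969837ULL, 2541u));
	show("after wrap", 4294969837ULL);
}
```

With these inputs the model of the current check returns true before the wrap
and false after it, while the model of the proposed check returns true in both
cases, which matches the behaviour described in this report.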