Hi all,

I'm having a problem with this system configuration:

1) iptables 1.3.7
2) kernel 2.6.19.1
3) SMP computer
4) 2 external links + 2 internal (bridged)

After the system has been working for some hours without any trouble, all network devices stop responding.

Could anyone help me fix this problem?

After some hours of googling, I found that this was a problem with old kernels and was solved in kernel version 2.6.11.

Any help will be appreciated.

Regards.

P.S.: With MASQUERADE the problem begins sooner than with the SNAT target.
The log says:

Dec 30 00:52:27 cura kernel: dst cache overflow
Dec 30 00:52:27 cura kernel: MASQUERADE: No route: Rusty's brain broke!
Dec 30 00:52:27 cura kernel: dst cache overflow
Dec 30 00:52:28 cura kernel: zlan0: received tcn bpdu on port 1(eth0)
Dec 30 00:52:28 cura kernel: zlan0: topology change detected, propagating
Dec 30 00:52:28 cura kernel: dst cache overflow
Dec 30 00:52:30 cura kernel: zlan0: received tcn bpdu on port 1(eth0)
Dec 30 00:52:30 cura kernel: zlan0: topology change detected, propagating
Dec 30 00:52:32 cura kernel: zlan0: received tcn bpdu on port 1(eth0)
Dec 30 00:52:32 cura kernel: zlan0: topology change detected, propagating
Dec 30 00:52:32 cura kernel: printk: 15 messages suppressed.
Dec 30 00:52:32 cura kernel: dst cache overflow
Dec 30 00:52:34 cura kernel: zlan0: received tcn bpdu on port 1(eth0)
Dec 30 00:52:34 cura kernel: zlan0: topology change detected, propagating
Dec 30 00:52:36 cura kernel: zlan0: received tcn bpdu on port 1(eth0)
Dec 30 00:52:36 cura kernel: zlan0: topology change detected, propagating
Dec 30 00:52:37 cura kernel: printk: 40 messages suppressed.
Dec 30 00:52:37 cura kernel: dst cache overflow

zlan0 is a bridge (with STP configured) between some LANs.

Thanks

P.S.: I'm a bit desperate with this error. I changed "MASQUERADE" to "SNAT" to no avail. Some hours after the router is booted, the network appears to be up but none of the interfaces respond.
ArcosCom Linux User wrote:
> The log says:
>
> Dec 30 00:52:27 cura kernel: dst cache overflow
> [...]

The generic solution is to make less, or better, use of CPU resources. In particular, it helps to tune parameters such as /proc/sys/net/ipv4/neigh/default/gc_threshX, where X is 1, 2 or 3:

  echo 2048 > /proc/sys/net/ipv4/neigh/default/gc_thresh1
  echo 4096 > /proc/sys/net/ipv4/neigh/default/gc_thresh2
  echo 16384 > /proc/sys/net/ipv4/neigh/default/gc_thresh3

Then check and tune whatever consumes CPU: the iptables firewall, tc filters, a large number of routes, heavy packets-per-second traffic, and so on. As a start, you can check with top how resources are used.
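The three echo commands above take effect immediately but do not survive a reboot. A minimal sketch of generating the equivalent sysctl.conf-style lines, assuming a POSIX shell; the helper name neigh_gc_conf is hypothetical, and the numbers used with it are simply the ones from this message, not tuned recommendations:

```shell
# Print sysctl.conf-style lines for the neighbour table GC thresholds.
# neigh_gc_conf is a hypothetical helper name; the three arguments are
# gc_thresh1, gc_thresh2 and gc_thresh3, in that order.
neigh_gc_conf() {
    if [ "$#" -ne 3 ]; then
        echo "usage: neigh_gc_conf THRESH1 THRESH2 THRESH3" >&2
        return 1
    fi
    # gc_thresh1 is the floor below which nothing is collected and
    # gc_thresh3 is the hard ceiling, so ascending order is the sane layout.
    if [ "$1" -gt "$2" ] || [ "$2" -gt "$3" ]; then
        echo "thresholds must be in ascending order" >&2
        return 1
    fi
    i=1
    for value in "$@"; do
        echo "net.ipv4.neigh.default.gc_thresh$i = $value"
        i=$((i + 1))
    done
}
```

Running neigh_gc_conf 2048 4096 16384 prints three lines that could be appended to /etc/sysctl.conf and loaded with sysctl -p. The function only prints; it never writes to /proc itself.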
Alexandru Dragoi wrote:
> The generic solution is to make less, or better, use of CPU resources.
> [...]

Now I see the BPDU packets received by your bridge. It seems you may have a network loop, which would generate lots of broadcast traffic (which includes ARP).
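One rough way to gauge how often the bridge sees topology changes is to count the corresponding kernel messages in the system log. A minimal sketch, assuming the messages look like the ones quoted in this thread; count_topology_changes is a hypothetical helper name, and the log file location is whatever the local syslog setup uses:

```shell
# Count "topology change detected" kernel messages for one bridge.
# count_topology_changes is a hypothetical helper; the caller supplies
# the log file path and the bridge name.
count_topology_changes() {
    logfile=$1
    bridge=$2
    # -F treats the pattern as a fixed string, -c prints the number of
    # matching lines (0 when there are none).
    grep -cF "kernel: $bridge: topology change detected" "$logfile"
}
```

For example, count_topology_changes /var/log/messages zlan0 would report the count for the bridge discussed here, assuming the kernel log ends up in that file. A count that keeps climbing on a network where nothing is being replugged would be consistent with the loop suspected above. The per-port STP state can also be inspected with brctl showstp zlan0 from bridge-utils.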
ArcosCom Linux User wrote:
> The log says:
>
> Dec 30 00:52:27 cura kernel: dst cache overflow
> Dec 30 00:52:27 cura kernel: MASQUERADE: No route: Rusty's brain broke!
> [...]

The MASQUERADE message is just an effect of the problem. Please describe your setup in more detail (what kind of devices, how they are connected, ebtables/iptables rules, routing, ...).
The configuration is:

1) Linux box with kernel 2.6.19.1 and these patches/modules:
   a) l7-filter
   b) multipath patch (from nano-howto)
   c) IMQ
   d) ipp2p
   e) connlimit
2) 4 Ethernet interfaces:
   a) 2 external interfaces (eth1 and eth3) with balanced links (as described in nano-howto).
   b) 2 internal interfaces (eth0 and eth2) in bridge zlan0 with STP enabled and configured.
3) For tests I manually load ALL conntrack/nat kernel modules.

My first attempt (to let the UPnP daemon handle only one external interface) was to put eth1 and eth3 in a bridge without STP enabled, with NAT done only with -j MASQUERADE. It appeared to work fine, but when I ran some aMule clients on the network, the problem appeared within one day (after some weeks of working without peer-to-peer software).

Then I broke up the WAN bridge and put each static external IP on its own interface, and the problem still appeared, in two days instead of one.

My next step was to use SNAT instead of MASQUERADE, and the problem appeared 3 days after the change.

Multipath was enabled throughout all of these steps.

A production Linux box with kernel 2.6.17.14, the same patches/modules, only 1 WAN interface and 1 LAN interface, and the connlimit match enabled per host is working fine with far more p2p traffic (about 100 more) than the test machine (the box that has the dst cache overflow problem).

If you need more information to help me solve this problem, please tell me; I'll gather whatever you need and post it here.

Thanks
ArcosCom Linux User wrote:
> The configuration is:
> 1) linux box with 2.6.19.1 kernel with these patches/modules:
>    a) l7-filter
>    b) multipath patch (from nano-howto)
>    c) IMQ
>    d) ipp2p
>    e) connlimit
> 2) 4 ethernet interfaces:
>    a) 2 external (eth1 and eth3) interfaces with balanced links (as
>       described in nano-howto).
>    b) 2 internal interfaces (eth0 and eth2) in bridge zlan0 with STP
>       enabled and configured.
> 3) For tests I load manually ALL conntrack/nat kernel modules.

Please try to reproduce this without all these whacky patches (or at least without multipath and IMQ).
On Wed, 10 Jan 2007, ArcosCom Linux User wrote:
> On Wed, 10 January 2007, 8:15, Patrick McHardy wrote:
>> ArcosCom Linux User wrote:
>>> The log says:
>>> Dec 30 00:52:27 cura kernel: dst cache overflow

The log message "dst cache overflow" is normally related to overflow of the route cache. The maximum size of the route cache can be adjusted through /proc/sys/net/ipv4/route/max_size.

What are your settings in /proc/sys/net/ipv4/route/?

Run the command:

  grep . /proc/sys/net/ipv4/route/*

Regards,
Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------
_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc
Here they are:

# grep . /proc/sys/net/ipv4/route/*
/proc/sys/net/ipv4/route/error_burst:5000
/proc/sys/net/ipv4/route/error_cost:1000
grep: /proc/sys/net/ipv4/route/flush: Operación no permitida
/proc/sys/net/ipv4/route/gc_elasticity:8
/proc/sys/net/ipv4/route/gc_interval:60
/proc/sys/net/ipv4/route/gc_min_interval:0
/proc/sys/net/ipv4/route/gc_min_interval_ms:500
/proc/sys/net/ipv4/route/gc_thresh:32768
/proc/sys/net/ipv4/route/gc_timeout:300
/proc/sys/net/ipv4/route/max_delay:10
/proc/sys/net/ipv4/route/max_size:524288
/proc/sys/net/ipv4/route/min_adv_mss:256
/proc/sys/net/ipv4/route/min_delay:2
/proc/sys/net/ipv4/route/min_pmtu:552
/proc/sys/net/ipv4/route/mtu_expires:600
/proc/sys/net/ipv4/route/redirect_load:20
/proc/sys/net/ipv4/route/redirect_number:9
/proc/sys/net/ipv4/route/redirect_silence:20480
/proc/sys/net/ipv4/route/secret_interval:600
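Output in this path:value form is easy to pick apart later, for example when comparing the settings of two boxes. A minimal sketch, assuming the grep output was saved to a file; route_setting is a hypothetical helper name:

```shell
# Extract one route sysctl value from saved "grep . /proc/sys/net/ipv4/route/*"
# output. route_setting is a hypothetical helper; NAME is the file name
# under route/ (e.g. max_size) and FILE holds the saved output.
route_setting() {
    name=$1
    file=$2
    # Match the full "path/NAME:VALUE" line and print only VALUE.
    # The colon right after NAME keeps gc_min_interval from matching
    # gc_min_interval_ms.
    sed -n "s|^/proc/sys/net/ipv4/route/$name:\(.*\)\$|\1|p" "$file"
}
```

With the output above saved as, say, route.txt (a made-up file name), route_setting max_size route.txt would print the max_size value and route_setting gc_thresh route.txt the GC threshold. Lines such as the grep error for flush are simply ignored.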
On Wed, 2007-01-10 at 14:20 +0100, ArcosCom Linux User wrote:
> The configuration is:
> 1) linux box with 2.6.19.1 kernel with these patches/modules:
>    a) l7-filter
>    b) multipath patch (from nano-howto)
>    c) IMQ

That's interesting. How did you get IMQ into 2.6.19.1? AFAIK, the most recent patch is for 2.6.17, and there were some rejects when trying to apply it to 2.6.19.

Regards,
Torsten
The values look reasonable. Garbage collection starts fairly early (gc_thresh:32768), but I often see that the GC cannot keep up.

The maximum size of the route cache, max_size=524288, is okay, but it depends on the usage pattern. On my production systems I have increased max_size to 2 million to keep up!

Another interesting value is secret_interval:600, the interval in seconds at which the route cache is flushed, that is, every 10 minutes.

  524288/600 = 873 packets/sec to new destinations.

You should realize that filling the route cache within 10 minutes can happen, as it only requires 873 packets/sec to new destinations.

What to do next:

Monitor the route cache to see what is actually happening. The route cache counters are located in /proc/net/stat/rt_cache, but they are not very human-readable. Use the tool "rtstat" to monitor the route cache.

The rtstat tool can be downloaded from Robert's site:
  ftp://robur.slu.se/pub/Linux/net-development/rt_cache_stat

Cheers,
Jesper Brouer
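The fill-rate estimate in this message (max_size divided by secret_interval) and a quick look at the current cache size can be sketched in shell. This is only an illustration: both function names are hypothetical, and the second function assumes the /proc/net/stat/rt_cache layout of 2.6-era kernels (one header line, then one line per CPU whose first column is the entry count in hexadecimal).

```shell
# Rate of packets to new destinations (per second) that fills the route
# cache between two flushes: max_size / secret_interval, integer division.
cache_fill_rate() {
    max_size=$1
    secret_interval=$2
    if [ "$secret_interval" -le 0 ]; then
        echo "secret_interval must be positive" >&2
        return 1
    fi
    echo $((max_size / secret_interval))
}

# Current number of route cache entries.
# Assumption: line 1 of the stat file is a header and the first column of
# line 2 holds the entry count in hex (the same value repeats per CPU).
rt_cache_entries() {
    statfile=${1:-/proc/net/stat/rt_cache}
    hex=$(awk 'NR == 2 { print $1 }' "$statfile")
    printf '%d\n' "0x$hex"
}
```

With the values posted earlier, cache_fill_rate 524288 600 prints 873, matching the figure above. Comparing rt_cache_entries against max_size at intervals gives a crude view of how close the cache is to overflowing; rtstat remains the more complete tool.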
Thanks, good tool: I'm using it to take a look at the route cache.
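For anyone without rtstat at hand, the counters in /proc/net/stat/rt_cache mentioned in this thread can also be decoded by hand. The sketch below is not rtstat; it assumes the file has one header line of column names followed by one row of hexadecimal values per CPU. The sample text is invented and shows only a few of the columns, and parse_rt_cache is a hypothetical helper name.

```python
from typing import Dict, List


def parse_rt_cache(text: str) -> List[Dict[str, int]]:
    """Parse rt_cache-style statistics: a header line with column names,
    then one row of hexadecimal counters per CPU (assumed layout)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    names = lines[0].split()
    rows = []
    for ln in lines[1:]:
        values = [int(v, 16) for v in ln.split()]
        rows.append(dict(zip(names, values)))
    return rows


# Invented sample with a reduced set of columns, two CPUs.
SAMPLE = """entries  in_hit in_slow_tot gc_total gc_dst_overflow
00008000  0000a1b2 00000c3d 00000010 00000002
00008000  00009f00 00000b00 00000008 00000001
"""

rows = parse_rt_cache(SAMPLE)
entries = rows[0]["entries"]  # cache size; assumed identical on every row
overflows = sum(r["gc_dst_overflow"] for r in rows)  # per-CPU counters summed
print(entries, overflows)  # 32768 3
```

On a live system the same function could be fed the contents of /proc/net/stat/rt_cache; watching entries against max_size and gc_dst_overflow over time is the kind of view rtstat gives in a more readable form.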