I'm using Shorewall with OpenVPN and traffic shaping at all of our offices.
I have noticed for a while that ping times are occasionally excessive. Usually
this is during overnight off-site backups, but sometimes during the day.
I had assumed this was an ISP issue, but now I suspect it's a problem
with OpenVPN and traffic shaping.

In the test case I have 2 sites with T1s. I'm setting the speed much lower to
allow for some phone traffic that comes off before I get it.
During the tests there is no phone use.

I set up iperf between the sites, both direct and through the VPN.

With no traffic, ping time is about 8 ms direct and 10 ms via VPN.

With saturating direct traffic:
  Direct ping is about 40-50 ms
  VPN ping is about 50 ms

With saturating VPN traffic:
  Direct ping is about 15-30 ms
  VPN ping is about 18-250 ms

Ping times are very erratic, particularly in the one bad case. Sometimes pings
via VPN are over a second.
The consistent thing is that with saturating traffic via VPN, the VPN ping
times are bad.
Other cases are OK.

My wild guess is that OpenVPN does not like its packets being delayed.
Attached is a shorewall dump.
In case it looks odd: the OpenVPN links are point to point and routing is done
via OSPF.

Any ideas?

Thanks

John

------------------------------------------------------------------------------
This SF.net email is sponsored by Sprint
What will you do first with EVO, the first 4G phone?
Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
On 7/4/10 5:11 PM, John McMonagle wrote:
> I'm using Shorewall with OpenVPN and traffic shaping at all of our offices.
> I have noticed for a while that ping times are occasionally excessive.
> [...]
> My wild guess is that OpenVPN does not like its packets being delayed.
> Attached is a shorewall dump.
> In case it looks odd: the OpenVPN links are point to point and routing is
> done via OSPF.
>
> Any ideas?

A shorewall dump taken when there is little or no traffic flowing is not
particularly useful for analyzing TC problems, but it looks to me as if
you have entries in /etc/shorewall/tcfilters with 0.0.0.0 in the SOURCE
and DEST columns where you really want 0.0.0.0/0.

-Tom
-- 
Tom Eastep        \ When I die, I want to go like my Grandfather who
Shoreline,         \ died peacefully in his sleep. Not screaming like
Washington, USA     \ all of the passengers in his car
http://shorewall.net \________________________________________________
On 7/5/10 7:28 AM, Tom Eastep wrote:
>> Any ideas?
>
> A shorewall dump taken when there is little or no traffic flowing is not
> particularly useful for analyzing TC problems, but it looks to me as if
> you have entries in /etc/shorewall/tcfilters with 0.0.0.0 in the SOURCE
> and DEST columns where you really want 0.0.0.0/0.

I notice that you are running a particularly ancient version of
Shorewall (4.2.1 -- released in October of 2008). I found the following
under 'Problems Corrected' in the release notes for 4.2.8:

5)  When a network address was specified in the SOURCE or DEST column of
    /etc/shorewall/tcfilters, Shorewall-perl was generating an incorrect
    netmask.

-Tom
On Monday 05 July 2010 09:44:19 am Tom Eastep wrote:
> On 7/5/10 7:28 AM, Tom Eastep wrote:
> > A shorewall dump taken when there is little or no traffic flowing is not
> > particularly useful for analyzing TC problems, but it looks to me as if
> > you have entries in /etc/shorewall/tcfilters with 0.0.0.0 in the SOURCE
> > and DEST columns where you really want 0.0.0.0/0.
>
> I notice that you are running a particularly ancient version of
> Shorewall (4.2.1 -- released in October of 2008). I found the following
> under 'Problems Corrected' in the release notes for 4.2.8:
>
> 5)  When a network address was specified in the SOURCE or DEST column of
>     /etc/shorewall/tcfilters, Shorewall-perl was generating an incorrect
>     netmask.
>
> -Tom

Tom

My tcfilters file does have 0.0.0.0/0 entries??

Part of it:

# OUTGOING
# 3389 is rdesktop
1:110   0.0.0.0/0   0.0.0.0/0   udp   iax
1:110   0.0.0.0/0   0.0.0.0/0   udp   -       iax
1:110   0.0.0.0/0   0.0.0.0/0   ospf
1:120   0.0.0.0/0   0.0.0.0/0   tcp   ssh
1:120   0.0.0.0/0   0.0.0.0/0   tcp   -       ssh
1:120   0.0.0.0/0   0.0.0.0/0   tcp   https
1:120   0.0.0.0/0   0.0.0.0/0   tcp   -       https
1:120   0.0.0.0/0   0.0.0.0/0   tcp   3389
1:120   0.0.0.0/0   0.0.0.0/0   tcp   -       3389
1:130   0.0.0.0/0   0.0.0.0/0   tcp   smtp
1:130   0.0.0.0/0   0.0.0.0/0   tcp   -       smtp
#
# INCOMING TRAFFIC
#
#
2:110   0.0.0.0/0   0.0.0.0/0   udp   iax
2:110   0.0.0.0/0   0.0.0.0/0   udp   -       iax
2:110   0.0.0.0/0   0.0.0.0/0   ospf
2:120   0.0.0.0/0   0.0.0.0/0   tcp   ssh
2:120   0.0.0.0/0   0.0.0.0/0   tcp   -       ssh
2:120   0.0.0.0/0   0.0.0.0/0   tcp   https
2:120   0.0.0.0/0   0.0.0.0/0   tcp   -       https
2:120   0.0.0.0/0   0.0.0.0/0   tcp   3389
2:120   0.0.0.0/0   0.0.0.0/0   tcp   -       3389
2:130   0.0.0.0/0   0.0.0.0/0   tcp   smtp
2:130   0.0.0.0/0   0.0.0.0/0   tcp   -       smtp

Or are you referring to the bad netmasks that are being created by my version
of Shorewall?

Should I send a dump with traffic, or should I concentrate on upgrading
Shorewall?

It will probably take a week or so to upgrade both ends.

John
On 7/5/10 2:11 PM, John McMonagle wrote:
> Or are you referring to the bad netmasks that are being created by my
> version of Shorewall?

It appears that your old version of Shorewall is treating 0.0.0.0/0 like
0.0.0.0.

> Should I send a dump with traffic, or should I concentrate on upgrading
> Shorewall?
>
> It will probably take a week or so to upgrade both ends.

I suggest concentrating on upgrading.

-Tom
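To make the difference concrete, here is a minimal sketch in Python of what it would mean for a filter written as 0.0.0.0/0 to behave like plain 0.0.0.0. It assumes the lost mask turns the entry into a single-host (/32) match; the thread does not show the exact netmask the old Shorewall generated, and the peer address below is a hypothetical documentation address.

```python
import ipaddress

# A filter written as 0.0.0.0/0 is meant to match every IPv4 address.
any_net = ipaddress.ip_network("0.0.0.0/0")

# Assumption: with the mask lost, the entry degrades to a single-host match.
host_only = ipaddress.ip_network("0.0.0.0/32")

# Hypothetical peer address, used only for illustration.
peer = ipaddress.ip_address("192.0.2.10")

print(peer in any_net)    # True: the wildcard network covers the peer
print(peer in host_only)  # False: only the literal address 0.0.0.0 matches
print(any_net.num_addresses, host_only.num_addresses)
```

Under that assumption, none of the tcfilters entries would classify real traffic, so everything would fall into the default class.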
Tom

I upgraded 3 of our routers. Nice upgrade.
Shorewall check showed a couple of issues that I fixed, but the problem still
remains :-(

Attached is a dump with heavy traffic and some OpenNMS graphs of the same
site.
eth0 is local net.
eth1 is internet.

This is a different site than the last one; it's simpler and has more traffic.
The internet is a T1 that is cut back a bit to allow for some phone traffic
that comes off before I get it.
Most of the traffic is remote backups of the main office to this site via
rsync in ssh in OpenVPN.
As you can see, it's kept rather busy.
I did an iperf during the dump just to make sure it was busy.
The OpenNMS graphs are also from the main site.
The main site has 2 bonded T1s, so it's able to keep the link saturated at the
tested site.

This is a nasty case, as it needs to throttle incoming traffic.
It does seem to control the traffic well, but ping times are getting bad at
times.
Pings via the internet usually hold up better; even when they degrade, they
are still better than via OpenVPN.

As I recall, I was getting better ping times when I used IPsec. I recall that
while the minimum ping times were less, the average ping times were less via
IPsec.
I switched from IPsec because the packets were getting counted twice in the
traffic shaping.

On Monday 05 July 2010 06:20:23 pm Tom Eastep wrote:
> On 7/5/10 2:11 PM, John McMonagle wrote:
> > Or are you referring to the bad netmasks that are being created by my
> > version of Shorewall?
>
> It appears that your old version of Shorewall is treating 0.0.0.0/0 like
> 0.0.0.0.
>
> > Should I send a dump with traffic, or should I concentrate on upgrading
> > Shorewall?
> >
> > It will probably take a week or so to upgrade both ends.
>
> I suggest concentrating on upgrading.
>
> -Tom
On 7/10/10 1:10 PM, John McMonagle wrote:
> Attached is a dump with heavy traffic and some OpenNMS graphs of the same
> site.
> eth0 is local net.
> eth1 is internet.

There is still very little traffic going through the qdiscs and no
queuing. Please forward your Shorewall TC configuration on this box.

-Tom

PS: The OpenNMS graphs were inaccessible -- the graphics are still on
one of your systems which has an RFC 1918 IP address.
On 7/10/10 1:33 PM, Tom Eastep wrote:
> On 7/10/10 1:10 PM, John McMonagle wrote:
>> Attached is a dump with heavy traffic and some OpenNMS graphs of the same
>> site.
>> eth0 is local net.
>> eth1 is internet.
>
> There is still very little traffic going through the qdiscs and no
> queuing. Please forward your Shorewall TC configuration on this box.

Don't bother.

I've taken another look at this and there is, indeed, quite a bit of
incoming traffic (right below the OUT-BANDWIDTH on ifb0). But none of
the classes is being driven to the point where queuing is occurring;
that most likely means that queuing is happening at the up-stream
router. Try dropping the OUT-BANDWIDTH of ifb0 by 10-20% and see if that
helps or hurts.

If it doesn't help, then I would try dropping the IN-BANDWIDTH of eth1
by a similar amount.

-Tom
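As a quick worked example of the suggested 10-20% reduction, the Python sketch below computes the lowered rates. The 1500 kbit/s starting value is an assumption taken from the nominal T1 figure John quotes; the OUT-BANDWIDTH actually configured on ifb0 is not shown in the thread, and `reduced_rate` is a hypothetical helper name.

```python
def reduced_rate(rate_kbit: int, percent: int) -> int:
    """Return rate_kbit lowered by the given percentage, in whole kbit/s."""
    return rate_kbit * (100 - percent) // 100


# Assumed starting point: a nominal 1500 kbit/s T1 (not the real setting).
nominal_kbit = 1500

for percent in (10, 20):
    print(f"{percent}% lower: {reduced_rate(nominal_kbit, percent)} kbit/s")
# 10% lower: 1350 kbit/s
# 20% lower: 1200 kbit/s
```

The idea is that a limit set below the rate the up-stream router can deliver moves the queue onto the local box, where the shaping classes can decide what waits.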
Tom

Attached all of /etc/shorewall

Tom Eastep wrote:
> On 7/10/10 1:10 PM, John McMonagle wrote:
>> Attached is a dump with heavy traffic and some OpenNMS graphs of the same
>> site.
>> eth0 is local net.
>> eth1 is internet.
>
> There is still very little traffic going through the qdiscs and no
> queuing. Please forward your Shorewall TC configuration on this box.
>
> -Tom
>
> PS: The OpenNMS graphs were inaccessible -- the graphics are still on
> one of your systems which has an RFC 1918 IP address.

Should have known better. That was too easy.
Attached a few of the graphs in an OpenOffice odt.

Thanks for the help.

John
Tom

Did more tests.
All tests except for throughput are via VPN.
Did the tests while copying a server install from one site to another,
with the copy going through the VPN.
Both sites have a T1 line at 1500 kb.
With the tcdevices limits set very high I get 1520 kb/sec.
Measured throughput with iptraf.
Ping time is mostly over 500 ms.

One at a time, I cut back both incoming and outgoing in tcdevices.
The rate always dropped to what it was set to.

Here is the interesting part.
If I rate limited outgoing, ping times were bad: over 500 ms.
If I rate limited incoming, ping times were very good: ~20 ms.

Also noticed that when ping times are very bad between these 2 sites, ping
times to other sites via OpenVPN are OK.
This is true from either site.

Wild speculation is that an excessive number of packets are in the outgoing
OpenVPN queue and do not harm other links.

Based on that, I tried lowering txqueuelen from the default 100 to 20 in the
outgoing side's OpenVPN setup, and it seems to help a bit.
My knowledge of the details of networking is limited.
If it runs out of transmit queue, does it just wait?
I wonder how low one can safely set it.

John

On Saturday 10 July 2010 05:59:12 pm Tom Eastep wrote:
> On 7/10/10 1:33 PM, Tom Eastep wrote:
> > There is still very little traffic going through the qdiscs and no
> > queuing. Please forward your Shorewall TC configuration on this box.
>
> Don't bother.
>
> I've taken another look at this and there is, indeed, quite a bit of
> incoming traffic (right below the OUT-BANDWIDTH on ifb0). But none of
> the classes is being driven to the point where queuing is occurring;
> that most likely means that queuing is happening at the up-stream
> router. Try dropping the OUT-BANDWIDTH of ifb0 by 10-20% and see if that
> helps or hurts.
>
> If it doesn't help, then I would try dropping the IN-BANDWIDTH of eth1
> by a similar amount.
>
> -Tom
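A rough back-of-the-envelope check of the txqueuelen speculation, sketched in Python. It assumes the tunnel's transmit queue is completely full of 1500-byte packets and drains at the 1500 kbit/s T1 rate; neither assumption is confirmed in the thread, and `queue_drain_ms` is a hypothetical helper.

```python
def queue_drain_ms(packets: int, packet_bytes: int, link_kbit: int) -> float:
    """Time in milliseconds to drain a full queue over the link.

    1 kbit/s equals 1 bit per millisecond, so bits / kbit/s gives ms.
    """
    bits = packets * packet_bytes * 8
    return bits / link_kbit


# Assumptions: 1500-byte packets, 1500 kbit/s link, queue completely full.
for qlen in (100, 20):
    print(f"txqueuelen {qlen}: {queue_drain_ms(qlen, 1500, 1500):.0f} ms")
# txqueuelen 100: 800 ms
# txqueuelen 20: 160 ms
```

Under these assumptions a full default queue adds most of a second of delay, which is at least in the same range as the 500 ms and longer pings reported, and a shorter queue caps the worst case at a lower value.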
On 7/11/10 9:15 AM, John McMonagle wrote:
> One at a time, I cut back both incoming and outgoing in tcdevices.
> The rate always dropped to what it was set to.
>
> Here is the interesting part.
> If I rate limited outgoing, ping times were bad: over 500 ms.
> If I rate limited incoming, ping times were very good: ~20 ms.

That is why I suggested that you cut back the input rate. It is better
if you do it using the IFB's OUT-BANDWIDTH because that way, you get to
choose which traffic gets dropped. If you do it with the IN-BANDWIDTH
setting on the external interface, packets get discarded at random.

> Also noticed that when ping times are very bad between these 2 sites, ping
> times to other sites via OpenVPN are OK.
> This is true from either site.
>
> Wild speculation is that an excessive number of packets are in the outgoing
> OpenVPN queue and do not harm other links.
>
> Based on that, I tried lowering txqueuelen from the default 100 to 20 in the
> outgoing side's OpenVPN setup, and it seems to help a bit.
> My knowledge of the details of networking is limited.
> If it runs out of transmit queue, does it just wait?
> I wonder how low one can safely set it.

Don't know -- you will have to ask the OpenVPN folks.

-Tom