Hi shorewall users,

we are running a shorewall-based firewall with several IPSec (ESP) tunnels.
The remote endpoints (mostly cisco based) require us to SNAT the IP addresses
coming from our LAN to ONE single IP.

Since we switched to a shorewall setup the performance on the tunnels has
dropped massively. Everything works fine for simple connections (ssh, etc.),
but if we transfer bigger volumes of data the connection speed drops very
quickly from about 8 MB/s to 400 KB/s (sometimes less), and other non-IPSec
traffic slows down as well. When this happens, the load in "software
interrupts" (%si in top) rises until most packets (non-IPSec traffic) get
dropped. The machine has a quad-core Xeon CPU, so crypto performance is not
the issue here.

For testing purposes we duplicated the tunnel config but WITHOUT SNAT, and
there we can transfer IPSec traffic at a constant 10 MB/s without any harm to
the machine or the other traffic.

We believe, we have some mistake in the NAT setup.
("shorewall dump" output attached, but we replaced the IP-addresses)

The non-IPSec traffic is also SNATed, but we cannot see any performance
problems there. We can saturate the link at 12 MB/s without problems, which
is the maximum for our connection.

The Software Setup:
- Linux 2.6.32
- OpenSWAN (with netkey)
- shorewall 4.4.11 (Debian)

Cleaned output of shorewall dump attached.

Network Setup:

    LAN (10.66.1.0/24)
     +
     |
     +  (eth2:10.66.1.1)
    fw (shorewall)
     ++ (eth0:1.1.1.1)
     ||
     || (IPSec Tunnel)
     ++
    remoteGW (2.2.2.2)
     +
     |
     |
     +
    remoteLAN (192.168.1.0/24)

Packet Flow:

    10.66.1.2 -> 10.66.1.1 -> SNAT to 192.168.82.8 -> IPSEC-policy-routing -> remoteGW -> remoteLAN

Everything works! We can reach the remote servers.

The relevant parts of the config:

$IPSEC_MASQ_DEST is a list of IPs that are behind the tunnel; traffic to them
therefore has to have its source IP changed.

fw:/etc/shorewall# cat hosts
#zone   hosts                   options
vpn     eth0:192.168.1.0/24     ipsec

fw:/etc/shorewall# cat masq
#INTERFACE              SOURCE          ADDRESS         PROTO   PORT(S) IPSEC   MARK
eth0:$IPSEC_MASQ_DEST   eth2            192.168.82.8    -       -       -
# everything else to our external IP
eth0                    10.66.1.0/24    1.1.1.1

fw:/etc/shorewall# cat zones
#ZONE   TYPE        OPTIONS         IN          OUT
#                                   OPTIONS     OPTIONS
fw      firewall
net     ipv4
loc     ipv4
vpn     ipsec       mode=tunnel     mss=1400

fw:/etc/shorewall# cat interfaces
#ZONE   INTERFACE   BROADCAST   OPTIONS
net     eth0        detect      tcpflags,nosmurfs,logmartians,routefilter
loc     eth2        detect      tcpflags,dhcp,nosmurfs,routefilter

OpenSWAN CONFIG:

conn fw-test
    pfs=yes
    auth=esp
    esp=aes128-sha1
    keyexchange=ike
    type=tunnel
    authby=secret
    left=10.31.1.1
    leftsubnet=192.168.1.0/24
    right=1.1.1.1
    rightsubnet=192.168.82.8/32
    auto=add

Any ideas?

Best regards,
Joerg
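
For readers who want to follow the intent of the two masq entries, here is a minimal Python sketch of first-match SNAT selection. It is only an illustration and not how Shorewall is implemented. The function name snat_address and the rule tuples are hypothetical, $IPSEC_MASQ_DEST is assumed to contain only the remote LAN 192.168.1.0/24, and the eth2 source is assumed to stand for the LAN 10.66.1.0/24 routed through that interface.

```python
import ipaddress
from typing import List, Optional

# Hypothetical model of the two masq entries: (destinations, sources, SNAT address).
# Assumptions: $IPSEC_MASQ_DEST holds only the remote LAN, eth2 means the local LAN.
RULES = [
    (["192.168.1.0/24"], ["10.66.1.0/24"], "192.168.82.8"),  # tunnel traffic
    (["0.0.0.0/0"], ["10.66.1.0/24"], "1.1.1.1"),            # everything else
]


def _in_any(addr: str, networks: List[str]) -> bool:
    """True if addr lies inside at least one of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in ipaddress.ip_network(net) for net in networks)


def snat_address(src: str, dst: str) -> Optional[str]:
    """Return the SNAT address of the first rule matching src and dst."""
    for dests, sources, address in RULES:
        if _in_any(dst, dests) and _in_any(src, sources):
            return address
    return None  # no rule matches, source address left unchanged


if __name__ == "__main__":
    print(snat_address("10.66.1.2", "192.168.1.10"))
    print(snat_address("10.66.1.2", "8.8.8.8"))
```

The ordering matters in this sketch: the tunnel entry is listed first, so traffic to the remote LAN is rewritten to 192.168.82.8 before the catch-all entry can apply.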

On 10/15/10 3:35 AM, Jörg Kurlbaum wrote:

> Since we switched to a shorewall setup the performance on the tunnels has
> dropped massively.

Switched from what?

> We believe, we have some mistake in the NAT setup.
> ("shorewall dump" output attached, but we replaced the IP-addresses)

I'm more inclined to suspect an MSS issue.

> fw:/etc/shorewall# cat zones
> #ZONE   TYPE        OPTIONS         IN          OUT
> #                                   OPTIONS     OPTIONS
> fw      firewall
> net     ipv4
> loc     ipv4
> vpn     ipsec       mode=tunnel     mss=1400

You are only clamping the MSS in one direction. Try moving that setting
to the OPTIONS column.

-Tom
--
Tom Eastep        \ When I die, I want to go like my Grandfather who
Shoreline,         \ died peacefully in his sleep. Not screaming like
Washington, USA     \ all of the passengers in his car
http://shorewall.net \________________________________________________
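
As background on why MSS clamping matters for ESP tunnels, the following Python sketch estimates the size of an encapsulated packet for a given TCP MSS. It is a rough calculation under stated assumptions, not a measurement of this setup: IPv4 without IP or TCP options, ESP in tunnel mode with AES-128-CBC and HMAC-SHA1-96 (matching esp=aes128-sha1 in the posted config), no NAT-T/UDP encapsulation, and a path MTU of 1500 bytes. The function names are made up for the example.

```python
# Assumed header and field sizes in bytes (IPv4, no options).
IP_HEADER = 20
TCP_HEADER = 20
ESP_HEADER = 8          # SPI + sequence number
AES_CBC_IV = 16
AES_BLOCK = 16
ESP_TRAILER = 2         # pad length + next header
HMAC_SHA1_96_ICV = 12


def esp_tunnel_packet_size(mss: int) -> int:
    """Size of the outer packet carrying one full-sized TCP segment."""
    inner_packet = mss + IP_HEADER + TCP_HEADER
    to_encrypt = inner_packet + ESP_TRAILER
    # Pad the encrypted part up to a multiple of the cipher block size.
    padded = -(-to_encrypt // AES_BLOCK) * AES_BLOCK
    return IP_HEADER + ESP_HEADER + AES_CBC_IV + padded + HMAC_SHA1_96_ICV


def largest_mss(path_mtu: int) -> int:
    """Largest MSS whose encapsulated packet still fits into path_mtu."""
    mss = path_mtu - IP_HEADER - TCP_HEADER
    while mss > 0 and esp_tunnel_packet_size(mss) > path_mtu:
        mss -= 1
    return mss


if __name__ == "__main__":
    print(esp_tunnel_packet_size(1400), largest_mss(1500))
```

The exact overhead depends on the cipher, the integrity algorithm, and whether NAT traversal is in use, so the numbers from this sketch should be treated as an estimate only.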

On Fri, Oct 15, 2010 at 07:59:51AM -0700, Tom Eastep wrote:

> On 10/15/10 3:35 AM, Jörg Kurlbaum wrote:
> > Since we switched to a shorewall setup the performance on the tunnels has
> > dropped massively.
>
> Switched from what?

From a strange hand-written iptables script.

The main difference was that we had two different routers. The first was
the default route for all clients; it marked packets that should go into
the IPSec tunnel and changed their IP with SNAT. All packets then got
forwarded to the "real" firewall. On the firewall, marked packets got
routed into the tunnel, but they already had the necessary IP.
The setup was faulty and not very manageable, that is why I changed to
shorewall. Which works great, but has this tiny problem :-)

> > We believe, we have some mistake in the NAT setup.
> > ("shorewall dump" output attached, but we replaced the IP-addresses)
>
> I'm more inclined to suspect an MSS issue.
>
> > fw:/etc/shorewall# cat zones
> > #ZONE   TYPE        OPTIONS         IN          OUT
> > #                                   OPTIONS     OPTIONS
> > fw      firewall
> > net     ipv4
> > loc     ipv4
> > vpn     ipsec       mode=tunnel     mss=1400
>
> You are only clamping the MSS in one direction. Try moving that setting
> to the OPTIONS column.

Okay, I tried that. The line looks like this now:

vpn ipsec mode=tunnel,mss=1400

But I'm sorry to say: no difference.
The interesting part is, if I don't do SNAT on the test-tunnel
performance is very well (like I said in the previous post, about 10MB/s).

Any more ideas? Are there other pitfalls with IPSec and Shorewall?

Thanks so far and best regards,
Jörg
_______________________________________________
Shorewall-users mailing list
Shorewall-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shorewall-users

On 10/15/10 8:59 AM, Jörg Kurlbaum wrote:

> On Fri, Oct 15, 2010 at 07:59:51AM -0700, Tom Eastep wrote:
>> On 10/15/10 3:35 AM, Jörg Kurlbaum wrote:
>>> Since we switched to a shorewall setup the performance on the tunnels has
>>> dropped massively.
>>
>> Switched from what?
>
> From a strange hand-written iptables script.
>
> The main difference was, that we had two different routers.
> The first was the default route for all clients and marked packets that
> should go into the IPSec tunnel and changed the IP with SNAT. All packets
> then got forwarded to the "real" firewall. On the firewall marked packets
> got routed into the tunnel. But they already had the necessary IP.
> The setup was faulty and not very manageable, that is why I changed to
> shorewall. Which works great, but has this tiny problem :-)
>
>>> We believe, we have some mistake in the NAT setup.
>>> ("shorewall dump" output attached, but we replaced the IP-addresses)
>>
>> I'm more inclined to suspect an MSS issue.
>>
>>> fw:/etc/shorewall# cat zones
>>> #ZONE   TYPE        OPTIONS         IN          OUT
>>> #                                   OPTIONS     OPTIONS
>>> fw      firewall
>>> net     ipv4
>>> loc     ipv4
>>> vpn     ipsec       mode=tunnel     mss=1400
>>
>> You are only clamping the MSS in one direction. Try moving that setting
>> to the OPTIONS column.
>
> Okay, i tried that. The line looks like this now:
>
> vpn ipsec mode=tunnel,mss=1400
>
> But i'm sorry to say. No difference.
> The interesting part is, if I don't do SNAT on the test-tunnel
> performance is very well (like i said in the previous post about 10MB/s).

You said that but it's impossible for us to understand exactly what you
are telling us. Your original post said:

    "The remote endpoints (mostly cisco based) require us to SNAT
    the IP addresses coming from our LAN to ONE single IP."

And yet you say:

    "if I don't do SNAT on the test-tunnel performance is very well
    (sic)"

????

I can only guess that means that tunneled connections from the Shorewall
box to the remote subnets have normal performance?

> Any more ideas? Are there other pitfalls with IPSec and Shorewall?

I can recall no case where IPSEC performance issues were not resolved by
MSS clamping. Anyone else?

-Tom

On Fri, Oct 15, 2010 at 03:58:38PM -0700, Tom Eastep wrote:

> You said that but it's impossible for us to understand exactly what you
> are telling us. Your original post said:
>
>     "The remote endpoints (mostly cisco based) require us to SNAT
>     the IP addresses coming from our LAN to ONE single IP."
>
> And yet you say:
>
>     "if I don't do SNAT on the test-tunnel performance is very well
>     (sic)"
>
> ????

Well, lost in translation, I guess.

I have had this problem for a while now and I am trying to locate its cause.
We have about 12 tunnels with very similar configurations that all need SNAT.
We cannot change the config on the endpoints, because they are not ours.

But I've set up another machine as a (test) endpoint, which has exactly the
same configuration, to reproduce the error. There I tried both: with SNAT
(bad performance) and without SNAT. In the latter case everything is fine.
That is why I think it has something to do with the SNAT.

> I can only guess that means that tunneled connections from the Shorewall
> box to the remote subnets have normal performance?

No, normal performance from subnet to subnet when turning off SNAT, which
is not possible on the production tunnels, but only on my test connection.

> > Any more ideas? Are there other pitfalls with IPSec and Shorewall?
>
> I can recall no case where IPSEC performance issues were not resolved by
> MSS clamping. Anyone else?

Maybe I'm not getting the full idea of MSS clamping. Can you see a
misconfigured MSS, for example with tcpdump? I will re-read the
documentation on the mss option in shorewall.

Sorry for the misunderstandings, it's a bit difficult for me to explain
this complicated scenario in my non-native language.

I very much appreciate your help, really, since I'm a bit lost.

Greetings,
Jörg
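
On the question of whether the MSS can be seen on the wire: the value is carried as a TCP option (kind 2) in SYN segments, and capture tools such as tcpdump list it among the options of those packets. As a small illustration, the Python sketch below extracts the MSS option from a raw TCP header. The helper names tcp_mss and sample_syn are invented for this example, and the sample header is synthetic, not taken from a capture of this setup.

```python
import struct
from typing import Optional


def tcp_mss(tcp_header: bytes) -> Optional[int]:
    """Return the MSS option value from a raw TCP header, or None if absent."""
    header_len = (tcp_header[12] >> 4) * 4   # data offset is in 32-bit words
    options = tcp_header[20:header_len]
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                        # end of option list
            break
        if kind == 1:                        # no-operation, single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                            # truncated option
        length = options[i + 1]
        if length < 2:
            break                            # malformed option
        if kind == 2 and length == 4 and i + 4 <= len(options):
            return struct.unpack("!H", options[i + 2:i + 4])[0]
        i += length
    return None


def sample_syn(mss: int) -> bytes:
    """Build a synthetic 24-byte TCP SYN header carrying only an MSS option."""
    fixed = struct.pack("!HHIIBBHHH", 40000, 22, 0, 0, 6 << 4, 0x02, 65535, 0, 0)
    return fixed + struct.pack("!BBH", 2, 4, mss)


if __name__ == "__main__":
    print(tcp_mss(sample_syn(1400)))
```

With clamping active, the MSS seen in SYN packets after they pass the firewall would be expected to be no larger than the configured value, in both directions of the connection.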

On 10/15/10 4:35 PM, Jörg Kurlbaum wrote:

> No, normal performance from subnet to subnet, when turning off SNAT, which
> is not possible on the production tunnels, but only on my test connection.
>
>>> Any more ideas? Are there other pitfalls with IPSec and Shorewall?
>>
>> I can recall no case where IPSEC performance issues were not resolved by
>> MSS clamping. Anyone else?
>
> Maybe i'm not getting the full idea of MSS clamping. Can you see
> misconfigured MSS, for example with tcpdump? I will re-read the
> documentation on mss option in shorewall.
>
> Sorry for the misunderstandings, it's a bit difficult for me to explain
> this complicated scenario in my non-native language.
>
> I very much appreciate your help, really, since i'm a bit lost.

Since you have a test configuration, please show us the configuration
that works and the one that doesn't (config files and 'shorewall dump').
And please don't alter the dump output.

Thanks,
-Tom

On 10/15/10 6:59 PM, Tom Eastep wrote:

> Since you have a test configuration, please show us the configuration
> that works and the one that doesn't (config files and 'shorewall dump').
> And please don't alter the dump output.

Unfortunately, I won't be available to look at the information until at
least Sunday evening.

-Tom

On Fri, Oct 15, 2010 at 06:59:45PM -0700, Tom Eastep wrote:

> On 10/15/10 4:35 PM, Jörg Kurlbaum wrote:
> Since you have a test configuration, please show us the configuration
> that works and the one that doesn't (config files and 'shorewall dump').

Here are the dumps and the diff of the two configs.

The test tunnel is "fw-test", from the shorewall box (82.198.203.20) to
the test box (10.31.1.1). The subnet behind the test box is
192.168.145.1/32.

This is the diff of the two configs. The original (a) is the version with
bad performance, and (b) performs very well on the test connection.
The complete configuration is attached, as well as the shorewall dump
output from both configurations.

diff --git a/ipsec.conf b/ipsec.conf
index 7be7575..7e89344 100644
--- a/ipsec.conf
+++ b/ipsec.conf
@@ -28,7 +28,7 @@ conn fw-test
     left=10.31.1.1
     leftsubnet=192.168.145.1/32
     right=82.198.203.20
-    rightsubnet=192.168.82.8/32
+    rightsubnet=10.66.1.0/24
     auto=add
 conn neuland-merlin
diff --git a/shorewall/masq b/shorewall/masq
index 2ad9582..84608c5 100644
--- a/shorewall/masq
+++ b/shorewall/masq
@@ -7,7 +7,7 @@
 eth0:$IPSEC_MASQ_DEST vpn0 $IPSEC_MASQ_SRC_IP - - -
 ## masquerade everything else to one IP (some servers authenticate by this IP, so it must be .20)
-eth0 10.66.1.0/24 82.198.203.20
+eth0:!192.168.145.1/32 10.66.1.0/24 82.198.203.20
 eth0 192.168.111.0/24 82.198.203.24
 eth0 192.168.0.0/24 82.198.203.20
 # SNAT for DMZ?
diff --git a/shorewall/params b/shorewall/params
index 97637b6..5288ee2 100644
--- a/shorewall/params
+++ b/shorewall/params
@@ -1,3 +1,3 @@
 IPSEC_GATEWAYS=213.178.160.60,195.50.185.4,80.85.192.44,10.31.1.1
-IPSEC_MASQ_DEST=10.107.10.0/24,212.9.181.0/24,213.178.160.128/26,10.108.104.0/23,80.85.195.32/27,10.79.24.0/23,80.85.196.128/26,10.108.100.0/24,172.27.32.0/24,80.85.198.0/24,10.111.128.0/17,80.85.199.0/27,10.108.124.0/25,192.168.145.1/32
+IPSEC_MASQ_DEST=10.107.10.0/24,212.9.181.0/24,213.178.160.128/26,10.108.104.0/23,80.85.195.32/27,10.79.24.0/23,80.85.196.128/26,10.108.100.0/24,172.27.32.0/24,80.85.198.0/24,10.111.128.0/17,80.85.199.0/27,10.108.124.0/25
 IPSEC_MASQ_SRC_IP=192.168.82.8

I hope this makes things clearer.

The corresponding outputs from "shorewall dump" are available here:

http://static.neuland-bfi.de/shorewall_dump_bad.bz2
http://static.neuland-bfi.de/shorewall_dump_good.bz2
http://static.neuland-bfi.de/shorewall_conf_bad.tar.bz2

Greetings,
Jörg

P.S.: replying late because the first mail got stuck in moderation

On 11/4/10 3:27 AM, Joerg Kurlbaum wrote:

> On Tue, Nov 02, 2010 at 01:00:42PM -0700, Tom Eastep wrote:
>> I see nothing in the dumps that gives a clue as to the problem. If you
>> are still experiencing this issue, I think that the next step is for me
>> to try to reproduce the problem in a controlled environment.
>
> Yes, we still have this problem. (And i think it will not go away by
> itself, even when hoped so).
>
> If you could have a closer look, and try to reproduce the problem,
> that would be really great.

I have seen some decrease in throughput with SNAT but no effect on %SI.
I'll test some more tomorrow.

-Tom

On 11/6/10 1:06 PM, Tom Eastep wrote:

> On 11/4/10 3:27 AM, Joerg Kurlbaum wrote:
>> On Tue, Nov 02, 2010 at 01:00:42PM -0700, Tom Eastep wrote:
>>> I see nothing in the dumps that gives a clue as to the problem. If you
>>> are still experiencing this issue, I think that the next step is for me
>>> to try to reproduce the problem in a controlled environment.
>>
>> Yes, we still have this problem. (And i think it will not go away by
>> itself, even when hoped so).
>>
>> If you could have a closer look, and try to reproduce the problem,
>> that would be really great.
>
> I have seen some decrease in throughput with SNAT but no effect on %SI.
> I'll test some more tomorrow.

After careful testing today, I was not able to reproduce your problem.

Configuration:

    Shorewall box:   Debian Lenny, 3.4 GHz Pentium IV
                     Shorewall 4.4.14

    Remote Endpoint: Debian Lenny running in a VirtualBox VM under OS X.
                     This was also where the IPSEC tunnel was terminated.
                     Hardware is a MacBook Pro with a 2.66 GHz Intel Core i7.

    Local Endpoint:  Debian Squeeze running in a VirtualBox VM under
                     Windows XP. Hardware is a 2.3 GHz Athlon Dual Core.

There was a commercial SMC router between the Shorewall box and the
remote endpoint. All interfaces are Gigabit Ethernet.

Both endpoints were configured using setkey and racoon.

I tested using iperf which was run on the

Results are as follows:

    No SNAT, No IPSEC
        Local Server:   91.6 Mbit/sec
        Remote Server:  75.8 Mbit/sec

    SNAT, No IPSEC
        Local Server:   91.3 Mbit/sec
        Remote Server:  75.9 Mbit/sec

    No SNAT, IPSEC
        Local Server:   89.7 Mbit/sec   %si=15%
        Remote Server:  73.2 Mbit/sec   %si=12%

    SNAT, IPSEC
        Local Server:   90.0 Mbit/sec   %si=15%
        Remote Server:  73.1 Mbit/sec   %si=12%

As you can see, SNAT had no effect on throughput or on CPU utilization.
Adding IPSEC had an expected effect on CPU utilization (total CPU busy
was around 20% in all IPSEC test cases and around 5% in the non-IPSEC
tests).

I can only conclude that this issue is something unique to your
configuration.
The significant difference in our firewalls' software configurations is
Lenny vs. Squeeze. Your hardware is more powerful than mine.

One more note -- my IPSEC zone has mss=1380 while yours has mss=1400.

-Tom
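
To put the iperf figures above into proportion, this short Python sketch computes the relative throughput change that SNAT made in each test case. The numbers are copied from the results table; the dictionary layout and function names are only illustrative.

```python
# Throughput in Mbit/sec from the results above: (without SNAT, with SNAT).
RESULTS = {
    "no IPSEC, local server": (91.6, 91.3),
    "no IPSEC, remote server": (75.8, 75.9),
    "IPSEC, local server": (89.7, 90.0),
    "IPSEC, remote server": (73.2, 73.1),
}


def snat_change_percent(without_snat: float, with_snat: float) -> float:
    """Relative throughput change caused by enabling SNAT, in percent."""
    return (with_snat - without_snat) / without_snat * 100.0


def summarize(results: dict) -> dict:
    """Map each test case to its SNAT-induced change, rounded to 2 places."""
    return {
        name: round(snat_change_percent(without, with_snat), 2)
        for name, (without, with_snat) in results.items()
    }


if __name__ == "__main__":
    for name, change in summarize(RESULTS).items():
        print(f"{name}: {change:+.2f}%")
```

Every case stays well within half a percent in either direction, which is consistent with the statement that SNAT had no effect on throughput in this test environment.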

On Sun, Nov 07, 2010 at 10:11:45AM -0800, Tom Eastep wrote:

> After careful testing today, I was not able to reproduce your problem.
>
> Configuration:
> [...]

Many, many thanks for your effort. It's a lot of work to test such a
non-standard setup. So sad that you couldn't reproduce the problem.

> I can only conclude that this issue is something unique to your
> configuration. The significant difference in our firewalls' software
> configurations is Lenny vs. Squeeze. Your hardware is more powerful than
> mine.

Lenny vs. Squeeze means a different Linux kernel (2.6.26 vs. 2.6.32), and
maybe that is already the point.

I've had a look into the other parts of the system, for example the
network card involved (bnx2). And I still think it has something to do
with the NIC driver.

In the end I installed a new kernel (2.6.36) and the problem was gone.
Yes, really, that simple. But it's a bit frustrating not to know the real
cause of the issue. The 2.6.32 kernel is one of the long-term supported
kernels, that's why I didn't want to change it if not really necessary.
Maybe I should contact the kernel developers, but the problem seems hard
to reproduce.

Sorry that I suspected your software to be the root of the problems and
that you had so much work with it. Shorewall is a great and damn flexible
piece of software.

So, the problem is SOLVED, kind of...

Many thanks again.

Greetings,
Jörg :-)

On 11/11/10 7:11 AM, Jörg Kurlbaum wrote:

> On Sun, Nov 07, 2010 at 10:11:45AM -0800, Tom Eastep wrote:
>> After careful testing today, I was not able to reproduce your problem.
>>
>> Configuration:
>> [...]
>
> Many, many thanks for your effort. It's a lot of work to test such a
> non-standard setup. So sad, that you couldn't reproduce the problem.
>
>> I can only conclude that this issue is something unique to your
>> configuration. The significant difference in our firewalls' software
>> configurations is Lenny vs. Squeeze. Your hardware is more powerful than
>> mine.
>
> Lenny vs. Squeeze means a different Linux-Kernel (2.6.26 vs. 2.6.32) and
> maybe that is already the point.

> In the end i installed a new kernel (2.6.36) and the problem was gone.

It would be good if you reported this to Debian -- they will want to try
to fix this before the Squeeze final release.

-Tom