Brian J. Murrell
2007-Dec-28 22:25 UTC
marking and routing (with dual default routes) not working
Well, it probably is working. I'm probably just misunderstanding something.

Given routing rules that look like this:

0:      from all lookup local
10000:  from all fwmark 0x40 lookup CGCO
10001:  from all fwmark 0x80 lookup IGS
20000:  from 67.193.45.68 lookup CGCO
20256:  from 66.11.173.224 lookup IGS
32766:  from all lookup main
32767:  from all lookup default

and given the CGCO routing table:

10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
67.193.45.68 dev eth0.1  scope link
192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
10.8.0.0/24 via 10.8.0.2 dev tun0
10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
10.75.23.0/24 via 10.8.0.2 dev tun0
67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
default via 67.193.44.1 dev eth0.1

and the main routing table:

10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
10.8.0.0/24 via 10.8.0.2 dev tun0
10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
10.75.23.0/24 via 10.8.0.2 dev tun0
67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
169.254.0.0/16 via 10.75.22.223 dev br-lan  proto zebra  metric 20 equalize
default
        nexthop via 67.193.44.1  dev eth0.1 weight 1
        nexthop via 192.168.200.1  dev ppp0 weight 1

and given a routemark chain of (which I don't think is really much relevant -- I added those set marks after this started happening but it does not seem to be the solution):

Chain routemark (2 references)
 pkts bytes target   prot opt in     out  source     destination
    0     0 MARK     udp  --  *      *    0.0.0.0/0  0.0.0.0/0    udp spt:1194 MARK set 0x40
    6   252 MARK     udp  --  *      *    0.0.0.0/0  0.0.0.0/0    udp dpt:1194 MARK set 0x40
  332 46438 MARK     all  --  ppp0   *    0.0.0.0/0  0.0.0.0/0    MARK set 0x80
 4600  737K MARK     all  --  eth0.1 *    0.0.0.0/0  0.0.0.0/0    MARK set 0x40
 4932  783K CONNMARK all  --  *      *    0.0.0.0/0  0.0.0.0/0    MARK match !0x0/0xff CONNMARK save mask 0xff

and given the following entry in the /proc/net/ip_conntrack

udp 17 59 src=99.228.107.5 dst=67.193.45.68 sport=34725 dport=1194 packets=47 bytes=1974 [UNREPLIED] src=67.193.45.68 dst=99.228.107.5 sport=1194 dport=34725 packets=0 bytes=0 mark=64 use=1

Why am I seeing these:

Dec 28 17:19:18 gw.ilinx kernel: Shorewall:fw2all:REJECT:IN= OUT=ppp0 SRC=66.11.173.224 DST=99.228.107.5 LEN=42 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=34725 LEN=22
Dec 28 17:19:18 gw.ilinx kernel: Shorewall:fw2all:REJECT:IN= OUT=ppp0 SRC=66.11.173.224 DST=99.228.107.5 LEN=50 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=34725 LEN=30
Dec 28 17:19:20 gw.ilinx kernel: Shorewall:fw2all:REJECT:IN= OUT=ppp0 SRC=66.11.173.224 DST=99.228.107.5 LEN=42 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=34725 LEN=22
Dec 28 17:19:20 gw.ilinx kernel: Shorewall:fw2all:REJECT:IN= OUT=ppp0 SRC=66.11.173.224 DST=99.228.107.5 LEN=50 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=34725 LEN=30

It seems to me that, given that the connection is marked with 64 (0x40), the route rules should be shoving that to the CGCO routing table and causing those packets to be sent via the eth0.1 interface, no?

I do understand that with two default routes with equal weights the active default gateways will be alternated round robin, but I would have thought that the mark on that connection and the route rules and tables would have trumped that.

Thots?

b.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
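For readers following along, the rule list in the message above can be walked by hand. The Python below is only a rough model of that walk, not the kernel's implementation: the function name `select_table` and the `skip` parameter are made up for illustration, a bare `fwmark` is assumed to be compared against the whole mark, and the `local` and `default` tables are assumed to hold no route for a remote destination.

```python
from ipaddress import ip_address, ip_network

# (priority, source selector, fwmark or None, table), copied from the
# `ip rule` listing in the message above.
RULES = [
    (0,     "0.0.0.0/0",        None, "local"),
    (10000, "0.0.0.0/0",        0x40, "CGCO"),
    (10001, "0.0.0.0/0",        0x80, "IGS"),
    (20000, "67.193.45.68/32",  None, "CGCO"),
    (20256, "66.11.173.224/32", None, "IGS"),
    (32766, "0.0.0.0/0",        None, "main"),
    (32767, "0.0.0.0/0",        None, "default"),
]


def select_table(src, fwmark, skip=("local", "default")):
    """Return the first table whose rule matches src and fwmark.

    Hypothetical helper: tables named in `skip` are assumed to contain
    no matching route, so the walk continues past them.
    """
    for _prio, selector, mark, table in sorted(RULES):
        if table in skip:
            continue
        if ip_address(src) not in ip_network(selector):
            continue
        if mark is not None and fwmark != mark:
            continue
        return table
    return None


print(select_table("0.0.0.0", 0x40))      # marked packet
print(select_table("66.11.173.224", 0))   # unmarked, ppp0 source address
print(select_table("0.0.0.0", 0))         # unmarked, no source chosen yet
```

Under these assumptions, a packet that reaches the lookup without a mark and without a fixed source address falls through to `main`, the only table with the two-nexthop default route.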
Jerry Vonau
2007-Dec-29 00:22 UTC
Re: marking and routing (with dual default routes) not working
Brian J. Murrell wrote:
> Well, it probably is working. I'm probably just misunderstanding
> something.
>
> Given routing rules that look like this:
>
> 0:      from all lookup local
> 10000:  from all fwmark 0x40 lookup CGCO
> 10001:  from all fwmark 0x80 lookup IGS
> 20000:  from 67.193.45.68 lookup CGCO
> 20256:  from 66.11.173.224 lookup IGS
> 32766:  from all lookup main
> 32767:  from all lookup default
>
> and given the CGCO routing table:
>
> 10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
> 67.193.45.68 dev eth0.1  scope link
> 192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
> 10.8.0.0/24 via 10.8.0.2 dev tun0
> 10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
> 10.75.23.0/24 via 10.8.0.2 dev tun0
> 67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
> default via 67.193.44.1 dev eth0.1
>
> and the main routing table:
>
> 10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
> 192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
> 10.8.0.0/24 via 10.8.0.2 dev tun0
> 10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
> 10.75.23.0/24 via 10.8.0.2 dev tun0
> 67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
> 169.254.0.0/16 via 10.75.22.223 dev br-lan  proto zebra  metric 20 equalize
> default
>         nexthop via 67.193.44.1  dev eth0.1 weight 1
>         nexthop via 192.168.200.1  dev ppp0 weight 1
>

Both tables are the same; is the copy column in the providers file blank? Both providers' routing should not be in each other's table. Using that column results in just a single provider's routes being in that provider's routing table.

/sbin/ip route ls table SHAW
24.78.192.1 dev eth1  scope link  src 24.78.192.197
10.3.0.0/24 dev eth0  proto kernel  scope link  src 10.3.0.75
24.78.192.0/23 dev eth1  proto kernel  scope link  src 24.78.192.197
169.254.0.0/16 dev eth1  scope link
default via 24.78.192.1 dev eth1

Note the lack of routing for my other provider. Just the local LAN and that provider's routing.

Jerry
Brian J. Murrell
2007-Dec-29 04:48 UTC
Re: marking and routing (with dual default routes) not working
On Fri, 2007-12-28 at 18:22 -0600, Jerry Vonau wrote:
> Brian J. Murrell wrote:
>
> > and given the CGCO routing table:
> >
> > 10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
> > 67.193.45.68 dev eth0.1  scope link
> > 192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
> > 10.8.0.0/24 via 10.8.0.2 dev tun0
> > 10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
> > 10.75.23.0/24 via 10.8.0.2 dev tun0
> > 67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
> > default via 67.193.44.1 dev eth0.1
> >
> > and the main routing table:
> >
> > 10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
> > 192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
> > 10.8.0.0/24 via 10.8.0.2 dev tun0
> > 10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
> > 10.75.23.0/24 via 10.8.0.2 dev tun0
> > 67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
> > 169.254.0.0/16 via 10.75.22.223 dev br-lan  proto zebra  metric 20 equalize
> > default
> >         nexthop via 67.193.44.1  dev eth0.1 weight 1
> >         nexthop via 192.168.200.1  dev ppp0 weight 1
> >
>
> Both tables are the same,

No they aren't. Most specifically, the CGCO table (one of the provider tables), above at top, only has a default route to the CGCO provider. The main table (which is consulted only as a last resort according to my route rules) is the only table with both default routes and which is therefore eligible for round-robin default routing. AFAIU.

> is the copy column in the providers file
> blank?

Yes:

#NAME   NUMBER  MARK    DUPLICATE       INTERFACE       GATEWAY         OPTIONS         COPY
CGCO    1       64      main            eth0.1          detect          track,balance
IGS     2       128     main            ppp0            detect          track,balance

> Both providers' routing should not be in each others table.

Correct.
They are not:

# ip route ls table CGCO
10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
67.193.45.68 dev eth0.1  scope link
192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
10.8.0.0/24 via 10.8.0.2 dev tun0
10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
10.75.23.0/24 via 10.8.0.2 dev tun0
67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
default via 67.193.44.1 dev eth0.1
# ip route ls table IGS
66.11.173.224 dev ppp0  scope link
10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
10.8.0.0/24 via 10.8.0.2 dev tun0
10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
10.75.23.0/24 via 10.8.0.2 dev tun0
67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
default via 192.168.200.1 dev ppp0

But given a route rule table with:

0:      from all lookup local
10000:  from all fwmark 0x40 lookup CGCO
10001:  from all fwmark 0x80 lookup IGS
20000:  from 67.193.45.68 lookup CGCO
20256:  from 66.11.173.224 lookup IGS
32766:  from all lookup main
32767:  from all lookup default

Does that not mean that packets on this connection:

udp 17 179 src=99.228.107.5 dst=67.193.45.68 sport=34786 dport=1194 packets=25769 bytes=7319381 src=67.193.45.68 dst=99.228.107.5 sport=1194 dport=34786 packets=25116 bytes=4779760 [ASSURED] mark=64 use=1

(notice the mark=) are sent via the CGCO table and ultimately to the:

default via 67.193.44.1 dev eth0.1

route?
Seems so to me, yet I still see:

Dec 28 23:31:18 gw.ilinx kernel: Shorewall:fw2all:REJECT:IN= OUT=ppp0 SRC=66.11.173.224 DST=99.228.107.5 LEN=481 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=34786 LEN=461
Dec 28 23:31:18 gw.ilinx kernel: Shorewall:fw2all:REJECT:IN= OUT=ppp0 SRC=66.11.173.224 DST=99.228.107.5 LEN=545 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=34786 LEN=525
Dec 28 23:31:19 gw.ilinx kernel: Shorewall:fw2all:REJECT:IN= OUT=ppp0 SRC=66.11.173.224 DST=99.228.107.5 LEN=505 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=34786 LEN=485

I can't see how those are making it to the ppp0 interface.

> /sbin/ip route ls table SHAW
> 24.78.192.1 dev eth1  scope link  src 24.78.192.197
> 10.3.0.0/24 dev eth0  proto kernel  scope link  src 10.3.0.75
> 24.78.192.0/23 dev eth1  proto kernel  scope link  src 24.78.192.197
> 169.254.0.0/16 dev eth1  scope link
> default via 24.78.192.1 dev eth1
>
> Note the lack of routing for my other provider.

I'm not quite following what routing you see in my CGCO table that is for the other provider. Indeed there is the

192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224

route, but that is a very specific route which should not be causing my problem. I don't think.

Taking a closer look at the REJECT messages above, the SRC address is indeed the one for my ppp0 interface. I had assumed that the cross-provider SNAT rules were doing that, but it does not seem to be the case:

Chain ppp0_masq (1 references)
 pkts bytes target     prot opt in  out  source         destination
    0     0 MASQUERADE all  --  *   *    10.75.22.0/24  0.0.0.0/0
    0     0 SNAT       all  --  *   *    67.193.45.68   0.0.0.0/0    to:66.11.173.224

You can see there are no hits on the second rule. So maybe these packets are getting a source address for the interface the kernel is wanting to route out of at the moment (i.e. the current round robin candidate). How does one defeat this "feature" of the kernel and force packets out of a specific interface?
I thought:

Chain tcpre (3 references)
 pkts bytes target prot opt in  out  source        destination
 1679  393K RETURN all  --  *   *    0.0.0.0/0     0.0.0.0/0    MARK match !0x0/0xc0
 4240  452K MARK   all  --  *   *    0.0.0.0/0     0.0.0.0/0    MARK set 0x40
    0     0 MARK   all  --  *   *    10.75.22.101  0.0.0.0/0    MARK set 0x80

which is a result of the tcrules:

CONTINUE:P      0.0.0.0/0       0.0.0.0/0       all     -       -       -       !0/0xc0
# default routing of everything else through cogeco (put exceptions
# below since last match wins)
64:P            0.0.0.0/0
64              $FW
128:P           10.75.22.101

was supposed to take care of that.

This was all working on my White Russian 2.4.30 kernel and is now not working with Kamikaze 2.6.23. ~sigh~

b.
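As a reading aid for the tcpre listing in the message above, the three rules can be modelled in a few lines of Python. This is a sketch of the rule logic only; the function name `tcpre` is just a label for the chain, and it assumes that `MARK set` overwrites the whole mark and that the chain is evaluated top to bottom with no early exit other than the RETURN rule.

```python
def tcpre(mark: int, src: str) -> int:
    """Model of the three tcpre rules quoted above (illustrative only)."""
    # Rule 1: RETURN if either provider bit (0x40 or 0x80) is already set,
    # i.e. "MARK match !0x0/0xc0".
    if (mark & 0xC0) != 0:
        return mark
    # Rule 2: everything else is marked for the first provider (0x40).
    mark = 0x40
    # Rule 3: the exception for one LAN host; it comes later, so it wins.
    if src == "10.75.22.101":
        mark = 0x80
    return mark


print(hex(tcpre(0x00, "10.75.22.50")))    # ordinary host, unmarked
print(hex(tcpre(0x00, "10.75.22.101")))   # the exception host
print(hex(tcpre(0x80, "10.75.22.50")))    # already marked, left alone
```

The model says nothing about which packets actually traverse the chain; it only shows what the rules do to a packet that does.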
Jerry Vonau
2007-Dec-29 05:30 UTC
Re: marking and routing (with dual default routes) not working
Brian J. Murrell wrote:
> On Fri, 2007-12-28 at 18:22 -0600, Jerry Vonau wrote:
>> Brian J. Murrell wrote:
>>
>>> and given the CGCO routing table:
>>>
>>> 10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
>>> 67.193.45.68 dev eth0.1  scope link
>>> 192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
>>> 10.8.0.0/24 via 10.8.0.2 dev tun0
>>> 10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
>>> 10.75.23.0/24 via 10.8.0.2 dev tun0
>>> 67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
>>> default via 67.193.44.1 dev eth0.1
>>>
>>> and the main routing table:
>>>
>>> 10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
>>> 192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
>>> 10.8.0.0/24 via 10.8.0.2 dev tun0
>>> 10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
>>> 10.75.23.0/24 via 10.8.0.2 dev tun0
>>> 67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
>>> 169.254.0.0/16 via 10.75.22.223 dev br-lan  proto zebra  metric 20 equalize
>>> default
>>>         nexthop via 67.193.44.1  dev eth0.1 weight 1
>>>         nexthop via 192.168.200.1  dev ppp0 weight 1
>>>
>> Both tables are the same,
>
> No they aren't. Most specifically, the CGCO (one of the provider
> tables) table, above at top only has a default route to the CGCO
> provider. The main table (which is consulted only as a last resort
> according to my route rules) is the only table with both default routes
> and which is therefore eligible for round-robin default routing. AFAIU.
>

OK, except for the gateways they are the same....

>> is the copy column in the providers file
>> blank?
>
> Yes:
>
> #NAME   NUMBER  MARK    DUPLICATE       INTERFACE       GATEWAY         OPTIONS         COPY
> CGCO    1       64      main            eth0.1          detect          track,balance
> IGS     2       128     main            ppp0            detect          track,balance
>
>> Both providers' routing should not be in each others table.
>

(wondering why I even bothered to help develop/test/document this support)

That is the issue: add the local LAN to the copy column...

From the multi-isp page:

When you specify an existing table in the DUPLICATE column, Shorewall copies all routes through the interface specified in the INTERFACE column plus the interfaces listed in this column. Normally, you will list all interfaces on your firewall in this column except those internet interfaces specified in the INTERFACE column of entries in this file.

> Correct. They are not:
>
> # ip route ls table CGCO
> 10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
> 67.193.45.68 dev eth0.1  scope link
> 192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
> 10.8.0.0/24 via 10.8.0.2 dev tun0
> 10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
> 10.75.23.0/24 via 10.8.0.2 dev tun0
> 67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
> default via 67.193.44.1 dev eth0.1
> # ip route ls table IGS

You have a route to the other ISP's gateway.... With a source address from the other provider...
This is supposed to be a provider-specific table...

> 66.11.173.224 dev ppp0  scope link
> 10.8.0.2 dev tun0  proto kernel  scope link  src 10.8.0.1
> 192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
> 10.8.0.0/24 via 10.8.0.2 dev tun0
> 10.75.22.0/24 dev br-lan  proto kernel  scope link  src 10.75.22.254
> 10.75.23.0/24 via 10.8.0.2 dev tun0
> 67.193.44.0/23 dev eth0.1  proto kernel  scope link  src 67.193.45.68
> default via 192.168.200.1 dev ppp0
>

ditto

> But given a route rule table with:
>
> 0:      from all lookup local
> 10000:  from all fwmark 0x40 lookup CGCO
> 10001:  from all fwmark 0x80 lookup IGS
> 20000:  from 67.193.45.68 lookup CGCO
> 20256:  from 66.11.173.224 lookup IGS
> 32766:  from all lookup main
> 32767:  from all lookup default
>
> Does that not mean that packets on this connection:
>
> udp 17 179 src=99.228.107.5 dst=67.193.45.68 sport=34786 dport=1194 packets=25769 bytes=7319381 src=67.193.45.68 dst=99.228.107.5 sport=1194 dport=34786 packets=25116 bytes=4779760 [ASSURED] mark=64 use=1
>
> (notice the mark=) are sent via the CGCO table and ultimately to the:
>
> default via 67.193.44.1 dev eth0.1
>
> route? Seem so to me, yet I still see:
>
> Dec 28 23:31:18 gw.ilinx kernel: Shorewall:fw2all:REJECT:IN= OUT=ppp0 SRC=66.11.173.224 DST=99.228.107.5 LEN=481 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=34786 LEN=461
> Dec 28 23:31:18 gw.ilinx kernel: Shorewall:fw2all:REJECT:IN= OUT=ppp0 SRC=66.11.173.224 DST=99.228.107.5 LEN=545 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=34786 LEN=525
> Dec 28 23:31:19 gw.ilinx kernel: Shorewall:fw2all:REJECT:IN= OUT=ppp0 SRC=66.11.173.224 DST=99.228.107.5 LEN=505 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=34786 LEN=485
>
> I can't see how those are making it to the ppp0 interface.
>

The above routes maybe?

>> /sbin/ip route ls table SHAW
>> 24.78.192.1 dev eth1  scope link  src 24.78.192.197
>> 10.3.0.0/24 dev eth0  proto kernel  scope link  src 10.3.0.75
>> 24.78.192.0/23 dev eth1  proto kernel  scope link  src 24.78.192.197
>> 169.254.0.0/16 dev eth1  scope link
>> default via 24.78.192.1 dev eth1
>>
>> Note the lack of routing for my other provider.
>
> I'm not quite following what routing you see in my CGCO table that is
> for the other provider. Indeed there is the
>
> 192.168.200.1 dev ppp0  proto kernel  scope link  src 66.11.173.224
>
> route but that is a very specific route which should not be causing my
> problem. I don't think.
>

With a source address from the other provider's IP block... with a route going to the other gateway.

> In taking a closer look at the REJECT messages above, the SRC address is
> for my ppp0 interface, indeed but I assumed that the cross-provider SNAT
> rules were doing that, but it does not seem to be the case:
>
> Chain ppp0_masq (1 references)
>  pkts bytes target     prot opt in  out  source         destination
>     0     0 MASQUERADE all  --  *   *    10.75.22.0/24  0.0.0.0/0
>     0     0 SNAT       all  --  *   *    67.193.45.68   0.0.0.0/0    to:66.11.173.224
>

MASQUERADE: from experience, please use SNAT here.

> You can see there are no hits on the second rule. So maybe these
> packets are getting a source address for the interface the kernel is
> wanting to route out of at the moment (i.e. the current round robin
> candidate). How does one defeat this "feature" of the kernel and force
> packets out of a specific interface?
> I thought:
>
> Chain tcpre (3 references)
>  pkts bytes target prot opt in  out  source        destination
>  1679  393K RETURN all  --  *   *    0.0.0.0/0     0.0.0.0/0    MARK match !0x0/0xc0
>  4240  452K MARK   all  --  *   *    0.0.0.0/0     0.0.0.0/0    MARK set 0x40
>     0     0 MARK   all  --  *   *    10.75.22.101  0.0.0.0/0    MARK set 0x80
>
> Which is a result of the tcrules:
>
> CONTINUE:P      0.0.0.0/0       0.0.0.0/0       all     -       -       -       !0/0xc0
> # default routing of everything else through cogeco (put exceptions
> # below since last match wins)
> 64:P            0.0.0.0/0
> 64              $FW
> 128:P           10.75.22.101
>
> Was supposed to take care of that.
>
> This was all working on my white russian kernel 2.4.30 kernel and is now
> not working with kamikaze 2.6.23. ~sigh~
>
> b.

What version of Shorewall? If adding br-lan to the copy column does not fix the issue, please post a dump; maybe I'm not seeing the whole picture here.

Jerry
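The effect of the COPY column described in the quoted multi-isp text can be pictured as a filter over the main table. The Python below is a loose illustration of that one sentence, not Shorewall's code: `copied_routes` is a made-up helper, the route list is abbreviated to destination and device from the main table posted earlier in the thread, and default routes are left out on the assumption that each provider table gets its own.

```python
# (destination, device) pairs from the main table posted in this thread.
MAIN = [
    ("10.8.0.2", "tun0"),
    ("192.168.200.1", "ppp0"),
    ("10.8.0.0/24", "tun0"),
    ("10.75.22.0/24", "br-lan"),
    ("10.75.23.0/24", "tun0"),
    ("67.193.44.0/23", "eth0.1"),
    ("169.254.0.0/16", "br-lan"),
]


def copied_routes(main, interface, copy):
    """Keep only routes through the provider INTERFACE or a COPY interface."""
    keep = {interface, *copy}
    return [dest for dest, dev in main if dev in keep]


# Hypothetical CGCO entry with COPY set to br-lan, as suggested above.
print(copied_routes(MAIN, "eth0.1", ["br-lan"]))
# Listing tun0 as well would keep the VPN routes in the provider table.
print(copied_routes(MAIN, "eth0.1", ["br-lan", "tun0"]))
```

In both cases the `192.168.200.1 dev ppp0` route, which belongs to the other provider, no longer appears in the CGCO result.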
Brian J. Murrell
2008-Jan-01 15:33 UTC
Re: marking and routing (with dual default routes) not working
On Fri, 2007-12-28 at 17:49 -0500, Brian J. Murrell wrote:
> Well, it probably is working. I'm probably just misunderstanding
> something.

The crux of this problem was in fact that, because I have multiple Internet interfaces and thus multiple addresses, the application (OpenVPN) was choosing a different source address for different packets of what was essentially the same connection.

The solution was to add a:

local {hostname|ip-address}

directive to my OpenVPN configuration file. This is less than ideal, however, as it reduces the flexibility of actually having multiple Internet connections. I guess reduced flexibility and working is better than working intermittently. :-)

Cheers, and many thanks to Jerry for providing me the opportunity to work through the issue and for jogging my memory about that "feature".

b.
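The general mechanism behind that fix can be shown with a plain UDP socket. This is a minimal sketch and not OpenVPN's code: the helper name `bound_udp_socket` is invented here, and the loopback address stands in for the real provider address (67.193.45.68 in this thread) so the snippet runs on any machine.

```python
import socket


def bound_udp_socket(local_ip: str, port: int = 0) -> socket.socket:
    """Return a UDP socket bound to one specific local address.

    Datagrams sent from this socket carry local_ip as their source
    address. A socket bound to the wildcard 0.0.0.0 instead leaves the
    choice of source address to the kernel for each outgoing packet.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((local_ip, port))
    return sock


pinned = bound_udp_socket("127.0.0.1")
wildcard = bound_udp_socket("0.0.0.0")
print(pinned.getsockname()[0])    # the address every reply will use
print(wildcard.getsockname()[0])  # no fixed source address
pinned.close()
wildcard.close()
```

With the source address pinned this way, a source-based rule such as `from 67.193.45.68 lookup CGCO` can match every reply, at the cost Brian mentions: the service answers on one provider's address only.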