Vincent Jaussaud
2002-Oct-25 13:50 UTC
multipath routing problem [Shorter version] - Help still needed :-)
Hi there!

Well, it seems my previous mail was too long to even be read :) So I'll try to summarize it a bit (see the original mail below).

I'm doing multipath routing to reach some networks. Multipath routing is done on the firewall itself, splitting traffic between two gateways which are on the same network as the firewall network device being used. (I.e., we don't need link redundancy but "gateway redundancy"; the gateways are linked to the remote networks using VPN tunnels.)

When only one gateway is used to reach the remote networks, everything works just fine (whichever gateway we choose to use). Whenever we attempt to activate multipath routing over both gateways, SSH doesn't work anymore. We can ping, traceroute, telnet, ... but not SSH nor FTP (PASV).

Connections simply break, with a "Read error from remote host: Connection reset by peer" on both sides (client & server) using SSH, or, using FTP:

    No control connection for command: Transport endpoint is not connected

We've checked everything, including bad routes, DNS/RDNS resolution, ACLs, firewalling, etc. Everything is OK.

The firewall is running kernel 2.2.22 with Julian's patches. No NAT is being done on the firewall, since we don't need it (both gateways already do NAT).

I'm a bit stuck here, and to be honest, I don't have a clue what's going on :-(. I've tried different gateways on different networks, with no luck, so it doesn't seem to be a gateway problem.

Does nobody else have this kind of problem using multipath routing? If someone has a clue, even a small one, I would be very happy to hear about it, since I'm definitely lost here. And I *need* to fix this issue.

Thanks again.
Vincent.

-----Forwarded Message-----
From: Vincent Jaussaud <tatooin@kelkoo.com>
To: lartc <lartc@mailman.ds9a.nl>
Subject: [LARTC] multipath routing problem - Help needed
Date: 22 Oct 2002 18:17:03 +0200

Hi there;

I'm currently facing some weird issues using multipath routing, and I'm getting desperate to solve them. :-(

Overview:
---------

We have two distinct datacenters, linked to our office network across VTUND VPNs. In our office, one Linux server has two VTUN tunnels connected to our DCs (one tunnel per DC). The DCs are also connected to each other using a VTUN tunnel.

So, basically, it looks something like:

            Office
              |
           Firewall
              |
           VTUN_box
            |    |
    ----------------------------------INTERNET
            |    |
           DC1-----DC2

In this situation, everything is working just fine. However, for redundancy / load balancing reasons, we want to build the following setup:

                   Office
                     |
                  Firewall
                     |
              ---------------
              |             |
          Vtun_Box1-------Vtun_Box2
            |   |           |   |
    ----------------------------------INTERNET
            |   |           |   |
           DC1----DC2      DC1----DC2

In such a setup, if Vtun_Box1 crashes, all traffic going to our DCs would be redirected by the firewall through Vtun_Box2, and vice versa. On top of this, if one or both tunnels on one Vtun box stop working while the Vtun box itself is still alive, it will automatically redirect all traffic through the other Vtun box.

Note that both Vtun boxes are on the same network segment, that is, they have the same network address / broadcast / netmask. Only their IP addresses are different. Thus, both Vtun boxes are reached by the firewall through the same device (eth1, here).

Also, the firewall doesn't NAT traffic going to the DCs, since each Vtun box will already NAT everything going out through the tunnels.

Now, regarding the server settings:

Firewall:
---------
System: Linux, stock kernel 2.2.22 with Julian's patches applied

Routing policy:

    0:      from all lookup local
    50:     from all lookup main
    101:    from all lookup prod-vpn    # Traffic going to both DCs
    200:    from all lookup uunet       # Default route
    32766:  from all lookup main
    32767:  from all lookup default

Where ip route list table prod-vpn gives:

    DC1_NET/24 proto static
            nexthop via Vtun_Box1 dev eth1 weight 1
            nexthop via Vtun_Box2 dev eth1 weight 1
    DC2_NET/24 proto static
            nexthop via Vtun_Box1 dev eth1 weight 1
            nexthop via Vtun_Box2 dev eth1 weight 1

Vtun_Box1:
----------
System: Linux, stock kernel 2.2.19

NAT:

    MASQ  all  ------  anywhere  anywhere  n/a

On this box, we have
    172.1.1.1 as the local IP of the tunnel to DC1
    172.1.2.1 as the local IP of the tunnel to DC2

Vtun_Box2:
----------
System: Linux, stock kernel 2.4.19

NAT:

    SNAT  all  --  any  tun2  anywhere  anywhere  to:172.1.1.3
    SNAT  all  --  any  tun3  anywhere  anywhere  to:172.1.2.3

Where 172.1.1.3 is the local IP of the tunnel to DC1
Where 172.1.2.3 is the local IP of the tunnel to DC2

Now, the problem :-)

We mostly do SSH to our DCs. In the simple setup, where we don't do multipath routing (i.e., having only one Vtun box), everything is working fine. We can ssh into any machine in any DC without problems. SSH sessions are stable, and stop working only when the NAT ttl has expired.

However, when we activate multipath routing, everything goes wrong. For instance:

    [root@leonard /root]# ssh -l root lime.hosting.kelkoo.net
    root@lime.hosting.kelkoo.net's password:
    Read from remote host lime.hosting.kelkoo.net: Connection reset by peer
    Connection to lime.hosting.kelkoo.net closed.

SSH simply doesn't work anymore, and it's not a netfilter issue, nor a matter of TCP wrapper ACLs. We've checked every firewall rule and every TCP wrapper ACL. Everything is OK.

What is weird is that much simpler protocols seem to work fine; e.g., a telnet to the same host instead of SSH will work.
Same thing if we telnet to the SMTP port, for instance, and start simulating an SMTP dialog; it'll work just fine. I also noticed that ping & traceroute ICMP packets work just fine, whatever path they use to reach a DC.

However, I think that we are likely to have the same problems with simple protocols as well, if we look a bit deeper and start heavy testing.

Right now, we can't use SSH or FTP with our DCs; all sessions crash just after authentication. Rarely, we can SSH in to the machines successfully, but the session crashes a few minutes later.

I'm a bit worried about this situation, because it seems that packets don't use the proper reverse path to come back, although we are NAT'ing everything going out of the tunnels!

Maybe the problem comes from the fact that both Vtun box gateways are reachable through the same firewall device, but in that case I'd like to be sure before throwing everything out. :)

I don't get what's going on here; any help would be greatly appreciated.

Thanks in advance.
Best regards,
Vincent Jaussaud

--
Vincent Jaussaud
Kelkoo.com Security Manager
email: tatooin@kelkoo.com
"The UNIX philosophy is to design small tools that do one thing, and do it well."
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/
Arthur van Leeuwen
2002-Oct-25 14:24 UTC
Re: multipath routing problem [Shorter version] - Help still needed :-)
On 25 Oct 2002, Vincent Jaussaud wrote:

> When only one gateway is used to reach remote networks, everything is
> working just fine. (Whatever gateway we choose to use)
> Whenever we attempt to activate multipath routing over both gateways,
> then SSH doesn't work anymore. We can ping, traceroute, telnet, ... but
> not SSH nor FTP (PASV).

ssh tends to play with TOS fields (and rightly so). Routing is keyed to the *triple* (src, dst, tos), something that most people (including me) normally forget. However, in this particular case that may be the reason for your ssh sessions breaking.

The reason for FTP breaking possibly has to do with packets for the control connection going out one gateway and packets for the data connection going out the other... but this is speculation on my part.

Doei, Arthur.

--
  /\    / |      arthurvl@sci.kun.nl      | Work like you don't need the money
 /__\  /  | A friend is someone with whom | Love like you have never been hurt
/    \/__ | you can dare to be yourself   | Dance like there's nobody watching
Vincent Jaussaud
2002-Oct-25 14:38 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
On Fri, 2002-10-25 at 16:24, Arthur van Leeuwen wrote:
> On 25 Oct 2002, Vincent Jaussaud wrote:
>
> > When only one gateway is used to reach remote networks, everything is
> > working just fine. (Whatever gateway we choose to use)
> > Whenever we attempt to activate multipath routing over both gateways,
> > then SSH doesn't work anymore. We can ping, traceroute, telnet, ... but
> > not SSH nor FTP (PASV).
>
> ssh tends to play with TOS fields (and rightly so). Routing is keyed to the
> *triple* (src, dst, tos), something that most people (including me) normally
> forget. However, in this particular case that may be the reason for your
> ssh sessions breaking.

Hmm... that's really interesting. Thanks for the pointer. I remember now that I read something about SSH and the TOS field some days ago. If I'm right, it uses the Minimum Delay TOS value.

Now, how am I supposed to deal with this TOS issue? What TOS value should do the trick? I'm using a 2.2 kernel with ipchains.

> The reason for FTP breaking possibly has to do with packets for
> the control connection going out the one gateway and for the data going
> out the other... but this is speculation on my part.

That sounds wise. However, routes are supposed to be cached using the source IP field as well (if I'm not mistaken), so that all packets coming from a particular IP are likely to take the same route as the others. Am I wrong?

A BIG thanks for your reply :-)

Cheers,
Vincent.
Julian Anastasov
2002-Oct-25 14:55 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
Hello,

On 25 Oct 2002, Vincent Jaussaud wrote:

> > ssh tends to play with TOS fields (and rightly so). Routing is keyed to the
> > *triple* (src, dst, tos), something that most people (including me) normally
> > forget. However, in this particular case that may be the reason for your
> > ssh sessions breaking.
> >
> Hmm... that's really interesting. Thanks for the pointer. I remember now
> that I read something about SSH and the TOS field some days ago. If I'm
> right, it uses the Minimum Delay TOS value.
>
> Now, how am I supposed to deal with this TOS issue? What TOS value
> should do the trick?

In theory, you should not reach the multipath route for traffic that is already NAT-ed. Maybe you have to fix your routes. The TOS field plays a part in the input routing performed on forwarding, and traffic between two public IP addresses can select a different nexthop if the TOS is different or if the routing cache is somehow flushed (on route/address add/del, expiration).

> I'm using a 2.2 kernel with ipchains.
>
> > The reason for FTP breaking possibly has to do with packets for
> > the control connection going out the one gateway and for the data going
> > out the other... but this is speculation on my part.
>
> That sounds wise. However, routes are supposed to be cached using the src
> IP field as well (if I'm not mistaken), so that all packets coming
> from a particular IP are likely to take the same route as the others.
> Am I wrong?

Yes, TOS is a routing key just like SADDR and DADDR. By using a multipath route between 2 IP addresses you agree that the packets can _safely_ choose any of the paths. When using two or more ISPs you simply can't do this if the ISPs have source spoofing disabled. In such cases only the traffic that is NAT-ed from your box has the right to use the multipath route. This is a key requirement for the patches you are using. Once the NAT connections are established they don't hit the multipath route.

Regards

--
Julian Anastasov <ja@ssi.bg>
Vincent Jaussaud
2002-Oct-25 15:31 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
On Fri, 2002-10-25 at 16:55, Julian Anastasov wrote:
>
> Hello,

Hi Julian!

> On 25 Oct 2002, Vincent Jaussaud wrote:
>
> > > ssh tends to play with TOS fields (and rightly so). Routing is keyed to the
> > > *triple* (src, dst, tos), something that most people (including me) normally
> > > forget. However, in this particular case that may be the reason for your
> > > ssh sessions breaking.
> > >
> > Hmm... that's really interesting. Thanks for the pointer. I remember now
> > that I read something about SSH and the TOS field some days ago. If I'm
> > right, it uses the Minimum Delay TOS value.
> >
> > Now, how am I supposed to deal with this TOS issue? What TOS value
> > should do the trick?
>
> In theory, you should not reach the multipath route for
> traffic that is already NAT-ed. Maybe you have to fix your
> routes. The TOS field plays a part in the input routing performed
> on forwarding, and traffic between two public IP addresses can select
> a different nexthop if the TOS is different or if the routing
> cache is somehow flushed (on route/address add/del, expiration).

But traffic is NAT-ed after multipath routing occurs! I.e., the box which does multipath routing does not NAT traffic; traffic gets NAT-ed when leaving the gateways:

    LAN --> FW w/ multipath-routing
               |          |
           Gateway1   Gateway2
               | (NAT)    | (NAT)
               |          |
           -------------------- Remote Network

Packets reach the remote network using one of the gateways' NAT-ed IPs, so that when packets come back they should use the proper return path. Am I wrong?

> > I'm using a 2.2 kernel with ipchains.
> >
> > > The reason for FTP breaking possibly has to do with packets for
> > > the control connection going out the one gateway and for the data going
> > > out the other... but this is speculation on my part.
> >
> > That sounds wise. However, routes are supposed to be cached using the src
> > IP field as well (if I'm not mistaken), so that all packets coming
> > from a particular IP are likely to take the same route as the others.
> > Am I wrong?
>
> Yes, TOS is a routing key just like SADDR and DADDR.
> By using a multipath route between 2 IP addresses you agree that
> the packets can _safely_ choose any of the paths. When using
> two or more ISPs you simply can't do this if the ISPs have
> source spoofing disabled. In such cases only the traffic that
> is NAT-ed from your box has the right to use the multipath route.
> This is a key requirement for the patches you are using. Once
> the NAT connections are established they don't hit the multipath
> route.

Hmmm... Then this is where the problem is. So, if I understand correctly, packets from a single TCP connection may use any of the multipath routes _if_ they are not NAT-ed? Isn't the routing cache supposed to ensure that all packets from a single connection use the same path?

Now, if I understand the whole topic, in order to fix my problem I need to:

- NAT everything on the FW itself
- or disable NAT on both gateways, and ensure that routing is done properly

Am I right? I'm getting a bit lost here :-\

Many thanks for your time.
Vincent.
Julian Anastasov
2002-Oct-25 16:12 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
Hello,

On 25 Oct 2002, Vincent Jaussaud wrote:

> But traffic is NAT-ed after multipath routing occurs!
> I.e., the box which does multipath routing does not NAT traffic; traffic gets
> NAT-ed when leaving the gateways:
>
>     LAN --> FW w/ multipath-routing
>                |          |
>            Gateway1   Gateway2
>                | (NAT)    | (NAT)
>                |          |
>            -------------------- Remote Network
>
> Packets reach the remote network using one of the gateways' NAT-ed IPs, so
> that when packets come back they should use the proper return path. Am I
> wrong?

Now I see; then the TOS is a big problem for you. Maybe your problem would be solved if TOS were not a routing key, but that does not sound like something that is easy to fix in the kernel.

> Hmmm... Then this is where the problem is. So, if I understand
> correctly, packets from a single TCP connection may use any of
> the multipath routes _if_ they are not NAT-ed? Isn't the routing cache

Yes, traffic A->B with TOS=XXX will use one path; traffic A->B with TOS=YYY can use a different path.

> supposed to ensure that all packets from a single connection use
> the same path?

The routing does not cache connections but route resolutions keyed by saddr, daddr, tos, etc. There are not even TCP/UDP ports in the key (yet?).

> Now, if I understand the whole topic, in order to fix my problem I need
> to:
>
> - NAT everything on the FW itself

This is a solution.

> - or disable NAT on both gateways, and ensure that routing is done
>   properly

Not sure what you mean. But you can run multipath routes on both gateways. Then the internal hosts can select any of the gateways as default (or use alternative default gateways). Another solution is for your apps not to change the TOS; it can be changed at the border gateways (if useful at all).

Regards

--
Julian Anastasov <ja@ssi.bg>
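The point that the routing cache stores route resolutions keyed by (saddr, daddr, tos), and not connections, can be sketched with a toy model. This is only an illustration: the `RouteCache` class, the round-robin choice on a cache miss, and the two addresses are assumptions made for the example, not how the kernel or the patches select a nexthop.

```python
from itertools import cycle


class RouteCache:
    """Toy route cache: one cached nexthop per (saddr, daddr, tos) key."""

    def __init__(self, nexthops):
        # Round-robin on a cache miss is a stand-in for multipath selection.
        self._next = cycle(nexthops)
        self._cache = {}

    def lookup(self, saddr, daddr, tos):
        key = (saddr, daddr, tos)  # note: no TCP/UDP ports in the key
        if key not in self._cache:
            self._cache[key] = next(self._next)
        return self._cache[key]


cache = RouteCache(["Vtun_Box1", "Vtun_Box2"])

# Same hosts, same connection, but two different TOS values (hypothetical addresses).
first = cache.lookup("10.0.0.5", "192.168.0.2", 0x00)
second = cache.lookup("10.0.0.5", "192.168.0.2", 0x10)
again = cache.lookup("10.0.0.5", "192.168.0.2", 0x00)

print(first, second, again)  # Vtun_Box1 Vtun_Box2 Vtun_Box1
```

In this model a packet whose TOS changes is a new cache key, so it is free to land on the other gateway, while packets with an unchanged TOS keep hitting the cached entry.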
Vincent Jaussaud
2002-Oct-25 18:15 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
On Fri, 2002-10-25 at 18:12, Julian Anastasov wrote:
>
> Hello,
>
> On 25 Oct 2002, Vincent Jaussaud wrote:
>
> > But traffic is NAT-ed after multipath routing occurs!
> > I.e., the box which does multipath routing does not NAT traffic; traffic gets
> > NAT-ed when leaving the gateways:
> >
> >     LAN --> FW w/ multipath-routing
> >                |          |
> >            Gateway1   Gateway2
> >                | (NAT)    | (NAT)
> >                |          |
> >            -------------------- Remote Network
> >
> > Packets reach the remote network using one of the gateways' NAT-ed IPs, so
> > that when packets come back they should use the proper return path. Am I
> > wrong?
>
> Now I see; then the TOS is a big problem for you. Maybe
> your problem would be solved if TOS were not a routing key, but
> that does not sound like something that is easy to fix in the kernel.
>
> > Hmmm... Then this is where the problem is. So, if I understand
> > correctly, packets from a single TCP connection may use any of
> > the multipath routes _if_ they are not NAT-ed? Isn't the routing cache
>
> Yes, traffic A->B with TOS=XXX will use one path; traffic
> A->B with TOS=YYY can use a different path.

OK, I understand a lot of things now, thanks to you.

However, I don't get why, within the same SSH session, the TOS may differ from one packet to another. Using tcpdump, it seems that the TOS value changes right after authentication has succeeded.

> > supposed to ensure that all packets from a single connection use
> > the same path?
>
> The routing does not cache connections but route resolutions keyed
> by saddr, daddr, tos, etc. There are not even TCP/UDP ports in the key (yet?).

OK.

> > Now, if I understand the whole topic, in order to fix my problem I need
> > to:
> >
> > - NAT everything on the FW itself
>
> This is a solution.

Yes, but unfortunately, given my current network architecture, this would require massive changes in the firewall ACLs / routing rules. Currently, this is not an option.

> > - or disable NAT on both gateways, and ensure that routing is done
> >   properly
>
> Not sure what you mean. But you can run multipath routes
> on both gateways. Then the internal hosts can select any of
> the gateways as default (or use alternative default gateways).

Each gateway only has one route to the remote network. So I don't see where the multipath routing would send packets, except maybe to the other gateway. But this may become a big problem if all tunnels to the remote networks are down; it will create a routing loop.

> Another solution is for your apps not to change the TOS; it can be
> changed at the border gateways (if useful at all).

Actually, thanks to you, I now know where the problem is. Considering that doing NAT on the firewall itself is not possible, and that disabling NAT on both gateways will quickly become a big headache because of the current network setup, I am thinking of adding another layer of NAT on the final gateway. I.e., something like:

    LAN --> FW w/ multipath-routing
               |          |
           Gateway1   Gateway2
               | (NAT)    | (NAT)
               |          |
           Remote Gateway (other side)
                    | (NAT)
                    |
           -------------------- Remote Network

This will ensure that all packets arriving at the remote network come from the same IP address, whatever path they used.

What do you think of such a setup? Is it likely to work?

Thanks again.
Regards,
Vincent.
Arthur van Leeuwen
2002-Oct-25 18:17 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
On Fri, 25 Oct 2002, Julian Anastasov wrote:

> Hello,
>
> On 25 Oct 2002, Vincent Jaussaud wrote:
>
> > But traffic is NAT-ed after multipath routing occurs!
> > I.e., the box which does multipath routing does not NAT traffic; traffic gets
> > NAT-ed when leaving the gateways:
> >
> >     LAN --> FW w/ multipath-routing
> >                |          |
> >            Gateway1   Gateway2
> >                | (NAT)    | (NAT)
> >                |          |
> >            -------------------- Remote Network
> >
> > Packets reach the remote network using one of the gateways' NAT-ed IPs, so
> > that when packets come back they should use the proper return path. Am I
> > wrong?
>
> Now I see; then the TOS is a big problem for you. Maybe
> your problem would be solved if TOS were not a routing key, but
> that does not sound like something that is easy to fix in the kernel.

Actually, you can simply play whack-a-mole with the TOS value, using ipchains (or iptables), killing all TOS values present on the packets. Of course, this is not very *nice*, but it'll work.

Doei, Arthur.
Arthur van Leeuwen
2002-Oct-25 18:21 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
On 25 Oct 2002, Vincent Jaussaud wrote:

> However, I don't get why, within the same SSH session, the TOS may differ from
> one packet to another. Using tcpdump, it seems that the TOS value changes
> right after authentication has succeeded.

Shit... you figured that one out *quite* a bit faster than I did at the time... took me two weeks.

What openssh does is first authenticate, then set the TOS value depending on whether you're doing interactive communications (ssh) or bulk transfer (scp). One could see this as a way of minimizing information leakage...

Oh, and yes, it does what you deduced. I finally got that from reading the sources...

Doei, Arthur.
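How an application can change the TOS of a connection that is already open can be sketched with the standard socket API. This is a minimal illustration and not OpenSSH's code: the helper names `tos_for` and `mark_socket` are hypothetical, and the mapping of interactive sessions to "minimize delay" (0x10) and bulk transfers to "maximize throughput" (0x08) is an assumption based on the classic TOS bit definitions and on the behaviour described in the message.

```python
import socket

IPTOS_LOWDELAY = 0x10    # "minimize delay" bit of the classic TOS field
IPTOS_THROUGHPUT = 0x08  # "maximize throughput" bit


def tos_for(interactive):
    """Pick a TOS value by session type (assumed mapping, see lead-in)."""
    return IPTOS_LOWDELAY if interactive else IPTOS_THROUGHPUT


def mark_socket(sock, interactive):
    """Set the TOS byte on an existing IPv4 socket and return the value used."""
    tos = tos_for(interactive)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return tos


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The socket starts with the default TOS; this call stands in for the
        # "after authentication" step, from which point packets carry a new TOS.
        print(hex(mark_socket(sock, interactive=True)))
    finally:
        sock.close()
```

Because the option is set on a live socket, packets sent before and after the call belong to the same TCP connection but carry different TOS values, which is exactly the situation a TOS-keyed routing cache treats as two separate lookups.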
Vincent Jaussaud
2002-Oct-25 18:44 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
On Fri, 2002-10-25 at 20:21, Arthur van Leeuwen wrote:
> On 25 Oct 2002, Vincent Jaussaud wrote:
>
> > However, I don't get why, within the same SSH session, the TOS may differ from
> > one packet to another. Using tcpdump, it seems that the TOS value changes
> > right after authentication has succeeded.
>
> Shit... you figured that one out *quite* a bit faster than I did at the
> time... took me two weeks.

:-)

> What openssh does is first authenticate, then set the TOS value depending on
> whether you're doing interactive communications (ssh) or bulk transfer
> (scp). One could see this as a way of minimizing information leakage...

OK, now I know why openssh is changing its TOS! Thanks. :-)

> Oh, and yes, it does what you deduced. I finally got that from reading the
> sources...

I could mangle the TOS field as you suggested, but I don't like this, since packets *should* be able to find their way out, whatever path they use to come back.

The thing I don't understand is that even when NAT'ing everything, everywhere, my connections still break.

I've tried to NAT on the firewall everything coming from a test IP, just to see how it goes. No luck. I even tried NAT'ing on the firewall, then on the gateways, then on the final router in the other network. Still no luck!

This is nonsense! There has to be something wrong somewhere.

Thanks for your reply.
Regards,
Vincent.
Julian Anastasov
2002-Oct-25 18:45 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
Hello,

On Fri, 25 Oct 2002, Arthur van Leeuwen wrote:

> > Now I see; then the TOS is a big problem for you. Maybe
> > your problem would be solved if TOS were not a routing key, but
> > that does not sound like something that is easy to fix in the kernel.
>
> Actually, you can simply play whack-a-mole with the TOS value, using
> ipchains (or iptables), killing all TOS values present on the packets.
> Of course, this is not very *nice*, but it'll work.

This is a good idea. Vincent, maybe you can play with ipchains -t AND XOR in the input chain to see what happens. Just make sure you don't touch bits 0, 1, 5, 6, 7. It seems the routing uses only bits 2, 3 and 4 for the routing key (if I'm not overlooking something). This is for kernel 2.4. For kernel 2.2 it seems bit 1 is also included in the routing key.

    2.4 mask 0x1C, inverted 0xE3
    2.2 mask 0x1E, inverted 0xE1

So, for 2.2 maybe:

    ipchains -I input -d 0.0.0.0/0 22 -t 0xE3 0x00

What are the TOS values used during the SSH session?

Regards

--
Julian Anastasov <ja@ssi.bg>
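The mask arithmetic in this message can be checked with a few lines. This is a sketch assuming the documented ipchains behaviour for -t (the TOS byte is ANDed with the first mask, then XORed with the second); the function name `rewrite_tos` and the sample TOS values are only illustrative. It also shows that the table above gives 0xE1 as the inverted mask for 2.2, whereas the example command uses 0xE3.

```python
def rewrite_tos(tos, and_mask, xor_mask):
    """Apply an ipchains-style TOS rewrite: AND first, then XOR."""
    return (tos & and_mask) ^ xor_mask


# Routing-key masks quoted in the message, per kernel series.
ROUTING_KEY_MASK = {"2.4": 0x1C, "2.2": 0x1E}

for kernel, key_mask in ROUTING_KEY_MASK.items():
    inverted = 0xFF ^ key_mask  # clears exactly the routing-key bits
    print(kernel, hex(key_mask), hex(inverted))

# TOS 0x10 ("minimize delay") is cleared by either inverted mask.
print(hex(rewrite_tos(0x10, 0xE3, 0x00)), hex(rewrite_tos(0x10, 0xE1, 0x00)))

# A TOS with bit 1 set (0x12 is a made-up sample) survives 0xE3 in the
# 2.2 routing key, but not 0xE1.
left_by_e3 = rewrite_tos(0x12, 0xE3, 0x00) & ROUTING_KEY_MASK["2.2"]
left_by_e1 = rewrite_tos(0x12, 0xE1, 0x00) & ROUTING_KEY_MASK["2.2"]
print(hex(left_by_e3), hex(left_by_e1))
```

With these values, both masks zero a TOS of 0x10, so either would hide an SSH-style change from the routing key; they differ only for packets that set bit 1.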
Vincent Jaussaud
2002-Oct-25 19:13 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
On Fri, 2002-10-25 at 20:45, Julian Anastasov wrote:
>
> Hello,
>
> On Fri, 25 Oct 2002, Arthur van Leeuwen wrote:
>
> > > Now I see; then the TOS is a big problem for you. Maybe
> > > your problem would be solved if TOS were not a routing key, but
> > > that does not sound like something that is easy to fix in the kernel.
> >
> > Actually, you can simply play whack-a-mole with the TOS value, using
> > ipchains (or iptables), killing all TOS values present on the packets.
> > Of course, this is not very *nice*, but it'll work.
>
> This is a good idea. Vincent, maybe you can play with
> ipchains -t AND XOR in the input chain to see what happens. Just make sure
> you don't touch bits 0, 1, 5, 6, 7. It seems the routing uses only bits
> 2, 3 and 4 for the routing key (if I'm not overlooking something).
> This is for kernel 2.4. For kernel 2.2 it seems bit 1 is also
> included in the routing key.
>
>     2.4 mask 0x1C, inverted 0xE3
>     2.2 mask 0x1E, inverted 0xE1
>
> So, for 2.2 maybe:
>
>     ipchains -I input -d 0.0.0.0/0 22 -t 0xE3 0x00

Just tried. Now SSH connections don't break anymore!!! :) Thanks!

Am I supposed to do this on both sides, or is doing it on the firewall itself enough?

> What are the TOS values used during the SSH session?

Right after authentication, the TOS value is set to 0x10:

    20:53:46.515566 192.168.0.2.ssh > 172.1.1.3.2418: R 4008315859:4008315859(0) win 0 [tos 0x10]

The only problem with this is that I will need to do this trick for any application that changes its TOS during the session. It seems that FTP behaves exactly the same way as SSH regarding the TOS field.

Do you guys know if many applications do this? Or is this just particular to SSH & FTP?

Anyway, I really would like to understand why it doesn't work when doing NAT.

A big thanks to both of you. I've learned a lot today :)

Thanks again.
Regards,
Vincent.
Julian Anastasov
2002-Oct-25 19:28 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
Hello,

On 25 Oct 2002, Vincent Jaussaud wrote:

> >     2.4 mask 0x1C, inverted 0xE3
> >     2.2 mask 0x1E, inverted 0xE1
> >
> > So, for 2.2 maybe:
> >
> >     ipchains -I input -d 0.0.0.0/0 22 -t 0xE3 0x00
>
> Just tried. Now SSH connections don't break anymore!!! :) Thanks!
> Am I supposed to do this on both sides, or is doing it on the firewall
> itself enough?

I now see that my example ipchains command is wrong; use 0xE1 for 2.2, as in the table above.

> The only problem with this is that I will need to do this trick for any
> application that changes its TOS during the session. It seems that FTP
> behaves exactly the same way as SSH regarding the TOS field.

It seems you can safely alter the TOS for all packets entering your box/site.

> Do you guys know if many applications do this? Or is this just
> particular to SSH & FTP?

The TOS is usually used for routing between routers in your site; the border gateways can then assign different priorities based on the TOS values, for traffic control purposes.

> Anyway, I really would like to understand why it doesn't work when doing
> NAT.

Maybe you can hunt it down with tcpdump. I assume you are using the patches, because the plain kernel has the same problem for NAT.

Regards

--
Julian Anastasov <ja@ssi.bg>
Vincent Jaussaud
2002-Oct-28 14:29 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
> It seems you can safely alter the TOS for all packets
> entering your box/site.

OK, I'll dig into this tip and see how it goes. If I can't figure out this NAT problem, I'll do this.

> Maybe you can hunt it down with tcpdump. I assume you are
> using the patches, because the plain kernel has the same problem
> for NAT.

Yes, I am running your patch. The kernel is 2.2.22 with the routes-2.2.20-7.diff patch applied. (I'm sure of this; otherwise dead gateway detection would simply not work.)

My question is: if we ensure that EVERY packet, whatever path it takes to arrive, finally passes through a single peer doing NAT, is this supposed to work around my TOS problem? That is, end services would only see packets coming from the last NAT address, which is the same whatever path the packets took to arrive.

Something like:

    LAN --> Multipath Firewall
                |       |
               GW1     GW2
                |       |
            -------------------
                     |
               Gateway (NAT)
                     |
                     --------- Remote Network

What about the rp_filter kernel value? Could it be a problem in such a setup?

Thanks again.
Vincent.

-- 
Vincent Jaussaud
Kelkoo.com Security Manager
email: tatooin@kelkoo.com
Julian Anastasov
2002-Oct-28 22:21 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
Hello,

On 28 Oct 2002, Vincent Jaussaud wrote:

> My question is, if we ensure that EVERY packets, whatever path they use
> to arrive, finally pass through a single peer doing NAT, is this suppose
> to work around my TOS problem ?

Sounds correct. The requirement is that every packet of a connection be NAT-ed by a single NAT router, and to the same masquerade address and port. The routing cache cannot guarantee that; only the patched masquerade code can.

> What about the rp_filter kernel value ? Could it be a problem in such
> setup ?

The patches are designed to work with rp_filter enabled. You can safely use it; it has been changed to work only with the defined paths.

> Thanks again.
> Vincent.

Regards

--
Julian Anastasov <ja@ssi.bg>
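Whether rp_filter is currently enabled on a box can be read from /proc. The following is a minimal sketch, assuming the usual `/proc/sys/net/ipv4/conf/<interface>/rp_filter` layout; the function name `rp_filter_state` and its optional base-directory argument are hypothetical and exist only to make the snippet easy to try out.

```shell
#!/bin/sh
# Print the rp_filter setting for one interface (0 = off, non-zero = on),
# or "unknown" if the file cannot be read.
#   $1: interface name, e.g. "all" or "eth1"
#   $2: optional base directory (hypothetical parameter, handy for testing)
rp_filter_state() {
    file="${2:-/proc/sys/net/ipv4/conf}/$1/rp_filter"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unknown"
    fi
}

rp_filter_state all
rp_filter_state eth1    # eth1 is the device facing both gateways in this thread
```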
Vincent Jaussaud
2002-Oct-29 16:32 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
On Mon, 2002-10-28 at 23:21, Julian Anastasov wrote:
> Hello,
>
> On 28 Oct 2002, Vincent Jaussaud wrote:
>
> > My question is, if we ensure that EVERY packets, whatever path they use
> > to arrive, finally pass through a single peer doing NAT, is this suppose
> > to work around my TOS problem ?
>
> Sounds correct. The requirement is each packet from one
> connection to be NAT-ed only from one NAT router and to same
> masquerade address and port. The routing cache can not guarantee
> that. It can be done only from the patched masquerade.

Hmmm... then that's why it doesn't work: the final gateway doing NAT isn't patched, only the first one is. I think I'll have to drop the idea of using both gateways simultaneously.

Now, suppose I only want to do fail-over (that is, only one gateway is used at a time, and the other one is used only if the first one breaks). I was thinking about using the metric value for this.

Let's say:

    ip route add table dual-gw proto static 192.168.0.0/24 via GW1 dev eth1 metric 1
    ip route add table dual-gw proto static 192.168.0.0/24 via GW2 dev eth1 metric 2

I assume the kernel will always use the best route, that is, the one with the best metric, so that all packets will use the same route. If GW1 breaks, the patched kernel should mark the first route as dead and force all further packets to use GW2 instead.

Is this supposed to work? Or can we use different metric values inside a multipath route, like:

    ip route add table dual-gw proto static 192.168.0.0/24 nexthop via GW1 dev eth1 metric 1 nexthop via GW2 dev eth1 metric 2

?

Anyway, the more I think about this setup, the more I think I should use a clustering solution instead. Maybe a cluster of gateways with one VIP is much more appropriate for what I want to build. I'll use multipath routing for ISP redundancy then :)

Thanks to both of you. I've learned a lot during the past few days; this was one of my main concerns too.

Thanks again.
Cheers,
Vincent.

-- 
Vincent Jaussaud
Kelkoo.com Security Manager
email: tatooin@kelkoo.com
Julian Anastasov
2002-Oct-29 22:31 UTC
Re: Re: multipath routing problem [Shorter version] - Help still needed :-)
Hello,

On 29 Oct 2002, Vincent Jaussaud wrote:

> I was thinking about using the metric value for this.
>
> Let's say:
>
> ip route add table dual-gw proto static 192.168.0.0/24 via GW1 dev eth1
> metric 1
> ip route add table dual-gw proto static 192.168.0.0/24 via GW2 dev eth1
> metric 2

No, the metrics help only if the routes disappear. This usually happens when a device goes down (for example, ppp). For gateways reachable via ARP it can't work. You need to define alternative routes (ip route append); see my docs about alternative routes that use the same metric.

> I assume the kernel will always use the best route, that is the one with
> best metric. So that all packets will use the same route.
> If GW1 breaks, patched kernel should mark first route as dead, and force
> all further packets to use GW2 instead.

No, dead gateway detection currently works only for routes with the same metric. Even then, the detection is passive and needs help from user space. Without such checks you can expect almost random results. On the other hand, you can run your own checks and keep only the routes that are alive.

Regards

--
Julian Anastasov <ja@ssi.bg>
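The last suggestion, running your own checks and keeping only the routes that are alive, could be sketched in shell roughly as follows. This is an illustration under stated assumptions, not the mechanism from Julian's patches or documentation: it probes each gateway with a single ping and prints an `ip route replace` command for the first gateway that answers, instead of maintaining same-metric alternative routes. The table name, network and device are taken from the thread; the gateway addresses in the usage line and the function names are hypothetical.

```shell
#!/bin/sh
TABLE="dual-gw"          # routing table name used in the thread
NET="192.168.0.0/24"     # remote network used in the thread
DEV="eth1"               # device facing both gateways

# Succeeds if the gateway answers one ping within 2 seconds.
is_alive() {
    ping -c 1 -w 2 "$1" >/dev/null 2>&1
}

# Print the first gateway from the argument list that is alive.
# Returns non-zero if none answers.
pick_gateway() {
    for gw in "$@"; do
        if is_alive "$gw"; then
            echo "$gw"
            return 0
        fi
    done
    return 1
}

# Print (without executing) the command that points the route at the
# first live gateway.
route_command() {
    gw=$(pick_gateway "$@") || { echo "no gateway alive" >&2; return 1; }
    echo "ip route replace table $TABLE proto static $NET via $gw dev $DEV"
}

# Usage (hypothetical gateway addresses), e.g. run periodically from cron:
#   route_command 10.0.0.1 10.0.0.2
```

Printing the command rather than running it keeps the sketch harmless to try; a real check would also have to decide how often to probe and how many failures count as "dead".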