Hi All,

As you probably all know :-) I'm trying to do the multi-isp thing. I've resolved my last issue with the route_rules as Tom and Jerry suggested.

Lately I have been seeing "transient" (I say transient because the problem will persist for a while and then magically clear itself up some number of minutes later) situations where my gateway will log:

Feb 9 17:23:45 gw.ilinx kernel: martian source 66.11.173.224 from 64.86.88.116, on dev eth1
Feb 9 17:23:45 gw.ilinx kernel: ll header: 00:a0:24:2a:1f:72:00:13:5f:07:97:05:08:00

but I'm not quite sure how to read these and/or what would be causing them.

Concerning the packet that the message is describing, I can assert that eth1 is the interface the packet would have arrived on, 64.86.88.116 would have been the party sending the packet, 00:A0:24:2A:1F:72 is indeed the address of my eth1, and 00:13:5f:07:97:05 is the router on the other end of that eth1. Finally, 66.11.173.224 is the address of my other Internet interface, a pppoe link.

So in the above messages, what is it trying to tell me about the packet that arrived, and what's the relevance of the 66.11.173.224 in it? All seems well except that 66.11.173.224.

Thanx,
b.

-- 
My other computer is your Microsoft Windows server.
Brian J. Murrell

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier.
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
On Fri, Feb 09, 2007 at 06:02:09PM -0500, Brian J. Murrell wrote:
> Lately I have been seeing "transient" (I say transient because the
> problem will persist for a while and then magically clear itself up some
> number of minutes later) situations where my gateway will log:
>
> Feb 9 17:23:45 gw.ilinx kernel: martian source 66.11.173.224 from 64.86.88.116, on dev eth1
> Feb 9 17:23:45 gw.ilinx kernel: ll header: 00:a0:24:2a:1f:72:00:13:5f:07:97:05:08:00
>
> but I'm not quite sure how to read these and/or what would be causing
> them.
>
> Concerning the packet that the message is describing I can assert that
> eth1 is the interface the packet would have arrived on and 64.86.88.116
> would have been the party sending the packet and indeed
> 00:A0:24:2A:1F:72 is the address of my eth1 and 00:13:5f:07:97:05 is the
> router on the other end of that eth1. Finally, 66.11.173.224 is the
> address of my other Internet interface, a pppoe link.
>
> So in the above messages, what is it trying to tell me about the packet
> that arrived and what's the relevance of the 66.11.173.224 in it? All
> seems well except that 66.11.173.224.

The message is somewhat obtusely phrased. The kernel has received a packet from 64.86.88.116 to 66.11.173.224 on eth1, and it doesn't like the source address for whatever reason, so it dropped the packet.

Most likely, 64.86.88.116 is not routable via eth1, which implies that either your routing tables are wrong or you need to disable return-path filtering on this interface (I still haven't been paying enough attention to know which, but you must disable rpfilter if your routing is asymmetric).

It's probably transient because the sending system notes that packets aren't getting through and tries a different route.

Other, less likely reasons: the kernel thinks that's a broadcast address, or something else that is not a unicast host address. The message sadly does not indicate which of the many obscure rules was violated.
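As an aid to reading the two log lines, here is a minimal Python sketch that splits them into fields. It assumes the layout described above: in the "martian source" line the first address is the packet's destination and the second its source, and the "ll header" line is the 14-byte Ethernet header (destination MAC, source MAC, EtherType). The function names `parse_martian` and `parse_ll_header` are made up for illustration and are not part of any tool.

```python
import re

# "martian source <dst> from <src>, on dev <dev>": destination first, then source.
MARTIAN_RE = re.compile(
    r"martian source (?P<dst>\d+\.\d+\.\d+\.\d+) "
    r"from (?P<src>\d+\.\d+\.\d+\.\d+), on dev (?P<dev>\S+)"
)


def parse_martian(line):
    """Return dst/src/dev from a 'martian source' log line, or None if no match."""
    match = MARTIAN_RE.search(line)
    return match.groupdict() if match else None


def parse_ll_header(line):
    """Split an 'll header' log line into the Ethernet header fields."""
    hex_part = line.split("ll header:", 1)[1].strip()
    octets = hex_part.split(":")
    if len(octets) < 14:
        raise ValueError("expected at least 14 octets of Ethernet header")
    return {
        "dst_mac": ":".join(octets[0:6]),    # NIC that received the frame
        "src_mac": ":".join(octets[6:12]),   # neighbour that sent the frame
        "ethertype": "0x" + "".join(octets[12:14]),  # 0x0800 is IPv4
    }


if __name__ == "__main__":
    print(parse_martian(
        "Feb 9 17:23:45 gw.ilinx kernel: martian source 66.11.173.224 "
        "from 64.86.88.116, on dev eth1"
    ))
    print(parse_ll_header(
        "Feb 9 17:23:45 gw.ilinx kernel: ll header: "
        "00:a0:24:2a:1f:72:00:13:5f:07:97:05:08:00"
    ))
```

Read this way, the frame came from the upstream router's MAC to eth1's MAC, carrying an IPv4 packet from 64.86.88.116 addressed to 66.11.173.224.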
On Fri, 2007-02-09 at 23:56 +0000, Andrew Suffield wrote:
>
> The message is somewhat obtusely phrased.

Indeed.

> The kernel has received a
> packet from 64.86.88.116 to 66.11.173.224 on eth1, and it doesn't like
> the source address for whatever reason,

Or the destination address, considering that it's the destination address for a different interface?

> so it dropped the packet. Most
> likely, 64.86.88.116 is not routable via eth1, which implies either
> your routing tables are wrong

# ip route ls
...
default
        nexthop via 72.38.184.1 dev eth1 weight 1
        nexthop via 192.168.200.1 dev ppp0 weight 1

That should make it routable, yes?

> or you need to disable return-path
> filtering on this interface (I still haven't been paying enough
> attention to know which, but you must disable rpfilter if your routing
> is asymmetric).

Well, it should not be. I do have two interfaces but they are in completely different subnets with different providers. IOW, completely independent of each other.

That's what makes it odd that a packet could arrive on my eth1 with a destination address of 66.11.173.224. The Internet would not route that destination address to my eth1 via my eth1 provider but rather to my ppp0 via my ppp0 provider.

But that packet should not even have that destination address, as it is replying to a packet I sent via my eth1 interface, which had a source address of my eth1 interface.

In fact a tcpdump shows that at the demarcation of my eth1 interface, addressing is indeed correct:

19:21:31.572939 IP 72.38.184.236.4697 > 64.86.88.116.3653: S 2034318562:2034318562(0) win 5648 <mss 1412,sackOK,timestamp 61683401 0,nop,wscale 2>
19:21:31.611442 IP 64.86.88.116.3653 > 72.38.184.236.4697: S 1578824716:1578824716(0) ack 2034318563 win 32768 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 61683401>

So somehow, I guess, in my gateway it's having its destination address rewritten? That seems strange/unlikely.

> It's probably transient because the sending system notes that packets
> aren't getting through and tries a different route.

Well, the sending system has no idea that my machine has these two different addresses, so I can't see how it would.

b.
Brian J. Murrell wrote:
> On Fri, 2007-02-09 at 23:56 +0000, Andrew Suffield wrote:
[...]
> In fact a tcpdump shows that at the demarcation of my eth1 interface,
> addressing is indeed correct:
>
> 19:21:31.572939 IP 72.38.184.236.4697 > 64.86.88.116.3653: S 2034318562:2034318562(0) win 5648 <mss 1412,sackOK,timestamp 61683401 0,nop,wscale 2>
> 19:21:31.611442 IP 64.86.88.116.3653 > 72.38.184.236.4697: S 1578824716:1578824716(0) ack 2034318563 win 32768 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 61683401>
>
> So somehow, I guess, in my gateway it's having its destination address
> rewritten? That seems strange/unlikely.
[...]

Just wondering how you have your masq file set up. I hope you're using the SNAT column in there.

Jerry
On Fri, 2007-02-09 at 19:38 -0600, Jerry Vonau wrote:
>
> Just wondering how you have your masq file setup, I hope you're using the
> SNAT column in there.

Yeah, I wondered if it could be that "make sure the packet has the right source address for the interface it's leaving on" masquing going on too, but no, it seems right:

Chain ppp0_masq (1 references)
...
    0     0 SNAT       all  --  *      *       72.38.184.236        0.0.0.0/0           policy match dir out pol none to:66.11.173.224

Chain eth1_masq (1 references)
...
    7   668 SNAT       all  --  *      *       66.11.173.224        0.0.0.0/0           policy match dir out pol none to:72.38.184.236

where eth1==72.38.184.236 and ppp0==66.11.173.224.

b.
Brian J. Murrell wrote:
> On Fri, 2007-02-09 at 19:38 -0600, Jerry Vonau wrote:
>> Just wondering how you have your masq file setup, I hope you're using the
>> SNAT column in there.
>
> Yeah, I wondered if it could be that "make sure the packet has the right
> source address for the interface it's leaving on" masquing going on too,
> but no, it seems right:
>
> Chain ppp0_masq (1 references)
> ...
>     0     0 SNAT       all  --  *      *       72.38.184.236        0.0.0.0/0           policy match dir out pol none to:66.11.173.224
>
> Chain eth1_masq (1 references)
> ...
>     7   668 SNAT       all  --  *      *       66.11.173.224        0.0.0.0/0           policy match dir out pol none to:72.38.184.236
>
> where eth1==72.38.184.236 and ppp0==66.11.173.224.
>
> b.
>

I guess you missed this part from the Multi-ISP page:

------
Regardless of whether you have masqueraded hosts or not, YOU MUST ADD THESE TWO ENTRIES TO /etc/shorewall/masq:

#INTERFACE     SUBNET              ADDRESS
eth0           130.252.99.27       206.124.146.176
eth1           206.124.146.176     130.252.99.27

Those entries ensure that traffic originating on the firewall always has the source IP address corresponding to the interface that it is routed out of.
-----

You should have these entries in there also:

eth1           66.11.173.224       72.38.184.236
ppp0           72.38.184.236       66.11.173.224

Jerry
On Fri, 2007-02-09 at 20:09 -0600, Jerry Vonau wrote:
> Brian J. Murrell wrote:
> >
> > Chain ppp0_masq (1 references)
> > ...
> >     0     0 SNAT       all  --  *      *       72.38.184.236        0.0.0.0/0           policy match dir out pol none to:66.11.173.224
> >
> > Chain eth1_masq (1 references)
> > ...
> >     7   668 SNAT       all  --  *      *       66.11.173.224        0.0.0.0/0           policy match dir out pol none to:72.38.184.236
> >
> > where eth1==72.38.184.236 and ppp0==66.11.173.224.
> >
> > b.
> >
>
> I guess you missed this part from the Multi-ISP page:
> ------
> Regardless of whether you have masqueraded hosts or not, YOU MUST ADD
> THESE TWO ENTRIES TO /etc/shorewall/masq:
>
> #INTERFACE     SUBNET              ADDRESS
> eth0           130.252.99.27       206.124.146.176
> eth1           206.124.146.176     130.252.99.27

No I didn't:

#INTERFACE     SUBNET              ADDRESS             PROTO   PORT(S) IPSEC
...
eth1           66.11.173.224       $ETH1_IP
ppp0           $ETH1_IP            66.11.173.224

> Those entries ensure that traffic originating on the firewall always has
> the source IP address corresponding to the interface that it is routed
> out of.

Right. Which AFAIK translates into the two rules in the two chains I pasted in my last e-mail (and which are quoted above).

> You should have these entries in there also:
>
> eth1           66.11.173.224       72.38.184.236
> ppp0           72.38.184.236       66.11.173.224

Yup, see above, given a params entry of:

ETH1_IP=$(find_first_interface_address eth1)

b.
Jerry Vonau wrote:
> Brian J. Murrell wrote:
>>
>> Chain ppp0_masq (1 references)
>> ...
>>     0     0 SNAT       all  --  *      *       72.38.184.236        0.0.0.0/0           policy match dir out pol none to:66.11.173.224
>>
>> Chain eth1_masq (1 references)
>> ...
>>     7   668 SNAT       all  --  *      *       66.11.173.224        0.0.0.0/0           policy match dir out pol none to:72.38.184.236
>>
>> where eth1==72.38.184.236 and ppp0==66.11.173.224.
>
> You should have these entries in there also:
>
> eth1           66.11.173.224       72.38.184.236
> ppp0           72.38.184.236       66.11.173.224

Which will generate the two rules that Brian posted.

-Tom
-- 
Tom Eastep    \ Nothing is foolproof to a sufficiently talented fool
Shoreline,     \ http://shorewall.net
Washington USA  \ teastep@shorewall.net
PGP Public Key   \ https://lists.shorewall.net/teastep.pgp.key
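To make the correspondence between the masq entries and the SNAT rules concrete, here is a rough Python sketch. It is not Shorewall's generator: the function name `masq_to_snat` is hypothetical, the chain naming simply follows the `<interface>_masq` chains shown in the listing above, and the "policy match dir out pol none" part of the real rules is left out for brevity.

```python
def masq_to_snat(entries):
    """Map masq lines (INTERFACE SUBNET ADDRESS) to approximate iptables commands.

    Illustration only: the rules Shorewall actually installs also carry a
    policy match, and are not emitted as literal command lines like these.
    """
    rules = []
    for line in entries:
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        interface, source, address = line.split()[:3]
        rules.append(
            f"iptables -t nat -A {interface}_masq "
            f"-s {source} -j SNAT --to-source {address}"
        )
    return rules


if __name__ == "__main__":
    masq = [
        "#INTERFACE SUBNET ADDRESS",
        "eth1 66.11.173.224 72.38.184.236",
        "ppp0 72.38.184.236 66.11.173.224",
    ]
    for rule in masq_to_snat(masq):
        print(rule)
```

Each entry says: traffic leaving INTERFACE with source SUBNET gets its source rewritten to ADDRESS, which is what the two chains in the listing do.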
On Fri, Feb 09, 2007 at 07:25:19PM -0500, Brian J. Murrell wrote:
> On Fri, 2007-02-09 at 23:56 +0000, Andrew Suffield wrote:
> >
> > The message is somewhat obtusely phrased.
>
> Indeed.
>
> > The kernel has received a
> > packet from 64.86.88.116 to 66.11.173.224 on eth1, and it doesn't like
> > the source address for whatever reason,
>
> Or the destination address, considering that it's the destination
> address for a different interface?

Not directly AFAIK - the destination address is used only to consider whether the source address is routable (ie, if you're using source routing). 'Martian' is conceptually a collection of vaguely related objections to the source address. However...

> > so it dropped the packet. Most
> > likely, 64.86.88.116 is not routable via eth1, which implies either
> > your routing tables are wrong
>
> # ip route ls
> ...
> default
>         nexthop via 72.38.184.1 dev eth1 weight 1
>         nexthop via 192.168.200.1 dev ppp0 weight 1
>
> That should make it routable, yes?

Only if it didn't match any other routes (and if you're doing weird things, I'm really not sure exactly what the kernel thinks is acceptable - the code is funky and not very well commented).

From one of your earlier mails, don't you have a source route for 66.11.173.224 that sends it out ppp0? I'm not completely certain, but I believe that will cause the kernel to reject anything destined for that address coming from other interfaces.

While it's not always true if your routing is complicated, rpfilter generally means "If I wouldn't route my reply to this packet back out of this interface, then it shouldn't be arriving at this interface" (which is almost exactly equivalent to "all routes must be symmetric").

> > or you need to disable return-path
> > filtering on this interface (I still haven't been paying enough
> > attention to know which, but you must disable rpfilter if your routing
> > is asymmetric).
>
> Well, it should not be. I do have two interfaces but they are in
> completely different subnets with different providers. IOW, completely
> independent of each other.
>
> That's what makes it odd that a packet could arrive on my eth1 with a
> destination address of 66.11.173.224. The Internet would not route that
> destination address to my eth1 via my eth1 provider but rather to my
> ppp0 via my ppp0 provider.

Then probably either this packet is malformed or you've got some weird NAT issue. Exactly why that may be happening is not immediately apparent to me.
Andrew Suffield wrote:
> Exactly why that may be happening is not immediately
> apparent to me.
>

Me neither -- that's why I've been (almost) quiet on this thread. I saw this when I was developing MultiISP, but both of my net interfaces were (by necessity) connected to the same switch. I finally had to use ebtables to suppress the 'martian' messages.

-Tom
On Sat, 2007-02-10 at 02:29 +0000, Andrew Suffield wrote:
> Not directly AFAIK - the destination address is used only to consider
> whether the source address is routable (ie, if you're using source
> routing). 'Martian' is conceptually a collection of vaguely related
> objections to the source address.

Right.

> Only if it didn't match any other routes

Ahhh. This switches on a lightbulb above my head...

> From one
> of your earlier mails, don't you have a source route for 66.11.173.224
> that sends it out ppp0?

I don't think so. But given this concept of "Only if it didn't match any other routes" and the fact that at times I will get these martian errors and other times not, for the exact same packet sequence, check this out...

# ip route get 64.86.88.116
64.86.88.116 via 192.168.200.1 dev ppp0 src 66.11.173.224
    cache mtu 1452 advmss 1412 metric 10 64
# ip route get 64.86.88.116
64.86.88.116 via 192.168.200.1 dev ppp0 src 66.11.173.224
    cache mtu 1452 advmss 1412 metric 10 64
# ip route get 64.86.88.116
64.86.88.116 via 192.168.200.1 dev ppp0 src 66.11.173.224
    cache mtu 1452 advmss 1412 metric 10 64
# ip route get 64.86.88.116
64.86.88.116 via 192.168.200.1 dev ppp0 src 72.38.184.236
    cache mtu 1452 advmss 1412 metric 10 64

That is simply executing the same command over and over again over a period of about 10-15 seconds. This appears to be the dual default route and load balancing at play... choosing a different default route at different times.

If the kernel does essentially the same as this "ip route get" when determining whether an inbound packet is routable through the interface it came in on, I can see how it would fail and think it's a martian.

What might be happening is that at a moment in time when the kernel is seeing the ppp0 route as the default and shorewall is defeating that default route through the FAQ #58 "forcing a default" route (i.e. through tcrules), it could determine that a packet arriving on eth1 is martian.

Sound too funky?

b.
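The hypothesis above can be illustrated with a toy Python model. It assumes a strict reverse-path test (the reply route must leave via the device the packet arrived on) and treats the multipath selection as an external input; neither assumption is confirmed as the kernel's actual behaviour in this thread, and `reverse_path_ok` and `classify` are invented names.

```python
# The two nexthops of the multipath default route from "ip route ls".
NEXTHOPS = [("72.38.184.1", "eth1"), ("192.168.200.1", "ppp0")]


def reverse_path_ok(in_dev, chosen_nexthop):
    """Strict reverse-path test: the reply must go out the arrival device."""
    _gateway, out_dev = chosen_nexthop
    return out_dev == in_dev


def classify(in_dev, nexthop_choices):
    """Verdict per packet, given which nexthop the reverse lookup picked each time."""
    return [
        "accept" if reverse_path_ok(in_dev, nexthop) else "martian"
        for nexthop in nexthop_choices
    ]


if __name__ == "__main__":
    eth1_hop, ppp0_hop = NEXTHOPS
    # The same packet on eth1, checked while the lookup happens to pick ppp0
    # twice and then eth1: the verdict flips without the packet changing.
    print(classify("eth1", [ppp0_hop, ppp0_hop, eth1_hop]))
```

Under these assumptions the identical packet is a martian whenever the reverse lookup lands on the ppp0 nexthop and acceptable when it lands on eth1, which would look like a transient problem from the outside.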
On Fri, 2007-02-09 at 21:52 -0500, Brian J. Murrell wrote:
>
> If the kernel does essentially the same as this "ip route get" when
> determining if an inbound packet is routable through the interface it
> came in on I can see how it would fail and think it's a martian.

I think this is the problem. I just had it happen again and solved it by adding a specific route to the "martian source" via the interface that it was a martian on, and the udp packets started flowing again. An immediate removal of the specific route left the "ip route get" returning the eth1 default route (route caching, I guess) and things are still flowing.

So it seems that having two default routes when you really only want one (i.e. no balance) is a bad thing. I will experiment with removing the ppp0 default route and see what kind of badness comes from that.

b.
Brian J. Murrell wrote:
> On Sat, 2007-02-10 at 02:29 +0000, Andrew Suffield wrote:
>
> If the kernel does essentially the same as this "ip route get" when
> determining if an inbound packet is routable through the interface it
> came in on I can see how it would fail and think it's a martian.
>

If you search the kernel source for the Martian message, you can read the relevant code. When you understand it, I would be delighted if you could explain it to me.

In the meantime, what Shorewall version are you running?

-Tom
On Fri, Feb 09, 2007 at 09:52:48PM -0500, Brian J. Murrell wrote:
> What might be happening is that at a moment in time when the kernel is
> seeing the ppp0 route as the default and shorewall is defeating that
> default route through the FAQ #58 "forcing a default" route (i.e.
> through tcrules) it could determine that a packet arriving on eth1 is
> martian.
>
> Sound too funky?

I can't see any obvious reason why that couldn't happen, but I don't really understand the guts of the routing code. I'd call it a kernel bug if that's what is really going on. You'll probably have to find a kernel hacker to get a meaningful answer.
On Fri, Feb 09, 2007 at 07:07:17PM -0800, Tom Eastep wrote:
> Brian J. Murrell wrote:
> > On Sat, 2007-02-10 at 02:29 +0000, Andrew Suffield wrote:
> >
> > If the kernel does essentially the same as this "ip route get" when
> > determining if an inbound packet is routable through the interface it
> > came in on I can see how it would fail and think it's a martian.
> >
>
> If you search the kernel source for the Martian message, you can read
> the relevant code.

I've read the code and I think this suggestion falls under "cruel and unusual".

> When you understand it, I would be delighted if you
> could explain it to me.

In case it's of any use, here are the parts I do understand (2.6.19.1 behaviour, since I had that unpacked here):

The error message in question can be reached by two parts of the ipv4 routing code. The first is when fib_validate_source() returns an error, and the second is the following bits of code from ip_route_input_slow():

	/* Check for the most weird martians, which can be not detected
	   by fib_lookup.
	 */

	if (MULTICAST(saddr) || BADCLASS(saddr) || LOOPBACK(saddr))
		goto martian_source;
...
	/* Accept zero addresses only to limited broadcast;
	 * I even do not know to fix it or not. Waiting for complains :-)
	 */
	if (ZERONET(saddr))
		goto martian_source;

(I do not understand this comment, but the tests are all pretty obvious - 0/8, 127/8, anything multicast, and anything in the undefined space above multicast)

fib_validate_source() is sadly far harder to understand without going through and figuring out how all the fib_* stuff works, which is quite a lot of intricate, poorly commented code written by insane people. Here's the sole comment, which hopefully resembles what this function actually does:

/* Given (packet source, input interface) and optional (dst, oif, tos):
   - (main) check, that source is valid i.e. not broadcast or our local
     address.
   - figure out what "logical" interface this packet arrived
     and calculate "specific destination" address.
   - check, that packet arrived from expected physical interface.
 */

This code corresponds to the rpfilter option on the interface, but it does some other tests against the routing table even when rpfilter is disabled - I don't know what to make of that. As far as it makes sense to me (and this may be wrong), the algorithm goes:

  Swap the source and destination from the packet we're considering, and look up the result in the routing table. If that doesn't work, reject if rpfilter is enabled, otherwise accept.

  If the route we find is not a gateway or direct route, reject (ie, if it's multicast/blackhole/whatever).

  If the route we find says to send the packet out the device on which this packet was received OR (something I don't understand involving multipath routes), then: reject if the scope of this route is 'host' or 'nowhere', else accept.

  If rpfilter is enabled on the interface where the packet arrived, reject.

  Modify the query (in some way I don't understand), and try looking it up in the routing table again. If that doesn't work, accept.

  If we did find a route this time, and it's not a gateway or direct route, or its scope is 'host' or 'nowhere', reject. Otherwise accept.

It's times like this when I remember why I make a point of never doing any actual work on the kernel.
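For the "pretty obvious" source-address tests, here is a small Python rendering using the standard `ipaddress` module. It covers only the four ranges named above (0/8, 127/8, multicast, and the space above multicast) and deliberately ignores the destination address, so the limited-broadcast exception hinted at in the kernel comment is not modelled; `obviously_martian_source` is an invented name, not a kernel symbol.

```python
import ipaddress

# Ranges corresponding to the macros in the quoted kernel snippet.
MULTICAST = ipaddress.ip_network("224.0.0.0/4")
BADCLASS = ipaddress.ip_network("240.0.0.0/4")   # space above multicast
LOOPBACK = ipaddress.ip_network("127.0.0.0/8")
ZERONET = ipaddress.ip_network("0.0.0.0/8")


def obviously_martian_source(saddr):
    """True if the source address falls in a range rejected outright."""
    addr = ipaddress.ip_address(saddr)
    return any(addr in net for net in (MULTICAST, BADCLASS, LOOPBACK, ZERONET))


if __name__ == "__main__":
    for candidate in ("64.86.88.116", "127.0.0.1", "224.0.0.5", "0.1.2.3"):
        print(candidate, obviously_martian_source(candidate))
```

The source address from the log, 64.86.88.116, passes all four of these tests, which is consistent with the message having come from the fib_validate_source() path rather than from these checks.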
On Fri, 2007-02-09 at 19:07 -0800, Tom Eastep wrote:
>
> In the meantime, what Shorewall version are you running?

3.2.3

Perhaps the rule should be, in shorewall, that if you run multi-isp with balance, you CANNOT use rp_filter and instead shorewall should[1] install anti-spoofing rules for you.

[1] Can it? Is there always enough information in the config files to construct an all-inclusive set of anti-spoofing rules? If not, is there even optionally a way to specify everything needed for comprehensive anti-spoofing rules?

b.
Brian J. Murrell wrote:
> On Fri, 2007-02-09 at 19:07 -0800, Tom Eastep wrote:
>> In the meantime, what Shorewall version are you running?
>
> 3.2.3
>
> Perhaps the rule should be, in shorewall, that if you run multi-isp with
> balance, you CANNOT use rp_filter and instead shorewall should[1]
> install anti-spoofing rules for you.
>
> [1] Can it? Is there always enough information in the config files to
> construct an all inclusive set of anti-spoofing rules? If not, is there
> even optionally a way to specify everything needed for comprehensive
> anti-spoofing rules?

Before we start discussing remedies, I think we need to understand how a packet addressed to your pppoe interface arrived from your other ISP's router. Please run a tcpdump on eth1 filtering on host 66.11.173.224. That way, we can see what these packets are.

Also, do you run any client applications on the firewall box that initiate connections to the Internet?

-Tom
On Sat, 2007-02-10 at 08:41 -0800, Tom Eastep wrote:
>
> Before we start discussing remedies, I think we need to understand how a
> packet addressed to your pppoe interface arrived from your other ISP's
> router.

I don't think it did. I think by the time routing got it, it was rewritten.

> Please run a tcpdump on eth1 filtering on host 66.11.173.224. That
> way, we can see what these packets are.

I think I already posted a tcpdump in this thread that showed the actual packets that were being considered martians, and at tcpdump time they were being addressed to the correct address. In this message:

http://article.gmane.org/gmane.comp.security.shorewall/15379

> Also, do you run any client applications on the firewall box that initiate
> connections to the Internet?

I do. OpenVPN, among other "gateway" kinds of software. Also freenet6 (ipv6 tunnelling), which is what the traffic in question in this thread is.

b.

Brian J. Murrell wrote:
>
> I think I already posted a tcpdump in this thread that showed the actual
> packets that were being considered martians and at tcpdump time, they
> were being addressed to the correct address. In this message:
>
> http://article.gmane.org/gmane.comp.security.shorewall/15379

I see. So it would seem that martian filtering is occurring *after* the
destination address is getting rewritten. That seems bogus.

>
>> Also, do you run any client applications on the firewall box that initiate
>> connections to the Internet?
>
> I do. OpenVPN among others "gateway" kind of software. freenet6 (ipv6
> tunnelling) which is what the traffic in question in this thread is.
>

Do you set 'loose' in /etc/shorewall/providers?

Better yet, can you forward the output of "shorewall dump" please.

Thanks,
-Tom

On Sat, Feb 10, 2007 at 11:54:45AM -0800, Tom Eastep wrote:
> Brian J. Murrell wrote:
> >
> > I think I already posted a tcpdump in this thread that showed the actual
> > packets that were being considered martians and at tcpdump time, they
> > were being addressed to the correct address. In this message:
> >
> > http://article.gmane.org/gmane.comp.security.shorewall/15379
>
> I see. So it would seem that martian filtering is occurring *after* the
> destination address is getting rewritten. That seems bogus.

It runs as part of the routing decision, wherever that fits into the
process.

Andrew Suffield wrote:
> On Sat, Feb 10, 2007 at 11:54:45AM -0800, Tom Eastep wrote:
>> Brian J. Murrell wrote:
>>
>>> I think I already posted a tcpdump in this thread that showed the actual
>>> packets that were being considered martians and at tcpdump time, they
>>> were being addressed to the correct address. In this message:
>>>
>>> http://article.gmane.org/gmane.comp.security.shorewall/15379
>> I see. So it would seem that martian filtering is occurring *after* the
>> destination address is getting rewritten. That seems bogus.
>
> It runs as part of the routing decision, wherever that fits into the
> process.

That's consistent with what we're seeing.

The best way to work around this is to configure applications on the
firewall so that they use the local IP address that corresponds to the
interface that you want them to use. That approach is mentioned on the
Multi-ISP page in the section entitled "Applications running on the
Firewall".

-Tom
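
[Editorial sketch] As an illustration of what "use the local IP address that corresponds to the interface" means at the socket level, here is a minimal Python sketch. The helper name open_bound_udp_socket is hypothetical and the demonstration binds to loopback so it runs anywhere; on the gateway in this thread the address would presumably be the one assigned to the chosen interface (72.38.184.236 for eth1). How a given application such as OpenVPN exposes this is up to that application's own configuration.

```python
import socket


def open_bound_udp_socket(local_ip: str, local_port: int = 0) -> socket.socket:
    """Create a UDP socket whose source address is pinned to local_ip.

    Hypothetical helper for illustration; not taken from Shorewall or OpenVPN.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Binding before sending fixes the source address of outgoing datagrams,
    # instead of leaving the choice to the kernel's route lookup.
    sock.bind((local_ip, local_port))
    return sock


# Demonstration on loopback so the sketch is runnable on any machine.
sock = open_bound_udp_socket("127.0.0.1")
bound_ip, bound_port = sock.getsockname()
sock.close()
print(bound_ip, bound_port)
```

With the source address fixed this way, source-based routing rules (if the firewall has them) have a stable address to match on.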

On Sat, 2007-02-10 at 13:31 -0800, Tom Eastep wrote:
>
> That's consistent with what we're seeing.
>
> The best way to work around this is to configure applications on the
> firewall so that they use the local IP address that corresponds to the
> interface that you want them to use.

See, this is all a red herring, I think. I think the outgoing address,
the return address, and the interface the packets are coming back in on
are all correct. I think the SNAT/masq'ing that ensures the correct source
address is set is working fine. As for the routing back to the source
address I specify, that has to be working or the Internet would be really
broken.

The only problem at hand seems to be this martian detection/rp_filtering
and which default route is currently the default in use (given load
balancing).

Unfortunately, nobody on the LARTC list has either confirmed or denied my
supposition.

b.

-- 
Brian J. Murrell

On Sat, Feb 10, 2007 at 08:10:50PM -0500, Brian J. Murrell wrote:
> that has to be working or the Internet
> would be really broken if it were not.

Never rule out the possibility that the internet is really broken. We
live in a world where Cisco routers are in common use. They can do all
kinds of weird and incomprehensible things.

[Favourite example: a graph showing that packets comprised entirely of
the letter 'i' were forwarded much faster than packets comprised of
'k' or 'l'. After finally managing to convince the support desk that
this was not a wind-up, the router was quietly accepted for warranty
replacement]

On Sat, 2007-02-10 at 20:10 -0500, Brian J. Murrell wrote:
> On Sat, 2007-02-10 at 13:31 -0800, Tom Eastep wrote:
> >
> > That's consistent with what we're seeing.
> >
> > The best way to work around this is to configure applications on the
> > firewall so that they use the local IP address that corresponds to the
> > interface that you want them to use.
>
> See, this is all a red herring I think. I think the outgoing address
> and the return address and the interface they are coming back in on is
> all correct. I think the SNAT/masq'ing to ensure the correct source
> address is being set is working fine and as for the routing back given
> the source address I specify, that has to be working or the Internet
> would be really broken if it were not.

OK. So it's not so much a red herring, but it is something very
strange. Here's the tcpdump on my eth1 NIC, whose address is gw-eth1:

$ grep gw-eth1 /etc/hosts
72.38.184.236 gw-eth1

20:19:13.409971 IP gw-eth1.1194 > CPE00a0c98cb9ff-CM000a73a15345.cpe.net.cable.rogers.com.1194: UDP, length 124
20:19:13.433470 IP CPE00a0c98cb9ff-CM000a73a15345.cpe.net.cable.rogers.com.1194 > gw-eth1.1194: UDP, length 124
20:19:14.413857 IP gw-eth1.1194 > CPE00a0c98cb9ff-CM000a73a15345.cpe.net.cable.rogers.com.1194: UDP, length 124
20:19:14.437204 IP CPE00a0c98cb9ff-CM000a73a15345.cpe.net.cable.rogers.com.1194 > gw-eth1.1194: UDP, length 124
20:19:15.414026 IP gw-eth1.1194 > CPE00a0c98cb9ff-CM000a73a15345.cpe.net.cable.rogers.com.1194: UDP, length 124
20:19:15.439276 IP CPE00a0c98cb9ff-CM000a73a15345.cpe.net.cable.rogers.com.1194 > gw-eth1.1194: UDP, length 124
20:19:16.413976 IP gw-eth1.1194 > CPE00a0c98cb9ff-CM000a73a15345.cpe.net.cable.rogers.com.1194: UDP, length 124
20:19:16.437514 IP CPE00a0c98cb9ff-CM000a73a15345.cpe.net.cable.rogers.com.1194 > gw-eth1.1194: UDP, length 124

All source and destination addresses are correct. Packets are leaving
and coming back properly addressed.

Yet:

Feb 10 20:19:09 gw.ilinx kernel: ll header: 00:a0:24:2a:1f:72:00:13:5f:07:97:05:08:00
Feb 10 20:19:14 gw.ilinx kernel: printk: 4 messages suppressed.
Feb 10 20:19:14 gw.ilinx kernel: martian source 66.11.173.224 from 74.111.215.93, on dev eth1
Feb 10 20:19:14 gw.ilinx kernel: ll header: 00:a0:24:2a:1f:72:00:13:5f:07:97:05:08:00
Feb 10 20:19:19 gw.ilinx kernel: printk: 4 messages suppressed.
Feb 10 20:19:19 gw.ilinx kernel: martian source 66.11.173.224 from 74.111.215.93, on dev eth1
Feb 10 20:19:19 gw.ilinx kernel: ll header: 00:a0:24:2a:1f:72:00:13:5f:07:97:05:08:00
Feb 10 20:19:24 gw.ilinx kernel: printk: 4 messages suppressed.
Feb 10 20:19:24 gw.ilinx kernel: martian source 66.11.173.224 from 74.111.215.93, on dev eth1
Feb 10 20:19:24 gw.ilinx kernel: ll header: 00:a0:24:2a:1f:72:00:13:5f:07:97:05:08:00

And:

# ip route get 74.111.215.93
74.111.215.93 via 192.168.200.1 dev ppp0 src 66.11.173.224
    cache mtu 1452 advmss 1412 metric 10 64

To give some numbers to some names:

$ grep gw-eth1 /etc/hosts
72.38.184.236 gw-eth1

$ host CPE00a0c98cb9ff-CM000a73a15345.cpe.net.cable.rogers.com
CPE00a0c98cb9ff-CM000a73a15345.cpe.net.cable.rogers.com has address 74.111.215.93

# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:A0:24:2A:1F:72
     inet addr:72.38.184.236 Bcast:72.38.185.255 Mask:255.255.254.0

[root@gw shorewall]# arp -an
? (72.38.184.1) at 00:13:5F:07:97:05 [ether] on eth1

So I'd say packets are definitely addressed correctly at my eth1
demarcation.

And now, at this moment, it's again working and here we can see:

# ip route get 74.111.215.93
74.111.215.93 via 192.168.200.1 dev ppp0 src 72.38.184.236
    cache mtu 1452 advmss 1412 metric 10 64

So I am convinced that the problem is the flip-flopping of the active
default route to achieve load balancing. And I consider this a bug in
the rp_filter functionality. I'd say that every default route needs to
be examined when doing reverse-path analysis, not just the one that is
currently active.

Even without the kind of trickery we are using to force the use of one
of the default routes, there will always be a race between when a packet
leaves a machine and when the machine decides to "flip" the active
default route.

I wonder if I should file a bug report for this and see how much traction
I get.

As for the 66.11.173.224 (i.e. the address of the ppp0 interface)
appearing in the martian log entries, I'm beginning to think that the
error message is just printing it for whatever reason. Perhaps it's
trying to tell us the interface it would have expected the packet on, in
a roundabout, obtuse way. I don't think it's bad routing of the packet
back to my machine, or even rewriting of the destination address of the
packet as I once suspected.

b.

-- 
Brian J. Murrell
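
[Editorial sketch] Brian's hypothesis and his proposed alternative can be put side by side in a small Python toy model. This is only a model of the argument made in the message above; it does not reproduce the kernel's actual rp_filter logic, and the function names and the two-device setup are assumptions for illustration.

```python
from typing import Sequence


def strict_reverse_path_ok(arrival_dev: str, active_route_dev: str) -> bool:
    """Hypothesised behaviour: only the currently active route back to the
    sender is consulted, so arrival on any other device is flagged."""
    return arrival_dev == active_route_dev


def any_default_route_ok(arrival_dev: str, default_route_devs: Sequence[str]) -> bool:
    """Proposed behaviour: accept the packet if *any* configured default
    route points out of the device it arrived on."""
    return arrival_dev in default_route_devs


# Two balanced default routes, as on the gateway in this thread.
DEFAULT_ROUTE_DEVS = ("eth1", "ppp0")
ARRIVAL_DEV = "eth1"  # the replies in the tcpdump arrive on eth1

results = {}
for active_dev in DEFAULT_ROUTE_DEVS:
    # Each entry is (strict check result, any-default-route check result).
    results[active_dev] = (
        strict_reverse_path_ok(ARRIVAL_DEV, active_dev),
        any_default_route_ok(ARRIVAL_DEV, DEFAULT_ROUTE_DEVS),
    )
    print(active_dev, results[active_dev])
```

In this model the strict check fails only while ppp0 is the active default route, which would match the "transient" pattern described in the thread, while the any-default-route check passes in both states.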

Brian J. Murrell wrote:
>
> OK. So it's not so much a red herring, but it is something very
> strange.
>

I'm still waiting for the output of "shorewall dump".

-Tom

Brian J. Murrell wrote:
>
> So I am convinced that the problem is the flip-flopping of the active
> default route to achieve load balancing. And I consider this a bug in
> the rp_filter functionality.

Alternatively, I've long suspected that rp_filter doesn't take the packet
mark into consideration.

One experiment you can run for me: When you are seeing this:

> # ip route get 74.111.215.93
> 74.111.215.93 via 192.168.200.1 dev ppp0 src 66.11.173.224
>     cache mtu 1452 advmss 1412 metric 10 64

What does "ip route get 74.111.215.93 from 72.38.184.236" give you?

>
> As for the 66.11.173.224 (i.e. the address of the ppp0 interface)
> appearing in the martian log entry source addresses, I'm beginning to
> think that that is just the error message printing that for whatever
> reason.

From your 'shorewall dump':

Chain eth1_masq (1 references)
 pkts bytes target prot opt in out source         destination
 ...
    3   456 SNAT   all  --  *  *   66.11.173.224  0.0.0.0/0    policy match dir out pol none to:72.38.184.236

So the SNAT rule has been exercised at some point. If any of those
connections was your freenet6 application, it would explain how the ppp0
address got into the Martian messages.

-Tom
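
[Editorial sketch] As a reading aid for the martian entries discussed in this thread, the following Python sketch splits the two log lines into fields. It follows the interpretation given earlier in the thread (the first address is the packet's destination, the second its source) and the standard Ethernet header layout (destination MAC, source MAC, EtherType). The function and field names are my own, not anything the kernel defines.

```python
import re

# "martian source <destination> from <source>, on dev <device>"
MARTIAN_RE = re.compile(
    r"martian source (?P<dst>\S+) from (?P<src>\S+), on dev (?P<dev>\S+)"
)


def parse_martian(line: str):
    """Return destination, source and device from a martian log line, or None."""
    match = MARTIAN_RE.search(line)
    return match.groupdict() if match else None


def parse_ll_header(line: str):
    """Split the 14-byte Ethernet header dump into its three fields."""
    octets = line.split("ll header:", 1)[1].strip().split(":")
    return {
        "dst_mac": ":".join(octets[0:6]),     # receiving NIC (eth1 here)
        "src_mac": ":".join(octets[6:12]),    # sending router
        "ethertype": "".join(octets[12:14]),  # 0800 means IPv4
    }


martian = parse_martian(
    "Feb 10 20:19:14 gw.ilinx kernel: martian source 66.11.173.224 "
    "from 74.111.215.93, on dev eth1"
)
header = parse_ll_header(
    "Feb 10 20:19:14 gw.ilinx kernel: ll header: "
    "00:a0:24:2a:1f:72:00:13:5f:07:97:05:08:00"
)
print(martian)
print(header)
```

Applied to the entries above, this yields the ppp0 address as the destination the kernel saw, the OpenVPN peer as the source, and the eth1 and ISP router MAC addresses Brian already identified.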

On Sun, 2007-02-11 at 11:04 -0800, Tom Eastep wrote:
> Alternatively, I've long suspected that rp_filter doesn't take the packet mark
> into consideration.

I would be quite surprised if rp_filter did that.

> What does "ip route get 74.111.215.93 from 72.38.184.236" give you?

# ip route get 74.111.215.93 from 72.38.184.236
74.111.215.93 from 72.38.184.236 via 72.38.184.1 dev eth1
    cache mtu 1500 advmss 1460 metric 10 64

> From your 'shorewall dump':
>
> Chain eth1_masq (1 references)
>  pkts bytes target prot opt in out source         destination
>  ...
>     3   456 SNAT   all  --  *  *   66.11.173.224  0.0.0.0/0    policy match dir out pol none to:72.38.184.236
>
> So the SNAT rule has been exercised at some point.

That is what I was thinking at some point too. But I can't see how, as
it's only supposed to happen on packets going out of eth1. To test, I've
added a logging rule right before that SNAT rule:

Chain eth1_masq (1 references)
num pkts bytes target prot opt in out source         destination
...
7      0     0 LOG    all  --  *  *   66.11.173.224  0.0.0.0/0    LOG flags 0 level 6 prefix `SNATting:'
8      4   608 SNAT   all  --  *  *   66.11.173.224  0.0.0.0/0    policy match dir out pol none to:72.38.184.236

and when I trigger the situation, though many packets were sent and many
martians logged, there is only a single occurrence of:

SNATting:IN= OUT=eth1 SRC=66.11.173.224 DST=74.111.215.93 LEN=152 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=1194 DPT=1194 LEN=132

and that one looks pretty legit to me.

I'm really leaning towards this being a logging anomaly/strangeness
rather than actual packet manipulation.

> If any of those connections
> was your freenet6 application,

Naw, this was plain old openvpn. I was trying to ping something on the
other end of an openvpn tunnel.

> it would explain how the ppp0 address got into
> the Martian messages.

b.

-- 
Brian J. Murrell
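
[Editorial sketch] The two "ip route get" results in this exchange differ only in whether a source address is supplied. A toy Python model of that difference follows, assuming the firewall has one source-specific rule per provider address that is consulted before the shared, balanced default route. The SOURCE_RULES table and the route_get function are hypothetical and only mirror the outputs quoted in the thread; they are not generated from the actual routing rules on Brian's gateway.

```python
from typing import Optional, Tuple

Route = Tuple[str, str]  # (gateway, device)

# Hypothetical per-provider rules: traffic sourced from a provider's address
# leaves through that provider, regardless of the active default route.
SOURCE_RULES = {
    "72.38.184.236": ("72.38.184.1", "eth1"),
    "66.11.173.224": ("192.168.200.1", "ppp0"),
}


def route_get(
    dst: str,
    src: Optional[str] = None,
    active_default: Route = ("192.168.200.1", "ppp0"),
) -> Route:
    """Mimic 'ip route get DST [from SRC]' for destinations that are only
    reachable through a default route (dst itself is not inspected here)."""
    if src in SOURCE_RULES:
        return SOURCE_RULES[src]
    # Without a source, the lookup falls through to whichever balanced
    # default route happens to be active at the moment.
    return active_default


with_source = route_get("74.111.215.93", src="72.38.184.236")
without_source = route_get("74.111.215.93")
print(with_source)
print(without_source)
```

Under these assumptions the lookup with a source address is stable, while the lookup without one depends on the currently active default route, which is the asymmetry the thread is circling around.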