I've posted this in another form recently without success. I realize that this may not be too interesting a problem, but I need to get it resolved. Even if this is not the right place to ask, some direction as to where I can get answers would still be helpful.

I'm having a very strange problem with my rule-based routing. I've narrowed it down quite a bit and I'm not sure where to go from here. It seems like it could be a bug in the code or something. If it is, I wouldn't think it too hard to fix (famous last words), but having never hacked at the kernel before, I wouldn't even know where to begin looking.

Anyway, let me describe my problem. I've narrowed things down to a very simple test case. For starters, when I use the "ip rule ls" command to list my rules, I see the following:

#cygnus:~> ip rule ls
0: from all lookup local
32766: from all lookup main
32767: from all lookup default

Quite normal. Just as a refresher, here's my topology again:

                           |                +-------+
                           +----------------+ Orion +
                           |                +-------+
                           |               (172.X.X.2)
                           |(172.X.X.1)
                           |eth0
  _/\__/\_             +---+----+            _/\__/\_
 /        \     (63...)| Cygnus |(204...)   /        \
( Internet )-----------+(Router)+----------( Internet )
 \_  __  _/        aps0|        |eth2       \_  __  _/
   \/  \/              +----+---+             \/  \/
                        eth1|63..
                            |204..x
                            |
          --+---------------+----------+--   <---single physical net
            |                          |         (i.e. one hub)
            |                          |
        +---+---+ 63..1            +---+---+ 63..2
        | Linux | 63..4            | Linux | 63..3
        +-------+ 204..1           +-------+ 204..2
                  204..4                     204..3

Starting with all my interfaces up, and with the rules above, I run the following script (slightly edited to protect the guilty):

#!/bin/sh
#
##############################################################################
# Define routing rules
##############################################################################

#rules for packets coming in eth0 (LAN)
ip rule add iif eth0 to 204.x.x.0/24 lookup to-lan priority 100
ip rule add iif eth0 to 172.x.x.1/32 lookup main priority 110

#catch all rule
ip rule add from 0.0.0.0/0 type blackhole lookup bit-bucket priority 500

##############################################################################
# Create routing tables referenced by rules above
# Note: the table names used below must exist in the
# /etc/iproute2/rt_tables file
##############################################################################

#to-lan table routes
ip route add default dev eth0 table to-lan

#bit-bucket table routes
ip route add blackhole default table bit-bucket

# Make rules/routes active
ip route flush cache

# Enable IP forwarding since it is disabled by default
echo "1" > /proc/sys/net/ipv4/ip_forward

# Enable automatic IP defragmenting since it is disabled by default
echo "1" > /proc/sys/net/ipv4/ip_always_defrag
#---------end script

When I'm all done, an ip rule ls shows the following:

0: from all lookup local
100: from all to 204.x.x.0/24 iif eth0 lookup to-lan
110: from all to 172.x.x.1 iif eth0 lookup main
500: from all lookup bit-bucket blackhole
32766: from all lookup main
32767: from all lookup default

So far so good. I can now hop over to orion and begin to test. I set the default gw on orion to point to 172.x.x.1 and try to ping 204.x.x.2 (our dns server), which answers back fine.
So rule 100 is working and redirecting things to the cisco router on our 172 network, which has that particular 204 network attached to it.

But when I ping 172.x.x.1, cygnus' address, I get nothing. Hopping over to cygnus' terminal and running tcpdump shows me that the packets are indeed arriving, but they aren't making it. As it turns out, they are getting blackholed by rule 500 above. I know this because if I delete rule 500 from the command line, the ping starts getting responses; furthermore, if I delete rule 32766 after that, it quits again.

What seems to be happening is that, for some reason, incoming packets are not matching the specific local address given in rule 110, and wind up matching the blackhole rule that follows. I've tried various permutations of the offending rule, and it seems that anything which tries to match an address or address range on a locally attached network won't match. Sure, ip rule accepts the rule and adds it OK, but the kernel won't match the packets like it should.

Incidentally, routing across cygnus works just fine. If I match the destination address range of the boxes on the other side of cygnus and route the packets to the DMZ, everything works great. It's almost as if there is a bug somewhere in the rule matching code which doesn't properly handle this specific condition.

One important gotcha I found when testing: after every change you make, you have to run "ip route flush cache" for it to take effect. I've been down that road already; we're not dealing with that here.

Anyway, I've troubleshot this about as far as I can with the knowledge I currently have, and I was hoping someone out there might have some useful suggestions. One thing I thought of doing was trying a more recent kernel (I'm currently using 2.2.17 on cygnus), but that doesn't seem to help. I compiled and installed 2.2.18 and experienced the exact same symptoms.

Thanks in advance for any help. If this advanced routing is going to be of any use to me, and if I'm ever going to get a howto written on what I'm doing (see subject "A complicated routing scenario (for me at least)" in the archives), I need to get this resolved.

-Andrew

--
depaan@bibleinfo.com
--------------------------------------------------------------
Want answers to life's big questions? Visit www.bibleinfo.com.
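
One check that may be worth making here, offered as a guess that the thread itself does not confirm: a rule with an "iif" selector is matched only by packets arriving on that interface, so a reply that cygnus generates itself would skip rules 100 and 110 and be judged by the remaining rules in priority order. The sketch below filters an "ip rule ls" listing down to the rules that carry no "iif" selector. The helper name rules_without_iif is hypothetical, and the listing fed to it is the one from the post.

```shell
#!/bin/sh
# Sketch only: rules_without_iif is a hypothetical helper, not part of iproute2.
# It reads "ip rule ls" output on stdin and prints, in listing order, every
# rule that has no "iif" selector, i.e. the rules that a packet with no
# matching input interface can still hit.
rules_without_iif() {
    awk '
        {
            has_iif = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "iif") has_iif = 1
            }
            if (!has_iif) print
        }
    '
}

# Rule listing copied from the post above.
rules_without_iif <<'EOF'
0: from all lookup local
100: from all to 204.x.x.0/24 iif eth0 lookup to-lan
110: from all to 172.x.x.1 iif eth0 lookup main
500: from all lookup bit-bucket blackhole
32766: from all lookup main
32767: from all lookup default
EOF
```

For the listing in the post this leaves rules 0, 500, 32766 and 32767, so rule 500 would be the first rule after the local table that such a packet reaches. On the router itself, "ip rule ls | rules_without_iif" gives the live view, and "ip route get 172.x.x.2 from 172.x.x.1" asks the kernel which route it would pick for the reply.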
Andrew
2000-Dec-14 21:25 UTC
Re: Advanced Routing problem (Can someone PLEASE answer this!)
Hello, see comments below.

> > ##############################################################################
> > # Define routing rules
> > ##############################################################################
> >
> > #rules for packets coming in eth0 (LAN)
> > ip rule add iif eth0 to 204.x.x.0/24 lookup to-lan priority 100
> > ip rule add iif eth0 to 172.x.x.1/32 lookup main priority 110
>
> 172.x.x.1/32 -- I'd say just offhand that needs to be 172.x.x.1/24. Why
> 32?

The point of the .../32 is to specify a single unique ip address. /24 would specify the whole subnet (254 usable host addresses), and I don't want that. I could just as easily have said "ip rule add iif eth0 to 172.x.x.1 lookup main priority 110". The ip command will take it either way. If you want proof, look below at the second listing with "ip rule ls", or try it yourself. You will see that rule 110 is listed as a unique ip address.

>
> >
> > #catch all rule
> > ip rule add from 0.0.0.0/0 type blackhole lookup bit-bucket priority 500
> >
> > ##############################################################################
> > # Create routing tables referenced by rules above
> > # Note: the table names used below must exist in the
> > # /etc/iproute2/rt_tables file
> > ##############################################################################
> >
> > #to-lan table routes
> > ip route add default dev eth0 table to-lan
> >
> > #bit-bucket table routes
> > ip route add blackhole default table bit-bucket
> >
> > # Make rules/routes active
> > ip route flush cache
> >
> > # Enable IP forwarding since it is disabled by default
> > echo "1" > /proc/sys/net/ipv4/ip_forward
> >
> > # Enable automatic IP defragmenting since it is disabled by default
> > echo "1" > /proc/sys/net/ipv4/ip_always_defrag
> > #---------end script
> >
> > When I'm all done, an ip rule ls shows the following
> >
> > 0: from all lookup local
> > 100: from all to 204.x.x.0/24 iif eth0 lookup to-lan
> > 110: from all to 172.x.x.1 iif eth0 lookup main
> > 500: from all lookup bit-bucket blackhole
> > 32766: from all lookup main
> > 32767: from all lookup default
> >
> > So far so good. I can now hop over to orion and begin to test. I set
> > the default gw on orion to point to 172.x.x.1 and try to ping
> > 204.x.x.2 (our dns server) which answers back fine. So rule 100
> > is working and redirecting things to the cisco router on our 172
> > network which has that particular 204 network attached to it.
> >
> > But when I ping 172.x.x.1, cygnus' address I get nothing. Hopping
> > over to cygnus' terminal and running tcpdump shows me that the packets
> > are indeed arriving but they aren't making it. As it ends up, they are
> > getting blackholed by rule 500 above. I know this because If I delete
> > rule 500 from the command line the ping starts getting responded to,
> > furthermore if I delete rule 32766 after that, it quits again.
>
> I'd say this more or less confirms it.
>
> What does ip route ls show?

All ip route ls shows is my default routing table (main), the same thing you would see if you ran, say, the "route -n" command after executing "/etc/rc.d/init.d/network start". There are also two other tables, to-lan and bit-bucket, which each have a single default route in them, as you can see from the script above.

-Andrew
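
Since "ip route ls" on its own prints only the main table, the other two tables have to be listed by name with "ip route ls table NAME". A minimal sketch of that, assuming the table names from the script (to-lan, bit-bucket) and the /etc/iproute2/rt_tables path mentioned in its comments. The table_defined helper and the RT_TABLES variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Sketch only: table_defined and RT_TABLES are hypothetical names.
# RT_TABLES defaults to the path named in the script's comments.
RT_TABLES=${RT_TABLES:-/etc/iproute2/rt_tables}

# table_defined NAME FILE
# Succeeds if NAME appears as the second field of a non-comment line in an
# rt_tables-style file (lines of the form "ID NAME").
table_defined() {
    [ -r "$2" ] || return 2
    awk -v name="$1" '
        $1 ~ /^#/  { next }          # skip comment lines
        $2 == name { found = 1 }
        END        { exit found ? 0 : 1 }
    ' "$2"
}

# List each custom table by name, but only if it is declared.
for t in to-lan bit-bucket; do
    if table_defined "$t" "$RT_TABLES"; then
        ip route ls table "$t"
    else
        echo "table $t is not declared in $RT_TABLES"
    fi
done
```

Checking the name first keeps a missing rt_tables entry from being confused with an empty table; the listing itself is left to the ip command.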