Hello everyone.
I've stumbled upon a strange networking issue with multiple interfaces
on CentOS 5.
The network setup is just like the diagram in
http://lartc.org/howto/lartc.rpdb.multiple-links.html
It looks like Linux is not routing outgoing packets correctly on
interfaces other than the one holding the default gateway. Instead, it
broadcasts an ARP request on the link, looking for the destination host,
which makes no sense because the remote host is not on the local link.
I suspect there's a bug in the kernel's routing decision algorithm,
as the issue is absent in kernel releases up to 2.6.18-194.8.1.el5
(CentOS 5.5), while I've reproduced it at least from
2.6.18-308.1.1.el5 (CentOS 5.8) onwards. I'm not sure in which
release in between it first appeared.
I'll try to explain with an example.
Let's say that my IF1 is eth1 with 100.10.10.1/24 and gateway
100.10.10.254, and IF2 is eth2 with 99.10.11.1/24 and gateway
99.10.11.254. The inside network is on eth0 with 192.168.0.1/24.
My main routing table looks something like this:
# route -n
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
99.10.11.0      0.0.0.0         255.255.255.0   U     0      0        0 eth2
100.10.10.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
0.0.0.0         100.10.10.254   0.0.0.0         UG    0      0        0 eth1
Then I set up two additional routing tables with iproute2 and linked
them into the rule list like this:
# cat >> /etc/iproute2/rt_tables << EOF
10 provider1
20 provider2
EOF
# ip route add 100.10.10.0/24 dev eth1 table provider1
# ip route add default via 100.10.10.254 dev eth1 table provider1
# ip rule add from 100.10.10.0/24 table provider1
# ip route add 99.10.11.0/24 dev eth2 table provider2
# ip route add default via 99.10.11.254 dev eth2 table provider2
# ip rule add from 99.10.11.0/24 table provider2
So the result is the following:
# ip rule list
0: from all lookup 255
32764: from 99.10.11.0/24 lookup provider2
32765: from 100.10.10.0/24 lookup provider1
32766: from all lookup main
32767: from all lookup default
# ip route list table provider1
100.10.10.0/24 dev eth1 scope link
default via 100.10.10.254 dev eth1
# ip route list table provider2
99.10.11.0/24 dev eth2 scope link
default via 99.10.11.254 dev eth2
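The per-provider setup above follows the same three-command pattern for
each uplink. As a rough sketch only, that pattern could be written as a
small helper. The name provider_cmds is hypothetical (it is not part of
iproute2), and it only prints the commands so they can be reviewed
before being run as root:

```shell
# Hypothetical helper: print (do not run) the iproute2 commands for one uplink.
# Arguments: table name, interface, directly connected network, gateway.
provider_cmds() {
    table=$1
    dev=$2
    net=$3
    gw=$4
    # Route to the directly connected network inside the provider table.
    echo "ip route add $net dev $dev table $table"
    # Default route of this provider, also inside its own table.
    echo "ip route add default via $gw dev $dev table $table"
    # Rule selecting the table for packets sourced from the provider network.
    echo "ip rule add from $net table $table"
}

provider_cmds provider1 eth1 100.10.10.0/24 100.10.10.254
provider_cmds provider2 eth2 99.10.11.0/24 99.10.11.254
```

Piping the output to sh as root would apply the commands. The table
names must already exist in /etc/iproute2/rt_tables.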
Now the issue.
With kernel 2.6.18-194.8.1.el5, if I run two pings, each forcing a
different outgoing interface, everything looks good:
# ping -c 3 -I eth1 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from 100.10.10.1 eth1: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=49 time=16.9 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=49 time=19.4 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=49 time=20.7 ms
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 16.378/16.796/17.289/0.390 ms
and
# ping -c 3 -I eth2 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from 99.10.11.1 eth2: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=48 time=24.9 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=48 time=25.3 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=48 time=24.5 ms
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 16.378/16.796/17.289/0.390 ms
If I repeat the ping, but with a newer kernel, let's say
2.6.18-308.1.1.el5, the result is the following:
# ping -c 3 -I eth1 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from 100.10.10.1 eth1: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=48 time=36.8 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=48 time=36.8 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=48 time=35.5 ms
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 35.542/36.433/36.887/0.630 ms
# ping -c 3 -I eth2 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from 99.10.11.1 eth2: 56(84) bytes of data.
From 99.10.11.1 icmp_seq=1 Destination Host Unreachable
From 99.10.11.1 icmp_seq=2 Destination Host Unreachable
From 99.10.11.1 icmp_seq=3 Destination Host Unreachable
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2001ms
, pipe 3
I was puzzled by this behaviour, so I traced network traffic with
tcpdump, and the output is the following:
# tcpdump -n -i eth2 src or dst 8.8.8.8
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 96 bytes
10:03:34.951676 arp who-has 8.8.8.8 tell 99.10.11.1
10:03:35.951719 arp who-has 8.8.8.8 tell 99.10.11.1
10:03:36.951662 arp who-has 8.8.8.8 tell 99.10.11.1
3 packets captured
3 packets received by filter
0 packets dropped by kernel
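To make explicit why these ARP requests are unexpected: ARP should only
be used for addresses inside the directly connected network, and 8.8.8.8
is not inside 99.10.11.0/24. A minimal sketch of that membership check
in plain shell arithmetic follows. The function names ip_to_int and
in_subnet are made up for illustration:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_subnet ADDRESS NETWORK PREFIXLEN
# Succeeds (exit status 0) when ADDRESS lies inside NETWORK/PREFIXLEN.
in_subnet() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# The gateway is on-link for eth2, the ping target is not.
in_subnet 99.10.11.254 99.10.11.0 24 && echo "99.10.11.254 is on-link"
in_subnet 8.8.8.8 99.10.11.0 24 || echo "8.8.8.8 is off-link"
```

An ARP who-has for an off-link address suggests the kernel resolved the
destination as directly reachable on eth2 instead of sending the packet
via 99.10.11.254.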
It looks like the routing decision ignores the policy routing rules and
never consults the second default gateway in the provider2 table.
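One way to check this without sending traffic might be "ip route get",
which prints the route the kernel would pick for a given lookup. The
sketch below is only a suggestion: route_kind is a hypothetical helper
that looks at the text of one output line, and the sample line passed
to it is illustrative, not captured from the affected host.

```shell
# Classify one line of "ip route get" output:
#   "gateway" when a next hop was chosen (the line contains "via"),
#   "on-link" when the destination is treated as directly reachable.
route_kind() {
    case " $1 " in
        *" via "*) echo gateway ;;
        *)         echo on-link ;;
    esac
}

# On the affected host (not run here), compare a lookup bound to the
# interface with a lookup that specifies the source address:
#   route_kind "$(ip route get 8.8.8.8 oif eth2 | head -n 1)"
#   route_kind "$(ip route get 8.8.8.8 from 99.10.11.1 | head -n 1)"

# Illustrative input only:
route_kind "8.8.8.8 dev eth2 src 99.10.11.1"
```

If the first lookup reports on-link while the second reports gateway,
that would suggest the "from 99.10.11.0/24" rule is not matched when only
the outgoing interface is forced.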
Adding a second default route to the main routing table seems to resolve
the issue, but it wasn't necessary before and I'm not sure it is a
correct solution.
# route add default gw 99.10.11.254 metric 100
# route -n
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
99.10.11.0      0.0.0.0         255.255.255.0   U     0      0        0 eth2
100.10.10.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
0.0.0.0         99.10.11.254    0.0.0.0         UG    100    0        0 eth2
0.0.0.0         100.10.10.254   0.0.0.0         UG    0      0        0 eth1
Any comments?
--
Stefano Buelow <stefano.buelow [at] logx.it>