Jarrod Lowe
2010-Jul-27 12:12 UTC
[Bridge] Bridging and NAT: Confusion when linux sees the packet for a second time
Hi,

I have a rather odd problem.

This is a bit of a complicated setup; if you want to know why, I have included more detail at the bottom.

I have two bridges, let's call them "A" and "B". Each has an IP address, so we can route between them. There are also two virtual machines attached to both "A" and "B" (call them "vs1" and "vs2"), and another virtual machine attached only to "A" (call it "src").

src sends a packet to an address on the "B" network. It forwards this via the Host's address on "A".
The Host applies an SNAT rule, to change the source address to another address of the Host (which is on a completely separate interface).
The Host forwards the packet via "B" to vs1.
vs1 forwards the packet via "A" to vs2.

At this point, the *host*, instead of copying the packet (as any good bridge should do :), does something... weird. It sends a RST packet, from a *different* source port, instead of bridging!

/proc/net/ip_conntrack, at this point, has *two* entries for the connection:
(A=10.17.192.*, B=192.168.148.*, Host=172.16.64.76)

  tcp 6 119 SYN_SENT src=10.17.192.254 dst=192.168.148.2 sport=41869 dport=80 packets=2 bytes=120 [UNREPLIED] src=192.168.148.2 dst=172.16.64.76 sport=80 dport=41869 packets=0 bytes=0 mark=0 secmark=0 use=2
  tcp 6 9 CLOSE src=172.16.64.76 dst=192.168.148.2 sport=41869 dport=80 packets=2 bytes=100 src=192.168.148.2 dst=172.16.64.76 sport=80 dport=1088 packets=1 bytes=60 mark=0 secmark=0 use=2

Note that the CLOSE connection's match says sport=41869, but its action says dport=1088.

I am utterly confused. Can anyone explain what is happening here, or how to get around it?

In more detail:

I am using KVM to test what is, in live deployments, a bunch of independent pieces of kit. The real machine, "H", is acting as the upstream router (on "A" and "Internet"), internal switch (for "B"), and the firewall (passing through "H").
     "Internet"
          |
  A       |        B
  |       He       |
  +--[Ha  H   Hb]--+
  |                |
  +--[V1a v1 V1b]--+
  |                |
  +--[V2a v2 V2b]--+
  |                |
  |     [ src Sb]--+
  |                |

v1 and v2 are running Linux IPVS (load balancing).
"A"=10.17.192.0/24, "B"=192.168.148.0/24, "He"=172.16.64.76

For the particular connections I am having problems with, v1 is the master: it additionally has the floating IP address Fa. It then chooses that v2 should handle this connection.

In this context, "B" is a private network, on RFC1918 addressing. If it wants to access the "Internet" side ("A") it must be NATed, in this case behind the address He. (Yes, "A" and "He" are also on RFC1918 addressing here, but that is because it is in my lab. A real deployment would have real IPs there, and "He" would be inside "B". Also, "A" and "B" would be on physical switches, rather than bridges on a Linux box.)

I have only one iptables rule, in the nat table, in POSTROUTING: from "B" (10.17.192.0/24) to anything not "B" (!10.17.192.0/24), SNAT behind "He" (172.16.64.76).

So "src" wants to talk to "Fa".
I can see a packet "Sb->Fa", sent from "src" to "Hb".
I can see a packet "He->Fa" (SNAT'd behind He), sent from "H" to "v1" on "A".
I can see a packet "He->Fa", sent from "v1" to "v2" on "B", going up to the bridge "B".
I can see an *incorrect* RST packet "He->Fa", sent from "H" to "v2" on "A", with the wrong destination port.

I suspect this is to do with ip_conntrack. I think it is getting confused and building two conntrack entries, then possibly matching the packets against the wrong one.

Is there any way I can force conntrack to not look at bridged packets?

Any ideas?

Thanks,
-- 
Jarrod Lowe
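The symptom in the two conntrack entries quoted earlier in this message (original-direction sport=41869, but reply-direction dport=1088 on the CLOSE entry) can be spotted mechanically. The following is a minimal sketch, not part of the original post: the function name `find_remapped_ports` is hypothetical, and it assumes the one-entry-per-line format shown in the post, where the first `sport=` belongs to the original direction and the second `dport=` to the reply direction. One plausible reading of a mismatch is that NAT picked a new source port for the second entry because the original port was already claimed by the first entry.

```shell
#!/bin/sh
# Flag conntrack entries whose reply-direction dport differs from the
# original-direction sport, i.e. entries whose source port was remapped.
# Reads conntrack-style lines (one entry per line) on stdin.
find_remapped_ports() {
    awk '
    {
        sport = ""; rdport = ""; nsport = 0; ndport = 0
        for (i = 1; i <= NF; i++) {
            # First sport= field: source port in the original direction.
            if ($i ~ /^sport=/) { nsport++; if (nsport == 1) sport = substr($i, 7) }
            # Second dport= field: destination port expected on replies.
            if ($i ~ /^dport=/) { ndport++; if (ndport == 2) rdport = substr($i, 7) }
        }
        if (sport != "" && rdport != "" && sport != rdport)
            print "remapped: sport=" sport " reply dport=" rdport
    }'
}
```

Fed the two entries from this message, the sketch stays silent for the SYN_SENT entry and reports the CLOSE entry. On a kernel that still exposes the file, it could be run as `find_remapped_ports < /proc/net/ip_conntrack`.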
Jarrod Lowe
2010-Jul-27 13:58 UTC
[Bridge] Bridging and NAT: Confusion when linux sees the packet for a second time
On 27 July 2010 13:12, Jarrod Lowe <jarrod.lowe at gmail.com> wrote:
> Hi,
>
> I have a rather odd problem.
>
> This is a bit of a complicated setup, if you want to know why, I have included more detail at the bottom.

Joy, I can answer my own question.

It was indeed conntrack screwing up. The fix was the rule:

  iptables -A PREROUTING -t raw (some suitable conditions) -j NOTRACK

where "some suitable conditions" match the packet as seen the second time. (In my case, I could use the fact that the packets were on the internal network, but their IPs were neither to nor from the internal network.)

It would be nicer to be able to say that all bridged but non-routed traffic is NOTRACK, but the above will do.

Thanks,
-- 
Jarrod Lowe
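For readers who want something concrete, here is a hedged sketch of how the single SNAT rule from the first message and the NOTRACK fix might look together. It is not the author's actual ruleset. The network 10.17.192.0/24 and the address 172.16.64.76 are the values quoted in the thread; the bridge name `br-internal` and the exact match conditions are assumptions standing in for "(some suitable conditions)". The script only prints the rules, so nothing is changed until they are reviewed and run as root.

```shell
#!/bin/sh
# Sketch only: prints the two rules instead of applying them.
# INTERNAL_NET and SNAT_ADDR are the addresses quoted in the thread;
# BRIDGE_IF is a hypothetical name for the internal bridge.
INTERNAL_NET="10.17.192.0/24"
SNAT_ADDR="172.16.64.76"
BRIDGE_IF="br-internal"

# The single NAT rule described in the first message:
# traffic from the internal network to anywhere else is SNAT'd.
snat_rule() {
    echo "iptables -t nat -A POSTROUTING -s $INTERNAL_NET ! -d $INTERNAL_NET -j SNAT --to-source $SNAT_ADDR"
}

# The fix: skip connection tracking for packets that arrive on the
# internal bridge but are neither from nor to the internal network,
# i.e. the already-NAT'd packet when the host sees it a second time.
notrack_rule() {
    echo "iptables -t raw -A PREROUTING -i $BRIDGE_IF ! -s $INTERNAL_NET ! -d $INTERNAL_NET -j NOTRACK"
}

snat_rule
notrack_rule
```

On newer iptables releases the NOTRACK target has been superseded by `-j CT --notrack`, so the last part of the second rule may need adjusting there.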