Peter Steele
2016-Apr-07 13:50 UTC
Re: [libvirt-users] Networking issues with lxc containers in AWS EC2
On 04/02/2016 05:20 PM, Laine Stump wrote:
> You say they can talk among containers on the same host, and with their
> own host (I guess you mean the virtual machine that is hosting the
> containers), but not to containers on another host. Can the containers
> communicate outside of the host at all? If not, perhaps the problem is
> iptables rules for the bridge device the containers are using - try
> running this command:
>
> sysctl net.bridge.bridge-nf-call-iptables
>
> If that returns:
>
> net.bridge.bridge-nf-call-iptables = 1
>
> then run this command and see if the containers can now communicate with
> the outside:
>
> sysctl -w net.bridge.bridge-nf-call-iptables=0

This key doesn't exist in the CentOS 7 image I'm running. I do have a bridge interface defined, of course, although we do not run iptables. We don't need this service when running our software on premises. Actually, in CentOS 7 the iptables service doesn't exist; there's a new service called firewalld that serves the same purpose. We don't run this either at present.

> Well, if they've allowed your virtual machine to acquire multiple IP
> addresses, then it would make sense that they would allow them to
> actually use those IP addresses. I'm actually more inclined to think
> that the packets simply aren't getting out of the virtual machine (or
> the responses aren't getting back in).

The difference is that the IPs are assigned not to the virtual machine itself but to the containers running under the AWS instance, and something in how Amazon manages their stack prevents packets from passing from one container to the other. The very fact that the exact same software runs fine in VMs under, say, VMware or KVM but not in VMs under AWS clearly points to AWS as the ultimate source of the problem.
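A minimal sketch of the check discussed in this message, written so that a missing key is reported rather than treated as an error. The function name `bridge_nf_state` and its optional path argument are hypothetical; the default path is assumed to be the /proc view of the net.bridge.bridge-nf-call-iptables sysctl key.

```shell
#!/bin/sh
# Report whether bridged traffic is being passed through iptables.
# bridge_nf_state is a hypothetical helper, not part of any package.
# The optional argument overrides the file that is read, which makes
# the function easy to try out against a scratch file.
bridge_nf_state() {
    key_file="${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}"

    # The key is absent when the module that provides it is not loaded.
    if [ ! -r "$key_file" ]; then
        echo "missing"
        return 0
    fi

    value=$(cat "$key_file")
    case "$value" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        *) echo "unknown: $value" ;;
    esac
}
```

Calling `bridge_nf_state` with no argument on the host prints one of `missing`, `disabled`, or `enabled`; only the `enabled` case calls for the `sysctl -w` command quoted above.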
Laine Stump
2016-Apr-11 18:33 UTC
Re: [libvirt-users] Networking issues with lxc containers in AWS EC2
On 04/07/2016 09:50 AM, Peter Steele wrote:
> On 04/02/2016 05:20 PM, Laine Stump wrote:
>> You say they can talk among containers on the same host, and with their
>> own host (I guess you mean the virtual machine that is hosting the
>> containers), but not to containers on another host. Can the containers
>> communicate outside of the host at all? If not, perhaps the problem is
>> iptables rules for the bridge device the containers are using - try
>> running this command:
>>
>> sysctl net.bridge.bridge-nf-call-iptables
>>
>> If that returns:
>>
>> net.bridge.bridge-nf-call-iptables = 1
>>
>> then run this command and see if the containers can now communicate with
>> the outside:
>>
>> sysctl -w net.bridge.bridge-nf-call-iptables=0
>
> This key doesn't exist in the CentOS 7 image I'm running.

Interesting. That functionality was moved out of the kernel's bridge module into br_netfilter some time back, but that was done later than the kernel 3.10 that is used by CentOS 7. Are you running some later kernel version?

If your kernel doesn't have a message in dmesg that looks like this:

bridge: automatic filtering via arp/ip/ip6tables has been deprecated. Update your scripts to load br_netfilter if you need this.

and the bridge driver is loaded, then that key should be available. Of course, if you don't have it, that's equivalent to having it set to 0, so you should be okay regardless of why it's missing.

> I do have a bridge interface defined, of course, although we do not run
> iptables. We don't need this service when running our software on
> premises. Actually, in CentOS 7 the iptables service doesn't exist;
> there's a new service called firewalld that serves the same purpose.
> We don't run this either at present.

The iptables service is not the same thing as the iptables kernel module. Even firewalld uses the iptables kernel module (libvirt doesn't care about any service named iptables, but does use firewalld if it's running).

>> Well, if they've allowed your virtual machine to acquire multiple IP
>> addresses, then it would make sense that they would allow them to
>> actually use those IP addresses. I'm actually more inclined to think
>> that the packets simply aren't getting out of the virtual machine (or
>> the responses aren't getting back in).
>
> The difference is that the IPs are assigned not to the virtual machine
> itself but to the containers running under the AWS instance, and
> something in how Amazon manages their stack prevents packets from
> passing from one container to the other. The very fact that the exact
> same software runs fine in VMs under, say, VMware or KVM but not in VMs
> under AWS clearly points to AWS as the ultimate source of the problem.

I wouldn't be too quick to judge. First take a look at tcpdump on the bridge interface that the containers are attached to, and on the ethernet device that connects the bridge to the rest of Amazon's infrastructure. If you see packets from the container's IP going out but not coming back in, check the iptables rules (again - firewalld uses iptables to set up its filtering) for a REJECT or DROP rule that has an incrementing count. I use something like this to narrow down the list I need to check:

while true; do iptables -v -S -Z | grep -v '^Zeroing' | grep -v "c 0 0" | grep -e '-c'; echo '**************'; sleep 1; done

If you don't see any REJECT or DROP rules being triggered, then maybe the problem is that AWS is providing an IP address to your container's MAC, but isn't actually allowing traffic from that MAC out onto the network.
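The loop above zeroes the counters on every pass (-Z) and prints every rule that matched anything. As a non-destructive variant, here is a minimal sketch that reads rule listings on stdin and keeps only rejecting or dropping rules with a non-zero packet count. The function name `hot_drop_rules` is hypothetical, and the sketch assumes the listing carries counters as `-c <packets> <bytes>`, which is the format the grep in the loop relies on. DROP is the iptables target that silently discards packets.

```shell
#!/bin/sh
# Keep only REJECT/DROP rules whose packet counter is greater than zero.
# hot_drop_rules is a hypothetical helper. It expects lines in the style
# of `iptables -v -S` on stdin, with counters shown as "-c PACKETS BYTES".
# Chain policies (-P lines) are not examined by this sketch.
hot_drop_rules() {
    awk '
        /-j (REJECT|DROP)/ {
            # Find the "-c" flag; the field after it is the packet count.
            for (i = 1; i < NF; i++) {
                if ($i == "-c" && $(i + 1) + 0 > 0) {
                    print
                    break
                }
            }
        }
    '
}
```

As root, `iptables -v -S | hot_drop_rules` can be run before and after a failed ping between containers; a rule whose count grows between the two runs is a likely culprit. Because the counters are never reset, the two listings have to be compared by eye or with diff.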
Peter Steele
2016-Apr-12 20:37 UTC
Re: [libvirt-users] Networking issues with lxc containers in AWS EC2
On 04/11/2016 11:33 AM, Laine Stump wrote:
> Interesting. That functionality was moved out of the kernel's bridge
> module into br_netfilter some time back, but that was done later than
> the kernel 3.10 that is used by CentOS 7. Are you running some later
> kernel version?
>
> If your kernel doesn't have a message in dmesg that looks like this:
>
> bridge: automatic filtering via arp/ip/ip6tables has been deprecated.
> Update your scripts to load br_netfilter if you need this.
>
> and the bridge driver is loaded, then that key should be available. Of
> course, if you don't have it, that's equivalent to having it set to 0,
> so you should be okay regardless of why it's missing.

Ah, you were right. I'd forgotten that the AMI I'm using was one running the 4.0.5 ml kernel. We had discovered that bonded interfaces running in mode 5 or 6 do not work with lxc containers (the host's ARP table does not get updated). The issue was fixed in the 4.0.5 kernel, so we ran with that kernel for a short time, only to abandon it later due to a bug with software RAID. I've reverted the kernel back to 3.10 on the AWS instances I'm using, and the net.bridge.bridge-nf-call-iptables key is now present. It's already set to 0, though, so there is nothing that needs to be done here.

> I wouldn't be too quick to judge. First take a look at tcpdump on
> the bridge interface that the containers are attached to, and on the
> ethernet device that connects the bridge to the rest of Amazon's
> infrastructure. If you see packets from the container's IP going out
> but not coming back in, check the iptables rules (again - firewalld
> uses iptables to set up its filtering) for a REJECT or DROP rule
> that has an incrementing count. I use something like this to narrow
> down the list I need to check:
>
> while true; do iptables -v -S -Z | grep -v '^Zeroing' | grep -v "c 0
> 0" | grep -e '-c'; echo '**************'; sleep 1; done
>
> If you don't see any REJECT or DROP rules being triggered, then
> maybe the problem is that AWS is providing an IP address to your
> container's MAC, but isn't actually allowing traffic from that MAC out
> onto the network.

I'll get this test set up. Unfortunately I'm not particularly knowledgeable about iptables; we don't use it in our product, so I've never had to deal with it. I think you are right, though, about what's happening: AWS doesn't recognize the MAC addresses of containers running under another instance.
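One way to look at the MAC question with the tcpdump captures suggested above is to print link-level headers (`tcpdump -e`) and tally frames by source MAC on each interface. The sketch below is only an illustration: the function name `count_src_macs` and the interface names `br0` and `eth0` are hypothetical, and it assumes the usual `tcpdump -e -n` line layout of `TIME SRC_MAC > DST_MAC, ethertype ...`.

```shell
#!/bin/sh
# Tally captured frames by source MAC address.
# count_src_macs is a hypothetical helper. It reads `tcpdump -e -n` text
# output on stdin, where field 2 is the source MAC and field 3 is ">".
# Output is "COUNT MAC", one line per MAC, sorted by MAC address.
count_src_macs() {
    awk '
        $3 == ">" { seen[$2]++ }
        END { for (mac in seen) print seen[mac], mac }
    ' | sort -k2
}

# Example (needs root; br0 and eth0 are placeholder interface names):
#   tcpdump -e -n -c 200 -i br0  icmp | count_src_macs
#   tcpdump -e -n -c 200 -i eth0 icmp | count_src_macs
```

If a container's MAC shows up as a source on the bridge and on the ethernet device, but the peer's replies never appear on the ethernet device, that would be consistent with the traffic being dropped outside the instance rather than by local iptables rules.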