Lars Lindstrom
2019-Mar-12 22:10 UTC
[libvirt-users] KVM-Docker-Networking using TAP and MACVLAN
Hi everyone!

I have the following requirement: I need to connect a set of Docker containers to a KVM virtual machine. The containers shall be isolated in such a way that they cannot communicate with each other without going through the KVM, which will act as router/firewall. For this, I thought about the following simple setup (as opposed to a more complex one involving a bridge with vlan_filtering and a separate VLAN for each container):

+------------------------------------------------------------------+
| Host                                                              |
| +--------------+                  +------------------------+---+ |
| | KVM          |                  | Docker             +-> | a | |
| | +----------+ |   +----------+   | +--------------+   |   +---+ |
| | | NIC lan0 | <-> | DEV tap0 | <-> | NET macvlan0 | <-+-> | b | |
| | +----------+ |   +----------+   | +--------------+   |   +---+ |
| |              |                  |                    +-> | c | |
| +--------------+                  +------------------------+---+ |
|                                                                  |
+------------------------------------------------------------------+

NIC lan0:

    <interface type='direct'>
      <source dev='tap0' mode='vepa'/>
      <model type='virtio'/>
    </interface>

    *** Welcome to pfSense 2.4.4-RELEASE-p1 (amd64) on pfSense ***
    LAN (lan) -> vtnet0 -> v4: 10.0.20.1/24

DEV tap0:

    [root@server ~]# ip tuntap add tap0 mode tap
    [root@server ~]# ip l set tap0 up
    [root@server ~]# ip l show tap0
    49: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
        link/ether ce:9e:95:89:33:5f brd ff:ff:ff:ff:ff:ff
    [root@server ~]# virsh start pfsense
    [root@server opt]# ip l show tap0
    49: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
        link/ether ce:9e:95:89:33:5f brd ff:ff:ff:ff:ff:ff

NET macvlan0:

    [root@server ~]# docker network create --driver macvlan --subnet=10.0.20.0/24 --gateway=10.0.20.1 --opt parent=tap0 macvlan0

CNT a:

    [root@server ~]# docker run --network macvlan0 --ip=10.0.20.2 -it alpine /bin/sh
    / # ping -c 4 10.0.20.1
    PING 10.0.20.1 (10.0.20.1): 56 data bytes
    --- 10.0.20.1 ping statistics ---
    4 packets transmitted, 0 packets received, 100% packet loss
    / # ifconfig
    eth0      Link encap:Ethernet  HWaddr 02:42:0A:00:14:02
              inet addr:10.0.20.2  Bcast:10.0.20.255  Mask:255.255.255.0
              UP BROADCAST MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              UP LOOPBACK RUNNING  MTU:65536  Metric:1
              RX packets:4 errors:0 dropped:0 overruns:0 frame:0
              TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:448 (448.0 B)  TX bytes:448 (448.0 B)
    / # ip r
    default via 10.0.20.1 dev eth0
    10.0.20.0/24 dev eth0 scope link  src 10.0.20.2

CNT b:

    [root@server ~]# docker run --network macvlan0 --ip=10.0.20.2 -it alpine /bin/ping 10.0.20.1
    PING 10.0.20.1 (10.0.20.1): 56 data bytes

CNT c:

    [root@server ~]# docker run --network macvlan0 --ip=10.0.20.2 -it alpine /bin/ping 10.0.20.1
    PING 10.0.20.1 (10.0.20.1): 56 data bytes

The KVM is not reachable from within a Docker container (firewalld was disabled during the test) and vice versa. The first thing I noticed is that tap0 remains NO-CARRIER and DOWN even though the KVM has been started. Shouldn't the link come up as soon as the KVM is started (and thus is connected to the tap0 device)? The next thing that looked strange to me: even though the interface and routing configuration within the container looks OK, there are 0 packets TX/RX on eth0 after pinging the KVM (but 4 on lo instead).
Any idea on how to proceed from here? Is this a valid setup and a valid libvirt configuration for that setup?

Thanks and br, Lars
Martin Kletzander
2019-Mar-13 13:26 UTC
Re: [libvirt-users] KVM-Docker-Networking using TAP and MACVLAN
On Tue, Mar 12, 2019 at 11:10:40PM +0100, Lars Lindstrom wrote:
>[...]
>DEV tap0:
>    [root@server ~]# ip tuntap add tap0 mode tap
>    [root@server ~]# ip l set tap0 up
>    [root@server ~]# ip l show tap0
>    49: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
>        link/ether ce:9e:95:89:33:5f brd ff:ff:ff:ff:ff:ff
>    [root@server ~]# virsh start pfsense
>    [root@server opt]# ip l show tap0
>    49: tap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
>        link/ether ce:9e:95:89:33:5f brd ff:ff:ff:ff:ff:ff

IIUC, you are using the tap0 device, but it is not plugged in anywhere. By that I mean there is the one end that you created and passed through into the VM, but there is no other end to it. Since your domain XML says type='direct', libvirt just stacks a macvtap on top of tap0, and nothing ever opens tap0 itself, which is also why it stays NO-CARRIER. I can think of some complicated ways to do what you are trying to do, but hopefully the above explanation will move you forward and you'll figure out something better than what I'm thinking about right now. What usually helps me is to think of how this would be done with hardware and replicate that, as most of the technology is modelled after HW anyway. Or someone else will have a better idea.

Before sending this I just thought: wouldn't it be possible to have a veth pair instead of the tap device?
One end would go to the VM and the other one would be used for the containers' macvtaps...
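Just to sketch the idea (completely untested, and veth0/veth1 are arbitrary placeholder names):

    # create the pair on the host; frames entering one end come out the other
    ip link add veth0 type veth peer name veth1
    ip link set veth0 up
    ip link set veth1 up

    # guest side: attach the VM to veth0, e.g. via macvtap in passthrough
    # mode, so the VM effectively owns that end of the pair:
    #   <interface type='direct'>
    #     <source dev='veth0' mode='passthrough'/>
    #     <model type='virtio'/>
    #   </interface>

    # container side: hang the Docker macvlan network off the peer end
    docker network create --driver macvlan --subnet=10.0.20.0/24 \
        --gateway=10.0.20.1 --opt parent=veth1 macvlan0

Whether the default macvlan mode on veth1 gives you the container-to-container isolation you want is a separate question; Docker's macvlan driver has a macvlan_mode option for that, but I have not checked it.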
Lars Lindstrom
2019-Mar-13 22:40 UTC
Re: [libvirt-users] KVM-Docker-Networking using TAP and MACVLAN
On 3/13/19 2:26 PM, Martin Kletzander wrote:
> IIUC, you are using the tap0 device, but it is not plugged in anywhere. By
> that I mean there is the one end that you created and passed through into
> the VM, but there is no other end to it. [...]
> Before sending this I just thought: wouldn't it be possible to have a veth
> pair instead of the tap device? One end would go to the VM and the other
> one would be used for the containers' macvtaps...

What I am trying to achieve is the most performant way to connect a set of containers to the KVM while maintaining proper isolation between them. As the Linux bridge does not support port isolation, I started with 'bridge' networking and MACVLAN using a VLAN for each container, but this comes at the cost of bridging and a VLAN trunk on the KVM side. The simplest (and hopefully therefore most performant) solution I could come up with was a 'virtio' NIC in the KVM with a 'direct' connection in 'vepa' mode to 'some other end' on the host, a TAP device in its simplest form, which Docker then uses for its MACVLAN network.

I am not quite sure I understood you correctly regarding the 'other end'. With the given configuration I would expect one end of the TAP device to be connected to the NIC of the KVM (and it actually is, as the NIC has an IP address assigned in the KVM and is serving the web configurator) and the other end to be connected to the MACVLAN network of Docker. If this is not how TAP works, how do I then provide a 'simple virtual NIC' which has one end in the KVM itself and the other on the host (without using bridging or the like)? I always thought that this is exactly what libvirt does for 'bridge' networking: it creates a TAP device on the host and assigns it to the bridge (see the sketch below for what I assume happens). According to the man page I have to specify both interfaces when creating the veth device, but how would I do that with one end being in the KVM?
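For my own understanding, this is roughly what I assume libvirt does behind the scenes for 'bridge' networking (illustrative sketch only; vnet0 and br0 are placeholder names, and libvirt actually opens the TAP device itself and hands the open fd to QEMU):

    # create a TAP device and plug its host-side end into the bridge
    ip tuntap add vnet0 mode tap
    ip link set vnet0 master br0
    ip link set vnet0 up

    # the 'other end' of the TAP is the file descriptor QEMU holds:
    # qemu-system-x86_64 ... \
    #   -netdev tap,id=net0,ifname=vnet0,script=no,downscript=no \
    #   -device virtio-net-pci,netdev=net0

br Lars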