The consensus when I asked seemed to be that VIMAGE+jail was the right
combination to give every container its own private loopback
interface, so I tried to build that. I noticed a few things:

1) The kernel prints out a warning message at boot time that VIMAGE is
"highly experimental". Should I be concerned about running this in
production?

2) Stopping jails with virtual network stacks generates warnings from
UMA about memory being leaked.

3) It wasn't clear (or documented anywhere that I could see) how to
get the host network set up properly. Obviously I'm not going to have
a vlan for every single jail, so it seemed like what most people were
doing was "bridge" along with a bunch of "epair" interfaces. I ended
up with the following:

network_interfaces="lo0 bridge0 bce0"
autobridge_interfaces="bridge0"
autobridge_bridge0="bce0 epair0a epair1a"
cloned_interfaces="bridge0 epair0 epair1"
ifconfig_bridge0="inet [deleted] netmask 0xffffff00"
ifconfig_bridge0_ipv6="inet6 [deleted] prefixlen 64 accept_rtadv"
ifconfig_bce0="up"
ifconfig_epair0a="up"
ifconfig_epair1a="up"

The net.link.bridge.inherit_mac sysctl, which is documented in
bridge(4), doesn't appear to work; I haven't yet verified that I can
create a /etc/start_if.bridge0 to set the MAC address manually without
breaking something else. The IPv6 stack regularly prints
"in6_if2idlen: unknown link type (209)" to the console, which is
annoying, and IPv6 on the host doesn't entirely work -- it accepts
router advertisements but then gives [ENETUNREACH] trying to actually
send packets to the default gateway. (IPv6 to the jails *does* work!)

In each of the jails I have to manually configure a MAC address using
/etc/start_if.epairNb to ensure that it's globally unique, but then
everything seems to work.

Does this match up with what other people have been doing? Anything
I've missed? Any patches I should pull up to make this setup more
reliable before I roll it out in production?

-GAWollman
On 23/12/2015 1:05 AM, Garrett Wollman wrote:

> The consensus when I asked seemed to be that VIMAGE+jail was the right
> combination to give every container its own private loopback
> interface, so I tried to build that. I noticed a few things:
>
> 1) The kernel prints out a warning message at boot time that VIMAGE is
> "highly experimental". Should I be concerned about running this in
> production?

CYA only. If you are not doing much that is super unusual, you should
be fine.

> 2) Stopping jails with virtual network stacks generates warnings from
> UMA about memory being leaked.

I haven't any information about that.

> 3) It wasn't clear (or documented anywhere that I could see) how to
> get the host network set up properly. Obviously I'm not going to have
> a vlan for every single jail, so it seemed like what most people were
> doing was "bridge" along with a bunch of "epair" interfaces. I ended
> up with the following:

There are examples in /usr/share/examples/netgraph for some things.
I've never used the built-in configuration stuff; I always hand-coded
it. It's probably improved a lot since then.

> network_interfaces="lo0 bridge0 bce0"
> autobridge_interfaces="bridge0"
> autobridge_bridge0="bce0 epair0a epair1a"
> cloned_interfaces="bridge0 epair0 epair1"
> ifconfig_bridge0="inet [deleted] netmask 0xffffff00"
> ifconfig_bridge0_ipv6="inet6 [deleted] prefixlen 64 accept_rtadv"
> ifconfig_bce0="up"
> ifconfig_epair0a="up"
> ifconfig_epair1a="up"
>
> The net.link.bridge.inherit_mac sysctl, which is documented in
> bridge(4), doesn't appear to work; I haven't yet verified that I can
> create a /etc/start_if.bridge0 to set the MAC address manually without
> breaking something else. The IPv6 stack regularly prints
> "in6_if2idlen: unknown link type (209)" to the console, which is
> annoying, and IPv6 on the host doesn't entirely work -- it accepts
> router advertisements but then gives [ENETUNREACH] trying to actually
> send packets to the default gateway. (IPv6 to the jails *does* work!)
>
> In each of the jails I have to manually configure a MAC address using
> /etc/start_if.epairNb to ensure that it's globally unique, but then
> everything seems to work.
>
> Does this match up with what other people have been doing? Anything
> I've missed? Any patches I should pull up to make this setup more
> reliable before I roll it out in production?

I haven't used it for a couple of years. I know others are using it,
so I'll let them pipe up.

> -GAWollman
On Tue, Dec 22, 2015 at 12:05:07PM -0500 I heard the voice of
Garrett Wollman, and lo! it spake thus:

> The consensus when I asked seemed to be that VIMAGE+jail was the
> right combination to give every container its own private loopback
> interface, so I tried to build that. I noticed a few things:

I've got a server running a dozen or so VIMAGE jails, so I can at
least chime in a little...

> 1) The kernel prints out a warning message at boot time that VIMAGE
> is "highly experimental". Should I be concerned about running this
> in production?

It hasn't blown up anything for me yet.

> 2) Stopping jails with virtual network stacks generates warnings from
> UMA about memory being leaked.

I'm given to understand that's Known, and presumably Not Quite Trivial
To Fix. Since I'm not starting/stopping jails repeatedly as a normal
runtime thing, I'm ignoring it. If you were spinning jails up and
down dynamically dozens of times a day, I'd want to look more closely
at just what is leaking and why...

> 3) It wasn't clear (or documented anywhere that I could see) how to
> get the host network set up properly. Obviously I'm not going to
> have a vlan for every single jail, so it seemed like what most
> people were doing was "bridge" along with a bunch of "epair"
> interfaces. I ended up with the following:

Is what I'm doing, though I'm creating the epair's and adding them to
the bridges in the setup script rather than rc.conf (exec.prestart in
jail.conf), because that makes it more manageable IME, and since I'm
already doing a bunch of setup in the script anyway...

> In each of the jails I have to manually configure a MAC address
> using /etc/start_if.epairNb to ensure that it's globally unique, but
> then everything seems to work.

I hardcode (well, dynamically generated hardcoded) MAC addresses on
the epair's in the setup script, since
<https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=184149> bit me hard
when I was first setting it up.

--
Matthew Fuller (MF4839) | fullermd at over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
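
The exec.prestart approach described in the message above (create the epair, pin a MAC address on it, add the host side to the bridge) might look roughly like the following sketch. The names run and setup_epair, the DRYRUN variable, and the example unit number, bridge name, and MAC address are all hypothetical; this is not the script anyone in the thread actually uses. It assumes that "ifconfig epairN create" yields epairNa and epairNb, as the cloned_interfaces setting in the rc.conf excerpt implies.

```sh
#!/bin/sh
# Hypothetical exec.prestart helper (a sketch, not a tested production script).
# With DRYRUN=1 the commands are printed instead of executed.

run() {
    if [ "${DRYRUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_epair UNIT BRIDGE MAC
#   UNIT   - epair unit number (assumed to be free)
#   BRIDGE - existing bridge interface on the host
#   MAC    - address to pin on the jail side (epairNb)
setup_epair() {
    unit=$1
    bridge=$2
    mac=$3
    run ifconfig "epair${unit}" create           # creates epairNa and epairNb
    run ifconfig "epair${unit}b" ether "$mac"    # fixed MAC on the jail side
    run ifconfig "$bridge" addm "epair${unit}a"  # host side joins the bridge
    run ifconfig "epair${unit}a" up
}
```

Running it with DRYRUN=1 first, for example "setup_epair 3 bridge0 02:00:00:00:03:0b", only prints the four ifconfig commands, which is a cheap way to check the plan before touching a live bridge.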
On Tue, Dec 22, 2015 at 9:05 AM, Garrett Wollman <wollman at bimajority.org> wrote:

> Any patches I should pull up to make this setup more
> reliable before I roll it out in production?

If you look at CURRENT, bz@ has committed a few VIMAGE-related fixes
this week which you might want to look at.

--
Craig
<<On Tue, 22 Dec 2015 22:42:33 -0600, "Matthew D. Fuller" <fullermd at over-yonder.net> said:

>> 2) Stopping jails with virtual network stacks generates warnings from
>> UMA about memory being leaked.

> I'm given to understand that's Known, and presumably Not Quite Trivial
> To Fix. Since I'm not starting/stopping jails repeatedly as a normal
> runtime thing, I'm ignoring it. If you were spinning jails up and
> down dynamically dozens of times a day, I'd want to look more closely
> at just what is leaking and why...

It looks like that's what bz@ fixed in r292601 (thanks to rodrigc@ for
pointing me in the right direction). I haven't looked at how
difficult this would be to backport, but since I'm in the same
situation as you in terms of the frequency of startup/teardown
operations, I'm probably not going to worry too much about it.

Other relevant changes from HEAD appear to be 292603, 292604, 278766,
and 286537 (and again, this is just based on scanning the svn logs,
not actually thinking about the code).

> Is what I'm doing, though I'm creating the epair's and adding them to
> the bridges in the setup script rather than rc.conf (exec.prestart in
> jail.conf), because that makes it more manageable IME, and since I'm
> already doing a bunch of setup in the script anyway...

For now, I think I'll just use exec.prestart to manually configure a
MAC address. It would be nice if the LAA MAC addresses we generated
were both random on initial creation (to better avoid duplicates) and
stable over reboot. (Likewise the bridge(4) MAC address.) Or
alternatively if we just had rc.conf support for explicitly
configuring the MAC address of every interface, since ifconfig doesn't
let you configure L2 and L3 addresses on the same command line.

Actually, what would be *really* nice -- and I don't know if any of my
network interfaces can do this, but it would give me a reason to buy
hardware that could -- would be if PCI virtual functions could be used
to implement multiple independent network interfaces in the same
kernel (additional units in the same driver). Then I wouldn't have to
deal with any of this configuration at all.

Failing all of those, having a good, well-documented example in the
handbook would be a Good Thing.

-GAWollman
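
One way to approximate the "unlikely to collide, yet stable over reboot" MAC addresses wished for above is to derive them from a hash of a fixed seed, such as the host name plus the jail name. Note that this is deterministic hashing, not the random-on-creation scheme the message describes. A minimal sketch, assuming either sha256sum(1) or FreeBSD's sha256(1) is installed; the function name stable_mac and the seed format are hypothetical:

```sh
#!/bin/sh
# Hypothetical helper: derive a stable, locally administered (LAA) MAC
# address from an arbitrary seed string, e.g. "hostname/jailname".

stable_mac() {
    seed=$1
    # Take the first 10 hex digits (40 bits) of a SHA-256 digest of the seed.
    if command -v sha256sum >/dev/null 2>&1; then
        digest=$(printf '%s' "$seed" | sha256sum | cut -c1-10)
    else
        # Assumption: FreeBSD's sha256(1), where -q prints only the digest.
        digest=$(printf '%s' "$seed" | sha256 -q | cut -c1-10)
    fi
    # First octet 02: locally-administered bit set, multicast bit clear.
    # The remaining five octets come from the digest, colon-separated.
    printf '02:%s\n' "$(printf '%s' "$digest" | sed 's/../&:/g; s/:$//')"
}
```

A setup script could then pin the address with something like: ifconfig epair0b ether "$(stable_mac "$(hostname)/www")". The same seed always yields the same address, so it survives reboots. Two different seeds could in principle collide on the 40 hashed bits, though for a handful of jails that is very unlikely.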