Hello,

Firstly THANK YOU for the IPv6 NAT support merged in 6.5. It has been
almost impossible to get IPv6 into a VM on a laptop that switches
between wifi and wired (dock) connections, because you can not add a
wifi interface to a bridge. I know NAT is against the IPv6 end-to-end
xen but it makes this "just work" for the vast majority of people like
me who need to ssh/curl/talk to ipv6 only hosts!

So I installed 6.6.0 from the virt-preview repos on Fedora 32 to
eagerly test it out.

My network config looks like

  <network>
    <name>network</name>
    <uuid> ... </uuid>
    <forward mode='nat'>
      <nat ipv6='yes'/>
    </forward>
    <bridge name='virbr0' stp='on' delay='0'/>
    <mac address=' ... '/>
    <domain name='network'/>
    <ip address='192.168.100.1' netmask='255.255.255.0'>
      <dhcp>
        <range start='192.168.100.128' end='192.168.100.254'/>
      </dhcp>
    </ip>
    <ip family='ipv6' address='fc00:dead:beef:55::' prefix='64'>
    </ip>
  </network>

The first problem I hit was trying to start that network:

  error: internal error: Check the host setup: enabling IPv6 forwarding
  with RA routes without accept_ra set to 2 is likely to cause routes
  loss. Interfaces to look at: wlp4s0

wlp4s0 is my wifi card that is configured by NetworkManager in a
completely unremarkable fashion. By default it gets an ipv6 via SLAAC
from my router. This feels a bit like the unresolved bug [1] which
says that systemd-networkd is handling the RA's in userspace for
... reasons [2]. It's unclear to me if NetworkManager is doing
similar.

I feel like this must be a red-herring. My wired interface has the
same setting of 0

  $ cat /proc/sys/net/ipv6/conf/enp0s31f6/accept_ra
  0

and is similarly just a very standard auto-configured NetworkManager
interface. When I "net-start" the network whilst on wifi libvirt
doesn't seem to care about that interface (I presume it only looks at
the active one?). When I dock and turn off wifi, ipv6 connectivity
continues to work through enp0s31f6, so I don't think the accept_ra
really matters in this case.

I feel like this message is incorrect, and being as I've done nothing
special to my underlying interfaces probably going to be wrong for a
lot of people trying this? Does anyone know the details of this
message and see why it would be required in this situation?

The other thing that I'd like to expand the documentation on, if I can
get some clarity, is the choice of network. It seems like it has to
be a /64, and it seems like the best choice is within fc00::/7, or at
least that is what has been assigned for private networks like this
[3]?

The only problem with this is that I think glibc filters this range so
nothing prefers IPv6. Is this the range expected to be used for ipv6
NAT? If so, would a patch to drop some documentation breadcrumbs
about setting gai.conf or something be useful?

Or are there better choices for the network?

Thanks!

-i

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1639087
[2] https://github.com/systemd/systemd/commit/3b015d40c19d9338b66bf916d84dec601019c811
[3] https://tools.ietf.org/html/rfc4193
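The per-interface check done above with cat can be repeated for every interface at once. What follows is a minimal sketch, assuming the usual /proc/sys/net/ipv6/conf layout; the function name read_accept_ra and its conf_root parameter are illustrative and not part of libvirt.

```python
from pathlib import Path


def read_accept_ra(conf_root="/proc/sys/net/ipv6/conf"):
    """Return {interface: accept_ra value} for each directory under conf_root.

    conf_root is a parameter only so the function can be pointed at a
    test directory; on a real host the default path is the one to use.
    """
    root = Path(conf_root)
    if not root.is_dir():
        # No IPv6 sysctl tree (non-Linux host or IPv6 disabled).
        return {}
    values = {}
    for entry in sorted(root.iterdir()):
        knob = entry / "accept_ra"
        if knob.is_file():
            values[entry.name] = int(knob.read_text().strip())
    return values


if __name__ == "__main__":
    for name, value in read_accept_ra().items():
        print(f"{name}: accept_ra={value}")
```

On a NetworkManager-managed laptop like the one described, this would be expected to show 0 for both the wifi and the wired interface.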
On 8/10/20 11:23 PM, Ian Wienand wrote:
> Hello,
>
> Firstly THANK YOU for the IPv6 NAT support merged in 6.5. It has been
> almost impossible to get IPv6 into a VM on a laptop that switches
> between wifi and wired (dock) connections, because you can not add a
> wifi interface to a bridge. I know NAT is against the IPv6 end-to-end
> xen but it makes this "just work" for the vast majority of people like
> me who need to ssh/curl/talk to ipv6 only hosts!
>
> So I installed 6.6.0 from the virt-preview repos on Fedora 32 to
> eagerly test it out.
>
> My network config looks like
>
>   <network>
>     <name>network</name>
>     <uuid> ... </uuid>
>     <forward mode='nat'>
>       <nat ipv6='yes'/>
>     </forward>
>     <bridge name='virbr0' stp='on' delay='0'/>
>     <mac address=' ... '/>
>     <domain name='network'/>
>     <ip address='192.168.100.1' netmask='255.255.255.0'>
>       <dhcp>
>         <range start='192.168.100.128' end='192.168.100.254'/>
>       </dhcp>
>     </ip>
>     <ip family='ipv6' address='fc00:dead:beef:55::' prefix='64'>
>     </ip>
>   </network>
>
> The first problem I hit was trying to start that network:
>
>   error: internal error: Check the host setup: enabling IPv6 forwarding
>   with RA routes without accept_ra set to 2 is likely to cause routes
>   loss. Interfaces to look at: wlp4s0
>
> wlp4s0 is my wifi card that is configured by NetworkManager in a
> completely unremarkable fashion. By default it gets an ipv6 via SLAAC
> from my router. This feels a bit like the unresolved bug [1] which
> says that systemd-networkd is handling the RA's in userspace for
> ... reasons [2]. It's unclear to me if NetworkManager is doing
> similar.

Yes, and yes. The only reason I haven't done something about this is
that I'm undecided *what* to do. On one hand it seems many (most)
systems are handling RAs with a userspace process, so it doesn't matter
that it's disabled in the kernel. On the other hand, the person who
added this check must have had a valid reason for going to the trouble
of adding it (rather than just documenting that you needed to set
accept_ra to 2 for some set of interfaces (I forget right now exactly
which ones, and I'm trying to wind my brain down for the end of the day,
so don't want to go look it up :-)

I can see 3 possibilities:

1) completely remove the check, with the idea that while it was a good
thing at the time, it's now obsolete.

2) have a config item (in /etc/libvirt/network.conf (which doesn't
currently exist) maybe?) to let people manually disable the check.

3) try to make libvirt's code intelligent, and look for clues that RAs
are handled elsewhere (someone would need to figure out what those
"clues" are).

>
> I feel like this must be a red-herring. My wired interface has the
> same setting of 0
>
>   $ cat /proc/sys/net/ipv6/conf/enp0s31f6/accept_ra
>   0
>
> and is similarly just a very standard auto-configured NetworkManager
> interface. When I "net-start" the network whilst on wifi libvirt
> doesn't seem to care about that interface (I presume it only looks at
> the active one?). When I dock and turn off wifi, ipv6 connectivity
> continues to work through enp0s31f6, so I don't think the accept_ra
> really matters in this case.

Because you're using NetworkManager. I've confirmed with [some NM
person, I forget who or in what venue] that NM handles RAs itself, so
accept_ra should be turned off in the kernel (it's not harmful if it's
on as far as I know, it just does nothing useful)

>
> I feel like this message is incorrect, and being as I've done nothing
> special to my underlying interfaces probably going to be wrong for a
> lot of people trying this? Does anyone know the details of this
> message and see why it would be required in this situation?

It isn't. We just need to decide which of the ways listed above to fix it.

>
> The other thing that I'd like to expand the documentation on, if I can
> get some clarity, is the choice of network. It seems like it has to
> be a /64, and it seems like the best choice is within fc00::/7, or at
> least that is what has been assigned for private networks like this
> [3]?

"locally assigned" addresses in IPv6 are... different. I've been trying
to figure this out myself (in order to *automatically* assign a network
address to a libvirt virtual network, as Dan suggested in the cover
letter for the IPv6 NAT patches), and I *think* you need to at least set
the lowest bit of the first byte of the address (that's the "locally
assigned" bit). So that would mean that all networks should be somewhere
within FD00::/8 (but please correct me if I'm wrong!)

>
> The only problem with this is that I think glibc filters this range so
> nothing prefers IPv6.

What?? Exactly what isn't preferring IPv6? Do you mean outbound
connections that would be to an IPv6 address will be nixed in favor of
an IPv4 address if the source IP of the connection was going to be in
FC00::/7? Or something else? Do you have a reference for this?

> Is this the range expected to be used for ipv6
> NAT? If so, would a patch to drop some documentation breadcrumbs
> about setting gai.conf or something be useful?

The man page for gai.conf *implies* that glibc is following the
preference rules suggested in RFC3484, which was written prior to
RFC4193, so it seems strange that it would give any special treatment to
addresses in that range. Does it behave in the same way if you use
FD00::... instead of FC00::...? (probably, but worth checking)

> Or are there better choices for the network?

I've Cc'ed Stefano Brivio, who has worked on IPv6 in the kernel, and (at
least based on the conversations I've had with him) has a much better
knowledge of IPv6. Maybe he can offer some advice.

(BTW, he was playing around with defining an IPv6 libvirt network that
used the same network as the host's physical interface, then turning on
ndp-proxy, and finally adding a host route for each guest IP; this
permits the guests to all be on the same IPv6 network as the host; if we
can get all of those steps automated in a libvirt virtual network, it
will be even better than IPv6 NAT!)

>
> Thanks!
>
> -i
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1639087
> [2] https://github.com/systemd/systemd/commit/3b015d40c19d9338b66bf916d84dec601019c811
> [3] https://tools.ietf.org/html/rfc4193
>
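The "locally assigned" layout discussed above (prefix fc00::/7 with the L bit set, giving fd00::/8, followed by a 40-bit Global ID and a 16-bit Subnet ID) can be sketched as follows. This is only an illustration under stated assumptions: RFC 4193 suggests deriving the Global ID from a hash of a timestamp and a machine identifier, whereas this sketch simply draws 40 bits from Python's secrets module, and the function name random_ula_subnet is hypothetical.

```python
import ipaddress
import secrets


def random_ula_subnet(subnet_id=0):
    """Build a /64 inside fd00::/8 from a random 40-bit Global ID.

    Not the RFC 4193 sample algorithm: the Global ID here comes straight
    from the OS random source rather than from a SHA-1 of time + EUI-64.
    """
    if not 0 <= subnet_id <= 0xFFFF:
        raise ValueError("subnet_id must fit in 16 bits")
    global_id = secrets.randbits(40)
    # Upper 64 bits: 0xfd (8 bits) | Global ID (40 bits) | Subnet ID (16 bits)
    upper = (0xFD << 56) | (global_id << 16) | subnet_id
    return ipaddress.IPv6Network((upper << 64, 64))


if __name__ == "__main__":
    print(random_ula_subnet(0x55))
```

The printed network could then be used as the address/prefix pair of the <ip family='ipv6'> element, with the host taking an address such as ::1 inside it.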
Hello, thanks for looking!

On Tue, Aug 11, 2020 at 11:52:46PM -0400, Laine Stump wrote:
> > The first problem I hit was trying to start that network:
> >
> > error: internal error: Check the host setup: enabling IPv6 forwarding
> > with RA routes without accept_ra set to 2 is likely to cause routes
> > loss. Interfaces to look at: wlp4s0
>
> I can see 3 possibilities:
>
> 3) try to make libvirt's code intelligent, and look for clues that RAs are
> handled elsewhere (someone would need to figure out what those "clues" are).

Perhaps my proposal at [1] falls into this category. The theory is to
only warn when an interface is already set to 1 [2] because that seems
to be when you would be expecting the interface to accept RA's, but
enabling forwarding would inhibit it.

From sysctl docs:

  Possible values are:
    0 Do not accept Router Advertisements.
    1 Accept Router Advertisements if forwarding is disabled.
    2 Overrule forwarding behaviour. Accept Router Advertisements
      even if forwarding is enabled.

> > The other thing that I'd like to expand the documentation on, if I can
> > get some clarity, is the choice of network. It seems like it has to
> > be a /64, and it seems like the best choice is within fc00::/7, or at
> > least that is what has been assigned for private networks like this
> > [3]?
>
> "locally assigned" addresses in IPv6 are... different. I've been trying to
> figure this out myself (in order to *automatically* assign a network address
> to a libvirt virtual network, as Dan suggested in the cover letter for the
> IPv6 NAT patches), and I *think* you need to at least set the lowest bit of
> the first byte of the address (that's the "locally assigned" bit). So that
> would mean that all networks should be somewhere within FD00::/8 (but please
> correct me if I'm wrong!)

... different ... indeed! :) I've proposed [3] with what I've found
out. Yes I agree it seems the intent of FD00::/8 is to be somewhat
analogous to 192.168 ... but you know of course it's IPv6 so there's a
page worth of details in the RFC on how to generate 40 random bits ...

> > The only problem with this is that I think glibc filters this range so
> > nothing prefers IPv6.
>
> What?? Exactly what isn't preferring IPv6? Do you mean outbound connections
> that would be to an IPv6 address will be nixed in favor of an IPv4 address
> if the source IP of the connection was going to be in FC00::/7? Or something
> else? Do you have a reference for this?

> The man page for gai.conf *implies* that glibc is following the preference
> rules suggested in RFC3484, which was written prior to RFC4193, so it seems
> strange that it would give any special treatment to addresses in that range.
> Does it behave in the same way if you use FD00::... instead of FC00::...?
> (probably, but worth checking)

Yeah, on my Fedora 32 host I have to override gai.conf to prefer ipv6
(there is no default /etc/gai.conf) with what I took directly from
RFC3484

---
label ::1/128 0
label ::/0 1
label 2002::/16 2
label ::/96 3
label ::ffff:0:0/96 4
precedence ::1/128 50
precedence ::/0 40
precedence 2002::/16 30
precedence ::/96 20
precedence ::ffff:0:0/96 10
---

I think the gai.conf it is actually using is reflected in
/usr/share/doc/glibc-common/gai.conf; which has this comment:

  # This default differs from the tables given in RFC 3484 by handling
  # (now obsolete) site-local IPv6 addresses and Unique Local Addresses.
  # The reason for this difference is that these addresses are never
  # NATed while IPv4 site-local addresses most probably are. Given
  # the precedence of IPv6 over IPv4 (see below) on machines having only
  # site-local IPv4 and IPv6 addresses a lookup for a global address would
  # see the IPv6 be preferred. The result is a long delay because the
  # site-local IPv6 addresses cannot be used while the IPv4 address is
  # (at least for the foreseeable future) NATed. We also treat Teredo
  # tunnels special.

(it must be compiled in defaults?)
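To make the effect of the label table concrete, here is a minimal sketch of the RFC 3484 label lookup (longest matching prefix) and of the "prefer matching label" comparison between a source and a destination address. The first table is the one quoted above. The second adds a separate label for fc00::/7, which is an assumption based on the quoted comment; the value 6 is illustrative and should be checked against /usr/share/doc/glibc-common/gai.conf. All function names are hypothetical.

```python
import ipaddress


def parse_table(entries):
    """Turn (prefix, label) pairs into (IPv6Network, label) pairs."""
    return [(ipaddress.IPv6Network(prefix), label) for prefix, label in entries]


# Label table as quoted from RFC 3484 in the message above.
RFC3484_LABELS = parse_table([
    ("::1/128", 0), ("::/0", 1), ("2002::/16", 2),
    ("::/96", 3), ("::ffff:0:0/96", 4),
])

# Assumption: a table that gives Unique Local Addresses their own label.
# The value 6 is illustrative, not taken from the thread.
WITH_ULA_LABEL = RFC3484_LABELS + parse_table([("fc00::/7", 6)])


def label_for(address, table):
    """Return the label of the longest prefix in table containing address."""
    addr = ipaddress.IPv6Address(address)
    matches = [(net.prefixlen, label) for net, label in table if addr in net]
    return max(matches)[1]


def labels_match(source, destination, table):
    """RFC 3484 destination rule 5 favours pairs whose labels are equal."""
    return label_for(source, table) == label_for(destination, table)


if __name__ == "__main__":
    src, dst = "fc00:dead:beef:55::10", "2001:db8::1"
    print("RFC 3484 table:", labels_match(src, dst, RFC3484_LABELS))
    print("with ULA label:", labels_match(src, dst, WITH_ULA_LABEL))
```

With the plain RFC 3484 table, a ULA source and a global destination share a label, so the comparison falls through to precedence, where IPv6 outranks IPv4. Once ULAs carry their own label, the IPv6 pair no longer matches, while a private IPv4 source and a public IPv4 destination (both under ::ffff:0:0/96) still do, which would explain IPv4 being chosen first.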
So; it seems a choice made by the practicalities of basically people
having these addresses that *weren't* routable and having a terrible
experience. Of course, now legitimate use is collateral damage.

Perhaps we should raise this with the distro -- but I expect if they
update it, they might be back in the position of people reporting "why
is my website taking 30 seconds to load" :/ Of course I'm sure other
distros have made other choices too.

> (BTW, he was playing around with defining an IPv6 libvirt network that used
> the same network as the host's physical interface, then turning on
> ndp-proxy, and finally adding a host route for each guest IP; this permits
> the guests to all be on the same IPv6 network as the host; if we can get all
> of those steps automated in a libvirt virtual network, it will be even
> better than IPv6 NAT!)

I just want to access ipv6 only clouds on my laptop from my work VM
over wifi and plugged into the docking station :)

That might be similar to what VirtualBox does? That allows a guest to
have a NIC bridged to a wifi card, that seems to get an address (RA
makes it in?) but no packets flow for me. Apparently with some
wireless NICs it might work, but not mine I guess.

-i

[1] https://www.redhat.com/archives/libvir-list/2020-August/msg00437.html
[2] https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
[3] https://www.redhat.com/archives/libvir-list/2020-August/msg00438.html
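The theory described in this message, warning only for interfaces whose accept_ra is 1, follows directly from the three sysctl values: 0 means the kernel never accepted RAs, 2 means it keeps accepting them with forwarding on, and only 1 changes behaviour when forwarding is enabled. A minimal sketch of that decision follows; it illustrates the reasoning only and is not the code of the proposed patch, and both function names are hypothetical.

```python
def ra_routes_at_risk(accept_ra):
    """Return True if enabling forwarding would stop kernel RA processing.

    0: kernel ignores RAs already, so nothing changes.
    1: kernel accepts RAs only while forwarding is disabled.
    2: kernel accepts RAs regardless of forwarding.
    """
    if accept_ra not in (0, 1, 2):
        raise ValueError(f"unexpected accept_ra value: {accept_ra!r}")
    return accept_ra == 1


def interfaces_to_warn_about(accept_ra_by_interface):
    """Return the sorted interface names whose RA routes could be lost."""
    return sorted(
        name
        for name, value in accept_ra_by_interface.items()
        if ra_routes_at_risk(value)
    )


if __name__ == "__main__":
    # Values for wlp4s0 and enp0s31f6 as reported earlier in the thread.
    print(interfaces_to_warn_about({"wlp4s0": 0, "enp0s31f6": 0}))
```

For the laptop described in this thread, both interfaces report 0, so under this rule no interface would be flagged.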
Stefano Brivio
2020-Aug-17 23:32 UTC
Re: ipv6 NAT; accept_ra errors and about network choice
Hi,

Sorry for the delay.

On Tue, 11 Aug 2020 23:52:46 -0400
Laine Stump <laine@redhat.com> wrote:

> On 8/10/20 11:23 PM, Ian Wienand wrote:
> > Hello,
> >
> > Firstly THANK YOU for the IPv6 NAT support merged in 6.5. It has been
> > almost impossible to get IPv6 into a VM on a laptop that switches
> > between wifi and wired (dock) connections, because you can not add a
> > wifi interface to a bridge. I know NAT is against the IPv6 end-to-end
> > xen but it makes this "just work" for the vast majority of people like
> > me who need to ssh/curl/talk to ipv6 only hosts!
> >
> > So I installed 6.6.0 from the virt-preview repos on Fedora 32 to
> > eagerly test it out.
> >
> > My network config looks like
> >
> >   <network>
> >     <name>network</name>
> >     <uuid> ... </uuid>
> >     <forward mode='nat'>
> >       <nat ipv6='yes'/>
> >     </forward>
> >     <bridge name='virbr0' stp='on' delay='0'/>
> >     <mac address=' ... '/>
> >     <domain name='network'/>
> >     <ip address='192.168.100.1' netmask='255.255.255.0'>
> >       <dhcp>
> >         <range start='192.168.100.128' end='192.168.100.254'/>
> >       </dhcp>
> >     </ip>
> >     <ip family='ipv6' address='fc00:dead:beef:55::' prefix='64'>
> >     </ip>
> >   </network>
> >
> > The first problem I hit was trying to start that network:
> >
> >   error: internal error: Check the host setup: enabling IPv6 forwarding
> >   with RA routes without accept_ra set to 2 is likely to cause routes
> >   loss. Interfaces to look at: wlp4s0
> >
> > wlp4s0 is my wifi card that is configured by NetworkManager in a
> > completely unremarkable fashion. By default it gets an ipv6 via SLAAC
> > from my router. This feels a bit like the unresolved bug [1] which
> > says that systemd-networkd is handling the RA's in userspace for
> > ... reasons [2]. It's unclear to me if NetworkManager is doing
> > similar.
>
> Yes, and yes. The only reason I haven't done something about this is
> that I'm undecided *what* to do. On one hand it seems many (most)
> systems are handling RAs with a userspace process, so it doesn't matter
> that it's disabled in the kernel. On the other hand, the person who
> added this check must have had a valid reason for going to the trouble
> of adding it (rather than just documenting that you needed to set
> accept_ra to 2 for some set of interfaces (I forget right now exactly
> which ones, and I'm trying to wind my brain down for the end of the day,
> so don't want to go look it up :-)

The check comes from commit 00d28a78b5d1 ("network: check accept_ra
before enabling ipv6 forwarding"), and it's there because the
accept_ra flag works like this (from
Documentation/networking/ip-sysctl.txt):

  0 Do not accept Router Advertisements.
  1 Accept Router Advertisements if forwarding is disabled.
  2 Overrule forwarding behaviour. Accept Router Advertisements
    even if forwarding is enabled.

Now, as libvirt enables IPv6 forwarding via
/proc/sys/net/ipv6/conf/all/forwarding (in my opinion, this could be
limited to the interfaces involved), router advertisements would start
being discarded on all interfaces if this is '1'.

Another half-baked idea I was thinking about is: if there's at least
one address on a given interface with the 'noprefixroute' flag, that
means they are added by userspace. In that case,
virNetDevIPCheckIPv6ForwardingCallback() could set data->hasRARoutes
to false, and if userspace is explicitly handling RAs, don't worry at
all about accept_ra -- 0 is fine if it was set e.g. by NetworkManager.

Otherwise, just go ahead and set it to 2, we're not conflicting with
anything that would set addresses from RAs (other than the kernel).

> I can see 3 possibilities:
>
> 1) completely remove the check, with the idea that while it was a good
> thing at the time, it's now obsolete.
>
> 2) have a config item (in /etc/libvirt/network.conf (which doesn't
> currently exist) maybe?) to let people manually disable the check.
>
> 3) try to make libvirt's code intelligent, and look for clues that RAs
> are handled elsewhere (someone would need to figure out what those
> "clues" are).

Yes, addresses with 'noprefixroute' should be a safe choice: userspace
agents need to create routes separately anyway, but it won't be set if
the kernel is setting up those addresses.

> > I feel like this must be a red-herring. My wired interface has the
> > same setting of 0
> >
> >   $ cat /proc/sys/net/ipv6/conf/enp0s31f6/accept_ra
> >   0
> >
> > and is similarly just a very standard auto-configured NetworkManager
> > interface. When I "net-start" the network whilst on wifi libvirt
> > doesn't seem to care about that interface (I presume it only looks at
> > the active one?). When I dock and turn off wifi, ipv6 connectivity
> > continues to work through enp0s31f6, so I don't think the accept_ra
> > really matters in this case.
>
> Because you're using NetworkManager. I've confirmed with [some NM
> person, I forget who or in what venue] that NM handles RAs itself, so
> accept_ra should be turned off in the kernel (it's not harmful if it's
> on as far as I know, it just does nothing useful)
>
> >
> > I feel like this message is incorrect, and being as I've done nothing
> > special to my underlying interfaces probably going to be wrong for a
> > lot of people trying this? Does anyone know the details of this
> > message and see why it would be required in this situation?
>
> It isn't. We just need to decide which of the ways listed above to fix it.
>
> >
> > The other thing that I'd like to expand the documentation on, if I can
> > get some clarity, is the choice of network. It seems like it has to
> > be a /64, and it seems like the best choice is within fc00::/7, or at
> > least that is what has been assigned for private networks like this
> > [3]?
>
> "locally assigned" addresses in IPv6 are... different. I've been trying
> to figure this out myself (in order to *automatically* assign a network
> address to a libvirt virtual network, as Dan suggested in the cover
> letter for the IPv6 NAT patches), and I *think* you need to at least set
> the lowest bit of the first byte of the address (that's the "locally
> assigned" bit). So that would mean that all networks should be somewhere
> within FD00::/8 (but please correct me if I'm wrong!)
>
> >
> > The only problem with this is that I think glibc filters this range so
> > nothing prefers IPv6.
>
> What?? Exactly what isn't preferring IPv6? Do you mean outbound
> connections that would be to an IPv6 address will be nixed in favor of
> an IPv4 address if the source IP of the connection was going to be in
> FC00::/7? Or something else? Do you have a reference for this?
>
> > Is this the range expected to be used for ipv6
> > NAT? If so, would a patch to drop some documentation breadcrumbs
> > about setting gai.conf or something be useful?
>
> The man page for gai.conf *implies* that glibc is following the
> preference rules suggested in RFC3484, which was written prior to
> RFC4193, so it seems strange that it would give any special treatment to
> addresses in that range. Does it behave in the same way if you use
> FD00::... instead of FC00::...? (probably, but worth checking)
>
> > Or are there better choices for the network?
>
> I've Cc'ed Stefano Brivio, who has worked on IPv6 in the kernel, and (at
> least based on the conversations I've had with him) has a much better
> knowledge of IPv6. Maybe he can offer some advice.
>
> (BTW, he was playing around with defining an IPv6 libvirt network that
> used the same network as the host's physical interface, then turning on
> ndp-proxy, and finally adding a host route for each guest IP; this
> permits the guests to all be on the same IPv6 network as the host; if we
> can get all of those steps automated in a libvirt virtual network, it
> will be even better than IPv6 NAT!)

Yes, that would be ideal. I don't think NAT with IPv6 is a wise thing
to do, but my ISP just delegates a /64 prefix to me. So I need NDP
proxying because my guests need to appear on the same network.

I do it manually with something like:

  echo 1 > /proc/sys/net/ipv6/conf/<upstream interface>/proxy_ndp
  ip -6 neigh add proxy <guest address> dev <upstream interface>

and passing my network prefix to libvirt:

  <ip family='ipv6' address='<my prefix>::1' prefix='64'>
  </ip>

works flawlessly, dnsmasq gets configured properly on the host and the
guest can use SLAAC, also DNS configuration (RFC 8106) worked.

Other than NDP proxying, another slightly problematic item was that I
tried, as a hack, to pass a different prefix there (should never be
done, indeed). dnsmasq fails, but silently, and libvirt accepts it,
also silently -- it should probably warn instead, even just because
it won't work with dnsmasq.

-- 
Stefano
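The 'noprefixroute' heuristic described above can be prototyped outside libvirt by inspecting the JSON that iproute2 prints for `ip -j -6 addr show`. This is a minimal sketch under stated assumptions: it assumes the JSON exposes the keys ifname, addr_info, family and a boolean noprefixroute, which should be verified against the iproute2 version in use, and the function name userspace_managed is hypothetical. It is not how virNetDevIPCheckIPv6ForwardingCallback() is implemented.

```python
import json


def userspace_managed(addr_json):
    """Map interface name -> True if any IPv6 address has noprefixroute.

    addr_json is the text output of `ip -j -6 addr show` (assumed format).
    """
    result = {}
    for link in json.loads(addr_json):
        addresses = link.get("addr_info", [])
        result[link["ifname"]] = any(
            addr.get("family") == "inet6" and addr.get("noprefixroute", False)
            for addr in addresses
        )
    return result


# Hand-written sample in the assumed format; addresses are documentation
# prefixes, not taken from the thread.
SAMPLE = """
[
  {"ifname": "wlp4s0", "addr_info": [
    {"family": "inet6", "local": "2001:db8::10", "prefixlen": 64,
     "dynamic": true, "noprefixroute": true}
  ]},
  {"ifname": "eth0", "addr_info": [
    {"family": "inet6", "local": "2001:db8:1::20", "prefixlen": 64,
     "dynamic": true}
  ]}
]
"""

if __name__ == "__main__":
    print(userspace_managed(SAMPLE))
```

Under the idea above, an interface mapped to True would be treated as handled by a userspace agent such as NetworkManager, so an accept_ra value of 0 on it would not need a warning.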