Laszlo Ersek
2022-Mar-31 13:38 UTC
[Libguestfs] [p2v PATCH 10/10] nbd.c: bind listening socket without AI_ADDRCONFIG
On 03/31/22 15:14, Richard W.M. Jones wrote:> On Thu, Mar 31, 2022 at 09:22:11AM +0200, Laszlo Ersek wrote: >> (This patch is similar to nbdkit commit 9eec2335d630, "server/sockets: get >> rid of AI_ADDRCONFIG", 2022-01-19). >> >> Consider the following call tree: >> >> start_conversion() [conversion.c] >> start_nbd_server() [nbd.c] >> open_listening_socket() [nbd.c] >> bind_tcpip_socket() [nbd.c] >> getaddrinfo() >> socket() >> bind() >> wait_for_nbd_server_to_start() [nbd.c] >> connect_to_nbdkit() [nbd.c] >> getaddrinfo() >> socket() >> connect() >> open_data_connection() [ssh.c] >> /* "-R 0:localhost:<port>" */ >> >> - For each of IPv4 and IPv6, if the network config on the host running >> virt-p2v supports that protocol, then bind_tcpip_socket() intends to >> bind the port for that protocol. >> >> - connect_to_nbdkit() connects to the port using *one* of IPv4 and IPv6; >> it just wants to see NBDMAGIC, regardless of IP version. >> >> - The ssh "-R 0:localhost:<port>" option, formatted by >> open_data_connection(), instructs ssh to create a reverse forwarding >> channel (a listening socket) per IP version (this can be verified with >> "netstat" on the conversion server). In case the reverse NBD connection >> on the conversion server were made to sshd over IPv6, then ssh on the >> p2v server would presumably want to connect to nbdkit over IPv6 too. >> >> The (theoretical) problem with using AI_ADDRCONFIG in bind_tcpip_socket() >> is that, in case the p2v server has no publicly routable IPv6 address >> assigned, then bind_tcpip_socket() will not bind ::1 from "localhost". And >> then the IPv6 reverse forwarding attempt, set up by >> open_data_connection(), *might* fail. >> >> Remove AI_ADDRCONFIG anyway. While at it, spell out AF_UNSPEC as well (in >> practice, this makes no difference, as Linux defines AF_UNSPEC as >> PF_UNSPEC as 0). >> >> While running the local test suite ("make -j10 check") on a host without a >> publicly routable IPv6 address, the "netstat -anp" output changes, due to >> this patch. Before: >> >>> Active Internet connections (servers and established) >>> Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name >>> tcp 0 0 127.0.0.1:58273 0.0.0.0:* LISTEN 51057/nbdkit >>> tcp 0 0 127.0.0.1:58272 0.0.0.0:* LISTEN 51050/nbdkit >>> tcp 0 0 127.0.0.1:58272 127.0.0.1:35332 ESTABLISHED 51050/nbdkit >>> tcp 0 0 127.0.0.1:35332 127.0.0.1:58272 ESTABLISHED 51204/nbdkit >> >> We have two nbdkit processes (PIDs 51057 and 51050) listening over TCPv4, >> and a third one (PID 51204) connected to PID 51050 over TCPv4. >> >> After: >> >>> Active Internet connections (servers and established) >>> Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name >>> tcp 0 0 127.0.0.1:54146 0.0.0.0:* LISTEN 49310/nbdkit >>> tcp 0 0 127.0.0.1:54145 0.0.0.0:* LISTEN 49303/nbdkit >>> tcp6 0 0 ::1:54145 :::* LISTEN 49303/nbdkit >>> tcp6 0 0 ::1:54146 :::* LISTEN 49310/nbdkit >>> tcp6 0 0 ::1:57432 ::1:54145 ESTABLISHED 49457/nbdkit >>> tcp6 0 0 ::1:54145 ::1:57432 ESTABLISHED 49303/nbdkit >> >> The two listening nbdkit processes (PIDs 49310 and 49303) are now doing so >> over both TCPv4 and TCPv6, and the third one (PID 49457) actually connects >> to PID 49303 over TCPv6! > > Nice!Thanks! BTW: do you have any comments about the following serial log message (see the cover letter):> nbdkit: file[1]: error: reading initial client flags: conn->recv: > Connection reset by peerI did not check if the same was issued *without* this patch in place; I highlighted it in the cover letter because it could ring a bell with you (regardless of AI_ADDRCONFIG / IPv6). Thanks! Laszlo> >> Ref: https://listman.redhat.com/archives/libguestfs/2022-March/028475.html >> Suggested-by: Richard W.M. Jones <rjones at redhat.com> >> Signed-off-by: Laszlo Ersek <lersek at redhat.com> >> --- >> nbd.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/nbd.c b/nbd.c >> index dcedd0a52dce..92fad34c0e32 100644 >> --- a/nbd.c >> +++ b/nbd.c >> @@ -262,90 +262,91 @@ static int >> bind_tcpip_socket (const char *port, int **fds_rtn, size_t *nr_fds_rtn) >> { >> struct addrinfo *ai = NULL; >> struct addrinfo hints; >> struct addrinfo *a; >> int err; >> int *fds = NULL; >> size_t nr_fds; >> int addr_in_use = 0; >> >> memset (&hints, 0, sizeof hints); >> - hints.ai_flags = AI_PASSIVE | AI_ADDRCONFIG; >> + hints.ai_family = AF_UNSPEC; >> + hints.ai_flags = AI_PASSIVE; >> hints.ai_socktype = SOCK_STREAM; >> >> err = getaddrinfo ("localhost", port, &hints, &ai); >> if (err != 0) { > > Reviewed-by: Richard W.M. Jones <rjones at redhat.com> > > Rich. >
Richard W.M. Jones
2022-Mar-31 14:44 UTC
[Libguestfs] [p2v PATCH 10/10] nbd.c: bind listening socket without AI_ADDRCONFIG
On Thu, Mar 31, 2022 at 03:38:40PM +0200, Laszlo Ersek wrote:> Thanks! BTW: do you have any comments about the following serial log > message (see the cover letter):[https://listman.redhat.com/archives/libguestfs/2022-March/028498.html]> > nbdkit: file[1]: error: reading initial client flags: conn->recv: > > Connection reset by peerWhat's happening is nbdkit is sending the initial handshake and expecting client flags back, but received a RST instead. I couldn't quite work out exactly why, but it seems to be a side-effect of the way we test that the NBD server is up: https://github.com/libguestfs/virt-p2v/blob/c1a4bd83a8bba5959470ef1f6deb06dc61587112/nbd.c#L659 I believe after your simplifications that code is not necessary at all, since with systemd socket activation the sockets passed to the server are always listening (even if the server isn't ready to accept a connection at that moment). Does it work OK if you simply remove wait_for_nbd_server_to_start and the place where that function is called? (CC-d Eric) Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com nbdkit - Flexible, fast NBD server with plugins https://gitlab.com/nbdkit/nbdkit
Laszlo Ersek
2022-Apr-01 10:07 UTC
[Libguestfs] [p2v PATCH 10/10] nbd.c: bind listening socket without AI_ADDRCONFIG
On 03/31/22 16:44, Richard W.M. Jones wrote:> On Thu, Mar 31, 2022 at 03:38:40PM +0200, Laszlo Ersek wrote: >> Thanks! BTW: do you have any comments about the following serial log >> message (see the cover letter): > > [https://listman.redhat.com/archives/libguestfs/2022-March/028498.html] > >>> nbdkit: file[1]: error: reading initial client flags: conn->recv: >>> Connection reset by peer > > What's happening is nbdkit is sending the initial handshake and > expecting client flags back, but received a RST instead. I couldn't > quite work out exactly why, but it seems to be a side-effect of the > way we test that the NBD server is up: > > https://github.com/libguestfs/virt-p2v/blob/c1a4bd83a8bba5959470ef1f6deb06dc61587112/nbd.c#L659 > > I believe after your simplifications that code is not necessary at > all, since with systemd socket activation the sockets passed to the > server are always listening (even if the server isn't ready to accept > a connection at that moment).Oh I think understand it now. In wait_for_nbd_server_to_start(), we connect to our nbdkit server child process like any other nbdkit client would, and even read "NBDMAGIC" (common to both old and new style negotiation). The server (the child process) most likely places the next protocol words in its Send-Q as well (IHAVEOPT and handshake flags), and then attempts to read back the client flags. But the "client", wait_for_nbd_server_to_start(), doesn't even attempt reading past byte#7 from its Recv-Q: char magic[8]; /* NBDMAGIC */ size_t bytes_read = 0; ssize_t recvd; ... do { recvd = recv (sockfd, magic, sizeof magic - bytes_read, 0); ... } ... close (sockfd); so when we reach the close(), one of two cases can happen: - the client's Recv-Q has some bytes from the server's IHAVEOPT, and the handshake flags, and throws them away, or - the client's Recv-Q is now empty (it didn't have any bytes beyond NBDMAGIC), but the server's Send-Q is not empty. I think that, in either case, the close() will result in an RST for whatever the server tries to do next with the socket. (Much more likely in the second case, i.e. when the server's Send-Q was not empty at the time of the client's close(), but I think the same happens in the first case too, when the client simply throws away data from its Recv-Q.) Now the question is if we are bothered by this "RST" in the conversion log, as it's expected. I think wait_for_nbd_server_to_start() does add a little bit of safety. Namely, if the child goes away unexpectedly (because the execlp() fails, or the new image exits for whatever reason very soon), we find out about that (with a timeout in the worst case), and don't start up "ssh -R". With wait_for_nbd_server_to_start() removed, this "child failure" will only be visible by virt-v2v attempting to connect back via "ssh -R", and then (with ssh on the p2v box getting an RST), witnessing a similarly abrubpt EOF / RST from sshd on the conversion server. Ultimately virt-v2v will fail, and this will be visible on virt-p2v, but that's a detour. I agree that functionally, wait_for_nbd_server_to_start() is not necessary, as long as the child does not vanish. With the child present, there is always at least one file descriptor pointing to the file description pointing to the listening socket, so the p2v side of "ssh -R" will always have a socket to connect to. I'll append a patch anyway to the series, just for discussion's sake.> Does it work OK if you simply remove wait_for_nbd_server_to_start and > the place where that function is called?Yes, the message disappears then. Thanks Laszlo> > (CC-d Eric) > > Rich. >