Richard W.M. Jones
2022-May-01 10:47 UTC
[Libguestfs] [libguestfs PATCH 1/2] guestfs_readdir(): rewrite with FileOut transfer, to lift protocol limit
On Sun, May 01, 2022 at 09:04:40AM +0200, Laszlo Ersek wrote:> Currently the guestfs_readdir() API can not list long directories, due to > it sending back the whole directory listing in a single guestfs protocol > response, which is limited to GUESTFS_MESSAGE_MAX (approx. 4MB) in size. > > Introduce the "internal_readdir" action, for transferring the directory > listing from the daemon to the library through a FileOut parameter. > Rewrite guestfs_readdir() on top of this new internal function: > > - The new "internal_readdir" action is a daemon action. Repurpose the > "readdir" proc_nr (138) for "internal_readdir". Assume the new action > will first be released in libguestfs-1.48.2.So you can't really do this, it's best to allocate a new proc_nr (we're not going to run out of integers ...). Although the protocol is an internal detail of libguestfs that we can change whenever we want, unfortunately some distros ship the binary appliance to their users, either the one we supply, or one built themselves. The point being that on those distros you could get a mismatch between library & appliance. Reusing the proc_nr will cause weird breakage (probably a hang), whereas using a new number will produce a clear error message.> - Turn "readdir" from a daemon action into a non-daemon one. Call the > daemon action guestfs_internal_readdir() manually, receive the FileOut > parameter into a temp file, then deserialize the dirents array from the > temp file. > > This patch sneakily fixes an independent bug, too. In the pre-patch > do_readdir() function [daemon/readdir.c], when readdir() returns NULL, we > don't distinguish "end of directory stream" from "readdir() failed". This > rewrite fixes this problem -- I didn't see much value separating out the > fix for the original do_readdir(). > > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1674392 > Signed-off-by: Laszlo Ersek <lersek at redhat.com> > --- > lib/Makefile.am | 1 + > generator/actions_core.ml | 127 ++++++++++--------- > generator/proc_nr.ml | 2 +- > daemon/readdir.c | 132 ++++++++++---------- > lib/readdir.c | 131 +++++++++++++++++++ > TODO | 8 -- > 6 files changed, 266 insertions(+), 135 deletions(-)[...]> --- a/generator/proc_nr.ml > +++ b/generator/proc_nr.ml > @@ -152,7 +152,7 @@ let proc_nr = [ > 135, "mknod_b"; > 136, "mknod_c"; > 137, "umask"; > -138, "readdir"; > +138, "internal_readdir"; > 139, "sfdiskM"; > 140, "zfile"; > 141, "getxattrs";So here, remove the old entry completely and add a new one at the end. [...]> + xdrmem_create (&xdr, xdr_buf, GUESTFS_MAX_CHUNK_SIZE, XDR_ENCODE);I've forgotton how xdrmem works. Does it realloc the buffer when it runs out of space? Patch looks generally good to me. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com nbdkit - Flexible, fast NBD server with plugins https://gitlab.com/nbdkit/nbdkit
Laszlo Ersek
2022-May-02 06:59 UTC
[Libguestfs] [libguestfs PATCH 1/2] guestfs_readdir(): rewrite with FileOut transfer, to lift protocol limit
On 05/01/22 12:47, Richard W.M. Jones wrote:> > On Sun, May 01, 2022 at 09:04:40AM +0200, Laszlo Ersek wrote: >> Currently the guestfs_readdir() API can not list long directories, due to >> it sending back the whole directory listing in a single guestfs protocol >> response, which is limited to GUESTFS_MESSAGE_MAX (approx. 4MB) in size. >> >> Introduce the "internal_readdir" action, for transferring the directory >> listing from the daemon to the library through a FileOut parameter. >> Rewrite guestfs_readdir() on top of this new internal function: >> >> - The new "internal_readdir" action is a daemon action. Repurpose the >> "readdir" proc_nr (138) for "internal_readdir". Assume the new action >> will first be released in libguestfs-1.48.2. > > So you can't really do this, it's best to allocate a new proc_nr > (we're not going to run out of integers ...). > > Although the protocol is an internal detail of libguestfs that we can > change whenever we want, unfortunately some distros ship the binary > appliance to their users, either the one we supply, or one built > themselves.Urgh. I was aware that the proc_nr's must always match, but didn't think they were treated as public ABI.> The point being that on those distros you could get a > mismatch between library & appliance. Reusing the proc_nr will cause > weird breakage (probably a hang), whereas using a new number will > produce a clear error message. > >> - Turn "readdir" from a daemon action into a non-daemon one. Call the >> daemon action guestfs_internal_readdir() manually, receive the FileOut >> parameter into a temp file, then deserialize the dirents array from the >> temp file. >> >> This patch sneakily fixes an independent bug, too. In the pre-patch >> do_readdir() function [daemon/readdir.c], when readdir() returns NULL, we >> don't distinguish "end of directory stream" from "readdir() failed". This >> rewrite fixes this problem -- I didn't see much value separating out the >> fix for the original do_readdir(). >> >> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1674392 >> Signed-off-by: Laszlo Ersek <lersek at redhat.com> >> --- >> lib/Makefile.am | 1 + >> generator/actions_core.ml | 127 ++++++++++--------- >> generator/proc_nr.ml | 2 +- >> daemon/readdir.c | 132 ++++++++++---------- >> lib/readdir.c | 131 +++++++++++++++++++ >> TODO | 8 -- >> 6 files changed, 266 insertions(+), 135 deletions(-) > > [...] > >> --- a/generator/proc_nr.ml >> +++ b/generator/proc_nr.ml >> @@ -152,7 +152,7 @@ let proc_nr = [ >> 135, "mknod_b"; >> 136, "mknod_c"; >> 137, "umask"; >> -138, "readdir"; >> +138, "internal_readdir"; >> 139, "sfdiskM"; >> 140, "zfile"; >> 141, "getxattrs"; > > So here, remove the old entry completely and add a new one at the end.OK!> > [...] > >> + xdrmem_create (&xdr, xdr_buf, GUESTFS_MAX_CHUNK_SIZE, XDR_ENCODE); > > I've forgotton how xdrmem works. Does it realloc the buffer when it > runs out of space?No; the encoding that no longer fits fails. But the patch (and the next one too) make sure we never run out of space in the xdr buffer. This patch just assumes that a single guestfs_int_dirent can always be encoded in GUESTFS_MAX_CHUNK_SIZE (which is a justified assumption). The next one actually enforces it.> > Patch looks generally good to me.Thanks! Laszlo> > Rich. >