Eric Blake
2019-Feb-19 15:21 UTC
Re: [Libguestfs] [PATCH nbdkit 4/4] Add linuxdisk plugin.
On 2/19/19 1:49 AM, Richard W.M. Jones wrote:> From: "Richard W.M. Jones" <rjones@redhat.com> > > This plugin allows you to create a complete ext2 filesystem in a GPT > partitioned disk image. This can be attached as a disk to a Linux > virtual machine. It is implemented using libext2fs (the same as > supermin). > > Although there is some overlap with nbdkit-iso-plugin and > nbdkit-floppy-plugin, the implementations and use cases of all three > plugins are sufficiently different that it seems to make sense to add > another plugin rather than attempting to extend one of the existing > plugins. > > Largely to avoid user error this plugin is read-only. This is a major > difference from the floppy plugin: that plugin allows files to be > modified (but not resized or created) and writes those changes through > to the backing filesystem. While this plugin could easily be made > writable, this would cause almost certain disk corruption when someone > connected two clients at the same time. In any case it doesn't make > much sense for it to be writable by default since the expectation that > writes would somehow modify the original directory on the host > filesystem cannot be satisfied by this or any reasonable > implementation. Users can add the cow filter on top if they really > want writes and know what they are doing: instructions plus disclaimer > about this are included in the man page. > > As mentioned above, this implementation is based on the same idea as > the appliance creation code in supermin. Eventually we could replace > that supermin code with this plugin, but there are some missing > features that would need to be implemented first. > ---> +++ b/plugins/floppy/nbdkit-floppy-plugin.pod > @@ -74,6 +74,7 @@ important, and it simplifies the implementation greatly. > L<nbdkit(1)>, > L<nbdkit-plugin(3)>, > L<nbdkit-file-plugin(1)>, > +L<nbdkit-linuxdisk-plugin(1)>, > L<nbdkit-iso-plugin(1)>. > > =head1 AUTHORS > diff --git a/plugins/iso/nbdkit-iso-plugin.pod b/plugins/iso/nbdkit-iso-plugin.pod > index 4d9cf41..90e26f0 100644Should this have as much text pointing to the other disk plugins...> --- a/plugins/iso/nbdkit-iso-plugin.pod > +++ b/plugins/iso/nbdkit-iso-plugin.pod > @@ -17,8 +17,9 @@ read-only over the NBD protocol. > This plugin uses L<genisoimage(1)> or L<mkisofs(1)> to create the ISO > content. > > -To create a virtual floppy disk instead of a CD, see > -L<nbdkit-floppy-plugin(1)>. > +To create a FAT-formatted virtual floppy disk instead of a CD, see > +L<nbdkit-floppy-plugin(1)>. To create a Linux compatible virtual > +disk, see L<nbdkit-linuxdisk-plugin(1)>....as what you did here?> +++ b/plugins/linuxdisk/filesystem.c> +int > +create_filesystem (struct virtual_disk *disk) > +{> + /* Add 20% to the estimate to account for the overhead of > + * filesystem metadata. Also set a minimum size. Note we are > + * only wasting virtual space (since this will be stored sparsely > + * under $TMDIR) so we can be generous here.TMPDIR> + /* Create the filesystem file. */ > + tmpdir = getenv ("TMPDIR"); > + if (tmpdir == NULL) > + tmpdir = LARGE_TMPDIR; > + if (asprintf (&filename, "%s/linuxdiskXXXXXX", tmpdir) == -1) { > + nbdkit_error ("asprintf: %m"); > + return -1; > + } > + > + disk->fd = mkstemp (filename); > + if (disk->fd == -1) { > + nbdkit_error ("mkstemp: %s: %m", filename); > + free (filename); > + return -1; > + } > + if (ftruncate (disk->fd, size) == -1) { > + nbdkit_error ("ftruncate: %s: %m", filename); > + free (filename);Missing an unlink(). Also, does the caller properly use close(disk->fd) on early exits like this, or does it not matter because we're just going to exit()?> + return -1; > + } > + > + /* Create the filesystem. */ > + if (mke2fs (filename) == -1) { > + unlink (filename); > + free (filename); > + return -1;Another example of early exit leaving disk->fd open.> + } > + > + /* Open the filesystem. */ > + err = ext2fs_open (filename, EXT2_FLAG_RW|EXT2_FLAG_64BITS, 0, 0, > + unix_io_manager, &fs); > + unlink (filename); > + free (filename); > + if (err) { > + nbdkit_error ("ext2fs_open: %s", error_message (err)); > + return -1; > + } > + > + err = ext2fs_read_bitmaps (fs); > + if (err) { > + nbdkit_error ("ext2fs_read_bitmaps: %s", error_message (err)); > + ext2fs_close (fs); > + return -1; > + } > + > + /* First pass: This creates subdirectories in the filesystem and > + * also builds a list of files so we can identify hard links. > + */ > + for (i = 0; i < nr_dirs; ++i) { > + if (visit (dirs[i], fs, EXT2_ROOT_INO, &files, &nr_files) == -1) > + return -1; > + } > + > + if (nr_files > 0) { > + assert (files != NULL); > + > + /* Sort the files by device and inode number to identify hard links. */ > + qsort (files, nr_files, sizeof (struct file), compare_inodes); > + > + /* Second pass: Copy/create the files. */ > + last_file = NULL; > + for (i = 0; i < nr_files; ++i) { > + hardlinked = last_file && compare_inodes (last_file, &files[i]) == 0; > + > + if (!hardlinked) { > + /* Normal case: creating a new inode. */ > + last_file = &files[i]; > + assert (!S_ISDIR (files[i].statbuf.st_mode)); > + > + if (e2copyfile (fs, &files[i]) == -1) > + return -1; > + > + if (linuxdisk_debug_filesystem) > + nbdkit_debug ("%s: <%" PRIu32 ">/%s -> <%" PRIu32 ">", > + files[i].type, > + files[i].dir_ino, files[i].name, files[i].ino); > + } > + else { > + /* Creating a hard link to an existing inode. */ > + if (e2hardlink (fs, last_file, &files[i]) == -1) > + return -1;More early exits; this time, skipping...> + > + if (linuxdisk_debug_filesystem) > + nbdkit_debug ("hard link: <%" PRIu32 ">/%s -> <%" PRIu32 ">", > + files[i].dir_ino, files[i].name, last_file->ino); > + } > + } > + > + for (i = 0; i < nr_files; ++i) { > + free (files[i].pathname); > + free (files[i].name); > + } > + free (files);...the memory cleanup. Again, I guess that's okay if we know the caller is going to exit() on our failure.> + } > + > + /* Close the filesystem. Note we don't bother to sync it because > + * it's a private temporary file which only we will read. > + */ > + ext2fs_close2 (fs, EXT2_FLAG_FLUSH_NO_SYNC); > + return 0; > +} > + > +/* Use ‘du’ to estimate the size of the filesystem quickly. > + * > + * Typical output from ‘du -cs dir1 dir2’ is: > + * > + * 12345 dir1 > + * 34567 dir2 > + * 46912 total > + * > + * We ignore everything except the first number on the last line. > + */ > +static int64_t > +estimate_size (void) > +{> + fprintf (fp, "du -c -k -s");'du -c' is not specified by POSIX, but is reasonable to expect (we can't build this plugin without ext2 support, which implies we are probably on a Linux system to begin with). If so, should we rely on having GNU coreutils' du, where we can use -B1 to force output in bytes...> + /* Run the command. */ > + nbdkit_debug ("%s", command); > + fp = popen (command, "r"); > + free (command); > + if (fp == NULL) { > + nbdkit_error ("du command failed: %m"); > + return -1; > + } > + > + /* Ignore everything up to the last line. */ > + len = 0; > + while (getline (&line, &len, fp) != -1) > + /* empty */; > + if (ferror (fp)) { > + nbdkit_error ("getline failed: %m"); > + free (line); > + fclose (fp);You should really use pclose(fp) here, to avoid libc leaking internal data associated with the pipe.> + return -1; > + } > + > + fclose (fp);and again here, and maybe even check WIFEXITED() and WEXITSTATUS() of the results.> + > + /* Parse the last line. */ > + if (sscanf (line, "%" SCNi64, &ret) != 1 || ret < 0) { > + nbdkit_error ("could not parse last line of output: %s", line); > + free (line); > + return -1; > + } > + free (line); > + > + /* Result is in 1K blocks, convert it to bytes. */ > + ret *= 1024;...rather than having to scale the output of -k?> + return ret; > +} > + > +static int > +mke2fs (const char *filename) > +{> + /* Run the command. */ > + nbdkit_debug ("%s", command); > + r = system (command); > + free (command); > + > + if (WIFEXITED (r) && WEXITSTATUS (r) != 0) { > + nbdkit_error ("mke2fs command failed with exit code %d", WEXITSTATUS (r)); > + return -1; > + } > + else if (WIFSIGNALED (r)) { > + nbdkit_error ("mke2fs command was killed by signal %d", WTERMSIG (r)); > + return -1; > + } > + else if (WIFSTOPPED (r)) { > + nbdkit_error ("mke2fs command was stopped by signal %d", WSTOPSIG (r)); > + return -1; > + }Is WIFSTOPPED() even a possible result of system()? 'man waitpid' says this status is only possible for a call made with WUNTRACED, which seems contradictory to system() running the command to completion without tracing.> + > + return 0; > +} > + > +static int > +visit (const char *dir, ext2_filsys fs, ext2_ino_t dir_ino, > + struct file **files, size_t *nr_files) > +{> + error2: > + closedir (DIR); > + error1: > + err = errno; > +#pragma GCC diagnostic push > +#pragma GCC diagnostic ignored "-Wunused-result" > + chdir (origdir); > +#pragma GCC diagnostic popFailure to return to the original directory may cause the rest of nbdkit to misbehave, shouldn't we check the result after all and exit on error?> + > +/* To identify hard links we sort the files array by device and inode > + * number (thus hard linked files will be next to each other after the > + * sort). Hence this odd sorting function. > + */ > +static int > +compare_inodes (const void *vp1, const void *vp2) > +{ > + const struct file *f1 = vp1; > + const struct file *f2 = vp2; > + > + if (f1->statbuf.st_dev < f2->statbuf.st_dev) > + return -1; > + else if (f1->statbuf.st_dev > f2->statbuf.st_dev) > + return 1; > + else { > + if (f1->statbuf.st_ino < f2->statbuf.st_ino) > + return -1; > + else if (f1->statbuf.st_ino > f2->statbuf.st_ino) > + return 1; > + else > + return 0;The sort is not a complete order (two files with the same inode have an arbitrary order), but I don't think that matters. And it's not that odd to visit files in inode order - GNU coreutils has discovered that when deleting a large directory, performance is often faster if you visit files in inode order rather than in readdir() order.> +static int > +e2emptyinode (ext2_filsys fs, ext2_ino_t dir_ino, > + const char *name, const struct stat *statbuf, > + int ino_flags, ext2_ino_t *ino) > +{ > + errcode_t err; > + struct ext2_inode inode; > + int major_, minor_; > + > + err = ext2fs_new_inode (fs, dir_ino, statbuf->st_mode, 0, ino); > + if (err) { > + nbdkit_error ("ext2fs_new_inode: %s", error_message (err)); > + return -1; > + } > + > + memset (&inode, 0, sizeof inode); > + inode.i_mode = statbuf->st_mode; > + inode.i_uid = statbuf->st_uid; > + inode.i_gid = statbuf->st_gid; > + inode.i_blocks = 0; > + inode.i_links_count = 1; > + /* XXX nanosecond times? */ > + inode.i_ctime = statbuf->st_ctime; > + inode.i_atime = statbuf->st_atime; > + inode.i_mtime = statbuf->st_mtime; > + inode.i_size = 0;Not for this patch, but now that newer Linux has the statx() call that can expose birthtime, should we be worrying about that (well, ext4 supports birthtime, but if you are targetting just ext2, it probably doesn't matter). Otherwise looks reasonable. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2019-Feb-19 21:21 UTC
Re: [Libguestfs] [PATCH nbdkit 4/4] Add linuxdisk plugin.
On Tue, Feb 19, 2019 at 09:21:10AM -0600, Eric Blake wrote:> Missing an unlink(). Also, does the caller properly use close(disk->fd) > on early exits like this, or does it not matter because we're just going > to exit()?Just about the early exits: I started off trying to get those paths right but it made the code ridiculously complicated so I gave up. As you say config_complete just calls exit, so it's not really a problem, in fact more likely to cause bugs than fix them. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Eric Blake
2019-Feb-19 21:31 UTC
Re: [Libguestfs] [PATCH nbdkit 4/4] Add linuxdisk plugin.
On 2/19/19 3:21 PM, Richard W.M. Jones wrote:> On Tue, Feb 19, 2019 at 09:21:10AM -0600, Eric Blake wrote: >> Missing an unlink(). Also, does the caller properly use close(disk->fd) >> on early exits like this, or does it not matter because we're just going >> to exit()? > > Just about the early exits: I started off trying to get those paths > right but it made the code ridiculously complicated so I gave up. As > you say config_complete just calls exit, so it's not really a problem, > in fact more likely to cause bugs than fix them.Fair enough. The unlink() is an obvious cleanup worth doing (no need to litter $TMPDIR, as the filesystem is not an automatic cleanup on process termination), but memory/fd leaks are acceptable, although it's probably also worth a comment in the code stating that we know we are leaking on failure on the assumption that the caller will exit(). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Apparently Analagous Threads
- Re: [PATCH nbdkit 4/4] Add linuxdisk plugin.
- Re: [PATCH NOT WORKING nbdkit v2 1/2] server: Add .ready_to_serve plugin method.
- Re: [PATCH nbdkit v3 4/4] Add linuxdisk plugin.
- ANNOUNCE: nbdkit 1.18 - high performance NBD server
- Re: [nbdkit PATCH] eval: Allow user override of 'missing'