Richard W.M. Jones
2012-Aug-20 21:07 UTC
[Libguestfs] [libguestfs] Options for hotplugging
libguestfs recently added support for virtio-scsi and libvirt, and when
these are both available this lets us relatively easily add hotplugging
of drives. This email is about how we would present that through the
libguestfs API.

(a) The current API

Currently you have to add drive(s) via guestfs_add_drive* and then call
guestfs_launch, ie. your program must look something like this:

  guestfs_h *g = guestfs_create ();
  guestfs_add_drive (g, "/tmp/disk1.img");
  guestfs_add_drive (g, "/tmp/disk2.img");
  guestfs_launch (g);

After guestfs_launch [the qemu backend is running] you are not allowed
to add more drives.

The API specifies that you refer to drives in other commands in one of
two ways. Either you're allowed to use names like "/dev/sda",
"/dev/sdb" etc to refer to the first, second etc drive you added, in
the same order that you added them. Or you can call
guestfs_list_devices which returns a list of device names, opaque
strings that you pass to other functions.

In the first case (using "/dev/sdX" names), some magic already happens
translating these to the real names underneath, but currently that
magic is just "/dev/sdX" -> "/dev/vdX" for the virtio case.

Note that we cannot change this API or break existing programs.

(b) The hidden appliance drive

The libguestfs appliance has to have its own root drive. Currently
this is added after the user-added drives. For example, if the user
adds two drives, then the appliance will appear as /dev/sdc (or
/dev/vdc or whatever). Some magic in the bootloader causes the last
drive to be used as the root filesystem.

This hidden drive doesn't appear in the API -- for example it is
suppressed when we generate the result of guestfs_list_devices.

(c) /dev/null drives

It's always been possible to add "/dev/null" as a drive via
guestfs_add_drive*. This is mainly useful for testing, or if you want
to access just those parts of the API which don't require a disk image
(for various reasons we force you to add one drive, so if you don't
have any drive to add, you can use "/dev/null").

Current libguestfs treats "/dev/null" as a magic string and (because
of bugs in qemu) substitutes a non-zero sized temporary file.

(d) Maximum number of drives

With virtio-scsi, this maximum is pretty large -- currently 255 (256
targets less the hidden appliance), but if we used LUNs or even
multiple controllers then it'd be almost unlimited. We actually test
up to 255, and virt-df will use as many slots as it can.

(e) For libguestfs you can assume the latest of everything, qemu,
guest kernel, host kernel, libvirt, tools. Any suggestions based on
very new features are fine, even proposed features provided there's a
working implementation which is likely to go upstream.

- - - -

Here are some ideas about how we might add hotplugging without
breaking existing clients.

(1) The "raw libvirt" option

In this one we'd simply provide thin wrappers around
virDomainAttachDevice and virDomainDetachDevice, and leave it up to
the user to know what they're doing.

The problem with this is the hidden appliance disk. We certainly
don't want the user to accidentally detach that(!) It's also
undesirable for there to be a "hole" in the naming scheme so that
you'd have:

  /dev/sda <- your normal drives
  /dev/sdb <-
  [/dev/sdc # sorry, you can't use this, we won't tell you why]
  /dev/sdd <- your first hotplugged device

As far as I know, the kernel assigns /dev/sdX names on a first-free
basis, so there's no way to permanently put the appliance at
/dev/sdzzz (if there is, please let me know!)

(2) The "slots" option

In this option you'd have to use null devices to reserve the maximum
number of drive slots that you're going to use in the libguestfs
handle before launch. Then after launching you'd be allowed to
hotplug only those slots.

So for example:

  guestfs_add_drive (g, "/dev/null");   /* reserves /dev/sda */
  guestfs_add_drive (g, "/dev/null");   /* reserves /dev/sdb */
  guestfs_add_drive (g, "/dev/null");   /* reserves /dev/sdc */
  guestfs_launch (g);
  guestfs_hotplug (g, 1, "/tmp/foo");   /* replaces index 1 == /dev/sdb */
  guestfs_hotplug (g, 3, "/tmp/foo");   /* error! */

Although ugly, in some ways this is quite attractive. It maps easily
into guestfish scripts. You have contiguous device naming. You often
know how many drives you'll need in advance, and if you don't then you
can reserve up to max_disks-1.

(3) The "serial numbers" option

This was Dan's suggestion. Hotplugged drives are known only by their
serial number. ie. We hotplug them via libvirt using the <serial/>
field, and then they are accessed using /dev/disk/by-id/serial.

This is tempting, but unfortunately it doesn't quite work in stock
udev, because the actual name used is:

  /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_SERIAL

We could add a custom udev rule to get the path we wanted.

(4) The "rewriting device names" option

Since we already have the infrastructure to rewrite device names, we
could do some complicated and hairy device name rewriting to make
names appear contiguous, even though there's a hidden appliance
drive.

This is my least favourite option, mainly because of the complexity,
and complexity is bound to lead to bugs.

(5) Your idea here ...

As usual, comments and suggestions welcome.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top
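
[Editorial sketch] The discussion above leans on how "/dev/sdX" names are derived from the order in which drives are added, including names past /dev/sdz once a handle carries up to 255 drives. The C fragment below is a minimal illustration of that index-to-name scheme (sda ... sdz, then sdaa, sdab, ...). The function name drive_name is hypothetical and is not part of the libguestfs API; it only models the naming convention, not what libguestfs does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libguestfs function): write the sd-style
 * device name for the drive at zero-based `index` into `buf`.
 * Index 0 gives "/dev/sda", 25 gives "/dev/sdz", 26 gives "/dev/sdaa".
 * Returns 0 on success, -1 if `buf` is too small. */
int drive_name(size_t index, char *buf, size_t buflen)
{
    static const char prefix[] = "/dev/sd";
    char letters[16];              /* 14 letters cover any 64-bit index */
    size_t plen = sizeof prefix - 1;
    size_t n = 0, i;

    assert(buf != NULL);

    /* Bijective base-26: produce letters least-significant first. */
    for (;;) {
        letters[n++] = (char)('a' + index % 26);
        if (index < 26)
            break;
        index = index / 26 - 1;
    }

    if (buflen < plen + n + 1)
        return -1;

    memcpy(buf, prefix, plen);
    for (i = 0; i < n; i++)        /* reverse into most-significant first */
        buf[plen + i] = letters[n - 1 - i];
    buf[plen + n] = '\0';
    return 0;
}
```

With two user drives added before launch, the appliance would sit at index 2 in this scheme, i.e. the /dev/sdc of the example in (b).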
Daniel P. Berrange
2012-Aug-21 09:32 UTC
[Libguestfs] [libguestfs] Options for hotplugging
On Mon, Aug 20, 2012 at 10:07:43PM +0100, Richard W.M. Jones wrote:
> In the first case (using "/dev/sdX" names), some magic already happens
> translating these to the real names underneath, but currently that
> magic is just "/dev/sdX" -> "/dev/vdX" for the virtio case.

Ah I didn't realize that you already do device name remapping between
what the kernel shows & what the API shows.

> This hidden drive doesn't appear in the API -- for example it is
> suppressed when we generate the result of guestfs_list_devices.

Makes sense.

> (1) The "raw libvirt" option
>
> In this one we'd simply provide thin wrappers around
> virDomainAttachDevice and virDomainDetachDevice, and leave it up to
> the user to know what they're doing.
>
> The problem with this is the hidden appliance disk. We certainly
> don't want the user to accidentally detach that(!) It's also
> undesirable for there to be a "hole" in the naming scheme so that
> you'd have:
>
>   /dev/sda <- your normal drives
>   /dev/sdb <-
>   [/dev/sdc # sorry, you can't use this, we won't tell you why]
>   /dev/sdd <- your first hotplugged device

I think this scheme is flawed, because it exposes the user to the raw
kernel device names (/dev/vda), which are not guaranteed to match the
device names you expose in the API (/dev/sda).

> (2) The "slots" option
>
> In this option you'd have to use null devices to reserve the maximum
> number of drive slots that you're going to use in the libguestfs
> handle before launch. Then after launching you'd be allowed to
> hotplug only those slots.
>
> So for example:
>
>   guestfs_add_drive (g, "/dev/null");   /* reserves /dev/sda */
>   guestfs_add_drive (g, "/dev/null");   /* reserves /dev/sdb */
>   guestfs_add_drive (g, "/dev/null");   /* reserves /dev/sdc */
>   guestfs_launch (g);
>   guestfs_hotplug (g, 1, "/tmp/foo");   /* replaces index 1 == /dev/sdb */
>   guestfs_hotplug (g, 3, "/tmp/foo");   /* error! */
>
> Although ugly, in some ways this is quite attractive. It maps easily
> into guestfish scripts. You have contiguous device naming. You often
> know how many drives you'll need in advance, and if you don't then you
> can reserve up to max_disks-1.

This feels rather unpleasant to me - I don't really consider it to be
true hotplug if you have to plan it all in advance.

> (3) The "serial numbers" option
>
> This was Dan's suggestion. Hotplugged drives are known only by their
> serial number. ie. We hotplug them via libvirt using the <serial/>
> field, and then they are accessed using /dev/disk/by-id/serial.
>
> This is tempting, but unfortunately it doesn't quite work in stock
> udev, because the actual name used is:
>
>   /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_SERIAL
>
> We could add a custom udev rule to get the path we wanted.

Yep, you wouldn't even need to use any of the /dev/disk/by-XXX dirs at
all - you could easily add /dev/guestfs/$SERIAL as a naming scheme if
you wanted.

> (4) The "rewriting device names" option
>
> Since we already have the infrastructure to rewrite device names, we
> could do some complicated and hairy device name rewriting to make
> names appear contiguous, even though there's a hidden appliance
> drive.
>
> This is my least favourite option, mainly because of the complexity,
> and complexity is bound to lead to bugs.

Heh, based on the fact that you already have to do device name
translation as described above, this feels like the best option to me.

> (5) Your idea here ...

I think I'd do both (3) and (4), since I think (3) could be useful
even outside the realm of hotplug.

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
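
[Editorial sketch] To make option (4) concrete, the fragment below shows one way the rewriting could work at the level of drive indexes: the API numbering stays contiguous, and the translation skips over the single kernel slot held by the hidden appliance drive. It assumes the appliance keeps the slot it had at launch and that no drives are ever unplugged (first-free reuse of names would need extra bookkeeping). The names api_to_kernel_index, kernel_to_api_index and HIDDEN_DRIVE are hypothetical, not libguestfs identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker returned for the kernel slot that the API hides. */
#define HIDDEN_DRIVE SIZE_MAX

/* Map a contiguous API drive index to the kernel's index, skipping the
 * slot occupied by the appliance. `appliance_index` is the number of
 * drives added before launch (the appliance is added after them). */
size_t api_to_kernel_index(size_t api_index, size_t appliance_index)
{
    assert(api_index < SIZE_MAX);   /* avoid overflow on +1 */
    return api_index < appliance_index ? api_index : api_index + 1;
}

/* Inverse mapping, e.g. for building a guestfs_list_devices-style
 * result without a hole. Returns HIDDEN_DRIVE for the appliance. */
size_t kernel_to_api_index(size_t kernel_index, size_t appliance_index)
{
    if (kernel_index == appliance_index)
        return HIDDEN_DRIVE;
    return kernel_index < appliance_index ? kernel_index
                                          : kernel_index - 1;
}
```

With two drives added before launch (appliance at kernel index 2), the first hotplugged drive is API index 2 but kernel index 3, i.e. the API would show /dev/sdc for what the kernel calls /dev/sdd.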
On 20/08/12 22:07, Richard W.M. Jones wrote:
> (1) The "raw libvirt" option
>
> In this one we'd simply provide thin wrappers around
> virDomainAttachDevice and virDomainDetachDevice, and leave it up to
> the user to know what they're doing.
>
> The problem with this is the hidden appliance disk. We certainly
> don't want the user to accidentally detach that(!) It's also
> undesirable for there to be a "hole" in the naming scheme so that
> you'd have:
>
>   /dev/sda <- your normal drives
>   /dev/sdb <-
>   [/dev/sdc # sorry, you can't use this, we won't tell you why]
>   /dev/sdd <- your first hotplugged device
>
> As far as I know, the kernel assigns /dev/sdX names on a first-free
> basis, so there's no way to permanently put the appliance at
> /dev/sdzzz (if there is, please let me know!)

There are numerous reasons not to like this. If there were any reason
to want to expose the underlying libvirt api directly I'd suggest it
only be a debug option.

> (2) The "slots" option
>
> In this option you'd have to use null devices to reserve the maximum
> number of drive slots that you're going to use in the libguestfs
> handle before launch. Then after launching you'd be allowed to
> hotplug only those slots.
>
> So for example:
>
>   guestfs_add_drive (g, "/dev/null");   /* reserves /dev/sda */
>   guestfs_add_drive (g, "/dev/null");   /* reserves /dev/sdb */
>   guestfs_add_drive (g, "/dev/null");   /* reserves /dev/sdc */
>   guestfs_launch (g);
>   guestfs_hotplug (g, 1, "/tmp/foo");   /* replaces index 1 == /dev/sdb */
>   guestfs_hotplug (g, 3, "/tmp/foo");   /* error! */
>
> Although ugly, in some ways this is quite attractive. It maps easily
> into guestfish scripts. You have contiguous device naming. You often
> know how many drives you'll need in advance, and if you don't then you
> can reserve up to max_disks-1.

Echo Dan's general dislike of this.

> (3) The "serial numbers" option
>
> This was Dan's suggestion. Hotplugged drives are known only by their
> serial number. ie. We hotplug them via libvirt using the <serial/>
> field, and then they are accessed using /dev/disk/by-id/serial.
>
> This is tempting, but unfortunately it doesn't quite work in stock
> udev, because the actual name used is:
>
>   /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_SERIAL
>
> We could add a custom udev rule to get the path we wanted.

Presumably you would specify the serial number when adding the drive?
I'm not opposed to this, but it's not the simplest.

> (4) The "rewriting device names" option
>
> Since we already have the infrastructure to rewrite device names, we
> could do some complicated and hairy device name rewriting to make
> names appear contiguous, even though there's a hidden appliance
> drive.
>
> This is my least favourite option, mainly because of the complexity,
> and complexity is bound to lead to bugs.

I don't think this rewriting is required. Having a hole in the drive
letters isn't a big deal. In fact, I suspect it would be simpler in
most code to use returned rather than calculated device names.

I'd suggest a very simple api:

  char * guestfs_hotplug_drive(g, path, <opts>)

This does the same as add_drive, except that it works after launch and
returns the api name of the newly added drive. list_devices will
return a list with a hole in it.

If it isn't already there, we can add some generic code to methods
taking a Device parameter to guard against passing in the root device.

Matt
-- 
Matthew Booth, RHCA, RHCSS
Red Hat Engineering, Virtualisation Team
GPG ID: D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
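
[Editorial sketch] The guard Matt mentions, rejecting the root (appliance) device in any call that takes a device parameter, could look roughly like the check below. It is only an illustration: the function name refers_to_appliance is hypothetical, and it assumes both names are already in the same namespace (both kernel names or both API names) and that partitions are written as the disk name followed by digits, as with /dev/sdc1.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical guard (not a libguestfs function): return 1 if `device`
 * names the appliance disk itself or one of its partitions, else 0.
 * "/dev/sdc" and "/dev/sdc1" match an appliance of "/dev/sdc";
 * "/dev/sdca" and "/dev/sdd" do not. */
int refers_to_appliance(const char *device, const char *appliance)
{
    size_t len;
    const char *p;

    assert(device != NULL && appliance != NULL);

    len = strlen(appliance);
    if (strncmp(device, appliance, len) != 0)
        return 0;

    /* Anything after the disk name must be a partition number. */
    for (p = device + len; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return 0;
    }
    return 1;
}
```

A method taking a device argument would call this once, before doing any work, and return an error when it yields 1.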