Richard W.M. Jones
2012-Aug-20 21:07 UTC
[Libguestfs] [libguestfs] Options for hotplugging
libguestfs recently added support for virtio-scsi and libvirt, and
when these are both available this lets us relatively easily add
hotplugging of drives. This email is about how we would present that
through the libguestfs API.
(a) The current API
Currently you have to add drive(s) via guestfs_add_drive* and then
call guestfs_launch, i.e. your program must look something like this:

  guestfs_h *g = guestfs_create ();
  guestfs_add_drive (g, "/tmp/disk1.img");
  guestfs_add_drive (g, "/tmp/disk2.img");
  guestfs_launch (g);

After guestfs_launch [the qemu backend is running] you are not allowed
to add more drives.
The API specifies that you refer to drives in other commands in one of
two ways. Either you're allowed to use names like "/dev/sda",
"/dev/sdb" etc to refer to the first, second etc drive you added, in
the same order that you added them. Or you can call
guestfs_list_devices which returns a list of device names, opaque
strings that you pass to other functions.
In the first case (using "/dev/sdX" names), some magic already happens
translating these to the real names underneath, but currently that
magic is just "/dev/sdX" -> "/dev/vdX" for the virtio
case.
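
A minimal sketch of that translation, for illustration only.  The
function name translate_to_virtio is hypothetical and is not part of
the libguestfs API; it assumes the rewrite is a plain prefix
substitution and that any partition suffix is carried over unchanged.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a libguestfs function: rewrite an API-level
 * name such as "/dev/sda1" to the virtio-blk name "/dev/vda1".
 * Names that do not start with "/dev/sd" are copied unchanged.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int
translate_to_virtio (const char *api_name, char *out, size_t outlen)
{
  int n;

  assert (api_name != NULL && out != NULL);

  if (strncmp (api_name, "/dev/sd", 7) == 0)
    n = snprintf (out, outlen, "/dev/vd%s", api_name + 7);
  else
    n = snprintf (out, outlen, "%s", api_name);

  return (n >= 0 && (size_t) n < outlen) ? 0 : -1;
}
```

The caller supplies the output buffer, so the sketch needs no
allocation; a real implementation would also have to cope with names
that are not drive names at all.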
Note that we cannot change this API or break existing programs.
(b) The hidden appliance drive
The libguestfs appliance has to have its own root drive. Currently
this is added after the user-added drives. For example, if the user
adds two drives, then the appliance will appear as /dev/sdc (or
/dev/vdc or whatever). Some magic in the bootloader causes the last
drive to be used as the root filesystem.
This hidden drive doesn't appear in the API -- for example it is
suppressed when we generate the result of guestfs_list_devices.
(c) /dev/null drives
It's always been possible to add "/dev/null" as a drive via
guestfs_add_drive*. This is mainly useful for testing, or if you want
to access just those parts of the API which don't require a disk image
(for various reasons we force you to add one drive, so if you don't
have any drive to add, you can use "/dev/null"). Current libguestfs
treats "/dev/null" as a magic string and (because of bugs in qemu)
substitutes a non-zero sized temporary file.
(d) Maximum number of drives
With virtio-scsi, this maximum is pretty large -- currently 255 (256
targets less the hidden appliance), but if we used LUNs or even
multiple controllers then it'd be almost unlimited. We actually test
up to 255, and virt-df will use as many slots as it can.
(e) For libguestfs you can assume the latest of everything, qemu,
guest kernel, host kernel, libvirt, tools. Any suggestions based on
very new features are fine, even proposed features provided there's a
working implementation which is likely to go upstream.
- - - -
Here are some ideas about how we might add hotplugging without
breaking existing clients.
(1) The "raw libvirt" option
In this one we'd simply provide thin wrappers around
virDomainAttachDevice and virDomainDetachDevice, and leave it up to
the user to know what they're doing.
The problem with this is the hidden appliance disk. We certainly
don't want the user to accidentally detach that(!) It's also
undesirable for there to be a "hole" in the naming scheme so that
you'd have:

  /dev/sda   <- your normal drives
  /dev/sdb   <-
  [/dev/sdc  # sorry, you can't use this, we won't tell you why]
  /dev/sdd   <- your first hotplugged device

As far as I know, the kernel assigns /dev/sdX names on a first-free
basis, so there's no way to permanently put the appliance at
/dev/sdzzz (if there is, please let me know!)
(2) The "slots" option
In this option you'd have to use null devices to reserve the maximum
number of drive slots that you're going to use in the libguestfs
handle before launch. Then after launching you'd be allowed to
hotplug only those slots.
So for example:

  guestfs_add_drive (g, "/dev/null");  # reserves /dev/sda
  guestfs_add_drive (g, "/dev/null");  # reserves /dev/sdb
  guestfs_add_drive (g, "/dev/null");  # reserves /dev/sdc
  guestfs_launch (g);
  guestfs_hotplug (g, 1, "/tmp/foo");  # replaces index 1 == /dev/sdb
  guestfs_hotplug (g, 3, "/tmp/foo");  # error!

Although ugly, in some ways this is quite attractive. It maps easily
into guestfish scripts. You have contiguous device naming. You often
know how many drives you'll need in advance, and if you don't then you
can reserve up to max_disks-1.
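
The bookkeeping behind the slots idea could be sketched roughly as
follows.  Everything here is hypothetical: struct slot_table,
slot_reserve and slot_hotplug are illustrative names, MAX_SLOTS is
simply the 255 figure from (d), and no real drive is attached.  The
sketch only shows the rule that a hotplug may target a slot reserved
before launch and nothing else.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 255   /* assumed limit, taken from (d) above */

/* Hypothetical bookkeeping for the "slots" option. */
struct slot_table {
  size_t nr_reserved;            /* slots reserved before launch */
  const char *path[MAX_SLOTS];   /* NULL: slot still holds the null device */
};

/* Before launch: reserve one more slot.
 * Returns the slot index, or -1 if every slot is already taken.
 */
int
slot_reserve (struct slot_table *t)
{
  assert (t != NULL);

  if (t->nr_reserved >= MAX_SLOTS)
    return -1;
  t->path[t->nr_reserved] = NULL;
  return (int) t->nr_reserved++;
}

/* After launch: record a real disk in a reserved slot.
 * Returns 0 on success, -1 if the index was never reserved.
 */
int
slot_hotplug (struct slot_table *t, size_t index, const char *path)
{
  assert (t != NULL && path != NULL);

  if (index >= t->nr_reserved)
    return -1;
  t->path[index] = path;
  return 0;
}
```

With three reserved slots, index 1 is accepted and index 3 is
rejected, which matches the example above.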
(3) The "serial numbers" option
This was Dan's suggestion.  Hotplugged drives are known only by their
serial number, i.e. we hotplug them via libvirt using the <serial/>
field, and then they are accessed using /dev/disk/by-id/serial.
This is tempting, but unfortunately it doesn't quite work in stock
udev, because the actual name used is:

  /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_SERIAL

We could add a custom udev rule to get the path we wanted.
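
Without a custom rule, the library would have to build the stock udev
name itself.  A rough sketch, assuming the serial number appears
verbatim at the end of the name shown above (udev may alter some
characters, which this ignores); serial_to_by_id_path is a
hypothetical name, not an existing function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the stock udev by-id path for a
 * virtio-scsi disk from the serial number given at hotplug time.
 * Returns 0 on success, -1 if the serial is empty or the buffer
 * is too small.
 */
int
serial_to_by_id_path (const char *serial, char *out, size_t outlen)
{
  int n;

  assert (serial != NULL && out != NULL);

  if (strlen (serial) == 0)
    return -1;

  n = snprintf (out, outlen,
                "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_%s", serial);
  return (n >= 0 && (size_t) n < outlen) ? 0 : -1;
}
```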
(4) The "rewriting device names" option
Since we already have the infrastructure to rewrite device names, we
could do some complicated and hairy device name rewriting to make
names appear contiguous, even though there's a hidden appliance
drive.
This is my least favourite option, mainly because of the complexity,
and complexity is bound to lead to bugs.
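
To give a feel for what the rewriting involves, here is a sketch of
the simplest possible case.  It assumes the appliance sits at one
fixed kernel index (the number of drives added before launch), that
hotplugged drives take the following kernel names in order, and that
nothing is ever unplugged.  All three function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Kernel-style suffix for a 0-based disk index:
 * 0 -> "a", 25 -> "z", 26 -> "aa", 27 -> "ab", ...
 * Returns 0 on success, -1 if the buffer is too small.
 */
int
drive_suffix (size_t index, char *out, size_t outlen)
{
  char tmp[16];
  size_t pos = sizeof tmp;
  size_t n = index + 1;          /* bijective base 26 */
  size_t len;

  assert (out != NULL);

  tmp[--pos] = '\0';
  while (n > 0) {
    if (pos == 0)
      return -1;
    n--;
    tmp[--pos] = (char) ('a' + n % 26);
    n /= 26;
  }

  len = sizeof tmp - pos;        /* includes the trailing NUL */
  if (len > outlen)
    return -1;
  memcpy (out, tmp + pos, len);
  return 0;
}

/* Skip over the hidden appliance drive: API indexes at or beyond the
 * appliance's kernel index are shifted up by one.
 */
size_t
api_to_kernel_index (size_t api_index, size_t appliance_index)
{
  return api_index >= appliance_index ? api_index + 1 : api_index;
}

/* Map an API drive index to the kernel name "/dev/sdX". */
int
api_to_kernel_name (size_t api_index, size_t appliance_index,
                    char *out, size_t outlen)
{
  char suffix[16];
  size_t kernel_index = api_to_kernel_index (api_index, appliance_index);
  int n;

  if (drive_suffix (kernel_index, suffix, sizeof suffix) == -1)
    return -1;
  n = snprintf (out, outlen, "/dev/sd%s", suffix);
  return (n >= 0 && (size_t) n < outlen) ? 0 : -1;
}
```

With two drives added before launch the appliance is at kernel index
2 (/dev/sdc), so API index 2 (the first hotplugged drive, "/dev/sdc"
in the API) maps to kernel /dev/sdd.  Handling unplugged drives and
reused names is where the real complexity would come in.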
(5) Your idea here ...
As usual, comments and suggestions welcome.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top
Daniel P. Berrange
2012-Aug-21 09:32 UTC
[Libguestfs] [libguestfs] Options for hotplugging
On Mon, Aug 20, 2012 at 10:07:43PM +0100, Richard W.M. Jones wrote:
> In the first case (using "/dev/sdX" names), some magic already happens
> translating these to the real names underneath, but currently that
> magic is just "/dev/sdX" -> "/dev/vdX" for the virtio case.

Ah, I didn't realize that you already do device name remapping between
what the kernel shows & what the API shows.

> This hidden drive doesn't appear in the API -- for example it is
> suppressed when we generate the result of guestfs_list_devices.

Makes sense.

> (1) The "raw libvirt" option
>
> In this one we'd simply provide thin wrappers around
> virDomainAttachDevice and virDomainDetachDevice, and leave it up to
> the user to know what they're doing.
>
> The problem with this is the hidden appliance disk.  We certainly
> don't want the user to accidentally detach that(!)  It's also
> undesirable for there to be a "hole" in the naming scheme so that
> you'd have:
>
>   /dev/sda   <- your normal drives
>   /dev/sdb   <-
>   [/dev/sdc  # sorry, you can't use this, we won't tell you why]
>   /dev/sdd   <- your first hotplugged device

I think this scheme is flawed, because it exposes the user to the raw
kernel device names (/dev/vda), which are not guaranteed to match the
device names you expose in the API (/dev/sda).

> (2) The "slots" option
>
> In this option you'd have to use null devices to reserve the maximum
> number of drive slots that you're going to use in the libguestfs
> handle before launch.  Then after launching you'd be allowed to
> hotplug only those slots.
>
> So for example:
>
>   guestfs_add_drive (g, "/dev/null");  # reserves /dev/sda
>   guestfs_add_drive (g, "/dev/null");  # reserves /dev/sdb
>   guestfs_add_drive (g, "/dev/null");  # reserves /dev/sdc
>   guestfs_launch (g);
>   guestfs_hotplug (g, 1, "/tmp/foo");  # replaces index 1 == /dev/sdb
>   guestfs_hotplug (g, 3, "/tmp/foo");  # error!
>
> Although ugly, in some ways this is quite attractive.  It maps easily
> into guestfish scripts.  You have contiguous device naming.  You often
> know how many drives you'll need in advance, and if you don't then you
> can reserve up to max_disks-1.

This feels rather unpleasant to me - I don't really consider it to be
true hotplug if you have to plan it all in advance.

> (3) The "serial numbers" option
>
> This was Dan's suggestion.  Hotplugged drives are known only by their
> serial number.  ie. We hotplug them via libvirt using the <serial/>
> field, and then they are accessed using /dev/disk/by-id/serial.
>
> This is tempting, but unfortunately it doesn't quite work in stock
> udev, because the actual name used is:
>
>   /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_SERIAL
>
> We could add a custom udev rule to get the path we wanted.

Yep, you wouldn't even need to use any of the /dev/disk/by-XXX dirs at
all - you could easily add /dev/guestfs/$SERIAL as a naming scheme if
you wanted.

> (4) The "rewriting device names" option
>
> Since we already have the infrastructure to rewrite device names, we
> could do some complicated and hairy device name rewriting to make
> names appear contiguous, even though there's a hidden appliance
> drive.
>
> This is my least favourite option, mainly because of the complexity,
> and complexity is bound to lead to bugs.

Heh, based on the fact that you already have to do device name
translation as described above, this feels like the best option to me.

> (5) Your idea here ...

I think I'd do both (3) and (4), since I think (3) could be useful even
outside the realm of hotplug.

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
On 20/08/12 22:07, Richard W.M. Jones wrote:
> (1) The "raw libvirt" option
>
> In this one we'd simply provide thin wrappers around
> virDomainAttachDevice and virDomainDetachDevice, and leave it up to
> the user to know what they're doing.
>
> The problem with this is the hidden appliance disk.  We certainly
> don't want the user to accidentally detach that(!)  It's also
> undesirable for there to be a "hole" in the naming scheme so that
> you'd have:
>
>   /dev/sda   <- your normal drives
>   /dev/sdb   <-
>   [/dev/sdc  # sorry, you can't use this, we won't tell you why]
>   /dev/sdd   <- your first hotplugged device
>
> As far as I know, the kernel assigns /dev/sdX names on a first-free
> basis, so there's no way to permanently put the appliance at
> /dev/sdzzz (if there is, please let me know!)

There are numerous reasons not to like this.  If there were any reason
to want to expose the underlying libvirt API directly, I'd suggest it
only be a debug option.

> (2) The "slots" option
>
> In this option you'd have to use null devices to reserve the maximum
> number of drive slots that you're going to use in the libguestfs
> handle before launch.  Then after launching you'd be allowed to
> hotplug only those slots.
>
> So for example:
>
>   guestfs_add_drive (g, "/dev/null");  # reserves /dev/sda
>   guestfs_add_drive (g, "/dev/null");  # reserves /dev/sdb
>   guestfs_add_drive (g, "/dev/null");  # reserves /dev/sdc
>   guestfs_launch (g);
>   guestfs_hotplug (g, 1, "/tmp/foo");  # replaces index 1 == /dev/sdb
>   guestfs_hotplug (g, 3, "/tmp/foo");  # error!
>
> Although ugly, in some ways this is quite attractive.  It maps easily
> into guestfish scripts.  You have contiguous device naming.  You often
> know how many drives you'll need in advance, and if you don't then you
> can reserve up to max_disks-1.

Echo Dan's general dislike of this.

> (3) The "serial numbers" option
>
> This was Dan's suggestion.  Hotplugged drives are known only by their
> serial number.  ie. We hotplug them via libvirt using the <serial/>
> field, and then they are accessed using /dev/disk/by-id/serial.
>
> This is tempting, but unfortunately it doesn't quite work in stock
> udev, because the actual name used is:
>
>   /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_SERIAL
>
> We could add a custom udev rule to get the path we wanted.

Presumably you would specify the serial number when adding the drive?
I'm not opposed to this, but it's not the simplest.

> (4) The "rewriting device names" option
>
> Since we already have the infrastructure to rewrite device names, we
> could do some complicated and hairy device name rewriting to make
> names appear contiguous, even though there's a hidden appliance
> drive.
>
> This is my least favourite option, mainly because of the complexity,
> and complexity is bound to lead to bugs.

I don't think this rewriting is required.  Having a hole in the drive
letters isn't a big deal.  In fact, I suspect it would be simpler in
most code to use returned rather than calculated device names.

I'd suggest a very simple API:

  char * guestfs_hotplug_drive(g, path, <opts>)

This does the same as add_drive, except that it works after launch and
returns the API name of the newly added drive.  list_devices will
return a list with a hole in it.

If it isn't already there, we can add some generic code to methods
taking a Device parameter to guard against passing in the root device.

Matt
--
Matthew Booth, RHCA, RHCSS
Red Hat Engineering, Virtualisation Team
GPG ID: D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490