Hello,

I'm currently reviewing the driver domains protocol proposal, but I think that before reviewing the protocol we should make clear what kind of storage backends libxl supports, and what the plans are for future backends.

One of the benefits of the driver domain protocol is that it should allow device connection to be split into at least two phases, which is important for live migration. The first phase should contain all the logic that's slower, and should be performed on the receiver domain without pausing the migrated domain.

I've been trying to figure out which kind of operation should be done in this phase for the different types of backends, but with the backend support we currently have in libxl (blkback, qdisk and blktap) I don't think we are able to perform any kind of preparatory work before actually connecting the device.

One of the backends which I think libxl should support is iSCSI, which also allows live migration. I've also been trying to figure out how we are going to handle this kind of device, and I'm unsure whether it will be best to handle it using Qemu as the backend, which currently has a userspace implementation of iSCSI, or using an in-kernel initiator and blkback.

The benefit of using Qemu is that it is all contained in userspace, and we don't pollute the Dom0 (or the Driver Domain) with unneeded devices; on the other hand, it is probably slower than using an in-kernel initiator.

Either way, I'm still not able to see what we can offload to the "preparatory" phase: in the Qemu case we just launch Qemu, and if we decide to use an in-kernel initiator we only have to launch a hotplug script with something like:

    iscsiadm -m node -T <iqn> -p <ip:port>

I'm sure there are people on the list with more experience than me in this field, and I would like to ask for some use-cases where this "preparatory" phase would be useful, and what actions would be performed in it.

Thanks, Roger.
On Tue, 2012-12-11 at 19:46 +0000, Roger Pau Monné wrote:
> just launch Qemu, and if we decide to use an in-kernel initiator we only
> have to launch a hotplug script with something like: iscsiadm -m node -T
> <iqn> -p <ip:port>.

Isn't there also an iscsiadm login which is needed at some point?

In any case almost any iscsiadm command has the potential to be slow compared to the amount of downtime we would like to aim for during a migration.

> I'm sure there's people on the list with more experience than me on this
> field, and I would like to ask for some use-cases where this
> "preparatory" phase would be useful, and what actions will be performed
> on it.

Might be worth including xen-api in this discussion since the xapi guys have a fair bit of knowledge of the requirements here.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
On 12/12/12 11:43, Ian Campbell wrote:
> On Tue, 2012-12-11 at 19:46 +0000, Roger Pau Monné wrote:
>> just launch Qemu, and if we decide to use an in-kernel initiator we only
>> have to launch a hotplug script with something like: iscsiadm -m node -T
>> <iqn> -p <ip:port>.
>
> isn't there also a iscsiadm login which is needed at some point?

Login is done during the plug by adding --login to the above command, but I think you can also perform the login in a discovery, and I guess this login is kept by iscsid, so you don't need to perform it when plugging in the devices. But I'm not sure it is worth performing a discovery just to log in.

> In any case almost any iscsiadm command has the potential to be slow
> compared to the amount of downtime we would like to aim for during a
> migration.

I will do some more research and timing of iscsiadm, to see if there's any way in which we can speed up the actual connection of the device.

>> I'm sure there's people on the list with more experience than me on this
>> field, and I would like to ask for some use-cases where this
>> "preparatory" phase would be useful, and what actions will be performed
>> on it.
>
> Might we worth including xen-api in this discussion since the xapi guys
> have a fair bit of knowledge of the requirement here.

I agree, let's CC xen-api@lists.xen.org.
Roger Pau Monné
2012-Dec-13 16:56 UTC
Handling iSCSI block devices (Was: Driver domains and device handling)
After doing some more research I'm able to understand a little better how the Open-iSCSI initiator works (which I think is the most common initiator in the Linux world, and should be supported by all distros).

This follows loosely from the driver domains protocol proposal, because if we are going to implement a driver domain protocol we should try to handle device kinds that can have a two-phase connection mechanism, and iSCSI looks like the most interesting candidate (from my POV).

I would like to implement iSCSI support in libxl to have at least one device kind that makes use of this two-phase connection mechanism, and then draft a driver domain communication protocol, since we will already have a device that needs this kind of protocol (it looked strange to implement a two-phase protocol without having any device that needed it).

This is a very simple scheme of the two phases of the connection of an iSCSI device:

The first phase of connecting an iSCSI device consists of discovering it, which can be done before entering the blackout phase of migration:

    iscsiadm -m discovery -t st -p <ip>:<port>

And possibly setting the right authentication method:

    iscsiadm -m node --targetname <iqn> -p <ip>:<port> --op=update --name node.session.auth.authmethod --value=<auth_method>
    iscsiadm -m node --targetname <iqn> -p <ip>:<port> --op=update --name node.session.auth.username --value=<user>
    iscsiadm -m node --targetname <iqn> -p <ip>:<port> --op=update --name node.session.auth.password --value=<password>

The second phase is the actual device plug:

    iscsiadm -m node --targetname <iqn> -p <ip>:<port> --login

I'm trying to fit all these parameters into the current diskspec, but I guess we will have to add new parameters. I think the iqn parameter should go in "target", and the rest should have their own parameters, so this will leave us with the following new parameters:

 - portal: specifies the address, and optionally the port, to connect to the desired target; the format is <ip>:<port>
 - authmethod: authentication method
 - user: username to use for authentication
 - password: password to use for authentication

So the diskspec line would look like:

    portal=127.0.0.0:3260, authmethod=CHAP, user=foo, password=bar,
    backendtype=phy, format=iscsi, vdev=xvda,
    target=iqn.2012-12.com.example:lun1

Note that I've used the format parameter here to specify "iscsi", which will be a new format, to distinguish this from a block device that also uses the "phy" backend type. All these new parameters should also be added to the libxl_device_disk struct.

Since this device type uses two hotplug scripts we should also add a new generic parameter to specify a "preparatory" hotplug script, so other custom devices can also make use of this, something like "preparescript"?

I would like to get some feedback about handling iSCSI devices, and also about adding all these new parameters to the diskspec.

Thanks, Roger.
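The two phases above can be sketched as a pair of shell functions. This is only an illustration under stated assumptions: the function names iscsi_prepare and iscsi_plug and the ISCSIADM override are hypothetical, and by default the commands are printed rather than executed. The iscsiadm invocations themselves are the ones listed in the message.

```sh
#!/bin/sh
# Sketch of the two connection phases. ISCSIADM is a hypothetical
# override: by default the commands are only printed (dry run);
# set ISCSIADM=iscsiadm to actually run them.
ISCSIADM="${ISCSIADM:-echo iscsiadm}"

# Phase 1: slow work that can happen before the migration blackout.
# Arguments: <iqn> <portal> [<auth_method> <user> <password>]
iscsi_prepare() {
    iqn=$1
    portal=$2
    $ISCSIADM -m discovery -t st -p "$portal" || return 1
    # Authentication settings are only updated when all three are given.
    [ $# -ge 5 ] || return 0
    $ISCSIADM -m node --targetname "$iqn" -p "$portal" --op=update \
        --name node.session.auth.authmethod --value="$3" || return 1
    $ISCSIADM -m node --targetname "$iqn" -p "$portal" --op=update \
        --name node.session.auth.username --value="$4" || return 1
    $ISCSIADM -m node --targetname "$iqn" -p "$portal" --op=update \
        --name node.session.auth.password --value="$5" || return 1
}

# Phase 2: the actual device plug, done during the blackout.
# Arguments: <iqn> <portal>
iscsi_plug() {
    $ISCSIADM -m node --targetname "$1" -p "$2" --login
}
```

For example, `iscsi_prepare iqn.2012-12.com.example:lun1 127.0.0.0:3260 CHAP foo bar` followed by `iscsi_plug iqn.2012-12.com.example:lun1 127.0.0.0:3260` prints the five commands in order.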
Ian Jackson
2012-Dec-13 18:23 UTC
Re: Handling iSCSI block devices (Was: Driver domains and device handling)
Roger Pau Monne writes ("Handling iSCSI block devices (Was: Driver domains and device handling)"):
> [stuff]

Most of this sounds sensible.

> So the diskspec line would look like:
>
> portal=127.0.0.0:3260, authmethod=CHAP, user=foo, password=bar,
> backendtype=phy, format=iscsi, vdev=xvda,
> target=iqn.2012-12.com.example:lun1

Are we suggesting that every backend type should be able to define its own parameters ? I was imagining that these options would all go into "target" - and if "target" is last it can contain commas and =s.

> Note that I've used the format parameter here to specify "iscsi", which
> will be a new format, to distinguish this from a block device that also
> uses the "phy" backend type. All this new parameters should also be
> added to the libxl_device_disk struct.

I don't think this is right. I think the right answer is "script=iscsi". The format might be qcow or something.

> Since this device type uses two hotplug scripts we should also add a new
> generic parameter to specify a "preparatory" hotplug script, so other
> custom devices can also make use of this, something like "preparescript"?

Clearly when we have this two-phase setup we need to have more scripts, or the existing script with different arguments.

I think it should be controlled by the same argument. So maybe script=iscsi causes libxl to check for a dropping in the script file saying "yes do the prepare thing" or maybe it runs /etc/xen/scripts/block-iscsi--prepare or something.

Ian.
Roger Pau Monné
2012-Dec-14 10:49 UTC
Re: Handling iSCSI block devices (Was: Driver domains and device handling)
On 13/12/12 19:23, Ian Jackson wrote:
> Roger Pau Monne writes ("Handling iSCSI block devices (Was: Driver domains and device handling)"):
>> [stuff]
>
> Most of this sounds sensible.
>
>> So the diskspec line would look like:
>>
>> portal=127.0.0.0:3260, authmethod=CHAP, user=foo, password=bar,
>> backendtype=phy, format=iscsi, vdev=xvda,
>> target=iqn.2012-12.com.example:lun1
>
> Are we suggesting that every backend type should be able to define its
> own parameters ? I was imagining that these options would all go into
> "target" - and if "target" is last it can contain commas and =s.

According to RFC 3720 and RFC 1035, IQNs should follow this format:

    iqn.yyyy-mm.com.example:optional.string

The problem is that Open-iSCSI seems to accept almost anything; for example, iqn.yyyy-mm,com@example:... is a valid iqn from Open-iSCSI's point of view. The only character that Open-iSCSI doesn't seem to accept in iqns is "/", but I don't really like using that as a field separator inside of target. So I propose the following encoding for target:

    "<iqn>,<portal>"
    "<iqn>,<portal>,<auth_method>,<user>,<password>"

If a user/password is given, we should take care about what we write to the "params" xenstore backend field (because the DomU can read that). Would you agree with the syntax described above?

>> Note that I've used the format parameter here to specify "iscsi", which
>> will be a new format, to distinguish this from a block device that also
>> uses the "phy" backend type. All this new parameters should also be
>> added to the libxl_device_disk struct.
>
> I don't think this is right. I think the right answer is
> "script=iscsi". The format might be qcow or something.

Yes, it might be better to specify the script.

>> Since this device type uses two hotplug scripts we should also add a new
>> generic parameter to specify a "preparatory" hotplug script, so other
>> custom devices can also make use of this, something like "preparescript"?
>
> Clearly when we have this two-phase setup we need to have more
> scripts, or the existing script with different arguments.
>
> I think it should be controlled by the same argument. So maybe
> script=iscsi causes libxl to check for a dropping in the script file
> saying "yes do the prepare thing" or maybe it runs
> /etc/xen/scripts/block-iscsi--prepare or something.

I like the approach of calling the same hotplug script twice: the first time with something like `/etc/xen/scripts/block-iscsi prepare`, and the second time with `/etc/xen/scripts/block-iscsi add`.
Ian Jackson
2012-Dec-14 12:30 UTC
Re: Handling iSCSI block devices (Was: Driver domains and device handling)
Roger Pau Monne writes ("Re: Handling iSCSI block devices (Was: Driver domains and device handling)"):
> According to RFC 3720 and RFC 1035 IQNs should follow this format:
>
> iqn.yyyy-mm.com.example:optional.string
>
> The problem is that Open-iSCSI seems to accept almost anything, for
> example iqn.yyyy-mm,com@example:... is a valid iqn from Open-iSCSI point
> of view. The only character that Open-iSCSI doesn't seem to accept in
> iqns is "/", but I don't really like using that as a field separator
> inside of target. So I propose the following encoding for target:
>
> "<iqn>,<portal>"
> "<iqn>,<portal>,<auth_method>,<user>,<password>"
>
> If a user/password is given, we should take care about what we write to
> "params" xenstore backend field (because the DomU can read that). Would
> you agree with the syntax described above?

Wouldn't it be better to specify this in a more key/value like way ?

The password is a problem. Perhaps we need to arrange not to write params to a place where the guest can see it, but that means upheaval for the interface to block scripts.

> > I think it should be controlled by the same argument. So maybe
> > script=iscsi causes libxl to check for a dropping in the script file
> > saying "yes do the prepare thing" or maybe it runs
> > /etc/xen/scripts/block-iscsi--prepare or something.
>
> I like the approach to call the same hotplug script twice, the first
> time use something like `/etc/xen/scripts/block-iscsi prepare`, and the
> second time `/etc/xen/scripts/block-iscsi add`

So how would we tell whether the script understood this ?

Perhaps we should invent a new config parameter parallel to script which specifies an entirely new interface.

Ian.
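One possible reading of the key/value suggestion above keeps <iqn> and <portal> positional and turns the remaining fields into key=value pairs, with the password last so that it can contain "," and "=". The sketch below parses such a string. It is an assumption, not an agreed format: the function name parse_target, the key names and the output variables are hypothetical, and it assumes that <iqn> and <portal> themselves contain no commas (which Open-iSCSI does not guarantee for iqns).

```sh
#!/bin/sh
# parse_target "<iqn>,<portal>[,auth_method=..][,user=..][,password=..]"
# Sets the variables iqn, portal, auth_method, user and password.
# Returns non-zero if <iqn> or <portal> is missing or a key is unknown.
parse_target() {
    rest=$1
    iqn='' portal='' auth_method='' user='' password=''

    # The password is the last field and may contain "," or "=",
    # so split it off first and keep everything after the marker.
    case "$rest" in
        *,password=*)
            password=${rest#*,password=}
            rest=${rest%%,password=*}
            ;;
    esac

    # <iqn> and <portal> are required, so at least one comma must remain.
    case "$rest" in
        *,*) ;;
        *) return 1 ;;
    esac
    iqn=${rest%%,*}
    rest=${rest#*,}

    portal=${rest%%,*}
    case "$rest" in
        *,*) rest=${rest#*,} ;;
        *) rest='' ;;
    esac

    # Whatever is left is a comma-separated list of key=value pairs.
    while [ -n "$rest" ]; do
        field=${rest%%,*}
        case "$rest" in
            *,*) rest=${rest#*,} ;;
            *) rest='' ;;
        esac
        case "$field" in
            auth_method=*) auth_method=${field#auth_method=} ;;
            user=*) user=${field#user=} ;;
            *) return 1 ;;
        esac
    done

    [ -n "$iqn" ] && [ -n "$portal" ]
}
```

Splitting the password off before anything else is what allows it to contain separators; every other field is assumed to be free of commas.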
Roger Pau Monné
2012-Dec-14 15:20 UTC
Re: Handling iSCSI block devices (Was: Driver domains and device handling)
On 14/12/12 13:30, Ian Jackson wrote:
> Roger Pau Monne writes ("Re: Handling iSCSI block devices (Was: Driver domains and device handling)"):
>> According to RFC 3720 and RFC 1035 IQNs should follow this format:
>>
>> iqn.yyyy-mm.com.example:optional.string
>>
>> The problem is that Open-iSCSI seems to accept almost anything, for
>> example iqn.yyyy-mm,com@example:... is a valid iqn from Open-iSCSI point
>> of view. The only character that Open-iSCSI doesn't seem to accept in
>> iqns is "/", but I don't really like using that as a field separator
>> inside of target. So I propose the following encoding for target:
>>
>> "<iqn>,<portal>"
>> "<iqn>,<portal>,<auth_method>,<user>,<password>"
>>
>> If a user/password is given, we should take care about what we write to
>> "params" xenstore backend field (because the DomU can read that). Would
>> you agree with the syntax described above?
>
> Wouldn't it be better to specify this in a more key/value like way ?

I guess we could use something like:

    "<iqn>,<portal>,auth_method=<auth_method>,user=<user>,password=<password>"

Where <iqn> and <portal> are required, and all other fields are optional. Password should always be the last field, because it can contain special characters, like "," or "=".

> The password is a problem. Perhaps we need to arrange not to write
> params to a place where the guest can see it, but that means upheaval
> for the interface to block scripts.

I was thinking of adding a new variable to aodev that can contain an extra parameter to pass to hotplug scripts, so we can directly pass the full diskspec to the hotplug script, and the hotplug script itself can decide what to save in the "params" field (to be used later in the shutdown/destroy).

>>> I think it should be controlled by the same argument. So maybe
>>> script=iscsi causes libxl to check for a dropping in the script file
>>> saying "yes do the prepare thing" or maybe it runs
>>> /etc/xen/scripts/block-iscsi--prepare or something.
>>
>> I like the approach to call the same hotplug script twice, the first
>> time use something like `/etc/xen/scripts/block-iscsi prepare`, and the
>> second time `/etc/xen/scripts/block-iscsi add`
>
> So how would we tell whether the script understood this ?

I'm still looking into the current hotplug script mess, but I only see the following lines in block-common.sh that should be changed:

    if [ "$command" != "add" ] &&
       [ "$command" != "remove" ]
    then
      log err "Invalid command: $command"
      exit 1
    fi

I think current block hotplug scripts will work nicely when passed the "prepare" command: they will become no-ops, since they all seem to use the following case:

    case "$command" in
      add)
        [...]
        ;;
      remove)
        [...]
        ;;
    esac

> Perhaps we should invent a new config parameter parallel to script
> which specifies an entirely new interface.
>
> Ian.
>
Ian Jackson
2012-Dec-14 15:46 UTC
Re: Handling iSCSI block devices (Was: Driver domains and device handling)
Roger Pau Monne writes ("Re: Handling iSCSI block devices (Was: Driver domains and device handling)"):
> I was thinking of adding a new variable to aodev that can contain an
> extra parameter to pass to hotplug scripts, so we can directly pass the
> full diskspec to the hotplug script and the hotplug script itself can
> decide what to save in the "params" field (to be used later in the
> shutdown/destroy).

That's all very well but command line parameters are visible in ps and so not suitable for passwords. Really there should be an area in xenstore that's for communication between the toolstack and the driver domain (including scripts in the latter), but which is not visible to the guest.

> > So how would we tell whether the script understood this ?
>
> I'm still looking into the current hotplug script mess, but I only see
> the following lines in block-common.sh that should be changed:
>
> if [ "$command" != "add" ] &&
> [ "$command" != "remove" ]

What about existing out-of-tree scripts ? Do they all use block-common ?

> then
> log err "Invalid command: $command"
> exit 1
> fi

And this is no good because if libxl does the error handling properly it would cause every attempt to fail :-). You could explicitly ignore prepare and unprepare.

> I think current block hotplug scripts will work nicely when passed the
> "prepare" command, they will become no-ops, since they all seem to use
> the following case:

I'm worried about out-of-tree scripts.

The existing hotplug script interface is pretty horrible TBH. Which is why I was suggesting inventing a new one. We could keep the old interface for out-of-tree and unconverted in-tree scripts, and provide a new parameter to request the new style.

Eg "method=<something>" rather than "script=<something>" would mean "set script to <something> and also set the flag saying `use the new script calling convention'"

Ian.
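The suggestion to explicitly ignore prepare and unprepare could look roughly like the sketch below. It is not the actual block-common.sh code: classify_command is a hypothetical helper, and it reports errors on stderr and through its return value where the real script calls "log err" and exits.

```sh
#!/bin/sh
# Hypothetical replacement for the command check quoted from
# block-common.sh. Prints "handle" for commands the script implements
# and "ignore" for the new commands that an unconverted script should
# silently accept; fails for anything else.
classify_command() {
    case "$1" in
        add|remove)
            echo handle
            ;;
        prepare|unprepare)
            echo ignore
            ;;
        *)
            echo "Invalid command: $1" >&2
            return 1
            ;;
    esac
}
```

A script using it would exit 0 straight away on "ignore", so that a toolstack which checks exit codes does not treat the extra calls as failures. This does nothing for out-of-tree scripts that do not share the common code, which is the concern raised above.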
Roger Pau Monné
2012-Dec-14 17:33 UTC
Re: Handling iSCSI block devices (Was: Driver domains and device handling)
On 14/12/12 16:46, Ian Jackson wrote:
> Roger Pau Monne writes ("Re: Handling iSCSI block devices (Was: Driver domains and device handling)"):
>> I was thinking of adding a new variable to aodev that can contain an
>> extra parameter to pass to hotplug scripts, so we can directly pass the
>> full diskspec to the hotplug script and the hotplug script itself can
>> decide what to save in the "params" field (to be used later in the
>> shutdown/destroy).
>
> That's all very well but command line parameters are visible in ps and
> so not suitable for passwords. Really there should be an area in
> xenstore that's for communication between the toolstack and the driver
> domain (including scripts in the latter), but which is not visible to
> the guest.
>
>>> So how would we tell whether the script understood this ?
>>
>> I'm still looking into the current hotplug script mess, but I only see
>> the following lines in block-common.sh that should be changed:
>>
>> if [ "$command" != "add" ] &&
>> [ "$command" != "remove" ]
>
> What about existing out-of-tree scripts ? Do they all use
> block-common ?
>
>> then
>> log err "Invalid command: $command"
>> exit 1
>> fi
>
> And this is no good because if libxl does the error handling properly
> it would cause every attempt to fail :-). You could explicitly ignore
> prepare and unprepare.
>
>> I think current block hotplug scripts will work nicely when passed the
>> "prepare" command, they will become no-ops, since they all seem to use
>> the following case:
>
> I'm worried about out-of-tree scripts.
>
>
> The existing hotplug script interface is pretty horrible TBH. Which
> is why I was suggesting inventing a new one. We could keep the old
> interface for out-of-tree and unconverted in-tree scripts, and provide
> a new parameter to request the new style.
>
> Eg "method=<something>" rather than "script=<something>" would mean
> "set script to <something> and also set the flag saying `use the new
> script calling convention'"

Yes, I agree that the current hotplug script interface is not good (if this was ever intended to be an interface).

When method=<foo> is used as a parameter in the disk specification, it is assumed that script <foo> is using the new hotplug calling convention.

Script <foo> will be called with exactly one of the following parameters:

 * prepare: called before the domain starts being built. This is especially interesting during migration, to offload as much work as possible from the "add" call, which is done during the blackout phase of migration. In the prepare state, the backend xenstore entries have not yet been created.

 * add: called to connect the device. Xenstore backend entries exist, and backend state is 2 (XenbusStateInitWait).

 * remove: called to disconnect the device. Xenstore backend entries exist, and backend state is 6 (XenbusStateClosed).

Environment variables the script can use (set by the caller):

 * BACKEND_PATH: path to the xenstore backend of the related device, e.g. /local/domain/0/backend/vbd/3/51712/. Empty when the script is called with the "prepare" argument. When the new hotplug calling convention is used, the toolstack will not write the backend "params" node; it is up to the hotplug script to write it if necessary.

 * HOTPLUG_PATH: path to the xenstore directory that can be used to pass extra parameters to the script. In this implementation only the "params" variable is set, and it will contain the full diskspec string. This xenstore path will not be deleted until the script has been called with the "remove" parameter, so it can be used to store information that will persist between the different hotplug calls.

I'm not sure where HOTPLUG_PATH should reside; does /local/domain/<backend_domid>/libxl/hotplug/<domid>/<devid>/ sound ok?
Ian Jackson
2012-Dec-14 18:26 UTC
Re: Handling iSCSI block devices (Was: Driver domains and device handling)
Roger Pau Monne writes ("Re: Handling iSCSI block devices (Was: Driver domains and device handling)"):
> Script <foo> will be called with only one of the following parameters:
>
> * prepare: called before start building the domain, this is specially
> interesting during migration to offload as much work as possible from
> the "add" call, which is done during the blackout phase of migration. In
> the prepare state, the backend xenstore entries have not yet been created.
>
> * add: called to connect the device. Xenstore backend entries exist,
> and backend state is 2 (XenbusStateInitWait).
>
> * remove: called to disconnect the device. Xenstore backend entries
> exists, and backend state is 6 (XenbusStateClosed).

I assume we need an unprepare here too.

> Environment variables the script can use (set by the caller):
...
> I'm not sure where HOTPLUG_PATH should reside, does
> /local/domain/<backend_domid>/libxl/hotplug/<domid>/<devid>/ sound ok?

I think that would be fine.

Ian.
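A script following the calling convention quoted above, extended with the unprepare command suggested in this reply, might be shaped like the skeleton below. It is a sketch under assumptions: hotplug_main and the XENSTORE_READ override are hypothetical names, the bodies only print what they would act on, and xenstore-read is the standard Xen command-line client. BACKEND_PATH, HOTPLUG_PATH and the "params" node come from the proposal.

```sh
#!/bin/sh
# Skeleton of a "method=" hotplug script. XENSTORE_READ is a
# hypothetical override so the skeleton can be exercised without a
# running xenstore; it defaults to the real xenstore-read tool.
XENSTORE_READ="${XENSTORE_READ:-xenstore-read}"

hotplug_main() {
    case "$1" in
        prepare)
            # Backend entries do not exist yet and BACKEND_PATH is empty;
            # the full diskspec is read from HOTPLUG_PATH/params.
            [ -n "$HOTPLUG_PATH" ] || return 1
            diskspec=$($XENSTORE_READ "${HOTPLUG_PATH%/}/params") || return 1
            echo "prepare: $diskspec"
            ;;
        add)
            # Backend entries exist, state is 2 (XenbusStateInitWait).
            # The script writes the backend "params" node itself if needed.
            echo "add: $BACKEND_PATH"
            ;;
        remove)
            # Backend entries exist, state is 6 (XenbusStateClosed).
            echo "remove: $BACKEND_PATH"
            ;;
        unprepare)
            # Undo whatever prepare did.
            echo "unprepare"
            ;;
        *)
            echo "Invalid command: $1" >&2
            return 1
            ;;
    esac
}
```

A real script would end with something like `hotplug_main "$1"` and replace the echo lines with actual work. Reading the diskspec from xenstore rather than from the command line keeps the password out of the output of ps.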
Roger Pau Monné
2012-Dec-14 18:38 UTC
Re: Handling iSCSI block devices (Was: Driver domains and device handling)
On 14/12/12 19:26, Ian Jackson wrote:
> Roger Pau Monne writes ("Re: Handling iSCSI block devices (Was: Driver domains and device handling)"):
>> * remove: called to disconnect the device. Xenstore backend entries
>> exists, and backend state is 6 (XenbusStateClosed).
>
> I assume we need an unprepare here too.

I had also thought of that, but the reason for prepare to exist is to reduce the time that the "add" operation takes, thus reducing the blackout phase during migration.

There's no such problem in the remove phase, but I guess we need an unprepare in case there's a failure between the prepare and add operations, and we wish to give the hotplug script an opportunity to unwind whatever the prepare operation has done.
Ian Jackson
2012-Dec-17 11:47 UTC
Re: Handling iSCSI block devices (Was: Driver domains and device handling)
Roger Pau Monne writes ("Re: Handling iSCSI block devices (Was: Driver domains and device handling)"):
> On 14/12/12 19:26, Ian Jackson wrote:
> > Roger Pau Monne writes ("Re: Handling iSCSI block devices (Was: Driver domains and device handling)"):
> >> * remove: called to disconnect the device. Xenstore backend entries
> >> exists, and backend state is 6 (XenbusStateClosed).
> >
> > I assume we need an unprepare here too.
>
> I've also thought that, but the reason for prepare to exist is to reduce
> the time that the "add" operation takes, thus reducing the blackout
> phase during migration.

The unprepare operation might also be slow. (Of course we believe in crash-only software but the storage provider might not...)

> There's no such problem in the remove phase, but I guess we need an
> unprepare in case there's a failure between the prepare and add
> operations, and we wish to give the hotplug script an opportunity to
> unwind whatever the prepare operation has done.
>

Yes.

Ian.
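The unwinding case agreed on above can be illustrated from the caller's side. In libxl this logic would live in C, so the shell function below is only a sketch of the ordering; connect_device is a hypothetical name, and calling unprepare (rather than remove) after a failed add is an assumption the thread does not spell out.

```sh
#!/bin/sh
# connect_device <script>: run "prepare", then "add". If "add" fails,
# call "unprepare" so the script can undo what "prepare" did.
# Returns 0 only if both phases succeeded.
connect_device() {
    script=$1
    "$script" prepare || return 1
    if ! "$script" add; then
        "$script" unprepare
        return 1
    fi
}
```

During a migration the prepare call would be issued before the blackout and the add call during it; they are shown back to back here only to make the error path visible.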