Hello list,

I'm reading a lot about Xen stub domains, but I'm wondering if I can use a
Linux stubdom to serve a "transformed" block device to the corresponding domU.
The wiki page(s) and the stubdom directory in the source code leave a lot of
questions open, which I hope someone here can answer.

So my questions are:
* What are the requirements to run Linux inside a stubdom? Is a current pvops
  kernel enough, or does the Linux kernel have to be modified for a stubdom?
  If this works, I would prepare a kernel and a minimal rootfs within an
  initrd to set up my block device for the domU.
* How can I offer a block device (created within the stubdom) from the stubdom
  to the domU? Are there any docs on how to configure this?
* If the above is not possible, how could I offer a block device from one domU
  to another domU as a block device? Are there any docs on how to do this?

What I'm trying to do:
* In my case, I make logical volumes available on all hosts with the same path
  on each host. So I can assemble a software RAID1 where each device lives on
  a different server, without losing the possibility of live migration.
* Configure a domU with two block devices; the corresponding (Linux) stubdom
  assembles a software RAID1 (Linux md device) and presents this md device to
  the domU. So the domU doesn't have to handle anything related to the
  software RAID1 but has ONE redundant block device for its use.
* I have two use cases. One is an HVM domU running something like Windows;
  because of the lack of (good) software RAID1 there, I use the Linux software
  RAID1 for backing the block device inside the stubdom.
  The other use case is a PV domU where the admin of the virtualization
  environment is not the admin of the domU and therefore isn't able to manage
  the software RAID1 inside the domU. So the stubdom could be used to manage
  the software RAID1 without interfering with the domU.

--
greetings eMHa
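P.S. To make the idea concrete, what I imagine the stubdom doing with its two
block devices is roughly the following (untested sketch; device and array
names are only placeholders):

  # inside the helper domain, which sees the two mirror halves as xvda/xvdb
  mdadm --create /dev/md/mydomu --level=1 --raid-devices=2 /dev/xvda /dev/xvdb
  # (on later starts only re-assemble instead of creating:)
  mdadm --assemble /dev/md/mydomu /dev/xvda /dev/xvdb
  # /dev/md/mydomu would then be exported to the real domU as its single,
  # redundant block device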
Hi,

On 2013-01-29 16:46:40 Markus Hochholdinger wrote:
> [...]
> What I'm trying to do:
> * In my case, I make logical volumes available on all hosts with the same
>   path on each host. So I can assemble a software RAID1 where each device
>   lives on a different server, without losing the possibility of live
>   migration.
> * Configure a domU with two block devices; the corresponding (Linux) stubdom
>   assembles a software RAID1 (Linux md device) and presents this md device
>   to the domU. So the domU doesn't have to handle anything related to the
>   software RAID1 but has ONE redundant block device for its use.
> * I have two use cases. One is an HVM domU running something like Windows;
>   because of the lack of (good) software RAID1 there, I use the Linux
>   software RAID1 for backing the block device inside the stubdom.
>   The other use case is a PV domU where the admin of the virtualization
>   environment is not the admin of the domU and therefore isn't able to
>   manage the software RAID1 inside the domU. So the stubdom could be used to
>   manage the software RAID1 without interfering with the domU.

Out of curiosity:
Is there anything I am missing as to why what you're trying to achieve cannot
be done in dom0 alone?
- I suppose you could create an LVM2 mirror on top of the remote volumes
  (assuming using iSCSI volumes as LVM2 PVs is possible).
- Did you consider DRBD? It seems to me it provides what you need.

Sorry, though, I cannot help you with stubdoms.

- peter.
On Tue, 2013-01-29 at 15:46 +0000, Markus Hochholdinger wrote:
> Hello list,
>
> I'm reading a lot about Xen stub domains, but I'm wondering if I can use a
> Linux stubdom to serve a "transformed" block device to the corresponding
> domU. The wiki page(s) and the stubdom directory in the source code leave a
> lot of questions open, which I hope someone here can answer.

I think the thing you are looking for is a "driver domain" rather than a
stubdomain, http://wiki.xen.org/wiki/Driver_Domain. You'll likely find more
useful info if you google for that rather than stubdomain.

> So my questions are:
> * What are the requirements to run Linux inside a stubdom? Is a current
>   pvops kernel enough, or does the Linux kernel have to be modified for a
>   stubdom? If this works, I would prepare a kernel and a minimal rootfs
>   within an initrd to set up my block device for the domU.

You can use Linux as a block driver storage domain, yes.

> * How can I offer a block device (created within the stubdom) from the
>   stubdom to the domU? Are there any docs on how to configure this?
> * If the above is not possible, how could I offer a block device from one
>   domU to another domU as a block device? Are there any docs on how to do
>   this?

Since a driver domain is also a domU (just one which happens to provide
services to other domains) these are basically the same question. People more
often do this with network driver domains, but block ought to be possible too
(although there may be a certain element of having to take the pieces and
build something yourself).

Essentially you just need to a) make the block device bits available in the
driver domain, b) run blkback (or some other block backend) in the driver
domain and c) tell the toolstack that a particular device is provided by the
driver domain when you build the guest.

For a) one would usually use PCI passthrough to pass a storage controller to
the driver domain and use the regular drivers in there. But you could also
use e.g. iSCSI or NFS (I guess). If you want to also use this controller for
dom0's disks then that's a bit more complex...

For b) that's just a case of compiling in the appropriate driver and
installing the appropriate hotplug scripts in the domain.

For c) I'm not entirely sure how you do that with either xend or xl in
practice. I know there have been some patches on xen-devel not so long ago to
improve things for xl support of disk driver domains. It is possible that you
might need to hack the toolstack a bit to get this to work, and depending on
how and when the disk images are constructed you may need some out-of-band
communication between the toolstack domain and the driver domain to actually
create the underlying devices. (A rough example of what c) might look like
with xl is at the bottom of this mail.)

The biggest problem I can see is supporting Windows HVM, since the device
model also needs to have access to the disk in order to provide the emulated
devices (at least initially, hopefully you have PV drivers). The usual way to
do this is to attach a PV device to the domain running the device model where
the backend is supported by the driver domain as well. Again you might need
to hack up a few things to get this working.

> What I'm trying to do:
> [...]
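For c), assuming xl and a driver domain called (say) "storagedom" which
exports the assembled md device, the disk line in the guest config might look
something like this -- untested, the names are made up and the exact keys
depend on your xl version:

  # in the guest's xl config
  disk = [ 'backend=storagedom, vdev=xvda, format=raw, target=/dev/md/mydomu' ]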
Hi,

On 29.01.2013 at 17:15, "Peter Gansterer" <peter.gansterer@paradigma.net>
wrote:
> On 2013-01-29 16:46:40 Markus Hochholdinger wrote:
[..]
> Out of curiosity:
> Is there anything I am missing as to why what you're trying to achieve
> cannot be done in dom0 alone?
> - I suppose you could create an LVM2 mirror on top of the remote volumes
>   (assuming using iSCSI volumes as LVM2 PVs is possible).
> - Did you consider DRBD? It seems to me it provides what you need.

Well, if I create/assemble an md device in one dom0, I can't live migrate the
domU. If I create/assemble the md device simultaneously on the destination
dom0, I risk data corruption.

If I use DRBD, the performance is not that good and I'm limited to two dom0s
(OK, with newer DRBD I can have multiple slaves without stacking). And I have
the problem of split-brain situations.

As my tests have shown, I get the most out of the hardware if I use a
software RAID1 inside Linux domUs. The local logical volumes are exported
over iSCSI to the other dom0s, and every logical volume has the same symlink
in /dev on each dom0. The other reason is that I have no problem with split
brain: if the domU doesn't run, the software RAID1 doesn't run either.

I have had this setup, with Linux domUs and software RAID1 inside the domUs,
successfully in production since 2006, but it has the limitation that the
software RAID1 has to be managed inside the domUs. Now I'm searching for
solutions where the software RAID1 is not inside the domU but very near the
domU, like in a stubdom.

--
greetings eMHa
Hello,

On 29.01.2013 at 17:36, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2013-01-29 at 15:46 +0000, Markus Hochholdinger wrote:
> > I'm reading a lot about Xen stub domains, but I'm wondering if I can use
> > a Linux stubdom to serve a "transformed" block device to the
> > corresponding domU. [...]
> I think the thing you are looking for is a "driver domain" rather than a
> stubdomain, http://wiki.xen.org/wiki/Driver_Domain. You'll likely find
> more useful info if you google for that rather than stubdomain.

A driver domain is a great thing, but I'm wondering if I can live migrate a
domU together with its driver domU. In the domU config I have to use the
$domid of the driver domain. This $domid is probably wrong after a live
migration of the domU, and I would have to migrate the driver domU at the
same time as the domU. If the driver domU is meant to stay on one dom0,
there's no difference to doing this in dom0, and that doesn't work for me.

Do you know how I can live migrate a domU which depends on a driver domU?
How can I migrate the driver domU? To my understanding, the block device has
to be there on the destination dom0 before live migration begins, but it is
also used on the source dom0 by the migrating, still running, domU.

Can I combine a driver domU with a normal domU like I can combine a stubdom
with a normal domU? I thought a stubdom live migrates with its domU, so you
don't have to worry that the driver domU live migrates while the according
normal domU migrates.

> > So my questions are:
> > * What are the requirements to run Linux inside a stubdom? [...]
> You can use Linux as a block driver storage domain, yes.

OK, I see. But still, would it be possible to run Linux in a stub-domain?
I've read e.g. http://blog.xen.org/index.php/2012/12/12/linux-stub-domain/
which describes this, but I'm unsure if this will be (or already is)
supported by current Xen?

> > * How can I offer a block device (created within the stubdom) from the
> >   stubdom to the domU? [...]
> > * If the above is not possible, how could I offer a block device from
> >   one domU to another domU as a block device? [...]
> Since a driver domain is also a domU (just one which happens to provide
> services to other domains) these are basically the same question. People
> more often do this with network driver domains, but block ought to be
> possible too (although there may be a certain element of having to take
> the pieces and build something yourself).

Is live migration possible with these driver domUs? What requirements are
needed so that I can live migrate a domU which depends on a driver domU?

> Essentially you just need to a) make the block device bits available in
> the driver domain, b) run blkback (or some other block backend) in the
> driver domain and c) tell the toolstack that a particular device is
> provided by the driver domain when you build the guest.

Yes, I would provide two block devices (logical volumes) to the driver domU,
create a software RAID1 device there and make the md device available with
blkback. I would do this for each domU so I can live migrate the domUs
independently. The driver domU only needs a kernel and an initrd with a
rootfs containing just enough to build the md device and export it with
blkback. (A rough sketch of such a driver domU config is at the end of this
mail.)

But how can I address this exported block device? As far as I've seen, I
need the $domid of the driver domain in the config file of my domU, or am I
missing something?

> For a) one would usually use PCI passthrough to pass a storage controller
> to the driver domain and use the regular drivers in there. But you could
> also use e.g. iSCSI or NFS (I guess). If you want to also use this
> controller for dom0's disks then that's a bit more complex...

Because I'd like to "migrate" the driver domain together with my normal
domU, I wouldn't do any PCI passthrough but only provide the logical volumes
backing the block device of one domU.

> For b) that's just a case of compiling in the appropriate driver and
> installing the appropriate hotplug scripts in the domain.
> For c) I'm not entirely sure how you do that with either xend or xl in
> practice. I know there have been some patches on xen-devel not so long ago
> to improve things for xl support of disk driver domains. It is possible
> that you might need to hack the toolstack a bit to get this to work, and
> depending on how and when the disk images are constructed you may need
> some out-of-band communication between the toolstack domain and the driver
> domain to actually create the underlying devices.

I'll test a driver domain so I can see what works for me and what doesn't.

> The biggest problem I can see is supporting Windows HVM, since the device
> model also needs to have access to the disk in order to provide the
> emulated devices (at least initially, hopefully you have PV drivers). The
> usual way to do this is to attach a PV device to the domain running the
> device model where the backend is supported by the driver domain as well.
> Again you might need to hack up a few things to get this working.

So this would be a dependency that the driver domain is started before the
stubdom with qemu. In some stubdom startup script I saw the parameter
"target" while creating the stubdom. Is this a way to combine two domUs?

As far as I understand, with a driver domU I need the $domid of the driver
domU in my config, so this is the connection between the two domUs. But with
stub-domains I haven't understood how data flows between stubdom and domU,
and because I've seen a lot of nice little pictures describing that I/O
flows from the domU over the stubdom to dom0 and back, I thought stub-domains
would be the way to go.

Many thanks so far for your information.

> What I'm trying to do:
> [...]

--
greetings eMHa
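P.S. What I have in mind for the per-domU driver domU is just something like
the following (untested sketch; names and paths are only placeholders):

  # config of the helper/driver domU "mydomu-md": it sees the two mirror
  # halves as plain virtual disks and assembles the md device from them
  name    = "mydomu-md"
  memory  = 128
  kernel  = "/boot/vmlinuz-driverdomu"
  ramdisk = "/boot/initrd-driverdomu"
  disk    = [ 'phy:/dev/xbd/mydomu.node1,xvda,w',
              'phy:/dev/xbd/mydomu.node2,xvdb,w' ]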
On Tue, 2013-01-29 at 19:32 +0000, Markus Hochholdinger wrote:
> Hello,
>
> On 29.01.2013 at 17:36, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > I think the thing you are looking for is a "driver domain" rather than a
> > stubdomain, http://wiki.xen.org/wiki/Driver_Domain. [...]
>
> A driver domain is a great thing, but I'm wondering if I can live migrate
> a domU together with its driver domU. In the domU config I have to use the
> $domid of the driver domain. This $domid is probably wrong after a live
> migration of the domU, and I would have to migrate the driver domU at the
> same time as the domU. If the driver domU is meant to stay on one dom0,
> there's no difference to doing this in dom0, and that doesn't work for me.

The change of $domid doesn't matter, since a migration involves reconnecting
the devices anyway, which means they will reconnect to the "new" driver
domain.

The normal way would be to have a driver domain per host, but there's no
reason you couldn't make it such that the driver domain was migrated too
(you'd have to do some hacking to make this work).

AIUI you currently have a RAID1 device in the guest, presumably constructed
from 2 xvd* devices? What are those two xvd* devices backed by? I presume it
must be some sort of network storage (NFS, iSCSI, NBD, DRBD) or else you just
couldn't migrate.

Are you intending to instead run the RAID1 device in a "driver domain",
constructed from 2 xvd* devices exported from dom0, and export that as a
single xvd* device to the guest? Or are you intending to surface the network
storage directly into the driver domain, construct the RAID device from
those and export that as an xvd* to the guest?

> Do you know how I can live migrate a domU which depends on a driver domU?
> How can I migrate the driver domU?
> To my understanding, the block device has to be there on the destination
> dom0 before live migration begins, but it is also used on the source dom0
> by the migrating, still running, domU.

Not quite: when you migrate there is a pause period while the final copy over
occurs, and at this point you can safely remove the device from the source
host and make it available on the target host. The toolstack will ensure that
the block device is only ever active on one end or the other and never on
both -- otherwise you would get potential corruption.

While you could migrate the driver domain during the main domU's pause
period, it is much more normal to simply have a driver domain on each host
and dynamically configure the storage as you migrate.

> Can I combine a driver domU with a normal domU like I can combine a
> stubdom with a normal domU?

If you want, but it would be more typical to have a single driver domain
providing block services to all domains (or one per underlying physical block
device).

> I thought a stubdom live migrates with its domU, so you don't have to
> worry that the driver domU live migrates while the according normal domU
> migrates.

A stubdom is a bit of an overloaded term. If you mean an ioemu stub domain
(i.e. the qemu associated with an HVM guest) then a new one of those is
started on the target host each time you migrate. If you mean a xenstored
stubdom then those are per host and are not migrated. And if you mean a
driver domain then, as I say, those are usually per host and the domain will
be connected to the appropriate local driver domain on the target host.

> > > So my questions are:
> > > * What are the requirements to run Linux inside a stubdom? [...]
> > You can use Linux as a block driver storage domain, yes.
>
> OK, I see. But still, would it be possible to run Linux in a stub-domain?
> I've read e.g.
> http://blog.xen.org/index.php/2012/12/12/linux-stub-domain/ which
> describes this, but I'm unsure if this will be (or already is) supported
> by current Xen?

This work is about a Linux ioemu stub domain. That is a stubdomain with the
sole purpose of running the qemu emulation process for an HVM domain. I think
the intention is for this to land in Xen 4.3, but it does not have anything
to do with your use case AFAICT.

Everything you want to do is already possible with what is in Xen and Linux
today, in that the mechanisms all exist. However, what you are doing is not
something which others have done, so there will necessarily need to be a
certain amount of putting the pieces together on your part.

> > Since a driver domain is also a domU (just one which happens to provide
> > services to other domains) these are basically the same question. [...]
>
> Is live migration possible with these driver domUs? What requirements are
> needed so that I can live migrate a domU which depends on a driver domU?
>
> > Essentially you just need to a) make the block device bits available in
> > the driver domain, b) run blkback (or some other block backend) in the
> > driver domain and c) tell the toolstack that a particular device is
> > provided by the driver domain when you build the guest.
>
> Yes, I would provide two block devices (logical volumes) to the driver
> domU,

How are you doing this? Where do those logical volumes come from and how are
they getting into the driver domU?

> create a software RAID1 device there and make the md device available
> with blkback. I would do this for each domU so I can live migrate the
> domUs independently. The driver domU only needs a kernel and an initrd
> with a rootfs containing just enough to build the md device and export it
> with blkback.
>
> But how can I address this exported block device? As far as I've seen, I
> need the $domid of the driver domain in the config file of my domU, or am
> I missing something?

$domid can also be a domain name, and you can also change this over migration
by providing an updated configuration file (at least with xl).

> So this would be a dependency that the driver domain is started before the
> stubdom with qemu.

Yes.

> In some stubdom startup script I saw the parameter "target" while creating
> the stubdom. Is this a way to combine two domUs?

I think "target" in this context refers to the HVM guest for which the
ioemu-stubdom is providing services.

> As far as I understand, with a driver domU I need the $domid of the driver
> domU in my config, so this is the connection between the two domUs.
> But with stub-domains I haven't understood how data flows between stubdom
> and domU, and because I've seen a lot of nice little pictures describing
> that I/O flows from the domU over the stubdom to dom0 and back, I thought
> stub-domains would be the way to go.

Only if you are using emulated I/O. I assumed you were using PV I/O, is that
not the case?

Ian.
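P.S. For the "updated configuration file" part, with xl that would be
something along the lines of the following (check the xl man page for the
exact option; names are made up):

  xl migrate -C mydomu-on-target.cfg mydomu target-host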
Hello,

On 30.01.2013 at 10:36, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2013-01-29 at 19:32 +0000, Markus Hochholdinger wrote:
[..]
> The change of $domid doesn't matter, since a migration involves
> reconnecting the devices anyway, which means they will reconnect to the
> "new" driver domain.

OK, I understand: it is not the numeric id that has to stay the same but the
name of the driver domU.

> The normal way would be to have a driver domain per host, but there's no
> reason you couldn't make it such that the driver domain was migrated too
> (you'd have to do some hacking to make this work).

If my driver domU is on the same hardware host as the domU, I don't have to
care about split-brain situations for the storage. So my idea is to have one
driver domU for each normal domU. The live migration is possibly difficult:
as I understand it, I somehow have to create the driver domU on the
destination so that the block device my normal domU is to be connected to
exists there. I'll look into this if I find no easier solution.

> AIUI you currently have a RAID1 device in the guest, presumably constructed
> from 2 xvd* devices? What are those two xvd* devices backed by? I presume
> it must be some sort of network storage (NFS, iSCSI, NBD, DRBD) or else you
> just couldn't migrate.

Well, perhaps some device paths say more than my bad English:

  node1:/dev/xbd/mydomu.node1 -> /dev/vg0/mydomu (also exported over iSCSI)
  node1:/dev/xbd/mydomu.node2 -> /dev/sdx (imported over iSCSI)
  node2:/dev/xbd/mydomu.node1 -> /dev/sdy (imported over iSCSI)
  node2:/dev/xbd/mydomu.node2 -> /dev/vg0/mydomu (also exported over iSCSI)
  node3:/dev/xbd/mydomu.node1 -> /dev/sdy (imported over iSCSI)
  node3:/dev/xbd/mydomu.node2 -> /dev/sdz (imported over iSCSI)

In /dev/xbd/* there are only symlinks to the according device, so I have a
consistent path on all nodes. On all hardware nodes (node1, node2 and node3)
I can access the logical volume /dev/vg0/mydomu of node1 with the path
/dev/xbd/mydomu.node1, which I use in my domU configurations. If I'm not on
node1, the block device is transported over iSCSI. Because
/dev/xbd/mydomu.node1 points to the same block device on all nodes, I'm able
to live migrate the domUs independently of the physical location of the
logical volume.

So my xvda and xvdb inside the domU are backed by /dev/xbd/mydomu.node1 and
/dev/xbd/mydomu.node2, and if one or both of these logical volumes are not
local, iSCSI (in dom0) is used for transport.

> Are you intending to instead run the RAID1 device in a "driver domain",
> constructed from 2 xvd* devices exported from dom0, and export that as a
> single xvd* device to the guest?

Yes, somehow. But the devices exported from dom0 don't have to be local
logical volumes of that dom0; they can be remote iSCSI block devices. For me
it is very important to be able to live migrate domUs but also to have the
storage redundant over at least two nodes.

> Or are you intending to surface the network storage directly into the
> driver domain, construct the RAID device from those and export that as an
> xvd* to the guest?

No.

> > Do you know how I can live migrate a domU which depends on a driver
> > domU? How can I migrate the driver domU?
> > To my understanding, the block device has to be there on the destination
> > dom0 before live migration begins, but it is also used on the source
> > dom0 by the migrating, still running, domU.
> Not quite: when you migrate there is a pause period while the final copy
> over occurs, and at this point you can safely remove the device from the
> source host and make it available on the target host. The toolstack will

Isn't the domU on the destination created with all its virtual devices before
the migration starts? What if blkback is not ready on the destination host?
Am I missing something?

> ensure that the block device is only ever active on one end or the other
> and never on both -- otherwise you would get potential corruption.

Yeah, this is the problem! If I migrate the active RAID1 logic within the
domU (aka Linux software RAID1), I don't have to care. I'll try to accomplish
the same with a "helper" domU very near to the normal domU, which is live
migrated while the normal domU is migrated.

> While you could migrate the driver domain during the main domU's pause
> period, it is much more normal to simply have a driver domain on each host
> and dynamically configure the storage as you migrate.

If I dynamically create the software RAID1, I have to add a lot of checks
which I don't need now. I've already thought about a software RAID1 in dom0
with the resulting md device as xvda for a domU. But I would have to assemble
the md device on the destination host before I can deactivate the md device
on the source host. One race condition is: if I deactivate the md device on
the source host while data is only written to one of the two devices, the
RAID1 seems clean on the destination host but my two devices differ. The
other race condition is that my RAID1 is inconsistent while assembling on the
destination host.

> > Can I combine a driver domU with a normal domU like I can combine a
> > stubdom with a normal domU?
> If you want, but it would be more typical to have a single driver domain
> providing block services to all domains (or one per underlying physical
> block device).

I want :-) A single driver domain would need more logic (for me) while doing
live migrations.

> > I thought a stubdom live migrates with its domU, so you don't have to
> > worry that the driver domU live migrates while the according normal domU
> > migrates.
> A stubdom is a bit of an overloaded term. If you mean an ioemu stub domain
> (i.e. the qemu associated with an HVM guest) then a new one of those is
> started on the target host each time you migrate.

OK, this isn't what I want. For me a (re)start on the destination host is no
better than doing the software RAID1 in dom0.

> If you mean a xenstored stubdom then those are per host and are not
> migrated.

Hm, if they are not migrated they don't behave like I expect.

> And if you mean a driver domain then, as I say, those are usually per host
> and the domain will be connected to the appropriate local driver domain on
> the target host.

OK, it seems I want a driver domain which I would migrate while migrating the
according normal domU.

[..]

> > OK, I see. But still, would it be possible to run Linux in a
> > stub-domain? I've read e.g.
> > http://blog.xen.org/index.php/2012/12/12/linux-stub-domain/ which
> > describes this, but I'm unsure if this will be (or already is) supported
> > by current Xen?
> This work is about a Linux ioemu stub domain. That is a stubdomain with the
> sole purpose of running the qemu emulation process for an HVM domain. I
> think the intention is for this to land in Xen 4.3, but it does not have
> anything to do with your use case AFAICT.

OK, I see that. If the Linux ioemu stub domU is (re)started on the
destination host on a live migration, it doesn't solve my problem.

> Everything you want to do is already possible with what is in Xen and
> Linux today, in that the mechanisms all exist. However, what you are doing
> is not something which others have done, so there will necessarily need to
> be a certain amount of putting the pieces together on your part.

Yeah, this gives me hope :-)

[..]

> > Yes, I would provide two block devices (logical volumes) to the driver
> > domU,
> How are you doing this? Where do those logical volumes come from and how
> are they getting into the driver domU?

See the explanation above for details: the logical volumes come from the
local host and/or from remote hosts over iSCSI, with a consistent path on all
hosts. (A sketch of the import/symlink step is at the end of this mail.)

> > create a software RAID1 device there and make the md device available
> > with blkback. I would do this for each domU so I can live migrate the
> > domUs independently. The driver domU only needs a kernel and an initrd
> > with a rootfs containing just enough to build the md device and export
> > it with blkback.
> > But how can I address this exported block device? As far as I've seen, I
> > need the $domid of the driver domain in the config file of my domU, or
> > am I missing something?
> $domid can also be a domain name, and you can also change this over
> migration by providing an updated configuration file (at least with xl).

If $domid can be a name, it really would be possible for me. Great!

> > So this would be a dependency that the driver domain is started before
> > the stubdom with qemu.
> Yes.
> > In some stubdom startup script I saw the parameter "target" while
> > creating the stubdom. Is this a way to combine two domUs?
> I think "target" in this context refers to the HVM guest for which the
> ioemu-stubdom is providing services.

Yes, I've also thought this way. So my idea was to create a driver domain
with target set to the according domU, so I only "see" one domU. I thought it
could ease the management.

> > As far as I understand, with a driver domU I need the $domid of the
> > driver domU in my config, so this is the connection between the two
> > domUs. But with stub-domains I haven't understood how data flows between
> > stubdom and domU, and because I've seen a lot of nice little pictures
> > describing that I/O flows from the domU over the stubdom to dom0 and
> > back, I thought stub-domains would be the way to go.
> Only if you are using emulated I/O. I assumed you were using PV I/O, is
> that not the case?

OK, my bad. I have two use cases:
1. Provide a redundant block device for a PV domU where I'm not able to
   manage the software RAID1 inside the domU.
2. Provide a redundant block device for an HVM domU running operating systems
   which have no (good) software RAID1 implementation.

Many thanks so far. I'll try the driver domain approach and test whether it
solves my problem and doesn't lose too much performance.

--
greetings eMHa
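P.S. The import/symlink step mentioned above is essentially only this per
volume (IQN, portal and device names are made up):

  # on a node that does not hold the volume locally, log in to the peer ...
  iscsiadm -m node -T iqn.2006-01.net.example:node1.vg0.mydomu -p node1:3260 --login
  # ... and give whatever device shows up the stable name used in the configs
  ln -sf /dev/sdx /dev/xbd/mydomu.node1   # in practice done by a udev rule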
On Wed, 2013-01-30 at 15:35 +0000, Markus Hochholdinger wrote:
> > > Do you know how I can live migrate a domU which depends on a driver
> > > domU? How can I migrate the driver domU?
> > > To my understanding, the block device has to be there on the
> > > destination dom0 before live migration begins, but it is also used on
> > > the source dom0 by the migrating, still running, domU.
> > Not quite: when you migrate there is a pause period while the final copy
> > over occurs, and at this point you can safely remove the device from the
> > source host and make it available on the target host. The toolstack will
>
> Isn't the domU on the destination created with all its virtual devices
> before the migration starts?

No.

> What if blkback is not ready on the destination host?

We have to arrange that it is.

> Am I missing something?

Migration is a staged process.

     1. First an empty shell domain (with no devices) is created on the
        target host.
     2. Then we copy the memory over, in several iterations, while the
        domain is running on the source host (iterations happen to handle
        the guest dirtying memory as we copy, this is the "live" aspect of
        the migration).
     3. After some iterations of live migration we pause the source guest.
     4. Now we copy the remaining dirty RAM.
     5. Tear down devices on the source host.
     6. Set up devices on the target host for the incoming domain.
     7. Resume the guest on the target domain.
     8. Guest reconnects to new backend.

The key point is that the devices are only ever active on either the source
or the target host and never both. The domain is paused during this final
transfer (from #3 until #7) and therefore guest I/O is quiesced.

In your scenario I would expect that in the interval of #5,#6 you would
migrate the associated driver domain.

> > ensure that the block device is only ever active on one end or the other
> > and never on both -- otherwise you would get potential corruption.
>
> Yeah, this is the problem! If I migrate the active RAID1 logic within the
> domU (aka Linux software RAID1), I don't have to care. I'll try to
> accomplish the same with a "helper" domU very near to the normal domU,
> which is live migrated while the normal domU is migrated.

This might be possible, but as I say the more normal approach would be to
have a "RAID" domain on both hosts and dynamically map and unmap the backing
guest disks at steps #5 and #6 above.

> > While you could migrate the driver domain during the main domU's pause
> > period, it is much more normal to simply have a driver domain on each
> > host and dynamically configure the storage as you migrate.
>
> If I dynamically create the software RAID1, I have to add a lot of checks
> which I don't need now.
> I've already thought about a software RAID1 in dom0 with the resulting md
> device as xvda for a domU. But I would have to assemble the md device on
> the destination host before I can deactivate the md device on the source
> host.

No you don't, you deactivate on the source (step #5) before activating on the
target (step #6).

> One race condition is: if I deactivate the md device on the source host
> while data is only written to one of the two devices, the RAID1 seems
> clean on the destination host but my two devices differ. The other race
> condition is that my RAID1 is inconsistent while assembling on the
> destination host.

I'd have thought that shutting down the RAID in step #5 and reactivating it
in step #6 would guarantee that neither of these were possible.

> > > Can I combine a driver domU with a normal domU like I can combine a
> > > stubdom with a normal domU?
> > If you want, but it would be more typical to have a single driver domain
> > providing block services to all domains (or one per underlying physical
> > block device).
>
> I want :-) A single driver domain would need more logic (for me) while
> doing live migrations.

OK, but be aware that you are treading into unexplored territory; most people
do things the other way. This means you are likely going to have to do a fair
bit of heavy lifting yourself.

Ian.
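P.S. To make the #5/#6 idea concrete for your md case, those two steps would
boil down to roughly the following (device names made up, error handling
omitted):

  mdadm --stop /dev/md/mydomu                         # step #5, on the source host
  mdadm --assemble /dev/md/mydomu \
        /dev/xbd/mydomu.node1 /dev/xbd/mydomu.node2   # step #6, on the target host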
Hello,

On 31.01.2013 at 12:22, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2013-01-30 at 15:35 +0000, Markus Hochholdinger wrote:
> > Isn't the domU on the destination created with all its virtual devices
> > before the migration starts?
> No.

Oh, this opens new possibilities for me :-)

> > What if blkback is not ready on the destination host?
> We have to arrange that it is.

OK, I see.

> > Am I missing something?
> Migration is a staged process.
>      1. First an empty shell domain (with no devices) is created on the
>         target host.
>      2. Then we copy the memory over, in several iterations, while the
>         domain is running on the source host [...]
>      3. After some iterations of live migration we pause the source guest.
>      4. Now we copy the remaining dirty RAM.
>      5. Tear down devices on the source host.
>      6. Set up devices on the target host for the incoming domain.
>      7. Resume the guest on the target domain.
>      8. Guest reconnects to new backend.
> The key point is that the devices are only ever active on either the
> source or the target host and never both. The domain is paused during this
> final transfer (from #3 until #7) and therefore guest I/O is quiesced.

At what point are scripts like

  disk = [ "..., script=myblockscript.sh" ]

executed? Would this be between #3 and #7?

> In your scenario I would expect that in the interval of #5,#6 you would
> migrate the associated driver domain.

OK.

> > Yeah, this is the problem! If I migrate the active RAID1 logic within
> > the domU (aka Linux software RAID1), I don't have to care. I'll try to
> > accomplish the same with a "helper" domU very near to the normal domU,
> > which is live migrated while the normal domU is migrated.
> This might be possible, but as I say the more normal approach would be to
> have a "RAID" domain on both hosts and dynamically map and unmap the
> backing guest disks at steps #5 and #6 above.

With the above info, that block devices are removed and added in the right
order while doing live migration, I'm thinking more and more about a driver
domain. But first I'll test the stopping and assembling of md devices in the
dom0s while migrating. If this works, I could put this job into a driver
domain. Wow, this gives me a new view of the setup.

> > If I dynamically create the software RAID1, I have to add a lot of
> > checks which I don't need now.
> > I've already thought about a software RAID1 in dom0 with the resulting
> > md device as xvda for a domU. But I would have to assemble the md device
> > on the destination host before I can deactivate the md device on the
> > source host.
> No you don't, you deactivate on the source (step #5) before activating on
> the target (step #6).

This wasn't clear to me before. Many thanks for this info.

> > One race condition is: if I deactivate the md device on the source host
> > while data is only written to one of the two devices, the RAID1 seems
> > clean on the destination host but my two devices differ. The other race
> > condition is that my RAID1 is inconsistent while assembling on the
> > destination host.
> I'd have thought that shutting down the RAID in step #5 and reactivating
> it in step #6 would guarantee that neither of these were possible.

I'll try this first of all. If this works, I'll recheck the performance
against DRBD and probably try a driver domain with this.

> > > > Can I combine a driver domU with a normal domU like I can combine a
> > > > stubdom with a normal domU?
> > > If you want, but it would be more typical to have a single driver
> > > domain providing block services to all domains (or one per underlying
> > > physical block device).
> > I want :-) A single driver domain would need more logic (for me) while
> > doing live migrations.
> OK, but be aware that you are treading into unexplored territory; most
> people do things the other way. This means you are likely going to have to
> do a fair bit of heavy lifting yourself.

If this solves my problem, I'm willing to go into unexplored territory :-)
But I'm also sane enough to test the common ways first.

Many thanks so far.

--
greetings eMHa
> > > Am I missing something?
> > Migration is a staged process.
> >      1. First an empty shell domain (with no devices) is created on the
> >         target host.
> >      [...]
> >      8. Guest reconnects to new backend.
> > The key point is that the devices are only ever active on either the
> > source or the target host and never both. The domain is paused during
> > this final transfer (from #3 until #7) and therefore guest I/O is
> > quiesced.
>
> At what point are scripts like
>
>   disk = [ "..., script=myblockscript.sh" ]
>
> executed? Would this be between #3 and #7?

It is part of the device teardown and setup, so it is during #5 and #6
(strictly, I think it is just after #5 and just before #6).

On xen-devel at the minute there is a patch series under discussion to make
the script hooks more flexible, in particular adding pre- and post-migrate
hooks (called somewhere around #1-#3 and #7-#8) which can pre-setup bits of
the storage stack which are safe to do with the guest running but might be
slow to initialise (e.g. iSCSI login, but not opening the device). I don't
think this needs to affect you, though.

> > > Yeah, this is the problem! If I migrate the active RAID1 logic within
> > > the domU (aka Linux software RAID1), I don't have to care. I'll try to
> > > accomplish the same with a "helper" domU very near to the normal domU,
> > > which is live migrated while the normal domU is migrated.
> > This might be possible, but as I say the more normal approach would be
> > to have a "RAID" domain on both hosts and dynamically map and unmap the
> > backing guest disks at steps #5 and #6 above.
>
> With the above info, that block devices are removed and added in the right
> order while doing live migration, I'm thinking more and more about a
> driver domain.
>
> But first I'll test the stopping and assembling of md devices in the dom0s
> while migrating. If this works, I could put this job into a driver domain.
> Wow, this gives me a new view of the setup.

Excellent ;-)

Ian.
Hello,

On 01.02.2013 at 09:56, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > At what point are scripts like
> >   disk = [ "..., script=myblockscript.sh" ]
> > executed? Would this be between #3 and #7?
> It is part of the device teardown and setup, so it is during #5 and #6
> (strictly, I think it is just after #5 and just before #6).
> On xen-devel at the minute there is a patch series under discussion to
> make the script hooks more flexible, in particular adding pre- and
> post-migrate hooks [...] which can pre-setup bits of the storage stack
> which are safe to do with the guest running but might be slow to
> initialise (e.g. iSCSI login, but not opening the device). I don't think
> this needs to affect you, though.

At least with the xm toolstack, the block scripts in /etc/xen/scripts/block-*
are executed on the destination host (add) before the script (remove) is
executed on the source host while live migrating a domU.
(I've created the script /etc/xen/scripts/block-md, which assembles and stops
RAID1 devices; a stripped-down sketch of it is at the end of this mail.)

Next, I'll test the xl toolstack with this setup.

[..]

--
greetings eMHa
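P.S. For reference, the block-md script is essentially nothing more than the
following (a stripped-down sketch with no error handling; it leans on the
helpers from the in-tree block-common.sh and assumes the array components are
known from mdadm.conf or the on-disk metadata):

  #!/bin/bash
  # /etc/xen/scripts/block-md -- minimal sketch of a custom block hotplug script.
  # The toolstack calls it with "add" or "remove" and sets XENBUS_PATH.
  dir=$(dirname "$0")
  . "$dir/block-common.sh"       # provides $command, xenstore_read, write_dev

  # the disk line is set up so that the vbd's xenstore "params" key carries
  # the md device path
  md=$(xenstore_read "$XENBUS_PATH/params")

  case "$command" in
    add)
      mdadm --assemble "$md"     # bring up the mirror on this host
      write_dev "$md"            # hand the assembled device to blkback
      ;;
    remove)
      mdadm --stop "$md"         # tear the mirror down again
      ;;
  esac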
Hello,

On 06.02.2013 at 15:39, Markus Hochholdinger <Markus@hochholdinger.net>
wrote:
> On 01.02.2013 at 09:56, Ian Campbell <Ian.Campbell@citrix.com> wrote:
[..]
> > > executed? Would this be between #3 and #7?
> > It is part of the device teardown and setup, so it is during #5 and #6
> > (strictly, I think it is just after #5 and just before #6). [...]
> At least with the xm toolstack, the block scripts in
> /etc/xen/scripts/block-* are executed on the destination host (add) before
> the script (remove) is executed on the source host while live migrating a
> domU.
> (I've created the script /etc/xen/scripts/block-md, which assembles and
> stops RAID1 devices.)
> Next, I'll test the xl toolstack with this setup.

With Xen 4.1 there is no support for custom scripts within libxl.

With Xen 4.2.1 there is support for custom scripts (like with the xm
toolstack), but again only add/remove. And add is called on the destination
side before remove is called on the transmitting side while doing a live
migration of a domU.

Next I'll test the latest development version...

--
greetings eMHa
On Mon, 2013-02-11 at 14:59 +0000, Markus Hochholdinger wrote:
> Hello,
>
> On 06.02.2013 at 15:39, Markus Hochholdinger <Markus@hochholdinger.net>
> wrote:
> > On 01.02.2013 at 09:56, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> [..]
> > At least with the xm toolstack, the block scripts in
> > /etc/xen/scripts/block-* are executed on the destination host (add)
> > before the script (remove) is executed on the source host while live
> > migrating a domU.
> > (I've created the script /etc/xen/scripts/block-md, which assembles and
> > stops RAID1 devices.)
> > Next, I'll test the xl toolstack with this setup.
>
> With Xen 4.1 there is no support for custom scripts within libxl.
>
> With Xen 4.2.1 there is support for custom scripts (like with the xm
> toolstack), but again only add/remove. And add is called on the
> destination side before remove is called on the transmitting side while
> doing a live migration of a domU.

This sounds like a bug which ought to be addressed (Roger, can you take a
look?)

> Next I'll test the latest development version...

I'm not sure it will differ from 4.2.x in this area (yet). Roger can probably
advise better than me though.

Ian.
On 11/02/13 16:05, Ian Campbell wrote:
> On Mon, 2013-02-11 at 14:59 +0000, Markus Hochholdinger wrote:
>> Hello,
>>
>> On 06.02.2013 at 15:39, Markus Hochholdinger <Markus@hochholdinger.net>
>> wrote:
>>> On 01.02.2013 at 09:56, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> [..]
>>> At least with the xm toolstack, the block scripts in
>>> /etc/xen/scripts/block-* are executed on the destination host (add)
>>> before the script (remove) is executed on the source host while live
>>> migrating a domU.
>>> (I've created the script /etc/xen/scripts/block-md, which assembles and
>>> stops RAID1 devices.)
>>> Next, I'll test the xl toolstack with this setup.
>>
>> With Xen 4.1 there is no support for custom scripts within libxl.
>>
>> With Xen 4.2.1 there is support for custom scripts (like with the xm
>> toolstack), but again only add/remove. And add is called on the
>> destination side before remove is called on the transmitting side while
>> doing a live migration of a domU.

Yes, I've also realized this while working on the new hotplug
implementation. The hotplug script is executed on the destination before the
other end has executed the remove script (this is due to the fact that the
remove script is executed when the migrated domain is destroyed on the
source). So at a certain point the destination host has executed the "add"
script before the source host executes the "remove" hotplug script.

This is not a problem with the current in-tree hotplug scripts, because we
can guarantee that the device will not be accessed simultaneously (the guest
only resumes on either the source or the destination host, but never on
both).

So the scheme looks more like:

    1. First an empty shell domain (with no devices) is created on the
       target host.
    2. Then we copy the memory over, in several iterations, while the
       domain is running on the source host (iterations happen to handle
       the guest dirtying memory as we copy, this is the "live" aspect of
       the migration).
    3. After some iterations of live migration we pause the source guest.
    4. Set up devices on the target host for the incoming domain.
    5. Now we copy the remaining dirty RAM.
    6. Resume the guest on the target domain.
    7. Tear down devices on the source host.
    8. Guest reconnects to new backend.

(#7 and #8 can happen in a different order.)

#4 will be where the hotplug script "add" call happens on the target host,
and #7 is where the hotplug script "remove" call happens on the source host.

> This sounds like a bug which ought to be addressed (Roger, can you take
> a look?)

I think this is how migration works in both xl and xm, but if there are
hotplug scripts that cannot be executed simultaneously (i.e. you cannot make
two simultaneous calls to "add" without calling "remove" first) we could mark
it as a bug. It would make the resume on the source host more complicated,
since in case of failure we would have to remove the devices on the
destination host and reconnect them on the source host.

>> Next I'll test the latest development version...
>
> I'm not sure it will differ from 4.2.x in this area (yet). Roger can
> probably advise better than me though.

No, this has not changed in -unstable.
Hello,

On 11.02.2013 at 17:00, Roger Pau Monné <roger.pau@citrix.com> wrote:
> On 11/02/13 16:05, Ian Campbell wrote:
> > On Mon, 2013-02-11 at 14:59 +0000, Markus Hochholdinger wrote:
[..]
> >> With Xen 4.2.1 there is support for custom scripts (like with the xm
> >> toolstack), but again only add/remove. And add is called on the
> >> destination side before remove is called on the transmitting side while
> >> doing a live migration of a domU.
> Yes, I've also realized this while working on the new hotplug
> implementation. The hotplug script is executed on the destination before
> the other end has executed the remove script (this is due to the fact that
> the remove script is executed when the migrated domain is destroyed on the
> source). So at a certain point the destination host has executed the "add"
> script before the source host executes the "remove" hotplug script.

OK, so this is what I thought before. Many thanks for the clarification.

> This is not a problem with the current in-tree hotplug scripts, because we
> can guarantee that the device will not be accessed simultaneously (the
> guest only resumes on either the source or the destination host, but never
> on both).
> So the scheme looks more like:
>     1. First an empty shell domain (with no devices) is created on the
>        target host.
>     [...]
>     4. Set up devices on the target host for the incoming domain.
>     5. Now we copy the remaining dirty RAM.
>     6. Resume the guest on the target domain.
>     7. Tear down devices on the source host.
>     8. Guest reconnects to new backend.
> (#7 and #8 can happen in a different order.)
> #4 will be where the hotplug script "add" call happens on the target host,
> and #7 is where the hotplug script "remove" call happens on the source
> host.

As I understand it now, the (block) device on the destination host will be
read before the block device on the source is detached.

> > This sounds like a bug which ought to be addressed (Roger, can you take
> > a look?)
> I think this is how migration works in both xl and xm, but if there are
> hotplug scripts that cannot be executed simultaneously (i.e. you cannot
> make two simultaneous calls to "add" without calling "remove" first) we
> could mark it as a bug.

No, it was not a bug in the hotplug scripts; I made a hotplug script myself
to assemble Linux RAID1 devices and logged timestamps of the executions.

> It would make the resume on the source host more complicated, since in
> case of failure we would have to remove the devices on the destination
> host and reconnect them on the source host.

I understand.

> >> Next I'll test the latest development version...
> > I'm not sure it will differ from 4.2.x in this area (yet). Roger can
> > probably advise better than me though.
> No, this has not changed in -unstable.

OK. Many thanks.

--
greetings eMHa