We are trying to migrate a domain from one xen box to another
transparently, using drbd to keep the "freeze time" as short as
possible.
The method described here does that: the affected domain is
suspended 3 times (a few seconds each time), and never has to be
destroyed and recreated.
The following explanations assume you know about xen domain creation,
are familiar with "xm save" and "xm restore", and know what drbd is
(if you don't know exactly how to use it, refer to its documentation;
all operations described here are "basic" drbd functions).
Long story short: the migrated domain's root device is a symlink,
which gets changed inside an xm save/xm restore bracket, to switch from
a plain root device to a drbd device. BUT, we get strange kernel
messages on dom0 when running a drbd-backed domain ...
Phase A: some extra steps to take at domain creation
1) create a logical volume to host the domain rootfs (for instance,
/dev/vg/mydomain)
2) create a symlink to this domain rootfs (for instance,
/dev/xenlvs/mydomain -> /dev/vg/mydomain)
3) mkfs the logical volume and populate it with a filesystem
4) start the domain (the configuration should use xenlvs/mydomain as
block device, NOT vg/mydomain)
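
Steps 1 to 4 can be sketched as the following dry-run script. It is only
an illustration: the 4G size, the ext3 filesystem and the config file
path are assumptions that do not come from this post, and the script only
prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch of Phase A. Commands are printed, not executed,
# unless the script is started with DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
VG=vg                        # volume group, as in /dev/vg/mydomain
DOM=mydomain                 # domain and logical volume name
SIZE=4G                      # assumed size, not stated in the post
CFG=/etc/xen/mydomain.cfg    # hypothetical domain config file

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

phase_a() {
    run lvcreate -L "$SIZE" -n "$DOM" "$VG"         # step 1
    run mkdir -p /dev/xenlvs
    run ln -s "/dev/$VG/$DOM" "/dev/xenlvs/$DOM"    # step 2
    run mkfs.ext3 "/dev/$VG/$DOM"                   # step 3 (ext3 is an assumption)
    run xm create "$CFG"                            # step 4
}

phase_a
```

Populating the filesystem (step 3) is left out. The domain config would
then name the symlink, with a disk line along the lines of
'phy:xenlvs/mydomain,sda1,w' (the sda1 device name is an assumption).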
Phase B: substitute the real root block device with drbd
5) compile drbd module and utils on both xen boxes
6) xm save mydomain (this suspends the domain, saving its state)
7) set up a drbd pseudo-disk on the source box, using the domain logical
volume as data disk (and some external temporary storage as meta-data
disk); let's assume the drbd device is /dev/drbd0; then switch this
drbd device to "primary" operation (= source for synchronization, in
drbd terms)
8) change the symlink /dev/xenlvs/mydomain so it points to /dev/drbd0
9) xm restore mydomain (this will resume operation on the domain)
Steps 7 and 8 are very short, so this first save/restore should not be
noticeable.
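
The symlink switch of step 8 can be tried safely on scratch files. The
sketch below uses a temporary directory with two empty files standing in
for /dev/vg/mydomain and /dev/drbd0; the save file path shown in the
comments is hypothetical, and the drbd setup itself is left out because
its commands depend on the drbd version.

```shell
#!/bin/sh
# Demonstrates repointing the root-device symlink on stand-in files.
# On a real box the sequence would look like:
#   xm save mydomain /var/tmp/mydomain.save     (hypothetical path)
#   ... bring up /dev/drbd0 as primary (step 7) ...
#   ln -sfn /dev/drbd0 /dev/xenlvs/mydomain     (step 8)
#   xm restore /var/tmp/mydomain.save           (step 9)
tmp="$(mktemp -d)"
touch "$tmp/lv" "$tmp/drbd0"         # stand-ins for the LV and the drbd device
ln -s "$tmp/lv" "$tmp/mydomain"      # initial state, as created in step 2

# Step 8: replace the existing link so it points to the drbd device.
# -f overwrites the old link, -n avoids following it.
ln -sfn "$tmp/drbd0" "$tmp/mydomain"

target="$(readlink "$tmp/mydomain")"
echo "mydomain -> $target"
rm -rf "$tmp"
```

Because the domain configuration only ever names the symlink, nothing in
the saved domain state has to change when the link target does.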
Phase C: synchronize drbd state
10) on the target xen box, create the target logical volume
(/dev/vg/mydomaintoo for instance) - it should obviously have the same
size as the source logical volume
11) associate it with a drbd device and set up a symlink with the SAME
NAME as on the source box - e.g. /dev/xenlvs/mydomain -> /dev/vg/mydomaintoo
12) synchronize both drbd devices; wait for synchronization to
complete (this can take some time, but the domain still runs on the
source box, unaffected, in the meantime)
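
A dry-run sketch of steps 10 to 12 on the target box follows. The
resource name r0 and the 4G size are assumptions, the drbd resource is
presumed to be already described in the drbd configuration on both boxes,
and the exact drbdadm invocations may differ between drbd versions.

```shell
#!/bin/sh
# Dry-run sketch of Phase C, to be run on the target box.
# Commands are printed, not executed, unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
VG=vg
SRC_LV=mydomain       # name used on the source box (and for the symlink)
DST_LV=mydomaintoo    # target logical volume, as in step 10
SIZE=4G               # assumed; must match the source logical volume
RES=r0                # hypothetical drbd resource name

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

phase_c_target() {
    run lvcreate -L "$SIZE" -n "$DST_LV" "$VG"            # step 10
    run blockdev --getsize64 "/dev/$VG/$DST_LV"           # compare with the source size
    run mkdir -p /dev/xenlvs
    run ln -s "/dev/$VG/$DST_LV" "/dev/xenlvs/$SRC_LV"    # step 11: same link name
    run drbdadm up "$RES"                                 # step 11: attach and connect
    run cat /proc/drbd                                    # step 12: watch sync progress
}

phase_c_target
```

The symlink on the target points at the plain logical volume, not at the
drbd device, since drbd is torn down there before the domain is restored.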
Phase D: actual migration
13) when synchronization is complete, "xm save" the domain on the
source box
14) transfer the saved image (its size depends on the xen domain memory
size) to the target box
15) tear down drbd on the target box
16) "xm restore" the saved image on the target box
Transfer speed is limited by your network or disks; tearing down drbd is
very quick (a few seconds again); so those steps won't freeze your domain
for a long time!
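
Steps 13 to 16 can be sketched as a dry-run script driven from the source
box. The host name target-box, the save file path and the resource name
r0 are all hypothetical, and using scp and ssh for the transfer is just
one possible choice.

```shell
#!/bin/sh
# Dry-run sketch of Phase D, driven from the source box.
# Commands are printed, not executed, unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
DOM=mydomain
SAVE=/var/tmp/mydomain.save    # hypothetical save file path
TARGET=target-box              # hypothetical host name of the target box
RES=r0                         # hypothetical drbd resource name

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

phase_d() {
    run xm save "$DOM" "$SAVE"                # step 13: suspend on the source
    run scp "$SAVE" "$TARGET:$SAVE"           # step 14: copy the saved image
    run ssh "$TARGET" drbdadm down "$RES"     # step 15: tear down drbd on the target
    run ssh "$TARGET" xm restore "$SAVE"      # step 16: resume on the target
}

phase_d
```

The domain is frozen from the "xm save" until the "xm restore" returns,
so the copy in step 14 is the part worth timing beforehand.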
Phase E: cleanup
17) you can tear down the drbd on the source box (as well as the now
unused logical volume)
This works well, *but* when operating a xen domain over drbd, we see a
lot of "suspicious" messages in the xen0 domain (see attached log). Those
messages *seem* harmless, but they probably indicate that something is
going wrong under the hood. If someone is familiar with xen and drbd
internals and has a clue about what they mean, I will be very happy to
learn from them ...
All tests were made with xen-testing (bk source as of April 27), compiled
with gcc 3.3.4 (from Debian), on a dual Xeon system (running a "pure
32-bit" kernel).
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel