alice wan
2011-Feb-13 15:45 UTC
[Xen-devel] inconsistent metadata of vhd file while live migration
Hi all,

I have a concern about live migration: it may leave the VHD metadata inconsistent between two tapdisk2 processes.

Suppose a VM whose image is a VHD file migrates from host A to host B.

On host B, the toolstack first creates the devices, which includes starting the tapdisk2 process. At this point tapdisk2 reads some of the VHD file's metadata. Then xc_restore runs.

On host A, before the last iteration (the stop-and-copy phase) starts and while xc_save is still running, the VHD file keeps changing, including its metadata. So the tapdisk2 process on host B has not read the newest metadata of the VHD file.

When tapdisk2 starts, it reads the footer, header and BAT of the VHD file. The BAT structure in particular will cause problems if it is inconsistent.

Maybe my concern isn't a real problem, but I hope someone can clear it up for me. Thanks in advance.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Daniel Stodden
2011-Feb-13 21:11 UTC
Re: [Xen-devel] inconsistent metadata of vhd file while live migration
On Sun, 2011-02-13 at 10:45 -0500, alice wan wrote:

> hi all,
>
> i have some doubt about live migration which may cause inconsistent
> metadata of vhd file between two tapdisk2 process.
>
> given that vm migrates from host A to host B, which image is vhd
> file.
>
> in host B, it first creates devices including starting tapdisk2
> process, at this time, tapdisk2 will read some metadata of vhd file.
> then, it xc_restore
>
> in host A, before it start last iteration(stop-and-copy phase), while
> xc_save's going, vhd file has been changed including metadata. So, in
> hostB tapdisk2 process doesn't read the
>
> newest metadata of vhd file.
>
> for tapdisk2, when it starts, it will read footer, header, bat of vhd
> file. especially bat structure, if it's inconsistent, it'll cause
> problem.
>
> Maybe my doubt isn't a real problem, however, i hope someone to figure
> it out for me. thanks in advance.

If that's what's done right now in the toolchain, it's a real problem and needs to be fixed.

Options:

A. Avoid VBD lifetime overlap. This is how XCP presently does it.

   XCP has vdi.activate/deactivate operations in addition to attach/detach to control storage during migration.

   Attach/detach is the same as described above. It may be desired as the preferred transfer method on non-shared storage nodes to avoid latency in stop/copy.

   The simpler way is of course activate/deactivate semantics everywhere, which is mutually exclusive. This is needed for any indirectly mapped disk format (vhd, qcow? etc) on shared physical nodes.

   Note that this doesn't only matter for metadata. There are physical layers where exclusive login is preferred/mandatory, so you won't even get access to the device before pre-copy is done and the node could be released on A.

   Diagram:

   Node                    A                          B
   VM.migrate     .. pre-copy >  < stop-and-copy >  <resumed ...
   VDI.attached   ..------------A--------------->
                        <-----------B-------------------..
   VDI.active     -----------A---->         <----B-------..

B. Hack.
   Let the toolstack issue a tap-ctl pause/unpause cycle before resume. This will reopen the image.

C. Back then, in the dark ages, blktap did this implicitly.
   Every I/O request after disk create ran an implicit close/open cycle on the physical image.

Cheers,
Daniel
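Option B can be sketched as a small shell helper that a toolstack might call on host B just before resuming the guest. This is only a minimal sketch under stated assumptions, not what XCP or xend actually do: the function name tap_reopen and the TAP_CTL override variable are hypothetical, and the only things taken from the thread are the tap-ctl pause/unpause subcommands with their -p (tapdisk pid) and -m (minor) arguments.

```shell
#!/bin/sh
# Sketch of option B: make tapdisk2 close and reopen its image so that the
# VHD metadata (footer, header, BAT) is read again before the guest resumes.
# TAP_CTL is a hypothetical hook for substituting the tap-ctl binary;
# it defaults to the real tap-ctl.

tap_reopen() {
    pid="$1"      # tapdisk2 process id
    minor="$2"    # tap device minor number
    ctl="${TAP_CTL:-tap-ctl}"

    # pause closes the image; stop here if it fails.
    "$ctl" pause -p "$pid" -m "$minor" || return 1

    # unpause opens the image again, picking up the current metadata.
    "$ctl" unpause -p "$pid" -m "$minor"
}
```

A call such as `tap_reopen 7781 0` would then run the pause/unpause cycle for that tapdisk. The cycle belongs after host A has stopped writing to the image and before the guest is resumed on host B; where exactly that hook sits depends on the toolstack.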
Daniel Stodden
2011-Feb-13 21:13 UTC
Re: [Xen-devel] inconsistent metadata of vhd file while live migration
On Sun, 2011-02-13 at 16:11 -0500, Daniel Stodden wrote:

> B. Hack.
>    Let the toolstack issue a tap-ctl pause/unpause cycle before resume.
>    This will reopen the image.
>
> C. Back then, in the dark ages, blktap did this implicitly.
>    Every

*first*

>    I/O request after disk create run an implicit close/open
>    cycle.

:o)

D
a
niel
alice wan
2011-Feb-16 10:55 UTC
Re: [Xen-devel] inconsistent metadata of vhd file while live migration
Options B and C seem simpler and need less code for my version (xen4.0.0+2.6.31.13).

I'm not familiar with the blktap code. Could you please tell me in which function blktap ran the implicit close/open when processing the first I/O?

And is pause/unpause available in the latest stable version of blktap2?

Thanks

2011/2/14 Daniel Stodden <daniel.stodden@citrix.com>

> On Sun, 2011-02-13 at 16:11 -0500, Daniel Stodden wrote:
>
> > B. Hack.
> >    Let the toolstack issue a tap-ctl pause/unpause cycle before resume.
> >    This will reopen the image.
> >
> > C. Back then, in the dark ages, blktap did this implicitly.
> >    Every
>
> *first*
>
> >    I/O request after disk create run an implicit close/open
> >    cycle.
>
> :o)
>
> D
> a
> niel
Daniel Stodden
2011-Feb-16 21:11 UTC
Re: [Xen-devel] inconsistent metadata of vhd file while live migration
On Wed, 2011-02-16 at 05:55 -0500, alice wan wrote:

> option b, c seems simpler and needs less codes for my code
> version(xen4.0.0+2.6.31.13).

Example:

  [1]+ tail -f /var/log/daemon.log &

  root@vantst07:~# tap-ctl list
  7781 0 0 vhd /var/tmp/lenny.vhd
  Feb 16 13:04:00 vantst07 tapdisk2[7779]: received 'pid' message (uuid = 0)
  Feb 16 13:04:00 vantst07 tapdisk2[7779]: sending 'pid response' message (uuid = 0)
  Feb 16 13:04:00 vantst07 tapdisk2[7779]: received 'list' message (uuid = 65535)
  Feb 16 13:04:00 vantst07 tapdisk2[7779]: sending 'list response' message (uuid = 65535)
  Feb 16 13:04:00 vantst07 tapdisk2[7779]: sending 'list response' message (uuid = 65535)

  root@vantst07:~# tap-ctl pause -p 7781 -m 0
  Feb 16 13:04:12 vantst07 tapdisk2[7779]: received 'pause' message (uuid = 0)
  Feb 16 13:04:12 vantst07 tapdisk2[7779]: /var/tmp/lenny.vhd: b: 256, a: 256, f: 140, n: 1050624
  Feb 16 13:04:12 vantst07 tapdisk2[7779]: closed image /var/tmp/lenny.vhd (0 users, state: 0x00000000, type: 4)
  Feb 16 13:04:12 vantst07 tapdisk2[7779]: sending 'pause response' message (uuid = 0)

  root@vantst07:~# tap-ctl unpause -p 7781 -m 0
  Feb 16 13:04:20 vantst07 tapdisk2[7779]: received 'resume' message (uuid = 0)
  Feb 16 13:04:20 vantst07 tapdisk2[7779]: /var/tmp/lenny.vhd version: tap 0x00010003, b: 256, a: 256, f: 140, n: 1050624
  Feb 16 13:04:20 vantst07 tapdisk2[7779]: opened image /var/tmp/lenny.vhd (1 users, state: 0x00000001, type: 4)
  Feb 16 13:04:20 vantst07 tapdisk2[7779]: VBD CHAIN:
  Feb 16 13:04:20 vantst07 tapdisk2[7779]: /var/tmp/lenny.vhd: 4
  Feb 16 13:04:20 vantst07 tapdisk2[7779]: sending 'resume response' message (uuid = 0)

> i'm not familiar with blktap code. would you please tell in which
> function blktap run an implicit close/open when process first io?

I think those lines never made it into tools/blktap. XCP's srpm should still have those patches, but they're already removed post-5.6fp1, so I'd recommend going for b. and letting c. fade out.

The toolstack should stay in control, rather than the disk trying to paper over mistaken assumptions.

> and in latest stable version blktap2 pause/unpause is available ?

Yup.

Daniel

> thanks
>
> 2011/2/14 Daniel Stodden <daniel.stodden@citrix.com>
> > On Sun, 2011-02-13 at 16:11 -0500, Daniel Stodden wrote:
> >
> > > B. Hack.
> > >    Let the toolstack issue a tap-ctl pause/unpause cycle before resume.
> > >    This will reopen the image.
> > >
> > > C. Back then, in the dark ages, blktap did this implicitly.
> > >    Every
> >
> > *first*
> >
> > >    I/O request after disk create run an implicit close/open
> > >    cycle.
> >
> > :o)
> >
> > D
> > a
> > niel
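The pid and minor passed to pause/unpause in the example were read off the `tap-ctl list` line by hand. A toolstack script could look them up by image path instead. The following is a rough sketch that assumes the five-column list output shown in the example above (the first two columns being the values later passed to -p and -m, the last being the image path); other blktap versions may print a different format. The function name tap_find and the TAP_CTL override variable are hypothetical.

```shell
#!/bin/sh
# Sketch: print "<pid> <minor>" for the tapdisk serving a given image path.
# Assumes `tap-ctl list` prints one line per tapdisk in the form
#   <pid> <minor> <state> <type> <path>
# as in the example session; paths containing whitespace are not handled.
# TAP_CTL is a hypothetical hook for substituting the tap-ctl binary.

tap_find() {
    image="$1"
    "${TAP_CTL:-tap-ctl}" list |
        awk -v img="$image" '$5 == img { print $1, $2 }'
}
```

With the session above, `tap_find /var/tmp/lenny.vhd` would be expected to print the pid and minor used in the pause and unpause commands, provided the output format matches the assumption.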