Claris Castillo
2006-Jul-14 05:32 UTC
[Xen-users] "xm save" hanging when saving domain in "pause" state
Is there any known issue with saving a domain that is in the paused state? Xen hangs whenever I try to save (with the xm save command) the state of a machine that has been paused (with the xm pause command). I have been looking at the log files etc. but I am not able to spot the problem. BTW, xm save works perfectly fine if the domain is running or blocked. The problem occurs only when the domain is paused. I am attaching the xend.log file below in case it helps.

Any help is highly appreciated!

--------------------------------------
[2006-07-14 00:50:36 xend] DEBUG (DevController:105) DevController: writing {'domain': 'ecdvm1', 'frontend': '/local/domain/2/device/vbd/2049', 'dev': 'sda1', 'state': '1', 'params': 'planetlab/ecdvm1', 'mode': 'w', 'frontend-id': '2', 'type': 'phy'} to /local/domain/0/backend/vbd/2/2049.
[2006-07-14 00:50:36 xend.XendDomainInfo] DEBUG (XendDomainInfo:698) Storing domain details: {'console/port': '2', 'name': 'ecdvm1', 'console/limit': '1048576', 'vm': '/vm/dc2dd73a-6e23-b0d2-050b-be94ac50515c', 'domid': '2', 'cpu/0/availability': 'online', 'memory/target': '131072', 'store/port': '1'}
[2006-07-14 00:50:36 xend] DEBUG (balloon:130) Balloon: free 129; need 137.
[2006-07-14 00:50:36 xend] DEBUG (balloon:139) Balloon: setting dom0 target to 3410.
[2006-07-14 00:50:36 xend.XendDomainInfo] DEBUG (XendDomainInfo:950) Setting memory target of domain Domain-0 (0) to 3410 MiB.
[2006-07-14 00:50:37 xend] DEBUG (balloon:126) Balloon: free 137; need 137; done.
[2006-07-14 00:50:37 xend] DEBUG (XendCheckpoint:149) [xc_restore]: /usr/lib/xen/bin/xc_restore 10 18 2 34816 1 2
[2006-07-14 00:50:37 xend] ERROR (XendCheckpoint:231) xc_linux_restore start: max_pfn = 8800
[2006-07-14 00:50:37 xend] ERROR (XendCheckpoint:231) Increased domain reservation by 22000 KB
[2006-07-14 00:50:37 xend] ERROR (XendCheckpoint:231) Reloading memory pages: 0%
[2006-07-14 00:50:37 xend] ERROR (XendCheckpoint:231) Received all pages (0 races)
[2006-07-14 00:50:37 xend] ERROR (XendCheckpoint:231) ^H^H^H^H100%
[2006-07-14 00:50:37 xend] ERROR (XendCheckpoint:231) Memory reloaded.
[2006-07-14 00:50:37 xend] ERROR (XendCheckpoint:231) Decreased reservation by 2048 pages
[2006-07-14 00:50:37 xend] ERROR (XendCheckpoint:231) Domain ready to be built.
[2006-07-14 00:50:37 xend] ERROR (XendCheckpoint:231) Restore exit with rc=0
[2006-07-14 00:50:37 xend] DEBUG (XendCheckpoint:204) store-mfn 303115
[2006-07-14 00:50:37 xend] DEBUG (XendCheckpoint:204) console-mfn 303114
[2006-07-14 00:50:37 xend.XendDomainInfo] DEBUG (XendDomainInfo:650) XendDomainInfo.completeRestore
[2006-07-14 00:50:37 xend.XendDomainInfo] DEBUG (XendDomainInfo:698) Storing domain details: {'console/ring-ref': '303114', 'console/port': '2', 'name': 'ecdvm1', 'console/limit': '1048576', 'vm': '/vm/dc2dd73a-6e23-b0d2-050b-be94ac50515c', 'domid': '2', 'cpu/0/availability': 'online', 'memory/target': '131072', 'store/ring-ref': '303115', 'store/port': '1'}
[2006-07-14 00:50:37 xend.XendDomainInfo] DEBUG (XendDomainInfo:660) XendDomainInfo.completeRestore done
[2006-07-14 00:50:37 xend.XendDomainInfo] DEBUG (XendDomainInfo:882) XendDomainInfo.handleShutdownWatch
[2006-07-14 00:50:54 xend] INFO (XendDomain:376) Domain ecdvm1 (2) paused.
[2006-07-14 00:51:04 xend] DEBUG (XendCheckpoint:81) [xc_save]: /usr/lib/xen/bin/xc_save 10 18 2 0 0 0
[2006-07-14 00:51:04 xend] DEBUG (XendCheckpoint:204) suspend
[2006-07-14 00:51:04 xend] DEBUG (XendCheckpoint:84) In saveInputHandler suspend
[2006-07-14 00:51:04 xend] DEBUG (XendCheckpoint:86) Suspending 2 ...
[2006-07-14 00:51:04 xend.XendDomainInfo] DEBUG (XendDomainInfo:882) XendDomainInfo.handleShutdownWatch
~
----------------------------
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
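When a log like the one above is hard to scan, it can help to split it into structured entries and look at what the checkpoint code logged last. The following is only a minimal sketch, assuming every entry has the "[timestamp logger] LEVEL (module:line) message" shape seen in this xend.log excerpt. The names parse_entries and last_from are illustrative and are not part of Xen.

```python
import re

# One xend.log entry: "[<date> <time> <logger>] <LEVEL> (<module>:<line>) <message>".
# The message runs up to the next "[YYYY-MM-DD " timestamp or the end of the text,
# so this works whether entries are on separate lines or run together.
ENTRY = re.compile(
    r"\[(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<logger>[\w.]+)\] "
    r"(?P<level>[A-Z]+) \((?P<module>\w+):(?P<line>\d+)\) "
    r"(?P<message>.*?)(?=\s*\[\d{4}-\d{2}-\d{2} |\s*$)",
    re.S,
)


def parse_entries(text):
    """Return one dict per log entry found in text."""
    return [m.groupdict() for m in ENTRY.finditer(text)]


def last_from(entries, module):
    """Return the last entry logged by the given module, or None."""
    matching = [e for e in entries if e["module"] == module]
    return matching[-1] if matching else None
```

Applied to the log above, the last XendCheckpoint entry is "Suspending 2 ...", and no later XendCheckpoint entry follows it.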
Michael Vrable
2006-Jul-14 22:37 UTC
Re: [Xen-users] "xm save" hanging when saving domain in "pause" state
On Thu, Jul 13, 2006 at 10:32:47PM -0700, Claris Castillo wrote:
> Is there any known issue on saving a domain which is in pause state? Xen hangs whenever I try to save (by means of xm save command) the state of a machine which has been paused (by means of xm pause command). I have been looking at the log files etc but I am not able to spot the problem. BTW, xm save works perfectly fine if the domain is running or blocked. The problem is when the domain is in pause.

Yes, this is by design. Save/restore is a cooperative process:
  1. Xend notifies the domain that it will be saved
  2. The domain disconnects from devices, places itself into a quiescent state, and notifies Xen
  3. After receiving this notification, xend saves the domain's CPU state and memory to a file

Pausing a domain will prevent any progress from being made on step 2 (for the obvious reason--the domain can't run), so saving will hang.

--Michael Vrable
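The three steps above can be illustrated with a toy model. This is only a sketch of the handshake as described in this message, not Xen's actual code: the Guest class, its fields, and the save function are invented for illustration, and the bounded loop stands in for a wait that the real tools apparently perform without a timeout.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    """Toy stand-in for a guest domain; not Xen's real data structures."""
    paused: bool = False
    suspend_requested: bool = False
    suspended: bool = False

    def run_one_timeslice(self) -> None:
        # A paused guest is never scheduled, so it cannot react to anything.
        if self.paused:
            return
        if self.suspend_requested:
            # Step 2: disconnect devices, quiesce, and acknowledge.
            self.suspended = True


def save(guest: Guest, max_timeslices: int = 5) -> str:
    # Step 1: the toolstack asks the guest to suspend itself.
    guest.suspend_requested = True
    # The bound below only keeps this toy finite; it models an open-ended wait.
    for _ in range(max_timeslices):
        guest.run_one_timeslice()
        if guest.suspended:
            # Step 3: only now is it safe to write CPU state and memory.
            return "saved"
    return "hung"
```

In this model a running guest acknowledges the request in its first timeslice, while a paused guest never does, so the save never reaches step 3.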
Claris Castillo
2006-Jul-14 23:08 UTC
Re: [Xen-users] "xm save" hanging when saving domain in "pause" state
Thanks for your reply, Michael. This is bad news... I am actually looking at options to do offline migration, that is, take snapshots of the VMs' state and their corresponding filesystems in order to be able to fire them up on a different host if for some reason the original host they are assigned to crashes at some point. I have found some threads with *very* similar questions on some mailing lists; unfortunately they don't extend to more than two entries and have more questions than answers.

OK. A design decision? As far as I understand, by "pausing" a domain, Xen is basically telling the scheduler not to give any further time slice to that particular VM from that point in time (until the VM is unpaused). Why can't dom0 just bypass the scheduler, indicate to the VM that it must disconnect its devices (basically doing an enhanced version of "xm unpause") and put itself in a quiescent state, and wait to be notified by the VM (step 2 in your email)? What is wrong with such an approach? Am I missing something? Would this not enable a clean checkpointing procedure?

Thanks
cc

PS Yes, I am aware that checkpointing is on the roadmap. Unfortunately it has a priority level of 2 and I can't wait that long :(

On 7/14/06, Michael Vrable <mvrable@cs.ucsd.edu> wrote:
> On Thu, Jul 13, 2006 at 10:32:47PM -0700, Claris Castillo wrote:
> > Is there any known issue on saving a domain which is in pause state? Xen hangs whenever I try to save (by means of xm save command) the state of a machine which has been paused (by means of xm pause command). I have been looking at the log files etc but I am not able to spot the problem. BTW, xm save works perfectly fine if the domain is running or blocked. The problem is when the domain is in pause.
>
> Yes, this is by design. Save/restore is a cooperative process:
>   1. Xend notifies the domain that it will be saved
>   2. The domain disconnects from devices, places itself into a quiescent state, and notifies Xen
>   3. After receiving this notification, xend saves the domain's CPU state and memory to a file
>
> Pausing a domain will prevent any progress from being made on step 2 (for the obvious reason--the domain can't run), so saving will hang.
>
> --Michael Vrable

--
Claris Castillo
http://www4.ncsu.edu/~ccastil
PhD. Candidate
Computer Science
North Carolina State University
Raleigh, NC
Petersson, Mats
2006-Jul-14 23:17 UTC
RE: [Xen-users] "xm save" hanging when saving domain in "pause" state
________________________________
From: xen-users-bounces@lists.xensource.com [mailto:xen-users-bounces@lists.xensource.com] On Behalf Of Claris Castillo
Sent: 15 July 2006 00:12
To: xen-users@lists.xensource.com
Subject: Re: [Xen-users] "xm save" hanging when saving domain in "pause" state

> Thanks for your reply, Michael. This is bad news... I am actually looking at options to do offline migration, that is, take snapshots of the VMs' state and their corresponding filesystems in order to be able to fire them up on a different host if for some reason the original host they are assigned to crashes at some point. I have found some threads with *very* similar questions on some mailing lists; unfortunately they don't extend to more than two entries and have more questions than answers.
>
> OK. A design decision? As far as I understand, by "pausing" a domain, Xen is basically telling the scheduler not to give any further time slice to that particular VM from that point in time (until the VM is unpaused). Why can't dom0 just bypass the scheduler, indicate to the VM that it must disconnect its devices (basically doing an enhanced version of "xm unpause") and put itself in a quiescent state, and wait to be notified by the VM (step 2 in your email)? What is wrong with such an approach? Am I missing something? Would this not enable a clean checkpointing procedure?
>
> Thanks
> cc

cc, I think you could do exactly what you want with JUST xm save; it automatically "pauses" the domain. Is there any particular reason you CAN'T do that? Why do you need to do xm pause before xm save?

--
Mats

> On 7/14/06, Michael Vrable <mvrable@cs.ucsd.edu> wrote:
> > On Thu, Jul 13, 2006 at 10:32:47PM -0700, Claris Castillo wrote:
> > > Is there any known issue on saving a domain which is in pause state? Xen hangs whenever I try to save (by means of xm save command) the state of a machine which has been paused (by means of xm pause command). I have been looking at the log files etc but I am not able to spot the problem. BTW, xm save works perfectly fine if the domain is running or blocked. The problem is when the domain is in pause.
> >
> > Yes, this is by design. Save/restore is a cooperative process:
> >   1. Xend notifies the domain that it will be saved
> >   2. The domain disconnects from devices, places itself into a quiescent state, and notifies Xen
> >   3. After receiving this notification, xend saves the domain's CPU state and memory to a file
> >
> > Pausing a domain will prevent any progress from being made on step 2 (for the obvious reason--the domain can't run), so saving will hang.
> >
> > --Michael Vrable
>
> --
> Claris Castillo
> http://www4.ncsu.edu/~ccastil
> PhD. Candidate
> Computer Science
> North Carolina State University
> Raleigh, NC
Thorolf Godawa
2006-Jul-14 23:55 UTC
Re: [Xen-users] "xm save" hanging when saving domain in "pause" state
Hi,

> looking at options to do offline migration, that is, take snapshots of the VMs state and their corresponding filesystems in order to be able to fire them up in a different host if for some reason the original host they are assigned to

well, this can be done quite easily: save the domain, then make a copy of the chk-file and of the current disk image. Now move both to the new system and do the restore. The only important thing is that the chk-file and the image file correspond to each other; otherwise the system will crash at restore or, even worse, the data will be corrupted!

--
Chau y hasta luego,

Thorolf
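The save, copy, and restore sequence described above could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the function name and all paths and host names are hypothetical, the disk is assumed to be a plain image file that can be copied with scp, and the target host is assumed to use the same paths as the source. The function only builds the command lists and runs nothing.

```python
def offline_migration_commands(domain, checkpoint, disk_image, target_host):
    """Return the commands, in order, for a save/copy/restore migration.

    Nothing is executed here; each command is an argv list that could be
    passed to subprocess.run(cmd, check=True) on the source host.
    """
    return [
        # 1. Save the domain; this stops it and writes its state to the chk-file.
        ["xm", "save", domain, checkpoint],
        # 2. Copy the chk-file and the matching disk image to the same paths
        #    on the target host (assumption: same directory layout there).
        ["scp", checkpoint, f"{target_host}:{checkpoint}"],
        ["scp", disk_image, f"{target_host}:{disk_image}"],
        # 3. Restore on the target host from the copied chk-file.
        ["ssh", target_host, "xm", "restore", checkpoint],
    ]
```

Keeping the two copies next to each other in one sequence is the point: the chk-file and the image are taken after the same save, so they belong together.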
Tomas Kouba
2006-Jul-18 11:24 UTC
Re: [Xen-users] "xm save" hanging when saving domain in "pause" state
Petersson, Mats wrote:
> ------------------------------------------------------------------------
> *From:* xen-users-bounces@lists.xensource.com [mailto:xen-users-bounces@lists.xensource.com] *On Behalf Of* Claris Castillo
> *Sent:* 15 July 2006 00:12
> *To:* xen-users@lists.xensource.com
> *Subject:* Re: [Xen-users] "xm save" hanging when saving domain in "pause" state
>
> > Thanks for your reply, Michael. This is bad news... I am actually looking at options to do offline migration, that is, take snapshots of the VMs' state and their corresponding filesystems in order to be able to fire them up on a different host if for some reason the original host they are assigned to crashes at some point. I have found some threads with *very* similar questions on some mailing lists; unfortunately they don't extend to more than two entries and have more questions than answers.
> >
> > OK. A design decision? As far as I understand, by "pausing" a domain, Xen is basically telling the scheduler not to give any further time slice to that particular VM from that point in time (until the VM is unpaused). Why can't dom0 just bypass the scheduler, indicate to the VM that it must disconnect its devices (basically doing an enhanced version of "xm unpause") and put itself in a quiescent state, and wait to be notified by the VM (step 2 in your email)? What is wrong with such an approach? Am I missing something? Would this not enable a clean checkpointing procedure?
> >
> > Thanks
> > cc
>
> cc, I think you could do exactly what you want with JUST xm save; it automatically "pauses" the domain. Is there any particular reason you CAN'T do that? Why do you need to do xm pause before xm save?

I think this is because, while the domain is paused, you can take an LVM snapshot, and so you can be sure that the LVM snapshot and the saved VM are from the same moment. From my point of view, the approach you suggest cannot ensure that. Or am I missing something?

--
Tomas Kouba
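One possible ordering that addresses this concern, offered only as an untested sketch and not as something proposed in the thread: run xm save first, take the LVM snapshot once the save has returned and the domain is no longer running, then restore the domain from the chk-file. The assumptions are that the guest's disk is an LVM logical volume (the log's 'planetlab/ecdvm1' suggests one, but that is a guess), that nothing else writes to the volume between the save and the snapshot, and that a short downtime is acceptable. The function name, volume path, and snapshot size are hypothetical; the function only builds the command lists.

```python
def consistent_checkpoint_commands(domain, checkpoint, volume, snapshot_name,
                                   snapshot_size="1G"):
    """Return commands that pair a saved domain with a disk snapshot.

    Nothing is executed here; each entry is an argv list meant to be run
    in order, stopping at the first failure.
    """
    return [
        # 1. Save the domain. Once this returns the guest is no longer
        #    running, so it cannot write to its disk.
        ["xm", "save", domain, checkpoint],
        # 2. Snapshot the guest's logical volume while the guest is stopped.
        ["lvcreate", "--snapshot", "--size", snapshot_size,
         "--name", snapshot_name, volume],
        # 3. Bring the guest back from the checkpoint just written.
        ["xm", "restore", checkpoint],
    ]
```

The snapshot is taken strictly between the save and the restore, which is what ties it to the same moment as the chk-file.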