A user (Joshua) is reporting that 'xm restore' isn't working when GPLPV is involved. I've checked the logs generated by GPLPV and there are no problems on the save side of things that I can see. Is there anything extra that the suspend or restore needs to do since 3.4.x?

Joshua has captured the following:

On the dom0 I initiated an "xm save" of the VM. No problems here, but when I initiate an "xm restore", I receive the following error:

Error: /usr/lib64/xen/bin/xc_restore 56 103 2 3 1 1 1 failed

And in /var/log/xen/xend.log, I see (pertaining to this event):

[2009-08-02 15:12:44 4839] INFO (image:745) Need to create platform device. [domid:103]
[2009-08-02 15:12:44 4839] DEBUG (XendCheckpoint:261) restore: shadow=0x9, _static_max=0x40000000, _static_min=0x0
[2009-08-02 15:12:44 4839] DEBUG (balloon:166) Balloon: 31589116 KiB free; need 1061888; done.
[2009-08-02 15:12:44 4839] DEBUG (XendCheckpoint:278) [xc_restore]: /usr/lib64/xen/bin/xc_restore 56 103 2 3 1 1 1
[2009-08-02 15:12:44 4839] INFO (XendCheckpoint:417) xc_domain_restore start: p2m_size = 100000
[2009-08-02 15:12:44 4839] INFO (XendCheckpoint:417) Reloading memory pages: 0%
[2009-08-02 15:12:52 4839] INFO (XendCheckpoint:417) Failed allocation for dom 103: 1024 extents of order 0
[2009-08-02 15:12:52 4839] INFO (XendCheckpoint:417) ERROR Internal error: Failed to allocate memory for batch.!
[2009-08-02 15:12:52 4839] INFO (XendCheckpoint:417)
[2009-08-02 15:12:52 4839] INFO (XendCheckpoint:417) Restore exit with rc=1
[2009-08-02 15:12:52 4839] DEBUG (XendDomainInfo:2724) XendDomainInfo.destroy: domid=103
[2009-08-02 15:12:52 4839] ERROR (XendDomainInfo:2738) XendDomainInfo.destroy: domain destruction failed.
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2731, in destroy
    xc.domain_pause(self.domid)
Error: (3, 'No such process')
[2009-08-02 15:12:52 4839] DEBUG (XendDomainInfo:2204) No device model
[2009-08-02 15:12:52 4839] DEBUG (XendDomainInfo:2206) Releasing devices
[2009-08-02 15:12:52 4839] DEBUG (XendDomainInfo:2219) Removing vbd/768
[2009-08-02 15:12:52 4839] DEBUG (XendDomainInfo:1134) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/768
[2009-08-02 15:12:52 4839] DEBUG (XendDomainInfo:2219) Removing vfb/0
[2009-08-02 15:12:52 4839] DEBUG (XendDomainInfo:1134) XendDomainInfo.destroyDevice: deviceClass = vfb, device = vfb/0
[2009-08-02 15:12:52 4839] DEBUG (XendDomainInfo:2219) Removing console/0
[2009-08-02 15:12:52 4839] DEBUG (XendDomainInfo:1134) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
[2009-08-02 15:12:52 4839] ERROR (XendDomain:1149) Restore failed
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomain.py", line 1147, in domain_restore_fd
    return XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 282, in restore
    forkHelper(cmd, fd, handler.handler, True)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 405, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib64/xen/bin/xc_restore 56 103 2 3 1 1 1 failed
It seems that somewhere along the line Xen started using an event channel to trigger a suspend, as opposed to the 'shutdown' xenstore value. Is there anything else there I need to know about?

Thanks

James

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of James Harper
> Sent: Tuesday, 4 August 2009 11:23
> To: xen-devel@lists.xensource.com
> Cc: Joshua West
> Subject: [Xen-devel] Error restoring DomU when using GPLPV
>
> A user (Joshua) is reporting that 'xm restore' isn't working when GPLPV
> is involved. I've checked the logs generated by GPLPV and there are no
> problems on the save side of things that I can see. Is there anything
> extra that the suspend or restore needs to do since 3.4.x?
>
> [...]
> It seems that somewhere along the line Xen started using an event
> channel to trigger a suspend, as opposed to the 'shutdown' xenstore
> value. Is there anything else there I need to know about?

Actually that seems to be unrelated to the problem (I found this out after adding suspend evtchn support to gplpv...)

The actual error is that the call to xc_memory_op passes 33 as nr_extents, but the return value is 32, which is an error condition. Is it not counting an already allocated page in the PVonHVM case or something?

XendCheckpoint:417 calls xc_domain_restore which calls xc_domain_memory_increase_reservation.

Does XENMEM_increase_reservation mean "increase reservation _by_ X" or "increase reservation _to_ X"?

James
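For reference, the argument structure behind these memory_op calls looks roughly like this (paraphrased from xen/include/public/memory.h of the 3.x era; exact field names may vary slightly between releases). As far as I know the semantics are "increase by": nr_extents new extents are allocated, and the return value is the number actually allocated, which is why 32 back against nr_extents == 33 is an error.

/* Sketch based on xen/include/public/memory.h (3.x era; field names
 * may vary slightly by release).  XENMEM_increase_reservation and
 * XENMEM_populate_physmap allocate nr_extents *new* extents for the
 * domain -- "increase by", not "increase to" -- and return how many
 * extents they actually managed to allocate. */
struct xen_memory_reservation {
    XEN_GUEST_HANDLE(xen_pfn_t) extent_start; /* in/out: GPFNs/MFNs     */
    xen_ulong_t  nr_extents;    /* in: how many extents to allocate     */
    unsigned int extent_order;  /* in: 0 for 4kB pages, 9 for 2MB, ...  */
    unsigned int mem_flags;     /* in: XENMEMF_* flags                  */
    domid_t      domid;         /* in: target domain                    */
};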
> > It seems that somewhere along the line Xen started using an event
> > channel to trigger a suspend, as opposed to the 'shutdown' xenstore
> > value. Is there anything else there I need to know about?
>
> Actually that seems to be unrelated to the problem (I found this out
> after adding suspend evtchn support to gplpv...)
>
> The actual error is that the call to xc_memory_op passes 33 as
> nr_extents, but the return value is 32, which is an error condition.
> Is it not counting an already allocated page in the PVonHVM case or
> something?
>
> XendCheckpoint:417 calls xc_domain_restore which calls
> xc_domain_memory_increase_reservation.
>
> Does XENMEM_increase_reservation mean "increase reservation _by_ X" or
> "increase reservation _to_ X"?

Actually I was looking at the wrong thing. The error is actually in xc_domain_memory_populate_physmap.

James
When the DomU is running, 'xm debug q' looks like:

(XEN) General information for domain 23:
(XEN)     refcnt=3 dying=0 nr_pages=197611 xenheap_pages=33 dirty_cpus={1} max_pages=197632

During restore, it looks like this:

(XEN) General information for domain 22:
(XEN)     refcnt=3 dying=0 nr_pages=196576 xenheap_pages=5 dirty_cpus={} max_pages=197632

(last capture before the domain ran out of pages)

I added some debugging to libxc and it allocates bunches of 1024 pages over and over quite successfully, but then fails at 33. Presumably something is counting pages incorrectly somewhere?

James
On 04/08/2009 08:58, "James Harper" <james.harper@bendigoit.com.au> wrote:

> When the DomU is running, 'xm debug q' looks like:
>
> (XEN) General information for domain 23:
> (XEN)     refcnt=3 dying=0 nr_pages=197611 xenheap_pages=33 dirty_cpus={1} max_pages=197632
>
> During restore, it looks like this:
>
> (XEN) General information for domain 22:
> (XEN)     refcnt=3 dying=0 nr_pages=196576 xenheap_pages=5 dirty_cpus={} max_pages=197632

Is the host simply out of memory? If dom22 above has 196576 pages and max_pages=197632 then an allocation of 33 order-0 extents should not fail due to over-commitment to the guest. The only reason for such a failure is inadequate memory available in the host free pools.

Perhaps xend auto-ballooning is involved? I'd turn it off if so, as it blows. It could have freed up one-page-too-few or somesuch.

 -- Keir
> On 04/08/2009 08:58, "James Harper" <james.harper@bendigoit.com.au> wrote:
> [...]
> Is the host simply out of memory?

No. There is 5G of physical memory free and only 768MB assigned to the DomU. I can start the guest again, I just can't restore it.

> If dom22 above has 196576 pages and
> max_pages=197632 then an allocation of 33 order-0 extents should not fail
> due to over-commitment to the guest.

196576 is just where it happened to be when I took the last 'xm debug q', before 'xm restore' failed and deleted it. The allocation of '33' returns '32', so it does appear to be an off-by-one error.

> The only reason for such a failure is
> inadequate memory available in the host free pools. Perhaps xend
> auto-ballooning is involved? I'd turn it off if so, as it blows. It could
> have freed up one-page-too-few or somesuch.

I assume that what happens is that the memory continues to grow until it hits max_pages, for some reason. Is there a way to tell 'xm restore' not to delete the domain when the restore fails, so I can see if nr_pages really does equal max_pages at the time that it dies?

The curious thing is that this only happens when GPLPV is running. A PV domU or a pure HVM DomU doesn't have this problem (presumably that would have been noticed during regression testing). It would be interesting to try a PVonHVM Linux DomU and see how that goes... hopefully someone who is having the problem with GPLPV also has PVonHVM domains they could test.

James
> Is the host simply out of memory? If dom22 above has 196576 pages and
> max_pages=197632 then an allocation of 33 order-0 extents should not fail
> due to over-commitment to the guest.

I added some more debugging...

batch 1024           [1]
Allocating 1024 mfns [2]
197600 allocated     [3]
batch 1024
Allocating 33 mfns
Failed allocation for dom 24: 33 extents of order 0 (err = 32) [4]

[1] is just after 'j' is read in xc_domain_restore
[2] is just before the call to populate_physmap
[3] is just after the call to populate_physmap
[4] is the error message in the memory_op function in libxc, modified to give the value of err

According to that, a total of 197632 pages are being allocated and the last page cannot be (more pages could be required the next time around the loop too...)

James
On 04/08/2009 10:01, "James Harper" <james.harper@bendigoit.com.au> wrote:

> I assume that what happens is that the memory continues to grow until it
> hits max_pages, for some reason. Is there a way to tell 'xm restore'
> not to delete the domain when the restore fails, so I can see if nr_pages
> really does equal max_pages at the time that it dies?
>
> The curious thing is that this only happens when GPLPV is running. A PV
> domU or a pure HVM DomU doesn't have this problem (presumably that would
> have been noticed during regression testing). It would be interesting to
> try a PVonHVM Linux DomU and see how that goes... hopefully someone who is
> having the problem with GPLPV also has PVonHVM domains they could test.

Okay, also this is a normal save/restore (no live migration of pages)?

Could the grant-table/shinfo Xenheap pages be confusing matters, I wonder. The save process may save those pages out - since dom0 can map them it will also save them - and then they get mistakenly restored as domheap pages at the far end. All would work out okay in the end when you remap those special pages during GPLPV restore, as the domheap pages would get implicitly freed. But maybe there is no allocation headroom for the guest in the meantime, so the restore fails.

Just a theory. Maybe you could try unmapping grant/shinfo pages in the suspend callback? This may not help for live migration though, where pages get transmitted before the callback. It may be necessary to allow dom0 to specify 'map me a page but not if it's special' and plumb that up to xc_domain_save. It'd be good to have the theory proved first.

Cheers,
Keir
> Just a theory. Maybe you could try unmapping grant/shinfo pages in the
> suspend callback? This may not help for live migration though, where pages
> get transmitted before the callback. It may be necessary to allow dom0 to
> specify 'map me a page but not if it's special' and plumb that up to
> xc_domain_save. It'd be good to have the theory proved first.

Can do. What is the opposite of 'XENMEM_add_to_physmap', which I assume I'll need to unmap the grant pages? Is it XENMEM_decrease_reservation?

James
On 04/08/2009 10:34, "James Harper" <james.harper@bendigoit.com.au> wrote:

>> Just a theory. Maybe you could try unmapping grant/shinfo pages in the
>> suspend callback? This may not help for live migration though, where pages
>> get transmitted before the callback. It may be necessary to allow dom0 to
>> specify 'map me a page but not if it's special' and plumb that up to
>> xc_domain_save. It'd be good to have the theory proved first.
>
> Can do. What is the opposite of 'XENMEM_add_to_physmap', which I assume
> I'll need to unmap the grant pages? Is it XENMEM_decrease_reservation?

Oh yes, there is no direct opposite of add_to_physmap... But I think decrease_reservation will work okay in this case, fortunately.

 -- Keir
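In concrete terms, the suggestion amounts to something like the following in the guest driver's suspend path. This is a minimal sketch, not actual GPLPV code: it assumes the usual HYPERVISOR_memory_op wrapper and guest-handle macros are available to the driver, and gnttab_gpfns/nr_frames are hypothetical names for wherever the driver recorded the gpfns it earlier passed to XENMEM_add_to_physmap.

/* Minimal sketch, assuming HYPERVISOR_memory_op and the standard
 * public headers: hand the gnttab (and shinfo) frames back with
 * XENMEM_decrease_reservation during the suspend callback, so the
 * save side no longer sees them mapped in the guest physmap. */
static long unmap_xenheap_pages_on_suspend(xen_pfn_t *gnttab_gpfns,
                                           unsigned int nr_frames)
{
    struct xen_memory_reservation reservation = {
        .nr_extents   = nr_frames,
        .extent_order = 0,          /* order-0 (4kB) extents */
        .domid        = DOMID_SELF,
    };

    set_xen_guest_handle(reservation.extent_start, gnttab_gpfns);

    /* Returns the number of extents actually released; anything less
     * than nr_frames is worth logging, though the domain is about to
     * be suspended anyway. */
    return HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
}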
> Could the grant-table/shinfo Xenheap pages be confusing matters, I wonder.
> The save process may save those pages out - since dom0 can map them it will
> also save them - and then they get mistakenly restored as domheap pages at
> the far end. All would work out okay in the end when you remap those special
> pages during GPLPV restore, as the domheap pages would get implicitly freed.
> But maybe there is no allocation headroom for the guest in the meantime, so
> the restore fails.
>
> Just a theory. Maybe you could try unmapping grant/shinfo pages in the
> suspend callback? This may not help for live migration though, where pages
> get transmitted before the callback. It may be necessary to allow dom0 to
> specify 'map me a page but not if it's special' and plumb that up to
> xc_domain_save. It'd be good to have the theory proved first.

I took the easier path and told my grant code to map 2 fewer pages, and sure enough it tries to allocate 31 pages (which succeeds) but then tries to allocate 7 (which fails). So the 31 (was 33) must contain the grant table pages etc. I'll attempt to add the unmap code you requested and see if it makes a difference...

So why doesn't PV have this problem? Does it not send the pages first? And do you think that a Linux HVM domain with PV drivers would suffer the same fate?

Thanks

James
> On 04/08/2009 10:34, "James Harper" <james.harper@bendigoit.com.au> wrote:
>
>>> Just a theory. Maybe you could try unmapping grant/shinfo pages in the
>>> suspend callback? [...]
>>
>> Can do. What is the opposite of 'XENMEM_add_to_physmap', which I assume
>> I'll need to unmap the grant pages? Is it XENMEM_decrease_reservation?
>
> Oh yes, there is no direct opposite of add_to_physmap... But I think
> decrease_reservation will work okay in this case, fortunately.

Given that I'm not going to use the grant table subsequent to unmapping them I'll probably get away with it, but does XENMEM_decrease_reservation actually tell xen that the pages are no longer actually part of the grant table?

James
On 04/08/2009 11:40, "James Harper" <james.harper@bendigoit.com.au> wrote:

>> Oh yes, there is no direct opposite of add_to_physmap... But I think
>> decrease_reservation will work okay in this case, fortunately.
>
> Given that I'm not going to use the grant table subsequent to unmapping
> them I'll probably get away with it, but does
> XENMEM_decrease_reservation actually tell xen that the pages are no
> longer actually part of the grant table?

No, for a xenheap page the page won't actually get freed. Xen keeps a reference to them until the domain is finally destroyed.

Regarding the Linux PV-on-HVM drivers - they may have the same issue. Full PV guests do not, as they have a gnttab_suspend() function called during the suspend callback (and for subtle reasons xc_domain_save can detect and not save Xenheap pages for a full PV guest anyway - because it can see the P2M table in that case).

Like I said before -- unmapping the gnttab pages I think will not help you for live migration, but I suppose it is a reasonable thing to do anyway. For live migration I think xc_domain_save needs to get a bit smarter about Xenheap pages in HVM guests.

 -- Keir
> Regarding the Linux PV-on-HVM drivers - they may have the same issue. Full
> PV guests do not, as they have a gnttab_suspend() function called during
> the suspend callback (and for subtle reasons xc_domain_save can detect and
> not save Xenheap pages for a full PV guest anyway - because it can see the
> P2M table in that case).
>
> Like I said before -- unmapping the gnttab pages I think will not help you
> for live migration, but I suppose it is a reasonable thing to do anyway. For
> live migration I think xc_domain_save needs to get a bit smarter about
> Xenheap pages in HVM guests.

Understood. Do you have any idea about why it worked fine under 3.3.x but not 3.4.x?

Thanks

James
On 04/08/2009 12:34, "James Harper" <james.harper@bendigoit.com.au> wrote:

>> Like I said before -- unmapping the gnttab pages I think will not help you
>> for live migration, but I suppose it is a reasonable thing to do anyway. For
>> live migration I think xc_domain_save needs to get a bit smarter about
>> Xenheap pages in HVM guests.
>
> Understood. Do you have any idea about why it worked fine under 3.3.x
> but not 3.4.x?

The bit of code in 3.3's xc_domain_save.c that is commented "Skip PFNs that aren't really there" is removed in 3.4. That will be the reason.

 -- Keir
From: Pasi Kärkkäinen
Date: 2009-Aug-18 08:17 UTC
Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
On Tue, Aug 04, 2009 at 02:12:48PM +0100, Keir Fraser wrote:
> On 04/08/2009 12:34, "James Harper" <james.harper@bendigoit.com.au> wrote:
>
>>> Like I said before -- unmapping the gnttab pages I think will not help
>>> you for live migration, but I suppose it is a reasonable thing to do
>>> anyway. For live migration I think xc_domain_save needs to get a bit
>>> smarter about Xenheap pages in HVM guests.
>>
>> Understood. Do you have any idea about why it worked fine under 3.3.x
>> but not 3.4.x?
>
> The bit of code in 3.3's xc_domain_save.c that is commented "Skip PFNs that
> aren't really there" is removed in 3.4. That will be the reason.

James: Did you figure out how to fix this problem in the gplpv drivers? Just asking because it seems people hit this save/restore/migration problem pretty often..

-- Pasi
> > The bit of code in 3.3's xc_domain_save.c that is commented "Skip PFNs that
> > aren't really there" is removed in 3.4. That will be the reason.
>
> James: Did you figure out how to fix this problem in the gplpv drivers? Just
> asking because it seems people hit this save/restore/migration problem
> pretty often..

It's a problem that affects any PVonHVM domain afaict, so I'd rather defer to the person who made the change originally.

James
Hi James,

> It's a problem that affects any PVonHVM domain afaict, so I'd rather defer
> to the person who made the change originally.

I did some tests. Linux PVM live migration works well on Xen 3.4. As you said above, PVHVM fails live migration and gets the error "Error: /usr/lib/xen/bin/xc_save 32 8 0 0 5 failed", but save/restore is OK. So should this problem be fixed on the xen side instead of the pv driver side?

Following is the log of the Linux PVHVM live migration:

[2009-08-19 10:52:08 2832] DEBUG (XendCheckpoint:110) [xc_save]: /usr/lib/xen/bin/xc_save 32 8 0 0 5
[2009-08-19 10:52:08 2832] INFO (XendCheckpoint:418) xc_save: failed to get the suspend evtchn port
[2009-08-19 10:52:08 2832] INFO (XendCheckpoint:418)
[2009-08-19 10:52:08 2832] INFO (XendCheckpoint:418) Saving memory pages: iter 1   0%ERROR Internal error: Error when writing to state file (4a) (errno 104)
[2009-08-19 10:52:08 2832] INFO (XendCheckpoint:418) Save exit rc=1
[2009-08-19 10:52:08 2832] ERROR (XendCheckpoint:164) Save failed on domain OVM_EL5U3_X86_PVHVM_4GB (8) - resuming.
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 132, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 406, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_save 32 8 0 0 5 failed
[2009-08-19 10:52:08 2832] DEBUG (XendDomainInfo:2806) XendDomainInfo.resumeDomain(8)

Thanks

Annie.
On 19/08/2009 08:39, "ANNIE LI" <annie.li@oracle.com> wrote:

> Following is the log of the Linux PVHVM live migration:
>
> [2009-08-19 10:52:08 2832] DEBUG (XendCheckpoint:110) [xc_save]:
> /usr/lib/xen/bin/xc_save 32 8 0 0 5
> [2009-08-19 10:52:08 2832] INFO (XendCheckpoint:418) xc_save: failed to get
> the suspend evtchn port
> [2009-08-19 10:52:08 2832] INFO (XendCheckpoint:418)
> [2009-08-19 10:52:08 2832] INFO (XendCheckpoint:418) Saving memory pages:
> iter 1   0%ERROR Internal error: Error when writing to state file (4a)
> (errno 104)

The original error will be on the receive side. The sender failed because the receiver closed down.

 -- Keir
Hi

Keir Fraser wrote:
> On 19/08/2009 08:39, "ANNIE LI" <annie.li@oracle.com> wrote:
>
>> Following is the log of the Linux PVHVM live migration:
>> [...]
>
> The original error will be on the receive side. The sender failed because
> the receiver closed down.

Please ignore my last post; it was wrong because of a kernel version problem. I tested again after updating my test environment. Live migration/save/restore works well with PVHVM on Xen 3.4. So this should not be a problem for any PVonHVM domain; just Windows domains with pv drivers have this problem.

Thanks

Annie.
Hi

> Regarding the Linux PV-on-HVM drivers - they may have the same issue. Full
> PV guests do not, as they have a gnttab_suspend() function called during
> the suspend callback (and for subtle reasons xc_domain_save can detect and
> not save Xenheap pages for a full PV guest anyway - because it can see the
> P2M table in that case).

Live migration of Linux PVonHVM passed on Xen 3.4 successfully, but Windows os with the pv driver failed.

> Like I said before -- unmapping the gnttab pages I think will not help you
> for live migration, but I suppose it is a reasonable thing to do anyway. For
> live migration I think xc_domain_save needs to get a bit smarter about
> Xenheap pages in HVM guests.

Yes. Live migration of a Windows domU with the pv driver failed again after we added unmapping of the gnttab and shinfo pages to the suspend process. The save process is OK, but the restore hits a similar problem. So what should we do in the Windows pv driver to avoid this problem? Any suggestion is appreciated.

Following is the log of the restore process:

[2009-08-21 00:17:34 2918] DEBUG (XendCheckpoint:278) [xc_restore]: /usr/lib/xen/bin/xc_restore 31 33 2 3 1 1 1
[2009-08-21 00:17:34 2918] INFO (XendCheckpoint:418) xc_domain_restore start: p2m_size = 100000
[2009-08-21 00:17:34 2918] INFO (XendCheckpoint:418) Reloading memory pages: 0%
[2009-08-21 00:17:44 2918] INFO (XendCheckpoint:418) Failed allocation for dom 33: 7 extents of order 0
[2009-08-21 00:17:44 2918] INFO (XendCheckpoint:418) ERROR Internal error: Failed to allocate memory for batch.!
[2009-08-21 00:17:44 2918] INFO (XendCheckpoint:418)
[2009-08-21 00:17:44 2918] INFO (XendCheckpoint:418) Restore exit with rc=1
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:2750) XendDomainInfo.destroy: domid=33
[2009-08-21 00:17:44 2918] ERROR (XendDomainInfo:2764) XendDomainInfo.destroy: domain destruction failed.
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2757, in destroy
    xc.domain_pause(self.domid)
Error: (3, 'No such process')
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:2225) No device model
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:2227) Releasing devices
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:2240) Removing vif/0
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:1142) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:2240) Removing vbd/768
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:1142) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/768
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:2240) Removing vfb/0
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:1142) XendDomainInfo.destroyDevice: deviceClass = vfb, device = vfb/0
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:2240) Removing console/0
[2009-08-21 00:17:44 2918] DEBUG (XendDomainInfo:1142) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
[2009-08-21 00:17:45 2918] ERROR (XendDomain:1149) Restore failed
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomain.py", line 1147, in domain_restore_fd
    return XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 282, in restore
    forkHelper(cmd, fd, handler.handler, True)
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 406, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_restore 31 33 2 3 1 1 1 failed

Thanks

Annie.
On 20/08/2009 09:17, "ANNIE LI" <annie.li@oracle.com> wrote:

>> Like I said before -- unmapping the gnttab pages I think will not help you
>> for live migration, but I suppose it is a reasonable thing to do anyway. For
>> live migration I think xc_domain_save needs to get a bit smarter about
>> Xenheap pages in HVM guests.
>
> Yes. Live migration of a Windows domU with the pv driver failed again
> after we added unmapping of the gnttab and shinfo pages to the suspend
> process. The save process is OK, but the restore hits a similar problem.
> So what should we do in the Windows pv driver to avoid this problem? Any
> suggestion for this issue?

Balloon down by (#gnttab+#shinfo) pages when the PV drivers first load. You could do this instead of unmapping the gnttab+shinfo pages on suspend, or as well as.

Ultimately the 'right' fix will need to be implemented in Xen and the dom0 tools. But the above kludge would work perfectly well I believe.

 -- Keir
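A minimal sketch of what that kludge might look like in the guest driver, again not actual GPLPV code: alloc_guest_page() and gpfn_of_page() are hypothetical helper names, and 32+1 frames is an assumption matching the 33-page figure seen earlier in the thread.

/* Hedged sketch of the suggested workaround: at driver load, hand
 * back one plain RAM page for every gnttab and shinfo frame the
 * driver maps, so the Xen-heap pages that xc_domain_save picks up
 * still fit under max_pages at restore time. */
#define NR_GNTTAB_FRAMES 32   /* assumption: driver maps all 32 gnttab pages */
#define NR_SHINFO_FRAMES 1

static long balloon_down_for_save_restore(void)
{
    static xen_pfn_t gpfns[NR_GNTTAB_FRAMES + NR_SHINFO_FRAMES];
    struct xen_memory_reservation reservation = {
        .nr_extents   = NR_GNTTAB_FRAMES + NR_SHINFO_FRAMES,
        .extent_order = 0,          /* order-0 (4kB) extents */
        .domid        = DOMID_SELF,
    };
    unsigned int i;

    /* alloc_guest_page()/gpfn_of_page() are hypothetical helpers for
     * grabbing guest RAM pages and finding their gpfns. */
    for (i = 0; i < NR_GNTTAB_FRAMES + NR_SHINFO_FRAMES; i++)
        gpfns[i] = gpfn_of_page(alloc_guest_page());

    set_xen_guest_handle(reservation.extent_start, gpfns);

    /* Returns the number of extents actually released. */
    return HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
}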
> Balloon down by (#gnttab+#shinfo) pages when the PV drivers first load. You
> could do this instead of unmapping the gnttab+shinfo pages on suspend, or as
> well as.
>
> Ultimately the 'right' fix will need to be implemented in Xen and the dom0
> tools. But the above kludge would work perfectly well I believe.

Kludgy, as you say, but if it works then so be it.

Is there a future-proof way for the drivers to know if they are running under a future version of xen that isn't broken in this way? I'm guessing not...

James
Hi

> Kludgy, as you say, but if it works then so be it.

I just finished testing; live migration works well on a winpv domu after ballooning down those pages when the pv driver first loads.

Another question: why doesn't Linux PVonHVM have this issue? Does the Linux pv driver do the same thing as above?

Thanks

Annie.
On 20/08/2009 10:42, "James Harper" <james.harper@bendigoit.com.au> wrote:

>> Ultimately the 'right' fix will need to be implemented in Xen and the dom0
>> tools. But the above kludge would work perfectly well I believe.
>
> Kludgy, as you say, but if it works then so be it.
>
> Is there a future-proof way for the drivers to know if they are running
> under a future version of xen that isn't broken in this way? I'm
> guessing not...

No, not really. We could add a xenstore flag or something I suppose. But really losing a few memory pages is not the end of the world. I suppose the major pain might be if it shatters a physical superpage, and hence makes VT-d/EPT type stuff more expensive. If you could at least arrange for the pages to come from the same aligned 2MB region, or even from the bottom 2MB of memory (which can never be allocated as a superpage because of the VGA area), that might be nice.

Thinking about how to fix this nicely in the tools, it seems pretty tricky if I don't want to have to change the dom0 kernel too. The kernel is quite involved in mapping foreign pages up to user space and gets in the way of hacking in a flag between tools and Xen... And we do want to be able to map foreign Xen-heap pages in some cases. It's only a nuisance for xc_domain_save.

 -- Keir
On 20/08/2009 11:05, "ANNIE LI" <annie.li@oracle.com> wrote:

>> Kludgy, as you say, but if it works then so be it.
>
> I just finished testing; live migration works well on a winpv domu after
> ballooning down those pages when the pv driver first loads.
>
> Another question:
> Why doesn't Linux PVonHVM have this issue? Does the Linux pv driver do the
> same thing as above?

It's working more by luck than design, I fear.

 -- Keir
On 20/08/2009 11:19, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

> No, not really. We could add a xenstore flag or something I suppose. But
> really losing a few memory pages is not the end of the world. [...]
>
> Thinking about how to fix this nicely in the tools, it seems pretty tricky
> if I don't want to have to change the dom0 kernel too. The kernel is quite
> involved in mapping foreign pages up to user space and gets in the way of
> hacking in a flag between tools and Xen... And we do want to be able to map
> foreign Xen-heap pages in some cases. It's only a nuisance for
> xc_domain_save.

Another method would be for the PV-on-HVM drivers to map shinfo and gnttab pages in a restricted guest-physical address range and then advertise the range via, say, a new HVMPARAM. Tools could also indicate to PV-on-HVM drivers that this method is supported via the same HVMPARAM. How does that sound? It would give the ability for a general exclusion range for save/restore.

 -- Keir
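To make the proposal concrete, a purely hypothetical sketch: no such parameter exists at this point, and the names and numbers below are invented. Only HVMOP_set_param and xc_get_hvm_param are existing plumbing that such a scheme would likely reuse.

/* HYPOTHETICAL -- sketch of the proposal only.  The two parameter
 * names/numbers are invented. */
#define HVM_PARAM_PV_EXCLUDE_BASE  32  /* first gpfn of the special window */
#define HVM_PARAM_PV_EXCLUDE_NR    33  /* number of gpfns in the window    */

/* Guest side: after placing shinfo/gnttab in a fixed gpfn window,
 * advertise it so the tools can skip those frames. */
static void advertise_exclusion_range(unsigned long base, unsigned long nr)
{
    struct xen_hvm_param p = { .domid = DOMID_SELF };

    p.index = HVM_PARAM_PV_EXCLUDE_BASE; p.value = base;
    HYPERVISOR_hvm_op(HVMOP_set_param, &p);

    p.index = HVM_PARAM_PV_EXCLUDE_NR;   p.value = nr;
    HYPERVISOR_hvm_op(HVMOP_set_param, &p);
}

/* xc_domain_save side would then read the window with
 * xc_get_hvm_param() and skip any pfn falling inside it, i.e.
 * "if (pfn >= base && pfn < base + nr) continue;" in the batch loop. */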
Hi

> I just finished testing; live migration works well on a winpv domu after
> ballooning down those pages when the pv driver first loads.

After more testing, I find there are some problems with this method of ballooning down those pages when the pv driver first loads.

If I balloon down those pages only when the driver first loads, save/restore works only once, and migration does not work at all. It seems that a restored vm has lost the pages that were ballooned down; for migration, the destination does not have the pages that were ballooned down on the source.

But if I balloon down those pages every time (not just at driver first load), I tested save/restore/migration several times and all work fine. However, the domu wastes lots of memory in this situation.

I will do more testing on this and update here.

Thanks

Annie.
On 20/08/2009 12:55, "ANNIE LI" <annie.li@oracle.com> wrote:

>> I just finished testing; live migration works well on a winpv domu after
>> ballooning down those pages when the pv driver first loads.
>
> After more testing, I find there are some problems with this method of
> ballooning down those pages when the pv driver first loads.
>
> If I balloon down those pages only when the driver first loads,
> save/restore works only once, and migration does not work at all.

Oh dear.

> It seems that a restored vm has lost the pages that were ballooned down;
> for migration, the destination does not have the pages that were ballooned
> down on the source.

Right, that's the correct behaviour isn't it? Pages freed on the source VM do not magically reappear on the destination VM?

> But if I balloon down those pages every time (not just at driver first
> load), I tested save/restore/migration several times and all work fine.
> However, the domu wastes lots of memory in this situation.

Yes, that's weird. Do you know what condition causes the guest memory allocation failure in xc_domain_restore? Is it due to hitting the guest maxmem limit in Xen? If so, is maxmem the same value across multiple iterations of save/restore or migration?

> I will do more testing on this and update here.

Thanks.

 -- Keir
Hi>> It seems that a >> restored vm lost those pages ballooned down. For migration, destination >> does not have those pages which ballooned down on source. >> > > Right, that''s the correct behaviour isn''t it? Pages freed on source VM do > not magically reappear on destination VM? > >Yes, so this method can not fix this problem.>> But if i balloon down those pages every time(not driver first load), i >> tested save/restore/migration for several times, and all work fine. But >> the domu will waste lots of memory in this situation. >> > > Yes, that''s weird. Do you know what condition causes guest memory allocation > failure on xc_domain_restore? Is it due to hitting the guest maxmem limit in > Xen? If so, is maxmem the same value across multiple iterations of > save/restore or migration? >Sorry, i have no idea about it. Maybe I need to print more log in for(;;) in xc_domain_restore to see what is the difference between without and with balooning down pages. Thanks Annie. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
James Harper wrote:
> I added some more debugging...
>
> batch 1024           [1]
> Allocating 1024 mfns [2]
> 197600 allocated     [3]
> batch 1024
> Allocating 33 mfns
> Failed allocation for dom 24: 33 extents of order 0 (err = 32) [4]
>
> [1] is just after 'j' is read in xc_domain_restore
> [2] is just before the call to populate_physmap
> [3] is just after the call to populate_physmap
> [4] is the error message in the memory_op function in libxc, modified to
> give the value of err
>
> According to that, a total of 197632 pages are being allocated and the last
> page cannot be (more pages could be required the next time around the loop
> too...)

I've also added some debugging in xc_domain_save.c and xc_domain_restore.c. The attachment has those two files, which I am using, plus xend.log. The two files are modified based on Xen 3.4.0-xx.

What I've done is:

1. Create a Windows Server 2008 32bit VM with PV drivers which can support migration on Xen 3.1.x.
2. Save the Windows DomU, then restore it. The error log is the same as in the first mail of this thread.
3. Create a fresh-install Windows 2008 32bit VM with almost the same vm.cfg file.
4. Save and restore the Windows DomU.

Here are some xend logs. I am not sure whether they look right, as I don't know the whole save/restore process in Xen. If you need any more information, please let me know.

Line 100: [2009-08-26 01:34:24 2883] INFO (XendCheckpoint:418) shared_info_frame: 0x41ec
Line 167: [2009-08-26 01:34:39 2883] INFO (XendCheckpoint:418) shared_info_frame: 0xffffffff
Line 581: [2009-08-26 02:10:47 2883] INFO (XendCheckpoint:418) shared_info_frame: 0xfffff
Line 649: [2009-08-26 02:11:01 2883] INFO (XendCheckpoint:418) shared_info_frame: 0xffffffff   <-- is this OK for a PVHVM guest when restoring?

Line 169: [2009-08-26 01:34:39 2883] INFO (XendCheckpoint:418) Reloading memory pages: 0%
Line 170: [2009-08-26 01:34:39 2883] INFO (XendCheckpoint:418) batch 1024
Line 171: [2009-08-26 01:34:39 2883] INFO (XendCheckpoint:418) nr_mfns: 991   <----

Line 651: [2009-08-26 02:11:01 2883] INFO (XendCheckpoint:418) Reloading memory pages: 0%
Line 652: [2009-08-26 02:11:01 2883] INFO (XendCheckpoint:418) batch 1024
Line 653: [2009-08-26 02:11:01 2883] INFO (XendCheckpoint:418) nr_mfns: 992   <---- nr_mfns of HVM is one larger than PVHVM's

Line 472: [2009-08-26 01:34:46 2883] INFO (XendCheckpoint:418) batch 1024
Line 473: [2009-08-26 01:34:46 2883] INFO (XendCheckpoint:418) nr_mfns: 31   <---- restoring a PVHVM guest allocates 31 more nr_mfns than HVM

thanks
wayne
Hi,

>> Yes, that's weird. Do you know what condition causes the guest memory
>> allocation failure in xc_domain_restore? Is it due to hitting the guest
>> maxmem limit in Xen? If so, is maxmem the same value across multiple
>> iterations of save/restore or migration?
>
> Sorry, I have no idea about it. Maybe I need to print more logging in the
> for(;;) loop in xc_domain_restore to see what the difference is between
> ballooning down pages and not.

I did some migration tests on Linux/Windows PVHVM on Xen 3.4.

* I printed the value of "pfn = region_pfn_type[i] & ~XEN_DOMCTL_PFINFO_LTAB_MASK;" in xc_domain_restore.c. When the restore fails with the error "Failed allocation for dom 2: 33 extents of order 0", the value of pfn is less than that of a successful restore. So I think it should not be due to hitting the guest maxmem limit in Xen. Is that correct?

* After comparing the difference between ballooning down the (gnttab+shinfo) pages and not, I find that: if the Windows pv driver balloons down those pages, there are more pages with the XEN_DOMCTL_PFINFO_XTAB type in the saving process, and consequently more bogus/unmapped pages are skipped in the restoring process. If the winpv driver does not balloon down those pages, there are only a few pages with the XEN_DOMCTL_PFINFO_XTAB type to be processed during save/restore.

* Another result about the winpv driver with those pages ballooned down: when doing save/restore for the second time, I find that p2m_size in the restoring process becomes 0xfefff, which is less than the normal size 0x100000.

Any suggestions about those test results? Or any idea how to resolve this problem in winpv or xen?

Thanks

Annie.
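For readers following along, XEN_DOMCTL_PFINFO_XTAB is the type the save side assigns to pfns it could not map, and the restore side skips such entries so they consume no allocation. A self-contained sketch of the restore-side skip (the constants are from xen/include/public/domctl.h; the loop paraphrases the 3.4-era libxc behaviour from memory rather than quoting it, and uint32_t entries are shown for simplicity although the entry width follows the guest width):

#include <stdint.h>

/* XEN_DOMCTL_PFINFO_* values from xen/include/public/domctl.h: the
 * page type lives in the top nibble of each entry. */
#define XEN_DOMCTL_PFINFO_LTAB_SHIFT 28
#define XEN_DOMCTL_PFINFO_LTAB_MASK  (0xfU << XEN_DOMCTL_PFINFO_LTAB_SHIFT)
#define XEN_DOMCTL_PFINFO_XTAB       (0xfU << XEN_DOMCTL_PFINFO_LTAB_SHIFT)

/* Count how many entries of a restore batch are real pages needing
 * allocation, i.e. everything except XTAB (bogus/unmapped) entries.
 * This mirrors the skip Annie describes: more XTAB entries on the
 * save side means fewer pages allocated on the restore side. */
static unsigned int count_pages_to_allocate(const uint32_t *region_pfn_type,
                                            unsigned int batch)
{
    unsigned int i, n = 0;

    for (i = 0; i < batch; i++) {
        uint32_t pagetype = region_pfn_type[i] & XEN_DOMCTL_PFINFO_LTAB_MASK;
        if (pagetype == XEN_DOMCTL_PFINFO_XTAB)
            continue;   /* tagged invalid on the save side: skip */
        n++;
    }
    return n;
}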
Hi,

> I did some migration tests on Linux/Windows PVHVM on Xen 3.4.
>
> * I printed the value of "pfn = region_pfn_type[i] &
> ~XEN_DOMCTL_PFINFO_LTAB_MASK;" in xc_domain_restore.c. When the restore
> fails with the error "Failed allocation for dom 2: 33 extents of order 0",
> the value of pfn is less than that of a successful restore. So I think it
> should not be due to hitting the guest maxmem limit in Xen. Is that
> correct?
>
> [...]
>
> Any suggestions about those test results? Or any idea how to resolve this
> problem in winpv or xen?

I did more save/restore tests and compared the logs between Linux and Windows PVHVM. The two vms have the same memory size. Most of their logs are identical; the only difference between them is again connected with XEN_DOMCTL_PFINFO_XTAB type pages. From the comments in the code, the XEN_DOMCTL_PFINFO_XTAB type means an invalid page.

During the saving process, Linux PVHVM has 31 more invalid pages than Windows PVHVM. In "for ( j = 0; j < batch; j++ )" of xc_domain_save.c, Linux PVHVM treats the pages with pfn values between f2003 and f2021 as invalid pages, but Windows PVHVM treats those pages as normal pages. Then, in the restoring process, more memory is allocated for Windows PVHVM than for Linux PVHVM.

For example: when Windows PVHVM hits the issue "Failed allocation for dom 2: 33 extents of order 0", the log shows that nr_mfns before "xc_domain_memory_populate_physmap" is 33, whereas it is only 14 at the same point when restoring Linux PVHVM.

It seems there should be more invalid pages in the saving process of Windows PVHVM, but I have failed to find the root cause of it. Any suggestions?

Thanks

Annie.
Hi

As we discussed in this thread before, all Windows PVHVM guests should fail to migrate on Xen 3.4. Can anyone tell me whether the Citrix Windows pv driver's save/restore/migration works properly on Xen 3.4 or not?

Thanks

Annie.
Hi

It seems this problem is connected with the gnttab, not shareinfo. I changed some code about the grant table in the winpv driver (not using the balloon-down-shinfo+gnttab method), and save/restore/migration can work properly on Xen 3.4 now.

What I changed is that the winpv driver uses the hypercall XENMEM_add_to_physmap to map only the grant-table frames that devices require, instead of mapping all 32 pages of the grant table during initialization. It seems those extra grant-table mappings cause this problem.

Thanks

Annie.
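A sketch of the on-demand mapping Annie describes, using the real XENMEM_add_to_physmap / XENMAPSPACE_grant_table interface from the public headers; pick_free_gpfn() is a hypothetical helper for choosing where in the guest physmap to place the frame, and the error handling is elided.

/* Sketch of on-demand grant-table mapping, assuming the standard
 * public headers.  XENMAPSPACE_grant_table with idx = N asks Xen to
 * place the Nth grant-table frame at the given gpfn; mapping only the
 * frames actually needed avoids the extra Xen-heap pages that confuse
 * xc_domain_save. */
static int map_grant_frame(unsigned int frame_idx)
{
    struct xen_add_to_physmap xatp = {
        .domid = DOMID_SELF,
        .space = XENMAPSPACE_grant_table,
        .idx   = frame_idx,            /* which grant-table frame */
        .gpfn  = pick_free_gpfn(),     /* hypothetical placement helper */
    };

    return HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);
}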
> It seems this problem is connected with the gnttab, not shareinfo.
> I changed some code about the grant table in the winpv driver (not using
> the balloon-down-shinfo+gnttab method), and save/restore/migration can work
> properly on Xen 3.4 now.
>
> What I changed is that the winpv driver uses the hypercall
> XENMEM_add_to_physmap to map only the grant-table frames that devices
> require, instead of mapping all 32 pages of the grant table during
> initialization. It seems those extra grant-table mappings cause this
> problem.

I am wondering whether those extra grant-table mappings are the root cause of the migration problem, or whether it now works by luck, as with Linux PVHVM?

Thanks

Annie.
From: Dan Magenheimer
Date: 2009-Sep-04 21:28 UTC
Subject: RE: [Xen-devel] Error restoring DomU when using GPLPV
I think I've tracked down the cause of this problem in the hypervisor, but am unsure how best to fix it.

In tools/libxc/xc_domain_save.c, the static variable p2m_size is said to be the "number of pfns this guest has (i.e. number of entries in the P2M)". But apparently p2m_size is getting set to a very large number (0x100000) regardless of the maximum pseudophysical memory for the hvm guest. As a result, some "magic" pages in the 0xf0000-0xfefff range are getting placed in the save file. But since they are not "real" pages, the restore process runs beyond the maximum number of physical pages allowed for the domain and fails. (The gpfns of the last 24 pages saved are f2020, fc000-fc012, feffb, feffc, feffd, feffe.)

p2m_size is set in "save" with a call to a memory_op hypercall (XENMEM_maximum_gpfn), which for an hvm domain returns d->arch.p2m->max_mapped_pfn. I suspect that the meaning of max_mapped_pfn changed at some point to better match its name, but this changed the semantics of the hypercall as used by xc_domain_restore, resulting in this curious problem.

Any thoughts on how to fix this?

> -----Original Message-----
> From: Annie Li
> Sent: Tuesday, September 01, 2009 10:27 PM
> To: Keir Fraser
> Cc: Joshua West; James Harper; xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
>
> > It seems this problem is connected with the gnttab, not shareinfo.
> > [...]
>
> I am wondering whether those extra grant-table mappings are the root
> cause of the migration problem, or whether it now works by luck, as
> with Linux PVHVM?
>
> Thanks
> Annie.
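For context, the sizing in question looks roughly like this (paraphrased from memory of the 3.4-era tools/libxc, not an exact quote), which is why p2m_size tracks the highest mapped gpfn rather than the guest's RAM size:

#include <xenctrl.h>   /* xc_memory_op(), XENMEM_maximum_gpfn */

/* Paraphrase of how 3.4-era xc_domain_save sizes the P2M for an HVM
 * guest.  The hypercall returns d->arch.p2m->max_mapped_pfn, so a
 * frame mapped high in the physmap (e.g. gnttab/shinfo near 0xfefxx)
 * inflates p2m_size far beyond the guest's real RAM size. */
static unsigned long hvm_p2m_size(int xc_handle, domid_t dom)
{
    return xc_memory_op(xc_handle, XENMEM_maximum_gpfn, &dom) + 1;
}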
From: Dan Magenheimer
Date: 2009-Sep-04 23:02 UTC
Subject: RE: [Xen-devel] Error restoring DomU when using GPLPV
On further debugging, it appears that the p2m_size may be OK, but there's something about those 24 "magic" gpfns that isn't quite right.

> -----Original Message-----
> From: Dan Magenheimer
> Sent: Friday, September 04, 2009 3:29 PM
> To: Wayne Gong; Annie Li; Keir Fraser
> Cc: Joshua West; James Harper; xen-devel@lists.xensource.com
> Subject: RE: [Xen-devel] Error restoring DomU when using GPLPV
>
> I think I've tracked down the cause of this problem
> in the hypervisor, but am unsure how best to fix it.
>
> [...]
Keir Fraser wrote:
> On 04/08/2009 12:34, "James Harper" <james.harper@bendigoit.com.au> wrote:
>
>> Understood. Do you have any idea about why it worked fine under 3.3.x
>> but not 3.4.x?
>
> The bit of code in 3.3's xc_domain_save.c that is commented "Skip PFNs that
> aren't really there" is removed in 3.4. That will be the reason.
>
> -- Keir

Hi,

I started looking at this a couple of days ago, and finally understand what's going on. In our case, win migration/save-restore just fails, as Annie/Wayne had posted.

In the short run, since frames for vga etc are skipped anyways, can we just put the above change back in libxc (xen 3.4) and be ok?

thanks,
Mukesh

changeset:   18383:dade7f0bdc8d
user:        Keir Fraser <keir.fraser@citrix.com>
date:        Wed Aug 27 14:53:39 2008 +0100
summary:     hvm: Use main memory for video memory.

diff -r 2397555ebcc2 -r dade7f0bdc8d tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c      Wed Aug 27 13:31:01 2008 +0100
+++ b/tools/libxc/xc_domain_save.c      Wed Aug 27 14:53:39 2008 +0100
@@ -1111,12 +1111,6 @@
              (test_bit(n, to_fix) && last_iter)) )
             continue;

-        /* Skip PFNs that aren't really there */
-        if ( hvm && ((n >= 0xa0 && n < 0xc0) /* VGA hole */
-                     || (n >= (HVM_BELOW_4G_MMIO_START >> PAGE_SHIFT)
-                         && n < (1ULL<<32) >> PAGE_SHIFT)) /* MMIO */ )
-            continue;
-
On 05/09/2009 05:02, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

>> The bit of code in 3.3's xc_domain_save.c that is commented "Skip PFNs
>> that aren't really there" was removed in 3.4. That will be the reason.
>
> I started looking at this a couple of days ago, and finally understand
> what's going on. In our case, Windows migration/save-restore just fails,
> as Annie and Wayne had posted.
>
> In the short run, since frames for VGA etc. are skipped anyway, can we
> just put the above change back into libxc (Xen 3.4) and be OK?

I don't think VGA frames are skipped any more. VGA memory now gets saved
by xc_domain_save. Also some real RAM is up in that area these days --
the ACPI tables, for example.

 -- Keir
Not all those pages are special. Frames fc0xx will be ACPI tables, resident
in ordinary guest memory pages, for example. Only the Xen-heap pages are
special, and they need to be (1) skipped; or (2) unmapped by the HVMPV
drivers on suspend; or (3) accounted for by the HVMPV drivers by unmapping
and freeing an equal number of domain-heap pages. (1) is 'nicest' but
actually a bit of a pain to implement; (2) won't work well for live
migration, where the pages wouldn't get unmapped by the drivers until the
last round of page copying; and (3) was apparently tried by Annie but
didn't work? I'm curious why (3) didn't work - I can't explain that.

 -- Keir

On 05/09/2009 00:02, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> On further debugging, it appears that the p2m_size may be OK, but
> there's something about those 24 "magic" gpfns that isn't quite right.
>
> <snip>
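To make option (3) concrete: each time the driver maps a Xenheap page, it can hand one of its own domheap pages back to Xen. Below is a minimal sketch using the standard balloon hypercall, with Linux-style names; GPLPV's actual wrappers will differ, and the helper name is invented for illustration:

    /* Hand one guest-owned page (gpfn) back to Xen so that the mapped
     * Xenheap page is accounted for: tot_pages drops by one, leaving
     * headroom under max_pages for the restore-time copy. */
    static void balloon_out_one_page(xen_pfn_t gpfn)
    {
        struct xen_memory_reservation reservation = {
            .nr_extents   = 1,
            .extent_order = 0,
            .domid        = DOMID_SELF,
        };
        set_xen_guest_handle(reservation.extent_start, &gpfn);
        HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
    }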
Yes. About (3): my test result is that save/restore works only once if the
pages are ballooned down only on the driver's first load, but it works
repeatedly if they are ballooned down on every driver load.

Thanks
Annie.

Keir Fraser wrote:
> Not all those pages are special. Frames fc0xx will be ACPI tables,
> resident in ordinary guest memory pages, for example. Only the Xen-heap
> pages are special, and they need to be (1) skipped; or (2) unmapped by
> the HVMPV drivers on suspend; or (3) accounted for by the HVMPV drivers
> by unmapping and freeing an equal number of domain-heap pages.
>
> <snip>
Ok, I've been looking at this and figured out what's going on. Annie's
problem lies in not remapping the grant frames post-migration; hence the
leak -- tot_pages goes up every time until migration fails. On Linux, I
found that the remap is where the frames created by restore (for the
Xenheap pfns) get freed back to the domain heap. So that's a fix to be
made on the win PV driver side.

Now back to the original problem. As you already know, because libxc is
not skipping heap pages, tot_pages in struct domain{} temporarily goes up
by (shared-info frame + grant frames) until the guest remaps those pages.
Hence, migration fails if (max_pages - tot_pages) < (shared-info frame +
grant frames). Occasionally I see tot_pages nearly the same as max_pages,
and I don't know all the ways that may happen or what causes it (by
default I see tot_pages short by 21). Anyway, of the two solutions:

1. Always balloon down shinfo+gnttab frames: this needs to be done just
   once during load, right? I'm not sure how it would work, though, if
   memory gets ballooned up subsequently. I suppose the driver will have
   to intercept every increase in reservation and balloon down every
   time? Also, ballooning down during the suspend call would probably be
   too late, right?

2. libxc fix: I wonder how much work this would be. The good thing here
   is that it would take care of both Linux and PV HVM guests, avoiding
   driver updates across many versions, and is hence appealing to us. Can
   we somehow mark the frames as special, to be skipped? Looking at the
   big xc_domain_save function, I'm not sure how pfn_type gets set in the
   HVM case. Maybe before the outer loop it could ask the hypervisor for
   the full Xen-heap page list -- but then what if a new page gets added
   to the list in between?

Also, unfortunately, the failure case is not always handled properly. If
migration fails after suspend, there is no way to get the guest back. I
even noticed the guest disappear totally from both source and target on
failure, a couple of times out of the several dozen migrations I did.

thanks,
Mukesh

Keir Fraser wrote:
> Not all those pages are special. Frames fc0xx will be ACPI tables,
> resident in ordinary guest memory pages, for example. Only the Xen-heap
> pages are special, and they need to be (1) skipped; or (2) unmapped by
> the HVMPV drivers on suspend; or (3) accounted for by the HVMPV drivers
> by unmapping and freeing an equal number of domain-heap pages.
>
> <snip>
On 15/09/2009 03:25, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

> Ok, I've been looking at this and figured out what's going on. Annie's
> problem lies in not remapping the grant frames post-migration; hence the
> leak -- tot_pages goes up every time until migration fails. On Linux, I
> found that the remap is where the frames created by restore (for the
> Xenheap pfns) get freed back to the domain heap. So that's a fix to be
> made on the win PV driver side.

Although obviously that is a bug, I'm not sure why it would cause this
particular issue. The domheap pages do not get freed and replaced with
Xenheap pages, but why does that affect the next save/restore cycle?
After all, xc_domain_save does not distinguish between Xenheap and
domheap pages?

> 1. Always balloon down shinfo+gnttab frames: this needs to be done just
> once during load, right? I'm not sure how it would work, though, if
> memory gets ballooned up subsequently. I suppose the driver will have to
> intercept every increase in reservation and balloon down every time?

Well, it is the same driver that is doing the ballooning, so it's kind of
easy to intercept, right? Just need to track how many Xenheap pages are
mapped and maintain that amount of 'balloon down'.

> Also, ballooning down during the suspend call would probably be too
> late, right?

Indeed it would. It needs to be done during boot. It's only a few pages
though, so no one will miss them.

> 2. libxc fix: I wonder how much work this would be. The good thing here
> is that it would take care of both Linux and PV HVM guests, avoiding
> driver updates across many versions, and is hence appealing to us. Can
> we somehow mark the frames as special, to be skipped? Looking at the big
> xc_domain_save function, I'm not sure how pfn_type gets set in the HVM
> case. Maybe before the outer loop it could ask the hypervisor for the
> full Xen-heap page list -- but then what if a new page gets added to the
> list in between?

It's a pain. Pfn_type[] I think doesn't really get used. Xc_domain_save()
just tries to map PFNs and saves all the ones it successfully maps. So the
problem is that it is allowed to map Xenheap pages. But we can't always
disallow that, because sometimes the tools have good reason to map Xenheap
pages. So we'd need a new hypercall, or a flag, or something, and that
would need dom0 kernel changes as well as Xen and toolstack changes. So
it's rather a pain.

> Also, unfortunately, the failure case is not always handled properly. If
> migration fails after suspend, there is no way to get the guest back. I
> even noticed the guest disappear totally from both source and target on
> failure, a couple of times out of the several dozen migrations I did.

That shouldn't happen, since there is a mechanism to cancel the suspension
of a suspended guest. Possibly xend doesn't get it right every time, as
its error handling is pretty poor in general. I trust the underlying
mechanisms below xend pretty well, however.

 -- Keir
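The interception Keir describes could be as small as a counter maintained inside the same PV driver, plus a clamp on balloon-up targets. The sketch below is illustrative only; both helper names are invented, and balloon_out_one_page() refers to the earlier decrease-reservation sketch:

    static unsigned long xenheap_mappings;   /* shinfo + grant frames */

    /* Call whenever another Xenheap page is mapped into the physmap. */
    static void on_xenheap_page_mapped(xen_pfn_t sacrificial_gpfn)
    {
        xenheap_mappings++;
        balloon_out_one_page(sacrificial_gpfn);
    }

    /* Call on every balloon-up request, so the reserve of
     * xenheap_mappings pages is never handed back to the guest. */
    static unsigned long clamp_balloon_target(unsigned long target,
                                              unsigned long max_pages)
    {
        if (target > max_pages - xenheap_mappings)
            target = max_pages - xenheap_mappings;
        return target;
    }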
Keir Fraser wrote:
> On 15/09/2009 03:25, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>
>> Ok, I've been looking at this and figured out what's going on. Annie's
>> problem lies in not remapping the grant frames post-migration; hence
>> the leak -- tot_pages goes up every time until migration fails.
>>
>> <snip>
>
> Although obviously that is a bug, I'm not sure why it would cause this
> particular issue. The domheap pages do not get freed and replaced with
> Xenheap pages, but why does that affect the next save/restore cycle?
> After all, xc_domain_save does not distinguish between Xenheap and
> domheap pages?

That xc_domain_save doesn't distinguish is actually the problem, as
xc_domain_restore then backs the Xenheap pfns (for the shinfo/grant
frames) with domheap pages. Those domheap pages do get freed and replaced
by Xenheap pages on the target host, upon the guest remap in
gnttab_map(), in the following code:

arch_memory_op():

        /* Remove previously mapped page if it was present. */
        prev_mfn = gmfn_to_mfn(d, xatp.gpfn);
        if ( mfn_valid(prev_mfn) )
        {
            .....
            guest_remove_page(d, xatp.gpfn);   <====== frees the domheap page
        }

E.g. my guest with 128M gets created with tot_pages=0x83eb and
max_pages=0x8400. Now xc_domain_save saves everything --
0x83eb + shinfo + grant frames (2) -- so I see tot_pages on the target go
up to 0x83ee. Then the guest remaps the shinfo and grant frames, the
domheap pages are returned in guest_remove_page(), and tot_pages goes back
to 0x83eb. In Annie's case the driver forgets to remap the 2 grant frames,
so domheap pages stay wrongly mapped and tot_pages remains at 0x83ed;
after a few more migrations it reaches 0x83ff, and migration fails because
the restore side cannot temporarily allocate 0x83ff + shinfo + grant
frames, max_pages being 0x8400.

Hope that makes sense.

> Well, it is the same driver that is doing the ballooning, so it's kind
> of easy to intercept, right? Just need to track how many Xenheap pages
> are mapped and maintain that amount of 'balloon down'.

Yup, that's what I thought, but just wanted to make sure.

>> Also, ballooning down during the suspend call would probably be too
>> late, right?
>
> Indeed it would. It needs to be done during boot. It's only a few pages
> though, so no one will miss them.
>
>> 2. libxc fix: <snip>
>
> It's a pain. Pfn_type[] I think doesn't really get used.
> Xc_domain_save() just tries to map PFNs and saves all the ones it
> successfully maps. So the problem is that it is allowed to map Xenheap
> pages. But we can't always disallow that, because sometimes the tools
> have good reason to map Xenheap pages. So we'd need a new hypercall, or
> a flag, or something, and that would need dom0 kernel changes as well as
> Xen and toolstack changes. So it's rather a pain.

Ok, got it -- I think the driver change is the way to go.

>> Also, unfortunately, the failure case is not always handled properly.
>> If migration fails after suspend, there is no way to get the guest
>> back.
>>
>> <snip>
>
> That shouldn't happen, since there is a mechanism to cancel the
> suspension of a suspended guest. Possibly xend doesn't get it right
> every time, as its error handling is pretty poor in general. I trust the
> underlying mechanisms below xend pretty well, however.

thanks a lot,
Mukesh
On 15/09/2009 20:14, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

> E.g. my guest with 128M gets created with tot_pages=0x83eb and
> max_pages=0x8400. Now xc_domain_save saves everything --
> 0x83eb + shinfo + grant frames (2) -- so I see tot_pages on the target
> go up to 0x83ee. Then the guest remaps the shinfo and grant frames, the
> domheap pages are returned in guest_remove_page(), and tot_pages goes
> back to 0x83eb. In Annie's case the driver forgets to remap the 2 grant
> frames, so domheap pages stay wrongly mapped and tot_pages remains at
> 0x83ed; after a few more migrations it reaches 0x83ff, and migration
> fails because the restore side cannot temporarily allocate
> 0x83ff + shinfo + grant frames, max_pages being 0x8400.
>
> Hope that makes sense.

No, it doesn't. I agree that after the first migration tot_pages will
have increased to 0x83ed. But I do not agree that it will continue to
increase by three pages on each future migration. Look at it this way --
three GPFNs (guest-physical pages) have changed from Xenheap pages to
domheap pages across that first migration. On future migrations they will
be migrated just like any other ordinary domheap page, since that's what
they now are. And tot_pages will therefore not change. Right?

This is why I still cannot understand or explain Annie's experimental
result.

 -- Keir
On 15/09/2009 22:25, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

> No, it doesn't. I agree that after the first migration tot_pages will
> have increased to 0x83ed. But I do not agree that it will continue to
> increase by three pages on each future migration.
>
> <snip>

Actually, of course you do the right thing with the shinfo page, so one
page per migration does get switched back to being a Xenheap page (the
shinfo page). So tot_pages actually increases by 3 on the first migration,
then decreases by 1 when shinfo gets remapped by the PV drivers. It then
increases by 1 on every future migration (the shinfo Xenheap page getting
changed into a domheap page), and then decreases by 1 when shinfo gets
remapped by the PV drivers.

But even setting things out exactly right as above, the end result is the
same: I *still* cannot explain Annie's result.

 -- Keir
Keir Fraser wrote:
> Actually, of course you do the right thing with the shinfo page, so one
> page per migration does get switched back to being a Xenheap page (the
> shinfo page).
>
> But even setting things out exactly right as above, the end result is
> the same: I *still* cannot explain Annie's result.

The bug in her driver is that it is only remapping the shinfo page, and
NOT the 2 shared grant frames; tot_pages hence increases by 2 on every
migration. I can see it all in kdb: tot_pages goes up by 3, then down by 1
as the shared-info frame is remapped, and stays there. Next migration, it
goes up by 3 and down by 1 again. So each migration leaks 2 frames. The
initial difference between tot_pages and max_pages is 21 frames, hence it
fails after 10 migrations. (BTW, no maxmem is specified in the config
file; I'm told that means no PoD.)

On the Linux side, the driver remaps the shinfo page plus both grant
frames, so tot_pages goes up by 3 for a moment, then comes the remap and
it drops by 3, back to where it was.

If tot_pages == max_pages, then migration will fail. Which brings me to a
question: to test the balloon changes, what would be the best way to get
tot_pages equal to max_pages? xm mem-set doesn't quite get me there.
Occasionally I see the two the same after starting a guest, but I haven't
figured out what causes that to happen.

thanks
Mukesh
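Working through the arithmetic behind "fails after 10 migrations" (assuming the default of 2 grant frames plus the shinfo frame):

    headroom at boot:   max_pages - tot_pages = 21 frames
    leak per migration: 2 frames (the two unremapped grant frames)
    needed per restore: 3 frames (shinfo + 2 grant frames, temporarily)

    before migration n: headroom = 21 - 2*(n-1)
    n = 10: 21 - 18 = 3   (just enough)
    n = 11: 21 - 20 = 1   (allocation fails)

So the tenth migration just squeaks through and the next one fails, consistent with the observed behaviour.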
Hi,

> Actually, of course you do the right thing with the shinfo page, so one
> page per migration does get switched back to being a Xenheap page...
>
> But even setting things out exactly right as above, the end result is
> the same: I *still* cannot explain Annie's result.

The root cause is that the winpv driver did not re-map the gnttab frames
during resume. Thanks very much, Mukesh.

My initial implementation mapped all 32 grant-table pages during
initialization and then ballooned those pages down on the driver's first
load. However, I leaked those 32 grant pages because I did not re-map them
during resume. This is why save/restore could work only once.

My second implementation mapped only the grant frames each device needs,
instead of all 32 grant-table pages. But it leaked 2 frames every
migration, again because of the missing re-mapping of the grant tables.

I then tried re-mapping the grant tables during resume, and ballooning
down shinfo+gnttab on the driver's first load. I did save/restore several
times and did not hit any problem. Furthermore, I also tried mapping 64
grant-table pages during initialization and ballooning those pages down;
all worked fine.

I will do more testing to make sure of it and update here.

Thanks
Annie.
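For reference, the resume-time re-map Annie describes is the same XENMEM_add_to_physmap hypercall used at initialization, re-issued for each grant frame after restore. A sketch with Linux-style names follows; the saved-gpfn array is hypothetical, and GPLPV's own wrappers will differ:

    /* Re-insert grant-table frame i into the physmap at the gpfn it
     * occupied before the save. On the restore target this also frees
     * the placeholder domheap page via guest_remove_page(), which is
     * what stops tot_pages creeping up on every migration. */
    struct xen_add_to_physmap xatp = {
        .domid = DOMID_SELF,
        .space = XENMAPSPACE_grant_table,
        .idx   = i,
        .gpfn  = saved_grant_frame_gpfn[i],
    };
    HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp);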
> I will do more testing to make sure of it and update here.

I tried mapping 256 grant frames during initialization and ballooning down
256+1 (grant frames + shinfo) pages on the driver's first load. Then I did
save/restore 50 times and live migration 10 times. No errors occurred.

Thanks
Annie.
On 16/09/2009 12:10, "ANNIE LI" <annie.li@oracle.com> wrote:

>> I will do more testing to make sure of it and update here.
>
> I tried mapping 256 grant frames during initialization and ballooning
> down 256+1 (grant frames + shinfo) pages on the driver's first load.
> Then I did save/restore 50 times and live migration 10 times. No errors
> occurred.

Okay, well I still can't explain why that fixes it, but clearly it does.
So that's good. :-)

 -- Keir
Dan Magenheimer
2009-Sep-16 18:09 UTC
RE: [Xen-devel] Error restoring DomU when using GPLPV
Before we close down this thread, I have a concern.

According to Mukesh, the fix for this bug depends on the PV drivers
tracking tot_pages for a domain and ballooning to ensure tot_pages+3 does
not exceed max_pages for the domain.

Well, tmem can affect tot_pages for a domain inside the hypervisor without
any notification to the PV drivers or the balloon driver. And I'd imagine
that PoD and future memory-optimization mechanisms such as swapping and
page sharing may do the same.

So this solution seems very fragile.

Dan

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Wednesday, September 16, 2009 6:28 AM
> To: Annie Li
> Cc: Joshua West; Dan Magenheimer; xen-devel; Kurt Hackel; James Harper;
> Wayne Gong
> Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
>
> Okay, well I still can't explain why that fixes it, but clearly it
> does. So that's good. :-)
>
>  -- Keir
Just in case someone missed it earlier in the thread:

3 = 1 shinfo + 2 grant frames (the default).

So the check is tot_pages + shinfo + number of grant frames against
max_pages.

Mukesh

Dan Magenheimer wrote:
> Before we close down this thread, I have a concern.
>
> According to Mukesh, the fix for this bug depends on the PV drivers
> tracking tot_pages for a domain and ballooning to ensure tot_pages+3
> does not exceed max_pages for the domain.
>
> <snip>
Yeah, all the PV drivers have to do is balloon down one page for every
Xenheap page they map. There's no further complexity than that, so let's
not make a mountain out of a molehill. The approach as discussed and now
implemented should work fine with tmem, I think.

 -- Keir

On 16/09/2009 21:50, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

> Just in case someone missed it earlier in the thread:
>
> 3 = 1 shinfo + 2 grant frames (the default).
>
> <snip>
Dan Magenheimer
2009-Sep-17 15:41 UTC
RE: [Xen-devel] Error restoring DomU when using GPLPV
The problem is that every page that is ballooned down by the balloon
driver can be slurped up as a private-persistent ("preswap") page by tmem.
Private-persistent pages contain indirectly-accessible domain data, are
counted against the domain's tot_pages, and are migrated along with the
domain's directly-accessible pages. So any temporary mapping of Xenheap
pages into domheap, such as occurs during restore/migration, can cause
max_pages to be exceeded.

This isn't a problem for tmem today, because tmem only runs in PV domains
today, but I suspect the fragility of this approach will come back and
bite us. It reminds me of the classic "shell game".

Is there a per-domain counter of these special pages somewhere? If so, a
MEMF flag could subtract it from max_pages in the limit check in
assign_pages(), e.g.:

    max = d->max_pages;
    if ( memflags & MEMF_no_special )
        max -= d->special_pages;
    <snip>
    if ( unlikely((d->tot_pages + ... > max ) /* Over-allocation */

(special_pages would count any Xenheap pages that contain domain-specific
data that needs to be retained across a migration.)

Dan

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Thursday, September 17, 2009 12:21 AM
> To: Mukesh Rathor; Dan Magenheimer
> Cc: Annie Li; Joshua West; James Harper; xen-devel; Wayne Gong; Kurt
> Hackel
> Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
>
> Yeah, all the PV drivers have to do is balloon down one page for every
> Xenheap page they map. There's no further complexity than that, so
> let's not make a mountain out of a molehill. The approach as discussed
> and now implemented should work fine with tmem, I think.
>
> <snip>
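One place such a counter could plausibly be maintained is where Xen first shares a heap page with a guest. The sketch below is modelled on the real share_xen_page_with_guest(); the special_pages field is Dan's hypothetical addition, not existing code, and the locking detail is an assumption:

    void share_xen_page_with_guest(
        struct page_info *page, struct domain *d, int readonly)
    {
        spin_lock(&d->page_alloc_lock);
        d->special_pages++;   /* hypothetical per-domain counter that
                                 assign_pages() could subtract from
                                 max_pages under MEMF_no_special */
        /* ... existing sharing logic unchanged ... */
        spin_unlock(&d->page_alloc_lock);
    }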
Pasi Kärkkäinen
2009-Sep-24 20:24 UTC
Re: [Xen-devel] Error restoring DomU when using GPLPV / fix for GPLPV drivers
On Wed, Sep 16, 2009 at 07:10:19PM +0800, ANNIE LI wrote:

>> I will do more testing to make sure of it and update here.
>
> I tried mapping 256 grant frames during initialization and ballooning
> down 256+1 (grant frames + shinfo) pages on the driver's first load.
> Then I did save/restore 50 times and live migration 10 times. No errors
> occurred.

James: I guess this same fix should be applied to the GPLPV drivers?

-- Pasi
Keith Coleman
2009-Oct-27 20:05 UTC
Re: [Xen-devel] Error restoring DomU when using GPLPV / fix for GPLPV drivers
On Thu, Sep 24, 2009 at 4:24 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:

> James: I guess this same fix should be applied to the GPLPV drivers?

The latest GPLPV drivers still can't restore, but the discussion of this
issue has gone quiet. Is this a lost cause?

I'm using xen-3.4.1, the official 2.6.18-xen kernel, and
gplpv_fre_wnet_x86_0.10.0.130.msi.

2:~# xm save win2 win2.save
2:~# xm restore win2.save
Error: /usr/lib64/xen/bin/xc_restore 4 49 2 3 1 1 1 failed
Usage: xm restore <CheckpointFile> [-p]

Restore a domain from a saved state.
  -p, --paused    Do not unpause domain after restoring it

Keith Coleman