Tim M
2011-Apr-13 16:25 UTC
[Xen-devel] [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
It was suggested that I submit this to xen-devel in addition to the bug report.

I am having the exact problem described in bug 1282. The RedHat 5 errata for Xen 3 describes the problem nicely so I will quote it:

    xen calculates its running time by adding the hypervisor's up-time to the
    hypervisor's boot-time record. In live migrations of para-virtualized
    guests, however, the guest would over-write the new hypervisor's boot-time
    record with the boot-time of the previous hypervisor. This caused
    time-dependent processes on the guests to fail

This bug was apparently fixed in 3.1.1 (http://xenbits.xen.org/hg/xen-unstable.hg/rev/359707941ae8) but I am having the issue now with Xen 4.0.1 on Debian Squeeze.

Did something change with the migrate/restore process so the previous fix no longer applies?

Thanks in advance for any help

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
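The arithmetic in the quoted errata can be sketched as follows. This is only an illustration of the description above, not Xen code: the `Host` class, the `wallclock` function, and all of the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    boot_time: int  # hypothetical: hypervisor boot time, seconds since the epoch
    uptime: int     # hypothetical: seconds the hypervisor has been running


def wallclock(boot_time_record, uptime):
    """Running time as the errata describes it: boot-time record plus up-time."""
    return boot_time_record + uptime


now = 1_302_720_000  # an arbitrary "current time" chosen for the example

# The source host has been up for 10000 s, the target host for only 4000 s.
old_host = Host(boot_time=now - 10_000, uptime=10_000)
new_host = Host(boot_time=now - 4_000, uptime=4_000)

# Correct result on the target: its own boot-time record plus its own up-time.
correct = wallclock(new_host.boot_time, new_host.uptime)

# The bug as described: the record is over-written with the previous host's
# boot time, but the up-time still comes from the new host.
stale = wallclock(old_host.boot_time, new_host.uptime)

error = correct - stale
print(f"clock is off by {error} seconds")
```

In this made-up case the error equals the difference between the two hosts' up-times, which is why the symptom depends on which host has been running longer.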
Keir Fraser
2011-Apr-13 18:34 UTC
Re: [Xen-devel] [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
On 13/04/2011 17:25, "Tim M" <bugs@linuxrehab.com> wrote:

> It was suggested that I submit this to xen-devel in addition to the bug
> report.
>
> I am having the exact problem described in bug 1282. The RedHat 5 errata for
> Xen 3 describes the problem nicely so I will quote it:
>
> xen calculates its running time by adding the hypervisor's up-time to the
> hypervisor's boot-time record. In live migrations of para-virtualized
> guests, however, the guest would over-write the new hypervisor's boot-time
> record with the boot-time of the previous hypervisor. This caused
> time-dependent processes on the guests to fail
>
> This bug was apparently fixed in 3.1.1
> (http://xenbits.xen.org/hg/xen-unstable.hg/rev/359707941ae8) but I am having
> the issue now with Xen 4.0.1 on Debian Squeeze.
>
> Did something change with the migrate/restore process so the previous fix no
> longer applies?

The fix is still there, albeit in a modified form since the restore code has changed quite a bit since Xen 3. Can you reliably repro this, with any PV guest?

 -- Keir

> Thanks in advance for any help
Tim M
2011-Apr-13 19:31 UTC
Re: [Xen-devel] [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
On Wed, Apr 13, 2011 at 07:34:01PM +0100, Keir Fraser wrote:

> The fix is still there, albeit in a modified form since the restore code has
> changed quite a bit since Xen 3. Can you reliably repro this, with any PV
> guest?
>
> -- Keir

This is 100% reproducible. Every time I migrate from a host with a longer uptime than the target host's, the VM's clock freezes for however long the uptime difference is. Migrating from a host with a shorter uptime to a host with a longer one causes no problem.

As a demonstration, I have two dom0 hosts with roughly 6 minutes' difference in uptime. I started a guest VM on the dom0 host with the longer uptime, then SSH'd to the guest and ran this command:

    while true ; do date ; sleep 5 ; done

Next I initiated a live migration and this is what the output of the command looks like (with commentary added):

    Wed Apr 13 12:07:48 PDT 2011
    Wed Apr 13 12:07:53 PDT 2011
    Wed Apr 13 12:07:58 PDT 2011
    Wed Apr 13 12:08:03 PDT 2011
    [ migration happens here and SSH to the guest "freezes" for about 6 min ]
    Wed Apr 13 12:13:55 PDT 2011
    Wed Apr 13 12:14:00 PDT 2011
    Wed Apr 13 12:14:05 PDT 2011

I have only tried Ubuntu 10.04.2 guests running the 2.6.32 and 2.6.35 server kernel packages but, as mentioned, this happens every time.
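For anyone who wants to measure the stall rather than eyeball it, here is a small sketch that scans captured `date` output for gaps larger than the expected 5-second interval. The function names and the 2-second tolerance are assumptions made for this example; the timestamps are the ones reported above.

```python
from datetime import datetime


def parse_date_line(line):
    """Parse one line of `date` output, ignoring the time-zone abbreviation."""
    fields = line.split()
    # Fields are: weekday, month, day, HH:MM:SS, zone, year. The zone is
    # dropped because strptime's %Z only recognises a few zone names.
    without_zone = " ".join(fields[:4] + fields[5:])
    return datetime.strptime(without_zone, "%a %b %d %H:%M:%S %Y")


def find_stalls(lines, interval=5, tolerance=2):
    """Return (previous_line, next_line, gap_seconds) for each oversized gap."""
    stalls = []
    for prev, cur in zip(lines, lines[1:]):
        gap = (parse_date_line(cur) - parse_date_line(prev)).total_seconds()
        if gap > interval + tolerance:
            stalls.append((prev, cur, int(gap)))
    return stalls


observed = [
    "Wed Apr 13 12:07:48 PDT 2011",
    "Wed Apr 13 12:07:53 PDT 2011",
    "Wed Apr 13 12:07:58 PDT 2011",
    "Wed Apr 13 12:08:03 PDT 2011",
    "Wed Apr 13 12:13:55 PDT 2011",
    "Wed Apr 13 12:14:00 PDT 2011",
    "Wed Apr 13 12:14:05 PDT 2011",
]

for prev, cur, gap in find_stalls(observed):
    print(f"{gap} s between '{prev}' and '{cur}'")
```

On the output above this flags a single gap of 352 seconds, which matches the "about 6 min" freeze and the uptime difference between the two hosts.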
Keir Fraser
2011-Apr-13 20:00 UTC
Re: [Xen-devel] [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
On 13/04/2011 20:31, "Tim M" <bugs@linuxrehab.com> wrote:

> Next I initiated a live migration and this is what the output of the command
> looks like (with commentary added):
>
> Wed Apr 13 12:07:48 PDT 2011
> Wed Apr 13 12:07:53 PDT 2011
> Wed Apr 13 12:07:58 PDT 2011
> Wed Apr 13 12:08:03 PDT 2011 [ migration happens here and SSH to the guest
> "freezes" for about 6 min ]
> Wed Apr 13 12:13:55 PDT 2011
> Wed Apr 13 12:14:00 PDT 2011
> Wed Apr 13 12:14:05 PDT 2011

Looks like the wallclock is correct after the migration, however, and that is what the patch you referred to was fixing.

> I have only tried Ubuntu 10.04.2 guests running the 2.6.32 and 2.6.35 server
> kernel packages but, as mentioned, this happens every time.

I think this is a domU kernel bug; see a similar report here, for example:
http://lists.xensource.com/archives/html/xen-devel/2010-10/msg00057.html
So it could be a common symptom in a range of Debian/Ubuntu kernels.

 -- Keir
Ian Campbell
2011-Apr-14 06:40 UTC
Re: [Xen-devel] [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
On Wed, 2011-04-13 at 21:00 +0100, Keir Fraser wrote:

> On 13/04/2011 20:31, "Tim M" <bugs@linuxrehab.com> wrote:
> > I have only tried Ubuntu 10.04.2 guests running the 2.6.32 and 2.6.35 server
> > kernel packages but, as mentioned, this happens every time.
>
> I think this is a domU kernel bug, see a similar report here for example:
> http://lists.xensource.com/archives/html/xen-devel/2010-10/msg00057.html
> So it could be a common symptom in a range of Debian/Ubuntu kernels.

This issue was introduced in one of the upstream longterm 2.6.32.y kernels (by 1345126c761f in v2.6.32.16, I think).

It was fixed upstream in e7a3481c0246 "x86/pvclock: Zero last_value on resume" from v2.6.37-rc6 which was added to the longterm 2.6.32.x branch as 595b62a8acfb in v2.6.32.30. It appears to have also gone into longterm v2.6.35.12 as ac9a0f1a28f5.

This issue is fixed in the 2.6.32-31 package currently in Debian stable. I can't speak for Ubuntu. You should contact them.

Ian.
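The title of the fix, "Zero last_value on resume", hints at why the guest stalls for exactly the uptime difference. The following is a simplified model, not the kernel's code. It assumes the guest clock remembers the highest value it has returned and never reports anything lower, and that the raw time supplied by the new host restarts from that host's smaller uptime. The class, its method names, and the numbers are hypothetical.

```python
class PvClockModel:
    """Toy model of a clock that refuses to go backwards."""

    def __init__(self):
        self.last_value = 0  # highest value handed out so far

    def read(self, raw):
        # Assumed behaviour: never return less than a value already returned.
        if raw < self.last_value:
            return self.last_value
        self.last_value = raw
        return raw

    def resume(self):
        # Assumed fix: forget the old high-water mark after a migration.
        self.last_value = 0


OLD_HOST_TIME = 10_000  # hypothetical raw time on the source host, in seconds
NEW_HOST_TIME = 9_640   # target host has been up 360 s less

# Without the reset, the clock is pinned until the new host catches up.
buggy = PvClockModel()
buggy.read(OLD_HOST_TIME)
frozen = [buggy.read(NEW_HOST_TIME + t) for t in (0, 120, 240, 359)]
print("without reset:", frozen)

# With the reset, reads follow the new host's raw time straight away.
fixed = PvClockModel()
fixed.read(OLD_HOST_TIME)
fixed.resume()
moving = [fixed.read(NEW_HOST_TIME + t) for t in (0, 120, 240, 359)]
print("with reset:   ", moving)
```

In the model without the reset, every read returns the old value until 360 seconds have passed, so a `sleep 5` inside the guest would not return during that window. This also fits the observation that migrating to a host with a longer uptime shows no problem, because the raw time never falls below the remembered value.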
Tim M
2011-Apr-14 20:44 UTC
Re: [Xen-devel] [Bug 1759] Xen 4.0.1 live migration/restore over-writes new hypervisor's boot-time record
On Thu, Apr 14, 2011 at 07:40:08AM +0100, Ian Campbell wrote:

> This issue is fixed in the 2.6.32-31 package currently in Debian stable.

I installed a DomU guest with Debian and the suggested kernel and live migrations work properly.

Thanks for all the help!