Phil Evans
2013-Jan-10 18:27 UTC
Ever increasing time offset for HVM domain / Huge amounts of drift
Hi,

I am currently running Xen 4.2.1 (this has also happened in 4.2.0 as well). We have been having a major problem with sometimes huge amounts of clock drift in Windows VMs. Sometimes the clock on a VM could suddenly jump by over a week (usually forwards, however time has been known to go backwards as well).

Now I don't profess to know the internals of Xen, however through my investigation I believe I have a degree of knowledge of what could be causing the problem.

The steps to reproduce this (for me at least), is to simply do a manual NTP sync on a Windows VM. Upon monitoring the qemu-dm log file for the VM, I see similar to the following:

Time offset set 489, added offset 480
Time offset set 436, added offset -53
Time offset set 496, added offset 60
Time offset set 494, added offset -2
Time offset set 554, added offset 60
Time offset set 565, added offset 11
Time offset set 606, added offset 41
Time offset set -1974, added offset -2580
Time offset set 1626, added offset 3600
Time offset set 1579, added offset -47
Time offset set 1639, added offset 60

It seems to add the same number of seconds to the offset as has passed since the last sync. The offset just keeps on increasing, eventually resulting in huge numbers equating to days. Occasionally the offset may jump a bit and go down but the general trend is up. Although this does not affect the VM immediately, at some point I am guessing it syncs itself with the CMOS clock (which is now a large number of seconds offset from the actual time), resulting in a huge jump in time. A reboot is a guaranteed way to get the new, incorrect time.

Although I do not understand all of the underlying code, I presume the correct way this should work is it should be comparing the CMOS time that's just been set with the hardware clock on the physical machine, resulting in an offset between the two. This would result in a generally stable number (ideally 0). Obviously it is incorrect behaviour for the number to keep going up. To my mind it looks like it may be somehow getting an inaccurate time from the system (in many cases a fixed time rather than an up-to-date current time).

Does anyone have any light they may be able to shed on this? Is it possible it could be struggling to get an accurate time from the hardware? I have checked on several occasions and both the system time and the BIOS clock are spot on.

Regards,
Phil.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
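The log excerpt in the post can be checked mechanically. The following is a minimal sketch, assuming the qemu-dm lines have exactly the form quoted above; the names `parse_offsets` and `is_running_sum` are illustrative and are not part of Xen or qemu-dm. It confirms that each "set" value is the previous "set" value plus the "added" value, so the logged total is a running sum of the per-sync adjustments.

```python
import re

# Log lines copied from the post above.
LOG = """\
Time offset set 489, added offset 480
Time offset set 436, added offset -53
Time offset set 496, added offset 60
Time offset set 494, added offset -2
Time offset set 554, added offset 60
Time offset set 565, added offset 11
Time offset set 606, added offset 41
Time offset set -1974, added offset -2580
Time offset set 1626, added offset 3600
Time offset set 1579, added offset -47
Time offset set 1639, added offset 60
"""

LINE_RE = re.compile(r"Time offset set (-?\d+), added offset (-?\d+)")


def parse_offsets(text):
    """Return a list of (total_offset, added_offset) integer pairs."""
    return [(int(total), int(added)) for total, added in LINE_RE.findall(text)]


def is_running_sum(pairs):
    """True if every total equals the previous total plus its added offset."""
    return all(cur[0] == prev[0] + cur[1] for prev, cur in zip(pairs, pairs[1:]))


pairs = parse_offsets(LOG)
print(is_running_sum(pairs))        # True
print(pairs[-1][0] - pairs[0][0])   # 1150 (net change in seconds over the sample)
```

The -2580 and +3600 entries nearly cancel each other, which is why the net change over the sample stays positive even though one step is strongly negative.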
Pasi Kärkkäinen
2013-Jan-14 15:14 UTC
Re: Ever increasing time offset for HVM domain / Huge amounts of drift
On Mon, Jan 14, 2013 at 01:37:01PM +0000, Phil Evans wrote:
> Does anyone have any light they may be able to shed on this? Is it possible it could be struggling to get an accurate time from the hardware? I have checked on several occasions and both the system time and the BIOS clock are spot on.
>

Please paste the cfgfile of your HVM Windows.

-- Pasi
Phil Evans
2013-Jan-14 16:17 UTC
Re: Ever increasing time offset for HVM domain / Huge amounts of drift
Hi,

Sorry I should have included that in the first place:

import os,re
arch = os.uname()[4]
kernel = '/usr/lib/xen-default/boot/hvmloader'
builder = 'hvm'
name = 'vm_141'
memory = '2048'
disk = ['phy:/dev/storage_node_2/disk_806,xvda,w','file:/control/isos/empty.iso,xvdd:cdrom,r']
vif = ['mac=00:16:3e:3f:2f:1a, bridge=vlan_369, vifname=vm_141.0, ip=89.238.190.22 2a02:40:501:3::2 2a02:40:501:3::5 89.238.190.88','mac=00:16:3e:67:66:71, bridge=vlan_4000, vifname=vm_141.1, ip=0.0.0.0','mac=00:16:3e:01:7e:e8, bridge=vlan_369, vifname=vm_141.2']
device_model = '/usr/lib/xen-default/bin/qemu-dm'
boot = 'cd'
vnc = 1
vncpasswd = 'YpD5aVZ8'
usbdevice = 'tablet'
acpi = 0
vcpus = 4
viridian = 1

Thanks,
Phil.

________________________________________
From: Pasi Kärkkäinen [pasik@iki.fi]
Sent: 14 January 2013 15:14
To: Phil Evans
Cc: xen-devel@lists.xen.org
Subject: Re: [Xen-devel] Ever increasing time offset for HVM domain / Huge amounts of drift

Please paste the cfgfile of your HVM Windows.

-- Pasi
Pasi Kärkkäinen
2013-Jan-14 16:30 UTC
Re: Ever increasing time offset for HVM domain / Huge amounts of drift
On Mon, Jan 14, 2013 at 04:17:39PM +0000, Phil Evans wrote:
> Hi,
>
> Sorry I should have included that in the first place:
>

Ok. Did you try experimenting with these options?:

timer_mode=X
hpet=0|1
tsc_mode=X

-- Pasi
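For readers unfamiliar with the three options, here is a hedged sketch of how they might be added to a guest config like the one posted earlier in the thread. xm config files use Python syntax, so the fragment is plain assignments. The specific values are only one combination to experiment with and are an assumption of this sketch; the thread does not say which values were tried, and nothing here is claimed to fix the drift.

```python
# Hypothetical additions to the guest config file; values are examples only.

# Virtual timer mode, 0..3:
#   0 = delay_for_missed_ticks, 1 = no_delay_for_missed_ticks,
#   2 = no_missed_ticks_pending, 3 = one_missed_tick_pending
timer_mode = 1

# 0 hides the emulated HPET from the guest, 1 exposes it.
hpet = 0

# TSC handling, 0..3:
#   0 = default, 1 = always emulate, 2 = native (never emulate),
#   3 = native with paravirtualised RDTSCP
tsc_mode = 0
```

Changing one option at a time, then repeating the manual NTP sync and watching the qemu-dm log, keeps it clear which setting (if any) alters the behaviour.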
Phil Evans
2013-Jan-14 16:52 UTC
Re: Ever increasing time offset for HVM domain / Huge amounts of drift
I had tried messing with all but tsc_mode. I have tried all options for tsc_mode as well now but there is no change. The key thing here is that the drift is increasing at a rate of 1 second per second of uptime amounting to huge amounts. It doesn't seem to be a clock skew issue, it seems to be something is simply not counting at all.

Phil.

________________________________________
From: Pasi Kärkkäinen [pasik@iki.fi]
Sent: 14 January 2013 16:30
To: Phil Evans
Cc: xen-devel@lists.xen.org
Subject: Re: [Xen-devel] Ever increasing time offset for HVM domain / Huge amounts of drift

Ok. Did you try experimenting with these options?:

timer_mode=X
hpet=0|1
tsc_mode=X

-- Pasi
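The "1 second per second" observation can be illustrated with a toy model. This is only a sketch of the hypothesis stated in the thread (that the offset is being computed against a reference time that does not advance); it is not Xen or qemu-dm code, and `logged_offsets` and both reference functions are made up for the illustration.

```python
def logged_offsets(sync_times, reference):
    """Offset recorded at each sync: the time the guest sets minus the reference.

    sync_times: real times (in seconds) at which the guest sets its RTC correctly.
    reference:  function mapping real time to what the offset is compared against.
    """
    return [t - reference(t) for t in sync_times]


syncs = [60, 120, 180, 240]  # one manual NTP sync per minute

# Reference stuck at its initial value: the offset grows with elapsed time.
frozen = logged_offsets(syncs, lambda t: 0)

# Reference that tracks real time: the offset stays at zero.
healthy = logged_offsets(syncs, lambda t: t)

print(frozen)   # [60, 120, 180, 240]
print(healthy)  # [0, 0, 0, 0]

# Per-sync increments in the frozen case equal the time between syncs.
print([b - a for a, b in zip(frozen, frozen[1:])])  # [60, 60, 60]
```

In the frozen case each sync adds exactly the seconds elapsed since the previous sync, which matches the repeated "added offset 60" entries in the log; a tracking reference would give a stable value near zero.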
Pasi Kärkkäinen
2013-Jan-14 16:54 UTC
Re: Ever increasing time offset for HVM domain / Huge amounts of drift
On Mon, Jan 14, 2013 at 04:52:31PM +0000, Phil Evans wrote:
> I had tried messing with all but tsc_mode. I have tried all options for tsc_mode as well now but there is no change. The key thing here is that the drift is increasing at a rate of 1 second per second of uptime amounting to huge amounts. It doesn't seem to be a clock skew issue, it seems to be something is simply not counting at all.
>

Please don't top-post..

What's the hardware config?

-- Pasi
Phil Evans
2013-Jan-14 17:10 UTC
Re: Ever increasing time offset for HVM domain / Huge amounts of drift
> On Mon, Jan 14, 2013 at 04:52:31PM +0000, Phil Evans wrote:
>> I had tried messing with all but tsc_mode. I have tried all options for tsc_mode as well now but there is no change. The key thing here is that the drift is increasing at a rate of 1 second per second of uptime amounting to huge amounts. It doesn't seem to be a clock skew issue, it seems to be something is simply not counting at all.
>>
>
> Please don't top-post..
>
> What's the hardware config?
>
> -- Pasi
>

The boxes are Dell PowerEdge R420's with 128GB RAM and dual 8-core Intel Xeon E5-2450s @ 2.1GHz.

Here's xm info:

host                   : node5
release                : 3.2.34
version                : #2 SMP Wed Dec 5 12:29:46 GMT 2012
machine                : x86_64
nr_cpus                : 32
nr_nodes               : 2
cores_per_socket       : 8
threads_per_core       : 2
cpu_mhz                : 2100
hw_caps                : bfebfbff:2c100800:00000000:00003f40:13bee3ff:00000000:00000001:00000000
virt_caps              : hvm hvm_directio
total_memory           : 130994
free_memory            : 119056
free_cpus              : 0
xen_major              : 4
xen_minor              : 2
xen_extra              : .1
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : dom0_mem=8192M cpufreq=xen dom0_max_vcpus=4 dom0_vcpus_pin xsave=off
cc_compiler            : gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4)
cc_compile_by          : mockbuild
cc_compile_domain      : crc.id.au
cc_compile_date        : Wed Dec 19 01:32:40 EST 2012
xend_config_format     : 4

Phil.
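When comparing several hosts, it can help to pull the relevant fields out of `xm info` programmatically. A minimal sketch, assuming the "key : value" layout shown above; `parse_xm_info` is a hypothetical helper, and only a few lines of the posted output are reproduced here to keep the example short.

```python
# A subset of the xm info output posted above.
XM_INFO = """\
release                : 3.2.34
version                : #2 SMP Wed Dec 5 12:29:46 GMT 2012
xen_major              : 4
xen_minor              : 2
xen_extra              : .1
xen_commandline        : dom0_mem=8192M cpufreq=xen dom0_max_vcpus=4 dom0_vcpus_pin xsave=off
"""


def parse_xm_info(text):
    """Split each 'key : value' line at the first colon into a dict entry."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon at all
            info[key.strip()] = value.strip()
    return info


info = parse_xm_info(XM_INFO)
xen_version = f"{info['xen_major']}.{info['xen_minor']}{info['xen_extra']}"
boot_options = info["xen_commandline"].split()

print(xen_version)   # 4.2.1
print(boot_options)  # hypervisor boot options, one list item per option
```

Splitting at the first colon only matters for values that themselves contain colons, such as the kernel build time in the `version` line.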
Dan Magenheimer
2013-Jan-14 23:57 UTC
Re: Ever increasing time offset for HVM domain / Huge amounts of drift
> From: Phil Evans [mailto:Phil.Evans@m247.com]
> Sent: Monday, January 14, 2013 9:53 AM
> To: Pasi Kärkkäinen
> Cc: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] Ever increasing time offset for HVM domain / Huge amounts of drift
>
> I had tried messing with all but tsc_mode. I have tried all options for tsc_mode as well now but
> there is no change. The key thing here is that the drift is increasing at a rate of 1 second per
> second of uptime amounting to huge amounts. It doesn't seem to be a clock skew issue, it seems to be
> something is simply not counting at all.

Just a wild idea... What Windows version are you running? Historically, I think, Windows has ignored TSC i.e. never issues a rdtsc nor writes to TSC. I wonder if the kernel of whatever version of Windows you are running (or the NTP sync somehow with the blessing of the kernel) _is_ checking certain hardware settings, then writing to the guest's (virtual) TSC and this might cause things to get very confused.

P.S. Avoiding top-posting can be difficult and annoying if you are running some versions of Outlook as your mail client. Google it to see how, if you'd like to avoid complaints on this list.
Tim Deegan
2013-Jan-17 13:55 UTC
Re: Ever increasing time offset for HVM domain / Huge amounts of drift
Hi,

At 13:37 +0000 on 14 Jan (1358170621), Phil Evans wrote:
> I am currently running Xen 4.2.1 (this has also happened in 4.2.0 as
> well). We have been having a major problem with sometimes huge
> amounts of clock drift in Windows VMs. Sometimes the clock on a VM
> could suddenly jump by over a week (usually forwards, however time has
> been known to go backwards as well).
>
> The steps to reproduce this (for me at least), is to simply do a
> manual NTP sync on a Windows VM. Upon monitoring the qemu-dm log file
> for the VM, I see similar to the following:

Which version of Windows are you using for this? Did you see this on older (4.1.x) Xen versions?

> Time offset set 489, added offset 480
> Time offset set 436, added offset -53
> Time offset set 496, added offset 60
> Time offset set 494, added offset -2
> Time offset set 554, added offset 60
> Time offset set 565, added offset 11
> Time offset set 606, added offset 41
> Time offset set -1974, added offset -2580
> Time offset set 1626, added offset 3600
> Time offset set 1579, added offset -47
> Time offset set 1639, added offset 60
>
> It seems to add the same number of seconds to the offset as has passed
> since the last sync.

This printout is from some code that gets given a _change_ in time offset from Xen; it prints out the new value and the change, so they should always add up like that. But yes, it's striking that the VM is (mostly) drifting forward over time.

> The offset just keeps on increasing, eventually
> resulting in huge numbers equating to days. Occasionally the offset
> may jump a bit and go down but the general trend is up. Although this
> does not affect the VM immediately, at some point I am guessing it
> syncs itself with the CMOS clock (which is now a large number of
> seconds offset from the actual time), resulting in a huge jump in
> time. A reboot is a guaranteed way to get the new, incorrect time.

That makes sense; if the RTC is being set to the wrong time, a reboot will copy the error into the new OS time.

> Although I do not understand all of the underlying code, I presume the
> correct way this should work is it should be comparing the CMOS time
> that's just been set with the hardware clock on the physical machine,
> resulting in an offset between the two.

More or less. In fact IIRC it compares it with current CMOS time, and propagates the difference into an offset from hardware-clock.

It's possible that the code to calculate 'current CMOS time' for a VM is buggy -- that was changed in 4.2. Cc'ing the people who touched that code in 4.2 for their opinions.

Tim.
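
The point that each log line prints the new offset together with the change that produced it can be checked against the lines Phil posted. The following is only a minimal sketch in Python: the regular expression and the function names are made up for this illustration and are not part of qemu-dm, and the pattern matches nothing beyond the log format quoted in this thread.

```python
import re

# Log lines copied from Phil's qemu-dm excerpt quoted in this thread.
LOG = """\
Time offset set 489, added offset 480
Time offset set 436, added offset -53
Time offset set 496, added offset 60
Time offset set 494, added offset -2
Time offset set 554, added offset 60
Time offset set 565, added offset 11
Time offset set 606, added offset 41
Time offset set -1974, added offset -2580
Time offset set 1626, added offset 3600
Time offset set 1579, added offset -47
Time offset set 1639, added offset 60
"""

# Hypothetical pattern for this sketch; it only covers the format shown above.
LINE_RE = re.compile(r"Time offset set (-?\d+), added offset (-?\d+)")


def parse(log):
    """Return a list of (new_offset, delta) pairs, one per matching line."""
    return [(int(m.group(1)), int(m.group(2))) for m in LINE_RE.finditer(log)]


def is_consistent(entries):
    """True if every new offset equals the previous offset plus its delta."""
    return all(cur[0] == prev[0] + cur[1]
               for prev, cur in zip(entries, entries[1:]))


entries = parse(LOG)
print(is_consistent(entries))          # True: the lines add up as described
print(sum(d for _, d in entries))      # 1630: net forward drift in seconds
```

So the printout itself is internally consistent; what stands out is the net forward drift of 1630 seconds over these eleven updates.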
Tim Deegan
2013-Jan-17 16:41 UTC
[PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
> It's possible that the code to calculate 'current CMOS time' for a VM is
> buggy -- that was changed in 4.2. Cc'ing the people who touched that
> code in 4.2 for their opinions.

In fact I think it's the code that handles CMOS writes. The attached patch fixes the issue for me; can you try it on your system?

(Jan, this is a candidate for applying to 4.2 as well as unstable.)

Cheers,

Tim.
Jan Beulich
2013-Jan-17 17:02 UTC
[PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
>>> On 17.01.13 at 17:41, Tim Deegan <tim@xen.org> wrote:
> At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
>> It's possible that the code to calculate 'current CMOS time' for a VM is
>> buggy -- that was changed in 4.2. Cc'ing the people who touched that
>> code in 4.2 for their opinions.
>
> In fact I think it's the code that handles CMOS writes. The attached
> patch fixes the issue for me; can you try it on your system?

Looks plausible, so feel free to put my ack on it if it also helps Phil.

> (Jan, this is a candidate for applying to 4.2 as well as unstable.)

I agree, but the code you touch here hasn't been changed for years (and even the involved helper functions changed only marginally between 4.1 and 4.2), so wouldn't that have been a problem for much longer? In which case it ought to also go into 4.1?

Jan
Tim Deegan
2013-Jan-17 17:13 UTC
Re: [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
At 17:02 +0000 on 17 Jan (1358442159), Jan Beulich wrote:
> >>> On 17.01.13 at 17:41, Tim Deegan <tim@xen.org> wrote:
> > At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
> >> It's possible that the code to calculate 'current CMOS time' for a VM is
> >> buggy -- that was changed in 4.2. Cc'ing the people who touched that
> >> code in 4.2 for their opinions.
> >
> > In fact I think it's the code that handles CMOS writes. The attached
> > patch fixes the issue for me; can you try it on your system?
>
> Looks plausible, so feel free to put my ack on it if it also helps
> Phil.
>
> > (Jan, this is a candidate for applying to 4.2 as well as unstable.)
>
> I agree, but the code you touch here hasn't been changed for
> years (and even the involved helper functions changed only
> marginally between 4.1 and 4.2), so wouldn't that have been
> a problem for much longer? In which case it ought to also go
> into 4.1?

The bug was introduced in 24974:6deb0b626f3f, when the 1-second timer that used to keep those registers up-to-date was removed; AFAICT that's only in 4.2.

Tim.
Zhang, Yang Z
2013-Jan-18 01:45 UTC
Re: [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
Tim Deegan wrote on 2013-01-18:
> At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
>> It's possible that the code to calculate 'current CMOS time' for a VM is
>> buggy -- that was changed in 4.2. Cc'ing the people who touched that
>> code in 4.2 for their opinions.
>
> In fact I think it's the code that handles CMOS writes. The attached
> patch fixes the issue for me; can you try it on your system?

Right. The current CMOS time must be renewed before writing.

Best regards,
Yang
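
The failure mode discussed in this thread (time registers that are no longer kept up to date, and a write path that computes its delta from them) can be illustrated with a toy model. This is a sketch under stated assumptions and not the Xen code or the attached patch, neither of which is reproduced in the thread: the class `ToyRTC`, its fields and the `simulate` helper are all hypothetical, and the model only assumes that a write folds "new time minus cached register time" into the offset.

```python
class ToyRTC:
    """Toy model of an emulated RTC. Hypothetical; NOT the Xen implementation."""

    def __init__(self, host_now, refresh_before_write):
        self.offset = 0            # guest offset from the host clock, in seconds
        self.cached = host_now     # last value copied into the time registers
        self.refresh_before_write = refresh_before_write

    def current(self, host_now):
        """RTC time the guest should see: host clock plus offset."""
        return host_now + self.offset

    def read(self, host_now):
        """A guest read brings the cached registers up to date."""
        self.cached = self.current(host_now)
        return self.cached

    def write(self, host_now, new_time):
        """A guest write folds (new - old) into the offset and returns the delta."""
        if self.refresh_before_write:
            # Assumed fix: renew the cached time before computing the delta.
            self.cached = self.current(host_now)
        delta = new_time - self.cached
        self.offset += delta
        self.cached = new_time
        return delta


def simulate(refresh_before_write, syncs=5, interval=60):
    """The guest writes the correct time every `interval` seconds, never reading."""
    rtc = ToyRTC(host_now=0, refresh_before_write=refresh_before_write)
    offsets = []
    for i in range(1, syncs + 1):
        now = i * interval
        rtc.write(now, new_time=now)   # NTP-style sync to the true time
        offsets.append(rtc.offset)
    return offsets


print(simulate(refresh_before_write=False))  # [60, 120, 180, 240, 300]
print(simulate(refresh_before_write=True))   # [0, 0, 0, 0, 0]
```

In the stale variant every sync adds the time elapsed since the previous one, which resembles the repeated "added offset 60" lines in Phil's log; with the refresh in place the offset stays at 0 even though the guest writes exactly the same values.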
Phil Evans
2013-Jan-18 11:17 UTC
Re: [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
On Thu, 17 Jan 2013 16:41:13 +0000, Tim Deegan <tim@xen.org> wrote:
> At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
>> It's possible that the code to calculate 'current CMOS time' for a VM is
>> buggy -- that was changed in 4.2. Cc'ing the people who touched that
>> code in 4.2 for their opinions.
>
> In fact I think it's the code that handles CMOS writes. The attached
> patch fixes the issue for me; can you try it on your system?

I have tried the patch and it fixes the issue for me as well.

> (Jan, this is a candidate for applying to 4.2 as well as unstable.)
>
> Cheers,
>
> Tim.
Pasi Kärkkäinen
2013-Jan-24 13:38 UTC
Re: [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
On Fri, Jan 18, 2013 at 11:17:14AM +0000, Phil Evans wrote:
> On Thu, 17 Jan 2013 16:41:13 +0000, Tim Deegan <tim@xen.org> wrote:
> > At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
> >> It's possible that the code to calculate 'current CMOS time' for a VM is
> >> buggy -- that was changed in 4.2. Cc'ing the people who touched that
> >> code in 4.2 for their opinions.
> >
> > In fact I think it's the code that handles CMOS writes. The attached
> > patch fixes the issue for me; can you try it on your system?
>
> I have tried the patch and it fixes the issue for me as well.
>
> > (Jan, this is a candidate for applying to 4.2 as well as unstable.)
> >

Hello,

Jan: Is this patch on your list to apply to xen-4.2-testing.hg as well?

-- Pasi
Jan Beulich
2013-Jan-24 13:56 UTC
Re: [PATCH] Re: Ever increasing time offset for HVM domain / Huge amounts of drift
>>> On 24.01.13 at 14:38, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> On Fri, Jan 18, 2013 at 11:17:14AM +0000, Phil Evans wrote:
>> On Thu, 17 Jan 2013 16:41:13 +0000, Tim Deegan <tim@xen.org> wrote:
>> > At 13:55 +0000 on 17 Jan (1358430927), Tim Deegan wrote:
>> >> It's possible that the code to calculate 'current CMOS time' for a VM is
>> >> buggy -- that was changed in 4.2. Cc'ing the people who touched that
>> >> code in 4.2 for their opinions.
>> >
>> > In fact I think it's the code that handles CMOS writes. The attached
>> > patch fixes the issue for me; can you try it on your system?
>>
>> I have tried the patch and it fixes the issue for me as well.
>>
>> > (Jan, this is a candidate for applying to 4.2 as well as unstable.)
>> >
>
> Jan: Is this patch in your list to apply to xen-4.2-testing.hg as well?

Yes, of course. I merely want this to go through our 4.2 based trees before pushing it out, as there's no rush currently.

Jan