Andy Smith
2011-Sep-20 21:02 UTC
[Xen-users] 47s clock skew in domUs every so often, Debian squeeze packaged hypervisor 4.0.1-2
Hi,

Ever since upgrading a particular machine from Debian lenny with Xen 3.3.x to Debian squeeze with xen-hypervisor-4.0-amd64 4.0.1-2 back in June, it's been inflicting an intermittent 47s clock skew on multiple domUs.

All the domUs run ntpd, but only one of them has an ntpd which seems to detect it. Example:

Sep 20 16:11:39 test0 ntpd[1802]: time reset -46.869813 s
Sep  6 12:58:41 test0 ntpd[1742]: time reset -46.869301 s
Jul 27 11:19:35 test0 ntpd[1742]: time reset -46.869786 s
Jun 29 19:05:10 test0 ntpd[1708]: time reset -46.868480 s

Another one runs heartbeat and it similarly gets upset:

Sep 20 15:54:56 foo2 heartbeat: [1407]: WARN: Late heartbeat: Node foo2: interval 47000 ms
Sep  6 12:43:02 foo2 heartbeat: [1407]: WARN: Late heartbeat: Node foo2: interval 47590 ms
(logs go back no further; I expect the same happened on matching dates)

Its ntpd doesn't seem to complain, though.

There are some customer domUs on this host running various versions of Debian and CentOS (both 2.6.18 backported Xen kernels and 2.6.36+ pvops), on which dovecot sees the time go backwards and kills itself, but I don't have direct access to these domUs so can't report much other than that.

There are other domUs which don't complain but this could be due to them having no software which cares; I have to investigate more to see if something is actually happening on them.

Has anyone ever seen this before? I'm thinking this is a bug in Xen because it never happened in Xen 3.x, has only happened since the upgrade, and the way it happens on the same day in multiple domUs (of different Linux distributions) suggests dom0 or hypervisor to me. I also have other hosts running Xen 4.x where this doesn't happen.

dom0 is a Supermicro X7DCL-i, Xeon 5410 running amd64 Debian squeeze and the distribution Xen packages.
Some info on the various hosts involved:

         Distribution    Kernel
------------------------------------------
dom0     Debian squeeze  2.6.32-5-xen-amd64
foo2     Debian squeeze  2.6.32-5-686-bigmem
test0    Debian lenny    2.6.26-2-686-bigmem
bar0     Debian squeeze  2.6.32-5-686-bigmem

         Current      Available
         clocksource  clocksources
-----------------------------------
dom0     xen          xen
foo2     tsc          xen tsc
test0    xen          xen tsc jiffies
bar0     xen          xen tsc

Any ideas?

Cheers,
Andy

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
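To compare the size and timing of the steps across domUs, the ntpd "time reset" lines can be pulled out of syslog. The following is only a minimal sketch: the function name ntp_resets is made up for illustration, and it assumes the classic syslog layout shown in the log excerpts above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read syslog text on stdin and
# print "month day time host offset" for every ntpd "time reset" line.
# Assumed line layout, as in the excerpts above:
#   Mon DD HH:MM:SS host ntpd[pid]: time reset <offset> s
ntp_resets() {
    awk '$5 ~ /^ntpd\[/ && $6 == "time" && $7 == "reset" {
        print $1, $2, $3, $4, $8
    }'
}
```

On a Debian guest this could be run as "ntp_resets < /var/log/syslog" (the log path is an assumption and depends on the syslog configuration). Lining up the output from several domUs should show whether the steps really coincide.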
Steve Allison
2011-Sep-21 09:40 UTC
Re: [Xen-users] 47s clock skew in domUs every so often, Debian squeeze packaged hypervisor 4.0.1-2
On 20/09/2011 22:02, Andy Smith wrote:
> Ever since upgrading a particular machine from Debian lenny with Xen
> 3.3.x to Debian squeeze with xen-hypervisor-4.0-amd64 4.0.1-2 back
> in June, it's been inflicting an intermittent 47s clock skew on
> multiple domUs.
> [...]

Add this to /etc/default/grub

GRUB_CMDLINE_XEN_DEFAULT="clocksource=pit"

and then run update-grub

This seems to make the clock more stable. It still loses some time but it's not as bad and the domUs don't jerk back to normal.

--
May the ping be with you ..
Steven Timm
2011-Sep-21 13:55 UTC
Re: [Xen-users] 47s clock skew in domUs every so often, Debian squeeze packaged hypervisor 4.0.1-2
What's your setting for /proc/sys/xen/independent_wallclock?

Steve

On Wed, 21 Sep 2011, Steve Allison wrote:
> Add this to /etc/default/grub
>
> GRUB_CMDLINE_XEN_DEFAULT="clocksource=pit"
>
> and then update-grub
> [...]

--
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@fnal.gov  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Group Leader.
Lead of FermiCloud project.
Steve Allison
2011-Sep-21 14:18 UTC
Re: [Xen-users] 47s clock skew in domUs every so often, Debian squeeze packaged hypervisor 4.0.1-2
On 21/09/2011 14:55, Steven Timm wrote:
> What's your setting for /proc/sys/xen/independent_wallclock?
>
> Steve

$ cat /proc/sys/xen/independent_wallclock
0
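Not every guest kernel provides that file, so a check that runs across several domUs needs to cope with it being absent. A minimal sketch follows; the function name and the optional path argument are made up for illustration. As far as the classic (non-pvops) Xen kernels go, 0 is understood to mean the guest follows the wallclock supplied by Xen, and 1 means it keeps its own.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the independent_wallclock
# setting, or a note when the running kernel does not provide the file.
# The optional first argument overrides the path, which is useful for testing.
show_independent_wallclock() {
    path=${1:-/proc/sys/xen/independent_wallclock}
    if [ -r "$path" ]; then
        printf 'independent_wallclock=%s\n' "$(cat "$path")"
    else
        printf 'independent_wallclock not present\n'
    fi
}
```

Called with no argument it reads the real /proc path quoted above.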
Andy Smith
2011-Sep-21 15:56 UTC
Re: [Xen-users] 47s clock skew in domUs every so often, Debian squeeze packaged hypervisor 4.0.1-2
Hi Steven,

On Wed, Sep 21, 2011 at 08:55:08AM -0500, Steven Timm wrote:
> What's your setting for /proc/sys/xen/independent_wallclock?

Not sure if that was for Steve or myself, but since most of the domUs I have access to are using pvops kernels they don't have an independent_wallclock setting at all.

Steve,

> On Wed, 21 Sep 2011, Steve Allison wrote:
>> On 20/09/2011 22:02, Andy Smith wrote:
>>>          Current      Available
>>>          clocksource  clocksources
>>> -----------------------------------
>>> dom0     xen          xen
>>> foo2     tsc          xen tsc
>>> test0    xen          xen tsc jiffies
>>> bar0     xen          xen tsc

[...]

>> Add this to /etc/default/grub
>>
>> GRUB_CMDLINE_XEN_DEFAULT="clocksource=pit"

Will that actually work given that dom0 does not list "pit" as an available clocksource?

If/when I do reboot the dom0, what should I be looking for in the xm dmesg and dom0 dmesg that will give some info about clock sources?

Cheers,
Andy
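For reference, the "current" and "available" clocksource columns quoted in this thread are what a Linux kernel reports through sysfs. A minimal sketch for collecting them on each host is below; the function name and the optional directory argument are made up for illustration. Note that these files describe the Linux kernel's own clocksource. The clocksource= option on the Xen command line, as far as the Xen documentation describes it, selects the hypervisor's platform timer, which is a separate setting.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the kernel's current and
# available clocksources as exposed by sysfs.
# The optional first argument overrides the sysfs directory, for testing.
show_clocksources() {
    dir=${1:-/sys/devices/system/clocksource/clocksource0}
    printf 'current:   %s\n' "$(cat "$dir/current_clocksource")"
    printf 'available: %s\n' "$(cat "$dir/available_clocksource")"
}
```

Run without arguments in dom0 and in each domU to rebuild the table.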
Steve Allison
2011-Sep-21 17:01 UTC
Re: [Xen-users] 47s clock skew in domUs every so often, Debian squeeze packaged hypervisor 4.0.1-2
On 21/09/2011 16:56, Andy Smith wrote:
> Will that actually work given that dom0 does not list "pit" as an
> available clocksource?
>
> If/when I do reboot the dom0, what should I be looking for in the xm
> dmesg and dom0 dmesg that will give some info about clock sources?

I think dom0 will only ever list "xen" as its clocksource because of the type of virtual machine it is. I think to get the true clocksource listing you need to boot without the hypervisor.

Whilst using pit I get the following lines...

$ xm dmesg | grep -i pit
(XEN) Command line: placeholder clocksource=pit
(XEN) Platform timer is 1.193MHz PIT
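The two steps discussed in this thread (setting the option, then confirming it after a reboot) could be scripted roughly as follows. This is a cautious sketch, not a tested procedure: both function names are made up for illustration, the first one deliberately refuses to touch a file that already sets GRUB_CMDLINE_XEN_DEFAULT, and the second assumes the "(XEN) Platform timer is ..." message format shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): append the Xen clocksource option
# to a grub defaults file. It does not merge with an existing setting; if the
# variable is already present it reports that and leaves the file alone.
set_xen_clocksource() {
    file=$1
    src=$2
    if grep -q '^GRUB_CMDLINE_XEN_DEFAULT=' "$file"; then
        echo "$file already sets GRUB_CMDLINE_XEN_DEFAULT; edit it by hand" >&2
        return 1
    fi
    printf 'GRUB_CMDLINE_XEN_DEFAULT="clocksource=%s"\n' "$src" >> "$file"
}

# Hypothetical helper: read hypervisor boot messages on stdin and print only
# what follows "(XEN) Platform timer is ".
platform_timer() {
    sed -n 's/^(XEN) Platform timer is //p'
}
```

A possible use is "set_xen_clocksource /etc/default/grub pit" followed by update-grub and a reboot, then "xm dmesg | platform_timer" to see which platform timer the hypervisor picked.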