Hi all,
I get this warning from the HVM DomU kernel (both PV on HVM and normal HVM):

checking TSC synchronization [CPU#0 -> CPU#1]:
Measured 116836520 cycles TSC warp between CPUs, turning off TSC clock.
Marking TSC unstable due to check_tsc_sync_source failed

The host CPU is the following:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 6
model name      : Genuine Intel(R) CPU 3.00GHz
stepping        : 2
cpu MHz         : 3000.050
cache size      : 2048 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 6
wp              : yes
flags           : fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc pni est cid hypervisor arat
bogomips        : 6000.10
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

It happens on both xen 4.0 (pre 21236) and 4.1.

If I specify tsc_mode=2, first I get:

checking TSC synchronization [CPU#0 -> CPU#1]: passed.

but a little bit afterwards:

Clocksource tsc unstable (delta = 116372610 ns)

If I use a PV on HVM kernel (pv timer enabled) and tsc_mode=1, besides
these messages I get about 20-30 messages like the following:

CE: xen increased min_delta_ns to 506250 nsec

Tracing them back to xen, I found out that they happen when the guest
kernel tries to set the next timer event in the past.

Does this mean that the host has some serious tsc issues?
Can this be a symptom of a bug in xen?
Suggestions are welcome.

Cheers,

Stefano

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
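The two guest messages above report the problem in different units (TSC cycles in one, nanoseconds in the other). A small sketch that converts both to milliseconds may make them easier to compare. It assumes the guest TSC runs at the "cpu MHz" rate shown in the cpuinfo output; the function name is made up for illustration.

```python
def cycles_to_ms(cycles: int, cpu_mhz: float) -> float:
    """Convert a TSC cycle count to milliseconds at the given clock rate."""
    # 1 MHz = 1e6 cycles per second = 1e3 cycles per millisecond.
    cycles_per_ms = cpu_mhz * 1_000.0
    return cycles / cycles_per_ms


# Values copied from the messages in the report above.
warp_cycles = 116836520   # "Measured 116836520 cycles TSC warp between CPUs"
cpu_mhz = 3000.050        # "cpu MHz" line (assumed to be the guest TSC rate)
delta_ns = 116372610      # "Clocksource tsc unstable (delta = 116372610 ns)"

print(f"TSC warp between CPUs: {cycles_to_ms(warp_cycles, cpu_mhz):.2f} ms")
print(f"clocksource delta:     {delta_ns / 1e6:.2f} ms")
```

Under that assumption the warp is roughly 39 ms and the clocksource delta roughly 116 ms, so both are in the tens-of-milliseconds range rather than a handful of cycles.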
On 13/07/2010 15:37, "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com> wrote:

> Does this mean that the host has some serious tsc issues?
> Can this be a symptom of a bug in xen?
> Suggestions are welcome.

The 's' and 't' debug key handlers will be useful to get an idea of how
stable host TSCs are.

 -- Keir
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
>
> On 13/07/2010 15:37, "Stefano Stabellini"
> <Stefano.Stabellini@eu.citrix.com> wrote:
>
> > Does this mean that the host has some serious tsc issues?
> > Can this be a symptom of a bug in xen?
> > Suggestions are welcome.
>
> The 's' and 't' debug key handlers will be useful to get an idea of how
> stable host TSCs are.
>
>  -- Keir

Also you can try max_cstate=0 as a Xen boot parameter to rule
out power management screwing up the tsc.

> > Does this mean that the host has some serious tsc issues?

Probably. But the default tsc_mode (0) is intended to hide all
such issues. Could you check the 's' debug-key output to
ensure your guest is actually running with tsc_mode=0?

> > Can this be a symptom of a bug in xen?

Well, if the guest has problems with the default tsc_mode (0),
which does complete tsc emulation, I suppose it could be
a bug in Xen. In particular, I wonder if the code that
recovers from deep C-states (and writes to the TSC) is broken.
IIRC, there were some changesets in that area recently.
If the problem goes away with max_cstate=0, that would
be a good place to start.

Dan
On Tue, 13 Jul 2010, Dan Magenheimer wrote:
> > From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> >
> > On 13/07/2010 15:37, "Stefano Stabellini"
> > <Stefano.Stabellini@eu.citrix.com> wrote:
> >
> > > Does this mean that the host has some serious tsc issues?
> > > Can this be a symptom of a bug in xen?
> > > Suggestions are welcome.
> >
> > The 's' and 't' debug key handlers will be useful to get an idea of how
> > stable host TSCs are.
> >
> >  -- Keir
>
> Also you can try max_cstate=0 as a Xen boot parameter to rule
> out power management screwing up the tsc.
>
> > > Does this mean that the host has some serious tsc issues?
>
> Probably. But the default tsc_mode (0) is intended to hide all
> such issues. Could you check the 's' debug-key output to
> ensure your guest is actually running with tsc_mode=0?
>

This is the output of 's' and 't' without max_cstate=0:

(XEN) Synced stime skew: max=245ns avg=202ns samples=2 current=160ns
(XEN) Synced cycles skew: max=615 avg=577 samples=2 current=540
(XEN) TSC has constant rate, deep Cstates possible, so not reliable, warp=0 (count=2)
(XEN) dom3(hvm): mode=0,ofs=0x2b2e19a77ea,khz=3000048,inc=1,vtsc count: 1211682 total

This is the output of 's' and 't' with max_cstate=0:

(XEN) Synced stime skew: max=110ns avg=105ns samples=2 current=110ns
(XEN) Synced cycles skew: max=1020 avg=652 samples=2 current=285
(XEN) TSC has constant rate, no deep Cstates, passed warp test, deemed reliable, warp=0 (count=2)
(XEN) dom2(hvm): mode=0,ofs=0xb748091f5,khz=3000032,inc=1,vtsc count: 758954 total

I still get the same warning from the guest.

I started to wonder why the guest is seeing such a big tsc warp when xen
is seeing 0, so I added more tracing and eventually I found out that the
value of v->arch.hvm_vcpu.stime_offset is significantly different
between the two vcpus, and the difference increases after the scaling.
Then I added timer_mode=1 to my vm config file and the problem went
away.
I think that delay_for_missed_ticks shouldn't cause tsc skew in
the guest.
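When comparing several runs of the 's' and 't' debug-key output like the two above, it can help to pull the numbers out mechanically. The following is a minimal sketch that parses only the line shapes quoted in this message; the regular expressions and the function name are illustrative assumptions, not a format that Xen guarantees.

```python
import re

# Patterns written against the exact lines quoted in this thread (assumption:
# other Xen versions may print these lines differently).
DOMAIN_RE = re.compile(
    r"dom(?P<domid>\d+)\((?P<kind>\w+)\): "
    r"mode=(?P<mode>\d+),ofs=(?P<ofs>0x[0-9a-fA-F]+),"
    r"khz=(?P<khz>\d+),inc=(?P<inc>\d+)"
)
SKEW_RE = re.compile(
    r"Synced (?P<what>stime|cycles) skew: max=(?P<max>\d+)(?:ns)? "
    r"avg=(?P<avg>\d+)(?:ns)? samples=(?P<samples>\d+) "
    r"current=(?P<current>\d+)"
)


def parse_debug_output(text):
    """Return (domains, skews) extracted from 's'/'t' debug-key output."""
    domains, skews = [], {}
    for line in text.splitlines():
        m = DOMAIN_RE.search(line)
        if m:
            domains.append({
                "domid": int(m["domid"]),
                "kind": m["kind"],
                "mode": int(m["mode"]),      # the tsc_mode the guest runs with
                "ofs": int(m["ofs"], 16),
                "khz": int(m["khz"]),
            })
            continue
        m = SKEW_RE.search(line)
        if m:
            skews[m["what"]] = {
                k: int(m[k]) for k in ("max", "avg", "samples", "current")
            }
    return domains, skews


SAMPLE = """\
(XEN) Synced stime skew: max=245ns avg=202ns samples=2 current=160ns
(XEN) Synced cycles skew: max=615 avg=577 samples=2 current=540
(XEN) TSC has constant rate, deep Cstates possible, so not reliable, warp=0 (count=2)
(XEN) dom3(hvm): mode=0,ofs=0x2b2e19a77ea,khz=3000048,inc=1,vtsc count: 1211682 total
"""

domains, skews = parse_debug_output(SAMPLE)
for d in domains:
    print(f"dom{d['domid']} ({d['kind']}): tsc_mode={d['mode']} khz={d['khz']}")
print(f"max stime skew: {skews['stime']['max']} ns")
```

The per-domain line is the one that answers the earlier question about whether the guest is really running with tsc_mode=0.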
On 13/07/2010 18:39, "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com> wrote:

> I started to wonder why the guest is seeing such a big tsc warp when xen
> is seeing 0, so I added more tracing and eventually I found out that the
> value of v->arch.hvm_vcpu.stime_offset is significantly different
> between the two vcpus, and the difference increases after the scaling.
> Then I added timer_mode=1 to my vm config file and the problem went
> away.
> I think that delay_for_missed_ticks shouldn't cause tsc skew in
> the guest.

Well, timer_mode=1 is the default and I doubt in all seriousness that the
other modes get any use or testing.

 -- Keir
/me wonders if timer_mode=1 is the default for xl?
Or only for xm?

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Tuesday, July 13, 2010 11:48 AM
> To: Stefano Stabellini; Dan Magenheimer
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] xen tsc problems?
>
> On 13/07/2010 18:39, "Stefano Stabellini"
> <Stefano.Stabellini@eu.citrix.com> wrote:
>
> > I started to wonder why the guest is seeing such a big tsc warp when xen
> > is seeing 0, so I added more tracing and eventually I found out that the
> > value of v->arch.hvm_vcpu.stime_offset is significantly different
> > between the two vcpus, and the difference increases after the scaling.
> > Then I added timer_mode=1 to my vm config file and the problem went
> > away.
> > I think that delay_for_missed_ticks shouldn't cause tsc skew in
> > the guest.
>
> Well, timer_mode=1 is the default and I doubt in all seriousness that
> the other modes get any use or testing.
>
>  -- Keir
On 13/07/2010 18:48, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:

>> I started to wonder why the guest is seeing such a big tsc warp when xen
>> is seeing 0, so I added more tracing and eventually I found out that the
>> value of v->arch.hvm_vcpu.stime_offset is significantly different
>> between the two vcpus, and the difference increases after the scaling.
>> Then I added timer_mode=1 to my vm config file and the problem went
>> away.
>> I think that delay_for_missed_ticks shouldn't cause tsc skew in
>> the guest.
>
> Well, timer_mode=1 is the default and I doubt in all seriousness that the
> other modes get any use or testing.

To give you an idea how long it's probably been broken, my suspicion is that
the culprit is xen-unstable:17716, which is over two years old. That patch
changed HVM time handling to base it more on Xen system time. The fact that
hvm_set_guest_time() no longer directly affects guest TSC is probably the
problem here. I think delay_for_missed_ticks might depend on that. Anyway,
I'm not certain but I'd put money on it.

 -- Keir
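To illustrate why a per-vCPU stime_offset mismatch would show up in the guest as a TSC warp even though the host reports warp=0, here is a toy model. It is not Xen's code: it simply assumes, as a simplification of the description in this thread, that the guest TSC is computed as (system time + per-vCPU offset) scaled by the guest TSC frequency. All function and parameter names are hypothetical.

```python
def guest_tsc(system_time_ns: int, stime_offset_ns: int, tsc_khz: int) -> int:
    """Toy model: guest TSC derived from system time plus a per-vCPU offset.

    tsc_khz is the guest TSC rate in kHz, i.e. cycles per millisecond, so
    nanoseconds * tsc_khz / 1e6 gives cycles.
    """
    guest_time_ns = system_time_ns + stime_offset_ns
    return guest_time_ns * tsc_khz // 1_000_000


def warp_between(offset_a_ns: int, offset_b_ns: int, tsc_khz: int,
                 system_time_ns: int = 0) -> int:
    """Cycle difference two vCPUs would observe at the same system time."""
    tsc_a = guest_tsc(system_time_ns, offset_a_ns, tsc_khz)
    tsc_b = guest_tsc(system_time_ns, offset_b_ns, tsc_khz)
    return abs(tsc_a - tsc_b)


# Hypothetical numbers: offsets that differ by 1 ms on a 3 GHz guest TSC.
print(warp_between(0, 1_000_000, 3_000_000), "cycles of apparent warp")

# Identical offsets give no warp, whatever the host system time is.
print(warp_between(500, 500, 3_000_048, system_time_ns=123_456_789))
```

In this model the host TSCs never enter the picture, and the scaling step multiplies any offset difference by about 3 cycles per nanosecond at 3 GHz, which would be consistent with the observation that the difference grows after the scaling.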
On Tue, 13 Jul 2010, Dan Magenheimer wrote:
> /me wonders if timer_mode=1 is the default for xl?
> Or only for xm?

No, it is not.
xl defaults to 0; I am going to change it right now.
On 13/07/2010 19:14, "Stefano Stabellini" <Stefano.Stabellini@eu.citrix.com> wrote:

> On Tue, 13 Jul 2010, Dan Magenheimer wrote:
>> /me wonders if timer_mode=1 is the default for xl?
>> Or only for xm?
>
> No, it is not.
> xl defaults to 0; I am going to change it right now.

Possibly we should make timer_mode=1 the default in Xen as well, and
actually disallow setting it to 0. Clearly no good comes of it.

 -- Keir
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Tuesday, July 13, 2010 12:59 PM
> To: Stefano Stabellini; Dan Magenheimer
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] xen tsc problems?
>
> On 13/07/2010 19:14, "Stefano Stabellini"
> <Stefano.Stabellini@eu.citrix.com> wrote:
>
> > On Tue, 13 Jul 2010, Dan Magenheimer wrote:
> >> /me wonders if timer_mode=1 is the default for xl?
> >> Or only for xm?
> >
> > No, it is not.
> > xl defaults to 0; I am going to change it right now.
>
> Possibly we should make timer_mode=1 the default in Xen as well, and
> actually disallow setting it to 0. Clearly no good comes of it.

IIRC from >2 years ago, timer_mode=0 was best for older
HVM 32-bit Linux guests. Obviously if the code (interacting
with the tsc code) has bit-rotted, "best" is a relative term. :-)

Dan