Jürgen Keil
2007-Aug-13 11:00 UTC
Solaris dom0 system time jumps backwards, bug in dtrace_xpv_getsystime() ?
When a Solaris dom0 computes current system time in usr/src/uts/i86xpv/os/xpv_timestamp.c, function dtrace_xpv_getsystime(), it computes the TSC delta between a current TSC timestamp and the reference timestamp that can be found in the vcpu''s time info, converts the TSC delta into xen nanoseconds system time, and adds the system time found in the reference timstamp from vcpu''s time info: result = dst->system_time + ((uint64_t)(tsc_delta * (uint64_t)dst->tsc_to_system_mul) >> 32); I don''t see that the xen hypervisor guarantees that the reference time stamp is "recent"; in fact on my Tecra S1 laptop with my uppc experiments I''ve seen that the vcpu''s time info can get quite old, resulting in a big TSC delta (> 2**32 ticks), so that multiplying a > 2**32 tsc_delta with the 32-bit dst->tsc_to_system_mul scale factor overflows a 64-bit int. Result is that dtrace_xpv_getsystime() returns a "current" xen system timestamp that is several seconds behind current xen system time. Suggested fix: ============= A correct solution would multiply a 64-bit integer "tsc_delta" * 32-bit fraction "tsc_to_system_mul", giving a 96-bit result with 32-bit fraction part, which is then reduced to a 64-bit integer result. Something like this: result = dst->system_time + (uint64_t)(tsc_delta >> 32) * (uint64_t)dst->tsc_to_system_mul + (((uint64_t)(uint32_t)tsc_delta * (uint64_t)dst->tsc_to_system_mul) >> 32); --- wos_b66_xen/usr/src/uts/i86xpv/os/xpv_timestamp.c 2007-07-22 21:16:30.923548979 +0200 +++ wos_b66_xen_uppc/usr/src/uts/i86xpv/os/xpv_timestamp.c 2007-08-12 14:19:53.588835166 +0200 @@ -126,13 +126,28 @@ } while ((src->version & 1) | (dst->version ^ src->version)); + /* + * Compute system time as: + * + * system_time + + * ((((tsc - tsc_timestamp) << tsc_shift) * tsc_to_system_mul) >> 32) + * + * Since there''s no guarantee that ((tsc - tsc_timestamp) << tsc_shift) + * fits into a 32-bit integer we have to make sure that we support + * a full 64-bit * 32-bit multiplication, giving a 96-bit result, + * which is then reduced to a 64-bit value by truncating the 32-bit + * fractional part. + */ + if (dst->tsc_shift >= 0) tsc_delta <<= dst->tsc_shift; else if (dst->tsc_shift < 0) tsc_delta >>= -dst->tsc_shift; - result = dst->system_time + - ((uint64_t)(tsc_delta * (uint64_t)dst->tsc_to_system_mul) >> 32); + result = dst->system_time + + (uint64_t)(tsc_delta >> 32) * (uint64_t)dst->tsc_to_system_mul + + (((uint64_t)(uint32_t)tsc_delta * (uint64_t)dst->tsc_to_system_mul) + >> 32); t->t_preempt--; This message posted from opensolaris.org
John Levon
2007-Aug-13 12:30 UTC
Re: Solaris dom0 system time jumps backwards, bug in dtrace_xpv_getsystime() ?
On Mon, Aug 13, 2007 at 04:00:20AM -0700, J??rgen Keil wrote:> I don''t see that the xen hypervisor guarantees that the reference time stamp is "recent"; in factIIRC the calibration is re-done on every CPU once a second, it seems this should be enough (just?) to avoid any overflow. So I''d be interested to know exactly what''s happening when there''s a delta larger than 2**32 regards john
Juergen Keil
2007-Aug-13 14:03 UTC
Re: Solaris dom0 system time jumps backwards, bug in dtrace_xpv_getsystime() ?
John wrote,> On Mon, Aug 13, 2007 at 04:00:20AM -0700, J??rgen Keil wrote: > > > I don''t see that the xen hypervisor guarantees that the reference time stampis "recent"; in fact> > IIRC the calibration is re-done on every CPU once a second, it seems this > should be enough (just?) to avoid any overflow. So I''d be interested to know > exactly what''s happening when there''s a delta larger than 2**32The time base is updated from xen/common/schedule.c in function schedule(), every time we switch to a new vcpu and the new vcpu isn''t the "idle" thread: 605 /* Ensure that the domain has an up-to-date time base. */ 606 if ( !is_idle_vcpu(next) ) 607 { 608 update_vcpu_system_time(next); 609 if ( next->sleep_tick != sd->tick ) 610 send_timer_event(next); 611 } It seems the code at line 608 isn''t used, my Dom0 vcpu is probably busy 100% and is the only vcpu that is in use when I''m bootstraping the Solaris Dom0. The other time base update is in xen/common/schedule.c in function t_timer_fn(). Hmm, t_timer_fn() should be called every 10 milliseconds: 649 if ( !is_idle_vcpu(v) ) 650 { 651 update_vcpu_system_time(v); 652 send_timer_event(v); 653 } OTOH, we also have this test in xen/arch/x86/time.c update_vcpu_system_time(), that skips the time base update when the local_tsc_stamp hasn''t changed: 691 void update_vcpu_system_time(struct vcpu *v) 692 { 693 if ( v->vcpu_info->time.tsc_timestamp ! 694 this_cpu(cpu_time).local_tsc_stamp ) 695 __update_vcpu_system_time(v); 696 } I think I''ll have to do a few more experiments, e.g. if we ever use this "goto out;" in xen/arch/x86/time.c local_time_calibration(). I think it could explain why local_tsc_stamp hasn''t changed: 788 /* 789 * Weirdness can happen if we lose sync with the platform timer. 790 * We could be smarter here: resync platform timer with local timer? 791 */ 792 if ( ((s64)stime_elapsed64 < (EPOCH / 2)) ) 793 goto out;
Jürgen Keil
2007-Aug-14 09:56 UTC
Re: Solaris dom0 system time jumps backwards, bug in dtrace_xpv_getsystim
I wrote> John wrote, > > > On Mon, Aug 13, 2007 at 04:00:20AM -0700, Jürgen Keil wrote: > > > > > I don't see that the xen hypervisor guarantees that the reference time stamp > > > is "recent"; in fact > > > > IIRC the calibration is re-done on every CPU once a second, it seems this > > should be enough (just?) to avoid any overflow. So I'd be interested to know > > exactly what's happening when there's a delta larger than 2**32> I think I'll have to do a few more experiments, e.g. if we ever use this > "goto out;" in xen/arch/x86/time.c local_time_calibration(). I think it > could explain why local_tsc_stamp hasn't changed: > > 788 /* > 789 * Weirdness can happen if we lose sync with the platform timer. > 790 * We could be smarter here: resync platform timer with local timer? > 791 */ > 792 if ( ((s64)stime_elapsed64 < (EPOCH / 2)) ) > 793 goto out;Yep, at some point during the Solaris dom0 boot, it starts to use that "goto out". Initially, when I'm stopped in "kmdb -d" it doesn't use that "goto out". The platform timer value that is returned by calling read_platform_stime() stops incrementing at some point; "stime_elapsed64 = curr_master_stime - prev_master_stime;" is always less than 0.5 seconds. My Tecra S1 is using the PIT platform timer ((XEN) Platform timer is 1.193MHz PIT). After the platform timer is dead, there are no more vcpu timestamp information updates. (And with my change in dtrace_xpv_getsystime(), dom0 survives this state) The platform timer gets broken as soon as Solaris dom0 is enumerating isa devices, using acpica. This will initialize the acpica subsystem, and breaks the PIT timer when the notebook is switched into acpi mode (i.e. when the "acpi_enable" command byte from the ACPI FADT table is sent to the "smc_cmd" I/O port). By booting the Solaris dom0 kernel with "-B acpi-user-options=8" I can boot dom0 into single user mode without breaking the PIT platform timer. Question is: why does enabling acpi mode break the PIT timer? It seems this does not happen on all x86 systems, though. On another system, ASUS M2NPV-VM, nVidia chipset, AMD64 X2 AM2 cpu, there is a BIOS setup option to disable the HPET timer, and with this setup the xen hypervisor is using the PIT platform timer, too. But on this box a Solaris dom0 is booting just fine. This message posted from opensolaris.org _______________________________________________ xen-discuss mailing list xen-discuss@opensolaris.org
David Edmondson
2007-Aug-14 10:04 UTC
Re: Solaris dom0 system time jumps backwards, bug in dtrace_xpv_getsystim
On Tue, Aug 14, 2007 at 02:56:20AM -0700, J?rgen Keil wrote:> Question is: why does enabling acpi mode break the PIT timer?Max has been looking at a bug where prodding the EHCI controller on one system ''broke'' the PIT (changed the frequency). This was discovered to be caused by the SMM BIOS (which ran off and did things, presumably related to disabling PS2 device emulation). Could this be the same kind of thing? dme.
Jürgen Keil
2007-Aug-14 11:19 UTC
Re: Solaris dom0 system time jumps backwards, bug in dtrace_xpv_getsystim
I wrote> Question is: why does enabling acpi mode break the PIT timer?This message on the xen-devel list seems to describe a problem that is similar (enabling ACPI mode blocks interrupts for a long time, breaks hypervisor''s PIT timer): http://lists.xensource.com/archives/html/xen-devel/2007-05/msg01054.html OTOH, the PIT timer on my Tecra S1 seems to stop completely (or stops sending interrupts?), while the above message on xen-devel sounds as if the PIT platform timer just gets out of sync with the TSC derived time when ACPI mode is enabled... It seems this changeset from xen-unstable is supposed to work around the issue, by implementing yet another platform timer source: changeset: Add support for ACPI PM Timer as platform clock source. changeset 15189: 2d7d33ac982a parent 15188: ae073ca6eb76 child 15190: c9d66baad22b author: kfraser@localhost.localdomain date: Wed May 30 18:12:02 2007 +0100 (2 months ago) files: xen/arch/x86/time.c xen/include/asm-x86/config.h description: Add support for ACPI PM Timer as platform clock source. Signed-off-by: Keir Fraser <keir@xensource.com> This message posted from opensolaris.org
Jürgen Keil
2007-Aug-17 17:22 UTC
Re: Solaris dom0 system time jumps backwards, bug in dtrace_xpv_getsystim
David wrote> On Tue, Aug 14, 2007 at 02:56:20AM -0700, Jürgen Keil wrote: > > Question is: why does enabling acpi mode break the PIT timer? > > Max has been looking at a bug where prodding the EHCI controller on > one system 'broke' the PIT (changed the frequency).While browsing bug reports I just discovered that one... Bug ID 6563730 Synopsis clock goes too fast on dom0 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6563730> This was > discovered to be caused by the SMM BIOS (which ran off and did things, > presumably related to disabling PS2 device emulation). Could this be > the same kind of thing?I think 6563730 is quite similar, but not 100% the same. With the Tecra S1 the contents of the 61h I/O port register (NMI Status and Control Register) is lost when the system is switched into ACPI mode; the lower two bits in port 61h control routing the PIT 2 timer output to the PC's speaker and can be used to enable/disable the PIT 2 timer. (With 6563730, PIT#2's frequency is messed up, it seems) On the Tecra S1 (Intel ICH4-M chipset) ACPI mode is enabled by writing a command byte E0h to the APM_CNT "Advanced Power Management Control Port" register at B2h. On bare metal (standard Nevada Kernel booted on Tecra S1 into kmdb) I can reproduce the problem like this: Read back current status for PIT#2: 43::out c8 42::in (returns 0xC4) 42::in (returns 0) 42::in (returns 0) That is, PIT 2 timer is configured for binary countdown, mode 2 (first byte read from 42h port: C4h). And current counter value of 0x0000 (bytes 2 & 3 read from port 42h). 61::in (returns 0x2c) That is, PIT 2 is disabled (bit 0 == 0) , and not routed to the speaker (bit 1 == 0). Now I program PIT 2 and enable it: 43::out b6 42::out a0 42::out f 61::out f PIT 2, binary countdown, mode 3, 16-bit counter, count = 0x0fa0; I enable passing PIT 2 output to the speaker and PIT 2 is enabled. I'm hearing some 300Hz sound from the speaker. Now, enable ACPI mode, and read back 61h: b2::out e0 61::in (returns 30) I/O Port 61h has lost the lower four bits; that is, the PIT 2 is disabled and not routed to the speaker any more! The 300Hz sound on the speaker has stopped. (Read backs for PIT 2 show that the counter value isn't changing any more...) By writing back 0x0f to I/O port 61h I get back the 300Hz sound, and the PIT 2 counder starts changing once again) : 61::out f Currently, I changed local_time_calibration() in the hypervisor to check if PIT 2 is still enabled when syncronization loss between TSC time and the platform timer is detected. When I detect "inb(0x61) & 1 == 0" I re-enable the PIT 2 timer. Now dom0 starts to become bootable on the Tecra S1 without having to disable ACPI (well - next problem is some issue with the EHCI controller in ACPI mode, but that seems to be a new issue 8-) I checked five other boxes (2x current AMD64 system with nVidia chipset, 1 notebook with VIA chipset, 1 Intel PIIX chipset, 1 Intel ICH7-M chipset) and none messed with the contents of I/O port 61h while enabling ACPI mode. This message posted from opensolaris.org _______________________________________________ xen-discuss mailing list xen-discuss@opensolaris.org
Max Zhen
2007-Aug-18 01:39 UTC
Re: Solaris dom0 system time jumps backwards, bug in dtrace_xpv_getsystim
Jürgen Keil wrote:> David wrote > >> On Tue, Aug 14, 2007 at 02:56:20AM -0700, Jürgen Keil wrote: >> >>> Question is: why does enabling acpi mode break the PIT timer? >>> >> Max has been looking at a bug where prodding the EHCI controller on >> one system 'broke' the PIT (changed the frequency). >> > > While browsing bug reports I just discovered that one... > > Bug ID 6563730 > Synopsis clock goes too fast on dom0 > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6563730 > > >> This was >> discovered to be caused by the SMM BIOS (which ran off and did things, >> presumably related to disabling PS2 device emulation). Could this be >> the same kind of thing? >> > > I think 6563730 is quite similar, but not 100% the same. > > With the Tecra S1 the contents of the 61h I/O port register > (NMI Status and Control Register) is lost when the system > is switched into ACPI mode; the lower two bits in port 61h > control routing the PIT 2 timer output to the PC's speaker > and can be used to enable/disable the PIT 2 timer. > (With 6563730, PIT#2's frequency is messed up, it seems) >Yes, it is reprogrammed by BIOS SMI handler of ehci controller when ehci driver routes all ohci devices to ehci port during ehci_attach(). Our ODM vendor has helped us to change the BIOS code to stop messing up the PIT timer channel 2 (this is only in AMI BIOS, tho). I also did a fix in Xen - watch the ticking of PIT channel 2 in IRQ0 handler and reprogram it if it goes wrong. This fix seems to be working fine :). Max> On the Tecra S1 (Intel ICH4-M chipset) ACPI mode is enabled > by writing a command byte E0h to the APM_CNT "Advanced > Power Management Control Port" register at B2h. > > > On bare metal (standard Nevada Kernel booted on Tecra S1 > into kmdb) I can reproduce the problem like this: > > Read back current status for PIT#2: > > 43::out c8 > 42::in (returns 0xC4) > 42::in (returns 0) > 42::in (returns 0) > > That is, PIT 2 timer is configured for binary countdown, > mode 2 (first byte read from 42h port: C4h). > And current counter value of 0x0000 (bytes 2 & 3 read from > port 42h). > > 61::in (returns 0x2c) > > That is, PIT 2 is disabled (bit 0 == 0) , and not routed to > the speaker (bit 1 == 0). > > > Now I program PIT 2 and enable it: > > 43::out b6 > 42::out a0 > 42::out f > 61::out f > > PIT 2, binary countdown, mode 3, 16-bit counter, count = 0x0fa0; > I enable passing PIT 2 output to the speaker and PIT 2 > is enabled. I'm hearing some 300Hz sound from the speaker. > > Now, enable ACPI mode, and read back 61h: > > b2::out e0 > 61::in (returns 30) > > I/O Port 61h has lost the lower four bits; that is, the PIT 2 is > disabled and not routed to the speaker any more! The 300Hz > sound on the speaker has stopped. (Read backs for PIT 2 > show that the counter value isn't changing any more...) > > By writing back 0x0f to I/O port 61h I get back the 300Hz sound, > and the PIT 2 counder starts changing once again) : > > 61::out f > > > Currently, I changed local_time_calibration() in the hypervisor > to check if PIT 2 is still enabled when syncronization loss > between TSC time and the platform timer is detected. When > I detect "inb(0x61) & 1 == 0" I re-enable the PIT 2 timer. Now > dom0 starts to become bootable on the Tecra S1 without having > to disable ACPI (well - next problem is some issue with the EHCI > controller in ACPI mode, but that seems to be a new issue 8-) > > > I checked five other boxes (2x current AMD64 system with nVidia > chipset, 1 notebook with VIA chipset, 1 Intel PIIX chipset, > 1 Intel ICH7-M chipset) and none messed with the contents of > I/O port 61h while enabling ACPI mode. > > > This message posted from opensolaris.org > _______________________________________________ > xen-discuss mailing list > xen-discuss@opensolaris.org >_______________________________________________ xen-discuss mailing list xen-discuss@opensolaris.org
Jürgen Keil
2007-Aug-21 13:44 UTC
Re: Solaris dom0 system time jumps backwards, bug in dtrace_xpv_getsystim
Max wrote:> Jürgen Keil wrote: > > I think 6563730 is quite similar, but not 100% the same. > > > > With the Tecra S1 the contents of the 61h I/O port register > > (NMI Status and Control Register) is lost when the system > > is switched into ACPI mode; the lower two bits in port 61h > > control routing the PIT 2 timer output to the PC's speaker > > and can be used to enable/disable the PIT 2 timer. > > (With 6563730, PIT#2's frequency is messed up, it seems) > > > Yes, it is reprogrammed by BIOS SMI handler of ehci controller > when ehci driver routes all ohci devices to ehci port during ehci_attach(). > Our ODM vendor has helped us to change the BIOS code to stop messing up > the PIT timer channel 2 (this is only in AMI BIOS, tho).The Toshiba Tecra S1's bios is from "Insyde Software", and it seems it's the BIOS' SMI handler that breaks the PIT timer 2. Unfortunately the BIOS SMI code is a black box; it seems there's no way for me to fix the BIOS SMI behavior (or look at the SMI code). Btw. now that I can boot a Solaris dom0 into multiuser mode on the Tecra S1, I also noticed that time was running too fast: - EHCI controller refuses to initialize, reports "No SOF interrupts have been received, this USB EHCI hostcontroller is unusable" (see /var/adm/messages below) The enabled ehci controller should set a framelist rollover interrupt every 1024 milliseconds; ehci driver waits 1024*1000+20000 = 1044 milliseconds. If time is really running too fast (e.g. factor 2 faster), waiting for "1044 milliseconds" could miss the frame list rollover interrupt because we end up waiting only for a max of ~ 500 milliseconds. - Keyboard auto repeat rate is much higher in Dom0, compared to bare metal - % date ; sleep 30 ; date Sat Aug 18 21:21:54 CEST 2007 Sat Aug 18 21:22:24 CEST 2007 Measured delay between the two date outputs was 14.1 seconds; clock is running ~ 2x too fast. Aug 18 20:00:21 max pseudo: [ID 129642 kern.notice] pseudo-device: dld0 Aug 18 20:00:21 max genunix: [ID 936769 kern.notice] dld0 is /pseudo/dld@0 Aug 18 20:00:22 max pci: [ID 370704 kern.notice] PCI-device: pci8086,3341@1, pci_pci0 Aug 18 20:00:22 max genunix: [ID 936769 kern.notice] pci_pci0 is /pci@0,0/pci8086,3341@1 Aug 18 20:00:22 max xpv_uppc: [ID 693463 kern.info] xen_uppc: non-pci,irqno 1 device i8042 instance 0 Aug 18 20:00:22 max xpv_uppc: [ID 693463 kern.info] xen_uppc: non-pci,irqno 12 device i8042 instance 0 Aug 18 20:00:23 max i8042: [ID 526150 kern.notice] 8042 device: keyboard@0, kb8042 # 0 Aug 18 20:00:23 max genunix: [ID 936769 kern.notice] kb80420 is /isa/i8042@1,60/keyboard@0 Aug 18 20:00:23 max i8042: [ID 526150 kern.notice] 8042 device: mouse@1, mouse8042 # 0 Aug 18 20:00:23 max genunix: [ID 936769 kern.notice] mouse80420 is /isa/i8042@1,60/mouse@1 Aug 18 20:00:25 max genunix: [ID 449358 kern.notice] NOTICE: Kernel debugger present: disabling console power management. Aug 18 20:09:10 max xpv_uppc: [ID 731908 kern.warning] WARNING: psm: crs device either not present or disabled, status 0x9 Aug 18 20:14:31 max xpv_uppc: [ID 352620 kern.info] xen_uppc: Setting irq 11 for device pciclass,0c0320 instance #0 Aug 18 20:29:44 max xpv_uppc: [ID 515050 kern.info] xen_uppc: [ACPI] new irq 11 for device pciclass,0c0320, instance #0 Aug 18 20:31:21 max xpv_uppc: [ID 445903 kern.info] xen_uppc: [ACPI] new irq 11 old irq 0 device pciclass,0c0320, instance 0 Aug 18 20:42:18 max usba: [ID 691482 kern.warning] WARNING: /pci@0,0/pci1179,ff10@1d,7 (ehci0): No SOF interrupts have been received, this USB EHCI hostcontroller is unusable> I also did a fix in Xen - watch the ticking of PIT channel 2 in IRQ0 handler > and reprogram it if it goes wrong. This fix seems to be working fine :).Yep, I'm doing quite the same. Now that I reprogram the PIT channel 2 timer and reenable it via port 61h, time inside the Solaris dom0 is correct - and all of the above mentioned issues are gone. This message posted from opensolaris.org _______________________________________________ xen-discuss mailing list xen-discuss@opensolaris.org