Sukjinder Purewal
2008-Feb-21 17:33 UTC
Clock running fast Solaris Express Developer Edition 1/08 (xVM boot) - snv_79
I have the OS running on a Q6600 / MSI Neo2, but under the xVM boot the system clock runs fast. I saw a posting dated August 2007 about some timer code in Xen being buggy, but I don't see this listed as a 'known bug'. Any ideas?

Thanks,
Suki Purewal
Attila Nagy
2008-Feb-23 00:18 UTC
Re: Clock running fast Solaris Express Developer Edition
At last! :) Somebody other than me has noticed it! I've been struggling with it from SXCE b77 through b80, and now b82: Dom0's clock runs about 3.5 times faster than on bare metal...

Actually, I first noticed the bug in the DomUs (two WinXP SP2 guests), and somehow I didn't realize that their clocks were fast because the underlying Xen clock was fast :)

Interestingly, at the b77 stage, the two virtual Windows machines were moved from an Intel C2D E6600 to an Opteron-based X2100 M2 (the smallest model; I just can't remember the exact CPU numbering), and there wasn't any problem with the clocks on the Intel-based machine. Maybe I should have reinstalled everything, but I thought that the hypervisor exposed the same hardware to the virtualized OSes, so I skipped that option (and of course the work had to be done by the day before, so I opted for simply copying the images to the new machine... :) )

Anyway, I'm interested in knowing both the cause (just curious...) and a possible solution! It's good to know that I'm not alone :)

Thanks in advance,
A

This message posted from opensolaris.org
dean ross-smith
2008-Mar-04 22:39 UTC
Re: Clock running fast Solaris Express Developer Edition
I'll post a "me too" on the wacky clock with Nevada build 82 on a Sun Blade 6000 with a 6250 Intel blade. An identical blade running Solaris 10u4 in the same chassis tells time correctly.
Jürgen Keil
2008-Mar-05 16:25 UTC
Re: Clock running fast Solaris Express Developer Edition
> I'm struggling with it since SXCE b77 through 80, now
> 82, and Dom0's clock is about 3.5 times faster than
> on bare metal...

In "xm dmesg" output, what is reported as "Platform timer"?

# xm dmesg | grep "Platform timer"
(XEN) Platform timer is 1.193MHz PIT

Maybe this is a case where the BIOS messes around with the PIT timer configuration done by the hypervisor, breaking the hypervisor's timekeeping. Does the system's BIOS have an option to enable/disable the HPET timer? Maybe you can work around the dom0 time problems by enabling the HPET timer?

Note that on some boxes you first have to upgrade the BIOS to get a BIOS setup option for the HPET timer (IIRC, both my ASUS M2NPV-VM and ASUS M2N-SLI Deluxe needed the BIOS update).
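For anyone checking several machines, the same lookup can be scripted. The following is only a sketch in Python: the function name `platform_timer` is made up for illustration, and it assumes the hypervisor log contains a line in exactly the form shown in the post above.

```python
import re

# Matches the hypervisor boot line quoted in the thread, e.g.
# "(XEN) Platform timer is 1.193MHz PIT".
_TIMER_RE = re.compile(r"Platform timer is\s+(\S+)\s+(\S+)")


def platform_timer(dmesg_text):
    """Return (frequency, source) from `xm dmesg` text, or None if absent.

    `platform_timer` is a hypothetical helper name, not part of any Xen tool.
    """
    for line in dmesg_text.splitlines():
        match = _TIMER_RE.search(line)
        if match:
            return match.group(1), match.group(2)
    return None


# Sample line copied from the post above.
sample = "(XEN) Platform timer is 1.193MHz PIT"
print(platform_timer(sample))
```

In practice the text would come from capturing the output of `xm dmesg`; the sample string here just keeps the sketch self-contained.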
Attila Nagy
2008-Mar-06 08:43 UTC
Re: Clock running fast Solaris Express Developer Edition
Hi,

thanks for the pointer!

# uname -a
SunOS solaris 5.11 snv_82 i86pc i386 i86xpv
# xm dmesg | grep "Platform timer"
(XEN) Platform timer is 1.193MHz PIT

It seems to be OK. On the other hand, while looking for ACPI issues, I found this line in /var/adm/messages:

solaris genunix: [ID 636498 kern.warning] WARNING: cannot load platform pm driver acpippm

This is a Sun X2100; I don't remember whether it has an HPET setting (latest BIOS), but I'd bet not, as I have fiddled around with the ACPI settings and don't remember having seen one. Tonight I can restart the system, so I can check it then.

Attila
Attila Nagy
2008-Mar-06 17:33 UTC
Re: Clock running fast Solaris Express Developer Edition
Well, I've just checked: no HPET setting available in this BIOS.
Is there any software workaround for this?

A
Jürgen Keil
2008-Mar-06 18:15 UTC
Re: Clock running fast Solaris Express Developer Edition
> Well, I've just checked: no HPET setting available in this BIOS.
> Is there any software workaround for this?

Well, I guess it could be fixed in the Xen hypervisor, if we could somehow detect that Xen's PIT platform timer has changed its timer interrupt frequency.

My Toshiba Tecra S1 notebook was messing up the PIT timer, too. When dom0 enables ACPI mode on the Tecra S1, the frequency of the PIT timer is changed, and the timer is stopped! This resulted in the dom0 kernel hanging.

Reading the old thread for this issue...

http://www.opensolaris.org/jive/thread.jspa?messageID=147903

... it seems that other systems have this issue when dom0 starts using the USB controller hardware.

Do you use USB devices on your system? If you don't need USB, an interesting experiment would be to boot dom0 with the ehci and/or ohci and/or uhci driver disabled, and see if that fixes the time issues for dom0. To disable the USB drivers, start the dom0 kernel with the option

-B disable-ehci=true,disable-ohci=true,disable-uhci=true
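To narrow down which controller driver triggers the problem, the same boot option can be generated for any subset of the three drivers. A small Python sketch follows; the helper name `disable_usb_boot_option` is made up, and it only assembles the `disable-<driver>=true` properties in the form given in the post above.

```python
def disable_usb_boot_option(drivers=("ehci", "ohci", "uhci")):
    """Build the `-B` kernel argument disabling the given USB drivers.

    Hypothetical helper: it only formats the option string from the post,
    it does not touch the boot configuration.
    """
    if not drivers:
        raise ValueError("at least one driver name is required")
    props = ",".join(f"disable-{name}=true" for name in drivers)
    return f"-B {props}"


# All three drivers, as suggested in the post.
print(disable_usb_boot_option())

# One driver at a time, for isolating the culprit.
for name in ("ehci", "ohci", "uhci"):
    print(disable_usb_boot_option([name]))
```

Each printed string would be appended to the dom0 kernel line for one test boot.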
Attila Nagy
2008-Mar-06 18:48 UTC
Re: Clock running fast Solaris Express Developer Edition
Great, great, great!

Hats off!

It works! :)

# grep disabled /var/adm/messages
Mar 6 19:37:50 solaris unix: [ID 535085 kern.notice] NOTICE: driver ohci disabled
Mar 6 19:37:50 solaris unix: [ID 535085 kern.notice] NOTICE: driver ehci disabled
Mar 6 19:37:50 solaris unix: [ID 535085 kern.notice] NOTICE: driver uhci disabled

...and

# date;sleep 15; date
2008. március 6. 19:43:48 CET
2008. március 6. 19:44:03 CET

Really took 15 seconds!!!

I don't need USB devices (well, not always: when I'm in front of the machine, the only keyboard option is a USB keyboard); other than that, there is no need for external devices. Maybe I'll experiment by disabling various *hci drivers and see what happens. (Maybe I'll post the results here in case somebody is interested.)

Thank you very much, it was a great help for me, and for others too!!

Can this be considered a bug? I mean, should I file one?

Attila Nagy
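A note on quantifying the drift: `date; sleep 15; date` reads the same system clock twice, so the two timestamps may agree even on an affected system unless the wait is also timed against something external. A minimal Python sketch of the arithmetic, assuming a trusted reference such as a stopwatch or an NTP server is available; the function name and the readings are hypothetical:

```python
def clock_speed_ratio(local_start, local_end, ref_start, ref_end):
    """Ratio of local elapsed time to reference elapsed time.

    1.0 means the local clock keeps correct time; values above 1.0 mean it
    runs fast. All arguments are timestamps in seconds.
    """
    ref_elapsed = ref_end - ref_start
    if ref_elapsed <= 0:
        raise ValueError("reference interval must be positive")
    return (local_end - local_start) / ref_elapsed


# Hypothetical readings: the local clock advanced 52.5 s while the
# reference advanced 15 s, i.e. the "3.5 times faster" case from the thread.
print(clock_speed_ratio(0.0, 52.5, 0.0, 15.0))  # 3.5

# A healthy clock: 15 s local for 15 s of reference time.
print(clock_speed_ratio(0.0, 15.0, 0.0, 15.0))  # 1.0
```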
Jürgen Keil
2008-Mar-06 19:18 UTC
Re: Clock running fast Solaris Express Developer Edition
> Great, great, great!
>
> Hats off!
>
> It works! :)
>
> # grep disabled /var/adm/messages
> Mar 6 19:37:50 solaris unix: [ID 535085 kern.notice]
> NOTICE: driver ohci disabled
> Mar 6 19:37:50 solaris unix: [ID 535085 kern.notice]
> NOTICE: driver ehci disabled
> Mar 6 19:37:50 solaris unix: [ID 535085 kern.notice]
> NOTICE: driver uhci disabled
> ...and
>
> # date;sleep 15; date
> 2008. március 6. 19:43:48 CET
> 2008. március 6. 19:44:03 CET
>
> Really took 15 seconds!!!
> As I don't need usb devices (well, not always - if
> I'm in front of the machine, the only keyboard option
> is a usb keyboard), but other than that, no need for
> external devices.
> Maybe I'll experiment by disabling various *hci
> devices, and see what happens. (Maybe I'll put the
> results here in case somebody interested)
>
> Thank you very much, it was a great help for me, and
> for others too!!
>
> Can this be considered as a bug? I mean, should I
> file one?

I'd say this is a bug in the system's BIOS. My wild guess is that the BIOS messes with the PIT timer when USB legacy support for the keyboard gets disabled, which should happen at the time the Solaris ohci or uhci driver takes control of the USB 1.x hardware. (I think that the X2100 uses some kind of nVidia chipset, so most likely the box uses ohci USB 1.x hardware and has no uhci 1.x controller; boxes with Intel chipsets would use uhci.)

Hmm, looking at build_84 xVM sources, I see that there is a hypervisor option named "correct_pit" that might work around the problem, too.

Try to boot the xVM dom0 with something like this:

kernel$ /boot/$ISADIR/xen.gz correct_pit

% more xen.hg/.hg/patches/check-pit-channel2
Watch PIT channel 2 and reprogram it, when it goes wrong.

Signed-off-by: Max Zhen <Max.Zhen@Sun.COM>

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -35,6 +35,14 @@
 /* NB. This is a gross hack. Mainly useful for HPET testing. */
 static int opt_hpet_force = 0;
 boolean_param("hpet_force", opt_hpet_force);
+/*
+ * On some buggy platform, PIT channel 2 can be interfered w/ by
+ * other devices. So, we try to do a sanity check every so often.
+ * If it jitters too much, we'll reprogram it immediately to do
+ * mode 0 (interrupt on terminal count mode) binary count.
+ */
+static int opt_correct_pit = 0;
+boolean_param("correct_pit", opt_correct_pit);

 #define EPOCH MILLISECS(1000)

 ...
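The comment in the patch describes a periodic sanity check on PIT channel 2. The idea can be illustrated in Python as below. This is only a sketch of the general technique, not the Xen implementation: the function name, the 5% tolerance, and the use of a second, trusted time source for the reference interval are all assumptions. The only hard fact used is the PIT's nominal input clock of 1,193,182 Hz.

```python
PIT_HZ = 1_193_182  # nominal input clock of the PC programmable interval timer


def pit_looks_wrong(observed_ticks, ref_elapsed_s, tolerance=0.05):
    """Return True if the PIT tick count deviates too far from nominal.

    observed_ticks: ticks counted on the PIT channel during the interval.
    ref_elapsed_s:  interval length in seconds, from a trusted source
                    (assumption: some other timer is available for this).
    tolerance:      allowed relative deviation (hypothetical 5% default).
    """
    expected = PIT_HZ * ref_elapsed_s
    if expected <= 0:
        raise ValueError("reference interval must be positive")
    return abs(observed_ticks - expected) / expected > tolerance


# A healthy timer: one second's worth of ticks in one second.
print(pit_looks_wrong(1_193_182, 1.0))  # False

# A timer counting 3.5 times too fast, as reported in this thread.
print(pit_looks_wrong(4_176_137, 1.0))  # True
```

A real check would follow a True result by reprogramming the channel, which is what the patch comment says the hypervisor does when "correct_pit" is set.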
Attila Nagy
2008-Mar-06 19:45 UTC
Re: Clock running fast Solaris Express Developer Edition
> I'd say this is a bug in the system's bios. My wild
> guess is that the bios messes with the PIT timer
> when usb legacy support for the keyboard gets
> disabled, which should happen at the time the
> Solaris ohci or uhci driver takes control of the usb
> 1.x
> hardware. (I think that the X2100 uses some kind of
> nVidia chipset, so most likely the box uses ohci
> usb 1.x hardware and has no uhci 1.x controller -
> boxes with intel chipsets would use uhci)

Yes, the X2100 (this is an M2) uses an nVidia chipset. Somehow we should persuade Sun to release a new BIOS, right? :)

> Hmm, looking at build_84 xVM sources, I see that
> there is a hypervisor option named "correct_pit"
> that might work around the problem, too.
>
> Try to boot the xVM dom0 with something like
> this:
>
> kernel$ /boot/$ISADIR/xen.gz correct_pit
>
> % more xen.hg/.hg/patches/check-pit-channel2
> Watch PIT channel 2 and reprogram it, when it goes
> wrong.
>
> Signed-off-by: Max Zhen <Max.Zhen@Sun.COM>
>
> diff --git a/xen/arch/x86/time.c
> b/xen/arch/x86/time.c
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -35,6 +35,14 @@
> /* NB. This is a gross hack. Mainly useful for HPET
> testing. */
> static int opt_hpet_force = 0;
> boolean_param("hpet_force", opt_hpet_force);
> +/*
> + * On some buggy platform, PIT channel 2 can be
> interfered w/ by
> + * other devices. So, we try to do a sanity check
> every so often.
> + * If it jitters too much, we'll reprogram it
> immediately to do
> + * mode 0 (interrupt on terminal count mode) binary
> count.
> + */
> +static int opt_correct_pit = 0;
> +boolean_param("correct_pit", opt_correct_pit);
>
> #define EPOCH MILLISECS(1000)
>
> ..

It's only b82, but it WORKS! ...again... :))

Thank you! ...again! :)