I tried Xen with genapic on my X86_64 box running 32-bit SLES9 SP1. The default genapic driver gets loaded and I was able to boot successfully without any problems.

On an 8-way ES7000, Xen boots successfully but the system loops on the "Time went backwards" message while Dom0 comes up. As far as I can tell, the ES7000 genapic driver has been loaded and the IO APICs and timers have been initialized correctly.

The other thing I noticed was that specifying "apic=verbose" on the Xen command line in grub causes the default driver to be loaded initially, which is all right as the check for the ES7000 happens later on.

So I guess the timer interrupt is not keeping up with the real-time clock? Any idea what is causing this?

Aravindh

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

On 26 May 2005, at 16:47, Puthiyaparambil, Aravindh wrote:

> On an 8-way ES7000, Xen boots successfully but the system loops on the
> "Time went backwards" message while Dom0 comes up.

Does it just happen during boot, or carry on forever? Is dom0 UP or SMP?

It sounds like IRQ0 is not getting through -- we bump time via legacy PIT interrupts. If that doesn't happen then TSC extrapolation works for a second or two, but then we wrap and time jumps backwards (we extrapolate using only the least significant 32 bits of the TSC). I suggest adding tracing to timer_interrupt() in xen/arch/x86/time.c to see whether or not it is getting executed.

> The other thing I noticed was that
> specifying "apic=verbose" on the Xen command line in grub causes the
> default driver to be loaded initially which is alright as the check for
> the ES7000 happens later on.

It seems that Linux has a cmdline conflict between genapic and apic.c tracing as to who owns "apic=". Does our behaviour differ from that of native Linux? I could change the parameter name of one or the other within Xen -- any suggestions for which and to what?

-- Keir

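To make the wrap-around described above concrete, here is a minimal sketch in C. It is not Xen's code: the function names, the nanosecond units and the 2 GHz figure are assumptions chosen for illustration. It only shows why extrapolating from a 32-bit TSC delta goes backwards if the timestamp is not refreshed before 2^32 cycles have elapsed.

```c
#include <stdint.h>

/* Seconds until the low 32 bits of the TSC wrap at the given frequency. */
double tsc32_wrap_seconds(double tsc_hz)
{
    return 4294967296.0 / tsc_hz;   /* 2^32 cycles */
}

/*
 * Hypothetical extrapolation: time at the last timer tick (stamp_ns) plus
 * the cycles elapsed since that tick, converted to nanoseconds. Only the
 * low 32 bits of the TSC are used, so the delta is taken modulo 2^32.
 */
uint64_t extrapolate_ns(uint64_t stamp_ns, uint64_t stamp_tsc,
                        uint64_t now_tsc, uint64_t tsc_hz)
{
    uint32_t delta = (uint32_t)now_tsc - (uint32_t)stamp_tsc;

    /* delta < 2^32, so delta * 10^9 still fits in 64 bits. */
    return stamp_ns + (uint64_t)delta * 1000000000ull / tsc_hz;
}
```

With an assumed 2 GHz TSC the wrap happens after roughly 2.1 seconds. If the stamp is never refreshed, a reading taken just after the wrap extrapolates to a time earlier than a reading taken one second after the stamp, which is the kind of backwards jump a guest would complain about.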
> > On an 8-way ES7000, Xen boots successfully but the system loops on the
> > "Time went backwards" message while Dom0 comes up.
>
> Does it just happen during boot, or carry on forever? Is dom0 UP or
> SMP? It sounds like IRQ0 is not getting through -- we bump time via

Dom0 is UP. It does not complete booting. It loops infinitely on the "Time went backwards" message. The boot does not continue from that point on. I am attaching the boot output.

If IRQ0 does not get through then won't check_timer() in io_apic.c fail? That seems to be working, which indicates that IRQ0 is indeed getting through.

> I suggest adding tracing to timer_interrupt() in xen/arch/x86/time.c
> and see whether or not it is getting executed.

I added some trace to the function and it is being called. Any idea where to go from here?

> > The other thing I noticed was that
> > specifying "apic=verbose" on the Xen command line in grub causes the
> > default driver to be loaded initially which is alright as the check for
> > the ES7000 happens later on.
>
> It seems that Linux has a cmdline conflict between genapic and apic.c
> tracing as to who owns "apic=". Does our behaviour differ from that of
> native Linux? I could change the parameter name of one or the other
> within Xen -- any suggestions for which and to what?

The same thing happens with native Linux (SLES9 SP1). I would suggest adding a new "genapic" command line option so that it can be separate from the "apic" option. Or we can add a check in generic_apic_probe() to ignore apic=verbose/debug.

Aravindh

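The second suggestion above, ignoring the tracing keywords when the probe looks at "apic=", could be sketched roughly as follows. This is not the real generic_apic_probe(): both helper names are hypothetical, and the caller is assumed to pass in the value after "apic=" together with its own table of driver names.

```c
#include <string.h>

/* Returns 1 if the value of "apic=" is a tracing keyword, not a driver name. */
int apic_arg_is_verbosity(const char *arg)
{
    return strcmp(arg, "verbose") == 0 || strcmp(arg, "debug") == 0;
}

/*
 * Hypothetical stand-in for the name match in a probe routine.
 * Returns the index of the driver named by arg, or -1 meaning
 * "no explicit choice, fall through to auto-detection".
 */
int pick_genapic(const char *arg, const char *const names[], int n)
{
    if (arg == NULL || apic_arg_is_verbosity(arg))
        return -1;              /* tracing keyword: do not treat as a driver */

    for (int i = 0; i < n; i++)
        if (strcmp(arg, names[i]) == 0)
            return i;

    return -1;                  /* unknown name: also fall back to probing */
}
```

With a check like this, "apic=verbose" would leave driver selection to the normal probe order, while something like "apic=es7000" would still force a specific driver. A separate "genapic=" option, the first suggestion, would avoid the shared keyword altogether.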
On 26 May 2005, at 18:54, Puthiyaparambil, Aravindh wrote:

> Dom0 is UP. It does not complete booting. It loops infinitely on the
> "Time went backwards" message. The boot does not continue from that
> point on. I am attaching the boot output.
> If IRQ0 does not get through then won't check_timer() in io_apic.c
> fail?
> That seems to be working which indicates that IRQ0 is indeed getting
> through.

Are the TSCs on all CPUs suitably sync'ed up? If the PIT is firing and we are updating our timestamp info often enough, and all CPUs have similar TSC values, then all should work okay. Don Fry has a patch that handles unsync'ed TSCs, if that turns out to be the problem.

-- Keir

> Are the TSCs on all CPUs suitably sync'ed up? If the PIT is firing and
> we are updating our timestamp info often enough, and all CPUs have
> similar TSC values, then all should work okay. Don Fry has a patch that
> handles unsync'ed TSCs, if that turns out to be the problem.

On native Linux the skew is reported and is "fixed up". But that does not mean the TSC was really adjusted; it will keep running with this gap. So the CPU TSC skew could be the problem. The only difference is that for Xen, the skew for the boot CPU is unusually high.

This is the output on native Linux:

checking TSC synchronization across 16 CPUs:
BIOS BUG: CPU#0 improperly initialized, has 360475 usecs TSC skew! FIXED.
BIOS BUG: CPU#1 improperly initialized, has 360486 usecs TSC skew! FIXED.
BIOS BUG: CPU#2 improperly initialized, has 360476 usecs TSC skew! FIXED.
BIOS BUG: CPU#3 improperly initialized, has 360475 usecs TSC skew! FIXED.
BIOS BUG: CPU#4 improperly initialized, has -360478 usecs TSC skew! FIXED.
BIOS BUG: CPU#5 improperly initialized, has -360479 usecs TSC skew! FIXED.
BIOS BUG: CPU#6 improperly initialized, has -360479 usecs TSC skew! FIXED.
BIOS BUG: CPU#7 improperly initialized, has -360479 usecs TSC skew! FIXED.
BIOS BUG: CPU#8 improperly initialized, has 360484 usecs TSC skew! FIXED.
BIOS BUG: CPU#9 improperly initialized, has 360483 usecs TSC skew! FIXED.
BIOS BUG: CPU#10 improperly initialized, has 360476 usecs TSC skew! FIXED.
BIOS BUG: CPU#11 improperly initialized, has 360475 usecs TSC skew! FIXED.
BIOS BUG: CPU#12 improperly initialized, has -360478 usecs TSC skew! FIXED.
BIOS BUG: CPU#13 improperly initialized, has -360479 usecs TSC skew! FIXED.
BIOS BUG: CPU#14 improperly initialized, has -360479 usecs TSC skew! FIXED.
BIOS BUG: CPU#15 improperly initialized, has -360479 usecs TSC skew! FIXED.

This is the output for Xen:

(XEN) checking TSC synchronization across 16 CPUs:
(XEN) CPU#0 had 47220790 usecs TSC skew, fixed it up.
(XEN) CPU#1 had 2298443 usecs TSC skew, fixed it up.
(XEN) CPU#2 had 2298445 usecs TSC skew, fixed it up.
(XEN) CPU#3 had 2298440 usecs TSC skew, fixed it up.
(XEN) CPU#4 had -2298445 usecs TSC skew, fixed it up.
(XEN) CPU#5 had -2298444 usecs TSC skew, fixed it up.
(XEN) CPU#6 had -2298445 usecs TSC skew, fixed it up.
(XEN) CPU#7 had -2298444 usecs TSC skew, fixed it up.
(XEN) CPU#8 had 2298454 usecs TSC skew, fixed it up.
(XEN) CPU#9 had 2298443 usecs TSC skew, fixed it up.
(XEN) CPU#10 had 2298450 usecs TSC skew, fixed it up.
(XEN) CPU#11 had 2298440 usecs TSC skew, fixed it up.
(XEN) CPU#12 had -2298445 usecs TSC skew, fixed it up.
(XEN) CPU#13 had -2298442 usecs TSC skew, fixed it up.
(XEN) CPU#14 had -2298445 usecs TSC skew, fixed it up.
(XEN) CPU#15 had -2298444 usecs TSC skew, fixed it up.

Maybe the patch that handles unsync'ed TSCs could solve this issue.

Aravindh

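For readers trying to interpret the skew figures above, here is a rough sketch in C of how a per-CPU skew in microseconds can be derived from simultaneously sampled TSC values. It is an illustration under stated assumptions, not the kernel's or Xen's routine: the function name is made up, the skew is assumed to be measured against the mean of all CPUs, and the sum of the samples is assumed to fit in 64 bits.

```c
#include <stdint.h>

/*
 * Fill skew_us[i] with CPU i's offset from the mean TSC, in microseconds.
 * tsc[] holds one sample per CPU, all taken at (nominally) the same instant.
 * Assumes the sum of all samples does not overflow 64 bits.
 */
void tsc_skew_usecs(const uint64_t *tsc, int ncpus,
                    uint64_t cycles_per_usec, int64_t *skew_us)
{
    uint64_t sum = 0;

    for (int i = 0; i < ncpus; i++)
        sum += tsc[i];

    uint64_t avg = sum / (uint64_t)ncpus;

    for (int i = 0; i < ncpus; i++) {
        int64_t delta = (int64_t)(tsc[i] - avg);   /* signed cycle offset */
        skew_us[i] = delta / (int64_t)cycles_per_usec;
    }
}
```

If the skew is measured against the mean like this, two groups of CPUs that are internally in sync but offset from each other show up as equal and opposite values, which resembles the pattern in both logs (CPUs 0-3 and 8-11 positive, CPUs 4-7 and 12-15 negative). A single outlier such as the Xen CPU#0 figure would not fit that pattern.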
On 26 May 2005, at 21:52, Puthiyaparambil, Aravindh wrote:

>> Are the TSCs on all CPUs suitably sync'ed up? If the PIT is firing and
>> we are updating our timestamp info often enough, and all CPUs have
>> similar TSC values, then all should work okay. Don Fry has a patch that
>> handles unsync'ed TSCs, if that turns out to be the problem.
>
> On native Linux the skew is reported and is "fixed up". But that does
> not mean the TSC was really adjusted, it will keep running with this
> gap. So the CPU TSC skew could be the problem. The only difference is
> that for Xen, the skew for the boot CPU is unusually high.

Aravindh,

I noticed yesterday that I was missing a call to enable_apic_mode() during local APIC initialisation. On es7000 this does some cryptic twiddling that I don't understand, but I suppose could arbitrarily break something important. :-)

I've now fixed this omission so it is probably worth trying to boot on es7000 and see if the timer problems have disappeared.

-- Keir

Keir,

The enable_apic_mode() function no longer does anything useful on our current systems. It is present in the 2.6 Linux kernel to support an older class of Unisys machines. I am not trying to run Xen on those systems.

I tried using Don's patch. The "time went backwards" message no longer shows up. But the system is losing interrupts now. I have attached the boot output. It looks like there could be something amiss in the IO APIC redirection table setup. I am presently looking into that.

Aravindh

> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
> Sent: Tuesday, May 31, 2005 2:24 PM
> To: Puthiyaparambil, Aravindh
> Cc: Davis, Jason; xen-devel@lists.xensource.com; Vessey, Bruce A;
> Subrahmanian, Raj
> Subject: Re: Genapic testing
>
> On 26 May 2005, at 21:52, Puthiyaparambil, Aravindh wrote:
>
> >> Are the TSCs on all CPUs suitably sync'ed up? If the PIT is firing and
> >> we are updating our timestamp info often enough, and all CPUs have
> >> similar TSC values, then all should work okay. Don Fry has a patch that
> >> handles unsync'ed TSCs, if that turns out to be the problem.
> >
> > On native Linux the skew is reported and is "fixed up". But that does
> > not mean the TSC was really adjusted, it will keep running with this
> > gap. So the CPU TSC skew could be the problem. The only difference is
> > that for Xen, the skew for the boot CPU is unusually high.
>
> Aravindh,
>
> I noticed yesterday that I was missing a call to enable_apic_mode()
> during local APIC initialisation. On es7000 this does some cryptic
> twiddling that I don't understand, but I suppose could arbitrarily
> break something important. :-)
>
> I've now fixed this omission so it is probably worth trying to boot on
> es7000 and see if the timer problems have disappeared.
>
> -- Keir
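
When checking the IO APIC redirection table setup mentioned above, it can help to decode the raw 64-bit entries from a boot dump by hand. The following C sketch does that using the standard 82093AA entry layout. The struct and function names are made up for illustration, and nothing here is specific to the ES7000 or taken from the attached boot output.

```c
#include <stdint.h>

/* Decoded fields of one 64-bit IO APIC redirection table entry. */
struct rte_fields {
    unsigned vector;           /* bits 7:0   */
    unsigned delivery_mode;    /* bits 10:8  (0 = fixed, 1 = lowest priority, ...) */
    unsigned dest_logical;     /* bit 11     (0 = physical, 1 = logical) */
    unsigned active_low;       /* bit 13     */
    unsigned level_triggered;  /* bit 15     (0 = edge, 1 = level) */
    unsigned masked;           /* bit 16     (1 = interrupt masked) */
    unsigned dest;             /* bits 63:56 */
};

/* Hypothetical helper: split a raw entry into its fields. */
struct rte_fields decode_rte(uint64_t rte)
{
    struct rte_fields f;

    f.vector          = (unsigned)(rte & 0xff);
    f.delivery_mode   = (unsigned)((rte >> 8) & 0x7);
    f.dest_logical    = (unsigned)((rte >> 11) & 0x1);
    f.active_low      = (unsigned)((rte >> 13) & 0x1);
    f.level_triggered = (unsigned)((rte >> 15) & 0x1);
    f.masked          = (unsigned)((rte >> 16) & 0x1);
    f.dest            = (unsigned)((rte >> 56) & 0xff);
    return f;
}
```

An entry that is unexpectedly masked, or whose trigger mode, polarity or destination does not match what the device expects, is one plausible cause of lost interrupts, so comparing decoded entries between native Linux and Xen on the same machine may narrow things down.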