thr3ads.net - Xen devel - [Xen-devel] Problem with BIOS timer interrupts [Nov 2008]

Gary Grebus

2008-Nov-18 18:50 UTC

[Xen-devel] Problem with BIOS timer interrupts

Hi,

While changing our Xen 3.2.x based HVM BIOS ROM to use gPXE instead of
etherboot, I ran into an interesting behavior.  The gPXE code, which
runs in real mode, contains the following sequence:

wait_for_tick:
	pushl	%eax
	pushw	%fs
	movw	$0x40, %ax
	movw	%ax, %fs
	movl	%fs:(0x6c), %eax
1:	pushf
	sti
	hlt
	popf
	cmpl	%fs:(0x6c), %eax
	je	1b
	popw	%fs
	popl	%eax
	ret

It uses this to timeout waiting for a key press.  The expected interrupt
is from the BIOS timer implemented in rombios.  But in fact, the loop
hangs.  However, if I insert a nop instruction between the sti and hlt,
then things work as expected.
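
For readers unfamiliar with the idiom, here is a minimal Python sketch of what the loop does. It is a model, not gPXE code. The sequence reads the BIOS tick counter at real-mode address 0040:006C, which the BIOS timer interrupt handler increments about 18.2 times per second. It then sleeps until some interrupt wakes the CPU and repeats until the counter has changed. The names `linear_address`, `wait_for_tick`, `read_ticks` and `halt` are hypothetical stand-ins chosen for illustration.

```python
def linear_address(segment, offset):
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


# %fs = 0x40 and offset 0x6c select the BIOS data area tick counter.
BDA_TICK_COUNTER = linear_address(0x40, 0x6C)


def wait_for_tick(read_ticks, halt):
    """Model of the assembly loop: sleep until the tick counter changes.

    read_ticks() stands in for the load from %fs:(0x6c);
    halt() stands in for the sti; hlt pair (sleep until any interrupt).
    """
    start = read_ticks()
    while True:
        halt()                      # woken by *any* interrupt
        if read_ticks() != start:   # was it the timer?
            return


if __name__ == "__main__":
    state = {"ticks": 0, "wakeups": 0}

    def fake_halt():
        # Pretend every third wakeup is caused by the timer interrupt.
        state["wakeups"] += 1
        if state["wakeups"] % 3 == 0:
            state["ticks"] += 1

    wait_for_tick(lambda: state["ticks"], fake_halt)
    print(hex(BDA_TICK_COUNTER), state["wakeups"])
```

In this model the loop terminates only if `halt()` eventually returns after the counter has been advanced, which is why a lost or undelivered timer interrupt shows up as a hang.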

Is there something wrong with this sequence?  This happens on AMD, so
it's not a quirk of the real mode emulations on Intel.

I notice that in the gPXE code currently in xen-unstable, the path that
uses this code is patched out.

	/gary

-- 
Gary Grebus
Virtual Iron Software, Inc.



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Huang2, Wei

2008-Nov-18 19:31 UTC

RE: [Xen-devel] Problem with BIOS timer interrupts

Gary,

Which CPU family are you using? 0xF? There is an erratum that seems to
be related. See page 50.

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/33610.pdf


-Wei
		

Keir Fraser

2008-Nov-18 20:02 UTC

Re: [Xen-devel] Problem with BIOS timer interrupts

On 18/11/08 18:50, "Gary Grebus" <ggrebus@virtualiron.com> wrote:
> It uses this to timeout waiting for a key press.  The expected interrupt
> is from the BIOS timer implemented in rombios.  But in fact, the loop
> hangs.  However, if I insert a nop instruction between the sti and hlt,
> then things work as expected.
> 
> Is there something wrong with this sequence?  This happens on AMD, so
> it's not a quirk of the real mode emulations on Intel.
> 
> I notice that in the gPXE code currently in xen-unstable, the path that
> uses this code is patched out.

As a data point, I commented it out because the delay's annoying rather than
because it caused a boot hang for me. I was testing on Intel though.

Inserting the nop is obviously bogus (I expect you're aware of that :-),
since it raises the opportunity of a wakeup-waiting race. That it fixes this
issue is very weird. I expect we have some issue to do with leaving an
interrupt shadow during HLT emulation -- why this would only trigger in real
mode I cannot guess.
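
The wakeup-waiting race mentioned above can be illustrated with a toy Python model of the STI interrupt shadow. This is a sketch under stated assumptions, not Xen's HLT emulation. It assumes a single one-shot interrupt that is already pending while interrupts are disabled, and it models only the architectural rule that interrupts are not recognized until after the instruction following STI. The function `run` and its string opcodes are hypothetical.

```python
def run(program, irq_pending=True):
    """Execute a list of opcodes ("sti", "nop", "hlt") on a toy CPU.

    Returns "woke" if execution gets past HLT, "hung" if the CPU halts
    with no interrupt left to wake it.
    """
    if_flag = False   # interrupt-enable flag, initially clear
    shadow = False    # True only at the boundary right after STI
    pending = irq_pending

    for op in program:
        # Instruction boundary: deliver a pending interrupt if allowed.
        if pending and if_flag and not shadow:
            pending = False          # handler runs and returns
        shadow = False               # the shadow covers one boundary only

        if op == "sti":
            if not if_flag:
                shadow = True        # block delivery until after next insn
            if_flag = True
        elif op == "hlt":
            if pending and if_flag:
                pending = False      # interrupt wakes the halted CPU
            else:
                return "hung"        # nothing left to wake us
        # "nop" does nothing
    return "woke"


if __name__ == "__main__":
    print(run(["sti", "hlt"]))         # interrupt arrives while halted
    print(run(["sti", "nop", "hlt"]))  # interrupt is consumed before hlt
```

In this model `sti; hlt` always sees the interrupt while halted, whereas `sti; nop; hlt` lets the interrupt be taken after the nop and then halts with nothing pending. With a periodic timer rather than a one-shot interrupt, the nop variant would wait for an extra tick instead of hanging forever.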

Wei's erratum is not applicable, for three reasons:
 1. We disable C1 clock ramping
 2. We always intercept HLT
 3. STI; HLT is a standard x86 idiom used in all OSes, and this is the only
place we're seeing a problem. Also the erratum would lead to rare
non-deterministic hangs, not a hang every time (which is what you're
seeing?).

I would say it's a good idea to see if you can repro this on xen-unstable.

 -- Keir

Gary Grebus

2008-Nov-18 20:28 UTC

Re: [Xen-devel] Problem with BIOS timer interrupts

On Tue, 2008-11-18 at 20:02 +0000, Keir Fraser wrote:
> On 18/11/08 18:50, "Gary Grebus" <ggrebus@virtualiron.com> wrote:
>
> > It uses this to timeout waiting for a key press.  The expected interrupt
> > is from the BIOS timer implemented in rombios.  But in fact, the loop
> > hangs.  However, if I insert a nop instruction between the sti and hlt,
> > then things work as expected.
> >
> > Is there something wrong with this sequence?  This happens on AMD, so
> > it's not a quirk of the real mode emulations on Intel.

Interestingly, the same problem and "fix" apply on Intel under
vmxassist.  But likely nobody cares about that anymore (and gPXE has
other problems with vmxassist).

> > I notice that in the gPXE code currently in xen-unstable, the path that
> > uses this code is patched out.
>
> As a data point, I commented it out because the delay's annoying rather than
> because it caused a boot hang for me. I was testing on Intel though.
>
> Inserting the nop is obviously bogus (I expect you're aware of that :-),
> since it raises the opportunity of a wakeup-waiting race. That it fixes this
> issue is very weird. I expect we have some issue to do with leaving an
> interrupt shadow during HLT emulation -- why this would only trigger in real
> mode I cannot guess.
>
> Wei's erratum is not applicable, for three reasons:
>  1. We disable C1 clock ramping
>  2. We always intercept HLT
>  3. STI; HLT is a standard x86 idiom used in all OSes, and this is the only
> place we're seeing a problem. Also the erratum would lead to rare
> non-deterministic hangs, not a hang every time (which is what you're
> seeing?).

OK.  It does appear to be a family 0xf processor.  dom0 says:
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 65
model name      : Dual-Core AMD Opteron(tm) Processor 2216
stepping        : 2
cpu MHz         : 2394.000
cache size      : 1024 KB
>
> I would say it's a good idea to see if you can repro this on xen-unstable.

OK... My usual xen-unstable setup is out of commission at the moment,
but I will try to reproduce it.

	/gary


Keir Fraser

2008-Nov-18 22:13 UTC

Re: [Xen-devel] Problem with BIOS timer interrupts

On 18/11/08 20:28, "Gary Grebus" <ggrebus@virtualiron.com> wrote:
>> I would say it's a good idea to see if you can repro this on xen-unstable.
>
> OK... My usual xen-unstable setup is out of commission at the moment,
> but I will try to reproduce it.

Another approach, if this really is happening every time, would be to trace
the hell out of HLT emulation and interrupt delivery with printk. Since this
happens so early, you shouldn't end up overwhelmed with trace data. Perhaps
you can narrow down the problem that way...

 -- Keir



Gary Grebus

2008-Nov-19 21:15 UTC

Re: [Xen-devel] Problem with BIOS timer interrupts

On Tue, 2008-11-18 at 22:13 +0000, Keir Fraser wrote:
> On 18/11/08 20:28, "Gary Grebus" <ggrebus@virtualiron.com> wrote:
>
> >> I would say it's a good idea to see if you can repro this on xen-unstable.
> >
> > OK... My usual xen-unstable setup is out of commission at the moment,
> > but I will try to reproduce it.
>
> Another approach, if this really is happening every time, would be to trace
> the hell out of HLT emulation and interrupt delivery with printk. Since this
> happens so early, you shouldn't end up overwhelmed with trace data. Perhaps
> you can narrow down the problem that way...

Well, I can't reproduce this even in my own setup on AMD.  All the
failures must have really been on Intel with vmxassist (or in my
imagination).   Sorry for the noise.

	/gary

-- 
Gary Grebus
Virtual Iron Software, Inc.


