This patch fixes an HVM/VMX time resolution issue that occasionally causes IA32E guests to complain about lost ticks, and an APIC timer calibration issue.

It has not been tested on SVM; the common-code change is slight.

Eddie

Signed-off-by: Xiaowei Yang <xiaowei.yang@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>
On 17 Mar 2006, at 14:39, Dong, Eddie wrote:

> This patch fixes an HVM/VMX time resolution issue that occasionally
> causes IA32E guests to complain about lost ticks, and an APIC timer
> calibration issue.
>
> It has not been tested on SVM; the common-code change is slight.

This patch looks scary. Can you give more info about the problem and how you solve it? It looks like you end up forcibly syncing the guest's TSC rate to the PIT rate? Would that even be necessary if the PIT emulation were moved into Xen, where it ought to be?

On a slightly unrelated note, I think TSC rate management will start to get exciting when we have HVM save/restore. What will happen if a guest is restored on a machine with a quite different TSC rate to the machine it originally ran on? I was wondering whether the current TSC_OFFSET feature that VMX supports might be extended to allow control over the TSC clock rate as well. For example, provide 'base' and 'scale' values and apply the following when the guest executes RDTSC:

  guest_tsc = (host_tsc - base) * scale + offset

How do you guys see this working?

 -- Keir
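P.S. To make the base/scale idea concrete, here is a rough sketch of the arithmetic and nothing more. The struct and field names are invented for illustration (this is not an existing VMX or Xen interface), and 'scale' is assumed to be a 32.32 fixed-point guest/host frequency ratio.

#include <stdint.h>

/* Illustrative only: invented names, not a real Xen/VMX interface. */
struct hvm_tsc_ctl {
    uint64_t base;    /* host TSC value taken as the guest's epoch     */
    uint64_t scale;   /* guest_freq / host_freq, 32.32 fixed point     */
    uint64_t offset;  /* plays the role of the existing VMX TSC_OFFSET */
};

/* What the hypervisor would return for a guest RDTSC.
 * Uses GCC's unsigned __int128 for the 64x64->128 multiply. */
static uint64_t guest_rdtsc(const struct hvm_tsc_ctl *t, uint64_t host_tsc)
{
    unsigned __int128 delta = host_tsc - t->base;
    return (uint64_t)((delta * t->scale) >> 32) + t->offset;
}

A guest restored on a faster host would then get a fixed-point scale below 1.0, so the TSC frequency it calibrated on its original host stays apparently constant.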
Keir:
Before this patch we saw two issues.

The first occurs when a VM switch happens while an HVM guest is inside its PIT interrupt service routine (ISR). There is a problem if we let the guest see a jump in the TSC (the TSC is used to adjust the PIT). Previously the hypervisor tried to minimize the jump seen by the guest (TSC_OFFSET = 0 - pending_intr_nr * period), but the resolution is one PIT period, which is not enough. The situation is worse on IA32E. (A rough sketch of that old adjustment is at the end of this mail.)

The second issue is that when HVM/SMP is enabled, APIC timer calibration wants to see a TSC duration of about 100000000 cycles, so that the APIC timer frequency can be calibrated with IRQs disabled. This was unachievable with the previous code: at that point guest IRQs are disabled and no PIT IRQ is injected, so guest time is frozen. Because of that, the guest can never see 100000000 cycles pass (the TSC is frozen) and gets stuck there.

Another benefit is that we get a much more accurate guest calibration result; inaccurate calibration has been a known issue in the multiple-VM case for a long time.

I have a much more detailed description in the attached slides; I hope that helps.

BTW, with SMP support and more platform time sources to support (RTC and ACPI), we are planning some design changes to synchronize all those different kinds of time. This patch is mainly a fix for a bug that has existed for a long time and blocks the SMP effort.

thx,eddie

Keir Fraser wrote:
> This patch looks scary. Can you give more info about the problem and
> how you solve it? It looks like you end up forcibly syncing the
> guest's TSC rate to the PIT rate? Would that even be necessary if the
> PIT emulation were moved into Xen, where it ought to be?
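P.S. For anyone following along, the pre-patch adjustment mentioned above (TSC_OFFSET = 0 - pending_intr_nr * period) amounts to something like the sketch below. All of the names and the example period are made up for illustration; this is not the actual Xen code.

#include <stdint.h>
#include <stdio.h>

/* One PIT tick expressed in host TSC cycles: example value only,
 * roughly 10ms on a hypothetical 2.6GHz host. */
#define PIT_PERIOD_CYCLES 26000000ULL

struct missed_ticks {
    uint32_t pending_intr_nr;   /* PIT ticks missed while descheduled */
};

/* Stand-in for writing TSC_OFFSET into the VMCS. */
static void set_guest_tsc_offset(int64_t offset)
{
    printf("TSC_OFFSET <- %lld\n", (long long)offset);
}

/* Pull the guest's TSC back by one whole PIT period per undelivered
 * tick, so its PIT ISR sees roughly the TSC delta it expects.  The
 * resolution is a full PIT period, which is the limitation above. */
static void hide_missed_ticks(const struct missed_ticks *mt)
{
    set_guest_tsc_offset(0 - (int64_t)mt->pending_intr_nr
                           * (int64_t)PIT_PERIOD_CYCLES);
}

int main(void)
{
    struct missed_ticks mt = { .pending_intr_nr = 3 };
    hide_missed_ticks(&mt);   /* prints TSC_OFFSET <- -78000000 */
    return 0;
}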
On 17 Mar 2006, at 16:11, Dong, Eddie wrote:

> I have a much more detailed description in the attached slides; I
> hope that helps.

Well, freezing the TSC while a guest is descheduled is not very nice at all, but I can imagine it stops you getting "time went backwards" messages if you are also forcibly re-setting the TSC on PIT ticks. :-)

The freezing is I guess why you have the new hook schedule_out(), which I'm also not madly keen on. Especially since this must surely be a short-term workaround (you don't intend TSC freezing as a long-term solution, right?).

> BTW, with SMP support and more platform time sources to support (RTC
> and ACPI), we are planning some design changes to synchronize all
> those different kinds of time. This patch is mainly a fix for a bug
> that has existed for a long time and blocks the SMP effort.

Clearly some effort needs to be applied here. Moving all time emulation into Xen itself would be a good start (e.g., strip PIT emulation from qemu-dm). And the new support should be HVM generic, since there are no differences in time handling between VMX and SVM that should require (much) vendor-specific handling, I think. If there are, or if extra vendor support appears in future (e.g., I'd like to see guest TSC rate control, as I've said a few times before ;-) ), then we can add vendor hooks later.

In summary, I'm not sure about this patch. I feel that if I take it I'm encouraging 'onward and upward' development without spending the time to make sure fundamental abstractions like time are designed and implemented soundly.

 -- Keir
On 19 Mar 2006, at 14:28, Keir Fraser wrote:

>> I have a much more detailed description in the attached slides; I
>> hope that helps.
>
> Well, freezing the TSC while a guest is descheduled is not very nice
> at all, but I can imagine it stops you getting "time went backwards"
> messages if you are also forcibly re-setting the TSC on PIT ticks. :-)
>
> The freezing is I guess why you have the new hook schedule_out(),
> which I'm also not madly keen on. Especially since this must surely
> be a short-term workaround (you don't intend TSC freezing as a
> long-term solution, right?).

Actually, I now recall we were going to use this approach long term to ensure the guest calibrates its TSC rate correctly during boot, but then turn it off the first time the guest reads wall-clock time (via the RTC, for example). That means we will need the schedule_out() hook long term, which makes your patch less unattractive. I'll take another look and reconsider it.

 -- Keir
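P.S. To be clear, I am imagining nothing more elaborate than a one-shot flag, roughly as below (hypothetical names, not existing Xen code):

#include <stdbool.h>

/* Hypothetical sketch: names are invented, not real Xen structures. */
struct hvm_time_state {
    bool freeze_tsc_when_descheduled;  /* set at domain creation */
};

/* Called from the emulated CMOS/RTC read path: the guest has finished
 * its boot-time TSC calibration and is now reading wall-clock time, so
 * stop freezing its TSC across deschedules and let guest time track
 * real time again. */
static void hvm_rtc_read_notify(struct hvm_time_state *ts)
{
    ts->freeze_tsc_when_descheduled = false;
}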
Keir Fraser wrote:

> Well, freezing the TSC while a guest is descheduled is not very nice
> at all, but I can imagine it stops you getting "time went backwards"
> messages if you are also forcibly re-setting the TSC on PIT ticks. :-)

I think freezing the TSC is probably a very bad idea for guests. I really do want to know what's going on.

> In summary, I'm not sure about this patch. I feel that if I take it
> I'm encouraging 'onward and upward' development without spending the
> time to make sure fundamental abstractions like time are designed and
> implemented soundly.

I think you should heed your intuition ... it's usually quite solid!

thanks

ron
Keir:
Yes, for future support of multiple platform time sources (RTC, PIT, ACPI), I agree that eventually they should live in the hypervisor.

> The freezing is I guess why you have the new hook schedule_out(),
> which I'm also not madly keen on. Especially since this must surely
> be a short-term workaround (you don't intend TSC freezing as a
> long-term solution, right?).

It is true that I don't like freezing guest time at deschedule time, but for now we have to stay with it until we find a better solution :-( The reason lies in the legacy guest PIT ISR (interrupt service routine). The ISR reads the TSC and computes the elapsed TSC cycles since the last PIT IRQ fired (by comparing against the saved old TSC value). If Xen doesn't present exactly the expected TSC difference, the guest accumulates the discrepancy in its PIT ISR and applies fixups that make a mess of guest jiffies, and it complains about lost ticks. Eventually that forces the guest to give up using the TSC as a time source (and fall back to the pure PIT). With this patch we get very accurate guest time in our local tests :-)

Keir Fraser wrote:
> Actually, I now recall we were going to use this approach long term
> to ensure the guest calibrates its TSC rate correctly during boot,
> but then turn it off the first time the guest reads wall-clock time
> (via the RTC, for example). That means we will need the
> schedule_out() hook long term, which makes your patch less
> unattractive. I'll take another look and reconsider it.

Yes, this matches what Ian and Asit discussed at the Xen summit too. It can solve the TSC calibration issue, since the wall-clock (RTC) read happens some time after TSC calibration. But it has a problem on the APIC timer calibration side: that calibration is done very late in Linux (not sure about other OSes), even later than init thread creation, which is hard to detect from Xen.

Freezing the TSC serves a similar purpose to this suggestion. The difference with the freezing approach is that we have to assume guest calibration is a one-time task; otherwise the guest may see time go backwards at runtime. A better solution, which removes this assumption, is to implement a mechanism like the PIT IRQ output line that discards accumulated IRQs while guest IRQs are disabled. That is, if guest IRQs are disabled, pickup_deactive_ticks should ignore the elapsed ticks (and add at most one more pending IRQ). That way the guest behaviour is exactly the same as on native hardware. We should put this on our TODO list :-) (A rough sketch of what I mean is at the end of this mail.)

> In summary, I'm not sure about this patch. I feel that if I take it
> I'm encouraging 'onward and upward' development without spending the
> time to make sure fundamental abstractions like time are designed and
> implemented soundly.

Thanks! We plan to send out a more comprehensive time virtualization design soon, to better support multiple platform time sources and SMP. We have seen several issues with forwarding guest time under SMP. We will send the design out as soon as possible and collect feedback from you and everyone else :-)

thx,eddie
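P.S. Roughly what I mean by the PIT-output-line behaviour, as a sketch only: the structure, the fields and the signature of pickup_deactive_ticks here are simplified and invented, not the real Xen definitions.

#include <stdbool.h>
#include <stdint.h>

/* Invented, simplified state for the emulated PIT channel. */
struct pit_channel {
    uint64_t period_cycles;    /* one PIT tick, in guest TSC cycles    */
    uint64_t last_fire_tsc;    /* guest TSC when the last tick fired   */
    uint32_t pending_intr_nr;  /* ticks waiting to be injected         */
};

static void pickup_deactive_ticks(struct pit_channel *pit,
                                  uint64_t guest_tsc, bool irqs_enabled)
{
    uint64_t missed = (guest_tsc - pit->last_fire_tsc) / pit->period_cycles;

    if (irqs_enabled) {
        /* Deliver every missed tick so the guest's jiffies catch up. */
        pit->pending_intr_nr += (uint32_t)missed;
    } else if (missed && pit->pending_intr_nr == 0) {
        /* Guest IRQs were off: a real PIT would just leave its output
         * line asserted, so queue at most one tick and drop the rest. */
        pit->pending_intr_nr = 1;
    }
    /* Account for the elapsed periods either way, so the ignored ticks
     * really are discarded rather than re-counted later. */
    pit->last_fire_tsc += missed * pit->period_cycles;
}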
On 20 Mar 2006, at 16:08, Dong, Eddie wrote:

> Yes, this matches what Ian and Asit discussed at the Xen summit too.
> It can solve the TSC calibration issue, since the wall-clock (RTC)
> read happens some time after TSC calibration. But it has a problem on
> the APIC timer calibration side: that calibration is done very late
> in Linux (not sure about other OSes), even later than init thread
> creation, which is hard to detect from Xen.

Hmmm... in fact it looks like Linux reads the CMOS RTC before even calibrating bogomips, so that wouldn't be a good point to disable TSC freezing after all. Another issue is that some calibration loops read the PIT counter (and would be confused by wrapping), or expect to receive timely interrupts to increment jiffies. Those are hard to guarantee in a virtualised environment. So there's a general timeliness issue as well as the original 'delay loop progress' versus 'time progress' issue.

There's no good way out of this, I suspect. If guest time is to track wallclock time then guests are going to have to see time jumping forward across preemptions, or the jumping is simply going to be saved up for some time later (e.g., as you do currently when the PIT underflows). Maybe we should do something really simple, like run the guest in 'virtual' (scheduled) time for some number of seconds after boot, then switch to real time (which runs at an accelerated rate for a short while to catch back up with real time)? (A rough sketch of what I mean is at the end of this mail.)

> A better solution, which removes this assumption, is to implement a
> mechanism like the PIT IRQ output line that discards accumulated IRQs
> while guest IRQs are disabled. That is, if guest IRQs are disabled,
> pickup_deactive_ticks should ignore the elapsed ticks (and add at
> most one more pending IRQ). That way the guest behaviour is exactly
> the same as on native hardware. We should put this on our TODO list
> :-)

What effect will this have? Are you suggesting we always run guest time at 'virtual time' rather than real wallclock time?

 -- Keir
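P.S. By 'accelerated rate' I mean nothing cleverer than the sketch below. The names and the 10% catch-up rate are arbitrary choices, just to show the shape of the idea.

#include <stdint.h>

#define CATCHUP_NUM 11   /* guest time runs at 110% of real time ... */
#define CATCHUP_DEN 10   /* ... until the accumulated lag is repaid  */

struct guest_clock {
    uint64_t lag_ns;     /* real time minus guest time accrued so far */
};

/* How much guest time to advance for 'delta_ns' of real time. */
static uint64_t guest_time_step(struct guest_clock *gc, uint64_t delta_ns)
{
    if (gc->lag_ns == 0)
        return delta_ns;                    /* in sync: track real time */

    uint64_t step  = delta_ns * CATCHUP_NUM / CATCHUP_DEN;
    uint64_t extra = step - delta_ns;       /* portion repaying the lag */

    if (extra >= gc->lag_ns) {              /* lag fully repaid         */
        step = delta_ns + gc->lag_ns;
        gc->lag_ns = 0;
    } else {
        gc->lag_ns -= extra;
    }
    return step;
}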
Keir:
Thanks!

Keir Fraser wrote:
> What effect will this have? Are you suggesting we always run guest
> time at 'virtual time' rather than real wallclock time?

Ooo, the new proposal is not focused on that issue :-) The basic issues we saw are:

1: How to jump guest time.
For example, say an SMP guest has 2 VPs whose APIC timers (VP0 and VP1) are scheduled to fire at, say, the 4ms and 6ms points, while the platform timer (say the PIT) is scheduled for 8ms. When VP0 is descheduled and VP1 is switched in, we probably cannot inject the APIC timer IRQ into VP1 even once the hypervisor has passed the 6ms point, because injecting that IRQ means VP1 sees guest time jump to 6ms+, and the same on the TSC (platform time). Otherwise, when VP0 is switched back in, the guest TSC on VP0 is already at 6ms+, but its APIC timer ISR still assumes it is at the 4ms point. Losing synchronization like this means VP0 sees time running backwards relative to its local timer, which can cause various corner cases like the ones we previously saw with the PIT and TSC. Once per-processor timer IRQs are combined with the platform timer IRQ, the situation becomes much more complicated.

2: How to deliver guest timer IRQs efficiently.
In the same situation as above, if the VP that owns the next scheduled timer is descheduled, all the other VPs may be unable to get timer IRQs. That is unfair, and in some difficult cases there may be no way to catch up :-) Pinning the platform timer IRQ to a particular VP is even worse :-(

3: Make the platform timer code object-oriented.
That means that, whether the guest uses the RTC, ACPI timer or PIT, each HVM domain's configuration can choose any of them, and Xen will provide dynamic registration APIs. That way we are no longer tied to the PIT. (A rough sketch of the kind of interface we have in mind is at the end of this mail.)

We have something in mind, but it is not fully complete yet. For simplicity, we may assume:
a) A guest OS only uses one of the platform timers as its ticking source.
b) The platform timer IRQ is not pinned to a particular VP.

thx,eddie
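P.S. For point 3, the kind of interface we have in mind is roughly the ops table below. All of the names are invented just to show the shape; none of this is existing Xen code.

#include <stddef.h>
#include <stdint.h>

struct hvm_domain;

struct platform_timer_ops {
    const char *name;                               /* "pit", "rtc", ... */
    void (*freeze)(struct hvm_domain *d);           /* domain descheduled */
    void (*thaw)(struct hvm_domain *d);             /* domain rescheduled */
    uint32_t (*read_counter)(struct hvm_domain *d); /* emulated reads     */
};

struct hvm_domain {
    const struct platform_timer_ops *tick_source;   /* chosen by config   */
};

/* The guest's configuration picks exactly one timer as its ticking
 * source; the other emulated timers still exist but do not drive the
 * guest's jiffies accounting. */
static int hvm_register_tick_source(struct hvm_domain *d,
                                    const struct platform_timer_ops *ops)
{
    if (d->tick_source != NULL)
        return -1;          /* assumption (a): only one tick source */
    d->tick_source = ops;
    return 0;
}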