thr3ads.net - Xen devel - [Xen-devel] [PATCH] Fix performance issue brought by TSC-sync logic [Feb 2009]

Yang, Xiaowei

2009-Feb-23 08:21 UTC

[Xen-devel] [PATCH] Fix performance issue brought by TSC-sync logic

Recently we found a performance bug when running network tests with VT-d
assigned devices: in some extreme cases, network performance in an HVM
guest running a recent Linux kernel can be 1/20 of native. The root cause
is that one of our sync-TSC-under-deep-C-state patches introduces extra
TSC drift, on the order of a thousand cycles, between pCPUs, which makes
the check-tsc-sync logic in the HVM guest fail. As a result, the kernel
falls back to a platform timer (HPET, PM timer) for gettimeofday instead
of the TSC, which causes very frequent and costly I/O port access VM
exits: three per call.

We provide the 2 patches below to address the issue:

tsc1.patch: Minimize the TSC drift between pCPUs by letting BSP/AP set
TSC at the same time in time_calibration_rendezvous(). Looping a few 
times before writing tsc sounds better, but it may be too costly.
Signed-off-by: Xiaowei Yang <xiaowei.yang@intel.com>

tsc2.patch: only do TSC-sync if really necessary, which narrows its 
effect a lot.
Signed-off-by: Wei Gang <wei.gang@intel.com>


Thanks,
Xiaowei




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Keir Fraser

2009-Feb-23 12:51 UTC

Re: [Xen-devel] [PATCH] Fix performance issue brought by TSC-sync logic

On 23/02/2009 00:21, "Yang, Xiaowei" <xiaowei.yang@intel.com> wrote:
> Recently we found one performance bug when doing network test with VTd
> assigned devices - in some extreme case, the network performance in HVM
> using new Linux kernel could be 1/20 of native. Root cause is one of our
> sync-tsc-under-deep-C-state patches brings extra kilo-TSC drift between
> pCPUs and let check-tsc-sync logic in HVM failed. The result is the
> kernel fails to use platform timer (HPET, PMtimer) for gettimeofday
> instead of TSC and brings very frequent costly IOport access VMExit -
> triple per one call.
> 
> We provides below 2 patches to address the issue:

Patch 1 looks reasonable. Patch number 2 I'm less keen on, since patch 1
should suffice? Also I think regular re-sync across CPUs is a good idea
anyway. And that also reminds me -- isn't the CONSTANT_TSC logic in time.c
broken by host S3, and also by CPU hotplug? There's nothing to force sync of
AP TSC to BP TSC when an AP comes online after boot. Doesn't
init_percpu_time() need to handle that?

 -- Keir




Tian, Kevin

2009-Feb-23 12:55 UTC

RE: [Xen-devel] [PATCH] Fix performance issue brought by TSC-sync logic

>From: Keir Fraser
>Sent: Monday, February 23, 2009 8:52 PM
>
>On 23/02/2009 00:21, "Yang, Xiaowei" <xiaowei.yang@intel.com> wrote:
>
>> Recently we found one performance bug when doing network test with VTd
>> assigned devices - in some extreme case, the network performance in HVM
>> using new Linux kernel could be 1/20 of native. Root cause is one of our
>> sync-tsc-under-deep-C-state patches brings extra kilo-TSC drift between
>> pCPUs and let check-tsc-sync logic in HVM failed. The result is the
>> kernel fails to use platform timer (HPET, PMtimer) for gettimeofday
>> instead of TSC and brings very frequent costly IOport access VMExit -
>> triple per one call.
>>
>> We provides below 2 patches to address the issue:
>
>Patch 1 looks reasonable. Patch number 2 I'm less keen on, since patch 1
>should suffice? Also I think regular re-sync across CPUs is a good idea
>anyway. And that also reminds me -- isn't the CONSTANT_TSC logic in time.c
>broken by host S3, and also by CPU hotplug? There's nothing to force sync of
>AP TSC to BP TSC when an AP comes online after boot. Doesn't
>init_percpu_time() need to handle that?
>
>
Ah, yes, it's broken with regard to S3. We'll work out a patch
to handle it.

Thanks,
Kevin

Yang, Xiaowei

2009-Feb-24 06:33 UTC

Re: [Xen-devel] [PATCH] Fix performance issue brought by TSC-sync logic

Keir Fraser wrote:
> On 23/02/2009 00:21, "Yang, Xiaowei" <xiaowei.yang@intel.com> wrote:
>
>> Recently we found one performance bug when doing network test with VTd
>> assigned devices - in some extreme case, the network performance in HVM
>> using new Linux kernel could be 1/20 of native. Root cause is one of our
>> sync-tsc-under-deep-C-state patches brings extra kilo-TSC drift between
>> pCPUs and let check-tsc-sync logic in HVM failed. The result is the
>> kernel fails to use platform timer (HPET, PMtimer) for gettimeofday
>> instead of TSC and brings very frequent costly IOport access VMExit -
>> triple per one call.
>>
>> We provides below 2 patches to address the issue:
>
> Patch 1 looks reasonable. Patch number 2 I'm less keen on, since patch 1
> should suffice? Also I think regular re-sync across CPUs is a good idea
> anyway.

Here is the average of 100 cycle-skew measurements on one Core 2 Quad
machine (in cycles):
1) TSC-sync:            1300
2) TSC-sync+tsc1.patch: 400
3) without TSC-sync:    200 (i.e. sync at boot time only)

From 1) to 2), the cycle skew improves a lot. However, the Linux
kernel's logic for checking TSC sync (check_tsc_warp) is very strict, so
even with tsc1.patch there is still a chance that the check fails
inside the VM.
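
To illustrate why a few hundred cycles of skew can still trip a warp
check, here is a minimal single-threaded sketch in C. It is a simplified
model, not the kernel's check_tsc_warp code: the real check runs
concurrently on two CPUs under a lock. The names sim_t, read_tsc and
max_warp, and the "cost" parameter (cycles that pass between one CPU's
read and the next CPU's read), are assumptions made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: each CPU's TSC is a shared "true" cycle counter
 * plus a fixed per-CPU offset (the skew between the two CPUs). */
typedef struct {
    uint64_t true_cycles;  /* simulated global time, in cycles */
    int64_t  offset[2];    /* per-CPU TSC offset, in cycles */
} sim_t;

/* Advance simulated time by `cost` cycles, then read this CPU's TSC. */
static uint64_t read_tsc(sim_t *s, int cpu, uint64_t cost)
{
    assert(cpu == 0 || cpu == 1);
    s->true_cycles += cost;
    return (uint64_t)((int64_t)s->true_cycles + s->offset[cpu]);
}

/* The two CPUs take turns reading their TSC. Every reading is compared
 * with the previous one, which came from the other CPU. A reading that
 * is smaller than its predecessor means time appeared to go backwards.
 * Returns the largest backwards jump seen (0 means no warp). */
static uint64_t max_warp(sim_t *s, int iterations, uint64_t cost)
{
    uint64_t last = 0, worst = 0;

    for (int i = 0; i < iterations; i++) {
        int cpu = i & 1;                      /* CPUs alternate */
        uint64_t now = read_tsc(s, cpu, cost);

        if (now < last && last - now > worst)
            worst = last - now;
        last = now;
    }
    return worst;
}
```

In this model a warp shows up only when the skew is larger than the
time it takes to hand over from one CPU to the other: with a skew of
400 cycles and a hand-over cost of 100 cycles, max_warp() reports 300,
while a skew of 50 cycles with the same cost reports 0.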

Improving further to reach the level of 3), e.g. by taking care
of cache consistency among CPUs, would add more overhead. And
considering that the function is called every second, we are hesitant
to do this. What do you think? :)

Thanks,
xiaowei


Keir Fraser

2009-Feb-24 12:10 UTC

Re: [Xen-devel] [PATCH] Fix performance issue brought by TSC-sync logic

On 23/02/2009 22:33, "Yang, Xiaowei" <xiaowei.yang@intel.com> wrote:
> For further improvement to reach the effect of 3), e.g. by taking care
> of cache consistance amongs CPUs, there will be more overhead. And
> considering the function is called per second, we are hesitating to do
> this. What's your idea?:)
Maybe we should see what that overhead is... Also we may not really need to
run that function every second.

 -- Keir




Yang, Xiaowei

2009-Feb-25 10:26 UTC

Re: [Xen-devel] [PATCH] Fix performance issue brought by TSC-sync logic

Keir Fraser wrote:
> On 23/02/2009 22:33, "Yang, Xiaowei" <xiaowei.yang@intel.com> wrote:
>
>> For further improvement to reach the effect of 3), e.g. by taking care
>> of cache consistance amongs CPUs, there will be more overhead. And
>> considering the function is called per second, we are hesitating to do
>> this. What's your idea?:)
>
> Maybe we should see what that overhead is... Also we may not really need to
> run that function every second.
>
>  -- Keir

I measured the overhead of time_calibration_rendezvous() on my machine:
it is around 5-6k TSC cycles. I also made another patch that adds a
loop. With it, the cycle skew introduced by TSC-sync is gone. And, a bit
surprisingly, the expected extra overhead is not even noticeable :)
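
The attached patch is not part of this archive, so the following is
only a rough user-space sketch in C of what "adding a loop" to a
rendezvous can look like, not the actual Xen change. POSIX threads
stand in for CPUs, a store to an array stands in for the TSC write, and
all names (NCPUS, ROUNDS, cpu_thread, run_rendezvous, master_stamp) are
assumptions made up for this example. The idea is that every CPU passes
through the same barrier several times, and only the last, cache-warm
pass performs the write.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define NCPUS  4   /* hypothetical number of CPUs (threads here) */
#define ROUNDS 4   /* warm-up passes plus the final pass */

static atomic_int arrived;         /* slaves that reached the current round */
static atomic_int round_no;        /* last round released by the master */
static uint64_t   master_stamp;    /* value published by the master */
static uint64_t   written[NCPUS];  /* stand-in for each CPU's TSC write */

static void *cpu_thread(void *arg)
{
    int cpu = (int)(intptr_t)arg;

    for (int r = 1; r <= ROUNDS; r++) {
        if (cpu == 0) {
            /* Master: wait for every slave, then release the round. */
            while (atomic_load(&arrived) != NCPUS - 1)
                sched_yield();
            atomic_store(&arrived, 0);
            if (r == ROUNDS)
                master_stamp = 123456789;  /* placeholder reference value */
            atomic_store(&round_no, r);
        } else {
            /* Slave: announce arrival, then wait for the release. */
            atomic_fetch_add(&arrived, 1);
            while (atomic_load(&round_no) != r)
                sched_yield();
        }
        /* Only the final pass performs the "write". */
        if (r == ROUNDS)
            written[cpu] = master_stamp;
    }
    return NULL;
}

/* Runs one looped rendezvous; returns 0 if every CPU wrote the stamp. */
static int run_rendezvous(void)
{
    pthread_t t[NCPUS];

    atomic_store(&arrived, 0);
    atomic_store(&round_no, 0);
    for (int i = 0; i < NCPUS; i++) {
        int rc = pthread_create(&t[i], NULL, cpu_thread,
                                (void *)(intptr_t)i);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NCPUS; i++)
        pthread_join(t[i], NULL);
    for (int i = 0; i < NCPUS; i++)
        if (written[i] != master_stamp)
            return -1;
    return 0;
}
```

Calling run_rendezvous() should return 0 once all threads have gone
through every round. On most toolchains the sketch needs to be built
with -pthread.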

The new patch is attached.

Thanks,
Xiaowei

