thr3ads.net - Xen devel - [Xen-devel] rdtsc strangeness on upstream kernel [Jul 2010]

If this information is useful, please help other people find it:
Share via:

Dan Magenheimer

2010-Jul-19 23:19 UTC

[Xen-devel] rdtsc strangeness on upstream kernel

Hi Jeremy --

In trying to understand some strange behavior I was
seeing on a RHEL6b2 guest, I ran across an interesting
anomaly, and it seems to be true on different upstream
(pvops) kernels.  At first I thought it was a result
of the xen_sched_clock->xen_clocksource_read patch
you recently posted, but after some testing this
appears to be unrelated.

The number of rdtsc/second goes up dramatically when
there is CPU-intensive load on an upstream kernel!
I know we both observed some cases where rdtsc/sec
was very high, but I don''t think we ever were able to
reproduce this consistently.

First, this is a single vcpu, 64-bit 2.6.32 (RHEL6b2)
kernel.

I am observing ~300 rdtsc/sec on an idle VM.  When
I run a load of:

main() {while(1);}

I am observing about 10000 rdtsc/sec!!

This is a CONFIG_HZ_1000=y kernel, so I would
expect 1000 rdtsc/sec, or maybe 2000 rdtsc/sec,
but 10000 makes me wonder if there is some hidden
bug.

Changing to vcpus=2, the above numbers roughly
double... ~600 goes up to 20000 rdtsc/sec.
So clearly it is some kind of per-vcpu activity.

Any idea what kernel parts might be doing this?
I can instrument the xen rdtsc code to collect
some more information, but thought I''d get ideas
from you first.

Thanks,
Dan

P.S. Lest anyone panic about this, a rate of
10000 rdtsc/sec/vcpu should translate to ~1%
performance.  Not a catastrophe but still worth
understanding.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jeremy Fitzhardinge

2010-Jul-19 23:34 UTC

head link

[Xen-devel] Re: rdtsc strangeness on upstream kernel

On 07/19/2010 04:19 PM, Dan Magenheimer wrote:> In trying to understand some strange behavior I was
> seeing on a RHEL6b2 guest, I ran across an interesting
> anomaly, and it seems to be true on different upstream
> (pvops) kernels.  At first I thought it was a result
> of the xen_sched_clock->xen_clocksource_read patch
> you recently posted, but after some testing this
> appears to be unrelated.
>
> The number of rdtsc/second goes up dramatically when
> there is CPU-intensive load on an upstream kernel!
>   
Are you looking at rdtsc emulation traps when running a PV domain?
> I know we both observed some cases where rdtsc/sec
> was very high, but I don''t think we ever were able to
> reproduce this consistently.
>   
It would be interesting to compare that to the context switch rate (cs
column in vmstat output) to see if they correlate.

Also, how does it relate to timer interrupts?
> First, this is a single vcpu, 64-bit 2.6.32 (RHEL6b2)
> kernel.
>
> I am observing ~300 rdtsc/sec on an idle VM.  When
> I run a load of:
>
> main() {while(1);}
>
> I am observing about 10000 rdtsc/sec!!
>
> This is a CONFIG_HZ_1000=y kernel, so I would
> expect 1000 rdtsc/sec, or maybe 2000 rdtsc/sec,
> but 10000 makes me wonder if there is some hidden
> bug.
>   
Do you have preemption running?  And why HZ=1000?

Thanks,
    J

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Dan Magenheimer

2010-Jul-19 23:51 UTC

head link

[Xen-devel] RE: rdtsc strangeness on upstream kernel

> > The number of rdtsc/second goes up dramatically when
> > there is CPU-intensive load on an upstream kernel!
> 
> Are you looking at rdtsc emulation traps when running a PV domain?
Yes, with "for i in {0..99}; do xm debug-key s; sleep 1; done".
 > > I know we both observed some cases where rdtsc/sec
> > was very high, but I don''t think we ever were able to
> > reproduce this consistently.
> 
> It would be interesting to compare that to the context switch rate (cs
> column in vmstat output) to see if they correlate.
Doesn''t appear to be related.  Vmstat continues to show
cs at about 40/sec while the rdtsc/sec is very high.
> Also, how does it relate to timer interrupts?
With vcpus=2, timer0 goes up to about 2000 irq/sec on CPU0
and timer1 on CPU1 is the same while rdtsc/sec is about 20000.
Is it possible that the timer interrupt is using rdtsc
five times each interrupt?
> > First, this is a single vcpu, 64-bit 2.6.32 (RHEL6b2)
> > kernel.
> >
> > I am observing ~300 rdtsc/sec on an idle VM.  When
> > I run a load of:
> >
> > main() {while(1);}
> >
> > I am observing about 10000 rdtsc/sec!!
> >
> > This is a CONFIG_HZ_1000=y kernel, so I would
> > expect 1000 rdtsc/sec, or maybe 2000 rdtsc/sec,
> > but 10000 makes me wonder if there is some hidden
> > bug.
> 
> Do you have preemption running?  And why HZ=1000?
CONFIG_PREEMPT is off, though there are some other
PREEMPT-related configs that are on.  Are any others
relevant?

HZ=1000 is because this is a straight RHEL6b2 kernel,
but running as a PV guest.

Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Dan Magenheimer

2010-Jul-19 23:55 UTC

head link

[Xen-devel] RE: rdtsc strangeness on upstream kernel

> With vcpus=2, timer0 goes up to about 2000 irq/sec on CPU0
> and timer1 on CPU1 is the same while rdtsc/sec is about 20000.
No, wait, correct that...timer0 and timer1 are 1000 irq/sec.
(I was using "watch -d" which defaults to updating every
two seconds, not every second.)
> Is it possible that the timer interrupt is using rdtsc
> five times each interrupt?
TEN times each interrupt.
> -----Original Message-----
> From: Dan Magenheimer
> Sent: Monday, July 19, 2010 5:51 PM
> To: Jeremy Fitzhardinge
> Cc: xen-devel@lists.xensource.com
> Subject: RE: rdtsc strangeness on upstream kernel
> 
> > > The number of rdtsc/second goes up dramatically when
> > > there is CPU-intensive load on an upstream kernel!
> >
> > Are you looking at rdtsc emulation traps when running a PV domain?
> 
> Yes, with "for i in {0..99}; do xm debug-key s; sleep 1; done".
> 
> > > I know we both observed some cases where rdtsc/sec
> > > was very high, but I don''t think we ever were able to
> > > reproduce this consistently.
> >
> > It would be interesting to compare that to the context switch rate
> (cs
> > column in vmstat output) to see if they correlate.
> 
> Doesn''t appear to be related.  Vmstat continues to show
> cs at about 40/sec while the rdtsc/sec is very high.
> 
> > Also, how does it relate to timer interrupts?
> 
> With vcpus=2, timer0 goes up to about 2000 irq/sec on CPU0
> and timer1 on CPU1 is the same while rdtsc/sec is about 20000.
> Is it possible that the timer interrupt is using rdtsc
> five times each interrupt?
> 
> > > First, this is a single vcpu, 64-bit 2.6.32 (RHEL6b2)
> > > kernel.
> > >
> > > I am observing ~300 rdtsc/sec on an idle VM.  When
> > > I run a load of:
> > >
> > > main() {while(1);}
> > >
> > > I am observing about 10000 rdtsc/sec!!
> > >
> > > This is a CONFIG_HZ_1000=y kernel, so I would
> > > expect 1000 rdtsc/sec, or maybe 2000 rdtsc/sec,
> > > but 10000 makes me wonder if there is some hidden
> > > bug.
> >
> > Do you have preemption running?  And why HZ=1000?
> 
> CONFIG_PREEMPT is off, though there are some other
> PREEMPT-related configs that are on.  Are any others
> relevant?
> 
> HZ=1000 is because this is a straight RHEL6b2 kernel,
> but running as a PV guest.
> 
> Dan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jeremy Fitzhardinge

2010-Jul-20 00:34 UTC

head link

Re: [Xen-devel] RE: rdtsc strangeness on upstream kernel

On 07/19/2010 04:55 PM, Dan Magenheimer wrote:>> With vcpus=2, timer0 goes up to about 2000 irq/sec on CPU0
>> and timer1 on CPU1 is the same while rdtsc/sec is about 20000.
>>     
> No, wait, correct that...timer0 and timer1 are 1000 irq/sec.
> (I was using "watch -d" which defaults to updating every
> two seconds, not every second.)
>
>   
>> Is it possible that the timer interrupt is using rdtsc
>> five times each interrupt?
>>     
> TEN times each interrupt.
>   
Do you have any other debugging/tracing enabled?  CONFIG_XEN_DEBUG_FS
for example?

10 sounds a little high, but not necessarily completely unreasonable,
especially if there''s tracing/debugging on.

    J

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Dan Magenheimer

2010-Jul-20 00:42 UTC

head link

RE: [Xen-devel] RE: rdtsc strangeness on upstream kernel

> On 07/19/2010 04:55 PM, Dan Magenheimer wrote:
> >> With vcpus=2, timer0 goes up to about 2000 irq/sec on CPU0
> >> and timer1 on CPU1 is the same while rdtsc/sec is about 20000.
> >>
> > No, wait, correct that...timer0 and timer1 are 1000 irq/sec.
> > (I was using "watch -d" which defaults to updating every
> > two seconds, not every second.)
> >
> >
> >> Is it possible that the timer interrupt is using rdtsc
> >> five times each interrupt?
> >>
> > TEN times each interrupt.
> >
> 
> Do you have any other debugging/tracing enabled?  CONFIG_XEN_DEBUG_FS
> for example?
> 
> 10 sounds a little high, but not necessarily completely unreasonable,
> especially if there''s tracing/debugging on.
Yes, CONFIG_XEN_DEBUG_FS is on.  I dug up some instrumentation
I used in the hypervisor last fall which should identify the
kernel locations of the rdtsc''s.  I''ll plug that in and
post the results tomorrow.

Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jeremy Fitzhardinge

2010-Jul-20 00:44 UTC

head link

Re: [Xen-devel] RE: rdtsc strangeness on upstream kernel

On 07/19/2010 05:42 PM, Dan Magenheimer wrote:> Yes, CONFIG_XEN_DEBUG_FS is on.  I dug up some instrumentation
> I used in the hypervisor last fall which should identify the
> kernel locations of the rdtsc''s.  I''ll plug that in and
> post the results tomorrow.
>   
They''re probably all in the same place, so you''ll need to get
some
calling stack to make sense of it.

    J

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Dan Magenheimer

2010-Jul-23 16:47 UTC

head link

RE: [Xen-devel] RE: rdtsc strangeness on upstream kernel

> On 07/19/2010 05:42 PM, Dan Magenheimer wrote:
> > Yes, CONFIG_XEN_DEBUG_FS is on.  I dug up some instrumentation
> > I used in the hypervisor last fall which should identify the
> > kernel locations of the rdtsc''s.  I''ll plug that in
and
> > post the results tomorrow.
> 
> They''re probably all in the same place, so you''ll need to
get some
> calling stack to make sense of it.
> 
>     J
Sorry this took awhile.  I''ve been hacking and instrumenting and
measuring and the results are not very good.

I''m cc''ing Andrew Jones since things look worse on stock
RHEL6b2.

First, I''ve measured on both PV 2.6.35-rc4 ("upstream") and
PV
2.6.32-37 ("rhel6b2").  I''ve attached the patch for rhel6b2
that
I used for measurement; the same code fragments were used for
upstream.

Rhel6b2 had HZ=1000 and had vcpus=2.  Upstream had HZ=250 and vcpus=1.

Both had the "xen_sched_clock" call changed to xen_clocksource_read.

The workload is simply "main() {while(1); }" run in background.
On vcpus=2, I ran two in background.

Both had CONFIG_XEN_FS_DEBUG set, but the workload should encounter
few or no kernel spinlocks and the measurements confirmed this.
Rhel6b2 had HIGH_RES_TIMERS and SCHED_HRTICK set; they were not
set on upstream... don''t know if this matters.

BOTTOM LINE: A cpu-loaded vcpus=2 HZ=1000 rhel6b2 guest had close to
20000 rdtsc/sec.  Unloaded, the same guest had about 600 rdtsc/sec.
A cpu-loaded HZ=250 upstream guest had about 4000 rdtsc/sec,
regardless of whether it was loaded or not.

Breakdown: I instrumented several of the primary sources of calls
to native_read_tsc.  No rdtsc instructions occurred, except the
one in native_read_tsc.  (Some small number are binary translated
to call native_read_tsc.)

The most frequent users of rdtsc:
(A) sched_clock via xen_clocksource_read->pvclock_clocksource_read->
                   pvclock_nsec_offset->native_read_tsc
(B) clock->read via xen_clocksource_read
(C) set_next_event via xen_timerop_clockevent

I measured (running the workload overnight) how frequently each
of these paths resulted in a call to native_read_tsc.  Each
of these functions go through this path exactly once so this
is also a frequency (per second) count for the callers.

ALL 4672..18689
(A) 2149...8428 (~8*HZ)
(B) 1350...9456 (~(4*HZ+?)*vcpus)
(C)  524...2102 (~2*HZ)

The measurement code outputs frequency every 2**20 samples
and I did a cursory review to verify that the frequency
samples were consistent. There is some small variance,
which accounts for the fact that the numbers above (which
were from one sample) don''t quite add up.  So please
consider them "trend indicators" rather than exact measurements.
I also did verify the total rdtsc usage counts against the
"xm debug-key s" output from the Xen console.

Conclusions/questions:
1) Frequency of rdtsc use is much higher than expected, especially
   when HZ=1000.  Sadly, the divider= kernel parameter may still
   be necessary for RHEL6b2 on older machines where Xen must do
   rdtsc emulation... and for any guest that has migrated (or will
   migrate?)... unless this high frequency identifies a fixable bug.
2) There is some fix in rhel6b2 that hides this when vcpus are inactive.
   Should this go upstream so idle guests don''t consume so much CPU?
3) Why is sched_clock called at 8*HZ?!? Why is clock->read called at
   4*HZ (on every processor)? !? Why is set_next_event called
   at 2*HZ?!?  Even with hardware rdtsc, this seems like a potential
   (though admittedly small) performance issue.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Xen devel - Jul 2010 - rdtsc strangeness on upstream kernel

[Xen-devel] rdtsc strangeness on upstream kernel

[Xen-devel] Re: rdtsc strangeness on upstream kernel

[Xen-devel] RE: rdtsc strangeness on upstream kernel

[Xen-devel] RE: rdtsc strangeness on upstream kernel

Re: [Xen-devel] RE: rdtsc strangeness on upstream kernel

RE: [Xen-devel] RE: rdtsc strangeness on upstream kernel

Re: [Xen-devel] RE: rdtsc strangeness on upstream kernel

RE: [Xen-devel] RE: rdtsc strangeness on upstream kernel