Dan Magenheimer
2009-Oct-02 17:50 UTC
[Xen-devel] [RFC] Correct/fast timestamping in apps under Xen [0 of 4]
Over the last few weeks and in a number of xen-devel
threads, I've been trying to determine a solution
that would allow apps running on Xen to obtain timestamp
information (i.e. rdtsc) in a way which is both correct
and "as fast as possible" in a wide range of hardware
and software conditions. This is important because:
(a) some apps need to obtain a timestamp up to a hundred
thousand times/second or more; (b) direct ("native")
use of the rdtsc instruction in a Xen environment
can lead to extremely subtle bugs with potentially
extreme consequences up to and including data corruption
("correctness" issues); and (c) alternate approaches
for obtaining timestamp information are too
slow (emulation, rdtsc exiting, OS intrinsics).
Thanks very much to the ideas and guidance from
many of you, I think I have a reasonable plan for
a design and implementation. However, it is very
dependent on a few premises which I will summarize
briefly here and then expand on below. Since
the premises introduce both new concepts and
philosophies for Xen and some dependencies
on processor/server capabilities not widely known
or understood (or, for historical reasons, trusted),
and since a complete solution depends on ALL
the premises, I wanted to get review and comment
on ALL of them before commencing implementation
of any. I suspect the implementation will take
less time than the discussion generated ;-)
Further, Keir has stated that he doesn't want
isolated patches working in this general direction
until he sees the big picture.
Since the premises are diverse, and since it's
easy to lose interest in a string of responses unrelated
to one expert's area of interest, I will follow up
with a separate thread for each, but let me
briefly summarize them here first.
The premises are:
1) A large and growing percentage of servers running
Xen have a "reliable" TSC and Xen can determine
conclusively whether a server does or does not
have a reliable TSC.
2) A small but growing percentage of servers running
Xen implement the rdtscp instruction but Xen does
not and will not expose this instruction to guest
OSes.
3) Xen is able to track the "incarnation" number for
a guest. This number will increase whenever a
guest is restored or migrated (and possibly more
frequently). Optionally, an administrator can
explicitly mark a guest as "landlocked", disallowing
save/restore/migration for that guest.
4) Apps can become "virtualization aware" in that
they can access certain information directly
from Xen utilizing an OS-independent mechanism.
This information includes not only "Am I running
on Xen?" but also, for example, "Is TSC reliable
on this physical machine?", "Is rdtsc emulated
or native on this virtual machine?", "What is
the current incarnation number for this virtual
machine?", "Is this virtual machine landlocked?",
"What are the pvclock parameters for this
virtual machine?", etc.
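As a rough illustration of the first question only ("Am I running on Xen?"), the sketch below checks the hypervisor CPUID leaf for the "XenVMMXenVMM" signature. It assumes an x86 guest in which the cpuid instruction is visible to user space (typically HVM) and a GCC/Clang toolchain providing <cpuid.h>; the helper names are illustrative, it does not scan for a relocated leaf base, and it answers none of the other questions listed above.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang wrapper macros for the cpuid instruction */
#endif

/* "XenVMMXenVMM" as it appears in EBX, ECX, EDX of the hypervisor leaf. */
#define XEN_SIG_EBX 0x566e6558u  /* "XenV" */
#define XEN_SIG_ECX 0x65584d4du  /* "MMXe" */
#define XEN_SIG_EDX 0x4d4d566eu  /* "nVMM" */
#define HYPERVISOR_LEAF_BASE 0x40000000u

/* True if the three registers spell out the Xen signature. */
bool is_xen_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    return ebx == XEN_SIG_EBX && ecx == XEN_SIG_ECX && edx == XEN_SIG_EDX;
}

/* Assumption: probes only the default leaf base; returns false elsewhere. */
bool running_on_xen(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    __cpuid(HYPERVISOR_LEAF_BASE, eax, ebx, ecx, edx);
    (void)eax;  /* maximum hypervisor leaf, unused in this sketch */
    return is_xen_signature(ebx, ecx, edx);
#else
    return false;  /* non-x86 builds: no probe attempted */
#endif
}
```

Splitting the comparison from the probe keeps the signature check usable on any platform, while the probe itself stays behind the architecture guard.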
I'm not trying to be coy... it's probably not
difficult to deduce the proposal from the above
and previous posts. I just want to focus first
on the validity of the premises.
If you have comment on any specific premise, please
reply to the subsequent [x of 4] message. If
you have comment on the whole direction/philosophy
(though I hope we've already beaten that to death :-),
please reply to this [0 of 4] message.
Thanks,
Dan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Oct-02 19:00 UTC
Re: [Xen-devel] [RFC] Correct/fast timestamping in apps under Xen [0 of 4]
On 10/02/09 10:50, Dan Magenheimer wrote:
> The premises are:
>
> 1) A large and growing percentage of servers running
> Xen have a "reliable" TSC and Xen can determine
> conclusively whether a server does or does not
> have a reliable TSC.
> 2) A small but growing percentage of servers running
> Xen implement the rdtscp instruction but Xen does
> not and will not expose this instruction to guest
> OSes.
> 3) Xen is able to track the "incarnation" number for
> a guest. This number will increase whenever a
> guest is restored or migrated (and possibly more
> frequently). Optionally, an administrator can
> explicitly mark a guest as "landlocked", disallowing
> save/restore/migration for that guest.
> 4) Apps can become "virtualization aware" in that
> they can access certain information directly
> from Xen utilizing an OS-independent mechanism.
> This information includes not only "Am I running
> on Xen?" but also, for example, "Is TSC reliable
> on this physical machine?", "Is rdtsc emulated
> or native on this virtual machine?", "What is
> the current incarnation number for this virtual
> machine?", "Is this virtual machine landlocked?",
> "What are the pvclock parameters for this
> virtual machine?", etc.
>
I think you're missing a couple of premises here:
0) It is not possible to update the existing APIs and ABIs available
to applications to meet the most demanding performance requirements.
5) There will be enough important applications (ie, broadly used,
rather than a few in-house apps) whose developers are willing to
update them to your new proposed ABI to justify adding and
maintaining these new ABIs.
Without discussing these, you're presupposing your solution is necessary.
J
Dan Magenheimer
2009-Oct-02 20:33 UTC
RE: [Xen-devel] [RFC] Correct/fast timestamping in apps under Xen [0 of 4]
> I think you're missing a couple of premises here:
>
> 0) It is not possible to update the existing APIs and
> ABIs available to applications to meet the most demanding
> performance requirements.
I think this was covered nicely in the first paragraph
where I said "in a wide range of hardware and software
conditions."
> 5) There will be enough important applications (ie, broadly used,
> rather than a few in-house apps) whose developers are willing to
> update them to your new proposed ABI to justify adding and
> maintaining these new ABIs.
I guess you'll have to trust me on that one (sez Dan
at ORACLE.com) ;-)
Jeremy, I've heard that people are tiring of reading
all of our public jousting around this topic (though I've
found it useful in many ways), and I'm trying to make
forward progress here, so I'd prefer to hear from other
reviewers or at least I'd appreciate more constructive
feedback.
Thanks,
Dan
> -----Original Message-----
> From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
> Sent: Friday, October 02, 2009 1:00 PM
> To: Dan Magenheimer
> Cc: Xen-Devel (E-mail); Kurt Hackel; Ian Pratt; Keir Fraser
> Subject: Re: [Xen-devel] [RFC] Correct/fast timestamping in apps under
> Xen [0 of 4]
Dan Magenheimer
2009-Oct-05 15:56 UTC
[Xen-devel] RE: [RFC] Correct/fast timestamping in apps under Xen [0 of 4]
OK, here's some pseudo-code to describe the
decision tree an app would go through -- once
at startup -- to probe the environment and
determine which timestamp mechanism is the
best, meaning all of correct AND fastest
AND highest resolution. Also following
that is the code an app would use whenever
a timestamp is needed (could be a static
inline function to avoid function call
overhead).
Remember that this is ONLY needed by apps
that must obtain a timestamp at a
high frequency, say >2K/core/sec. At lower
frequencies, TSC emulation or calling an
OS intrinsic should be sufficient.
In some environments (e.g. if/when pvclock+
vsyscall is implemented), an OS intrinsic
may be faster than emulation. However,
there is no way to probe this so, if necessary,
we determine which is fastest by measuring both.
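The "measure both" step could look roughly like the sketch below. It is only an illustration: it reports average nanoseconds per call rather than cycles, uses POSIX clock_gettime(CLOCK_MONOTONIC) as a stand-in for the OS intrinsic, and the names timestamp_fn, measure_avg_nsec and pick_faster are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef uint64_t (*timestamp_fn)(void);

/* CLOCK_MONOTONIC in nanoseconds; stands in for an OS intrinsic here. */
uint64_t os_intrinsic_get_nsec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average cost, in nanoseconds, of one call to fn over `iterations` calls. */
uint64_t measure_avg_nsec(timestamp_fn fn, unsigned iterations)
{
    volatile uint64_t sink = 0;  /* keeps the calls from being optimized out */
    uint64_t start, end;

    assert(iterations > 0);
    start = os_intrinsic_get_nsec();
    for (unsigned i = 0; i < iterations; i++)
        sink = fn();
    end = os_intrinsic_get_nsec();
    (void)sink;
    return (end - start) / iterations;
}

/* Return whichever candidate is cheaper per call; ties go to `a`. */
timestamp_fn pick_faster(timestamp_fn a, timestamp_fn b, unsigned iterations)
{
    uint64_t cost_a = measure_avg_nsec(a, iterations);
    uint64_t cost_b = measure_avg_nsec(b, iterations);
    return cost_a <= cost_b ? a : b;
}
```

An app would call pick_faster() once at startup with its emulated-rdtsc reader and its OS-intrinsic reader, using enough iterations to average out scheduling noise.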
The pvrdtscp mechanism, which must be proactively
configured for a guest, uses the rdtscp
instruction to return TSC and the guest
incarnation number (which changes infrequently).
If the app is currently running on hardware
which either does not support rdtscp or
doesn't have a reliable TSC, zero is
returned in place of the incarnation number
and Xen system time (in nsec) is returned
instead of TSC.
Legend: XEN_* indicates information obtained
by the app directly from Xen (using a TBD
mechanism). All other functions are in-app
calls.
Dan
===================
/* run once at app startup */
if (!running_on_xen())
    timestamp_mech = TS_NATIVE;
else if (XEN_guest_is_landlocked() && XEN_tsc_is_reliable()) {
    /* guest never moves and TSC is reliable: scale raw rdtsc */
    timestamp_mech = TS_RDTSC_SCALE;
    timestamp_scale = XEN_tsc_scale();
}
else if (XEN_guest_has_pvrdtscp())
    timestamp_mech = TS_PVRDTSCP;
else if (XEN_guest_tsc_emulated()) {
    /* cannot probe which is faster, so measure both */
    os_intr_cycles = measure_os_intrinsic();
    tsc_emul_cycles = measure_tsc_emulation();
    if (tsc_emul_cycles < os_intr_cycles)
        timestamp_mech = TS_RDTSC_NSEC;
    else
        timestamp_mech = TS_OS_INTRINSIC;
}
/* returns monotonically increasing timestamp (nsec) */
u64 get_nsec_timestamp(void)
{
    static u64 last = 0;
    u64 now;

    if (timestamp_mech == TS_NATIVE)
        now = native_timestamp();
    else if (timestamp_mech == TS_PVRDTSCP) {
        static u32 last_aux = 0;
        u32 this_aux;

        now = rdtscp(&this_aux);
        if (this_aux != 0) {
            /* incarnation changed: refetch scale and re-read TSC */
            while (this_aux != last_aux) {
                /* occurs only very rarely */
                last_aux = this_aux;
                timestamp_scale = XEN_tsc_scale();
                now = rdtscp(&this_aux);
            }
            now = ts_scale(timestamp_scale, now);
        }
        /* this_aux == 0 means was emulated and
           already have nsec */
    }
    else if (timestamp_mech == TS_RDTSC_NSEC)
        now = rdtsc();   /* emulated rdtsc already returns nsec */
    else if (timestamp_mech == TS_RDTSC_SCALE)
        now = ts_scale(timestamp_scale, rdtsc());
    else /* TS_OS_INTRINSIC */
        now = os_intrinsic_get_nsec();

    /* ensure monotonically increasing */
    if ((s64)(now - last) > 0)
        last = now;
    else
        now = ++last;
    return now;
}
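The pseudo-code leaves ts_scale() undefined. One plausible shape, sketched here under the assumption that the scale is expressed pvclock-style as a pre-shift plus a 32-bit binary fraction, is shown below. The struct ts_scale_params and its field names are hypothetical, and the unsigned __int128 type is a GCC/Clang extension available on 64-bit targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scale parameters: nsec = ((tsc << shift) * mul_frac) >> 32 */
struct ts_scale_params {
    uint32_t mul_frac;  /* multiplier as a 32-bit binary fraction */
    int8_t   shift;     /* pre-shift; negative values shift right */
};

/* Convert a raw TSC value (or TSC delta) to nanoseconds. */
uint64_t ts_scale(struct ts_scale_params p, uint64_t tsc)
{
    assert(p.shift > -64 && p.shift < 64);
    if (p.shift >= 0)
        tsc <<= p.shift;
    else
        tsc >>= -p.shift;
    /* widen to 128 bits so the 64x32-bit product cannot overflow */
    return (uint64_t)(((unsigned __int128)tsc * p.mul_frac) >> 32);
}
```

For example, a 2 GHz TSC ticks twice per nanosecond, so mul_frac = 0x80000000 (one half) with shift = 0 converts ticks to nsec.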