Dan Magenheimer
2009-Oct-02 17:50 UTC
[Xen-devel] [RFC] Correct/fast timestamping in apps under Xen [0 of 4]
Over the last few weeks and in a number of xen-devel threads, I've been trying to determine a solution that would allow apps running on Xen to obtain timestamp information (i.e. rdtsc) in a way which is both correct and "as fast as possible" in a wide range of hardware and software conditions. This is important because: (a) some apps need to obtain a timestamp up to a hundred thousand times/second or more; (b) direct ("native") use of the rdtsc instruction in a Xen environment can lead to extremely subtle bugs with potentially extreme consequences up to and including data corruption ("correctness" issues); and (c) alternate approaches for obtaining timestamp information are too slow (emulation, rdtsc exiting, OS intrinsics).

Thanks very much to the ideas and guidance from many of you, I think I have a reasonable plan for a design and implementation. However, it is very dependent on a few premises which I will summarize briefly here and then expand on below. Since the premises introduce both new concepts and philosophies for Xen and some dependencies on processor/server capabilities not widely known or understood (or, for historical reasons, trusted), and since a complete solution depends on ALL the premises, I wanted to get review and comment on ALL of them before commencing implementation of any. I suspect the implementation will take less time than the discussion generated ;-) Further, Keir has stated that he doesn't want isolated patches working in this general direction until he sees the big picture.

Since the premises are diverse, and since it's easy to lose interest in a string of responses unrelated to one expert's area of interest, I will follow up with a separate thread for each, but let me briefly summarize them here first.

The premises are:

1) A large and growing percentage of servers running Xen have a "reliable" TSC and Xen can determine conclusively whether a server does or does not have a reliable TSC.
2) A small but growing percentage of servers running Xen implement the rdtscp instruction but Xen does not and will not expose this instruction to guest OSes.
3) Xen is able to track the "incarnation" number for a guest. This number will increase whenever a guest is restored or migrated (and possibly more frequently). Optionally, an administrator can explicitly mark a guest as "landlocked", disallowing save/restore/migration for that guest.
4) Apps can become "virtualization aware" in that they can access certain information directly from Xen utilizing an OS-independent mechanism. This information includes not only "Am I running on Xen?" but also, for example, "Is TSC reliable on this physical machine?", "Is rdtsc emulated or native on this virtual machine?", "What is the current incarnation number for this virtual machine?", "Is this virtual machine landlocked?", "What are the pvclock parameters for this virtual machine?", etc.

I'm not trying to be coy... it's probably not difficult to deduce the proposal from the above and previous posts. I just want to focus first on the validity of the premises.

If you have comment on any specific premise, please reply to the subsequent [x of 4] message. If you have comment on the whole direction/philosophy (though I hope we've already beaten that to death :-), please reply to this [0 of 4] message.

Thanks,
Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
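As a rough illustration of the kind of probing behind premises 1, 2 and 4, here is a minimal sketch in C that decodes CPUID register values. It relies only on architectural facts: CPUID leaf 0x80000001 EDX bit 27 advertises rdtscp, leaf 0x80000007 EDX bit 8 advertises an invariant TSC, and a Xen HVM guest sees the signature "XenVMMXenVMM" in EBX/ECX/EDX of the hypervisor leaf (normally 0x40000000). The struct and function names are hypothetical, invented for this sketch. Note also that the invariant-TSC bit is only what the hardware advertises, which may be narrower than what Xen would call a "reliable" TSC, and that this is not the OS-independent mechanism proposed in premise 4, which is still to be defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Register values returned by one CPUID invocation (hypothetical helper type). */
struct cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* CPUID leaf 0x80000001, EDX bit 27: the rdtscp instruction is available. */
static inline bool cpu_has_rdtscp(const struct cpuid_regs *leaf_80000001)
{
    assert(leaf_80000001 != NULL);
    return ((leaf_80000001->edx >> 27) & 1u) != 0;
}

/* CPUID leaf 0x80000007, EDX bit 8: TSC runs at a constant rate in all
 * power states ("invariant TSC"). This is a hardware claim only. */
static inline bool cpu_has_invariant_tsc(const struct cpuid_regs *leaf_80000007)
{
    assert(leaf_80000007 != NULL);
    return ((leaf_80000007->edx >> 8) & 1u) != 0;
}

/* Hypervisor leaf: EBX, ECX, EDX carry a 12-byte vendor signature,
 * least significant byte first. Xen reports "XenVMMXenVMM". */
static inline bool signature_is_xen(const struct cpuid_regs *hv_leaf)
{
    char sig[13];
    const uint32_t regs[3] = { hv_leaf->ebx, hv_leaf->ecx, hv_leaf->edx };

    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            sig[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xffu);
    sig[12] = '\0';
    return strcmp(sig, "XenVMMXenVMM") == 0;
}
```

On GCC or Clang the register values could be filled in with `__get_cpuid()` from `<cpuid.h>`. Keeping the decoding separate from the instruction itself makes the logic testable on any machine.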
Jeremy Fitzhardinge
2009-Oct-02 19:00 UTC
Re: [Xen-devel] [RFC] Correct/fast timestamping in apps under Xen [0 of 4]
On 10/02/09 10:50, Dan Magenheimer wrote:
> The premises are:
>
> 1) A large and growing percentage of servers running
>    Xen have a "reliable" TSC and Xen can determine
>    conclusively whether a server does or does not
>    have a reliable TSC.
> 2) A small but growing percentage of servers running
>    Xen implement the rdtscp instruction but Xen does
>    not and will not expose this instruction to guest
>    OSes.
> 3) Xen is able to track the "incarnation" number for
>    a guest. This number will increase whenever a
>    guest is restored or migrated (and possibly more
>    frequently). Optionally, an administrator can
>    explicitly mark a guest as "landlocked", disallowing
>    save/restore/migration for that guest.
> 4) Apps can become "virtualization aware" in that
>    they can access certain information directly
>    from Xen utilizing an OS-independent mechanism.
>    This information includes not only "Am I running
>    on Xen?" but also, for example, "Is TSC reliable
>    on this physical machine?", "Is rdtsc emulated
>    or native on this virtual machine?", "What is
>    the current incarnation number for this virtual
>    machine?", "Is this virtual machine landlocked?",
>    "What are the pvclock parameters for this
>    virtual machine?", etc.
>

I think you're missing a couple of premises here:

0) It is not possible to update the existing APIs and ABIs available to applications to meet the most demanding performance requirements.

5) There will be enough important applications (ie, broadly used, rather than a few in-house apps) whose developers are willing to update them to your new proposed ABI to justify adding and maintaining these new ABIs.

Without discussing these, you're presupposing your solution is necessary.

    J
Dan Magenheimer
2009-Oct-02 20:33 UTC
RE: [Xen-devel] [RFC] Correct/fast timestamping in apps under Xen [0 of 4]
> I think you're missing a couple of premises here:
>
> 0) It is not possible to update the existing APIs and
>    ABIs available to applications to meet the most demanding
>    performance requirements.

I think this was covered nicely in the first paragraph where I said "in a wide range of hardware and software conditions."

> 5) There will be enough important applications (ie, broadly used,
>    rather than a few in-house apps) whose developers are willing to
>    update them to your new proposed ABI to justify adding and
>    maintaining these new ABIs.

I guess you'll have to trust me on that one (sez Dan at ORACLE.com) ;-)

Jeremy, I've heard that people are tiring of reading all of our public jousting around this topic (though I've found it useful in many ways), and I'm trying to make forward progress here, so I'd prefer to hear from other reviewers or at least I'd appreciate more constructive feedback.

Thanks,
Dan
Dan Magenheimer
2009-Oct-05 15:56 UTC
[Xen-devel] RE: [RFC] Correct/fast timestamping in apps under Xen [0 of 4]
OK, here's some pseudo-code to describe the decision tree an app would go through -- once at startup -- to probe the environment and determine which timestamp mechanism is the best, meaning all of correct AND fastest AND highest resolution. Following that is the code an app would use whenever a timestamp is needed (could be a static inline function to avoid function call overhead).

Remember that this is ONLY needed by apps that must obtain a timestamp at a high frequency, say >2K/core/sec. At lower frequencies, TSC emulation or calling an OS intrinsic should be sufficient.

In some environments (e.g. if/when pvclock+vsyscall is implemented), an OS intrinsic may be faster than emulation. However, there is no way to probe this so, if necessary, we determine which is fastest by measuring both.

The pvrdtscp mechanism, which must be proactively configured for a guest, uses the rdtscp instruction to return TSC and the guest incarnation number (which changes infrequently). If the app is currently running on hardware which either does not support rdtscp or doesn't have a reliable TSC, zero is returned in place of the incarnation number and Xen system time (in nsec) is returned instead of TSC.

Legend: XEN_* indicates information obtained by the app directly from Xen (using a TBD mechanism). All other functions are in-app calls.
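The "measure both" step can be sketched roughly as follows. This is only an illustration under stated assumptions: it times the candidates with POSIX `clock_gettime(CLOCK_MONOTONIC)` and compares nanoseconds rather than cycles, and the names `ts_fn`, `monotonic_nsec`, `measure_cost_nsec` and `pick_faster` are hypothetical, not part of any Xen or OS interface. The two enum values mirror the mechanism names used in the pseudo-code in this message.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* The two candidates that cannot be told apart by probing alone. */
enum ts_mech { TS_RDTSC_NSEC, TS_OS_INTRINSIC };

/* Any function that returns a timestamp (hypothetical signature). */
typedef uint64_t (*ts_fn)(void);

/* Reference clock used only for the measurement itself. */
static inline uint64_t monotonic_nsec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Total elapsed nanoseconds for `calls` back-to-back invocations of fn. */
static inline uint64_t measure_cost_nsec(ts_fn fn, unsigned calls)
{
    volatile uint64_t sink = 0;   /* keeps the calls from being optimized away */
    uint64_t start;

    assert(fn != NULL && calls > 0);
    start = monotonic_nsec();
    for (unsigned i = 0; i < calls; i++)
        sink += fn();
    (void)sink;
    return monotonic_nsec() - start;
}

/* Same rule as the pseudo-code: emulated rdtsc wins only if strictly cheaper. */
static inline enum ts_mech pick_faster(uint64_t tsc_emul_cost, uint64_t os_intr_cost)
{
    return tsc_emul_cost < os_intr_cost ? TS_RDTSC_NSEC : TS_OS_INTRINSIC;
}
```

An app would call `measure_cost_nsec()` once per candidate at startup, with enough iterations to swamp the resolution of the reference clock, and pass both totals to `pick_faster()`.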
Dan

===================

/* run once at app startup */
if (!running_on_xen())
    timestamp_mech = TS_NATIVE;
else if (XEN_guest_is_landlocked() && XEN_tsc_is_reliable()) {
    /* guest never moves and TSC is reliable: raw rdtsc, scaled to nsec */
    timestamp_mech = TS_RDTSC_SCALE;
    timestamp_scale = XEN_tsc_scale();
} else if (XEN_guest_has_pvrdtscp())
    timestamp_mech = TS_PVRDTSCP;
else if (XEN_guest_tsc_emulated()) {
    /* cannot probe which is faster, so measure both */
    os_intr_cycles = measure_os_intrinsic();
    tsc_emul_cycles = measure_tsc_emulation();
    if (tsc_emul_cycles < os_intr_cycles)
        timestamp_mech = TS_RDTSC_NSEC;
    else
        timestamp_mech = TS_OS_INTRINSIC;
}

/* returns monotonically increasing timestamp (nsec) */
u64 get_nsec_timestamp(void)
{
    static u64 last = 0;
    u64 now;

    if (timestamp_mech == TS_NATIVE)
        now = native_timestamp();
    else if (timestamp_mech == TS_PVRDTSCP) {
        static u32 last_aux = 0;
        u32 this_aux;

        now = rdtscp(&this_aux);
        if (this_aux != 0) {
            /* incarnation changed: refetch scale and reread TSC */
            while (this_aux != last_aux) {
                /* occurs only very rarely */
                last_aux = this_aux;
                timestamp_scale = XEN_tsc_scale();
                now = rdtscp(&this_aux);
            }
            now = ts_scale(timestamp_scale, now);
        }
        /* this_aux == 0 means was emulated and already have nsec */
    } else if (timestamp_mech == TS_RDTSC_NSEC)
        now = rdtsc();
    else if (timestamp_mech == TS_RDTSC_SCALE)
        now = ts_scale(timestamp_scale, rdtsc());
    else /* TS_OS_INTRINSIC */
        now = os_intrinsic_get_nsec();

    /* ensure monotonically increasing */
    if ((s64)(now - last) > 0)
        last = now;
    else
        now = ++last;
    return now;
}
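The pseudo-code leaves `ts_scale()` and the form of `timestamp_scale` unspecified. One plausible shape, sketched here purely as an assumption, follows the pvclock convention of a 32-bit fractional multiplier plus a signed shift (the `tsc_to_system_mul` and `tsc_shift` fields of Xen's per-vcpu time info). The struct `tsc_scale` and the helper `make_monotonic()` are hypothetical names for this sketch. The monotonic clamp is the same logic as the tail of `get_nsec_timestamp()`, factored out so it can be exercised on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scale parameters, modeled on pvclock's mul/shift pair. */
struct tsc_scale {
    uint32_t mul_frac;  /* multiplier as a 0.32 fixed-point fraction */
    int8_t   shift;     /* pre-shift applied to the TSC value, may be negative */
};

/* Convert a TSC value to nanoseconds: ((tsc << shift) * mul_frac) >> 32.
 * The 64x32-bit product is built from two halves so no 128-bit type is needed. */
static inline uint64_t ts_scale(struct tsc_scale s, uint64_t tsc)
{
    uint64_t hi, lo;

    assert(s.shift > -64 && s.shift < 64);
    if (s.shift >= 0)
        tsc <<= s.shift;
    else
        tsc >>= -s.shift;

    hi = (tsc >> 32) * s.mul_frac;           /* contributes bits 32 and up */
    lo = (tsc & 0xffffffffull) * s.mul_frac; /* only its top 32 bits survive */
    return hi + (lo >> 32);
}

/* Never hand out a timestamp at or below the previous one. */
static inline uint64_t make_monotonic(uint64_t *last, uint64_t now)
{
    if ((int64_t)(now - *last) > 0)
        *last = now;
    else
        now = ++*last;
    return now;
}
```

With `mul_frac = 0x80000000` and `shift = 0` the function halves its input, which corresponds to a 2 GHz TSC. Both the `last` variable here and the `static` one in the pseudo-code are shared state, so a multi-threaded app would presumably need per-thread state or an atomic update.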