Dan Magenheimer
2009-Sep-09 18:05 UTC
[Xen-devel] rdtsc hypercall, from userland?!? (was: rdtsc: correctness vs performance on Xen)
(Although Jeremy and others are still discussing how to implement vsyscall+pvclock for upstream Linux, I am still looking for a way to allow apps to use rdtsc without suffering the performance loss from rdtsc emulation, so I've begun a new thread.)

To recap: In order to properly implement the required semantics of the rdtsc instruction in a virtual environment, the current Xen method of allowing the rdtsc instruction to execute natively is insufficient and may lead to random failure, possibly resulting in data loss. Upstream Xen now has a boot option to force rdtsc to be emulated for both hvm and pv guests. Soon this will be controlled by a per-guest vm.cfg option. The default will likely be emulation.

However, some apps do tens-to-hundreds of thousands of rdtsc's per core per second. On my dual-core Conroe box, an rdtsc instruction takes about 22ns in hardware and about 360ns to emulate. So emulation may slow performance in the worst case by as much as 5-10%.

Vsyscall+pvclock in upstream 64-bit Linux may be the right answer at some point in the future. BUT (IMPORTANT NEW POINT!!!) the pvclock algorithm requires an rdtsc instruction, and there is no way to emulate some guest rdtsc instructions (e.g. only those in apps) and not others (e.g. only those in the kernel). Thus, for guests that have rdtsc emulation enabled, vsyscall+pvclock will be SLOWER than emulation, meaning it is still not a palatable alternative.

I'm looking for something that provides correctness TODAY with less of a performance hit AND does not require guest operating systems to change. (App changes and Xen changes are allowed.) Previous attempts have run into insurmountable x86 architecture barriers (see the previous thread). But it recently occurred to me to compare the performance of a hypercall vs rdtsc emulation.
The results are promising, at least on 64-bit guests:

  rdtsc native:                        22ns
  rdtsc emulated:                     360ns
  nearly-NULL hypercall (32b guest):  260ns
  nearly-NULL hypercall (64b guest):  125ns

(Note these measurements are normal kernel-land hypercalls.) Currently all hypercalls from userland are illegal, but this need not be the case for ALL hypercalls. Is it possible for Xen to implement a "rdtsc hypercall" that is executable from userland, without requiring OS changes? Early discussions look promising.

Certainly, it makes sense to implement a normal kernel-callable rdtsc hypercall so that vsyscall+pvclock can execute more quickly. I'll be taking a look at that, but I'd be grateful for assistance in architecting a userland hypercall mechanism that will work for "hyper-rdtsc".

(While implementing a userland "hyper-rdtsc" is highest priority, I'd also be interested in whether the mechanism can be more generic... I'd like to explore the use of tmem from apps, Ian Pratt has suggested that userland hypercalls might be interesting for blktap, and there are probably other OS-independent ideas to explore assuming security issues can be handled.)

Thanks,
Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Sep-09 18:59 UTC
[Xen-devel] Re: rdtsc hypercall, from userland?!? (was: rdtsc: correctness vs performance on Xen)
On 09/09/09 11:05, Dan Magenheimer wrote:
> BUT
> (IMPORTANT NEW POINT!!!) the pvclock algorithm requires
> an rdtsc instruction, and there is no way to
> emulate some guest rdtsc instructions (e.g. only
> those in apps) and not others (e.g. only those in
> the kernel). Thus, for guests that have rdtsc emulation
> enabled, vsyscall+pvclock will be SLOWER than emulation,
> thus meaning it is still not a palatable alternative.
>

You could enable/disable rdtsc emulation on each context switch according to the app's desires/requirements. It would require an extra hypercall per context switch, but it could be batched with the others, resulting in little marginal cost.

It would, however, leave the kernel's use of rdtsc in a confused state.

J
Jan Beulich
2009-Sep-10 09:59 UTC
[Xen-devel] Re: rdtsc hypercall, from userland?!? (was: rdtsc: correctness vs performance on Xen)
>>> Jeremy Fitzhardinge <jeremy@goop.org> 09.09.09 20:59 >>>
>You could enable/disable emulation rdtsc each context switch according
>to the app's desires/requirements. It would require an extra hypercall
>per context switch, but it could be batched with the others, resulting
>in little marginal cost. It would, however, leave the kernel's use of
>rdtsc in a confused state.

Not necessarily: There could be two flags, one saying app rdtsc needs to be emulated, and a second one for the kernel's. Unless a pv kernel wants this, its (emulated) reads could still return the real (hardware) value rather than the calculated, 1GHz-based one.

Jan
Jan Beulich
2009-Sep-10 10:04 UTC
[Xen-devel] Re: rdtsc hypercall, from userland?!? (was: rdtsc: correctness vs performance on Xen)
>>> Dan Magenheimer <dan.magenheimer@oracle.com> 09.09.09 20:05 >>>
>(Note these measurements are normal kernel-land
>hypercalls.) Currently all hypercalls from userland
>are illegal, but this need not be the case for ALL
>hypercalls. Is it possible
>for Xen to implement a "rdtsc hypercall" that
>is executable from userland, without requiring
>OS changes? Early discussions look promising.

While possible, I'd suspect that the good performance you see for 64-bits wouldn't hold: You can't (without potential for ambiguity) re-use syscall for this purpose, and the alternative ways (interrupt or call gate) are likely to be more in the performance range of what you measured for 32-bits, which doesn't seem that much better than emulated rdtsc.

Jan
Ian Pratt
2009-Sep-10 10:14 UTC
RE: [Xen-devel] Re: rdtsc hypercall, from userland?!? (was: rdtsc: correctness vs performance on Xen)
> While possible, I'd suspect that the good performance you see for
> 64-bits wouldn't hold: You can't (without potential for ambiguity)
> re-use syscall for this purpose.

I'll bet all current 64b PV OSes use EAX as a simple system call number, so it's probably possible to do something hacky with negative values, after suitable auditing of current PV OSes and other common OSes. Not pretty, but I wouldn't throw the scheme out if an audit confirms the behaviour.

Ian