1) Is there a particular reason Jeremy's sched_clock() variant isn't being used in linux-2.6.18-hg? Specifically, are there any known downsides to that approach?

2) What is the reason for the inconsistent use of rmb() vs. barrier() in time-xen.c? It would seem to me that rmb() should be sufficient in all cases.

3) Why is it that x86-64's __smp_call_function_single(), just like native, uses cpu_relax() in its wait-for-response loops, while __smp_call_function() as well as i386's smp_call_function() use barrier(), unlike native?

Thanks, Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jan Beulich, on Mon 03 Dec 2007 11:24:35 +0000, wrote:
> 2) What is the reason for the inconsistent use of rmb() vs. barrier() in
> time-xen.c? It would seem to me that rmb() should be sufficient in all
> cases.

rmb() is more powerful than barrier(), not the converse.

Samuel

>>> Samuel Thibault <samuel.thibault@eu.citrix.com> 03.12.07 12:29 >>>
>Jan Beulich, on Mon 03 Dec 2007 11:24:35 +0000, wrote:
>> 2) What is the reason for the inconsistent use of rmb() vs. barrier() in
>> time-xen.c? It would seem to me that rmb() should be sufficient in all
>> cases.
>
>rmb() is more powerful than barrier(), not the converse.

Oh, sorry, I mixed barrier() with mb(). So the proposal would then simply be the other way around (the use of locked operations or fence instructions on x86 is really unnecessary as long as WC memory or non-temporal stores don't need to be taken into consideration).

Jan

On 3/12/07 11:40, "Jan Beulich" <jbeulich@novell.com> wrote:
>> rmb() is more powerful than barrier(), not the converse.
>
> Oh, sorry, I mixed barrier() with mb(). So the proposal would then simply
> be the other way around (the use of locked operations or fence instructions
> on x86 is really unnecessary as long as WC memory or non-temporal stores
> don't need to be taken into consideration).

Then the implementation of rmb() should be equivalent to barrier(). The code in time-xen.c is implemented to the interface definitions of barrier() and rmb() -- the former is used just where instruction ordering is important; the latter where dynamic execution order matters too.

-- Keir

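The distinction Keir draws between barrier() and rmb() can be sketched in user space as follows. This is only an illustration under stated assumptions: sketch_barrier() copies the usual GCC-style compiler barrier, sketch_rmb() is a GCC/Clang acquire fence standing in for rmb() rather than the kernel's definition, and publish(), consume_remote(), consume_local(), ready and payload are hypothetical names that do not appear in time-xen.c.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier: emits no instruction, but the compiler may not
 * move or cache memory accesses across it. */
#define sketch_barrier() __asm__ __volatile__("" ::: "memory")

/* Stand-in for rmb() (assumption, not the kernel's definition): an acquire
 * fence, which keeps earlier loads ordered before later loads. */
#define sketch_rmb() __atomic_thread_fence(__ATOMIC_ACQUIRE)

static uint32_t ready;    /* hypothetical flag set by the writer */
static uint32_t payload;  /* hypothetical data guarded by the flag */

/* Writer: make the data visible before the flag. */
void publish(uint32_t value)
{
    payload = value;
    __atomic_thread_fence(__ATOMIC_RELEASE);
    __atomic_store_n(&ready, 1, __ATOMIC_RELAXED);
}

/* Reader that may race with a writer on another CPU: the order in which
 * the loads execute matters, so a read barrier sits between them. */
int consume_remote(uint32_t *out)
{
    assert(out);
    if (!__atomic_load_n(&ready, __ATOMIC_RELAXED))
        return 0;
    sketch_rmb();
    *out = payload;
    return 1;
}

/* Reader that can only race with a handler interrupting this same CPU:
 * only the order of the emitted instructions matters, so a compiler
 * barrier is used. */
int consume_local(uint32_t *out)
{
    assert(out);
    if (!__atomic_load_n(&ready, __ATOMIC_RELAXED))
        return 0;
    sketch_barrier();
    *out = payload;
    return 1;
}
```

On x86 with ordinary write-back memory both readers are expected to compile to the same instructions, which is the point Jan makes about rmb() not needing to be more than barrier() there.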
>>> Keir Fraser <Keir.Fraser@cl.cam.ac.uk> 03.12.07 19:32 >>>
>On 3/12/07 11:40, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>>> rmb() is more powerful than barrier(), not the converse.
>>
>> Oh, sorry, I mixed barrier() with mb(). So the proposal would then simply
>> be the other way around (the use of locked operations or fence instructions
>> on x86 is really unnecessary as long as WC memory or non-temporal stores
>> don't need to be taken into consideration).
>
>Then the implementation of rmb() should be equivalent to barrier(). The code
>in time-xen.c is implemented to the interface definitions of barrier() and
>rmb() -- the former is used just where instruction ordering is important;
>the latter where dynamic execution order matters too.

I have to disagree: At least the uses of barrier() in monotonic_clock() appear to be in places where in reality (and from a theoretical standpoint) rmb() ought to be used.

But I agree that rmb() (and also wmb()) on x86 doesn't need to be more than barrier() (except, as said, in the context of WC memory or non-temporal memory accesses) - isn't that exactly what you just recently did in the hypervisor?

Jan

On 4/12/07 10:23, "Jan Beulich" <jbeulich@novell.com> wrote:
> I have to disagree: At least the uses of barrier() in monotonic_clock() appear
> to be in places where in reality (and from a theoretical standpoint) rmb()
> ought to be used.

We're sync'ing against concurrent updates of a this_cpu variable. We can only race updates in a local ISR, and hence barrier() suffices.

> But I agree that rmb() (and also wmb()) on x86 doesn't need to be more
> than barrier() (except, as said, in the context of WC memory or non-temporal
> memory accesses) - isn't that exactly what you just recently did in the
> hypervisor?

It is, in response to Intel's new whitepaper on memory ordering guarantees, and also after seeing similar patches committed in Linux (despite some protestation!). Within Linux guests, we will simply follow the barrier definitions for the tree we are patching. Virtualisation should not affect barrier implementations.

-- Keir

Keir Fraser <Keir.Fraser@cl.cam.ac.uk> writes:
> On 4/12/07 10:23, "Jan Beulich" <jbeulich@novell.com> wrote:
>
>> I have to disagree: At least the uses of barrier() in monotonic_clock() appear
>> to be in places where in reality (and from a theoretical standpoint) rmb()
>> ought to be used.
>
> We're sync'ing against concurrent updates of a this_cpu variable. We can
> only race updates in a local ISR, and hence barrier() suffices.

Not if you use RDTSC inside the loop.

-Andi

On 4/12/07 11:13, "Andi Kleen" <andi@firstfloor.org> wrote:
>>> I have to disagree: At least the uses of barrier() in monotonic_clock()
>>> appear to be in places where in reality (and from a theoretical
>>> standpoint) rmb() ought to be used.
>>
>> We're sync'ing against concurrent updates of a this_cpu variable. We can
>> only race updates in a local ISR, and hence barrier() suffices.
>
> Not if you use RDTSC inside the loop.

I must disagree! And I *know* that RDTSC is not a serialising instruction...

If we race, then there was an interrupt. Interrupt delivery is a serialisation point for the interrupted instruction stream.

-- Keir

On Tue, Dec 04, 2007 at 11:27:02AM +0000, Keir Fraser wrote:
> On 4/12/07 11:13, "Andi Kleen" <andi@firstfloor.org> wrote:
>
> >>> I have to disagree: At least the uses of barrier() in monotonic_clock()
> >>> appear to be in places where in reality (and from a theoretical
> >>> standpoint) rmb() ought to be used.
> >>
> >> We're sync'ing against concurrent updates of a this_cpu variable. We can
> >> only race updates in a local ISR, and hence barrier() suffices.
> >
> > Not if you use RDTSC inside the loop.
>
> I must disagree! And I *know* that RDTSC is not a serialising instruction...
>
> If we race, then there was an interrupt. Interrupt delivery is a
> serialisation point for the interrupted instruction stream.

The synchronization relies on the RDTSC happening between the two sequence number checks. Otherwise you can get inconsistent state between RDTSC and the xtime data which might be changing asynchronously on another CPU. Therefore you need RDTSC barriers.

-Andi

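The pattern under discussion, a counter read placed between two checks of a version number, can be sketched roughly as below. This is not the code from time-xen.c: struct time_snapshot, snapshot_clock_ns(), fixed_counter() and the function-pointer stand-in for RDTSC are all hypothetical, the scale of one counter tick per nanosecond is an assumption, and the sketch takes Keir's case where the updater is a handler that runs to completion on the same CPU, so only a compiler barrier is used.

```c
#include <assert.h>
#include <stdint.h>

/* Compiler-only barrier: no instruction is emitted. */
#define sketch_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical per-CPU time snapshot; field names are illustrative only. */
struct time_snapshot {
    uint32_t version;        /* bumped by the updater on every update */
    uint64_t counter_stamp;  /* counter value when the snapshot was taken */
    uint64_t system_ns;      /* system time in ns at counter_stamp */
};

/* Hypothetical stand-in for RDTSC so the sketch stays portable. */
typedef uint64_t (*counter_fn)(void);

/* Fixed-value counter, used only to exercise the sketch deterministically. */
uint64_t fixed_counter(void)
{
    return 80;
}

/* Read the clock, retrying if the snapshot changed underneath us.
 * Assumes one counter tick per nanosecond and an updater that runs to
 * completion in a local interrupt handler. If the snapshot could change
 * on another CPU, both barriers would have to be real read barriers that
 * also keep the counter read between the two version checks. */
uint64_t snapshot_clock_ns(const volatile struct time_snapshot *s,
                           counter_fn read_counter)
{
    uint32_t seen;
    uint64_t ns;

    assert(s && read_counter);
    do {
        seen = s->version;
        sketch_barrier();  /* version is read before the data and counter */
        ns = s->system_ns + (read_counter() - s->counter_stamp);
        sketch_barrier();  /* data and counter are read before the re-check */
    } while (seen != s->version);

    return ns;
}
```

With a snapshot of version 2, counter_stamp 50 and system_ns 1000, a counter reading of 80 gives 1000 + (80 - 50) = 1030 ns.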
On 4/12/07 11:44, "Andi Kleen" <andi@firstfloor.org> wrote:
>> I must disagree! And I *know* that RDTSC is not a serialising instruction...
>>
>> If we race, then there was an interrupt. Interrupt delivery is a
>> serialisation point for the interrupted instruction stream.
>
> The synchronization relies on the RDTSC happening between the
> two sequence number checks. Otherwise you can get inconsistent state
> between RDTSC and the xtime data which might be changing asynchronously
> on another CPU. Therefore you need RDTSC barriers.

Our monotonic_clock() implementation does not reference xtime nor its seqlock. If it did, you would be correct.

-- Keir

> Our monotonic_clock() implementation does not reference xtime nor its
> seqlock. If it did, you would be correct.

Well, it applies to other state too. There certainly is a seqlock equivalent. If you can guarantee that the interrupt changing that state is always on the same CPU, that would be safe; otherwise not.

-Andi