search for: movntdqa

Displaying 7 results from an estimated 7 matches for "movntdqa".

2016 Jan 14
2
RFC: non-temporal fencing in LLVM IR
...AFAICT, nontemporal loads and stores seem to have different fencing rules on x86, none of them very clear. Nontemporal stores should probably ideally use an SFENCE. Locked instructions seem to be documented to work with MOVNTDQA. In both cases, there seems to be only empirical evidence as to which side(s) of the nontemporal operations they should go on? I finally decided that I was OK with using a LOCKed top-of-stack update as a fence in Java on x86. I...
2016 Jan 15
3
RFC: non-temporal fencing in LLVM IR
...loads and stores seem to have different fencing rules on x86, none of them very clear. Nontemporal stores should probably ideally use an SFENCE. Locked instructions seem to be documented to work with MOVNTDQA. In both cases, there seems to be only empirical evidence as to which side(s) of the nontemporal operations they should go on? I finally decided that I was OK with using a...
2016 Jan 14
4
RFC: non-temporal fencing in LLVM IR
...systematic handling of the fences associated with x86 non-temporal accesses. AFAICT, nontemporal loads and stores seem to have different fencing rules on x86, none of them very clear. Nontemporal stores should probably ideally use an SFENCE. Locked instructions seem to be documented to work with MOVNTDQA. In both cases, there seems to be only empirical evidence as to which side(s) of the nontemporal operations they should go on? I finally decided that I was OK with using a LOCKed top-of-stack update as a fence in Java on x86. I'm significantly less enthusiastic for C++. I also think that ri...
2016 Jan 14
2
RFC: non-temporal fencing in LLVM IR
...with x86 non-temporal accesses. AFAICT, nontemporal loads and stores seem to have different fencing rules on x86, none of them very clear. Nontemporal stores should probably ideally use an SFENCE. Locked instructions seem to be documented to work with MOVNTDQA. In both cases, there seems to be only empirical evidence as to which side(s) of the nontemporal operations they should go on? I finally decided that I was OK with using a LOCKed top-of-stack update as a fence in Java on x86. I'm significantly less enthusia...
2016 May 01
2
r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
...ou want, you can be a reviewer of this change. Regards Michael Zuckerman From: Craig Topper [mailto:craig.topper at gmail.com] Sent: Thursday, April 28, 2016 04:53 To: Zuckerman, Michael <michael.zuckerman at intel.com> Subject: Re: r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set Can we use native IR for the stores the way the 128-bit and 256-bit equivalents do? On Wed, Apr 27, 2016 at 3:44 AM, Michael Zuckerman via cfe-commits <cfe-commits at lists.llvm.org> wrote: Author: mzuckerm Date:...
2016 May 15
2
r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
...c Christopher [mailto:echristo at gmail.com] Sent: Sunday, May 01, 2016 19:54 To: Zuckerman, Michael <michael.zuckerman at intel.com>; Craig Topper <craig.topper at gmail.com> Cc: llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set Why? On Sun, May 1, 2016, 6:04 AM Zuckerman, Michael via llvm-dev <llvm-dev at lists.llvm.org> wrote: Hi, For now, no. But I will add these three builtins to CGBuiltin.cpp. If you want, you can be a reviewer of this cha...
2016 Jan 13
2
RFC: non-temporal fencing in LLVM IR
On Wed, Jan 13, 2016 at 10:32 AM, John Brawn <John.Brawn at arm.com> wrote: > *What about non-x86 architectures?* Architectures such as ARMv8 support non-temporal instructions and require barriers such as DMB nshld to order loads and DMB nshst to order stores. Even ARM's address-dependency rule (a.k.a. the ill-fated