thr3ads.net - search: "vgather"

Displaying 6 results from an estimated 6 matches for "vgather".

[LLVMdev] codeGen, instruction write one value to the input register.

2014 Jun 16

Hi Guys,

In LLVM codegen, a typical binary operation instruction is defined something like below:

  def _rr: NVPTXInst<(outs Int1Regs:$dst), (ins Int1Regs:$a, Int1Regs:$b),
                     "xor.pred \t$dst, $a, $b;",
                     [(set Int1Regs:$dst, (OpNode Int1Regs:$a, Int1Regs:$b))]>;

which takes two inputs and writes the result to the $dst register. Then how to define a binary

how to force llvm generate gather intrinsic

2016 Feb 26

If I'm understanding correctly, you're saying that vgather* is slow on all of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will not generate it for any of those machines. Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() && !hasAVX512()". It could break for some hypothetical futu...
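The objection in this snippet can be illustrated with a small sketch. It is not LLVM code: `SubtargetSketch`, `HasFastGather`, and both predicate functions are hypothetical names invented here, and the "future CPU" is an assumption used only to show how a predicate derived from ISA flags can misclassify a machine, while an explicit tuning flag does not.

```cpp
#include <cassert>

// Hypothetical stand-in for a subtarget description. The field names are
// illustrative only and are not LLVM's actual X86Subtarget members.
struct SubtargetSketch {
  bool HasAVX2 = false;
  bool HasAVX512 = false;
  bool HasFastGather = false; // hypothetical explicit per-CPU tuning flag
};

// Predicate derived from ISA support alone, as criticised in the thread:
// every AVX2 machine without AVX-512 is assumed to have slow gathers.
inline bool gatherIsSlowDerived(const SubtargetSketch &ST) {
  return ST.HasAVX2 && !ST.HasAVX512;
}

// Alternative sketch: gather speed is recorded per CPU, independent of
// which vector ISA extensions the CPU happens to implement.
inline bool gatherIsSlowExplicit(const SubtargetSketch &ST) {
  return ST.HasAVX2 && !ST.HasFastGather;
}

// Usage: an assumed future CPU with AVX2, no AVX-512, and fast gathers.
inline void checkHypotheticalFutureCpu() {
  SubtargetSketch Future;
  Future.HasAVX2 = true;
  Future.HasFastGather = true;
  assert(gatherIsSlowDerived(Future));   // misclassified as slow
  assert(!gatherIsSlowExplicit(Future)); // classified by its own flag
}
```

The design point is only that performance characteristics and instruction-set availability are separate facts, so encoding one in terms of the other can go stale when new hardware appears.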

how to force llvm generate gather intrinsic

2016 Feb 26

...be good if there is a compiler option for the users to enable LLVM to generate the gather instructions no matter whether it is faster or slower.

Best,
Zhi

On Fri, Feb 26, 2016 at 12:49 PM, Sanjay Patel <spatel at rotateright.com> wrote:
> If I'm understanding correctly, you're saying that vgather* is slow on all
> of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will
> not generate it for any of those machines.
>
> Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2()
> && !hasAVX512()". It could break fo...

[LLVMdev] codeGen, instruction write one value to the input register.

2014 Jul 07

>> ...l compute sin and cos value of input, return the sin
>> result and write the cos result to cosVal.
>> Are there any special constraints or something I should put onto the cos
>> register?
>
> Hey Kevin,
>
> You might get a good start looking at the AVX2 VGATHER patterns in
> llvm/lib/Target/X86/X86InstrSSE.td. Those patterns return two results.
> They also make use of the @earlyclobber constraint.
>
> Hope that helps,
> Cameron

how to force llvm generate gather intrinsic

2016 Feb 26

No. Gather operation is slow on AVX2 processors.

- Elena

From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 20:48
To: Sanjay Patel <spatel at rotateright.com>
Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to force

how to force llvm generate gather intrinsic

2016 Feb 25

It seems that http://reviews.llvm.org/D15690 only implemented gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to enable gather for AVX/2? Thanks.

Best,
Zhi

On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com> wrote:
> I don't think gather has been enabled for AVX2 as of r261875.
> Masked load/store were enabled for AVX with:
>
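For readers unfamiliar with the operation under discussion, the following is a scalar reference sketch of what a masked gather computes per lane. It is an illustration only: `maskedGatherRef` and its parameters are hypothetical names, it uses indices into a vector rather than raw pointers, and it says nothing about how LLVM lowers the operation on any target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference for a masked gather (hypothetical helper, not an LLVM API).
// For each lane: if the mask bit is set, load base[idx[lane]];
// otherwise keep the pass-through value for that lane.
std::vector<std::int32_t>
maskedGatherRef(const std::vector<std::int32_t> &base,
                const std::vector<std::size_t> &idx,
                const std::vector<bool> &mask,
                const std::vector<std::int32_t> &passthru) {
  // All per-lane inputs must describe the same number of lanes.
  assert(idx.size() == mask.size() && idx.size() == passthru.size());

  std::vector<std::int32_t> out(idx.size());
  for (std::size_t lane = 0; lane < idx.size(); ++lane) {
    if (mask[lane]) {
      assert(idx[lane] < base.size()); // masked-on lanes must be in bounds
      out[lane] = base[idx[lane]];
    } else {
      out[lane] = passthru[lane];      // masked-off lanes never touch memory
    }
  }
  return out;
}
```

A hardware gather instruction performs the same per-lane selection on a whole vector at once; whether that is faster than the scalar loop is exactly the per-CPU question the thread debates.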
