thr3ads.net - similar to: "[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits"

Displaying 20 results from an estimated 700 matches similar to: "[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits"

[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits

2011 Sep 21

Hi Duncan,
On Wed, Sep 21, 2011 at 1:24 PM, Duncan Sands <baldrick at free.fr> wrote:
> This patch synthesizes haddps/haddpd/hsubps/hsubpd instructions from floating
> point additions and subtractions of appropriate vector shuffles. To do this I
> introduced new x86 FHADD and FHSUB opcodes. These need to be wired up somehow
> in the .td file to the appropriate
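To make "additions of appropriate vector shuffles" concrete, here is a minimal scalar sketch of one shuffle pair whose sum equals a horizontal add of two 4 x float vectors. The names `shuffle_model` and `hadd_via_shuffles` are hypothetical, and the masks shown are only one equivalent form; the exact set of masks the patch matches is not visible in this excerpt.

```cpp
#include <array>
#include <cstddef>

using V4 = std::array<float, 4>;

// Model of a two-input shuffle: mask indices 0..3 select from a, 4..7 from b
// (the same indexing convention as an LLVM shufflevector mask).
V4 shuffle_model(const V4& a, const V4& b, const std::array<int, 4>& mask) {
    V4 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        const std::size_t idx = static_cast<std::size_t>(mask[i]);
        out[i] = idx < 4 ? a[idx] : b[idx - 4];
    }
    return out;
}

// Adding the even-element shuffle to the odd-element shuffle gives
// {a0+a1, a2+a3, b0+b1, b2+b3}, which is the result HADDPS produces.
V4 hadd_via_shuffles(const V4& a, const V4& b) {
    const V4 even = shuffle_model(a, b, {0, 2, 4, 6});
    const V4 odd = shuffle_model(a, b, {1, 3, 5, 7});
    V4 out{};
    for (std::size_t i = 0; i < 4; ++i)
        out[i] = even[i] + odd[i];
    return out;
}
```

A combine that recognizes this add-of-shuffles shape can replace the three operations with a single horizontal-add node.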

[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits

2011 Sep 22

Hi Bruno,
> Some comments:
>
> + // Try to synthesize horizontal adds from adds of shuffles.
> + if (((Subtarget->hasSSE3() && (VT == MVT::v4f32 || VT == MVT::v2f64)) ||
> +      (Subtarget->hasAVX() && (VT == MVT::v8f32 || VT == MVT::v4f64))) &&
> +     isHorizontalBinOp(LHS, RHS, true))
>
> 1) You probably want to do something like:
>

[LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits

2011 Sep 22

The output of the avx-hadd program is 3 11 7 15
Preston
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
Sent: Thursday, September 22, 2011 3:14 PM
To: Bruno Cardoso Lopes
Cc: LLVMdev
Subject: Re: [LLVMdev] Patch to synthesize x86 hadd instructions; need help with the tablegen bits
Hi Bruno,
> Some comments:
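The four reported values are consistent with a 256-bit double-precision horizontal add, which pairs elements inside each 128-bit lane rather than across the whole register. The scalar model below sketches that VHADDPD behaviour. The avx-hadd program itself is not shown in this excerpt, so the inputs {1, 2, 3, 4} and {5, 6, 7, 8} are an assumption, picked only because they reproduce the reported output; `hadd_pd256_model` is a hypothetical name.

```cpp
#include <array>

using V4d = std::array<double, 4>;

// Scalar model of 256-bit VHADDPD: adjacent pairs are summed separately in
// each 128-bit lane, and results from a and b alternate within a lane.
V4d hadd_pd256_model(const V4d& a, const V4d& b) {
    return {a[0] + a[1],   // low lane, pair from a
            b[0] + b[1],   // low lane, pair from b
            a[2] + a[3],   // high lane, pair from a
            b[2] + b[3]};  // high lane, pair from b
}
```

With the assumed inputs this yields 3, 11, 7, 15; a non-lane-aware model would instead give 3, 7, 11, 15, which is why the lane split matters when checking AVX codegen.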

Question about llvm vectors

2020 Aug 20

Hi Craig, Thank you very much for your answer. I did not want to discuss the exact semantics and name of one operation, but rather to raise the question "would it be beneficial to have more vector builtins?". You wrote that the compiler will recognize a pattern and replace it by __builtin_ia32_haddps when possible, but how can I be sure of that? I would have to disassemble the generated

[LLVMdev] X86 LowerVECTOR_SHUFFLE Question

2011 Feb 25

In ToT, LowerVECTOR_SHUFFLE for x86 has this code:
  if (X86::isUNPCKLMask(SVOp))
    getTargetShuffleNode(getUNPCKLOpcode(VT), dl, VT, V1, V2, DAG);
why would this not be:
  if (X86::isUNPCKLMask(SVOp))
    return SVOp;
I'm trying to add support for VUNPCKL and am getting into trouble because the existing code ends up creating:
  VUNPCKLPS load load
which is badness come selection

[LLVMdev] X86 LowerVECTOR_SHUFFLE Question

2011 Feb 26

David Greene <dag at cray.com> writes:
> In ToT, LowerVECTOR_SHUFFLE for x86 has this code:
>
>   if (X86::isUNPCKLMask(SVOp))
>     getTargetShuffleNode(getUNPCKLOpcode(VT), dl, VT, V1, V2, DAG);
>
> why would this not be:
>
>   if (X86::isUNPCKLMask(SVOp))
>     return SVOp;
Ok, I discovered that Bruno did this in revisions 112934, 112942 and 113020 but the logs

Question about llvm vectors

2020 Aug 19

Hi, I love llvm vectors, yet I wonder why some advanced vector operations are specific to some CPU targets? Let me take an example:
/// Horizontally adds the adjacent pairs of values contained in two
/// 128-bit vectors of [4 x float].
///
/// \headerfile <x86intrin.h>
///
/// This intrinsic corresponds to the <c> VHADDPS </c> instruction.
///
/// \param __a
/// A

[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics

2015 May 04

Hi Asghar-Ahmed, I saw your last ping - sorry, I'm away on vacation and back on Wednesday. Generally, I'm not sure that having both absd/hadd and sad is compatible with the discussions going on in other threads, for example my thread about min and max. Given that those two intrinsics are fairly trivial to match, I don't see the need to have two different canonical forms. James On

[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics

2015 May 05

On 4 May 2015 at 08:37, Shahid, Asghar-ahmad <Asghar-ahmad.Shahid at amd.com> wrote:
> My worry is regarding the query for cost calculation for specific SAD
> instructions such as ‘psad’ (X86) or ‘usad’ (ARM) in Loop Vectorizer.
Hi Shahid,
The vectorizer's cost model has the ability to return different costs for the same instruction based on the arguments (scalar/vector,

[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics

2015 May 01

Hi All, I would like to introduce intrinsics to generate efficient code for the 'absolute differences', 'horizontal add' and 'sum of absolute differences' idioms used in user programs. Identifying these idioms at a lower level (Codegen) is complex. These idioms can be identified in LV/SLP and vectorized using the above intrinsics to generate better code. Proposal: 1. Add

[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics

2015 May 05

Hi Renato, Thanks for your response. My concern was actually this. For example, take vector type V8i16 on X86 target.
With llvm.sad() intrinsic: VC1 (Vector Cost) = Cost associated with "PSAD" instruction.
W/ llvm.absd() and llvm.hadd(): VC2 = Cost associated with "absolute diff" + "horizontal add" ( ??? )
As I will be querying with getIntrinsicCost(ID) for these
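The relationship between the three idioms can be sketched with a scalar model: a sum of absolute differences is a lane-wise absolute difference followed by a horizontal add. This sketch treats "horizontal add" as a full reduction over all lanes, which is an assumption about the proposed llvm.hadd(); all function names are hypothetical, and the model says nothing about instruction cost.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

using V8i16 = std::array<std::int16_t, 8>;
using V8i32 = std::array<std::int32_t, 8>;

// Lane-wise absolute difference, widened to 32 bits so it cannot overflow.
V8i32 absd_model(const V8i16& a, const V8i16& b) {
    V8i32 out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = std::abs(std::int32_t{a[i]} - std::int32_t{b[i]});
    return out;
}

// "Horizontal add" modeled as a reduction of all lanes (assumption).
std::int32_t hadd_reduce_model(const V8i32& v) {
    std::int32_t sum = 0;
    for (std::int32_t x : v)
        sum += x;
    return sum;
}

// Sum of absolute differences expressed as the composition of the two above.
std::int32_t sad_model(const V8i16& a, const V8i16& b) {
    return hadd_reduce_model(absd_model(a, b));
}
```

Because `sad_model` is just the composition, a backend can pattern-match absd followed by hadd into a single SAD instruction, which is the trade-off the thread weighs against adding a third intrinsic.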

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Nov 23

I agree with Tim, you need to implement a GetCpuIDAndInfoEx function in Host.cpp and pass the correct value to ecx. Also you need to verify that 7 is a valid leaf because an invalid leaf is defined to return the highest supported leaf on that processor. So if a processor supports, say, leaf 6 and not leaf 7, then an access to leaf 7 will return the data from leaf 6, causing unrelated bits to be
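The check described here can be sketched as follows. This is not the Host.cpp code: GetCpuIDAndInfoEx is not reproduced, and the helper names `has_avx2` and `host_has_avx2` are hypothetical. The host query relies on the `<cpuid.h>` header shipped with GCC and Clang and is compiled only on x86; the decision logic is kept separate so it can be exercised anywhere.

```cpp
#include <cstdint>
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#endif

// Bit 5 of EBX from CPUID leaf 7, subleaf 0 is the AVX2 feature flag.
constexpr std::uint32_t kAvx2Bit = 1u << 5;

// Pure decision logic: trust leaf-7 data only if leaf 7 is actually supported.
// max_basic_leaf is EAX from CPUID leaf 0; leaf7_ebx is EBX from leaf 7, ECX = 0.
bool has_avx2(std::uint32_t max_basic_leaf, std::uint32_t leaf7_ebx) {
    if (max_basic_leaf < 7)
        return false;  // leaf 7 is invalid, so its bits would be unrelated data
    return (leaf7_ebx & kAvx2Bit) != 0;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
// Host query: reads the maximum basic leaf first, then leaf 7 with ECX = 0.
// Note: this checks only the CPUID bit, not OS support for the YMM state.
bool host_has_avx2() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    const unsigned max_leaf = __get_cpuid_max(0, nullptr);
    if (max_leaf >= 7)
        __cpuid_count(7, 0, eax, ebx, ecx, edx);  // subleaf goes in ECX
    return has_avx2(max_leaf, ebx);
}
#endif
```

Splitting the logic this way keeps the "is leaf 7 valid?" guard testable without depending on the machine running the test.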

[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics

2015 May 06

Hi Renato, That’s right. I agree with your *pattern vs complexity* thinking. So I would drop llvm.sad() and go ahead with the remaining two. Does it make sense in general?
Regards, Shahid
> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: Tuesday, May 05, 2015 8:40 PM
> To: Shahid, Asghar-ahmad
> Cc: James Molloy; llvmdev at

[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]

2014 Sep 19

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Tom Stellard
> Sent: 19 September 2014 01:36
> To: Sanjay Patel
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]
>
> On Thu, Sep 18, 2014 at 03:25:07PM -0600, Sanjay Patel wrote:
>

[LLVMdev] [RFC][PATCH] Adding absd/hadd/sad intrinsics

2015 May 06

> For the time being, if you can get away with heuristics, and that fills your
> allocated time for this task, that it's the best way forward for now.
Sorry that I could not get what exactly you mean with "heuristics". Is it the "intrinsics approach" itself or something else? BTW, now my plan is to just add the two intrinsics for 'absolute difference' and

[LLVMdev] Possible bug in getCallPreservedMask for CallingConv::Intel_OCL_BI

2014 Mar 13

Not sure who owns this bit of code, so sending this to the general list. It looks like there may be an unintentional fall through happening in the X86RegisterInfo::getCallPreservedMask function. http://llvm.org/docs/doxygen/html/X86RegisterInfo_8cpp_source.html case CallingConv::Intel_OCL_BI

[LLVMdev] predicates vs. requirements [TableGen, X86InstrInfo.td]

2014 Sep 18

I tried to add an 'OptForSize' requirement to a pattern in X86InstrSSE.td, but it appears to be ignored. However, the condition was detected when specified as a predicate. So this doesn't work:
  def : Pat<(v2f64 (X86VBroadcast (loadf64 addr:$src))),
            (VMOVDDUPrm addr:$src)>, Requires<[OptForSize]>;
But this does:
  let Predicates = [OptForSize]

sum elements in the vector

2016 Apr 04

My target has an instruction that adds up all elements in the vector and stores the result in a register. I'm trying to implement it in my compiler but I'm not even sure where to start. I did look at other targets, but they don't seem to have anything like it (I could be wrong. My experience with LLVM is limited, so if I missed it, I'd appreciate it if someone could point it out).

sum elements in the vector

2016 May 28

Hi Rail, The two revisions below might be of interest to you; they detect SAD patterns and emit psadbw instructions on X86: http://reviews.llvm.org/D14840 http://reviews.llvm.org/D14897 Revisions related to the absdiff intrinsics: http://reviews.llvm.org/D10867 http://reviews.llvm.org/D11678 Hope this helps. Regards, Suyog On Sat, May 28, 2016 at 4:20 AM, Rail Shafigulin via llvm-dev < llvm-dev at

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

All, I've been trying to simplify the way LLVM models sub-register relationships a bit, and the X86 sub_ss and sub_sd sub-register indices are getting in the way. I want to get rid of them. These sub-registers are special, they are only mentioned here:
  let CompositeIndices = [(sub_ss), (sub_sd)] in {
  def XMM0: Register<"xmm0">, DwarfRegNum<[17, 21, 21]>;
  def
