thr3ads.net - similar to: "vectorization for X86"

Displaying 20 results from an estimated 4000 matches similar to: "vectorization for X86"

2016 Mar 16

generate vectorized code

My question is: How do I make clang to generate assembly with vector instruction for my target? The back story is: I've added a few vector instructions to my target and confirmed that they are used by running my code on the test below and using a following command: opt i.esencia.ll -S -march=esencia -mcpu=esencia -loop-vectorize | llc -mcpu=esencia -o i.esencia.s target datalayout =

clang triple and clang target

2016 Mar 14

clang triple and clang target

On Sat, Mar 12, 2016 at 2:38 PM, Tim Northover <t.p.northover at gmail.com> wrote: > On 12 March 2016 at 11:51, Rail Shafigulin via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > I tried every possible combination of --target I could think of but > nothing > > worked. Would you mind helping me out? > > First, 64-bit x86 is "x86_64", and 32-bit

clang triple and clang target

2016 Mar 12

clang triple and clang target

> > I assume with target you mean the backend? Consider the x86 backend. It > supports 32bit and 64bit mode, with the GNU x32 ABI in between. There > are three different executable formats support (ELF, PE, MachO) with > different constraints. Some platforms require 32bit alignment of the > stack, others require 128bit alignment. The list goes on. The triple > specifies >

clang triple and clang target

2016 Mar 11

clang triple and clang target

Can someone explain what exactly a clang triple is (--triple option) and what is the connection between triple and a target? I know there is an article ( http://clang.llvm.org/docs/CrossCompilation.html) that show how to cross compile code, but I'm not clear about is why I need to specify triple, why I can't just say compile for a given target? Any help is appreciated. -- Rail

[LLVMdev] AVX code gen

2013 Dec 11

[LLVMdev] AVX code gen

Hello - I found this post on the llvm blog: http://blog.llvm.org/2012/12/new-loop-vectorizer.html which makes me think that clang / llvm are capable of generating AVX with packed instructions as well as utilizing the full width of the YMM registers… I have an environment where icc generates these instructions (vmulps %ymm1, %ymm3, %ymm2 for example) but I can not get clang/llvm to generate such

sum elements in the vector

2016 Apr 04

sum elements in the vector

My target has an instruction that adds up all elements in the vector and stores the result in a register. I'm trying to implement it in my compiler but I'm not sure even where to start. I did look at other targets, but they don't seem to have anything like it ( I could be wrong. My experience with LLVM is limited, so if I missed it, I'd appreciate if someone could point it out ).

sum elements in the vector

2016 May 28

sum elements in the vector

Hi Rail, Below 2 revisions might be of your interest which Detect SAD patterns and emit psadbw instructions on X86.: http://reviews.llvm.org/D14840 http://reviews.llvm.org/D14897 Intrinsics related to absdiff revisons : http://reviews.llvm.org/D10867 http://reviews.llvm.org/D11678 Hope this helps. Regards, Suyog On Sat, May 28, 2016 at 4:20 AM, Rail Shafigulin via llvm-dev < llvm-dev at

Specifying DAG patterns in the instruction

2016 Jan 29

Specifying DAG patterns in the instruction

On Thu, Jan 28, 2016 at 8:34 PM, Dylan McKay <dylanmckay34 at gmail.com> wrote: > Try visualising the DAG like this. > > ``` > ---- GPR:$rA > / > set GPR:$rd ---- add > \ > ---- GPR:$rB > ``` > > Each instruction forms a DAG with its operands being subnodes. > >

New register class and patterns

2016 Feb 02

New register class and patterns

> On Feb 1, 2016, at 16:53, Rail Shafigulin <rail at esenciatech.com> wrote: > > > > On Fri, Jan 29, 2016 at 10:03 PM, Matt Arsenault <arsenm2 at gmail.com <mailto:arsenm2 at gmail.com>> wrote: > > > On Jan 29, 2016, at 13:25, Rail Shafigulin via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > >

Enable / Disable a processor feature

2016 Mar 05

Enable / Disable a processor feature

I'm trying to enable/disable a target feature through clang. Here is how my target looks like // Esencia subtarget features //===----------------------------------------------------------------------===// def FeatureMul : SubtargetFeature<"mul", "HasMul", "true", "Enable hardware multiplier">; def FeatureDiv

sum elements in the vector

2016 May 12

sum elements in the vector

> why in order to add this particular instruction (sum elements in a vector) I need to add an insrinsic? Adding intrinsic is not the only way, it is one of the way and user WILL-NOT be required to invoke It specifically. Currently LLVM does not have any instruction to directly represent “sum of elements in a vector” and generate your particular instruction.However, you can do it without

enable/disable features through clang

2016 May 02

enable/disable features through clang

Is there a way to enable/disable target features through clang? I found this, https://github.com/avr-llvm/llvm/issues/9, but this seems to be talking about llc -mattr=+feature1,-feature2... Is there something equivalent for clang? -- Rail Shafigulin Software Engineer Esencia Technologies -------------- next part -------------- An HTML attachment was scrubbed... URL:

sum elements in the vector

2016 May 16

sum elements in the vector

This would be really cool. We have several instructions that perform horizontal vector operations, and have to use built-ins to select them as there is no easy way of expressing them in a TD file. Some like SUM for a ‘v4i32’ are easy enough to express with a pattern fragment, SUM ‘v8i16’ takes TableGen a long time to compute, but SUM ‘v16i8’ resulted in TableGen disappearing into itself for

running intrinsics from C code

2016 May 25

running intrinsics from C code

I've created an intrinsic from my target, but I can't figure out how I can run it from a C code. Most of the targets have a GCCBuiltin and it looks like it is the way to execute an intrinsic from C code. However in my case there is no actual GCC built in. Any help on this is really appreciated. -- Rail Shafigulin Software Engineer Esencia Technologies -------------- next part

sum elements in the vector

2016 May 27

sum elements in the vector

Hi Shahid. Do you mind providing a concrete example of X86 code where an intrinsic was added (preferrable with filenames and line numbers)? I'm having difficulty tracking down the steps you provided. Any help is appreciated. On Mon, Apr 4, 2016 at 9:02 PM, Shahid, Asghar-ahmad < Asghar-ahmad.Shahid at amd.com> wrote: > Hi Rail, > > > > We had done this for generation

sum elements in the vector

2016 May 18

sum elements in the vector

Hi Rail, We used a very simple pattern expansion (actually, not a pattern fragment). For example, for AND, ADD (horizontal sum), OR and XOR of 4 elements we use something like the following TableGen structure: class HORIZ_Op4<SDNode opc, RegisterClass regVT, ValueType rt, ValueType vt, string asmstr> : SHAVE_Instr<(outs regVT:$dst), (ins VRF128:$src),

generate vectorized code

2016 Mar 18

generate vectorized code

On Fri, Mar 18, 2016 at 2:03 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > On Fri, Mar 18, 2016 at 1:53 PM, Mehdi Amini <mehdi.amini at apple.com> > wrote: > >> >> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com> >> wrote: >> >> Yes this IR does not build or shuffle any vector. Try to write a function

generate vectorized code

2016 Mar 18

generate vectorized code

> > Here is how I started with SelectionDAG: > > - small IR (bugpoint can help) > Did you mean a break poing? - the magic flag: -debug > - read the output of SelectionDAG debugging (especially with cycles) > - matching the log to source code > What log are you talking about? > - single stepping in a debugger sometimes. > > > -- > Mehdi > > -- Rail

sum elements in the vector

2016 May 16

sum elements in the vector

I'm starting to think we should directly implement horizontal operations on vector types. My suspicion is that coming up with a nice model for this would help us a lot with things like: - Idiom recognition of reduction patterns that use horizontal arithmetic - Ability to use horizontal operations in SLPVectorizer - Significantly easier cost modeling of vectorizing loops with reductions in

Specifying DAG patterns in the instruction

2016 Jan 29

Specifying DAG patterns in the instruction

On Fri, Jan 29, 2016 at 11:39 AM, Rail Shafigulin <rail at esenciatech.com> wrote: > > > On Thu, Jan 28, 2016 at 8:34 PM, Dylan McKay <dylanmckay34 at gmail.com> > wrote: > >> Try visualising the DAG like this. >> >> ``` >> ---- GPR:$rA >> / >> set GPR:$rd ---- add >>

similar to: vectorization for X86