thr3ads.net - search: "fullcount"

2010 May 29

3

[LLVMdev] Vectorized LLVM IR

...input[2];
FAUSTFLOAT* input3 = input[3];
FAUSTFLOAT* output0 = output[0];
for (int i=0; i<count; i++) {
    output0[i] = (FAUSTFLOAT)(((float)input2[i] + (float)input3[i]) * ((float)input0[i] + (float)input1[i]));
}
}
The "vectorized" C++ code is:
virtual void compute (int fullcount, FAUSTFLOAT** input, FAUSTFLOAT** output) {
    for (int index = 0; index < fullcount; index += 32) {
        int count = min(32, fullcount-index);
        FAUSTFLOAT* input0 = &input[0][index];
        FAUSTFLOAT* input1 = &input[1][index];
        FAUSTFLOAT* input2 = &input[2][index];
        FAUSTFLOAT* in...
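The excerpt above is cut off on both ends, so here is a minimal, self-contained sketch of the two loop shapes it describes: the plain scalar loop and the version that walks the buffers in blocks of 32 frames. Everything not visible in the excerpt is an assumption made for illustration: the `FAUSTFLOAT` typedef, the free-function names `compute_scalar` and `compute_chunked`, and the part of the chunked body after the truncation (assumed to mirror the scalar loop).

```cpp
#include <algorithm>

// Assumption: FAUSTFLOAT is float here; the excerpt does not show its definition.
typedef float FAUSTFLOAT;

// Scalar form: one pass over all `count` frames.
void compute_scalar(int count, FAUSTFLOAT** input, FAUSTFLOAT** output) {
    FAUSTFLOAT* input0 = input[0];
    FAUSTFLOAT* input1 = input[1];
    FAUSTFLOAT* input2 = input[2];
    FAUSTFLOAT* input3 = input[3];
    FAUSTFLOAT* output0 = output[0];
    for (int i = 0; i < count; i++) {
        output0[i] = (FAUSTFLOAT)(((float)input2[i] + (float)input3[i]) *
                                  ((float)input0[i] + (float)input1[i]));
    }
}

// Chunked form: the outer loop advances in blocks of 32 frames; the last
// block may be shorter, hence the std::min on the remaining frame count.
void compute_chunked(int fullcount, FAUSTFLOAT** input, FAUSTFLOAT** output) {
    for (int index = 0; index < fullcount; index += 32) {
        int count = std::min(32, fullcount - index);
        FAUSTFLOAT* input0 = &input[0][index];
        FAUSTFLOAT* input1 = &input[1][index];
        FAUSTFLOAT* input2 = &input[2][index];
        FAUSTFLOAT* input3 = &input[3][index];
        FAUSTFLOAT* output0 = &output[0][index];
        // Assumed inner loop: same expression as the scalar version.
        for (int i = 0; i < count; i++) {
            output0[i] = (FAUSTFLOAT)(((float)input2[i] + (float)input3[i]) *
                                      ((float)input0[i] + (float)input1[i]));
        }
    }
}
```

Both functions should produce identical output for any frame count, including counts that are not a multiple of 32; the chunked form only changes how the iteration space is split.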

[LLVMdev] Vectorized LLVM IR

2010 May 29

0

... for (int i=0; i<count; i++) { > output0[i] = (FAUSTFLOAT)(((float)input2[i] + (float)input3[i]) * ((float)input0[i] + (float)input1[i])); > } > } > > The "vectorized" C++ code is : > > virtual void compute (int fullcount, FAUSTFLOAT** input, FAUSTFLOAT** output) { > for (int index = 0; index < fullcount; index += 32) { > int count = min(32, fullcount-index); > FAUSTFLOAT* input0 = &input[0][index]; > FAUSTFLOAT* i...

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

0

On 07/04/2013 01:39 PM, Stéphane Letz wrote: > Hi, > > Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 04

3

Hi, Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the generated C code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the LLVM IR generated directly for the same program cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some information that is needed by the vectorization passes to

[LLVMdev] Vectorized LLVM IR

2010 May 28

0

Hi Stéphane, The SSE support in the LLVM backend is fine. What is the code that's generated? Do you have some short examples of where LLVM doesn't do as well as the equivalent scalar code? -bw On May 28, 2010, at 12:13 PM, Stéphane Letz wrote: > Hi, > > We are experimenting directly generating vectorized LLVM IR (using <8 x float> kind of types), then compiling the code

[LLVMdev] Vectorized LLVM IR

2010 May 28

3

Hi, We are experimenting with directly generating vectorized LLVM IR (using <8 x float> kind of types), then compiling the code to SSE on a 64-bit machine. Right now the equivalent code in scalar mode still outperforms the SSE one. What is the quality of the SSE support in the X86 LLVM backend? Are there any specific things to be aware of to improve the speed? Thanks Stéphane Letz
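For readers who want to reproduce the kind of `<8 x float>` IR mentioned in this post from C++ rather than by emitting IR directly, here is a small sketch using the GCC/Clang `vector_size` extension (a compiler extension, not standard C++). The type name `float8` and the function `add_mul_block8` are hypothetical names chosen for illustration, and the arithmetic is borrowed from the add-and-multiply expression quoted elsewhere in this thread. With Clang, a 32-byte float vector like this is represented as `<8 x float>` in the emitted IR; on a target limited to 128-bit SSE registers the backend has to split each operation in two, which may be worth checking when comparing against scalar code, though nothing here shows that this is the cause of the slowdown described.

```cpp
#include <cstring>

// 8 floats = 32 bytes. GCC/Clang vector extension, not standard C++.
typedef float float8 __attribute__((vector_size(32)));

// Hypothetical helper: processes exactly 8 frames per call.
// memcpy is used for loads and stores so the pointers need no special alignment.
void add_mul_block8(const float* in0, const float* in1,
                    const float* in2, const float* in3, float* out) {
    float8 a, b, c, d;
    std::memcpy(&a, in0, sizeof a);
    std::memcpy(&b, in1, sizeof b);
    std::memcpy(&c, in2, sizeof c);
    std::memcpy(&d, in3, sizeof d);
    // Element-wise arithmetic on all 8 lanes at once.
    float8 r = (c + d) * (a + b);
    std::memcpy(out, &r, sizeof r);
}
```

Compiling this with `clang++ -O2 -S -emit-llvm` is one way to inspect the vector IR that the front end produces for comparison with hand-generated IR.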

[LLVMdev] Vectorized LLVM IR

2010 May 29

1

...<count; i++) { >> output0[i] = (FAUSTFLOAT)(((float)input2[i] + (float)input3[i]) * ((float)input0[i] + (float)input1[i])); >> } >> } >> >> The "vectorized" C++ code is : >> >> virtual void compute (int fullcount, FAUSTFLOAT** input, FAUSTFLOAT** output) { >> for (int index = 0; index < fullcount; index += 32) { >> int count = min(32, fullcount-index); >> FAUSTFLOAT* input0 = &input[0][index]; >> ...
