thr3ads.net - similar to: "[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR"

Displaying 20 results from an estimated 400 matches similar to: "[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR"

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On 07/04/2013 01:39 PM, Stéphane Letz wrote: > Hi, > > Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On 5 Jul 2013, at 04:11, Tobias Grosser <tobias at grosser.es> wrote: > On 07/04/2013 01:39 PM, Stéphane Letz wrote: >> Hi, >> >> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On Jul 5, 2013, at 9:50 AM, Stéphane Letz <letz at grame.fr> wrote: > > On 5 Jul 2013, at 04:11, Tobias Grosser <tobias at grosser.es> wrote: > >> On 07/04/2013 01:39 PM, Stéphane Letz wrote: >>> Hi, >>> >>> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On 5 Jul 2013, at 17:23, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > > On Jul 5, 2013, at 9:50 AM, Stéphane Letz <letz at grame.fr> wrote: > >> >> On 5 Jul 2013, at 04:11, Tobias Grosser <tobias at grosser.es> wrote: >> >>> On 07/04/2013 01:39 PM, Stéphane Letz wrote: >>>> Hi, >>>>

[LLVMdev] Vectorized LLVM IR

2010 May 29

On 29 May 2010, at 01:08, Bill Wendling wrote: > Hi Stéphane, > > The SSE support in the LLVM backend is fine. What is the code that's generated? Do you have some short examples of where LLVM doesn't do as well as the equivalent scalar code? > > -bw > > On May 28, 2010, at 12:13 PM, Stéphane Letz wrote: We are actually testing LLVM for the Faust language

[LLVMdev] Vectorized LLVM IR

2010 May 28

Hi Stéphane, The SSE support in the LLVM backend is fine. What is the code that's generated? Do you have some short examples of where LLVM doesn't do as well as the equivalent scalar code? -bw On May 28, 2010, at 12:13 PM, Stéphane Letz wrote: > Hi, > > We are experimenting directly generating vectorized LLVM IR (using <8 x float> kind of types), then compiling the code

[LLVMdev] Vectorized LLVM IR

2010 May 29

On Sat, May 29, 2010 at 12:42 AM, Stéphane Letz <letz at grame.fr> wrote: > > On 29 May 2010, at 01:08, Bill Wendling wrote: > >> Hi Stéphane, >> >> The SSE support in the LLVM backend is fine. What is the code that's generated? Do you have some short examples of where LLVM doesn't do as well as the equivalent scalar code? >> >> -bw >>

[LLVMdev] Vectorized LLVM IR

2010 May 28

Hi, We are experimenting with directly generating vectorized LLVM IR (using <8 x float> kind of types), then compiling the code to SSE on a 64-bit machine. Right now the equivalent code in scalar mode still outperforms the SSE one. What is the quality of the SSE support in the X86 LLVM backend? Are there any specific things to be aware of to improve the speed? Thanks Stéphane Letz

[LLVMdev] loop vectorizer

2013 Oct 30

The loop vectorizer seems to be not able to vectorize the following code: void bar(std::uint64_t start, std::uint64_t end, float * __restrict__ c, float * __restrict__ a, float * __restrict__ b) { const std::uint64_t inner = 4; for (std::uint64_t i = start ; i < end ; ++i ) { const std::uint64_t ir0 = ( (i/inner) * 2 + 0 ) * inner + i%4; const std::uint64_t ir1 = ( (i/inner)

[LLVMdev] loop vectorizer

2013 Oct 30

----- Original Message ----- > > > I ran the BB vectorizer as I guess this is the SLP vectorizer. No, while the BB vectorizer is doing a form of SLP vectorization, there is a separate SLP vectorization pass which uses a different algorithm. You can pass -vectorize-slp to opt. -Hal > > BBV: using target information > BBV: fusing loop #1 for for.body in _Z3barmmPfS_S_...

[LLVMdev] loop vectorizer

2013 Oct 30

The SLP vectorizer apparently did something in the prologue of the function (where storing of arguments on the stack happens) which then got eliminated later on (since I don't see any vector instructions in the final IR). Below the debug output of the SLP pass: Args: opt -O1 -vectorize-slp -debug loop.ll -S SLP: Analyzing blocks in _Z3barmmPfS_S_. SLP: Found 2 stores to vectorize. SLP:

[LLVMdev] loop vectorizer

2013 Oct 30

The debug messages are misleading. They should read “trying to vectorize a list of …”. The problem is that the SCEV analysis is unable to detect that C[ir0] and C[ir1] are consecutive. Is this loop from an important benchmark? Thanks, Nadav On Oct 30, 2013, at 11:13 AM, Frank Winter <fwinter at jlab.org> wrote: > The SLP vectorizer apparently did something in the prologue of the

[LLVMdev] Vectorized LLVM IR

2010 May 29

On Sat, May 29, 2010 at 1:18 AM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Sat, May 29, 2010 at 12:42 AM, Stéphane Letz <letz at grame.fr> wrote: >> >> On 29 May 2010, at 01:08, Bill Wendling wrote: >> >>> Hi Stéphane, >>> >>> The SSE support in the LLVM backend is fine. What is the code that's generated? Do you have some

[LLVMdev] loop vectorizer

2013 Oct 30

Hi Frank, The access pattern to arrays a and b is non-linear. Unrolled loops are usually handled by the SLP vectorizer. Are ir0 and ir1 consecutive for all values of i? Thanks, Nadav On Oct 30, 2013, at 9:05 AM, Frank Winter <fwinter at jlab.org> wrote: > The loop vectorizer seems to be not able to vectorize the following code: > > void bar(std::uint64_t start,

[LLVMdev] loop vectorizer

2013 Oct 30

Well, they are not directly consecutive. They are consecutive with a constant offset or stride: ir1 = ir0 + 4. If I rewrite the function in this form: void bar(std::uint64_t start, std::uint64_t end, float * __restrict__ c, float * __restrict__ a, float * __restrict__ b) { const std::uint64_t inner = 4; for (std::uint64_t i = start ; i < end ; ++i ) { const std::uint64_t ir0 = (

[LLVMdev] Preserving NSW/NUW bits

2014 Sep 02

David/All, Just a quick question about NSW/NUW bits, if you've got a second. I noticed you've been doing a little work on this as of late. I have a bit of code that looks like the following: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 %2 = add i64 %indvars.iv.next, -1 %tmp = trunc i64 %2 to i32 %cmp = icmp slt i32 %tmp, %0 br i1 %cmp, label %for.body, label

loop unrolling introduces conditional branch

2015 Aug 20

Hi, I want to use the loop unrolling pass; however, I find that loop unrolling introduces a conditional branch at the end of every "unrolled" part. For example, consider the following code: void foo( int n, int array_x[]) { for (int i=0; i < n; i++) array_x[i] = i; } Then I use this command "opt-3.5 try.bc -mem2reg -loops -loop-simplify -loop-rotate -lcssa

loop unrolling introduces conditional branch

2015 Aug 22

Hi, Mehdi, For example, I have this very simple source code: void foo( int n, int array_x[]) { for (int i=0; i < n; i++) array_x[i] = i; } After I use "clang -emit-llvm -o bc_from_clang.bc -c try.cc", I get bc_from_clang.bc. With my code (using the LLVM IRBuilder API), I get bc_from_api.bc. Please find these two files attached. I also paste the IR here.

MemorySSA question

2017 Dec 19

Hi, I am new to MemorySSA and wanted to understand its capabilities. Hence I wrote the following program (test.c): int N; void test(int *restrict a, int *restrict b, int *restrict c, int *restrict d, int *restrict e) { int i; for (i = 0; i < N; i = i + 5) { a[i] = b[i] + c[i]; } for (i = 0; i < N - 5; i = i + 5) { e[i] = a[i] * d[i]; } } I compiled this program using

[LLVMdev] Loops Prevent Function Pointer Inlining?

2014 Sep 24

I've CC'ed Chad Rosier as I think this behaviour is a side-effect of his revert of IndVarSimplify.cpp (git c6b1a7e577a0b9e9cff9f9b7ac35a2cde7c448d8, SVN 217962). The change basically makes the IndVar pass change: ; <label>:4 ; preds = %6, %0 %i.0 = phi i32 [ 0, %0 ], [ %11, %6 ] %5 = icmp eq i32 %i.0, 0 br i1 %5, label %6, label %17 To:
