search for: unroll_count

Displaying 12 results from an estimated 12 matches for "unroll_count".

2016 Nov 18
2
Loop invariant not being optimized
...e loop-invariant > optimization happening, but it's not. Here's the C code: > > #define DIM 8 > #define UNROLL_DIM DIM > typedef double InArray[DIM][DIM]; > > __declspec(noalias) void f1( InArray c, const InArray a, const InArray b ) > { > > #pragma clang loop unroll_count(UNROLL_DIM) > for( int i=0;i<DIM;i++) > #pragma clang loop unroll_count(UNROLL_DIM) > for( int j=0;j<DIM;j++) > #pragma clang loop unroll_count(UNROLL_DIM) > for( int k=0;k<DIM;k++) { > c[i][k] = c[i][k] + a[i][j]*b[j][k]; >...
2016 Nov 17
2
Loop invariant not being optimized
...example where I think that there should be some loop-invariant optimization happening, but it's not. Here's the C code: #define DIM 8 #define UNROLL_DIM DIM typedef double InArray[DIM][DIM]; __declspec(noalias) void f1( InArray c, const InArray a, const InArray b ) { #pragma clang loop unroll_count(UNROLL_DIM) for( int i=0;i<DIM;i++) #pragma clang loop unroll_count(UNROLL_DIM) for( int j=0;j<DIM;j++) #pragma clang loop unroll_count(UNROLL_DIM) for( int k=0;k<DIM;k++) { c[i][k] = c[i][k] + a[i][j]*b[j][k]; } } The "a[i][j]"...
2016 Jul 11
2
extra loads in nested for-loop
I was looking at the code generated from the following c code and noticed extra loads in the inner-loop of these nested for-loops: #define DIM 8 #define UNROLL_DIM DIM typedef double InArray[DIM][DIM]; void f1( InArray c, InArray a, InArray b ) { #pragma clang loop unroll_count(UNROLL_DIM) for( int i=0;i<DIM;i++) #pragma clang loop unroll_count(UNROLL_DIM) for( int j=0;j<DIM;j++) #pragma clang loop unroll_count(UNROLL_DIM) for( int k=0;k<DIM;k++) { c[i][k] = c[i][k] + a[i][j]*b[j][k]; } } In the inner-most loo...
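A minimal sketch of the situation this snippet describes, not the poster's own fix. In the inner k loop, a[i][j] does not depend on k, but because c, a, and b are all pointers to double, a store to c[i][k] may alias a[i][j], so a compiler without aliasing information has to assume the value can change. Reading it into a local by hand removes that concern. The name f1_hoisted is hypothetical, and the result matches the original f1 only when c does not overlap a.

```c
#define DIM 8
typedef double InArray[DIM][DIM];

/* Hypothetical hand-hoisted variant of f1 from the snippet above.
   Assumption: c does not overlap a, so reading a[i][j] once per (i, j)
   gives the same result as re-reading it on every k iteration. */
static void f1_hoisted(InArray c, InArray a, InArray b)
{
    for (int i = 0; i < DIM; i++) {
        for (int j = 0; j < DIM; j++) {
            const double aij = a[i][j];   /* invariant across the k loop */
            for (int k = 0; k < DIM; k++) {
                c[i][k] = c[i][k] + aij * b[j][k];
            }
        }
    }
}
```

With a set to the identity matrix and c zeroed, calling f1_hoisted(c, a, b) leaves c equal to b, which is a quick way to check the sketch against the original loop nest.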
2016 Aug 12
4
Invoke loop vectorizer
...e vectorized code on IR level? On Aug 12, 2016 11:39 AM, "Daniel Berlin" <dberlin at dberlin.org> wrote: > cat > test.c > > #define SIZE 128 > > void bar(int *restrict A, int* restrict B,int K) { > > #pragma clang loop vectorize(enable) vectorize_width(2) unroll_count(8) > > for (int i = 0; i < SIZE; ++i) > > A[i] += B[i] + K; > > } > > [dannyb at dannyb-macbookpro3 11:37:20] ~ :) $ clang -O3 test.c -c > -save-temps > [dannyb at dannyb-macbookpro3 11:38:28] ~ :) $ pcregrep -i "^\s*p" > test.s|less >...
2016 Aug 12
2
Invoke loop vectorizer
...alias in this example. It's > almost certainly not profitable to add a runtime check given the size of > the loop. > > > try > > #define SIZE 8 > > void bar(int *restrict A, int* restrict B,int K) { > > #pragma clang loop vectorize(enable) vectorize_width(2) unroll_count(8) > > for (int i = 0; i < SIZE; ++i) > > A[i] += B[i] + K; > > } > > (i don't remember if llvm also does runtime alias checks, but if it does, > you'd probably need to increase size to get it to vectorize) > > On Fri, Aug 12, 2016 at 11:08 AM, Xiao...
2016 Aug 12
2
Invoke loop vectorizer
...loop vectorizer and SLP vectorizer are enabled, my simple test still not get optimized. I also tried clang pragma in my test to force vectorization. What do you think is the problem? Test: #define SIZE 8 void bar(int *A, int* B,int K) { #pragma clang loop vectorize(enable) vectorize_width(2) unroll_count(8) for (int i = 0; i < SIZE; ++i) A[i] += B[i] + K; } Thanks, Xiaochu On Aug 12, 2016 4:06 AM, "Andrey Bokhanko" <andreybokhanko at gmail.com> wrote: > Hi Xiaochu, > > Clang uses -O0 by default, that doesn't run any optimizations. Try > supplying -O1 o...
2016 Oct 25
2
[Help] Add custom pragma
Hi, all. I want to give programmer ability to tell LLVM that certain region of code is expected to get specialized optimization. So, I'm trying to make custom pragma to mark certain region of code and pass this information to LLVM, in the similar way that '#pragma clang loop unroll_count(N)' works. By tracking the framework of loop unroll pragma, I found out it works in the way below. (1) Detect pragma at lexer, parser. (2) Create AttributeList and push it into AST. (3) Once AST is built, consume AST and generate LLVM IR at CodeGeneration (4) If attribute for loop unroll is fo...
2014 Jul 17
4
[LLVMdev] Removing metadata in a pass
...pass? The context is patch http://reviews.llvm.org/D4571 which removes loop unrolling hint metadata after using it to avoid unrolling more than the hint suggests. This is a potential problem because loop unrolling can be run more than once. Example: a loop annotated with "#pragma clang loop unroll_count(2)" which adds hint metadata to the loop would be unrolled twice every time the loop unrolling pass is run. Anyway, I ask about metadata removal because Eli who is reviewing the patch wasn't sure whether this was acceptable. Loop unrolling metadata can take the following forms: llvm.loop...
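To illustrate what an unroll count of 2 means for the loop body, here is a hand-written sketch in C. It is not output from the unrolling pass, and the function names are hypothetical. Applying the same transformation a second time to the already unrolled loop would replicate the body four times, which is the over-unrolling the snippet above is concerned about.

```c
#include <stddef.h>

/* Rolled form: one element per iteration. */
static long sum_rolled(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        s += v[i];
    }
    return s;
}

/* Hand-unrolled by a factor of 2: the body appears twice per iteration,
   and a single trailing step handles an odd element count. */
static long sum_unrolled2(const int *v, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s += v[i];
        s += v[i + 1];
    }
    if (i < n) {
        s += v[i];   /* remainder when n is odd */
    }
    return s;
}
```

Both functions return the same sum for any input; only the shape of the loop differs.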
2016 Oct 25
2
[Help] Add custom pragma
...I want to give programmer ability to tell LLVM that certain region of > code is expected to get specialized optimization. > > So, I'm trying to make custom pragma to mark certain region of code and > pass this information to LLVM, in the similar way that '#pragma clang loop > unroll_count(N)' works. > > > > By tracking the framework of loop unroll pragma, I found out it works in > the way below. > > (1) Detect pragma at lexer, parser. > > (2) Create AttributeList and push it into AST. > > (3) Once AST is built, consume AST and generate LLVM IR at...
2016 Oct 25
0
[Help] Add custom pragma
...all. > > I want to give programmer ability to tell LLVM that certain region of code is expected to get specialized optimization. > > So, I'm trying to make custom pragma to mark certain region of code and pass this information to LLVM, in the similar way that '#pragma clang loop unroll_count(N)' works. > > > > By tracking the framework of loop unroll pragma, I found out it works in the way below. > > (1) Detect pragma at lexer, parser. > > (2) Create AttributeList and push it into AST. > > (3) Once AST is built, consume AST and generate LLVM IR at Code...
2016 Aug 11
2
Invoke loop vectorizer
Hi there, I use clang-cl /Qvec test.c to compile the code. But the pass LoopVectorizer is never invoked. I was wondering if this is sufficient to enable auto vectorizer? Thanks, Xiaochu
2015 Nov 02
2
noalias parameter attribute not currently exploited by alias analysis?
...demonstrates the same issue which is unencumbered by the EEMBC license. Consider this simple example program: #include <stdint.h> #include <stdio.h> void main_loop(int len, uint8_t *restrict input_buf, uint8_t *restrict output_buf) { int i; uint8_t a, b, c; #pragma clang loop unroll_count(8) for (i = 0; i < len; i++) { a = *input_buf++; b = *input_buf++; c = *input_buf++; a = (uint8_t) (a - 10); b = (uint8_t) (b - 20); c = (uint8_t) (c - 30); *output_buf++ = a; *output_buf++ = b; *output_buf++ = c; } } __attribute__((flatten)) void dumm...