Displaying 12 results from an estimated 12 matches for "unroll_count".
2016 Nov 18
2
Loop invariant not being optimized
...e loop-invariant
> optimization happening, but it's not. Here's the C code:
>
> #define DIM 8
> #define UNROLL_DIM DIM
> typedef double InArray[DIM][DIM];
>
> __declspec(noalias) void f1( InArray c, const InArray a, const InArray b )
> {
>
>     #pragma clang loop unroll_count(UNROLL_DIM)
>     for( int i=0;i<DIM;i++)
>         #pragma clang loop unroll_count(UNROLL_DIM)
>         for( int j=0;j<DIM;j++)
>             #pragma clang loop unroll_count(UNROLL_DIM)
>             for( int k=0;k<DIM;k++) {
>                 c[i][k] = c[i][k] + a[i][j]*b[j][k];
>...
2016 Nov 17
2
Loop invariant not being optimized
...example where I think that there should be some loop-invariant
optimization happening, but it's not. Here's the C code:
#define DIM 8
#define UNROLL_DIM DIM
typedef double InArray[DIM][DIM];
__declspec(noalias) void f1( InArray c, const InArray a, const InArray b )
{
    #pragma clang loop unroll_count(UNROLL_DIM)
    for( int i=0;i<DIM;i++)
        #pragma clang loop unroll_count(UNROLL_DIM)
        for( int j=0;j<DIM;j++)
            #pragma clang loop unroll_count(UNROLL_DIM)
            for( int k=0;k<DIM;k++) {
                c[i][k] = c[i][k] + a[i][j]*b[j][k];
            }
}
The "a[i][j]"...
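A hedged aside on the snippet above: `a[i][j]` does not depend on `k`, so it is invariant in the innermost loop. One standard-C way to promise the compiler that stores through `c` cannot modify `a` or `b` is `restrict`. The sketch below assumes aliasing is what blocks the hoist, which the truncated post does not confirm, and `restrict` is a different mechanism from MSVC's `__declspec(noalias)`. The name `f1_restrict` is hypothetical, and whether a given compiler version then hoists the load is not guaranteed here.

```c
#define DIM 8

/* Hypothetical variant of f1 from the post, not the author's code.
 * restrict promises the compiler that c, a and b never overlap,
 * so a store to c[i][k] cannot change a[i][j]. */
void f1_restrict(double (*restrict c)[DIM],
                 double (*restrict a)[DIM],
                 double (*restrict b)[DIM])
{
    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++)
            for (int k = 0; k < DIM; k++)
                c[i][k] = c[i][k] + a[i][j] * b[j][k];
}
```

Calling this with the same array for `c` and `a` would break the `restrict` promise and is undefined behavior, so it only fits callers that really pass distinct arrays.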
2016 Jul 11
2
extra loads in nested for-loop
I was looking at the code generated from the following C code and noticed
extra loads in the inner loop of these nested for-loops:
#define DIM 8
#define UNROLL_DIM DIM
typedef double InArray[DIM][DIM];
void f1( InArray c, InArray a, InArray b ) {
    #pragma clang loop unroll_count(UNROLL_DIM)
    for( int i=0;i<DIM;i++)
        #pragma clang loop unroll_count(UNROLL_DIM)
        for( int j=0;j<DIM;j++)
            #pragma clang loop unroll_count(UNROLL_DIM)
            for( int k=0;k<DIM;k++) {
                c[i][k] = c[i][k] + a[i][j]*b[j][k];
            }
}
In the inner-most loo...
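As a minimal sketch, assuming the extra loads are reloads of `a[i][j]` (the post is truncated before it says so): the value can be hoisted by hand into a local before the `k` loop, so the source itself reads it once per `(i, j)` pair regardless of what the optimizer can prove about aliasing. The name `f1_hoisted` and the local `aij` are hypothetical. Note that this changes behavior if `c` and `a` overlap, since the original would observe stores to `c` when rereading `a`.

```c
#define DIM 8
typedef double InArray[DIM][DIM];

/* Hypothetical hand-hoisted variant of f1 from the post. */
void f1_hoisted(InArray c, InArray a, InArray b)
{
    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++) {
            /* a[i][j] does not depend on k: load it once here. */
            const double aij = a[i][j];
            for (int k = 0; k < DIM; k++)
                c[i][k] = c[i][k] + aij * b[j][k];
        }
}
```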
2016 Aug 12
4
Invoke loop vectorizer
...e vectorized code on IR level?
On Aug 12, 2016 11:39 AM, "Daniel Berlin" <dberlin at dberlin.org> wrote:
> cat > test.c
> #define SIZE 128
>
> void bar(int *restrict A, int* restrict B,int K) {
> #pragma clang loop vectorize(enable) vectorize_width(2) unroll_count(8)
>     for (int i = 0; i < SIZE; ++i)
>         A[i] += B[i] + K;
> }
>
> [dannyb at dannyb-macbookpro3 11:37:20] ~ :) $ clang -O3 test.c -c
> -save-temps
> [dannyb at dannyb-macbookpro3 11:38:28] ~ :) $ pcregrep -i "^\s*p"
> test.s|less
>...
2016 Aug 12
2
Invoke loop vectorizer
...alias in this example. It's
> almost certainly not profitable to add a runtime check given the size of
> the loop.
>
>
> try
>
> #define SIZE 8
>
> void bar(int *restrict A, int* restrict B,int K) {
> #pragma clang loop vectorize(enable) vectorize_width(2) unroll_count(8)
>     for (int i = 0; i < SIZE; ++i)
>         A[i] += B[i] + K;
> }
>
> (i don't remember if llvm also does runtime alias checks, but if it does,
> you'd probably need to increase size to get it to vectorize)
>
> On Fri, Aug 12, 2016 at 11:08 AM, Xiao...
2016 Aug 12
2
Invoke loop vectorizer
...loop vectorizer and SLP vectorizer are enabled,
my simple test still does not get optimized. I also tried the clang pragma in
my test to force vectorization. What do you think the problem is?
Test:
#define SIZE 8
void bar(int *A, int* B,int K) {
#pragma clang loop vectorize(enable) vectorize_width(2) unroll_count(8)
    for (int i = 0; i < SIZE; ++i)
        A[i] += B[i] + K;
}
Thanks,
Xiaochu
On Aug 12, 2016 4:06 AM, "Andrey Bokhanko" <andreybokhanko at gmail.com> wrote:
> Hi Xiaochu,
>
> Clang uses -O0 by default, which doesn't run any optimizations. Try
> supplying -O1 o...
2016 Oct 25
2
[Help] Add custom pragma
Hi, all.
I want to give programmers the ability to tell LLVM that a certain region of
code is expected to get specialized optimization.
So, I'm trying to make a custom pragma to mark a certain region of code and
pass this information to LLVM, in a similar way to how '#pragma clang loop
unroll_count(N)' works.
By tracing the framework of the loop unroll pragma, I found out it works in
the way below.
(1) Detect the pragma in the lexer and parser.
(2) Create an AttributeList and push it into the AST.
(3) Once the AST is built, consume the AST and generate LLVM IR at CodeGeneration
(4) If attribute for loop unroll is fo...
2014 Jul 17
4
[LLVMdev] Removing metadata in a pass
...pass? The context is patch
http://reviews.llvm.org/D4571 which removes loop unrolling hint metadata
after using it to avoid unrolling more than the hint suggests. This is a
potential problem because loop unrolling can be run more than once.
Example: a loop annotated with "#pragma clang loop unroll_count(2)" which
adds hint metadata to the loop would be unrolled twice every time the loop
unrolling pass is run. Anyway, I ask about metadata removal because Eli
who is reviewing the patch wasn't sure whether this was acceptable.
Loop unrolling metadata can take the following forms:
llvm.loop...
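To make the concern concrete, here is a hand-written sketch of what unrolling by 2 looks like at the source level. As I read the post, if the hint stayed on the loop and the pass ran a second time, the already-unrolled loop would be unrolled by 2 again, ending up with four copies of the body instead of the two requested. The functions `sum_hinted` and `sum_unrolled2` are hypothetical illustrations, not output of the LLVM pass.

```c
#include <stddef.h>

/* Loop carrying the hint mentioned in the post. */
long sum_hinted(const int *v, size_t n)
{
    long total = 0;
#pragma clang loop unroll_count(2)
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Hand-written equivalent of unrolling that loop by 2:
 * two copies of the body per iteration plus a remainder step. */
long sum_unrolled2(const int *v, size_t n)
{
    long total = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        total += v[i];
        total += v[i + 1];
    }
    if (i < n)              /* odd trip count: one element left over */
        total += v[i];
    return total;
}
```

Both functions compute the same sum; only the loop shape differs. Compilers that do not know the pragma ignore it, possibly with a warning.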
2016 Oct 25
2
[Help] Add custom pragma
...I want to give programmers the ability to tell LLVM that a certain region of
> code is expected to get specialized optimization.
> > So, I'm trying to make a custom pragma to mark a certain region of code and
> pass this information to LLVM, in a similar way to how '#pragma clang loop
> unroll_count(N)' works.
> >
> > By tracing the framework of the loop unroll pragma, I found out it works in
> the way below.
> > (1) Detect the pragma in the lexer and parser.
> > (2) Create an AttributeList and push it into the AST.
> > (3) Once the AST is built, consume the AST and generate LLVM IR at...
2016 Oct 25
0
[Help] Add custom pragma
...all.
> > I want to give programmers the ability to tell LLVM that a certain region of code is expected to get specialized optimization.
> > So, I'm trying to make a custom pragma to mark a certain region of code and pass this information to LLVM, in a similar way to how '#pragma clang loop unroll_count(N)' works.
> >
> > By tracing the framework of the loop unroll pragma, I found out it works in the way below.
> > (1) Detect the pragma in the lexer and parser.
> > (2) Create an AttributeList and push it into the AST.
> > (3) Once the AST is built, consume the AST and generate LLVM IR at Code...
2016 Aug 11
2
Invoke loop vectorizer
Hi there,
I use clang-cl /Qvec test.c to compile the code, but the LoopVectorizer
pass is never invoked.
I was wondering if this is sufficient to enable the auto-vectorizer?
Thanks,
Xiaochu
2015 Nov 02
2
noalias parameter attribute not currently exploited by alias analysis?
...demonstrates the same issue which is unencumbered by the EEMBC
license. Consider this simple example program:
#include <stdint.h>
#include <stdio.h>
void main_loop(int len, uint8_t *restrict input_buf, uint8_t *restrict
               output_buf) {
    int i;
    uint8_t a, b, c;
    #pragma clang loop unroll_count(8)
    for (i = 0; i < len; i++) {
        a = *input_buf++;
        b = *input_buf++;
        c = *input_buf++;
        a = (uint8_t) (a - 10);
        b = (uint8_t) (b - 20);
        c = (uint8_t) (c - 30);
        *output_buf++ = a;
        *output_buf++ = b;
        *output_buf++ = c;
    }
}
__attribute__((flatten))
void dumm...