Bingo! That works (when coming from C source).
Now, I have a serious problem. I am not starting from C source; I build the
function with the builder. I am also forced to change the signature and
load the pointers a, b, c afterwards:
define void @bar([8 x i8]* nocapture readonly %arg_ptr) #0 {
entrypoint:
   %0 = bitcast [8 x i8]* %arg_ptr to i32*
   %1 = load i32* %0, align 4
   %2 = getelementptr [8 x i8]* %arg_ptr, i64 1
   %3 = bitcast [8 x i8]* %2 to i32*
   %4 = load i32* %3, align 4
   %5 = getelementptr [8 x i8]* %arg_ptr, i64 2
   %6 = bitcast [8 x i8]* %5 to float**
   %7 = load float** %6, align 8
   %8 = getelementptr [8 x i8]* %arg_ptr, i64 3
   %9 = bitcast [8 x i8]* %8 to float**
   %10 = load float** %9, align 8
   %11 = getelementptr [8 x i8]* %arg_ptr, i64 4
   %12 = bitcast [8 x i8]* %11 to float**
   %13 = load float** %12, align 8
   %14 = sext i32 %1 to i64
   br label %L0
Now, these pointers (%7, %10, %13) are not qualified with 'restrict', and
the loop vectorizer gives me the same message:
LV: We can't vectorize because we can't find the array bounds.
LV: Can't vectorize due to memory conflicts
LV: Not vectorizing.
I asked this a few days ago; now it comes up again: Is there a way to 
qualify a pointer/Value to be 'restrict'?
Another possible solution would be to tell the loop vectorizer that it's
safe to treat all arrays as disjoint. Is this possible?
Frank
On 28/10/13 15:11, Hal Finkel wrote:
> ----- Original Message -----
>> I am trying to vectorize the function
>>
>> void bar(float *c, float *a, float *b)
>> {
>>     const int width = 256;
>>     for (int i = 0 ; i < 256 ; ++i ) {
>>       c[ i ]         = a[ i ]         + b[ i ];
>>       c[ width + i ] = a[ width + i ] + b[ width + i ];
>>     }
>> }
>>
>> using the following commands
>>
>> clang -emit-llvm -S loop.c
>> opt loop.ll -O3 -debug-only=loop-vectorize -S -o -
>>
>> LV: Checking a loop in "bar"
>> LV: Found a loop: for.body
>> LV: Found an induction variable.
>> LV: Found an unidentified write ptr: float* %c
>> LV: Found an unidentified write ptr: float* %c
>> LV: Found an unidentified read ptr: float* %a
>> LV: Found an unidentified read ptr: float* %b
>> LV: Found an unidentified read ptr: float* %a
>> LV: Found an unidentified read ptr: float* %b
>> LV: Found a runtime check ptr:  %arrayidx4 = getelementptr inbounds float* %c, i64 %indvars.iv
>> LV: Found a runtime check ptr:  %arrayidx14 = getelementptr inbounds float* %c, i64 %2
>> LV: Found a runtime check ptr:  %arrayidx = getelementptr inbounds float* %a, i64 %indvars.iv
>> LV: Found a runtime check ptr:  %arrayidx2 = getelementptr inbounds float* %b, i64 %indvars.iv
>> LV: Found a runtime check ptr:  %arrayidx7 = getelementptr inbounds float* %a, i64 %2
>> LV: Found a runtime check ptr:  %arrayidx10 = getelementptr inbounds float* %b, i64 %2
>> LV: We need to do 10 pointer comparisons.
>> LV: We can't vectorize because we can't find the array bounds.
>> LV: Can't vectorize due to memory conflicts
>> LV: Not vectorizing.
>>
>> Is there any chance to make this work?
> Try adding the restrict keyword to the function parameters:
>
> void bar(float * restrict c, float * restrict a, float * restrict b)
>
>   -Hal
>
>> Frank
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
----- Original Message -----
> Bingo! That works (when coming from C source)
>
> Now, I have a serious problem. I am not coming from C but I build the
> function with the builder. I am also forced to change the signature
> and load the pointers a,b,c afterwards:
>
> define void @bar([8 x i8]* nocapture readonly %arg_ptr) #0 {
> entrypoint:
>    %0 = bitcast [8 x i8]* %arg_ptr to i32*
>    %1 = load i32* %0, align 4
>    %2 = getelementptr [8 x i8]* %arg_ptr, i64 1
>    %3 = bitcast [8 x i8]* %2 to i32*
>    %4 = load i32* %3, align 4
>    %5 = getelementptr [8 x i8]* %arg_ptr, i64 2
>    %6 = bitcast [8 x i8]* %5 to float**
>    %7 = load float** %6, align 8
>    %8 = getelementptr [8 x i8]* %arg_ptr, i64 3
>    %9 = bitcast [8 x i8]* %8 to float**
>    %10 = load float** %9, align 8
>    %11 = getelementptr [8 x i8]* %arg_ptr, i64 4
>    %12 = bitcast [8 x i8]* %11 to float**
>    %13 = load float** %12, align 8
>    %14 = sext i32 %1 to i64
>    br label %L0
>
> Now, these pointer (%7,%10,%13) are not qualified with 'restrict' and
> the loop vectorizer gives me the same message:
>
> LV: We can't vectorize because we can't find the array bounds.
> LV: Can't vectorize due to memory conflicts
> LV: Not vectorizing.
>
> I asked this a few days ago; now it comes up again: Is there a way to
> qualify a pointer/Value to be 'restrict'?

Currently, no. There will be work in that direction soon. You'll need to
extract a sub-function so that you can put 'noalias' on the function
arguments.

> Another possible solution would be telling the loop vectorizer that
> it's safe to treat all arrays as disjunct. Is this possible?

Yes. Look for llvm.mem.parallel_loop_access in the language reference.

 -Hal

> Frank

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
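For readers who do start from C, the "treat all accesses as independent" request has a source-level counterpart in later Clang releases: a loop pragma. The following is only a minimal sketch, assuming a Clang version that supports `#pragma clang loop vectorize(assume_safety)`; as far as I know, Clang expresses this through parallel-loop access metadata of the kind mentioned above, but that detail should be checked against the Clang documentation for the version in use. The function name `bar_assume_safe` is hypothetical.

```c
/* Hypothetical variant of the bar() loop from this thread.
 * The pragma is Clang-specific: it asserts that the loop has no
 * loop-carried memory dependences, so the vectorizer does not need
 * runtime pointer checks. Compilers that do not know the pragma
 * ignore it, and the loop still computes the same result. */
void bar_assume_safe(float *c, float *a, float *b)
{
    const int width = 256;
#pragma clang loop vectorize(assume_safety)
    for (int i = 0; i < width; ++i) {
        c[i]         = a[i]         + b[i];
        c[width + i] = a[width + i] + b[width + i];
    }
}
```

The pragma is a promise by the programmer, not something the compiler verifies: if c actually overlaps a or b, the vectorized loop may produce wrong results.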
Thanks for the alternatives!
I am trying the 'extract a sub-function' approach. However, I can't get
the sub-function to pass the verifier. This is my sub-function:
define void @main_extern([8 x i8]* %arg_ptr) {
entrypoint:
   %0 = getelementptr [8 x i8]* %arg_ptr, i32 0
   %1 = bitcast [8 x i8]* %0 to i64*
   %2 = load i64* %1
   %3 = getelementptr [8 x i8]* %arg_ptr, i32 1
   %4 = bitcast [8 x i8]* %3 to i64*
   %5 = load i64* %4
   %6 = getelementptr [8 x i8]* %arg_ptr, i32 2
   %7 = bitcast [8 x i8]* %6 to float**
   %8 = load float** %7
   %9 = getelementptr [8 x i8]* %arg_ptr, i32 3
   %10 = bitcast [8 x i8]* %9 to float**
   %11 = load float** %10
   %12 = getelementptr [8 x i8]* %arg_ptr, i32 4
   %13 = bitcast [8 x i8]* %12 to float**
   %14 = load float** %13
   call void @main(i64 %2, i64 %5, float* %8, float* %11, float* %14)
   ret void
}
It looks good to me. However, the verifier pass fails:
/svn/llvm/include/llvm/Support/Casting.h:97: static bool 
llvm::isa_impl_cl<To, const From*>::doit(const From*) [with To = 
llvm::GlobalVariable; From = llvm::GlobalValue]: Assertion `Val && 
"isa<> used on a null pointer"' failed.
I have no idea what this is trying to tell me. Any ideas?
Frank
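For comparison, here is the same wrapper-plus-kernel structure written in C. This is a sketch under assumptions, not the code from this thread: the names `kernel_noalias` and `main_extern`, the meaning of the two integer slots (taken here as loop bounds lo and hi) and the order of the three pointers (c, a, b) are all guesses made for illustration. The wrapper unpacks 8-byte argument slots and passes the loaded pointers to a kernel whose parameters are 'restrict', which Clang turns into 'noalias' on the function arguments.

```c
#include <stdint.h>
#include <string.h>

enum { SLOT = 8 }; /* one 8-byte slot per argument, mirroring [8 x i8] */

/* Kernel: the restrict qualifiers promise that c, a and b do not overlap. */
static void kernel_noalias(int64_t lo, int64_t hi,
                           float *restrict c,
                           const float *restrict a,
                           const float *restrict b)
{
    for (int64_t i = lo; i < hi; ++i)
        c[i] = a[i] + b[i];
}

/* Wrapper: loads every argument from the packed buffer, then calls the
 * kernel. The slot layout (lo, hi, c, a, b) is an assumption. */
void main_extern(const unsigned char *arg_ptr)
{
    int64_t lo, hi;
    float *c, *a, *b;

    memcpy(&lo, arg_ptr + 0 * SLOT, sizeof lo);
    memcpy(&hi, arg_ptr + 1 * SLOT, sizeof hi);
    memcpy(&c,  arg_ptr + 2 * SLOT, sizeof c);
    memcpy(&a,  arg_ptr + 3 * SLOT, sizeof a);
    memcpy(&b,  arg_ptr + 4 * SLOT, sizeof b);

    kernel_noalias(lo, hi, c, a, b);
}
```

Depending on the compiler version, inlining the kernel back into the wrapper may lose the no-alias information, so marking the kernel as noinline can be worth trying if the loop still fails to vectorize.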
On 28/10/13 15:27, Hal Finkel wrote:
> ----- Original Message -----
>> Bingo! That works (when coming from C source)
>>
>> Now, I have a serious problem. I am not coming from C but I build the
>> function with the builder. I am also forced to change the signature
>> and
>> load the pointers a,b,c afterwards:
>>
>> define void @bar([8 x i8]* nocapture readonly %arg_ptr) #0 {
>> entrypoint:
>>     %0 = bitcast [8 x i8]* %arg_ptr to i32*
>>     %1 = load i32* %0, align 4
>>     %2 = getelementptr [8 x i8]* %arg_ptr, i64 1
>>     %3 = bitcast [8 x i8]* %2 to i32*
>>     %4 = load i32* %3, align 4
>>     %5 = getelementptr [8 x i8]* %arg_ptr, i64 2
>>     %6 = bitcast [8 x i8]* %5 to float**
>>     %7 = load float** %6, align 8
>>     %8 = getelementptr [8 x i8]* %arg_ptr, i64 3
>>     %9 = bitcast [8 x i8]* %8 to float**
>>     %10 = load float** %9, align 8
>>     %11 = getelementptr [8 x i8]* %arg_ptr, i64 4
>>     %12 = bitcast [8 x i8]* %11 to float**
>>     %13 = load float** %12, align 8
>>     %14 = sext i32 %1 to i64
>>     br label %L0
>>
>> Now, these pointer (%7,%10,%13) are not qualified with 'restrict' and
>> the loop vectorizer gives me the same message:
>>
>> LV: We can't vectorize because we can't find the array bounds.
>> LV: Can't vectorize due to memory conflicts
>> LV: Not vectorizing.
>>
>> I asked this a few days ago; now it comes up again: Is there a way to
>> qualify a pointer/Value to be 'restrict'?
> Currently, no. There will be work in that direction soon. You'll need
> to extract a sub-function so that you can put 'noalias' on the function
> arguments.
>
>> Another possible solution would be telling the loop vectorizer that
>> it's
>> safe to treat all arrays as disjunct. Is this possible?
> Yes. Look for llvm.mem.parallel_loop_access in the language reference.
>
>   -Hal
>
>> Frank
>>