I am trying to vectorize the function

void bar(float *c, float *a, float *b)
{
    const int width = 256;
    for (int i = 0 ; i < 256 ; ++i ) {
        c[ i ] = a[ i ] + b[ i ];
        c[ width + i ] = a[ width + i ] + b[ width + i ];
    }
}

using the following commands

clang -emit-llvm -S loop.c
opt loop.ll -O3 -debug-only=loop-vectorize -S -o -

LV: Checking a loop in "bar"
LV: Found a loop: for.body
LV: Found an induction variable.
LV: Found an unidentified write ptr: float* %c
LV: Found an unidentified write ptr: float* %c
LV: Found an unidentified read ptr: float* %a
LV: Found an unidentified read ptr: float* %b
LV: Found an unidentified read ptr: float* %a
LV: Found an unidentified read ptr: float* %b
LV: Found a runtime check ptr: %arrayidx4 = getelementptr inbounds float* %c, i64 %indvars.iv
LV: Found a runtime check ptr: %arrayidx14 = getelementptr inbounds float* %c, i64 %2
LV: Found a runtime check ptr: %arrayidx = getelementptr inbounds float* %a, i64 %indvars.iv
LV: Found a runtime check ptr: %arrayidx2 = getelementptr inbounds float* %b, i64 %indvars.iv
LV: Found a runtime check ptr: %arrayidx7 = getelementptr inbounds float* %a, i64 %2
LV: Found a runtime check ptr: %arrayidx10 = getelementptr inbounds float* %b, i64 %2
LV: We need to do 10 pointer comparisons.
LV: We can't vectorize because we can't find the array bounds.
LV: Can't vectorize due to memory conflicts
LV: Not vectorizing.

Is there any chance to make this work?

Frank
----- Original Message -----
> Is there any chance to make this work?

Try adding the restrict keyword to the function parameters:

void bar(float * restrict c, float * restrict a, float * restrict b)

 -Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
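For reference, the suggestion above applied to the whole function might look like the following. This is only a sketch: the `WIDTH` constant and the `bar_selfcheck` helper are names made up for illustration, and whether the loop actually vectorizes still depends on the compiler version and target. The `restrict` qualifiers are a promise from the caller that the three arrays do not overlap; passing overlapping arrays would be undefined behavior.

```c
#include <assert.h>

enum { WIDTH = 256 };  /* illustrative name for the constant 256 in the original loop */

/* Same loop as in the original post, with restrict on all three parameters.
   The caller guarantees that c, a and b refer to non-overlapping storage. */
void bar(float * restrict c, float * restrict a, float * restrict b)
{
    for (int i = 0; i < WIDTH; ++i) {
        c[i] = a[i] + b[i];
        c[WIDTH + i] = a[WIDTH + i] + b[WIDTH + i];
    }
}

/* Hypothetical usage example: three distinct arrays of 2 * WIDTH floats. */
int bar_selfcheck(void)
{
    static float a[2 * WIDTH], b[2 * WIDTH], c[2 * WIDTH];

    for (int i = 0; i < 2 * WIDTH; ++i) {
        a[i] = (float)i;
        b[i] = 2.0f * (float)i;
    }

    bar(c, a, b);

    /* Every element of c should be a[i] + b[i] = 3 * i (exact in float). */
    for (int i = 0; i < 2 * WIDTH; ++i)
        assert(c[i] == 3.0f * (float)i);

    return 1;
}
```

The debug output can then be checked with the same clang and opt commands as in the original message.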
Bingo! That works (when coming from C source).

Now I have a serious problem. I am not coming from C; I build the function with the builder. I am also forced to change the signature and load the pointers a, b, c afterwards:

define void @bar([8 x i8]* nocapture readonly %arg_ptr) #0 {
entrypoint:
  %0 = bitcast [8 x i8]* %arg_ptr to i32*
  %1 = load i32* %0, align 4
  %2 = getelementptr [8 x i8]* %arg_ptr, i64 1
  %3 = bitcast [8 x i8]* %2 to i32*
  %4 = load i32* %3, align 4
  %5 = getelementptr [8 x i8]* %arg_ptr, i64 2
  %6 = bitcast [8 x i8]* %5 to float**
  %7 = load float** %6, align 8
  %8 = getelementptr [8 x i8]* %arg_ptr, i64 3
  %9 = bitcast [8 x i8]* %8 to float**
  %10 = load float** %9, align 8
  %11 = getelementptr [8 x i8]* %arg_ptr, i64 4
  %12 = bitcast [8 x i8]* %11 to float**
  %13 = load float** %12, align 8
  %14 = sext i32 %1 to i64
  br label %L0

Now these pointers (%7, %10, %13) are not qualified with 'restrict', and the loop vectorizer gives me the same message:

LV: We can't vectorize because we can't find the array bounds.
LV: Can't vectorize due to memory conflicts
LV: Not vectorizing.

I asked this a few days ago; now it comes up again: is there a way to qualify a pointer/Value as 'restrict'?

Another possible solution would be to tell the loop vectorizer that it is safe to treat all arrays as disjoint. Is this possible?

Frank

On 28/10/13 15:11, Hal Finkel wrote:
> Try adding the restrict keyword to the function parameters:
>
> void bar(float * restrict c, float * restrict a, float * restrict b)
>
> -Hal
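The situation in the follow-up can be pictured in C as below, together with one thing that might be worth trying: keep the forced outer signature, load the pointers there, and pass them to a separate inner function whose parameters carry `restrict` (clang lowers `restrict` parameters to `noalias` arguments in the IR). This is a sketch under several assumptions and not a confirmed fix. The `slot_t` type, the names `bar_kernel` and `bar_selfcheck`, and the mapping of slots 2, 3 and 4 to c, a and b are all hypothetical; the two integer slots are left unused here. The `noinline` attribute is used so the qualifiers stay visible at the function boundary; whether the vectorizer then accepts the loop depends on the LLVM version.

```c
#include <assert.h>

enum { WIDTH = 256 };  /* illustrative name for the constant 256 in the loop */

/* Hypothetical 8-byte argument slot, loosely mirroring the [8 x i8]
   elements of %arg_ptr in the IR above (assumes 64-bit pointers). */
typedef union {
    int           i;       /* slots 0 and 1 hold 32-bit integers */
    float        *p;       /* slots 2..4 hold the array pointers  */
    unsigned char raw[8];
} slot_t;

/* Inner kernel: the no-overlap promise is attached to the parameters.
   noinline keeps this a separate function with qualified arguments. */
__attribute__((noinline))
static void bar_kernel(float * restrict c, float * restrict a, float * restrict b)
{
    for (int i = 0; i < WIDTH; ++i) {
        c[i] = a[i] + b[i];
        c[WIDTH + i] = a[WIDTH + i] + b[WIDTH + i];
    }
}

/* Outer entry point with the forced signature: it only unpacks the
   pointers and forwards them. The slot order is an assumption. */
void bar(const slot_t *arg_ptr)
{
    float *c = arg_ptr[2].p;
    float *a = arg_ptr[3].p;
    float *b = arg_ptr[4].p;
    bar_kernel(c, a, b);
}

/* Hypothetical usage example with three distinct arrays. */
int bar_selfcheck(void)
{
    static float a[2 * WIDTH], b[2 * WIDTH], c[2 * WIDTH];
    slot_t args[5] = {{0}};

    for (int i = 0; i < 2 * WIDTH; ++i) {
        a[i] = (float)i;
        b[i] = 2.0f * (float)i;
    }
    args[2].p = c;
    args[3].p = a;
    args[4].p = b;

    bar(args);

    /* Every element of c should be a[i] + b[i] = 3 * i (exact in float). */
    for (int i = 0; i < 2 * WIDTH; ++i)
        assert(c[i] == 3.0f * (float)i);

    return 1;
}
```

When the IR is generated directly rather than from C, the analogous step would be to emit the loop in its own function and mark that function's pointer arguments `noalias`. The cost is one extra call per invocation, which is small next to a 512-element loop.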