search for: arrayidx9

Displaying 16 results from an estimated 16 matches for "arrayidx9".

Did you mean: arrayidx
2013 Oct 30
3
[LLVMdev] loop vectorizer
...; %rem = and i64 %storemerge10, 3 > %add2 = or i64 %mul1, %rem > %0 = lshr i64 %storemerge10, 1 > %add51 = shl i64 %0, 2 > %mul6 = or i64 %rem, %add51 > %add8 = or i64 %mul6, 4 > %arrayidx = getelementptr inbounds float* %a, i64 %add2 > %1 = load float* %arrayidx, align 4 > %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 > %2 = load float* %arrayidx9, align 4 > %add10 = fadd float %1, %2 > %arrayidx11 = getelementptr inbounds float* %c, i64 %add2 > store float %add10, float* %arrayidx11, align 4 > %arrayidx12 = getelementptr inbounds float* %a, i64 %add8...
2013 Nov 11
2
[LLVMdev] What's the Alias Analysis does clang use ?
...** %v2.addr, 8) AliasSet[0x1b912e0, 1] must alias, Mod/Ref Pointers: (float** %t.addr, 8) AliasSet[0x1b913a0, 1] must alias, Mod/Ref Pointers: (i32* %i, 4) AliasSet[0x1b91510, 4] may alias, Mod/Ref Pointers: (float* %arrayidx, 4), (float* %arrayidx1, 4), (float* %arrayidx2, 4), (float* %arrayidx9, 4) AliasSet[0x1b91590, 1] must alias, Mod/Ref Pointers: (float* %x, 4) AliasSet[0x1b91690, 1] must alias, Mod/Ref Pointers: (float* %y, 4) AliasSet[0x1b91790, 1] must alias, Mod/Ref Pointers: (float* %z, 4) AliasSet[0x1b91850, 1] must alias, Mod/Ref Pointers: (float* %res, 4) ===-...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...%i.015 = phi i64 [ %inc, %for.body ], [ %start, %entry ] %div = lshr i64 %i.015, 2 %mul = shl i64 %div, 3 %rem = and i64 %i.015, 3 %add2 = or i64 %mul, %rem %add8 = or i64 %add2, 4 %arrayidx = getelementptr inbounds float* %a, i64 %add2 %0 = load float* %arrayidx, align 4 %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 %1 = load float* %arrayidx9, align 4 %add10 = fadd float %0, %1 %arrayidx11 = getelementptr inbounds float* %c, i64 %add2 store float %add10, float* %arrayidx11, align 4 %arrayidx12 = getelementptr inbounds float* %a, i64 %add8 %2 = lo...
2013 Nov 12
0
[LLVMdev] What's the Alias Analysis does clang use ?
...> AliasSet[0x1b912e0, 1] must alias, Mod/Ref Pointers: (float** > %t.addr, 8) > AliasSet[0x1b913a0, 1] must alias, Mod/Ref Pointers: (i32* %i, 4) > AliasSet[0x1b91510, 4] may alias, Mod/Ref Pointers: (float* > %arrayidx, 4), (float* %arrayidx1, 4), (float* %arrayidx2, 4), > (float* %arrayidx9, 4) > AliasSet[0x1b91590, 1] must alias, Mod/Ref Pointers: (float* %x, 4) > AliasSet[0x1b91690, 1] must alias, Mod/Ref Pointers: (float* %y, 4) > AliasSet[0x1b91790, 1] must alias, Mod/Ref Pointers: (float* %z, 4) > AliasSet[0x1b91850, 1] must alias, Mod/Ref Pointers: (float* %res, 4) ...
2013 Oct 30
2
[LLVMdev] loop vectorizer
...c, %for.body ], [ %start, %entry ] > %div = lshr i64 %i.015, 2 > %mul = shl i64 %div, 3 > %rem = and i64 %i.015, 3 > %add2 = or i64 %mul, %rem > %add8 = or i64 %add2, 4 > %arrayidx = getelementptr inbounds float* %a, i64 %add2 > %0 = load float* %arrayidx, align 4 > %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 > %1 = load float* %arrayidx9, align 4 > %add10 = fadd float %0, %1 > %arrayidx11 = getelementptr inbounds float* %c, i64 %add2 > store float %add10, float* %arrayidx11, align 4 > %arrayidx12 = getelementptr inbounds float* %a, i64 %...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...hl i64 %div, 3 %rem = and i64 %storemerge10, 3 %add2 = or i64 %mul1, %rem %0 = lshr i64 %storemerge10, 1 %add51 = shl i64 %0, 2 %mul6 = or i64 %rem, %add51 %add8 = or i64 %mul6, 4 %arrayidx = getelementptr inbounds float* %a, i64 %add2 %1 = load float* %arrayidx, align 4 %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 %2 = load float* %arrayidx9, align 4 %add10 = fadd float %1, %2 %arrayidx11 = getelementptr inbounds float* %c, i64 %add2 store float %add10, float* %arrayidx11, align 4 %arrayidx12 = getelementptr inbounds float* %a, i64 %add8 %3 = lo...
2013 Oct 30
3
[LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote: > The access pattern to arrays a and b is non-linear. Unrolled loops are > usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all > values for i ? > Based on his list of values, it seems that the induction stride is linear within each block of 4 iterations, but it's not a clear
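The IR fragments quoted in this thread can be read back into C. Below is a hypothetical reconstruction of the rejected loop (the names `foo`, `a`, `b`, `c` and the second, `+4`-offset statement are assumptions inferred from `%add2` and `%add8`; the excerpt is truncated, so the tail of the body is a guess):

```c
#include <stddef.h>

/* Hypothetical reconstruction of the loop the vectorizer rejects:
 * the index is linear within each block of 4 iterations but jumps
 * by 8 between blocks, so the accesses are non-consecutive. */
void foo(float *a, float *b, float *c, long start, long end) {
    for (long i = start; i < end; ++i) {
        long idx  = ((i >> 2) << 3) | (i & 3);  /* %add2: (i/4)*8 + i%4 */
        long idx2 = idx | 4;                    /* %add8: second half of the block */
        c[idx]  = a[idx]  + b[idx];             /* %add10, store via %arrayidx11 */
        c[idx2] = a[idx2] + b[idx2];            /* assumed: excerpt cuts off at %arrayidx12 */
    }
}
```

Read this way, the point about non-linearity is concrete: `idx` runs 0, 1, 2, 3, 8, 9, 10, 11, ..., so neighbouring iterations only touch consecutive addresses inside a 4-wide block.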
2013 Oct 30
2
[LLVMdev] loop vectorizer
...%i.015 = phi i64 [ %inc, %for.body ], [ %start, %entry ] %div = lshr i64 %i.015, 2 %mul = shl i64 %div, 3 %rem = and i64 %i.015, 3 %add2 = or i64 %mul, %rem %add8 = or i64 %add2, 4 %arrayidx = getelementptr inbounds float* %a, i64 %add2 %0 = load float* %arrayidx, align 4 %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 %1 = load float* %arrayidx9, align 4 %add10 = fadd float %0, %1 %arrayidx11 = getelementptr inbounds float* %c, i64 %add2 store float %add10, float* %arrayidx11, align 4 %arrayidx12 = getelementptr inbounds float* %a, i64 %add8 %2 = lo...
2013 Nov 08
1
[LLVMdev] loop vectorizer and storing to uniform addresses
...= %for.cond1 br label %for.inc6 for.inc6: ; preds = %for.end %12 = load i64* %i, align 8 %inc7 = add nsw i64 %12, 1 store i64 %inc7, i64* %i, align 8 br label %for.cond for.end8: ; preds = %for.cond %arrayidx9 = getelementptr inbounds [4 x float]* %sum, i32 0, i64 0 %13 = load float* %arrayidx9, align 4 %arrayidx10 = getelementptr inbounds [4 x float]* %sum, i32 0, i64 1 %14 = load float* %arrayidx10, align 4 %add11 = fadd float %13, %14 %arrayidx12 = getelementptr inbounds [4 x float]* %s...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...> %div = lshr i64 %i.015, 2 >> %mul = shl i64 %div, 3 >> %rem = and i64 %i.015, 3 >> %add2 = or i64 %mul, %rem >> %add8 = or i64 %add2, 4 >> %arrayidx = getelementptr inbounds float* %a, i64 %add2 >> %0 = load float* %arrayidx, align 4 >> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 >> %1 = load float* %arrayidx9, align 4 >> %add10 = fadd float %0, %1 >> %arrayidx11 = getelementptr inbounds float* %c, i64 %add2 >> store float %add10, float* %arrayidx11, align 4 >> %arrayidx12 = getelementptr...
2013 Oct 30
0
[LLVMdev] loop vectorizer
...c, %for.body ], [ %start, %entry ] > %div = lshr i64 %i.015, 2 > %mul = shl i64 %div, 3 > %rem = and i64 %i.015, 3 > %add2 = or i64 %mul, %rem > %add8 = or i64 %add2, 4 > %arrayidx = getelementptr inbounds float* %a, i64 %add2 > %0 = load float* %arrayidx, align 4 > %arrayidx9 = getelementptr inbounds float* %b, i64 %add2 > %1 = load float* %arrayidx9, align 4 > %add10 = fadd float %0, %1 > %arrayidx11 = getelementptr inbounds float* %c, i64 %add2 > store float %add10, float* %arrayidx11, align 4 > %arrayidx12 = getelementptr inbounds float* %a, i64 %...
2013 Nov 08
0
[LLVMdev] loop vectorizer and storing to uniform addresses
On 7 November 2013 17:18, Frank Winter <fwinter at jlab.org> wrote: > LV: We don't allow storing to uniform addresses > This is triggering because sum[q] wasn't recognized as a reduction variable during canVectorizeInstrs(), but was recognized as loop invariant in canVectorizeMemory(). I'm guessing the nested loop was unrolled because of the low trip-count, and
2016 Apr 08
2
LIBCLC with LLVM 3.9 Trunk
It's not clear what is actually wrong from your original message, I think you need to give some more information as to what you are doing: Example source, what target GPU, compiler error messages or other evidence of "it's wrong" (llvm IR, disassembly, etc) ... -- Mats On 8 April 2016 at 09:55, Liu Xin via llvm-dev <llvm-dev at lists.llvm.org> wrote: > I built it
2014 Sep 29
2
[LLVMdev] Alias Analysis across functions
Hi, I am trying to get the alias info for the following code. The alias analysis returns "MayAlias" for arrays "A" and "B" in both the functions instead of "NoAlias". What passes should I run in opt before the alias analysis pass to get the accurate result? Example: //Note: static and called by func() only. static int sum(int *A, int *B) { int i = 0,
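For context on this question: when a function is analyzed in isolation, BasicAA cannot see where pointer arguments come from, so two parameters conservatively MayAlias. One way to get NoAlias (a sketch, not necessarily the answer given on the list) is to promise non-aliasing yourself with C's `restrict`, which clang lowers to the `noalias` attribute on the IR arguments:

```c
/* restrict is a promise from the programmer: A and B never alias.
 * clang emits "noalias" on the IR arguments, so alias analysis can
 * return NoAlias instead of the conservative MayAlias. */
static int sum(int * restrict A, int * restrict B) {
    int total = 0;
    for (int i = 0; i < 4; ++i)
        total += A[i] + B[i];
    return total;
}
```

The alternative is interprocedural reasoning, e.g. letting opt inline the static callee so the analysis sees the actual arguments at the call site.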
2013 Nov 08
3
[LLVMdev] loop vectorizer and storing to uniform addresses
I am trying my luck on this global reduction kernel: float foo( int start , int end , float * A ) { float sum[4] = {0.,0.,0.,0.}; for (int i = start ; i < end ; ++i ) { for (int q = 0 ; q < 4 ; ++q ) sum[q] += A[i*4+q]; } return sum[0]+sum[1]+sum[2]+sum[3]; } LV: Checking a loop in "foo" LV: Found a loop: for.cond1 LV: Found an induction variable. LV: We
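A common workaround for the "storing to uniform addresses" rejection (a sketch, not necessarily what this thread settled on) is to keep the four partial sums in scalar variables, so no store inside the loop hits a loop-invariant address:

```c
/* Same reduction as the foo() kernel above, but the four
 * accumulators live in scalars instead of sum[4], so the inner
 * loop never stores through a uniform (loop-invariant) address. */
float foo_scalar(int start, int end, const float *A) {
    float s0 = 0.f, s1 = 0.f, s2 = 0.f, s3 = 0.f;
    for (int i = start; i < end; ++i) {
        s0 += A[i * 4 + 0];
        s1 += A[i * 4 + 1];
        s2 += A[i * 4 + 2];
        s3 += A[i * 4 + 3];
    }
    return s0 + s1 + s2 + s3;
}
```

With the array gone, each accumulator is an ordinary reduction variable the loop vectorizer can recognize.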
2015 May 21
2
[LLVMdev] How can I remove these redundant copy between registers?
Hi, I've been working on a Blackfin backend (llvm-3.6.0) based on the previous one that was removed in llvm-3.1. llc generates code like this: 29 p1 = r2; 30 r5 = [p1]; 31 p1 = r2; 32 r6 = [p1 + 4]; 33 r5 = r6 + r5; 34 r6 = [p0 + -4]; 35 r5 *= r6; 36 p1 = r2; 37 r6 = [p1 + 8]; 38 p1 = r2; p1 and r2 are in different register classes. A p*