Displaying 16 results from an estimated 16 matches for "arrayidx9".
2013 Oct 30 · 3 · [LLVMdev] loop vectorizer
...> %rem = and i64 %storemerge10, 3
> %add2 = or i64 %mul1, %rem
> %0 = lshr i64 %storemerge10, 1
> %add51 = shl i64 %0, 2
> %mul6 = or i64 %rem, %add51
> %add8 = or i64 %mul6, 4
> %arrayidx = getelementptr inbounds float* %a, i64 %add2
> %1 = load float* %arrayidx, align 4
> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
> %2 = load float* %arrayidx9, align 4
> %add10 = fadd float %1, %2
> %arrayidx11 = getelementptr inbounds float* %c, i64 %add2
> store float %add10, float* %arrayidx11, align 4
> %arrayidx12 = getelementptr inbounds float* %a, i64 %add8...
2013 Nov 11 · 2 · [LLVMdev] What's the Alias Analysis does clang use ?
...** %v2.addr, 8)
AliasSet[0x1b912e0, 1] must alias, Mod/Ref Pointers: (float** %t.addr, 8)
AliasSet[0x1b913a0, 1] must alias, Mod/Ref Pointers: (i32* %i, 4)
AliasSet[0x1b91510, 4] may alias, Mod/Ref Pointers: (float* %arrayidx, 4), (float* %arrayidx1, 4), (float* %arrayidx2, 4), (float* %arrayidx9, 4)
AliasSet[0x1b91590, 1] must alias, Mod/Ref Pointers: (float* %x, 4)
AliasSet[0x1b91690, 1] must alias, Mod/Ref Pointers: (float* %y, 4)
AliasSet[0x1b91790, 1] must alias, Mod/Ref Pointers: (float* %z, 4)
AliasSet[0x1b91850, 1] must alias, Mod/Ref Pointers: (float* %res, 4)
===-...
2013 Oct 30 · 0 · [LLVMdev] loop vectorizer
...%i.015 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
%div = lshr i64 %i.015, 2
%mul = shl i64 %div, 3
%rem = and i64 %i.015, 3
%add2 = or i64 %mul, %rem
%add8 = or i64 %add2, 4
%arrayidx = getelementptr inbounds float* %a, i64 %add2
%0 = load float* %arrayidx, align 4
%arrayidx9 = getelementptr inbounds float* %b, i64 %add2
%1 = load float* %arrayidx9, align 4
%add10 = fadd float %0, %1
%arrayidx11 = getelementptr inbounds float* %c, i64 %add2
store float %add10, float* %arrayidx11, align 4
%arrayidx12 = getelementptr inbounds float* %a, i64 %add8
%2 = lo...
2013 Nov 12 · 0 · [LLVMdev] What's the Alias Analysis does clang use ?
...> AliasSet[0x1b912e0, 1] must alias, Mod/Ref Pointers: (float** %t.addr, 8)
> AliasSet[0x1b913a0, 1] must alias, Mod/Ref Pointers: (i32* %i, 4)
> AliasSet[0x1b91510, 4] may alias, Mod/Ref Pointers: (float* %arrayidx, 4), (float* %arrayidx1, 4), (float* %arrayidx2, 4), (float* %arrayidx9, 4)
> AliasSet[0x1b91590, 1] must alias, Mod/Ref Pointers: (float* %x, 4)
> AliasSet[0x1b91690, 1] must alias, Mod/Ref Pointers: (float* %y, 4)
> AliasSet[0x1b91790, 1] must alias, Mod/Ref Pointers: (float* %z, 4)
> AliasSet[0x1b91850, 1] must alias, Mod/Ref Pointers: (float* %res, 4)
> ...
2013 Oct 30 · 2 · [LLVMdev] loop vectorizer
...c, %for.body ], [ %start, %entry ]
> %div = lshr i64 %i.015, 2
> %mul = shl i64 %div, 3
> %rem = and i64 %i.015, 3
> %add2 = or i64 %mul, %rem
> %add8 = or i64 %add2, 4
> %arrayidx = getelementptr inbounds float* %a, i64 %add2
> %0 = load float* %arrayidx, align 4
> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
> %1 = load float* %arrayidx9, align 4
> %add10 = fadd float %0, %1
> %arrayidx11 = getelementptr inbounds float* %c, i64 %add2
> store float %add10, float* %arrayidx11, align 4
> %arrayidx12 = getelementptr inbounds float* %a, i64 %...
2013 Oct 30 · 0 · [LLVMdev] loop vectorizer
...hl i64 %div, 3
%rem = and i64 %storemerge10, 3
%add2 = or i64 %mul1, %rem
%0 = lshr i64 %storemerge10, 1
%add51 = shl i64 %0, 2
%mul6 = or i64 %rem, %add51
%add8 = or i64 %mul6, 4
%arrayidx = getelementptr inbounds float* %a, i64 %add2
%1 = load float* %arrayidx, align 4
%arrayidx9 = getelementptr inbounds float* %b, i64 %add2
%2 = load float* %arrayidx9, align 4
%add10 = fadd float %1, %2
%arrayidx11 = getelementptr inbounds float* %c, i64 %add2
store float %add10, float* %arrayidx11, align 4
%arrayidx12 = getelementptr inbounds float* %a, i64 %add8
%3 = lo...
2013 Oct 30 · 3 · [LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote:
> The access pattern to arrays a and b is non-linear. Unrolled loops are
> usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all
> values of i?
>
Based on his list of values, it seems that the induction stride is linear
within each block of 4 iterations, but it's not a clear...
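For reference, a C loop of roughly the following shape reproduces the index arithmetic in the quoted IR. This is a hedged reconstruction, not the poster's original source: the function name add_blocks is invented, and the c[ir1] line is assumed because the snippets are cut off before the second store.

/* Hedged reconstruction.  In the IR, %add2 = ((i >> 2) << 3) | (i & 3),
 * i.e. ir0 = (i / 4) * 8 + (i % 4), and %add8 = %add2 | 4, i.e.
 * ir1 = ir0 + 4.  Indices are consecutive within each block of 4
 * iterations, then jump ahead by 8, so the overall pattern is
 * non-linear, which is what the loop vectorizer objects to. */
void add_blocks(float *a, float *b, float *c, long start, long end) {
    for (long i = start; i < end; ++i) {
        long ir0 = (i / 4) * 8 + (i % 4); /* %add2 */
        long ir1 = ir0 + 4;               /* %add8 */
        c[ir0] = a[ir0] + b[ir0];         /* store via %arrayidx11 */
        c[ir1] = a[ir1] + b[ir1];         /* assumed; snippet truncated */
    }
}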
2013 Oct 30 · 2 · [LLVMdev] loop vectorizer
...%i.015 = phi i64 [ %inc, %for.body ], [ %start, %entry ]
%div = lshr i64 %i.015, 2
%mul = shl i64 %div, 3
%rem = and i64 %i.015, 3
%add2 = or i64 %mul, %rem
%add8 = or i64 %add2, 4
%arrayidx = getelementptr inbounds float* %a, i64 %add2
%0 = load float* %arrayidx, align 4
%arrayidx9 = getelementptr inbounds float* %b, i64 %add2
%1 = load float* %arrayidx9, align 4
%add10 = fadd float %0, %1
%arrayidx11 = getelementptr inbounds float* %c, i64 %add2
store float %add10, float* %arrayidx11, align 4
%arrayidx12 = getelementptr inbounds float* %a, i64 %add8
%2 = lo...
2013 Nov 08 · 1 · [LLVMdev] loop vectorizer and storing to uniform addresses
...= %for.cond1
br label %for.inc6
for.inc6: ; preds = %for.end
%12 = load i64* %i, align 8
%inc7 = add nsw i64 %12, 1
store i64 %inc7, i64* %i, align 8
br label %for.cond
for.end8: ; preds = %for.cond
%arrayidx9 = getelementptr inbounds [4 x float]* %sum, i32 0, i64 0
%13 = load float* %arrayidx9, align 4
%arrayidx10 = getelementptr inbounds [4 x float]* %sum, i32 0, i64 1
%14 = load float* %arrayidx10, align 4
%add11 = fadd float %13, %14
%arrayidx12 = getelementptr inbounds [4 x float]* %s...
2013 Oct 30 · 0 · [LLVMdev] loop vectorizer
...>> %div = lshr i64 %i.015, 2
>> %mul = shl i64 %div, 3
>> %rem = and i64 %i.015, 3
>> %add2 = or i64 %mul, %rem
>> %add8 = or i64 %add2, 4
>> %arrayidx = getelementptr inbounds float* %a, i64 %add2
>> %0 = load float* %arrayidx, align 4
>> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
>> %1 = load float* %arrayidx9, align 4
>> %add10 = fadd float %0, %1
>> %arrayidx11 = getelementptr inbounds float* %c, i64 %add2
>> store float %add10, float* %arrayidx11, align 4
>> %arrayidx12 = getelementptr...
2013 Oct 30 · 0 · [LLVMdev] loop vectorizer
...c, %for.body ], [ %start, %entry ]
> %div = lshr i64 %i.015, 2
> %mul = shl i64 %div, 3
> %rem = and i64 %i.015, 3
> %add2 = or i64 %mul, %rem
> %add8 = or i64 %add2, 4
> %arrayidx = getelementptr inbounds float* %a, i64 %add2
> %0 = load float* %arrayidx, align 4
> %arrayidx9 = getelementptr inbounds float* %b, i64 %add2
> %1 = load float* %arrayidx9, align 4
> %add10 = fadd float %0, %1
> %arrayidx11 = getelementptr inbounds float* %c, i64 %add2
> store float %add10, float* %arrayidx11, align 4
> %arrayidx12 = getelementptr inbounds float* %a, i64 %...
2013 Nov 08 · 0 · [LLVMdev] loop vectorizer and storing to uniform addresses
On 7 November 2013 17:18, Frank Winter <fwinter at jlab.org> wrote:
> LV: We don't allow storing to uniform addresses
>
This is triggering because sum[q] wasn't recognized as a reduction variable
during canVectorizeInstrs(), but it was recognized as loop invariant in
canVectorizeMemory().
I'm guessing the nested loop was unrolled because of the low trip-count,
and...
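To make the diagnostic concrete, here is a hedged illustration (not code from the thread) of what the vectorizer plausibly sees once that low-trip-count inner loop is unrolled: every sum[q] store targets an address that does not depend on i, i.e. a uniform address.

/* Hedged illustration: after the q-loop (trip count 4) is unrolled,
 * the i-loop contains four stores whose addresses &sum[0]..&sum[3]
 * are loop-invariant, tripping "storing to uniform addresses". */
void unrolled_view(float sum[4], const float *A, int start, int end) {
    for (int i = start; i < end; ++i) {
        sum[0] += A[i * 4 + 0];
        sum[1] += A[i * 4 + 1];
        sum[2] += A[i * 4 + 2];
        sum[3] += A[i * 4 + 3];
    }
}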
2016 Apr 08 · 2 · LIBCLC with LLVM 3.9 Trunk
It's not clear what is actually wrong from your original message; I think
you need to give some more information about what you are doing: example
source, target GPU, compiler error messages, or other evidence of "it's
wrong" (LLVM IR, disassembly, etc.) ...
--
Mats
On 8 April 2016 at 09:55, Liu Xin via llvm-dev <llvm-dev at lists.llvm.org>
wrote:
> I built it
2014 Sep 29 · 2 · [LLVMdev] Alias Analysis across functions
Hi,
I am trying to get the alias info for the following code. The alias analysis returns "MayAlias" for arrays "A" and "B" in both functions instead of "NoAlias". What passes should I run in opt before the alias analysis pass to get an accurate result?
Example:
//Note: static and called by func() only.
static int sum(int *A, int *B) {
int i = 0,
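A self-contained version of the example might look like the sketch below; the loop body and func() are invented here to match the description in the question. One plausible answer, offered as an assumption rather than a confirmed fix: run the inliner (opt -inline) ahead of the analysis. A query inside the un-inlined sum() cannot know that every caller passes two distinct arrays, but after inlining, BasicAA sees two distinct local objects and can report NoAlias.

/* Hedged reconstruction.  sum() is static and called only by func(),
 * so once it is inlined, A and B resolve to two distinct local
 * arrays, which a basic alias analysis can prove do not alias. */
static int sum(int *A, int *B) {
    int s = 0;
    for (int i = 0; i < 4; ++i)
        s += A[i] + B[i];
    return s;
}

int func(void) {
    int A[4] = {1, 2, 3, 4};
    int B[4] = {5, 6, 7, 8}; /* distinct stack objects: NoAlias expected */
    return sum(A, B);
}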
2013 Nov 08 · 3 · [LLVMdev] loop vectorizer and storing to uniform addresses
I am trying my luck on this global reduction kernel:
float foo( int start , int end , float * A )
{
  float sum[4] = {0.,0.,0.,0.};
  for (int i = start ; i < end ; ++i ) {
    for (int q = 0 ; q < 4 ; ++q )
      sum[q] += A[i*4+q];
  }
  return sum[0]+sum[1]+sum[2]+sum[3];
}
LV: Checking a loop in "foo"
LV: Found a loop: for.cond1
LV: Found an induction variable.
LV: We...
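For comparison, a workaround sketch (an assumption, not something proposed in the thread): replacing the sum[4] array with four scalar accumulators turns the stores to uniform addresses into ordinary scalar reductions, the pattern the loop vectorizer's reduction recognition handles.

/* Hedged workaround sketch: four scalar reductions instead of
 * accumulating into a stack array through loop-invariant addresses. */
float foo_scalar(int start, int end, float *A) {
    float s0 = 0.f, s1 = 0.f, s2 = 0.f, s3 = 0.f;
    for (int i = start; i < end; ++i) {
        s0 += A[i * 4 + 0];
        s1 += A[i * 4 + 1];
        s2 += A[i * 4 + 2];
        s3 += A[i * 4 + 3];
    }
    return s0 + s1 + s2 + s3;
}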
2015 May 21 · 2 · [LLVMdev] How can I remove these redundant copy between registers?
Hi,
I've been working on a Blackfin backend (llvm-3.6.0) based on the previous
one that was removed in llvm-3.1.
llc generates code like this:
p1 = r2;
r5 = [p1];
p1 = r2;
r6 = [p1 + 4];
r5 = r6 + r5;
r6 = [p0 + -4];
r5 *= r6;
p1 = r2;
r6 = [p1 + 8];
p1 = r2;
p1 and r2 are in different register classes.
A p*