search for: idxprom

Displaying 20 results from an estimated 64 matches for "idxprom".

2019 Nov 10
2
Reassociation is blocking a vectorization
Hi Devs, I am looking at the bug https://bugs.llvm.org/show_bug.cgi?id=43953 and found that the following piece of IR %arrayidx = getelementptr inbounds float, float* %Vec0, i64 %idxprom %0 = load float, float* %arrayidx, align 4, !tbaa !2 %arrayidx2 = getelementptr inbounds float, float* %Vec1, i64 %idxprom %1 = load float, float* %arrayidx2, align 4, !tbaa !2 %sub = fsub fast float %0, %1 %add = fadd fast float %sum.0, %sub is transformed into %arrayidx = getelementpt...
2014 Feb 19
2
[LLVMdev] better code for IV
...ry L_entry: ; preds = %L_entry, %L_pre_head %L_ind_var = phi i64 [ 0, %L_pre_head ], [ %L_inc_ind_var, %L_entry ] %L_tid = phi i64 [ 0, %L_pre_head ], [ %L_inc_tid, %L_entry ] %trunc = trunc i64 %L_tid to i32 %idxprom = sext i32 %trunc to i64 %arrayidx = getelementptr inbounds float* %a, i64 %idxprom %0 = load float* %arrayidx, align 4 %arrayidx2 = getelementptr inbounds float* %b, i64 %idxprom %1 = load float* %arrayidx2, align 4 %add = fadd float %0, %1...
2017 Aug 08
2
[ScalarEvolution][SCEV] no-wrap flags dependent on order of getSCEV() calls
...n discussed before, but I just wanted to check if this is indeed the current expected behavior, and if anyone has any plans/ideas for addressing this issue. For reference, below is a reduced loop where this problem occurs. The SCEV for %i.07.i will have <nuw> or not depending on whether %idxprom.i was computed before it: for.body.i: %i.07.i = phi i32 [ %inc.i, %for.body.i ], [ 0, %for.body.i.preheader ] %idxprom.i = zext i32 %i.07.i to i64 %arrayidx.i = getelementptr inbounds i32, i32* %rot, i64 %idxprom.i %0 = load i32, i32* %arrayidx.i %inc.i = add i32 %i.07.i, 1 %cmp...
2017 Aug 08
2
[ScalarEvolution][SCEV] no-wrap flags dependent on order of getSCEV() calls
...ressing this issue. > > The general issue that SCEV nsw is weird is known... see, for example https://bugs.llvm.org/show_bug.cgi?id=23527. > >> For reference, below is a reduced loop where this problem occurs. The SCEV for %i.07.i will have <nuw> or not depending on whether %idxprom.i was computed before it: > > %idxprom.i, the zext? I'm not sure how you're getting that particular effect. ScalarEvolution::getSCEV for a zext immediately calls getSCEV on its operand. Here is an abridged record of the getSCEV results as seen by each pass with/without preserving...
2015 Aug 22
3
loop unrolling introduces conditional branch
...c, %entry %0 = load i32, i32* %i, align 4 %1 = load i32, i32* %n.addr, align 4 %cmp = icmp slt i32 %0, %1 br i1 %cmp, label %for.body, label %for.end for.body: ; preds = %for.cond %2 = load i32, i32* %i, align 4 %3 = load i32, i32* %i, align 4 %idxprom = sext i32 %3 to i64 %4 = load i32*, i32** %array_x.addr, align 8 %arrayidx = getelementptr inbounds i32, i32* %4, i64 %idxprom store i32 %2, i32* %arrayidx, align 4 br label %for.inc for.inc: ; preds = %for.body %5 = load i32, i32* %i, align 4...
2015 Aug 22
2
loop unrolling introduces conditional branch
...gn 4 > %1 = load i32, i32* %n.addr, align 4 > %cmp = icmp slt i32 %0, %1 > br i1 %cmp, label %for.body, label %for.end > > for.body: ; preds = %for.cond > %2 = load i32, i32* %i, align 4 > %3 = load i32, i32* %i, align 4 > %idxprom = sext i32 %3 to i64 > %4 = load i32*, i32** %array_x.addr, align 8 > %arrayidx = getelementptr inbounds i32, i32* %4, i64 %idxprom > store i32 %2, i32* %arrayidx, align 4 > br label %for.inc > > for.inc: ; preds = %for.body >...
2011 Aug 19
3
[LLVMdev] Why int variable get promoted to i64
Hi all, I found that in some cases an int variable gets promoted to i64, although I think it should be i32. I used the online demo (http://llvm.org/demo). Below is the test case. ------------- test case ------------- int test(int x[], int y[], int n) { int i = 0; int sum = 0; for ( ; i < n; i++) { sum += x[i] * y[i]; } return sum; } ------------------------------------- No
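A minimal sketch of what the promotion in this question amounts to, assuming a 64-bit target where pointers are 64 bits wide. The function name `test_widened` and the local `idx` are hypothetical; the loop body is the test case above with the index widening written out by hand. In the IR snippets elsewhere in these results, this widening appears as `%idxprom = sext i32 ... to i64` feeding a `getelementptr`.

```c
#include <stdint.h>

/* Same computation as the posted test case, with the widening of the
   int index made explicit. Assumption: 64-bit pointers, so the index
   is sign-extended to 64 bits before it is used for addressing. */
int test_widened(const int x[], const int y[], int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        int64_t idx = (int64_t)i;   /* corresponds to: sext i32 %i to i64 */
        sum += x[idx] * y[idx];     /* the arithmetic itself stays 32-bit */
    }
    return sum;
}
```

Only the address computation uses the 64-bit value; `sum` and the multiplication remain `int`, which is likely why the variable itself is still `i32` in the IR.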
2013 Feb 05
3
[LLVMdev] Vectorizing global struct pointers
...and phi nodes). Does that make sense? cheers, --renato PS: A simplified version of the IR: %struct.anon = type { [256 x i64], [256 x i64], [256 x i64] } @Foo = common global %struct.anon zeroinitializer, align 8 ... %arrayidx = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 1, i32 %idxprom %0 = load i64* %arrayidx, align 8 %arrayidx2 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 2, i32 %idxprom %1 = load i64* %arrayidx2, align 8 %mul = mul nsw i64 %1, %0 %arrayidx4 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 0, i32 %idxprom store i64 %mul, i64* %arra...
2017 Dec 06
2
[AMDGPU] Strange results with different address spaces
...vergence -o - as1.ll Printing analysis 'Divergence Analysis' for function '_ZN5pacxx2v213genericKernelIZL12test_barrieriPPcE3$_0EEvT_': DIVERGENT: %6 = tail call i32 @llvm.amdgcn.workitem.id.x() #0, !range !11 DIVERGENT: %add.i.i.i.i.i = add nsw i32 %mul.i.i.i.i.i, %6 DIVERGENT: %idxprom.i.i.i = sext i32 %add.i.i.i.i.i to i64 DIVERGENT: %8 = getelementptr i32, i32 addrspace(1)* %callable.coerce0, i64 %idxprom.i.i.i DIVERGENT: %9 = load i32, i32 addrspace(1)* %8, align 4 DIVERGENT: %10 = getelementptr [16 x i32], [16 x i32] addrspace(3)* @"_ZN5pacxx2v213genericKernelIZL12tes...
2019 Feb 21
2
If there are some passes in LLVM do the opposite of the SROA(Scalar Replacement of Aggregates) pass
Hi LLVM developers, We tried to find out whether any pass in LLVM does the opposite of the SROA (Scalar Replacement of Aggregates) pass, but did not find one. Is there such a pass to bring back the structure type? Or is this done separately in other transformation passes? Thanks, Lin-Ya
2016 Feb 02
5
Particular type of loop optimization
Dear LLVMers, I am trying to implement a particular type of loop optimization, but I am having problems with global variables. To solve this problem, I would like to know if LLVM has some pass that moves loads outside loops. I will illustrate with an example. I want to transform this code below. I am writing in C for readability, but I am analysing LLVM IR: int *vectorE; void foo (int n) {
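As a rough illustration of "moving loads outside loops", here is a hand-written before/after sketch. It is not the poster's code, which is cut off above; the global `vector_g` and both function names are hypothetical. In LLVM this kind of hoisting is the job of the loop-invariant code motion (LICM) pass, and whether it fires depends on what alias analysis can prove about the stores inside the loop.

```c
#include <stddef.h>

int *vector_g;  /* hypothetical global pointer, standing in for the poster's */

/* Before: the pointer stored in vector_g is conceptually loaded from
   memory on every iteration, because it is a global. */
void fill_reload(int n) {
    for (int i = 0; i < n; i++)
        vector_g[i] = i;
}

/* After: the load of the global is hoisted by hand into a local, so the
   loop body only performs the store. */
void fill_hoisted(int n) {
    int *v = vector_g;  /* single load, outside the loop */
    for (int i = 0; i < n; i++)
        v[i] = i;
}
```

Both functions write the same values as long as nothing in the loop modifies `vector_g` itself, which is the condition a compiler has to establish before doing the transformation automatically.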
2019 Jun 30
2
Information Loss of Array Type in Function Interface in IR Generated by Clang
...define dso_local i32 @_Z1fPii(i32* nocapture readonly %A, i32 %x) local_unnamed_addr #0 !dbg !7 { entry: call void @llvm.dbg.value(metadata i32* %A, metadata !13, metadata !DIExpression()), !dbg !15 call void @llvm.dbg.value(metadata i32 %x, metadata !14, metadata !DIExpression()), !dbg !16 %idxprom = sext i32 %x to i64, !dbg !17 %arrayidx = getelementptr inbounds i32, i32* %A, i64 %idxprom, !dbg !17 %0 = load i32, i32* %arrayidx, align 4, !dbg !17, !tbaa !18 ret i32 %0, !dbg !22 } Best regards, ------------------------------------------ Tingyuan LIANG MPhil Student Department of Elect...
2017 Dec 05
2
[AMDGPU] Strange results with different address spaces
...dressing in as1.ll is incorrectly concluded to be uniform: > > %6 = tail call i32 @llvm.amdgcn.workitem.id.x() #0, !range !11 > %7 = tail call i32 @llvm.amdgcn.workgroup.id.x() #0 > %mul.i.i.i.i.i = mul nsw i32 %7, %3 > %add.i.i.i.i.i = add nsw i32 %mul.i.i.i.i.i, %6 > %idxprom.i.i.i = sext i32 %add.i.i.i.i.i to i64 > %8 = getelementptr i32, i32 addrspace(1)* %callable.coerce0, i64 %idxprom.i.i.i, !amdgpu.uniform !12, !amdgpu.noclobber !12 > > However since this depends on workitem.id.x, it certainly is not > > -Matt Actuall...
2013 Feb 05
0
[LLVMdev] Vectorizing global struct pointers
...t; > PS: > > A simplified version of the IR: > > %struct.anon = type { [256 x i64], [256 x i64], [256 x i64] } > > @Foo = common global %struct.anon zeroinitializer, align 8 > > ... > > %arrayidx = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 1, i32 %idxprom > %0 = load i64* %arrayidx, align 8 > %arrayidx2 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 2, i32 %idxprom > %1 = load i64* %arrayidx2, align 8 > %mul = mul nsw i64 %1, %0 > %arrayidx4 = getelementptr inbounds %struct.anon* @Foo, i32 0, i32 0, i32 %idxprom > st...
2017 May 19
4
memcmp code fragment
...ck[i2]; if (c1 != c2) return (c1 > c2); i1++; i2++; .. .. <repeat 12 times> In LLVM IR it will be following: define i8 @mainGtU(i32 %i1, i32 %i2, i8* readonly %block, i16* nocapture readnone %quadrant, i32 %nblock, i32* nocapture readnone %budget) local_unnamed_addr #0 { entry: %idxprom = zext i32 %i1 to i64 %arrayidx = getelementptr inbounds i8, i8* %block, i64 %idxprom %0 = load i8, i8* %arrayidx, align 1 %idxprom1 = zext i32 %i2 to i64 %arrayidx2 = getelementptr inbounds i8, i8* %block, i64 %idxprom1 %1 = load i8, i8* %arrayidx2, align 1 %cmp = icmp eq i8 %0, %1 b...
2013 Jun 25
2
[LLVMdev] SimplifyIndVar looses nsw flags
...sum += a[i]; return sum; } // *** IR Dump Before Induction Variable Simplification *** // for.body: ; preds = %entry, %for.body // %i.05 = phi i32 [ 0, %entry ], [ %inc, %for.body ] // %sum.04 = phi i32 [ 0, %entry ], [ %add, %for.body ] // %idxprom = sext i32 %i.05 to i64 // %arrayidx = getelementptr inbounds i32* %a, i64 %idxprom // %0 = load i32* %arrayidx, align 4, !tbaa !0 // %add = add nsw i32 %0, %sum.04 // %inc = add nsw i32 %i.05, 1 // %cmp = icmp slt i32 %inc, 1000 // br i1 %cmp, label %for.body, label %for.end // *** IR...
2019 Jun 30
2
Information Loss of Array Type in Function Interface in IR Generated by Clang
...define dso_local i32 @_Z1fPii(i32* nocapture readonly %A, i32 %x) local_unnamed_addr #0 !dbg !7 { entry: call void @llvm.dbg.value(metadata i32* %A, metadata !13, metadata !DIExpression()), !dbg !15 call void @llvm.dbg.value(metadata i32 %x, metadata !14, metadata !DIExpression()), !dbg !16 %idxprom = sext i32 %x to i64, !dbg !17 %arrayidx = getelementptr inbounds i32, i32* %A, i64 %idxprom, !dbg !17 %0 = load i32, i32* %arrayidx, align 4, !dbg !17, !tbaa !18 ret i32 %0, !dbg !22 } Best regards, ------------------------------------------ Tingyuan LIANG MPhil Student Department of Elect...
2013 Feb 05
1
[LLVMdev] Vectorizing global struct pointers
On 5 February 2013 17:28, Nadav Rotem <nrotem at apple.com> wrote: > We insert runtime overlap checks only for unidentified objects. The > problem here is that the vectorizer thinks that A,B,C are all pointers to > the same array, so it gives up. If A,B,C were different arrays then it > could have used runtime checks. > Yes, that is exactly the code that creates the
2015 Apr 25
3
[LLVMdev] alias analysis on llvm internal globals
...oo.init, align 1 br label %if.end if.end: ; preds = %entry.if.end_crit_edge, %if.then %0 = phi i32* [ %.pre, %entry.if.end_crit_edge ], [ getelementptr inbounds ([2048 x i32]* @foo.local_fooBuf, i64 0, i64 0), %if.then ] %div = sdiv i32 %aconst, 2 %idxprom = sext i32 %div to i64 %arrayidx = getelementptr inbounds i32, i32* %0, i64 %idxprom %cmp110 = icmp sgt i32 %aconst, 0 br i1 %cmp110, label %for.body.preheader, label %for.end, !llvm.loop !5 for.body.preheader: ; preds = %if.end br label %for.body for.body:...
2018 Jun 13
2
Question about a May-alias case
... char *a = buf[3 - idx]; char *b = buf[idx]; *a = *b; c++; *a = *b; } We can imagine the second "*a = *b" could be removed. Let's look at the IR snippet with -O3 for above example. define void @test(i32 %idx) { entry: %sub = sub nsw i32 3, %idx %idxprom = sext i32 %sub to i64 %arrayidx = getelementptr inbounds [4 x i8*], [4 x i8*]* @buf, i64 0, i64 %idxprom %0 = load i8*, i8** %arrayidx, align 8 %idxprom1 = sext i32 %idx to i64 %arrayidx2 = getelementptr inbounds [4 x i8*], [4 x i8*]* @buf, i64 0, i64 %idxprom1 ...