thr3ads.net - llvm dev - [llvm-dev] Why LLVM generate two induction variables in IR level for the same induction variable in C source code [Jul 2016]

Dong Chen via llvm-dev

2016-Jul-13 18:19 UTC

[llvm-dev] Why LLVM generate two induction variables in IR level for the same induction variable in C source code

The C source code:

__kernel void adi_kernel1(__global DATA_TYPE* A, __global DATA_TYPE* B,
                          __global DATA_TYPE* X, int n)
{
  int i1 = get_global_id(0);
  int i2;

  if ((i1 < n))
  {
    for (i2 = 1; i2 < n; i2++)
    {
      X[i1*n + i2] = X[i1*n + i2] - X[i1*n + (i2-1)] * A[i1*n + i2] / B[i1*n + (i2-1)];
      B[i1*n + i2] = B[i1*n + i2] - A[i1*n + i2] * A[i1*n + i2] / B[i1*n + (i2-1)];
    }
  }
}

The source has only one induction variable (i2), but in the IR both i2.087
and indvars.iv appear to be induction variables (I think). Does anybody know
why this happens? Is there a way to generate only one induction variable?

; Function Attrs: nounwind ssp uwtable
define void @adi_kernel1(float* nocapture readonly %A, float* nocapture %B, float* nocapture %X, i32 %n) #0 {
entry:
  %call = tail call i32 (i32, ...) bitcast (i32 (...)* @get_global_id to i32 (i32, ...)*)(i32 0) #2
  %cmp = icmp slt i32 %call, %n
  %cmp186 = icmp sgt i32 %n, 1
  %or.cond = and i1 %cmp, %cmp186
  br i1 %or.cond, label %for.body.lr.ph, label %if.end

for.body.lr.ph:                                   ; preds = %entry
  %mul = mul nsw i32 %call, %n
  %sub = add i32 %mul, -1
  br label %for.body

for.body:                                         ; preds = %for.body, %for.body.lr.ph
  %indvars.iv = phi i64 [ 1, %for.body.lr.ph ], [ %indvars.iv.next, %for.body ]
  %i2.087 = phi i32 [ 1, %for.body.lr.ph ], [ %inc, %for.body ]
  %add = add nsw i32 %i2.087, %mul
  %idxprom = sext i32 %add to i64
  %arrayidx = getelementptr inbounds float, float* %X, i64 %idxprom
  %0 = load float, float* %arrayidx, align 4, !tbaa !8
  %1 = trunc i64 %indvars.iv to i32
  %add3 = add i32 %sub, %1
  %idxprom4 = sext i32 %add3 to i64
  %arrayidx5 = getelementptr inbounds float, float* %X, i64 %idxprom4
  %2 = load float, float* %arrayidx5, align 4, !tbaa !8
  %arrayidx9 = getelementptr inbounds float, float* %A, i64 %idxprom
  %3 = load float, float* %arrayidx9, align 4, !tbaa !8
  %mul10 = fmul float %2, %3
  %arrayidx15 = getelementptr inbounds float, float* %B, i64 %idxprom4
  %4 = load float, float* %arrayidx15, align 4, !tbaa !8
  %div = fdiv float %mul10, %4, !fpmath !12
  %sub16 = fsub float %0, %div
  store float %sub16, float* %arrayidx, align 4, !tbaa !8
  %arrayidx24 = getelementptr inbounds float, float* %B, i64 %idxprom
  %5 = load float, float* %arrayidx24, align 4, !tbaa !8
  %6 = load float, float* %arrayidx9, align 4, !tbaa !8
  %mul33 = fmul float %6, %6
  %7 = load float, float* %arrayidx15, align 4, !tbaa !8
  %div39 = fdiv float %mul33, %7, !fpmath !12
  %sub40 = fsub float %5, %div39
  store float %sub40, float* %arrayidx24, align 4, !tbaa !8
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %inc = add nuw nsw i32 %i2.087, 1
  %lftr.wideiv = trunc i64 %indvars.iv.next to i32
  %exitcond = icmp eq i32 %lftr.wideiv, %n
  br i1 %exitcond, label %if.end.loopexit, label %for.body

if.end.loopexit:                                  ; preds = %for.body
  br label %if.end

if.end:                                           ; preds = %if.end.loopexit, %entry
  ret void
}
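
To make the two counters in the IR above easier to compare, here is a rough C rendering of the optimized loop. It is only a sketch: the function name count_iv_mismatches and the local names wide, narrow, idx_cur and idx_prev are hypothetical, and the float loads and stores are left out so that only the index arithmetic remains. It assumes inputs small enough that i1*n does not overflow a 32-bit int. The sketch shows that %indvars.iv (64-bit) and %i2.087 (32-bit) step in lockstep, with the narrow counter feeding the i1*n + i2 index and the wide counter feeding both the i1*n + (i2-1) index and the exit test. It does not explain which pass introduced the second counter.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical C rendering of the index arithmetic in the IR loop.
 * `wide` plays the role of %indvars.iv (i64), `narrow` the role of
 * %i2.087 (i32). Returns the number of checks on which the two
 * counters, or the indices derived from them, disagree.
 */
int count_iv_mismatches(int i1, int n) {
    /* Same guard as %or.cond: the loop runs only if i1 < n and n > 1. */
    if (!(i1 < n && n > 1))
        return 0;

    int mul = i1 * n;      /* %mul */
    int sub = mul - 1;     /* %sub */
    int64_t wide = 1;      /* %indvars.iv */
    int narrow = 1;        /* %i2.087 */
    int mismatches = 0;

    do {
        /* Loop invariant: the wide counter has not yet reached n. */
        assert(wide < n);

        int64_t idx_cur = (int64_t)(narrow + mul);      /* %idxprom:  i1*n + i2     */
        int64_t idx_prev = (int64_t)(sub + (int)wide);  /* %idxprom4: i1*n + (i2-1) */

        if ((int64_t)narrow != wide)
            mismatches++;
        if (idx_cur != idx_prev + 1)
            mismatches++;

        wide += 1;      /* %indvars.iv.next */
        narrow += 1;    /* %inc */
    } while ((int)wide != n);  /* %exitcond, tested on the truncated wide counter */

    return mismatches;
}
```

For small inputs such as count_iv_mismatches(2, 8), the function reports no mismatches, which is consistent with reading the two phi nodes as two views of the same source variable i2.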

Thanks,

Dong

John Criswell via llvm-dev

2016-Jul-13 18:37 UTC

Dear Dong,

First, how did you generate the LLVM IR?  Did you run any optimizations 
on the code?

Second, what is it that you are trying to do, and is having more than 
one induction variable a problem?  I would think that, in general, a 
loop can have multiple induction variables, so you will probably need to 
handle that issue no matter what you are doing.

Third, to see what generates this code, you should first use clang -S 
-emit-llvm to generate an LLVM assembly file.  Look at the assembly file 
and see if it has the two induction variables.  If it does, then it's 
clang that is generating the code that way.  If it doesn't, then some 
optimization is responsible.

If it's an optimization, you can narrow it down as follows: write a pass 
that crashes (e.g., use an assert() statement) if it detects a loop with 
two induction variables.  Then, use bugpoint to run your pass.  The 
bugpoint program will reduce the crash to the minimum number of passes 
and minimum code that triggers the crash.  In this case, that would be 
equivalent to finding the pass that generates the second induction variable.
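
The check described above would normally live in a real LLVM pass so that bugpoint can drive it. As a rough stand-in that only illustrates the logic, the C sketch below scans textual IR and counts phi lines with an integer-looking type in one basic block, then asserts that there is at most one. Everything here is an assumption for illustration: the names count_int_phis and require_single_iv are hypothetical, the IR is assumed to have one instruction per line with instructions indented and labels unindented, and a phi is not necessarily an induction variable. The pattern also matches pointer types that start with "i", such as i8*.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count lines containing "= phi i" inside the basic block named `label`
 * in textual LLVM IR. Returns -1 if the block label is not found.
 * This is a text heuristic, not an induction-variable analysis.
 */
int count_int_phis(const char *ir, const char *label) {
    static const char pattern[] = "= phi i";
    const size_t pattern_len = sizeof(pattern) - 1;
    const size_t label_len = strlen(label);
    const char *line = ir;
    int in_block = 0, found = 0, count = 0;

    while (*line != '\0') {
        const char *newline = strchr(line, '\n');
        size_t len = newline ? (size_t)(newline - line) : strlen(line);

        if (len > 0 && line[0] != ' ' && line[0] != '\t') {
            /* Unindented line: a block label, `define`, `}` or a comment. */
            in_block = len > label_len &&
                       strncmp(line, label, label_len) == 0 &&
                       line[label_len] == ':';
            if (in_block)
                found = 1;
        } else if (in_block) {
            /* Bounded search so the match cannot run past this line. */
            for (size_t k = 0; k + pattern_len <= len; k++) {
                if (memcmp(line + k, pattern, pattern_len) == 0) {
                    count++;
                    break;
                }
            }
        }
        line = newline ? newline + 1 : line + len;
    }
    return found ? count : -1;
}

/* Abort, in the spirit of the assert() suggestion, if the block has more
 * than one integer phi. */
void require_single_iv(const char *ir, const char *label) {
    assert(count_int_phis(ir, label) <= 1);
}
```

Calling require_single_iv on the for.body block of the IR from the original message would trip the assertion, since that block has two integer phi nodes. A real pass would use LLVM's loop analyses instead of string matching.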

Fourth, in the future, please attach plain text files instead of image 
files.  I can download a plain text file, search through it, assemble it 
with the LLVM assembler, etc.  Image files do not allow for that.  Also, 
images tend to take more space than simple text files.

Regards,

John Criswell



-- 
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell
