Ishiguro, Hiroshi via llvm-dev
2018-Feb-27  10:13 UTC
[llvm-dev] Question about instcombine pass.
Hello, everyone.
I have a question about LLVM's "Combine redundant instructions"
(instcombine) pass.
I tested the instcombine pass with the following three test cases,
but CASE3 is not optimized as I expected.
Is this behavior expected?
The LLVM version is:
  clang version 5.0.1 (tags/RELEASE_501/final 325232)
The clang command line is:
  clang -O1 a.c -S -emit-llvm -o -
Test programs:
(CASE1)
This case is optimized as I expected.
----------------------------------
#define LEN 10
int a[LEN], b[LEN], c[LEN], X[LEN];
void foo() {
  int i;
  for (i=1; i<LEN; i++) {
    a[i] += b[i] * c[i];
    a[i] -= b[i] * c[i];
  }
}
----------------------------------
IR (excerpt):
----------------------------------
; Function Attrs: norecurse nounwind uwtable
define void @foo() local_unnamed_addr #0 {
for.end:
  ret void
}
----------------------------------
(CASE2)
This case is also optimized as I expected.
----------------------------------
#define LEN 10
int a[LEN], b[LEN], c[LEN], X[LEN];
void foo() {
  int i;
  for (i=1; i<LEN; i++) {
    X[i] = X[i-1] * X[i-1];
    a[i] += b[i] * c[i];
    a[i] -= b[i] * c[i];
  }
}
----------------------------------
IR (excerpt):
----------------------------------
for.body:                                         ; preds = %for.body, %entry
  %store_forwarded = phi i32 [ %load_initial, %entry ], [ %mul, %for.body ]
  %indvars.iv = phi i64 [ 1, %entry ], [ %indvars.iv.next, %for.body ]
  %mul = mul nsw i32 %store_forwarded, %store_forwarded
  %arrayidx5 = getelementptr inbounds [10 x i32], [10 x i32]* @X, i64 0, i64 %indvars.iv
  store i32 %mul, i32* %arrayidx5, align 4, !tbaa !2
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %exitcond = icmp eq i64 %indvars.iv.next, 10
  br i1 %exitcond, label %for.end, label %for.body
----------------------------------
(CASE3)
This case is not optimized as I expected.
I expected the instructions that access array 'a' to be removed, as in
CASE2.
----------------------------------
#define LEN 10
int a[LEN], b[LEN], c[LEN], X[LEN];
void foo() {
  int i;
  for (i=1; i<LEN; i++) {
    a[i] += b[i] * c[i];
    X[i] = X[i-1] * X[i-1];
    a[i] -= b[i] * c[i];
  }
}
----------------------------------
IR (excerpt):
----------------------------------
for.body:                                         ; preds = %for.body, %entry
  %store_forwarded = phi i32 [ %load_initial, %entry ], [ %mul10, %for.body ]
  %indvars.iv = phi i64 [ 1, %entry ], [ %indvars.iv.next, %for.body ]
  %arrayidx4 = getelementptr inbounds [10 x i32], [10 x i32]* @a, i64 0, i64 %indvars.iv
  %0 = load i32, i32* %arrayidx4, align 4, !tbaa !2
  %mul10 = mul nsw i32 %store_forwarded, %store_forwarded
  %arrayidx12 = getelementptr inbounds [10 x i32], [10 x i32]* @X, i64 0, i64 %indvars.iv
  store i32 %mul10, i32* %arrayidx12, align 4, !tbaa !2
  store i32 %0, i32* %arrayidx4, align 4, !tbaa !2
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %exitcond = icmp eq i64 %indvars.iv.next, 10
  br i1 %exitcond, label %for.end, label %for.body
----------------------------------
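To make the expectation in CASE3 concrete, here is a minimal C sketch of the result I had in mind. It assumes that 'a' and 'X' are distinct global arrays, so the store to X[i] cannot change a[i], and that the arithmetic does not overflow. The names foo_case3, foo_expected, same_result and check_case3 are hypothetical helpers added only for this illustration; they are not part of the test programs above.

```c
#include <assert.h>
#include <string.h>

#define LEN 10
int a[LEN], b[LEN], c[LEN], X[LEN];

/* CASE3 as written: the store to X[i] sits between the two updates of a[i]. */
void foo_case3(void) {
  int i;
  for (i = 1; i < LEN; i++) {
    a[i] += b[i] * c[i];
    X[i] = X[i-1] * X[i-1];
    a[i] -= b[i] * c[i];
  }
}

/* Hypothetical hand-simplified form: only the recurrence on X remains,
   which is the same loop body that CASE2 is reduced to. */
void foo_expected(void) {
  int i;
  for (i = 1; i < LEN; i++) {
    X[i] = X[i-1] * X[i-1];
  }
}

/* Hypothetical helper: run both versions on small inputs (chosen so that
   no signed overflow occurs) and return 1 if 'a' is left unchanged by
   foo_case3 and both versions produce the same X. */
int same_result(void) {
  int a_before[LEN], x_case3[LEN];
  int a_unchanged;
  int i;

  for (i = 0; i < LEN; i++) {
    a[i] = i;
    b[i] = i + 1;
    c[i] = 2 * i;
    X[i] = 0;
  }
  X[0] = 1;                       /* 1 * 1 stays 1, so X never overflows */
  memcpy(a_before, a, sizeof a);

  foo_case3();
  memcpy(x_case3, X, sizeof X);
  a_unchanged = (memcmp(a_before, a, sizeof a) == 0);

  for (i = 0; i < LEN; i++) {
    X[i] = 0;
  }
  X[0] = 1;
  foo_expected();

  return a_unchanged && memcmp(x_case3, X, sizeof X) == 0;
}

/* Hypothetical convenience wrapper: abort if the two versions differ. */
void check_case3(void) {
  assert(same_result());
}
```

Calling check_case3() from a main function is enough to run the comparison. This only checks one set of inputs; it illustrates the expectation and does not prove that the transformation is valid in general.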
Best Regards,
Hiroshi
