Hi,
I'm trying to figure out why the following sequence of instructions is
not collapsed to "ret i32 0" by the opt tool with "-O3".
---
%0 = type <{ i32* }>
define i32 @main(%0* noalias nocapture %arg) nounwind readnone {
bb:
  %tmp = alloca [1024 x i32], align 4 ; <[1024 x i32]*> [#uses=2]
  %tmp3 = getelementptr inbounds [1024 x i32]* %tmp, i32 0, i32 0 ; <i32*> [#uses=1]
  %tmp4 = bitcast [1024 x i32]* %tmp to [1 x i32]* ; <[1 x i32]*> [#uses=1]
  store [1 x i32] zeroinitializer, [1 x i32]* %tmp4
  %tmp5 = load i32* %tmp3 ; <i32> [#uses=1]
  ret i32 %tmp5
}
---
%tmp is what I'd like to call a local heap: it is allocated in the
entry function and passed as a parameter to all called functions.
These functions can then instantiate types with reference semantics
and return the objects to their callers without problems.
In this case the local heap has a size of 1024 words. Memory for an
array of one integer is allocated on it (%tmp4) and initialized to
zero, and then the array's first element is returned (the optimizer
uses %tmp3 for the load).
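To make the local heap idea concrete, here is a rough C sketch of the same pattern. It is only an illustration under assumptions: the names local_heap, heap_alloc_words and entry_main are made up for this example, and the bump-pointer allocation is a guess at the simplest scheme that matches the description, not the code my compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local heap: a fixed block of words owned by the entry
   function, handed out front to back (bump allocation). */
typedef struct {
    int32_t *base;     /* start of the backing storage */
    size_t   used;     /* number of words already handed out */
    size_t   capacity; /* total number of words available */
} local_heap;

/* Reserve `count` words on the heap; returns NULL if they do not fit. */
int32_t *heap_alloc_words(local_heap *heap, size_t count) {
    assert(heap != NULL);
    if (count > heap->capacity - heap->used) {
        return NULL;
    }
    int32_t *block = heap->base + heap->used;
    heap->used += count;
    return block;
}

/* Rough counterpart of the example program: allocate int[1] on the
   local heap, zero it, and return its first element. */
int32_t entry_main(void) {
    int32_t storage[1024];                     /* plays the role of %tmp */
    local_heap heap = { storage, 0, 1024 };
    int32_t *arr = heap_alloc_words(&heap, 1); /* plays the role of %tmp4 */
    if (arr == NULL) {
        return -1;
    }
    memset(arr, 0, 1 * sizeof *arr);           /* the zeroinitializer store */
    return arr[0];                             /* the load through %tmp3 */
}
```

Because the first allocation starts at offset 0 of the storage, the array and the heap base are the same address, which is why the IR can use %tmp3 and %tmp4 interchangeably.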
The original code in my language reads:
entry int main()
{
    int[] arr = int[1](); // int[] has reference semantics
    return arr[0];
}
Any thoughts on this? I figured it had something to do with alias
analysis not covering this case, so I ran the AA evaluator and let
LLVM output all alias queries:
===== Alias Analysis Evaluator Report ====
6 Total Alias Queries Performed
3 no alias responses (50.0%)
0 may alias responses (0.0%)
3 must alias responses (50.0%)
Alias Analysis Evaluator Pointer Alias Summary: 50%/0%/50%
Alias Analysis Mod/Ref Evaluator Summary: no mod/ref!
no alias ... tmp(4096) : arg(4)
no alias ... tmp3(4) : arg(4)
must alias ... tmp3(4) : tmp(4096)
no alias ... tmp4(4) : arg(4)
must alias ... tmp4(4) : tmp(4096)
must alias ... tmp4(4) : tmp3(4)
%tmp3 and %tmp4 are correctly reported as must-alias, so it should be
possible for an optimization pass (is it instruction combining?) to
figure out that %tmp5 will be 0 for this store/load pair. Am I right?
store [1 x i32] zeroinitializer, [1 x i32]* %tmp4
%tmp5 = load i32* %tmp3 ; <i32> [#uses=1]
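The reasoning I have in mind can be sketched in C as a byte-range check. This is a simplified model under assumptions, not how any LLVM pass is implemented: access_range and store_covers_load are names invented for this example, and the sizes (4 bytes for both [1 x i32] and i32) are taken from the AA evaluator output above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A memory access modeled as the byte range [offset, offset + size)
   relative to the start of the alloca. */
typedef struct {
    size_t offset;
    size_t size;
} access_range;

/* True when every byte read by `load` was written by `store`, i.e. the
   loaded value is fully determined by the stored value. */
bool store_covers_load(access_range store, access_range load) {
    return load.offset >= store.offset &&
           load.offset + load.size <= store.offset + store.size;
}

/* The pair from the example: both accesses start at offset 0 and are
   4 bytes wide, so the load reads exactly the zero bytes just stored. */
bool example_pair_is_covered(void) {
    access_range store_tmp4 = { 0, 4 }; /* store [1 x i32] zeroinitializer */
    access_range load_tmp3  = { 0, 4 }; /* load i32 */
    return store_covers_load(store_tmp4, load_tmp3);
}
```

If that model is right, the only thing standing between the store and a constant result is the type mismatch between the stored aggregate and the loaded scalar.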
Thanks!
Stephan