Shuai Wang via llvm-dev
2020-Jul-09 08:15 UTC
[llvm-dev] Understand alias-analysis results
Hello,

I am performing alias analysis on the following simple code:

    struct MyStruct {
      int * f1;
      int * f2;
    };

    void NOALIAS(void* p, void* q){
    }

    int main() {
      struct MyStruct s[2];
      int a,b;
      s[0].f1 = &a;
      s[1].f1 = &b;
      NOALIAS(s[a].f1, s[b].f2);
      return 0;
    }

I use the following commands to generate the .bc file and run alias analysis:

    clang -c -emit-llvm t.c -O2
    opt -basicaa -aa-eval -print-alias-sets -disable-output t.bc

I got the following output:

    Alias sets for function 'NOALIAS':
    Alias Set Tracker: 0 alias sets for 0 pointer values.

    Alias sets for function 'main':
    Alias Set Tracker: 0 alias sets for 0 pointer values.

    ===== Alias Analysis Evaluator Report ====
      1 Total Alias Queries Performed
      0 no alias responses (0.0%)
      1 may alias responses (100.0%)
      0 partial alias responses (0.0%)
      0 must alias responses (0.0%)

I checked the generated .ll code, and it shows that the main and NOALIAS
functions each contain only a "ret" statement, with no global or local
variables used. Could anyone shed some light on where the "1 may alias"
comes from? And is there a way to force the alias analysis to focus only
on the "main" function?

Thank you very much.

Best,
Shuai
Matt P. Dziubinski via llvm-dev
2020-Jul-09 10:14 UTC
[llvm-dev] Understand alias-analysis results
On 7/9/2020 10:15, Shuai Wang via llvm-dev wrote:
> Hello,
>
> I am performing alias analysis toward the following simple code:
>
> [...]
>
> I checked the generated .ll code, and it shows that within the main
> function and NOALIAS functions, there is only a "ret" statement, with no
> global or local variables used. Could anyone shed some lights on where
> the "1 may alias" come from? And is there a way that I can force the
> alias analysis algorithm to focus only the "main" function? Thank you
> very much.

Hi!

Here's more information after initializing the variables (assuming the
intent in the source code was, e.g., to initialize `a` and `b` to `0`
and the pointers `f1` and `f2` to `NULL`, using aggregate initialization
for `s`):
- Clang [-> LLVM-IR]: https://llvm.godbolt.org/z/WT7V3E
- [LLVM-IR ->] opt: https://llvm.godbolt.org/z/Veswa4

    Alias sets for function 'main':
    Alias Set Tracker: 1 alias sets for 2 pointer values.
      AliasSet[0x55ec7f9a23e0, 3] may alias, Mod/Ref Pointers: (i8* %0, LocationSize::precise(4)), (i32* %a, LocationSize::precise(4))

Note that in the original source code `a`, `b` are
uninitialized--consequently, attempting to access `s[a].f1` and
`s[b].f2` is undefined behavior (as we're using automatic storage
duration objects `a` and `b` while their values are indeterminate):
https://taas.trust-in-soft.com/tsnippet/t/acff56c8

Cf. https://cigix.me/c17#6.7.9.p10 ("If an object that has automatic
storage duration is not initialized explicitly, its value is
indeterminate.") & https://cigix.me/c17#J.2.p1
("The behavior is undefined in the following circumstances: [...] The
value of an object with automatic storage duration is used while it is
indeterminate").

As such, you can see that most of the code is going to be optimized
away between mem2reg and dead argument elimination:
https://llvm.godbolt.org/z/iEdKE_

(Similarly, even if `a` and `b` were initialized to `0`, we only wrote
to `f1` for `s[0]` and `s[1]`, so accessing `s[b].f2` is again using an
object while it is indeterminate and undefined behavior.)

    *** IR Dump After Promote Memory to Register ***

    ; the following corresponds to loading `s[a].f1`
    %3 = load i32, i32* %a, align 4, !tbaa !7
    %idxprom = sext i32 %3 to i64
    %arrayidx3 = getelementptr inbounds [2 x %struct.MyStruct], [2 x %struct.MyStruct]* %s, i64 0, i64 %idxprom
    %f14 = getelementptr inbounds %struct.MyStruct, %struct.MyStruct* %arrayidx3, i32 0, i32 0
    %4 = load i32*, i32** %f14, align 16, !tbaa !2
    %5 = bitcast i32* %4 to i8*

    ; the following corresponds to loading `s[b].f2`
    %6 = load i32, i32* %b, align 4, !tbaa !7
    %idxprom5 = sext i32 %6 to i64
    %arrayidx6 = getelementptr inbounds [2 x %struct.MyStruct], [2 x %struct.MyStruct]* %s, i64 0, i64 %idxprom5
    %f2 = getelementptr inbounds %struct.MyStruct, %struct.MyStruct* %arrayidx6, i32 0, i32 1
    %7 = load i32*, i32** %f2, align 8, !tbaa !9
    %8 = bitcast i32* %7 to i8*
    call void @NOALIAS(i8* %5, i8* %8)

    *** IR Dump After Dead Argument Elimination ***
    ; note how the arguments have been rewritten to `undef` in the following:
    call void @NOALIAS(i8* undef, i8* undef)

> And is there a way that I can force the alias analysis algorithm to
> focus only the "main" function?

One way is to make the definition of `NOALIAS` unavailable (as if
external) by only providing the declaration (as in the above examples).

Best,
Matt
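
A minimal sketch of a variant in which every object is initialized before it is read, so that both arguments passed to `NOALIAS` are well defined. The source behind the Compiler Explorer links is not reproduced here; the index values, the write to `f2`, the function name `well_defined_example`, and the `seen_p`/`seen_q` globals are all assumptions made for illustration. `NOALIAS` is given a body only so that the snippet links and can be checked; for the analysis itself, the suggestion above is to keep just the declaration.

```c
#include <stddef.h>

struct MyStruct {
    int *f1;
    int *f2;
};

/* Hypothetical bookkeeping: remember the arguments so the call has an
   observable effect and the result can be checked. */
static void *seen_p, *seen_q;

void NOALIAS(void *p, void *q)
{
    seen_p = p;
    seen_q = q;
}

/* Returns 1 if NOALIAS received &a and &b, 0 otherwise. */
int well_defined_example(void)
{
    /* Aggregate initialization: every f1/f2 starts as NULL,
       so no field is left indeterminate. */
    struct MyStruct s[2] = { { NULL, NULL }, { NULL, NULL } };
    int a = 0, b = 1;   /* initialized, and valid indices into s */

    s[a].f1 = &a;       /* write exactly the fields that are read below */
    s[b].f2 = &b;

    NOALIAS(s[a].f1, s[b].f2);

    /* Compare while a and b are still alive. */
    return seen_p == (void *)&a && seen_q == (void *)&b;
}
```

To inspect it the way this thread does, one could replace the body of `NOALIAS` with a bare declaration and rerun the clang and opt commands from the first message.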
Shuai Wang via llvm-dev
2020-Jul-09 10:51 UTC
[llvm-dev] Understand alias-analysis results
Hey Matt,

That's awesome. Thank you very much for all the information and
clarification! I have a few follow-up questions; could you kindly shed
some light on them? Thank you!

1. I tweaked the code in the following way:
- Clang [-> LLVM-IR]: https://llvm.godbolt.org/z/n9rGrs
- [LLVM-IR ->] opt: https://llvm.godbolt.org/z/Uc6h5Y

And I note that the output is:

    Alias sets for function 'main':
    Alias Set Tracker: 2 alias sets for 4 pointer values.
      AliasSet[0x563faa6c6260, 5] may alias, Mod/Ref Pointers: (i8* %0, LocationSize::precise(4)), (i32* %a, LocationSize::precise(4)), (i8* %1, LocationSize::precise(4)), (i32* %b, LocationSize::precise(4))
        1 Unknown instructions: call void @NOALIAS(i8* nonnull %0, i8* nonnull %1) #3
      AliasSet[0x563faa6b45e0, 1] must alias, Mod/Ref forwarding to 0x563faa6c6260

    ===== Alias Analysis Evaluator Report ====
      6 Total Alias Queries Performed
      4 no alias responses (66.6%)
      0 may alias responses (0.0%)
      0 partial alias responses (0.0%)
      2 must alias responses (33.3%)

I am trying to interpret the output. If I understand correctly, it
indicates that we have one alias set of 4 pointers which "potentially"
point to the same memory region, correct? If so, is there a more
accurate analysis pass that I could use to infer that "there are two
must-alias sets, each with two pointers"? Correct me if I am wrong
here. Using my local opt (version 6.0), I tried every available alias
analysis pass, but the results did not change.

Also, what does "must alias, Mod/Ref forwarding to 0x563faa6c6260"
mean? And how should I interpret the "2 must alias responses"? Where do
they come from? And why do we have "0 may alias responses"? I would
have expected at least "4 may alias responses" as well.

2. I note that the latest opt (version 11.0?) gives different output
from my local opt (version 6.0). opt version 6.0 reports:

    2 alias sets for 2 pointer values.

More importantly, can I expect to get generally better alias analysis
results when switching to version 11.0?

Thank you very much!

Best,
Shuai
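
One possible reading of the evaluator totals, offered as an assumption rather than something stated in this thread: if the evaluator issues one query per unordered pair of the pointer values it collects in a function, then 4 pointer values give 4 * 3 / 2 = 6 queries, and a function with only 2 pointer values (such as one with two pointer parameters) gives 1. The sketch below just works through that arithmetic and the percentages; the helper names are hypothetical, and truncating to one decimal place is inferred from the "66.6%" and "33.3%" figures in the report above.

```c
#include <stddef.h>

/* Number of unordered pairs among n pointer values: n * (n - 1) / 2. */
size_t alias_query_count(size_t n)
{
    return n < 2 ? 0 : n * (n - 1) / 2;
}

/* Share of `part` in `total`, in tenths of a percent, truncated
   (666 stands for 66.6%). Returns 0 when total is 0. */
unsigned tenths_of_percent(size_t part, size_t total)
{
    return total == 0 ? 0u : (unsigned)(part * 1000 / total);
}
```

Under that assumption, the report's "4 no alias" and "2 must alias" responses account for all 6 pairs, which would leave no pair to be reported as "may alias".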