Matt P. Dziubinski via llvm-dev
2020-Jul-09 13:58 UTC
[llvm-dev] Understand alias-analysis results
Hi again! Replying in chronological order:> On Thu, Jul 9, 2020 at 6:51 PM Shuai Wang <wangshuai901 at gmail.com > <mailto:wangshuai901 at gmail.com>> wrote: > > Hey Matt, > > That's awesome. Thank you very much for all the information and > clarification! Just a few follow up questions. Could you kindly shed > some lights on it? Thank you! > > 1. I tried to tweak the code in the following way: [...] > > I am trying to interpret the outputs, so if I understand correctly, > the output indicates that we have an alias set of 4 pointers which > "potentially" point to the same memory region, correct? Then is > there any more accurate analysis pass that I could use to somewhat > infer that "there are two must alias sets, each set has two > pointers"? Correct me if I was wrong here.. Using my local opt > (version 6.0), I tried to iterate all feasible alias analysis passes > but the results are not changed.Seems correct, I don't think you'll get more precise results out of the basic-aa pass, note that it has limited context sensitivity: https://llvm.org/docs/AliasAnalysis.html#the-basic-aa-pass Compare the results for `test_simple`, `test_in_array`, and `test_same_underlying_object_different_indices`: https://github.com/llvm/llvm-project/blob/release/10.x/llvm/test/Analysis/BasicAA/struct-geps.ll> Also, what is the "must alias, Mod/Ref forwarding to 0x563faa6c6260"?If alias sets have been merged, you'll get the attached node forwarding to the root node; note the comment for `getForwardedTarget` making a reference to union-find: https://github.com/llvm/llvm-project/blob/release/10.x/llvm/include/llvm/Analysis/AliasSetTracker.h#L281 (with "merge" corresponding to the union-find collapsing for "union", https://en.wikipedia.org/wiki/Disjoint-set_data_structure#Union). You can see how `AliasSet::mergeSetIn` (called, e.g., by `AliasSetTracker::mergeAliasSetsForPointer`) sets up forwarding for `AS.Forward`: https://github.com/llvm/llvm-project/blob/release/10.x/llvm/lib/Analysis/AliasSetTracker.cpp#L51, https://github.com/llvm/llvm-project/blob/release/10.x/llvm/lib/Analysis/AliasSetTracker.cpp#L301 FWIW, you can use a tracer or manually step through `opt` in a debugger to follow the function calls.> And how to interpret that we have "2 must alias responses"? Where > does it come from? And why do we have "0 may alias response"? I > would expect to have at least "4 may alias responses" as well?No, "MayAlias" and "MustAlias" are distinct elements in the lattice, cf. https://llvm.org/docs/AliasAnalysis.html#must-may-or-no There's a good explanation of the alias analysis queries and responses in the following talk (particularly the part starting with "AA Query" around the 22 min. mark): “Pointers, Alias & ModRef Analyses” (2018 EuroLLVM Developers’ Meeting: A. Sbirlea & N. Lopes) https://www.youtube.com/watch?v=r0XVS4Atl3U When you return `AliasResult` from your analysis you choose one: https://llvm.org/doxygen/namespacellvm.html#ae1738272abcf2ac638b97e7dc6360cfd You can see a simple example here (`TARAAResult::alias`): https://blog.tartanllama.xyz/llvm-alias-analysis/> 2. I note that using the latest opt (version 11.0?) gives different > outputs with my local opt (version 6.0). For opt (version 6.0), it > reports: 2 alias sets for 2 pointer values. > > More importantly, can I expect to get generally better alias > analysis results when switching to version 11.0?I'd assume that generally it shouldn't get worse :-)> > Thank you very much!On 7/9/2020 13:24, Shuai Wang wrote: > And another case: > > - Clang [-> LLVM-IR]: https://llvm.godbolt.org/z/SGeJZw > - [LLVM-IR ->] opt: https://llvm.godbolt.org/z/dNi-k2 > > Is there any chance that we can smoothly infer that: > - x and &c are "must alias" > - x and b are "must alias" > > I don't know how to interpret the current results, in particular the > following outputs: > > AliasSet[0x5584ab7e5f30, 1] must alias, Mod/Ref Pointers: (i32** %x, > LocationSize::precise(8)) > AliasSet[0x5584ab7e6020, 1] must alias, Mod Pointers: (i32* %y, > LocationSize::precise(4)) > > It means we have two "must alias" sets, each of which contain only one > pointer? That seems quite confusing to me.. You can add -print-all-alias-modref-info for more detailed information: https://llvm.godbolt.org/z/9njGqx -- you'll notice "MustAlias: i32* %c, i8* %6". Adding `-evaluate-aa-metadata` for `load` and `store` instructions, https://llvm.godbolt.org/z/YaW1Mb, you'll notice "MustAlias: %0 = load i32**, i32*** %a, align 8 <-> store i32** %b, i32*** %a, align 8" However, from your results we can already note: AliasSet[0x5584ab7e5d00, 5] may alias, Mod/Ref Pointers: (i32* %c, LocationSize::precise(4)), (i32** %b, LocationSize::precise(8)), (i32** %0, LocationSize::precise(8)), (i32* %2, LocationSize::precise(4)) Note how in the C source code pointer `b` points to int `c` (`b = &c;`) corresponding to the memory locations (same object in memory when loading from `c` or `*b`). However, pointers `x` and `y` are distinct objects in memory themselves. In general, remember not to confuse pointers with what they point to--here also distinct, since `x` points `b` but `y` points to `c` (I mention this specifically since the desired inference of "x and b are "must alias"" cannot be correct--these are not the same objects in memory). Best, Matt
Shuai Wang via llvm-dev
2020-Jul-10 05:17 UTC
[llvm-dev] Understand alias-analysis results
Hello! Thank you very much! Yes, that makes a lot of sense to me. However, just want to point out two things that are still unclear: 1. The output contains a alias set of only one element, for instance: "must alias, Mod Pointers: (i32* %y, LocationSize::precise(4))" This one really confused me. I would expect to have an alias set of at least *two* elements (e.g., two pointers point to the same memory region), otherwise for the above case, $y$ is aliased to whom? I searched all outputs that are related to %y from https://llvm.godbolt.org/z/9njGqx but all I can find is "NoAlias". Overall, to understand why an alias set can have only one pointer looks quite strange to me.. 2. For the following code chunks: b = &c; x = *a; int y = *x; MUSTALIAS(x,&c); MUSTALIAS(x,b); I don't get it why (quote from your previous email) "the desired inference of "x and b are "must alias"" cannot be correct--these are not the same objects in memory". x and b are both pointing to c, right? Am I missing anything here? Sorry for the potential trouble this may have caused.. And thank you in advance! Best, Shuai On Thu, Jul 9, 2020 at 9:58 PM Matt P. Dziubinski <matdzb at gmail.com> wrote:> Hi again! > > Replying in chronological order: > > > On Thu, Jul 9, 2020 at 6:51 PM Shuai Wang <wangshuai901 at gmail.com > > <mailto:wangshuai901 at gmail.com>> wrote: > > > > Hey Matt, > > > > That's awesome. Thank you very much for all the information and > > clarification! Just a few follow up questions. Could you kindly shed > > some lights on it? Thank you! > > > > 1. I tried to tweak the code in the following way: [...] > > > > I am trying to interpret the outputs, so if I understand correctly, > > the output indicates that we have an alias set of 4 pointers which > > "potentially" point to the same memory region, correct? Then is > > there any more accurate analysis pass that I could use to somewhat > > infer that "there are two must alias sets, each set has two > > pointers"? Correct me if I was wrong here.. Using my local opt > > (version 6.0), I tried to iterate all feasible alias analysis passes > > but the results are not changed. > > Seems correct, I don't think you'll get more precise results out of the > basic-aa pass, note that it has limited context sensitivity: > https://llvm.org/docs/AliasAnalysis.html#the-basic-aa-pass > > Compare the results for `test_simple`, `test_in_array`, and > `test_same_underlying_object_different_indices`: > > https://github.com/llvm/llvm-project/blob/release/10.x/llvm/test/Analysis/BasicAA/struct-geps.ll > > > Also, what is the "must alias, Mod/Ref forwarding to 0x563faa6c6260"? > > If alias sets have been merged, you'll get the attached node forwarding > to the root node; note the comment for `getForwardedTarget` making a > reference to union-find: > > https://github.com/llvm/llvm-project/blob/release/10.x/llvm/include/llvm/Analysis/AliasSetTracker.h#L281 > (with "merge" corresponding to the union-find collapsing for "union", > https://en.wikipedia.org/wiki/Disjoint-set_data_structure#Union). > > You can see how `AliasSet::mergeSetIn` (called, e.g., by > `AliasSetTracker::mergeAliasSetsForPointer`) sets up forwarding for > `AS.Forward`: > > https://github.com/llvm/llvm-project/blob/release/10.x/llvm/lib/Analysis/AliasSetTracker.cpp#L51, > > > https://github.com/llvm/llvm-project/blob/release/10.x/llvm/lib/Analysis/AliasSetTracker.cpp#L301 > FWIW, you can use a tracer or manually step through `opt` in a debugger > to follow the function calls. > > > And how to interpret that we have "2 must alias responses"? Where > > does it come from? And why do we have "0 may alias response"? I > > would expect to have at least "4 may alias responses" as well? > > No, "MayAlias" and "MustAlias" are distinct elements in the lattice, cf. > https://llvm.org/docs/AliasAnalysis.html#must-may-or-no > There's a good explanation of the alias analysis queries and responses > in the following talk (particularly the part starting with "AA Query" > around the 22 min. mark): > “Pointers, Alias & ModRef Analyses” (2018 EuroLLVM Developers’ Meeting: > A. Sbirlea & N. Lopes) > https://www.youtube.com/watch?v=r0XVS4Atl3U > > When you return `AliasResult` from your analysis you choose one: > > https://llvm.org/doxygen/namespacellvm.html#ae1738272abcf2ac638b97e7dc6360cfd > > You can see a simple example here (`TARAAResult::alias`): > https://blog.tartanllama.xyz/llvm-alias-analysis/ > > > 2. I note that using the latest opt (version 11.0?) gives different > > outputs with my local opt (version 6.0). For opt (version 6.0), it > > reports: 2 alias sets for 2 pointer values. > > > > More importantly, can I expect to get generally better alias > > analysis results when switching to version 11.0? > > I'd assume that generally it shouldn't get worse :-) > > > > > Thank you very much! > > On 7/9/2020 13:24, Shuai Wang wrote: > > And another case: > > > > - Clang [-> LLVM-IR]: https://llvm.godbolt.org/z/SGeJZw > > - [LLVM-IR ->] opt: https://llvm.godbolt.org/z/dNi-k2 > > > > Is there any chance that we can smoothly infer that: > > - x and &c are "must alias" > > - x and b are "must alias" > > > > I don't know how to interpret the current results, in particular the > > following outputs: > > > > AliasSet[0x5584ab7e5f30, 1] must alias, Mod/Ref Pointers: (i32** %x, > > LocationSize::precise(8)) > > AliasSet[0x5584ab7e6020, 1] must alias, Mod Pointers: (i32* %y, > > LocationSize::precise(4)) > > > > It means we have two "must alias" sets, each of which contain only one > > pointer? That seems quite confusing to me.. > > You can add -print-all-alias-modref-info for more detailed information: > https://llvm.godbolt.org/z/9njGqx -- you'll notice "MustAlias: i32* %c, > i8* %6". > > Adding `-evaluate-aa-metadata` for `load` and `store` instructions, > https://llvm.godbolt.org/z/YaW1Mb, you'll notice "MustAlias: %0 = load > i32**, i32*** %a, align 8 <-> store i32** %b, i32*** %a, align 8" > > However, from your results we can already note: > > AliasSet[0x5584ab7e5d00, 5] may alias, Mod/Ref Pointers: (i32* %c, > LocationSize::precise(4)), (i32** %b, LocationSize::precise(8)), (i32** > %0, LocationSize::precise(8)), (i32* %2, LocationSize::precise(4)) > > Note how in the C source code pointer `b` points to int `c` (`b = &c;`) > corresponding to the memory locations (same object in memory when > loading from `c` or `*b`). However, pointers `x` and `y` are distinct > objects in memory themselves. In general, remember not to confuse > pointers with what they point to--here also distinct, since `x` points > `b` but `y` points to `c` (I mention this specifically since the desired > inference of "x and b are "must alias"" cannot be correct--these are not > the same objects in memory). > > Best, > Matt > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200710/2b3a5a0f/attachment.html>
Matt P. Dziubinski via llvm-dev
2020-Jul-10 08:43 UTC
[llvm-dev] Understand alias-analysis results
Hi! On 7/10/2020 07:17, Shuai Wang wrote:> Hello! > > Thank you very much! Yes, that makes a lot of sense to me. However, just > want to point out two things that are still unclear: > > 1. The output contains a alias set of only one element, for instance: > "must alias, Mod Pointers: (i32* %y, LocationSize::precise(4))" > > This one really confused me. I would expect to have an alias set of at > least *two* elements (e.g., two pointers point to the same memory > region), otherwise for the above case, $y$ is aliased to whom? I > searched all outputs that are related to %y from > https://llvm.godbolt.org/z/9njGqx but all I can find is "NoAlias". > Overall, to understand why an alias set can have only one pointer looks > quite strange to me..This seems correct: Note that `int y` is a distinct object in memory, not a pointer, and shares no alias sets with anything (we happen to assign the int _value_ we obtain from pointer `x` by dereferencing it `*x`, but that bears no relevance to aliasing here). Perhaps this can help illustrate the scenario (assuming the URL doesn't get mangled): http://www.pythontutor.com/cpp.html#code=void%20MUSTALIAS%28void%20*p,%20void%20*q%29%20%7B%7D%0A%0Aint%20main%28%29%7B%0A%0A%20%20int%20**a,%20*b,%20*x%20,c%3B%0A%20%20c%20%3D%2010%3B%0A%20%20a%20%3D%20%26b%3B%0A%20%20b%20%3D%20%26c%3B%0A%20%20x%20%3D%20*a%3B%0A%20%20int%20y%20%3D%20*x%3B%0A%20%20MUSTALIAS%28x,%26c%29%3B%0A%20%20MUSTALIAS%28x,b%29%3B%0A%20%20return%200%3B%0A%7D&curInstr=12&mode=display&origin=opt-frontend.js&py=cpp&rawInputLstJSON=%5B%5D Note (in step 13 of 13) how `y` does not alias (it is just an `int` itself) anything (agreeing with NoAlias results you're getting).> > 2. For the following code chunks: > > b = &c; > x = *a; > int y = *x; > MUSTALIAS(x,&c); > MUSTALIAS(x,b); > > I don't get it why (quote from your previous email) "the desired > inference of "x and b are "must alias"" cannot be correct--these are not > the same objects in memory". x and b are both pointing to c, right? Am I > missing anything here?Correct, both `x` and `b` point to `c` (the pointers themselves are distinct, but after dereferencing what they point to is the same location). Consider the slice on the (first call) to `MUSTALIAS(x, &c);`: https://llvm.godbolt.org/z/YaW1Mb %a = alloca i32**, align 8 %b = alloca i32*, align 8 %x = alloca i32*, align 8 %c = alloca i32, align 4 . . . %0 = load i32**, i32*** %a, align 8 %1 = load i32*, i32** %0, align 8 . . . %4 = load i32*, i32** %x, align 8 %5 = bitcast i32* %4 to i8* %6 = bitcast i32* %c to i8* ; MUSTALIAS(x, &c); call void @MUSTALIAS(i8* %5, i8* %6) Notice the following results: NoAlias: i32* %4, i32** %x MayAlias: i32* %4, i32* %c MustAlias: i32* %4, i8* %5 MayAlias: i32* %4, i8* %6 . . . MayAlias: i32** %b, i8* %5 NoAlias: i32** %x, i8* %5 MayAlias: i32* %c, i8* %5 MayAlias: i8* %5, i8* %6 MustAlias: i32* %c, i8* %6 . . . MustAlias: %4 = load i32*, i32** %x, align 8 <-> store i32* %1, i32** %x, align 8 The location in memory of a pointer object `b` may alias location of the first argument `%5` but not the (other memory object) pointer `x`; however, `i32* %c` aliases second argument `i8* %6` (the same memory object) as well as may alias `i8* %5` (which we've obtained from loading the address from pointer `x`; the difference here being between the memory locations storing the pointers themselves, say, on the stack--not aliased, and the memory objects the addresses stored in the pointers refer to, i.e., i32** vs. i32* or i8*). Best, Matt> > Sorry for the potential trouble this may have caused.. And thank you > in advance! > > Best, > Shuai > > On Thu, Jul 9, 2020 at 9:58 PM Matt P. Dziubinski <matdzb at gmail.com > <mailto:matdzb at gmail.com>> wrote: > > Hi again! > > Replying in chronological order: > > > On Thu, Jul 9, 2020 at 6:51 PM Shuai Wang <wangshuai901 at gmail.com > <mailto:wangshuai901 at gmail.com> > > <mailto:wangshuai901 at gmail.com <mailto:wangshuai901 at gmail.com>>> > wrote: > > > > Hey Matt, > > > > That's awesome. Thank you very much for all the information and > > clarification! Just a few follow up questions. Could you > kindly shed > > some lights on it? Thank you! > > > > 1. I tried to tweak the code in the following way: [...] > > > > I am trying to interpret the outputs, so if I understand > correctly, > > the output indicates that we have an alias set of 4 pointers > which > > "potentially" point to the same memory region, correct? Then is > > there any more accurate analysis pass that I could use to > somewhat > > infer that "there are two must alias sets, each set has two > > pointers"? Correct me if I was wrong here.. Using my local opt > > (version 6.0), I tried to iterate all feasible alias analysis > passes > > but the results are not changed. > > Seems correct, I don't think you'll get more precise results out of the > basic-aa pass, note that it has limited context sensitivity: > https://llvm.org/docs/AliasAnalysis.html#the-basic-aa-pass > > Compare the results for `test_simple`, `test_in_array`, and > `test_same_underlying_object_different_indices`: > https://github.com/llvm/llvm-project/blob/release/10.x/llvm/test/Analysis/BasicAA/struct-geps.ll > > > Also, what is the "must alias, Mod/Ref forwarding to > 0x563faa6c6260"? > > If alias sets have been merged, you'll get the attached node forwarding > to the root node; note the comment for `getForwardedTarget` making a > reference to union-find: > https://github.com/llvm/llvm-project/blob/release/10.x/llvm/include/llvm/Analysis/AliasSetTracker.h#L281 > (with "merge" corresponding to the union-find collapsing for "union", > https://en.wikipedia.org/wiki/Disjoint-set_data_structure#Union). > > You can see how `AliasSet::mergeSetIn` (called, e.g., by > `AliasSetTracker::mergeAliasSetsForPointer`) sets up forwarding for > `AS.Forward`: > https://github.com/llvm/llvm-project/blob/release/10.x/llvm/lib/Analysis/AliasSetTracker.cpp#L51, > > https://github.com/llvm/llvm-project/blob/release/10.x/llvm/lib/Analysis/AliasSetTracker.cpp#L301 > FWIW, you can use a tracer or manually step through `opt` in a debugger > to follow the function calls. > > > And how to interpret that we have "2 must alias responses"? Where > > does it come from? And why do we have "0 may alias response"? I > > would expect to have at least "4 may alias responses" as well? > > No, "MayAlias" and "MustAlias" are distinct elements in the lattice, cf. > https://llvm.org/docs/AliasAnalysis.html#must-may-or-no > There's a good explanation of the alias analysis queries and responses > in the following talk (particularly the part starting with "AA Query" > around the 22 min. mark): > “Pointers, Alias & ModRef Analyses” (2018 EuroLLVM Developers’ Meeting: > A. Sbirlea & N. Lopes) > https://www.youtube.com/watch?v=r0XVS4Atl3U > > When you return `AliasResult` from your analysis you choose one: > https://llvm.org/doxygen/namespacellvm.html#ae1738272abcf2ac638b97e7dc6360cfd > > You can see a simple example here (`TARAAResult::alias`): > https://blog.tartanllama.xyz/llvm-alias-analysis/ > > > 2. I note that using the latest opt (version 11.0?) gives > different > > outputs with my local opt (version 6.0). For opt (version > 6.0), it > > reports: 2 alias sets for 2 pointer values. > > > > More importantly, can I expect to get generally better alias > > analysis results when switching to version 11.0? > > I'd assume that generally it shouldn't get worse :-) > > > > > Thank you very much! > > On 7/9/2020 13:24, Shuai Wang wrote: > > And another case: > > > > - Clang [-> LLVM-IR]: https://llvm.godbolt.org/z/SGeJZw > > - [LLVM-IR ->] opt: https://llvm.godbolt.org/z/dNi-k2 > > > > Is there any chance that we can smoothly infer that: > > - x and &c are "must alias" > > - x and b are "must alias" > > > > I don't know how to interpret the current results, in particular the > > following outputs: > > > > AliasSet[0x5584ab7e5f30, 1] must alias, Mod/Ref Pointers: > (i32** %x, > > LocationSize::precise(8)) > > AliasSet[0x5584ab7e6020, 1] must alias, Mod Pointers: > (i32* %y, > > LocationSize::precise(4)) > > > > It means we have two "must alias" sets, each of which contain > only one > > pointer? That seems quite confusing to me.. > > You can add -print-all-alias-modref-info for more detailed information: > https://llvm.godbolt.org/z/9njGqx -- you'll notice "MustAlias: i32* > %c, > i8* %6". > > Adding `-evaluate-aa-metadata` for `load` and `store` instructions, > https://llvm.godbolt.org/z/YaW1Mb, you'll notice "MustAlias: %0 > load > i32**, i32*** %a, align 8 <-> store i32** %b, i32*** %a, align 8" > > However, from your results we can already note: > > AliasSet[0x5584ab7e5d00, 5] may alias, Mod/Ref Pointers: (i32* %c, > LocationSize::precise(4)), (i32** %b, LocationSize::precise(8)), (i32** > %0, LocationSize::precise(8)), (i32* %2, LocationSize::precise(4)) > > Note how in the C source code pointer `b` points to int `c` (`b = &c;`) > corresponding to the memory locations (same object in memory when > loading from `c` or `*b`). However, pointers `x` and `y` are distinct > objects in memory themselves. In general, remember not to confuse > pointers with what they point to--here also distinct, since `x` points > `b` but `y` points to `c` (I mention this specifically since the > desired > inference of "x and b are "must alias"" cannot be correct--these are > not > the same objects in memory). > > Best, > Matt >