On Aug 23, 2012, at 5:20 AM, Tran Ma <tranma at cse.unsw.edu.au> wrote:
> Hi,
>
> I work on DDC, the compiler of a research Haskell dialect, Disciple
(disciple.ouroborus.net). We are looking to make use of LLVM's type-based
alias analysis metadata to encode non-aliasing information between variables. We
have found that the tbaa structure is somewhat limited in its expressivity. In
particular we couldn't encode intransitive relations such as
"no-alias", e.g. (x `no-alias` y) and (y `no-alias` z) but not (x
`no-alias` z), where x, y and z are all variables (not types).
Correct, the current system cannot represent that.
>
> As an example, consider the following contrived piece of code, in which we
assume (b `no-alias` d) and (c `no-alias` d):
>
> %1 = load i32* %b, align 4 , !tbaa !1
> %2 = load i32* %c, align 4 , !tbaa !4
> %3 = add nsw i32 %2, %1
> store i32 %3, i32* %d, align 4 , !tbaa !5, !tbaa !2
>
> ; these two loads should be eliminated
> %4 = load i32* %b, align 4 , !tbaa !1
> %5 = load i32* %c, align 4 , !tbaa !4
> %6 = add nsw i32 %5, %4
> …
> !0 = metadata !{ metadata !"root-distinct-bd" }
> !1 = metadata !{ metadata !"b", metadata !0, i64 0 }
> !2 = metadata !{ metadata !"d1", metadata !0, i64 0 }
> !3 = metadata !{ metadata !"root-distinct-cd" }
> !4 = metadata !{ metadata !"c", metadata !3, i64 0 }
> !5 = metadata !{ metadata !"d2", metadata !3, i64 0 }
>
> Here we have invented different nodes for one variable (d) and annotate the
store with both of them, in hope that LLVM would make use of both nodes.
Unfortunately for us it seems to be ignoring all but the last one,
Right. If you run this code through plain "opt -S", you'll see
that the IR
automatically discards one of the !tbaa nodes. That's a general limitation
of the metadata system.
Do %b and %c really need to be separate types? In other systems, when you
have "a bunch of loads" that don't alias some store, it's
desirable to just
tag all the loads with the same type.
> here is the code I got with "opt -S -tbaa -basicaa -gvn -o -":
>
> %1 = load i32* %b, align 4, !tbaa !0
> %2 = load i32* %c, align 4, !tbaa !2
> %3 = add nsw i32 %2, %1
> store i32 %3, i32* %d, align 4, !tbaa !4
>
> ; only one of the two loads is optimised out
> %4 = load i32* %b, align 4, !tbaa !0
> %5 = add nsw i32 %2, %4
> …
> !0 = metadata !{metadata !"b", metadata !1, i64 0}
> !1 = metadata !{metadata !"root-distinct-bd"}
> !2 = metadata !{metadata !"c", metadata !3, i64 0}
> !3 = metadata !{metadata !"root-distinct-cd"}
> !4 = metadata !{metadata !"d2", metadata !3, i64 0}
>
> Is there a way to encode this relation using existing tbaa metadata that
I'm missing? Or am I consigned to writing my own alias analysis pass?
No, the current code cannot do this. Furthermore, anticipated
extensions of it -- making it use a DAG instead of a tree -- wouldn't be
able to do this either.
It would be possible to implement what you want by adding the proverbial
extra layer of indirection -- instead of having an instruction carry a
TBAA tree node identifier, an instruction could carry an identifier for
a list of tree node identifiers. It's not obviously worthwhile to do this
in mainline because it'd be extra overhead for no gain in the most common
use cases. However, I could be convinced otherwise if you have good reasons.
Dan