Manman Ren
2013-Mar-27 20:11 UTC
[LLVMdev] PROPOSAL: struct-access-path aware TBAA (new version)
Hello, After discussions with Daniel, Dan and others, here is an updated proposal for struct-access-path aware TBAA. Given an example struct A { int x; int y; }; struct B { A a; int z; }; struct C { B b1; B b2; int *p; }; struct D { C c; }; The purpose of struct-path-aware TBAA is to say "C::b1.a" will alias with "B::a.x", "C::b1.a" will alias with "B", "C::b1.a" does not alias with "D::c.b2.a.x". "A::x" does not alias with "A::y" We can potentially disambiguate aggregate accesses and scalar accesses. The current scalar TBAA will say "C::b1.a.x" will alias with "D::c.b2.a.x" since both are accesses to int. "A::x" will alias with "A::y" This proposal is about the format of metadata and how to implement alias queries. --------- Option A A> metadata for the type DAG 1> the existing scalar nodes !0 = metadata !{metadata !"any pointer", metadata !1} !1 = metadata !{metadata !"omnipotent char", metadata !2} !2 = metadata !{metadata !"Simple C/C++ TBAA"} !4 = metadata !{metadata !"int", metadata !1} 2> the struct nodes A struct node has a unique name plus a list of field types For example, struct node for "C" should look like !5 = metadata !{"C", metadata !3, metadata !3, metadata !0} where !3 is the struct node for "B" The type DAG for the example: int <------------- A <- B <-- C <-- D int <-------------------- B any pointer <--------------- C B> We call metadata attached to scalar accesses scalar tags and metadata attached to aggregate accesses aggregate tags. Each tag can be a sequence of field accesses and nodes in the type DAG. For example, the tag for "C::b2.a.x" is !7 := [ "tbaa.path", !C, "C::b2", !B, "B::a", !A, "A::x", !int ] ;C::b2.a.x where !C, !B, !A are metadata nodes in the type DAG, "C::b2" "B::a" "A::x" are strings. C> alias(x, y) = alias_rule(x, y) or alias_rule(y, x), where alias_rule(x, y) is if the first element of 'x' does not enclose the first element of 'y' return NoAlias if the last element of 'x' encloses the first element of 'y' via type DAG, return Alias Check the first element of 'y', if it is not in 'x', return NoAlias Repeat until we reach the end of either 'x' or 'y' Check the next element of 'y' with the next element of 'x' if not the same, return false. -------------- Option B Option B is to include offset of each field in the type DAG and let tag be offset based. A> metadata for the type DAG 1> the existing scalar nodes 2> the struct nodes A struct node has a unique name plus a list of pairs (offset, field type) For example, struct node for "C" should look like !5 = metadata !{"C", 0, metadata !3, 12, metadata !3, 24, metadata !0} where !3 is the struct node for "B" The type DAG for the example: int <- 0 <- A int <- 4 <- A <- 0 <- B <- 0 <- C <- 0 <- D int <-------------- 8 <- B <- 12 <- C any pointer <--------------- 24 <- C B> The tag will be (outermost struct type node, offset, size, the last type node) For example, the tag for C::b2.a.x will be !7 = [ "tbaa.offset", !C, 12, 4, !int] ;C::b2.a.x The last field of the tag is to handle alias(D::c, A::x), the tags will be [!D, 0, 32, !C] and [!A, 0, 4, !int], we only need to check whether !C encloses !A. C> Consider access pair "C::b1.a.x" and "D::c.b2.a.x", the tags are [!C, 0, 4, !int] and [!D, 12, 4, !int]. To check if they alias, we first check if !D encloses !C. If no, we return NoAlias. Since !D does enclose !C in our example, we start tracing from !D and follow the edge with the correct offset by adjusting the offset to be relative to the next node. We will reach C in the type DAG, now we have a common type node and can compare the (offset, size) directly. [!D, 12, 4] will be adjusted to [!C, 12, 4], since it does not overlap with [!C, 0, 4], we say they do not alias. alias_rule(x, y) is if the outermost type of 'x' does not enclose the outermost type of 'y', return NoAlias if the last type of 'x' encloses the outermost type of 'y' return Alias Start from the outermost type of 'x' Repeat until we reach the outermost type of 'y' follow the edge with the correct offset in the type DAG adjust the offset to be relative to the next node Compare the adjusted offset, size pair for 'x' against 'y' if they overlap, return Alias Otherwise, return NoAlias ------------------ Comparison The complexity of alias queries: similar The space for metadata: option A should require more space because the tag is a sequence of nodes and strings. I will work on Option B. ------------------ Holes We use BasicAA to perform sanity check. If BasicAA returns MayAlias, TBAA will kick off with full speed. Watch out for cases where the incoming arguments have different types but they do alias. Scalar TBAA has the same issue but we don't see bugs coming up. If we see bugs coming up with struct-path aware TBAA, we can do less aggressive TBAA or apply similar tricks as GCC. Scalar TBAA will say NoAlias for the placement new example under test/Analysis/TypeBasedAliasAnalysis/placement-tbaa.ll. If we optimize placement-tbaa.ll with -O1 -gvn, BaiscAA will start saying MustAlias. This may cause problem.
Xinliang David Li
2013-Mar-28 16:39 UTC
[LLVMdev] PROPOSAL: struct-access-path aware TBAA (new version)
Nice proposal. A possibly simplified way of doing TBAA is: 1) decompose an access into a 3-tuples: <BaseType, AccessType, BitOffset> -- there is no need to remember the full access path. 2) If the access types are incompatible, return noAlias (scalar TBAA rule) 3) if there is no containment relationship between two BaseTypes, return noAlias 4) Otherwise, align the the enclosed base type into the enclosing type (for each enclosed instance), and adjust the offset. If there is no overlap, return noAlias, otherwise return mayAlias. For instance, "C::b1.a.y" AND "D::c.b2.a.x C::b1.a.y --> <C, int, 32> D::c.b2.a.x --> <D, int, 96> Base Type C is enclosed in D, so the first access is normalized to <D, int, 32> which does not overlap with <D, int, 96>. For more complicated cases, such as p->a[i].x.b[j].y, TBAA can choose to normalize the access via projection before creating the tuple --> p->a[0].x.b[0].y. Make sure MustAlias is not wrongly returned for this case. Cheers, David On Wed, Mar 27, 2013 at 1:11 PM, Manman Ren <mren at apple.com> wrote:> > Hello, > > After discussions with Daniel, Dan and others, here is an updated proposal > for struct-access-path aware TBAA. > > Given an example > struct A { > int x; > int y; > }; > struct B { > A a; > int z; > }; > struct C { > B b1; > B b2; > int *p; > }; > struct D { > C c; > }; > > The purpose of struct-path-aware TBAA is to say > "C::b1.a" will alias with "B::a.x", "C::b1.a" will alias with "B", > "C::b1.a" does not alias with "D::c.b2.a.x". > "A::x" does not alias with "A::y" > We can potentially disambiguate aggregate accesses and scalar accesses. > > The current scalar TBAA will say > "C::b1.a.x" will alias with "D::c.b2.a.x" since both are accesses to int. > "A::x" will alias with "A::y" > > This proposal is about the format of metadata and how to implement alias > queries. > --------- Option A > A> metadata for the type DAG > 1> the existing scalar nodes > !0 = metadata !{metadata !"any pointer", metadata !1} > !1 = metadata !{metadata !"omnipotent char", metadata !2} > !2 = metadata !{metadata !"Simple C/C++ TBAA"} > !4 = metadata !{metadata !"int", metadata !1} > 2> the struct nodes > A struct node has a unique name plus a list of field types > For example, struct node for "C" should look like > !5 = metadata !{"C", metadata !3, metadata !3, metadata !0} > where !3 is the struct node for "B" > > The type DAG for the example: > int <------------- A <- B <-- C <-- D > int <-------------------- B > any pointer <--------------- C > > B> We call metadata attached to scalar accesses scalar tags and metadata > attached to aggregate accesses aggregate tags. > Each tag can be a sequence of field accesses and nodes in the type DAG. > For example, the tag for "C::b2.a.x" is > !7 := [ "tbaa.path", !C, "C::b2", !B, "B::a", !A, "A::x", !int ] > ;C::b2.a.x > where !C, !B, !A are metadata nodes in the type DAG, "C::b2" "B::a" > "A::x" are strings. > C> alias(x, y) = alias_rule(x, y) or alias_rule(y, x), where alias_rule(x, > y) is > if the first element of 'x' does not enclose the first element of 'y' > return NoAlias > if the last element of 'x' encloses the first element of 'y' via type > DAG, > return Alias > Check the first element of 'y', if it is not in 'x', return NoAlias > Repeat until we reach the end of either 'x' or 'y' > Check the next element of 'y' with the next element of 'x' > if not the same, return false. > > -------------- Option B > Option B is to include offset of each field in the type DAG and let tag > be offset based. > A> metadata for the type DAG > 1> the existing scalar nodes > 2> the struct nodes > A struct node has a unique name plus a list of pairs (offset, field > type) > For example, struct node for "C" should look like > !5 = metadata !{"C", 0, metadata !3, 12, metadata !3, 24, metadata > !0} > where !3 is the struct node for "B" > The type DAG for the example: > int <- 0 <- A > int <- 4 <- A <- 0 <- B <- 0 <- C <- 0 <- D > int <-------------- 8 <- B <- 12 <- C > any pointer <--------------- 24 <- C > > B> The tag will be (outermost struct type node, offset, size, the last > type node) > For example, the tag for C::b2.a.x will be > !7 = [ "tbaa.offset", !C, 12, 4, !int] ;C::b2.a.x > The last field of the tag is to handle alias(D::c, A::x), the tags will > be > [!D, 0, 32, !C] and [!A, 0, 4, !int], we only need to check whether !C > encloses > !A. > > C> Consider access pair "C::b1.a.x" and "D::c.b2.a.x", the tags are [!C, > 0, 4, !int] > and [!D, 12, 4, !int]. > To check if they alias, we first check if !D encloses !C. If no, we > return > NoAlias. Since !D does enclose !C in our example, we start tracing from > !D > and follow the edge with the correct offset by > adjusting the offset to be relative to the next node. We will reach C > in the > type DAG, now we have a common type node and can compare the (offset, > size) > directly. [!D, 12, 4] will be adjusted to [!C, 12, 4], since it does not > overlap with [!C, 0, 4], we say they do not alias. > > alias_rule(x, y) is > if the outermost type of 'x' does not enclose the outermost type of 'y', > return NoAlias > if the last type of 'x' encloses the outermost type of 'y' > return Alias > Start from the outermost type of 'x' > Repeat until we reach the outermost type of 'y' > follow the edge with the correct offset in the type DAG > adjust the offset to be relative to the next node > Compare the adjusted offset, size pair for 'x' against 'y' > if they overlap, return Alias > Otherwise, return NoAlias > > ------------------ Comparison > The complexity of alias queries: similar > The space for metadata: option A should require more space because the tag > is a > sequence of nodes and strings. > I will work on Option B. > > ------------------ Holes > We use BasicAA to perform sanity check. If BasicAA returns MayAlias, TBAA > will > kick off with full speed. > > Watch out for cases where the incoming arguments have different types but > they > do alias. Scalar TBAA has the same issue but we don't see bugs coming up. > If > we see bugs coming up with struct-path aware TBAA, we can do less > aggressive > TBAA or apply similar tricks as GCC. > > Scalar TBAA will say NoAlias for the placement new example under > test/Analysis/TypeBasedAliasAnalysis/placement-tbaa.ll. > If we optimize placement-tbaa.ll with -O1 -gvn, BaiscAA will start saying > MustAlias. This may cause problem. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130328/dcbf6eb6/attachment.html>