On 3/11/13 1:17 PM, Daniel Berlin wrote:> On Mon, Mar 11, 2013 at 11:41 AM, Manman Ren <mren at apple.com> wrote: >> Based on discussions with John McCall >> >> We currently focus on field accesses of structs, more specifically, on fields that are scalars or structs. >> >> Fundamental rules from C11 >> -------------------------- >> An object shall have its stored value accessed only by an lvalue expression that has one of the following types: [footnote: The intent of this list is to specify those circumstances in which an object may or may not be aliased.] >> 1. a type compatible with the effective type of the object, >> 2. a qualified version of a type compatible with the effective type of the object, >> 3. a type that is the signed or unsigned type corresponding to the effective type of the object, >> 4. a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object, >> 5. an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or >> 6. a character type. >> >> Example >> ------- >> struct A { >> int x; >> int y; >> }; >> struct B { >> A a; >> int z; >> }; >> struct C { >> B b1; >> B b2; >> int *p; >> }; >> >> Type DAG: >> int <- A::x <- A >> int <- A::y <- A <- B::a <- B <- C::b1 <- C >> int <----------------- B::z <- B <- C::b2 <- C >> any pointer <--------------------- C::p <- C >> >> The type DAG has two types of TBAA nodes: >> 1> the existing scalar nodes >> 2> the struct nodes (this is different from the current tbaa.struct) >> A struct node has a unique name plus a list of pairs (field name, field type). >> For example, struct node for "C" should look like >> !4 = metadata !{"C", "C::b1", metadata !3, "C::b2", metadata !3, "C::p", metadata !2} >> where !3 is the struct node for "B", !2 is pointer type. >> >> Given a field access >> struct B *bp = ...; >> bp->a.x = 5; >> we annotate it as B::a.x. > In the case of multiple structures containing substructures, how are > you differentiating? > > IE given > > struct A { > struct B b; > } > struct C { > struct B b; > } > > How do you know the above struct B *bp =...; is B::b from C and not B::b from A?If I understand correct, the proposed graph is DAG, not tree, and it should be able to tackling the case where a type is included more than once. On the other hand, the info which is annotated to the memory access is kind of "immediate enclosing aggregate type", which should be unique.> > (I agree you can know in the case of direct aggregates, but I argue > you have no way to know in the case of pointer arguments without > interprocedural analysis) > It gets worse the more levels you have. > > ie if you add > struct B { > struct E e; > } > > and have struct E *e = ... > how do you know it's the B::e contained in struct C, or the B::e > contained in struct A? > > > Again, i agree you can do both scalar and direct aggregates, but not > aggregates and scalars through pointers. > >That is exactly the power of TBAA. If the both memory accesses are direct load/store, it dose not even need TBAA, check their base/offset/size is sufficient for disambiguation.
On Tue, Mar 12, 2013 at 3:44 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:> > On 3/11/13 1:17 PM, Daniel Berlin wrote: >> >> On Mon, Mar 11, 2013 at 11:41 AM, Manman Ren <mren at apple.com> wrote: >>> >>> Based on discussions with John McCall >>> >>> We currently focus on field accesses of structs, more specifically, on >>> fields that are scalars or structs. >>> >>> Fundamental rules from C11 >>> -------------------------- >>> An object shall have its stored value accessed only by an lvalue >>> expression that has one of the following types: [footnote: The intent of >>> this list is to specify those circumstances in which an object may or may >>> not be aliased.] >>> 1. a type compatible with the effective type of the object, >>> 2. a qualified version of a type compatible with the effective type of >>> the object, >>> 3. a type that is the signed or unsigned type corresponding to the >>> effective type of the object, >>> 4. a type that is the signed or unsigned type corresponding to a >>> qualified version of the effective type of the object, >>> 5. an aggregate or union type that includes one of the aforementioned >>> types among its members (including, recursively, a member of a subaggregate >>> or contained union), or >>> 6. a character type. >>> >>> Example >>> ------- >>> struct A { >>> int x; >>> int y; >>> }; >>> struct B { >>> A a; >>> int z; >>> }; >>> struct C { >>> B b1; >>> B b2; >>> int *p; >>> }; >>> >>> Type DAG: >>> int <- A::x <- A >>> int <- A::y <- A <- B::a <- B <- C::b1 <- C >>> int <----------------- B::z <- B <- C::b2 <- C >>> any pointer <--------------------- C::p <- C >>> >>> The type DAG has two types of TBAA nodes: >>> 1> the existing scalar nodes >>> 2> the struct nodes (this is different from the current tbaa.struct) >>> A struct node has a unique name plus a list of pairs (field name, >>> field type). >>> For example, struct node for "C" should look like >>> !4 = metadata !{"C", "C::b1", metadata !3, "C::b2", metadata !3, >>> "C::p", metadata !2} >>> where !3 is the struct node for "B", !2 is pointer type. >>> >>> Given a field access >>> struct B *bp = ...; >>> bp->a.x = 5; >>> we annotate it as B::a.x. >> >> In the case of multiple structures containing substructures, how are >> you differentiating? >> >> IE given >> >> struct A { >> struct B b; >> } >> struct C { >> struct B b; >> } >> >> How do you know the above struct B *bp =...; is B::b from C and not B::b >> from A? > > If I understand correct, the proposed graph is DAG, not tree, and it should > be able to tackling the case > where a type is included more than once.This was not clear from the proposal, whether the type hierarchy is duplicated all the way down as necessary, or attempted-to-be-shared by having types with multiple parents.> > On the other hand, the info which is annotated to the memory access is kind > of "immediate enclosing aggregate type", > which should be unique. > > >> >> (I agree you can know in the case of direct aggregates, but I argue >> you have no way to know in the case of pointer arguments without >> interprocedural analysis) >> It gets worse the more levels you have. >> >> ie if you add >> struct B { >> struct E e; >> } >> >> and have struct E *e = ... >> how do you know it's the B::e contained in struct C, or the B::e >> contained in struct A? >> >> >> Again, i agree you can do both scalar and direct aggregates, but not >> aggregates and scalars through pointers. >> >> > That is exactly the power of TBAA.I understand quite a bit about TBAA :) My point was that his check will produce the wrong TBAA answer because it will have an incomplete path in his representation. He will declare no-alias, yet it it will be may-alias under the actual TBAA rules.> If the both memory accesses are direct > load/store, > it dose not even need TBAA, check their base/offset/size is sufficient for > disambiguation.False, in a lot of cases. For starters, roughly any union including aggregate, you will have overlaps that can be disambiguated using TBAA but not otherwise.
Maybe I were not clear in my previous mail, or maybe we are talking about different topics. My point was, if the pair of memory accesses are direct access, in most cases, the base/offset/size should be able to sufficient to figure out if they are alias or not. A tricky case would be subscripped variables, like a.b[i][j] vs a.c[m][n] where a.c and b.c are enclosed in a common union. If mem-dep test failure to figure out if there is dependence, in that case, TBAA might help. It seems what we discussing here is somewhat deviating the original topic:-)>> If the both memory accesses are direct >> load/store, >> it dose not even need TBAA, check their base/offset/size is sufficient for >> disambiguation. > False, in a lot of cases. For starters, roughly any union including > aggregate, you will have overlaps that can be disambiguated using TBAA > but not otherwise.
Possibly Parallel Threads
- [LLVMdev] PROPOSAL: struct-access-path aware TBAA
- [LLVMdev] PROPOSAL: struct-access-path aware TBAA
- [LLVMdev] PROPOSAL: struct-access-path aware TBAA
- [LLVMdev] PROPOSAL: struct-access-path aware TBAA
- [LLVMdev] PROPOSAL: struct-access-path aware TBAA (should we use offset+size instead of path?)