thr3ads.net - llvm dev - [LLVMdev] PROPOSAL: struct-access-path aware TBAA [Mar 2013]

If this information is useful, please help other people find it:
Share via:

Shuxin Yang

2013-Mar-12 22:44 UTC

[LLVMdev] PROPOSAL: struct-access-path aware TBAA

On 3/11/13 1:17 PM, Daniel Berlin wrote:> On Mon, Mar 11, 2013 at 11:41 AM, Manman Ren <mren at apple.com>
wrote:
>> Based on discussions with John McCall
>>
>> We currently focus on field accesses of structs, more specifically, on
fields that are scalars or structs.
>>
>> Fundamental rules from C11
>> --------------------------
>> An object shall have its stored value accessed only by an lvalue
expression that has one of the following types: [footnote: The intent of this
list is to specify those circumstances in which an object may or may not be
aliased.]
>> 1. a type compatible with the effective type of the object,
>> 2. a qualified version of a type compatible with the effective type of
the object,
>> 3. a type that is the signed or unsigned type corresponding to the
effective type of the object,
>> 4. a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,
>> 5. an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a subaggregate or
contained union), or
>> 6. a character type.
>>
>> Example
>> -------
>>    struct A {
>>      int x;
>>      int y;
>>    };
>>    struct B {
>>      A a;
>>      int z;
>>    };
>>    struct C {
>>      B b1;
>>      B b2;
>>      int *p;
>>    };
>>
>> Type DAG:
>>    int <- A::x <- A
>>    int <- A::y <- A <- B::a <- B <- C::b1 <- C
>>    int <----------------- B::z <- B <- C::b2 <- C
>>    any pointer <--------------------- C::p  <- C
>>
>> The type DAG has two types of TBAA nodes:
>> 1> the existing scalar nodes
>> 2> the struct nodes (this is different from the current tbaa.struct)
>>     A struct node has a unique name plus a list of pairs (field name,
field type).
>>     For example, struct node for "C" should look like
>>     !4 = metadata !{"C", "C::b1", metadata !3,
"C::b2", metadata !3, "C::p", metadata !2}
>>     where !3 is the struct node for "B", !2 is pointer type.
>>
>> Given a field access
>>    struct B *bp = ...;
>>    bp->a.x = 5;
>> we annotate it as B::a.x.
> In the case of multiple structures containing substructures, how are
> you differentiating?
>
> IE given
>
> struct A {
> struct B b;
> }
> struct C {
> struct B b;
> }
>
> How do you know the above struct B *bp =...; is B::b from C and not B::b
from A?If I understand correct, the proposed graph is DAG, not tree, and it 
should be able to tackling the case
where a type is included more than once.

On the other hand, the info which is annotated to the memory access is 
kind of "immediate enclosing aggregate type",
which should be unique.
>
> (I agree you can know in the case of direct aggregates, but I argue
> you have no way to know in the case of pointer arguments without
> interprocedural analysis)
> It gets worse the more levels you have.
>
> ie if you add
> struct B {
> struct E e;
> }
>
> and have struct E *e = ...
> how do you know it's the B::e contained in struct C, or the B::e
> contained in struct A?
>
>
> Again, i agree you can do both scalar and direct aggregates, but not
> aggregates and scalars through pointers.
>
>That is exactly the power of TBAA.  If the both memory accesses are 
direct load/store,
it dose not even need TBAA, check their base/offset/size is sufficient 
for disambiguation.

Daniel Berlin

2013-Mar-13 00:40 UTC

head link

[LLVMdev] PROPOSAL: struct-access-path aware TBAA

On Tue, Mar 12, 2013 at 3:44 PM, Shuxin Yang <shuxin.llvm at gmail.com>
wrote:>
> On 3/11/13 1:17 PM, Daniel Berlin wrote:
>>
>> On Mon, Mar 11, 2013 at 11:41 AM, Manman Ren <mren at apple.com>
wrote:
>>>
>>> Based on discussions with John McCall
>>>
>>> We currently focus on field accesses of structs, more specifically,
on
>>> fields that are scalars or structs.
>>>
>>> Fundamental rules from C11
>>> --------------------------
>>> An object shall have its stored value accessed only by an lvalue
>>> expression that has one of the following types: [footnote: The
intent of
>>> this list is to specify those circumstances in which an object may
or may
>>> not be aliased.]
>>> 1. a type compatible with the effective type of the object,
>>> 2. a qualified version of a type compatible with the effective type
of
>>> the object,
>>> 3. a type that is the signed or unsigned type corresponding to the
>>> effective type of the object,
>>> 4. a type that is the signed or unsigned type corresponding to a
>>> qualified version of the effective type of the object,
>>> 5. an aggregate or union type that includes one of the
aforementioned
>>> types among its members (including, recursively, a member of a
subaggregate
>>> or contained union), or
>>> 6. a character type.
>>>
>>> Example
>>> -------
>>>    struct A {
>>>      int x;
>>>      int y;
>>>    };
>>>    struct B {
>>>      A a;
>>>      int z;
>>>    };
>>>    struct C {
>>>      B b1;
>>>      B b2;
>>>      int *p;
>>>    };
>>>
>>> Type DAG:
>>>    int <- A::x <- A
>>>    int <- A::y <- A <- B::a <- B <- C::b1 <- C
>>>    int <----------------- B::z <- B <- C::b2 <- C
>>>    any pointer <--------------------- C::p  <- C
>>>
>>> The type DAG has two types of TBAA nodes:
>>> 1> the existing scalar nodes
>>> 2> the struct nodes (this is different from the current
tbaa.struct)
>>>     A struct node has a unique name plus a list of pairs (field
name,
>>> field type).
>>>     For example, struct node for "C" should look like
>>>     !4 = metadata !{"C", "C::b1", metadata !3,
"C::b2", metadata !3,
>>> "C::p", metadata !2}
>>>     where !3 is the struct node for "B", !2 is pointer
type.
>>>
>>> Given a field access
>>>    struct B *bp = ...;
>>>    bp->a.x = 5;
>>> we annotate it as B::a.x.
>>
>> In the case of multiple structures containing substructures, how are
>> you differentiating?
>>
>> IE given
>>
>> struct A {
>> struct B b;
>> }
>> struct C {
>> struct B b;
>> }
>>
>> How do you know the above struct B *bp =...; is B::b from C and not
B::b
>> from A?
>
> If I understand correct, the proposed graph is DAG, not tree, and it should
> be able to tackling the case
> where a type is included more than once.
This was not clear from the proposal, whether the type hierarchy is
duplicated all the way down as necessary, or attempted-to-be-shared by
having types with multiple parents.
>
> On the other hand, the info which is annotated to the memory access is kind
> of "immediate enclosing aggregate type",
> which should be unique.
>
>
>>
>> (I agree you can know in the case of direct aggregates, but I argue
>> you have no way to know in the case of pointer arguments without
>> interprocedural analysis)
>> It gets worse the more levels you have.
>>
>> ie if you add
>> struct B {
>> struct E e;
>> }
>>
>> and have struct E *e = ...
>> how do you know it's the B::e contained in struct C, or the B::e
>> contained in struct A?
>>
>>
>> Again, i agree you can do both scalar and direct aggregates, but not
>> aggregates and scalars through pointers.
>>
>>
> That is exactly the power of TBAA.
I understand quite a bit about TBAA :)

My point was that his check will produce the wrong TBAA answer because
it will have an incomplete path in his representation.  He will
declare no-alias, yet it it will be may-alias under the actual TBAA
rules.
>  If the both memory accesses are direct
> load/store,
> it dose not even need TBAA, check their base/offset/size is sufficient for
> disambiguation.False, in a lot of cases.  For starters, roughly any union including
aggregate, you will have overlaps that can be disambiguated using TBAA
but not otherwise.

Shuxin Yang

2013-Mar-13 00:56 UTC

head link

[LLVMdev] PROPOSAL: struct-access-path aware TBAA

Maybe I were not clear in my previous mail, or maybe we are talking 
about different topics.

My point was, if the pair of memory accesses are direct access,  in most 
cases, the base/offset/size
should be able to sufficient to figure out if they are alias or not. A 
tricky case would be
subscripped variables, like a.b[i][j] vs a.c[m][n] where a.c and b.c are 
enclosed in a common union.
  If mem-dep test failure to figure out if there is dependence, in that 
case, TBAA might help.

  It seems what we discussing here is somewhat deviating the original 
topic:-)


>>   If the both memory accesses are direct
>> load/store,
>> it dose not even need TBAA, check their base/offset/size is sufficient
for
>> disambiguation.
> False, in a lot of cases.  For starters, roughly any union including
> aggregate, you will have overlaps that can be disambiguated using TBAA
> but not otherwise.

Seemingly Similar Threads

Search for more seemingly similar threads

llvm dev - Mar 2013 - [LLVMdev] PROPOSAL: struct-access-path aware TBAA

[LLVMdev] PROPOSAL: struct-access-path aware TBAA

[LLVMdev] PROPOSAL: struct-access-path aware TBAA

[LLVMdev] PROPOSAL: struct-access-path aware TBAA

Seemingly Similar Threads