thr3ads.net - llvm dev - [LLVMdev] PROPOSAL: struct-access-path aware TBAA [Mar 2013]

Shuxin Yang

2013-Mar-13 18:07 UTC

[LLVMdev] PROPOSAL: struct-access-path aware TBAA

> The program I gave was well typed :)
Hi, Daniel:
    Thank you for sharing your insight.  I didn't realize it is
well-typed -- I'm no expert on any standard.
I'd admit that standards and specs are some of the most boring material
on this planet :-).

    So, if I understand correctly, your point is:
        if a standard allows a type cast (even one in bad taste :-),
TBAA has to respect that standard.

   If that is strictly true, TBAA has to rely on points-to analysis.
However, that would virtually disable
TBAA, as most points-to sets have an "unknown" element.

    Going back to my previous mail,

> In the below example, GCC assumes p and q point to anything because
> they are incoming arguments.
>
>>
>> ------------------------------
>> typedef struct {
>>      int x;
>> }T1;
>>
>> typedef struct {
>>      int y;
>> }T2;
>>
>> int foo(T1 *p, T2 *q) {
>>      p->x = 1;
>>      q->y = 4;
>>      return p->x;
>> }
>> --------------------------

Yes, gcc should assume p and q point to anything; however, the result
contradicts that assumption:
it promotes the p->x expression.

   If I fabricate a caller by borrowing some code from your previous
example (see below),
I think this code and your previous example (about placement new) fall
under the same part of the standard.  I'm wondering
if gcc can give a correct result.

    void foo_caller() {
        T1 t1;
        T1 *pt1 = &t1;
        T2 *pt2 = new (pt1) T2;
        foo(pt1, pt2);
    }

Arnold Schwaighofer

2013-Mar-13 18:37 UTC

On Mar 13, 2013, at 1:07 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>> 
>> The program I gave was well typed :)
> 
> Hi, Daniel:
>   Thank you for sharing your insight.  I didn't realize it is
> well-typed -- I'm no expert on any standard.
> I'd admit that standards and specs are some of the most boring material
> on this planet :-).
>
>   So, if I understand correctly, your point is:
>       if a standard allows a type cast (even one in bad taste :-),
> TBAA has to respect that standard.
>
>  If that is strictly true, TBAA has to rely on points-to analysis. However,
> that would virtually disable
> TBAA, as most points-to sets have an "unknown" element.
> 
>   Going back to my previous mail,
>> In the below example, GCC assumes p and q point to anything because
>> they are incoming arguments.
>> 
>>> 
>>> ------------------------------
>>> typedef struct {
>>>     int x;
>>> }T1;
>>> 
>>> typedef struct {
>>>     int y;
>>> }T2;
>>> 
>>> int foo(T1 *p, T2 *q) {
>>>     p->x = 1;
>>>     q->y = 4;
>>>     return p->x;
>>> }
>>> --------------------------
> Yes, gcc should assume p and q point to anything; however, the result
> contradicts that assumption:
> it promotes the p->x expression.

Assuming the above is C11 code, I think the relevant section of the C spec is
the following:

This is a paragraph from a C11 draft ("N1570 Committee Draft, April 12,
2011"). Assuming my interpretation of it is correct, it seems to imply that a
store through an lvalue can change the effective type of the object for
subsequent accesses. This would preclude any purely type-based TBAA solution
and would, in general, require taking access/points-to information into
account.

---
6.5 Expressions

6: "The effective type of an object for an access to its stored value is
the declared type of the object, if any. If a value is stored into an object
having no declared type through an lvalue having a type that is not a character
type, then the type of the lvalue becomes the effective type of the object for
that access and for subsequent accesses that do not modify the stored value. If
a value is copied into an object having no declared type using memcpy or
memmove, or is copied as an array of character type, then the effective type of
the modified object for that access and for subsequent accesses that do not
modify the value is the effective type of the object from which the value is
copied, if it has one. For all other accesses to an object having no declared
type, the effective type of the object is simply the type of the lvalue used for
the access."
---

This comes just before paragraph 7 of 6.5 Expressions, which is quoted in the
current TBAA proposal.

 "If a value is stored into an object having no declared type through an
lvalue having a type that is not a character type, then the type of the lvalue
becomes the effective type of the object for that access and for subsequent
accesses that <<do not modify>> the stored value."

I read this as "a store sets the effective type for any subsequent read
access" to the same object. So, in the above example, if p and q pointed to
the same object, the store in the second line (q->y = 4) would change the
effective type set by the first line (p->x = 1). That means the read of
"p->x" through the old effective type would be undefined. Hence, we may
assume that p and q don't point to the same object.
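
A minimal sketch of this reading, not taken from the thread. It assumes heap
storage obtained from malloc (which has no declared type), reuses the T1/T2
definitions from the example above, and introduces a hypothetical helper named
retype_demo. Every read goes through the type of the most recent store, so each
access stays defined; the commented-out line is the read that the
interpretation above would treat as undefined.

```c
#include <stdlib.h>

typedef struct { int x; } T1;
typedef struct { int y; } T2;

/* Hypothetical helper, not part of the original mail.
 * Returns first * 10 + second, or -1 if allocation fails. */
int retype_demo(void) {
    size_t size = sizeof(T1) > sizeof(T2) ? sizeof(T1) : sizeof(T2);
    void *buf = malloc(size);   /* allocated storage: no declared type */
    if (buf == NULL)
        return -1;

    T1 *p = buf;
    T2 *q = buf;

    p->x = 1;                   /* store through p sets the effective type */
    int first = p->x;           /* read through the same type */

    q->y = 4;                   /* store through q re-types the storage */
    int second = q->y;          /* read through the new type */

    /* int stale = p->x; */     /* the read argued above to be undefined */

    free(buf);
    return first * 10 + second;
}
```

Because the stale read never happens, a compiler that assumes p and q do not
alias and one that assumes they do must both produce the same result here.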

I don't know whether that reasoning underlies the decision that GCC makes,
but it would be a justification (assuming my reasoning above is correct).

With respect to the current TBAA proposal, this means we have to be aware
that if we decide on a purely type/access-path-based solution, we might
break a lot more code than we do now.

Best,
Arnold

Daniel Berlin

2013-Mar-13 20:21 UTC

On Wed, Mar 13, 2013 at 11:37 AM, Arnold Schwaighofer
<aschwaighofer at apple.com> wrote:
>
> On Mar 13, 2013, at 1:07 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>
>>>
>>> The program I gave was well typed :)
>>
>> Hi, Daniel:
>>   Thank you for sharing your insight.  I didn't realize it is
>> well-typed -- I'm no expert on any standard.
>> I'd admit that standards and specs are some of the most boring material
>> on this planet :-).
>>
>>   So, if I understand correctly, your point is:
>>       if a standard allows a type cast (even one in bad taste :-),
>> TBAA has to respect that standard.
>>
>>  If that is strictly true, TBAA has to rely on points-to analysis.
>> However, that would virtually disable
>> TBAA, as most points-to sets have an "unknown" element.
>>
>>   Going back to my previous mail,
>>> In the below example, GCC assumes p and q point to anything because
>>> they are incoming arguments.
>>>
>>>>
>>>> ------------------------------
>>>> typedef struct {
>>>>     int x;
>>>> }T1;
>>>>
>>>> typedef struct {
>>>>     int y;
>>>> }T2;
>>>>
>>>> int foo(T1 *p, T2 *q) {
>>>>     p->x = 1;
>>>>     q->y = 4;
>>>>     return p->x;
>>>> }
>>>> --------------------------
>> Yes, gcc should assume p and q point to anything; however, the result
>> contradicts that assumption:
>> it promotes the p->x expression.
>
>
> Assuming the above is C11 code, I think the relevant section of the C spec
> is the following:
>
> This is a paragraph from a C11 draft ("N1570 Committee Draft, April 12,
> 2011"). Assuming my interpretation of it is correct, it seems to imply that
> a store through an lvalue can change the effective type of the object for
> subsequent accesses. This would preclude any purely type-based TBAA solution
> and would, in general, require taking access/points-to information into
> account.
>
> ---
> 6.5 Expressions
>
> 6: "The effective type of an object for an access to its stored value
> is the declared type of the object, if any. If a value is stored into an object
> having no declared type through an lvalue having a type that is not a character
> type, then the type of the lvalue becomes the effective type of the object for
> that access and for subsequent accesses that do not modify the stored value. If
> a value is copied into an object having no declared type using memcpy or
> memmove, or is copied as an array of character type, then the effective type of
> the modified object for that access and for subsequent accesses that do not
> modify the value is the effective type of the object from which the value is
> copied, if it has one. For all other accesses to an object having no declared
> type, the effective type of the object is simply the type of the lvalue used for
> the access."
> ---
>
> This comes just before paragraph 7 of 6.5 Expressions, which is quoted in
> the current TBAA proposal.
>
>  "If a value is stored into an object having no declared type through
> an lvalue having a type that is not a character type, then the type of the
> lvalue becomes the effective type of the object for that access and for
> subsequent accesses that <<do not modify>> the stored value."
>
> I read this as "a store sets the effective type for any subsequent read
> access" to the same object. So, in the above example, if p and q pointed to
> the same object, the store in the second line (q->y = 4) would change the
> effective type set by the first line (p->x = 1). That means the read of
> "p->x" through the old effective type would be undefined. Hence, we may
> assume that p and q don't point to the same object.
Yes, C is quite different than C++ here.

GCC will feel free to move these particular stores around, even though
it believes they point anywhere, but won't in my placement new C++
case, because they *must* point to the same memory.


>
> I don't know whether that reasoning underlies the decision that GCC
> makes, but it would be a justification (assuming my reasoning above is
> correct).
>
> With respect to the current TBAA proposal, this means we have to be aware
> that if we decide on a purely type/access-path-based solution, we might
> break a lot more code than we do now.
>
> Best,
> Arnold

Daniel Berlin

2013-Mar-13 20:39 UTC

On Wed, Mar 13, 2013 at 11:07 AM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>
>> The program I gave was well typed :)
>
>
> Hi, Daniel:
>    Thank you for sharing your insight.  I didn't realize it is
> well-typed -- I'm no expert on any standard.
> I'd admit that standards and specs are some of the most boring material
> on this planet :-).
>
>    So, if I understand correctly, your point is:
>        if a standard allows a type cast (even one in bad taste :-),
> TBAA has to respect that standard.

Yes.
>
>   If that is strictly true, TBAA has to rely on points-to analysis.

We actually disable TBAA in some cases, and rely on points-to in some others.
It's "complicated" :)
I can go back through the code and list the cases and reasons if you
think it would be helpful.

> However, that would virtually disable
> TBAA, as most points-to sets have an "unknown" element.

Well, the program you gave in the last message is fine.
It's okay to promote p->x even though it points to non-local
variables, in *C*, because any
read from q that actually read the same memory would be undefined.

C++ has cases where it's possible they are legally allowed to point to
the same *memory*, though
they cannot be objects that are live at the *same time*.
So basically, you have motion barriers, and you may not be able to see them :)

>
>    Going back to my previous mail,
>
>> In the below example, GCC assumes p and q point to anything because
>> they are incoming arguments.

I misspoke, it actually assumes they point to non-local variables.
Flow-insensitive points-to information

p_1(D), points-to non-local
q_2(D), points-to non-local

>>
>>>
>>> ------------------------------
>>> typedef struct {
>>>      int x;
>>> }T1;
>>>
>>> typedef struct {
>>>      int y;
>>> }T2;
>>>
>>> int foo(T1 *p, T2 *q) {
>>>      p->x = 1;
>>>      q->y = 4;
>>>      return p->x;
>>> }
>>> --------------------------
>
> Yes, gcc should assume p and q point to anything; however, the result
> contradicts that assumption:
> it promotes the p->x expression.

In both C and C++, a load from q->y would be undefined if they
accessed the same memory.
This is different from the example I gave :)
Note that it actually just propagates p->x, because it knows the other
store can't legally affect the *read*.
It doesn't delete either store :)
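
To make that transformation concrete, here is a sketch (hand-written, not GCC
output) of foo next to a version with the value propagated. The name
foo_propagated is hypothetical. Both stores remain in place; only the final
load is replaced by the constant that was just stored.

```c
typedef struct { int x; } T1;
typedef struct { int y; } T2;

/* The function from the thread, unchanged. */
int foo(T1 *p, T2 *q) {
    p->x = 1;
    q->y = 4;
    return p->x;    /* reload of p->x after the store through q */
}

/* Hypothetical illustration of the propagated form: the stores are kept,
 * and the reload is replaced by the value stored to p->x. */
int foo_propagated(T1 *p, T2 *q) {
    p->x = 1;
    q->y = 4;
    return 1;
}
```

The two versions agree whenever p and q refer to distinct objects, which is
the only situation in which the reasoning above considers the original defined.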


>
>   If I fabricate a caller by borrowing some code from your previous
> example (see below),
> I think this code and your previous example (about placement new) fall
> under the same part of the standard.  I'm wondering
> if gcc can give a correct result.
>
>    void foo_caller() {
>        T1 t1;
>        T1 *pt1 = &t1;
>        T2 *pt2 = new (pt1) T2;
>        foo(pt1, pt2);
>    }

pt1 is not allowed to be read in foo in this case.

The original example I gave was one where using the alias info causes
it to reorder a store to a live object above a store to a dead one,
because it does not know it is dead.
