> The program I gave was well typed :)

Hi, Daniel:
   Thank you for sharing your insight. I didn't realize it is well-typed -- I'm basically a big nut of any std.
I'd admit std/spec is some of the most boring material on this planet :-).

So, if I understand correctly, your point is:
   if a standard permits a type cast (even one that is in bad taste :-), TBAA has to respect that standard.

If that is strictly true, TBAA has to rely on points-to analysis. However, that would virtually disable
TBAA, as most points-to sets have an "unknown" element.

Going back to my previous mail,

> In the below example, GCC assumes p and q point to anything because
> they are incoming arguments.
>
>>
>> ------------------------------
>> typedef struct {
>>     int x;
>> }T1;
>>
>> typedef struct {
>>     int y;
>> }T2;
>>
>> int foo(T1 *p, T2 *q) {
>>     p->x = 1;
>>     q->y = 4;
>>     return p->x;
>> }
>> --------------------------

Yes, gcc should assume p and q point to anything; however, the result contradicts the assumption --
it promotes the p->x expression.

If I fabricate a caller by stealing some code from your previous example, see below.
I think this code and your previous example (about placement new) fall under the same part of the std. I'm wondering
if gcc can give a correct result.

foo_caller() {
    T1 t1;
    T1 *pt1;
    T2 *pt2 = new (pt1) T2;
    foo(pt1, pt2);
}
Arnold Schwaighofer
2013-Mar-13 18:37 UTC
[LLVMdev] PROPOSAL: struct-access-path aware TBAA
On Mar 13, 2013, at 1:07 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:

>>
>> The program I gave was well typed :)
>
> Hi, Daniel:
>    Thank you for sharing your insight. I didn't realize it is well-typed -- I'm basically a big nut of any std.
> I'd admit std/spec is some of the most boring material on this planet :-).
>
> So, if I understand correctly, your point is:
>    if a standard permits a type cast (even one that is in bad taste :-), TBAA has to respect that standard.
>
> If that is strictly true, TBAA has to rely on points-to analysis. However, that would virtually disable
> TBAA, as most points-to sets have an "unknown" element.
>
> Going back to my previous mail,
>> In the below example, GCC assumes p and q point to anything because
>> they are incoming arguments.
>>
>>>
>>> ------------------------------
>>> typedef struct {
>>>     int x;
>>> }T1;
>>>
>>> typedef struct {
>>>     int y;
>>> }T2;
>>>
>>> int foo(T1 *p, T2 *q) {
>>>     p->x = 1;
>>>     q->y = 4;
>>>     return p->x;
>>> }
>>> --------------------------
> Yes, gcc should assume p and q point to anything; however, the result contradicts the assumption --
> it promotes the p->x expression.

Assuming the above is C11 code, I think the relevant section in the C spec is the following.

This is a paragraph from a C11 draft ("N1570 Committee Draft -- April 12, 2011"). Assuming my interpretation of it is correct: it seems to imply that a store through an lvalue can change the effective type of the object for subsequent accesses? This would preclude any purely type-based TBAA solution, and would, in general, require taking access/points-to information into account.

---
6.5 Expressions

6: "The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access."
---

This is just before paragraph 6.5 Expressions 7, which is quoted in the current TBAA proposal.

"If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that <<do not modify>> the stored value."

I read this as "a store sets the 'effective type' for any subsequent read access" to the same object. So, in the above example, assuming
that p and q point to the same object, the effective type changes between the first and the second store. This means that IF p and q pointed to the
same object, the read access to "p->x" using the old effective type would be undefined. Hence, we may assume that p and q don't point to the same
object.

I don't know whether that reasoning underlies the decision that GCC makes, but it would be a justification (assuming my reasoning above is correct).

WRT the current TBAA proposal, this means we have to be aware that if we decide on a purely type/access-path based solution, we might be breaking a lot more code than we do now.

Best,
Arnold
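The effective-type rule quoted above can be illustrated with a small C sketch. It is an illustration only, not code from the thread: the function name `effective_type_demo` is hypothetical, and it deliberately uses plain `int` and `float` lvalues rather than the thread's T1/T2 structs, so that it does not take a position on whether the enclosing struct type participates in the rule. The storage comes from malloc, so it has no declared type and each store sets the effective type for later reads.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo: storage from malloc has no declared type, so a store
 * through a non-character lvalue sets the effective type for later reads
 * (C11 6.5p6 as quoted above). */
int effective_type_demo(void) {
    size_t size = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *raw = malloc(size);
    if (raw == NULL) {
        return -1;
    }

    int *ip = raw;
    float *fp = raw;

    *ip = 1;            /* effective type of the object is now int */
    int first = *ip;    /* defined: read through a matching lvalue type */

    *fp = 4.0f;         /* this store changes the effective type to float */
    float second = *fp; /* defined: read matches the new effective type */

    /* int stale = *ip;    would be undefined: the effective type is float,
     *                     but the read would go through an int lvalue */

    assert(first == 1);
    free(raw);
    return first + (int)second;
}
```

Under this reading, a compiler that sees the store through `fp` followed by a read through `ip` may assume the two do not refer to the same object, because a program in which they did would already be undefined.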
On Wed, Mar 13, 2013 at 11:37 AM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:

>
> On Mar 13, 2013, at 1:07 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>
>>>
>>> The program I gave was well typed :)
>>
>> Hi, Daniel:
>>    Thank you for sharing your insight. I didn't realize it is well-typed -- I'm basically a big nut of any std.
>> I'd admit std/spec is some of the most boring material on this planet :-).
>>
>> So, if I understand correctly, your point is:
>>    if a standard permits a type cast (even one that is in bad taste :-), TBAA has to respect that standard.
>>
>> If that is strictly true, TBAA has to rely on points-to analysis. However, that would virtually disable
>> TBAA, as most points-to sets have an "unknown" element.
>>
>> Going back to my previous mail,
>>> In the below example, GCC assumes p and q point to anything because
>>> they are incoming arguments.
>>>
>>>>
>>>> ------------------------------
>>>> typedef struct {
>>>>     int x;
>>>> }T1;
>>>>
>>>> typedef struct {
>>>>     int y;
>>>> }T2;
>>>>
>>>> int foo(T1 *p, T2 *q) {
>>>>     p->x = 1;
>>>>     q->y = 4;
>>>>     return p->x;
>>>> }
>>>> --------------------------
>> Yes, gcc should assume p and q point to anything; however, the result contradicts the assumption --
>> it promotes the p->x expression.
>
>
> Assuming the above is C11 code, I think the relevant section in the C spec is the following.
>
> This is a paragraph from a C11 draft ("N1570 Committee Draft -- April 12, 2011"). Assuming my interpretation of it is correct: it seems to imply that a store through an lvalue can change the effective type of the object for subsequent accesses? This would preclude any purely type-based TBAA solution, and would, in general, require taking access/points-to information into account.
>
> ---
> 6.5 Expressions
>
> 6: "The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access."
> ---
>
> This is just before paragraph 6.5 Expressions 7, which is quoted in the current TBAA proposal.
>
> "If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that <<do not modify>> the stored value."
>
> I read this as "a store sets the 'effective type' for any subsequent read access" to the same object. So, in the above example, assuming
> that p and q point to the same object, the effective type changes between the first and the second store. This means that IF p and q pointed to the
> same object, the read access to "p->x" using the old effective type would be undefined. Hence, we may assume that p and q don't point to the same
> object.

Yes, C is quite different from C++ here.
GCC will feel free to move these particular stores around, even though it believes they can point anywhere, but it won't in my placement new C++ case, because there they *must* point to the same memory.

>
> I don't know whether that reasoning underlies the decision that GCC makes, but it would be a justification (assuming my reasoning above is correct).
>
>
> WRT the current TBAA proposal, this means we have to be aware that if we decide on a purely type/access-path based solution, we might be breaking a lot more code than we do now.
>
> Best,
> Arnold
On Wed, Mar 13, 2013 at 11:07 AM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:

>
>> The program I gave was well typed :)
>
>
> Hi, Daniel:
>    Thank you for sharing your insight. I didn't realize it is well-typed
> -- I'm basically a big nut of any std.
> I'd admit std/spec is some of the most boring material on this planet :-).
>
> So, if I understand correctly, your point is:
>    if a standard permits a type cast (even one that is in bad taste :-),
> TBAA has to respect that standard.

Yes.

>
> If that is strictly true, TBAA has to rely on points-to analysis.

We actually disable TBAA in some cases, and rely on points-to in some others.
It's "complicated" :)
I can go back through the code and list the cases and reasons if you think it would be helpful.

> However,
> that would virtually disable
> TBAA, as most points-to sets have an "unknown" element.

Well, the program you gave in the last message is fine.
It's okay to promote p->x even though it points to non-local variables, in *C*, because any read from q that actually read the same memory would be undefined.
C++ has cases where the two pointers are legally allowed to point to the same *memory*, though the objects there cannot be live at the *same time*.
So basically, you have motion barriers, and you may not be able to see them :)

>
> Going back to my previous mail,
>
>> In the below example, GCC assumes p and q point to anything because
>> they are incoming arguments.

I misspoke; it actually assumes they point to non-local variables.

Flow-insensitive points-to information

p_1(D), points-to non-local
q_2(D), points-to non-local

>>
>>>
>>> ------------------------------
>>> typedef struct {
>>>     int x;
>>> }T1;
>>>
>>> typedef struct {
>>>     int y;
>>> }T2;
>>>
>>> int foo(T1 *p, T2 *q) {
>>>     p->x = 1;
>>>     q->y = 4;
>>>     return p->x;
>>> }
>>> --------------------------
>
> Yes, gcc should assume p and q point to anything; however, the result
> contradicts the assumption --
> it promotes the p->x expression.

In both C and C++, a load from q->y would be undefined if they accessed the same memory.
This is different from the example I gave :)
Note that it actually just propagates p->x, because it knows the other store can't legally affect the *read*. It doesn't delete either store :)

>
> If I fabricate a caller by stealing some code from your previous example,
> see below.
> I think this code and your previous example (about placement new) fall under the
> same part of the std. I'm wondering
> if gcc can give a correct result.
>
> foo_caller() {
>     T1 t1;
>     T1 *pt1;
>     T2 *pt2 = new (pt1) T2;
>     foo(pt1, pt2);
> }

pt1 is not allowed to be read in foo in this case.

The original example I gave was one where using the alias info causes it to reorder a store to a live object above a store to a dead one, because it does not know it is dead.
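The propagation described above (the value of p->x is forwarded to the return, and both stores are kept) can be sketched by hand in C. This is an illustration of the effect, not actual GCC output: `foo_propagated` and `foo_demo` are hypothetical names, and the demo only calls the functions with distinct objects, where the original and the transformed version must agree.

```c
#include <assert.h>

typedef struct { int x; } T1;
typedef struct { int y; } T2;

/* The function from the thread, unchanged. */
int foo(T1 *p, T2 *q) {
    p->x = 1;
    q->y = 4;
    return p->x;
}

/* Hand-written sketch (an assumption, not compiler output) of the result
 * after propagation: both stores are still performed, only the reload of
 * p->x is replaced by the value that was just stored. */
int foo_propagated(T1 *p, T2 *q) {
    p->x = 1;
    q->y = 4;
    return 1;
}

/* Hypothetical driver: with distinct objects both versions return 1 and
 * leave the same values in memory. */
int foo_demo(void) {
    T1 a = {0}, c = {0};
    T2 b = {0}, d = {0};

    int r1 = foo(&a, &b);
    int r2 = foo_propagated(&c, &d);

    assert(a.x == 1 && b.y == 4);
    assert(c.x == 1 && d.y == 4);
    return r1 == 1 && r2 == 1;
}
```

The two versions could only differ if p and q referred to the same memory, which is the case the thread argues a well-defined C program cannot observe through this read.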