Daniel Berlin via llvm-dev
2017-Feb-13 07:23 UTC
[llvm-dev] RFC: Representing unions in TBAA
> I don't think this fully solves the problem -- you'll also need to fix
> getMostGenericTBAA. That is, even if you implement the above scheme,
> say you started out with:
>
>   union U {
>     int i;
>     float f;
>   };
>
>   float f(union U *u, int *ii, float *ff, bool c) {
>     if (c) {
>       *ii = 10;
>       *ff = 10.0;
>     } else {
>       u->i = 10;    // S0
>       u->f = 10.0;  // S1
>     }
>     return u->f;
>   }
>
> (I presume you're trying to avoid reordering S0 and S1?)
>
> SimplifyCFG or some other such pass may transform f to:
>
>   float f(union U *u, int *ii, float *ff, bool c) {
>     int *iptr = c ? ii : &(u->i);
>     float *fptr = c ? ff : &(u->f);
>     *iptr = 10;    // S2
>     *fptr = 10.0;  // S3
>     return u->f;
>   }
>
> then getMostGenericTBAA will infer scalar "int" TBAA for S2 and scalar
> "float" TBAA for S3, which will be NoAlias and allow the reordering
> you were trying to avoid.

FWIW, I have to read this in detail, but a few things pop out at me.

1. We would like to live in a world where we don't depend on TBAA
overriding BasicAA to get correct answers. We do now, but don't want to.
Hopefully this proposal does not make that impossible.

2. Literally the only way that GCC ends up getting this right is twofold:
It only guarantees things about direct access through the union.
If you take the address of the union member (like the transform above), it
knows it will get a wrong answer.
So what it does is find the type it has to stop at (here, the union)
to keep the TBAA set the same, and make the transform end there.
So the above would not occur.

3. A suggestion that TBAA follow all possible paths seems .. very slow.

4. "The main motivation for this is functional correctness of code using
unions". I believe you mean "with tbaa and strict-aliasing on".
If not, functional correctness for unions should not be in any way related
to requiring TBAA.

5. Unions are among the worst areas of the standard in terms of "nobody has
really thought super-hard about the interaction of aliasing and unions in a
way that is coherent".
So when you say things like 'necessary for functional correctness of
unions', just note that this is pretty much wrong. You probably mean
"necessary for a reasonable interpretation" or something.

Because we would be *functionally correct* by the standard by destroying
the program if you ever read the member you didn't set :)
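For readers following along, the distinction in point 2 between direct
access through the union and access through the address of a member can be
sketched in C roughly as below. This is only an illustration, not code from
the thread: the function names direct_access and through_pointers are
hypothetical, and the comments restate the behaviour described above rather
than anything verified against a particular compiler version.

```c
#include <assert.h>

union U {
    int i;
    float f;
};

/* Direct access: both stores name the union type, so a type-based alias
   analysis can see that the two members may overlap. */
float direct_access(union U *u) {
    assert(u != 0);
    u->i = 10;      /* like S0 in the quoted example */
    u->f = 10.0f;   /* like S1 in the quoted example */
    return u->f;
}

/* Access through plain pointers: at the stores the union type is no longer
   visible. Scalar TBAA sees only an "int" store and a "float" store and may
   treat them as NoAlias, which is the situation of S2 and S3. */
float through_pointers(int *ip, float *fp) {
    assert(ip != 0 && fp != 0);
    *ip = 10;
    *fp = 10.0f;
    return *fp;
}
```

Calling through_pointers with two unrelated objects is unproblematic; the
thread's concern is the case where both pointers were derived from members
of the same union object.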
Hubert Tong via llvm-dev
2017-Feb-13 18:07 UTC
[llvm-dev] RFC: Representing unions in TBAA
On Mon, Feb 13, 2017 at 2:23 AM, Daniel Berlin via llvm-dev
<llvm-dev at lists.llvm.org> wrote:

> 5. Unions are among the worst areas of the standard in terms of "nobody has
> really thought super-hard about the interaction of aliasing and unions in a
> way that is coherent".
> So when you say things like 'necessary for functional correctness of
> unions', just note that this is pretty much wrong. You probably mean
> "necessary for a reasonable interpretation" or something.
>
> Because we would be *functionally correct* by the standard by destroying
> the program if you ever read the member you didn't set :)

C11 subclause 6.5.2.3 paragraph 3, has in footnote 95:

  If the member used to read the contents of a union object is not the same
  as the member last used to store a value in the object, the appropriate
  part of the object representation of the value is reinterpreted as an
  object representation in the new type as described in 6.2.6 (a process
  sometimes called "type punning"). This might be a trap representation.

So, the intent is at least that the use of the . operator or the ->
operator to access a member of a union would "safely" perform type punning.
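As a small illustration of the footnote, the sketch below stores a float
through one union member and reads the same bytes back through another,
using only the . operator. It is not from the thread: the union FloatBits
and both function names are hypothetical, and it assumes float and uint32_t
have the same size. The memcpy version is included only as a point of
comparison, since it copies the object representation without relying on
the union rules at all.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical union for illustration only. */
union FloatBits {
    float    f;
    uint32_t u;
};

/* Store through member f, then read through member u, both via '.'.
   Footnote 95 describes this read as a reinterpretation of the stored
   object representation in the new type. */
uint32_t bits_via_union(float x) {
    union FloatBits fb;
    fb.f = x;
    return fb.u;
}

/* Copy the object representation byte by byte instead. */
uint32_t bits_via_memcpy(float x) {
    uint32_t u;
    assert(sizeof u == sizeof x);  /* assumption: 32-bit float */
    memcpy(&u, &x, sizeof u);
    return u;
}
```

Under that size assumption, both functions should return the same bit
pattern for any given input.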
Daniel Berlin via llvm-dev
2017-Feb-14 03:39 UTC
[llvm-dev] RFC: Representing unions in TBAA
On Mon, Feb 13, 2017 at 10:07 AM, Hubert Tong
<hubert.reinterpretcast at gmail.com> wrote:

> C11 subclause 6.5.2.3 paragraph 3, has in footnote 95:
> If the member used to read the contents of a union object is not the same
> as the member last used to store a value in the object, the appropriate
> part of the object representation of the value is reinterpreted as an
> object representation in the new type as described in 6.2.6 (a process
> sometimes called "type punning"). This might be a trap representation.
>
> So, the intent is at least that the use of the . operator or the ->
> operator to access a member of a union would "safely" perform type punning.

Certainly, if you can quote this, you know this is new to C11 (and newer
versions of C++). :)

It was explicitly *not* true in earlier versions.

They've also slowly cleaned up the aliasing rules, but, honestly, still a
mess.