Hal Finkel via llvm-dev
2016-Feb-25  05:34 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
----- Original Message -----> From: "Chandler Carruth via llvm-dev" <llvm-dev at lists.llvm.org> > To: "Philip Reames" <listmail at philipreames.com>, "Duncan P. N. Exon > Smith" <dexonsmith at apple.com>, "Sanjoy Das" > <sanjoy at playingwithpointers.com> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org> > Sent: Wednesday, February 24, 2016 10:29:23 PM > Subject: Re: [llvm-dev] Possible soundness issue with > available_externally (split from "RFC: Add guard intrinsics")> Yea, I'm pretty sad about all of this. I'm also not seeing a lot of > awesome paths forward.> Here is the least bad strategy I can come up with. Curious if folks > think this is sufficient:I may not completely understand the problem, but this seems like overkill. The underlying restriction is that, if the compiler makes a non-determinism-collapsing choice when optimizing a function, it must make the same choice for all definitions of that function (undefined behavior excluded). Thus, with an externally_available function, the CSE in Sanjoy's original example should be forbidden. Richard's example again demonstrates this principle, although in this case the non-determinism is in the choice of a globally-visible implementation technique rather than non-determinism from memory-subsystem reordering. There is a complication, which you imply in your proposal, that such optimizations need to be forbidden not just in the externally_available functions themselves, but in any local function transitively called by one. This, however, we can take care of with an (easily-deduced) attribute. In short, it is not clear to me that the number of problematic optimizations is large (seems likely restricted to things involving atomics in practice), and while I understand the auditing difficulties here, we should just restrict these in appropriate contexts instead of trying to restrict all information flow into or out of comdats. -Hal> 1) Stop deducing function attributes within comdats by examining the > bodies of the functions (so that we remain free to transform the > bodies of functions). > 2) Teach frontends to emit (even at O0!!!) trivially deduced function > attributes for comdats so that we continue to catch easy cases. > 3) Ensure and specify that we never hoist code *into* a comdat group > in which it would not have been executed previously. I don't know of > anything in LLVM that does this today, but it would become an > important invariant. > 4) Work a lot harder to do internalizing and removing of this > restriction.> Pretty horrible. But I think it is correct.> As a slight modification to #1 and #2, we could have a very carefully > crafted deduction rule where we only deduce function attributes for > functions prior to any modification of their function bodies. Such > attributes should be conservatively correct because we would never > lift new code into the function bodies. This would at least allow us > to do bottom-up deduction to catch interprocedural cases. But it > would become incredibly subtle that this is only valid prior to > *any* transformations of the comdat-containing functions.> I'm starting to think this subtle rule might be worth it. But I'm > frankly terrified by the implications.> On Wed, Feb 24, 2016 at 8:13 PM Philip Reames via llvm-dev < > llvm-dev at lists.llvm.org > wrote:> > On 02/24/2016 08:10 PM, Duncan P. N. Exon Smith via llvm-dev wrote: > > > >> On 2016-Feb-24, at 19:46, Sanjoy Das < > > >> sanjoy at playingwithpointers.com > wrote: > > > >> > > > >> On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth < > > >> chandlerc at google.com > wrote: > > > >>> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith > > > >>> < dexonsmith at apple.com > wrote: > > > >>>> > > > >>>>> On 2016-Feb-24, at 19:17, Chandler Carruth < > > >>>>> chandlerc at google.com > wrote: > > > >>>>> > > > >>>>> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev > > > >>>>> < llvm-dev at lists.llvm.org > wrote: > > > >>>>> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith > > > >>>>> < dexonsmith at apple.com > wrote: > > > >>>>>>> If we do not inline @foo(), and instead re-link the call > > >>>>>>> site > > >>>>>>> in > > > >>>>>>> @main > > > >>>>>>> to some non-optimized copy (or differently optimized copy) > > >>>>>>> of > > >>>>>>> @foo, > > > >>>>>>> then it is possible for the program to have the behavior > > >>>>>>> {print("Y"); > > > >>>>>>> print ("X")}, which was disallowed in the earlier program. > > > >>>>>>> > > > >>>>>>> In other words, opt refined the semantics of @foo() (i.e. > > >>>>>>> reduced the > > > >>>>>>> set of behaviors it may have) in ways that would make later > > > >>>>>>> optimizations invalid if we de-refine the implementation of > > >>>>>>> @foo(). > > > >>>>>> I'm probably missing something obvious here. How could the > > >>>>>> result of > > > >>>>>> `%t0 != %t1` be different at optimization time in one file > > >>>>>> than from > > > >>>>>> runtime in the "real" implementation? Doesn't this make the > > >>>>>> CSE > > > >>>>>> invalid? > > > >>>>> `%t0` and `%t1` are "allowed" to "always be the same", i.e. > > >>>>> an > > > >>>>> implementation of @foo that always feeds in the same > > > >>>>> value for `%t0` and `%t1` is a valid implementation (which is > > >>>>> why the > > > >>>>> CSE was valid); but it is not the *only* valid > > >>>>> implementation. > > >>>>> If I > > > >>>>> don't CSE the two load instructions (also a valid thing to > > >>>>> do), > > >>>>> and > > > >>>>> this is a second thread writing to `%par`, then the two > > >>>>> values > > >>>>> loaded > > > >>>>> can be different, and you could end up printing `"X"` in > > >>>>> `@foo`. > > > >>>>> > > > >>>>> Did that make sense? > > > >>>> Yes. To be sure I understand the scope: this is only a problem > > >>>> for > > > >>>> atomics, correct? (Because multi-threaded behaviour with other > > >>>> globals > > > >>>> is UB?) > > > >>>> > > > >>>>>> Does linkonce_odr linkage have the same problem? > > > >>>>>> - If so, do you want to change it too? > > > >>>>>> - Else, why not? > > > >>>>> Going by the specification in the LangRef, I'd say it depends > > >>>>> on how > > > >>>>> you define "definitive". If you're allowed to replace the > > >>>>> body > > >>>>> of a > > > >>>>> function with a differently optimized body, then the above > > >>>>> problem > > > >>>>> exists. > > > >>>>> > > > >>>>> I believe that is the case, and I strongly believe the > > >>>>> problem > > >>>>> you > > > >>>>> outline exists for linkonce_odr exactly as it does for > > >>>>> available_externally. > > > >>>>> > > > >>>>> Which is what makes this scary: every C++ inline function > > >>>>> today > > >>>>> can > > > >>>>> trigger this. > > > >>>> Every C/C++ inline or template function. But only the ones > > >>>> that > > >>>> use > > > >>>> atomics, right? > > > >>> > > > >>> Well, with *this* example... > > > >> Atomic are one source of non-determinism that compilers can > > >> reason > > > >> about. I don't know if the following snippet is well defined or > > >> not, > > > >> but you could have similar issues with > > > >> > > > >> > > > >> void foo() { > > > >> int *p = malloc(sizeof(int)); > > > >> if (*p < 10) print("X"); > > > >> } > > > >> > > > >> or (again, I don't know if this is actually well defined) > > > >> > > > >> void foo() { > > > >> int t; // it is probably reasonable to fold compares with > > > >> ptrtoint(alloca) to undef > > > >> if ((intptr_t)(&t) < 10) print("X"); > > > >> } > > > >> > > > > The first one at least is UB, but as Richard pointed out the > > > scope > > > > is certainly broader than atomics (it's not even just > > > well-defined > > > > non-deterministism). > > > > > > > > I'm kind of terrified by the implications. > > > Me too. :( > > > > > > > >> -- Sanjoy > > > >> > > > >>>> > > > >>>> Not that I'm sure that will end up being a helpful > > >>>> distinction. > > > >>> > > > >>> Right. See Richard's comment. I think that sums up the real > > >>> issue > > >>> here. =/ > > > > _______________________________________________ > > > > LLVM Developers mailing list > > > > llvm-dev at lists.llvm.org > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > _______________________________________________ > > > LLVM Developers mailing list > > > llvm-dev at lists.llvm.org > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-- -- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Chandler Carruth via llvm-dev
2016-Feb-25  05:41 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Wed, Feb 24, 2016 at 9:35 PM Hal Finkel <hfinkel at anl.gov> wrote:> ----- Original Message ----- > > > From: "Chandler Carruth via llvm-dev" <llvm-dev at lists.llvm.org> > > To: "Philip Reames" <listmail at philipreames.com>, "Duncan P. N. Exon > > Smith" <dexonsmith at apple.com>, "Sanjoy Das" > > <sanjoy at playingwithpointers.com> > > Cc: "llvm-dev" <llvm-dev at lists.llvm.org> > > Sent: Wednesday, February 24, 2016 10:29:23 PM > > Subject: Re: [llvm-dev] Possible soundness issue with > > available_externally (split from "RFC: Add guard intrinsics") > > > Yea, I'm pretty sad about all of this. I'm also not seeing a lot of > > awesome paths forward. > > > Here is the least bad strategy I can come up with. Curious if folks > > think this is sufficient: > > I may not completely understand the problem, but this seems like overkill. > The underlying restriction is that, if the compiler makes a > non-determinism-collapsing choice when optimizing a function, it must make > the same choice for all definitions of that function (undefined behavior > excluded).This isn't enough, because some definition in some other module may *not be optimized at all*, and yet may get selected at link time. Put another way, it must *prove* that the same choice will *always* be made for all definitions. This is akin to proving that the optimizer is run over all translation units for C++ linkonce_odr functions, which you can't do. The result would be failing to optimize the bodies of linkonce_odr functions in any way which was externally detectable such as this. I think that would be *much* worse than losing the ability to do function attribute deduction for such functions? Thus, with an externally_available function, the CSE in Sanjoy's original> example should be forbidden. Richard's example again demonstrates this > principle, although in this case the non-determinism is in the choice of a > globally-visible implementation technique rather than non-determinism from > memory-subsystem reordering. > > There is a complication, which you imply in your proposal, that such > optimizations need to be forbidden not just in the externally_available > functions themselves, but in any local function transitively called by one. > This, however, we can take care of with an (easily-deduced) attribute. > > In short, it is not clear to me that the number of problematic > optimizations is large (seems likely restricted to things involving atomics > in practice), and while I understand the auditing difficulties here, we > should just restrict these in appropriate contexts instead of trying to > restrict all information flow into or out of comdats. > > -Hal > > > 1) Stop deducing function attributes within comdats by examining the > > bodies of the functions (so that we remain free to transform the > > bodies of functions). > > 2) Teach frontends to emit (even at O0!!!) trivially deduced function > > attributes for comdats so that we continue to catch easy cases. > > 3) Ensure and specify that we never hoist code *into* a comdat group > > in which it would not have been executed previously. I don't know of > > anything in LLVM that does this today, but it would become an > > important invariant. > > 4) Work a lot harder to do internalizing and removing of this > > restriction. > > > Pretty horrible. But I think it is correct. > > > As a slight modification to #1 and #2, we could have a very carefully > > crafted deduction rule where we only deduce function attributes for > > functions prior to any modification of their function bodies. Such > > attributes should be conservatively correct because we would never > > lift new code into the function bodies. This would at least allow us > > to do bottom-up deduction to catch interprocedural cases. But it > > would become incredibly subtle that this is only valid prior to > > *any* transformations of the comdat-containing functions. > > > I'm starting to think this subtle rule might be worth it. But I'm > > frankly terrified by the implications. > > > On Wed, Feb 24, 2016 at 8:13 PM Philip Reames via llvm-dev < > > llvm-dev at lists.llvm.org > wrote: > > > > On 02/24/2016 08:10 PM, Duncan P. N. Exon Smith via llvm-dev wrote: > > > > > >> On 2016-Feb-24, at 19:46, Sanjoy Das < > > > >> sanjoy at playingwithpointers.com > wrote: > > > > > >> > > > > > >> On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth < > > > >> chandlerc at google.com > wrote: > > > > > >>> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith > > > > > >>> < dexonsmith at apple.com > wrote: > > > > > >>>> > > > > > >>>>> On 2016-Feb-24, at 19:17, Chandler Carruth < > > > >>>>> chandlerc at google.com > wrote: > > > > > >>>>> > > > > > >>>>> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev > > > > > >>>>> < llvm-dev at lists.llvm.org > wrote: > > > > > >>>>> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith > > > > > >>>>> < dexonsmith at apple.com > wrote: > > > > > >>>>>>> If we do not inline @foo(), and instead re-link the call > > > >>>>>>> site > > > >>>>>>> in > > > > > >>>>>>> @main > > > > > >>>>>>> to some non-optimized copy (or differently optimized copy) > > > >>>>>>> of > > > >>>>>>> @foo, > > > > > >>>>>>> then it is possible for the program to have the behavior > > > >>>>>>> {print("Y"); > > > > > >>>>>>> print ("X")}, which was disallowed in the earlier program. > > > > > >>>>>>> > > > > > >>>>>>> In other words, opt refined the semantics of @foo() (i.e. > > > >>>>>>> reduced the > > > > > >>>>>>> set of behaviors it may have) in ways that would make later > > > > > >>>>>>> optimizations invalid if we de-refine the implementation of > > > >>>>>>> @foo(). > > > > > >>>>>> I'm probably missing something obvious here. How could the > > > >>>>>> result of > > > > > >>>>>> `%t0 != %t1` be different at optimization time in one file > > > >>>>>> than from > > > > > >>>>>> runtime in the "real" implementation? Doesn't this make the > > > >>>>>> CSE > > > > > >>>>>> invalid? > > > > > >>>>> `%t0` and `%t1` are "allowed" to "always be the same", i.e. > > > >>>>> an > > > > > >>>>> implementation of @foo that always feeds in the same > > > > > >>>>> value for `%t0` and `%t1` is a valid implementation (which is > > > >>>>> why the > > > > > >>>>> CSE was valid); but it is not the *only* valid > > > >>>>> implementation. > > > >>>>> If I > > > > > >>>>> don't CSE the two load instructions (also a valid thing to > > > >>>>> do), > > > >>>>> and > > > > > >>>>> this is a second thread writing to `%par`, then the two > > > >>>>> values > > > >>>>> loaded > > > > > >>>>> can be different, and you could end up printing `"X"` in > > > >>>>> `@foo`. > > > > > >>>>> > > > > > >>>>> Did that make sense? > > > > > >>>> Yes. To be sure I understand the scope: this is only a problem > > > >>>> for > > > > > >>>> atomics, correct? (Because multi-threaded behaviour with other > > > >>>> globals > > > > > >>>> is UB?) > > > > > >>>> > > > > > >>>>>> Does linkonce_odr linkage have the same problem? > > > > > >>>>>> - If so, do you want to change it too? > > > > > >>>>>> - Else, why not? > > > > > >>>>> Going by the specification in the LangRef, I'd say it depends > > > >>>>> on how > > > > > >>>>> you define "definitive". If you're allowed to replace the > > > >>>>> body > > > >>>>> of a > > > > > >>>>> function with a differently optimized body, then the above > > > >>>>> problem > > > > > >>>>> exists. > > > > > >>>>> > > > > > >>>>> I believe that is the case, and I strongly believe the > > > >>>>> problem > > > >>>>> you > > > > > >>>>> outline exists for linkonce_odr exactly as it does for > > > >>>>> available_externally. > > > > > >>>>> > > > > > >>>>> Which is what makes this scary: every C++ inline function > > > >>>>> today > > > >>>>> can > > > > > >>>>> trigger this. > > > > > >>>> Every C/C++ inline or template function. But only the ones > > > >>>> that > > > >>>> use > > > > > >>>> atomics, right? > > > > > >>> > > > > > >>> Well, with *this* example... > > > > > >> Atomic are one source of non-determinism that compilers can > > > >> reason > > > > > >> about. I don't know if the following snippet is well defined or > > > >> not, > > > > > >> but you could have similar issues with > > > > > >> > > > > > >> > > > > > >> void foo() { > > > > > >> int *p = malloc(sizeof(int)); > > > > > >> if (*p < 10) print("X"); > > > > > >> } > > > > > >> > > > > > >> or (again, I don't know if this is actually well defined) > > > > > >> > > > > > >> void foo() { > > > > > >> int t; // it is probably reasonable to fold compares with > > > > > >> ptrtoint(alloca) to undef > > > > > >> if ((intptr_t)(&t) < 10) print("X"); > > > > > >> } > > > > > >> > > > > > > The first one at least is UB, but as Richard pointed out the > > > > scope > > > > > > is certainly broader than atomics (it's not even just > > > > well-defined > > > > > > non-deterministism). > > > > > > > > > > > > I'm kind of terrified by the implications. > > > > > Me too. :( > > > > > > > > > > > >> -- Sanjoy > > > > > >> > > > > > >>>> > > > > > >>>> Not that I'm sure that will end up being a helpful > > > >>>> distinction. > > > > > >>> > > > > > >>> Right. See Richard's comment. I think that sums up the real > > > >>> issue > > > >>> here. =/ > > > > > > _______________________________________________ > > > > > > LLVM Developers mailing list > > > > > > llvm-dev at lists.llvm.org > > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > > > _______________________________________________ > > > > > LLVM Developers mailing list > > > > > llvm-dev at lists.llvm.org > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > -- > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160225/09a736c4/attachment.html>
Hal Finkel via llvm-dev
2016-Feb-25  06:00 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
----- Original Message -----> From: "Chandler Carruth" <chandlerc at google.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip Reames" > <listmail at philipreames.com>, "Duncan P. N. Exon Smith" > <dexonsmith at apple.com>, "Sanjoy Das" > <sanjoy at playingwithpointers.com> > Sent: Wednesday, February 24, 2016 11:41:59 PM > Subject: Re: [llvm-dev] Possible soundness issue with > available_externally (split from "RFC: Add guard intrinsics")> On Wed, Feb 24, 2016 at 9:35 PM Hal Finkel < hfinkel at anl.gov > wrote:> > ----- Original Message ----- >> > > From: "Chandler Carruth via llvm-dev" < llvm-dev at lists.llvm.org > > > > > To: "Philip Reames" < listmail at philipreames.com >, "Duncan P. N. > > > Exon > > > > Smith" < dexonsmith at apple.com >, "Sanjoy Das" > > > > < sanjoy at playingwithpointers.com > > > > > Cc: "llvm-dev" < llvm-dev at lists.llvm.org > > > > > Sent: Wednesday, February 24, 2016 10:29:23 PM > > > > Subject: Re: [llvm-dev] Possible soundness issue with > > > > available_externally (split from "RFC: Add guard intrinsics") >> > > Yea, I'm pretty sad about all of this. I'm also not seeing a lot > > > of > > > > awesome paths forward. >> > > Here is the least bad strategy I can come up with. Curious if > > > folks > > > > think this is sufficient: >> > I may not completely understand the problem, but this seems like > > overkill. The underlying restriction is that, if the compiler makes > > a non-determinism-collapsing choice when optimizing a function, it > > must make the same choice for all definitions of that function > > (undefined behavior excluded). > > This isn't enough, because some definition in some other module may > *not be optimized at all*, and yet may get selected at link time.> Put another way, it must *prove* that the same choice will *always* > be made for all definitions. This is akin to proving that the > optimizer is run over all translation units for C++ linkonce_odr > functions, which you can't do.Sure; which is way I said that we should not perform those optimizations (instead of saying that we just need to make sure that the same choice will be made everywhere - as you say, LTO aside, we can't do that).> The result would be failing to optimize the bodies of linkonce_odr > functions in any way which was externally detectable such as this. I > think that would be *much* worse than losing the ability to do > function attribute deduction for such functions?But it is not all optimizations that are the problem. Rather, it seems like a select few (e.g. things involving collapsing allowed non-determinism in atomics), and losing those optimizations seems better than generally losing function-attribute deduction. -Hal> > Thus, with an externally_available function, the CSE in Sanjoy's > > original example should be forbidden. Richard's example again > > demonstrates this principle, although in this case the > > non-determinism is in the choice of a globally-visible > > implementation technique rather than non-determinism from > > memory-subsystem reordering. >> > There is a complication, which you imply in your proposal, that > > such > > optimizations need to be forbidden not just in the > > externally_available functions themselves, but in any local > > function > > transitively called by one. This, however, we can take care of with > > an (easily-deduced) attribute. >> > In short, it is not clear to me that the number of problematic > > optimizations is large (seems likely restricted to things involving > > atomics in practice), and while I understand the auditing > > difficulties here, we should just restrict these in appropriate > > contexts instead of trying to restrict all information flow into or > > out of comdats. >> > -Hal >> > > 1) Stop deducing function attributes within comdats by examining > > > the > > > > bodies of the functions (so that we remain free to transform the > > > > bodies of functions). > > > > 2) Teach frontends to emit (even at O0!!!) trivially deduced > > > function > > > > attributes for comdats so that we continue to catch easy cases. > > > > 3) Ensure and specify that we never hoist code *into* a comdat > > > group > > > > in which it would not have been executed previously. I don't know > > > of > > > > anything in LLVM that does this today, but it would become an > > > > important invariant. > > > > 4) Work a lot harder to do internalizing and removing of this > > > > restriction. >> > > Pretty horrible. But I think it is correct. >> > > As a slight modification to #1 and #2, we could have a very > > > carefully > > > > crafted deduction rule where we only deduce function attributes > > > for > > > > functions prior to any modification of their function bodies. > > > Such > > > > attributes should be conservatively correct because we would > > > never > > > > lift new code into the function bodies. This would at least allow > > > us > > > > to do bottom-up deduction to catch interprocedural cases. But it > > > > would become incredibly subtle that this is only valid prior to > > > > *any* transformations of the comdat-containing functions. >> > > I'm starting to think this subtle rule might be worth it. But I'm > > > > frankly terrified by the implications. >> > > On Wed, Feb 24, 2016 at 8:13 PM Philip Reames via llvm-dev < > > > > llvm-dev at lists.llvm.org > wrote: >> > > > On 02/24/2016 08:10 PM, Duncan P. N. Exon Smith via llvm-dev > > > > wrote: > > > > > > > > > >> On 2016-Feb-24, at 19:46, Sanjoy Das < > > > > > >> sanjoy at playingwithpointers.com > wrote: > > > > > > > > > >> > > > > > > > > > >> On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth < > > > > > >> chandlerc at google.com > wrote: > > > > > > > > > >>> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith > > > > > > > > > >>> < dexonsmith at apple.com > wrote: > > > > > > > > > >>>> > > > > > > > > > >>>>> On 2016-Feb-24, at 19:17, Chandler Carruth < > > > > > >>>>> chandlerc at google.com > wrote: > > > > > > > > > >>>>> > > > > > > > > > >>>>> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev > > > > > > > > > >>>>> < llvm-dev at lists.llvm.org > wrote: > > > > > > > > > >>>>> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith > > > > > > > > > >>>>> < dexonsmith at apple.com > wrote: > > > > > > > > > >>>>>>> If we do not inline @foo(), and instead re-link the > > > > >>>>>>> call > > > > > >>>>>>> site > > > > > >>>>>>> in > > > > > > > > > >>>>>>> @main > > > > > > > > > >>>>>>> to some non-optimized copy (or differently optimized > > > > >>>>>>> copy) > > > > > >>>>>>> of > > > > > >>>>>>> @foo, > > > > > > > > > >>>>>>> then it is possible for the program to have the > > > > >>>>>>> behavior > > > > > >>>>>>> {print("Y"); > > > > > > > > > >>>>>>> print ("X")}, which was disallowed in the earlier > > > > >>>>>>> program. > > > > > > > > > >>>>>>> > > > > > > > > > >>>>>>> In other words, opt refined the semantics of @foo() > > > > >>>>>>> (i.e. > > > > > >>>>>>> reduced the > > > > > > > > > >>>>>>> set of behaviors it may have) in ways that would make > > > > >>>>>>> later > > > > > > > > > >>>>>>> optimizations invalid if we de-refine the > > > > >>>>>>> implementation > > > > >>>>>>> of > > > > > >>>>>>> @foo(). > > > > > > > > > >>>>>> I'm probably missing something obvious here. How could > > > > >>>>>> the > > > > > >>>>>> result of > > > > > > > > > >>>>>> `%t0 != %t1` be different at optimization time in one > > > > >>>>>> file > > > > > >>>>>> than from > > > > > > > > > >>>>>> runtime in the "real" implementation? Doesn't this make > > > > >>>>>> the > > > > > >>>>>> CSE > > > > > > > > > >>>>>> invalid? > > > > > > > > > >>>>> `%t0` and `%t1` are "allowed" to "always be the same", > > > > >>>>> i.e. > > > > > >>>>> an > > > > > > > > > >>>>> implementation of @foo that always feeds in the same > > > > > > > > > >>>>> value for `%t0` and `%t1` is a valid implementation > > > > >>>>> (which > > > > >>>>> is > > > > > >>>>> why the > > > > > > > > > >>>>> CSE was valid); but it is not the *only* valid > > > > > >>>>> implementation. > > > > > >>>>> If I > > > > > > > > > >>>>> don't CSE the two load instructions (also a valid thing > > > > >>>>> to > > > > > >>>>> do), > > > > > >>>>> and > > > > > > > > > >>>>> this is a second thread writing to `%par`, then the two > > > > > >>>>> values > > > > > >>>>> loaded > > > > > > > > > >>>>> can be different, and you could end up printing `"X"` in > > > > > >>>>> `@foo`. > > > > > > > > > >>>>> > > > > > > > > > >>>>> Did that make sense? > > > > > > > > > >>>> Yes. To be sure I understand the scope: this is only a > > > > >>>> problem > > > > > >>>> for > > > > > > > > > >>>> atomics, correct? (Because multi-threaded behaviour with > > > > >>>> other > > > > > >>>> globals > > > > > > > > > >>>> is UB?) > > > > > > > > > >>>> > > > > > > > > > >>>>>> Does linkonce_odr linkage have the same problem? > > > > > > > > > >>>>>> - If so, do you want to change it too? > > > > > > > > > >>>>>> - Else, why not? > > > > > > > > > >>>>> Going by the specification in the LangRef, I'd say it > > > > >>>>> depends > > > > > >>>>> on how > > > > > > > > > >>>>> you define "definitive". If you're allowed to replace the > > > > > >>>>> body > > > > > >>>>> of a > > > > > > > > > >>>>> function with a differently optimized body, then the > > > > >>>>> above > > > > > >>>>> problem > > > > > > > > > >>>>> exists. > > > > > > > > > >>>>> > > > > > > > > > >>>>> I believe that is the case, and I strongly believe the > > > > > >>>>> problem > > > > > >>>>> you > > > > > > > > > >>>>> outline exists for linkonce_odr exactly as it does for > > > > > >>>>> available_externally. > > > > > > > > > >>>>> > > > > > > > > > >>>>> Which is what makes this scary: every C++ inline function > > > > > >>>>> today > > > > > >>>>> can > > > > > > > > > >>>>> trigger this. > > > > > > > > > >>>> Every C/C++ inline or template function. But only the ones > > > > > >>>> that > > > > > >>>> use > > > > > > > > > >>>> atomics, right? > > > > > > > > > >>> > > > > > > > > > >>> Well, with *this* example... > > > > > > > > > >> Atomic are one source of non-determinism that compilers can > > > > > >> reason > > > > > > > > > >> about. I don't know if the following snippet is well defined > > > > >> or > > > > > >> not, > > > > > > > > > >> but you could have similar issues with > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> void foo() { > > > > > > > > > >> int *p = malloc(sizeof(int)); > > > > > > > > > >> if (*p < 10) print("X"); > > > > > > > > > >> } > > > > > > > > > >> > > > > > > > > > >> or (again, I don't know if this is actually well defined) > > > > > > > > > >> > > > > > > > > > >> void foo() { > > > > > > > > > >> int t; // it is probably reasonable to fold compares with > > > > > > > > > >> ptrtoint(alloca) to undef > > > > > > > > > >> if ((intptr_t)(&t) < 10) print("X"); > > > > > > > > > >> } > > > > > > > > > >> > > > > > > > > > > The first one at least is UB, but as Richard pointed out the > > > > > > scope > > > > > > > > > > is certainly broader than atomics (it's not even just > > > > > > well-defined > > > > > > > > > > non-deterministism). > > > > > > > > > > > > > > > > > > > > I'm kind of terrified by the implications. > > > > > > > > > Me too. :( > > > > > > > > > > > > > > > > > > > >> -- Sanjoy > > > > > > > > > >> > > > > > > > > > >>>> > > > > > > > > > >>>> Not that I'm sure that will end up being a helpful > > > > > >>>> distinction. > > > > > > > > > >>> > > > > > > > > > >>> Right. See Richard's comment. I think that sums up the real > > > > > >>> issue > > > > > >>> here. =/ > > > > > > > > > > _______________________________________________ > > > > > > > > > > LLVM Developers mailing list > > > > > > > > > > llvm-dev at lists.llvm.org > > > > > > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > >> > > > _______________________________________________ > > > > > > > > > LLVM Developers mailing list > > > > > > > > > llvm-dev at lists.llvm.org > > > > > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > >> > > _______________________________________________ > > > > LLVM Developers mailing list > > > > llvm-dev at lists.llvm.org > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > -- >> > -- > > > Hal Finkel > > > Assistant Computational Scientist > > > Leadership Computing Facility > > > Argonne National Laboratory >-- -- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Mehdi Amini via llvm-dev
2016-Feb-25  06:02 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
> On Feb 24, 2016, at 9:41 PM, Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > On Wed, Feb 24, 2016 at 9:35 PM Hal Finkel <hfinkel at anl.gov <mailto:hfinkel at anl.gov>> wrote: > ----- Original Message ----- > > > From: "Chandler Carruth via llvm-dev" <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> > > To: "Philip Reames" <listmail at philipreames.com <mailto:listmail at philipreames.com>>, "Duncan P. N. Exon > > Smith" <dexonsmith at apple.com <mailto:dexonsmith at apple.com>>, "Sanjoy Das" > > <sanjoy at playingwithpointers.com <mailto:sanjoy at playingwithpointers.com>> > > Cc: "llvm-dev" <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> > > Sent: Wednesday, February 24, 2016 10:29:23 PM > > Subject: Re: [llvm-dev] Possible soundness issue with > > available_externally (split from "RFC: Add guard intrinsics") > > > Yea, I'm pretty sad about all of this. I'm also not seeing a lot of > > awesome paths forward. > > > Here is the least bad strategy I can come up with. Curious if folks > > think this is sufficient: > > I may not completely understand the problem, but this seems like overkill. The underlying restriction is that, if the compiler makes a non-determinism-collapsing choice when optimizing a function, it must make the same choice for all definitions of that function (undefined behavior excluded). > > This isn't enough, because some definition in some other module may *not be optimized at all*, and yet may get selected at link time. > > Put another way, it must *prove* that the same choice will *always* be made for all definitions. This is akin to proving that the optimizer is run over all translation units for C++ linkonce_odr functions, which you can't do.Even if the optimizer is ran, it could take different decision because the context would be different: linkonce_odr foo() { bar(); } If bar() is present in the TU it can gets inlined into foo(). So the optimizer would optimize differently foo().> > The result would be failing to optimize the bodies of linkonce_odr functions in any way which was externally detectable such as this. I think that would be *much* worse than losing the ability to do function attribute deduction for such functions? > > Thus, with an externally_available function, the CSE in Sanjoy's original example should be forbidden. Richard's example again demonstrates this principle, although in this case the non-determinism is in the choice of a globally-visible implementation technique rather than non-determinism from memory-subsystem reordering. > > There is a complication, which you imply in your proposal, that such optimizations need to be forbidden not just in the externally_available functions themselves, but in any local function transitively called by one. This, however, we can take care of with an (easily-deduced) attribute.I'm not sure why " such optimizations need to be forbidden [...] in any local function transitively called by one", can you illustrate with an example? -- Mehdi> > In short, it is not clear to me that the number of problematic optimizations is large (seems likely restricted to things involving atomics in practice), > and while I understand the auditing difficulties here, we should just restrict these in appropriate contexts instead of trying to restrict all information flow into or out of comdats. > > -Hal > > > 1) Stop deducing function attributes within comdats by examining the > > bodies of the functions (so that we remain free to transform the > > bodies of functions). > > 2) Teach frontends to emit (even at O0!!!) trivially deduced function > > attributes for comdats so that we continue to catch easy cases. > > 3) Ensure and specify that we never hoist code *into* a comdat group > > in which it would not have been executed previously. I don't know of > > anything in LLVM that does this today, but it would become an > > important invariant. > > 4) Work a lot harder to do internalizing and removing of this > > restriction. > > > Pretty horrible. But I think it is correct. > > > As a slight modification to #1 and #2, we could have a very carefully > > crafted deduction rule where we only deduce function attributes for > > functions prior to any modification of their function bodies. Such > > attributes should be conservatively correct because we would never > > lift new code into the function bodies. This would at least allow us > > to do bottom-up deduction to catch interprocedural cases. But it > > would become incredibly subtle that this is only valid prior to > > *any* transformations of the comdat-containing functions. > > > I'm starting to think this subtle rule might be worth it. But I'm > > frankly terrified by the implications. > > > On Wed, Feb 24, 2016 at 8:13 PM Philip Reames via llvm-dev < > > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > wrote: > > > > On 02/24/2016 08:10 PM, Duncan P. N. Exon Smith via llvm-dev wrote: > > > > > >> On 2016-Feb-24, at 19:46, Sanjoy Das < > > > >> sanjoy at playingwithpointers.com <mailto:sanjoy at playingwithpointers.com> > wrote: > > > > > >> > > > > > >> On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth < > > > >> chandlerc at google.com <mailto:chandlerc at google.com> > wrote: > > > > > >>> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith > > > > > >>> < dexonsmith at apple.com <mailto:dexonsmith at apple.com> > wrote: > > > > > >>>> > > > > > >>>>> On 2016-Feb-24, at 19:17, Chandler Carruth < > > > >>>>> chandlerc at google.com <mailto:chandlerc at google.com> > wrote: > > > > > >>>>> > > > > > >>>>> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev > > > > > >>>>> < llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > wrote: > > > > > >>>>> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith > > > > > >>>>> < dexonsmith at apple.com <mailto:dexonsmith at apple.com> > wrote: > > > > > >>>>>>> If we do not inline @foo(), and instead re-link the call > > > >>>>>>> site > > > >>>>>>> in > > > > > >>>>>>> @main > > > > > >>>>>>> to some non-optimized copy (or differently optimized copy) > > > >>>>>>> of > > > >>>>>>> @foo, > > > > > >>>>>>> then it is possible for the program to have the behavior > > > >>>>>>> {print("Y"); > > > > > >>>>>>> print ("X")}, which was disallowed in the earlier program. > > > > > >>>>>>> > > > > > >>>>>>> In other words, opt refined the semantics of @foo() (i.e. > > > >>>>>>> reduced the > > > > > >>>>>>> set of behaviors it may have) in ways that would make later > > > > > >>>>>>> optimizations invalid if we de-refine the implementation of > > > >>>>>>> @foo(). > > > > > >>>>>> I'm probably missing something obvious here. How could the > > > >>>>>> result of > > > > > >>>>>> `%t0 != %t1` be different at optimization time in one file > > > >>>>>> than from > > > > > >>>>>> runtime in the "real" implementation? Doesn't this make the > > > >>>>>> CSE > > > > > >>>>>> invalid? > > > > > >>>>> `%t0` and `%t1` are "allowed" to "always be the same", i.e. > > > >>>>> an > > > > > >>>>> implementation of @foo that always feeds in the same > > > > > >>>>> value for `%t0` and `%t1` is a valid implementation (which is > > > >>>>> why the > > > > > >>>>> CSE was valid); but it is not the *only* valid > > > >>>>> implementation. > > > >>>>> If I > > > > > >>>>> don't CSE the two load instructions (also a valid thing to > > > >>>>> do), > > > >>>>> and > > > > > >>>>> this is a second thread writing to `%par`, then the two > > > >>>>> values > > > >>>>> loaded > > > > > >>>>> can be different, and you could end up printing `"X"` in > > > >>>>> `@foo`. > > > > > >>>>> > > > > > >>>>> Did that make sense? > > > > > >>>> Yes. To be sure I understand the scope: this is only a problem > > > >>>> for > > > > > >>>> atomics, correct? (Because multi-threaded behaviour with other > > > >>>> globals > > > > > >>>> is UB?) > > > > > >>>> > > > > > >>>>>> Does linkonce_odr linkage have the same problem? > > > > > >>>>>> - If so, do you want to change it too? > > > > > >>>>>> - Else, why not? > > > > > >>>>> Going by the specification in the LangRef, I'd say it depends > > > >>>>> on how > > > > > >>>>> you define "definitive". If you're allowed to replace the > > > >>>>> body > > > >>>>> of a > > > > > >>>>> function with a differently optimized body, then the above > > > >>>>> problem > > > > > >>>>> exists. > > > > > >>>>> > > > > > >>>>> I believe that is the case, and I strongly believe the > > > >>>>> problem > > > >>>>> you > > > > > >>>>> outline exists for linkonce_odr exactly as it does for > > > >>>>> available_externally. > > > > > >>>>> > > > > > >>>>> Which is what makes this scary: every C++ inline function > > > >>>>> today > > > >>>>> can > > > > > >>>>> trigger this. > > > > > >>>> Every C/C++ inline or template function. But only the ones > > > >>>> that > > > >>>> use > > > > > >>>> atomics, right? > > > > > >>> > > > > > >>> Well, with *this* example... > > > > > >> Atomic are one source of non-determinism that compilers can > > > >> reason > > > > > >> about. I don't know if the following snippet is well defined or > > > >> not, > > > > > >> but you could have similar issues with > > > > > >> > > > > > >> > > > > > >> void foo() { > > > > > >> int *p = malloc(sizeof(int)); > > > > > >> if (*p < 10) print("X"); > > > > > >> } > > > > > >> > > > > > >> or (again, I don't know if this is actually well defined) > > > > > >> > > > > > >> void foo() { > > > > > >> int t; // it is probably reasonable to fold compares with > > > > > >> ptrtoint(alloca) to undef > > > > > >> if ((intptr_t)(&t) < 10) print("X"); > > > > > >> } > > > > > >> > > > > > > The first one at least is UB, but as Richard pointed out the > > > > scope > > > > > > is certainly broader than atomics (it's not even just > > > > well-defined > > > > > > non-deterministism). > > > > > > > > > > > > I'm kind of terrified by the implications. > > > > > Me too. :( > > > > > > > > > > > >> -- Sanjoy > > > > > >> > > > > > >>>> > > > > > >>>> Not that I'm sure that will end up being a helpful > > > >>>> distinction. > > > > > >>> > > > > > >>> Right. See Richard's comment. I think that sums up the real > > > >>> issue > > > >>> here. =/ > > > > > > _______________________________________________ > > > > > > LLVM Developers mailing list > > > > > > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> > > > > > > _______________________________________________ > > > > > LLVM Developers mailing list > > > > > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> > > > > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> > > -- > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160224/669118b0/attachment-0001.html>
Pete Cooper via llvm-dev
2016-Feb-25  06:07 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
Sent from my iPhone> On Feb 24, 2016, at 9:41 PM, Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote: > >> On Wed, Feb 24, 2016 at 9:35 PM Hal Finkel <hfinkel at anl.gov> wrote: >> ----- Original Message ----- >> >> > From: "Chandler Carruth via llvm-dev" <llvm-dev at lists.llvm.org> >> > To: "Philip Reames" <listmail at philipreames.com>, "Duncan P. N. Exon >> > Smith" <dexonsmith at apple.com>, "Sanjoy Das" >> > <sanjoy at playingwithpointers.com> >> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org> >> > Sent: Wednesday, February 24, 2016 10:29:23 PM >> > Subject: Re: [llvm-dev] Possible soundness issue with >> > available_externally (split from "RFC: Add guard intrinsics") >> >> > Yea, I'm pretty sad about all of this. I'm also not seeing a lot of >> > awesome paths forward. >> >> > Here is the least bad strategy I can come up with. Curious if folks >> > think this is sufficient: >> >> I may not completely understand the problem, but this seems like overkill. The underlying restriction is that, if the compiler makes a non-determinism-collapsing choice when optimizing a function, it must make the same choice for all definitions of that function (undefined behavior excluded). > > This isn't enough, because some definition in some other module may *not be optimized at all*, and yet may get selected at link time. > > Put another way, it must *prove* that the same choice will *always* be made for all definitions. This is akin to proving that the optimizer is run over all translation units for C++ linkonce_odr functions, which you can't do. > > The result would be failing to optimize the bodies of linkonce_odr functions in any way which was externally detectable such as this. I think that would be *much* worse than losing the ability to do function attribute deduction for such functions? >If you are prior to optimising the function in any way (which I think is a reasonable thing to require as you said before) then any attributes added there will at least be conservatively correct. That will still allow a bunch of optimisation, I hope! So for the following function: int foo(int *i) { if (*i == 1) return 0; return 1; } You can legitimately add readonly and propagate to callers that the returned value has known zero bits > 1. And for this similar function: int foo(int *i, bool b) { if (b) { *i = 1; return 0; } return 1; } You could legitimately optimise a call 'foo(p, false)' so that the call site propagates in the constants and knows that this call returns 1 for sure. But could you constant propagate (without inlining) in to the call and add readonly to the call instruction here? I think you can because you know that passing false for b means that the function will not take the path that loads memory. Pete>> Thus, with an externally_available function, the CSE in Sanjoy's original example should be forbidden. Richard's example again demonstrates this principle, although in this case the non-determinism is in the choice of a globally-visible implementation technique rather than non-determinism from memory-subsystem reordering. >> >> There is a complication, which you imply in your proposal, that such optimizations need to be forbidden not just in the externally_available functions themselves, but in any local function transitively called by one. This, however, we can take care of with an (easily-deduced) attribute. >> >> In short, it is not clear to me that the number of problematic optimizations is large (seems likely restricted to things involving atomics in practice), and while I understand the auditing difficulties here, we should just restrict these in appropriate contexts instead of trying to restrict all information flow into or out of comdats. >> >> -Hal >> >> > 1) Stop deducing function attributes within comdats by examining the >> > bodies of the functions (so that we remain free to transform the >> > bodies of functions). >> > 2) Teach frontends to emit (even at O0!!!) trivially deduced function >> > attributes for comdats so that we continue to catch easy cases. >> > 3) Ensure and specify that we never hoist code *into* a comdat group >> > in which it would not have been executed previously. I don't know of >> > anything in LLVM that does this today, but it would become an >> > important invariant. >> > 4) Work a lot harder to do internalizing and removing of this >> > restriction. >> >> > Pretty horrible. But I think it is correct. >> >> > As a slight modification to #1 and #2, we could have a very carefully >> > crafted deduction rule where we only deduce function attributes for >> > functions prior to any modification of their function bodies. Such >> > attributes should be conservatively correct because we would never >> > lift new code into the function bodies. This would at least allow us >> > to do bottom-up deduction to catch interprocedural cases. But it >> > would become incredibly subtle that this is only valid prior to >> > *any* transformations of the comdat-containing functions. >> >> > I'm starting to think this subtle rule might be worth it. But I'm >> > frankly terrified by the implications. >> >> > On Wed, Feb 24, 2016 at 8:13 PM Philip Reames via llvm-dev < >> > llvm-dev at lists.llvm.org > wrote: >> >> > > On 02/24/2016 08:10 PM, Duncan P. N. Exon Smith via llvm-dev wrote: >> > >> > > >> On 2016-Feb-24, at 19:46, Sanjoy Das < >> > > >> sanjoy at playingwithpointers.com > wrote: >> > >> > > >> >> > >> > > >> On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth < >> > > >> chandlerc at google.com > wrote: >> > >> > > >>> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith >> > >> > > >>> < dexonsmith at apple.com > wrote: >> > >> > > >>>> >> > >> > > >>>>> On 2016-Feb-24, at 19:17, Chandler Carruth < >> > > >>>>> chandlerc at google.com > wrote: >> > >> > > >>>>> >> > >> > > >>>>> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev >> > >> > > >>>>> < llvm-dev at lists.llvm.org > wrote: >> > >> > > >>>>> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith >> > >> > > >>>>> < dexonsmith at apple.com > wrote: >> > >> > > >>>>>>> If we do not inline @foo(), and instead re-link the call >> > > >>>>>>> site >> > > >>>>>>> in >> > >> > > >>>>>>> @main >> > >> > > >>>>>>> to some non-optimized copy (or differently optimized copy) >> > > >>>>>>> of >> > > >>>>>>> @foo, >> > >> > > >>>>>>> then it is possible for the program to have the behavior >> > > >>>>>>> {print("Y"); >> > >> > > >>>>>>> print ("X")}, which was disallowed in the earlier program. >> > >> > > >>>>>>> >> > >> > > >>>>>>> In other words, opt refined the semantics of @foo() (i.e. >> > > >>>>>>> reduced the >> > >> > > >>>>>>> set of behaviors it may have) in ways that would make later >> > >> > > >>>>>>> optimizations invalid if we de-refine the implementation of >> > > >>>>>>> @foo(). >> > >> > > >>>>>> I'm probably missing something obvious here. How could the >> > > >>>>>> result of >> > >> > > >>>>>> `%t0 != %t1` be different at optimization time in one file >> > > >>>>>> than from >> > >> > > >>>>>> runtime in the "real" implementation? Doesn't this make the >> > > >>>>>> CSE >> > >> > > >>>>>> invalid? >> > >> > > >>>>> `%t0` and `%t1` are "allowed" to "always be the same", i.e. >> > > >>>>> an >> > >> > > >>>>> implementation of @foo that always feeds in the same >> > >> > > >>>>> value for `%t0` and `%t1` is a valid implementation (which is >> > > >>>>> why the >> > >> > > >>>>> CSE was valid); but it is not the *only* valid >> > > >>>>> implementation. >> > > >>>>> If I >> > >> > > >>>>> don't CSE the two load instructions (also a valid thing to >> > > >>>>> do), >> > > >>>>> and >> > >> > > >>>>> this is a second thread writing to `%par`, then the two >> > > >>>>> values >> > > >>>>> loaded >> > >> > > >>>>> can be different, and you could end up printing `"X"` in >> > > >>>>> `@foo`. >> > >> > > >>>>> >> > >> > > >>>>> Did that make sense? >> > >> > > >>>> Yes. To be sure I understand the scope: this is only a problem >> > > >>>> for >> > >> > > >>>> atomics, correct? (Because multi-threaded behaviour with other >> > > >>>> globals >> > >> > > >>>> is UB?) >> > >> > > >>>> >> > >> > > >>>>>> Does linkonce_odr linkage have the same problem? >> > >> > > >>>>>> - If so, do you want to change it too? >> > >> > > >>>>>> - Else, why not? >> > >> > > >>>>> Going by the specification in the LangRef, I'd say it depends >> > > >>>>> on how >> > >> > > >>>>> you define "definitive". If you're allowed to replace the >> > > >>>>> body >> > > >>>>> of a >> > >> > > >>>>> function with a differently optimized body, then the above >> > > >>>>> problem >> > >> > > >>>>> exists. >> > >> > > >>>>> >> > >> > > >>>>> I believe that is the case, and I strongly believe the >> > > >>>>> problem >> > > >>>>> you >> > >> > > >>>>> outline exists for linkonce_odr exactly as it does for >> > > >>>>> available_externally. >> > >> > > >>>>> >> > >> > > >>>>> Which is what makes this scary: every C++ inline function >> > > >>>>> today >> > > >>>>> can >> > >> > > >>>>> trigger this. >> > >> > > >>>> Every C/C++ inline or template function. But only the ones >> > > >>>> that >> > > >>>> use >> > >> > > >>>> atomics, right? >> > >> > > >>> >> > >> > > >>> Well, with *this* example... >> > >> > > >> Atomic are one source of non-determinism that compilers can >> > > >> reason >> > >> > > >> about. I don't know if the following snippet is well defined or >> > > >> not, >> > >> > > >> but you could have similar issues with >> > >> > > >> >> > >> > > >> >> > >> > > >> void foo() { >> > >> > > >> int *p = malloc(sizeof(int)); >> > >> > > >> if (*p < 10) print("X"); >> > >> > > >> } >> > >> > > >> >> > >> > > >> or (again, I don't know if this is actually well defined) >> > >> > > >> >> > >> > > >> void foo() { >> > >> > > >> int t; // it is probably reasonable to fold compares with >> > >> > > >> ptrtoint(alloca) to undef >> > >> > > >> if ((intptr_t)(&t) < 10) print("X"); >> > >> > > >> } >> > >> > > >> >> > >> > > > The first one at least is UB, but as Richard pointed out the >> > > > scope >> > >> > > > is certainly broader than atomics (it's not even just >> > > > well-defined >> > >> > > > non-deterministism). >> > >> > > > >> > >> > > > I'm kind of terrified by the implications. >> > >> > > Me too. :( >> > >> > > > >> > >> > > >> -- Sanjoy >> > >> > > >> >> > >> > > >>>> >> > >> > > >>>> Not that I'm sure that will end up being a helpful >> > > >>>> distinction. >> > >> > > >>> >> > >> > > >>> Right. See Richard's comment. I think that sums up the real >> > > >>> issue >> > > >>> here. =/ >> > >> > > > _______________________________________________ >> > >> > > > LLVM Developers mailing list >> > >> > > > llvm-dev at lists.llvm.org >> > >> > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > >> >> > > _______________________________________________ >> > >> > > LLVM Developers mailing list >> > >> > > llvm-dev at lists.llvm.org >> > >> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > >> >> > _______________________________________________ >> > LLVM Developers mailing list >> > llvm-dev at lists.llvm.org >> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> -- >> >> -- >> Hal Finkel >> Assistant Computational Scientist >> Leadership Computing Facility >> Argonne National Laboratory > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160224/d3b5f398/attachment.html>
Possibly Parallel Threads
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")