Hal Finkel via llvm-dev
2016-Feb-25 06:00 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
----- Original Message -----> From: "Chandler Carruth" <chandlerc at google.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip Reames" > <listmail at philipreames.com>, "Duncan P. N. Exon Smith" > <dexonsmith at apple.com>, "Sanjoy Das" > <sanjoy at playingwithpointers.com> > Sent: Wednesday, February 24, 2016 11:41:59 PM > Subject: Re: [llvm-dev] Possible soundness issue with > available_externally (split from "RFC: Add guard intrinsics")> On Wed, Feb 24, 2016 at 9:35 PM Hal Finkel < hfinkel at anl.gov > wrote:> > ----- Original Message ----- >> > > From: "Chandler Carruth via llvm-dev" < llvm-dev at lists.llvm.org > > > > > To: "Philip Reames" < listmail at philipreames.com >, "Duncan P. N. > > > Exon > > > > Smith" < dexonsmith at apple.com >, "Sanjoy Das" > > > > < sanjoy at playingwithpointers.com > > > > > Cc: "llvm-dev" < llvm-dev at lists.llvm.org > > > > > Sent: Wednesday, February 24, 2016 10:29:23 PM > > > > Subject: Re: [llvm-dev] Possible soundness issue with > > > > available_externally (split from "RFC: Add guard intrinsics") >> > > Yea, I'm pretty sad about all of this. I'm also not seeing a lot > > > of > > > > awesome paths forward. >> > > Here is the least bad strategy I can come up with. Curious if > > > folks > > > > think this is sufficient: >> > I may not completely understand the problem, but this seems like > > overkill. The underlying restriction is that, if the compiler makes > > a non-determinism-collapsing choice when optimizing a function, it > > must make the same choice for all definitions of that function > > (undefined behavior excluded). > > This isn't enough, because some definition in some other module may > *not be optimized at all*, and yet may get selected at link time.> Put another way, it must *prove* that the same choice will *always* > be made for all definitions. This is akin to proving that the > optimizer is run over all translation units for C++ linkonce_odr > functions, which you can't do.Sure; which is way I said that we should not perform those optimizations (instead of saying that we just need to make sure that the same choice will be made everywhere - as you say, LTO aside, we can't do that).> The result would be failing to optimize the bodies of linkonce_odr > functions in any way which was externally detectable such as this. I > think that would be *much* worse than losing the ability to do > function attribute deduction for such functions?But it is not all optimizations that are the problem. Rather, it seems like a select few (e.g. things involving collapsing allowed non-determinism in atomics), and losing those optimizations seems better than generally losing function-attribute deduction. -Hal> > Thus, with an externally_available function, the CSE in Sanjoy's > > original example should be forbidden. Richard's example again > > demonstrates this principle, although in this case the > > non-determinism is in the choice of a globally-visible > > implementation technique rather than non-determinism from > > memory-subsystem reordering. >> > There is a complication, which you imply in your proposal, that > > such > > optimizations need to be forbidden not just in the > > externally_available functions themselves, but in any local > > function > > transitively called by one. This, however, we can take care of with > > an (easily-deduced) attribute. >> > In short, it is not clear to me that the number of problematic > > optimizations is large (seems likely restricted to things involving > > atomics in practice), and while I understand the auditing > > difficulties here, we should just restrict these in appropriate > > contexts instead of trying to restrict all information flow into or > > out of comdats. >> > -Hal >> > > 1) Stop deducing function attributes within comdats by examining > > > the > > > > bodies of the functions (so that we remain free to transform the > > > > bodies of functions). > > > > 2) Teach frontends to emit (even at O0!!!) trivially deduced > > > function > > > > attributes for comdats so that we continue to catch easy cases. > > > > 3) Ensure and specify that we never hoist code *into* a comdat > > > group > > > > in which it would not have been executed previously. I don't know > > > of > > > > anything in LLVM that does this today, but it would become an > > > > important invariant. > > > > 4) Work a lot harder to do internalizing and removing of this > > > > restriction. >> > > Pretty horrible. But I think it is correct. >> > > As a slight modification to #1 and #2, we could have a very > > > carefully > > > > crafted deduction rule where we only deduce function attributes > > > for > > > > functions prior to any modification of their function bodies. > > > Such > > > > attributes should be conservatively correct because we would > > > never > > > > lift new code into the function bodies. This would at least allow > > > us > > > > to do bottom-up deduction to catch interprocedural cases. But it > > > > would become incredibly subtle that this is only valid prior to > > > > *any* transformations of the comdat-containing functions. >> > > I'm starting to think this subtle rule might be worth it. But I'm > > > > frankly terrified by the implications. >> > > On Wed, Feb 24, 2016 at 8:13 PM Philip Reames via llvm-dev < > > > > llvm-dev at lists.llvm.org > wrote: >> > > > On 02/24/2016 08:10 PM, Duncan P. N. Exon Smith via llvm-dev > > > > wrote: > > > > > > > > > >> On 2016-Feb-24, at 19:46, Sanjoy Das < > > > > > >> sanjoy at playingwithpointers.com > wrote: > > > > > > > > > >> > > > > > > > > > >> On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth < > > > > > >> chandlerc at google.com > wrote: > > > > > > > > > >>> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith > > > > > > > > > >>> < dexonsmith at apple.com > wrote: > > > > > > > > > >>>> > > > > > > > > > >>>>> On 2016-Feb-24, at 19:17, Chandler Carruth < > > > > > >>>>> chandlerc at google.com > wrote: > > > > > > > > > >>>>> > > > > > > > > > >>>>> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev > > > > > > > > > >>>>> < llvm-dev at lists.llvm.org > wrote: > > > > > > > > > >>>>> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith > > > > > > > > > >>>>> < dexonsmith at apple.com > wrote: > > > > > > > > > >>>>>>> If we do not inline @foo(), and instead re-link the > > > > >>>>>>> call > > > > > >>>>>>> site > > > > > >>>>>>> in > > > > > > > > > >>>>>>> @main > > > > > > > > > >>>>>>> to some non-optimized copy (or differently optimized > > > > >>>>>>> copy) > > > > > >>>>>>> of > > > > > >>>>>>> @foo, > > > > > > > > > >>>>>>> then it is possible for the program to have the > > > > >>>>>>> behavior > > > > > >>>>>>> {print("Y"); > > > > > > > > > >>>>>>> print ("X")}, which was disallowed in the earlier > > > > >>>>>>> program. > > > > > > > > > >>>>>>> > > > > > > > > > >>>>>>> In other words, opt refined the semantics of @foo() > > > > >>>>>>> (i.e. > > > > > >>>>>>> reduced the > > > > > > > > > >>>>>>> set of behaviors it may have) in ways that would make > > > > >>>>>>> later > > > > > > > > > >>>>>>> optimizations invalid if we de-refine the > > > > >>>>>>> implementation > > > > >>>>>>> of > > > > > >>>>>>> @foo(). > > > > > > > > > >>>>>> I'm probably missing something obvious here. How could > > > > >>>>>> the > > > > > >>>>>> result of > > > > > > > > > >>>>>> `%t0 != %t1` be different at optimization time in one > > > > >>>>>> file > > > > > >>>>>> than from > > > > > > > > > >>>>>> runtime in the "real" implementation? Doesn't this make > > > > >>>>>> the > > > > > >>>>>> CSE > > > > > > > > > >>>>>> invalid? > > > > > > > > > >>>>> `%t0` and `%t1` are "allowed" to "always be the same", > > > > >>>>> i.e. > > > > > >>>>> an > > > > > > > > > >>>>> implementation of @foo that always feeds in the same > > > > > > > > > >>>>> value for `%t0` and `%t1` is a valid implementation > > > > >>>>> (which > > > > >>>>> is > > > > > >>>>> why the > > > > > > > > > >>>>> CSE was valid); but it is not the *only* valid > > > > > >>>>> implementation. > > > > > >>>>> If I > > > > > > > > > >>>>> don't CSE the two load instructions (also a valid thing > > > > >>>>> to > > > > > >>>>> do), > > > > > >>>>> and > > > > > > > > > >>>>> this is a second thread writing to `%par`, then the two > > > > > >>>>> values > > > > > >>>>> loaded > > > > > > > > > >>>>> can be different, and you could end up printing `"X"` in > > > > > >>>>> `@foo`. > > > > > > > > > >>>>> > > > > > > > > > >>>>> Did that make sense? > > > > > > > > > >>>> Yes. To be sure I understand the scope: this is only a > > > > >>>> problem > > > > > >>>> for > > > > > > > > > >>>> atomics, correct? (Because multi-threaded behaviour with > > > > >>>> other > > > > > >>>> globals > > > > > > > > > >>>> is UB?) > > > > > > > > > >>>> > > > > > > > > > >>>>>> Does linkonce_odr linkage have the same problem? > > > > > > > > > >>>>>> - If so, do you want to change it too? > > > > > > > > > >>>>>> - Else, why not? > > > > > > > > > >>>>> Going by the specification in the LangRef, I'd say it > > > > >>>>> depends > > > > > >>>>> on how > > > > > > > > > >>>>> you define "definitive". If you're allowed to replace the > > > > > >>>>> body > > > > > >>>>> of a > > > > > > > > > >>>>> function with a differently optimized body, then the > > > > >>>>> above > > > > > >>>>> problem > > > > > > > > > >>>>> exists. > > > > > > > > > >>>>> > > > > > > > > > >>>>> I believe that is the case, and I strongly believe the > > > > > >>>>> problem > > > > > >>>>> you > > > > > > > > > >>>>> outline exists for linkonce_odr exactly as it does for > > > > > >>>>> available_externally. > > > > > > > > > >>>>> > > > > > > > > > >>>>> Which is what makes this scary: every C++ inline function > > > > > >>>>> today > > > > > >>>>> can > > > > > > > > > >>>>> trigger this. > > > > > > > > > >>>> Every C/C++ inline or template function. But only the ones > > > > > >>>> that > > > > > >>>> use > > > > > > > > > >>>> atomics, right? > > > > > > > > > >>> > > > > > > > > > >>> Well, with *this* example... > > > > > > > > > >> Atomic are one source of non-determinism that compilers can > > > > > >> reason > > > > > > > > > >> about. I don't know if the following snippet is well defined > > > > >> or > > > > > >> not, > > > > > > > > > >> but you could have similar issues with > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> void foo() { > > > > > > > > > >> int *p = malloc(sizeof(int)); > > > > > > > > > >> if (*p < 10) print("X"); > > > > > > > > > >> } > > > > > > > > > >> > > > > > > > > > >> or (again, I don't know if this is actually well defined) > > > > > > > > > >> > > > > > > > > > >> void foo() { > > > > > > > > > >> int t; // it is probably reasonable to fold compares with > > > > > > > > > >> ptrtoint(alloca) to undef > > > > > > > > > >> if ((intptr_t)(&t) < 10) print("X"); > > > > > > > > > >> } > > > > > > > > > >> > > > > > > > > > > The first one at least is UB, but as Richard pointed out the > > > > > > scope > > > > > > > > > > is certainly broader than atomics (it's not even just > > > > > > well-defined > > > > > > > > > > non-deterministism). > > > > > > > > > > > > > > > > > > > > I'm kind of terrified by the implications. > > > > > > > > > Me too. :( > > > > > > > > > > > > > > > > > > > >> -- Sanjoy > > > > > > > > > >> > > > > > > > > > >>>> > > > > > > > > > >>>> Not that I'm sure that will end up being a helpful > > > > > >>>> distinction. > > > > > > > > > >>> > > > > > > > > > >>> Right. See Richard's comment. I think that sums up the real > > > > > >>> issue > > > > > >>> here. =/ > > > > > > > > > > _______________________________________________ > > > > > > > > > > LLVM Developers mailing list > > > > > > > > > > llvm-dev at lists.llvm.org > > > > > > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > >> > > > _______________________________________________ > > > > > > > > > LLVM Developers mailing list > > > > > > > > > llvm-dev at lists.llvm.org > > > > > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > >> > > _______________________________________________ > > > > LLVM Developers mailing list > > > > llvm-dev at lists.llvm.org > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > -- >> > -- > > > Hal Finkel > > > Assistant Computational Scientist > > > Leadership Computing Facility > > > Argonne National Laboratory >-- -- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Sanjoy Das via llvm-dev
2016-Feb-25 06:25 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
Hal Finkel wrote: > But it is not all optimizations that are the problem. Rather, it > seems like a select few (e.g. things involving collapsing allowed > non-determinism in atomics), and losing those optimizations seems > better than generally losing function-attribute deduction. If we go by the langref, then optimizations that fold undef are also problematic (though most C/C++ programs resulting in such IR would have UB in practice). I think there are non-(non-deterministic) problematic cases too. The following won't happen today since `readnone` does not imply `safe_to_speculate`, but if we add a `safe_to_speculate` property some day: int foo(bool C) available_externally { if (C) ((int *) null)++; // UB ret 42; } void bar() { if (<some cond>) foo(true); } Now, normally you can just delete the `if (C)` branch in foo, and it would become just a `ret 42`, and would look like it is speculatable above the `<some cond>` check. But if you then link with an -O0 version, you'll have introduced UB if `<some cond>` is always false at runtime. Today this won't happen since we don't speculate `readnone nounwind` functions, but could become a problem in the future. -- Sanjoy
Chandler Carruth via llvm-dev
2016-Feb-25 06:40 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Wed, Feb 24, 2016 at 10:25 PM Sanjoy Das <sanjoy at playingwithpointers.com> wrote:> > Hal Finkel wrote: > > > But it is not all optimizations that are the problem. Rather, it > > seems like a select few (e.g. things involving collapsing allowed > > non-determinism in atomics), and losing those optimizations seems > > better than generally losing function-attribute deduction. > > If we go by the langref, then optimizations that fold undef are also > problematic (though most C/C++ programs resulting in such IR would > have UB in practice). >I'm also concerned about undef. It isn't clear that we can reasonably ensure that all things involving undef suitably stem from UB in the face of speculation to preclude all "impossible" results coming from interposition. I think trying to enumerate the function-local optimizations which are unsafe is *much* more challenging than identifying the interprocedural optimizations which are unsafe, which is essentially what disabling function attribute deduction is doing.> > I think there are non-(non-deterministic) problematic cases too. The > following won't happen today since `readnone` does not imply > `safe_to_speculate`, but if we add a `safe_to_speculate` property some > day: > > int foo(bool C) available_externally { > if (C) > ((int *) null)++; // UB > ret 42; > } > > void bar() { > if (<some cond>) > foo(true); > } > > Now, normally you can just delete the `if (C)` branch in foo, and it > would become just a `ret 42`, and would look like it is speculatable > above the `<some cond>` check. But if you then link with an -O0 > version, you'll have introduced UB if `<some cond>` is always false at > runtime. > > Today this won't happen since we don't speculate `readnone nounwind` > functions, but could become a problem in the future. >All of this makes me think we're just seeing the tip of the iceberg of ways in which local optimizations can combine with attributes to break this. =/ I really think the problem is as Richard describes the interprocedural movement of information across a comdat-like boundary where replacements may occur with differently transformed code. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160225/c2f8883d/attachment.html>
Hal Finkel via llvm-dev
2016-Feb-25 06:56 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
----- Original Message -----> From: "Sanjoy Das" <sanjoy at playingwithpointers.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Chandler Carruth" <chandlerc at google.com>, "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip Reames" > <listmail at philipreames.com>, "Duncan P. N. Exon Smith" <dexonsmith at apple.com> > Sent: Thursday, February 25, 2016 12:25:54 AM > Subject: Re: [llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics") > > > Hal Finkel wrote: > > > But it is not all optimizations that are the problem. Rather, it > > seems like a select few (e.g. things involving collapsing allowed > > non-determinism in atomics), and losing those optimizations seems > > better than generally losing function-attribute deduction. > > If we go by the langref, then optimizations that fold undef are also > problematic (though most C/C++ programs resulting in such IR would > have UB in practice).If the undef is folded to some concrete value (instead of just being propagated), then yes, I agree. We really should be propagating the undef, however, right?> > I think there are non-(non-deterministic) problematic cases too. The > following won't happen today since `readnone` does not imply > `safe_to_speculate`, but if we add a `safe_to_speculate` property > some > day: > > int foo(bool C) available_externally { > if (C) > ((int *) null)++; // UB > ret 42; > } > > void bar() { > if (<some cond>) > foo(true); > } > > Now, normally you can just delete the `if (C)` branch in foo, and it > would become just a `ret 42`, and would look like it is speculatable > above the `<some cond>` check. But if you then link with an -O0 > version, you'll have introduced UB if `<some cond>` is always false > at > runtime.So this is a good point, but I'm not sure how much to generalize this example. When we add a safe_to_speculate attribute, we'll need to keep this in mind (special care must be taken in such non-definitive-definition contexts). -Hal> > Today this won't happen since we don't speculate `readnone nounwind` > functions, but could become a problem in the future. > > -- Sanjoy >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Possibly Parallel Threads
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")