Duncan P. N. Exon Smith via llvm-dev
2016-Feb-25 03:34 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
> On 2016-Feb-24, at 19:17, Chandler Carruth <chandlerc at google.com> wrote: > > On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote: > On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith > <dexonsmith at apple.com> wrote: > >> If we do not inline @foo(), and instead re-link the call site in @main > >> to some non-optimized copy (or differently optimized copy) of @foo, > >> then it is possible for the program to have the behavior {print("Y"); > >> print ("X")}, which was disallowed in the earlier program. > >> > >> In other words, opt refined the semantics of @foo() (i.e. reduced the > >> set of behaviors it may have) in ways that would make later > >> optimizations invalid if we de-refine the implementation of @foo(). > > > > I'm probably missing something obvious here. How could the result of > > `%t0 != %t1` be different at optimization time in one file than from > > runtime in the "real" implementation? Doesn't this make the CSE > > invalid? > > `%t0` and `%t1` are "allowed" to "always be the same", i.e. an > implementation of @foo that always feeds in the same > value for `%t0` and `%t1` is a valid implementation (which is why the > CSE was valid); but it is not the *only* valid implementation. If I > don't CSE the two load instructions (also a valid thing to do), and > this is a second thread writing to `%par`, then the two values loaded > can be different, and you could end up printing `"X"` in `@foo`. > > Did that make sense?Yes. To be sure I understand the scope: this is only a problem for atomics, correct? (Because multi-threaded behaviour with other globals is UB?)> > Does linkonce_odr linkage have the same problem? > > - If so, do you want to change it too? > > - Else, why not? > > Going by the specification in the LangRef, I'd say it depends on how > you define "definitive". If you're allowed to replace the body of a > function with a differently optimized body, then the above problem > exists. > > I believe that is the case, and I strongly believe the problem you outline exists for linkonce_odr exactly as it does for available_externally. > > Which is what makes this scary: every C++ inline function today can trigger this.Every C/C++ inline or template function. But only the ones that use atomics, right? Not that I'm sure that will end up being a helpful distinction.> >> The above example is clearly fabricated, but such cases can come up > >> even if everything is optimized to the same level. E.g. one of the > >> atomic loads in the unrefined implementation of @foo() could have been > >> hidden behind a function call, whose body existed in only one module. > >> That module would then be able to refine @foo() to `ret void` but > >> other modules won't. > >> > >> The only solution I can think of is to redefine available_externally > >> to mean "the only kind of IPO/IPA you can do over a call to this > >> function is to inline it". Redefining available_externally this way > >> will also let us soundly use it to represent calls to functions that > >> have guard intrinsics, since a failed guard intrinsic basically > >> replaces the function with a "very de-refined" implementation (the > >> interpreter). > >> > >> What do you think? I don't think implementing the above above will be > >> very difficult, but needless to say, it will still be a fairly > >> non-trivial semantic change (hence I'm not directly jumping to > >> implementation). > > > > This linkage is used in three places (that I know of) by clang: > > > > 1. C-style `inline` functions. > > 2. Functions defined in C++ template classes with external explicit > > instantiations, e.g. S::foo() in: > > > > template <class T> struct S { void foo() {} }; > > void bar() { S<int>().foo(); } > > extern template struct S<int>; > > > > 3. -flto=thin cross-module function importing. > > > > (No comment on (1); its exact semantics are a little fuzzy to me.) > > For (2) and (3), the current behaviour seems correct, and I'd be > > hesitant to lose optimizing power. (2) is under the "ODR" rule, and > > I think we've been applying the same logic to (3). Unless, are you > > saying ODR isn't enough? > > By ODR, do you mean you only have one definition of the function in > the whole link (i.e. across all modules you'll link together)? > Then yes, ODR should be enough to avoid this. But in any place where > the linker sees two differently optimized definitions for a function > and picks one as the definitive version all non-inlined calls link to, > we have this problem. > > No, different levels of optimization must be allowed within ODR. So this is a problem within an ODR context. > > (The term ODR applies to one *source* definition, not one optimized definition)
Chandler Carruth via llvm-dev
2016-Feb-25 03:38 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith < dexonsmith at apple.com> wrote:> > > On 2016-Feb-24, at 19:17, Chandler Carruth <chandlerc at google.com> wrote: > > > > On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith > > <dexonsmith at apple.com> wrote: > > >> If we do not inline @foo(), and instead re-link the call site in @main > > >> to some non-optimized copy (or differently optimized copy) of @foo, > > >> then it is possible for the program to have the behavior {print("Y"); > > >> print ("X")}, which was disallowed in the earlier program. > > >> > > >> In other words, opt refined the semantics of @foo() (i.e. reduced the > > >> set of behaviors it may have) in ways that would make later > > >> optimizations invalid if we de-refine the implementation of @foo(). > > > > > > I'm probably missing something obvious here. How could the result of > > > `%t0 != %t1` be different at optimization time in one file than from > > > runtime in the "real" implementation? Doesn't this make the CSE > > > invalid? > > > > `%t0` and `%t1` are "allowed" to "always be the same", i.e. an > > implementation of @foo that always feeds in the same > > value for `%t0` and `%t1` is a valid implementation (which is why the > > CSE was valid); but it is not the *only* valid implementation. If I > > don't CSE the two load instructions (also a valid thing to do), and > > this is a second thread writing to `%par`, then the two values loaded > > can be different, and you could end up printing `"X"` in `@foo`. > > > > Did that make sense? > > Yes. To be sure I understand the scope: this is only a problem for > atomics, correct? (Because multi-threaded behaviour with other globals > is UB?) > > > > Does linkonce_odr linkage have the same problem? > > > - If so, do you want to change it too? > > > - Else, why not? > > > > Going by the specification in the LangRef, I'd say it depends on how > > you define "definitive". If you're allowed to replace the body of a > > function with a differently optimized body, then the above problem > > exists. > > > > I believe that is the case, and I strongly believe the problem you > outline exists for linkonce_odr exactly as it does for available_externally. > > > > Which is what makes this scary: every C++ inline function today can > trigger this. > > Every C/C++ inline or template function. But only the ones that use > atomics, right? >Well, with *this* example...> > Not that I'm sure that will end up being a helpful distinction. >Right. See Richard's comment. I think that sums up the real issue here. =/ -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160225/29164228/attachment.html>
Sanjoy Das via llvm-dev
2016-Feb-25 03:46 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth <chandlerc at google.com> wrote:> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith > <dexonsmith at apple.com> wrote: >> >> >> > On 2016-Feb-24, at 19:17, Chandler Carruth <chandlerc at google.com> wrote: >> > >> > On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev >> > <llvm-dev at lists.llvm.org> wrote: >> > On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith >> > <dexonsmith at apple.com> wrote: >> > >> If we do not inline @foo(), and instead re-link the call site in >> > >> @main >> > >> to some non-optimized copy (or differently optimized copy) of @foo, >> > >> then it is possible for the program to have the behavior {print("Y"); >> > >> print ("X")}, which was disallowed in the earlier program. >> > >> >> > >> In other words, opt refined the semantics of @foo() (i.e. reduced the >> > >> set of behaviors it may have) in ways that would make later >> > >> optimizations invalid if we de-refine the implementation of @foo(). >> > > >> > > I'm probably missing something obvious here. How could the result of >> > > `%t0 != %t1` be different at optimization time in one file than from >> > > runtime in the "real" implementation? Doesn't this make the CSE >> > > invalid? >> > >> > `%t0` and `%t1` are "allowed" to "always be the same", i.e. an >> > implementation of @foo that always feeds in the same >> > value for `%t0` and `%t1` is a valid implementation (which is why the >> > CSE was valid); but it is not the *only* valid implementation. If I >> > don't CSE the two load instructions (also a valid thing to do), and >> > this is a second thread writing to `%par`, then the two values loaded >> > can be different, and you could end up printing `"X"` in `@foo`. >> > >> > Did that make sense? >> >> Yes. To be sure I understand the scope: this is only a problem for >> atomics, correct? (Because multi-threaded behaviour with other globals >> is UB?) >> >> > > Does linkonce_odr linkage have the same problem? >> > > - If so, do you want to change it too? >> > > - Else, why not? >> > >> > Going by the specification in the LangRef, I'd say it depends on how >> > you define "definitive". If you're allowed to replace the body of a >> > function with a differently optimized body, then the above problem >> > exists. >> > >> > I believe that is the case, and I strongly believe the problem you >> > outline exists for linkonce_odr exactly as it does for available_externally. >> > >> > Which is what makes this scary: every C++ inline function today can >> > trigger this. >> >> Every C/C++ inline or template function. But only the ones that use >> atomics, right? > > > Well, with *this* example...Atomic are one source of non-determinism that compilers can reason about. I don't know if the following snippet is well defined or not, but you could have similar issues with void foo() { int *p = malloc(sizeof(int)); if (*p < 10) print("X"); } or (again, I don't know if this is actually well defined) void foo() { int t; // it is probably reasonable to fold compares with ptrtoint(alloca) to undef if ((intptr_t)(&t) < 10) print("X"); } -- Sanjoy> >> >> >> Not that I'm sure that will end up being a helpful distinction. > > > Right. See Richard's comment. I think that sums up the real issue here. =/
Apparently Analagous Threads
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")