Sanjoy Das via llvm-dev
2016-Feb-25 03:09 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:>> If we do not inline @foo(), and instead re-link the call site in @main >> to some non-optimized copy (or differently optimized copy) of @foo, >> then it is possible for the program to have the behavior {print("Y"); >> print ("X")}, which was disallowed in the earlier program. >> >> In other words, opt refined the semantics of @foo() (i.e. reduced the >> set of behaviors it may have) in ways that would make later >> optimizations invalid if we de-refine the implementation of @foo(). > > I'm probably missing something obvious here. How could the result of > `%t0 != %t1` be different at optimization time in one file than from > runtime in the "real" implementation? Doesn't this make the CSE > invalid?`%t0` and `%t1` are "allowed" to "always be the same", i.e. an implementation of @foo that always feeds in the same value for `%t0` and `%t1` is a valid implementation (which is why the CSE was valid); but it is not the *only* valid implementation. If I don't CSE the two load instructions (also a valid thing to do), and this is a second thread writing to `%par`, then the two values loaded can be different, and you could end up printing `"X"` in `@foo`. Did that make sense?> Does linkonce_odr linkage have the same problem? > - If so, do you want to change it too? > - Else, why not?Going by the specification in the LangRef, I'd say it depends on how you define "definitive". If you're allowed to replace the body of a function with a differently optimized body, then the above problem exists.>> The above example is clearly fabricated, but such cases can come up >> even if everything is optimized to the same level. E.g. one of the >> atomic loads in the unrefined implementation of @foo() could have been >> hidden behind a function call, whose body existed in only one module. >> That module would then be able to refine @foo() to `ret void` but >> other modules won't. >> >> The only solution I can think of is to redefine available_externally >> to mean "the only kind of IPO/IPA you can do over a call to this >> function is to inline it". Redefining available_externally this way >> will also let us soundly use it to represent calls to functions that >> have guard intrinsics, since a failed guard intrinsic basically >> replaces the function with a "very de-refined" implementation (the >> interpreter). >> >> What do you think? I don't think implementing the above above will be >> very difficult, but needless to say, it will still be a fairly >> non-trivial semantic change (hence I'm not directly jumping to >> implementation). > > This linkage is used in three places (that I know of) by clang: > > 1. C-style `inline` functions. > 2. Functions defined in C++ template classes with external explicit > instantiations, e.g. S::foo() in: > > template <class T> struct S { void foo() {} }; > void bar() { S<int>().foo(); } > extern template struct S<int>; > > 3. -flto=thin cross-module function importing. > > (No comment on (1); its exact semantics are a little fuzzy to me.) > For (2) and (3), the current behaviour seems correct, and I'd be > hesitant to lose optimizing power. (2) is under the "ODR" rule, and > I think we've been applying the same logic to (3). Unless, are you > saying ODR isn't enough?By ODR, do you mean you only have one definition of the function in the whole link (i.e. across all modules you'll link together)? Then yes, ODR should be enough to avoid this. But in any place where the linker sees two differently optimized definitions for a function and picks one as the definitive version all non-inlined calls link to, we have this problem.> Assuming you need this new definition (but under ODR, the semantics > are correct), I would rather split the linkage than change it. E.g., > use a name like available_externally_odr for (2) and (3).If what I said above is correct (i.e. ODR == OD across everything you're linking into your final executable) then splitting the linkage / adding a new one is probably the best alternative. -- Sanjoy
Chandler Carruth via llvm-dev
2016-Feb-25 03:17 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev < llvm-dev at lists.llvm.org> wrote:> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith > <dexonsmith at apple.com> wrote: > >> If we do not inline @foo(), and instead re-link the call site in @main > >> to some non-optimized copy (or differently optimized copy) of @foo, > >> then it is possible for the program to have the behavior {print("Y"); > >> print ("X")}, which was disallowed in the earlier program. > >> > >> In other words, opt refined the semantics of @foo() (i.e. reduced the > >> set of behaviors it may have) in ways that would make later > >> optimizations invalid if we de-refine the implementation of @foo(). > > > > I'm probably missing something obvious here. How could the result of > > `%t0 != %t1` be different at optimization time in one file than from > > runtime in the "real" implementation? Doesn't this make the CSE > > invalid? > > `%t0` and `%t1` are "allowed" to "always be the same", i.e. an > implementation of @foo that always feeds in the same > value for `%t0` and `%t1` is a valid implementation (which is why the > CSE was valid); but it is not the *only* valid implementation. If I > don't CSE the two load instructions (also a valid thing to do), and > this is a second thread writing to `%par`, then the two values loaded > can be different, and you could end up printing `"X"` in `@foo`. > > Did that make sense? > > > Does linkonce_odr linkage have the same problem? > > - If so, do you want to change it too? > > - Else, why not? > > Going by the specification in the LangRef, I'd say it depends on how > you define "definitive". If you're allowed to replace the body of a > function with a differently optimized body, then the above problem > exists. >I believe that is the case, and I strongly believe the problem you outline exists for linkonce_odr exactly as it does for available_externally. Which is what makes this scary: every C++ inline function today can trigger this.> > >> The above example is clearly fabricated, but such cases can come up > >> even if everything is optimized to the same level. E.g. one of the > >> atomic loads in the unrefined implementation of @foo() could have been > >> hidden behind a function call, whose body existed in only one module. > >> That module would then be able to refine @foo() to `ret void` but > >> other modules won't. > >> > >> The only solution I can think of is to redefine available_externally > >> to mean "the only kind of IPO/IPA you can do over a call to this > >> function is to inline it". Redefining available_externally this way > >> will also let us soundly use it to represent calls to functions that > >> have guard intrinsics, since a failed guard intrinsic basically > >> replaces the function with a "very de-refined" implementation (the > >> interpreter). > >> > >> What do you think? I don't think implementing the above above will be > >> very difficult, but needless to say, it will still be a fairly > >> non-trivial semantic change (hence I'm not directly jumping to > >> implementation). > > > > This linkage is used in three places (that I know of) by clang: > > > > 1. C-style `inline` functions. > > 2. Functions defined in C++ template classes with external explicit > > instantiations, e.g. S::foo() in: > > > > template <class T> struct S { void foo() {} }; > > void bar() { S<int>().foo(); } > > extern template struct S<int>; > > > > 3. -flto=thin cross-module function importing. > > > > (No comment on (1); its exact semantics are a little fuzzy to me.) > > For (2) and (3), the current behaviour seems correct, and I'd be > > hesitant to lose optimizing power. (2) is under the "ODR" rule, and > > I think we've been applying the same logic to (3). Unless, are you > > saying ODR isn't enough? > > By ODR, do you mean you only have one definition of the function in > the whole link (i.e. across all modules you'll link together)? > Then yes, ODR should be enough to avoid this. But in any place where > the linker sees two differently optimized definitions for a function > and picks one as the definitive version all non-inlined calls link to, > we have this problem. >No, different levels of optimization must be allowed within ODR. So this is a problem within an ODR context. (The term ODR applies to one *source* definition, not one optimized definition) -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160225/6c4ed949/attachment.html>
Sanjoy Das via llvm-dev
2016-Feb-25 03:26 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Wed, Feb 24, 2016 at 7:09 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:> CSE was valid); but it is not the *only* valid implementation. If I > don't CSE the two load instructions (also a valid thing to do), and > this is a second thread writing to `%par`, then the two values loadedTypoe'd a little bit here: should be "and there is a second thread writing to `%ptr`" -- Sanjoy
Richard Smith via llvm-dev
2016-Feb-25 03:33 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Wed, Feb 24, 2016 at 7:17 PM, Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote:> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith >> <dexonsmith at apple.com> wrote: >> >> If we do not inline @foo(), and instead re-link the call site in @main >> >> to some non-optimized copy (or differently optimized copy) of @foo, >> >> then it is possible for the program to have the behavior {print("Y"); >> >> print ("X")}, which was disallowed in the earlier program. >> >> >> >> In other words, opt refined the semantics of @foo() (i.e. reduced the >> >> set of behaviors it may have) in ways that would make later >> >> optimizations invalid if we de-refine the implementation of @foo(). >> > >> > I'm probably missing something obvious here. How could the result of >> > `%t0 != %t1` be different at optimization time in one file than from >> > runtime in the "real" implementation? Doesn't this make the CSE >> > invalid? >> >> `%t0` and `%t1` are "allowed" to "always be the same", i.e. an >> implementation of @foo that always feeds in the same >> value for `%t0` and `%t1` is a valid implementation (which is why the >> CSE was valid); but it is not the *only* valid implementation. If I >> don't CSE the two load instructions (also a valid thing to do), and >> this is a second thread writing to `%par`, then the two values loaded >> can be different, and you could end up printing `"X"` in `@foo`. >> >> Did that make sense? >> >> > Does linkonce_odr linkage have the same problem? >> > - If so, do you want to change it too? >> > - Else, why not? >> >> Going by the specification in the LangRef, I'd say it depends on how >> you define "definitive". If you're allowed to replace the body of a >> function with a differently optimized body, then the above problem >> exists. > > > I believe that is the case, and I strongly believe the problem you outline > exists for linkonce_odr exactly as it does for available_externally. > > Which is what makes this scary: every C++ inline function today can trigger > this.More generally, the same problem applies to all entities within comdats (we can view available_externally and linkonce_odr as special cases of single-symbol comdats). Similar problems show up with static local variables in inline functions (where the comdats may be equivalent across object files, but aren't necessarily identical -- in particular, one can use constant initialization for a static local variable where the other uses dynamic initialization, and inlining the one with constant initialization then selecting the one with dynamic initialization results in the variable not being initialized). One approach that appears superficially correct would be to say that we cannot move information across a comdat / available_externally / linkonce_odr boundary unless doing so allows us to completely remove that reference to the comdat / function / global. (In particular, you can't use the attrs on a function in a comdat from outside that comdat -- unless you somehow know they're present for all versions of that comdat.) But that seems to have a pretty huge impact on optimizability of comdats.>> >> The above example is clearly fabricated, but such cases can come up >> >> even if everything is optimized to the same level. E.g. one of the >> >> atomic loads in the unrefined implementation of @foo() could have been >> >> hidden behind a function call, whose body existed in only one module. >> >> That module would then be able to refine @foo() to `ret void` but >> >> other modules won't. >> >> >> >> The only solution I can think of is to redefine available_externally >> >> to mean "the only kind of IPO/IPA you can do over a call to this >> >> function is to inline it". Redefining available_externally this way >> >> will also let us soundly use it to represent calls to functions that >> >> have guard intrinsics, since a failed guard intrinsic basically >> >> replaces the function with a "very de-refined" implementation (the >> >> interpreter). >> >> >> >> What do you think? I don't think implementing the above above will be >> >> very difficult, but needless to say, it will still be a fairly >> >> non-trivial semantic change (hence I'm not directly jumping to >> >> implementation). >> > >> > This linkage is used in three places (that I know of) by clang: >> > >> > 1. C-style `inline` functions. >> > 2. Functions defined in C++ template classes with external explicit >> > instantiations, e.g. S::foo() in: >> > >> > template <class T> struct S { void foo() {} }; >> > void bar() { S<int>().foo(); } >> > extern template struct S<int>; >> > >> > 3. -flto=thin cross-module function importing. >> > >> > (No comment on (1); its exact semantics are a little fuzzy to me.) >> > For (2) and (3), the current behaviour seems correct, and I'd be >> > hesitant to lose optimizing power. (2) is under the "ODR" rule, and >> > I think we've been applying the same logic to (3). Unless, are you >> > saying ODR isn't enough? >> >> By ODR, do you mean you only have one definition of the function in >> the whole link (i.e. across all modules you'll link together)? >> Then yes, ODR should be enough to avoid this. But in any place where >> the linker sees two differently optimized definitions for a function >> and picks one as the definitive version all non-inlined calls link to, >> we have this problem. > > > No, different levels of optimization must be allowed within ODR. So this is > a problem within an ODR context. > > (The term ODR applies to one *source* definition, not one optimized > definition) > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
Duncan P. N. Exon Smith via llvm-dev
2016-Feb-25 03:34 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
> On 2016-Feb-24, at 19:17, Chandler Carruth <chandlerc at google.com> wrote: > > On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote: > On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith > <dexonsmith at apple.com> wrote: > >> If we do not inline @foo(), and instead re-link the call site in @main > >> to some non-optimized copy (or differently optimized copy) of @foo, > >> then it is possible for the program to have the behavior {print("Y"); > >> print ("X")}, which was disallowed in the earlier program. > >> > >> In other words, opt refined the semantics of @foo() (i.e. reduced the > >> set of behaviors it may have) in ways that would make later > >> optimizations invalid if we de-refine the implementation of @foo(). > > > > I'm probably missing something obvious here. How could the result of > > `%t0 != %t1` be different at optimization time in one file than from > > runtime in the "real" implementation? Doesn't this make the CSE > > invalid? > > `%t0` and `%t1` are "allowed" to "always be the same", i.e. an > implementation of @foo that always feeds in the same > value for `%t0` and `%t1` is a valid implementation (which is why the > CSE was valid); but it is not the *only* valid implementation. If I > don't CSE the two load instructions (also a valid thing to do), and > this is a second thread writing to `%par`, then the two values loaded > can be different, and you could end up printing `"X"` in `@foo`. > > Did that make sense?Yes. To be sure I understand the scope: this is only a problem for atomics, correct? (Because multi-threaded behaviour with other globals is UB?)> > Does linkonce_odr linkage have the same problem? > > - If so, do you want to change it too? > > - Else, why not? > > Going by the specification in the LangRef, I'd say it depends on how > you define "definitive". If you're allowed to replace the body of a > function with a differently optimized body, then the above problem > exists. > > I believe that is the case, and I strongly believe the problem you outline exists for linkonce_odr exactly as it does for available_externally. > > Which is what makes this scary: every C++ inline function today can trigger this.Every C/C++ inline or template function. But only the ones that use atomics, right? Not that I'm sure that will end up being a helpful distinction.> >> The above example is clearly fabricated, but such cases can come up > >> even if everything is optimized to the same level. E.g. one of the > >> atomic loads in the unrefined implementation of @foo() could have been > >> hidden behind a function call, whose body existed in only one module. > >> That module would then be able to refine @foo() to `ret void` but > >> other modules won't. > >> > >> The only solution I can think of is to redefine available_externally > >> to mean "the only kind of IPO/IPA you can do over a call to this > >> function is to inline it". Redefining available_externally this way > >> will also let us soundly use it to represent calls to functions that > >> have guard intrinsics, since a failed guard intrinsic basically > >> replaces the function with a "very de-refined" implementation (the > >> interpreter). > >> > >> What do you think? I don't think implementing the above above will be > >> very difficult, but needless to say, it will still be a fairly > >> non-trivial semantic change (hence I'm not directly jumping to > >> implementation). > > > > This linkage is used in three places (that I know of) by clang: > > > > 1. C-style `inline` functions. > > 2. Functions defined in C++ template classes with external explicit > > instantiations, e.g. S::foo() in: > > > > template <class T> struct S { void foo() {} }; > > void bar() { S<int>().foo(); } > > extern template struct S<int>; > > > > 3. -flto=thin cross-module function importing. > > > > (No comment on (1); its exact semantics are a little fuzzy to me.) > > For (2) and (3), the current behaviour seems correct, and I'd be > > hesitant to lose optimizing power. (2) is under the "ODR" rule, and > > I think we've been applying the same logic to (3). Unless, are you > > saying ODR isn't enough? > > By ODR, do you mean you only have one definition of the function in > the whole link (i.e. across all modules you'll link together)? > Then yes, ODR should be enough to avoid this. But in any place where > the linker sees two differently optimized definitions for a function > and picks one as the definitive version all non-inlined calls link to, > we have this problem. > > No, different levels of optimization must be allowed within ODR. So this is a problem within an ODR context. > > (The term ODR applies to one *source* definition, not one optimized definition)
Reasonably Related Threads
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")