Gerolf Hoflehner via llvm-dev
2016-Feb-27 06:16 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
> On Feb 25, 2016, at 11:41 AM, James Y Knight via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> While we're talking about this, I'd just mention again that the same issue arises for *normal* functions too, when linked into a shared library:
> int foo() { return 1; }
> int bar() { return foo(); }
>
> Now, compare:
> clang -fPIC -O1 -S -o - test.c
> gcc -fPIC -O1 -S -o - test.c
>
> GCC will refuse to inline foo into bar, or use any information about foo in compiling bar, because foo is exported in the dynamic symbol table, and thus replaceable via symbol interposition.
>
> Clang assumes that you won't do that, or that you don't care what happens if you do. It will happily inline. And, in absence of inlining (e.g. if foo is too long to inline), clang will deduce function attributes about foo and rely on those in bar -- despite that the call goes through the PLT and could in fact be an entirely different unrelated implementation (or, for that matter, a differently-optimized version of the same implementation).
>
> Is that *really* okay?

+1 I agree. The problem goes deeper than just dealing with function attributes. The question is which optimizations are allowed under an OS-specific preemption model. Function attributes add a further need for clarification. I think at the heart of this difference are assumptions about the OS preemption model. Linux by default assumes that global data/functions are preemptable, so in your example, based on that model, foo could not be inlined. You should also see gp saves and restores around global calls for similar reasons, extra levels of indirection when loading global data, etc. An alternative model is to invert the default by requiring preemptable data/functions to be marked. This is the path that e.g. Windows has chosen with dllimport directives.
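One way to make the result independent of the preemption model is to route internal callers through a definition that cannot be interposed at all. The following is a minimal sketch in C, assuming an ELF-style shared library build; `foo_impl` is a hypothetical helper name that does not appear in the thread.

```c
/* Sketch only: foo_impl is a hypothetical name introduced for illustration. */

/* Internal linkage: the symbol is not placed in the dynamic symbol table,
 * so no other definition can replace it at load time. The compiler may
 * inline it or reason about its body freely. */
static int foo_impl(void) { return 1; }

/* Exported entry point. Under the default ELF model another library could
 * interpose this symbol, so a call to foo may reach a different body. */
int foo(void) { return foo_impl(); }

/* bar calls the non-interposable helper directly, so inlining foo_impl here
 * stays valid no matter which definition of foo wins at load time. */
int bar(void) { return foo_impl(); }
```

With this layout, whether bar sees the body of the helper no longer depends on whether the compiler honours interposition of foo. The cost is that a replacement foo supplied by another library is not picked up by bar.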
FWIW, my reading of available_externally is that although the function is/can be preempted, it can still be inlined, since the code of the external function will match the definition in the module. The question about the legality of other optimizations is similar to the question of which optimizations are allowed in which preemption model, even without the attribute. However, I don't have much experience with the function attributes.

-Gerolf

> On Wed, Feb 24, 2016 at 6:57 PM, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi all,
>
> This is something that came up in the "RFC: Add guard intrinsics to
> LLVM" thread; and while I'm not exactly blocked on this, figuring out
> a path forward here will be helpful in deciding if we can use the
> available_externally linkage type to express certain semantic
> properties guard intrinsics will have.
>
> Let's start with an example that shows that we have a problem (direct
> copy/paste from the guard intrinsics thread). Say we have:
>
> ```
> void foo() available_externally {
>   %t0 = load atomic %ptr
>   %t1 = load atomic %ptr
>   if (%t0 != %t1) print("X");
> }
> void main() {
>   foo();
>   print("Y");
> }
> ```
>
> The possible behaviors of the above program are {print("X"),
> print("Y")} or {print("Y")}.
> But if we run opt then we have
>
> ```
> void foo() available_externally readnone nounwind {
>   ;; After CSE'ing the two loads and folding the condition
> }
> void main() {
>   foo();
>   print("Y");
> }
> ```
>
> and some generic reordering
>
> ```
> void foo() available_externally readnone nounwind {
>   ;; After CSE'ing the two loads and folding the condition
> }
> void main() {
>   print("Y");
>   foo();  // legal since we're moving a readnone nounwind function that
>           // was guaranteed to execute (hence can't have UB)
> }
> ```
>
> If we do not inline @foo(), and instead re-link the call site in @main
> to some non-optimized copy (or differently optimized copy) of @foo,
> then it is possible for the program to have the behavior {print("Y");
> print ("X")}, which was disallowed in the earlier program.
>
> In other words, opt refined the semantics of @foo() (i.e. reduced the
> set of behaviors it may have) in ways that would make later
> optimizations invalid if we de-refine the implementation of @foo().
>
> The above example is clearly fabricated, but such cases can come up
> even if everything is optimized to the same level. E.g. one of the
> atomic loads in the unrefined implementation of @foo() could have been
> hidden behind a function call, whose body existed in only one module.
> That module would then be able to refine @foo() to `ret void` but
> other modules won't.
>
> The only solution I can think of is to redefine available_externally
> to mean "the only kind of IPO/IPA you can do over a call to this
> function is to inline it". Redefining available_externally this way
> will also let us soundly use it to represent calls to functions that
> have guard intrinsics, since a failed guard intrinsic basically
> replaces the function with a "very de-refined" implementation (the
> interpreter).
>
> What do you think?
> I don't think implementing the above will be
> very difficult, but needless to say, it will still be a fairly
> non-trivial semantic change (hence I'm not directly jumping to
> implementation).
>
> -- Sanjoy
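The refinement problem in the quoted example can be replayed deterministically. The sketch below is a C model, not LLVM IR: the "linker's choice" of foo is a function pointer, the racing atomic loads are a counter, and every name (`racy_load`, `run_original`, `run_reordered`, and so on) is invented for illustration.

```c
#include <string.h>

/* Hypothetical harness: every name below is invented for illustration. */

static char trace[8];        /* order in which "X" and "Y" were printed */
static size_t trace_len;
static int load_counter;

static void reset(void) {
    memset(trace, 0, sizeof trace);
    trace_len = 0;
    load_counter = 0;
}

static void emit(char c) {
    if (trace_len + 1 < sizeof trace)
        trace[trace_len++] = c;
}

/* Stand-in for "load atomic %ptr": two loads may observe different values
 * because another thread could write in between. A counter makes that
 * outcome deterministic here. */
static int racy_load(void) { return load_counter++; }

/* Unrefined copy of foo: two separate loads, so it may print "X". */
static void foo_unrefined(void) {
    int t0 = racy_load();
    int t1 = racy_load();
    if (t0 != t1) emit('X');
}

/* Refined copy: after CSE the condition folds away and the body is empty. */
static void foo_refined(void) { }

typedef void (*foo_fn)(void);

/* main as written: foo(); print("Y"); */
static const char *run_original(foo_fn foo) {
    reset();
    foo();
    emit('Y');
    return trace;
}

/* main after the call was moved below print("Y"), relying on the
 * readnone/nounwind facts deduced from the refined body. */
static const char *run_reordered(foo_fn foo) {
    reset();
    emit('Y');
    foo();
    return trace;
}
```

In this model `run_original` yields "XY" or "Y" depending on which copy is passed, matching the two allowed behaviors. `run_reordered(foo_unrefined)` yields "YX", the ordering the original program could never produce, which is the case that arises when the reordered caller is linked against an unrefined copy.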
Sanjoy Das via llvm-dev
2016-Feb-27 21:41 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
Just as a reality check, I wrote up a demonstration where one link order causes a SIGFPE and another doesn't (and the program is well defined, as far as I can tell). All TUs are compiled with -O3. This is also an instance where we don't actually speculate an inline function, but only DSE across it (after deducing readnone).

Here's the link: https://github.com/sanjoy/comdat-ipo

I've tested this with my system clang:

Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.3.0
Thread model: posix

I didn't test with ToT, since I don't have a build lying around.

-- Sanjoy
Xinliang David Li via llvm-dev
2016-Feb-28 00:21 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
So in this case, ptr[0] = 10 is propagated into one copy of maybe_devide (in source a), and ptr[0]=10 in caller_a is DSEed?

David

On Sat, Feb 27, 2016 at 1:41 PM, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Just as a reality check, I wrote up a demonstration where one link
> order causes a SIGFPE and another doesn't (and the program is well
> defined, as far as I can tell). All TUs are compiled with -O3. This is also
> an instance where we don't actually speculate an inline function, but only
> DSE across it (after deducing readnone).
>
> Here's the link https://github.com/sanjoy/comdat-ipo
>
> I've tested this with my system clang:
> Apple LLVM version 7.0.2 (clang-700.1.81)
> Target: x86_64-apple-darwin15.3.0
> Thread model: posix
>
> I didn't test with ToT, since I don't have a build lying around.
>
> -- Sanjoy
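If David's reading is right, the failure mode can be modelled roughly as follows. This is a hypothetical reconstruction in C and not the code from the comdat-ipo repository: the two copies of the callee are written out by hand, the linker's choice is a function pointer, and the trap is replaced by a sentinel return value so the sketch never actually raises SIGFPE. All function names are assumptions that only loosely echo the names in the question.

```c
/* Hypothetical reconstruction for illustration; this is NOT the code from
 * the comdat-ipo repository. */

/* Copy of the callee as the source might define it: reads ptr[0] and
 * divides by it. Returns -1 when the divisor is 0, standing in for the
 * SIGFPE the real division would raise. */
static int maybe_divide_unrefined(const int *ptr, int n) {
    if (ptr[0] == 0) return -1;   /* would trap in the real program */
    return n / ptr[0];
}

/* Copy after one TU propagated the known value ptr[0] == 10 into the body.
 * It no longer touches memory, so an optimizer could mark it readnone. */
static int maybe_divide_refined(const int *ptr, int n) {
    (void)ptr;
    return n / 10;
}

typedef int (*divide_fn)(const int *, int);

/* Caller as written: the store happens before the call. */
static int caller_with_store(divide_fn f) {
    int cell[1] = {0};
    cell[0] = 10;
    return f(cell, 100);
}

/* Caller after dead store elimination: the callee was believed readnone,
 * so the store to cell looked unobservable and was dropped. */
static int caller_after_dse(divide_fn f) {
    int cell[1] = {0};
    return f(cell, 100);
}
```

Three of the four pairings return 10. Only `caller_after_dse(maybe_divide_unrefined)` reaches the zero divisor, which corresponds to the link order in which the optimized caller ends up bound to the copy of the callee that still reads memory.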
Sanjoy Das via llvm-dev
2016-Feb-29 16:21 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Sat, Feb 27, 2016 at 1:41 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> Just as a reality check, I wrote up a demonstration where one link
> order causes a SIGFPE and another doesn't (and the program is well
> defined, as far as I can tell). All TUs are compiled with -O3. This is also
> an instance where we don't actually speculate an inline function, but only
> DSE across it (after deducing readnone).
>
> Here's the link https://github.com/sanjoy/comdat-ipo

This test case "works" with gcc 5.3.0 too, afaict. This is what I used:

Using built-in specs.
COLLECT_GCC=/usr/local/Cellar/gcc/5.3.0/bin/x86_64-apple-darwin15.0.0-g++-5
COLLECT_LTO_WRAPPER=/usr/local/Cellar/gcc/5.3.0/libexec/gcc/x86_64-apple-darwin15.0.0/5.3.0/lto-wrapper
Target: x86_64-apple-darwin15.0.0
Configured with: ../configure --build=x86_64-apple-darwin15.0.0 --prefix=/usr/local/Cellar/gcc/5.3.0 --libdir=/usr/local/Cellar/gcc/5.3.0/lib/gcc/5 --enable-languages=c,c++,objc,obj-c++,fortran --program-suffix=-5 --with-gmp=/usr/local/opt/gmp --with-mpfr=/usr/local/opt/mpfr --with-mpc=/usr/local/opt/libmpc --with-isl=/usr/local/opt/isl --with-system-zlib --enable-libstdcxx-time=yes --enable-stage1-checking --enable-checking=release --enable-lto --with-build-config=bootstrap-debug --disable-werror --with-pkgversion='Homebrew gcc 5.3.0' --with-bugurl=https://github.com/Homebrew/homebrew/issues --enable-plugin --disable-nls --enable-multilib
Thread model: posix
gcc version 5.3.0 (Homebrew gcc 5.3.0)

-- Sanjoy