Hal Finkel via llvm-dev
2016-Feb-29 15:50 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
----- Original Message -----
> From: "James Y Knight" <jyknight at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Sanjoy Das" <sanjoy at playingwithpointers.com>, "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Monday, February 29, 2016 9:31:24 AM
> Subject: Re: [llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
>
> On Feb 26, 2016 8:50 PM, "Hal Finkel" <hfinkel at anl.gov> wrote:
> > > From: "James Y Knight via llvm-dev" <llvm-dev at lists.llvm.org>
> > > To: "Sanjoy Das" <sanjoy at playingwithpointers.com>
> > > Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> > > Sent: Thursday, February 25, 2016 1:41:43 PM
> > > Subject: Re: [llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
> > >
> > > While we're talking about this, I'd just mention again that the same
> > > issue arises for *normal* functions too, when linked into a shared
> > > library:
> > >
> > >   int foo() { return 1; }
> > >   int bar() { return foo(); }
> > >
> > > Now, compare:
> > >
> > >   clang -fPIC -O1 -S -o - test.c
> > >   gcc -fPIC -O1 -S -o - test.c
> > >
> > > GCC will refuse to inline foo into bar, or use any information about
> > > foo in compiling bar, because foo is exported in the dynamic symbol
> > > table, and thus replaceable via symbol interposition.
> > >
> > > Clang assumes that you won't do that, or that you don't care what
> > > happens if you do. It will happily inline. And, in the absence of
> > > inlining (e.g. if foo is too long to inline), clang will deduce
> > > function attributes about foo and rely on those in bar -- despite
> > > the fact that the call goes through the PLT and could in fact be an
> > > entirely different unrelated implementation (or, for that matter, a
> > > differently-optimized version of the same implementation).
> > >
> > > Is that *really* okay?
> >
> > I'm comfortable with saying that symbol interposition falls outside
> > of the model we have for the targeted system (at least by default),
> > and thus, this is okay. We also don't model the possibility of
> > someone hex-editing the binary ;)
>
> I'm not really okay with it; the current behavior feels unprincipled.
>
> We have a visibility attribute which can be used to control this: On
> ELF systems, "default" visibility allows interposition (unlike on
> Darwin) -- that is, it explicitly ALLOWS for replacing the symbol's
> definition. The policy of "You can't replace the definition of the
> symbol, but it is globally visible" is exactly what the "protected"
> visibility mode is for.
>
> If we want to say that you can't interpose by default on ELF targets,
> that would be a choice. Then, we should make the default symbol
> visibility "protected" instead of "default". But, continuing to
> generate calls through the PLT -- which is only needed because the
> symbols might be replaced -- while simultaneously making
> optimizations that are broken if they actually ARE replaced, seems
> kinda bogus.

This makes sense, and I think you understand my concern here: Most programmers don't understand these issues, nor do they ever expect to use dynamic interposition. They do expect, however, that the compiler has good IPA and will use the information it is provided effectively. I'd be happy to make the default visibility protected, allowing us to continue optimizing well, and provide a principled behavior otherwise. Given that, as you point out, this is the default on Darwin, is there experience from Darwin porting, or any other factors, that would indicate this would be a hardship?

Thanks again,
Hal

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
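For readers following along, the two ELF visibility choices debated in this message can be sketched in C roughly as below. This is only an illustration under stated assumptions, not code from the thread: foo and bar come from the quoted example, while foo_protected and bar_protected are hypothetical names, and the visibility attribute is the GCC/Clang extension.

```c
/* test.c: sketch of default vs. protected visibility on an ELF target.
 * Intended to be built into a shared library, for example:
 *   cc -fPIC -O1 -shared test.c -o libtest.so
 */

/* Default visibility: exported, and on ELF the definition that a call
 * from bar() reaches may be replaced by another module at load time
 * (for example through LD_PRELOAD), unless the compiler has already
 * inlined it. */
int foo(void) { return 1; }

/* Protected visibility: still exported, but references made from inside
 * the defining shared library always bind to this definition.
 * (foo_protected is a hypothetical name used only for this sketch.) */
__attribute__((visibility("protected")))
int foo_protected(void) { return 1; }

/* Calls foo(); whether this is inlined or goes through the PLT is the
 * compiler behavior discussed in the thread. */
int bar(void) { return foo(); }

/* Calls the protected variant; inlining here is consistent with the
 * symbol's visibility, since it cannot be interposed. */
int bar_protected(void) { return foo_protected(); }
```

Comparing the assembly for bar and bar_protected from both compilers (with -S, as in the quoted commands) is one way to see the difference in practice.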
Hal Finkel via llvm-dev
2016-Nov-29 16:01 UTC
[llvm-dev] RFC: Add an "interposible" linkage type (and implement -fsemantic-interposition)
Hi everyone,

Clang/LLVM's support for ELF interposition is in a confusing state, and I propose making a few (hopefully simple) adjustments in order to bring our model into a self-consistent state.

The problem: On ELF systems, global symbols can be interposed. This means, for example, that calls to global functions in some (shared) library defined in that same library might end up being redirected to an implementation in some other library (or in the main executable). The most common reason for this is the use of LD_PRELOAD, but there are plenty of other ways to trigger interposition as well. As a result, it is technically inconsistent to inline any global function or do inter-procedural analysis on it, because the implementation might be replaced by code with completely different behavior at runtime (or link time). Clang has never supported this (i.e. we do treat these functions as being eligible for inlining and perform IPA on them). GCC, on the other hand, has traditionally respected the possibility of ELF interposition and refrained from doing these things (at least when compiling with -fPIC).

I believe that Clang/LLVM's current behavior is the most-useful behavior and we should keep it (at least as a default). I do understand, however, that there are valid use cases for ELF interposition and places where we should allow it (e.g. when compiling certain system libraries). GCC recently added a flag -fsemantic-interposition/-fno-semantic-interposition, where using -fno-semantic-interposition provides Clang/LLVM's behavior of assuming that ELF interposition will not be used.

It has been suggested that, to be self-consistent, LLVM should emit global symbols with protected ELF visibility in cases where we've assumed that ELF interposition won't happen. ELF protected visibility does seem to have exactly that meaning: A protected global symbol is externally visible but cannot be interposed. Unfortunately, as I understand it, on some major platforms (e.g. x86), protected-visibility symbols have a major flaw: Non-uniqueness of function pointers (i.e. the function pointer obtained to a function outside of the defining library might be different from the pointer obtained within the defining library). As a result, making this change might be practically prohibited (even if it makes sense in theory).

Proposal:

1. Add a new linkage type, interposible, which is like external except that isInterposableLinkage will return true (thus preventing inlining, IPA, etc.). This is similar to weak linkage, in a sense, except that such symbols are never discarded and are not marked as weak for linking, etc.

2. Add -fsemantic-interposition/-fno-semantic-interposition to Clang. Default to -fno-semantic-interposition, but when -fsemantic-interposition is used, use interposible linkage for all functions where external linkage might otherwise have been used.

Thoughts?

Some useful links:
http://hubicka.blogspot.com/2015/04/GCC5-IPA-LTO-news.html (the section on the -fno-semantic-interposition flag)
https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01671.html

On some issues with ELF protected-visibility symbols:
http://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/
http://www.airs.com/blog/archives/307
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19520

Thanks again,
Hal

P.S. For some previous discussion on this, see the February 2016 thread "Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")".

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
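As a rough illustration of what the proposed linkage would mean for a single function, the closest per-function approximation available in C today is probably the GCC/Clang weak attribute. The sketch below is an assumption-laden stand-in, not the proposed feature: get_limit and over_limit are hypothetical names, and, as the proposal itself notes, weak differs from the proposed linkage because the symbol really is marked weak for the linker.

```c
/* Sketch: approximating an "interposable" definition with the existing
 * GCC/Clang weak attribute. As far as I know, Clang lowers a weak C
 * definition to LLVM's (non-ODR) weak linkage, which LLVM treats as
 * interposable, so the body is not inlined or analyzed by callers.
 * Unlike the proposed linkage, the symbol is also weak at link time. */

/* Hypothetical tunable that another object or library may replace. */
__attribute__((weak)) int get_limit(void) { return 10; }

/* The caller cannot assume get_limit() returns 10: the optimizer has to
 * emit a real call, because a different definition may be chosen. */
int over_limit(int n) { return n > get_limit(); }
```

Under the proposal, compiling with -fsemantic-interposition would give this treatment to every function that would otherwise have external linkage, without the link-time weak semantics.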
Reid Kleckner via llvm-dev
2016-Nov-29 17:14 UTC
[llvm-dev] RFC: Add an "interposible" linkage type (and implement -fsemantic-interposition)
I think that all makes sense. You're just adding the missing non-ODR counterpart of 'external' linkage. I could imagine having "external / external_odr" linkage, for example.

That said, do you think we should take the opportunity to split out a bit for interposability so that we can kill off the *_odr linkage variants? Today's non-ODR weak functions would look more like this:

  define weak interposable void @foo() {
    ret void
  }

We could probably preserve bitcode compatibility by continuing to use the old combined linkage encoding. This is useful because we want the old weak to decode as weak+interposable and the old weak_odr to decode as weak.

Some more prior discussion: https://reviews.llvm.org/D19995#423481
Dave Bozier via llvm-dev
2016-Nov-29 17:58 UTC
[llvm-dev] RFC: Add an "interposible" linkage type (and implement -fsemantic-interposition)
Hi Hal,

How would this new option/linkage type differ from, say, using -fvisibility=hidden/protected? And how would it address ODR for protected/hidden symbols?

As well as inlining and IPA, there is also an enormous amount of bloat in the dynamic metadata of the linked ELF program. Some numbers are provided in this article about DSO export control with symbol visibility: https://gcc.gnu.org/wiki/Visibility
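The export-control approach mentioned here can be sketched in C as follows. This is a minimal illustration, assuming the GCC/Clang visibility attribute; API_EXPORT, helper and api_double_plus_one are hypothetical names chosen for the example.

```c
/* Sketch: controlling DSO exports with symbol visibility.
 * Typically built with -fvisibility=hidden so that hidden is the default
 * and only annotated entry points are exported, for example:
 *   cc -fPIC -fvisibility=hidden -shared lib.c -o libexample.so
 */

/* Hypothetical macro marking the public API of the library. */
#define API_EXPORT __attribute__((visibility("default")))

/* Hidden: not placed in the dynamic symbol table, so it cannot be
 * interposed and the compiler is free to inline it or analyze it. */
__attribute__((visibility("hidden")))
int helper(int x) { return x * 2; }

/* Exported entry point: on ELF it remains subject to interposition,
 * which is the case the RFC's new linkage type is about. */
API_EXPORT int api_double_plus_one(int x) { return helper(x) + 1; }
```

Hidden symbols sidestep the interposition question entirely, but only for functions that do not need to be visible outside the library; the RFC concerns the ones that do.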
James Y Knight via llvm-dev
2016-Nov-30 19:28 UTC
[llvm-dev] RFC: Add an "interposible" linkage type (and implement -fsemantic-interposition)
You present a good case for not using "protected" visibility in ELF, despite it being exactly what it is supposed to mean.>From what http://www.airs.com/blog/archives/307 says, it sounds like the*correctness* issue with protected visibility is because LLVM is doing it wrong -- not an intrinsic property of protected visibility in ELF, or even ELF/x86. The blog says that the compiler should be referencing through the GOT for the addressof operation on a function within the same shared library, thus keeping the address consistent within, and without. But it's not. We should at least fix that, I think? But, the more serious problem to me seems to be the issue with the dynamic link loader performing horribly. That performance penalty certainly means that defaulting to protected visibility would be a very bad plan for normal use-cases. So, yes, I concur that we need something new to indicate that our "default" is effectively "we'll treat this as if it were protected, but we'll keep it a secret from the linker, ssssh...", and also allow for the "usual" ELF default visibility behavior. You don't mention in your proposal anything about changing the existing behavior when the current-default -fno-semantic-interposition is in effect. But, I think we do need to change it. LLVM ought to be consistent with itself -- that part of my original argument remains just as valid. That is: if we're going to assume attributes of functions with visible definitions, based on their definitions, we should also be generating a direct call to the function symbol, and not go through the PLT. Because we can't use protected visibility, we are unable to tell the linker to make the same optimization when linking multiple objects, which is sad, but oh well... Although, actually, now I'm wondering, shouldn't we also be able to mark a declaration as definitely in the shared library? 
Something like:

  declare protected_but_dont_tell_the_linker i32 @f()

to indicate to LLVM that @f must be defined by another object inside the
same shared library -- but not in this file -- and that calls should be
made directly to the symbol, not through the PLT.

That is, we have:
- The default for llvm functions is that if the definition is visible, it
  is assumed to be the one that will be used (not interposable), and
  optimizations can assume properties of the function, and it can be called
  directly without going through the PLT. (But addressof still needs to be
  careful).
- If the definition is NOT visible, the default is to assume the symbol
  MIGHT be external to the shared object, and needs to go through PLT.

Hal's proposal adds the "interposable" linkage type, which makes sense on
definitions only (not declarations), and indicates that despite seeing a
definition, you can't assume the local version will be used.

We should also have a way to do the opposite for declarations, somehow.

...(after reading more of the thread)....oh looky, that's EXACTLY what
rafael's said in his patch in D20217, too. :)

On Tue, Nov 29, 2016 at 11:01 AM, Hal Finkel <hfinkel at anl.gov> wrote:

> Hi everyone,
>
> Clang/LLVM's support for ELF interposition is in a confusing state, and I
> propose making a few (hopefully simple) adjustments in order to bring our
> model into a self-consistent state.
>
> The problem: On ELF systems, global symbols can be interposed. This means,
> for example, that calls to global functions in some (shared) library
> defined in that same library might end up being redirected to an
> implementation in some other library (or in the main executable). The most
> common reason for this is the use of LD_PRELOAD, but there are plenty of
> other ways to trigger interposition as well.
> As a result, it is technically inconsistent to inline any global function
> or do inter-procedural analysis on them because the implementation might
> be replaced by code with completely different behavior at runtime (or link
> time). Clang has never supported this (i.e. we do treat these functions as
> being eligible for inlining and perform IPA on them). GCC, on the other
> hand, has traditionally respected the possibility of ELF interposition and
> refrained from doing these things (at least when compiling with -fPIC).
>
> I believe that Clang/LLVM's current behavior is the most-useful behavior
> and we should keep the current behavior (at least as a default). I do
> understand, however, that there are valid use cases for ELF interposition
> and places where we should allow it (e.g. when compiling certain system
> libraries). GCC recently added a flag
> -fsemantic-interposition/-fno-semantic-interposition, where using
> -fno-semantic-interposition provides Clang/LLVM's behavior of assuming
> that ELF interposition will not be used.
>
> It has been suggested that, to be self consistent, LLVM should emit global
> symbols with protected ELF visibility in cases where we've assumed that ELF
> interposition won't happen. ELF protected visibility does seem to have
> exactly that meaning: A protected global symbol is externally visible but
> cannot be interposed. Unfortunately, as I understand it, on some major
> platforms (e.g. x86), protected-visibility symbols have a major flaw:
> Non-uniqueness of function pointers (i.e. the function pointer obtained to
> a function outside of the defining library might be different from the
> pointer obtained within the defining library). As a result, making this
> change might be practically prohibited (even if it makes sense in theory).
>
> Proposal:
>
> 1. Add a new linkage type, interposible, which is like external except
> that isInterposableLinkage will return true (thus preventing inlining, IPA,
> etc.).
> This is similar to weak linkage, in a sense, except that such
> symbols are never discarded and are not marked as weak for linking, etc.
>
> 2. Add -fsemantic-interposition/-fno-semantic-interposition to Clang.
> Default to -fno-semantic-interposition, but when -fsemantic-interposition
> is used, use interposible linkage for all functions where external linkage
> might otherwise have been used.
>
> Thoughts?
>
> Some useful links:
> http://hubicka.blogspot.com/2015/04/GCC5-IPA-LTO-news.html (the section
> on the -fno-semantic-interposition flag)
> https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01671.html
>
> On some issues with ELF protected-visibility symbols:
> http://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/
> http://www.airs.com/blog/archives/307
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19520
>
> Thanks again,
> Hal
>
> P.S. For some previous discussion on this, see below...
>
> ------------------------------
>
> *From: *"Hal Finkel via llvm-dev" <llvm-dev at lists.llvm.org>
> *To: *"James Y Knight" <jyknight at google.com>
> *Cc: *"llvm-dev" <llvm-dev at lists.llvm.org>
> *Sent: *Monday, February 29, 2016 9:50:15 AM
> *Subject: *Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
>
>
> ------------------------------
>
> *From: *"James Y Knight" <jyknight at google.com>
> *To: *"Hal Finkel" <hfinkel at anl.gov>
> *Cc: *"Sanjoy Das" <sanjoy at playingwithpointers.com>, "llvm-dev" <
> llvm-dev at lists.llvm.org>
> *Sent: *Monday, February 29, 2016 9:31:24 AM
> *Subject: *Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
>
> On Feb 26, 2016 8:50 PM, "Hal Finkel" <hfinkel at anl.gov> wrote:
>
>> *From: *"James Y Knight via llvm-dev" <llvm-dev at lists.llvm.org>
>> *To: *"Sanjoy Das" <sanjoy at playingwithpointers.com>
>> *Cc: *"llvm-dev" <llvm-dev at lists.llvm.org>
>> *Sent: *Thursday, February 25, 2016 1:41:43 PM
>> *Subject: *Re: [llvm-dev] Possible soundness issue with
>> available_externally (split from "RFC: Add guard intrinsics")
>>
>> While we're talking about this, I'd just mention again that the same
>> issue arises for *normal* functions too, when linked into a shared
>> library:
>> int foo() { return 1; }
>> int bar() { return foo(); }
>>
>> Now, compare:
>> clang -fPIC -O1 -S -o - test.c
>> gcc -fPIC -O1 -S -o - test.c
>>
>> GCC will refuse to inline foo into bar, or use any information about foo
>> in compiling bar, because foo is exported in the dynamic symbol table, and
>> thus replaceable via symbol interposition.
>>
>> Clang assumes that you won't do that, or that you don't care what happens
>> if you do. It will happily inline. And, in absence of inlining (e.g. if
>> foo is too long to inline), clang will deduce function attributes about
>> foo and rely on those in bar -- despite that the call goes through the PLT
>> and could in fact be an entirely different unrelated implementation (or,
>> for that matter, a differently-optimized version of the same
>> implementation).
>>
>> Is that *really* okay?
>>
>> I'm comfortable with saying that symbol interposition falls outside of
>> the model we have for the targeted system (at least by default), and thus,
>> this is okay. We also don't model the possibility of someone hex-editing
>> the binary ;)
>>
>
> I'm not really okay with it; the current behavior feels unprincipled.
>
> We have a visibility attribute which can be used to control this: On ELF
> systems, "default" visibility allows interposition (unlike on Darwin) --
> that is, it explicitly ALLOWS for replacing the symbol's definition. The
> policy of "You can't replace the definition of the symbol, but it is
> globally visible" is exactly what the "protected" visibility mode is for.
>
> If we want to say that you can't interpose by default on ELF targets, that
> would be a choice.
> Then, we should make the default symbol visibility
> "protected" instead of "default". But, continuing to generate calls through
> the PLT -- which is only needed because the symbols might be replaced --
> while simultaneously making optimizations that are broken if they actually
> ARE replaced, seems kinda bogus.
>
> This makes sense, and I think you understand my concern here: Most
> programmers don't understand these issues, nor do they ever expect to use
> dynamic interposition. They do expect, however, that the compiler has good
> IPA and will use the information it is provided effectively. I'd be happy
> to make the default visibility protected, allowing us to continue
> optimizing well, and provide a principled behavior otherwise. Given, as you
> point out, this is the default on Darwin, is there experience from Darwin
> porting, or any other factors, that would indicate this would be a
> hardship?
>
> Thanks again,
> Hal
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
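
To make the inconsistency in this thread concrete, here is a minimal single-file sketch in C. It is only a model, not real ELF interposition: a function pointer (foo_slot) stands in for the PLT/GOT slot that the dynamic linker would bind, and every name other than foo and bar from the thread's example is hypothetical. It contrasts a bar that always calls through the slot (the behavior GCC preserves with -fPIC) with a bar into which the local body of foo has been folded (what an optimizer does when it assumes the visible definition is the one used).

```c
#include <assert.h>
#include <stdio.h>

/* The definition of foo visible in "our" library (from the thread's example). */
int foo_local(void) { return 1; }

/* Hypothetical replacement, standing in for a definition supplied by
   another object, e.g. one loaded via LD_PRELOAD. */
int foo_preloaded(void) { return 42; }

/* Assumption: this pointer models the PLT/GOT slot for foo. In a real
   process the dynamic linker, not the program, decides what it points to. */
int (*foo_slot)(void) = foo_local;

/* bar compiled without assuming anything about foo: always call via the slot. */
int bar_through_slot(void) { return foo_slot(); }

/* bar compiled assuming the visible definition is used: foo_local's body
   (return 1) has been folded in, so the slot is never consulted. */
int bar_inlined(void) { return 1; }

/* Model of interposition: rebind the slot to another implementation. */
void interpose_foo(int (*replacement)(void)) { foo_slot = replacement; }

/* Returns 1 if both versions of bar give the same answer, 0 otherwise. */
int bars_agree(void) { return bar_through_slot() == bar_inlined(); }

/* Prints the two results before and after the simulated interposition. */
void report(void) {
    assert(bars_agree()); /* no interposition yet: both versions match */
    printf("before: slot=%d inlined=%d\n", bar_through_slot(), bar_inlined());
    interpose_foo(foo_preloaded);
    printf("after:  slot=%d inlined=%d\n", bar_through_slot(), bar_inlined());
}
```

Calling report() from a main function shows the two versions agreeing until the slot is rebound, after which only bar_through_slot observes the replacement. That is the sense in which optimizing on the definition of foo while still emitting a call through the PLT is self-inconsistent: other callers in the same library that were not optimized would see the interposed foo, while the optimized ones would not.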