Hal Finkel via llvm-dev
2016-Nov-29 18:02 UTC
[llvm-dev] RFC: Add an "interposible" linkage type (and implement -fsemantic-interposition)
----- Original Message -----> From: "Eric Christopher" <echristo at gmail.com> > To: "Reid Kleckner" <rnk at google.com>, "Hal Finkel" <hfinkel at anl.gov> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org> > Sent: Tuesday, November 29, 2016 11:34:56 AM > Subject: Re: [llvm-dev] RFC: Add an "interposible" linkage type (and > implement -fsemantic-interposition)> On Tue, Nov 29, 2016 at 9:15 AM Reid Kleckner via llvm-dev < > llvm-dev at lists.llvm.org > wrote:> > I think that all makes sense. You're just adding the missing > > non-ODR > > conterpart of 'external' linkage. I could imagine having "external > > / > > external_odr" linkage for example. >> > That said, do you think we should take the opportunity to split out > > a > > bit for interposability so that we can kill off the *_odr linkage > > variants? Today's non-ODR weak functions would look more like this: > > > define weak interposable void @foo() { ret void } >> > We could probably preserve bitcode compatibility by continuing to > > use > > the old combined linkage encoding. This is useful because we want > > the old weak to decode as weak+interposable and the old weak_odr to > > decode as weak. >> > Some more prior discussion: https://reviews.llvm.org/D19995#423481 >> +1So we're all on the same page, our current encodings are: External: 0 WeakAny: 16 Appending: 2 Internal: 3 LinkOnceAny: 18 ExternalWeak: 7 Common: 8 Private: 9 WeakODR: 17 LinkOnceODR: 19 AvailableExternally: 12 And the new encodings would be: External: 0 External + Interposible: 20 Weak + Interposible: 16 Appending: 2 [interposible N/A] Internal: 3 [never interposible] LinkOnce + Interposible: 18 ExternalWeak: 7 [always interposible] Common: 8 [interposible N/A] Private: 9 [never interposible] Weak: 17 LinkOnce: 19 AvailableExternally: 12 [never interposible] With the mappings: [Old] -> [New] External -> External WeakAny -> Weak + Interposible LinkOnceAny -> LinkOnce + Interposible WeakODR -> Weak LinkOnceODR -> LinkOnce Is this what you had in mind? Thanks again, Hal> -eric> > On Tue, Nov 29, 2016 at 8:01 AM, Hal Finkel via llvm-dev < > > llvm-dev at lists.llvm.org > wrote: >> > > Hi everyone, > > >> > > Clang/LLVM's support for ELF interposition is in a confusing > > > state, > > > and I propose making a few (hopefully simple) adjustments in > > > order > > > to bring our model into a self-consistent state. > > >> > > The problem: On ELF systems, global symbols can be interposed. > > > This > > > means, for example, that calls to global functions in some > > > (shared) > > > library defined in that same library might end up being > > > redirected > > > to an implementation in some other library (or in the main > > > executable). The most common reason for this is the use of > > > LD_PRELOAD, but there are plenty of other ways to trigger > > > interposition as well. As a result, it is technically > > > inconsistent > > > to inline any global function or do inter-procedural analysis on > > > them because the implementation might be replaced by code with > > > completely different behavior at runtme (or link time). Clang has > > > never supported this (i.e. we do treat these functions as being > > > eligible for inlining and perform IPA on them). GCC, on the other > > > hand, has traditionally respected the possibility of ELF > > > interposition and refrained from doing these things (at least > > > when > > > compiling with -fPIC). > > >> > > I believe that Clang/LLVM's current behavior is the most-useful > > > behavior and we should keep the current behavior (at least as a > > > default). I do understand, however, that there are valid use > > > cases > > > for ELF interposition and places where we should allow it (e.g. > > > when > > > compiling certain system libraries). GCC recently added a flag > > > -fsemantic-interposition/-fno-semantic-interposition, where using > > > -fno-semantic-interposition provides Clang/LLVM's behavior of > > > assuming that ELF interposition will not be used. > > >> > > It has been suggested that, to be self consistent, LLVM should > > > emit > > > global symbols with protected ELF visibility in cases where we've > > > assumed that ELF interposition won't happen. ELF protected > > > visibility does seem to have exactly that meaning: A protected > > > global symbol is externally visible but cannot be interposed. > > > Unfortunately, as I understand it, on some major platforms (e.g. > > > x86), protected-visibility symbols have a major flaw: > > > Non-uniqueness > > > of function pointers (i.e. the function pointer obtained to a > > > function outside of the defining library might be different from > > > the > > > pointer obtained within the defining library). As a result, > > > making > > > this change might be practically prohibited (even if it makes > > > sense > > > in theory). > > >> > > Proposal: > > >> > > 1. Add a new linkage type, interposible, which is like external > > > except that isInterposableLinkage will return true (thus > > > preventing > > > inlining, IPA, etc.). This is similar to weak linkage, in a > > > sense, > > > except that such symbols are never discarded and are not marked > > > as > > > weak for linking, etc. > > >> > > 2. Add -fsemantic-interposition/-fno-semantic-interposition to > > > Clang. > > > Default to -fno-semantic-interposition, but when > > > -fsemantic-interposition is used, use interposible linkage for > > > all > > > functions where external linkage might otherwise have been used. > > >> > > Thoughts? > > >> > > Some useful links: > > > > > > http://hubicka.blogspot.com/2015/04/GCC5-IPA-LTO-news.html (the > > > section on the -fno-semantic-interposition flag) > > > > > > https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01671.html > > >> > > On some issues with ELF protected-visibility symbols: > > > > > > http://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/ > > > > > > http://www.airs.com/blog/archives/307 > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19520 > > >> > > Thanks again, > > > > > > Hal > > >> > > P.S. For some previous discussion on this, see below... > > >> > > > From: "Hal Finkel via llvm-dev" < llvm-dev at lists.llvm.org > > > > > > > > > > > To: "James Y Knight" < jyknight at google.com > > > > > > > > > > > Cc: "llvm-dev" < llvm-dev at lists.llvm.org > > > > > > > > > > > Sent: Monday, February 29, 2016 9:50:15 AM > > > > > > > > > > Subject: Re: [llvm-dev] Possible soundness issue with > > > > available_externally (split from "RFC: Add guard intrinsics") > > > > > >> > > > > From: "James Y Knight" < jyknight at google.com > > > > > > > > > > > > > > > > To: "Hal Finkel" < hfinkel at anl.gov > > > > > > > > > > > > > > > > Cc: "Sanjoy Das" < sanjoy at playingwithpointers.com >, > > > > > "llvm-dev" > > > > > < > > > > > llvm-dev at lists.llvm.org > > > > > > > > > > > > > > > > Sent: Monday, February 29, 2016 9:31:24 AM > > > > > > > > > > > > > > > Subject: Re: [llvm-dev] Possible soundness issue with > > > > > available_externally (split from "RFC: Add guard intrinsics") > > > > > > > > > >> > > > > On Feb 26, 2016 8:50 PM, "Hal Finkel" < hfinkel at anl.gov > > > > > > wrote: > > > > > > > > > >> > > > > > > From: "James Y Knight via llvm-dev" < > > > > > > > llvm-dev at lists.llvm.org > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > To: "Sanjoy Das" < sanjoy at playingwithpointers.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cc: "llvm-dev" < llvm-dev at lists.llvm.org > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Sent: Thursday, February 25, 2016 1:41:43 PM > > > > > > > > > > > > > > > > > > > > > > > > > > > > Subject: Re: [llvm-dev] Possible soundness issue with > > > > > > > available_externally (split from "RFC: Add guard > > > > > > > intrinsics") > > > > > > > > > > > > > > > > > > > > >> > > > > > > While we're talking about this, I'd just mention again > > > > > > > that > > > > > > > the > > > > > > > same > > > > > > > issue arises for *normal* functions too, when linked into > > > > > > > a > > > > > > > shared > > > > > > > library: > > > > > > > > > > > > > > > > > > > > > > > > > > > > int foo() { return 1; } > > > > > > > > > > > > > > > > > > > > >> > > > > > > int bar() { return foo(); } > > > > > > > > > > > > > > > > > > > > >> > > > > > > Now, compare: > > > > > > > > > > > > > > > > > > > > > > > > > > > > clang -fPIC -O1 -S -o - test.c > > > > > > > > > > > > > > > > > > > > >> > > > > > > gcc -fPIC -O1 -S -o - test.c > > > > > > > > > > > > > > > > > > > > >> > > > > > > GCC will refuse to inline foo into bar, or use any > > > > > > > information > > > > > > > about > > > > > > > foo in compiling bar, because foo is exported in the > > > > > > > dynamic > > > > > > > symbol > > > > > > > table, and thus replaceable via symbol interposition. > > > > > > > > > > > > > > > > > > > > >> > > > > > > Clang assumes that you won't do that, or that you don't > > > > > > > care > > > > > > > what > > > > > > > happens if you do. It will happily inline. And, in > > > > > > > absense > > > > > > > of > > > > > > > inlining (e.g. if foo is too long to inline), clang will > > > > > > > deduce > > > > > > > function attributes about foo and rely on those in bar -- > > > > > > > despite > > > > > > > that the call goes through the PLT and could in fact be > > > > > > > an > > > > > > > entirely > > > > > > > different unrelated implementation (or, for that matter, > > > > > > > a > > > > > > > differently-optimized version of the same > > > > > > > implementation). > > > > > > > > > > > > > > > > > > > > >> > > > > > > Is that *really* okay? > > > > > > > > > > > > > > > > > > > > >> > > > > > I'm comfortable with saying that symbol interposition falls > > > > > > outside > > > > > > of the model we have for the targeted system (at least by > > > > > > default), > > > > > > and thus, this is okay. We also don't model the possibility > > > > > > of > > > > > > someone hex-editing the binary ;) > > > > > > > > > > > > > > > > > > > > I'm not really okay with it; the current behavior feels > > > > > unprincipled. > > > > > > > > > > > > > > > We have a visibility attribute which can be used to control > > > > > this: > > > > > On > > > > > ELF systems, "default" visibililty allows interposition > > > > > (unlike > > > > > on > > > > > Darwin) -- that is, it explicitly ALLOWS for replacing the > > > > > symbol's > > > > > definition. The policy of "You can't replace the definition > > > > > of > > > > > the > > > > > symbol, but it is globally visible" is exactly what the > > > > > "protected" > > > > > visibility mode is for. > > > > > > > > > >> > > > > If we want to say that you can't interpose by default on ELF > > > > > targets, > > > > > that would be a choice. Then, we should make the default > > > > > symbol > > > > > visibility "protected" instead of "default". But, continuing > > > > > to > > > > > generate calls through the PLT -- which is only needed > > > > > because > > > > > the > > > > > symbols might be replaced -- while simultaneously making > > > > > optimizations that are broken if they actually ARE replaced, > > > > > seems > > > > > kinda bogus. > > > > > > > > > > > > > > This makes sense, and I think you understand my concern here: > > > > Most > > > > programmers don't understand these issues, nor do they ever > > > > expect > > > > to use dynamic interposition. They do expect, however, that the > > > > compiler has good IPA and will use the information it is > > > > provided > > > > effectively. I'd be happy to make the default visibility > > > > protected, > > > > allowing us to continue optimizing well, and provide a > > > > principled > > > > behavior otherwise. Given, as you point out, this is the > > > > default > > > > on > > > > Darwin, is there experience from Darwin porting, or any other > > > > factors, that would indicate this would be a hardship? > > > > > >> > > > Thanks again, > > > > > > > > > > Hal > > > > > >> > > > -- > > > > > >> > > > Hal Finkel > > > > > > > > > > Assistant Computational Scientist > > > > > > > > > > Leadership Computing Facility > > > > > > > > > > Argonne National Laboratory > > > > > >> > > > _______________________________________________ > > > > > > > > > > LLVM Developers mailing list > > > > > > > > > > llvm-dev at lists.llvm.org > > > > > > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > > >> > > -- > > >> > > Hal Finkel > > > > > > Lead, Compiler Technology and Programming Languages > > > > > > Leadership Computing Facility > > > > > > Argonne National Laboratory > > >> > > _______________________________________________ > > > > > > LLVM Developers mailing list > > > > > > llvm-dev at lists.llvm.org > > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > >> > _______________________________________________ > > > LLVM Developers mailing list > > > llvm-dev at lists.llvm.org > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161129/3029d82a/attachment-0001.html>
Reid Kleckner via llvm-dev
2016-Nov-29 18:37 UTC
[llvm-dev] RFC: Add an "interposible" linkage type (and implement -fsemantic-interposition)
On Tue, Nov 29, 2016 at 10:02 AM, Hal Finkel <hfinkel at anl.gov> wrote:> > So we're all on the same page, our current encodings are: > > External: 0 > WeakAny: 16 > Appending: 2 > Internal: 3 > LinkOnceAny: 18 > ExternalWeak: 7 > Common: 8 > Private: 9 > WeakODR: 17 > LinkOnceODR: 19 > AvailableExternally: 12 > > And the new encodings would be: > > External: 0 > External + Interposible: 20 > Weak + Interposible: 16 > Appending: 2 [interposible N/A] > Internal: 3 [never interposible] > LinkOnce + Interposible: 18 > ExternalWeak: 7 [always interposible] > Common: 8 [interposible N/A] > Private: 9 [never interposible] > Weak: 17 > LinkOnce: 19 > AvailableExternally: 12 [never interposible] > > With the mappings: > > [Old] -> [New] > External -> External > WeakAny -> Weak + Interposible > LinkOnceAny -> LinkOnce + Interposible > WeakODR -> Weak > LinkOnceODR -> LinkOnce > > Is this what you had in mind? > > Thanks again, > Hal >Exactly! The verifier should check that users aren't making invalid things like interposable internal/available_externally functions. I'm not exactly sure when ExternalWeak applies and how we should model its interposability. I guess it would always be interposable to avoid special cases. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161129/225b7af7/attachment.html>
Hal Finkel via llvm-dev
2016-Nov-30 18:22 UTC
[llvm-dev] RFC: Add an "interposible" linkage type (and implement -fsemantic-interposition)
Hi Reid, Eric, I'd like to come back to this idea of splitting out interposible into its own bit separate from the base linkage type. I'm still not convinced that I see any structural advantage to doing this... Generally, the advantage of this kind of splitting is orthogonality, but there isn't much to be gained: We currently have four interposible linkage types (out of 11), and after this change we'll have five. For the others, we need to add validation code and diagnostics to disallow the forbidden combinations. Plus, two of these must be interposible (which means more validation code and diagnostics). On top of that, we'll lose an available subclass bit in GlobalValue: We currently use 4 bits to store the linkage type (as there are 11 of them). Adding one will fit in the same number of bits. But only two get removed if we factor out the interposibility, so we still need 4 bits for the base linkage type plus one more for the interposible bit. In short, because so few of the potential combinations are permitted, separating out interposibility will make our encoding less efficient. For backwards compatibility at the bitcode level, we'd need to do this anyway (i.e. explicitly represent the combined value set in one unified encoding space). In addition, there's engineering effort involved in updating APIs, etc. (mostly for me, but for everyone else too). I'm certainly willing to do that if we want to go down this route. Maybe I'm just missing some bigger-picture advantages. Thanks again, Hal ----- Original Message -----> From: "Reid Kleckner" <rnk at google.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Eric Christopher" <echristo at gmail.com>, "llvm-dev" > <llvm-dev at lists.llvm.org> > Sent: Tuesday, November 29, 2016 12:37:32 PM > Subject: Re: [llvm-dev] RFC: Add an "interposible" linkage type (and > implement -fsemantic-interposition)> On Tue, Nov 29, 2016 at 10:02 AM, Hal Finkel < hfinkel at anl.gov > > wrote: > > So we're all on the same page, our current encodings are: >> > External: 0 > > > WeakAny: 16 > > > Appending: 2 > > > Internal: 3 > > > LinkOnceAny: 18 > > > ExternalWeak: 7 > > > Common: 8 > > > Private: 9 > > > WeakODR: 17 > > > LinkOnceODR: 19 > > > AvailableExternally: 12 >> > And the new encodings would be: >> > External: 0 > > > External + Interposible: 20 > > > Weak + Interposible: 16 > > > Appending: 2 [interposible N/A] > > > Internal: 3 [never interposible] > > > LinkOnce + Interposible: 18 > > > ExternalWeak: 7 [always interposible] > > > Common: 8 [interposible N/A] > > > Private: 9 [never interposible] > > > Weak: 17 > > > LinkOnce: 19 > > > AvailableExternally: 12 [never interposible] >> > With the mappings: >> > [Old] -> [New] > > > External -> External > > > WeakAny -> Weak + Interposible > > > LinkOnceAny -> LinkOnce + Interposible > > > WeakODR -> Weak > > > LinkOnceODR -> LinkOnce >> > Is this what you had in mind? >> > Thanks again, > > > Hal > > Exactly!> The verifier should check that users aren't making invalid things > like interposable internal/available_externally functions.> I'm not exactly sure when ExternalWeak applies and how we should > model its interposability. I guess it would always be interposable > to avoid special cases.-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161130/22466d05/attachment.html>