Saleem Abdulrasool via llvm-dev
2018-Jan-10 05:17 UTC
[llvm-dev] Linker Option support for ELF
> On Jan 9, 2018, at 2:32 AM, James Henderson <jh7370.2008 at my.bristol.ac.uk> wrote:
>
> By my understanding, the proposal is that the user input looks something like:
>
> #pragma linker_directive("lib", "m")
>
> which is passed essentially as is to the linker, i.e. the directive payload is a pair of strings, the first being "lib", the second "m". It is then up to the linker to decide if it supports "lib" directives, and translate that into the corresponding command, or do something else if it doesn't support it, e.g. emit an error or ignore it. Neither the backend nor the frontend need to do any conversion, and it makes it much clearer that these aren't like any other command-line option (due to the limited set that can be passed through, and the possible differences in behaviour, such as order on the command-line).
>
> As an aside, this approach would aid portability to linkers that aren't command-line compatible with one another. If, for example, the switch is "-lib" on linker A, and "-l" on linker B, the user doesn't have to have different source code for each different linker their object is to be consumed by, nor does the compiler have to know about the different possible linkers.

Okay, that seems reasonable enough. I think I still like the simplicity of having it just be like a response file, but this approach isn’t too bad for adding custom extensions to. I do like that it gives greater portability to different linkers.

The last bit that still remains unanswered is the encoding. Do we go forward with the abuse of ELF notes, or is there enough confidence that adding a new section type shouldn’t break compatibility with older linkers?

> James
>
> On 9 January 2018 at 05:15, Saleem Abdulrasool <compnerd at compnerd.org> wrote:
>
> > On Jan 7, 2018, at 5:02 PM, Cary Coutant <ccoutant at gmail.com> wrote:
> >
> >> I think we all agree that blindly allowing the linker to honor the options
> >> would be scary. I agree that we should whitelist the options, and am of the
> >> opinion that we should force validation on the linker side (use of any
> >> option which the linker doesn't support in this form can be fatal).
> >> Starting small is the best way, with `-l` and `-L` as a starting point. I
> >> want to retain the ability to add additional options which may not be
> >> available in all linkers. However, whitelisting obviously requires working
> >> with the linker as would adding such options, so that could be handled at
> >> that time.
> >
> > This is actually why I'd prefer a new "language" over just
> > whitelisting options. With "lib", "file", and "path", as I suggested,
> > there's no question whether an option like "-no-pie" is supported, and
> > no temptation to even try. The new language should be tailored for
> > process-to-process communication, rather than user-to-shell
> > communication.
>
> I suppose I am slightly confused about what you are proposing here now. From the user side, it would be something to that effect. I’m suggesting that we take lib and transform it to a different representation in the frontend. Taking an explicit example:
>
> The user input
> `#pragma comment(lib, "m")`
>
> would get transformed to `-lm` in the object file encoding. However, this would still not permit you to inject arbitrary options, and the backend doesn’t change for any new option, only the frontend and the linker.
>
> >> I’m thinking about future enhancements. MachO does actually provide
> >> something like `-L` -`l` in a single go via `-framework`. But, no such
> >> option exists for ELF since it doesn’t have the concept of framework bundles
> >> (but the layout itself is interesting), and I just want to try to keep the
> >> door open for such features.
> >
> > This is why I also included "path" in my suggestion. I imagine
> > something very much like -framework, where include files and library
> > search paths are handled together.
> >
> > -cary
> >
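To make the two encodings being discussed concrete, here is a rough sketch; the pragma spelling, variable names, and byte layouts are purely illustrative and not a settled format (with Clang the MSVC-style pragma is only accepted when MS extensions are enabled):

    /* User source; the linker_directive spelling floated in this thread
       would carry the same information. */
    #pragma comment(lib, "m")

    /* Encoding A: the frontend pre-translates the request into a literal
       option string that the object file carries verbatim. */
    static const char option_style[] = "-lm";

    /* Encoding B: the frontend records the (directive, value) pair
       untouched as two NUL-terminated strings; the consuming linker maps
       "lib" onto -lm, m.lib, or whatever it understands. */
    static const char directive_style[] = "lib\0m";

The second form is what James describes: the linker, not the compiler, decides how a "lib" directive maps onto its own option syntax, or rejects it outright.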
Rafael Avila de Espindola via llvm-dev
2018-Jan-10 17:04 UTC
[llvm-dev] Linker Option support for ELF
Saleem Abdulrasool via llvm-dev <llvm-dev at lists.llvm.org> writes:

>> On Jan 9, 2018, at 2:32 AM, James Henderson <jh7370.2008 at my.bristol.ac.uk> wrote:
>>
>> By my understanding, the proposal is that the user input looks something like:
>>
>> #pragma linker_directive("lib", "m")
>>
>> which is passed essentially as is to the linker, i.e. the directive payload is a pair of strings, the first being "lib", the second "m". It is then up to the linker to decide if it supports "lib" directives, and translate that into the corresponding command, or do something else if it doesn't support it, e.g. emit an error or ignore it. Neither the backend nor the frontend need to do any conversion, and it makes it much clearer that these aren't like any other command-line option (due to the limited set that can be passed through, and the possible differences in behaviour, such as order on the command-line).
>>
>> As an aside, this approach would aid portability to linkers that aren't command-line compatible with one another. If, for example, the switch is "-lib" on linker A, and "-l" on linker B, the user doesn't have to have different source code for each different linker their object is to be consumed by, nor does the compiler have to know about the different possible linkers.
>
> Okay, that seems reasonable enough. I think I still like the simplicity of having it just be like a response file, but this approach isn’t too bad for adding custom extensions to. I do like that it gives greater portability to different linkers.
>
> The last bit that still remains unanswered is the encoding. Do we go forward with the abuse of ELF notes, or is there enough confidence that adding a new section type shouldn’t break compatibility with older linkers?

Given how ELF works I would expect an unknown section to simply end up
in the output, but we can use SHF_EXCLUDE to avoid that.

So +1 for a new section type.

Cheers,
Rafael
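For illustration, a minimal sketch of how a consuming linker might classify such a section; SHT_LINKER_OPTIONS is a placeholder value rather than an assigned type, while SHF_EXCLUDE is the existing flag that keeps a section out of a final link:

    #include <cstdint>

    // Placeholder: a value somewhere in the OS-specific range would have
    // to be registered for the new section type.
    constexpr uint32_t SHT_LINKER_OPTIONS = 0x6fffff01;
    constexpr uint64_t SHF_EXCLUDE        = 0x80000000; // existing flag

    struct SectionHeader {   // only the fields this sketch looks at
      uint32_t sh_type;
      uint64_t sh_flags;
    };

    enum class SectionAction { ParseDirectives, CopyToOutput, Discard };

    // A linker that knows the new type parses it; an older linker sees an
    // unknown type and, because the section is marked SHF_EXCLUDE, drops
    // it instead of copying it into the output.
    SectionAction classify(const SectionHeader &sh) {
      if (sh.sh_type == SHT_LINKER_OPTIONS)
        return SectionAction::ParseDirectives;
      if (sh.sh_flags & SHF_EXCLUDE)
        return SectionAction::Discard;
      return SectionAction::CopyToOutput;
    }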
On Wed, Jan 10, 2018 at 9:04 AM, Rafael Avila de Espindola via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Saleem Abdulrasool via llvm-dev <llvm-dev at lists.llvm.org> writes:
>
> >> On Jan 9, 2018, at 2:32 AM, James Henderson <jh7370.2008 at my.bristol.ac.uk> wrote:
> >>
> >> By my understanding, the proposal is that the user input looks something like:
> >>
> >> #pragma linker_directive("lib", "m")
> >>
> >> which is passed essentially as is to the linker, i.e. the directive payload is a pair of strings, the first being "lib", the second "m". It is then up to the linker to decide if it supports "lib" directives, and translate that into the corresponding command, or do something else if it doesn't support it, e.g. emit an error or ignore it. Neither the backend nor the frontend need to do any conversion, and it makes it much clearer that these aren't like any other command-line option (due to the limited set that can be passed through, and the possible differences in behaviour, such as order on the command-line).
> >>
> >> As an aside, this approach would aid portability to linkers that aren't command-line compatible with one another. If, for example, the switch is "-lib" on linker A, and "-l" on linker B, the user doesn't have to have different source code for each different linker their object is to be consumed by, nor does the compiler have to know about the different possible linkers.
> >
> > Okay, that seems reasonable enough. I think I still like the simplicity of having it just be like a response file, but this approach isn’t too bad for adding custom extensions to. I do like that it gives greater portability to different linkers.
> >
> > The last bit that still remains unanswered is the encoding. Do we go forward with the abuse of ELF notes, or is there enough confidence that adding a new section type shouldn’t break compatibility with older linkers?
>
> Given how ELF works I would expect an unknown section to simply end up
> in the output, but we can use SHF_EXCLUDE to avoid that.
>
> So +1 for a new section type.

+1

I feel .note sections (in particular the .note.gnu namespace) are used too casually. Once this proposal is accepted and implemented, we'll live with that until ELF dies, so defining a new section type and naming the section ".linker-options" or something similar, without the ".note." prefix and the note section header, seems better.
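To illustrate what dropping the note header amounts to at the byte level, a sketch only, not a proposed format: the note-based encoding would prepend the standard note header to every record, whereas a dedicated section type can make the payload nothing but the directive strings.

    #include <cstdint>

    // What a .note.* encoding drags along for every record: the generic
    // note header plus a 4-byte-padded vendor name, before any payload.
    struct NoteHeader {
      uint32_t namesz;   // length of the vendor name that follows
      uint32_t descsz;   // length of the payload after the padded name
      uint32_t type;     // yet another per-vendor type registry
      // char name[namesz], padded to a 4-byte boundary, then the payload
    };

    // With a dedicated section type, a hypothetical ".linker-options"
    // section can carry just NUL-terminated directive/value pairs.
    static const char payload[] = "lib\0m\0lib\0pthread";

Dropping the note header avoids both the per-record overhead and the need to claim a name in the .note.* namespace.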
> Given how ELF works I would expect an unknown section to simply end up
> in the output, but we can use SHF_EXCLUDE to avoid that.

Yes, gold currently treats unknown section types pretty much the same as PROGBITS sections. The SHF_ALLOC and SHF_EXCLUDE flags would control where and whether the section goes into the output file.

Another thing that we need to work out is the link order. The link order is basically a topologically-ordered list of objects, ordered so that if A depends on B, A precedes B in the link order. Today, in the absence of any dependency information at all, we rely on the user and compiler to come up with a reasonably correct link order and pass a linear list of files and libraries to the linker.

In an ideal world (e.g., one where you can just type "ld main.o"), we'd have explicit dependencies for every object, and we could construct a topological order automatically. But with this feature, we will have a partial list of explicit dependencies, and without a complete list, we have no good way of adding new objects into the link order.

One way to approximate a proper link order would be to place each added object immediately after the last object that requests it. For example, if you run "ld a.o b.o c.o -lc", and both a.o and b.o request libm, you would insert libm (i.e., any and all objects extracted from libm if it's an archive library) after b.o and before c.o. But this approach wouldn't work -- we'd have to read and process the directive section from every object before establishing the final link order, which means we can't start building our symbol table until we've read all the objects, which means we can't search archive libraries.

I think what would work is to insert each requested object or shared library into the link order immediately after the object that requests it, but only if the object hasn't already been inserted and isn't already listed on the command line (i.e., we won't try to load the same file twice); and to search each requested archive library immediately after each object that requests it (of course, because of how library searching works, we would load a given archive member once at most). With this method, libm would be searched after both a.o and b.o, so we'd load any members needed by a.o before b.o, and any remaining members needed by b.o before c.o. The difference between this and a proper topological ordering would be small, but would have a subtle effect on symbol interposition. I'm willing to require anyone who depends on symbol interposition to control their link order explicitly via the command line.

In my ideal world, archive libraries would carry dependency information rather than the individual objects within them. I suspect that's too much to ask. I see no need for shared libraries to carry any dependency information beyond the DT_NEEDED entries they already have. (It would be so much easier to build a self-driving car if we could immediately jump to the point where all cars are self-driving, right?)

-cary
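A rough sketch of that insertion rule follows, using hypothetical Input and resolve helpers; it only models objects and shared libraries being spliced into the order after their first requester, and leaves out archive searching (which, as described above, would happen after each requester):

    #include <string>
    #include <unordered_set>
    #include <vector>

    // Each input may request further inputs via its directive section.
    struct Input {
      std::string name;                    // "a.o", "libm.so", "libc.so", ...
      std::vector<std::string> requests;   // names taken from its directives
    };

    // Hypothetical lookup of a requested name ("m" -> libm.so or libm.a).
    // A real linker would consult its search paths; this stub just
    // fabricates a shared-library name so the sketch is self-contained.
    static Input resolve(const std::string &request) {
      return Input{"lib" + request + ".so", {}};
    }

    // Place one input, then (depth-first) every input it requests that has
    // not already been placed, so each request lands immediately after the
    // input that asked for it first.
    static void place(const Input &in, std::vector<Input> &order,
                      std::unordered_set<std::string> &placed) {
      order.push_back(in);
      for (const std::string &req : in.requests) {
        Input resolved = resolve(req);
        if (placed.insert(resolved.name).second)
          place(resolved, order, placed);
      }
    }

    // Command-line inputs keep their given order; requested inputs are
    // spliced in after their requesters, and anything already listed on
    // the command line is never loaded a second time.  For
    // "ld a.o b.o c.o -lc" with a.o and b.o both requesting m, a shared
    // libm would land after a.o and before b.o under this scheme.
    std::vector<Input> buildLinkOrder(const std::vector<Input> &commandLine) {
      std::vector<Input> order;
      std::unordered_set<std::string> placed;
      for (const Input &in : commandLine)
        placed.insert(in.name);
      for (const Input &in : commandLine)
        place(in, order, placed);
      return order;
    }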