Saleem Abdulrasool via llvm-dev
2018-Jan-06 20:05 UTC
[llvm-dev] Linker Option support for ELF
> On Jan 5, 2018, at 4:35 PM, Cary Coutant <ccoutant at gmail.com> wrote:
>
>>> In general I'm in favor of the proposal. Defining a generic way to convey
>>> some information from the compiler to the linker is useful, and it looks
>>> like it is just a historical reason that the ELF lacks the feature at the
>>> moment.
>>>
>>> This is a scenario in which the feature is useful: when you include
>>> math.h, a compiler (which is driven by some pragma) could added `-lm` to the
>>> note section so that a linker automatically links libm.

Glad to have you chime in; I know that you have quite a bit of experience from binutils and gold. I would really love to see this support implemented there too, and your input is certainly valuable.

> I agree that this would be a very useful addition to ELF. I've always
> wanted to reach the point where you could just type "ld main.o" and
> have all the dependencies automatically linked in. (Go kind of
> achieves this, I think.)

Excellent! I think that everyone agrees that this is a useful extension to add.

> I'm not in favor of using yet another note section, however. SHT_NOTE
> sections are intended for the use of "off-axis" tools, not for
> something the linker would need to look for. I don't want to have the
> linker parsing individual note entries looking for notes of interest
> to itself, and then having to decide whether to edit those entries out
> of the larger section, or merge them together. And I also don't want
> to key off of individual section names -- the linker is not supposed
> to have to care about the section name. There should be a new section
> type for this feature. This is the kind of extension that ELF was
> designed for.

I’m really not tied to the note approach of implementing this. I am (admittedly) abusing the notes due to a couple of behavioral aspects of them. The main thing to realize is that this information is embedded into the object files that are built.
The information should be processed by the linker and then *discarded*; none of it should be in the final binary (unless it is a relocatable link). I’m concerned about linkers which do not support this feature preserving the contents. Now, this could very well be a misconception on my part. If that is the case, then I would say that this needs to be entirely reworked, because then adding the section sounds much nicer.

>>> I think I'm also in favor of the format, which is essentially runs of
>>> null-terminated strings (*1) that are basically opaque to compilers.
>>
>> Yes. However, I think I want to clarify that we want this to be completely
>> opaque to the backend. The front end could possibly have some enhancements
>> to make this better. But, that will be a separate change, and that
>> discussion should take place then. We shouldn’t paint ourselves into a
>> corner. Basically, I think that there is some legitimate concerns here, but
>> they would not be handled at this layer, but above.
>>
>>> However, you should define as a spec what options are allowed and what
>>> their semantics are. We should not accept arbitrary linker options because
>>> semantics of some linker options cannot be clearly defined when they appear
>>> as embedded options. Just saying "this feature allows you to embed linker
>>> options to object files" is too weak as a specification. You need to clearly
>>> define a list of options that will be supported by linkers with their clear
>>> semantics.
>>
>> Personally, I would like to see the ability to add support for additional
>> options without having to modify the compiler. That said, I think that
>> there are options which can be scary (e.g. -nopie). I think that the linker
>> should make the decision of what it supports and error out on others. This
>> allows for us to enhance the support over time without a huge overhead. As
>> a starting point, I think that -l and -L are two that would be interesting.
>> I can see -u being useful as well, but the point is that we can slowly grow
>> the support after consideration by delaying the validation of the options.
>>
>>> (*1) One of the big annoyances that I noticed when I was implementing the
>>> same feature for COFF is that the COFF's .drctve section that contains
>>> linker options have to be tokenized in the same way as the Windows command
>>> line does. So it needs to interpret double quotes and backslashes correctly
>>> especially when handling space-containing pathnames. This is a design
>>> failure that a COFF file contains just a single string instead of runs of
>>> strings that have already been tokenized.
>
> I too would like to keep the linker from having to tokenize the
> strings. I kind of agree with Rafael that there should be defined tags
> and values, much like a .dynamic section, but I wouldn't want to have
> values pointing to strings in yet another section, so I'd prefer
> something in between that and a free-form null-terminated string. I
> also wouldn't want to open up the complete list of linker options, so
> I'd prefer a defined list of tags in string form that could easily be
> augmented without additional backend support. We could start with,
> perhaps, "lib" to inject a library (a la "-l"), "file" to inject an
> object file by full name, and "path" to provide a search path (a la
> "-L"). I don't think an equivalent for "-u" would be needed, since the
> compiler can simply generate an undef symbol for that case. For the
> section format, I'd suggest a series of null-terminated strings,
> alternating between tags and values, so that no quote or escape
> parsing is necessary.

Sounds like we agree on the direction: we don’t want the backend to be involved in adding new options, we don’t think that all options make sense but want to be able to add options still.
As to the `-u` option, I’m thinking about cases where an unreferenced symbol should be preserved under `--gc-sections` when building with `-ffunction-sections` and/or `-fdata-sections`.

So, after discussing some of the items, we ended up somewhere in-between. My current proposal is a semi-pre-tokenized linker response file. Basically, each option/parameter “pair” would be a single string entry in an array of string values. The only difference is that, instead of TLV entries, each one is simply the raw entry. My resistance to the TLV really is driven more by the LLVM IR (which I suppose is possible to alter):

https://llvm.org/docs/LangRef.html#automatic-linker-flags-named-metadata

> For the header files, a simple syntax like
>
> #pragma linker_directive "lib" "m"
>
> would provide the extensibility needed to add new tags with no
> additional support in the front end or back end.

I *really* wish to avoid this discussion right now. I am happy to loop you into a subsequent thread discussing that. I figure that this will be a much more contentious issue, as syntax is something everyone has differing opinions on. I’m trying to split this work into three distinct pieces: the frontend support to emit the information, the backend to emit this into the object, and the linker to use it.

As an aside, personally, I was thinking more along the lines of `#pragma comment(lib, "m")`.

> -cary
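
For readers following the format discussion above, here is a minimal Python sketch of the two payload layouts being weighed: Cary's alternating tag/value strings and the raw-entry variant where each string is one complete option. It is only an illustration under stated assumptions. The names `TAG_TO_PREFIX`, `RAW_PREFIXES`, `split_strings`, `decode_tagged`, `tagged_to_args`, and `check_raw` are hypothetical, and the accepted tags and prefixes are just the starting points mentioned in the thread, not an agreed specification or any existing linker's behaviour.

```python
from typing import List, Tuple

# Tags suggested in the thread as a starting point. The mapping from tag
# to command-line prefix is an assumption made for this illustration.
TAG_TO_PREFIX = {"lib": "-l", "path": "-L", "file": ""}

# Raw-entry variant: option prefixes a linker might choose to accept (assumed).
RAW_PREFIXES = ("-l", "-L")


def split_strings(payload: bytes) -> List[str]:
    """Split a run of null-terminated strings; no quote or escape parsing."""
    if payload and not payload.endswith(b"\0"):
        raise ValueError("payload is not null-terminated")
    # The final terminator yields one trailing empty piece; drop it.
    return [piece.decode("utf-8") for piece in payload.split(b"\0")[:-1]]


def decode_tagged(payload: bytes) -> List[Tuple[str, str]]:
    """Tag/value layout: strings alternate between a tag and its value."""
    strings = split_strings(payload)
    if len(strings) % 2 != 0:
        raise ValueError("tag without a value")
    return list(zip(strings[0::2], strings[1::2]))


def tagged_to_args(pairs: List[Tuple[str, str]]) -> List[str]:
    """Map known tags to flags; the linker rejects tags it does not support."""
    args = []
    for tag, value in pairs:
        if tag not in TAG_TO_PREFIX:
            raise ValueError(f"unsupported embedded linker option: {tag!r}")
        args.append(TAG_TO_PREFIX[tag] + value)
    return args


def check_raw(entries: List[str]) -> List[str]:
    """Raw-entry layout: each string is already one complete option."""
    for entry in entries:
        if not entry.startswith(RAW_PREFIXES):
            raise ValueError(f"unsupported embedded linker option: {entry!r}")
    return list(entries)


if __name__ == "__main__":
    section = b"lib\0m\0path\0/opt/my libs\0"
    print(tagged_to_args(decode_tagged(section)))
    print(check_raw(split_strings(b"-lm\0-L/opt/my libs\0")))
```

In both layouts a path containing a space survives intact because the strings arrive already tokenized, which is the property missing from COFF's single-string `.drctve` section. In both, validation is deferred to the linker, so an entry such as `-nopie` is rejected there rather than in the compiler.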
On Jan 6, 2018 12:05 PM, "Saleem Abdulrasool via llvm-dev" <llvm-dev at lists.llvm.org> wrote:

> I’m really not tied to the note approach of implementing this. I am (admittedly) abusing the notes due to a couple of behavioral aspects of them. The main thing to realize is that this information is embedded into the object files that are built. The information should be processed by the linker and then *discarded*; none of it should be in the final binary (unless it is a relocatable link). I’m concerned about linkers which do not support this feature preserving the contents. Now, this could very well be a misconception on my part. If that is the case, then I would say that this needs to be entirely reworked, because then adding the section sounds much nicer.

Wouldn't a special section type trigger an "unrecognized section type" error for linkers that don't support it?

-- Sean Silva
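
The trade-off behind Sean's question can be modelled with a small Python sketch. It assumes a strict linker that dispatches on `sh_type` and rejects anything it does not know; real linkers may be more lenient, so this models only the behaviour being asked about. `SHT_PROGBITS`, `SHT_NOTE`, and `SHT_LOOS` carry their ELF gABI values, while `SHT_LINKER_OPTIONS_HYPOTHETICAL`, `OLD_LINKER`, `NEW_LINKER`, and `handle_section` are made-up names for illustration.

```python
# Section type values from the ELF gABI.
SHT_PROGBITS = 1
SHT_NOTE = 7
SHT_LOOS = 0x60000000  # start of the OS-specific range

# Hypothetical value for a dedicated linker-options section type.
SHT_LINKER_OPTIONS_HYPOTHETICAL = SHT_LOOS + 0x100

# Section types understood by a linker without and with the feature (assumed).
OLD_LINKER = frozenset({SHT_PROGBITS, SHT_NOTE})
NEW_LINKER = OLD_LINKER | {SHT_LINKER_OPTIONS_HYPOTHETICAL}


class UnknownSectionType(Exception):
    """Raised by the modelled strict linker for an unrecognized sh_type."""


def handle_section(sh_type: int, known_types: frozenset) -> str:
    """Return the action the modelled linker takes for one input section."""
    if sh_type not in known_types:
        raise UnknownSectionType(hex(sh_type))
    if sh_type == SHT_LINKER_OPTIONS_HYPOTHETICAL:
        # Options are applied, then dropped from the final binary.
        return "consume-and-discard"
    # Notes and ordinary data are carried into the output.
    return "copy-to-output"
```

Under this model, options hidden in a `SHT_NOTE` section pass through an older linker and end up preserved in the output, while a dedicated section type makes the same linker fail loudly. Those are the two compatibility concerns raised in the thread.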
Saleem Abdulrasool via llvm-dev
2018-Jan-07 19:59 UTC
[llvm-dev] Linker Option support for ELF
> On Jan 6, 2018, at 4:33 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
> On Jan 6, 2018 12:05 PM, "Saleem Abdulrasool via llvm-dev" <llvm-dev at lists.llvm.org> wrote:
>
>> I’m really not tied to the note approach of implementing this. I am (admittedly) abusing the notes due to a couple of behavioral aspects of them. The main thing to realize is that this information is embedded into the object files that are built. The information should be processed by the linker and then *discarded*; none of it should be in the final binary (unless it is a relocatable link). I’m concerned about linkers which do not support this feature preserving the contents. Now, this could very well be a misconception on my part. If that is the case, then I would say that this needs to be entirely reworked, because then adding the section sounds much nicer.
>
> Wouldn't a special section type trigger an "unrecognized section type" error for linkers that don't support it?

Yeah, that is possible. Compatibility problems exist with ld64 and are handled by means of `-mlinker-version`. I don’t know how others feel about bringing that flag over to other platforms. We could force the use of a new section and bring along `-mlinker-version` and base it on that, or silently drop the flags (which sounds slightly unexpected). Or we could abuse the notes and not have to worry about the compatibility problem (which is why my initial work went that route).

Again, I’m not tied to the exact mechanism we use to provide compatibility, though my personal preference tends to be to go with the nicer solution for the longer term (which does feel like `-mlinker-version` plus a custom section, but I’m worried about the silent dropping of flags). Perhaps there is a better solution that I haven’t considered.

> -- Sean Silva
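
As a rough illustration of the `-mlinker-version` idea discussed above, the Python sketch below shows how a compiler driver might decide whether to emit a dedicated linker-options section based on a declared linker version. Apart from the flag name itself, everything here is hypothetical: the threshold `MIN_VERSION_WITH_SUPPORT`, the function names, and the returned action strings are invented for the example and describe no existing driver.

```python
from typing import Tuple

# Hypothetical: first linker version assumed to understand the new section.
MIN_VERSION_WITH_SUPPORT = (2, 31)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '2.31' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def plan_emission(linker_version: str, silent: bool = False) -> str:
    """Decide what the driver does with embedded linker options."""
    if parse_version(linker_version) >= MIN_VERSION_WITH_SUPPORT:
        return "emit-section"
    # Older linker: the options cannot be conveyed. Dropping them silently
    # is the outcome the thread flags as surprising, so warn by default.
    return "drop-silently" if silent else "drop-and-warn"


if __name__ == "__main__":
    for version in ("2.30", "2.31", "3.0"):
        print(version, plan_emission(version))
```

Defaulting to a warning rather than a silent drop is one way to address the worry about flags disappearing unnoticed. It does not settle whether such a flag should be carried over to non-Darwin platforms.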