thr3ads.net - llvm dev - [llvm-dev] RFC: ELF Autolinking [Mar 2019]


bd1976 llvm via llvm-dev

2019-Mar-14 13:07 UTC

[llvm-dev] RFC: ELF Autolinking

At Sony we offer autolinking as a feature in our ELF toolchain. We would
like to see full support for this feature upstream as there is anecdotal
evidence that it would find use beyond Sony.

In general autolinking (https://en.wikipedia.org/wiki/Auto-linking) allows
developers to specify inputs to the linker in their source code. LLVM and
Clang already have support for autolinking on ELF via embedding strings,
which specify linker behavior, into a .linker-options section in
relocatable object files, see:

RFC - http://lists.llvm.org/pipermail/llvm-dev/2018-January/120101.html
LLVM -
https://llvm.org/docs/Extensions.html#linker-options-section-linker-options,
https://reviews.llvm.org/D40849
Clang -
https://clang.llvm.org/docs/LanguageExtensions.html#specifying-linker-options-on-elf-targets,
https://reviews.llvm.org/D42758

However, although support was added to Clang and LLVM, no support has been
implemented in LLD; and I get the sense, from reading the reviews, that
there wasn't agreement on the implementation when the changes landed. The
original motivation seems to have been to remove the "autolink-extract"
mechanism used by Swift to work around the lack of autolinking support for
ELF. However, looking at the Swift source code, Swift still seems to be
using the "autolink-extract" method.

So my first question: Are there any users of the current implementation for
ELF?

Assuming that no one is using the current code, I would like to suggest a
different mechanism for autolinking.

For ELF we need limited autolinking support. Specifically, we only need
support for "comment lib" pragmas (
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017)
in C/C++, e.g. #pragma comment(lib, "foo"). My suggestion is that we keep the
implementation as lean as possible.

Principles to guide the implementation:
- Developers should be able to easily understand autolinking behavior.
- Developers should be able to override autolinking from the linker command
line.
- Inputs specified via pragmas should be handled in a general way to allow
the same source code to work in different environments.

I would like to propose that we focus on autolinking exclusively and that
we divorce the implementation from the idea of "linker options" which, by
nature, would tie source code to the vagaries of particular linkers. I
don't see much value in supporting other linker operations, so I suggest
that the binary representation be a mergeable string section (SHF_MERGE,
SHF_STRINGS), called .autolink, with custom type SHT_LLVM_AUTOLINK
(0x6fff4c04), and SHF_EXCLUDE set (to avoid the contents appearing in the
output). The compiler can form this section by concatenating the arguments
of the "comment lib" pragmas in the order they are encountered. Partial
(-r, -Ur) links can be handled by concatenating .autolink sections with the
normal mergeable string section rules. The current .linker-options can
remain (or be removed); but "comment lib" pragmas for ELF should be
lowered to .autolink, not to .linker-options. This makes sense as there is
no linker option that "comment lib" pragmas map directly to. As an example,
#pragma comment(lib, "foo") would result in:

.section ".autolink","eMS",@llvm_autolink,1
        .asciz "foo"
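
To make the encoding concrete, here is a minimal Python sketch of the section
contents described above. It is an illustration only, not the Clang or LLD
implementation: the function names are hypothetical, and dropping repeated
strings during a partial link is an assumption about how the "normal
mergeable string section rules" would apply.

```python
# Sketch only: models the proposed .autolink payload as NUL-terminated strings.
# All function names here are hypothetical.

def encode_autolink(libs):
    """Concatenate pragma arguments in source order, each NUL-terminated
    (the equivalent of one .asciz directive per pragma)."""
    return b"".join(name.encode("utf-8") + b"\0" for name in libs)


def decode_autolink(payload):
    """Split a section payload back into its strings."""
    if payload and not payload.endswith(b"\0"):
        raise ValueError("malformed .autolink section: missing NUL terminator")
    return [s.decode("utf-8") for s in payload.split(b"\0")[:-1]]


def merge_for_partial_link(sections):
    """Assumed -r behaviour: concatenate the input sections, dropping
    repeated strings while keeping first-occurrence order."""
    seen, merged = set(), []
    for payload in sections:
        for name in decode_autolink(payload):
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return encode_autolink(merged)
```

For the example above, encode_autolink(["foo"]) yields the four bytes
'f', 'o', 'o', NUL, which is what .asciz "foo" emits.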

For LTO, equivalent information to the contents of the .autolink section
will be written to the IRSymtab so that it is available to the linker for
symbol resolution.

The linker will process the .autolink strings in the following way:

1. Inputs from the .autolink sections of a relocatable object file are
added when the linker decides to include that file (which could itself be
in a library) in the link. Autolinked inputs behave as if they were
appended to the command line as a group after all other options. As a
consequence, the set of autolinked libraries is searched last to resolve
symbols.
2. It is an error if a file cannot be found for a given string.
3. Any command line options in effect at the end of the command line
parsing apply to autolinked inputs, e.g. --whole-archive.
4. Duplicate autolinked inputs are ignored.
5. The linker tries to add a library or relocatable object file from each
of the strings in a .autolink section by: first, handling the string as if
it was specified on the command line; second, looking for the string in
each of the library search paths in turn; third, looking for a
lib<string>.a or lib<string>.so (depending on the current mode of the
linker) in each of the library search paths.
6. A new command line option --no-llvm-autolink will tell LLD to ignore the
.autolink sections.
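
The processing rules above can be sketched roughly as follows. This is a
hedged illustration in Python, not LLD code: find_input, resolve_autolinks,
the exists callback and the static flag are hypothetical names, and trying
.so before .a in dynamic mode is an assumption about what "the current mode
of the linker" means. Point 3 (command line options that remain in effect)
is not modelled.

```python
import posixpath


def find_input(name, search_paths, exists, static=False):
    """Resolve one .autolink string using the three steps of point 5."""
    # Step 1: treat the string as if it had been given on the command line.
    if exists(name):
        return name
    # Step 2: look for the string itself in each library search path.
    for directory in search_paths:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    # Step 3: look for lib<string>.so / lib<string>.a in each search path.
    suffixes = [".a"] if static else [".so", ".a"]  # assumed ordering
    for directory in search_paths:
        for suffix in suffixes:
            candidate = posixpath.join(directory, "lib" + name + suffix)
            if exists(candidate):
                return candidate
    # Point 2: a string that cannot be resolved is an error.
    raise FileNotFoundError("autolink: cannot find input for '%s'" % name)


def resolve_autolinks(command_line_inputs, autolink_strings, search_paths,
                      exists, static=False, no_autolink=False):
    """Return the final input list: command line first, autolinked last."""
    inputs = list(command_line_inputs)
    if no_autolink:  # point 6: --no-llvm-autolink ignores the sections
        return inputs
    seen = set()
    for name in autolink_strings:
        path = find_input(name, search_paths, exists, static)
        if path not in seen:  # point 4: duplicates are ignored
            seen.add(path)
            inputs.append(path)  # point 1: appended after everything else
    return inputs
```

Passing exists as a callback keeps the sketch independent of the real file
system; os.path.exists could be supplied in practice.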

Rationale for the above points:

1. Adding the autolinked inputs last makes the process simple to understand
from a developer's perspective. All linkers are able to implement this
scheme.
2. Raising an error for libraries that are not found seems like better
behavior than failing the link during symbol resolution.
3. It seems useful for the user to be able to apply command line options
which will affect all of the autolinked input files. There is a potential
problem of surprise for developers, who might not realize that these
options would apply to the "invisible" autolinked input files; however,
despite the potential for surprise, this is easy for developers to reason
about and gives developers the control that they may require.
4. Unlike on the command line, it is probably easy to include the same input
file twice via pragmas, and this might be a pain to fix; think of third-party
libraries supplied as binaries.
5. This algorithm takes into account all of the different ways that ELF
linkers find input files. The different search methods are tried by the
linker in most obvious to least obvious order.
6. I considered adding finer grained control over which .autolink inputs
were ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded
that this is not necessary: if finer control is required developers can
recreate the same effect autolinking would have had using command line
options.

Thoughts?

Peter Smith via llvm-dev

2019-Mar-14 15:32 UTC

[llvm-dev] RFC: ELF Autolinking

Hello,

I've put some comments on the proposal inline. Having had to debug
library selection problems where all the libraries are visible on the
linker command line, I would prefer if people didn't embed difficult
to find directives in object files, but I'm guessing in some languages
this is the natural way of adding libraries.

On Thu, 14 Mar 2019 at 13:08, bd1976 llvm via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> At Sony we offer autolinking as a feature in our ELF toolchain. We would
> like to see full support for this feature upstream as there is anecdotal
> evidence that it would find use beyond Sony.
>
I've not got any use of the existing code. Personally I've not come
across anyone wanting this type of feature, but that is also anecdotal
on my part.
>
> For LTO, equivalent information to the contents of the .autolink section
> will be written to the IRSymtab so that it is available to the linker for
> symbol resolution.
>
I'm not sure I understand the bit about "for symbol resolution". I
think that what you mean is that you will encode the autolink section
using symbols instead of as a section, and the linker is expected to
extract this when it reads the symbol table?
> The linker will process the .autolink strings in the following way:
>
> 1. Inputs from the .autolink sections of a relocatable object file are
> added when the linker decides to include that file (which could itself be
> in a library) in the link. Autolinked inputs behave as if they were
> appended to the command line as a group after all other options. As a
> consequence the set of autolinked libraries are searched last to resolve
> symbols.
> 2. It is an error if a file cannot be found for a given string.
> 3. Any command line options in effect at the end of the command line
> parsing apply to autolinked inputs, e.g. --whole-archive.
I've not got any experience of autolinking as a user, so I'm
struggling a bit with this one. I'm guessing that autolinking is
useful because someone can do the equivalent of #include <library.h>
and #pragma comment(lib, "library.so") in the same place without having
to fight the build system. I'm less convinced about --whole-archive as
I think this tends to be a way of structuring the build and would be
best made explicit in the build system. Moreover, what if someone
does not want --whole-archive for their autolink, but one is already
in effect? This could be quite difficult to check in a large project.
Personally I'd have the user be explicit in the .autolink whether they
were intending it to be whole-archive or not.
> 4. Duplicate autolinked inputs are ignored.
If we take the issue of --whole-archive off the table does it matter
that there are duplicate libraries? Unresolved symbols will match
against the first library. I guess it might make a difference if this
feature is implemented in ld.lld and ld.gold, where you'd have to wrap
the libraries in a start-group, end-group, but is this likely to
happen?
> 5. The linker tries to add a library or relocatable object file from each
> of the strings in a .autolink section by; first, handling the string as if
> it was specified on the commandline; second, by looking for the string in
> each of the library search paths in turn; third, by looking for a
> lib<string>.a or lib<string>.so (depending on the current mode of the
> linker) in each of the library search paths.
There is some precedent for including files and libraries from
linker scripts
(https://sourceware.org/binutils/docs/ld/File-Commands.html#File-Commands);
these distinguish between "-lfile" and "file". Would this be a
better fit for an ld.bfd interface-compatible linker?
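
As a rough illustration of that convention, the following Python sketch
separates the two spellings. It is only a sketch of the idea: classify_entry
is a hypothetical helper, and applying the linker script "-lfile" versus
"file" distinction to .autolink strings is the suggestion being discussed,
not existing behaviour.

```python
def classify_entry(entry):
    """Split an entry into (kind, name) following the "-lfile" / "file"
    distinction used for linker script inputs."""
    if entry.startswith("-l") and len(entry) > 2:
        # "-lfoo": a library name, to be searched for as libfoo.so / libfoo.a
        return ("library", entry[2:])
    # anything else: a file name, to be opened as given
    return ("file", entry)
```

With this scheme the string itself says which lookup is intended, so the
linker would not need to try several interpretations in turn.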
> 6. A new command line option --no-llvm-autolink will tell LLD to ignore
> the .autolink sections.
Personally I would have thought --no-llvm-autolink would error if it
found a .autolink section, on the grounds that I wanted all the
libraries to be defined on the command-line or linker script rather
than hidden in object files. I would have thought ignoring the
autolink sections would in most cases result in undefined symbols. If
there is a use case for it, perhaps --ignore-llvm-autolink.
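
The difference between the two behaviours can be sketched as below. This is
a hypothetical illustration in Python: AutolinkPolicy and autolink_entries
are made-up names, and both option spellings are only proposals from this
thread, not options that LLD is known to accept.

```python
from enum import Enum


class AutolinkPolicy(Enum):
    PROCESS = "default"                # honour .autolink sections
    IGNORE = "--ignore-llvm-autolink"  # silently drop them
    ERROR = "--no-llvm-autolink"       # refuse objects that carry them


def autolink_entries(object_name, entries, policy):
    """Return the autolink strings the linker should act on for one object."""
    if not entries or policy is AutolinkPolicy.PROCESS:
        return list(entries)
    if policy is AutolinkPolicy.IGNORE:
        return []
    # ERROR: every library must be named on the command line or in a script.
    raise ValueError(
        "%s: .autolink section found but --no-llvm-autolink was given"
        % object_name)
```

An object without any .autolink strings links normally under all three
policies; only objects that carry them are affected.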
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Reid Kleckner via llvm-dev

2019-Mar-14 16:28 UTC

[llvm-dev] RFC: ELF Autolinking

Hi,

I guess I agree it would be best to remove the objfile linker option
support and replace it with just auto-linking. We already have a mechanism
for adding new features to object files: .note sections. Linkers already
know to ignore ones that they don't understand. If, in the future, we want
to add a new feature that could be handled by embedding linker flags, we
can instead implement it with a new .note section that other linkers and
old versions of LLD will know to ignore.

On top of that, the generic ABI group has previously rejected proposals to
embed linker options in object files (
https://groups.google.com/forum/#!topic/generic-abi/iS_-m-X5ZwQ).

Given how ELF has done things in the past, maybe the section name should be
".note.autolink". We could also be like GCC and namespace our extensions as
".note.LLVM.autolink", but maybe that's a step too far.

Reid


Rui Ueyama via llvm-dev

2019-Mar-14 16:34 UTC

[llvm-dev] RFC: ELF Autolinking

This proposal seems much better than the generic .linker-options scheme
that potentially allows arbitrary linker options to be embedded in an
object file. The proposed scheme is basically the same mechanism as the
"comment lib" feature implemented in the Microsoft linker, which I found
mildly useful and at least not harmful.

As a use case, what I heard was that in the game industry, where many
developers use Visual Studio as an IDE and are familiar with Windows'
semantics of linking, people find it annoying that to build the same
program on Unix they have to add a bunch of -lfoo to the linker command
line, while these are handled automatically on Windows. I can understand
that: if you have to add `-lm` 99.9% of the time when you #include <math.h>,
for example, it's not odd to wonder why this is not processed
automatically.

But the above story was from the game industry. Just like Ben, I'd like to
hear from other people if they really want this feature.

Details inline:

On Thu, Mar 14, 2019 at 6:08 AM bd1976 llvm via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> The linker will process the .autolink strings in the following way:
>
> 1. Inputs from the .autolink sections of a relocatable object file are
> added when the linker decides to include that file (which could itself be
> in a library) in the link. Autolinked inputs behave as if they were
> appended to the command line as a group after all other options. As a
> consequence the set of autolinked libraries are searched last to resolve
> symbols.
> 2. It is an error if a file cannot be found for a given string.
> 3. Any command line options in effect at the end of the command line
> parsing apply to autolinked inputs, e.g. --whole-archive.
>
I thought that the scope of this mechanism is essentially to add `-lfoo`
automatically to the command line if you include a header that requires
`libfoo`. From that perspective, the item 3 seems odd. Why do you need that?

> 4. Duplicate autolinked inputs are ignored.
>
I'd say duplicate autolinked inputs are processed normally, but for
the same reason that the second parameter in `-lfoo -lfoo` is basically
a no-op, duplicated autolinked inputs are naturally ignored.
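
To illustrate why the second `-lfoo` does nothing, here is a small Python
model of traditional left-to-right archive processing. It is a simplified
sketch with hypothetical names, not how LLD is implemented: a member is
pulled in only if it defines a symbol that is still undefined, so a repeated
archive finds nothing left to resolve.

```python
def link(undefined, archives):
    """Process archives in order; return (extracted members, still undefined).

    Each archive is a list of (member_name, defined_symbols, referenced_symbols).
    """
    undefined = set(undefined)
    defined = set()
    extracted = []
    for archive in archives:
        progress = True
        while progress:  # rescan until this archive yields nothing new
            progress = False
            for member, defs, refs in archive:
                if defs & undefined:  # member resolves something we need
                    extracted.append(member)
                    defined |= defs
                    undefined |= refs
                    undefined -= defined
                    progress = True
    return extracted, undefined
```

Listing the same archive twice gives the same result as listing it once,
because after the first visit none of its members defines a symbol that is
still undefined.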

> 5. The linker tries to add a library or relocatable object file from each
> of the strings in a .autolink section by; first, handling the string as if
> it was specified on the commandline; second, by looking for the string in
> each of the library search paths in turn; third, by looking for a
> lib<string>.a or lib<string>.so (depending on the current mode of the
> linker) in each of the library search paths.
>
Again, this seems like a little beyond the scope of what I expect (and it
looks like you want to allow an .o file using this scheme).


bd1976 llvm via llvm-dev

2019-Mar-14 16:44 UTC

[llvm-dev] RFC: ELF Autolinking

On Thu, Mar 14, 2019 at 3:32 PM Peter Smith <peter.smith at linaro.org>
wrote:
> Hello,
>
> I've put some comments on the proposal inline. Having to had to debug
> library selection problems where all the libraries are visible on the
> linker command line, I would prefer if people didn't embed difficult
> to find directives in object files, but I'm guessing in some languages
> this is the natural way of adding libraries.
>
> On Thu, 14 Mar 2019 at 13:08, bd1976 llvm via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > At Sony we offer autolinking as a feature in our ELF toolchain. We would
> > like to see full support for this feature upstream as there is anecdotal
> > evidence that it would find use beyond Sony.
> >
>
> I've not got any use of the existing code. Personally I've not come
> across anyone wanting this type of feature, but that is also anecdotal
> on my part.
>
> > For LTO, equivalent information to the contents of a the .autolink
> section will be written to the IRSymtab so that it is available to the
> linker for symbol resolution.
> >
>
> I'm not sure I understand the bit about "for symbol resolution". I
> think that what you mean is that you will encode the autolink section
> using symbols instead of as a section, and the linker is expected to
> extract this when it reads the symbol table?
>
Whoops... might have used a bit of a colloquialism there; sorry. All I mean
is that there will be a method on the IRSymtab that LLD can use to retrieve
the same set of strings that would be written into the .autolink
section of the relocatable object files by the backend.

> > The linker will process the .autolink strings in the following way:
> >
> > 1. Inputs from the .autolink sections of a relocatable object file are
> > added when the linker decides to include that file (which could itself be
> > in a library) in the link. Autolinked inputs behave as if they were
> > appended to the command line as a group after all other options. As a
> > consequence the set of autolinked libraries is searched last to resolve
> > symbols.
> > 2. It is an error if a file cannot be found for a given string.
> > 3. Any command line options in effect at the end of the command line
> > parsing apply to autolinked inputs, e.g. --whole-archive.
>
> I've not got any experience of autolinking as a user, so I'm
> struggling a bit with this one. I'm guessing that autolinking is
> useful because someone can do the equivalent of #include <library.h>
> and #pragma comment lib "library.so" in the same place without having
> to fight the build system.

Right. Consider that many codebases have multiple build configurations and
the linker needs to be given the correct version of a library to use for
the particular build configuration. This is often easier to do using the
preprocessor than in the build system. Also, if a program is dependent on
an external library, autolinking allows the library writer to reorganize
how that library is structured transparently to the users of the library.
There are notes about utility in
https://stackoverflow.com/questions/1685206/pragma-commentlib-xxx-lib-equivalent-under-linux
and
https://stackoverflow.com/questions/3851956/whats-pragma-comment-lib-lib-glut32-lib?noredirect=1&lq=1
.

> I'm less convinced about --whole-archive as
> I think this tends to be a way of structuring the build and would be
> best made explicit in the build system. Moreover, what if someone
> wants to not use --whole-archive, for their autolink, but one already
> exists.

Then they can specify --no-whole-archive on the end of the command line, no?

> This could be quite difficult to check with a large project.
> Personally I'd have the user be explicit in the .autolink whether they
> were intending it to be whole-archive or not.
>
I was hoping to avoid this as I want to avoid getting into how to specify
linker specific options in the frontend. If we dislike the idea that the
state of the command line parser at the end of the linker command line
affects the autolinked libraries then I would rather go for a scheme in
which the default state of the command line parser applies when linking the
autolinked libraries; however, that seems harder to implement in LLD and
gives the user less control over autolinking.

>
> > 4. Duplicate autolinked inputs are ignored.
>
> If we take the issue of --whole-archive off the table does it matter
> that there are duplicate libraries? Unresolved symbols will match
> against the first library.

It doesn't matter for libraries in LLD; but, it is important for object
files. I think that this mechanism should be usable for object files and
libraries. This is common in ELF linkers - for example the --library
command line option can be used to link object files.

> I guess it might make a difference if this
> feature is implemented in ld.lld and ld.gold, where you'd have to wrap
> the libraries in a start-group, end-group, but is this likely to
> happen?
>
I would like the design to be such that it could be implemented by GNU.

>
> > 5. The linker tries to add a library or relocatable object file from
> > each of the strings in a .autolink section by: first, handling the string
> > as if it was specified on the command line; second, by looking for the
> > string in each of the library search paths in turn; third, by looking for a
> > lib<string>.a or lib<string>.so (depending on the current mode of the
> > linker) in each of the library search paths.
>
> There is some precedent for including files and libraries from
> linkerscripts
> https://sourceware.org/binutils/docs/ld/File-Commands.html#File-Commands
> , these distinguish between "-lfile" and "file". Would this be a
> better fit for a ld.bfd interface compatible linker?
>
I was hoping to avoid GNUisms and use a "general" mechanism. MSVC source
code compatibility is a use case.

> > 6. A new command line option --no-llvm-autolink will tell LLD to ignore
> > the .autolink sections.
>
> Personally I would have thought --no-llvm-autolink would error if it
> found a .autolink section, on the grounds that I wanted all the
> libraries to be defined on the command-line or linker script rather
> than hidden in object files. I would have thought ignoring the
> autolink sections would in most cases result in undefined symbols. If
> there is a use case for it, perhaps --ignore-llvm-autolink.
>
The use case that I had in mind is that you need to override autolinking. To
do so you tell the linker to ignore the embedded autolinking information
and construct an equivalent command line. I think your proposed
--ignore-llvm-autolink is a better name for this option given the intended
semantics.

> > Rationale for the above points:
> >
> > 1. Adding the autolinked inputs last makes the process simple to
> > understand from a developer's perspective. All linkers are able to implement
> > this scheme.
> > 2. Error-ing for libraries that are not found seems like better behavior
> > than failing the link during symbol resolution.
> > 3. It seems useful for the user to be able to apply command line options
> > which will affect all of the autolinked input files. There is a potential
> > problem of surprise for developers, who might not realize that these
> > options would apply to the "invisible" autolinked input files; however,
> > despite the potential for surprise, this is easy for developers to reason
> > about and gives developers the control that they may require.
> > 4. Unlike on the command line it is probably easy to include the same
> > input file twice via pragmas and might be a pain to fix; think of
> > third-party libraries supplied as binaries.
> > 5. This algorithm takes into account all of the different ways that ELF
> > linkers find input files. The different search methods are tried by the
> > linker in most obvious to least obvious order.
> > 6. I considered adding finer grained control over which .autolink inputs
> > were ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded
> > that this is not necessary: if finer control is required developers can
> > recreate the same effect autolinking would have had using command line
> > options.
> >
> > Thoughts?
> >

Zachary Turner via llvm-dev

2019-Mar-14 16:44 UTC

[llvm-dev] RFC: ELF Autolinking

On Thu, Mar 14, 2019 at 9:34 AM Rui Ueyama via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> This proposal seems much better than the generic .linker-options scheme
> that potentially allows arbitrary linker options to be embedded to an
> object file. The proposed scheme is basically the same mechanism as the
> "comment lib" feature implemented on Microsoft linker, which I found mildly
> useful and at least not harmful.
>
> As a use case, what I heard of was that in the game industry where many
> developers are using Visual Studio as an IDE and familiar with Windows'
> semantics of linking, people find it annoying that to build the same
> program on Unix, they had to add bunch of -lfoo to the linker command line
> while they are automatically handled on Windows. I can understand that --
> if you have to add `-lm` 99.9% of the time when #include <math.h> for
> example, that's not too odd to think why this is not processed
> automatically.
>
> But the above story was from the game industry. Just like Ben, I'd like to
> hear from other people if they really want this feature.
>
Another situation where it is useful is when a 3rd party library supports
multiple different ABI-incompatible build configurations (typically
selected between via pre-processor settings).  This way, the header file
can choose the right version of the library to link against.

This comes up often when linking against python, for example, where if you
have #defined _DEBUG then you need to link against python35_d.[dll|so]
versus python35.[dll|so].

You can certainly represent this kind of logic in a build system, but it
leads to even more maintenance burden in my experience, and in general it's
nice if things "just work".
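
As a rough illustration of the header-driven selection described above, a library header could choose the name with the preprocessor. This is a sketch only: the macro PY_AUTOLINK_LIB and the helper pythonAutolinkLib are hypothetical names, and the library names simply follow the python35 example.

```cpp
#include <string>

// Hypothetical macro: the header picks the library that matches the build
// configuration, so the build system does not have to.
#if defined(_DEBUG)
#define PY_AUTOLINK_LIB "python35_d"
#else
#define PY_AUTOLINK_LIB "python35"
#endif

// On a toolchain with "comment lib" support the header would then contain
// one of the following (kept as comments so that this sketch compiles
// cleanly on compilers that do not know the pragma):
//   #pragma comment(lib, "python35_d")   // when _DEBUG is defined
//   #pragma comment(lib, "python35")     // otherwise

// Helper used only to make the selection observable in this sketch.
inline std::string pythonAutolinkLib() { return PY_AUTOLINK_LIB; }
```

Users of the header then get the matching library without touching their link line, which is the "just work" behavior mentioned above.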

Rui Ueyama via llvm-dev

2019-Mar-14 16:58 UTC

[llvm-dev] RFC: ELF Autolinking

On Thu, Mar 14, 2019 at 9:28 AM Reid Kleckner via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> I guess I agree it would be best to remove the objfile linker option
> support and replace it with just auto-linking. We already have a mechanism
> for adding new features to object files: .note sections. Linkers already
> know to ignore ones that they don't understand. If, in the future, we want
> to add a new feature that could be handled by embedding linker flags, we
> can instead implement it with a new .note section that other linkers and
> old versions of LLD will know to ignore.
>
> On top of that, the generic ABI group has previously rejected proposals to
> embed linker options in object files (
> https://groups.google.com/forum/#!topic/generic-abi/iS_-m-X5ZwQ).
>
> Given how ELF has done things in the past, maybe the section name should
> be ".note.autolink". We could also be like GCC and namespace our extensions
> as ".note.LLVM.autolink", but maybe that's a step too far.
>
A .note section consists of a series of type-length-value records. My
understanding is that the static linker aggregates them into a single
location and puts them into a PT_NOTE segment, and the records can still be read
by the loader even after the section table is stripped from an executable.
For the proposed purpose, the note section header would not be useful or
meaningful, so a plain section that just contains an ASCII string would be
simpler.
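
To show how little parsing a plain string section needs compared with note records, here is a minimal sketch that splits the raw bytes of such a section into its NUL-terminated strings. The function name splitAutolinkStrings is hypothetical, and the tolerance for a missing final terminator is an assumption of this sketch rather than part of the proposal.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits the raw contents of a string section into its NUL-terminated
// entries. Empty entries are skipped; a final entry without a terminator
// is accepted (an assumption made for this sketch).
std::vector<std::string> splitAutolinkStrings(const std::string& data) {
  std::vector<std::string> out;
  std::size_t pos = 0;
  while (pos < data.size()) {
    std::size_t end = data.find('\0', pos);
    if (end == std::string::npos) end = data.size();
    if (end > pos) out.emplace_back(data, pos, end - pos);
    pos = end + 1;
  }
  return out;
}
```

For a section containing the bytes "foo\0bar\0" this yields the two strings "foo" and "bar", with no type or length fields to decode.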


Peter Collingbourne via llvm-dev

2019-Mar-14 17:30 UTC

[llvm-dev] RFC: ELF Autolinking

On Thu, Mar 14, 2019 at 6:08 AM bd1976 llvm via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I would like to propose that we focus on autolinking exclusively and that
> we divorce the implementation from the idea of "linker options" which, by
> nature, would tie source code to the vagaries of particular linkers. I
> don't see much value in supporting other linker operations so I suggest
> that the binary representation be a mergeable string section (SHF_MERGE,
> SHF_STRINGS), called .autolink, with custom type SHT_LLVM_AUTOLINK
> (0x6fff4c04), and SHF_EXCLUDE set (to avoid the contents appearing in the
> output).
>
Should we set SHF_EXCLUDE on the section? I think it may be better not to
set it. The semantics of the bit are that the section will be excluded from
the final output file, but there's no requirement that other sections are
not excluded. For example SHT_REL(A) sections are interpreted by the linker
and excluded from the output, but they do not have SHF_EXCLUDE set. The
SHT_LLVM_AUTOLINK section could be treated similarly in compatible linkers.
SHF_EXCLUDE is appropriate for sections that can be freely dropped without
changing the semantics, such as the address significance table, but the
autolink section is different because dropping it changes the semantics. By
not setting the bit, we would cause incompatible linkers to leave the
autolink section in the output file, which would allow downstream tools to
be written that would detect (and possibly diagnose) such files.
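
The effect of the flag can be modelled in a few lines. This is a deliberately simplified sketch: the constant names and the function survivesInUnawareLinker are hypothetical, the flag values are the usual ELF ones, the section type value is the one proposed in the RFC, and a real linker applies more rules than this when deciding what to keep.

```cpp
#include <cstdint>

// Section flag values as commonly defined for ELF.
constexpr std::uint64_t kShfMerge = 0x10;
constexpr std::uint64_t kShfStrings = 0x20;
constexpr std::uint64_t kShfExclude = 0x80000000;

// Section type proposed in the RFC.
constexpr std::uint32_t kShtLlvmAutolink = 0x6fff4c04;

// Simplified model of a linker that does not understand kShtLlvmAutolink:
// it drops the section only if SHF_EXCLUDE is set, otherwise the section
// is carried into the output where other tools can detect it.
inline bool survivesInUnawareLinker(std::uint64_t sectionFlags) {
  return (sectionFlags & kShfExclude) == 0;
}
```

Under this model, a section flagged kShfMerge | kShfStrings stays visible in the output of an incompatible linker, whereas adding kShfExclude makes it disappear silently.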

Peter



bd1976 llvm via llvm-dev

2019-Mar-14 17:43 UTC

[llvm-dev] RFC: ELF Autolinking

On Thu, Mar 14, 2019 at 4:34 PM Rui Ueyama <ruiu at google.com> wrote:
> On Thu, Mar 14, 2019 at 6:08 AM bd1976 llvm via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> The linker will process the .autolink strings in the following way:
>>
>> 1. Inputs from the .autolink sections of a relocatable object file are
>> added when the linker decides to include that file (which could itself be
>> in a library) in the link. Autolinked inputs behave as if they were
>> appended to the command line as a group after all other options. As a
>> consequence the set of autolinked libraries are searched last to resolve
>> symbols.
>> 2. It is an error if a file cannot be found for a given string.
>> 3. Any command line options in effect at the end of the command line
>> parsing apply to autolinked inputs, e.g. --whole-archive.
>>
>
> I thought that the scope of this mechanism is essentially to add `-lfoo`
> automatically to the command line if you include a header that requires
> `libfoo`. From that perspective, the item 3 seems odd. Why do you need that?
>
I replied to Peter already on this; but, what I'm basically saying is that
if you had a command line like...

bob.o --whole-archive -lbar

... and bob.o was built with #pragma comment(lib, "foo"), then this
effectively is transformed into...

bob.o --whole-archive -lbar -lfoo

...and the --whole-archive applies to the -lfoo. The alternative is that you
use some defaults when handling the autolinked libraries.
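
The transformation above can be sketched as follows. This is only an illustration: the function name appendAutolinked is hypothetical, and spelling each autolinked input as -l<name> follows the shorthand used in this example rather than the full lookup rules of item 5.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Appends autolinked inputs after everything the user wrote, so whatever
// option state is in effect at the end of the command line (for example
// --whole-archive) also covers them. Repeated names are added only once,
// as in item 4 of the proposal.
std::vector<std::string> appendAutolinked(
    std::vector<std::string> commandLine,
    const std::vector<std::string>& autolinked) {
  std::unordered_set<std::string> seen;
  for (const std::string& name : autolinked) {
    if (seen.insert(name).second) {
      commandLine.push_back("-l" + name);
    }
  }
  return commandLine;
}
```

Given the command line bob.o --whole-archive -lbar and the autolinked names foo, foo, this produces bob.o --whole-archive -lbar -lfoo.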

>> 4. Duplicate autolinked inputs are ignored.
>>
>
> I'd say duplicate autolinked inputs are processed normally, but because of
> the same reason why the second parameter in `-lfoo -lfoo` is basically
> no-op, duplicated autolinked inputs are naturally ignored.
>
Again, I replied to Peter on this: it matters for .o files, and I also want
to leave the door open for GNU to be able to implement this.

>
>> 5. The linker tries to add a library or relocatable object file from each
>> of the strings in a .autolink section by; first, handling the string as if
>> it was specified on the commandline; second, by looking for the string in
>> each of the library search paths in turn; third, by looking for a
>> lib<string>.a or lib<string>.so (depending on the current mode of the
>> linker) in each of the library search paths.
>>
>
> Again, this seems like a little beyond the scope of what I expect (and it
> looks like you want to allow an .o file using this scheme).
>
Again, replied to Peter. I think that .o files should be included,
--library allows this.


bd1976 llvm via llvm-dev

2019-Mar-14 18:00 UTC

[llvm-dev] RFC: ELF Autolinking

On Thu, Mar 14, 2019 at 5:30 PM Peter Collingbourne <peter at pcc.me.uk>
wrote:
>
>> I would like to propose that we focus on autolinking exclusively and that
>> we divorce the implementation from the idea of "linker options" which, by
>> nature, would tie source code to the vagaries of particular linkers. I
>> don't see much value in supporting other linker operations so I suggest
>> that the binary representation be a mergeable string section (SHF_MERGE,
>> SHF_STRINGS), called .autolink, with custom type SHT_LLVM_AUTOLINK
>> (0x6fff4c04), and SHF_EXCLUDE set (to avoid the contents appearing in the
>> output).
>>
>
> Should we set SHF_EXCLUDE on the section? I think it may be better not to
> set it. The semantics of the bit are that the section will be excluded from
> the final output file, but there's no requirement that other sections are
> not excluded. For example SHT_REL(A) sections are interpreted by the linker
> and excluded from the output, but they do not have SHF_EXCLUDE set. The
> SHT_LLVM_AUTOLINK section could be treated similarly in compatible linkers.
> SHF_EXCLUDE is appropriate for sections that can be freely dropped without
> changing the semantics, such as the address significance table, but the
> autolink section is different because dropping it changes the semantics. By
> not setting the bit, we would cause incompatible linkers to leave the
> autolink section in the output file, which would allow downstream tools to
> be written that would detect (and possibly diagnose) such files.
>
> Peter
>
Great point Peter, accepted. I'm rather afraid to say that I just copied
the SHF_EXCLUDE from the existing .linker-options without thinking it
through. When I prototyped this, SHF_EXCLUDE didn't actually buy much
because you still need to special-case SHT_LLVM_AUTOLINK for -r links
to undo the effect of SHF_EXCLUDE. ELF probably needs some sort of
"excluded, but not for -r links" flag.

Peter Collingbourne via llvm-dev

2019-Mar-14 18:27 UTC


[llvm-dev] RFC: ELF Autolinking

On Thu, Mar 14, 2019 at 6:08 AM bd1976 llvm via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> The linker will process the .autolink strings in the following way:
>
> 1. Inputs from the .autolink sections of a relocatable object file are
> added when the linker decides to include that file (which could itself be
> in a library) in the link. Autolinked inputs behave as if they were
> appended to the command line as a group after all other options. As a
> consequence the set of autolinked libraries are searched last to resolve
> symbols.
>
If we want this to be compatible with GNU linkers, doesn't the autolinked
input need to appear at the point immediately after the object file appears
in the link? I'm imagining the case where you have a statically linked libc
as well as a libbar.a autolinked from a foo.o. The link command line would
look like this:

ld foo.o -lc

Now foo.o autolinks against bar. The command line becomes:

ld foo.o -lc -lbar

If libbar.a requires an additional object file from libc.a, it will not be
added to the link.
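
The ordering problem described above can be reproduced with a toy model of a single-pass, GNU-style link. This is only a sketch under stated assumptions: the archive contents (printf.o, bar.o) and the helper names obj and link are made up for illustration, and real linkers track far more state than two symbol sets.

```python
# Toy model of a traditional single-pass link. All file and symbol names
# are hypothetical; this is not how any real linker is implemented.

def obj(defines, needs=()):
    """Describe an object file by the symbols it defines and references."""
    return {"defines": set(defines), "needs": set(needs)}

def link(inputs):
    """Process inputs left to right; each archive is visited exactly once."""
    defined, undefined, loaded = set(), set(), []

    def add(name, o):
        loaded.append(name)
        defined.update(o["defines"])
        undefined.update(o["needs"])
        undefined.difference_update(defined)

    for name, item in inputs:
        if isinstance(item, dict):      # object file: always loaded
            add(name, item)
            continue
        progress = True                 # archive: pull members until stable
        while progress:
            progress = False
            for member, o in item:
                if member not in loaded and o["defines"] & undefined:
                    add(member, o)
                    progress = True
    return loaded, undefined

foo = obj({"main"}, needs={"bar"})
libc = [("printf.o", obj({"printf"}))]
libbar = [("bar.o", obj({"bar"}, needs={"printf"}))]

# ld foo.o -lc -lbar: libc.a is scanned before anything needs printf.
_, missing_appended = link([("foo.o", foo), ("libc.a", libc), ("libbar.a", libbar)])
# ld foo.o -lbar -lc: libbar.a is scanned first, so printf.o is still found.
_, missing_inline = link([("foo.o", foo), ("libbar.a", libbar), ("libc.a", libc)])
print(missing_appended, missing_inline)
```

In this model the appended ordering leaves printf unresolved, while placing the autolinked library directly after foo.o resolves everything.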

> 2. It is an error if a file cannot be found for a given string.
> 3. Any command line options in effect at the end of the command line
> parsing apply to autolinked inputs, e.g. --whole-archive.
> 4. Duplicate autolinked inputs are ignored.
>
This seems like it would work in GNU linkers, as long as the autolinked
file is added to the link immediately after the last mention, rather than
the first. Otherwise a command line like:

ld foo1.o foo2.o

(where foo1.o and foo2.o both autolink bar) could end up looking like:

ld foo1.o -lbar foo2.o

and you will not link anything from libbar.a that only foo2.o requires. It
may end up being simpler to not ignore duplicates.
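
A small sketch of the two deduplication policies, assuming a hypothetical helper expand that rewrites the command line by placing each autolinked library once, after either its first or its last mention. The function and the dictionary layout are illustrative only and do not correspond to any linker's internals.

```python
def expand(objects, autolinks, keep="last"):
    """Insert each autolinked library once, after its first or last mention."""
    # Index of the object after which each library should be placed.
    position = {}
    for i, o in enumerate(objects):
        for lib in autolinks.get(o, []):
            if keep == "last" or lib not in position:
                position[lib] = i
    line = []
    for i, o in enumerate(objects):
        line.append(o)
        line.extend("-l" + lib for lib, at in position.items() if at == i)
    return line

objects = ["foo1.o", "foo2.o"]
autolinks = {"foo1.o": ["bar"], "foo2.o": ["bar"]}  # both objects autolink bar

first = expand(objects, autolinks, keep="first")  # foo1.o -lbar foo2.o
last = expand(objects, autolinks, keep="last")    # foo1.o foo2.o -lbar
print(" ".join(first))
print(" ".join(last))
```

Only the "last" policy leaves libbar.a positioned after every object that refers to it, which is what a single-pass linker needs.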

> 5. The linker tries to add a library or relocatable object file from each
> of the strings in a .autolink section by; first, handling the string as if
> it was specified on the commandline; second, by looking for the string in
> each of the library search paths in turn; third, by looking for a
> lib<string>.a or lib<string>.so (depending on the current mode of the
> linker) in each of the library search paths.
>
Is the second part necessary? "-l:foo" causes the linker to search for a
file named "foo" in the library search path, so it seems that allowing the
autolink string to look like ":foo" would satisfy this use case.

> 6. A new command line option --no-llvm-autolink will tell LLD to ignore
> the .autolink sections.
>

-- 
Peter

bd1976 llvm via llvm-dev

2019-Mar-14 20:04 UTC


[llvm-dev] RFC: ELF Autolinking

On Thu, Mar 14, 2019 at 6:27 PM Peter Collingbourne <peter at pcc.me.uk>
wrote:
>
>
> On Thu, Mar 14, 2019 at 6:08 AM bd1976 llvm via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> The linker will process the .autolink strings in the following way:
>>
>> 1. Inputs from the .autolink sections of a relocatable object file are
>> added when the linker decides to include that file (which could itself be
>> in a library) in the link. Autolinked inputs behave as if they were
>> appended to the command line as a group after all other options. As a
>> consequence the set of autolinked libraries are searched last to resolve
>> symbols.
>>
>
> If we want this to be compatible with GNU linkers, doesn't the autolinked
> input need to appear at the point immediately after the object file appears
> in the link? I'm imagining the case where you have a statically linked libc
> as well as a libbar.a autolinked from a foo.o. The link command line would
> look like this:
>
> ld foo.o -lc
>
> Now foo.o autolinks against bar. The command line becomes:
>
> ld foo.o -lc -lbar
>
Actually, I was thinking that on a GNU linker the command line would become
"ld foo.o -lc -( -lbar -)"; but, this doesn't affect your point.

>
> If libbar.a requires an additional object file from libc.a, it will not be
> added to the link.
>
As it stands all the dependencies of an autolinked library must themselves
be autolinked. I had imagined that this is a reasonable limitation. If not,
we need another scheme. I will try to think of some motivating examples for
this.

>> 2. It is an error if a file cannot be found for a given string.
>> 3. Any command line options in effect at the end of the command line
>> parsing apply to autolinked inputs, e.g. --whole-archive.
>> 4. Duplicate autolinked inputs are ignored.
>>
>
> This seems like it would work in GNU linkers, as long as the autolinked
> file is added to the link immediately after the last mention, rather than
> the first. Otherwise a command line like:
>
> ld foo1.o foo2.o
>
> (where foo1.o and foo2.o both autolink bar) could end up looking like:
>
> ld foo1.o -lbar foo2.o
>
> and you will not link anything from libbar.a that only foo2.o requires. It
> may end up being simpler to not ignore duplicates.
>
Correct; but, given that the proposal was to handle the libraries as if
they are appended to the link line after everything on the command line,
I think this will work. With deduplication (and the use of SHF_MERGE)
developers get no ordering guarantees. I claim that this is a feature! My
rationale is that the order in which libraries are linked affects different
linkers in different ways (e.g. LLD does not resolve symbols from archives
in a manner compatible with either the Microsoft linker or the GNU
linkers). By not allowing the user to control the order I am essentially
saying that autolinking is not suitable for libraries that offer competing
copies of the same symbol. This ties into my argument that "comment lib"
pragmas should be handled in as "general" a way as possible.

>> 5. The linker tries to add a library or relocatable object file from each
>> of the strings in a .autolink section by; first, handling the string as if
>> it was specified on the commandline; second, by looking for the string in
>> each of the library search paths in turn; third, by looking for a
>> lib<string>.a or lib<string>.so (depending on the current mode of the
>> linker) in each of the library search paths.
>>
>
> Is the second part necessary? "-l:foo" causes the linker to search for a
> file named "foo" in the library search path, so it seems that allowing the
> autolink string to look like ":foo" would satisfy this use case.
>

I worded the proposal to avoid mapping "comment lib" pragmas to --library
command line options. My reasons:

1. I find the requirement that the user put ':' in their lib strings
slightly awkward. It means that the source code is now coupled to a
GNU-style linker. So then this isn't merely an ELF linking proposal, it's a
proposal for ELF toolchains with GNU-like linkers (e.g. the Arm linker
doesn't support the colon prefix
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0474c/Cjahbdei.html
).

2. The syntax is #pragma comment(lib, ...) not #pragma
linker-option(library, ...) i.e. the only thing this (frankly rather
bizarre) syntax definitely implies is that the argument is related to
libraries (and comments ¯\_(ツ)_/¯); it is a bit of a stretch to interpret
"comment lib" pragmas as mapping directly to "specifying an additional
--library command line option".

AFAIK all linkers support two ways of specifying inputs; firstly, directly
on the command line; secondly, with an option with very similar semantics
to GNU's --library option. I chose a method of finding input files that
encompasses both methods of specifying a library on the command line. I
think that this method is actually more intuitive than either the method
used by the linker script INPUT command or by --library. FWIW, I looked
into the history of the colon prefix. It was added in
https://www.sourceware.org/ml/binutils/2007-03/msg00421.html.
Unfortunately, the rationale given is that it was merely a port of a
VxWorks linker extension. I couldn't trace the history any further than
that to find the actual design discussion. The linker script command INPUT
uses a different scheme and the command already had this search order 20
years ago, which is the earliest version of the GNU linker I have history
for; again, the rationale is not available.
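
The three-step lookup from point 5 could be sketched roughly as follows. This is an illustration, not LLD code: the function name find_autolink_input, the static flag standing in for "the current mode of the linker", and the choice to try .so before .a in dynamic mode are all assumptions.

```python
import os

def find_autolink_input(name, search_paths, static=False):
    """Return the first existing path for an autolink string, or None."""
    # 1. Treat the string exactly like a file named on the command line.
    if os.path.isfile(name):
        return name
    # 2. Look for the literal string in each library search path.
    for d in search_paths:
        candidate = os.path.join(d, name)
        if os.path.isfile(candidate):
            return candidate
    # 3. Fall back to the lib<name>.so / lib<name>.a naming convention.
    #    Assumption: dynamic mode prefers .so, static mode only accepts .a.
    suffixes = [".a"] if static else [".so", ".a"]
    for d in search_paths:
        for suffix in suffixes:
            candidate = os.path.join(d, "lib" + name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None  # per point 2 of the proposal, the linker would report an error
```

With this ordering, #pragma comment(lib, "foo") and #pragma comment(lib, "libfoo.a") can both resolve to the same archive without the source needing a GNU-specific ":" prefix.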

