thr3ads.net - llvm dev - [llvm-dev] RFC: ELF Autolinking [Mar 2019]

James Y Knight via llvm-dev

2019-Mar-25 17:07 UTC

[llvm-dev] RFC: ELF Autolinking

Are you planning to add support for "-F" and "-framework" to ELF linkers?

On Mon, Mar 25, 2019 at 12:51 AM Saleem Abdulrasool via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Sorry for chiming in late.
>
> Yes, Swift does use autolinking, and I would like to use that on all the
> targets.  The only targets which do not currently support this
> functionality are the ELF-based ones.  That said, I think that
> `#pragma comment(link, ...)` is insufficient for my needs.  Building
> Foundation requires framework style linking as well.  The original design
> that I had in mind was derived from ld64 and link.  Personally, I still
> strongly favour link's behaviour of parsing "command line" options from
> the object files when they are loaded.  There was strong opposition to
> that approach from Rui though.  Would we want to have special pragmas for
> each "feature"?
>
> ELF doesn't have the simple model for processing the command line that
> PE/COFF does.  Because ordering is relevant to the model, it would be
> ideal to process the directives inline, but, since lld already moves far
> enough away from the traditional Unix model, perhaps we can simplify it to
> appending the directives to the end of the command line.
>
> The other case that is interesting to think about is the autolinking
> support in C++ (and clang) modules.
>
> On Thu, Mar 21, 2019 at 9:49 AM bd1976 llvm <bd1976llvm at gmail.com> wrote:
>
>> On Thu, Mar 21, 2019 at 12:06 AM Rui Ueyama <ruiu at google.com> wrote:
>>
>>> Perhaps there's no one clean way to solve this issue, because previously
>>> all libraries and object files were explicitly given to the linker via
>>> the command line, and the order of files on the command line matters.
>>> That assumes human intervention to work correctly. Now, the autolinking
>>> feature will add libraries implicitly. Since it's implicit, there will
>>> be only one way that it works, so sometimes it works and sometimes it
>>> doesn't.
>>>
>>> It feels to me that we should aim for making it work reasonably well for
>>> reasonable use cases. By reasonable use cases, I'm thinking of the
>>> following:
>>>
>>>  1. The --static option may or may not be given (i.e. we should allow
>>> the feature for both static linking and dynamic linking.)
>>>  2. There are no competing defined symbols in a given set of libraries,
>>> or if they exist, the program owner doesn't care which is linked to their
>>> program.
>>>  3. There may be circular dependencies between libraries.
>>>
>>> I don't think the above assumptions are too odd. If I had to implement
>>> the autolinking feature in the GNU linker for the above scenario, I'd
>>> probably use the following scheme:
>>>
>>>  1. While reading object files, memorize libraries that are autolinked
>>>  2. After linking everything, create a list of files consisting of
>>> autolinked libraries AND libraries given via the command line
>>>  3. Visit each file in the list as if they were wrapped in --start-group
>>> and --end-group.
>>>
>>> I'd think the above scheme should work reasonably well. What do you
>>> think?
>>>
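
[Editor's sketch] The three-step scheme quoted above can be modelled in a few lines of Python. This is only a toy model under stated assumptions, not code from any linker: the Obj class, the link_grouped function and the library_db mapping are hypothetical names, symbols are plain strings, and the order "command-line libraries first, then autolinked ones" is a guess, since the message does not specify one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Obj:
    """Toy relocatable object: defined symbols, needed symbols, autolinked libraries."""
    name: str
    defs: Set[str]
    undefs: Set[str]
    autolink: List[str] = field(default_factory=list)


def link_grouped(objects: List[Obj], cmdline_libs: List[str],
                 library_db: Dict[str, List[Obj]]) -> Tuple[List[str], List[str]]:
    """Return (names of loaded objects, sorted unresolved symbols)."""
    defined: Set[str] = set()
    undefined: Set[str] = set()
    loaded: List[str] = []
    autolinked: List[str] = []  # step 1: libraries remembered while reading objects

    def load(obj: Obj) -> None:
        loaded.append(obj.name)
        defined.update(obj.defs)
        undefined.difference_update(obj.defs)
        undefined.update(obj.undefs - defined)
        for lib in obj.autolink:
            if lib not in autolinked and lib not in cmdline_libs:
                autolinked.append(lib)

    for obj in objects:
        load(obj)

    # Steps 2 and 3: treat command-line and autolinked libraries as one group
    # and rescan the whole group until no further member is pulled in.
    progress = True
    while progress:
        progress = False
        for lib in list(cmdline_libs) + list(autolinked):
            for member in library_db[lib]:
                if member.name not in loaded and member.defs & undefined:
                    load(member)
                    progress = True
    return loaded, sorted(undefined)
```

In this model, a foo.o that autolinks "bar", linked with "c" on the command line, resolves even when libbar.a needs a member of libc.a, because the group is rescanned after bar.o is loaded.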
>>
>> Very nice. I agree with your definition of "reasonable" use cases
>> (actually, as I have said before, I think that restricting autolinking to
>> this "reasonable" set is actually a feature - to avoid developers having
>> source code that only works with a particular linker). I also like the
>> proposal for a GNU implementation - I think this is enough to show that
>> GNU-like linkers could implement this.
>>
>> At this point I will try to prototype this up so that people have an
>> implementation to play with.
>>
>> I am keen to hear from Saleem (compnerd) on this, as he did the original
>> .linker-options work.
>>
>>
>>>
>>> On Tue, Mar 19, 2019 at 11:02 AM bd1976 llvm <bd1976llvm at gmail.com>
>>> wrote:
>>>
>>>> On Mon, Mar 18, 2019 at 8:02 PM Rui Ueyama <ruiu at google.com> wrote:
>>>>
>>>>> On Thu, Mar 14, 2019 at 1:05 PM bd1976 llvm via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> On Thu, Mar 14, 2019 at 6:27 PM Peter Collingbourne <peter at pcc.me.uk>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Mar 14, 2019 at 6:08 AM bd1976 llvm via llvm-dev <
>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> At Sony we offer autolinking as a feature in our ELF toolchain. We
>>>>>>>> would like to see full support for this feature upstream as there is
>>>>>>>> anecdotal evidence that it would find use beyond Sony.
>>>>>>>>
>>>>>>>> In general autolinking (https://en.wikipedia.org/wiki/Auto-linking)
>>>>>>>> allows developers to specify inputs to the linker in their source
>>>>>>>> code. LLVM and Clang already have support for autolinking on ELF via
>>>>>>>> embedding strings, which specify linker behavior, into a
>>>>>>>> .linker-options section in relocatable object files, see:
>>>>>>>>
>>>>>>>> RFC -
>>>>>>>> http://lists.llvm.org/pipermail/llvm-dev/2018-January/120101.html
>>>>>>>> LLVM -
>>>>>>>> https://llvm.org/docs/Extensions.html#linker-options-section-linker-options,
>>>>>>>> https://reviews.llvm.org/D40849
>>>>>>>> Clang -
>>>>>>>> https://clang.llvm.org/docs/LanguageExtensions.html#specifying-linker-options-on-elf-targets,
>>>>>>>> https://reviews.llvm.org/D42758
>>>>>>>>
>>>>>>>> However, although support was added to Clang and LLVM, no support
>>>>>>>> has been implemented in LLD; and, I get the sense, from reading the
>>>>>>>> reviews, that there wasn't agreement on the implementation when the
>>>>>>>> changes landed. The original motivation seems to have been to remove
>>>>>>>> the "autolink-extract" mechanism used by Swift to work around the
>>>>>>>> lack of autolinking support for ELF. However, looking at the Swift
>>>>>>>> source code, Swift still seems to be using the "autolink-extract"
>>>>>>>> method.
>>>>>>>>
>>>>>>>> So my first question: Are there any users of the current
>>>>>>>> implementation for ELF?
>>>>>>>>
>>>>>>>> Assuming that no one is using the current code, I would like to
>>>>>>>> suggest a different mechanism for autolinking.
>>>>>>>>
>>>>>>>> For ELF we need limited autolinking support. Specifically, we only
>>>>>>>> need support for "comment lib" pragmas (
>>>>>>>> https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017)
>>>>>>>> in C/C++ e.g. #pragma comment(lib, "foo"). My suggestion is that we
>>>>>>>> keep the implementation as lean as possible.
>>>>>>>>
>>>>>>>> Principles to guide the implementation:
>>>>>>>> - Developers should be able to easily understand autolinking
>>>>>>>> behavior.
>>>>>>>> - Developers should be able to override autolinking from the linker
>>>>>>>> command line.
>>>>>>>> - Inputs specified via pragmas should be handled in a general way
>>>>>>>> to allow the same source code to work in different environments.
>>>>>>>>
>>>>>>>> I would like to propose that we focus on autolinking exclusively
>>>>>>>> and that we divorce the implementation from the idea of "linker
>>>>>>>> options" which, by nature, would tie source code to the vagaries of
>>>>>>>> particular linkers. I don't see much value in supporting other
>>>>>>>> linker operations so I suggest that the binary representation be a
>>>>>>>> mergeable string section (SHF_MERGE, SHF_STRINGS), called .autolink,
>>>>>>>> with custom type SHT_LLVM_AUTOLINK (0x6fff4c04), and SHF_EXCLUDE set
>>>>>>>> (to avoid the contents appearing in the output). The compiler can
>>>>>>>> form this section by concatenating the arguments of the "comment
>>>>>>>> lib" pragmas in the order they are encountered. Partial (-r, -Ur)
>>>>>>>> links can be handled by concatenating .autolink sections with the
>>>>>>>> normal mergeable string section rules. The current .linker-options
>>>>>>>> can remain (or be removed); but, "comment lib" pragmas for ELF
>>>>>>>> should be lowered to .autolink not to .linker-options. This makes
>>>>>>>> sense as there is no linker option that "comment lib" pragmas map
>>>>>>>> directly to. As an example, #pragma comment(lib, "foo") would result
>>>>>>>> in:
>>>>>>>>
>>>>>>>> .section ".autolink","eMS",@llvm_autolink,1
>>>>>>>>         .asciz "foo"
>>>>>>>>
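
[Editor's sketch] Since the proposed .autolink section is a run of NUL-terminated strings (one .asciz per pragma argument), its contents can be illustrated with a small Python model. The function names below are hypothetical and the UTF-8 encoding is an assumption; the merge helper keeps first-seen order only to make the output deterministic, whereas the proposal itself gives no ordering guarantee once SHF_MERGE deduplication is involved.

```python
from typing import Iterable, List


def encode_autolink(libs: Iterable[str]) -> bytes:
    """Concatenate library names as NUL-terminated strings, like repeated .asciz."""
    return b"".join(name.encode("utf-8") + b"\0" for name in libs)


def decode_autolink(section: bytes) -> List[str]:
    """Split raw section contents back into library names."""
    if section and not section.endswith(b"\0"):
        raise ValueError("section is not NUL-terminated")
    # Empty pieces (including the one after the final NUL) are skipped.
    return [piece.decode("utf-8") for piece in section.split(b"\0") if piece]


def merge_autolink(sections: Iterable[bytes]) -> List[str]:
    """Combine several sections, ignoring duplicate names."""
    seen = {}
    for section in sections:
        for name in decode_autolink(section):
            seen.setdefault(name, None)
    return list(seen)
```

For instance, merging the sections of two objects that both name "bar" yields a single "bar" entry, which corresponds to the "duplicate autolinked inputs are ignored" rule in the proposal.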
>>>>>>>> For LTO, information equivalent to the contents of the .autolink
>>>>>>>> section will be written to the IRSymtab so that it is available to
>>>>>>>> the linker for symbol resolution.
>>>>>>>>
>>>>>>>> The linker will process the .autolink strings in the following way:
>>>>>>>>
>>>>>>>> 1. Inputs from the .autolink sections of a relocatable object file
>>>>>>>> are added when the linker decides to include that file (which could
>>>>>>>> itself be in a library) in the link. Autolinked inputs behave as if
>>>>>>>> they were appended to the command line as a group after all other
>>>>>>>> options. As a consequence the set of autolinked libraries is
>>>>>>>> searched last to resolve symbols.
>>>>>>>>
>>>>>>>
>>>>>>> If we want this to be compatible with GNU linkers, doesn't the
>>>>>>> autolinked input need to appear at the point immediately after the
>>>>>>> object file appears in the link? I'm imagining the case where you
>>>>>>> have a statically linked libc as well as a libbar.a autolinked from a
>>>>>>> foo.o. The link command line would look like this:
>>>>>>>
>>>>>>> ld foo.o -lc
>>>>>>>
>>>>>>> Now foo.o autolinks against bar. The command line becomes:
>>>>>>>
>>>>>>> ld foo.o -lc -lbar
>>>>>>>
>>>>>>
>>>>>> Actually, I was thinking that on a GNU linker the command line would
>>>>>> become "ld foo.o -lc -( -lbar -)"; but, this doesn't affect your
>>>>>> point.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> If libbar.a requires an additional object file from libc.a, it will
>>>>>>> not be added to the link.
>>>>>>>
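
[Editor's sketch] The ordering problem raised above can be reproduced with a simplified model of traditional left-to-right archive handling, in which each archive is searched only at its position on the command line. This is a toy approximation, not the real GNU ld algorithm; link_single_pass and the member names are hypothetical, and symbols are plain strings.

```python
from typing import Dict, List, Set, Tuple

Member = Tuple[Set[str], Set[str]]   # (defined symbols, undefined symbols)
Archive = Dict[str, Member]          # member name -> Member


def link_single_pass(objects: Dict[str, Member],
                     archives: List[Archive]) -> Tuple[List[str], List[str]]:
    """Load all objects, then visit each archive exactly once, in order."""
    defined: Set[str] = set()
    undefined: Set[str] = set()
    loaded: List[str] = []

    def load(name: str, member: Member) -> None:
        defs, undefs = member
        loaded.append(name)
        defined.update(defs)
        undefined.difference_update(defs)
        undefined.update(undefs - defined)

    for name, member in objects.items():
        load(name, member)

    for archive in archives:
        # Members of the current archive may need each other, so rescan it
        # until nothing new is pulled in; earlier archives are never revisited.
        progress = True
        while progress:
            progress = False
            for name, member in archive.items():
                if name not in loaded and member[0] & undefined:
                    load(name, member)
                    progress = True
    return loaded, sorted(undefined)
```

With foo.o needing a symbol from libbar.a, and bar.o in turn needing memcpy from libc.a, the order "foo.o -lc -lbar" leaves memcpy unresolved in this model, while "foo.o -lbar -lc" resolves everything.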
>>>>>>>
>>>>>> As it stands all the dependencies of an autolinked library must
>>>>>> themselves be autolinked. I had imagined that this is a reasonable
>>>>>> limitation. If not we need another scheme. I will try to think of
>>>>>> some motivating examples for this.
>>>>>>
>>>>>>
>>>>>>>> 2. It is an error if a file cannot be found for a given string.
>>>>>>>> 3. Any command line options in effect at the end of the command
>>>>>>>> line parsing apply to autolinked inputs, e.g. --whole-archive.
>>>>>>>> 4. Duplicate autolinked inputs are ignored.
>>>>>>>>
>>>>>>>
>>>>>>> This seems like it would work in GNU linkers, as long as the
>>>>>>> autolinked file is added to the link immediately after the last
>>>>>>> mention, rather than the first. Otherwise a command line like:
>>>>>>>
>>>>>>> ld foo1.o foo2.o
>>>>>>>
>>>>>>> (where foo1.o and foo2.o both autolink bar) could end up looking
>>>>>>> like:
>>>>>>>
>>>>>>> ld foo1.o -lbar foo2.o
>>>>>>>
>>>>>>> and you will not link anything from libbar.a that only foo2.o
>>>>>>> requires. It may end up being simpler to not ignore duplicates.
>>>>>>>
>>>>>>
>>>>>> Correct; but, given that the proposal was to handle the libraries as
>>>>>> if they are appended to the link line after everything on the command
>>>>>> line, I think this will work. With deduplication (and the use of
>>>>>> SHF_MERGE) developers get no ordering guarantees. I claim that this is
>>>>>> a feature! My rationale is that the order in which libraries are
>>>>>> linked affects different linkers in different ways (e.g. LLD does not
>>>>>> resolve symbols from archives in a manner compatible with either the
>>>>>> Microsoft linker or the GNU linkers). By not allowing the user to
>>>>>> control the order I am essentially saying that autolinking is not
>>>>>> suitable for libraries that offer competing copies of the same symbol.
>>>>>> This ties into my argument that "comment lib" pragmas should be
>>>>>> handled in as "general" a way as possible.
>>>>>>
>>>>>
>>>>> Right. I think if you need fine control over the link order,
>>>>> autolinking is not a feature you want to use. Or, in general, if your
>>>>> program is sensitive to link order because its source object files
>>>>> have competing symbols of the same name, it's perhaps unnecessarily
>>>>> fragile.
>>>>>
>>>>> That being said, I think you need to address the issue that pcc
>>>>> pointed out. Suppose you statically link a program `foo` with the
>>>>> following command line
>>>>>
>>>>>   ld -o foo foo.o -lc
>>>>>
>>>>> where `foo.o` auto-imports libbar.a, and libbar.a depends on libc.a.
>>>>> Can your proposed feature pull out object files needed for libbar.a?
>>>>>
>>>>
>>>> It won't work on GNU linkers. It will work with LLD as LLD has
>>>> MSVC-like archive handling. However, I would like to make sure that
>>>> whatever we come up with can be supported in the GNU toolchain.
>>>>
>>>> I had thought that it would be acceptable that all the dependencies of
>>>> an autolinked library must themselves be autolinked in order to work on
>>>> GNU style linkers. Having thought more, I don't like this limitation -
>>>> especially as it doesn't exist for Microsoft style linkers. One possible
>>>> resolution could be that GNU linkers might have to implement another
>>>> command line option e.g. --auto-dep=<file> to allow injection into the
>>>> group of autolinked libraries.
>>>>
>>>> i.e. in pcc's example you would need to do: "ld foo.o --auto-dep=libc.a"
>>>> which would become "ld --start-group libbar.a libc.a --end-group" with
>>>> autolinking.
>>>>
>>>> I wanted to avoid the approach of inserting autolinked libraries after
>>>> the object that autolinks them. In LLD (and MSVC) it becomes hard to
>>>> reason about "where" the linker is in the command line and it would also
>>>> mean that we can't have the nice separation between parsing the command
>>>> line and doing the rest of the link that we currently have. Also, if you
>>>> give people a way to have fine grained control over the link order
>>>> with autolinking you risk ending up with source code that will link on
>>>> GNU style linkers but not with LLD (assuming GNU ever implemented
>>>> support for autolinking).
>>>>
>>>> Scenario:
>>>>
>>>> libbar.a(bar.o) - defines symbol bar
>>>> libfoo.a(foo.o) - defines foo and autolinks libbar.a
>>>> main.o - references foo
>>>> another.o - does not reference foo
>>>> No references to bar exist
>>>>
>>>> "lld -lfoo another.o --whole-archive main.o" with autolinking becomes
>>>> "lld -lfoo another.o --whole-archive main.o -lbar". Result: bar.o gets
>>>> added to the link.
>>>> But, if a change is made so that another.o references bar then the link
>>>> line with autolinking becomes "lld -lfoo another.o -lbar
>>>> --whole-archive main.o". Result: bar.o is not added to the link.
>>>>
>>>> Hopefully the above scenario demonstrates why I think that it becomes
>>>> too complicated to reason about the effects of autolinking with pcc's
>>>> proposed insertion scheme.
>>>>
>>>>
>>>>
>>>>>>>> 5. The linker tries to add a library or relocatable object file
>>>>>>>> from each of the strings in a .autolink section by: first, handling
>>>>>>>> the string as if it was specified on the command line; second,
>>>>>>>> looking for the string in each of the library search paths in turn;
>>>>>>>> third, looking for a lib<string>.a or lib<string>.so (depending on
>>>>>>>> the current mode of the linker) in each of the library search paths.
>>>>>>>>
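
[Editor's sketch] The three-stage lookup in point 5, together with the "error if not found" rule in point 2, might look roughly like this in Python. The function resolve_autolink and its parameters are hypothetical. How "the current mode of the linker" selects between .a and .so is not spelled out in the proposal, so the sketch assumes static mode tries only lib<string>.a while dynamic mode tries lib<string>.so before lib<string>.a.

```python
from pathlib import Path
from typing import Iterable


def resolve_autolink(name: str, search_paths: Iterable[str],
                     static: bool = False) -> Path:
    """Find the input file for one .autolink string, or raise if there is none."""
    # First: treat the string exactly like a file named on the command line.
    direct = Path(name)
    if direct.is_file():
        return direct

    dirs = [Path(d) for d in search_paths]

    # Second: look for the literal string in each library search path in turn.
    for directory in dirs:
        candidate = directory / name
        if candidate.is_file():
            return candidate

    # Third: look for lib<string>.a / lib<string>.so in each search path.
    # Assumption: the suffix order below stands in for "the current mode".
    suffixes = [".a"] if static else [".so", ".a"]
    for directory in dirs:
        for suffix in suffixes:
            candidate = directory / f"lib{name}{suffix}"
            if candidate.is_file():
                return candidate

    # Point 2 of the proposal: a string with no matching file is an error.
    raise FileNotFoundError(f"autolink: cannot find an input for {name!r}")
```

Under these assumptions, the string "foo" with a search path containing libfoo.a resolves to that archive, while a string such as "bar.o" is found by the second stage without any "lib" prefix or ":" marker.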
>>>>>>>
>>>>>>> Is the second part necessary? "-l:foo" causes the linker to search
>>>>>>> for a file named "foo" in the library search path, so it seems that
>>>>>>> allowing the autolink string to look like ":foo" would satisfy this
>>>>>>> use case.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> I worded the proposal to avoid mapping "comment lib" pragmas to
>>>>>> --library command line options. My reasons:
>>>>>>
>>>>>> 1. I find the requirement that the user put ':' in their lib strings
>>>>>> slightly awkward. It means that the source code is now coupled to a
>>>>>> GNU-style linker. So then this isn't merely an ELF linking proposal,
>>>>>> it's a proposal for ELF toolchains with GNU-like linkers (e.g. the arm
>>>>>> linker doesn't support the colon prefix
>>>>>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0474c/Cjahbdei.html
>>>>>> ).
>>>>>>
>>>>>> 2. The syntax is #pragma comment(lib, ...) not #pragma
>>>>>> linker-option(library, ...) i.e. the only thing this (frankly rather
>>>>>> bizarre) syntax definitely implies is that the argument is related to
>>>>>> libraries (and comments ¯\_(ツ)_/¯); it is a bit of a stretch to
>>>>>> interpret "comment lib" pragmas as mapping directly to "specifying an
>>>>>> additional --library command line option".
>>>>>>
>>>>>> AFAIK all linkers support two ways of specifying inputs: firstly,
>>>>>> directly on the command line; secondly, with an option with very
>>>>>> similar semantics to GNU's --library option. I chose a method of
>>>>>> finding input files that encompasses both methods of specifying a
>>>>>> library on the command line. I think that this method is actually more
>>>>>> intuitive than either the method used by the linker script INPUT
>>>>>> command or by --library. FWIW, I looked into the history of the colon
>>>>>> prefix. It was added in
>>>>>> https://www.sourceware.org/ml/binutils/2007-03/msg00421.html.
>>>>>> Unfortunately, the rationale given is that it was merely a port of a
>>>>>> vxworks linker extension. I couldn't trace the history any further
>>>>>> than that to find the actual design discussion. The linker script
>>>>>> command INPUT uses a different scheme and the command already had this
>>>>>> search order 20 years ago, which is the earliest version of the GNU
>>>>>> linker I have history for; again, the rationale is not available.
>>>>>>
>>>>>>
>>>>>>>> 6. A new command line option --no-llvm-autolink will tell LLD to
>>>>>>>> ignore the .autolink sections.
>>>>>>>>
>>>>>>>> Rationale for the above points:
>>>>>>>>
>>>>>>>> 1. Adding the autolinked inputs last makes the process simple to
>>>>>>>> understand from a developer's perspective. All linkers are able to
>>>>>>>> implement this scheme.
>>>>>>>> 2. Reporting an error for libraries that are not found seems like
>>>>>>>> better behavior than failing the link during symbol resolution.
>>>>>>>> 3. It seems useful for the user to be able to apply command line
>>>>>>>> options which will affect all of the autolinked input files. There
>>>>>>>> is a potential problem of surprise for developers, who might not
>>>>>>>> realize that these options would apply to the "invisible" autolinked
>>>>>>>> input files; however, despite the potential for surprise, this is
>>>>>>>> easy for developers to reason about and gives developers the control
>>>>>>>> that they may require.
>>>>>>>> 4. Unlike on the command line it is probably easy to include the
>>>>>>>> same input file twice via pragmas and might be a pain to fix; think
>>>>>>>> of third-party libraries supplied as binaries.
>>>>>>>> 5. This algorithm takes into account all of the different ways that
>>>>>>>> ELF linkers find input files. The different search methods are tried
>>>>>>>> by the linker in most obvious to least obvious order.
>>>>>>>> 6. I considered adding finer grained control over which .autolink
>>>>>>>> inputs were ignored (e.g. MSVC has /nodefaultlib:<library>);
>>>>>>>> however, I concluded that this is not necessary: if finer control is
>>>>>>>> required developers can recreate the same effect autolinking would
>>>>>>>> have had using command line options.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> --
>>>>>>> Peter
>>>>>>>
>>>>>>
>>>>>
>
> --
> Saleem Abdulrasool
> compnerd (at) compnerd (dot) org

Rui Ueyama via llvm-dev

2019-Mar-25 17:20 UTC


[llvm-dev] RFC: ELF Autolinking

Could you explain what that feature is?

On Mon, Mar 25, 2019 at 10:08 AM James Y Knight via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Are you planning to add support for "-F" and "-framework" to ELF linkers?
>
> On Mon, Mar 25, 2019 at 12:51 AM Saleem Abdulrasool via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Sorry for the late chiming in.
>>
>> Yes, swift does use autolinking, and I would like to use that on all
the
>> targets.  The only target which does not support this functionality
>> currently are ELF based.  That said, I think that `#pragma
comment(link,
>> ...)` is insufficient for my needs.  Building Foundation requires
framework
>> style linking as well.  The original design that I had in mind was
derived
>> from ld64 and link.  Personally, I still strongly favour link's
behaviour
>> of parsing "command line" options from the object files when
they are
>> loaded.  There was strong opposition to that approach from Rui though.
>> Would we want to have special pragmas for each "feature"?
>>
>> The ELF model doesn't have the simplistic model for processing the
>> command line that PE/COFF does.  Because ordering is relevant to the
model,
>> it would be ideal to process them inline, but, since lld already moves
far
>> enough away from the traditional Unix model, perhaps we can simplify it
to
>> append the command line directives to the end of the command line.
>>
>> The other case that is interesting to think about is the autolinking
>> support in C++ (and clang) modules.
>>
>> On Thu, Mar 21, 2019 at 9:49 AM bd1976 llvm <bd1976llvm at
gmail.com> wrote:
>>
>>> On Thu, Mar 21, 2019 at 12:06 AM Rui Ueyama <ruiu at
google.com> wrote:
>>>
>>>> Perhaps there's no one clean way to solve this issue,
because
>>>> previously all libraries and object files are explicitly given
to the
>>>> linker via a command line and the order of files in the command
line
>>>> matters. That assumes human intervention to work correctly.
Now, the
>>>> autolinking feature will add libraries implicitly. Since
it's implicit,
>>>> there will be only one way how that works, so sometimes that
works and
>>>> sometimes doesn't.
>>>>
>>>> It feels to me that we should aim for making it work reasonably
well
>>>> for reasonable use cases. By reasonable use cases, I'm
thinking of the
>>>> following:
>>>>
>>>>  1. --static option may or may not be given (i.e. we should
allow that
>>>> feature for both static linking and dynamic linking.)
>>>>  2. There are no competing defined symbols in a given set of
libraries,
>>>> or if they exist, the program owner doesn't care which is
linked to their
>>>> program.
>>>>  3. There may be circular dependencies between libraries.
>>>>
>>>> I don't think the above assumption is too odd. If I have to
implement
>>>> the autolinking feature to GNU linker for the above scenario,
I'd probably
>>>> use the following scheme:
>>>>
>>>>  1. While reading object files, memorize libraries that are
autolinked
>>>>  2. After linking everything, create a list of files consisting
of
>>>> autolinked libraries AND libraries given via the command line
>>>>  3. Visit each file in the list as if they were wrapped in
>>>> --start-group and --end-group.
>>>>
>>>> I'd think the above scheme should work reasonably well.
What do you
>>>> think?
>>>>
>>>
>>> Very nice. I agree with your definition of "reasonable"
usecaes
>>> (actually, as I have said before, I think that restricting
autolinking to
>>> this "reasonable" set is actually a feature -  to avoid
developers having
>>> source code that only works with a particular linker). I also like
the
>>> proposal for a GNU implementation - I think this is enough to show
that
>>> GNU-like linkers could implement this.
>>>
>>> At this point I will try to prototype this up so that people have
an
>>> implementation to play with.
>>>
>>> I am keen to hear from Saleem (compnerd) on this, as he did the
>>> original .linker-options work.
>>>
>>>
>>>>
>>>> On Tue, Mar 19, 2019 at 11:02 AM bd1976 llvm <bd1976llvm at
gmail.com>
>>>> wrote:
>>>>
>>>>> On Mon, Mar 18, 2019 at 8:02 PM Rui Ueyama <ruiu at
google.com> wrote:
>>>>>
>>>>>> On Thu, Mar 14, 2019 at 1:05 PM bd1976 llvm via
llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> On Thu, Mar 14, 2019 at 6:27 PM Peter Collingbourne
<peter at pcc.me.uk>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Mar 14, 2019 at 6:08 AM bd1976 llvm via
llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> At Sony we offer autolinking as a feature
in our ELF toolchain. We
>>>>>>>>> would like to see full support for this
feature upstream as there is
>>>>>>>>> anecdotal evidence that it would find use
beyond Sony.
>>>>>>>>>
>>>>>>>>> In general autolinking
(https://en.wikipedia.org/wiki/Auto-linking)
>>>>>>>>> allows developers to specify inputs to the
linker in their source code.
>>>>>>>>> LLVM and Clang already have support for
autolinking on ELF via embedding
>>>>>>>>> strings, which specify linker behavior,
into a .linker-options section in
>>>>>>>>> relocatable object files, see:
>>>>>>>>>
>>>>>>>>> RFC -
>>>>>>>>>
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120101.html
>>>>>>>>> LLVM -
>>>>>>>>>
https://llvm.org/docs/Extensions.html#linker-options-section-linker-options,
>>>>>>>>> https://reviews.llvm.org/D40849
>>>>>>>>> Clang -
>>>>>>>>>
https://clang.llvm.org/docs/LanguageExtensions.html#specifying-linker-options-on-elf-targets,
>>>>>>>>> https://reviews.llvm.org/D42758
>>>>>>>>>
>>>>>>>>> However, although support was added to
Clang and LLVM, no support
>>>>>>>>> has been implemented in LLD; and, I get the
sense, from reading the
>>>>>>>>> reviews, that there wasn't agreement on
the implementation when the changes
>>>>>>>>> landed. The original motivation seems to
have been to remove the
>>>>>>>>> "autolink-extract" mechanism used
by Swift to workaround the lack of
>>>>>>>>> autolinking support for ELF. However,
looking at the Swift source code,
>>>>>>>>> Swift still seems to be using the
"autolink-extract" method.
>>>>>>>>>
>>>>>>>>> So my first question: Are there any users
of the current
>>>>>>>>> implementation for ELF?
>>>>>>>>>
>>>>>>>>> Assuming that no one is using the current
code, I would like to
>>>>>>>>> suggest a different mechanism for
autolinking.
>>>>>>>>>
>>>>>>>>> For ELF we need limited autolinking
support. Specifically, we only
>>>>>>>>> need support for "comment lib"
pragmas (
>>>>>>>>>
https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017)
>>>>>>>>> in C/C++ e.g. #pragma comment(lib,
"foo"). My suggestion that we keep the
>>>>>>>>> implementation as lean as possible.
>>>>>>>>>
>>>>>>>>> Principles to guide the implementation:
>>>>>>>>> - Developers should be able to easily
understand autolinking
>>>>>>>>> behavior.
>>>>>>>>> - Developers should be able to override
autolinking from the
>>>>>>>>> linker command line.
>>>>>>>>> - Inputs specified via pragmas should be
handled in a general way
>>>>>>>>> to allow the same source code to work in
different environments.
>>>>>>>>>
>>>>>>>>> I would like to propose that we focus on
autolinking exclusively
>>>>>>>>> and that we divorce the implementation from
the idea of "linker options"
>>>>>>>>> which, by nature, would tie source code to
the vagaries of particular
>>>>>>>>> linkers. I don't see much value in
supporting other linker operations so I
>>>>>>>>> suggest that the binary representation be a
mergable string section
>>>>>>>>> (SHF_MERGE, SHF_STRINGS), called .autolink,
with custom type
>>>>>>>>> SHT_LLVM_AUTOLINK (0x6fff4c04), and
SHF_EXCLUDE set (to avoid the contents
>>>>>>>>> appearing in the output). The compiler can
form this section by
>>>>>>>>> concatenating the arguments of the
"comment lib" pragmas in the order they
>>>>>>>>> are encountered. Partial (-r, -Ur) links
can be handled by concatenating
>>>>>>>>> .autolink sections with the normal
mergeable string section rules. The
>>>>>>>>> current .linker-options can remain (or be
removed); but, "comment lib"
>>>>>>>>> pragmas for ELF should be lowered to
.autolink not to .linker-options. This
>>>>>>>>> makes sense as there is no linker option
that "comment lib" pragmas map
>>>>>>>>> directly to. As an example, #pragma
comment(lib, "foo") would result in:
>>>>>>>>>
>>>>>>>>> .section
".autolink","eMS", at llvm_autolink,1
>>>>>>>>>         .asciz "foo"
>>>>>>>>>
>>>>>>>>> For LTO, equivalent information to the
contents of a the .autolink
>>>>>>>>> section will be written to the IRSymtab so
that it is available to the
>>>>>>>>> linker for symbol resolution.
>>>>>>>>>
>>>>>>>>> The linker will process the .autolink
strings in the following way:
>>>>>>>>>
>>>>>>>>> 1. Inputs from the .autolink sections of a
relocatable object file
>>>>>>>>> are added when the linker decides to
include that file (which could itself
>>>>>>>>> be in a library) in the link. Autolinked
inputs behave as if they were
>>>>>>>>> appended to the command line as a group
after all other options. As a
>>>>>>>>> consequence the set of autolinked libraries
are searched last to resolve
>>>>>>>>> symbols.
>>>>>>>>>
>>>>>>>>
>>>>>>>> If we want this to be compatible with GNU
linkers, doesn't the
>>>>>>>> autolinked input need to appear at the point
immediately after the object
>>>>>>>> file appears in the link? I'm imagining the
case where you have a
>>>>>>>> statically linked libc as well as a libbar.a
autolinked from a foo.o. The
>>>>>>>> link command line would look like this:
>>>>>>>>
>>>>>>>> ld foo.o -lc
>>>>>>>>
>>>>>>>> Now foo.o autolinks against bar. The command
line becomes:
>>>>>>>>
>>>>>>>> ld foo.o -lc -lbar
>>>>>>>>
>>>>>>>
>>>>>>> Actually, I was thinking that on a GNU linker the
command line would
>>>>>>> become "ld foo.o -lc -( -lbar )-"; but,
this doesn't affect your point.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> If libbar.a requires an additional object file
from libc.a, it will
>>>>>>>> not be added to the link.
>>>>>>>>
>>>>>>>>
>>>>>>> As it stands all the dependencies of an autolinked
library must
>>>>>>> themselves be autolinked. I had imagined that this
is a reasonable
>>>>>>> limitation. If not we need another scheme. I try to
think about some
>>>>>>> motivating examples for this.
>>>>>>>
>>>>>>>
>>>>>>>> 2. It is an error if a file cannot be found for
a given string.
>>>>>>>>> 3. Any command line options in effect at
the end of the command
>>>>>>>>> line parsing apply to autolinked inputs,
e.g. --whole-archive.
>>>>>>>>> 4. Duplicate autolinked inputs are ignored.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This seems like it would work in GNU linkers,
as long as the
>>>>>>>> autolinked file is added to the link
immediately after the last mention,
>>>>>>>> rather than the first. Otherwise a command line
like:
>>>>>>>>
>>>>>>>> ld foo1.o foo2.o
>>>>>>>>
>>>>>>>> (where foo1.o and foo2.o both autolink bar)
could end up looking
>>>>>>>> like:
>>>>>>>>
>>>>>>>> ld foo1.o -lbar foo2.o
>>>>>>>>
>>>>>>>> and you will not link anything from libbar.a
that only foo2.o
>>>>>>>> requires. It may end up being simpler to not
ignore duplicates.
>>>>>>>>
>>>>>>>
>>>>>>> Correct; but, given that the proposal was to handle the libraries as
>>>>>>> if they are appended to the link line after everything on the command line
>>>>>>> then I think this will work. With deduplication (and the use of SHF_MERGE)
>>>>>>> developers get no ordering guarantees. I claim that this is a feature! My
>>>>>>> rationale is that the order in which libraries are linked affects different
>>>>>>> linkers in different ways (e.g. LLD does not resolve symbols from archives
>>>>>>> in a compatible manner with either the Microsoft linker or the GNU
>>>>>>> linkers.), by not allowing the user to control the order I am essentially
>>>>>>> saying that autolinking is not suitable for libraries that offer competing
>>>>>>> copies of the same symbol. This ties into my argument that "comment lib"
>>>>>>> pragmas should be handled in as "general" a way as possible.
>>>>>>>
>>>>>>
>>>>>> Right. I think if you need a fine control over the link order,
>>>>>> autolinking is not a feature you want to use. Or, in general, if your
>>>>>> program is sensitive to a link order because its source object files have
>>>>>> competing symbols of the same name, it's perhaps unnecessarily fragile.
>>>>>>
>>>>>> That being said, I think you need to address the issue that pcc
>>>>>> pointed out. If you statically link a program `foo` with the following
>>>>>> command line
>>>>>>
>>>>>>   ld -o foo foo.o -lc
>>>>>>
>>>>>> , `foo.o` auto-imports libbar.a, and libbar.a depends on libc.a, can
>>>>>> your proposed feature pull out object files needed for libbar.a?
>>>>>>
>>>>>
>>>>> It won't work on GNU linkers. It will work with LLD as LLD has
>>>>> MSVC-like archive handling. However, I would like to make sure that
>>>>> whatever we come up with can be supported in the GNU toolchain.
>>>>>
>>>>> I had thought that it would be acceptable that all the dependencies of
>>>>> an autolinked library must themselves be autolinked in order to work on GNU
>>>>> style linkers. Having thought more, I don't like this limitation -
>>>>> especially as it doesn't exist for Microsoft style linkers. One possible
>>>>> resolution could be that GNU linkers might have to implement another
>>>>> command line option e.g. --auto-dep=<file> to allow injection into the
>>>>> group of autolinked libraries.
>>>>>
>>>>> i.e In pcc's example you would need to do: "ld foo.o
>>>>> --auto-dep=libc.a" which would become "ld --start-group libbar.a libc.a
>>>>> --end-group" with autolinking.
>>>>>
>>>>> I wanted to avoid the approach of inserting autolinked libraries after
>>>>> the object that autolinks them. In LLD (and MSVC) it becomes hard to reason
>>>>> about "where" the linker is in the command line and it would also mean that
>>>>> we can't have the nice separation between parsing the command line and
>>>>> doing the rest of the link that we currently have. Also, if you give people
>>>>> a way to have a fine grained control over the link order with autolinking
>>>>> you risk ending up with source code that will link on GNU style linkers but
>>>>> not with LLD (assuming GNU ever implemented support for autolinking).
>>>>>
>>>>> Scenario:
>>>>>
>>>>> libbar.a(bar.o) - defines symbol bar
>>>>> libfoo.a(foo.o) - defines foo and autolinks libbar.a
>>>>> main.o - references foo
>>>>> another.o - does not reference foo
>>>>> No references to bar exist
>>>>>
>>>>> lld -lfoo another.o --whole-archive main.o with autolinking becomes
>>>>> lld -lfoo another.o --whole-archive main.o -lbar result: bar.o gets added
>>>>> to the link.
>>>>> But, if a change is made so that another.o references bar then the
>>>>> link line with autolinking becomes lld -lfoo another.o
>>>>> -lbar --whole-archive main.o result: bar.o is not added to the link.
>>>>>
>>>>> Hopefully the above scenario demonstrates why I think that it becomes
>>>>> too complicated to reason about the effects of autolinking with pcc's
>>>>> proposed insertion scheme.
>>>>>
>>>>>
>>>>>
>>>>>>>>> 5. The linker tries to add a library or relocatable object file from
>>>>>>>>> each of the strings in a .autolink section by; first, handling the string
>>>>>>>>> as if it was specified on the commandline; second, by looking for the
>>>>>>>>> string in each of the library search paths in turn; third, by looking for a
>>>>>>>>> lib<string>.a or lib<string>.so (depending on the current mode of the
>>>>>>>>> linker) in each of the library search paths.
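
[Illustration] The three-step search order in point 5 can be sketched as below. This is only a rough model under stated assumptions, not LLD's implementation: the function name resolve_autolink, the exists callback, and the static flag (standing in for "the current mode of the linker") are hypothetical names chosen for this example.

```python
import posixpath


def resolve_autolink(name, search_paths, exists, static=False):
    """Return the first path matching the three-step order from point 5.

    `exists` is a caller-supplied predicate (hypothetical) so the sketch
    can be exercised without touching the real file system.
    """
    # Step 1: treat the string as if it had been given on the command line.
    if exists(name):
        return name
    # Step 2: look for the literal string in each library search path.
    for directory in search_paths:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    # Step 3: look for lib<string>.a or lib<string>.so, depending on mode.
    suffix = ".a" if static else ".so"
    for directory in search_paths:
        candidate = posixpath.join(directory, "lib" + name + suffix)
        if exists(candidate):
            return candidate
    # Point 2 of the proposal: a string that matches nothing is an error.
    raise FileNotFoundError("autolink input not found: " + name)


files = {"/usr/lib/libfoo.so", "/opt/lib/bar.o"}
paths = ["/opt/lib", "/usr/lib"]
print(resolve_autolink("foo", paths, files.__contains__))
print(resolve_autolink("bar.o", paths, files.__contains__))
```

Note that with this order a plain "foo" is found without any ':' prefix, which is the coupling to GNU-style "-l:foo" syntax that the proposal tries to avoid.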
>>>>>>>>>
>>>>>>>>
>>>>>>>> Is the second part necessary? "-l:foo" causes the linker to search
>>>>>>>> for a file named "foo" in the library search path, so it seems that
>>>>>>>> allowing the autolink string to look like ":foo" would satisfy this use
>>>>>>>> case.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I worded the proposal to avoid mapping "comment lib" pragmas to
>>>>>>> --library command line options. My reasons:
>>>>>>>
>>>>>>> 1. I find the requirement that the user put ':' in their lib strings
>>>>>>> slightly awkward. It means that the source code is now coupled to a
>>>>>>> GNU-style linker. So then this isn't merely an ELF linking proposal, it's a
>>>>>>> proposal for ELF toolchains with GNU-like linkers (e.g. the arm linker
>>>>>>> doesn't support the colon prefix
>>>>>>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0474c/Cjahbdei.html
>>>>>>> ).
>>>>>>>
>>>>>>> 2. The syntax is #pragma comment(lib, ...) not #pragma
>>>>>>> linker-option(library, ...) i.e. the only thing this (frankly rather
>>>>>>> bizarre) syntax definitely implies is that the argument is related to
>>>>>>> libraries (and comments ¯\_(ツ)_/¯); it is a bit of a stretch to interpret
>>>>>>> "comment lib" pragmas as mapping directly to "specifying an additional
>>>>>>> --library command line option".
>>>>>>>
>>>>>>> AFAIK all linkers support two ways of specifying inputs; firstly,
>>>>>>> directly on the command line; secondly, with an option with very similar
>>>>>>> semantics to GNU's --library option. I chose a method of finding input
>>>>>>> files that encompasses both methods of specifying a library on the command
>>>>>>> line. I think that this method is actually more intuitive than either the
>>>>>>> method used by the linker script INPUT command or by --library. FWIW, I
>>>>>>> looked into the history of the colon prefix. It was added in
>>>>>>> https://www.sourceware.org/ml/binutils/2007-03/msg00421.html.
>>>>>>> Unfortunately, the rationale given is that it was merely a port of a
>>>>>>> vxworks linker extension. I couldn't trace the history any further than
>>>>>>> that to find the actual design discussion. The linker script command INPUT
>>>>>>> uses a different scheme and the command already had this search order 20
>>>>>>> years ago, which is the earliest version of the GNU linker I have history
>>>>>>> for; again, the rationale is not available.
>>>>>>>
>>>>>>>
>>>>>>>>> 6. A new command line option --no-llvm-autolink will tell LLD to
>>>>>>>>> ignore the .autolink sections.
>>>>>>>>>
>>>>>>>>> Rationale for the above points:
>>>>>>>>>
>>>>>>>>> 1. Adding the autolinked inputs last makes the process simple to
>>>>>>>>> understand from a developer's perspective. All linkers are able to implement
>>>>>>>>> this scheme.
>>>>>>>>> 2. Error-ing for libraries that are not found seems like better
>>>>>>>>> behavior than failing the link during symbol resolution.
>>>>>>>>> 3. It seems useful for the user to be able to apply command line
>>>>>>>>> options which will affect all of the autolinked input files. There is a
>>>>>>>>> potential problem of surprise for developers, who might not realize that
>>>>>>>>> these options would apply to the "invisible" autolinked input files;
>>>>>>>>> however, despite the potential for surprise, this is easy for developers to
>>>>>>>>> reason about and gives developers the control that they may require.
>>>>>>>>> 4. Unlike on the command line it is probably easy to include the
>>>>>>>>> same input file twice via pragmas and might be a pain to fix; think of
>>>>>>>>> third-party libraries supplied as binaries.
>>>>>>>>> 5. This algorithm takes into account all of the different ways
>>>>>>>>> that ELF linkers find input files. The different search methods are tried
>>>>>>>>> by the linker in most obvious to least obvious order.
>>>>>>>>> 6. I considered adding finer grained control over which .autolink
>>>>>>>>> inputs were ignored (e.g. MSVC has /nodefaultlib:<library>); however, I
>>>>>>>>> concluded that this is not necessary: if finer control is required
>>>>>>>>> developers can recreate the same effect autolinking would have had using
>>>>>>>>> command line options.
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>>
>>>>>>>>>
_______________________________________________
>>>>>>>>> LLVM Developers mailing list
>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> --
>>>>>>>> Peter
>>>>>>>>
>>>>>>>
>>>>>>
>>
>> --
>> Saleem Abdulrasool
>> compnerd (at) compnerd (dot) org
>>

James Y Knight via llvm-dev

2019-Mar-25 17:45 UTC


[llvm-dev] RFC: ELF Autolinking

In Apple-world --
1. The linker has a second search path, specified with "-F PATH", which
works just like "-L PATH", except that it applies to libraries specified
with "-framework FOO", instead of those specified with "-lFOO".
2. The option "-framework FOO", in addition to using a different
search-path, constructs the library file path differently. Instead of
looking for a file named "{Lsearchpath}/lib{FOO}.so" or
"{Lsearchpath}/lib{FOO}.a", it looks for a file named
"{Fsearchpath}/FOO.framework/FOO".
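
As a rough illustration of the two path constructions just described, here is a small sketch. The function names are made up for this example, and the order in which ".so" and ".a" are tried is an assumption; it only builds candidate paths and does not model any real linker.

```python
from pathlib import PurePosixPath


def library_candidates(name, l_search_paths):
    """Candidate files for "-lNAME" given the "-L" search paths."""
    candidates = []
    for directory in l_search_paths:
        # Assumed order: shared library first, then static archive.
        candidates.append(PurePosixPath(directory) / f"lib{name}.so")
        candidates.append(PurePosixPath(directory) / f"lib{name}.a")
    return candidates


def framework_candidates(name, f_search_paths):
    """Candidate files for "-framework NAME" given the "-F" search paths."""
    return [
        PurePosixPath(directory) / f"{name}.framework" / name
        for directory in f_search_paths
    ]


print([str(p) for p in library_candidates("FOO", ["/usr/lib"])])
print([str(p) for p in framework_candidates("FOO", ["/Library/Frameworks"])])
```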

(Note that in compilation, -F also affects the header search path -- when
you specify -F PATH, an #include "FOO/Something.h" looks for a file named
"{Fsearchpath}/FOO.framework/Headers/Something.h". So you only need to use
a single search-path option, for both compilation and linking, which
specifies the parent directory of any NNN.framework directories.)

On the plus side, this allows creating a single directory with all the
resources required for some component (headers, libraries, image
files/etc). On the downside, it intermingles the files needed for
development with the files needed for deployment, and only supports
multiarch via fat binaries.

IMO, there's not really a point to adding it -- one could just as well
install a file "libFOO.so" next to the "FOO.framework" directory instead of
within it, and then use the usual -L/-l options. But I'm not sure if Saleem
has other opinions.


On Mon, Mar 25, 2019 at 1:20 PM Rui Ueyama <ruiu at google.com> wrote:
> Could you explain what that feature is?
>
> On Mon, Mar 25, 2019 at 10:08 AM James Y Knight via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Are you planning to add support for "-F" and "-framework" to ELF linkers?
>>
>> On Mon, Mar 25, 2019 at 12:51 AM Saleem Abdulrasool via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Sorry for the late chiming in.
>>>
>>> Yes, swift does use autolinking, and I would like to use that on all the
>>> targets.  The only target which does not support this functionality
>>> currently are ELF based.  That said, I think that `#pragma comment(link,
>>> ...)` is insufficient for my needs.  Building Foundation requires framework
>>> style linking as well.  The original design that I had in mind was derived
>>> from ld64 and link.  Personally, I still strongly favour link's behaviour
>>> of parsing "command line" options from the object files when they are
>>> loaded.  There was strong opposition to that approach from Rui though.
>>> Would we want to have special pragmas for each "feature"?
>>>
>>> The ELF model doesn't have the simplistic model for processing the
>>> command line that PE/COFF does.  Because ordering is relevant to the model,
>>> it would be ideal to process them inline, but, since lld already moves far
>>> enough away from the traditional Unix model, perhaps we can simplify it to
>>> append the command line directives to the end of the command line.
>>>
>>> The other case that is interesting to think about is the autolinking
>>> support in C++ (and clang) modules.
>>>
>>> On Thu, Mar 21, 2019 at 9:49 AM bd1976 llvm <bd1976llvm at gmail.com>
>>> wrote:
>>>
>>>> On Thu, Mar 21, 2019 at 12:06 AM Rui Ueyama <ruiu at google.com> wrote:
>>>>
>>>>> Perhaps there's no one clean way to solve this issue, because
>>>>> previously all libraries and object files are explicitly given to the
>>>>> linker via a command line and the order of files in the command line
>>>>> matters. That assumes human intervention to work correctly. Now, the
>>>>> autolinking feature will add libraries implicitly. Since it's implicit,
>>>>> there will be only one way how that works, so sometimes that works and
>>>>> sometimes doesn't.
>>>>>
>>>>> It feels to me that we should aim for making it work reasonably well
>>>>> for reasonable use cases. By reasonable use cases, I'm thinking of the
>>>>> following:
>>>>>
>>>>>  1. --static option may or may not be given (i.e. we should allow that
>>>>> feature for both static linking and dynamic linking.)
>>>>>  2. There are no competing defined symbols in a given set of
>>>>> libraries, or if they exist, the program owner doesn't care which is linked
>>>>> to their program.
>>>>>  3. There may be circular dependencies between libraries.
>>>>>
>>>>> I don't think the above assumption is too odd. If I have to implement
>>>>> the autolinking feature to GNU linker for the above scenario, I'd probably
>>>>> use the following scheme:
>>>>>
>>>>>  1. While reading object files, memorize libraries that are autolinked
>>>>>  2. After linking everything, create a list of files consisting of
>>>>> autolinked libraries AND libraries given via the command line
>>>>>  3. Visit each file in the list as if they were wrapped in
>>>>> --start-group and --end-group.
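
[Illustration] A toy model may help show why step 3 matters for pcc's "ld foo.o -lc" example. The sketch below is a simplification under stated assumptions and is not how any real linker is written: archives are plain dicts mapping a member name to (defined symbols, referenced symbols), and the function name link and the group flag are invented for this example. A single left-to-right pass models traditional GNU archive handling; group=True rescans all archives until nothing new is pulled in, as --start-group/--end-group would.

```python
def link(objects, archives, group=False):
    """Return (archive members pulled in, symbols still undefined)."""
    defined, undefined, pulled = set(), set(), []

    def add(defs, refs):
        # Record new definitions, then drop references they satisfy.
        defined.update(defs)
        undefined.update(refs)
        undefined.difference_update(defined)

    def scan(archive):
        # Pull members that define a currently undefined symbol, and
        # repeat within this archive until no further member is pulled.
        progress = False
        changed = True
        while changed:
            changed = False
            for name, (defs, refs) in archive.items():
                if name not in pulled and undefined.intersection(defs):
                    add(defs, refs)
                    pulled.append(name)
                    changed = progress = True
        return progress

    for defs, refs in objects:
        add(defs, refs)
    if group:
        # Group semantics: rescan every archive until a full pass adds nothing.
        while any([scan(archive) for archive in archives]):
            pass
    else:
        # Single pass: each archive is visited once, in command line order.
        for archive in archives:
            scan(archive)
    return pulled, sorted(undefined)


foo_o = ({"main"}, {"bar"})                   # foo.o references bar
libc = {"printf.o": ({"printf"}, set())}      # libc.a defines printf
libbar = {"bar.o": ({"bar"}, {"printf"})}     # libbar.a needs printf

print(link([foo_o], [libc, libbar]))              # like "ld foo.o -lc -lbar"
print(link([foo_o], [libc, libbar], group=True))  # both libraries as one group
```

In the single-pass case printf is left undefined because libc.a was already visited before bar.o introduced the reference; treating the command line libraries and the autolinked libraries as one group resolves it.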
>>>>>
>>>>> I'd think the above scheme should work reasonably well. What do you
>>>>> think?
>>>>>
>>>>
>>>> Very nice. I agree with your definition of "reasonable" use cases
>>>> (actually, as I have said before, I think that restricting autolinking to
>>>> this "reasonable" set is actually a feature - to avoid developers having
>>>> source code that only works with a particular linker). I also like the
>>>> proposal for a GNU implementation - I think this is enough to show that
>>>> GNU-like linkers could implement this.
>>>>
>>>> At this point I will try to prototype this up so that people have an
>>>> implementation to play with.
>>>>
>>>> I am keen to hear from Saleem (compnerd) on this, as he did the
>>>> original .linker-options work.
>>>>
>>>>
>>>>>
>>>>> On Tue, Mar 19, 2019 at 11:02 AM bd1976 llvm <bd1976llvm at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Mon, Mar 18, 2019 at 8:02 PM Rui Ueyama <ruiu at google.com> wrote:
>>>>>>
>>>>>>> On Thu, Mar 14, 2019 at 1:05 PM bd1976 llvm via llvm-dev <
>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> On Thu, Mar 14, 2019 at 6:27 PM Peter Collingbourne <
>>>>>>>> peter at pcc.me.uk> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Mar 14, 2019 at 6:08 AM bd1976 llvm via llvm-dev <
>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>>> At Sony we offer autolinking as a feature in our ELF toolchain.
>>>>>>>>>> We would like to see full support for this feature upstream as there is
>>>>>>>>>> anecdotal evidence that it would find use beyond Sony.
>>>>>>>>>>
>>>>>>>>>> In general autolinking (
>>>>>>>>>> https://en.wikipedia.org/wiki/Auto-linking) allows developers to
>>>>>>>>>> specify inputs to the linker in their source code. LLVM and Clang already
>>>>>>>>>> have support for autolinking on ELF via embedding strings, which specify
>>>>>>>>>> linker behavior, into a .linker-options section in relocatable object
>>>>>>>>>> files, see:
>>>>>>>>>>
>>>>>>>>>> RFC -
>>>>>>>>>> http://lists.llvm.org/pipermail/llvm-dev/2018-January/120101.html
>>>>>>>>>> LLVM -
>>>>>>>>>> https://llvm.org/docs/Extensions.html#linker-options-section-linker-options,
>>>>>>>>>> https://reviews.llvm.org/D40849
>>>>>>>>>> Clang -
>>>>>>>>>> https://clang.llvm.org/docs/LanguageExtensions.html#specifying-linker-options-on-elf-targets,
>>>>>>>>>> https://reviews.llvm.org/D42758
>>>>>>>>>>
>>>>>>>>>> However, although support was added to Clang and LLVM, no support
>>>>>>>>>> has been implemented in LLD; and, I get the sense, from reading the
>>>>>>>>>> reviews, that there wasn't agreement on the implementation when the changes
>>>>>>>>>> landed. The original motivation seems to have been to remove the
>>>>>>>>>> "autolink-extract" mechanism used by Swift to workaround the lack of
>>>>>>>>>> autolinking support for ELF. However, looking at the Swift source code,
>>>>>>>>>> Swift still seems to be using the "autolink-extract" method.
>>>>>>>>>>
>>>>>>>>>> So my first question: Are there any users of the current
>>>>>>>>>> implementation for ELF?
>>>>>>>>>>
>>>>>>>>>> Assuming that no one is using the current code, I would like to
>>>>>>>>>> suggest a different mechanism for autolinking.
>>>>>>>>>>
>>>>>>>>>> For ELF we need limited autolinking support. Specifically, we
>>>>>>>>>> only need support for "comment lib" pragmas (
>>>>>>>>>> https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017)
>>>>>>>>>> in C/C++ e.g. #pragma comment(lib, "foo"). My suggestion is that we keep the
>>>>>>>>>> implementation as lean as possible.
>>>>>>>>>>
>>>>>>>>>> Principles to guide the implementation:
>>>>>>>>>> - Developers should be able to easily understand autolinking
>>>>>>>>>> behavior.
>>>>>>>>>> - Developers should be able to override autolinking from the
>>>>>>>>>> linker command line.
>>>>>>>>>> - Inputs specified via pragmas should be handled in a general way
>>>>>>>>>> to allow the same source code to work in different environments.
>>>>>>>>>>
>>>>>>>>>> I would like to propose that we focus on autolinking exclusively
>>>>>>>>>> and that we divorce the implementation from the idea of "linker options"
>>>>>>>>>> which, by nature, would tie source code to the vagaries of particular
>>>>>>>>>> linkers. I don't see much value in supporting other linker operations so I
>>>>>>>>>> suggest that the binary representation be a mergeable string section
>>>>>>>>>> (SHF_MERGE, SHF_STRINGS), called .autolink, with custom type
>>>>>>>>>> SHT_LLVM_AUTOLINK (0x6fff4c04), and SHF_EXCLUDE set (to avoid the contents
>>>>>>>>>> appearing in the output). The compiler can form this section by
>>>>>>>>>> concatenating the arguments of the "comment lib" pragmas in the order they
>>>>>>>>>> are encountered. Partial (-r, -Ur) links can be handled by concatenating
>>>>>>>>>> .autolink sections with the normal mergeable string section rules. The
>>>>>>>>>> current .linker-options can remain (or be removed); but, "comment lib"
>>>>>>>>>> pragmas for ELF should be lowered to .autolink not to .linker-options. This
>>>>>>>>>> makes sense as there is no linker option that "comment lib" pragmas map
>>>>>>>>>> directly to. As an example, #pragma comment(lib, "foo") would result in:
>>>>>>>>>>
>>>>>>>>>> .section ".autolink","eMS",@llvm_autolink,1
>>>>>>>>>>         .asciz "foo"
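
[Illustration] Since the proposed .autolink section is a string section (SHF_STRINGS), its contents are a run of NUL-terminated names. A minimal sketch of reading such contents and ignoring duplicates (point 4 of the proposal) might look like the following. The function names are hypothetical, the section bytes are assumed to have been extracted already, and keeping first-seen order is a choice made for this example only; the proposal itself promises no ordering.

```python
def parse_autolink(section_bytes):
    """Split raw .autolink contents into the NUL-terminated names it holds."""
    return [s.decode("utf-8") for s in section_bytes.split(b"\0") if s]


def autolink_inputs(sections):
    """Collect names from several .autolink sections, ignoring duplicates."""
    seen = {}
    for data in sections:
        for name in parse_autolink(data):
            # dict keys keep insertion order, so the first mention wins.
            seen.setdefault(name, None)
    return list(seen)


# Contents as the .asciz directives in two object files would produce them.
print(autolink_inputs([b"foo\0bar\0", b"bar\0baz\0"]))
```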
>>>>>>>>>>
>>>>>>>>>> For LTO, equivalent information to the contents of the
>>>>>>>>>> .autolink section will be written to the IRSymtab so that it is available
>>>>>>>>>> to the linker for symbol resolution.
>>>>>>>>>>
>>>>>>>>>> The linker will process the .autolink strings in the following
>>>>>>>>>> way:
>>>>>>>>>>
>>>>>>>>>> 1. Inputs from the .autolink sections of a relocatable object
>>>>>>>>>> file are added when the linker decides to include that file (which could
>>>>>>>>>> itself be in a library) in the link. Autolinked inputs behave as if they
>>>>>>>>>> were appended to the command line as a group after all other options. As a
>>>>>>>>>> consequence the set of autolinked libraries are searched last to resolve
>>>>>>>>>> symbols.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If we want this to be compatible with GNU linkers, doesn't the
>>>>>>>>> autolinked input need to appear at the point immediately after the object
>>>>>>>>> file appears in the link? I'm imagining the case where you have a
>>>>>>>>> statically linked libc as well as a libbar.a autolinked from a foo.o. The
>>>>>>>>> link command line would look like this:
>>>>>>>>>
>>>>>>>>> ld foo.o -lc
>>>>>>>>>
>>>>>>>>> Now foo.o autolinks against bar. The command line becomes:
>>>>>>>>>
>>>>>>>>> ld foo.o -lc -lbar
>>>>>>>>>
>>>>>>>>
>>>>>>>> Actually, I was thinking that on a GNU linker
the command line
>>>>>>>> would become "ld foo.o -lc -( -lbar
)-"; but, this doesn't affect your
>>>>>>>> point.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> If libbar.a requires an additional object
file from libc.a, it
>>>>>>>>> will not be added to the link.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> As it stands all the dependencies of an
autolinked library must
>>>>>>>> themselves be autolinked. I had imagined that
this is a reasonable
>>>>>>>> limitation. If not we need another scheme. I
try to think about some
>>>>>>>> motivating examples for this.
>>>>>>>>
>>>>>>>>
>>>>>>>>> 2. It is an error if a file cannot be found
for a given string.
>>>>>>>>>> 3. Any command line options in effect
at the end of the command
>>>>>>>>>> line parsing apply to autolinked
inputs, e.g. --whole-archive.
>>>>>>>>>> 4. Duplicate autolinked inputs are
ignored.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This seems like it would work in GNU
linkers, as long as the
>>>>>>>>> autolinked file is added to the link
immediately after the last mention,
>>>>>>>>> rather than the first. Otherwise a command
line like:
>>>>>>>>>
>>>>>>>>> ld foo1.o foo2.o
>>>>>>>>>
>>>>>>>>> (where foo1.o and foo2.o both autolink bar)
could end up looking
>>>>>>>>> like:
>>>>>>>>>
>>>>>>>>> ld foo1.o -lbar foo2.o
>>>>>>>>>
>>>>>>>>> and you will not link anything from
libbar.a that only foo2.o
>>>>>>>>> requires. It may end up being simpler to
not ignore duplicates.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Correct; but, given that the proposal was to
handle the libraries
>>>>>>>> as if they are appended to the link line after
everything on the command
>>>>>>>> line then I think this will work. With
deduplication (and the use of
>>>>>>>> SHF_MERGE) developers get no ordering
guarantees. I claim that this is a
>>>>>>>> feature! My rationale is that the order in
which libraries are linked
>>>>>>>> affects different linkers in different ways
(e.g. LLD does not resolve
>>>>>>>> symbols from archives in a compatible manner
with either the Microsoft
>>>>>>>> linker or the GNU linkers.), by not allowing
the user to control the order
>>>>>>>> I am essentially saying that autolinking is not
suitable for libraries that
>>>>>>>> offer competing copies of the same symbol. This
ties into my argument that
>>>>>>>> "comment lib" pragmas should be
handled in as "general" a way as possible.
>>>>>>>>
>>>>>>>
>>>>>>> Right. I think if you need a fine control over the
link order,
>>>>>>> autolinking is not a feature you want to use. Or,
in general, if your
>>>>>>> program is sensitive to a link order because its
source object files have
>>>>>>> competing symbols of the same name, it's
perhaps unnecessarily fragile.
>>>>>>>
>>>>>>> That being said, I think you need to address the
issue that pcc
>>>>>>> pointed out. If you statically link a program `foo`
with the following
>>>>>>> command line
>>>>>>>
>>>>>>>   ld -o foo foo.o -lc
>>>>>>>
>>>>>>> , `foo.o` auto-imports libbar.a, and libbar.a
depends on libc.a, can
>>>>>>> your proposed feature pull out object files needed
for libbar.a?
>>>>>>>
>>>>>>
>>>>>> It won't work on GNU linkers. It will work with LLD
as LLD has
>>>>>> MSVC-like archive handling. However, I would like to
make sure that
>>>>>> whatever we come up with can be supported in the GNU
toolchain.
>>>>>>
>>>>>> I had thought that it would be acceptable that all the
dependencies
>>>>>> of an autolinked library must themselves be autolinked
in order to work on
>>>>>> GNU style linkers. Having thought more, I don't
like this limitation -
>>>>>> especially as it doesn't exist for Microsoft style
linkers. One possible
>>>>>> resolution could be that GNU linkers might have to
implement another
>>>>>> command line option e.g. --auto-dep=<file> to
allow injection into the
>>>>>> group of autolinked libraries.
>>>>>>
>>>>>> i.e In pcc's example you would need to do: "ld
foo.o
>>>>>> --auto-dep=libc.a" which would become "ld
--start-group libbar.a libc.a
>>>>>> --end-group" with autolinking.
>>>>>>
>>>>>> I wanted to avoid the approach of inserting autolinked
libraries
>>>>>> after the object that autolinks them. In LLD (and MSVC)
it becomes hard to
>>>>>> reason about "where" the linker is in the
command line and it would also
>>>>>> mean that we can't have the nice separation between
parsing the command
>>>>>> line and doing the rest of the link that we currently
have. Also, if you
>>>>>> give people a way to have a fine grained control over
the link order with
>>>>>> autolinking you risk ending up with source code that
will link on GNU style
>>>>>> linkers but not with LLD (assuming GNU ever implemented
support for
>>>>>> autolinking).
>>>>>>
>>>>>> Scenario:
>>>>>>
>>>>>> libbar.a(bar.o) - defines symbol bar
>>>>>> libfoo.a(foo.o) - defines foo and autolinks libbar.a
>>>>>> main.o - references foo
>>>>>> another.o - does not reference foo
>>>>>> No references to bar exist
>>>>>>
>>>>>> lld -lfoo another.o --whole-archive main.o with
autolinking becomes
>>>>>> lld -lfoo another.o --whole-archive main.o -lbar
result: bar.o gets added
>>>>>> to the link.
>>>>>> But, if a change is made so that another.o references
bar then the
>>>>>> link line with autolinking becomes lld -lfoo another.o
>>>>>> -lbar --whole-archive main.o result: bar.o is not added
to the link.
>>>>>>
>>>>>> Hopefully the above scenario demonstrates why I think
that it becomes
>>>>>> too complicated to reason about the effects of
autolinking with pcc's
>>>>>> proposed insertion scheme.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> 5. The linker tries to add a library or relocatable
object file from
>>>>>>>>>> each of the strings in a .autolink
section by; first, handling the string
>>>>>>>>>> as if it was specified on the
commandline; second, by looking for the
>>>>>>>>>> string in each of the library search
paths in turn; third, by looking for a
>>>>>>>>>> lib<string>.a or
lib<string>.so (depending on the current mode of the
>>>>>>>>>> linker) in each of the library search
paths.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Is the second part necessary?
"-l:foo" causes the linker to search
>>>>>>>>> for a file named "foo" in the
library search path, so it seems that
>>>>>>>>> allowing the autolink string to look like
":foo" would satisfy this use
>>>>>>>>> case.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I worded the proposal to avoid mapping
"comment lib" pragmas to
>>>>>>>> --library command line options. My reasons:
>>>>>>>>
>>>>>>>> 1. I find the requirement that the user put
':' in their lib
>>>>>>>> strings slightly awkward. It means that the
source code is now coupled to a
>>>>>>>> GNU-style linker. So then this isn't merely
an ELF linking proposal, it's a
>>>>>>>> proposal for ELF toolchains with GNU-like
linkers (e.g. the arm linker
>>>>>>>> doesn't support the colon prefix
>>>>>>>>
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0474c/Cjahbdei.html
>>>>>>>> ).
>>>>>>>>
>>>>>>>> 2. The syntax is #pragma comment(lib, ...) not
#pragma
>>>>>>>> linker-option(library, ...) i.e. the only thing
this (frankly rather
>>>>>>>> bizarre) syntax definitely implies is that the
argument is related to
>>>>>>>> libraries (and comments ¯\_(ツ)_/¯); it is a bit
of a stretch to interpret
>>>>>>>> "comment lib" pragmas as mapping
directly to "specifying an additional
>>>>>>>> --library command line option".
>>>>>>>>
>>>>>>>> AFAIK all linkers support two ways of
specifying inputs; firstly,
>>>>>>>> directly on the command line; secondly, with an
option with very similar
>>>>>>>> semantics to GNU's --library option. I
choose a method of finding a input
>>>>>>>> files that encompasses both methods of
specifying a library on the command
>>>>>>>> line. I think that this method is actually more
intuitive than either the
>>>>>>>> method used by the linker script INPUT command
or by --library. FWIW, I
>>>>>>>> looked into the history of the colon prefix. It
was added in
>>>>>>>>
https://www.sourceware.org/ml/binutils/2007-03/msg00421.html.
>>>>>>>> Unfortunately, the rationale given is that it
was merely a port of a
>>>>>>>> vxworks linker extension. I couldn't trace
the history any further than
>>>>>>>> that to find the actual design discussion. The
linker script command INPUT
>>>>>>>> uses a different scheme and the command already
had this search order 20
>>>>>>>> years ago, which is the earliest version of the
GNU linker I have history
>>>>>>>> for; again, the rationale is not available.
>>>>>>>>
>>>>>>>>
>>>>>>>>> 6. A new command line option
--no-llvm-autolink will tell LLD to
>>>>>>>>>> ignore the .autolink sections.
>>>>>>>>>>
>>>>>>>>>> Rationale for the above points:
>>>>>>>>>>
>>>>>>>>>> 1. Adding the autolinked inputs last
makes the process simple to
>>>>>>>>>> understand from a developers
perspective. All linkers are able to implement
>>>>>>>>>> this scheme.
>>>>>>>>>> 2. Error-ing for libraries that are not
found seems like better
>>>>>>>>>> behavior than failing the link during
symbol resolution.
>>>>>>>>>> 3. It seems useful for the user to be
able to apply command line
>>>>>>>>>> options which will affect all of the
autolinked input files. There is a
>>>>>>>>>> potential problem of surprise for
developers, who might not realize that
>>>>>>>>>> these options would apply to the
"invisible" autolinked input files;
>>>>>>>>>> however, despite the potential for
surprise, this is easy for developers to
>>>>>>>>>> reason about and gives developers the
control that they may require.
>>>>>>>>>> 4. Unlike on the command line it is
probably easy to include the
>>>>>>>>>> same input file twice via pragmas and
might be a pain to fix; think of
>>>>>>>>>> Third-party libraries supplied as
binaries.
>>>>>>>>>> 5. This algorithm takes into account
all of the different ways
>>>>>>>>>> that ELF linkers find input files. The
different search methods are tried
>>>>>>>>>> by the linker in most obvious to least
obvious order.
>>>>>>>>>> 6. I considered adding finer grained
control over which .autolink
>>>>>>>>>> inputs were ignored (e.g. MSVC has
/nodefaultlib:<library>); however, I
>>>>>>>>>> concluded that this is not necessary:
if finer control is required
>>>>>>>>>> developers can recreate the same effect
autolinking would have had using
>>>>>>>>>> command line options.
>>>>>>>>>>
>>>>>>>>>> Thoughts?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> --
>>>>>>>>> Peter
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>
>>> --
>>> Saleem Abdulrasool
>>> compnerd (at) compnerd (dot) org
>>>

Saleem Abdulrasool via llvm-dev

2019-Mar-26 05:07 UTC

[llvm-dev] RFC: ELF Autolinking

No and yes.  I don't think that `-F` is possible, since that is already in
use for specifying `DT_FILTER` options for the ELF targets.  I would like
to add `-framework`.  As for handling the framework search path, perhaps a
`-framework-search-path` option would work.

On Mon, Mar 25, 2019 at 10:08 AM James Y Knight <jyknight at google.com>
wrote:
> Are you planning to add support for "-F" and "-framework" to ELF linkers?
>
> On Mon, Mar 25, 2019 at 12:51 AM Saleem Abdulrasool via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Sorry for the late chiming in.
>>
>> Yes, swift does use autolinking, and I would like to use that on all the
>> targets.  The only target which does not support this functionality
>> currently are ELF based.  That said, I think that `#pragma comment(link,
>> ...)` is insufficient for my needs.  Building Foundation requires
>> framework style linking as well.  The original design that I had in mind
>> was derived from ld64 and link.  Personally, I still strongly favour
>> link's behaviour of parsing "command line" options from the object files
>> when they are loaded.  There was strong opposition to that approach from
>> Rui though.  Would we want to have special pragmas for each "feature"?
>>
>> The ELF model doesn't have the simplistic model for processing the
>> command line that PE/COFF does.  Because ordering is relevant to the
>> model, it would be ideal to process them inline, but, since lld already
>> moves far enough away from the traditional Unix model, perhaps we can
>> simplify it to append the command line directives to the end of the
>> command line.
>>
>> The other case that is interesting to think about is the autolinking
>> support in C++ (and clang) modules.
>>
>> On Thu, Mar 21, 2019 at 9:49 AM bd1976 llvm <bd1976llvm at gmail.com>
>> wrote:
>>
>>> On Thu, Mar 21, 2019 at 12:06 AM Rui Ueyama <ruiu at google.com> wrote:
>>>
>>>> Perhaps there's no one clean way to solve this issue, because
>>>> previously all libraries and object files are explicitly given to the
>>>> linker via a command line and the order of files in the command line
>>>> matters. That assumes human intervention to work correctly. Now, the
>>>> autolinking feature will add libraries implicitly. Since it's implicit,
>>>> there will be only one way how that works, so sometimes that works and
>>>> sometimes doesn't.
>>>>
>>>> It feels to me that we should aim for making it work reasonably well
>>>> for reasonable use cases. By reasonable use cases, I'm thinking of the
>>>> following:
>>>>
>>>>  1. --static option may or may not be given (i.e. we should allow that
>>>> feature for both static linking and dynamic linking.)
>>>>  2. There are no competing defined symbols in a given set of libraries,
>>>> or if they exist, the program owner doesn't care which is linked to
>>>> their program.
>>>>  3. There may be circular dependencies between libraries.
>>>>
>>>> I don't think the above assumption is too odd. If I have to implement
>>>> the autolinking feature to GNU linker for the above scenario, I'd
>>>> probably use the following scheme:
>>>>
>>>>  1. While reading object files, memorize libraries that are autolinked
>>>>  2. After linking everything, create a list of files consisting of
>>>> autolinked libraries AND libraries given via the command line
>>>>  3. Visit each file in the list as if they were wrapped in
>>>> --start-group and --end-group.
>>>>
>>>> I'd think the above scheme should work reasonably well. What do you
>>>> think?
>>>>
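A minimal sketch of the three-step scheme quoted above, assuming a toy model in which an object is nothing more than a set of defined symbols and a set of referenced symbols. The function name `resolve_as_group` and the data layout are hypothetical and not taken from any linker. The loop only illustrates the `--start-group`/`--end-group` idea of rescanning the combined list of archives until no further member is pulled in.

```python
from typing import Dict, List, Set, Tuple

# Toy model (an assumption): an object is (defined symbols, referenced symbols).
Member = Tuple[Set[str], Set[str]]


def resolve_as_group(objects: List[Member],
                     archives: List[Dict[str, Member]]) -> List[str]:
    """Return the archive members pulled in when all archives form one group."""
    defined: Set[str] = set()
    undefined: Set[str] = set()
    for defs, refs in objects:
        defined |= defs
        undefined |= refs
    undefined -= defined

    pulled: List[str] = []
    seen: Set[Tuple[int, str]] = set()
    progress = True
    while progress:  # rescan the whole group until nothing changes
        progress = False
        for index, archive in enumerate(archives):
            for name, (defs, refs) in archive.items():
                if (index, name) in seen or not (defs & undefined):
                    continue
                seen.add((index, name))
                pulled.append(name)
                defined |= defs
                undefined = (undefined | refs) - defined
                progress = True
    return pulled


# The foo.o / libc.a / libbar.a case discussed in this thread:
# foo.o needs bar, and bar.o (in libbar.a) needs printf from libc.a.
foo_o = ({"main"}, {"bar"})
libc = {"printf.o": ({"printf"}, set())}
libbar = {"bar.o": ({"bar"}, {"printf"})}
print(resolve_as_group([foo_o], [libc, libbar]))
```

Because the group is rescanned, the result in this model does not depend on whether the command-line library or the autolinked library comes first in the list.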
>>>
>>> Very nice. I agree with your definition of "reasonable" use cases
>>> (actually, as I have said before, I think that restricting autolinking
>>> to this "reasonable" set is actually a feature - to avoid developers
>>> having source code that only works with a particular linker). I also
>>> like the proposal for a GNU implementation - I think this is enough to
>>> show that GNU-like linkers could implement this.
>>>
>>> At this point I will try to prototype this up so that people have an
>>> implementation to play with.
>>>
>>> I am keen to hear from Saleem (compnerd) on this, as he did the
>>> original .linker-options work.
>>>
>>>
>>>>
>>>> On Tue, Mar 19, 2019 at 11:02 AM bd1976 llvm <bd1976llvm at gmail.com>
>>>> wrote:
>>>>
>>>>> On Mon, Mar 18, 2019 at 8:02 PM Rui Ueyama <ruiu at google.com> wrote:
>>>>>
>>>>>> On Thu, Mar 14, 2019 at 1:05 PM bd1976 llvm via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> On Thu, Mar 14, 2019 at 6:27 PM Peter Collingbourne
>>>>>>> <peter at pcc.me.uk> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Mar 14, 2019 at 6:08 AM bd1976 llvm via llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> At Sony we offer autolinking as a feature in our ELF toolchain. We
>>>>>>>>> would like to see full support for this feature upstream as there is
>>>>>>>>> anecdotal evidence that it would find use beyond Sony.
>>>>>>>>>
>>>>>>>>> In general autolinking (https://en.wikipedia.org/wiki/Auto-linking)
>>>>>>>>> allows developers to specify inputs to the linker in their source
>>>>>>>>> code. LLVM and Clang already have support for autolinking on ELF via
>>>>>>>>> embedding strings, which specify linker behavior, into a
>>>>>>>>> .linker-options section in relocatable object files, see:
>>>>>>>>>
>>>>>>>>> RFC -
>>>>>>>>> http://lists.llvm.org/pipermail/llvm-dev/2018-January/120101.html
>>>>>>>>> LLVM -
>>>>>>>>> https://llvm.org/docs/Extensions.html#linker-options-section-linker-options,
>>>>>>>>> https://reviews.llvm.org/D40849
>>>>>>>>> Clang -
>>>>>>>>> https://clang.llvm.org/docs/LanguageExtensions.html#specifying-linker-options-on-elf-targets,
>>>>>>>>> https://reviews.llvm.org/D42758
>>>>>>>>>
>>>>>>>>> However, although support was added to Clang and LLVM, no support
>>>>>>>>> has been implemented in LLD; and, I get the sense, from reading the
>>>>>>>>> reviews, that there wasn't agreement on the implementation when the
>>>>>>>>> changes landed. The original motivation seems to have been to remove
>>>>>>>>> the "autolink-extract" mechanism used by Swift to workaround the
>>>>>>>>> lack of autolinking support for ELF. However, looking at the Swift
>>>>>>>>> source code, Swift still seems to be using the "autolink-extract"
>>>>>>>>> method.
>>>>>>>>>
>>>>>>>>> So my first question: Are there any users of the current
>>>>>>>>> implementation for ELF?
>>>>>>>>>
>>>>>>>>> Assuming that no one is using the current code, I would like to
>>>>>>>>> suggest a different mechanism for autolinking.
>>>>>>>>>
>>>>>>>>> For ELF we need limited autolinking support. Specifically, we only
>>>>>>>>> need support for "comment lib" pragmas (
>>>>>>>>> https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017)
>>>>>>>>> in C/C++ e.g. #pragma comment(lib, "foo"). My suggestion is that we
>>>>>>>>> keep the implementation as lean as possible.
>>>>>>>>>
>>>>>>>>> Principles to guide the implementation:
>>>>>>>>> - Developers should be able to easily understand autolinking
>>>>>>>>> behavior.
>>>>>>>>> - Developers should be able to override autolinking from the linker
>>>>>>>>> command line.
>>>>>>>>> - Inputs specified via pragmas should be handled in a general way to
>>>>>>>>> allow the same source code to work in different environments.
>>>>>>>>>
>>>>>>>>> I would like to propose that we focus on autolinking exclusively
>>>>>>>>> and that we divorce the implementation from the idea of "linker
>>>>>>>>> options" which, by nature, would tie source code to the vagaries of
>>>>>>>>> particular linkers. I don't see much value in supporting other
>>>>>>>>> linker operations so I suggest that the binary representation be a
>>>>>>>>> mergeable string section (SHF_MERGE, SHF_STRINGS), called .autolink,
>>>>>>>>> with custom type SHT_LLVM_AUTOLINK (0x6fff4c04), and SHF_EXCLUDE set
>>>>>>>>> (to avoid the contents appearing in the output). The compiler can
>>>>>>>>> form this section by concatenating the arguments of the "comment
>>>>>>>>> lib" pragmas in the order they are encountered. Partial (-r, -Ur)
>>>>>>>>> links can be handled by concatenating .autolink sections with the
>>>>>>>>> normal mergeable string section rules. The current .linker-options
>>>>>>>>> can remain (or be removed); but, "comment lib" pragmas for ELF
>>>>>>>>> should be lowered to .autolink not to .linker-options. This makes
>>>>>>>>> sense as there is no linker option that "comment lib" pragmas map
>>>>>>>>> directly to. As an example, #pragma comment(lib, "foo") would result
>>>>>>>>> in:
>>>>>>>>>
>>>>>>>>> .section ".autolink","eMS",@llvm_autolink,1
>>>>>>>>>         .asciz "foo"
>>>>>>>>>
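A minimal sketch of how a consumer might read such a section, assuming its contents are available as raw bytes made of NUL-terminated strings (which is what `.asciz` and SHF_STRINGS imply). The function name `parse_autolink` and the choice of UTF-8 decoding are assumptions made for illustration. Dropping repeats follows the proposal's rule, stated further down, that duplicate autolinked inputs are ignored.

```python
from typing import List


def parse_autolink(section: bytes) -> List[str]:
    """Split NUL-terminated strings and drop repeats, keeping first-seen order."""
    seen = set()
    names: List[str] = []
    for raw in section.split(b"\0"):
        if not raw:  # skip the empty piece after the final terminator
            continue
        name = raw.decode("utf-8")  # encoding is an assumption of this sketch
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


# Two objects that both contain #pragma comment(lib, "foo"), one also "bar".
print(parse_autolink(b"foo\0bar\0foo\0"))
```

Note that the proposal gives no ordering guarantee once SHF_MERGE is involved, so the first-seen order kept here is only a convenience of the sketch.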
>>>>>>>>> For LTO, equivalent information to the contents of the .autolink
>>>>>>>>> section will be written to the IRSymtab so that it is available to
>>>>>>>>> the linker for symbol resolution.
>>>>>>>>>
>>>>>>>>> The linker will process the .autolink strings in the following way:
>>>>>>>>>
>>>>>>>>> 1. Inputs from the .autolink sections of a relocatable object file
>>>>>>>>> are added when the linker decides to include that file (which could
>>>>>>>>> itself be in a library) in the link. Autolinked inputs behave as if
>>>>>>>>> they were appended to the command line as a group after all other
>>>>>>>>> options. As a consequence the set of autolinked libraries are
>>>>>>>>> searched last to resolve symbols.
>>>>>>>>>
>>>>>>>>
>>>>>>>> If we want this to be compatible with GNU linkers, doesn't the
>>>>>>>> autolinked input need to appear at the point immediately after the
>>>>>>>> object file appears in the link? I'm imagining the case where you
>>>>>>>> have a statically linked libc as well as a libbar.a autolinked from a
>>>>>>>> foo.o. The link command line would look like this:
>>>>>>>>
>>>>>>>> ld foo.o -lc
>>>>>>>>
>>>>>>>> Now foo.o autolinks against bar. The command line becomes:
>>>>>>>>
>>>>>>>> ld foo.o -lc -lbar
>>>>>>>>
>>>>>>>
>>>>>>> Actually, I was thinking that on a GNU linker the command line would
>>>>>>> become "ld foo.o -lc -( -lbar -)"; but, this doesn't affect your
>>>>>>> point.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> If libbar.a requires an additional object file from libc.a, it will
>>>>>>>> not be added to the link.
>>>>>>>>
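The point above can be illustrated with a toy model of traditional left-to-right archive handling, where each archive is searched only at its position on the command line and never revisited. This is a sketch under stated assumptions: `Obj`, `link_single_pass` and the symbol sets are hypothetical names, and real linkers track far more state than defined and undefined symbol names.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple, Union


@dataclass
class Obj:
    name: str
    defined: Set[str] = field(default_factory=set)
    refs: Set[str] = field(default_factory=set)


Archive = List[Obj]  # an archive is modelled as a list of member objects


def link_single_pass(inputs: List[Union[Obj, Archive]]) -> Tuple[List[str], Set[str]]:
    """Process inputs left to right; return (linked names, unresolved symbols)."""
    defined: Set[str] = set()
    undefined: Set[str] = set()
    linked: List[str] = []

    def add(obj: Obj) -> None:
        linked.append(obj.name)
        defined.update(obj.defined)
        undefined.update(obj.refs)
        undefined.difference_update(defined)

    for item in inputs:
        if isinstance(item, Obj):
            add(item)
            continue
        progress = True
        while progress:  # rescan this one archive only, then move on for good
            progress = False
            for member in item:
                if member.name not in linked and member.defined & undefined:
                    add(member)
                    progress = True
    return linked, undefined


foo = Obj("foo.o", {"main"}, {"bar"})
libc = [Obj("printf.o", {"printf"})]
libbar = [Obj("bar.o", {"bar"}, {"printf"})]

# "ld foo.o -lc -lbar": libc.a is searched before anything needs printf.
print(link_single_pass([foo, libc, libbar]))
# "ld foo.o -lbar -lc": the same inputs resolve fully in the other order.
print(link_single_pass([foo, libbar, libc]))
```

In the first ordering `printf` is left unresolved because libc.a has already been passed by the time bar.o asks for it, which is the failure described above.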
>>>>>>>>
>>>>>>> As it stands all the dependencies of an autolinked library must
>>>>>>> themselves be autolinked. I had imagined that this is a reasonable
>>>>>>> limitation. If not we need another scheme. I will try to think of
>>>>>>> some motivating examples for this.
>>>>>>>
>>>>>>>
>>>>>>>>> 2. It is an error if a file cannot be found for a given string.
>>>>>>>>> 3. Any command line options in effect at the end of the command line
>>>>>>>>> parsing apply to autolinked inputs, e.g. --whole-archive.
>>>>>>>>> 4. Duplicate autolinked inputs are ignored.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This seems like it would work in GNU linkers, as long as the
>>>>>>>> autolinked file is added to the link immediately after the last
>>>>>>>> mention, rather than the first. Otherwise a command line like:
>>>>>>>>
>>>>>>>> ld foo1.o foo2.o
>>>>>>>>
>>>>>>>> (where foo1.o and foo2.o both autolink bar) could end up looking
>>>>>>>> like:
>>>>>>>>
>>>>>>>> ld foo1.o -lbar foo2.o
>>>>>>>>
>>>>>>>> and you will not link anything from libbar.a that only foo2.o
>>>>>>>> requires. It may end up being simpler to not ignore duplicates.
>>>>>>>>
>>>>>>>
>>>>>>> Correct; but, given that the proposal was to handle the libraries as
>>>>>>> if they are appended to the link line after everything on the command
>>>>>>> line then I think this will work. With deduplication (and the use of
>>>>>>> SHF_MERGE) developers get no ordering guarantees. I claim that this is
>>>>>>> a feature! My rationale is that the order in which libraries are
>>>>>>> linked affects different linkers in different ways (e.g. LLD does not
>>>>>>> resolve symbols from archives in a compatible manner with either the
>>>>>>> Microsoft linker or the GNU linkers.), by not allowing the user to
>>>>>>> control the order I am essentially saying that autolinking is not
>>>>>>> suitable for libraries that offer competing copies of the same symbol.
>>>>>>> This ties into my argument that "comment lib" pragmas should be
>>>>>>> handled in as "general" a way as possible.
>>>>>>>
>>>>>>
>>>>>> Right. I think if you need a fine control over the link order,
>>>>>> autolinking is not a feature you want to use. Or, in general, if your
>>>>>> program is sensitive to a link order because its source object files
>>>>>> have competing symbols of the same name, it's perhaps unnecessarily
>>>>>> fragile.
>>>>>>
>>>>>> That being said, I think you need to address the issue that pcc
>>>>>> pointed out. If you statically link a program `foo` with the following
>>>>>> command line
>>>>>>
>>>>>>   ld -o foo foo.o -lc
>>>>>>
>>>>>> , `foo.o` auto-imports libbar.a, and libbar.a depends on libc.a, can
>>>>>> your proposed feature pull out object files needed for libbar.a?
>>>>>>
>>>>>
>>>>> It won't work on GNU linkers. It will work with LLD as LLD has
>>>>> MSVC-like archive handling. However, I would like to make sure that
>>>>> whatever we come up with can be supported in the GNU toolchain.
>>>>>
>>>>> I had thought that it would be acceptable that all the dependencies of
>>>>> an autolinked library must themselves be autolinked in order to work on
>>>>> GNU style linkers. Having thought more, I don't like this limitation -
>>>>> especially as it doesn't exist for Microsoft style linkers. One
>>>>> possible resolution could be that GNU linkers might have to implement
>>>>> another command line option e.g. --auto-dep=<file> to allow injection
>>>>> into the group of autolinked libraries.
>>>>>
>>>>> i.e. in pcc's example you would need to do:
>>>>> "ld foo.o --auto-dep=libc.a"
>>>>> which would become
>>>>> "ld --start-group libbar.a libc.a --end-group"
>>>>> with autolinking.
>>>>>
>>>>> I wanted to avoid the approach of inserting autolinked libraries after
>>>>> the object that autolinks them. In LLD (and MSVC) it becomes hard to
>>>>> reason about "where" the linker is in the command line and it would
>>>>> also mean that we can't have the nice separation between parsing the
>>>>> command line and doing the rest of the link that we currently have.
>>>>> Also, if you give people a way to have a fine grained control over the
>>>>> link order with autolinking you risk ending up with source code that
>>>>> will link on GNU style linkers but not with LLD (assuming GNU ever
>>>>> implemented support for autolinking).
>>>>>
>>>>> Scenario:
>>>>>
>>>>> libbar.a(bar.o) - defines symbol bar
>>>>> libfoo.a(foo.o) - defines foo and autolinks libbar.a
>>>>> main.o - references foo
>>>>> another.o - does not reference foo
>>>>> No references to bar exist
>>>>>
>>>>> lld -lfoo another.o --whole-archive main.o
>>>>> with autolinking becomes
>>>>> lld -lfoo another.o --whole-archive main.o -lbar
>>>>> result: bar.o gets added to the link.
>>>>>
>>>>> But, if a change is made so that another.o references bar then the
>>>>> link line with autolinking becomes
>>>>> lld -lfoo another.o -lbar --whole-archive main.o
>>>>> result: bar.o is not added to the link.
>>>>>
>>>>> Hopefully the above scenario demonstrates why I think that it becomes
>>>>> too complicated to reason about the effects of autolinking with pcc's
>>>>> proposed insertion scheme.
>>>>>
>>>>>
>>>>>
>>>>>>>>> 5. The linker tries to add a library or relocatable object file from
>>>>>>>>> each of the strings in a .autolink section by; first, handling the
>>>>>>>>> string as if it was specified on the commandline; second, by looking
>>>>>>>>> for the string in each of the library search paths in turn; third,
>>>>>>>>> by looking for a lib<string>.a or lib<string>.so (depending on the
>>>>>>>>> current mode of the linker) in each of the library search paths.
>>>>>>>>>
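A minimal sketch of the three-stage lookup in point 5, assuming the "current mode of the linker" can be reduced to a single `static` flag that picks the `.a` or `.so` suffix. The function name `find_autolink_input`, the `exists` hook and the example paths are hypothetical. Raising an error when nothing matches follows point 2 of the proposal.

```python
import os
from typing import Callable, Iterable


def find_autolink_input(name: str,
                        search_paths: Iterable[str],
                        static: bool,
                        exists: Callable[[str], bool] = os.path.isfile) -> str:
    """Resolve one .autolink string to a file path, trying the obvious first."""
    paths = list(search_paths)

    # 1. Treat the string as if it had been given directly on the command line.
    if exists(name):
        return name

    # 2. Look for the literal string in each library search path in turn.
    for directory in paths:
        candidate = os.path.join(directory, name)
        if exists(candidate):
            return candidate

    # 3. Look for lib<string>.a or lib<string>.so, depending on the mode.
    suffix = ".a" if static else ".so"
    for directory in paths:
        candidate = os.path.join(directory, "lib" + name + suffix)
        if exists(candidate):
            return candidate

    # Point 2 of the proposal: a string that matches no file is an error.
    raise FileNotFoundError("autolink input not found: " + name)


# A fake file system so the sketch runs without touching the disk.
fake_fs = {os.path.join("/usr/lib", "libfoo.a")}
print(find_autolink_input("foo", ["/opt/lib", "/usr/lib"], static=True,
                          exists=fake_fs.__contains__))
```

With this ordering a file literally named `foo` in a search path would win over `libfoo.a`, which is the behaviour the second stage exists to provide.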
>>>>>>>>
>>>>>>>> Is the second part necessary? "-l:foo" causes the linker to search
>>>>>>>> for a file named "foo" in the library search path, so it seems that
>>>>>>>> allowing the autolink string to look like ":foo" would satisfy this
>>>>>>>> use case.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I worded the proposal to avoid mapping "comment lib" pragmas to
>>>>>>> --library command line options. My reasons:
>>>>>>>
>>>>>>> 1. I find the requirement that the user put ':' in their lib strings
>>>>>>> slightly awkward. It means that the source code is now coupled to a
>>>>>>> GNU-style linker. So then this isn't merely an ELF linking proposal,
>>>>>>> it's a proposal for ELF toolchains with GNU-like linkers (e.g. the arm
>>>>>>> linker doesn't support the colon prefix
>>>>>>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0474c/Cjahbdei.html
>>>>>>> ).
>>>>>>>
>>>>>>> 2. The syntax is #pragma comment(lib, ...) not #pragma
>>>>>>> linker-option(library, ...) i.e. the only thing this (frankly rather
>>>>>>> bizarre) syntax definitely implies is that the argument is related to
>>>>>>> libraries (and comments ¯\_(ツ)_/¯); it is a bit of a stretch to
>>>>>>> interpret "comment lib" pragmas as mapping directly to "specifying an
>>>>>>> additional --library command line option".
>>>>>>>
>>>>>>> AFAIK all linkers support two ways of specifying inputs; firstly,
>>>>>>> directly on the command line; secondly, with an option with very
>>>>>>> similar semantics to GNU's --library option. I chose a method of
>>>>>>> finding input files that encompasses both methods of specifying a
>>>>>>> library on the command line. I think that this method is actually more
>>>>>>> intuitive than either the method used by the linker script INPUT
>>>>>>> command or by --library. FWIW, I looked into the history of the colon
>>>>>>> prefix. It was added in
>>>>>>> https://www.sourceware.org/ml/binutils/2007-03/msg00421.html.
>>>>>>> Unfortunately, the rationale given is that it was merely a port of a
>>>>>>> vxworks linker extension. I couldn't trace the history any further
>>>>>>> than that to find the actual design discussion. The linker script
>>>>>>> command INPUT uses a different scheme and the command already had this
>>>>>>> search order 20 years ago, which is the earliest version of the GNU
>>>>>>> linker I have history for; again, the rationale is not available.
>>>>>>>
>>>>>>>
>>>>>>>>> 6. A new command line option --no-llvm-autolink will tell LLD to
>>>>>>>>> ignore the .autolink sections.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Rationale for the above points:
>>>>>>>>>
>>>>>>>>> 1. Adding the autolinked inputs last makes the process simple to
>>>>>>>>> understand from a developers perspective. All linkers are able to
>>>>>>>>> implement this scheme.
>>>>>>>>> 2. Error-ing for libraries that are not found seems like better
>>>>>>>>> behavior than failing the link during symbol resolution.
>>>>>>>>> 3. It seems useful for the user to be able to apply command line
>>>>>>>>> options which will affect all of the autolinked input files. There
>>>>>>>>> is a potential problem of surprise for developers, who might not
>>>>>>>>> realize that these options would apply to the "invisible" autolinked
>>>>>>>>> input files; however, despite the potential for surprise, this is
>>>>>>>>> easy for developers to reason about and gives developers the control
>>>>>>>>> that they may require.
>>>>>>>>> 4. Unlike on the command line it is probably easy to include the
>>>>>>>>> same input file twice via pragmas and might be a pain to fix; think
>>>>>>>>> of Third-party libraries supplied as binaries.
>>>>>>>>> 5. This algorithm takes into account all of the different ways that
>>>>>>>>> ELF linkers find input files. The different search methods are tried
>>>>>>>>> by the linker in most obvious to least obvious order.
>>>>>>>>> 6. I considered adding finer grained control over which .autolink
>>>>>>>>> inputs were ignored (e.g. MSVC has /nodefaultlib:<library>);
>>>>>>>>> however, I concluded that this is not necessary: if finer control is
>>>>>>>>> required developers can recreate the same effect autolinking would
>>>>>>>>> have had using command line options.
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> --
>>>>>>>> Peter
>>>>>>>>
>>>>>>
>>
>> --
>> Saleem Abdulrasool
>> compnerd (at) compnerd (dot) org
>>
>
-- 
Saleem Abdulrasool
compnerd (at) compnerd (dot) org