thr3ads.net - llvm dev - [llvm-dev] LLD: creating linker-generated sections as input sections instead of output sections [Oct 2016]

Rui Ueyama via llvm-dev

2016-Oct-19 23:46 UTC

[llvm-dev] LLD: creating linker-generated sections as input sections instead of output sections

On Wed, Oct 19, 2016 at 3:34 AM, Peter Smith via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Thanks for the RFC.
>
> I'm in favour of the option of creating InputSections for some linker
> generated content. I think it would add extra flexibility to the
> linker. ARM's proprietary linker uses the equivalent of InputSections
> with a pseudo linker defined ObjectFile for SHF_ALLOC content. As
> Eugene points out it isn't always appropriate for meta-data sections.
>
It is encouraging that ARM's linker uses the same scheme.

> In particular it would be great to have a pseudo InputFile that local
> symbols could be generated in as this would make supporting mapping
> symbols[*] in linker generated sections much easier. It would also
> make possible a Thunk implementation that generates standalone
> InputSections rather than adding them as patches to existing
> InputSections.
>
> The disadvantage with extra flexibility is that it increases the
> number of opportunities for both implementers and users to make
> mistakes, and it makes some implementation details more complicated.
> Where we would have been able to guarantee a single OutputSection, we
> may have many clumps of InputSections distributed across several
> OutputSections. In some cases it is user error to split InputSections
> apart as they need to be contiguous, which requires diagnostics, and
> in some cases algorithms need to be careful; for example, in embedded
> systems it is not always appropriate to string merge between
> OutputSections as these OutputSections may not exist in memory at
> the same time (Overlays).
>
> Personally I think the additional flexibility is worth it.
>
> [*] Mapping symbols identify ranges of ARM code ($a), Thumb code ($t)
> and literal data ($d). It would be great to add these to Thunks and
> PLT entries as this would improve disassembly.
>
> On 19 October 2016 at 10:37, Eugene Leviant via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > I would suggest converting only some of the linker-generated sections
> > to input sections to reduce the amount of code changes.
> > For example, it's unlikely that SymbolTableSection or
> > StringTableSection would ever require such treatment, so why
> > convert them to input sections?
>
I think if the scheme works for all sections, and if it doesn't
unnecessarily complicate the code, we should convert all sections to input
sections in order to keep the architecture simple. "The linker creates its
own sections as input sections" is easier to explain/understand than
"... as either input sections or output sections."
>
> > 2016-10-19 11:03 GMT+03:00 George Rimar <grimar at accesssoftek.com>:
> >>> This idea popped up in the review thread for
> >>> https://reviews.llvm.org/D25627.
> >>>
> >>> Problem:
> >>>
> >>> Currently, LLD creates special sections that are not just
> >>> concatenations of input sections but need link-time data generation,
> >>> such as .got, .plt, .interp, .mips.options, etc., as output sections.
> >>> We have OutputSectionBase subclasses (e.g. GotSection, PltSection,
> >>> etc.) to create data. Even though this scheme works in most cases,
> >>> there are a few situations where it doesn't work well, as you may
> >>> have noticed. Here are the issues.
> >>>
> >>> - You cannot mix special sections with other types of sections.
> >>>
> >>>   For example, using linker scripts, you can instruct the linker to
> >>>   put mergeable sections and non-mergeable sections into the same
> >>>   output section. Such a script makes sense. However, LLD cannot
> >>>   handle such a script because string merging is a feature of the
> >>>   special mergeable output section. That output section doesn't know
> >>>   how to handle other types of sections, so you cannot feed
> >>>   non-mergeable sections to a mergeable output section.
> >>>
> >>> - It cannot handle linker scripts like this, as pointed out by Eugene.
> >>>
> >>>   .got { *(.got.plt) *(.got) }
> >>>
> >>>   In our current architecture, the .got section is an output section,
> >>>   so it cannot be added to another output section. There's no clean
> >>>   way to handle this linker script.
> >>>
> >>> Proposal:
> >>>
> >>> Here's my idea: how about creating all special sections as input
> >>> sections instead of output sections?
> >>>
> >>> GotSection, PltSection, etc. will be subclasses of InputSection that
> >>> don't have corresponding input files. What they do will remain the
> >>> same. They will be added to OutputSections just like other regular
> >>> sections are added. I think we could simplify OutputSection a lot --
> >>> OutputSection will probably become a dumb container that just
> >>> concatenates all input sections.
> >>>
> >>> This approach would solve the problems described above. Because we
> >>> would create .got as a special input section with ".got" as its name,
> >>> it can naturally be added to any output section. String merging
> >>> occurs inside a special mergeable input section, so it can be added
> >>> to any section, too.
> >>>
> >>> So, I think by moving the implementations from OutputSection to
> >>> InputSection, we can solve many problems. I cannot think of any
> >>> obvious problem with the approach.
> >>>
> >>> What do you think?
> >>
> >> For me that sounds like an interesting idea. My concern is that the
> >> amount of code changes could be really large.
> >> But generally I do not see real problems with this approach either.
> >>
> >> George.
> >>
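
Editorial sketch (not part of the original messages): the quoted proposal can be illustrated with a small, self-contained C++ model. Every name here (InputSectionBase, RegularSection, the simplified GotSection and OutputSection, linkGotExample) is hypothetical and chosen only to mirror the wording of the RFC; none of it is LLD's actual class hierarchy or API. The point it tries to show is that once linker-generated sections are input sections, an output section only has to concatenate, so a script fragment such as `.got { *(.got.plt) *(.got) }` needs no special case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not LLD's real class hierarchy.
struct InputSectionBase {
  std::string name;
  explicit InputSectionBase(std::string n) : name(std::move(n)) {}
  virtual ~InputSectionBase() = default;
  virtual std::size_t size() const = 0;
  virtual void writeTo(std::uint8_t *buf) const = 0;
};

// A section whose bytes come from an input file.
struct RegularSection : InputSectionBase {
  std::vector<std::uint8_t> data;
  RegularSection(std::string n, std::vector<std::uint8_t> d)
      : InputSectionBase(std::move(n)), data(std::move(d)) {}
  std::size_t size() const override { return data.size(); }
  void writeTo(std::uint8_t *buf) const override {
    if (!data.empty())
      std::memcpy(buf, data.data(), data.size());
  }
};

// A linker-generated section with no input file: one 8-byte slot per entry.
struct GotSection : InputSectionBase {
  std::vector<std::uint64_t> entries;
  explicit GotSection(std::string n) : InputSectionBase(std::move(n)) {}
  void addEntry(std::uint64_t addr) { entries.push_back(addr); }
  std::size_t size() const override { return entries.size() * 8; }
  void writeTo(std::uint8_t *buf) const override {
    for (std::size_t i = 0; i < entries.size(); ++i)
      std::memcpy(buf + i * 8, &entries[i], 8); // host byte order, for brevity
  }
};

// The "dumb container": it only concatenates whatever it is given.
struct OutputSection {
  std::string name;
  std::vector<InputSectionBase *> sections;
  void add(InputSectionBase *s) { sections.push_back(s); }
  std::vector<std::uint8_t> finalize() const {
    std::size_t total = 0;
    for (const InputSectionBase *s : sections)
      total += s->size();
    std::vector<std::uint8_t> out(total);
    std::size_t off = 0;
    for (const InputSectionBase *s : sections) {
      s->writeTo(out.data() + off);
      off += s->size();
    }
    return out;
  }
};

// Models the script fragment `.got { *(.got.plt) *(.got) }`, with one
// file-provided section mixed in to show that the container does not care.
std::vector<std::uint8_t> linkGotExample() {
  GotSection gotPlt(".got.plt");
  gotPlt.addEntry(0x1000);
  GotSection got(".got");
  got.addEntry(0x2000);
  got.addEntry(0x3000);
  RegularSection user(".got", {0xAA, 0xBB}); // made-up bytes from an object file
  OutputSection out{".got", {}};
  out.add(&gotPlt);
  out.add(&got);
  out.add(&user);
  return out.finalize();
}
```

In this model the synthetic and file-backed sections share one interface, which is the simplification the RFC argues for; alignment, relocation and endianness handling are deliberately left out.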
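
Editorial sketch (not part of the original messages): Peter's footnote [*] says mapping symbols identify ranges of ARM code ($a), Thumb code ($t) and literal data ($d). A disassembler can use them by finding the last mapping symbol at or before a given offset. The C++ below is a minimal model of that lookup under stated assumptions: the MappingSymbol struct, the classify function and the 8-byte thunk layout in thunkSymbols are all hypothetical and do not come from LLD or from any real thunk implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

enum class Kind { Arm, Thumb, Data, Unknown };

// One mapping symbol: the content kind that starts at `offset`.
struct MappingSymbol {
  std::uint64_t offset;
  Kind kind; // Arm for $a, Thumb for $t, Data for $d
};

// Returns the kind of content at `off`. `syms` must be sorted by offset.
// Each mapping symbol covers the range up to the next mapping symbol.
Kind classify(const std::vector<MappingSymbol> &syms, std::uint64_t off) {
  auto it = std::upper_bound(
      syms.begin(), syms.end(), off,
      [](std::uint64_t o, const MappingSymbol &s) { return o < s.offset; });
  if (it == syms.begin())
    return Kind::Unknown; // no mapping symbol at or before this offset
  return std::prev(it)->kind;
}

// Hypothetical linker-generated thunk: 4 bytes of ARM code followed by a
// 4-byte literal holding the target address.
std::vector<MappingSymbol> thunkSymbols() {
  return {{0, Kind::Arm}, {4, Kind::Data}};
}
```

With symbols like these attached to a generated section, a tool would decode offsets 0 to 3 as ARM instructions and treat offsets 4 to 7 as data rather than trying to disassemble the literal.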

Eugene Leviant via llvm-dev

2016-Oct-21 14:45 UTC

Is anyone already working on it? If not then I can take this task.


Peter Smith via llvm-dev

2016-Oct-21 14:51 UTC

I've not started. I'm currently looking at getting static linking
working on ARM with a recent GCC sysroot, so I'll be busy for a little
while.

Peter

On 21 October 2016 at 15:45, Eugene Leviant <evgeny.leviant at gmail.com> wrote:
> Is anyone already working on it? If not then I can take this task.

George Rimar via llvm-dev

2016-Oct-21 14:51 UTC

> Is anyone already working on it? If not then I can take this task.
Not me.

George.
