thr3ads.net - llvm dev - [llvm-dev] [LLVM-DEV][LLD] RFC Range Thunks Implementation review for ARM and Mips [Apr 2017]

Rui Ueyama via llvm-dev

2017-Apr-05 23:24 UTC

[llvm-dev] [LLVM-DEV][LLD] RFC Range Thunks Implementation review for ARM and Mips

Are you suggesting moving other linker jobs, such as creating _end symbols,
to the linker script?

The linker script support was implemented after we wrote the current Writer
class, so it is somewhat "plugged in" to the Writer. It might not be the
best design, and not many other options have been explored. So there might
be room to improve the code by moving work from the Writer to the
LinkerScript. But we need to be careful not to hurt performance by doing that.

On Wed, Apr 5, 2017 at 4:14 PM, Rafael Espíndola via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> > Proposed implementation for range extension thunks
> > At a high level we need to solve the following problems:
> > - Assign addresses more than once
> > - Maintain state between successive calls of createThunks()
> > - Synchronization of the linker script and the OutputSection after
> >   adding thunks
>
> This last part seems to be the messiest. The issue is not with the
> patch; it is with the existing infrastructure, which uses completely
> different representations for linker scripts and non linker scripts.
>
> What I think is needed is for the writer to create a dummy "script"
> and use what is now LinkerScript::assignAddresses. That "script" would
>
> * Contain only OutputSectionCommand.
> * All string manipulations would have been moved before assignAddress.
> * All the orphan handling would have been made explicit before
>   assignAddress.
> * Each OutputSectionCommand would contain just an InputSectionDescription.
>
> With this the thunk creation should be able to add thunks to a single
> location.
>
> Cheers,
> Rafael
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Peter Smith via llvm-dev

2017-Apr-06 11:01 UTC

My understanding is that this would be (initially) limited to
fabricating enough linker script commands such that we could replace:
fixSectionAlignments()
assignAddresses()
Script->processNonSectionCommands()

With something like:
Script->assignAddresses() // Could be done multiple times
Script->processNonSectionCommands() // This should only be done once

In theory all the other __start and __end symbols could still be kept
separate if the linker script commands were created late, and in a
compatible way. I also don't think that this means removing
OutputSections::Sections just yet either?

I don't think that we are proposing to follow the ld.bfd model of
driving the default case via a built-in linker script yet? I think
that this would be considerably more work than just this limited
change.

I think the best way forward is to try and prototype something to see
if it shakes out any special cases. I can give this a go to see what
happens.

In the meantime I would be grateful if there is any opportunity to
move forward some of the range thunks changes in parallel, even if
they do not initially work with some linker scripts.

If the above change to always using Script->assignAddresses() did
happen then createThunks() would become a little bit more complicated
as it would need to step through one or more input section
descriptions per OutputSection. Any Thunks created would still need to
be added to both InputSectionDescriptions and OutputSections::Sections
but we could just use push_back().
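
To make the "assign addresses more than once" idea concrete, here is a
minimal sketch of the loop, not lld's code. Section, assignAddresses(),
createThunks() and layout() are hypothetical stand-ins, distances are
measured between section start addresses, and a thunk is assumed to be able
to reach any address.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <list>

// Hypothetical stand-in for a section; not lld's InputSection.
struct Section {
  uint64_t Size;
  Section *Target; // section this one branches to, or nullptr
  bool IsThunk;
  uint64_t Addr;
  explicit Section(uint64_t Size, Section *Target = nullptr,
                   bool IsThunk = false)
      : Size(Size), Target(Target), IsThunk(IsThunk), Addr(0) {}
};

// Lays the sections out back to back. Safe to call any number of times.
void assignAddresses(std::list<Section> &Secs) {
  uint64_t Dot = 0;
  for (Section &S : Secs) {
    S.Addr = Dot;
    Dot += S.Size;
  }
}

// Simplification: range is checked between section start addresses.
static bool inRange(const Section &From, const Section &To, uint64_t Range) {
  uint64_t Dist =
      From.Addr > To.Addr ? From.Addr - To.Addr : To.Addr - From.Addr;
  return Dist <= Range;
}

// One pass over the layout. Returns true if a thunk was added, which means
// the addresses are stale and must be assigned again.
bool createThunks(std::list<Section> &Secs, uint64_t Range,
                  uint64_t ThunkSize) {
  bool Added = false;
  for (auto It = Secs.begin(); It != Secs.end(); ++It) {
    if (It->IsThunk || !It->Target || inRange(*It, *It->Target, Range))
      continue;
    // The thunk goes directly after the caller, so the caller must be small
    // enough to reach it.
    assert(It->Size <= Range && "caller cannot reach an adjacent thunk");
    auto Pos = Secs.insert(std::next(It), Section(ThunkSize, It->Target, true));
    It->Target = &*Pos; // std::list keeps element addresses stable
    Added = true;
  }
  return Added;
}

// Repeats address assignment until a pass adds no thunks. Returns the
// number of address assignment passes that were needed.
int layout(std::list<Section> &Secs, uint64_t Range, uint64_t ThunkSize) {
  int Passes = 0;
  do {
    assignAddresses(Secs);
    ++Passes;
  } while (createThunks(Secs, Range, ThunkSize));
  return Passes;
}
```

In this sketch each caller is redirected at most once, so the loop terminates.
A real implementation would also have to keep the linker script's view of
the section in sync, which is the part being discussed in this thread.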

Peter

Peter Smith via llvm-dev

2017-Apr-06 16:44 UTC

Just FYI: a quick experiment that got as far as creating an
OutputSectionCommand for each OutputSection when doing a link without a
linker script exposed an interesting performance problem with the
many-sections.s test.

To reproduce just add a linker script to the link in the test that
will force the creation of a large number of orphan sections, for
example:
// RUN: echo "SECTIONS { \
// RUN:       . = SIZEOF_HEADERS; \
// RUN:       .text : { *(.text) } } \
// RUN: " > %t.script
// RUN: ld.lld %t --script %t.script -o %t2
// RUN: llvm-readobj -t %t2 | FileCheck --check-prefix=LINKED %s

With this change the test takes over 60s to run on my machine. I think the
culprit is Script->writeDataBytes(Name, Buf); in
OutputSection::writeTo(), which searches for the OutputSection by name.
With a huge number of sections this is going to take a long time. I'm
not sure if many-sections.s with a linker script is a representative
test case for lld as it stands, but if we do go down the route of
fabricating a linker script command for each output section we'll need
a better mapping from OutputSection to OutputSectionCommand than a
linear search by name.
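
As a rough illustration of what such a mapping could look like (a sketch
only; the struct definitions, findByName() and CommandIndex are hypothetical
and are not lld's types), a hash map keyed on the OutputSection pointer
replaces the per-section linear search with a constant-time lookup on
average:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, heavily simplified stand-ins for the lld classes.
struct OutputSection {
  std::string Name;
};
struct OutputSectionCommand {
  std::string Name;
};

// The slow pattern: a linear search by name. Calling this once per output
// section makes the whole link quadratic in the number of sections.
OutputSectionCommand *findByName(std::vector<OutputSectionCommand> &Cmds,
                                 const std::string &Name) {
  for (OutputSectionCommand &Cmd : Cmds)
    if (Cmd.Name == Name)
      return &Cmd;
  return nullptr;
}

// The alternative: remember which command belongs to which section when the
// command is created, and look it up by pointer afterwards.
class CommandIndex {
  std::unordered_map<const OutputSection *, OutputSectionCommand *> Map;

public:
  void add(const OutputSection *Sec, OutputSectionCommand *Cmd) {
    Map[Sec] = Cmd;
  }
  OutputSectionCommand *lookup(const OutputSection *Sec) const {
    auto It = Map.find(Sec);
    return It == Map.end() ? nullptr : It->second;
  }
};
```

The index stores raw pointers, so it assumes both containers have stopped
growing before it is filled in. Storing a pointer to the command directly in
the OutputSection would be another option with the same effect.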

Peter

Rafael Espíndola via llvm-dev

2017-Apr-06 20:28 UTC

On 5 April 2017 at 19:24, Rui Ueyama <ruiu at google.com> wrote:
> Are you suggesting moving other linker jobs, such as creating _end symbols,
> to the linker script?

No. The artificial commands would contain just sections.

Cheers,
Rafael

Rafael Espíndola via llvm-dev

2017-Apr-06 20:34 UTC

On 6 April 2017 at 07:01, Peter Smith <peter.smith at linaro.org> wrote:
> My understanding is that this would be (initially) limited to
> fabricating enough linker script commands such that we could replace:
> fixSectionAlignments()
> assignAddresses()
> Script->processNonSectionCommands()
>
> With something like:
> Script->assignAddresses() // Could be done multiple times
> Script->processNonSectionCommands() // This should only be done once

Correct.

> In theory all the other __start and __end symbols could still be kept
> separate if the linker script commands were created late, and in a
> compatible way. I also don't think that this means removing
> OutputSections::Sections just yet either?

Probably not, but it might go away in the future.

> I don't think that we are proposing to follow the ld.bfd model of
> driving the default case via a built in linker script yet? I think
> that this would be considerably more work than just this limited
> change.

I really *don't* want to see lld do that. Using a real linker script
is a bad idea as it forces the link to be section name based. There is
no way to combine sections based on their flags, for example.

We would still have exactly the same logic as to how sections are
combined. We would then just create the same structure that the linker
script address assignment logic uses.

Before any of this, we have to move all name based logic out of assignAddresses.
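
A minimal sketch of that idea, under heavy simplification: the types below
only borrow the names used in this thread and are not lld's definitions,
fabricateCommands() is a hypothetical helper, and alignment is ignored. The
point is that the address assignment walks only the fabricated commands and
never looks at a section name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; names follow the thread, layouts are invented.
struct InputSection {
  uint64_t Size = 0;
  uint64_t Addr = 0;
};
struct OutputSection {
  uint64_t Addr = 0;
  uint64_t Size = 0;
  std::vector<InputSection *> Sections; // already combined by existing logic
};
struct InputSectionDescription {
  std::vector<InputSection *> Sections;
};
struct OutputSectionCommand {
  OutputSection *Sec = nullptr;
  InputSectionDescription Desc; // exactly one description per command
};

// Builds the dummy "script": one command per output section, each holding a
// single description that lists the section's inputs in order.
std::vector<OutputSectionCommand>
fabricateCommands(std::vector<OutputSection> &OutSecs) {
  std::vector<OutputSectionCommand> Cmds;
  for (OutputSection &OS : OutSecs) {
    OutputSectionCommand Cmd;
    Cmd.Sec = &OS;
    Cmd.Desc.Sections = OS.Sections;
    Cmds.push_back(Cmd);
  }
  return Cmds;
}

// Assigns addresses from the commands alone. It can be called repeatedly,
// for example after a thunk has been added to a description. Returns the
// address just past the last section.
uint64_t assignAddresses(std::vector<OutputSectionCommand> &Cmds,
                         uint64_t Dot) {
  for (OutputSectionCommand &Cmd : Cmds) {
    Cmd.Sec->Addr = Dot;
    for (InputSection *IS : Cmd.Desc.Sections) {
      IS->Addr = Dot;
      Dot += IS->Size;
    }
    Cmd.Sec->Size = Dot - Cmd.Sec->Addr;
  }
  return Dot;
}
```

How the output sections were formed (by name, by flags, or anything else)
is invisible to assignAddresses() here, which is what would let the script
and non-script cases share one code path.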

> I think the best way forward is to try and prototype something to see
> if it splashes out any special cases. I can give this a go to see what
> happens.

Cool. I am going on vacation tomorrow night, but I will try to at
least move some of the string lookups before assign address.

> In the meantime I would be grateful if there is any opportunity to
> move forward some of the range thunks changes in parallel, even if
> they do not initially work with some linker scripts.

Could we maybe start with *no* linker script support? If the idea of
unifying the representation works out we will get that for free.

Cheers,
Rafael
