Rui Ueyama via llvm-dev
2016-Oct-19 23:46 UTC
[llvm-dev] LLD: creating linker-generated sections as input sections instead of output sections
On Wed, Oct 19, 2016 at 3:34 AM, Peter Smith via llvm-dev
<llvm-dev at lists.llvm.org> wrote:

> Thanks for the RFC.
>
> I'm in favour of the option of creating InputSections for some linker
> generated content. I think it would add extra flexibility to the
> linker. ARM's proprietary linker uses the equivalent of InputSections
> with a pseudo linker defined ObjectFile for SHF_ALLOC content. As
> Eugene points out it isn't always appropriate for meta-data sections.

It is encouraging that ARM's linker uses the same scheme.

> In particular it would be great to have a pseudo InputFile that local
> symbols could be generated in as this would make supporting mapping
> symbols[*] in linker generated sections much easier. It would also
> enable a Thunk implementation that generates standalone InputSections
> rather than adding Thunks as patches to existing InputSections.
>
> The disadvantage with extra flexibility is that it increases the
> number of opportunities for both implementers and users to make
> mistakes, and it makes some implementation details more complicated.
> Where we would have been able to guarantee a single OutputSection, we
> may have many clumps of InputSections distributed across several
> OutputSections. In some cases it is user error to split InputSections
> apart as they need to be contiguous, which requires diagnostics, and
> in some cases algorithms need to be careful, for example in embedded
> systems it is not always appropriate to string merge between
> OutputSections as these OutputSections may not exist in memory at
> the same time (Overlays).
>
> Personally I think the additional flexibility is worth it.
>
> [*] Mapping symbols identify ranges of ARM code ($a), Thumb code ($t)
> and literal data ($d). It would be great to add these to Thunks and
> PLT entries as this would improve disassembly.
>
> On 19 October 2016 at 10:37, Eugene Leviant via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > I would suggest converting only part of the linker generated sections
> > to input sections to reduce the amount of code changes.
> > For example it's unlikely that SymbolTableSection or
> > StringTableSection would ever require such treatment, so why
> > convert them to input sections?

I think if the scheme works for all sections, and if it doesn't
unnecessarily complicate the code, we should convert all sections to
input sections in order to keep the architecture simple. "The linker
creates its own sections as input sections" is easier to
explain/understand than "... as either input sections or output
sections."

> >
> > 2016-10-19 11:03 GMT+03:00 George Rimar <grimar at accesssoftek.com>:
> >>> This idea popped up in the review thread for
> >>> https://reviews.llvm.org/D25627.
> >>>
> >>> Problem:
> >>>
> >>> Currently, LLD creates special sections that are not just
> >>> concatenations of input sections but need link-time data
> >>> generation, such as .got, .plt, .interp, .mips.options, etc., as
> >>> output sections. We have OutputSectionBase subclasses (e.g.
> >>> GotSection, PltSection, etc.) to create data. Even though this
> >>> scheme works in most cases, there are a few situations where it
> >>> doesn't work well, as you may have noticed. Here are some issues.
> >>>
> >>> - You cannot mix special sections with other types of sections.
> >>>
> >>>   For example, using linker scripts, you can instruct the linker to
> >>>   put mergeable sections and non-mergeable sections into the same
> >>>   output section. Such a script makes sense. However, LLD cannot
> >>>   handle such a script because string merging is a feature of the
> >>>   special mergeable output section. That output section doesn't
> >>>   know how to handle other types of sections, so you cannot feed
> >>>   non-mergeable sections to a mergeable output section.
> >>>
> >>> - It cannot handle linker scripts like this, as pointed out by
> >>>   Eugene.
> >>>
> >>>   .got { *(.got.plt) *(.got) }
> >>>
> >>>   In our current architecture, the .got section is an output
> >>>   section, so it cannot be added to another output section. There's
> >>>   no clean way to handle this linker script.
> >>>
> >>> Proposal:
> >>>
> >>> Here's my idea: how about creating all special sections as input
> >>> sections instead of output sections?
> >>>
> >>> GotSection, PltSection, etc. will be subclasses of InputSection that
> >>> don't have corresponding input files. What they do will remain the
> >>> same. They will be added to OutputSections just like other regular
> >>> sections are added. I think we could simplify OutputSection a lot --
> >>> OutputSection will probably become a dumb container that just
> >>> concatenates all input sections.
> >>>
> >>> This approach would solve the problems described above. Since we
> >>> would create .got as a special input section named ".got", it can
> >>> naturally be added to any output section. String merging occurs
> >>> inside a special mergeable input section, so it can be added to any
> >>> section, too.
> >>>
> >>> So, I think by moving the implementations from OutputSection to
> >>> InputSection, we can solve many problems. I cannot think of any
> >>> obvious problem with the approach.
> >>>
> >>> What do you think?
> >>
> >> To me that sounds like an interesting idea. My concern, and guess, is
> >> that the amount of code changes could be really large.
> >> But generally I do not see real problems with this approach either.
> >>
> >> George.
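The proposal quoted above can be pictured with a small C++ sketch. It is only an illustration under stated assumptions: the class names echo the thread, but the members (getSize, writeTo, addEntry, addSection, finalize) and the layoutExample helper are hypothetical and are not LLD's real interfaces, and alignment and padding are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the classes named in the thread.
class InputSection {
public:
  explicit InputSection(std::string N) : Name(std::move(N)) {}
  virtual ~InputSection() = default;
  virtual std::size_t getSize() const = 0;
  virtual void writeTo(std::uint8_t *Buf) const = 0;
  const std::string Name;
};

// A regular section: bytes that came from an object file.
class RegularSection : public InputSection {
public:
  RegularSection(std::string N, std::vector<std::uint8_t> D)
      : InputSection(std::move(N)), Data(std::move(D)) {}
  std::size_t getSize() const override { return Data.size(); }
  void writeTo(std::uint8_t *Buf) const override {
    for (std::size_t I = 0; I != Data.size(); ++I)
      Buf[I] = Data[I];
  }

private:
  std::vector<std::uint8_t> Data;
};

// A synthetic section: no input file, contents generated at link time.
// Each entry is assumed to be a 64-bit address written little-endian.
class GotSection : public InputSection {
public:
  explicit GotSection(std::string N) : InputSection(std::move(N)) {}
  void addEntry(std::uint64_t Addr) { Entries.push_back(Addr); }
  std::size_t getSize() const override { return Entries.size() * 8; }
  void writeTo(std::uint8_t *Buf) const override {
    for (std::uint64_t E : Entries)
      for (int I = 0; I != 8; ++I)
        *Buf++ = static_cast<std::uint8_t>(E >> (8 * I));
  }

private:
  std::vector<std::uint64_t> Entries;
};

// The output section as a "dumb container": it only concatenates the
// input sections it was given, whatever their kind.
class OutputSection {
public:
  explicit OutputSection(std::string N) : Name(std::move(N)) {}
  void addSection(const InputSection *S) { Sections.push_back(S); }
  std::vector<std::uint8_t> finalize() const {
    std::size_t Size = 0;
    for (const InputSection *S : Sections)
      Size += S->getSize();
    std::vector<std::uint8_t> Out(Size);
    std::size_t Off = 0;
    for (const InputSection *S : Sections) {
      if (S->getSize() != 0)
        S->writeTo(Out.data() + Off);
      Off += S->getSize();
    }
    assert(Off == Size && "every byte must be accounted for");
    return Out;
  }
  const std::string Name;

private:
  std::vector<const InputSection *> Sections;
};

// Mirrors the script `.got { *(.got.plt) *(.got) }` from the thread, plus
// one regular section to show that the kinds can be mixed.
std::vector<std::uint8_t> layoutExample() {
  GotSection GotPlt(".got.plt");
  GotPlt.addEntry(0x1000);
  GotSection Got(".got");
  Got.addEntry(0x2000);
  RegularSection Data(".data", {0xAA, 0xBB});

  OutputSection Out(".got");
  Out.addSection(&GotPlt);
  Out.addSection(&Got);
  Out.addSection(&Data);
  return Out.finalize();
}
```

In this sketch layoutExample() returns 18 bytes: two 8-byte generated entries followed by the two bytes of the regular section. The container never needs to know which of its members were generated by the linker, which is the simplification the proposal is after.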
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
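Peter's footnote on mapping symbols in the message above can be made concrete with a short C++ sketch. It assumes a linker-generated section (a Thunk or PLT entry, say) is described as a list of chunks, each either ARM code, Thumb code or literal data, and it emits one mapping symbol wherever the kind changes. The types Kind, Chunk and MappingSymbol and the function mappingSymbols are hypothetical names chosen for illustration; they are not taken from LLD.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a linker-generated section's contents.
enum class Kind { Arm, Thumb, Data };

struct Chunk {
  Kind K;
  std::uint32_t Size; // size in bytes
};

struct MappingSymbol {
  std::string Name;     // "$a", "$t" or "$d"
  std::uint32_t Offset; // offset from the start of the section
};

// Names as given in the footnote: $a for ARM code, $t for Thumb code,
// $d for literal data.
inline const char *mappingName(Kind K) {
  switch (K) {
  case Kind::Arm:
    return "$a";
  case Kind::Thumb:
    return "$t";
  case Kind::Data:
    return "$d";
  }
  return "";
}

// Emit a symbol at the start of the section and at every offset where
// the kind of content changes. Empty chunks are skipped.
std::vector<MappingSymbol> mappingSymbols(const std::vector<Chunk> &Chunks) {
  std::vector<MappingSymbol> Syms;
  std::uint32_t Off = 0;
  bool First = true;
  Kind Prev = Kind::Arm;
  for (const Chunk &C : Chunks) {
    if (C.Size == 0)
      continue;
    if (First || C.K != Prev) {
      Syms.push_back({mappingName(C.K), Off});
      Prev = C.K;
      First = false;
    }
    Off += C.Size;
  }
  // A symbol exists exactly when the section has some content.
  assert(Syms.empty() == (Off == 0));
  return Syms;
}
```

For a section made of a 4-byte ARM instruction followed by a 4-byte literal, mappingSymbols({{Kind::Arm, 4}, {Kind::Data, 4}}) yields $a at offset 0 and $d at offset 4, which is the information a disassembler needs to avoid decoding the literal as an instruction. Having a pseudo input file to own such local symbols is the part of the design Peter asks for; the sketch only computes where they would go.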
Eugene Leviant via llvm-dev
2016-Oct-21 14:45 UTC
[llvm-dev] LLD: creating linker-generated sections as input sections instead of output sections
Is anyone already working on it? If not then I can take this task.
Peter Smith via llvm-dev
2016-Oct-21 14:51 UTC
[llvm-dev] LLD: creating linker-generated sections as input sections instead of output sections
I've not started. I'm currently looking at getting static linking
working on ARM with a recent GCC sysroot, so I'll be busy for a little
while.

Peter

On 21 October 2016 at 15:45, Eugene Leviant <evgeny.leviant at gmail.com> wrote:
> Is anyone already working on it? If not then I can take this task.