Peter Smith via llvm-dev
2016-Jun-21 12:38 UTC
[llvm-dev] [LLD] thunk implementation correctness depends on order of input section.
I've been working on supporting ARM/Thumb interworking thunks in LLD and have encountered a limitation that I think it is worth bringing up in a wider context. This is all LLD specific, apologies if I've abused llvm-dev here. TL;DR summary: - Thunks in lld may not work if they are added to InputSections that have already been scanned. - There is a short term fix, but in the longer term I think that we will want to allow multiple passes of the relocations. - I'd like to know if there are any preferences on a solution. The current thunk implementation scans for and adds Thunks to InputSections within scanRelocations(). At present a Thunk for a RegularSymbol is always added to the InputSection that defines the RegularSymbol. I think that this can cause a problem when the InputSection with the relocation that needs to be indirected via the thunk is processed after the InputSection that the thunk is added to. For reference the important code snippet in Writer.cpp is: // Scan relocations. This must be done after every symbol is declared so that // we can correctly decide if a dynamic relocation is needed. for (OutputSectionBase<ELFT> *Sec : OutputSections) { Sec->forEachInputSection([&](InputSectionBase<ELFT> *S) { if (auto *IS = dyn_cast<InputSection<ELFT>>(S)) { // Set OutSecOff so that scanRelocations can use it. uintX_t Off = alignTo(Sec->getSize(), S->Alignment); IS->OutSecOff = Off; scanRelocations(*IS); // Now that scan relocs possibly changed the size, update the offset. Sec->setSize(Off + S->getSize()); } else if (auto *EH = dyn_cast<EhInputSection<ELFT>>(S)) { if (EH->RelocSection) scanRelocations(*EH, *EH->RelocSection); } }); } Adding a thunk to an InputSection that has already been processed will mean that its size will not be updated by Sec->setSize(Off + S->getSize()); this leads to corrupt or missing thunks being generated as they are written to invalid buffer location. To reproduce the problem I've taken the existing test/ELF/mips-npic-call-pic.s and reversed the order of the input objects so that the relocations needing thunks are processed last. I've attached the patch I've made to the test case and the ouputs of llvm-objdump for convenience. In the mips-thunk-correct.txt output all thunks are present, and of the correct size. In the mips-thunk-corrupt.txt not all thunks are present and some appear to be corrupt. Given that scanRelocations depends on IS->OutSecOff it isn't as simple as just finding the InputSection and correcting its size as this will affect the value of IS->OutSecOff of all sections that follow it, potentially invalidating any decision made on the basis of IS->OutSecOff. The simplest short-term fix is to add the thunks to the InputSection that contains the relocation needing the Thunk (The current IS). In the longer term, with range-extension thunks in mind, I think it would be wise to allow multiple passes through the relocations. For example one possibility for the example above is that adding a Thunk to a section already processed triggers another pass of the loop, updating anything created that relies upon IS->OutSecOff. My current thought is that to get a simple implementation of ARM/Thumb interworking thunks working, including thunks to PLT entries [*], I will need to add support for a Thunk to be added to a different InputSection than the target SymbolBody anyway. So I can add all Thunks to the current InputSection fix as part of the ARM/Thumb interworking. Support for range extension thunks are a way down the priority list at the moment so I don't propose to make any radical changes to scanRelocations at this point, however I'm willing to do so if others would prefer? [*] The interworking can be inlined into the ARM PLT entries, at the cost of requiring multiple entry points for every ARM PLT entry. My preference is to implement support for thunks to PLT entries. Kind regards Peter -------------- next part -------------- /work/linaro/build/llvm/tools/lld/test/ELF/Output/mips-npic-call-pic.s.tmp.exe: file format ELF32-mips Disassembly of section .text: fpic: 20000: 00 00 00 00 nop 20004: 00 00 00 00 nop 20008: 00 00 00 00 nop 2000c: 00 00 00 00 nop fnpic: 20010: 00 00 00 00 nop 20014: 00 00 00 00 nop 20018: 00 00 00 00 nop 2001c: 00 00 00 00 nop foo1a: 20020: 00 00 00 00 nop foo1b: 20024: 00 00 00 00 nop 20028: 3c 19 00 02 lui $25, 2 2002c: 08 00 80 08 j 131104 <foo1a> foo2: 20030: 00 00 00 00 nop 20034: 3c 19 00 02 lui $25, 2 20038: 08 00 80 0c j 131120 <foo2> 2003c: 27 39 00 30 addiu $25, $25, 48 __start: 20040: 0c 00 80 0a jal 131112 <foo1b+0x4> 20044: 00 00 00 00 nop 20048: 0c 00 80 0d jal 131124 <foo2+0x4> 2004c: 00 00 00 00 nop 20050: 0c 00 80 0e jal 131128 <foo2+0x8> 20054: 00 00 00 00 nop 20058: 0c 00 80 0d jal 131124 <foo2+0x4> 2005c: 00 00 00 00 nop 20060: 0c 00 80 08 jal 131104 <foo1a> 20064: 00 00 00 00 nop 20068: 0c 00 80 04 jal 131088 <fnpic> 2006c: 00 00 00 00 nop -------------- next part -------------- /work/linaro/build/llvm/tools/lld/test/ELF/Output/mips-npic-call-pic.s.tmp.exe: file format ELF32-mips Disassembly of section .text: __start: 20000: 0c 00 80 0e jal 131128 <foo1b+0x4> 20004: 00 00 00 00 nop 20008: 0c 00 80 19 jal 131172 <foo2+0x4> 2000c: 00 00 00 00 nop 20010: 0c 00 80 12 jal 131144 <foo1b+0x14> 20014: 00 00 00 00 nop 20018: 0c 00 80 19 jal 131172 <foo2+0x4> 2001c: 00 00 00 00 nop 20020: 0c 00 80 28 jal 131232 <fnpic+0x10> 20024: 00 00 00 00 nop 20028: 0c 00 80 24 jal 131216 <fnpic> 2002c: 00 00 00 00 nop foo1a: 20030: 00 00 00 00 nop foo1b: 20034: 00 00 00 00 nop 20038: 3c 19 00 02 lui $25, 2 2003c: 08 00 80 0c j 131120 <foo1a> 20040: 27 39 00 30 addiu $25, $25, 48 20044: 00 00 00 00 nop 20048: 3c 19 00 02 lui $25, 2 2004c: 08 00 80 0d j 131124 <foo1b> 20050: 27 39 00 34 addiu $25, $25, 52 20054: 00 00 00 00 nop 20058: 00 00 00 00 nop 2005c: 00 00 00 00 nop foo2: 20060: 00 00 00 00 nop 20064: 3c 19 00 02 lui $25, 2 20068: 08 00 80 18 j 131168 <foo2> 2006c: 27 39 00 60 addiu $25, $25, 96 20070: 00 00 00 00 nop 20074: 00 00 00 00 nop 20078: 00 00 00 00 nop 2007c: 00 00 00 00 nop fpic: 20080: 00 00 00 00 nop 20084: 00 00 00 00 nop 20088: 00 00 00 00 nop 2008c: 00 00 00 00 nop fnpic: 20090: 00 00 00 00 nop 20094: 00 00 00 00 nop 20098: 00 00 00 00 nop 2009c: 00 00 00 00 nop 200a0: 3c 19 00 02 lui $25, 2 200a4: 08 00 80 20 j 131200 <fpic> 200a8: 27 39 00 80 addiu $25, $25, 128 200ac: 00 00 00 00 nop -------------- next part -------------- diff --git a/test/ELF/mips-npic-call-pic.s b/test/ELF/mips-npic-call-pic.s index 755963c..31b7cd5 100644 --- a/test/ELF/mips-npic-call-pic.s +++ b/test/ELF/mips-npic-call-pic.s @@ -4,7 +4,7 @@ # RUN: llvm-mc -filetype=obj -triple=mips-unknown-linux \ # RUN: %p/Inputs/mips-pic.s -o %t-pic.o # RUN: llvm-mc -filetype=obj -triple=mips-unknown-linux %s -o %t-npic.o -# RUN: ld.lld %t-npic.o %t-pic.o %p/Inputs/mips-sto-pic.o -o %t.exe +# RUN: ld.lld %p/Inputs/mips-sto-pic.o %t-pic.o %t-npic.o -o %t.exe # RUN: llvm-objdump -d %t.exe | FileCheck %s # REQUIRES: mips
Rui Ueyama via llvm-dev
2016-Jun-22 02:29 UTC
[llvm-dev] [LLD] thunk implementation correctness depends on order of input section.
The first think I'd want to say is to not try too hard to use the existing "framework" in LLD. The current thunk mechanism was made just for MIPS non-PIC/PIC function calls and are not proven to be generic or useful for other purposes. We are cool with that because it just works for MIPS (except the bug you found) and satisfies our needs. And the amount of code for MIPS thunk is so small that we are not serious about reusing it. Probably the name given to it ("thunk") is too generic -- maybe we should rename it MipsLA25Thunk or something like that. I think you want to create a new type of thunk for ARM. The bug that we sometimes create broken MIPS thunks seems to have introduced in r265673 which Rafael made. Before that patch, we didn't assume that section VAs are available in scanRelocs. I think we want to revert that change (although it cannot simply be reverted because the patch was submitted in April, and many changes has been made on it since then.) Rafael, can you take at that change? On Tue, Jun 21, 2016 at 9:38 PM, Peter Smith <peter.smith at linaro.org> wrote:> I've been working on supporting ARM/Thumb interworking thunks in LLD > and have encountered a limitation that I think it is worth bringing up > in a wider context. This is all LLD specific, apologies if I've abused > llvm-dev here. > > TL;DR summary: > - Thunks in lld may not work if they are added to InputSections that > have already been scanned. > - There is a short term fix, but in the longer term I think that we > will want to allow multiple passes of the relocations. > - I'd like to know if there are any preferences on a solution. > > The current thunk implementation scans for and adds Thunks to > InputSections within scanRelocations(). At present a Thunk for a > RegularSymbol is always added to the InputSection that defines the > RegularSymbol. I think that this can cause a problem when the > InputSection with the relocation that needs to be indirected via the > thunk is processed after the InputSection that the thunk is added to. > > For reference the important code snippet in Writer.cpp is: > > // Scan relocations. This must be done after every symbol is declared so > that > // we can correctly decide if a dynamic relocation is needed. > for (OutputSectionBase<ELFT> *Sec : OutputSections) { > Sec->forEachInputSection([&](InputSectionBase<ELFT> *S) { > if (auto *IS = dyn_cast<InputSection<ELFT>>(S)) { > // Set OutSecOff so that scanRelocations can use it. > uintX_t Off = alignTo(Sec->getSize(), S->Alignment); > IS->OutSecOff = Off; > > scanRelocations(*IS); > > // Now that scan relocs possibly changed the size, update the > offset. > Sec->setSize(Off + S->getSize()); > } else if (auto *EH = dyn_cast<EhInputSection<ELFT>>(S)) { > if (EH->RelocSection) > scanRelocations(*EH, *EH->RelocSection); > } > }); > } > > Adding a thunk to an InputSection that has already been processed will > mean that its size will not be updated by Sec->setSize(Off + > S->getSize()); this leads to corrupt or missing thunks being generated > as they are written to invalid buffer location. > > To reproduce the problem I've taken the existing > test/ELF/mips-npic-call-pic.s and reversed the order of the input > objects so that the relocations needing thunks are processed last. > I've attached the patch I've made to the test case and the ouputs of > llvm-objdump for convenience. In the mips-thunk-correct.txt output all > thunks are present, and of the correct size. In the > mips-thunk-corrupt.txt not all thunks are present and some appear to > be corrupt. > > Given that scanRelocations depends on IS->OutSecOff it isn't as simple > as just finding the InputSection and correcting its size as this will > affect the value of IS->OutSecOff of all sections that follow it, > potentially invalidating any decision made on the basis of > IS->OutSecOff. > > The simplest short-term fix is to add the thunks to the InputSection > that contains the relocation needing the Thunk (The current IS). In > the longer term, with range-extension thunks in mind, I think it would > be wise to allow multiple passes through the relocations. For example > one possibility for the example above is that adding a Thunk to a > section already processed triggers another pass of the loop, updating > anything created that relies upon IS->OutSecOff. > > My current thought is that to get a simple implementation of ARM/Thumb > interworking thunks working, including thunks to PLT entries [*], I > will need to add support for a Thunk to be added to a different > InputSection than the target SymbolBody anyway. So I can add all > Thunks to the current InputSection fix as part of the ARM/Thumb > interworking. Support for range extension thunks are a way down the > priority list at the moment so I don't propose to make any radical > changes to scanRelocations at this point, however I'm willing to do so > if others would prefer? > > [*] The interworking can be inlined into the ARM PLT entries, at the > cost of requiring multiple entry points for every ARM PLT entry. My > preference is to implement support for thunks to PLT entries. > > Kind regards > > Peter >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160622/83c91ede/attachment.html>
Simon Atanasyan via llvm-dev
2016-Jun-22 05:40 UTC
[llvm-dev] [LLD] thunk implementation correctness depends on order of input section.
First of all thanks for finding the bug. I agree with Rui that right now we can manage without general thunk infrastructure. Let's provide at least a few "thunk" implementation and after that reconsider necessity of common thunk framework. As to MIPS there is one more type of thunk (keyword is .MIPS.stubs) and one more optimization of current thunk (putting a thunk in the beginning of the section if the section contains the only function required the thunk). But I do not plan to implement these features in the near future. On Wed, Jun 22, 2016 at 5:29 AM, Rui Ueyama <ruiu at google.com> wrote:> The first think I'd want to say is to not try too hard to use the existing > "framework" in LLD. The current thunk mechanism was made just for MIPS > non-PIC/PIC function calls and are not proven to be generic or useful for > other purposes. We are cool with that because it just works for MIPS (except > the bug you found) and satisfies our needs. And the amount of code for MIPS > thunk is so small that we are not serious about reusing it. Probably the > name given to it ("thunk") is too generic -- maybe we should rename it > MipsLA25Thunk or something like that. I think you want to create a new type > of thunk for ARM. > > The bug that we sometimes create broken MIPS thunks seems to have introduced > in r265673 which Rafael made. Before that patch, we didn't assume that > section VAs are available in scanRelocs. I think we want to revert that > change (although it cannot simply be reverted because the patch was > submitted in April, and many changes has been made on it since then.) > > Rafael, can you take at that change? > > On Tue, Jun 21, 2016 at 9:38 PM, Peter Smith <peter.smith at linaro.org> wrote: >> >> I've been working on supporting ARM/Thumb interworking thunks in LLD >> and have encountered a limitation that I think it is worth bringing up >> in a wider context. This is all LLD specific, apologies if I've abused >> llvm-dev here. >> >> TL;DR summary: >> - Thunks in lld may not work if they are added to InputSections that >> have already been scanned. >> - There is a short term fix, but in the longer term I think that we >> will want to allow multiple passes of the relocations. >> - I'd like to know if there are any preferences on a solution.-- Simon Atanasyan