Jonas Devlieghere via llvm-dev
2017-Jul-26 20:56 UTC
[llvm-dev] armv7 pc-rel bx thumb instruction
Hi Tim,

Thank you for clarifying what the error actually means! I did read something about the BLX instruction, but since I'm compiling strictly for Thumb, it didn't make much sense to me. Passing -mdisable-tail-calls as a cc1 option did indeed allow me to link the generated binary.

After looking some more at the ld64 source code, I came across the following comment:

  // The tail-call optimization may result in a function ending in a jump (b)
  // to another functions. At compile time the compiler does not know
  // if the target of the jump will be in the same mode (arm vs thumb).
  // The arm/thumb instruction set has a way to change modes in a bl(x)
  // instruction, but no instruction to change mode in a jump (b) instruction.
  // In those rare cases, the linker needs to insert a shim of code to
  // make the mode switch.

So it seems that a branch island is glue code added by the linker to do the actual mode switch if necessary. But why would we need a mode switch for a jump to a function that is also in Thumb mode? And why is the branch island ARM code and not Thumb?

Would you mind helping me understand how these branch islands work? I'd love to comprehend what's actually going on here.

Thanks again for your help!

Jonas

On Wed, Jul 26, 2017 at 9:25 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> Hi Jonas,
>
> On 26 July 2017 at 10:52, Jonas Devlieghere via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > I'm working on some custom transformation passes that have the
> > side-effect of significantly increasing the code size. While testing
> > it on some larger, real-world code bases, I run into a linker error
> > for armv7 thumb code. The particular error I get from ld64 is that
> > "armv7 has no pc-rel bx thumb instruction." I've been able to
> > reproduce the problem by taking a random thumbv7 bitcode file and
> > cloning functions until the linker fails.
>
> Interesting. It looks like you've got a tail call from Thumb code to
> ARM code. The linker would normally turn a BL into a BLX to make this
> work, but it's (rightly) reporting that there's no "BX some_func"
> instruction (you have to load the destination into a register and jump
> there).
>
> If you have control over both functions you probably just want to
> compile the destination in Thumb mode (there's hardly ever reason to
> use ARM mode these days). But given your circumstances there's a
> pretty good chance the ARM code is actually a branch island ld64 is
> trying to insert.
>
> Other than that Clang has a "-fno-optimize-sibling-calls" which should
> disable tail calls and make things work. I'd suggest reporting a bug
> against ld64 too, it should be able to handle this case really.
>
> Cheers.
>
> Tim.
Tim Northover via llvm-dev
2017-Jul-26 21:36 UTC
[llvm-dev] armv7 pc-rel bx thumb instruction
Hi Jonas,

On 26 July 2017 at 13:56, Jonas Devlieghere via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> So it seems that a branch island is glue code added by the linker to do the
> actual mode switch if necessary. But why would we need a mode switch for a
> jump to a function that is also in thumb mode?

We wouldn't, unless the shim is in ARM mode; the shim is what the code actually has to jump to. But that's just speculation: I haven't read the ld64 code nearly enough to pinpoint the error there.

> And why is the branch island arm code and not thumb?

If that really is the issue, it'll just be an oversight.

> Would you mind helping me understand how these branch islands work?

The basic idea is that if a call destination is too far away for the instruction to make it there in one step, the linker inserts a code sequence roughly like this:

    ldr ip, Laddr
    bx ip
  Laddr:
    .word real_function_dest

that is in range, and converts the original call to jump there instead. This allows the jump to reach anywhere in the 32-bit address space, since the pointer at Laddr can be anything it wants.

There are bells and whistles for PIC code, and obviously linker internal details get involved, but for those you're probably better off just looking at the code.

Tim.
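
For readers following along, here is a rough sketch of the "too far away" check and of the word stored at Laddr. It is not taken from ld64: the function and constant names are hypothetical, and it assumes the 32-bit Thumb-2 B.W/BL encoding (signed 25-bit, halfword-aligned offset, about +/-16 MB) and the ARM B/BL encoding (signed 26-bit, word-aligned offset, about +/-32 MB). It also assumes the usual interworking rule that BX enters Thumb state when bit 0 of the register is set.

```python
# Illustrative sketch only; names are hypothetical and this is not ld64's logic.

# Reachable pc-relative offsets for a direct branch, as (lowest, highest).
THUMB2_BRANCH_RANGE = (-(1 << 24), (1 << 24) - 2)  # B.W / BL, 32-bit Thumb-2 encoding
ARM_BRANCH_RANGE = (-(1 << 25), (1 << 25) - 4)     # B / BL, ARM encoding


def needs_branch_island(branch_addr, target_addr, thumb=True):
    """Return True if a direct branch at branch_addr cannot reach target_addr."""
    # The PC value seen by the branch is 4 bytes ahead in Thumb, 8 in ARM.
    pc = branch_addr + (4 if thumb else 8)
    lowest, highest = THUMB2_BRANCH_RANGE if thumb else ARM_BRANCH_RANGE
    offset = target_addr - pc
    return not (lowest <= offset <= highest)


def island_literal(target_addr, target_is_thumb):
    """Value for the .word at Laddr: bit 0 set makes 'bx ip' enter Thumb state."""
    return target_addr | 1 if target_is_thumb else target_addr & ~1


if __name__ == "__main__":
    print(needs_branch_island(0x1000, 0x2000))        # nearby target
    print(needs_branch_island(0x1000, 0x2000000))     # roughly 32 MB away
    print(hex(island_literal(0x4000, target_is_thumb=True)))
```

Because the island reaches its destination through a register, the same sequence works for any 32-bit address and for either instruction set, which is why the mode of the island itself matters to the branch that jumps into it.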
Jonas Devlieghere via llvm-dev
2017-Aug-01 07:35 UTC
[llvm-dev] armv7 pc-rel bx thumb instruction
Thanks a lot for the explanation!

I've done some more testing, and while -mdisable-tail-calls does solve the problem for some samples, there are others where the error remains. Any chance you or anyone else has another idea of what might cause this?

Some samples show a different error, "unknown ARM scattered relocation type 11", which also seems to be related to jump islands (being out of range?).

Thank you,
Jonas

On Wed, Jul 26, 2017 at 11:36 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> Hi Jonas,
>
> On 26 July 2017 at 13:56, Jonas Devlieghere via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> So it seems that a branch island is glue code added by the linker to do the
>> actual mode switch if necessary. But why would we need a mode switch for a
>> jump to a function that is also in thumb mode?
>
> We wouldn't unless shim is in ARM mode; that's what the code actually
> has to jump to. But it's just speculation, I haven't read the ld64
> code nearly enough to pinpoint the error there.
>
>> And why is the branch island arm code and not thumb?
>
> If that really is the issue, it'll just be an oversight.
>
>> Would you mind helping me understand how these branch islands work?
>
> The basic idea is that if a call destination is too far away for the
> instruction to make it there in one step the linker inserts a code
> sequence roughly like this:
>
>     ldr ip, Laddr
>     bx ip
>   Laddr:
>     .word real_function_dest
>
> that is in range and converts the original call to jump there instead.
> This allows the jump to reach anywhere in the 32-bit address since the
> pointer at Laddr can be anything it wants.
>
> There are bells and whistles for PIC code, and obviously linker
> internal details get involved, but for those you're probably better
> off just looking at the code.
>
> Tim.