thr3ads.net - llvm dev - [llvm-dev] [LLD] Writing thunks before the corresponding section [Sep 2016]

Simon Atanasyan via llvm-dev

2016-Sep-07 13:58 UTC

[llvm-dev] [LLD] Writing thunks before the corresponding section

Hi,

A MIPS LA25 thunk is used to call a PIC function from non-PIC code.
It usually contains three instructions:

lui   $25, %hi(func)
addiu $25, $25, %lo(func)
j     func

We can write such a thunk at an arbitrary place in the generated file.
But if the PIC function that requires the thunk is the first routine in
a section, we can optimize the code and omit the jump instruction. To do
so, we just write the following thunk right before the PIC routine:

lui   $25, %hi(func)
addiu $25, $25, %lo(func)
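
As a rough illustration of the two thunk forms above, the following Python sketch encodes them as 32-bit MIPS instruction words. The function name `la25_thunk` and the `fall_through` flag are made up for this example and are not LLD code; the branch delay slot that follows `j` on real hardware is not modeled.

```python
def la25_thunk(func_addr, fall_through=False):
    """Return the MIPS LA25 thunk for func_addr as a list of 32-bit words.

    Hypothetical helper for illustration only, not LLD's implementation.
    """
    # %hi is rounded so that (hi << 16) + sign_extend(lo) == func_addr.
    hi = ((func_addr + 0x8000) >> 16) & 0xFFFF
    lo = func_addr & 0xFFFF
    words = [
        0x3C190000 | hi,  # lui   $25, %hi(func)
        0x27390000 | lo,  # addiu $25, $25, %lo(func)
    ]
    if not fall_through:
        # j func: 26-bit word index within the current 256 MiB region.
        words.append(0x08000000 | ((func_addr >> 2) & 0x03FFFFFF))
    return words


for word in la25_thunk(0x00418ABC):
    print(f"{word:08x}")
```

Passing `fall_through=True` yields only the first two words, which corresponds to the optimized form written right before the PIC routine.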

In fact, the GNU bfd/gold linkers write all MIPS LA25 thunks required for
section "A" into a separate input section "S" and put section "S"
before "A". The last thunk in section "S" might have an optimized
two-instruction form.
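
The bfd/gold layout described above could be modeled roughly as follows. This is only a sketch: `layout_thunk_section`, the byte sizes, and the use of offsets within "A" to identify the target functions are all invented for illustration.

```python
FULL_THUNK_SIZE = 12   # lui + addiu + j (delay slot not modeled)
SHORT_THUNK_SIZE = 8   # lui + addiu, falls through into the function


def layout_thunk_section(target_offsets):
    """Order the thunks of section "S", which is placed right before "A".

    target_offsets are offsets within "A" of the PIC functions that need
    a thunk. Returns a list of (target_offset, thunk_offset_in_S, size).
    Hypothetical helper, not taken from any linker.
    """
    # Put the thunk for the function at offset 0 last so it can fall through.
    ordered = sorted(set(target_offsets), reverse=True)
    layout = []
    pos = 0
    for target in ordered:
        size = SHORT_THUNK_SIZE if target == 0 else FULL_THUNK_SIZE
        layout.append((target, pos, size))
        pos += size
    return layout


for entry in layout_thunk_section([0x40, 0x0, 0x100]):
    print(entry)
```

The short form is only valid if "A" starts immediately after "S", with no padding in between.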

I would like to implement this optimization in LLD. My question is
about ARM thunks: is it okay to write them before the corresponding
input section, rather than after it as LLD does now?

-- 
Simon Atanasyan

Peter Smith via llvm-dev

2016-Sep-07 16:55 UTC

Hello Simon,

Yes, it is okay to write ARM thunks before an InputSection. There is a
similar "inline state change" thunk in ARM that does BX PC, NOP to
change state and fall through. The ARM Thunks that are implemented now
just need to be in range of the source branch. I have previously
worked on an ARM Linker that has thunks in separate sections in the
same way that you describe for bfd/gold.

I can't tell if you are planning to implement Thunks as separate
InputSections, or to keep assigning them to existing InputSections as
they are now but write them at the front rather than at the end.

If you are considering writing the thunks as data prior to the
InputSection contents, I think you'll need some extra bookkeeping:
- Padding might be needed between the last thunk and the InputSection
contents if the alignment of the InputSection is higher than the usual
2 or 4.
- If the Thunk is conceptually part of the InputSection (starts at
offset 0) then all the relocations and symbols will need displacing.
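
The two bookkeeping points above can be sketched as follows, assuming the thunks are written as a prefix of the InputSection. The function names and the symbol dictionary are hypothetical and do not correspond to LLD data structures.

```python
def align_to(value, align):
    """Round value up to the next multiple of align (a power of two)."""
    return (value + align - 1) & ~(align - 1)


def prepend_thunks(thunk_bytes, section_align, symbols):
    """Return (padding, displaced_symbols) for thunks placed before contents.

    thunk_bytes   total size of the thunks written first
    section_align alignment required by the original section contents
    symbols       {name: offset} relative to the original contents
    """
    start = align_to(thunk_bytes, section_align)
    padding = start - thunk_bytes
    # Every symbol (and, likewise, every relocation offset) moves by `start`.
    displaced = {name: offset + start for name, offset in symbols.items()}
    return padding, displaced


print(prepend_thunks(12, 16, {"func": 0, "helper": 0x20}))
```

With 12 bytes of thunks in front of a 16-byte-aligned section, 4 bytes of padding are needed and every symbol moves by 16.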

It is worth mentioning that disassembly of ARM and Thumb Thunks may
look a bit strange if they are moved from after the InputSection. This
is because they lack a mapping symbol ($a or $t) that tells the
disassembler which instruction set to decode. Adding mapping symbols
for linker-generated InputSections is on my list of things to do.
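
To illustrate why missing mapping symbols confuse disassembly, here is a small sketch of how a disassembler might pick the instruction set for an address. The `state_at` helper and the sample addresses are assumptions for the example; real tools read the symbols from the ELF symbol table.

```python
import bisect

# ARM ELF mapping symbols: $a = ARM code, $t = Thumb code, $d = data.
STATE = {"$a": "ARM", "$t": "Thumb", "$d": "data"}


def state_at(mapping_symbols, addr):
    """Return the state for addr given (address, name) pairs sorted by address.

    The most recent mapping symbol at or before addr decides the state;
    without one, the disassembler has to guess (None here).
    """
    addrs = [a for a, _ in mapping_symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    # Names may carry a suffix such as "$t.1"; only the first two characters matter.
    return STATE.get(mapping_symbols[i][1][:2])


symbols = [(0x10054, "$a"), (0x10058, "$t"), (0x10068, "$d")]
print(state_at(symbols, 0x1005A))
```

In this model, a thunk with no mapping symbol of its own is decoded using whatever state the preceding symbol selected, which may not match the thunk's actual instruction set.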

Hope this helps

Peter

Bruce Hoult via llvm-dev

2016-Sep-07 20:50 UTC

On Wed, Sep 7, 2016 at 7:55 PM, Peter Smith via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Hello Simon,
>
> Yes it is okay to write ARM thunks before an InputSection. There is a
> similar "inline state change" thunk in ARM that does BX PC, NOP to
> change state and fall through.

Maybe it's a little bit evil, but I've found that SUB PC,PC,#3 works just
fine to change to Thumb state without any NOP needed on all
current-generation CPUs I've tried it on, in particular the Raspberry Pi 2
(Cortex A7), Pi 3 (Cortex A53) and Odroid XU4 (Cortex A15).

Unfortunately I never thought to try this ten years ago on the ARM7TDMI.

e.g. (assumes Linux EABI kernel)

.equ SYSCALL_EXIT, 1
.equ SYSCALL_WRITE, 4
.equ STDOUT, 1

.globl _start
.syntax unified
_start:
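sub pc,pc,#3            @ ARM state: PC reads as . + 8, so PC := (. + 4) | 1
.thumb                  @ everything from here on is Thumb code
movs r0,#STDOUT         @ file descriptor
adr r1,hello            @ buffer address
movs r2,#11             @ length of "Hello asm!\n"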
movs r7,#SYSCALL_WRITE
swi 0
movs r7,#SYSCALL_EXIT
swi 0

.align 2
hello: .asciz "Hello asm!\n"


> It is worth mentioning that disassembly of ARM and Thumb Thunks may
> look a bit strange if they are moved from after the InputSection. This
> is because they lack a mapping symbol ($a or $t) that tells the
> disassembler what instruction set to disassemble. I've got adding
> mapping symbol for linker generated InputSections on my list of things
> to do.
>
This disassembles fine when built in the standard way so there's clearly no
fundamental problem with disassembling past inline thunks:

$ as asm_test.s -o asm_test.o
$ ld asm_test.o -o asm_test
$ ./asm_test
Hello asm!
$ objdump -d asm_test

asm_test:     file format elf32-littlearm

Disassembly of section .text:

00010054 <_start>:
   10054: e24ff003 sub pc, pc, #3
   10058: 2001       movs r0, #1
   1005a: a103       add r1, pc, #12 ; (adr r1, 10068 <hello>)
   1005c: 220b       movs r2, #11
   1005e: 2704       movs r7, #4
   10060: df00       svc 0
   10062: 2701       movs r7, #1
   10064: df00       svc 0
   10066: 46c0       nop ; (mov r8, r8)

00010068 <hello>:
   10068: 6c6c6548 .word 0x6c6c6548
   1006c: 7361206f .word 0x7361206f
   10070: 000a216d .word 0x000a216d

NB that first e24ff003 is an ARM instruction, *not* Thumb2.
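
The arithmetic behind the `sub pc, pc, #3` trick can be checked with a short sketch. It assumes ARMv7 behaviour, where a data-processing write to PC in ARM state interworks on bit 0 of the result; the helper name is made up for this example.

```python
def sub_pc_target(insn_addr, imm):
    """Model `sub pc, pc, #imm` executed in ARM state at insn_addr.

    Returns (state, target). Assumes ARMv7 interworking rules for
    ALU writes to PC; earlier architectures such as ARMv4T differ.
    """
    pc = insn_addr + 8          # in ARM state, PC reads as insn address + 8
    result = pc - imm
    if result & 1:
        return "Thumb", result & ~1
    if result & 2:
        # ARM state needs a word-aligned target.
        return "UNPREDICTABLE", None
    return "ARM", result


state, target = sub_pc_target(0x10054, 3)
print(state, hex(target))
```

For the instruction at 0x10054 in the listing above, the model selects Thumb state at 0x10058, which is the `movs r0, #1` that follows.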

Rui Ueyama via llvm-dev

2016-Sep-07 22:44 UTC

This seems to be a reasonable optimization, and I don't have any particular
concern about implementing it.


Bruce Hoult via llvm-dev

2016-Sep-08 11:42 UTC

On Wed, Sep 7, 2016 at 7:55 PM, Peter Smith via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Hello Simon,
>
> Yes it is okay to write ARM thunks before an InputSection. There is a
> similar "inline state change" thunk in ARM that does BX PC, NOP to
> change state and fall through.

Forgot to mention: BX PC won't do anything in ARM mode. Standard way is ADD
Rn,PC,#1;BX Rn (typically LR).

In Thumb mode BX PC will switch to ARM, but the BX instruction should be
4-byte aligned and the next 2 bytes are ignored .. doesn't matter whether
they are NOP or not.

The architecture manual says BX PC from the 2nd Thumb instruction in a 4
byte word is unpredictable. On some implementations it will work, resuming
at the ARM instruction in the very next bytes (address 4 bytes more than
the word the Thumb instruction was in). But it's hit and miss. The
following code works on Odroid XU4 (A15) and Raspberry Pi 2 (A7) but not on
Raspberry Pi 3 (A53 - bus error):

 00010054 <_start>:
   10054: e24ff003 sub pc, pc, #3
   10058: 2001       movs r0, #1
   1005a: a105       add r1, pc, #20 ; (adr r1, 10070 <hello>)
   1005c: 220b       movs r2, #11
   1005e: 4778       bx pc
   10060: e3b07004 movs r7, #4
   10064: ef000000 svc 0x00000000
   10068: e3b07001 movs r7, #1
   1006c: ef000000 svc 0x00000000

00010070 <hello>:
   10070: 6c6c6548 .word 0x6c6c6548
   10074: 7361206f .word 0x7361206f
   10078: 000a216d .word 0x000a216d
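
The alignment rule for a Thumb `bx pc` described above can be modeled with a short sketch. The helper name is hypothetical, and the model only reflects the architectural rule, not what any particular core does in the unpredictable case.

```python
def thumb_bx_pc(insn_addr):
    """Model a Thumb `bx pc` at insn_addr; returns (state, target).

    In Thumb state PC reads as the instruction address + 4. Bit 0 is
    clear, so the branch selects ARM state, which needs a word-aligned
    target; otherwise the result is architecturally UNPREDICTABLE.
    """
    pc = insn_addr + 4
    if pc & 2:
        return "UNPREDICTABLE", None
    return "ARM", pc


print(thumb_bx_pc(0x1005C))  # bx pc in the first halfword of a word
print(thumb_bx_pc(0x1005E))  # the placement used in the listing above
```

The `bx pc` at 0x1005e in the listing falls into the unpredictable case, which is consistent with it working on some cores and faulting on others.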

Simon Atanasyan via llvm-dev

2016-Nov-29 21:18 UTC

Hi,

Sorry for the delay in replying.

It looks like thunks can now be implemented as synthetic sections.
That gives us a flexible solution: we will be able to put thunks
before or after the related sections, use different alignments, etc.
As far as I know, the BFD linker uses the same approach, at least for
MIPS thunks. I will try to implement this idea.

-- 
Simon Atanasyan
