Hi,

On Wed, Jul 12, 2017 at 2:21 AM, Rui Ueyama via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Thanks, Bruce. This is a very interesting optimization.
>
> lld doesn't currently have code to support that kind of code shrinking
> optimization, but we can definitely add it. It seems that essentially we
> need to iterate over all relocations while rewriting instructions until
> convergence is obtained. But the point is how to do it efficiently --
> link speed really matters. I can't come up with an algorithm to parallelize
> it. Do you have any idea?
>
> In order to shrink instructions, all address references must be explicitly
> represented as relocations even if they are in the same section. I think
> that means object files for RISC-V have many more relocations than the other
> architectures. Is this correct?

Indeed. RISC-V needs to emit relocations for PC-relative offsets; otherwise those offsets will become incorrect after relaxation.

On Wed, Jul 12, 2017 at 2:27 AM, Rui Ueyama via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> By the way, since this is an optional code relaxation, we can think about it
> later. The first thing I would do is to add RISC-V support to lld without
> code shrinking relaxations, which I believe is doable in at most a few
> hundred lines of code.

Yes, we have a working target for RISC-V in lld now (with relaxation), and it is passing our internal tests. Our iterated relaxation currently makes a copy of each input section, loops over each of them to process relaxation, and then adjusts symbol addresses and relocation entries accordingly, just before they are written out. This works but isn't optimal. Since we intend to contribute this target back upstream later on, we'd like to discuss how this should be properly handled.

Note that RISC-V also handles alignment as part of relaxation, so it isn't really optional.
For example:

    _start:
        mv a0, a0
        .p2align 2
        li a0, 0

The assembler inserts 3 bytes of padding (note: this behavior isn't merged yet, see: https://github.com/riscv/riscv-binutils-gdb/pull/88):

    00000000 <_start>:
       0:   852a        mv  a0,a0
       2:   00 01 00    # R_RISCV_ALIGN
                2: R_RISCV_ALIGN  *ABS*+0x3
       5:   4501        li  a0,0

The linker then removes 1 byte from the padding to align to the desired width:

    00010054 <_start>:
       10054:   852a    mv  a0,a0
       10056:   0001    nop
       10058:   4501    li  a0,0

This essentially shrinks the code, and it must be performed because RISC-V instructions must be 2-byte aligned. Therefore lld must be able to accommodate changes to the content of an input section.

Chih-Mao Chen (PkmX)
Software R&D, Andes Technology
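The trimming step in the example above can be sketched in a few lines of C. This is only an illustration, not lld or GNU ld code: the function names `padding_to_keep` and `padding_to_remove` are hypothetical, and the sketch assumes the linker already knows the final address at which the padding starts, the requested power-of-two alignment, and how many padding bytes the assembler emitted.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of padding needed so that the address right after the padding is
   a multiple of `alignment` (assumed to be a power of two). */
uint64_t padding_to_keep(uint64_t pad_addr, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (0 - pad_addr) & (alignment - 1);
}

/* Bytes the linker can delete from `emitted` bytes of padding that start
   at final address `pad_addr`. */
uint64_t padding_to_remove(uint64_t pad_addr, uint64_t alignment,
                           uint64_t emitted) {
  uint64_t keep = padding_to_keep(pad_addr, alignment);
  assert(keep <= emitted); /* the assembler must have left enough slack */
  return emitted - keep;
}
```

With the values from the example (padding starting at 0x10056, 4-byte alignment, 3 bytes emitted), `padding_to_keep` gives 2 and `padding_to_remove` gives 1, which matches the single 2-byte nop left at 10056.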
On Wed, Jul 12, 2017 at 10:26 AM, PkmX via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Note that RISC-V also handles alignment as part of relaxation, so it
> isn't really optional. For example:
>
>     _start:
>         mv a0, a0
>         .p2align 2
>         li a0, 0
>
> The assembler inserts a 3-byte padding (note: this behavior isn't
> merged yet, see: https://github.com/riscv/riscv-binutils-gdb/pull/88):
>
>     00000000 <_start>:
>        0:   852a        mv  a0,a0
>        2:   00 01 00    # R_RISCV_ALIGN
>                 2: R_RISCV_ALIGN  *ABS*+0x3
>        5:   4501        li  a0,0
>
> The linker then removes 1 byte from padding to align to the desired width:

This seems ... pessimised :-)

At least output 4 bytes, not 3, to keep it legal before the linker changes sizes.

The assembler can easily keep track of the current alignment as it generates the code, and create an object file that is correct assuming no size changes by the linker.

If there is, say, a .byte literal of length one, then the ".p2align 2" should perhaps emit 7 bytes? Then it is legal as-is if the linker doesn't change the size of anything, or the linker can delete as much as possible if it is adjusting sizes.
On Wed, Jul 12, 2017 at 4:10 AM, Bruce Hoult <bruce at hoult.org> wrote:
> On Wed, Jul 12, 2017 at 10:26 AM, PkmX via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> Note that RISC-V also handles alignment as part of relaxation, so it
>> isn't really optional. For example:
>>
>>     _start:
>>         mv a0, a0
>>         .p2align 2
>>         li a0, 0
>>
>> The assembler inserts a 3-byte padding (note: this behavior isn't
>> merged yet, see: https://github.com/riscv/riscv-binutils-gdb/pull/88):
>>
>>     00000000 <_start>:
>>        0:   852a        mv  a0,a0
>>        2:   00 01 00    # R_RISCV_ALIGN
>>                 2: R_RISCV_ALIGN  *ABS*+0x3
>>        5:   4501        li  a0,0
>>
>> The linker then removes 1 byte from padding to align to the desired width:
>
> This seems ... pessimised :-)
>
> At least output 4 bytes not 3! To keep it legal before the linker changes
> sizes.
>
> The assembler can easily keep track of the current alignment as it
> generates the code, and create an object file that is correct assuming no
> size changes by the linker.

Yeah, I strongly agree with that. Code shrinking relaxation is a good optional optimization, but mandating it doesn't sound like a good idea.

> If there is, say, a .byte literal of length one then the ".p2align 2"
> should perhaps emit 7 bytes? Then it is legal as-is if the linker doesn't
> change the size of anything, or the linker can delete as much as possible
> if it is adjusting sizes.
Rafael Avila de Espindola via llvm-dev
2017-Jul-12 18:07 UTC
[llvm-dev] [LLD] Linker Relaxation
PkmX via llvm-dev <llvm-dev at lists.llvm.org> writes:
> Note that RISC-V also handles alignment as part of relaxation, so it
> isn't really optional. For example:
>
>     _start:
>         mv a0, a0
>         .p2align 2
>         li a0, 0
>
> The assembler inserts a 3-byte padding (note: this behavior isn't
> merged yet, see: https://github.com/riscv/riscv-binutils-gdb/pull/88):

Why 3 bytes? The assembler knows the section alignment.

I can see why another relaxation would require alignments to be revisited, but it should be possible to link without any relaxations, no?

Cheers,
Rafael
As a concrete suggestion, here is code for deciding how many bytes of padding to emit.

For the case of .align 2 (4-byte alignment), it will emit a minimum of 3 and a maximum of 6 bytes of padding. For .align 4 it emits a minimum of 15 and a maximum of 30 bytes of padding.

    #include <stdint.h>

    /*
      RISC-V align. Goals:
      1) align to the desired amount in the generated object file.
      2) provide enough extra padding that alignment can be maintained
         in the face of code size changes, by only making the padding
         smaller, never bigger.
    */

    void emit_padding(uintptr_t sz);

    uintptr_t align(uintptr_t oldPos, int logAlignment){
      uintptr_t
        alignment = (uintptr_t)1 << logAlignment,
        mask = alignment - 1;
      /* round up to the next multiple of alignment */
      uintptr_t newPos = (oldPos + mask) & ~mask;
      /* fewer than mask bytes of padding: add one more full unit of slack */
      if ((newPos - oldPos) < mask) newPos += alignment;
      emit_padding(newPos - oldPos);
      return newPos;
    }

On Wed, Jul 12, 2017 at 9:07 PM, Rafael Avila de Espindola via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> PkmX via llvm-dev <llvm-dev at lists.llvm.org> writes:
> > Note that RISC-V also handles alignment as part of relaxation, so it
> > isn't really optional. For example:
> >
> >     _start:
> >         mv a0, a0
> >         .p2align 2
> >         li a0, 0
> >
> > The assembler inserts a 3-byte padding (note: this behavior isn't
> > merged yet, see: https://github.com/riscv/riscv-binutils-gdb/pull/88):
>
> Why 3 bytes? The assembler knows the section alignment.
>
> I can see why another relaxation would require alignments to be
> revisited, but it should be possible to link without any relaxations,
> no?
>
> Cheers,
> Rafael
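The bounds quoted for .align 2 and .align 4 can be checked by brute force. The sketch below is a restatement of the routine in this message, not the author's code: it takes the rounding step to be `(oldPos + mask) & ~mask`, drops the `emit_padding` call, and uses hypothetical helper names (`padding_for`, `min_padding`, `max_padding`).

```c
#include <assert.h>
#include <stdint.h>

/* Padding the scheme would emit at position oldPos for a 2^logAlignment
   boundary: round up, then add one extra alignment unit unless the
   padding is already mask bytes long. */
uintptr_t padding_for(uintptr_t oldPos, int logAlignment) {
  assert(logAlignment >= 0 && logAlignment < 16);
  uintptr_t alignment = (uintptr_t)1 << logAlignment;
  uintptr_t mask = alignment - 1;
  uintptr_t newPos = (oldPos + mask) & ~mask;
  if ((newPos - oldPos) < mask) newPos += alignment;
  return newPos - oldPos;
}

/* Smallest padding over every starting offset within one alignment unit. */
uintptr_t min_padding(int logAlignment) {
  uintptr_t alignment = (uintptr_t)1 << logAlignment;
  uintptr_t best = padding_for(0, logAlignment);
  for (uintptr_t pos = 1; pos < alignment; pos++) {
    uintptr_t p = padding_for(pos, logAlignment);
    if (p < best) best = p;
  }
  return best;
}

/* Largest padding over every starting offset within one alignment unit. */
uintptr_t max_padding(int logAlignment) {
  uintptr_t alignment = (uintptr_t)1 << logAlignment;
  uintptr_t best = padding_for(0, logAlignment);
  for (uintptr_t pos = 1; pos < alignment; pos++) {
    uintptr_t p = padding_for(pos, logAlignment);
    if (p > best) best = p;
  }
  return best;
}
```

Under that reading, `min_padding(2)` and `max_padding(2)` come out as 3 and 6, and `min_padding(4)` and `max_padding(4)` as 15 and 30, which agrees with the figures stated above.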
On Jul 13, 2017 02:07, "Rafael Avila de Espindola" <rafael.espindola at gmail.com> wrote:
> PkmX via llvm-dev <llvm-dev at lists.llvm.org> writes:
>> Note that RISC-V also handles alignment as part of relaxation, so it
>> isn't really optional. For example:
>>
>>     _start:
>>         mv a0, a0
>>         .p2align 2
>>         li a0, 0
>>
>> The assembler inserts a 3-byte padding (note: this behavior isn't
>> merged yet, see: https://github.com/riscv/riscv-binutils-gdb/pull/88):
>
> Why 3 bytes? The assembler knows the section alignment.
>
> I can see why another relaxation would require alignments to be
> revisited, but it should be possible to link without any relaxations,
> no?
>
> Cheers,
> Rafael

I agree the 3-byte padding is kinda strange. RISC-V insns are always multiples of 2 bytes and must be 2-byte aligned, so a 2-byte padding should suffice for the linker to re-align after relaxation.

Indeed, the assembler should be able to be configured to not emit any relaxation info and directly emit the correct number of nops (w/o R_RISCV_ALIGN), so that the object file can be linked without relaxation, but I don't think GAS can do that right now. (There is `.option norelax`, but currently it just emits zeros in the padding.)

So the issue remains that GAS assumes linkers will always handle alignment (this is true for GNU ld even if --no-relax is given), so even for a basic RISC-V port lld needs to be extended to support relaxation.

On Thu, Jul 13, 2017 at 6:07 AM, Bruce Hoult <bruce at hoult.org> wrote:
> As a concrete suggestion, here is code for deciding how many bytes of
> padding to emit.
>
> For the case of .align 2 (4 byte alignment), it will emit a minimum of 3 and
> a maximum of 6 bytes of padding. For .align 4 it emits a minimum of 15 and a
> maximum of 30 bytes of padding.

Why does .align 2 need 3 to 6 bytes of padding though? I think for `.p2align n` only 2^n - 2 (or 2^n - 4 without RVC) padding bytes would be needed.
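The 2^n - 2 figure can be illustrated with a short enumeration. This is only a sketch under one assumption: the position before the padding is itself instruction-aligned (a multiple of 2 with RVC, 4 without), so it does not cover the .byte case raised earlier in the thread. The names `needed_padding` and `worst_case_padding` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Padding needed to bring pos up to a multiple of 2^n. */
uint64_t needed_padding(uint64_t pos, int n) {
  assert(n >= 1 && n < 32);
  uint64_t alignment = (uint64_t)1 << n;
  return (0 - pos) & (alignment - 1);
}

/* Worst-case padding over all positions that are multiples of insn_align
   (2 with RVC, 4 without). */
uint64_t worst_case_padding(int n, uint64_t insn_align) {
  assert(insn_align == 2 || insn_align == 4);
  uint64_t alignment = (uint64_t)1 << n;
  uint64_t worst = 0;
  for (uint64_t pos = 0; pos < alignment; pos += insn_align) {
    uint64_t p = needed_padding(pos, n);
    if (p > worst) worst = p;
  }
  return worst;
}
```

For `.p2align 2` this gives a worst case of 2 bytes with RVC and 0 without; for `.p2align 4` it gives 14 and 12. Those are 2^n - 2 and 2^n - 4 respectively.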