thr3ads.net - llvm dev - [llvm-dev] [LLD] Slow callstacks in gdb [Dec 2017]

Rafael Avila de Espindola via llvm-dev

2017-Dec-06 02:22 UTC

[llvm-dev] [LLD] Slow callstacks in gdb

Rui Ueyama <ruiu at google.com> writes:
> On Tue, Dec 5, 2017 at 1:22 PM, Rafael Avila de Espindola <
> rafael.espindola at gmail.com> wrote:
>
>> Martin Richtarsky <s at martinien.de> writes:
>>
>> > Output looks as follows [1] Seems sh_offset is missing?
>>
>> That is what readelf prints as Off
>>
>> >   [17] .rela.text        RELA            0000000000000000 071423 001728 18
>> >      1   4  8
>>
>> The offset of rela text should have been aligned, but it is not. Can you
>> report a bug on icc? As a workaround, using the gnu assembler, if
>> possible, should fix this.
>
>
> Yeah this is a violation of the spec and must be a bug in ICC. That being
> said, is there a practical benefit of checking the validity of the
> alignment, except finding buggy object files early? I mean, if an object
> file is in a static archive, all "aligned" data in the object file might
> not be aligned against the beginning of the archive file.

It will at least be aligned to two bytes.
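
The two byte guarantee can be worked through with a little arithmetic. The following is only a sketch, assuming the common Unix ar layout (an 8 byte global header, a 60 byte header per member, and member data padded to an even size); the helper names and the 1001 byte member size are hypothetical, chosen for illustration. It needs C++14 for the constexpr loop.

```cpp
#include <cstdint>

// Sizes from the common Unix ar format (assumed here, see the note above).
constexpr std::uint64_t kGlobalHeaderSize = 8;   // "!<arch>\n"
constexpr std::uint64_t kMemberHeaderSize = 60;  // one header per member

// Hypothetical helper: given the file offset of a member header and the size
// of that member's data, return the offset of the next member header.
// Member data is padded with one byte when it would end on an odd offset.
constexpr std::uint64_t nextMemberHeader(std::uint64_t headerOffset,
                                         std::uint64_t dataSize) {
  std::uint64_t end = headerOffset + kMemberHeaderSize + dataSize;
  return end + (end & 1);
}

// Hypothetical helper: largest power of two, capped at 8, dividing offset.
constexpr unsigned alignmentOf(std::uint64_t offset) {
  unsigned a = 1;
  while (a < 8 && offset % (a * 2) == 0)
    a *= 2;
  return a;
}

// The first member header sits right after the global header, at offset 8.
// A 1001 byte first member ends at 8 + 60 + 1001 = 1069, padded to 1070.
constexpr std::uint64_t kSecondHeader =
    nextMemberHeader(kGlobalHeaderSize, 1001);
constexpr std::uint64_t kSecondData = kSecondHeader + kMemberHeaderSize;

// A section at an 8 byte aligned sh_offset (0x40) inside the second object
// lands at archive offset 1130 + 64 = 1194, which is only 2 byte aligned.
static_assert(kSecondHeader == 1070, "padded to an even offset");
static_assert(alignmentOf(kSecondData + 0x40) == 2, "only 2 byte aligned");
```

So data that is 8 byte aligned relative to the start of the object file may be only 2 byte aligned relative to the start of the archive, but never less than that.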

On most current host architectures, handling
packed_endian_specific_integral is fairly efficient. For example, on
x86_64, reading 32 bits with 1, 2, and 4 byte alignment produces in all
cases:

  movl    (%rdi), %eax

But on armv6 the aligned case is

  ldr     r0, [r0]

the 2 byte aligned case is

  ldrh    r1, [r0, #2]
  ldrh    r0, [r0]
  orr     r0, r0, r1, lsl #16

and the unaligned case is

  ldrb    r1, [r0]
  ldrb    r2, [r0, #1]
  ldrb    r3, [r0, #2]
  ldrb    r0, [r0, #3]
  orr     r1, r1, r2, lsl #8
  orr     r0, r3, r0, lsl #8
  orr     r0, r1, r0, lsl #16

On armv7 it is a single ldr in all cases.
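
For anyone who wants to reproduce listings like the ones above, here is a minimal sketch of a read with a compile-time alignment promise. It is not LLVM's code: read32 and roundTrip are hypothetical names, the value is read in host byte order, and __builtin_assume_aligned is a GCC/Clang builtin rather than standard C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read a host-endian 32-bit value from p, promising the compiler that p is
// at least Align byte aligned. The compiler may then pick a wider load.
// Breaking the promise is undefined behavior, so debug builds check it.
template <std::size_t Align>
inline std::uint32_t read32(const void *p) {
  assert(reinterpret_cast<std::uintptr_t>(p) % Align == 0 &&
         "pointer is less aligned than promised");
  std::uint32_t v;
  // memcpy keeps the access well defined for any Align, including 1.
  std::memcpy(&v, __builtin_assume_aligned(p, Align), sizeof(v));
  return v;
}

// Hypothetical helper: store value at a given offset inside a 4 byte aligned
// buffer, then read it back through read32<Align>.
template <std::size_t Align>
inline bool roundTrip(std::size_t offset, std::uint32_t value) {
  alignas(4) unsigned char buf[12] = {};
  std::memcpy(buf + offset, &value, sizeof(value));
  return read32<Align>(buf + offset) == value;
}
```

Compiling read32<1>, read32<2> and read32<4> with optimization for the target of interest and comparing the generated assembly should show whether the alignment promise changes the instruction sequence on that target.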

Now, I don't really know how much we support *host* architectures
without an unaligned load instruction. If we don't care about making lld
and other llvm tools slower on those host architectures, we could use
packed_endian_specific_integral with an alignment of 1 and remove the
check. I guess we have to ask on llvm-dev before changing that.

Cheers,
Rafael

Rui Ueyama via llvm-dev

2017-Dec-06 02:47 UTC

Somewhat orthogonal to the original issue, but if object files are aligned
only to two bytes in a static archive, and if we are using the four byte
aligned load instruction on armv6 to load data from object files, that
means current LLVM can easily cause a bus error on armv6, no?


Rafael Avila de Espindola via llvm-dev

2017-Dec-06 03:14 UTC

Rui Ueyama <ruiu at google.com> writes:
> Somewhat orthogonal to the original issue, but if object files are aligned
> only to two bytes in a static archive, and if we are using the four byte
> aligned load instruction on armv6 to load data from object files, that
> means current LLVM can easily cause a bus error on armv6, no?

We are not using a 4 byte aligned load. We are using:

>> the 2 byte aligned case is
>>
>>   ldrh    r1, [r0, #2]
>>   ldrh    r0, [r0]
>>   orr     r0, r0, r1, lsl #16

That is why the check for the section being at least 2 byte aligned is
important.

The 2 is from

  using Elf_Word = support::detail::packed_endian_specific_integral<
      uint32_t, target_endianness, 2>;

Cheers,
Rafael
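
To illustrate how the alignment parameter of such a type ties into the check discussed in this thread, here is a simplified stand-in. It is a sketch only: PackedInt, Word1, Word2, Word4 and isUsableAs are hypothetical names, not LLVM's implementation, and endianness handling is left out. The buffer address 0x1000 is an assumed value for a page aligned mapping; 0x071423 is the Off value from the readelf output quoted earlier.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified stand-in for a packed integral: raw bytes plus the alignment
// the type claims for itself. Host-endian only.
template <typename T, std::size_t Align>
struct PackedInt {
  alignas(Align) unsigned char bytes[sizeof(T)];

  T get() const {
    T v;
    std::memcpy(&v, bytes, sizeof(T));  // safe for any actual alignment
    return v;
  }
};

using Word1 = PackedInt<std::uint32_t, 1>;
using Word2 = PackedInt<std::uint32_t, 2>;  // mirrors the "2" in Elf_Word
using Word4 = PackedInt<std::uint32_t, 4>;

static_assert(sizeof(Word2) == 4 && alignof(Word2) == 2, "size 4, align 2");
static_assert(alignof(Word1) == 1, "align 1");

// Hypothetical validity check: may the bytes at bufferAddress + offset be
// viewed as an array of W? Only if that address meets W's alignment.
template <typename W>
constexpr bool isUsableAs(std::uint64_t bufferAddress, std::uint64_t offset) {
  return (bufferAddress + offset) % alignof(W) == 0;
}

// An odd section offset such as 0x071423 fails even the 2 byte check.
static_assert(!isUsableAs<Word2>(0x1000, 0x071423), "odd offset rejected");
// With an alignment of 1 the check can never fail, so it could be removed.
static_assert(isUsableAs<Word1>(0x1000, 0x071423), "always accepted");
```

In this sketch, a type declared with alignment 4 would reject an object that starts at a 2 byte aligned position inside an archive, while the alignment 2 variant accepts it.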
