thr3ads.net - search: "scanrelocations"

Displaying 19 results from an estimated 19 matches for "scanrelocations".

[LLD] thunk implementation correctness depends on order of input section.

2016 Jun 21

...ave already been scanned.
- There is a short-term fix, but in the longer term I think that we will want to allow multiple passes over the relocations.
- I'd like to know if there are any preferences on a solution.

The current thunk implementation scans for and adds Thunks to InputSections within scanRelocations(). At present a Thunk for a RegularSymbol is always added to the InputSection that defines the RegularSymbol. I think that this can cause a problem when the InputSection with the relocation that needs to be indirected via the thunk is processed after the InputSection that the thunk is added to. Fo...
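The ordering problem described in this snippet can be illustrated with a toy single-pass scan. This is only a sketch under stated assumptions: `Section`, `scan`, and `make_sections` are hypothetical names, and the model is far simpler than LLD's real data structures. It only shows that a thunk attached to the defining section arrives "too late" when that section was scanned before the caller.

```python
# Toy model of a single-pass relocation scan. All names here are hypothetical;
# this is not LLD's data model.
from dataclasses import dataclass, field


@dataclass
class Section:
    name: str
    defines: set                      # symbols defined in this section
    calls: list                       # symbols this section's relocations target
    thunks: list = field(default_factory=list)


def scan(sections, needs_thunk):
    """Scan sections in order; return thunks that were added too late."""
    owner = {sym: sec for sec in sections for sym in sec.defines}
    scanned = set()
    late = []
    for sec in sections:
        for sym in sec.calls:
            if sym not in needs_thunk:
                continue
            target = owner[sym]
            if sym in target.thunks:
                continue              # a thunk already exists, reuse it
            target.thunks.append(sym)
            # The defining section was processed before this caller, so
            # anything already derived from its contents is now stale.
            if target.name in scanned:
                late.append((sym, target.name, sec.name))
        scanned.add(sec.name)
    return late


def make_sections():
    """Build a fresh definer/caller pair for one experiment."""
    definer = Section("definer", defines={"f"}, calls=[])
    caller = Section("caller", defines=set(), calls=["f"])
    return definer, caller
```

Scanning `[caller, definer]` reports no late thunks, while scanning `[definer, caller]` reports one, which is the order dependence the thread title refers to.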

[lld] We call SymbolBody::getVA redundantly a lot...

2017 Mar 01

...thing called by SymbolTable::addFile) The frontend work seems to be largely dominated by ObjectFile::parse (as you would expect), though there is about 10% of total runtime slipping through the cracks here in various other "frontend" tasks. The backend work is split about evenly between scanRelocations and OutputSection::writeTo. InputSectionBase::relocate is only about 10% of the total runtime (part of OutputSection::writeTo). Some slightly cleaned up `perf report` output with some more details: https://reviews.llvm.org/P7972 So it seems like overall, the profile is basically split 3 ways (abo...

[lld] We call SymbolBody::getVA redundantly a lot...

2017 Mar 01

On Tue, Feb 28, 2017 at 11:39 PM, Rui Ueyama <ruiu at google.com> wrote:
> I also did a quick profiling a few months ago and noticed just like you that scanRelocations consumes a fairly large percentage of overall execution time. That caught my attention because at the time I was looking for a place that I can parallelize.
>
> scanRelocations is not parallelizable as-is because it does some global operations as side effects such as adding new...
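One common way around global side effects in a scan is to split it into a pure per-section phase and a serial merge. The sketch below is not taken from the thread: the snippet is truncated before it names the global operations, so the shared `table` is a hypothetical stand-in, and `scan_section` and `scan_all` are illustrative names.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_section(relocs):
    """Pure function: report which symbols need a table entry, mutate nothing."""
    return [sym for sym, needs_entry in relocs if needs_entry]


def scan_all(sections, workers=4):
    # Phase 1: sections are scanned independently, so this part can run
    # in parallel. pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_section = list(pool.map(scan_section, sections))
    # Phase 2: apply the global side effect serially, in input order, so the
    # resulting table does not depend on thread scheduling.
    table = {}
    for requests in per_section:
        for sym in requests:
            table.setdefault(sym, len(table))
    return table
```

Because the merge runs in input order, the assigned indices are the same for any worker count, which keeps the output deterministic.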

lld dynamic relocation creation issue

2016 Feb 03

Hi all, While working on lld aarch64 support I came across an issue, and I am not sure which design approach would be best to solve it. The aarch64 R_AARCH64_ABS64 relocation for a PIC/PIE build requires a dynamic relocation (R_AARCH64_RELATIVE) with the value set as the addend of the relocation. For instance, when linking the crtbeginS.o which contains: Relocation section '.rela.init_array' at
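The arithmetic behind a relative dynamic relocation can be sketched as follows. This shows the general RELA-style rule (the loader stores load base plus addend), not the specific fix discussed in the thread, whose text is truncated here. `DynReloc`, `make_relative`, and `apply_at_load` are hypothetical names, and the addresses in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass
class DynReloc:
    offset: int      # link-time address of the 64-bit slot to patch
    type: str        # relocation type name, kept as a string for clarity
    addend: int      # value carried in the relocation record


def make_relative(slot_va, symbol_va, addend):
    """Turn an absolute 64-bit reference into a load-time relative fixup."""
    # The link-time value (symbol address plus original addend) is stored
    # as the addend of the dynamic relocation.
    return DynReloc(slot_va, "R_AARCH64_RELATIVE", symbol_va + addend)


def apply_at_load(reloc, load_base):
    """What the dynamic loader stores in the slot: load base plus addend."""
    return load_base + reloc.addend
```

For example, a reference to a symbol at link-time address `0x700` with addend `8` yields a record with addend `0x708`, and a load base of `0x5555550000` resolves it to `0x5555550708`.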

LLD: Possible optimization for TargetInfo

2016 Mar 30

On Wed, Mar 30, 2016 at 4:20 PM, Sean Silva <chisophugis at gmail.com> wrote:
> I believe the relocation stuff that Rafael is currently working on will make this a non-issue (it will make relocation application much friendlier for the CPU).

I don't think Rafael's patch would make this a non-issue. He's making scanRelocs to create data, which would reduce the

[LLD] thunk implementation correctness depends on order of input section.

2016 Jun 22

First of all, thanks for finding the bug. I agree with Rui that right now we can manage without a general thunk infrastructure. Let's provide at least a few "thunk" implementations and after that reconsider the necessity of a common thunk framework. As to MIPS, there is one more type of thunk (keyword is .MIPS.stubs) and one more optimization of the current thunk (putting a thunk in the beginning

LLD: Possible optimization for TargetInfo

2016 Mar 30

> On Mar 30, 2016, at 4:25 PM, Rui Ueyama via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Wed, Mar 30, 2016 at 4:20 PM, Sean Silva <chisophugis at gmail.com> wrote:
> I believe the relocation stuff that Rafael is currently working on will make this a non-issue (it will make relocation application much friendlier for the CPU).

LLD: Possible optimization for TargetInfo

2016 Mar 31

On Wed, Mar 30, 2016 at 4:25 PM, Rui Ueyama <ruiu at google.com> wrote:
> On Wed, Mar 30, 2016 at 4:20 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>> I believe the relocation stuff that Rafael is currently working on will make this a non-issue (it will make relocation application much friendlier for the CPU).
>
> I don't think

LLD: Possible optimization for TargetInfo

2016 Mar 31

On Wed, Mar 30, 2016 at 5:34 PM, Sean Silva <chisophugis at gmail.com> wrote:
> On Wed, Mar 30, 2016 at 4:25 PM, Rui Ueyama <ruiu at google.com> wrote:
>
>> On Wed, Mar 30, 2016 at 4:20 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>> I believe the relocation stuff that Rafael is currently working on will make

[lld] We call SymbolBody::getVA redundantly a lot...

2017 Feb 28

tl;dr: it looks like we call SymbolBody::getVA about 5x more times than we need to. Should we cache it or something? (careful with threads). Here is a link to a PDF of my Mathematica notebook which has all the details of my investigation: https://drive.google.com/open?id=0B8v10qJ6EXRxVDQ3YnZtUlFtZ1k There seem to be two main regimes in which we redundantly call SymbolBody::getVA: 1. most
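The "cache it" idea can be sketched as plain memoization. This is a minimal illustration, not LLD code: `Symbol`, `get_va`, and `invalidate` are hypothetical stand-ins, and the address computation is reduced to base plus offset.

```python
class Symbol:
    """Hypothetical stand-in for a linker symbol; not LLD's SymbolBody."""

    def __init__(self, section_base, offset):
        self.section_base = section_base
        self.offset = offset
        self.computations = 0   # counts how often the address is recomputed
        self._va = None         # cached virtual address, None until computed

    def get_va(self):
        # Compute once, then reuse. The cached value is only valid once
        # section addresses are final.
        if self._va is None:
            self.computations += 1
            self._va = self.section_base + self.offset
        return self._va

    def invalidate(self):
        # Must be called whenever the section is moved.
        self._va = None
```

On the "careful with threads" point: a lazily filled cache like this one is written on first read, so concurrent readers would need synchronization. One alternative is to fill every cache in a serial pass after layout is final and only read it afterwards.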

LLD: Possible optimization for TargetInfo

2016 Mar 31

On Wed, Mar 30, 2016 at 6:17 PM, Rui Ueyama <ruiu at google.com> wrote:
> On Wed, Mar 30, 2016 at 5:34 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>> On Wed, Mar 30, 2016 at 4:25 PM, Rui Ueyama <ruiu at google.com> wrote:
>>
>>> On Wed, Mar 30, 2016 at 4:20 PM, Sean Silva <chisophugis at gmail.com> wrote:

[lld] avoid emitting PLT entries for ifuncs

2018 Aug 21

Hello, We've recently started using ifuncs in the x86(_64) FreeBSD kernel. Currently lld will emit a PLT entry for each ifunc, so ifunc calls are more expensive than those of regular functions. In our kernel, this overhead isn't really necessary: if lld instead emits PC-relative relocations for each ifunc call site, where each relocation references a symbol of type GNU_IFUNC, then during

LLD: Possible optimization for TargetInfo

2016 Mar 30

I believe the relocation stuff that Rafael is currently working on will make this a non-issue (it will make relocation application much friendlier for the CPU). However, even in the current scheme, since the target is fixed, all the indirect call sites should be monomorphic and so there shouldn't be much branch-prediction cost (certainly nothing that would cause 1.8% performance delta for the

[llvm-mc] FreeBSD kernel module performance impact when upgrading clang

2020 Nov 02

Hi, I'm in the process of migrating from clang5 to clang10. Unfortunately clang10 introduced a negative performance impact. The cause is an increase of PLT entries from this patch (first released in clang7): https://bugs.llvm.org/show_bug.cgi?id=36370 https://reviews.llvm.org/D43383 If I revert that clang patch locally, the additional PLT entries and the performance impact disappear. This

RFC: LLD range extension thunks

2017 Jan 04

I'm about to start working on range extension thunks in lld. This is an attempt to summarize the approach I'd like to take and what the impact will be on lld outside of thunks. I'm interested if anyone has any constraints the approach will break, alternative suggestions, or is working on something I'll need to take account of? I expect range extension thunks to be important for

LLD: Possible optimization for TargetInfo

2016 Mar 31

On Wed, Mar 30, 2016 at 6:42 PM, Sean Silva <chisophugis at gmail.com> wrote:
> On Wed, Mar 30, 2016 at 6:17 PM, Rui Ueyama <ruiu at google.com> wrote:
>
>> On Wed, Mar 30, 2016 at 5:34 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>> On Wed, Mar 30, 2016 at 4:25 PM, Rui Ueyama <ruiu at

LLD: Possible optimization for TargetInfo

2016 Mar 30

I was wondering how much overhead the virtual function calls of TargetInfo member functions incur. TargetInfo handles platform-specific details, and we have target-specific subclasses of that class. The subclasses override functions defined in TargetInfo. The TargetInfo member functions are called multiple times for each relocation. So the cost of virtual function calls may be non-negligible. That

RFC: LLD range extension thunks

2017 Jan 05

Hello Rui, Thanks for the comments - Synthetic sections and rewriting relocations I think that this would definitely be worth trying. It should remove the need for thunks to be represented in the core data structures, and would allow . It would also mean that we wouldn't have to associate symbols with thunks as the relocations would directly target the thunks. ARM interworking makes reusing

[EXTERNAL] [llvm-mc] FreeBSD kernel module performance impact when upgrading clang

2020 Nov 05

> You used -noinhibit-exec to ignore the diagnostic, which is usually a bad thing.

I certainly agree with that. The point I was trying to make in my original email is that, specifically for kernel objects, this diagnostic is incorrect. R_X86_64_PC32 can be used safely against the symbol foo in that specific context, and should be possible without ignoring diagnostics. I wondered if there
