similar to: A couple metrics of LLD/ELF's performance

Displaying 20 results from an estimated 1600 matches similar to: "A couple metrics of LLD/ELF's performance"

2017 Feb 28
4
[lld] We call SymbolBody::getVA redundantly a lot...
tl;dr: it looks like we call SymbolBody::getVA about 5x more times than we need to Should we cache it or something? (careful with threads). Here is a link to a PDF of my Mathematica notebook which has all the details of my investigation: https://drive.google.com/open?id=0B8v10qJ6EXRxVDQ3YnZtUlFtZ1k There seem to be two main regimes that we redundantly call SymbolBody::getVA: 1. most
2017 Mar 01
2
[lld] We call SymbolBody::getVA redundantly a lot...
On Tue, Feb 28, 2017 at 12:10 PM, Rui Ueyama <ruiu at google.com> wrote: > I don't think getVA is particularly expensive, and if it is not expensive > I wouldn't cache its result. Did you experiment to cache getVA results? I > think you can do that fairly easily by adding a std::atomic_uint64_t to > SymbolBody and use it as a cache for getVA. > You're right,
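The caching idea quoted above (an atomic 64-bit field on the symbol used as a cache for getVA) could be sketched roughly as follows. This is a minimal illustration only: `CachedSymbol`, `computeVA`, `SectionBase`, `Offset`, `ComputeCount`, and the `NoVA` sentinel are hypothetical names invented for this sketch and are not lld's actual SymbolBody API. It also assumes the address is stable once computed and that `UINT64_MAX` is never a valid address.

```cpp
#include <atomic>
#include <cstdint>

// Sentinel meaning "not computed yet" (assumption: never a real address).
constexpr uint64_t NoVA = UINT64_MAX;

// Hypothetical stand-in for a linker symbol; not lld's real SymbolBody.
struct CachedSymbol {
  uint64_t SectionBase;
  uint64_t Offset;
  // Cache slot for the computed virtual address.
  mutable std::atomic<uint64_t> CachedVA{NoVA};
  // Counts how often the slow path runs (for demonstration only).
  mutable std::atomic<unsigned> ComputeCount{0};

  CachedSymbol(uint64_t Base, uint64_t Off) : SectionBase(Base), Offset(Off) {}

  // Placeholder for the real address computation.
  uint64_t computeVA() const {
    ComputeCount.fetch_add(1, std::memory_order_relaxed);
    return SectionBase + Offset;
  }

  // Returns the cached value if present, otherwise computes and stores it.
  // Two threads may both take the slow path, but they store the same value,
  // so relaxed ordering is enough for this sketch.
  uint64_t getVA() const {
    uint64_t VA = CachedVA.load(std::memory_order_relaxed);
    if (VA != NoVA)
      return VA;
    VA = computeVA();
    CachedVA.store(VA, std::memory_order_relaxed);
    return VA;
  }
};
```

Whether such a cache pays off depends on how expensive the real computation is compared with the extra 8 bytes per symbol, which is the question raised in the quoted message.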
2017 Mar 01
2
[lld] We call SymbolBody::getVA redundantly a lot...
On Tue, Feb 28, 2017 at 11:39 PM, Rui Ueyama <ruiu at google.com> wrote: > I also did a quick profiling a few months ago and noticed just like you > that scanRelocations consumes a fairly large percentage of overall > execution time. That caught my attention because at the time I was looking > for a place that I can parallelize. > > scanRelocations is not parallelizable
2016 Nov 16
3
LLD: time to enable --threads by default
On 16 November 2016 at 15:52, Rafael Espíndola <rafael.espindola at gmail.com> wrote: > I will do a quick benchmark run. On a mac pro (running linux) the results I got with all cores available: firefox master 7.146418217 patch 5.304271767 1.34729488437x faster firefox-gc master 7.316743822 patch 5.46436812 1.33899174824x faster chromium master 4.265597914 patch
2016 Nov 17
3
LLD: time to enable --threads by default
SHA1 in LLVM is *very* naive, any improvement is welcome there! I think Amaury pointed it out originally and he had an alternative implementation IIRC. -- Mehdi > On Nov 16, 2016, at 3:58 PM, Rui Ueyama via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > By the way, while running benchmark, I found that our SHA1 function seems much slower than the one in gold. gold slowed down by
2016 Nov 17
2
LLD: time to enable --threads by default
The current implementation was “copy/pasted” from somewhere (it was explicitly public domain). > On Nov 16, 2016, at 4:05 PM, Rui Ueyama <ruiu at google.com> wrote: > > Can we just copy-and-paste optimized code from somewhere? > > On Wed, Nov 16, 2016 at 4:03 PM, Mehdi Amini <mehdi.amini at apple.com> wrote: > SHA1 in LLVM is
2018 Feb 16
4
New LLD performance builder
>Hello everyone, > >I have added a new public LLD performance builder at >http://lab.llvm.org:8011/builders/lld-perf-testsuite. >It builds LLVM and LLD with the latest released Clang and runs a set of >performance tests. > >The builder is reliable. Please pay attention to the failures. > >The performance statistics are here:
2018 Feb 16
0
New LLD performance builder
Hello George, Sorry, I somehow hit the send button too soon. Please ignore the previous e-mail. The bot does 10 runs for each of the benchmarks (those dots in the logs are meaningful). We can increase the number of runs if it is shown that this would significantly increase the accuracy. I didn't see an increase in accuracy when I was staging the bot, which would justify the extra time and larger
2016 Nov 16
9
LLD: time to enable --threads by default
LLD supports multi-threading, and it seems to be working well as you can see in a recent result <http://llvm.org/viewvc/llvm-project?view=revision&revision=287140>. In short, LLD runs 30% faster with the --threads option and more than 50% faster if you are using --build-id (your mileage may vary depending on your computer). However, I don't think most users even know about that
2010 Feb 11
3
[LLVMdev] Metadata
On Thursday 11 February 2010 13:31:58 David Greene wrote: > > Putting a bit (or multiple bits) in MachineMemOperand for this > > would also make sense. > > Is there any chance a MachineMemOperand will be shared by multiple > instructions? So I tried to do this: %r8 = load <2 x double>* %r6, align 16, !"nontemporal" and the assembler doesn't like it.
2016 Mar 16
2
LLD performance w.r.t. local symbols (and --build-id)
Hi, Rafael took some measurements to try to investigate the effect of the local symbols changes. I've been taking a look at the measurements he got and there were some interesting things we noticed. For starters, in the range of revisions tested (r263214 through r263471), we found that the commit for --build-id was the most noticeable, with slowdowns from 7% to 23% (note: these were
2010 Feb 11
0
[LLVMdev] Metadata
On Thursday 11 February 2010 14:05:21 David Greene wrote: > Either ParseLoad and probably other instructions need to look for metadata > explicitly or ParseOptionalCommaAlign needs to know about general metadata. > > My inkling is to fix ParseOptionalCommaAlign. Sound reasonable? Well, that's a rat's nest. I backed up and thought maybe I have the metadata syntax wrong. So
2020 Apr 28
2
Nontemporal memory accesses and fences
The current specification of the behavior of the !nontemporal attribute in LLVM, and the __builtin_nontemporal_* functions in Clang, is rather spartan and underspecified. In effect, it says the following things: * Atomic !nontemporal has no defined semantics * !nontemporal may use special instructions to save cache bandwidth, such as "MOVNT" on x86. What is crucially lacking
2020 Apr 29
2
Nontemporal memory accesses and fences
________________________________ From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of JF Bastien via llvm-dev <llvm-dev at lists.llvm.org> Sent: Tuesday, April 28, 2020 4:54 PM To: Cranmer, Joshua <joshua.cranmer at intel.com> Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] Nontemporal memory accesses and fences I see
2013 Jan 14
2
[LLVMdev] Dynamic Profiling - Instrumentation basic query
Hi, @Alastair: Thanks a bunch for explaining this so well. I was able to write a simple profiler, and run it. I need to profile the code for branches (branch mis predicts simulation), load/store instructions (for cache hits/miss rate), and a couple of other things and therefore, would need to instrument the code. However, I would like to know if writing the output to a file would increase the
2018 Jan 20
2
Non-Temporal hints from Loop Vectorizer
I have already seen usage of __builtin_nontemporal_store, but I want to automate identification of non-temporal loads/stores. I think I need to go for a pass. Is it possible to detect non-temporal loops without Polly? On Sat, Jan 20, 2018 at 11:26 PM, Simon Pilgrim <llvm-dev at redking.me.uk> wrote: > On 20/01/2018 18:16, hameeza ahmed wrote: > > Actually i am working on vector
2018 Jan 20
2
Non-Temporal hints from Loop Vectorizer
Actually i am working on vector accelerator which will perform those instructions which are non temporal. for instance if i have this loop for(i=0;i<2048;i++) a[i]=b[i]+c[i]; currently it emits following IR; %0 = getelementptr inbounds [2048 x i32], [2048 x i32]* @b, i64 0, i64 %index %1 = bitcast i32* %0 to <16 x i32>* %wide.load = load <16 x i32>, <16 x i32>* %1,
2016 Jan 13
4
RFC: non-temporal fencing in LLVM IR
Hello, fencing enthusiasts! *TL;DR:* We'd like to propose an addition to the LLVM memory model requiring non-temporal accesses be surrounded by non-temporal load barriers and non-temporal store barriers, and we'd like to add such orderings to the fence IR opcode. We are open to different approaches, hence this email instead of a patch. *Who's "we"?* Philip Reames brought
2010 Feb 11
4
[LLVMdev] Metadata
On Feb 11, 2010, at 12:50 PM, David Greene wrote: > On Thursday 11 February 2010 14:05:21 David Greene wrote: > >> Either ParseLoad and probably other instructions need to look for metadata >> explicitly or ParseOptionalCommaAlign needs to know about general metadata. >> >> My inkling is to fix ParseOptionalCommaAlign. Sound reasonable? > > Well, that's
2018 Jan 21
0
Non-Temporal hints from Loop Vectorizer
On 01/20/2018 12:29 PM, hameeza ahmed via llvm-dev wrote: > i have already seen usage of __builtin_nontemporal_store but i want to > automate identification of non temporal loads/stores. i think i need > to go for a pass. is it possible to detect non temporal loops without > polly? Yes, but we don't have anything that does that right now. The cost modeling is non-trivial,