search for: speedup

Displaying 20 results from an estimated 983 matches for "speedup".

2015 May 03
2
[LLVMdev] libiomp, not libgomp as default library linked with -fopenmp
A couple more data points. Current llvm 3.7svn with the two outstanding OPENMP patches can build the openmp support in gdl 0.9.5 (which completely passes its test suite) and apbs 1.4.1's limited openmp support. On Sat, May 2, 2015 at 11:11 PM, Jack Howarth < howarth.mailing.lists at gmail.com> wrote: > On a positive note, current llvm 3.7svn with the two outstanding > OPENMP
2015 Jul 30
4
[LLVMdev] RFC: Callee speedup estimation in inline cost analysis
...oposal ------------- LLVM inlines a function if the size growth (in the given context) is less than a threshold. The threshold is increased based on certain characteristics of the called function (inline keyword and the fraction of vector instructions, for example). I propose the use of estimated speedup (estimated reduction in dynamic instruction count to be precise) as another factor that controls threshold. This would allow larger functions whose inlining potentially reduces execution time to be inlined. The dynamic instruction count of (an uninlined) function F is DI(F) = Sum_BB(Freq(BB) * In...
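The formula truncated above defines DI(F) as a frequency-weighted sum over basic blocks. A minimal sketch of that estimate, with made-up block frequencies and instruction counts (none of the numbers are from the RFC):

```python
# Hedged sketch: estimate the dynamic instruction count DI(F) as the
# sum over basic blocks BB of Freq(BB) * InstructionCount(BB).
def dynamic_instruction_count(blocks):
    """blocks: list of (frequency, instruction_count) pairs, one per basic block."""
    return sum(freq * count for freq, count in blocks)

# A hypothetical function with an entry block, a hot loop body, and an exit.
blocks = [
    (1, 5),    # entry: executed once, 5 instructions
    (100, 8),  # loop body: 100 iterations, 8 instructions
    (1, 3),    # exit: executed once, 3 instructions
]
print(dynamic_instruction_count(blocks))  # 1*5 + 100*8 + 1*3 = 808
```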
2015 Jul 31
0
[LLVMdev] RFC: Callee speedup estimation in inline cost analysis
Just nitpicking: 1) DI(F) should include a component that estimates the prologue/epilogue cost (frameSetupCost) which InlinedDF does not have 2) The speedup should include the callsite cost associated with 'C' (call instr, argument passing): Speedup(F,C) = (DI(F) + CallCost(C) - InlinedDF(F,C))/DI(F). Otherwise the proposal looks reasonable to me. David On Thu, Jul 30, 2015 at 2:25 PM, Easwaran Raman <eraman at google.com> wrote:...
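The corrected formula in the reply can be exercised with toy numbers. All values below are invented for illustration; only the shape of the computation comes from the thread:

```python
# Sketch of the speedup estimate as amended in the reply:
#   Speedup(F, C) = (DI(F) + CallCost(C) - InlinedDF(F, C)) / DI(F)
# where DI(F) is the callee's standalone dynamic instruction count,
# CallCost(C) the call overhead at callsite C, and InlinedDF(F, C)
# the dynamic instruction count of F once inlined at C.
def estimated_speedup(di_f, call_cost, inlined_df):
    return (di_f + call_cost - inlined_df) / di_f

# Hypothetical callee: 800 dynamic instructions standalone, 700 once
# inlined (constants from the callsite simplify it), call overhead 10.
print(estimated_speedup(800, 10, 700))  # (800 + 10 - 700) / 800 = 0.1375
```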
2013 Jun 02
4
[LLVMdev] Polyhedron 2005 results for dragonegg 3.3svn
...Turning on LLVM's vectorizer gives a 2% slowdown. > aermod 16.03 14.45 16.13 Turning on LLVM's vectorizer gives a 2.5% slowdown. > air 6.80 5.28 5.73 > capacita 39.89 35.21 34.96 Turning on LLVM's vectorizer gives a 5% speedup. GCC gets a 5.5% speedup from its vectorizer. > channel 2.06 2.29 2.69 GCC gets a 30% speedup from its vectorizer which LLVM doesn't get. On the other hand, without vectorization LLVM's version runs 23% faster than GCC's, so while GCC's vectorizer lea...
2007 Jun 24
1
rsync summary details...
Hi, I'm trying to figure out some of these details: sent 34108 bytes received 6913101 bytes 19487.26 bytes/sec total size is 231889639875 speedup is 33378.82 1. Is the 6913101 really in bytes? 2. What is the 231889639875 measurement? Bytes? Bits? 3. What does "speedup" mean exactly? Thanks in advance, Shai
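To answer the questions in order: both counters are in bytes, the total size is in bytes, and rsync's speedup is the total size of the files divided by the bytes actually sent over the wire (sent plus received), i.e. how much the delta-transfer algorithm saved versus copying everything. The numbers in the message check out under that definition:

```python
# rsync's reported "speedup" is total file size divided by the bytes
# actually transferred (sent + received): the factor saved by the
# delta-transfer algorithm versus shipping every byte.
sent = 34108              # bytes sent
received = 6913101        # bytes received
total_size = 231889639875 # total size of the files, in bytes

speedup = total_size / (sent + received)
print(round(speedup, 2))  # 33378.82, matching the reported value
```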
2015 Sep 16
3
RFC: speedups with instruction side-data (ADCE, perhaps others?)
...st of managing the set) + (cost of eraseinstruction), which in our case turns out to be 1/3 the former and 2/3 the latter (roughly). —escha > On Sep 15, 2015, at 6:50 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > > Can someone provide the file used to demonstrate the speedup here? > I'd be glad to take a quick crack at seeing if i can achieve the same speedup. > > > On Tue, Sep 15, 2015 at 2:16 PM, Owen Anderson via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >> On Sep 14, 2015, at 5:02 PM, Mehdi Amini via llvm-dev >>...
2013 Jun 02
0
[LLVMdev] Polyhedron 2005 results for dragonegg 3.3svn
...2% slowdown. > >> aermod 16.03 14.45 16.13 > > Turning on LLVM's vectorizer gives a 2.5% slowdown. > >> air 6.80 5.28 5.73 >> capacita 39.89 35.21 34.96 > > Turning on LLVM's vectorizer gives a 5% speedup. GCC gets a 5.5% speedup from > its vectorizer. > >> channel 2.06 2.29 2.69 > > GCC gets a 30% speedup from its vectorizer which LLVM doesn't get. On the > other hand, without vectorization LLVM's version runs 23% faster than GCC's, so &...
2011 Nov 08
3
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
...lang -O3' against 'clang -O3 -mllvm -vectorize'? Yes. [I've tested the current patch directly using opt -vectorize -unroll-allow-partial; for running the test suite I recompiled llvm/clang to hardcode the options as I wanted them]. > > > The largest three performance speedups are: > > SingleSource/Benchmarks/BenchmarkGame/puzzle - 59.2% speedup > > SingleSource/UnitTests/Vector/multiplies - 57.7% speedup > > SingleSource/Benchmarks/Misc/flops-7 - 50.75% speedup > > > > The largest three performance slowd...
2011 Nov 08
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
...ang -O3 -mllvm -unroll-allow-partial' with 'clang -O3 -mllvm -unroll-allow-partial -mllvm -vectorize'. It will show how much of the runtime overhead is due to the unrolling (produces more code that needs to be optimized) and which part is due to vectorization. The same counts for the speedup. How much is caused by unrolling and how much is actually caused by your pass. >>> The largest three performance speedups are: >>> SingleSource/Benchmarks/BenchmarkGame/puzzle - 59.2% speedup >>> SingleSource/UnitTests/Vector/multiplies - 57.7% sp...
2001 Sep 08
5
Patch
Hello, a short question: what is the syntax for applying interactivity.patch and ext3-dir-speedup.patch? patch -p0 ext3-dir-speedup.patch doesn't work -- Frank
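The command fails because patch does not take the patch file as a positional argument: a bare filename after the options names the file to be patched, not the patch. The patch itself is read from standard input, or named with -i. A minimal sketch, using throwaway demo files as stand-ins for the real kernel patches:

```shell
# "patch -p0 foo.patch" would treat foo.patch as the file TO BE patched.
# The patch must arrive on stdin (or via -i).  The demo files below are
# illustrative stand-ins, not the actual kernel patches.
printf 'old line\n' > demo.txt
printf 'new line\n' > demo.new
diff -u demo.txt demo.new > demo.patch || true  # diff exits 1 when files differ
rm demo.new                                     # leave only the target file

patch -p0 < demo.patch    # correct: patch read from standard input
# equivalent: patch -p0 -i demo.patch
cat demo.txt              # the file now contains the patched line
```

So the original invocations would be `patch -p0 < interactivity.patch` and `patch -p0 < ext3-dir-speedup.patch`, run from the directory the patches were made against (use -p1 instead if the patch headers carry a leading directory component).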
2013 Jun 03
0
[LLVMdev] Polyhedron 2005 results for dragonegg 3.3svn
...fmul's. > > I'm not sure what the best way to implement this optimization in LLVM > is. Maybe > Shuxin has some ideas. > > So it looks like a missed fast-math optimization rather than anything > to do with > vectorization, which is strange as GCC only gets the big speedup when > vectorization is turned on. > > Ciao, Duncan. > >> >> Thanks, >> Nadav >> >> >> On Jun 2, 2013, at 1:27, Duncan Sands <duncan.sands at gmail.com >> <mailto:duncan.sands at gmail.com>> wrote: >> >>> Hi Jack, than...
2010 May 17
0
[LLVMdev] selection dag speedups / llc speedups
On May 14, 2010, at 11:24 AM, Jan Voung wrote: > I'm sure this has been asked many times, but is there current work on decreasing the time taken by the DAG-based instruction selector, or the other phases of llc? I am just beginning to dive into LLVM, and I am interested in compile-time reductions that do not reduce code quality dramatically. For example, simply switching on
2010 May 19
0
[LLVMdev] selection dag speedups / llc speedups
On May 18, 2010, at 12:07 PM, Jan Voung wrote: > Here are some recent stats of the fast vs local vs linear scan at O0 on "opt -std-compile-opts" processed bitcode files. The fast regalloc is still certainly faster at codegen than local with such bitcode files. Let me know if the link doesn't work: > >
2015 Sep 14
3
RFC: speedups with instruction side-data (ADCE, perhaps others?)
I did something similar for dominators, for GVN, etc. All see significant speedups. However, the answer i got back when i mentioned this was "things like ptrset and densemap should only have a small performance difference from side data when used and sized right", and i've found this to mostly be true after looking harder. In the case you are looking at, i see:...
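The side-data idea being weighed in this thread can be sketched in miniature. This is an illustrative Python analogue, not LLVM's actual C++ API: the mark phase of a dead-code-elimination pass can record liveness either as a flag stored on the instruction itself (side data) or as membership in an external set (a DenseSet/SmallPtrSet in LLVM):

```python
# Illustrative contrast between the two bookkeeping strategies.
class Inst:
    def __init__(self, name, operands=()):
        self.name = name
        self.operands = list(operands)
        self.live = False          # side data: one flag stored on the instruction

def mark_live_side_data(roots):
    """Worklist walk that marks liveness via the per-instruction flag."""
    work = list(roots)
    while work:
        inst = work.pop()
        if inst.live:
            continue
        inst.live = True
        work.extend(inst.operands)  # operands of a live inst are live too

def mark_live_set(roots):
    """Same walk, but liveness is membership in an external set."""
    live, work = set(), list(roots)
    while work:
        inst = work.pop()
        if inst in live:
            continue
        live.add(inst)
        work.extend(inst.operands)
    return live
```

Both walks visit exactly the same instructions; the thread's dispute is purely about the constant factors of a flag write versus hash-set insertions and lookups at LLVM's scale, and whether a properly sized DenseSet closes the gap.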
2011 Nov 08
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
...ch will probably > work for you. Hey Hal, this is great news, especially as the numbers seem to show that vectorization has a significant performance impact. What did you compare exactly? 'clang -O3' against 'clang -O3 -mllvm -vectorize'? > The largest three performance speedups are: > SingleSource/Benchmarks/BenchmarkGame/puzzle - 59.2% speedup > SingleSource/UnitTests/Vector/multiplies - 57.7% speedup > SingleSource/Benchmarks/Misc/flops-7 - 50.75% speedup > > The largest three performance slowdowns are: > MultiSourc...
2010 May 18
0
[LLVMdev] selection dag speedups / llc speedups
On May 17, 2010, at 9:09 PM, Rafael Espindola wrote: >> The fast and local register allocators are meant to be used on unoptimized code, a 'Debug build'. While they do work on optimized code, they do not give good results. Their primary goal is compile time, not code quality. > > Yes, we have a somewhat uncommon use case. It is fine to spend time > optimizing bitcode (LTO
2011 Nov 08
1
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
...t all of the bugs that it revealed have now been fixed. There are still two programs that don't compile with vectorization turned on, and I'm working on those now, but in case anyone feels like playing with vectorization, this patch will probably work for you. The largest three performance speedups are: SingleSource/Benchmarks/BenchmarkGame/puzzle - 59.2% speedup SingleSource/UnitTests/Vector/multiplies - 57.7% speedup SingleSource/Benchmarks/Misc/flops-7 - 50.75% speedup The largest three performance slowdowns are: MultiSource/Benchmarks/MiBench&...
2015 Sep 15
7
RFC: speedups with instruction side-data (ADCE, perhaps others?)
...use. > I agree that the approach does not scale/generalize well, and we should try to find an alternative if possible. Now *if* it is the only way to improve performance significantly, we might have to weigh the tradeoff. Does anyone have any concrete alternative suggestions to achieve the speedup demonstrated here? —Owen
2004 Apr 27
0
[LLVMdev] LLVM benchmarks against GCC
...--------------- > 1. Programs/External: > > a) CBE code is already comparable with GCC code > (some tests are slower, but some quicker.) > b) LLC code is still rather slower than GCC code This is about right. With the CBE, we are *consistently* faster on 179.art (a 2-2.5x speedup), 252.eon (~20% speedup), 255.vortex (~15% speedup), and 130.li (~20% speedup). Some of the other benchmarks we lag behind, others are extremely noisy. LLC generates code that is generally pretty slow compared to the CBE on X86. This is largely due to the lack of a global register allocator for floati...
2010 May 18
2
[LLVMdev] selection dag speedups / llc speedups
> The fast and local register allocators are meant to be used on unoptimized code, a 'Debug build'. While they do work on optimized code, they do not give good results. Their primary goal is compile time, not code quality. Yes, we have a somewhat uncommon use case. It is fine to spend time optimizing bitcode (LTO is OK), but we want to make the final IL -> Executable translation