search for: looprerolling

Displaying 20 results from an estimated 21 matches for "looprerolling".

2013 Jul 14
6
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...like to hear what others in the community think about this and give other people the opportunity to perform their own performance measurements. — Performance Gains — SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% MultiSource/Benchmarks/Olden/power/power -18.55% MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% SingleSource/Benchmarks/Misc/flops-6 -11.02% SingleSource/Benchmarks/Misc/flops-5 -10.03% MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% External/Nurbs/nurbs -7.98% SingleSource/Benchmarks/Misc/pi -7.29% External/SPEC/CINT2000/252_eon/2...
2013 Jul 15
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...ould be straight-forward. It would also be really useful to see what the code size and compile time impact is. -Chris > > — Performance Gains — > SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% > MultiSource/Benchmarks/Olden/power/power -18.55% > MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% > SingleSource/Benchmarks/Misc/flops-6 -11.02% > SingleSource/Benchmarks/Misc/flops-5 -10.03% > MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% > External/Nurbs/nurbs -7.98% > SingleSource/Benchmarks/Misc/pi -7.29% > Ex...
2013 Jul 28
2
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
...us Current σ SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.67% 1.4064 0.6516 0.0007 External/Nurbs/nurbs -19.47% 2.5389 2.0445 0.0029 MultiSource/Benchmarks/Olden/power/power -18.49% 1.2572 1.0248 0.0004 SingleSource/Benchmarks/Misc/flops-4 -15.93% 0.7767 0.6530 0.0348 MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.72% 2.3925 2.0404 0.0013 SingleSource/Benchmarks/Misc/flops-6 -11.05% 1.1427 1.0164 0.0009 SingleSource/Benchmarks/Misc/flops-5 -10.43% 1.2771 1.1439 0.0015 MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.10% 2.3468 2.1568 0.0195 SingleSource/Bench...
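The percentages in the excerpt above appear to be relative changes between the two timing columns. Below is a minimal Python sketch of that arithmetic. It assumes the first numeric column is the earlier time and the second is the current one; the header is truncated in the snippet, so that reading is an assumption. `pct_change` is a hypothetical helper for illustration, not part of the LLVM test-suite or LNT.

```python
def pct_change(previous: float, current: float) -> float:
    """Relative change from previous to current, in percent (negative means faster)."""
    return (current - previous) / previous * 100.0


# Rows copied from the snippet above: (benchmark, first column, second column).
rows = [
    ("SingleSource/Benchmarks/Misc/matmul_f64_4x4", 1.4064, 0.6516),
    ("External/Nurbs/nurbs", 2.5389, 2.0445),
    ("MultiSource/Benchmarks/Olden/power/power", 1.2572, 1.0248),
    ("MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt", 2.3925, 2.0404),
]

for name, previous, current in rows:
    # Two decimals, matching the precision used in the table.
    print(f"{name} {pct_change(previous, current):+.2f}%")
```

Rounded to two decimals, these rows reproduce the percentages listed in the snippet, which is consistent with the assumed column order.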
2013 Jul 15
3
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...also be really useful to see what the code size and compile time impact is. > > -Chris > >> >> — Performance Gains — >> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% >> MultiSource/Benchmarks/Olden/power/power -18.55% >> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% >> SingleSource/Benchmarks/Misc/flops-6 -11.02% >> SingleSource/Benchmarks/Misc/flops-5 -10.03% >> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% >> External/Nurbs/nurbs -7.98% >> SingleSource/Benchmarks/Mi...
2013 Jul 23
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...at the code size and compile time impact is. >> >> -Chris >> >>> >>> — Performance Gains — >>> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% >>> MultiSource/Benchmarks/Olden/power/power -18.55% >>> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% >>> SingleSource/Benchmarks/Misc/flops-6 -11.02% >>> SingleSource/Benchmarks/Misc/flops-5 -10.03% >>> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% >>> External/Nurbs/nurbs -7.98% >>> Single...
2013 Jul 14
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...> community think about this and give other people the opportunity to perform > their own performance measurements. > > — Performance Gains — > SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% > MultiSource/Benchmarks/Olden/power/power -18.55% > MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% > SingleSource/Benchmarks/Misc/flops-6 -11.02% > SingleSource/Benchmarks/Misc/flops-5 -10.03% > MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt > -8.37% > External/Nurbs/nurbs -7.98% > SingleSource/Benchmarks/Misc/pi -7.29% > ...
2013 Jul 28
0
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
...nt σ > SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.67% 1.4064 0.6516 0.0007 > External/Nurbs/nurbs -19.47% 2.5389 2.0445 0.0029 > MultiSource/Benchmarks/Olden/power/power -18.49% 1.2572 1.0248 0.0004 > SingleSource/Benchmarks/Misc/flops-4 -15.93% 0.7767 0.6530 0.0348 > MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.72% > 2.3925 2.0404 0.0013 SingleSource/Benchmarks/Misc/flops-6 -11.05% 1.1427 1.0164 > 0.0009 SingleSource/Benchmarks/Misc/flops-5 -10.43% 1.2771 1.1439 0.0015 > MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt > -8.10% 2.3468 2.1568 0.0195 SingleSource/B...
2013 Sep 25
0
[LLVMdev] [Polly] Performance comparison between Cloog and ISL code generation
...g) MultiSource/Benchmarks/ASC_Sequoia/AMGmk/AMGmk -69.11% MultiSource/Benchmarks/Trimaran/netbench-crc/netbench-crc -44.39% SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm -12.74% SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemm/gemm -11.21% MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -11.14% SingleSource/Benchmarks/Polybench/linear-algebra/kernels/syr2k/syr2k -11.11% MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt -10.87% MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -10.87% SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm -10...
2015 Jan 30
0
[LLVMdev] About user of bitcast/GEP instruction
----- Original Message ----- > From: "guoqing zhang" <gqzhang81 at gmail.com> > To: llvmdev at cs.uiuc.edu > Sent: Friday, January 30, 2015 4:29:16 AM > Subject: [LLVMdev] About user of bitcast/GEP instruction > > Hi, > > > In PromoteMemoryToRegister.cpp, it seems to rely on the fact that the > only users of bitcast/GEP instruction are lifetime
2015 Jan 30
3
[LLVMdev] About user of bitcast/GEP instruction
Hi, In PromoteMemoryToRegister.cpp, it seems to rely on the fact that the only users of bitcast/GEP instructions are lifetime intrinsics (llvm.lifetime.start/end). I did some searching in the llvm/test folder, and it seems to be true. However, reading the LLVM IR manual, I don't see any restriction stated on the possible users of a bitcast/GEP instruction. So my question is: who imposes the restriction?
2015 Feb 26
5
[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?
Hi all, I've started looking at the GlobalMerge pass, enabled by default on ARM and AArch64. I think we should reconsider that, at least for AArch64. As is, the pass just merges all globals together, in groups of 4KB (AArch64, 128B on ARM). At the time it was enabled, the general thinking was "it's almost free, it doesn't affect performance much, we might as well use it".
2015 Jan 30
1
[LLVMdev] About user of bitcast/GEP instruction
Hi, If the special handling in the mem2reg pass is to look for lifetime intrinsics, shouldn't it cast to <IntrinsicInst> and then use getIntrinsicID to check for lifetime_start and lifetime_end? The thing that I don't understand is the following piece of code, which finds all the users and casts them to <Instruction>, then calls eraseFromParent(). How can this guarantee that it only
2018 Feb 06
2
[RFC] Make LoopVectorize Aware of SLP Operations
Hello, We would like to propose making LoopVectorize aware of SLP operations, to improve the generated code for loops operating on struct fields or doing complex math. At the moment, LoopVectorize uses interleaving to vectorize loops that operate on values loaded/stored from consecutive addresses: vector loads/stores are generated to combine consecutive loads/stores and then shufflevector
2018 Feb 08
0
[RFC] Make LoopVectorize Aware of SLP Operations
Hi Florian! This proposal sounds pretty exciting! Integrating SLP-aware loop vectorization (or the other way around) and SLP into the VPlan framework is definitely aligned with the long term vision and we would prefer this approach to the LoopReroll and InstCombine alternatives that you mentioned. We prefer a generic implementation that can handle complicated cases to something ad-hoc for some
2016 Jul 21
2
RFC: Strong GC References in LLVM
On Thu, Jul 21, 2016 at 9:39 AM, Daniel Berlin <dberlin at dberlin.org> wrote: > > > On Thu, Jul 21, 2016 at 9:26 AM, Andrew Trick <atrick at apple.com> wrote: > >> >> On Jul 21, 2016, at 7:45 AM, Philip Reames <listmail at philipreames.com> >> wrote: >> >> Joining in very late, but the tangent here has been interesting (if >>
2016 Jul 21
3
RFC: Strong GC References in LLVM
> On Jul 21, 2016, at 7:45 AM, Philip Reames <listmail at philipreames.com> wrote: > > Joining in very late, but the tangent here has been interesting (if rather OT for the original thread). > > I agree with Danny that we might want to take a close look at how we model things like maythrow calls, no return, and other implicit control flow. I'm not convinced that moving
2016 Jul 21
4
RFC: Strong GC References in LLVM
Okay, so it sounds like it might actually be better to be even more low level, call it "ExtendedBBInfo" or something, and rename what it provides to be more clearly structural: A. Inst * FirstIsGuaranteedToTransferExecutionToSuccessor(BB) (naming bikeshed open on this one :P) B. Inst * LastIsGuaranteedToTransferExecutionToSuccessor(BB) C. Inst *FirstMayThrow(BB) D. Inst
2013 Jul 28
0
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
...862 -1.5981735159817 Applications/ClamAV/clamscan 0.094 0.0925 -1.5957446808510 Benchmarks/McCat/17-bintr/bintr 0.0666 0.0658 -1.2012012012012 Benchmarks/MiBench/automotive-susan/automot 0.0312 0.0309 -0.9615384615384 Benchmarks/TSVC/LoopRerolling-dbl/LoopRerol 2.7783 2.7524 -0.9322247417485 Benchmarks/SciMark2-C/scimark2 22.2684 22.0824 -0.8352643207414 Benchmarks/mediabench/g721/g721encode/encod 0.0403 0.04 -0.7444168734491 Benchmarks/ASC_Sequoia/AMGmk/AMGmk 5.0381 5.0033 -0...
2013 Jul 18
3
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
Andy and I briefly discussed this the other day; we have not yet had a chance to list a detailed pass order for the pre- and post-IPO scalar optimizations. This is the wish-list we have in mind: pre-IPO: based on the ordering he proposes, get rid of the inlining (or just inline tiny funcs), get rid of all loop xforms... post-IPO: get rid of inlining, or maybe we still need it, only
2018 Apr 26
0
Compare test-suite benchmarks performance compiled without TBAA, with default TBAA and with new TBAA struct path
...70111|2.093725046| -0.16|17747770107| 0|2.089922083| 0.02|17747770106| 0| |MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt.test | 40|1.426366572|15128429282|1.421173178| 0.37|15128429276| 0|1.421455642| 0.35|15128429276| 0| |MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl.test | 40|2.479358186|12728810214|2.476810043| 0.1|12728810217| 0|2.479956135| -0.02|12728810217| 0| |MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt.test | 40|1.749389689| 9799621839|1.751205696| -0.1| 9799621843| 0|1.746...