Displaying 20 results from an estimated 21 matches for "loopreroll".

2013 Jul 14
6
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...like to hear what others in the community think about this and give other people the opportunity to perform their own performance measurements. — Performance Gains — SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% MultiSource/Benchmarks/Olden/power/power -18.55% MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% SingleSource/Benchmarks/Misc/flops-6 -11.02% SingleSource/Benchmarks/Misc/flops-5 -10.03% MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% External/Nurbs/nurbs -7.98% SingleSource/Benchmarks/Misc/pi -7.29% External/SPEC/CINT2000/252_eo...
2013 Jul 15
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...ould be straight-forward. It would also be really useful to see what the code size and compile time impact is. -Chris > > — Performance Gains — > SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% > MultiSource/Benchmarks/Olden/power/power -18.55% > MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% > SingleSource/Benchmarks/Misc/flops-6 -11.02% > SingleSource/Benchmarks/Misc/flops-5 -10.03% > MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% > External/Nurbs/nurbs -7.98% > SingleSource/Benchmarks/Misc/pi -7.29% >...
2013 Jul 28
2
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
...us Current σ SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.67% 1.4064 0.6516 0.0007 External/Nurbs/nurbs -19.47% 2.5389 2.0445 0.0029 MultiSource/Benchmarks/Olden/power/power -18.49% 1.2572 1.0248 0.0004 SingleSource/Benchmarks/Misc/flops-4 -15.93% 0.7767 0.6530 0.0348 MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.72% 2.3925 2.0404 0.0013 SingleSource/Benchmarks/Misc/flops-6 -11.05% 1.1427 1.0164 0.0009 SingleSource/Benchmarks/Misc/flops-5 -10.43% 1.2771 1.1439 0.0015 MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.10% 2.3468 2.1568 0.0195 SingleSource/Be...
2013 Jul 15
3
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...also be really useful to see what the code size and compile time impact is. > > -Chris > >> >> — Performance Gains — >> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% >> MultiSource/Benchmarks/Olden/power/power -18.55% >> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% >> SingleSource/Benchmarks/Misc/flops-6 -11.02% >> SingleSource/Benchmarks/Misc/flops-5 -10.03% >> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% >> External/Nurbs/nurbs -7.98% >> SingleSource/Benchmarks...
2013 Jul 23
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...at the code size and compile time impact is. >> >> -Chris >> >>> >>> — Performance Gains — >>> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% >>> MultiSource/Benchmarks/Olden/power/power -18.55% >>> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% >>> SingleSource/Benchmarks/Misc/flops-6 -11.02% >>> SingleSource/Benchmarks/Misc/flops-5 -10.03% >>> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% >>> External/Nurbs/nurbs -7.98% >>> Sin...
2013 Jul 14
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...> community think about this and give other people the opportunity to perform > their own performance measurements. > > — Performance Gains — > SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% > MultiSource/Benchmarks/Olden/power/power -18.55% > MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% > SingleSource/Benchmarks/Misc/flops-6 -11.02% > SingleSource/Benchmarks/Misc/flops-5 -10.03% > MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt > -8.37% > External/Nurbs/nurbs -7.98% > SingleSource/Benchmarks/Misc/pi -7.29%...
2013 Jul 28
0
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
...nt σ > SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.67% 1.4064 0.6516 0.0007 > External/Nurbs/nurbs -19.47% 2.5389 2.0445 0.0029 > MultiSource/Benchmarks/Olden/power/power -18.49% 1.2572 1.0248 0.0004 > SingleSource/Benchmarks/Misc/flops-4 -15.93% 0.7767 0.6530 0.0348 > MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.72% 2.3925 2.0404 0.0013 > SingleSource/Benchmarks/Misc/flops-6 -11.05% 1.1427 1.0164 0.0009 > SingleSource/Benchmarks/Misc/flops-5 -10.43% 1.2771 1.1439 0.0015 > MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.10% 2.3468 2.1568 0.0195 > SingleSourc...
2013 Sep 25
0
[LLVMdev] [Polly] Performance comparison between Cloog and ISL code generation
...g) MultiSource/Benchmarks/ASC_Sequoia/AMGmk/AMGmk -69.11% MultiSource/Benchmarks/Trimaran/netbench-crc/netbench-crc -44.39% SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm -12.74% SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemm/gemm -11.21% MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -11.14% SingleSource/Benchmarks/Polybench/linear-algebra/kernels/syr2k/syr2k -11.11% MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt -10.87% MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -10.87% SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm...
2015 Jan 30
0
[LLVMdev] About user of bitcast/GEP instruction
...is that pass for looking through those instructions to find lifetime intrinsics, but that's all. > I did some searching in llvm/test folder, > it seems to be true. > bitcasts and GEPs are general-purpose instructions. GEPs are most often used by loads/stores. Look at test/Transforms/LoopReroll/basic.ll for an example with lots of them. test/Analysis/BasicAA/gep-alias.ll is another place to look. > > However, by reading LLVM IR manual, I don't see any restriction > stated on the possible user of bitcast/GEP instruction. So my > question is who impose the restriction ? Is...
2015 Jan 30
3
[LLVMdev] About user of bitcast/GEP instruction
Hi, PromoteMemoryToRegister.cpp seems to rely on the fact that the only users of bitcast/GEP instructions are lifetime intrinsics (llvm.lifetime.start/end). I did some searching in the llvm/test folder, and this seems to be true. However, reading the LLVM IR manual, I don't see any restriction stated on the possible users of bitcast/GEP instructions. So my question is: who imposes the restriction?
2015 Feb 26
5
[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?
Hi all, I've started looking at the GlobalMerge pass, enabled by default on ARM and AArch64. I think we should reconsider that, at least for AArch64. As is, the pass just merges all globals together, in groups of 4KB (AArch64, 128B on ARM). At the time it was enabled, the general thinking was "it's almost free, it doesn't affect performance much, we might as well use it".
2015 Jan 30
1
[LLVMdev] About user of bitcast/GEP instruction
...e instructions to find lifetime intrinsics, but that's > all. > > > I did some searching in llvm/test folder, > > it seems to be true. > > > > bitcasts and GEPs are general-purpose instructions. GEPs are most often > used by loads/stores. Look at test/Transforms/LoopReroll/basic.ll for an > example with lots of them. test/Analysis/BasicAA/gep-alias.ll is another > place to look. > > > > > However, by reading LLVM IR manual, I don't see any restriction > > stated on the possible user of bitcast/GEP instruction. So my > > question i...
2018 Feb 06
2
[RFC] Make LoopVectorize Aware of SLP Operations
...using compound values 3. Add support for vectorizing compound operations to VPlan Please note that the example in the previous section is intentionally kept simple and one could argue it could be handled by doing loop rerolling as a canonicalization step before LoopVectorize (currently LoopReroll does not handle this case, but it could be naturally extended to do so). However, the main advantage of making LoopVectorize aware of SLP opportunities is that it will allow us to deal with more complex cases like * loops where some vectors need to be transformed, for example where differ...
2018 Feb 08
0
[RFC] Make LoopVectorize Aware of SLP Operations
Hi Florian! This proposal sounds pretty exciting! Integrating SLP-aware loop vectorization (or the other way around) and SLP into the VPlan framework is definitely aligned with the long term vision and we would prefer this approach to the LoopReroll and InstCombine alternatives that you mentioned. We prefer a generic implementation that can handle complicated cases to something ad-hoc for some simple ones. Because of this, we would have some comments regarding the design that you propose: > 1. Detect loops containing SLP opportunities (o...
2016 Jul 21
2
RFC: Strong GC References in LLVM
...t can be lazy, and should be an analysis, i'm, again, > really not sure which passes you are thinking about here that do code > sinking/speculation that won't need it. > > Here's the list definitely needing it right now: > GVN > GVNHoist > LICM > LoadCombine > LoopReroll > LoopUnswitch > LoopVersioningLICM > MemCpyOptimizer > MergedLoadStoreMotion > Sink > > The list is almost certainly larger than this, this was a pretty trivial > grep and examination. > (and doesn't take into account bugs, etc) > > (Note, this list is stuff i...
2016 Jul 21
3
RFC: Strong GC References in LLVM
> On Jul 21, 2016, at 7:45 AM, Philip Reames <listmail at philipreames.com> wrote: > > Joining in very late, but the tangent here has been interesting (if rather OT for the original thread). > > I agree with Danny that we might want to take a close look at how we model things like maythrow calls, no return, and other implicit control flow. I'm not convinced that moving
2016 Jul 21
4
RFC: Strong GC References in LLVM
...ysis, i'm, again, >> really not sure which passes you are thinking about here that do code >> sinking/speculation that won't need it. >> >> Here's the list definitely needing it right now: >> GVN >> GVNHoist >> LICM >> LoadCombine >> LoopReroll >> LoopUnswitch >> LoopVersioningLICM >> MemCpyOptimizer >> MergedLoadStoreMotion >> Sink >> >> The list is almost certainly larger than this, this was a pretty trivial >> grep and examination. >> (and doesn't take into account bugs, etc) >...
2013 Jul 28
0
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
...862 -1.5981735159817 Applications/ClamAV/clamscan 0.094 0.0925 -1.5957446808510 Benchmarks/McCat/17-bintr/bintr 0.0666 0.0658 -1.2012012012012 Benchmarks/MiBench/automotive-susan/automot 0.0312 0.0309 -0.9615384615384 Benchmarks/TSVC/LoopRerolling-dbl/LoopRerol 2.7783 2.7524 -0.9322247417485 Benchmarks/SciMark2-C/scimark2 22.2684 22.0824 -0.8352643207414 Benchmarks/mediabench/g721/g721encode/encod 0.0403 0.04 -0.7444168734491 Benchmarks/ASC_Sequoia/AMGmk/AMGmk 5.0381 5.0033...
2013 Jul 18
3
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
Andy and I briefly discussed this the other day; we have not yet had a chance to list a detailed pass order for the pre- and post-IPO scalar optimizations. This is the wish list we have in mind: pre-IPO: based on the ordering he proposes, get rid of the inlining (or just inline tiny funcs), get rid of all loop xforms... post-IPO: get rid of inlining, or maybe we still need it, only
2018 Apr 26
0
Compare test-suite benchmarks performance compiled without TBAA, with default TBAA and with new TBAA struct path
...70111|2.093725046| -0.16|17747770107| 0|2.089922083| 0.02|17747770106| 0| |MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt.test | 40|1.426366572|15128429282|1.421173178| 0.37|15128429276| 0|1.421455642| 0.35|15128429276| 0| |MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl.test | 40|2.479358186|12728810214|2.476810043| 0.1|12728810217| 0|2.479956135| -0.02|12728810217| 0| |MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt.test | 40|1.749389689| 9799621839|1.751205696| -0.1| 9799621843| 0|1....