search for: tsvc

Displaying 20 results from an estimated 87 matches for "tsvc".

2015 Feb 26
5
[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?
Hi all, I've started looking at the GlobalMerge pass, enabled by default on ARM and AArch64. I think we should reconsider that, at least for AArch64. As is, the pass just merges all globals together, in groups of 4KB (AArch64, 128B on ARM). At the time it was enabled, the general thinking was "it's almost free, it doesn't affect performance much, we might as well use it".
2013 Jul 28
2
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
...e is not much we can do at the IR-level to predict this.
Performance Regressions - Compile Time
                                                                         Δ       Previous  Current  σ
MultiSource/Benchmarks/VersaBench/beamformer/beamformer                  18.98%  0.0722    0.0859   0.0003
MultiSource/Benchmarks/FreeBench/pifft/pifft                             5.66%   0.5003    0.5286   0.0015
MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt    4.85%   0.4084    0.4282   0.0014
MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt  4.36%   0.3856    0.4024   0.0018
MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt              2.62%   0.4424    0.4540   0.0019
External/SPEC/CINT2006/401_bzip2/401_bzip2                               1...
2013 Jul 14
6
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...would like to hear what others in the community think about this and give other people the opportunity to perform their own performance measurements. — Performance Gains — SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% MultiSource/Benchmarks/Olden/power/power -18.55% MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% SingleSource/Benchmarks/Misc/flops-6 -11.02% SingleSource/Benchmarks/Misc/flops-5 -10.03% MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% External/Nurbs/nurbs -7.98% SingleSource/Benchmarks/Misc/pi -7.29% External/SPEC/CINT...
2013 Jul 28
0
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
...lvm.org/pipermail/llvm-dev/attachments/20130727/ff57e511/attachment.html>
-------------- next part --------------
name                                        exec_was   exec_is    exec_diff
------------------------------------------- ---------- ---------- ----------------
Benchmarks/TSVC/Symbolics-flt/Symbolics-flt 1.4634     0.684      -53.259532595326
Benchmarks/MiBench/security-sha/security-sh 0.0199     0.0128     -35.678391959799
Benchmarks/mediabench/adpcm/rawcaudio/rawca 0.0034     0.0025     -26.470588235294
Benchmarks/Prolangs-C/agrep/agrep           0.0032...
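The exec_diff column in the snippet above appears to be the relative change from exec_was to exec_is, in percent. The following is a minimal sketch under that assumption. The function name `percent_change` is hypothetical and not taken from the thread, and the benchmark names are copied as they appear, including their truncation.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent.

    For execution times, a negative result means the run got faster.
    """
    if before == 0:
        raise ValueError("baseline value must be non-zero")
    return (after - before) / before * 100.0


# (exec_was, exec_is) pairs copied from the snippet; names are truncated there.
rows = {
    "Benchmarks/TSVC/Symbolics-flt/Symbolics-flt": (1.4634, 0.684),
    "Benchmarks/MiBench/security-sha/security-sh": (0.0199, 0.0128),
    "Benchmarks/mediabench/adpcm/rawcaudio/rawca": (0.0034, 0.0025),
}

for name, (was, now) in rows.items():
    print(f"{name:<43} {was:<10} {now:<10} {percent_change(was, now):.6f}")
```

Under this reading, the Symbolics-flt row works out to roughly -53.26%, which agrees with the exec_diff value reported in the snippet.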
2013 Jul 18
3
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
Andy and I briefly discussed this the other day, we have not yet got chance to list a detailed pass order for the pre- and post- IPO scalar optimizations. This is wish-list in our mind: pre-IPO: based on the ordering he propose, get rid of the inlining (or just inline tiny func), get rid of all loop xforms... post-IPO: get rid of inlining, or maybe we still need it, only
2012 Oct 05
0
[LLVMdev] TSVC/Equivalencing-dbl
PS: Here's how I can reproduce with clang on linux:
  clang -S -o tsc.ll -O0 -flto -std=gnu99 tsc.c
  clang -S -o dummy.ll -O0 -flto -std=gnu99 dummy.c
  opt -std-compile-opts tsc.ll -S -o tsc.1.ll
  opt -std-compile-opts dummy.ll -S -o dummy.1.ll
  llvm-link tsc.1.ll dummy.1.ll -S -o total.ll
  opt -std-link-opts total.ll -S -o total.1.ll
  llc total.1.ll
  gcc -o z total.1.s
The program
2018 Aug 14
3
[RFC] Delaying phi-to-select transformation until later in the pass pipeline
...-O3 with and without the patch linked above (using trunk llvm from a week or so ago). AArch64 results on ARM Cortex-A72: Performance Regressions - execution_time Change SingleSource/Benchmarks/Shootout/Shootout-ary3 9.48% MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt 3.79% SingleSource/Benchmarks/CoyoteBench/huffbench 1.40% Performance Improvements - execution_time Change MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -23.74% External/SPEC/CINT...
2018 Apr 26
0
Compare test-suite benchmarks performance compiled without TBAA, with default TBAA and with new TBAA struct path
...------------------------------|--------|--------|--------|--------| |Bitcode/Benchmarks/Halide/blur/halide_blur.test | 239.61 | 239.62 | 413.65 | 413.65 | |SingleSource/Benchmarks/Misc/himenobmtxpa.test | 64.58 | 64.97 | 219.74 | 219.74 | |MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt.t| 46.74 | 47.04 | 48.01 | 48.01 | |MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test | 41.32 | 41.57 | 54.97 | 54.97 | |SingleSource/Benchmarks/Dhrystone/dry.test | 20.02 | 20.02 | 11.54 | 11.54 | |SingleSource/Be...
2012 Oct 07
0
[LLVMdev] TSVC/Equivalencing-dbl
...n spend some time to look into the problem. Thanks, Shivaram -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel Sent: Saturday, October 06, 2012 1:57 AM To: Duncan Sands Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] TSVC/Equivalencing-dbl ----- Original Message ----- > From: "Duncan Sands" <duncan.sands at gmail.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: llvmdev at cs.uiuc.edu > Sent: Friday, October 5, 2012 2:50:06 PM > Subject: Re: TSVC/Equivalencing-dbl ...
2013 Sep 25
0
[LLVMdev] [Polly] Performance comparison between Cloog and ISL code generation
...formance comparison between Polly's Cloog and ISL code generator is posted on http://188.40.87.11:8000/db_default/v4/nts/59?compare_to=58&baseline=58 It seems their execution-time performance are comparable: Performance Regressions - Execution Time  (ISL over Cloog) MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt 8.49% MultiSource/Benchmarks/TSVC/StatementReordering-flt/StatementReordering-flt 6.77% MultiSource/Benchmarks/TSVC/CrossingThresholds-flt/CrossingThresholds-flt 2.65% SingleSource/UnitTests/Vectorizer/gcc-loops 2.63% Performance Improvements - Execution Time  (ISL...
2012 Oct 05
4
[LLVMdev] TSVC/Equivalencing-dbl
----- Original Message ----- > From: "Duncan Sands" <duncan.sands at gmail.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: llvmdev at cs.uiuc.edu > Sent: Friday, October 5, 2012 12:10:03 PM > Subject: Re: TSVC/Equivalencing-dbl > > Oops, I ran the testsuite wrong: read clang output for dragonegg > output. Okay, can you resummarize? Do you mean that? gcc -O0: S1421 0.00 16000 gcc -O0 under valgrind: S1421 0.00 17208.404325315 clang: S1421 0....
2012 Oct 05
2
[LLVMdev] TSVC/Equivalencing-dbl
----- Original Message ----- > From: "Duncan Sands" <duncan.sands at gmail.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: llvmdev at cs.uiuc.edu > Sent: Friday, October 5, 2012 2:50:06 PM > Subject: Re: TSVC/Equivalencing-dbl > > Hi Hal, > > On 05/10/12 20:32, Hal Finkel wrote: > > ----- Original Message ----- > >> From: "Duncan Sands" <duncan.sands at gmail.com> > >> To: "Hal Finkel" <hfinkel at anl.gov> > >> Cc: llvmdev at...
2013 Jul 15
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...it should be straight-forward. It would also be really useful to see what the code size and compile time impact is. -Chris > > — Performance Gains — > SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% > MultiSource/Benchmarks/Olden/power/power -18.55% > MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% > SingleSource/Benchmarks/Misc/flops-6 -11.02% > SingleSource/Benchmarks/Misc/flops-5 -10.03% > MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% > External/Nurbs/nurbs -7.98% > SingleSource/Benchmarks/Misc/pi...
2013 Jul 28
0
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
...at the IR-level to > predict this. > > > > Performance Regressions - Compile Time Δ Previous Current σ > MultiSource/Benchmarks/VersaBench/beamformer/beamformer 18.98% 0.0722 0.0859 0.0003 > MultiSource/Benchmarks/FreeBench/pifft/pifft 5.66% 0.5003 0.5286 0.0015 > MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt 4.85% 0.4084 0.4282 0.0014 > MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt 4.36% 0.3856 0.4024 0.0018 > MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt 2.62% 0.4424 0.4540 0.0019 > External/SPEC/CINT2006/401_bz...
2012 Oct 05
2
[LLVMdev] TSVC/Equivalencing-dbl
Hi Hal, I was looking into why this fails with dragonegg, and noticed the following: if I compile with GCC (-O0) then I get as output:
  Running each loop 3125 times...
  Loop    Time(Sec)  Checksum
  S421    0.00       32010.620068485
  S1421   0.00       16000
  S422    0.00       3.7377231414078
  S423    0.00       32000.736895702
  S424    0.00       32822.36069424
This is the same as the reference output. If I run exactly the
2012 Oct 05
0
[LLVMdev] TSVC/Equivalencing-dbl
Oops, I ran the testsuite wrong: read clang output for dragonegg output.
2012 Oct 05
0
[LLVMdev] TSVC/Equivalencing-dbl
..., Hal Finkel wrote: > ----- Original Message ----- >> From: "Duncan Sands" <duncan.sands at gmail.com> >> To: "Hal Finkel" <hfinkel at anl.gov> >> Cc: llvmdev at cs.uiuc.edu >> Sent: Friday, October 5, 2012 12:10:03 PM >> Subject: Re: TSVC/Equivalencing-dbl >> >> Oops, I ran the testsuite wrong: read clang output for dragonegg >> output. > > Okay, can you resummarize? Do you mean that? > > gcc -O0: > S1421 0.00 16000 > > gcc -O0 under valgrind: > S1421 0.00...
2013 Jul 15
3
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...would also be really useful to see what the code size and compile time impact is. > > -Chris > >> >> — Performance Gains — >> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% >> MultiSource/Benchmarks/Olden/power/power -18.55% >> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% >> SingleSource/Benchmarks/Misc/flops-6 -11.02% >> SingleSource/Benchmarks/Misc/flops-5 -10.03% >> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% >> External/Nurbs/nurbs -7.98% >> SingleSource...
2018 Aug 15
2
[RFC] Delaying phi-to-select transformation until later in the pass pipeline
...ked > above (using > trunk llvm from a week or so ago). > > AArch64 results on ARM Cortex-A72: > > Performance Regressions - execution_time Change > SingleSource/Benchmarks/Shootout/Shootout-ary3 9.48% > MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt 3.79% > SingleSource/Benchmarks/CoyoteBench/huffbench 1.40% > > Performance Improvements - execution_time Change > MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -23.74% ...
2013 Jul 23
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...ee what the code size and compile time impact is. >> >> -Chris >> >>> >>> — Performance Gains — >>> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% >>> MultiSource/Benchmarks/Olden/power/power -18.55% >>> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% >>> SingleSource/Benchmarks/Misc/flops-6 -11.02% >>> SingleSource/Benchmarks/Misc/flops-5 -10.03% >>> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% >>> External/Nurbs/nurbs -7.98% >...