thr3ads.net - search: "tsvc"

Displaying 20 results from an estimated 87 matches for "tsvc".

[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?

2015 Feb 26

Hi all, I've started looking at the GlobalMerge pass, enabled by default on ARM and AArch64. I think we should reconsider that, at least for AArch64. As is, the pass just merges all globals together, in groups of 4KB (AArch64, 128B on ARM). At the time it was enabled, the general thinking was "it's almost free, it doesn't affect performance much, we might as well use it".

[LLVMdev] Enabling the SLP-vectorizer by default for -O3

2013 Jul 28

...e is not much we can do at the IR-level to predict this.
Performance Regressions - Compile Time | Δ | Previous | Current | σ
MultiSource/Benchmarks/VersaBench/beamformer/beamformer | 18.98% | 0.0722 | 0.0859 | 0.0003
MultiSource/Benchmarks/FreeBench/pifft/pifft | 5.66% | 0.5003 | 0.5286 | 0.0015
MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt | 4.85% | 0.4084 | 0.4282 | 0.0014
MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt | 4.36% | 0.3856 | 0.4024 | 0.0018
MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt | 2.62% | 0.4424 | 0.4540 | 0.0019
External/SPEC/CINT2006/401_bzip2/401_bzip2 | 1...

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 14

...would like to hear what others in the community think about this and give other people the opportunity to perform their own performance measurements.
Performance Gains
SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68%
MultiSource/Benchmarks/Olden/power/power -18.55%
MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71%
SingleSource/Benchmarks/Misc/flops-6 -11.02%
SingleSource/Benchmarks/Misc/flops-5 -10.03%
MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37%
External/Nurbs/nurbs -7.98%
SingleSource/Benchmarks/Misc/pi -7.29%
External/SPEC/CINT...

[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

2013 Jul 28

...lvm.org/pipermail/llvm-dev/attachments/20130727/ff57e511/attachment.html>
-------------- next part --------------
name | exec_was | exec_is | exec_diff
Benchmarks/TSVC/Symbolics-flt/Symbolics-flt | 1.4634 | 0.684 | -53.259532595326
Benchmarks/MiBench/security-sha/security-sh | 0.0199 | 0.0128 | -35.678391959799
Benchmarks/mediabench/adpcm/rawcaudio/rawca | 0.0034 | 0.0025 | -26.470588235294
Benchmarks/Prolangs-C/agrep/agrep | 0.0032...
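
The exec_diff values in the snippet above appear to be the relative change from exec_was to exec_is, expressed in percent; the three complete rows are consistent with that reading. A minimal sketch under that assumption (the function name `percent_change` is hypothetical and does not come from the thread; benchmark names are copied as truncated in the snippet):

```python
def percent_change(exec_was: float, exec_is: float) -> float:
    """Relative change from exec_was to exec_is, in percent.

    Assumed interpretation of the exec_diff column; a negative value
    means the new execution time is lower than the old one.
    """
    if exec_was == 0:
        raise ValueError("exec_was must be non-zero")
    return (exec_is - exec_was) / exec_was * 100.0


# Complete rows from the snippet: (name, exec_was, exec_is)
rows = [
    ("Benchmarks/TSVC/Symbolics-flt/Symbolics-flt", 1.4634, 0.684),
    ("Benchmarks/MiBench/security-sha/security-sh", 0.0199, 0.0128),
    ("Benchmarks/mediabench/adpcm/rawcaudio/rawca", 0.0034, 0.0025),
]

for name, was, now in rows:
    print(f"{name}: {percent_change(was, now):.6f}")
```

The printed values can be compared against the exec_diff column to check whether the assumed formula matches the reported numbers.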

[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

2013 Jul 18

Andy and I briefly discussed this the other day; we have not yet had a chance to list a detailed pass order for the pre- and post-IPO scalar optimizations. This is the wish list we have in mind: pre-IPO: based on the ordering he proposes, get rid of the inlining (or just inline tiny functions), get rid of all loop xforms... post-IPO: get rid of inlining, or maybe we still need it, only

[LLVMdev] TSVC/Equivalencing-dbl

2012 Oct 05

PS: Here's how I can reproduce with clang on linux:
    clang -S -o tsc.ll -O0 -flto -std=gnu99 tsc.c
    clang -S -o dummy.ll -O0 -flto -std=gnu99 dummy.c
    opt -std-compile-opts tsc.ll -S -o tsc.1.ll
    opt -std-compile-opts dummy.ll -S -o dummy.1.ll
    llvm-link tsc.1.ll dummy.1.ll -S -o total.ll
    opt -std-link-opts total.ll -S -o total.1.ll
    llc total.1.ll
    gcc -o z total.1.s
The program

[RFC] Delaying phi-to-select transformation until later in the pass pipeline

2018 Aug 14

...-O3 with and without the patch linked above (using trunk llvm from a week or so ago). AArch64 results on ARM Cortex-A72:
Performance Regressions - execution_time | Change
SingleSource/Benchmarks/Shootout/Shootout-ary3 | 9.48%
MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt | 3.79%
SingleSource/Benchmarks/CoyoteBench/huffbench | 1.40%
Performance Improvements - execution_time | Change
MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl | -23.74%
External/SPEC/CINT...

Compare test-suite benchmarks performance complied without TBAA, with default TBAA and with new TBAA struct path

2018 Apr 26

...------------------------------|--------|--------|--------|--------|
|Bitcode/Benchmarks/Halide/blur/halide_blur.test | 239.61 | 239.62 | 413.65 | 413.65 |
|SingleSource/Benchmarks/Misc/himenobmtxpa.test | 64.58 | 64.97 | 219.74 | 219.74 |
|MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt.t| 46.74 | 47.04 | 48.01 | 48.01 |
|MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test | 41.32 | 41.57 | 54.97 | 54.97 |
|SingleSource/Benchmarks/Dhrystone/dry.test | 20.02 | 20.02 | 11.54 | 11.54 |
|SingleSource/Be...

[LLVMdev] TSVC/Equivalencing-dbl

2012 Oct 07

...n spend some time to look into the problem. Thanks, Shivaram -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel Sent: Saturday, October 06, 2012 1:57 AM To: Duncan Sands Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] TSVC/Equivalencing-dbl ----- Original Message ----- > From: "Duncan Sands" <duncan.sands at gmail.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: llvmdev at cs.uiuc.edu > Sent: Friday, October 5, 2012 2:50:06 PM > Subject: Re: TSVC/Equivalencing-dbl ...

[LLVMdev] [Polly] Performance comparison between Cloog and ISL code generation

2013 Sep 25

...formance comparison between Polly's Cloog and ISL code generator is posted on http://188.40.87.11:8000/db_default/v4/nts/59?compare_to=58&baseline=58 It seems their execution-time performance is comparable:
Performance Regressions - Execution Time (ISL over Cloog)
MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt 8.49%
MultiSource/Benchmarks/TSVC/StatementReordering-flt/StatementReordering-flt 6.77%
MultiSource/Benchmarks/TSVC/CrossingThresholds-flt/CrossingThresholds-flt 2.65%
SingleSource/UnitTests/Vectorizer/gcc-loops 2.63%
Performance Improvements - Execution Time (ISL...

[LLVMdev] TSVC/Equivalencing-dbl

2012 Oct 05

----- Original Message ----- > From: "Duncan Sands" <duncan.sands at gmail.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: llvmdev at cs.uiuc.edu > Sent: Friday, October 5, 2012 12:10:03 PM > Subject: Re: TSVC/Equivalencing-dbl > > Oops, I ran the testsuite wrong: read clang output for dragonegg > output. Okay, can you resummarize? Do you mean that? gcc -O0: S1421 0.00 16000 gcc -O0 under valgrind: S1421 0.00 17208.404325315 clang: S1421 0....

[LLVMdev] TSVC/Equivalencing-dbl

2012 Oct 05

----- Original Message ----- > From: "Duncan Sands" <duncan.sands at gmail.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: llvmdev at cs.uiuc.edu > Sent: Friday, October 5, 2012 2:50:06 PM > Subject: Re: TSVC/Equivalencing-dbl > > Hi Hal, > > On 05/10/12 20:32, Hal Finkel wrote: > > ----- Original Message ----- > >> From: "Duncan Sands" <duncan.sands at gmail.com> > >> To: "Hal Finkel" <hfinkel at anl.gov> > >> Cc: llvmdev at...

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 15

...it should be straight-forward. It would also be really useful to see what the code size and compile time impact is. -Chris > > — Performance Gains — > SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% > MultiSource/Benchmarks/Olden/power/power -18.55% > MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% > SingleSource/Benchmarks/Misc/flops-6 -11.02% > SingleSource/Benchmarks/Misc/flops-5 -10.03% > MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% > External/Nurbs/nurbs -7.98% > SingleSource/Benchmarks/Misc/pi...

[LLVMdev] Enabling the SLP-vectorizer by default for -O3

2013 Jul 28

...at the IR-level to > predict this. > > > > Performance Regressions - Compile Time Δ Previous Current σ > MultiSource/Benchmarks/VersaBench/beamformer/beamformer 18.98% 0.0722 0.0859 > 0.0003 MultiSource/Benchmarks/FreeBench/pifft/pifft 5.66% 0.5003 0.5286 0.0015 > MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt 4.85% > 0.4084 0.4282 0.0014 > MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt > 4.36% 0.3856 0.4024 0.0018 > MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt 2.62% 0.4424 > 0.4540 0.0019 External/SPEC/CINT2006/401_bz...

[LLVMdev] TSVC/Equivalencing-dbl

2012 Oct 05

Hi Hal, I was looking into why this fails with dragonegg, and noticed the following: if I compile with GCC (-O0) then I get as output:
Running each loop 3125 times...
Loop | Time(Sec) | Checksum
S421 | 0.00 | 32010.620068485
S1421 | 0.00 | 16000
S422 | 0.00 | 3.7377231414078
S423 | 0.00 | 32000.736895702
S424 | 0.00 | 32822.36069424
This is the same as the reference output. If I run exactly the

[LLVMdev] TSVC/Equivalencing-dbl

2012 Oct 05

Oops, I ran the testsuite wrong: read clang output for dragonegg output.

[LLVMdev] TSVC/Equivalencing-dbl

2012 Oct 05

..., Hal Finkel wrote: > ----- Original Message ----- >> From: "Duncan Sands" <duncan.sands at gmail.com> >> To: "Hal Finkel" <hfinkel at anl.gov> >> Cc: llvmdev at cs.uiuc.edu >> Sent: Friday, October 5, 2012 12:10:03 PM >> Subject: Re: TSVC/Equivalencing-dbl >> >> Oops, I ran the testsuite wrong: read clang output for dragonegg >> output. > > Okay, can you resummarize? Do you mean that? > > gcc -O0: > S1421 0.00 16000 > > gcc -O0 under valgrind: > S1421 0.00...

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 15

...would also be really useful to see what the code size and compile time impact is. > > -Chris > >> >> — Performance Gains — >> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% >> MultiSource/Benchmarks/Olden/power/power -18.55% >> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% >> SingleSource/Benchmarks/Misc/flops-6 -11.02% >> SingleSource/Benchmarks/Misc/flops-5 -10.03% >> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% >> External/Nurbs/nurbs -7.98% >> SingleSource...

[RFC] Delaying phi-to-select transformation until later in the pass pipeline

2018 Aug 15

...ked > above (using > trunk llvm from a week or so ago). > > AArch64 results on ARM Cortex-A72: > > Performance Regressions - execution_time Change > SingleSource/Benchmarks/Shootout/Shootout-ary3 9.48% > MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt > 3.79% > SingleSource/Benchmarks/CoyoteBench/huffbench 1.40% > > Performance Improvements - execution_time Change > MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl > -23.74% ...

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 23

...ee what the code size and compile time impact is. >> >> -Chris >> >>> >>> — Performance Gains — >>> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68% >>> MultiSource/Benchmarks/Olden/power/power -18.55% >>> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71% >>> SingleSource/Benchmarks/Misc/flops-6 -11.02% >>> SingleSource/Benchmarks/Misc/flops-5 -10.03% >>> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37% >>> External/Nurbs/nurbs -7.98% ...
