Displaying 20 results from an estimated 87 matches for "tsvc".
2015 Feb 26
5
[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?
Hi all,
I've started looking at the GlobalMerge pass, enabled by default on
ARM and AArch64. I think we should reconsider that, at least for
AArch64.
As is, the pass just merges all globals together, in groups of 4KB
(AArch64, 128B on ARM).
At the time it was enabled, the general thinking was "it's almost
free, it doesn't affect performance much, we might as well use it".
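To make the grouping concrete, here is a toy model of packing globals into merged blocks under a size cap (4096 bytes for AArch64, 128 for ARM, per the numbers above). It is only a sketch: the function name `merge_globals`, the greedy in-order packing, and the lack of alignment handling are all assumptions for illustration, not the actual GlobalMerge implementation.

```python
from typing import List, Tuple

def merge_globals(globals_: List[Tuple[str, int]],
                  max_group_bytes: int = 4096) -> List[List[Tuple[str, int]]]:
    """Greedily pack (name, size) globals, in order, into groups.

    Each group is a list of (name, offset_within_group). A new group is
    started whenever adding the next global would exceed the cap.
    Hypothetical model: alignment and any sorting heuristics are ignored.
    """
    groups: List[List[Tuple[str, int]]] = []
    current: List[Tuple[str, int]] = []
    offset = 0
    for name, size in globals_:
        # Close the current group if this global would not fit in it.
        if current and offset + size > max_group_bytes:
            groups.append(current)
            current, offset = [], 0
        current.append((name, offset))
        offset += size
    if current:
        groups.append(current)
    return groups

# Example sizes are made up for illustration.
aarch64_groups = merge_globals([("a", 4), ("b", 8), ("c", 4090)])
arm_groups = merge_globals([("x", 100), ("y", 28), ("z", 1)], max_group_bytes=128)
```

Once merged, every global in a group can be addressed as one base address plus a constant offset, which is the effect the pass is after.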
2013 Jul 28
2
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
...e is not much we can do at the IR-level to predict this.
Performance Regressions - Compile Time Δ Previous Current σ
MultiSource/Benchmarks/VersaBench/beamformer/beamformer 18.98% 0.0722 0.0859 0.0003
MultiSource/Benchmarks/FreeBench/pifft/pifft 5.66% 0.5003 0.5286 0.0015
MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt 4.85% 0.4084 0.4282 0.0014
MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt 4.36% 0.3856 0.4024 0.0018
MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt 2.62% 0.4424 0.4540 0.0019
External/SPEC/CINT2006/401_bzip2/401_bzip2 1...
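The Δ column in the rows above is consistent with a relative change of Current against Previous. A minimal sketch of that arithmetic, assuming this is how the report computes it (the helper name `percent_change` is made up here):

```python
def percent_change(previous: float, current: float) -> float:
    """Relative change of `current` versus `previous`, in percent.

    Positive values mean the measurement went up (a regression for
    compile time or execution time).
    """
    return (current - previous) / previous * 100.0

# Values copied from the beamformer and pifft rows above.
beamformer = percent_change(0.0722, 0.0859)
pifft = percent_change(0.5003, 0.5286)
print(f"beamformer {beamformer:.2f}%  pifft {pifft:.2f}%")
```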
2013 Jul 14
6
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...would like to hear what others in the community think about this and give other people the opportunity to perform their own performance measurements.
— Performance Gains —
SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68%
MultiSource/Benchmarks/Olden/power/power -18.55%
MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71%
SingleSource/Benchmarks/Misc/flops-6 -11.02%
SingleSource/Benchmarks/Misc/flops-5 -10.03%
MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37%
External/Nurbs/nurbs -7.98%
SingleSource/Benchmarks/Misc/pi -7.29%
External/SPEC/CINT...
2013 Jul 28
0
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
...lvm.org/pipermail/llvm-dev/attachments/20130727/ff57e511/attachment.html>
-------------- next part --------------
name exec_was exec_is exec_diff
------------------------------------------- ---------- ---------- ----------------
Benchmarks/TSVC/Symbolics-flt/Symbolics-flt 1.4634 0.684 -53.259532595326
Benchmarks/MiBench/security-sha/security-sh 0.0199 0.0128 -35.678391959799
Benchmarks/mediabench/adpcm/rawcaudio/rawca 0.0034 0.0025 -26.470588235294
Benchmarks/Prolangs-C/agrep/agrep 0.0032...
2013 Jul 18
3
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
Andy and I briefly discussed this the other day; we have not yet had a
chance to list a detailed pass order
for the pre- and post-IPO scalar optimizations.
This is the wish list we have in mind:
pre-IPO: based on the ordering he proposes, get rid of the inlining (or
just inline tiny funcs), get rid of
all loop xforms...
post-IPO: get rid of inlining, or maybe we still need it, only
2012 Oct 05
0
[LLVMdev] TSVC/Equivalencing-dbl
PS: Here's how I can reproduce with clang on linux:
clang -S -o tsc.ll -O0 -flto -std=gnu99 tsc.c ; clang -S -o dummy.ll -O0 -flto
-std=gnu99 dummy.c ; opt -std-compile-opts tsc.ll -S -o tsc.1.ll ; opt
-std-compile-opts dummy.ll -S -o dummy.1.ll ; llvm-link tsc.1.ll dummy.1.ll -S
-o total.ll ; opt -std-link-opts total.ll -S -o total.1.ll ; llc total.1.ll ;
gcc -o z total.1.s
The program
2018 Aug 14
3
[RFC] Delaying phi-to-select transformation until later in the pass pipeline
...-O3 with and without the patch linked above (using
trunk llvm from a week or so ago).
AArch64 results on ARM Cortex-A72:
Performance Regressions - execution_time Change
SingleSource/Benchmarks/Shootout/Shootout-ary3 9.48%
MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt 3.79%
SingleSource/Benchmarks/CoyoteBench/huffbench 1.40%
Performance Improvements - execution_time Change
MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -23.74%
External/SPEC/CINT...
2018 Apr 26
0
Compare test-suite benchmarks performance complied without TBAA, with default TBAA and with new TBAA struct path
...------------------------------|--------|--------|--------|--------|
|Bitcode/Benchmarks/Halide/blur/halide_blur.test | 239.61 | 239.62 | 413.65 | 413.65 |
|SingleSource/Benchmarks/Misc/himenobmtxpa.test | 64.58 | 64.97 | 219.74 | 219.74 |
|MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt.t| 46.74 | 47.04 | 48.01 | 48.01 |
|MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test | 41.32 | 41.57 | 54.97 | 54.97 |
|SingleSource/Benchmarks/Dhrystone/dry.test | 20.02 | 20.02 | 11.54 | 11.54 |
|SingleSource/Be...
2012 Oct 07
0
[LLVMdev] TSVC/Equivalencing-dbl
...n spend some time to look into the problem.
Thanks,
Shivaram
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel
Sent: Saturday, October 06, 2012 1:57 AM
To: Duncan Sands
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] TSVC/Equivalencing-dbl
----- Original Message -----
> From: "Duncan Sands" <duncan.sands at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Friday, October 5, 2012 2:50:06 PM
> Subject: Re: TSVC/Equivalencing-dbl
...
2013 Sep 25
0
[LLVMdev] [Polly] Performance comparison between Cloog and ISL code generation
...formance comparison between Polly's Cloog and ISL code generator is posted on http://188.40.87.11:8000/db_default/v4/nts/59?compare_to=58&baseline=58
It seems their execution-time performance is comparable:
Performance Regressions - Execution Time (ISL over Cloog)
MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt 8.49%
MultiSource/Benchmarks/TSVC/StatementReordering-flt/StatementReordering-flt 6.77%
MultiSource/Benchmarks/TSVC/CrossingThresholds-flt/CrossingThresholds-flt 2.65%
SingleSource/UnitTests/Vectorizer/gcc-loops 2.63%
Performance Improvements - Execution Time (ISL...
2012 Oct 05
4
[LLVMdev] TSVC/Equivalencing-dbl
----- Original Message -----
> From: "Duncan Sands" <duncan.sands at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Friday, October 5, 2012 12:10:03 PM
> Subject: Re: TSVC/Equivalencing-dbl
>
> Oops, I ran the testsuite wrong: read clang output for dragonegg
> output.
Okay, can you resummarize? Do you mean that?
gcc -O0:
S1421 0.00 16000
gcc -O0 under valgrind:
S1421 0.00 17208.404325315
clang:
S1421 0....
2012 Oct 05
2
[LLVMdev] TSVC/Equivalencing-dbl
----- Original Message -----
> From: "Duncan Sands" <duncan.sands at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Friday, October 5, 2012 2:50:06 PM
> Subject: Re: TSVC/Equivalencing-dbl
>
> Hi Hal,
>
> On 05/10/12 20:32, Hal Finkel wrote:
> > ----- Original Message -----
> >> From: "Duncan Sands" <duncan.sands at gmail.com>
> >> To: "Hal Finkel" <hfinkel at anl.gov>
> >> Cc: llvmdev at...
2013 Jul 15
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...it should be straight-forward. It would also be really useful to see what the code size and compile time impact is.
-Chris
>
> — Performance Gains —
> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68%
> MultiSource/Benchmarks/Olden/power/power -18.55%
> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71%
> SingleSource/Benchmarks/Misc/flops-6 -11.02%
> SingleSource/Benchmarks/Misc/flops-5 -10.03%
> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37%
> External/Nurbs/nurbs -7.98%
> SingleSource/Benchmarks/Misc/pi...
2013 Jul 28
0
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
...at the IR-level to
> predict this.
>
>
>
> Performance Regressions - Compile Time Δ Previous Current σ
> MultiSource/Benchmarks/VersaBench/beamformer/beamformer 18.98% 0.0722 0.0859 0.0003
> MultiSource/Benchmarks/FreeBench/pifft/pifft 5.66% 0.5003 0.5286 0.0015
> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt 4.85% 0.4084 0.4282 0.0014
> MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt 4.36% 0.3856 0.4024 0.0018
> MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt 2.62% 0.4424 0.4540 0.0019
> External/SPEC/CINT2006/401_bz...
2012 Oct 05
2
[LLVMdev] TSVC/Equivalencing-dbl
Hi Hal, I was looking into why this fails with dragonegg, and noticed the
following: if I compile with GCC (-O0) then I get as output:
Running each loop 3125 times...
Loop Time(Sec) Checksum
S421 0.00 32010.620068485
S1421 0.00 16000
S422 0.00 3.7377231414078
S423 0.00 32000.736895702
S424 0.00 32822.36069424
This is the same as the reference output. If I run exactly the
2012 Oct 05
0
[LLVMdev] TSVC/Equivalencing-dbl
Oops, I ran the testsuite wrong: read clang output for dragonegg output.
2012 Oct 05
0
[LLVMdev] TSVC/Equivalencing-dbl
..., Hal Finkel wrote:
> ----- Original Message -----
>> From: "Duncan Sands" <duncan.sands at gmail.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: llvmdev at cs.uiuc.edu
>> Sent: Friday, October 5, 2012 12:10:03 PM
>> Subject: Re: TSVC/Equivalencing-dbl
>>
>> Oops, I ran the testsuite wrong: read clang output for dragonegg
>> output.
>
> Okay, can you resummarize? Do you mean that?
>
> gcc -O0:
> S1421 0.00 16000
>
> gcc -O0 under valgrind:
> S1421 0.00...
2013 Jul 15
3
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...would also be really useful to see what the code size and compile time impact is.
>
> -Chris
>
>>
>> — Performance Gains —
>> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68%
>> MultiSource/Benchmarks/Olden/power/power -18.55%
>> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71%
>> SingleSource/Benchmarks/Misc/flops-6 -11.02%
>> SingleSource/Benchmarks/Misc/flops-5 -10.03%
>> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37%
>> External/Nurbs/nurbs -7.98%
>> SingleSource...
2018 Aug 15
2
[RFC] Delaying phi-to-select transformation until later in the pass pipeline
...ked
> above (using
> trunk llvm from a week or so ago).
>
> AArch64 results on ARM Cortex-A72:
>
> Performance Regressions - execution_time Change
> SingleSource/Benchmarks/Shootout/Shootout-ary3 9.48%
> MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt 3.79%
> SingleSource/Benchmarks/CoyoteBench/huffbench 1.40%
>
> Performance Improvements - execution_time Change
> MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -23.74%
&...
2013 Jul 23
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...ee what the code size and compile time impact is.
>>
>> -Chris
>>
>>>
>>> — Performance Gains —
>>> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68%
>>> MultiSource/Benchmarks/Olden/power/power -18.55%
>>> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71%
>>> SingleSource/Benchmarks/Misc/flops-6 -11.02%
>>> SingleSource/Benchmarks/Misc/flops-5 -10.03%
>>> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37%
>>> External/Nurbs/nurbs -7.98%
>...