thr3ads.net - search: "h264ref"

Displaying 20 results from an estimated 51 matches for "h264ref".

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

...nrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks:
Code size:
447.dealII 0.50%
453.povray 0.42%
433.milc 0.20%
445.gobmk 0.32%
403.gcc 0.05%
464.h264ref 3.62%
Compile Time:
447.dealII 0.22%
453.povray -0.16%
433.milc 0.09%
445.gobmk -2.43%
403.gcc 0.06%
464.h264ref 3.21%
Performance (on intel sandybridge):
447.dealII +0.07%
453.povray +1.79%
433.milc +1.02%
445.gobmk +0.56%
403.gcc -0.16%
464.h264ref -0.41%
Looks like the change has overall posi...

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

...368 <https://reviews.llvm.org/D28368>, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks:
>
> Code size:
> 447.dealII 0.50%
> 453.povray 0.42%
> 433.milc 0.20%
> 445.gobmk 0.32%
> 403.gcc 0.05%
> 464.h264ref 3.62%
>
> Compile Time:
> 447.dealII 0.22%
> 453.povray -0.16%
> 433.milc 0.09%
> 445.gobmk -2.43%
> 403.gcc 0.06%
> 464.h264ref 3.21%
>
> Performance (on intel sandybridge):
> 447.dealII +0.07%
> 453.povray +1.79%
> 433.milc +1.02%
> 445.gobmk +0.56%
>...

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

...In https://reviews.llvm.org/D28368, I proposed to
> double the threshold for loop fully unroller. This will change the codegen
> of several SPECCPU benchmarks:
>
> Code size:
> 447.dealII 0.50%
> 453.povray 0.42%
> 433.milc 0.20%
> 445.gobmk 0.32%
> 403.gcc 0.05%
> 464.h264ref 3.62%
>
> Compile Time:
> 447.dealII 0.22%
> 453.povray -0.16%
> 433.milc 0.09%
> 445.gobmk -2.43%
> 403.gcc 0.06%
> 464.h264ref 3.21%
>
> Performance (on intel sandybridge):
> 447.dealII +0.07%
> 453.povray +1.79%
> 433.milc +1.02%
> 445.gobmk +0.56%
>...

[RFC] Using Intel MPX to harden SafeStack

2017 Feb 18

....hmmer|534.94|533.68|534.37|553.40 |
+--------------+---------+---------+---------+-------+
|458.sjeng|633.69|641.21|641.81|655.94 |
+--------------+---------+---------+---------+-------+
|462.libquantum|362.82|367.00|367.38|382.14 |
+--------------+---------+---------+---------+-------+
|464.h264ref|701.37|682.13|683.41|699.93 |
+--------------+---------+---------+---------+-------+
|471.omnetpp|397.04|407.38|407.33|411.36 |
+--------------+---------+---------+---------+-------+
|473.astar|611.51|610.46|610.19|624.78 |
+--------------+---------+---------+---------+-------+
|483.xalancbmk...

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 31

...oposed
>> to double the threshold for loop fully unroller. This will change the
>> codegen of several SPECCPU benchmarks:
>>
>> Code size:
>> 447.dealII 0.50%
>> 453.povray 0.42%
>> 433.milc 0.20%
>> 445.gobmk 0.32%
>> 403.gcc 0.05%
>> 464.h264ref 3.62%
>>
>> Compile Time:
>> 447.dealII 0.22%
>> 453.povray -0.16%
>> 433.milc 0.09%
>> 445.gobmk -2.43%
>> 403.gcc 0.06%
>> 464.h264ref 3.21%
>>
>> Performance (on intel sandybridge):
>> 447.dealII +0.07%
>> 453.povray +1.79%...

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 31

.../D28368>, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks:
>>
>> Code size:
>> 447.dealII 0.50%
>> 453.povray 0.42%
>> 433.milc 0.20%
>> 445.gobmk 0.32%
>> 403.gcc 0.05%
>> 464.h264ref 3.62%
>>
>> Compile Time:
>> 447.dealII 0.22%
>> 453.povray -0.16%
>> 433.milc 0.09%
>> 445.gobmk -2.43%
>> 403.gcc 0.06%
>> 464.h264ref 3.21%
>>
>> Performance (on intel sandybridge):
>> 447.dealII +0.07%
>> 453.povray +1.79...

[RFC] Add IR level interprocedural outliner for code size.

2017 Jul 22

...lt: 22.19%
> - StatementReordering-flt: 22.15%
> - Searching-flt: 21.96%
>
> SPEC, top improvements:
> E&LO:
> - bzip2: 9.15%
> - gcc: 4.03%
> - sphinx3: 3.8%
> - H264ref: 3.24%
> - Perlbench: 3%
>
> LO:
> - bzip2: 7.27%
> - sphinx3: 3.65%
> - Namd: 3.08%
> - Gcc: 3.06%
> - H264ref: 3.05%
>
> MO:
> - Namd: 7.8%
>...

[LLVMdev] Greedy register allocation

2011 Apr 30

...the SPEC benchmarks that change by more than 3% (minus means faster, plus slower):
Targeting i386:
-19.3% 164.gzip
-12.5% 433.milc
-8.8% 473.astar
-7.4% 401.bzip2
-6.4% 183.equake
-4.9% 456.hmmer
-4.6% 186.crafty
-4.6% 188.ammp
-4.1% 403.gcc
-4.0% 256.bzip2
-3.2% 197.parser
-3.1% 175.vpr
-3.0% 464.h264ref
+6.7% 177.mesa
With more registers and out-of-order execution hiding the cost of spilling, x86-64 is more mixed. I suspect this architecture is more sensitive to code layout issues than to register allocation:
Targeting x86-64:
-6.4% 464.h264ref
-6.1% 256.bzip2
-5.2% 183.equake
-4.8% 447.deal...

GSoC Proposal : Path Profiling Support

2016 Mar 16

...------------------------------------------------------------------------+
| swaptions | _Z21HJM_Swaption_BlockingPddddddiidS_PS_llii |
+-----------------+----------------------------------------------------------------------------------------------+
| h264ref | dct_luma_16x16 |
+-----------------+----------------------------------------------------------------------------------------------+
> Do you have data when such manual selection is not done?
At the moment, I do not.
>
> thanks,
> >...

[RFC] Delaying phi-to-select transformation until later in the pass pipeline

2018 Aug 14

...sions - execution_time Change
MultiSource/Benchmarks/Ptrdist/yacr2/yacr2 5.62%
Performance Improvements - execution_time Change
SingleSource/Benchmarks/Misc-C++/Large/sphereflake -4.43%
External/SPEC/CINT2006/456.hmmer/456.hmmer -2.50%
External/SPEC/CINT2006/464.h264ref/464.h264ref -1.60%
MultiSource/Benchmarks/nbench/nbench -1.19%
SingleSource/Benchmarks/Adobe-C++/functionobjects -1.07%
I had a brief look at the regressions and they all look to be caused by getting bad luck with branch mispredictions: I looked into the Shootout-ary3 and yacr2...

Enable vectorizer-maximize-bandwidth by default?

2017 May 18

...spec/2006/int/C/429.mcf 40.35 +0.27%
spec/2006/int/C/445.gobmk 26.96 +0.06%
spec/2006/int/C/456.hmmer 24.4 +0.19%
spec/2006/int/C/458.sjeng 27.91 -0.08%
spec/2006/int/C/462.libquantum 57.47 -0.20%
spec/2006/int/C/464.h264ref 46.52 +1.35%
geometric mean +0.29%
Scores are benchmark specific. We do have regression on 453.povray, but it's due to secondary effects as all hot functions are the same. I've also tested the code size impact, it does not change for tes...

[LLVMdev] LoopInfo are not able to identify some natural loops?

2011 Apr 30

Hi, I found that some loops cannot be identified by the LoopInfo pass. For example, the loop at line 3094 of rdopt.c of benchmark 464.h264ref from SPEC CPU2006 is not a loop or a child (or grandchild) of any loop in the loop list generated by the LoopInfo pass. The documentation of LoopInfo says that it identifies natural loops, which have exactly one entry point. But the IR of this loop shows that its header has only one BB in preds. Do...

Fwd: cfl-aa

2016 Aug 30

...| 5
428082 | 458.sjeng | 9773
808471 | 433.milc | 2163
1787190 | 450.soplex | 72
2472234 | 401.bzip2 | 229
2574217 | 456.hmmer | 1833
3492577 | 445.gobmk | 8480
3685838 | 444.namd | 616
12943554 | 471.omnetpp | 422
20068605 | 464.h264ref | 8593
23849576 | 400.perlbench | 99316
37779455 | 447.dealII | 11204
186008992 | 403.gcc | 404828
I am finding these results weird because I was expecting a larger number of no-alias responses. For instance, I got only 404,828 responses out of 186,008,992 queries....

[CodeGen] CodeSize - TailMerging and BlockPlacement

2016 Mar 29

...ow. I checked the binaries and did not find any increase of unwanted instructions. The change does not hurt any benchmark with noticeable regression and sometimes results in small improvement (1%-3%).
473.astar -7
401.bzip2 -110
403.gcc -13,006
445.gobmk -1,716
464.h264ref -684
456.hmmer -391
462.libquantum -4
429.mcf -4
471.omnetpp -1,980
400.perlbench -4,176
458.sjeng -338
450.soplex -395
483.xalancbmk -4,183
447.dealII -186
433.milc -34
444.namd -104
453.povray -1,785
482.sphinx3 -112...

GSoC Proposal : Path Profiling Support

2016 Mar 22

...---------------------------+ > > | swaptions | _Z21HJM_Swaption_BlockingPddddddiidS_PS_llii > > | > > > +-----------------+----------------------------------------------------------------------------------------------+ > > | h264ref | dct_luma_16x16 > > | > > > +-----------------+----------------------------------------------------------------------------------------------+ > > > >> Do you have data when such manual selection is not done? > >...

GSoC Proposal : Path Profiling Support

2016 Mar 21

...----------------------------------------------------+ > | swaptions | _Z21HJM_Swaption_BlockingPddddddiidS_PS_llii > | > +-----------------+----------------------------------------------------------------------------------------------+ > | h264ref | dct_luma_16x16 > | > +-----------------+----------------------------------------------------------------------------------------------+ > >> Do you have data when such manual selection is not done? > > At the moment, I do not....

[LLVMdev] LoopInfo are not able to identify some natural loops?

2011 Apr 30

...e loop jumping to a block in the > loop body. > > Cameron > > On Apr 29, 2011, at 7:43 PM, Bo Wu <bwu at cs.wm.edu> wrote: > > Hi, > > I found that some loops cannot be identified by LoopInfo pass. For > example, the loop at line 3094 of rdopt.c of benchmark 464.h264ref from spec > cpu2006 is not a loop or a child (or grandchild) of any loop in the loop > list generated by LoopInfo pass. The documentation of LoopInfo says that it > identifies natural loops, which have exactly one entry point. But the IR of > this loop shows that its header only has...

GSoC Proposal : Path Profiling Support

2016 Mar 23

...--+ >> > | swaptions | _Z21HJM_Swaption_BlockingPddddddiidS_PS_llii >> > | >> > >> > +-----------------+----------------------------------------------------------------------------------------------+ >> > | h264ref | dct_luma_16x16 >> > | >> > >> > +-----------------+----------------------------------------------------------------------------------------------+ >> > >> >> Do you have data when such manual selecti...

GSoC Proposal : Path Profiling Support

2016 Mar 16

...+---------------+-----------+--------------+----------+
| swaptions | 20655 | 0m0.965s | 0m0.950s | 13 | 0m0.263s | 0m0.178s | 193841 | 184274 |
+---------------+----------+-------------+-----------+-----------+---------------+-----------+--------------+----------+
| h264ref | 24130 | 0m4.278s | 0m4.272s | 76 | 3m26.701s | 3m4.461s | 816660 | 812396 |
+---------------+----------+-------------+-----------+-----------+---------------+-----------+--------------+----------+
| lbm | 8 | 0m0.824s | 0m0.815s | 5 |...

[LLVMdev] LoopInfo are not able to identify some natural loops?

2011 Apr 30

...obably some block outside of the loop jumping to a block in the loop body.
Cameron
On Apr 29, 2011, at 7:43 PM, Bo Wu <bwu at cs.wm.edu> wrote:
> Hi,
>
> I found that some loops cannot be identified by the LoopInfo pass. For example, the loop at line 3094 of rdopt.c of benchmark 464.h264ref from SPEC CPU2006 is not a loop or a child (or grandchild) of any loop in the loop list generated by the LoopInfo pass. The documentation of LoopInfo says that it identifies natural loops, which have exactly one entry point. But the IR of this loop shows that its header has only one BB in preds. Do...
