Displaying 20 results from an estimated 51 matches for "h264ref".
2017 Jan 30
4
(RFC) Adjusting default loop fully unroll threshold
...nrolling will not affect
LSD/ICache performance. In https://reviews.llvm.org/D28368, I proposed to
double the threshold for loop fully unroller. This will change the codegen
of several SPECCPU benchmarks:
Code size:
447.dealII 0.50%
453.povray 0.42%
433.milc 0.20%
445.gobmk 0.32%
403.gcc 0.05%
464.h264ref 3.62%
Compile Time:
447.dealII 0.22%
453.povray -0.16%
433.milc 0.09%
445.gobmk -2.43%
403.gcc 0.06%
464.h264ref 3.21%
Performance (on intel sandybridge):
447.dealII +0.07%
453.povray +1.79%
433.milc +1.02%
445.gobmk +0.56%
403.gcc -0.16%
464.h264ref -0.41%
Looks like the change has overall posi...
2017 Jan 30
0
(RFC) Adjusting default loop fully unroll threshold
...368 <https://reviews.llvm.org/D28368>, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks:
>
> Code size:
> 447.dealII 0.50%
> 453.povray 0.42%
> 433.milc 0.20%
> 445.gobmk 0.32%
> 403.gcc 0.05%
> 464.h264ref 3.62%
>
> Compile Time:
> 447.dealII 0.22%
> 453.povray -0.16%
> 433.milc 0.09%
> 445.gobmk -2.43%
> 403.gcc 0.06%
> 464.h264ref 3.21%
>
> Performance (on intel sandybridge):
> 447.dealII +0.07%
> 453.povray +1.79%
> 433.milc +1.02%
> 445.gobmk +0.56%
>...
2017 Jan 30
2
(RFC) Adjusting default loop fully unroll threshold
...In https://reviews.llvm.org/D28368, I proposed to
> double the threshold for loop fully unroller. This will change the codegen
> of several SPECCPU benchmarks:
>
> Code size:
> 447.dealII 0.50%
> 453.povray 0.42%
> 433.milc 0.20%
> 445.gobmk 0.32%
> 403.gcc 0.05%
> 464.h264ref 3.62%
>
> Compile Time:
> 447.dealII 0.22%
> 453.povray -0.16%
> 433.milc 0.09%
> 445.gobmk -2.43%
> 403.gcc 0.06%
> 464.h264ref 3.21%
>
> Performance (on intel sandybridge):
> 447.dealII +0.07%
> 453.povray +1.79%
> 433.milc +1.02%
> 445.gobmk +0.56%
>...
2017 Feb 18
2
[RFC] Using Intel MPX to harden SafeStack
....hmmer|534.94|533.68|534.37|553.40 |
+--------------+---------+---------+---------+-------+
|458.sjeng|633.69|641.21|641.81|655.94 |
+--------------+---------+---------+---------+-------+
|462.libquantum|362.82|367.00|367.38|382.14 |
+--------------+---------+---------+---------+-------+
|464.h264ref|701.37|682.13|683.41|699.93 |
+--------------+---------+---------+---------+-------+
|471.omnetpp|397.04|407.38|407.33|411.36 |
+--------------+---------+---------+---------+-------+
|473.astar|611.51|610.46|610.19|624.78 |
+--------------+---------+---------+---------+-------+
|483.xalancbmk...
2017 Jan 31
0
(RFC) Adjusting default loop fully unroll threshold
...oposed
>> to double the threshold for loop fully unroller. This will change the
>> codegen of several SPECCPU benchmarks:
>>
>> Code size:
>> 447.dealII 0.50%
>> 453.povray 0.42%
>> 433.milc 0.20%
>> 445.gobmk 0.32%
>> 403.gcc 0.05%
>> 464.h264ref 3.62%
>>
>> Compile Time:
>> 447.dealII 0.22%
>> 453.povray -0.16%
>> 433.milc 0.09%
>> 445.gobmk -2.43%
>> 403.gcc 0.06%
>> 464.h264ref 3.21%
>>
>> Performance (on intel sandybridge):
>> 447.dealII +0.07%
>> 453.povray +1.79%...
2017 Jan 31
3
(RFC) Adjusting default loop fully unroll threshold
.../D28368>, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks:
>>
>> Code size:
>> 447.dealII 0.50%
>> 453.povray 0.42%
>> 433.milc 0.20%
>> 445.gobmk 0.32%
>> 403.gcc 0.05%
>> 464.h264ref 3.62%
>>
>> Compile Time:
>> 447.dealII 0.22%
>> 453.povray -0.16%
>> 433.milc 0.09%
>> 445.gobmk -2.43%
>> 403.gcc 0.06%
>> 464.h264ref 3.21%
>>
>> Performance (on intel sandybridge):
>> 447.dealII +0.07%
>> 453.povray +1.79...
2017 Jul 22
4
[RFC] Add IR level interprocedural outliner for code size.
...lt: 22.19%
> - StatementReordering-flt: 22.15%
> - Searching-flt: 21.96%
>
> SPEC, top improvements:
> E&LO:
> - bzip2: 9.15%
> - gcc: 4.03%
> - sphinx3: 3.8%
> - H264ref: 3.24%
> - Perlbench: 3%
>
> LO:
> - bzip2: 7.27%
> - sphinx3: 3.65%
> - Namd: 3.08%
> - Gcc: 3.06%
> - H264ref: 3.05%
>
> MO:
> - Namd: 7.8%
>...
2011 Apr 30
2
[LLVMdev] Greedy register allocation
...the SPEC benchmarks that change by more than 3% (minus means faster, plus slower):
Targeting i386:
-19.3% 164.gzip
-12.5% 433.milc
-8.8% 473.astar
-7.4% 401.bzip2
-6.4% 183.equake
-4.9% 456.hmmer
-4.6% 186.crafty
-4.6% 188.ammp
-4.1% 403.gcc
-4.0% 256.bzip2
-3.2% 197.parser
-3.1% 175.vpr
-3.0% 464.h264ref
+6.7% 177.mesa
With more registers and out-of-order execution hiding the cost of spilling, x86-64 is more mixed. I suspect this architecture is more sensitive to code layout issues than to register allocation:
Targeting x86-64:
-6.4% 464.h264ref
-6.1% 256.bzip2
-5.2% 183.equake
-4.8% 447.deal...
2016 Mar 16
3
GSoC Proposal : Path Profiling Support
...------------------------------------------------------------------------+
| swaptions | _Z21HJM_Swaption_BlockingPddddddiidS_PS_llii
|
+-----------------+----------------------------------------------------------------------------------------------+
| h264ref | dct_luma_16x16
|
+-----------------+----------------------------------------------------------------------------------------------+
> Do you have data when such manual selection is not done?
At the moment, I do not.
>
> thanks,
>
>...
2018 Aug 14
3
[RFC] Delaying phi-to-select transformation until later in the pass pipeline
...sions - execution_time Change
MultiSource/Benchmarks/Ptrdist/yacr2/yacr2 5.62%
Performance Improvements - execution_time Change
SingleSource/Benchmarks/Misc-C++/Large/sphereflake -4.43%
External/SPEC/CINT2006/456.hmmer/456.hmmer -2.50%
External/SPEC/CINT2006/464.h264ref/464.h264ref -1.60%
MultiSource/Benchmarks/nbench/nbench -1.19%
SingleSource/Benchmarks/Adobe-C++/functionobjects -1.07%
I had a brief look at the regressions and they all look to be caused by
getting bad luck with branch mispredictions: I looked into the Shootout-ary3 and
yacr2...
2017 May 18
6
Enable vectorizer-maximize-bandwidth by default?
...spec/2006/int/C/429.mcf 40.35 +0.27%
spec/2006/int/C/445.gobmk 26.96 +0.06%
spec/2006/int/C/456.hmmer 24.4 +0.19%
spec/2006/int/C/458.sjeng 27.91 -0.08%
spec/2006/int/C/462.libquantum 57.47 -0.20%
spec/2006/int/C/464.h264ref 46.52 +1.35%
geometric mean +0.29%
Scores are benchmark specific.
We do have regression on 453.povray, but it's due to secondary effects as
all hot functions are the same. I've also tested the code size impact, it
does not change for tes...
2011 Apr 30
2
[LLVMdev] LoopInfo are not able to identify some natural loops?
Hi,
I found that some loops cannot be identified by the LoopInfo pass. For example,
the loop at line 3094 of rdopt.c of benchmark 464.h264ref from SPEC CPU2006
is not a loop or a child (or grandchild) of any loop in the loop list
generated by the LoopInfo pass. The documentation of LoopInfo says that it
identifies natural loops, which have exactly one entry point. But the IR of
this loop shows that its header only has one BB in preds. Do...
2016 Aug 30
2
Fwd: cfl-aa
...| 5
428082 | 458.sjeng | 9773
808471 | 433.milc | 2163
1787190 | 450.soplex | 72
2472234 | 401.bzip2 | 229
2574217 | 456.hmmer | 1833
3492577 | 445.gobmk | 8480
3685838 | 444.namd | 616
12943554 | 471.omnetpp | 422
20068605 | 464.h264ref | 8593
23849576 | 400.perlbench | 99316
37779455 | 447.dealII | 11204
186008992 | 403.gcc | 404828
I am finding these results weird because I was expecting a larger
number of no-alias responses. For instance, I got only 404,828 responses
out of 186,008,992 queries....
2016 Mar 29
2
[CodeGen] CodeSize - TailMerging and BlockPlacement
...ow. I
checked the binaries and did not find any increase of unwanted
instructions. The change does not hurt any benchmark with noticeable
regression and sometimes results in small improvement (1%-3%).
473.astar -7
401.bzip2 -110
403.gcc -13,006
445.gobmk -1,716
464.h264ref -684
456.hmmer -391
462.libquantum -4
429.mcf -4
471.omnetpp -1,980
400.perlbench -4,176
458.sjeng -338
450.soplex -395
483.xalancbmk -4,183
447.dealII -186
433.milc -34
444.namd -104
453.povray -1,785
482.sphinx3 -112...
2016 Mar 22
2
GSoC Proposal : Path Profiling Support
...---------------------------+
> > | swaptions | _Z21HJM_Swaption_BlockingPddddddiidS_PS_llii
> > |
> >
> +-----------------+----------------------------------------------------------------------------------------------+
> > | h264ref | dct_luma_16x16
> > |
> >
> +-----------------+----------------------------------------------------------------------------------------------+
> >
> >> Do you have data when such manual selection is not done?
> >...
2016 Mar 21
0
GSoC Proposal : Path Profiling Support
...----------------------------------------------------+
> | swaptions | _Z21HJM_Swaption_BlockingPddddddiidS_PS_llii
> |
> +-----------------+----------------------------------------------------------------------------------------------+
> | h264ref | dct_luma_16x16
> |
> +-----------------+----------------------------------------------------------------------------------------------+
>
>> Do you have data when such manual selection is not done?
>
> At the moment, I do not....
2011 Apr 30
3
[LLVMdev] LoopInfo are not able to identify some natural loops?
...e loop jumping to a block in the
> loop body.
>
> Cameron
>
> On Apr 29, 2011, at 7:43 PM, Bo Wu <bwu at cs.wm.edu> wrote:
>
> Hi,
>
> I found that some loops cannot be identified by the LoopInfo pass. For
> example, the loop at line 3094 of rdopt.c of benchmark 464.h264ref from SPEC
> CPU2006 is not a loop or a child (or grandchild) of any loop in the loop
> list generated by the LoopInfo pass. The documentation of LoopInfo says that it
> identifies natural loops, which have exactly one entry point. But the IR of
> this loop shows that its header only has...
2016 Mar 23
0
GSoC Proposal : Path Profiling Support
...--+
>> > | swaptions | _Z21HJM_Swaption_BlockingPddddddiidS_PS_llii
>> > |
>> >
>> > +-----------------+----------------------------------------------------------------------------------------------+
>> > | h264ref | dct_luma_16x16
>> > |
>> >
>> > +-----------------+----------------------------------------------------------------------------------------------+
>> >
>> >> Do you have data when such manual selecti...
2016 Mar 16
2
GSoC Proposal : Path Profiling Support
...+---------------+-----------+--------------+----------+
| swaptions | 20655 | 0m0.965s | 0m0.950s | 13 | 0m0.263s | 0m0.178s | 193841 | 184274 |
+---------------+----------+-------------+-----------+-----------+---------------+-----------+--------------+----------+
| h264ref | 24130 | 0m4.278s | 0m4.272s | 76 | 3m26.701s | 3m4.461s | 816660 | 812396 |
+---------------+----------+-------------+-----------+-----------+---------------+-----------+--------------+----------+
| lbm | 8 | 0m0.824s | 0m0.815s | 5 |...
2011 Apr 30
0
[LLVMdev] LoopInfo are not able to identify some natural loops?
...obably some block outside of the loop jumping to a block in the loop body.
Cameron
On Apr 29, 2011, at 7:43 PM, Bo Wu <bwu at cs.wm.edu> wrote:
> Hi,
>
> I found that some loops cannot be identified by the LoopInfo pass. For example, the loop at line 3094 of rdopt.c of benchmark 464.h264ref from SPEC CPU2006 is not a loop or a child (or grandchild) of any loop in the loop list generated by the LoopInfo pass. The documentation of LoopInfo says that it identifies natural loops, which have exactly one entry point. But the IR of this loop shows that its header only has one BB in preds. Do...
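
The reply quoted above suggests the cycle may have a second entry ("some block outside of the loop jumping to a block in the loop body"). The following is a minimal textbook sketch of why such a cycle would not count as a natural loop; it is not LLVM's LoopInfo code, and it does not claim to reproduce the actual CFG from rdopt.c. The function names and the two toy graphs are hypothetical. It treats an edge n -> h as a back edge only when h dominates n, so a cycle that can be entered at two different blocks yields no header.

```python
from collections import defaultdict


def dominators(cfg, entry):
    """Iterative dominator sets: dom[n] = {n} | intersection of dom[p] over preds p."""
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    preds = defaultdict(set)
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def natural_loop_headers(cfg, entry):
    """Return every block h that is the target of a back edge n -> h (h dominates n)."""
    dom = dominators(cfg, entry)
    return {h for n, succs in cfg.items() for h in succs if h in dom[n]}


# Hypothetical CFG with a single-entry loop: header <-> body.
reducible = {
    "entry": ["header"],
    "header": ["body", "exit"],
    "body": ["header"],
    "exit": [],
}

# Hypothetical CFG where the cycle a <-> b can be entered at either a or b.
irreducible = {
    "entry": ["a", "b"],
    "a": ["b"],
    "b": ["a"],
}

print(natural_loop_headers(reducible, "entry"))
print(natural_loop_headers(irreducible, "entry"))
```

In the second graph neither `a` nor `b` dominates the other, so no edge qualifies as a back edge and the cycle is not reported, which matches the behaviour described in the thread under the assumption that the loop in question really has more than one entry.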