thr3ads.net - search: "gobmk"

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

4

...mic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks: Code size: 447.dealII 0.50% 453.povray 0.42% 433.milc 0.20% 445.gobmk 0.32% 403.gcc 0.05% 464.h264ref 3.62% Compile Time: 447.dealII 0.22% 453.povray -0.16% 433.milc 0.09% 445.gobmk -2.43% 403.gcc 0.06% 464.h264ref 3.21% Performance (on intel sandybridge): 447.dealII +0.07% 453.povray +1.79% 433.milc +1.02% 445.gobmk +0.56% 403.gcc -0.16% 464.h264ref -0.41% Looks...

[RFC] Using Intel MPX to harden SafeStack

2017 Feb 18

2

...--+ |401.bzip2|711.43|716.59|717.35|750.06 | +--------------+---------+---------+---------+-------+ |403.gcc|333.76|334.11|334.95|336.13 | +--------------+---------+---------+---------+-------+ |429.mcf|371.48|375.75|373.50|377.93 | +--------------+---------+---------+---------+-------+ |445.gobmk|677.80|686.12|685.50|702.87 | +--------------+---------+---------+---------+-------+ |456.hmmer|534.94|533.68|534.37|553.40 | +--------------+---------+---------+---------+-------+ |458.sjeng|633.69|641.21|641.81|655.94 | +--------------+---------+---------+---------+-------+ |462.libquantum|...

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

0

...ormance. In https://reviews.llvm.org/D28368 <https://reviews.llvm.org/D28368>, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks: > > Code size: > 447.dealII 0.50% > 453.povray 0.42% > 433.milc 0.20% > 445.gobmk 0.32% > 403.gcc 0.05% > 464.h264ref 3.62% > > Compile Time: > 447.dealII 0.22% > 453.povray -0.16% > 433.milc 0.09% > 445.gobmk -2.43% > 403.gcc 0.06% > 464.h264ref 3.21% > > Performance (on intel sandybridge): > 447.dealII +0.07% > 453.povray +1.79% >...

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

2

...not affect > LSD/ICache performance. In https://reviews.llvm.org/D28368, I proposed to > double the threshold for loop fully unroller. This will change the codegen > of several SPECCPU benchmarks: > > Code size: > 447.dealII 0.50% > 453.povray 0.42% > 433.milc 0.20% > 445.gobmk 0.32% > 403.gcc 0.05% > 464.h264ref 3.62% > > Compile Time: > 447.dealII 0.22% > 453.povray -0.16% > 433.milc 0.09% > 445.gobmk -2.43% > 403.gcc 0.06% > 464.h264ref 3.21% > > Performance (on intel sandybridge): > 447.dealII +0.07% > 453.povray +1.79% > 4...

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 31

0

...rmance. In https://reviews.llvm.org/D28368, I proposed >> to double the threshold for loop fully unroller. This will change the >> codegen of several SPECCPU benchmarks: >> >> Code size: >> 447.dealII 0.50% >> 453.povray 0.42% >> 433.milc 0.20% >> 445.gobmk 0.32% >> 403.gcc 0.05% >> 464.h264ref 3.62% >> >> Compile Time: >> 447.dealII 0.22% >> 453.povray -0.16% >> 433.milc 0.09% >> 445.gobmk -2.43% >> 403.gcc 0.06% >> 464.h264ref 3.21% >> >> Performance (on intel sandybridge): >...

Saving Compile Time in InstCombine

2017 Mar 17

7

...External/SPEC/CINT2006/403.gcc/403.gcc <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.14=2> -1.64% 54.0801 53.1930 - External/SPEC/CINT2006/400.perlbench/400.perlbench <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.7=2> -1.25% 19.1481 18.9091 - External/SPEC/CINT2006/445.gobmk/445.gobmk <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.15=2> -1.01% 15.2819 15.1274 - Do such changes make sense? The patch doesn't change O3, but it does change Os and potentially can change performance there (though I didn't see any changes in my tests). The patch is at...

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 31

3

...ews.llvm.org/D28368 <https://reviews.llvm.org/D28368>, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks: >> >> Code size: >> 447.dealII 0.50% >> 453.povray 0.42% >> 433.milc 0.20% >> 445.gobmk 0.32% >> 403.gcc 0.05% >> 464.h264ref 3.62% >> >> Compile Time: >> 447.dealII 0.22% >> 453.povray -0.16% >> 433.milc 0.09% >> 445.gobmk -2.43% >> 403.gcc 0.06% >> 464.h264ref 3.21% >> >> Performance (on intel sandybridge): >...

Enable vectorizer-maximize-bandwidth by default?

2017 May 18

6

...spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean...

Fwd: cfl-aa

2016 Aug 30

2

...470.lbm | 0 49133 | 429.mcf | 42 95098 | 473.astar | 0 146301 | 462.libquantum | 5 428082 | 458.sjeng | 9773 808471 | 433.milc | 2163 1787190 | 450.soplex | 72 2472234 | 401.bzip2 | 229 2574217 | 456.hmmer | 1833 3492577 | 445.gobmk | 8480 3685838 | 444.namd | 616 12943554 | 471.omnetpp | 422 20068605 | 464.h264ref | 8593 23849576 | 400.perlbench | 99316 37779455 | 447.dealII | 11204 186008992 | 403.gcc | 404828 I am finding these results weird because I was expecting a...

[CodeGen] CodeSize - TailMerging and BlockPlacement

2016 Mar 29

2

...benchmarks as shown below. I checked the binaries and did not find any increase of unwanted instructions. The change does not hurt any benchmark with noticeable regression and sometimes results in small improvement (1%-3%). 473.astar -7 401.bzip2 -110 403.gcc -13,006 445.gobmk -1,716 464.h264ref -684 456.hmmer -391 462.libquantum -4 429.mcf -4 471.omnetpp -1,980 400.perlbench -4,176 458.sjeng -338 450.soplex -395 483.xalancbmk -4,183 447.dealII -186 433.milc -34 444.namd -104 453.povray -1,...

Saving Compile Time in InstCombine

2017 Mar 18

4

...aelsmacmini.local/perf/v4/nts/2/graph?test.14=2> -1.64% >> 54.0801 53.1930 - >> External/SPEC/CINT2006/400.perlbench/400.perlbench >> <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.7=2> -1.25% >> 19.1481 18.9091 - >> External/SPEC/CINT2006/445.gobmk/445.gobmk >> <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.15=2> -1.01% >> 15.2819 15.1274 - >> >> >> >> Do such changes make sense? The patch doesn't change O3, but it does >> change Os and potentially can change performance there...

Saving Compile Time in InstCombine

2017 Mar 20

2

....gcc/403.gcc <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.14=2> -1.64% 54.0801 53.1930 - >>> External/SPEC/CINT2006/400.perlbench/400.perlbench <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.7=2> -1.25% 19.1481 18.9091 - >>> External/SPEC/CINT2006/445.gobmk/445.gobmk <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.15=2> -1.01% 15.2819 15.1274 - >>> >>> >>> Do such changes make sense? The patch doesn't change O3, but it does change Os and potentially can change performance there (though I didn't see any...

[LLVMdev] Measurements of the new inlinehint attribute

2010 Feb 15

0

...h/400.perlbench 0.33% 0.40% 35.88% -2.45% SPEC/CINT2006/401.bzip2/401.bzip2 0.00% -0.94% 69.38% -0.94% SPEC/CINT2006/403.gcc/403.gcc 0.76% 0.00% 48.35% 1.20% SPEC/CINT2006/429.mcf/429.mcf 0.00% -1.78% 11.88% 0.61% SPEC/CINT2006/445.gobmk/445.gobmk 0.02% 0.00% 13.86% 0.00% SPEC/CINT2006/456.hmmer/456.hmmer 0.17% 1.72% 28.38% 1.72% SPEC/CINT2006/458.sjeng/458.sjeng 0.19% 1.35% 8.97% 6.05% SPEC/CINT2006/462.libquantum/462.libquantum 1.08% -20.22% 146.24% -7.26% SPEC/CINT2006/4...

Saving Compile Time in InstCombine

2017 Mar 21

2

....gcc/403.gcc <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.14=2> -1.64% 54.0801 53.1930 - >>> External/SPEC/CINT2006/400.perlbench/400.perlbench <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.7=2> -1.25% 19.1481 18.9091 - >>> External/SPEC/CINT2006/445.gobmk/445.gobmk <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.15=2> -1.01% 15.2819 15.1274 - >>> >>> >>> Do such changes make sense? The patch doesn't change O3, but it does change Os and potentially can change performance there (though I didn't see any...

[LLVMdev] LLVM's Pre-allocation Scheduler Tested against a Branch-and-Bound Scheduler

2012 Sep 29

7

...ter Science Princess Sumaya University for Technology Amman, Jordan Attachments inlined: Rough Latencies Benchmark Branch-and-Bound LLVM SPEC Score SPEC Score % Score Difference 400.perlbench 21.2 20.2 4.95% 401.bzip2 13.9 13.6 2.21% 403.gcc 19.5 19.8 -1.52% 429.mcf 20.5 20.5 0.00% 445.gobmk 18.6 18.6 0.00% 456.hmmer 11.1 11.1 0.00% 458.sjeng 19.3 19.3 0.00% 462.libquantum 39.5 39.5 0.00% 464.h264ref 28.5 28.5 0.00% 471.omnetpp 15.6 15.6 0.00% 473.astar 13 13 0.00% 483.xalancbmk 21.9 21.9 0.00% GEOMEAN 19.0929865 19.00588287 0.46% 410.bwaves 15.2 15.2 0.00% 416.gamess CE...

(RFC) Encoding code duplication factor in discriminator

2016 Oct 27

2

...ing and loop vectorization. The debug_line size overhead for "-O2 -g1" binary of speccpu C/C++ benchmarks: 433.milc 23.59% 444.namd 6.25% 447.dealII 8.43% 450.soplex 2.41% 453.povray 5.40% 470.lbm 0.00% 482.sphinx3 7.10% 400.perlbench 2.77% 401.bzip2 9.62% 403.gcc 2.67% 429.mcf 9.54% 445.gobmk 7.40% 456.hmmer 9.79% 458.sjeng 9.98% 462.libquantum 10.90% 464.h264ref 30.21% 471.omnetpp 0.52% 473.astar 5.67% 483.xalancbmk 1.46% mean 7.86% Dehao On Thu, Oct 27, 2016 at 11:55 AM, Xinliang David Li <davidxl at google.com> wrote: > Do you have an estimate of the debug_line size increa...

[RFC] Switching to MemorySSA-backed Dead Store Elimination (aka cross-bb DSE)

2020 Aug 18

7

...ram legacy mssa. diff test-suite...-typeset/consumer-typeset.test 186.00 1815.00 875.8% test-suite...lications/sqlite3/sqlite3.test 29.00 167.00 475.9% test-suite...T2006/445.gobmk/445.gobmk.test 19.00 88.00 363.2% test-suite.../Applications/SPASS/SPASS.test 49.00 155.00 216.3% test-suite...lications/ClamAV/clamscan.test 72.00 227.00 215.3% test-suite.../Benchmarks/nbench/nbench.test 30.00 92.00 20...

Saving Compile Time in InstCombine

2017 Mar 22

3

...tp://michaelsmacmini.local/perf/v4/nts/2/graph?test.14=2> -1.64% >> 54.0801 53.1930 - >> External/SPEC/CINT2006/400.perlbench/400.perlbench >> <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.7=2> -1.25% >> 19.1481 18.9091 - >> External/SPEC/CINT2006/445.gobmk/445.gobmk >> <http://michaelsmacmini.local/perf/v4/nts/2/graph?test.15=2> -1.01% >> 15.2819 15.1274 - >> >> >> Do such changes make sense? The patch doesn't change O3, but it does >> change Os and potentially can change performance there (though I didn...

[LLVMdev] LLVM's Pre-allocation Scheduler Tested against a Branch-and-Bound Scheduler

2012 Sep 29

0

...> Attachments inlined: > > Rough Latencies > > Benchmark Branch-and-Bound LLVM > > SPEC Score SPEC Score % Score Difference > 400.perlbench 21.2 20.2 4.95% > 401.bzip2 13.9 13.6 2.21% > 403.gcc 19.5 19.8 -1.52% > 429.mcf 20.5 20.5 0.00% > 445.gobmk 18.6 18.6 0.00% > 456.hmmer 11.1 11.1 0.00% > 458.sjeng 19.3 19.3 0.00% > 462.libquantum 39.5 39.5 0.00% > 464.h264ref 28.5 28.5 0.00% > 471.omnetpp 15.6 15.6 0.00% > 473.astar 13 13 0.00% > 483.xalancbmk 21.9 21.9 0.00% > GEOMEAN 19.0929865 19.0058...

(RFC) Encoding code duplication factor in discriminator

2016 Oct 27

0

...2 -g1" binary of speccpu > C/C++ benchmarks: > > 433.milc 23.59% > 444.namd 6.25% > 447.dealII 8.43% > 450.soplex 2.41% > 453.povray 5.40% > 470.lbm 0.00% > 482.sphinx3 7.10% > 400.perlbench 2.77% > 401.bzip2 9.62% > 403.gcc 2.67% > 429.mcf 9.54% > 445.gobmk 7.40% > 456.hmmer 9.79% > 458.sjeng 9.98% > 462.libquantum 10.90% > 464.h264ref 30.21% > 471.omnetpp 0.52% > 473.astar 5.67% > 483.xalancbmk 1.46% > mean 7.86% > Dehao > > On Thu, Oct 27, 2016 at 11:55 AM, Xinliang David Li <davidxl at google.com> > wrote:...
