Perhaps you noticed that LLVM gained a new optimizing register allocator
yesterday (r130568). Linear scan is going away, and RAGreedy is the new default
for optimizing builds.
Hopefully, you noticed because your binaries were suddenly 2% smaller and 10%
faster*. Some noticed because LLVM started crashing or miscompiling their code.
Greedy replaces a fairly big chunk of the code generator, so there will be some
bugs. Please file bug reports!
Linear scan will stick around for a limited time. It can be enabled with
'clang -mllvm -regalloc=linearscan' and 'llc -regalloc=linearscan'. If you think
you found a bug in the new allocator, please verify that the problem goes away
when switching back to linear scan.
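One way to script that check is to compile the same input once per allocator and compare the results. The sketch below only builds the two llc command lines. The helper name `llc_commands` and the output file naming are made up for illustration, and it assumes `-regalloc=greedy` is the spelling that selects the new allocator explicitly.

```python
def llc_commands(bitcode_path):
    """Return one llc command line per register allocator for the same input."""
    commands = {}
    for allocator in ("greedy", "linearscan"):
        commands[allocator] = [
            "llc",
            f"-regalloc={allocator}",
            bitcode_path,
            "-o", f"{bitcode_path}.{allocator}.s",  # hypothetical output name
        ]
    return commands
```

Each list can be handed to `subprocess.run(cmd, check=True)`; if only the greedy build crashes or produces a misbehaving program, that is worth a bug report.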
I would also like to hear about instances where greedy produces obviously silly
code, even if it is technically correct.
Share and Enjoy!
*) Individual results may vary.
Greedy Register Allocator
=========================
The new register allocator is designed to be more flexible than linear scan. It
still uses the LiveInterval data structure for interference checking, but it can
allocate live ranges in any order.
The most important new features are:
- Allow code editing on the fly. Currently, spill code and copies are inserted
  immediately, but other code changes are also possible.
- Flexible order of allocation and eviction. The ordering can be used to
  fine-tune the register allocator's behavior. Register assignments can be
  undone without backtracking.
- Live range splitting. Global and local live ranges are split by two different
  algorithms, both guided by interference from already allocated registers.
- Prefer cheap registers for busy live ranges. The x86-64 and ARM thumb2 targets
  have registers that are more expensive to encode in instructions. Those
  registers are used for the less busy live ranges, which improves code size.
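To make "allocate in any order" and "undo assignments without backtracking" concrete, here is a toy model in Python. It is not RAGreedy's actual code: the `LiveRange` class, the longest-first queue order, and the rule "evict only strictly lighter ranges" are simplifying assumptions, and where a real allocator would try live range splitting this sketch simply spills.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class LiveRange:
    name: str
    start: int      # first instruction index covered (inclusive)
    end: int        # one past the last instruction index covered
    weight: float   # spill weight: higher means busier, costlier to spill

    def overlaps(self, other):
        return self.start < other.end and other.start < self.end


def greedy_allocate(ranges, num_regs):
    """Toy allocator. Returns ({range name: register}, [spilled names])."""
    in_reg = {reg: [] for reg in range(num_regs)}
    spilled = []
    # Assumed ordering: longest ranges first (heapq is a min-heap, so negate).
    queue = [(lr.start - lr.end, lr.name, lr) for lr in ranges]
    heapq.heapify(queue)

    while queue:
        _, _, lr = heapq.heappop(queue)
        chosen = None
        # First choice: a register with no interfering live range.
        for reg in range(num_regs):
            if not any(lr.overlaps(other) for other in in_reg[reg]):
                chosen = reg
                break
        if chosen is None:
            # No free register: evict, but only strictly lighter interference.
            for reg in range(num_regs):
                victims = [o for o in in_reg[reg] if lr.overlaps(o)]
                if all(v.weight < lr.weight for v in victims):
                    for v in victims:
                        in_reg[reg].remove(v)
                        # Evicted ranges go back on the queue for another try.
                        heapq.heappush(queue, (v.start - v.end, v.name, v))
                    chosen = reg
                    break
        if chosen is None:
            spilled.append(lr.name)  # stand-in for splitting or spilling
        else:
            in_reg[chosen].append(lr)

    assignment = {lr.name: reg for reg, lrs in in_reg.items() for lr in lrs}
    return assignment, spilled


ranges = [
    LiveRange("a", 0, 10, weight=1.0),
    LiveRange("b", 0, 4, weight=5.0),
    LiveRange("c", 2, 8, weight=3.0),
]
assignment, spilled = greedy_allocate(ranges, num_regs=2)
```

With two registers, `a` is assigned first because it is longest, then loses its register when the heavier `b` arrives, and ends up spilled. Nothing is replayed: the evicted range is just put back on the queue.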
Code Size
=========
The new allocator can improve code size because live range splitting causes less
spill code to be inserted, and on x86-64 and thumb2, expensive registers are
used less. It can increase code size by inserting more register copies in order
to avoid a spill.
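The effect of register choice on code size can be illustrated with a small cost model. On x86-64, using r8-r15 requires a REX prefix byte on instructions that would not otherwise need one. Everything else below is an assumption for illustration: the function names, the use counts, the flat one-byte penalty per use, and the pretence that all three values are live at once so each needs its own register.

```python
def assign_cheap_first(use_counts, cheap_regs, expensive_regs):
    """Hand out cheap registers to the most-used values first."""
    by_busyness = sorted(use_counts, key=lambda v: (-use_counts[v], v))
    return dict(zip(by_busyness, list(cheap_regs) + list(expensive_regs)))


def extra_encoding_bytes(assignment, use_counts, expensive_regs, penalty=1):
    """Assumed cost: `penalty` extra bytes per use of an expensive register."""
    return sum(
        use_counts[value] * penalty
        for value, reg in assignment.items()
        if reg in expensive_regs
    )


# Hypothetical use counts for three simultaneously live values.
use_counts = {"i": 12, "p": 7, "t": 1}
cheap, expensive = ["eax", "ecx"], ["r8d"]

smart = assign_cheap_first(use_counts, cheap, expensive)
naive = {"t": "eax", "p": "ecx", "i": "r8d"}  # busiest value in the costly register

smart_cost = extra_encoding_bytes(smart, use_counts, expensive)
naive_cost = extra_encoding_bytes(naive, use_counts, expensive)
```

In this made-up case, putting the rarely used `t` in `r8d` costs 1 extra byte, while putting the busy `i` there costs 12.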
On the nightly test suite, we see these changes in total code size:
i386: -1.2%
x86-64: -1.6%
armv7: -2.3%
Using clang to build itself on x86-64, we get these __text segments for
Release+Debug/bin/clang:
LinScan: 15750170
Greedy: 15486090
Difference: -1.7%
More complete code size data is below; there is a lot of variation.
Performance
===========
Live range splitting improves the performance of compiled code by eliminating
spills, or by moving spills out of loops. Live ranges that cross a function call
and are also used in a hot loop get split so there are no spills in the loop.
Long, complicated basic blocks benefit from local live range splitting.
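A back-of-the-envelope model shows why moving spill code out of a loop matters. This is only a sketch: the `Block` class, the block frequencies, and both cost functions are invented for illustration, and it assumes no callee-saved register is available, so the value has to be in memory across the call.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    freq: float     # assumed execution count of the block
    uses: int       # uses of the value inside the block
    has_call: bool  # a call here forces the value out of its register


def cost_spill_everywhere(blocks):
    """Value lives on the stack: one memory access per executed use."""
    return sum(b.freq * b.uses for b in blocks)


def cost_split_around_calls(blocks):
    """Value stays in a register except in blocks that contain a call.

    Each call block pays one store before and one reload after the call,
    plus one memory access for every use inside that block.
    """
    return sum(b.freq * (2 + b.uses) for b in blocks if b.has_call)


# Hypothetical function: a call in a cold block, then a hot loop.
blocks = [
    Block("entry", freq=1, uses=1, has_call=False),
    Block("setup", freq=1, uses=0, has_call=True),
    Block("loop", freq=1000, uses=3, has_call=False),
]

unsplit = cost_spill_everywhere(blocks)
split = cost_split_around_calls(blocks)
```

Under these assumptions the unsplit live range costs 3001 memory accesses and the split one costs 2, all of them outside the loop.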
The register-starved i386 target benefits the most. This is the change in
execution time for the SPEC benchmarks that change by more than 3% (minus means
faster, plus slower):
Targeting i386:
-19.3% 164.gzip
-12.5% 433.milc
-8.8% 473.astar
-7.4% 401.bzip2
-6.4% 183.equake
-4.9% 456.hmmer
-4.6% 186.crafty
-4.6% 188.ammp
-4.1% 403.gcc
-4.0% 256.bzip2
-3.2% 197.parser
-3.1% 175.vpr
-3.0% 464.h264ref
+6.7% 177.mesa
With more registers and out-of-order execution hiding the cost of spilling,
x86-64 is more mixed. I suspect this architecture is more sensitive to code
layout issues than to register allocation:
Targeting x86-64:
-6.4% 464.h264ref
-6.1% 256.bzip2
-5.2% 183.equake
-4.8% 447.dealII
-3.9% 400.perlbench
-3.5% 401.bzip2
-3.3% 255.vortex
+3.8% 186.crafty
+5.0% 462.libquantum
+8.0% 471.omnetpp
Finally, armv7/thumb2 running on a Cortex-A9 CPU does quite well:
Targeting armv7:
-6.2% 447.dealII
-4.4% 183.equake
-4.1% 462.libquantum
-3.5% 401.bzip2
Clang builds llvm+clang about 0.5% faster when it was built with the greedy
register allocator.
More data below.
Compile Time
============
Linear scan spends about 50% of its compile time in VirtRegRewriter, unfolding
stack accesses that it folded when spilling. The new allocator uses live range
splitting instead, so it can use a much faster trivial rewriter that never
unfolds memory accesses.
This means that the new allocator is faster than linear scan when compiling
small functions (such as Objective-C code), but global live range splitting
becomes expensive when compiling large functions. When compiling 403.gcc, the
new allocator uses 15-40% more time than linear scan, depending on the target.
These times are for a PIC build of 403.gcc, timing just the register allocator
pass:
         LinScan   Greedy   LLC Total
armv7    1.77s     2.47s    +3.2%
x86-64   1.91s     2.38s    +1.9%
i386     2.42s     2.82s    +0.9%
These numbers translate to a 1-3% increase in llc total code generation time.
Linear scan is faster on ARM because that target doesn't fold memory
operands. That means there is nothing to unfold for the expensive
VirtRegRewriter. ARM's use of base registers in functions with stack frames
also means more work for global live range splitting.
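The 15-40% range quoted for 403.gcc can be recomputed from the register allocator pass times above. The snippet is plain arithmetic; only the helper name `percent_change` is made up.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Register allocator pass times in seconds: (linear scan, greedy).
regalloc_seconds = {
    "armv7": (1.77, 2.47),
    "x86-64": (1.91, 2.38),
    "i386": (2.42, 2.82),
}

slowdown = {
    target: round(percent_change(linscan, greedy), 1)
    for target, (linscan, greedy) in regalloc_seconds.items()
}
```

That gives roughly +39.5% for armv7, +24.6% for x86-64, and +16.5% for i386.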
A more realistic example is timing clang building llvm+clang for x86-64:
LinScan: 1991.20s
Greedy: 1991.93s
Difference: none
The greedy register allocator speeds up clang a bit on x86-64, so the new
allocator is actually faster than linear scan when self-hosting:
Greedy hosting greedy: 1981.56s
Difference: -0.5% (greedy faster)
Timing llc on Sketch.bc created from all the Objective-C source files in the
Xcode Sketch sample project, we get:
LinScan: 0.733s
Greedy: 0.724s
Difference: -1.23% (greedy faster)
Code Size Data
==============
These are all the nightly test suite tests that produce a __text segment larger
than 8K and a code size difference larger than 1%. Negative numbers mean
smaller code, positive larger.
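The selection rule can be written as a small filter. This is a sketch, not the script that produced the lists: the function name and the sample sizes are invented, "8K" is taken to mean 8192 bytes, and the size test is applied to the larger of the two builds.

```python
def significant_size_changes(results, min_text_bytes=8 * 1024, min_change_pct=1.0):
    """Keep tests with a large enough __text segment and size difference.

    `results` maps a test name to (linscan_bytes, greedy_bytes).
    Returns (percent_change, name) pairs, most shrunk first.
    """
    kept = []
    for name, (old, new) in results.items():
        change = (new - old) / old * 100.0
        if max(old, new) > min_text_bytes and abs(change) > min_change_pct:
            kept.append((round(change, 1), name))
    return sorted(kept)


# Made-up sizes in bytes, for illustration only.
sample = {
    "big_shrink": (20000, 19000),  # -5.0%, kept
    "tiny": (4000, 3000),          # below the 8K threshold, dropped
    "flat": (50000, 50100),        # +0.2%, dropped
    "grow": (10000, 10300),        # +3.0%, kept
}
report = significant_size_changes(sample)
```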
Targeting i386 PIC -O2:
-7.3% MultiSource/Benchmarks/MiBench/security-rijndael/Output/security-rijndael
-7.2% MultiSource/Benchmarks/Prolangs-C/simulator/Output/simulator
-5.3% MultiSource/Applications/hexxagon/Output/hexxagon
-5.2% MultiSource/Applications/SIBsim4/Output/SIBsim4
-4.8% MultiSource/Benchmarks/MallocBench/cfrac/Output/cfrac
-4.1% MultiSource/Applications/d/Output/make_dparser
-4.0% MultiSource/Applications/JM/ldecod/Output/ldecod
-3.9% External/Nurbs/Output/nurbs
-3.7% MultiSource/Benchmarks/Ptrdist/bc/Output/bc
-3.4% MultiSource/Applications/lemon/Output/lemon
-3.4% SingleSource/Benchmarks/Misc-C++/Output/bigfib
-3.4% External/SPEC/CFP2000/183.equake/Output/183.equake
-3.2% MultiSource/Applications/hbd/Output/hbd
-3.1% MultiSource/Applications/minisat/Output/minisat
-3.0% MultiSource/Benchmarks/tramp3d-v4/Output/tramp3d-v4
-2.9% External/skidmarks10/Output/skidmarks
-2.9% MultiSource/Benchmarks/ASCI_Purple/SMG2000/Output/smg2000
-2.8% External/SPEC/CINT95/132.ijpeg/Output/132.ijpeg
-2.8% MultiSource/Benchmarks/MiBench/telecomm-gsm/Output/telecomm-gsm
-2.8% MultiSource/Benchmarks/mediabench/gsm/toast/Output/toast
-2.7% External/SPEC/CINT95/134.perl/Output/134.perl
-2.6% MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/Output/cjpeg
-2.6% MultiSource/Applications/oggenc/Output/oggenc
-2.5% MultiSource/Benchmarks/VersaBench/dbms/Output/dbms
-2.5% External/SPEC/CINT2006/473.astar/Output/473.astar
-2.5% SingleSource/Benchmarks/Adobe-C++/Output/simple_types_loop_invariant
-2.5% MultiSource/Benchmarks/Prolangs-C/loader/Output/loader
-2.4% MultiSource/Benchmarks/sim/Output/sim
-2.4% External/SPEC/CFP2006/433.milc/Output/433.milc
-2.4% External/SPEC/CINT95/124.m88ksim/Output/124.m88ksim
-2.4% MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/Output/timberwolfmc
-2.4% MultiSource/Applications/treecc/Output/treecc
-2.3% MultiSource/Benchmarks/MiBench/consumer-jpeg/Output/consumer-jpeg
-2.2% External/SPEC/CFP2006/450.soplex/Output/450.soplex
-2.2% External/SPEC/CINT2000/186.crafty/Output/186.crafty
-2.2% SingleSource/Benchmarks/Misc-C++/Output/stepanov_container
-2.1% External/SPEC/CINT2006/456.hmmer/Output/456.hmmer
-2.1% External/SPEC/CINT2000/254.gap/Output/254.gap
-2.0% MultiSource/Applications/JM/lencod/Output/lencod
-2.0% MultiSource/Benchmarks/FreeBench/pifft/Output/pifft
-1.9% External/SPEC/CFP2006/447.dealII/Output/447.dealII
-1.9% External/SPEC/CINT2006/401.bzip2/Output/401.bzip2
-1.9% External/SPEC/CINT2006/464.h264ref/Output/464.h264ref
-1.9% MultiSource/Applications/SPASS/Output/SPASS
-1.8% MultiSource/Benchmarks/McCat/18-imp/Output/imp
-1.8% External/SPEC/CFP2006/444.namd/Output/444.namd
-1.8% MultiSource/Benchmarks/Prolangs-C/unix-smail/Output/unix-smail
-1.8% MultiSource/Benchmarks/MallocBench/espresso/Output/espresso
-1.7% MultiSource/Benchmarks/MallocBench/gs/Output/gs
-1.7% MultiSource/Applications/sqlite3/Output/sqlite3
-1.7% External/SPEC/CINT2000/255.vortex/Output/255.vortex
-1.7% External/SPEC/CINT95/147.vortex/Output/147.vortex
-1.6% External/SPEC/CINT2006/483.xalancbmk/Output/483.xalancbmk
-1.5% External/SPEC/CINT2000/256.bzip2/Output/256.bzip2
-1.5% External/SPEC/CINT2000/252.eon/Output/252.eon
-1.5% MultiSource/Benchmarks/FreeBench/fourinarow/Output/fourinarow
-1.5% MultiSource/Benchmarks/mafft/Output/pairlocalalign
-1.5% MultiSource/Applications/lua/Output/lua
-1.4% MultiSource/Applications/siod/Output/siod
-1.4% External/SPEC/CFP2000/177.mesa/Output/177.mesa
-1.3% MultiSource/Benchmarks/Bullet/Output/bullet
-1.3% MultiSource/Applications/spiff/Output/spiff
-1.3% External/SPEC/CINT2000/175.vpr/Output/175.vpr
-1.2% Total
-1.1% External/SPEC/CINT95/130.li/Output/130.li
-1.1% External/SPEC/CINT2006/403.gcc/Output/403.gcc
-1.1% External/SPEC/CINT2006/429.mcf/Output/429.mcf
-1.1% MultiSource/Applications/lambda-0.1.3/Output/lambda
-1.0% MultiSource/Benchmarks/Trimaran/enc-3des/Output/enc-3des
-1.0% MultiSource/Benchmarks/Ptrdist/yacr2/Output/yacr2
+1.1% MultiSource/Benchmarks/MiBench/office-ispell/Output/office-ispell
+1.2% SingleSource/Benchmarks/Misc-C++-EH/Output/spirit
+1.4% SingleSource/Benchmarks/Misc/Output/oourafft
+1.4% MultiSource/Benchmarks/Prolangs-C/assembler/Output/assembler
+2.1% MultiSource/Benchmarks/Prolangs-C/unix-tbl/Output/unix-tbl
+2.3% MultiSource/Benchmarks/Prolangs-C/gnugo/Output/gnugo
+2.4% External/SPEC/CFP2000/179.art/Output/179.art
+2.5% SingleSource/Benchmarks/Adobe-C++/Output/functionobjects
+3.3% SingleSource/Benchmarks/Adobe-C++/Output/simple_types_constant_folding
+3.3% External/SPEC/CINT2006/471.omnetpp/Output/471.omnetpp
+7.1% MultiSource/Benchmarks/Prolangs-C/cdecl/Output/cdecl
Targeting x86-64 PIC -O2:
-5.3% MultiSource/Benchmarks/ASCI_Purple/SMG2000/Output/smg2000
-5.1% MultiSource/Applications/SIBsim4/Output/SIBsim4
-4.7% External/SPEC/CINT2006/401.bzip2/Output/401.bzip2
-4.4% External/skidmarks10/Output/skidmarks
-3.9% MultiSource/Applications/JM/ldecod/Output/ldecod
-3.9% MultiSource/Benchmarks/MallocBench/cfrac/Output/cfrac
-3.6% MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/Output/timberwolfmc
-3.2% MultiSource/Benchmarks/MiBench/security-rijndael/Output/security-rijndael
-3.0% MultiSource/Benchmarks/Prolangs-C/simulator/Output/simulator
-2.9% External/SPEC/CINT2000/252.eon/Output/252.eon
-2.9% MultiSource/Benchmarks/Prolangs-C/cdecl/Output/cdecl
-2.8% MultiSource/Benchmarks/MiBench/telecomm-gsm/Output/telecomm-gsm
-2.8% MultiSource/Benchmarks/mediabench/gsm/toast/Output/toast
-2.8% MultiSource/Benchmarks/Ptrdist/bc/Output/bc
-2.7% SingleSource/Benchmarks/Adobe-C++/Output/functionobjects
-2.6% MultiSource/Applications/JM/lencod/Output/lencod
-2.6% MultiSource/Applications/Burg/Output/burg
-2.6% MultiSource/Benchmarks/tramp3d-v4/Output/tramp3d-v4
-2.5% MultiSource/Applications/d/Output/make_dparser
-2.5% External/SPEC/CINT95/124.m88ksim/Output/124.m88ksim
-2.5% External/SPEC/CINT2006/458.sjeng/Output/458.sjeng
-2.4% MultiSource/Benchmarks/Olden/bh/Output/bh
-2.4% MultiSource/Benchmarks/MallocBench/espresso/Output/espresso
-2.4% External/SPEC/CINT2000/300.twolf/Output/300.twolf
-2.4% MultiSource/Benchmarks/Prolangs-C/unix-smail/Output/unix-smail
-2.3% MultiSource/Applications/kimwitu++/Output/kc
-2.3% MultiSource/Applications/hbd/Output/hbd
-2.3% MultiSource/Applications/spiff/Output/spiff
-2.3% MultiSource/Applications/hexxagon/Output/hexxagon
-2.3% External/SPEC/CINT95/134.perl/Output/134.perl
-2.3% External/SPEC/CINT2000/256.bzip2/Output/256.bzip2
-2.2% MultiSource/Benchmarks/Prolangs-C/loader/Output/loader
-2.2% External/SPEC/CFP2000/183.equake/Output/183.equake
-2.2% External/SPEC/CINT2006/464.h264ref/Output/464.h264ref
-2.2% MultiSource/Applications/ClamAV/Output/clamscan
-2.1% External/SPEC/CFP2000/179.art/Output/179.art
-2.1% External/SPEC/CINT2006/471.omnetpp/Output/471.omnetpp
-2.1% MultiSource/Benchmarks/PAQ8p/Output/paq8p
-2.1% MultiSource/Benchmarks/Ptrdist/yacr2/Output/yacr2
-2.1% External/Povray/Output/povray
-2.0% External/SPEC/CFP2006/470.lbm/Output/470.lbm
-2.0% External/SPEC/CINT2006/429.mcf/Output/429.mcf
-2.0% SingleSource/Benchmarks/Misc-C++/Output/stepanov_container
-2.0% External/SPEC/CINT2006/456.hmmer/Output/456.hmmer
-2.0% External/SPEC/CINT2000/164.gzip/Output/164.gzip
-2.0% MultiSource/Benchmarks/mafft/Output/pairlocalalign
-2.0% MultiSource/Benchmarks/Prolangs-C/assembler/Output/assembler
-2.0% MultiSource/Benchmarks/MiBench/office-ispell/Output/office-ispell
-2.0% External/SPEC/CFP2006/433.milc/Output/433.milc
-1.9% MultiSource/Benchmarks/MiBench/consumer-typeset/Output/consumer-typeset
-1.9% External/SPEC/CFP2000/177.mesa/Output/177.mesa
-1.8% External/SPEC/CFP2006/447.dealII/Output/447.dealII
-1.8% External/SPEC/CFP2000/188.ammp/Output/188.ammp
-1.8% External/SPEC/CINT2000/186.crafty/Output/186.crafty
-1.7% MultiSource/Benchmarks/MiBench/consumer-lame/Output/consumer-lame
-1.7% External/SPEC/CINT95/099.go/Output/099.go
-1.7% MultiSource/Applications/SPASS/Output/SPASS
-1.7% SingleSource/Benchmarks/Adobe-C++/Output/simple_types_constant_folding
-1.7% MultiSource/Benchmarks/FreeBench/fourinarow/Output/fourinarow
-1.6% MultiSource/Applications/oggenc/Output/oggenc
-1.6% Total
-1.6% External/SPEC/CINT2006/403.gcc/Output/403.gcc
-1.6% External/SPEC/CINT95/130.li/Output/130.li
-1.6% MultiSource/Benchmarks/VersaBench/dbms/Output/dbms
-1.6% MultiSource/Applications/minisat/Output/minisat
-1.5% MultiSource/Benchmarks/Prolangs-C/agrep/Output/agrep
-1.5% MultiSource/Benchmarks/Prolangs-C/bison/Output/mybison
-1.4% MultiSource/Benchmarks/Trimaran/enc-3des/Output/enc-3des
-1.4% MultiSource/Applications/sqlite3/Output/sqlite3
-1.3% External/SPEC/CFP2006/450.soplex/Output/450.soplex
-1.3% External/SPEC/CINT2006/400.perlbench/Output/400.perlbench
-1.2% MultiSource/Benchmarks/MiBench/consumer-jpeg/Output/consumer-jpeg
-1.2% MultiSource/Benchmarks/MallocBench/gs/Output/gs
-1.2% MultiSource/Benchmarks/MiBench/security-blowfish/Output/security-blowfish
-1.2% MultiSource/Benchmarks/Prolangs-C/football/Output/football
-1.2% External/SPEC/CINT2000/197.parser/Output/197.parser
-1.2% MultiSource/Applications/lemon/Output/lemon
-1.2% MultiSource/Benchmarks/MiBench/automotive-susan/Output/automotive-susan
-1.2% External/SPEC/CINT2000/254.gap/Output/254.gap
-1.1% External/SPEC/CINT2000/253.perlbmk/Output/253.perlbmk
-1.0% External/SPEC/CINT95/132.ijpeg/Output/132.ijpeg
-1.0% MultiSource/Benchmarks/Bullet/Output/bullet
+1.2% MultiSource/Benchmarks/FreeBench/pifft/Output/pifft
+1.6% MultiSource/Benchmarks/McCat/18-imp/Output/imp
+1.7% SingleSource/Benchmarks/Adobe-C++/Output/loop_unroll
+3.2% SingleSource/Benchmarks/Misc/Output/oourafft
Targeting thumbv7 PIC -O2:
-6.8% MultiSource/Benchmarks/Ptrdist/yacr2/Output/yacr2
-5.8% SingleSource/Benchmarks/Adobe-C++/Output/simple_types_constant_folding
-5.8% MultiSource/Benchmarks/Ptrdist/bc/Output/bc
-5.6% External/SPEC/CINT2000/256.bzip2/Output/256.bzip2
-5.6% MultiSource/Applications/Burg/Output/burg
-5.5% MultiSource/Benchmarks/Prolangs-C/assembler/Output/assembler
-5.4% External/SPEC/CINT2006/401.bzip2/Output/401.bzip2
-5.3% External/SPEC/CINT2000/186.crafty/Output/186.crafty
-5.3% MultiSource/Benchmarks/FreeBench/fourinarow/Output/fourinarow
-5.2% MultiSource/Benchmarks/PAQ8p/Output/paq8p
-5.2% MultiSource/Benchmarks/MiBench/automotive-susan/Output/automotive-susan
-5.2% MultiSource/Benchmarks/ASCI_Purple/SMG2000/Output/smg2000
-5.1% External/SPEC/CINT2006/464.h264ref/Output/464.h264ref
-5.1% MultiSource/Benchmarks/MiBench/consumer-typeset/Output/consumer-typeset
-5.0% External/SPEC/CINT95/099.go/Output/099.go
-4.5% External/SPEC/CINT2000/300.twolf/Output/300.twolf
-4.1% MultiSource/Benchmarks/MiBench/security-rijndael/Output/security-rijndael
-4.1% MultiSource/Benchmarks/Prolangs-C/gnugo/Output/gnugo
-4.0% MultiSource/Benchmarks/sim/Output/sim
-3.9% MultiSource/Benchmarks/MiBench/consumer-lame/Output/consumer-lame
-3.6% MultiSource/Applications/hexxagon/Output/hexxagon
-3.6% MultiSource/Benchmarks/MiBench/telecomm-gsm/Output/telecomm-gsm
-3.6% MultiSource/Benchmarks/mediabench/gsm/toast/Output/toast
-3.5% External/SPEC/CFP2000/188.ammp/Output/188.ammp
-3.5% MultiSource/Applications/spiff/Output/spiff
-3.4% MultiSource/Applications/JM/ldecod/Output/ldecod
-3.3% External/SPEC/CINT2000/164.gzip/Output/164.gzip
-3.2% MultiSource/Benchmarks/Prolangs-C/agrep/Output/agrep
-3.2% MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/Output/timberwolfmc
-3.1% MultiSource/Applications/d/Output/make_dparser
-3.1% MultiSource/Applications/lemon/Output/lemon
-2.8% External/SPEC/CINT95/124.m88ksim/Output/124.m88ksim
-2.8% MultiSource/Applications/JM/lencod/Output/lencod
-2.7% MultiSource/Applications/SIBsim4/Output/SIBsim4
-2.7% External/SPEC/CINT2006/400.perlbench/Output/400.perlbench
-2.6% MultiSource/Applications/kimwitu++/Output/kc
-2.6% MultiSource/Applications/siod/Output/siod
-2.6% External/SPEC/CINT95/130.li/Output/130.li
-2.6% MultiSource/Benchmarks/tramp3d-v4/Output/tramp3d-v4
-2.6% External/SPEC/CINT2000/253.perlbmk/Output/253.perlbmk
-2.4% MultiSource/Applications/SPASS/Output/SPASS
-2.4% External/SPEC/CINT95/134.perl/Output/134.perl
-2.4% MultiSource/Benchmarks/MallocBench/espresso/Output/espresso
-2.3% External/SPEC/CINT2006/471.omnetpp/Output/471.omnetpp
-2.3% External/SPEC/CINT2006/445.gobmk/Output/445.gobmk
-2.3% Total
-2.2% MultiSource/Benchmarks/Prolangs-C/unix-tbl/Output/unix-tbl
-2.2% External/SPEC/CFP2000/179.art/Output/179.art
-2.1% MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/Output/mpeg2decode
-2.1% External/SPEC/CINT2000/175.vpr/Output/175.vpr
-2.1% MultiSource/Benchmarks/Prolangs-C/football/Output/football
-2.0% External/SPEC/CFP2000/177.mesa/Output/177.mesa
-2.0% MultiSource/Benchmarks/VersaBench/dbms/Output/dbms
-1.9% External/SPEC/CINT95/147.vortex/Output/147.vortex
-1.9% External/SPEC/CINT2000/197.parser/Output/197.parser
-1.8% MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/Output/cjpeg
-1.8% External/SPEC/CINT2000/255.vortex/Output/255.vortex
-1.8% External/SPEC/CFP2006/450.soplex/Output/450.soplex
-1.8% MultiSource/Benchmarks/Prolangs-C/unix-smail/Output/unix-smail
-1.7% MultiSource/Applications/hbd/Output/hbd
-1.7% External/SPEC/CINT2000/254.gap/Output/254.gap
-1.6% MultiSource/Benchmarks/Prolangs-C/bison/Output/mybison
-1.5% External/skidmarks10/Output/skidmarks
-1.4% MultiSource/Benchmarks/MallocBench/gs/Output/gs
-1.4% External/SPEC/CINT2006/483.xalancbmk/Output/483.xalancbmk
-1.4% External/SPEC/CFP2006/444.namd/Output/444.namd
-1.4% MultiSource/Applications/ClamAV/Output/clamscan
-1.4% External/SPEC/CINT95/132.ijpeg/Output/132.ijpeg
-1.2% External/SPEC/CFP2006/447.dealII/Output/447.dealII
-1.2% SingleSource/Benchmarks/Adobe-C++/Output/simple_types_loop_invariant
-1.2% MultiSource/Applications/oggenc/Output/oggenc
-1.1% External/SPEC/CINT2006/462.libquantum/Output/462.libquantum
+1.1% MultiSource/Benchmarks/Bullet/Output/bullet
+1.8% MultiSource/Benchmarks/FreeBench/pifft/Output/pifft
+2.2% External/SPEC/CINT2000/252.eon/Output/252.eon
+9.7% MultiSource/Benchmarks/Trimaran/enc-3des/Output/enc-3des
Performance Data
===============
These are all the nightly test suite tests that show a difference in execution
time larger than 20ms and larger than 3%. Negative numbers mean faster,
positive slower.
Targeting i386 PIC -O2:
-23.1% SingleSource/Benchmarks/Shootout-C++/moments
-21.1% SingleSource/Benchmarks/Stanford/Puzzle
-19.3% External/SPEC/CINT2000/164.gzip/164.gzip
-18.7% SingleSource/Benchmarks/Misc/flops
-17.5% MultiSource/Benchmarks/Olden/tsp/tsp
-14.4% MultiSource/Benchmarks/Ptrdist/anagram/anagram
-14.3% MultiSource/Benchmarks/FreeBench/pifft/pifft
-12.5% External/SPEC/CFP2006/433.milc/433.milc
-11.4% MultiSource/Benchmarks/Trimaran/enc-md5/enc-md5
-11.1% MultiSource/Benchmarks/Trimaran/enc-rc4/enc-rc4
-11.0% MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000
-9.0% MultiSource/Applications/minisat/minisat
-8.8% External/SPEC/CINT2006/473.astar/473.astar
-8.7% SingleSource/Benchmarks/Misc/ReedSolomon
-8.6% SingleSource/Benchmarks/BenchmarkGame/nsieve-bits
-8.0% SingleSource/Benchmarks/Misc/salsa20
-7.7% MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk
-7.4% External/SPEC/CINT2006/401.bzip2/401.bzip2
-7.4% SingleSource/Benchmarks/Misc/mandel-2
-7.3% SingleSource/Benchmarks/Shootout-C++/methcall
-6.5% MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame
-6.4% External/SPEC/CFP2000/183.equake/183.equake
-5.6% MultiSource/Applications/lemon/lemon
-5.5% MultiSource/Applications/SIBsim4/SIBsim4
-5.2% MultiSource/Benchmarks/Olden/bisort/bisort
-5.1% MultiSource/Applications/SPASS/SPASS
-4.9% External/SPEC/CINT2006/456.hmmer/456.hmmer
-4.6% SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant
-4.6% External/SPEC/CINT2000/186.crafty/186.crafty
-4.6% External/SPEC/CFP2000/188.ammp/188.ammp
-4.5% SingleSource/Benchmarks/Misc/oourafft
-4.5% MultiSource/Applications/sqlite3/sqlite3
-4.4% MultiSource/Applications/lua/lua
-4.4% MultiSource/Benchmarks/Bullet/bullet
-4.2% MultiSource/Applications/viterbi/viterbi
-4.1% External/SPEC/CINT2006/403.gcc/403.gcc
-4.0% External/SPEC/CINT2000/256.bzip2/256.bzip2
-3.9% MultiSource/Benchmarks/MallocBench/espresso/espresso
-3.8% MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4
-3.5% SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding
-3.4% MultiSource/Applications/spiff/spiff
-3.2% External/SPEC/CINT2000/197.parser/197.parser
-3.1% External/Nurbs/nurbs
-3.1% External/SPEC/CINT2000/175.vpr/175.vpr
-3.0% External/SPEC/CINT2006/464.h264ref/464.h264ref
+3.4% SingleSource/Benchmarks/Shootout-C++/hash2
+3.7% MultiSource/Applications/aha/aha
+3.9% SingleSource/Benchmarks/Shootout-C++/objinst
+4.7% MultiSource/Benchmarks/Ptrdist/yacr2/yacr2
+6.6% MultiSource/Applications/JM/lencod/lencod
+6.7% External/SPEC/CFP2000/177.mesa/177.mesa
+7.4% MultiSource/Benchmarks/Trimaran/enc-pc1/enc-pc1
+8.2% MultiSource/Benchmarks/SciMark2-C/scimark2
+10.7% SingleSource/Benchmarks/Shootout/methcall
+12.5% SingleSource/UnitTests/Vector/SSE/sse.stepfft
+14.6% MultiSource/Benchmarks/BitBench/drop3/drop3
Targeting x86-64 PIC -O2:
-40.0% SingleSource/Benchmarks/Stanford/Queens
-35.7% SingleSource/Benchmarks/Misc/flops
-24.2% SingleSource/UnitTests/Vector/build2
-23.1% MultiSource/Benchmarks/Olden/tsp/tsp
-19.4% MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gsm
-18.2% SingleSource/Benchmarks/Shootout-C++/moments
-13.7% MultiSource/Benchmarks/Trimaran/enc-pc1/enc-pc1
-11.0% MultiSource/Benchmarks/Ptrdist/anagram/anagram
-11.0% SingleSource/Benchmarks/Dhrystone/fldry
-10.8% MultiSource/Benchmarks/Ptrdist/ks/ks
-8.7% SingleSource/Benchmarks/Misc/ReedSolomon
-6.9% MultiSource/Applications/lua/lua
-6.4% External/SPEC/CINT2006/464.h264ref/464.h264ref
-6.1% External/SPEC/CINT2000/256.bzip2/256.bzip2
-5.2% External/SPEC/CFP2000/183.equake/183.equake
-4.8% External/SPEC/CFP2006/447.dealII/447.dealII
-4.1% MultiSource/Benchmarks/BitBench/drop3/drop3
-3.9% External/SPEC/CINT2006/400.perlbench/400.perlbench
-3.7% SingleSource/Benchmarks/BenchmarkGame/Large/fasta
-3.5% External/SPEC/CINT2006/401.bzip2/401.bzip2
-3.3% External/SPEC/CINT2000/255.vortex/255.vortex
-3.2% MultiSource/Applications/JM/lencod/lencod
-3.2% MultiSource/Applications/lambda-0.1.3/lambda
-3.2% MultiSource/Benchmarks/Ptrdist/ft/ft
-3.2% MultiSource/Applications/sqlite3/sqlite3
-3.1% SingleSource/Benchmarks/Misc/fbench
+3.8% External/SPEC/CINT2000/186.crafty/186.crafty
+4.3% MultiSource/Benchmarks/Trimaran/enc-rc4/enc-rc4
+4.3% SingleSource/Benchmarks/BenchmarkGame/nsieve-bits
+4.8% MultiSource/Applications/hexxagon/hexxagon
+5.0% External/SPEC/CINT2006/462.libquantum/462.libquantum
+6.7% MultiSource/Benchmarks/Ptrdist/yacr2/yacr2
+8.0% External/SPEC/CINT2006/471.omnetpp/471.omnetpp
+10.0% SingleSource/Benchmarks/CoyoteBench/huffbench
+12.0% SingleSource/Benchmarks/McGill/chomp
+18.0% SingleSource/Benchmarks/BenchmarkGame/n-body
+45.5% SingleSource/Benchmarks/BenchmarkGame/puzzle
Targeting armv7 -O2, Cortex-A9:
-38.1% MultiSource/Benchmarks/MiBench/security-sha/security-sha
-29.1% MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32
-20.9% SingleSource/Benchmarks/CoyoteBench/huffbench
-17.2% MultiSource/Benchmarks/Trimaran/enc-pc1/enc-pc1
-14.8% SingleSource/Benchmarks/Misc/salsa20
-14.3% SingleSource/UnitTests/Vector/build2
-10.3% MultiSource/Benchmarks/MallocBench/gs/gs
-10.2% SingleSource/Benchmarks/Misc/ReedSolomon
-9.8% MultiSource/Applications/siod/siod
-9.7% MultiSource/Benchmarks/Prolangs-C/gnugo/gnugo
-9.1% MultiSource/Benchmarks/llubenchmark/llu
-9.1% MultiSource/Benchmarks/FreeBench/analyzer/analyzer
-8.9% MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes
-8.8% SingleSource/Benchmarks/Shootout/sieve
-8.6% MultiSource/Benchmarks/MiBench/security-rijndael/security-rijndael
-7.9% SingleSource/Benchmarks/Misc/mandel-2
-6.2% MultiSource/Benchmarks/MiBench/network-dijkstra/network-dijkstra
-6.2% External/SPEC/CFP2006/447.dealII/447.dealII
-5.7% MultiSource/Applications/JM/ldecod/ldecod
-5.1% SingleSource/Benchmarks/Stanford/Treesort
-5.0% MultiSource/Benchmarks/Olden/mst/mst
-4.4% MultiSource/Benchmarks/McCat/18-imp/imp
-4.4% External/SPEC/CFP2000/183.equake/183.equake
-4.3% MultiSource/Benchmarks/MiBench/network-patricia/network-patricia
-4.3% MultiSource/Benchmarks/MiBench/consumer-typeset/consumer-typeset
-4.1% External/SPEC/CINT2006/462.libquantum/462.libquantum
-4.0% MultiSource/Applications/viterbi/viterbi
-3.7% MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame
-3.5% SingleSource/Benchmarks/Stanford/Puzzle
-3.5% External/SPEC/CINT2006/401.bzip2/401.bzip2
-3.5% External/Povray/povray
-3.4% MultiSource/Applications/oggenc/oggenc
-3.4% SingleSource/Benchmarks/CoyoteBench/fftbench
-3.4% SingleSource/Benchmarks/Shootout-C++/hash2
-3.4% MultiSource/Applications/ClamAV/clamscan
+3.0% MultiSource/Applications/hexxagon/hexxagon
+3.7% SingleSource/Benchmarks/Shootout/ary3
+3.8% MultiSource/Benchmarks/BitBench/uudecode/uudecode
+4.0% MultiSource/Benchmarks/sim/sim
+4.9% SingleSource/Benchmarks/BenchmarkGame/nsieve-bits
+5.0% MultiSource/Benchmarks/Olden/em3d/em3d
+5.3% SingleSource/Benchmarks/Misc-C++/Large/ray
+6.1% MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan
+6.5% MultiSource/Benchmarks/McCat/08-main/main
+6.5% MultiSource/Benchmarks/Olden/health/health
+6.9% MultiSource/Benchmarks/Trimaran/enc-md5/enc-md5
+9.1% MultiSource/Benchmarks/BitBench/drop3/drop3
+9.2% MultiSource/Applications/SIBsim4/SIBsim4
+10.0% SingleSource/Benchmarks/Shootout/heapsort
+10.5% MultiSource/Benchmarks/Trimaran/enc-3des/enc-3des
+10.9% SingleSource/Benchmarks/Shootout-C++/heapsort
+11.7% MultiSource/Benchmarks/Ptrdist/bc/bc
+12.0% MultiSource/Benchmarks/McCat/17-bintr/bintr
+55.2% SingleSource/Benchmarks/Shootout/methcall