Jack Howarth
2011-Jun-09 21:16 UTC
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
On Thu, Jun 09, 2011 at 03:44:40PM +0200, Duncan Sands wrote:
> Hi Jack, thanks for doing this.
>
>> Below are the tabulated compile times and executable sizes.
>>
>> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
>> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
>> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
>
> These numbers really surprised me: the GCC code generators must be really slow
> if the entire set of LLVM IR and codegen optimizations takes less time to run
> than GCC codegen (since with -fplugin-arg-dragonegg-enable-gcc-optzns the only
> part of GCC being disabled is codegen, i.e. RTL).  I was assuming that I would
> need to reduce the LLVM optimization level to get decent speed.  Are you sure
> that you built GCC with checking disabled (or --enable-checking=release)?
> Can you please also redo this (along with execution times), adding the option
> -fplugin-arg-dragonegg-llvm-ir-optimize=2.  I expect that to always result in
> a decent compile time win for dragonegg wrt stock gcc-4.5.  If it doesn't have
> a significant impact on execution speed, then I'd be tempted to use the formula
>    LLVM optimization level = (1 + GCC optimization level) / 2
> as the default, i.e. GCC -O3 -> LLVM -O2, GCC -O2 -> LLVM -O1, GCC -O1 -> LLVM
> -O1, GCC -O0 -> LLVM -O0, GCC -O5 -> LLVM -O3.
>
> Best wishes, Duncan.

I get about the same thing with --enable-checking=release applied to gcc-4.5.4...

Compile time (seconds)

Benchmark  A) stock    B) gcc 4.5.4/     C) gcc 4.5.4/
           gcc 4.5.4   dragonegg/optzns  dragonegg

ac            0.86        0.44              0.31
aermod       31.13       25.81             20.94
air           1.74        1.48              0.81
capacita      0.86        0.74              0.44
channel       0.35        0.32              0.23
doduc         3.08        2.63              1.63
fatigue       1.04        1.05              0.89
gas_dyn       0.94        0.94              0.75
induct        3.30        2.52              1.84
linpk         0.33        0.28              0.20
mdbx          1.09        1.02              0.60
nf            0.41        0.40              0.28
protein       1.56        1.28              0.98
rnflow        1.75        1.70              1.24
test_fpu      1.38        1.41              1.05
tfft          0.31        0.28              0.19

mean          3.13        2.64              2.02

I wouldn't put a lot of faith in the compile time measurements because, unlike
the actual benchmark runs, pb05 doesn't repeat the compilations until it has
converged on a low-error measurement of the compilation time.
          Jack

>> Compile time (seconds)
>>
>> Benchmark  A) stock    B) gcc 4.5.4/     C) gcc 4.5.4/
>>            gcc 4.5.4   dragonegg/optzns  dragonegg
>>
>> ac            0.61        1.65              0.32
>> aermod       31.24       25.83             21.02
>> air           1.74        1.49              0.81
>> capacita      0.83        0.80              0.44
>> channel       0.34        0.33              0.25
>> doduc         3.09        2.63              1.63
>> fatigue       1.04        1.08              0.84
>> gas_dyn       0.91        0.95              0.75
>> induct        3.18        2.57              1.73
>> linpk         0.34        0.30              0.21
>> mdbx          1.08        1.01              0.59
>> nf            0.39        0.41              0.28
>> protein       1.55        1.29              0.97
>> rnflow        1.76        1.73              1.26
>> test_fpu      1.38        1.40              1.05
>> tfft          0.31        0.28              0.19
>>
>> mean          3.11        2.73              2.02
>>
>> Executable size (bytes)
>>
>> Benchmark  A) stock    B) gcc 4.5.4/     C) gcc 4.5.4/
>>            gcc 4.5.4   dragonegg/optzns  dragonegg
>>
>> ac           26344       30896             26704
>> aermod     1145924     1043816           1052056
>> air          57404       57700             53532
>> capacita     40864       41008             37064
>> channel      22448       22664             22664
>> doduc       127340      124108            120124
>> fatigue      61152       65352             65664
>> gas_dyn     647864       58768 !!!         59024
>> induct      162360      180440            175312
>> linpk        18112       18848             18864
>> mdbx         53464       57652             49516
>> nf           22560       23784             24080
>> protein      74320       74440             74816
>> rnflow       66040       71488             71648
>> test_fpu     52624       58224             58320
>> tfft         18416       18456             18600
>>
>> The compile times with optzns are 26% slower than stock dragonegg
>> but 12% faster than stock gcc 4.5.4. The most interesting executable
>> size difference is gas_dyn, which is fastest with optzns but 11x larger
>> in size with stock gcc 4.5.4 compared to either stock dragonegg or
>> dragonegg with optzns. This is likely much improved in gcc 4.6 with
>> the new -fwhole-file default.
>>
>> On Thu, Jun 09, 2011 at 09:51:51AM +0200, Duncan Sands wrote:
>>> Hi Jack, thanks for these numbers.  Can you also please measure compile times?
>>> I'm thinking of enabling gcc optimizations by default, but I don't want to
>>> increase compile times, which means choosing a value for the
>>> -fplugin-arg-dragonegg-llvm-ir-optimize option that is low enough to get good
>>> compile times, yet high enough to get fast code.  It would be great if you could
>>> play around with this to find a good choice.
>>>
>>> Best wishes, Duncan.
>>>
>>>> Current dragonegg svn has all of the -fplugin-arg-dragonegg-enable-gcc-optzns bugs for
>>>> usage with -ffast-math -O3 addressed except for those related to PR2314. Using the -fno-tree-vectorize
>>>> option, we can evaluate the current state of -fplugin-arg-dragonegg-enable-gcc-optzns with
>>>> the Polyhedron 2005 benchmarks compared to stock dragonegg and stock gcc 4.5.4. The runtime
>>>> benchmarks below show that we average slightly faster than stock gcc 4.5.4 and significantly
>>>> faster than stock dragonegg through the use of -fplugin-arg-dragonegg-enable-gcc-optzns.
>>>>
>>>> x86_64 darwin
>>>>
>>>> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
>>>> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
>>>> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
>>>>
>>>> Benchmark  A) stock    B) gcc 4.5.4/     C) gcc 4.5.4/
>>>>            gcc 4.5.4   dragonegg/optzns  dragonegg
>>>>
>>>> ac            9.58        9.13             12.30
>>>> aermod       20.88       16.10             17.62
>>>> air           6.16        6.59              7.70
>>>> capacita     35.68       39.94             46.22
>>>> channel       2.03        2.04              1.96
>>>> doduc        28.28       28.43             30.41
>>>> fatigue       8.13        7.19             10.40
>>>> gas_dyn      10.10        9.83             11.73
>>>> induct       20.17       20.76             48.76
>>>> linpk        15.42       15.65             15.69
>>>> mdbx         11.42       11.73             12.07
>>>> nf           27.99       28.60             29.39
>>>> protein      38.36       39.08             39.98
>>>> rnflow       27.28       28.19             31.90
>>>> test_fpu     11.43       11.17             11.50
>>>> tfft          1.91        1.95              2.16
>>>>
>>>> Mean         12.72       12.62             14.71
>>>>
>>>> Once vector_select() is implemented we can retest without -fno-tree-vectorize.
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
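
Duncan's proposed default quoted above, LLVM optimization level = (1 + GCC optimization level) / 2, only reproduces his listed examples if the division truncates. A minimal sketch in Python, assuming integer division; the function name llvm_opt_level is made up for illustration and is not part of dragonegg:

```python
def llvm_opt_level(gcc_level: int) -> int:
    """Map a GCC -O level to an LLVM -O level using (1 + gcc_level) / 2.

    Integer (floor) division is an assumption; it is the reading that
    matches the examples given in the message (e.g. -O2 -> -O1).
    """
    if gcc_level < 0:
        raise ValueError("optimization level must be non-negative")
    return (1 + gcc_level) // 2


if __name__ == "__main__":
    # The GCC levels used as examples in the message.
    for gcc in (0, 1, 2, 3, 5):
        print(f"GCC -O{gcc} -> LLVM -O{llvm_opt_level(gcc)}")
```

With truncating division the sketch gives -O0 -> -O0, -O1 -> -O1, -O2 -> -O1, -O3 -> -O2 and -O5 -> -O3, which agrees with the list in the message.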
Jack Howarth
2011-Jun-10 00:47 UTC
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
Duncan,
   Here are the complete benchmarks rerun against gcc 4.5.4 built with...

Using built-in specs.
COLLECT_GCC=gfortran-fsf-4.5
COLLECT_LTO_WRAPPER=/sw/lib/gcc4.5/libexec/gcc/x86_64-apple-darwin11.0.0/4.5.4/lto-wrapper
Target: x86_64-apple-darwin11.0.0
Configured with: ../gcc-4.5.4/configure --prefix=/sw --prefix=/sw/lib/gcc4.5 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.5/info --enable-languages=c,c++,fortran,objc,obj-c++,java --with-gmp=/sw --with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw --with-mpc=/sw --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.5 --enable-lto --enable-checking=release
Thread model: posix
gcc version 4.5.4 20110608 (prerelease) (GCC)

x86_64 darwin

A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
D) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns -fplugin-arg-dragonegg-llvm-ir-optimize=2
E) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-llvm-ir-optimize=2

Run Time (seconds)

Benchmark  A) stock    B) gcc 4.5.4/     C) gcc 4.5.4/  D) gcc 4.5.4/      E) gcc 4.5.4/
           gcc 4.5.4   dragonegg/optzns  dragonegg      dragonegg/optzns/  dragonegg/optimize=2
                                                        optimize=2

ac            9.58        9.11             12.28           9.12              12.73
aermod       20.99       16.18             17.86          16.30              17.89
air           6.06        6.58              7.69           6.51               7.64
capacita     35.76       39.86             46.10          39.58              45.89
channel       2.03        2.04              1.96           2.04               1.96
doduc        28.16       28.50             30.34          28.53              30.42
fatigue       8.12        7.09             10.34           7.06              10.25
gas_dyn      10.16        9.92             11.67           9.96              11.81
induct       20.14       20.76             48.75          20.78              48.75
linpk        15.43       15.41             15.64          15.41              15.64
mdbx         11.41       11.72             12.11          11.72              12.07
nf           27.90       28.52             29.26          28.42              29.13
protein      38.65       38.72             41.31          38.75              39.49
rnflow       27.22       28.18             31.81          28.15              31.98
test_fpu     11.49       11.23             11.57          11.17              11.52
tfft          1.91        1.95              2.15           1.95               2.16

Mean         12.72       12.60             14.73          12.59              14.72

Compile Time (seconds)

Benchmark  A) stock    B) gcc 4.5.4/     C) gcc 4.5.4/  D) gcc 4.5.4/      E) gcc 4.5.4/
           gcc 4.5.4   dragonegg/optzns  dragonegg      dragonegg/optzns/  dragonegg/optimize=2
                                                        optimize=2

ac            0.86        0.44              0.31           0.41               0.28
aermod       31.13       25.81             20.94          25.44              20.87
air           1.74        1.48              0.81           1.46               0.78
capacita      0.86        0.74              0.44           0.71               0.42
channel       0.35        0.32              0.23           0.30               0.23
doduc         3.08        2.63              1.63           2.60               1.58
fatigue       1.04        1.05              0.89           0.90               0.70
gas_dyn       0.94        0.94              0.75           0.84               0.62
induct        3.30        2.52              1.84           2.36               1.66
linpk         0.33        0.28              0.20           0.28               0.20
mdbx          1.09        1.02              0.60           0.99               0.59
nf            0.41        0.40              0.28           0.40               0.28
protein       1.56        1.28              0.98           1.21               0.82
rnflow        1.75        1.70              1.24           1.61               1.13
test_fpu      1.38        1.41              1.05           1.31               0.95
tfft          0.31        0.28              0.19           0.28               0.19

Executable Size (bytes)

Benchmark  A) stock    B) gcc 4.5.4/     C) gcc 4.5.4/  D) gcc 4.5.4/      E) gcc 4.5.4/
           gcc 4.5.4   dragonegg/optzns  dragonegg      dragonegg/optzns/  dragonegg/optimize=2
                                                        optimize=2

ac           26344       30896             26704          30896              26824
aermod     1145924     1043816           1052056        1027680            1031880
air          57404       57700             53532          53556              53532
capacita     40864       41008             37064          41008              37064
channel      22448       22664             22664          22664              22664
doduc       127340      124108            120124         124372             120484
fatigue      61152       65352             65664          61256              61568
gas_dyn     647864       58768             59024          54672              54960
induct      162360      180440            175312         168304             163176
linpk        18112       18848             18864          18848              18896
mdbx         53464       57652             49516          57652              49516
nf           22560       23784             24080          23784              24080
protein      74320       74440             74816          70344              66624
rnflow       66040       71488             71648          67416              67616
test_fpu     52624       58224             58320          54128              54256
tfft         18416       18456             18600          18456              18600
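
The five configurations A) to E) share the same base flags and differ only in whether the dragonegg plugin is loaded and which of the two plugin options are added. A hedged sketch in Python of how those command lines could be assembled: the compiler name comes from the COLLECT_GCC line above and the flags from the A) to E) list, but the plugin path, the source file name and the output naming are placeholders, and how pb05 actually invokes the compiler is not shown in this thread.

```python
# Flags common to all five configurations (from the A) to E) list).
BASE_FLAGS = ["-msse3", "-ffast-math", "-O3", "-fno-tree-vectorize"]
GCC_OPTZNS = "-fplugin-arg-dragonegg-enable-gcc-optzns"
LLVM_IR_O2 = "-fplugin-arg-dragonegg-llvm-ir-optimize=2"

# Per configuration: (load dragonegg, enable gcc optzns, llvm-ir-optimize=2)
CONFIGS = {
    "A": (False, False, False),
    "B": (True, True, False),
    "C": (True, False, False),
    "D": (True, True, True),
    "E": (True, False, True),
}


def build_command(config, source, compiler="gfortran-fsf-4.5",
                  plugin="/path/to/dragonegg.so"):
    """Return the argv list for one configuration.

    `plugin` is a placeholder path; -fplugin= is GCC's standard option
    for loading a plugin shared object.
    """
    use_plugin, optzns, ir_o2 = CONFIGS[config]
    cmd = [compiler] + BASE_FLAGS
    if use_plugin:
        cmd.append("-fplugin=" + plugin)
    if optzns:
        cmd.append(GCC_OPTZNS)
    if ir_o2:
        cmd.append(LLVM_IR_O2)
    # Output name (source name without extension) is an assumption.
    return cmd + [source, "-o", source.rsplit(".", 1)[0]]


if __name__ == "__main__":
    for name in sorted(CONFIGS):
        print(name, " ".join(build_command(name, "ac.f90")))
```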
Jack Howarth
2011-Jun-10 14:00 UTC
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
On Thu, Jun 09, 2011 at 08:47:26PM -0400, Jack Howarth wrote:
> Duncan,
>    Here are the complete benchmarks rerun against gcc 4.5.4 built with...
>
> Compile Time (seconds)
>
> Benchmark  A) stock    B) gcc 4.5.4/     C) gcc 4.5.4/  D) gcc 4.5.4/      E) gcc 4.5.4/
>            gcc 4.5.4   dragonegg/optzns  dragonegg      dragonegg/optzns/  dragonegg/optimize=2
>                                                         optimize=2
>
> ac            0.86        0.44              0.31           0.41               0.28
> aermod       31.13       25.81             20.94          25.44              20.87
> air           1.74        1.48              0.81           1.46               0.78
> capacita      0.86        0.74              0.44           0.71               0.42
> channel       0.35        0.32              0.23           0.30               0.23
> doduc         3.08        2.63              1.63           2.60               1.58
> fatigue       1.04        1.05              0.89           0.90               0.70
> gas_dyn       0.94        0.94              0.75           0.84               0.62
> induct        3.30        2.52              1.84           2.36               1.66
> linpk         0.33        0.28              0.20           0.28               0.20
> mdbx          1.09        1.02              0.60           0.99               0.59
> nf            0.41        0.40              0.28           0.40               0.28
> protein       1.56        1.28              0.98           1.21               0.82
> rnflow        1.75        1.70              1.24           1.61               1.13
> test_fpu      1.38        1.41              1.05           1.31               0.95
> tfft          0.31        0.28              0.19           0.28               0.19

mean            3.13        2.64              2.02           2.57               1.96

Duncan,
    These numbers were from release builds for both FSF gcc 4.5.4 and llvm.
It seems that -fplugin-arg-dragonegg-llvm-ir-optimize=2 provides a small
reduction in compile time that partly offsets the increase in compile time
from -fplugin-arg-dragonegg-enable-gcc-optzns at -O3 -ffast-math. It also
appears that with -fplugin-arg-dragonegg-llvm-ir-optimize=2, the addition of
-fplugin-arg-dragonegg-enable-gcc-optzns slows compilation by 24% with
-O3 -ffast-math (which is very close to the 23% increase in compile time seen
without -fplugin-arg-dragonegg-llvm-ir-optimize=2).
    We should rebenchmark pb05 with -O2 -ffast-math to see if
-fplugin-arg-dragonegg-enable-gcc-optzns has the same impact on compile times.
IMHO, if -fplugin-arg-dragonegg-enable-gcc-optzns has less effect at -O2, it
might make sense to default -fplugin-arg-dragonegg-enable-gcc-optzns on in
dragonegg. That is, if the compile time regressions are mainly at -O3, that
would be tolerable because the run-time of the resulting binaries should be
more important there.
             Jack
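
The 24% and 23% figures in this message can be reproduced from the mean compile times if the difference is taken relative to the slower (optzns) build; taken relative to the faster build, the same data give an increase of roughly 31%. The following Python sketch shows both readings. Which one was intended is an inference from the arithmetic, not something stated in the thread, and the helper names are made up for illustration.

```python
# Mean compile times in seconds, copied from the "mean" row of the
# compile-time table (configurations A to E).
MEAN_COMPILE = {"A": 3.13, "B": 2.64, "C": 2.02, "D": 2.57, "E": 1.96}


def pct_of_slower(slow, fast):
    """Difference as a percentage of the slower time."""
    return 100.0 * (slow - fast) / slow


def pct_of_faster(slow, fast):
    """Difference as a percentage of the faster time (the usual
    meaning of 'X% increase in compile time')."""
    return 100.0 * (slow - fast) / fast


if __name__ == "__main__":
    # Effect of enabling gcc optzns with and without llvm-ir-optimize=2.
    for slow, fast in (("D", "E"), ("B", "C")):
        s, f = MEAN_COMPILE[slow], MEAN_COMPILE[fast]
        print(f"{slow} vs {fast}: "
              f"{pct_of_slower(s, f):.0f}% of the slower time, "
              f"{pct_of_faster(s, f):.0f}% increase over the faster time")
```

Under these assumptions the D versus E comparison comes out at about 24% of the slower time and B versus C at about 23%, matching the figures in the message, while both correspond to an increase of about 31% over the build without optzns.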
Duncan Sands
2011-Jun-10 14:30 UTC
[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
Hi Jack,

> Here are the complete benchmarks rerun against gcc 4.5.4 built with...

thanks for these great numbers.  It is interesting to see that dropping the LLVM
IR optimization level to 2 makes no difference to the run-times.  As a radical
experiment I just committed a patch to dragonegg (commit 132846) that disables
all heavy LLVM optimizations when the GCC optimizers are enabled.  A few small
cleanups are run on each function, but otherwise only LLVM codegen (and codegen
optimizations) are done.  I did some measurements and this results in very fast
compile times.  But how does it impact run-time?

Can you please benchmark run times with -fplugin-arg-dragonegg-enable-gcc-optzns
and this patch applied (plus don't use the -fplugin-arg-dragonegg-llvm-ir-optimize
option, since that turns on heavy LLVM IR optimizations again)?  If it has no
impact on run-times then that would suggest that LLVM's IR level optimizers are
not doing any useful optimization: GCC already got everything.  If it does have
an impact then that suggests that LLVM is picking up stuff that GCC missed.
I can't wait to see!

Thanks a lot, Duncan.