Below are the results for the Polyhedron 2005 benchmarks compiled with
llvm/compiler-rt/dragonegg 3.3svn at r182439 against current FSF gcc 4.7.3svn
and 4.8.1svn. The only major bug remaining in the dragonegg 3.3svn support for
gcc 4.8.x is http://llvm.org/bugs/show_bug.cgi?id=15980, which results in
unresolved symbols for _iround and _iroundf in the aermod and rnflow testcases.
Note that this skews the geometric mean of the run time to much higher values.
          Jack

Tested on x86_apple-darwin12

Compile Flags: -ffast-math -funroll-loops -O3

de-gfortran47: /sw/lib/gcc4.7/bin/gfortran -fplugin=/sw/lib/gcc4.7/lib/dragonegg.so -specs=/sw/lib/gcc4.7/lib/integrated-as.specs
de-gfortran48: /sw/lib/gcc4.8/bin/gfortran -fplugin=/sw/lib/gcc4.8/lib/dragonegg.so -specs=/sw/lib/gcc4.8/lib/integrated-as.specs
de-gfortran47+optzns: /sw/lib/gcc4.7/bin/gfortran -fplugin=/sw/lib/gcc4.7/lib/dragonegg.so -specs=/sw/lib/gcc4.7/lib/integrated-as.specs -fplugin-arg-dragonegg-enable-gcc-optzns
de-gfortran48+optzns: /sw/lib/gcc4.8/bin/gfortran -fplugin=/sw/lib/gcc4.8/lib/dragonegg.so -specs=/sw/lib/gcc4.8/lib/integrated-as.specs -fplugin-arg-dragonegg-enable-gcc-optzns
gfortran47: /sw/bin/gfortran-fsf-4.7
gfortran48: /sw/bin/gfortran-fsf-4.8

Run time (secs)

Benchmark   de-gfortran47  de-gfortran48  de-gfortran47+optzns  de-gfortran48+optzns  gfortran47  gfortran48
ac                  11.39          11.39                  8.09                  8.14        8.18        8.05
aermod              16.35          -1.00                 14.50                 -1.00       16.45       16.23
air                  6.88           6.79                  5.42                  5.25        5.83        5.73
capacita            39.85          39.85                 34.71                 33.39       32.51       33.02
channel              2.05           2.03                  2.15                  1.98        1.83        1.83
doduc               27.10          27.24                 26.75                 26.36       25.91       25.76
fatigue              8.85           8.88                  7.72                  5.56        8.26        5.60
gas_dyn             11.76          11.45                  4.51                  4.20        3.88        3.59
induct              24.01          24.00                 11.86                 11.85       12.08       12.21
linpk               15.43          15.44                 15.40                 15.77       15.37       15.64
mdbx                11.92          11.92                 11.30                 11.28       11.18       11.42
nf                  29.57          29.82                 29.50                 29.46       27.21       27.25
protein             36.15          35.10                 35.93                 34.13       31.88       31.81
rnflow              27.02          -1.00                 26.77                 -1.00       24.67       21.21
test_fpu            11.49          11.34                  9.11                  9.30        7.90        8.01
tfft                 1.92           1.92                  1.92                  1.90        1.86        1.90

Geom. Mean          13.19          21.26                 10.99                 17.31       10.60       10.22

Compile time (secs)

Benchmark   de-gfortran47  de-gfortran48  de-gfortran47+optzns  de-gfortran48+optzns  gfortran47  gfortran48
ac                   0.62           0.31                  2.20                  1.38        2.88        2.08
aermod              35.19          35.52                 43.50                 42.89       42.75       55.97
air                  1.16           1.17                  2.72                  2.36        4.48        4.28
capacita             0.52           0.55                  1.02                  0.99        1.90        1.89
channel              0.26           0.26                  0.47                  0.47        0.65        0.75
doduc                1.74           1.76                  3.78                  3.54        6.03        5.68
fatigue              0.91           0.91                  1.33                  1.49        1.97        2.04
gas_dyn              0.70           0.69                  1.40                  1.38        3.39        2.44
induct               1.95           1.73                  2.87                  2.98        4.08        4.42
linpk                0.25           0.24                  0.53                  0.71        0.92        1.25
mdbx                 0.66           0.67                  1.30                  1.14        2.16        1.90
nf                   0.39           0.39                  0.80                  0.74        2.12        1.67
protein              1.12           1.11                  2.01                  1.77        4.39        3.62
rnflow               1.26           1.26                  2.93                  2.74        6.43        5.47
test_fpu             0.91           0.91                  2.27                  2.22        5.28        4.26
tfft                 0.22           0.21                  0.39                  0.44        0.59        0.78

Executable (bytes)

Benchmark   de-gfortran47  de-gfortran48  de-gfortran47+optzns  de-gfortran48+optzns  gfortran47  gfortran48
ac                  26776          26792                 47160                 34928       59120       42784
aermod            1023024              0               1052728                     0     1392840     1286136
air                 61940          61948                 65964                 61876      110768      106680
capacita            41344          41144                 45440                 45040       77920       73248
channel             22736          22744                 26696                 22552       34704       34656
doduc              128376         128384                140580                136296      205320      189040
fatigue             65648          65640                 69808                 73848       90240       82040
gas_dyn             54840          54936                 63144                 71304      123680       99184
induct             163064         158792                163192                166920      179080      170872
linpk               18680          18688                 22896                 34920       42640       50936
mdbx                49492          49508                 57692                 53604       90232       78032
nf                  23880          23888                 32088                 32104       84072       67744
protein             74960          75048                 87144                 83128      131976      115688
rnflow              67704              0                 88248                     0      205584      176912
test_fpu            50000          50008                 70440                 78456      179464      142608
tfft                18568          18576                 18416                 22544       30680       34832
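The "Geom. Mean" row of the run-time table can be reproduced from the per-benchmark times. The following is a minimal Python sketch using the de-gfortran47 and de-gfortran48 run-time columns from the first message. Treating -1.00 as a failed-run marker and dropping those entries is an assumption made here for illustration; it is not necessarily how the reported 21.26 was produced, and a mean over 14 benchmarks is not directly comparable to one over all 16.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Run times (secs), in benchmark order ac ... tfft, copied from the table.
de_gfortran47 = [11.39, 16.35, 6.88, 39.85, 2.05, 27.10, 8.85, 11.76,
                 24.01, 15.43, 11.92, 29.57, 36.15, 27.02, 11.49, 1.92]

# Same order; -1.00 is assumed to mark the aermod and rnflow link failures.
de_gfortran48 = [11.39, -1.00, 6.79, 39.85, 2.03, 27.24, 8.88, 11.45,
                 24.00, 15.44, 11.92, 29.82, 35.10, -1.00, 11.34, 1.92]

gm47 = geometric_mean(de_gfortran47)

# Hypothetical handling of failures: keep only benchmarks that actually ran.
gm48_ran = geometric_mean([t for t in de_gfortran48 if t > 0])

print(f"de-gfortran47, all 16 benchmarks: {gm47:.2f}")
print(f"de-gfortran48, 14 benchmarks that ran: {gm48_ran:.2f}")
```

The first figure should land close to the 13.19 reported for de-gfortran47, which suggests the table's mean is taken over all 16 benchmarks.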
Duncan,
   With r182593, the dragonegg 3.3 branch now completely passes the Polyhedron 2005 benchmarks
using the FSF gcc 4.8.1svn compiler. Thanks.
          Jack

Tested on x86_apple-darwin12

Compile Flags: -ffast-math -funroll-loops -O3

de-gfortran47: /sw/lib/gcc4.7/bin/gfortran -fplugin=/sw/lib/gcc4.7/lib/dragonegg.so -specs=/sw/lib/gcc4.7/lib/integrated-as.specs
de-gfortran48: /sw/lib/gcc4.8/bin/gfortran -fplugin=/sw/lib/gcc4.8/lib/dragonegg.so -specs=/sw/lib/gcc4.8/lib/integrated-as.specs
de-gfortran47+optzns: /sw/lib/gcc4.7/bin/gfortran -fplugin=/sw/lib/gcc4.7/lib/dragonegg.so -specs=/sw/lib/gcc4.7/lib/integrated-as.specs -fplugin-arg-dragonegg-enable-gcc-optzns
de-gfortran48+optzns: /sw/lib/gcc4.8/bin/gfortran -fplugin=/sw/lib/gcc4.8/lib/dragonegg.so -specs=/sw/lib/gcc4.8/lib/integrated-as.specs -fplugin-arg-dragonegg-enable-gcc-optzns
gfortran47: /sw/bin/gfortran-fsf-4.7
gfortran48: /sw/bin/gfortran-fsf-4.8

Run time (secs)

Benchmark   de-gfortran47  de-gfortran48  de-gfortran47+optzns  de-gfortran48+optzns  gfortran47  gfortran48
ac                  11.39          11.39                  8.09                  8.14        8.18        8.05
aermod              16.35          16.00                 14.50                 15.28       16.45       16.23
air                  6.88           6.77                  5.42                  5.28        5.83        5.73
capacita            39.85          39.83                 34.71                 33.47       32.51       33.02
channel              2.05           2.05                  2.15                  1.99        1.83        1.83
doduc               27.10          27.37                 26.75                 26.31       25.91       25.76
fatigue              8.85           8.81                  7.72                  5.60        8.26        5.60
gas_dyn             11.76          11.50                  4.51                  4.21        3.88        3.59
induct              24.01          24.04                 11.86                 11.85       12.08       12.21
linpk               15.43          15.48                 15.40                 15.83       15.37       15.64
mdbx                11.92          11.91                 11.30                 11.27       11.18       11.42
nf                  29.57          30.04                 29.50                 29.59       27.21       27.25
protein             36.15          35.21                 35.93                 34.16       31.88       31.81
rnflow              27.02          25.92                 26.77                 22.20       24.67       21.21
test_fpu            11.49          11.47                  9.11                  9.30        7.90        8.01
tfft                 1.92           1.92                  1.92                  1.89        1.86        1.90

Geom. Mean          13.19          13.10                 10.99                 10.52       10.60       10.22

Compile time (secs)

Benchmark   de-gfortran47  de-gfortran48  de-gfortran47+optzns  de-gfortran48+optzns  gfortran47  gfortran48
ac                   0.62           0.29                  2.20                  0.71        2.88        2.08
aermod              35.19          20.44                 43.50                 42.90       42.75       55.97
air                  1.16           1.11                  2.72                  2.40        4.48        4.28
capacita             0.52           0.52                  1.02                  1.04        1.90        1.89
channel              0.26           0.23                  0.47                  0.50        0.65        0.75
doduc                1.74           1.74                  3.78                  3.53        6.03        5.68
fatigue              0.91           0.87                  1.33                  1.49        1.97        2.04
gas_dyn              0.70           0.63                  1.40                  1.39        3.39        2.44
induct               1.95           1.77                  2.87                  2.99        4.08        4.42
linpk                0.25           0.21                  0.53                  0.72        0.92        1.25
mdbx                 0.66           0.61                  1.30                  1.24        2.16        1.90
nf                   0.39           0.35                  0.80                  0.74        2.12        1.67
protein              1.12           1.03                  2.01                  1.79        4.39        3.62
rnflow               1.26           1.19                  2.93                  2.72        6.43        5.47
test_fpu             0.91           0.85                  2.27                  2.22        5.28        4.26
tfft                 0.22           0.18                  0.39                  0.46        0.59        0.78

Executable (bytes)

Benchmark   de-gfortran47  de-gfortran48  de-gfortran47+optzns  de-gfortran48+optzns  gfortran47  gfortran48
ac                  26776          26792                 47160                 34928       59120       42784
aermod            1023024        1023064               1052728               1031576     1392840     1286136
air                 61940          61948                 65964                 61876      110768      106680
capacita            41344          41144                 45440                 45040       77920       73248
channel             22736          22744                 26696                 22552       34704       34656
doduc              128376         128384                140580                136296      205320      189040
fatigue             65648          65640                 69808                 73848       90240       82040
gas_dyn             54840          54936                 63144                 71304      123680       99184
induct             163064         158792                163192                166920      179080      170872
linpk               18680          18688                 22896                 34920       42640       50936
mdbx                49492          49508                 57692                 53604       90232       78032
nf                  23880          23888                 32088                 32104       84072       67744
protein             74960          75048                 87144                 83128      131976      115688
rnflow              67704          67712                 88248                 96152      205584      176912
test_fpu            50000          50008                 70440                 78456      179464      142608
tfft                18568          18576                 18416                 22544       30680       34832
Hi Jack, do the results improve significantly with the attached patch applied?
It enables IR level fast math optimizations and the loop vectorizer. Note that
some loop vectorizations only kick in if fast-math is enabled too.

Best wishes, Duncan.

On 24/05/13 01:37, Jack Howarth wrote:
> Duncan,
>     With r182593, the dragonegg 3.3 branch now completely passes the Polyhedron 2005 benchmarks
> using the FSF gcc 4.8.1svn compiler. Thanks.
>            Jack

-------------- next part --------------
A non-text attachment was scrubbed...
Name: fm.diff
Type: text/x-patch
Size: 2814 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130524/a874625f/attachment.bin>
On Wed, May 29, 2013 at 03:25:30PM +0200, Duncan Sands wrote:
> Hi Jack, I pulled the loop vectorizer and fast math changes into the 3.3 branch,
> so hopefully they will be part of 3.3 rc3 (and 3.3 final!). It would be great
> if you could redo the benchmarks rc3.
>

Duncan,
   As requested, appended are the updated Polyhedron 2005 benchmark results with both RC1 and RC3 llvm 3.3 testing.
There is a small improvement in the dragonegg results (without -fplugin-arg-dragonegg-enable-gcc-optzns) in RC3. I assume
we still only have partial coverage of all of the -ffast-math optimizations performed by FSF gcc in llvm's fast-math
support, correct?
          Jack

Tested on x86_apple-darwin12

Compile Flags: -ffast-math -funroll-loops -O3

de-gfc47: /sw/lib/gcc4.7/bin/gfortran -fplugin=/sw/lib/gcc4.7/lib/dragonegg.so -specs=/sw/lib/gcc4.7/lib/integrated-as.specs
de-gfc48: /sw/lib/gcc4.8/bin/gfortran -fplugin=/sw/lib/gcc4.8/lib/dragonegg.so -specs=/sw/lib/gcc4.8/lib/integrated-as.specs
de-gfc47+optzns: /sw/lib/gcc4.7/bin/gfortran -fplugin=/sw/lib/gcc4.7/lib/dragonegg.so -specs=/sw/lib/gcc4.7/lib/integrated-as.specs -fplugin-arg-dragonegg-enable-gcc-optzns
de-gfc48+optzns: /sw/lib/gcc4.8/bin/gfortran -fplugin=/sw/lib/gcc4.8/lib/dragonegg.so -specs=/sw/lib/gcc4.8/lib/integrated-as.specs -fplugin-arg-dragonegg-enable-gcc-optzns
gfortran47: /sw/bin/gfortran-fsf-4.7
gfortran48: /sw/bin/gfortran-fsf-4.8

Run time (secs)

Benchmark   de-gfc47  de-gfc47  de-gfc48  de-gfc48  de-gfc47  de-gfc47  de-gfc48  de-gfc48  gfortran47  gfortran48
                                                     +optzns   +optzns   +optzns   +optzns
                 RC1       RC3       RC1       RC3       RC1       RC3       RC1       RC3
ac             11.39     11.66     11.39     11.58      8.09      8.07      8.14      8.14        8.18        8.05
aermod         16.35     16.47     16.00     16.44     14.50     14.61     15.28     14.43       16.45       16.23
air             6.88      6.87      6.77      6.77      5.42      5.42      5.28      5.27        5.83        5.73
capacita       39.85     37.80     39.83     37.86     34.71     34.81     33.47     33.53       32.51       33.02
channel         2.05      2.06      2.05      2.06      2.15      2.15      1.99      1.99        1.83        1.83
doduc          27.10     27.43     27.37     27.39     26.75     27.03     26.31     26.24       25.91       25.76
fatigue         8.85      8.84      8.81      8.88      7.72      7.75      5.60      5.42        8.26        5.60
gas_dyn        11.76      8.25     11.50      7.94      4.51      4.52      4.21      4.20        3.88        3.59
induct         24.01     24.45     24.04     24.04     11.86     11.90     11.85     11.85       12.08       12.21
linpk          15.43     15.48     15.48     15.49     15.40     15.47     15.83     15.81       15.37       15.64
mdbx           11.92     12.14     11.91     12.15     11.30     11.29     11.27     11.27       11.18       11.42
nf             29.57     30.08     30.04     30.11     29.50     29.82     29.59     29.86       27.21       27.25
protein        36.15     36.15     35.21     35.17     35.93     36.02     34.16     34.06       31.88       31.81
rnflow         27.02     27.08     25.92     26.12     26.77     26.83     22.20     22.21       24.67       21.21
test_fpu       11.49     11.55     11.47     11.52      9.11      9.11      9.30      9.30        7.90        8.01
tfft            1.92      1.94      1.92      1.92      1.92      1.92      1.89      1.90        1.86        1.90

Geom. Mean     13.19     12.95     13.10     12.83     10.99     11.02     10.52     10.47       10.60       10.22

Compile time (secs)

Benchmark   de-gfc47  de-gfc47  de-gfc48  de-gfc48  de-gfc47  de-gfc47  de-gfc48  de-gfc48  gfortran47  gfortran48
                                                     +optzns   +optzns   +optzns   +optzns
                 RC1       RC3       RC1       RC3       RC1       RC3       RC1       RC3
ac              0.62      1.63      0.29      0.93      2.20      1.02      0.71      0.73        2.88        2.08
aermod         35.19     35.57     20.44     35.86     43.50     43.39     42.90     43.08       42.75       55.97
air             1.16      1.23      1.11      1.26      2.72      2.68      2.40      2.35        4.48        4.28
capacita        0.52      0.60      0.52      0.62      1.02      0.94      1.04      0.96        1.90        1.89
channel         0.26      0.28      0.23      0.30      0.47      0.45      0.50      0.47        0.65        0.75
doduc           1.74      1.89      1.74      1.91      3.78      3.71      3.53      3.55        6.03        5.68
fatigue         0.91      0.91      0.87      0.91      1.33      1.30      1.49      1.49        1.97        2.04
gas_dyn         0.70      0.87      0.63      0.88      1.40      1.37      1.39      1.39        3.39        2.44
induct          1.95      1.83      1.77      1.83      2.87      2.81      2.99      3.02        4.08        4.42
linpk           0.25      0.32      0.21      0.32      0.53      0.52      0.72      0.73        0.92        1.25
mdbx            0.66      0.73      0.61      0.75      1.30      1.26      1.24      1.15        2.16        1.90
nf              0.39      0.55      0.35      0.55      0.80      0.80      0.74      0.74        2.12        1.67
protein         1.12      1.18      1.03      1.20      2.01      1.99      1.79      1.77        4.39        3.62
rnflow          1.26      1.55      1.19      1.55      2.93      2.84      2.72      2.73        6.43        5.47
test_fpu        0.91      1.12      0.85      1.13      2.27      5.06      2.22      2.23        5.28        4.26
tfft            0.22      0.24      0.18      0.22      0.39      0.40      0.46      0.46        0.59        0.78

Executable (bytes)

Benchmark   de-gfc47  de-gfc47  de-gfc48  de-gfc48  de-gfc47  de-gfc47  de-gfc48  de-gfc48  gfortran47  gfortran48
                                                     +optzns   +optzns   +optzns   +optzns
                 RC1       RC3       RC1       RC3       RC1       RC3       RC1       RC3
ac             26776     30896     26792     30912     47160     47160     34928     34928       59120       42784
aermod       1023024   1035312   1023064   1031248   1052728   1052728   1031576   1031568     1392840     1286136
air            61940     61940     61948     61948     65964     65964     61876     61876      110768      106680
capaci         41344     45440     41144     41144     45440     45440     45040     45040       77920       73248
channe         22736     22600     22744     22608     26696     22600     22552     22552       34704       34656
doduc         128376    120188    128384    120196    140580    140580    136296    136296      205320      189040
fatigu         65648     69744     65640     69736     69808     69808     73848     73848       90240       82040
gas_dy         54840     58936     54936     59032     63144     63144     71304     71304      123680       99184
induct        163064    163064    158792    162888    163192    167288    166920    171024      179080      170872
linpk          18680     22896     18688     22904     22896     22896     34920     34920       42640       50936
mdbx           49492     57684     49508     57700     57692     57692     53604     53604       90232       78032
nf             23880     32080     23888     27984     32088     32088     32104     32104       84072       67744
protei         74960     79056     75048     79144     87144     87144     83128     83128      131976      115688
rnflow         67704     79992     67712     80000     88248     88248     96152     96152      205584      176912
test_f         50000     62296     50008     62304     70440     70440     78456     78456      179464      142608
tfft           18568     18568     18576     18576     18416     18416     22544     22544       30680       34832
Hi Jack,

On 29/05/13 22:04, Jack Howarth wrote:
> On Wed, May 29, 2013 at 03:25:30PM +0200, Duncan Sands wrote:
>> Hi Jack, I pulled the loop vectorizer and fast math changes into the 3.3 branch,
>> so hopefully they will be part of 3.3 rc3 (and 3.3 final!). It would be great
>> if you could redo the benchmarks rc3.
>>
>
> Duncan,
>     As requested, appended are the updated Polyhedron 2005 benchmark results with both RC1 and RC3 llvm 3.3 testing.

Thanks for doing this. As rc3 hasn't been tagged yet, I assume you used latest 3.3svn?

> There is a small improvement in the dragonegg results (without -fplugin-arg-dragonegg-enable-gcc-optzns) in RC3. I assume
> we still only have partial coverage of all of the -ffast-math optimizations performed by FSF gcc in llvm's fast-math
> support, correct?

These results are very disappointing. I was hoping to see a big improvement
somewhere, instead of no real improvement anywhere (except for gas_dyn) or a
regression (e.g. mdbx). I think LLVM now has a reasonable array of fast-math
optimizations. I will try to find time to poke at gas_dyn and induct: since
turning on gcc's optimizations there halves the run-time, LLVM's IR optimizers
are clearly missing something important.

Ciao, Duncan.
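Duncan's remarks about gas_dyn, induct and mdbx can be checked against Jack's RC1/RC3 run-time table. The following is a minimal Python sketch using only the gcc 4.7 columns (de-gfc47 and de-gfc47+optzns); the variable and function names are illustrative, not part of any benchmark tooling.

```python
# Run times (secs) copied from the RC3 columns of the run-time table.
llvm_only = {"gas_dyn": 8.25, "induct": 24.45}          # de-gfc47, RC3
with_gcc_optzns = {"gas_dyn": 4.52, "induct": 11.90}    # de-gfc47+optzns, RC3


def slowdown(slow, fast):
    """How many times longer the slower configuration runs."""
    return slow / fast


ratios = {name: slowdown(llvm_only[name], with_gcc_optzns[name])
          for name in llvm_only}

# mdbx under de-gfc47: RC1 took 11.92 s, RC3 took 12.14 s.
mdbx_change_pct = (12.14 - 11.92) / 11.92 * 100

for name, r in ratios.items():
    print(f"{name}: {r:.2f}x the run time without gcc's optimizers")
print(f"mdbx, RC1 -> RC3: {mdbx_change_pct:+.1f}%")
```

On these numbers induct takes roughly twice as long without gcc's optimizers and gas_dyn a little under twice as long, while the mdbx regression from RC1 to RC3 is on the order of a couple of percent.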