On Thu, May 23, 2013 at 03:40:00PM +0200, Duncan Sands wrote:
> Hi Jack,
>
> On 23/05/13 15:37, Jack Howarth wrote:
>> Below are the results for the Polyhedron 2005 benchmarks compiled with llvm/compiler-rt/dragonegg 3.3svn at r182439 against current
>> FSF gcc 4.7.3svn and 4.8.1svn. The only major bug remaining in the dragonegg 3.3svn support for gcc 4.8.x is http://llvm.org/bugs/show_bug.cgi?id=15980
>> which results in unresolved symbols for _iround and _iroundf in the aermod and rnflow testcases. Note that this skews the geometric mean
>> of the run time to much higher values.
>
> I also didn't hook up LLVM's new fast-math optimizations yet, which I expect to
> make a big difference.

Duncan,
   Is current dragonegg 3.3svn configured to enable llvm vectorization like clang or do we still have to
manually pass those flags with -fplugin-arg-dragonegg-llvm-option? Also can the new fast-math optimizations
be manually enabled in dragonegg via flags to -fplugin-arg-dragonegg-llvm-option as well?
          Jack

>
> Ciao, Duncan.
>
>> Jack
>>
>> Tested on x86_apple-darwin12
>>
>> Compile Flags: -ffast-math -funroll-loops -O3
>>
>> de-gfortran47: /sw/lib/gcc4.7/bin/gfortran -fplugin=/sw/lib/gcc4.7/lib/dragonegg.so -specs=/sw/lib/gcc4.7/lib/integrated-as.specs
>> de-gfortran48: /sw/lib/gcc4.8/bin/gfortran -fplugin=/sw/lib/gcc4.8/lib/dragonegg.so -specs=/sw/lib/gcc4.8/lib/integrated-as.specs
>> de-gfortran47+optzns: /sw/lib/gcc4.7/bin/gfortran -fplugin=/sw/lib/gcc4.7/lib/dragonegg.so -specs=/sw/lib/gcc4.7/lib/integrated-as.specs -fplugin-arg-dragonegg-enable-gcc-optzns
>> de-gfortran48+optzns: /sw/lib/gcc4.8/bin/gfortran -fplugin=/sw/lib/gcc4.8/lib/dragonegg.so -specs=/sw/lib/gcc4.8/lib/integrated-as.specs -fplugin-arg-dragonegg-enable-gcc-optzns
>> gfortran47: /sw/bin/gfortran-fsf-4.7
>> gfortran48: /sw/bin/gfortran-fsf-4.8
>>
>>
>> Run time (secs)
>>
>> Benchmark   de-gfortran47  de-gfortran48  de-gfortran47+optzns  de-gfortran48+optzns  gfortran47  gfortran48
>> ac                  11.39          11.39                  8.09                  8.14        8.18        8.05
>> aermod              16.35          -1.00                 14.50                 -1.00       16.45       16.23
>> air                  6.88           6.79                  5.42                  5.25        5.83        5.73
>> capacita            39.85          39.85                 34.71                 33.39       32.51       33.02
>> channel              2.05           2.03                  2.15                  1.98        1.83        1.83
>> doduc               27.10          27.24                 26.75                 26.36       25.91       25.76
>> fatigue              8.85           8.88                  7.72                  5.56        8.26        5.60
>> gas_dyn             11.76          11.45                  4.51                  4.20        3.88        3.59
>> induct              24.01          24.00                 11.86                 11.85       12.08       12.21
>> linpk               15.43          15.44                 15.40                 15.77       15.37       15.64
>> mdbx                11.92          11.92                 11.30                 11.28       11.18       11.42
>> nf                  29.57          29.82                 29.50                 29.46       27.21       27.25
>> protein             36.15          35.10                 35.93                 34.13       31.88       31.81
>> rnflow              27.02          -1.00                 26.77                 -1.00       24.67       21.21
>> test_fpu            11.49          11.34                  9.11                  9.30        7.90        8.01
>> tfft                 1.92           1.92                  1.92                  1.90        1.86        1.90
>>
>> Geom. Mean          13.19          21.26                 10.99                 17.31       10.60       10.22
>>
>> Compile time (secs)
>>
>> Benchmark   de-gfortran47  de-gfortran48  de-gfortran47+optzns  de-gfortran48+optzns  gfortran47  gfortran48
>> ac                   0.62           0.31                  2.20                  1.38        2.88        2.08
>> aermod              35.19          35.52                 43.50                 42.89       42.75       55.97
>> air                  1.16           1.17                  2.72                  2.36        4.48        4.28
>> capacita             0.52           0.55                  1.02                  0.99        1.90        1.89
>> channel              0.26           0.26                  0.47                  0.47        0.65        0.75
>> doduc                1.74           1.76                  3.78                  3.54        6.03        5.68
>> fatigue              0.91           0.91                  1.33                  1.49        1.97        2.04
>> gas_dyn              0.70           0.69                  1.40                  1.38        3.39        2.44
>> induct               1.95           1.73                  2.87                  2.98        4.08        4.42
>> linpk                0.25           0.24                  0.53                  0.71        0.92        1.25
>> mdbx                 0.66           0.67                  1.30                  1.14        2.16        1.90
>> nf                   0.39           0.39                  0.80                  0.74        2.12        1.67
>> protein              1.12           1.11                  2.01                  1.77        4.39        3.62
>> rnflow               1.26           1.26                  2.93                  2.74        6.43        5.47
>> test_fpu             0.91           0.91                  2.27                  2.22        5.28        4.26
>> tfft                 0.22           0.21                  0.39                  0.44        0.59        0.78
>>
>> Executable (bytes)
>>
>> Benchmark   de-gfortran47  de-gfortran48  de-gfortran47+optzns  de-gfortran48+optzns  gfortran47  gfortran48
>> ac                  26776          26792                 47160                 34928       59120       42784
>> aermod            1023024              0               1052728                     0     1392840     1286136
>> air                 61940          61948                 65964                 61876      110768      106680
>> capacita            41344          41144                 45440                 45040       77920       73248
>> channel             22736          22744                 26696                 22552       34704       34656
>> doduc              128376         128384                140580                136296      205320      189040
>> fatigue             65648          65640                 69808                 73848       90240       82040
>> gas_dyn             54840          54936                 63144                 71304      123680       99184
>> induct             163064         158792                163192                166920      179080      170872
>> linpk               18680          18688                 22896                 34920       42640       50936
>> mdbx                49492          49508                 57692                 53604       90232       78032
>> nf                  23880          23888                 32088                 32104       84072       67744
>> protein             74960          75048                 87144                 83128      131976      115688
>> rnflow              67704              0                 88248                     0      205584      176912
>> test_fpu            50000          50008                 70440                 78456      179464      142608
>> tfft                18568          18576                 18416                 22544       30680       34832
Hi Jack,

On 23/05/13 15:53, Jack Howarth wrote:
> On Thu, May 23, 2013 at 03:40:00PM +0200, Duncan Sands wrote:
>> Hi Jack,
>>
>> On 23/05/13 15:37, Jack Howarth wrote:
>>> Below are the results for the Polyhedron 2005 benchmarks compiled with llvm/compiler-rt/dragonegg 3.3svn at r182439 against current
>>> FSF gcc 4.7.3svn and 4.8.1svn. The only major bug remaining in the dragonegg 3.3svn support for gcc 4.8.x is http://llvm.org/bugs/show_bug.cgi?id=15980
>>> which results in unresolved symbols for _iround and _iroundf in the aermod and rnflow testcases. Note that this skews the geometric mean
>>> of the run time to much higher values.
>>
>> I also didn't hook up LLVM's new fast-math optimizations yet, which I expect to
>> make a big difference.
>
> Duncan,
>    Is current dragonegg 3.3svn configured to enable llvm vectorization like clang or do we still have to
> manually pass those flags with -fplugin-arg-dragonegg-llvm-option? Also can the new fast-math optimizations
> be manually enabled in dragonegg via flags to -fplugin-arg-dragonegg-llvm-option as well?

no, like -fast-math, the autovectorizer is not turned on. This is because I
want the testsuite to be building with no failures again before turning on
additional optimizations. There is no way to turn on the fast-math
optimizations.

Ciao, Duncan.
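As a rough check of Jack's remark that the failed aermod and rnflow builds skew the run-time geometric mean, here is a minimal Python sketch that recomputes the mean over only those benchmarks that ran in every configuration. The run times are copied from the "Run time (secs)" table in the thread. The function names (`geometric_mean`, `comparable_geomeans`) and the decision to drop a failed benchmark from all six columns are assumptions of this sketch; the thread does not say how the Polyhedron harness computes its "Geom. Mean" row.

```python
import math

COLUMNS = (
    "de-gfortran47", "de-gfortran48",
    "de-gfortran47+optzns", "de-gfortran48+optzns",
    "gfortran47", "gfortran48",
)

# Run times in seconds, copied from the table in the thread.
# -1.00 marks a run with no result (aermod and rnflow under gcc 4.8 + dragonegg).
RUN_TIMES = {
    "ac":       (11.39, 11.39,  8.09,  8.14,  8.18,  8.05),
    "aermod":   (16.35, -1.00, 14.50, -1.00, 16.45, 16.23),
    "air":      ( 6.88,  6.79,  5.42,  5.25,  5.83,  5.73),
    "capacita": (39.85, 39.85, 34.71, 33.39, 32.51, 33.02),
    "channel":  ( 2.05,  2.03,  2.15,  1.98,  1.83,  1.83),
    "doduc":    (27.10, 27.24, 26.75, 26.36, 25.91, 25.76),
    "fatigue":  ( 8.85,  8.88,  7.72,  5.56,  8.26,  5.60),
    "gas_dyn":  (11.76, 11.45,  4.51,  4.20,  3.88,  3.59),
    "induct":   (24.01, 24.00, 11.86, 11.85, 12.08, 12.21),
    "linpk":    (15.43, 15.44, 15.40, 15.77, 15.37, 15.64),
    "mdbx":     (11.92, 11.92, 11.30, 11.28, 11.18, 11.42),
    "nf":       (29.57, 29.82, 29.50, 29.46, 27.21, 27.25),
    "protein":  (36.15, 35.10, 35.93, 34.13, 31.88, 31.81),
    "rnflow":   (27.02, -1.00, 26.77, -1.00, 24.67, 21.21),
    "test_fpu": (11.49, 11.34,  9.11,  9.30,  7.90,  8.01),
    "tfft":     ( 1.92,  1.92,  1.92,  1.90,  1.86,  1.90),
}


def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed via logs."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean is undefined for non-positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def comparable_geomeans(table, columns):
    """Per-column geometric mean over benchmarks that ran in every column."""
    ok_rows = [row for row in table.values() if all(t > 0 for t in row)]
    return {
        col: geometric_mean(row[i] for row in ok_rows)
        for i, col in enumerate(columns)
    }


if __name__ == "__main__":
    for col, mean in comparable_geomeans(RUN_TIMES, COLUMNS).items():
        print(f"{col:22s} {mean:6.2f}")
```

Dropping a failed benchmark from every column, rather than only from the column where it failed, keeps all six means defined over the same 14 benchmarks, so the gcc 4.7 and gcc 4.8 columns can be compared directly.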