Douglas Gregor
2012-Jun-15 16:16 UTC
[LLVMdev] [cfe-dev] C++ Expression Template Benchmarks for GCC/Clang/Intel/PGI/MSVC
On Jun 14, 2012, at 3:54 PM, Walter Landry wrote:

> Hello Everyone,
>
> I thought you might be interested in some C++ expression template
> benchmarks I have done.
>
> http://www.wlandry.net/Projects/FTensor#Benchmarks
>
> Clang's performance was mixed. It optimized the expression template
> code just as well as the code that unrolled the expressions by hand,
> but that may be because it only did a mediocre job of optimizing the
> unrolled versions. GCC had similar performance issues until I used
> -Ofast. I could not find a similar option for Clang, partly because I
> could not find a complete list of Clang compiler options. You can see
> a list of all of the compiler options that I used at

-Ofast enables unsafe optimizations that can change the results produced by floating-point operations, so it doesn't make sense to compare the code generated by one compiler using -Ofast (which gets to break the rules of floating-point math) against the code generated by another compiler that hasn't been allowed to break those rules. It's very possible that -Ofast doesn't even make sense for your library, unless you don't care about the accuracy of your results.

IIRC, Clang doesn't actually do anything with -ffast-math, either. So, an apples-to-apples comparison would not use -Ofast or -ffast-math for either. Of course, it's completely fair criticism to say that, for people who don't require exact FP math, -Ofast gives a very nice performance boost in GCC that Clang can't match.

> http://www.wlandry.net/Projects/FTensor/compilers_2012.html
>
> I used clang 3.0. I also tried the 3.1 binary. The difference in
> performance was, on the whole, not significant.

CC'ing llvm-dev, because code generation and optimization is handled by the LLVM core.

- Doug
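To make the remark about "breaking the rules of floating-point math" concrete, here is a minimal sketch of why reassociating additions can change a result. The function names are hypothetical and chosen only for illustration, and the sketch assumes IEEE 754 double arithmetic compiled without any fast-math option. It is not taken from the FTensor benchmarks.

```cpp
#include <cstdio>

// Hypothetical helper names, for illustration only.

// Sums three values strictly left to right: (a + b) + c.
double sum_left_to_right(double a, double b, double c) {
    return (a + b) + c;
}

// Sums the same three values with the other grouping: a + (b + c).
double sum_right_to_left(double a, double b, double c) {
    return a + (b + c);
}

// Prints both groupings for one input where they disagree.
void print_reassociation_demo() {
    const double a = 1e20, b = -1e20, c = 1.0;
    std::printf("(a + b) + c = %g\n", sum_left_to_right(a, b, c));
    std::printf("a + (b + c) = %g\n", sum_right_to_left(a, b, c));
}
```

With strict IEEE semantics the first grouping yields 1 and the second yields 0, because 1.0 is lost when it is added to -1e20 first. A mode such as -ffast-math permits the compiler to treat the two groupings as interchangeable, which is why results can differ between a build that uses it and one that does not.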
Owen Anderson
2012-Jun-15 17:07 UTC
[LLVMdev] [cfe-dev] C++ Expression Template Benchmarks for GCC/Clang/Intel/PGI/MSVC
On Jun 15, 2012, at 9:16 AM, Douglas Gregor <dgregor at apple.com> wrote:

> IIRC, Clang doesn't actually do anything with -ffast-math, either. So, an apples-to-apples comparison would not use -Ofast or -ffast-math for either. Of course, it's completely fair criticism to say that, for people who don't require exact FP math, -Ofast gives a very nice performance boost in GCC that Clang can't match.

The code generator does understand the -ffast-math flag, and there are a small number of peephole optimizations that make use of it. However, it's not very comprehensive.

--Owen
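As an illustration of the kind of local rewrite that relaxed floating-point modes are generally allowed to make, the sketch below compares a division with a multiplication by the reciprocal. This is an assumed, generic example with hypothetical function names. It is not a statement about which peepholes the LLVM code generator actually implemented at the time.

```cpp
// Hypothetical helper names, for illustration only.

// Exact form: a single correctly rounded division.
double divide(double x, double d) {
    return x / d;
}

// Rewritten form: rounds 1.0 / d first, then rounds the product again.
double multiply_by_reciprocal(double x, double d) {
    return x * (1.0 / d);
}

// Reports whether the two forms disagree for the given inputs.
bool reciprocal_changes_result(double x, double d) {
    return divide(x, d) != multiply_by_reciprocal(x, d);
}
```

Under strict IEEE 754 double arithmetic, reciprocal_changes_result(5.0, 3.0) is true because the reciprocal of 3 is not exactly representable, while reciprocal_changes_result(4.0, 2.0) is false because 0.5 is exact. A strict compiler may therefore only make this substitution when the reciprocal is exact.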
John McCall
2012-Jun-15 17:15 UTC
[LLVMdev] [cfe-dev] C++ Expression Template Benchmarks for GCC/Clang/Intel/PGI/MSVC
On Jun 15, 2012, at 9:16 AM, Douglas Gregor wrote:

> On Jun 14, 2012, at 3:54 PM, Walter Landry wrote:
>> Hello Everyone,
>>
>> I thought you might be interested in some C++ expression template
>> benchmarks I have done.
>>
>> http://www.wlandry.net/Projects/FTensor#Benchmarks
>>
>> Clang's performance was mixed. It optimized the expression template
>> code just as well as the code that unrolled the expressions by hand,
>> but that may be because it only did a mediocre job of optimizing the
>> unrolled versions. GCC had similar performance issues until I used
>> -Ofast. I could not find a similar option for Clang, partly because I
>> could not find a complete list of Clang compiler options. You can see
>> a list of all of the compiler options that I used at
>
> -Ofast enables unsafe optimizations that can change the results produced by floating-point operations, so it doesn't make sense to compare the code generated by one compiler using -Ofast (which gets to break the rules of floating-point math) against the code generated by another compiler that hasn't been allowed to break those rules. It's very possible that -Ofast doesn't even make sense for your library, unless you don't care about the accuracy of your results.
>
> IIRC, Clang doesn't actually do anything with -ffast-math, either. So, an apples-to-apples comparison would not use -Ofast or -ffast-math for either. Of course, it's completely fair criticism to say that, for people who don't require exact FP math, -Ofast gives a very nice performance boost in GCC that Clang can't match.

IIRC, fast-math does get propagated to LLVM, which does honor it.

John.
Douglas Gregor
2012-Jun-15 17:23 UTC
[LLVMdev] [cfe-dev] C++ Expression Template Benchmarks for GCC/Clang/Intel/PGI/MSVC
On Jun 15, 2012, at 10:15 AM, John McCall wrote:

> On Jun 15, 2012, at 9:16 AM, Douglas Gregor wrote:
>> On Jun 14, 2012, at 3:54 PM, Walter Landry wrote:
>>> Hello Everyone,
>>>
>>> I thought you might be interested in some C++ expression template
>>> benchmarks I have done.
>>>
>>> http://www.wlandry.net/Projects/FTensor#Benchmarks
>>>
>>> Clang's performance was mixed. It optimized the expression template
>>> code just as well as the code that unrolled the expressions by hand,
>>> but that may be because it only did a mediocre job of optimizing the
>>> unrolled versions. GCC had similar performance issues until I used
>>> -Ofast. I could not find a similar option for Clang, partly because I
>>> could not find a complete list of Clang compiler options. You can see
>>> a list of all of the compiler options that I used at
>>
>> -Ofast enables unsafe optimizations that can change the results produced by floating-point operations, so it doesn't make sense to compare the code generated by one compiler using -Ofast (which gets to break the rules of floating-point math) against the code generated by another compiler that hasn't been allowed to break those rules. It's very possible that -Ofast doesn't even make sense for your library, unless you don't care about the accuracy of your results.
>>
>> IIRC, Clang doesn't actually do anything with -ffast-math, either. So, an apples-to-apples comparison would not use -Ofast or -ffast-math for either. Of course, it's completely fair criticism to say that, for people who don't require exact FP math, -Ofast gives a very nice performance boost in GCC that Clang can't match.
>
> IIRC, fast-math does get propagated to LLVM, which does honor it.

You're right; I see it now.

- Doug