thr3ads.net - llvm dev - [LLVMdev] runtime performance benchmarking tools for clang [Oct 2013]


Jyoti

2013-Oct-03 08:22 UTC

[LLVMdev] runtime performance benchmarking tools for clang

Hi All,
Could anyone point me to some good benchmarking tools to measure the
runtime performance of clang-compiled C++ applications?

Thanks !
- Jyoti

Kun Ling

2013-Oct-03 12:42 UTC


[LLVMdev] runtime performance benchmarking tools for clang

Hi Jyoti,

   The best benchmark is your own application. Since Clang and LLVM have
plenty of aggressive optimizations (some of which may be bug-prone), the
right choice also depends on how you want to improve performance.

   The following are some benchmarks that you could use to evaluate the
performance of Clang.

   1. Phoronix has run performance tests using its Phoronix Test Suite
(http://www.phoronix-test-suite.com/ ), which includes plenty of commonly
used applications. The full list of applications in the Phoronix suite
can be found here: http://openbenchmarking.org/suites/pts

   2. For industry-standard performance comparison, SPEC CPU is also a good
choice. You can find out more here: http://www.spec.org/cpu/ .
General-purpose CPU vendors use it to show performance improvements.

   3. There are also some smaller benchmarks that can test compiler
performance, like PolyBench (
http://www.cse.ohio-state.edu/~pouchet/software/polybench/ ), which focuses
on evaluating the loop transformations of the compiler.

Regards,
Kun Ling

On Thu, Oct 3, 2013 at 4:22 PM, Jyoti <jyoti.yalamanchili at gmail.com>
wrote:
> Hi All,
> Could anyone point me to some good benchmarking tools to measure the
> runtime performance of clang compiled C++ applications.
>
> Thanks !
> - Jyoti

-- 
http://www.lingcc.com

Jyoti

2013-Dec-11 14:53 UTC


[LLVMdev] runtime performance benchmarking tools for clang

Hi Kun Ling & Bergstrom,
Thanks a lot for your earlier responses. We used the benchmarks in the LLVM
test-suite to compare the execution time of code compiled by clang and gcc.
It appears that clang is slower than gcc in cases that involve floating
point operations or recursive calls (note that pic/pie was enabled for both
gcc and clang).

1) For the lag in execution time due to recursive calls, it was obvious that
resolving dynamic relocations via .plt indirections added to the delay.
However, it was not clear how the gcc-built executable achieved this in less
time than the clang-built one, when the same libc.so.6 & ld-linux.so.3 were
used by both executables.
What could be the possible reason?

2) For the lag in execution time due to floating point operations, we
clearly observed that gcc used the floating point instruction FSQRT, whereas
clang seemed to use an emulated function (?), BL SQRT.
Note that we used the following flags for both the clang and the gcc
compilation.

-march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -mtune=cortex-a8
In fact, I was surprised to see that even when "-march=armv7-a
-mfloat-abi=hard -mfpu=vfpv3-d16 -mtune=cortex-a8"
was used, the generated code did not use the hardware vsqrt instruction;
instead there was a bl sqrt instruction.
Could someone point out why vsqrt was not emitted in the assembly even though
the softfp or 'hard' float-abi was specified?


3) Could you suggest other benchmarks specifically for floating point other
than those in llvm testsuite ?




On Thu, Oct 3, 2013 at 6:43 PM, "C. Bergström" <cbergstrom at pathscale.com> wrote:
> On 10/ 3/13 07:42 PM, Kun Ling wrote:
>
>> Hi Jyoti,
>>
>>    The best benchmark is your application, and since Clang & LLVM have
>> plenty of aggressive optimizations ( some of them may be bug-prone), it
>> also depends on how do you want to improve the performance.
>>
>>    The following is some benchmarks that you could use to evaluate
>> performance of clang.
>>
>>    1. Phoronix have done some performance test using its Phoronix Test
>> Benchmarks (http://www.phoronix-test-suite.com/ ), it includes plenty of
>> commonly used applications. The full list of applications in Phoronix
>> benchmark  could be found here: http://openbenchmarking.org/suites/pts
>>
> Hi LK
>
> -1 :P Have you looked at their testsuite and how it's setup? It gives
> little regard for switching out and tracking the performance of compiler
> flag changes.
>
>
>>    2. For industry standard performance comparison, SPEC CPU is also a
>> good choice. You could find out more here: http://www.spec.org/cpu/ .
>> General Purpose CPU vendors use it to show performance improvements.
>>
> Waaay over tuned...
>
>>
>>    3. There are also some other small benchmarks that could test compiler
>> performance, like polybench
>> (http://www.cse.ohio-state.edu/~pouchet/software/polybench/ ), which
>> focus on evaluating the loop transformation of the compiler.
>>
> I can't say with absolute certainty, but didn't these favor
> polyhedral-type loop optimizations?
> ---------------------
> You have to decide what types of code you want to benchmark - HPC, C++,
> scalar/vectorized.. embedded.. etc
>
> If you narrow down what sort of performance comparison you want, I can
> offer some suggestions. The above benchmarks aren't bad, but in some cases
> it won't be a fair comparison against clang/llvm. Other compilers may have
> done excessive tuning, and that will show when compared to a default
> clang/llvm.
>
> Just like NAS parallel benchmark probably has less direct tuning from
> Intel. If you're looking at embedded maybe Dhrystone....
>
> Lastly - There's some benchmarks in the llvm testsuite to consider.
> I can't say these are very good choices, but they are probably easy to run:
> https://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/

David Peixotto

2013-Dec-11 17:58 UTC


[LLVMdev] runtime performance benchmarking tools for clang

> 2) For lag in execution time due to floating point operations, it was
> clearly observed that gcc used floating point instruction FSQRT, whereas
> clang seemed to use emulated function (?) BL SQRT.
>
> Note that we used the following flags for both clang as well as gcc
> compilation.
>
> -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -mtune=cortex-a8
>
> In fact, I was surprised to see that even when "-march=armv7-a
> -mfloat-abi=hard -mfpu=vfpv3-d16 -mtune=cortex-a8"
> was used, the code generated did not use hardware vsqrt instruction,
> instead there was a bl sqrt instruction.
>
> Could someone point out why vsqrt was not emitted in assembly even though
> softfp or 'hard' float-abi was specified ?


The vsqrt instruction may not be generated automatically on platforms
where math functions may set errno. Try compiling with -fno-math-errno and
see if that helps.

