thr3ads.net - llvm dev - [LLVMdev] Performance vs other VMs [Jan 2009]

Jon Harrop

2009-Jan-30 20:56 UTC

[LLVMdev] Performance vs other VMs

The release of a new code generator in Mono 2.2 prompted me to benchmark the 
performance of various VMs using the SciMark2 benchmark on an 8x 2.1GHz 
64-bit Opteron and I have published the results here:

  http://flyingfrogblog.blogspot.com/2009/01/mono-22.html

The LLVM results were generated using llvm-gcc 4.2.1 on the C version of 
SciMark2 with the following command-line options:

  llvm-gcc -Wall -lm -O2 -funroll-loops *.c -o scimark2

Mono was up to 12x slower than LLVM before and is now only 2.2x slower on 
average. Interestingly, the JVM scores slightly higher than LLVM on this 
benchmark on average and beats LLVM on two of the five individual tests.

The individual scores are particularly enlightening. Specifically:

. LLVM outperforms all other VMs by a significant margin on FFT, Monte Carlo 
and sparse matrix multiply.

. LLVM is beaten by the JVM on successive over-relaxation (SOR) and LU 
decomposition.

In the context of the SOR test, I suspect the JVM is using alias information 
to perform optimizations that LLVM and llvm-gcc probably do not do.

I am not sure what causes the performance discrepancy on LU. Perhaps the JVM 
is generating SSE instructions. Does llvm-gcc generate SSE instructions under 
any circumstances?
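
As an illustration of the kind of loop the SSE question concerns, here is a minimal sketch of a row update like the one at the heart of LU decomposition. The function name and signature are hypothetical and are not taken from the SciMark2 sources.

```c
#include <stddef.h>

/* Hypothetical helper (not from SciMark2): subtract factor * pivot_row
 * from row, element by element. Each iteration is independent of the
 * others, which is what makes the loop a candidate for packed SSE2
 * arithmetic on two doubles at a time. */
void lu_row_update(size_t n, double factor,
                   const double *pivot_row, double *row)
{
    for (size_t j = 0; j < n; j++)
        row[j] -= factor * pivot_row[j];
}
```

One way to see what a given compiler does with such a loop is to compile it with -S and read the assembly: packed instructions such as mulpd and subpd indicate vectorized SSE2 code, while mulsd and subsd indicate SSE2 operating on one double at a time.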

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

BGB

2009-Jan-31 02:17 UTC

[LLVMdev] Performance vs other VMs

Interesting, but can you add plain C compiled with good old-fashioned GCC
or similar to serve as a point of reference as well?


Jon Harrop

2009-Jan-31 03:31 UTC

[LLVMdev] Performance vs other VMs

On Saturday 31 January 2009 02:17:31 BGB wrote:
> interesting, but can you add plain C compiled with the good old-fashioned
> GCC or similar to serve as a point of reference as well?...

This is the highest composite score I have been able to get with gcc 4.3.2:

$ gcc -Wall -lm -O3 -march=barcelona -funroll-all-loops *.c -o scimark2
$ ./scimark2
Composite Score:          708.63
FFT             Mflops:   573.76    (N=1024)
SOR             Mflops:   481.74    (100 x 100)
MonteCarlo:     Mflops:   129.06
Sparse matmult  Mflops:   775.57    (N=1000, nz=5000)
LU              Mflops:  1583.00    (M=100, N=100)

One reason is, perhaps, that the version of llvm-gcc that I am using does not 
recognise -march=barcelona for this CPU but gcc does.
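
For reference, the composite score in the output above agrees with the arithmetic mean of the five per-kernel figures: (573.76 + 481.74 + 129.06 + 775.57 + 1583.00) / 5 = 708.626. A minimal sketch of that calculation follows; the function name is illustrative and is not taken from the SciMark2 sources.

```c
#include <stddef.h>

/* Illustrative helper: arithmetic mean of the per-kernel Mflops figures.
 * Returns 0.0 for an empty list to avoid dividing by zero. */
double composite_score(const double *mflops, size_t count)
{
    if (count == 0)
        return 0.0;

    double sum = 0.0;
    for (size_t i = 0; i < count; i++)
        sum += mflops[i];
    return sum / (double)count;
}
```

Because the composite is a plain mean, a single fast kernel (LU in this run) can dominate it, which is why the individual scores are worth reading alongside it.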

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

Ramón García

2009-Feb-01 05:25 UTC

[LLVMdev] Performance vs other VMs

This is not quite a fair comparison. The other virtual machines must
do garbage collection, while LLVM, since it is compiling C code, takes
advantage of manual memory management.


Patrick Meredith

2009-Feb-01 06:47 UTC

[LLVMdev] Performance vs other VMs

Here is a run of scimark2 with verbose GC enabled. You'll see that
there are two garbage collection cycles, for a total of around 0.003
seconds. It should also be noted that these GCs happen before the timer
starts running. There is almost no dynamic memory allocation in this
code. Modern garbage collectors are also very efficient (sometimes
better than hand deallocation).

java -verbose:gc jnt/scimark2/commandline
[GC 511K->202K(1984K), 0.0018845 secs]
[GC 714K->415K(1984K), 0.0015513 secs]

SciMark 2.0a

Composite Score: 327.3062235870194
FFT (1024): 127.42845375506063
SOR (100x100):   677.3128255261597
Monte Carlo : 29.4337095721763
Sparse matmult (N=1000, nz=5000): 300.2107071278524
LU (100x100): 502.14542195384803

java.vendor: Apple Inc.
java.version: 1.5.0_16
os.arch: i386
os.name: Mac OS X
os.version: 10.5.6



On Jan 31, 2009, at 11:25 PM, Ramón García wrote:
> This is not a quite fair comparison. Other virtual machines must be
> doing garbage collection, while LLVM, as it is using C code, it is
> taking advantage of memory allocation by hand.

Jon Harrop

2009-Feb-01 16:01 UTC

[LLVMdev] Aliasing (was Performance vs other VMs)

On Sunday 01 February 2009 05:25:40 Ramón García wrote:
> This is not a quite fair comparison. Other virtual machines must be
> doing garbage collection, while LLVM, as it is using C code, it is
> taking advantage of memory allocation by hand.
That is an insignificant advantage in this particular case (SciMark2) because 
the memory for each test is preallocated and not part of the measurement and 
the heap and stack are both tiny during the computations so there is little 
to traverse.
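
The point about preallocation can be sketched as a small timing harness in which allocation and deallocation sit outside the timed region. This is a minimal sketch under stated assumptions: the kernel, the function names and the use of clock() are all illustrative and do not reproduce the SciMark2 harness.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for one benchmark kernel: 2 flops per element. */
static void kernel(double *data, size_t n)
{
    for (size_t i = 0; i < n; i++)
        data[i] = 0.5 * data[i] + 1.0;
}

/* Convert a flop count and an elapsed time into Mflops. */
double mflops(double flops, double seconds)
{
    return flops / seconds / 1.0e6;
}

/* Time `repetitions` runs of the kernel on n elements. Returns the
 * elapsed CPU seconds, or -1.0 if allocation fails. A checksum is
 * written out so the compiler cannot discard the computation. */
double time_kernel(size_t n, int repetitions, double *checksum)
{
    double *data = malloc(n * sizeof *data);   /* before the timer starts */
    if (data == NULL)
        return -1.0;
    for (size_t i = 0; i < n; i++)
        data[i] = 1.0;

    clock_t start = clock();                   /* timed region begins */
    for (int r = 0; r < repetitions; r++)
        kernel(data, n);
    clock_t stop = clock();                    /* timed region ends */

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    *checksum = sum;

    free(data);                                /* after the timer stops */
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

With this shape, the cost of obtaining and releasing memory never enters the reported figure, whether the memory comes from malloc or from a garbage-collected heap. For very short runs the elapsed time can be zero at the resolution of clock(), so the repetition count needs to be large enough before the result is passed to mflops.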

I am interested in the comparative results for LLVM because I consider it to 
represent how fast my LLVM-based VM might be compared to other garbage 
collected VMs.

However, LLVM has a serious disadvantage compared to the other VMs here 
because it does not have aliasing assurances. For example, it does not know 
about array aliasing, e.g. that the subarrays in the successive 
over-relaxation test cannot overlap.

The LLVM 2.1 release notes say that llvm-gcc got alias analysis and understood
the "restrict" keyword but when I add it to the C code for SciMark2 it makes
no difference. Can anyone else get this to work?
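
To make the aliasing question concrete, here is a minimal sketch of a row relaxation in the style of the SOR test with the three row pointers qualified by restrict. The function name and signature are hypothetical; this is not the SciMark2 code, only an illustration of where the qualifier would go.

```c
#include <stddef.h>

/* Hypothetical helper: relax the interior of one grid row in place.
 * The restrict qualifiers promise the compiler that the rows above,
 * at, and below the current position do not overlap in memory. */
void sor_relax_row(size_t n, double omega,
                   const double *restrict above,
                   double *restrict row,
                   const double *restrict below)
{
    const double omega_over_four = omega * 0.25;
    const double one_minus_omega = 1.0 - omega;

    for (size_t j = 1; j + 1 < n; j++)
        row[j] = omega_over_four * (above[j] + below[j]
                                    + row[j - 1] + row[j + 1])
               + one_minus_omega * row[j];
}
```

Two caveats apply to a sketch like this. First, restrict is a C99 keyword, so a compiler in an older default mode may need -std=c99 or the __restrict spelling. Second, each iteration reads row[j - 1], which the previous iteration has just written, so the loop carries a dependency that restrict does not remove; that may limit how much the qualifier can help on this particular kernel.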

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
