Jon Harrop
2009-Feb-26 21:07 UTC
[LLVMdev] Impressive performance result for LLVM: complex arithmetic
Following a discussion about numerical performance on comp.lang.functional
recently, I just tried running a simple C Mandelbrot benchmark that uses C99's
complex arithmetic, using gcc and llvm-gcc on a 2.1GHz Opteron 2352 running
Debian:
gcc: 5.727s
llvm-gcc: 1.393s
There is still 20% room for improvement but LLVM is >4x faster than gcc here.
Sweet.
Here's the code:
#include <stdio.h>
#include <stdlib.h>
#include <complex.h>

int max_i = 65536;

double sqr(double x) { return x*x; }

double cnorm2(complex z) { return sqr(creal(z)) + sqr(cimag(z)); }

int loop(complex c) {
  complex z=c;
  int i=1;
  while (cnorm2(z) <= 4.0 && i++ < max_i)
    z = z*z + c;
  return i;
}

int main() {
  for (int j = -39; j < 39; ++j) {
    for (int i = -39; i < 39; ++i)
      printf(loop(j/40.0-0.5 + i/40.0*I) > max_i ? "*" : " ");
    printf("\n");
  }
  return 0;
}
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
Daniel Berlin
2009-Feb-27 18:12 UTC
[LLVMdev] Impressive performance result for LLVM: complex arithmetic
On gcc's side, this is a simple missed optimization in builtin lowering.
As a result, the gcc code ends up with a call to muldc3 (complex = 2x2
multiply double) and the llvm code doesn't.
GCC should be fixed in a second, and with that, there is no
appreciable performance difference between the two.

On Thu, Feb 26, 2009 at 4:07 PM, Jon Harrop <jon at ffconsultancy.com> wrote:
> gcc: 5.727s
> llvm-gcc: 1.393s
>
> There is still 20% room for improvement but LLVM is >4x faster than gcc here.
> Sweet.
David Greene
2009-Mar-06 20:31 UTC
[LLVMdev] Impressive performance result for LLVM: complex arithmetic
On Friday 27 February 2009 12:12, Daniel Berlin wrote:
> On gcc's side, this is a simple missed opt on the part of builtin lowering.
> As a result, the gcc code ends up with a call to muldc3 (complex = 2x2
> multiply double) and the llvm code doesn't.
> GCC should be fixed in a second, and with that, there is no
> appreciable performance difference between the two.

FYI, gcc 4.3.3 gets the same performance with -O3. I reproduced Jon's gcc
results on 4.3.3 with -O0.

-Dave
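
To illustrate the muldc3 point in this thread: the helper Daniel mentions appears to be the runtime routine for double-precision complex multiplication (`__muldc3` in libgcc). The sketch below puts the benchmark's loop, written with the built-in complex multiply, next to a hand-expanded version that works on the real and imaginary parts directly. The names `loop_builtin`, `loop_expanded` and `max_iter` are hypothetical and do not come from the original posts. The expanded form is only an approximation of what a compiler might emit when it expands the multiply inline, and it deliberately skips the NaN and infinity handling that a full C99 complex multiply may perform.

```c
#include <complex.h>

/* Iteration limit, mirroring max_i in the benchmark (name is hypothetical). */
static const int max_iter = 65536;

/* The benchmark's loop using the built-in complex multiply. Depending on
   compiler and flags, z*z may be expanded inline or turned into a call
   to a runtime helper on every iteration. */
int loop_builtin(double complex c)
{
    double complex z = c;
    int i = 1;
    while (creal(z) * creal(z) + cimag(z) * cimag(z) <= 4.0 && i++ < max_iter)
        z = z * z + c;
    return i;
}

/* Hand-expanded variant (an assumption for illustration, not code from
   the thread): (a+bi)^2 = (a*a - b*b) + (2*a*b)i, with no special-case
   handling for NaN or infinite intermediate values. */
int loop_expanded(double complex c)
{
    double cr = creal(c), ci = cimag(c);
    double zr = cr, zi = ci;
    int i = 1;
    while (zr * zr + zi * zi <= 4.0 && i++ < max_iter) {
        double t = zr * zr - zi * zi + cr;  /* new real part */
        zi = 2.0 * zr * zi + ci;            /* new imaginary part */
        zr = t;
    }
    return i;
}
```

Timing the two functions over the same grid of points at different optimization levels would be one way to check whether a helper call accounts for a gap like the one reported here. That experiment is a suggestion only; no such measurement appears in the thread.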