Jon Harrop
2009-Feb-26 21:07 UTC
[LLVMdev] Impressive performance result for LLVM: complex arithmetic
Following a discussion about numerical performance on comp.lang.functional
recently I just tried running a simple C mandelbrot benchmark that uses C99's
complex arithmetic using gcc and llvm-gcc on a 2.1GHz Opteron 2352 running
Debian:

gcc: 5.727s
llvm-gcc: 1.393s

There is still 20% room for improvement but LLVM is >4x faster than gcc here.
Sweet.

Here's the code:

#include <stdio.h>
#include <stdlib.h>
#include <complex.h>

int max_i = 65536;

double sqr(double x) { return x*x; }

double cnorm2(complex z) { return sqr(creal(z)) + sqr(cimag(z)); }

int loop(complex c) {
  complex z=c;
  int i=1;
  while (cnorm2(z) <= 4.0 && i++ < max_i)
    z = z*z + c;
  return i;
}

int main() {
  for (int j = -39; j < 39; ++j) {
    for (int i = -39; i < 39; ++i)
      printf(loop(j/40.0-0.5 + i/40.0*I) > max_i ? "*" : " ");
    printf("\n");
  }
  return 0;
}

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
Daniel Berlin
2009-Feb-27 18:12 UTC
[LLVMdev] Impressive performance result for LLVM: complex arithmetic
On gcc's side, this is a simple missed optimization in builtin lowering.
As a result, the gcc code ends up with a call to muldc3 (complex = 2x2
multiply double) and the llvm code doesn't.

GCC should be fixed in a second, and with that, there is no
appreciable performance difference between the two.
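
To make the muldc3 point concrete, here is a rough sketch of what "no library call" can look like for the benchmark's z = z*z + c step. The helper names mul_inline, step_inline and step_builtin are made up for illustration and do not come from the thread or from either compiler. In libgcc the routine is spelled __muldc3; as far as I understand it, it computes the same textbook product and then does extra work to recover infinities when the result comes out as NaN, as C99 Annex G describes. The sketch below leaves that recovery out, so it only models the fast path for finite inputs and is not a claim about what either compiler actually emits.

```c
#include <complex.h>

/* Hypothetical helper, not from the thread: the textbook product
   (a + bi)(c + di) = (ac - bd) + (ad + bc)i, written out on the real and
   imaginary parts. It does no NaN or infinity recovery, so it is only a
   model of the inlined fast path for finite inputs. */
static double complex mul_inline(double complex x, double complex y) {
    double a = creal(x), b = cimag(x);
    double c = creal(y), d = cimag(y);
    double re = a * c - b * d;
    double im = a * d + b * c;
    return re + im * I;
}

/* One Mandelbrot step, z = z*z + c, using the hand-expanded product. */
static double complex step_inline(double complex z, double complex c) {
    return mul_inline(z, z) + c;
}

/* The same step with the built-in operator. Depending on the compiler
   and flags, this multiply may be expanded inline or lowered to a
   library call. */
static double complex step_builtin(double complex z, double complex c) {
    return z * z + c;
}
```

For finite inputs the two step functions should agree, so swapping one for the other in the benchmark's loop and timing both is one way to see how much of the gap comes from the call rather than from the arithmetic itself.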
David Greene
2009-Mar-06 20:31 UTC
[LLVMdev] Impressive performance result for LLVM: complex arithmetic
On Friday 27 February 2009 12:12, Daniel Berlin wrote:
> On gcc's side, this is a simple missed opt on the part of builtin lowering.
> As a result, the gcc code ends up with a call to muldc3 (complex = 2x2
> multiply double) and the llvm code doesn't.
> GCC should be fixed in a second, and with that, there is no
> appreciable performance difference between the two.

FYI, gcc 4.3.3 gets the same performance with -O3. I reproduced Jon's gcc
results on 4.3.3 with -O0.

-Dave