I wanted to use LLVM for my math parser, but it seems that its floating-point optimizations are poor. For example, consider this C code:

    float foo(float x) { return x+x+x; }

and here is the code generated by the "optimized" live demo:

    define float @foo(float %x) nounwind readnone {
    entry:
      %0 = fmul float %x, 2.000000e+00 ; <float> [#uses=1]
      %1 = fadd float %0, %x ; <float> [#uses=1]
      ret float %1
    }

So what is going on? I mean, it was not replaced by %0 = fmul %x, 3.0. I thought that maybe LLVM follows some strict IEEE floating-point rules covering exceptions, subnormals, NaN, Inf and so on. But (x+x) was actually transformed to %x * 2.0. And this is even stranger, because on many architectures MUL takes more cycles than ADD (the same reason you'd rather use LEA than IMUL in x86 code).

Could someone explain what is going on? Maybe there are some special optimization passes for such scenarios, but I've been playing with them (in a C++ app) and I didn't achieve anything. And to be honest, those optimization passes are not well documented.

With regards,
Bob D.
And also the resulting assembly code is very poor:

    00460013  movss   xmm0,dword ptr [esp+8]
    00460019  movaps  xmm1,xmm0
    0046001C  addss   xmm1,xmm1
    00460020  pxor    xmm2,xmm2
    00460024  addss   xmm2,xmm1
    00460028  addss   xmm2,xmm0
    0046002C  movss   dword ptr [esp],xmm2
    00460031  fld     dword ptr [esp]

Especially pxor&and instead of movss (which is unnecessary anyway) is just pure madness.

Bob D.
On Nov 20, 2010, at 2:41 PM, Sdadsda Sdasdaas wrote:

> And also the resulting assembly code is very poor:
>
> 00460013 movss xmm0,dword ptr [esp+8]
> 00460019 movaps xmm1,xmm0
> 0046001C addss xmm1,xmm1
> 00460020 pxor xmm2,xmm2
> 00460024 addss xmm2,xmm1
> 00460028 addss xmm2,xmm0
> 0046002C movss dword ptr [esp],xmm2
> 00460031 fld dword ptr [esp]
>
> Especially pxor&and instead of movss (which is unnecessary anyway) is just pure
> madness.

X+0.0 isn't the same as X if X is -0.0. Have you tried setting 'UnsafeFPMath' in TargetOptions.h? If you're still having problems, it would be best to indicate more about your configuration, how you're using the llvm tools/libs etc.

-Chris
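The signed-zero point in the reply above can be checked with a small C sketch. The helper names add_zero and add_zero_preserves_sign are hypothetical and not from the thread. The volatile is only there to keep the compiler from folding the addition at compile time, and the default IEEE 754 round-to-nearest mode is assumed.

```c
#include <math.h>     /* signbit */
#include <stdbool.h>

/* Hypothetical helper: computes x + 0.0f at run time.
   The volatile read stops the compiler from simplifying the expression. */
static float add_zero(float x) {
    volatile float zero = 0.0f;
    return x + zero;
}

/* Hypothetical helper: true when x + 0.0f keeps the sign bit of x. */
static bool add_zero_preserves_sign(float x) {
    bool before = signbit(x) != 0;
    bool after = signbit(add_zero(x)) != 0;
    return before == after;
}
```

Under these assumptions add_zero_preserves_sign(-0.0f) should be false, because -0.0 + 0.0 rounds to +0.0, while ordinary nonzero inputs keep their sign. That is presumably why the generated code keeps the addss against the zeroed register unless something like UnsafeFPMath relaxes the rules.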