Chen Li
2014-May-30 19:34 UTC
[LLVMdev] Is LLVM’s floating point operations IEEE 754 compliant on x86?
Hi llvmdev,

I would like to know some implementation details of LLVM’s floating point operations on x86. Are they IEEE 754 compliant by default?

On x86, the precision of floating point operations can differ considerably depending on which instructions are used. For example, FPU (x87) instructions allow better precision by calculating 80-bit intermediate results, while SSE2 instructions only have single and double precision. FMA (fused multiply-add) is another example: it reduces the two roundings of a normal multiply-add to a single rounding.

From the LLVM language reference manual, I saw there are several floating point types defined, e.g., x86_fp80 to use the 80-bit x87 registers, and the 64-bit double. Does this imply that x87 is only used when an explicit x86_fp80 is declared, or is it also used for intermediate results of 64-bit double floating point operations? Does this have anything to do with the fast-math flags?

I have also found an old note on LLVM’s floating point operations dating back to 2004 (http://nondot.org/sabre/LLVMNotes/FloatingPointChanges.txt) and I am not sure if it is still valid today. The note says LLVM’s x86 floating point implementation is not IEEE 754 compliant by default because of performance considerations. However, some of those concerns might no longer hold. From what I’ve read, especially on x86-64, some SSE2 instructions are more efficient than FPU instructions, and therefore having IEEE compliant operations is not as expensive as it used to be. And the add_strict operations proposed in the note (to support IEEE compliance) do not seem to have made it into the mainline.

Based on what I have found so far, I think LLVM’s default setting is to only allow fmuladd as a fused floating point operation (defined by FPOpFusion in TargetOptions.h), but I am not sure how x87 is used. Does anyone know some details of those?

Also, what does LLVM do with compile time optimizations on floating point? For example, when constant folding (1.5 + 2.5 + 3.5), does it use the same precision for the intermediate result, or does it only do one rounding at the end?

Thanks in advance!

best,
chen
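To make the "two roundings versus one rounding" point about FMA concrete, here is a minimal C++ sketch. It is not taken from LLVM; the function names and the constants kA, kB and kC are hypothetical, chosen only so that the exact product a * b = 1 - 2^-54 falls halfway between two neighbouring doubles. It assumes IEEE 754 binary64 `double`, round-to-nearest-even, and a correctly rounded `std::fma`.

```cpp
#include <cmath>

// Multiply, then add, with two roundings: the product is rounded to double
// before the addition. The volatile store keeps the compiler from
// contracting the expression into a single fused operation.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused multiply-add: a * b + c is computed exactly and rounded only once.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Hypothetical inputs: (1 + 2^-27) * (1 - 2^-27) = 1 - 2^-54 exactly,
// which is a tie between 1 - 2^-53 and 1.0 and rounds to 1.0.
const double kA = 1.0 + std::ldexp(1.0, -27);
const double kB = 1.0 - std::ldexp(1.0, -27);
const double kC = -1.0;
```

Under those assumptions, `mul_then_add(kA, kB, kC)` yields 0.0 because the product has already been rounded to 1.0, while `fused_mul_add(kA, kB, kC)` keeps the low-order bits and yields -2^-54.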
Tim Northover
2014-Jun-01 07:02 UTC
[LLVMdev] Is LLVM’s floating point operations IEEE 754 compliant on x86?
Hi Chen,

This isn't really my area, but I'll do my best in the hope that any mistakes I make will annoy people enough to correct me...

> Is it IEEE 754 compliant by default?

I believe so, apart from things like exceptions and probably reliable support for things like C's "#pragma fenv".

> From LLVM language reference manual, I saw there are several floating point types defined, e.g., x86_fp80 to use 80-bit x87 register, and double of 64-bit. Does this imply that x87 is only supported when an explicit x86_fp80 is declared or it is also supported in intermediate results of 64-bit double floating point operations? Anything to do with the fast-math flag?

I believe we only use x87 for float & double types if we absolutely have to (i.e. SSE isn't available).

> I have also found an old note on LLVM’s floating point operations back to 2004 (http://nondot.org/sabre/LLVMNotes/FloatingPointChanges.txt) and I am not sure if it is still valid today.

Probably not, at least in the details. LLVM has changed a lot since 2004. I think we're still strict by default but the floating-point instructions have fast-math flags that can be applied to enable relaxed optimisations (http://llvm.org/docs/LangRef.html#fast-math-flags).

> Based on what I found so far, I think LLVM’s default setting is to only allow fmuladd as fused floating point operations (defined by FPOpFusion in TargetOptions.h), but I am not sure how x87 is used. Does anyone know some details of those?

The fmuladd sounds about right (well, there's also a @llvm.fma which will always be fused, regardless of benefit I think).

> Also, what does LLVM do with compile time optimizations on floating point? For example, constant folding on (1.5 + 2.5 + 3.5), does it use the same precision on the intermediate result or it only does one rounding at the end?

We have an APFloat class which performs the operations in software and will round at each step. It's *probably* used reliably.

Cheers.

Tim.
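As an illustration of what "round at each step" means for a folded sum, here is a small C++ sketch. It does not use APFloat; it only mimics per-step rounding with ordinary `double` arithmetic, and the function names are hypothetical. It assumes IEEE 754 binary64 `double` with round-to-nearest-even. Note that 1.5, 2.5, 3.5 and all their partial sums are exactly representable, so that particular example gives 7.5 either way; the value 1e16 is used to expose a difference.

```cpp
// Left-to-right sum where each intermediate result is rounded to double.
// The volatile store forces the partial sum to double precision even on
// targets that keep intermediates in wider registers (e.g. x87).
double sum_left_to_right(double x, double y, double z) {
    volatile double partial = x + y;  // first rounding
    return partial + z;               // second rounding
}

// Same three terms, but the two small ones are added first. For the
// inputs discussed here this happens to equal the exact sum, i.e. what a
// single rounding at the end would produce.
double sum_small_terms_first(double x, double y, double z) {
    volatile double small = y + z;
    return x + small;
}
```

With x = 1e16 and y = z = 1.0, the spacing between doubles near x is 2, so `sum_left_to_right` returns 1e16 (each `+ 1.0` is a tie that rounds back to the even neighbour), whereas `sum_small_terms_first` returns 1e16 + 2. For 1.5, 2.5 and 3.5 both functions return exactly 7.5.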