I notice llvm provides both ordered and unordered variants of floating-point comparison. Which of these is the right one to use by default? I suppose the two criteria would be, in order of importance:

1. Which is more efficient (more directly maps to typical hardware)?
2. Which is more familiar (more like the way C and Fortran do it)?
On Sun, Mar 28, 2010 at 7:45 AM, Russell Wallace <russell.wallace at gmail.com> wrote:
> I notice llvm provides both ordered and unordered variants of
> floating-point comparison. Which of these is the right one to use by
> default? I suppose the two criteria would be, in order of importance:
>
> 1. Which is more efficient (more directly maps to typical hardware)?

You can figure this out by looking at the output of llc:

$ cat test.ll
define i1 @less(double %x, double %y) nounwind readnone {
entry:
  %0 = fcmp ult double %x, %y ; <i1> [#uses=1]
  ret i1 %0
}
$ Debug/bin/llc <test.ll
  .section __TEXT,__text,regular,pure_instructions
  .globl _less
  .align 4, 0x90
_less: ## @less
## BB#0: ## %entry
  movsd 4(%esp), %xmm0
  ucomisd 12(%esp), %xmm0
  sbbb %al, %al
  andb $1, %al
  ret

> 2. Which is more familiar (more like the way C and Fortran do it)?

You can use http://llvm.org/demo/ to figure that out.
That works, thanks! It turns out that x86/SSE at least handles both equally well, but the ordered version is what C uses.

On Sun, Mar 28, 2010 at 6:38 PM, Jeffrey Yasskin <jyasskin at google.com> wrote:
> On Sun, Mar 28, 2010 at 7:45 AM, Russell Wallace
> <russell.wallace at gmail.com> wrote:
>> I notice llvm provides both ordered and unordered variants of
>> floating-point comparison. Which of these is the right one to use by
>> default? I suppose the two criteria would be, in order of importance:
>>
>> 1. Which is more efficient (more directly maps to typical hardware)?
>
> You can figure this out by looking at the output of llc:
>
> $ cat test.ll
> define i1 @less(double %x, double %y) nounwind readnone {
> entry:
>   %0 = fcmp ult double %x, %y ; <i1> [#uses=1]
>   ret i1 %0
> }
> $ Debug/bin/llc <test.ll
>   .section __TEXT,__text,regular,pure_instructions
>   .globl _less
>   .align 4, 0x90
> _less: ## @less
> ## BB#0: ## %entry
>   movsd 4(%esp), %xmm0
>   ucomisd 12(%esp), %xmm0
>   sbbb %al, %al
>   andb $1, %al
>   ret
>
>> 2. Which is more familiar (more like the way C and Fortran do it)?
>
> You can use http://llvm.org/demo/ to figure that out.