Török Edvin
2007-Oct-19 10:50 UTC
[LLVMdev] llvm_fcmp_ord and llvm_fcmp_uno and assembly code generation
Hi,

The C backend in llc generates code like:

    static inline int llvm_fcmp_ord(double X, double Y) { return X == X && Y == Y; }
    static inline int llvm_fcmp_uno(double X, double Y) { return X != X || Y != Y; }

First of all it generates a warning by clang and gcc (with certain flags):

    x.cbe.c:130: warning: comparing floating point with == or != is unsafe

Now, C99 provides a macro for this kind of stuff, but unfortunately
ANSI C doesn't have something like this (for unordered testing) AFAIK.

*If* we would be using C99 the code could look like:

    return isunordered(X, Y);
    return !isunordered(X, Y);

However the assembly code generated is much shorter if I am using the
C99 macros, both on gcc and llvm-gcc.

This raises 2 issues:
* can llvm_fcmp_ord/uno be implemented in ANSI/ISO C differently,
  which doesn't generate a warning, *and* generates optimal code
* can llvm-gcc be improved to recognize functions like
  llvm_fcmp_ord/uno, and generate the optimal code (one ucomisd, rather
  than two).

Not that llvm_fcmp_ord/uno would be on a critical path in a program,
but any optimization is good, and worth mentioning IMHO ;)

Look:

    #include <math.h>
    static inline int llvm_fcmp_ord(double X, double Y) { return X == X && Y == Y; }
    static inline int llvm_fcmp_uno(double X, double Y) { return X != X || Y != Y; }
    int x(double X, double Y)
    {
        return llvm_fcmp_uno(X,Y);
    }

    int xx(double X, double Y)
    {
        return isunordered(X, Y);
    }

    $ gcc -std=c99 -O3 -S x.c -o x.gcc.s
    $ llvm-gcc -std=c99 -O3 -S x.c -o x.llvm.s

x.gcc.s:

    x:
    .LFB7:
            movl    $1, %eax
            ucomisd %xmm0, %xmm0
            jne     .L5
            jp      .L5
            xorl    %eax, %eax
            ucomisd %xmm1, %xmm1
            setp    %al
    .L5:
            rep ; ret
    .LFE7:
            .size   x, .-x
            .p2align 4,,15
    .globl xx
            .type   xx, @function
    xx:
    .LFB8:
            xorl    %eax, %eax
            ucomisd %xmm1, %xmm0
            setp    %al
            ret

x.llvm.s:

    x:
            pxor    %xmm2, %xmm2
            ucomisd %xmm2, %xmm0
            setp    %al
            ucomisd %xmm2, %xmm1
            setp    %cl
            orb     %al, %cl
            movzbl  %cl, %eax
            ret
            .size   x, .-x

            .align  16
    .globl xx
            .type   xx, @function
    xx:
            ucomisd %xmm1, %xmm0
            setp    %al
            movzbl  %al, %eax
            ret
            .size   xx, .-xx

Best regards,
Edwin
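As a rough illustration of the first question above, here is a minimal sketch of one possible formulation that avoids `==` and `!=`, and so should not trip the floating-point equality warning. It relies on the IEEE 754 rule that every relational comparison involving a NaN is false. The names `uno_self`, `uno_rel`, `uno_c99` and `check_pair` are hypothetical and used only for this example. The relational form is not something the C backend emits, and nothing here shows that a compiler reduces it to a single ucomisd. The sketch only checks that the three forms agree, assuming strict IEEE semantics (for example, no -ffast-math).

```c
#include <assert.h>
#include <math.h>   /* C99: isunordered, NAN, INFINITY */

/* Self-comparison form, as quoted from the C backend output in the post. */
static int uno_self(double X, double Y) { return X != X || Y != Y; }

/* Hypothetical alternative using only relational operators (plain ANSI C).
   If either operand is a NaN, both X <= Y and X >= Y are false. */
static int uno_rel(double X, double Y) { return !(X <= Y || X >= Y); }

/* C99 reference form, normalized to 0 or 1. */
static int uno_c99(double X, double Y) { return isunordered(X, Y) != 0; }

/* Asserts that all three forms agree for one pair of inputs and
   returns the shared result (1 = unordered, 0 = ordered). */
static int check_pair(double X, double Y)
{
    int expected = uno_c99(X, Y);
    assert(uno_self(X, Y) == expected);
    assert(uno_rel(X, Y) == expected);
    return expected;
}
```

Calling `check_pair(NAN, 1.0)` exercises the unordered case and `check_pair(1.0, 2.0)` the ordered one. The ordered predicate would simply be the negation, `X <= Y || X >= Y`.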
Evan Cheng
2007-Oct-22 17:46 UTC
[LLVMdev] llvm_fcmp_ord and llvm_fcmp_uno and assembly code generation
Hi,

Can you file a bugzilla on this? Thanks!

Evan