Nicolas Capens
2008-May-27 09:09 UTC
[LLVMdev] Float compare-for-equality and select optimization opportunity
Hi all,

I'm trying to generate code containing an ordered float compare for equality, and a select. The resulting code, however, has an unordered compare and some Boolean logic that I think could be eliminated. In C syntax the code looks like this:

    float x, y;
    int a, b, c;

    if(x == y)   // Rotate the integers
    {
        int t;
        t = a;
        a = b;
        b = c;
        c = t;
    }

This is the resulting x86 assembly code:

    movss   xmm0,dword ptr [ecx+4]
    ucomiss xmm0,dword ptr [ecx+8]
    sete    al
    setnp   dl
    test    dl,al
    mov     edx,edi
    cmovne  edx,ecx
    cmovne  ecx,esi
    cmovne  esi,edi

While I'm pleasantly surprised that my branch does get turned into several select operations as intended (cmov, conditional move, in x86), I'm confused why it uses the ucomiss instruction (unordered compare and set flags). I only used IRBuilder::CreateFCmpOEQ. It also appears to invert the conditional, for no clear reason. I think it could be rewritten as follows:

    movss   xmm0,dword ptr [ecx+4]
    comiss  xmm0,dword ptr [ecx+8]
    mov     edx,edi
    cmove   edx,ecx
    cmove   ecx,esi
    cmove   esi,edi

Compared to the original C syntax code this looks pretty straightforward. Curiously, when I replace the compare-for-equality with something like a less-than, it does generate such compact code (using comiss and cmova). And the not-equal case looks like this:

    movss   xmm0,dword ptr [ecx+4]
    ucomiss xmm0,dword ptr [ecx+8]
    mov     esi,ecx
    cmove   esi,edx
    cmovne  ecx,eax
    cmove   edx,eax

So this generates compact code, but with an unordered compare.

Anyway, it looks like the compare-for-equality case in particular is missing an optimization opportunity. It's no big deal to me, but I thought someone here might be interested.

Cheers,

Nicolas Capens
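For reference, the `if` body above can be written so that every assignment is a select on one shared condition, which is roughly the shape the three cmov instructions implement. This is only a sketch under stated assumptions: `rotate_if_equal` and `check_rotate` are hypothetical names that do not appear in the original code, and IEEE 754 comparison semantics are assumed (no fast-math flags).

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: rotate a, b, c when x == y, written as three
   selects on one condition (the form a compiler can lower to cmov). */
static void rotate_if_equal(float x, float y, int *a, int *b, int *c)
{
    int cond = (x == y);   /* ordered equality: false if x or y is NaN */
    int old_a = *a, old_b = *b, old_c = *c;

    *a = cond ? old_b : old_a;
    *b = cond ? old_c : old_b;
    *c = cond ? old_a : old_c;
}

/* Illustrative self-check, not part of the original post. */
void check_rotate(void)
{
    int a = 1, b = 2, c = 3;

    rotate_if_equal(1.0f, 1.0f, &a, &b, &c);   /* equal: rotate */
    assert(a == 2 && b == 3 && c == 1);

    rotate_if_equal(NAN, NAN, &a, &b, &c);     /* unordered: no rotation */
    assert(a == 2 && b == 3 && c == 1);
}
```

Calling `check_rotate()` from a `main` exercises both the equal case and the NaN case. The NaN case matters here because it is the only input on which an ordered and an unordered equality compare disagree.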
Marc B. Reynolds
2008-May-27 12:06 UTC
[LLVMdev] Float compare-for-equality and select optimization opportunity
Both ZF and PF will be set if unordered, so the generated code is IEEE correct. You want to generate 'fcmp ueq' instead of 'fcmp oeq'.
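The flag behaviour described here can be modelled in a few lines of C. The sketch below is an illustration, not LLVM's actual lowering code: `struct flags`, `compare_flags`, `oeq_from_flags`, `ueq_from_flags` and `check_flags` are hypothetical names, and the flag table follows the Intel manual's description of ucomiss/comiss (unordered sets ZF, PF and CF; equal sets only ZF; less-than sets only CF). Under that model an ordered-equal test needs ZF set and PF clear, which is what the `sete`/`setnp`/`test` sequence computes before `cmovne` fires on a nonzero result. A bare `cmove` tests ZF alone and therefore also fires for NaN operands.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Model of the EFLAGS bits written by ucomiss/comiss:
   unordered -> ZF=PF=CF=1, equal -> ZF=1, less -> CF=1, greater -> all 0. */
struct flags { bool zf, pf, cf; };

static struct flags compare_flags(float x, float y)
{
    struct flags f = { false, false, false };
    if (isunordered(x, y)) { f.zf = f.pf = f.cf = true; }
    else if (x == y)       { f.zf = true; }
    else if (x < y)        { f.cf = true; }
    return f;
}

/* fcmp oeq: equal AND ordered, so ZF alone is not enough. */
static bool oeq_from_flags(struct flags f) { return f.zf && !f.pf; }

/* fcmp ueq: equal OR unordered, which is exactly ZF. */
static bool ueq_from_flags(struct flags f) { return f.zf; }

/* Illustrative self-check. */
void check_flags(void)
{
    assert( oeq_from_flags(compare_flags(1.0f, 1.0f)));
    assert(!oeq_from_flags(compare_flags(1.0f, NAN)));
    assert( ueq_from_flags(compare_flags(1.0f, NAN)));
    assert(!ueq_from_flags(compare_flags(1.0f, 2.0f)));
}
```

Note that comiss and ucomiss write the same flags. As far as the flags are concerned, switching the instruction alone does not turn a ZF-only test into an ordered compare; the two instructions differ in when they signal an invalid-operation exception for NaN operands.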
Marc B. Reynolds
2008-May-27 12:49 UTC
[LLVMdev] Float compare-for-equality and select optimization opportunity
Nicolas,

The LLVM mail server is being slow, so I'm e-mailing my comment directly. The generated code is IEEE correct. Use CreateFCmpUEQ for unordered-or-equal, which should generate your hand-written version. I'm assuming that your next e-mail (min/max) is probably the same thing.
Nicolas Capens
2008-May-27 15:35 UTC
[LLVMdev] Float compare-for-equality and select optimization opportunity
Hi Marc,

I'm a bit confused. Isn't the standard compare (i.e. the one for a language like C) an ordered one? I tried converting some C code to LLVM C++ API code with the online demo, and it uses FCMP_OEQ.

Cheers,

Nicolas
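The question of what C's `==` means for NaN operands can be checked directly. The following is a minimal sketch assuming IEEE 754 semantics (C99 Annex F behaviour, no fast-math flags); `fcmp_oeq`, `fcmp_ueq`, `fcmp_une` and `check_predicates` are hypothetical helper names chosen to mirror the LLVM predicate names, not functions from any library.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helpers named after the LLVM fcmp predicates. */
static bool fcmp_oeq(float x, float y) { return x == y; }   /* C's == */
static bool fcmp_ueq(float x, float y) { return isunordered(x, y) || x == y; }
static bool fcmp_une(float x, float y) { return x != y; }   /* C's != */

/* Illustrative self-check. */
void check_predicates(void)
{
    float q = NAN;

    assert(!fcmp_oeq(q, q));         /* == is false when an operand is NaN */
    assert( fcmp_ueq(q, q));         /* ueq is true when an operand is NaN */
    assert( fcmp_une(q, q));         /* != is true when an operand is NaN  */

    assert( fcmp_oeq(1.0f, 1.0f));   /* ordered inputs: oeq and ueq agree */
    assert( fcmp_ueq(1.0f, 1.0f));
    assert(!fcmp_ueq(1.0f, 2.0f));
}
```

Under these assumptions C's `==` matches the ordered predicate, which is consistent with the online demo emitting FCMP_OEQ. Switching to the unordered predicate changes the result only when at least one operand is NaN.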