Hi all,

I'm trying to implement a floating-point 'min' and 'max' operation using select. For 'min' I get the expected x86 assembly minss instruction, but for 'max' I get a branch instead of maxss.

The corresponding C syntax code looks like this:

    float z = (x > y) ? x : y;

Any clues?

Could someone maybe explain to me the basics of LLVM's target specific optimizations and code generation? I'd love to analyze things like this myself but I don't know where to start.

Thanks,

Nicolas Capens
On May 27, 2008, at 2:49 AM, Nicolas Capens wrote:

> Hi all,
>
> I’m trying to implement a floating-point ‘min’ and ‘max’ operation
> using select. For ‘min’ I get the expected x86 assembly minss
> instruction, but for ‘max’ I get a branch instead of maxss.
>
> The corresponding C syntax code looks like this:
>
> float z = (x > y) ? x : y;
>
> Any clues?

Your code is not safe for NaNs. This is the correct way to write maxss in C:

    float max(float x, float y) {
      return !(x < y) ? x : y;
    }

If you don't care about NaNs, you can pass -ffast-math to llvm-gcc, or set "UnsafeFPMath=true" from <llvm/Target/TargetOptions.h>

> Could someone maybe explain to me the basics of LLVM’s target
> specific optimizations and code generation? I’d love to analyze
> things like this myself but I don’t know where to start.

This one specifically boils down to the semantics of maxss and LLVM IR instructions. For example, this code:

    float not_max(float x, float y) {
      return (x > y) ? x : y;
    }

    float really_max(float x, float y) {
      return !(x < y) ? x : y;
    }

compiles into this LLVM IR (llvm-gcc t.c -S -o - -O -emit-llvm):

    define float @not_max(float %x, float %y) nounwind {
    entry:
      %tmp3 = fcmp ogt float %x, %y                        ; <i1> [#uses=1]
      %iftmp.0.0 = select i1 %tmp3, float %x, float %y     ; <float> [#uses=1]
      ret float %iftmp.0.0
    }

    define float @really_max(float %x, float %y) nounwind {
    entry:
      %tmp3 = fcmp uge float %x, %y                        ; <i1> [#uses=1]
      %iftmp.1.0 = select i1 %tmp3, float %x, float %y     ; <float> [#uses=1]
      ret float %iftmp.1.0
    }

If you're interested in target-specific x86 optimizations to be done, take a look at lib/Target/X86/README*.txt

-Chris
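To make the NaN point in the reply above concrete, here is a minimal C sketch that was not part of the original message. It reuses the two functions from the reply and shows where the ordered compare (fcmp ogt) and the negated compare (fcmp uge) pick different operands. The helper demo is a hypothetical name added for illustration, and the sketch assumes IEEE 754 comparison semantics with no -ffast-math.

```c
#include <assert.h>
#include <math.h>   /* NAN, isnan */

/* Select form from the original question: ordered "greater than". */
float not_max(float x, float y) { return (x > y) ? x : y; }

/* Form from the reply: negated "less than", i.e. an unordered compare. */
float really_max(float x, float y) { return !(x < y) ? x : y; }

/* Hypothetical helper: documents where the two forms agree and differ. */
void demo(void) {
    /* Ordered inputs: both forms return the larger value. */
    assert(not_max(1.0f, 2.0f) == 2.0f);
    assert(really_max(1.0f, 2.0f) == 2.0f);
    assert(not_max(2.0f, 1.0f) == 2.0f);
    assert(really_max(2.0f, 1.0f) == 2.0f);

    /* NaN in x: (x > y) is false, so not_max returns y;
       (x < y) is false, so really_max returns x. */
    assert(not_max(NAN, 1.0f) == 1.0f);
    assert(isnan(really_max(NAN, 1.0f)));

    /* NaN in y: the same rule applies, so the results swap. */
    assert(isnan(not_max(1.0f, NAN)));
    assert(really_max(1.0f, NAN) == 1.0f);
}
```

In other words, whenever an input is NaN, the first form always yields its second operand and the second form always yields its first operand, which is why the two selects are not interchangeable in the IR.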
Hi all,

Marc pointed out to me that this might be related to my question about floating-point equality, however I still think there's a valid optimization opportunity here.

The Intel documents specify that the maxss instruction returns the second operand if either operand is a NaN. I believe this is exactly the intended behavior of (x > y) ? x : y where > is an ordered compare. If either x or y is NaN the > returns false so y is returned, the second argument.

I wrote a little test app where I compared the results of (x > y) ? x : y with some inline assembly using maxss for all combinations of 0.0, 1.0 and NaN as inputs for x and y, and they were identical.

Cheers,

Nicolas

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Nicolas Capens
Sent: Tuesday, 27 May, 2008 11:50
To: 'LLVM Developers Mailing List'
Subject: [LLVMdev] Min and max
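The comparison described in the message above can be sketched portably in C. This is not the author's test app: maxss_model is a hypothetical software stand-in for the instruction, written only from the behaviour quoted from the Intel documents (the second operand is returned if either operand is a NaN), whereas the original test used inline assembly. The other names (select_max, same_result, count_mismatches, run_check) are likewise assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>     /* NAN, isnan */
#include <stdbool.h>
#include <stddef.h>   /* size_t */

/* Hypothetical model of maxss: second operand wins if either input is NaN,
   otherwise the larger value is returned. Not the real instruction. */
float maxss_model(float a, float b) {
    if (isnan(a) || isnan(b)) return b;
    return (a > b) ? a : b;
}

/* The select form from the original question. */
float select_max(float x, float y) { return (x > y) ? x : y; }

/* Treat two NaN results as equal; otherwise compare by value. */
bool same_result(float a, float b) {
    return (isnan(a) && isnan(b)) || a == b;
}

/* Count disagreements over all pairs drawn from {0.0, 1.0, NaN}. */
int count_mismatches(void) {
    const float inputs[] = {0.0f, 1.0f, NAN};
    const size_t n = sizeof inputs / sizeof inputs[0];
    int mismatches = 0;
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            if (!same_result(select_max(inputs[i], inputs[j]),
                             maxss_model(inputs[i], inputs[j])))
                ++mismatches;
    return mismatches;
}

/* Under the model above, the two forms agree on all nine input pairs. */
void run_check(void) { assert(count_mismatches() == 0); }
```

Agreement here only shows that the select form matches the modelled rule; checking it against the actual instruction still needs inline assembly or a compiler intrinsic on an x86 target.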