thr3ads.net - llvm dev - [LLVMdev] Min and max [May 2008]

Nicolas Capens

2008-May-27 09:49 UTC

[LLVMdev] Min and max

Hi all,

I'm trying to implement floating-point 'min' and 'max' operations using
select. For 'min' I get the expected x86 assembly minss instruction, but for
'max' I get a branch instead of maxss.

The corresponding C code looks like this:

float z = (x > y) ? x : y;

Any clues?

Could someone maybe explain to me the basics of LLVM's target-specific
optimizations and code generation? I'd love to analyze things like this
myself but I don't know where to start.

Thanks,

Nicolas Capens

Chris Lattner

2008-May-28 03:54 UTC

[LLVMdev] Min and max

On May 27, 2008, at 2:49 AM, Nicolas Capens wrote:
> Hi all,
>
> I’m trying to implement a floating-point ‘min’ and ‘max’ operation  
> using select. For ‘min’ I get the expected x86 assembly minss  
> instruction, but for ‘max’ I get a branch instead of maxss.
>
> The corresponding C syntax code looks like this:
>
> float z = (x > y) ? x : y;
>
> Any clues?

Your code is not safe for NaNs.  This is the correct way to write
maxss in C:

float max(float x, float y) {
   return !(x < y) ? x : y;
}
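To make the NaN difference concrete, here is a minimal sketch that is not part of the original post. The names select_gt, select_not_lt and check_nan_behavior are made up for illustration, and the checks assume IEEE 754 comparison semantics, so the file should be built without -ffast-math:

```c
#include <assert.h>
#include <math.h>

/* The form from the original question: a plain greater-than compare. */
float select_gt(float x, float y) {
    return (x > y) ? x : y;
}

/* The form suggested in the reply: a negated less-than compare. */
float select_not_lt(float x, float y) {
    return !(x < y) ? x : y;
}

/* Checks that the two forms agree on ordinary inputs and differ on NaN. */
void check_nan_behavior(void) {
    const float nan_value = NAN;  /* quiet NaN constant from <math.h> */

    /* Ordinary inputs: both forms return the larger value. */
    assert(select_gt(1.0f, 2.0f) == 2.0f);
    assert(select_not_lt(1.0f, 2.0f) == 2.0f);

    /* Every ordered comparison with NaN is false, so (x > y) selects y. */
    assert(select_gt(nan_value, 1.0f) == 1.0f);
    assert(isnan(select_gt(1.0f, nan_value)));

    /* !(x < y) is therefore true with a NaN operand and selects x. */
    assert(isnan(select_not_lt(nan_value, 1.0f)));
    assert(select_not_lt(1.0f, nan_value) == 1.0f);
}
```

Calling check_nan_behavior() from main should pass silently: with a NaN operand the first form always yields its second argument and the second form always yields its first.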

If you don't care about NaNs, you can pass -ffast-math to llvm-gcc, or
set "UnsafeFPMath=true" from <llvm/Target/TargetOptions.h>.

> Could someone maybe explain to me the basics of LLVM’s target
> specific optimizations and code generation? I’d love to analyze
> things like this myself but I don’t know where to start.

This one specifically boils down to the semantics of maxss and LLVM IR
instructions.  For example, this code:

float not_max(float x, float y) {
   return (x > y) ? x : y;
}

float really_max(float x, float y) {
   return !(x < y) ? x : y;
}

compiles into this LLVM IR (llvm-gcc t.c -S -o - -O -emit-llvm):

define float @not_max(float %x, float %y) nounwind  {
entry:
	%tmp3 = fcmp ogt float %x, %y		; <i1> [#uses=1]
	%iftmp.0.0 = select i1 %tmp3, float %x, float %y		; <float> [#uses=1]
	ret float %iftmp.0.0
}

define float @really_max(float %x, float %y) nounwind  {
entry:
	%tmp3 = fcmp uge float %x, %y		; <i1> [#uses=1]
	%iftmp.1.0 = select i1 %tmp3, float %x, float %y		; <float> [#uses=1]
	ret float %iftmp.1.0
}
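The only difference between the two functions is the fcmp predicate: "ogt" is ordered (false if either operand is NaN) while "uge" is unordered (true if either operand is NaN). As a rough sketch, assuming IEEE 754 semantics and no -ffast-math, the two predicates can be modelled in C as follows; fcmp_ogt, fcmp_uge and check_predicates are illustrative names, not LLVM APIs:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Model of "fcmp ogt": true only if neither operand is NaN and x > y. */
bool fcmp_ogt(float x, float y) {
    return !isunordered(x, y) && x > y;
}

/* Model of "fcmp uge": true if either operand is NaN, or if x >= y. */
bool fcmp_uge(float x, float y) {
    return isunordered(x, y) || x >= y;
}

/* Checks each model against the C expression it was compiled from. */
void check_predicates(void) {
    const float samples[] = { 0.0f, 1.0f, -1.0f, NAN };
    const int count = (int)(sizeof samples / sizeof samples[0]);

    for (int i = 0; i < count; ++i) {
        for (int j = 0; j < count; ++j) {
            float x = samples[i], y = samples[j];
            assert(fcmp_ogt(x, y) == (x > y));   /* not_max    */
            assert(fcmp_uge(x, y) == !(x < y));  /* really_max */
        }
    }
}
```

The models agree with (x > y) and !(x < y) on every sample pair, which is why the front end can emit a single fcmp for each source expression.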

If you're interested in target-specific x86 optimizations that are still
to be done, take a look at lib/Target/X86/README*.txt

-Chris
Nicolas Capens

2008-May-28 08:03 UTC

[LLVMdev] Min and max

Hi all,

Marc pointed out to me that this might be related to my question about
floating-point equality; however, I still think there's a valid optimization
opportunity here.

The Intel documentation specifies that the maxss instruction returns the
second operand if either operand is a NaN. I believe this is exactly the
intended behavior of (x > y) ? x : y, where > is an ordered compare: if
either x or y is NaN, the > returns false, so y, the second argument, is
returned.

 

I wrote a little test app where I compared the results of (x > y) ? x : y
with some inline assembly using maxss for all combinations of 0.0, 1.0 and
NaN as inputs for x and y, and they were identical.
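A portable sketch of that kind of test is shown below. It is not the original test app: instead of inline assembly it uses maxss_model, a hypothetical C model written from the rules in the Intel documentation (a NaN in either operand, or two zero operands, yields the second operand), and the other names are made up as well. It assumes IEEE 754 semantics, so no -ffast-math:

```c
#include <assert.h>
#include <math.h>
#include <string.h>

/* Hypothetical model of "maxss dst, src", written from the documented
   rules rather than by executing the instruction. */
float maxss_model(float dst, float src) {
    if (isnan(dst) || isnan(src)) return src;    /* NaN: second operand  */
    if (dst == 0.0f && src == 0.0f) return src;  /* zeros: second operand */
    return (dst > src) ? dst : src;
}

/* The select form from the original question. */
float ternary_max(float x, float y) {
    return (x > y) ? x : y;
}

/* Bitwise equality, so that NaN results can be compared at all. */
int same_bits(float a, float b) {
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Returns the number of (x, y) pairs on which the two functions disagree. */
int count_mismatches(void) {
    const float inputs[] = { 0.0f, 1.0f, NAN };
    int mismatches = 0;

    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            if (!same_bits(maxss_model(inputs[i], inputs[j]),
                           ternary_max(inputs[i], inputs[j])))
                ++mismatches;
    return mismatches;
}

/* Fails an assertion if any of the nine input pairs disagrees. */
void check_against_model(void) {
    assert(count_mismatches() == 0);
}
```

This only shows that the select form matches the model on these nine input pairs; confirming it against real hardware would still need the inline assembly or an SSE intrinsic.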

 

Cheers,

Nicolas

 
