thr3ads.net - search: "ucomiss"

[LLVMdev] Float compare-for-equality and select optimization opportunity

2008 May 27

3

...C syntax the code looks like this:

    float x, y;
    int a, b, c;
    if(x == y) // Rotate the integers
    {
        int t;
        t = a; a = b; b = c; c = t;
    }

This is the resulting x86 assembly code:

    movss xmm0,dword ptr [ecx+4]
    ucomiss xmm0,dword ptr [ecx+8]
    sete al
    setnp dl
    test dl,al
    mov edx,edi
    cmovne edx,ecx
    cmovne ecx,esi
    cmovne esi,edi

While I'm pleasantly surprised that my branch does get turned into several select operations as intended (cmov - condi...

[LLVMdev] Float compare-for-equality and select optimization opportunity

2008 May 27

1

Both ZF and PF will be set if unordered, so the code below is IEEE correct...you want to generate 'fcmp ueq' instead of 'fcmp oeq'

This is the resulting x86 assembly code:

    movss xmm0,dword ptr [ecx+4]
    ucomiss xmm0,dword ptr [ecx+8]
    sete al
    setnp dl
    test dl,al
    mov edx,edi
    cmovne edx,ecx
    cmovne ecx,esi
    cmovne esi,edi

While I'm pleasantly surprised that my branch does get turned into several select operations as intended (cmov - condi...
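For readers less familiar with the two predicates named in this reply, the following is a minimal C sketch of what 'fcmp oeq' and 'fcmp ueq' mean at the source level. The helper names cmp_oeq and cmp_ueq are hypothetical and appear nowhere in the thread; the sketch assumes IEEE semantics (no -ffast-math).

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: ordered-equal, as in LLVM `fcmp oeq`.
 * True only when neither operand is NaN and the values are equal. */
static bool cmp_oeq(float x, float y) {
    return x == y;
}

/* Hypothetical helper: unordered-or-equal, as in LLVM `fcmp ueq`.
 * islessgreater() is false for equal and for unordered operands,
 * so its negation is true when x == y or either operand is NaN. */
static bool cmp_ueq(float x, float y) {
    return !islessgreater(x, y);
}

/* Small self-check of the NaN cases discussed in the thread. */
void check_predicates(void) {
    assert(cmp_oeq(1.0f, 1.0f));
    assert(!cmp_oeq(NAN, NAN));   /* ordered compare: NaN never equals NaN */
    assert(cmp_ueq(NAN, NAN));    /* unordered compare: NaN counts as a match */
}
```

The ordered form is what C's == requires, which is presumably why the generated code needs both sete and setnp rather than sete alone.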

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

2017 Apr 19

3

...figure out what the cast gets renamed to in the target layer so I can find where the sequence is emitted.

> $ more llvm/lib/Target/X86//README-X86-64.txt
> …
> Are we better off using branches instead of cmove to implement FP to
> unsigned i64?
>
> _conv:
>     ucomiss LC0(%rip), %xmm0
>     cvttss2siq %xmm0, %rdx
>     jb L3
>     subss LC0(%rip), %xmm0
>     movabsq $-9223372036854775808, %rax
>     cvttss2siq %xmm0, %rdx
>     xorq %rax, %rdx
> L3:
>     movq %rdx, %rax
>     ret...
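The branch-based sequence quoted from the README can be read as the following C sketch. This is an interpretation, not code from the thread: it assumes the constant LC0 is 2^63 (consistent with the movabsq/xorq of the sign bit), the function name float_to_u64_branchy is hypothetical, and inputs are assumed to lie in [0, 2^64).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of a branching float -> unsigned 64-bit conversion.
 * Assumption: the README's LC0 constant is 2^63. */
static uint64_t float_to_u64_branchy(float x) {
    const float two63 = 0x1p63f;  /* 2^63, exactly representable as a float */

    /* Values below 2^63 fit the signed conversion (cvttss2siq) directly. */
    if (x < two63)
        return (uint64_t)(int64_t)x;

    /* Otherwise shift into signed range, convert, then restore the top bit,
     * mirroring the subss / cvttss2siq / xorq steps. */
    return (uint64_t)(int64_t)(x - two63) ^ UINT64_C(0x8000000000000000);
}

/* Small self-check around the 2^63 boundary. */
void check_conversion(void) {
    assert(float_to_u64_branchy(1.5f) == 1);
    assert(float_to_u64_branchy(0x1p63f) == UINT64_C(0x8000000000000000));
}
```

Because only one of the two conversions runs for a given input, a sequence of this shape avoids converting a value that is out of range for the signed instruction, which appears to be the concern behind the FE_INEXACT report.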

[LLVMdev] Float compare-for-equality and select optimization opportunity

2008 May 27

1

...demo, and it uses FCMP_OEQ.

No, if you have: x = NaN y = NaN then the comparison: (x == y) is false. Which is what you're seeing from your first post and is the standard IEEE expected behavior.

Why I expected your min/max question to be related, consider the flags of 'comiss, ucomiss, etc.':

                  Z P C
    unordered     1 1 1
    greater than  0 0 0
    less than     0 0 1
    equal         1 0 0

Try the following C program with gcc, first with no options and then with -ffinite-math-only (or -ffast-math) ----------- #define STR(X)...
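The flag table above can be modelled in portable C to see why equality needs two flag tests. This is an illustrative sketch only: the struct fp_flags, model_ucomiss and the two condition helpers are hypothetical names, and the model simply restates the table rather than reading the real EFLAGS register.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical model of the ZF/PF/CF outcome of comparing a with b. */
struct fp_flags { bool zf, pf, cf; };

static struct fp_flags model_ucomiss(float a, float b) {
    struct fp_flags f = { false, false, false };  /* greater than: 000 */
    if (isunordered(a, b)) {
        f.zf = f.pf = f.cf = true;                /* unordered:    111 */
    } else if (a < b) {
        f.cf = true;                              /* less than:    001 */
    } else if (a == b) {
        f.zf = true;                              /* equal:        100 */
    }
    return f;
}

/* sete combined with setnp: ZF set and PF clear, i.e. equal and ordered. */
static bool cond_ordered_equal(struct fp_flags f) { return f.zf && !f.pf; }

/* sete alone: also true for NaN operands, because ZF is set when unordered. */
static bool cond_sete_only(struct fp_flags f) { return f.zf; }

/* Small self-check: the two conditions differ only for unordered operands. */
void check_flags(void) {
    assert(cond_ordered_equal(model_ucomiss(2.0f, 2.0f)));
    assert(!cond_ordered_equal(model_ucomiss(NAN, NAN)));
    assert(cond_sete_only(model_ucomiss(NAN, NAN)));
}
```

Under finite-math assumptions the unordered row can never occur, so a compiler is free to drop the parity test and keep only the ZF test.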

[LLVMdev] Float compare-for-equality and select optimization opportunity

2008 May 27

0

...ev] Float compare-for-equality and select optimization opportunity

Both ZF and PF will be set if unordered, so the code below is IEEE correct...you want to generate 'fcmp ueq' instead of 'fcmp oeq'

This is the resulting x86 assembly code:

    movss xmm0,dword ptr [ecx+4]
    ucomiss xmm0,dword ptr [ecx+8]
    sete al
    setnp dl
    test dl,al
    mov edx,edi
    cmovne edx,ecx
    cmovne ecx,esi
    cmovne esi,edi

While I'm pleasantly surprised that my branch does get turned into several select operations as intended (cmov - condi...

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

1

...e used the comiss instruction (and exceptions weren't masked) then we'd take an exception if either A or B was a NaN. Look at the C file I've attached, where I start by unmasking exceptions and then perform what's essentially an "one" comparison. LLVM implements this with ucomiss, and gets the result correct. If you change this to "comiss" then instead the program terminates with a floating point exception. So there seem to be 2 cases: 1. Exception is masked. ucomiss is identical to comiss, why bother complicating things by emitting comiss? 2. Exception is not ma...

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

2

...t.p.northover at gmail.com> wrote:
> If so, a compare that used that instruction would have to become more
> like an "invoke" with a landingpad for the exception and so on,
> wouldn't it? The current fcmp can already distinguish between ordered
> and unordered, because ucomiss provides that information.

There are currently lots of limitations in the expressiveness of LLVM IR for floating point operations (e.g. distinguishing between trapping and non-trapping cases and representing the floating point environment). If anyone wants to fully implement the floating point pa...

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

2017 Apr 20

4

...ToUI in llvm/lib/Target/X86 so I’m trying to figure out what the cast gets renamed to in the target layer so I can find where the sequence is emitted.

    $ more llvm/lib/Target/X86//README-X86-64.txt
    …
    Are we better off using branches instead of cmove to implement FP to unsigned i64?

    _conv:
        ucomiss LC0(%rip), %xmm0
        cvttss2siq %xmm0, %rdx
        jb L3
        subss LC0(%rip), %xmm0
        movabsq $-9223372036854775808, %rax
        cvttss2siq %xmm0, %rdx
        xorq %rax, %rdx
    L3:
        movq %rdx, %rax
        ret

    instead of

    _conv:
        movss LCPI1_0(%r...

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

0

...nother case. LLVM IR distinguishes between ordered and unordered compare and X86 backend has appropriate instructions. But during DAG selection we just lose this information and always generate unordered fcmp. I.e. in case of ordered fcmp the vcomiss should be generated, and in case of unordered - vucomiss. - Elena -----Original Message----- From: Dr D. Chisnall [mailto:dc552 at hermes.cam.ac.uk] On Behalf Of David Chisnall Sent: Thursday, August 29, 2013 10:50 To: Tim Northover Cc: Demikhovsky, Elena; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Ordered / Unordered FP compare are not handled p...

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

0

On 29 August 2013 06:31, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> Should I open a ticket for this?

I think he was saying this is intended behaviour. Isn't the difference between ucomiss and comiss just whether an exception is raised for NaN? If so, a compare that used that instruction would have to become more like an "invoke" with a landingpad for the exception and so on, wouldn't it? The current fcmp can already distinguish between ordered and unordered, because u...

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

2

...ed / Unordered FP compare are not handled properly on X86

On Wed, Aug 28, 2013 at 2:16 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> I found that there is no diff in code generator for Ordered / Unordered FP compare instructions. FUCOMISS, FUCOMISD are generated in both cases.

Yes. That's how fcmp is defined in LangRef.

-Eli

[LLVMdev] VCOMISS instruction in X86

2013 May 20

2

Hi, I'm looking at scalar and packed instructions in X86. The instruction VCOMISS is scalar. May I remove SSEPackedSingle/SSEPackedDouble domain from it?

    defm VUCOMISS : sse12_ord_cmp<0x2E, FR32, X86cmp, f32, f32mem, loadf32,
                                  "ucomiss", SSEPackedSingle>, TB, VEX, VEX_LIG;
    defm VUCOMISD : sse12_ord_cmp<0x2E, FR64, X86cmp, f64, f64mem, loadf64,
                                  "ucomisd", SSEPackedDou...

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

2017 Apr 21

2

...ToUI in llvm/lib/Target/X86 so I’m trying to figure out what the cast gets renamed to in the target layer so I can find where the sequence is emitted.

    $ more llvm/lib/Target/X86//README-X86-64.txt
    …
    Are we better off using branches instead of cmove to implement FP to unsigned i64?

    _conv:
        ucomiss LC0(%rip), %xmm0
        cvttss2siq %xmm0, %rdx
        jb L3
        subss LC0(%rip), %xmm0
        movabsq $-9223372036854775808, %rax
        cvttss2siq %xmm0, %rdx
        xorq %rax, %rdx
    L3:
        movq %rdx, %rax
        ret

    instead of

    _conv:
        movss LCPI1_0(%r...

[LLVMdev] Crash when using InstallLazyFunctionCreator and JIT on Linux x64.

2009 Jan 13

0

...sembled JIT'd function looks like this:

    0x00007f45ef2b6018: sub $0x8,%rsp
    0x00007f45ef2b601c: mov $0x7f45ef2b6010,%rax
    0x00007f45ef2b6026: movss (%rax,%riz,1),%xmm0
    0x00007f45ef2b602b: movss %xmm0,0x4(%rsp)
    0x00007f45ef2b6031: callq 0x7f46005ce64e
    0x00007f45ef2b6036: ucomiss 0x4(%rsp),%xmm0
    0x00007f45ef2b603b: setnp %cl
    0x00007f45ef2b603e: sete %al
    0x00007f45ef2b6041: and %cl,%al
    0x00007f45ef2b6043: add $0x8,%rsp
    0x00007f45ef2b6047: retq

As you can see, the upper 32 bits of the function address that the function is making a call to are incor...

FENV_ACCESS and floating point LibFunc calls

2017 May 11

2

....0f) return x; return 12; }

    define i32 @foo(float %x) {
      %cmp = fcmp olt float %x, 4.200000e+01
      %conv = fptosi float %x to i32
      %ret = select i1 %cmp, i32 %conv, i32 12
      ret i32 %ret
    }

    $ clang -O2 cmovfp.c -S -o -
    movss LCPI0_0(%rip), %xmm1 ## xmm1 = mem[0],zero,zero,zero
    ucomiss %xmm0, %xmm1
    cvttss2si %xmm0, %ecx
    movl $12, %eax
    cmoval %ecx, %eax
    retq

On Thu, May 11, 2017 at 1:28 PM, Kaylor, Andrew via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi Michael,
>
> To be honest I haven’t started working on FP to integ...

FENV_ACCESS and floating point LibFunc calls

2017 May 11

3

...2 @foo(float %x) {
> %cmp = fcmp olt float %x, 4.200000e+01
> %conv = fptosi float %x to i32
> %ret = select i1 %cmp, i32 %conv, i32 12
> ret i32 %ret
> }
>
> $ clang -O2 cmovfp.c -S -o -
> movss LCPI0_0(%rip), %xmm1 ## xmm1 = mem[0],zero,zero,zero
> ucomiss %xmm0, %xmm1
> cvttss2si %xmm0, %ecx
> movl $12, %eax
> cmoval %ecx, %eax
> retq
>
> On Thu, May 11, 2017 at 1:28 PM, Kaylor, Andrew via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > Hi Michael,
> >
> > To be h...

FENV_ACCESS and floating point LibFunc calls

2017 May 11

2

Hi Andy, I’m interested to try out your patches… I understand the scope of FENV_ACCESS is relatively wide, however I’m still curious if you managed to figure out how to prevent the SelectionDAGLegalize::ExpandNode() FP_TO_UINT lowering of the FPToUI intrinsic from producing the predicate logic that incorrectly sets the floating point accrued exceptions due to unconditional execution of the