similar to: [LLVMdev] Float compare-for-equality and select optimization opportunity

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] Float compare-for-equality and select optimization opportunity"

2008 May 27
1
[LLVMdev] Float compare-for-equality and select optimizationopportunity
Both ZF and PF will be set if unordered, so the code below is IEEE correct...you want to generate 'fcmp ueq' instead of 'fcmp oqe' This is the resulting x86 assembly code: movss xmm0,dword ptr [ecx+4] ucomiss xmm0,dword ptr [ecx+8] sete al setnp dl test dl,al mov edx,edi cmovne edx,ecx cmovne ecx,esi cmovne
2008 May 27
0
[LLVMdev] Float compare-for-equality and select optimizationopportunity
Hi Marc, I'm a bit confused. Isn't the standard compare (i.e. the one for a language like C) an ordered one? I tried converting some C code to LLVM C++ API code with the online demo, and it uses FCMP_OEQ. Cheers, Nicolas From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Marc B. Reynolds Sent: Tuesday, 27 May, 2008 14:07 To: 'LLVM
2008 May 27
1
[LLVMdev] Float compare-for-equality andselect optimizationopportunity
Hi Marc, I'm a bit confused. Isn't the standard compare (i.e. the one for a language like C) an ordered one? I tried converting some C code to LLVM C++ API code with the online demo, and it uses FCMP_OEQ. No, if you have: x = NaN y = NaN then the comparison: (x == y) is false. Which is what your seeing from your first post and is the standard IEEE expected behavior.
2017 Apr 19
3
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
Changing the list from cfe-dev to llvm-dev > On 20 Apr 2017, at 4:52 AM, Michael Clark <michaeljclark at mac.com> wrote: > > I’m getting close. I think it may be an issue with an individual intrinsic. I’m looking for the X86 lowering of Instruction::FPToUI. > > I found a comment around the rationale for using a conditional move versus a branch. I believe the predicate logic
2017 Apr 20
4
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
> This seems like it was done for perf reason (mispredict). Conditional-to-cmov transformation should keep > from introducing additional observable side-effects, and it's clear that whatever did this did not account > for floating point exception. That’s a very reasonable statement, but I’m not sure it corresponds to the way we have typically approached this sort of problem. In
2018 May 09
3
Ignored branch predictor hints
Hello, #define likely(x) __builtin_expect((x),1) // switch like char * b(int e) { if (likely(e == 0)) return "0"; else if (e == 1) return "1"; else return "f"; } GCC correctly prefers the first case: b(int): mov eax, OFFSET FLAT:.LC0 test edi, edi jne .L7 ret But Clang seems to ignore _builtin_expect hints in this case.
2018 May 09
0
Ignored branch predictor hints
I did https://bugs.llvm.org/show_bug.cgi?id=37368 2018-05-09 20:33 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>: > I did > > https://bugs.llvm.org/show_bug.cgi?id=37368 > > 2018-05-09 20:29 GMT+02:00 David Zarzycki <dave at znu.io>: > >> I’d wager that the if-else chain is being converted to a "switch >> statement” during an optimization
2013 Aug 29
1
[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86
On 29 August 2013 10:12, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote: > But this is another case. LLVM IR distinguishes between ordered and unordered compare and X86 backend has appropriate instructions. I think LLVM uses ordered/unordered compare to mean something different to what the x86 instructions do. For example, "not equal": fcmp une == unordered not
2013 Aug 29
2
[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86
On 29 Aug 2013, at 08:19, Tim Northover <t.p.northover at gmail.com> wrote: > If so, a compare that used that instruction would have to become more > like an "invoke" with a landingpad for the exception and so on, > wouldn't it? The current fcmp can already distinguish between ordered > and unordered, because ucomiss provides that information. There are currently
2013 Aug 29
0
[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86
But this is another case. LLVM IR distinguishes between ordered and unordered compare and X86 backend has appropriate instructions. But during DAG selection we just lose this information and always generate unordered fcmp. I.e. in case of ordered fcmp the vcomiss should be generated, and in case of unordered - vucomiss. - Elena -----Original Message----- From: Dr D. Chisnall [mailto:dc552 at
2018 May 09
2
Ignored branch predictor hints
Hi Dávid, Looks like you can defeat the switch conversion by adding a dummy asm(“”): #define likely(x) __builtin_expect((x),1) // switch like char * b(int e) { if (likely(e == 0)) return "0"; asm(""); if (e == 1) return "1"; else return "f"; } Dave > On May 9, 2018, at 2:33 PM, Dávid Bolvanský via llvm-dev
2015 Nov 21
2
Recent -Os code size regressions
On Fri, Nov 20, 2015 at 5:06 PM, James Molloy <james at jamesmolloy.co.uk> wrote: > > Hi, > > We'd need to look precisely at what's causing the code size bloat. The midend commit pointed out by Steve shouldn't cause bloat in and of itself - it should reduce code size. It removes a load of stores and branches. > > I know a backend change I made to ARM isn't
2013 May 20
2
[LLVMdev] VCOMISS instruction in X86
Hi, I'm looking at scalar and packed instructions in X86. The instruction VCOMISS is scalar. May I remove SSEPackedSingle/SSEPackedDouble domain from it? defm VUCOMISS : sse12_ord_cmp<0x2E, FR32, X86cmp, f32, f32mem, loadf32, "ucomiss", SSEPackedSingle>, TB, VEX, VEX_LIG; defm VUCOMISD : sse12_ord_cmp<0x2E, FR64, X86cmp, f64,
2013 Aug 29
2
[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86
Should I open a ticket for this? - Elena From: Eli Friedman [mailto:eli.friedman at gmail.com] Sent: Wednesday, August 28, 2013 19:51 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Ordered / Unordered FP compare are not handled properly on X86 On Wed, Aug 28, 2013 at 2:16 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com<mailto:elena.demikhovsky at
2011 Oct 07
2
[LLVMdev] Aliasing confusion
Hi all, I'm having trouble understanding how llvm determines if pointers alias. Consider the following two functions that each do a redundant load: define float @A(float * noalias %ptr1) {   %ptr2 = getelementptr float* %ptr1, i32 1024   %val1a = load float* %ptr1   store float %val1a, float* %ptr2   %val1b = load float* %ptr1   ret float %val1b } define float @B(float * noalias %ptr1,
2014 Oct 13
2
[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets
Hello, Depending on how I extract integer lanes from an x86_64 xmm register, the backend may spill that register in order to load scalars. The effect was observed on two targets: corei7-avx and btver1 (I haven't checked other targets). Here's a test case with spilling/no-spilling code put on conditional compile: #if __SSE4_1__ != 0 #include <smmintrin.h> #else #include
2015 Jul 29
2
[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address
When I compile attached IR with LLVM 3.6 llc -march=x86-64 -o f.S f.ll it generates an aligned ADDPS with unaligned address. See attached f.S, here an extract: addq $12, %r9 # $12 is not a multiple of 4, thus for xmm0 this is unaligned xorl %esi, %esi .align 16, 0x90 .LBB0_1: # %loop2
2011 Oct 07
0
[LLVMdev] Aliasing confusion
On Fri, Oct 7, 2011 at 2:15 PM, andrew adams <andrew.b.adams at gmail.com> wrote: > Hi all, > > I'm having trouble understanding how llvm determines if pointers > alias. Consider the following two functions that each do a redundant > load: > > define float @A(float * noalias %ptr1) { >   %ptr2 = getelementptr float* %ptr1, i32 1024 >   %val1a = load float*
2013 Sep 18
2
[LLVMdev] JIT compiled intrinsics calls is call to null pointer
Hi everyone, I am trying to call an LLVM intrinsic (llvm.pow.f32), inserted with the following call: std::vector<llvm::Type *> arg_types;arg_types.push_back(llvm::Type::getFloatTy(context));auto function=llvm::Intrinsic::getDeclaration(module, llvm::Intrinsic::pow, arg_types);auto result=ir_builder->CreateCall(function, args); When I try to execute the code generated by the JIT
2017 May 11
2
FENV_ACCESS and floating point LibFunc calls
Sounds like the select lowering issue is definitely separate from the FENV work. Is there a bug report with a C or IR example? You want to generate compare and branch instead of a cmov for something like this? int foo(float x) { if (x < 42.0f) return x; return 12; } define i32 @foo(float %x) { %cmp = fcmp olt float %x, 4.200000e+01 %conv = fptosi float %x to i32 %ret = select