thr3ads.net - similar to: "[LLVMdev] Float compare-for-equality and select optimization opportunity"

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] Float compare-for-equality and select optimization opportunity"

[LLVMdev] Float compare-for-equality and select optimizationopportunity

2008 May 27

[LLVMdev] Float compare-for-equality and select optimizationopportunity

Both ZF and PF will be set if unordered, so the code below is IEEE correct...you want to generate 'fcmp ueq' instead of 'fcmp oqe' This is the resulting x86 assembly code: movss xmm0,dword ptr [ecx+4] ucomiss xmm0,dword ptr [ecx+8] sete al setnp dl test dl,al mov edx,edi cmovne edx,ecx cmovne ecx,esi cmovne

[LLVMdev] Float compare-for-equality and select optimizationopportunity

2008 May 27

[LLVMdev] Float compare-for-equality and select optimizationopportunity

Hi Marc, I'm a bit confused. Isn't the standard compare (i.e. the one for a language like C) an ordered one? I tried converting some C code to LLVM C++ API code with the online demo, and it uses FCMP_OEQ. Cheers, Nicolas From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Marc B. Reynolds Sent: Tuesday, 27 May, 2008 14:07 To: 'LLVM

[LLVMdev] Float compare-for-equality andselect optimizationopportunity

2008 May 27

[LLVMdev] Float compare-for-equality andselect optimizationopportunity

Hi Marc, I'm a bit confused. Isn't the standard compare (i.e. the one for a language like C) an ordered one? I tried converting some C code to LLVM C++ API code with the online demo, and it uses FCMP_OEQ. No, if you have: x = NaN y = NaN then the comparison: (x == y) is false. Which is what your seeing from your first post and is the standard IEEE expected behavior.

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

2017 Apr 19

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

Changing the list from cfe-dev to llvm-dev > On 20 Apr 2017, at 4:52 AM, Michael Clark <michaeljclark at mac.com> wrote: > > I’m getting close. I think it may be an issue with an individual intrinsic. I’m looking for the X86 lowering of Instruction::FPToUI. > > I found a comment around the rationale for using a conditional move versus a branch. I believe the predicate logic

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

2017 Apr 20

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

> This seems like it was done for perf reason (mispredict). Conditional-to-cmov transformation should keep > from introducing additional observable side-effects, and it's clear that whatever did this did not account > for floating point exception. That’s a very reasonable statement, but I’m not sure it corresponds to the way we have typically approached this sort of problem. In

Ignored branch predictor hints

2018 May 09

Ignored branch predictor hints

Hello, #define likely(x) __builtin_expect((x),1) // switch like char * b(int e) { if (likely(e == 0)) return "0"; else if (e == 1) return "1"; else return "f"; } GCC correctly prefers the first case: b(int): mov eax, OFFSET FLAT:.LC0 test edi, edi jne .L7 ret But Clang seems to ignore _builtin_expect hints in this case.

Ignored branch predictor hints

2018 May 09

Ignored branch predictor hints

I did https://bugs.llvm.org/show_bug.cgi?id=37368 2018-05-09 20:33 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>: > I did > > https://bugs.llvm.org/show_bug.cgi?id=37368 > > 2018-05-09 20:29 GMT+02:00 David Zarzycki <dave at znu.io>: > >> I’d wager that the if-else chain is being converted to a "switch >> statement” during an optimization

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

On 29 August 2013 10:12, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote: > But this is another case. LLVM IR distinguishes between ordered and unordered compare and X86 backend has appropriate instructions. I think LLVM uses ordered/unordered compare to mean something different to what the x86 instructions do. For example, "not equal": fcmp une == unordered not

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

On 29 Aug 2013, at 08:19, Tim Northover <t.p.northover at gmail.com> wrote: > If so, a compare that used that instruction would have to become more > like an "invoke" with a landingpad for the exception and so on, > wouldn't it? The current fcmp can already distinguish between ordered > and unordered, because ucomiss provides that information. There are currently

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

But this is another case. LLVM IR distinguishes between ordered and unordered compare and X86 backend has appropriate instructions. But during DAG selection we just lose this information and always generate unordered fcmp. I.e. in case of ordered fcmp the vcomiss should be generated, and in case of unordered - vucomiss. - Elena -----Original Message----- From: Dr D. Chisnall [mailto:dc552 at

Ignored branch predictor hints

2018 May 09

Ignored branch predictor hints

Hi Dávid, Looks like you can defeat the switch conversion by adding a dummy asm(“”): #define likely(x) __builtin_expect((x),1) // switch like char * b(int e) { if (likely(e == 0)) return "0"; asm(""); if (e == 1) return "1"; else return "f"; } Dave > On May 9, 2018, at 2:33 PM, Dávid Bolvanský via llvm-dev

Recent -Os code size regressions

2015 Nov 21

Recent -Os code size regressions

On Fri, Nov 20, 2015 at 5:06 PM, James Molloy <james at jamesmolloy.co.uk> wrote: > > Hi, > > We'd need to look precisely at what's causing the code size bloat. The midend commit pointed out by Steve shouldn't cause bloat in and of itself - it should reduce code size. It removes a load of stores and branches. > > I know a backend change I made to ARM isn't

[LLVMdev] VCOMISS instruction in X86

2013 May 20

[LLVMdev] VCOMISS instruction in X86

Hi, I'm looking at scalar and packed instructions in X86. The instruction VCOMISS is scalar. May I remove SSEPackedSingle/SSEPackedDouble domain from it? defm VUCOMISS : sse12_ord_cmp<0x2E, FR32, X86cmp, f32, f32mem, loadf32, "ucomiss", SSEPackedSingle>, TB, VEX, VEX_LIG; defm VUCOMISD : sse12_ord_cmp<0x2E, FR64, X86cmp, f64,

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

2013 Aug 29

[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86

Should I open a ticket for this? - Elena From: Eli Friedman [mailto:eli.friedman at gmail.com] Sent: Wednesday, August 28, 2013 19:51 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Ordered / Unordered FP compare are not handled properly on X86 On Wed, Aug 28, 2013 at 2:16 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com<mailto:elena.demikhovsky at

[LLVMdev] Aliasing confusion

2011 Oct 07

[LLVMdev] Aliasing confusion

Hi all, I'm having trouble understanding how llvm determines if pointers alias. Consider the following two functions that each do a redundant load: define float @A(float * noalias %ptr1) { %ptr2 = getelementptr float* %ptr1, i32 1024 %val1a = load float* %ptr1 store float %val1a, float* %ptr2 %val1b = load float* %ptr1 ret float %val1b } define float @B(float * noalias %ptr1,

[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets

2014 Oct 13

[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets

Hello, Depending on how I extract integer lanes from an x86_64 xmm register, the backend may spill that register in order to load scalars. The effect was observed on two targets: corei7-avx and btver1 (I haven't checked other targets). Here's a test case with spilling/no-spilling code put on conditional compile: #if __SSE4_1__ != 0 #include <smmintrin.h> #else #include

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

2015 Jul 29

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

When I compile attached IR with LLVM 3.6 llc -march=x86-64 -o f.S f.ll it generates an aligned ADDPS with unaligned address. See attached f.S, here an extract: addq $12, %r9 # $12 is not a multiple of 4, thus for xmm0 this is unaligned xorl %esi, %esi .align 16, 0x90 .LBB0_1: # %loop2

[LLVMdev] Aliasing confusion

2011 Oct 07

[LLVMdev] Aliasing confusion

On Fri, Oct 7, 2011 at 2:15 PM, andrew adams <andrew.b.adams at gmail.com> wrote: > Hi all, > > I'm having trouble understanding how llvm determines if pointers > alias. Consider the following two functions that each do a redundant > load: > > define float @A(float * noalias %ptr1) { > %ptr2 = getelementptr float* %ptr1, i32 1024 > %val1a = load float*

[LLVMdev] JIT compiled intrinsics calls is call to null pointer

2013 Sep 18

[LLVMdev] JIT compiled intrinsics calls is call to null pointer

Hi everyone, I am trying to call an LLVM intrinsic (llvm.pow.f32), inserted with the following call: std::vector<llvm::Type *> arg_types;arg_types.push_back(llvm::Type::getFloatTy(context));auto function=llvm::Intrinsic::getDeclaration(module, llvm::Intrinsic::pow, arg_types);auto result=ir_builder->CreateCall(function, args); When I try to execute the code generated by the JIT

FENV_ACCESS and floating point LibFunc calls

2017 May 11

FENV_ACCESS and floating point LibFunc calls

Sounds like the select lowering issue is definitely separate from the FENV work. Is there a bug report with a C or IR example? You want to generate compare and branch instead of a cmov for something like this? int foo(float x) { if (x < 42.0f) return x; return 12; } define i32 @foo(float %x) { %cmp = fcmp olt float %x, 4.200000e+01 %conv = fptosi float %x to i32 %ret = select

similar to: [LLVMdev] Float compare-for-equality and select optimization opportunity