Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] Float compare-for-equality and select optimization opportunity"
2008 May 27
1
[LLVMdev] Float compare-for-equality and select optimization opportunity
Both ZF and PF will be set if unordered, so the code below is IEEE
correct... you want to generate 'fcmp ueq' instead of 'fcmp oeq'.
This is the resulting x86 assembly code:
movss xmm0,dword ptr [ecx+4]
ucomiss xmm0,dword ptr [ecx+8]
sete al
setnp dl
test dl,al
mov edx,edi
cmovne edx,ecx
cmovne ecx,esi
cmovne
2008 May 27
0
[LLVMdev] Float compare-for-equality and select optimization opportunity
Hi Marc,
I'm a bit confused. Isn't the standard compare (i.e. the one for a language
like C) an ordered one? I tried converting some C code to LLVM C++ API code
with the online demo, and it uses FCMP_OEQ.
Cheers,
Nicolas
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Marc B. Reynolds
Sent: Tuesday, 27 May, 2008 14:07
To: 'LLVM
2008 May 27
1
[LLVMdev] Float compare-for-equality and select optimization opportunity
> Hi Marc,
> I'm a bit confused. Isn't the standard compare (i.e. the one for a language
> like C) an ordered one? I tried converting some C code to LLVM C++ API code
> with the online demo, and it uses FCMP_OEQ.
No, if you have:
x = NaN
y = NaN
then the comparison:
(x == y) is false.
Which is what you're seeing from your first post and is the standard IEEE
expected behavior.
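The NaN behaviour described above can be checked with a small C sketch. The helper names `fcmp_oeq`, `fcmp_ueq` and `check_nan_semantics` are hypothetical, chosen only to echo the LLVM predicate names; they are not part of any LLVM API. The sketch assumes IEEE 754 floats and a build without `-ffast-math` (which lets the compiler assume NaN never occurs).

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Ordered equality: true only if neither operand is NaN and the values are
   equal. This is what C's == does, and what 'fcmp oeq' means in LLVM IR. */
static bool fcmp_oeq(float x, float y) {
    return x == y;
}

/* Unordered-or-equal: true if either operand is NaN, or the values are equal.
   This is the meaning of 'fcmp ueq' in LLVM IR. */
static bool fcmp_ueq(float x, float y) {
    return isunordered(x, y) || x == y;
}

/* Illustrative self-check (hypothetical helper): the two predicates differ
   only when a NaN is involved. */
void check_nan_semantics(void) {
    float x = NAN, y = NAN;
    assert(!(x == y));          /* NaN never compares equal, even to itself */
    assert(!fcmp_oeq(x, y));    /* ordered compare rejects NaN operands */
    assert(fcmp_ueq(x, y));     /* unordered compare accepts them */
    assert(fcmp_oeq(1.0f, 1.0f) && fcmp_ueq(1.0f, 1.0f));
}
```

Calling `check_nan_semantics()` from a test program should pass silently; the two predicates agree for all non-NaN inputs.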
2017 Apr 19
3
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
Changing the list from cfe-dev to llvm-dev
> On 20 Apr 2017, at 4:52 AM, Michael Clark <michaeljclark at mac.com> wrote:
>
> I’m getting close. I think it may be an issue with an individual intrinsic. I’m looking for the X86 lowering of Instruction::FPToUI.
>
> I found a comment around the rationale for using a conditional move versus a branch. I believe the predicate logic
2017 Apr 20
4
[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
> This seems like it was done for perf reason (mispredict). Conditional-to-cmov transformation should keep
> from introducing additional observable side-effects, and it's clear that whatever did this did not account
> for floating point exception.
That’s a very reasonable statement, but I’m not sure it corresponds to the way we have typically approached this sort of problem.
In
2018 May 09
3
Ignored branch predictor hints
Hello,
#define likely(x) __builtin_expect((x),1)
// switch like
char * b(int e) {
    if (likely(e == 0))
        return "0";
    else if (e == 1)
        return "1";
    else return "f";
}
GCC correctly prefers the first case:
b(int):
mov eax, OFFSET FLAT:.LC0
test edi, edi
jne .L7
ret
But Clang seems to ignore __builtin_expect hints in this case.
2018 May 09
0
Ignored branch predictor hints
I did
https://bugs.llvm.org/show_bug.cgi?id=37368
2018-05-09 20:33 GMT+02:00 Dávid Bolvanský <david.bolvansky at gmail.com>:
> I did
>
> https://bugs.llvm.org/show_bug.cgi?id=37368
>
> 2018-05-09 20:29 GMT+02:00 David Zarzycki <dave at znu.io>:
>
>> I’d wager that the if-else chain is being converted to a "switch
>> statement" during an optimization
2013 Aug 29
1
[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86
On 29 August 2013 10:12, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> But this is another case. LLVM IR distinguishes between ordered and unordered compare and X86 backend has appropriate instructions.
I think LLVM uses ordered/unordered compare to mean something
different to what the x86 instructions do. For example, "not equal":
fcmp une == unordered not
2013 Aug 29
2
[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86
On 29 Aug 2013, at 08:19, Tim Northover <t.p.northover at gmail.com> wrote:
> If so, a compare that used that instruction would have to become more
> like an "invoke" with a landingpad for the exception and so on,
> wouldn't it? The current fcmp can already distinguish between ordered
> and unordered, because ucomiss provides that information.
There are currently
2013 Aug 29
0
[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86
But this is another case. LLVM IR distinguishes between ordered and unordered compare and X86 backend has appropriate instructions.
But during DAG selection we just lose this information and always generate unordered fcmp.
I.e. in case of ordered fcmp the vcomiss should be generated, and in case of unordered - vucomiss.
- Elena
-----Original Message-----
From: Dr D. Chisnall [mailto:dc552 at
2018 May 09
2
Ignored branch predictor hints
Hi Dávid,
Looks like you can defeat the switch conversion by adding a dummy asm(""):
#define likely(x) __builtin_expect((x),1)
// switch like
char * b(int e) {
    if (likely(e == 0))
        return "0";
    asm("");
    if (e == 1)
        return "1";
    else return "f";
}
Dave
> On May 9, 2018, at 2:33 PM, Dávid Bolvanský via llvm-dev
2015 Nov 21
2
Recent -Os code size regressions
On Fri, Nov 20, 2015 at 5:06 PM, James Molloy <james at jamesmolloy.co.uk>
wrote:
>
> Hi,
>
> We'd need to look precisely at what's causing the code size bloat. The
> midend commit pointed out by Steve shouldn't cause bloat in and of itself -
> it should reduce code size. It removes a load of stores and branches.
>
> I know a backend change I made to ARM isn't
2013 May 20
2
[LLVMdev] VCOMISS instruction in X86
Hi,
I'm looking at scalar and packed instructions in X86.
The instruction VCOMISS is scalar. May I remove SSEPackedSingle/SSEPackedDouble domain from it?
defm VUCOMISS : sse12_ord_cmp<0x2E, FR32, X86cmp, f32, f32mem, loadf32,
"ucomiss", SSEPackedSingle>, TB, VEX, VEX_LIG;
defm VUCOMISD : sse12_ord_cmp<0x2E, FR64, X86cmp, f64,
2013 Aug 29
2
[LLVMdev] Ordered / Unordered FP compare are not handled properly on X86
Should I open a ticket for this?
- Elena
From: Eli Friedman [mailto:eli.friedman at gmail.com]
Sent: Wednesday, August 28, 2013 19:51
To: Demikhovsky, Elena
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Ordered / Unordered FP compare are not handled properly on X86
On Wed, Aug 28, 2013 at 2:16 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com<mailto:elena.demikhovsky at
2011 Oct 07
2
[LLVMdev] Aliasing confusion
Hi all,
I'm having trouble understanding how llvm determines if pointers
alias. Consider the following two functions that each do a redundant
load:
define float @A(float * noalias %ptr1) {
%ptr2 = getelementptr float* %ptr1, i32 1024
%val1a = load float* %ptr1
store float %val1a, float* %ptr2
%val1b = load float* %ptr1
ret float %val1b
}
define float @B(float * noalias %ptr1,
2014 Oct 13
2
[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets
Hello,
Depending on how I extract integer lanes from an x86_64 xmm register, the
backend may spill that register in order to load scalars. The effect was
observed on two targets: corei7-avx and btver1 (I haven't checked other
targets).
Here's a test case with spilling/no-spilling code put on conditional
compile:
#if __SSE4_1__ != 0
#include <smmintrin.h>
#else
#include
2015 Jul 29
2
[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address
When I compile attached IR with LLVM 3.6
llc -march=x86-64 -o f.S f.ll
it generates an aligned ADDPS with unaligned address. See attached f.S,
here an extract:
addq $12, %r9 # $12 is not a multiple of 16, thus for xmm0 this is unaligned
xorl %esi, %esi
.align 16, 0x90
.LBB0_1: # %loop2
2011 Oct 07
0
[LLVMdev] Aliasing confusion
On Fri, Oct 7, 2011 at 2:15 PM, andrew adams <andrew.b.adams at gmail.com> wrote:
> Hi all,
>
> I'm having trouble understanding how llvm determines if pointers
> alias. Consider the following two functions that each do a redundant
> load:
>
> define float @A(float * noalias %ptr1) {
> %ptr2 = getelementptr float* %ptr1, i32 1024
> %val1a = load float*
2013 Sep 18
2
[LLVMdev] JIT compiled intrinsics calls is call to null pointer
Hi everyone,
I am trying to call an LLVM intrinsic (llvm.pow.f32), inserted with the
following call:
std::vector<llvm::Type *> arg_types;
arg_types.push_back(llvm::Type::getFloatTy(context));
auto function = llvm::Intrinsic::getDeclaration(module, llvm::Intrinsic::pow, arg_types);
auto result = ir_builder->CreateCall(function, args);
When I try to execute the code generated by the JIT
2017 May 11
2
FENV_ACCESS and floating point LibFunc calls
Sounds like the select lowering issue is definitely separate from the FENV
work.
Is there a bug report with a C or IR example? You want to generate compare
and branch instead of a cmov for something like this?
int foo(float x) {
    if (x < 42.0f)
        return x;
    return 12;
}
define i32 @foo(float %x) {
%cmp = fcmp olt float %x, 4.200000e+01
%conv = fptosi float %x to i32
%ret = select