thr3ads.net - similar to: "[LLVMdev] Is it a bug or am I missing something ?"

Displaying 20 results from an estimated 3000 matches similar to: "[LLVMdev] Is it a bug or am I missing something ?"

[LLVMdev] Is it a bug or am I missing something ?

2013 Feb 19

[LLVMdev] Is it a bug or am I missing something ?

<<<<<<<<<<<<<<<<<<<<<<<<<< ; ModuleID = 'shufxbug.ll' target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:6 4-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32" target triple = "i386-pc-linux-gnu" define void @sample_test(<4 x float>* nocapture

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

2010 Aug 31

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

Hi, I've attached 2 .ll files which are supposed to be equivalent but 'unopt-fail.ll' causes a crash in webkit's test suite while 'unopt-pass.ll' does not. I can't give more details about the crash, when I run the crashing test it in isolation it passes, when I run the full suite it crashes; it boggles the mind. Below I provide the optimized asm that is produced from

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

2010 Aug 31

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

Using MM registers is wrong unless the user has specifically asked for it, which doesn't seem to be the case here. In the awesome MMX architecture, touching an MM register makes subsequent x87 operations fail unless an EMMS instruction is issued first; none of the compilers here are smart enough to insert EMMS instructions in the right places, so the only safe thing is not to use

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

2010 Aug 31

[LLVMdev] "equivalent" .ll files diverge after optimizations are applied

Here's the optimized versions: $ opt -std-compile-opts unopt-pass.ll -o - | llvm-dis -o - [...] define %3 @_ZN7WebCore15GraphicsContext19roundToDevicePixelsERKNS_9FloatRectE(%"class.WebCore::GraphicsContext"* %this, %"struct.WebCore::FloatRect"* %rect) nounwind ssp align 2 { %roundedOrigin = alloca %"class.WebCore::FloatSize", align 4 ;

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 09

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

Hi Chandler, Thanks for fixing the problem with the insertps mask. Generally the new shuffle lowering looks promising, however there are some cases where the codegen is now worse causing runtime performance regressions in some of our internal codebase. You have already mentioned how the new shuffle lowering is missing some features; for example, you explicitly said that we currently lack of

[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets

2014 Oct 13

[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets

Hello, Depending on how I extract integer lanes from an x86_64 xmm register, the backend may spill that register in order to load scalars. The effect was observed on two targets: corei7-avx and btver1 (I haven't checked other targets). Here's a test case with spilling/no-spilling code put on conditional compile: #if __SSE4_1__ != 0 #include <smmintrin.h> #else #include

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

2015 Jul 29

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

When I compile attached IR with LLVM 3.6 llc -march=x86-64 -o f.S f.ll it generates an aligned ADDPS with unaligned address. See attached f.S, here an extract: addq $12, %r9 # $12 is not a multiple of 4, thus for xmm0 this is unaligned xorl %esi, %esi .align 16, 0x90 .LBB0_1: # %loop2

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

2015 Jul 29

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

This load instruction assumes the default ABI alignment for the <4 x float> type, which is 16: %15 = load <4 x float>* %14 You can set the alignment of loads to something lower than 16 in your frontend, and this will make LLVM use movups instructions: %15 = load <4 x float>* %14, align 4 If some LLVM mid-level pass is introducing this load without proving that the vector is

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 10

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

On Tue, Sep 9, 2014 at 11:39 PM, Chandler Carruth <chandlerc at google.com> wrote: > Awesome, thanks for all the information! > > See below: > > On Tue, Sep 9, 2014 at 6:13 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> > wrote: >> >> You have already mentioned how the new shuffle lowering is missing >> some features; for example, you explicitly

[LLVMdev] Poor floating point optimizations?

2010 Nov 20

[LLVMdev] Poor floating point optimizations?

I wanted to use LLVM for my math parser but it seems that floating point optimizations are poor. For example consider such C code: float foo(float x) { return x+x+x; } and here is the code generated with "optimized" live demo: define float @foo(float %x) nounwind readnone { entry: %0 = fmul float %x, 2.000000e+00 ; <float> [#uses=1] %1 = fadd float %0, %x

[LLVMdev] Poor floating point optimizations?

2010 Nov 20

[LLVMdev] Poor floating point optimizations?

And also the resulting assembly code is very poor: 00460013 movss xmm0,dword ptr [esp+8] 00460019 movaps xmm1,xmm0 0046001C addss xmm1,xmm1 00460020 pxor xmm2,xmm2 00460024 addss xmm2,xmm1 00460028 addss xmm2,xmm0 0046002C movss dword ptr [esp],xmm2 00460031 fld dword ptr [esp] Especially pxor&and instead of movss (which is

[LLVMdev] Seg faulting on vector ops

2007 Jul 20

[LLVMdev] Seg faulting on vector ops

Hola LLVMers, I'm looking to make use of the vectorization primitives in the Intel chip with the code we generate from LLVM and so I've started experimenting with it. What is the state of the machine code generated for vectors? In my tinkering, I seem to be getting some wonky machine instructions, but I'm most likely just doing something wrong and I'm hoping you can set me in

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

2017 Apr 19

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

Changing the list from cfe-dev to llvm-dev > On 20 Apr 2017, at 4:52 AM, Michael Clark <michaeljclark at mac.com> wrote: > > I’m getting close. I think it may be an issue with an individual intrinsic. I’m looking for the X86 lowering of Instruction::FPToUI. > > I found a comment around the rationale for using a conditional move versus a branch. I believe the predicate logic

[LLVMdev] About JIT by LLVM 2.9 or later

2011 Nov 02

[LLVMdev] About JIT by LLVM 2.9 or later

Hello guys, Thanks for your help when you are busing. I am working on an open source project. It supports shader language and I want JIT feature, so LLVM is used. But now I find the ABI & Calling Convention did not co-work with MSVC. For example, following code I have: struct float4 { float x, y, z, w; }; struct float4x4 { float4 x, y, z, w; }; float4 fetch_vs( float4x4* mat

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 08

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

> On Sep 7, 2014, at 8:49 PM, Quentin Colombet <qcolombet at apple.com> wrote: > > Sure, > > Here is the command line: > clang -cc1 -triple x86_64-apple-macosx -S -disable-free -disable-llvm-verifier -main-file-name tmp.i -mrelocation-model pic -pic-level 2 -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu core-avx-i -O3 -ferror-limit 19 -fmessage-length 114

[LLVMdev] Seg faulting on vector ops

2007 Jul 21

[LLVMdev] Seg faulting on vector ops

On Fri, 20 Jul 2007, Chuck Rose III wrote: > I'm looking to make use of the vectorization primitives in the Intel > chip with the code we generate from LLVM and so I've started > experimenting with it. What is the state of the machine code generated > for vectors? In my tinkering, I seem to be getting some wonky machine > instructions, but I'm most likely just doing

[LLVMdev] X86 - Help on fixing a poor code generation bug

2013 Dec 05

[LLVMdev] X86 - Help on fixing a poor code generation bug

Hi all, I noticed that the x86 backend tends to emit unnecessary vector insert instructions immediately after sse scalar fp instructions like addss/mulss. For example: ///////////////////////////////// __m128 foo(__m128 A, __m128 B) { _mm_add_ss(A, B); } ///////////////////////////////// produces the sequence: addss %xmm0, %xmm1 movss %xmm1, %xmm0 which could be easily optimized into

[LLVMdev] Seg faulting on vector ops

2007 Jul 24

[LLVMdev] Seg faulting on vector ops

Hrm. This problem shouldn't be target specific. I am pretty sure prologue / epilogue inserter aligns stack correctly if there are stack objects with greater than default stack alignment requirement. Seems to be the initial alloca() instruction should specify 16 byte alignment? Evan On Jul 21, 2007, at 2:51 PM, Chris Lattner wrote: > On Fri, 20 Jul 2007, Chuck Rose III wrote:

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

2017 Apr 20

[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

> This seems like it was done for perf reason (mispredict). Conditional-to-cmov transformation should keep > from introducing additional observable side-effects, and it's clear that whatever did this did not account > for floating point exception. That’s a very reasonable statement, but I’m not sure it corresponds to the way we have typically approached this sort of problem. In

[LLVMdev] Seg faulting on vector ops

2007 Jul 20

[LLVMdev] Seg faulting on vector ops

Hi Chuck! On Jul 20, 2007, at 11:36 AM, Chuck Rose III wrote: > Hola LLVMers, > > > > I’m looking to make use of the vectorization primitives in the > Intel chip with the code we generate from LLVM and so I’ve started > experimenting with it. What is the state of the machine code > generated for vectors? In my tinkering, I seem to be getting some > wonky

similar to: [LLVMdev] Is it a bug or am I missing something ?