thr3ads.net - similar to: "error: couldn't allocate input reg for constraint '{xmm0}'"

Displaying 20 results from an estimated 2000 matches similar to: "error: couldn't allocate input reg for constraint '{xmm0}'"

[LLVMdev] Shuffle regression

2008 Jul 12

Hi all, I think I found a regression in the shuffle instruction. I've attached a replacement of fibonacci.cpp to reproduce the issue. It runs fine on release 2.3 but revision 52648 fails, and I suspect that the issue is still present. 2.3 generates the following x86 code: 03A10010 push ebp 03A10011 mov ebp,esp 03A10013 and esp,0FFFFFFF0h 03A10019

[LLVMdev] Shuffle regression

2008 Jul 12

I have fixed a related bug: 52740. Can you check if that fixes this problem? Evan On Jul 11, 2008, at 6:43 PM, Nicolas Capens wrote: > Hi all, > > I think I found a regression in the shuffle instruction. I’ve > attached a replacement of fibonacci.cpp to reproduce the issue. It > runs fine on release 2.3 but revision 52648 fails, and I suspect > that the issue is still

[LLVMdev] Spilled variables using unaligned moves

2008 Jul 14

Hi all, It looks like vector spills don't use aligned moves even though the stack is aligned. This seems like an optimization opportunity. The attached replacement of fibonacci.cpp generates x86 code like this: 03A70010 push ebp 03A70011 mov ebp,esp 03A70013 and esp,0FFFFFFF0h 03A70019 sub esp,1A0h ... 03A7006C movups xmmword ptr

[LLVMdev] movaps being generated despite alignment 1 being specified

2007 Oct 18

Hello LLVMers, High order bit: Presence of a called function is causing a store on an unrelated vector to generate an aligned store rather than an unaligned one despite unaligned store being indicated in the associated StoreInst. Details: I pulled down the latest source, so this is something I'm finding with the current LLVM. I'm hoping you'll have an idea what's

[LLVMdev] InstructionCombining forgets alignment of globals

2008 Jul 10

Hi all, The InstructionCombining pass causes alignment of globals to be ignored. I've attached a replacement of Fibonacci.cpp which reproduces this (I used 2.3 release). Here's the x86 code it produces: 03C20019 movaps xmm0,xmmword ptr ds:[164E799h] 03C20020 mulps xmm0,xmmword ptr ds:[164E79Ah] 03C20027 movaps xmmword ptr ds:[164E799h],xmm0 03C2002E

[LLVMdev] SSE intrinsic alignment bug?

2008 May 22

The intent here is that "in" and "out" are always aligned, by forcing the stack pointer in the function that defines them to be aligned. On some targets (darwin) the stack pointer is always 16-byte aligned; on other targets there should be code in the function prologue to force it to be aligned. On May 22, 2008, at 4:36 PM, Nicolas Capens wrote: > Small typo, for
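The prologue instruction `and esp,0FFFFFFF0h` that appears in the x86 listings quoted in these threads is one way to force the stack pointer onto a 16-byte boundary: it clears the low four bits, rounding the address down. A minimal C++ sketch of that masking arithmetic follows. The helper name `align_down` is hypothetical and is not part of LLVM; the sample address is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): round a 32-bit address down to a
// power-of-two boundary. For alignment == 16 the mask is 0xFFFFFFF0, the
// same constant used by `and esp,0FFFFFFF0h`.
constexpr std::uint32_t align_down(std::uint32_t addr, std::uint32_t alignment) {
    return addr & ~(alignment - 1u);
}

// Compile-time checks of the arithmetic on an illustrative address.
static_assert(align_down(0x0012FF7Cu, 16u) == 0x0012FF70u, "low four bits cleared");
static_assert(align_down(0x0012FF70u, 16u) == 0x0012FF70u, "aligned input is unchanged");
```

Because the stack grows downward on x86, rounding down only ever reserves extra space; it never moves the pointer back over live data.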

[LLVMdev] SSE intrinsic alignment bug?

2008 May 22

Hi all, I think I might have found a potential bug when using SSE intrinsic and unaligned memory. Here's the code to reproduce it: #include "llvm/Module.h" #include "llvm/Intrinsics.h" #include "llvm/Instructions.h" #include "llvm/ModuleProvider.h" #include "llvm/ExecutionEngine/JIT.h" #include

[LLVMdev] movaps being generated despite alignment 1 being specified

2007 Oct 19

On Oct 18, 2007, at 1:52 PM, Chuck Rose III wrote: > > Here are the instructions for evaluateDependents. The JITter > hasn’t compiled foo yet. What’s confusing to me is why did my > movups suddenly become a movaps? All the stores and loads have > align 1 on them. Hi Chuck, I believe this is a bug but am unable to reproduce it with the test case you've provided. I

[LLVMdev] Spilled variables using unaligned moves

2008 Jul 14

This is on Windows / Cygwin? I think the dynamic stack pointer re-alignment doesn't happen until post-register allocation. Assuming there aren't other instructions between the prologue and the first movups that mess up esp (there shouldn't), this is indeed a bug. Please file a bug and attach a bc file. Thanks. Evan On Jul 14, 2008, at 7:43 AM, Nicolas Capens wrote: > Hi

[LLVMdev] Is it a bug or am I missing something ?

2013 Feb 19

Hi all, on following code: ; ModuleID = 'shufxbug.ll' target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32" target triple = "i386-pc-linux-gnu" define void @sample_test(<4 x float>* nocapture %source, <8 x float>* nocapture %dest) nounwind noinline { L.entry:

[LLVMdev] SSE intrinsic alignment bug?

2008 May 22

Small typo, for the correct assembly code I meant:

    mov eax,dword ptr [esp+8]
    movups xmm0,xmmword ptr [eax]
    rcpps xmm1,xmm0
    mov eax,dword ptr [esp+4]
    movups xmmword ptr [eax],xmm1
    ret

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

2015 Jul 29

This load instruction assumes the default ABI alignment for the <4 x float> type, which is 16: %15 = load <4 x float>* %14 You can set the alignment of loads to something lower than 16 in your frontend, and this will make LLVM use movups instructions: %15 = load <4 x float>* %14, align 4 If some LLVM mid-level pass is introducing this load without proving that the vector is
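The distinction drawn in the reply above, between a load that may assume 16-byte alignment and one that may not, can be illustrated at the source level. The sketch below is portable C++ rather than LLVM IR, and every name in it (`Float4`, `is_aligned16`, `load_unaligned`) is hypothetical. It assumes `float` is 4 bytes wide. `std::memcpy` makes no alignment assumption about its source, so it is a safe way to read four floats from an arbitrary address; which instructions a compiler emits for it is not guaranteed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 4-float aggregate standing in for a <4 x float> value.
struct Float4 {
    float v[4];
};

// True when p lies on a 16-byte boundary, which is what an aligned
// SSE access such as movaps requires of its memory operand.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Read four floats from an address with no alignment guarantee.
// memcpy is well defined for any source alignment.
inline Float4 load_unaligned(const void* p) {
    Float4 out;
    std::memcpy(&out, p, sizeof out);
    return out;
}
```

For example, given `alignas(16) float buf[8]`, the address `buf` passes `is_aligned16`, while `buf + 3` (12 bytes further on) does not, yet `load_unaligned(buf + 3)` is still valid.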

[LLVMdev] llvm.exp.f32 didn't work

2012 Mar 31

Hi, I found that llvm.exp.f32 didn't work but sqrt works well. I implemented a function like define inlinehint float "my_exp"(float %.value) { .body: %0 = call float @llvm.exp.f32(float %.value) ret float %0 } declare float @llvm.exp.f32(float) nounwind readonly But it generates following ASM: 00280072 movups xmm0,xmmword ptr [esp+8] 00280077 movss dword ptr

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

2015 Jul 29

When I compile attached IR with LLVM 3.6 llc -march=x86-64 -o f.S f.ll it generates an aligned ADDPS with unaligned address. See attached f.S, here an extract: addq $12, %r9 # $12 is not a multiple of 4, thus for xmm0 this is unaligned xorl %esi, %esi .align 16, 0x90 .LBB0_1: # %loop2

[LLVMdev] Is it a bug or am I missing something ?

2013 Feb 19

<<<<<<<<<<<<<<<<<<<<<<<<<< ; ModuleID = 'shufxbug.ll' target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32" target triple = "i386-pc-linux-gnu" define void @sample_test(<4 x float>* nocapture

RFC: A proposal for vectorizing loops with calls to math functions using SVML

2016 Apr 01

RFC: A proposal for vectorizing loops with calls to math functions using SVML (short vector math library). ========= Overview ========= Very simply, SVML (Intel short vector math library) functions are vector variants of scalar math functions that take vector arguments, apply an operation to each element, and store the result in a vector register. These vector variants can be generated by the

windows ABI problem with i128?

2018 Apr 26

On Thu, Apr 26, 2018 at 3:44 AM, Anton Korobeynikov <anton at korobeynikov.info > wrote: > Most probably you need to properly specify the calling convention the > backend is using for calling the runtime functions. Thanks for the tip. Can you be more specific? Are you suggesting there is some config parameter I can set before running TargetMachineEmitToFile? Do you know what

RFC: A proposal for vectorizing loops with calls to math functions using SVML

2016 Apr 04

Hi Sanjay, For sincos calls, I’m currently just going through isTriviallyVectorizable(), which was good enough to get things working so that I could test the translation. I don’t see why this cannot be changed to use addVectorizableFunctionsFromVecLib(). The other functions that I’m working with are already vectorized using the loop pragma. Those include sin, cos, exp, log, and pow. From: Sanjay

[LLVMdev] Spilled variables using unaligned moves

2008 Jul 14

On Jul 14, 2008, at 7:43 AM, Nicolas Capens wrote: > Hi all, > > It looks like vector spills don’t use aligned moves even though the > stack is aligned. This seems like an optimization opportunity. What target is this? Linux doesn't have a 16-byte aligned stack. -Chris > > The attached replacement of fibonacci.cpp generates x86 code like > this: > > 03A70010

[LLVMdev] Spilled variables using unaligned moves

2008 Jul 15

Hi Evan, Could you maybe point me to the source files where this issue might originate? I'd like to learn more about LLVM's innards but so far I've just scraped the surface and I don't know where what phase of instruction selection / register allocation / stack layout / etc. happens. If I understand correctly this issue might be fixed by moving stack pointer alignment
