thr3ads.net - similar to: "[LLVMdev] stack alignment (again)"

Displaying 20 results from an estimated 8000 matches similar to: "[LLVMdev] stack alignment (again)"

2008 Mar 30

[LLVMdev] stack alignment (again)

On Mar 28, 2008, at 5:17 PM, Chuck Rose III wrote: > I was curious about the state of stack alignment on x86. I noticed > there are a few bugs outstanding on the issue. I recently added > some code which had the effect of throwing an extra function > parameter on our stack at runtime, a 4 byte pointer. > > Esp is now not 16-byte aligned, so instructions like unpcklps
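The arithmetic behind this complaint can be sketched in a few lines. This is only an illustration: the helper names and the starting value of `esp` are made up, and the 16-byte requirement is the one mentioned in the post.

```python
def is_aligned(addr: int, alignment: int = 16) -> bool:
    """Return True when addr is a multiple of alignment (a power of two)."""
    return (addr & (alignment - 1)) == 0


def align_down(addr: int, alignment: int = 16) -> int:
    """Round addr down to a multiple of alignment (x86 stacks grow downward)."""
    return addr & ~(alignment - 1)


esp = 0x1000                   # made-up starting value, 16-byte aligned
esp -= 4                       # one extra 4-byte pointer lands on the stack
misaligned = not is_aligned(esp)
realigned = align_down(esp)    # what an explicit realignment would produce

print(hex(esp), misaligned, hex(realigned))
```

Any push whose size is not a multiple of 16 leaves the low four bits of the stack pointer nonzero, which is what instructions with aligned memory operands trip over.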

2015 Jul 29

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

When I compile attached IR with LLVM 3.6 llc -march=x86-64 -o f.S f.ll it generates an aligned ADDPS with unaligned address. See attached f.S, here an extract: addq $12, %r9 # $12 is not a multiple of 16, thus for xmm0 this is unaligned xorl %esi, %esi .align 16, 0x90 .LBB0_1: # %loop2

2015 Jul 29

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

This load instruction assumes the default ABI alignment for the <4 x float> type, which is 16: %15 = load <4 x float>* %14 You can set the alignment of loads to something lower than 16 in your frontend, and this will make LLVM use movups instructions: %15 = load <4 x float>* %14, align 4 If some LLVM mid-level pass is introducing this load without proving that the vector is
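The rule described in this reply can be modeled as a toy function. It is a sketch, not LLVM's instruction selection code: `pick_vector_move` and `ABI_ALIGN_V4F32` are hypothetical names, and the only facts taken from the reply are that a `<4 x float>` load without an explicit alignment is assumed to be 16-byte aligned and that a lower declared alignment leads to `movups`.

```python
ABI_ALIGN_V4F32 = 16  # default ABI alignment for <4 x float>, per the reply


def pick_vector_move(declared_align=None):
    """Toy model: choose an aligned or unaligned SSE move for a 16-byte load.

    declared_align is the value of an explicit ", align N" on the load,
    or None when the load carries no alignment and the ABI default applies.
    """
    align = ABI_ALIGN_V4F32 if declared_align is None else declared_align
    return "movaps" if align >= 16 else "movups"


print(pick_vector_move())    # load <4 x float>* %14
print(pick_vector_move(4))   # load <4 x float>* %14, align 4
```

The point of the model is that the frontend, not the backend, is responsible for stating a lower alignment when it cannot guarantee 16 bytes.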

2014 Oct 13

[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets

Hello, Depending on how I extract integer lanes from an x86_64 xmm register, the backend may spill that register in order to load scalars. The effect was observed on two targets: corei7-avx and btver1 (I haven't checked other targets). Here's a test case with spilling/no-spilling code put on conditional compile: #if __SSE4_1__ != 0 #include <smmintrin.h> #else #include

2007 Jul 20

[LLVMdev] Seg faulting on vector ops

Hola LLVMers, I'm looking to make use of the vectorization primitives in the Intel chip with the code we generate from LLVM and so I've started experimenting with it. What is the state of the machine code generated for vectors? In my tinkering, I seem to be getting some wonky machine instructions, but I'm most likely just doing something wrong and I'm hoping you can set me in

2016 Apr 01

RFC: A proposal for vectorizing loops with calls to math functions using SVML

RFC: A proposal for vectorizing loops with calls to math functions using SVML (short vector math library). ========= Overview ========= Very simply, SVML (Intel short vector math library) functions are vector variants of scalar math functions that take vector arguments, apply an operation to each element, and store the result in a vector register. These vector variants can be generated by the

2016 Apr 04

RFC: A proposal for vectorizing loops with calls to math functions using SVML

Hi Sanjay, For sincos calls, I’m currently just going through isTriviallyVectorizable(), which was good enough to get things working so that I could test the translation. I don’t see why this cannot be changed to use addVectorizableFunctionsFromVecLib(). The other functions that I’m working with are already vectorized using the loop pragma. Those include sin, cos, exp, log, and pow. From: Sanjay

2007 Jul 21

[LLVMdev] Seg faulting on vector ops

On Fri, 20 Jul 2007, Chuck Rose III wrote: > I'm looking to make use of the vectorization primitives in the Intel > chip with the code we generate from LLVM and so I've started > experimenting with it. What is the state of the machine code generated > for vectors? In my tinkering, I seem to be getting some wonky machine > instructions, but I'm most likely just doing

2007 Jul 24

[LLVMdev] Seg faulting on vector ops

Hrm. This problem shouldn't be target specific. I am pretty sure prologue / epilogue inserter aligns stack correctly if there are stack objects with greater than default stack alignment requirement. Seems to me the initial alloca() instruction should specify 16 byte alignment? Evan On Jul 21, 2007, at 2:51 PM, Chris Lattner wrote: > On Fri, 20 Jul 2007, Chuck Rose III wrote:

2007 Jul 26

[LLVMdev] Seg faulting on vector ops

I am fairly certain this is right. Chuck, can you do a quick experiment for me? Go back to your original code but make sure the alloca instruction specifies 16-byte alignment. The code should work. If not, please file a bug. Thanks, Evan On Jul 24, 2007, at 1:58 PM, Evan Cheng wrote: > Hrm. This problem shouldn't be target specific. I am pretty sure > prologue / epilogue inserter

2007 Jul 20

[LLVMdev] Seg faulting on vector ops

Hi Chuck! On Jul 20, 2007, at 11:36 AM, Chuck Rose III wrote: > Hola LLVMers, > > > > I’m looking to make use of the vectorization primitives in the > Intel chip with the code we generate from LLVM and so I’ve started > experimenting with it. What is the state of the machine code > generated for vectors? In my tinkering, I seem to be getting some > wonky

2015 Dec 09

Allowing virtual registers after register allocation

Hi all, Virtual ISAs such as WebAssembly and NVPTX use infinite virtual register sets instead of traditional physical registers. PrologEpilogInserter is run after register allocation and asserts that all virtuals have been allocated but doesn't otherwise depend on this if scavenging is not needed. We'd like to use the target-independent PEI code for WebAssembly, so we're proposing a

2009 Nov 20

[LLVMdev] Spilling & UNPCKLPS Question

I'm working on adding some more annotations to asm and I came across this odd construct generated for X86/split-vector-rem.ll: movss %xmm0, 32(%rsp) # Scalar Spill [...] unpcklps 48(%rsp), %xmm0 # Vector Folded Reload [...] movaps %xmm0, 16(%rsp) # Vector Spill [...] unpcklps

2007 Sep 28

[LLVMdev] Vector troubles

Hola LLVMers, I'm working on engaging SSE via the LLVM vector ops on x86. I had some questions a while back that you all helped out on, but I'm seeing similar issues and was hoping you'd have some ideas. Below is the dump of the LLVM IR of a program which is designed to take a vector stored in a float*, build an LLVM vector from it, copy it to another vector, and then take it

2004 Jul 01

[LLVMdev] Stack alignment problem

Hello, it seems the Prolog/Epilog insertion does not correctly align stack for me. Consider the PEI::calculateFrameObjectOffsets method. It only aligns the stack if FFI->hasCalls() is true. The only place where MachineFrameInfo::setHasCalls is invoked is PEI::saveCallerSavedRegisters and the value 'true' is only passed when there are instructions with opcodes equal
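For readers unfamiliar with what frame layout involves, here is a toy model of assigning offsets to stack objects and rounding the frame size up to the stack alignment. It is a sketch under stated assumptions and not the code in PEI::calculateFrameObjectOffsets: `align_up` and `layout_frame` are hypothetical names, and the object sizes are invented.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def layout_frame(objects, stack_align):
    """Assign an offset to each (size, align) object and size the frame.

    Returns (offsets, frame_size). The final rounding of frame_size is the
    kind of stack-alignment step the post is asking about.
    """
    offset = 0
    offsets = []
    for size, align in objects:
        offset = align_up(offset, align)   # honor the object's own alignment
        offsets.append(offset)
        offset += size
    return offsets, align_up(offset, stack_align)


# Invented example: a 4-byte int, a 16-byte vector, and a 1-byte flag.
offsets, frame_size = layout_frame([(4, 4), (16, 16), (1, 1)], stack_align=16)
print(offsets, frame_size)
```

If the final rounding is skipped, the frame size here would be 33 bytes, and anything placed below the frame would no longer start on a 16-byte boundary.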

2014 Sep 10

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

On Tue, Sep 9, 2014 at 11:39 PM, Chandler Carruth <chandlerc at google.com> wrote: > Awesome, thanks for all the information! > > See below: > > On Tue, Sep 9, 2014 at 6:13 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> > wrote: >> >> You have already mentioned how the new shuffle lowering is missing >> some features; for example, you explicitly

2016 Jan 22

Allowing virtual registers after register allocation

Here are 2 patches, which are independent of each other. The first splits PrologEpilogInserter into 2 parts : http://reviews.llvm.org/D16481 After looking at the code I thought it made more sense for the major split to include whether callee-saved register spills are supported. So for non-virtual targets, virtual registers are not supported and scavenging is optionally supported, and vice versa

2016 Jan 13

Allowing virtual registers after register allocation

We had some additional discussion on this. There is a lot of concern generally about post-RA passes which do not expect to have to handle virtual registers; specifically if they unexpectedly start seeing virtual registers, or if they work today but start making assumptions in the future. We discussed considering a mechanism that would require MachineFunctionPasses to "opt-in" and declare

2015 Dec 10

Allowing virtual registers after register allocation

> On Dec 10, 2015, at 10:49 AM, Derek Schuff <dschuff at google.com> wrote: > > > > On Thu, Dec 10, 2015 at 10:13 AM Quentin Colombet <qcolombet at apple.com <mailto:qcolombet at apple.com>> wrote: > > I am tempted to think no, we don’t, but I don’t know the use cases. > What post-RA passes with want to run with virtual regs? > > The immediate

2006 Apr 13

[LLVMdev] Re: Creating Release 1.7 Branch at 1:00pm PDT

Here's what's left on Linux (GCC 4.1.0), after all updates that went into the branch: Running /proj/llvm/build/../llvm/test/Regression/CFrontend/dg.exp ... FAIL: /proj/llvm/build/../llvm/test/Regression/CFrontend/2004-02-12- LargeAggregateCopy.c.tr: gccas: /proj/llvm/build/../llvm/lib/VMCore/Function.cpp:266: unsigned int llvm::Function::getIntrinsicID() const: Assertion `0 &&
