2015 Aug 28
2
Aligned vector spills and variably sized stack frames
----- Original Message -----
> From: "Philip Reames via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Friday, August 28, 2015 6:21:00 PM
> Subject: Re: [llvm-dev] Aligned vector spills and variably sized stack frames
>
> On 08/28/2015 04:00 PM, Philip Reames via llvm-dev wrote:
> > I've run
2016 May 06
3
Unnecessary spill/fill issue
Hi, I am using MCJIT in LLVM 3.6 to JIT kernels to x86 AVX2. I've noticed
some inefficient use of the stack around constant vectors. In one example,
I have code that computes a series of constant vectors at compile time.
Each vector has a single use. In the final asm, I see a series of spills at
the top of the function of all the constant vectors immediately to stack,
then each use references
2012 Mar 01
3
[LLVMdev] Stack alignment on X86 AVX seems incorrect
Hi Elena,
You're correct. LLVM does not align the stack to 32 bytes for AVX, and
unaligned moves should be used for YMM spills.
I wrote some code to align the stack to 32 bytes when AVX spills are
present; it does break the x86-64 ABI though. If upstream would be
interested in this code, I can arrange with my employer to send a patch to
the mailing list.
-Cameron
On Mar 1, 2012, at 4:09 PM,
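For readers following the alignment discussion, here is a minimal C++ sketch of the address arithmetic behind "aligning the stack to 32 bytes". The helper names `align_down` and `is_aligned` are hypothetical and are not LLVM APIs; the example stack pointer value is made up. It only illustrates why a stack pointer that satisfies the 16-byte x86-64 ABI alignment may still be unsuitable for an aligned 32-byte YMM spill, and it is not the patch described in the message above.

```cpp
#include <cstdint>

// Hypothetical helper: round an address down to a power-of-two boundary.
// The x86 stack grows downward, so realigning it means rounding down.
constexpr std::uint64_t align_down(std::uint64_t addr, std::uint64_t alignment) {
    return addr & ~(alignment - 1);
}

// Hypothetical helper: true if addr is a multiple of a power-of-two alignment.
constexpr bool is_aligned(std::uint64_t addr, std::uint64_t alignment) {
    return (addr & (alignment - 1)) == 0;
}

// Made-up stack pointer: 16-byte aligned, but not 32-byte aligned.
constexpr std::uint64_t kExampleSp = 0x7ffc1230;

static_assert(is_aligned(kExampleSp, 16), "fine for 16-byte (XMM) slots");
static_assert(!is_aligned(kExampleSp, 32), "an aligned YMM store here would fault");
static_assert(align_down(kExampleSp, 32) == 0x7ffc1220, "realigned for YMM slots");
```

With a stack pointer like the one above, a spill slot is only safe for an aligned 32-byte move after the realignment step; otherwise an unaligned move is the conservative choice.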
2020 Sep 01
2
Vector evolution?
Hi,
Please consider the following loop:
using v4f32 = float __attribute__((__vector_size__(16)));
void fct6(v4f32 *x)
{
#pragma clang loop vectorize(enable)
  for (int i = 0; i < 256; ++i)
    x[i] = 7 * x[i];
}
After compiling it with:
clang++ -O3 -march=native -mtune=native \
-Rpass=loop-vectorize,slp-vectorize \
-Rpass-missed=loop-vectorize,slp-vectorize
2015 Jul 24
2
[LLVMdev] SIMD for sdiv <2 x i64>
On 07/24/2015 03:42 AM, Benjamin Kramer wrote:
>> On 24.07.2015, at 08:06, zhi chen <zchenhn at gmail.com> wrote:
>>
>> It seems that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there are no alternative AVX/AVX2 instructions for int64? The same thing
2015 Aug 31
3
MCRegisterClass mandatory vs preferred alignment?
Looking around today, it appears that TargetRegisterClass and
MCRegisterClass only includes a single alignment. This is documented as
being the minimum legal alignment, but it appears to often be greater
than this in practice. For instance, on x86 the alignment of %ymm0 is
listed as 32, not 1. Does anyone know why this is?
Additionally, where are these alignments actually defined? I
2015 Jul 24
0
[LLVMdev] SIMD for sdiv <2 x i64>
------------------------------------ IR ------------------------------------
if.then.i.i.i.i.i.i: ; preds = %if.then4
%S25_D = zext <2 x i32> %splatLDS17_D.splat to <2 x i64>
%umul_with_overflow.i.iS26_D = shl <2 x i64> %S25_D, <i64 3, i64 3>
%extumul_with_overflow.i.iS26_D = extractelement <2 x i64>
2015 Jul 24
1
[LLVMdev] SIMD for sdiv <2 x i64>
This snippet of IR is interesting:
%sub.ptr.div.iS37_D = sdiv <2 x i64> %sub.ptr.sub.iS36_D, <i64 24, i64 24>
%cmp10S38_D = icmp ugt <2 x i64> %sub.ptr.div.iS37_D, %splatInsMapS1_D.splat
%zextS39_D = sext <2 x i1> %cmp10S38_D to <2 x i64>
%BCS39_D = bitcast <2 x i64> %zextS39_D to i128
%mskS39_D = icmp ne i128 %BCS39_D, 0
br i1 %mskS39_D,
2012 Mar 01
2
[LLVMdev] Stack alignment on X86 AVX seems incorrect
On Thu, Mar 01, 2012 at 06:16:46PM +0000, Demikhovsky, Elena wrote:
> vmovaps should not access stack if it is not aligned to 32
I'm not completely sure I understand your problem. Are you saying that
the generated code assumes 256-bit alignment, your default stack
alignment is 128-bit, and LLVM doesn't adjust it automatically?
Joerg
2015 Aug 31
2
MCRegisterClass mandatory vs preferred alignment?
On 08/31/2015 03:59 PM, Matthias Braun wrote:
> Looks to me like the alignment is specified in tablegen. From Target.td:
>
> class RegisterClass<string namespace, list<ValueType> regTypes, int alignment,
> dag regList, RegAltNameIndex idx = NoRegAltName>
>
> X86RegisterInfo.td:
>
> def VR256 : RegisterClass<"X86", [v32i8,
2015 Jul 14
4
[LLVMdev] Poor register allocation (constants causing spilling)
Hi,
While investigating a performance issue with an internal codebase I
came across what looks to be poor register allocation. I have
constructed a small(ish) reproducible which demonstrates the issue
(see test.ll attached).
I have spent some time going through the register allocator to
understand what is happening. I have also experimented with some
small changes to try and improve the
2018 Jul 02
2
[RFC][VECLIB] how should we legalize VECLIB calls?
Adding to Ashutosh's comments, we are also interested in making LLVM
generate vector math library calls that are available with glibc (version >
2.22).
reference: https://sourceware.org/glibc/wiki/libmvec
Using the example case given in the reference, we found there are 2 vector
versions for "sin" (4 X double) with same VF namely _ZGVcN4v_sin (avx)
version and _ZGVdN4v_sin
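To make the naming scheme in the two symbols above easier to read, here is a small C++ sketch that assembles such names. It assumes the x86 vector function ABI convention used by libmvec: the prefix `_ZGV`, an ISA letter (`b` SSE, `c` AVX, `d` AVX2, `e` AVX-512), `N` for unmasked or `M` for masked, the vector length, one `v` per vector parameter, then `_` and the scalar name. The function `vector_variant_name` and the `Isa` enum are hypothetical helpers written for this illustration; they are not part of LLVM or glibc.

```cpp
#include <string>

// ISA letters as assumed from the x86 vector function ABI.
enum class Isa : char { SSE = 'b', AVX = 'c', AVX2 = 'd', AVX512 = 'e' };

// Hypothetical helper: build the name of an unmasked variant whose
// parameters are all plain vector ('v') arguments.
std::string vector_variant_name(Isa isa, unsigned vf, unsigned num_params,
                                const std::string &scalar_name) {
    std::string name = "_ZGV";
    name += static_cast<char>(isa);   // target ISA letter
    name += 'N';                      // 'N' = unmasked variant
    name += std::to_string(vf);       // vectorization factor
    name.append(num_params, 'v');     // one 'v' per vector parameter
    name += '_';
    name += scalar_name;
    return name;
}
```

Under these assumptions, `vector_variant_name(Isa::AVX, 4, 1, "sin")` yields `_ZGVcN4v_sin` and `vector_variant_name(Isa::AVX2, 4, 1, "sin")` yields `_ZGVdN4v_sin`, which shows how two variants can share the same VF and differ only in the ISA letter.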
2018 Jun 29
2
[RFC][VECLIB] how should we legalize VECLIB calls?
Ashutosh,
Thanks for the reply.
Related earlier topic on this appears in the review of the SVML patch (@mmasten). Adding few names from there.
https://reviews.llvm.org/D19544
There, I see Hal's review comment "let's start only with the directly-legal calls". Apparently, what we have right now
in the trunk is "not legal enough". I'll work on the patch to stop
2013 Nov 16
1
[LLVMdev] Limit loop vectorizer to SSE
The vectorizer will now emit
= load <8 x i32>, align #TargetAlignmentOfScalari32
where before it would emit
= load <8 x i32>
(which has the semantics of “= load <8 x i32>, align 0”, meaning the address is aligned to the target ABI alignment; see http://llvm.org/docs/LangRef.html#load-instruction).
When the backend generates code for the former it will emit an unaligned move:
2012 Mar 01
0
[LLVMdev] Stack alignment on X86 AVX seems incorrect
When stack is unaligned, LLVM should generate vmovups instead of vmovaps.
- Elena
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Joerg Sonnenberger
Sent: Thursday, March 01, 2012 20:31
To: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Stack alignment on X86 AVX seems incorrect
On Thu, Mar 01, 2012 at 06:16:46PM +0000,
2012 Mar 01
4
[LLVMdev] Stack alignment on X86 AVX seems incorrect
On Thu, Mar 1, 2012 at 4:29 PM, Evandro Menezes <emenezes at codeaurora.org> wrote:
...
> Aligning the stack to 32 bytes when there are auto AVX vector variables
> present shouldn't necessarily break the x86-64 ABI, as long as smaller auto
> variables remain properly aligned. A similar approach was taken for i386
> in GCC in order to support SSE vectors.
>
> Perhaps
2018 Jun 29
2
[RFC][VECLIB] how should we legalize VECLIB calls?
Illustrative Example:
clang -fveclib=SVML -O3 svml.c -mavx
#include <math.h>
void foo(double *a, int N){
  int i;
  #pragma clang loop vectorize_width(8)
  for (i = 0; i < N; i++) {
    a[i] = sin(i);
  }
}
Currently, this results in a call to <8 x double> __svml_sin8(<8 x double>) after the vectorizer.
This is 8-element SVML sin() called with 8-element argument. On the surface,
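As a rough illustration of what legalizing such a call could mean, the C++ sketch below splits one 8-element call into two 4-element calls and concatenates the results. This is an assumption made for illustration only: the excerpt above does not say which legalization strategy the thread settles on. The function `sin4` is a hypothetical stand-in for a 4-wide vector math routine (it simply loops over `std::sin`), and `sin8_via_split` is likewise a made-up name, not an SVML or LLVM entry point.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Vec4 = std::array<double, 4>;
using Vec8 = std::array<double, 8>;

// Hypothetical stand-in for a 4-wide vector sin routine. A real build would
// call into a vector math library instead of looping over std::sin.
Vec4 sin4(const Vec4 &x) {
    Vec4 r{};
    for (std::size_t i = 0; i < x.size(); ++i) r[i] = std::sin(x[i]);
    return r;
}

// Split an 8-wide request into two 4-wide calls, then join the halves.
Vec8 sin8_via_split(const Vec8 &x) {
    Vec4 lo{}, hi{};
    for (std::size_t i = 0; i < 4; ++i) {
        lo[i] = x[i];
        hi[i] = x[i + 4];
    }
    const Vec4 rlo = sin4(lo);
    const Vec4 rhi = sin4(hi);
    Vec8 r{};
    for (std::size_t i = 0; i < 4; ++i) {
        r[i] = rlo[i];
        r[i + 4] = rhi[i];
    }
    return r;
}
```

The point of the sketch is only that an 8-wide call can be expressed with narrower calls when the target's registers hold four doubles, as is the case for 256-bit AVX.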
2018 Jul 02
2
[RFC][VECLIB] how should we legalize VECLIB calls?
It may not be a full solution for the problems you're trying to solve, but
I don't know why adding to include/llvm/CodeGen/RuntimeLibcalls.def is a
problem in itself. Certainly, it's a mess that could be organized,
especially so we're not repeating everything for each data type as we do
right now.
So yes, I think that would allow us to remove the VecLib mappings because
we are
2013 Nov 16
0
[LLVMdev] Limit loop vectorizer to SSE
I confirm that r194876 fixes the issue, i.e. the segfault no longer occurs.
My program still passed 16 byte aligned pointers to the function
which the loop vectorizer processes successfully:
LV: Vector loop of width 8 costs: 1.
LV: Selecting VF = : 8.
LV: Found a vectorizable loop (8) in func_orig.ll
LV: Unroll Factor is 1
Since the program runs fine, it seems to be allowed for the CPU
to issue a vector
2018 Jul 02
8
[RFC][VECLIB] how should we legalize VECLIB calls?
On 07/02/2018 04:33 PM, Saito, Hideki wrote:
> > It may not be a full solution for the problems you're trying to solve
>
> If we are inventing a new solution, I’d like it also to solve OpenMP
> declare simd legalization issue. If a small extension of existing scheme
> works for mathlib only, I’m happy to take that and discuss OpenMP