similar to: [LLVMdev] use AVX automatically if present

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] use AVX automatically if present"

2012 May 24
0
[LLVMdev] use AVX automatically if present
On Thu, 24 May 2012, Pan, Wei wrote: > Very likely AVX is not enabled in your llc. This feature was enabled > just recently (late of April). I forgot to mention that I am using recent LLVM-3.1 and in principle my llc knows about avx as I have shown in the second example. But avx does not seem to be used by default. On Thu, 24 May 2012, Henning Thielemann wrote: > $ llc -o - -mattr
2012 May 24
2
[LLVMdev] use AVX automatically if present
Henning, I believe the code that is supposed to do this is in: lib/Target/X86/X86Subtarget.cpp in X86Subtarget::AutoDetectSubtargetFeatures() Is there a bug in that function? -Hal On Thu, 24 May 2012 23:56:48 +0200 (CEST) Henning Thielemann <llvm at henning-thielemann.de> wrote: > > On Thu, 24 May 2012, Pan, Wei wrote: > > > Very likely AVX is not enabled in your llc.
2013 Dec 11
2
[LLVMdev] AVX code gen
Hello - I found this post on the llvm blog: http://blog.llvm.org/2012/12/new-loop-vectorizer.html which makes me think that clang / llvm are capable of generating AVX with packed instructions as well as utilizing the full width of the YMM registers… I have an environment where icc generates these instructions (vmulps %ymm1, %ymm3, %ymm2 for example) but I can not get clang/llvm to generate such
2013 Dec 12
0
[LLVMdev] AVX code gen
It probably does not pick the right processor architecture. You could try “clang -mavx” or “clang -march=corei7-avx” for ivy-bridge and “clang -march=core-avx2” or “clang -mavx2" for haswell. $ clang -march=core-avx2 -O3 -S -o - test.c .section __TEXT,__text,regular,pure_instructions .globl _f .align 4, 0x90 _f: ## @f
2012 Jan 09
3
[LLVMdev] Calling conventions for YMM registers on AVX
On Jan 9, 2012, at 10:00 AM, Jakob Stoklund Olesen wrote: > > On Jan 8, 2012, at 11:18 PM, Demikhovsky, Elena wrote: > >> I'll explain what we see in the code. >> 1. The caller saves XMM registers across the call if needed (according to DEFS definition). >> YMMs are not in the set, so caller does not take care. > > This is not how the register allocator
2012 Jan 10
0
[LLVMdev] Calling conventions for YMM registers on AVX
This is the wrong code: declare <16 x float> @foo(<16 x float>) define <16 x float> @test(<16 x float> %x, <16 x float> %y) nounwind { entry: %x1 = fadd <16 x float> %x, %y %call = call <16 x float> @foo(<16 x float> %x1) nounwind %y1 = fsub <16 x float> %call, %y ret <16 x float> %y1 } ./llc -mattr=+avx
2013 Sep 05
1
[LLVMdev] AVX calling convention?
I am tracking down an x86-64 code generation problem that has to do with AVX instructions. The symptom is: a function is called, and the upper half of the function argument (which is short16) is zero. This happens only when I compile code with pocl, but not when I use clang and/or llc manually. I tracked this down to the following. The call site looks like vmovdqa 24064(%rsp), %ymm0 vmovdqa
2014 May 11
2
[LLVMdev] [cfe-dev] Code generation for noexcept functions
On Sun, May 11, 2014 at 8:19 AM, Stephan Tolksdorf <st at quanttec.com> wrote: > Hi, > > When clang/LLVM can't prove that a noexcept function only contains > non-throwing code, it seems to insert an explicit exception handler that > calls std::terminate. Why doesn't clang leave it to the eh personality > function to call std::terminate when an exception is thrown
2020 Feb 28
2
Is BlockAddress always correct ?
Hi I use BlockAddress to get the address of BasicBlock , and I use GlobalVariable 's getInitializer() to pass the address of BasicBlock to the global variable of my own program and then I print it out. But , I found that BlockAddress is not always correct. For example, some function's rsp (stack pointer) or other register is maintained by caller, so it would be like:
2020 Sep 01
2
Vector evolution?
Hi, Please consider the following loop: using v4f32 = float __attribute__((__vector_size__(16))); void fct6(v4f32 *x) { #pragma clang loop vectorize(enable) for (int i = 0; i < 256; ++i) x[i] = 7 * x[i]; } After compiling it with: clang++ -O3 -march=native -mtune=native \ -Rpass=loop-vectorize,slp-vectorize -Rpass-missed=loop-vectorize,slp-vectorize
2013 Jul 10
4
[LLVMdev] unaligned AVX store gets split into two instructions
I'm seeing a difference in how LLVM 3.3 and 3.2 emit unaligned vector loads on AVX. 3.3 is splitting up an unaligned vector load but in 3.2, it was emitted as a single instruction (details below). In a matrix-matrix inner-kernel, I see a ~25% decrease in performance, which seems to be due to this. Any ideas why this changed? Thanks! Zach LLVM Code: define <4 x double> @vstore(<4 x
2012 Mar 01
2
[LLVMdev] Stack alignment in kernel
I'm running in AVX mode, but the stack before call to kernel is aligned to 16 bit. Could you, please, tell me where it should be specified? Thank you. - Elena --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or
2012 Mar 01
3
[LLVMdev] Stack alignment on X86 AVX seems incorrect
Hi Elena, You're correct. LLVM does not align the stack to 32-bytes for AVX and unaligned moves should be used for YMM spills. I wrote some code to align the stack to 32-bytes when AVX spills are present; it does break the x86-64 ABI though. If upstream would be interested in this code, I can arrange with my employer to send a patch to the mailing list. -Cameron On Mar 1, 2012, at 4:09 PM,
2019 Sep 02
3
AVX2 codegen - question reg. FMA generation
Hello, On the appended reasonably simple test case that has an fmul/fadd sequence on <8 x float> vector types, I don't see the x86-64 code generator (with cpu set to haswell or later types) turning it into an AVX2 FMA instructions. Here's the snippet in the output it generates: $ llc -O3 -mcpu=skylake --------------------- .LBB0_2: # =>This Inner
2016 May 06
3
Unnecessary spill/fill issue
Hi, I am using mcjit in llvm 3.6 to jit kernels to x86 avx2. I've noticed some inefficient use of the stack around constant vectors. In one example, I have code that computes a series of constant vectors at compile time. Each vector has a single use. In the final asm, I see a series of spills at the top of the function of all the constant vectors immediately to stack, then each use references
2018 Jun 29
2
[RFC][VECLIB] how should we legalize VECLIB calls?
Illustrative Example: clang -fveclib=SVML -O3 svml.c -mavx #include <math.h> void foo(double *a, int N){ int i; #pragma clang loop vectorize_width(8) for (i=0;i<N;i++){ a[i] = sin(i); } } Currently, this results in a call to <8 x double> __svml_sin8(<8 x double>) after the vectorizer. This is 8-element SVML sin() called with 8-element argument. On the surface,
2017 Aug 21
3
DragonEgg for GCC v8.x and LLVM v6.x is just able to work
Hi LLVM and GCC developers, My sincere thanks will goto: * Duncan, the core developer of llvm-gcc and dragonegg http://llvm.org/devmtg/2009-10/Sands_LLVMGCCPlugin.pdf * David, the innovator and developer of GCC https://dmalcolm.fedorapeople.org/gcc/global-state/requirements.html and others who give me kind response for teaching me patiently and carefully about how to migrate GCC v4.8.x to
2018 Jun 29
2
[RFC][VECLIB] how should we legalize VECLIB calls?
Ashutosh, Thanks for the repy. Related earlier topic on this appears in the review of the SVML patch (@mmasten). Adding few names from there. https://reviews.llvm.org/D19544 There, I see Hal's review comment "let's start only with the directly-legal calls". Apparently, what we have right now in the trunk is "not legal enough". I'll work on the patch to stop
2013 Sep 20
2
[LLVMdev] LLVM ERROR: expected relocatable expression
This example generates the following error: .Ltmp3: .Ltmp5: .Ltmp13: .word (.Ltmp5-.Ltmp3)-.Ltmp13 ./llvm-mc ex.s -filetype=obj LLVM ERROR: expected relocatable expression when using: -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
2018 Jul 02
2
[RFC][VECLIB] how should we legalize VECLIB calls?
Adding to Ashutosh's comments, We are also interested in making LLVM generate vector math library calls that are available with glibc (version > 2.22). reference: https://sourceware.org/glibc/wiki/libmvec Using the example case given in the reference, we found there are 2 vector versions for "sin" (4 X double) with same VF namely _ZGVcN4v_sin (avx) version and _ZGVdN4v_sin