thr3ads.net - similar to: "[LLVMdev] ARM NEON intrinsics in clang"

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] ARM NEON intrinsics in clang"

2013 Sep 26

[LLVMdev] ARM NEON intrinsics in clang

On 26 September 2013 17:52, Stanislav Manilov <stanislav.manilov at gmail.com> wrote:
> To answer your question I am testing on a pandaboard currently, which has
> an arm cortex-a9 processor, which I think is 64-bit.
Cortex-A9 is still 32-bits, so you'll have all support you need. ;)
> however it doesn't if I remove the -ffreestanding flag. I need to figure
> this out

2013 Sep 26

[LLVMdev] ARM NEON intrinsics in clang

>> To answer your question I am testing on a pandaboard currently, which has
>> an arm cortex-a9 processor, which I think is 64-bit.
>
> Cortex-A9 is still 32-bits, so you'll have all support you need. ;)
Ah, Okay, embarrassing...
>> however it doesn't if I remove the -ffreestanding flag. I need to figure
>> this out next.
>
> Can you at

2013 Sep 26

[LLVMdev] ARM NEON intrinsics in clang

Hello LLVM Devs, I am starting my PhD on Automatic Parallelization for DSP and want to play with some ARM NEON intrinsics for a start. I spent the last three days trying to compile a version of LLVM that would allow me to compile sources that contain these intrinsics, but with no success. In the process I found out that clang doesn't support NEON (as per

2013 Sep 26

[LLVMdev] ARM NEON intrinsics in clang

Hello Tim,
>> I spent the last three days trying to compile a version of LLVM that would
>> allow me to compile sources that contain these intrinsics, but with no success.
>
> Ok. This we can probably help with. Did you manage to build a version
> of Clang (preferably from git/subversion)?
Yes, I managed to build the latest (r191291) svn revision of LLVM + clang. If

2013 Oct 01

[LLVMdev] Implementing the ARM NEON Intrinsics for PowerPC

Hello LLVM Devs, Thanks for helping me previously to cross-compile for ARM, I managed to get a working toolchain and am currently having fun compiling different toy problems and running them on a pandaboard. As part of my research I am trying to implement the ARM NEON Intrinsics in the PowerPC LLVM backend. I am still at the beginning of my efforts and am not yet familiar with either the ARM or

2016 May 04

Is the CppBackend still supported?

On Wed, May 4, 2016 at 3:10 PM, Stanislav Manilov <stanislav.manilov at gmail.com> wrote:
> As in "look at the source of clang" or as in "look at the -S -emit-llvm"
> output? If you mean the former, then would that be easy for someone who
> hasn't seen the clang source before?
Generally the latter - then potentially set some breakpoints & look at

2017 Jun 22

A bug in DependenceAnalysis?

Hi Philip, I forgot to mention that I was ignoring loop-independent dependences. If I don't I get an inconsistent, ordered, anti, loop-independent dependence and an inconsistent, ordered, flow, loop-carried dependence for example A. At the same time I get just a consistent, ordered, anti, loop-independent dependence for example B. Here's the .ll code for example A: *; Function Attrs:

2013 Oct 02

[LLVMdev] Implementing the ARM NEON Intrinsics for PowerPC

Hello Hal, I am not very familiar with the DSP capabilities of PowerPC, but I imagine there will be instructions for simple vector operations like vector addition, multiplication, etc. so for these I imagine the implementation would consist of just outputting the correct instruction. However, for NEON instructions like the reciprocal step (see

2014 Nov 09

[RFC PATCH v1] arm: kf_bfly4: Introduce ARM neon intrinsics

Optimize kf_bfly4 function using ARM NEON intrinsics for SoCs that have NEON VFP unit.
As initial step, only targeting ARMv7-VFP based SoCs.
To enable this optimization, use --enable-armv7-neon-float when running configure command. This is disabled by default.
---
 Makefile.am              |  16 ++++
 celt/_kiss_fft_guts.h    |  13 +++
 celt/arm/kiss_fft_neon.c | 211

2014 Dec 09

[RFC PATCH v2] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics

Viswanath Puttagunta wrote:
> + SUMM = vdupq_n_f32(0);
It kills me that there's no intrinsic for VMOV.F32 d0, #0 (or at least I couldn't find one), so this takes two instructions instead of one.
> + /* Consume 4 elements in x vector and 8 elements in y
> + * vector. However, the 8'th element in y never really gets
> + * touched in this loop. So, if len == 4,

2016 Feb 11

Writing an LLVM Pass that depends on mem2reg

Oh, I see, that makes a lot of sense. How do I build the pass pipeline?
On Thu, Feb 11, 2016 at 5:54 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
> On Feb 11, 2016, at 9:49 AM, Stanislav Manilov via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >
> > Hello,
> >
> > I am used to specifying dependence on other LLVM passes in the

2013 Sep 26

[LLVMdev] ARM NEON intrinsics in clang

On 26 September 2013 18:13, Stanislav Manilov <stanislav.manilov at gmail.com> wrote:
> which I suspect has something to do with the fact that in /usr/include I
> have a folder called x86_64-linux-gnu but not one
> called arm-linux-gnueabihf. Am I even remotely right?
Yes, you are, and the docs should (hopefully) have all the information you need to get past that, and other

2014 Dec 19

[PATCH v1] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics

Optimize celt_pitch_xcorr function (for floating point) using ARM NEON intrinsics for SoCs that have NEON VFP unit.
To enable this optimization, use --enable-intrinsics configure option.
Compile time and runtime checks are also supported to make sure this optimization is only enabled when the compiler supports neon intrinsics.
---
 Makefile.am | 12 ++

2014 Dec 10

[RFC PATCH v3] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics

2016 May 04

Is the CppBackend still supported?

The usual advice I provide people is "see what Clang does with an equivalent C construct"
On Wed, May 4, 2016 at 12:18 PM, Stanislav Manilov <stanislav.manilov at gmail.com> wrote:
> Hi,
>
> There is another benefit to keeping the CppBackend: it's great for
> learning how to use the IR and the C++ API in particular, as can be seen
> from this SO Q&A:

2014 Dec 07

[RFC PATCH v2] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics

2019 Sep 05

ARM vectorized fp16 support

Hi, I'm trying to compile a half-precision program for ARM, but it seems LLVM fails to automatically generate fused-multiply-add instructions for c += a * b. I'm wondering whether I did something wrong; if not, is it a missing feature that will be supported later? (I know there are fp16 FMLA intrinsics, though.) Test programs and outputs:
$ clang -O3 -march=armv8.2-a+fp16fml

2013 Oct 02

[LLVMdev] Implementing the ARM NEON Intrinsics for PowerPC

On 2 October 2013 12:17, Renato Golin <renato.golin at linaro.org> wrote:
> On 2 October 2013 10:12, Steven Newbury <steve at snewbury.org.uk> wrote:
>
>> How does this make any sense?
>>
>
> I have to agree with you that this doesn't make much sense, but there is a
> case where you would want something like that: when the original source
> uses NEON

2014 Dec 19

[PATCH v1] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics

On 19 December 2014 at 17:25, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote:
> Optimize celt_pitch_xcorr function (for floating point)
> using ARM NEON intrinsics for SoCs that have NEON VFP unit.
>
> To enable this optimization, use --enable-intrinsics
> configure option.
>
> Compile time and runtime checks are also supported to make sure
> this

2017 Jun 21

A bug in DependenceAnalysis?

Hi Philip,
Thanks for checking! I'm running my own Foo pass that registers DependenceAnalysisWrapperPass as a prerequisite and then I run it like so:
opt -load libfoo.so -foo example.bc
This is LLVM 3.9.
Cheers,
- Stan
On Wed, Jun 21, 2017 at 5:40 PM, Philip Pfaffe <philip.pfaffe at gmail.com> wrote:
> Hi Stan,
>
> in both cases I get a consistent anti result. Can you
