Hello LLVM Devs, I am starting my PhD on Automatic Parallelization for DSP and want to play with some ARM NEON intrinsics for a start. I spent the last three days trying to compile a version of LLVM that would allow me to compile sources that contain these intrinsics, but with no success. In the process I found out that clang doesn't support NEON (as per blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html), but there has been at least some effort in adding it ( codeaurora.org/patches/quic/llvm/32040/clang-Initial-Neon-support.patch ). I also tried compiling LLVM 2.9 + llvm-gcc but that failed too many times and I gave up. After some discussions with colleagues (notably Alberto Magni, who added OpenCL support to clang some time ago lists.cs.uiuc.edu/pipermail/cfe-dev/2010-November/012293.html) my current plan is to implement the ARM NEON intrinsics as a shared library, using attributes as in: typedef float float4 __attribute__((ext_vector_type(4))); or if that doesn't work, I will try to implement the intrinsics in clang itself (not sure this is the best way of doing it). Ideally, I want to be able to compile C code that includes ARM NEON intrinsics to other targets (TI processors, e.g.). Suggestions, comments, and recommendations are very welcome. Kind regards, - Stan -- Stan Manilov 1st year Ph.D. student 2013 Graduate in B.Sc. Computer Science and Mathematics The University of Edinburgh -------------- next part -------------- An HTML attachment was scrubbed... URL: <lists.llvm.org/pipermail/llvm-dev/attachments/20130926/b31b3f2b/attachment.html>
On 26 September 2013 12:22, Stanislav Manilov <S.Z.Manilov at sms.ed.ac.uk>wrote:> In the process I found out that clang doesn't support NEON (as per > blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html), > but there has been at least some effort in adding it ( > codeaurora.org/patches/quic/llvm/32040/clang-Initial-Neon-support.patch > ). >Hi Stanislav, LLVM does support NEON on ARM32 for a very long time. The commit you're referring is about AArch64, and yes, support for ARM64 NEON is patchy at the moment, but it's progressing quite quickly. What back-end are you trying to use? 32-bits or 64-bits? I also tried compiling LLVM 2.9 + llvm-gcc but that failed too many times> and I gave up. After some discussions with colleagues (notably Alberto > Magni, who added OpenCL support to clang some time ago > lists.cs.uiuc.edu/pipermail/cfe-dev/2010-November/012293.html) my > current plan is to implement the ARM NEON intrinsics as a shared library, > using attributes as in: >LLVM 2.9 is really old, and llvm-gcc is discontinued, so I wouldn't even try that. If you don't want to use trunk, I recommend you to use LLVM with Clang 3.3 and see what you get. typedef float float4 __attribute__((ext_vector_type(4)));> or if that doesn't work, I will try to implement the intrinsics in clang > itself (not sure this is the best way of doing it). > Ideally, I want to be able to compile C code that includes ARM NEON > intrinsics to other targets (TI processors, e.g.). >So, if I get it right, you have a file with ARM NEON intrinsics (the ones defined in arm_neon.h) and passed it through LLVM 2.9 with LLVM-GCC front-end and failed. As of 2010, LLVM can compile every single NEON instruction, but you should use LLVM's own version of arm_neon.h, since the type definitions do vary between toolchains. In the end, they amount to the same thing on each toolchain, but their representation can be different. I suggest you try with Clang 3.3 and if that fails, we'll start from there. cheers, --renato -------------- next part -------------- An HTML attachment was scrubbed... URL: <lists.llvm.org/pipermail/llvm-dev/attachments/20130926/f2608d17/attachment.html>
Hi Stan,> I spent the last three days trying to compile a version of LLVM that would > allow me to compile sources that contain these intrinsics, but with no success.Ok. This we can probably help with. Did you manage to build a version of Clang (preferably from git/subversion)? If so, you're probably having problems cross-compiling. Renato's recently worked on some documentation in this area: clang.llvm.org/docs/CrossCompilation.html. But for a quick hack, you could try: $ cat > neon.c #include <arm_neon.h> float32x4_t my_func(float32x4_t lhs, float32x4_t rhs) { return vaddq_f32(lhs, rhs); } $ clang --target=arm-linux-gnueabihf -mcpu=cortex-a15 -ffreestanding -O3 -S -o - neon.c ("ffreestanding" will dodge any issues with your supporting toolchain, but won't work for larger tests. You've got to actually solve the issues before you start running code).> In the process I found out that clang doesn't support NEON (as per > blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsics-and.html),That's rather out of date, I'm afraid. 32-bit ARM does support both NEON intrinsics and a reasonable amount of LLVM's own auto-vectorisation (which is in its early stages, but we have some kind of loop and SLP vectorisation going on).> but there has been at least some effort in adding it > (codeaurora.org/patches/quic/llvm/32040/clang-Initial-Neon-support.patch).That patch is part of the effort to implement NEON (instructions and intrinsics) on the 64-bit ARM architecture (AArch64).> I also tried compiling LLVM 2.9 + llvm-gcc but that failed too many times > and I gave up.Yep. llvm-gcc is long dead, and LLVM 2.9 isn't much healthier.> current plan is to implement the ARM NEON intrinsics as a shared library, > using attributes as in:That would probably be possible, but very bad from a performance perspective. The whole point of NEON intrinsics is to speed up vector code; if you've got the overhead of a call/return for each intrinsic and completely fixed registers around even that you'll be in for a world of pain.> Ideally, I want to be able to compile C code that includes ARM NEON > intrinsics to other targets (TI processors, e.g.).Now that's going to be harder. LLVM itself doesn't support any TI processors, for a start. And many of the NEON intrinsics (those with more complex semantics) compile to LLVM IR with LLVM-level intrinsics, which are only supported in the ARM backend. Your shared library idea would work semantically, of course. But I'm not sure what useful information could be extracted from it. Cheers. Tim.
Apparently Analagous Threads
- [LLVMdev] ARM NEON intrinsics in clang
- [RFC PATCH v2] cover: armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
- [RFC PATCH v2] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
- [LLVMdev] ARM NEON intrinsics in clang
- [PATCH v1] cover: armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics