thr3ads.net - similar to: "[LLVMdev] runtime performance benchmarking tools for clang"

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] runtime performance benchmarking tools for clang"

[LLVMdev] runtime performance benchmarking tools for clang

2013 Dec 11

Hi Kun Ling & Bergstrom, Thanks a lot for your earlier responses. We did use the benchmarks in the LLVM test-suite to compare the execution time of code built by clang and gcc. It appears that clang is slower than gcc in cases that involve floating-point operations and in cases that involve recursive calls (note that pic/pie was enabled for both gcc and clang). 1) For lag in execution time due to

[LLVMdev] runtime performance benchmarking tools for clang

2013 Dec 11

2) For lag in execution time due to floating-point operations, it was clearly observed that gcc used the floating-point instruction FSQRT, whereas clang seemed to call an emulated function (?), BL SQRT. Note that we used the following flags for both the clang and the gcc compilation: -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -mtune=cortex-a8 In fact, I was surprised to see that even when

[LLVMdev] LLVM ARM VMLA instruction

2013 Dec 19

Hi Tim,
> > cortex-a15 vfpv4 : vmla instruction emitted (which is a NEON instruction)
>
> I get a VFP vmla here rather than a NEON one (clang -target
> armv7-linux-gnueabihf -mcpu=cortex-a15): "vmla.f32 s0, s1, s2". Are
> you seeing something different?
As per Renato's comment above, the vmla instruction is a NEON instruction while vfma is a VFP instruction. Correct

[LLVMdev] LLVM on ARM testing.

2011 Jul 05

On 3 July 2011 21:32, Karel Gardas <karel.gardas at centrum.cz> wrote:
> please see http://ghcarm.wordpress.com/2011/07/03/llvm-on-arm-testing/
>
> Is there anything else I might do for you to get those regressions fixed?
Hi Karel, This is great! I can see there's only a handful of errors. All JIT errors seem to be the same (MC). All O2 errors, too (MIPS). The select.ll

[LLVMdev] LLVM ARM VMLA instruction

2013 Dec 19

On 19 December 2013 08:50, suyog sarda <sardask01 at gmail.com> wrote:
> It may seem that the total number of cycles is more or less the same for a single
> vmla and for vmul+vadd. However, when the vmul+vadd combination is used instead of
> vmla, intermediate results will be generated which need to be stored
> in memory for future access. This will lead to a lot of load/store ops being
>
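To make the comparison in this snippet concrete, here is a minimal C sketch of the multiply-accumulate pattern under discussion. The function name `dot_f32` is hypothetical and does not come from the thread. Whether a given compiler lowers the loop body to a single vmla.f32 or to a vmul.f32 followed by a vadd.f32 depends on the target CPU, the FPU options, and the compiler version; the sketch only shows the source-level pattern and makes no claim about the generated code.

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): single-precision dot product.
 * Each iteration computes acc + a[i] * b[i], the multiply-accumulate
 * pattern that a backend may lower to one instruction or to a separate
 * multiply and add. */
float dot_f32(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}
```

One way to see which form is emitted is to compile a file like this to assembly (for example with clang's -S option at -O2, together with the target options quoted in the threads above) and read the loop body.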

NEON FP flags

2016 Mar 25

On 25 March 2016 at 04:11, Hal Finkel <hfinkel at anl.gov> wrote:
> As I understand it, the fundamental property being addressed here is: Are the semantics of scalar FP math the same as vector FP math? TTI seems like a good place to expose that information. If the semantics are indeed different, then the vectorizer would require fast-math flags in order to vectorize FP operations

[LLVMdev] Question about arm thumb2 code generation

2012 Aug 02

Thanks, Andrew, for the answer. I would like to generate code for Cortex-A9 that doesn't use NEON for FP computation but uses vfpv3-d16 instead. I've tried some combinations of -mattr=+neon,-neonfp,+vfp3,+d16 but couldn't get the ".fpu vfpv3-d16" directive generated in the assembly file. Do you know how to make it happen? Best Regards, Seb From: Andrew Trick [mailto:atrick at apple.com] Sent:

[LLVMdev] LLVM on ARM testing.

2011 Jul 03

Hello, Last week or so I asked here which reference GCC version the LLVM development team uses for *native* testing on ARM hardware (no cross compilation!). I've been curious myself how the situation looks, so I tested LLVM 2.9 as a reference point and LLVM HEAD as of June 29 on ARMv7 (two boards with two different Ubuntu versions) compiled by GCC 4.3.4, 4.4.1, 4.4.5,

[LLVMdev] LLVM ARM VMLA instruction

2013 Dec 19

On Thu, Dec 19, 2013 at 2:43 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> > As per Renato's comment above, the vmla instruction is a NEON instruction while
> > vfma is a VFP instruction. Correct me if I am wrong on this.
>
> My version of the ARM architecture reference manual (v7 A & R) lists
> versions requiring NEON and versions requiring VFP. (Section
> A8.8.337).

[LLVMdev] LLVM ARM VMLA instruction

2013 Dec 19

> As per Renato's comment above, the vmla instruction is a NEON instruction while vfma is a VFP instruction. Correct me if I am wrong on this.
My version of the ARM architecture reference manual (v7 A & R) lists versions requiring NEON and versions requiring VFP. (Section A8.8.337). Split in just the way you'd expect (SIMD variants need NEON).
> It may seem that total number of cycles are

NEON FP flags

2016 Mar 29

On Fri, Mar 25, 2016 at 01:23:03PM +0000, Renato Golin via llvm-dev wrote:
> On 25 March 2016 at 04:11, Hal Finkel <hfinkel at anl.gov> wrote:
> > As I understand it, the fundamental property being addressed here is: Are
> > the semantics of scalar FP math the same as vector FP math? TTI seems like
> > a good place to expose that information. If the semantics are indeed

[LLVMdev] Cross compilation error LLVM-3.0

2012 Jan 23

Hi, I am trying to cross-compile LLVM-3.0 for an ARM target, i.e. I would like to run the LLVM tools on an ARM platform and generate code for the ARM platform. I configured the build with ../src/configure --build=i686-pc-linux-gnu --host=arm-linux-gnueabi --target=arm-linux-gnueabi --enable-optimized=no --prefix=/home/user/Acads/Compiler/LLVM/llvm-3.0/bin --with-as=/usr/bin/arm-linux-gnueabi-as

[LLVMdev] LLVM CodeGen Engineer job opening with Apple's compiler team

2011 May 26

Hi all, The LLVM CodeGen and Tools team at Apple is looking for exceptional compiler engineers. This is a great opportunity to work with many of the leaders in the LLVM community. If you are interested in this position, please send your resume / CV and relevant information to evan.cheng at apple.com. Thanks, Evan
Job description: The Apple compiler team is seeking an engineer who is strongly

libc++ cross-compile linux-armv7 and math function problems

2018 Feb 06

Hello, I am trying to cross-compile libc++ from my x86_64 Linux system to armv7hf. We have our own gcc compiler that we build with crosstool-NG (based on gcc 6.3.0) and I set my environment like this:
CC=armv7a-plex-linux-gnueabihf-gcc
CXX=armv7a-plex-linux-gnueabihf-g++
CFLAGS=-fPIC -DPIC -mfloat-abi=hard -march=armv7-a -Os -mfpu=vfpv3-d16 --sysroot=<path>
CXXFLAGS=-fPIC -DPIC

AM335x ARM Cortex-A8 performance drop opus 1.1

2013 Oct 18

Hello! I've just compared the 1.0.3 release with the master branch on a BeagleBone Black (AM335x 1GHz ARM Cortex-A8 with NEON floating-point accelerator) and Arch Linux ARM. At the moment I don't know why, but I see that 1.1 is much slower at encoding. Are there any changes to the defaults that I missed and that could explain this? Normally I would have expected better performance with 1.1 and the ARM

[RFC] New Clang target selection options for ARM/AArch64

2018 Sep 25

Hi Eli, Renato, Thanks for your feedback; there's a lot more to some of these things than I knew. I've addressed your points below. The overall summary is:
- Start with converting the TargetParser to TableGen, with no user-facing changes
- Add warnings based on that, behind -Wall, starting with command lines, since directives have larger implications that need investigation
Thanks,

Cross compiling for Baremetal ARM without using GCC

2017 Oct 31

Dear LLVM developers, Hello, I'm trying to find a way of cross-compiling my C code for a bare-metal Cortex-M device (so the target triple will be arm-none-eabi) using only LLVM/Clang, and not using anything from GNU (ld or libc). I'm doing this to find out which of LLVM/Clang and GCC produces the smaller flash image, because saving flash is a big deal in our projects. 1) When I just follow

libc++ cross-compile linux-armv7 and math function problems

2018 Feb 06

Hello Dimitry, and thanks for your answer. I am pretty sure it does indeed support long double. It's configured with vfpv3-d16, but I noticed that in gcc's c++config.h, _GLIBCXX__HAS_FABSL and friends are undefined. I think I need to look deeper at the configuration of our toolchain. Long double support is required in libc++ then, I gather? -- Tobias On Tue, Feb 6, 2018 at 11:47 AM,

libc++ cross-compile linux-armv7 and math function problems

2018 Feb 06

At first glance, it looks like long double functions (such as fabsl and friends) are missing from your sysroot's <math.h>. Does your target support long double at all?
-Dimitry
> On 6 Feb 2018, at 09:51, Tobias Hieta via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hello,
>
> I am trying to cross-compile libc++ from my x86_64 linux system to armv7hf. We have

[LLVMdev] Question about arm thumb2 code generation

2012 Jul 28

On Jul 27, 2012, at 9:04 AM, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote:
> Hi all,
>
> Does llc -march=thumb -mcpu=cortex-a9 enable generation of thumb2 code for armv7?
That's how I usually do it. Somewhere in the target description we associate a9 with -mattr=+thumb2. There are plenty of other ways to get the same result, and it's all very confusing and