search for: vfpv3

Displaying 20 results from an estimated 70 matches for "vfpv3".

2014 Jun 23
2
[LLVMdev] VFP3
...ources and directly calling into relevant LLVM classes and methods. Thanks, Daman On 23/06/14 4:11 pm, "Amara Emerson" <amara.emerson at gmail.com> wrote: >Hi Damanjit, > >I assume you're trying to use the tools like llvm-mc, in which case >you can use the -mattr=+vfpv3 flag to enable it. This applies to other >subtarget features defined in ARM.td as well. > >Cheers, >Amara > >On 23 June 2014 11:27, Damanjit Singh <dsingh at adobe.com> wrote: >> How can I ensure use of VFP3 via LLVM target options? I am currently >>using >>...
2011 Jul 08
1
[LLVMdev] LLVM on ARM testing.
...as that x86 was in better shape -- that was a time I tested on x86... > About your configure options, I have some questions: > > - You use "--with-arch=armv6 --with-tune=cortex-a8", shouldn't it be armv7? > > - You're using softfp on Cortex-A8, have you tried with VFPv3/NEON? > > - On GCC 4.4.5 you're using "--with-float=softfp --with-fpu=vfpv3-d16" > and (correctly) "--with-arch=armv7-a", which mode prevails? Soft or > hard? > > - You're using Thumb on 4.4.5 and 4.5.2 only, any special reason not > to use Thumb on...
2011 Jul 05
0
[LLVMdev] LLVM on ARM testing.
...Do we have a similar thing for x86? Just to make sure the Mips errors are not ARM specific. About your configure options, I have some questions: - You use "--with-arch=armv6 --with-tune=cortex-a8", shouldn't it be armv7? - You're using softfp on Cortex-A8, have you tried with VFPv3/NEON? - On GCC 4.4.5 you're using "--with-float=softfp --with-fpu=vfpv3-d16" and (correctly) "--with-arch=armv7-a", which mode prevails? Soft or hard? - You're using Thumb on 4.4.5 and 4.5.2 only, any special reason not to use Thumb on previous GCCs? cheers, --renato
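The "soft or hard?" question above can also be checked from inside a compiled translation unit. The following is a minimal sketch, not part of the original thread: the function name arm_float_abi is hypothetical, and it assumes the compiler follows the usual GCC/Clang convention of predefining __ARM_PCS_VFP for the hard-float calling convention and __SOFTFP__ for pure software floating point.

```c
/* Hypothetical helper: reports which ARM floating-point ABI this
   translation unit was compiled for, judged only from predefined macros.
   Assumption: __ARM_PCS_VFP means -mfloat-abi=hard, __SOFTFP__ means
   -mfloat-abi=soft, and neither on ARM means softfp. */
const char *arm_float_abi(void)
{
#if !defined(__arm__)
    return "not-arm";   /* built for a non-ARM (or non-AArch32) target */
#elif defined(__ARM_PCS_VFP)
    return "hard";      /* FP arguments passed in VFP registers */
#elif defined(__SOFTFP__)
    return "soft";      /* no FP instructions, library emulation */
#else
    return "softfp";    /* FP instructions, integer-register calling convention */
#endif
}
```

Printing the returned string from a test binary built with each toolchain would show which setting actually took effect for that build.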
2012 Aug 02
1
[LLVMdev] Question about arm thumb2 code generation
Thanks Andrew for the answer. I would like to generate code for Cortex-A9 that doesn't use NEON for fp computation but vfpv3-d16. I've tried some combinations of -mattr=+neon,-neonfp,+vfp3,+d16 but couldn't get the ".fpu vfpv3-d16" directive generated in the assembly file. Do you know how to make it happen? Best Regards Seb From: Andrew Trick [mailto:atrick at apple.com] Sent: Saturday, July 28, 2012 2:46 AM...
2013 Dec 18
2
[LLVMdev] LLVM ARM VMLA instruction
On 18 December 2013 09:42, Tim Northover <t.p.northover at gmail.com> wrote: > That means one strictly newer than > cortex-a8: cortex-a7 (don't ask), cortex-a9, cortex-a12, cortex-a15 or > krait I believe. > Hi Tim, Cortex A8 and A9 use VFPv3. A7, A12 and A15 use VFPv4. cheers, --renato -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131218/4d77d09c/attachment.html>
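The core-to-FPU mapping stated in the message above can be written down as a small lookup table. This is only an illustrative sketch: the struct and function names are hypothetical, and the table holds just the five cores named in that message, taken as the poster's claim.

```c
#include <string.h>

/* Hypothetical table type; entries are exactly the cores named in the
   message above (A8, A9 -> VFPv3; A7, A12, A15 -> VFPv4). */
struct core_fpu {
    const char *core;
    int vfp_version;
};

static const struct core_fpu CORE_FPU_TABLE[] = {
    { "cortex-a8",  3 },
    { "cortex-a9",  3 },
    { "cortex-a7",  4 },
    { "cortex-a12", 4 },
    { "cortex-a15", 4 },
};

/* Returns the VFP major version for a core name, or 0 if the core is
   not in the table. */
int vfp_version_for_core(const char *core)
{
    size_t n = sizeof CORE_FPU_TABLE / sizeof CORE_FPU_TABLE[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(CORE_FPU_TABLE[i].core, core) == 0)
            return CORE_FPU_TABLE[i].vfp_version;
    }
    return 0;
}
```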
2013 Dec 20
0
[LLVMdev] LLVM ARM VMLA instruction
...A8 hardware soon, can someone please check it on A8 hardware as > well (Sorry for the trouble)? I've got a BeagleBone hanging around, and tested Clang against a hacked version of itself (without the VMLx disabling on Cortex-A8). The results (for matmul_f64_4x4, -O3 -mcpu=cortex-a8) were: 1. vfpv3-d16, stock Clang: 96.2s 2. vfpv3-d16, clang + vmla: 95.7s 3. vfpv3, stock clang: 82.9s 4. vfpv3, clang + vmla: 81.1s Worth investigating more, but as the others have said nowhere near enough data on its own. Especially since Evan clearly did some benchmarking himself before specifically disabling...
2013 Dec 11
1
[LLVMdev] runtime performance benchmarking tools for clang
...on time due to floating point operations, it was clearly observed that gcc used floating point instruction FSQRT, whereas clang seemed to use emulated function (?) BL SQRT. Note that we used the following flags for both clang as well as gcc compilation. -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -mtune=cortex-a8 In fact, I was surprised to see that even when " -march=armv7-a -mfloat-abi=hard -mfpu=vfpv3-d16 -mtune=cortex-a8" was used, the code generated did not use hardware vsqrt instruction, instead there was a bl sqrt instruction. Could someone point out why vsqrt was no...
2013 Dec 19
1
[LLVMdev] LLVM ARM VMLA instruction
...iplication (no loops, all multiplication and additions are hard coded - basically all the operations are expanded e.g. Result[0][0] = A[0][0]*B[0][0] + A[0][1]*B[1][0] + A[0][2]*B[2][0] and so on for all 9 elements of the result ). If I compile above code with "clang -O3 -mcpu=cortex-a8 -mfpu=vfpv3-d16" (only 16 floating point registers present with my arm, so specifying vfpv3-d16), there are 27 vmul, 18 vadd, 23 store and 30 load ops in total. If same is compiled with gcc with same options there are 9 vmul, 18 vmla, 9 store and 20 load ops. So, it's clear that extra load/store ops gets...
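For readers who want to reproduce the comparison, the fully expanded 3x3 product described above might look roughly like the sketch below. It is a reconstruction from the one expression quoted in the message, not the poster's actual file; the function name matmul3x3_unrolled is hypothetical. Each result element needs 3 multiplies and 2 adds, which gives the 27 multiplies and 18 adds reported for Clang, or 1 multiply plus 2 multiply-accumulates per element (9 + 18) as reported for GCC.

```c
/* Hypothetical reconstruction: fully unrolled 3x3 double-precision
   matrix product, r = a * b, with no loops. Every "x*y + acc" term is a
   candidate for a multiply-accumulate instruction. */
void matmul3x3_unrolled(double a[3][3], double b[3][3], double r[3][3])
{
    r[0][0] = a[0][0]*b[0][0] + a[0][1]*b[1][0] + a[0][2]*b[2][0];
    r[0][1] = a[0][0]*b[0][1] + a[0][1]*b[1][1] + a[0][2]*b[2][1];
    r[0][2] = a[0][0]*b[0][2] + a[0][1]*b[1][2] + a[0][2]*b[2][2];

    r[1][0] = a[1][0]*b[0][0] + a[1][1]*b[1][0] + a[1][2]*b[2][0];
    r[1][1] = a[1][0]*b[0][1] + a[1][1]*b[1][1] + a[1][2]*b[2][1];
    r[1][2] = a[1][0]*b[0][2] + a[1][1]*b[1][2] + a[1][2]*b[2][2];

    r[2][0] = a[2][0]*b[0][0] + a[2][1]*b[1][0] + a[2][2]*b[2][0];
    r[2][1] = a[2][0]*b[0][1] + a[2][1]*b[1][1] + a[2][2]*b[2][1];
    r[2][2] = a[2][0]*b[0][2] + a[2][1]*b[1][2] + a[2][2]*b[2][2];
}
```

Compiling this function alone with the flags quoted in the message and inspecting the assembly output should make the difference in instruction selection easy to see.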
2013 Dec 11
0
[LLVMdev] runtime performance benchmarking tools for clang
...cution time due to floating point operations, it was clearly observed that gcc used floating point instruction FSQRT, whereas clang seemed to use emulated function (?) BL SQRT. Note that we used the following flags for both clang as well as gcc compilation. -march=armv7-a -mfloat-abi=softfp -mfpu=vfpv3-d16 -mtune=cortex-a8 In fact, I was surprised to see that even when " -march=armv7-a -mfloat-abi= *hard* -mfpu=vfpv3-d16 -mtune=cortex-a8" was used, the code generated did not use hardware *vsqrt* instruction, instead there was a *bl sqrt* instruction. Could someone point out why *vsqrt *w...
2008 Dec 06
2
[LLVMdev] anybody working on ARM Cortex support?
Is anybody actively working on additional ARM target support? I need Cortex support (ARMv7, VFPv3, and Neon). Thank you.
2013 Dec 18
0
[LLVMdev] LLVM ARM VMLA instruction
> Cortex A8 and A9 use VFPv3. A7, A12 and A15 use VFPv4. That's what I thought! But we do seem to generate vfma on Cortex-A9. Wonder if that's a bug, or Cortex-A9 is "VFPv3, but chuck in vfma too"? Tim.
2009 Mar 27
3
[LLVMdev] GSoC 2009: proposals!
...transformations, optimizations, etc. I read two interesting papers on this so far: * Automated Synthesis Of Efficient Binary Decoders for Retargetable Software Toolkits, DAC 2003 * Generating decision trees for decoding binaries 3) Improve ARM be adding necessary ARMv7 instructions and support for VFPv3 and Thumb-2, all this for the ARMv7 Cortex A8 (I can get a i.MX515 board to play if this project is selected) 4) Improve MIPS be to support the Malta board (I'm working on this already, but I rather not work only with MIPS to change things a little bit, but I'd not bother working on this t...
2013 Oct 03
3
[LLVMdev] runtime performance benchmarking tools for clang
Hi All, Could anyone point me to some good benchmarking tools to measure the runtime performance of clang compiled C++ applications. Thanks ! - Jyoti
2012 Aug 31
2
[LLVMdev] Clang incompatible with GCC on Linux + ARM Cortex-A9
...dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --enable-multilib --disable-sjlj-exceptions --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb --disable-werror --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf Thread model: posix gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) 4. Testing program code: #define P(B, F) printf("%p (+%d): %s\n", &a...
2013 Dec 19
3
[LLVMdev] LLVM ARM VMLA instruction
Test case name : >> llvm/projects/test-suite/SingleSource/Benchmarks/Misc/matmul_f64_4x4.c - >> This is a 4x4 matrix multiplication, we can make small changes to make it a >> 3x3 matrix multiplication for making things simple to understand . >> > > This is one very specific case. How does that behave on all other cases? > Normally, every big improvement comes with
2015 Jan 21
3
[LLVMdev] [3.6 Release] RC1 has been tagged, Testing Phase I begins
On 20 January 2015 at 18:55, Hans Wennborg <hans at chromium.org> wrote: > There weren't many merges between the branch point and the rc1 tag. They were: > > r226023 InstCombine: Don't take A-B<0 into A<B if A-B has other uses > r226029 IR: Fix a use-after-free in RAUW > r226044 IR: Drop metadata references more aggressively during teardown > r226046
2011 Jul 03
9
[LLVMdev] LLVM on ARM testing.
Hello, I asked here for kind of reference GCC version which LLVM development team is using for *native* testing on ARM hardware. (no cross compilation!) last week or so. I've been curious myself how the situation looks and so I tested LLVM 2.9 as a reference point and LLVM HEAD as of June 29 on ARMv7 (two boards with two different Ubuntu versions) compiled by GCC 4.3.4, 4.4.1, 4.4.5,
2008 Dec 19
0
[LLVMdev] anybody working on ARM Cortex support?
...ing conventions. Is there a reason why the ARM Target isn't using the CCState machinery? Deep On Fri, Dec 5, 2008 at 5:22 PM, Sandeep Patel <deeppatel1987 at gmail.com> wrote: > > Is anybody actively working on additional ARM target support? > > I need Cortex support (ARMv7, VFPv3, and Neon). > > Thank you.
2008 Dec 20
1
[LLVMdev] anybody working on ARM Cortex support?
...ure what you mean by CCState machinery. Evan > > > Deep > > On Fri, Dec 5, 2008 at 5:22 PM, Sandeep Patel > <deeppatel1987 at gmail.com> wrote: >> >> Is anybody actively working on additional ARM target support? >> >> I need Cortex support (ARMv7, VFPv3, and Neon). >> >> Thank you. > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
2012 Jul 28
0
[LLVMdev] Question about arm thumb2 code generation
On Jul 27, 2012, at 9:04 AM, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Hi all, > > Does llc -march=thumb -mcpu=cortex-a9 enable generation of thumb2 code for armv7 ? That's how I usually do it. Somewhere in the target description we associate a9 with -mattr=+thumb2. There are plenty of other ways to get the same result, and it's all very confusing and