thr3ads.net - search: "vfpv4"

Displaying 20 results from an estimated 31 matches for "vfpv4".

2013 Dec 19

[LLVMdev] LLVM ARM VMLA instruction

Hi all, Thanks for the info. A few observations from my side:

LLVM:
cortex-a8 vfpv3: no vmla or vfma instruction emitted
cortex-a8 vfpv4: no vmla or vfma instruction emitted (this is invalid, though, as cortex-a8 does not have vfpv4)
cortex-a8 vfpv4 with ffp-contract=fast: vfma instruction emitted (this seems a bug to me! If cortex-a8 doesn't come with vfpv4, then the vfma instructions generated will be invalid)
cortex-a15 vfpv...

2013 Dec 19

[LLVMdev] LLVM ARM VMLA instruction

Just to clarify: gcc 4.8.1 generates that fma at -O2; no FP relaxation or other flags specified. On Wed, Dec 18, 2013 at 6:02 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote: > Thanks for the explanation, Tim! > > gcc 4.8.1 *does* generate an fma for your code example for an x86 target > that supports fma. I'd bet that the HW vendors' compilers do the same, but >

2013 Dec 19

[LLVMdev] LLVM ARM VMLA instruction

> cortex-a8 vfpv4 with ffp-contract=fast : vfma instruction emitted ( this > seems a bug to me!! If cortex-a8 doesn't come with vfpv4 then vfma > instructions generated will be invalid ) If I'm understanding correctly, you've specifically told it this Cortex-A8 *does* come with vfpv4. Those kinds...

2013 Dec 19

[LLVMdev] LLVM ARM VMLA instruction

Thanks for the explanation, Tim! gcc 4.8.1 *does* generate an fma for your code example for an x86 target that supports fma. I'd bet that the HW vendors' compilers do the same, but I don't have any of those installed at the moment to test that theory. So this is a bug in those compilers? Do you know how they justify it? I see section 6.5 "Expressions" in the C standard, and

2012 Nov 09

[LLVMdev] fmac generation for cortex-a9

...rmv7-eabi). Maybe someone from ARM can answer the question ? Seb From: JF Bastien [mailto:jfb at google.com] Sent: Friday, November 09, 2012 5:36 PM To: Sebastien DELDON-GNB Cc: Anitha Boyapati; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] fmac generation for cortex-a9 AFAIK A9 doesn't have VFPv4 or AdvSIMDv2, so it doesn't have VFMA. I don't know what LLVM does, but it shouldn't emit VFMA when you target A9. VMLA isn't a fused multiply-add, it's a multiply followed by an add and has different latency as well as precision. On Thu, Nov 8, 2012 at 4:57 AM, Sebastien DELDO...

2012 Nov 09

[LLVMdev] fmac generation for cortex-a9

...> Cc: JF Bastien; llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] fmac generation for cortex-a9 > > Hi Sebastien, > > ARMv7-M has VFMA and LLVM's "triple" is far from perfect. > > Wikipedia tells me NovaThor can also be A15, or STE could have cramped a > VFPv4 in it? ;) Or possibly, your code never branches into the VFMA. > Many things could be happening, but usually, VFMA shouldn't be generated > for A9. > > A GCC bug, maybe? > > > On 9 November 2012 16:51, Sebastien DELDON-GNB > <sebastien.deldon at st.com> wrote: ...

2012 Nov 09

[LLVMdev] fmac generation for cortex-a9

Hi Sebastien, ARMv7-M has VFMA and LLVM's "triple" is far from perfect. Wikipedia tells me NovaThor can also be A15, or STE could have cramped a VFPv4 in it? ;) Or possibly, your code never branches into the VFMA. Many things could be happening, but usually, VFMA shouldn't be generated for A9. A GCC bug, maybe? On 9 November 2012 16:51, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Hi Bastien, > > > > Weir...

2015 Feb 12

[LLVMdev] Building clang on Raspberry Pi2

...target-abi aapcs-linux -mfloat-abi hard -target-linker-version 2.24.90.20141023 -dwarf-column-info -ffunction-sections -fdata-sections -coverage-file But it's a Cortex-A7, not A8, how do I convince the test-release.sh or build scripts to make Phase1 clang target Cortex A7 for Phase2? (with vfpv4-d16) Whilst we're at it, I'm not convinced that the Pi2 would make a great cross-compilation machine, so is it possible to reduce the enabled targets? "X86 Sparc PowerPC ARM AArch64 Mips XCore MSP430 CppBackend NVPTX Hexagon SystemZ R600" Thanks, Ben

2012 Nov 09

[LLVMdev] fmac generation for cortex-a9

...iuc.edu > > Subject: Re: [LLVMdev] fmac generation for cortex-a9 > > > > Hi Sebastien, > > > > ARMv7-M has VFMA and LLVM's "triple" is far from perfect. > > > > Wikipedia tells me NovaThor can also be A15, or STE could have cramped a > > VFPv4 in it? ;) Or possibly, your code never branches into the VFMA. > > Many things could be happening, but usually, VFMA shouldn't be generated > > for A9. > > > > A GCC bug, maybe? > > > > > > On 9 November 2012 16:51, Sebastien DELDON-GNB > > <se...

2013 Dec 18

[LLVMdev] LLVM ARM VMLA instruction

Hi, I was going through the LLVM instruction code generation for ARM. I came across VMLA instruction hazards (floating-point multiply and accumulate). I was comparing assembly code emitted by LLVM and GCC, where I saw that GCC was happily using the VMLA instruction for floating point while LLVM never used it; instead it used a pair of VMUL and VADD instructions. I wanted to know if there is

2013 Dec 18

[LLVMdev] LLVM ARM VMLA instruction

...happily using VMLA instruction for floating point while LLVM never used > it, instead it used a pair of VMUL and VADD instruction. It looks like Clang allows the formation by default, but you need to be compiling for a CPU that actually supports the instruction (the key feature is called "VFPv4"). That means one strictly newer than cortex-a8: cortex-a7 (don't ask), cortex-a9, cortex-a12, cortex-a15 or krait I believe. With that I get: $ cat tmp.c float foo(float accum, float lhs, float rhs) { return accum + lhs*rhs; } $ clang -target armv7-linux-gnueabihf -mcpu=cortex-a15 -S -o...

2013 Dec 18

[LLVMdev] LLVM ARM VMLA instruction

On 18 December 2013 09:42, Tim Northover <t.p.northover at gmail.com> wrote: > That means one strictly newer than > cortex-a8: cortex-a7 (don't ask), cortex-a9, cortex-a12, cortex-a15 or > krait I believe. > Hi Tim, Cortex A8 and A9 use VFPv3. A7, A12 and A15 use VFPv4. cheers, --renato

2018 Sep 25

[RFC] New Clang target selection options for ARM/AArch64

Hi Eli, Renato, Thanks for your feedback, there's a lot more to some of these things than I knew. I've addressed your points below. The overall summary is:
- Start with converting the TargetParser to TableGen, with no user-facing changes
- Add warnings based on that, behind -Wall. Starting with command lines, since directives have larger implications that need investigation
Thanks,

2012 Nov 09

[LLVMdev] fmac generation for cortex-a9

AFAIK A9 doesn't have VFPv4 or AdvSIMDv2, so it doesn't have VFMA. I don't know what LLVM does, but it shouldn't emit VFMA when you target A9. VMLA isn't a fused multiply-add, it's a multiply followed by an add and has different latency as well as precision. On Thu, Nov 8, 2012 at 4:57 AM, Sebastien DELD...

2012 Nov 12

[LLVMdev] RE : fmac generation for cortex-a9

...s.uiuc.edu<mailto:llvmdev at cs.uiuc.edu> > Subject: Re: [LLVMdev] fmac generation for cortex-a9 > > Hi Sebastien, > > ARMv7-M has VFMA and LLVM's "triple" is far from perfect. > > Wikipedia tells me NovaThor can also be A15, or STE could have cramped a > VFPv4 in it? ;) Or possibly, your code never branches into the VFMA. > Many things could be happening, but usually, VFMA shouldn't be generated > for A9. > > A GCC bug, maybe? > > > On 9 November 2012 16:51, Sebastien DELDON-GNB > <sebastien.deldon at st.com<mailto:sebas...

2012 Nov 08

[LLVMdev] fmac generation for cortex-a9

Hi Anitha, Thanks for your answer, but -mcpu=cortex-a9 -mattr=+vfp4 doesn't enable fused mac generation for me. I would just like to understand why -mtriple=armv7-eabi enables it while -mcpu=cortex-a9 seems to disable it? Seb > -----Original Message----- > From: Anitha Boyapati [mailto:anitha.boyapati at gmail.com] > Sent: Thursday, November 08, 2012 10:22 AM > To: Sebastien

2014 Sep 20

[LLVMdev] ARM assembler bug on LLVM 3.5

Hi I have the following ARM Linux program. The program detects if the processor has division instruction, if it does, it uses it, otherwise it uses slower library call. The program works with gcc, but it doesn't work with clang. clang reports error on the sdiv instruction in the assembler. The problem is this - you either compile this program with -mcpu=cortex-a9, then clang reports

2013 Oct 15

[LLVMdev] Unwanted push/pop on Cortex-M.

...lang -cc1 version 3.3 based upon LLVM 3.3 default target armv7m-none-eabi > #include "..." search starts here: > End of search list. > > $ cat a.s > .syntax unified > .eabi_attribute 6, 10 > .eabi_attribute 9, 2 > .eabi_attribute 10, 5 > .fpu vfpv4 > .eabi_attribute 20, 1 > .eabi_attribute 21, 1 > .eabi_attribute 23, 3 > .eabi_attribute 24, 1 > .eabi_attribute 25, 1 > .eabi_attribute 44, 1 > .file "a.c" > .section .text.out_char,"ax",%progbits > .globl...

2014 Sep 22

[LLVMdev] ARM assembler bug on LLVM 3.5

...to write portable ARM applications that run on different cores. > The problem is that there isn't currently a way to pass flags to the > integrated assembler as of now (I'm working on it). gas unlocks all instructions if these directives are used ".cpu cortex-a15\n .fpu neon-vfpv4". LLVM integrated assembler will choke on it and produce invalid object file. You can try: int main(void) { asm volatile (".cpu cortex-a15\n .fpu neon-vfpv4\n"); return 0; } /usr/bin/ld: Warning: /tmp/as-b084f5.o: Unknown EABI object attribute 488 /usr/bin/ld: /tmp/...

2017 Jun 06

celt_inner_prod() and dual_inner_prod() NEON intrinsics

Hi Linfeng, On 06/06/17 04:09 PM, Jonathan Lennox wrote: > Two comments on the various infrastructure for RTCD etc. > > 1. The 0002- patch changes the ABI of the celt_pitch_xcorr functions, > but doesn’t change the assembly in celt/arm/celt_pitch_xcorr_arm.s > correspondingly. I suspect the ‘arch’ parameter can just be ignored > by the assembly functions, but at least the
